NO.209 Empowering Local Open Source LLMs

Shonan Village Center

September 30 - October 3, 2024 (Check-in: September 29, 2024)


  • Yuki Arase
    • Osaka University, Japan
  • Alexander Löser
    • Berlin University of Applied Sciences, Germany
  • Keisuke Sakaguchi
    • Tohoku University, Japan


Description of the Meeting

Overview. Since the release of ChatGPT, novel LLMs have been published worldwide nearly every week. As a result, the current landscape of LLM development shows two extremes. On one side, there are very large LLMs such as GPT-4, Google’s PaLM 2, and Anthropic’s Claude, among others. These models cover a broad range of domains, languages, and scenarios. However, they often remain black boxes, making it challenging to assess their techniques and training data. Worse, privacy and legal regulations may prevent companies from sending their data to these providers. On the other side, there are numerous smaller, local LLMs such as Falcon, MosaicML’s MPT, OpenLLAMA, Salesforce’s XGen, or Meta’s LLAMA-2. These smaller LLMs might lack some functionality, but they can run on local cloud infrastructure and do not require sending data to an external LLM provider at inference time. As a result, organizations can adapt, augment, and verify these LLMs with their own data and with moderate effort while ensuring sovereignty, adhering to legal regulations, and preserving privacy. Since the training data of local LLMs is often known, companies can even ensure compliance with local regulations, such as the EU AI Act and the copyright laws in Japan.

Major Reasons for LLM Hallucinations. However, LLMs, particularly local LLMs, often exhibit hallucinations due to their inherent incompleteness. When an LLM cannot retrieve a memorized answer from its vast knowledge representation, it tends to generalize, employing a similarity function to produce answers based on similar token patterns. This generalization can yield fluent but incorrect answers that users perceive as hallucinations. Another significant cause of hallucinations is training on large amounts of unverified information from the Web, which becomes a major source of incorrect responses. Moreover, contradictory information across data sources can force the LLM to rely solely on raw training-data statistics to generate answers, potentially resulting in wrong responses. Additionally, longer answers tend to suffer from more hallucinations, as prediction errors propagate when emitting subsequent tokens. Recent research indicates that LLM performance is often highest when relevant information occurs at the beginning or end of the input context, but degrades significantly when the model must access information in the middle of long contexts. Furthermore, when LLMs break down larger complex tasks into smaller sequences of inference and actions, as required in a domain, the outputs for these subtasks may also suffer from hallucinations.
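The error-propagation point can be illustrated with a back-of-the-envelope calculation. Under the simplifying (and hypothetical) assumption that each generated token is wrong independently with probability p, the probability that an n-token answer contains no error decays exponentially in n:

```python
# Toy model of error propagation in autoregressive generation.
# Assumption (hypothetical, for illustration only): each token is wrong
# independently with probability p, so
#   P(answer of n tokens is fully correct) = (1 - p)^n.
p = 0.02  # assumed per-token error rate

for n in (10, 100, 500):
    print(f"n={n}: P(no error) = {(1 - p) ** n:.3f}")
```

Even a modest 2% per-token error rate leaves a 500-token answer almost certain to contain at least one error, which is consistent with the observation above that longer answers hallucinate more.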

Verification, Provenance and Retrieval Augmentation. Addressing hallucinations in LLMs is a challenging task. To overcome this problem, our meeting will explore various strategies, including verification, provenance, and retrieval augmentation. For instance, LLMs could be complemented with additional sources and provenance information during training. Alternatively, answers could be augmented and verified at adaptation or query time against trusted sources, which could also be incorporated into the LLM during training. Continuous updates in retrieval-augmented or memory-based approaches might also offer potential solutions. Moreover, machines (agents) could be employed to verify LLM outputs, assess grounding, retrieve complementary information from third-party sources at inference time, or continuously update the model's memory to improve answer quality in the future. The design of such agents is also a point of interest for our meeting.
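The retrieve-then-verify control flow sketched above can be illustrated in a few lines. Everything here is a hypothetical stand-in: the two-document corpus, the bag-of-words cosine retriever, and the token-overlap "grounding" check are placeholders for a real embedding index over in-house data and a real verification agent.

```python
# Minimal sketch of retrieval augmentation plus a grounding check.
# All components are illustrative stand-ins, not a production pipeline.
import math
from collections import Counter

def tokenize(text):
    return [t.lower().strip(".,?!") for t in text.split()]

def cosine(a, b):
    # Bag-of-words cosine similarity between two token lists.
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in set(ca) & set(cb))
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, corpus, k=1):
    """Rank trusted local documents by similarity to the query."""
    ranked = sorted(corpus,
                    key=lambda d: cosine(tokenize(query), tokenize(d)),
                    reverse=True)
    return ranked[:k]

def grounded(answer, sources, threshold=0.5):
    """Flag answers whose tokens are mostly absent from the retrieved sources."""
    ans = set(tokenize(answer))
    src = {t for d in sources for t in tokenize(d)}
    return bool(ans) and len(ans & src) / len(ans) >= threshold

corpus = [
    "The EU AI Act regulates high-risk AI systems in Europe.",
    "LLAMA-2 is an open source local LLM released by Meta.",
]
docs = retrieve("Which local LLM did Meta release?", corpus)
print(grounded("Meta released the open source local LLM LLAMA-2.", docs))  # True
print(grounded("Falcon was released by TII.", docs))                       # False
```

In a real system, `retrieve` would query a vector index over an organization's in-house documents, and `grounded` would be replaced by an entailment or citation-checking model; the point is the control flow: retrieve trusted sources first, then verify the answer against them before returning it.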


An important research question, therefore, is how to avoid hallucinations, in particular for local LLMs, while preserving the sovereignty and privacy of an organization's local data. The main goal of our meeting is thus to discuss problems, research opportunities, and methodologies for avoiding hallucinations and for providing adaptation, provenance, and verification of local LLMs.

This topic is of interest to a wide variety of groups, including:

  • Academic researchers who are interested in investigating strategies for adaptation, provenance and verification of Local Open Source LLMs that work across domains/tasks, and
  • Industry practitioners who are interested in building and improving their AI systems using company-owned data for optimizing their business and developing new products.


This workshop aims to bring together world-renowned LLM researchers and practitioners from domains such as health, legal, entertainment, media, and social networks, to collaboratively develop innovative methods for avoiding hallucinations in LLMs as well as methods for adaptation, verification, and provenance. The primary focus is on local LLMs where in-house data is used, for example for adaptation, verification, or provenance. Our central questions are: (1) The relation between task performance and model size is not yet clear. For a specific scenario, do we really need 100+B parameters? Is there a sweet spot that balances computational cost and performance? Can we modularize LLMs and provide an efficient, continuously adapted memory? (2) How can we utilize potentially complementary (in-house) data that was not available at training time for continuously augmenting LLMs, verifying LLM outputs, and providing provenance?


Our specific objectives are as follows.

  • We will identify research opportunities for avoiding hallucinations in LLMs.
  • We will focus on open source LLMs and explore their combination with other data sources, in particular at adaptation and inference time, for verifying LLM outputs or adding provenance.
  • We will form a broader research community with cross-disciplinary collaboration spanning NLP, machine learning, and information retrieval.
  • We will assist emerging researchers in linking to international researchers, finding industrial contacts, and applying for competitive research grants.

Significance and Innovation

Adaptation, verification, and provenance in LLMs without hallucinations are a major challenge in NLP. While practitioners are keen to use LLMs, they struggle to provide these functionalities, in particular when sovereignty and privacy must be ensured by using open source local LLMs. Novel methods for adaptation, verification, and provenance in LLMs without hallucinations offer great potential to overcome the current limitations of open source LLMs and to expand their applicability to a wide variety of practical problems. We therefore believe that our meeting has the potential to set a new research agenda for these topics.

Expected Outcomes

  • Innovative techniques for Adaptation, Verification and Provenance in Language Models while avoiding hallucinations
  • Solutions for industry practitioners in health, legal, entertainment, media, social networks etc.
  • Joint publications at top conferences and journals in NLP and AI
  • Joint funding applications for long-term research collaborations continuing the research beyond the workshop


  • Academic impact: Research publications at top conferences and journals
  • Societal impact: Support for practitioners from not only large but also medium and small-sized institutions in various domains to benefit from Open Source Local LLMs

Plan for Workshop Program

We will invite 20-30 researchers, both emerging and world-renowned, to discuss these exciting and critical topics and to establish strong connections among attendees.

  • Before the workshop: Organizers send related research papers to participants for reading and set up interactive communication channels such as Slack.
  • Day 1: 8 Invited Talks and Open Problem Session (organizers)

Each session consists of 2-3 invited talks followed by a Q&A.

○ 9:30-12:00: Invited Talk 1: Adaptation of LLMs

○ 12:00-13:30: Lunch

○ 13:30-15:30: Invited Talk 2: Verification of LLMs

○ 15:30-16:00: Coffee break

○ 16:30-18:00: Invited Talk 3: Provenance of LLMs

  • Day 2: Problem Solving Session I based on small group discussion.

Discussions in 3 groups: Adaptation, verification and provenance

○ 9:30-12:00: Discussion 1: What are the critical problems?

○ 12:00-13:30: Lunch

○ 13:30-15:30: Discussion 2: Status of these problems and current solutions

○ 15:30-16:00: Coffee break

○ 16:30-18:00: Mid-Group Report Presentation

  • Day 3: Problem Solving Session II based on small group discussion / Excursion

○ 9:30-12:00: Discussion 3: How can we address the problems in future?

○ Afternoon: Excursion

  • Day 4: Group Report Presentation (from each group) and Planning (for a continuation of the research)

○ 9:30-12:00: Group Report Presentation

○ 12:00-13:30: Lunch

○ 13:30-15:30: Discussion of Future Collaboration