NO.227 Engineering Trustworthy Foundation Models

November 3 - 6, 2025 (Check-in: November 2, 2025)

Organizers

Fuyuki Ishikawa
- National Institute of Informatics, Japan
Lei Ma
- University of Tokyo, Japan
Foutse Khomh
- Polytechnique Montréal, Canada
Shaukat Ali
- Simula Research Laboratory and Oslo Metropolitan University, Norway

Overview

Description of the Meeting

Scope and Aims

Recently, foundation models (FMs), including large language models (LLMs) such as Generative Pre-trained Transformer (GPT) [1] and LLaMA [2], have received significant attention for language tasks such as text generation, translation, and conversational chatbots. Applications of LLMs in other domains are also increasing; in software engineering, for example, they are used for code generation (e.g., GitHub Copilot [3]) and test generation [4].

Even though FMs offer exciting opportunities in various areas, they are not always trustworthy. Many examples of FM hallucinations have been reported in the literature. Hence, research is needed to develop novel and cost-effective methods for engineering trustworthy FMs. As a result, the new area of engineering trustworthy FMs is emerging, focusing on analyzing FMs' requirements and on their development, testing, and maintenance. In addition, the use of FMs poses novel challenges that must be systematically addressed, such as privacy and security concerns, possible violations of intellectual property rights, ethical concerns (such as biases and discrimination), an enormous environmental footprint, explainability issues, inherent uncertainty, proneness to hallucination, and the generation of misinformation.

The objective of this Shonan meeting is to discuss current research on engineering trustworthy FMs, identify challenges and opportunities, and design a research roadmap toward trustworthy large foundation models for real-world applications. To this end, the meeting aims to gather researchers at different career stages, as well as practitioners, who work at the intersection of FMs and their various applications.

Topics

The meeting expects the participants to contribute to one or more of the following research topics:

What are the major aspects of trustworthiness for FMs?
What are the applications of FMs (including software engineering)?
Which tasks are appropriate to be handled efficiently by FMs, and what are their unique characteristics?
How to ensure correctness of FMs specifically designed for various tasks across different application domains?
How to quantify and address uncertainty in FMs to ensure trustworthiness of the decisions made by them?
How to deal with FMs’ hallucination and misinformation generation?
How to address the novel ethical challenges posed by FMs, such as those related to biases, discrimination, and violations of intellectual property rights?
What are the challenges and research opportunities related to explainability concerns of FMs?
How to address the enormous environmental footprint of FMs?

Format

The meeting will be conducted in the following format with a certain degree of flexibility:

Individual Presentations: Each participant will be invited to present their current research on one or more of the topics related to the meeting. In addition, each participant will be asked to indicate the research topics they are interested in working on, along with the associated challenges and research opportunities. These presentations will be scheduled for the first two days of the meeting and will serve as the basis for more in-depth group discussions during the remaining days.
Group Discussion and Reporting: Based on the individual presentations, we will form a set of groups for more focused, in-depth discussions on various topics of the meeting. After its discussion, each group will present its results to the other groups in a joint reporting session. These group discussion and reporting sessions will be scheduled for the third and fourth days.
Research Roadmap Planning: A final session will be held on the last day of the meeting to plan the overall research roadmap for dependable engineering of FMs. We will also discuss whether to write a book or a joint research roadmap paper.

References:

[1] OpenAI. 2023. GPT-4 Technical Report. arXiv preprint arXiv:2303.08774 (2023).

[2] Hugo Touvron, Thibaut Lavril, Gautier Izacard, Xavier Martinet, Marie-Anne Lachaux, Timothée Lacroix, Baptiste Rozière, Naman Goyal, Eric Hambro, Faisal Azhar, et al. 2023. LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971 (2023).

[3] GitHub. 2023. GitHub Copilot. https://copilot.github.com/

[4] Junjie Wang, Yuchao Huang, Chunyang Chen, Zhe Liu, Song Wang, and Qing Wang. 2023. Software Testing with Large Language Model: Survey, Landscape, and Vision. arXiv preprint arXiv:2307.07221 (2023).
