NO.208 Trustworthy Machine Learning System Engineering Techniques for Practical Applications

Shonan Village Center

October 14 - 17, 2024 (Check-in: October 13, 2024)


  • Paolo Arcaini
    • National Institute of Informatics, Japan
  • Zhi Jin
    • Peking University, China
  • Lei Ma
    • The University of Tokyo, Japan / University of Alberta, Canada


Description of the meeting

Machine learning has boomed over the past decade, enabling us to implement complex tasks that cannot be easily solved with traditional software, including those in safety-critical domains such as autonomous driving and healthcare. A recent report by JST forecasts a 60 trillion JPY market for AI applications in industry, in particular in domains such as transportation, healthcare, and finance, where many sectors and applications require medium to high levels of quality and reliability from machine learning-based systems. Such widespread adoption of AI needs proper system engineering support and toolchains. Both the research community and practitioners need to consider what a suitable next step would be, and how to achieve trustworthy AI systems through the development and use of systematic engineering tools. The increasing importance of engineering is also attested by the 2022 McKinsey survey, which reports that software engineers emerged as the most hired AI role over the past year.

The development paradigm of ML-based systems is fundamentally different from that of traditional systems. While traditional software systems are often implemented by means of rules (such as program code) that define system behaviour, machine learning systems adopt a data-driven development paradigm, where the decision logic of the software is automatically or semi-automatically learned from continuous aggregation of relevant data. Therefore, traditional software engineering techniques cannot be applied as they are; they need to be adapted to handle the unique characteristics of ML-based systems from a wide range of engineering perspectives. Requirements engineering, for example, needs to account for the fact that requirements are satisfied in a quantitative way (e.g., a given precision level) and that some level of misbehaviour is inevitable in an ML-based system; therefore, we need new methods to specify data requirements, including methods that flexibly capture uncertainty requirements reflecting the uncertain nature of ML systems. The functionalities of ML models depend on the training tasks and the training data; therefore, the required training tasks and the distribution of training data, including requirements for continuous sampling, will need to be addressed. Fault localization must provide new definitions of faults for ML-based systems, and testing must define new criteria for exposing and correcting these faults. This kind of debugging and repair, typically based on the explainability of ML systems, must consider the unstable nature of ML systems, in which minor changes can have a large effect on system behaviour. In addition, as the key asset and artefact of ML-based systems, the data must be carefully managed with more rigorous quality assurance and engineering techniques, keeping records of its relation to the code and the ML models.
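As an illustration of what a quantitative requirement with some tolerance for uncertainty might look like, the following minimal sketch checks a precision requirement against a statistical lower bound rather than a point estimate. It is not a method proposed by the meeting: the function names, the choice of the Wilson score interval, and the 95% confidence level (`z=1.96`) are all assumptions made for this example.

```python
import math


def precision(y_true, y_pred):
    """Return (precision, number of positive predictions) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    n = tp + fp
    if n == 0:
        raise ValueError("no positive predictions; precision is undefined")
    return tp / n, n


def wilson_lower_bound(p_hat, n, z=1.96):
    """Lower end of the Wilson score interval for a binomial proportion."""
    denom = 1 + z * z / n
    centre = p_hat + z * z / (2 * n)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n))
    return (centre - margin) / denom


def meets_requirement(y_true, y_pred, min_precision, z=1.96):
    """Hypothetical check: the requirement holds only if the lower
    confidence bound on precision reaches the required level."""
    p_hat, n = precision(y_true, y_pred)
    return wilson_lower_bound(p_hat, n, z) >= min_precision


if __name__ == "__main__":
    # Illustrative data: 100 positive predictions, 95 of them correct.
    y_true = [1] * 95 + [0] * 5
    y_pred = [1] * 100
    print(meets_requirement(y_true, y_pred, min_precision=0.85))
    print(meets_requirement(y_true, y_pred, min_precision=0.90))
```

With this illustrative data the observed precision is 0.95, yet a requirement of 0.90 is not considered met, because 100 samples leave too much uncertainty around the estimate. A stricter or looser confidence level would be a design decision tied to the hazard of the requirement being violated.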

Moreover, since total correctness is not possible, the accuracy requirements of ML-based systems should be prioritised differently, according to the hazard that could occur if a specific requirement is not satisfied. These priorities will affect the whole development process, from training to testing and debugging, across the multiple development stages of ML-based systems.

The goal of the meeting is to bring together software engineering and AI experts from academia and industry, with a special focus on the engineering side of ML-based systems, to discuss how to design new engineering practices that would allow ML-based systems to be engineered effectively and in a more trustworthy way. The long-term goal is to make the activities across the whole development lifecycle of ML-based software engineerable (Engineerable AI), as they are nowadays for traditional software. The machine learning system engineering community has grown rapidly in the past few years, with some early-stage results from researchers around the world. We believe the proposed meeting is a great chance to gather the world-leading researchers and industry practitioners who have achieved results over the past few years, so that we can exchange state-of-the-art ideas and techniques, promote this important research direction and its potential industrial applications, and together address the urgent demand for trustworthiness of ML-based systems.

The meeting will have two types of sessions:

  • The first type of session will have short presentations where each participant will introduce their research work, and list some ideas/challenges/research directions that they would like to discuss during the meeting. Such presentations will be scheduled during the first two days of the meeting.
  • The second type of session will consist of intensive discussions among sub-groups of participants. The topics of the discussions will be decided at the meeting from among those proposed by participants during their short talks; the organisers will nevertheless prepare some possible topics of discussion beforehand. The discussions will proceed in two phases: first exploring different topics, and then going deeper into a restricted set of selected topics.
    • [Exploration phase] In the first instances of this type of session, the meeting will use a dedicated method for guiding the discussion. The current plan is to use a variation of the world café method, which will allow participants to move across the different sub-groups and so get to know the different initial topics.
    • Afterwards, each participant will decide which topic they would like to explore. At this stage, topics that did not attract enough interest will be dropped.
    • [Deepening phase] The remaining sessions will be dedicated to intensive discussion on the selected topics. Ultimately, each group should come up with a research agenda for its topic.
    • Afterwards, we plan to consolidate a research agenda for engineering trustworthy ML-based software systems, and to prepare a book proposal based on the agenda discussed and agreed upon by the participants. We will apply to the “Call for Book Proposals” provided by the Shonan meeting organisation.