NO.267 The Next 50 Years of SQL: Challenges and Directions

Shonan Village Center

September 13 - 16, 2027 (Check-in: September 12, 2027)

Organizers

  • Viktor Leis
    • The Technical University of Munich, Germany
  • Manuel Rigger
    • The National University of Singapore, Singapore
  • Jeff Shute
    • Google, USA

Overview

Description of the Meeting

Context. Structured Query Language (SQL) remains the cornerstone of data management and analytics. It is the primary interface for interacting with relational databases, but it also extends far beyond them: data warehouses, stream processors, embedded engines, and even data science notebooks rely on SQL or SQL-like query languages. Despite being over fifty years old, SQL continues to evolve and expand its reach, becoming the common abstraction layer across various kinds of data systems. Whether in enterprise settings, open-source projects, or large-scale cloud infrastructures, SQL underpins critical applications and workflows. In short, SQL is everywhere, and its importance continues to grow.

SQL dialects. While SQL is standardized under the ISO/IEC 9075 specification, no two database management systems (DBMSs) implement the standard in exactly the same way. Vendors and open-source projects frequently extend SQL with proprietary features, modify semantics, or diverge in subtle syntactic ways to support innovation, performance, or legacy behavior. The result is a complex ecosystem of SQL dialects: PostgreSQL, MySQL, Oracle, SQL Server, SQLite, BigQuery, Snowflake, and many others each speak their own variant. Differences range from keyword sets and data types to function libraries, expression evaluation order, and query semantics. Consequently, SQL is less a single language and more a family of closely related dialects, with partial overlap but limited interoperability.

Evolving role of SQL. The future of SQL is increasingly uncertain as agentic systems (autonomous agents and AI-driven workflows) begin to interact with databases in dynamic, non-deterministic ways. While SQL has long provided a standardized interface for structured data, the rise of agentic workloads challenges its traditional assumptions: queries may be generated, modified, and executed autonomously, often across heterogeneous dialects and systems. Existing extensions such as procedural logic, triggers, and analytical functions are being repurposed to support these new workflows, but it remains unclear how SQL will adapt to meet the demands of fully agentic systems.

SQL dialects causing issues. The proliferation of SQL dialects and the evolving role of SQL in the age of agentic AI create practical and conceptual challenges for multiple stakeholders. Some of the key areas are highlighted below:

  • For DBMS implementers: Although the SQL standard provides a specification, it is both vast and underspecified in key areas. Many details—such as query evaluation order or type coercion—are left open to interpretation. As a result, implementers cannot simply “follow the standard” but must make their own design decisions, balancing performance, compatibility, and usability. Over time, these pragmatic choices accumulate into dialectal differences that become de facto standards themselves. Moreover, because every DBMS must support its own variant of SQL, implementers cannot easily reuse existing building blocks such as parsers, analyzers, or optimizers from other systems. Even seemingly generic components must be customized to align with the chosen dialect, leading to duplicated engineering effort and hindering interoperability.
  • For SQL analysis tools: Linters, formatters, refactoring tools, and static analyzers must handle the idiosyncrasies of each dialect, often reimplementing the same parsing and semantic logic. This hinders tool interoperability and reuse, leading to fragmented ecosystems and inconsistent analysis results.
  • For SQL users and developers: Practitioners must learn and remember dialect-specific behaviors, functions, and limitations. Queries that work on one platform often require non-trivial adaptation on another, undermining portability and productivity. Even basic operations—such as date arithmetic, string manipulation, or data type casting—can vary in syntax or semantics.
  • For educators and learners: Teaching SQL becomes complicated by the lack of a single authoritative dialect. Textbooks, online tutorials, and MOOCs often rely on a specific DBMS, leaving learners confused when confronted with differing behaviors in other environments.
  • For researchers: Empirical studies, benchmarks, and reproducibility efforts are impeded by dialect discrepancies. Query workloads designed for one system may not run elsewhere without manual rewriting, making fair comparison and replication difficult.
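
The point about basic operations varying across dialects can be made concrete with a small sketch. The example below is illustrative only and not part of the seminar material: it uses Python's standard sqlite3 module to run three queries that are syntactically accepted by many engines but whose results depend on the dialect. The query names and the helper run_on_sqlite are hypothetical names chosen for this illustration.

```python
import sqlite3

# Queries that many SQL engines accept, but whose results are dialect-dependent.
# The dictionary keys are illustrative labels, not standard terminology.
QUERIES = {
    "integer division": "SELECT 1 / 2",
    "text plus integer": "SELECT '1' + 1",
    "string concatenation": "SELECT 'a' || 'b'",
}


def run_on_sqlite(queries):
    """Run each query on an in-memory SQLite database and return its scalar result."""
    conn = sqlite3.connect(":memory:")
    try:
        return {name: conn.execute(sql).fetchone()[0] for name, sql in queries.items()}
    finally:
        conn.close()


results = run_on_sqlite(QUERIES)
for name, value in results.items():
    print(f"{name}: {value!r}")
```

On SQLite, the three queries yield 0, 2, and 'ab'. Other engines may differ: MySQL in its default SQL mode, for instance, evaluates 1 / 2 to a decimal value and treats || as logical OR rather than string concatenation, so the same query text is not portable without adaptation.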

Goal of the seminar. The goal of the seminar is to bring together diverse communities that rarely interact (database systems researchers, programming language and compiler experts, software engineering researchers, and industry practitioners) to collectively discuss the challenges and future directions of SQL. Participants will discuss key challenges, survey existing solutions and open problems, and explore directions for addressing them systematically. We aim to bridge theoretical and practical perspectives, fostering collaboration across disciplines. A key tangible outcome will be a jointly authored vision paper that articulates future directions for SQL, including promising paths toward taming SQL dialect fragmentation and evolving SQL to accommodate agentic workloads.

Organizers' background. The seminar is organized by researchers and practitioners with complementary expertise spanning database systems, language design, and tooling; thus, we hope to attract representative researchers from these communities. Viktor Leis has led the development of SaneQL [1] and recently proposed his vision of SaneIR (see footnote 1), projects focusing on a “sane” SQL language and a common intermediate representation, respectively. His research interests lie primarily within database systems, and he is well known in that community. Manuel Rigger and his group have worked on SQLancer++ and ShQveL [4, 5], tools for automated testing and query verification that explicitly account for SQL dialect differences, and are currently working on SQLFlex, a dialect-agnostic SQL parsing framework. As his research interests lie primarily within the software engineering and programming languages communities, he identified invitees in those communities. Jeff Shute has extensive industry experience working on establishing and improving the SQL ecosystem at Google (open-sourced as ZetaSQL, see footnote 2) and has co-authored work on the “pipe” syntax for more composable SQL queries [3]. In addition, he has started and led multiple other high-impact projects, such as F1, the first database and query engine built from the ground up to scale out as a distributed system at Google’s scale [2]. Given his high standing in the industry, he will be able to attract many key players from companies. Together, the organizers bring the academic and industrial perspectives necessary for productive interdisciplinary discussion.

Timeliness of the seminar. The problem is important and timely, and its importance will continue to grow. The landscape of data systems is undergoing rapid change: new AI-infused and natural-language-driven SQL interfaces are emerging, promising to make querying more accessible but further increasing dialectal variation. At the same time, “friendly” SQL dialects such as that of DuckDB are gaining traction by emphasizing usability and interoperability. Furthermore, modern DBMSs are increasingly multi-modal, supporting not only relational data but also semi-structured, time-series, graph, and vector data, which stretches the boundaries of what SQL represents. These developments highlight both the vitality of SQL and the growing urgency of addressing its fragmentation.

Structure of the meeting. The seminar will be structured around a mix of plenary talks, breakout discussions, and collaborative working sessions. The first day will focus on introductions and on establishing a shared understanding of the SQL dialect landscape, including presentations from implementers and tool builders. In the subsequent days, cross-disciplinary working groups will formulate open research questions and outline possible approaches. The third day, before the excursion, will be devoted to consolidating outcomes and drafting the joint vision paper, ensuring that discussions lead to tangible impact.

Expected outcomes and impact. In the longer term, we expect the seminar to catalyze sustained collaborations among participants from academia and industry. The discussions may inspire shared datasets and benchmarks for studying dialect diversity, open-source initiatives for dialect-aware tooling, and cross-institutional research projects aimed at formalizing and unifying SQL dialects. By building bridges between previously disconnected communities and establishing a shared understanding of SQL’s evolving ecosystem, we genuinely hope to advance.

1 See the recent keynote at https://sites.google.com/view/dbpl2025/home/program.

2 https://github.com/google/zetasql