Paper Title

Handling SQL Nulls with Two-Valued Logic

Authors

Leonid Libkin, Liat Peterfreund

Abstract

The design of SQL is based on a three-valued logic (3VL), rather than the familiar Boolean logic. 3VL adds a truth value unknown to true and false to handle nulls. Viewed as indispensable for SQL expressiveness, it is at the same time much criticized for unintuitive behavior of queries and being a source of programmer mistakes. We show that, contrary to the widely held view, SQL could have been designed based on the standard Boolean logic, without any loss of expressiveness and without giving up nulls. The approach itself follows SQL's evaluation, which only retains tuples for which conditions in WHERE evaluate to true. We show that conflating unknown with false leads to an equally expressive version of SQL that does not use the third truth value. Queries written under the two-valued semantics can be efficiently translated into the standard SQL and thus executed on any existing RDBMS. These results cover the core of the SQL 1999 Standard: SELECT-FROM-WHERE-GROUP BY-HAVING queries extended with subqueries and IN/EXISTS/ANY/ALL conditions, and recursive queries. We also investigate new optimization rules enabled by the two-valued SQL, and show that for many queries, including most of those found in benchmarks such as TPC-H and TPC-DS, there is no difference between three- and two-valued versions.
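To make the contrast between the two logics concrete, here is a minimal toy sketch in Python. It is not the paper's formal semantics: the helper names (`eq3`, `not3`, `eq2`, `where3`, `where2`) are hypothetical, `None` stands in for both SQL NULL and the truth value unknown, and the two-valued variant simply treats any comparison involving NULL as false. In this toy model the two versions agree on a positive condition and can differ once negation is involved.

```python
from typing import Callable, List, Optional

# Assumption for illustration: None models SQL NULL as a data value
# and also the third truth value "unknown".
TruthValue = Optional[bool]


def eq3(a: Optional[int], b: Optional[int]) -> TruthValue:
    """Three-valued equality: unknown if either operand is NULL."""
    if a is None or b is None:
        return None
    return a == b


def not3(v: TruthValue) -> TruthValue:
    """Three-valued negation: unknown stays unknown."""
    return None if v is None else (not v)


def eq2(a: Optional[int], b: Optional[int]) -> bool:
    """Two-valued equality (toy version): a comparison with NULL is false."""
    if a is None or b is None:
        return False
    return a == b


def where3(rows: List[Optional[int]],
           cond: Callable[[Optional[int]], TruthValue]) -> List[Optional[int]]:
    """Like SQL's WHERE: keep only rows whose condition is exactly true."""
    return [r for r in rows if cond(r) is True]


def where2(rows: List[Optional[int]],
           cond: Callable[[Optional[int]], bool]) -> List[Optional[int]]:
    """Two-valued WHERE: the condition is an ordinary Boolean."""
    return [r for r in rows if cond(r)]


rows = [1, None, 2]

# Positive condition x = 1: both versions keep the same rows.
positive_3vl = where3(rows, lambda x: eq3(x, 1))
positive_2vl = where2(rows, lambda x: eq2(x, 1))

# Negated condition NOT (x = 1): the NULL row is dropped under 3VL
# (NOT unknown is unknown) but kept in this toy two-valued version.
negated_3vl = where3(rows, lambda x: not3(eq3(x, 1)))
negated_2vl = where2(rows, lambda x: not eq2(x, 1))

print(positive_3vl, positive_2vl)
print(negated_3vl, negated_2vl)
```

Queries without such negated comparisons over nullable values behave identically in this sketch, which is in the spirit of the abstract's observation that many benchmark queries show no difference between the three- and two-valued versions.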
