The Semantics of Aggregate Queries in Data Exchange Revisited.

Abstract

Defining “good” semantics for non-monotonic queries and for aggregate queries in the context of data exchange has turned out to be a challenging problem for a number of reasons, including the dependence of the semantics of the concrete syntactic representation of the schema mapping at hand. In this paper, we revisit the semantics of aggregate queries in data exchange by introducing the aggregate most-certain answers, a new semantics that is invariant under logical equivalence. Informally, the aggregate most-certain answers are obtained by taking the intersection of the aggregate certain answers over all schema mappings that are logically equivalent to the given schema mapping. Our main technical result is that for schema mappings specified by source-to-target tuple-generating dependencies only (no target constraints), the aggregate most-certain answers w.r.t. a schema mapping coincide with the aggregate certain answers w.r.t. the schema mapping in normal form associated with the given schema mapping. This result provides an intrinsic justification for using schema mappings in normal form and, at the same time, implies that the aggregate most-certain answers are computable in polynomial time. We also consider the semantics of aggregate queries w.r.t. schema mappings whose specification includes target constraints, and discuss some of the delicate issues involved in defining rigorous semantics for such schema mappings.

Publication
In 7th International on Conference Scalable Uncertainty Management (SUM).
Date