Vol-3194/paper19
Jump to navigation
Jump to search
Paper
Paper | |
---|---|
edit | |
description | |
id | Vol-3194/paper19 |
wikidataid | Q117344918βQ117344918 |
title | Counting Database Repairs Entailing a Query: The Case of Functional Dependencies |
pdfUrl | https://ceur-ws.org/Vol-3194/paper19.pdf |
dblpUrl | https://dblp.org/rec/conf/sebd/CalauttiLPS22 |
volume | Vol-3194βVol-3194 |
session | β |
Counting Database Repairs Entailing a Query: The Case of Functional Dependencies
Counting Database Repairs Entailing a Query: The Case of Functional Dependencies (Discussion Paper) Marco Calautti1 , Ester Livshits2 , Andreas Pieris2,3 and Markus Schneider2 1 DISI, University of Trento, Italy 2 University of Edinburgh, UK 3 University of Cyprus, Cyprus Abstract A key task in the context of consistent query answering is to count the number of repairs that entail the query, with the ultimate goal being a precise data complexity classification. This has been achieved in the case of primary keys and self-join-free conjunctive queries (CQs) via an FP/β―P-complete dichotomy. We lift this result to the more general case of functional dependencies (FDs). Another important task in this context is whenever the counting problem in question is intractable, to classify it as approximable, i.e., the target value can be efficiently approximated with error guarantees via a fully polynomial-time randomized approximation scheme (FPRAS), or as inapproximable. Although for primary keys and CQs (even with self-joins) the problem is always approximable, we prove that this is not the case for FDs. We show, however, that the class of FDs with a left-hand side chain forms an island of approximability. We see these results, apart from being interesting in their own right, as crucial steps towards a complete classification of approximate counting of repairs in the case of FDs and self-join-free CQs. 1. Introduction A database is inconsistent if it does not satisfy its integrity constraints. There is a consensus that inconsistency is a real-life phenomenon that arises due to many reasons such as integration of conflicting sources. With the aim of obtaining conceptually meaningful answers to queries posed over inconsistent databases, Arenas, Bertossi, and Chomicki introduced in the late 1990s the notion of Consistent Query Answering (CQA) [1]. The key elements underlying CQA are (i) the notion of (database) repair of an inconsistent database π·, that is, a consistent database whose difference with π· is somehow minimal, and (ii) the notion of query answering based on certain answers, that is, answers that are entailed by every repair. Example 1. Consider the relation name Employee(id, name, dept) that comes with the constraint that the attribute id functionally determines name and dept. Consider also the database π· con- sisting of the tuples: (1, Bob, HR), (1, Bob, IT), (2, Alice, IT), (2, Tim, IT). It is easy to see that π· is inconsistent since we are uncertain about Bobβs department, and the name of the employee with id 2. To devise a repair, we need to keep one tuple from each conflicting pair, which leads to a SEBD 2022: The 30th Italian Symposium on Advanced Database Systems, June 19-22, 2022, Tirrenia (PI), Italy " marco.calautti@unitn (M. Calautti); ster.livshits@ed.ac.uk (E. Livshits); apieris@inf.ed.ac.uk (A. Pieris); m.schneider@ed.ac.uk (M. Schneider) Β© 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) οΏ½maximal subset of π· that is consistent. Observe now that the query that asks whether employees 1 and 2 work in the same department is true only in two out of four repairs, and thus, not entailed. Counting Repairs Entailing a Query. A key task in this context is to count the number of repairs of an inconsistent database π· w.r.t. a set Ξ£ of constraints that entail a given query π; for clarity, we base our discussion on Boolean queries. Given a set Ξ£ of constraints and a query π, the problem β―Repairs(Ξ£, π) that takes as input a database π·, and asks for the number of repairs of π· w.r.t. Ξ£ that entail π, can be tractable or intractable depending on the shape of Ξ£ and π. This leads to the natural question whether we can establish a complete classification, i.e., for every Ξ£ and π, classify β―Repairs(Ξ£, π) as tractable or intractable by simply inspecting Ξ£ and π. We already know from [2] that for a set Ξ£ of primary keys, and a self-join-free conjunctive queries (SJFCQs) query π, β―Repairs(Ξ£, π) is either in FP or β―P-complete, and we can determine in polynomial time, by simply analyzing Ξ£ and π, which complexity statement holds. However, the dichotomy result in [2] does not apply when we consider the more general class of functional dependencies (FDs). This brings us to the following question: Question 1: Can we lift the dichotomy result for primary keys and SJFCQs to the more general case of functional dependencies? The closest known result related to Question 1 is for the problem β―Repairs(Ξ£), where Ξ£ is a set of FDs, that given a database π·, asks for the number of repairs of π· w.r.t. Ξ£ (without considering a query). In particular, we know from [3] that whenever Ξ£ has a so-called left-hand side (LHS, for short) chain (up to equivalence), β―Repairs(Ξ£) is in FP; otherwise, it is β―P-complete. Approximate Counting. Another key task is to classify β―Repairs(Ξ£, π), whenever it is intractable, as approximable via a so-called fully polynomial-time randomized approximation scheme (FPRAS), or as inapproximable. For a set Ξ£ of primary keys, and a CQ π (even with self-joins), β―Repairs(Ξ£, π) is always approximable [4]. The next question is whether we can obtain a full classification for FDs: Question 2: For a set Ξ£ of FDs, and an SJFCQ π, can we determine whether β―Repairs(Ξ£, π) admits an FPRAS by inspecting Ξ£ and π? Summary of Contributions. Concerning Question (1), we lift the dichotomy of [2] for primary keys and SJFCQs to the general case of FDs (Theorem 4). Concerning Question (2), although we do not establish a complete classification (which would resolve a challenging open problem), we show that, for every set Ξ£ of FDs with an LHS chain (up to equivalence) and a CQ π (even with self-joins), β―Repairs(Ξ£, π) admits an FPRAS. On the other hand, we show that there is a very simple set Ξ£ of FDs such that, for every SJFCQ π, β―Repairs(Ξ£, π) does not admit an FPRAS (under a standard complexity assumption). 2. Preliminaries We consider the disjoint countably infinite sets C and V of constants and variables, respectively. For π > 0, let [π] be the set {1, . . . , π}, and for a finite set π, let β―π be the cardinality of π. Relational Databases. A schema S is a finite set of relation names with associated arity; we write π /π to denote that π has arity π > 0. Each relation name π /π is associated with a tuple οΏ½of distinct attribute names (π΄1 , . . . , π΄π ); we write att(π ) for the set {π΄1 , . . . , π΄π }. A position of S is a pair (π , π΄), where π β S and π΄ β att(π ), that essentially identifies the attribute π΄ of π . A atom πΌ over S is an expression of the form π (π‘1 , . . . , π‘π ), where π /π β S, and π‘π β C βͺ V for each π β [π]. A fact is an atom mentioning only constants; we may say π -atom/fact to indicate that the relation name is π . For an atom πΌ = π (π‘1 , . . . , π‘π ), with (π΄1 , . . . , π΄π ) being the tuple of attribute names of π , we write πΌ[π΄π ] for the term π‘π . A database π· over S is a finite set of facts over S. We write π·π , for π β S, for the database {π (π‘Β―) | π (π‘Β―) β π·}. The active domain of π·, denoted dom(π·), is the set of constants occurring in π·. Functional Dependencies. A functional dependency (FD) π over a schema S is an expression of the form π : π β π , where π /π β S and π, π β att(π ); we say π -FD to indicate that the relation name is π . We call π a key if π βͺ π = att(π ). Given a set Ξ£ of FDs over S, we write Ξ£π , for π β S, for the set {π β Ξ£ | π is an π -FD}. We call Ξ£ a set of primary keys if it consists only of keys, and |Ξ£π | β€ 1 for each π β S. A database π· over S satisfies an FD π over S, denoted π· |= π, if, for every two π -facts π, π β π· the following holds: π [π΄] = π[π΄] for every π΄ β π implies π [π΅] = π[π΅] for every π΅ β π . We say that π· is consistent w.r.t. a set Ξ£ of FDs, written π· |= Ξ£, if π· |= π for every π β Ξ£; otherwise, π· is inconsistent w.r.t. Ξ£. Two sets of FDs Ξ£, Ξ£β² are equivalent if, for every database π·, π· |= Ξ£ iff π· |= Ξ£β² . Conjunctive Queries. A (Boolean) conjunctive query (CQ) π over S is an expression of the form βπ¦Β―1 , . . . , βπ¦Β―π π 1 (π¦Β―1 ) β§ Β· Β· Β· β§ π π (π¦Β―π ), where each π π (π¦Β―π ), for π β [π], is an atom over S. With abuse of notation we may treat π as the set of its atoms. The query π is a self-join-free CQ (SJFCQ) if it mentions every relation name of S at most once. Let var(π) and const(π) be the set of variables and constants in π, respectively. A homomorphism from π to a database π· is a function β : var(π) βͺ const(π) β dom(π·), which is the identity over C, such that π π (β(π¦Β―π )) β π· for π β [π]. We say that π· entails π, and write π· |= π if there exists a homomorphism from π to π·. We point out that in this paper we are only dealing with Boolean queries, but all the presented results can be generalized to non-Boolean CQs. Database Repairs. Given a database π· and a set Ξ£ of FDs, both over a schema S, a repair of π· w.r.t. Ξ£ is a maximal subset π·β² of π· such that π·β² |= Ξ£. Let repΞ£ (π·) be the set of repairs of π· w.r.t Ξ£. Given a CQ π over S, we write repΞ£,π (π·) for {π·β² β repΞ£ (π·) | π·β² |= π}. The following is a useful observation that we will exploit later in the paper: Lemma 2. Consider a database π·, a set Ξ£ of FDs, and a Boolean SJFCQ π. We can compute in polynomial time a database π·β² such that β―repΞ£ (π·) = β―repΞ£,π (π·β² ). Problem Definition. For each set Ξ£ of FDs and CQ π, we focus on the problem β―Repairs(Ξ£, π), which takes as input a database π·, and asks for the number β―repΞ£,π (π·). The goal is to classify it as tractable (place it in FP) or as intractable (show that is β―P-complete). Another important task is whenever β―Repairs(Ξ£, π) is intractable to classify it as approx- imable via a so-called fully polynomial-time randomized approximation scheme (FPRAS, for short), or as inapproximable. Formally, an FPRAS for β―Repairs(Ξ£, π) is a randomized algorithm A that takes as input a database π·, and numbers π > 0 and 0 < πΏ < 1, runs in polyno- mial(οΈ time in ||π·||, 1/π and log(1/πΏ), and produces )οΈ a random variable A(π·, π, πΏ) such that Pr |A(π·, π, πΏ) β β―repΞ£,π (π·)| β€ π Β· β―repΞ£,π (π·) β₯ 1 β πΏ. οΏ½LHS Chain FDs. Consider a set Ξ£ of FDs over S, and a relation name π β S. We say that Ξ£π has a left-hand side chain (LHS chain) if the FDs of Ξ£π can be arranged in a sequence π : π1 β π1 , . . . , π : ππ β ππ such that π1 β π2 β Β· Β· Β· β ππ [3]. We call such a sequence an LHS chain of Ξ£π , and say that Ξ£ has an LHS chain if, for every π β S, Ξ£π has an LHS chain. We know that for sets of FDs with an LHS chain, counting the number of repairs is tractable: Proposition 3 ([3]). Given a database π·, and a set Ξ£ of FDs with an LHS chain (up to equiva- lence), β―repΞ£ (π·) is computable in polynomial time in ||π·||. 3. Exact Counting The goal of this section is to discuss the key ideas behind our main result on exact counting: Theorem 4. For a set Ξ£ of FDs, and an SJFCQ π, β―Repairs(Ξ£, π) is either in FP or β―P-complete, and we can decide in polynomial time in ||Ξ£|| + ||π|| which of the two cases hold. We observe that for every set Ξ£ of FDs, and an SJFCQ π, β―Repairs(Ξ£, π) is in β―P: simply guess (in polynomial time) a subset π·β² of the given database π·, and verify (again in polynomial time) that π·β² is a repair of π· w.r.t. Ξ£ that entails π. We also note that the non-existence of an LHS chain (up to equivalence) is a preliminary boundary for the hard side of the target dichotomy, regardless of the CQ. We know from [3] that, for a set Ξ£ of FDs, the problem β―Repairs(Ξ£), which given a database π·, asks for β―repΞ£ (π·), is β―P-hard whenever Ξ£ does not have an LHS chain (up to equivalence). From the above discussion, and Lemma 2, we get the following result: Proposition 5. Consider a set Ξ£ of FDs without an LHS chain (up to equivalence), and an SJFCQ π. β―Repairs(Ξ£, π) is β―P-complete. From Proposition 5, and the fact that checking whether a set Ξ£ of FDs has an LHS chain (up to equivalence) is feasible in polynomial time [3], to obtain Theorem 4, it suffices to provide an analogous result for FDs with an LHS chain (up to equivalence). To this end, following what was done for primary keys in [2], we are going to concentrate on the problem of computing the relative frequency of the query. The relative frequency of a CQ π w.r.t. a database π· and a β―rep (π·) set Ξ£ of FDs is defined as the ratio rfreqπ·,Ξ£ (π) = β―repΞ£,π(π·) that computes the percentage of Ξ£ repairs that entail the query π. We then define the problem RelFreq(Ξ£, π) that takes as input a database π·, and asks for rfreqπ·,Ξ£ (π). We can indeed focus on RelFreq(Ξ£, π) since we know from Proposition 3 that β―Repairs(Ξ£) is in FP whenever Ξ£ has an LHS chain (up to equivalence), which implies that β―Repairs(Ξ£, π) and RelFreq(Ξ£, π) have the same complexity, that is, both are either in FP or β―P-hard. Consequently, to obtain Theorem 4 it suffices to show the following: Theorem 6. For a set Ξ£ of FDs with an LHS chain (up to equivalence), and an SJFCQ π, RelFreq(Ξ£, π) is either in FP or β―P-hard, and we can decide in polynomial time in ||Ξ£|| + ||π|| which of the two cases hold. In the rest, we give some hints on the proof of Theorem 6; we first need some new notions. Canonical Covers. If Ξ£ has an LHS chain (up to equivalence), we know that it has a single canonical cover1 Ξ£β² , and Ξ£β² has an LHS chain [3]. Furthermore, it can be verified that, for every 1 That is, (i) the FDs in Ξ£ are not redundant and have no redundant attributes, and (ii) for a relation name π and π β att(π ), there is at most one FD in Ξ£ of the form π : π β π . οΏ½relation name π occurring in Ξ£β² , there exists a unique sequence π : π1 β π1 , . . . , π : ππ β ππ of the FDs of Ξ£β²π , which we call the LHS chain of Ξ£β²π , such that (i) ππ β ππ+1 for each π β [π β 1], (ii) ππ β© ππ = β for each π, π β [π], and (iii) ππ β© ππ = β for each π, π β [π] with π ΜΈ= π. Since a canonical cover of a set of FDs can be computed in polynomial time, to establish Theorem 6 it suffices to show it for sets of FDs with an LHS chain that are canonical. Therefore, in the rest of the section, we assume that sets of FDs are canonical. Primary FDs and Positions. Consider now an SJFCQ π. Let πΌπ be the π -atom in π, and let Ξπ = π : π1 β π1 , . . . , π : ππ β ππ be the LHS chain of Ξ£π . We call an FD π : ππ β ππ , for some π β [π], the primary FD of Ξ£π w.r.t. π if ππ βͺππ contains an attribute π΄ such that at position (π , π΄) in πΌπ we have a variable, whereas at each position of {(π , π΅) | π΅ β ππ βͺ ππ for π < π} in πΌπ we have a constant. Note that there is no guarantee that the primary FD of Ξ£π w.r.t. π exists. If the primary FD π : ππ β ππ of Ξ£π w.r.t. π exists, for some π β [π], then the primary- lhs positions of πΌπ (w.r.t. Ξ£) are the positions {(π , π΄) | π΄ β ππ }, while the non-primary-lhs positions of πΌπ (w.r.t. Ξ£) are the positions {(π , π΅) | π΅ β att(π ) β ππ }. We further call the sequence of FDs π : π1 β π1 , . . . , π : ππβ1 β ππβ1 the primary prefix of Ξ£π w.r.t. π. If the primary FD of Ξ£π w.r.t. π does not exist, then, by convention, the primary-lhs positions of πΌπ are the positions {(π , π΄) | π΄ β att(π )}, i.e., all the positions in πΌπ are primary-lhs, which in turn implies that πΌπ has no non-primary-lhs positions. Moreover, the primary prefix of Ξ£π w.r.t. π is defined as the sequence Ξπ itself. We denote by pvarΞ£ (πΌπ ) the set of variables occurring in πΌπ at primary-lhs positions of πΌπ . Query Complex Part. A variable π₯ β var(π) is a liaison variable (of π) if it occurs more than once in π. The complex part of π w.r.t. Ξ£, denoted compΞ£ (π), is the set of atoms πΌ of π in which we have a constant or a liaison variable at a non-primary-lhs position (π , π΄), with πΌ being the π -atom of π, such that, for every FD π : π β π in the primary prefix of Ξ£π w.r.t. π, π΄ ΜΈβ π βͺ π . We observe that when compΞ£ (π) = β , the number of repairs of π· w.r.t. Ξ£ that Ξ£,π entail π coincides with the number of repairs (without any query) of the subset π·conf of π· obtained after removing the facts of π· that are somehow in a conflict with π, and thus, they Ξ£,π cannot appear in a repair that entails π. Since π·conf can be computed in polynomial time in ||π·||, by Proposition 3 we obtain the following: Proposition 7. Given a database π·, a set Ξ£ of FDs with an LHS chain, and an SJFCQ π with compΞ£ (π) = β , it holds that rfreqπ·,Ξ£ (π) is computable in polynomial time in ||π·||. 3.1. The Tractable Side We start by defining some properties that, when satisfied by a SJFCQ, allow to restate its relative frequency in terms of the relative frequency of simpler queries. We write π1 β π2 for the CQ consisting of the atoms of two non-empty CQs π1 , π2 that share no atoms and variables, i.e., π1 β© π2 = β and var(π1 ) β© var(π2 ) = β . We also write ππ₯β¦βπ , where π₯ β var(π) and π β C, for the CQ obtained from π after replacing π₯ with π. Theorem 8. Consider a set Ξ£ of FDs with an LHS chain, and a SJFCQ π. For every database π·, the following hold: 1. If π = π1 β π2 , then rfreqπ·,Ξ£ (π) = rfreqπ·,Ξ£ (π1 ) Γ rfreqπ·,Ξ£ (π2 ). οΏ½ 2. If compΞ£ (π) ΜΈ= β , and there exists a variable π₯ β var(π) such that π₯ β pvarΞ£ (πΌ) for every πΌ β compΞ£ (π), then there is π·* β π·, computable in polynomial time in ||π·||, such that β β (οΈ )οΈ Ξ£,π βοΈ (οΈ )οΈ β―repΞ£ π· β π· conf rfreqπ·,Ξ£ (π) = β1 β 1 β rfreqπ·* ,Ξ£ (ππ₯β¦βπ ) β Γ . β―repΞ£ (π·) πβdom(π·) 3. If compΞ£ (π) ΜΈ= β , and there exists an atom πΌ = π (π‘Β―) β compΞ£ (π) such that pvarΞ£ (πΌ) = β , and a variable π₯ occurs in πΌ at a position of {(π , βοΈπ΄) | π΄ β π }, where π : π β π is the primary FD of Ξ£π w.r.t. π, then rfreqπ·,Ξ£ (π) = πβdom(π·) rfreqπ·,Ξ£ (ππ₯β¦βπ ). Item (1) of Theorem 8 is straightforward, and implies that when π = π1 β π2 , one can compute the relative frequency of π in polynomial time, if the same can be done for the relative frequencies of π1 and π2 . Items (2) and (3) are much more involved, and we refer the reader to the full version of the paper [5]. The main message is that when a SJFCQ π satisfies the conditions of item (2), then its relative frequency is polynomial-time computable if so is for the relative frequency of ππ₯β¦βπ , for each constant π of the database, over a certain easily computable subset π·* of π·, where π₯ is as defined in item (2). Similarly, when π satisfies the conditions of item (3), then its relative frequency is polynomial-time computable, if so is for the relative frequency of ππ₯β¦βπ , for each constant π of the database, where π₯ is as defined in item (3). For a set Ξ£ of FDs with an LHS chain and an SJFCQ π, we say that π is Ξ£-safe if either its complex part is empty, i.e., compΞ£ (π) = β , or after recursively applying the conditions of items (1)-(3) of Theorem 8 to π, we are only left with queries having an empty complex part.2 By definition of Ξ£-safe queries, and by Proposition 7 and Theorem 8, we obtain the following: Theorem 9. Consider a set Ξ£ of FDs with an LHS chain, and an SJFCQ π. If π is Ξ£-safe, then RelFreq(Ξ£, π) is in FP. Interestingly, safety exhausts the tractable side of the dichotomy stated in Theorem 6. That is, if π is not Ξ£-safe, then RelFreq(Ξ£, π) is β―P-hard, as we discuss in the next section. Let us stress that checking whether π is Ξ£-safe is feasible in polynomial time in ||Ξ£|| + ||π||, which establishes the second part of Theorem 6. 3.2. The Hard Side We conclude this section by briefly discussing how the intractable side of the dichotomy stated in Theorem 6 is obtained. We know from [2] that, for every set Ξ£ of primary keys, and an SJFCQ π that is not Ξ£-safe, RelFreq(Ξ£, π) is β―P-hard. Hence, with Theorem 9 in place, stating that query safety ensures tractability, we can obtain the hard side of the dichotomy by showing: Theorem 10. There is a set Ξ£β² of primary keys, and a non Ξ£β² -safe SJFCQ πβ² such that, for each set Ξ£ of FDs with an LHS chain and non Ξ£-safe SJFCQ π, RelFreq(Ξ£β² , πβ² ) reduces to RelFreq(Ξ£, π). 2 We remark that for items (2) and (3), it is enough to recursively apply the conditions of Theorem 8 to ππ₯β¦βπ for only one arbitrarily chosen constant from C, and thus the notion of Ξ£-safety is database independent. οΏ½Theorem 10 is shown in two main steps. We first prove that there exists a set Ξ£* of FDs, and an SJFCQ π* such that RelFreq(Ξ£* , π* ) can be reduced to RelFreq(Ξ£, π) by means of a set of rewriting rules that extend the ones of [2]. Second, we show that there is a set Ξ£β² of primary keys, and an SJFCQ πβ² that is not Ξ£β² -safe such that, for every database π·, rfreqπ·,Ξ£* (π* ) = rfreqπ·,Ξ£β² (πβ² ), which then implies that RelFreq(Ξ£β² , πβ² ) reduces to RelFreq(Ξ£* , π* ), as needed. 4. Approximate Counting We now briefly discuss our results on approximate counting. The main contribution of this section is the following theorem; BPP is the class of decision problems that are efficiently solvable via a randomized algorithm with a bounded two-sided error [6]. Theorem 11. For every set Ξ£ of FDs with an LHS chain (up to equivalence), and a CQ π, β―Repairs(Ξ£, π) admits an FPRAS. Moreover, for Ξ£β = {π : π΄1 β π΄2 , π : π΄3 β π΄4 }, where att(π ) = {π΄π | π β [4]}, β―Repairs(Ξ£β , π) admits no FPRAS, for every SJFCQ π, unless NP β BPP. To prove the approximability result we observe that β―Repairs(Ξ£, π) can be seen as an in- stantiation of the so-called union of sets problem βοΈ[7], which given a succinct representation of π β₯ 1 sets π1 , . . . , ππ , asks for the number | πβ[π] ππ |. Indeed, given a database π·, let homπ·,Ξ£ (π) be the set of all homomorphic images of π in π· that are consistent w.r.t. Ξ£. Formally, homπ·,Ξ£ (π) β= {β(π) | β is a homomorphism β from π to π· such that β(π) |= Ξ£}. Hence, β―repΞ£,π (π·) = β πβ[π] repΞ£,π»π (π·)β, assuming that homπ·,Ξ£ (π) = {π»1 , . . . , π»π }.3 ββοΈ β From the above discussion, and the results of [7] on the approximability of the union of sets problem, showing the existence of an FPRAS for β―Repairs(Ξ£, π) boils down to showing that for each π» β homπ·,Ξ£ (π) we can (i) compute β―repΞ£,π» (π·), (ii) sample elements of repΞ£,π» (π·) uniformly at random, and (iii) check whether a database is in repΞ£,π» (π·), all in polynomial time in ||π·||. We prove that the above properties hold when Ξ£ has an LHS chain (up to equivalence), and the first part of Theorem 11 follows. The proof of the inapproximability of β―Repairs(Ξ£β , π), for every SJFCQ π, employs a so- called gap preserving reduction from the promise problem Gap3SATπΎ , for some fixed πΎ β (0, 18 ), i.e., the problem that given a Boolean formula π in 3CNF, asks whether π is satisfiable. The promise is that if π is unsatisfiable, then every truth assignment makes at most 78 + πΎ of the clauses of π true. For every πΎ β (0, 81 ), Gap3SATπΎ is NP-complete [8]. For a fixed πΎ β (0, 81 ), the reduction maps each formula π to a database db(π) such that the following holds: for any yes instance ππ of Gap3SATπΎ , and any no instance ππ of Gap3SATπΎ , the βgapβ (i.e., the ratio) between the numbers β―repΞ£β ,π (db(ππ )) and β―repΞ£β ,π (db(ππ )) is so large that an FPRAS for β―Repairs(Ξ£β , π) could be used, by employing a small enough error π, to distinguish between yes and no instances of Gap3SATπΎ with high probability, and thus, placing the NP-complete problem Gap3SATπΎ in BPP. The Difficulty Underlying a Dichotomy. Despite our efforts, we could not get a complete approximability/inapproximability classification. However, in the full version of the paper [5] 3 By abuse of terminology, we treat each π»π as a CQ consisting of facts. οΏ½we show that obtaining such a classification is as hard as solving the challenging open question whether β―MaxMatch admits an FPRAS, i.e., the problem that given a bipartite graph πΊ, asks for the number of maximal matchings in πΊ; please refer to the full version for more details. 5. Conclusion The obvious problems that remain open are the following: (1) lift the dichotomy of Theo- rem 4 for FDs and self-join-free CQs to arbitrary CQs with self-joins, and (2) establish an approximability/inapproximability dichotomy for FDs and SJFCQs. We point out that approximate versions of CQA have already been considered, both in the database and ontological setting. In the database setting, to the best of our knowledge [4, 9, 10] are the only works that focus on approximations in CQA. In particular, [4] studies approximation algorithms for the relative frequency under primary keys only, which have also been experimentally evaluated in [9]. In [10], FDs are considered, but the notion of repair is based on value updates, rather than tuple deletions, as we do. In the ontological setting, different approximations have been considered. However, in this setting, approximations are understood as subsets (i.e., sound but not complete) of the standard consistent answers (e.g., see [11, 12]). References [1] M. Arenas, L. E. Bertossi, J. Chomicki, Consistent query answers in inconsistent databases, in: PODS, 1999, pp. 68β79. [2] D. Maslowski, J. Wijsen, A dichotomy in the complexity of counting database repairs, J. Comput. Syst. Sci. 79 (2013) 958β983. [3] E. Livshits, B. Kimelfeld, J. Wijsen, Counting subset repairs with functional dependencies, J. Comput. Syst. Sci. 117 (2021) 154β164. [4] M. Calautti, M. Console, A. Pieris, Counting database repairs under primary keys revisited, in: PODS, 2019, pp. 104β118. [5] M. Calautti, E. Livshits, A. Pieris, M. Schneider, Counting database repairs entailing a query: The case of functional dependencies, in: PODS (to appear), 2022, pp. 91β103. [6] S. Arora, B. Barak, Computational Complexity - A Modern Approach, Cambridge University Press, 2009. [7] R. M. Karp, M. Luby, N. Madras, Monte-carlo approximation algorithms for enumeration problems, J. Algorithms 10 (1989) 429β448. [8] J. HΓ₯stad, Some optimal inapproximability results, J. ACM 48 (2001) 798β859. [9] M. Calautti, M. Console, A. Pieris, Benchmarking approximate consistent query answering, in: PODS, 2021, pp. 233β246. [10] S. Greco, C. Molinaro, Probabilistic query answering over inconsistent databases, Ann. Math. Artif. Intell. 64 (2012). [11] S. Greco, C. Molinaro, I. Trubitsyna, Computing approximate query answers over incon- sistent knowledge bases, in: IJCAI, 2018, pp. 1838β1846. [12] T. Lukasiewicz, E. Malizia, C. Molinaro, Complexity of approximate query answering under inconsistency in datalog+/-, in: IJCAI, 2018, pp. 1921β1927. οΏ½