Difference between revisions of "Vol-3194/paper56"
Jump to navigation
Jump to search
(edited by wikiedit) |
(modified through wikirestore by wf) |
||
Line 1: | Line 1: | ||
− | + | =Paper= | |
{{Paper | {{Paper | ||
− | | | + | |id=Vol-3194/paper56 |
+ | |storemode=property | ||
+ | |title=Query Answer Explanations under Existential Rules | ||
+ | |pdfUrl=https://ceur-ws.org/Vol-3194/paper56.pdf | ||
+ | |volume=Vol-3194 | ||
+ | |authors=Ismail Ilkan Ceylan,Thomas Lukasiewicz,Enrico Malizia,Andrius Vaicenavičius | ||
+ | |dblpUrl=https://dblp.org/rec/conf/sebd/CeylanLMV22 | ||
}} | }} | ||
+ | ==Query Answer Explanations under Existential Rules== | ||
+ | <pdf width="1500px">https://ceur-ws.org/Vol-3194/paper56.pdf</pdf> | ||
+ | <pre> | ||
+ | Query Answer Explanations under Existential Rules⋆ | ||
+ | (Discussion Paper) | ||
+ | |||
+ | İsmail İlkan Ceylan1 , Thomas Lukasiewicz1 , Enrico Malizia2 and | ||
+ | Andrius Vaicenavičius1 | ||
+ | 1 | ||
+ | Department of Computer Science, University of Oxford, UK | ||
+ | 2 | ||
+ | DISI, University of Bologna, Italy | ||
+ | |||
+ | |||
+ | Abstract | ||
+ | Ontology-mediated query answering is an extensively studied paradigm, which aims at improving | ||
+ | query answers with the use of a logical theory. In this paper, we focus on ontology languages based on | ||
+ | existential rules, and we carry out a thorough complexity analysis of the problem of explaining query | ||
+ | answers in terms of minimal subsets of database facts and related tasks. | ||
+ | |||
+ | Keywords | ||
+ | Knowledge representation, Existential rules, Query answering, Explanations, Computational complexity | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | 1. Introduction | ||
+ | Ontology-based data access (OBDA) is a popular paradigm in knowledge representation and | ||
+ | reasoning [2]. In this paradigm, an ontology enriches the user query with commonsense | ||
+ | knowledge, to form a so-called ontology-mediated query (OMQ) [3]. The task of evaluating OMQs | ||
+ | is called ontology-mediated query answering (OMQA). Description logics (DLs) [4] and existential | ||
+ | rules [5] are the most widely used ontology languages. Tracking down the causes of entailment | ||
+ | in DL ontologies has attracted extensive interest. In the DL context, a prominent approach is | ||
+ | identifying explanations as ontology axioms subsets [6, 7, 8], which are called justifications [9, 10]. | ||
+ | Axiom pinpointing is extensively studied in DLs, and some implementations exist as well [6, 11]. | ||
+ | However, most of the existing approaches to explanations in the DL framework focus on classical | ||
+ | reasoning tasks. Explaining query entailments has only been investigated for the DL-Lite family | ||
+ | of languages [12, 13], and negative answers to OMQs in DLs [14]. | ||
+ | In this paper, we study explanations in the existential rules context. The only prior work | ||
+ | related to explanations in existential rules is given in [15] in the context of probabilistic databases, | ||
+ | and hence of a very different flavor. Here, given an OMQ, we are interested in explaining the | ||
+ | OMQ in terms of the minimal satisfying subsets, called minimal explanations of a given database. | ||
+ | We analyze the complexity for each of the problems introduced in terms of data, fp-combined, | ||
+ | ba-combined and combined complexity, for a representative set of existential rules. | ||
+ | |||
+ | SEBD 2022: The 30th Italian Symposium on Advanced Database Systems, June 19-22, 2022, Tirrenia (PI), Italy | ||
+ | ⋆ | ||
+ | This is a short version of the IJCAI-19 paper [1]. | ||
+ | $ ismail.ceylan@cs.ox.ac.uk (İ. İ. Ceylan); thomas.lukasiewicz@cs.ox.ac.uk (T. Lukasiewicz); | ||
+ | enrico.malizia@unibo.it (E. Malizia); andrius.vaicenavicius@cs.ox.ac.uk (A. Vaicenavičius) | ||
+ | � 0000-0003-4118-4689 (İ. İ. Ceylan); 0000-0002-7644-1668 (T. Lukasiewicz); 0000-0002-6780-4711 (E. Malizia) | ||
+ | © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). | ||
+ | CEUR | ||
+ | Workshop | ||
+ | Proceedings | ||
+ | http://ceur-ws.org | ||
+ | ISSN 1613-0073 | ||
+ | CEUR Workshop Proceedings (CEUR-WS.org) | ||
+ | �2. Preliminaries | ||
+ | We here briefly recall some basics on existential rules from the context of Datalog± [5]. | ||
+ | General. We assume a set C of constants, a set N of labeled nulls, and a set V of variables. A | ||
+ | term 𝑡 is a constant, null, or variable. We assume a set of predicates, each associated with an arity. | ||
+ | An atom has the form 𝑝(𝑡1 , . . . , 𝑡𝑛 ), where 𝑝 is an 𝑛-ary predicate, and 𝑡1 , . . . , 𝑡𝑛 are terms. An | ||
+ | atom containing only constants is called fact. Conjunctions of atoms are also identified with | ||
+ | the sets of their atoms. An instance 𝐼 is a (possibly infinite) set of atoms defined over constants | ||
+ | and nulls. A database 𝐷 is a finite instance containing only constants. A homomorphism is a | ||
+ | substitution ℎ : C ∪ N ∪ V ↦→ C ∪ N ∪ V that is the identity on C and maps N to C ∪ N. With | ||
+ | a slight abuse of notation, homomorphisms are applied also to (sets/conjunctions of) atoms. A | ||
+ | conjunctive query (CQ) 𝑞 has the form ∃Y𝜑(X, Y), where 𝜑(X, Y) is a conjunction of atoms | ||
+ | without nulls. The answer to 𝑞 over an instance 𝐼, denoted 𝑞(𝐼), is the set of all |X|-tuples | ||
+ | t over C for which there is a homomorphism ℎ such that ℎ(𝜑(X, Y)) ⊆ 𝐼 and ℎ(X) = t. A | ||
+ | Boolean CQ (BCQ) 𝑞 is a CQ ∃Y𝜑(Y), i.e., all variables are existentially quantified; 𝑞 is true | ||
+ | over 𝐼, denoted 𝐼 |= 𝑞, if 𝑞(𝐼) ̸= ∅, i.e., there is a homomorphism ℎ with ℎ(𝜑(Y)) ⊆ 𝐼. | ||
+ | Dependencies. A tuple-generating dependency (TGD) 𝜎 is an FO formula ∀X∀Y 𝜙(X, Y) → | ||
+ | ∃Z 𝑝(X, Z), where X, Y, and Z are pairwise disjoint sets of variables, 𝜙(X, Y) is a conjunction | ||
+ | of atoms, and 𝑝(X, Z) is an atom, all without nulls. An instance 𝐼 satisfies 𝜎, written 𝐼 |= 𝜎, | ||
+ | whenever there exists a homomorphism ℎ such that ℎ(𝜙(X, Y)) ⊆ 𝐼, then there exists ℎ′ ⊇ ℎ|X , | ||
+ | where ℎ|X is the restriction of ℎ on X, such that ℎ′ (𝑝(X, Z)) ∈ 𝐼. Given a set Σ of TGDs, 𝐼 | ||
+ | satisfies Σ, written 𝐼 |= Σ, if 𝐼 satisfies each TGD of Σ. For brevity, we omit the universal | ||
+ | quantifiers in front of TGDs, and use the comma (instead of ∧) for conjoining atoms. Finite sets | ||
+ | of TGDs are called programs (or, ontologies), and TGDs are also called existential rules. | ||
+ | The Datalog± languages ℒ that we consider to guarantee decidability are among the most | ||
+ | frequently analyzed in the literature, namely, linear (L) [5], guarded (G) [16], sticky (S) [17], and | ||
+ | acyclic TGDs (A), along with the “weak” (proper) generalizations weakly sticky (WS) [17] and | ||
+ | weakly acyclic TGDs (WA) [18], as well as their “full” (i.e., existential-free) proper restrictions | ||
+ | linear full (LF), guarded full (GF), sticky full (SF), and acyclic full TGDs (AF), respectively, | ||
+ | and full TGDs (F) in general. We also recall the following further inclusions: L ⊂ G and | ||
+ | F ⊂ WA ⊂ WS. We refer to [19] for a more detailed overview. | ||
+ | Ontology-Mediated Query Answering. The OMQA paradigm generalizes query answering | ||
+ | over databases by incorporating additional background knowledge via an ontology. | ||
+ | Formally, an ontology-mediated query (OMQ) is a pair (𝑞, Σ), where 𝑞 is a query, and Σ is | ||
+ | an ontology. Let ℒ be a Datalog± language. If Σ ∈ ℒ, then we say that (𝑞, Σ) is an ℒ-OMQ. | ||
+ | Consider a database 𝐷 and an OMQ (𝑞, Σ). The set of models of (𝐷, Σ), denoted mods(𝐷, Σ), | ||
+ | is the set of instances {𝐼 | 𝐼 ⊇ 𝐷 ∧ 𝐼 |= Σ}. We say that 𝐷 entails (𝑞, Σ), denoted 𝐷 |= (𝑞, Σ), | ||
+ | if 𝐼 |= 𝑄 for every 𝐼 ∈ mods(𝐷, Σ). Ontology-mediated query answering (OMQA) is the task of | ||
+ | deciding whether 𝐷 |= (𝑞, Σ) for a given database 𝐷 and an OMQ (𝑞, Σ). A different way to | ||
+ | define the existential rules semantics is via the concept of the Chase (see, e.g., [20, 21]). | ||
+ | Following Vardi [22], the combined complexity of BCQ answering considers the database, | ||
+ | the set of dependencies, and the query as part of the input. The bounded-arity-combined (or | ||
+ | ba-combined) complexity assumes that the arity of the underlying schema is bounded by an | ||
+ | �Table 1 Figure 1 | ||
+ | Complexity of OMQA under existential rules [19]. Protein containment in complexes. | ||
+ | |||
+ | ℒ Data fp-comb. ba-comb. Comb. 𝑝1 𝑝2 𝑝3 | ||
+ | 0 𝑐1 | ||
+ | L, LF, AF in ac np np pspace | ||
+ | S, SF in ac0 np np exp | ||
+ | A in ac0 np nexp nexp 𝑐2 | ||
+ | G p np exp 2exp 𝑝4 𝑝5 | ||
+ | F, GF p np np exp 𝑐3 | ||
+ | WS, WA p np 2exp 2exp | ||
+ | |||
+ | |||
+ | integer constant. The fixed-program-combined (or fp-combined) complexity considers the sets | ||
+ | of TGDs as fixed; the data complexity also assumes the query fixed. Table 1 summarizes the | ||
+ | complexity results for OMQA in the different TGD classes here considered; OMQA (ℒ) denotes | ||
+ | the OMQA problem when restricted over ontologies belonging to ℒ. | ||
+ | An OMQ (𝑞, Σ) is FO-rewritable, if there exists a query 𝑞Σ such that, for all databases 𝐷, | ||
+ | we have that 𝐷 |= (𝑞, Σ) iff D|= 𝑞Σ . In this case, 𝑞Σ is an FO-rewriting of (𝑞, Σ). A class of | ||
+ | programs ℒ is FO-rewritable, if it admits an FO-rewriting for every query and program in ℒ. All | ||
+ | languages from Table 1 with ac0 data complexity are FO-rewritable. | ||
+ | |||
+ | |||
+ | 3. Explanations for Ontology-Mediated Query Answers | ||
+ | In our approach, a minimal explanation for an ontology-mediated query answer is an inclusion- | ||
+ | minimal subset of a database that supports such query answer. | ||
+ | Definition 1. For a database 𝐷 and an OMQ (𝑞, Σ), where 𝑞 is a query and Σ is a program, | ||
+ | such that 𝐷 |= (𝑞, Σ), an explanation for 𝐷 |= (𝑞, Σ) is a subset 𝐸 ⊆ 𝐷 of facts entailing the | ||
+ | OMQ (𝑞, Σ), i.e., 𝐸 |= (𝑞, Σ); 𝐸 is a minimal explanation, or MinEx, for 𝐷 |= (𝑞, Σ) if 𝐸 is | ||
+ | subset-minimal, i.e., there is no proper subset 𝐸 ′ ⊊ 𝐸 that is an explanation for 𝐷 |= (𝑞, Σ). | ||
+ | Example 2. Each protein 𝑝1 , . . . , 𝑝5 belongs to complexes 𝑐1 , 𝑐2 and 𝑐3 , as shown in Fig 1. We | ||
+ | want to find minimal subsets of proteins covering all complexes (a minimal hitting set [23, 24]). | ||
+ | This is tantamount to looking for a MinEx, where the database is 𝐷 = {prt(𝑝𝑖 ) | 1 ≤ 𝑖 ≤ | ||
+ | 5} ∪ {in(𝑝𝑖 , 𝑐𝑗 ) | 𝑝𝑖 is in 𝑐𝑗 }, the ontology is Σ = {prt(𝑃 ) ∧ in(𝑃, 𝐶) → cov (𝐶)}, and the | ||
+ | query is 𝑞 = (cov (𝑐1 ) ∧ cov (𝑐2 ) ∧ cov (𝑐3 )) ∧ 𝜑in , where 𝜑in is a conjuction of all in facts in 𝐷. | ||
+ | Clearly, the MinExes for (𝑞, Σ) in 𝐷 are in bijection with the protein covers of complexes. ◁ | ||
+ | We analyze these problems. Is-MinEX: given a set of facts, decide if it is a MinEx; All- | ||
+ | MinEX: given sets of facts, decide if they are all the MinExes; MinEX-Rel: given a fact, decide | ||
+ | if it is contained in some MinEx; MinEX-Irrel: given forbidden sets of facts, decide if there is | ||
+ | a MinEx not containing any forbidden set; and Small-MinEX (resp., Large-MinEX): given a | ||
+ | number 𝑛, decide if there is a MinEx with less (resp. more) elements than 𝑛. | ||
+ | Example 3. Consider the following subsets of the database: | ||
+ | |||
+ | 𝐸1 = {prt(𝑝1 ), prt(𝑝3 )} ∪ 𝐷in , and 𝐸2 = {prt(𝑝2 ), prt(𝑝4 ), prt(𝑝5 )} ∪ 𝐷in , | ||
+ | �Table 2 | ||
+ | Complexity results for Is-MinEX (ℒ) and for All-MinEX (ℒ). | ||
+ | Is-MinEX All-MinEX | ||
+ | ℒ Data fp-comb. ba-comb. Comb. Data fp-comb. ba-comb. Comb. | ||
+ | L, LF, AF in ac0 dp dp pspace in p dp dp pspace | ||
+ | S, SF in ac0 dp dp exp in p dp dp exp | ||
+ | A in ac0 dp dexp dexp in p dp dexp dexp | ||
+ | G p dp exp 2exp co-np dp exp 2exp | ||
+ | F, GF p dp dp exp co-np dp dp exp | ||
+ | WS, WA p dp 2exp 2exp co-np dp 2exp 2exp | ||
+ | |||
+ | |||
+ | where 𝐷in is the set of all in facts from 𝐷. Then we have: | ||
+ | • Is-MinEX: Both 𝐸1 and 𝐸2 are MinExes for (𝑞, Σ) in 𝐷 . On the other hand, 𝐸1′ = {prt(𝑝1 ), | ||
+ | prt(𝑝4 )} ∪ 𝐷in and 𝐸2′ = {prt(𝑝1 ), prt(𝑝2 ), prt(𝑝3 )} ∪ 𝐷in are not MinExes as they do not | ||
+ | entail the query and are not minimal, respectively. | ||
+ | • All-MinEX: Both 𝐸3 = {prt(𝑝1 ), prt(𝑝5 )} ∪ 𝐷in and 𝐸4 = {prt(𝑝3 ), prt(𝑝4 )} ∪ 𝐷in are | ||
+ | MinExes, and the set ℰ = {𝐸1 , 𝐸2 , 𝐸3 , 𝐸4 } is the set of all MinExes for (𝑞, Σ) in 𝐷 . | ||
+ | • MinEX-Rel: For each 𝑖 ∈ {1, . . . , 5}, the fact prt(𝑝𝑖 ) is contained in some MinEx for (𝑞, Σ) | ||
+ | in 𝐷 , and thus the fact is relevant. | ||
+ | • MinEX-Irrel: Let {{prt(𝑝1 ), prt(𝑝3 )}, {prt(𝑝5 )}} be a set of forbidden sets of facts. Note | ||
+ | that there is an explanation that does not contain any of these sets, which is 𝐸4 . Notice, | ||
+ | however, 𝐸1 , 𝐸2 and 𝐸3 contain a forbidden set. | ||
+ | • Large-MinEX and Small-MinEX: All MinExes are either of size 2 + |𝐷in | or 3 + |𝐷in |. ◁ | ||
+ | |||
+ | |||
+ | 4. Recognizing (Sets of All) Minimal Explanations | ||
+ | First, we provide general results for Is-MinEX and All-MinEX. These general algorithms use | ||
+ | the procedure for the OMQA problem. For Is-MinEX, checking that a set 𝐸 is a MinEx requires | ||
+ | checking that 𝐸 entails the OMQ, and that 𝐸 is minimal, if removing any element from 𝐸 yields | ||
+ | a set that does not entail the OMQ. For All-MinEX, we need to check that each set of facts in | ||
+ | the input entails the OMQ, and that in the input there is no missing explanation. | ||
+ | |||
+ | Theorem 4. For a class ℒ of TGDs, if OMQA (ℒ) is in the complexity class 𝒞, then: | ||
+ | • Is-MinEX(ℒ) can be decided by a check in 𝒞 and a linear number of checks in co-𝒞; | ||
+ | • All-MinEX(ℒ) can be decided by a check in p𝒞 and a check in co-(np𝒞 ). | ||
+ | |||
+ | Theorem 5. If ℒ is a class of TGDs for which OMQA is FO-rewritable, then Is-MinEX(ℒ) is in | ||
+ | ac0 in the data complexity. | ||
+ | |||
+ | Theorem 6. For a class ℒ of TGDs, if OMQA(ℒ) is in np, then All-MinEX(ℒ) is in dp . | ||
+ | |||
+ | Theorem 7. Let ℒ be class of TGDs for which OMQA is FO-rewritable. Then, computing all | ||
+ | MinExes for 𝐷 |= (𝑞, Σ), where Σ is from ℒ, is feasible in polynomial time in the data complexity. | ||
+ | �Table 3 | ||
+ | Complexity results for MinEX-Irrel/Small-MinEX (ℒ) and for MinEX-Rel/Large-MinEX (ℒ). | ||
+ | |||
+ | MinEX-Irrel / Small-MinEX MinEX-Rel / Large-MinEX | ||
+ | ℒ Data fp-comb. ba-comb. Comb. Data fp-comb. ba-comb. Comb. | ||
+ | L, LF, AF in p np np pspace in p Σp2 Σp2 pspace | ||
+ | S, SF in p np np exp in p Σp2 Σp2 exp | ||
+ | A in p np nexp nexp in p Σp2 pnexp pnexp | ||
+ | G np np exp 2exp np Σp2 exp 2exp | ||
+ | F, GF np np np exp np Σp2 Σp2 exp | ||
+ | WS, WA np np 2exp 2exp np Σp2 2exp 2exp | ||
+ | |||
+ | |||
+ | The upper-bounds stated above can be shown tight. For Is-MinEX, the p-hardness results are | ||
+ | obtained via a reduction from OMQA for GF, which is p-complete in those cases; the dp -hardness | ||
+ | results via a reduction from the classical Sat-UnSat problem; and the dexp -hardness results | ||
+ | via a reduction from the dexp -complete problem ExpTiling-NonExpTiling: given two tiling | ||
+ | problems TP 1 and TP 2 for the exponential square 2𝑛 × 2𝑛 , and two initial tiling conditions | ||
+ | 𝑤1 and 𝑤2 , decide whether TP 1 have no solution with 𝑤1 and TP 2 have a solution with 𝑤2 . | ||
+ | All the remaining hardness results in Table 2 for Is-MinEX follow from the observation that | ||
+ | Is-MinEX is no easier than OMQA in the ba- and combined complexity. | ||
+ | For All-MinEX, the co-np-hardness results are via a reduction from UnSat, while all other | ||
+ | hardness results are obtained with minimal variations on the Is-MinEX’s hardness proofs. | ||
+ | |||
+ | |||
+ | 5. Existence of an Explanation Excluding/Including Facts | ||
+ | For MinEX-Irrel, to check the existence of a MinEx not intersecting any of the forbidden sets | ||
+ | provided in input, it suffices to guess a set of fact entailing the OMQ (we do not need to check | ||
+ | the minimality of the guessed set in this case). For MinEX-Rel, checking the existence of a | ||
+ | MinEx including a fact we need to guess a set of facts that subsequently needs to be checked to | ||
+ | be a MinEx (we do need to check the minimality of the guessed set in this case). | ||
+ | |||
+ | Theorem 8. For a class ℒ of TGDs, if OMQA (ℒ) (resp., Is-MinEX (ℒ)), is in the complexity class | ||
+ | 𝒞 (resp., 𝒟), then MinEX-Irrel(ℒ) (resp., MinEX-Rel (ℒ)) is in np𝒞 (resp., np𝒟 ). | ||
+ | |||
+ | Theorem 9. For a class ℒ of TGDs, if OMQA (ℒ) is in np, then MinEX-Irrel(ℒ) is in np. | ||
+ | |||
+ | Also these complexity results are tight. In particular, for MinEX-Irrel, the np-hardness results | ||
+ | in the data complexity are obtained via a reduction from the problem of deciding whether in a | ||
+ | graph there is a path between two vertices avoiding a set of pairs of edges [25, Problem GT54]. | ||
+ | For MinEX-Rel, the np-hardness results are via a reduction from the problem of deciding | ||
+ | whether in a directed graph there is a path from a node 𝑠 to a node 𝑡 passing through a node | ||
+ | 𝑚 [26]; the Σp2 -hardness results via a reduction from the validity problem ∃𝑋∀𝑌 𝜑(𝑋, 𝑌 ); | ||
+ | and the pnexp -hardness results via a reduction from the following problem [19]: for a triple | ||
+ | (𝑚, TP 1 , TP 2 ), where 𝑚 is an integer in unary notation, and TP 1 and TP 2 are two tiling | ||
+ | �problems for the exponential square 2𝑛 × 2𝑛 , decide whether there is an initial condition 𝑤 of | ||
+ | length 𝑚, such that TP 1 has no solution with 𝑤, and TP 2 has a solution with 𝑤. | ||
+ | The other hardness results for MinEX-Irrel (resp., MinEX-Rel) follow via a reduction from | ||
+ | OMQA in the fp-, ba-, and combined complexity (resp., in the ba- and combined complexity). | ||
+ | |||
+ | |||
+ | 6. Cardinality-Based Explanation Problems | ||
+ | The intuitions for the membership results of Small-MinEX and Large-MinEX problems are as | ||
+ | follows. For Small-MinEX, deciding if there is a MinEx not bigger than a given size, it suffices | ||
+ | to guess a small enough set of facts, and then test that this set entails the OMQ (in this case, | ||
+ | we do not need to ensure the minimality of the guessed set). For Large-MinEX, to decide the | ||
+ | existence of a big enough MinEx, we can guess a proper set of facts and check that it is actually | ||
+ | a MinEx (in this case, we do need to check the minimality of the guessed set). | ||
+ | Theorem 10. For a class ℒ of TGDs, if OMQA (ℒ) (resp., Is-MinEX (ℒ)), is in the complexity class | ||
+ | 𝒞 (resp., 𝒟), then Small-MinEX (ℒ) (resp., Large-MinEX (ℒ)) is in np𝒞 (resp., np𝒟 ). | ||
+ | Some entries in Table 3 need the result below; p membership results follow from Theorem 7. | ||
+ | Theorem 11. For a class ℒ of TGDs, if OMQA (ℒ) is in np, then Small-MinEX (ℒ) is in np. | ||
+ | We have completeness results also for these problems. For Small-MinEX, the np-hardness | ||
+ | results in the data complexity are via a reduction from Vertex Cover. | ||
+ | For Large-MinEX, the np-hardness results are obtained via a reduction from the problem | ||
+ | Hamiltonian Path; the Σp2 -hardness results and the pnexp -hardness are obtained via small | ||
+ | variations of the corresponding MinEX-Rel’s hardness proofs. | ||
+ | The other hardness results for Small-MinEX (resp., Large-MinEX) are via a reduction from | ||
+ | OMQA in the fp-, ba-, and combined complexity (resp., in the ba- and combined complexity). | ||
+ | |||
+ | |||
+ | 7. Summary and Outlook | ||
+ | In this paper, we have studied the problem of explaining query answers in terms of minimal | ||
+ | subsets of database facts, and provided a thorough complexity analysis for several decision | ||
+ | problems associated with minimal explanations under existential rules. | ||
+ | Future work can include studying other ontology languages. Our work on explaining OMQA | ||
+ | has been extended to the inconsistent setting [27, 28], to negative OMQA [29], to preferred | ||
+ | explanations [30], and also to DL [31]. The explanation notion, in line with the inconsistent | ||
+ | setting above, can also be extended to the cases of cardinality-based repairs [32], generalized | ||
+ | repairs [33], probabilistic/preferred repairs [34, 35, 36], and repairs based on value updates [37]. | ||
+ | Inspired by the idea of exploring preferences over explanations, we can also consider how more | ||
+ | elaborate preference models can be included in this framework [38, 39, 40, 41]. | ||
+ | |||
+ | |||
+ | Acknowledgments | ||
+ | This work was supported by the Alan Turing Institute under the UK EPSRC grant EP/N510129/1, | ||
+ | and by the EPSRC grants EP/R013667/1, EP/L012138/1, and EP/M025268/1. | ||
+ | �References | ||
+ | [1] İ. İ. Ceylan, T. Lukasiewicz, E. Malizia, A. Vaicenavičius, Explanations for query answers | ||
+ | under existential rules, in: IJCAI, 2019, pp. 1639–1646. | ||
+ | [2] A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, R. Rosati, Linking data to | ||
+ | ontologies, in: Journal on Data Semantics X, Springer-Verlag, 2008, pp. 133–173. | ||
+ | [3] M. Bienvenu, B. T. Cate, C. Lutz, F. Wolter, Ontology-based data access: A study through | ||
+ | disjunctive Datalog, CSP, and MMSNP, ACM T. Database Syst. 39 (2014) 33:1–33:44. | ||
+ | [4] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, P. F. Patel-Schneider (Eds.), The | ||
+ | Description Logic Handbook, 2nd ed., Cambridge University Press, 2007. | ||
+ | [5] A. Calì, G. Gottlob, T. Lukasiewicz, A general Datalog-based framework for tractable query | ||
+ | answering over ontologies, J. Web Sem. 14 (2012) 57–83. | ||
+ | [6] A. Kalyanpur, B. Parsia, M. Horridge, E. Sirin, Finding all justifications of OWL DL | ||
+ | entailments, in: ISWC/ASWC, 2007, pp. 267–280. | ||
+ | [7] F. Baader, B. Suntisrivaraporn, Debugging SNOMED CT using axiom pinpointing in the | ||
+ | description logic EL+, in: KR-MED, 2008. | ||
+ | [8] R. Peñaloza, B. Sertkaya, Understanding the complexity of axiom pinpointing in lightweight | ||
+ | description logics, Artif. Intell. 250 (2017) 80–104. | ||
+ | [9] M. Horridge, B. Parsia, U. Sattler, Laconic and precise justifications in OWL, in: ISWC, | ||
+ | 2008, pp. 323–338. | ||
+ | [10] M. Horridge, B. Parsia, U. Sattler, Explaining inconsistencies in OWL ontologies, in: SUM, | ||
+ | 2009, pp. 124–137. | ||
+ | [11] R. Sebastiani, M. Vescovi, Axiom pinpointing in lightweight description logics via Horn- | ||
+ | SAT encoding and conflict analysis, in: CADE, 2009, pp. 84–99. | ||
+ | [12] A. Borgida, D. Calvanese, M. Rodriguez-Muro, Explanation in the DL-Lite family of | ||
+ | description logics, in: OTM, 2008, pp. 1440–1457. | ||
+ | [13] M. Bienvenu, C. Bourgaux, F. Goasdoué, Computing and explaining query answers over | ||
+ | inconsistent DL-Lite knowledge bases, J. Artif. Intell. Res. 64 (2019) 563–644. | ||
+ | [14] D. Calvanese, M. Ortiz, M. Šimkus, G. Stefanoni, Reasoning about explanations for negative | ||
+ | query answers in DL-Lite, J. Artif. Intell. Res. 48 (2013) 635–669. | ||
+ | [15] İ. İ. Ceylan, S. Borgwardt, T. Lukasiewicz, Most probable explanations for probabilistic | ||
+ | database queries, in: IJCAI, 2017, pp. 950–956. | ||
+ | [16] A. Calì, G. Gottlob, M. Kifer, Taming the infinite chase: Query answering under expressive | ||
+ | relational constraints, J. Artif. Intell. Res. 48 (2013) 115–174. | ||
+ | [17] A. Calì, G. Gottlob, A. Pieris, Towards more expressive ontology languages: The query | ||
+ | answering problem, Artif. Intell. 193 (2012) 87–128. | ||
+ | [18] R. Fagin, P. G. Kolaitis, R. J. Miller, L. Popa, Data exchange: Semantics and query answering, | ||
+ | Theor. Comput. Sci. 336 (2005) 89–124. | ||
+ | [19] T. Lukasiewicz, E. Malizia, M. V. Martinez, C. Molinaro, A. Pieris, G. I. Simari, Inconsistency- | ||
+ | tolerant query answering for existential rules, Artif. Intell. 307 (2022) art. no. 103685. | ||
+ | [20] E. Tsamoura, D. Carral, E. Malizia, J. Urbani, Materializing knowledge bases via trigger | ||
+ | graphs, VLDB Endow. 14 (2021) 943–956. | ||
+ | [21] M. Calautti, S. Greco, C. Molinaro, I. Trubitsyna, Exploiting equality generating dependen- | ||
+ | cies in checking chase termination, VLDB Endow. 9 (2016) 396–407. | ||
+ | �[22] M. Vardi, The complexity of relational query languages, in: STOC, 1982, pp. 137–146. | ||
+ | [23] G. Gottlob, E. Malizia, Achieving new upper bounds for the hypergraph duality problem | ||
+ | through logic, in: LICS, 2014, pp. 43:1–43:10. | ||
+ | [24] G. Gottlob, E. Malizia, Achieving new upper bounds for the hypergraph duality problem | ||
+ | through logic, SIAM J. Comput. 47 (2018) 456–492. | ||
+ | [25] M. R. Garey, D. S. Johnson, Computers and Intractability, W. H. Freeman, New York, 1979. | ||
+ | [26] A. S. Lapaugh, C. H. Papadimitriou, The even-path problem for graphs and digraphs, | ||
+ | Networks 14 (1984) 507–513. | ||
+ | [27] T. Lukasiewicz, E. Malizia, C. Molinaro, Explanations for inconsistency-tolerant query | ||
+ | answering under existential rules, in: AAAI, 2020, pp. 2909–2916. | ||
+ | [28] T. Lukasiewicz, E. Malizia, C. Molinaro, Explanations for negative query answers under | ||
+ | inconsistency-tolerant semantics, in: IJCAI, 2022. | ||
+ | [29] İ. İ. Ceylan, T. Lukasiewicz, E. Malizia, C. Molinaro, A. Vaicenavičius, Explanations for | ||
+ | negative query answers under existential rules, in: KR, 2020, pp. 223–232. | ||
+ | [30] İ. İ. Ceylan, T. Lukasiewicz, E. Malizia, C. Molinaro, A. Vaicenavičius, Preferred explanations | ||
+ | for ontology-mediated queries under existential rules, in: AAAI, 2021, pp. 6262–6270. | ||
+ | [31] İ. İ. Ceylan, T. Lukasiewicz, E. Malizia, A. Vaicenavičius, Explanations for ontology- | ||
+ | mediated query answering in description logics, in: ECAI, 2020, pp. 672–679. | ||
+ | [32] T. Lukasiewicz, E. Malizia, A. Vaicenavičius, Complexity of inconsistency-tolerant query | ||
+ | answering in Datalog+/– under cardinality-based repairs, in: AAAI, 2019, pp. 2962–2969. | ||
+ | [33] T. Lukasiewicz, E. Malizia, C. Molinaro, Complexity of approximate query answering | ||
+ | under inconsistency in Datalog± , in: IJCAI, 2018, pp. 1921–1927. | ||
+ | [34] S. Greco, C. Molinaro, Probabilistic query answering over inconsistent databases, Ann. | ||
+ | Math. Artif. Intell. 64 (2012) 185–207. | ||
+ | [35] M. Calautti, S. Greco, C. Molinaro, I. Trubitsyna, Preference-based inconsistency-tolerant | ||
+ | query answering under existential rules, in: KR, 2020, pp. 203–212. | ||
+ | [36] M. Calautti, L. Caroprese, S. Greco, C. Molinaro, I. Trubitsyna, E. Zumpano, Existential | ||
+ | active integrity constraints, Expert Syst. Appl. 168 (2021) art. no. 114297. | ||
+ | [37] S. Greco, C. Molinaro, I. Trubitsyna, Computing approximate query answers over incon- | ||
+ | sistent knowledge bases, in: IJCAI, 2018, pp. 1838–1846. | ||
+ | [38] T. Lukasiewicz, E. Malizia, On the complexity of 𝑚CP-nets, in: AAAI, 2016, pp. 558–564. | ||
+ | [39] T. Lukasiewicz, E. Malizia, Complexity results for preference aggregation over (m)CP-nets: | ||
+ | Pareto and majority voting, Artif. Intell. 272 (2019) 101–142. | ||
+ | [40] T. Lukasiewicz, E. Malizia, Complexity results for preference aggregation over (m)CP-nets: | ||
+ | Max and rank voting, Artif. Intell. 303 (2022) art. no. 103636. | ||
+ | [41] G. Greco, E. Malizia, L. Palopoli, F. Scarcello, Non-transferable utility coalitional games | ||
+ | via mixed-integer linear constraints, J. Artif. Intell. Res. 38 (2010) 633–685. | ||
+ | � | ||
+ | </pre> |
Revision as of 17:56, 30 March 2023
Paper
Paper | |
---|---|
edit | |
description | |
id | Vol-3194/paper56 |
wikidataid | →Q117344929 |
title | Query Answer Explanations under Existential Rules |
pdfUrl | https://ceur-ws.org/Vol-3194/paper56.pdf |
dblpUrl | https://dblp.org/rec/conf/sebd/CeylanLMV22 |
volume | Vol-3194→Vol-3194 |
session | → |
Query Answer Explanations under Existential Rules
Query Answer Explanations under Existential Rules⋆ (Discussion Paper) İsmail İlkan Ceylan1 , Thomas Lukasiewicz1 , Enrico Malizia2 and Andrius Vaicenavičius1 1 Department of Computer Science, University of Oxford, UK 2 DISI, University of Bologna, Italy Abstract Ontology-mediated query answering is an extensively studied paradigm, which aims at improving query answers with the use of a logical theory. In this paper, we focus on ontology languages based on existential rules, and we carry out a thorough complexity analysis of the problem of explaining query answers in terms of minimal subsets of database facts and related tasks. Keywords Knowledge representation, Existential rules, Query answering, Explanations, Computational complexity 1. Introduction Ontology-based data access (OBDA) is a popular paradigm in knowledge representation and reasoning [2]. In this paradigm, an ontology enriches the user query with commonsense knowledge, to form a so-called ontology-mediated query (OMQ) [3]. The task of evaluating OMQs is called ontology-mediated query answering (OMQA). Description logics (DLs) [4] and existential rules [5] are the most widely used ontology languages. Tracking down the causes of entailment in DL ontologies has attracted extensive interest. In the DL context, a prominent approach is identifying explanations as ontology axioms subsets [6, 7, 8], which are called justifications [9, 10]. Axiom pinpointing is extensively studied in DLs, and some implementations exist as well [6, 11]. However, most of the existing approaches to explanations in the DL framework focus on classical reasoning tasks. Explaining query entailments has only been investigated for the DL-Lite family of languages [12, 13], and negative answers to OMQs in DLs [14]. In this paper, we study explanations in the existential rules context. The only prior work related to explanations in existential rules is given in [15] in the context of probabilistic databases, and hence of a very different flavor. Here, given an OMQ, we are interested in explaining the OMQ in terms of the minimal satisfying subsets, called minimal explanations of a given database. We analyze the complexity for each of the problems introduced in terms of data, fp-combined, ba-combined and combined complexity, for a representative set of existential rules. SEBD 2022: The 30th Italian Symposium on Advanced Database Systems, June 19-22, 2022, Tirrenia (PI), Italy ⋆ This is a short version of the IJCAI-19 paper [1]. $ ismail.ceylan@cs.ox.ac.uk (İ. İ. Ceylan); thomas.lukasiewicz@cs.ox.ac.uk (T. Lukasiewicz); enrico.malizia@unibo.it (E. Malizia); andrius.vaicenavicius@cs.ox.ac.uk (A. Vaicenavičius) � 0000-0003-4118-4689 (İ. İ. Ceylan); 0000-0002-7644-1668 (T. Lukasiewicz); 0000-0002-6780-4711 (E. Malizia) © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings http://ceur-ws.org ISSN 1613-0073 CEUR Workshop Proceedings (CEUR-WS.org) �2. Preliminaries We here briefly recall some basics on existential rules from the context of Datalog± [5]. General. We assume a set C of constants, a set N of labeled nulls, and a set V of variables. A term 𝑡 is a constant, null, or variable. We assume a set of predicates, each associated with an arity. An atom has the form 𝑝(𝑡1 , . . . , 𝑡𝑛 ), where 𝑝 is an 𝑛-ary predicate, and 𝑡1 , . . . , 𝑡𝑛 are terms. An atom containing only constants is called fact. Conjunctions of atoms are also identified with the sets of their atoms. An instance 𝐼 is a (possibly infinite) set of atoms defined over constants and nulls. A database 𝐷 is a finite instance containing only constants. A homomorphism is a substitution ℎ : C ∪ N ∪ V ↦→ C ∪ N ∪ V that is the identity on C and maps N to C ∪ N. With a slight abuse of notation, homomorphisms are applied also to (sets/conjunctions of) atoms. A conjunctive query (CQ) 𝑞 has the form ∃Y𝜑(X, Y), where 𝜑(X, Y) is a conjunction of atoms without nulls. The answer to 𝑞 over an instance 𝐼, denoted 𝑞(𝐼), is the set of all |X|-tuples t over C for which there is a homomorphism ℎ such that ℎ(𝜑(X, Y)) ⊆ 𝐼 and ℎ(X) = t. A Boolean CQ (BCQ) 𝑞 is a CQ ∃Y𝜑(Y), i.e., all variables are existentially quantified; 𝑞 is true over 𝐼, denoted 𝐼 |= 𝑞, if 𝑞(𝐼) ̸= ∅, i.e., there is a homomorphism ℎ with ℎ(𝜑(Y)) ⊆ 𝐼. Dependencies. A tuple-generating dependency (TGD) 𝜎 is an FO formula ∀X∀Y 𝜙(X, Y) → ∃Z 𝑝(X, Z), where X, Y, and Z are pairwise disjoint sets of variables, 𝜙(X, Y) is a conjunction of atoms, and 𝑝(X, Z) is an atom, all without nulls. An instance 𝐼 satisfies 𝜎, written 𝐼 |= 𝜎, whenever there exists a homomorphism ℎ such that ℎ(𝜙(X, Y)) ⊆ 𝐼, then there exists ℎ′ ⊇ ℎ|X , where ℎ|X is the restriction of ℎ on X, such that ℎ′ (𝑝(X, Z)) ∈ 𝐼. Given a set Σ of TGDs, 𝐼 satisfies Σ, written 𝐼 |= Σ, if 𝐼 satisfies each TGD of Σ. For brevity, we omit the universal quantifiers in front of TGDs, and use the comma (instead of ∧) for conjoining atoms. Finite sets of TGDs are called programs (or, ontologies), and TGDs are also called existential rules. The Datalog± languages ℒ that we consider to guarantee decidability are among the most frequently analyzed in the literature, namely, linear (L) [5], guarded (G) [16], sticky (S) [17], and acyclic TGDs (A), along with the “weak” (proper) generalizations weakly sticky (WS) [17] and weakly acyclic TGDs (WA) [18], as well as their “full” (i.e., existential-free) proper restrictions linear full (LF), guarded full (GF), sticky full (SF), and acyclic full TGDs (AF), respectively, and full TGDs (F) in general. We also recall the following further inclusions: L ⊂ G and F ⊂ WA ⊂ WS. We refer to [19] for a more detailed overview. Ontology-Mediated Query Answering. The OMQA paradigm generalizes query answering over databases by incorporating additional background knowledge via an ontology. Formally, an ontology-mediated query (OMQ) is a pair (𝑞, Σ), where 𝑞 is a query, and Σ is an ontology. Let ℒ be a Datalog± language. If Σ ∈ ℒ, then we say that (𝑞, Σ) is an ℒ-OMQ. Consider a database 𝐷 and an OMQ (𝑞, Σ). The set of models of (𝐷, Σ), denoted mods(𝐷, Σ), is the set of instances {𝐼 | 𝐼 ⊇ 𝐷 ∧ 𝐼 |= Σ}. We say that 𝐷 entails (𝑞, Σ), denoted 𝐷 |= (𝑞, Σ), if 𝐼 |= 𝑄 for every 𝐼 ∈ mods(𝐷, Σ). Ontology-mediated query answering (OMQA) is the task of deciding whether 𝐷 |= (𝑞, Σ) for a given database 𝐷 and an OMQ (𝑞, Σ). A different way to define the existential rules semantics is via the concept of the Chase (see, e.g., [20, 21]). Following Vardi [22], the combined complexity of BCQ answering considers the database, the set of dependencies, and the query as part of the input. The bounded-arity-combined (or ba-combined) complexity assumes that the arity of the underlying schema is bounded by an �Table 1 Figure 1 Complexity of OMQA under existential rules [19]. Protein containment in complexes. ℒ Data fp-comb. ba-comb. Comb. 𝑝1 𝑝2 𝑝3 0 𝑐1 L, LF, AF in ac np np pspace S, SF in ac0 np np exp A in ac0 np nexp nexp 𝑐2 G p np exp 2exp 𝑝4 𝑝5 F, GF p np np exp 𝑐3 WS, WA p np 2exp 2exp integer constant. The fixed-program-combined (or fp-combined) complexity considers the sets of TGDs as fixed; the data complexity also assumes the query fixed. Table 1 summarizes the complexity results for OMQA in the different TGD classes here considered; OMQA (ℒ) denotes the OMQA problem when restricted over ontologies belonging to ℒ. An OMQ (𝑞, Σ) is FO-rewritable, if there exists a query 𝑞Σ such that, for all databases 𝐷, we have that 𝐷 |= (𝑞, Σ) iff D|= 𝑞Σ . In this case, 𝑞Σ is an FO-rewriting of (𝑞, Σ). A class of programs ℒ is FO-rewritable, if it admits an FO-rewriting for every query and program in ℒ. All languages from Table 1 with ac0 data complexity are FO-rewritable. 3. Explanations for Ontology-Mediated Query Answers In our approach, a minimal explanation for an ontology-mediated query answer is an inclusion- minimal subset of a database that supports such query answer. Definition 1. For a database 𝐷 and an OMQ (𝑞, Σ), where 𝑞 is a query and Σ is a program, such that 𝐷 |= (𝑞, Σ), an explanation for 𝐷 |= (𝑞, Σ) is a subset 𝐸 ⊆ 𝐷 of facts entailing the OMQ (𝑞, Σ), i.e., 𝐸 |= (𝑞, Σ); 𝐸 is a minimal explanation, or MinEx, for 𝐷 |= (𝑞, Σ) if 𝐸 is subset-minimal, i.e., there is no proper subset 𝐸 ′ ⊊ 𝐸 that is an explanation for 𝐷 |= (𝑞, Σ). Example 2. Each protein 𝑝1 , . . . , 𝑝5 belongs to complexes 𝑐1 , 𝑐2 and 𝑐3 , as shown in Fig 1. We want to find minimal subsets of proteins covering all complexes (a minimal hitting set [23, 24]). This is tantamount to looking for a MinEx, where the database is 𝐷 = {prt(𝑝𝑖 ) | 1 ≤ 𝑖 ≤ 5} ∪ {in(𝑝𝑖 , 𝑐𝑗 ) | 𝑝𝑖 is in 𝑐𝑗 }, the ontology is Σ = {prt(𝑃 ) ∧ in(𝑃, 𝐶) → cov (𝐶)}, and the query is 𝑞 = (cov (𝑐1 ) ∧ cov (𝑐2 ) ∧ cov (𝑐3 )) ∧ 𝜑in , where 𝜑in is a conjuction of all in facts in 𝐷. Clearly, the MinExes for (𝑞, Σ) in 𝐷 are in bijection with the protein covers of complexes. ◁ We analyze these problems. Is-MinEX: given a set of facts, decide if it is a MinEx; All- MinEX: given sets of facts, decide if they are all the MinExes; MinEX-Rel: given a fact, decide if it is contained in some MinEx; MinEX-Irrel: given forbidden sets of facts, decide if there is a MinEx not containing any forbidden set; and Small-MinEX (resp., Large-MinEX): given a number 𝑛, decide if there is a MinEx with less (resp. more) elements than 𝑛. Example 3. Consider the following subsets of the database: 𝐸1 = {prt(𝑝1 ), prt(𝑝3 )} ∪ 𝐷in , and 𝐸2 = {prt(𝑝2 ), prt(𝑝4 ), prt(𝑝5 )} ∪ 𝐷in , �Table 2 Complexity results for Is-MinEX (ℒ) and for All-MinEX (ℒ). Is-MinEX All-MinEX ℒ Data fp-comb. ba-comb. Comb. Data fp-comb. ba-comb. Comb. L, LF, AF in ac0 dp dp pspace in p dp dp pspace S, SF in ac0 dp dp exp in p dp dp exp A in ac0 dp dexp dexp in p dp dexp dexp G p dp exp 2exp co-np dp exp 2exp F, GF p dp dp exp co-np dp dp exp WS, WA p dp 2exp 2exp co-np dp 2exp 2exp where 𝐷in is the set of all in facts from 𝐷. Then we have: • Is-MinEX: Both 𝐸1 and 𝐸2 are MinExes for (𝑞, Σ) in 𝐷 . On the other hand, 𝐸1′ = {prt(𝑝1 ), prt(𝑝4 )} ∪ 𝐷in and 𝐸2′ = {prt(𝑝1 ), prt(𝑝2 ), prt(𝑝3 )} ∪ 𝐷in are not MinExes as they do not entail the query and are not minimal, respectively. • All-MinEX: Both 𝐸3 = {prt(𝑝1 ), prt(𝑝5 )} ∪ 𝐷in and 𝐸4 = {prt(𝑝3 ), prt(𝑝4 )} ∪ 𝐷in are MinExes, and the set ℰ = {𝐸1 , 𝐸2 , 𝐸3 , 𝐸4 } is the set of all MinExes for (𝑞, Σ) in 𝐷 . • MinEX-Rel: For each 𝑖 ∈ {1, . . . , 5}, the fact prt(𝑝𝑖 ) is contained in some MinEx for (𝑞, Σ) in 𝐷 , and thus the fact is relevant. • MinEX-Irrel: Let {{prt(𝑝1 ), prt(𝑝3 )}, {prt(𝑝5 )}} be a set of forbidden sets of facts. Note that there is an explanation that does not contain any of these sets, which is 𝐸4 . Notice, however, 𝐸1 , 𝐸2 and 𝐸3 contain a forbidden set. • Large-MinEX and Small-MinEX: All MinExes are either of size 2 + |𝐷in | or 3 + |𝐷in |. ◁ 4. Recognizing (Sets of All) Minimal Explanations First, we provide general results for Is-MinEX and All-MinEX. These general algorithms use the procedure for the OMQA problem. For Is-MinEX, checking that a set 𝐸 is a MinEx requires checking that 𝐸 entails the OMQ, and that 𝐸 is minimal, if removing any element from 𝐸 yields a set that does not entail the OMQ. For All-MinEX, we need to check that each set of facts in the input entails the OMQ, and that in the input there is no missing explanation. Theorem 4. For a class ℒ of TGDs, if OMQA (ℒ) is in the complexity class 𝒞, then: • Is-MinEX(ℒ) can be decided by a check in 𝒞 and a linear number of checks in co-𝒞; • All-MinEX(ℒ) can be decided by a check in p𝒞 and a check in co-(np𝒞 ). Theorem 5. If ℒ is a class of TGDs for which OMQA is FO-rewritable, then Is-MinEX(ℒ) is in ac0 in the data complexity. Theorem 6. For a class ℒ of TGDs, if OMQA(ℒ) is in np, then All-MinEX(ℒ) is in dp . Theorem 7. Let ℒ be class of TGDs for which OMQA is FO-rewritable. Then, computing all MinExes for 𝐷 |= (𝑞, Σ), where Σ is from ℒ, is feasible in polynomial time in the data complexity. �Table 3 Complexity results for MinEX-Irrel/Small-MinEX (ℒ) and for MinEX-Rel/Large-MinEX (ℒ). MinEX-Irrel / Small-MinEX MinEX-Rel / Large-MinEX ℒ Data fp-comb. ba-comb. Comb. Data fp-comb. ba-comb. Comb. L, LF, AF in p np np pspace in p Σp2 Σp2 pspace S, SF in p np np exp in p Σp2 Σp2 exp A in p np nexp nexp in p Σp2 pnexp pnexp G np np exp 2exp np Σp2 exp 2exp F, GF np np np exp np Σp2 Σp2 exp WS, WA np np 2exp 2exp np Σp2 2exp 2exp The upper-bounds stated above can be shown tight. For Is-MinEX, the p-hardness results are obtained via a reduction from OMQA for GF, which is p-complete in those cases; the dp -hardness results via a reduction from the classical Sat-UnSat problem; and the dexp -hardness results via a reduction from the dexp -complete problem ExpTiling-NonExpTiling: given two tiling problems TP 1 and TP 2 for the exponential square 2𝑛 × 2𝑛 , and two initial tiling conditions 𝑤1 and 𝑤2 , decide whether TP 1 have no solution with 𝑤1 and TP 2 have a solution with 𝑤2 . All the remaining hardness results in Table 2 for Is-MinEX follow from the observation that Is-MinEX is no easier than OMQA in the ba- and combined complexity. For All-MinEX, the co-np-hardness results are via a reduction from UnSat, while all other hardness results are obtained with minimal variations on the Is-MinEX’s hardness proofs. 5. Existence of an Explanation Excluding/Including Facts For MinEX-Irrel, to check the existence of a MinEx not intersecting any of the forbidden sets provided in input, it suffices to guess a set of fact entailing the OMQ (we do not need to check the minimality of the guessed set in this case). For MinEX-Rel, checking the existence of a MinEx including a fact we need to guess a set of facts that subsequently needs to be checked to be a MinEx (we do need to check the minimality of the guessed set in this case). Theorem 8. For a class ℒ of TGDs, if OMQA (ℒ) (resp., Is-MinEX (ℒ)), is in the complexity class 𝒞 (resp., 𝒟), then MinEX-Irrel(ℒ) (resp., MinEX-Rel (ℒ)) is in np𝒞 (resp., np𝒟 ). Theorem 9. For a class ℒ of TGDs, if OMQA (ℒ) is in np, then MinEX-Irrel(ℒ) is in np. Also these complexity results are tight. In particular, for MinEX-Irrel, the np-hardness results in the data complexity are obtained via a reduction from the problem of deciding whether in a graph there is a path between two vertices avoiding a set of pairs of edges [25, Problem GT54]. For MinEX-Rel, the np-hardness results are via a reduction from the problem of deciding whether in a directed graph there is a path from a node 𝑠 to a node 𝑡 passing through a node 𝑚 [26]; the Σp2 -hardness results via a reduction from the validity problem ∃𝑋∀𝑌 𝜑(𝑋, 𝑌 ); and the pnexp -hardness results via a reduction from the following problem [19]: for a triple (𝑚, TP 1 , TP 2 ), where 𝑚 is an integer in unary notation, and TP 1 and TP 2 are two tiling �problems for the exponential square 2𝑛 × 2𝑛 , decide whether there is an initial condition 𝑤 of length 𝑚, such that TP 1 has no solution with 𝑤, and TP 2 has a solution with 𝑤. The other hardness results for MinEX-Irrel (resp., MinEX-Rel) follow via a reduction from OMQA in the fp-, ba-, and combined complexity (resp., in the ba- and combined complexity). 6. Cardinality-Based Explanation Problems The intuitions for the membership results of Small-MinEX and Large-MinEX problems are as follows. For Small-MinEX, deciding if there is a MinEx not bigger than a given size, it suffices to guess a small enough set of facts, and then test that this set entails the OMQ (in this case, we do not need to ensure the minimality of the guessed set). For Large-MinEX, to decide the existence of a big enough MinEx, we can guess a proper set of facts and check that it is actually a MinEx (in this case, we do need to check the minimality of the guessed set). Theorem 10. For a class ℒ of TGDs, if OMQA (ℒ) (resp., Is-MinEX (ℒ)), is in the complexity class 𝒞 (resp., 𝒟), then Small-MinEX (ℒ) (resp., Large-MinEX (ℒ)) is in np𝒞 (resp., np𝒟 ). Some entries in Table 3 need the result below; p membership results follow from Theorem 7. Theorem 11. For a class ℒ of TGDs, if OMQA (ℒ) is in np, then Small-MinEX (ℒ) is in np. We have completeness results also for these problems. For Small-MinEX, the np-hardness results in the data complexity are via a reduction from Vertex Cover. For Large-MinEX, the np-hardness results are obtained via a reduction from the problem Hamiltonian Path; the Σp2 -hardness results and the pnexp -hardness are obtained via small variations of the corresponding MinEX-Rel’s hardness proofs. The other hardness results for Small-MinEX (resp., Large-MinEX) are via a reduction from OMQA in the fp-, ba-, and combined complexity (resp., in the ba- and combined complexity). 7. Summary and Outlook In this paper, we have studied the problem of explaining query answers in terms of minimal subsets of database facts, and provided a thorough complexity analysis for several decision problems associated with minimal explanations under existential rules. Future work can include studying other ontology languages. Our work on explaining OMQA has been extended to the inconsistent setting [27, 28], to negative OMQA [29], to preferred explanations [30], and also to DL [31]. The explanation notion, in line with the inconsistent setting above, can also be extended to the cases of cardinality-based repairs [32], generalized repairs [33], probabilistic/preferred repairs [34, 35, 36], and repairs based on value updates [37]. Inspired by the idea of exploring preferences over explanations, we can also consider how more elaborate preference models can be included in this framework [38, 39, 40, 41]. Acknowledgments This work was supported by the Alan Turing Institute under the UK EPSRC grant EP/N510129/1, and by the EPSRC grants EP/R013667/1, EP/L012138/1, and EP/M025268/1. �References [1] İ. İ. Ceylan, T. Lukasiewicz, E. Malizia, A. Vaicenavičius, Explanations for query answers under existential rules, in: IJCAI, 2019, pp. 1639–1646. [2] A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, R. Rosati, Linking data to ontologies, in: Journal on Data Semantics X, Springer-Verlag, 2008, pp. 133–173. [3] M. Bienvenu, B. T. Cate, C. Lutz, F. Wolter, Ontology-based data access: A study through disjunctive Datalog, CSP, and MMSNP, ACM T. Database Syst. 39 (2014) 33:1–33:44. [4] F. Baader, D. Calvanese, D. L. McGuinness, D. Nardi, P. F. Patel-Schneider (Eds.), The Description Logic Handbook, 2nd ed., Cambridge University Press, 2007. [5] A. Calì, G. Gottlob, T. Lukasiewicz, A general Datalog-based framework for tractable query answering over ontologies, J. Web Sem. 14 (2012) 57–83. [6] A. Kalyanpur, B. Parsia, M. Horridge, E. Sirin, Finding all justifications of OWL DL entailments, in: ISWC/ASWC, 2007, pp. 267–280. [7] F. Baader, B. Suntisrivaraporn, Debugging SNOMED CT using axiom pinpointing in the description logic EL+, in: KR-MED, 2008. [8] R. Peñaloza, B. Sertkaya, Understanding the complexity of axiom pinpointing in lightweight description logics, Artif. Intell. 250 (2017) 80–104. [9] M. Horridge, B. Parsia, U. Sattler, Laconic and precise justifications in OWL, in: ISWC, 2008, pp. 323–338. [10] M. Horridge, B. Parsia, U. Sattler, Explaining inconsistencies in OWL ontologies, in: SUM, 2009, pp. 124–137. [11] R. Sebastiani, M. Vescovi, Axiom pinpointing in lightweight description logics via Horn- SAT encoding and conflict analysis, in: CADE, 2009, pp. 84–99. [12] A. Borgida, D. Calvanese, M. Rodriguez-Muro, Explanation in the DL-Lite family of description logics, in: OTM, 2008, pp. 1440–1457. [13] M. Bienvenu, C. Bourgaux, F. Goasdoué, Computing and explaining query answers over inconsistent DL-Lite knowledge bases, J. Artif. Intell. Res. 64 (2019) 563–644. [14] D. Calvanese, M. Ortiz, M. Šimkus, G. Stefanoni, Reasoning about explanations for negative query answers in DL-Lite, J. Artif. Intell. Res. 48 (2013) 635–669. [15] İ. İ. Ceylan, S. Borgwardt, T. Lukasiewicz, Most probable explanations for probabilistic database queries, in: IJCAI, 2017, pp. 950–956. [16] A. Calì, G. Gottlob, M. Kifer, Taming the infinite chase: Query answering under expressive relational constraints, J. Artif. Intell. Res. 48 (2013) 115–174. [17] A. Calì, G. Gottlob, A. Pieris, Towards more expressive ontology languages: The query answering problem, Artif. Intell. 193 (2012) 87–128. [18] R. Fagin, P. G. Kolaitis, R. J. Miller, L. Popa, Data exchange: Semantics and query answering, Theor. Comput. Sci. 336 (2005) 89–124. [19] T. Lukasiewicz, E. Malizia, M. V. Martinez, C. Molinaro, A. Pieris, G. I. Simari, Inconsistency- tolerant query answering for existential rules, Artif. Intell. 307 (2022) art. no. 103685. [20] E. Tsamoura, D. Carral, E. Malizia, J. Urbani, Materializing knowledge bases via trigger graphs, VLDB Endow. 14 (2021) 943–956. [21] M. Calautti, S. Greco, C. Molinaro, I. Trubitsyna, Exploiting equality generating dependen- cies in checking chase termination, VLDB Endow. 9 (2016) 396–407. �[22] M. Vardi, The complexity of relational query languages, in: STOC, 1982, pp. 137–146. [23] G. Gottlob, E. Malizia, Achieving new upper bounds for the hypergraph duality problem through logic, in: LICS, 2014, pp. 43:1–43:10. [24] G. Gottlob, E. Malizia, Achieving new upper bounds for the hypergraph duality problem through logic, SIAM J. Comput. 47 (2018) 456–492. [25] M. R. Garey, D. S. Johnson, Computers and Intractability, W. H. Freeman, New York, 1979. [26] A. S. Lapaugh, C. H. Papadimitriou, The even-path problem for graphs and digraphs, Networks 14 (1984) 507–513. [27] T. Lukasiewicz, E. Malizia, C. Molinaro, Explanations for inconsistency-tolerant query answering under existential rules, in: AAAI, 2020, pp. 2909–2916. [28] T. Lukasiewicz, E. Malizia, C. Molinaro, Explanations for negative query answers under inconsistency-tolerant semantics, in: IJCAI, 2022. [29] İ. İ. Ceylan, T. Lukasiewicz, E. Malizia, C. Molinaro, A. Vaicenavičius, Explanations for negative query answers under existential rules, in: KR, 2020, pp. 223–232. [30] İ. İ. Ceylan, T. Lukasiewicz, E. Malizia, C. Molinaro, A. Vaicenavičius, Preferred explanations for ontology-mediated queries under existential rules, in: AAAI, 2021, pp. 6262–6270. [31] İ. İ. Ceylan, T. Lukasiewicz, E. Malizia, A. Vaicenavičius, Explanations for ontology- mediated query answering in description logics, in: ECAI, 2020, pp. 672–679. [32] T. Lukasiewicz, E. Malizia, A. Vaicenavičius, Complexity of inconsistency-tolerant query answering in Datalog+/– under cardinality-based repairs, in: AAAI, 2019, pp. 2962–2969. [33] T. Lukasiewicz, E. Malizia, C. Molinaro, Complexity of approximate query answering under inconsistency in Datalog± , in: IJCAI, 2018, pp. 1921–1927. [34] S. Greco, C. Molinaro, Probabilistic query answering over inconsistent databases, Ann. Math. Artif. Intell. 64 (2012) 185–207. [35] M. Calautti, S. Greco, C. Molinaro, I. Trubitsyna, Preference-based inconsistency-tolerant query answering under existential rules, in: KR, 2020, pp. 203–212. [36] M. Calautti, L. Caroprese, S. Greco, C. Molinaro, I. Trubitsyna, E. Zumpano, Existential active integrity constraints, Expert Syst. Appl. 168 (2021) art. no. 114297. [37] S. Greco, C. Molinaro, I. Trubitsyna, Computing approximate query answers over incon- sistent knowledge bases, in: IJCAI, 2018, pp. 1838–1846. [38] T. Lukasiewicz, E. Malizia, On the complexity of 𝑚CP-nets, in: AAAI, 2016, pp. 558–564. [39] T. Lukasiewicz, E. Malizia, Complexity results for preference aggregation over (m)CP-nets: Pareto and majority voting, Artif. Intell. 272 (2019) 101–142. [40] T. Lukasiewicz, E. Malizia, Complexity results for preference aggregation over (m)CP-nets: Max and rank voting, Artif. Intell. 303 (2022) art. no. 103636. [41] G. Greco, E. Malizia, L. Palopoli, F. Scarcello, Non-transferable utility coalitional games via mixed-integer linear constraints, J. Artif. Intell. Res. 38 (2010) 633–685. �