Vol-3194/paper61
Jump to navigation
Jump to search
Paper
Paper | |
---|---|
edit | |
description | |
id | Vol-3194/paper61 |
wikidataid | →Q117344931 |
title | Investigating Monotone Abstractions |
pdfUrl | https://ceur-ws.org/Vol-3194/paper61.pdf |
dblpUrl | https://dblp.org/rec/conf/sebd/CimaCLP22 |
volume | Vol-3194→Vol-3194 |
session | → |
Investigating Monotone Abstractions
Investigating Monotone Abstractions Gianluca Cima2 , Marco Console1 , Maurizio Lenzerini1 and Antonella Poggi1 1 Sapienza Università di Roma 2 CNRS & University of Bordeaux Abstract In Ontology-based Data Management (OBDM), an abstraction of a source query q is a query over the ontology capturing the semantics of q in terms of the concepts and the relations available in the ontology. Since a perfect characterisation of a source query may not exist, the notions of best sound and complete approximations of an abstraction have been introduced and studied in the typical OBDM context, i.e., in the case where the ontology is expressed in DL-Lite, and source queries are expressed as unions of conjunctive queries (UCQs). Interestingly, if we restrict our attention to abstractions expressed as UCQs, even best approximations of abstractions are not guaranteed to exist. Thus, a natural question to ask is whether such limitations affect even larger classes of queries. In this paper, we answer this fundamental question for an essential class of queries, namely the class of monotone queries. We define a monotone query language based on disjunctive Datalog enriched with an epistemic operator, and show that its expressive power suffices for expressing the best approximations of monotone abstractions of UCQs. Keywords Abstraction, Disjunctive Datalog, Monotone queries, Epistemic queries 1. Introduction In Ontology-based Data Management (OBDM) [1], an ontology, i.e., a formal, logic-based repre- sentation of a domain of interest, is used to provide a high-level conceptual tool for accessing and managing the data sources of an information system. Suitable mappings declaratively specify the relationship between the data at the sources and the elements in the ontology, and this enables the user to carry out many relevant tasks on data through the lens of the ontology [2, 3]. Recent papers [4, 5, 6, 7, 8] address a novel issue in OBDM: starting from a query 𝑞𝒮 expressed over the sources, the goal is to find a so-called abstraction of 𝑞𝒮 [9], i.e., an ontology-based characterization of 𝑞𝒮 expressed in terms of the ontology elements, and whose answers coincide with the answer to the original query, modulo the ontology and the mapping. We encountered the need of abstraction during a joint project with a public statistical research institute. The institute’s departments must publish subsets of the data they gather in the form of semantically described linked open data. To compute the content of the datasets the departments execute suitable queries over the data sources mapped to a shared ontology. Notably, when the dataset is published, it must be documented through a SPARQL query expressed in terms of the ontology. SEBD 2022: The 30th Italian Symposium on Advanced Database Systems, June 19-22, 2022, Tirrenia (PI), Italy " gianluca.cima@u-bordeaux.fr (G. Cima); console@diag.uniroma1.it (M. Console); lenzerini@diag.uniroma1.it (M. Lenzerini); poggi@diag.uniroma1.it (A. Poggi) © 2022 Copyright for this paper by its authors. Use permitted under Creative Commons License Attribution 4.0 International (CC BY 4.0). CEUR Workshop Proceedings (CEUR-WS.org) �This task is currently done manually. The notion of abstraction perfectly captures this scenario and provides the formal tool for automating the process: given the query over the sources computing the content of the dataset, the abstraction of such query with respect to the mapping and the ontology is exactly the SPARQL query to be associated to the open dataset. Besides the above use case, abstraction can be the appropriate tool in various scenarios. For additional insights we refer to the references mentioned above. The first investigations on abstraction appear in [4, 5]. Both papers point out that the “perfect” abstraction of a union of conjunctive queries (UCQ) expressed over the data source not always exists, and present algorithms for computing such abstraction in the case where it both exists, and can be expressed as a UCQ over the ontology. In [4, 6] the notion of (sound and complete) approximations of the perfect abstraction is introduced, exactly to cope with situations in which perfectness cannot be achieved. Moreover, both papers make it clear that, for a given class of queries 𝐶, one is probably interested in two specific forms of approximations, called 𝐶- minimally complete and 𝐶-maximally sound abstractions. Based on these notions, [6] presents a thorough analysis of the verification problem (check whether a given query is a complete or sound abstraction) and the computation problem of UCQ-minimally complete and UCQ- maximally sound abstractions of UCQ source queries in OBDM systems based on DL-Lite. In [7] the computation problem is studied in the context of a specific class of non-monotone queries for expressing abstractions, and it is shown that this class can provide abstractions that are better than the one in the UCQ class. Thus, with the exception of [7], all the results on abstractions have been obtained under the assumptions that abstractions are expressed as UCQs over the ontology, and many of them originate from the observations that best approximations in the UCQ class are not guaranteed to exist. Thus, a natural issue to investigate is whether such limitations affect even larger classes of queries for expressing abstractions. The main goal of this paper is to address the following question: do approximations of perfect abstraction that are best in a fundamental class of queries, namely the class of monotone queries, always exist? Obviously, a related goal is to derive algorithms for computing approximations of abstractions that are best in the class of monotone queries, if they exist. Note that the class of monotone queries includes queries expressible in First-Order Logic and is therefore extremely important. In this paper we answer positively to the above-mentioned fundamental question. More specifically, the contributions of the paper can be summarized as follows. We present a general framework for abstraction in OBDM, based on the definition of queries as functions from the logical models of OBDM systems to sets of tuples. The framework includes a new monotone query language, called Datalog∨ K , based on disjunctive Datalog enriched with inequalities and an epistemic operator (Section 2). We consider a scenario where the OBDM specification 𝐽 is based on DL-LiteRDFS (i.e., the fragment of RDFS expressible in Description Logic), and show that, in the considered scenario, for any source UCQ 𝑞𝒮 , the best (sound or complete) approximations of the 𝐽-abstraction of 𝑞𝒮 in the class of monotone queries always exists and can be expressed in Datalog∨ K (Section 3). As a consequence, if the perfect abstraction exists and is in the class of monotone queries, then it can be expressed in Datalog∨ K (Section 4). This paper is an extended abstract of [10]. Hence, while we assume basic knowledge about databases [11] and Description Logics (DL) [12], for specific concepts and notations, we refer to the Preliminaries section of [10]. �2. Framework In what follows, we implicitly refer to an OBDM specification 𝐽 = ⟨𝒪, 𝒮, ℳ⟩, and when we denote a query by 𝑞𝒪 (resp., 𝑞𝒮 ) we mean that the query is a query for 𝐽 (resp., a source query), i.e., is over the signature of the ontology 𝒪 (resp., the schema 𝒮). We follow [6] for the basic definitions related to abstraction. 𝐽,𝐷 We say that 𝑞𝒪 is a perfect 𝐽-abstraction of 𝑞𝒮 if 𝑞𝒪 = 𝑞𝒮𝐷 , for each 𝒮-database 𝐷 consistent with 𝐽. As the condition for an ontology query to be a perfect abstraction of a source query is a strong one, it might be very well the case that a perfect abstraction of a source query does not exist. It is then reasonable to consider weaker notions, such as sound or complete approximations, of perfectness. 𝐽,𝐷 𝐽,𝐷 We say that 𝑞𝒪 is a complete (resp. sound) 𝐽-abstraction of 𝑞𝒮 if 𝑞𝒮𝐷 ⊆ 𝑞𝒪 (resp. 𝑞𝒪 ⊆ 𝑞𝒮𝐷 ), for each 𝒮-database 𝐷 consistent with 𝐽. Obviously, we might be interested in complete or sound abstractions that approximate 𝑞𝒮 at best, at least in the context of a specific class of queries. If ℒ𝒪 is a class of queries, we say that a query 𝑞𝒪 ∈ ℒ𝒪 is an ℒ𝒪 -minimally complete (resp., ℒ𝒪 -maximally sound) 𝐽-abstraction of 𝑞𝒮 if 𝑞𝒪 is a complete (resp., sound) 𝐽-abstraction of 𝑞𝒮 and there is no query 𝑞𝒪 ′ ∈ ℒ such that 𝒪 𝑞𝒪′ is a complete (resp., sound) 𝐽-abstraction of 𝑞 and 𝑞 ′ ⊏ 𝑞 (resp., 𝑞 ⊏ 𝑞 ′ ). 𝒮 𝒪 𝐽 𝒪 𝒪 𝐽 𝒪 We now introduce a new language, called Datalog∨ K , for expressing queries over OBDM specifications. The language is based on disjunctive Datalog, and is used in this paper for expressing abstractions. The basic component of a Datalog∨ K query is a rule. Assume two disjoint and countably infinite sets of predicates ℰ and 𝒫, called extensional and intensional, respectively. A Datalog∨ K rule has one of the following forms: • The typical form of disjunctive Datalog, i.e., ¯ ) → 𝛼1 (𝑥 𝛾(𝑥 ¯ 1 ) ∨ . . . ∨ 𝛼𝑛 (𝑥 ¯𝑛) (1) where 𝛾(𝑥 ¯ ) is a conjunction of relational atoms on the predicates of 𝒫 with ⃗𝑥 as variables, ¯ 𝑖 ) is a single relational atom whose predicate is in 𝒫 such that and for 𝑖 = 1, . . . , 𝑛, 𝛼𝑖 (𝑥 ¯𝑖 ⊆ 𝑥 𝑥 ¯, • A new form specified as follows ¯ , 𝑧¯) ∧ 𝜉(𝑥 K(∃𝑧¯.𝜑(𝑥 ¯ )) → 𝜓1 (𝑥 ¯ 1 ) ∨ . . . ∨ 𝜓𝑛 (𝑥 ¯𝑛) (2) where 𝜑 is a conjunction of relational atoms over ℰ, 𝜉(𝑥 ¯ ) is a conjunction of inequality atoms involving only variables from 𝑥 ¯ and for 𝑗 = 1, . . . , 𝑛, 𝜓𝑗 = ∃𝑦¯𝑗 .𝛾𝑗 (𝑥 ¯ 𝑗 , 𝑦¯𝑗 ), where 𝛾𝑗 is a conjunction of relational atoms on 𝒫. When 𝑥 ¯ 𝑗 contains only variables 𝑥 ¯ occurring in 𝜑(𝑥¯ , 𝑧¯), for 𝑗 = 1, . . . , 𝑛, we say that the rule is safe. An 𝑛-ary Datalog∨ ∨ K query 𝑞𝒪 over an OBDM specification 𝐽 is a finite set of DatalogK rules whose extensional predicates coincide with the alphabet of 𝒪, and whose intensional predicates include a special 𝑛-ary predicate 𝐴𝑛𝑠. We say that 𝑞𝒪 is safe if all of its rules are safe. The semantics of 𝑞𝒪 is provided relative to an OBDM system. Given an OBDM system ⟨𝐽, 𝐷⟩, an interpretation for 𝑞𝒪 is a pair 𝐼 = (𝑚𝑜𝑑(⟨𝐽, 𝐷⟩), 𝑓 ), where 𝑓 is a first-order interpretation �(with domain Const) for the predicates in 𝒫. As usual, we may also see 𝑓 as the set of facts {𝑝(𝑐¯) | ¯𝑐 ∈ 𝑝𝑓 }. We now define when 𝐼 satisfies a Datalog∨ K rule. • 𝐼 satisfies a rule of the form (1) if the first-order formula ∀𝑥 ¯ ) → 𝛼1 (𝑥 ¯ .𝛾(𝑥 ¯ 1 )∨. . .∨𝛼𝑛 (𝑥 ¯𝑛) is true in 𝑓 , • 𝐼 satisfies a rule of the form (2) if for all tuples ¯𝑐 of constants in Const, the fact that the first-order formula ∃𝑧¯.𝜑(𝑐¯, 𝑧¯) ∧ 𝜉(𝑐¯) is satisfied by every model in mod (⟨𝐽, 𝐷⟩) implies that 𝑓 satisfies the first-order formula ∃𝑦¯𝑗 .𝛾𝑗 (𝑐¯𝑗 , 𝑦¯𝑗 ), for some 𝑗 = 1, . . . , 𝑛. Observe that, if the rule is unsafe, ¯𝑐𝑗 may contain constants of Const that do not occur in ¯𝑐. An interpretation 𝐼 for 𝑞𝒪 is called a model of 𝑞𝒪 if all the rules of 𝑞𝒪 are satisfied by 𝐼. Finally, we define the notion of answers to an 𝑛-ary Datalog∨ K query 𝑞𝒪 w.r.t. an OBDM system ⟨𝐽, 𝐷⟩, 𝐽,𝐷 denoted by 𝑞𝒪 , as follows: {𝑐¯ ∈ Dom(𝐽, 𝐷, 𝑞𝒪 )𝑛 | ¯𝑐 ∈ 𝐴𝑛𝑠𝑓 for each model (𝑚𝑜𝑑(⟨𝐽, 𝐷⟩), 𝑓 ) of 𝑞𝒪 }. Let 𝐽 = ⟨𝒪, 𝒮, ℳ⟩ be an OBDM specification such that 𝒪 = ∅, 𝒮 = {𝑠1 /2, 𝑠2 /1, 𝑠3 /1} and ℳ = {m1 , m2 , m3 , m4 }, where: m1 : ∀𝑥1 , 𝑥2 .𝑠1 (𝑥1 , 𝑥2 ) → E(𝑥1 , 𝑥2 ) m2 : ∀𝑥.𝑠1 (𝑥, 𝑥) → SN(𝑥) m3 : ∀𝑥.𝑠2 (𝑥) → ∃𝑧.E(𝑥, 𝑧) m4 : ∀𝑥.𝑠3 (𝑥) → E(𝑥, 𝑥) The mapping ℳ of 𝐽 establishes how predicates in the schema 𝒮 relate to the ontology predicates E (which stands for edge) and SN (which stands for special node). Let us now consider the following safe Datalog∨ K query 𝑞𝒪 over 𝐽. 𝑞𝒪 returns all the pairs (𝑣1 , 𝑣2 ) of special nodes that are known to be distinct and such that there is a path from 𝑣1 to 𝑣2 passing only for nodes known to be special: K(E(𝑥1 , 𝑥2 ) ∧ SN(𝑥1 ) ∧ SN(𝑥2 )) → 𝑇1 (𝑥1 , 𝑥2 ) K(SN(𝑥1 ) ∧ SN(𝑥2 ) ∧ 𝑥1 ̸= 𝑥2 ) → 𝑇2 (𝑥1 , 𝑥2 ) 𝑇1 (𝑥1 , 𝑦) ∧ 𝑇1 (𝑦, 𝑥2 ) → 𝑇1 (𝑥1 , 𝑥2 ) 𝑇1 (𝑥1 , 𝑥2 ) ∧ 𝑇2 (𝑥1 , 𝑥2 ) → Ans(𝑥1 , 𝑥2 ) The following proposition shows that Datalog∨ K is a monotone query language. Every Datalog∨ K query over 𝐽 is in M 𝐽 , for every OBDM specification 𝐽. The semantics should make it clear that K is the knowledge operator in the S5 epistemic logic: the formula K𝐴 should be read as “𝐴 is known (i.e., logically implied) by the system” [13]. Therefore, when accessing the information modeled by ⟨𝐽, 𝐷⟩, a Datalog∨ K query extracts what is known by the system, and this characteristic is crucial for not falling into undecidability resulting from using Datalog rules jointly with Description Logics (see [14, 15]), as stated in the following proposition. Let Σ be an OBDM system. Answering safe Datalog∨ K queries w.r.t. Σ is decidable if and only if answering CQs w.r.t. Σ is decidable . 1 Although our framework is general enough to consider any DL for expressing ontologies and any query language for expressing source queries, in the rest of this paper we will carry out our investigation in the following setting: (𝑖) ontologies are expressed in DL-LiteRDFS , and (𝑖𝑖) source queries are expressed as UCQs. 1 With answering we implicitly refer to the associated recognition problem, i.e., check whether a tuple is in the answer to a query. � At this point, one may wonder whether Datalog∨ K is the right language to express monotone abstractions in this setting. While a thorough analysis of the language is outside the scope of the present paper, the following proposition provides a positive answer to this question, at least from the computational point of view. In our setting, (𝑖) answering safe Datalog∨K queries is in coNP in data complexity, and (𝑖𝑖) there exists an OBDM specification 𝐽 and a CQ 𝑞𝒮 such that, given an 𝒮-database 𝐷, answering the M-maximally sound 𝐽-abstraction of 𝑞𝒮 is coNP-hard in data complexity. To ease the presentation, from now on we assume that mappings and source queries do not mention constants. However, all our results can be straightforwardly adapted to the case where constants are allowed. 3. Monotone approximations of abstractions In this section we investigate the problem of the existence of M-minimally complete and M- maximally sound abstractions, in the restricted setting specified at the end of the previous section. In particular, we start by showing that M-minimally complete abstractions of UCQ source queries always exist and can be expressed in Datalog∨ K . Then, because of the lack of space, we will just mention the corresponding result on M-maximally sound abstractions. Given a CQ 𝑞𝒮 = {𝑥 ¯ , 𝑦¯)}, the subroutine SaturateQ(𝑞𝒮 ) computes a UCQ̸= in the ¯ | ∃𝑦¯.𝜑(𝑥 following way: for each possible unifier 𝜇 on the variables in 𝑥 ¯ ∪ 𝑦¯ such that 𝜇(𝑥) ∈ 𝑥 ¯ for each 𝑥∈𝑥 ¯ , SaturateQ(𝑞𝒮 ) contains a query obtained from 𝜇(𝑞𝒮 ) by adding the inequality atom 𝑡1 ̸= 𝑡2 for each pair of distinct variables 𝑡1 , 𝑡2 occurring in 𝜇(𝑞𝒮 ). For a UCQ 𝑞𝒮 , we denote by SaturateQ(𝑞𝒮 ) the UCQ̸= obtained by applying SaturateQ(𝑞) to each disjunct 𝑞 of 𝑞𝒮 . We write each CQ̸= 𝑞 generated by SaturateQ(𝑞𝒮 ) as 𝑞 = {𝑥 ¯ | ∃𝑦¯.𝜑(𝑥¯ , 𝑦¯) ∧ 𝜉(𝑥 ¯ , 𝑦¯)}, where 𝜑(𝑥 ¯ , 𝑦¯) and 𝜉(𝑥 ¯ , 𝑦¯) are the conjunctions of the relational atoms over 𝒮 and of the inequality atoms, respectively, occurring in the body of 𝑞. Moreover, for an OBDM specification 𝐽 = ⟨𝒪, 𝒮, ℳ⟩ and a CQ̸= 𝑞 = {𝑥 ¯ | ∃𝑦¯.𝜑(𝑥 ¯ , 𝑦¯) ∧ ∨ ¯ , 𝑦¯)} over 𝒮, we denote by 𝑟𝑞 the following DatalogK rule of form (2): 𝜉(𝑥 ¯ , 𝒴¯ )) → Ans(𝑥 𝑟𝑞 = K(∃𝑧¯.ℳ(𝑞) ∧ 𝜉(𝑥 ¯ ), where (i) ℳ(𝑞) is computed by simply ignoring the inequality atoms and chasing the set of relational atoms occurring in the body of 𝑞; (ii) 𝒴¯ ⊆ 𝑦¯ is the subset of the existential variables of 𝑞 occurring in ℳ(𝑞); (iii) 𝑧¯ are the fresh variables introduced when computing ℳ(𝑞); and (iv) 𝜉(𝑥¯ , 𝒴¯ ) is the conjunction of the inequality atoms obtained from 𝜉(𝑥 ¯ , 𝑦¯) by removing all those atoms of the form 𝑦 ̸= 𝑡 and 𝑡 ̸= 𝑦 in which 𝑦 is an existential variable occurring in 𝑦¯ but not in ¯ (i.e., not in ℳ(𝑞)) and 𝑡 is any other possible variable. Observe that the epistemic operator 𝒴 is exploited to bind the existential variables coming from 𝑞. This is achieved by pushing the subset 𝒴¯ of the existential variables 𝑦¯ of 𝑞 occurring in ℳ(𝑞) inside the K operator. We are now ready to present the algorithm M-MinComplete for computing the M-minimally complete 𝐽-abstractions. Given an OBDM specification 𝐽 = ⟨𝒪, 𝒮, ℳ⟩ and a UCQ 𝑞𝒮 over 𝒮 such that SaturateQ(𝑞𝒮 ) = 𝑞1 ∪ . . . ∪ 𝑞𝑛 , M-MinComplete(𝐽, 𝑞𝒮 ) outputs the Datalog∨ K query 𝑞𝒪 = {𝑟𝑞1 , . . . , 𝑟𝑞𝑛 } over 𝐽. � Consider the OBDM specification 𝐽 illustrated in Example 2 and the CQ 𝑞𝒮 = {(𝑥) | ∃𝑦.𝑠1 (𝑥, 𝑦)} over 𝒮. One can verify that M-MinComplete(𝐽, 𝑞𝒮 ) returns the following safe Datalog∨ K query 𝑞𝒪 over 𝐽 asking for all those nodes 𝑣 such that either 𝑣 is connected to a node 𝑣 ′ known to be different from 𝑣 or 𝑣 is a special node with a self-loop: K(E(𝑥, 𝑦) ∧ 𝑥 ̸= 𝑦) → Ans(𝑥) K(E(𝑥, 𝑥) ∧ SN(𝑥)) → Ans(𝑥) Note that 𝑞𝒪 is a better complete approximation than the query {(𝑥) | ∃𝑦.E(𝑥, 𝑦)}, which is the UCQ-minimally complete 𝐽-abstraction of 𝑞𝒮 [6]. M-MinComplete(𝐽, 𝑞𝒮 ) terminates and returns the unique (up to 𝐽-equivalence) M-minimally complete 𝐽-abstraction of 𝑞𝒮 . Furthermore, we observe that the result of M-MinComplete(𝐽, 𝑞𝒮 ) is independent from the assertions occurring in the ontology of the OBDM specification 𝐽. Similar results can be obtained for OBDM specifications based on more expressive Horn DL ontologies. Before concluding this section, we observe that M-MinComplete(𝐽, 𝑞𝒮 ) may return unsafe Datalog∨ K queries. Nevertheless, these queries enjoy nice computational properties in our setting. Let 𝑞𝒪 be a Datalog∨ K query with only rules of the form K(∃𝑧 ¯ , 𝑧¯) ∧ 𝜉(𝑥 ¯.𝜑(𝑥 ¯ )) → Ans(𝑥 ¯ 𝑎 ). In our setting, (i) answering 𝑞𝒪 is in PTime in data complexity, and (ii) if 𝑞𝒪 is safe, then it is 𝐽,𝐷 possible to compute a UCQ̸= 𝑞𝑟 over 𝒮 such that 𝑞𝒪 = 𝑞𝑟𝐷 , for each 𝒮-database 𝐷. As for best sound approximations of abstractions, it can be shown (cf. details in [10]) that, similarly to the case of best complete abstractions, an algorithm exists that, given a UCQ 𝑞𝒮 , terminates and computes the unique (up to 𝐽-equivalence) M-maximally sound 𝐽-abstraction of 𝑞𝒮 , expressed as a Datalog∨ K query. Thus, we have the following: Given a UCQ 𝑞𝒮 , both the unique (up to 𝐽-equivalence) M-minimally complete and the M-maximally sound 𝐽-abstraction of 𝑞𝒮 always exist and can be expressed as Datalog∨ K queries. Consider the OBDM specification 𝐽 illustrated in Example 2. One can verify that the following set ℛ of safe Datalog∨ K rules is the unique (up to 𝐽-equivalence) M-maximally sound 𝐽-abstraction of 𝑞𝒮 : K(E(𝑥1 , 𝑥2 ) ∧ 𝑥1 ̸= 𝑥2 ) → 𝑠1 (𝑥1 , 𝑥2 ) K(E(𝑥, 𝑥)) → 𝑠3 (𝑥) ∨ 𝑠1 (𝑥, 𝑥) K(∃𝑧.E(𝑥, 𝑧)) → 𝑠2 (𝑥) ∨ ∃𝑦.𝑠1 (𝑥, 𝑦) ∨ 𝑠3 (𝑥) K(SN(𝑥)) → 𝑠1 (𝑥, 𝑥) 4. Perfect Abstractions It follows from Theorem 3 that either the perfect abstraction of a source UCQ can be expressed in Datalog∨ K , or it cannot be expressed as a monotone query (if it exists at all). We now present an algorithm that, given an OBDM specification 𝐽 and a source UCQ 𝑞𝒮 , returns the perfect 𝐽-abstraction of 𝑞𝒮 , if and only if it exists and is in M. To this aim, we make use of Proposition 3, and refer to the 𝑞𝑟 defined in that proposition as the rewriting of 𝑞𝒪 w.r.t. 𝐽. Our algorithm, that we call M-Perfect, goes as follows. Given an OBDM specification 𝐽 = ⟨𝒪, 𝒮, ℳ⟩ and a UCQ 𝑞𝒮 over 𝒮 as input: if (i) 𝑞𝒪 = M-MinComplete(𝐽, 𝑞𝒮 ) is safe and �(ii) 𝑞𝑟 ⊑𝒮 𝑞𝒮 ; then return 𝑞𝒪 ; otherwise, report “no perfect 𝐽-abstraction of 𝑞𝒮 is in M”. In Example 3, M-Perfect(𝐽, 𝑞𝒮 ) returns 𝑞𝒪 , which is the perfect 𝐽-abstraction of 𝑞𝒮 . We conclude this section by establishing termination and correctness of the M-Perfect algorithm. M-Perfect(𝐽, 𝑞𝒮 ) terminates and returns the unique (up to 𝐽-equivalence) perfect 𝐽-abstraction of 𝑞𝒮 if and only if such an abstraction can be expressed in M. 5. Conclusion We presented a thorough study of monotone abstractions of UCQs in OBDM systems. We proved that best approximations of such abstractions always exist and introduced a query language, Datalog∨ K , that captures them. Directions for future work are many. In the context of monotone abstractions, we would like to investigate the case of more expressive ontology languages, e.g., DL-Liteℛ , as well as more expressive source query languages, e.g., unions of conjunctive queries with inequalities and disjunctive Datalog. Finally, the problem of checking whether given best approximations are expressible in simpler and more user friendly languages remains open. Acknowledgements This work has been partially supported by the ANR AI Chair INTENDED (ANR-19-CHIA-0014), by MIUR under the PRIN 2017 project “HOPE” (prot. 2017MMJJRE), and by the EU under the H2020-EU.2.1.1 project TAILOR, grant id. 952215. References [1] A. Poggi, D. Lembo, D. Calvanese, G. De Giacomo, M. Lenzerini, R. Rosati, Link- ing data to ontologies, Journal on Data Semantics X (2008) 133–173. doi:10.1007/ 978-3-540-77688-8\_5. [2] M. Lenzerini, Managing data through the lens of an ontology, AI Magazine 39 (2018) 65–74. [3] G. De Giacomo, D. Lembo, M. Lenzerini, A. Poggi, R. Rosati, Using ontologies for semantic data integration, in: A Comprehensive Guide Through the Italian Database Research Over the Last 25 Years, 2018, pp. 187–202. [4] G. Cima, Preliminary results on ontology-based open data publishing, in: Proceedings of the Thirtieth International Workshop on Description Logics (DL 2017), volume 1879 of CEUR Electronic Workshop Proceedings, 2017. [5] C. Lutz, J. Marti, L. Sabellek, Query expressibility and verification in ontology-based data access, in: Proceedings of the Sixteenth International Conference on the Principles of Knowledge Representation and Reasoning (KR 2018), 2018, pp. 389–398. [6] G. Cima, M. Lenzerini, A. Poggi, Semantic characterization of data services through ontologies, in: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI 2019), 2019, pp. 1647–1653. � [7] G. Cima, M. Lenzerini, A. Poggi, Non-monotonic ontology-based abstractions of data services, in: Proceedings of the Seventeenth International Conference on Principles of Knowledge Representation and Reasoning (KR 2020), 2020, pp. 243–252. [8] G. Cima, M. Console, M. Lenzerini, A. Poggi, Abstraction in Data Integration, in: Proceed- ings of the Thirty-Sixth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2021), 2021, pp. 1–11. [9] G. Cima, Abstraction in Ontology-based Data Management, Ph.D. thesis, Sapienza Univer- sity of Rome, 2020. [10] G. Cima, M. Console, M. Lenzerini, A. Poggi, Monotone abstractions in ontology-based data management, in: Proceedings of the Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI 2022), 2022. [11] S. Abiteboul, R. Hull, V. Vianu, Foundations of Databases, Addison Wesley Publ. Co., 1995. [12] F. Baader, D. Calvanese, D. McGuinness, D. Nardi, P. F. Patel-Schneider (Eds.), The Descrip- tion Logic Handbook: Theory, Implementation and Applications, Cambridge University Press, 2003. [13] H. J. Levesque, G. Lakemeyer, The Logic of Knowledge Bases, The MIT Press, 2001. [14] A. Y. Levy, M.-C. Rousset, Combining Horn rules and description logics in CARIN, Artificial Intelligence 104 (1998) 165–209. [15] D. Calvanese, R. Rosati, Answering recursive queries under keys and foreign keys is unde- cidable, in: Proceedings of the Tenth International Workshop on Knowledge Representation meets Databases (KRDB 2003), volume 79 of CEUR Electronic Workshop Proceedings, 2003. �