Table Of Content

7 1 0 2 n a J Efficiently Computing Provenance Graphs for 0 2 Queries with Negation ] B D . Seokki Lee, Sven Ko¨hler, Bertram Luda¨scher, Boris Glavic s c [ 1 IIT DB Group Technical Report v 9 IIT/CS-DB-2016-03 9 6 5 0 2016-10 . 1 0 7 1 : v i X r a http://www.cs.iit.edu/∼dbgroup/ LIMITED DISTRIBUTION NOTICE: The research presented in this report may be submitted as a whole or in parts for publication and will probably be copyrighted if accepted for publication. It has been issued as a Technical Report for early dissemination of its contents. In view of the transfer of copyright to the outside publisher, its distribution outside of IIT-DB prior to publication should be limited to peer communications and specific requests. After outside publication, requests should be filled only by reprints or legally obtained copies of the article (e.g. payment of royalties). Efficiently Computing Provenance Graphs for Queries with Negation Seokki Lee∗ Sven Ko¨hler† Bertram Luda¨scher‡ Boris Glavic∗ ∗Illinois Institute of Technology. {[email protected], [email protected]} ‡University of Illinois at Urbana-Champaign. {[email protected]} †University of California at Davis. {[email protected]} Abstract—Explainingwhyananswerisintheresultofaquery r1 :Q(X,Y):−Train(X,Z),Train(Z,Y),¬Train(X,Y) or why it is missing from the result is important for many RelationTrain ResultofqueryQ applications including auditing, debugging data and queries, fromCity toCity and answering hypothetical questions about data. Both types of s c nneewwyyoorrkk waschhinicgatgoondc washinXgtondc chiYcago questions, i.e., why and why-not provenance, have been studied chicago seattle newyork seattle extensively. In this work, we present the first practical approach w n seattle chicago seattle seattle for answering such questions for queries with negation (first- washingtondc seattle chicago chicago orderqueries).OurapproachisbasedonarewritingofDatalog Fig. 1: Example train connection database and query rules(calledfiringrules)thatcapturessuccessfulrulederivations within the context of a Datalog query. We extend this rewriting Q(n,s) tosupportnegationandtocapturefailedderivationsthatexplain missinganswers.Givena(whyorwhy-not)provenancequestion, r1(n,s,w) r1(n,s,c) wecomputeanexplanation,i.e.,thepartoftheprovenancethatis g1(n,w) g2(w,s) g3(n,s) g1(n,c) g2(c,s) 1 1 1 1 1 relevanttoanswerthequestion.Weintroduceoptimizationsthat prunepartsofaprovenancegraphearlyonifwecandetermine T(n,w) T(w,s) T(n,s) T(n,c) T(c,s) thattheywillnotbepartoftheexplanationforagivenquestion. Fig. 2: Provenance graph explaining WHYQ(n,s) We present an implementation that runs on top of a relational database using SQL to compute explanations. Our experiments demonstrate that our approach scales to large instances and significantly outperforms an earlier approach which instantiates (FO) queries (i.e., non-recursive Datalog with negation) and the full provenance to compute explanations. an efficient method for explaining a (missing) answer using SQL. Our approach is based on the observation that typically I. INTRODUCTION only a part of provenance, which we call explanation in this Provenance for relational queries records how results of a work, is actually relevant for answering the user’s provenance query depend on the query’s inputs. This type of information question about the existence or absence of a result. can be used to explain why (and how) a result is derived by a Example 1. Consider the relation Train in Fig.1 that stores query over a given database. Recently, approaches have been trainconnectionsintheUS.TheDatalogruler inFig.1com- developed that use provenance-like techniques to explain why 1 puteswhichcitiescanbereachedwithexactlyonetransfer,but atuple(orasetoftuplesdescribeddeclarativelybyapattern) notdirectly.Weusethefollowingabbreviationsinprovenance is missing from the query result. However, the two problems graphs:T=Train;n=NewYork;s=Seattle;w=Washington of computing provenance and explaining missing answers DC and c = Chicago. Given the result of this query, the user have been treated mostly in isolation. A notable exception may be interested to know why he/she is able to reach Seattle is [22] which computes causes for answers and non-answers. from New York (WHYQ(n,s)) with one intermediate stop but However, the approach requires the user to specify which not directly or why it is not possible to reach New York from missing inputs to consider as causes for a missing output. Seattle in the same fashion (WHYNOTQ(s,n)). Capturing provenance for a query with negation necessitates the unification of why and why-not provenance, because to An explanation for either type of question should justify explainaresultofthequerywehavetodescribehowexisting the existence (absence) of a result as the success (failure) to and missing intermediate results (via positive and negative derive the result through the rules of the query. Furthermore, subqueries,respectively)leadtothecreationoftheresult.This it should explain how the existence (absence) of tuples in the has also been recognized by Ko¨hler et al. [20]: asking why a database caused the derivation to succeed (fail). Provenance tuple t is absent from the result of a query Q is equivalent graphsprovidingthistypeofjustificationforWHYQ(n,s)and to asking why t is present in ¬Q. Thus, a provenance model WHYNOTQ(s,n) are shown in Fig.2 and Fig.3, respectively. thatsupportsquerieswithnegationnaturallysupportswhy-not Therearethreetypesofgraphnodes:rulenodes(boxeslabeled provenance.Inthispaper,wepresentaframeworkthatanswers with a rule identifier and the constant arguments of a rule why and why-not questions for queries with negation. To this derivation), goal nodes (rounded boxes labeled with a rule end, we introduce a graph model for provenance of first-order identifier and the goal’s position in the rule’s body), and tuple Q(s,n) in Fig.3) because there exists no connection from Seattle to WashingtonDC(atuplenodeT(s,w)inred),andWashington r1(s,n,w) r1(s,n,c) r1(s,n,s) r1(s,n,n) DC to New York (a tuple node T(w,n) in red). Note that the g1(s,w) g2(w,n) g2(c,n) g1(s,s) g2(s,n) g1(s,n) g2(n,n) successful goal ¬T(s,n) (there is no direct connection from 1 1 1 1 1 1 1 Seattle to New York) does not contribute to the failure of this T(s,w) T(w,n) T(c,n) T(s,s) T(s,n) T(n,n) derivation and, thus, is not part of the explanation. A failed Fig. 3: Provenance graph explaning WHYNOTQ(s,n) negatedgoalisexplainedbyanexistingtupleinthedatabase. Thatis,ifatuple(s,n)wouldexistintheTrainrelation,then an additional failed goal node g3(s,n) would be part of the 1 nodes (ovals). In these provenance graphs, nodes are either explanation and be connected to each failed rule derivation. colored as green (successful/existing) or red (failed/missing). Overview and Contributions. Provenance games [20], a Example 2. Consider the explanation (provenance graph in game-theoretical formalization of provenance for first-order Fig.2) for question WHYQ(n,s). Seattle can be reached from (FO) queries, also supports queries with negation. However, New York by either stopping in Washington DC or Chicago theapproachiscomputationallyexpensive,becauseitrequires and there is no direct connection between these two cities. instantiationofaprovenancegraphexplainingallanswersand Thesetwooptionscorrespondtotwosuccessfulderivationsfor missinganswers.Forinstance,theprovenancegraphproduced rule r with X=n, Y=s, and Z=w (or Z=c, respectively). by this approach for our toy example already contains more 1 In the provenance graph, there are two rule nodes denoting than 64 (=43) nodes (i.e., only counting nodes corresponding thesesuccessfulderivationsofQ(n,s)byruler .Aderivation to rule derivations), because there are 43 ways of binding 1 is successful if all goals in the body evaluate to true, i.e., values from adom(I)={c,n,s,w} to the 3 variables (X, Y, a successful rule node is connected to successful goal nodes and Z) of the rule r . Typically, most of the nodes will not 1 (e.g., r is connected to g1, the 1st goal in the rule’s body). endupbeingpartoftheexplanationfortheuser’sprovenance 1 1 A positive (negated) goal is successful if the corresponding question.Toefficientlycomputetheexplanation,weintroduce tuple is (is not) in the database. Thus, a successful goal node anewDatalogprogramwhichcomputespartoftheprovenance is connected to the node corresponding to the existing (green) graph of an explanation bottom-up. Evaluating this program or missing (red) tuple justifying the goal, respectively. over instance I returns the edge relation of an explanation. The main driver of our approach is a rewriting of Datalog Supporting negation and missing answers is quite challeng- rules that captures successful and failed rule derivations. This ing,becauseweneedtoenumerateallpotentialwaysofderiv- rewritingreplacestherulesoftheprogramwithso-calledfiring ing a missing answer (or intermediate result corresponding to rules. Firing rules for positive queries were first introduced a negated subgoal) and explain why each of these derivations in [19]. These rules are similar to other query instrumentation has failed. An important question in this respect is how to techniquesthathavebeenusedforprovenancecapturesuchas bound the set of missing answers to be considered. Under the therewriterulesofPerm[13].Oneofourmajorcontributions open world assumption, provenance would be infinite. Using is to extend this concept for negation and failed rule deriva- the closed world assumption, only values that exist in the tions which is needed to support Datalog with negation and database or are postulated by the query are used to construct missing answers. Firing rules provide sufficient information missing tuples. As is customary in Datalog, we refer to this forconstructingexplanations.However,tomakethisapproach set of values as the active domain adom(I) of a database efficient, we need to avoid capturing rule derivations that will instance I. We will revisit the assumption that all derivations not contribute an explanation, i.e., they are not connected to with constants from adom(I) are meaningful later on. the nodes corresponding to the provenance question in the Example3. TheexplanationforWHYNOTQ(s,n)isshownin provenancegraph.Weachievethisbypropagatinginformation Fig.3, i.e., why it is not true that New York is reachable from from the user’s provenance question throughout the query to Seattle with exactly one transfer, but not directly. The tuple prune rule derivations early on 1) if they do not agree with Q(s,n) is missing from the query result because all potential the constants in the question or 2) if we can determine that ways of deriving this tuple through the rule r have failed. In basedontheirsuccess/failurestatustheycannotbepartofthe 1 thisexample,adom(I)={c,n,s,w}and,thus,thereexistfour explanation.Forinstance,inourrunningexample,Q(n,s)can failed derivations of Q(s,n) choosing either of these cities as onlybeconnectedtosuccessfulderivationsoftheruler1 with the intermediate stop between Seattle and New York. A rule X=nandY=s.Wehavepresentedaproof-of-conceptversion derivation fails if at least one goal in the body evaluates to of our approach as a poster [21]. Our main contributions are: false.Intheprovenancegraph,onlyfailedgoalsareconnected • We introduce a provenance graph model for full first- to the failed rule derivations explaining missing answers. order (FO) queries, expressed as non-recursive Datalog Failedpositivegoalsinthebodyofafailedruleareexplained queries with negation (or Datalog for short). by missing tuples (red tuple nodes). For instance, we cannot • We extend the concept of firing rules to support negation reach New York from Seattle with an intermediate stop in and missing answers. Washington DC (the first failed rule derivation from the left • We present an efficient method for computing explana- tionstoprovenancequestions.Unlikethesolutionin[20], variables,weuser(c ,...,c )todenotethederivationthatis 1 n our approach avoids unnecessary work by focusing the the result of binding X =c . We call a derivation successful i i computation on relevant parts of the provenance graph. wrt. an instance I if each atom in the body of the rule is true • We prove the correctness of our algorithm that computes in I and failed otherwise. the explanation to a provenance question. B. Negation and Domains • We present a full implementation of our approach in the GProM [1] system. Using this system, we compile To be able to explain why a tuple is missing, we have to Datalog into relational algebra expressions, and translate enumerateallfailedderivationsofthistupleand,foreachsuch the expressions into SQL code that can be executed by a derivation, explain why it failed. As mentioned in Sec. I, the standard relational database backend. question is what is a feasible set of potential answers to be The remainder of this paper is organized as follows. We considered as missing. While the size of why-not provenance formally define the problem in Sec.II, discuss related work in istypicallyinfiniteundertheopenworldassumption,wehave Sec.III, and present our approach for computing explanations to decide how to bound the set of missing answers in the in Sec.IV. We then discuss our implementation (Sec.V), closed world assumption. We propose a simple, yet general, present experiments (Sec.VI), and conclude in Sec.VII. solution by assuming that each attribute of an IDB or EDB relation has an associated domain. II. PROBLEMDEFINITION Definition 1 (Domain Assignment). Let S ={R ,...,R } be We now formally define the problem addressed in this 1 n a database schema where each R (A ,...,A ) is a relation work: how to find the subgraph of a provenance graph for i 1 m schema. Given an instance I of S, a domain assignment dom a given query (input program) P and instance I that explains is a function that associates with each attribute R.A a domain existence/absence of a tuple in/from the result of P. of values. We require dom(R.A)⊇adom(R.A). A. Datalog In our approach, the user specifies each dom(R.A) as a A Datalog program P consists of a finite set of rules query dom that returns the set of admissible values for R.A ri : R(X(cid:126)):−R1(X(cid:126)1),..., Rn(X(cid:126)n) where X(cid:126)j denotes a tuple the domain of attribute R.A. We provide reasonable defaults of variables and/or constants. We assume that the rules of to avoid forcing the user to specify dom for every attribute, a program are labeled r1 to rm. R(X(cid:126)) is the head of the e.g., dom(R.A)=adom(R.A) for unspecified domR.A. These rule, denoted head(ri), and R1(X(cid:126)1),...,Rn(X(cid:126)n) is the body associated domains fulfill two purposes: 1) to reduce the (each Rj(X(cid:126)j) is a goal). We use vars(ri) to denote the size of explanations and 2) to avoid semantically meaningless set of variables in ri. In this paper, consider non-recursive answers. For instance, if there would exist another attribute Datalogwithnegation,sogoalsRj(X(cid:126)j)inthebodyareliterals, Price in the relation Train in Fig.1, then adom(I) would i.e., atoms A(X(cid:126) ) or their negation ¬A(X(cid:126) ). Recursion is not also include all the values that appear in this attribute. Thus, j j allowed. All rules r of a program have to be safe, i.e., every some failed rule derivations for r would assign prices to the 1 variable in r must occur positively in r’s body (thus, head variable representing intermediate stops. Different attributes variables and variables in negated goals must also occur in a may represent the same type of entity (e.g., fromCity and positivegoal).Forexample,Fig.1showsaDatalogquerywith toCity in our example) and, thus, it would make sense to a single rule r . Here, head(r ) is Q(X,Y) and vars(r ) is use their combined domain values when constructing missing 1 1 1 {X,Y,Z}. The rule is safe since the head variables and the answers.Fornow,weleaveituptotheusertospecifyattribute variables in the negated goal also occur positively in the body domains. Using techniques for discovering semantic relation- (X and Y in both cases). The set of relations in the schema ships among attributes to automatically determine feasible over which P is defined is referred to as the extensional attribute domains is an interesting avenue for future work. database (EDB), while relations defined through rules in P When defining provenance graphs in the following, we form the intensional database (IDB), i.e., the IDB relations are only interested in rule derivations that use constants are those defined in the head of rules. We require that P has from the associated domains of attributes accessed by the a distinguished IDB relation Q, called the answer relation. rule. Given a rule r and variable X used in this rule, let GivenP andinstanceI,weuseP(I)todenotetheresultofP attrs(r,X) denote the set of attributes that variable X is evaluated over I. Note that P(I) includes the instance I, i.e., bound to in the body of the rule. For instance, in Fig.1, allEDBatomsthataretrueinI.ForanEDBorIDBpredicate attrs(r ,Z)={Train.fromCity,Train.toCity}. We say a 1 R, we use R(I) to denote the instance of R computed by P rule derivation r(c ,...,c ) is domain grounded iff c ∈ 1 n i (cid:84) and R(t)∈P(I) to denote that t∈R(I) according to P. dom(A) for all i∈{1,...,n}. A∈attrs(r,Xi) We use adom(I) to denote the active domain of instance C. Provenance Graphs I, i.e., the set of all constants that occur in I. Similarly, we use adom(R.A) to denote the active domain of attribute A of Provenance graphs justify the existence or absence of a relation R. In the following, we make use of the concept of a query result based on the success or failure of derivations rule derivation. A derivation of a rule r is an assignment of using a query’s rules, respectively. They also explain how variables in r to constants from adom(I). For a rule with n the existence or absence of tuples in the database caused derivations to succeed or fail, respectively. Here, we present a compact provenance graphs. For instance, observe that the constructive definition of provenance graphs that provide this node g3(n,s) is shared by two rule nodes in the explanation 1 type of justification. Nodes in these graphs carry two types of shown in Fig2. labels: 1) a label that determines the node type (tuple, rule, D. Questions and Explanations or goal) and additional information, e.g., the arguments and rule identifier of a derivation; 2) the success/failure status of Recall that the problem we address in this work is how nodes. to explain the existence or absence of (sets of) tuples using provenancegraphs.Suchasetoftuplesiscalledaprovenance Definition 2 (ProvenanceGraph). LetP beafirst-order(FO) question (PQ) in this paper. The two questions presented in query,I adatabaseinstance,dom adomainassignmentforI, Example1useconstantsonly,butwealsosupportprovenance andLthedomaincontainingallstrings.Theprovenancegraph questionswithvariables,e.g.,foraquestionQ(n,X)wewould PG(P,I)isagraph(V,E,L,S)withnodesV,edgesE,and return all explanations for existing or missing tuples where nodelabellingfunctionsL:V →LandS :V →{T,F}.We the first attribute is n, i.e., why or why-not a city X can be require that ∀v,v(cid:48) ∈ V : L(v) = L(v(cid:48)) → v = v(cid:48). PG(P,I) reachedfromNewYorkwithonetransfer,butnotdirectly.We is defined as follows: say a tuple t(cid:48) of constants matches a tuple t of variables and • Tuple nodes: For each n-ary EDB or IDB predicate R constants written as t(cid:48) (cid:50) t if we can unify t(cid:48) with t, i.e., we and tuple (c ,...,c ) of constants from the associated 1 n can equate t(cid:48) with t by applying a valuation that substitutes domains (c ∈dom(R.A )), there exists a node v labeled i i variables in t with constants from t(cid:48). R(c ,...,c ). S(v) = T iff R(c ,...,c ) ∈ P(I) and 1 n 1 n S(v)=F otherwise. Definition 3 (Provenance Question). Let P be a query, I an • Rule nodes: For every successful domain grounded instance, and Q an IDB predicate. A provenance question derivation r (c ,...,c ), there exists a node v in V la- PQ is an atom Q(t) where t = (v ,...,v ) is a tuple i 1 n 1 n beled r (c ,...,c ) with S(v)=T. For every failed do- consisting of variables and constants from the associated i 1 n main grounded derivation ri(c1,...,cn) where head(ri domain (dom(Q.A) for each attribute Q.A). WHYQ(t) and (c1,...,cn))(cid:54)∈P(I), there exists a node v as above but WHYNOTQ(t) restrict the question to existing and missing withS(v)=F.Inbothcases,v isconnectedtothetuple tuples t(cid:48) (cid:50)t, respectively. node head(r (c ,...,c )). i 1 n In Example1, we have presented subgraphs of PG(P,I) as • Goal nodes: Let v be the node corresponding to a explanationsforPQs,implicitlyclaimingthatthesesubgraphs derivation r (c ,...,c ) with m goals. If S(v) = T, i 1 n aresufficientforexplainingthePQ.Below,weformallydefine then for all j ∈ {1,...,m}, v is connected to a goal this type of explanation. node v labeled gj with S(v ) = T. If S(v) = F, then j i j for all j ∈{1,...,m}, v is connected to a goal node vj Definition4(Explanation). TheexplanationEXPL(P,Q(t),I) withS(v )=F ifthejth goalisfailedinr (c ,...,c ). for Q(t) (PQ) according to P and I, is the subgraph of j i 1 n Each goal is connected to the corresponding tuple node. PG(P,I)containingonlynodesthatareconnectedtoatleast one node Q(t(cid:48)) where t(cid:48) (cid:50) t. For WHYQ(t), only existing Our provenance graphs model query evaluation by con- tuples t(cid:48) are considered to match t. For WHYNOTQ(t) only struction. A tuple node R(t) is successful in PG(P,I) iff missing tuples are considered to match t. R(t) ∈ P(I). This is guaranteed, because each tuple built fromvaluesoftheassociateddomainexistsasanodev inthe Given this definition of explanation, note that 1) all nodes graphanditslabelS(v)isdecidedbasedonR(t)∈P(I).Fur- connected to a tuple node matching the PQ are relevant for thermore, there exists a successful rule node r((cid:126)c)∈PG(P,I) computingthistupleand2)onlynodesconnectedtothisnode iff the derivation r((cid:126)c) succeeds for I. Likewise, a failed rule are relevant for the outcome. Consider t(cid:48) where t(cid:48) (cid:50) t for a node r((cid:126)c) exists iff the derivation r((cid:126)c) is failed over I and Q(t) (PQ). If Q(t(cid:48)) ∈ P(I), then all successful derivations head(r((cid:126)c))(cid:54)∈P(I). Fig.2 and3 show subgraphs of PG(P,I) with head Q(t(cid:48)) justify the existence of t(cid:48) and these are for the query from Fig.1. Since Q(n,s) ∈ P(I) (Fig.2), precisely the rule nodes connected to Q(t(cid:48)) in PG(P,I). If this tuple node is connected to all successful derivations with Q(t(cid:48))(cid:54)∈P(I),thenallderivationswithheadQ(t(cid:48))havefailed Q(n,s) in the head which in turn are connected to goal nodes and are connected to Q(t) in the provenance graph. Each for each of the three goals of rule r . Q(s,n)∈/ P(I) (Fig.3) such derivation is connected to all of its failed goals which 1 and, thus, its node is connected to all failed derivations with are responsible for the failure. Now, if a rule body references Q(s,n)asahead.Here,wehaveassumedthatallcitiescanbe IDB predicates, then the same argument can be applied to considered as starting and end points of missing train connec- reason that all rules directly connected to these tuples explain tions, i.e., both dom(T.fromCity) and dom(T.toCity) are why they (do not) exist. Thus, by induction, the explanation defined as adom(T.fromCity)∪adom(T.toCity). Thus, we contains all relevant tuple and rule nodes that explain the PQ. have considered derivations r (s,n,Z) for Z ∈{c,n,s,w}. 1 Animportantcharacteristicofourprovenancegraphsisthat III. RELATEDWORK eachnodev inagraphisuniquelyidentifiedbyitslabelL(v). Our provenance graphs have strong connections to other Thus, common subexpressions are shared leading to more provenance models for relational queries, most importantly provenance games and the semiring framework, and to ap- explain a missing answer based on the query [3], [4], [6], proaches for explaining missing answers. [25] (i.e., which operators filter out tuples that would have contributed to the missing answer) or based on the input ProvenanceGames.Provenancegames[20]modeltheevalu- data [16], [17] (i.e., what tuples need to be inserted into ationofagivenquery(inputprogram)P overaninstanceI as the database to turn the missing answer into an answer). a 2-player game in a way that resembles SLD(NF) resolution. The missing answer problem was first stated for query-based If the position (a node in the game graph) corresponding to explanations in the seminal paper by Chapman et al. [6]. a tuple t is won (the player starting in this position has a Huang et al. [17] first introduced an instance-based approach. winning strategy), then t ∈ P(I) and if the position is lost, Sincethen,severaltechniqueshavebeendevelopedtoexclude then t (cid:54)∈ P(I). By virtue of supporting negation, provenance spuriousexplanations,tosupportlargerclassesofqueries[16], games can uniformly answer why and why-not questions for and to support distributed Datalog systems in Y! [26]. The queries with negation. However, provenance games may be approachesforinstance-basedexplanations(withtheexception hardtocomprehendfornon-powerusersastheyrequiresome ofY!)haveincommonthattheytreatthemissinganswerprob- backgroundingametheorytobeunderstood,e.g.,thewon/lost lem as a view update problem: the missing answer is a tuple status of derivations in a provenance game is contrary to thatshouldbeinsertedintoaviewcorrespondingtothequery what may be intuitively expected. That is, a rule node is andthisinserthastobetranslatedtoaninsertintothedatabase lost if the derivation is successful. The status of rule nodes instance.Anexplanationisthenoneparticularsolutiontothis in our provenance graphs matches the intuitive expectation view update problem. In contrast to these previous works, our (e.g., the rule node is successful if the derivation exists). provenance graphs explain missing answers by enumerating Ko¨hler et al. [20] also present an algorithm that computes the all failed rule derivations that justify why the answer is not provenancegameforP andI.However,thisapproachrequires in the result. Thus, they are arguably a better fit for use cases instantiation of the full game graph (which enumerates all such as debugging queries, where in addition to determining existing and missing tuples) and evaluation of a recursive Datalog¬ program over this graph using the well-founded which missing inputs justify a missing answer, the user also needstounderstandwhyderivationshavefailed.Furthermore, semantics [11]. In constrast, our approach computes expla- we do support queries with negation. Importantly, solutions nations that are succinct subgraphs containing only relevant for view update missing answer problems can be extracted provenance. We use bottom-up evaluation instrumented with from our provenance graphs. Thus, in a sense, provenance firingrulestocaptureprovenance.Furthermore,weenablethe graphs with our approach generalize some of the previous user to restrict provenance for missing answers. approaches(fortheclassofqueriessupported,e.g.,wedonot Database Provenance. Several provenance models for supportaggregationyet).Interestingly,recentworkhasshown database queries have been introduced in related work, e.g., that it may be possible to generate more concise summaries see [7], [18]). The semiring annotation framework generalizes of provenance games [12], [24] which is particularly useful thesemodelsforpositiverelationalalgebra(and,thus,positive for negation and missing answers to deal with the potentially non-recursive Datalog). In this model, tuples in a relation are largesizeoftheresultingprovenance.Similarly,somemissing annotated with elements from a commutative semiring K. An answer approaches [16] use c-tables to compactly represent essential property of the K-relational model is that semiring setsofmissinganswers.Theseapproachesarecomplementary N[X],thesemiringofprovenancepolynomials,generalizesall to our work. other semirings. It has been shown in [20] that provenance games generalize N[X] for positive queries and, thus, all ComputingProvenanceDeclaratively.Theconceptofrewrit- ingaDatalogprogramusingfiringrulestocaptureprovenance other provenance models expressible as semirings. Since our as variable bindings of rule derivations was introduced by graphs are equivalent to provenance games in the sense that Ko¨hler et al. [19] for provenance-based debugging of positive there exist lossless transformations between both models (the Datalog queries. These rules are also similar to relational discussion is beyond the scope of this paper), our graphs also encode N[X]. Provenance graphs which are similar to our implementations of provenance polynomials in Perm [13], LogicBlox [14], and Orchestra [15]. Zhou et al. [27] leverage graphs restricted to positive queries have been used as graph suchrulesforthedistributedExSPANsystemusingeitherfull representationsofsemiringprovenance(e.g.,see[8],[9],[18]). propagation or reference based provenance. An extension of Both our graphs and the boolean circuits representation of firing rules for negation is the main enabler of our approach. semiring provenance [9] explicitly share common subexpres- sionsintheprovenance.However,whilethesecircuitssupport recursive queries, they do not support negation. Exploring the IV. COMPUTINGEXPLANATIONS relationship of provenance graphs for queries with negation Recall from Sec. I that our approach generates a new and m-semirings (semirings with support for set difference) is Datalog program GP(P,Q(t),I) by rewriting a given query an interesting avenue for future work. Justifications for logic (input program) P to return the edge relation of explana- programs [23] are also closely related. tion EXPL(P,Q(t),I) for a provenance question Q(t) (PQ). Why-not and Missing Answers. Approaches for explaining In this section, we explain how to generate the program missing answers can be classified based on whether they GP(P,Q(t),I) using the following steps: 1. We unify the input program P with the PQ by propagating Algorithm 1 Unify Program With PQ theconstantsinttop-downthroughouttheprogramtobeable 1: procedure UNIFYPROGRAM(P, Q(t)) to later prune irrelevant rule derivations. 2: todo←[Q(t)] 3: done←{} 2. Afterwards, we determine for which nodes in the graph 4: PU =[] we can infer their success/failure status based on the PQ. We 5: while todo(cid:54)=[] do model this information as annotations on heads and goals of 6: a←POP(todo) rules and propagate these annotations top-down. 7: INSERT(done,a) 3. Based on the annotated and unified version created in the 8: rules←GETRULESFORATOM(P,a) 9: for all r∈rules do previous steps, we generate firing rules that capture variable 10: unRule←UNIFYRULE(r,a) bindings for successful and failed rule derivations. 11: PU ←PU::unRule 4. To be in the result of one of the firing rules obtained in 12: for all g∈body(unRule) do the previous step is a necessary, but not sufficient, condition 13: if g(cid:54)∈done then todo←todo::g forthecorrespondingPG(P,I)fragmenttobeconnectedtoa 14: return PU node matching the PQ. To guarantee that only relevant frag- ments are returned, we introduce additional rules that check connectivity to confirm whether each fragment is connected. that agree with this binding can derive tuples that agree with 5. Finally, we create rules that generate the edge relation of this binding. Based on this unification step, we know which the provenance graph (i.e., EXPL(P,Q(t),I)) based on the bindingsmayproducefragmentsofPG(P,I)thatarerelevant rule binding information that the firing rules have captured. for explaining the PQ. The algorithm implementing this step In the following, we will explain each step in detail and is given as Algorithm1. illustrate its application based on the question WHYQ(n,s) B. Add Annotations based on Success/Failure from Example1, i.e., why is New York connected to Seattle via a train connection with one intermediate stop, but is not ForWHY andWHYNOT questions,weonlyconsidertuples that are existing and missing, respectively. Based on this directly connected to Seattle. information, we can infer restrictions on the success/failure A. Unify the Program with PQ status of nodes in the provenance graph that are connected to PQ node(s) (belong to the explanation). We store these The node Q(n,s) in the provenance graph (Fig.2) is only restrictions as annotations T, F, and F/T on heads and goals connected to rule derivations which return Q(n,s). For in- of rules. Here, T indicates that we are only interested in stance,ifvariableX isboundtoanothercityx(e.g.,Chicago) successfulnodes,F thatweareonlyinterestedinfailednodes, in a derivation of the rule r , then this rule cannot return 1 andF/T thatweareinterestedinboth.Theseannotationsare the tuple (n,s). This reasoning can be applied recursively to determinedusingatop-downpropagationseededwiththePQ. replacevariablesinruleswithconstants.Thatis,weunifythe rules in the program top-down with the PQ. This step may Example 5. Continuing with our running example question produce multiple duplicates of a rule with different bindings. WHYQ(n,s), we know that Q(n,s) is successful because Weusesuperscriptstomakeexplicitthevariablebindingused the tuple is in the result (Fig.1). This implies that only by a replica of a rule. successful rule nodes and their successful goal nodes can be connected to this tuple node. Note that this annotation does Example 4. Given the question WHYQ(n,s), we unify the not imply that the rule r would be successful for every Z single rule r using the assignment (X=n,Y=s): 1 1 (i.e., every intermediate stop between New York and Seattle). r(X=n,Y=s) :Q(n,s):−T(n,Z),T(Z,s),¬T(n,s) It only indicates that it is sufficient to focus on successful rule 1 derivations since failed ones cannot be connected to Q(n,s). We may have to create multiple partially unified versions of a rule or an EDB atom. For example, to explore suc- r(X=n,Y=s),T :Q(n,s)T :−T(n,Z)T,T(Z,s)T,¬T(n,s)T 1 cessful derivations of Q(n,s) we are interested in both train We now propagate the annotations of the goals in r through- connections from New York to some city (T(n,Z)) and from 1 outtheprogram.Thatis,foranygoalthatisanIDBpredicate, any city to Seattle (T(Z,s)). Furthermore, we need to know we propagate its annotation to the head of all rules deriving whetherthereisadirectconnectionfromNewYorktoSeattle the goal’s predicate and, then, propagate these annotations (T(n,s)). The general idea of this step is motivated by [2] to the corresponding rule bodies. Note that the inverted which introduced “magic sets”, an approach for rewriting annotation is propagated for negated goals. For instance, if logical rules to cut down irrelevant facts by using additional T would be an IDB predicate, then the annotation on the predicates called “magic predicates”. Similar techniques exist goal ¬T(n,s)T would be propagated as follows. We would instandardrelationaloptimizationunderthenameofpredicate annotatetheheadofallrulesderivingT(n,s)withF,because move around. We use the technique of propagating variable Q(n,s) can only exist if T(n,s) does not exist. bindings in the query to restrict the computation based on the user’s interest. This approach is correct because if we Partially unified atoms (such as T(n,Z)) may occur in bind a variable in the head of rule, then only rule derivations both negative and positive goals of the rules of the program. Algorithm 2 Success/Failure Annotations F (n,s):−F (n,s,Z) Q,T r1,T 1: procedure ANNOTPROGRAM(PU, Q(t)) F (n,s,Z):−F (n,Z),F (Z,s),F (n,s) 2: state←typeof(Q(t)) r1,T T,T T,T T,F 3: todo←[Q(t)state] F (n,Z):−T(n,Z) T,T 4: done←{} 5: PA =[] FT,T(Z,s):−T(Z,s) 6: while todo(cid:54)=[] do F (n,s):−¬T(n,s) T,F 7: a←POP(todo) 8: state←typeof(a) Fig. 4: Example firing rules for WHYQ(n,s) 9: INSERT(done,a) 10: rules←GETUNRULESFORATOM(P,a) 11: for all r∈rules do firing rule consists of the body of the rule (but using the 12: annotRule←ANNOTRULE(r,state) 13: PA ←PA::annotRule firing version of each predicate in the body) and a new head 14: for all g∈body(annotRule) do predicate that contains all variables used in the rule. In this 15: if state = F then way, the firing rule captures all the variable bindings of a rule 16: newstate←F/T derivation. Furthermore, for each IDB predicate R that occurs 17: else as a head of a rule r, we create a firing rule that has the 18: newstate←state 19: if ISNEGATED(g) then firing version of predicate R in the head and a firing version 20: newstate←SWITCHSTATE(state) of the rules r deriving the predicate in the body. For EDB 21: if gnewstate (cid:54)∈done∧ISIDB(g) then predicates, we create firing rules that have the firing version 22: todo←todo::gnewstate ofthepredicateintheheadandtheEDBpredicateinthebody. 23: for all r∈PA do 24: if typeof(r)=F/T then 25: PA ←REMOVEANNOTATEDRULES(PA,r,{F,T}) Example6. ConsidertheannotatedprograminExample5for 26: return PA the question WHYQ(n,s). We generate the firing rules shown in Fig.4. The firing rule for r(X=n,Y=s),T (the second rule 1 fromthetop)isderivedfromtheruler byaddingZ (theonly 1 We denote such atoms using a F/T annotation. The use existential variable) to the head, renaming the head predicate of these annotations will become more clear in the next as F , and replacing each goal with its firing version (e.g., r1,T subsection when we introduce firing rules. The pseudocode F for the two positive goals and F for the negated goal). T,T T,F forthealgorithmthatdeterminestheseannotationsisgivenas Notethatnegatedgoalsarereplacedwithfiringrulesthathave Algorithm2. In short: invertedannotations(e.g.,thegoal¬T(n,s)T isreplacedwith F (n,s)). Furthermore, we introduce firing rules for EDB 1) Annotate the head of all rules deriving tuples matching T,F tuples (the three rules from the bottom in Fig.4) the question with T (why) or F (why-not). 2) Repeat the following steps until a fixpoint is reached: As mentioned in Sec. III, firing rules for successful rule a) Propagate the annotation of a rule head to goals in derivations have been used for declarative debugging of posi- the rule body as follows: propagate T for T annotated tive Datalog programs [19] and, for non-recursive queries, are heads and F/T for F annotated heads. essentially equivalent to rewrite rules that instrument a query b) Foreachannotatedgoalintherulebody,propagateits tocomputeprovenancepolynomials[1],[13].Weextendfiring annotation to all rules that have this atom in the head. rules to support queries with negation and capture missing For negated goals, unless the annotation is F/T, we answers. To construct a PG(P,I) fragment corresponding propagatetheinvertedannotation(e.g.,F forT)tothe to a missing tuple, we need to find failed rule derivations head of rules deriving the goal’s predicate. with the tuple in the head and ensure that no successful derivations exist with this head (otherwise, we may capture C. Creating Firing Rules irrelevant failed derivations of existing tuples). In addition, To be able to compute the relevant subgraph of PG(P,I) we need to determine which goals are failed for each failed (the explanation) for the provenance question PQ, we need to rule derivation because only failed goals are connected to the determine successful and/or failed rule derivations. Each rule node representing the failed rule derivation in the provenance derivation paired with the information whether it is successful graph. To capture this information, we add additional boolean over the given database instance (and which goals are failed variables - V for goal gi - to the head of a firing rule that i in case it is not successful) corresponds to a certain subgraph. record for each goal whether it failed or not. The body of a Successful rule derivations are always part of PG(P,I) for a firing rule for failed rule derivations is created by replacing givenquery(inputprogram)P whereasfailedrulederivations everygoalinthebodywithitsF/T firingversion,andadding only appear if the tuple in the head failed, i.e., there are no the firing version of the negated head to the body (to ensure successful derivations of any rule with this head. To capture thatonlybindingsformissingtuplesarecaptured).Firingrules the variable bindings of successful/failed rule derivations, capturing failed derivations use the F/T firing versions of we create “firing rules”. For successful rule derivations, a their goals because not all goals of a failed derivation have F (s,n):−¬F (s,n) Algorithm 3 Create Firing Rules Q,F Q,T FQ,T(s,n):−Fr1,T(s,n,Z) 12:: proPcFediruer←e C[]REATEFIRINGRULES(PA, Q(t)) Fr1,F(s,n,Z,V1,V2,¬V3):−FQ,F(s,n),FT,F/T(s,Z,V1), 43:: tsotadtoe←←[Qty(pte)ostfa(teQ](t)) F (Z,n,V ),F (s,n,V ) T,F/T 2 T,F/T 3 5: done←{} Fr1,T(s,n,Z):−FT,T(s,Z),FT,T(Z,n),FT,F(s,n) 67:: whRile(tt)oσd←o(cid:54)=PO[]Pd(toodo) (cid:46) create rules for a predicate FT,F/T(s,Z,true):−FT,T(s,Z) 8: INSERT(done,R(t)σ) F (s,Z,false):−F (s,Z) 9: if ISEDB(R) then T,F/T T,F 10: CREATEEDBFIRINGRULE(PFire,R(t)σ) FT,T(s,Z):−T(s,Z) 11: else FT,F(s,Z):−domT.toCity(Z),¬T(s,Z) 1123:: rCuRlEeAsT←EIDGEBTNREUGLREUS(LRE((Pt)Fσi)re,R(t)σ) Fig. 5: Example firing rules for WHYNOTQ(s,n) 14: for all r∈rules do (cid:46) create firing rule for r 15: args←(vars(r)−vars(head(r))) 16: args←args(head(r))::args 17: CREATEIDBPOSRULE(PFire,R(t)σ,r,args) to be failed and the failure status determines whether the 18: CREATEIDBFIRINGRULE(PFire,R(t)σ,r,args) corresponding goal node is part of the explanation. A firing 19: return PFire rule capturing missing IDB or EDB tuples may not be safe, i.e., it may contain variables that only occur in negated goals. In fact, these variables should be restricted to the associated tuple T(s,n) (a direct train connection from Seattle to New domains for the attributes the variables are bound to. Since York) is missing. This causes the third goal of r1 to succeed the associated domain dom for an attribute R.A is given as for any derivation where X=s and Y=n. For each partially an unary query dom , we can use these queries directly in unified EDB atom annotated with F/T, we create four rules: R.A firing rules to restrict the values the variable is bound to. This one for existing tuples (e.g., FT,T(s,Z):−T(s,Z)), one for is how we ensure that only missing answers formed from the thefailurecase(e.g.,FT,F(s,Z):−domT.toCity(Z),¬T(s,Z)), associated domains are considered and that firing rules are andtwofortheF/T firingversion.Forthefailurecase,weuse always safe. predicate domT.toCity to only consider missing tuples (s,Z) whereZ isavaluefromtheassociateddomainofthisattribute. Example 7. Reconsider the question WHYNOTQ(s,n) from The algorithm that creates the firing rules for an annotated Example1. The firing rules generated for this question are input query is shown as Algorithm3 (the pseudocode for the shown in Fig.5. We exclude the rules for the second goal subprocedures is given as Algorithm4). It maintains a list of T(Z,n) and the negated goal ¬T(s,n) which are analogous annotated atoms that need to be processed which is initialized to the rules for the first goal T(s,Z). Since Q(s,n) is failed with the Q(t) (PQ). For each such atom R(t)σ (here σ is the (becausetuple(s,n),i.e.,NewYorkcannotbereachablefrom annotation of the atom), it creates firing rules for each rule r Seattle with exactly one transfer, is not in the result), we thathasthisatomasaheadandapositivefiringruleforR(t). are only interested in failed rule derivations of the rule r 1 Furthermore, if the atom is annotated with F/T or F, then with X=s and Y=n. Furthermore, each rule node in the additional firing rules are added to capture missing tuples and provenancegraphcorrespondingtosucharulederivationwill failed rule derivations. onlybeconnectedtofailedsubgoals.Thus,weneedtocapture which goals are successful or failed for each such failed EDB atoms. For an EDB atom R(t)T, we use procedure derivation. This can be modelled through boolean variables CREATEEDBFIRINGRULE to create one rule FR,T(t):−R(t) V , V , and V (since there are three goals in the body) that returns tuples from relation R that matches t. For miss- 1 2 3 that are true if the corresponding goal is successful and false ing tuples (R(t)F), we extract all variables from t (some otherwise. The firing version F (s,n,Z,V ,V ,¬V ) of r arguments may be constants propagated during unification) will contain all variable bindinrg1,sF for deriva1tion2s of3r such1 and create a rule that returns all tuples that can be formed 1 that Q(s,n) is the head (i.e., guaranteed by adding F (s,n) from values of the associated domains of the attributes these Q,F to the body), the rule derivations are failed, and the ith goal variables are bound to and do not exist in R. This is achieved is successful or failed for this binding iff Vi is true or false, by adding goals dom(Xi) as explained in Example7. respectively.Toproduceallthesebindings,weneedrulescap- Rules. Consider a rule r :R(t):−g (X(cid:126) ),...,g (X(cid:126) ). If the 1 1 n n turingsuccessfulandfailedtuplenodesforeachsubgoalofthe head of r is annotated with T, then we create a rule with ruler .WedenotesuchrulesusingaF/T annotationanduse head F (X(cid:126)) where X(cid:126) = vars(r) and the same body as r 1 r,T abooleanvariable(trueorfalse)torecordwhetheratupleex- except that each goal is replaced with its firing version with ists(e.g.,F (s,Z,true):−F (s,Z)isoneoftheserules). appropriate annotation (e.g., T for positive goals). For rules T,F/T T,T Negatedgoalsaredealtwithbynegatingthisbooleanvariable, annotated with F or F/T, we create one additional rule with i.e., the goal is successful if the corresponding tuple does not headF (X(cid:126),V(cid:126))whereX(cid:126) isdefinedasabove,andV(cid:126) contains r,F exist. For instance, F (s,n,false) represents the fact that V iftheith goalofr ispositiveand¬V otherwise.Thebody T,F/T i i Algorithm 4 Create Firing Rules Subprocedures with these annotations. For each R(t)F, we create a rule with 1: procedure CREATEEDBFIRINGRULE(PFire, R(t)σ) ¬FR,T(t) in the body using the associated domain queries to 2: [X1,...,Xn]←vars(t) restrictvariablebindings.ForR(t)F/T,weaddtwoadditional 3: rT ←FR,T(t):−R(t) rules as shown in Fig.5 for EDB atoms. 4: rF ←FR,F(t):−dom(X1),...,dom(Xn),¬R(t) 5: rF/T−1 ←FR,F/T(t,true):−FR,T(t) Theorem 1 (Correctness of Firing Rules). Let P be an input 6: rF/T−2 ←FR,F/T(t,false):−FR,F(t) program,r denotearuleofP withmgoals,andP bethe 7: if σ=T then Fire 8: PFire ←PFire::rT firingversionofP.Weuser(t)|=P(I)todenotethattherule 9: else if σ=F then derivation r(t) is successful in the evaluation of program P 10: PFire ←PFire::rT ::rF over I. The firing rules for P correctly determine existence of 11: else tuples, successful rule derivations, and failed rule derivations 12: PFire ←PFire::rT ::rF ::rF/T−1::rF/T−2 for missing answers: 1: procedure CREATEIDBNEGRULE(PFire, R(t)σ) 2: [X1,...,Xn]←vars(t) • FR,T(t)∈PFire(I)↔R(t)∈P(I) 3: if σ(cid:54)=T then • FR,F(t)∈PFire(I)↔R(t)(cid:54)∈P(I) 564::: if σPrnF=eiwrFe←/←TFPRt,hFFe(itnr)e::−:rdnoewm(X1),...,dom(Xn),¬FR,T(t) •• FFrr,,TF((tt,)V(cid:126)∈)P∈FirPeF(Iir)e(↔I)r↔(t)r|=(tP) ((cid:54)|=I)P(I)∧head(r(t)) (cid:54)∈ 7: rT ←FR,F/T(t,true):−FR,T(t) P(I) and for i∈{1,...,m} we have that Vi is false iff 8: rF ←FR,F/T(t,false):−FR,F(t) ith goal fails in r(t). 9: PFire ←PFire::rT ::rF 1: procedure CREATEIDBPOSRULE(PFire, R(t)σ, r, args) Proof: We prove Theorem 1 by induction over the struc- 2: rpred ←FR,T(t):−Fr,T(args) tureofaprogram.Forthebasecase,weconsiderprogramsof 3: PFire ←PFire::rpred “depth” 1, i.e., only EDB predicates are used in rule bodies. 1: procedure CREATEIDBFIRINGRULE(PFire, R(t)σ, r, args) Then,weprovecorrectnessforprogramsofdepthn+1based 2: bodynew ←[] onthecorrectnessofprogramsofdepthn.Wedefinethedepth 3: for all gi(X(cid:126))∈body(r) do dofpredicates,rules,andprogramsasfollows:1)forallEDB 4: σgoal ←T predicatesR,wedefined(R)=0;2)foranIDBpredicateR, 5: if ISNEGATED(g) then wedefined(R)=max d(r),i.e.,themaximaldepth 6: σgoal ←F head(r)=R 87:: bgondewyn←ewF←prebdo(gdiy),nσegowal:(:X(cid:126)gn)ew aismdo(nrg)a=llmrualexsRr∈bwodiyth(r)hde(aRd()r+)=1,Ri.e;.3,)ththeemdaexpitmhaolfdaepruthleorf 9: if g(X(cid:126))T (cid:54)∈(done∪todo)∧σ=T then all predicates in its body plus one; 4) the depth of a program 10: todo←todo::g(X(cid:126))σgoal P is the maximum depth of its rules: d(P)=maxr∈P d(r). 11: rnew ←Fr,T(args):−bodynew 1) Base Case. Assume that we have a program P with depth 12: PFire ←PFire::rnew 1, e.g., r :Q(X(cid:126)):−R(X(cid:126)1),...,R(X(cid:126)n). We first prove that the 13: if σ(cid:54)=T then positive and negative versions of firing rules for EDB atoms 14: for all gi ∈body(r) do are correct, because only these rules are used for the rules of 15: if ISNEGATED(gi) then 16: args←args::¬bi depth 1 programs. A positive version of EDB firing rule FR,T 17: else createsacopyoftheinputrelationRand,thus,atuplet∈Riff 18: args←args::bi t∈FR,T. For the negative version FR,F, all variables are bound 19: bodynew ←[] to associated domains dom and it is explicitly checked that 20: for all gi(X(cid:126))∈body(r) do ¬R(X(cid:126)) is true. Finally, F uses F and F to determine 21: gnew ←Fpred(gi),F/T(X(cid:126),bi) whether the tuple exists inR,FR/.TSince tRh,eTse ruleRs,Fare correct, it 22: bodynew ←bodynew::gnew 23: if g(X(cid:126))F/T (cid:54)∈(done∪todo) then follows that FR,F/T is correct. The positive firing rule for the 24: todo←todo::g(X(cid:126))σgoal rule r (Fr,T) is correct since its body only contains positive 25: rnew ←Fr,σ(args):−bodynew and negative EDB firing rules (FR,T respective FR,F) which are 26: PFire ←PFire::rnew already known to be correct. The correctness of the positive firingversionofarule’sheadpredicate(F )followsnaturally Q,T from the correctness of F . The negative version of the rule r,T F (X(cid:126),V(cid:126))containsanadditionalgoal(i.e.,¬Q(X(cid:126)))anduses ofthisrulecontainstheF/T versionofeverygoalinr’sbody r,F the firing version F to return only bindings for failed plus an additional goal F to ensure that the head atom is R,F/T R,F derivations. Since F has been proven to be correct, we failed. As an example for this type of rule, consider the third R,F/T only need to prove that the negative firing version of the head rule from the top in Fig.5. predicate of r is correct. For a head predicate with annotation IDB atoms. For each rule r with head R(t), we create a rule F, we create two firing rules (F and F ). The rule F Q,T Q,F Q,T F (t):−F (X(cid:126)) where X(cid:126) is the concatenation of t with all was already proven to be correct. F is also correct, because R,T r,T Q,F existentialvariablesfromthebodyofr.IDBatomswithF or it contains only F and domain queries in the body which Q,T F/T annotations are handled in the same way as EDB atoms were already proven to be correct.

Efficiently Computing Provenance Graphs for Queries with Negation PDF

0.75 MB·

by Seokki Lee

#journals #arxiv

Checking for file health...

Save to my drive

Quick download

Download

Download Efficiently Computing Provenance Graphs for Queries with Negation PDF Free - Full Version

by Seokki Lee| 0.75

Download Efficiently Computing Provenance Graphs for Queries with Negation by Seokki Lee in PDF format completely FREE. No registration required, no payment needed. Get instant access to this valuable resource on PDFdrive.to!

Free Download PDF

About Efficiently Computing Provenance Graphs for Queries with Negation

No description available for this book.

Detailed Information

Author:	Seokki Lee
File Size:	0.75
Format:	PDF
Price:	FREE

Download Free PDF

Safe & Secure Download - No registration required

Why Choose PDFdrive for Your Free Efficiently Computing Provenance Graphs for Queries with Negation Download?

100% Free: No hidden fees or subscriptions required for one book every day.
No Registration: Immediate access is available without creating accounts for one book every day.
Safe and Secure: Clean downloads without malware or viruses
Multiple Formats: PDF, MOBI, Mpub,... optimized for all devices
Educational Resource: Supporting knowledge sharing and learning

Frequently Asked Questions

Is it really free to download Efficiently Computing Provenance Graphs for Queries with Negation PDF?

Yes, on https://PDFdrive.to you can download Efficiently Computing Provenance Graphs for Queries with Negation by Seokki Lee completely free. We don't require any payment, subscription, or registration to access this PDF file. For 3 books every day.

How can I read Efficiently Computing Provenance Graphs for Queries with Negation on my mobile device?

After downloading Efficiently Computing Provenance Graphs for Queries with Negation PDF, you can open it with any PDF reader app on your phone or tablet. We recommend using Adobe Acrobat Reader, Apple Books, or Google Play Books for the best reading experience.

Is this the full version of Efficiently Computing Provenance Graphs for Queries with Negation?

Yes, this is the complete PDF version of Efficiently Computing Provenance Graphs for Queries with Negation by Seokki Lee. You will be able to read the entire content as in the printed version without missing any pages.

Is it legal to download Efficiently Computing Provenance Graphs for Queries with Negation PDF for free?

https://PDFdrive.to provides links to free educational resources available online. We do not store any files on our servers. Please be aware of copyright laws in your country before downloading.

The materials shared are intended for research, educational, and personal use in accordance with fair use principles.