Join Query Optimization Techniques for Complex EventProcessing Applications

Ilya KolchinskyTechnion, Israel Institute of Technology

Haifa 32000 [emailprotected]

Assaf SchusterTechnion, Israel Institute of Technology

Haifa 32000 [emailprotected]

ABSTRACTComplex event processing (CEP) is a prominent technologyused in many modern applications for monitoring and track-ing events of interest in massive data streams. CEP enginesinspect real-time information �ows and attempt to detectcombinations of occurrences matching prede�ned patterns.This is done by combining basic data items, also called�primitive events�, according to a pattern detection plan,in a manner similar to the execution of multi-join queriesin traditional data management systems. Despite this sim-ilarity, little work has been done on utilizing existing joinoptimization methods to improve the performance of CEP-based systems.In this paper, we provide the �rst theoretical and experi-

mental study of the relationship between these two researchareas. We formally prove that the CEP Plan Generationproblem is equivalent to the Join Query Plan Generationproblem for a restricted class of patterns and can be reducedto it for a considerably wider range of classes. This resultimplies the NP-completeness of the CEP Plan Generationproblem. We further show how join query optimization tech-niques developed over the last decades can be adapted andutilized to provide practically e�cient solutions for complexevent detection. Our experiments demonstrate the superi-ority of these techniques over existing strategies for CEPoptimization in terms of throughput, latency, and memoryconsumption.

PVLDB Reference Format:

Ilya Kolchinsky and Assaf Schuster. Join Query OptimizationTechniques for Complex Event Processing Applications. PVLDB,11 (11): 1332-1345, 2018.DOI: https://doi.org/10.14778/3236187.3236189

Categories and Subject DescriptorsH.2.4 [Database Management]: Systems

General TermsAlgorithms, Design, Performance

Permission to make digital or hard copies of all or part of this work forpersonal or classroom use is granted without fee provided that copies arenot made or distributed for profit or commercial advantage and that copiesbear this notice and the full citation on the first page. To copy otherwise, torepublish, to post on servers or to redistribute to lists, requires prior specificpermission and/or a fee. Articles from this volume were invited to presenttheir results at The 44th International Conference on Very Large Data Bases,August 2018, Rio de Janeiro, Brazil.Proceedings of the VLDB Endowment, Vol. 11, No. 11Copyright 2018 VLDB Endowment 2150-8097/18/07... $ 10.00.DOI: https://doi.org/10.14778/3236187.3236189

KeywordsStream Processing, Complex Event Processing, Lazy Eval-uation, Query Optimization

1. INTRODUCTIONComplex event processing has become increasingly im-

portant for applications in which arbitrarily complex pat-terns must be e�ciently detected over high-speed streamsof events. Online �nance, security monitoring, and frauddetection are among the many examples. Pattern detectiongenerally consists of collecting primitive events and combin-ing them into potential (partial) matches using some typeof detection model. As more events are added to a partialmatch, a full pattern match is eventually formed and re-ported. Popular CEP mechanisms include nondeterministic�nite automata (NFAs) [5, 18, 51], �nite state machines [6,45], trees [36], and event processing networks [21, 42].A CEP engine creates an internal representation for each

pattern P to be monitored. This representation is based ona model used for detection (e.g., an automaton or a tree)and re�ects the structure of P . In some systems [5, 51], thetranslation from a pattern speci�cation to a correspondingrepresentation is a one-to-one mapping. Other frameworks[6, 30, 36, 42, 45] introduce the notion of a cost-based evalu-ation plan, where multiple representations of P are possible,and one is chosen according to the user's preference or someprede�ned cost metric.We will illustrate the above using the following example.

Assume that we are receiving periodical readings from fourtra�c cameras A, B, C and D. We are required to recognizea sequence of appearances of a particular vehicle on all fourcameras in order of their position on a road, e.g., A→ B →C → D. Assume also that, due to a malfunction in cameraD, it only transmits one frame for each 10 frames sent bythe other cameras.Figure 1(a) displays a nondeterministic �nite automaton

(NFA) for detecting this pattern, as described in [51]. Astate is de�ned for each pre�x of a valid match. Duringevaluation, a combination of camera readings matching eachpre�x will be represented by a unique instance of the NFAin the corresponding state. Transitions between states aretriggered nondeterministically by the arrival of an event sat-isfying the constraints de�ned by the pattern. A new NFAinstance is created upon each transition.The structure of the above automaton is uniquely dictated

by the order of events in the given sequence. However, due tothe low transmission rate of D, it would be bene�cial to waitfor its signal before examining the local history for previous

1332

(a)

(b)

(c)

Figure 1: Evaluation structures for a CEP patternSEQ(A,B,C,D): (a) NFA with no reordering; (b) NFA withreordering; (c) evaluation tree.

readings of A, B and C that match the constraints. Thisway, fewer pre�xes would be created. Figure 1(b) demon-strates an out-of-order NFA for the rewritten pattern (de-�ned as �Lazy NFA� in [30]). It starts by monitoring therarest event type D and storing the other events in the ded-icated bu�er. As a reading from camera D arrives, the bu�eris inspected for events from A, B and C preceding the onereceived from D and located in the same time window. Thisplan is more e�cient than the one implicitly used by the�rst NFA in terms of the number of partial matches createdduring evaluation. Moreover, unless more constraints on theevents are de�ned, it is the best out of all (4!) possible plans,that is, mutual orders of A, B, C and D.Not all CEP mechanisms represent a plan as an evalua-

tion order. Figure 1(c) depicts a tree-based evaluation mech-anism [36] for detecting the above pattern. Events are ac-cepted at the corresponding leaves of the tree and passed to-wards the root where full matches are reported. This modelrequires an evaluation plan to be supplied, because, for a

pattern of size n, there are at least n! · Cn−1 = (2n−2)!(n−1)!

pos-

sible trees (where Cn is the nth Catalan number) [37].In many scenarios, we will prefer the evaluation mech-

anisms supporting cost-based plan generation over thosemechanisms allowing for only one such plan to be de�ned.This way, we can drastically boost system performance sub-ject to selected metrics by picking more e�cient plans. How-ever, as the space of potential plans is at least exponentialin pattern size, �nding an optimal plan is not a trivial task.Numerous authors have identi�ed and targeted this is-

sue. Some of the proposed solutions are based on rewritingthe original pattern according to a set of prede�ned rulesto maximize the e�ciency of its detection [42, 45]. Otherapproaches discuss various strategies and algorithms for gen-erating an evaluation plan that maximizes the performancefor a given pattern according to some cost function [6, 30,36]. While the above approaches demonstrate promisingresults, this research �eld remains largely unexplored, andthe space of the potential optimization techniques is still farfrom being exhausted.

The problem described above closely resembles the prob-lem of estimating execution plans for large join queries. Asopposed to CEP plan generation, this is a well-known, estab-lished, and extensively targeted research topic. A plethora ofmethods and approaches producing close-to-optimal resultswere published during the last few decades. These methodsrange from simple greedy heuristics, to exhaustive dynamicprogramming techniques, to randomized and genetic algo-rithms [32, 33, 38, 46, 47, 48].Both problems look for a way to e�ciently combine mul-

tiple data items such that some cost function is minimized.Also, both produce solutions possessing similar structures.If we reexamine Figure 1, we can see that evaluation plansfor NFAs (1(b)) and trees (1(c)) closely resemble left-deeptree plans and bushy tree plans [26] respectively. An in-teresting question is whether join-related techniques can beused to create better CEP plans using a proper reduction.In this work, we attempt to close the gap between the two

areas of research. We study the relationship between CEPPlan Generation (CPG) and Join Query Plan Generation(JQPG) problems and show that any instance of CPG canbe transformed into an instance of JQPG. Consequently, anyexisting method for JQPG can be made applicable to CPG.Our contributions can be summarized as follows:� We formally prove the equivalence of JQPG and CPG

for a large subset of CEP patterns, the conjunctive pat-terns. The proof addresses the two major classes of evalu-ation plans, the order-based plans and the tree-based plans(Section 3).� We extend the above result by showing how other pat-

tern types can be converted to conjunctive patterns, thusproving that any instance of CPG can be reduced to aninstance of JQPG (Section 4).� The deployment of a JQPG method to CPG is not triv-

ial, as multiple CEP-speci�c issues need to be addressed,such as detection latency constraints, event consumptionpolicies, and adaptivity considerations. We present and dis-cuss the steps essential for successful adaptation of JQPGtechniques to the CEP domain (Section 5).� We validate our theoretical analysis in an extensive ex-

perimental study. Several well-known JQPG methods, suchas Iterative Improvement [48] and Dynamic Programming[46], were applied on a real-world event dataset and com-pared to the existing state-of-the-art CPG mechanisms. Theresults demonstrate the superiority of the adapted JQPGtechniques (Section 6).

2. BACKGROUND AND TERMINOLOGYIn this section, we introduce the notations used through-

out this paper, provide the necessary background on com-plex event processing and describe the two problems whoserelationship will be closely studied in the next sections.

2.1 CEP PatternsThe patterns recognized by CEP systems are normally

formed using declarative speci�cation languages [14, 18, 51].A pattern is de�ned by a combination of primitive events,operators, a set of predicates to be satis�ed by the partici-pating events, and a time window. Each event is representedby a type and a set of attributes, including the occurrencetimestamp. We assume that each primitive event has a well-de�ned type, i.e., the event either contains the type as anattribute or it can be easily inferred from other attributes

1333

using negligible system resources. The operators describethe relations between di�erent events comprising a patternmatch. The predicates, usually organized in a Boolean for-mula, specify the constraints on the attribute values of theevents. As an example, consider the following pattern spec-i�cation syntax, taken from SASE [51]:

PATTERN op (T1 e1, T2 e2, · · · , Tn en)WHERE (c1,1 ∧ c1,2 ∧ · · · ∧ cn,n−1 ∧ cn,n)WITHIN W.

Here, the PATTERN clause speci�es the events e1, · · · , enwe would like to detect and the operator op to combine them(see below). The WHERE clause de�nes a Boolean CNFformula of inter-event constraints, where ci,j ; 1 ≤ i, j ≤ nstands for the mutual condition between attributes of ei andej . ci,i declares �lter conditions on ei. Any of ci,j can beempty. For the rest of our paper, we assume that all condi-tions between events are at most pairwise. This assumptionis for presentational purposes only, as our results can be eas-ily generalized to arbitrary predicates. The WITHIN clausesets the time window W , which is the maximal allowed dif-ference between the timestamps of any pair of events in amatch.In this paper, we will consider the most commonly used

operators, namely AND, SEQ, and OR. The AND operatorrequires the occurrence of all events speci�ed in the patternwithin the time window. The SEQ operator extends thisde�nition by also expecting the events to appear in a prede-�ned temporal order. The OR operator corresponds to theappearance of any event out of those speci�ed.Two additional operators of particular importance are the

negation operator (NOT) and the Kleene closure operator(KL). They can only be applied on a single event and areused in combination with other operators. NOT (ei) re-quires the absence of the event ei from the stream (or froma speci�c position in the pattern in the case of the SEQ op-erator), whereas KL (ei) accepts one or more instances ofei. In the remainder of this paper, we will refer to NOT andKL as unary operators, while AND, SEQ and OR will becalled n-ary operators.The PATTERN clause may include an unlimited number

of n-ary and unary operators. We will refer to patterns con-taining a single n-ary operator, and at most a single unaryoperator per primitive event, as simple patterns. On thecontrary, nested patterns are allowed to contain multiple n-ary operators (e.g., a disjunction of conjunctions and/or se-quences will be considered a nested pattern). Nested pat-terns present an additional level of complexity and requireadvanced techniques (e.g., as described in [34]).We will further divide simple patterns into subclasses. A

simple pattern whose n-ary operator is an AND operatorwill be denoted as a conjunctive pattern. Similarly, sequencepattern and disjunctive pattern will stand for patterns withSEQ and OR operators, respectively. A simple pattern con-taining no unary operators will be called a pure pattern.The �four cameras pattern� described in Section 1 illus-

trates the above. This is a pure sequence pattern, writtenin SASE as follows:

PATTERN SEQ(A a,B b,C c,D d)WHERE(a.vehicleID=b.vehicleID=

=c.vehicleID=d.vehicleID)WITHIN W.

2.2 Order-based Evaluation MechanismsOrder-based evaluation mechanisms play an important

role in CEP engines based on state machines. One of themost commonly used models following this principle is theNFA (nondeterministic �nite automaton) [5, 18, 51]. AnNFA consists of a set of states and conditional transitionsbetween them. Each state corresponds to a pre�x of a fullpattern match. Transitions are triggered by the arrival of theprimitive events, which are then added to partial matches.Conditions between events are veri�ed during the transi-tions. Figure 1(a) depicts an example of an NFA constructedfor the �four cameras� sequence pattern. While in theoryNFAs may possess an arbitrary topology, non-nested pat-terns are normally detected by a chain-like structure.The basic NFA model does not include any notion of alter-

ing the �natural� evaluation order or any other optimizationbased on pattern rewriting. Multiple works have presentedmethods for constructing NFAs with out-of-order processingsupport. W.l.o.g., we will use the Lazy NFA mechanism, achain-structured NFA introduced in [29, 30].Given a pattern of n events and a user-speci�ed order

O on the event types appearing in the pattern, a chain ofn+1 states is constructed, with each state k correspondingto a match pre�x of size k − 1. The order of the statesmatches O. If a type appears more than once in a pattern,it will also appear multiple times in O. The (n+ 1)th statein the chain is the accepting state. To achieve out-of-orderevaluation, incoming events are stored locally. A bu�eredevent is retrieved and processed when its corresponding statein the chain is reached. Figure 1(b) presents an example ofthis construction for O = (D,A,B,C).This construction method allows us to apply all possible

(n!) orders without a�ecting the detection correctness.

2.3 Tree-based Evaluation MechanismsAn alternative to NFA, the tree-based evaluation mech-

anism [36] speci�es which subsets of full pattern matchesare to be tracked by de�ning tree-like structures. For eachevent participating in a pattern, a designated leaf is created.During evaluation, events are routed to their correspondingleaves and are bu�ered there. The non-leaf nodes accumu-late the partial matches. The computation at each non-leafnode proceeds only when all of its children are available (i.e.,all events have arrived or partial matches have been calcu-lated). Matches formed at the tree root are reported to theend users. An example is shown in Figure 1(c).ZStream assumes a batch-iterator setting [36]. To perform

our study under a uni�ed framework, we modify this behav-ior to support arbitrary time windows. As described abovewith regard to NFAs, a separate tree instance will be cre-ated for each currently found partial match. As a new eventarrives, an instance will be created containing this event.Every instance I corresponds to some subtree s of the treeplan, with the leaves of s holding the primitive events in I.Whenever a new instance I ′ is created, the system will at-tempt to combine it with previously created �siblings�, thatis, instances corresponding to the subtree sharing the parentnode with s′. As a result, another new instance containingthe uni�ed subtree will be generated. This in turn will trig-ger the same process again, and it will proceed recursivelyuntil the root of the tree is reached or no siblings are found.ZStream includes an algorithm for determining the opti-

mal tree structure for a given pattern. This algorithm is

1334

(a) (b) (c)

Figure 2: Evaluation trees for a pattern SEQ(A,B,C): (a)a left-deep tree produced by ZStream; (b) a right-deep treeproduced by ZStream; (c) an optimal evaluation tree, whichcannot be produced by ZStream.

based on a cost model that takes into account the arrivalrates of the primitive events and the selectivities of theirpredicates. However, since leaf reordering is not supported,a subset of potential plans is missed. We will illustrate thisdrawback using the following example:

PATTERN SEQ(A a,B b,C c)WHERE(a.x=c.x) WITHIN W.

We assume that all events arrive at identical rates, andthat the condition between A and C is very restrictive. Fig-ures 2(a) and 2(b) present the only two possible plans ac-cording to the algorithm presented in [36]. However, due tothe condition between A and C, the most e�cient evaluationplan is the one displayed in Figure 2(c).

2.4 CEP Plan GenerationWe will start with the de�nition of the CEP evaluation

plan. The evaluation plan provides a scheme for the evalua-tion mechanism, according to which its internal pattern rep-resentation is created. Therefore, di�erent evaluation plansare required for di�erent CEP frameworks. In this paper, wedistinguish between two main types of plans, the order-basedplan and the tree-based plan.An order-based plan consists of a permutation of the prim-

itive event types declared by the pattern. A CEP engineuses this plan to set the order in which events are processed.Order-based plans are applicable to mechanisms evaluatinga pattern event-by-event, as described in Section 2.2.A tree-based plan extends the above by providing a tree-

like scheme for pattern evaluation. It speci�es which subsetsof valid matches are to be locally bu�ered and how to com-bine them into larger partial matches. Plans of this type areused by the evaluation mechanism presented in Section 2.3.We can thus de�ne two variations of the CEP Plan Gen-

eration problem, order-based CPG and tree-based CPG. Ineach variation, the goal is to determine an optimal evalua-tion plan P subject to some cost function Cost (P ). Di�er-ent CEP systems de�ne di�erent metrics to measure theire�ciency. In this paper we will consider a highly relevantperformance optimization goal: reducing the number of ac-tive partial matches within the time window (denoted belowsimply as number of partial matches).Regardless of the system-speci�c performance objectives,

the implicit requirement to monitor all valid subsets of prim-itive events can become a major bottleneck. Because anypartial match might form a full pattern match, their numberis worst-case exponential in the number of events participat-ing in a pattern. Further, as a newly arrived event needs tobe checked against all (or most of) the currently stored par-tial matches, the processing time and resource consumption

per event can become impractical for real-time applications.Other metrics, such as detection latency or network commu-nication cost, may also be negatively a�ected. Thus, giventhe crucial role of the number of partial matches in all as-pects of CEP, it was chosen as our primary cost function.

2.5 Join Query Plan GenerationJoin Query Plan Generation is a well-known problem in

query optimization [32, 46, 48]. We are given relationsR1, · · · , Rn and a query graph describing the conditions tobe satis�ed by the tuples in order to be included in the re-sult. A condition between a pair of relations Ri, Rj has aknown selectivity fi,j ∈ [0, 1]. The goal is to produce a joinquery plan minimizing a prede�ned cost function.One popular choice for the cost function is the number

of intermediate tuples produced during plan execution. Forthe rest of this paper, we will refer to it as the intermediateresults size. In [13], the following expression is given tocalculate this function for each two-way join of two inputrelations:

C (Ri, Rj) = |Ri| · |Rj | · fi,j ,where |Ri| , |Rj | are the cardinalities of the joined relations.This formula is naturally extended to relations producedduring join calculation:

C (S, T ) = |S| · |T | · fS,T .

Here, S = Ri1 ./ · · · ./ Ris ;T = Rj1 ./ · · · ./ Rjt arethe partial join results of some subsets of R1, · · · , Rn andfS,T = (|S ./ T | / |S × T |) is the product of selectivities ofall predicates de�ned between the individual relations com-prising S and T .The two most popular classes of join query plans are the

left-deep trees and the bushy trees [26]. A join tree of theformer type processes the input relations one-by-one, addinga new relation to the current intermediate result during eachstep. Hence, for this class of techniques, a valid solutionis a join order rather than a join plan. Approaches basedon bushy trees pose no limitations on the plan topology,allowing it to contain arbitrary branches.JQPG was shown by multiple authors to be NP-complete

[13, 24], even when only left-deep trees are considered.

3. THE EQUIVALENCE OF CPG AND JQPGFOR PURE CONJUNCTIVE PATTERNS

This section presents the formal proof of equivalence be-tween CPG and JQPG for pure conjunctive patterns. Weshow that, when the pattern to be monitored is a pure con-junctive pattern and the CPG cost function represents thenumber of partial matches, the two problems are equivalent.

3.1 Order-Based EvaluationWe will �rst focus on a CPG variation for order-based

evaluation plans. In this section we will show that this prob-lem is equivalent to JQPG restricted to left-deep trees. Tothat end, we will de�ne the cost model functions for bothproblems and then present the equivalence theorem.Our cost function Costord will re�ect the number of par-

tial matches coexisting in memory within the time window.The calculations will be based on the arrival rates of theevents and the selectivities of the predicates.Let seli,j denote the selectivity of ci,j , i.e., the prob-

ability of a partial match containing instances of events

1335

of types Ti and Tj to pass the condition. Additionally,let r1, · · · rn denote the arrival rates of corresponding eventtypes T1, · · ·Tn. Then, the expected number of primitiveevents of type Ti arriving within the time windowW isW ·ri.Let O = (Tp1 , Tp2 , · · ·Tpn) ; pi ∈ [1, n] denote an executionorder. Then, during pattern evaluation according to O, theexpected number of partial matches of length k, 1 ≤ k ≤ nis given by:

PM (k) =W k ·k∏

i=1

rpi ·∏

i,j≤k;i≤j

selpi,pj .

The overall cost function we will attempt to minimize isthus the sum of partial matches of all sizes, as follows:

Costord (O) =

n∑k=1

W k ·k∏

i=1

rpi ·∏

i,j≤k;i≤j

selpi,pj

.

For the JQPG problem restricted to output left-deep treesonly, we will use the two-way join cost function C (S, T )de�ned in Section 2.5. Let L be a left-deep tree and let{i1, i2, · · · , in} be the order in which input relations are tobe joined according to L. Let Pk, 1 ≤ k < n denote theresult of joining the �rst k tables by L (that is, P1 = Ri1 ,P2 = Ri1 ./ Ri2 , etc.). In addition, let C1 = |Ri1 | · fi1,i1be the cost of the initial selection from Ri1 . Then, the costof L will be de�ned according to a left-deep join (LDJ) costfunction:

CostLDJ (L) = C1 +

n∑k=2

C (Pk−1, Rik ) .

We are now ready to formally prove the statement formu-lated in the beginning of the section.

Theorem 1. Given a pure conjunctive pattern P , theproblem of �nding an order-based evaluation plan for P min-imizing Costord is equivalent to the Join Query Plan Gen-eration problem for left-deep trees subject to CostLDJ .

We will only show here the reduction from CPG to JQPG,which will be used in the later sections to apply join plangeneration algorithms on CEP patterns. The opposite di-rection is symmetric and can be found in [27].Given a pure conjunctive pattern P de�ned over event

types T1, · · · , Tn with predicates ci,j : 1 ≤ i, j ≤ n, letR1, · · · , Rn be a set of relations such that each Ri corre-sponds to an event type Ti. For each attribute of Ti, in-cluding the timestamp, a matching column will be de�nedin Ri. The cardinality of Ri will be set to W · ri, and, foreach predicate ci,j with selectivity seli,j , an identical pred-icate will be formed between the relations Ri and Rj . Wewill de�ne the query corresponding to P as follows:

SELECT * FROM R1, · · ·Rn

WHERE (c1,1 AND · · · AND cn,n) .

We will show that a solution to this instance of the JQPGproblem is also a solution to the initial CPG problem. Recallthat a left-deep JQPG solution L minimizes the functionCostLDJ . By opening the recursion and substituting theparameters with those of the original problem, we get:

CostLDJ (L) = C1 +

n∑k=2

C (Pk−1, Rik ) =

= |Ri1 | · fi1,i1 +

n∑k=2

k∏j=1

∣∣Rij

∣∣ · ∏j.l≤k;j≤l

fij ,il

=

=

n∑k=1

k∏j=1

(W · rij

)·

∏j.l≤k−1;j≤l

selij ,il

= Costord (O) .

Consequently, the solution that minimizes CostLDJ alsominimizes Costord, which completes the proof.�In [13] the authors showed the problem of Join Query

Plan Generation for left-deep trees to be NP-complete forthe general case of arbitrary query graphs. From this resultand from the equivalence stated by Theorem 1 (proven infull in [27]) we will deduce the following corollary.

Corollary 1. The problem of �nding an order-based eval-uation plan for a general pure conjunctive complex eventpattern that minimizes Costord is NP-complete.

3.2 Tree-Based EvaluationIn this section, we will extend Theorem 1 to tree-based

evaluation plans. This time we will consider the unrestrictedJQPG problem, allowed to return bushy trees. Similarly toSection 3.1, we will start by de�ning the cost functions andthen proceed to the proof of the extended theorem.We will de�ne the cost model for evaluation trees in a

manner similar to Section 3.1. We will estimate the numberof partial matches accumulated in each node of the evalua-tion tree and sum them up to produce the cost function.For a leaf node l collecting events of type Ti, the expected

number of partial matches is equal to the number of eventsof type Ti arriving inside a time window:

PM(l) =W · ri.

To obtain an estimate for an internal node in, we multiplythe cost function values of its children by the total selectivityof the predicates veri�ed by this node:

PM(in) = PM (in.left) · PM (in.right) · SELLR (in) ,

where SELLR is the selectivity of the predicates de�nedbetween event types accepted at the left and the right sub-trees of node in, or, more formally:

SELLR (in) =∏

ei ∈ in.ltree; ej ∈ in.rtreeseli,j .

The total cost function on a tree T is thus de�ned asfollows:

Costtree (T ) =∑

N∈nodes(T )

PM (N) .

For bushy trees, we will extend the cost function de�nedin Section 3.1. The cost of a tree node N will be de�ned asfollows:

C (N) =

|Ri| N is a leaf representing Ri

|L| · |R| · fL,R N is an internal node representing

a sub− join L ./ R,

with the bushy join (BJ) cost function de�ned as follows:

CostBJ (T ) =∑

N∈nodes(T )

C (N) .

We will now extend Theorem 1 to tree-based plans.

1336

Theorem 2. Given a pure conjunctive pattern P , theproblem of �nding a tree-based evaluation plan for P mini-mizing Costtree is equivalent to the Join Query Plan Gen-eration problem subject to CostBJ .

To prove the theorem, we decompose each of the tree costfunctions Costtree, CostBJ into two components, separatelycalculating the cost of the leaves and the internal nodes:Costltree (T ) =

∑N∈leaves(T ) PM (N)

Costintree (T ) =∑

N∈in_nodes(T ) PM (N)

CostlBJ (T ) =∑

N∈leaves(T ) C (N)

CostinBJ (T ) =∑

N∈in_nodes(T ) C (N) .

Obviously, the following equalities hold:

Costtree (T ) = Costltree (T ) + Costintree (T )

CostBJ (T ) = CostlBJ (T ) + CostinBJ (T ) .

Thus, it is su�cient to prove Costltree (T ) = CostlBJ (T )and Costltree (T ) = CostinBJ (T ) for every T . From here itwill follow that the solution minimizing Costtree will alsominimize CostBJ and vice versa.Applying either direction of the reduction from Theorem

1, we observe the following for the �rst pair of functions:

Costltree (T ) =∑

N∈leaves(T )

PM (N) =

n∑i=1

W · ri =

=

n∑i=1

|Ri| =∑

N∈leaves(T )

C (N) = CostlBJ (T ) .

For the second pair of functions, we will �rst expand therecursion of Costintree:

Costintree (T ) =∑

N∈in_nodes(T )

PM (N) =

=∑

N∈in_nodes(T )

PM (N.left)·PM (N.right)·SELLR (N) =

=∑

N∈in_nodes(T )

∏m∈leaves(N)

W · rm ·∏

i,j∈leaves(N)

seli,j

,

where leaves (N) denotes the leaves of a tree rooted at N .By similarly opening the recursion of CostinJB we obtain:

CostinBJ (T ) =

∑N∈in_nodes(T )

∏m∈leaves(N)

|Rm| ·∏

i,j∈leaves(N)

fi,j

.

After substituting rm = |Rm|W

and selpi,pj = fpi,pj , thetwo expressions are identical, which completes the proof.�The CPG-JQPG reduction that we will use for tree-based

evaluation is the one demonstrated in Theorem 1 for order-based evaluation.By Theorem 2 and the generalization of the result in [13],

we derive the following corollary.

Corollary 2. The problem of �nding a tree-based eval-uation plan for a general pure conjunctive complex eventpattern that minimizes Costtree is NP-complete.

3.3 Join Query TypesAs Corollaries 1 and 2 imply, no e�cient algorithm can

be devised to optimally solve CPG for a general conjunctivepattern unless P = NP . However, better complexity resultsmay be available under certain assumptions regarding thepattern structure. Numerous works considered the JQPGproblem for restricted query types, that is, speci�c topolo-gies of the query graph de�ning the inter-relation conditions.Examples of such topologies include clique, tree, and star.It was shown in [24, 32] that an optimal plan can be

computed in polynomial time for left-deep trees and queriesforming an acyclic graph (i.e., tree queries), provided thatthe cost function has the ASI (adjacent sequence interchange)property [39]. The left-deep tree cost function CostLDJ hasthis property [13], making the result applicable for our sce-nario. A polynomial algorithm without the ASI requirementwas proposed for bushy tree plans for chain queries [40].From Theorems 1 and 2 we can conclude that, for conjunc-tive patterns only, CPG∈P under the above constraints.However, these results only hold when the plans produced

by a query optimizer are not allowed to contain cross prod-ucts [13, 40]. While this limitation is well-known in rela-tional optimization [50], it is not employed by the existingCPG methods [6, 30, 36, 45]. Thus, even when an exactpolynomial algorithm is applicable to CPG, it is inferior tonative algorithms in terms of the considered search spaceand can only be viewed as a heuristic. In that sense, it issimilar to the greedy and randomized approaches [47, 48].Other optimizations utilizing the knowledge of the query

type were proposed. For example, the optimal bushy planwas empirically shown to be identical to the optimal left-deep plan for star queries and, in many cases, for grid queries[47]. This observation allows us to utilize a cheaper left-deepalgorithm for the above query types without compromisingthe quality of the resulting plan.With the introduction of additional pattern types (Section

4) and event selection strategies (Section 5.2), new querygraph topologies might be identi�ed and type-speci�c e�-cient algorithms designed. This topic is beyond the scope ofthis paper and is a subject for future work.Although not used directly by the JQPG algorithms, the

order-based CPG cost functions Costord and Costlatord (thatwe will introduce in Section 5.1) also have the ASI property.We formally prove this in the extended paper [27].

4. JQPG FOR GENERAL PATTERN TYPESThe CPG-JQPG reduction presented above only applies

to pure conjunctive patterns. However, real-world patternsare much more diverse. To complete the solution, we haveto consider simple patterns containing SEQ, OR, NOT andKL operators. We also have to address nested patterns.This section describes how a pattern of each of the afore-

mentioned types can be represented and detected as eithera pure conjunctive pattern or their union. The transforma-tions presented below are only applied for the purpose ofplan generation, that is, no actual conversion takes placeduring evaluation. The formal correctness proofs for theshown reductions can be found in the extended paper [27].Sequence patterns. We observe that a sequence pattern

is merely a conjunctive pattern with additional temporalconstraints, i.e., predicates on the values of the timestampattribute. Thus, a general pure sequence pattern of the form

1337

PATTERN SEQ (T1 e1, T2 e2, · · · , Tn en)WHERE (c1,1 ∧ c1,2 ∧ · · · ∧ cn,n−1 ∧ cn,n)

can be rewritten as follows without any semantic change:

PATTERN AND (T1 e1, T2 e2, · · · , Tn en)WHERE (c1,1 ∧ · · · ∧ cn,n∧

∧ (e1.ts < e2.ts) ∧ · · · ∧ (en−1.ts < en.ts)).

An instance of the sequence pattern is thus reduced fromCPG to JQPG similarly to a conjunctive pattern, with thetimestamp column added to each relation Ri representing anevent type Ti, and constraints on the values of this columnintroduced into the query representation.Kleene closure patterns. In a pattern with an event

type Ti under a KL operator, any subset of events of Ti

within the time window can participate in a match. Duringplan generation, we are interested in modeling this behaviorin a way comprehensible by a JQPG algorithm, that is, usingan equivalent pattern without Kleene closure. To that end,we introduce a new type T ′i to represent all event subsetsaccepted by KL (Ti), that is, the power set of events of Ti.A set of k events of type Ti will be said to contain 2k �events�of type T ′i , one for each subset of the original k events. Thenew pattern is constructed by replacing KL (Ti) with T ′i .Since a time window of size W contains 2ri·W subsets of Ti

(where ri is the arrival rate of Ti), the arrival rate r′i of T

′i is

set to 2ri·W

W. The predicate selectivities remain unchanged.

For example, given the following pattern with the arrivalrate of 5 events per second for each event type:

PATTERN AND(A a,KL(B b),C c)WHERE (true ) WITHIN 10 seconds,

the pattern to be utilized for plan generation will be:

PATTERN AND(A a,B' b,C c)WHERE (true ) WITHIN 10 seconds.

The arrival rate of B′ will be calculated as r′B = 2rB ·W

W=

110· 250. A plan generation algorithm will then be invoked

on the new pattern. Due to an extremely high arrival rate ofB′, its processing will likely be postponed to the latest stepin the plan, which is also the desired strategy for the originalpattern in this case. B′ will then be replaced with B in theresulting plan, and the missing Kleene closure operator willbe added in the respective stage (by modifying an edge typefor a NFA [29] or a node type for a tree [36]), thus producinga valid plan for detecting the original pattern.Negation patterns. Patterns with a negated event will

not be rewritten. Instead, we will introduce a negation-aware evaluation plan creation strategy. First, a plan willbe generated for a positive part of a pattern as describedabove. Then, a check for the appearance of a negated eventwill be added at the earliest point possible, when all positiveevents it depends on are already received. This constructionprocess will be implemented by augmenting a plan with atransition to the rejecting state for a NFA [29] or with aNSEQ node for a ZStream tree [36]. For example, givena pattern SEQ(A,NOT(B),C,D), the existence of a matchingB in the stream will be tested immediately after the lat-est of A and C have been accepted. Since both Lazy NFA

and ZStream incorporate event bu�ering, this technique isfeasible and easily applicable.Nested patterns. Patterns of this type can contain an

unlimited number of n-ary operators. After transformingSEQ to AND as shown above, we are left with only twosuch operator types, AND and OR. Given a nested pat-tern, we convert the pattern formula to DNF form, that is,an equivalent nested disjunctive pattern containing a list ofsimple conjunctive patterns is produced. Then, a separateevaluation plan is created for each conjunctive subpattern,and their detection proceeds independently. The returnedresult is the union of all subpattern matches.Note that applying the DNF transformation can cause

some expressions to appear in multiple subpatterns. Forexample, a nested pattern of the form AND(A,B,OR(C,D))will be converted to a disjunction of conjunctive patternsAND(A,B,C)) and AND(A,B,D)). As a result, redundant com-putations will be performed by automata or trees corre-sponding to di�erent subpatterns (e.g., comparing A's toB's). This problem can be solved by applying known multi-query techniques [17, 35, 43, 44, 54].

5. ADAPTING JQPG ALGORITHMS TOCOMPLEX EVENT PROCESSING

The theoretical results from previous sections imply thatany existing technique for determining a close-to-optimalexecution plan for a join query can be adapted and used inCEP applications. However, many challenges arise when at-tempting to perform this transformation procedure in prac-tice. First, despite the bene�ts of the cost function intro-duced in Section 2.4, simply counting the partial matchesis not always su�cient. Additional performance metrics,such as the latency, are often essential. Second, complexevent speci�cation languages contain various constructs notpresent in traditional databases, such as event selection strate-gies. In this section, we will show how these extensions canbe incorporated into existing JQPG algorithms.In addition, the arrival rates of event types and the pred-

icate selectivities are rarely obtained in advance and canchange rapidly over time. An adaptive solution must be de-vised to measure the desired statistics on-the-�y and adaptthe evaluation plan accordingly [9, 19, 30, 36]. Due to theconsiderable importance and complexity of adaptive CEP,we devote a separate paper [28] to discuss this problem.

5.1 Pattern Detection LatencyLatency is de�ned as the di�erence between the arrival

time of the last event comprising a full match and the time ofreporting this match. As many existing applications involvestrong real-time requirements, pattern detection latency isa popular optimization goal. Unfortunately, in most cases itis impossible to simultaneously achieve maximal throughputand minimal latency. Trade-o�s between the two are widelystudied in the CEP context [6, 52].Detection schemes utilizing out-of-order evaluation, like

those discussed in this paper, often su�er from increased la-tency as compared to simpler approaches. The main reasonis that, when an execution plan is optimized for maximalthroughput, the last event in the pattern may not be thelast event in the plan. After this event is accepted, the sys-tem still needs to process the remaining part of the plan,resulting in late detection of the full match.

1338

Algorithms adopted from JQPG do not naturally supportlatency. However, since they are generally independent ofthe cost model, this problem can be solved by providing anappropriate cost function. In addition to functions presentedin Sections 3.1 and 3.2, which we will refer to as Costtrptord

and Costtrpttree, a new pair of functions, Costlatord and Costlattree,

will re�ect the expected latency of a plan. To combine thefunctions, many existing multi-objective query optimizationtechniques can be used, e.g., pareto optimal plan calcula-tion [6] or parametric methods [49]. Systems with limitedcomputational resources may utilize simpler and less expen-sive solutions, such as de�ning the total cost function as aweighted sum of its two components:

Cost (Plan) = Costtrpt (Plan) + α · Costlat (Plan) ,

where α is a user-de�ned parameter adjusted to �t the re-quired throughput-latency trade-o�. This latter model wasused during our experiments (Section 6).We will now formally de�ne the latency cost functions.

For a sequence pattern, let Tn denote the last event type inthe order induced by the pattern. Then, for an order-basedplan O, let SuccO (Tn) denote the event types succeedingTn in O. Following the arrival of an event of type Tn, in theworst case we need to examine all locally bu�ered events oftypes in SuccO (Tn). There are W · ri such events of typeTi, hence Cost

latord (O) =

∑Ti∈SuccO(Tn)W · ri.

Similarly, for a tree-based plan T , let AncT (Tn) denoteall ancestor nodes of the leaf corresponding to Tn in T , i.e.,nodes located on a path from Tn to the root (excluding theroot). Let us examine the traversal along this path. Whenan internal node N with two children L and R receives apartial match from, say, the child L, it compares this matchto all partial matches currently bu�ered on R. Thus, theworst-case detection latency of a sequence pattern endingwith Tn is proportional to the number of partial matchesbu�ered on the siblings of the nodes in AncT (Tn). Moreformally, let sibling (N) denote the other child of the parentof N (for the root this function will be unde�ned). Then,Costlattree (T ) =

∑N∈AncT (Tn) PM (sibling (N)) .

For a conjunctive pattern, estimating the detection la-tency is a more di�cult problem, as the last arriving eventis not known in advance. One possible approach is to intro-duce a new system component, called the output pro�ler.The output pro�ler examines the full matches reported asoutput and records the most frequent temporal orders inwhich primitive events appear. Then, as enough informa-tion is collected, the latency function may be de�ned as inthe previous case, subject to the event arrival order with thehighest probability of appearance.Finally, for a disjunctive pattern, we de�ne the latency

cost function as the maximum over the disjunction operands.This de�nition applies also for arbitrary nested patterns.

5.2 Event Selection StrategiesIn addition to event types, operators and predicates, CEP

patterns are further de�ned using the event selection strate-gies [5, 16, 21]. An event selection strategy speci�es howevents are selected from an input stream for partial matches.In this section, we discuss four existing strategies and showhow a reduction from JQPG to CPG can support them.Until now, we have implicitly assumed the skip-till-any-

match selection strategy [5], which permits a primitive eventto participate in an unlimited number of matches. This

strategy is the most �exible, as it allows all possible combi-nations of events comprising a match to be detected. How-ever, some streaming applications do not require such func-tionality. Thus, additional strategies were de�ned.The skip-till-next-match selection strategy [5] limits an

event to appear in at most a single full match. This is en-forced by �consuming� events already assigned to a match.While this strategy prevents some matches from being dis-covered, it considerably simpli�es the detection process. In asystem operating under skip-till-next-match, our cost modelwill no longer provide a correct estimate for a number of par-tial matches. However, since most JQPG algorithms do notdepend on a speci�c cost function, we can solve this issue byreplacing Costord and Costtree with newly devised models.Let us examine the number of partial matches in an order-

based setting under the skip-till-next-match strategy. Wewill denote by m [k] the number of matches of size k ex-pected to exist simultaneously in a time window. Obviously,m [1] = W · rp1 , where Tp1 is the �rst event type in the se-lected evaluation order. For the estimate of m [2], there aretwo possibilities. If rp1 > rp2 , there will not be enoughinstances of Tp2 to match all existing instances of Tp1 , andsome of the existing matches of size 1 will never be extended.Hence, m [2] = W · rp2 in this case. Otherwise, as an exist-ing partial match cannot be extended by more than a singleevent of type Tp2 , m [1] will be equal to m [2]. In addition, ifa mutual condition exists between Tp1and Tp2 , the resultingexpression has to be multiplied by selp1,p2 .By extending this reasoning to an arbitrary partial match,

we obtain the following expression:

m [k] =W ·min (rp1 , rp2 , · · · , rpk )·∏

i,j≤k;i≤j

selpi,pj ; 1 ≤ k ≤ n.

And the new cost function for order-based CPG is

Costnextord (O) =

n∑k=1

(W ·m [k]) .

Using similar observations, the above result can be triv-ially extended for the tree-based model.The two remaining selection strategies, strict contiguity

and partition contiguity [5], further restrict the appearanceof events in a match. The strict contiguity requirementforces the selected events to be contiguous in the inputstream, i.e., it allows no other events to appear in between.The partition contiguity strategy is a slight relaxation ofthe above. It partitions the input stream according to somecondition and only requires the events located in the samepartition to be contiguous.We will base the cost models of these strategies on the one

presented above for skip-till-next-match. To express strictcontiguity, we will augment each event with an attributere�ecting its position in the stream. Then, we will add acondition for each pair of potentially neighboring events, re-quiring the numbers to be adjacent. For partition contiguity,the new attribute will represent an inner, per-partition or-der rather than a global one. The new contiguity conditionwill �rst compare the partition IDs of the two events, andonly verify their positional counters if the IDs match. Weassume that the value distribution across the partitions re-mains unchanged. Otherwise, the evaluation plan is to begenerated on a per-partition basis. Techniques incorporat-ing per-partition plans are beyond the scope of this paperand are a subject for our future research.

1339

6. EXPERIMENTAL EVALUATIONIn this section, we present our experimental study on real-

world data. Our main goal was to compare some of the well-known JQPG algorithms, adapted for CPG as describedabove, to the currently used methods developed directly forCPG. The results demonstrate the superiority of the formerin terms of quality and scalability of the generated plans.

6.1 CPG and JQPG AlgorithmsWe implemented 5 order-based and 3 tree-based CPG al-

gorithms. Out of those, 3 order-based and 2 tree-based al-gorithms are JQPG methods adapted to the CEP domain.The rest are native CPG techniques. The order-based plangeneration algorithms included the following:� Trivial order (TRIVIAL) - the evaluation plan is set to

the initial order of the sequence pattern. This strategy isused in various CEP engines based on NFAs, such as SASE[51] and Cayuga [18].� Event frequency order (EFREQ) - the events are pro-

cessed by the ascending order of their arrival frequencies.This is the algorithm of choice for frameworks such as PB-CED [6] and the Lazy NFA [30].�Greedy cost-based algorithm (GREEDY) [48] - this greedy

heuristic algorithm for JQPG proceeds by selecting at eachstep the relation which minimizes the value of the cost func-tion. Here and below, unless otherwise stated, we will usecost functions minimizing the intermediate results size (Sec-tions 3.1 and 3.2).� Iterative improvement algorithm (II-RANDOM / II-

GREEDY) [48] - a local search JQPG algorithm, startingfrom some initial execution plan and attempting a set ofmoves to improve the cost function, until a local minimumis reached. We experimented with two variations of this al-gorithm. The �rst, denoted as II-RANDOM, starts froma random order. The second, denoted as II-GREEDY, �rstapplies a greedy algorithm to create an initial state. In bothcases, the functions used to traverse between states are swap(the positions of two event types in a plan are swapped) andcycle (the positions of three event types are shifted).�Dynamic programming algorithm for left-deep trees (DP-

LD) [46] - this exponential-time algorithm utilizes dynamicprogramming to produce a provably optimal execution plan.The result is limited to a left-deep tree topology.For the tree-based plan generation algorithms, the follow-

ing were used:� ZStream plan generation algorithm (ZSTREAM) [36] -

creates an evaluation tree by iterating over all possible treetopologies for a given sequence of leaves.� ZStream with greedy cost-based ordering (ZSTREAM-

ORD) - as was demonstrated in Section 2.3, the limitationof the ZStream algorithm is in its inability to modify theorder of tree leaves. This algorithm attempts to utilize anorder-based JQPG method to overcome this drawback. Itoperates by �rst executing GREEDY on the leaves of thetree to produce a 'good' ordering, then applying ZSTREAMon the resulting list.� Dynamic programming algorithm for bushy trees (DP-

B) [46] - same as DP-LD, but without the topology restric-tion.

6.2 Experimental SetupThe data used during the experiments was taken from

the NASDAQ stock market historical records [1]. Each data

(a)

(b)

Figure 3: Throughput for di�erent pattern types (higher isbetter): (a) order-based methods; (b) tree-based methods.

record represents a single update to the price of a stock,spanning a 1-year period and covering over 2100 stock iden-ti�ers with prices periodically updated. Our input streamcontained 80,509,033 primitive events, each consisting of astock identi�er, a timestamp, and a current price. For eachidenti�er, a separate event type was de�ned. We also aug-mented the event format with the precalculated di�erencebetween the current and the previous price of each stock.The majority of the experiments were performed sepa-

rately on 5 sets of patterns: (1) pure sequences; (2) se-quences with a negated event (marked as 'negation' pat-terns in the graphs below); (3) conjunctions; (4) sequencescontaining an event under KL operator (marked as 'Kleeneclosure' patterns); (5) composite patterns, consisting of adisjunction of three sequences (marked as 'disjunction' pat-terns). Each set contained 500 patterns with the sizes (num-bers of the participating events) ranging from 3 to 7, 100patterns for each value. The pattern time window was setto 20 minutes.The pattern structure was motivated by the problem of

monitoring the relative changes in stock prices. Each pat-tern included a number of predicates, roughly equal to halfthe size of a pattern, comparing the di�erence attributes oftwo of the involved event types. For example, one patternof size 3 from the set of conjunctions was de�ned as follows:

PATTERN AND(MSFTStock m, GOOGStock g, INTCStock i)WHERE (m.difference < g.difference)WITHIN 20 minutes.

All arrival rates and predicate selectivities were calculatedduring the preprocessing stage. The measured arrival ratesvaried between 1 and 45 events per second, and the selectiv-ities ranged from 0.002 to 0.88. As discussed in Section 5, inmost real-life scenarios these statistics are not available inadvance and may �uctuate frequently and signi�cantly dur-ing runtime. We experimentally study the impact of theseissues in a separate paper [28].

1340

(a)

(b)

Figure 4: Memory consumption for di�erent pattern types(lower is better): (a) order-based methods; (b) tree-basedmethods.

To compare a set of plan generation algorithms, we imple-mented two evaluation mechanisms discussed in this paper,the out-of-order lazy NFA [30] and the instance-based treemodel based on ZStream [36] as presented in Section 2.3.The former was then used to evaluate plans created by eachorder-based CPG or JQPG algorithm on the patterns gen-erated as described below. The latter was similarly used forcomparing tree-based plans.We selected throughput and memory consumption as our

performance metrics for this study. Throughput was de�nedas the number of primitive events processed per second dur-ing pattern detection using the selected plan. To estimatethe memory consumption, we measured the peak memoryrequired by the system during evaluation. The metrics wereacquired separately for each pattern, and the presented re-sults were then calculated by taking the average.All models and algorithms were implemented in Java. The

experiments were run on a machine with 2.20 Ghz CPU and16.0 GB RAM and took about 2.5 months to complete.

6.3 Experimental ResultsFigures 3 and 4 present the comparison of the plan genera-

tion algorithms described in Section 6.1 in terms of through-put and memory consumption, respectively. Each group rep-resents the results obtained on a particular set of patternsdescribed above, and each bar depicts the average value of aperformance metric for a particular algorithm. For clarity,order-based and tree-based methods are shown separately.On average, the plans generated using JQPG algorithms

achieve a considerably higher throughput than those cre-ated using native CPG methods. For order-based plans, theperceived gain of the best-performed DP-LD over EFREQranged from a factor of 1.7 for iteration patterns to 2.7 forconjunctions. Similar results were obtained for tree-basedplans (ZSTREAM vs. DP-B). JQPG methods also displaybetter overall memory utilization. The order-based JQPG

(a)

(b)

Figure 5: Throughput as a function of the sequence patternlength (higher is better): (a) order-based methods; (b) tree-based methods.

plans consume about 65-85% of the memory required bythose produced by EFREQ. An even greater di�erence wasobserved for tree-based plans, with DP-B using up to almost4 times less memory than the CEP-native ZSTREAM.Unsurprisingly, the best performance was observed for

plans created using the exhaustive algorithms based on dy-namic programming, namely DP-LD and DP-B. However,due to the exponential complexity of these algorithms, theiruse in practice may be problematic for large patterns, es-pecially in systems where new evaluation plans are to begenerated with high frequency. Thus, one goal of the ex-perimental study was to test the exhaustive JQPG methodsagainst the nonexhaustive ones (such as GREEDY and II al-gorithms) to see whether the performance gain of the formercategory is worth the high plan generation cost.For the order-based case, the answer is indeed negative,

as the results for DP-LD and the heuristic JQPG algorithmsare comparable and no signi�cant advantage is achieved bythe former. Due to the relatively small size of the left-deeptree space, the heuristics usually succeed in locating theglobally optimal plan. Moreover, the II-GREEDY algorithmgenerally produces plans that are slightly more memory-e�cient. This can be attributed to our cost model, whichonly counts the partial matches, but does not capture theother factors such as the size of the bu�ered events. Thepicture looks entirely di�erent for the tree-based methods,where DP-B displays a convincing advantage over both thebasic ZStream algorithm and its combination with the greedyheuristic method.Another important conclusion from Figures 3 and 4 is

that methods following the tree-based model greatly out-perform the order-based ones, both in throughput and mem-ory consumption. This is not a surprising outcome, as thetree-based algorithms are capable of creating a signi�cantlylarger space of plans. However, the best order-based JQPGalgorithm (DP-LD) is comparable or even superior to theCPG-native ZStream in most settings.Figure 5 depicts the throughput as a function of the pat-

tern size. For lack of space, we only present here the evalu-ation performed on sequence patterns. The results obtained

1341

(a)

(b)

Figure 6: Performance metrics as a function of the cost fororder-based and tree-based patterns: (a) throughput; (b)memory consumption.

for the rest of the pattern sets, as well as the respectivememory consumption measurements, follow similar trendsand are discussed in the extended paper [27]. Althoughthe performance of all methods degrades drastically as thepattern size grows, the relative throughput gain for JQPGmethods over native CPG methods is consistently higherfor longer sequences. This is especially evident for the tree-based variation of the problem (5(b)), where the most ef-�cient JQPG algorithm (DP-B) achieves 7.6 times higherthroughput than the native CPG framework (ZSTREAM)for patterns of length 7, compared to a speedup of only 1.2times for patterns of 3 events. We can thus conclude that, atleast for the pattern sizes considered in this study, the JQPGmethods provide a considerably more scalable solution.In our next experiment, we evaluated the quality of the

cost functions used during plan generation. To that end,we created 60 order-based and 60 tree-based plans for pat-terns of various types using di�erent algorithms. The planswere then executed on the stock dataset. The throughputand the memory consumption measured during each execu-tion are shown in Figure 6 as the function of the cost as-signed to each plan by the corresponding function (Costordor Costtree). The obtained throughput seems to be inverselyproportional to the cost, behaving roughly as 1

xc ; c ≥ 1. Formemory consumption, an approximately linear dependencycan be observed. These results match our expectations, asa cheaper plan is supposed to yield better performance andrequire less memory. We may thus conclude that the costsreturned by Costord and Costtree provide a reasonably ac-curate estimation of the actual performance of a plan.The above conclusion allowed us to repeat the experiments

summarized in Figure 5 for larger patterns, using the plancost as the objective function. We generated 200 patternsof sizes ranging from 3 to 22. We then created a set of plansfor each pattern using di�erent algorithms and recorded theresulting plan costs. Due to the exponential growth of thecost with the pattern size, directly comparing the costs wasimpractical. Instead, the normalized cost was calculated forevery plan. The normalized cost of a plan Pl created byan algorithm A for a pattern P was de�ned as the cost ofa plan generated for P by the empirically worst algorithm(the CEP-native EFREQ), divided by the cost of Pl.

(a)

(b)

Figure 7: Generation of large plans (selected algorithms):(a) average normalized plan cost (higher is better); (b) aver-age plan generation time (logarithmic scale, lower is better).The results are presented as a function of pattern size.

The results for selected algorithms are depicted in Fig-ure 7(a). Each data point represents an average normalizedcost for all plans of the same size created by the same algo-rithm. As we observed previously, the DP-based join algo-rithms consistently produced signi�cantly cheaper plans (upto a factor of 57) than the heuristic alternatives. Also, theworst JQPG method (GREEDY) and the best CPG method(ZSTREAM) produced plans of similar quality, with theformer slightly overperforming the latter for larger patternsizes. The worst-performing EFREQ algorithm was used fornormalized cost calculation and is thus not shown.Figure 7(b) presents the plan generation times measured

during the above experiment. The results are displayed inlogarithmic scale. While all algorithms incur only negligibleoptimization overhead for small patterns, it grows rapidlyfor methods based on dynamic programming (for a patternof length 22, it took over 50 hours to create a plan using DP-B). This severely limits the applicability of the DP-based ap-proaches when the number of events in a pattern is high. Onthe other hand, all non-DP algorithms were able to completein under a second even for the largest tested patterns. Thejoin-based greedy algorithm (GREEDY) demonstrated thebest overall trade-o� between optimization time and quality.Next, we studied the performance of the hybrid throughput-

latency cost model introduced in Section 5.1. Each of the6 JQPG-based methods discussed in Section 6.1 was evalu-ated using three di�erent values for the throughput-latencytrade-o� parameter α: 0, 0.5 and 1. Note that for the �rstcase (α = 0) the resulting cost model is identical to the onede�ned in Section 3 and used in the experiments above. Foreach algorithm and for each value of α, the throughput andthe average latency (in milliseconds) were measured.Figure 8 demonstrates the results, averaged over 500 pat-

terns included in the sequence pattern set. Measurementsobtained using the same algorithm are connected by straightlines, and the labels near the highest points (diamonds) indi-cate the algorithms corresponding to these points. It can beseen that increasing the value of α results in a signi�cantlylower latency. However, this also results in a considerabledrop in throughput for most algorithms. By �ne-tuning thisparameter, the desired latency can be achieved with min-

1342

Figure 8: Throughput vs. latency using di�erent valuesfor the alpha parameter of the cost model.

Figure 9: Throughput for di�erent event selection strate-gies (logarithmic scale).

imal loss in throughput. It can also be observed that thetree-based algorithms DP-B and ZSTREAM-ORD achievea substantially better throughput-latency trade-o� as com-pared to other methods.Finally, we performed a comparative throughput evalua-

tion of the sequence pattern set under three di�erent eventselection strategies: skip-till-any-match, skip-till-next-matchand contiguity (Section 5.2). The results are depicted in Fig-ure 9 for selected algorithms. Due to large performance gapsbetween the examined methods, the results are displayedin logarithmic scale. For skip-till-next-match, JQPG meth-ods hold a clear advantage, albeit less signi�cant than theone demonstrated above for skip-till-any-match. The oppo-site observation can be made about the contiguity strategy,where the trivial algorithm following a static plan outper-forms other, more complicated methods. Due to the sim-plicity of the event detection process and the lack of nonde-terminism in this case, the plan set by an input speci�ca-tion always performs best, while the alternatives introduce aslight additional overhead of reordering and event bu�ering.

7. RELATED WORKSystems for scalable detection of complex events have be-

come an increasingly important research �eld during lastdecades [16, 21]. Their inception can be traced to earliersystems for massive data stream processing [3, 8, 11, 12].Later, a broad variety of general purpose complex event pro-cessing solutions emerged [4, 6, 10, 15, 18, 30, 34, 36, 42, 45,51], including the widely used commercial CEP providers,such as Esper [2] and IBM System S [7].Various performance optimization techniques are imple-

mented in CEP systems [23]. In [42], a rewriting frame-work is described, based on unifying and splitting patterns.A method for e�cient Kleene closure evaluation based onsharing with postponed operators is discussed in [53], while

in [41] the above problem is solved by maintaining a com-pact graph encoding of event sequences and utilizing it fore�ective reuse. RunSAT [20] utilizes another approach, pre-processing a pattern and setting optimal points for termi-nation of the detection process. ZStream [36] presents anoptimization framework for optimal tree generation, basedon a complex cost model. Advanced methods were also pro-posed for multi-query CEP optimization [17, 35, 43, 44, 54].CEP engines utilizing the order-based evaluation approach

have also adopted di�erent optimization strategies. SASE[51], Cayuga [18] and T-Rex [15] design e�cient data struc-tures to enable smart runtime memory management. TheseNFA-based mechanisms do not support out-of-order process-ing, and hence are still vulnerable to the problem of large in-termediate results. In [6, 30, 45], various pattern reorderingmethods for e�cient order-based complex event detectionare described. None of these works takes the selectivities ofthe event constraints into account.Calculating an optimal evaluation plan for a join query

has long been considered one of the most important prob-lems in the area of query optimization [47]. Multiple authorshave shown the NP-completeness of this problem for arbi-trary query graphs [13, 24], and a wide range of methodswere proposed to provide either exact or approximate close-to-optimal solutions [31, 32, 33, 38, 46, 47, 48].Methods for join query plan generation can be roughly

divided into two main categories. The heuristic algorithmsproduce fast solutions, but the resulting execution plans areoften far from the optimum. They are often based on com-binatorial [25, 47, 48] or graph-based [32, 33] techniques.The second category, the exhaustive algorithms, provide

provable guarantees on the optimality of the returned solu-tions. These methods are often based on dynamic program-ming [38, 46] and thus su�er from worst-case exponentialcomplexity. Hybrid techniques presenting a trade-o� be-tween the speed of heuristic approaches and the precision ofDP-based approaches were also proposed [31].Incorporating join optimization techniques from traditional

DBMSs was already considered in the �elds related to CEP,such as XPath [22] and data stream processing [12, 23]. Tothe best of our knowledge, none of the above works providesa formal reduction to JQPG.

8. CONCLUSIONS AND FUTURE WORKIn this paper, we studied the relationship between CEP

Plan Generation and Join Query Plan Generation. It wasshown that the CPG problem is equivalent to JQPG for asubset of pattern types, and reducible to it for other types.We discussed how close-to-optimal solutions to CPG can bee�ciently obtained in practice by applying existing JQPGmethods. The presented experimental evaluation resultssupported our theoretical analysis. Our future work will tar-get advanced challenges of applying join-related techniquesin the �eld of CEP, such as handling predicate dependenciesand data uncertainty.

9. ACKNOWLEDGMENTSThe research leading to these results has received funding

from the European Union's Horizon 2020 Research And In-novation Programme under grant agreement no. 688380 andwas partially supported by the Israel Science Foundation(grant no. 919191) and by HPI-Technion Research School.

1343

10. REFERENCES[1] http://www.eoddata.com.

[2] http://www.espertech.com.

[3] D. J. Abadi, Y. Ahmad, M. Balazinska, M. Cherniack,J. Hwang, W. Lindner, A. S. Maskey, E. Rasin,E. Ryvkina, N. Tatbul, Y. Xing, and S. Zdonik. Thedesign of the Borealis stream processing engine. InCIDR, pages 277�289, 2005.

[4] A. Adi and O. Etzion. Amit - the situation manager.The VLDB Journal, 13(2):177�203, 2004.

[5] J. Agrawal, Y. Diao, D. Gyllstrom, and N. Immerman.E�cient pattern matching over event streams. InSIGMOD, pages 147�160, 2006.

[6] M. Akdere, U. Çetintemel, and N. Tatbul. Plan-basedcomplex event detection across distributed sources.PVLDB, 1(1):66�77, 2008.

[7] L. Amini, H. Andrade, R. Bhagwan, F. Eskesen,R. King, P. Selo, Y. Park, and C. Venkatramani. Spc:A distributed, scalable platform for data mining. InProceedings of the 4th International Workshop onData Mining Standards, Services and Platforms, pages27�37, New York, NY, USA, 2006. ACM.

[8] A. Arasu, S. Babu, and J. Widom. The CQLcontinuous query language: Semantic foundations andquery execution. The VLDB Journal, 15(2):121�142,2006.

[9] S. Babu, R. Motwani, K. Munagala, I. Nishizawa, andJ. Widom. Adaptive ordering of pipelined stream�lters. In Proceedings of the 2004 ACM SIGMODInternational Conference on Management of Data,pages 407�418, New York, NY, USA, 2004. ACM.

[10] R. S. Barga, J. Goldstein, M. H. Ali, and M. Hong.Consistent streaming through time: A vision for eventstream processing. In CIDR, pages 363�374, 2007.

[11] S. Chandrasekaran, O. Cooper, A. Deshpande, M. J.Franklin, J. M. Hellerstein, W. Hong,S. Krishnamurthy, S. Madden, V. Raman, F. Reiss,and M. A. Shah. Telegraphcq: Continuous data�owprocessing for an uncertain world. In CIDR, 2003.

[12] J. Chen, D. J. DeWitt, F. Tian, and Y. Wang.Niagaracq: A scalable continuous query system forinternet databases. SIGMOD Rec., 29(2):379�390,2000.

[13] S. Cluet and G. Moerkotte. On the complexity ofgenerating optimal left-deep processing trees withcross products. In Proceedings of the 5th InternationalConference on Database Theory, ICDT '95, pages54�67, London, UK, 1995. Springer-Verlag.

[14] G. Cugola and A. Margara. Tesla: a formally de�nedevent speci�cation language. In DEBS, pages 50�61.ACM, 2010.

[15] G. Cugola and A. Margara. Complex event processingwith T-REX. J. Syst. Softw., 85(8):1709�1728, 2012.

[16] G. Cugola and A. Margara. Processing �ows ofinformation: From data stream to complex eventprocessing. ACM Comput. Surv., 44(3):15:1�15:62,2012.

[17] A. Demers, J. Gehrke, M. Hong, M. Riedewald, andW. White. Towards expressive publish/subscribesystems. In Proceedings of the 10th InternationalConference on Advances in Database Technology,pages 627�644. Springer-Verlag.

[18] A. Demers, J. Gehrke, B. Panda, M. Riedewald,V. Sharma, and W. White. Cayuga: A generalpurpose event monitoring system. In CIDR, pages412�422, 2007.

[19] A. Deshpande, Z. Ives, and V. Raman. Adaptive queryprocessing. Foundations and Trends in Databases,1(1):1�140, January 2007.

[20] L. Ding, S. Chen, E. A. Rundensteiner, J. Tatemura,W. P. Hsiung, and K. S. Candan. Runtime semanticquery optimization for event stream processing. IEEE24th International Conference on Data Engineering(ICDE), 0:676�685, 2008.

[21] O. Etzion and P. Niblett. Event Processing in Action.Manning Publications Co., 2010.

[22] T. Grust, J. Rittinger, and J. Teubner. Whyo�-the-shelf RDBMSs are better at XPath than youmight expect. In Proceedings of the 2007 ACMSIGMOD International Conference on Management ofData, SIGMOD '07, pages 949�958, New York, NY,USA, 2007. ACM.

[23] M. Hirzel, R. Soulé, S. Schneider, B. Gedik, andR. Grimm. A catalog of stream processingoptimizations. ACM Comput. Surv., 46(4):46:1�46:34,March 2014.

[24] T. Ibaraki and T. Kameda. On the optimal nestingorder for computing n-relational joins. ACM Trans.Database Syst., 9(3):482�502, 1984.

[25] Y. Ioannidis and Y. Kang. Randomized algorithms foroptimizing large join queries. SIGMOD Rec.,19(2):312�321, May 1990.

[26] Y. Ioannidis and Y. Kang. Left-deep vs. bushy trees:An analysis of strategy spaces and its implications forquery optimization. SIGMOD Rec., 20(2):168�177,April 1991.

[27] I. Kolchinsky and A. Schuster. Join queryoptimization techniques for complex event processingapplications. CoRR, abs/1801.09413, 2017.

[28] I. Kolchinsky and A. Schuster. E�cient adaptivedetection of complex event patterns. PVLDB,11(11):1346�1359, 2018.

[29] I. Kolchinsky, A. Schuster, and D. Keren. E�cientdetection of complex event patterns using lazy chainautomata. CoRR, abs/1612.05110, 2016.

[30] I. Kolchinsky, I. Sharfman, and A. Schuster. Lazyevaluation methods for detecting complex events. InDEBS, pages 34�45. ACM, 2015.

[31] D. Kossmann and K. Stocker. Iterative dynamicprogramming: A new class of query optimizationalgorithms. ACM Trans. Database Syst., 25(1):43�82,2000.

[32] R. Krishnamurthy, H. Boral, and C. Zaniolo.Optimization of nonrecursive queries. In Proceedingsof the 12th International Conference on Very LargeData Bases, VLDB '86, pages 128�137, San Francisco,CA, USA, 1986. Morgan Kaufmann Publishers Inc.

[33] C. Lee, C.S. Shih, and Y.H. Chen. A graph-theoreticmodel for optimizing large join queries. In DASFAA,volume 6 of Advanced Database Research andDevelopment Series, pages 87�96. World Scienti�c,1997.

[34] M. Liu, E. Rundensteiner, D. Dougherty, C. Gupta,S. Wang, I. Ari, and A. Mehta. High-performance

1344

nested CEP query processing over event streams. InProceedings of the 27th International Conference onData Engineering, ICDE 2011, pages 123�134.

[35] M. Liu, E. Rundensteiner, K. Green�eld, C. Gupta,S. Wang, I. Ari, and A. Mehta. E-cube:Multi-dimensional event sequence analysis usinghierarchical pattern query sharing. In Proceedings ofthe 2011 ACM SIGMOD International Conference onManagement of Data, SIGMOD '11, pages 889�900,New York, NY, USA, 2011. ACM.

[36] Y. Mei and S. Madden. ZStream: a cost-based queryprocessor for adaptively detecting composite events.In SIGMOD Conference, pages 193�206. ACM, 2009.

[37] G. Moerkotte. Building query compilers. 2014.http://pi3.informatik.uni-mannheim.de/ moer/querycompiler.pdf.

[38] G. Moerkotte and T. Neumann. Analysis of twoexisting and one new dynamic programming algorithmfor the generation of optimal bushy join trees withoutcross products. In Proceedings of the 32ndInternational Conference on Very Large Data Bases,pages 930�941. VLDB Endowment, 2006.

[39] C. Monma and J. Sidney. Sequencing withseries-parallel precedence constraints. Math. Oper.Res., 4(3):215�224, August 1979.

[40] M. Orlowski. On optimization of joins in distributeddatabase system. In Future Databases 92, pages106�114. 1992.

[41] O. Poppe, C. Lei, S. Ahmed, and E. Rundensteiner.Complete event trend detection in high-rate eventstreams. In Proceedings of the 2017 ACM InternationalConference on Management of Data, SIGMOD '17,pages 109�124, New York, NY, USA, 2017. ACM.

[42] E. Rabinovich, O. Etzion, and A. Gal. Patternrewriting framework for event processing optimization.In Proceedings of the 5th ACM InternationalConference on Distributed Event-based Systems, pages101�112. ACM, 2011.

[43] M. Ray, C. Lei, and E. A. Rundensteiner. Scalablepattern sharing on event streams. In Proceedings ofthe 2016 International Conference on Management ofData, pages 495�510, New York, NY, USA. ACM.

[44] M. Ray, E. Rundensteiner, M. Liu, C. Gupta,S. Wang, and I. Ari. High-performance complex eventprocessing using continuous sliding views. InProceedings of the 16th International Conference onExtending Database Technology, EDBT '13, pages525�536, New York, NY, USA, 2013. ACM.

[45] N. P. Schultz-Møller, M. M., and P. R. Pietzuch.Distributed complex event processing with queryrewriting. In DEBS. ACM, 2009.

[46] P. Selinger, M. Astrahan, D. Chamberlin, R. Lorie,and T. Price. Access path selection in a relationaldatabase management system. In Proceedings of the1979 ACM SIGMOD Conference, pages 23�34, 1979.

[47] M. Steinbrunn, G. Moerkotte, and A. Kemper.Heuristic and randomized optimization for the joinordering problem. The VLDB Journal, 6(3):191�208,1997.

[48] A. Swami. Optimization of large join queries:Combining heuristics and combinatorial techniques.SIGMOD Rec., 18(2):367�376, 1989.

[49] I. Trummer and C. Koch. Multi-objective parametricquery optimization. SIGMOD Rec., 45(1):24�31, 2016.

[50] B. Vance and D. Maier. Rapid bushy join-orderoptimization with cartesian products. SIGMOD Rec.,25(2):35�46, June 1996.

[51] E. Wu, Y. Diao, and S. Rizvi. High-performancecomplex event processing over streams. In SIGMODConference, pages 407�418. ACM, 2006.

[52] I. Yi, J. G. Lee, and K. Y. Whang. Apam: Adaptiveeager-lazy hybrid evaluation of event patterns for lowlatency. In Proceedings of the 25th ACM Conferenceon Information and Knowledge Management, pages2275�2280. ACM, 2016.

[53] H. Zhang, Y. Diao, and N. Immerman. On complexityand optimization of expensive queries in complexevent processing. In SIGMOD, pages 217�228, 2014.

[54] S. Zhang, H. T. Vo, D. Dahlmeier, and B. He.Multi-query optimization for complex event processingin SAP ESP. In 33rd IEEE International Conferenceon Data Engineering, ICDE 2017, San Diego, CA,USA, April 19-22, 2017, pages 1213�1224, 2017.

1345