Complex Event Processing

What is Complex Event Processing?

Complex Event Processing (CEP) is a technology for low-latency filtering, correlating, aggregating, and computing on real-world event data.

Within the Complex Event Processing industry, some event processing systems can only process events or messages as they pass by, without necessarily retaining and processing state. However, StreamBase's software for event processing (also referred to as Event Stream Processing, or ESP) handles the bulk, if not all, of traditional Complex Event Processing.

First, it is critical to define several terms. The processing of messages as they arrive is called "real-time processing", and the processing of stored events using a sophisticated and optimized storage mechanism is called "historical processing". Another term that shows up in the Complex Event Processing literature is an "event cloud", or "cloud" for short. StreamBase believes that a "cloud" is really a manifestation of historical processing coupled with real-time processing. The power of a good event processing platform lies in how well it integrates real-time and historical processing. In other words, a cloud can be easily simulated by a modern stream processing system.




StreamSQL & Complex Event Processing

The most full-featured event processing systems add database functionality by combining streaming operations with the SQL programming language. StreamBase does this via a language called StreamSQL. StreamSQL (and other SQL derivatives) allow users to mix access to the non-persistent stream with access to the stored database, which can be massive. They also extend the SQL DML to allow updates to stored tables from tuples that appear on a stream. Thus, any part of the stream can be stored for further use. Here is a very simple example of how that might work.

INSERT INTO IBMTicks (Symbol, Date, Price)
SELECT Symbol, Date, Price FROM Ticks
WHERE Symbol = "IBM"

This query is registered with the system and operates continuously. It stores any new IBM tick that appears on the stream called Ticks into the table called IBMTicks.
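The continuous behavior described above can be imitated outside a stream engine. The following is only a rough sketch, assuming Python with an in-memory SQLite table standing in for the stored table; it is not StreamSQL and not how StreamBase executes the query. The tick values and the on_tick function are hypothetical.

```python
import sqlite3

# Hypothetical ticks; in a real engine these would arrive over time.
ticks = [
    ("IBM", "2008-01-02", 101.5),
    ("MSFT", "2008-01-02", 35.2),
    ("IBM", "2008-01-03", 102.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE IBMTicks (Symbol TEXT, Date TEXT, Price REAL)")

def on_tick(tick):
    """Imitate the registered query: evaluated once per arriving tuple."""
    symbol, _date, _price = tick
    if symbol == "IBM":  # plays the role of the WHERE clause
        conn.execute(
            "INSERT INTO IBMTicks (Symbol, Date, Price) VALUES (?, ?, ?)", tick
        )

for tick in ticks:  # stand-in for the stream called Ticks
    on_tick(tick)

stored = conn.execute(
    "SELECT Symbol, Date, Price FROM IBMTicks ORDER BY Date"
).fetchall()
print(stored)  # [('IBM', '2008-01-02', 101.5), ('IBM', '2008-01-03', 102.0)]
```

Only the two IBM ticks reach the table; the MSFT tick passes by without being stored.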



Complex Event Processing Clouds

What comprises a "cloud" as defined in Complex Event Processing systems? A cloud allows for arbitrary orderings primarily because it is an abstraction of storage. Whenever storage exists, it is possible to sort the items in the store in any way that suits the application. Beyond this, a cloud supports some additional functionality, including:

  1. Active processing — event driven
  2. Complex events
  3. Causality and posets
  4. Pattern matching

Each of these can be addressed effectively with event processing technology.

Active processing means that computation in the cloud is initiated by the arrival of a new event. A set of rules looks for patterns involving this event in relation to previously stored events (from the cloud) and reacts by creating one or more new (possibly composite) events. This is precisely what a stream processing engine can do with the arrival of new events with respect to its own internal store.

Complex events are events that are composed out of other more basic component events. In the relational context, this is easily handled with a Join. Joins find related tuples that are then concatenated into a larger tuple (or complex event).
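As a minimal sketch of a Join producing a complex event, assume two hypothetical tables of basic events, Orders and Fills, related by an order_id column. Neither table nor its columns comes from StreamBase; SQLite is used here only because it runs plain SQL from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (order_id INTEGER, symbol TEXT, qty INTEGER);
    CREATE TABLE Fills  (order_id INTEGER, price REAL);
    INSERT INTO Orders VALUES (1, 'IBM', 100), (2, 'MSFT', 50);
    INSERT INTO Fills  VALUES (1, 101.5);
""")

# Each result row concatenates an order event with its matching fill event,
# forming one larger tuple (the complex event).
complex_events = conn.execute("""
    SELECT o.order_id, o.symbol, o.qty, f.price
    FROM Orders AS o
    JOIN Fills  AS f ON f.order_id = o.order_id
""").fetchall()
print(complex_events)  # [(1, 'IBM', 100, 101.5)]
```

Order 2 has no fill yet, so no complex event is produced for it.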


 

Causality in Complex Event Processing

Causality (and other relationships) is modeled within a cloud as a partial order between events. A strict partial order can be described (in the mathematical sense) as a binary relation R that is irreflexive (aRa never holds), asymmetric (if aRb then bRa does not hold), and transitive (if aRb and bRc then aRc). A set together with such a relation is called a partially ordered set (poset). The important thing to note is that the order is a relation R among the members of a given set. For finite sets, it is not too surprising that such a (mathematical) relation is easily captured as a stored relation in a relational database system. An example of how this might look in practice is found below. Adding language extensions like those in StreamSQL (or CCL) enhances these manipulations with a connection to real-time streams of events. This gets us the event-orientation of a cloud as well.
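For a finite relation stored as a set of pairs, the defining properties of a strict partial order can be checked mechanically. The sketch below is illustrative only: it tests irreflexivity, asymmetry (which, for an irreflexive relation, is equivalent to antisymmetry), and transitivity. The pairs are borrowed from the gossip example later on this page, and the function names are hypothetical.

```python
from itertools import product

# Direct "told" pairs (teller, listener) from the gossip example.
told = {("Joe", "Mary"), ("Mary", "Pete"), ("Pete", "Jim"), ("Pete", "Frank")}

def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until nothing changes."""
    closure = set(pairs)
    while True:
        new = {(a, d) for (a, b), (c, d) in product(closure, closure) if b == c}
        if new <= closure:
            return closure
        closure |= new

def is_strict_partial_order(rel):
    irreflexive = all(a != b for a, b in rel)
    asymmetric = all((b, a) not in rel for a, b in rel)
    transitive = all(
        (a, d) in rel for (a, b), (c, d) in product(rel, rel) if b == c
    )
    return irreflexive and asymmetric and transitive

caused = transitive_closure(told)  # direct and indirect causation
print(is_strict_partial_order(told))    # False: Joe -> Pete is missing
print(is_strict_partial_order(caused))  # True
```

The direct pairs alone are not transitive; their transitive closure, which captures indirect causation, is a strict partial order.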


 

Pattern Matching in Complex Event Processing

Finally, a pattern matching capability is typically included in the programming model of a cloud. In broad terms, these patterns allow the detection of relationships between events that are based on temporal ordering. Thus, one may want to ask for events that precede a given event or set of events. While many of these pattern-based queries can be captured in SQL-99, the expression can be awkward. Thus, most modern stream-processing engines enhance their query languages with direct support for temporal patterns. To a large extent, this is simply better syntax for capability that is already there. Adding syntax that captures new idioms is not a fundamental problem. The new syntax can be transformed to the old with a minimal amount of effort. Missing functionality would be a much more serious problem, but, at this point in time, it is difficult to see what that would be.
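To illustrate how a temporal pattern can be written in plain SQL, here is a sketch of "an order followed by a cancel within 5 time units" expressed as a self-join. The Events table, its columns, and the sample rows are all hypothetical, and SQLite is only a convenient stand-in; dedicated pattern syntax in a stream engine would express the same idea more compactly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Events (event_id INTEGER, kind TEXT, ts INTEGER);
    INSERT INTO Events VALUES
        (1, 'order',  10),
        (2, 'order',  12),
        (3, 'cancel', 14),
        (4, 'order',  30);
""")

# Pair every order with any cancel that follows it within 5 time units.
matches = conn.execute("""
    SELECT a.event_id, b.event_id
    FROM Events AS a
    JOIN Events AS b
      ON a.kind = 'order'
     AND b.kind = 'cancel'
     AND a.ts < b.ts
     AND b.ts - a.ts <= 5
    ORDER BY a.event_id
""").fetchall()
print(matches)  # [(1, 3), (2, 3)]
```

The query works, but the ordering constraints are buried in join conditions, which is the awkwardness that pattern-oriented syntax is meant to remove.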

SQL has been highly successful because of its declarative programming model. Declarative programming allows the optimizer to make decisions that are based on the current structure of the data (e.g., indices, sort orders, and sizes of data sets). Expressivity is only a part of the story. SQL is a natural choice for stored data, so extending it to include data streams is a great way to achieve a single programming paradigm for both. StreamSQL (like SQL) enables the sophisticated optimizations that are needed to get the message throughput and low latency that ESP applications require. Moreover, extending SQL with constructs like windows offers a natural way to express common analytics that are typically run over bounded runs of data.
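As a rough illustration of a window over a bounded run of data, the sketch below computes a moving average over the current tick and the two before it. It uses a correlated subquery in standard SQL rather than any stream-specific window syntax, and the seq column and sample prices are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Ticks (seq INTEGER, Symbol TEXT, Price REAL);
    INSERT INTO Ticks VALUES
        (1, 'IBM', 100.0),
        (2, 'IBM', 102.0),
        (3, 'IBM', 104.0),
        (4, 'IBM', 106.0);
""")

# For each tick, average the prices of the last three ticks (a window of 3).
moving_avg = conn.execute("""
    SELECT t.seq,
           (SELECT AVG(w.Price)
              FROM Ticks AS w
             WHERE w.Symbol = t.Symbol
               AND w.seq BETWEEN t.seq - 2 AND t.seq) AS avg_price
    FROM Ticks AS t
    ORDER BY t.seq
""").fetchall()
print(moving_avg)  # [(1, 100.0), (2, 101.0), (3, 102.0), (4, 104.0)]
```

A language with first-class windows lets the engine maintain such an aggregate incrementally as tuples arrive, instead of rescanning the stored rows for every tick as this emulation does.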

In summary, real-time ESP integrated with stored data can accomplish the same tasks as Complex Event Processing. For ESP solutions which handle real-time and stored data equally well, the difference between these two terms is so vanishingly small as to be non-existent for practical purposes.


 

Example: Complex Event Processing & Causality

Suppose one is interested in describing events involved in the spreading of a piece of gossip initially known by Joe. Joe tells Mary. Mary tells Pete. Pete tells Jim and Frank, etc. The event of telling someone creates a dependency between the event that makes someone aware of the gossip and the event that transfers this knowledge to the next person(s). These events and their causal relationships can be modeled as a table with the following signature.

Knows (teller, listener)

The rows in this table represent people who know the gossip (the teller column) and the causal relationship between that teller and the listener (in the listener column). Thus, an event database that stores five people who know the gossip might look as follows:

teller   listener
Joe      Mary
Mary     Pete
Pete     Jim
Pete     Frank
Jim      null
Frank    null

Each event of telling the gossip to someone creates a new row. Note that causal relationships are modeled as embedded identifiers in the listener attribute. This attribute could just as well be named causes to indicate the causal path.

Questions regarding causality can be addressed through SQL queries. To determine who caused Jim to know the gossip, execute a simple lookup.
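A sketch of that lookup, assuming the Knows table is loaded into an in-memory SQLite database from Python (the choice of SQLite is an assumption made so the query can be run; the SQL itself is ordinary):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Knows (teller TEXT, listener TEXT)")
conn.executemany("INSERT INTO Knows VALUES (?, ?)", [
    ("Joe", "Mary"), ("Mary", "Pete"), ("Pete", "Jim"),
    ("Pete", "Frank"), ("Jim", None), ("Frank", None),
])

# Who caused Jim to know the gossip? Look up the row where Jim is the listener.
who_told_jim = conn.execute(
    "SELECT teller FROM Knows WHERE listener = ?", ("Jim",)
).fetchall()
print(who_told_jim)  # [('Pete',)]
```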

To identify all of the people whom Jim is responsible for within two tellings, use a straightforward self-join. While it may sound computationally expensive to use a self-join to follow relationships, it need not be if the proper indices are available.
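One possible form of the self-join is sketched below, again assuming SQLite driven from Python. "Within two tellings" is read here as listeners reached in one or two steps, and within_two_tellings is a hypothetical helper. In the sample table Jim has not yet told anyone, so his result is empty; Mary is included to show a non-empty answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Knows (teller TEXT, listener TEXT)")
conn.executemany("INSERT INTO Knows VALUES (?, ?)", [
    ("Joe", "Mary"), ("Mary", "Pete"), ("Pete", "Jim"),
    ("Pete", "Frank"), ("Jim", None), ("Frank", None),
])

def within_two_tellings(person):
    """Listeners reached from person in one telling, or in two via a self-join."""
    rows = conn.execute("""
        SELECT k1.listener
          FROM Knows AS k1
         WHERE k1.teller = :p AND k1.listener IS NOT NULL
        UNION
        SELECT k2.listener
          FROM Knows AS k1
          JOIN Knows AS k2 ON k2.teller = k1.listener
         WHERE k1.teller = :p AND k2.listener IS NOT NULL
    """, {"p": person}).fetchall()
    return sorted(name for (name,) in rows)

print(within_two_tellings("Jim"))   # []
print(within_two_tellings("Mary"))  # ['Frank', 'Jim', 'Pete']
```

An index on the teller column would let each step of the join be resolved by lookup rather than by scanning the table.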

To uncover all of the people whom Pete caused to know the gossip, either directly or indirectly, perform a recursive query, which SQL-99 supports through the WITH RECURSIVE construct.
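A sketch of such a recursive query follows, run through SQLite from Python as an assumption of convenience (SQLite accepts WITH RECURSIVE; other databases may differ in details). The caused_by helper and the caused table name inside the query are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Knows (teller TEXT, listener TEXT)")
conn.executemany("INSERT INTO Knows VALUES (?, ?)", [
    ("Joe", "Mary"), ("Mary", "Pete"), ("Pete", "Jim"),
    ("Pete", "Frank"), ("Jim", None), ("Frank", None),
])

def caused_by(person):
    """Everyone who knows the gossip, directly or indirectly, because of person."""
    rows = conn.execute("""
        WITH RECURSIVE caused(person) AS (
            SELECT listener FROM Knows
             WHERE teller = ? AND listener IS NOT NULL
            UNION
            SELECT k.listener
              FROM Knows AS k
              JOIN caused AS c ON k.teller = c.person
             WHERE k.listener IS NOT NULL
        )
        SELECT person FROM caused ORDER BY person
    """, (person,)).fetchall()
    return [name for (name,) in rows]

print(caused_by("Pete"))  # ['Frank', 'Jim']
print(caused_by("Joe"))   # ['Frank', 'Jim', 'Mary', 'Pete']
```

The first SELECT seeds the result with direct listeners, and the recursive SELECT keeps following teller-to-listener rows until no new names appear.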

Other kinds of causal relationships can be computed in a similar way. Thus, causality can be represented and manipulated in relational databases.
