Hans Gilde’s weblog

Business value for Detection Oriented Event Processing (part 2)

Posted in eventprocessing by Hans on December 23, 2009

I recently began a line of thinking about breaking Event Processing into two categories: Detection Oriented and Operations Oriented. In my last post (part 1), I began to talk about the business value of Detection Oriented Event Processing. Here I will continue that topic.

This post should provide ideas for researching uses for Event Processing. I am currently discussing the data analysis side of Event Processing, so all the ideas here involve data analysis. In other words, the Event Processing referred to in this post is treated as a tool for data analysis.

There are many complementary technologies for Detection Oriented Event Processing, and I will get into the specific technologies in a future post. For now, I am focussed on business value and I will only scratch the surface of how Event Processing can help. That may be frustrating for some readers, but it follows a certain pattern: 1) identify value and then 2) find the technical means to deliver it. We are currently on step (1).

In fact, a lot of this stuff can (and often should) be done without using Event Processing. And just because a scenario falls into one of these categories does not mean that Event Processing can definitely help. There is no magic.

Ideas for uses of Detection Oriented Event Processing:

Lower the latency of existing data analysis

Lowering the latency of existing data analysis just means getting data where it is needed, faster. But as I mentioned in part 1 of this post, you only get value if the business can adjust to use the data at the new speed. Here are some ideas for how Event Processing can help lower latency of existing data analysis.

Rather than processing data in a batch, collect and process it incrementally as it is generated. By processing the data incrementally, the results are available immediately after the last of the data is available. Contrast this to batch-based processing, where the analysis only starts when the last of the data is available.

Deliver periodic, incremental results for existing data analysis. For example, some analysis may currently take into account one entire day’s worth of data; it might deliver more value by producing results throughout the day showing the analysis up until that point (including a result showing the analysis over the whole day).

Apply existing analysis over shorter periods of time. The same Performance Indicator that applied to daily operations might also apply to hourly operations.

Automate analysis that is conducted by people. Although often a challenge, it is increasingly possible and valuable to automate on-the-fly analysis that is currently done by people. Think about this: not too long ago, loan applications were approved by hand. There was no such thing as instant credit, a major driver of revenue for lenders today.
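To make the first three ideas concrete, here is a minimal Python sketch of incremental processing. It is only an illustration under assumptions of my own: the events are sale amounts with timestamps, and the class name IncrementalSales and its methods are hypothetical, not part of any Event Processing product. Each event is folded into running totals as it arrives, so an hourly figure and a day-to-date figure are available at any moment, and the full-day result is ready as soon as the last event is in.

```python
from collections import defaultdict
from datetime import datetime


class IncrementalSales:
    """Keeps running totals so results exist before the day's data is complete."""

    def __init__(self):
        self.hourly = defaultdict(float)  # hour of day -> total for that hour
        self.day_total = 0.0              # running total for the whole day

    def on_event(self, timestamp, amount):
        """Fold one sale event into the totals as soon as it arrives."""
        self.hourly[timestamp.hour] += amount
        self.day_total += amount

    def snapshot(self):
        """Return the analysis up until this point without reprocessing anything."""
        return {"day_total": self.day_total, "hourly": dict(self.hourly)}


# Hypothetical sample events: (timestamp, sale amount)
events = [
    (datetime(2009, 12, 23, 9, 5), 20.0),
    (datetime(2009, 12, 23, 9, 40), 5.0),
    (datetime(2009, 12, 23, 10, 15), 12.5),
]

analysis = IncrementalSales()
for ts, amount in events:
    analysis.on_event(ts, amount)

print(analysis.snapshot())
```

A batch job would compute the same totals, but only after the final event had been collected. Here the cost of each event is paid when it arrives, so a snapshot is cheap whenever it is requested.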

Develop new data analysis that requires lower latency

This area can be particularly interesting. Think of analysis that works much better when acted upon quickly. For example, predicting the immediate future based on very recent history (the past hour, minute, second or less) can be very accurate. But that accuracy degrades very quickly with time; a prediction (or optimization) based on the previous thirty minutes may be highly accurate for the next thirty minutes but almost useless a day later.

Find decisions or operational parameters that deliver value when they are adjusted quickly, then trace back their data requirements. If the data can be provided faster, perhaps adjustments can be made faster or can be allowed to take into account more data. The actual process of fast decision-making falls into Operations Oriented Event Processing, which I intend to cover in the future.

For example: the delivery of materials to a factory may be optimized by quickly adding or cancelling materials shipments from a storage facility, depending on the recent factory production level and materials requirements.

Increase the throughput of existing data analysis

Find and address bottlenecks in existing data processing. There are no miracles in Event Processing, but two common bottlenecks are disk storage and memory. Event Processing techniques are usually meant to process lots of data in limited memory and with little disk storage. Once these bottlenecks are eliminated, it may be possible to push through more analysis in the same amount of time.

For example: processing of high volume logs. There are many technologies that can help with this task, including map/reduce and other distributed computing. But those solutions can be (though are not necessarily) even more disk intensive than just loading everything into a database. By treating each log entry as an event, it may be possible to leverage inherently resource constrained Event Processing techniques to process more log data at once.
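As a rough sketch of treating each log entry as an event, the Python below consumes lines one at a time and keeps only a small counter in memory. The "HOST LEVEL message" log format and the function names are assumptions made up for this illustration. Memory grows with the number of distinct hosts, not with the size of the log, and nothing is written to disk.

```python
from collections import Counter


def parse(line):
    """Split a hypothetical 'HOST LEVEL message' log line into (host, level)."""
    host, level, _message = line.split(" ", 2)
    return host, level


def count_errors(lines):
    """Consume log lines one at a time and count ERROR entries per host."""
    errors = Counter()
    for line in lines:
        host, level = parse(line)
        if level == "ERROR":
            errors[host] += 1
    return errors


# Any iterable of lines works: a list, an open file, or a network stream.
sample = [
    "web1 INFO request served",
    "web2 ERROR upstream timeout",
    "web1 ERROR disk full",
    "web2 ERROR upstream timeout",
]

print(count_errors(sample))
```

Because count_errors accepts any iterable, the same function could be pointed at an open file object, which Python reads lazily line by line.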

Develop new analysis that requires high throughput

Sometimes there is just too much data to analyze; it would be nice to comb through all of it, but the return just doesn’t justify the cost. Again, without expecting miracles, Event Processing may help process data in a resource-efficient way. With lower resource usage, you may be able to finally analyze all that data that was otherwise going to waste.

For example: It might be nice to install RFID sensors throughout a store to record product movement. But the amount of data generated by a busy store would mean prohibitive storage costs. By treating RFID readers as event sources, it may be possible to pre-process all that data before it ever gets to storage, so that the summary data that is eventually stored remains at a manageable level.
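One way to picture that pre-processing, as a sketch only: assume each RFID read is a (tag, zone) pair and that readers report the same tag repeatedly while it sits in one place. The Python below (the function name and the sample data are hypothetical) forwards a record only when a tag appears in a new zone, so repeated reads never reach storage.

```python
def movements(reads):
    """Yield (tag, from_zone, to_zone) only when a tag shows up in a new zone.

    `reads` is an iterable of (tag, zone) pairs in arrival order.
    """
    last_zone = {}  # tag -> zone where it was last seen
    for tag, zone in reads:
        previous = last_zone.get(tag)
        if previous != zone:
            last_zone[tag] = zone
            yield tag, previous, zone


# Hypothetical raw reads from the store's RFID readers
reads = [
    ("shoe-17", "stockroom"),
    ("shoe-17", "stockroom"),  # repeated read, carries no new information
    ("shoe-17", "aisle-4"),
    ("hat-02", "aisle-1"),
    ("shoe-17", "aisle-4"),
    ("shoe-17", "checkout"),
]

stored = list(movements(reads))
print(len(reads), "raw reads ->", len(stored), "stored movements")
```

The only state kept is the last known zone per tag, so memory depends on the number of products in the store and not on how often the readers fire.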

Business value for Detection Oriented Event Processing

Posted in eventprocessing by Hans on December 23, 2009

Following up on my last post about the two broad categories of Event Processing, this post continues that train of thought by analyzing one of these categories from the perspective of business value.

Today I’m writing about Detection Oriented Event Processing. Per my previous post: “Detection Oriented Event Processing applications want to locate interesting information in a flow of data.” In this case, we use “detection” very broadly to mean data analysis that produces results of interest to the enterprise. That could be a simple report destined for a human, a set of values used in an optimization problem or a complex statistical method. I might as well say Data Analysis Oriented Event Processing.

The term Detection Oriented is in contrast to applications that make some decision or take some action based on incoming events. I’ll call those Operations Oriented applications, and cover them in a later post (and also cover how they integrate with Detection Oriented applications). Of course there are many intersections of Detection and Operations Oriented Event Processing, but they each deserve some separate analysis first.

The motivation for using Detection Oriented Event Processing is to increase the “speed” of data analysis. This runs contrary to views of Event Processing that focus on improving detection or prediction capabilities. Unfortunately Event Processing can’t directly help improve the accuracy of detection or prediction (only better detection or prediction methods can do that), but it can indirectly help by easing the burden of implementation.

In general, there are two types of possible speed increase: lower latency and increased throughput.

Lower latency means that some data analysis once took X amount of time, and now it takes less time. For example, if a report summarizing daily sales figures used to be available the next morning, and now it is available at the close of the same business day – the latency of this report has been reduced. In other words, the amount of time taken to produce the analysis (measured from when all data is available to when the analysis is complete) has been reduced.

Increased throughput means that more analysis is done in the same amount of time. For example, if I were once able to analyze 30 stocks simultaneously for correlation and now I can run the same analysis on 300 stocks simultaneously – the throughput of my analysis has been increased. Each individual analysis (the analysis of one stock) may take the same amount of time, so increased throughput may allow a large task to complete faster, but does not necessarily correspond to lower latency of each individual component task.

Detection Oriented Event Processing, then, is a potential solution for problems where we need to either decrease latency or increase throughput of data analysis. Again, Event Processing is not the only method of achieving these goals; it is one method in the toolbox.

Before getting started on generating business value, we need to have one thing clear:

Since we are increasing the speed of data analysis, value only comes when the enterprise can react at the new speed. It doesn’t matter how fast we are able to produce results, if the enterprise can’t react fast enough to use them.

So we have three scenarios:

The enterprise can already make use of faster data analysis. In this case, there is work that could produce more value if only some kind of data analysis were faster. The location and nature of the bottleneck may even be common knowledge. A speed increase will bring immediate business value.

For example:

An internal system (human or computer) requires data to optimize its decision-making, and it would be better to get that data sooner rather than later.

Certain data is distributed to customers, who would rather have it faster. While the enterprise may or may not have its own use for faster data analysis, the customers certainly do and there is value in happier customers. Logistics systems come to mind here.

Build faster data analysis, allow the rest of the enterprise to catch up. It is possible that, were faster data analysis available, others in the enterprise would quickly adapt (and in adapting, increase the value of their activities). In this case, the speed of existing data analysis has set the pace for downstream activities (which depend on the data). Those activities would adapt to a faster pace or larger volume of data, were it available.

For example: Electronic trading is often a game of speed. When data is available faster, downstream systems quickly adapt to make use of it.

Build faster data analysis and faster reaction time, at once. In this case, there is some kind of reaction that should happen quickly. It may be something that the enterprise already reacts to, but would rather react faster. Or it may be some new stimulus, where there is only value in reacting quickly and little or none in reacting slowly. In either case, we must increase both the speed of data analysis and the speed of reaction to see any benefit.

This scenario may be a candidate for a mixture of Detection Oriented and Operations Oriented Event Processing.

The two types of Event Processing

Posted in eventprocessing by Hans on December 18, 2009

I was thinking this morning about two types of Event Processing: “Detection Oriented” and “Operations Oriented”. It looks like these are the main categories of Event Processing, both in terms of business value and requirements for a COTS product. Today the Event Processing Technical Society does not make a big deal out of this distinction, but I think that will change.

Detection Oriented Event Processing applications want to locate interesting information in a flow of data. Operations Oriented Event Processing applications want to take an action (which involves making decisions) based on incoming events.

Detection Oriented Event Processing is the Event Processing equivalent of data analysis. It is driven by a need to detect faster than was previously possible, and it is important to understand that Event Processing adoption is not driven by the need to detect more accurately. More accurate detection requires better detection methods. Detection methods are not part of Event Processing, although Event Processing may make it easier to implement certain methods.

Detection Oriented Event Processing suffers from a significant disadvantage compared to traditional data analysis: the data to be analyzed is not all available at once. Rather, the data arrives incrementally as events.

Detection Oriented applications are developed by data analysts. They require a good understanding of what is being detected and how the events contribute data. They may benefit from statistics, data mining or machine learning.  Their challenge is to balance the goal of detecting with their performance requirements.  They are tested by running various real-life or simulated data scenarios through the Event Processing Network.

Operations Oriented Event Processing is driven by a need to react faster (with lower latency or higher throughput) than was previously possible. Again we must distinguish faster from “better”. Better reaction requires better decision-making. And better decision-making is not a part of Event Processing, although Event Processing may help make decisions with lower latency or higher throughput.

Operations Oriented Event Processing applications are developed by logic coders. They require a good understanding of how the business should react and what the events mean to the operation of the business. They may benefit from decision management, decision-making under uncertainty or other areas of operations research. Their challenge is to balance the goal of deciding with their performance requirements.  They are tested by running specific sequences of events through the Event Processing Network.

The two types of event processing put different requirements on Event Processing software. For example:

Detection Oriented applications will change as the data changes, as the detection methods change or as the detection goals change. It is common for these applications to regularly add more detection logic, while keeping the old logic.

Operations Oriented applications will change as the business optimizes or changes its decision-making strategies. It is more common for these applications to update their decision-making logic and get rid of the old logic.

Of course, every Detection Oriented application will do something with its detection results. And every Operations Oriented application will involve some amount of detection. But the focus of the application remains on one or the other goal.

There are applications that require both Detection Oriented and Operations Oriented Event Processing. These are applications with both complex detection and complex reaction requirements. In my experience, it is common for these applications to be cleanly divided between the two goals. This division is often so complete that the two goals are implemented essentially by separate systems and often by separate teams. And when this division is not so complete, everyone usually wishes that it were.

So if these types of Event Processing are separate in terms of their value and their implementation, maybe they should each use separate Event Processing software?

Cloud computing project applicable to event processing

Posted in eventprocessing, programming, streaming SQL by Hans on December 17, 2009

The UC Berkeley cloud computing project BOOM might have relevance to event processing. Their goal is to make cloud based data processing easier, and they’ve put together a language called Bloom with the goal of providing a declarative language for cloud data processing. And this language looks to be an event processing language, although it is missing some common event processing idioms.

Event Processing in Action review, part 2

Posted in eventprocessing, financial services, programming by Hans on December 8, 2009

Per my previous post, I’m currently reading a preview copy of Event Processing in Action (EPIA) from Manning. I’ll write a short summary mixed with review, composed of several posts.

Unfortunately, the name of Manning’s “In Action” series was ruined for me by a book from another publisher; I just can’t read In Action without thinking Inaction. I will try to put this prejudice aside.

I only have time to go into Chapter 1 right now. I doubt that I’ll have time for one post per chapter, but this one turned out longer than I thought.

The book starts slow, because the authors are very thorough about capturing the basics. Chapter 1 is mostly an introduction to terminology. It also goes into many examples of Event Processing in use today. Finally, it introduces an example Event Processing application (a flower delivery service) that will be used throughout the book.

As I’m already familiar with event driven systems, some of the pages on terminology were a little boring. At the same time, the level of detail and the real world examples are great for a less experienced reader. This book has an academic flavor, and I’m the type to read through the boring introductory chapter of a textbook before getting started on the subject. That’s because I do learn a little (maybe more than I realize at the time) and it gets me on the same page (so to speak) as the author for the rest of the book.

So from Chapter 1, I see that Event Processing in Action is about events and the processing thereof. These “events” are the same ones that drive an Event Driven Architecture (EDA). The point of the book is to look beyond the basic pattern of EDA logic that says “send an event and interested parties will eventually receive it”. Beyond this simple pattern lie many patterns of processing logic that are common to most or all event driven systems.

For example (and I’m extrapolating a little here), as an Event Driven Architecture implementation grows and matures, there’s a natural desire to extract more and more information and value from the event flow. Where each event type may have started with just one interested consumer, others find uses for existing events.  Architects start devising applications that combine and rehash events in new ways. Often an EDA is adopted with grand visions of squeezing ever increasing value from events.

Over time, we see common patterns in the logic used to extract information from events in the EDA. We can use those patterns to design the logic at a higher level than code and code-level design patterns. We can use our logic designs to compare the goals and the logic of our EDA to another EDA, and to learn lessons and develop best practices for exactly how we will extract more value from events. Those logic patterns and their use are the core of … Event Processing.

One might not be interested in an EDA per se, but still want to extract information and value from events. The most common example seems to be automatic (electronic) trading (other examples are listed in Chapter 1). Trading systems get events (mostly feeds of quotes, news and order executions) and extract information from them. While the media seems to focus on the mathematics, the fact is that most of the logic of trading is exactly the same as logic that extracts information from events in any EDA. The former deals with feeds and trading connectivity, the latter with message buses and shared event definitions. Trading may require maximum performance, an EDA may require guaranteed delivery. But mostly they share patterns in the logic, because they are both doing Event Processing.

First thoughts on Event Processing In Action

Posted in eventprocessing by Hans on December 2, 2009

I’m in the process of reading the preview version of a new book to be published by Manning: Event Processing In Action by Opher Etzion and Peter Niblett. I agreed to read the book and post a review, and my intention is to post several small reviews as I read through it.

One of the authors (Opher) maintains a blog called Event Processing Thinking, which has many posts about the book and a lot of good content as well.

I am particularly interested in this book because my first impression of Event Processing (as described by Opher on his blog) was that it may not be particularly useful. He began his blog in the same way that he begins the book, with very simple ideas. I would sometimes read his early blog posts and think that he was providing information so basic that it might not be worth writing about. At that time, I also did not see his ideas fitting together into something usable or helpful.

Over time, though, I kept reading his blog. And now I see that not only do the ideas fit together, but that they are quite obviously useful. I’m surprised, in fact, that no one has put this stuff forward before. I don’t know exactly when I had the ah-ha moment about Event Processing, but I’m excited to read the book.

Event Processing In Action (EPIA for short) is about software that processes events, and will be familiar territory to those who have dealt with Event Driven Architecture. Most technical books about this topic cover how to process events. They are about algorithms, coding, software concepts like pub/sub and queue, or hardware architecture to use when processing events.

EPIA adds another dimension to events: the logic that they pass through. Not algorithms, but vocabulary and patterns that can be used to specify and discuss the logic that events flow through. This book is about what happens when processing events, not how it happens.

It seems only natural to separate the specification of event processing logic from the implementation. This book is about specifying the logic, and leaves the implementation details to other books. In retrospect this seems like an obvious idea, but I’ve not seen any other book filling this need. I might put EPIA in a category closer to books on the Unified Modelling Language than to books on programming or systems architecture.
