Following up on several comments from this thread in the CEP Interest yahoo group: http://tech.groups.yahoo.com/group/CEP-Interest/message/1469 and some of the replies to it.

Rather than think of use cases as measuring performance *or* measuring engine capability, I want to propose an alternative way of looking at the situation. A CEP engine is essentially a set of message processing algorithms, driven by some configuration language, hooked up to threading and the network. The requirements of these engines are, therefore, as varied as the requirements of message processing algorithms. While David Luckham’s research was in finding a very expressive event processing language that can be used for a broad problem set, products today are still not so expressive. Yet they are used every day, and it is those less expressive products that are predicted to be a billion dollar market by the year whatever.

Some people in this community seem only interested in solving the most complex problems. This is certainly a good area of research, but it is just as valid to see what can be done with existing solutions and to find those things that can make existing solutions better. This is more of an incremental approach, and while I certainly do not rule out an innovation that dramatically improves the field, I find that more often these innovations come over time as part of an incremental process. In any case, with an incremental process at least one can get their head around the next step, which helps in building an understanding of the field. Taking too big a bite at one time, as often as not, leads to more confusion than progress. The important thing is to stay organized.

So, how does this relate to event processing use cases?

When you’re looking to choose software components to meet your needs, how do you go about it? If you’re buying the components, vendors would often like you to think of their product as being automatically useful for your particular problem type, as in “my product is meant to solve just that problem.” But if you really want to make an informed choice, you will rely on the product’s demonstrated capabilities to solve your particular problems. Most of the time, you will find that many vendors can do it, but they each have features that bear consideration.

So, if your problem domain is better detection of network intrusion events, then you will be interested in testing these capabilities. In this case, you will be looking not only at the vendors on the CEP Interest list, but also at vendors that make specialized probability-based correlation and detection software. Since the problem domain is so complex, you will have very complicated testing criteria and will probably end up combining a variety of solutions.

But if your problem is *not* that complex, then you will not necessarily care about the ability of a probability-based correlation engine to filter signal from noise. Instead, you will care about how the features of each engine impact *your* project.

In other words, it is not true that simple use cases are not useful. The fact is that if the use case properly covers a problem that many people have, then it is useful. If your road is linear, then the Linear Road benchmark is quite useful. When someone uses a use case to evaluate engines for their problem, it is up to *them* to decide how important performance is. A use case and the implementation of a solution using various engines simply serves to highlight the features of each engine, one of which is performance.

Of course, vendors will immediately jump on any use case, saying “our product executes this use case a million times faster than the next best guy.” There are two points about this behavior. First, you cannot entirely stop it. Vendors have sales and marketing people who will spin every single possible result in their favor, no matter what the use case is and no matter what you do. Note that other vendors will immediately jump to the other position and show how the features of their product are important and overshadow the raw performance. Which brings me to the second point: if the product can accomplish the use case, then they should have a right to brag about their feature set.

It is most important to design the use case to properly meet the problem domain. For example, last year I published this use case.

It is a very simple use case, initially designed to measure both accuracy and performance on a small problem. It is not very complicated and it can certainly use improvement. The use case tests the ability of an engine to correlate messages coming from various parts of a distributed system into whole transactions, and then to measure certain things about those transactions.
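To make the shape of that problem concrete, here is a minimal sketch in Python of correlating events into transactions and measuring one thing about them (latency). It is not the published use case or any vendor's implementation. The class names, the stage names, the `(txn_id, stage, timestamp)` event shape, and the fixed timeout are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Transaction:
    """Events seen so far for one transaction id (hypothetical structure)."""
    first_seen: float
    stages: dict = field(default_factory=dict)  # stage name -> timestamp


class Correlator:
    """Groups events by transaction id and records completed transactions."""

    def __init__(self, required_stages, timeout_seconds):
        self.required = set(required_stages)
        self.timeout = timeout_seconds  # assumed: incomplete transactions are dropped after this
        self.open = {}        # transaction id -> Transaction
        self.completed = []   # (transaction id, latency in seconds)
        self.expired = []     # ids that timed out before every stage arrived

    def on_event(self, txn_id, stage, timestamp):
        # Drop stale transactions first, using the event time as the clock.
        self._expire(timestamp)
        txn = self.open.setdefault(txn_id, Transaction(first_seen=timestamp))
        txn.stages[stage] = timestamp
        # A transaction is whole once every required stage has been seen.
        if self.required.issubset(txn.stages):
            latency = max(txn.stages.values()) - min(txn.stages.values())
            self.completed.append((txn_id, latency))
            del self.open[txn_id]

    def _expire(self, now):
        stale = [t for t, txn in self.open.items()
                 if now - txn.first_seen > self.timeout]
        for t in stale:
            self.expired.append(t)
            del self.open[t]


# Hypothetical event stream: (transaction id, stage, timestamp in seconds).
correlator = Correlator(required_stages=["order", "fill", "ack"], timeout_seconds=5.0)
for event in [("t1", "order", 0.0), ("t2", "order", 0.5),
              ("t1", "fill", 1.0), ("t1", "ack", 1.5), ("t3", "order", 7.0)]:
    correlator.on_event(*event)

print(correlator.completed)  # transactions that became whole, with latency
print(correlator.expired)    # transactions dropped as incomplete
```

Even a toy like this shows where engines can differ: how they express the join across stages, how they handle expiry, and how fast they do both.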

Some people would, I think, classify this use case as being too simple. The initial problem that many people spot is that if an event does not come in within a few seconds, it will never arrive. People would say things like “that never happens in real life” or some such. Of course, it certainly does happen and it is exactly what happens in my problem domain. The use case was also criticized for being too focussed on performance. “Most people don’t need that kind of performance” was the critique here. My response has always been that many people really do need that performance, so let’s come up with an alternate use case that covers the need for events to arrive an hour or a day later, but let’s not claim that my use case is invalid. It’s not invalid, and I know that because I got several vendors to post solutions and dozens of emails from people thanking me for putting it up and asking me to help them come up with ways to test their particular problems. Unfortunately, I’ve never had the time to follow up on this and do more use cases.

From implementing this use case (well, a slightly more sophisticated version of it) on many vendor platforms, I learned a lot about the features of each platform. Some platforms have performance benefits, while others have different, very nice features that may be needed for certain problem sets. The overall experience with using the use case, however, was very enlightening.

Take the example of an RSS reader that allows for sophisticated filtering before delivering the feeds to a client. Google would certainly care about the performance of such a solution, because for them the better performing solution has the potential to save millions of dollars in hardware. I cared about performance in my use case for just that reason.

In summary, when coming up with use cases, the important thing is to avoid tailoring the use case to be applicable to a particular type of engine. Instead, tailor the use case to accurately reflect the problem and allow each solution to show all of the benefits and features that it can. You may find that some solutions implement the use case as you wrote it, but fail to implement certain crucial features that you forgot to test in the use case. Great! Create another, more sophisticated use case! Through incremental modification and improvement, a use case can start small and, before you know it, turn into something very useful.

