The news is full of “liquidity problems” these days, and here’s another reminder of how dependent our system is on maintaining liquidity:
A few market makers got confused about short selling rules on Friday, September 19, 2008, and withheld their liquidity from the market. In an order-driven market, when there are lots of orders on one side of the market (buy or sell) and few orders on the other side, the price is supposed to change.
Now what happens when this trading is all electronic and there are no sanity checks on the price of shares? Yup, the price swings can be pretty big.
Take a look at the swing in the price of CMCSA on Friday. Yup, that’s right: some poor soul entered a market sell and got only about half of what they were expecting, only to see the price go right back to normal levels. Was this huge price swing based on news? Nope, it was based on the fact that, for a short time, there were far too few buy orders to match against the sell orders.
Actually, the price swings were much worse than 50%. The swings were so bad that brokers all agreed to cancel a bunch of trades. The WSJ says that these trades were cancelled because of a “trading glitch”, but what was the glitch? The “glitch” was that the firms that usually take the other side of a trade when no one else will, which serve to stabilize the price of stocks… didn’t. And without this stabilization, prices went crazy.
Now, for all the event processing folks – how do we detect this kind of situation? After all, when news hits the market, share prices are allowed to swing wildly. But what about when big market participants accidentally stop providing liquidity? How do the computers know when price swings are ok and when they are not?
A recent discussion with a member of the CEP community reminded me to post an initial impression of the new IBM CEP product, WebSphere Business Events. I have not used the product, so please do take my opinion with a few grains of salt; I’m basing this post on web research into the general direction of WS-BE.
IBM seems to be taking the scalability and performance approach with this initial WS-BE release. They have put together a package that seems to me to focus mostly on “extending transactional integrity to low-latency environments” and on enabling large-scale deployment. And this is a good idea – it fills a hole in the software market. If this product can lower the barrier to high performance, or replace custom low-latency transactional code with a software framework, it may be a step forward.
So what else might they have focused on? Making it easier to develop large and complicated rule sets for processing events. This is a big deal, since event processing applications often evolve into large rule sets. Think about trading and order management systems, real-time monitoring systems and the like. They start out simple and often evolve into surprisingly complex, tough-to-maintain logic.
So while scalability and low-latency transactional messaging are very nice, they are really only useful if one can implement and maintain a real-life set of processing rules. There is always the risk of developing a large-scale, low-latency application only to find that it will be very expensive to maintain and extend over time.
Based on past experience with the authoring model that WS-BE uses, I am skeptical about whether WS-BE will be able to manage large rule sets and express complex rules in a clear and understandable way. Hopefully I am wrong, and if so then I offer an apology in advance to the folks who have put in lots of hard work and money to produce the current generation of WS-BE.
IBM is in a unique position to combine their existing CEP research with both AptSoft and ILOG expertise into a product that provides an innovative system for creating and maintaining event processing logic. IBM takes the route of evolving their products over time, so I am looking forward to some great innovations in future versions of this product.
I check my blog stats every so often and the other day I remembered that I hadn’t looked in a couple of months. My traffic has picked up quite a bit, and a lot of it looks to come from search engines. I hope that this means I’m providing at least a little useful information!
There is a section of the WordPress traffic report that shows search terms that were used to locate my blog. After a brief scan of these terms, I was surprised by two things:
One – People search for my name more than I would imagine. Almost every day, apparently. Back in the early days, I was one of three Hans Gildes with search results (and I’m related to the other two). Are more Hans Gildes getting on the net nowadays? And will this post, containing many references to Hans Gilde on a blog by Hans Gilde, now become the most relevant web page on Hans Gilde?
Two – Perhaps more relevant, I noticed an interesting shift in the search terms containing vendor names. Last time I looked, I saw lots of searches for particular CEP vendor names and often they were paired with “CEP” or similar words. But for the past few weeks (WP does not make it easy to find historical information) I see that over half the time, vendor names are paired with words like “problems” or “limitations”.
So watch out vendors, looks like the customers are learning to do their homework.
The event processing community has watched with interest as Marco Seiriö tries his hand at event processing with the SaaS approach. Many people object to this approach, as he describes in this post on his blog. Current users of event processing products often have very low latency or high volume requirements, and it’s probably hard to convince them that EP can really work on someone else’s network.
But I see potential for Marco’s model. For example, RuleCore could be deployed as a service in the Amazon cloud to handle messages from SQS. In this case, the customer is basically getting a managed server into which they can deploy event processing logic that takes advantage of the scalability of SQS and other AWS components. From here, a logical step is to turn EP into a cloud service along the lines of the Google App Engine. When I think about it this way, it doesn’t sound so crazy, although there is the question of how far ahead of its time this kind of thinking is.
So good luck Marco, not everyone thinks you’re on the wrong track.
I was interested to read Paul Vincent’s analysis of the release of TIBCO BusinessEvents 3.0. Based on all of the recent press (and press-releases-disguised-as-reporting) about this release, TIBCO is definitely putting more marketing muscle into this product.
In his post, Paul talks about BE as a platform that combines many techniques for processing events (inference rules, state management and continuous queries). He also discusses both distributed processing and decision management as features of a CEP product. And TIBCO seems to have a continued focus on simplified modelling of event processing, for use by less technical users. I have not looked at BE in a very long time, so I could not say how well all of it works. But you have to give them credit for the completeness of their vision.
Now we see direct competition between IBM and TIBCO in event processing. Their most recent or upcoming EP products seem to focus on similar features, but they will take different approaches. On one hand, we have TIBCO BusinessEvents, which has evolved in large part around the needs of a (relatively) big customer base. On the other hand, we have IBM, which has put some of their significant research muscle into this area and is looking to jump ahead with some upcoming product releases.
Looks like the big boys are finally getting serious about event processing.
For those who are interested in streaming SQL or use cases for Event Processing, I recommend taking a look at the site of the vendor SQLstream. They’re a streaming SQL vendor along the lines of StreamBase, Coral8 and Aleri (their focus seems closest to StreamBase and Coral8). The part that caught my attention is the list of use cases on their site. They’ve got lots of ideas for using streaming SQL that go beyond the typical point solutions for capital markets and into enterprise infrastructure.
Also, Julian Hyde is part of SQLstream, and his blog has some very fun and interesting reading on SQL and streaming SQL. His archives from the past few months hold more good material on the topic.
Colin Clark recently posted some coverage of the Gartner EP Summit, but his Day 1 post veers from covering the summit into a great point about EP products – the difference between a bunch of point solutions and a platform.
I have noticed the same thing that Colin has – that EP vendors haven’t had much luck selling their products as something on which multiple EP projects can standardize. I have some ideas about this, and maybe a slightly different perspective from Colin’s (though I don’t have time to write about it now). However, the EP community will benefit greatly from more of this type of practical insight, so I am looking forward to hearing what Colin has to say.
There has been some interesting talk lately on EP blogs about Smart Order Routing (SOR) and CEP products. There are posts from Aleri and StreamBase about how SOR is more than just simple decision making. This debate has happened before, since SOR is one of the areas where CEP products have been used for a long time.
Modern SOR is not just “routing”. SOR is the process by which an order is filled at the best price available right now. This contrasts with algorithmic trading, which is the process by which an order gets filled at the best price available over a longer period of time. SOR basically looks for market depth, meaning available liquidity at a given price range. It might find all the depth it needs in one place, or it might want to allocate an order over several markets at a given price range. So even basic SOR logic can be somewhat complicated, becoming what was once considered “algorithmic trading.”
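To make the depth-splitting idea concrete, here is a minimal sketch in R. The venues, prices and sizes are made up for illustration, and a real SOR would also weigh fees, latency and hidden liquidity, none of which this toy considers:

# hypothetical snapshot of advertised liquidity
book <- data.frame(
  venue = c("A", "B", "C"),
  price = c(10.01, 10.01, 10.02),  # best offer at each venue
  size  = c(300, 500, 1000)        # advertised depth at that price
)

# naive depth-based split: walk the venues from the best price up to
# the limit, taking what each advertises until the order is filled
route <- function(book, qty, limit) {
  eligible <- book[book$price <= limit, ]
  eligible <- eligible[order(eligible$price), ]
  left <- pmax(qty - c(0, head(cumsum(eligible$size), -1)), 0)
  cbind(eligible, take = pmin(eligible$size, left))
}

route(book, qty = 1000, limit = 10.02)
#   venue price size take
# 1     A 10.01  300  300
# 2     B 10.01  500  500
# 3     C 10.02 1000  200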
But SOR is a deeper topic even than just splitting orders based on advertised liquidity. Coincidentally, it is also the subject of some of my recent research. Predictive analytics have already begun to creep into SOR in some shops (where the differences between SOR and algorithmic trading have also begun to blur). The trading arms race being what it is, I expect everyone to follow suit within a few years. Many others share this view.
So now everyone has begun thinking about predictive analytics in SOR. But there’s another dimension – attacks on SOR algorithms. Just like algorithmic trading shops attack each other’s algorithms directly, the same potential exists to attack SORs. And the more advanced the SOR gets, the more opportunity there is to attack it. It’s a fun topic.
I see that Mark at StreamBase has noticed some differences between IP routing and smart order routing. I have noticed these differences as well. In the comments on Mark’s post, Opher mentions that some people will want SOR to use probability before they will consider it to be CEP. This is already under way and will probably increase dramatically in the years to come.
Radford Neal points out a few problems in R in this post on his blog.
At first I thought that his approach would not be the best, but maybe I was wrong. So I read up on R internals, and here are my first thoughts on implementing fixes. The good news is that some of this can be done without changing the R internals, at least for now.
There are two problems:
First, in R the : operator has some annoying properties. One usually uses this operator to generate increasing sequences, where 1:3 becomes c(1,2,3). This is often used to index matrices, as in m[1:3,]. But : will happily generate decreasing sequences, so 1:-1 becomes c(1,0,-1). This causes indexing to fail in strange ways.
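A quick illustration of how this bites in practice:

v <- c(10, 20, 30)
1:3       # c(1, 2, 3), as intended
1:0       # c(1, 0): silently decreasing
v[1:0]    # returns just 10; the 0 index is quietly ignored, not an error
n <- 0
for (i in 1:n) print(i)  # runs twice (i = 1, then i = 0) instead of zero times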
Second, matrix indexing behavior is not consistent. If the indexing selects multiple rows and columns, then the result keeps the dimensions of a matrix. But if it selects a single row or column, that dimension is dropped and the result is returned as a vector with no dimensions. For example, m[1,] returns the first row of the matrix as a vector and not as a matrix.
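And a quick illustration of the second problem:

m <- matrix(1:6, nrow = 2)
m[1:2, ]              # still a 2x3 matrix: dimensions kept
m[1, ]                # a plain length-3 vector: dimensions silently dropped
m[1, , drop = FALSE]  # a 1x3 matrix: the workaround discussed below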
There are two basic ways to solve the first problem. The simple way would be to create two new operators, %>% and %<%, where a %>% b would produce only an increasing sequence, and either an empty vector or an error if a > b. Clearly, %<% would produce only decreasing sequences. This is easy.
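A minimal sketch of this simple version, choosing the empty vector over the error (the description above allows either):

`%>%` <- function(a, b) {
  # increasing sequences only; empty rather than wrapping around when a > b
  if (a > b) integer(0) else seq.int(a, b)
}

`%<%` <- function(a, b) {
  # decreasing sequences only; empty when a < b
  if (a < b) integer(0) else seq.int(a, b, by = -1)
}

v <- c(10, 20, 30)
v[1 %>% 3]  # c(10, 20, 30)
1 %>% 0     # integer(0) rather than c(1, 0)
v[1 %>% 0]  # an empty result instead of the 1:0 surprise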
Another possibility would be for %>% and %<% to produce a new class of objects that would be treated specially by index operations and loops. These objects would not be vectors, but would cause indexing and loops to act as though they were vectors. So 1%>%1000 would not be a vector with 1000 entries, but loops and indexing would treat it as such. This is harder, so I will skip it for the moment.
The second problem is a little more straightforward. To stop matrix indexing from returning a vector, one uses the drop=FALSE argument to the indexing operation. So here is a solution: put an attribute on the indexing argument that says “drop=FALSE”. Then update the [ and [<- indexing operations to look for this attribute on their arguments and automatically use the drop=FALSE option when it is set. This can be done without, for the moment, changing the internal [ and [<- operators.
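Here is one way that might look. The names drop.no and “no.drop” are mine, drop.no writes the tagged copy back into the caller’s variable so it can be called as a plain statement (matching the example below), and the mask over [ is demonstration-only: it covers the two-index matrix case and hands everything else back to the stock operator. [<- would get the same treatment, which I omit here.

drop.no <- function(x) {
  # tag an index vector so matrix indexing keeps dimensions;
  # assigns the tagged copy back into the caller's variable
  attr(x, "no.drop") <- TRUE
  assign(deparse(substitute(x)), x, envir = parent.frame())
  invisible(x)
}

# %>% can tag its result up front, as the example below assumes
`%>%` <- function(a, b) {
  r <- if (a > b) integer(0) else seq.int(a, b)
  attr(r, "no.drop") <- TRUE
  r
}

`[` <- function(x, i, j, ..., drop = TRUE) {
  # demonstration-only mask; a real fix would change the internal operator.
  # Anything that is not a plain matrix indexed with two slots is re-run
  # against the stock operator, rebuilt from the original call.
  if (!is.matrix(x) || nargs() != 3) {
    cl <- sys.call()
    cl[[1]] <- quote(base::`[`)
    return(eval.parent(cl))
  }
  if ((!missing(i) && isTRUE(attr(i, "no.drop"))) ||
      (!missing(j) && isTRUE(attr(j, "no.drop"))))
    drop <- FALSE
  if (missing(i)) return(base::`[`(x, , j, drop = drop))
  if (missing(j)) return(base::`[`(x, i, , drop = drop))
  base::`[`(x, i, j, drop = drop)
}

With these definitions loaded, the example that follows runs as written.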
So here is an example:
h = 1
m[h,]        # this returns one row from the matrix m, but that row is a vector
drop.no(h)   # this sets the drop=FALSE attribute on h
m[h,]        # this returns one row from m and that row is a matrix
h = 1 %>% 1  # %>% returns a vector that already has the drop=FALSE attribute set
m[h,]        # so this returns one row from m and that row is a matrix
Having some spare time, I will see if I can code this up in the next few days.
So there has been some interesting discussion lately, related to this announcement about a standard SQL approach to EP. Here are a few points about this topic, some of which are in response to recent blog posts.
The goal of this paper is to move toward an SQL language for EP where the same query with the same input will produce the same output, no matter which product is used. This mirrors the way that, excluding string matching and maybe a few other small differences, a SELECT statement works in fundamentally the same way on any database platform.
But note that even though the SELECT statement has the same semantics on Oracle and Sybase, these two databases have many fundamental differences in their feature set. So this paper is not exactly unifying the feature sets of the products. Each product will be free to implement its own feature set. But the basic SQL operators for EP will now have standard behavior.
I am curious now whether they will really invite other vendors to participate in this standard. We will know the full extent of their intentions by the amount of time it takes them to get their respective company names out of the language definition. You can invite competitors to unify around “An EP-SQL standard.” You can’t invite them to unify around “The EP-SQL standard by Oracle and StreamBase.” If Oracle and StreamBase really want to make this a global standard, they will refrain from implying that everyone who implements this standard is riding their coattails.
About an EP-SQL standard and an EP LINQ: these two things go hand in hand. LINQ came from the same set operations that are supported by SQL. So EP extensions to LINQ would naturally include the same kinds of operations supported by EP-SQL. An EP-SQL standard provides a stepping stone on which LINQ extensions can be defined.