Opposing views on streaming SQL
I just found this great comment from Julian Hyde in my WordPress spam folder. Sorry Julian, didn’t see it until today. It was in response to my previous post about making progress on streaming SQL. I am reprinting almost his whole comment:
I am chief architect of SQLstream and I do a bit of technical blogging at http://julianhyde.blogspot.com about what SQLstream is capable of. Not enough, I admit – sometimes it’s a choice between blogging and ‘real work’.
I want to improve the usability of streaming SQL languages, but I think that if we stray too far from relational semantics we will end up with something less declarative, more proprietary (and therefore more difficult to understand by the many folks who have a SQL background and would like to process data in flight), and less maintainable.
I actually read that particular Streambase post with some horror. The problems solved by that post are already solved, much better, by standard SQL and implemented in a few database systems. Streambase have introduced concepts similar to standard SQL concepts but have given them different, and misleading, names. Where they use CREATE SCHEMA, the rest of the world would use CREATE TYPE (standard SQL has a SCHEMA but it means something completely different). What they call TUPLE, standard SQL calls ROW. Wildcard attributes might be a quick win to deploy a project quickly, but you will end up with a project that is brittle: you can’t even add a column without the risk that it will be captured by a wildcard rule somewhere in your application.
I’m not one of those relational bigots who believe that we should remain faithful to every word E.F. Codd wrote in 1970. I believe that SQL systems have been effective because they have a small number of basic operations that can be combined in powerful ways, they allow structures and operations to be specified declaratively so that the system can optimize, because there are standards to allow SQL systems to interoperate, and because there are a lot of IT professionals who understand SQL deeply.
Those principles are as important, if not more so, for problems of streaming data. We may need to add one or two new operators, but the basic operations are applicable to streams and can achieve a lot of power. The SQL standard has some newer elements, such as moving totals, nested relations, XML support, user-defined transforms, and SQL/MED, that are perfect for streaming systems, but I have not seen any other streaming SQL vendors exploiting them. At SQLstream we have started with these fundamentals, then added a few key extensions for streaming data.
So this is a very interesting perspective, basically taking the opposite view to mine and saying that adding features just for the sake of having features is the wrong way to go. Also he comments on the apparent divergence of StreamBase’s naming conventions from SQL standards – something that I will not comment on other than to say that it would surprise me to find that they are not religiously following previous SQL conventions. Anyway, that is a separate point from whether they have made language “improvements” or are going down the wrong path.
Among many opinions on streaming SQL, Opher also frequently says that it is the wrong starting point for a generic event processing language. He sees SQL as being natural for expressing certain parts of event processing but not all. Just search his blog for “SQL”; there is a good amount of perspective there.
I will try to post some of my own thinking on this at some point in the future. But so far, I am still in favor of StreamBase’s new enhancements from an end user perspective.

This is an interesting “vs.” topic that I think will go on for a while: SQL vs. X, where X is a vendor’s custom event processing language.
I’m in the same camp as Opher. SQL is really not ideal for event processing. BUT, it is a good choice for processing streams of *DATA*. The SQLStream guys seem to understand this.
Herein lies the confusion. Data stream processing is labeled CEP and everyone is confused. Without going into the what-is-real-CEP debate, I think this is the source of confusion among vendors (mainly their marketing depts.; developers understand this) and customers.
I’m happy to see both data and event processing called CEP, but it will be confusing for a while before everyone starts to understand that there are two subsets of CEP products: one that SQL works fine for, and another that requires tools and languages designed from the ground up for event processing and not just extensions to data processing concepts.
Hi Marco,
I’m afraid that I don’t get the whole “this is really CEP and that is not”. No one really cares if some product is really CEP, but if someone were to look to verify your (or anyone else’s) claim about what CEP is, they would not be able to. If you look back at Luckham’s Rapide language or any other piece of the history of CEP or event processing in general, you will not be able to find a pedigree of products that points in any one direction.
Rather than take the line of “I am right and other people are confused” why not just show off your solution and what it can do? Take the positive approach of showing off your ideas, rather than the negative approach of knocking alternative claims. This will do much more to convince people that your solution has advantages or is the best for some use case.
My observation is that the market really does not distinguish between the two subtypes of event processing, and my opinion is that hybrid solutions will emerge; we already see that happening. So I agree with Hans: rather than explaining why X is or is not Y, let’s look at what is an effective way to implement a certain application.
cheers,
Opher