Fair and unfair criticism of an SQL EP approach
Recently we have seen various criticisms of the SQL approach to EP. I find it interesting that I agree with some of the general premise of the criticism, yet I find many of the particular arguments to be flawed.
Apama has been a longtime critic of the SQL-esque approach and I think that they are highly credible in this area. Apama could absolutely implement an SQL language in their EP solution. They have very significant resources and an extremely flexible EP product which could almost certainly allow SQL and MonitorScript to coexist. So when they say that they don’t like SQL, it’s because they’ve thought a great deal about the topic and genuinely feel that implementing an SQL language is not in the best interest of their customers. If a market leader takes a strong position like this, it’s only wise to pay attention.
And yet, this post from inferences about an SQL EP approach based on various issues with SQL databases, without looking deeply enough into the topic to support those inferences. I find the latter category to be misleading because some of these inferences are incorrect and that throws an unnecessary shadow over the whole issue.
As an example of a “category 1” flaw,
This is, of course, not to say that a nested data structure is necessarily inappropriate. But as we can see, it is certainly incorrect to assume, without significantly more analysis, that a nested structure is better than a flat one. This leads to the argument about O/R mappers. The argument that the existence of O/R mappers proves that an SQL approach is bad is illogical. If a nested object view of data were so much better than a flat table approach, object databases would be much more popular than they are. The fact is that flat table structures are so common not because of some limitation of database vendors or of users’ imagination, but because they are often found to be much more flexible in the long run than a nested structure.
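The flexibility point can be pictured with a small Python sketch. It is only an illustration under invented assumptions: the trade events, the field names (`account`, `symbol`, `qty`) and the helper functions are all hypothetical and come from no vendor's product. It shows the same data held flat and nested by account, and how a question that cuts across the nesting needs its own traversal.

```python
from collections import defaultdict

# Hypothetical trade events, invented for illustration only.
flat_rows = [
    {"account": "A1", "symbol": "IBM", "qty": 100},
    {"account": "A1", "symbol": "MSFT", "qty": 50},
    {"account": "B2", "symbol": "IBM", "qty": 25},
]

# The same data nested by account: one hierarchy is baked into the shape.
nested = {
    "A1": [{"symbol": "IBM", "qty": 100}, {"symbol": "MSFT", "qty": 50}],
    "B2": [{"symbol": "IBM", "qty": 25}],
}


def total_by(rows, key):
    """Sum qty grouped by any column of the flat rows."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["qty"]
    return dict(totals)


def nested_total_by_symbol(by_account):
    """Answer a cross-cutting question on the nested shape.

    The account hierarchy has to be walked and flattened first.
    """
    totals = defaultdict(int)
    for trades in by_account.values():
        for trade in trades:
            totals[trade["symbol"]] += trade["qty"]
    return dict(totals)


by_symbol = total_by(flat_rows, "symbol")
by_account = total_by(flat_rows, "account")
by_symbol_from_nested = nested_total_by_symbol(nested)
```

With the flat rows, one generic function answers both the per-account and the per-symbol question. With the nested form, the per-account view is free but every other view needs code that knows the hierarchy. Neither shape wins in general, which is the reason the choice deserves analysis rather than assumption.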
Having now disagreed with Louis’ nested-data-structure argument, I find myself thinking “OK, but if I determine that a nested structure truly would be best, I would like to be able to use it.” This leads me to “category 2” flaws.
inferences about an SQL EP approach based on problems that are common with an SQL database. After much consideration, I think that while there is a core of a good point in this idea, the general comparison between SQL databases and an SQL EP approach is being abused. As we have heard recently, not all SQL-like languages prevent nested data structures. As far as I know, nothing in SQL-like syntax prevents addressing a nested structure. This leaves it up to each vendor to implement the capability for nested structures (or not). Let’s not get caught in the trap of assuming that just because relational databases don’t allow for nested data, an SQL language approach to EP will always have the same “problem”. We can see the example of Esper, which allows for an SQL-like approach while retaining the ability to use nested objects.
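To make the idea of addressing a nested structure from SQL-like syntax concrete, here is a toy Python sketch. It is not Esper's implementation or any vendor's API: the `resolve` and `select` helpers, the dotted-path convention and the order events are all assumptions made up for this illustration. It only shows that a select/where style of query and nested event objects are not inherently at odds.

```python
def resolve(event, path):
    """Follow a dotted path such as 'order.customer.name' into nested dicts."""
    value = event
    for part in path.split("."):
        value = value[part]
    return value


def select(events, columns, where=None):
    """Toy projection in the spirit of: SELECT columns FROM events WHERE where."""
    for event in events:
        if where is None or where(event):
            yield {col: resolve(event, col) for col in columns}


# Hypothetical nested order events, invented for illustration only.
events = [
    {"order": {"id": 1, "customer": {"name": "Ann"}, "total": 40}},
    {"order": {"id": 2, "customer": {"name": "Bob"}, "total": 250}},
]

# Roughly: SELECT order.customer.name, order.total WHERE order.total > 100
big_orders = list(
    select(
        events,
        ["order.customer.name", "order.total"],
        where=lambda e: resolve(e, "order.total") > 100,
    )
)
```

The query side stays flat and declarative while the events keep their nested shape. Whether a given product offers something like this is a vendor decision, not something the SQL-like syntax itself rules out.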
Apama is an extremely professional organization, so I know that they will take this post for what it is: a comment on the discussion of SQL-or-not. I criticise one of their recent posts, but not their opinion about SQL. They have a much broader view of the EP market than I do, and when they say that they don’t see a need for an SQL-like approach, I imagine that they know what they’re talking about. Indeed, I have seen several things about existing SQL languages that I wish were better. But still, I have not yet seen an argument that condemns an SQL approach to eternal uselessness. At the same time, I have seen several arguments that show why an SQL approach might be appropriate. So I look forward to Apama demonstrating their reasons over time. In the meantime, I continue to believe that the best thing would be for each user to analyze both SQL and SQL-less approaches in the context of their own problem.
P.S. I believe that Louis provides an unnecessarily inefficient bit of code in his article. I’m no expert on StreamSQL, but I seem to recall that it’s possible to implement a counter like this as a select statement on the input stream, without using a memory table.
Update: Apologies to Louis for the “inefficient bit of code” comment; I didn’t realize that it comes from a vendor. Indeed, that bit of code demonstrates an annoying “workaround” in SQL.