EP approaches cooperating on computer security
Tim Bass recently posted this summary of his experiences discussing computer security and business rules. He says that most people that he talked to are more excited about statistics based approaches than about using business rules. He mentions hearing that statistics based approaches are the only feasible way to deal with enterprise class computer security problems.
I would be surprised to hear anyone argue against statistical approaches in detecting attacks for large scale computer security problems. Statistics has been used and researched extensively in this area (although possibly not fully explored from a research perspective).
But it seems to me that rules and statistics are natural allies in cyber-security systems. My experience is that statistics based approaches generate events corresponding to a detected situation and it’s very useful to pass these events to a rules engine to handle the response to that situation. Rules handle the mundane tasks like the work flow surrounding the alerts.
I wonder if people discussing the use of statistics over rules are talking mostly about the use of those types of EP to detect security situations. If someone advocating statistics over rules were asked “do you see business rules handling the work flow surrounding security events?” how many would say no?
Also along the lines of rules and statistics working together: When developing a statistical model, there are generally plenty of edge cases that need accounting for. Of course, given enough time, these cases can be included in the model. But often (not always) these edge cases are easily handled by code outside of the core model. In a business context, we would like our overall solution to be able to respond quickly to changes that have not yet been incorporated into our models. In this case, I think that it would be very useful to include a rules engine or one of the stream processing engines in the solution. In addition to handling mundane tasks like work flow, the addition of a more flexible EP system (alongside statistics based approaches) would, in many cases, allow important corrections to be implemented in the short term and then possibly incorporated into the models over time.
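A minimal Python sketch of this division of labor. Everything in it is an assumption made for illustration: the z-score detector, the thresholds, the source names and the action strings are all hypothetical. The statistical step only decides whether something unusual happened, while the ordered rule list decides what to do about it, including a short-term exception that has not yet been folded into the model.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Alert:
    source: str
    value: float
    score: float  # z-score of the value relative to the baseline


def detect(source, baseline, value, threshold=3.0):
    """Statistical step: emit an Alert when the value is far from the baseline."""
    mu, sigma = mean(baseline), stdev(baseline)
    score = abs(value - mu) / sigma if sigma else 0.0
    return Alert(source, value, score) if score >= threshold else None


# Rules step: ordered (condition, action) pairs; the first match wins.
# The "test-lab" rule stands in for an edge case handled outside the model.
RULES = [
    (lambda a: a.source == "test-lab", "ignore"),
    (lambda a: a.score >= 6.0, "page-on-call"),
    (lambda a: True, "open-ticket"),
]


def respond(alert):
    """Return the action of the first rule whose condition holds."""
    for condition, action in RULES:
        if condition(alert):
            return action


baseline = [100, 102, 98, 101, 99, 100, 103, 97]  # hypothetical history
```

Changing the response to a new situation means editing the rule list, not retraining or rewriting the detector.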
Observations on COTS EP software
I thought I’d add a few observations to the ongoing discussion between Opher Etzion and Tim Bass on COTS EP software. Tim writes an interesting post on requirements for COTS EP applications in the event detection space. Also Tim and Opher have been exploring the idea of real-time and latency requirements. Opher offers his views on a point raised by Gartner about COTS EP software in the BI space and how there are more BI customers with a medium or relatively low volume of events than with a very high volume of events. And Tim is an advocate of EP products with better detection “accuracy” rather than minimum latency or maximum throughput. Here are some additional observations.
Real-time applications that do tasks other than event detection
There are a few reasons to think about using a COTS EP product. Among those that haven’t been discussed in the last few rounds of posts on this topic is the replacement of real-time software that does not primarily detect event patterns. This area, IMO, has big potential for cost savings in financial services. There are lots and lots of real-time applications out there doing tasks like order book management, ticker normalization and aggregation (see this post from Louis at Apama on that topic), order routing and position and P&L tracking. Sometimes, within a firm, these applications share a common code base. But at bigger shops, they are each written and maintained by a separate group that reports to the business line using the product. Often there are multiple instances of the same type of real-time application in a single big bank, each serving a different product area (for example, order management systems). They may have a bit of detection in them (matching incoming messages to their order state, for example), but it’s not necessarily their primary function. COTS EP software can potentially help save costs associated with these infrastructure applications. EP software usually includes the mundane infrastructure of a real-time application, including IO and pluggable translation from network messages to internal message formats as well as management and monitoring, high availability, threading (although a few big name EP vendors, for better or for worse, avoid any kind of multi-threading in their product) and testing. Also, they provide a language that naturally addresses streams of messages (events), avoiding the need to maintain the code that passes these messages to various components of the application. In other words, a COTS EP product can help programmers focus on the business logic by removing the need to maintain the infrastructure code.
And maybe the EP language will even make the business logic either easier to maintain or more accessible to non-programmers, or both. Moving to an EP platform for the many real-time projects in a big firm has the potential to reduce or eliminate the 50% or more (much more in some cases) of real-time application code that is not associated with business logic. The benefits from this might include a lower learning curve for developers moving between projects, less code to maintain and less reinvention of the wheel, a consistent interface for operating the processes in production, better reliability and the potential to lower the required skill set (and so the cost of programmers) to work on real-time projects. Note that allowing a business user to directly modify the logic falls under lowering the required skill set.
For these infrastructure real-time applications, throughput and latency may or may not be a big issue. The primary reasons for thinking about latency and throughput in this context stem from hardware cost and the fact that these systems can’t get too far behind in processing their incoming messages before they become almost useless.
Financial services firms certainly don’t represent the whole EP market, but providing an infrastructure for all these real-time infrastructure products and projects in these firms could save quite a lot of money and thus be very lucrative for the firms that can do it.
On the accuracy of event detection
At this point, accuracy of event detection in a COTS EP product is often directly related to the accuracy of the business logic entered by the user. There are certain specialized products for detecting spam, monitoring network security, algorithmic trading and maybe a couple of other categories. There are also a few data mining products that may only work over stored data (rather than network events). These products can be judged by the quality of their algorithms. But judging the accuracy of a more general product comes down to how well it lets you write your business rules. It’s possible that new products incorporating better semantics or explicit definition of concepts like context (as discussed recently by Opher) can help to make business rules more clear or more manageable. But the extent to which these things make a system more “accurate” really depends on the context in which the system is implemented and the kinds of events that need detecting.
Just like we’re seeing that many EP environments may not require the very highest performance, it’s also possible that many do not need the ultimate in pattern recognition. Of course, the easier it is to specify and detect patterns, the better. But it’s also true that the more complex the analysis capabilities are, the more one has to think about them for fear of getting it wrong. This is particularly true when statistics become involved. For example, if you need to detect event patterns like [A,B, not C] then a Bayesian classifier, while good for spam detection, will not only not help you but will be very frustrating if you don’t understand how to properly use it. For simple requirements, even a very basic EP product can be 100% “accurate” while a more complex product might actually be less “accurate.”
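To make the point concrete, here is a small Python sketch of a deterministic matcher for a pattern like [A, B, not C]. The interpretation is an assumption, since the notation above does not pin it down: an A followed by a B with no C arriving in between. The function name and the event representation are hypothetical.

```python
def match_a_b_not_c(events):
    """Return (a_time, b_time) for each A followed by a B with no C in between.

    `events` is an iterable of (timestamp, kind) pairs in arrival order.
    Assumed semantics: a new A restarts the pattern, a C cancels it.
    """
    pending_a = None
    matches = []
    for ts, kind in events:
        if kind == "A":
            pending_a = ts          # latest A starts (or restarts) the pattern
        elif kind == "C":
            pending_a = None        # a C cancels any pattern in progress
        elif kind == "B" and pending_a is not None:
            matches.append((pending_a, ts))
            pending_a = None        # each A is consumed by at most one B
    return matches


events = [(1, "A"), (2, "B"), (3, "A"), (4, "C"),
          (5, "B"), (6, "A"), (7, "X"), (8, "B")]
```

Given the same input, a matcher like this produces the same matches every time, with no training data or tuning involved.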
Take as another example an order matching engine for an electronic exchange or crossing system (again, a financial services example). This kind of thing can be thought of as detecting the pattern of two orders that match and should result in a trade. The requirements here are simple, if not always very straightforward. This kind of thing will never require statistics or complex data analysis because the whole point is to express concrete, deterministic and understandable rules for matching orders. In this case, the choice of COTS EP system is less about “accuracy” and more about how well the kinds of rules that go into order matching are represented by the system. One would also probably consider how well the engine meets other requirements of an order matcher, like the ability to publish quotes or to query the current state of the order queues.
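As a rough illustration of how concrete these rules are, here is a toy matcher in Python. It is a sketch under stated assumptions, not a description of any real exchange: it handles limit orders only, for a single symbol, with price-time priority, and trades at the resting order's price. The class and method names are hypothetical.

```python
import heapq
from itertools import count


class OrderBook:
    """Toy limit order book with price-time priority for one symbol."""

    def __init__(self):
        self._bids = []        # heap keyed on negated price (best bid first)
        self._asks = []        # heap keyed on price (best ask first)
        self._seq = count()    # arrival order breaks ties at the same price

    def submit(self, side, price, qty):
        """Match an incoming order; return the list of (price, qty) trades."""
        trades = []
        if side == "buy":
            book, opposite = self._bids, self._asks
        else:
            book, opposite = self._asks, self._bids
        while qty > 0 and opposite:
            key, seq, rest_price, rest_qty = opposite[0]
            crosses = price >= rest_price if side == "buy" else price <= rest_price
            if not crosses:
                break
            fill = min(qty, rest_qty)
            trades.append((rest_price, fill))   # trade at the resting price
            qty -= fill
            if fill == rest_qty:
                heapq.heappop(opposite)
            else:
                # Same key and sequence number, so the entry stays at the top.
                heapq.heapreplace(opposite, (key, seq, rest_price, rest_qty - fill))
        if qty > 0:
            key = -price if side == "buy" else price
            heapq.heappush(book, (key, next(self._seq), price, qty))
        return trades

    def best_quote(self):
        """Return (best_bid, best_ask); None where a side is empty."""
        bid = self._bids[0][2] if self._bids else None
        ask = self._asks[0][2] if self._asks else None
        return bid, ask
```

The `best_quote` method stands in for the "other requirements" mentioned above, such as publishing quotes or querying the state of the order queues.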
Of course, I believe that there continues to be a great future in more sophisticated event detection techniques. In fact, statistics and data analysis are my areas of research (and what I am taking a break from by writing this post). Just as long as making the hard stuff easier doesn’t get in the way of making the easier stuff as easy as possible.
Posts on EP from my old blog
I was reminded recently by a post from Tim Bass about a couple of my older blog entries on EP. In 2005, I had a use case and I looked at implementations by several of the bigger vendors. I posted a highly summarized description of some of the solutions and then eventually I posted the use case itself.
The use case is very simple and much more specific than the high level stuff that’s been collected since. One has to be a little technical to interpret how this use case would come about in a real system. But it was the only time (as far as I know) on the CEP Interest list that we’ve gotten multiple vendors to all post their solutions to the same problem.
I had hoped to encourage more of this kind of sharing of many answers to the same problem. Maybe we can try that again in the EP (CEP) community. I know that Esper has been doing some work in the realm of tools that might be used to test many products.
Here are the posts. It would be great, IMO, to get back to looking at various solutions to the same problem:
http://www.jroller.com/hgilde/entry/test – A description of some of the details of various solutions to the use case. Also an interesting point about the use case from Aleri in the comments and my response.
http://www.jroller.com/hgilde/entry/three_event_challenge – The use case itself with comments from those who posted solutions.
Context and EP
I watch with interest as Opher examines issues of context and the required semantics for EP. Reading Opher’s post about CoDA and several of his previous posts, I wonder why the notion of context is tied to EP. The diagram from Gartner connects several technologies to a future where context drives the next stage of innovation. Analyzing data in its proper context, though, seems to be useful in plenty of cases and not constrained to a real-time solution.
Here’s one way to break down the ideas:
- Soft real-time processing, where EP software helps to quickly manage, analyze and/or act on network messages. This is an application of software to a particular class of problem, for example most of the use cases put forth in various EP groups. Also in the problem class are ESB mediation and network routing.
- Analysis of data using the ideas of events, context and situations as Opher describes here. This is the application of data analysis techniques to problems like fraud and opportunity detection and image recognition, to name a very few. These techniques could be as broad as stochastic processes, time series analysis, state automata, graph theory and other fields of research that use these ideas under possibly different names.
- Semantics. In other words, expressing the instructions for processing data in particular ways. Here we have all programming and computer languages, including EP languages, SQL and machine languages. Each of these languages supports different semantics that makes it more or less suitable for certain problems. For data processing, some users want the semantics as close to the concepts in their area of expertise as possible (e.g. the business user) while others want them to be flexible and might not mind doing a little translation between the problem domain and the language (e.g. the programmer).
So what are the dependencies between these areas? Using context could benefit many areas that don’t involve real-time requirements. The same can be said for languages that better express data processing instructions.
Positive aspects of various EP languages
Reading the recent round of discussion on SQL EP, I wonder why SQL in EP has to be an all or nothing proposition. As Marco said, there are cases where SQL seems good and cases where it seems bad.
Here are some features that I like from various SQL EP languages and when they’re ubiquitous in EP, I’ll consider giving up on SQL.
- Easy syntax for aggregation over sliding time windows. Trust me, sliding time windows are a pain in C++. Take a look at the SQL syntax for calculating VWAP. It’s one simple statement! Come on, how can you argue against that? Not to say that it solves all EP problems, but you can’t tell me it’s not a good idea.
- Stream queries and a nice way to express set relations between streams. As we’ve seen, not every kind of query can be easily implemented in current SQL EPs. But when it supports the kinds of queries you want, SQL is a beautiful thing. Again, take a close look at the Coral8 Portal and tell me it’s got no promise.
- Multiple kinds of stream-joining. SQL products have both traditional relational joins and also operators like GATHER in StreamSQL, which does another kind of very useful stream matching. These operators save me the work of writing a few kinds of windowed buffering.
- Materialized windows. It’s a window that acts like a table, complete with indexes for efficient searching. How many times have I written something like that? I want to be able to just declare it!!
- Easy sorting of an unordered stream… or not. If you don’t need an ordered stream, SQL EP doesn’t force you into ordering. And if you do want sorting, you don’t have to write a stream sorting routine.
- Declarative pattern matching. Although useful, I’m not sure if this new-ish feature of some SQL EP languages is so much better than a bunch of ‘if’ statements. But still, it’s got promise.
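For a sense of what the one-statement SQL version of VWAP replaces, here is a hand-written sliding time window in Python. It is only a sketch: the class name is hypothetical, and it assumes trades arrive in non-decreasing timestamp order with positive volume and a positive window length.

```python
from collections import deque


class SlidingVwap:
    """Volume-weighted average price over the last `window` seconds."""

    def __init__(self, window):
        self.window = window
        self._trades = deque()      # (timestamp, price, volume), oldest first
        self._notional = 0.0        # running sum of price * volume
        self._volume = 0.0          # running sum of volume

    def add(self, ts, price, volume):
        """Add a trade and return the VWAP of the trades still in the window."""
        self._trades.append((ts, price, volume))
        self._notional += price * volume
        self._volume += volume
        # Expire trades that have slid out of the window (ts - window, ts].
        while self._trades and self._trades[0][0] <= ts - self.window:
            _, old_price, old_volume = self._trades.popleft()
            self._notional -= old_price * old_volume
            self._volume -= old_volume
        return self._notional / self._volume
```

Even this small version has to own the buffer, the expiry logic and the running sums, and it still lacks per-symbol grouping, out-of-order handling and expiry when no new trade arrives.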
So there are lots of features in SQL EP languages that seem pretty good for many classes of EP problems. I think that the Coral8 Portal is a great example of leveraging strengths of SQL EP. There are plenty of very interesting stream queries that can be expressed succinctly in SQL and Coral8 now lets the semi-technical user easily apply these queries to real time data. Way to go Coral8.
At the same time, we see that there are cases where SQL syntax is terribly clumsy. Like the simple example from Louis, which I incorrectly accused of being unnecessarily burdensome. Or maybe the example in my previous post.
But let’s not think that the promise of SQL EP is all or nothing. Until a time when we have the perfect product that lets us define the EP application in purely semantic terms (and even probably after this time, I imagine), EP developers will benefit greatly from features currently only available in SQL EP languages. Is there no hope for an implementation of MonitorScript that incorporates some of these features that make SQL attractive? As a step toward making the hard stuff possible but making the easy stuff as easy as possible, I suspect that such a thing would be very well received.
EP languages as tools for understanding EP requirements
I see this morning a post from Opher about sticking to more fundamental requirements of EP rather than getting caught up in a discussion of current EP languages. It seems to me that an honest discussion of languages is really a discussion of fundamental requirements. An honest discussion of the languages could be thought of as a discussion of the topology, so to speak, of the EP space. We discuss how paths (to a solution) in one product are related to paths in another product and whether or not they amount to the same thing.
Also, note that the discussion of languages is not entirely driven by vendors. Most of the voices in the EP community make money off of EP vendors in one way or another (I don’t). But looking into this community are people like me, engaged in analyzing their own requirements and trying to understand if and how existing or future products can help. These people might be happy to see progress in the theoretical side of EP, but they are also more immediately interested in a practical discussion that will help them with their projects and plans.
As a user of EP software, I like to talk about languages and related features because the discussion helps me to draw out specific requirements and understand how well they are addressed. I like to look at detailed requirements because it prevents me from winding up with a project that is 80% solved by a product, but where the remaining 20% is an unexpected source of cost or delay.
On a related note, let me speculate on why vendors may not have been entirely successful in using their language as a differentiator. I imagine that a big part comes from the fact that most customers don’t have enough experience with different EP languages to understand how language differences will impact their usage.
We regularly see vendors making certain claims about their approach related to others. Most customers, though, haven’t used all these approaches, so they have to rely on gut-level feelings or less than complete evaluations of the suggestions made by vendors trying to sell. This is not optimal for the customers. The best thing would be for a customer to be able to compare their needs with the needs and experiences of others. The difficulty in comparing, as far as I can see, doesn’t help the EP space. It may seem logical to the marketing staff of any one vendor to limit the discussion of their features to one that plays to their strengths. But since every vendor does this, the customer finds themselves in a situation where it is truly a pain in the @$$ to understand the EP space without putting in lots of work. There is no public discourse on the various languages and features, so everyone has to go to the vendors directly and thus get hit full in the face by sales teams looking to rush the buy decision.
So I suggest that an honest discussion of existing languages can benefit users, vendors of existing solutions and also researchers. Users can compare their requirements to those being discussed. Vendors get the benefits of a real community. And when researchers eventually need to implement a product, at worst they can ignore the language discussion entirely. But I suspect that somewhere along the way, even the researcher will benefit from looking over the language discussions if only to more fully analyze the “topology” of the product that they’ve decided to build.
One down side to an SQL EP approach
Having mentioned in a previous post that a flat data structure can be more flexible to query than a hierarchical one, I also have a use case that is indeed annoying in every SQL-like EP language that I’ve tried.
Looking at the quote aggregator mentioned in this post from Louis at Apama, we have a one to many relationship between symbol and the last quote from each quote-source for that symbol. As I mentioned in my previous post, the choice of structure for storing symbols and quotes significantly affects searching complexity. It’s not obvious to me that quotes should be stored in a hierarchy under symbol, since I can think of several reasons to search for a symbol using attributes of its quotes in addition to searching for quotes using attributes of the symbol. An SQL EP may want to store data in flat table-like structures and I’m not sure that this is such a bad thing overall.
But here’s an example where SQL EP can be a pain: The client specifies an interest list of symbols and wants a notification when the quotes on these symbols change. Or maybe they want to specify a symbol and some criteria on price or liquidity and get a notification when their criteria are met. In the notification, the client wants the symbol identifier and also the set of open quotes on the symbol. The client in this case can be an external client or can be running in the same process as the quote aggregator and listening on a stream.
If we are processing an incoming quote using a traditional programming language (like Java, C++ or something with similar structures), we have logic as follows:
- Find the symbol attached to the incoming quote and store the quote internally.
- Check for clients interested in the symbol and quote criteria.
- If interested clients are found, build a set of quotes for the symbol and send the set to an output stream.
But in SQL EP, we do something like this:
- Find the symbol attached to the incoming quote and store the quote internally.
- Check for clients interested in the symbol and quote criteria.
- If interested clients are found, send the quotes for the symbol as individual events to an output stream.
It’s the third step here that’s the catch. In a more procedural language, it’s easy to build a set of quotes and send that set along to the client. But SQL doesn’t provide a way to express “get records from storage, form them into a set and send that set along to a stream.” Instead, SQL wants to “get records from storage and send those records individually to a stream.”
Now look at these two scenarios from the client’s perspective. In the first scenario, the client gets a set of quotes as a unit. In the second scenario, the client gets a stream of quotes. How does the client in the second scenario know when the last quote for that symbol has come in? At best, you may be able to send a “quote stream done for this symbol” message. But even this “done” message has to be expressed using the SQL-like language, so you have some fairly annoying code. Again, there’s no direct way to express “join rows from storage into this stream and then, after all the rows have been sent, send a “done” event to the stream.”
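The two client experiences can be sketched side by side in Python. This is an illustration, not SQL EP code: the function names, the in-memory `quotes` and `interest` stores and the "DONE" marker are all hypothetical stand-ins for the two styles described above.

```python
# Last quote per (symbol, source), kept flat as an SQL EP might keep it.
quotes = {}
# symbol -> set of client ids that asked to be notified (hypothetical store).
interest = {}


def on_quote_as_set(symbol, source, bid, ask):
    """Procedural style: one notification per client carrying the whole set."""
    quotes[(symbol, source)] = (bid, ask)
    current = {src: q for (sym, src), q in quotes.items() if sym == symbol}
    return [(client, symbol, current) for client in sorted(interest.get(symbol, ()))]


def on_quote_as_stream(symbol, source, bid, ask):
    """Row-at-a-time style: one event per quote plus an explicit end marker."""
    quotes[(symbol, source)] = (bid, ask)
    events = []
    for client in sorted(interest.get(symbol, ())):
        for (sym, src), q in quotes.items():
            if sym == symbol:
                events.append((client, symbol, src, q))
        # Without this marker the client cannot tell when the set is complete.
        events.append((client, symbol, "DONE", None))
    return events
```

In the first function the unit the client receives is the set itself. In the second, the client has to buffer rows until the marker arrives, and it is that marker which is awkward to express in an SQL-like language.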
I haven’t worked with an SQL EP language in a few months, so maybe I’m missing something. But last time I checked, use cases like this one were hard using SQL EP languages. SQL is very expressive in certain ways and I find that it can significantly reduce the amount of code required for many EP tasks. But with the current state of SQL EP languages, I suspect that any big project will end up writing a certain amount of Java/C++/C# code in addition to the SQL.
Determinism and scalability
I got some interesting email responses to my post about BAM failure scenarios, some of which would probably do better in a public forum. I sense that hovering around the fringes of groups like the CEP Yahoo Group are people looking for practical advice on implementing EP. They have some interest in the strategic implications of this and that technology, but really they have specific problems to solve and are looking to discuss potential solutions.
So I’m writing a little bit about scalability and determinism, which is at the core of a question I found in my in box. Hopefully, this will both help people who have to deal with this problem and highlight the need for more practical advice on implementing EP. My initial goal is to define the domain of the problem and the context in which it must be solved. Maybe later, I’ll have some answers.
The gist of the question is something like “shouldn’t EP products give us explicit control over the trade off between scalability and determinism and what advice might I have about that trade off?”. All this was in the context of BAM, but I think the issue is broader than that. The initial question had the hint of frustration stemming from apparent experience with vendors that either claim determinism is important above all other features or dismiss it entirely. We know that neither of these positions can be 100% correct.
The trade off in question is as old as the ability to distribute work loads among multiple CPUs. The simplest way to get determinism is never to share data between threads or processes and to distribute work to threads (or processes) in a repeatable way. But this tactic leads to certain potential issues. Let’s say that I distribute transactions among threads based on the name (or company name) of the person requesting the transaction. This is a common tactic in financial compliance and order management. Since orders placed by one firm frequently don’t impact the credit or execution of orders placed by another firm, it’s often possible to distribute work load for these orders to threads (or processes or servers) based on the firm placing the order. In other words, as long as orders placed by the same firm always go to the same thread, there is no need for threads to share data between them. Similarly, work load might be distributed by the symbol of the instrument being traded, as is common in automatic trading systems.
So in this simple case, we have n threads, they never share data, and we distribute load among them based on some deterministic manipulation of the first character of the name of the person (or company) requesting the transaction. Obviously here we have n less than or equal to 26. Let’s also assume that transactions for one customer never affect transactions for another customer and that every thread will get as much CPU time as it wants (obviously, each thread will only use one CPU at any time). The two immediate problems are: (1) it’s possible for certain threads to become backed up while others are unused, based only on the distribution of our customer names and (2) if one thread becomes slow or blocked, the transactions for all customers whose names are routed to that thread must also become slow or blocked.
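Problem (1) is easy to see in a few lines of Python. This is a sketch with made-up customer names. The modulo mapping is just one possible "deterministic manipulation" of the first character, and it assumes names start with an ASCII letter.

```python
from collections import Counter


def route(name, n_threads):
    """Deterministically map a customer name to a thread index."""
    first = name.strip().upper()[0]
    return (ord(first) - ord("A")) % n_threads


# Hypothetical customer list, skewed toward names starting with "A".
customers = ["Acme", "Apex", "Atlas", "Avalon", "Baker", "Zenith"]
load = Counter(route(name, 4) for name in customers)
```

With four threads, thread 0 receives four of the six customers, thread 1 receives two and threads 2 and 3 sit idle. The routing is perfectly repeatable, which is the point, but the balance depends entirely on the names.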
Among financial service providers, by far the most common tactic is to accept these potential issues. Since the beginning of electronic exchanges, order book traffic has been distributed to processes based on the symbol. Several generations of traders are all too familiar with the emails from an exchange saying “we are experiencing queuing of the following symbols,” meaning that the process handling those symbols is not keeping up with the traffic and so incoming transaction messages for those symbols are being queued. And several generations of programmers at financial institutions are used to thinking that there must be a better way.
As far as I know, the second most common tactic is to use some kind of distributed cache, maybe combined with a smart workload routing system. Under the more advanced systems, work is usually routed by customer name (or whatever criteria), so in general threads don’t have to share data. But if one thread becomes overloaded or backed up, work is redistributed to other threads that then access the required data from the distributed cache. This is how many, many web sites work. They try to stick you to a particular thread based on a cookie, but if that thread isn’t doing so well, they will send you to a different thread that will somehow retrieve all your pertinent data from a cache. Of course, since threads are interacting with each other in this model, we introduce the possibility of losing determinism. The key to making this scenario work is generally locking. If you lock objects in the cache properly, you can guarantee deterministic behavior where you need it.
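A minimal Python sketch of the locking idea, using an in-process dictionary as a stand-in for a distributed cache. The class and its per-key lock table are hypothetical. The example only shows that proper locking prevents lost updates when any worker may touch any customer's data.

```python
import threading


class LockedCache:
    """Shared state with one lock per key, so any worker can serve any key."""

    def __init__(self):
        self._values = {}
        self._locks = {}
        self._guard = threading.Lock()      # protects the lock table itself

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def add(self, key, amount):
        """Atomic read-modify-write on one key."""
        with self._lock_for(key):
            self._values[key] = self._values.get(key, 0) + amount

    def get(self, key):
        with self._lock_for(key):
            return self._values.get(key, 0)


def worker(cache, key, times):
    for _ in range(times):
        cache.add(key, 1)


cache = LockedCache()
threads = [threading.Thread(target=worker, args=(cache, "acme", 1000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
total = cache.get("acme")
```

The total comes out the same on every run because no update is lost and addition does not care about order. The order in which the workers' updates are applied still varies between runs, so operations that do depend on order need more than a per-key lock to stay deterministic.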
These two approaches generally embody the trade offs between throughput and scalability, average latency and maximum latency. A solution without the potential to dynamically distribute load does not need very much inter-thread locking, resulting in lower average latency and higher throughput per thread. But a solution that does have this capability can route work around bottlenecks and so may have lower maximum latency. Also, such a system may be able to scale up to use more threads (processes, servers) during peak load periods and then release those resources during low load periods. I stress the second “may” because such scalability often sounds better in theory than it turns out to be in practice.
In finance, many systems are very sensitive to average latency. Trading decisions or order matches need to be made as fast as possible and orders need to reach their markets before the next guy can get in on your action. This is why the solution with the lowest average latency is so common.
Now… along comes the potential for real time BAM or fraud detection and we have to think seriously about what strategy is best. BAM and fraud detection often need to operate on data across arbitrary customers/symbols/whatever and so traditional tactics for distributing load among threads that don’t share data might not work. At the same time, these systems are handling all the messages from the entire distributed application in one place, so they need to be as efficient as possible in their CPU use. All of this can be compounded by a common goal of many real-time BAM and fraud monitoring projects: reducing the skill set needed to process real-time data into a business context. Not only are projects taking on the challenge of processing bazillions of messages per second, they would like to do it in a way that doesn’t require senior programmers with threading experience to write the processing rules.
After writing all of this out, it seems obvious to me why so many smart people are interested in relating EP to these problems. The issues involved here are at the cutting edge of modern software.
The only advice that I was able to muster on this topic was this: (1) Give up determinism at your own peril. Nothing is more frustrating or wastes more time than tracking down errors from production that you can’t reproduce in development. (2) Requiring complex object locking schemes goes against the goal of reducing the skill set needed to write or update your business rules. (3) Simple object locking schemes often have a higher impact on performance than you initially imagine. When you start down the road of object locking, think very carefully about both your current and future requirements and about future generations that will work on your project.
OK, so no big revelations in this post, but at least I got it all typed out. I’ve used up my allotted hour of writing for this week and dealing with these issues is pretty complicated. So we’ll see if I get more time to write about it. Hopefully, though, this post will help to spark others to talk about their experience both in the practical and theoretical sides of this topic.
Also, see these blog posts for an EP perspective on this topic: Apama and Coral8.
Finally, please forgive grammar problems or oversights in my posts. Writing on this blog is basically a distraction from other tasks and I never have much time to proof read or edit.
Fair and unfair criticism of an SQL EP approach
Recently we have seen various criticisms of the SQL approach to EP. I find it interesting that I agree with some of the general premise of the criticism, yet I find many of the particular arguments to be flawed.
Apama has been a long time critic of the SQL-esque approach and I think that they are highly credible in this area. Apama could absolutely implement an SQL language into their EP solution. They have very significant resources and an extremely flexible EP product which could almost certainly allow SQL and MonitorScript to coexist. So when they say that they don’t like SQL, it’s because they’ve thought a great deal about the topic and genuinely feel that implementing an SQL language is not in the best interest of their customers. If a market leader takes a strong position like this, it’s only wise to pay attention.
And yet, this post from inferences about an SQL EP approach based on various issues with SQL databases, without looking deeply enough into the topic to support those inferences. I find the latter category to be misleading because some of these inferences are incorrect and that throws an unnecessary shadow over the whole issue.
For example of a “category 1” flaw,
This is, of course, not to say that a nested data structure is necessarily inappropriate. But as we can see, it is certainly incorrect to assume that using a nested structure is better than a flat structure without significantly more analysis. This leads to the argument about O/R mappers. The argument that the existence of O/R mappers proves that an SQL approach is bad is illogical. If a nested object view of data were so much better than a flat table approach, object databases would be much more popular than they are. The fact is that flat table structures are so common not because of some limitation of database vendors or users’ imagination, but because they are often found to be much more flexible in the long run than a nested structure.
Having now disagreed with Louis’ nested-data-structure argument, I find myself thinking “ok, but if I determine that a nested structure truly would be best, I would like to be able to use it.” This leads me into “category 2” flaws.
inferences about an SQL EP approach based on problems that are common with an SQL database. After much consideration, I think that while there is a core of a good point in this idea, the general comparison between SQL databases and an SQL EP approach is being abused. As we have heard recently, not all SQL-like languages prevent nested data structures. As far as I know, nothing in SQL-like syntax prevents addressing a nested structure. This leaves it up to each vendor to implement the capability for nested structures (or not). Let’s not get caught in the trap of assuming that just because relational databases don’t allow for nested data, an SQL language approach to EP will always have the same “problem”. We can see the example of Esper, which allows for an SQL-like approach while retaining the ability to use nested objects.
Apama is an extremely professional organization, so I know that they will take this post for what it is: a comment on the discussion of SQL-or-not. I criticise one of their recent posts, but not their opinion about SQL. They have a much broader view of the EP market than I do and when they say that they don’t see a need for an SQL-like approach, I imagine that they know what they’re talking about. Indeed, I have seen several things about existing SQL languages that I wish were better. But still, I have not yet seen an argument that condemns an SQL approach to eternal uselessness. At the same time, I have seen several arguments that show why an SQL approach might be appropriate. So I look forward to Apama demonstrating their reasons over time. In the mean time, I continue to believe that the best thing would be for each user to analyze both SQL and SQL-less approaches in the context of their problem.
P.S. I believe that Louis provides an unnecessarily inefficient bit of code in his article. I’m no expert on StreamSQL, but I’m pretty sure that I recall that it’s possible to implement a counter like this as a select statement from the input stream without using a memory table.
Update: Apologies to Louis for the “inefficient bit of code” comment, I didn’t realize that it comes from a vendor. Indeed that bit of code demonstrates an annoying “workaround” in SQL.
