Does EP change the nature of data analysis?
I wonder if EP (or CEP, whatever the difference may be) will change the nature of data analysis. If it does, this is a really big deal. But I’m skeptical. I’m not talking about the fact that EP software helps implement detection rules over real-time data. I’m talking about the theories and methods that we use to develop detection rules. Will EP usher in new ways of locating patterns in data and, if so, will those new methods or theories then shape new ideas in data analysis?
This post was, in some way, prompted by a recent post from Jack at Aleri. But the question has been lingering in my mind since Opher posted about the possibility of EP ushering in the widespread adoption of CoDA. I began wondering if EP is something other than the inevitable productizing of best practices and frameworks for real-time processing.
Data analysis is a huge field. We have mathematics (e.g. central limit theorems and much more) and plenty of applied techniques for analyzing different kinds of data. We also have visualization techniques and plenty of research going on there. And the list goes on. So what will EP contribute?
Visualizing data in real time is useful in many cases, and EP software can be used here to slice and dice the data as it arrives. Jack points out a good idea in this area: applying a data-dicing UI to real-time data. But I would not exactly call this new. Even though it happens in real time, it boils down to breaking data into windows. And windows are useful but not new (although the ease of declaring these windows, as provided by SQL-like EP solutions, is a clear advance). So EP contributes in this area by making these techniques available for real-time data, but it has not yet produced a change in how people analyze data.
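To make the "windows" point concrete, here is a minimal sketch of a sliding time window computing a running mean over a stream. It is not taken from any EP product. The class name `SlidingWindowMean`, its `width` parameter, and the `(timestamp, value)` event shape are all assumptions made for illustration. A SQL-like EP language would presumably let you declare something like this in a line or two, which is the convenience I mean, but the underlying technique is the same.

```python
from collections import deque


class SlidingWindowMean:
    """Running mean over events from the last `width` time units.

    Hypothetical helper for illustration only; not an API of any EP engine.
    Assumes events arrive in non-decreasing timestamp order.
    """

    def __init__(self, width):
        self.width = width
        self.events = deque()  # (timestamp, value) pairs currently in the window
        self.total = 0.0       # sum of the values currently in the window

    def add(self, timestamp, value):
        """Add one event and return the mean of the current window."""
        self.events.append((timestamp, value))
        self.total += value
        # Expire events that are `width` or more time units older than this one.
        while self.events[0][0] <= timestamp - self.width:
            _, old_value = self.events.popleft()
            self.total -= old_value
        return self.total / len(self.events)


window = SlidingWindowMean(width=10)
means = [window.add(t, v) for t, v in [(0, 10.0), (5, 20.0), (12, 30.0), (15, 40.0)]]
```

Nothing here is new as analysis: it is a plain moving average. The only real-time aspect is that the window is updated event by event instead of being computed over stored data.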
I have also noticed that the folks over at The CEP Blog apparently want CEP to provide some new (and possibly magical) pattern detection techniques. And I wonder: where are these techniques supposed to come from? As far as I know, they will come from research in mathematics and statistics. Or they will come from research in applying math to particular problem domains (e.g. network security, trading) or to visualization. So the question becomes: what will EP contribute to these established fields? Is EP’s role to contribute original ideas to data analysis, or to provide a convenient way to apply techniques developed elsewhere to real-time data?