Correlation – what’s it good for? Absolutely nothing!*
* Thank you Edwin Starr.
Ok, that might be a little harsh, but hear me out.
The grand vision of Security Information and Event Management is that it will tell you when you are in danger, and the means to deliver this is through sifting mountains of log files looking for trouble signs. I like to think of that as big-C correlation. Big-C correlation is an admirable concept of associating events with importance. But whenever a discussion occurs about correlation or for that matter SIEM – it quickly becomes a discussion about what I call little-c correlation – that is rules-based multi-event pattern matching.
To proponents of correlation, correlation can detect patterns of behavior so subtle that it would be impossible for a human unaided to do the same. It can deliver the promise of SIEM – telling you what is wrong in a sea of data. Heady stuff indeed and partially true. But the naysayers have numerous good arguments against as well; in no particular order some of the more common ones:
• Rules are too hard to write
• The rule builders supplied by the vendors are not powerful enough
• Users don’t understand the use cases (that is usually a vendor rebuttal argument for the above).
• Rules are not “set and forget” and require constant tuning
• Correlation can’t tell you anything you don’t already know (you have to know the condition to write the rule)
• Too many false positives
The proponents reply that this is a technical challenge and the tools will get better and the problem will be conquered. I have a broader concern about correlation (little c) however, and that is just how useful is it to the majority of customer uses cases. And if it is not useful, is SIEM, with a correlation focus, really viable?
The guys over at Securosis have been running a series defining SIEM that is really worth a read. Now the method they recommend for approaching correlation is that you look at your last 4-5 incidents when it comes to rule-authoring. Their basic point is that if the goals are modest, you can be modestly successful. OK, I agree, but then how many of the big security problems today are really the ones best served by correlation? Heck it seems the big problems are people being tricked into downloading and running malware and correlation is not going to help that. Education and Change Detection are both better ways to avoid those types of threats. Nor will correlation help with SQL injection. Most of the classic scenarios for correlation are successful perimeter breaches but with a SQL attack you are already within the perimeter. It seems to me correlation is potentially solving yesterdays’ problems – and doing it, because of technical challenges, poorly.
So to break down my fundamental issue with correlation – how many incidents are 1) serious 2) have occurred 3) cannot be mitigated in some other more reasonable fashion and 4) the future discovery is best done by detecting a complex pattern?
Not many, I reckon.
No wonder SIEM gets a bad rap on occasion. SIEM will make a user safer but the means to the end is focused on a flawed concept.
That is not to say correlation does not have its uses – certainly the bigger and more complex the environment the more likely you are going to have cases where correlation could and does help. In F500 the very complexity of the environment can mean other mitigation approaches are less achievable. The classic correlation focused SEM market started in large enterprise but is it a viable approach?
Let’s use Prism as an example, as I can speak for the experiences of our customers. We have about 900 customers that have deployed EventTracker, our SIEM solution. These customers are mostly smaller enterprises, what Gartner defines as SME, however they still purchased predominantly for the classic Gartner use case – the budget came from a compliance drive but they wanted to use SIEM as a means of improving overall IT security and sometime operations.
In the case of EventTracker the product is a single integrated solution so the rule-based correlation engine is simply part of the package. It is real-time, extensible and ships with a bunch of predefined rules.
But only a handful of our customers actually use it, and even those who do, don’t do much.
Interestingly enough, most of the customers looked at correlation during evaluation but when the product went into production only a handful actually ended up writing correlation rules. So the reality was, although they thought they were going to use the capability, few did. A larger number, but still a distinct minority, are using some of the preconfigured correlations as there are some uses cases (such as failed logins on multiple machines from a single IP) that a simple correlation rule makes good sense for. Even with the packaged rules however customers tended to use only a handful and regardless these are not the classic “if you see this on a firewall, and this on a server, and this in AD, followed by outbound ftp traffic you are in trouble” complex correlation examples people are fond of using.
Our natural reaction was there was something wrong with the correlation feature so we went back to the installed base and started nosing about. The common response was, no nothing wrong, just never got to it. On further questioning we surfaced the fact for most of the problems they were facing – rules were simply not the best approach to solving the problem.
So we have an industry that, if you agree with my premise, is talking about core value that is impractical to all but a small minority. We are, as vendors, selling snake oil.
So what does that mean?
Are prospects overweighting correlation capability in their evaluations to the detriment of other features that they will actually use later? Are they setting themselves up to fail with false expectations into what SIEM can deliver?
From a vendor standpoint are we all spending R&D dollars on capability that is really simply demoware? Case in point is correlation GUIs. Lots of R&D $$ go into correlation GUIs because writing rules is too hard and customers are going to write a lot of rules. But the compelling value promised for correlation is the ability to look for highly complex conditions. Inevitably when you make a development tool simpler you compromise the power in favor of speed of development. In truth you have not only made it simpler, but also stupider, and less capable. And if you are seldom writing rules, per the Securosis approach, does it need to be fast and easy at all?
That is not to say SIEM is valueless, SIEM is extremely valuable, but we are focusing on its most difficult and least valuable component which is really pretty darn strange. There was an interesting and amusing exchange a week or so ago when LogLogic lowered the price of their SEM capability. This, as you might imagine, raised the hackles of the SEM apologists. Rocky uses Arcsight as an example of successful SIEM (although he conveniently talks about SEM as SIEM, and the SIEM use case is broader now than SEM) – but how much ESM is Arcsight selling down-market? I tend to agree with Rocky in large enterprise but using that as an indicator of the broad market is dangerous. Plus the example of our customers I gave above would lead one to believe people bought for one reason but using the products in an entirely different way.
So hopefully this will spark some discussion. This is, and should not be, a slag between Log Management, SEM or SIM because it seems to me the only real differences between SEM and LM these days is in the amount of lip service paid to real-time rules.
So let’s talk about correlation – what is it good for?