Is correlation killing the SIEM market?

Correlation – what’s it good for? Absolutely nothing!*

* Thank you Edwin Starr.

Ok, that might be a little harsh, but hear me out.

The grand vision of Security Information and Event Management is that it will tell you when you are in danger, and the means to deliver this is through sifting mountains of log files looking for trouble signs. I like to think of that as big-C correlation. Big-C correlation is an admirable concept of associating events with importance. But whenever a discussion occurs about correlation or for that matter SIEM – it quickly becomes a discussion about what I call little-c correlation – that is rules-based multi-event pattern matching.

To proponents of correlation, correlation can detect patterns of behavior so subtle that it would be impossible for a human unaided to do the same. It can deliver the promise of SIEM – telling you what is wrong in a sea of data. Heady stuff indeed and partially true. But the naysayers have numerous good arguments against as well; in no particular order some of the more common ones:

• Rules are too hard to write
• The rule builders supplied by the vendors are not powerful enough
• Users don’t understand the use cases (that is usually a vendor rebuttal argument for the above).
• Rules are not “set and forget” and require constant tuning
• Correlation can’t tell you anything you don’t already know (you have to know the condition to write the rule)
• Too many false positives

The proponents reply that this is a technical challenge and the tools will get better and the problem will be conquered. I have a broader concern about correlation (little c) however, and that is just how useful is it to the majority of customer uses cases. And if it is not useful, is SIEM, with a correlation focus, really viable?

The guys over at Securosis have been running a series defining SIEM that is really worth a read. Now the method they recommend for approaching correlation is that you look at your last 4-5 incidents when it comes to rule-authoring. Their basic point is that if the goals are modest, you can be modestly successful. OK, I agree, but then how many of the big security problems today are really the ones best served by correlation? Heck it seems the big problems are people being tricked into downloading and running malware and correlation is not going to help that. Education and Change Detection are both better ways to avoid those types of threats. Nor will correlation help with SQL injection. Most of the classic scenarios for correlation are successful perimeter breaches but with a SQL attack you are already within the perimeter. It seems to me correlation is potentially solving yesterdays’ problems – and doing it, because of technical challenges, poorly.

So to break down my fundamental issue with correlation – how many incidents are 1) serious 2) have occurred 3) cannot be mitigated in some other more reasonable fashion and 4) the future discovery is best done by detecting a complex pattern?

Not many, I reckon.

No wonder SIEM gets a bad rap on occasion. SIEM will make a user safer but the means to the end is focused on a flawed concept.

That is not to say correlation does not have its uses – certainly the bigger and more complex the environment the more likely you are going to have cases where correlation could and does help. In F500 the very complexity of the environment can mean other mitigation approaches are less achievable. The classic correlation focused SEM market started in large enterprise but is it a viable approach?

Let’s use Prism as an example, as I can speak for the experiences of our customers. We have about 900 customers that have deployed EventTracker, our SIEM solution. These customers are mostly smaller enterprises, what Gartner defines as SME, however they still purchased predominantly for the classic Gartner use case – the budget came from a compliance drive but they wanted to use SIEM as a means of improving overall IT security and sometime operations.

In the case of EventTracker the product is a single integrated solution so the rule-based correlation engine is simply part of the package. It is real-time, extensible and ships with a bunch of predefined rules.

But only a handful of our customers actually use it, and even those who do, don’t do much.

Interestingly enough, most of the customers looked at correlation during evaluation but when the product went into production only a handful actually ended up writing correlation rules. So the reality was, although they thought they were going to use the capability, few did. A larger number, but still a distinct minority, are using some of the preconfigured correlations as there are some uses cases (such as failed logins on multiple machines from a single IP) that a simple correlation rule makes good sense for. Even with the packaged rules however customers tended to use only a handful and regardless these are not the classic “if you see this on a firewall, and this on a server, and this in AD, followed by outbound ftp traffic you are in trouble” complex correlation examples people are fond of using.

Our natural reaction was there was something wrong with the correlation feature so we went back to the installed base and started nosing about. The common response was, no nothing wrong, just never got to it. On further questioning we surfaced the fact for most of the problems they were facing – rules were simply not the best approach to solving the problem.

So we have an industry that, if you agree with my premise, is talking about core value that is impractical to all but a small minority. We are, as vendors, selling snake oil.

So what does that mean?

Are prospects overweighting correlation capability in their evaluations to the detriment of other features that they will actually use later? Are they setting themselves up to fail with false expectations into what SIEM can deliver?

From a vendor standpoint are we all spending R&D dollars on capability that is really simply demoware? Case in point is correlation GUIs. Lots of R&D $$ go into correlation GUIs because writing rules is too hard and customers are going to write a lot of rules. But the compelling value promised for correlation is the ability to look for highly complex conditions. Inevitably when you make a development tool simpler you compromise the power in favor of speed of development. In truth you have not only made it simpler, but also stupider, and less capable. And if you are seldom writing rules, per the Securosis approach, does it need to be fast and easy at all?

That is not to say SIEM is valueless, SIEM is extremely valuable, but we are focusing on its most difficult and least valuable component which is really pretty darn strange. There was an interesting and amusing exchange a week or so ago when LogLogic lowered the price of their SEM capability. This, as you might imagine, raised the hackles of the SEM apologists. Rocky uses Arcsight as an example of successful SIEM (although he conveniently talks about SEM as SIEM, and the SIEM use case is broader now than SEM) – but how much ESM is Arcsight selling down-market? I tend to agree with Rocky in large enterprise but using that as an indicator of the broad market is dangerous. Plus the example of our customers I gave above would lead one to believe people bought for one reason but using the products in an entirely different way.

So hopefully this will spark some discussion. This is, and should not be, a slag between Log Management, SEM or SIM because it seems to me the only real differences between SEM and LM these days is in the amount of lip service paid to real-time rules.

So let’s talk about correlation – what is it good for?

-Steve Lafferty

SIEM or Log Management?

Mike Rothman of Securosis has a thread titled Understanding and Selecting SIEM/Log Management. He suggests both disciplines have fused and defines the holy grail of security practitioners as “one alert telling exactly what is broken”. In the ensuing discussion, there is a suggestion that SIEM and Log Mgt have not fused and there are vendors that do one but not the other.

After a number of years in the industry, I find myself uncomfortable with either term (SIEM or Log Mgt) as it relates to the problem the technology can solve, especially for the mid-market, our focus.

The SIEM term suggests it’s only about Security, and while that is certainly a significant use-case, it’s hardly the only use for the technology. That said if a user wishes to use the technology for only the security use case, fine, but that is not a reflection of the technology. Oh by the way, Security Information Management would perforce include other items such as change audit and configuration assessment data as well which is outside scope of “Log Management”.

The trouble with the term Log Management is that it is not tied to any particular use case and that makes it difficult to sell (not to mention boring). Why would you want to manage logs anyway? Users only care about solutions to real problems they have; not generic “best practice” because Mr. Pundit says so.

SIEM makes sense as “the” use case for this technology as you go to large (Fortune 2000) enterprises and here SIEM is often a synonym for correlation.
But to do this in any useful way, you will need not just the box (real or
virtual) but especially the expert analyst team to drive it, keep it updated and ticking. What is this analyst team busy with? Updating the rules to accommodate constantly changing elements (threats, business rules, IT components) to get that “one alert”. This is not like AntiVirus where rule updates can happen directly from the vendor with no intervention from the admin/user. This is a model only large enterprises can afford.

Some vendors suggest that you can reduce this to an analyst-in-a-box for small enterprise i.e., just buy my box, enable these default rules, minimal intervention and bingo you will be safe. All too common results are either irrelevant alerts or the magic box acts as the dog in the night time. A major reason for “pissed-off SIEM users”. And of course a dedicated analyst (much less a team) is simply not available.

This not to say that the technology is useless absent the dedicated analyst or that SIEM is a lost cause but rather to paint a realistic picture that any “box” can only go so far by itself; and given the more-with-less needs in this mid-market, obsessing on SIEM features obscures the greater value offered by this technology.

Most Medium Enterprise networks are “organically grown architectures” a response to business needs — there is rarely an overarching security model that covers the assets. Point solutions dominate based on incidents or perceived threats or in response to specific compliance mandates. See the results of our virtualization survey for example. Given the resource constraints, the technology must have broad features beyond the (essential) security ones. The smarter the solution, the less smart the analyst needs to be — so really it’s a box-for-an-analyst (and of course all boxes now ought to be virtual).

It makes sense to ask what problem is solved, as this is the universe customers live in. Mike identifies reacting faster, security efficiency and compliance automation to which I would add operations support and cost reduction. More specifically, across the board, show what is happening (track users, monitor critical systems/applications/firewalls, USB activity, database activity, hypervisor changes, physical eqpt etc), show what has happened (forensic, reports etc) and show what is different (change audit).

So back to the question, what would you call such a solution? SIEM has been pounded by Gartner et al into the budget line items of large enterprises so it becomes easier to be recognized as a need. However it is a limiting description. If I had only these two choices, I would have to favor Log Management where one (essential) application is SIEM.

-Ananth