Well this is starting to turn into a bit of a bun fight, which was not my intent as I was merely attempting to clarify some incorrect claims in the Splunk post. Well, now Anton has weighed in with his perspective:
“I think this debate is mostly about two approaches to logs: collect and parse some logs (typical SIEM approach) vs collect and index all logs (like, ahem, “IT search”).”
Yes, he is right in a sense. It is a great concise statement, but the statement needs to be looked at as there are some nuances here that need to be understood.
Just a bit of level-set before going to work on the meat of the statement.
Most SIEM solutions today have a real-time component, (typically a correlation engine), and some kind of analytics capability. Depending on the vendor some do one or the other better (and of course we all package and price them all differently).
Most of the “older” vendors started out as correlation vendors targeting F2000 enabling real-time threat detection in the SOC. The analytics piece was a bit of a secondary requirement, and secure, long term storage not so much as all. The Gartner guys called these vendors SEM or Security Event Management providers which is instructive – event to me implies a fairly short-term context. Since 2000, the analytics and reporting capability has become increasingly important as compliance has become the big driver. Many of the newer vendors in the SIEM market focused on solving the compliance use-case and these solutions typically featured secure and long term storage, compliance packs, good reporting etc. These new vendors were sometimes referred to as SIM or Security Information Management. These vendors fit a nice gap left in the capabilities of the correlation vendors. Some of the newer vendors like Loglogic made a nice business focusing on selling log collection solutions to large enterprise – typically as an augmentation to an existing SIM. Some of these newer vendors like Prism , focused on mid tier and provided lower-cost, easy to deploy solutions that did both compliance as well as provided real-time capabilities to companies that did not that did have the money or the people to afford the enterprise correlation guys. These companies had a compliance requirement and wanted to get some security improvements as well.
But really all of us, SIM/SEM, enterprise, mid-tier, Splunk were/are collecting the same darn logs – we were just doing slightly different things with them. So of course the correlation guys have released log aggregators (like Arcsight Logger), and the Log Management vendors have added or always had real-time capability. And at the end of the day we ended up getting lumped into the SIEM bucket, and here we are.
For anyone with a SIEM requirement… You should understand what your business requirements are and then look long and hard at the vendor’s capability – preferably by getting them in house to do an evaluation in your own environment. Buying according to which one claims to do the most events per second or supports the most devices, or even the one has the most mindshare in the market is really short sighted. Nothing beats using the solution in action for a few weeks, and this is a classic “the devil is in the details…”
So, back to Anton’s statement (finally!). When Anton refers to “collect and parse some logs” that is the typical simplification of the real-time security use case – you are looking for patterns of behavior and only certain logs are important because you are looking for attack patterns in specific event types.
The “collect and index all the logs” is the typical compliance use case. The indexing is simply the method of storing for efficient retrieval during analysis – again a typical analytics requirement.
Another side note. The importance of collecting all the logs is a risk assessment that the end user should do. Many people tend to collect “all” the logs because they don’t know what is important and it is deemed the easiest and safest approach. The biggest beneficiaries of that approach are the SIEM appliance vendors as they get to sell another proprietary box when the event volume goes through the roof, and of course those individuals that hold stock in EMC. Despite compression, a lot of logs is still a lot of logs!
Increasingly, customers I talk to are making a conscious decision to not collect or retain all the logs as there is overhead and a security risk in storing logs as they consider them sensitive data. Quite frankly you should look for a vendor that allows you to collect all the data, but also provides you with some fairly robust filtering capability in case you don’t want or need to. This is a topic for another day, however.
So when Anton claims that you need to do both – if you want to do real-time analysis as well as forensics and compliance -then yes, I agree, but when he claims the “collect and parse” is the typical SIEM approach then that is an overgeneralization, which really was the purpose of my post to begin with. I tend not to favor them as they simply misinform the reader.
– Steve Lafferty