Download the Report
Advanced Threat Protection
Download the Datasheet
Let's Go Threat Hunting: Gain Visibility and Insight into Potential Threats and Risks
Download the Whitepaper
Bracing for the Tidal Wave of Data Privacy Compliance in America
View Recent Catches
Catch More Threats
January 18, 2008
Selection criteria for pragmatic Log Management As we wrap up our 6-month tour of Pragmatic Log Management, let’s focus on what are some of the important buying criteria that you should consider when looking at log management offerings. Ultimately, a lot of the vendors in the space have done a good job of making all the products sound the same. So really deciphering what differentiates one product versus another is an art form.
January 17, 2008
Well this is starting to turn into a bit of a bun fight, which was not my intent as I was merely attempting to clarify some incorrect claims in the Splunk post. Well, now Anton has weighed in with his perspective:
“I think this debate is mostly about two approaches to logs: collect and parse some logs (typical SIEM approach) vs collect and index all logs (like, ahem, “IT search”).”
Yes, he is right in a sense. It is a great concise statement, but the statement needs to be looked at as there are some nuances here that need to be understood.
Just a bit of level-set before going to work on the meat of the statement.
Most SIEM solutions today have a real-time component, (typically a correlation engine), and some kind of analytics capability. Depending on the vendor some do one or the other better (and of course we all package and price them all differently).
Most of the “older” vendors started out as correlation vendors targeting F2000 enabling real-time threat detection in the SOC. The analytics piece was a bit of a secondary requirement, and secure, long term storage not so much as all. The Gartner guys called these vendors SEM or Security Event Management providers which is instructive – event to me implies a fairly short-term context. Since 2000, the analytics and reporting capability has become increasingly important as compliance has become the big driver. Many of the newer vendors in the SIEM market focused on solving the compliance use-case and these solutions typically featured secure and long term storage, compliance packs, good reporting etc. These new vendors were sometimes referred to as SIM or Security Information Management. These vendors fit a nice gap left in the capabilities of the correlation vendors. Some of the newer vendors like Loglogic made a nice business focusing on selling log collection solutions to large enterprise – typically as an augmentation to an existing SIM. Some of these newer vendors like Prism , focused on mid tier and provided lower-cost, easy to deploy solutions that did both compliance as well as provided real-time capabilities to companies that did not that did have the money or the people to afford the enterprise correlation guys. These companies had a compliance requirement and wanted to get some security improvements as well.
But really all of us, SIM/SEM, enterprise, mid-tier, Splunk were/are collecting the same darn logs – we were just doing slightly different things with them. So of course the correlation guys have released log aggregators (like Arcsight Logger), and the Log Management vendors have added or always had real-time capability. And at the end of the day we ended up getting lumped into the SIEM bucket, and here we are.
For anyone with a SIEM requirement… You should understand what your business requirements are and then look long and hard at the vendor’s capability – preferably by getting them in house to do an evaluation in your own environment. Buying according to which one claims to do the most events per second or supports the most devices, or even the one has the most mindshare in the market is really short sighted. Nothing beats using the solution in action for a few weeks, and this is a classic “the devil is in the details…”
So, back to Anton’s statement (finally!). When Anton refers to “collect and parse some logs” that is the typical simplification of the real-time security use case – you are looking for patterns of behavior and only certain logs are important because you are looking for attack patterns in specific event types.
The “collect and index all the logs” is the typical compliance use case. The indexing is simply the method of storing for efficient retrieval during analysis – again a typical analytics requirement.
Another side note. The importance of collecting all the logs is a risk assessment that the end user should do. Many people tend to collect “all” the logs because they don’t know what is important and it is deemed the easiest and safest approach. The biggest beneficiaries of that approach are the SIEM appliance vendors as they get to sell another proprietary box when the event volume goes through the roof, and of course those individuals that hold stock in EMC. Despite compression, a lot of logs is still a lot of logs!
Increasingly, customers I talk to are making a conscious decision to not collect or retain all the logs as there is overhead and a security risk in storing logs as they consider them sensitive data. Quite frankly you should look for a vendor that allows you to collect all the data, but also provides you with some fairly robust filtering capability in case you don’t want or need to. This is a topic for another day, however.
So when Anton claims that you need to do both – if you want to do real-time analysis as well as forensics and compliance -then yes, I agree, but when he claims the “collect and parse” is the typical SIEM approach then that is an overgeneralization, which really was the purpose of my post to begin with. I tend not to favor them as they simply misinform the reader.
– Steve Lafferty
January 15, 2008
I posted a commentary a while ago on a post by Raffy, who discussed the differences between IT Search (or Splunk, as they are the only folks I know who are trying to make IT Search a distinct product category) and SIEM. Raffy posted a clarification in response to my commentary. What I was pointing out in my original post was that all vendors, SIEM or Splunk, are loading the same standard formats – and what needed to be maintained was, in fact, not the basic loader, but the knowledge (the prioritization, the reports, alerts etc) of what to do with all that data. And the knowledge is a core part of the value that SIEM solutions provide. On that we seem to agree. And as Raffy points out, the Splunk guys are busily beavering away producing knowledge as well. Although be careful — you may wake up one morning and find that you have turned into a SIEM solution!
Sadly the concept of the bad “parser” or loader continues to creep in – Splunk does not need it which is good. SIEM systems do, which is bad.
I am reasonably familiar with quite a few of the offerings out there for doing SIEM/log management, and quite frankly, outside of perhaps Arcsight (I am giving Raffy the benefit of the doubt here as he used to work at Arcsight, so he would know better than I), I can’t think of a vendor that writes proprietary connectors or parsers to simply load raw data. We (EventTracker) certainly don’t. From an engineering standpoint, when there are standard formats like Windows EVT, Syslog and SNMP it would be pretty silly to create something else. Why would you? You write them only when there is a proprietary API or data format like Checkpoint where you absolutely have to. No difference here. I don’t see how this parser argument is in any way, shape or form indicative of a core difference.
I am waiting on Raffy’s promised follow-on post with some anticipation – he states that he will explain the many other differences between IT Search and SIEM, although he prefaced some of it with the Splunk is Google-like and Google is God ergo…
Google was/is a gamechanging application, and there are a number of things that made them unique – easy to use, fast, and the ability to return valuable information. But what made Google a gazillion dollar corporation is not the Natural Language Search – I mean, that is nice but simple “and” “or” “not” is really not a breakthrough in the grand scheme of things. Now the speed of the Google search, that is pretty impressive – but that is due to enormous server farms so that is mechanical. Most of the other early internet search vendors had both these capabilities. My early personal favorite was AltaVista, but I switched a long time ago to Google.
Why? What absolutely blew my socks off and continues to do so to this day about Google is their ability to figure out which of the 10 millions entries for my arbitrary search string are the ones I care about, and providing them, or some of them, to me in the first hundred entries. They find the needle in the proverbial haystack. Now that is spectacular (and highly proprietary) and the ranking algorithm is a closely guarded secret I hear. Someone once told me that lot of it is done around ranking from the millions of people doing similar searches – it is the sheer quantity of search users on the internet. The more searches they conduct the better they become. I can believe that. Google works because of the quantity of data and because the community is so large – and they have figured out a way to put the two together.
I wonder how an approach like that would work however, when you have a few admins searching a few dozen times a week. Not sure how that will translate, but I am looking forward to finding out!
January 14, 2008
Mid-size organizations continue to be tossed on the horns of the Security/Compliance dilemma. Is it reasonable to consider regulatory compliance a natural benefit of a security focused approach?
Consider why regulatory standards came into being in the first place. Some like PCI-DSS, FISMA and DCID/6 are largely driven by security concerns and the potential for loss of high value data. Others like Sarbanes-Oxley seek to establish responsibility for changes and are an incentive to blunt the insider threat. Vendor provided Best Practices have come about because of concerns about “attack surface” and “vulnerability”. Clearly security issues.
While large organizations can establish dedicated “compliance teams”, the high cost of such an approach precludes it as an option for mid tier organizations. If you could only have one team and effort and had to choose, its a no-brainer. Security wins. Accordingly, such organizations naturally consider that compliance efforts are folded into the security teams and budgets.
While this is a reasonable approach, recognize that some compliance regulations are more auditor and governance related and a strict security view is a misfit. An adaptation, is to transition the ownership of tools and their use from the security to the operational team.
The classic approach for mid-size organizations to the dilemma — start as a security focused initiative, transition to the operations team.
January 10, 2008
Did you know? PCI-DSS forbids storage of CVV
A recent Ecommerce Checkout Report stated that “55% of the Top 100 retailers require shoppers to give a CVV2, CID, or CVC number during the checkout process.” That’s great for anti-fraud and customer verification purposes, but it also creates a high level of risk around inappropriate information storage.
To clarify, the CVV (Card Verification Value) is actually a part of the of the magnetic track data in the card itself. CVV2/CVC2/CID information is the 3 or 4 digit code on the back of the signature strip of a credit or debit card (or on the front of American Express cards).
The Payment Card Industry Data Security Standard (PCI DSS) clearly states* that there are three pieces of data that may not be stored after authorization is complete (regardless of whether you are handling card-present or card-not-present transactions):
January 03, 2008
Came across an interesting blog entry by Raffy at Splunk. As a marketing guy I am jealous as they are generating a lot of buzz about “IT Search”. Splunk has led a lot of people that are knowledgeable to wonder how this is something different than what all the log management vendors have been providing.
Still, while Raffy touched on what is one of the real differences between IT Search and Log Management, he left a few of the salient points out in the discussion of a “connector” and how a connector puts you at the mercy of the vendor to produce the connector, and what happens when the log data format changes?
Let’s step back — at the most basic level in log management (or IT Search for that matter) you have to do 2 fundamental things, you have to help people 1) collect logs from a mess of different sources, and 2) help them do interesting things with them. The “do interesting things” includes the usual stuff like correlation, reporting, analytics, secure storage etc.
You can debate fiercely the relative robustness of collection architectures – and there are a number of differences if you are evaluating vendors you should look at. For the sake of this discussion however most any log management system worthy of its salt will have a collection mechanism for all the basic methods – if you handle (in no particular order) ODBC, Syslog, read the Windows event format, maybe SNMP, throw in a file reader for custom applications, well you have the collection pretty much covered..
The reality is, as Raffy points out, there are a few totally proprietary access methods to get logs like Checkpoint. It is far easier for a system or application vendor to write one of the standard methods. So getting access to the raw logs in some way, shape or form is straightforward.
So here is where the real difference between IT search and Log Management begins.
Raffy mentions a small change in the syslog format causing the connector to break. Well syslog is a standard so if it would not break any standard syslog receiver, what it actually meant is that the syslog message has not changed but the content had.
Log Management vendors provide “knowledge” about the logs beyond simple collection.
Let’s make an analogy – IT Search is like the NSA collecting all of the radio transmissions in all of the languages in the entire world. Pretty useful. However, if you want to make sense of the Russian ones you hire your Russian expert, Swahili, your Swahili expert and so on. You get the picture.
Logs are like languages — the fact of the matter is the only thing that is the same about logs is that the content is all different. If you happen to be an uber-log weenie and you understand the format of 20 different logs, simple IT Search is really powerful. If you are only concerned about a single log format like Windows (although Windows by itself is pretty darn arcane), IT Search can be a powerful tool. If you are like the rest of us whose entire lives are not spent understanding multiple log formats, or get really rusty because many of us often don’t get exposed to certain formats all the time, well, it gets a little harder. What Log Management vendors do is to help you ( as the user) out with the knowledge – rules that categorize important event logs from unimportant ones, alerts, reports that are configured to look for key words in the different log streams. How this is done is different from vendor to vendor – some normalize, i.e. translate logs into a standard canonical format, others don’t. And this knowledge is what can conceivably get out of date.
In IT Search, there is no possibility for anything to get out of date mainly because there is no knowledge, only the ability to search the log in its native format. Finally, if a Log Management vendor is storing the original log and you can search on it, your Log Management application gives you all the capability of IT Search.
Seems to me IT Search is much ado about nothing…
The rational approach to pretty much any IT project is the same…define the requirements, solutions, do a pilot project, implement/refine and operationalize.
Often you win or lose early at requirements gathering time.
So what should you keep in mind while defining requirements for a Security Information and Event Management (SIEM) project?
Look at it in two ways:
Well, for ourselves, we see a clear increase in attacks from the outside. These are increasingly sophisticated (which is expected I guess since it’s an arms race) and disturbingly indiscriminate. Attacks seem to be launched merely because we exist on the Internet and have connectivity and disconnecting from the Internet is not an option.
We see attacks that we recognize immediately (100 login failures between 2-3 AM). We see attacks that are not so obvious (http traffic from a server that should not have any). And we see the almost unrecognizable zero-day attacks. These appear to work their way through our defenses and manifest as subtle configuration changes.
Of the expert prognosticators, we (like many others) find that the PCI-DSS standard is a good middle ground between loosely defined guidelines (HIPAA anyone?) and vendor “Best Practices”.
The interesting thing is that PCI-DSS requirements seem to match what we see. Section 10 speaks to weaponry that can detect (and ideally remediate) the attacks and Section 11.5 speaks to the ability to detect configuration changes.
Its all SIEM, in the end.
So what are the requirements for SIEM?
As the saying goes — well begun is half done. Get your requirements correct and improve your odds of success.