Do you need a Log Whisperer?

Quick, take a look at these four log entries

  1. Mar 29 2014 09:54:18: %PIX-6-302005: Built UDP connection for faddr 198.207.223.240/53337 gaddr 10.0.0.187/53 laddr 192.168.0.2/53
  2. Mar 12 12:00:08 server2 rcd[308]: id=304 COMPLETE ‘Downloading https://server2/data/red-carpet.rdf’ time=0s (failed)
  3. 200.96.104.241 – – [12/Sep/2006:09:44:28 -0300] “GET /modules.php?name=Downloads&d_op=modifydownloadrequest&%20lid=-%20UNION%20SELECT%200,username,user_id,
    user_password,name,%20user_email,user_level,0,0%20FROM%20nuke_users HTTP/1.1” 200 9918 “-”
    “Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)”
  4. Object Open:
    Object Server: Security
    Object Type: File
    Object Name: E:\SALES RESOURCE\2010\Invoice 2010 7-30-2010.xls
    Handle ID: –
    Operation ID: {0,132259258}
    Process ID: 4
    Image File Name:
    Primary User Name: ACCOUNTING$
    Primary Domain: PMILAB
    Primary Logon ID: (0x0,0x3E7)
    Client User Name: Aaron
    Client Domain: CONTOSO
    Client Logon ID: (0x0,0x7E0808E)
    Accesses: DELETE
    READ_CONTROL
    ACCESS_SYS_SEC
    ReadData (or ListDirectory)
    ReadEA
    ReadAttributes
    Privileges: –
    Restricted Sid Count: 0
    Access Mask: 0x1030089

Any idea what they mean?

No? Maybe you need a Log Whisperer — someone who understands these things.

Why, you ask?
Think security — aren’t these important?

Actually #3 and #4 are a big deal and you should be jumping on them, whereas #1 and #2 are routine — nothing to get excited about.

Here is what they mean:

  1. A Cisco firewall allowed a packet through (not a “connection” because it’s a UDP packet — never mind what the text says); a parsing sketch for this message follows the list
  2. A software update attempt by an OpenSuSE Linux machine; some packages failed to update.
  3. A SQL injection attempt on PHP Nuke
  4. Access denied to a shared resource in a Windows environment
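
If you would rather not need a Log Whisperer for entry #1, the fields can be pulled apart programmatically. Below is a minimal, hypothetical Python sketch (not how any shipping product parses it) that extracts the foreign, global and local address/port pairs from the Cisco PIX 302005 message shown above.

  import re

  # Hypothetical parser for the Cisco PIX message id 302005 shown in example #1.
  # Field names follow Cisco's faddr/gaddr/laddr convention from the sample line.
  PIX_302005 = re.compile(
      r"%PIX-(?P<severity>\d)-302005: Built UDP connection for "
      r"faddr (?P<faddr>[\d.]+)/(?P<fport>\d+) "
      r"gaddr (?P<gaddr>[\d.]+)/(?P<gport>\d+) "
      r"laddr (?P<laddr>[\d.]+)/(?P<lport>\d+)"
  )

  line = ("Mar 29 2014 09:54:18: %PIX-6-302005: Built UDP connection for "
          "faddr 198.207.223.240/53337 gaddr 10.0.0.187/53 laddr 192.168.0.2/53")

  match = PIX_302005.search(line)
  if match:
      print(match.groupdict())   # e.g. {'severity': '6', 'faddr': '198.207.223.240', ...}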

Log Whisperers are the heart of our SIEM Simplified service. They are the experts who review logs, determine what they mean and provide remediation recommendations in simple, easy-to-understand language.

Not to be confused with these guys.

And no, they don’t look like Robert Redford either. You are thinking about the Horse Whisperer.

We don’t need no stinkin Connectors

#36 on the American Film Institute list of Top Movie Quotes is “Badges? We don’t need no stinkin badges” which has been used often (e.g., Blazing Saddles). The equivalent of this in the log management universe is a “Connector”. We are often asked how many “Connectors” we have readily available or how long it takes to develop a Connector.

These questions stem from a model used by products such as ArcSight, which depend on Early Binding. In an earlier era of computing, Early Binding was necessary because the compiler resolved every call at compile time; it could not defer creating an entry in the virtual method table until run time. It has the advantage of being efficient, an important consideration when CPU and memory are in very short supply, as they were years ago.

Just-in-time compiled languages such as .NET or Java adopt Late Binding, where the v-table is computed at run time. Years ago, Late Binding had negative connotations in terms of performance, but that has not been true for at least 20 years.

Early Binding requires a fixed schema to be mandated for all possible entries and all input to be “normalized” to that schema. The benefit of the fixed plan is efficient output, since the data is already normalized. While that may make sense for compilers, whose input follows a formal grammar, it makes almost no sense in the log management universe, where the input is log data from sources that adopt no standardization at all. The downside of such an approach is that every new log source requires a “Connector” to normalize it to the fixed schema. Another consideration is that outputs vary greatly depending on usage; there are many possible uses for the data, limited only by the user’s imagination. The Early Binding model, however, is designed with fixed outputs in mind. These disadvantages limit such designs.

In contrast, EventTracker uses Late Binding, where the meaning of tokens can be assigned at output (run) time rather than being fixed at receive time. Thus new log formats do not need a “Connector” to be available at ingest time. The desired output format can be specified at search or report time for easy viewing. This requires somewhat greater computing capacity, with Moore’s Law coming to the rescue. Late Binding is the primary advantage of EventTracker’s “Fast In, Smart Out” architecture.
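
To make the Early vs. Late Binding distinction concrete, here is a toy Python sketch of the “Fast In, Smart Out” idea. It is an illustration of the concept only, not EventTracker’s actual implementation: raw logs are stored untouched at ingest, and the regex that assigns meaning to tokens (the “binding”) is supplied only when a search or report runs.

  import re

  RAW_STORE = []                    # ingest side: append raw lines, no schema required

  def ingest(line):
      RAW_STORE.append(line)        # "Fast In": nothing is parsed or normalized here

  def search(keyword, extract):
      # "Smart Out": the field extraction is chosen at query time, so a brand-new
      # log format needs no connector to have been written before ingest.
      for line in RAW_STORE:
          if keyword in line:
              m = extract.search(line)
              if m:
                  yield m.groupdict()

  ingest("Mar 12 12:00:08 server2 sshd[308]: Failed password for admin from 10.1.2.3")
  failed_login = re.compile(r"Failed password for (?P<user>\S+) from (?P<src>[\d.]+)")
  print(list(search("Failed password", failed_login)))   # [{'user': 'admin', 'src': '10.1.2.3'}]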

Introducing EventTracker Log Manager

The IT team of a Small Business has it the worst. Just 1-2 administrators to keep the entire operation running, which includes servers, workstations, patching, anti-virus, firewalls, applications, upgrades, password resets…the list goes on. It would be great to have 25 hours in a day and 4 hands per admin just to keep up. Adding security or compliance demands to the list just makes it that much harder.

The path to relief? Automation, in one word. Something that you can “fit-and-forget”.

You need a solution that gathers all security information from around the network (platforms, network devices, apps, etc.) and knows what to do with it. One that retains it all efficiently and securely in case it is needed later for analysis, displays it in a dashboard for you to examine at your convenience, alerts you via e-mail/SMS etc. only if absolutely necessary, indexes it all for fast search, and finds new or out-of-the-ordinary patterns by itself.

And you need it all in a software-only package that is quickly installed on a workstation or server. That’s what I’m talking about. That’s EventTracker Log Manager.

Designed for the 1-2 sys admin team.
Designed to be easy to use, quick to install and deploy.
Based on the same award-winning technology to which SC Magazine awarded a perfect 5-star rating in 2013.

How do you spell relief? E-v-e-n-t-T-r-a-c-k-e-r  L-o-g  M-a-n-a-g-e-r.
Try it today.

2013 Security Resolutions

A New Year’s resolution is a commitment that a person makes to one or more personal goals, projects, or the reforming of a habit.

  • The ancient Babylonians made promises to their gods at the start of each year that they would return borrowed objects and pay their debts.
  • The Romans began each year by making promises to the god Janus, for whom the month of January is named.
  • In the medieval era, knights took the “peacock vow” at the end of the Christmas season each year to re-affirm their commitment to chivalry.

Here are mine:

1)      Shed those extra pounds of logs:

Log retention is always a challenge — how much to keep, and for how long? Keep them too long and they just eat away at storage space. Pitch them mercilessly and you will keep wondering if you will need them. For guidance, look to any regulation that may apply. PCI-DSS says 365 days, for example; NIST 800-92 unhelpfully says “This should be driven primarily by organizational policies” and then goes on to classify logs into system, infrastructure and application levels. Bottom line: use your judgment because you know your environment best.

2)      Exercise your log analysis muscles regularly

As the Verizon Data Breach report says year in and year out, the bad guys are hoping that you are not collecting logs, and if you are, that you are not reviewing them. More than 96% of all attacks were not highly difficult and were avoidable (at least in hindsight) without difficult or expensive countermeasures. Easier said than done, isn’t it? Consider co-sourcing the effort.

3)      Play with existing toys before buying new ones

Know what configuration assessment is? It’s applying secure configurations to existing equipment. Agencies such as NIST, CIS and DISA provide detailed guidelines. Vendors such as Microsoft provide hardening guides. It’s a question of applying them to existing hardware. This reduces attack surface and contributes greatly to a more secure posture. You already have the equipment, just apply the secure configuration.  EventTracker can help measure results.

Happy New Year.

Seven deadly sins of SIEM

1) Lust: Be not easily lured by the fun, sexy demo. It always looks fantastic when the sales guy is driving. How does it work when you drive? Better yet, on your data?

2) Gluttony: Know thy log volume. When thee consumeth mucho more raw logs than thou expected, thou shall pay and pay dearly. More SIEM budgets die from log gluttony than starvation.

3) Greed: Pure pursuit of perfect rules is perilous. Pick a problem you’re passionate about, craft monitoring, and only after it is clearly understood do you automate remediation.

4) Sloth: The lazy shall languish in obscurity. Toilers triumph. Use thy SIEM every day, acknowledge the incidents, review the log reports. Too hard? No time, you say? Consider SIEM Simplified.

5) Wrath: Don’t get angry with the naysayers. Attack the problem instead. Remember “those who can, do; those who cannot, criticize.” Democrats: Yes we can v2.0.

6) Envy: Do not copy others blindly out of envy for their strategy. Account for your differences (but do emulate best practices).

7) Pride: Hubris kills. Humility has a power all its own. Don’t claim 100% compliance or security. Rather you have 80% coverage but at 20% cost and refining to get the rest. Republicans: So sayeth Ronald Reagan.

Trending Behavior – The Fastest Way to Value

Our  SIEM Simplified  offering is manned by a dedicated staff overseeing the EventTracker Control Center (ECC). When a new customer comes aboard, the ECC staff is tasked with getting to know the new environment, identifying which systems are critical, which applications need watching, and what access controls are in place, etc. In theory, the customer would bring the ECC staff up to speed (this is their network, after all) and keep them up to date as the environment changes. Reality bites and this is rarely the case. More commonly, the customer is unable to provide the ECC with anything other than the most basic of information.

How then can the ECC “learn” and why is this problem interesting to SIEM users at large?

Let’s tackle the latter question first. A problem facing new users of a SIEM installation is that they get buried trying to learn the baseline pattern of the enterprise (the very same problem the ECC faces). See this article from a practitioner.

So it’s the same problem. How does the ECC respond?

Short answer: By looking at behavior trends and spotting the anomalies.

Long answer: The ECC first discovers the network and learns the various device types (OS, application, network devices etc.). This is readily automated by the StatusTracker module. If we are lucky, we get to ask the customer specific questions to bolster our understanding. Next, based on this information and the available knowledge packs within EventTracker, we schedule suitable daily and weekly reports and configure alerts. So far, so good, but really no cigar. The real magic lies in taking these reports and creating flex reports, where we control the output format to focus on parameters of value that are embedded within the description portion of the log messages (always the case for syslog-formatted messages, and often for Windows-style events as well). When these parameters are trended in a graph, all sorts of interesting information emerges.
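
As an illustration of the kind of flex-report trending described above (the field being extracted and the log format are made up for the example), the sketch below pulls a parameter out of the free-text description of each record and counts it per user per day; graphing those counts is where the anomalies show up.

  import re
  from collections import Counter
  from datetime import datetime

  USER_RE = re.compile(r"user[=:]\s*(?P<user>\S+)", re.IGNORECASE)

  def trend_by_day(records):
      # records: iterable of (timestamp, description) pairs
      counts = Counter()
      for ts, desc in records:
          m = USER_RE.search(desc)
          if m:
              counts[(ts.strftime("%Y-%m-%d"), m.group("user"))] += 1
      return counts

  sample = [
      (datetime(2014, 3, 1, 8, 5), "Logon failure user=Aaron reason=bad password"),
      (datetime(2014, 3, 1, 8, 6), "Logon success user=Aaron"),
  ]
  print(trend_by_day(sample))       # Counter({('2014-03-01', 'Aaron'): 2})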

In one case, we saw that a particular group of users was typing passwords into the username field before logging in correctly, far more often than usual; you see a failed login followed by a successful one, and combining the two yields both the username and the password. In another case, we saw repeated failed logons after hours from a critical IBM i-Series machine and hit the panic button. It turned out someone had left a book on the keyboard.

Takeaway: Want to get useful value from your SIEM but don’t have gobs of time to configure or tune the thing for months on end? Think trending behavior, preferably auto-learned. It’s what sets EventTracker apart from the search engine based SIEMs or from the rules based products that need an expen$ive human analyst chained to the product for months on end. Better yet, let the ECC do the heavy lifting for you. SIEM Simplified, indeed.

SIEM Fevers and the Antidote

SIEM Fever is a condition that robs otherwise rational people of common sense in regard to adopting and applying Security Information and Event Management (SIEM) technology for their IT Security and Compliance needs. The consequences of SIEM Fever have contributed to misapplication, misuse, and misunderstanding of SIEM with costly impact. For example, some organizations have adopted SIEM in contexts where there is no hope of a return on investment. Others have invested in training and reorganization but use or abuse the technology with new terminology taken from the vendor dictionary.   Alex Bell of Boeing first described these conditions.

Before you get your knickers in a twist believing this is an attack on SIEM that must be avenged with flaming commentary against its author, fear not. There are real IT Security and Compliance efforts wasting real money and real time by misusing SIEM in a number of common ways. Let’s review these types of SIEM Fevers so they can be recognized and treated.

Lemming Fever: A person with Lemming Fever knows about SIEM simply based upon what he or she has been told (be it true or false), without any first-hand experience or knowledge of it themselves. The consequences of Lemming Fever can be very dangerous if infectees have any kind of decision making responsibility for an enterprise’s SIEM adoption trajectory. The danger tends to increase as a function of an afflictee’s seniority in the program organization due to the greater consequences of bad decision making and the ability to dismiss underling guidance. Lemming Fever is one of the most dangerous SIEM Fevers as it is usually a precondition to many of the following fevers.

Easy Button Fever: This person believes that adopting SIEM is as simple as pressing Staples’ Easy Button, at which point their program magically and immediately begins reaping the benefits of SIEM as imagined during the Lemming Fever stage of infection. Depending on the Security Operations Center (SOC) methodology, however, the deployment of SIEM could mean significant change. Typically, these people have little to no idea about the features which are necessary for delivering SIEM’s productivity improvements, or about the possible inapplicability of those features to their environment.

One Size Fits All Fever: Victims of One Size Fits All Fever believe that the same SIEM model is applicable to any and all environments with a return on investment being implicit in adoption. While tailoring is an important part of SIEM adoption, the extent to which SIEM must be tailored for a specific environment’s context is an important barometer of its appropriateness. One Size Fits All Fever is a mental mindset that may stand alone from other Fevers that are typically associated with the tactical misuse of SIEM.

Simon Says Fever: Afflictees of Simon Says Fever are recognized by their participation in SIEM-related activities without the slightest idea as to why those activities are being conducted, or why they are important other than that they are included in some “checklist”. The most common cause of this Fever is failing to tie all log and incident review activities to adding value, and falling into a comfortable, robotic regimen that is merely an illusion of progress.

One-Eyed King Fever: This Fever has the potential to severely impact the successful adoption of SIEM and occurs when the SIEM blind are coached by people with only a slightly better understanding of SIEM. The most common symptom occurring in the presence of One-Eyed King Fever is failure to tailor the SIEM implementation to its specific context, or the failure of a coach to recognize and act on a low probability of return on investment as it pertains to an enterprise’s adoption.

The Antidote: SIEM doesn’t cause the Fevers previously described; people do. Whether these people are well intentioned, have studied at the finest schools, or have high IQs, they are typically ignorant of SIEM in many dimensions. They have little idea about the qualities of SIEM which are the bases of its advertised productivity-improving features, they believe that those improvements are guaranteed by merely adopting SIEM, or they have little idea that the extent of SIEM’s ability to deliver benefit is highly dependent upon program-specific context.

The antidote for the many forms of SIEM Fever is to educate. Unfortunately, many of those who are prone to the aforementioned SIEM infections are most desperately in need of such education, are often unaware of what they don’t know about SIEM, are unreceptive to learning about what they don’t know, or believe that those trying to educate them are simply village idiots who have not yet seen the brightly burning SIEM light.

While I’m being entirely tongue-in-cheek, the previously described examples of SIEM misuse and misapplication are real and occurring on a daily basis.   These are not cases of industrial sabotage caused by rogue employees planted by a competitor, but are instead self-inflicted and frequently continue even amidst the availability of experts who are capable of rectifying them.

Interested in getting help? Consider SIEM Simplified.

Surfing the Hype Cycle for SIEM

The Gartner hype cycle is a graphic “source of insight to manage technology deployment within the context of your specific business goals.” If you have already adopted Security Information and Event Management (SIEM) (aka log management) technology in your organization, how is that working for you? As a candidate, Reagan famously asked, “Are you better off than you were four years ago?”

Sadly, many buyers of this technology are wallowing in the “trough of disillusionment.” The implementation has been harder than expected, the technology more complex than demonstrated, the discipline required to use and tune the product is lacking, resource constraints and hiring freezes intervene, and the list goes on.

What next? Here are some choices to consider.

Do nothing: Perhaps the compliance check box has been checked off; auditors can be shown the SIEM deployment and sent on their way; the senior staff on to the next big thing; the junior staff have their hands full anyway; leave well enough alone.
Upside: No new costs, no disturbance in the status quo.
Downside: No improvements in security or operations; attackers count on the fact that even if you do collect log data, you will never really look at it.

Abandon ship: Give up on the whole SIEM concept as yet another failed IT project; the technology was immature; the vendor support was poor; we did not get resources to do the job and so on.
Upside: No new costs, in fact perhaps some cost savings from the annual maintenance, one less technology to deal with.
Downside: Naked in the face of attack or an auditor visit; expect an OMG crisis situation soon.

Try managed service: Managing a SIEM is 99% perspiration and 1% inspiration; offload the perspiration to a team that does this for a living; they can do it with discipline (their livelihood depends on it) and probably cheaper too (passing the savings on to you); you deal with the inspiration.
Upside: Security usually improves; compliance is not a nightmare; frees up senior staff to do other pressing/interesting tasks; cost savings.
Downside: Some loss of control.

Interested? We call it SIEM Simplified™.

Big Data Gotchas

Jill Dyche, writing in the Harvard Business Review, suggests that “the question on many business leaders’ minds is this: Does the potential for accelerating existing business processes warrant the enormous cost associated with technology adoption, project ramp up, and staff hiring and training that accompany Big Data efforts?”

A typical log management implementation, even in a medium enterprise, is usually a big data endeavor. Surprised? You should not be. A relatively small network of a dozen log sources easily generates a million log messages per day, with volumes of 50-100 million per day being commonplace. With compliance and security guidelines requiring that logs be retained for 12 months or more, pretty soon you have big data.
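
A quick back-of-the-envelope calculation shows how fast this adds up. Assuming an average raw message of roughly 500 bytes (an assumption; real averages vary widely by source), the arithmetic looks like this:

  msgs_per_day = 100_000_000        # upper end of the range cited above
  avg_bytes = 500                   # assumed average raw message size
  retention_days = 365              # e.g., PCI-DSS retention

  raw_tb = msgs_per_day * avg_bytes * retention_days / 1e12
  print(f"~{raw_tb:.1f} TB of raw log data per year")   # ~18.2 TB, before indexes or copies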

So let’s answer the question raised in the article:

Q1: What can’t we do today that Big Data could help us do?   If you can’t define the goal of a Big Data effort, don’t pursue it.

A1: Comply with regulations like PCI-DSS, SOX 404, and HIPAA etc.; be alerted to security problems in the enterprise; control data leakage via insecure endpoints; improve operational efficiency

Q2: What skills, technologies, and existing data development practices do we have in place that could help kick-start a Big Data effort? If your company doesn’t have an effective data management organization in place, adoption of Big Data technology will be a huge challenge.

A2: Absent a trained and motivated user of the power tool that is the modern SIEM, an organization that acquires such technology is consigning it to shelfware. Recognizing this as a significant adoption challenge in our industry, we offer Monitored SIEM as a service; the best way to describe this is SIEM simplified! We do the heavy lifting so you can focus on leveraging the value.

Q3: What would a proof-of-concept look like, and what are some reasonable boundaries to ensure its quick deployment? As with many other proofs of concept, the “don’t boil the ocean” rule applies to Big Data.

A3: The advantage of a software-only solution like EventTracker is that an on-premises trial is easy to set up. A virtual appliance with everything you need is provided; set it up as a VMware or Hyper-V virtual machine within minutes. Want something even faster? See it live online.

Q4: What determines whether we green light Big Data investment? Know what success looks like, and put the measures in place.

A4: Excellent point; success may mean continuous compliance;   a 75% reduction in cost of compliance; one security incident averted per quarter; delegation of log review to a junior admin.

Q5: Can we manage the changes brought by Big Data? With the regular communication of tangible results, the payoff of Big Data can be very big indeed.

A5: EventTracker includes more than 2,000 pre-built reports designed to deliver value to every interested stakeholder in the enterprise ranging from dashboards for management, to alerts for Help Desk staff, to risk prioritized incident reports for the security team, to system uptime and performance results for the operations folk and detailed cost savings reports for the CFO.

The old adage “If you fail to prepare, then prepare to fail” applies. Armed with these questions and answers, you are closer to gaining real value with Big Data.

Learning from JPMorgan

The single most revealing moment in the coverage of JPMorgan’s multibillion dollar debacle can be found in this take-your-breath-away passage from The Wall Street Journal: On April 30, associates who were gathered in a conference room handed Mr. Dimon summaries and analyses of the losses. But there were no details about the trades themselves. “I want to see the positions!” he barked, throwing down the papers, according to attendees. “Now! I want to see everything!”

When Mr. Dimon saw the numbers, these people say, he couldn’t breathe.

Only when he saw the actual trades — the raw data — did Mr. Dimon realize the full magnitude of his company’s situation. The horrible irony: The very detail-oriented systems (and people) Dimon had put in place had obscured rather than surfaced his bank’s horrible hedge.

This underscores the new trust versus due diligence dilemma outlined by Michael Schrage. Raw data can have enormous impact on executive perceptions that pre-chewed analytics lack.   This is not to minimize or marginalize the importance of analysis and interpretation; but nothing creates situational awareness faster than seeing with your own eyes what your experts are trying to synthesize and summarize.

There’s a reason why great chefs visit the farms and markets that source their restaurants:   the raw ingredients are critical to success — or failure.

We have spent a lot of energy in building dashboards for critical log data and recognize the value of these summaries; but while we should trust our data, we also need to do the due diligence.

Big Data – Does insight equal decision?

In information technology, big data consists of data sets that grow so large that they become awkward to work with using whatever database management tools are on hand. For that matter, how big is big? It depends on when you need to reconsider data management options; in some cases it may be 100 GB, in others, it may be 100 TB. So, following up on our earlier post about big data and insight, there is one more important consideration:

Does insight equal decision?

The foregone conclusion from big data proponents is that each nugget of “insight” uncovered by data mining will somehow be implicitly actionable and the end user (or management) will gush with excitement and praise.

The first problem is how can you assume that “insight” is actionable? It very well may not be, so what do you do then? The next problem is how can you convince the decision maker that the evidence constitutes an imperative to act? Absent action, the “insight” remains simply a nugget of information.

Note that management typically responds to “insight” with skepticism, seeing the message bearer as yet another purveyor of information (“insight”) insisting that this new method is the silver bullet, thereby adding to the workload.

Being in management myself, my team often comes to me with their little nuggets … some are gold, but some are chicken.   Rather than purvey insight, think about a recommendation backed up by evidence.

Silly human – logs are for machines (too)

Here is an anecdote from a recent interaction with an enterprise application in the electric power industry:

1. Dave the developer logs all kinds of events. Since he is the primary consumer of the log, the format is optimized for human-readability. For example:

02-APR-2012 01:34:03 USER49 CMD MOD0053: ERROR RETURN FROM MOD0052 RETCODE 59

Apparently this makes perfect sense to Dave:   each line includes a timestamp and some text.

2. Sam from the Security team needs to determine the number of daily unique users. Dave quickly writes a parser script for the log and schedules it. He also builds a little Web interface so that Sam can query the parsed data on his own. Peace reigns.

3. A few weeks later, Sam complains that the web interface is broken. Dave takes a look at the logs, only to realize that someone else has added an extra field to each line, breaking his custom parser (see the sketch after this list). He pushes a fix and tells Sam that everything is okay again. Instead of writing a new feature, Dave has to go back and fill in the missing data.

4. Every 3 weeks or so, repeat Step 3 as others add logs.
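
A sketch of the kind of positional parser Dave might have written is below (the log line is from the example above; the script itself is hypothetical). Splitting on whitespace and trusting field positions is exactly what breaks the moment someone inserts a new field.

  def parse(line):
      parts = line.split()
      return {
          "date": parts[0],
          "time": parts[1],
          "user": parts[2],                  # breaks if a field is ever added before it
          "module": parts[4].rstrip(":"),
      }

  line = "02-APR-2012 01:34:03 USER49 CMD MOD0053: ERROR RETURN FROM MOD0052 RETCODE 59"
  print(parse(line))   # {'date': '02-APR-2012', 'time': '01:34:03', 'user': 'USER49', 'module': 'MOD0053'}

The moral: logs written only for human eyes end up being re-parsed by machines anyway, so a little structure up front saves everyone the Step 3 treadmill.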

What is your maximum NPH?

In The Information Diet, Clay Johnson wrote, “The modern human animal spends upwards of 11 hours out of every 24 in a state of constant consumption. Not eating, but gorging on information … We’re all battling a storm of distractions, buffeted with notifications and tempted by tasty tidbits of information. And just as too much junk food can lead to obesity, too much junk information can lead to cluelessness.”

Audit yourself and you may be surprised to find that you get more than 10 notifications per hour; they can be disruptive to your attention. I find myself trying hard (and often failing) to ignore the smartphone as it beeps softly to indicate a new distraction. I struggle to remain focused on the person in my office as the desktop tinkles for attention.

Should you kill off notifications though? Clay argues that you should and offers tools to help.

When designing EventTracker v7, minimizing notifications was a major goal. On Christmas Day in 2008, nobody was stirring, but the “alerts” console rang up over 180 items demanding review. It was obvious these were not “alerts.” This led to the “risk” score, which dramatically reduces notifications.

We know that not all “alerts” are equal: some merit attention before going to lunch, some before the end of the day, and some by the end of the quarter, budget permitting. A very rare few require us to drop the coffee mug and attend instantly. Accordingly, a properly configured EventTracker installation will rarely “notify” you; but when you need to know, that alert will come screaming for your attention.
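
As a toy illustration of how a risk score thins the notification stream (the formula, scales and threshold below are invented for the example, not EventTracker’s actual scoring), consider:

  def risk(severity, asset_criticality, confidence):
      # severity and criticality on 1-10 scales, confidence 0..1 (all assumed)
      return severity * asset_criticality * confidence

  NOTIFY_THRESHOLD = 50             # only events scoring above this page anyone

  events = [
      {"name": "antivirus signature update failed", "sev": 3, "crit": 2, "conf": 0.9},
      {"name": "admin logon from unknown country",  "sev": 8, "crit": 9, "conf": 0.8},
  ]
  for e in events:
      if risk(e["sev"], e["crit"], e["conf"]) >= NOTIFY_THRESHOLD:
          print("NOTIFY:", e["name"])       # only the second event makes the cut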

I am frequently asked about the maximum events per second that can be managed. I think I’ll begin to ask how many notifications per hour (NPH) the questioner can handle. I think Clay Johnson would approve.

Data, data everywhere but not a drop of value

The sailor in The Rime of the Ancient Mariner relates his experiences after a long sea voyage when his ship is blown off course:

“Water, water, every where,

And all the boards did shrink;

Water, water, every where,

Nor any drop to drink.”

An albatross appears and leads them out, but is shot by the Mariner and the ship winds up in unknown waters.  His shipmates blame the Mariner and force him to wear the dead albatross around his neck.

Replace water with data, boards with disk space, and drink with value and the lament would apply to the modern IT infrastructure. We are all drowning in data, but not so much in value. “Big data” are datasets that grow so large that managing them with on-hand tools is awkward. They are seen as the next frontier in innovation, competition, and productivity.

Log management is not immune to this trend. As the basic log collection problem (different sources, different protocols and different formats) has been resolved, we’re now collecting even larger datasets of logs. Many years ago we refuted the argument that log data belonged in an RDBMS, precisely because we saw the side problem of efficient data archival begin to overwhelm the true problem of extracting value from the data. As log data volumes continue to explode, that decision continues to be validated.

However, while storing raw logs in a database was not sensible, the power of databases for extracting patterns and value from data is well established. Recognizing this, EventVault Explorer was released in 2011. Users can extract selected datasets to their choice of external RDBMS (a datamart) for fuzzy searching, pivot tables etc. As was noted here, the key to managing big data is to personalize the results for maximum impact.

As you look under the covers of SIEM technology, pay attention to that albatross called log archives. It can lead you out of trouble, but you don’t want it around your neck.

SIEM and the Appalachian Trail

The Appalachian Trail is a marked hiking trail in the eastern United States extending between Georgia and Maine. It is approximately 2,181 miles long and takes about six months to complete. It is not a particularly difficult journey from start to finish; yet even so, completing the trail requires more from the hiker than just enthusiasm, endurance and will.

Likewise, a SIEM implementation can take from one to six months to complete (depending on the level of customization) and, like the Trail, it appears deceptively simple. It, too, can be filled with challenges that reduce even the most experienced IT manager to despair, and there is no shortage of implementations that have been abandoned or left incomplete. As with the Trail, a SIEM implementation requires thoughtful consideration.

1) The Reasons Why

It doesn’t take too many nights scurrying to find shelter in a lightning storm, or days walking in adverse conditions before a hiker wonders: Why am I doing this again? Similarly, when implementing any IT project, SIEM included, it doesn’t take too many inter-departmental meetings, technical gotchas, or budget discussions before this same question presents itself: Why are we doing this again?

All too often, we don’t have a compelling answer, or we have forgotten it. If you are considering a half-year-long backpacking trip through the woods, presumably there is a really good reason for it. In the same way, one embarks on a SIEM project with specific goals, such as regulatory compliance, IT security improvement or control of operating costs. Define the answer to this question before you begin the project and refer to it when the implementation appears to be derailing. This is the compass that should guide your way. Make adjustments as necessary.

2) The Virginia Blues

Daily trials can include anything from broken bones to homesickness; on the Appalachian Trail this low point tends to arrive about four to eight weeks into the journey, within the state lines of Virginia. Getting through requires not just perseverance but also an ability to adapt.

For a SIEM project, staff turnover, false positives, misconfigurations or unplanned explosions of data can potentially derail the project. But pushing harder in the face of distress is a recipe for failure. Step back, remind yourself of the reasons why this project is underway, and look at the problems from a fresh perspective. Can you be flexible? Can you find new avenues around the problems?

3) A Fresh Perspective

In the beginning, every day is chock full of excitement; every summit view or wild animal encounter is thrilling. But life in the woods becomes routine, and exhilaration eventually fades into frustration.

In  much the same way, after the initial thrill of installation and its challenges, the SIEM project devolves into a routine of discipline and daily observation across the infrastructure for signs of something amiss.

This is where boredom can set in, but the best defense against the lull that comes with the end of the implementation is to expect it. The journey is going to end, yet completing it does not happen when the project is implemented. Rather, when the installation is done, the real journey and the hard work begin.

New Bill Promises to Penalize Companies for Security Breaches

On September 22, the Senate Judiciary Committee approved Sen. Richard Blumenthal’s (D, Conn.) bill, the “Personal Data Protection and Breach Accountability Act of 2011,” sending it to the Senate floor. The bill would penalize companies for online data breaches and was introduced on the heels of several high-profile security breaches and hacks that affected millions of consumers. These included the Sony breach, which compromised the data of 77 million customers, and the DigiNotar breach, which resulted in 300,000 Google Gmail account holders having their mail hacked and read. The measure addresses companies that hold the personal information of more than 10,000 customers and requires them to put privacy and security programs in place to protect the information, and to respond quickly in the event of a security failure.

The bill proposes that companies be fined $5,000 per day per violation, with a maximum of $20 million per infringement. Additionally, companies that fail to comply with the data protection law (if it is passed) may be required to pay for credit monitoring services and be subject to civil litigation by the affected consumers. The bill also aims to increase criminal penalties for identity theft, as well as for crimes including installing a data collection program on someone’s computer and concealing any security breach in which personal data is compromised.

Key provisions in the bill include a process to help companies establish appropriate minimum security standards, notification requirements, information sharing after a breach, and company accountability.

While the intent of the bill is admirable, the problem is not a lack of laws to deter breaches, but the insufficient enforcement of these laws. Many of the requirements espoused in this new legislation already exist in many different forms.

SANS is the largest source for information security training and security certification, and their position is that we don’t need an extension to the Federal Information Security Management Act of 2002 (FISMA) or other compliance regulations, which have essentially encouraged a checkbox mentality: “I checked it off, so we are good.” This is the wrong approach to security but companies get rewarded for checking off criteria lists. Compliance regulations do not drive improvement. Organizations need to focus on the actual costs that can occur by not being compliant:

  • Loss of consumer confidence: Consumers will think twice before they hand over their personal data to an organization perceived to be careless with that information, which can lead to a direct hit to sales.
  • Increased costs of doing business as with PCI-DSS: PCI-DSS is one example where enforcement is prevalent, and the penalties can be stringent. Merchants who do not maintain compliance are subject to higher rates charged by VISA, MasterCard, etc.
  • Negative press: One need only look at the recent data breaches to consider the continuing negative impact on the compromised company’s brand and reputation. In one case (DigiNotar), the company folded.

The gap does not exist in the laws, but rather, in the enforcement of those laws. Until there is enforcement any legislation or requirements are hollow threats.

Lessons from Honeynet Challenge “Log Mysteries”

Ananth, from Prism Microsystems, provides an in-depth analysis of the Honeynet Challenge “Log Mysteries” and his thoughts on what it really means in the real world. EventTracker’s syslog monitoring capability protects your enterprise infrastructure from external threats.

SIEM or Log Management?

Mike Rothman of Securosis has a thread titled Understanding and Selecting SIEM/Log Management. He suggests both disciplines have fused and defines the holy grail of security practitioners as “one alert telling exactly what is broken”. In the ensuing discussion, there is a suggestion that SIEM and Log Mgt have not fused and there are vendors that do one but not the other.

After a number of years in the industry, I find myself uncomfortable with either term (SIEM or Log Mgt) as it relates to the problem the technology can solve, especially for the mid-market, our focus.

The SIEM term suggests it’s only about Security, and while that is certainly a significant use case, it’s hardly the only use for the technology. That said, if a user wishes to use the technology for only the security use case, fine, but that is not a reflection of the technology. Oh, and by the way, Security Information Management would perforce include other items such as change audit and configuration assessment data as well, which is outside the scope of “Log Management”.

The trouble with the term Log Management is that it is not tied to any particular use case, and that makes it difficult to sell (not to mention boring). Why would you want to manage logs anyway? Users only care about solutions to real problems they have, not generic “best practice” because Mr. Pundit says so.

SIEM makes sense as “the” use case for this technology as you go to large (Fortune 2000) enterprises, and here SIEM is often a synonym for correlation. But to do this in any useful way, you will need not just the box (real or virtual) but especially the expert analyst team to drive it, keep it updated and ticking. What is this analyst team busy with? Updating the rules to accommodate constantly changing elements (threats, business rules, IT components) to get that “one alert”. This is not like AntiVirus, where rule updates can happen directly from the vendor with no intervention from the admin/user. This is a model only large enterprises can afford.
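
To give a feel for what such a rule looks like (this one is hypothetical; the threshold, window and field names are assumptions, not any product’s shipped content), here is a small sketch that flags a successful logon preceded by five or more failures from the same source within ten minutes; this is exactly the sort of logic an analyst team keeps re-tuning as threats and business rules change.

  from collections import defaultdict, deque
  from datetime import timedelta

  WINDOW = timedelta(minutes=10)
  THRESHOLD = 5
  recent_failures = defaultdict(deque)      # source IP -> timestamps of recent failures

  def on_event(ts, src, outcome):
      q = recent_failures[src]
      while q and ts - q[0] > WINDOW:       # drop failures that fell outside the window
          q.popleft()
      if outcome == "failure":
          q.append(ts)
      elif outcome == "success" and len(q) >= THRESHOLD:
          return f"ALERT: possible brute force from {src} at {ts}"
      return None

  # Feed it one parsed logon event at a time, e.g.:
  # on_event(datetime(2010, 6, 1, 22, 5), "10.1.2.3", "failure")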

Some vendors suggest that you can reduce this to an analyst-in-a-box for the small enterprise, i.e., just buy my box, enable these default rules with minimal intervention, and bingo, you will be safe. All too commonly, the results are either irrelevant alerts or a magic box that acts as the dog in the night-time: a major reason for “pissed-off SIEM users”. And of course a dedicated analyst (much less a team) is simply not available.

This is not to say that the technology is useless absent the dedicated analyst, or that SIEM is a lost cause, but rather to paint a realistic picture: any “box” can only go so far by itself, and given the more-with-less needs of this mid-market, obsessing on SIEM features obscures the greater value offered by this technology.

Most Medium Enterprise networks are “organically grown architectures,” a response to business needs; there is rarely an overarching security model that covers the assets. Point solutions dominate, based on incidents, perceived threats, or specific compliance mandates. See the results of our virtualization survey for example. Given the resource constraints, the technology must have broad features beyond the (essential) security ones. The smarter the solution, the less smart the analyst needs to be, so really it’s a box-for-an-analyst (and of course all boxes now ought to be virtual).

It makes sense to ask what problem is solved, as this is the universe customers live in. Mike identifies reacting faster, security efficiency and compliance automation, to which I would add operations support and cost reduction. More specifically, across the board: show what is happening (track users, monitor critical systems/applications/firewalls, USB activity, database activity, hypervisor changes, physical equipment, etc.), show what has happened (forensics, reports, etc.) and show what is different (change audit).

So back to the question: what would you call such a solution? SIEM has been pounded by Gartner et al. into the budget line items of large enterprises, so it becomes easier to be recognized as a need. However, it is a limiting description. If I had only these two choices, I would have to favor Log Management, where one (essential) application is SIEM.

-Ananth

Sustainable vs. Situational Values

I am often asked: if Log Management is so important to the modern IT department, then how come more than 80% of the market that “should” have adopted it has not done so?

The cynic says that unless you have best practice as an enforced regulation (think PCI-DSS here), ’twill always be thus.

One reason I think this is so is that earlier generations never had power tools and found looking at logs to be hard and relatively unrewarding work. That perception is hard to overcome even in this day and age, after endless punditry and episode after episode have clarified the value.

Still resisting the value proposition? Then consider a recent column in the NY Times which quotes Dov Seidman, the C.E.O. of LRN, who describes two kinds of values: “situational values” and “sustainable values.”

The article is in the context of the current political situation in the US but the same theme applies to many other areas.

“Leaders, companies or individuals guided by situational values do whatever the situation will allow, no matter the wider interests of their communities. For example, a banker who writes a mortgage for someone he knows can’t make the payments over time is acting on situational values, saying: I’ll be gone when the bill comes due.”

At the other end, people inspired by sustainable values act just the opposite, saying: I will never be gone. “I will always be here. Therefore, I must behave in ways that sustain — my employees, my customers, my suppliers, my environment, my country and my future generations.”

We accept that your datacenter grew organically, that back-in-the-day there were no power tools and you dug ditches with your bare hands outside when it was 40 below and tweets were for the birds…but…that was then and this is now.

Get Log Management, it’s a sustainable value.

Ananth

100 Log Management uses #54 PCI Requirements V & VI

Last we looked at PCI-DSS Requirements 3 and 4, so today we are going to look at Requirements 5 and 6. Requirement 5 talks about using AV software, and log management can be used to monitor AV applications to ensure they are running and updated. Requirement 6 is all about building and maintaining a secure network for which log management is a great aid.

-By Ananth

Can you count on dark matter?

Eric Knorr, the Editor in Chief over at InfoWorld, has been writing about “IT Dark Matter,” which he defines as system, device and application logs. It turns out half of enterprise data is logs, or so-called Dark Matter. Not hugely surprising, and certainly good news for the data storage vendors and hopefully for SIEM vendors like us! He described these logs, or dark matter, as “widely distributed and hidden,” which got me thinking. The challenge with blogging is that we have to reduce fairly complex concepts and arguments into simple claims, otherwise posts end up being online books. The good thing about that simplification, however, is that it often gives a good opportunity to point out other topics of discussion.

There are two great challenges in log management. The first is being able to provide the tools and knowledge to make the log data readily available and useful, which leads to Eric’s comment on how Dark Matter is “hidden”: it is simply too hard to mine without some advanced equipment. The second challenge is preserving the record: making sure it is accurate, complete and unchanged. In Eric’s blog this Dark Matter is “widely distributed,” and there is an implied assumption that it is just there to be mined, that it will and does exist and, even more, that it is accurate. In reality it is, for all practical purposes, impossible to have logs widely distributed and expect them to be complete and accurate, and this fatally weakens their usefulness.

Let’s use a simple illustration we all know well in computer security: almost the first thing a hacker will do once they penetrate a system is shut down logging, or, as soon as they finish whatever they are doing, delete or alter the logs. Consider the analogy of video surveillance at your local 7/11. How useful would it be if you left the recording equipment out in the open at the cash register, unguarded? Not very useful, right? When you do nothing to secure the record, the value of the record is compromised, and the more important the record, the more likely it is to be compromised or simply deleted.
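
One common way to make the record tamper-evident (offered here purely as an illustration of the idea, not as how any particular product secures its archives) is to chain each entry to a hash of the one before it, so that deleting or editing a line breaks every hash that follows:

  import hashlib

  def chain(entries):
      prev = b""
      for entry in entries:
          digest = hashlib.sha256(prev + entry.encode()).hexdigest()
          yield entry, digest               # store the digest alongside the entry
          prev = digest.encode()

  for entry, digest in chain(["logon user=Aaron", "file deleted Invoice.xls"]):
      print(digest[:16], entry)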

This is not to imply that there are no useful nuggets to be mined even if the records are distributed. Without an attempt to secure and preserve them, though, logs become the trash heap of IT. Archeologists spend much of their time digging through the trash of civilizations to figure out how people lived. Trash is an accurate indication of what really happened simply because 1) it was trash and had no value and 2) no one worried that someone 1,000 years later was going to dig it up. It represents a pretty accurate, if fragmentary, picture of day-to-day existence. But don’t expect to find treasure, state secrets or individual records in the trash heap. The usefulness of the record is 1) a matter of luck that the record was preserved and 2) inversely proportional to the interest of the creating parties in modifying it.

 Steve Lafferty

Compromise to discovery

The Verizon Business Risk Team publishes a useful Data Breach Investigations Report drawn from over 500 forensic engagements over a four-year period.

The report describes a “Time Span of Breach” event broken into four stages of an attack. These are:

– Pre-Attack Research
– Point of Entry to Compromise
– Compromise to Discovery
– Discovery to Containment

The top two are under the control of the attacker, but the rest are under the control of the defender. Where log management is particularly useful is in discovery. So what does the 2008 version of the DBIR show about the time between Compromise and Discovery? Months. Sigh. Worse yet, in 70% of the cases, Discovery was the victim being notified by someone else.

Conclusion? Most victims do not have sufficient visibility into their own networks and equipment.

It’s not hard but it is tedious. The tedium can be relieved, for the most part, by a one-time setup and configuration of a log management system. Perhaps not the most exciting project you can think of but hard to beat for effectiveness and return on investment.

Ananth

Log Monitoring – real time or bust?

As a vendor of a log management solution, we come across prospects with a variety of requirements, consistent with a variety of needs and ways of approaching problems.

Recently, one prospect was very insistent on “real-time” processing. This is perfectly reasonable but as with anything, when taken to an extreme, can be meaningless. In this instance, the “typical” use case (indeed the defining one) for the log management implementation was “a virus is making its way across the enterprise; I don’t have time to search or refine or indeed any user (slow) action; I need instant notification and ability to sort data on a variety of indexes instantly”.

As vendors we are conditioned to think “the customer is always right” but I wonder if the requirement is reasonable or even possible. Given specifics of a scenario, I am sure many vendors can meet the requirement — but in general? Not knowing which OS, which attack pattern, how logs are generated/transmitted?

I was reminded again by this blog by Bejtlich in which he explains that “If you only rely on your security products to produce alerts of any type, or blocks of any type, you will consistently be “protected” from only the most basic threats.”

While real-time processing of logs is a perfectly reasonable requirement, retrospective security analysis is the only way to get a clue as to attack patterns and therefore a defense.

 Ananth

Extreme logging or Too Much of a Good Thing

Strict interpretations of compliance policy standards can lead you up the creek without a paddle. Consider two examples:

  1. From PCI-DSS comes the prescription to “Track & monitor all access to network resources and cardholder data”. Extreme logging is when you decide this means a db audit log larger than the db itself plus a keylogger to log “all” access.
  2. From HIPAA 164.316(b)(2) comes the Security Rule prescription to “Retain … for 6 years from the date of its creation or the date when it last was in effect, whichever is later.” Sounds like a boon for disk vendors and a nightmare for providers.

Before you assault your hair follicles, consider:
1) In clarification, Visa explains: “The intent of these logging requirements is twofold: a) logs, when properly implemented and reviewed, are a widely accepted control to detect unauthorized access, and b) adequate logs provide good forensic evidence in the event of a compromise. It is not necessary to log all application access to cardholder data if the following is true (and verified by assessors):
– Applications that provide access to cardholder data do so only after making sure the users are authorized;
– Such access is authenticated via requirements 7.1 and 7.2, with user IDs set up in accordance with requirement 8; and
– Application logs exist to provide evidence in the event of a compromise.”

2) The Office of the Secretary of HHS waffles when asked about retaining system logs; this can be reasonably interpreted to mean the six-year standard need not be taken literally for all system and network logs.

Ananth

SIEM: What are you searching for?

Search engines are now well established as a vital feature of IT, and applications continue to evolve in breadth and depth at a dizzying rate. It is tempting to try to reduce any and all problems to one of query construction against an index. Can Security Information and Event Management, or SIEM, be (force) fitted into the search paradigm?

The answer depends on what you are looking to do and your skill with query construction.

If you are an expert with detailed knowledge of log formats and content, you may find it easy to construct a suitable query. When launched against a suitably indexed log collection, results can be gratifyingly fast and accurate. This is, however, a limited use case in the SIEM universe of use cases. This model usually applies when Administrators are seeking to resolve operational problems.

Security analysts, however, are usually searching for behavior, not simple text strings. While this is the holy grail of search engines, attempts from Excite (1996) to Accoona (RIP, Oct 2008) never made the cut. In the SIEM world, the context problem is compounded by myriad formats and the lack of any standard for assigning meaning to logs, even within one vendor’s products and versions of a product.

All is not lost: SIEM vendors do offer solutions by way of pre-packaged reports, and the best ones offer users the ability to analyze behavior within a certain context (as opposed to simple text search). By way of example: show me all failed logins after 6PM; from this set, show only those that failed on SERVER57; from this set show me those for User4; now go back and show me all User4 activity after 6PM on all machines.
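
Expressed as ordinary filtering over parsed events, that successive refinement looks something like the sketch below (field names and sample data are illustrative, not any product’s schema):

  def after_6pm(e):
      return e["hour"] >= 18

  events = [
      {"user": "User4", "host": "SERVER57", "hour": 19, "status": "failure"},
      {"user": "User4", "host": "SERVER12", "hour": 20, "status": "success"},
      {"user": "User9", "host": "SERVER57", "hour": 9,  "status": "failure"},
  ]

  step1 = [e for e in events if e["status"] == "failure" and after_6pm(e)]   # failed logins after 6PM
  step2 = [e for e in step1 if e["host"] == "SERVER57"]                      # ...that failed on SERVER57
  step3 = [e for e in step2 if e["user"] == "User4"]                         # ...for User4
  step4 = [e for e in events if e["user"] == "User4" and after_6pm(e)]       # all User4 activity after 6PM
  print(len(step1), len(step2), len(step3), len(step4))                      # 1 1 1 2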

Don’t try this with a “simple” text search engine… or, like John Wayne in The Searchers, you may become bitter and middle-aged.

– Ananth

Will SIEM and Log Management usage change with the economic slowdown?

When Wall Street really began to implode a couple of weeks ago, one of the remarkable side effects of the plunge was a huge increase in download activity for all items related to ROI on the Prism website. A sign of the times, as ROI always becomes more important in times of tight budgets, and our prospects were seeing the lean times coming. So what does the likelihood of budget freezes or worse mean for how SIEM/Log Management is used, or how it is justified in the enterprise?

Compliance is and will remain the great budget enabler of SIEM and Log Management, but often a compliance project can be done with a far more minimal deployment and still meet the requirement. There is, however, enormous tangible and measurable benefit in Log Management beyond the compliance use case that has been largely ignored.

SIEM/Log Management has for the most part been seen (and positioned by us vendors) as a compliance solution with security benefits or, in some cases, a security solution that does compliance. Both of these have an ROI that is hard to measure, as it is based on a company’s tolerance for risk. A lot of SIEM functionality, and the log management areas in particular, is also enormously effective at increasing operational efficiency, and provides clear, measurable, fast and hard ROI. Very simply, compliance will keep you out of jail and security reduces risk, but by using SIEM products for operations you will save hard dollars on administrator costs and reduce system downtime, which in turn increases productivity that directly hits the bottom line. Plus you still effectively get the compliance and security for free. A year ago, when we used to show these operational features to prospects (mostly security personnel), they were greeted 9 times out of 10 with a polite yawn. Not anymore.

We believe this new cost-conscious buying behavior will also drive broader rather than deeper requirements in many mid-tier businesses. It is the “can I get 90% of my requirements, and 100% of the mandatory ones, in several areas, and is that better than 110% in a single area?” discussion. Recently Prism added some enhanced USB device monitoring capability to EventTracker. While it goes beyond what typical SIEM vendors provide, in that we track files written to and deleted from the USB drive in real time, I would not consider it as good as a best-of-breed DLP product. But for most people it gets them where they need to be, and it is included in EventTracker at no additional cost. It is amazing how much interest this functionality receives from prospects today, while correspondingly less interest goes to features with a dubious ROI, like many correlation use cases. Interesting times.

-Posted by Steve Lafferty

Some Ruminations on the Security Impact of Software As A Service

In a recent post I talked a little about the security and compliance issues facing companies that adopt cloud-based SaaS for any mission-critical function. I referred to this as security OF the cloud to differentiate it from a cloud-based security offering or security IN the cloud. This is going to pose a major change in the security industry if it takes off.

Take a typical small business, “SmallCo,” as an example. They depend on a combination of QuickBooks and an accounting firm for their financial processes. For all practical purposes SmallCo outsources the entire accounting function. They use a hosting company to host QuickBooks for a monthly fee, and their external CPA, internal management and accounting staff access the application for data processing. Very easy to manage, no upfront investment, no servers to maintain: all the usual reasons why a SaaS model is so appealing.

One can easily argue that the crown jewels of SmallCo’s entire business reside in that hosted solution. SmallCo began to question whether this critical data was secure from being hacked or stolen. Would SmallCo be compliant if it were obligated to follow a compliance standard? Is it the role of the hosting provider to ensure security and compliance? To all of those questions there was, and is, no clear-cut answer. SmallCo has access to the application and to whatever audit capability the Quickbooks product supports (which is not a great deal), but there is no way to collect that audit and usage data other than manually running a report. When SmallCo started out this did not seem important, but as SmallCo grew, so did its exposure.

Salesforce, another poster child for SaaS, is much the same. I read a while back that they were going to add the ability to monitor changes in some of their database fields in their Winter 2008 release, but there appears to be nothing for user-level auditing, or even admin auditing (of your staff, much less theirs). A trusted user can steal an entire customer list without even being in the office, and the best DLP technology will not help because the data can be accessed and exported through any web browser on any machine. Having used Salesforce at previous companies, I can personally attest that it is a fine CRM system – cost-effective, powerful and well designed. But you have to maintain a completely separate access control list, and you have no real ability to monitor what is accessed by whom for audit purposes. For a prospect with privacy concerns, is it really a viable, secure solution?

Cloud-based computing changes the entire paradigm of security. Perimeter defense is the first step of defense in depth to protect service availability and corporate data, but what happens when there is no resident data to be defended? When a number of services live in the cloud, is event management even going to be viable? Will the rules be the same when you are correlating events from different services in the cloud?

So here is the challenge, I believe: as more and more mission-critical processes move to the cloud, SaaS suppliers are going to have to provide log data in a real-time, straightforward manner – probably for their admins as well as their customers’ personnel. In fact, since there is only a browser and a login, and no firewall, network or operating-system-level security to breach, auditing would have to be very, very robust. With all these cloud services, is it feasible that an auditor will accept 50 reports from 50 providers and pass the company under audit? Maybe, but someone – either the end user or an MSSP – has to be responsible for monitoring for security and compliance, and unless the application and data are under the control of end users, they will be unable to do so.

So if I were an application provider like Salesforce, I would be thinking really hard about being a good citizen in a cloud-based world: providing real-time audit records for at least user log-on and log-off, log-on failures, and a complete audit record of all data extracts as a first step, along with a method to push those events out in real time. I would do that before worrying too much about auditing fields in the database.
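To make that concrete, here is a minimal sketch of what such a pushed audit record might look like, written in Python against a plain syslog collector. The field names, the "smallco" tenant, the example event IDs and the syslog priority value are purely illustrative assumptions on my part – this is not any actual Salesforce or Quickbooks API.

    import json
    import socket
    from datetime import datetime, timezone

    def build_audit_event(user, action, detail):
        # Hypothetical audit record a SaaS provider could emit; field names are illustrative only.
        return {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "tenant": "smallco",      # which customer the activity belongs to
            "user": user,             # who performed the action
            "action": action,         # e.g. LOGON, LOGON_FAILURE, DATA_EXPORT
            "detail": detail,         # e.g. object touched, rows exported
        }

    def push_via_syslog(event, host="127.0.0.1", port=514):
        # Push the record in real time to a collector the customer controls (UDP syslog here).
        line = "<134>saas-audit: " + json.dumps(event)   # PRI 134 = facility local0, severity info
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.sendto(line.encode("utf-8"), (host, port))

    if __name__ == "__main__":
        push_via_syslog(build_audit_event("alice@smallco.example", "DATA_EXPORT",
                                          {"object": "contacts", "rows": 25000}))

Even something this simple would let the customer, or an MSSP acting on their behalf, fold SaaS activity into the same log management pipeline as everything else.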

Interesting times.

Steve Lafferty

Outsource? Build? Buy?

So you have decided that it’s time to manage your security information. Your trigger was probably one of: a) you got handed a directive from on high – “The company shall be fully compliant with applicable regulation [insert one] PCI/HIPAA/SOX/GLBA/FISMA/Basel/…” – or b) you had a security incident and realized, OMG, we really need to keep those logs.

Choice: Build
Upside: It’ll be perfect, it’ll be cheap, it’ll be fun
Downside: Who will maintain, extend, support (me?), how will it scale?

Choice: Outsource
Upside: Don’t need the hardware or staff, pay-go, someone else will deal with the issues
Downside: Really? Someone else will deal with the issues? How do you get access to your info? What is the SLA?

Choice: Buy
Upside: Get a solution now, upgrades happen, you have someone to blame
Downside: You still have to learn/use it, is the vendor stable?

What is the best choice?
Well, how generic are your requirements?
What sort of resources can you apply to this task?
How comfortable are you with IT? [From ‘necessary evil’…to… ‘We are IT!’]
What sort of log volume and sources do you have?

Outsource if you have – generic requirements, limited sources/volume and low IT skills

Build if you have – programming skills, fixed requirements, limited sources/volume

Buy if you have – varied (but standard) sources, good IT skills, moderate-high volume

As Pat Riley says, “Look for your choices, pick the best one, then go with it.”

Ananth

Compliance: Did you get the (Pinto) Memo?

The Ford Pinto was a subcompact manufactured by Ford (introduced on 9/11/70 – another infamous coincidence?). It became the focus of a major scandal when it was alleged that the car’s design allowed its fuel tank to be easily damaged in a rear-end collision, which sometimes resulted in deadly fires and explosions. Ford was aware of the design flaw but allegedly refused to pay what was characterized as the minimal expense of a redesign; instead, it was argued, Ford decided it would be cheaper to pay off lawsuits for the resulting deaths. The liability case produced a judicial opinion that is a staple of remedies courses in American law schools.

What brought this on? Well, a recent conversation with a healthcare institution went something like this:

Us: Are you required to comply with HIPAA?

Them: Well, I suppose…yes

Us: So how do you demonstrate compliance?

Them: Well, we’ve never been audited and don’t know anyone that has

Us: So you don’t have a solution in place for this?

Them: Not really…but if they ever come knocking, I’ll pull some reports and wiggle out of it

Us: But there is a better, much better way with all sorts of upside

Them: Yeah, yeah whatever…how much did you say this “better” way costs?

Us: Paltry sum

Them: Well why should I bother? A) I don’t know anyone that has been audited. B) I’ve got better uses for the money in these tough times. C) If they come knocking, I’ll plead ignorance and ask for “reasonable time” to demonstrate compliance. D) In any case, if I wait long enough Microsoft and Cisco will probably solve this for me in the next release.

Us: Heavy sigh

Sadly… none of this is true, and there is overwhelming evidence of that.

Regulations are not intended to be punitive, of course, and implementing log management actually delivers positive ROI.

– Ananth

Let he who is without SIM cast the first stone

In a recent post, Raffael Marty points out the shortcomings of a “classic” SIM solution, including high cost due in part to a clumsy, expensive tuning process.

More importantly, he points out that SIMs were designed for network-based attacks, and these are on the wane, replaced by host-based attacks.

At Prism, we’ve long argued that a host-based system is more appropriate and effective. This is further borne out by the appearance of polymorphic strains such as Nugache that now dominate Threatscape 2008.

However, is “IT Search” the complete answer? Not quite. As a matter of fact, no such “silver bullet” has ever worked out. The fact is, users (especially in the mid-tier) are driven by security concerns, so proactive correlation is useful (in moderation), compliance remains a major driver, and event reduction with active alerting is absolutely essential for the overworked admin. That said, “IT Search” is a useful and powerful tool in the arsenal of the modern, knowledgeable Security Warrior.

A “Complete SIM” solution is more appropriate for the enterprise. Such a solution blends the “classic” approach – log consolidation and multi-event correlation from host and network devices – PLUS a white/greylist scanner PLUS the Log Search function. Long-term storage and flexible reporting/forensic tools round out the ideal feature set. Such a solution has a better chance of satisfying the different user profiles – auditors, managers and security staff – many of whom are less comfortable with query construction.

One-dimensional approaches such as “IT Search” or “Network Behavior Anomaly Detection” or “Network Packet Correlation,” while undeniably useful, are in themselves limited.

Complete SIM, IT Search included, that’s the ticket.

 Ananth

Architectural Chokepoints

I have been thinking a bit about scalability lately, and I thought it might be an interesting exercise to examine a couple of the obvious places in a SIEM solution where scalability problems can show up. In a previous post I talked about scalability and EPS. The fact is there are multiple areas where a SIEM solution may not scale, and anyone considering a SIEM procurement should treat scalability as a multi-dimensional beast.

First, all the logs you care about need to be dependably collected. Collection is where many vendors build EPS benchmarks, but generally the events-per-second number is based on a small, normalized packet. Event size varies widely depending on the source, so understand your typical log size and calculate accordingly. The general mitigation strategies for collection are faster collection hardware (collection is usually a CPU-intensive task), a distributed collection architecture, and log filtering.

One thing to think about: log generation is often quite “bursty” in nature. You will, for instance, get a slew of logs on Monday mornings when staff arrive at work and start logging onto system resources. You should evaluate what happens when the system gets overloaded – do events get lost, does the system crash?

As a mitigation strategy, event filtering is sometimes pooh-poohed; the reality, however, is that 90% of the traffic generated by most devices is completely useless (from a security perspective) status information. Volume also varies widely depending on audit settings. A company generating 600,000 events per day on a Windows network can easily generate ten times as much by increasing its audit settings slightly. If you need the audit levels high, filtering is the easiest way to ease pressure on the entire downstream log system.
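As a rough illustration, a filter that drops known status noise before the events ever hit the downstream system can be just a few lines. This is a minimal Python sketch; the event IDs in it are hypothetical placeholders, not a recommended drop list – the point is that the keep/drop decision is cheap and happens once, up front.

    # Minimal sketch of pre-forwarding event filtering.
    # The "noise" IDs are illustrative; derive your own list from your audit policy and risk assessment.
    NOISE_EVENT_IDS = {538, 540, 576}    # hypothetical routine status/logon-noise events

    def should_forward(event: dict) -> bool:
        # Keep everything except events explicitly classified as status noise.
        return event.get("event_id") not in NOISE_EVENT_IDS

    def filter_events(events: list) -> list:
        # Return only the events worth sending downstream, reporting what was dropped.
        kept = [e for e in events if should_forward(e)]
        print(f"kept {len(kept)} of {len(events)} events "
              f"({len(events) - len(kept)} filtered out)")
        return kept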

Collection is also a multi-step process; simply receiving an event is too simplistic a view. Resources are expended running policy and rules against the event stream – the more processing, the more system resources consumed – and the data must be committed to the event store at some point, so it needs to be written to disk. It is highly advisable to look at these as three separate activities and validate that the solution can handle your volume at each step.
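One way to validate the three activities separately is to time them separately. The sketch below is a toy, single-threaded Python model – real products pipeline these stages across processes and machines, and the rule and storage steps here are placeholders – but it shows why a single blended EPS number hides where the bottleneck actually is.

    import time

    def receive(raw_line):
        # Stage 1: accept the event off the wire (placeholder parse).
        return {"raw": raw_line}

    def apply_rules(event, rules):
        # Stage 2: run policy/correlation rules; cost grows with the number of rules.
        event["alerts"] = [name for name, rule in rules if rule(event)]
        return event

    def commit(event, store):
        # Stage 3: write the event to the event store (an in-memory list stands in for disk).
        store.append(event)

    def benchmark(lines, rules):
        # Time the three activities independently instead of quoting one blended EPS figure.
        store, timings = [], {"receive": 0.0, "rules": 0.0, "commit": 0.0}
        for line in lines:
            t0 = time.perf_counter(); ev = receive(line)
            t1 = time.perf_counter(); ev = apply_rules(ev, rules)
            t2 = time.perf_counter(); commit(ev, store)
            t3 = time.perf_counter()
            timings["receive"] += t1 - t0
            timings["rules"]   += t2 - t1
            timings["commit"]  += t3 - t2
        return timings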

A note on log storage for those considering an appliance with a fixed amount of onboard storage: be sure it is enough, and check how easy it is to move records off to offline media and to retrieve and process them again later. If your event volume eats up your disk, you will be doing a lot of that moving off and moving back on. Also, some compliance standards, such as PCI, require that logs be stored online for a certain amount of time. Here at Prism we solved that problem by allowing events to be stored anywhere on the file system, but most appliances do not afford you that luxury.

Now let’s turn our attention to the analytics and reporting activities. This is another important aspect of scalability that is often ignored. If a system can process 10 million events per minute but takes 10 hours to run a simple query, you are going to have upset users and a non-viable solution. And what happens to the collection throughput above when a bunch of people are running reports? A single user running ad-hoc reports is often just fine; put a couple on and you are in trouble.

A mitigation strategy here is to look for a solution that can offload the reporting and analytics to another machine so they do not impact the aggregation, correlation and storage steps. If that capability is not available, absolutely press the vendor for performance metrics when reports and collection run on the same hardware.

– Steve Lafferty

The EPS Myth

Often, when I engage with a prospect, their first question is “How many events per second (EPS) can EventTracker handle?” People tend to confuse EPS with scalability, so simply giving back a big enough number (usually larger than the one from the previous vendor they spoke with) convinces them your product is, indeed, scalable. The fact is that scalability and events per second (EPS) are not the same, and many vendors dodge the real scalability issue by intentionally using the two interchangeably. A high EPS rating does not guarantee a scalable solution. If the only measure of scalability available is an EPS rating, you as a prospect should be asking yourself a simple question: what is the vendor’s definition of EPS? You will generally find that the answer is different with each vendor.

  • Is it number of events scanned/second?
  • Is it number of events received/second?
  • Is it number of events processed/second?
  • Is it number of events inserted in the event store/second?
  • Is it a real time count or a batch transfer count?
  • What is the size of these events? Is it some small, non-representative size – say, 100 bytes per event – or is it a real event, like a Windows event, which may vary from 1,000 to 6,000 bytes?
  • Are you receiving these events in UDP mode or TCP mode?
  • Are they measuring running correlation rules against the event stream? How many rules are being run?
  • And let’s not even talk about how fast the reporting function runs – EPS does not measure that at all.

At the end of the day, an EPS rating is generally a measure of receiving a small, non-typical, normalized event. It says nothing about actually doing something useful with the event, and as a result it is pretty much useless.

With no agreed definition of what an event actually is, EPS is also a terrible comparative measure. You cannot assume that one vendor claiming 12,000 EPS is faster than another claiming 10,000 EPS, as they are often measuring very different things. A good analogy: ask someone how far away an object is and they reply “100.” For all the usefulness of the EPS measure, the unit could be inches or miles.

EPS is even worse for ascertaining true solution capability. Some vendors market appliances that promise 2,000 EPS and 150 GB of disk space for log storage. They also promise to archive security events for multiple years to meet compliance. For the sake of argument, let’s assume the system is receiving, processing and storing 1,000 Windows events per second with an average 1 KB event size (a common size for a Windows event). In 24 hours you will receive roughly 86 million events. Compressed at 90%, that consumes about 8.6 GB – nearly 6% of the storage – in a single day. Even with heavy compression, the appliance can hold only a few weeks of data under this kind of load. Think of buying a car with an engine that can race to 200 MPH and a set of tires and suspension that cannot go faster than 75 MPH. The engine can do 200; the car can’t. A SIEM solution is the car in this example, not the engine, and having the engine alone does not do you any good at all.
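The back-of-the-envelope arithmetic above is easy to reproduce. The numbers below are simply the assumptions used in the example (1,000 events per second sustained, roughly 1 KB per event, 90% compression, 150 GB of onboard storage), not measurements of any particular appliance.

    events_per_sec = 1_000        # sustained rate assumed in the example
    event_size_bytes = 1_000      # ~1 KB per Windows event
    compression = 0.90            # 90% reduction after compression
    appliance_storage_gb = 150    # onboard disk on the hypothetical appliance

    events_per_day = events_per_sec * 86_400                     # 86.4 million events
    raw_gb_per_day = events_per_day * event_size_bytes / 1e9     # ~86 GB before compression
    stored_gb_per_day = raw_gb_per_day * (1 - compression)       # ~8.6 GB after compression
    days_until_full = appliance_storage_gb / stored_gb_per_day   # a little over two weeks

    print(f"{events_per_day:,} events/day, ~{stored_gb_per_day:.1f} GB/day stored, "
          f"disk full in ~{days_until_full:.0f} days")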

So when asked about EPS, I sigh, say it depends, and try to explain all this. Sometimes it sinks in, sometimes not. All in all, don’t pay a lot of attention to EPS – it is largely an empty measure until the unit of measure is standardized, and even then it will be only a small part of overall system capability.

Steve Lafferty

Why words matter…or do they?

Well, this is starting to turn into a bit of a bun fight, which was not my intent – I was merely attempting to clarify some incorrect claims in the Splunk post. Now Anton has weighed in with his perspective:

“I think this debate is mostly about two approaches to logs: collect and parse some logs (typical SIEM approach) vs collect and index all logs (like, ahem, “IT search”).”

Yes, he is right in a sense. It is a nicely concise statement, but it needs to be unpacked, as there are some nuances that need to be understood.

Just a bit of level-setting before getting to the meat of the statement.

Most SIEM solutions today have a real-time component (typically a correlation engine) and some kind of analytics capability. Depending on the vendor, some do one or the other better (and of course we all package and price them differently).

Most of the “older” vendors started out as correlation vendors targeting the Fortune 2000, enabling real-time threat detection in the SOC. The analytics piece was a bit of a secondary requirement, and secure, long-term storage hardly figured at all. The Gartner guys called these vendors SEM, or Security Event Management, providers, which is instructive – “event,” to me, implies a fairly short-term context. Since 2000, the analytics and reporting capability has become increasingly important as compliance has become the big driver. Many of the newer vendors in the SIEM market focused on solving the compliance use case, and their solutions typically featured secure, long-term storage, compliance packs, good reporting and so on. These newer vendors were sometimes referred to as SIM, or Security Information Management, and they fit a nice gap left in the capabilities of the correlation vendors. Some, like Loglogic, made a nice business selling log collection solutions to large enterprises, typically as an augmentation to an existing SIM. Others, like Prism, focused on the mid-tier and provided lower-cost, easy-to-deploy solutions that did compliance and also provided real-time capabilities to companies that did not have the money or the people to afford the enterprise correlation guys. These companies had a compliance requirement and wanted to get some security improvements as well.

But really, all of us – SIM/SEM, enterprise, mid-tier, Splunk – were and are collecting the same darn logs; we were just doing slightly different things with them. So of course the correlation guys have released log aggregators (like ArcSight Logger), and the log management vendors have added (or always had) real-time capability. At the end of the day we all got lumped into the SIEM bucket, and here we are.

For anyone with a SIEM requirement: understand what your business requirements are, and then look long and hard at the vendor’s capability – preferably by getting the product in house for an evaluation in your own environment. Buying according to which vendor claims the most events per second, supports the most devices, or has the most mindshare in the market is really short-sighted. Nothing beats using the solution in action for a few weeks; this is a classic case of “the devil is in the details.”

So, back to Anton’s statement (finally!). When Anton refers to “collect and parse some logs,” that is the usual shorthand for the real-time security use case: you are looking for patterns of behavior, and only certain logs matter because you are hunting for attack patterns in specific event types.

The “collect and index all logs” approach is the typical compliance use case. The indexing is simply a method of storing the data for efficient retrieval during analysis – again, a typical analytics requirement.

Another side note: deciding how important it is to collect all the logs is a risk assessment the end user should do. Many people collect “all” the logs because they don’t know what is important, and it is deemed the easiest and safest approach. The biggest beneficiaries of that approach are the SIEM appliance vendors, who get to sell another proprietary box when the event volume goes through the roof – and, of course, anyone who holds stock in EMC. Despite compression, a lot of logs is still a lot of logs!

Increasingly, customers I talk to are making a conscious decision not to collect or retain all the logs, both because of the overhead and because storing logs they consider sensitive data is itself a security risk. Quite frankly, you should look for a vendor that allows you to collect all the data but also provides fairly robust filtering capability in case you don’t want or need to. That is a topic for another day, however.

So when Anton says you need to do both – if you want to do real-time analysis as well as forensics and compliance – then yes, I agree. But when he claims that “collect and parse” is the typical SIEM approach, that is an overgeneralization, and pointing that out was really the purpose of my post to begin with. I tend not to favor such generalizations, as they simply misinform the reader.

– Steve Lafferty

More thoughts on SIEM vs. IT Search

I posted a commentary a while ago on a post by Raffy, who discussed the differences between IT Search (or Splunk, as they are the only folks I know who are trying to make IT Search a distinct product category) and SIEM. Raffy posted a clarification in response to my commentary. What I was pointing out in my original post was that all vendors, SIEM or Splunk, are loading the same standard formats – and what needs to be maintained is, in fact, not the basic loader but the knowledge (the prioritization, the reports, the alerts and so on) of what to do with all that data. That knowledge is a core part of the value SIEM solutions provide. On that we seem to agree. And as Raffy points out, the Splunk guys are busily beavering away producing knowledge as well. Although be careful – you may wake up one morning and find that you have turned into a SIEM solution!

Sadly, the notion of the bad “parser” or loader continues to creep in – Splunk does not need one, which is good; SIEM systems do, which is bad.

I am reasonably familiar with quite a few of the offerings out there for SIEM/log management, and quite frankly, outside of perhaps ArcSight (I am giving Raffy the benefit of the doubt here, as he used to work at ArcSight and would know better than I), I can’t think of a vendor that writes proprietary connectors or parsers simply to load raw data. We (EventTracker) certainly don’t. From an engineering standpoint, when there are standard formats like Windows EVT, Syslog and SNMP, it would be pretty silly to create something else. Why would you? You write them only when there is a proprietary API or data format, like Checkpoint, where you absolutely have to. No difference here. I don’t see how this parser argument is in any way, shape or form indicative of a core difference.

I am waiting on Raffy’s promised follow-on post with some anticipation – he says he will explain the many other differences between IT Search and SIEM, although he prefaced some of it with “Splunk is Google-like, and Google is God, ergo…”

Google was, and is, a game-changing application, and a number of things made it unique – easy to use, fast, and able to return valuable information. But what made Google a gazillion-dollar corporation is not natural language search – that is nice, but simple “and,” “or,” “not” is really not a breakthrough in the grand scheme of things. The speed of a Google search is impressive, but that is down to enormous server farms, so it is mechanical. Most of the other early internet search vendors had both of these capabilities. My early personal favorite was AltaVista, but I switched to Google a long time ago.

Why? What absolutely blew my socks off, and continues to do so to this day, is Google’s ability to figure out which of the 10 million entries for my arbitrary search string are the ones I care about, and to surface them, or some of them, in the first hundred results. They find the needle in the proverbial haystack. Now that is spectacular (and highly proprietary), and the ranking algorithm is a closely guarded secret, I hear. Someone once told me that a lot of it comes from ranking signals generated by the millions of people doing similar searches – the sheer quantity of search users on the internet. The more searches they handle, the better they become. I can believe that. Google works because of the quantity of data and because the community is so large – and they have figured out a way to put the two together.

I wonder, however, how an approach like that would work when you have a few admins searching a few dozen times a week. I am not sure how it will translate, but I am looking forward to finding out!

– Steve Lafferty