Everything You Wanted to Know about Log Management but Were Afraid to Ask (Part 2)
In Part 1 of this series, we discussed what a SIEM actually is. Now we are going to dive into the essential underpinnings of a SIEM: the lowly, previously unappreciated, but critically important log files. SIEM fundamentals are a big topic, so we broke this series into three posts to give things time to soak in.
This is not me in the picture, by the way, but you have to admire those logs. The key point is: do not fear logs or underestimate their power - learn to enjoy them, as this man obviously does.
Log collection is the heart and soul of a SIEM. The more log sources that send logs to the SIEM, the more can be accomplished with the SIEM. Your network generates vast amounts of log data – a Fortune 500 enterprise’s infrastructure can generate 10 Terabytes of plain-text log data per month, without breaking a sweat! You, as an analyst, might break a sweat at this reality, without some help – it’s just too much data. But we will get to that in a minute.
Logs are the key to understanding “Who’s attacking us today?” and “How did they get access to all of our corporate secrets?” While we may think of security controls as containing all the information we need to do security, they only contain the things they have detected; there is no “before and after the event” context within them. However, this context is usually vital to separating false positives from true detections. Context is the difference between detecting an actual attack and chasing after a merely misconfigured system.
Successful attacks on computer systems rarely look like real attacks except in hindsight. If this were not the case, we would be able to cheerfully automate all security defenses and not require human analysts. In addition, attackers may try to remove or falsify log entries to cover their tracks. For this reason, having a protected source of log information that can be trusted is vital to any legal proceeding arising from computer misuse.
You’ll want the logs from the critical components of your network and business. You will want the logs from your firewall for sure. You will also want logs from your key servers, especially your Active Directory server and your key application and database servers. You will want the logs from your IDS and antivirus as well. You will want to keep an eye on your web server.
You need to think about what the key elements of your network are, from a business standpoint. Think about the parts of your infrastructure that are crucial to running the business. The logs those components generate are the keys to keeping your network up and the business running.
Especially for a small/medium company, it’s important to decide what is important for you to be watching, as you likely have limited resources allocated to the security monitoring task. You can’t hire enough people to read every line of those logs looking for bad stuff. I’m serious, don’t even try this. Even if you succeeded, the analysts would be so bored they’d never actually spot anything even if it was right in front of their face. Which it would be:
So here are the logs you need to consider for inclusion in your situation:
Logs from your security controls:
- Endpoint Security (Antivirus, antimalware)
- Data Loss Prevention
- VPN Concentrators
- Web filters
Logs from your network infrastructure:
- Domain Controllers
- Wireless Access Points
- Application Servers
- Intranet Applications
Non-log Infrastructure Information:
- Network Maps
- Vulnerability Reports
- Software Inventory
Non-log Business Information:
- Business Process Mappings
- Points of Contact
- Partner Information
So, it would be profoundly nice if every operating system and every application in the world happened to record their log events in the same format. Unhappily, they do not. Most logs are written to be readable by humans, not computers. Note the picture above indicating the general human reaction to reading logs: ZZZ.
In the example below, note that while these two logs say the same thing to a human being, they are very different from a machine’s point of view:
“User Broberts Successfully Authenticated to 10.100.52.105 from client 10.10.8.22”
“10.100.52.105 New Client Connection 10.10.8.22 on account: Broberts: Success”
Long story short: what needs to be done is to break down every known log message out there, and put it into a normalized format, like this:
“User [USERNAME] [STATUS] Authenticated to [DESTIP] from client [SOURCEIP]”
“[DESTIP] New Client Connection [SOURCEIP] on account: [USERNAME]: [STATUS]”
So, when you hear a SIEM product marketer talk about “how many devices it supports”, they are talking about how many devices it can parse and normalize log files from. This takes the logs from human-understandable to machine-understandable, so the SIEM can understand and work with the logs from these many, disparate sources.
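To make that concrete, here is a minimal Python sketch of normalization for the two example messages above. The regular expressions, the field names (`username`, `status`, `source_ip`, `dest_ip`) and the `normalize` helper are illustrative assumptions, not how any particular SIEM builds its parsers, and the sketch assumes both messages refer to the same server, 10.100.52.105.

```python
import re

# Hypothetical patterns, one per log format. A real SIEM ships many of these
# for each supported device; these two exist only for this example.
PATTERNS = [
    re.compile(
        r"User (?P<username>\S+) (?P<status>\w+) Authenticated to "
        r"(?P<dest_ip>\S+) from client (?P<source_ip>\S+)"
    ),
    re.compile(
        r"(?P<dest_ip>\S+) New Client Connection (?P<source_ip>\S+) "
        r"on account: (?P<username>[^:\s]+): (?P<status>\w+)"
    ),
]

# Map each format's wording for the outcome onto one shared vocabulary.
STATUS_MAP = {"successfully": "success", "success": "success"}


def normalize(line):
    """Return a dict of normalized fields, or None if no pattern matches."""
    for pattern in PATTERNS:
        match = pattern.search(line.strip())
        if match:
            event = match.groupdict()
            raw_status = event["status"].lower()
            event["status"] = STATUS_MAP.get(raw_status, raw_status)
            return event
    return None


a = normalize("User Broberts Successfully Authenticated to 10.100.52.105 from client 10.10.8.22")
b = normalize("10.100.52.105 New Client Connection 10.10.8.22 on account: Broberts: Success")
print(a == b)  # both lines reduce to the same set of fields
```

Once both messages reduce to the same fields, the rest of the pipeline no longer cares which device wrote them.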
Breaking down those logs from many sources into their components, or normalizing them, is what allows the SIEM to search across logs from multiple devices and to correlate events between them. Once we’ve normalized logs into a database table, we can do database-style searches, such as:
“Show [All Logs] From [All Devices] from the [last two weeks], where the [username] is [Broberts]”
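As a rough illustration, that search might look like the following against an in-memory SQLite table. The `events` table, its column names, the sample rows and the `logs_for_user` helper are all made up for this sketch; a real SIEM has its own schema and query language.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (ts TEXT, device TEXT, username TEXT, "
    "status TEXT, source_ip TEXT, dest_ip TEXT)"
)

# A fixed "current time" so the example gives the same answer on every run.
now = datetime(2024, 1, 15, 12, 0, 0)
rows = [
    ((now - timedelta(days=1)).isoformat(), "vpn", "Broberts", "success", "10.10.8.22", "10.100.52.105"),
    ((now - timedelta(days=20)).isoformat(), "firewall", "Broberts", "failure", "10.10.8.22", "10.100.52.105"),
    ((now - timedelta(days=2)).isoformat(), "webserver", "Asmith", "success", "10.10.8.30", "10.100.52.105"),
]
conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?, ?)", rows)


def logs_for_user(conn, username, since):
    """All events from all devices for one username since a given time."""
    cur = conn.execute(
        "SELECT ts, device, status FROM events "
        "WHERE username = ? AND ts >= ? ORDER BY ts",
        (username, since.isoformat()),  # ISO timestamps sort correctly as text
    )
    return cur.fetchall()


recent = logs_for_user(conn, "Broberts", now - timedelta(weeks=2))
for row in recent:
    print(row)
```

Only the VPN event comes back here: the firewall event is older than two weeks and the web server event belongs to a different user.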
This, in turn, allows the SIEM to do automated correlation of these events, such as matching fields between log events – across time periods and across device types:
“If a single Host fails to log in to three separate servers using the same credentials within a 6 second time window, raise an alert”
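A correlation rule like this one can be sketched as a sliding window over normalized events. The event tuple layout, the server names and the `find_alerts` function are assumptions for illustration; the sketch also assumes events arrive sorted by time with timestamps in seconds, and that a failure exactly 6 seconds after the first still counts as inside the window.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 6
SERVER_THRESHOLD = 3


def find_alerts(events):
    """events: time-sorted tuples of (ts_seconds, source_host, username, dest_server, status)."""
    recent = defaultdict(deque)  # (source_host, username) -> deque of (ts, dest_server)
    alerts = []
    for ts, host, user, server, status in events:
        if status != "failure":
            continue
        window = recent[(host, user)]
        window.append((ts, server))
        # Forget failures that have fallen out of the time window.
        while window and ts - window[0][0] > WINDOW_SECONDS:
            window.popleft()
        servers = {s for _, s in window}
        if len(servers) >= SERVER_THRESHOLD:
            alerts.append((ts, host, user, sorted(servers)))
            window.clear()  # do not re-alert on the same burst
    return alerts


sample = [
    (0, "10.10.8.22", "Broberts", "srv-a", "failure"),
    (2, "10.10.8.22", "Broberts", "srv-b", "failure"),
    (3, "10.10.8.30", "Asmith", "srv-a", "failure"),
    (5, "10.10.8.22", "Broberts", "srv-c", "failure"),
    (20, "10.10.8.30", "Asmith", "srv-b", "failure"),
    (40, "10.10.8.30", "Asmith", "srv-c", "failure"),
]
alerts = find_alerts(sample)
for alert in alerts:
    print(alert)
```

In the sample data, the first host trips the rule because its three failures land within five seconds of each other. The second host also fails against three servers, but spread over more than half a minute, so it stays quiet.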
This is obviously useful, as opposed to the native logs we started with. In addition, event normalization allows us to create summary reports from our log information, such as:
"What User Accounts have accessed the highest number of distinct hosts in the last month?"
"What Subnet generates the highest number of failed login attempts per day, averaged out over 6 months?"
Armed with this understanding of what a SIEM is and how log files relate to it, we are ready to move on to the finale of the series. The next post in this series takes up that finale: Better than SIEM: Unified Security Management
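One last sketch before moving on: the two report questions above (distinct hosts per user account, and failed logins per subnet per day) boil down to simple aggregations once events are normalized. The event layout, the sample data, the /24 subnet size and both function names are assumptions made for this example only.

```python
import ipaddress
from collections import defaultdict

# Hypothetical normalized events: (day, username, source_ip, dest_host, status)
events = [
    ("2024-01-01", "Broberts", "10.10.8.22", "srv-a", "failure"),
    ("2024-01-01", "Broberts", "10.10.8.22", "srv-b", "success"),
    ("2024-01-02", "Broberts", "10.10.8.22", "srv-c", "failure"),
    ("2024-01-02", "Asmith", "10.10.9.5", "srv-a", "failure"),
]


def distinct_hosts_per_user(events):
    """Users ranked by how many different hosts they touched, highest first."""
    hosts = defaultdict(set)
    for _day, user, _src, dest, _status in events:
        hosts[user].add(dest)
    return sorted(((u, len(h)) for u, h in hosts.items()), key=lambda x: (-x[1], x[0]))


def failed_logins_per_subnet_per_day(events, prefix=24):
    """Average failed logins per day for each source subnet (assumed /24 by default)."""
    counts = defaultdict(int)
    days = set()
    for day, _user, src, _dest, status in events:
        days.add(day)
        if status == "failure":
            subnet = ipaddress.ip_network(f"{src}/{prefix}", strict=False)
            counts[str(subnet)] += 1
    return {subnet: total / len(days) for subnet, total in counts.items()}


by_user = distinct_hosts_per_user(events)
by_subnet = failed_logins_per_subnet_per_day(events)
print(by_user)
print(by_subnet)
```

The averaging here divides by the number of days that appear in the data, which is a simplification; a six-month report would divide by the length of the reporting period instead.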