This spring, as the product and security operations teams at AT&T Cybersecurity prepared for the launch of our Managed Threat Detection and Response service, it became obvious to us that the market has many different understandings of what “response” could (and should) mean when evaluating an MDR solution. Customers typically want to know: What incident response capabilities does the underlying technology platform enable? How does the provider’s Security Operations Center team (SOC) use these capabilities to perform incident response, and, more importantly, how and when does the SOC team involve the customer's in-house security resources appropriately? Finally, how do these activities affect the return on investment expected from purchasing the service? However, in our review of the marketing literature of other MDR services, we saw a gap. All too often, providers do not provide sufficient detail and depth within their materials to help customers understand and contextualize this crucial component of their offering.
Now that we’ve introduced our own MDR solution, we wanted to take a step back and provide our definition of “response” for AT&T Managed Threat Detection and Response.
Luckily, Gartner provides an excellent framework to help us organize our walk-through. When evaluating an MDR service, a potential customer should be able to quickly understand how SOC analysts, in well-defined collaboration with a customer’s security teams, will:
- Validate potential incidents
- Assemble the appropriate context
- Investigate as much as is feasible about the scope and severity given the information and tools available
- Provide actionable advice and context about the threat
- Initiate actions to remotely disrupt and contain threats
*Source: Gartner Market Guide for Managed Detection and Response Services, Gartner. June 2018.
Validation, context building, and investigation (Steps 1-3)
It’s worth noting that “response” starts as soon as an analyst detects a potential threat in a customer’s environment. It stands to reason, then, that the quality of threat intelligence used by a security team directly impacts the effectiveness of incident response operations. The less time analysts spend verifying that defenses are up to date, chasing false positives, researching a specific threat, or looking for additional details within a customer's environment(s), the quicker they can move on to the next stage of the incident response lifecycle. AT&T Managed Threat Detection and Response is fueled with continuously updated threat intelligence from AT&T Alien Labs, the threat intelligence unit of AT&T Cybersecurity. AT&T Alien Labs includes a global team of threat researchers and data scientists who, combined with proprietary technology in analytics and machine learning, analyze one of the largest and most diverse collections of threat data in the world. This team has unrivaled visibility into the AT&T IP backbone, global USM sensor network, Open Threat Exchange (OTX), and other sources, allowing them to have a deep understanding of the latest tactics, techniques and procedures of our adversaries.
Every day, they produce timely threat intelligence that is integrated directly into the USM platform in the form of correlation rules and behavioral detections to automate threat detection. These updates enable our customers to detect emergent and evolving threats by raising alarms for analyzed activity within public cloud environments, on-premises networks, and endpoints. Every alarm is automatically mapped to the Cyber Kill Chain taxonomy and the MITRE ATT&CK framework and enriched with additional insight into the potential Intent of the attacker, the Strategy and the Method of the identified threat, and Recommendations for remediation. This gives analysts immediately available, high-fidelity analysis to use when reviewing an alarm, saving valuable time in the incident response lifecycle.
Around the clock, 24x7x365, the Managed Threat Detection and Response SOC analyst team monitors the USM platform and reviews the details of every single alarm for all of our customers. As our analysts assess alarms, they update them to an “In Review” status. For all alarms deemed benign, mitigated by existing controls, or allowed by policy, an analyst will apply an informative label and set the alarm status to “Closed”. If they feel that an alarm represents a potential threat, a SOC analyst will set the alarm status to “In Review” and open an Investigation.
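The triage flow above can be pictured as a small decision function. This is only a minimal sketch: the `Disposition` names, the label strings, and the `triage` function are illustrative assumptions on our part and not part of the USM platform. Only the “Closed” and “In Review” statuses come from the description above.

```python
from dataclasses import dataclass
from enum import Enum


class Disposition(Enum):
    # Hypothetical names for the analyst's conclusions described above.
    BENIGN = "benign"
    MITIGATED_BY_CONTROLS = "mitigated by existing controls"
    ALLOWED_BY_POLICY = "allowed by policy"
    POTENTIAL_THREAT = "potential threat"


@dataclass
class TriageResult:
    status: str               # "Closed" or "In Review"
    label: str                # informative label applied by the analyst
    open_investigation: bool  # True when an Investigation should be opened


def triage(disposition: Disposition) -> TriageResult:
    """Map an analyst's conclusion to an alarm status and next step."""
    if disposition is Disposition.POTENTIAL_THREAT:
        return TriageResult("In Review", "Potential threat", True)
    # Benign, mitigated, or policy-allowed alarms are labeled and closed.
    return TriageResult("Closed", disposition.value.capitalize(), False)
```

For example, `triage(Disposition.ALLOWED_BY_POLICY)` yields a “Closed” alarm with no Investigation, while `triage(Disposition.POTENTIAL_THREAT)` leaves the alarm “In Review” and signals that an Investigation should be opened.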
In the USM platform, Investigations serve as the organizational hub for coordinating incident response. Core use cases include allowing analysts to gather and present analysis and evidence, communicate with other analysts and customer contacts, and document remediating actions taken either by the SOC or customer teams.
Once an Investigation is open, analysts use their knowledge of the customer’s environment, Alien Labs threat intelligence, and the USM platform’s forensic analysis capabilities to streamline their research and threat hunting activities. These validation and context-building exercises can include any combination of the below:
- An in-depth examination of the security events associated with the alarm or with assets that might be at risk
- A review of previous vulnerability assessments or the initiation of an ad hoc Asset Scan (the USM platform supports both authenticated and unauthenticated scanning)
- Cross-referencing of the threat with identified public cloud configuration issues from the customer’s environment
- Consultation of our current documentation of the customer’s network topology created during our onboarding exercises and regularly updated during customer check-ins
- Execution of an endpoint query using the AlienVault Agent
- Use of the AlienApp for Forensics and Response to collect forensic information from any appropriately configured host currently on the network
By completing the above, analysts can quickly understand the nature of the threat, what happened and how, the severity of risk, what assets or users were involved, the criticality of those assets or users, and what to do next, without having to track down information from multiple disparate security tools or threat research blogs. This helps to reduce context switching, supporting fast and efficient updates to Investigations.
Generating recommendations and initiating actions (Steps 4 and 5)
After the analyst team confirms that they have accurately identified and classified the threat, they begin the process of either providing actionable remediation recommendations or initiating containment and disruptive actions. The scope of this activity can vary dramatically, with analysts utilizing the USM platform to:
- Push configuration changes to third-party technologies using Response Actions available through the USM platform’s AlienApps integration framework
- Use the AlienApp for Forensics and Response to execute Enforcement System Functions
- Initiate coordination across AT&T Cybersecurity Managed Security Services to implement a configuration change in a security control managed by AT&T
- Recommend an examination of a particular user account, update to security control, or the reset of a machine to a known good state by the customer’s security team, with SOC support provided as needed
The specific details and permissions associated with the above activities are determined during our multi-day onsite onboarding for Managed Threat Detection and Response. While onsite, a customer’s assigned analyst will work with their team(s) to create an Incident Response Plan (IRP). This plan is deeply customizable and dictates SOC operations once an analyst opens an Investigation. Different variables can dictate what actions the SOC should take for a given Investigation, such as Investigation severity, the business criticality of the assets associated with the alarms under review, the environment or control generating the alarm, and much more. It also documents what security orchestration actions the SOC can take with and without approval from customer contacts, including enabling Response Action rules. These rules automate the response-related actions towards a customer’s networks and devices as well as other integrated security controls.
The incident response plan is a living document, often updated during weekly tactical check-in calls where we validate that all Investigations and incident response activities are being managed efficiently. The assigned analyst also hosts a monthly meeting with the customer’s team, where, in addition to IRP updates, they review service metrics related to our SLAs, discuss progress towards security program objectives and any recommendations for improvements, plan for any pending compliance or audit requirements, discuss industry threat trends, and more.
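To make the role of the IRP more concrete, the approval logic described above can be sketched as a simple policy lookup. This is an illustration under stated assumptions, not how the plan is actually stored or enforced: the `IrpRule` fields, the `soc_may_act` function, the action names used as keys, and the “Low” severity level are all hypothetical.

```python
from dataclasses import dataclass

# Ordered from least to most severe ("Low" is an assumed level).
SEVERITIES = ["Low", "Medium", "High", "Critical"]


@dataclass(frozen=True)
class IrpRule:
    """One hypothetical IRP entry: when may the SOC run an action unprompted?"""
    action: str                  # e.g. "Disable Networking"
    min_severity: str            # lowest Investigation severity that qualifies
    allow_critical_assets: bool  # may the SOC act on business-critical assets?


def soc_may_act(rules, action, severity, asset_is_critical):
    """Return True if the SOC can take `action` without customer approval."""
    for rule in rules:
        if rule.action != action:
            continue
        severe_enough = SEVERITIES.index(severity) >= SEVERITIES.index(rule.min_severity)
        asset_ok = rule.allow_critical_assets or not asset_is_critical
        if severe_enough and asset_ok:
            return True
    # No matching rule: default to asking the customer first.
    return False


# Example plan (values are illustrative).
plan = [
    IrpRule("Disable Networking", min_severity="High", allow_critical_assets=False),
    IrpRule("Firewall MACD", min_severity="Medium", allow_critical_assets=True),
]
```

With this plan, `soc_may_act(plan, "Disable Networking", "Critical", False)` returns `True`, while the same action on a business-critical asset, or any action the plan does not list, falls back to customer approval.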
Real-world response using the USM platform
Below, we’ve outlined some simplified real-world examples of this lifecycle from the Managed Threat Detection and Response SOC. While it’s impossible to cover all potential scenarios, these examples should provide some context for what all of this looks like in practice.
An analyst reviews an alarm indicating a Suspicious Download Event from McAfee ePO that might indicate the use of a Windows hacking tool within an industrial supply company.
- The analyst creates an Investigation, adding in-scope alarms and events.
- Using agent queries, the analyst verifies that someone has executed the Mimikatz credential-dumping tool on the host.
- The analyst observes processes running on the system that don’t have an attached file.
- Process hashes are then analyzed using OTX.
- The analyst validates the details of recently encoded PowerShell commands (such as the name of the script that was run, the arguments applied to it, permissions, and payload).
- After verifying that the Incident Response Plan allows for proactive action for High and Critical Severity Investigations, the analyst executes the “Disable Networking” action on the host using the AlienApp for Forensics and Response.
- The analyst documents this activity on the Investigation and assigns it to the customer contact to take the next step of either performing a more in-depth forensic investigation, or re-imaging the host. Over the next few days, the SOC team works with the customer to investigate whether or not additional containment activities are required.
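As an aside on the encoded PowerShell commands mentioned in this scenario: the argument to PowerShell's `-EncodedCommand` flag is Base64 over UTF-16LE text, so it can be decoded with a few lines of standard-library Python. The sketch below is our own illustration of that decoding step, not a description of the tooling the SOC uses, and the function name is hypothetical.

```python
import base64


def decode_powershell_command(encoded: str) -> str:
    """Decode the argument passed to PowerShell's -EncodedCommand flag.

    PowerShell expects the command as Base64 over UTF-16LE text.
    """
    return base64.b64decode(encoded).decode("utf-16-le")


# Build a harmless sample the same way PowerShell would encode it.
sample = base64.b64encode("Get-Process".encode("utf-16-le")).decode("ascii")
print(decode_powershell_command(sample))  # prints "Get-Process"
```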
While monitoring the infrastructure for a healthcare organization that hosts a patient web portal, an analyst reviews an alarm that indicates activity related to a web vulnerability scanner followed by the successful upload of a web shell.
- The analyst reviews the most recent authenticated vulnerability scanning results, noting that one of the web servers has not been patched lately and has known vulnerabilities.
- The analyst opens an Investigation, escalates the severity, notifies the customer, and uses the AlienApp for Forensics and Response to collect data such as suspicious files in Internet-accessible locations, files containing references to suspicious keywords, suspicious shell commands, and unexpected processes creating network connections. The ultimate recommendation is that the customer immediately patch the targeted server.
- The customer patches the server and updates rules in their Web Application Firewall to block the scanning IP.
- Before closing the Investigation, the analyst works with the customer to install the AlienVault Agent on key servers, adding an additional layer of available telemetry for future Investigations.
An analyst observes continuous scanning of a local government agency’s network from an IP associated with OTX IOCs.
- Consulting the customer’s IRP, the analyst notes that the customer is also an AT&T Managed Security Services Network Based Firewall customer. Per the IRP, the analyst team has permission to execute MACDs (firewall change requests) on the customer’s behalf for Medium to Critical severity Investigations.
- The analyst blocks the scanning IP by executing a MACD, documents the change in the Investigation, and closes it.
- Over the next few days, the analyst continues to monitor for additional scanning to verify that no additional changes are needed.
While monitoring the infrastructure for a manufacturing company, a SOC analyst observes a “Credential Abuse” alarm from Box, indicating the user is logged in from two countries simultaneously.
- The analyst consults the IRP (noting that the customer has requested to be contacted about any credential abuse alarms), opens an Investigation, and assigns it to the customer.
- The customer confirms with the user that they are at their home office.
- The analyst disables the Box user using the AlienApp for Box and keeps the Investigation open, supporting the customer in their efforts to remediate and understand if credentials have been compromised.
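The idea behind an alarm like this one, a user appearing in two countries at once, can be sketched in a few lines. We don't know how Box or any particular detection implements it, so treat the following as an assumption-laden illustration: the `(user, country, timestamp)` record shape, the function name, and the 30-minute window are all hypothetical.

```python
from datetime import datetime, timedelta


def concurrent_countries(logins, window=timedelta(minutes=30)):
    """Return users seen logging in from two different countries within `window`.

    `logins` is an iterable of (user, country, timestamp) tuples.
    """
    flagged = set()
    by_user = {}
    for user, country, ts in sorted(logins, key=lambda rec: rec[2]):
        recent = by_user.setdefault(user, [])
        # Compare against earlier logins by the same user that fall inside the window.
        if any(c != country and ts - t <= window for c, t in recent):
            flagged.add(user)
        recent.append((country, ts))
    return flagged


logins = [
    ("alice", "US", datetime(2019, 7, 1, 9, 0)),
    ("alice", "RO", datetime(2019, 7, 1, 9, 10)),
    ("bob", "US", datetime(2019, 7, 1, 9, 0)),
    ("bob", "US", datetime(2019, 7, 1, 9, 5)),
]
print(concurrent_countries(logins))  # prints {'alice'}
```

A check like this only raises the question. As in the scenario above, it still takes a conversation with the customer and the user to decide whether the second login is legitimate.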
While performing routine Threat Hunting Activities within an education services company’s environment, an analyst identifies anomalous behavior from a single user.
- An analyst identifies a user running suspicious commands typically used in probing Service Principal Names (SPNs). This could be indicative of “Kerberoasting”, which, if successful, could expose credentials via a brute-force attack.
- With this preliminary observation, an Investigation is created for this activity.
- The analyst now uses the Investigation to add additional suspicious events originating from the identified user (including mixed Kerberos encryption types, erroneous failure codes, and atypical Service Name requests).
- After referring to the customer IRP for permission to take action on users demonstrating suspicious behavior, the “Disable AD User” action is invoked on the suspicious user using the AlienApp for Forensics and Response.
- The analyst will work with the customer to continue to observe and identify any potentially compromised accounts due to the actions of the now disabled user.
- When containment and eradication procedures are concluded, the SOC team works with the customer to implement future detection for the observed actions by creating new Alarm Rules.
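One common hunting heuristic for the Kerberoasting behavior in this scenario is to look for a single user requesting RC4-encrypted service tickets (Kerberos encryption type 0x17) for many distinct service names. The sketch below shows that heuristic over generic event records. It is not the Alarm Rule created in this scenario: the dictionary keys, the function name, and the threshold of five services are assumptions for illustration.

```python
from collections import defaultdict

RC4_HMAC = 0x17  # Kerberos etype 23 (RC4-HMAC), often requested during Kerberoasting


def kerberoast_suspects(ticket_events, min_services=5):
    """Flag users who request RC4 service tickets for many distinct SPNs.

    `ticket_events` is an iterable of dicts with hypothetical keys
    "user", "service_name", and "etype".
    """
    rc4_services = defaultdict(set)
    for event in ticket_events:
        if event["etype"] == RC4_HMAC:
            rc4_services[event["user"]].add(event["service_name"])
    return {user for user, services in rc4_services.items() if len(services) >= min_services}
```

The threshold would need tuning per environment, since some service accounts legitimately request many tickets.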
An analyst observes an alarm from a financial institution that has installed the AlienVault Agent on key servers.
- An analyst creates an Investigation, adding the in-scope alarm and events.
- It is determined that the alarm was generated by one of the Agent’s scheduled queries. This particular query detects any process that is running from the “/tmp” directory and attempting to communicate with a known site used for data exfiltration or command and control.
- Using the AlienApp for Forensics and Response, the analyst executes the “Block Remote Address Outbound” action on the host in order to keep the host running its other services (as it is a critical high availability system).
- The Agent also performs regularly scheduled detection on Root command history as well as file modifications. These are added to the open Investigation as well.
- The SOC team works with the customer to confirm that long-term containment and eradication efforts are completed.
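The logic of the scheduled query in this scenario boils down to two conditions: the process binary lives under “/tmp”, and it is talking to an address on a known-bad list. A minimal sketch of that filter follows. It is not the Agent's actual query: the record keys and function name are hypothetical, and the addresses come from the reserved documentation ranges.

```python
def suspicious_tmp_processes(connections, known_bad_addresses):
    """Return processes running from /tmp that talk to a known-bad address.

    `connections` is an iterable of dicts with hypothetical keys
    "pid", "path", and "remote_address".
    """
    return [
        conn for conn in connections
        if conn["path"].startswith("/tmp/")
        and conn["remote_address"] in known_bad_addresses
    ]


known_bad = {"203.0.113.10"}  # placeholder address from a documentation range
connections = [
    {"pid": 4242, "path": "/tmp/.x/update", "remote_address": "203.0.113.10"},
    {"pid": 811, "path": "/usr/sbin/sshd", "remote_address": "203.0.113.10"},
    {"pid": 4250, "path": "/tmp/build/cc1", "remote_address": "198.51.100.7"},
]
print([c["pid"] for c in suspicious_tmp_processes(connections, known_bad)])  # prints [4242]
```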
A gaming company that has recently migrated their virtual machine infrastructure to Azure generates a “Multiple Accounts Locked Out” alarm.
- An analyst reviews the alarms in order to identify the user accounts involved. All of the account names appear to be random and do not conform to the standard account naming convention of the environment.
- The analyst now reviews the most recent Authenticated Scan for the host and identifies that port 3389 is open on the host.
- An Investigation is created that includes the findings and a request for information on the legitimacy of, and requirement for, port 3389 being open on the host, along with a request to check that the Network Security Groups (NSGs) are properly configured, as this is a recent cloud migration.
- Per the IRP, the analyst then executes the “Block Remote Address Inbound” action of the AlienApp for Forensics and Response against the source hosts initiating the failed login attempts.
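The naming-convention observation in this scenario is easy to automate once the convention is written down as a pattern. The sketch below assumes a hypothetical “first.last” convention purely for illustration. The real convention would come from the customer's environment, and the sample usernames are made up.

```python
import re

# Hypothetical convention: lowercase "first.last", e.g. "jane.doe".
NAMING_CONVENTION = re.compile(r"[a-z]+\.[a-z]+")


def nonconforming_accounts(usernames, pattern=NAMING_CONVENTION):
    """Return the usernames that do not fully match the naming convention."""
    return [name for name in usernames if not pattern.fullmatch(name)]


locked_out = ["admin", "jane.doe", "test1", "oracle"]
print(nonconforming_accounts(locked_out))  # prints ['admin', 'test1', 'oracle']
```

When most or all of the locked-out names fail the check, that points toward guessing against an exposed service rather than real users mistyping their passwords.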
So, there you have it. It’s hard to succinctly summarize security operations, but hopefully, we’ve been able to shed some light on what it’s like to work with the AT&T Managed Threat Detection and Response SOC team. If you’d like to learn more, please visit our product page here!
Mark Gray co-authored this blog. You can find his profile on LinkedIn.