On the one hand, automated incident response offers several benefits: it reduces the mean time to respond to and resolve incidents, scales with an organization’s infrastructure, and avoids dependency on “tribal knowledge.” On the other hand, it raises difficult questions such as these:
Which processes should be automated first?
What is the appropriate balance between autonomous decision-making and human-in-the-loop decision-making?
Which parts of an organization’s infrastructure should be included in the automation?
These questions don’t have standard answers, which makes matters even more complicated. Each organization operates under a unique set of constraints and requirements, and there are multiple ways to design an automated incident response solution.
This article provides six best practices to help organizations decompose the steps for planning an incident response automation program suitable for their needs. It helps them create a repeatable methodology for identifying security management workflows and incrementally automating them based on the interests and preferences of the stakeholders involved.
Summary of best practices for implementing automated incident response
Define the scope and identify the stakeholders
Automated incident response requires coordination between multiple teams and decision-makers (stakeholders). Each stakeholder holds specific responsibilities and owns a distinct part of capability delivery. Understanding these requirements, combining them with the proposed outcomes, and defining areas of responsibility provide the basis for the project's scope.
For example, service availability teams frequently have infrastructure uptime targets and use automated scaling and load-balancing to achieve them. If a proposed automated incident response action automatically quarantines part of the infrastructure under their administrative control, that team would need to be involved.
Similarly, stakeholders have different perspectives on automated incident response. Some may desire a system that eliminates human-in-the-loop decision-making. In contrast, others may wish to ensure that humans make the decisions and trigger only discrete automated procedures.
Establishing the scope identifies these areas of alignment and conflict and the stakeholders who will be impacted.
Here are some recommendations and resources to start this process.
Infrastructure mapping considerations
Automated incident response relies heavily on accurate application dependency mapping (ADM). ADM means identifying the assets that support a particular business application, such as network devices, databases, web servers, virtual machines, and containers. Such an asset inventory helps connect detected events to specific assets and business applications so that corresponding actions can be planned. For example, if a database is compromised, it is important to know whether it supports a mission-critical application or an isolated lab environment before taking action.
When implementing automated incident response, try to start with application environments that have a clear mapping of assets to applications, or with environments small enough that identifying the assets is a simple task. If a well-mapped environment is unavailable, it’s worth expanding the project’s scope first to gather an asset inventory.
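As a rough illustration of why the asset inventory matters, the sketch below looks up a compromised asset and chooses a different response depending on what it supports. The asset IDs, application names, criticality labels, and the `plan_action` helper are all hypothetical; a real inventory would normally live in a CMDB or asset management tool rather than in code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Asset:
    asset_id: str
    asset_type: str    # e.g. "database", "web server", "container"
    application: str   # business application the asset supports
    criticality: str   # "mission-critical" or "lab" in this sketch


# Hypothetical inventory entries, for illustration only.
INVENTORY = {
    "db-01": Asset("db-01", "database", "billing", "mission-critical"),
    "db-lab-07": Asset("db-lab-07", "database", "sandbox", "lab"),
}


def plan_action(asset_id: str) -> str:
    """Pick a response for a compromised asset based on what it supports."""
    asset: Optional[Asset] = INVENTORY.get(asset_id)
    if asset is None:
        # Unmapped assets cannot be handled safely by automation.
        return "escalate: asset not in inventory"
    if asset.criticality == "mission-critical":
        return f"escalate: {asset.application} owner must approve containment"
    return f"auto-contain: {asset.asset_id} supports the isolated {asset.application} environment"


print(plan_action("db-01"))
print(plan_action("db-lab-07"))
```

Note that an asset missing from the inventory falls back to escalation, which is one reason to begin with a well-mapped environment.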
Start with a single environment
If multiple well-mapped environments are available, select one to begin with. Use this environment to navigate the entire scoping process.
When choosing an environment, look for one with interested stakeholders, clearly defined infrastructure, and an openness to adding new features. The upfront planning will simplify the implementation process and help build interest and excitement throughout the organization once the initial well-defined project achieves positive results.
Propose a list of automated processes
After selecting an environment, create a list of potential automated actions. These actions form the basis of an automated incident response capability and will help frame the implementation strategy.
The proposed list of automated actions should be comprehensive enough to demonstrate the effectiveness of an automated incident response approach while also building on the organization's existing strengths and data sources. For instance, automating network packet capture analysis would be impossible if an organization did not capture raw packet data.
The team at Tines has created the publicly accessible “SOC Automation Capability Matrix.” This page lists security operations center (SOC) processes and aligns them with orchestrations and automation workflows. To use it, locate a process matching the targeted environment, then click on the selected process to view a list of pre-developed orchestrations available in the Tines library. You can use these orchestrations as they are or edit them as needed.
For instance, if phishing alerts are a high priority for automation, you could:
Navigate to the SOC Automation Capability Matrix page.
Choose the Phishing Alerts and Reports table.
Review the Description and Techniques sections to make sure that this information matches your expectations.
Choose an existing automation from the list of examples provided.
Edit the chosen example as needed to match your infrastructure.

Selecting a process, reviewing the techniques, and choosing an example within the SOC Automation Capability Matrix (source)
The following workflow is an example of the types of workflows available in the SOC Automation Capability Matrix. In this case, the workflow covers the analysis and triage of suspicious emails:

Analyze and triage suspicious emails with various tools
Submit suspicious emails and investigate with a comprehensive analysis of files, URLs, and headers. Add IOCs to various tool blocklists in order to limit the impact of phishing campaigns.
Tools: CrowdStrike, EmailRep, Jira Software, NextDNS, URLScan.io, VirusTotal
Map proposed actions to impacted stakeholders
With an environment selected and a list of proposed actions available, map the proposed actions against assets and stakeholders. This creates a high-level stakeholder and asset list, which can be added to the scoping process.
The table below provides a simple way to perform this mapping. It lists the proposed actions, each having one row. Each action is associated with the assets that will be impacted and the stakeholders responsible for these assets.
Using a deliberately simplistic list of three proposed actions, the table immediately identifies the impacted assets (web servers, AWS infrastructure, and all employee devices) and the impacted stakeholders (infrastructure, engineering, development, and sales teams). The affected stakeholders and assets can now be added to the project's scope, and discussions can be initiated. Note that the examples in the table below have deliberately truncated the list of impacted stakeholders for readability. For most organizations, the list of impacted stakeholders would be significantly larger.
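The same mapping can be kept as structured data so that the asset and stakeholder lists are derived rather than maintained by hand. The sketch below is a minimal example under stated assumptions: the three action names and the pairing of assets to stakeholders are hypothetical placeholders, and only the asset and stakeholder names themselves come from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    name: str
    assets: tuple[str, ...]
    stakeholders: tuple[str, ...]


# Hypothetical actions and pairings, for illustration only.
PROPOSED_ACTIONS = [
    ProposedAction("Quarantine a compromised web server",
                   ("web servers",), ("infrastructure", "engineering")),
    ProposedAction("Revoke exposed cloud credentials",
                   ("AWS infrastructure",), ("infrastructure", "development")),
    ProposedAction("Force a password reset",
                   ("all employee devices",), ("engineering", "development", "sales")),
]


def scope_from_actions(actions):
    """Collect the unique assets and stakeholders touched by the proposed actions."""
    assets = sorted({a for action in actions for a in action.assets})
    stakeholders = sorted({s for action in actions for s in action.stakeholders})
    return assets, stakeholders


assets, stakeholders = scope_from_actions(PROPOSED_ACTIONS)
print("Assets in scope:", assets)
print("Stakeholders to involve:", stakeholders)
```

Adding a new proposed action to the list automatically surfaces any new stakeholder who needs to join the scoping discussion.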
Develop an implementation plan
When planning the implementation of an automated incident response capability, use a crawl, walk, run approach. This minimizes disruption to existing infrastructure while allowing stakeholders to build confidence in the automated system.
Crawl stage
In this first stage, automate the existing manual tasks. Focus on identifying the differences between an automated approach and the existing manual approach and invest time updating impacted tools and processes.
For instance, imagine that a malicious IP address alert is qualified by manually checking the SIEM for any connections to this IP address in the last 30 days. This is a great candidate for automation, as it is a highly repeatable process with an existing manual playbook.
The implementation plan should allow time for this automation to be developed and tested and for SOC playbooks to be updated.

Crawl stage automation block diagram (source)
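To make the crawl stage concrete, here is a minimal sketch of the 30-day lookup written as a function. It assumes the SIEM query results are already available as a list of dictionaries with `dest_ip` and `timestamp` keys; those field names and the `find_recent_connections` helper are hypothetical, and a real implementation would call the SIEM's own search API instead.

```python
from datetime import datetime, timedelta, timezone


def find_recent_connections(ip, connections, now=None, lookback_days=30):
    """Return the connections to `ip` seen within the lookback window.

    `connections` stands in for SIEM query results (hypothetical shape):
    dicts with a "dest_ip" string and a timezone-aware "timestamp".
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=lookback_days)
    return [
        c for c in connections
        if c["dest_ip"] == ip and c["timestamp"] >= cutoff
    ]


# Example run with made-up log records.
now = datetime(2024, 6, 30, tzinfo=timezone.utc)
logs = [
    {"dest_ip": "203.0.113.10", "timestamp": now - timedelta(days=5)},
    {"dest_ip": "203.0.113.10", "timestamp": now - timedelta(days=45)},
    {"dest_ip": "198.51.100.7", "timestamp": now - timedelta(days=1)},
]
hits = find_recent_connections("203.0.113.10", logs, now=now)
print(f"{len(hits)} connection(s) in the last 30 days")
```

At this stage the function only gathers evidence; the SOC operator still reviews the result and decides what to do with the alert.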
Walk stage
In this stage, the processes from the crawl stage are combined into more complex orchestrations. This allows the automated incident response system to start making decisions about handling certain alerts without requiring human-in-the-loop decision-making.
For example, imagine that the previous malicious IP address workflow is updated so that SOC operators receive only qualified alerts, meaning alerts with confirmed connections to the malicious IP. In this instance, the automation would need branching logic to handle each outcome (connection or no connection), along with a defined way to resolve non-qualified alerts.
The implementation plan should allow time for each branch of this logic to be tested and for updating the SOC playbooks. It should also allow time to discuss practical considerations for non-qualified alerts, such as routine reviews.

Walk stage automation block diagram (source)
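The walk-stage branching could look something like the sketch below, which wraps the 30-day connection check in a decision about where the alert goes next. The record fields (`dest_ip`, `timestamp`), the route names (`soc_queue`, `routine_review`), and the `triage_ip_alert` helper are hypothetical; they only illustrate that each branch needs an explicit, testable outcome.

```python
from datetime import datetime, timedelta, timezone


def triage_ip_alert(ip, connections, now=None, lookback_days=30):
    """Route a malicious-IP alert based on whether recent connections exist.

    `connections` stands in for SIEM query results (hypothetical shape):
    dicts with a "dest_ip" string and a timezone-aware "timestamp".
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=lookback_days)
    hits = [
        c for c in connections
        if c["dest_ip"] == ip and c["timestamp"] >= cutoff
    ]
    if hits:
        # Qualified: confirmed connections, so a SOC operator sees the alert.
        return {"status": "qualified", "route": "soc_queue", "evidence": hits}
    # Non-qualified: closed automatically but kept for routine review.
    return {"status": "non_qualified", "route": "routine_review", "evidence": []}


now = datetime(2024, 6, 30, tzinfo=timezone.utc)
logs = [{"dest_ip": "203.0.113.10", "timestamp": now - timedelta(days=5)}]
print(triage_ip_alert("203.0.113.10", logs, now=now)["route"])
print(triage_ip_alert("198.51.100.7", logs, now=now)["route"])
```

Sending non-qualified alerts to a review route rather than discarding them keeps a record that stakeholders can audit while confidence in the automation is still being built.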