Incident Management in PowerShell: Recovery, Lessons Learned

A week ago, we released PoshSec at BSides Detroit. This is the Steele release (0.2), named in memory of Will Steele. Will, who launched the PoshSec project, passed away last year. PoshSec is available for download on GitHub.

This is the final part in our series on Incident Management. Incident Management consists of the following stages: Preparation, Identification, Containment, Eradication, Recovery, and Lessons Learned. Today, we will look at the long tail of recovery and at lessons learned.

Recovery — Monitoring
The immediate and most visible aspect of recovery is resuming services on the breached system. Resumption gets us back in business. But monitoring keeps us in business. When you listen to Josh Little’s BSides Detroit presentation (A Cascade of Pebbles: How Small Incident Response Mistakes Make for Big Compromises), note how many times the recovery was executed without follow-up monitoring. At each pebble, the responsible parties cleaned up what they saw without raising the alarm. This allowed the attackers to remain in the network for weeks.

The first lesson is to communicate identified security breaches. The second lesson is to maintain a high degree of vigilance for at least two weeks following any incident.

The team’s schedule must be re-prioritized to allocate more time for monitoring post-breach. The bad guys have appeared, and stared into our souls. The attackers potentially have all the information they need for subsequent attacks, phishing attacks, or social engineering attacks. We also do not know for certain that we have not restored infected data. Therefore, plan to spend more time with PowerShell reviewing the logs.
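
As a minimal sketch of what that log review could look like, the helper below counts events per day so that a spike stands out during the monitoring window. Get-DailyEventCount is a hypothetical name for illustration, not a PoshSec function. The commented example assumes failed logons (event ID 4625) in the Security log are the events of interest, and that the session has rights to read that log.

```powershell
# Sketch only: Get-DailyEventCount is a hypothetical helper, not a PoshSec function.
function Get-DailyEventCount {
    param(
        # Any objects that expose a TimeCreated [datetime] property,
        # such as the records returned by Get-WinEvent.
        [object[]]$LogEvent
    )

    # Nothing to count if no events were passed in.
    if (-not $LogEvent) { return }

    $LogEvent |
        Group-Object { $_.TimeCreated.ToString('yyyy-MM-dd') } |
        Sort-Object Name |
        Select-Object @{ Name = 'Day'; Expression = { $_.Name } }, Count
}

# Example (Windows, elevated session): failed logons from the last 14 days.
# $filter = @{ LogName = 'Security'; Id = 4625; StartTime = (Get-Date).AddDays(-14) }
# Get-DailyEventCount -LogEvent (Get-WinEvent -FilterHashtable $filter)
```

Keeping the counting separate from the query means the same helper can be reused for other event IDs or logs by changing only the filter.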

With the exception of remedial changes, put in place a change freeze. Remedial changes include closing the vulnerability that was used in the security breach, resetting passwords to void any potentially captured hashes, and hardening along the kill chain. Other than these, minimize changes to reduce the likelihood of the security team mistaking a legitimate change for an attack, or vice versa.

In sum, the last phase in the recovery stage is increased monitoring. Watch our baselines and our honey tokens closely, and be prepared for subsequent attacks.
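
One way to watch a baseline is to hash a set of files, store the result, and compare it against a fresh snapshot on each review. The sketch below illustrates the idea. Both function names are hypothetical and are not PoshSec's own baseline functions, and Get-FileHash assumes PowerShell 4.0 or later.

```powershell
# Sketch only: both functions are hypothetical helpers for illustration.
# Get-FileHash requires PowerShell 4.0 or later.
function Get-FileBaseline {
    param([Parameter(Mandatory = $true)][string]$Path)

    # Record the full path and SHA-256 hash of every file under $Path.
    Get-ChildItem -Path $Path -Recurse -File |
        Get-FileHash -Algorithm SHA256 |
        Select-Object Path, Hash
}

function Compare-FileBaseline {
    param(
        [Parameter(Mandatory = $true)][object[]]$Baseline,
        [Parameter(Mandatory = $true)][object[]]$Current
    )

    # A changed file appears twice (old hash and new hash);
    # an added or removed file appears once.
    Compare-Object -ReferenceObject $Baseline -DifferenceObject $Current -Property Path, Hash |
        Select-Object Path, Hash,
            @{ Name = 'Source'; Expression = { if ($_.SideIndicator -eq '<=') { 'Baseline' } else { 'Current' } } }
}
```

In use, we would capture Get-FileBaseline output once after the system is restored, save it somewhere the attacker cannot reach, and pass it as -Baseline on each later check. No output from Compare-FileBaseline means nothing under the path has changed. An empty folder would need separate handling, since the mandatory parameters reject empty arrays.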

Lessons Learned
And this brings us to the Lessons Learned stage. Ideally, this stage has three outputs: a root cause analysis (RCA) document with suggestions for improvements; a threat scenario write-up; and an indicators of compromise (IOC) document.

The PowerShell transcripts and the logs are a key source of information for these outputs. Transcription must be enabled during the incident (Start-Transcript, Stop-Transcript). Together with the logs, the transcripts allow the incident to be pieced back together.
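
A minimal sketch of wrapping response work in a transcript follows. The file name pattern and the use of the temp folder are assumptions for illustration. In a real incident the transcript would more likely go to a protected share. The try/finally block is there so that the transcript is closed even if a command fails partway through.

```powershell
# Sketch only: the location and file name pattern are assumptions.
$transcriptDir  = [System.IO.Path]::GetTempPath()
$transcriptName = 'incident-{0:yyyyMMdd-HHmmss}.txt' -f (Get-Date)
$transcriptPath = Join-Path $transcriptDir $transcriptName

Start-Transcript -Path $transcriptPath | Out-Null
try {
    # Response commands go here; these two are placeholders.
    Get-Date
    Get-Process | Select-Object -First 5 Name, Id
}
finally {
    # Always close the transcript, even when a command above throws.
    Stop-Transcript | Out-Null
}
```

A timestamp in the file name keeps each responder's session in its own file, which makes it easier to line the transcripts up against the logs later.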

The RCA can be created during the review. Hold a minimum of two review meetings. In the first, do a table-top exercise and walk through the incident. Capture all the key facts on a timeline. Between the first and second meeting, circulate this timeline and solicit feedback and additional detail. Then hold a second review meeting, and identify at least one improvement for each stage: Preparation, Identification, Containment, Eradication, Recovery.
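
The timeline itself can be kept as simple structured data, which makes it easy to sort, circulate, and amend between meetings. The sketch below is one possible shape. New-TimelineEntry and its field names are hypothetical, the two entries are placeholders rather than real incident data, and the CSV is written to the temp folder only for illustration.

```powershell
# Sketch only: New-TimelineEntry and its field names are hypothetical.
function New-TimelineEntry {
    param(
        [Parameter(Mandatory = $true)][datetime]$Time,
        [Parameter(Mandatory = $true)]
        [ValidateSet('Preparation', 'Identification', 'Containment', 'Eradication', 'Recovery')]
        [string]$Stage,
        [Parameter(Mandatory = $true)][string]$Source,
        [Parameter(Mandatory = $true)][string]$Detail
    )

    [pscustomobject]@{ Time = $Time; Stage = $Stage; Source = $Source; Detail = $Detail }
}

# Placeholder entries; real ones come from the transcripts and the logs.
$timeline = @(
    New-TimelineEntry -Time '2013-01-01 10:15' -Stage Containment -Source 'Transcript' -Detail '<what was done>'
    New-TimelineEntry -Time '2013-01-01 09:30' -Stage Identification -Source 'Security log' -Detail '<what was seen>'
) | Sort-Object Time

# Write the timeline to a CSV file that can be circulated for feedback.
$timelinePath = Join-Path ([System.IO.Path]::GetTempPath()) 'incident-timeline.csv'
$timeline | Export-Csv -Path $timelinePath -NoTypeInformation
```

Tagging each entry with its stage means the second meeting can filter the timeline by stage when looking for an improvement in each one.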

Depending on the attack, the situation, and the likelihood of recurrence, we may want to create a threat scenario. A threat scenario is a sanitized version of the attack that highlights the tactics an attacker uses. The scenario covers the vulnerabilities, threats, and the business impact. Such documents can then be used by the security and operations teams for training purposes.

Finally, we generate an IOC to be shared with the wider security community. As mentioned in the Identification article, groups such as Information Sharing and Analysis Centers (ISACs) have been set up to share IOCs. As attackers often reuse the same modus operandi, sharing IOCs with our peers allows us to build better defenses.

We take from the community in Identification by leveraging others’ IOCs. We give back to the community in Lessons Learned by sharing our IOCs.

With that, we have learned from our mistakes, implemented the learning in our training programs, and fed the information back to our peers. Only then can we say that we have completed the security incident.

Summary
Incident Management is a formal process like Business Continuity. The objective of both is reducing the impact on the organization. To do this, we plan and prepare for failures and for breaches. We train for success. We automate key tasks to reduce errors and speed up our response. When an event does occur, we respond by following our training and by implementing our plans. Once done, we pause, reflect, and find ways to improve our game.

Done right, Incident Management is a measured response that deflects the attack and leaves our organization in a stronger position afterwards. Let’s do it right.

This article series is cross-posted on the PoshSec blog.

Posted June 14, 2013 by Wolf Goerlich