
Heya - HollyGraceful here, I make all of this content in my spare time, like it? Please support me :)
You can donate via Bitcoin or Patreon!

Incident Response Handbook

Summary

The period of dealing with a security breach is one of tension. If a company is not adequately prepared for the efficient handling of an incident then a time of tension becomes one of crisis.

A company that sustains a data breach without an incident response plan will find it is underprepared for post-breach activities. This document has two aims. The first is to give companies that have suffered a data breach something to base their actions on if they do not already have a prepared incident response plan. The second is to give companies a baseline to start from when developing their own plans before a breach has occurred.

 

Introduction

Current industry best practice defines “Six Steps” for Incident Response. Implementations vary; however, the general themes are the same. There are two main “Incident Response Lifecycles”. The first is adopted by SANS (https://www.sans.org/reading-room/whitepapers/incident/incident-handlers-handbook-33901) and the second is described in NIST 800-61 (http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-61r2.pdf).

 

“Six Steps” | NIST 800-61r2
Preparation | Preparation
Identification | Detection and Analysis
Containment | Containment, Eradication, and Recovery
Eradication | Containment, Eradication, and Recovery
Recovery | Containment, Eradication, and Recovery
Lessons Learned | Post-Incident Activity

 

This document will follow the “Six Steps”; however, there is obviously a lot of crossover with the NIST documentation.

This document aims to guide you through the stages of a response and highlight specific key actions, stakeholders and checks that should be performed throughout. It will also include templates that can be used to record actions throughout the process.

No single step for handling a breach is difficult, but a combination of the urge to just run in to start working and many moving parts means handling a breach can be complex.

 

Response Steps – In Short

Preparation

Establish an Incident Response Team (IRT)

Define the types of breach that are the responsibility of the IRT

Enact Policy to allow the IRT to monitor system usage and traffic

Define a threshold for the activation of the Incident Response Plan

Decide if responders should “pull the plug” or “wait and see”

Establish a centralized source of time and configure NTP network wide

Establish a centralized location for the aggregation of logs

Legal and jurisdictional issues

Gather important contact details

 

Identification

Start incident response documentation

Gather intelligence to determine incident potential

Assign initial handlers and note the start time in the IR log

Review potentially affected systems for system consistency

Threat intelligence, contextualization, attribution and motivation

 

Containment

Determine the type of incident

Determine if the attack is multifaceted

“Pull the plug” or “wait and see”

Determine and minimize the business impact

 

Eradication

Root cause identification

Determine rootkit potential

Harden the system

 

Recovery

Validation of eradication

Restore operations

Improve monitoring

On-going technical prevention

Vulnerability analysis and penetration testing

 

Lessons Learned

Incident response report

Debriefing with senior members of business

Future recommendations

Close the incident and all documentation, e.g. the Incident Log

 

Preparation

The main aim of the preparation stage of incident response is to assign a team, enact policies that empower that team to conduct their work effectively, configure systems so that dealing with a response is as simple as possible, and answer difficult questions while they can be thought through clearly, instead of making important decisions under the pressure of a real breach.

 

Establish an Incident Response Team (IRT)

Assigning team members to the IRT is the first step; however, team members alone are not enough. They will require training and resources to perform the tasks in their role.

Courses and Exams:

CREST Registered Incident Analyst

SANS Advanced Digital Forensics and Incident Response

GIAC Certified Incident Handler (GCIH)

 

The team will likely require resources in the form of a “Jump bag” that contains:

  • Printed copy of the IRT call list
  • Bootable USB and DVD-ROM with response software loaded
  • A laptop with forensics software, anti-malware, and monitoring tools
  • Toolkit, including screwdrivers and network cables
  • Clean hard-disks and a write blocker
  • Bound notebooks and pens
  • All appropriate paperwork including incident logging forms
  • Ear plugs (Datacenters are loud!)
  • Other PPE required for working environment
  • Flashlight (and batteries)
  • Something to keep all of the above in
  • Network Tap

 

Enact Policy to Allow the IRT to Monitor System Usage and Traffic

There are certain tasks that the IRT may need to perform to handle an incident effectively; however, legal restrictions or policy requirements may apply before they can do so. This generally concerns the monitoring of system usage and network traffic, where it may be a requirement to ensure that end users are aware of such monitoring.

 

Legal and Jurisdictional Issues

There are several legal and compliance restrictions and requirements in regards to incident response. Key laws include:

  • Computer Misuse Act 1990
  • Human Rights Act 1998
  • Data Protection Act 1998
  • Police and Justice Act 2006
  • Regulation of Investigatory Powers Act 2000
  • Protection of Children Act 1978
  • Sexual Offences Act 2003
  • Wireless Telegraphy Act 2006
  • Digital Millennium Copyright Act (and consequences of reverse engineering)

There are also legal items of note around evidential integrity and chain of custody. Furthermore there may be non-legal requirements enforced through regulatory bodies such as FSA and PCI.

 

Decide if responders should “pull the plug” or “wait and see”

A decision should be made on the company’s stance for responding to a potentially compromised machine, whether the responder should “Pull the plug” or “Wait and see”.

The decision rests on many factors, one being the company’s appetite for dealing with the data loss that results from a server undergoing an immediate loss of power. Additionally, the reason not to power down a device immediately may be the desire to capture the volatile memory of the device before seizing and containing it.

The system may need to be set up to take a memory dump and this should be investigated before it is required. There are three simple ways of causing a memory dump:

The first is to enable the keyboard combination to force a system crash (so that a crash-dump can be captured) – Information and specifics are available here: https://msdn.microsoft.com/en-us/library/windows/hardware/ff545499(v=vs.85).aspx

Alternatively the same can be achieved through interfaces such as iLO and DRAC by causing a Non-Maskable Interrupt; information on that is available here: http://deusexmachina.uk/win/nmi.php. If this method is utilized then the incident response team will require credentials to all iLO/DRAC style devices.

Finally, a third way is to boot from a USB drive designed to dump the RAM of the system. These should be prepared ahead of time and the responder should ensure that the system can boot from USB when required.

 

Establish a centralized source of time and configure NTP network wide

Throughout an investigation the response team will need to create a timeline of the events that took place during the breach. The task of analyzing events and logs will become incredibly complex if no centralized time is configured.

NTP should be configured on all devices that support it, and the organization should decide on a single GMT offset or a consistent time zone to use across all systems.
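Where clocks were not synchronized before the breach, a timeline can still be built by correcting each host's log times with a measured offset, in the style of the System Time Offset Log in Appendix D. The following is a minimal sketch rather than a definitive tool. It assumes the offset is recorded as "host clock minus reference clock", so the offset is subtracted from the logged time. WebServer1 and both events are hypothetical, invented for illustration.

```python
from datetime import datetime, timedelta


def parse_offset(text):
    """Parse a signed 'hh:mm:ss' offset such as '+04:07:00' into a timedelta."""
    sign = -1 if text.startswith("-") else 1
    hours, minutes, seconds = (int(part) for part in text.lstrip("+-").split(":"))
    return sign * timedelta(hours=hours, minutes=minutes, seconds=seconds)


def to_reference_time(logged, offset):
    """Convert a host's logged time to reference time.

    Assumption: offset = host clock - reference clock, so it is subtracted.
    """
    return logged - offset


# Offsets as they might appear in the System Time Offset Log (hypothetical values).
offsets = {
    "FileServer1": parse_offset("+04:07:00"),
    "WebServer1": parse_offset("-00:00:30"),
}

# Hypothetical log events: (hostname, time as logged by that host, description).
events = [
    ("FileServer1", datetime(2016, 5, 1, 14, 7, 0), "suspicious login"),
    ("WebServer1", datetime(2016, 5, 1, 9, 59, 0), "unexpected file upload"),
]

# Correct every event to reference time, then sort to produce one timeline.
timeline = sorted(
    (to_reference_time(when, offsets[host]), host, what)
    for host, when, what in events
)

for when, host, what in timeline:
    print(when.isoformat(), host, what)
```

With these made-up values the upload on WebServer1 sorts before the login on FileServer1, even though the raw log times suggest the opposite order.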

 

Establish a centralized location for the aggregation of logs

The act of monitoring logs will be much simpler, and the level of trust that can be placed on any logs collected will be higher, if a centralized and managed logging solution is utilized.
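As one illustration of what "centralized" can look like for an in-house application, the sketch below forwards log records to a syslog collector using Python's standard library. It is only a sketch under assumptions: the collector address is a placeholder (127.0.0.1 here so the example runs anywhere), the logger name is hypothetical, and a real deployment would more likely rely on the organization's chosen logging agent or SIEM forwarder.

```python
import logging
import logging.handlers


def build_central_logger(name, host="127.0.0.1", port=514):
    """Return a logger that forwards records to a syslog collector over UDP.

    The host and port are placeholders; substitute the address of the
    organization's central log server.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    # SysLogHandler sends over UDP by default when given a (host, port) tuple.
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger


# Hypothetical application logger for illustration.
logger = build_central_logger("irt.example-app")
logger.info("user login succeeded for account 'jsmith'")
```

UDP syslog is fire-and-forget, so records can be lost without any error. If the logs may be relied on as evidence, a more reliable transport is worth considering.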

 

Define a threshold for the activation of the Incident Response Plan

The IRT should be given explicit guidance on when they are empowered to activate the full incident response plan. The method or process for activating the plan should also be documented for responders.

 

Gather Important Contact Details

There are many third parties that a responder may need to contact during an incident. For example, if a company experiences a distributed denial-of-service attack it may seek assistance from the ISP to mitigate the traffic. Additionally, contact details for representatives at any cloud service providers used by the company should be gathered.

 

Ongoing Preparation

So there is a lot to do in preparation for a data breach, but times of incident response are times of tension: the better equipped your team is, the better off you will be in the long run. With any luck you will never have to activate your response plan for a full-on code red; however, even for smaller breaches it can increase the effectiveness of your team, making the efforts to return to normal more efficient.

It should also be noted that preparation does not end here as a done item ticked off the list. Instead, as changes are made to the team and the systems under their responsibility, this plan should be revisited. For example, ensure that all new team members are appropriately trained and ensure that all new systems are appropriately configured to make use of that new NTP and logging system you have just set up!

 

Identification

The identification phase is the first phase following an actual breach. If you have been breached and you have no incident response plan, you will find yourself scrambling to work out what you should do first and making a lot of decisions after the fact.

However, if you have already run through preparation, at this stage it is time to put it all into practice. There are effectively two main ways of identifying an incident: through an internal employee or through an external agency such as law enforcement.

Effectively you can either have an internal report of a security incident, such as from a developer looking through some application logs discovering something suspicious or perhaps an end user noticing their machine “acting funny”. Alternatively you can be informed of a breach by a third party, such as a banking partner or law enforcement. In fact statistics show that notification by a third party is the most likely event, by a long way. (http://www.verizonenterprise.com/verizon-insights-lab/dbir/2016/) (http://www.computerworlduk.com/news/security/most-data-breaches-still-discovered-by-third-parties-3615783/)

It is likely a good idea to have two incident responders available to handle any incident, wherever possible. This allows one person to manage the wider aspects of moving through the response plan whilst a second person can concentrate on evidence acquisition.

This is the stage at which you move from potential breach to initiating the response policy and so it’s important to get the paperwork right the first time around. Go ahead and take a look at the appendix documents and you will find two critical documents – an “Incident Response Form” and an “Incident Log”.

The Incident Response Form should be used when making contact with the third party or internal staff member who wishes to report a potential breach. The idea of this form is simply to take a note of the situation as the reporter understands it, take an initial log of potentially affected machines and to gain contact details for the reporter should anything be missed.

The “Incident Log” should be used as a master timeline for events and decisions performed by the IRT. Open the log by noting down the time the response form was filled out. This is effectively the start time of the response measures – if you are unlucky and this turns into a full-on data breach, there may be a requirement to inform third parties, such as notifying affected data subjects or notifying a regulatory body. For example in the UK the ICO expects to be notified of serious data breaches within 24 hours (or where not possible this should be “without undue delay” and a reason for the delay should be reported).

By starting the log at the point of report you have started the clock and can track how effective the response has been on a pure speed basis. As you move through actions in the response plan you can also track which aspects have been completed, who completed them, the critical decisions made and who made them, and keep a central record of all other documentation created. So, in the event that a report must be given to a third party, generating that report will become much simpler.

Now hopefully, as part of your preparation, you will have contacted your legal department and had a documented conversation with them about the when, how and who of notifying a third party.
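If the Incident Log is kept electronically rather than on the paper form in Appendix C, it can be sketched as a simple append-only structure with the same three columns (DTG, Event, Initials). The class and field names below are hypothetical, and the 24-hour window is simply the figure quoted above; whether any deadline applies to a given incident is a question for the legal department, not for code.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone


@dataclass
class IncidentLog:
    """Minimal electronic stand-in for the paper Incident Log (hypothetical)."""

    incident_number: str
    responder: str
    entries: list = field(default_factory=list)  # (DTG, event, initials) tuples

    def record(self, event, initials, when=None):
        """Append an entry; defaults to the current time in UTC."""
        when = when or datetime.now(timezone.utc)
        self.entries.append((when, event, initials))
        return when

    @property
    def opened_at(self):
        """The first entry starts the clock for the whole response."""
        return self.entries[0][0] if self.entries else None

    def notification_deadline(self, window=timedelta(hours=24)):
        """Time by which a notification would be due, given the window."""
        if self.opened_at is None:
            raise ValueError("the log has no entries yet")
        return self.opened_at + window


log = IncidentLog(incident_number="IR-0001", responder="A. Responder")
log.record(
    "Incident Response Form completed",
    "AR",
    when=datetime(2016, 5, 1, 10, 0, tzinfo=timezone.utc),
)
log.record(
    "Initial handlers assigned",
    "AR",
    when=datetime(2016, 5, 1, 10, 15, tzinfo=timezone.utc),
)
print("Opened:", log.opened_at.isoformat())
print("24h deadline:", log.notification_deadline().isoformat())
```

Recording times in UTC keeps the log independent of whichever time zone each responder's laptop is set to.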

If you’re in the unfortunate position of dealing with a security incident without a response plan and you have made it this far down, it’s likely a good idea to open a conversation with the legal department to determine if you have any legal or regulatory requirements to notify a third-party in the event of a breach and determine what the threshold of notification is.

 

UK – Notifying the Information Commissioner’s Office

In the UK there may be a requirement to notify the ICO. To determine this you should contact a lawyer; the author is not a lawyer. However, to give you a quick overview:

The primary piece of law which is applicable to data security is the Data Protection Act 1998 which defines 8 principles for the processing of personally identifiable information. Principle number seven is:

Appropriate technical and organisational measures shall be taken against unauthorised or unlawful processing of personal data and against accidental loss or destruction of, or damage to, personal data.

It is the ICO who effectively enforce the act and they further the above by stating:

You will need to:

  • design and organise your security to fit the nature of the personal data you hold and the harm that may result from a security breach;
  • be clear about who in your organisation is responsible for ensuring information security;
  • make sure you have the right physical and technical security, backed up by robust policies and procedures and reliable, well-trained staff; and
  • be ready to respond to any breach of security swiftly and effectively.

So you must keep data secure and the ICO has explicit actions that are expected. However, if you fail to keep data secure and a breach occurs, do you have to publicly disclose this fact? That’s an interesting question, because under the Data Protection Act you do not. Under the Privacy and Electronic Communications Regulations (PECR) (an EC Directive), though, Service Providers do.

The PECR is pretty clear: Service Providers (such as an ISP or a telecommunications provider) must notify the Information Commissioner’s Office if a personal data breach occurs. But if you are not a Service Provider, what are your requirements?

Well the ICO has a paragraph on this:

Under the Data Protection Act (DPA), although there is no legal obligation on data controllers to report breaches of security, we believe that serious breaches should be reported to the ICO.

So what constitutes a serious data breach? The ICO gives an example in their guidance where an attacker could have potentially accessed over 1000 records. Unless those records are particularly sensitive, or have mto the relevant data subject. A more sensitive record could be one that includes national insurance number, or passport number, for example.

The word “potential” is important here. If, say, an administrative account is compromised then potentially all records have been breached, and it would be up to the breached company to prove otherwise. This could be done from database logs, for example, if those logs can be shown to be trustworthy – i.e. are not themselves compromised.

Under ICO guidance a breach notification should be submitted within 24 hours, however for a number of real world breaches this is unrealistic as many steps need to be followed through such as gathering the actual details of a breach, informing internal parties and determining the amount of data and types of data that may have been breached. The ICO has commented that it would prefer a full picture of the breach and not an initial disclosure and then a follow up report. Therefore if it is not “feasible” to comply within 24 hours a disclosure “without undue delay” may be taken, where full details are supplied quickly with an appropriate reason for the delay.

So if you are not legally required to notify in the event of a breach, why bother? Well the ICO Statutory Guidance on the issue states in relation to determining the size of the fine for a breach the following will be taken in to account: “What steps the person had taken once they became aware of the contravention (for example, concealing it, voluntarily reporting it to the Commissioner, or not taking action once the commissioner or another body had identified the contravention)”

Once a breach has been reported to the ICO, either voluntarily by the organisation or by a third-party aware of the breach, the ICO may: work with the data controller to ensure within a certain timeline that compliance is achieved, they may issue an enforcement notice to compel the company to comply within a fixed timeline (failure to comply is a criminal offence), and they may issue a Section 55A fine of up to £500,000.

Still not a lawyer.

 

 

Affected Systems

Your initial incident report will likely give indications as to potentially affected machines. However, bear in mind that additional systems may be affected to varying degrees. A good example of this comes with malware outbreaks: several systems may have been identified as infected through obvious symptoms (such as ransom notices!), but more machines may have a payload stored on their disks that has not been opened yet, or more users may have received the delivery email and not yet downloaded the payload.

It’s important to identify all systems that are affected and in many cases analysis of obviously infected systems will give key indicators of the root cause of infection and indicators to determine if additional machines are infected but in a not-active state.
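One simple kind of indicator is the file hash of a known payload. As a rough sketch of how such an indicator could be used to look for dormant copies, the code below walks a directory tree and reports files whose SHA-256 matches a set of known hashes. The function names, file names and payload bytes are all hypothetical. It is assumed to run against a mounted copy of a forensic image rather than the live machine, since even reading files on a live system can alter it.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_known_payloads(root, known_hashes):
    """Return paths under root whose SHA-256 is in known_hashes."""
    matches = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            file_hash = sha256_of(path)
        except OSError:
            continue  # unreadable file: skip it rather than abort the sweep
        if file_hash in known_hashes:
            matches.append(path)
    return matches


# Demonstration against a scratch directory with made-up files.
with tempfile.TemporaryDirectory() as scratch:
    (Path(scratch) / "invoice.doc").write_bytes(b"example payload bytes")
    (Path(scratch) / "notes.txt").write_bytes(b"harmless")
    indicators = {hashlib.sha256(b"example payload bytes").hexdigest()}
    hits = [path.name for path in find_known_payloads(scratch, indicators)]

print(hits)
```

Hash matching only finds byte-identical copies. Payloads that are repacked per victim will not match, so this complements rather than replaces other indicators.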

In addition to gathering technical information about an infected host, it’s a good idea to get the responders on the ground to take note of the types of machines infected and the reasons that they are in use. If all of the infected machines are unimportant easily replaceable end user devices things can be simple – however if the affected system is the sole domain controller for the network (or other critical system) things will be complex.

Also bear in mind that attackers have previously been shown to perform more complex attacks, such as Distributed-Denial-of-Service attacks paired with data breaches to act as a distraction, or BotNets being used to distribute ransomware. Security incidents should be fully investigated to ensure that the situation is treated accordingly and that you’re not just treating the visible symptoms.

 

Containment, Eradication and Recovery

At this stage you should have gathered a large amount of information about the incident taking place. With the information that you have gathered you should be able to classify the type of incident that is occurring.

However, it’s easy to forget that complex attacks can occur. Don’t get drawn into diagnosing and following a single aspect of a breach while ignoring other sensors and warnings. A ransomware attack can be used to mask a data breach, for example.

However the effects and type of attack will drive the next steps, so it’s a good idea to categorize the attack – just don’t forget attacks can be complex.

 

Example attack types:

Data Breach

Denial-of-Service

Malware Infection

Data Loss

Site Defacement

Policy Violation

Social Engineering & Phishing

 

The immediate action at this stage, now that you have gathered data about the attack, is to stop the attacker. This can be through disabling accounts, changing account passwords, or implementing firewall rules. Avoid changing anything on affected machines before you have gathered volatile memory and a forensic disk image of the device. Avoid tipping off the attacker before images can be taken.

On that note implement a need to know policy and consider using an out-of-band communication method (so, not corporate email) to ensure an attacker cannot keep tabs on the responders.

Consider collecting network traces, gathering logs, gathering volatile data and a disk image. The latter two items may cause disruption so always confirm with the wider business about the down time and make a plan to manage that.
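Because evidential integrity and chain of custody matter for anything collected here, it is common practice to hash each image at acquisition and re-check that hash whenever the image changes hands. The sketch below shows the idea only. The record fields, file name and incident number are hypothetical, and a real process would follow whatever evidence-handling procedure the legal team has agreed.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def custody_record(image_path, acquired_by, incident_number):
    """Hash an acquired image and return a record for the evidence paperwork."""
    digest = hashlib.sha256()
    with open(image_path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return {
        "incident_number": incident_number,
        "file": Path(image_path).name,
        "size_bytes": Path(image_path).stat().st_size,
        "sha256": digest.hexdigest(),
        "acquired_by": acquired_by,
        "recorded_at_utc": datetime.now(timezone.utc).isoformat(),
    }


def still_matches(image_path, record):
    """Re-hash the image and compare it with the hash taken at acquisition."""
    fresh = custody_record(image_path, record["acquired_by"], record["incident_number"])
    return fresh["sha256"] == record["sha256"]


# Demonstration with a small stand-in file instead of a real disk image.
with tempfile.TemporaryDirectory() as scratch:
    image = Path(scratch) / "fileserver1.dd"
    image.write_bytes(b"disk image stand-in")
    record = custody_record(image, "AR", "IR-0001")
    intact = still_matches(image, record)
    image.write_bytes(b"disk image stand-in, altered")
    alteration_detected = not still_matches(image, record)

print(json.dumps(record, indent=2))
print("intact:", intact, "alteration detected:", alteration_detected)
```

The hash should be written into the Incident Log at acquisition, so that the record of the image does not live only alongside the image itself.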

As soon as possible the system should be removed from the environment and a hardened replacement brought up, either a direct physical replacement or a cleaned instance of the machine. It is recommended that affected machines are wiped and rebuilt to remove any doubt of rootkits. When machines are brought up they must be hardened to prevent a repeat occurrence.

 

Eradication

At this stage take a wider view of the network; steps will likely have been taken on the affected machines to harden them, look to deploy these countermeasures estate-wide.

Consider additional security assessments such as penetration testing and vulnerability analysis. If attackers had more than one way in, it may not be enough to plug just the one issue. Wider issues should be investigated.

Additional monitoring should be considered to ensure that a repeat attack is caught, but also similar attacks. Intrusion Detection Systems like Snort can be configured with custom rules that could be used to alert on additional attacks of a similar nature.
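To give a flavour of what a custom rule might look like, the sketch below builds a basic Snort 2 style content rule from an indicator found during the investigation. The helper function, the message text and the example indicator are all hypothetical, and the rule is deliberately simple. A production rule would normally add options such as flow direction and be tested against captured traffic before deployment.

```python
def snort_content_rule(sid, message, content, port=80):
    """Build a simple Snort 2 style rule alerting on a byte string in TCP traffic.

    Assumptions: $EXTERNAL_NET and $HOME_NET are defined in the sensor's
    configuration, and SIDs from 1,000,000 upwards are used for local rules.
    """
    if sid < 1_000_000:
        raise ValueError("local rules should use a SID of 1,000,000 or above")
    for value in (message, content):
        # These characters need escaping or hex encoding inside rule options.
        if any(ch in value for ch in '";\\'):
            raise ValueError("escape or hex-encode special characters first")
    return (
        f"alert tcp $EXTERNAL_NET any -> $HOME_NET {port} "
        f'(msg:"{message}"; content:"{content}"; sid:{sid}; rev:1;)'
    )


# Hypothetical indicator: a path requested by the attacker during the breach.
rule = snort_content_rule(
    sid=1000001,
    message="IR-0001 repeat request for known malicious path",
    content="/uploads/shell.php",
)
print(rule)
```

Generating rules from a list of indicators keeps them consistent and makes it easy to trace each SID back to the incident that produced it.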

 

Recovery

At this stage the initial attack should be contained and steps should be in place to ensure no further attacker access, so the company can move on to recovery. This stage involves validating that no systems are showing signs of compromise.

Once this has been shown, coordination with the business to restore disabled or reduced services can begin uninterrupted.

 

Lessons Learned

During the final stage of response it’s time to close off all documentation, ensuring that the “Incident Log” has a record of all other documentation created throughout the response.

Utilizing the responders’ notes, incident log and associated documentation, a report should be created. The report should cover the work carried out during the response but also address any longer-term work that is needed. No doubt items and issues were highlighted through the response process – these could be architectural, policy-based, or a lack of resources for the IRT.

Additionally a business debriefing should be held to run appropriate senior business members through the events of the breach, the steps taken to recover and the requirement for longer-term improvements.

Finally the whole process should be reviewed to identify methods by which the incident response process can be improved.

 

Appendix A: Pre-engagement Questionnaire

When responding to a breach, as a first action have a pre-engagement meeting with the client and fill in this questionnaire to gather initial information about the client’s capability for response and to ensure that critical questions are asked before the point at which they are required.

The questions here are specifically modeled on the questions asked by the ICO in a data breach notification. Asking the questions at the start of an investigation allows discussion to be opened with the client around difficult questions of risk, exposure, and the requirements for further investigation.

If an answer is not known, simply omit it and make note to investigate further should this be required.

Responder  

 

Incident Number  

 

 

Organization Details

Client Company Name:

Designated Data Controller:

Main Contact:

 

Details of the Data Protection Breach

Please describe the incident in as much detail as possible, when did it happen and how did it happen?

What technical measures were previously in place to prevent an incident of this specific nature happening?

What measures have been put in place, or are you considering, to prevent this incident happening again in the future?

What policies or procedures were in place at the time of the incident that are specifically designed to deal with issues raised and when were they implemented?

 

Personal Data Placed at Risk

Were any accounts compromised directly? Note the number of accounts compromised, their network privilege level and the amount of personal data that the account could gain access to.

Is the client aware of multifaceted attacks, for example machines compromised by BotNets being used to deploy Ransomware or Denial-of-Service attacks being used to hide data loss due to a data breach?

What personal data has been placed at risk? Consider specifically personally identifiable information, sensitive personal information and financial information.

How many individual data subjects are affected?

Are the affected individual data subjects aware that the incident has occurred?

What are the potential consequences and adverse effects on those people?

Have any affected individuals complained to the organization about the incident?

 

Containment and Recovery

Has the organization taken any action to minimize/mitigate the effect on the affected individuals? If so, please provide details.

Has the data placed at risk now been recovered? If so, please provide details as to how and when this occurred.

What steps has your organization taken to prevent a reoccurrence of this incident?

 

Training and Guidance

As the data controller, does the company provide staff with training on the requirements of the Data Protection Act? If so, please provide any extracts relevant to this incident here.

Please confirm if training is mandatory for all staff. Had the staff members involved in this incident received training and if so when?

As the data controller, does the organization provide any detailed guidance to staff on the handling of personal data in relation to the incident you are reporting? If so, please provide any extracts relevant to this incident here.

 

 

Previous contact with the ICO

Have you reported any previous incidents to the ICO in the last two years?

If the answer to the above question is yes, please provide: brief details, the date on which the matter was reported and, where known, the ICO reference number.

 

Miscellaneous

Have you notified any other (overseas) data protection authorities about this incident? If so, please provide details.

Have you informed the Police about this incident? If so, please provide further details and specify the Force concerned. Include reference numbers if known.

Have you informed any other regulatory bodies about this incident? If so, please provide details.

Has there been any media coverage of the incident? If so, please provide details of this.

Is the client aware of their legal requirements to notify the public, affected parties, or the Commissioner in regard to data breaches?

 

Further guidance from the ICO in regards to reporting data breaches is available here:

https://ico.org.uk/media/for-organisations/documents/1536/breach_reporting.pdf

 

If a breach notification is believed to be required information on how to report the breach is available here:

https://ico.org.uk/for-organisations/report-a-breach/

 

Appendix B: Incident Response Form

This form is to be used to document an incident reported to the Incident Response Team. Please be as detailed as possible and ensure contact details are included.

 

Responder Details

Date / Time   
Responder  
Incident Number  
Incident Type  

 

Person Reporting

Name  
Email  
Contact Number  
Department  

 

Affected Systems

Hostname | IP Address | Location
     
     
     
     

 

Brief Summary of Incident and Steps Taken to Date

 

 

 

 

 

Appendix C: Incident Log

When an incident is opened, begin logging major events and decisions with this form.

 

Responder  
Incident Number  

 

 

DTG | Event | Initials
   
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 


 

Appendix D: System Time Offset Log

Where no central source of time has been configured, use this form to determine time offsets to allow a central consistent time to be determined when comparing log files.

 

Source of Time Offset e.g. Responder’s Laptop
Unique Identifier e.g. Serial Number
Incident Number  

 

 

IP Address | Hostname | Offset (hh:mm:ss)
192.168.1.197 | FileServer1 | +04:07:00
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

 

Appendix E: Incident Responder Checklist

Preparation

Are all staff members aware of the security policy?

Are all staff members aware of whom to contact to report an incident?

Is the responder “jump bag” complete?

Have all responders been involved in a practice response?

Do all members of the response team understand the threshold for response?

Are all systems configured to use a centralized NTP?

Are all systems configured to use centralized logging?

Does the response plan include third-party contact details? e.g. for the ISP

Has a secure location been identified for forensic images captured during response?

 

Identification

Where did the incident occur?

Who reported the incident and has a full incident report form been completed?

Has an incident response log been started?

How many systems are affected?

 

Containment

Can the problem be isolated?

Has volatile memory been captured and disk images taken?

Are forensic images stored in a secure location?

Have network traces been gathered?

Have appropriate logs been gathered?

Has business impact been determined?

Have affected systems been isolated?

 

Eradication

Have the affected systems been rebuilt or cleaned?

Have the affected systems been hardened?

 

Recovery

Have wider hardening measures been identified and deployed?

Has a plan been written to bring reduced services back online?

Are all affected systems back online?

Has additional monitoring been put in place?

Have additional technical prevention methods been put in place?

Has vulnerability analysis and penetration testing work been considered?

 

Lessons Learned

Has all opened documentation been closed? e.g. the Incident Response Log

Has an incident response report been written?

Has an incident response debriefing been held?

Have long-term system improvements been identified?

Have response procedures been assessed looking for improvements?