Incident Management Process
The process defined in this document makes use of the workflow and activities defined in the ITS Global Incident Management Process and ITS Global Critical Incident Management Process. Readers are expected to read and understand the ITS Global Incident Management Process, ITS Global Critical Incident Management Process, ITS UK Incident Management Process and ITS UK Critical Incident Management Process before applying the process in the GSOC.
This document describes how IT incidents that fall within the scope of the KPMG Global Security Operations Centre (GSOC) are managed to ensure minimal disruption to GSOC services, to the satisfaction of the customer, and in compliance with the GSOC terms of reference and any agreed service levels.
The document also identifies points of interaction with ITS Global, ITS UK and Third Parties, and provides direction on how the GSOC will interact with them to ensure efficient resolution of incidents.
This document covers IT incidents that disrupt, or could disrupt, one or more GSOC services, including incidents that are communicated directly by users, technical staff or vendors, and those identified through monitoring tools.
The responsibility of ownership and ongoing management of this document, including the processes contained therein, rests primarily with the GSOC Director.
The intended audience for this document is the GSOC Team and third party service providers.
All requests for exceptions to the processes contained within this document should be directed to the GSOC Director, who will consider them depending on the nature and scope of the request.
Any violations of this policy should be reported directly to the following email address:
IT Incident: an unplanned interruption to an IT Service or a reduction in the Quality of an IT Service. Failure of a Configuration Item that has not yet impacted Service is also an Incident.
Critical IT Incident: an IT incident with an urgent business priority which requires a response that is above and beyond that given to normal incidents.
Urgency: a measure of how long it will be until an Incident, Problem or Change has a significant impact on the business.
Priority: a category used to identify the relative importance of an Incident, Problem or Change. Priority is based on impact and urgency and is used to identify required times for actions to be taken.
The following roles have responsibilities for their respective components of the incident management process. This is not intended to be an exhaustive list of the responsibilities of each role:
1.7.1 GSOC Director
The GSOC Director has the overall accountability of all services provided by the GSOC and all IT incidents that occur within the GSOC.
The Tooling Engineer is responsible for investigating and resolving any IT incidents that occur within the GSOC. The Tooling Engineer will also coordinate the resolution of incidents that require the cooperation of other stakeholders such as ITS Global, ITS UK and third party support teams.
Incident Manager is an assigned role, primarily performed by the GSOC Operations Manager, who has the authority to delegate the role to another member of the GSOC. The Incident Manager has overall responsibility for:
The GSOC Operations Manager (GOM) serves as the Incident Manager for both critical and non-critical incidents, or assigns the Incident Manager role to another member of the GSOC.
Third party support teams have the following responsibilities:
All analysts may be tasked with receiving initial incident requests, i.e. providing first-line support. They may also be called upon to assist the Tooling Engineer during the investigation and resolution of incidents.
- GSOC Change Management Process
- GSOC Service Management Process
- GSOC Problem Management Process
This process makes use of the workflows and activities defined in the ITS Global Incident Management Process and ITS Global Critical Incident Management Process. The activities defined are executed within the context of and by the roles defined within the GSOC. This section describes the deltas necessary to make this work within the GSOC.
The following are the goals of the GSOC IT Incident Management Process:
- Identification of the underlying causes of an incident and the best resolution and prevention
- Restoration of the service as quickly as possible following an incident while ensuring that all details are recorded
- Reduction of the impact of incidents on the GSOC
- Reduction of the number of problems resulting from repeated occurrence of incidents.
The process workflow for the GSOC Incident Management Process follows the ITS Global Incident Management Process workflow with the following modifications, as indicated in red in the diagram below.
The Priority assigned to a record for the resolution of an Incident depends upon:
- The Impact on the business: size, scope and complexity of the Incident
- The Urgency to the business: time within which resolution is required
- The resource availability
- The expected effort in resolving or completing a task.
The assignment of priority values follows the Global Service Desk Prioritization and SLA Definition.
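As a minimal sketch of how impact and urgency can combine into a priority, the Python below uses a lookup table. The three-step `Level` scale, the matrix values and the function name `assign_priority` are all illustrative assumptions and are not taken from the Global Service Desk Prioritization and SLA Definition, which remains the authoritative source for actual values.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical three-step scale used for both impact and urgency."""
    HIGH = 1
    MEDIUM = 2
    LOW = 3


# Hypothetical priority matrix: (impact, urgency) -> priority, 1 = highest.
# Placeholder values only; not the Global Service Desk definitions.
PRIORITY_MATRIX = {
    (Level.HIGH, Level.HIGH): 1,
    (Level.HIGH, Level.MEDIUM): 2,
    (Level.HIGH, Level.LOW): 3,
    (Level.MEDIUM, Level.HIGH): 2,
    (Level.MEDIUM, Level.MEDIUM): 3,
    (Level.MEDIUM, Level.LOW): 4,
    (Level.LOW, Level.HIGH): 3,
    (Level.LOW, Level.MEDIUM): 4,
    (Level.LOW, Level.LOW): 5,
}


def assign_priority(impact: Level, urgency: Level) -> int:
    """Look up the priority for a given impact and urgency."""
    return PRIORITY_MATRIX[(impact, urgency)]
```

The sketch deliberately covers only impact and urgency. Resource availability and expected effort, which also influence the assigned priority, are left to the judgement of the person recording the incident.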
The ITS Global Incident Management Process defines a number of roles within this process. These roles map to the GSOC roles defined in Section 2.1 as follows.
|Role||Process Owner||Incident Manager||Service Desk Analyst||Support Engineer||Incident Requester|
|GSOC Operations Manager||Yes|
Figure 1: IT Incident Management workflow – showing deltas and interaction points
In addition to ITS Global support, incidents may also come from ITS UK and from members of the GSOC. These sources will use the channels, and will be processed, as indicated in the interactions defined in Section 3.4.
Event monitoring for identifying IT incidents within the GSOC will initially, in version one of the GSOC, be performed manually by the Tooling Engineer until the GSOC is mature enough to move away from a manual process.
Within the GSOC, any IT incident that has been identified as needing a change to one or more of the services will result in the initiation of a service request. This will result in the invocation of the GSOC Service Request Management Process, defined as part of the GSOC Service Management Process.
IT incidents that consistently reappear are a sign of an underlying problem. For this reason, such incidents will be treated as a problem and will trigger the GSOC Problem Management Process.
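One way to spot recurring incidents is to count how often the same kind of incident has been recorded. The sketch below is illustrative only: the idea of an incident "signature" string, the `RECURRENCE_THRESHOLD` value and the function name `problem_candidates` are assumptions, since this process does not define how many recurrences constitute a problem.

```python
from collections import Counter
from typing import Iterable, List

# Hypothetical threshold; the process does not specify a number.
RECURRENCE_THRESHOLD = 3


def problem_candidates(
    signatures: Iterable[str], threshold: int = RECURRENCE_THRESHOLD
) -> List[str]:
    """Return the incident signatures seen at least `threshold` times.

    A signature is any label that identifies "the same" incident,
    for example an affected service combined with a symptom.
    """
    counts = Counter(signatures)
    return sorted(sig for sig, n in counts.items() if n >= threshold)
```

Signatures returned by such a check would be candidates for the GSOC Problem Management Process; the decision to raise a problem still rests with the roles defined in this document.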
All changes identified during diagnosis that need to be implemented will go through the GSOC Change Management Process; within the GSOC this is a mandatory step rather than an optional one. How the changes required to fix the incident are verified and implemented will be decided through the GSOC Change Management Process.
IT incidents within the GSOC will follow a simplified escalation process as described in this section.
The GSOC will utilise one of the following mechanisms for escalating IT incidents.
2.3.1.1 Direct Communication-based Escalations
Depending on the nature of the IT incident, escalation may be performed through direct communication by email. The information contained within the email should be captured and used to create an incident.
2.3.1.2 ITS Global Service Desk Tool
The ITS Global Service Desk Tool (hereafter referred to as the Service Desk Tool), provided by ITS Global, must be used to escalate incidents stored within it.
2.3.1.3 Other IT Incident Management Systems
To escalate an incident to another entity, the Tooling Engineer will submit a new ticket in the incident management system provided by that entity. This may require manual input of data, or a tool-supported export of the data from the system used within the GSOC to the system used by the receiver.
2.3.2 Criteria for escalations and escalation paths
The Tooling Engineer will handle all IT incidents. The following matrix defines the types of IT incidents that may result in the Tooling Engineer triggering an escalation request, together with the applicable escalation options.
|Condition||Description||Escalation to||Escalation Mechanism|
|Workload overwhelming||The Tooling Engineer is unable to keep up with the resolution of incidents due to increased workload||GOM|
|GSOC process not working||One or more of the processes within the GSOC necessary to resolve an incident is unavailable or not working||GOM|
|Limited capability||An IT incident cannot be resolved because of a lack of expertise in a technology||GOM|
|Communication breakdown||Communication with an entity outside the GSOC cannot be established or has broken down||GOM|
|Resolution depends on ITS Global/UK incident||The resolution of an IT incident depends on ITS Global or ITS UK resolving another incident and ITS Global or ITS UK is unable to resolve the incident within their defined SLAs||GOM|
|Solution does not exist||The solution to an incident does not exist due to technological limitations||GOM|
|Assigned Tooling Engineer unable to resolve issue||The Tooling Engineer initially assigned an IT incident is unable to resolve the issue due to limited expertise||Tooling Engineer||Service Desk Tool|
|L1/L2 Analyst||Unable to resolve an IT incident||Tooling Engineer||Service Desk Tool|
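The matrix above can be expressed as a simple lookup, sketched here in Python. The dictionary keys, the `Escalation` type and the function name `escalation_for` are hypothetical identifiers chosen for illustration; the targets and mechanisms are copied from the matrix, with `None` used where the matrix does not name a mechanism.

```python
from typing import NamedTuple, Optional


class Escalation(NamedTuple):
    escalate_to: str
    mechanism: Optional[str]  # None where the matrix names no mechanism


GOM = "GSOC Operations Manager"
TOOLING_ENGINEER = "Tooling Engineer"
SERVICE_DESK_TOOL = "Service Desk Tool"

# Keys are hypothetical short names for the conditions in the matrix.
ESCALATION_MATRIX = {
    "workload_overwhelming": Escalation(GOM, None),
    "gsoc_process_not_working": Escalation(GOM, None),
    "limited_capability": Escalation(GOM, None),
    "communication_breakdown": Escalation(GOM, None),
    "depends_on_its_incident": Escalation(GOM, None),
    "solution_does_not_exist": Escalation(GOM, None),
    "assigned_engineer_unable_to_resolve": Escalation(
        TOOLING_ENGINEER, SERVICE_DESK_TOOL
    ),
    "analyst_unable_to_resolve": Escalation(
        TOOLING_ENGINEER, SERVICE_DESK_TOOL
    ),
}


def escalation_for(condition: str) -> Escalation:
    """Return the escalation path for a condition, or raise if unknown."""
    try:
        return ESCALATION_MATRIX[condition]
    except KeyError:
        raise ValueError(
            f"No escalation path defined for condition: {condition!r}"
        ) from None
```

Raising an error for an unknown condition, rather than defaulting to a target, keeps the sketch from implying an escalation path that the matrix does not define.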
2.4 Interactions with External Parties
The GSOC will interact with ITS Global, ITS UK, third party support teams and member firms to resolve incidents. The matrix below shows a combination of incident types and the responsibilities in relation to how interactions will be managed. The activities are achieved through the workflows discussed in the following sections.
Figure 2: Interactions matrix showing incident types and responsibilities
IT incidents within the GSOC may originate from various sources. How the GSOC processes an incident will depend on the type of source and the channel it uses to report the incident. The figure below shows the workflow for the various sources and the channels available to them.
Figure 3: Interactions with incident sources
Incidents will be identified by personnel within the GSOC or entities outside the GSOC. Incidents identified internally may be reported through emails or directly into the Service Desk Tool by the person who has identified the incident.
For incidents reported through email, information will be extracted by the GSOC Service Desk to create a ticket in the Service Desk Tool. This will result in the invocation of activities within the GSOC IT Incident Management Process.
Incidents that are identified by entities outside the GSOC can be reported through one of two channels: directly through the Service Desk Tool account provided by ITS Global, or through one of the channels defined within the GSOC Communication Process.
Incidents that are identified by ITS Global, and which need to be resolved by the GSOC, will be escalated through the Service Desk Tool as specified in the GSOC Communication Process.
Depending on whether the source of an incident is external or internal to the GSOC, the channels available for reporting the incident will differ.
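The channel rules described above can be summarised in a short Python sketch. The `Source` categories, channel labels and function names are hypothetical and exist only to illustrate the routing; the rules themselves are taken from the paragraphs above.

```python
from enum import Enum
from typing import List


class Source(Enum):
    GSOC_INTERNAL = "GSOC personnel"
    ITS_GLOBAL = "ITS Global"
    OTHER_EXTERNAL = "other external entity"


def reporting_channels(source: Source) -> List[str]:
    """Return the channels a given source may use to report an incident."""
    if source is Source.GSOC_INTERNAL:
        # Internal reporters use email or enter the incident directly.
        return ["email", "Service Desk Tool"]
    if source is Source.ITS_GLOBAL:
        # ITS Global escalates through the Service Desk Tool.
        return ["Service Desk Tool"]
    # Other external entities use the ITS Global-provided account or a
    # channel defined in the GSOC Communication Process.
    return [
        "Service Desk Tool account provided by ITS Global",
        "GSOC Communication Process channel",
    ]


def needs_manual_ticket(channel: str) -> bool:
    """Email reports must be turned into a ticket by the GSOC Service Desk."""
    return channel == "email"
```

For example, `reporting_channels(Source.GSOC_INTERNAL)` lists email as an option, and `needs_manual_ticket("email")` indicates that the GSOC Service Desk must extract the details into a Service Desk Tool ticket.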
The GSOC may initiate interactions with parties external to the GSOC. The decision to seek support from other entities, as well as which entity to involve, will be determined by the GSOC Service Support Model and the interactions matrix defined in Section 3.4.1.
The workflow for incidents that have been determined to require support from parties external to the GSOC is shown in the diagram below.
Figure 4: Interactions with external support mechanisms
During the IT incident investigation phase, an incident may be identified as requiring support from an external party. In such cases, the GSOC will use the Service Desk Tool to record the incident. ITS Global will receive the record and determine whether the incident needs to be routed to other entities. This decision will be based upon the support structures defined in the GSOC Service Support Model.
The GSOC will engage directly with the third parties if necessary to provide details as required. However, any updates will still be maintained within the Service Desk Tool. Communication with the third parties will be performed in line with the GSOC Communication Process and upon closing the incident, the Service Desk Tool will be updated as appropriate.
The GSOC will utilise the workflow and follow the activities defined in the ITS Global Critical Incident Management Process for managing critical incidents within the scope of the GSOC. The following sections define the deltas necessary to achieve this.
The roles defined within the ITS Global Critical Incident Management Process map to the GSOC roles as follows:
|Risk level||Process Owner||Critical Incident Manager||Technical Response Team (TRT)||Support Engineer||Support Group Lead|
|GSOC Operations Manager||Yes||Yes|
|3rd Party Service Provider||Yes|
The ITS Global Critical Incident Management Process specifies that ITS Global will approve the business communication plans. However, when this process is applied within the GSOC context, all communications will be performed in accordance with the GSOC Communication Process and will be approved by the GSOC Director or another member of the GSOC delegated with the authority to do so.
The authority to declare a critical incident resides with the GSOC Operations Manager. The Tooling Engineer may recommend any incident believed to be critical and which meets the criteria defined in the next section; however, the final decision resides with the GSOC Operations Manager.
Figure 5: Critical incidents workflow
Any urgent priority incident is a candidate for becoming a critical incident. The additional criteria outlined below should be considered prior to declaring a critical incident:
- The importance of the business function affected by the incident
- The number and type of people affected
- The number of Member Firms affected
- The number and type of GSOC services affected
- The elapsed time of an incident which would have significant impact to GSOC employees or that would result in the inability to sustain business for any extended period of time
- An incident that could lead to a security incident if not resolved within a given time frame, where that time is nearing expiry or has expired.
Resolution of critical incidents requires additional resources, which results in increased costs when critical incident procedures are followed. Therefore, use of good judgement is expected when declaring a critical incident. The cost of support should be weighed against the urgency and impact to the GSOC as a result of the incident.
Once an incident has been closed, a review will be performed by the Tooling Engineer and the GSOC Operations Manager to identify opportunities for preventing the incident in future and document any lessons learnt. It will be up to the GSOC Operations Manager and the Tooling Engineer to schedule this depending on the criticality of the incident.
 ITIL V3 Glossary of Terms, Definitions and Acronyms
 GSOC Service Support Model