A beginner's guide to Incident Management

An incident is an event that disrupts, or could disrupt, an organisation's services and operations, or that could cause it to incur losses. Incident management is the process by which a team of engineers identifies, analyses and corrects such incidents and works to prevent recurrences.
Incidents can be classified by their technical or characteristic features:
Technical incidents:
- Compromised computing resources:
- Operating System corruption — Caused by malware or a virus, which makes errors show up in the system.
- User account compromises — When an individual’s account is fully or partially controlled by an unauthorised person.
- Exploitation via Email:
- Unsolicited Commercial Email (UCE) — Commonly known as spam. These are generally deceptive emails sent in bulk by spambots.
- Phishing emails — Fraudulent emails sent by scammers to obtain personal information and use it against the individuals targeted.
- Network and Resource Abuses:
- Network scanning activities and denial-of-service (DoS) attacks on the network.
- Resource misconfiguration:
- Vulnerable software configurations — Caused by holes in computer security; they leave systems open to cyber attacks.
- Open proxy servers and anonymous FTP servers — Because traffic is not encrypted, data transferred through these servers is not safe.
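As a rough illustration, the technical categories listed above could be encoded as a lookup table so that each reported incident is tagged consistently. This is a minimal Python sketch; the enum, the lowercase type labels and the `categorise` helper are hypothetical names chosen for the example, not part of any particular tool.

```python
from enum import Enum


class TechnicalCategory(Enum):
    """The four technical categories from the list above."""
    COMPROMISED_RESOURCES = "Compromised computing resources"
    EMAIL_EXPLOITATION = "Exploitation via email"
    NETWORK_ABUSE = "Network and resource abuses"
    MISCONFIGURATION = "Resource misconfiguration"


# Incident type -> category. The keys are hypothetical labels for this sketch.
INCIDENT_TYPES = {
    "os_corruption": TechnicalCategory.COMPROMISED_RESOURCES,
    "account_compromise": TechnicalCategory.COMPROMISED_RESOURCES,
    "spam": TechnicalCategory.EMAIL_EXPLOITATION,
    "phishing": TechnicalCategory.EMAIL_EXPLOITATION,
    "network_scanning": TechnicalCategory.NETWORK_ABUSE,
    "denial_of_service": TechnicalCategory.NETWORK_ABUSE,
    "vulnerable_configuration": TechnicalCategory.MISCONFIGURATION,
    "open_proxy_or_anonymous_ftp": TechnicalCategory.MISCONFIGURATION,
}


def categorise(incident_type: str) -> TechnicalCategory:
    """Return the technical category for a known incident type."""
    try:
        return INCIDENT_TYPES[incident_type]
    except KeyError:
        raise ValueError(f"unknown incident type: {incident_type!r}") from None
```

Calling `categorise("phishing")` returns `TechnicalCategory.EMAIL_EXPLOITATION`, while an unlisted type raises `ValueError` so that new kinds of incident are noticed rather than silently mislabelled.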
Characteristic-based incidents:
- Major incidents
Large-scale incidents are rare, but when they do occur an incident management system should be in place to tackle the problems immediately so that no irrevocable damage or major losses are incurred. Speed and efficiency in dealing with these issues are vital.
- Repetitive incidents
Some incidents recur despite being resolved repeatedly; they could be a sign of underlying problems in the configuration. Scripts can be created to follow a procedure that resolves simple repetitive incidents.
- Severity
Incidents are classified by severity based on safety concerns and the loss or exposure of personal data. The size of the community affected is also a parameter in assessing severity.
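The severity parameters described above could be combined into a simple score. The sketch below is one possible way to do it in Python; the weights, the 1000-user threshold and the four labels are assumptions made for illustration, and a real team would tune them to its own policy.

```python
from dataclasses import dataclass


@dataclass
class Impact:
    safety_concern: bool         # is anyone's safety at risk?
    personal_data_exposed: bool  # was personal data lost or exposed?
    users_affected: int          # size of the affected community


def severity(impact: Impact, large_community: int = 1000) -> str:
    """Map the three parameters to a coarse severity label.

    The weights and the default threshold are assumptions for this sketch.
    """
    score = 0
    if impact.safety_concern:
        score += 2
    if impact.personal_data_exposed:
        score += 2
    if impact.users_affected >= large_community:
        score += 1

    if score >= 4:
        return "critical"
    if score >= 2:
        return "high"
    if score == 1:
        return "medium"
    return "low"
```

For example, `severity(Impact(False, True, 5000))` scores 3 and returns `"high"`, whereas an incident with no safety or data concern that touches only a handful of users returns `"low"`.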
Functioning:
The system first makes a note of the incident. It then classifies the incident based on urgency, impact and priority, and assigns responsibility for resolving it to the appropriate personnel. Finally, it controls the incident through to resolution and reports on it after the issue has been resolved. The five steps involved in resolving an issue are:
- Incident diagnosis — The first step towards resolving an incident, where the initial understanding and analysis of the problem take place.
- Incident escalation — The incident is escalated for quicker resolution and assigned to the team with the right skills to tackle it.
- Incident investigation — The initial diagnosis and relevant information from related incidents and discoveries are put together to come up with a solution.
- Incident resolution — The incident has been handled and is documented for future use.
- Incident closure — The incident is removed from the main view but added to the organisation’s knowledge base in case related incidents need to be solved in the future.
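The flow described above (log, classify, assign, then work through the five steps) can be sketched as a small state machine. The Python below is an illustrative sketch only: the `Level` scale, the additive priority rule and all class and field names are assumptions for this example rather than a standard or any product's API.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


class Stage(Enum):
    LOGGED = 0
    DIAGNOSIS = 1
    ESCALATION = 2
    INVESTIGATION = 3
    RESOLUTION = 4
    CLOSURE = 5


def compute_priority(urgency: Level, impact: Level) -> int:
    """Return a priority from 1 (highest) to 5 (lowest).

    Assumed rule: the higher the combined urgency and impact,
    the smaller (more pressing) the priority number.
    """
    return 7 - (urgency.value + impact.value)


@dataclass
class Incident:
    summary: str
    urgency: Level
    impact: Level
    assignee: str = ""
    stage: Stage = Stage.LOGGED
    history: list = field(default_factory=list)  # (stage name, note) pairs

    @property
    def priority(self) -> int:
        return compute_priority(self.urgency, self.impact)

    def advance(self, note: str = "") -> Stage:
        """Move to the next step and record it; closed incidents cannot advance."""
        if self.stage is Stage.CLOSURE:
            raise ValueError("incident is already closed")
        self.stage = Stage(self.stage.value + 1)
        self.history.append((self.stage.name, note))
        return self.stage


# Usage: log an incident, assign it, then walk it through the five steps.
incident = Incident("Checkout API returning errors", Level.HIGH, Level.HIGH)
incident.assignee = "payments-on-call"  # hypothetical team name
while incident.stage is not Stage.CLOSURE:
    incident.advance()
print(incident.priority, incident.stage.name)  # prints "1 CLOSURE"
```

Keeping the `history` list on the incident is what lets a closed incident feed the knowledge base later: every step and its note remain attached to the record.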
Benefits:
- The business is not massively affected: losses are contained and the effectiveness of the business increases.
- Monitoring improves, and service level agreements between service providers and clients can be assessed accurately.
- Lost incidents and incorrectly recorded incidents are eliminated.
- There is a general increase in the productivity and efficiency of the organisation.
- End-user and customer satisfaction increases.
- Functional escalation of incidents becomes faster overall.
- It motivates the incident management team and other teams to train one another and builds a culture of trust between them.
- Junior staff grow quickly, as they gain valuable knowledge of specific incident resolutions as well as of the overall system’s functioning.
- Documentation of incidents improves in both quality and quantity, and customisable reports help you document specific information about the process accurately.
- As a company grows, duties become more differentiated, which creates a need for new tools. The incident management process sheds light on this need and helps identify where to begin creating the tools.
- Communication across various platforms is fast, which helps detect and resolve incidents quickly.
- Staff can access and manage incident details from multiple devices and can link all inter-related incidents.
- There is better organisation of procedure and chain of command, as resources are allocated according to the available staff and their skill sets.
Zenduty is a cutting-edge incident management platform designed by developers with the well-being of engineers in mind.