Implementation Engineer - Incident and Event Management
I am looking for an Implementation Engineer with large-scale Incident and Event Management experience. The role supports my client, who has started on a journey to improve Incident and Event Management across their technology function and so improve the resilience of their apps, APIs and services. As system dependencies become more and more complex, they are looking to make their Event Management fit for the future.
This role focuses specifically on the step between the monitoring systems and the incident management system: reducing the time to raise an incident; improving alert correlation to reduce unnecessary noise; and increasing the richness of data within incident tickets.
Whilst specific responsibilities will depend on the changing needs of their business, the following provides an overview of the key responsibilities:
* Lead the technical discovery of potential event management solutions, working with the Product Manager to evaluate the options and providing technical expertise to shape the solution.
* Run a Proof of Value on the potential options.
* Work closely with the Monitoring and Incident Management teams to ensure a joined-up technical approach.
* Work closely with teams across Technology to ensure a scalable solution that works for everyone, and to support them through the implementation phase.
* Drive the technical direction of the product, using your expertise to make decisions, identify areas of technical risk or concern and suggest actions to mitigate them.
What they're looking for in a candidate
* A passionate technologist with considerable knowledge of IT automation platforms.
* Good practical understanding of Agile software delivery methodology.
* Proactively researches relevant technologies and applications.
* Challenges current working practices by critically reviewing processes and by developing and implementing innovative improvements, using appropriate metrics and data to optimize for effectiveness, quality and efficiency.
* Ability to own and manage end-to-end development and deployment plans to ensure the team deliver on time and to budget.
Key Skills and Experience
* Strong experience building, owning and maintaining incident management, event management or monitoring systems.
* A technical understanding of system implementation, and an ability to interpret, discuss and contribute to architecture approaches and solution designs.
* Experience working in an Agile development team (Scrum or Kanban).
* Strong communication skills, with the ability to communicate with colleagues at all levels.
* Ability to challenge the status quo and drive wide-scale change.
* Able to build solid working relationships with peers and work across teams to achieve goals.
Please respond with your latest CV to apply.