Site Reliability Engineer (City of London)
This is an opportunity for a technical, problem-solving SRE to join a leading global fintech that is currently growing its presence in Europe.
Role Responsibilities:
- Investigate, troubleshoot and diagnose incidents
- Provide first- to third-line investigation and diagnosis of incidents and Service Requests.
- Be the incident coordinator for operational incidents on the core engineering production platform. This includes all internal technical communications, ensuring processes are followed, and all post-incident follow-up and analysis.
- Escalate incidents or service requests that require system, config or code changes to the appropriate on-call engineer
- Manage engineering service requests, prioritizing requests according to urgency/impact and ensuring requests are serviced in a timely manner
- Work with engineers to establish or update runbooks and procedures needed for handling incidents and Service Requests.
- Develop and maintain a knowledge base and respond to customers’ technical questions.
- Actively monitor integration endpoints and external programmatic dependencies (e.g. venue APIs).
- Maintain scripts, dashboards and other programmatic tools acquired or built
Qualifications & Required Skillset:
- Ability to diagnose and troubleshoot technical issues both offline and in real time
- Ability to handle multiple priorities and deal with ambiguity
- Experience with incident and problem management processes
- Experience working in an Application Support, DevOps or SRE role (preferably within Trading & Risk Management systems)
- Experience communicating with customers as well as with senior software engineers
- Experience with Python, PostgreSQL and Unix
- Experience with writing intermediate to advanced SQL queries for data extraction and troubleshooting purposes.
- Experience with using and troubleshooting programming interfaces, especially REST APIs and WebSockets.
- Experience with monitoring tools (Grafana, Datadog)
- Experience working with Crypto and blockchain (DLT)
- Familiarity with common engineering development workflows and tools (e.g. JIRA, Confluence, GitHub, Scrum, etc.)
- Familiarity with scaling, monitoring, and general production challenges of real-time (banking) systems.
- Familiarity with financial services infrastructure & processes (e.g. ITIL) and related systems in an SRE or DevOps capacity
- Familiarity with AWS Cloud Infrastructure & Processes
- Familiarity with release management processes and the SDLC using agile methodologies and best practices.
- Motivated by working with people and solving their problems
- Understanding of basic programming constructs (loops, conditionals, data types, regular expressions) with the ability to write and read non-trivial production and operational scripts.
- Company: Global Fintech
- Location: City of London, Greater London, UK
- Employment Type: Part-time
- Posted