Site Reliability Engineer (City of London)

This is an opportunity for a technically minded, problem-solving SRE to join a leading global fintech that is currently growing its presence in Europe.

Role Responsibilities:

  • Investigate, troubleshoot and diagnose incidents
  • Provide first- to third-line investigation and diagnosis of incidents and Service Requests.
  • Act as the incident coordinator for operational incidents on the core engineering production platform. This includes all internal technical communications, ensuring processes are followed, and all post-incident follow-up and analysis.
  • Escalate incidents or service requests that require system, config or code changes to the appropriate on-call engineer
  • Manage engineering service requests, prioritizing requests according to urgency/impact and ensuring requests are serviced in a timely manner
  • Work with engineers to establish or update runbooks and procedures needed for handling incidents and Service Requests.
  • Develop and maintain the knowledge base and respond to customers’ technical questions.
  • Actively monitor integration endpoints and external programmatic dependencies (i.e. venue APIs).
  • Maintain scripts, dashboards and other programmatic tools acquired or built

Qualifications & Required Skillset:

  • Ability to diagnose and troubleshoot technical issues both offline and in real-time
  • Ability to handle multiple priorities and deal with ambiguity
  • Experience with incident and problem management processes
  • Experience working in an Application Support, DevOps or SRE role (preferably within Trading & Risk Management systems)
  • Experience communicating with customers as well as with senior software engineers
  • Experience with Python, PostgreSQL and Unix
  • Experience with writing intermediate to advanced SQL queries for data extraction and troubleshooting purposes.
  • Experience with using and troubleshooting programming interfaces, especially REST APIs and WebSockets.
  • Experience with monitoring tools (Grafana, Datadog)
  • Experience working with Crypto and blockchain (DLT)
  • Familiarity with common engineering development workflows and tools (e.g. Jira, Confluence, GitHub, Scrum, etc.)
  • Familiarity with scaling, monitoring, and general production challenges of real-time (banking) systems.
  • Familiarity with financial services infrastructure & processes (e.g. ITIL) and related systems in an SRE or DevOps capacity
  • Familiarity with AWS Cloud Infrastructure & Processes
  • Familiarity with release management processes and the SDLC using agile methodologies and best practices.
  • Motivated by working with people and solving their problems
  • Understanding of basic programming constructs (loops, conditionals, data types, regular expressions) with the ability to write and read non-trivial production and operational scripts.
Company
Global Fintech
Location
City of London, Greater London, UK
Employment Type
Part-time
Posted