define, implement, and improve business performance SLOs. 2+ years of experience with production operations, including 24x7 on-call support, escalation/paging with OpsGenie, incident management, RCA (Root Cause Analysis), and retrospective analysis. 2+ years in hands-on technical roles (such as site reliability engineer, software…
Bash/PowerShell). System Knowledge: Hands-on experience with Linux and Windows. Preferred Skills: Familiarity with Refinitiv TREP and DevOps tools (GitHub, Slack, OpsGenie).
which technology or pattern to create or leverage). Experience being "on-call" for a service, and familiarity with incident notification tooling (e.g., PagerDuty, Opsgenie). Comprehensive understanding of SRE principles (e.g., working knowledge of the Google SRE book). Demonstrated strength in leading a project in an agile…
across departments. Comfortable collaborating with global teams, including across US and EMEA time zones. Preferred (Bonus) Skills: Hands-on experience with tools like PagerDuty, OpsGenie, ServiceNow, CloudWatch, Chronosphere, or similar. Understanding of SLA/SLO implementation and performance tracking. Exposure to incident management frameworks, automated remediation, and runbook automation
frontend, backend, and APIs. There's also a strong DevOps and observability culture, so you'll get stuck into tooling like Dynatrace, Splunk, and OpsGenie, and help improve reliability and performance from the ground up. This is a role for someone who wants to own the quality space and…
are willing to present and defend your ideas to technical and non-technical audiences. Additional Desired Skills: Experience with incident management platforms like PagerDuty, OpsGenie, or similar tools. Understanding of SLO/SLA management and implementations. Knowledge of industry-standard incident management frameworks and best practices. Familiarity with automated…
effort and the escalation and prioritisation of those items. Monitor hardware, applications and environmental conditions of our Order Management systems using tools such as OpsGenie & CheckMK (Nagios). Manage production releases of our Order Management systems. Participate in Disaster Recovery planning, updating run books and DR tests. Ensure that…