any of the key vendors across the cloud: Azure, GCP, and AWS. Kubernetes & troubleshooting, managed services like AKS. Using your SRE attitude (understanding SLI, SLO & SLA). Container image management & security like Aquasec. Code quality & repository management like SonarQube & NexusQ. Service mesh (Istio) traffic shaping, canary, blue …
London, South East England, United Kingdom Hybrid / WFH Options
ByteHire
a Senior Site Reliability Engineer with deep Google Cloud (GCP) experience, to join our customer’s organisation. Responsibilities: Influencing Service Level Objectives, Non-Functional Requirements, and infrastructure requirements. Ensuring that the Service Level … the SLOs Reliability concepts to ensure high performance and high service availability, able to define, implement and improve business performance SLOs. Production operations including 24x7 on-call support, escalation/paging with OpsGenie, incident management, RCA (Root Cause Analysis). Maintain existing compliance and governance standards …
and operating characteristics of our products and services. Work closely with colleagues and teams across the business to meet service level objectives, contribute to innovations, and manage risks effectively. What They're Looking For: Proven experience in software engineering with a …
Support Analyst (UK) About Adaptiva: Adaptiva, the Autonomous Endpoint Management company, delivers the fastest way to patch and manage endpoints at scale. The company offers OneSite, the first fully adaptive autonomous endpoint management (AEM) platform. At Adaptiva, we pride ourselves …
a Senior Site Reliability Engineer with deep Google Cloud (GCP) experience, to join our customer’s organisation. Responsibilities: Influencing Service Level Objectives, Non-Functional Requirements, and infrastructure requirements. Ensuring that the Service Level … the SLOs Reliability concepts to ensure high performance and high service availability, able to define, implement and improve business performance SLOs. Production operations including 24x7 on-call support, escalation/paging with OpsGenie, incident management, RCA (Root Cause Analysis). Maintain existing compliance and governance standards …
Yeovil, South West England, United Kingdom Hybrid / WFH Options
Education Horizons
and process control & monitoring to improve quality and efficiency. Ensuring that the SRE team is meeting the required SLOs (Service Level Objectives) & SLAs (Service Level Agreements) for their products & services. Ensuring maintenance is …
Manchester, North West England, United Kingdom Hybrid / WFH Options
bet365
working from home policy. Preferred Skills, Qualifications and Experience: Excellent knowledge of SRE principles, including the creation and management of effective SLIs and SLOs for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques and best practice including Splunk, New Relic, Grafana and PagerDuty. Excellent knowledge …
traffic global platform. What you'll do in the role: Handle production incidents. Create and maintain the system disaster recovery process. Monitoring, alerting and SLO tracking. Develop tools to maximise efficiency like automating the deployment infrastructure. Be an advocate of the GitOps methodology. Please note, we are only considering UK …
engines to run them at scale, optimize performance, and ensure efficient maintenance. SLO/SLA Concepts: Implement and manage Service Level Objectives and Agreements to guarantee our platform's reliability and performance. Infrastructure Management: Use Terraform to manage infrastructure and deployments …
Work closely with other engineering teams to leverage existing frameworks and improve overall reliability. Drive the establishment and achievement of service-level objectives to maintain product reliability. Requirements: 4+ years of experience as a software or systems engineer. Proficiency in observability …
and operating characteristics of our products and services. Work closely with colleagues and teams across the business to meet service level objectives, contribute to innovations, and manage risks effectively. What They're Looking For: Proven experience in software engineering with a …
any of the key vendors across the cloud: Azure, GCP, and AWS. Kubernetes & troubleshooting, managed services like AKS. Using your SRE attitude (understanding SLI, SLO & SLA). Container image management & security like Aquasec. Code quality & repository management like SonarQube & NexusQ. Service mesh (Istio) traffic shaping, canary, blue …
IT Engineer – Birmingham. Location – B3 2TA. Salary – £34,000-£39,000. Do you have good troubleshooting skills? Are you an analytical problem solver with strong customer service and care skills? Are you looking for a transformative …
mentioned), PagerDuty/OpsGenie or similar, and Jenkins. NON-TECHNICAL REQUIREMENTS: Awareness of Site Reliability Engineering (SRE) principles, including Service Level Objectives (SLOs), Service Level Indicators (SLIs), and error budgets. Understanding of development …
develop our data reliability, including building out an active metadata platform, developing data quality monitoring and designing/implementing service level objectives (SLOs) and service level indicators (SLIs). Stage two will ensure …
Simply Commerce - Digital Commerce Recruitment Experts
/TypeScript. 4+ years' experience in an SRE/DevOps role. Strong experience with AWS. Experience setting and managing Service Level Objectives (SLOs) and Service Level Agreements (SLAs). Salary up to …
for a Senior Site Reliability Engineer with deep Google Cloud (GCP) experience, to join our customer's organisation. Responsibilities: Influencing Service Level Objectives, Non-Functional Requirements, and infrastructure requirements. Ensuring that the Service Level …
Leeds, West Yorkshire, Yorkshire, United Kingdom Hybrid / WFH Options
Evri
security, and cost optimisation. Supporting your team in dealing with operational issues such as availability, performance, and scalability. Influencing Service Level Objectives, Non-Functional Requirements, and infrastructure requirements. Highlighting deviations from technology standards to the TDA (Technical Design Authority). Ensuring that … the Service Level Objectives in your area are met. Helping to develop and promote the SRE service catalogue. Ensuring the best security practices are followed. Supporting and developing junior members of the team. Capturing the SLIs and …
Employment Type: Permanent, Part Time, Work From Home
solutions with networking architectures from scratch. Skilled in managing and optimising databases at scale (MongoDB). Builds robust observability platforms, mitigates alert fatigue, and understands SLO/SLA concepts. Proficient with Terraform, with the ability to automate infrastructure provisioning and deployment processes. Proven experience responding to and resolving incidents, focusing on prevention. …
providing first level support, incorporating the Service Portal and subject matter experts, to agreed targets and SLO, ensuring compliance by following relevant GLOBE Standards and Policies, incl. Nestlé IT Security Policy and Cyber Security Awareness. Supporting the Security & Compliance Specialists with ownership …