Key Responsibilities: Design and manage Java-based microservices, Bash scripts, Redis, High-Availability design, while strictly adhering to Site Reliability Engineering (SRE) principles. Thrive in high-pressure environments, working swiftly and reliably to maintain system integrity and meet service level objectives (SLOs) and service level indicators (SLIs … Lead initiatives to enhance current systems and implement innovative solutions in collaboration with a fast-paced, mission-driven team, focusing on the implementation of SRE best practices. Conduct thorough root-cause analyses for production incidents and generate high-quality RCA reports, leveraging SRE methodologies to prevent recurrence. Apply software engineering principles to rectify operational challenges and optimize system performance, with a specific focus on implementing SRE-driven solutions. Ensure the availability, latency, performance, efficiency, and security of our infrastructure, adhering rigorously to SRE principles and best practices. Design and maintain robust production monitoring systems to ensure timely detection and …
Site Reliability Engineer - Lead, Mentoring, Kubernetes, PaaS, IaaS, SQL, Azure DevOps, CI/CD A leading provider of financial services is seeking two Site Reliability Engineers - Leads with a solid and proven background in Azure or GCP. This position will also be based onsite in London … Will consider candidates from any of the key vendors across the Cloud - Azure, GCP, and AWS. Kubernetes & troubleshooting, managed services like AKS Using your SRE Attitude (understanding SLI, SLO & SLA) Container Image Management & Security like Aquasec Code Quality & repository Management like SonarQube & NexusQ Service Mesh (Istio) traffic shaping, canary, blue … Unit/Integration/Load Testing Azure Application Gateway & API Management Azure IAM - Identity & Access Management Azure Policy Management & Cloud Security Azure Express Route Site Reliability Engineer - Lead, Mentoring, Kubernetes, PaaS, IaaS, SQL, Azure DevOps, CI/CD McGregor Boyall is an equal opportunity employer and does not …
Nottingham, Nottinghamshire, East Midlands, United Kingdom
Microlise
Lead Engineer SRE When registering to this job board you will be redirected to the online application form. Please ensure that this is completed in full in order that your application can be reviewed. Our Engineering Team is 200 strong, from Apprentice Engineers through to Enterprise Architects, and were … you are looking for a new challenge and have a strong technical background, then we want to hear from you! As our new Lead Site Reliability Engineer, you will be key to maximising the observability of our infrastructure and applications, and to resolving error-prone manual processes through …
Westminster, Colorado, United States Hybrid / WFH Options
Maxar Technologies
a high-quality analytics environment and platform to enable successful data intelligence at Maxar. This position is hybrid with several days a week on-site with your colleagues in Westminster, CO. Life with Us Your Project: The Data Intelligence team owns and maintains a variety of infrastructure and services … for coding standards and software architecture. Responsible for the 'ilities': availability, scalability, maintainability, reliability, and securability of our tools and environments. We embrace SRE principles. Actively identify opportunities for improvement in our infrastructure and propose solutions to realize them. Collaborate with a team of skilled DevOps Engineers, Data Engineers … and Business Intelligence Developers. Minimum requirements for this position: Active U.S. Government security clearance Bachelor's Degree in Software Engineering, Computer Science, or related engineering field. 4 additional years of experience may be substituted for a degree. Minimum of 5 years related work experience for a Senior Software …
Company Description Internal Grade E/EB9 Job Description Work that matters: what you'll be doing. We're looking for a Site Reliability Engineer to join our Experian Data Quality team where you will be working … on cutting edge products within our Aperture suite (Data Studio and Data Governance). This role has aspects of both reliability engineering (SRE) and test engineering (SDET). It is ideally suited to someone looking to take on some aspects of a technical leadership role for the … test frameworks), working in collaboration with our Architects, Development teams, and DevOps specialists to use results of these tests to help prioritize and implement reliability improvements for our customers. You will work closely with our Director of Engineering, Test Automation Lead, and wider QA team to shape our …
investment in digitising our Individual Annuities customer journey onto a Cloud-based platform. We are seeking to recruit a Site Reliability Engineer (SRE) within the Retirement platform where your main responsibilities will be to work with our existing SRE team to ensure strong observability across our services utilizing … tools such as Dynatrace and Splunk. You will work closely with the wider team to embed SRE principles of delivering secure, robust, and reliable infrastructure and features to our customers. Helping our service teams to understand root causes of incidents. Striving to remove manual tasks (toil) through automation and the … where you'll make a difference: Influencing across all disciplines within both the business and engineering side of the business in terms of SRE principles especially in relation to increasing reliability. Whilst skills, knowledge and prior experience are meaningful to us we want people who are highly motivated, and …
JOB TITLE: Site Reliability Engineer - Homes Platform SALARY: £62,874 - £69,860 LOCATION(S): Halifax or Leeds HOURS: [Full-time] WORKING PATTERN: Our work style is hybrid, which involves … spending at least two days per week currently, or 40% of our time, at our Halifax or Leeds Office About this opportunity Our Cloud SRE (Site Reliability Engineering) team is looking for an experienced and passionate Engineer with strong hands-on development experience. As a Cloud SRE … Mortgages at the heart of our strategy to become the best bank for customers. The role will have accountabilities including: Delivering against Azure and SRE Public Cloud technology roadmaps Collaboratively working with other engineering teams to build, release and evolve enterprise-class solutions, that are reliable and evergreen as …
and operating the best in class, most reliable access network for our customers. About the Team: As a Senior Site Reliability Engineer (SRE), you will be part of the SRE team within the CONNECT OpTek team. Our team is responsible for development and support of multiple tools and … applications used by Comcast field technicians to diagnose and troubleshoot issues within the Comcast nation-wide network. The SRE team is responsible for maintaining the existing systems, supporting our development teams, and implementing innovative solutions. You will work alongside software developers, testers, and project managers. What You'll Do: Your … supporting developers to help maintain/define best practices Configuring, watching, tuning and responding to monitoring events Supporting an on-call rotation with the SRE team Maintaining and improving CI/CD pipelines using Concourse and GoCD Supporting corporate initiatives (e.g., security hardening) Having a good time learning and working …
Site Reliability Engineer The successful candidate will be based in the United Kingdom and must have at least good-years residency to be eligible for security vetting. Some level of travel to the client site in central London or Corsham can be expected, in line with the … AWS stack for optimal platform performance. Automation Focus: Patch, update, and automate tasks for maximum efficiency. Incident Lead: Coordinate incident response with L2 and SRE teams. Handover and Reviews: Facilitate daily SRE handovers and post-incident reviews. Reporting and Improvement: Monitor queues, create reports, and implement automations. AWS Knowledge: Expertise …
Saffron Walden, Essex, South East, United Kingdom Hybrid / WFH Options
EMBL-EBI
The IT & Technical Services department's Operations team is seeking a Senior Site Reliability Engineer to support the growing portfolio of services it provides to EMBL-EBI's service and research teams. The Operations team is responsible for maintaining and developing the Institute's Transfer Services, the application and monitoring … this role, it may suit an individual with experience in a hands-on systems management role, a Senior Infrastructure Engineer, or someone from a site reliability engineering background. The role will initially focus on the email systems - understanding and upgrading the infrastructure, but is expected to rapidly … cultural, multi-disciplinary staff, at different levels of their IT career. We are eager to welcome new talent who will join us in ensuring reliability and supporting EMBL-EBI's mission to advance scientific discovery. Your role During the first months, the role will focus on the upgrade of …
Cheltenham, Gloucestershire, South West, United Kingdom Hybrid / WFH Options
Searchability NS&D Ltd
project) Skills required in Java Spring Boot, Kubernetes & Docker, Elastic, Helm, Linux, Git, CI/CD Who are we? We are recruiting a Senior SRE with enhanced DV Clearance for a prestigious client to work on a portfolio of public and private sector projects. Our client is a global leader … and platforms. You'll experience excellent career progression opportunities to develop your skillset and personal profile in an inclusive culture. What will the Senior SRE be doing? Monitor system metric dashboards using Kibana Diagnosing problems Remedy and debug any issues from system deployment environments Track issues and carry out releases … me on LinkedIn, just search for Henry Clay-Davies. I look forward to hearing from you. KEY SKILLS: Site Reliability Engineer/SRE/Senior SRE/Kubernetes/Ansible/Elastic Stack/Elastic/Kibana/Linux/Git/Helm/CI/CD/ …
Senior Site Reliability Engineer Would you like to join our great reliability engineering team? Do you have a passion for cloud infrastructure technologies? About The Business At Cirium, our goal is to keep the world connected. We are the industry leader in aviation analytics; helping our … learn more about Cirium at the link below. https://www.cirium.com About Our Team You will be joining a collaborative, curious team of Site Reliability Engineers at all different levels. By joining us you will have the opportunity to share ownership in solving this problem end to … building features, to design and put in production predictive models and make sure they perform consistently over time. About The Role As a Senior Site Reliability Engineer, your purpose is to ensure that the company's systems and applications are available, reliable, and performant at all times. You …
working with a leading global travel company dedicated to providing exceptional experiences for travelers around the world. The Site Reliability Engineering (SRE) team plays a critical role in ensuring the reliability, performance, and scalability of their systems, enabling them to deliver best-in-class services to … performance of the services and applications. Manage end-to-end availability and performance, implementing automation to prevent and resolve issues. Work closely with other engineering teams to leverage existing frameworks and improve overall reliability. Drive the establishment and achievement of service-level objectives to maintain product reliability. Requirements: 4+ … skills and the ability to mentor team members. Passion for learning and staying updated on industry best practices. Nice to Haves: Experience as a Site Reliability Engineer or with high-availability systems. Background in production infrastructure and troubleshooting distributed systems. Familiarity with mobile development and distributed computing. What …
Lead Site Reliability Engineer Leeds - once a month in the office on average £80,000-£90,000 + benefits A leading global organisation is seeking a Lead Site Reliability Engineer to play a pivotal role in the development, implementation … and ongoing maintenance of its core Infrastructure and Cloud-based platforms. This role encompasses diverse responsibilities, including leading and managing a small DevOps/SRE team. The Lead Site Reliability Engineer will lead the charge in selecting, configuring, and supporting Cloud Platform components and tooling. Proficiency in observability … mentoring team members. Key Skills Commercial experience with GCP Leadership/management experience Terraform Observability tech such as Grafana/Prometheus Background in software engineering is an advantage If you are interested in the role please apply! We are an equal opportunities employer and welcome applications from all suitably …
Leeds, England, United Kingdom Hybrid / WFH Options
Candour Solutions
of engineers around the world working on truly groundbreaking projects. So what will you be required to do? Provide leadership and guidance across the SRE team; motivating and driving the team with technical leadership acting as a subject matter expert and leading best practice techniques. Lead the SRE team in … ensuring technical assurance in significant projects, for the delivery of quality technical deliverables, which may involve several teams or technologies. Oversee the SRE team to ensure they are involved in every step of the application software development lifecycle, including product design, development, testing, and transition into operation. Provide coaching and … mentoring to the SRE team to improve their skillset, increase knowledge and set the benchmark of quality and precision engineering Oversee the implementation of service transition and change and release process changes, ensuring that processes are reviewed and improved with onus on optimisation Evaluate risks and defects, analysing specifications …
multi-disciplined engineers working alongside various teams, inside and out, to deliver key projects and strategy. The role will report into the Head of Site Reliability Engineering and Infrastructure. For the right candidate, this is a fantastic opportunity to join a fast-paced company and advanced technology … Ansible, Puppet, Jenkins. Experience in migrating services from on-prem to GCP Cloud and from other Cloud providers to GCP. A good understanding of SRE/DevOps principles Programming and scripting experience – Bash, Python, Perl etc Personal Skills & attributes Proactive, diligent, and enthusiastic. Ability to learn and adapt to new … systems and technology. Willingness and patience to mentor, advise, and help staff across the business in technical skills and SRE principles. A natural tendency for problem solving and the ability to think outside the box to find solutions to challenging problems. A genuine enjoyment of working with technology and have …
in upskilling, learning new tech Deeply curious, creative, and innovative Flexible in working hours/ability to collaborate in different time zones The Lead Site Reliability Engineer has a pivotal role at the forefront of our engineering operations, responsible for guiding the Platform Team toward achieving exceptional … standards of reliability, performance, and stability across all our applications. The successful candidate will possess deep expertise in these core areas and will be instrumental in defining and implementing industry-leading practices. As a key leader, this role will not only shape the … strategic direction of our platform operations but also establish the benchmarks and processes by which our engineering excellence is measured. Responsibilities Lead the SRE Team, setting clear goals and priorities in line with business objectives. In collaboration with the department Director develop and execute strategies that enhance technological capabilities …
Yeovil, England, United Kingdom Hybrid / WFH Options
Education Horizons
our customers, our team and the wider business. Drive improvements in automation and process control & monitoring to improve quality and efficiency. Ensuring that the SRE team is meeting the required SLOs (Service Level Objectives) & SLAs (Service Level Agreements) for their products & services. Ensuring maintenance is correctly performed and managed. Applying … positive working team. Work with the other teams in either a consultative or embedded strategy within the TechOps Group to ensure alignment with SRE Best Practices. Promote a culture of continuous improvement. Working within the Education Horizons Information Security Management System Live and lead the Values of Education Horizons. … Experience and Qualifications Required Experience with the concept of SRE Experience maintaining web-based applications and their backend services Experience with one or more scripting languages such as Python or Bash etc. Strong experience with AWS Cloud Experience with AWS Serverless Strong experience with Linux &/or Windows servers Desirable …
My client, a renowned hedge fund with a global presence, is in search of a seasoned Site Reliability Engineer to join their London team. As part of this team, you'll play a pivotal role in maintaining the technology infrastructure that drives the fund's operations, directly contributing … to its success. This involves handling large volumes of data for research purposes, and enhancing the reliability and speed of the evolving applications through automation and efficiency measures. The ideal candidate will possess expertise in infrastructure automation, along with in-depth knowledge of Python/Golang/Powershell and … teams, including PMs and traders. My client is committed to offering a competitive salary increase and top-tier benefits, including access to an on-site gym and complimentary breakfast and lunch. To apply, click the link below or send your resume directly to harvey.gilbert@mondrian-alpha.com.