Washington, Washington DC, United States Hybrid / WFH Options
OMW Consulting
Job Title: Site Reliability Engineer (SRE) Location: Washington, DC - Hybrid Clearance: TS/SCI Salary: $160k-$200k Join a dynamic team dedicated to delivering best-in-class service quality and issue resolution for mission-critical deployments. In this role, you will be instrumental in shaping operational policies … meet SLAs. Collaborate with developers to maintain secure and efficient workflows. What We're Looking For: Minimum of 4 years of experience as an SRE engineer, with a strong focus on automation and deployment. Active security clearance with experience in DoD IT environments. Proficiency in VMware, Kubernetes, Docker, Helm …
London, South East England, United Kingdom Hybrid / WFH Options
RP International
Site Reliability Engineer | Inside IR35 | Hybrid - 2 Days Onsite London | 6 Month Contract Our client, a multinational and respected consultancy, is hiring for a Lead Site Reliability Engineer with expertise in AWS and DevOps tools for a new project in the Public Sector. Technical …
We're looking for an innovative and enthusiastic Site Reliability Engineer to join a well-known UK games studio that is highly respected throughout the industry. As a Site Reliability Engineer, your main purpose is solving for scale through collaboration and automation, bringing engineering principles …
London, England, United Kingdom Software and Services Site Reliability Engineer (SRE), UK Description: In this role, you will be part of the operations team for a payments platform. The role requires the candidate to embrace a complex software product architecture to support a …
Washington, Washington DC, United States Hybrid / WFH Options
Technica Corporation
for future positions at Technica Corporation. If a position becomes available that aligns with your qualifications, our recruitment team will contact you. A Senior Site Reliability Engineer is responsible for both the operations and maintenance of the Atlassian Developer Tools in support of developer customers. Additionally, this … of the service in the organizational structure as it supports customers in utilizing, but not limited to, the following: Bamboo, Bitbucket, Crucible, & Fisheye. The Site Reliability Engineer conducts technical project milestone reviews, codes architecture sessions, provides resource estimation, and utilizes development best practices. He/she will … be located at a government facility in Washington, D.C., and typically requires work to be performed onsite. The government has established policies for on-site vs remote work, which vary across divisions and contracts. Staff are obligated to adhere to these and other FBI policies and react accordingly as …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Embarcaderomediagroup
Site Reliability & Platform Engineer to help lead the way. You'll sit at the heart of our engineering operations, bringing together SRE principles and modern platform engineering practices. This includes combining principles of SRE - such as service-level reliability, observability, incident response - with platform engineering practices … ship faster, safer, and more cost-efficiently. What you'll be doing: Designing and operating highly reliable, scalable, and secure Azure-based platforms Applying SRE principles like SLOs, observability, and incident management to drive service reliability Building Infrastructure as Code using Terraform (v1.7+) and GitOps workflows Enabling teams through … opportunity for someone passionate about building robust infrastructure and enabling others to move faster and more securely. You might come from a cloud engineering, SRE, or DevOps background - what matters most is your curiosity, systems thinking, and drive to improve operational efficiency. At Sorted, we are committed to fostering an …
incident response. Preferred Qualifications: Master's degree in Computer Science or Engineering, or a related field. About the Job Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services, both our … our systems capacity and performance. Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding … algorithms, complexity analysis and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences and perspectives. We encourage them to collaborate, think big and take risks in a …
to automate routine tasks. Systematic problem-solving approach, coupled with effective verbal and written communication skills. About the Job Site Reliability Engineering (SRE) combines software and systems engineering to build and run large-scale, massively distributed, fault-tolerant systems. SRE ensures that Google Cloud's services, both our … our systems capacity and performance. Much of our software development focuses on optimizing existing systems, building infrastructure and eliminating work through automation. On the SRE team, you'll have the opportunity to manage the complex challenges of scale which are unique to Google Cloud, while using your expertise in coding … algorithms, complexity analysis, and large-scale system design. SRE's culture of diversity, intellectual curiosity, problem solving, and openness is key to its success. Our organization brings together people with a wide variety of backgrounds, experiences, and perspectives. We encourage them to collaborate, think big, and take risks in a …
enabler of Capital One's ambitions. We are keen to add a Senior Site Reliability Engineering Manager (SSREM) to our Nottingham-based SRE organisation whose primary focus is to provide effective leadership as we evolve and mature site reliability practices for the benefit of our cloud … applications and their customers. The successful candidate will be a leader of leaders with custodianship of application services across 5+ SRE teams. We're looking for an experienced professional whose technical background allows effective challenge and support of teams managing primarily Java-based applications running in a dynamic IaaC AWS … outcomes in the pursuit of business, functional and personal goals. The successful applicant will lead by example and build strong and valuable relationships within the SRE org and with wider tech and business stakeholders. They have the ability to face ambiguity and understand how to make sense of complexity, importantly being able to …
Services, Azure Functions, Azure Logic Apps, Azure SQL, Azure Storage, Application Insights, Azure Redis, VNets and Azure App Gateway. 2+ years of experience with Reliability concepts to ensure high performance and high service availability, able to define, implement and improve business performance SLOs. 2+ years of experience with …/paging with OpsGenie, incident management, RCA (Root Cause Analysis) and retrospective analysis. 2+ years in hands-on technical roles (such as site reliability engineer, software engineer, DevOps engineer, infrastructure engineer). Experience with infrastructure management across multiple cloud and on-premise … less experienced engineers. Production environments with on-call rotations. Advocacy: Train and mentor engineering teams on modern observability practices and techniques. Define and socialize SRE culture, best practices, architectural and security standards. Assess and raise risks across the organization. Partnership with: Internal engineering, architecture and operations teams to ensure alignment. …
City Of Bristol, England, United Kingdom Hybrid / WFH Options
Gravitas Recruitment Group (Global) Ltd
products and services within the GCP platform, meaning the next generation of services that form this financial services company's vision for 2025! Role - Lead Site Reliability Engineer Salary - £90,440 - £106,400 Location - London - Hybrid/Flexible working. Essential Skills: · Experience working with GCP products (or extensive … Jenkins, or alternatives such as Azure DevOps; You will partner with service teams to drive the adoption of Site Reliability Engineering (SRE) best practices, ensuring these principles are integrated effectively within our microservices. Collaborate with infrastructure engineers to guarantee the resilience, scalability, and overall performance of the …
large-scale distributed systems. Facilitate service capacity planning and demand forecasting, software performance analysis, and system tuning. Work with the Site Reliability Engineering (SRE) team on the shared full stack ownership of a collection of services and/or technology areas. Understand the end-to-end configuration, technical dependencies … Terraform, Infrastructure as Code, Oracle Database, Linux, RMAN, Exadata, Zero Data Loss Recovery Appliance. Additionally, we are looking for motivated individuals who have: prior SRE experience managing production cloud services; prior experience in releasing and maintaining cloud services; excellent verbal and written communication; production experience managing systems or database environment … days from the posting date or as long as the job remains posted. Required Skills: Automation, Cloud Infrastructure Services, DevOps, SQL (Structured Query Language), SRE, Troubleshoot Issues. About Us: As a world leader in cloud solutions, Oracle uses tomorrow's technology to tackle today's challenges. We've partnered with …
Are you a Site Reliability Engineer (SRE) seeking a new and interesting challenge? If your answer is yes, it's your lucky day, so keep reading: this could be just what you're looking for! WHAT WILL YOU DO We are looking for a dynamic, proactive and talented … person to join our team and perform the following tasks: Help drive reliability across the product landscape by implementing strong SRE practices, design patterns, and processes Administer and maintain Unix/Linux systems and Tomcat application servers Automate tasks and processes using PowerShell and Python Monitor system health … ARE WE LOOKING FOR IN YOU? Bachelor's degree in Computer Science, Engineering, or a related field 8+ years of relevant experience as an SRE or similar role Native Spanish Language and Advanced level of English (C1) Expert-level knowledge in PowerShell scripting and automation Strong experience with Unix/…
As a Site Reliability Engineer (SRE), you'll continuously drive improvements in observability, performance, and reliability, with the goal to make an impact across the highest levels of government. What you'll do: Monitor platform and containerized applications. Identify performance and availability risks and issues. Work … platform infrastructure. Collaborate with the team and the customer daily. What you'll need to succeed - Required Experience: 8 years of experience as an SRE with a strong understanding of SRE principles for highly scalable and reliable systems. Experience working in a DevSecOps environment and with Source Code repositories and … containerization, K8, and CI/CD Automation. Experience with container orchestration tools (Rancher, OpenShift, etc.) Willing to work in downtown Washington, DC on client site at least 3 days per week. A Bachelor's degree and an active TS/SCI clearance. Nice to have: Passion for learning new development …
time data, set us apart as the leader in payments. We're on the hunt for an exceptional Site Reliability Engineer (SRE) to join our dedicated team. As an SRE at Paymentology, you'll be the superhero responsible for maintaining, improving, and ensuring the high availability, scalability … and service quality levels. Contribute to the design of reliable cloud infrastructure and implement reusable cloud-uptime components as code. Regularly review and optimise SRE practices, tools, and methodologies to enhance overall system reliability and team efficiency. Observability and Automation: Contribute to the design, implementation, and maintenance of observability … a culture of reliability. Requirements Bachelor's Degree in Computer Science, Information Technology, or related field. A minimum of 3 years in a dedicated SRE role, as well as 5+ years of prior software development experience. Comprehensive understanding of large-scale distributed platform architecture. Extensive hands-on cloud experience, particularly …
We are seeking talented Senior Site Reliability Engineers to join our growing SRE team! You will tackle complex challenges by designing and implementing scalable, reliable infrastructure and services that power the future of customer engagement technology. In this pivotal role, you'll leverage your extensive expertise in backend … systems and infrastructure management to enhance the performance and reliability of our platforms. Your contributions will directly influence the shaping of architecture and operational excellence needed for our product to thrive. Some things you'll do Architect and maintain critical infrastructure to enable Customer.io to scale and handle real … processing of billions of messages. Strategically plan and implement infrastructure growth to meet evolving demands and repeatability. Streamline and automate processes for efficiency and reliability, removing manual toil. Participate in on-call rotations to swiftly address availability incidents and support technical engineers with customer-related issues. Develop observability to …
As a Site Reliability Engineer, you'll lead the design, implementation, and management of highly available and scalable systems, applying industry best practices and reliability engineering principles. We know that you can't have great technology services without amazing people. At MetroStar, we are obsessed with our people … clearance or higher Bachelor's degree in Computer Science, Information Technology, or a related field. Minimum of 3 years of professional experience in a Site Reliability Engineering role or similar capacity. Strong experience with cloud technologies (e.g., AWS, Azure, GCP) and infrastructure as code (e.g., Terraform, Ansible). …
solutions that simplify the way IT organizations work. We are currently looking for a Senior Site Reliability Engineer to join our SRE team in the Platform Engineering organization and help us scale our products to millions of end-users. We are looking for individuals with a passion … improve efficiency and reduce delivery time of applications and infrastructure Other duties as needed About You 7+ years' experience in DevOps and/or Site Reliability Engineering roles 3+ years' experience with an object-oriented language (preferably Java, .NET or C++) Expert+ level Linux administration, scripting, and troubleshooting …
A prestigious, technology-driven hedge fund is seeking a highly skilled Site Reliability Engineer (SRE) to join their global infrastructure team. This is a unique opportunity to work in a high-performance, low-latency trading environment where technology is at the heart of the firm’s competitive … critical role in ensuring the performance, reliability, and scalability of the systems that power the fund’s trading and research platforms. As an SRE, you will work closely with software engineers and investment teams to build automation-first solutions that support the firm’s most advanced strategies. Key Responsibilities … across the business. Design and implement automation to eliminate manual tasks and reduce operational risk. Collaborate with software and investment teams to embed the SRE mindset early in the development lifecycle. Ideal Candidate: SRE with experience working with data systems Ability to program (structured, OOP, and TDD) using one or …
role: We are looking for a highly capable and experienced Site Reliability Engineer to join our growing tech team. As an SRE, you will be a hands-on coach for the development teams maintaining and improving our solutions' reliability. You will be part of our DevOps team … and alerting platforms, such as ELK, DataDog, Grafana, Loki, etc. Solid understanding of monitoring and alerting best practices. Previous experience as DevOps/Platform Engineer or SRE. Expertise with IaC tooling (Terraform) and good understanding of cloud technologies, ideally Azure. Hands-on expertise with Kubernetes and Helm. Fundamental understanding …
discretionary bonus Location: Chicago, IL Join a pioneering technology team within one of the top High Frequency Trading firms in Chicago as a Sr Site Reliability Engineer skilled in Python, Linux Systems, and C++ to enhance the reliability of our cutting-edge trading systems. Role Responsibilities … Build and maintain Python infrastructure. Ensure extensive knowledge in Linux Systems application. Own the reliability of the firm's trading systems. Develop and manage software deployment and create scalable automated systems. Maintain and improve automated testing processes. Skills Required: Bachelor's degree in computer science or equivalent field 4+ years of … experience in site reliability engineering, Production engineering, or similar field. Python Development: Expertise in writing and maintaining Python infrastructure is crucial. Linux Systems Knowledge: Extensive experience with Linux operating systems to ensure system integrity and performance. C++ Familiarity: Basic understanding of C++ to collaborate effectively with other development …