New York City (Manhattan), New York, United States
Braze
a sharp and passionate team at your back. If Braze sounds like a place where you can thrive, we can't wait to meet you. WHAT YOU'LL DO Site Reliability Engineers (SREs) are responsible for keeping all internal-facing services and platforms running smoothly. In a nutshell, SREs ensure site uptime. SREs blend sensible system administrators and … and sending billions of messages to end-users daily. We use a diverse technology stack rooted in Ruby on Rails, MongoDB, Redis, Kafka, Kubernetes, and more. As a Senior Site Reliability Engineer at Braze, you will collaborate with your team and consumer engineering teams to continuously improve the infrastructure, automation, and tooling that build internal products from these … from ever happening Retrospect everything that happens to turn lessons into system improvements/changes, automation, etc. WHO YOU ARE 5+ years of experience as a Software, DevOps, or Site Reliability Engineer 3+ years of Data Streaming Reliability Engineering Experience in monitoring, troubleshooting, and optimizing Kafka streaming applications, including diagnosing lag, partition imbalances, consumer group issues, and broker More ❯
Tesla is looking for a Site Reliability Engineer to build, enhance, and scale the infrastructure that underpins our Energy IoT applications. These applications provide real-time monitoring, optimization, and control for Tesla's industry-leading energy products, including Powerwall, Megapack, Solar Roof, Supercharger, Wall Connector, Autobidder, and Virtual Power Plants. We are a high-impact team that values More ❯
HQ or the wider global organisation, you'll be a part of collaborative, high-performing teams, creating cutting-edge software, platforms, and infrastructure. The Role Join us as a Site Reliability Engineer and help us build the future of data sovereignty! We're seeking an SRE passionate about creating high-performance, scalable, and reliable services for our production … implement a comprehensive observability strategy for self-hosted deployments, including infrastructure and tooling for monitoring, alerting, and troubleshooting. This will involve designing and implementing robust metrics and logging systems. Engineer the Acra platform for high availability and fault tolerance. This includes ensuring resilience against Cloud Availability Zone outages and the ability to gracefully handle node failures. Guarantee 99.9% uptime More ❯
Principal Site Reliability Engineer - Core Systems Hybrid in London or Remote within the UK The company Imagine a world where every small business has the power to thrive. That's the world we're building at iwoca. Small businesses aren't just statistics - they're the heartbeat of our communities, the character of our high streets, and the … with Kubernetes, PostgreSQL hosted in AWS RDS, and Snowflake. A track record of shaping incident processes, on-call practices, or sharing reliability ownership across multiple teams. Deep understanding of site reliability principles and applying them to databases, including observability and limiting the impact of long-running or resource-heavy queries. Experience with infrastructure automation, like setting up monitoring and More ❯
Site Reliability Engineer - Software CSG Austin, Texas, United States Hardware Summary Posted: Oct 01, 2024 Role Number: Do you love building elegant solutions to highly complex challenges? Do you intrinsically see the importance in every detail? As part of our Silicon Technologies group, you'll help design and manufacture our next-generation, high-performance, power-efficient processor, system More ❯
Site Reliability Engineer - Software CSG Beaverton, Oregon, United States Hardware Summary Posted: Oct 01, 2024 Role Number: Do you love building elegant solutions to highly complex challenges? Do you intrinsically see the importance in every detail? As part of our Silicon Technologies group, you'll help design and manufacture our next-generation, high-performance, power-efficient processor, system More ❯
Indianapolis, Indiana, United States Hybrid / WFH Options
Eli Lilly
around the world. Come tackle complex challenges and ensure the reliability of critical applications to help patients! Lilly's Software Product Engineering team is actively looking for a Lead Site Reliability Engineer (SRE). Are you ready to own and complete complicated and technically ambitious tasks? Are you passionate about technology with extensive experience in observability, AWS, Kubernetes … everyone is informed about the status of applications and any incidents that occur. Provide guidance and training to junior team members, helping them develop their skills and knowledge in site reliability engineering. Stay current with industry trends and emerging technologies and be willing to adapt and innovate to improve application reliability and performance. Always prioritize the needs and experiences More ❯
Government. We support several Department of Defense (DoD) and Federal Agencies across the CONUS. OVERVIEW Full-time/Permanent Employee Location: Dayton, OH or Hanscom AFB, MA As a Site Reliability Engineer IV, you will work under general supervision to ensure the reliability and performance of systems and applications by addressing technical challenges of moderate scope and complexity. More ❯
Job Description: We are seeking a highly skilled and experienced Reliability Engineer to join our team. The ideal candidate must have a strong background in technology, with specific expertise in Kubernetes, Gitlab, Dynatrace, GraphQL, Node, React with a good understanding of CI/CD pipelines. The candidate must be comfortable with ambiguity, learning new things and have a perseverance More ❯
Los Angeles (Downtown), California, United States Hybrid / WFH Options
Altruist
urgency, swiftly adapting to change and overcoming obstacles. The opportunity Altruist is in the midst of an exceptional growth phase and we're excited to hire our first Senior Site Reliability Engineer to join our growing team. This is a high-impact role where you'll play a key part in building and maintaining the reliability of systems … goals. Drive initiatives to improve performance, reliability, and infrastructure cost-efficiency: Look for opportunities to automate, optimize, and scale smarter. What you bring Experience - 7+ years of experience working in Site Reliability Engineering or a related role: A proven track record of maintaining and improving system reliability. Expertise in AWS services and Kubernetes, including infrastructure as code tools like Terraform More ❯
Los Angeles (Downtown), California, United States Hybrid / WFH Options
Altruist
urgency, swiftly adapting to change and overcoming obstacles. The opportunity Altruist is in the midst of an exceptional growth phase and we're excited to hire our first Staff Site Reliability Engineer to join our growing team. This is a high-impact role where you'll play a key part in building and maintaining the reliability of systems … goals. Drive initiatives to improve performance, reliability, and infrastructure cost-efficiency: Look for opportunities to automate, optimize, and scale smarter. What you bring Experience - 10+ years of experience working in Site Reliability Engineering or a related role: A proven track record of maintaining and improving system reliability. Expertise in AWS services and Kubernetes, including infrastructure as code tools like Terraform More ❯
New York City (Manhattan), New York, United States
Apple
Senior Site Reliability Engineer New York City, New York, United States Software and Services Summary Posted: Mar 28, 2025 Role Number: The Media Platforms SRE team under the Apple Service Engineering division is one of the most exciting examples of Apple's long-held passion for combining art and technology. These are the people who power the App More ❯
you'll still consider applying. Want to learn more about life at Klaviyo? Visit to see how we empower creators to own their own destiny. As a Senior Lead Site Reliability Engineer joining the SRE team, you can expect to lead the technical direction and strategy for business-critical areas of the SRE organization. You will lead by More ❯
fun, and believe that we provide a great place to come to work each day to pursue your passions. The Challenge Join our dynamic and innovative team as a Site Reliability Engineer, where you will play a critical role in ensuring the reliability and performance of our services and infrastructure. You will be in charge of coordinating systems More ❯
Camp Hill, Pennsylvania, United States Hybrid / WFH Options
Delta Dental of California
with a required onsite presence in one of our offices in Rancho Cordova, CA or Camp Hill, PA; fully remote work is not available for this position. The Expert Site Reliability Engineer plays a pivotal role in ensuring seamless and efficient operations, grounded in ITSM methodology. They align IT services with business objectives, proactively adapting to the organization More ❯
Rancho Cordova, California, United States Hybrid / WFH Options
Delta Dental of California
with a required onsite presence in one of our offices in Rancho Cordova, CA or Camp Hill, PA; fully remote work is not available for this position. The Expert Site Reliability Engineer plays a pivotal role in ensuring seamless and efficient operations, grounded in ITSM methodology. They align IT services with business objectives, proactively adapting to the organization More ❯
V2Soft is a global leader in IT services and business solutions, delivering innovative and cost-effective technology solutions worldwide since 1998. We are headquartered in Bloomfield Hills, MI and have 16 offices spread across six countries. We partner with Fortune More ❯
in our Deployment Solutions organization. Core skills and technologies: VMWare, Kubernetes, Docker, Helm, Ansible, Terraform, Linux, AWS, DoD compliance Qualifications • At least 4 years of experience as an SRE engineer, you excel at automating software delivery and deployment while providing documentation and self-service tools for engineering teams and customers. • Holding an active security clearance, you have firsthand experience More ❯
It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500. Our intelligent cloud-based platform seamlessly connects … more about us? Now that we have set the pace, keep reading if you want to understand more about ServiceNow as a company and the SRE role. As an Engineer on the SRE team you will: Provide relief and sustainable resolution to issues within our infrastructure. Use your experience in software development, systems engineering, and networking to proactively prevent More ❯
top AI computing platform. We equip engineers with the tools to deploy AI that is fast, secure, affordable, and built to scale. Whether they need powerhouse GPU hardware on-site or the flexibility of cloud-based solutions, we've got the horsepower to make it happen. Lambda's AI Cloud has been adopted by the world's leading companies More ❯
It all started in sunny San Diego, California in 2004 when a visionary engineer, Fred Luddy, saw the potential to transform how we work. Fast forward to today - ServiceNow stands as a global market leader, bringing innovative AI-enhanced technology to over 8,100 customers, including 85% of the Fortune 500. Our intelligent cloud-based platform seamlessly connects … more about us? Now that we have set the pace, keep reading if you want to understand more about ServiceNow as a company and the SRE role. As an Engineer on the SRE team you will: Provide relief and sustainable resolution to issues within our infrastructure. Use your experience in software development, systems engineering, and networking to proactively prevent More ❯
Pasco, Washington, United States Hybrid / WFH Options
Palantir Technologies
who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more. The Role We're looking for Forward Deployed Site Reliability Engineers who can help us build, operate, and maintain high-performance, scalable, and reliable services for our production infrastructure, primarily across on-prem environments for the US Government. … Forward Deployed Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes, with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job. You'll travel to various locations where you will be the expert for Palantir's More ❯
New York City (Manhattan), New York, United States Hybrid / WFH Options
Palantir Technologies
who need it, our platforms empower our partners to develop lifesaving drugs, forecast supply chain disruptions, locate missing children, and more. The Role We're looking for Forward Deployed Site Reliability Engineers who can help us build, operate, and maintain high-performance, scalable, and reliable services for our production infrastructure, primarily across on-prem environments for the US Government. … Forward Deployed Site Reliability Engineers combine engineering experience and an innate drive to improve existing systems and processes, with the creativity to develop novel solutions to evolving challenges. Our team strives to automate processes wherever possible, using whichever tools are best for the job. You'll travel to various locations where you will be the expert for Palantir's More ❯
Welcome to the intersection of energy and home services. At NRG, we're all about propelling the next generation of leaders forward. We are driven by our passion to create a smarter, cleaner and more connected future. We deliver innovative solutions More ❯
Want a fast-paced, rewarding career at a fast-growing, global tech company? Luminance is a young AI company that is growing rapidly: today, Luminance's technology is helping over 600 customers in 70 countries globally. With ambitious growth plans More ❯