Permanent 'Reliability Engineer' Job Vacancies

23 of 23 Permanent Reliability Engineer Jobs

Sr. HW Reliability Engineer, Kuiper, Product Integrity, Hardware Reliability, Product Integrity

Bellevue, Washington, United States
Amazon.com Services LLC
Amazon is a leader in developing first-of-its-kind hardware, such as Kindle, Echo and FireTV. The Amazon reliability team aims to develop reliable and robust products that delight our customers. In this role, as a Sr. Hardware Reliability Engineer, you will be responsible for the reliability engineering of our new and emerging category of devices … launch a constellation of Low Earth Orbit satellites that will provide low-latency, high-speed broadband connectivity to un-served and under-served communities around the world. As a reliability engineer on the Kuiper Customer Terminal products, you will be developing and building inventive technologies and mechanisms to resolve complex reliability challenges. You should have a good … understanding of Design for Reliability, Reliability statistics, Reliability tests and/or solid understanding of indoor and outdoor consumer electronics to influence design for reliability to: Define new test specifications, methodology and test coverage to assure semiconductor components can meet Amazon product reliability requirements. Identify and validate product risks via Design for Reliability mechanisms More ❯
Employment Type: Permanent
Salary: USD Annual

HW Reliability Engineer, Kuiper, Product Integrity, Hardware Reliability, Product Integrity

Redmond, Washington, United States
Amazon.com Services LLC
Amazon is a leader in developing first-of-its-kind hardware, such as Kindle, Echo and FireTV. The Amazon reliability team aims to develop reliable and robust products that delight our customers. In this role, as a Hardware Reliability Engineer, you will be responsible for the reliability engineering of our new and emerging category of devices - Kuiper … launch a constellation of Low Earth Orbit satellites that will provide low-latency, high-speed broadband connectivity to un-served and under-served communities around the world. As a reliability engineer on the Kuiper Customer Terminal products, you will be developing new reliability specifications. You should have a good understanding of Design for Reliability, Reliability statistics, Reliability tests and/or solid understanding of indoor and outdoor consumer electronics to influence design for reliability to: Define new test specifications, methodology and test coverage based on mission profiles to assure product reliability. Identify and validate product risks via Design for Reliability mechanisms and work with design teams to mitigate them during development. More ❯
Employment Type: Permanent
Salary: USD Annual

Embedded System Reliability Engineer

Capenhurst, Cheshire, UK
Hybrid / WFH Options
EA Technology
working/hybrid roles which is why so many of our employees stay with us long term. Due to continued growth and expansion, we're looking for an Embedded Systems Reliability Engineer to join our amazing CTO Development team. You'd be working with an exceptionally friendly group of people in a genuinely welcoming and positive environment. So, if … on-site working (Capenhurst CH1 6ES) over the course of the year flexing as projects require*** About the role: This is an exciting opportunity for a talented Embedded Systems Reliability Engineer with proficiency in modern C++ (C++17 or newer) to join us to: Investigate and resolve complex bugs across embedded and desktop systems, implementing fixes and systemic quality … CI/CD pipelines for embedded firmware (Buildroot/make) and desktop applications (CMake/Qt), integrating quality gates and static analysis Define, monitor and drive improvements against key reliability metrics (e.g. crash frequency, memory stability, startup success) Improve diagnostic visibility through structured logging, crash data capture and telemetry via MQTT Collaborate with hardware, software and test engineers to More ❯
Employment Type: Full-time

Reliability & Operations Engineer

London Area, United Kingdom
Heart Mind Talent
Heart Mind Talent are partnering with Verified Global to hire a Reliability & Operations Engineer based in Central London. Verified Global builds cutting-edge algorithms to flip the odds in sports betting. Every hour, millions of fans place sub-optimal bets. We’re changing that—delivering market-beating tips, insights, and data powered by our world-class in-house … flagship consumer platform launched in 2024 and is scaling fast, supported by industry-leading social channels with nearly two million highly engaged followers. The Role We’re hiring a Reliability & Operations Engineer to keep our products fast, accurate, and always on. You’ll sit at the heartbeat of our daily operations, monitoring systems and content pipelines, resolving issues … in real time, and automating workflows to help us scale. What you’ll do Own daily reliability: Monitor key metrics and alerts, triage anomalies, and drive quick fixes or escalations to keep uptime and data freshness high. Run production workflows: Manage recurring jobs (content updates, reporting, dashboards) and ensure they complete on schedule. Automate & document: Script repeatable tasks, improve More ❯

Reliability & Operations Engineer

City of London, London, United Kingdom
Heart Mind Talent
Heart Mind Talent are partnering with Verified Global to hire a Reliability & Operations Engineer based in Central London. Verified Global builds cutting-edge algorithms to flip the odds in sports betting. Every hour, millions of fans place sub-optimal bets. We’re changing that—delivering market-beating tips, insights, and data powered by our world-class in-house … flagship consumer platform launched in 2024 and is scaling fast, supported by industry-leading social channels with nearly two million highly engaged followers. The Role We’re hiring a Reliability & Operations Engineer to keep our products fast, accurate, and always on. You’ll sit at the heartbeat of our daily operations, monitoring systems and content pipelines, resolving issues … in real time, and automating workflows to help us scale. What you’ll do Own daily reliability: Monitor key metrics and alerts, triage anomalies, and drive quick fixes or escalations to keep uptime and data freshness high. Run production workflows: Manage recurring jobs (content updates, reporting, dashboards) and ensure they complete on schedule. Automate & document: Script repeatable tasks, improve More ❯

Reliability & Operations Engineer

london, south east england, united kingdom
Heart Mind Talent
Heart Mind Talent are partnering with Verified Global to hire a Reliability & Operations Engineer based in Central London. Verified Global builds cutting-edge algorithms to flip the odds in sports betting. Every hour, millions of fans place sub-optimal bets. We’re changing that—delivering market-beating tips, insights, and data powered by our world-class in-house … flagship consumer platform launched in 2024 and is scaling fast, supported by industry-leading social channels with nearly two million highly engaged followers. The Role We’re hiring a Reliability & Operations Engineer to keep our products fast, accurate, and always on. You’ll sit at the heartbeat of our daily operations, monitoring systems and content pipelines, resolving issues … in real time, and automating workflows to help us scale. What you’ll do Own daily reliability: Monitor key metrics and alerts, triage anomalies, and drive quick fixes or escalations to keep uptime and data freshness high. Run production workflows: Manage recurring jobs (content updates, reporting, dashboards) and ensure they complete on schedule. Automate & document: Script repeatable tasks, improve More ❯

Reliability & Operations Engineer

slough, south east england, united kingdom
Heart Mind Talent
Heart Mind Talent are partnering with Verified Global to hire a Reliability & Operations Engineer based in Central London. Verified Global builds cutting-edge algorithms to flip the odds in sports betting. Every hour, millions of fans place sub-optimal bets. We’re changing that—delivering market-beating tips, insights, and data powered by our world-class in-house … flagship consumer platform launched in 2024 and is scaling fast, supported by industry-leading social channels with nearly two million highly engaged followers. The Role We’re hiring a Reliability & Operations Engineer to keep our products fast, accurate, and always on. You’ll sit at the heartbeat of our daily operations, monitoring systems and content pipelines, resolving issues … in real time, and automating workflows to help us scale. What you’ll do Own daily reliability: Monitor key metrics and alerts, triage anomalies, and drive quick fixes or escalations to keep uptime and data freshness high. Run production workflows: Manage recurring jobs (content updates, reporting, dashboards) and ensure they complete on schedule. Automate & document: Script repeatable tasks, improve More ❯

Reliability & Operations Engineer

london (city of london), south east england, united kingdom
Heart Mind Talent
Heart Mind Talent are partnering with Verified Global to hire a Reliability & Operations Engineer based in Central London. Verified Global builds cutting-edge algorithms to flip the odds in sports betting. Every hour, millions of fans place sub-optimal bets. We’re changing that—delivering market-beating tips, insights, and data powered by our world-class in-house … flagship consumer platform launched in 2024 and is scaling fast, supported by industry-leading social channels with nearly two million highly engaged followers. The Role We’re hiring a Reliability & Operations Engineer to keep our products fast, accurate, and always on. You’ll sit at the heartbeat of our daily operations, monitoring systems and content pipelines, resolving issues … in real time, and automating workflows to help us scale. What you’ll do Own daily reliability: Monitor key metrics and alerts, triage anomalies, and drive quick fixes or escalations to keep uptime and data freshness high. Run production workflows: Manage recurring jobs (content updates, reporting, dashboards) and ensure they complete on schedule. Automate & document: Script repeatable tasks, improve More ❯

Reliability Engineer

City of London, London, United Kingdom
Digital Realty (UK) Limited
The Engineer will provide a range of support which may include technical difficulties, working with vendors to overcome intrinsic issues, working with site operations teams to improve usage and efficiency aspects, and identifying any areas for improvement. This may include site-specific improvements, region-wide improvement programmes, and oversight and reporting throughout. The Engineer plays a key role … during incident and follow-up, including taking part in incident calls, root cause analysis exercises, and supporting sites throughout resolution of problem tasks. The Engineer may play a consultative role in the review of high-risk changes and incident reports. What you'll do General Duties To assist the EMEA Reliability Manager and Technical Operations team in defining and … of Mechanical and Electrical infrastructure, address any maintenance-related issues and concerns and other maintenance-related relevant topics. After these meetings, they will provide a regular update to the Reliability Manager and the Director of Technical Operations on any concerns relating to plant condition and operational activities. Reactive Skills When directed by the Reliability Manager or the Director More ❯
Employment Type: Permanent

Senior Azure SaaS Reliability & Support Engineer

Kingston Upon Thames, England, United Kingdom
Hybrid / WFH Options
Reveal Media
Job Title: Senior Azure SaaS Reliability & Support Engineer Department: Service Delivery Location: Kingston Upon Thames/Hybrid Country: UK Level: Individual Contributor Reports To: Head of Customer Support Contract Type: Permanent Contracted Hours/Days: 37.5 hours/5 days About Us At Reveal, passion meets purpose. Our body-worn video solutions are more than just technology; they … between support, engineering, and cloud operations: Investigating and fixing complex application and infrastructure issues. Monitoring capacity, performance, and error budgets across all deployments. Designing automation and tooling to improve reliability and reduce manual work. Your Responsibilities and Tasks 1. Environment Health & Incident Response Monitor ST and MT environments for server performance, response times, error rates, and application health. Detect … storage objects. Use Azure diagnostics and telemetry to troubleshoot and resolve complex incidents. Provide third-line support for escalated customer cases, collaborating with development for code-level fixes. 2. Reliability Engineering (Fleet Level) Maintain uptime, performance, and scalability across all ST and MT deployments. Define and track service-level objectives (SLOs) and error budgets for different environment types. Perform More ❯

Senior Azure SaaS Reliability & Support Engineer

london, south east england, united kingdom
Hybrid / WFH Options
Reveal Media
Job Title: Senior Azure SaaS Reliability & Support Engineer Department: Service Delivery Location: Kingston Upon Thames/Hybrid Country: UK Level: Individual Contributor Reports To: Head of Customer Support Contract Type: Permanent Contracted Hours/Days: 37.5 hours/5 days About Us At Reveal, passion meets purpose. Our body-worn video solutions are more than just technology; they … between support, engineering, and cloud operations: Investigating and fixing complex application and infrastructure issues. Monitoring capacity, performance, and error budgets across all deployments. Designing automation and tooling to improve reliability and reduce manual work. Your Responsibilities and Tasks 1. Environment Health & Incident Response Monitor ST and MT environments for server performance, response times, error rates, and application health. Detect … storage objects. Use Azure diagnostics and telemetry to troubleshoot and resolve complex incidents. Provide third-line support for escalated customer cases, collaborating with development for code-level fixes. 2. Reliability Engineering (Fleet Level) Maintain uptime, performance, and scalability across all ST and MT deployments. Define and track service-level objectives (SLOs) and error budgets for different environment types. Perform More ❯

Senior Azure SaaS Reliability & Support Engineer

london (kingston upon thames), south east england, united kingdom
Hybrid / WFH Options
Reveal Media
Job Title: Senior Azure SaaS Reliability & Support Engineer Department: Service Delivery Location: Kingston Upon Thames/Hybrid Country: UK Level: Individual Contributor Reports To: Head of Customer Support Contract Type: Permanent Contracted Hours/Days: 37.5 hours/5 days About Us At Reveal, passion meets purpose. Our body-worn video solutions are more than just technology; they … between support, engineering, and cloud operations: Investigating and fixing complex application and infrastructure issues. Monitoring capacity, performance, and error budgets across all deployments. Designing automation and tooling to improve reliability and reduce manual work. Your Responsibilities and Tasks 1. Environment Health & Incident Response Monitor ST and MT environments for server performance, response times, error rates, and application health. Detect … storage objects. Use Azure diagnostics and telemetry to troubleshoot and resolve complex incidents. Provide third-line support for escalated customer cases, collaborating with development for code-level fixes. 2. Reliability Engineering (Fleet Level) Maintain uptime, performance, and scalability across all ST and MT deployments. Define and track service-level objectives (SLOs) and error budgets for different environment types. Perform More ❯

Senior Software Engineer, Network Reliability, GGN WANForms

Dublin, Ireland
Google Inc
Bachelor's degree or equivalent practical experience. 7 years of experience in software development. Experience with solutions for traffic engineering resilience and fault tolerance for WAN Networking products. Preferred qualifications: Master's degree, or PhD in Computer Science or related technical field or equivalent experience. About the … scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google Cloud's needs with opportunities to switch teams and projects as you and our fast-paced business grow and … forward. The AI and Infrastructure team is redefining what's possible. We empower Google customers with breakthrough capabilities and insights by delivering AI and Infrastructure at unparalleled scale, efficiency, reliability and velocity. Our customers include Googlers, Google Cloud customers, and billions of Google users worldwide. We're the driving force behind Google's groundbreaking innovations, empowering the development of More ❯
Employment Type: Permanent
Salary: EUR 125,000 - 150,000 Annual

Network Reliability Engineer: Fully Remote (£120,000 + Bonus) - Elite Trading firm

Leigh, Greater Manchester, UK
Hybrid / WFH Options
Hunter Bond
Network Engineer – Low Latency Trading Environment Up to £120,000 + Bonus Remote (UK/Europe) | London Office (Moorgate) Take the next step in your career now, scroll down to read the full role description and make your application. We're working with an elite trading technology firm building high-performance, low-latency connectivity solutions powering global financial markets. … They're looking for a Network Infrastructure Engineer to join their SRE and Infrastructure team — helping to design, automate, and support a world-class trading network spanning multiple data centres and venues. If you love solving complex networking challenges, automating everything, and working alongside highly skilled engineers in a modern DevOps-driven culture — this one's for you. The … Ansible, Terraform, and coding in Python/Bash. Develop new features, improve observability, and enhance automation. Collaborate with vendors, circuit providers, and internal engineering teams. Support and improve infrastructure reliability across multiple regions. What You'll Bring Solid experience as a Network Engineer or Systems Administrator — ideally both. Hands-on coding ability Automation experience – Ansible, Terraform, scripting in More ❯
Employment Type: Full-time

Network Reliability Engineer: Fully Remote (£120,000 + Bonus) - Elite Trading firm

London, United Kingdom
Hybrid / WFH Options
Network Engineer – Low Latency Trading Environment All the relevant skills, qualifications and experience that a successful applicant will need are listed in the following description. Up to £120,000 + Bonus Remote (UK/Europe) | London Office (Moorgate) We're working with an elite trading technology firm building high-performance, low-latency connectivity solutions powering global financial markets. … They're looking for a Network Infrastructure Engineer to join their SRE and Infrastructure team, helping to design, automate, and support a world-class trading network spanning multiple data centres and venues. If you love solving complex networking challenges, automating everything, and working alongside highly skilled engineers in a modern DevOps-driven culture, this one's for you. The … Terraform, and coding in Python/Bash. Develop new features, improve observability, and enhance automation. Collaborate with vendors, circuit providers, and internal engineering teams. Support and improve infrastructure reliability across multiple regions. What You'll Bring Solid experience as a Network Engineer or Systems Administrator, ideally both. Hands-on coding ability Automation experience – Ansible, Terraform, scripting in More ❯

Network Reliability Engineer: Fully Remote (£120,000 + Bonus) - Elite Trading firm

Leigh, Lancashire, United Kingdom
Hybrid / WFH Options
Network Engineer – Low Latency Trading Environment All the relevant skills, qualifications and experience that a successful applicant will need are listed in the following description. Up to £120,000 + Bonus Remote (UK/Europe) | London Office (Moorgate) We're working with an elite trading technology firm building high-performance, low-latency connectivity solutions powering global financial markets. … They're looking for a Network Infrastructure Engineer to join their SRE and Infrastructure team, helping to design, automate, and support a world-class trading network spanning multiple data centres and venues. If you love solving complex networking challenges, automating everything, and working alongside highly skilled engineers in a modern DevOps-driven culture, this one's for you. The … Terraform, and coding in Python/Bash. Develop new features, improve observability, and enhance automation. Collaborate with vendors, circuit providers, and internal engineering teams. Support and improve infrastructure reliability across multiple regions. What You'll Bring Solid experience as a Network Engineer or Systems Administrator, ideally both. Hands-on coding ability Automation experience – Ansible, Terraform, scripting in More ❯

Production Reliability Engineer

London Area, United Kingdom
Global Fintech
communications, ensuring processes are followed and all post-incident follow-up and analysis. Escalate incidents or service requests that require system, config or code changes to the appropriate on-call engineer. Manage engineering service requests, prioritizing requests according to urgency/impact and ensuring requests are serviced in a timely manner. Work with engineers to establish or update runbooks and procedures More ❯

Production Reliability Engineer

City of London, London, United Kingdom
Global Fintech
communications, ensuring processes are followed and all post-incident follow-up and analysis. Escalate incidents or service requests that require system, config or code changes to the appropriate on-call engineer. Manage engineering service requests, prioritizing requests according to urgency/impact and ensuring requests are serviced in a timely manner. Work with engineers to establish or update runbooks and procedures More ❯

Production Reliability Engineer

slough, south east england, united kingdom
Global Fintech
communications, ensuring processes are followed and all post-incident follow-up and analysis. Escalate incidents or service requests that require system, config or code changes to the appropriate on-call engineer. Manage engineering service requests, prioritizing requests according to urgency/impact and ensuring requests are serviced in a timely manner. Work with engineers to establish or update runbooks and procedures More ❯

Production Reliability Engineer

london, south east england, united kingdom
Global Fintech
communications, ensuring processes are followed and all post-incident follow-up and analysis. Escalate incidents or service requests that require system, config or code changes to the appropriate on-call engineer. Manage engineering service requests, prioritizing requests according to urgency/impact and ensuring requests are serviced in a timely manner. Work with engineers to establish or update runbooks and procedures More ❯

Production Reliability Engineer

london (city of london), south east england, united kingdom
Global Fintech
communications, ensuring processes are followed and all post-incident follow-up and analysis. Escalate incidents or service requests that require system, config or code changes to the appropriate on-call engineer. Manage engineering service requests, prioritizing requests according to urgency/impact and ensuring requests are serviced in a timely manner. Work with engineers to establish or update runbooks and procedures More ❯

Application Reliability and Performance Engineer

Reigate, England, United Kingdom
Hybrid / WFH Options
esure Group
insights alongside exceptional service, to deliver personalised experiences that meet our customers' ever-changing needs today and in the future. Role description We are currently recruiting for an Application Reliability and Performance Engineer to join our Team. We are seeking a highly motivated individual to bring a diverse skill set and proactive mindset to help ensure the stability More ❯

Application Reliability and Performance Engineer

guildford, south east england, united kingdom
Hybrid / WFH Options
esure Group
insights alongside exceptional service, to deliver personalised experiences that meet our customers' ever-changing needs today and in the future. Role description We are currently recruiting for an Application Reliability and Performance Engineer to join our Team. We are seeking a highly motivated individual to bring a diverse skill set and proactive mindset to help ensure the stability More ❯

Reliability Engineer
10th Percentile: £48,000
25th Percentile: £52,500
Median: £57,500
75th Percentile: £60,000