a fast-paced, dynamic environment. Previous experience working on large app/data migration engagements. Cloud Platforms and Technology Experience. Core Skills: GCP (networking, security tools and best practices); observability (Operations suite, logging, monitoring, alerting). Additional Skills: Good understanding of Linux OS. Bash, scripting, automation, Ansible, networking, security. Hands-on experience with DevOps principles and tools. Hands-on with Terraform …
London, UK | Full-time | Senior-Level. I'm hiring for a Technical Account Manager on behalf of a high-growth SaaS company building a next-generation observability platform. Their technology helps engineering teams monitor, analyse, and act on their logs, metrics, traces, and security data, improving performance and cutting observability spend. This is a senior, customer-facing … technical role ideal for someone with a background in cloud infrastructure, observability tools, and DevOps. You’ll play a key role in onboarding, supporting, and expanding relationships with enterprise customers, from hands-on implementation to strategic advisory. What You’ll Be Doing: Own the technical onboarding journey for new customers, from data integration to configuration and enablement. Work closely with … DevOps, SREs, and engineering teams to understand requirements and deliver high-impact observability solutions. Troubleshoot complex infrastructure issues (Kubernetes, Docker, pipelines, etc.) and advise on best practices. Act as a trusted technical advisor, providing guidance on implementation, optimisation, and long-term success. Partner with sales and customer success teams on renewals, expansions, and QBRs. What You Bring: Strong hands-on …
wide Job Description: Technical Account Manager - DevOps Specialist. London - Hybrid (2 days per week in office) · Full-time · Senior. About the company: My client is rebuilding the path to observability using a real-time streaming analytics pipeline that provides monitoring, visualization, and alerting capabilities without the burden of indexing. By enabling users to define different data pipelines per use case … we provide deep Observability and Security insights, at an infinite scale, for less than half the cost. About the Position: Technical Account Managers at my client are key to our effort to meet our customers’ expectations and help them use their observability and security data in the most efficient way possible. We are looking for hard-working, sharp, and … humble professionals with proven technical customer-facing experience. Their Technical Account Managers are trusted advisors who consult with customers throughout their monitoring, security, and observability journey. This role sits at the critical intersection of deep technical expertise and a focus on customer satisfaction, renewal, and expansion. Technical Account Managers hold senior-level roles and are expected to professionally and accurately solve …
You’ll Be Responsible For: As a Senior SRE, you’ll lead initiatives that: Ensure availability, latency, and performance of mission-critical systems across cloud and hybrid environments. Architect observability solutions (monitoring, logging, alerting) that detect and prevent failures before they impact users. Own and improve incident response workflows, including runbooks, communications, and root cause analysis. Define and enforce SLIs … using tools such as Azure DevOps, GitHub Actions, Jenkins, or GitLab. Lead the design and delivery of resilient, scalable infrastructure using IaC (Terraform, Bicep, etc.). Develop automation and observability tooling that enables fast feedback loops and minimal manual intervention. Strategic & Advisory: Define infrastructure architecture to support fault-tolerant applications. Collaborate with developers, architects, and product teams to embed reliability …
Reading, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
native Infrastructure-as-Code (IaC) solutions from the ground up? Our client is seeking a talented and motivated Senior Software Engineer to lead the development of our next-generation observability platform. THIS IS NOT A DEVOPS ROLE. Responsibilities: Collaborate within a dynamic software engineering team to architect and build a new cloud-native IaC platform. Develop software using technologies such …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
native Infrastructure-as-Code (IaC) solutions from the ground up? Our client is seeking a talented and motivated Senior Software Engineer to lead the development of our next-generation observability platform. THIS IS NOT A DEVOPS ROLE. Responsibilities: Collaborate within a dynamic software engineering team to architect and build a new cloud-native IaC platform. Develop software using technologies such …
a key member of the Dynatrace sales engine and will be responsible for providing excellent technical support to the sales team. You will be the expert on Dynatrace and observability, with a specialization in Log Management and Analytics. Within this exciting role, you will be responsible for delivering great demos that show Dynatrace's unique approach to solving the customer … be filled at a higher level based on candidate experience. What will help you succeed. Preferred Requirements: Experience with query languages such as SQL, SPL, or KQL. Experience with observability and log collectors/pipelines such as FluentBit, OpenTelemetry, Cribl, and Logstash. Experience with web technologies such as HTML, CSS, and JavaScript. Experience with programming/scripting side technologies such … OpenShift, Serverless functions, and CI/CD pipelines. Experience with automation like Ansible, Puppet, Terraform, etc. Why you will love being a Dynatracer: Dynatrace is a leader in unified observability and security. We provide a culture of excellence with competitive compensation packages designed to recognize and reward performance. Our employees work with the largest cloud providers, including AWS, Microsoft, and …
elevate the end-user experience. This position is designed to fuel your hands-on growth, giving you the chance to master cloud architectures, Continuous Integration/Continuous Deployment pipelines, observability tools, and incident management processes, all while working in a fast-paced, ever-evolving environment. You'll report directly to the Cloud Operations Director in this role, with no people … pipelines and CI/CD processes to streamline releases. Troubleshoot production issues and drive initiatives to prevent future disruptions, keeping systems stable and available. Set up and maintain powerful observability tools (logging, monitoring, alerting) to ensure fast incident detection and resolution. Take part in an on-call rotation, gaining invaluable real-time experience in incident management and root cause analysis.
performance tuning and benchmarking skills (storage, network, Linux kernel). Solid experience with DevOps tooling (Terraform, Ansible, GitLab, Jenkins). Proficiency in Python, Golang, or similar languages. Familiarity with monitoring/observability tools like Splunk, Prometheus, and Grafana. Bonus: experience with containerization and orchestration (Docker, Kubernetes). If you're passionate about high-performance infrastructure and want to work at the intersection of …
multi-tenant SaaS or large enterprise application. Certifications: AWS Certified Solutions Architect, Google Professional Cloud Architect, Azure Solutions Architect Expert. Experience in data architecture, AI/ML integration, and observability frameworks.
with business objectives to meet evolving customer needs. As an influential figure in our company, the Systems Reliability Engineering Senior Lead will spearhead initiatives to automate infrastructure, enhance system observability, and drive the transformation of our IT operations. What are we looking for? Bachelor’s degree in Information Technology, Computer Science, Business Management, or a related field. 7+ years of … with a proven track record in issue and problem management in a multicultural and global environment. Proficiency with cloud platforms and experience in configuration management, scripting, and monitoring and observability tools. Understanding of business processes, change management, and ITSM processes, including service level management and reporting. Excellent communication skills and the ability to work collaboratively with cross-functional teams. What …
Bracknell, Berkshire, South East, United Kingdom Hybrid / WFH Options
Halian Technology Limited
in the team. Contribute to solution architecture and strategic technical direction. Build, integrate, and maintain REST APIs and backend services. Champion best practices in software quality, CI/CD, observability, and DevOps. Collaborate with cross-functional teams including Product, QA, and DevOps. Optionally take on people management responsibilities for engineers. Stay updated with emerging backend and cloud technologies. Key Skills …
of practices (e.g., Cloud, Platforms, AI, Strategy, Custom Application Development, Network & Edge, Security, Resiliency, etc.). Articulate the vision for modern engineering (e.g., agile, cloud-native, DevOps) and operations (e.g., observability, automated response, SRE, etc.), and be able to articulate a path toward a target operating model (people, process, and tools). SoftServe is an Equal Opportunity Employer. All qualified applicants will receive …
have end-to-end ownership over reliability tooling, incident response, and system performance, working across teams to scale a truly enterprise-grade platform. Key Responsibilities: Leading on production resilience, observability, and incident frameworks. Building SLIs/SLOs and advocating for best practices in platform reliability. Automating recovery, scaling, and monitoring across distributed systems. Collaborating with cross-functional teams to align …/production infrastructure roles. Strong experience with Java (Spring) and cloud platforms (ideally Azure). Proven track record in building and maintaining mission-critical systems. Deep understanding of Kubernetes, observability tooling (Grafana, Prometheus, ELK, etc.), and Infrastructure as Code (Terraform, Bicep). Ability to lead technical conversations across Engineering and Product. Experience in fintech, crypto, or regulated digital infrastructure. RDBMS …
services. Strong background in Linux administration and troubleshooting. Proven experience in implementing and managing CI/CD pipelines and Infrastructure as Code (IaC) solutions. Proven experience in monitoring and observability tools to proactively manage system health. Skills and Strengths: AWS (Amazon Web Services), Auto Scaling, Fargate, Route53, observability tools (New Relic, Datadog, Splunk), scripting (Ansible, Bash, Python, Go), CI/ …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Code (IaC) using Terraform to automate infrastructure provisioning and management. Establish and maintain robust security controls across all cloud environments, ensuring compliance with relevant standards and regulations. Utilise advanced observability tools to monitor and optimise the performance of production services, proactively identifying and resolving issues. Design and optimise CI/CD pipelines using platforms such as GitLab or Jenkins, enabling …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Site Reliability Engineer (SRE) Lead – Observability. Location: London (Hybrid, 2 days on site per week). Contract. Role Overview: Join a high-impact team where you'll lead and shape the SRE and Observability function for a major transformation programme. This role goes beyond traditional SRE – you’ll champion best practices across … product teams, drive observability strategy, and work hands-on with cutting-edge tools like Datadog and AWS. Key Responsibilities: Lead the SRE function and promote observability-first thinking across development and operations teams. Define and implement the observability roadmap across product domains in collaboration with the client. Be hands-on with Datadog for infrastructure and application-level monitoring. Guide and … review daily operations and improvements across observability platforms. Partner with engineering squads to deliver on observability requirements in an agile, demand-led way. Core Skills & Experience: Proven experience as a hands-on SRE Engineer. Deep understanding of observability and monitoring practices. Practical experience with Datadog (or similar observability platforms). Strong DevOps toolchain knowledge: GitHub, GitHub Actions, Jenkins, CodeQL, Nexus …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
development, QA, and operations teams to implement DevOps methodologies and toolchains. Use Infrastructure as Code (IaC) with Terraform for automation. Maintain security controls across cloud environments, ensuring compliance. Utilise observability tools to monitor and optimise production services. Design and improve CI/CD pipelines with platforms like GitLab or Jenkins. Mentor and guide DevOps and development teams, promoting continuous learning.
paced environment. Responsibilities: Develop scalable tools for automation, deployment, and infrastructure management. Enhance system performance, reliability, and efficiency through automation. Manage AWS infrastructure, ensuring smooth configuration and deployment. Implement observability tools for monitoring and debugging. Ensure fault tolerance, redundancy, and high availability of trading systems. Support infrastructure for C++ and Rust-based trading systems, ensuring seamless integration. Qualifications: Strong programming …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
Terraform, and cloud-native services. Drive infrastructure automation, and build and evolve infrastructure as code (Terraform, etc.) and CI/CD pipelines to reduce toil and accelerate deployment frequency. Build observability into everything: own monitoring, alerting, and incident response to minimize MTTR and improve system health. Champion SRE culture and reliability-focused engineering: help shape sustainable engineering practices, SLAs, SLOs, and …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
that automate their processes. Contribute to the development of our AI agent development platform that scales with our product strategy. Ensure our AI services maintain high standards of reliability, observability, availability, and performance. Participate in our machine learning community to influence how we implement machine learning and computer vision technologies, shaping Unitary's future. Take ownership of customer outcomes with …
Reading, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
balancing technical, security, and operational priorities. Bonus Skills (Nice to Have): Proficiency in infrastructure-as-code using Terraform. Experience setting up and managing CI/CD pipelines. Familiarity with observability tools and techniques. Fully remote role or hybrid option from Belfast. Long-term incentive scheme participation. Private health coverage, including critical illness and life insurance. Wellness support including gym discounts …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
skills. Strong understanding of cloud-based application architecture and stack, preferably including AWS. Good understanding of Docker and experience with CI/CD tooling. Good understanding of security and observability best practices and tooling. What else? Experience building and maintaining high-traffic server-side web applications. Experience with infrastructure-as-code tools such as Terraform or CloudFormation. Experience of working …
Slough, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
low latency network environment. You’ll be joining a collaborative and forward-thinking environment with flat structures and deep technical expertise – ideal for someone who enjoys network automation, observability tooling, and IaC. Below I have included a breakdown of the role, company, and requirements. Please review and, if the opportunity seems like a good fit, share your CV … opportunities for network automation and implement appropriately. IaC-heavy environment: work with the likes of Ansible, Python, CI/CD, and GitOps practices. Deliver troubleshooting, operational enhancements, and BAU changes. Develop observability tooling (dashboards, alerts) and build self-healing or event-driven automation. Lead post-incident reviews and trend analysis to continuously improve network reliability and performance. Company: Technology-led culture – Drives … per week WFH. Competitive Compensation: Year 1 guaranteed bonus, 13% pension, potential for sign-on bonuses. Requirements: Proven experience with low latency networking. Python for network automation. Monitoring and observability tooling such as Kibana, Splunk, Prometheus, Grafana. Familiarity with IaC/DevOps tools such as Ansible, GitOps, CI/CD.