meet business needs and objectives. Develop a baseline monitoring and tooling concept for cloud to address the need for compliance infrastructure reporting within agile deliveries as part of our Observability strategy. Develop concepts and tools for chargeback and showback (Financial Instrumentation) in a multicloud context. Implement and mature a cloud forecasting and capacity management solution for the enterprise. Collaborate with …
to make an impact. As a Platform Engineer, you’ll help design, build, and support the infrastructure and tooling that underpins critical systems – from CI/CD pipelines and observability tooling to service deployment and runtime environments. You’ll be part of a high-trust team that values clean code, quick iteration, and leaving things better than you found them. … or Python for building internal tooling and services Hands-on experience with AWS, Kubernetes, Docker, and modern CI/CD pipelines Familiarity with infrastructure-as-code (e.g., Terraform) and observability tooling (e.g., Prometheus, Grafana) Comfortable working on distributed systems and improving developer workflows A product mindset and a collaborative approach to problem-solving Experience with Kafka, gRPC, or open-source …
Stoke-on-Trent, England, United Kingdom Hybrid / WFH Options
bet365
Who we are looking for A Site Reliability Engineer, who will enhance system reliability, observability and performance through a strong engineering approach and assist with incident resolution and best practices. You will have software engineering skills, focusing on system reliability and observability. You will monitor the health, performance and availability of critical systems, directly impacting operational efficiency. Using your engineering … practices and develop features for maintainability. You will also help engineer tools and automation for effective service management. Collaboration is key, working across multiple functions to integrate reliability and observability best practices into the software development life cycle. By supporting governance standards set by the central teams, you will foster a culture where these principles are integral to development. Your … of Site Reliability Engineering principles, including the creation and management of effective Service Level Indicators (SLIs) and Service Level Objectives (SLOs) for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques and best practice including Splunk, New Relic, Grafana and PagerDuty. Excellent knowledge of programming languages including Python, Golang and JavaScript. Knowledge and experience of modern software development …
Stoke-on-Trent, Staffordshire, UK Hybrid / WFH Options
Signify Technology
and on-premises infrastructure with security and reliability in mind Write high-quality internal tools and automation scripts using Python Collaborate with engineering and security teams to ensure compliance, observability, and incident readiness Contribute to infrastructure as code and CI/CD pipeline improvements Monitor system health and performance, proposing and implementing enhancements Requirements: 5+ years of experience in platform …
Birmingham, Staffordshire, United Kingdom Hybrid / WFH Options
CET Structures Limited
interfaces and working with component libraries like Vuetify. Experience in writing unit and integration tests Experience working with the Azure stack is essential Experience working with Datadog or other observability platforms is desirable Interest in learning new technologies is desirable Additional Skills & Qualities Agile experience: Familiarity with Scrum, Kanban, or similar methodologies. A team player with strong communication skills for …
Stoke-on-Trent, Staffordshire, UK Hybrid / WFH Options
Tembo
Investigations: Lead technical deep-dives and spike solutions to evaluate technologies, libraries, and approaches for improving system reliability, auditing, and financial reconciliation accuracy. Open Standards: Support our commitment to observability and open standards. Contribute to initiatives around OpenTelemetry, OpenAPI, and other tools that improve transparency and traceability across services. About you At least 5 years of professional experience in software …