3 of 3 Permanent Prometheus Jobs in Cambridge

Senior Engineer - Developer Experience (DevEx)

Hiring Organisation
Complexio
Location
Cambridge, Cambridgeshire, UK
Employment Type
Full-time
... developer productivity tooling. Hands-on experience with infrastructure automation (e.g., Docker, Kubernetes, IaC with Terraform, Ansible or Pulumi). Familiarity with observability & monitoring (Datadog, Prometheus, or similar). Experience managing or improving monorepo build systems. Strong ability to measure developer productivity gaps and define KPIs. Experience in driving adoption ...

Platform Engineer

Hiring Organisation
SoCode Recruitment
Location
Cambridge, England, United Kingdom
... failure and enhancing autoscaling, high availability and managed service usage
• Collaborate with SRE, Security and Engineering teams to strengthen observability, monitoring and alerting using Prometheus, Grafana and CloudWatch
• Work closely with Security to embed best practice for IAM, secrets management, WAF and cloud posture management
• Optimise performance and cloud spend … including cluster scaling, deployment automation and monitoring
• Solid background in Linux administration, networking and cloud security principles
• Familiarity with observability tools such as Prometheus, Grafana and Loki along with structured alerting practices
• Experience with database migrations, high availability configurations, backups and disaster recovery
• Strong scripting and automation skills using Terraform ...

Cloud Platform Engineer - AWS, Degree, Cloud, Linux - Cambridge

Hiring Organisation
Adecco
Location
Cambridge, Cambridgeshire, England, United Kingdom
Employment Type
Full-time
Salary
£70,000 - £100,000 per annum
... including cluster scaling and deployment automation. Proficiency in Linux administration, networking fundamentals, and cloud security principles. Familiarity with observability stacks such as Prometheus, Grafana, and Loki, with structured alerting practices. Knowledge of database operations, including migrations, high availability, backups, and disaster recovery strategies. Skilled in automation and scripting using Terraform … improving autoscaling, high availability, and eliminating single points of failure. Work closely with SRE and Security teams to enhance monitoring and observability through Prometheus, Grafana, and CloudWatch. Embed security best practices into every layer of the platform, covering IAM, secrets management, WAF, and compliance. Drive cost efficiency and performance improvements ...