City of London, London, England, United Kingdom Hybrid / WFH Options
TalentTrade Recruitment Limited
secrets management. Good experience with continuous integration and continuous deployment (CI/CD) pipelines with GitHub Actions. Familiarity with monitoring and logging tools relevant to distributed systems (e.g., Prometheus, Grafana, ELK stack). Experience with scripting languages such as Bash or Python for automation tasks. More ❯
EC3N, Tower, Greater London, United Kingdom Hybrid / WFH Options
TalentTrade Recruitment Limited
secrets management. Good experience with continuous integration and continuous deployment (CI/CD) pipelines with GitHub Actions. Familiarity with monitoring and logging tools relevant to distributed systems (e.g., Prometheus, Grafana, ELK stack). Experience with scripting languages such as Bash or Python for automation tasks. More ❯
multi-tenancy and environment rationalization to reduce duplication and inefficiency. Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience by enabling actionable monitoring and alerting. Drive cloud cost visibility and optimization efforts across engineering through … operating developer platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire, and develop talented platform engineers with More ❯
support highly available telephony solutions using AudioCodes and Oracle SBCs Develop scripts, tools, and APIs to improve SIP routing, call flows, and automation Integrate telephony with monitoring platforms like Grafana and ThousandEyes Collaborate with carriers to support SIP infrastructure and hybrid voice networks Contribute to hybrid cloud telephony solutions across UCaaS and CCaaS platforms Participate in Agile sprints and support More ❯
multi-account AWS setups. Extensive experience with AWS Organisations. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Working with Control Tower and Landing Zones. Why Work For Us? Competitive base salary up to More ❯
Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Why Work For Us? 25 days holiday + bank holidays Up to 5% employer pension contribution Educational More ❯
requirements). Preferred Qualifications: Certifications in GCP. Familiarity with Azure DevOps Pipelines is a plus. Experience with multi-cloud and hybrid cloud environments. Experience with Elastic (or OpenSearch) and Grafana. Knowledge of ServiceNow for change management and incident management. Familiarity with observability tools and practices for 24x7x365 monitoring and alerting. Identity and Access Management experience is a plus for this role More ❯
Bradford, Yorkshire, United Kingdom Hybrid / WFH Options
Yorkshire Building Society Group
skills in the following: Continuous Integration/Continuous Delivery pipelines - tools such as Jenkins & GitLab. Scripting and automation capabilities. Modern monitoring skills and best practices using tools such as Grafana, Prometheus, Kibana, Dynatrace. Testing frameworks. Knowledge of networks and routing. Knowledge of integrations of services utilising different technologies such as PL/SQL, .NET, C#, Java, Spring Boot, Spring Batch. Experience of More ❯
Fine Tuning: Drive the deployment and fine tuning of large language models (LLMs) while ensuring efficient training pipelines and model hosting. Monitoring & Performance Optimization: Implement monitoring (using Prometheus/Grafana and similar tools) and logging solutions to ensure system reliability and to optimise model throughput. Collaborate Across Teams: Work closely with Machine Learning engineers to enable their delivery What We More ❯
Airflow, or on common problems such as model and API monitoring, data drift and validation, autoscaling, access permissions Have previously worked with monitoring tools such as New Relic or Grafana Understand the use of feature stores and related data technologies for operational machine learning products Are proficient with Python and have Spark knowledge. Have leadership experience either through previous management More ❯
to managing our infrastructure, using Terraform. - We follow a GitOps approach to managing our Kubernetes configuration, using ArgoCD and Helm. - We manage a high-availability metrics collection system using Grafana, Thanos & Prometheus. We're in the process of transitioning to OpenTelemetry and Honeycomb for our application telemetry (traces and metrics). - We manage a data pipeline using Pub/Sub More ❯
provided by GCP/AWS, such as S3, FSx, EKS, SQS, SNS, Kinesis, Amazon MQ, DynamoDB, GKE, Cloud Storage, Pub/Sub, Filestore. Knowledge of modern observability technologies such as ELK, Splunk, Prometheus, Grafana, Micrometer. "What-if" thinking while designing or reviewing solutions, to foresee or catch potential problems as early in the development process as possible. Nice to have: Good knowledge of More ❯
primary language for our backend codebase AWS & GCP - we're cloud-native Kubernetes (EKS) Microservice based architecture RESTful APIs PostgreSQL, JDBI, Flyway TeamCity for CI/CD Terraform and Grafana The Team: The Core Banking group is seeking passionate engineers ready to tackle complex challenges and contribute to foundational systems, powering modern banking, that process millions of transactions daily, ensuring More ❯
to offer 4-day working weeks and part-time options can also be considered. Requirements: - Active DV Clearance - Kubernetes - Terraform - Strong knowledge of monitoring tools such as Prometheus or Grafana - Python or other scripting language. If you're a DevOps engineer looking for a contract offering £500 - £550 a day outside IR35, then send an updated CV to The client is More ❯
Nottingham, Nottinghamshire, East Midlands, United Kingdom Hybrid / WFH Options
Oscar Associates (UK) Limited
Azure AD) and support security/compliance across the estate Troubleshoot and manage network infrastructure - Cisco switches, VLANs, firewalls Support backup (Rubrik), DR (Zerto), and monitoring tools (Dynatrace, Zabbix, Grafana) What We're Looking For: Strong hands-on experience with Linux in enterprise environments Solid background in escalated infrastructure support (3rd/4th line) Scripting and automation skills (Bash, Python More ❯
predictive analytics. Understanding of AI frameworks and libraries (e.g., TensorFlow, PyTorch, Scikit-learn) and their application in network automation and monitoring. Experience with telemetry and observability frameworks (e.g., Prometheus, Grafana) for real-time network monitoring and troubleshooting. Experience: Minimum of 7 years of experience in network engineering, operations, and support. Proven ability to work hands-on and take strong technical More ❯
for you. Ideally you have several years' experience using Go in production. You'll be comfortable with Docker, and familiar with modern observability tools such as Prometheus, Alertmanager, Grafana and X-Ray/Tempo/Jaeger. We're looking for 3+ years tackling hard backend problems. Seasoned database experience - we use MySQL, DynamoDB, Elasticsearch and Redis. Experience with microservices More ❯
Private Networks, DWDM and Optical Networking, Data Centre builds and design fundamentals, etc. Experience with network modelling. Eagerness to learn new technologies and mentor others. Experience with Telemetry: Splunk, Grafana, Humio. Experience with continuous integration and deployment tools. Experience implementing, maintaining and troubleshooting MPLS, BGP, OSPF, IGMP, PIM related internal and external network routing issues in a production environment. Knowledge More ❯
Willingness to tackle challenging problems and make meaningful contributions to the success of both the team and the organization. Nice to Have: Experience with Docker and Kubernetes. Familiarity with Grafana and other monitoring tools. Prior experience with Scala and Java is an advantage. What we offer You will have the chance to be involved in something impactful, large-scale, and More ❯
Bracknell, Berkshire, United Kingdom Hybrid / WFH Options
Techex
Experience of public cloud platform architecture/design. CCNP or higher/equivalent non-Cisco qualification (Routing & Switching or Data-Centre/SDN). Experience with either Influx, Redis, Kafka, Grafana, Kibana. Our Values and Benefits: We have secured Great Place to Work accreditation for the past two years and we seek out individuals who enjoy developing their professional skills, are More ❯
plus. Knowledge of Redis and log queries is a plus. Experience in automations/AI would be an advantage. Experience administering multiple monitoring systems such as Datadog, New Relic, Kubernetes, Grafana and Elastic Cloud. Experience with Cloud Computing, AWS, Microservices Architecture, Unix and Linux Systems. Life @ Empowered to think big. Try new opportunities while working with a talented, ambitious and supportive More ❯
Liverpool, Lancashire, United Kingdom Hybrid / WFH Options
Onyx-Conseil
agile environment. Familiarity with modern security practices (OAuth, JWT, etc.). Collaborative mindset with the ability to work across disciplines. Bonus: Experience with Docker, RabbitMQ, MediatR, MassTransit, Keycloak, Loki or Grafana. Why Join Us? Modern tech stack and a strong culture of engineering excellence. Hybrid working. Career progression into senior leadership or architecture. Supportive environment with clear standards and plenty of More ❯
Liverpool, Merseyside, North West, United Kingdom Hybrid / WFH Options
Acorn Insurance
agile environment. Familiarity with modern security practices (OAuth, JWT, etc.). Collaborative mindset with the ability to work across disciplines. Bonus: Experience with Docker, RabbitMQ, MediatR, MassTransit, Keycloak, Loki or Grafana. Why Join Us? Modern tech stack and a strong culture of engineering excellence. Hybrid working. Career progression into senior leadership or architecture. Supportive environment with clear standards and plenty of More ❯