and implement robust engineering standards and principles. Your expertise will contribute to maintaining and enhancing automated CI/CD platforms, secure coding practices, and observability frameworks, ensuring high availability, performance, and cost efficiency. You'll champion innovation, technical improvement, and FinOps initiatives, while proactively reducing technical debt and enhancing automation.
Code Development & Review: Write, review, and maintain production-quality Python code for NLP applications, ensuring high-quality, reliable, and efficient code. Enhance Scalability and Observability: Optimize NLP solutions to be more scalable, observable, and resilient, with a focus on improving performance and monitoring in production environments. Stakeholder Communication: Serve as …
within an Agile framework to support the development and operations of web applications. Desirable Skills: Serverless & Microservices: Experience with AWS Lambda, Azure Functions, and event-driven architectures. Observability & Monitoring: Familiarity with monitoring tools like Splunk, Datadog, or New Relic for enhanced visibility and observability. Networking: Knowledge of VPCs, VPNs, and load balancing in cloud environments. GDS Standards: Awareness of GDS …
environment: Owning and improving CI/CD, IaC best practices, and incident management. Enhance our internal developer platform (Backstage), automate workflows, and lead the Observability strategy, implementing best practices for Logging, Metrics, and Tracing across the business, aligned with AWS Serverless standards. Collaborating in a high-performing team: Engage in …
to develop clean, secure, testable and maintainable code. Gain exposure to DevOps tools and processes, including containerisation (e.g., Docker), CI/CD pipelines, and observability platforms (e.g., Datadog, Grafana). Troubleshoot and resolve technical issues in collaboration with your team. Stay curious and proactive, learning new technologies and contributing ideas to …
monitoring and logging tools (Dynatrace, ELK stack, Splunk). Experience using logging to derive application insights. Consideration of non-functional requirements (security, accessibility and observability) during design and development. Solid understanding of Object-Relational Mapping principles and proficiency in JPA and Hibernate. Experience using Swagger for API documentation and coding …
Newcastle Upon Tyne, Tyne And Wear, United Kingdom
Deloitte LLP
core technologies provided by GCP/AWS, such as S3, FSx, EKS, SQS, SNS, Kinesis, Amazon MQ, DynamoDB, GKE, Cloud Storage, Pub/Sub, Filestore. Knowledge of modern observability technologies such as ELK, Splunk, Prometheus, Grafana, Micrometer. "What-if" thinking, while designing or reviewing solutions, to foresee or catch potential problems as early in …
the edge. Proficiency in Python, Docker, Linux systems, and scripting (Bash, Python). Strong expertise with infrastructure automation tools (Terraform, Ansible). Experience managing observability and monitoring systems, particularly Prometheus. Deep understanding of networking concepts and protocols. Responsibilities: Design, build, and maintain scalable and resilient infrastructure on the edge. Develop … as-code solutions using Terraform, Ansible, and scripting languages (Python, Bash). Deploy and manage containerized applications using Docker and related technologies. Ensure system observability by building and optimizing monitoring systems, particularly using Prometheus. Troubleshoot and optimize Linux-based systems (e.g., Red Hat, CentOS, Ubuntu). xAI's Grok is … technologies such as Prometheus, Grafana, and PagerDuty. Expert knowledge of deployment technologies such as Pulumi or Terraform. Expert knowledge of Kubernetes. Responsibilities: Improving our observability by adding/adjusting metrics. Building easily parsable dashboards. Designing and overseeing our on-call rotations. Improving our deployment process to increase reliability. Luminance is …
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
Automate the provisioning, scaling, and monitoring of cloud resources within Azure, implementing advanced automation and orchestration strategies to increase operational efficiency. Implement and optimise observability, logging, and alerting with tools such as Prometheus, Grafana, and OpenTelemetry. Collaborate with the application development teams to enable and facilitate best practices for containerisation … Docker, and Kubernetes (AKS), with experience in deploying, scaling, and managing containerized workloads in production environments and their supporting infrastructure layers. Experience implementing monitoring, observability, traceability, and logging stacks, including tools such as Prometheus, New Relic, Grafana, Loki, and OpenTelemetry. Strong knowledge of Cloud Native Technologies and CNCF-recommended solutions …
Senior Observability Engineer - FinTech - £90K Our client is one of the world’s leading financial technology (FinTech) companies. They’re looking to hire an experienced Monitoring/Observability engineer to grow their core Infrastructure Engineering team and build out their Observability Platform, Automation and Performance function. You’ll be a … key part of the engineering of monitoring, observability and automation solutions, covering key production systems and applications across their entire IT estate. You’ll help drive the vision, design and implementation of monitoring and observability systems including OpenTelemetry, Grafana, Prometheus and Splunk. Working side by side with DevOps teams … to revolutionise their infrastructure and re-define what they can do with monitoring, DevOps and automation tools. Requirements: Excellent previous experience in a similar Observability/Monitoring role. Experience of engineering and supporting solutions (OpenTelemetry, Grafana, Prometheus, Splunk, etc.). Experience with tools such as Jenkins, Ansible or Puppet. Good knowledge …
Senior Monitoring and Observability Engineer - FinTech Our client is one of the world’s leading financial technology (FinTech) companies. They’re looking to hire an experienced Monitoring/Observability engineer to grow their core Infrastructure Engineering team and build out their Observability Platform, Automation and Performance function. You’ll be … a key part of the engineering of monitoring, observability and automation solutions, covering key production systems and applications across their entire IT estate. You’ll help drive the vision, design and implementation of monitoring and observability systems including OpenTelemetry, Grafana, Prometheus and Splunk. Working side by side with DevOps … to revolutionise their infrastructure and re-define what they can do with monitoring, DevOps and automation tools. Requirements: Excellent previous experience in a similar Observability/Monitoring role. Experience of engineering and supporting solutions (OpenTelemetry, Grafana, Prometheus, Splunk, etc.). Experience with tools such as Jenkins, Ansible or Puppet. Good knowledge …
Birmingham, West Midlands (County), United Kingdom
Qualient Technology Solutions UK Limited
Job Description: 15+ years of hands-on experience in Terraform, CI/CD (AWS, GitHub Actions, Harness), observability (Grafana, Prometheus), application high availability and clustering, scripting (Node.js, Python, Shell), Linux debugging, customer management and enforcing best practices. Ability to define process and governance. Excellent technical and analytical skills. Java technologies. Strong communication and interpersonal …
team that provides operational support for Linux servers, networks, and AWS cloud infrastructure. Manage security vulnerabilities and implement mitigations. Implement and maintain monitoring and observability solutions. Provision infrastructure for new projects and products. Support project delivery and provide infrastructure design expertise. Maintain and improve configuration management (Puppet) and DevOps processes.
building, and maintaining the infrastructure that underpins our core services. Working closely with developers, security, and product teams, you’ll champion automation, scalability, and observability across our platform. Key Responsibilities: Design, implement, and maintain scalable cloud infrastructure using AWS and Terraform. Manage and optimize our Kubernetes clusters, ensuring high availability …
for consumption in reporting, analytics and science. Optimise data pipelines and queries for better performance and cost-efficiency. Integrate data pipelines with monitoring and observability to proactively detect and resolve issues before they impact business operations. Design and build data models for lake house storage and analytics. Implement and maintain …
taking independent decisions as well as having the ability to work cooperatively within a team. Experience working with microservice architectures and building monitoring/observability metrics. Understanding of cloud-native landscapes (AWS, Azure or GCP). Knowledge of containerized environments (Docker or Kubernetes) would be beneficial. Benefits we offer …
Out in Science, Technology, Engineering, and Mathematics
collaboration such as GitHub, ArgoCD, or similar. Experience utilizing CI/CD platforms to automate provisioning infrastructure, software builds, tests, and releases. Experience using observability tools such as APM, logging, and metrics to assist with debugging issues. Experience using Infrastructure as Code tools for provisioning infrastructure such as Terraform, CloudFormation …
Glasgow, City of Glasgow, United Kingdom Hybrid / WFH Options
Square One Resources
networking, firewalls, load balancers, application servers, etc. Exposure to test automation, test-driven development (TDD) and agile delivery practices. Understanding of monitoring and observability tools such as AppDynamics, ELK, AWS CloudWatch, AWS X-Ray, etc. Knowledge of and experience with Cloud Contact Center tools and technologies. Roles & Responsibilities: Collaborating with development and operations …
as Python, Java Spring Boot, or Unix Shell. Deep understanding of software applications and technical processes, with emerging expertise in specific disciplines. Experience with observability tools like Grafana, Dynatrace, Prometheus, Datadog, Splunk, including monitoring, SLO alerting, and telemetry collection. Knowledge of CI/CD tools such as Jenkins, GitLab, Terraform.