Eccles, City and Borough of Salford, Greater Manchester, United Kingdom Hybrid / WFH Options
Rebel Recruitment Limited
developers in the team, to automate their deployments. You’ll have experience with DevOps tools, including Terraform for IaC, Docker for containerisation, Kubernetes (k8s) for orchestration, monitoring tools like Datadog, etc. Understanding what it takes to make sure their systems, applications, etc. are secure, scalable, and resilient, with plenty of redundancy baked in, you’ll lean on your networking, traditional More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Morela
and SRE teams to embed observability into the full delivery lifecycle Skills & Experience: Strong background in observability, monitoring, and event management Hands-on experience with platforms such as Dynatrace, Datadog, AppDynamics, Splunk, Prometheus, Grafana, New Relic, or Elastic Experience building integrations and automation using APIs, Python, Node.js, Go, or scripting Familiarity with AIOps platforms (BigPanda, Moogsoft, etc.) Knowledge of ITSM More ❯
London, South East England, United Kingdom Hybrid / WFH Options
Prolific
Services (AWS) Programming Languages: Python, JavaScript, TypeScript Frameworks: Django Rest Framework, Serverless architectures, container-based services Databases: MongoDB, DynamoDB, Postgres DevOps & Monitoring Tools: CircleCI, GitHub Actions, Kubernetes, Celery, EventBridge, Datadog Join us at Prolific and play a critical role in shaping the human data infrastructure that is powering the next generation of AI innovation. Apply now and let's build More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Michael Page Technology
The role of a Platform Support Engineer involves providing excellent technical support and maintenance for platform solutions within the technology and telecoms industry. You will ensure the smooth operation of systems, troubleshoot issues, and deliver high-quality service to internal More ❯
Burton-on-Trent, Staffordshire, England, United Kingdom
Crimson
and manage secure, scalable AWS infrastructure. Build and maintain CI/CD pipelines using Azure DevOps, GitHub Actions, or Jenkins. Set up monitoring, alerting, and logging with tools like Datadog, LogicMonitor, and SolarWinds. Strong grasp of DevOps principles; hands-on CI/CD experience. Microsoft Certified: DevOps Engineer Expert (AZ-400). Design and deploy containers on AKS/ More ❯
Burton-On-Trent, Staffordshire, Burton upon Trent, United Kingdom Hybrid / WFH Options
Crimson
and manage secure, scalable AWS infrastructure. Build and maintain CI/CD pipelines using Azure DevOps, GitHub Actions, or Jenkins. Set up monitoring, alerting, and logging with tools like Datadog, LogicMonitor, and SolarWinds. Strong grasp of DevOps principles; hands-on CI/CD experience. Microsoft Certified: DevOps Engineer Expert (AZ-400). Design and deploy containers on AKS/ More ❯
Jersey City, New Jersey, United States Hybrid / WFH Options
ArborTekSystem
and Ontological processes). Technical Skills: Five or more years of experience with Python, SQL, and data visualization/exploration tools Full stack observability lead with Splunk (preferred)/Datadog, Infra monitoring, App onboarding and APM experience Proficiency in observability tools: familiarity with tools for logging, metrics, and tracing, such as ELK Stack, Splunk, and distributed tracing systems. More ❯
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Interquest
d also have the opportunity to mentor other team members and collaborate with product managers. Skills: TypeScript (Node, React) AWS (Lambda, Fargate, S3, Dynamo, EventBridge etc.) Observability tools (Datadog, Dynatrace, Honeycomb, CloudWatch etc.) The money is good too – up to £70k plus benefits including hybrid working (2 days per week in Manchester) and a 2pm finish every Friday. If More ❯
Derby, Derbyshire, East Midlands, United Kingdom Hybrid / WFH Options
Experis
secure infrastructure using Terraform and other IaC tools. Own the CI/CD pipeline strategy using Azure DevOps, GitHub Actions, or similar. Set up monitoring, alerting, and logging frameworks (Datadog, LogicMonitor, SolarWinds). Collaborate closely with Cloud and FinOps teams to align infrastructure, cost optimisation, and delivery. Lead incident response, root cause analysis, and post-mortem processes. Mentor engineers and More ❯
Cloud DevOps, SaaS, or observability, with 5+ years in leadership roles. Strong hands-on experience with AWS, GCP, Azure, K8S, Terraform and observability tools: Prometheus, Grafana, OpenTelemetry, ELK, Splunk, Datadog, and similar. Proficiency with metrics, logs, traces and APM. Leadership & Global Operations Proven success leading multi-regional or global technical teams with direct management of managers. Demonstrated ability to build More ❯
AutoSys Experience using cloud infrastructures such as AWS or Azure Experience working in a secure, multi-data center environment Experience with other monitoring tools such as Dynatrace, Paessler, or Datadog Splunk Enterprise Certified Architect Understands how to resolve conflicting priorities and objectives with grace and professionalism Knows to look ahead and think of solutions that benefit the environment in the More ❯
Accreditation Council for Graduate Medical Education
technical, ambiguous domains. Strong knowledge of REST APIs, distributed system design, and performance optimization. Experience with both SQL and NoSQL data stores, caching layers, and observability tooling (e.g., Prometheus, Datadog). Nice To Have Experience deploying or integrating LLMs or NLP models in production systems. Comfortable balancing short-term execution with long-term architectural thinking. Passion for building highly-available More ❯
via Kafka (Confluent Cloud), infrastructure automation with Pulumi (TypeScript), our infrastructure is hosted on AWS (most used: ECS, S3, DynamoDB, Aurora, OpenSearch), GitHub Actions for builds and workflow automation, Datadog for monitoring and alerts. About The Role The role of platform engineer in domain is to support the teams using the platform and to evolve it with growing business and More ❯
Create and maintain IaC solutions with tools like Terraform. Partner with development teams to enable scalable microservices, primarily using Python. Implement and oversee observability tools such as New Relic, Datadog, Splunk, and AWS CloudWatch. Configure and troubleshoot networking components including VPNs, load balancers (NLB/ALB), HTTPS, TLS, and CDNs. Ensure system reliability and performance within Unix/Linux environments. More ❯
mindset geared towards enabling internal engineering teams Platform Engineer nice to have Exposure to AWS and use of the Cloud Development Kit (CDK) Previous experience maintaining observability stacks (e.g. Datadog) Background in applying security-first approaches to cloud architecture Aimtech Recruitment is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. More ❯
A track record in mentoring other engineers, leading cross-team projects without authority, and driving design and technology decisions. Technologies we use (nice to have experience) Monitoring and alerting: Datadog, Falcon LogScale (formerly Humio) Database management systems: PostgreSQL, ClickHouse Deployment tools: Flux, Helm, Kustomize Frontend frameworks: React, Angular Infrastructure as code: Terraform, Terragrunt Cloud provider: AWS Event streaming platform: Kafka More ❯
Bethesda, Maryland, United States Hybrid / WFH Options
ALTA IT Services
maintain multiple cloud systems for the client. RESPONSIBILITIES: • Primarily responsible for the client's cloud infrastructure architecture and associated observability/instrumentation of various services using tools such as Datadog, Dynatrace, or similar APMs. • Develop a cloud services delivery and operational model, keeping track of cloud activities, developing and moving applications to the cloud, and specifying computing demands. • Provide advice More ❯
Fort George G Meade, Maryland, United States Hybrid / WFH Options
August Schell
Confluent, Kubernetes operators. • Experience creating data partitioning strategies and monitoring topics for performance. • Experience deploying and upgrading Kafka clusters in high availability containerized environments. • Experience utilizing observability platforms (Elastic, Datadog, etc.) to configure monitoring and alerting for data pipelines to ensure high availability, high throughput, and low latency. • Knowledge of stream processing pipelines and analytics. • Experience with Apache NiFi, multi-cluster More ❯
customers. Collaborate with cross-functional teams to shape and refine foundational capabilities. Own your work from concept to deployment and beyond, digging into production issues using tools like Honeycomb, Datadog, Grafana, and Rollbar to ensure system health. Write clear, maintainable, and well-documented Go code, with observability and long-term maintainability built in. Participate in architectural decisions and technical strategy More ❯
Security Expertise in Palo Alto Firewalls including policy configuration, threat prevention Network segmentation, zero-trust frameworks, and IAM integration Cloud native Web Application Firewalls Tools and Monitoring Monitoring solutions: Datadog, Stackdriver, PA Panorama, or equivalent Has strong practical experience with DevOps tools and methods, like CI/CD, Git, IaC (Terraform) Working and collaborating with Agile Teams (Squad) Good understanding More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
InterQuest Solutions
Pair Programming. As a Senior Software Engineer, you'd also have the opportunity to mentor other team members and collaborate with product managers. Skills: TypeScript (Node, React) Observability tools (Datadog, Dynatrace, Honeycomb, CloudWatch etc.) The money is good too - up to £70k plus benefits including hybrid working (2 days per week in Manchester) and a 2pm finish every Friday. InterQuest More ❯
City of London, London, England, United Kingdom Hybrid / WFH Options
Eligo Recruitment
and capacity planning for mission-critical systems Develop secure backup, recovery, and disaster recovery procedures Explore multi-tenant and sharded architectures to support growth Implement monitoring strategies using Grafana, Datadog, and CI/CD integrations Champion database best practices, mentor teams, and standardize tooling and automation What You’ll Bring Extensive experience managing cloud-hosted PostgreSQL at scale Proficiency in More ❯
Plotly, Tableau, Looker, Grafana, Power BI) Able to write concise reports with actionable insights - weekly summaries, defect overviews, quality scorecards, etc. Familiar with log analysis tools (e.g., Sumo Logic, Splunk, Datadog, Kibana, Elasticsearch) Comfortable discussing and designing instrumentation/logging with engineers Familiarity with QA concepts, release validation, and production monitoring Strong communication skills; can adapt output to technical and non More ❯
Kubernetes is a plus Knowledge of Redis and log queries is a plus Experience in automations/AI would be an advantage Experience administering multiple monitoring systems such as Datadog, New Relic, Kubernetes, Grafana and Elastic Cloud Experience with Cloud Computing, AWS, Microservices Architecture, Unix and Linux Systems Life @ ** Empowered to think big. Try new opportunities while working with a talented More ❯