London, South East, England, United Kingdom Hybrid / WFH Options
Rise Technical Recruitment Limited
incidents The Person: * 5+ years in SRE, DevOps, or infrastructure engineering * Strong experience with AWS, EKS/Kubernetes, and Terraform * Familiar with Kafka and observability tools like Datadog or Grafana * Able to troubleshoot issues across infrastructure and application layers Reference number: BBBH259300 To apply for this role or to be considered for further roles, please click "Apply Now" or More ❯
meets at least the following requirements: Expert knowledge of Kubernetes, Expert knowledge of continuous deployment systems such as Buildkite and ArgoCD, Expert knowledge of monitoring technologies such as Prometheus, Grafana, and PagerDuty, Expert knowledge of infrastructure as code technologies such as Pulumi or Terraform. Location We hire engineers in London and in Palo Alto. We usually work from the office More ❯
to-end systems and processes Experience of network support and troubleshooting Exposure to UK and EU equity markets Desired Prior experience in a similar role Knowledge or experience of Grafana Previous experience of Binary Protocols Previous experience of the Atlassian suite of products TCP/UDP knowledge Skills: Trading Application Support - (SQL, FIX Protocol, UNIX/LINUX, Trading, FinTech, Financial More ❯
ideally in distributed, real-time systems Experience with containerisation and orchestration technologies, such as Kubernetes, in production environments Familiarity with observability tooling and practices, such as Victoria Metrics, Prometheus, Grafana, OpenTelemetry and SLOs Well-developed debugging skills with the ability to navigate unfamiliar systems, identify root causes and deliver effective solutions under time pressure Proven track record of contributing to More ❯
others in writing code that is intuitive, clear, and easy to test Developing observability for new and existing ML applications and GenAI/LLM integrations, making use of the Grafana Stack (Prometheus, Loki, Tempo) Working closely with Data Scientists and ML Engineers throughout the lifecycle of productionising their models Being responsive to incidents regarding ML applications - including an understanding of More ❯
meaningfully to the success of the team and company. Nice to Have: Experience with Nginx, proxies, and managing traffic between different services. Experience with Docker and Kubernetes. Familiarity with Grafana and other monitoring tools. Prior experience with Scala and Java is a plus. Experience with Workday and Microsoft Entra What we offer You will have the opportunity to be part More ❯
Willingness to tackle challenging problems and make meaningful contributions to the success of both the team and the organization. Nice to Have: Experience with Docker and Kubernetes. Familiarity with Grafana and other monitoring tools. Prior experience with Scala and Java is an advantage. What we offer You will have the chance to be involved in something impactful, large-scale, and More ❯
bash/shell, python or similar Desired: Graduate Graduate IT Support (SQL, FIX Protocol, UNIX/LINUX, Trading, FinTech, Financial Technology, Computer Science or Finance Degree) Dashboard creation in Grafana Previous experience with Atlassian product suite Basic networking concepts and troubleshooting - TCP/UDP, telnet, IP, Ports etc Skills: Graduate IT Support (SQL, FIX Protocol, UNIX/LINUX, Trading, FinTech More ❯
AWS & GCP - we're cloud-native Microservice based architecture Kubernetes (EKS) TeamCity for CI/CD (lots of teams are releasing code 15-20 times per day!) Terraform and Grafana Our process: Interviewing is a two-way process and we want you to have the time and opportunity to get to know us, as much as we are getting to More ❯
shedding) • Proficient with Infrastructure as Code, (such as CDK, CloudFormation, Puppet, Chef, Ansible, or similar) • Proficient with operating services in AWS • Experience with monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic or similar) • Experience scripting operating system tasks in Bash, Python, etc. Amazon is an equal opportunities employer. We believe passionately that employing a diverse workforce is central to our More ❯
SQL, and modern cloud-native tools. Automate Everything - from deployments (Docker, Kubernetes, Terraform, Azure) to testing (unit, integration, functional). Monitor & Optimize platform performance with enterprise-grade observability (Prometheus, Grafana, Loki). Troubleshoot & Support APIs and integrations using tools like Bruno, Grafana etc. Champion API Best Practices across Maersk, mentoring and collaborating with engineers in multiple teams. Implement SRE Practices … environments. Experience in cloud platforms (preferably AWS or Azure) and containerized deployments (Docker, Kubernetes). Strong background in infrastructure automation (Terraform, Ansible, or similar). Experience in observability (Prometheus, Grafana, Loki) and API troubleshooting at scale. Familiarity with Generative AI frameworks and APIs (e.g., OpenAI, Azure OpenAI, LangChain, or similar) and ability to integrate AI-driven features into developer tooling More ❯
security best practices across cloud and network environments. Troubleshoot deployment and performance issues across multiple environments. Set up and maintain observability tools for logging, monitoring, and alerting (e.g., Prometheus, Grafana, Loki). Contribute to internal tooling to streamline development, testing, and operations workflows. Stay current with DevOps trends and recommend improvements to tools and processes. Required Qualifications: Bachelor's degree … Exposure to multi-cloud or hybrid cloud architectures. Tech Stack: Cloud: AWS, OCI ZTN: Cloudflare Application: Kong (API Gateway), Java Spring Boot, Python, Go, TypeScript Monitoring: Prometheus Stack (Prometheus, Grafana, Loki) Compute: ECS, EC2, Lambda Frontend: S3, CloudFront Data: Glue, S3, PostgreSQL CI/CD: GitHub Actions IaC: Terraform, AWS SAM Why Join Us? At Intelmatix, you'll work on More ❯
and maintain Infrastructure as Code (IaC) using Terraform and Ansible. Design highly reliable, scalable, and secure infrastructure supporting performance-critical workloads. Build proactive monitoring, observability, and alerting with Prometheus, Grafana, Azure Monitor, DataDog, and Dynatrace. Troubleshoot complex system issues spanning applications, networks, and infrastructure. Define platform SLAs, SLOs, and governance standards for self-service use. Collaborate closely with Salesforce DevOps … Ansible, along with scripting in PowerShell, Python, or Bash Experience implementing GitOps workflows and managing platform SLAs, SLOs, and governance standards Familiarity with observability and monitoring tools including Prometheus, Grafana, Azure Monitor, DataDog, or Dynatrace Preferred experience supporting Salesforce DevOps pipelines and working with Java, .NET, or Node.js application environments Exposure to AI/ML platforms, real-time data pipelines More ❯
and security. Automation & CI/CD: Implement and manage CI/CD pipelines for efficient deployment, testing, and monitoring of applications. Observability & Monitoring: Develop comprehensive monitoring solutions using Prometheus, Grafana, ELK stack, or similar tools to improve system reliability. Security & Compliance: Apply best practices for cloud security, IAM policies, and compliance frameworks (SOC2, ISO 27001, etc.). Incident Response & Performance … . Proficiency in scripting and automation using Python, Bash, or Go. Experience with Infrastructure as Code (Terraform, CloudFormation, or Ansible). Familiarity with monitoring, logging, and observability tools (Prometheus, Grafana, Datadog, ELK, etc.). Strong understanding of networking concepts (VPC, Load Balancers, DNS, Firewalls). Experience with DevOps methodologies, CI/CD pipelines, and GitOps practices. Experience with high-performance More ❯
Dallas, Moray, United Kingdom Hybrid / WFH Options
Arcus Search
Kubernetes (multi-cluster setups) Support, tune, and troubleshoot Linux-based infrastructure Apply networking fundamentals to optimise workload performance Contribute to CI/CD pipelines and operational tooling (e.g., Prometheus, Grafana) What We're Looking For Strong experience with Kubernetes internals (e.g., controllers, operators) Deep understanding of distributed systems and event-driven programming Experience with batch computing, DAG workflows, or job scheduling More ❯
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Lorien
technologies. with clear progression routes available. Key Requirements: Strong troubleshooting and fault-resolution experience across infrastructure and applications Hands-on experience with monitoring tools such as Instana, Splunk, Prometheus, Grafana, or SolarWinds Confident supporting both Windows and Linux operating systems Experience working in ITIL-aligned support environments Understanding of web hosting technologies (DNS, HTTP/S, SSL Certs, and basic More ❯
of Kubernetes, containerized infrastructure, cloud platforms (e.g. GCP) Database expertise : Production experience with OSS datastores (PostgreSQL, Redis, Kafka) Observability mastery : Hands-on experience with observability stacks (Datadog, Prometheus/Grafana, OpenTelemetry or similar) Programming proficiency : Strong hands-on software engineering skills (Python, Go, Rust) Operational mindset : "You build it, you run it, you own it" philosophy with the focus on More ❯
in Computer Science or a related field (or equivalent experience). Preferred Qualifications: Full-stack data platform knowledge. Experience working with OAuth/OIDC and IAM technologies. Familiarity with Grafana, Datadog, or similar monitoring tools. Prior experience developing pipelines in Prophecy IDE. *Rates depend on experience and client requirements More ❯
tech stack: Languages: TypeScript, Javascript Libraries and frameworks: gRPC, Redux, React Native, React, Next.js Datastores: Vitess, MySQL, CockroachDB, BigQuery, Redis Infrastructure: Google Cloud Platform, Kubernetes, Docker, PubSub, Terraform Monitoring: Grafana, Prometheus, Sentry, Metabase About you: You are a frontend developer with at least 5 years' experience You are fast and love to deliver incredible code You can reduce complex problems More ❯
tech stack: Languages: TypeScript, Javascript Libraries and frameworks: gRPC, Redux, React Native, React, Next.js Datastores: Vitess, MySQL, CockroachDB, BigQuery, Redis Infrastructure: Google Cloud Platform, Kubernetes, Docker, PubSub, Terraform Monitoring: Grafana, Prometheus, Sentry, Metabase About you: You are a frontend developer with at least 2 years' experience You are fast and love to deliver incredible code You can reduce complex problems More ❯
e.g., multi-tenant PostgreSQL, sharded MySQL). Strong backend fundamentals around concurrency, caching, indexing and distributed systems trade-offs. Proven track record of setting SLOs, building dashboards (Prometheus/Grafana, OpenTelemetry, etc.) and tuning alerts. Comfort with Kubernetes, IaC and cloud-native patterns; can debug from network to application layer. Start-up bias for action: you prioritise high-leverage fixes More ❯
regulatory processes DevOps skillset (at least a selection of the skills listed below will be needed): Github Ansible Automation Platform Nexus Hashicorp Vault Zowe z/OSMF Python APIs Grafana Splunk In addition to the details listed above, the ideal candidate should have the following complementary skills (although these are not essential): Assembler, Automation, Job Scheduling, ACF2/RACF, GDPS More ❯
Bracknell, Berkshire, United Kingdom Hybrid / WFH Options
Techex
ST 2022, ST 2110) Ability to use test equipment/software to analyse MPEG streams Experience of public cloud platform architecture/design Experience with either Influx, Redis, Kafka, Grafana, Kibana Our Values and Benefits Techex has an impressive history with extremely high customer engagement and satisfaction. As a business we have developed consistently through our stellar reputation in the More ❯
REST API testing (Postman/Newman, REST-Assured) and CI/CD pipelines in GitHub Actions. Familiarity with performance/load testing tools (k6, Locust) and monitoring stacks (NewRelic, Grafana). Comfort with Windows systems engineering: registry, services, installers (MSI/Auto-Updater), PowerShell scripting. Strong analytical skills, clean coding habits, git workflows, and excellent communication skills. Nice to Have More ❯