oriented approach Maintain customer satisfaction by engaging appropriate stakeholders, removing roadblocks and advocating internally to drive product initiatives Hands-on technical troubleshooting experience via logs Experience with tools like Grafana, Splunk, Kibana, Quicksight, etc With hands-on experience with web APIs, you understand web architecture and how data passes between systems Experience using Postman/Testfully/APIDog/Postcode
a move? Get in touch and apply today! Responsibilities: Respond rapidly to critical AWS incidents, identify root causes, and deploy automated hotfixes. Lead the setup and integration of a Prometheus-Grafana observability stack. Refactor and modernize deployment pipelines using GitHub Actions and Kubernetes. Maintain robust monitoring, alerting, and CI/CD systems. Skills/Must have: Strong hands-on experience with … AWS (e.g. EC2, EKS, CloudWatch, Lambda). Background in incident, change, and problem management; comfortable with on-call rotations. Expertise in Prometheus, Grafana, and Splunk; solid knowledge of PromQL. Proficient in scripting/programming (Python, Go, Bash, SQL). Salary: £500 per day
CD pipelines (Jenkins, GitLab CI/CD or similar) * Configuring Kubernetes clusters for secure, scalable deployments * Building automation across infrastructure provisioning and testing * Implementing monitoring and alerting (e.g., Prometheus, Grafana) * Managing repositories and version control (Git) * Driving SRE practices around performance, resilience, and supportability * Working closely with dev teams to integrate platform tooling into workflows * Supporting infrastructure security, maintainability and …/SRE/DevOps roles * Strong Kubernetes experience (config and deployment) * Deep CI/CD experience - Jenkins, GitLab CI/CD or similar * Skilled with infra observability tooling (Prometheus, Grafana, etc.) * Confident with Git and repo management workflows * Strong automation mindset - reducing manual intervention wherever possible * Cloud experience (AWS, Azure or GCP) * Must be a sole UK national and eligible
Code (IaC) using Helm and Ansible Write scripts in Bash and Python to support infrastructure operations Configure and maintain containerized services using Docker Monitor system health using Prometheus and Grafana Support seamless Git-based workflows and code management Resolve issues related to networking, ingress, storage, and performance within the Kubernetes stack Enforce DevSecOps best practices across environments Required Skills: 8+ … Proficient with Linux command-line interface Scripting with Bash and/or Python Strong Kubernetes experience (troubleshooting, networking, storage) Hands-on with Docker, Helm, Ansible, Git, and Prometheus/Grafana Experience implementing Infrastructure as Code (IaC) solutions Preferred Qualifications: Experience with Atlassian tools (JIRA, Confluence) Familiarity with CI/CD pipelines and secure deployment practices CKA (Certified Kubernetes Administrator) certification
DevOps Engineer - AWS & Azure Permanent Hybrid (1-2 Days Per Month On-Site) £55,000 + Excellent Benefits Staffordshire/Derbyshire/East Midlands Are you ready to shape the future of cloud infrastructure for a leading health and welfare
solutions Mentoring junior engineers and contributing to a strong engineering culture Working with a modern, cloud-native stack Cloud: AWS (Lambda, S3, Kinesis, RDS, Step Functions, AppFlow) Monitoring: Graphite, Grafana, Splunk Bonus: Experience in marketing tech or AI What We're Looking For Strong full stack engineering experience Comfortable working without front-end frameworks Ability to mentor and support junior
evolving AI technologies, data strategies, and toolchains Bonus Points: Experience building or scaling human-in-the-loop systems or teleoperations Skilled with analytics tools and dashboard platforms (e.g. Looker, Grafana, Tableau) Background in R&D or applied ML environments with high feedback iteration cycles Experience managing distributed or international operations
authoritative data systems, and develop data models and APIs that power analytics and automation. Your work ensures the team has real-time visibility into network performance through platforms like Grafana, ClickHouse, and DOMO/GreenField. You create reliable sources of truth for topology and configuration, seamlessly integrating with CMDBs and automation tools. By engineering robust ETL pipelines and enabling closed
for ensuring code quality, including CI/CD. Familiarity with UX, accessibility, internationalization, and localization concerns and solutions. Nice to have: Experience working in a distributed team. Experience with Grafana or other monitoring platforms. Awareness of common security issues in client-side development, such as those in the OWASP Top Ten, and how to mitigate them. Even if you don
Southampton, Hampshire, United Kingdom Hybrid / WFH Options
MediaKind group
other Scrum ceremonies to ensure smooth project execution. Deployment Tools: Implement and manage deployment processes using Docker, Helm, Kubernetes, and VMs. Operational Platforms: Monitor and optimize operational environments using Grafana and Elasticsearch. Cloud Deployment: Leverage tools such as Ansible, Terraform, Cloud API, OpenStack, OpenShift, and public cloud services for cloud deployment. Verification Tools: Use Jenkins and Azure Pipelines for … the technologies below. Education: Bachelor's degree in Computer Science, Software Engineering, or a related field. Deployment Experience: Familiarity with Docker, Helm, Kubernetes, and VMs. Operational Knowledge: Experience with Grafana and Elasticsearch. Cloud Tools: Understanding of Ansible, Terraform, Cloud API, OpenStack, OpenShift, and public cloud environments. Verification Tools: Experience with Jenkins and Azure Pipelines. Configuration Management: Proficiency in Git
Cheltenham, Gloucestershire, United Kingdom Hybrid / WFH Options
TwinStream
DV Application Support Engineer - Contract (outside of IR35) Who are we: In 2019, our founders were working as engineers solving complex cross-domain problems within government organisations. TwinStream was formed to consolidate their collective expertise and experience into one business
in data governance and data quality frameworks. Knowledge of spatial data, IoT, or real-time monitoring systems in an infrastructure context. Experience with data visualisation and analytics platforms (e.g., Power BI, Grafana). Proficiency with data modelling tools and notations (e.g., UML, ERD, ArchiMate). Understanding of rail-specific systems such as signalling, asset management, and control systems.
Compose Work with LLMs, embedding models, and knowledge retrieval frameworks Develop in Python and Golang Manage CI/CD pipelines using GitLab CI Monitor infrastructure performance using Prometheus and Grafana Use Git for version control and collaboration Collaborate across engineering and security teams to ensure reliability and compliance Required Skills: 7+ years of experience and a BS degree in CS … pipelines, and model orchestration frameworks Experience with containerization and deployment in Kubernetes environments Understanding of CI/CD tools and practices, especially GitLab CI Monitoring and telemetry tools: Prometheus, Grafana Version control with Git Preferred Qualifications: Experience debugging GPU-enabled applications Familiarity with OpenAPI, HTMX, or Hyperscript Experience with Spark, Dask, or Ray for distributed data workflows Knowledge of AI
basic computer administration including software installation, system configuration, and networking. Comfort with Git and automated build pipelines (Jenkins, GitLab CI/CD, etc.) Preferred Passion for observability (Elastic, APM, Grafana, etc.) Experience integrating software with a Large Language Model (LLM) Experience with retrieval-augmented generation (RAG) Production-grade software development experience with Python Service containerization and deployment with Docker and … understanding the value of frequent iterations and user feedback Previous experience working with CNO, USCC, CNMF, or some form of cybersecurity-related past Experience with any of Elasticsearch, Kibana, Grafana, Redis, Kafka, Nginx, AWS, HAProxy Experience integrating open-source models and technologies Excellent Jira hygiene
DynamoDB). Develop CI/CD pipelines for automated API deployment using AWS services (CodePipeline, Bedrock, CloudFormation). Monitor and improve API performance, logging, and error handling using Prometheus, Grafana, and ELK stack. Work on AI-driven API integrations including Recommender Systems, LLM-based AI Agents, and AI-powered Image & Video Processing workflows. Collaborate with front-end engineers, data scientists … . CI/CD automation expertise (Jenkins, GitHub Actions, AWS CodePipeline). Proficiency in API testing frameworks (Postman, Newman, Jest, Mocha, PyTest). Strong debugging, logging, and monitoring skills (Grafana, Prometheus, ELK stack) Nice to Have Experience with GraphQL federation and Apollo Server. Knowledge of AI/ML APIs and LLM-based chatbot APIs (OpenAI, Hugging Face, LangChain). Experience
About the role: We're looking for a Lead Java Engineer to join our BX Online Loyalty Team, driving forward a bold, innovative new loyalty proposition for H&B customers. You'll be joining us at an exciting moment
Automation Tester (DV Security Clearance) Position Description: CGI was recognized in the Sunday Times Best Places to Work List 2025 and has been named one of the 'World's Best Employers' by Forbes magazine. We offer a competitive salary, excellent
and system behavior, pulling data from pods and containers using tools like kubectl, Event Viewer, and central logging platforms. Monitor application health, performance, and availability, leveraging observability platforms like Grafana and Kibana. Test and interact with API endpoints, documenting and validating their functionality using tools like Swagger/OpenAPI and Postman. Respond to support tickets promptly within defined SLAs, providing … RESTful APIs using Swagger/OpenAPI and Postman. Elastic Stack: Familiarity with Elasticsearch for log aggregation, indexing, and querying. Observability: Good understanding of monitoring and visualization concepts; experience with Grafana. Scripting & Automation: Automation-first mindset, can streamline tasks. PowerShell experience is a plus. Diagnostics: Skilled with Event Viewer, IIS Manager, SQL Server Management Studio, Linux command line tooling, and container
Docker, SELinux, FIPS, Red Hat Satellite, Infrastructure as Code (IaC): Terraform, SALT and Ansible, Prometheus and Grafana, Bash, Bourne, or C. Due to federal contract requirements, United States citizenship and an active TS/SCI security clearance and polygraph are required for the position. Required: Must be a US Citizen Must have TS/SCI clearance w/active polygraph This … Satellite Familiarity with cloud computing environments like AWS Experience with Infrastructure as Code (IaC) products like Terraform, SALT and Ansible as well as system monitoring platforms like Prometheus and Grafana Experience with SELinux and FIPS $125,000 - $230,000 a year The pay range for this job, with multi-levels, is a general guideline only and not a guarantee of
on AWS and other providers Operating MongoDB (or other document database) clusters Operating Redis (or other key-value storage) clusters Administering Linux servers Maintaining distributed software Operating Prometheus and Grafana Operating logging collection and analysis systems Participating in the on-call rotation (04:00 - 16:00 UTC) Skills: Kubernetes & containers (advanced) AWS/EKS (advanced) Linux (advanced) Terraform and IaC … in general (proficient) Helm (proficient) Go and/or Python (familiar) MongoDB (or similar) Redis (or similar) Monitoring - Prometheus, Grafana, Thanos (familiar) Grasp of networking concepts (subnets, routing, peering, load balancing, NAT, etc.) Common networking protocols (DNS, TCP/IP, HTTP, TLS, UDP) Proactive, energetic, innovative and change-oriented Nice to have: GCP or Azure Bare metal infrastructure engineering API
multi-tenancy and environment rationalization to reduce duplication and inefficiency. Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience by enabling actionable monitoring and alerting. Drive cloud cost visibility and optimization efforts across engineering through … operating developer platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire, and develop talented platform engineers with
Dublin OR Belfast Hybrid - 2 days/week in the office €80k-€95k/Year + Benefits My client is undergoing a major database restructuring initiative, revamping key performance metrics and the entire DB structure. This is a role with significant
Department: Tech Services Location: SEGA West London Reporting To: Head of Corporate Infrastructure Position Overview: We are seeking an experienced Senior Build + Release Engineer with games industry experience to design, deploy, and maintain our CI/CD and build