other internal teams to fully understand client requirements and deliver tailored technical solutions. Design and implement scalable, future-proof architectures for new third-party connectors and integrations. Enhance system observability by improving diagnostics, logging, and tracing to aid technical support teams in resolving issues swiftly. Oversee the ongoing development and management of the public API, covering REST and event streaming More ❯
you thrive in a fast-paced environment where you can make a real difference, we want to hear from you! Required skills/expertise: Develop and implement a comprehensive observability strategy for self-hosted deployments, including infrastructure and tooling for monitoring, alerting, and troubleshooting. This will involve designing and implementing robust metrics and logging systems. Engineer the ACRA platform for More ❯
applications of AI for the construction domain, pushing the boundaries of what's possible. Build core infrastructure that allows us to build and ship LLM apps quickly - this includes observability and how we work with several LLM providers and our own fine-tuned models. Work with other engineers in the product and research teams to bring new models and applications to More ❯
DevOps, infrastructure, and platform engineering. Tech Stack Cloud: AWS (EC2, RDS, S3, IAM, CloudWatch, Lambda) Infrastructure as Code: Terraform Containerisation & Orchestration: Docker, Kubernetes (EKS), Helm Configuration Management: Ansible Monitoring & Observability: Grafana, Prometheus CI/CD: GitHub Actions Automation & Scripting: Python, Bash, Go or Java What We’re Looking For Proven experience running AWS cloud infrastructure in a production or regulated … financial) environment. Hands-on experience managing Kubernetes clusters (preferably EKS). Strong understanding of Infrastructure as Code using Terraform. Familiarity with monitoring and observability stacks such as Prometheus and Grafana. Experience building and maintaining CI/CD pipelines (GitHub Actions or similar). Strong scripting or automation skills using Python, Bash, Go or Java. A collaborative mindset, comfortable working More ❯
AWS (Core Services – EC2, RDS, S3, IAM, Lambda, CloudWatch) Infrastructure as Code: Terraform Containerisation & Orchestration: Docker, Kubernetes (EKS), Helm Configuration Management: Ansible CI/CD Pipelines: GitHub Actions Monitoring & Observability: Grafana, Prometheus Scripting/Automation: Python or Java What We’re Looking For Proven experience managing and scaling AWS cloud environments, ideally supporting live software products or high-traffic platforms. … Strong background in Terraform and Infrastructure as Code best practices. Practical experience with Kubernetes (EKS) in production. Familiarity with monitoring and observability tools such as Grafana and Prometheus. Hands-on experience building CI/CD pipelines (GitHub Actions, Jenkins, CircleCI, etc.). Solid scripting and automation experience using Python or Java. A collaborative engineer who enjoys working closely with More ❯
London, South East England, United Kingdom Hybrid/Remote Options
Black Pen Recruitment
tooling, systems design, and operational resilience. Their environment offers opportunities to work on everything from CI/CD pipelines and container orchestration to configuration management, infrastructure as code, and observability tooling. While you may bring experience in specific tools or platforms, you will be expected to contribute broadly across our infrastructure landscape. Our client's core product is a comprehensive … Solid Linux administration and general networking knowledge Understanding of infrastructure security best practices, including secure configuration, identity and access management, and compliance controls Experience with monitoring, alerting, and system observability Background in financial services infrastructure is advantageous but not required More ❯
pipelines, reducing deployment time and improving release reliability Strengthen system resilience through infrastructure improvements and scalability planning Work with Product Engineers to enhance developer experience Drive automation and observability Requirements: Strong GCP experience Deep understanding of Terraform CI/CD pipelines Containerisation (Kubernetes, GKE) If you're interested, get in touch ASAP More ❯
to build cost-effective solutions on Microsoft Azure while maintaining agility and fostering innovation. This position is perfect for engineers who are passionate about optimising cloud usage, enhancing cost observability, and championing a FinOps culture. Experience in some of the following would be ideal Partner with engineering, finance and product teams to drive cost-efficiency across Azure Clear understanding More ❯
Employment Type: Permanent
Salary: £75,000 - £85,000/annum + car allowance + bonus + benefits
City of London, London, United Kingdom Hybrid/Remote Options
ARC IT Recruitment Ltd
/MTTR via automation, clear SLAs, and robust RCAs/post-mortems. Safer, faster releases (blue/green, canary, feature flags) in partnership with Trading, Quant, and Engineering. Mature observability (logs/metrics/traces), capacity planning, and performance tuning for low-latency flows. Strong production hygiene and controls aligned to MiFID II/MAR/best-ex. Leadership of More ❯
hands-on experience in Microsoft Azure ML Studio * Experience using business intelligence tools, preferably Power BI * Experience applying Generative AI and prompting techniques * Strong understanding of data governance, model observability, and compliance frameworks * Proven ability to deliver secure, scalable, and responsible data science solutions If this sounds like you and you are available on short notice, apply now More ❯
practices for automation tools such as Power Automate Desktop. * Build out robust ALM processes using Azure DevOps or GitHub - including pipelines, solution management, environment variables, and connection references. * Implement observability and monitoring through Application Insights, Azure Monitor, and alerting frameworks. * Design secure integration layers using Azure services such as API Management, Service Bus, Functions, Logic Apps, and Key Vault. * Lead More ❯
East London, London, United Kingdom Hybrid/Remote Options
Client Server
crypto offering and split your time between hands-on development with people management (70/30). You'll set the technical direction, mentor engineers and ensure code quality, observability, scalability and security are embedded into high-quality, high-impact releases. You'll be working with a modern, cloud native tech stack using Java, Spring Boot, AWS, Kafka and CI More ❯
and coach junior and transitioning data engineers to accelerate their development and strengthen the team’s overall capabilities. Lead production operations by enforcing standards around testing, CI/CD, observability, and documentation to ensure platform reliability and regulatory compliance. Collaborate effectively with business clients and cross-functional teams to translate requirements into technical solutions and drive innovation across BNY. To More ❯
the rapid and efficient development of new third-party connectors. Ensure the system's interfaces, testing protocols, and designs are robust and future-proof. Monitoring and Diagnostics: Enhance system observability and streamline the diagnosis of technical issues through advanced logging and tracing capabilities, aiding front-line technical staff. API Development and Management: Oversee the development and maintenance of our public More ❯
TW7 5QD, Syon, Greater London, United Kingdom Hybrid/Remote Options
Sky
maintain platform stability. Take ownership of production support, ensuring swift investigation and resolution of live issues through established support and release processes. Champion best practices in architecture, software engineering, observability, and performance optimisation. Contribute to technical direction, identifying and implementing improvements to tooling, workflows, and development frameworks. What you'll bring Extensive experience developing Lightning.js applications for TV or embedded More ❯
London, South East England, United Kingdom Hybrid/Remote Options
Fresha
projects autonomously. Developer Experience - Extend our local development experience offerings for engineers Knowledge Sharing - Enrich knowledge across the department by creating Documentation, SOPs, Runbooks and fascinating knowledge-sharing sessions Observability - Extend Monitoring & Observability capabilities. Accessibility - Simplifying the process for engineers to access this data Collaboration - Collaborate and enable engineers to do their jobs more efficiently Efficiency - Developing tools to maximise More ❯
their core software products. Expect a collaborative engineering culture, modern cloud-native stack, and plenty of freedom to influence tooling, architecture, and reliability practices. If you're passionate about automation, observability, and designing systems that just don't fail, this is the perfect environment for you. Tech Stack Cloud: AWS (EC2, RDS, S3, IAM, Lambda, CloudWatch) Containerisation & Orchestration: Docker, Kubernetes (EKS) Infrastructure … as Code: Terraform Configuration Management: Ansible Monitoring & Observability: Prometheus, Grafana, ELK Stack CI/CD: GitHub Actions Scripting & Automation: Python, Bash, or Go What You'll Be Doing Designing and maintaining reliable, scalable, and secure infrastructure for production systems. Automating operational tasks and improving system efficiency. Implementing observability tooling to monitor system health, performance, and capacity. Working closely with development teams … how reliability and performance are engineered at scale. Work with talented developers and DevOps engineers in a collaborative environment. AWS | Site Reliability | SRE | Cloud | Kubernetes | Terraform | CI/CD | Observability | Python | Go | Automation Click APPLY NOW to be considered for this position! Follow ReVybe IT Recruitment to stay up to date with the latest Cloud, Platform & SRE opportunities. More ❯
optimise BI dashboards and data products using Tableau, translating business needs into visual insights. Orchestrate and monitor data pipelines, ensuring data quality and timely delivery. Implement data quality checks, observability, and maintain data cataloging and lineage. Drive CI/CD practices using GitHub Actions or similar tools. Collaborate with cross-functional teams to improve platform capabilities and analytics maturity. Requirements More ❯
London, South East England, United Kingdom Hybrid/Remote Options
Mercor
limited license for evaluation and—if mutually agreed—training on selected code segments. Have deep knowledge of your system's architecture, tooling, and standards (Git, CI/CD, testing, observability). Are fluent in one or more of: Python, Java, C/C++, JavaScript, TypeScript (others welcome). Can collaborate asynchronously with researchers/engineers and move quickly with minimal More ❯
London, South East, England, United Kingdom Hybrid/Remote Options
Addition
trusted C-level relationships. Confident presenter with a consultative approach tailored to enterprise telecom clients. Experience working with nearshore/offshore delivery models is a plus. Knowledge of AIOps, observability, network automation, or platform engineering is advantageous. What’s in It for You Join a global team of 6,000+ technologists, with autonomy to shape growth in a critical sector. More ❯
aggregates). Collaborate with analysts, BI users, data scientists, and business stakeholders to translate data requirements into reliable data products (tables, views, metrics). Ensure data quality, consistency, and observability (tests, monitoring, alerting). Optimize SQL queries and transformations for performance in your data warehouse/lakehouse environment. Support or own CI/CD workflows around analytics (e.g. git, reviews More ❯