highest-risk, most complex or ambiguous platform work. Build and enhance Azure landing zones and internal platform services. Deliver infrastructure-as-code, CI/CD, self-service tooling and observability end-to-end. Challenge design assumptions with a “show me the code” approach. Pair with engineers to unblock delivery and lift team-wide engineering standards. Non-Negotiables: 8+ years infrastructure … Azure Core: networking, security, identity, governance, landing zones. Azure PaaS: App Service, Functions, container platforms (ACA/AKS). CI/CD: GitHub Actions or Azure DevOps with full automation. Observability: logging, metrics, dashboards and alerting. Incident Response: diagnosing and resolving complex platform issues. Why Join: Shape a secure, scalable Azure platform in a regulated financial services environment. Own complex landing …
scalability, and cost efficiency of Databricks clusters and workflows. Collaborate with cross-functional teams to support analytics, machine learning, and business intelligence use cases. Implement data governance, lineage, and observability best practices using tools such as Unity Catalog, DataHub, or Collibra. Mentor junior engineers, fostering best practices in data engineering, testing, and DevOps for data (DataOps). Stay current with … and ability to collaborate across teams. Preferred Qualifications: Experience with Databricks Unity Catalog and Delta Live Tables. Familiarity with streaming frameworks (Structured Streaming, Kafka, etc.). Background in data observability and metadata management. Exposure to machine learning pipelines or MLflow within Databricks. Knowledge of infrastructure as code (IaC) using Terraform or similar tools. To apply for this role please email …
with Azure. Ability to troubleshoot build failures, manage YAML pipeline configurations, support deployment processes across Azure environments, manage service connections, and collaborate with development teams on release automation. Monitoring & Observability - Proficient in implementing and managing Azure Monitor, Log Analytics workspaces, Application Insights, and Azure dashboards. Experience creating alert rules, action groups, workbooks, and analysing metrics and logs using KQL (Kusto … Query Language). Skilled in performance troubleshooting, implementing Azure Service Health monitoring, and setting up distributed tracing. Ideally, knowledge and experience of Datadog observability tooling. Security & Compliance - Strong understanding of Azure security best practices including Azure Security Center/Microsoft Defender for Cloud, encryption using Azure Key Vault, network security with NSGs and Azure Firewall, Azure Policy for governance, and …
Edinburgh, Scotland, United Kingdom Hybrid/Remote Options
Explore Group
security in mind. Contribute to and influence system-level architecture decisions involving microservices, APIs, and multi-tenant deployments. Drive engineering best practices for code quality, CI/CD pipelines, observability, and operational excellence. Mentor engineers, foster technical growth, and build a culture of collaboration and accountability. Lead end-to-end delivery, ensuring projects meet both product and technical excellence standards. … 8+ years’ engineering experience, including significant exposure to distributed or cloud-native systems. Proven ability to lead complex technical initiatives, from design through delivery. Deep understanding of system scalability, observability, and performance optimization. Comfortable making architectural trade-offs and communicating them to both technical and non-technical stakeholders. Experienced in mentoring engineers and driving continuous improvement across teams. Passionate …
Bristol, South West England, United Kingdom Hybrid/Remote Options
IVC Evidensia
technical growth and leadership development. Actively contribute to codebases when needed to guide or unblock high-impact work. Collaborate closely with Product, UX, Data, and Platform Engineering teams. Champion observability best practices, including metrics, logging, and tracing. Keep up to date with industry trends to influence our technology strategy. What You'll Bring: We're looking for a hands-on … sub platforms (e.g., SNS/SQS, EventBridge). Commitment to quality development practices: TDD, code reviews, design patterns. Strong mentoring and leadership experience within high-performing teams. Solid understanding of observability tooling and incident response. You Matter to Us. Benefits: At IVC Evidensia, our people are at the heart of everything we do. That's why we invest in your well …
City of London, London, United Kingdom Hybrid/Remote Options
Hargreaves Lansdown
/UX architecture. Comfortable guiding teams through design implementation, collaborating with product and design using tools like Figma. Familiar with cloud-native environments (AWS, Docker, Kubernetes) and observability tools like Prometheus and Grafana. Champions quality and security, embedding testing and scanning into development pipelines. Passionate about mentoring engineers, conducting code reviews, and fostering a culture of continuous … Android); JavaScript/HTML/CSS; Figma/Git; Testing frameworks: Jest, Cypress, XCTest, Espresso; CI/CD pipelines: GitHub Actions, CircleCI, Bitrise; Cloud-native architecture: AWS, Docker, Kubernetes; Observability tools. Interview Process: 3 Stage Interview. Stage 1 - Discussion with our Hiring Manager (30 mins): A chance to talk with our Hiring Manager in more detail about the role, our tech …
City of London, London, United Kingdom Hybrid/Remote Options
Hargreaves Lansdown
grasp of TDD/BDD practices. Comfortable designing and operating in both on-prem and cloud-native environments, with working knowledge of AWS, Docker, and Kubernetes. Advocates for observability and service health, using tools like Prometheus and Grafana to ensure reliability and performance. Champions quality and security, embedding testing and scanning into CI/CD pipelines and engineering workflows. … HTML/CSS; RDBMS (Oracle, Sybase)/NoSQL (Document DB); AWS/Docker/Kubernetes; CI/CD pipelines: GitHub Actions, CircleCI, Bitrise; Testing frameworks: Jest, Cypress, XCTest, Espresso; Observability tools: Prometheus, Grafana. Interview Process: 3 Stage Interview. Stage 1 - Discussion with our Hiring Manager (30 mins): A chance to talk with our Hiring Manager in more detail about the role …
Overview: This is a fast-expanding company at the forefront of odds comparison, where innovation converges with excitement. You will work within a close-knit team with autonomy while enjoying substantial financial backing from the larger enterprise. Responsibilities: Spearhead streamlined …
Join our client at the forefront of AI innovation. They’re shaping the digital foundations that will define how humans and AI learn from each other, pioneering technology that is set to transform the way intelligent systems evolve. The Role …
City of London, London, United Kingdom Hybrid/Remote Options
Morson Edge (Technology)
Join our client at the forefront of AI innovation. They’re shaping the digital foundations that will define how humans and AI learn from each other, pioneering technology that is set to transform the way intelligent systems evolve. The Role …
Required Qualifications • Bachelor’s or master’s degree in Computer Science, Engineering, or a related technical field. • 8+ years of hands-on software development experience, including large-scale backend systems or platform engineering. • Expert in Python with a strong understanding …
We are hiring for an Elasticsearch/SIEM/Observability Engineer/Consultant (Security or Observability). Location: multiple locations across the UK. Proficiency in Elasticsearch Query DSL, EQL, and Kibana Canvas/dashboards. Expertise in Elasticsearch and Kibana, including deployment modes and core components. Deliver Elastic-driven solutions to maximise customer security outcomes, with future growth into Observability. Hands … on experience deploying Elastic Observability or similar platforms (e.g., APM, log, metrics, tracing systems). Expert in Bash and Python for automating data onboarding and operational tasks; scripting skills in Python, Shell, or Painless for pipeline processors. Understanding of network protocols, HTTP, gRPC, and their logging intricacies. Proven ability to design and optimize Logstash pipelines (inputs, filters, outputs) and build … various processors (grok, dissect, script, kv, csv, geoip) for event normalization and enrichment. Strong knowledge of Linux system administration and container orchestration (Docker, Kubernetes). Familiarity with modern observability frameworks like OpenTelemetry and Prometheus and their integration with Elastic.
and improving the platforms, tools, and services that support modern web engineering. Your work will cover everything from building new environments and optimising CI/CD pipelines to enhancing observability, monitoring, and runtime systems. The goal is clear: empower web engineers to ship products faster, with greater reliability and security. This role is hands-on and goes far beyond maintenance. … designing and maintaining CI/CD pipelines. Hands-on experience with infrastructure-as-code (e.g. Terraform). Deep understanding of security best practices in cloud and application delivery. Exposure to observability tooling (Prometheus, Grafana, structured logging, etc.). Confident debugging and resolving issues in complex distributed systems. Background in B2B SaaS web applications, with familiarity with Node a plus. Able to operate …
senior partners who have been there since its launch. The Role: You'll join the fund's global technology team, where you will focus on the resilience, automation and observability of production systems that power a mission-critical quantitative trading platform. The role forms part of a follow-the-sun global support model. Primary Duties: Build and maintain automated tools … core requirement), with additional experience using T-SQL and Bash. Infrastructure & Systems: Exposure to Linux and Windows environments, with working knowledge of Docker containers and AWS cloud services. Monitoring & Observability: Familiarity with Datadog, Grafana, and other internal or custom monitoring solutions. Automation & CI/CD: Experience using Git, TeamCity, and configuration management tools such as Ansible or Terraform. Databases: Hands …
Manchester, Lancashire, England, United Kingdom Hybrid/Remote Options
Oscar Technology
flag strategies. Automating customer onboarding, including keypair management, SSO configuration, and integration workflows. Using Infrastructure as Code (Terraform) and GitHub Actions to manage configurations and ensure reliable deployments. Implementing observability across authentication services, structured logging, dashboards, alerts, and SLOs. Collaborating with engineers, product teams, and security to deliver scalable, secure solutions. What You Need: Strong proficiency in Node.js (TypeScript experience … similar). Familiarity with AWS services such as API Gateway, Lambda, and CloudWatch. A deep understanding of authentication standards (OIDC/SAML) and identity management principles. Hands-on experience with observability and monitoring practices. Excellent communication skills and a proactive approach to problem-solving. Nice to Have: Experience with enterprise SSO integrations and SCIM provisioning. Background in fintech, SaaS, or other …
and expose data across multiple internal platforms. Partner with quant stakeholders to translate real-world requirements into high-performance data solutions. Expand data platform functionality, improving latency, scalability, and observability as usage grows. Own processes around quality assurance, validation, and error monitoring of datasets. Explore and introduce new technologies and tooling to keep systems efficient and future-proof. Play a …
Site Reliability and Core Infrastructure responsibilities - owning everything from AWS cloud systems to on-prem deployments. The team is expanding to meet new strategic demands, including increased automation, enhanced observability, and the rollout of new colocation environments to support lower-latency trading. It’s a technically hands-on position that blends architecture, build, and operational ownership, suited to an engineer … Linux systems for performance and reliability, including kernel tuning and networking configuration. Partner with development and platform teams to embed SRE best practices, reducing manual toil through automation and observability. Drive improvements in monitoring, alerting, and log collection pipelines to enhance system insight and uptime. Participate in architecture and design reviews, guiding platform evolution with reliability and scale in mind. …
City of London, London, United Kingdom Hybrid/Remote Options
Techfellow Limited
Site Reliability and Core Infrastructure responsibilities - owning everything from AWS cloud systems to on-prem deployments. The team is expanding to meet new strategic demands, including increased automation, enhanced observability, and the rollout of new colocation environments to support lower-latency trading. It’s a technically hands-on position that blends architecture, build, and operational ownership, suited to an engineer … Linux systems for performance and reliability, including kernel tuning and networking configuration. Partner with development and platform teams to embed SRE best practices, reducing manual toil through automation and observability. Drive improvements in monitoring, alerting, and log collection pipelines to enhance system insight and uptime. Participate in architecture and design reviews, guiding platform evolution with reliability and scale in mind. …
Caldecotte, Milton Keynes, Buckinghamshire, England, United Kingdom
Connells Group HQ
mindset, working directly with development teams to understand their needs and deliver solutions. You will work across multiple technical domains including orchestration, automation, CI/CD pipelines, cloud services, observability, and security, developing deeper expertise in areas that align with platform priorities and your interests. Experience with Microsoft Azure is essential. You will play your part in operating the platform aligned … with Docker and basic Kubernetes concepts. Understanding of cloud networking concepts (VNets, subnets, NSGs). Awareness of cloud security best practices and compliance requirements. Basic knowledge of monitoring, logging, and observability tools. Understanding of cloud cost management and resource optimisation principles. Comfort with troubleshooting and supporting development teams. Understanding of service reliability and incident response practices. Connells Group UK is an …
explanations, citations) clear and accessible. Architecture: Shape a modular, scalable platform on AWS (ECS), separating ingestion, retrieval, reasoning, and delivery. Quality & reliability: Ensure reliability through testing, CI/CD, observability (metrics/tracing for LLM and retrieval paths), and performance optimisation. Collaboration: Partner with product and leadership teams, mentor peers, and play a role in shaping technical direction. Innovation: Explore … to have Experience with rerankers (e.g., cross-encoders), hybrid retrieval (SQL + vectors), query expansion, or lightweight knowledge graphs. Familiarity with LLM evaluation tooling (LangChain, LlamaIndex, OpenAI Evals) and observability for cost, relevance, and latency. Background in B2B data products or fintech. Applicants must be based in the UK with full right to work.
centric environments and wants to make a direct impact on the future of digital advertising addressability. Key Responsibilities: Own and evolve ID5’s platform, ensuring stability, reliability, scalability, and observability across services. Participate in technical design discussions and contribute platform expertise to architectural decisions. Participate in the on-call rotation to support critical production systems. Help define and champion platform … engineering best practices, principles, and standards. Build and improve platform observability with a focus on actionable, predictive metrics. Design and deliver fault-tolerant, highly scalable, and highly available systems. Develop self-service tooling and automation to empower product and data teams. Perform maintenance and lifecycle management of applications and infrastructure, including defining processes for upgrades, patching, and troubleshooting. Required Qualifications …