architectures using module federation Proficiency with cloud-native development in GCP or Azure, including secure and scalable deployments Strong command of modern CI/CD practices, test automation, and observability Familiarity with infrastructure tooling such as Docker, Kubernetes, Helm, and Terraform Broad testing expertise across unit, integration, end-to-end, and non-functional testing Understanding of secure coding practices and More ❯
NestJS). Exposure to low-code platforms (e.g., Retool) for rapid application development. Experience in DevOps practices, including infrastructure-as-code (IaC), monitoring, alerting, and incident management. Familiarity with observability tools (Grafana, Prometheus) and APM tools (New Relic, Datadog). Knowledge of microservices architecture, event-driven design, and scalability best practices. Experience implementing data compliance standards (GDPR, ISO 27001). More ❯
London, South East, England, United Kingdom Hybrid / WFH Options
Become
collaboration skills across multidisciplinary teams Desirable Attributes Exposure to microservices architecture and event-driven systems (e.g., Kafka) Experience with design systems and component libraries (e.g., Material, Storybook) Familiarity with observability tools and performance tuning Prior consulting experience or experience in client-facing roles Engagement Model Outside IR35 12-month initial contract with potential for extension or permanent employment Hybrid working More ❯
delivery Monitor and troubleshoot application and infrastructure issues across environments Collaborate with cross-functional teams to ensure high availability and performance Implement best practices for container security, scalability, and observability Participate in on-call rotations and incident response efforts Document architecture, processes, and troubleshooting guides Other Responsibilities Ensure uptime, scalability, and performance of common 3rd party and internally developed eDiscovery More ❯
engineering role. Security Mindset: Deep understanding of cloud environment security, from OS networking layers to cloud provider configurations. Proven project leadership in security areas such as runtime scanning, security observability, CSPM, etc. Experience with at least one cloud platform (AWS, Azure, GCP), including IAM, VPC, security groups, and cloud security tools (e.g., GuardDuty, Security Hub, CloudTrail). Coding/Automation More ❯
chains, for both internal and external use This involves work across various disciplines, with a tight focus on our specific areas of responsibility, such as cloud provisioning, infrastructure management, observability, and CI/CD You will be responsible for building and maintaining various tools, solutions and services associated with these areas Taking ownership where needed. We've no shortage of More ❯
series storage, and high-frequency analytics. Lead the design and governance of data models that support complex trading strategies, asset optimization, and regulatory reporting. Ensure data quality, lineage, and observability across all layers of the data stack. Strategic Collaboration & Business Alignment Partner with trading desks, quantitative teams, and risk functions to translate business needs into data solutions that enhance decision More ❯
roles. Experience with technical troubleshooting and scripting languages such as Python, Go, or Bash. Experience with Kubernetes security, including workload isolation, RBAC, and network policies, containerisation, orchestration, and Kubernetes observability tools (e.g., Falco, Prometheus, Grafana). Experience with infrastructure-as-code and configuration management tools (e.g., Terraform, Helm, ArgoCD). Eligibility to obtain UK Developed Vetting (DV) security clearance; British More ❯
Sheffield, Yorkshire, United Kingdom Hybrid / WFH Options
OMEGA, Inc
Experience and knowledge with TDD Knowledge working with ASP.NET Core Minimal API, xUnit and Entity Framework Core Responsive UI design Familiarity with cloud-hosted containerized microservice architecture (Docker, Kubernetes, observability tools) Experience of designing and implementing security best practices in web development About Us One Company - HBK Hottinger Brüel & Kjaer (HBK) is a global leader in the fields of sensors More ❯
Experience using managed languages such as Python, Go, C#, Java, or similar. Experience utilizing CI/CD platforms to automate provisioning infrastructure, software builds, tests, and releases. Experience using observability tools such as APM, logging, and metrics to assist with debugging issues. Experience designing tooling to simplify the operational management of SaaS/PaaS systems. Familiarity with building flexible and More ❯
scalability and reduce manual intervention. Operational Security, SRE & Assurance: Ensure security platforms are resilient, continuously monitored, and designed for 24x7 support and incident response readiness. Embed security telemetry and observability to enable proactive threat detection and automated response. Apply SRE principles to improve reliability, performance, and maintainability of security services. Lead platform health, patching automation, and vulnerability remediation workflows. Define More ❯
and maintain reusable components, APIs, and services that enable rapid deployment of AI features across products. Champion best practices in MLOps and software engineering, including CI/CD, testing, observability, and versioning for AI systems. Mentor and guide junior engineers and cross-functional team members, fostering a culture of technical excellence and collaboration. Stay current with advancements in AI/ More ❯
awareness of relevant regulation and restrictions Experience with running live services with significant volume of users and establishing appropriate SLOs and error budgets for services and applications Experience designing observability strategies for systems with multiple components More ❯
engineering expertise across Postgres, Redis, InfluxDB, and ClickHouse: schema design, indexing, and caching for sub-second reads. Experience deploying microservices in production using Docker and Kubernetes. Skilled in setting up observability and alerting pipelines (Prometheus, Grafana), including model drift detection. Experience with real-time ML inference and model serving frameworks (e.g., TorchServe, Triton, BentoML) for low-latency applications. Experience designing feedback More ❯
domain. Experience in a strongly/statically typed language. Have a strong understanding of designing, building, and running high-quality, standards-compliant workflow APIs, with a focus on testing, observability, and performance. Have worked with a cloud provider (AWS/Azure/GCP). Have worked with distributed systems and are comfortable debugging through tracing and observability. Willing to be More ❯
Identify and implement opportunities to automate workflows, reduce technical debt, and drive continuous delivery excellence. Drive a culture of early feedback, enabling faster and more reliable development through improved observability and testing strategies. About you 8+ years of professional software development experience with .NET/C#, including 3+ years in a technical leadership role. Proven track record architecting and delivering More ❯
Responsibilities: Architect and implement scalable, secure Kubernetes-based infrastructure for multi-cloud and hybrid environments. Lead technical direction for core Fleet initiatives: control plane services, tenancy models, deployment pipelines, observability layers, and more. Mentor engineers across the team, fostering a strong engineering culture of ownership, curiosity, and excellence. Drive modernization efforts, introducing patterns like GitOps, Policy-as-Code (Kyverno), Cilium More ❯
Croydon, London, United Kingdom Hybrid / WFH Options
Jane's Group
with 5+ years AWS and an appreciation of Azure services. Experience operating at an organisation-level in a complex multi-account architecture. In-depth infrastructure experience across network, backup, observability, security and governance. Experience managing security of systems in line with ISO 27001, Cyber Essentials or NIST standards. In-depth experience with Identity and Access Management, including Privileged Access In More ❯
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Jane's Group
with 5+ years AWS and an appreciation of Azure services. Experience operating at an organisation-level in a complex multi-account architecture. In-depth infrastructure experience across network, backup, observability, security and governance. Experience managing security of systems in line with ISO 27001, Cyber Essentials or NIST standards. In-depth experience with Identity and Access Management, including Privileged Access In More ❯
Infrastructure Observability Engineer - Leading Trading Company Location: London, UK Contract Type: Permanent Salary: Competitive + Benefits About Our Client Our client is a well-established trading company with a strong presence in the global commodities market. They are committed to leveraging cutting-edge technology solutions to drive operational excellence and maintain their competitive edge in the fast-paced trading environment. … The Role We are seeking an experienced Infrastructure Observability Engineer to lead the design, implementation, and continuous improvement of our client's enterprise observability platform. This role focuses on delivering comprehensive monitoring, event correlation, and impact analysis, demonstrating AIOps capabilities and tools such as BMC Helix Operations Manager. The ideal candidate will be passionate about improving access to infrastructure performance … automating operational intelligence, and reducing mean time to resolution (MTTR) through intelligent alerting and root cause analysis. Key Responsibilities Own and evolve the enterprise observability strategy across all infrastructure tracks Design, implement, and support event management and impact analysis workflows using platforms such as BMC Helix Operations Manager Integrate and correlate data from multiple sources (e.g., 20+ monitoring systems) into More ❯
Manchester, Lancashire, England, United Kingdom Hybrid / WFH Options
Opus Recruitment Solutions Ltd
based environments Troubleshooting complex issues across infrastructure, code, and security layers Collaborating with engineers, architects, and operational teams Maintaining systems in live environments and responding to incidents Ensuring security, observability, and scalability are embedded in solutions Key Skills & Experience Required Automation & Tooling CI/CD tools and pipeline development (e.g. Jenkins, GitLab, Bamboo) Infrastructure as Code (Terraform, Ansible, etc.) Orchestration … deployment Experience managing databases (SQL, NoSQL) Exposure to legacy infrastructure and modernisation approaches Performance tuning and workload scaling Ways of Working Agile and Scrum delivery methodologies SRE principles and observability practices Experience in mission-critical or highly regulated environments is advantageous Excellent problem-solving, communication, and collaboration skills What’s on Offer Highly flexible hybrid working model 25 days annual More ❯
Newcastle Upon Tyne, Tyne And Wear, United Kingdom
Strive Gaming
in between - ensuring our platform is resilient, efficient, secure and developer-friendly. Key Responsibilities: Design, build, and maintain platform services and infrastructure used by product engineering teams. Improve reliability, observability, and scalability of existing systems. Develop and maintain CI/CD pipelines to support software delivery. Build tooling and automation that supports self-service infrastructure and deployment. Ensure security best More ❯