Own CI/CD pipelines and Docker-based runtime on AWS; Infrastructure-as-Code via CDK/Terraform (CDKTF). Apply secure-by-design and TDD; instrument apps for observability and performance. Collaborate with product, platform, and security teams to meet operational and compliance requirements. The toolkit you’ll use Frontend: TypeScript, React.js, Vite, Material-UI, HTML5, CSS Backend … Docker, CI/CD. Building and consuming RESTful APIs; JSON schemas; integration testing. Comfortable in AWS and modern Infrastructure-as-Code approaches. Strong engineering fundamentals: code reviews, testing, observability, performance tuning. Security Clearance: Active SC or DV (must be current). Nice-to-haves Military background (RAF/Army/Navy) or delivery in defence, aerospace, or government More ❯
with UK retailers and marketplaces. In this role, you'll ensure our systems are reliable, scalable, and secure. You'll help automate deployments, evolve our cloud infrastructure, and improve observability and developer experience — making it easier for product teams to deliver quality software quickly and safely. Why Zopa Manchester? We're building a new tech hub right in the heart … platform and developer experience teams Ensuring our container platforms (including Kubernetes) are reliable, secure, and up to date Designing scalable, self-service tools to reduce operational toil Supporting infrastructure observability through metrics, tracing, and alerting Working closely with product teams to foster a culture of reliability engineering About You Experience in a Platform/Site Reliability Engineering or similar role More ❯
code across the stack. Participating in architectural discussions and helping shape engineering best practices. Troubleshooting and resolving production issues across services and systems. Contributing to CI/CD pipelines, observability, and automation alongside platform engineers. Your Skills & Experience: Must-haves to be successful in this role: Strong experience writing backend services in Go. Proficiency in React and modern JavaScript/… and code styles. Nobody can do everything, but here are a few related things we’re interested in: Experience working lower in the stack, e.g., databases, infrastructure, Kubernetes, or observability tooling. Exposure to CI/CD tooling Interest in natural language processing, AI, or distributed systems. Here’s our promise to you: We are going to work with you – to More ❯
Experience: Proven delivery of enterprise OpenTelemetry environments, including production-scale collector deployment and config management. Hands-on experience with metrics, logs, traces, attribute design, and routing logic. Familiarity with observability backends (Dynatrace, Splunk, Prometheus, Tempo, Grafana). Strong collaboration skills with developers and infra teams in large, governed organisations. More ❯
Help to improve the resilience, automation, and observability of production systems that power a mission-critical quant trading platform for a systematic hedge fund. This isn’t your typical ops role - they're looking for Engineers who can write code to eliminate toil, improve reliability and automate release, monitoring and recovery processes. You'll build and maintain automated tools in More ❯
City of London, London, United Kingdom Hybrid/Remote Options
Sharpe Search
and JavaScript/TypeScript (React.JS) being essential to drive their frontend and backend systems. You will be designing and delivering scalable, high-performance solutions from product requirements, ensuring robust observability through metrics and monitoring. You’ll work on event-driven architectures using CQRS, apply SOLID principles, and leverage Docker to build high-availability, high-throughput platforms. Experience with AWS services More ❯
NUnit). Expertise in RESTful and GraphQL APIs, Git, and SOLID principles. Strategic thinking, strong communication, and a love for collaboration. Bonus: Experience with Azure, DevOps, Entity Framework, and observability practices. Why You'll Love It Here: Developer-led culture with hack days, and open access to leadership. Transparent progression and tailored development plans. Great perks: profit share, training budget More ❯
Employment Type: Permanent
Salary: £70000 - £80000/annum Pension, 25 days holiday, Profit Sha
complex data ecosystem Design flexible data ingestion and transformation pipelines for financial market data and trading systems Build and maintain AI/ML infrastructure, including model serving, evaluation, and observability frameworks Collaborate directly with clients to ensure the platform meets real-world enterprise requirements Contribute to both strategic technical direction and hands-on implementation as part of a small, high More ❯
Integrate and extend React Native functionality using native modules in Swift or Kotlin when needed. Take ownership of app store releases, ensuring smooth submission, update and maintenance processes. Manage observability and performance in production through tools like crash reporting, logging and analytics. Contribute to the architecture and tooling decisions, helping to shape the direction of our mobile stack from the More ❯
London (Chessington), South East England, United Kingdom Hybrid/Remote Options
SoTalent
Resource Management: Oversee project budgets, allocate resources, and lead the monitoring engineering team to deliver on time and within scope. What You'll Bring Strong expertise in system reliability, observability, and monitoring strategy. Deep understanding of end-to-end video processing and broadcast workflows. Proven leadership skills with experience managing engineering teams. Strategic mindset with the ability to align monitoring More ❯
team meetings and performance reviews. Motivate, guide and coach team members to hit agreed targets via formal objectives and supporting development plans. Incident & Problem Management Proactively partner with the Observability Management function to establish trending and opportunities for Customer infrastructure optimisation. Be a process manager and advocate for Problem Management, ensuring root cause analysis takes place on major Incidents and More ❯
TM Forum (eTOM/ODA) and ITIL 4. Strong delivery leadership in Agile/SAFe environments, combined with governance and risk management. Demonstrated success implementing Operational Readiness Reviews (ORR), observability, and support models. Excellent stakeholder management and ability to produce clear written artefacts for both technical and executive audiences. Education Bachelor’s or Master’s degree in Computer Science, Engineering More ❯
end Scale ingestion and indexing for 30+ blockchains, including high-throughput chains Operate a secure fleet of full nodes and indexers with clear SLAs and cost controls Set SLOs, observability, incident management, and make on call boring Build and lead six plus squads. Org design, hiring, mentoring, standards, and SDLC Partner with product, compliance, and customers to turn outcomes into More ❯
implement AI solutions using agents, LLMs, planning algorithms, and decision-making frameworks Develop agent based architectures that support autonomy, interactivity, and task completion Implement best practices for orchestration and observability, monitor performance, conduct evaluations, and implement safety and guardrail mechanisms Integrate agents into applications, APIs, or workflows (e.g., chatbots, copilots, automation tools) Collaborate with researchers, engineers, and product teams to More ❯
backend, web, and mobile Real-time data pipelines for device integration Web and mobile frameworks for dashboards and apps Relational databases with schema management Deployment, CI/CD, and observability tooling AI integration for insights and reporting What They’re Looking For Strong coding experience in TypeScript, Node.js, and React Experience leading projects or small teams Comfortable working across backend More ❯
cable management, hardware lifecycle planning, and environmental monitoring. Participate in capacity planning and performance tuning to support business growth and infrastructure scalability. Reliability & Monitoring Ensure high availability, security, and observability of systems through best practices in reliability and recoverability. Develop and maintain monitoring systems to ensure compliance with service level objectives. Lead and contribute to incident response, root cause analysis More ❯
business problems at scale. What you’ll bring: Expertise in the deployment of enterprise-grade AI solutions to cloud and on-premise customer environments with a focus on availability, observability and security. Proven track record with at least one of the major cloud providers and an understanding of DevOps best practices. Hands-on experience building production-grade solutions using LLMs More ❯