DevOps, Product, and QA teams to ensure production-grade solutions. Participate in architectural discussions, advocate for best practices, and provide mentorship to peers. Drive performance tuning, fault tolerance, and observability improvements across services. Required Skills & Experience: 5+ years of experience in Core Java development with a focus on performance and memory optimization. Proficient in Spring Boot and microservice-based architecture. Proven
latency, high-availability critical systems or cloud-based services. Experience defining, managing, and executing a portfolio of complex engineering projects. Preferred Qualifications: Expertise in event-driven architecture. Expertise in instrumentation, observability and monitoring. Experience with container and orchestration technologies and relevant security considerations. We often use Kubernetes and EKS. Expertise in relational and non-relational databases. Experience in Celery, EventBridge, SQS and
GraphQL. Implement automated unit, integration, and performance testing using xUnit, Selenium and JMeter to ensure quality and reliability. Deploy and monitor applications using Azure DevOps, Azure App Service and observability tools like Dynatrace. Maintain and contribute to technical documentation using tools such as Confluence and Azure DevOps. Conduct code reviews, mentor peers and promote engineering best practices, coding standards and
direction from others, is crucial. Ability to deliver tasks within a given timeframe. Ability to adapt and change to fluid processes. DESIRABLE: Obtained HCM certification (valid). Maintain and enhance system observability (logging, monitoring, alerting) to prevent issues before they impact users. Obtained Extend certifications from an accredited body in your area of specialisation (must be valid). Have AWS certified Developer
based out of our London office. What You'll Do Take ownership of Aztec's custom CI system and internal build tools. Enhance and build CI dashboards to improve observability of engineering workflows. Harden CI infrastructure using least privilege principles and robust sandboxing. Implement productivity enhancements across C++, Rust, Solidity, and TypeScript build systems. Investigate and resolve low-level bugs
line with standard playbooks from Google product teams. Resolve profile escalations and issues, improve the customer experience, and drive initiatives. Manage the health and performance of the environment using observability tools, carry out maintenance and updates to roll out new features and keep the platform secure. Participate in a 24/7, on-call rotation for incident response escalation within
data storage, and integrations. Collaborate with product and other stakeholders to shape Oak's technical strategy and roadmap. Mentor junior engineers and contribute to technical best practices. Drive automation, observability, and CI/CD maturity. What We're Looking For: 5+ years in back-end development with C#/.NET and experience at scale. Deep understanding of software design patterns
team to promptly triage test failures with precision and accuracy, maintaining the health of the build pipeline. Work collaboratively with development, product, operations, and support peers to encourage quality. Utilize observability and monitoring tools to proactively detect and remediate failures. Experience with manual test case/test scenario development & execution. Ensure all applicable security policies and processes are followed to support the organization
Hemel Hempstead, Hertfordshire, United Kingdom Hybrid / WFH Options
Eckoh
DynamoDB, SQS, and EventBridge. Develop robust CI/CD pipelines for applications running in EKS and serverless environments. Embrace microservices and event-driven architecture patterns. Implement logging, tracing, and observability practices from day one. Contribute to the design and development of cloud-native data platforms that support real-time and batch processing. AI & LLM Enablement: Collaborate with data scientists and
Hemel Hempstead, Hertfordshire, South East, United Kingdom Hybrid / WFH Options
Eckoh PLC
DynamoDB, SQS, and EventBridge. Develop robust CI/CD pipelines for applications running in EKS and serverless environments. Embrace microservices and event-driven architecture patterns. Implement logging, tracing, and observability practices from day one. Contribute to the design and development of cloud-native data platforms that support real-time and batch processing. AI & LLM Enablement: Collaborate with data scientists and
most common styling libraries. Strong grasp of the React framework, related patterns and best practices. Good understanding of UI/UX best practices and considerations. Understanding of front-end observability with tools like Sentry, LogRocket, Datadog, or New Relic. Experience with CI/CD pipelines, like GitHub Actions, ArgoCD. Awareness of common front-end security risks (e.g., XSS, CSRF).
of Go Lang or Java, with hands-on experience building scalable services. Ability and willingness to enhance existing Go Lang backend services regardless of specialisation. Experience working with an observability stack (logging, metrics, tracing). Expertise in building RESTful APIs following company standards. Understanding of Domain-Driven Design and modularization concepts. Asynchronous processing with approaches like coroutines, message queuing
Server and Desktop. Strong proficiency in Bash, PowerShell and Ansible scripting; Python experience is desirable. Expertise in virtualisation platforms, container orchestration and related tooling. Familiarity with monitoring and observability stacks (Prometheus, Grafana, ELK/EFK, or equivalents). Ability to diagnose and resolve complex technical issues with a clear, methodical approach. Ability to manage multiple tasks and prioritise effectively
components such as market data feeds, order gateways, execution algorithms, risk engines, UI dashboards, middle office reconciliation, and account infrastructure. We emphasize event-driven, deterministic system design, real-time observability, and strong security. Our tech stack includes Java (low-latency), Python, Web UI (React/Ag-Grid), Aeron, ClickHouse, Kubernetes, and modern CI/CD tooling, with a strong focus
you thrive in a fast-paced environment where you can make a real difference, we want to hear from you! Required skills/expertise: Develop and implement a comprehensive observability strategy for self-hosted deployments, including infrastructure and tooling for monitoring, alerting, and troubleshooting. This will involve designing and implementing robust metrics and logging systems. Engineer the ACRA platform for
building resilient, scalable, data-driven personalisation services end to end. Setting up and maintaining robust CI/CD pipelines. Running highly available systems exposing data via API with strong observability practices. Collaborating effectively across teams. Nice to have: ML engineering experience. Experience supporting data scientists with tooling, workflows and model optimisation. Domain-driven design experience. Applications are reviewed on an
Fi authentication systems, CRMs and partnered PropTech tools. Help to evolve our homegrown DevOps and CI/CD processes by further developing our GitHub Actions pipelines, Terraform definitions and observability integrations. Ensure quality & reliability: write and maintain unit, integration and end-to-end tests, and participate in code reviews to uphold high standards. Contribute to our cloud-native platform to optimise
proprietary data that will power AI products. Champion best practices for orchestrating multi-cloud environments (AWS, Azure, GCP) to enhance platform performance, scalability, and cost efficiency. Implement robust security, observability, monitoring frameworks, and data governance to ensure data reliability, minimize downtime, and maintain compliance. Manage budget, implement charge-back models for platforms and services you provide to your customers. Lead
code, networking and databases with enough knowledge to be able to fault find and identify the root cause. Core responsibilities involved: incident management; application design and development; site reliability (observability, alerting, high availability, self-healing systems, etc.); database administration; infrastructure provisioning; process automation; responding to change requests. Skills & Experience: Oracle DB; Docker (with Docker Swarm); Elastic Stack; TypeScript/React
Malvern, Worcestershire, United Kingdom Hybrid / WFH Options
QinetiQ Limited
the evaluation of the performance of LLMs in different contexts. Accountabilities: Understands the technical aspects of the project and the wider customer business model. Solution architecture, including security, availability, observability, scalability, performance, reliability, and cost-efficiency. Ensures team members understand and adhere to project standards for quality, documentation, techniques and tools. Identifies, escalates & manages technical risk with Team Manager and
create a shared understanding of decision making, direction, priorities, and progress between the team, the org, and the broader company. Experience operating user-facing software at scale, including availability, observability, and security fundamentals. Industry or research knowledge of compilers, program analysis, programming language design and implementation. Knowledge of logic programming or database query languages (e.g. SQL, Prolog, Datalog, Kusto Query
a live service for users. Experience with understanding network architectures and troubleshooting network-related issues using Linux tools. In-depth expertise in at least one of: Kubernetes, Terraform, Networking, Observability. Flexibility and mobility are required to deliver this role as there may be requirements to spend time onsite with our clients and partners to enable delivery of the first-class
will: Design and implement pipelines for training, deploying, and monitoring real-time and batch ML models. Work closely with ML Scientists to productionise models and improve reliability, latency, and observability. Partner with backend and product teams across Depop to define integration requirements and coordinate deployments of shared ML components. Help design and extend the ML platform at Depop in collaboration
teams to align on data architecture and ensure our ML systems meet overarching business objectives. Evolve our MLOps infrastructure, driving the strategy for model versioning, automated deployments, monitoring, and observability using modern tools like Prefect. Mentor and guide other members of the team, fostering a culture of technical excellence and continuous improvement through code reviews, design discussions, and knowledge sharing.