infrastructure provisioning and tooling to enhance development efficiency. You will manage Platform Reliability and Infrastructure, ensuring a reliable and stable platform. You will oversee YouLend's Security and Observability frameworks, focusing on platform security, maintaining observability, and providing dashboards for developers to monitor service health. The ideal candidate is someone who has successfully built and scaled platform architectures, led … the ability to work across technical and non-technical teams. Excellent communication skills, with the ability to translate complex technical concepts to business stakeholders. Operational Focus: Expertise in platform observability, monitoring, incident management, and creating highly reliable systems. Experience implementing SLAs, SLOs, and SLIs is a plus. Security & Compliance: In-depth understanding of platform security, data privacy, and regulatory compliance …
Build evaluation pipelines to benchmark LLM performance and continuously monitor production accuracy and relevance. Work across the ML stack, from data preparation and model training to serving and observability, either independently or in collaboration with other specialists. Optimize model pipelines for latency, scalability, and cost-efficiency, and support real-time and batch inference needs. Collaborate with MLOps, DevOps, and …
AWS as our cloud compute platform; Kubernetes (EKS) for container runtime and orchestration; RDS (PostgreSQL, MySQL), Kafka, Redis; Terraform for infrastructure as code; Lambda and Step Functions; Datadog for observability; GitHub Actions for CI/CD. Frontend is React. Backend services are developed in Node.js (TypeScript). As we are an international team, please submit your application and CV in English. About Spendesk …
with React & Material UI, Postgres, Hasura and AWS Serverless Technologies such as Lambda, DynamoDB and EventBridge - all managed via AWS CDK & SST. We use Sentry, Lumigo and LogRocket for observability and GitHub Actions for automated testing and deployment. End-to-end Ownership. You will be entrusted with end-to-end ownership of your projects. From product, design and architectural decisions … ideally AWS). You focus on having a high impact. You've spearheaded the engineering of critical systems before, working with best-in-class tooling in AWS, IaC, observability and quality assessments. You want to discover the best ways to bring this to an early-stage startup. You know what good can look like. You understand what it … takes to build highly reliable & well-architected products. You build with quality, observability & redundancy at the forefront. You’re ready to get a lot done. You enjoy all aspects of building a product and are comfortable moving across the stack when necessary. You enjoy problem solving and thinking from first principles. You’re ready to pick up new skills and …
Wandsworth, Greater London, UK Hybrid / WFH Options
Our Future Health
using modern, agile development practices like code review, TDD, CI/CD and pairing using tools like Git and GitHub. Experience of operationally managing software components once live, including: observability, logging, metrics, error reporting, debugging and live incident management. Experience of working with sensitive personal data. Competitive salary starting from £85,000. Generous Pension Scheme – We invest in your future …
technology. • Experience designing RESTful APIs. • Experience with streaming and messaging systems such as gRPC, Kafka and RabbitMQ. • Experience designing and interfacing with user portals. • Experience with monitoring, telemetry and observability technology and patterns. • Understanding of BSS/OSS systems and their integration with network infrastructure. • Experience with agile development methodologies and ways of working. • Awareness of software and network security …
part of our Supply Chain team, implementing scalable, stable, secure, and resilient apps built on cloud-based services that enhance the store space shopping experience through internal tooling, automation, observability, and full-stack engineering practices. Our highly skilled Backend Engineers sit at the heart of our business and are responsible for designing, building, and maintaining our platform solutions. As a …
end-user experiences. ThousandEyes is integrated across the Cisco portfolio and beyond, helping customers deploy at scale while delivering AI-powered insights within Cisco’s Networking, Security, Collaboration, and Observability portfolios. What You'll Do We seek a skilled C++ Software Engineer to join our team. This role involves working on integration and test automation projects, with opportunities to work …
skills and experience (ideally Python, and/or Rust, Go, Kotlin, Java, etc.) Sound technical knowledge, ideally across multiple technical competencies and levels (e.g. APIs, networking, databases, security, compliance, observability, architecture) Excellent communication skills (written, graphical, remote, in-person, presentation, one-to-one, one-to-many) with the ability to engage, influence, and inspire stakeholders and colleagues to drive collaboration and alignment …