GCP. Background knowledge and hands-on practice in Observability, specifically experience working with one or more of the following tools: Kibana, OpenSearch, Grafana, Datadog, Sumo Logic, New Relic, AppDynamics, Dynatrace, Prometheus, Logz.io, SignalFx, Instana, Splunk, Honeycomb, Jaeger. Hands-on experience with Infrastructure as Code (Terraform/Ansible). Hands
pipelines and container technologies like Docker and Kubernetes. Deep understanding of networking, distributed systems, and databases. Expertise in monitoring and observability tools such as Datadog, Prometheus, Grafana, ELK stack, or Splunk. Excellent communication skills and a meticulous approach to problem-solving. Desirable Experience: Familiarity with Azure. Experience working in the
pipelines, and be confident scripting in Python, C# or similar scripting languages. You’ll also be comfortable working with monitoring and performance tools like Datadog or Prometheus, and ideally, you’ll have worked in a fast-moving SaaS or product-led business before. Bonus points if you’ve helped shape
Proficient in cloud platforms (AWS, Azure, GCP) and modern DevOps tooling (e.g., Terraform, Jenkins, Kubernetes). Hands-on with observability and monitoring tools (e.g., Datadog, Azure Monitor, AppDynamics). Expert in cyber security practices, identity management, encryption, and secure API development. Familiarity with compliance frameworks such as GDPR and PCI
strategy while delivering incremental value. Technical Debt Management – Experience identifying and remediating inefficient architectures. Observability & Performance Optimization – Familiarity with monitoring and logging tools (e.g., Datadog, Splunk, Prometheus, New Relic). Stakeholder Management – Ability to engage with senior leadership, product managers, and engineering teams. Metrics-Driven Decision Making – Familiarity with engineering
Experience with AWS certifications (AWS Certified Solutions Architect, Developer, or DevOps Engineer). Experience with Monitoring and Logging solutions like CloudWatch, New Relic, or Datadog.
AWS (EC2, EKS, ECS, Fargate, Lambda, CloudFormation, Load Balancers, CloudWatch) and equivalents in Azure and GCP. Familiarity with observability tools such as Kibana, Grafana, Datadog, New Relic, and others. Proficiency in RegEx, Lucene, and PromQL. Leadership & Onboarding: Proven experience leading technical teams focused on observability solutions and customer onboarding. Ability to
based Container Orchestration Platforms - AWS EKS. Skilled working with Infrastructure as Code, Terraform required. Proficiency in setting up or integrating with Observability tools, e.g., Datadog, CloudWatch, X-Ray. Previous experience with troubleshooting and debugging on public cloud infrastructure (AWS). Working proficiency with RDS Databases and Cache Engines. Experience with
region infrastructure and services. Manage enterprise schedulers and CI/CD pipelines for production workloads. Implement centralised logging/monitoring with platforms like Grafana, Datadog, or App Insights. Integrate and manage code quality and security tooling (SonarQube, Black Duck, etc.). Oversee network security, secrets management, and documentation practices (Confluence
senior engineering capacity. Preferred Skills: Strong Python programming skills. Proficiency in SQL and data analytics tools (e.g., Sigma, Snowflake). Experience in AWS, monitoring tools (Datadog, Prometheus, Grafana), and automation frameworks (Terraform, Ansible, Pulumi). For more information, please apply with a relevant CV.
Kubeflow, Docker/Kubernetes, and GitOps practices. Strong working knowledge of Azure and Databricks services. Proficient with observability and monitoring tools (e.g. Prometheus, Grafana, Datadog). Curious and commercially minded, focused on delivering scalable, valuable solutions. Familiarity with additional cloud platforms such as AWS or GCP is a plus. Demonstrated leadership
and Linux/Unix services. • Strong experience in scripting languages like PowerShell, Python and SQL. • Strong knowledge of monitoring tools – Nagios, Splunk, OTEL, Datadog • Strong knowledge of FIX protocol • Strong domain skills - Must have working experience in Capital Markets across modules and instruments, especially CASH, ETS, Bonds, Options, Futures
impact role that demands serious technical firepower. You’ll need deep experience with cloud-native tooling, hands-on knowledge of observability tools, e.g. Grafana, Datadog or Splunk, and the ability to troubleshoot containerised environments like a pro. You have the technical knowledge and the confidence to convey this to whoever
and reduce emissions, backed by several major investors across the energy, finance, and AI sectors. Technology Stack: Node.js, TypeScript, MongoDB, AWS (Serverless), Snowflake, Fly.io, Datadog, Tableau, and more. What You’ll Bring: This is a hands-on engineering role, ideal for someone who loves solving problems with code. Minimum
PAS, document management systems, and external data providers. Platform Monitoring: Determine requirements for specific alerts, set up alerts for various events and thresholds, utilise Datadog logs and dashboards for error analysis, and track DXC downtime while communicating updates to users. Platform Updates: Conduct a 3-way merge of updated code
TypeScript & Playwright to write and maintain our end-to-end tests. Qase to manage our test case suite. GitHub & GitLab for source control. Sentry & Datadog for metrics & monitoring. AWS for our production and staging environments. 🤩 We’d love to hear from you if… You have 5+ years' experience in software
Strategic use of CloudFront or other CDNs to reduce latency and optimize user experience. Monitoring & Optimization: Implementing performance monitoring using tools like AWS CloudWatch, Datadog, or New Relic to proactively address bottlenecks. Security & Compliance Secure Architecture: Understanding of IAM, VPCs, and network segmentation to minimize vulnerabilities. Regulatory Knowledge: Familiarity with
City of London, Greater London, UK Hybrid / WFH Options
Fruition Group
Job Title: Senior Site Reliability Engineer (SRE) Location: Central London (Hybrid - c. 1-2 days per week) Salary: £80,000 - £100,000 + benefits Why Apply? This is a fantastic opportunity for a seasoned Senior Site Reliability Engineer to take
champion observability practices across multiple product groups. Provide thought leadership from the Cognizant delivery team on all things SRE. Leverage hands-on experience with Datadog to implement and enhance observability capabilities. Guide and oversee the day-to-day operation and maintenance of observability tools. Partner directly with engineering teams to … Expert-level proficiency with DevOps tools including GitHub, GitHub Actions, Jenkins, Nexus, CloudFormation/Terraform, and CodeQL. Extensive hands-on experience with observability tools – Datadog preferred. Deep understanding of AWS services such as EC2, ELB, ECS, S3, Config, CloudTrail, Lambda, EFS, and VPC. Strong scripting skills in Python and