secret management tools (e.g., HashiCorp Vault, Azure Key Vault) and SSO/authentication systems (e.g., Okta). Observability: Hands-on experience with platforms like Datadog, Grafana, or Azure Monitor. Networking: Strong understanding of networking principles, DNS, and related technologies. CI/CD: Skilled in creating and maintaining CI/CD
strategy while delivering incremental value. Technical Debt Management – Experience identifying and remediating inefficient architectures. Observability & Performance Optimization – Familiarity with monitoring and logging tools (e.g., Datadog, Splunk, Prometheus, New Relic). Stakeholder Management – Ability to engage with senior leadership, product managers, and engineering teams. Metrics-Driven Decision Making – Familiarity with engineering
utilization - Strong understanding of network fundamentals (DNS, DHCP, TCP/IP, routing, load balancing, load shedding) and experience with monitoring frameworks (such as CloudWatch, Datadog, Grafana, Elastic or similar) - Experience scripting operating system tasks in Bash, Python, etc. and with Infrastructure as Code (such as CDK, CloudFormation, Puppet, Chef, Ansible
messaging tools (Kafka, Kinesis, Redis), and cloud infrastructure technologies (AWS, Docker, Kubernetes, Terraform). Strong understanding of CI/CD pipelines, observability tools (e.g., Datadog), and Agile and Lean methodologies. Demonstrated ability to adapt to new technologies, align technical decisions with business goals, and champion quality engineering through testable code
Kinesis, DynamoDB, and Lambda. Proficiency in CI/CD tools, particularly Jenkins and Spinnaker. Familiarity with monitoring and observability tools such as CloudWatch and Datadog. Strong understanding of security best practices in cloud environments. Preferred Qualifications: In addition to the required qualifications, the following skills and experiences are highly desirable
Continuous Delivery using tools like Jenkins and Git. Experience working in Agile environments, including Scrum and/or Kanban. Experience with monitoring tools like Datadog, Kibana, Client and Grafana. Good knowledge of Linux/Unix with troubleshooting skills. Good to have: experience with Python, AWS services like EKS, S3, IAM
Bash, C#, React, Golang); Proficient in (Azure) cloud platforms and tooling ( , Terraform/OpenTofu, ArgoCD, GitLab); Experienced in using and extending observability tooling like Datadog, Grafana, OpenTelemetry and system/application performance monitoring; Ability to debug, optimize code, and automate routine operational tasks; Deep understanding of infrastructure and software development
Experience in data platforms, data engineering, large-scale data processing, ETL, lakehouse, and experience in microservices, API design, Kafka, Redis, Memcached, Observability (Datadog, Splunk, Grafana or similar), Orchestration (Airflow, Temporal). Proficient in SQL and in one or more DBMS: Oracle, PostgreSQL, Sybase, MongoDB, Cassandra, CockroachDB, MySQL, Couchbase
operations of web applications. Desirable Skills: Serverless & Microservices: Experience with AWS Lambda, Azure Functions, and event-driven architectures. Observability & Monitoring: Familiarity with monitoring tools like Splunk, Datadog, or New Relic for enhanced visibility and observability. Networking: Knowledge of VPCs, VPNs, and load balancing in cloud environments. GDS Standards: Awareness of GDS Service Standards and accessibility requirements, especially for public
CI/CD best practices and tools (e.g. GitHub Actions, Jenkins, CodePipeline). Exposure to monitoring and observability tools for ML systems (e.g. Prometheus, Grafana, Datadog, WhyLabs, Evidently, etc.). Experience in building parallelised or distributed model inference pipelines. Nice-to-Have Skills: Familiarity with feature stores and model registries (e.g. Feast
tools like Terraform, Azure Bicep, CloudFormation, etc. Familiarity with messaging services like Kafka or Azure Event Hubs. Experience with monitoring, logging and tracing services (Datadog, Prometheus, Grafana, Splunk, ELK, etc.). Hands-on experience with build frameworks like Gradle, Maven, NPM. Experience with Linux operating system. Ability to partner with multi … Curiosity and motivation to learn. Nice to Have: Kubernetes certification or any public cloud certification. Experience with Azure cloud or Azure DevOps. Experience with Datadog. Familiarity with application development using Java and React. $120,000 - $170,000 a year
Selenium, Playwright). Knowledge of C#, JavaScript, TypeScript, SQL, and test management tools (e.g., Xray, Quality Centre). Experience with performance monitoring tools (e.g., Datadog, Grafana) and CI/CD principles. Familiarity with AWS technologies (e.g., Lambda, Fargate), version control (Git), and infrastructure as code (Terraform). Knowledge of security
with policy as code tools (Kyverno, OPA Gatekeeper, ) • Familiarity with container security scanning tools • Experience with monitoring and observability platforms such as Dynatrace, Grafana, Datadog, Elastic • Knowledge of SIEM solutions and security monitoring tools. Why Join Us? • Impactful Work - Play a key role in managing a highly secure banking application
Reigate, Surrey, United Kingdom Hybrid / WFH Options
Willis Towers Watson
cost effectiveness. Implement infrastructure as code with Pulumi. Support the team in infrastructure and networking related issues. Maintain and configure observability platforms such as Datadog. Proactively monitor production and other environments to ensure stability, availability, security and integrity. Participate in incident response, troubleshooting, and root cause analysis to mitigate and
TLS, GPG, SSH). Familiarity with SIEM frameworks, ISO 27001, and best practices in cloud security. Experience using logging and monitoring tools such as ELK, Datadog, or Logz.io. Certifications in Linux, Kubernetes, or networking (or a willingness to gain them). We’ll support your growth through access to the
code software such as Terraform. Experience in continuous integration practices & tools such as GitHub Actions, etc. Experience in observability implementations using tools such as Datadog. Analytical and detail-oriented, aligned to the DORA principles. Excellent problem solving, communication, and teamwork skills. Excellent time management and organisational skills. Comfortable dealing with
Cambridge, Cambridgeshire, United Kingdom Hybrid / WFH Options
AI Tech Suite
Find the latest job opportunities in AI and tech. RunPod offers GPU cloud computing for AI/ML, providing secure and community cloud options, on-demand and spot pods, and serverless GPU scaling. The flexibility of remote work with an
within the Observability stack and provide L3 support as needed. Top Skills: Prometheus, Grafana, AWS (EC2, S3, Lambda), Scripting (Python preferred). Nice to Have: Datadog, Kubernetes. Responsibilities include building observability tools with open-source tech (OTEL, OpenSearch, Grafana, OpenTofu), supporting cloud-native systems, and contributing to monitoring and alerting
Proficiency in SQL and data analytics tools (e.g., Sigma, Snowflake). Experience with FIX protocol and market data analysis. Proficient in AWS, Kubernetes, monitoring tools (Datadog, Prometheus, Grafana), and automation frameworks (Terraform, Ansible, Pulumi). For more information, please apply with a relevant CV.