assets from legacy systems to our cloud-native environments using AWS and our bespoke Conversion Framework. Build new and maintain existing bespoke systems. Implement .NET-based microservices with strong observability and integration with data platforms. Develop custom ETL pipelines using AWS, Python, and MySQL. Implement governance, lineage, and monitoring to ensure high availability and traceability. AI & Advanced Analytics Integration: Collaborate …
or similar. Experience with code collaboration tools such as GitHub, ArgoCD, or similar. Experience utilizing CI/CD platforms to automate infrastructure provisioning, software builds, tests, and releases. Experience using observability tools such as APM, logging, and metrics to assist with debugging issues. Experience using Infrastructure as Code tools for provisioning infrastructure such as Terraform, CloudFormation, or similar. Experience designing tooling …
Fort George G Meade, Maryland, United States Hybrid / WFH Options
August Schell
with Kafka/Confluent, Kubernetes operators. • Experience creating data partitioning strategies and monitoring topics for performance. • Experience deploying and upgrading Kafka clusters in high-availability containerized environments. • Experience utilizing observability platforms (Elastic, Datadog, etc.) to configure monitoring for data pipelines to ensure high availability and throughput, low latency, and alerting • Knowledge of stream processing pipelines and analytics. • Experience with Apache …
Out in Science, Technology, Engineering, and Mathematics
or similar. Experience with code collaboration tools such as GitHub, ArgoCD, or similar. Experience utilizing CI/CD platforms to automate infrastructure provisioning, software builds, tests, and releases. Experience using observability tools such as APM, logging, and metrics to assist with debugging issues. Experience using Infrastructure as Code tools for provisioning infrastructure such as Terraform, CloudFormation, or similar. Experience designing tooling …
assets from legacy systems to our cloud-native environments using AWS and our bespoke Conversion Framework. Build new and maintain existing bespoke systems. Implement .NET-based microservices with strong observability and integration with data platforms. Develop custom ETL pipelines using AWS, Python, and MySQL. Implement governance, lineage, and monitoring to ensure high availability and traceability. AI & Advanced Analytics Integration: Collaborate …
learning, knowledge sharing and continuous improvement. You have a passion for DevOps and Platform as a Service. Understanding of security and compliance requirements related to platform infrastructure. Experience with observability practices and tooling, incident management processes and driving operational excellence. Diversity, Equity and Inclusion If you're excited about this role but your experience doesn't align perfectly, we encourage …
scalability and reduce manual intervention. Operational Security, SRE & Assurance: Ensure security platforms are resilient, continuously monitored, and designed for 24x7 support and incident response readiness. Embed security telemetry and observability to enable proactive threat detection and automated response. Apply SRE principles to improve reliability, performance, and maintainability of security services. Lead platform health, patching automation, and vulnerability remediation workflows. Define …
and maintain reusable components, APIs, and services that enable rapid deployment of AI features across products. Champion best practices in MLOps and software engineering, including CI/CD, testing, observability, and versioning for AI systems. Mentor and guide junior engineers and cross-functional team members, fostering a culture of technical excellence and collaboration. Stay current with advancements in AI/ …
improving operational efficiency and automation. Lead cloud-native infrastructure initiatives, leveraging AWS, Kubernetes, and Terraform to deploy and scale systems efficiently. Implement GitOps-driven workflows to enhance deployment automation, observability, and system governance. Foster a DevSecFinOps culture, ensuring security, compliance, and financial accountability within the development lifecycle. Optimise data storage and retrieval strategies, balancing performance, cost, and compliance in a …
practices (Agile, Scrum, Kanban) Proficiency in CI/CD pipelines, infrastructure as code, and cloud data tooling Familiarity with data governance, privacy, and security principles Experience using metrics and observability tools to monitor data platform health and team performance Experience in performance management and setting measurable goals for team members This role isn't for you if: you rely on …
and driving down costs. Application development. If you're currently an application engineer working in Python or Node.js with a strong operational slant, that can work well for us. Observability (Datadog), with a strong focus on enabling and empowering Engineering teams to understand their product in Production. SaaS Networking. Geolocation-based performance, the path to multi-region, frontend performance optimisation. …
domain. Experience in a strongly/statically typed language. Have a strong understanding of designing, building, and running high-quality, standards-compliant workflow APIs, with a focus on testing, observability, and performance. Have worked with a cloud provider (AWS/Azure/GCP). Have worked with distributed systems and are comfortable debugging through tracing and observability. Willing to be …
orchestration solutions Evaluate and implement tools to improve developer experience and platform stability Introduce new tools or services to enhance infrastructure and operations Reliability & Monitoring Ensure security, reliability, and observability across platform components Monitor system health, capacity, and performance; troubleshoot cross-environment issues Collaborate with developers for seamless integration of infrastructure and application code Share knowledge and mentor junior engineers …
About Zego At Zego, we understand that traditional motor insurance holds good drivers back. It's too complicated, too expensive, and it doesn't reflect how well you actually drive. Since 2016, we have been on a mission to change …
success across all regions. Partner closely with R&D, Customer Success, Product, Sales, and Support to drive holistic customer outcomes. Hands-On Technical Expertise Maintain hands-on fluency in observability tooling, logging infrastructure, and cloud environments. Act as a senior technical escalation point for complex deployments or architectural challenges. Provide in-depth technical guidance on customer environments, use cases, and … performance analytics. Collaborate on the development of tools and dashboards to ensure visibility and impact tracking. Requirements Technical Experience 10+ years of technical experience in Cloud DevOps, SaaS, or observability, with 5+ years in leadership roles. Strong hands-on experience with AWS, GCP, Azure, K8s, Terraform, and observability tools: Prometheus, Grafana, OpenTelemetry, ELK, Splunk, Datadog, and similar. Proficiency with metrics … team members are encouraged to challenge the status quo and contribute to our shared mission. If you thrive in dynamic environments and are eager to shape the future of observability solutions, we'd love to hear from you. Coralogix is an equal opportunity employer and encourages applicants from all backgrounds to apply.
You'll Do Build and maintain cloud infrastructure, helping transition fully to Google Cloud Platform (GCP) Use Terraform for Infrastructure as Code (IaC) Manage and monitor systems using Datadog; observability is key (no separate SRE team) Work with Kubernetes in production environments Collaborate closely with engineering and project management teams to deliver secure, scalable platforms Participate in a shared on … or a related field Strong cloud expertise, with AWS and/or GCP Proven skills in Terraform and infrastructure automation Experience with Kubernetes in production Familiarity with monitoring/observability tools - ideally Datadog Comfortable working cross-functionally and contributing to team-wide initiatives What You'll Get 4-day week Pension match Performance-based bonuses Private medical, dental and optical …
fosters innovation, and delivers exceptional user interactions delivering robust internal developer platform (IDP) capabilities, strengthening CI/CD pipelines, enabling on-demand environments, and scaling platform foundations such as observability, security, and FinOps - while adhering to best practices in DevOps and modern software delivery. What we expect from you Drive the development of a comprehensive IDP (e.g., based on Backstage … on-demand environments for development, QA, and staging through Infrastructure-as-Code and container orchestration. Support multi-tenancy and environment rationalization to reduce duplication and inefficiency. Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience … tools. Proven success in building and operating developer platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire …
at scale, leveraging AWS Organizations, Landing Zones, and multi-account best practices. Develop and maintain Infrastructure as Code solutions using Terraform, CloudFormation, and AWS CDK. Champion security, compliance, and observability by integrating services like AWS Security Hub, GuardDuty, and Inspector. Design CI/CD pipelines to enable seamless deployments and self-service models for customers. Innovate with AWS Networking, KMS … Proficiency in Python, Go, or similar languages for automation and scripting. Expert-level knowledge of AWS Networking, TLS, and security best practices. Experience with container orchestration (Kubernetes, EKS) and observability tools (Grafana, ELK). A passion for innovation, problem-solving, and delivering high-impact solutions. Why Work For Us? 25 days holiday + bank holidays Up to 5% employer pension …
fosters innovation, and delivers exceptional user interactions delivering robust internal developer platform (IDP) capabilities, strengthening CI/CD pipelines, enabling on-demand environments, and scaling platform foundations such as observability, security, and FinOps - while adhering to best practices in DevOps and modern software delivery. What we expect from you Drive the development of a comprehensive IDP (e.g., based on Backstage … on-demand environments for development, QA, and staging through Infrastructure-as-Code and container orchestration. Support multi-tenancy and environment rationalization to reduce duplication and inefficiency. Define and implement observability standards, including logging, metrics, tracing, and alerting. Use tools like New Relic, Prometheus, and Grafana, alongside building custom instrumentation for key platform services. Drive incident readiness and operational resilience … tools. Proven success in building and operating developer platforms and enablement frameworks. Experience with cloud-native technologies, Kubernetes, and Infrastructure as Code (Terraform, Helm, etc.). Strong understanding of observability tooling (especially New Relic, Prometheus, Grafana) and incident response best practices. Familiarity with FinOps, platform cost tracking, and infrastructure efficiency techniques. Excellent communication, leadership, and stakeholder management skills. Attract, hire …