Engineering team. The ideal candidate will have a strong understanding of DevOps and Service Level Management (SLM) metrics, with experience in event-driven infrastructure projects using tools like Terraform, NewRelic, Kubernetes, AWS, and Kafka. As a Platform Engineering representative, you will collaborate with engineering teams to ensure our platform infrastructure tooling meets their needs and positively impacts … implement, and maintain scalable and highly available systems using load balancing, auto-scaling, canary releases, and blue-green deployments. Develop and maintain monitoring and logging dashboards with tools like NewRelic, Prometheus, Grafana, and Datadog, ensuring observability through metrics, tracing, log aggregation, and alerting. Help teams determine settings and thresholds for alerts and automations based on application performance … requirements. Monitor, optimize, and ensure system reliability and performance using tools like NewRelic and applying DORA metrics. Track uptime, response times, and resolution times to ensure compliance with SLAs, SLOs, and SLIs. Implement and promote system resiliency practices, including Chaos Engineering. Collaborate with cross-functional teams to enhance platform engineering practices and gather metrics data. Requirements Proven
AWS services and modules. Strong Linux administration and troubleshooting skills. Experience with CI/CD pipelines and Infrastructure as Code (IaC). Experience with monitoring and observability tools like NewRelic, DataDog, or Splunk. Skills and Strengths: Amazon Web Services (AWS) Auto Scaling, Fargate, Route53 Observability tools (NewRelic, DataDog, Splunk) Scripting (Ansible, Bash, Python, Go
as Code (IaC) solutions. Proven experience in monitoring and observability tools to proactively manage system health. Skills and Strengths: AWS (Amazon Web Services) Auto Scaling Fargate Route53 Observability tools (NewRelic, DataDog, Splunk) Scripting (Ansible, Bash, Python, Go) CI/CD Primary Job Responsibilities: Design and support EC2/ECS/EKS/Fargate environments for high availability … region setups) to ensure global reliability. Maintain and optimize the existing CI/CD pipelines and deployment processes to streamline software delivery, reduce risks, and ensure seamless integration of new features. Collaborate with Development, QA, and DevOps teams to integrate best practices into build and release processes. Implement, manage, and enhance monitoring tools to proactively detect and resolve system
Stoke-on-Trent, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
with automation and orchestration platforms to automate manual activity and reduce toil. Building sophisticated dashboards using a range of telemetry data and dashboarding technologies such as Grafana, Splunk and New Relic. Maintaining and administering existing monitoring and analytic toolsets. Working with IT Operations to provide and support the use of critical tooling that will enable increasing levels of value
Stoke-on-Trent, England, United Kingdom Hybrid / WFH Options
ZipRecruiter
and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques and best practice including Splunk, NewRelic, Grafana and PagerDuty. Excellent knowledge of programming including Python, Golang and JavaScript. Knowledge and experience of modern software development techniques and lifecycles. Experience with Infrastructure as … with automation and orchestration platforms to automate manual activity and reduce toil. Building sophisticated dashboards using a range of telemetry data and dashboarding technologies like Grafana, Splunk and New Relic. Maintaining and administering existing monitoring and analytic toolsets. Mentoring colleagues in use of new technologies or practices. Actively participating in live incident resolution and post-mortem analysis
Stafford, England, United Kingdom Hybrid / WFH Options
JR United Kingdom
and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques and best practice including Splunk, NewRelic, Grafana and PagerDuty. Knowledge and experience of modern software development techniques and lifecycles. Experience with Infrastructure as Code (IaC) automation and orchestration tools such as Ansible … with automation and orchestration platforms to automate manual activity and reduce toil. Building sophisticated dashboards using a range of telemetry data and dashboarding technologies like Grafana, Splunk and New Relic. Maintaining and administering existing monitoring and analytic toolsets. Mentoring colleagues in use of new technologies or practices. Actively participating in live incident resolution and post-mortem analysis
Stoke-on-Trent, England, United Kingdom Hybrid / WFH Options
bet365
and management of effective Service Level Indicators (SLI) and Service Level Objectives (SLO) for reliability and customer satisfaction. Knowledge of contemporary observability tools, techniques, and best practices including Splunk, NewRelic, Grafana, and PagerDuty. Knowledge and experience of modern software development techniques and lifecycles. Experience with Infrastructure as Code (IaC) automation and orchestration tools such as Ansible and … efficient and resilient. Working with automation and orchestration platforms to automate manual activities and reduce toil. Building sophisticated dashboards using telemetry data and dashboarding technologies like Grafana, Splunk, and New Relic. Maintaining and administering existing monitoring and analytic toolsets. Mentoring colleagues in the use of new technologies or practices. Actively participating in live incident resolution and post-mortem
To be successful as a Performance Test Engineer, you should have the following skills/experience: Developing load testing scenarios using tools such as LoadRunner, Performance Center, AppDynamics, Dynatrace, NewRelic, Kibana and Splunk. Analysing performance bottlenecks and capacity limits. Creating automated performance test suites. Monitoring system behaviour under stress conditions. Some other highly valued skills may include … they will lead collaborative assignments, guide team members through structured assignments, and identify the need for the inclusion of other areas of specialisation to complete assignments. They will identify new directions for assignments and/or projects, identifying a combination of cross-functional methodologies or practices to meet required outcomes. Consult on complex issues, providing advice to People Leaders … to support the resolution of escalated issues. Identify ways to mitigate risk and develop new policies/procedures in support of the control and governance agenda. Take ownership for managing risk and strengthening controls in relation to the work done. Perform work that is closely related to that of other areas, which requires understanding of how areas coordinate and