101 to 125 of 157 Remote Grafana Jobs

Software Developer

Hiring Organisation
Mustard Systems Ltd
Location
Slough, Berkshire, UK
Employment Type
Full-time
work, and Go for select infrastructure Tools: RabbitMQ and Kafka for messaging, PostgreSQL and Redis for data storage Environment: Linux servers Observability: OpenTelemetry, Prometheus, Grafana and Zabbix Requirements Must-Haves: Strong background in software development, with strong experience with Python A degree in Computer Science or a numerical subject from ...

Site Reliability Engineer

Hiring Organisation
Searchability
Location
Wigan, Lancashire, England, United Kingdom
Employment Type
Full-time
Salary
£65,000 - £70,000 per annum
preferred) * Cloud experience, ideally AWS, and knowledge of container orchestration (Kubernetes) and Infrastructure as Code (Terraform) * Experience with monitoring and observability tools such as Grafana, Prometheus or OpenTelemetry * Strong understanding of networking fundamentals and distributed systems * Ability to collaborate effectively with engineering, operations and product teams TO BE CONSIDERED: Please ...

Site Reliability Engineer

Hiring Organisation
Searchability (UK) Ltd
Location
Wigan, Greater Manchester, North West, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£70,000
preferred) * Cloud experience, ideally AWS, and knowledge of container orchestration (Kubernetes) and Infrastructure as Code (Terraform) * Experience with monitoring and observability tools such as Grafana, Prometheus or OpenTelemetry * Strong understanding of networking fundamentals and distributed systems * Ability to collaborate effectively with engineering, operations and product teams TO BE CONSIDERED: Please ...

Tech Lead

Hiring Organisation
Acorn Insurance
Location
Liverpool, Merseyside, North West, United Kingdom
Employment Type
Permanent, Work From Home
Salary
£85,000
Framework, MassTransit, Mediator Frontend: React, Next.js, TypeScript Infrastructure: Azure, Docker, Kubernetes (AKS), Nginx, RabbitMQ Architecture: Microservices, Event-driven patterns, Clean Architecture Observability and Monitoring: Grafana, Loki, Sentry, PostHog Tooling and Practices: Git, CI/CD pipelines, Agile methodologies What We're Looking For Proven experience leading software delivery within ...

Senior Software Engineer, Platform Observability Remote - Ireland

Hiring Organisation
Twilio
Location
Dublin, Ireland
Employment Type
Permanent
Salary
EUR 60,000 - 90,000 Annual
telemetry standards, efficient usage patterns, and scalable platform abstractions. Ability to make forward-looking technical decisions and lead others through ambiguity. Familiarity with ClickHouse, Grafana Loki, Athena, or equivalent systems for log and metrics querying. Contributions to open-source observability tools or communities. Experience building cost visibility or FinOps tooling ...

Full Stack Engineer

Hiring Organisation
Global Fintech Talent
Location
Zuid-Holland, Netherlands
Employment Type
Permanent
Salary
EUR Annual
GitHub Actions, GitLab CI, etc.). Managing cloud infrastructure (GCP/AWS) using Terraform. Working with Docker & Kubernetes (GKE), plus monitoring stacks like Datadog, Grafana, Prometheus. Implementing DevSecOps practices: IAM, secrets management, vulnerability scanning. Building and improving infrastructure in a setting where not every process exists yet - and where your … TypeScript, Python, or Bash. Kubernetes certifications (CKA/CKAD), Terraform Associate. Experience in fintech or other regulated environments. Knowledge of observability tooling (OpenTelemetry, Grafana, Prometheus). What's in it for you: Competitive salary based on experience (€50-80K). Participation in an equity/share certificate program. ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
London, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Nottingham, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Liverpool, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Southampton, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Glasgow, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Leicester, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Leeds, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Birmingham, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Bristol, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Woking, Surrey, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Shrewsbury, Shropshire, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Bedford, Bedfordshire, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Stevenage, Hertfordshire, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Plymouth, Devon, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Norwich, Norfolk, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Gloucester, Gloucestershire, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Wakefield, West Yorkshire, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Newport, Isle of Wight, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...

Site Reliability Engineer

Hiring Organisation
SS&C Technologies
Location
Wolverhampton, West Midlands, UK
Employment Type
Full-time
resolve incidents across services and infrastructure; reduce MTTR and prevent recurrences through high-quality post-incident actions. Observability as a first‐class practice: Use Grafana, Datadog, and Splunk (and related tools like Prometheus/OpenTelemetry) to detect anomalies, root cause issues, and create actionable alerts and dashboards. Run Kubernetes … chaos engineering). What you will bring 5+ years operating production systems as an SRE, DevOps engineer, or software engineer. Observability: Hands‐on with Grafana, Datadog, and Splunk for incident investigation, dashboarding, alerting, tracing/logs/metrics correlation, and performance analysis. Kubernetes: Strong experience running and troubleshooting workloads (controllers ...