226 to 250 of 300 Permanent Grafana Jobs

Principal Engineer

Hiring Organisation
Motive Group
Location
City of London, London, United Kingdom
Infrastructure-as-Code (Terraform) and configuration management tools (Ansible, Puppet, or similar). Strong observability experience using tools like Prometheus/Mimir, Loki, Tempo, Grafana, Alertmanager. Experience deploying and operating large-scale GPU clusters or HPC systems (Ideally). Working knowledge of ML infrastructure and familiarity with GPU drivers, CUDA ...

Principal Engineer

Hiring Organisation
Motive Group
Location
London Area, United Kingdom
Infrastructure-as-Code (Terraform) and configuration management tools (Ansible, Puppet, or similar). Strong observability experience using tools like Prometheus/Mimir, Loki, Tempo, Grafana, Alertmanager. Experience deploying and operating large-scale GPU clusters or HPC systems (Ideally). Working knowledge of ML infrastructure and familiarity with GPU drivers, CUDA ...

Reliability Engineer

Hiring Organisation
City Elite Transaction Services Ltd
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£100,000 - £130,000 per annum
cause analysis 24/7 production support experience in enterprise environments WAN/distributed systems expertise across multi-region deployments Proficiency with Prometheus/Grafana (or equivalent monitoring tools like Geneos, Splunk) Strong Linux/scripting skills (Bash/Python) Financial services background preferred The Role: Administer Solace appliances/ ...

Messaging Engineer

Hiring Organisation
Ncounter LTD
Location
East London, London, United Kingdom
Employment Type
Permanent
issues across application, network, and infrastructure layers Clear communication skills and the ability to collaborate across engineering teams Useful Extras Experience with Prometheus or Grafana Knowledge of Terraform, Ansible, or similar infrastructure as code tools If you are a practical engineer who enjoys owning and improving Solace-based messaging platforms ...

Technical Solutions Engineer - Deep-Tech AI Start-up

Hiring Organisation
Urban Digital Recruitment Ltd
Location
London Area, United Kingdom
+ edge devices Troubleshoot end-to-end: AI model behaviour, device integrations, networks, cloud infra, on-device performance Analyse logs, metrics and telemetry (Grafana, Metabase) to pinpoint root cause Work hands-on with Linux, SQL, Docker, AWS/GCP/Azure Lead pilots, rollouts and on-device testing across major ...

Technical Solutions Engineer - Deep-Tech AI Start-up

Hiring Organisation
Urban Digital Recruitment Ltd
Location
City of London, London, United Kingdom
+ edge devices Troubleshoot end-to-end: AI model behaviour, device integrations, networks, cloud infra, on-device performance Analyse logs, metrics and telemetry (Grafana, Metabase) to pinpoint root cause Work hands-on with Linux, SQL, Docker, AWS/GCP/Azure Lead pilots, rollouts and on-device testing across major ...

Senior Software Engineer

Hiring Organisation
SEEKR
Location
London Area, United Kingdom
JavaScript, C# Frameworks: React, Next.js, Node.js, .NET Testing: Vitest, Playwright, Pact, K6 Datastores: PostgreSQL, CosmosDB, Redis DevOps: GitHub Actions, Azure, Kubernetes, Docker, Terraform Monitoring: Grafana, Azure App Insights What’s On Offer £80,000–£100,000 salary, hybrid working (central London office), benefits, and a mandate to help shape ...

Senior Software Engineer

Hiring Organisation
SEEKR
Location
City of London, London, United Kingdom
JavaScript, C# Frameworks: React, Next.js, Node.js, .NET Testing: Vitest, Playwright, Pact, K6 Datastores: PostgreSQL, CosmosDB, Redis DevOps: GitHub Actions, Azure, Kubernetes, Docker, Terraform Monitoring: Grafana, Azure App Insights What’s On Offer £80,000–£100,000 salary, hybrid working (central London office), benefits, and a mandate to help shape ...

Solace Administrator

Hiring Organisation
BGC Group
Location
City of London, London, United Kingdom
across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging-related … incidents, including root cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams to troubleshoot message flow issues and integration problems. ...

Solace Administrator

Hiring Organisation
BGC Group
Location
London Area, United Kingdom
across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for messaging-related … incidents, including root cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams to troubleshoot message flow issues and integration problems. ...

Solace Administrator

Hiring Organisation
BGC Group
Location
Slough, Berkshire, UK
Employment Type
Full-time
cloud). Provide production support for messaging-related incidents, including root cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams … 24x7 enterprise environment. Experience working with distributed systems over WAN, with an understanding of networking, latency, and failover strategies. Solid experience with Prometheus and Grafana for system monitoring and alerting. Proficiency in troubleshooting message delivery, persistence, and topic routing. Experience with capacity management, performance tuning, and system scaling. Familiarity with ...

AWS DevOps / Platform Engineer - Start Up

Hiring Organisation
Robert Walters
Location
London, South East, England, United Kingdom
Employment Type
Full-Time
Salary
£70,000 - £110,000 per annum
high availability, and managed service utilisation. Collaborate with SRE, Security, and Engineering teams to improve observability, monitoring, and alerting using tools such as Prometheus, Grafana, and CloudWatch. Work closely with Security to embed best practices for IAM, secrets management, WAF, and governance. Optimise performance and cloud spend through automation … including scaling, deployment automation, and monitoring. Solid background in Linux systems administration, networking, and cloud security. Familiarity with observability tools such as Prometheus, Grafana, and Loki, and structured alerting practices. Experience with database migrations, high-availability configurations, backups, and disaster recovery. Strong scripting and automation skills (Terraform, Python, Bash ...

Platform Engineer

Hiring Organisation
SoCode Recruitment
Location
Cambridge, Cambridgeshire, UK
Employment Type
Full-time
enhancing autoscaling, high availability and managed service usage • Collaborate with SRE, Security and Engineering teams to strengthen observability, monitoring and alerting using Prometheus, Grafana and CloudWatch • Work closely with Security to embed best practice for IAM, secrets management, WAF and cloud posture management • Optimise performance and cloud spend through automation … including cluster scaling, deployment automation and monitoring • Solid background in Linux administration, networking and cloud security principles • Familiarity with observability tools such as Prometheus, Grafana and Loki along with structured alerting practices • Experience with database migrations, high availability configurations, backups and disaster recovery • Strong scripting and automation skills using Terraform ...

Need Lead NodeJS & API

Hiring Organisation
AETG Services PVT LTD
Location
Dallas, Texas, United States
Employment Type
Any
Salary
USD Annual
load balancing, failover mechanisms, and distributed architectures to improve fault tolerance. Monitoring & Observability: Set up and maintain real-time monitoring and alerting using tools like Prometheus, Grafana, ELK stack, Datadog, or New Relic. Ensure comprehensive logging, tracing, and metrics collection (e.g., through OpenTelemetry, Jaeger, or Zipkin) to provide visibility into system health. Proactively … using Node.js. Experience in building highly available, fault-tolerant systems that can handle production-level traffic. Proficiency in monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, Datadog, New Relic). Experience with resilience patterns such as circuit breakers, retry logic, and rate limiting. Deep understanding of API security best practices (OAuth2 ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Buda, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Taylor, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Pflugerville, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Leander, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Austin, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Kyle, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Hutto, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Papillion, Nebraska, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Omaha, Nebraska, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Southlake, Texas, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...

Observability Pipeline Engineer - Hybrid

Hiring Organisation
Charles Schwab
Location
Bellevue, Iowa, United States
Employment Type
Permanent
Salary
USD Annual
Linux administration; Proficient in Kafka administration, including installing software, modifying configuration files, and agent management. Highly efficient multi-tasker and great organization skills. Splunk, Grafana, and Datadog experience a plus. Duties will include: On-boarding new Kafka producer and consumer use cases. Engineering and supporting the enterprise telemetry pipeline Testing … Kafka components, replication factors, and partitioning. Experience engineering logging platforms Understanding of telemetry monitoring platforms and concepts, like ITRS Geneos, OpenTelemetry agents like Grafana Alloy. Grafana Cloud and Datadog. Deep understanding of security protocols: SSL/TLS, SASL, LDAP, etc. and role-based authentication. Experience working in telemetry monitoring ...