Solace PubSub+ messaging. Strong knowledge of production support. Good understanding of WAN, networking and latency, etc. Solid knowledge of tools such as Grafana and Prometheus, etc. DevOps tooling experience would be ideal. Proficiency in troubleshooting message delivery, persistence, and topic routing, etc. Good Linux/Unix knowledge. Excellent communication skills.
in API design and interface technologies. Expertise with container platforms (e.g. Docker, or similar). Expertise in monitoring, debugging and code analysis (e.g. Splunk, Prometheus, Grafana or similar). Fast learner who is generous with their knowledge.
executing product roadmaps, from idea to launch and scale. Hands-on experience with telemetry data (logs, metrics, traces) and IT infrastructure monitoring (e.g., OpenTelemetry, Prometheus, ELK, Splunk, ITRS Geneos, Datadog, Dynatrace, etc.). Knowledge of AI/ML frameworks (TensorFlow, PyTorch, MLflow) and automation tools (Terraform, Ansible, ServiceNow ITSM).
us extra happy? Previous experience with billing or payment systems. Expertise in building and optimizing scalable, resilient distributed systems. Familiarity with Kubernetes and Prometheus for container orchestration and monitoring. Redis knowledge for caching and performance optimization. Experience with .NET Framework. Willingness to drive initiatives to upgrade to newer .NET …
applications • Knowledge of configuring software-based load balancing solutions. • Knowledge of end-to-end application monitoring, tracing and alerting using tools like CloudWatch, Grafana, Datadog, Prometheus, etc. • Understanding of security best practices. • Understanding of distributed computing environments and methodologies. About the team: Diverse Experiences. AWS values diverse experiences. Even if you …
HAProxy, Nginx) and network monitoring tools. Experience in DNS management and troubleshooting. Experience in network security best practices. Proficiency in monitoring and observability tools (Prometheus, Grafana, Splunk). Proficiency in at least one scripting language (Python, Bash) for automation. Experience with CI/CD pipeline management and DevOps practices. Strong … system performance. Experience with tools like df, du, lsblk, and fdisk for managing and troubleshooting file systems and disk partitions. Familiarity with tools like Prometheus and Grafana for monitoring and observability.
driven architectures. Deep understanding of data processing, analytics, and real-time event streaming. Expertise in PostgreSQL, AWS and Kubernetes. Proficiency in monitoring tools like Prometheus, Grafana, and Kibana. Knowledge of security best practices, including OAuth, JWT, and data encryption. Fluent in English with strong communication and collaboration skills. Thanks Sugan
London (Hammersmith), South East England, United Kingdom
OpenSource
infrastructure. Recruit and lead a growing team of data engineers. Tech Stack: Python (3.10+), Pandas, NumPy; PostgreSQL (TimescaleDB), SQL optimization; RabbitMQ, ZeroMQ, Linux servers; Prometheus, Grafana, Zabbix. Requirements: 5+ years of Data Engineering experience with expertise in Python and SQL. Proven leadership experience guiding teams and projects. Strong background in …
ability to interpret and apply control requirements in technical design contexts. • Hands-on experience with performance monitoring, alerting systems, and diagnostic tooling (e.g., Geneos, Prometheus, Grafana, AppDynamics, or similar tools). • Strong communication skills, with the ability to convey technical concepts to senior stakeholders and control partners. Desirable: • Experience in implementing or …
with the ability to work effectively in a team. Technologies we use: Golang; AWS (Lambda, SQS, EventBridge, DynamoDB, RDS, CDK, OpenSearch); GitHub, GitHub Actions; Prometheus, Grafana; event-driven architecture and domain-driven design. How we reward our team: Dynamic working environment with a diverse and driven team. Huge opportunity for …
Manchester, Lancashire, United Kingdom Hybrid / WFH Options
Tbwa Chiat/Day Inc
a SaaS or DBaaS environment. Strong understanding of cloud infrastructure components (e.g., compute, storage, networking) and their cost drivers. Experience with observability tools (e.g., Prometheus, Grafana, OpenTelemetry) and a deep understanding of monitoring and alerting best practices. Exceptional communication skills, capable of articulating complex technical concepts to diverse audiences. Demonstrated …
engineering tools (for example IDE shortcuts, shell scripting, browser Dev Tools). Experience with one or more of the following is a plus: Kubernetes, Prometheus, Argo Workflows, GitHub Actions, Elasticsearch/OpenSearch, PostgreSQL, BigQuery, DBT data pipelines, Fastly, Storybook, Contentful, Deno, Bun. Benefits: We want to give you a great …
and reliability across production and non-production environments. This includes working on incident response, capacity planning, WAN optimization, and system observability using tools like Prometheus and Grafana. Key Responsibilities: Administer and maintain Solace PubSub+ appliances and software brokers across environments (on-prem and cloud). Provide production support for … messaging-related incidents, including root cause analysis and resolution. Monitor system performance and health using Prometheus and Grafana; proactively identify and address anomalies. Configure and optimize Solace across WAN environments, ensuring low-latency, secure, and reliable messaging. Collaborate with development and application support teams to troubleshoot message flow issues and … in a 24x7 enterprise environment. Experience working with distributed systems over WAN, with an understanding of networking, latency, and failover strategies. Solid experience with Prometheus and Grafana for system monitoring and alerting. Proficiency in troubleshooting message delivery, persistence, and topic routing. Experience with capacity management, performance tuning, and system scaling.
and applications across their entire IT estate. You’ll help drive the vision, design and implementation of monitoring and observability systems including OpenTelemetry, Grafana, Prometheus and Splunk, etc. Working side by side with DevOps teams, you’ll also have the chance to work with containers and Kubernetes, OpenShift, Docker and … monitoring, DevOps and automation tools. Requirements: Excellent previous experience in a similar Observability/Monitoring role. Experience of engineering and supporting solutions (OpenTelemetry, Grafana, Prometheus, Splunk, etc.). Experience with tools such as Jenkins, Ansible or Puppet. Good knowledge of Linux and infrastructure support. Experience of CI/CD, Cloud (AWS …
Shell. Develop and implement CI/CD pipelines for application deployment on Kubernetes. Monitor the health of the platform and applications using tools like Prometheus, Grafana or ELK stack. Assist with capacity planning and load testing of the platform and applications. Develop and enforce best practices for building container-based … Experience with Infrastructure as Code (IaC) tools like Terraform. Familiarity with CI/CD tools like Argo CD, Jenkins, etc. Experience with monitoring tools like Prometheus, Grafana, ELK stack, etc. Strong scripting skills (Python, Bash, etc.). Ability to troubleshoot complex networking issues. BS degree in Computer Science, Engineering or a related …
Newcastle Upon Tyne, Tyne And Wear, United Kingdom
Accenture
DevOps Engineer. Location: Newcastle Upon Tyne. Please note: Due to the nature of client work you will be undertaking, you will need to be willing to go through a Security Clearance process as part of this role, which requires 5+ …
europe, Queen Street, City of Edinburgh, United Kingdom Hybrid / WFH Options
Bright Purple
remote either within the UK or in Europe. Your experience will cover:
* Go/Golang
* Redis - extensive, in-depth experience
* Grafana/Loki/Prometheus or other equivalent Observability/Monitoring technologies
* Developing refined dashboards for visualisation of observability and measurement data
* Experience with Cyber/Network Security analytics with …