Oxfordshire, South East, United Kingdom Hybrid / WFH Options
La Fosse Associates Ltd
Monitoring and Observability Engineer Salary - £50,000 - £55,000 - Fully remote role! Principal Accountabilities Design, implement, and manage monitoring solutions to ensure the availability, performance, and reliability of our systems. Collaborate with cross-functional teams to understand system requirements and implement effective monitoring strategies. Utilise expertise in LogicMonitor, OpenSearch … Proficient experience with other monitoring tools such as Dynatrace, New Relic, Splunk, Datadog, Nagios, Prometheus etc. Take ownership of the development of monitoring and observability practices Benefits include: 25 days holiday + statutory Competitive pension match Car allowance Family health care more »
collaborate across product focussed Agile engineering teams to ensure the reliability, availability and performance of client facing services. Responsibilities will include managing and configuring observability platforms such as DataDog and PagerDuty to provide proactive monitoring of production (and other) environments, design and implementation of automation processes to drive efficiencies, leading … similar SRE/Site Reliability Engineer position You have experience of running 24x7 services in the public cloud - Azure preferred You have experience with observability tools such as DataDog and PagerDuty You have a good knowledge of Containerisation - Kubernetes, AKS You have strong scripting skills for automation, PowerShell or Python more »
pipelines with Jenkins, GitLab CI/CD or similar - Containerize applications using Docker and orchestrate with Kubernetes - Monitor systems with Prometheus, Grafana and apply observability best practices - Automate deployment processes and improve DevOps workflows - Ensure high availability, fault tolerance and disaster recovery of cloud infrastructure - Collaborate with developers and operational more »
Cheltenham, England, United Kingdom Hybrid / WFH Options
Northrop Grumman
the support for live (mission critical) systems, working with customers to fault find and resolve issues within strict time constraints. Experience using Industry standard observability tooling (ELK, Grafana, Prometheus), creating/maintaining these environments is a plus. You will have a strong understanding & navigation of both Windows and Linux operating more »
infrastructure as code. Implement and maintain CI/CD pipelines using GitLab CI/CD and Jenkins. Manage and monitor SRE systems, including log observability, Application Performance Monitoring (APM), infrastructure monitoring, and security. Proficient in working with Kubernetes for container orchestration and management. Experienced with AWS Cloud services and infrastructure more »
understanding of Google Cloud (GCP) Deep understanding of SRE ethos and principles Vast amounts of Terraform experience Solid experience with Python Solid experience of Observability tooling. Good experience in dashboard creation/data visualisation using tools such as Google Looker, or Grafana Strong CI/CD experience Strong containerisation experience more »
Digital Operations team, focusing on exceptional support and strategic product advancements. Strategic oversight: Spearhead critical monitoring and response initiatives aligned with best practices in Observability and Site Reliability Engineering. Innovation and improvement: Continuously seek innovative ways to enhance our support processes, integrating cutting-edge technology solutions and refining our incident more »
understanding of web development technologies, including PHP, MySQL, HTML, CSS, and JavaScript. Mastery of PHP 8.2 and Laravel 9+, emphasising a DevOps mindset, including observability, monitoring, and alerts. Proficiency in working with APIs and integrating third-party services. Excellent problem-solving skills, with an ability to troubleshoot application issues and more »
of web development technologies, including PHP, MySQL, HTML, CSS, and JavaScript. Practical experience with PHP 8.2 and Laravel 9+, emphasising a DevOps mindset, including observability, monitoring, and alerts. Proficiency in working with APIs and integrating third-party services. Excellent problem-solving skills, with an ability to troubleshoot application issues and more »
Complexio is Foundational AI. This works to automate business activities by ingesting whole company data – both structured and unstructured – and making sense of it. Using proprietary models and algorithms Complexio forms a deep understanding of how humans are interacting and more »
Java/Kotlin) Mobile Development understanding (Swift/Kotlin) Strong OOP/Data Structure/Design Patterns understanding Cloud computing knowledge Understanding of cloud observability concepts (logging, monitoring, alerting etc.) This role is hybrid, with the office based in London. The successful hire will report to the Chief Product Officer. more »
Greater London, England, United Kingdom Hybrid / WFH Options
Axate
Left” approach, where testing happens early in the development process. Continuous Testing: Make testing continuous and automated throughout the software development lifecycle. Monitoring and Observability: Ensure monitoring tools are in place to catch issues in production. Early Involvement: Participate in brainstorming sessions and requirements discussions to ensure that test processes more »
required: Strong Cloud experience with AWS and AWS Services Containerisation/Orchestration with Kubernetes Strong understanding of IaC with Terraform Wealth of Monitoring and Observability experience Knowledge of Security/DevSecOps practices This position can offer £90-110K, plus benefits, and operates a hybrid working model (with 3 office more »
Define and follow software standards and processes from peer code reviews to coding standards Follow best DevOps and DevSecOps practices, to ensure successful delivery, observability, operation and security of software in production Work with test and operations teams to troubleshoot and resolve issues. Write unit and automated functional tests. Ensure more »
Surrey, England, United Kingdom Hybrid / WFH Options
Roc Search
new services and features is optimal in the context of their tech ecosystem, considering various functional and non-functional attributes, such as performance, availability, observability, security and cost. You will maintain a strategic vision, ensuring that a fast-paced development cycle converges on your preferred target architecture. You will become more »
approach ensures high-quality code, fosters knowledge sharing, and strengthens our collective expertise You play a pivotal role in driving automation, fine-tuning, enhancing observability, and ensuring reproducibility across our platform. Your contributions are instrumental in maintaining the platform's excellence and reliability Key Requirements: At least 3 years of more »
enhancing efficiency. Enforce adherence to digital principles, ensuring the integrity, security, and compliance of solutions while meeting both functional and non-functional requirements. Embed observability into solutions, monitoring production performance, resolving incidents, and addressing underlying risks and issues. Advocate for client requirements while maintaining discretion and confidentiality. Standardise best practices more »
designing, developing, and deploying applications based on microservices. Event-Driven Systems: Hands-on experience with Apache Kafka or similar distributed messaging systems. Monitoring and Observability: Familiarity with monitoring tools like Prometheus, Grafana, or Victoria Metrics. Database Technologies: Experience working with various databases, including: - TSDB: InfluxDB, TimescaleDB - GDB: Dgraph, Neo4j This more »
shape how everything runs at THINKalpha and be a leading voice in how we work and build our infrastructure. Your Work Configure and maintain observability tooling with Datadog and PagerDuty (Slack channels) Contribute to our IaC codebase by creating and maintaining Terraform and Ansible modules, and participate in the review … tools. Experience with both on-premise/colocated servers as well as cloud infrastructure, and hybrid deployments spanning both types of environments. Experience with observability platforms (e.g., DataDog) and alarm systems (e.g., PagerDuty) >Nice to have< Coding background in at least one language (Node, JavaScript, Python, C++, etc) Understanding of more »
Are you a visionary problem-solver with the ability to transform legacy observability setups into cutting-edge systems? Do you excel at designing innovative solutions that drive business value? We're on the hunt for a talented Kubernetes/Monitoring Solutions Architect based in the UK to lead our team in … revolutionizing our observability infrastructure within a data analytics company. The role is fully remote and may require some out-of-hours work to align with time zone differences. Kubernetes & Monitoring Architect Responsibilities: Assess and comprehend existing legacy observability tools and infrastructure prevalent in the business, including Splunk, AppDynamics, Cribl, Zabbix … ThousandEyes, and ServiceNow Event Management Collaborate closely with cross-functional teams to define requirements and objectives for future observability solutions, with a keen focus on noise reduction, seamless integration of business context, and harnessing the power of AIOps/Self-Healing capabilities Craft and articulate innovative designs for more »
the lead on projects to improve our DevOps: CI/CD pipeline (vulnerability scanning, static analysis, tests), blue/green deploys, auto load balancing, observability & instrumentation, infrastructure as code (eg Terraform) etc. Take the lead on projects to refactor our codebase, separating domain-specific logic, application logic and UI code more »
Greater London, England, United Kingdom Hybrid / WFH Options
Overcast HQ
years of real-world application of these concepts in a DevOps position. AWS Cloud skills & best practices Infrastructure as code CloudFormation Templates Continuous delivery, Observability (Application Performance Monitoring) Configuration management (Infrastructure as a Service) AWS product experience in high-levels Cloudwatch EC2, Lambda Containers - Docker, AWS ECR IT Operations & Production more »
management, and the prowess of cloud-native solutions. In your pursuit of continuous improvement, you're not solely reliant on metrics; you dive into observability metrics and user feedback, steering our technical progress with insightful analysis. Staying ahead is not just a practice; it's inherent. You're not merely more »
will involve ensuring that the design of new cloud services and features is optimal, considering various functional and non-functional attributes, including performance, availability, observability, security and cost. Whilst remaining hands-on, you’ll also maintain a strategic vision, ensuring that a fast-paced development cycle converges on your chosen more »