Birmingham, West Midlands, United Kingdom Hybrid/Remote Options
Inspire People
Edinburgh or Belfast. About the Role As a Senior Site Reliability Engineer, you will: - Build and scale DBT's product platform and services in AWS. - Provide development teams with observability, monitoring, CI/CD pipelines and service-level objectives. - Participate in an on-call rota (with allowance), helping to keep DBT services resilient and reliable. - Mentor junior engineers and contribute …
tools such as Airflow ● Experience with cloud platforms (AWS, GCP, or Azure) and infrastructure-as-code tools (e.g., Docker, Terraform, CloudFormation). ● Familiarity with data quality, data governance, and observability tools (e.g., Great Expectations, Monte Carlo). ● Experience with BI and data visualization tools (e.g., Looker, Tableau, Power BI). ● Experience working with product analytics solutions (Amplitude, Mixpanel) ● Experience …
Warwick, England, United Kingdom Hybrid/Remote Options
Ocho
in Git, SQL optimisation, and async architecture. Excellent communicator who values clarity, documentation, and collaboration. Nice to Have Experience with Supabase, Kubernetes, Docker, Azure, GitHub Actions, vector databases, or observability tools like Prometheus, Grafana, and Langfuse. What Success Looks Like 3 months: You’ve established your 1:1 rhythm, shipped your first automation workflow, and built a trusted partnership …
with React, Vue, or Blazor Integrate LLMs and GenAI features into core product experiences Lead technical decision-making and mentor engineers within your squad Ensure best practices across testing, observability, and code quality What We’re Looking For Proven experience delivering AI/ML-powered production systems (not prototypes) Strong full-stack capability – C# .NET + modern JavaScript frameworks Solid …
Worcester, England, United Kingdom Hybrid/Remote Options
Chapman Tate Associates
Infrastructure as Code (Terraform, Bicep, PowerShell). Solid grasp of Azure security and identity management in line with Zero Trust principles. Experience with CI/CD pipelines, monitoring, and observability tools such as Azure Monitor and Log Analytics. Excellent communication skills, stakeholder engagement, and a proactive approach to problem-solving. Strategic mindset with the ability to balance hands-on delivery …
experience with LLMs/GenAI/ML in production Strong background in C#, .NET, REST APIs, and cloud platforms (Azure, AWS, or GCP) Agile mindset with focus on testing, observability, and secure delivery Excellent communication and cross-functional collaboration skills Nice to have Experience with vector databases, RAG systems, or multi-agent AI Python skills for AI/ML development
West Midlands (County), Birmingham, United Kingdom Hybrid/Remote Options
Sherborne Talent Solutions
automation, and optimisation of CI/CD pipelines to drive speed, reliability, and consistency. Manage and optimise Azure infrastructure for scalability, security, performance, and cost control. Champion modern monitoring, observability, and incident management practices to maintain high availability. Partner with engineering, architecture, and product leadership to accelerate delivery and reduce operational friction. Drive adoption of FinOps principles to balance technical …
ensure efficient delivery of software updates. Senior DevOps Engineer (in addition to above) Contribute to the architecture and evolution of our cloud infrastructure strategy. Drive best practices for automation, observability, and security within DevOps. Mentor and coach junior team members, supporting their technical growth. Evaluate new technologies and tools to improve operational efficiency. Champion continuous improvement across our delivery pipelines … GitHub Actions, or Jenkins). Experience with Infrastructure as Code (Terraform, Bicep, or ARM templates). Proficiency in scripting languages (PowerShell, Bash, or Python). Experience with monitoring and observability tools (e.g., Application Insights, Grafana, Prometheus). Understanding of containerisation and orchestration (Docker, Kubernetes). Familiarity with security best practices in cloud environments. Desirable Experience within SaaS or FinTech environments.
for leading and executing the migration of data, dashboards, alerts, and configurations from Splunk systems to Elasticsearch. This role involves deep technical expertise in Splunk architecture, data ingestion, and observability tools, along with strong project management and stakeholder communication skills. Must have skills: -Splunk -ELK Stack -Kibana Nice to have skills: -stakeholder communication skills -strong project management Responsibilities: Minimum number …
Birmingham, England, United Kingdom Hybrid/Remote Options
EML
deployment processes with a focus on minimizing security risks. Site Reliability Engineering (SRE): Ensure system reliability, scalability, and performance through proactive monitoring and secure incident response. Develop and implement observability tools to monitor system health, detect anomalies, and identify security threats. Perform root cause analysis and implement solutions to prevent recurring issues, including security vulnerabilities. Define and measure Service Level …
Birmingham, West Midlands, United Kingdom Hybrid/Remote Options
ByteHire
or communicating with robotic automation systems and integrating with physical devices Desktop app development with Electron CI/CD setup, rollback strategies, and deployment automation Sentry, NewRelic, or other observability tooling implementation
United Kingdom, Birmingham, West Midlands (County)
Uniting Ambition
with MLOps practices and AI development frameworks (e.g., Azure AI, LangChain, Hugging Face). Relevant certifications in Azure Architecture, Data, or AI disciplines. Knowledge of automation tools, monitoring, and observability platforms. If you have these skills and would like to find out more, please apply now.
Birmingham, West Midlands, United Kingdom Hybrid/Remote Options
Robert Walters
to improve performance Develop strategies to improve performance across group technology DevOps Lead: Experience Technical depth across but not limited to: Java, UNIX, Linux, Middleware, WebLogic, Cloud Platforms Observability tools Designing/Developing/Implementing technology advancements Experience of improving resilience of complex production environments The permanent opportunity for a DevOps Lead will pay a salary range of …
next-generation AI products. You’ll join a small, experienced team developing an internal Kubernetes-based platform that enables AI innovation across the organisation, automating everything from deployments to observability, and helping developers build smarter applications with confidence. What you’ll be doing: Designing, deploying, and maintaining Azure Kubernetes (AKS) environments Managing Infrastructure as Code with Terraform and improving GitOps … workflows (ArgoCD/GitHub Actions) Building observability and monitoring stacks using Prometheus, Grafana, and Loki Supporting AI workloads (LLMs, RAG, and document processing applications) running on Kubernetes Automating platform operations with Python, Go, and shell scripting Implementing security guardrails, PII compliance tooling, and best practices for production AI systems What you’ll need: 3+ years’ experience in DevOps or Platform … Engineering Strong background in Azure and Kubernetes Hands-on experience with Terraform, CI/CD, and container orchestration Familiarity with observability tools (Prometheus, Grafana, Loki) Scripting or programming skills in Python or Go Interest in AI infrastructure, LLMOps, or large language model deployment
across the organization. What you’ll be doing: Building and maintaining a Kubernetes-hosted AI platform (AKS) Deploying and managing LLMOps tools such as LiteLLM, Langflow, and Langfuse Implementing observability with Prometheus, Grafana, and Loki Managing infrastructure through Terraform, ArgoCD, and GitHub Actions Supporting internal AI applications including RAG, document processing, and internal AI assistants What you’ll need … years in Platform or DevOps Engineering (Azure preferred) Strong experience with Kubernetes, Docker, and Terraform Programming or scripting skills in Python or Go Familiarity with GitOps, Helm, and observability tools A learning mindset and interest in LLM operations
A fast-growing technology business is developing advanced software for accounting, payroll, tax, and practice management. With a strong engineering foundation and a clear commercial vision, the company is now expanding its focus on artificial intelligence to transform how professional …
act? This is a chance to design and deliver agentic AI systems on Azure that automate real business workflows through tool use, retrieval, and reasoning, with the reliability and observability of true production engineering. In this position you’ll take ownership of designing and scaling end-to-end agentic solutions on Azure, combining LLMs, APIs, and orchestration frameworks to deliver … Productionise on Azure using AI Foundry/OpenAI, Azure ML, Functions, Event Grid/Service Bus, and Kubernetes. Build LLMOps pipelines for evaluation, monitoring, safety, and cost control. Define observability standards across prompts, tools, and data flows. Establish governance patterns, safety, privacy, and auditability. Stay hands-on with critical code paths while guiding architecture and best practice. 🧠 Required Skills/ …
patterns where appropriate Ensure APIs are well-documented using OpenAPI/Swagger standards Build and maintain a developer portal for internal and external API consumers Quality & Operations Implement comprehensive observability including logging, monitoring, and alerting Design for reliability, fault tolerance, and graceful degradation Optimize API performance, scalability, and cost efficiency Write clean, maintainable code with thorough testing and documentation Configure … and modern security patterns Testing mindset - you write unit tests and understand integration testing API documentation experience using OpenAPI/Swagger and maintaining developer portals Production systems mindset covering observability, reliability, and operational excellence Architectural thinking - ability to design systems for scale, security, and evolution Keywords RESTful APIs C# .Net Azure AI LLM ML Machine Learning SaaS Scale Up OAuth
Birmingham, England, United Kingdom Hybrid/Remote Options
Amberes
product features. You will move fast from concept to customer, working across the stack to design APIs, build front-end interfaces, integrate AI models, and ensure performance, reliability, and observability in production. Key Responsibilities Build and ship AI-driven features end-to-end, from prototype to production Design, implement, and maintain inference services with strong observability Develop and optimise retrieval …