Site Reliability Engineer (SRE)

We are seeking a Site Reliability Engineer (SRE) to design, build, and maintain highly available, resilient, and scalable systems. You will collaborate closely with engineering, product, and operations teams to keep our Java/Spring Boot applications running smoothly 24/7 in a cloud environment. You will also drive the adoption of analytics to optimize system performance and extract value from operational data.

Key Responsibilities

  • Reliability & Scalability: Design, implement, and maintain systems that are robust, scalable, and highly available, supporting millions of daily transactions.
  • Cloud Migration: Lead and support the migration of applications and infrastructure to public cloud platforms, following best practices in security, reliability, and cost management.
  • Automation & Infrastructure as Code: Develop and maintain automation scripts and infrastructure as code using Kubernetes and Terraform.
  • Monitoring & Incident Response: Build and enhance monitoring, alerting, and observability solutions. Respond to incidents, perform root cause analysis, and drive continuous improvement.
  • Collaboration: Partner with software engineers, product managers, and business stakeholders to deliver solutions that meet business needs and operational requirements.
  • Analytics & Data Insights: Use cloud-based analytics tools to monitor system health, optimize performance, and extract actionable insights from operational data.
  • Continuous Improvement: Identify and implement opportunities to improve the reliability, efficiency, and scalability of the platform.

Required Qualifications

  • Proven experience as a Site Reliability Engineer, DevOps Engineer, or in a similar role supporting large-scale, mission-critical systems.
  • Strong hands-on experience with Kubernetes and Terraform.
  • Experience deploying and operating applications in public cloud environments (AWS, Azure, or GCP).
  • Solid understanding of Java and Spring Boot applications.
  • Experience with monitoring, logging, and observability tools (Prometheus, Grafana, ELK, Splunk).
  • Strong troubleshooting and problem-solving skills.
  • Excellent communication and collaboration skills.

Preferred Qualifications

  • Experience in financial services or payments/transaction processing environments.
  • Familiarity with cloud-based analytics platforms and data engineering concepts.
  • Experience with CI/CD pipelines and automation tools (Jenkins, GitHub Actions).
  • Knowledge of security best practices in cloud environments.

Job Details

Company
KBC Technologies UK LTD
Location
Bournemouth, Dorset, England, United Kingdom
Employment Type
Full-Time
Salary
Competitive salary
Posted