Senior Site Reliability Engineer

Role/Job title: Senior Site Reliability Engineer

Work Location: London

Role type: Contract, Inside IR35

Mode of working: Hybrid

Days required in office: 3 days/week

Duration of assignment: 6 Months

 

The Role

Our client is a leading professional services firm and one of the Big Four. We are looking for a seasoned Site Reliability Engineer to join its team and support its strategy of building products and technology into everything it delivers in order to accelerate business growth.

As a Site Reliability Engineer, you'll work as part of a team of problem solvers, helping to solve complex business issues from strategy to execution.

 

Your responsibilities:

The team covers a range of responsibilities carried out by DevOps/DevSecOps, Site Reliability and MLOps Engineers, including:

  • Defining standard release automation patterns for infrastructure and application components.
  • Defining standard CI/CD pipeline patterns that include a common set of DevOps and SecOps tools.
  • Developing reports and metrics to validate that existing and new pipelines use standard DevSecOps tools.
  • Proactively optimizing redundancy, monitoring and alerting practices and patterns.
  • Developing resilient and highly available cloud patterns.
  • Infrastructure as Code development to build out cloud infrastructure.
  • Build and release pipeline development.
  • Secrets and configuration management.
  • Monitoring systems and services, providing incident and emergency response to triage and resolve system or client issues.
  • Managing the application ecosystem and improving platform infrastructure and applications for high reliability, resiliency, performance and quality.
  • Supporting documentation, knowledge articles, and runbooks.

 

Your Profile

Essential skills/knowledge/experience:

  • At least 4 years of relevant working experience.
  • Advanced Kubernetes - Must have strong skills in Kubernetes at scale using one of GKE, AKS, EKS or RKE. Experience with kubectl and Helm.
  • Containers: Experience deploying Java (Spring Boot) microservices in dockerized environments.
  • Observability - Experience in setting up tools like Prometheus/Grafana, Datadog, AppDynamics and Splunk to give actionable insight into a microservice environment, including but not limited to synthetics, application performance monitoring, logging and alerting (PagerDuty/OpsGenie integrations).
  • Good CI/CD expertise - Jenkins, Azure DevOps, GitHub Actions, ArgoCD, Artifactory, Azure Container Registry, Google Container Registry and other similar tooling.
  • SCM - Working with tools like GitHub/GitLab for source code management, as well as experience with branching strategies like GitFlow and trunk-based development.
  • Strong troubleshooting skills - Be able to move all the way down to code level to give development teams a head start on application issues. Be able to contribute effectively to root cause analysis exercises after problem resolution.
  • Good communication skills - Active listening, verbal and non-verbal communication, clarity and concision, confidence, open-mindedness, respect.
  • Good documentation skills - Be able to document automation and other technical work effectively so that solutions are easy to adopt.
  • Good collaboration skills - Must be able to work effectively with Scrum/Dev teams with a push/pull (push back and prioritize work pulled in) philosophy in order to manage expectations and contribute to the stability and improvement of the platform.

 

Desirable skills/knowledge/experience:

  • IaC - Terraform, Pulumi. Preferably has developed modules in the past rather than just using them.
  • Security - Worked with encryption at rest and in transit patterns. Experience with tools like Azure Key Vault, HashiCorp Vault, Google KMS.
  • Security - Experience with tools like Veracode and Black Duck for AppSec testing, Qualys scanners for infra testing and Twistlock/Aqua for container scanning.
  • Automation - Must be able to identify toil and opportunities to reduce that within the team.
  • Authentication/Authorization - Familiarity with AuthN/AuthZ schemes like OpenID, OAuth 2.0, SAML.
  • Scripting and Programming - Experience with Python, PowerShell, Go, Java, Node.js.
  • Event-Driven/Event Sourcing Patterns - Familiarity with distributed event streaming platforms like Kafka, EventHub, RabbitMQ and patterns like CQRS.

Company: Infoplus Technologies UK Limited

Location: United Kingdom

Employment Type: Permanent

Salary: GBP Annual

Posted