Senior Site Reliability Engineer

Role/Job title: Senior Site Reliability Engineer

Work Location: London

Role type: Contract, Inside IR35

Mode of working: Hybrid

If Hybrid, how many days are required in office? 3 days/week

Duration of assignment: 6 Months

The Role

Our client is a leading professional services firm and part of the Big Four. We are looking for a seasoned Site Reliability Engineer to join the client's state-of-the-art team and support its strategy of driving products and technology into everything it delivers to accelerate business growth.

As a Site Reliability Engineer, you'll work as part of a team of problem solvers, helping to solve complex business issues from strategy to execution.

Your responsibilities:

The team covers a variety of responsibilities that are executed by DevOps/DevSecOps, Site Reliability and ML Ops Engineers, including:

  • Defining standard release automation patterns for infrastructure and application components.
  • Defining standard CI/CD pipeline patterns that include a common set of DevOps and SecOps tools.
  • Developing reports and metrics to validate that existing and new pipelines use standard DevSecOps tools.
  • Proactively optimizing redundancy, monitoring and alerting practices and patterns.
  • Developing resilient and highly available cloud patterns.
  • Infrastructure as Code development for building out cloud infrastructure.
  • Build and release pipeline development.
  • Secrets and configuration management.
  • Monitoring systems and services, providing incident and emergency response to triage and resolve system or client issues.
  • Managing the application ecosystem and improving platform infrastructure and applications for high reliability, resiliency, performance and quality.
  • Supporting documentation, knowledge articles, and runbooks.

Your Profile

Essential skills/knowledge/experience:

  • At least 4 years of relevant working experience.
  • Advanced Kubernetes: Must have strong skills in Kubernetes at scale using one of GKE, AKS, EKS or RKE. Experience with kubectl and Helm. - Worked on EKS with kubectl.
  • Containers: Experience deploying Java (Spring Boot) microservices in dockerized environments.
  • Observability: Experience setting up tools like Prometheus/Grafana, Datadog, AppDynamics and Splunk to give actionable insight into a microservice environment, including but not limited to synthetics, application performance monitoring, logging and alerting (PagerDuty/OpsGenie integrations). - Worked on Elasticsearch and OpsGenie integration.
  • Good CI/CD expertise: Jenkins, Azure DevOps, GitHub Actions, ArgoCD, Artifactory, Azure Container Registry, Google Container Registry and other similar tooling. - Worked on Jenkins, ArgoCD.
  • SCM - Working with tools like GitHub/GitLab for source code management, as well as experience with branching strategies like GitFlow and trunk-based development. - GitLab using GitFlow.
  • Strong troubleshooting skills: Be able to move all the way down to code level to give development teams a head start on application issues. Be able to contribute effectively to root cause analysis exercises after problem resolution.
  • Good communication skills - Active listening, verbal and non-verbal communication, clarity and concision, confidence, open-mindedness, respect.
  • Good documentation skills - Be able to effectively document any automation and technical efforts so that a solution is easy to adopt.
  • Good collaboration skills: Must be able to work effectively with Scrum/Dev teams with a push/pull (push back and prioritize work pulled in) philosophy in order to manage expectations and contribute to the stability and improvement of the platform.

Desirable skills/knowledge/experience:

  • IaC - Terraform, Pulumi. Preferably has developed modules in the past rather than just using them. - Terraform
  • Security: Worked with encryption-at-rest and encryption-in-transit patterns. Experience with tools like Azure Key Vault, HashiCorp Vault, Google KMS.
  • Security: Experience with tools like Veracode and Black Duck for AppSec testing, Qualys scanners for infra testing and Twistlock/Aqua for container scanning. - Qualys for OS vulnerability scanning and fixing.
  • Automation: Must be able to identify toil and opportunities to reduce it within the team.
  • Authentication/Authorization: Familiarity with AuthN/AuthZ schemes like OpenID, OAuth 2.0, SAML.
  • Scripting and Programming: Experience with Python, PowerShell, Go, Java, Node.
  • Event Driven/Event Sourcing Patterns: Familiarity with distributed event streaming platforms like Kafka, EventHub, RabbitMQ and patterns like CQRS.
Company
Infoplus Technologies UK Limited
Location
United Kingdom, UK
Employment Type
Part-time
Posted