Senior Site Reliability Engineer
Role/Job title: Senior Site Reliability Engineer
Work Location: London
Role type: Contract, Inside IR35
Mode of working: Hybrid
If Hybrid, how many days are required in office? 3 days/week
Duration of assignment: 6 Months
The Role
Our client is a leading professional services firm and one of the Big Four. We are looking for a seasoned Site Reliability Engineer to augment the client's team and support its strategy of driving products and technology into everything it delivers to accelerate business growth.
As a Site Reliability Engineer, you'll work as part of a team of problem solvers, helping to solve complex business issues from strategy to execution.
Your responsibilities:
The team covers a variety of responsibilities that are executed by DevOps/DevSecOps, Site Reliability and ML Ops Engineers, including:
- Defining standard release automation patterns for infrastructure and application components.
- Defining standard CI/CD pipeline patterns that include a common set of DevOps and SecOps tools.
- Developing reports and metrics to validate that existing and new pipelines use standard DevSecOps tools.
- Proactively optimizing redundancy, monitoring and alerting practices and patterns.
- Developing resilient and highly available cloud patterns.
- Infrastructure as Code development for building out cloud infrastructure.
- Build and release pipeline development.
- Secrets and configuration management.
- Monitoring systems and services, providing incident and emergency response to triage and resolve system or client issues.
- Managing the application ecosystem and improving platform infrastructure and applications for high reliability, resiliency, performance and quality.
- Supporting documentation, knowledge articles, and runbooks.
Your Profile
Essential skills/knowledge/experience:
- At least 4 years of relevant working experience.
- Advanced Kubernetes - Must have strong skills in Kubernetes at scale using one of GKE, AKS, EKS or RKE. Experience with kubectl and Helm.
- Containers: Experience deploying Java (Spring Boot) microservices in dockerized environments.
- Observability - Experience in setting up tools like Prometheus/Grafana, Datadog, AppDynamics and Splunk to give actionable insight into a microservice environment, including but not limited to synthetics, application performance monitoring, logging and alerting (PagerDuty/Opsgenie integrations).
- Good CI/CD expertise - Jenkins, Azure DevOps, GitHub Actions, Argo CD, Artifactory, Azure Container Registry, Google Container Registry and other similar tooling.
- SCM - Working with tools like GitHub/GitLab for source code management, as well as experience with branching strategies like GitFlow and trunk-based development.
- Strong troubleshooting skills - Be able to move all the way down to code level to give development teams a head start on application issues. Be able to contribute effectively to root cause analysis exercises after problem resolution.
- Good communication skills - Active listening, verbal and non-verbal communication, clarity and concision, confidence, open-mindedness, respect.
- Good documentation skills - Be able to document any automation and technical efforts effectively, so that a solution is easy to adopt.
- Good collaboration skills - Must be able to work effectively with Scrum/Dev teams with a push/pull (push back and prioritize work pulled in) philosophy in order to manage expectations and contribute to the stability and improvement of the platform.
Desirable skills/knowledge/experience:
- IaC - Terraform, Pulumi. Preferably has developed modules in the past rather than just using them.
- Security - Worked with encryption at rest and in transit patterns. Experience with tools like Azure Key Vault, HashiCorp Vault, Google KMS.
- Security - Experience with tools like Veracode and Black Duck for AppSec testing, Qualys scanners for infra testing and Twistlock/Aqua for container scanning.
- Automation - Must be able to identify toil and opportunities to reduce it within the team.
- Authentication/Authorization - Familiarity with AuthN/AuthZ schemes like OpenID, OAuth 2.0, SAML.
- Scripting and Programming - Experience with Python, PowerShell, Go, Java, Node.js.
- Event-Driven/Event Sourcing Patterns - Familiarity with distributed event streaming platforms like Kafka, Event Hubs, RabbitMQ and patterns like CQRS.
Company: Infoplus Technologies UK Limited
Location: United Kingdom
Employment Type: Permanent
Salary: GBP Annual
Posted: