Site Reliability Engineer with Python (Hiring Immediately)
Our client is seeking a site reliability engineer to deploy, manage, troubleshoot, and enhance complex cloud-based internal tools and externally managed services for a diverse organization.
You should have 7 to 10 years of hands-on experience as a Site Reliability Engineer.
You will collaborate with IT, product, and engineering teams to maintain and improve these tools and services, troubleshooting issues as they arise.
The ideal candidate will use monitoring and data analysis, drawing on various IT and DevOps tools, to identify system weaknesses proactively and resolve them before they cause production issues.
Responsibilities
- Maintain internal apps and services, and restore them promptly after failures.
- Serve as the technical point of contact for two core platforms (mobile and web), engaging with IT support and engineering teams for problem-solving, issue resolution, and feature enhancements.
- Collaborate with internal teams and external vendors to ensure software quality, security, and performance standards are met.
- Create, update, and utilize documentation such as runbooks and playbooks.
- Automate existing workflows and develop new solutions for infrastructure, testing, and failover processes.
- Debug complex issues across web and mobile application stacks, advising stakeholders and implementing solutions as appropriate.
- Enhance CI/CD processes to improve release cycles and developer experience.
- Participate in daily and weekly development activities, including standups, sprint planning, retrospectives, and issue tracking.
- Lead critical post-mortem analyses and coordinate follow-up actions.
Qualifications
- 7+ years of experience in software engineering, development, or system operations.
- Proven experience debugging complex problems and implementing cost-effective solutions.
- Experience designing, building, and operating large-scale production systems.
- Deep knowledge of Python is preferred; experience with Java, Go, Rust, or similar languages is also considered.
- Proficiency with source control systems like Git and GitHub, including feature branching strategies.
- Experience with open-source databases such as MySQL, Postgres, Redis, etc.
- Knowledge of DevOps practices and of containerization and orchestration tools like Docker and Kubernetes.
- Experience with log monitoring and observability platforms like Sumo Logic or CloudWatch.
- Experience automating infrastructure, testing, and deployment processes using tools like CircleCI; infrastructure as code knowledge is a plus.
- Familiarity with AWS services; knowledge of Azure or Google Cloud is beneficial.
- Strong understanding of modern web and mobile application development, with hands-on experience in JavaScript (TypeScript preferred) and Python stacks.
- Experience working with cross-functional teams, including engineers, UX/product designers, and external stakeholders.
- Ability to manage scope, risk, quality, and timelines effectively.
- Focus on quality, security, performance, and end-user experience.
This position offers an exciting opportunity with an organization based in Central London or New York. The salary range is approximately £80K - £100K. Please send your CV in Word format, along with your salary expectations and notice period.
- Company
- Jas Gujral
- Location
- London, UK
- Employment Type
- Full-time
- Posted