Senior Site Reliability Engineer
Role Title: Senior Site Reliability Engineer
Location: Glasgow
Duration: Until 31/12/2026
Days on site: 2-3
MUST BE PAYE THROUGH UMBRELLA
Role Description:
As a Senior Site Reliability Engineer, you will play a pivotal role in raising awareness and driving
adoption of SRE methodologies. This is a hands-on engineering role where you will design,
build, and optimise automation frameworks, observability tools, and incident response
mechanisms. Engaging with storage, data, and other product teams, you will act as a trusted
advisor, providing strategic guidance and consultative support to help teams improve reliability,
scalability, and efficiency.
This team will establish a Centre of Excellence to enhance and promote SRE best practices.
To be successful in this role you should have:
• Proficiency in Programming and Scripting - This includes expertise in languages such as
Python, PowerShell, or Go, which are essential for automating routine tasks and system
deployments.
• Incident Management and Troubleshooting - The ability to manage incidents effectively,
troubleshoot issues swiftly, and perform root cause analysis to prevent future incidents.
• Systems Engineering and Automation - A deep understanding of systems engineering,
including operating systems, networking, and cloud infrastructure. Proficiency in
automation tools is crucial for maintaining system reliability at scale.
• Influential Communication Skills - The ability to communicate effectively with team
members and stakeholders, ensuring alignment, inspiring and motivating them to
embrace new mindsets, cultures, and SRE working practices. This skill is crucial for
driving meaningful change and fostering a collaborative environment where innovative
ideas can thrive.
Some other highly valued skills include:
• Knowledge of Cloud Computing - Familiarity with cloud platforms and services, which is
increasingly important as more infrastructure moves to the cloud.
• Strong Problem-Solving Abilities - The capability to approach problems methodically
and find effective solutions, which is vital for maintaining system reliability.
Purpose of the role
To apply software engineering techniques, automation, and best practices in incident response
to ensure the reliability, availability, and scalability of systems, platforms, and technology.
Accountabilities
• Availability, performance, and scalability of systems and services through proactive
monitoring, maintenance, and capacity planning.
• Response to, analysis of, and resolution of system outages and disruptions, and
implementation of measures to prevent similar incidents from recurring.
• Development of tools and scripts to automate operational processes, reducing manual
workload, increasing efficiency, and improving system resilience.
• Monitoring and optimisation of system performance and resource usage, identification
and resolution of bottlenecks, and implementation of best practices for performance tuning.
• Collaboration with development teams to integrate best practices for reliability,
scalability, and performance into the software development lifecycle, and with other
teams to ensure smooth and efficient operations.
• Awareness of industry technology trends and innovations, and active contribution to
the organisation's technology communities to foster a culture of technical excellence
and growth.
Expectations
• Contribute to or set strategy, drive requirements and make recommendations for change.
Plan resources, budgets, and policies; manage and maintain policies/processes; deliver
continuous improvements and escalate breaches of policies/procedures.
• Be a subject matter expert within your own discipline and guide technical direction. Lead
collaborative, multi-year assignments, guide team members through structured
assignments, and identify the need for the inclusion of other areas of specialisation to
complete assignments. Train, guide and coach less experienced specialists and provide
information affecting long-term profits, organisational risks and strategic decisions.
• Advise key stakeholders, including functional leadership teams and senior management,
on functional and cross-functional areas of impact and alignment.
• Manage and mitigate risks through assessment, in support of the control and
governance agenda.
• Demonstrate leadership and accountability for managing risk and strengthening
controls in relation to the work your team does.
• Demonstrate a comprehensive understanding of the organisation's functions to contribute
to achieving the goals of the business.
• Collaborate with other areas of work and business-aligned support areas to keep up to
speed with business activity and business strategies.
• Create solutions based on sophisticated analytical thought, comparing and selecting
complex alternatives. In-depth analysis with interpretative thinking will be required to
define problems and develop innovative solutions.
• Adopt and include the outcomes of extensive research in problem solving processes.
• Seek out, build and maintain trusting relationships and partnerships with internal and
external stakeholders in order to accomplish key business objectives, using influencing
and negotiating skills to achieve outcomes.
Company: eTeam
Location: Glasgow, UK