Site Reliability Engineer
SRE is part of a global organization that leverages the latest technology to communicate with colleagues worldwide. We organize ourselves into distributed teams, and SRE teams are anchored to iManage offices across the globe. Tuesdays and Thursdays are dedicated to in-office collaboration, rapid innovation, and developing a sense of belonging at iManage. Mondays and Fridays are reserved for (remote-friendly) focus time to get things done. Have the best of both work styles in a workplace that is intentional about belonging, collaboration, and accomplishment.
Being a Site Reliability Engineer at iManage Means
You are an engineer, a builder, and a systems thinker. You’ll create middleware and platform guardrails that empower developers to innovate quickly and reliably. You combine technical depth with empathy to eliminate customer pain, especially when working with enthusiastic teams stewarding the world’s most privileged data. You uplift those around you, act as a subject matter expert, and drive change. You chase contributing factors over root causes, value code over documentation, and documentation over process. You’ll engage in architectural discussions, reduce toil, and deliver scalable, resilient platforms that support our customers and organization.
As an SRE, you’ll help scale our cloud platform, collaborate across teams to promote standardization and resiliency, and participate in on-call rotations. You’ll be a key voice in observability, change management, and service scalability.
iManage is experiencing explosive growth in its flagship cloud product. We’re seeking software and systems engineers specializing in reliability and platform services to join our transformative cloud journey. This requires rethinking technical decisions with a beginner’s mindset and a focus on resilience and sustainability. If you write code, think in systems, embrace complexity and automation, and are passionate about service resilience and scalability, we want to talk to you.
Here is what one of our leaders, Principal Site Reliability Engineer Will Lucas, has to say about the role:
"Being part of the SRE team means working at the heart of reliability and scalability, where every decision impacts thousands of users worldwide. We focus on building resilient systems, automating processes, and ensuring security at every layer. What makes iManage special is the culture of collaboration and trust. We’re empowered to innovate, solve complex challenges, and continuously improve. If you’re passionate about reliability and love working in an environment that values both technical excellence and teamwork, iManage is the place to be!"
iM Responsible For
- Eliminating toil through automation and software development.
- Partnering cross-functionally with application teams and internal stakeholders.
- Creating a modern, cloud-native platform that is resilient, cost-effective, and secure by default.
- Scaling cloud infrastructure to support our Kubernetes-based ecosystem.
- Maintaining the freshness and utility of platform services.
- Improving the security posture of our products.
- Designing automation, orchestration, observability, and disaster readiness into our products.
- Participating in production support and on-call rotations.
- Leading incident management and post-incident retrospectives.
iM Qualified Because I Have
- Experience writing design documents, postmortems, and refactoring application code.
- Built automation to reduce operational burden or developed internal SaaS tools.
- Ability to advocate for SRE principles (e.g., SLOs vs SLAs) and introduce them effectively.
- Experience in public cloud or hosted datacenter environments (Azure and AKS preferred).
- A passion for collaborative teamwork.
- Hands-on experience with Linux server stacks (Ubuntu/Debian preferred).
- Knowledge of cloud provisioning platforms (Terraform preferred).
- Exposure to configuration management tools (Chef preferred).
- Experience with containerization/clustering technologies (Docker preferred).
- Familiarity with observability and alerting tools (Prometheus/Grafana or ELK/EFK).
- Practical experience with CI/CD pipelines and rollout strategies.
- A bachelor’s degree (or equivalent experience) in Computer Engineering or related field.
- Proficiency in one or more programming languages (e.g., Java, Python, Golang).
- Familiarity with scripting languages (e.g., PowerShell, Bash, Python, Ruby).
iM Getting To
- Join a rapidly evolving, industry-leading SaaS company on an exciting journey of growth and scalability!
- Take on meaningful, high-impact challenges by leveraging cutting-edge technologies and best-in-class protocols to drive innovation.
- Own my career path with our internal development framework. Ask us more about this!
- Expand my skill set and earn certifications with unlimited access to LinkedIn Learning courses and interactive Microsoft courses & training.
- Be part of a supportive and experienced team within a dynamic, inclusive, and encouraging culture.
- Enjoy flexible work hours that empower me to balance personal time with professional commitments.
- Collaborate in a modern, open-plan workspace featuring a gaming area, free snacks and drinks, and regular social events.
iManage Is Supporting Me By
- Creating an inclusive environment where I’m encouraged to help shape the culture by bringing my unique perspective, not just by fitting in.
- Providing a market-leading salary determined through a fair and consistent process, equitable for all our employees, and regularly reviewed against industry benchmarks.
- Rewarding me with an annual performance-based bonus.
- Providing enhanced parental leave (20 weeks for primary and 10 weeks for secondary caregivers at 100% pay).
- Matching my pension contribution (up to 6%).
- Offering BUPA private medical insurance and a Simplyhealth cash plan to assist with everyday healthcare costs.
- Providing group life cover, including life insurance, income protection, and critical illness protection.
- Encouraging me to make use of our top-tier flexible time off policy, which includes 25 days of annual leave and the flexibility to take additional time off as needed.
- Having multiple company wellness days each year to prioritize mental health and well-being.
- Providing access to RethinkCare, a global behavioral health platform that enhances personal well-being, strengthens professional resilience, and empowers parental success through expert-led training and resources.