Senior DevOps Engineer (Product)
Hive Science is a tech start-up whose technologies are in use by many of the world’s largest brands, including Land Rover, Edward Jones & Kroger. The Hive platform delivers novel intelligence at the intersection of quantitative social psychology, behavioral science & AI/ML to accelerate & transform how work is done. We’re looking for our go-to infrastructure partner to join our team and get on board this rocket ship with us.
About the role:
This is an exciting opportunity to be part of a psychological science-based tech startup. This role sits within our product team and will be responsible for building, maintaining, and scaling the infrastructure that powers Hive’s next generation of products. This Senior DevOps Engineer will roll up their sleeves and help us continually evolve the foundation of our platform, enabling faster delivery speed and greater scale, as well as supporting new generative AI and machine learning technologies. We’re looking for someone who can bring strong infrastructure and automation expertise, and who can work closely with engineering and data science teams to support rapid product development & launch.
Important Details
· Location: Hybrid (3 days/week in office). Candidates must currently reside within commuting distance of London, UK.
· Employment Type: Full-time
· Work Authorization: Candidates must have existing right to work in the UK. Visa sponsorship is not available for this role.
· To Apply: Submit your CV to careers@hivescience.ai
What You’ll Be Doing:
As a Senior DevOps Engineer (Product), you’ll be responsible for the reliability, scalability, and security of our entire infrastructure stack: from CI/CD pipelines to production deployments, and from infrastructure orchestration to security governance. You’ll be part of a high-powered team located in London. You will build robust systems and move quickly, while staying ready to scale and to ensure production-grade reliability as our product grows.
You will constantly need to be at the cutting edge as we deploy and scale the latest AI capabilities within our core platform. We can’t define everything you will be doing, because in a disruptive world some of it is still unknown; you need to be the kind of person who is ready to pivot at speed and stay at the bleeding edge of this new world.
Key areas of responsibility include:
Infrastructure & Cloud Engineering:
• Design, provision, and manage scalable cloud infrastructure using Infrastructure-as-Code (Terraform, CloudFormation) across AWS (deep experience required), GCP, or Azure.
• Architect and maintain highly available, fault-tolerant systems that support our AI/ML workloads, web applications, and data pipelines.
• Manage containerization and orchestration platforms (Docker, Kubernetes, ECS) to support microservices and ML model deployments.
CI/CD & Automation:
• Build and maintain robust CI/CD pipelines (GitHub Actions, CircleCI, Jenkins) to automate testing, builds, and deployments across dev/staging/production environments.
• Implement MLOps workflows to streamline model deployment, versioning, and monitoring for our AI/ML products.
• Automate infrastructure provisioning (Terraform), configuration management, and deployment processes using scripting (Bash, Python) and automation tools.
Monitoring, Observability & Reliability:
• Implement comprehensive monitoring, logging, and alerting systems (Prometheus, Grafana, CloudWatch, Datadog, Sentry) to ensure system reliability and rapid incident response.
• Establish SLOs/SLIs and implement observability best practices to maintain high availability and performance.
• Lead incident response, root cause analysis, and implement preventive measures to improve system resilience.
Security & Governance:
• Implement and maintain security best practices including network security, firewalls, role-based access control (IAM), encryption at rest and in transit, and secrets management (AWS Secrets Manager, HashiCorp Vault).
• Develop and enforce governance frameworks for working with LLM APIs and AI services, including data protection, PII safeguards, and compliance requirements.
• Conduct security audits, vulnerability assessments, and implement remediation strategies to maintain a secure infrastructure.
Collaboration & Technical Support:
• Work closely with full-stack engineers and data scientists to support application deployments, optimize performance, and troubleshoot infrastructure issues.
• Support ETL/ELT workflows and data pipeline infrastructure for training and inference workloads across databases (SQL, NoSQL, Vector DBs, Graph DBs).
• Provide technical guidance and mentorship on DevOps best practices, infrastructure design, and deployment strategies.
Required Skills & Experience:
• Strong experience provisioning and managing secure cloud infrastructure (AWS preferred; GCP or Azure also considered)
• Expertise with Infrastructure-as-Code tools (Terraform, CloudFormation, Pulumi)
• Strong experience with containerization and orchestration (Docker, Kubernetes, ECS, Fargate)
• Proven track record building and maintaining CI/CD pipelines (GitHub Actions, CircleCI, Jenkins, GitLab CI)
• Experience with MLOps and supporting ML model deployment workflows (AWS Sagemaker, Lambda, containerized deployments)
• Proficiency in scripting and automation (Python, Bash, Go)
• Strong experience with monitoring and observability tools (CloudWatch, Prometheus, Grafana, Datadog, Sentry, New Relic)
• Experience with database administration and optimization across SQL, NoSQL, vector databases (Pinecone, FAISS), and graph databases (Neo4j)
• Knowledge of networking, security best practices, IAM configuration, and secrets management
• Experience supporting data pipelines, ETL workflows, and cloud data platforms (Databricks, Snowflake)
• Strong experience with the setup, design, governance, and security of data clean rooms and clean room integrations
• Previous experience in early-stage product teams or high-growth startups
• Ability to balance rapid prototyping with building scalable, production-grade infrastructure
• Strong problem-solving skills and ability to work independently in a fast-paced environment
Overall Work Experience & Additional Details:
You may have come from a platform engineering team at a tech company or from a startup where you wore every hat. You are fluent in both infrastructure theory and hands-on implementation, and you get a thrill out of building reliable, scalable systems that enable rapid product innovation and support cutting-edge AI/ML workloads.
We’re a fast-paced startup, so each day is different from the one before. We’re nimble and creative, and we value intellectual humility. We work really hard because we’re all 100% dedicated to the future we’re building. Our work is stimulating, challenging, and exciting. And our team is awesome. At Hive we only hire exceptional people, so you’ll be in good company, surrounded by passionate, insanely smart people who want to build the future of customer intelligence. Specifically, we’re looking for someone who will thrive in this type of environment:
• A fast-paced startup with competing demands and multiple ongoing priorities
• Ownership of critical infrastructure decisions that directly shape the products we build
• A ‘solve the problem’ mentality
• Scrappy and creative
• Strong passion for the Hive Science mission and a love of the scientific method