Founding Research Engineer
Frontier models now score above 170 on IQ tests. Reasoning is no longer the constraint on enterprise AI. Context is.
The context layer sits between an enterprise's siloed data and the agents that need to act on it. Stuff the context window and you trade quality for cost and latency. Use naive RAG and retrieval breaks the moment the question gets interesting. Stand up a vanilla knowledge graph and you hit the harder problem underneath: someone has to design the ontology, and at enterprise scale (hundreds of thousands of files, hundreds of gigabytes) no human can.
This is what gates almost every enterprise AI deployment we've seen.
60x solves it. We've built AI Brain, a knowledge graph platform engineered backwards from the agentic retrieval problem. The thesis is dynamic ontology generation: the graph schema isn't authored by a user, it's generated by a multi-agent ingestion pipeline from the business logic of the data itself, and continuously enriched with secondary and tertiary derivatives. Pre-digested analysis lives in the graph so retrieval is a lookup, not a reasoning loop.
We run a Palantir-style model for workflows. The platform sits at the centre. Forward-deployed engineers wrap it around enterprise workflows we've already templated. Customisations are retained as IP and fed back into the platform. Same flywheel shape as Palantir, different domain.
We work with enterprises across multiple sectors, and a growing list of global consultancies are evaluating us against their internal GPT deployments. In the last two weeks we shipped a redesigned ingestion pipeline, primary entity extraction with auto-enrichment, and an end-to-end demo across 500 companies. That pace is the default.
This is a founding role. The parts of the platform you'll work on are the parts that decide whether the thesis holds.
The role
You'll work on the research-grade core of AI Brain alongside the CTO (exited robotics founder) and the senior engineering team. The open problems on the desk:
1. Dynamic ontology generation
The graph schema is generated, not authored. Structure emerges from the business logic of the data, with analytical insight pre-computed and stored rather than recomputed on every query.
Open work:
- Hierarchical ontology, moving from a flat conceptual space to one with inheritance, without breaking source provenance
- Per-tenant configuration that a forward-deployed engineer can tune without touching the runtime
- The eval question underneath all of it: how do we measure whether a generated ontology is good?
2. Entity consolidation
When a single real-world entity (a company, a person, a product) appears across hundreds of documents under different names, the graph has to recognise it as one thing. We do this through a multi-stage consolidation pipeline that combines fuzzy matching, heuristics, and agent-driven tiebreaking against authoritative external sources where the domain demands it. Provenance back to the source is preserved end-to-end.
Open work:
- Edge-case dedup where the same entity appears under different names in different contexts
- The right boundary between consolidation, enrichment, and update as separable concerns
- Determining attributes at the entity level rather than re-deriving them per chunk
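The staged shape of that pipeline can be sketched in a few lines. This is a toy under stated assumptions, not the production logic: the thresholds, the legal-suffix list, and the `tiebreak` hook (an agent call against external sources in the real system) are all placeholders:

```python
import difflib

def normalise(name: str) -> str:
    """Heuristic stage: strip punctuation and common legal suffixes."""
    stripped = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    suffixes = {"inc", "ltd", "llc", "plc", "corp"}
    return " ".join(tok for tok in stripped.split() if tok not in suffixes)

def fuzzy_same(a: str, b: str, threshold: float = 0.9) -> bool:
    """Fuzzy stage: cheap string similarity on normalised names."""
    return difflib.SequenceMatcher(None, normalise(a), normalise(b)).ratio() >= threshold

def consolidate(mentions: list, tiebreak=None) -> dict:
    """Map each mention to a canonical entity. Pairs the cheap stages can't
    settle are deferred to `tiebreak` (a no-op stub in this sketch)."""
    canonical = {}
    for m in mentions:
        match = next((c for c in set(canonical.values()) if fuzzy_same(m, c)), None)
        if match is None and tiebreak is not None:
            match = tiebreak(m, list(set(canonical.values())))
        canonical[m] = match or m
    return canonical
```

The edge cases in the list above are precisely where this breaks: "Apple" the fruit supplier versus "Apple" the device maker defeats any string-level stage, which is why the agent tiebreak and the consolidation/enrichment/update boundary matter.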
3. Temporal graph database
Existing graph stores don't carry the temporal model we need, so we're building our own in Rust. Time becomes a property of every node, edge, and attribute, and any retrieval can be run as of any point in history.
The commercial story this opens up: a graph that doesn't only produce decisions today, but backtests its own reasoning against historical state to prove the system would have caught the right answers when it mattered. That's what justifies the platform license, and it isn't feasible on the existing stack without compromising the model.
You'll be central to the design and build of the replacement. This is the deepest research-and-systems problem on the roadmap and the most consequential piece of IP we'll ship in the next twelve months.
4. Benchmark and white paper
Existing large-context retrieval benchmarks are saturated: frontier models score 100%, so the benchmarks no longer differentiate systems that are good at enterprise retrieval from systems that aren't. We need a new one. Designing it, running it, and publishing the white paper is on the roadmap. Releasing the benchmark itself, separately from our results on it, is part of the strategy.
5. Frontier work we'll explore soon
Open ideas from research conversations: alternative embedding geometries for deep hierarchies, community-detection approaches to retrieval, graph-internal continuous-monitoring patterns as an alternative to scheduled jobs, encoder-based privacy primitives that would unblock several enterprise sales cycles. You'll have a hand in picking what we commit to.
You'll also contribute to hiring, provide technical input on client engagements where it matters, and co-author the white paper.
Our stack
- Agents: LangGraph with Pydantic-typed state, Claude via Vertex AI, Gemini for fast tagging
- Graph and data: Postgres plus Apache AGE today, with a Rust temporal graph database in active development as the long-term replacement
- Backend: FastAPI, Python 3.12, Pydantic everywhere
- Frontend: Next.js (App Router), TypeScript, Tailwind, shadcn, Vercel
- Infra: GCP across compute, storage, model serving, and key management
- Tooling: pnpm, Husky commit hooks (lint, format, typecheck, test, agentic check), Linear, Claude Code as a daily driver
We're opinionated about code quality and we use AI coding agents hard. Founding-team velocity assumes it.
What we're looking for
- Depth in at least one of: knowledge graphs and GraphRAG, retrieval systems, agent orchestration, or large-scale data ingestion
- A track record of taking research papers or first-principles thinking through to working production systems. Published work, open-source contributions, or systems you can walk us through architecturally all count.
- Rust experience. We don't expect you to have shipped a graph DB before, but we do expect you to have written real Rust on a systems-level problem (parsers, runtimes, storage engines, performance-critical services) and to have a view on where Rust earns its complexity and where it doesn't.
- Strong Python, and enough TypeScript to ship product surface where it matters
- The instinct to read someone else's PR, see three things to improve, and write the comment kindly
- Taste. You can tell a clever solution from the right solution, and you'll push back on us when we conflate them.
- Excitement about this problem space, not AI in the abstract: the context layer, GraphRAG, dynamic ontology generation, temporal data systems
You don't need a PhD. You do need to operate at that level on the problems we care about.
Beyond the role
The community. 60x sits at the centre of Unicorn Mafia, the invite-only builder community we run. Around 1,100 members, tightening: maths olympiad winners, hackathon regulars, and founders across London, San Francisco, New York, and Europe. The network gives you durable career capital. Most of our team meets future co-founders through it, and several alumni have spun out their own companies with the network already in place.
Day one, you're in. Events are free. International trips are paid for. NY trips, hackathon weekends. Rooftop parties where the other guests are co-founders of major AI companies.
We hand-pick who sits in the office to keep talent density high. Engineers from outside 60x come in because the room is worth being in. Most companies fly people in to get access to a room like this. Yours is at your desk.
The lifestyle. We look after the team and we socialise together.
- Private healthcare and a wider wellness benefits package
- Sauna and cold plunge sessions for recovery and team time
- Team socials, dinners, off-sites, and the overflow from UM events
- An environment for people who want to do the best work of their lives without burning out