An applied AI research & product lab
Agentic systems that earn their autonomy.
Bombadil Labs is an independent lab researching how autonomous AI systems fail — and building ones that don't. Research, R&D, and consulting for teams who'd rather measure than hope.
What we do
Three practices, one obsession: autonomy you can audit.
Research
Failure modes, evaluation harnesses, and the unglamorous science of making agents reliable. We study how autonomous systems break so yours don't have to.
Agentic R&D
We design, prototype, and harden autonomous systems end to end — from the first tool call to the production incident review. Our own products are the proving ground.
Consulting
Embedded with your team. Architecture reviews, eval pipelines, deployment guardrails — and a straight answer about what not to automate yet.
In the workshop
What we're building.
Three agentic systems in active development at the lab. Each one is a bet that autonomy, done carefully, beats automation done carelessly.
How we work
House rules, written in ink.
- 01. Autonomy is earned. Capability ships behind evidence, never ahead of it. An agent gets exactly as much rope as its eval record justifies.
- 02. Evals before vibes. A demo is an anecdote. If it isn't measured, it isn't done; if it can't be measured, it isn't ready.
- 03. Small, sharp tools. The most trustworthy agent is the one with the fewest ways to go wrong. We subtract before we add.
- 04. Boring is a feature. Reliability compounds; novelty depreciates. We optimize for the systems still quietly working in five years.
- 05. Leave things better. Codebases, datasets, teams. We tend whatever garden we pass through, and we hand back the keys in better shape than we found them.
Field notes
Dispatches from the lab notebook.
The notebook is open; the ink is drying.
We write up what we learn — eval design, agent failure taxonomies, the occasional detour into the weeds. First notes land soon. Subscribe via RSS to read them the moment they do.
Have something worth building carefully?
Tell us what you're trying to make autonomous — and what keeps you up at night about it. We reply to every serious note.