Building a tech-enabled VC firm
I’m the head of data and product at Mosaic Ventures. This blog is about what that actually means: how we think about technology and data enabling early-stage investing, and what we’re building. Future posts will go deeper on the individual components: the data infrastructure, specific workflows, and how we’re using machine learning to surface investment signals. This first post starts with the most visible thing our stack has changed.
Mosaic today is a partner-led firm with a lean tech team. There are no associates, analysts, or other junior team members supporting the investment decision-makers. The work that we used to delegate to the younger investors – a portion of the sourcing, pipeline management, pitch deck analysis, meeting prep, analytical support for investment memo drafting – is handled by partners in conjunction with a purpose-built data infrastructure and AI-native workflows. Our investment judgment still operates at the centre, but is now augmented by custom-built agentic technology rather than humans.
Over two years ago, we made the bet that most of the discrete elements of the traditional associate role could now be cleanly separated, and performed by intelligent machines. We invest in AI applications that transform work across enterprises large and small, and took the leap to start eating our own dog food.
What replaced the junior team
What superseded the associate role at Mosaic is data infrastructure and a set of AI-native workflows that have absorbed the human work layer by layer. But we didn’t arrive here in one shot; the architecture evolved organically as the team changed, and the tech stack became an opportunity to massively improve partner productivity.
When we had a larger investment team with multiple associates, our stack reflected this: complex interfaces designed to impose structure on the chaos of a relationship-led, traditional enterprise-style sales pipeline. All of this depended on a team of outsourced developers building a custom UI that linked many disparate data sources and third-party systems. The goal was a single pane of glass: one place to see everything, letting the system do the remembering. In practice it meant a lot of manual data entry, a CRM with occasional fidelity issues, and constant maintenance overhead.
As the junior team was promoted to principal and partner, or moved on to startups and other firms, we rebuilt around a different premise. Instead of one interface everyone had to log into, we decommissioned the monolith and moved to a multi-channel approach: a pub/sub message broker as the connective tissue between an ever-growing set of best-in-class tools, with microservices and workflow automations replacing the single pane. The idea was simple: push information to the right place automatically, and remove the manual entry layer entirely.
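To make the pub/sub idea concrete, here is a minimal in-process sketch of the pattern: automations subscribe to topics, and publishing an event fans out to every interested service. In production the broker is Google Cloud Pub/Sub and the topic and event names below are invented for illustration, not our real schema.

```python
from collections import defaultdict
from typing import Callable

class Broker:
    """Tiny in-memory stand-in for a pub/sub message broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable[[dict], None]):
        # Each microservice or workflow automation is just a handler on a topic.
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, event: dict):
        # Fan the event out to every subscriber; no one polls, no one logs in.
        for handler in self._subscribers[topic]:
            handler(event)

broker = Broker()
crm_log = []

# Two independent automations react to the same inbound-company event.
broker.subscribe("company.inbound", lambda e: crm_log.append(e["domain"]))
broker.subscribe("company.inbound", lambda e: print(f"meeting prep queued for {e['domain']}"))

broker.publish("company.inbound", {"domain": "example.com", "source": "email"})
```

The point of the pattern is that adding a new automation is one more `subscribe` call; nothing upstream changes.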
The stack today: Affinity as our CRM; tray.io for workflow orchestration (a Mosaic portfolio company), with agemo.ai layered in for more agentic workflow capabilities; Pub/Sub for event triggers; BigQuery as our data warehouse; Cloud Run for heavier backend workloads; and Data Studio on top for reporting and pipeline visibility. On the engineering side, we've largely managed without in-house or external development capacity by building with Claude Code and Cursor (RIP). The result is a small, fast, maintainable stack we can iterate on ourselves.
What the system does today
Here’s a concrete picture of what this looks like in practice.
The moment a founder or referral email hits a specific inbox (or a deck arrives via WhatsApp or Slack DM), the company’s information is automatically populated in our CRM: company stats, description, inferred pipeline status; and the pitch deck is attached to the relevant record without anyone touching it. Pitch deck analysis that would have consumed real time, perhaps an hour of careful reading and structured note-taking, now happens in seconds. Meeting prep documents, previously a 45-minute research and writing job, land in partners’ inboxes automatically. Weekly pipeline digests generate themselves. First drafts of investment memos, synthesizing all the facts in the data room and deep research, take a fraction of the time they previously required.
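The first step of that workflow, turning an inbound email into a structured CRM record, can be sketched as below. The field names and the status heuristic are illustrative assumptions, not our production schema; in the real pipeline the enrichment and deck analysis are handled by downstream agentic workflows.

```python
def build_crm_record(email: dict) -> dict:
    """Sketch: map an inbound founder email to a CRM-ready record."""
    # Infer the company from the sender's domain (a simplification).
    domain = email["from"].split("@")[-1]
    # Pick out a pitch deck attachment, if any.
    deck = next(
        (a for a in email.get("attachments", []) if a.lower().endswith(".pdf")),
        None,
    )
    # Toy pipeline-status heuristic for the example.
    status = "inbound-with-deck" if deck else "inbound"
    return {
        "company_domain": domain,
        "pitch_deck": deck,
        "pipeline_status": status,
        "source": "email",
    }

record = build_crm_record({
    "from": "founder@example.com",
    "subject": "Seed round",
    "attachments": ["ExampleInc_deck.pdf"],
})
```

Because the record is built from the event itself, no one has to open the CRM and type anything in.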
For research and due diligence, we’ve built a Chrome extension that sits across our browsing workflow. At the click of a button on any company page, it pulls together everything we know: public information about the company alongside our own relationship data, interaction history, and notes from the CRM. It surfaces who in our network knows the founders, flags prior conversations, and gives investors the full context of our relationship with a company in seconds. From there you can dig deeper, trigger research workflows, or take action directly – without leaving the page you’re on.
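Behind the extension’s button is essentially a context-aggregation call: merge what the CRM already knows about a company with who-knows-whom from the relationship graph. The data shapes below are invented for illustration.

```python
def company_context(domain: str, crm: dict, network: dict) -> dict:
    """Sketch: assemble one briefing dict for a company page."""
    record = crm.get(domain, {})
    return {
        "domain": domain,
        "notes": record.get("notes", []),
        "last_interaction": record.get("last_interaction"),
        # People in our network with a path to the founders.
        "warm_paths": network.get(domain, []),
    }

ctx = company_context(
    "example.com",
    crm={"example.com": {"notes": ["met at demo day"],
                         "last_interaction": "2025-03-01"}},
    network={"example.com": ["Partner A -> CEO"]},
)
```

The extension renders this dict in the page; deeper research workflows are triggered from the same context.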
Some of the tasks the system handles always sat in an awkward middle ground: too judgment-adjacent to fully automate five years ago, but fundamentally mechanical once the models got capable enough. They are, almost perfectly, what researchers call codified knowledge: the kind learned from textbooks, structured processes, documented workflows. AI is strongest here, and it’s where the associate role was most exposed.
The higher-end work remains the domain of the partner – that is, generating the tacit knowledge or research that shapes the firm’s view of a market, performing the referencing or diligence that surfaces a non-obvious risk, and writing memos that require nuanced argument (e.g. the investment thesis) rather than summaries. The judgment layer stays with the people who have the most context. In general, a seasoned investor partnered with a capable AI produces something noticeably better (and more efficiently) than the same investor reviewing a junior’s first pass: less telephone, more signal. Most importantly, we also believe it provides a better (e.g. more informed, engaged) experience for the founder, our customer.
There has been a striking pace of change in what’s possible. In early 2025, we scoped a retrieval-augmented generation (RAG) system to connect our private data (deal notes, CRM history, investment memos) to a language model, so that instead of generic outputs we’d get responses grounded in our actual history. It was going to be a meaningful build. By the time we’d finished the design, an off-the-shelf Claude and MCP configuration had rendered it largely unnecessary. What would have been months of engineering turned out to be a configuration exercise. That story repeats itself constantly. The planning cycles of organisations trying to use AI are consistently slower than the pace at which AI is improving, so we try to stay flexible and modular in how we architect and build going forward.
Using data to invest better
Eliminating the operational layer is the floor, not the ceiling. The more interesting question is what a genuinely data-driven investment firm looks like when the entire stack is rebuilt around that premise.
The work we’re focused on now is using data to make better investment decisions, not just faster ones. That means building systems that surface top-of-funnel signals human pattern-matching alone might overlook; that score and prioritise alerts on companies against the characteristics we’ve observed in the best-performing startups in the market (both ours and others’); and that make our referral network structured and queryable in a way a CRM was never designed to support.
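At its simplest, the scoring layer looks something like the sketch below: weight a handful of signals and rank the pipeline by the result. The features and weights here are invented for the example; in practice the signal set is richer and the weights are learned from data rather than hand-tuned.

```python
# Hypothetical signal weights, for illustration only.
WEIGHTS = {
    "repeat_founder": 3.0,
    "warm_intro": 2.0,
    "team_strength": 1.5,
    "market_pull": 1.0,
}

def score(company: dict) -> float:
    """Weighted sum over whatever signals we have for a company."""
    return sum(WEIGHTS[f] * company.get(f, 0.0) for f in WEIGHTS)

def prioritise(companies: list[dict]) -> list[dict]:
    """Rank the pipeline so the highest-signal companies surface first."""
    return sorted(companies, key=score, reverse=True)

queue = prioritise([
    {"name": "A", "repeat_founder": 1, "warm_intro": 0, "team_strength": 0.6},
    {"name": "B", "repeat_founder": 0, "warm_intro": 1, "team_strength": 0.9},
])
# A scores 3.0 + 0.9 = 3.9; B scores 2.0 + 1.35 = 3.35, so A ranks first.
```

Even this toy version shows the shift: prioritisation becomes an explicit, inspectable function of the data rather than whoever happened to look at the inbox first.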
Some of this is pipeline infrastructure. Some of it is machine learning. Some of it is simply getting the data model right so that we can be agile in using the data in our day to day sourcing activities. The system we’ve built today automates what could be automated. What we’re building now is the layer above: not replacing investment judgment, but augmenting the information environment in which that judgment operates. That’s where the real prize is, if alpha can genuinely be improved.
Up next
In the posts that follow, I’ll go deeper on how we’ve actually built all of this: the data model, the specific workflows, the machine learning layer, and what’s worked and what hasn’t. The goal isn’t just to describe a stack; it’s to show what it looks like when a small firm shifts its perspective to leverage data infrastructure as a genuine source of investment edge, beyond what used to be the table stakes of operational efficiency.