Clear Eyes, Fuzzy Joins, Can’t Lose: Announcing Our Investment in Structify
Human-quality workflows need human-quality data, an axiom that has only grown truer in the AI-first enterprise. However, access to complete, high-signal data remains a limiting factor, given steep data provider fees, inflexible schemas, AI hallucinations, and scattered, inconsistent, and mutating sources. Customers don’t need to be data scientists to recognize shovel-ready datasets, but if they need to be data scientists to generate them reliably, data will always be rate-limiting.
Structify delivers on a tantalizing promise: high-quality, well-structured datasets on demand, for any business problem and any source domain. Using an intuitive chat interface, with minimal prompts and no code, customers can define their schema, specify data sources, and deploy legions of agents to find, structure, and merge data from across the web. Once normalized into a consistent form, these datasets can fuel untold queries and workflows. Basic setup takes only a few minutes, allowing users to go from idea to pipeline almost immediately. Finally, data science can be approachable and enterprise-grade, iterative and efficient.
We often ask what’s possible now that wasn’t five years ago, and Structify’s robust in-house model is a stellar example. Traditional data provision is a labor-intensive business, and data scraping often requires heavy engineering resources, which helps explain why data pipelines historically haven’t been very iterative. Structify’s breakthrough is training a model to navigate like a human - that is, visually. By clicking, scrolling, and saving at scale, their model can structure and provide human-quality data for a given use case without any bespoke development. Structify then layers on continuously updating data, QA, and exhaustive verification.
It’s hard to overstate the complexity of so many inconsistent sources and structures - especially web pages, which can be all over the map. Unifying data from across this wilderness into a consistent form is extremely challenging. Previous generations of data practitioners relied on techniques like fuzzy joins to achieve that consistency; now, deep LLM-based approaches can achieve the same effect. It’s also worth noting Structify’s refreshing, lightweight new take on the crawl-ingest-transform-conform-structure-query architecture, which in its traditional form remains a major cost center and source of overhead.
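For readers who haven’t run into the term, here is a rough sketch of what a classical fuzzy join looks like. It is not Structify’s method, only an illustration of the older technique mentioned above, written with Python’s standard `difflib` module. The function names, sample records, and the 0.8 threshold are all hypothetical choices made for this example.

```python
from difflib import SequenceMatcher


def normalize(name: str) -> str:
    """Lowercase and drop punctuation so trivial differences don't lower the score."""
    return "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace()).strip()


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def fuzzy_join(left, right, key, threshold=0.8):
    """Pair each left row with its most similar right row, if it clears the threshold.

    The threshold is an assumed, illustrative value; real pipelines tune it per dataset.
    """
    joined = []
    for left_row in left:
        best_row, best_score = None, 0.0
        for right_row in right:
            score = similarity(left_row[key], right_row[key])
            if score > best_score:
                best_row, best_score = right_row, score
        if best_row is not None and best_score >= threshold:
            # Left-hand values win on conflicting keys; the score is kept for auditing.
            joined.append({**best_row, **left_row, "match_score": round(best_score, 2)})
    return joined


# Hypothetical records from two sources that spell the same company differently.
crm_rows = [
    {"company": "Acme Corp.", "city": "Austin"},
    {"company": "Globex Inc", "city": "Boston"},
]
web_rows = [
    {"company": "ACME Corp", "revenue": 10},
    {"company": "Initech", "revenue": 5},
]

matches = fuzzy_join(crm_rows, web_rows, key="company")
print(matches)
```

Even in this toy form, the weak points are visible: every pair of rows is compared, the threshold has to be hand-tuned, and anything beyond spelling variation (abbreviations, renamed entities, different languages) slips through. That brittleness is part of why the shift described above matters.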
This model enables both a high floor and a high ceiling. At the free tier, customers can quickly get a sense for what the platform can do for their use cases. At higher tiers, customers can deploy Structify models wherever they’re needed (including on-prem), with extensive custom data sources. For one recent customer, Structify deployed ~1 million web agents in 24 hours.
Structify customers range from fintech startups to construction firms. In one case, a tech-forward boutique investment bank wanted to build an AI model to predict which issuers and investors would do deals, but the private markets data they relied on was both inaccurate and incomplete. Using Structify, they were able to save $250,000 annually on overseas data verification services. In another case, a multinational construction firm used Structify to transform thousands of engineering report PDFs into easily readable tables, accelerating their proposal and project timelines.
Today, Structify announced their $4.1 million seed round, and we were enthused to participate. Their approach to crawling and structuring is the best we’ve seen (and we’ve seen our fair share). We were likewise impressed by the team: CEO Alex Reichenbach was one of the top CS people in his class, receiving his first patent at 17, and along with CTO Alex Goldstein, brings deep experience at Matician in SLAM and translating data into action. COO Ronak Gandhi is an unconventional GTM leader with an eye for upside honed at United Talent Agency. We've known them for a while, and finally found our chance to formally work together.
If you’re an engineer looking to train best-in-class agents with a commitment to making information accessible, contact team@structify.ai.
And if you’re curious what your business could achieve with accessible, high-quality, structured datasets that don’t require a Ph.D. or Fortune 500 IT budget, sign up at structify.ai/signup.
We are privileged to welcome Structify to the 8VC Infrastructure portfolio, and look forward to supporting them in pushing the frontiers of dataset quality and accessibility.