An Early Map for a Useful Robotics Future
Authors: Alex Kolicich, Vivek Gopalan, Alex McLane
If you’re reading this, you can likely recall the first moment you marvelled at some act of physical autonomy - the sort our progenitors might have pronounced spellwork. The memory is more recent for some than for others; it may have been a Waymo or a mobile manipulator, a humanoid or a quadruped. In any case, the experience seems to have surpassed that threshold of ubiquity which affords us all the right to say: robotic intelligence has captured the mainstream public imagination.
This all raises some further questions: How, in broad strokes, did this materialize? And more pressingly, to what use should we direct these new possibilities in physical autonomy?
Scaling laws and generalizability in robotics
If you’re familiar at all with our line of blogs, you will have come across the refrain that even without further model improvement, LLMs coupled with the scaling laws of software have rendered many previously intractable workflows newly accessible to automation.
Historically, however, these same scaling laws have not translated clearly to robotics. There have been a few early exceptions to the rule: Kiva Systems in warehouse logistics, Anduril in defense drones, and Waymo in self-driving, to name a few. In general, however, costly hardware, a lack of available data, and the greater heterogeneity of physical environments relative to the digital world (from modeling physics to tactile data) have bottlenecked both cost and performance when building robotics at scale.
It’s only with advancements in VLAs (vision-language-action models, themselves only possible because of LLM and VLM advancements) and simulation, scaled data collection across modalities, and improvements in hardware components that individual parts of the robotics challenge have eased enough to render the whole problem much more tractable. With respect to autonomy, the industry is also getting smarter about ways to improve generalizability without breaking the bank: since robotics has no obvious analogue to the large corpus of internet language data that fueled LLM pre-training, many companies are sequencing data collection more closely alongside product use cases. Large open source data collection projects like Open X-Embodiment have also facilitated initial scaling, but these efforts of course remain multiple orders of magnitude away from what exists for LLMs.
It is important to highlight that the need to closely tie research, data collection, and product together points to something more fundamental than the typical questions around finding PMF or aligning ROI with research. It goes to what we even mean by generalizability itself. The existence of a “GPT-moment” for LLMs was in many ways a function of accessible pre-training data; this moment has since colored our colloquial discussion of AGI as a jump in category - something realized as a single event, not a sequence.
But of course most of our conceptions of generalizability are definitions of degree, not category. Speaking broadly, we often use the term to describe strong model performance in scenarios outside the training set; in robotics, it can also mean breadth across form factors. In both senses, generalizability refers to the degree to which a model covers a distribution of tasks. With LLMs, the demand for granular evals and benchmarks on specific tasks is an obvious symptom of this view. In robotics, however, the insight is even more important: in a domain where cheap or free data is not widely available, what data you choose to collect first - and specifically how well, and to what, it will generalize - is both a market and a research question.
Principles for building in robotics
It’s true that each success story will look different, and that generational robotics companies will likely create new markets. Even so, we have a few principles and heuristics to at least help narrow the problem scope toward finding the right marriage of morphology and market, data and research. Many of these principles guided our conviction in Bedrock Robotics, which brings experience from some of Waymo’s most defining breakthroughs - in freeway driving, all-weather operation, systems reliability, and sensor architectures - to automating heavy machinery in construction.
- Generalizability - The importance of this should be made clear from the get-go: even if measured in degrees, generalizability very clearly remains the main catalyst for this new era of robotics, and for building new generational robotics companies. For each investment at 8VC, we always ask ourselves “why now?” - what new technology enables something today that was impossible yesterday? In the digital world, LLMs have created some of the clearest “why now’s” for new software companies. In robotics, generalizability means you do not have to train each new type of robot from scratch, which significantly lowers the marginal cost of building for a new use case.
- Controlled training environments - In any path to more generalizable robotics, there is always some control on initial training environments: what set of tasks, what sort of setting, what type of manipulation? Scaling the side of a nuclear reactor is of course different from walking down the Embarcadero; operating a drill is different from picking a strawberry. These seemingly obvious truths become more difficult questions for data collection. Across a number of modalities, some control needs to be set for each step of policy research and data collection.
Framed generally, the motivating question here is: what type of environment is controlled enough to make the problem tractable with clear use cases, but dynamic enough to enable real ROI from generalizability? More specifically, what is the potential saving in labor or unit economics for the first use case once hardware, maintenance, and other costs are taken into account, and how does the initial model translate to ROI on similar form factors? We’re excited about our recent investment in Bedrock, for instance, in part because it strikes this balance in a massive market: construction sites are controlled versions of the open world, but require much more than classical robotics could offer for excavator operation.
- Morphology-market fit - Training data for a unimanual mobile manipulator is very likely going to be more transferable to training bimanual mobile manipulators than to an autonomous vehicle. If billions are to be spent on robotic data, we ought to ask which morphologies really best suit certain sets of desired task generalizations before path dependence stiffens: this is a question both of what is most tractable today and of what the best form factors will be for tomorrow. Many morphologies (e.g., excavators, logging harvesters) have been designed for and already make sense in our world; others will be new and need to inspire consumer love and adoption.
- Clear route for scaling data collection - Simulation scales well for over-the-road locomotion, but areas like complex fluid simulation and manipulation (with many coefficients of friction to estimate) have, at least in the near term, necessitated more real-world data collection. Breakthroughs in sample efficiency can help significantly with scaling as well, but until research policies stabilize, high capex and working capital requirements mean the real-world data problem will likely need business models markedly different from those of the Mercors and Scale AIs of the world, which fueled LLM training.
Research is early enough that it makes sense for many companies to own more of the stack - data, research, and product. Controlling the training distribution (with something like paid experts) also allows you to concentrate data collection on correcting repeated mistakes, though the quality of that distribution is of course driven by a company’s ability to control the quality of its experts and to manage the cost of concentrated scaling.
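The unit-economics question raised under controlled training environments (labor saved versus hardware, maintenance, and other costs for the first use case) can be made concrete with a rough calculation. The sketch below is only illustrative: the `Deployment` fields, the straight-line amortization, and every number in the example are our own hypothetical assumptions, not figures from any company or deployment.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    """Hypothetical inputs for one robot in a first use case (illustrative only)."""
    labor_cost_per_hour: float        # fully loaded cost of the operator hours displaced
    autonomous_hours_per_year: float  # hours per year the robot actually works
    hardware_cost: float              # upfront cost of the robot or retrofit kit
    hardware_life_years: float        # assumed useful life, amortized straight-line
    maintenance_per_year: float
    other_costs_per_year: float       # e.g. remote supervision, connectivity


def annual_net_saving(d: Deployment) -> float:
    """Labor saved per year minus amortized hardware, maintenance, and other costs."""
    labor_saved = d.labor_cost_per_hour * d.autonomous_hours_per_year
    amortized_hardware = d.hardware_cost / d.hardware_life_years
    return labor_saved - amortized_hardware - d.maintenance_per_year - d.other_costs_per_year


def payback_years(d: Deployment) -> float:
    """Years of operating savings needed to recover the upfront hardware cost."""
    operating_saving = (
        d.labor_cost_per_hour * d.autonomous_hours_per_year
        - d.maintenance_per_year
        - d.other_costs_per_year
    )
    if operating_saving <= 0:
        return float("inf")  # the deployment never pays for itself
    return d.hardware_cost / operating_saving


# Made-up example values, chosen only to show the arithmetic.
example = Deployment(
    labor_cost_per_hour=50.0,
    autonomous_hours_per_year=2000.0,
    hardware_cost=200_000.0,
    hardware_life_years=5.0,
    maintenance_per_year=20_000.0,
    other_costs_per_year=10_000.0,
)
print(annual_net_saving(example))        # 30000.0
print(round(payback_years(example), 2))  # 2.86
```

The second half of the question, how the initial model translates to ROI on similar form factors, would show up here as a lower data and research cost for each subsequent `Deployment`, which this sketch deliberately leaves out.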
What types of markets should we start with?
These principles help frame the robotics problem, but they remain perhaps one layer too abstract: there are many areas where morphology-market fit might hold even though the market itself remains a poor one. What are some more easily identifiable heuristics that make a space more likely to invite robotics at scale?
- Operator shortages accelerated by inverting demographic pyramids: This is a heuristic for a) immediate ROI, b) tailwinds for long-term ROI, and c) the likelihood of partnerships that help lower data collection costs. Many industrial sectors are facing labor gaps due to retiring workforces, geographic bottlenecks, and long new-hire training timelines. Construction and manufacturing, for instance, will lose tribal knowledge as workers retire.
- High-wage workflows with heavy training: Workflows that are high-wage, require long training cycles and trade knowledge, and are often dangerous for human beings (e.g., aerospace and automotive welding, pipeline maintenance, forestry and logging, chemicals) often signal markets with strong macro, safety, and technical catalysts for adoption. Heavy machinery often, though not always, appears in these same workflows: there, automation can create order-of-magnitude productivity gains and the market is large. While they have their own share of large technical challenges, approaches in heavy machinery also mitigate difficult problems in tactile sensors, complex joint mapping, or the many more degrees of freedom needed for usable actuators.
- Workflows where throughput is not the primary concern: Nonprogrammatic robots (particularly for fine-manipulation use cases) are still slow, so new companies should prefer to automate high-value, low-turn goods and processes. In areas of high-throughput but low-value goods, the disadvantages of a slow-moving robot compound. Many environments where we might initially think robotics makes sense are actually bottlenecked on throughput, so a slow robot will struggle to get deployed even if it costs nothing. This heuristic and the previous one are often correlated: low-value work is usually high-throughput, and high-value work is usually low-throughput.
- Retrofits for existing systems: Though in some cases it makes sense to move hardware in-house, retrofits (e.g., of automobiles, bulldozers, excavators, tractors) significantly ease the hardware scaling problem.
- Structured setting with potential for a lot of robots: Unlike more unpredictable open-world settings, verticals like homebuilding or factory welding offer more structured environments where VLAs can train effectively with less data and where there’s much more than one type of machine needed in the vertical, creating a clear line for expansion.
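The throughput heuristic above can be read as a two-step check: a robot first has to keep up with the pace the workflow demands, and only then does its cost matter. The following is a minimal sketch of that reasoning; the function name, its parameters, and the example numbers are hypothetical and chosen only for illustration.

```python
def is_deployable(
    robot_units_per_hour: float,
    required_units_per_hour: float,
    value_per_unit: float,
    robot_cost_per_hour: float,
) -> bool:
    """Hypothetical check: keep up with the line first, then earn more than you cost."""
    keeps_up = robot_units_per_hour >= required_units_per_hour
    hourly_margin = robot_units_per_hour * value_per_unit - robot_cost_per_hour
    return keeps_up and hourly_margin > 0


# High-throughput, low-value work: the robot is free but ten times too slow.
print(is_deployable(60, 600, 0.50, 0.0))    # False

# Low-throughput, high-value work: a slow, costly robot still clears the bar.
print(is_deployable(3, 2, 400.0, 50.0))     # True
```

The first case mirrors the point that a slow robot can struggle to get deployed even at zero cost; the second shows why high-value, low-turn processes are a more forgiving starting point.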
Heuristics are of course often meant to be broken - they are rough guides conditioned on the current state of the world, given the absence of perfect information. Advancements in sample efficiency, simulation, automated QA, or new paradigms in reasoning models might shift some of these priors. But perhaps even more crucial is that while a handful of generational businesses will be built by bringing new technology to massive existing markets, many more will create entirely new markets and workflow ontologies. In addition to construction, we think markets which may fit the former shape include welding for aerospace and automobiles, horizontal monitoring for high-investment categories (e.g., utilities, defense, infrastructure), chemicals manufacturing without linear production processes, powerline and pipeline maintenance, and heavy machinery for logging (though locomotion remains more challenging). But there also exist possibilities beyond the circumference of our present imagination. The technological world may look quite different in 5 to 10 years (e.g., we may have even more dexterous manipulators). The ways we’ve worked may not fit how we work in the future: our factory ontologies may change; we may architect new, more efficient manufacturing assemblies.
Even as the landscape changes, we think the principles we’ve laid out will remain useful frameworks for at least some time - many still hold true in the world of LLMs. We hope they offer some help - an early map - in guiding creative spirit towards a useful robotics future. If you share our conviction (or perhaps even more so if you do not), we’d love to chat (vivek@8vc.com; amclane@8vc.com).