I enjoyed reading Sam Hammond's Ninety-five theses on AI and decided to put together some theses of my own. Unfortunately, I lack Hammond’s Protestant zeal and only got to 50.
You can think of this as the spiritual successor to my AI omens post from September 2022, although I still owe a post following up on that one. I'm mainly writing these for myself as a way to track my views over time, but figured it is epistemically virtuous to share them even if doing so risks making me look like a bozo later. With that, some caveats:
Many are strong opinions, weakly held.
Given how much my views have changed over the past 2-3 years, I could easily imagine them changing again between now and 2027.
I. AI progress
... is not plateauing. The next generation of frontier models -- GPT-5, Claude 4, Gemini 2.0, Llama 4, Grok 3, etc. -- will wow me (e.g. with GPQA and other hard benchmark scores as well as agentic enablement) and others. If they don't, then I'll update hard against short AI timelines.
Scaling will continue to predictably improve models for at least one, and possibly two to three, more model generations.
Someone will figure out how to generate synthetic data that improves capabilities as a function of compute in the next 2-3 years. It's fairly likely (50%) that at least one lab already has (see the recent GPQA and MATH benchmark improvements from Claude 3.5 Sonnet, GPT-4 Turbo, and Gemini 1.5 Pro).
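To make that concrete, here is a minimal sketch of one plausible recipe: rejection sampling against a verifier, the idea behind STaR-style self-improvement. The `generate` and `is_correct` helpers are hypothetical stand-ins for a model call and an answer checker, not anyone's actual pipeline:

```python
# Minimal sketch of rejection sampling for synthetic training data.
# `generate` and `is_correct` are hypothetical stand-ins, not a real
# lab pipeline: spend more compute (more samples) to mine more
# verified examples to fine-tune on.
import random
from typing import Callable

def make_synthetic_dataset(
    problems: list[str],
    generate: Callable[[str], str],          # model call: problem -> candidate solution
    is_correct: Callable[[str, str], bool],  # verifier: (problem, candidate) -> pass/fail
    samples_per_problem: int = 16,
) -> list[tuple[str, str]]:
    """More samples per problem (more compute) yields more verified data."""
    dataset = []
    for problem in problems:
        for _ in range(samples_per_problem):
            candidate = generate(problem)
            if is_correct(problem, candidate):
                dataset.append((problem, candidate))  # fine-tune on these later
                break
    return dataset

# Toy demo: "solve" addition problems by random guessing; the verifier
# filters the noise, so extra samples buy extra verified data.
problems = ["2+2", "3+5", "10+7"]
dataset = make_synthetic_dataset(
    problems,
    generate=lambda p: str(random.randint(0, 20)),
    is_correct=lambda p, s: s == str(eval(p)),  # toy verifier only
)
print(dataset)
```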
Continual learning is underrated as a bottleneck relative to "the data wall".
The fact that AIs never tire, can be shaped and sculpted by data, and can be perfectly consistent in their judgements is currently under-emphasized as an advantage relative to their raw intelligence. Imagine a world where every employee can replay a single employee's experience, or where, when a company needs to pivot, every employee can immediately update their conception of how they can most contribute to the new strategy.
My prediction about math progress looks good in hindsight. The DeepMind IMO silver medal is just the beginning. In 10 years, AIs will be better than the best humans at proving well-stated theorems.
Llama 3 is super impressive but does not prove that the frontier AI labs (OpenAI, Anthropic, Google) no longer have an advantage.
It also does not prove that open source AI will continue to keep up, but it does provide a trickle of hope whereas previously there was only cope. (Whether you think this is good or bad is a different question.)
Robotics progress is accelerating but will look more like self-driving than LLMs in terms of the difficulty of deployment, at least until we get human+ level AGI.
Mechanistic interpretability isn't currently thought of as a branch of neuroscience but should be.
There's still a huge overhang between AI capabilities and user experience / product quality. I suspect the product experience will catch up in the next two years. The people best at pushing the former are not the best at pushing the latter.
II. Societal impacts
Analogies to past revolutions are fraught, but whether AI is more like electricity and oil than like nukes or a new species is a question on which many others, including basically all of my theses, hinge.
Military applications of AI will be at least as impactful as the invention of artillery and mechanized warfare.
My 2050 predictions look extremely silly in hindsight.
Widespread rollout of real-time voice interaction will ratchet up societal AI awareness, along with the corresponding fear, uncertainty, and excitement.
AI pause proponents are right that most people are very afraid of AI, but they engage in motivated reasoning about how meaningful this is. Most of society also didn't care about factory farming and still doesn't.
Many people kind of cried wolf on language model propaganda a few years ago but still haven't introspected on what they got wrong. Some, like Jack Clark, have, and they deserve our praise!
Nearly-always-on surveillance remains underrated as an AI risk.
Highly uncertain: If AI is to cognition as mechanization was to strength, intelligence will decrease in status, but this depends on how general AI intelligence is and whether it mostly complements or substitutes for high-end human intelligence.
If the prior point is true, smart people whose identity and sense of self-worth are closely tied to their general reasoning and thinking ability will experience some of the most severe existential angst in response to human-level AI. Some of us already are.
AI tutors can dramatically improve education, but this matters much less than people think unless AI progress unexpectedly plateaus.
Leopold is right that AI researchers' influence on AI's trajectory is probably close to its peak.
I struggle to imagine a world with superintelligence and ~current humans in which humans remain influential (and this seems bad). In other words, human dignity is a fuzzy concept with a mixed track record, but a real and important thing to track.
Corollary: The Merge and other related human-improvement projects are way under-discussed because they come off as weird / selfish / are associated with billionaires. (Two out of three of these were true of AGI until a few years ago.) Whether these sorts of projects get developed aggressively and quickly enough is very path-dependent. (Sam Altman, Elon, Vitalik, and Jed McCaleb have all been early to this view.)
III. Impact on business
AI's impact on businesses will be uneven: massive in companies able to adapt such that better judgment translates to better outcomes, minimal in companies where, for political or other reasons, it can't. This is analogous to how being data-driven helped make Amazon one of the most valuable companies in the world but has been at best neutral and often harmful for companies that do it poorly. (Also see: Gwern and Flo Crivello on corporations.)
We should have a strong prior against centaur hypotheses based on the history of games and other areas of automation: in chess, human+engine "centaur" teams beat engines alone for only a decade or so before the human teammate became superfluous. I suspect centaur setups make sense on the 5-10 year time horizon but not the 10+ year one.
I'm skeptical that, even in the limit, AI labs will dominate entire market verticals outside of a few software-heavy ones (e.g. coding).
That said, right now, supervised fine-tuning via API is the main control lever besides prompting for closed source models. This will not be enough for building advanced products.
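For concreteness, today the whole lever looks roughly like the sketch below, here using the OpenAI Python SDK; the training file, its contents, and the base model name are placeholders, so check the current docs for what's actually supported:

```python
# Rough sketch of supervised fine-tuning through a closed-model API,
# using the OpenAI Python SDK (openai >= 1.0). The file name, its
# contents, and the base model are placeholders, not recommendations.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# 1. Upload a JSONL file of {"messages": [...]} chat-formatted examples.
training_file = client.files.create(
    file=open("train.jsonl", "rb"),
    purpose="fine-tune",
)

# 2. Launch the supervised fine-tuning job against a base model.
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-mini-2024-07-18",  # placeholder base model
)
print(job.id, job.status)
```

Note how little surface area this exposes, essentially a dataset and a base model name, which is why it won't be enough for building advanced products.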
Thesis-driven investing in AI seems really tricky even if you have situational awareness.
The ability to make tacit knowledge legible is about to become a lot more valuable.
There's probably at least one major AI-triggered structural shift, of the same magnitude as the invention of the C-corp, that I and others are totally missing.
More people, myself included, should learn about the history of robotic process automation.
IV. Biology & medicine
Real-time voice + video input will be a really big deal for biology but will initially be rate-limited by scientist pride and skepticism.
Better reasoning combined with generalist robotics will be disproportionately important for exploratory research and academic applications.
The future of industrial scale lab automation will look more like a factory floor than a set of lab benches.
A superintelligence would probably still need to run lots of real-world experiments, but this isn't as big a rate limiter as many people who make this argument think. It certainly doesn't mean our current rate of progress is even close to maxed out. Think: enormously high-throughput automated labs with millions of top-notch minds thinking through all the implications of the tiniest bit of new data.
We should be accelerating this and lab automation more generally as much as possible so that bio can benefit from AI improvements.
Faster thinking clock speeds and better multimodal models will make high bandwidth temporal information streams such as live cell imaging much more valuable.
Dario was right about AI and biosecurity, but others jumped the gun by focusing on current-generation (GPT-4-level) models, which are likely not meaningfully raising risk.
Plain old cybersecurity, authorization, and authentication are underrated as biosecurity mitigations, as is better KYC on physical tooling. (Credit to Douglas Densmore, whom I first heard say this on a panel.)
Bryan Johnson's Blueprint is a more transformative direction for the future of medical AI than AI scribes and medical assistants because AIs will be much better at steering n=1, personalized medical treatment than humans. But this requires humans to actually listen to and give control to AIs.
Depending on regulation in the US, AI doctor use will be a case of leapfrogging in developing countries, analogous to mobile payments in Africa.
V. Software engineering
AI software engineering disruption will climb the experience ladder.
Nobody knows how induced demand vs. replacement will play out for software engineers.
Software copilots are a transient state, but the transition from copilot to coworker will require incremental expansion to build trust along the way.
Good product, visual/UX design, and system design taste will become even more valuable. Faster iteration speeds may make these skills easier to develop and more empirically grounded.
We may finally be able to make a dent in translating the billions of lines of C, Fortran, and other code into modern languages. DARPA is ahead of the curve with TRACTOR.
Proposals for building an entire AI and software ecosystem around formally verified software are unrealistic but directionally correct, because formal verification will become much more tractable, as will fuzz testing and automated “manual” user testing / red teaming.
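As a small taste of how tractable this is already getting, property-based fuzzing with the Hypothesis library turns a stated invariant into thousands of generated (and automatically shrunk) test cases. `my_sort` below is a hypothetical function under test:

```python
# Property-based fuzzing with the Hypothesis library: state invariants,
# let the tool generate and shrink thousands of inputs. `my_sort` is a
# hypothetical function under test, not anything from the post.
from hypothesis import given, strategies as st

def my_sort(xs: list[int]) -> list[int]:
    return sorted(xs)  # stand-in implementation

@given(st.lists(st.integers()))
def test_my_sort(xs: list[int]) -> None:
    out = my_sort(xs)
    assert all(a <= b for a, b in zip(out, out[1:]))  # output is ordered
    assert sorted(out) == sorted(xs)                  # no elements gained or lost
```

Run it with pytest; Hypothesis generates the input lists and shrinks any failing case down to a minimal counterexample.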
AI can push out the security / user experience Pareto frontier. This could make cybersecurity defense-dominant without much loss of economic competitiveness, because human (both developer and consumer) fallibility and laziness are major sources of software's insecurity. On the other hand, consumers care about security and privacy much less than is ideal.
How fun a given tool is to use or collaborate with is an underrated factor in adoption, because software engineering tool adoption is often driven from the bottom up. Cursor does well here.
Acknowledgements
Thanks to Eryney Marrogi and Willy for comments on an early version of this.