We are at the cusp of one of the largest technological inflections of our generation. Some industries will be uprooted and replaced; others will grow significantly as new markets are unlocked. With the proper framework, investors can be ready to take advantage of these inflections. For an introductory article on generative AI, see our Part 1 here: https://pernasresearch.com/investment-writings/generative-ai/
LLMs “Think”
While much attention has focused on model training and talk of “Stargate” data centers, inference workloads are poised to take center stage. About six months ago, LLMs faced criticism for “hallucinations” and their lack of Type 2 thinking—that is, an inability to reason critically beyond “one-shot” responses. With the recent release of OpenAI’s o1 (codenamed “Strawberry”), however, a new picture is emerging.
This shift is driven by increased inference compute. Rather than providing a rapid, one-step response, ChatGPT now employs a technique called “hidden chain of thought” reasoning, performing multiple internal steps to analyze a problem before responding. This has given rise to a second scaling law: the more compute an LLM uses at inference time, the more accurate its response (see the right-hand graph²).
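As a toy illustration of this second scaling law, suppose each independently sampled answer is correct with some fixed probability and the model takes a majority vote across samples (a simplified stand-in for inference-time techniques like self-consistency; the 60% per-sample accuracy is an assumption for illustration, not a measured figure):

```python
from math import comb

def majority_vote_accuracy(p: float, k: int) -> float:
    """Probability that a strict majority of k independent samples are
    correct, given each sample is correct with probability p
    (toy model of spending more inference compute per question)."""
    # Sum the binomial tail: more than half of the k samples correct.
    return sum(comb(k, i) * p**i * (1 - p)**(k - i)
               for i in range(k // 2 + 1, k + 1))

for k in (1, 5, 25, 101):
    print(k, round(majority_vote_accuracy(0.6, k), 3))
```

Under these toy assumptions, accuracy rises monotonically with the number of samples—more inference compute buys a better answer, which is the intuition behind the graph.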
The rise of inference compute has been fueled by significant reductions in cost and latency, thanks to specialized hardware and model optimization. For instance, NVLink networking enables 72 Blackwell GPUs to function as a single unit, cutting “thoughtful inference time” to mere milliseconds. Hyperscalers like Amazon have also developed specialized AI hardware dedicated to inference.
The Rise of Inference and Implications
As inference workloads grow, they will reshape the dynamics between data centers and edge devices. In roughly five years, it’s likely that 80% of total compute will occur at the inference level, with only 20% dedicated to training. The trend of prolonged inference highlights the increasing role of edge data centers, which may soon surpass the importance of large hyperscale facilities.
Edge data centers—typically smaller facilities with fewer than 50 racks located in urban areas—are a fragmented industry. However, with rising inference workloads, more computation may shift to the edge, potentially driving substantial upgrades. Edge data centers could see power consumption more than double as demand grows for real-time inference (e.g., generating movies from prompts like “Create Star Wars in the style of Christopher Nolan with a John Williams soundtrack”), increasing their importance.
Inference also places pressure on edge devices. Smartphones currently average around 8 GB of RAM, which even small language models would fully consume. By 2030, average RAM on smartphones and computers could exceed 50 GB. However, longer inference times allow for smaller, less data-intensive models, creating an equilibrium in which specialized, compact models are tailored for edge devices.
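To see why 8 GB is tight, here is a back-of-the-envelope estimate of the RAM needed just to hold a model's weights (the parameter counts and precisions below are illustrative assumptions, not figures from any specific phone or model):

```python
def model_ram_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate RAM (in GB) to hold model weights alone.
    Ignores activations, KV cache, the OS, and other apps."""
    # params_billions * 1e9 params * bytes each / 1e9 bytes-per-GB
    return params_billions * bytes_per_param

print(model_ram_gb(3, 2))    # hypothetical 3B-param model at fp16 -> 6.0 GB
print(model_ram_gb(3, 0.5))  # same model, 4-bit quantized -> 1.5 GB
print(model_ram_gb(7, 2))    # a 7B model at fp16 already exceeds 8 GB
```

Even a compact 3B-parameter model at fp16 fills most of an 8 GB phone, which is why aggressive quantization and smaller, specialized models matter at the edge.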
Data & Agents
For investors, assessing how outdated a company’s tech stack is during a technological sea change has never been easier. However, predicting whether new technology will harm or benefit a company remains complex; while first-order effects may offer advantages, second-order effects can differ significantly. In the near term, the companies most likely to benefit from generative AI technology are those with 1) existing data advantages and 2) substantial headcounts in areas that can be streamlined through AI agents.
Is Data the New Oil?
Data is often touted by management teams as a competitive advantage, yet not all data offers the same value. For data to create a lasting advantage via AI, it should meet several key criteria: 1) uniqueness (not easily replicated or proxied), 2) volume (transformers improve with more data), 3) durability (data that retains value over time without strong recency effects), 4) quality (clean, unbiased data to avoid issues like model collapse), 5) integration (data aggregated from CRMs, ERPs, sensors, etc., then processed and cleansed), and 6) a positive feedback loop that iteratively improves models.
Although data currently provides a large advantage to incumbents, that advantage could depreciate significantly as the use of synthetic data increases. Synthetic data is artificially generated data designed to mimic the statistical properties of real-world data. It is still plagued by many issues, which compound as its usage propagates into future models.
AI Agents
AI agents represent the potential energy of generative AI realized, shifting from knowledge repositories to action-oriented capabilities—a transformational leap. Soon, “swarms” of millions of agents could operate within companies, driving efficiencies, expanding revenues, and opening new markets. In the short term, however, companies will primarily leverage AI agents for cost reduction, leading to significant cost-curve disruption and a strong deflationary impact. Additionally, AI agents will accelerate an organization’s “clock speed,” enhancing the pace of internal processes from sales to customer service. Companies focused on Business Process Automation, such as Salesforce and UiPath, are poised to benefit from this acceleration.
“I use Synopsys and Cadence for my Chip design, and I look forward to renting or leasing a million AI Chip Designer agents from Synopsys to design a new Chip” – Jensen Huang
While there are technical challenges in creating and implementing AI agents, the most significant hurdle will be the cultural shift required within companies. Organizational inertia and inherent biases often resist these changes. Roles most likely to be automated are those that involve high standardization and volume, such as customer service (e.g., call representatives), inside sales reps and sales development reps, where tasks like lead generation, answering questions, and order processing can be automated.
One example is “Alice,” shown below³, an AI SDR agent developed by 11x. Alice operates at 80% lower cost than her human counterparts, isn’t constrained by geography or time zones, functions across all channels, and improves with each interaction.
“I could see 100% of our SMB salesforce becoming AI-owned…” – Senior Director, Zendesk
AI agents are also being developed for software engineering, enabling top programmers to perform the work of an entire team by assisting in code writing and evaluation. In financial services, AI agents are used for tasks like fraud detection and automated trading. For example, PayPal has effectively utilized AI to prevent fraudulent transactions, saving millions of dollars annually.
Return on AI
Given the buzz around AI and the rise of agents, it is worth stepping back as investors to discern hype from reality. To do so, it helps to examine the factors that underpin AI economics: i) substantial fixed costs via data centers, ii) high incremental margins, iii) enormous TAMs, iv) a virtuous data feedback mechanism, and v) a rapidly advancing technology. Together, these ensure an abundance of animal spirits and frenzy that can rob investors of common sense. We run some basic ROI calculations to test whether the prevailing exuberance regarding AI is substantiated.
To approximate returns on AI investment, we first estimate that roughly $100 billion has been spent on GPUs over the past two years. Roughly another $100 billion was spent on networking equipment and fiber to bring those GPUs to production readiness. Assuming a total investment of about $200 billion in AI, what have the gains been?
We classify the gains into two buckets. The first bucket is general societal productivity gains. As a proxy, we consider revenues from ChatGPT and similar tools, which demonstrate customer value via paid subscriptions (we don’t use profit because ChatGPT is underpricing its product while investing heavily for growth). This totals approximately $6 billion over the last two years. The second bucket is business profits as a result of efficiency gains and optimizations. Profits thus far have mostly accrued to Microsoft, Meta, and other ad tech companies like Google. The increase in profits stems from improved operational efficiency and enhanced revenue from AI-driven tools such as ad-targeting algorithms and generative AI content. We estimate these additional profits at ~$15 billion.
“improvements to our AI-driven feed and video recommendations have led to an 8% increase in time spent on Facebook and a 6% increase on Instagram this year alone… And we estimate that businesses using image generation are seeing a 7% increase in conversions and we believe that there’s a lot more upside….” – Mark Zuckerberg
“only 2.5 years in, our AI business is on track to surpass $10 billion of annual revenue run rate in Q2. This will be the fastest business in our history to reach this milestone.” – Satya Nadella
The Tally:
AI Investments: -$200B
Societal Gains: +$6B
Business Profits: +$15B
With an estimated total spend of approximately $200 billion and returns of around $21 billion, the ROI stands at roughly 10%. This calculation excludes AI “moonshots” such as autonomous driving and early-stage cancer detection. Although there is euphoria around AI, we believe we are still in the middle innings. Even if AI progress halts at the current stage, returns would increase as penetration rates increase at both macro and micro levels. “Macro” refers to overall adoption rates, while “micro” denotes the proportion of time individuals spend using AI applications on a per capita basis.
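The tally above reduces to simple arithmetic; here is a quick sketch using the article's estimates (all figures are the rough estimates stated above, not audited numbers):

```python
# Back-of-the-envelope ROI on AI spend, using the article's estimates ($B).
gpu_spend_b = 100        # est. GPU spend over the past two years
infra_spend_b = 100      # est. networking equipment and fiber
societal_gains_b = 6     # ChatGPT-like subscription revenue as a proxy
business_profits_b = 15  # efficiency and ad-tech profit uplift

total_spend = gpu_spend_b + infra_spend_b
total_gains = societal_gains_b + business_profits_b
roi = total_gains / total_spend
print(f"ROI: {roi:.1%}")  # -> ROI: 10.5%
```

Note this is a cumulative two-year figure, not an annualized return, and it excludes the moonshot categories mentioned above.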
Final Thoughts
Gen AI has been spurred on by accelerated computing. Accelerated computing does not mean going from 100 miles per hour to 150, but from 100 to 5,000. So far, text has been the primary modality for Gen AI; however, multimodality is around the corner. With virtually limitless image and video content, will scaling laws hold in the real world, or will they fall short? Our next piece will delve into multimodal AI and what it means for industries such as robotics and medical imaging.
Sources
- https://www.nytimes.com/2024/09/27/technology/openai-chatgpt-investors-funding.html
- https://openai.com/index/learning-to-reason-with-llms/
- https://www.11x.ai/worker/alice
INVESTMENT DISCLAIMERS & INVESTMENT RISKS
Past performance is not necessarily indicative of future results. All investments carry significant risk, and it’s important to note that we are not in the business of providing investment advice. All investment decisions of an individual remain the specific responsibility of that individual. There is no guarantee that our research, analysis, and forward-looking price targets will result in profits or that they will not result in a full loss or losses. All investors are advised to fully understand all risks associated with any kind of investing they choose to do.