Generative AI: Inference, Agents, and Return on AI
While much attention has focused on model training and discussions of “Stargate” data centers, inference workloads are poised to take center stage. About six months ago, LLMs faced criticism for “hallucinations” and a lack of System 2 thinking: an inability to reason critically and a reliance on “one-shot” responses. With the recent release of OpenAI’s o1 models (code-named “Strawberry”), however, a new picture is emerging, driven by increased inference-time compute. Rather than providing rapid, one-step responses...