MIT study: State of AI in Business 2025
A recent study from MIT concluded that despite enterprises pouring an estimated $30-40 billion into generative AI initiatives, 95% of these projects yield no measurable impact. Adoption is widespread—many organizations are experimenting with AI tools—but only about 5% of pilots reach production with meaningful results.
A brief summary
Most organizations are experimenting, but only 5% of pilots reach production with business value. Widely adopted tools like ChatGPT and Copilot boost individual productivity but not enterprise performance. Custom enterprise systems often fail due to brittle workflows and lack of contextual adaptation.
This gap is what the report defines as the GenAI Divide: a sharp split between a small group extracting real ROI from GenAI and the vast majority seeing no return. The divide stems from differences in integration, learning capability, and outcome-driven evaluation—not from access, talent, or regulation.
The key conclusion of the study
The core barrier to scaling is not infrastructure, regulation, or talent. It is learning. Most GenAI systems do not retain feedback, adapt to context, or improve over time.
That is, they are too general.
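To make the learning gap concrete, here is a minimal sketch of one way a system could retain feedback: corrections from earlier runs are stored per task type and prepended to future prompts. The `call_model` function and the invoice example are hypothetical stand-ins for illustration, not something described in the report.

```python
# Minimal sketch: retaining corrections per task type and replaying them in
# future prompts, so the system adapts instead of staying fully general.
# `call_model` is a hypothetical stand-in for whatever LLM client is used.

from collections import defaultdict

class FeedbackMemory:
    """Stores workflow/human corrections and injects them into new prompts."""

    def __init__(self):
        self._corrections = defaultdict(list)

    def record(self, task_type: str, correction: str) -> None:
        self._corrections[task_type].append(correction)

    def build_prompt(self, task_type: str, task_input: str) -> str:
        rules = "\n".join(f"- {c}" for c in self._corrections[task_type])
        preamble = f"Apply these corrections from earlier runs:\n{rules}\n\n" if rules else ""
        return f"{preamble}Task ({task_type}):\n{task_input}"

# Usage: feedback recorded today changes how tomorrow's prompt is built.
memory = FeedbackMemory()
memory.record("invoice_summary", "Always report amounts in EUR.")
prompt = memory.build_prompt("invoice_summary", "Summarise the attached invoice.")
# output = call_model(prompt)  # call_model: the workflow's actual LLM client
```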
How do we overcome these challenges?
Much like AI-assisted coding, where getting even acceptable output often takes extensive prompting and iteration, GenAI attached to real workflows needs to allow for the same kind of iterative, feedback-driven refinement.
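As an illustration only, such a refinement loop could look like the sketch below: the workflow re-prompts the model with its own critique until the output passes, or gives up and escalates. All callables here (`call_model`, `acceptable`, `critique`) are hypothetical stand-ins supplied by the surrounding workflow.

```python
# Minimal sketch: iterating on a model output with feedback from the workflow,
# the same way a developer keeps re-prompting a coding assistant.

def refine(task: str, call_model, acceptable, critique, max_rounds: int = 3):
    """Return an output that passes the workflow's check, or None to escalate."""
    prompt = task
    for _ in range(max_rounds):
        output = call_model(prompt)
        if acceptable(output):
            return output
        # Feed the concrete objection back into the next prompt instead of
        # hoping a generic model gets it right unaided.
        prompt = f"{task}\n\nPrevious attempt:\n{output}\n\nProblem:\n{critique(output)}"
    return None  # hand off to a human rather than shipping a bad result
```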
Part of being less general is also deciding whether the outcome of a task is acceptable. If an AI makes that decision, it is itself prone to hallucinations, so we cannot rely on AI for this purpose; the acceptance criteria for the outcome need a deterministic approach.
In practice, this means that when GenAI is integrated into an existing workflow, the interface between the AI and the workflow must be able to determine, with no hallucination risk (that is, without using AI), whether the outcome of an AI-completed task is good enough.
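A minimal sketch of what such a deterministic gate could look like, assuming the AI task returns a JSON payload with a few illustrative required fields; the check itself is plain validation code with no model in the loop.

```python
# Minimal sketch of a deterministic acceptance gate: plain validation code,
# no model involved, so the accept/reject decision itself cannot hallucinate.
# The required fields and allowed values are illustrative assumptions.

import json

REQUIRED_FIELDS = {"customer_id", "total", "currency"}
ALLOWED_CURRENCIES = {"EUR", "USD"}

def accept_ai_output(raw: str) -> bool:
    """Return True only if the AI-produced result meets hard, checkable criteria."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    if not isinstance(data, dict):
        return False
    if not REQUIRED_FIELDS.issubset(data):
        return False
    if data["currency"] not in ALLOWED_CURRENCIES:
        return False
    # The total must be a non-negative number; anything else is rejected.
    return isinstance(data["total"], (int, float)) and data["total"] >= 0
```

The point of this design is that the pass/fail decision rests only on rules the workflow owner can write down and verify; anything that fails the gate goes back for another attempt or to a human, never silently into production.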