Master Joe Phillips
White Belt · 11 min read

AI Leadership Failure: Why 95% of AI Projects Are Failing

MIT confirms 95% of AI projects are failing. The cause isn't technical — it's leadership failure. The five patterns executives keep repeating, and how to stop.

A finance director at a Latin American manufacturer told me last month that her CEO had approved three AI pilots in the same quarter. Each from a different vendor. Each promising "transformation." Six months later: two abandoned, one still running because nobody wants to be the executive who shuts it down. Combined cost: USD 640,000. Combined operational improvement: zero.

This is not a vendor story. The vendors did their job. The pilots were technically competent. The failure was at the leadership level, and it was so predictable that the MIT study released in 2025 — the one that put the 95% failure rate in the public conversation — could have been built using her company as a single case study.

That study is now being cited across executive education, board reports, and consulting decks. Most of the commentary treats it as a technical problem. We need better governance. We need better data. We need better models. All true. All insufficient. The 95% number is not a technical metric. It's a leadership metric. Once you see it that way, you can do something about it.

Revista SUMMA, Central America and the Caribbean's tier-one business publication, made this same point in June 2026, citing my book to frame the regional version of the failure: "before the prompt, posture must exist." I'll come back to that line. It is the one sentence that, if executives took it seriously, would close most of the gap between the 5% who succeed and the 95% who don't.

What "AI leadership failure" actually means

I want to be precise. AI leadership failure is not when an executive fails to understand transformer architectures or hyperparameter tuning. Those are technical concerns and executives are right to delegate them.

AI leadership failure is when the executive fails at the job that cannot be delegated: defining what the organization is trying to achieve, measuring whether it's getting there, designing the system that produces the result, and calibrating the promise made to whoever depends on it.

These are not new responsibilities. They are the same executive responsibilities that produced the 5% of AI projects that actually work. What changes with AI is that the cost of executing them poorly compounds faster. A vague goal in a manual process produces a slow drift. A vague goal multiplied by an AI system produces measurable chaos in weeks.

The MIT finding is not that AI doesn't work. It is that AI amplifies whatever leadership clarity you bring to it — or whatever leadership confusion. The 95% is the confusion rate, scaled.

The five patterns of AI leadership failure

After two decades building software for over a thousand clients and watching dozens of AI initiatives across Latin American and U.S. companies, I've come to recognize five recurring patterns. Each is a leadership failure. Each is fixable. None of them require technical fluency to address.

Pattern 1 — Mistaking velocity for direction

The executive announces an AI initiative because the board asked about it, the CEO read a McKinsey report, or a competitor made the news. The internal memo reads: "We need to be moving on AI."

Notice what's missing: toward what?

Velocity without direction is the default failure mode of executives under pressure. It feels productive. It generates internal communications, vendor meetings, pilot proposals. It produces motion. It produces nothing else.

The fix is unglamorous. Before approving a single AI initiative, the executive must be able to complete this sentence: "This effort succeeds when [specific outcome] is measurable by [specific metric] by [specific date]." If the sentence cannot be completed, the initiative is not ready to be funded.

In my book AI Black Belt, I call this the PMP triangle: Purpose, Metric, Promise. It belongs in the first conversation, not the third board update.

Pattern 2 — Buying tools before defining the problem

The pitch deck looks impressive. The vendor demo runs smoothly. The procurement team is already running the standard checklist.

Two questions don't get asked. What specific problem is this tool solving? What would success look like in 90 days?

When those questions don't get asked, the tool is being purchased to perform leadership. The executive is buying the appearance of acting on AI rather than the result of using AI well. The cost shows up later, in license fees, in change-management debt, and in the team's growing suspicion that AI is the leadership's distraction of the year.

The fix is a discipline I borrow from carpentry: measure four times before cutting once. Before any tool acquisition, the executive must define the exact problem, the expected outcome, the validation metric, and the cost of a wrong call. If those four answers don't exist in writing, the tool is not the answer because the question hasn't been asked.

Pattern 3 — Skipping measurement

Here is a sentence I hear in executive teams more often than any other: "We'll figure out the metrics after we see what it can do."

That sentence is the gravestone of AI projects. Without a metric defined upfront, the project succeeds or fails based on whoever has the loudest opinion six months in. That is not measurement. That is politics.

Real measurement requires the executive to commit to a number, a threshold, and a date, before the initiative starts. "This pilot delivers a 30% reduction in claims processing time by Q4, or we shut it down." That is a measurable promise. It produces clarity. It produces accountability. It produces, in 90% of cases, the early signal needed to course-correct before the budget is gone.
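The "number, threshold, date" commitment can be written down as a check that anyone on the team can run. What follows is a minimal sketch, not a method taken from the MIT study or from the book: the `PilotPromise` class, its field names, the 40-hour baseline, and the December 31 deadline standing in for "by Q4" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PilotPromise:
    """A promise fixed before the pilot starts (hypothetical structure)."""
    metric: str
    baseline: float          # value measured before the pilot
    target_reduction: float  # promised fractional reduction, e.g. 0.30
    deadline: date

    def reduction(self, current: float) -> float:
        """Fractional reduction achieved relative to the baseline."""
        return (self.baseline - current) / self.baseline

    def verdict(self, current: float, today: date) -> str:
        """Return 'met', 'on watch', or 'shut down'."""
        if self.reduction(current) >= self.target_reduction:
            return "met"
        # Target not reached: keep watching until the deadline, then stop.
        return "on watch" if today <= self.deadline else "shut down"


# Illustrative numbers only: assume a claim takes 40 hours before the pilot.
promise = PilotPromise(
    metric="claims processing time (hours)",
    baseline=40.0,
    target_reduction=0.30,
    deadline=date(2026, 12, 31),
)

print(promise.verdict(current=34.0, today=date(2026, 9, 1)))    # 15% reduction, deadline ahead
print(promise.verdict(current=27.0, today=date(2026, 11, 15)))  # 32.5% reduction
print(promise.verdict(current=34.0, today=date(2027, 1, 5)))    # 15% reduction, deadline passed
```

The point of the sketch is that the verdict depends only on values agreed before the pilot began. Nothing in it can be renegotiated six months in by whoever has the loudest opinion.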

The leaders who skip this step are not being practical. They are being afraid. Measurement creates the possibility of admitting a project failed. The reluctance to define metrics is the reluctance to be held accountable. Both are leadership failures, not technical ones.

Pattern 4 — Confusing automation with delegation

This pattern is the one that produces the most operational damage. An executive automates a process that has not been properly delegated, documented, or measured. The result is not efficiency. It is the institutionalization of whatever was wrong with the process before.

Real delegation requires five conditions: clear result, authority aligned with responsibility, sufficient resources, measurable success criteria, and a follow-up ritual without micromanagement. AI is not a substitute for any of these. AI executed without proper delegation is delegation pretending to be done.

A practical illustration: a company automates its customer service triage with a language model. The model is competent. The team has not agreed on what counts as a successful resolution. Within sixty days, the model is closing tickets the team would have escalated. Customer satisfaction drops. Nobody can explain why. The executive is told the AI failed. The AI did exactly what it was instructed to do. The leadership failure was upstream.

Pattern 5 — Treating AI as a project, not a system

The fifth pattern is the most expensive over time. An executive treats an AI implementation as a one-time project — we deployed it, we trained the team, we declared victory — instead of as the start of a living system that requires continuous oversight, feedback loops, and adjustment.

AI systems drift. Models behave differently as input distributions change. Users adapt to the model's outputs, which changes the inputs the model sees, which changes its outputs again. Without a governance routine (who reviews outputs, on what cadence, with what authority to intervene), the system slowly stops being aligned with the original purpose.
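A review cadence becomes easier to keep when it has a concrete trigger. The sketch below is one simple way to flag drift, assuming the team already records a weekly number for the system, here the share of tickets a triage model escalates to a human. The function name `needs_review`, the two-standard-deviation threshold, and the sample data are hypothetical. It is a crude mean-shift check, not a formal statistical drift test.

```python
from statistics import mean, stdev


def needs_review(baseline: list[float], recent: list[float],
                 max_shift: float = 2.0) -> bool:
    """Flag the system for human review when the recent average moves more
    than `max_shift` baseline standard deviations away from the baseline
    average."""
    spread = stdev(baseline)
    if spread == 0:
        # No variation in the baseline: any change at all is worth a look.
        return mean(recent) != mean(baseline)
    shift = abs(mean(recent) - mean(baseline)) / spread
    return shift > max_shift


# Illustrative data: weekly share of tickets escalated to a human.
baseline_weeks = [0.21, 0.19, 0.20, 0.22, 0.18, 0.20]
stable_weeks = [0.20, 0.21, 0.19]
drifting_weeks = [0.12, 0.10, 0.09]

print(needs_review(baseline_weeks, stable_weeks))    # False
print(needs_review(baseline_weeks, drifting_weeks))  # True
```

The check only raises a flag. Deciding what to do about it still falls to a named person with the authority to intervene.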

The 5% of executives who succeed with AI understand this. The 95% who don't succeed assume AI is finished once it's deployed. They will discover otherwise, expensively, six to eighteen months in.

Why technical solutions don't fix leadership problems

For every one of these five patterns, some vendor has pitched a "technical solution." "We have a better governance platform." "Our model explains itself." "Our system has built-in monitoring."

None of them solve the underlying problem because the underlying problem is not technical. A better governance platform does not give the executive purpose. A more explainable model does not give the executive measurement discipline. A system with built-in monitoring still requires someone with authority to act on what the monitoring reveals.

This is the central confusion of the current moment. The market is selling executives technical answers to leadership questions. The technical answers are excellent. They cannot do the work that only the executive can do.

Until the executive shows up with clear purpose, measurable promises, and the discipline to keep the system in alignment, no technical investment will lift the success rate above 5%. The vendors will continue to be blamed. The vendors will continue not to be the problem.

The principle that closes the gap

The line SUMMA cited from my book is the shortest version of the fix:

"Before the prompt, posture must exist."

Posture is the executive's contribution to the system. It includes purpose, criterion, discipline, and the willingness to be measured. It cannot be installed by a vendor. It cannot be delegated to the AI initiative team. It is the prerequisite, not the byproduct.

The seven-belt method I lay out in AI Black Belt is a structured way for non-technical executives to develop that posture, one belt at a time. White Belt builds the human foundation (mindset, learning, team, the AMARTE goal framework). Yellow Belt installs operating principles. Green Belt teaches you to measure honestly and decide what to automate using the Green Matrix. Blue Belt is about leverage — financial, human, process, technological, and influence. Brown Belt is systematization (the Prepare-Aim-Fire model with the PMP triangle). Red Belt is the living-system discipline. Black Belt is the eight-step Master Map of AI Systematization, where the prompt finally appears — as a consequence, not a starting point.

The reader does not need to read the whole book to start. They need to recognize which pattern is consuming their organization right now, and which belt's discipline answers it.

What executives can do tomorrow

Three actions that cost nothing and start the correction:

1. Audit your current AI initiatives against the PMP triangle. Take each one and try to articulate the Purpose, the Metric, and the Promise it is committed to. Any initiative that cannot pass this audit is a candidate for cancellation, not continuation.

2. Define the success metric and the kill-switch metric before any new initiative starts. If you cannot define both, the initiative is not ready to be approved.

3. Designate one human as accountable for each AI system in operation. Not a vendor. Not a committee. A specific person whose performance is measured against the system's performance. Without this, governance is theater.
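The three actions above can be run as a paper audit, and a spreadsheet is enough. For teams that prefer something executable, here is a minimal sketch under stated assumptions: the `Initiative` record, its field names, and the two sample initiatives are hypothetical, and the mapping of Purpose, Metric, and Promise onto plain text fields is an illustration rather than the book's own template.

```python
from dataclasses import dataclass, fields


@dataclass
class Initiative:
    """One AI initiative as it exists in writing (hypothetical record)."""
    name: str
    purpose: str = ""      # what the organization is trying to achieve
    metric: str = ""       # how progress is measured
    promise: str = ""      # the committed result and date
    kill_switch: str = ""  # the condition that ends the initiative
    owner: str = ""        # one accountable person, not a committee


def audit(initiative: Initiative) -> list[str]:
    """Return the names of the fields that are still blank."""
    return [f.name for f in fields(initiative)
            if not getattr(initiative, f.name).strip()]


portfolio = [
    Initiative(
        name="Claims triage pilot",
        purpose="Shorten claims processing",
        metric="Average processing time per claim",
        promise="30% reduction by Q4",
        kill_switch="Less than 30% reduction at the Q4 review",
        owner="Director of Claims Operations",
    ),
    Initiative(name="Chatbot pilot", purpose="Be moving on AI"),
]

for item in portfolio:
    gaps = audit(item)
    status = "ready" if not gaps else "missing " + ", ".join(gaps)
    print(f"{item.name}: {status}")
```

A filled-in field is not the same as a good answer. "Be moving on AI" passes the blank check here, yet it is exactly the velocity-without-direction statement from Pattern 1. The sketch finds what is missing, and judging what is written remains the executive's job.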

These three actions do not require buying anything. They require the executive to do work that vendors cannot do for them. Most won't. That is precisely why the 5% who do this work continue to take market share from the 95% who don't.


The 95% failure rate is uncomfortable to look at directly. It implies that most of what is currently being celebrated as "AI transformation" is being measured by the wrong scoreboard. The scoreboard the market needs is not how many AI initiatives a company has launched. It is how many of them are still delivering measurable value at the eighteen-month mark. By that standard, the failure rate is closer to 95% than it is to 50%.

The path out of the 95% is available, but it does not go through technology. It goes through leadership posture, systematic measurement, honest commitment to specific outcomes, and the discipline to treat AI as a living system that needs governance.

It is the same path the 5% have always been on. It is now available to anyone willing to do the work.


If you want the full methodology — the seven belts, the named frameworks (AMARTE, Hwa·Won·Ryu, Tumanov Filter, Green Matrix, PAF, PMP Triangle, Master Map of AI Systematization), and the integrated case studies — read AI Black Belt: Fundamentals Before the Prompt. Published May 2026 by Legacy Publishers, foreword by Spencer Hoffmann. Available now on Amazon in Spanish; English edition in final author review.

For keynote speaking engagements on this topic, see Speaking. For executive AI consulting engagements, see Consulting.

For the tactical 47-question checklist that precedes any AI budget approval, read AI Implementation Checklist for Executives. For the 12 filters that separate real AI consultants from deck-sellers, read How to Choose an AI Consultant. For the framework that separates AI strategy from AI implementation, read AI Strategy vs AI Implementation.

