The Good, The Bad, And The Ugly Of Copilot Studio: A Brutally Honest Review Going Into Late 2025

Look, I’ve spent more hours in Copilot Studio than I care to admit. As someone who’s deep in the trenches designing agents, with extensive experience in n8n and other agent-building architectures, I owe you the unfiltered truth about where this platform actually stands as we head toward the end of 2025.

The Good

GPT-5 Auto and Reasoning Models: Finally, Something That Works

Let’s be honest: GPT-4o and GPT-4.1 have been completely awful across many different types of agents. Night and day doesn’t even begin to describe the difference with GPT-5. The integration of GPT-5 provides genuinely advanced natural language understanding and generation. Agents can actually handle complex queries, context switching, and multi-step reasoning without falling apart. For the first time, the platform is showing some basic value and is actually usable. This alone makes 2025 a turning point.

MCP Server Integrations: Microsoft Actually Got Something Right

Microsoft’s early adoption of MCP server integrations puts them ahead of many competing platforms. The MCP server model allows centralized tool orchestration, and the OAuth support makes it possible to securely connect APIs, internal systems, and external services. Centralized execution simplifies multi-agent workflows compared to older distributed models. When it works, it’s impressive.

Rapid Testing and POC Capabilities

I’ll give credit where it’s due: Copilot Studio excels at quickly spinning up agents, running mock conversations, and testing tool integration. This is excellent for proof-of-concept experimentation, internal demos, and exploring agent scenarios without heavy infrastructure. If you need to show something fast to stakeholders, this is your playground.

Governance Capabilities Are Increasing

I’m impressed by the new DLP (Data Loss Prevention) and security controls offered by Purview. For more, see the Governance Best Practices write-up and watch our Guardians of M365 Governance video with Microsoft’s Erica Toelle about DSPM for AI.

Embed Copilot Studio Into Your iOS, Android, and Windows Apps

The client SDK has improved significantly; see the September updates and the SDK documentation.

The Bad

Connected Agents Can’t Run Their Own MCP Servers

This is a massive limitation that kills multi-agent architectures. You can delegate messages to child agents, but tool invocation fails if the MCP server is attached to the child agent. Every multi-agent design today must proxy all MCP calls through the parent agent – which is clunky, unintuitive, and feels like a hack. If this isn’t fixed, Copilot Studio simply isn’t truly multi-agent capable for enterprise integrations.
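To make the workaround concrete, here is a minimal, platform-agnostic sketch of the proxy pattern in Python. Nothing in it is Copilot Studio's API: `ParentAgent`, `ChildAgent`, `ToolRequest`, and the `lookup_order` tool are hypothetical names I made up for illustration, and the MCP tool is a plain Python function standing in for a real server.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Hypothetical stand-in for a tool exposed by an MCP server.
ToolFn = Callable[..., Any]


@dataclass
class ToolRequest:
    """What a child agent hands back when it needs a tool it cannot call itself."""
    tool: str
    args: Dict[str, Any]


@dataclass
class ChildAgent:
    name: str

    def handle(self, message: str) -> ToolRequest:
        # The child only decides WHICH tool is needed; it never invokes it.
        return ToolRequest(tool="lookup_order", args={"order_id": message.strip()})


@dataclass
class ParentAgent:
    # Every MCP tool is registered on the parent, none on the children.
    tools: Dict[str, ToolFn] = field(default_factory=dict)
    children: Dict[str, ChildAgent] = field(default_factory=dict)

    def delegate(self, child_name: str, message: str) -> Any:
        request = self.children[child_name].handle(message)
        if request.tool not in self.tools:
            raise KeyError(f"tool {request.tool!r} is not attached to the parent")
        # The parent executes the call on the child's behalf.
        return self.tools[request.tool](**request.args)


def lookup_order(order_id: str) -> dict:
    # Placeholder for a real MCP tool call.
    return {"order_id": order_id, "status": "shipped"}


parent = ParentAgent(
    tools={"lookup_order": lookup_order},
    children={"orders": ChildAgent(name="orders")},
)
result = parent.delegate("orders", " A-1001 ")
print(result)
```

The extra hop is the clunky part: the child can reason about the request but cannot act on it, so the parent ends up owning every tool for every child.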

Transparency Around Runtime Versions? Forget About It

It’s impossible to know what build or orchestration runtime your tenant is on without digging through obscure menus. Some behaviors – like MCP support or multi-agent quirks – are completely tied to the runtime version, but Microsoft doesn’t provide a clear way to check or control this. Troubleshooting becomes a guessing game.

Debugging and Logging Are Basically Non-Existent

Conversation logs help with text flow but are completely opaque for tool execution failures. There’s no easy way to confirm if a child agent tried to call an MCP server or if the call just silently failed. We desperately need structured logs or a developer mode showing exact tool invocation flow per agent.
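As an illustration of what that could look like, the Python sketch below records one JSON line per tool invocation, including failures. It is a wish-list shape, not an existing Copilot Studio feature: the field names (`agent`, `tool`, `status`, `error`, `duration_ms`) and the `log_tool_call` helper are assumptions invented for this example.

```python
import json
import time
from typing import Any, Callable, Dict, List

# In-memory sink for the example; a real system would write to a log stream.
LOG: List[str] = []


def log_tool_call(agent: str, tool: str, fn: Callable[..., Any], **args: Any) -> Any:
    """Run a tool and record one structured log line, whether it succeeds or fails."""
    record: Dict[str, Any] = {"agent": agent, "tool": tool, "args": args}
    start = time.perf_counter()
    try:
        result = fn(**args)
        record["status"] = "ok"
        return result
    except Exception as exc:
        # Capture the failure instead of letting it vanish silently.
        record["status"] = "error"
        record["error"] = f"{type(exc).__name__}: {exc}"
        raise
    finally:
        record["duration_ms"] = round((time.perf_counter() - start) * 1000, 3)
        LOG.append(json.dumps(record))


def flaky_tool(order_id: str) -> dict:
    # Hypothetical tool that simulates an unresponsive MCP server.
    raise TimeoutError("MCP server did not respond")


try:
    log_tool_call("child-orders", "lookup_order", flaky_tool, order_id="A-1001")
except TimeoutError:
    pass

print(LOG[-1])
```

With a record like this per agent and per call, "did the child even try?" becomes a log query instead of a guess.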

Still Tied to the Old PVA Architecture

Here’s the uncomfortable truth: under the hood, Copilot Studio is still heavily influenced by the legacy Power Virtual Agents framework. This shows in limited orchestration flexibility, fragile environment setup, and convoluted tool and Topics handling. If Microsoft wants to compete in multi-agent AI and enterprise-level orchestration, they need to break completely away from the old PVA architecture and rethink this from the ground up.

The Ugly

ALM (Agent Lifecycle Management) Is Simply Broken

When placing a Copilot Studio agent in a managed solution, you suddenly get vague SQL errors. These seem to come from knowledge sources or connection references. When you delete one, it’s only removed from the UI – check via the API and it’s still there. Application / Agent Lifecycle Management should be basic functionality, not a technical impossibility. You want to see differences between versions of agents – and roll back? No chance.

Version Control in Teams Is a Nightmare

When publishing an agent to Teams, it’s not automatically updated for end users. All my users end up running different versions of the chat. How is this acceptable in 2025?

Content Filtering With Zero Transparency

When an agent response is ContentFiltered, Microsoft provides no transparency about what triggered the filter. No logging, no reason code, no detail explaining why a particular input, output, or tool execution was blocked. Debugging becomes impossible.

Governance Limitations From an Admin Perspective

There’s no way to prevent users from creating agents in other environments, or to redirect them to the default environment. You can’t require admin involvement before users deploy agents to Copilot. There’s no way to block newly created custom agents by default in MAC (the Microsoft 365 admin center). And we’re lacking basic insights into which agents are being used and by whom.

Marketing vs. Product Team Misalignment

Every now and then, we see a new feature marketed as GA, only to find out it’s actually still in preview and stays there for several months. It’s fine to release preview features, but don’t market them as GA. The trust erosion is real.

YouTube “Gurus” Mis-Frame Product Limitations as User Error

Many tutorials frame struggles with real agent-building architectures as user mistakes or bad design. In reality, a lot of what breaks comes down to product-level limitations, not poor design by the user. This hurts adoption by organizations looking to start using the platform at enterprise scale.

The Bottom Line

Copilot Studio in late 2025 is a platform of contradictions. GPT-5 has finally made it usable, and the MCP integration architecture shows genuine promise. But the legacy baggage, broken ALM, governance nightmares, and Microsoft’s tendency to market preview features as production-ready create serious trust issues.

If you’re doing quick POCs or demos? Great platform. If you’re building enterprise-grade, multi-agent architectures with proper governance and lifecycle management? Buckle up – you’re in for a rough ride.

The potential is there. The execution? Still a work in progress.

Talk to us at HanseVision about your requirements and questions about Power Platform, M365 Governance, Copilot (Studio) and Agents Governance!

Find my Calendar here and check out our OnePager about M365 Governance.
