Is the New GPT-5 in Copilot Actually Terrible? An Investigation
Zack Saadioui
8/10/2025
So, Is the New GPT-5 in Copilot Actually... Terrible? Let's Investigate.
Alright, let's talk about the elephant in the room. The AI world has been buzzing, waiting for what feels like an eternity for GPT-5. The rumors were flying for well over a year. Sam Altman, OpenAI's CEO, was out there making some pretty bold claims, calling it the "smartest, fastest, most useful" AI yet. And then, boom, in early August 2025, it finally dropped. Not just in ChatGPT, but Microsoft went all-in & pushed it directly into Copilot for free.
The hype was through the roof. We were all expecting this massive, world-changing leap, the kind of jump we saw from GPT-3 to GPT-4 that really kicked off this whole AI craze.
But then… the reviews started rolling in. And honestly? A LOT of people were not just underwhelmed; they were genuinely angry. The top post on the ChatGPT subreddit was literally titled "GPT-5 is horrible." People were complaining about short, lazy replies, a weirdly sanitized "AI-speak" personality, & Plus users hitting their usage limits faster than ever.
So what in the world is going on? How can the "best model in the world" be getting this much hate? Is GPT-5 in Copilot a massive misstep? Is it secretly a downgrade?
I went down the rabbit hole, & it turns out the answer is... complicated. It's not a simple "yes" or "no". Here’s the real story behind the chaos.
First Things First: Yes, GPT-5 is ACTUALLY in Copilot
Before we get into the drama, let's get the facts straight. Microsoft officially announced they've integrated GPT-5 into their whole suite of products. For most of us, that means when you go to copilot.microsoft.com, you can now use GPT-5.
But here’s the first CRUCIAL thing to understand: it’s not just one single model. What OpenAI & Microsoft have built is a system of models. When you use Copilot's new "Smart mode," there's a real-time "router" working behind the scenes. This router looks at your prompt & decides which model is best for the job. Is it a simple question? It'll use a faster, more efficient model. Is it a super complex, multi-step problem? It'll route it to a more powerful, deeper "thinking" model.
There are different tiers, too. Free users get access to the standard GPT-5 & a 'mini' version. Paid users get higher limits, & Pro users can even access 'gpt-5-pro' & 'gpt-5-thinking' for the really heavy-duty stuff.
So, it's not like a single switch was flipped. It's a whole new, dynamic system. This detail is KEY to understanding why everyone is having such a different experience.
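To make the routing idea a little more concrete, here's a toy sketch of how a complexity-based router could work. To be clear: this is purely illustrative. Microsoft & OpenAI haven't published their actual routing logic, & the heuristic, thresholds, & model names below are all hypothetical stand-ins.

```python
# Purely illustrative sketch of a complexity-based model router.
# The heuristic, thresholds, and model tier names are hypothetical;
# the real Copilot/OpenAI routing logic has not been published.

def estimate_complexity(prompt: str, has_attachments: bool = False) -> int:
    """Crude stand-in for whatever signals a real router would use."""
    score = 0
    score += len(prompt.split()) // 50          # longer prompts -> more work
    score += 2 if has_attachments else 0        # documents to reason over
    multi_step_cues = ("then", "analyze", "summarize", "draft", "compare")
    score += sum(cue in prompt.lower() for cue in multi_step_cues)
    return score

def route_prompt(prompt: str, has_attachments: bool = False) -> str:
    """Pick a (hypothetical) model tier based on the complexity estimate."""
    score = estimate_complexity(prompt, has_attachments)
    if score >= 4:
        return "gpt-5-thinking"   # slow, deep reasoning
    if score >= 2:
        return "gpt-5"            # standard tier
    return "gpt-5-mini"           # fast & cheap for simple asks

print(route_prompt("Write me a poem about a sad robot"))
# prints "gpt-5-mini" with this toy heuristic

print(route_prompt(
    "Read these three quarterly reports, identify the primary drivers of "
    "revenue growth, then draft a five-slide summary for the executive team.",
    has_attachments=True,
))
# prints "gpt-5-thinking" with this toy heuristic
```

The real router almost certainly uses something far smarter than keyword matching, but the basic split is the point: cheap, fast models for simple asks & an expensive "thinking" path for heavy lifting. Keep that split in mind, because it explains most of the drama that follows.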
The Case for "It's Terrible": Why Are People So Mad?
You can't ignore the wave of negative feedback. It's real, it's widespread, & the reasons are pretty valid when you dig into them. It's not just about the model's performance; it's about the entire user experience surrounding the rollout.
The "Shrinkflation" Argument
This is probably the biggest complaint from power users. The term "Shrinkflation" kept popping up on Reddit, with users speculating that OpenAI is trying to cut down on their massive server costs. And honestly, you can see why they'd think that.
The argument is that to save energy & money, the model is designed to give shorter, more concise—& often less helpful—answers. Users reported that the unique "personality" they had gotten used to with older models was gone, replaced by generic, overly cautious AI-speak. One user put it perfectly: it "feels like a downgrade branded as the new hotness." When you're used to a certain level of depth & you suddenly get something that feels lazy, it's easy to feel like the company is cutting corners at your expense.
The Forced Upgrade Problem
This one is HUGE. In a move that baffled many, OpenAI announced they were "deprecating"—which is corporate-speak for shutting down—all the preceding models. This includes versions like GPT-4o & others that tons of developers, researchers, & power users had built their entire workflows around.
Imagine spending months perfecting a set of prompts that work flawlessly with a specific model, only to be told you HAVE to switch to a new one that behaves completely differently. It broke things for a lot of people. This decision alone was bound to create backlash, regardless of how good GPT-5 actually was. It felt like a slap in the face to their most dedicated users.
Expectations vs. Reality
Let's be real: calling it "GPT-5" set an impossibly high bar. The jump from GPT-3 to 4 was astronomical. The term implies a generational leap. But that's not what we got. Many early reviewers & users noted the improvements were more incremental than revolutionary.
When you're promised a spaceship & you get a slightly faster car, you're going to be disappointed, even if the car is pretty nice. The consensus seems to be that while it's better in some areas, it's also a step back in others, making it feel like a weird sidegrade rather than a true successor.
The Other Side of the Coin: Where GPT-5 Apparently Shines
Okay, so with all that negativity, you'd think the model was a total flop. But it's not that simple. While casual users were complaining, another group—mostly business & enterprise users—was having a completely different experience.
Deeper Reasoning & Complex Tasks
This is where GPT-5 seems to be a genuine game-changer. Remember that "thinking" mode? When you give it a truly complex, multi-layered task, it delivers in a way older models just couldn't.
One YouTuber, David Fortin, did a side-by-side comparison. He asked the old Copilot (using GPT-4o) & the new GPT-5 version to analyze a 109-page financial statement from Starbucks. The old version gave a decent summary. The GPT-5 version, however, returned a beautifully structured response with a summary table, detailed breakdowns, & insights that were FAR more thorough. He also tested it by analyzing sales data, & GPT-5 was able to create detailed charts breaking down revenue by product & region by year, something the old model completely failed to do.
It's clear that GPT-5 is designed to be a workhorse for what OpenAI calls "agentic" tasks—multi-step processes that require reasoning. Think summarizing a 200-page report, drafting an investor memo, or debugging a large codebase. In these areas, it's not just better; it's in a different league.
A Boost for Business & Enterprise
This seems to be the real focus of GPT-5. It's less of a quirky creative companion & more of a serious business tool. Microsoft highlighted that with GPT-5, Microsoft 365 Copilot is better at reasoning through complex questions & staying on track in long conversations. It can even reason over your emails & documents to help you stay on top of your work.
This shift towards professional efficiency is pretty clear. While GPT-5 in Copilot is fantastic for improving internal business communications & automating complex document analysis, it raises the question of how businesses handle their external communications. This is where specialized AI tools really come into play. For example, while Copilot is sorting out your internal reports, you need a solution for the customer who lands on your website with a question at 10 PM. That's where a platform like Arsturn becomes incredibly valuable. It helps businesses create custom AI chatbots trained specifically on their own data. These bots can provide instant customer support, answer product questions, & engage with website visitors 24/7, ensuring you never miss a chance to connect with a customer.
The REAL Answer: It's Complicated (& It's All About the Router)
So, we have two completely opposite experiences. One group is calling it terrible & a downgrade, while another is praising its powerful new reasoning abilities. How can both be true?
The answer almost certainly lies in that model router we talked about earlier.
Your experience with "GPT-5" in Copilot is entirely dependent on the complexity of the prompt you give it.
If you ask a simple, open-ended question like "Write me a poem about a sad robot," the router likely identifies this as a low-complexity task. To save resources, it sends it to one of the faster, more efficient—& possibly "dumber"—models in the system. The result? You get a short, generic, uninspired poem & walk away thinking GPT-5 is garbage.
But if you give it a prompt like, "Read through these three attached quarterly reports, identify the primary drivers of revenue growth, & draft a five-slide PowerPoint presentation for the executive team summarizing your findings," the router recognizes this as a high-complexity task. It sends it to the powerful 'gpt-5-thinking' model. It takes a bit longer, but it comes back with a stunningly detailed & accurate analysis that saves you hours of work. You walk away thinking GPT-5 is the most incredible tool ever built.
This idea of using the right AI for the right task is crucial for businesses to understand. A general-purpose tool like Copilot is powerful, but not always the perfect fit for every specific business need. For businesses looking to optimize their website engagement & lead generation, a specialized tool is often better. That's the niche Arsturn fills. It's a no-code AI chatbot platform that lets businesses build chatbots trained on their own data. This means you're not getting a generic AI; you're getting a brand expert that can answer specific questions about your products, qualify leads, & book meetings, helping you boost conversions & provide a truly personalized customer experience.
So, How Do You Get the Good GPT-5 Experience?
If you've been feeling disappointed with Copilot lately, don't give up on it just yet. You might just be using it wrong. Based on everything we're seeing, here’s how to get the most out of the new system:
Be INCREDIBLY Specific: Don't be vague. Give it roles, formats, & constraints. Instead of "Summarize this," try "Act as a financial analyst & summarize this document into five key bullet points for a CEO who has 30 seconds to read them."
Give it Complex, Multi-Part Tasks: This seems to be the secret sauce. Ask it to do several things at once. "Read this document, extract the key statistics, present them in a markdown table, & then write a short email explaining their significance." This forces Copilot to engage its deeper reasoning models.
Frame it as a Professional Task: The model seems to be fine-tuned for business use cases. Framing your requests in a professional context, like preparing for a presentation or analyzing data, seems to yield much better results.
Upload Files & Data: The real power seems to be unlocked when you let it reason over specific documents or data sets. Use the file attachment feature & ask it to work with that context.
The Verdict
So, is GPT-5 in Copilot terrible? No. But the rollout & communication around it have been.
The negative feedback is totally understandable. It comes from a combination of sky-high expectations, the frustration of having familiar tools taken away, & a new system that feels like a step back if you don't know how to use it properly.
The truth is, GPT-5 represents a pivot. It's a move away from a single, all-powerful model towards a more efficient, specialized system of models. It's less of a flashy toy & more of a powerful, if sometimes finicky, work tool. The magic is still there, but you have to work a little harder to find it.
Hope this investigation was helpful! It's a pretty fascinating situation watching this technology evolve in real-time. Let me know what you think. Have you had a good or bad experience with the new Copilot?