Tip

Experimenting with AI tools

Not a ranking. Real opinions from real time spent with the big four — and how to actually figure out which one works for you.

I have tried all four of the major AI tools — ChatGPT, Gemini, Copilot, and Claude — and I have opinions about them. What I don't have is a ranked list, because that's not really the useful thing here. The useful thing is understanding what makes them different, and more importantly, how to figure out which one actually works for the way you work.

Before I get into any of that: I tried ChatGPT early on, before I understood enough about context to use it well. I went back later, after I understood more, and it was noticeably better. Which tells you something important before we even start — your first impression of any of these tools is probably wrong, because your first impression happened before you knew what you were doing. That's not a criticism. That's just how it goes.


What I actually found

Microsoft Copilot: Best in a Microsoft shop
At my last employer we were a Microsoft shop with Copilot integrated into our tools, and it was genuinely good at the things you'd expect: drafting PowerPoints, writing emails to executives, sharpening an executive summary, figuring out how to phrase something so your boss actually understood it. The integration with the Microsoft ecosystem is real and it matters. If that's your context, Copilot is doing exactly what it's supposed to do and it does it well.

One thing I found particularly useful when I was starting out: Copilot tends to suggest relevant next steps at the end of a response. When you don't yet know what the tool can do or what to ask for next, that's genuinely helpful. It's a small thing, but it makes getting started considerably easier.

Gemini: Surprisingly good with images
I've used Gemini in the browser and on my phone, and my strongest impression is that it handles images better than almost anything else in the category. Gemini Nano in particular punches above its weight on visual tasks. If you're working with images regularly (describing them, analyzing them, extracting information from them), it's worth knowing about. For general use it's fine, occasionally more than fine, but the image capability is where it genuinely stands out.

ChatGPT: Better than I first thought, with caveats
The context thing I mentioned above applies here more than anywhere. Early ChatGPT, before I understood what I was doing, felt like a lot. Later ChatGPT, with better inputs, was meaningfully better. The caveat that stuck with me is the sycophancy problem: it has a tendency to agree with you, validate you, and tell you your ideas are great. I had a whole conversation with Claude about why that happens and how to prompt around it. The illustration that landed for me: a friend asked ChatGPT to be brutally critical of something, and it responded by enthusiastically agreeing that being more critical was an excellent idea. Ask it to stop agreeing with you and it agrees with that, too. That's a real limitation if you're using it for anything where honest feedback matters.

Claude: Where I landed, and why
I use Claude as my primary tool, and the reason comes down to a few things: it pushes back when it disagrees, it can explain its own reasoning, and it builds context in a way that makes conversations feel like actual conversations. The longer I've used it, the more embedded that context has become — which is powerful and occasionally surprising. More on that in a minute.
💡 The pattern worth noticing
Each of these tools has a context where it genuinely wins. Copilot in a Microsoft environment. Gemini on images. The question isn't which one is best — it's whether you're using the right tool for what you're actually trying to do.

The thing I got wrong at first

I formed opinions about these tools too early, before I understood context. Context — what the AI knows about you, your situation, your goals, your working style — changes everything. A tool that feels flat and generic with no context can feel surprisingly sharp once it knows what you're working on and how you think.

The most vivid illustration of this happened while I was taking an AI prompting course. The assignments used generic, neutral prompts — designed for anyone. I ran them in my regular Claude chat, forgetting that Claude had months of context about me at that point. The responses were off. Not wrong exactly, just... not how it usually talked to me. It knew I wasn't being myself. That's when I discovered the incognito button exists for a reason. Context gets more embedded than you realize — which is the whole story, and it's getting its own piece.


How to actually figure out what works for you

Don't sign up for all four and run the same prompt in each one. That tells you almost nothing useful. Here's what actually tells you something:

1. Just ask it what it can help you with
Seriously. Tell it what you're working on and ask what it could do for you. Some tools are better at this than others: a good answer tells you a lot about how the tool thinks. A vague or generic answer tells you something too.

2. Give it context about yourself and see what changes
Tell it who you are, what you're working on, how you like to communicate. Run the same question before and after. The delta is the tool showing you what it can do with real information.

3. Ask it something you know really well
Use your own domain. If it gets something wrong that you would catch, that's useful signal. If it gets it right and adds something you hadn't thought of, that's also useful signal. Either way you're evaluating it on ground you can actually judge.

4. Ask it to push back on you
Give it an idea you're not totally sure about and ask for honest critique. See if it actually disagrees, or if it validates you and then adds a soft qualifier at the end. That tells you a lot about whether you can trust it for real feedback.

5. Give it a task with real constraints
Not "write me an email" but "write me an email to my director, she's skeptical of this approach, I need to address that without being defensive, keep it under 200 words." See if it respects all of it or quietly drops the hard parts.

6. Try it on the thing you actually need it for
Not a test prompt. Your real work. The gap between "this works on a demo" and "this works on my actual problem" is where most tools either earn their place or don't.

🤝 You don't have to pick one forever
I use Claude for most things. I'd reach for Copilot without hesitation if I were back in a Microsoft environment. I'd use Gemini if I were working with a lot of images. These tools aren't competitors for your loyalty — they're options. Know what each one is good at and use that.