Navigating the AI Ecosystem as a Senior Engineer
A practical approach to selecting AI models, IDEs, and CLI tools in the rapidly evolving agentic development landscape.
By Katia Wheeler

As a Senior Developer at Shopify, I’ve had access to a large chunk of the current AI ecosystem since Shopify went all in on AI back in April. With new models, IDEs, and CLI tools launching what feels like every other week, it can be overwhelming to figure out:
- Which model to use for what
- Which IDE actually improves your workflow
- When a CLI agent is better than an in-editor one
This post walks through how I work in this new agentic style and how I evaluate the tools I use day to day.
Agents
It’s currently Nov 17, 2025 as I’m writing this and the rate at which agents are changing is wild. Anthropic’s Opus 4.1 dropped in August and is already considered “legacy.” That’s how fast the agent landscape is moving.
New agent models ship almost bi-weekly, each claiming to out-reason, out-code, or out-perform the others. I’ve stopped trying to evaluate everything and instead doubled down on a small set of agents that I actually rely on.
These are the ones that stuck for me and why.

Sonnet 4.5
Sonnet 4.5 is my default coding workhorse.
- Use cases: bug fixes, quick iteration, and most planning
- Why I like it:
- Fast enough to stay in flow
- Solid reasoning and implementation quality
- Consistent across sessions and codebases
The Sonnet series has always struck a nice balance between execution and reasoning, and 4.5 continues that trend. I reach for it when I want something that “just works” across a wide range of tasks without a lot of prompt wrangling.
I used to default to Opus 4.1 for planning (see below), but since it’s now treated as “legacy,” I lean more on Sonnet 4.5 so I’m not building workflows on top of models that are actively being phased out.
Haiku
If Sonnet 4.5 is my reliable all‑rounder, Haiku is my speed demon. Haiku can handle many of the same categories of work (such as bug fixes, quick edits, short explanations, small refactors) but with an emphasis on latency and cost-efficiency over depth.
- Use cases: tight feedback loops while coding; small, well-scoped changes (e.g., “rewrite this function,” “add basic tests,” “clean up this file”); quick clarifications in context (“what does this function do?”, “why is this failing?”)
- Why I like it:
- Very fast responses
- “Good enough” reasoning for well-framed, local tasks
- Great for staying in flow when I don’t need full architectural thinking
Haiku can reason and plan, but where it really shines for me is execution on clearly bounded tasks, especially when I already know what I want and just need it done quickly.
How I Split Reasoning vs Execution
Sonnet 4.5 vs Haiku
Both Sonnet 4.5 and Haiku can technically handle reasoning and execution, but I get the best results by giving them distinct roles in my workflow:
- Reasoning / Planning → Sonnet 4.5
- Designing a feature end-to-end
- Exploring tradeoffs between approaches
- Breaking down a ticket into implementable steps
- Thinking through edge cases and failure modes
Example: “Given this existing service, design how we’d add multi-tenant support, including data model changes, API updates, and migration strategy.”
- Execution / Local Changes → Haiku
- Implementing a single step from the plan
- Editing or refactoring a file or two
- Adding tests for a specific scenario
- Translating a clear spec into code
Example: “Here’s the plan from Sonnet. Implement step 2 in billing_service.rb and add tests for the new error handling behavior.”
In practice, a typical flow might look like:
- Use Sonnet 4.5 to:
- Understand the problem
- Propose an approach
- Break the work into concrete steps
- Use Haiku to:
- Implement those steps quickly and iteratively
- Handle the repetitive or mechanical parts of the work
- Polish and refine code as you go
This split keeps me from overloading one model with every kind of task and lets each one operate where it’s strongest: Sonnet for thinking, Haiku for doing.
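To make the split concrete, here is a minimal Python sketch of the routing rule described above. It is only an illustration, not how any tool actually dispatches work: the task kinds, the `Task` type, the `pick_model` function, and the model labels are all hypothetical names chosen for this example, and the labels are not real API model identifiers.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration only; not real API model identifiers.
PLANNER = "sonnet-4.5"
EXECUTOR = "haiku"

# Hypothetical task kinds, mirroring the two bullet groups above.
PLANNING_KINDS = {"design", "tradeoffs", "breakdown", "edge_cases"}
EXECUTION_KINDS = {"implement_step", "refactor", "add_tests", "spec_to_code"}


@dataclass(frozen=True)
class Task:
    kind: str
    description: str


def pick_model(task: Task) -> str:
    """Route planning work to the planner and bounded edits to the executor."""
    if task.kind in PLANNING_KINDS:
        return PLANNER
    if task.kind in EXECUTION_KINDS:
        return EXECUTOR
    # Unknown kinds fall back to the planner, since Sonnet 4.5 is the default workhorse.
    return PLANNER


plan = Task("breakdown", "Break the multi-tenant ticket into steps")
step = Task("implement_step", "Implement step 2 in billing_service.rb")
print(pick_model(plan), pick_model(step))
```

In practice I make this choice by hand, but writing it down as a rule is a useful check on whether a task is really bounded enough for the faster model.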
Composer‑1
Composer‑1 came onto my radar with the release of Cursor 2.0.
- Use cases: greenfield work, exploration, rough drafts
- Why I like it:
- Very fast
- “Good enough” reasoning for early iterations
- Where it falls short:
- Not always accurate
- Can be brittle when working in large, established codebases
If I’m spiking a new idea, prototyping a feature, or scaffolding something from scratch, I’ll happily throw Composer‑1 at it. But when I’m working inside a mature, complex codebase (especially at Shopify scale), I still default to Sonnet 4.5 for its stability and consistency.
GPT‑5
GPT‑5 has essentially replaced Google for me.
- Use cases: research, learning, concept explanation, broad overviews
- Why I like it:
- Strong at explanation and education
- Great at pulling together context and summarizing
- Usually the most reliable in terms of factual accuracy when carefully prompted
It’s not always the fastest option, and I don’t lean on it as heavily for day-to-day coding compared to Sonnet. But for answering “why” questions, exploring unfamiliar domains, or synthesizing long-form content, GPT‑5 sits at the top of my stack.
Honorable Mention: Opus 4.1
Even with its “legacy” status, Opus 4.1 is still one of my favorite models for planning.
- Use cases:
- High-level system design
- Implementation plans
- Architectural blueprints
- Why I like it:
- Concise while still going deep where it matters
- Strong at exploring tradeoffs and edge cases
When I need a detailed, well-structured plan that balances pragmatism and thoroughness, Opus 4.1 still delivers. I use it a bit less now due to its legacy status, but it’s absolutely worth mentioning because of how much it shaped my current workflow.
IDEs
VS Code is no longer the only serious option in town. New AI-native or AI-augmented editors like Cursor, Zed, and Kiro have entered the picture, and I’ve ended up with a multi-IDE setup depending on what I’m working on.

Cursor
I avoided Cursor for a while. Before Cursor 2.0, it felt bloated, clunky, and absolutely brutal on my CPU.
Post‑2.0, it’s become my primary editor of choice.
- Why I like it:
- The agentic view is clean and well-integrated
- It genuinely supports a flow state while coding with an AI partner
- The review experience of seeing diffs, understanding changes, and integrating them is solid
- Performance has improved to the point where it no longer hammers my machine
If you bounced off Cursor before because it felt heavy or unstable, it’s worth giving 2.0+ another shot.
Zed
Zed has carved out a distinct niche for me:
- Open source
- Lightweight and extremely fast
- A growing extension ecosystem
- Highly configurable
I use Zed primarily for:
- Navigating large codebases
- Quickly jumping between files and symbols
- Situations where I want raw speed over heavy AI integration
It does have a built-in agent pane that you can configure for any agent, but I often still default to a CLI-based agent instead (see below). The in-editor agent sometimes feels a bit laggier to me, though that might just be my perception compared to the snappiness of the rest of Zed.
Overall, Zed is a very solid choice when you care about performance and minimal friction.
Kiro
Kiro, Amazon’s spec-driven IDE, is an interesting experiment in how to structure AI-assisted development.
- Pros:
- Spec-first flow can be helpful during planning
- Encourages explicit design before implementation
- Cons (for my workflow):
- Feels slow
- Writing a spec for every change becomes tedious fast
I’ll occasionally use Kiro during a planning-heavy phase, especially when I want stricter structure around what I’m building. But more often than not, I find myself reaching back for Opus 4.1 or Sonnet in a dedicated agent environment instead.
CLI Tools
Having an agent that’s not tied to an IDE but still fully integrated into your machine is an underrated superpower.
With CLI-based agents, I’ve been able to:
- Drive workflows in tools like Obsidian
- Generate daily summaries of my work
- Track “wins” and notable events for my brag doc
- Work across multiple repos or non-repo code and notes
These tools shine outside the narrow world of a single editor or git repository.
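As one illustration of the daily-summary idea, the sketch below assembles a prompt from a list of work entries that could then be handed to whichever CLI agent you use. It is an assumption-laden example rather than my exact setup: the `build_daily_summary_prompt` function, the prompt wording, and the sample entries are all hypothetical, and the step of actually passing the prompt to an agent is left out.

```python
from datetime import date


def build_daily_summary_prompt(entries: list[str], day: date) -> str:
    """Assemble a plain-text prompt asking an agent to summarize one day's work."""
    # Drop blank entries and surrounding whitespace so the prompt stays tidy.
    cleaned = [e.strip() for e in entries if e.strip()]
    if not cleaned:
        return f"No work entries were recorded for {day.isoformat()}."
    bullet_list = "\n".join(f"- {e}" for e in cleaned)
    return (
        f"Summarize my work for {day.isoformat()} in a short paragraph, "
        "then list any notable wins I should add to my brag doc.\n\n"
        f"Entries:\n{bullet_list}"
    )


# Hypothetical entries; in practice these might come from notes or commit subjects.
sample = ["Fixed flaky checkout test", "  ", "Reviewed PR for billing refactor"]
print(build_daily_summary_prompt(sample, date(2025, 11, 17)))
```

Keeping the prompt construction separate from the agent call means the same entries can be sent to a different CLI tool without changing anything else.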

Claude Code
You’re absolutely right if you think Claude Code is still the reigning champion of CLI-based agents in my setup.
- Strengths:
- Consistent reasoning
- Strong coding output, especially for refactors and multi-file changes
- Good at maintaining context over longer sessions
- When it goes off the rails:
- Like any model, it can hallucinate
- The upside is that you can usually:
- Redirect it with clearer instructions, or
- Clean up its “memory” / context and nudge it back on track
I use Claude Code for anything that feels like “AI terminal pair-programming”. From writing and editing code to generating summaries of my day and updating my personal knowledge base, I know Claude Code can handle it.
OpenCode
OpenCode fills a key gap: it gives you a Claude Code–style CLI experience but with the ability to plug in different models.
- Why it’s useful:
- Open source
- Model-agnostic: you can bring your own keys and swap in alternatives
- Great if:
- You don’t have an Anthropic subscription
- You want to experiment with multiple providers behind a single workflow
If you like the idea of a CLI agent but don’t want to be tied to one vendor, OpenCode is worth a look.
Codex
In my setup, Codex is the OpenAI-flavored alternative to Claude Code:
- What it does:
- Lets you interact with GPT models via the CLI
- Brings the power of GPT‑5 (and friends) to your terminal workflows
- Where I use it:
- Research tasks directly from the terminal
- Quick Q&A while working inside a repo
- Generating or tweaking scripts and one-off utilities
I tend to reach for Codex when I specifically want GPT’s style of reasoning or explanation but in a CLI-first workflow rather than a web UI or IDE integration.
Closing Thoughts
The AI tooling landscape is moving too fast for any one person to thoroughly evaluate every new model, IDE, or agent that drops. Instead of chasing everything, I’ve focused on a small, opinionated stack:
- Agents
- Sonnet 4.5 for day-to-day coding and planning
- Composer‑1 for fast greenfield iteration
- GPT‑5 for research and explanation
- Opus 4.1 (legacy, but still excellent) for deep planning
- IDEs
- Cursor 2.0+ as my primary AI-native editor
- Zed for performance and large codebase navigation
- Kiro occasionally, when I want a spec-first workflow
- CLI Tools
- Claude Code as my main terminal agent
- OpenCode for model-flexible CLI workflows
- Codex when I want GPT models at the command line
The “right” stack will vary by team, company, and personal preference. But if you’re feeling overwhelmed by the sheer volume of options, my recommendation is:
- Pick one primary agent, one IDE, and one CLI tool to really invest in and learn
- Use them deeply for a few weeks
- Only then start swapping components to see what meaningfully improves your flow
The goal isn’t to use all the tools. It’s to build a setup where the AI feels less like a novelty and more like a trusted teammate embedded in your everyday workflow.

Originally published on Medium.