AI is creating a new challenge for development teams. Just a few years ago, every developer used the same workflow: write code in an IDE and open a pull request. Now, code comes from everywhere. A junior developer might use GitHub Copilot to write boilerplate, a senior engineer might use a specialized AI agent to refactor a complex service, and your CI pipeline might even have an agent that automatically generates tests.

Each of these AI tools uses different models and operates through a different interface—some are in the IDE, some are in the terminal, and others are standalone apps. This fragmentation isn’t a temporary problem; it’s the future. As models become more specialized, developers will use a mix of AI tools best suited for each specific task.

The result is a chaotic and inconsistent developer experience at the point of code creation. But while generation will remain fragmented, the point of code review offers a powerful opportunity for consistency. This is where teams can bring order to the chaos, ensuring that all code—whether written by a human or an AI tool—meets the same high standards.

The developer toolchain has always been fragmented. Developers mix and match IDEs, terminals, and CLIs to create their preferred environment. But the rise of AI-native development tools is introducing a new, more complex layer of fragmentation. This happens in two key areas: the model layer and the interface layer.

There is no single "best" AI model for code generation. Instead, we're seeing a rapid expansion of models that are highly specialized for different tasks.

A developer’s toolkit today includes a mix of powerful options:

  • High-reasoning models: Anthropic's Claude series excels at complex, multi-step tasks like debugging and refactoring, consistently performing well on difficult coding benchmarks.

  • General-purpose workhorses: OpenAI’s GPT models are versatile tools for a wide range of tasks, from generating boilerplate to providing architectural suggestions.

  • Large-context specialists: Google’s Gemini series is known for its massive context window, making it ideal for working across large repositories where understanding the entire codebase is critical.

  • Open-source powerhouses: Models like Meta’s Llama, Mistral’s Codestral, and DeepSeek-Coder provide strong performance with the flexibility for teams to fine-tune and self-host them for specific needs or compliance requirements.

Developers are not exclusively loyal to one model. They switch between them based on the task, using the best tool for the job. This is a positive development for productivity, but it guarantees that the underlying logic and "style" of generated code will come from many different sources.

The fragmentation continues at the interface layer. Developers now interact with AI across a growing number of surfaces:

  • IDE extensions: This remains the most common workflow, with GitHub Copilot leading the way. However, developers now augment their IDEs with multiple assistants, often using different tools for different languages or tasks.

  • AI-powered terminals: Tools like Warp and Copilot CLI integrate AI directly into the command-line experience, generating scripts and commands from natural language prompts.

  • Standalone "agentic" editors: Applications like Cursor are AI-first code editors that fundamentally rethink the IDE around AI, capable of handling more complex, end-to-end tasks.

  • CI/CD integrations: AI agents are now being run directly in pipelines to generate tests, suggest optimizations, or even perform automated refactors based on performance data.

This fragmentation is a natural evolution. Forcing all AI interactions into a single chat window in the IDE is a limiting approach that ignores the diversity of developer workflows.

If developers are using a dozen different AI tools to generate code, how do you maintain quality and consistency? You can’t enforce standards at the point of creation because you don’t control the tools.

Instead, the single point of leverage is the code review process. This is the central idea behind Graphite's platform: to provide a consistent, powerful, and unified experience where all code—human or AI-generated—is integrated, tested, and reviewed.

This is where stacking becomes essential. AI-generated changes, especially those from agent-based tools, are often large and complex. Forcing them into a single monolithic PR makes them nearly impossible to review safely. By breaking an AI-generated feature into a stack of small, dependent PRs, you make the code understandable and manageable for human reviewers. Each PR in the stack represents a logical unit of work that can be individually tested and approved, a workflow that is core to the Graphite platform.
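As a rough sketch of what that looks like in practice, a large AI-generated change can be split into a chain of branches, each opened as its own small PR against the one beneath it. The branch names, commit messages, and paths below are hypothetical, and tools like Graphite's CLI automate this bookkeeping, but the underlying shape is plain git:

    # Assumes the AI-generated edits are already in the working tree.
    # Branch names and paths are illustrative only.
    git checkout main

    # Part 1: the schema change the rest of the feature depends on.
    git checkout -b usage-events/schema
    git add db/schema.sql
    git commit -m "Add usage-events schema"

    # Part 2: service logic, branched off part 1 so it can be reviewed on its own.
    git checkout -b usage-events/service
    git add services/usage/
    git commit -m "Refactor usage service to emit usage events"

    # Part 3: generated tests, branched off part 2.
    git checkout -b usage-events/tests
    git add tests/usage/
    git commit -m "Add generated tests for the usage service"

    # Open one PR per branch, each targeting its parent:
    #   usage-events/schema  -> main
    #   usage-events/service -> usage-events/schema
    #   usage-events/tests   -> usage-events/service

Each PR in the chain gets its own review and its own CI run, so no reviewer has to hold the entire feature in their head at once.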

A modern review platform brings order to the fragmented world of AI code generation by:

  • Centralizing all changes in a single, consistent interface, regardless of their origin.

  • Enabling workflows like stacking that are built to handle the complexity and scale of AI-generated code.

  • Integrating with CI/CD to ensure every line of code passes the same automated quality gates.

AI code generation will not consolidate around a single model or interface. The future is a diverse ecosystem of specialized tools, and engineering teams that embrace this reality will be more productive.

The key is not to fight the fragmentation, but to build a development process that contains it. By focusing on a consistent, high-quality code review experience, you can ensure that your team ships reliable, maintainable code, no matter how it was written. The pull request, supported by a platform built for modern review, becomes the unifying force in a world of fragmented creation. This is the role Graphite is built for.

See how Graphite's platform unifies the review experience.
