AI Coding vs Human Development: A Shifting Landscape
AI-assisted coding has now been with us for several years. It sits squarely inside the AI hype cycle and has quickly become a focal point for both ambition and anxiety within commercial software organizations. Boards and investors ask, “What are we doing with AI?” Strategies are rewritten, budgets are reallocated, and the gravitational pull of AI stretches across every roadmap: Can we do more? Can we do more with less? What are our competitors doing? Once again the narrative emerges: “He who conquers AI conquers the world.”
So we march into the breach. We are well into the first act of this technological shift, and many feel they have read this script before. Some expect a familiar sorting: those who adapt will thrive in the “sunlit uplands” of AI mastery; those who do not will tumble into obsolescence. Whether or not today’s approaches lead to true AGI remains unknown, but the power of AI as a tool is now obvious to nearly everyone.
Even if the apparent intelligence of these systems is largely the product of vast-scale “autocomplete” across massive parameter spaces, the result is still remarkable: a distilled, compressed pyramid of humanity’s collective digital footprint — a structure we can now query in seconds, as if asking the entire world at once. And generative AI is not merely autocomplete. Its internal training dynamics allow it to synthesize new compositions, juxtapositions, and conceptual links that were implicit but not explicitly present in the training corpus. In this sense, these systems can surface genuinely novel patterns and relationships.
Is Generative AI Good at Coding?
In many ways, yes. It has ingested an extraordinary range of code samples, books, tutorials, idioms, and abstractions. If you ask it to produce code similar to patterns it has encountered before — and it has encountered most of them many times — it performs impressively. Because so much software consists of well-known structural “set pieces,” LLMs can often assemble these fragments into coherent, complete solutions.
But this ability is bounded by one critical constraint: working memory.
Today, this is almost entirely represented by the context window:
- the active conversation,
- retrieved artifacts in the session,
- and any supplementary agentic reasoning chains.
This is the real practical limit on reliable AI coding today. Managing context — its clarity, relevance, sufficiency, and stability — is now one of the most important determinants of high-quality results, second only to good prompt engineering (or, more accurately, conversation engineering).
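To make this concrete, here is a minimal sketch of what “managing context” can look like mechanically: assembling a prompt from prioritised pieces of working memory and evicting the least important, oldest material once a token budget is exceeded. The names, the four-characters-per-token heuristic, and the priority scheme are illustrative assumptions, not any particular vendor’s API.

```python
from dataclasses import dataclass

@dataclass
class ContextItem:
    label: str     # e.g. "system", "retrieved-doc", "turn"
    text: str
    priority: int  # lower = more important, evicted last

def estimate_tokens(text: str) -> int:
    # Crude heuristic (~4 characters per token); a real tokenizer would be exact.
    return max(1, len(text) // 4)

def build_context(items: list[ContextItem], budget_tokens: int) -> str:
    """Keep the most important (and most recent) items that fit the token budget."""
    # Rank by importance first, then recency (later list positions are newer turns).
    ranked = sorted(range(len(items)), key=lambda i: (items[i].priority, -i))
    kept, used = set(), 0
    for i in ranked:
        cost = estimate_tokens(items[i].text)
        if used + cost <= budget_tokens:
            kept.add(i)
            used += cost
    # Reassemble in original order so the conversation still reads chronologically.
    return "\n\n".join(f"[{items[i].label}]\n{items[i].text}" for i in sorted(kept))

# The system prompt and task spec are protected; stale chat turns are the first to go.
items = [
    ContextItem("system", "You are a careful refactoring assistant.", priority=0),
    ContextItem("spec", "Rename UserId to AccountId across the service layer.", priority=0),
    ContextItem("turn", "Earlier discussion about logging conventions...", priority=2),
    ContextItem("turn", "Latest failing test output...", priority=1),
]
print(build_context(items, budget_tokens=30))
```

With the tight budget in the usage example, the oldest low-priority chat turn is the one that gets dropped, which is exactly the trade-off a human (or an agent framework) is making constantly during a long session.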
From One-Shot Prompts to Conversational Reasoning
Before ChatGPT, users had only one-shot prompting: all necessary context had to be packed into one message. With conversational models, context can be built incrementally and refined dynamically, mirroring how humans normally commission work. This significantly lowered the cognitive overhead required from the user, who no longer needed to gather and integrate all information up-front.
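The difference is easiest to see in the shape of the payload. Below is an illustrative comparison using the common role/content chat-message convention; the domain details (a Go payments service) are invented for the example, and exact client calls vary by provider.

```python
# One-shot prompting: everything the model needs must be packed into a single message.
one_shot_prompt = (
    "You are reviewing a payments service written in Go.\n"
    "Context: we use gRPC, errors are wrapped with fmt.Errorf, retries must be idempotent.\n"
    "Task: write a retry wrapper for ChargeCard with exponential backoff.\n"
)

# Conversational prompting: context accumulates across turns, the way work is
# normally commissioned.
conversation = [
    {"role": "system", "content": "You are reviewing a payments service written in Go."},
    {"role": "user", "content": "We use gRPC and wrap errors with fmt.Errorf."},
    {"role": "assistant", "content": "Understood. How are retries expected to behave?"},
    {"role": "user", "content": "Retries must be idempotent. Now write a retry wrapper for ChargeCard."},
]
```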
Techniques like “think step-by-step” or explicit reasoning prompts eventually evolved into built-in reflective reasoning: one user request may now produce internal branching, chain-of-thought reasoning, and summarization before the visible reply appears. While this has made models far more capable, the fundamental mechanism of working memory has not advanced much: the context window remains the primary vehicle for specific task knowledge during inference.
This limitation has real consequences. Context information decays by position and relevance; compaction systems can distort details; and without persistent long-term memory, the model’s “awareness” is entirely session-bound. These issues — memory, attention, compaction, and forgetting — will feature prominently in future discussions here.
The Role of the Human Operator
Despite their strengths, LLMs still behave like extraordinarily book-smart junior developers:
- highly knowledgeable,
- capable of producing impressively structured code,
- yet prone to misplaced focus, omission, hallucination, and local inconsistencies.
They must be guided, redirected, reminded, and occasionally rescued. The human’s role, therefore, is not diminished: it becomes more senior, more vigilant, more architectural.
The LLM may supply domain knowledge, documentation, and boilerplate. But architecture, correctness, performance, maintainability, invariants, and system integration remain the responsibility of the human engineer. The human must also maintain clarity of intent — the LLM has no access to institutional context, product strategy, team norms, or tacit expectations unless explicitly placed into the session.
Early Lessons from Real-World Agentic Coding
Beyond the conceptual framing above, an increasingly consistent set of practical observations has emerged from early adopters of AI-assisted development. These represent widely experienced phenomena across languages, teams, and model providers.
1. AI Reduces Local Complexity but Can Increase Global Complexity
LLMs accelerate micro-level work (functions, tests, adapters), but they often increase system-wide entropy:
- Solutions are locally correct but globally misaligned.
- Architectural cohesion can erode unless actively maintained.
- Rapid generation encourages excessive surface area.
Velocity increases, but coherence requires more, not less, senior oversight.
2. LLMs Frequently “Derail” Mid-Session
A common pattern:
- Naming conventions drift.
- Earlier architectural decisions are forgotten.
- Details are contradicted or overwritten.
- Hidden assumptions re-emerge.
This is a natural outcome of attention distribution and context-window limits.
3. Models Are Much Better at Generating New Code Than Editing Existing Code
Editing requires:
- parsing existing structure,
- preserving invariants,
- avoiding duplication,
- modifying only targeted regions.
This is extremely hard for LLMs. Frequent failure modes (a lightweight guard against some of them is sketched after this list) include:
- inconsistent refactors,
- duplicated logic,
- incomplete updates,
- references to nonexistent symbols.
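These failure modes are manageable, but they need mechanical backstops. Below is a small, illustrative guard a human might run after an AI-driven rename: it flags leftover references to the old symbol and top-level functions that now exist in more than one file. The file layout and symbol names are hypothetical, and a real project would lean on its compiler, linter, and test suite first.

```python
import re
import sys
from pathlib import Path

def check_rename(root: str, old_symbol: str) -> int:
    """Flag leftover references to a renamed symbol and duplicated top-level defs."""
    problems = 0
    seen_defs: dict[str, Path] = {}
    for path in Path(root).rglob("*.py"):
        text = path.read_text(encoding="utf-8")
        # Leftover references to the old name suggest an incomplete update.
        if re.search(rf"\b{re.escape(old_symbol)}\b", text):
            print(f"{path}: still references {old_symbol}")
            problems += 1
        # The same top-level function defined in two files suggests duplicated logic.
        for match in re.finditer(r"^def (\w+)\(", text, flags=re.MULTILINE):
            name = match.group(1)
            if name in seen_defs and seen_defs[name] != path:
                print(f"{path}: duplicate definition of {name} (also in {seen_defs[name]})")
                problems += 1
            seen_defs.setdefault(name, path)
    return problems

if __name__ == "__main__":
    # e.g. python check_rename.py src/ UserId   (the old symbol should no longer appear)
    sys.exit(1 if check_rename(sys.argv[1], sys.argv[2]) else 0)
```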
4. Generated Code Often Looks Correct but Fails Semantically
LLMs excel at the shape and structure of software, but often misunderstand:
- business rules,
- invariant boundaries,
- error-handling expectations,
- non-functional constraints.
The result: code that appears elegant but is semantically brittle.
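A hypothetical example makes the point. Both functions below are clean in shape; only one respects the unstated business rule that total refunds can never exceed the amount captured.

```python
from dataclasses import dataclass

@dataclass
class Order:
    captured: float          # amount actually charged
    refunded_so_far: float   # sum of earlier partial refunds

# Looks elegant: typed, guarded, readable.
def refund(order: Order, amount: float) -> float:
    if amount <= 0:
        raise ValueError("refund must be positive")
    # BUG: ignores refunded_so_far, so repeated refunds can exceed the captured amount.
    return min(amount, order.captured)

# The rule the model never saw: total refunds can never exceed the captured amount.
def refund_correct(order: Order, amount: float) -> float:
    if amount <= 0:
        raise ValueError("refund must be positive")
    remaining = order.captured - order.refunded_so_far
    return min(amount, remaining)
```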
5. LLMs Lack an Innate Sense of Decomposition or Maintainability
Without explicit guidance, models tend to:
- create large monolithic functions,
- inline logic excessively,
- generate wide, leaky interfaces.
They optimize for local completeness, not long-term cognitive load.
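A caricatured but recognisable contrast, with made-up names: the first function is what unguided generation often gravitates towards, the second is the shape a human usually has to ask for explicitly.

```python
# A caricature of typical unguided output: one function that parses, validates,
# persists, and notifies, with a wide, leaky parameter list.
def import_users_all_in_one(csv_text, db, send_email, admin_address, strict, logger):
    for line in csv_text.splitlines():
        name, email = line.split(",")
        if strict and "@" not in email:
            logger.warning("bad email %s", email)
            continue
        db.execute("INSERT INTO users VALUES (?, ?)", (name.strip(), email.strip()))
        send_email(admin_address, f"imported {name}")

# What usually has to be requested explicitly: small units with narrow interfaces,
# each with one reason to change.
def parse_users(csv_text):
    return [tuple(part.strip() for part in line.split(",")) for line in csv_text.splitlines()]

def is_valid(user):
    _, email = user
    return "@" in email

def store_users(users, db):
    for name, email in users:
        db.execute("INSERT INTO users VALUES (?, ?)", (name, email))

def import_users(csv_text, db):
    store_users([u for u in parse_users(csv_text) if is_valid(u)], db)
```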
6. Tests Generated by LLMs Mirror the Implementation, Not the Requirements
Generated tests are usually:
- overfitted to the implementation,
- lacking adversarial cases,
- weak on integration boundaries.
Human review is essential for reliable test suites.
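Reusing the hypothetical refund example from lesson 4, the contrast looks something like this: the first test locks in whatever the implementation currently does (bug included), while the others encode the requirement itself and probe an adversarial edge.

```python
import pytest

# (Reuses the Order / refund / refund_correct definitions from the sketch in lesson 4.)

# Overfitted: mirrors the implementation's current behaviour, bug included.
def test_refund_mirrors_implementation():
    order = Order(captured=100.0, refunded_so_far=80.0)
    assert refund(order, 50.0) == 50.0   # "passes" only because the bug is baked in

# Requirements-driven: encodes the invariant itself, however refund is written.
def test_total_refunds_never_exceed_capture():
    order = Order(captured=100.0, refunded_so_far=80.0)
    assert refund_correct(order, 50.0) <= order.captured - order.refunded_so_far

# Adversarial edge case that generated suites often omit.
def test_refund_rejects_nonpositive_amounts():
    with pytest.raises(ValueError):
        refund_correct(Order(captured=100.0, refunded_so_far=0.0), 0)
```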
7. Documentation Generation Is Excellent — but Only If the System Is Coherent
LLMs write clean, polished documentation. But it:
- faithfully describes inconsistencies,
- can misrepresent intent,
- may solidify accidental design drift.
Documentation becomes a mirror rather than a specification.
8. A Clear Division of Labour Is Emerging: Humans Own the Core; AI Works at the Edges
Across teams, a stable pattern emerges:
Humans handle
- architecture,
- invariants,
- system integration,
- semantics,
- correctness.
AI handles
- adapters,
- glue code,
- scaffolding,
- boilerplate,
- tests,
- migration drafts.
LLMs thrive when tasks are crisp, local, and low in semantic ambiguity.
9. Coordination Complexity Shifts From Code to Conversation
A profound cultural shift is underway:
- Session transcripts become genuine design artifacts.
- Reproducibility requires saving prompts and plan statements.
- “Conversation engineering” becomes a real, teachable skill.
In agentic coding, the conversation becomes part of the software process.
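What that can look like in practice is deliberately mundane: the plan and the prompts get stored next to the change they produced. The structure below is only a sketch; the field names, ticket ID, and file location are invented for illustration.

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
from pathlib import Path

@dataclass
class SessionArtifact:
    """A conversation record checked in next to the change it produced."""
    ticket: str
    model: str
    plan_statement: str   # the agreed plan, in plain language
    prompts: list[str]    # the prompts actually sent, in order
    commit: str           # the commit the session produced

artifact = SessionArtifact(
    ticket="PAY-1423",
    model="example-model-2025-06",
    plan_statement="Extract retry policy into payments/retry.py; no behaviour change.",
    prompts=[
        "Summarise the current retry logic in the payments service.",
        "Now extract it into payments/retry.py and update the call sites.",
    ],
    commit="abc1234",
)

# Stored alongside the code, the transcript becomes a reviewable design artifact.
out_dir = Path("docs/sessions")
out_dir.mkdir(parents=True, exist_ok=True)
record = {**asdict(artifact), "created_at": datetime.now(timezone.utc).isoformat()}
(out_dir / f"{artifact.ticket}.json").write_text(json.dumps(record, indent=2), encoding="utf-8")
```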
Where We Go Next
Future posts will explore how specification, standards, architecture, validation, testing, and documentation evolve in LLM-assisted development. Many practices remain unchanged; others require significant reframing. The goal is not to replace human engineering, but to understand its evolving shape — and to wield AI as a powerful extension of human capability.