[Image: Disillusioned birthday magician stands ankle-deep in computer printouts as paper spills from a top hat, wand in hand, in a Moonrise Kingdom-style scene.]

Prompting’s Midlife Crisis: Turning Incantations into Infrastructure

For a while, “prompting” drifted into an odd cultural niche: part folklore, part copywriting, part performance. The implied premise was that a capable model is like an oracle, and the practitioner’s job is to craft the incantation that unlocks its hidden power. Whether or not you like that framing (I certainly don’t), it’s hard to ignore how quickly an ecosystem has grown around it: prompt marketplaces, “prompt engineers” selling bundles of clever phrasing, and a rhetoric of artistry that flatters the seller and mystifies the buyer.

Anthropic’s “skills” framing is useful precisely because it treats this mystification as a smell. A skill, in their guide, is not a sacred text. It is a packaged instruction set—literally a folder with a required SKILL.md (and optional scripts/, references/, assets/). The message is understated but decisive: stop fetishizing the prompt and start treating behavior as an artifact you can structure, test, version, and ship.
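To make that concrete, the shape of a skill on disk looks roughly like this (everything beyond SKILL.md is optional, and the file names inside the optional folders are invented for illustration):

```
my-skill/
├── SKILL.md          # required: YAML frontmatter + instructions
├── scripts/          # optional: executable helpers
│   └── validate.py
├── references/       # optional: documentation loaded on demand
│   └── style-guide.md
└── assets/           # optional: templates and other static files
```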

That change matters because it relocates the “hard part.” In classic prompting culture, the hard part is phrasing: finding the right tone, the right constraints, the right sequence of instructions. In the skills model, the hard part is product thinking: scoping repeatable workflows, embedding domain rules, and defining success criteria that allow you to tell whether you improved anything. Anthropic explicitly tells you to begin with a small number of concrete use cases and asks you to define what “working” means before you obsess over coverage. That is engineering hygiene, not verbal wizardry.

The most revealing part is how skills decide when to activate. In this design, the YAML frontmatter isn’t decoration; it is the gating mechanism. Anthropic calls it “the most important part” and states plainly that it is how Claude decides whether to load your skill: the description must say both what the skill does and when to use it, including trigger phrases users might actually say. This is a direct antidote to the “one prompt fits all” fantasy. You can have beautiful instructions, but if the skill triggers at the wrong time (or fails to trigger when it should), you don’t have a product; you have a liability.
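A sketch of what that gating looks like in practice (the skill name and trigger phrases below are invented; the name and description fields are the ones the guide describes):

```yaml
---
name: quarterly-report-formatter
description: >
  Formats financial data into the company's quarterly report template.
  Use when the user asks to "prepare the quarterly report", asks to
  "format these numbers for the board", or otherwise mentions quarterly
  reporting.
---
```

Everything Claude uses to decide whether to load the skill lives in that description, which is why it has to carry both the “what” and the “when.”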

That emphasis also clarifies why the “prompt bundle” business is brittle. A prompt that looks clever in a screenshot is rarely robust in the wild, because the wild is full of paraphrases, partial information, and near-miss intents. Anthropic’s recommended testing approach explicitly includes “triggering tests” that must fire on obvious tasks and paraphrased requests, and must not fire on unrelated topics. In other words: the main failure mode is not that the prose wasn’t artful; it is that the activation logic was underspecified. Anyone selling prompts as if phrasing is the whole story is selling the least durable part.
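A minimal sketch of such a triggering suite, with skill_triggered as a hypothetical stand-in for however you invoke the model and inspect which skills loaded:

```python
# Hypothetical triggering tests: fire on direct and paraphrased requests,
# stay silent on unrelated topics. All prompts are invented for illustration.
SHOULD_TRIGGER = [
    "Prepare the quarterly report from this spreadsheet",   # obvious task
    "Can you get these numbers ready for the board deck?",  # paraphrase
]
SHOULD_NOT_TRIGGER = [
    "Summarize this news article for me",                   # unrelated topic
    "Help me name my cat",
]

def skill_triggered(prompt: str) -> bool:
    """Stub: send the prompt to the model, report whether the skill loaded."""
    raise NotImplementedError  # wire this up to your own harness

for p in SHOULD_TRIGGER:
    assert skill_triggered(p), f"missed trigger: {p!r}"
for p in SHOULD_NOT_TRIGGER:
    assert not skill_triggered(p), f"false trigger: {p!r}"
```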

Skills also reframe how context should be handled. The guide’s “progressive disclosure” design is a concrete architectural idea: the frontmatter is always loaded in the system prompt, the body is loaded only when relevant, and linked files are discovered only as needed. This is not romantic, but it is practical: it controls token spend, reduces instruction interference, and makes specialization composable. The guide even points out that Claude can load multiple skills simultaneously and that skills should not assume they are the only capability present. That is a world away from the “giant mega-prompt” approach, which tends to collapse under its own weight.
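In file terms, the three tiers might look like this (file names invented; the comments mark what loads when):

```markdown
---
# Frontmatter: always loaded into the system prompt
name: quarterly-report-formatter
description: ...what it does and when to use it...
---

<!-- Body: loaded only when the skill triggers -->
# Quarterly report formatter

Follow the template steps in order. For edge cases, consult:

- [Formatting rules](references/formatting-rules.md)
- [Prior-year examples](references/examples.md)

<!-- Linked files above: discovered and read only as needed -->
```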

Once you treat “prompting” as an artifact lifecycle, you naturally adopt evaluation and iteration as first-class work. The guide encourages you to run the same request multiple times and compare outputs for consistency, and to test whether a new user can accomplish the task on the first try with minimal guidance. It also recommends iterating on a single challenging task until the model succeeds, then extracting the winning approach into a skill, before expanding test coverage. This is quietly devastating to the “prompt artisan” narrative, because it makes the workflow empirical. If results vary, you don’t praise the prompt’s creativity; you change the system until it stops varying.
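A consistency check of that kind is a few lines of harness code; run_task below is a hypothetical stub for executing the skill-backed workflow once:

```python
from collections import Counter

def run_task(prompt: str) -> str:
    """Stub: run the skill-backed workflow once and return its output."""
    raise NotImplementedError  # replace with a real call to your setup

# Run the same request several times and compare outputs, as the guide
# suggests; for rule-bound tasks, the runs should agree.
N = 5
outputs = [run_task("Prepare the quarterly report from data.csv") for _ in range(N)]
count = Counter(outputs).most_common(1)[0][1]
print(f"{count}/{N} runs agreed on the modal output")
if count < N:
    print("Outputs varied: tighten instructions or move the rule into code.")
```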

An interesting detail is the guide’s blunt admission that some things are better handled with code than with language. In troubleshooting, it explicitly suggests bundling a script for critical validations because “code is deterministic; language interpretation isn’t.” That line is, effectively, a manifesto against prompt mysticism. It says: if correctness matters, don’t bet the business on interpretive reading of natural language instructions. Put guardrails into executable checks. Prompting, in that world, is the glue and the interface, not the enforcement mechanism.
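A toy version of such a script, assuming a skill whose output report must contain certain sections (the section names are invented):

```python
#!/usr/bin/env python3
"""Deterministic validation a skill could bundle under scripts/ (illustrative)."""
import sys

REQUIRED_SECTIONS = ["Revenue", "Expenses", "Outlook"]  # assumed domain rule

def missing_sections(path: str) -> list[str]:
    text = open(path, encoding="utf-8").read()
    return [s for s in REQUIRED_SECTIONS if s not in text]

if __name__ == "__main__":
    missing = missing_sections(sys.argv[1])
    if missing:
        print("FAIL: missing sections:", ", ".join(missing))
        sys.exit(1)  # the check either passes or it doesn't; no interpretation
    print("OK")
```

The point is not the specific check; it is that a pass/fail exit code leaves nothing to interpretive reading.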

Distribution is the final step that turns this from a personal trick into an organizational discipline. Anthropic describes a “current distribution model (January 2026)” for individuals (download, zip, upload in Claude settings, or place in the Claude Code skills directory), and it notes organization-level deployment with workspace-wide distribution, automatic updates, and centralized management. It also positions skills as an open standard and argues they should be portable across tools and platforms, analogous to MCP’s portability goals. Once you can ship “behavior” with update channels and governance, the “prompt as a one-off text product” starts to look like selling someone a single bash command and calling it DevOps.
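For the individual path, the mechanics reduce to a zip or a copy. A sketch in Python, with the caveat that ~/.claude/skills/ as the Claude Code skills directory is an assumption you should verify against your own installation:

```python
import shutil
from pathlib import Path

skill_dir = Path("my-skill")  # contains SKILL.md, scripts/, references/

# Option 1: zip the folder for upload through Claude settings.
shutil.make_archive("my-skill", "zip",
                    root_dir=skill_dir.parent, base_dir=skill_dir.name)

# Option 2: copy into the Claude Code skills directory
# (path assumed; check your installation's docs).
target = Path.home() / ".claude" / "skills" / skill_dir.name
shutil.copytree(skill_dir, target, dirs_exist_ok=True)
```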

So what does this mean for the future of prompting as a discipline? It does not vanish, but it gets demoted. The valuable work becomes: capturing workflows and best practices so the model applies them consistently; specifying triggers; designing instruction structure; planning error handling; building test suites; and deciding which constraints belong in language versus code. That is closer to “skill engineering” than “prompt artistry,” and it is harder to fake. It is also harder to package into a generic prompt pack, because the value comes from alignment with a real environment: tools, processes, file types, and definitions of success.

Your skepticism about elevating prompting into an art form is, in that sense, aligned with where the craft seems to be heading. The skills model treats language not as magic, but as configuration—structured, scoped, and validated. It replaces charisma with reproducibility. And it implicitly asks a question that prompt sellers rarely want to answer: not “is this prompt clever?” but “does it trigger when it should, avoid triggering when it shouldn’t, and keep working after the next ten weird user paraphrases?” If that becomes the mainstream standard, “prompting” survives—but mostly as one component in a larger engineering discipline that has very little patience for art.