AI Skills, Metaskills, and the Groundhog Day Principle
Stop Installing Skill Packs. Start Building Skill Systems.
You’ve probably seen the blog posts, viral social media threads, and polished screenshots about AI skill packages. The IDEAL marketing package. The BEST management package. The ideal engineering package that only the STUPID wouldn’t use RIGHT NOW. Forty skills to optimize and automate your entire company before lunch!
Overhyped skill packages are cool… references
These packages are useful in the same way generic templates are useful. They show what is possible and give people something concrete to react to. They help someone move from step 1, “I should probably use AI better,” to step 2, “my workflow could be made repeatable with something that resembles this.”
But as models become more powerful, generic skill packages become less important. The better the model, the more likely it is to outperform a generic set of instructions designed for a generic person at a generic company doing generic work. What matters more is a skill package customized for the context of what your company is actually trying to do.
We use the word “context” here to mean two things at once:
First, context means the specific circumstances around the task: your company, your industry, your competitors, your technical systems, your team’s operating rhythm, your customer expectations, and your [insert all the relevant information here!].
Second, context means the objectives you are trying to achieve inside those circumstances. The same instruction can be brilliant in one environment, useless in another, and actively counterproductive in a third. Anyone who has used a popular social media marketing skill and watched it disregard every guardrail can attest to the danger of a mismatch between objectives and AI instructions.
Because of missing context, meaning both circumstances and objectives, these skill packages can become counterproductive. They give the illusion of leveraging best practices while hard-capping your progress. While you were busy finding the best product review skill, someone else was iterating on a set of skills and workflows built around their actual team and actual customers. Within a few months, they blew right past you as if your team were stationary.
This is why I discourage people from treating skill packages as the answer, except as references, and instead encourage using metaskills. By metaskills, I mean skills that walk through the process of creating a customized skill for an organization, team, product, or code repository.
The first question, then, is not “which skill pack should we install or at least begin referencing?” That question can be useful, but it arrives too early in the sequence. The better question is: how do we create a suite of skills across the company that is customized to the context and objectives we are actually trying to achieve?
To answer that, it helps to start with the central frustration of working with AI systems.
AI Groundhog Day
AI skills are a form of context engineering: delivering the right context in the right form at the right moment so that your AI can achieve its objective.
Think of context engineering as your own version of the movie Groundhog Day. You’re Bill Murray, stuck reliving a 24-hour loop, repeating the same day, or prompt, over and over. The townspeople, like generative AI, have no memory of the looping events. Only you remember. You test new behaviors, words, and attitudes again and again until you find the winning combination. You finally nail the perfect piano solo, catch a falling child, and perform the Heimlich maneuver.
For some activities, you charm your way through. For other tasks, you discover limits that simply can’t be overcome. No matter how many times Bill Murray tries, he can’t save everyone. The same goes for AI. Some problems are outside the AI’s capabilities, and that is not your fault.
Despite best efforts, Punxsutawney Phil never learned to drive a truck
For anyone who finds Groundhog Day too ancient a reference, picture yourself as the director of a forgetful robot theater troupe whose memory resets nightly. Your task is to craft a script so clear and precise that the performance delights audiences every day. The robots retain nothing from previous days, putting full faith in your script notes, or context engineering, to produce a consistently brilliant performance. Ultimately, you are constrained by your robots’ abilities, but your script determines whether those abilities are used well or squandered spectacularly.
Context engineering follows the same pattern of iterative refinement as Groundhog Day or the robot theater troupe. The frustration of iterative refinement stems from resetting after each attempt. The magic of iterative refinement lies in infinite chances (within budget!) to get it right.
Most companies want the magic without respecting the amnesia reset.
They want to hire the best expert, contract the right consultant, find a skill to install, watch it work once in a clean example, announce that everyone at the company should use it, and declare the problem solved. Magic AI dollars rain down like manna from heaven and everyone celebrates themselves as AI geniuses, which is always a healthy cultural signal.
This is the familiar AI theater we’ve all seen. The demo looks impressive because the example was selected to make the skill look impressive. Then the skill meets the complexity of a live organization with many different purposes, and it fails in boring, practical ways.
Principles for High-Quality “Flywheel” Skills
I see three recurring principles among teams that are succeeding, driving steady progress, and accelerating the flywheel.
The first principle is to stay focused on the essential.
The failure mode usually looks like one of these patterns:
Create many, many skills with no realistic maintenance path and expect everyone to adopt one person’s tried-and-true method. One person has forty different skills that work half decently, and then wants everyone in the organization to adopt them.
Create wide-scope individual skills that do a million things, but do them poorly.
Let each individual create their own skills with varying degrees of success. The flywheel then restarts for every employee and occurs entirely in isolation.
The success mode is to focus on AI skills with high-leverage points of value, scope them narrowly, and get them right. Only expand to more skills once the most critical steps or functions are running smoothly and scaling across the organization.
The second principle is iteration.
Iteration is critical for two reasons:
First, humans are quite bad at describing exactly what needs to be achieved on the first go. We omit assumptions because they are obvious to us and use vague descriptions because our teammates already know what they mean. You might say “make this production-ready,” but what you really mean is “make this SOC 2 compliant,” so the AI fails.
Second, AI does not think exactly like humans do. A skill can look perfect to a specialist reading it as a document and still fail when the model tries to execute it. Instructions, triggers, examples, and sequencing must be optimized for the AI, and sometimes for each specific model.
Essentially, you are not done when the skill looks good. You are done, or at least closer to done, when the skill has been tried, observed, corrected, and tried again.
Again, the Groundhog Day comparison is apt. Each loop through the day teaches you something because an inefficiency or failure is discovered, especially when the experience expands to a wider audience.
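One way to make the “tried, observed, corrected, and tried again” loop concrete is to keep every observed failure as a replayable case and rerun all of them against each new revision of the skill. The sketch below is only an illustration under assumptions: `SkillCase`, `RevisionResult`, and `run_revision` are hypothetical names, and the two `skill_v*` functions are stand-ins for whatever actually calls your model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SkillCase:
    """One observed scenario the skill must handle (hypothetical structure)."""
    name: str
    prompt: str
    check: Callable[[str], bool]  # True when the output is acceptable


@dataclass
class RevisionResult:
    """Which recorded cases failed for one revision of the skill."""
    revision: int
    failures: List[str] = field(default_factory=list)


def run_revision(revision, run_skill, cases):
    """Run every recorded case against one skill revision and collect failures."""
    result = RevisionResult(revision=revision)
    for case in cases:
        if not case.check(run_skill(case.prompt)):
            result.failures.append(case.name)
    return result


# Stand-ins for two revisions of a skill; a real run_skill would call your model.
def skill_v1(prompt: str) -> str:
    return "Looks production-ready."


def skill_v2(prompt: str) -> str:
    return "Reviewed against SOC 2 controls: audit logging is enabled."


# The case encodes the unstated assumption from the example above:
# "production-ready" really meant "SOC 2 compliant".
cases = [
    SkillCase(
        name="names_the_compliance_target",
        prompt="Make this service production-ready.",
        check=lambda output: "SOC 2" in output,
    ),
]

for revision, skill in enumerate([skill_v1, skill_v2], start=1):
    result = run_revision(revision, skill, cases)
    print(revision, result.failures)
```

The useful part is the growing list of cases rather than the harness itself: each loop through the day adds one, so a later revision cannot quietly reintroduce an earlier failure.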
The third principle is accountability.
I have now seen many companies rolling out skills that are poorly maintained, too generic, or disconnected from the people who actually understand the work. What I have seen work, over and over, is when specific specialists are tasked with making a skill highly effective and generalizable.
The ideal ratio is one specialist supporting one to three skills. The initial skill is developed and iterated upon, and users report their feedback, or even better, their AI-enhanced feedback, to the owner. That owner can balance a single failure point against the risk of overfitting to one scenario at the expense of another.
You want people who understand the work deeply and also have customers consuming the output. That combination creates pressure in the right direction. The owner has enough expertise to make good changes, enough usage to see the real failures, and enough accountability to care when the skill wastes everyone’s time.
“Dogfooding,” or using your own AI skills in this case, is a key element to success. I would love to say that I should be in charge of a large suite of skills and that it would perform really, really well. The problem is that I do not have the time to use every single one of those skills. In fact, probably nobody has the time to use a large set of skills and continually iterate on all of them until they become a world-class experience for the team.
Conditions also change. Products, competitors, APIs, model behaviors, company strategy, and operating rhythms all change. If a person owns one skill, or maybe two, and they are close enough to use it frequently, they can update it when the environment shifts.
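The ownership rule is simple enough to check mechanically. Here is a minimal sketch, assuming you keep a plain mapping from skill name to owner somewhere in your repo. The function name, the registry shape, and the people in it are all invented for illustration; the limit of three comes from the ratio described above.

```python
from collections import defaultdict
from typing import Dict, List, Optional

MAX_SKILLS_PER_OWNER = 3  # upper end of the one-to-three ratio described above


def ownership_problems(skill_owners: Dict[str, Optional[str]]) -> List[str]:
    """Flag skills with no owner and owners carrying too many skills."""
    problems = []
    load = defaultdict(list)
    for skill, owner in skill_owners.items():
        if owner is None:
            problems.append(f"{skill}: no owner")
        else:
            load[owner].append(skill)
    for owner, skills in load.items():
        if len(skills) > MAX_SKILLS_PER_OWNER:
            problems.append(
                f"{owner}: owns {len(skills)} skills (limit {MAX_SKILLS_PER_OWNER})"
            )
    return problems


# Hypothetical registry; names are placeholders.
registry = {
    "ticket-architect": "dana",
    "ai-reinforcement": "lee",
    "work-legibility": None,
}
print(ownership_problems(registry))
```

A check like this will not make anyone care about a skill, but it does surface the orphaned and overloaded cases before they turn into stale instructions.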
We can now see why gstack and other overhyped libraries are destined for mediocrity:
They arrive without the specialists you need to iterate on and own each skill over time.
They are generic by design, so they cannot reflect your organization’s specific context.
overhyped libraries ngmi, especially when written by AI
A Tangible Example of What’s Working
I’ve seen a few engineering organizations execute well across the three principles of focus, iteration, and accountability. Below, I’ll walk through some composite examples. The guiding question is: what skills, if adopted across the organization, would make the biggest impact today?
A few consistent candidates emerge across companies. Three I’ve found particularly useful are below. You can download a starting-point metaskill to create each of these.
Ticket Architect Skill
Focus: Because building is now a much shorter part of the process with AI, we can no longer afford to “figure out what we’re building as we build it.” We need to change the culture around defining work and push more clarity upfront. Default planning mode is too sycophantic. A ticket architect skill should push back and front-load exactly what should be achieved, especially as product managers, customer success managers, and others begin spawning agents to create prototypes and ideas.
Iteration: Feedback from engineers, product managers, QA, and Customer Success identifies the failure scenarios. One-shot pull request success data can also identify them.
Accountability: A skilled engineer who can build relationships and gather feedback across multiple teams and departments should own and support this skill.
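To show how one-shot pull request data could point at failure scenarios, here is a rough sketch that groups ticket outcomes by whatever the ticket was missing when work began. Everything in it is an assumption made for illustration: the `TicketOutcome` fields, the section names, and the sample records are invented, not real data.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class TicketOutcome:
    """Hypothetical record joining a ticket to the result of its first pull request."""
    ticket_id: str
    missing_sections: Tuple[str, ...]  # sections the ticket lacked when work began
    one_shot: bool  # True if the first pull request merged without rework


def one_shot_rate_by_gap(outcomes: List[TicketOutcome]) -> Dict[str, float]:
    """Map each missing section to the one-shot success rate of tickets lacking it."""
    totals: Dict[str, Tuple[int, int]] = {}
    for outcome in outcomes:
        # Tickets with nothing missing are grouped under "none" as a baseline.
        for section in outcome.missing_sections or ("none",):
            hits, count = totals.get(section, (0, 0))
            totals[section] = (hits + int(outcome.one_shot), count + 1)
    return {section: hits / count for section, (hits, count) in totals.items()}


# Made-up sample records, purely to exercise the function.
sample = [
    TicketOutcome("T-1", ("acceptance_criteria",), False),
    TicketOutcome("T-2", (), True),
    TicketOutcome("T-3", ("acceptance_criteria",), False),
    TicketOutcome("T-4", (), True),
    TicketOutcome("T-5", ("constraints",), True),
]
print(one_shot_rate_by_gap(sample))
```

If one gap consistently drags the one-shot rate below the baseline, that is a candidate for the next question the skill should refuse to skip. The numbers only suggest where to look; the feedback from engineers, product managers, QA, and Customer Success still decides what changes.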
AI Reinforcement Skill
Focus: Create a flywheel where the code and infrastructure become more AI-ready on every job. AI stumbles in some way on nearly every piece of work. Weaknesses in the infrastructure, code, documentation, and support system are identified each time, but most teams throw those observations in the garbage at the end. This skill uses the insights from each job to continually improve the foundation and accelerate future work.
Iteration: Reinforcement requires more taste and discernment from experts because the feedback cycle is slower. The question is broader: “Is the repo improving over time?” However, one of the simplest tests is to attempt a job a second time, leveraging the reinforcement changes, to see if it succeeds more cleanly.
Accountability: Because this skill applies across functions, a panel should support the engineer responsible for continually updating it and ensuring it operates properly.
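The “attempt the job a second time” test can be written down as a small comparison. This is a sketch under assumptions: `JobRun` and its fields (retries, human interventions) are hypothetical measures chosen for illustration, and your team may define “more cleanly” differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobRun:
    """Hypothetical summary of one attempt at a job."""
    succeeded: bool
    agent_retries: int
    human_interventions: int


def ran_more_cleanly(before: JobRun, after: JobRun) -> bool:
    """True when the repeat run succeeded and improved on the original.

    A failed repeat never counts. A success after an original failure always
    counts. Otherwise the repeat must be no worse on either measure and
    strictly better on at least one.
    """
    if not after.succeeded:
        return False
    if not before.succeeded:
        return True
    no_worse = (
        after.agent_retries <= before.agent_retries
        and after.human_interventions <= before.human_interventions
    )
    strictly_better = (
        after.agent_retries < before.agent_retries
        or after.human_interventions < before.human_interventions
    )
    return no_worse and strictly_better


first_attempt = JobRun(succeeded=True, agent_retries=4, human_interventions=2)
after_reinforcement = JobRun(succeeded=True, agent_retries=1, human_interventions=0)
print(ran_more_cleanly(first_attempt, after_reinforcement))
```

A comparison like this covers only the simple test. The broader question of whether the repo is improving over time still needs the taste and discernment described above.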
Work Legibility Skill
Focus: “Automatic public standup for anyone to see.” We can now build faster than ever. To fully use that speed, we need more room to ask forgiveness rather than permission, but that only works if each person’s daily work is legible and easy for others to access.
Iteration: Because this serves AI collection, synthesis, reporting, and day-to-day management, the feedback loop must be continuous.
Accountability: This skill makes up a key portion of the developer experience and organizational legibility. Accordingly, it should be owned by a key leader who understands developer experience and is familiar with leadership priorities.
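As a rough picture of what an “automatic public standup” might produce, the sketch below groups work events by person and renders a plain-text digest. The `WorkEvent` shape, the status values, and the sample entries are assumptions for illustration; in practice the events would come from wherever your agents and tools already log their work.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import List


@dataclass
class WorkEvent:
    """Hypothetical unit of visible work, however your tooling records it."""
    person: str
    summary: str
    status: str  # "done", "in_progress", or "blocked"


def build_standup(events: List[WorkEvent]) -> str:
    """Group events by person and render a plain-text digest anyone can read."""
    by_person = defaultdict(list)
    for event in events:
        by_person[event.person].append(event)
    lines = []
    for person in sorted(by_person):
        lines.append(f"{person}:")
        for event in by_person[person]:
            lines.append(f"  [{event.status}] {event.summary}")
    return "\n".join(lines)


# Invented sample events.
events = [
    WorkEvent("ana", "Merged retry logic for billing webhook", "done"),
    WorkEvent("ben", "Prototype export endpoint", "in_progress"),
    WorkEvent("ana", "Waiting on staging credentials", "blocked"),
]
print(build_standup(events))
```

The format matters less than the habit: if the digest is generated from the work itself rather than written by hand, it stays current without anyone asking permission to look.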
I particularly enjoy the combination of these skills because they reinforce one another. Ticket Architect front-loads decisions so work starts with clearer objectives and constraints, allowing more employees to directly access the build process. AI Reinforcement captures the lessons from each loop and pushes them back into the codebase, documentation, infrastructure, and agent instructions. Work Legibility makes the work visible enough that teams can move faster without disappearing into individual agents’ caves where nobody knows what is happening until the very end.
Downloadable Metaskills For You
Do I have a gstack equivalent of skills for you to use? No! But, click here for metaskills that will guide you through the process of creating a first draft of each skill above. They ask a series of questions and guide you to a solid place to start. Or, start with common sense and best practices, and then iterate from there!
Each company needs versions adjusted for its own systems, language, architecture, risk tolerance, and operating model. Metaskills are starting points for creating high-quality internal skills that a person on your team can own and improve over time.
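To make the idea less abstract, here is a toy sketch of what a metaskill does at its core: ask a fixed set of questions and refuse to produce a draft until each one has an answer. It is not the downloadable metaskill itself. The question set, the `draft_skill` function, and the output layout are all hypothetical, and a real metaskill would be a conversation with the model rather than a dictionary.

```python
from typing import Dict

# Hypothetical question set; a real metaskill would ask far more, in dialogue.
QUESTIONS = {
    "objective": "What outcome should this skill produce?",
    "context": "Which systems, customers, and constraints does it operate in?",
    "guardrails": "What must it never do?",
    "owner": "Who uses it often enough to own and update it?",
}


def draft_skill(name: str, answers: Dict[str, str]) -> str:
    """Assemble a first-draft skill, refusing to proceed while answers are missing."""
    missing = [key for key in QUESTIONS if not answers.get(key, "").strip()]
    if missing:
        raise ValueError(f"Unanswered questions: {', '.join(missing)}")
    sections = [f"Skill: {name}"]
    for key in QUESTIONS:
        sections.append(f"{key.title()}: {answers[key].strip()}")
    return "\n".join(sections)


# Invented answers for illustration.
draft = draft_skill(
    "ticket-architect",
    {
        "objective": "Tickets that an agent can implement in one pull request.",
        "context": "Monorepo, weekly releases, regulated customers.",
        "guardrails": "Never invent acceptance criteria the requester did not confirm.",
        "owner": "A senior engineer who files and reviews tickets daily.",
    },
)
print(draft)
```

Note that the draft cannot be produced without an owner, which is the accountability principle showing up at creation time.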
Remember, this provides a starting point. As you focus, iterate, and establish accountability, you will accelerate your own flywheel.
Your robot theater troupe resets every night. Write the script accordingly.






