Business folks refer to a “competitive moat” as a lasting advantage that enables your company to outpace the competition and deliver superior value to customers. A few examples include:
Size — You have a huge, hyper-efficient widget factory, while your smaller competitors struggle to produce widgets at such a low cost.
Intellectual Property — You hold a patent on an innovation, making it difficult for others to replicate your success.
Network Effects — Launching a new social media app is challenging because users prefer to hang out where their friends and many other users are already active.
Your competitive moat protects your castle (or company).
In the realm of AI, how can your team stay ahead of the competition and remain relevant to your customers? Prompt engineering is not the answer.
To be clear, I am not saying that good prompt engineering can't improve a model's output. What I am saying is that even the best prompt engineering will eventually be discovered by your competitors. Excellent prompt engineering provides, at best, a temporary edge in delivering the best possible AI to your users.
The latest example of this is an ingenious "prompt hack" from AI researchers at Zoom, described in their paper "Chain of Draft: Thinking Faster by Writing Less." They cleverly took Chain-of-Thought (CoT) prompting and explicitly instructed models to keep "thinking out loud" to improve their results, but with the caveat that each thinking step be condensed into just a few tokens. They call it Chain of Draft (CoD). Don't blabber on. Just use a few carefully chosen words to craft the best answer. The result? The approach slightly lowers accuracy, but cost and speed improve dramatically.
From the paper:

"Unlike traditional methods that often involve lengthy reasoning steps, CoD leverages concise reasoning drafts to speed up response generation without sacrificing correctness.

"Additionally, CoD offers significant cost advantages. By compacting the reasoning steps, it reduces the number of input tokens required for few-shot prompting and shortens the output token length, directly lowering computational cost. This token efficiency makes CoD especially appealing in cost-sensitive scenarios, such as large-scale deployments of LLMs or applications with strict budget constraints."
Nice breakthrough! CoD offers a pragmatic trade-off, slashing cost and latency in exchange for a small dip in accuracy, which is especially attractive when generating high-quality answers at scale. We will add this arrow to our quiver immediately and evaluate where and when it is best deployed.
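For the curious, here is a minimal sketch of what the switch from CoT to CoD looks like in practice. The CoD instruction below is paraphrased from the paper (the roughly-five-words-per-step cap and the "####" answer separator follow its examples); the sample question and the build_messages helper are illustrative, and you would pass the resulting messages to whatever LLM client your stack already uses.

```python
# Minimal sketch: Chain-of-Thought vs. Chain-of-Draft prompting.
# The CoD wording is paraphrased from the Zoom paper; everything
# else (the question, the helper) is illustrative.

COT_SYSTEM = (
    "Think step by step to answer the following question. "
    "Return the answer at the end of the response after the separator ####."
)

COD_SYSTEM = (
    "Think step by step, but only keep a minimum draft for each thinking "
    "step, with five words at most. Return the answer at the end of the "
    "response after the separator ####."
)

def build_messages(system_prompt: str, question: str) -> list[dict]:
    """Package a question in the chat format most LLM APIs accept."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]

if __name__ == "__main__":
    question = (
        "A bat and a ball cost $1.10 in total. The bat costs $1.00 "
        "more than the ball. How much does the ball cost?"
    )
    # Print both prompt variants side by side for comparison.
    for name, system in (("CoT", COT_SYSTEM), ("CoD", COD_SYSTEM)):
        print(f"--- {name} ---")
        for msg in build_messages(system, question):
            print(f"{msg['role']}: {msg['content']}")
```

Notice that the entire technique lives in a system prompt. That is exactly the problem: once the idea is public, there is nothing here a competitor could not copy in an afternoon.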
However, was the Zoom team the first to figure out how to transform CoT into CoD reasoning? Almost certainly not. They were simply the first to publish it in a well-researched, transparent way. When the Zoom team announced the advancement, dozens of other teams likely sighed in quiet disappointment, knowing the competition would soon copy their latest trick.
To the extent that prompt engineering can become a moat, it is as a system and a culture: one that consistently and quickly produces prompts that are optimized, monitored, and robust across your AI systems. We'll discuss this in depth in later posts. A one-off brilliant trick, however, is not a system, and it will eventually fail you.
The Bottom Line: While prompt engineering is necessary and useful, neat tricks will not sustain your value proposition to customers over the competition. Potent, hard-to-replicate data? Yes. World-class AI workflow designs? Likely. But excellent prompting by itself is a shallow and drying moat that will not keep the competition out for long.
It’s a moat! Don’t tell me otherwise! 😜