The Transient Nature of Prompt Engineering: A Call for More Robust Language Models
Moving Beyond the Hack Towards Robust and User-Friendly Language Models
In tech, hacks sometimes live on and become permanent. It's important to recognize when a hack is needed and to set a deadline with exit criteria before the hack hardens into a pattern some might regret.
One recent example is prompt engineering, a valuable hack, I'd say, considering the limitations of the initial versions of LLMs (e.g., data limitations and how they were trained). Prompt engineering was (and is) a stop-gap: a way to formulate prompts/queries so the LLM would understand them, mirroring how it was trained and (instruction-)tuned. I.e., can we create prompts so close to those instructions the model was trained on to realize the best performance?
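As a concrete sketch of what that looks like in practice, consider the same user intent expressed two ways. The scaffolding below is hypothetical and illustrative, not tied to any specific model or library; it mimics the instruction-style templates many models were tuned on.

```python
# The same user intent, expressed two ways (illustrative example).
user_intent = "Summarize this report in three bullet points."

# Today: an "engineered" prompt wraps the intent in scaffolding
# that mirrors the instruction format the model was tuned on.
engineered_prompt = (
    "You are an expert analyst. Follow the instructions exactly.\n"
    "### Instruction:\n"
    f"{user_intent}\n"
    "### Response:"
)

# Desired state: the raw intent alone should be enough.
ideal_prompt = user_intent

print(engineered_prompt)
print(ideal_prompt)
```

The gap between the two strings is the work prompt engineering does today, and the work a robust model would make unnecessary.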
Don't get me wrong, we need prompt engineering today because we are not in the desired state, where, ideally, a foundation model would comprehend the prompt regardless of how it was formatted or chained. From a usability standpoint, I'd consider it a UX bug. The question I think we should be asking is: how do we get to that desired state? Not: how do we prompt-engineer better, especially in the long term.
Think about it: talking to a child, you have to formulate your communicative intent to fit their limited vocabulary and comprehension of the world, versus talking to an educated adult who shares much of your context and has a wider understanding of the world around them. With the adult, you can use words without engineering them and still communicate efficiently.
If I were to map this to a concept, I'd use Shannon's information theory and attribute the need for prompt engineering to channel noise. This noise can result from inadequate training data, a suboptimal model architecture, or any other limitation.
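For reference, the analogy borrows from Shannon's noisy-channel picture, where the capacity for reliable communication degrades as noise grows:

```latex
% Shannon's channel capacity: C is the maximum rate of reliable
% communication over a channel of bandwidth B with signal power S
% and noise power N.
C = B \log_2\!\left(1 + \frac{S}{N}\right)
```

In this (loose) analogy, the user's intent is the signal, the model's limitations are the noise, and prompt engineering is the user manually compensating for a low effective signal-to-noise ratio between intent and model.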
The question, then, is how to reduce the noise enough to eliminate the need to engineer human language. Imagine you're asking a friend for help. You'd typically be direct and informal: "Hey, I could really use some help with this." You wouldn't say, "You're a great friend, you're supposed to help me, here's exactly how you can help…" because that level of detail isn't necessary. Your friend understands your context and the nature of your request without needing you to spell it all out.
I don't see prompt engineering going away completely (not in the short term), but at least for foundation models, we shouldn't rely on it as a pillar technique. We should acknowledge it as a hack and put effort into engineering and modeling beyond the prompt layer. It's not good UX.
TL;DR:
Hacks can become permanent, so be wary of prompt engineering solidifying as a standard practice.
Prompt engineering is a valuable but temporary fix for current LLM limitations.
Ideal language models understand human language regardless of prompt formatting: no engineering needed.
Reliance on prompt engineering highlights a UX flaw: models aren't robust enough.
Instead of refining prompt engineering, focus on reducing "noise" (limitations in data, architecture, etc.).
Like talking to a child vs. an adult, we want models to understand implicit context, not engineered prompts.
Shannon's Information Theory: prompt engineering is needed due to channel noise, which we must reduce.
Prompt engineering may remain relevant in niche areas, but not for foundational models.
Invest in engineering and modeling beyond the prompt layer for truly robust and user-friendly language AI.
Good UX means natural interaction: let's move beyond prompt engineering for the future of language models.
That's it! If you want to collaborate, co-write, or chat, reach out via subscriber chat or simply on LinkedIn. I look forward to hearing from you!