Apologies for the snarky title, but there has been a huge amount of discussion around so-called "Prompt Engineering" these past few months on all kinds of platforms. Much of it comes from individuals peddling an awful lot of "Prompting" and very little "Engineering".
Most of these discussions are little more than users finding that writing more creative and complicated prompts can help them solve a task that a simpler prompt could not. I claim this is not Prompt Engineering. This is not to say that crafting good prompts is easy, but it does not involve any kind of sophisticated modification to the general "template" of a prompt.
Others, who I think do deserve to call themselves "Prompt Engineers" (and an awful lot more than that), have been writing about and utilizing the rich new ecosystem of tooling around LLMs for features such as templates, additional memory, and custom decoders. Examples include Langchain, VectorDB technologies, txtai/txtchat, my own work on token-level constrained text generation, Hugging Face's work on sequence-level constrained text generation, and many others. Many of these tool builders are finding that they can form entire, well-funded companies around their tools. Despite the money and hype around LLMs, it is still shockingly difficult to prompt them, but this doesn't have to be the case!
We are fortunate that an awful lot of very smart people have implemented many really neat prompt engineering techniques within Stable Diffusion, and more specifically within the Automatic1111 webui. I am going to highlight some of these packages/techniques, because I will carefully explain and demonstrate that there are LLM analogues for them which are being unjustly forgotten about and are not implemented in any LLM front-end. Some naysayers seem to think this is not the case. My hope is that this gist puts the final nail in the coffin of our current, non-creative approach to prompting LLMs. Let's list the techniques they've pioneered which are broadly possible for us to use in NLP and which, to my knowledge, have not been implemented in any serious capacity in any repo.
We can implement prompt alternating by alternating the base input prompt, at each generation step, between two user-given prompts.
Imagine that you want to generate 20 tokens of output while alternating between two prompts, "I like apple" and "I like bananas", making the input prompt look like this: [I like apple:I like bananas]
For the first token generated, the input is "I like apple"; let's say it generates "because". For the second token generated, the input is "I like bananas because"; let's say it generates "I". For the third token generated, the input is "I like apple because I", and it generates... and so on.
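To make this concrete, here is a minimal sketch of prompt alternating with Hugging Face transformers, generating one token at a time with greedy decoding. The model choice ("gpt2") and the function name are placeholders of mine, not an established implementation:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # placeholder choice; any causal LM should work
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

def alternate_prompts(prompt_a, prompt_b, num_tokens=20):
    """Generate num_tokens tokens, swapping the base prompt every step
    while both prompts share the same generated continuation."""
    generated = ""
    for step in range(num_tokens):
        base = prompt_a if step % 2 == 0 else prompt_b
        input_ids = tokenizer(base + generated, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(input_ids).logits
        next_id = torch.argmax(logits[0, -1]).item()  # greedy decoding for simplicity
        generated += tokenizer.decode([next_id])
    return generated

print(alternate_prompts("I like apple", "I like bananas"))
```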
Similar to the above, but we can choose how many tokens to generate with one prompt before switching to the other.
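A sketch of that variant, reusing the imports, model, and tokenizer from the prompt-alternating sketch above; the `switch_at` parameter name is my own placeholder:

```python
# Assumes torch, `model`, and `tokenizer` from the prompt-alternating sketch above.
def switch_prompts(prompt_a, prompt_b, switch_at=10, num_tokens=20):
    """Generate with prompt_a for the first switch_at tokens, then keep
    the continuation but swap the base prompt to prompt_b."""
    generated = ""
    for step in range(num_tokens):
        base = prompt_a if step < switch_at else prompt_b
        input_ids = tokenizer(base + generated, return_tensors="pt").input_ids
        with torch.no_grad():
            logits = model(input_ids).logits
        next_id = torch.argmax(logits[0, -1]).item()
        generated += tokenizer.decode([next_id])
    return generated

print(switch_prompts("I like apple", "I like bananas", switch_at=5))
```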
In an LLM front-end, I should be able to use "()" in the prompt to increase the model's attention to the enclosed words, and "[]" to decrease it. I should be able to combine multiple modifiers.
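As a rough sketch of what an LLM analogue could look like: in Automatic1111 the emphasis weights rescale the text-encoder embeddings, so one plausible (but unverified) translation is to scale the LLM's input embeddings for the emphasized tokens. Parsing of the ()/[] syntax is omitted here; segments are passed as explicit (text, weight) pairs, and the model/tokenizer come from the first sketch:

```python
# Assumes torch, `model`, and `tokenizer` from the first sketch.
def emphasized_next_token(segments):
    """segments: list of (text, weight) pairs,
    e.g. [("I really like ", 1.0), ("apples", 1.5), (" because", 1.0)]"""
    embed_layer = model.get_input_embeddings()
    pieces = []
    for text, weight in segments:
        ids = tokenizer(text, return_tensors="pt").input_ids
        pieces.append(embed_layer(ids) * weight)  # up- or down-weight these tokens
    inputs_embeds = torch.cat(pieces, dim=1)
    with torch.no_grad():
        logits = model(inputs_embeds=inputs_embeds).logits
    return tokenizer.decode([torch.argmax(logits[0, -1]).item()])

# ()/[] style emphasis, expressed here as explicit weights:
print(emphasized_next_token([("I really like ", 1.0), ("apples", 1.5), (" because", 1.0)]))
```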
I should be able to average, or compute a weighted average of, the embeddings of multiple tokens in a prompt. This enables us to get a model to answer the question "What is the definition of {apple|orange}", where {apple|orange} is the mathematical average in embedding space of those two words.
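A minimal sketch of prompt blending along these lines, again assuming the model and tokenizer from the first sketch; for simplicity each blended word is reduced to a single vector by averaging its sub-word embeddings, which is my own simplification:

```python
# Assumes torch, `model`, and `tokenizer` from the first sketch.
def blended_next_token(prefix, words, weights=None):
    """Replace one slot in the prompt with the (weighted) average of the
    words' input embeddings, e.g. prefix="What is the definition of",
    words=["apple", "orange"]."""
    embed_layer = model.get_input_embeddings()
    weights = weights or [1.0 / len(words)] * len(words)

    def word_vec(word):
        ids = tokenizer(" " + word, return_tensors="pt").input_ids
        return embed_layer(ids).mean(dim=1, keepdim=True)  # (1, 1, hidden)

    blend = sum(w * word_vec(word) for word, w in zip(words, weights))
    prefix_embeds = embed_layer(tokenizer(prefix, return_tensors="pt").input_ids)
    inputs_embeds = torch.cat([prefix_embeds, blend], dim=1)
    with torch.no_grad():
        logits = model(inputs_embeds=inputs_embeds).logits
    return tokenizer.decode([torch.argmax(logits[0, -1]).item()])

# "What is the definition of {apple|orange}"
print(blended_next_token("What is the definition of", ["apple", "orange"]))
```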
Prompt blending, but with far more flexibility: use a third point as an "anchor".
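One way this could look, sketched under my own assumption that the anchor acts as "anchor plus weighted offsets of each blended word from the anchor" in embedding space (the Stable Diffusion extensions may define it differently):

```python
# Assumes torch, `model`, and `tokenizer` from the first sketch.
def anchored_blend_vec(words, anchor, weights=None):
    """Blend `words` relative to a third `anchor` word: start at the anchor's
    embedding and add weighted offsets toward each blended word."""
    embed_layer = model.get_input_embeddings()
    weights = weights or [1.0 / len(words)] * len(words)

    def word_vec(word):
        ids = tokenizer(" " + word, return_tensors="pt").input_ids
        return embed_layer(ids).mean(dim=1, keepdim=True)  # (1, 1, hidden)

    anchor_vec = word_vec(anchor)
    return anchor_vec + sum(w * (word_vec(word) - anchor_vec)
                            for word, w in zip(words, weights))

# The resulting vector can be spliced into inputs_embeds exactly as in the
# prompt-blending sketch above, e.g. for "{apple|orange}" anchored on "fruit":
blend = anchored_blend_vec(["apple", "orange"], anchor="fruit")
```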
This gist will include extremely basic LLM implementations for these 5 techniques, written by yours truly, all contributed with the hope that this spurs the community at large to implement and experiment with these features. Two are given below, but I will finish the other 3 in the next few days as my time allows.