The stages of an LLM app seem to go like this:
- Hardcode the first prompt, get the end-to-end app working.
- Realise that the answers are bad.
- Do some prompt engineering.
- Realise the answers are still bad.
- Do some more prompt engineering.
- Discover vector databases!!!1
- Dump a ton of data as plain strings into the vector db for semantic search on embeddings.
- Post your achievement on Twitter.
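The "semantic search on embeddings" step in that list boils down to nearest-neighbour search over vectors. A minimal sketch, with made-up 4-dimensional vectors standing in for what an embedding model would produce (the corpus texts and numbers here are hypothetical):

```python
import math

# Toy corpus: in a real app each chunk would be embedded by a model;
# these 4-dim vectors are invented for illustration.
corpus = {
    "refund policy": [0.9, 0.1, 0.0, 0.2],
    "shipping times": [0.1, 0.8, 0.3, 0.0],
    "account deletion": [0.0, 0.2, 0.9, 0.1],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def search(query_vec, k=2):
    """Return the k chunks whose embeddings are closest to the query."""
    ranked = sorted(corpus.items(),
                    key=lambda kv: cosine(query_vec, kv[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]

# A query vector that should land near "refund policy"
print(search([0.8, 0.2, 0.1, 0.1], k=1))  # → ['refund policy']
```

A vector database does essentially this, plus approximate indexing so it scales past brute force; the ranking principle is the same.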
The journey usually ends here -- with an impressive demo. But the demo is usually hand-picked from many examples, and for most users and most queries the system doesn't work.
What's next? Improving on this takes much more work. Setting up even semi-rigorous evaluation is tedious, including manual labelling. Fetching the right context takes even more work. Prompt engineering turns into orchestrating multi-prompt chains with intent detection, leading to interleaved Python code and LLM calls...
Which is to say, another form of engineering.
What I wanted to focus on, though, is the "fetching the right context" part. While it may seem new, it is the age-old information retrieval problem -- and the solutions are probably similar. So my suggestion to anyone working to remove hallucinations: brush up on your Information Retrieval 101, and be inspired by the search-engine builders of 20+ years ago.
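To make the IR 101 point concrete: classic lexical ranking like BM25 (the scoring function behind those 20-year-old search engines, still the default in Lucene-based systems) can be written in a few lines. A self-contained sketch over a toy collection (documents and query invented for illustration):

```python
import math
from collections import Counter

# Tiny document collection, standing in for context chunks.
docs = [
    "the cat sat on the mat",
    "dogs and cats are pets",
    "the stock market fell today",
]
tokenized = [d.split() for d in docs]
N = len(tokenized)
avgdl = sum(len(d) for d in tokenized) / N

def idf(term):
    # BM25 inverse document frequency with the usual +0.5 smoothing.
    df = sum(1 for d in tokenized if term in d)
    return math.log((N - df + 0.5) / (df + 0.5) + 1)

def bm25(query, doc, k1=1.5, b=0.75):
    # Score one tokenized document against a query string.
    tf = Counter(doc)
    score = 0.0
    for term in query.split():
        f = tf[term]
        score += idf(term) * f * (k1 + 1) / (
            f + k1 * (1 - b + b * len(doc) / avgdl))
    return score

def search(query, k=1):
    ranked = sorted(range(N), key=lambda i: bm25(query, tokenized[i]),
                    reverse=True)
    return [docs[i] for i in ranked[:k]]

print(search("cat mat"))  # → ['the cat sat on the mat']
```

In practice a hybrid of lexical scoring like this and embedding similarity often beats either alone -- which is exactly the kind of trick the old search-engine literature is full of.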