Chain-of-thought reasoning

Matt Rickard has a concise overview of Chain-of-thought: the design pattern of having an LLM think step by step.

To summarize, the four approaches he mentions, from simplest to most nuanced, are:

  1. Add "Let's think step-by-step" to the prompt.
  2. Produce multiple solutions, have the LLM self-check on each one, pick the one that passes the self-check.
  3. Divide task into subtasks, solve each subtask.
  4. Explicit thought-action-observation loops, where the action can be the use of a tool (e.g. an external API call); a rough sketch of such a loop follows this list.
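
Approach 1 is literally just appending a sentence to the prompt; approach 4 is the most involved, so here is a minimal sketch of it in Python. The `llm` helper, the prompt format, and the `TOOLS` registry are hypothetical stand-ins for illustration, not any real library's interface:

```python
# A rough sketch of approach 4: a thought-action-observation loop.

def llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever completion API you use."""
    raise NotImplementedError

TOOLS = {
    # A pretend tool; in practice this would call a real search API.
    "search": lambda query: f"(imagine search results for {query!r})",
}

def react_loop(question: str, max_steps: int = 5) -> str:
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        # Ask the model for its next thought/action given the transcript so far.
        step = llm(transcript + "Thought:")
        transcript += "Thought:" + step + "\n"
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        if "Action:" in step:
            # Expect a line like "Action: search[best sushi in Tokyo]".
            action = step.split("Action:", 1)[1].strip().splitlines()[0]
            name, _, arg = action.partition("[")
            observation = TOOLS[name.strip()](arg.rstrip("]"))
            transcript += f"Observation: {observation}\n"
    return transcript  # ran out of steps; return the raw transcript for inspection
```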

This is not an exhaustive list. Each of the ideas above is introduced in a separate paper, which feels excessive given their simplicity. Regardless, these are essentially tricks on top of GPT/InstructGPT, rather than fundamental discoveries, so anyone can come up with new ones, and I'm sure people have already.

This list is notable because the approaches are generic: you could build a system that uses these tricks without constraining it to any particular task. If you only care about one domain -- say, generating code -- there is low-hanging fruit: if you have any inkling of how the task is actually done, you can bias the model towards a specific set of steps.

For example, a system that uses GPT to make code edits based on simple descriptions ("rename the GET /dogs API path to GET /bar") might be built with the following hard-coded steps, sketched in code after the list:

  1. List the files in the repo.
  2. Identify which files need to change, given the prompt.
  3. For each identified file, make the edit.
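
As a rough sketch of those steps, reusing the same kind of hypothetical `llm()` wrapper as above (the prompts and function names are made up, not a real library's API):

```python
import subprocess
from pathlib import Path

def llm(prompt: str) -> str:
    """Hypothetical wrapper around whatever completion API you use."""
    raise NotImplementedError

def edit_repo(repo: Path, instruction: str) -> None:
    # Step 1: list the files in the repo (tracked files only, via git).
    files = subprocess.run(
        ["git", "-C", str(repo), "ls-files"],
        capture_output=True, text=True, check=True,
    ).stdout.splitlines()

    # Step 2: ask the model which files need to change, given the instruction.
    relevant = llm(
        f"Instruction: {instruction}\n"
        "Files:\n" + "\n".join(files) +
        "\nList the files that must be edited, one per line."
    ).splitlines()

    # Step 3: for each identified file, ask the model for the edited contents.
    for name in relevant:
        path = repo / name.strip()
        if not path.is_file():
            continue  # the model may name files that don't exist; skip those
        edited = llm(
            f"Instruction: {instruction}\n"
            f"--- {name} ---\n{path.read_text()}\n"
            "Return the full edited file."
        )
        path.write_text(edited)
```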

Perhaps you could add some self-checking on top: if the code doesn't compile, run the same steps a second time, this time also conditioning on the error message.
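
Continuing the hypothetical sketch above (reusing `edit_repo`), that self-check could be a small retry loop that feeds the build error back into the instruction; `make` here is just a placeholder for whatever build or test command applies:

```python
def edit_with_self_check(repo: Path, instruction: str, attempts: int = 2) -> bool:
    error = ""
    for _ in range(attempts):
        # On the retry, also condition on the error message from the failed attempt.
        prompt = instruction if not error else (
            f"{instruction}\nA previous attempt failed with:\n{error}"
        )
        edit_repo(repo, prompt)

        # Self-check: does the project still build? Swap in your real build/test command.
        result = subprocess.run(["make", "-C", str(repo)], capture_output=True, text=True)
        if result.returncode == 0:
            return True
        error = result.stderr
    return False
```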

The core point, though, is that you can only improve on the GPT-made "algorithm" if you have some domain knowledge. So expertise in the task still matters. (Maybe LLMs are the moment we finally, collectively, realize the value of business process management...?)