Self-driving apps

There's a now-famous classification of the different levels of self-driving, the SAE levels of driving automation:

  • Level 0: No automation
  • Level 1: Driver assistance
  • Level 2: Partial automation
  • Level 3: Conditional automation (the system drives, but a human fallback must be ready to take over on request)
  • Level 4: High automation (no human fallback needed; the vehicle can, e.g., safely pull over on its own)
  • Level 5: Full automation

The full definition is more detailed but I won't replicate it here.

Now that we have the GPT family of models, and are already seeing multimodal model releases, including for robots (PaLM-E), is it time for a more general definition of automation levels?

Levels 0 and 5 are pretty boring; it's the intermediate levels that allow us to track progress and compare systems.
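
To make that concrete, here's one speculative way a generalized scale could be written down. This is only a sketch of the idea: the knowledge-work glosses are my own framing, not an established standard like SAE's.

```python
from enum import IntEnum

class AutomationLevel(IntEnum):
    """A speculative generalization of the SAE levels to knowledge work.

    The glosses are this post's framing, not an established standard.
    """
    NONE = 0         # human does all the work
    ASSISTANCE = 1   # system suggests; human drives every step
    PARTIAL = 2      # system produces output, but every piece needs review
    CONDITIONAL = 3  # system works alone, requesting human input when uncertain
    HIGH = 4         # system works alone and fails safely without a human
    FULL = 5         # no human involvement at all
```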

Let me apply this classification to a few of the top current applications of LLMs to see whether it holds up:

  • GitHub Copilot is at L2, slowly inching towards L3: it can make edits and sometimes produce the complete code needed, but without human review, bugs and security issues are likely.
  • Jasper, copy.ai, and other writing apps: hard to tell, but probably L3? Except that these systems cannot request human input at the specific points where they are uncertain, which is what true conditional automation would require (see the sketch after this list).
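
What would genuine L3 look like for a writing app? Roughly: the system drafts on its own and hands specific pieces back to the human only where it isn't confident. Below is a minimal sketch of that loop; `draft_section` and its confidence score are hypothetical stand-ins for a real LLM call, not any vendor's API.

```python
from dataclasses import dataclass

@dataclass
class Draft:
    text: str
    confidence: float  # model's self-reported certainty in [0, 1]

def draft_section(outline_item: str) -> Draft:
    # Stand-in for a real LLM call; fakes low confidence for items
    # the model would have no facts about.
    confidence = 0.3 if "pricing" in outline_item else 0.9
    return Draft(text=f"[draft for: {outline_item}]", confidence=confidence)

def write_document(outline: list[str], threshold: float = 0.7) -> list[str]:
    sections = []
    for item in outline:
        draft = draft_section(item)
        if draft.confidence < threshold:
            # L3 behavior: escalate to the human fallback, but only
            # for this specific section.
            sections.append(input(f"Low confidence on '{item}'. Your text: "))
        else:
            sections.append(draft.text)
    return sections

if __name__ == "__main__":
    print("\n\n".join(write_document(["intro", "pricing comparison"])))
```

The threshold is exactly the L2/L3 boundary: below it, control returns to the human for that section; above it, the human never has to look at the draft.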

This feels like the wrong framework for automating knowledge work. Possibly because self-driving demands near-absolute safety, whereas many applications of generative AI are far more error-tolerant. Or possibly because almost nobody is actually trying to build L5 AI today: everyone is building an assistant, and few companies seem to aim for full automation.