Where are the prompts to stop humans going off the rails?

Published: July 3, 2026

The place I work at has a miserly AI budget, so last month I was very disciplined: I used AI judiciously and stuck to the Haiku model.

Come the end of the month, I still had more than half my budget left, so I spent two hours with Opus, taking a test-driven development approach to creating some custom linting rules.

It worked really well. I pasted in a design principles document, and one by one it came up with example code that should fail and what the lint error should be. Tight human-in-the-loop stuff.

Then I asked it to implement the lint rules, and off it went and did exactly that. Brilliant, great job.

Fast forward to yesterday, and I thought, 'Well, Opus did a great job the other day, I've got a new task for you.' But this time I skipped the TDD and gave it vaguer prompts. It created something that mostly worked, but I wasn't nearly as confident in it. Then I looked at my token usage and saw that I'd already used half my budget in two days.

When we use AI, it seems that if we're not careful it gets very hacky: hard-coding things, copy-pasting things, adding @ts-ignore comments and so on. So we update our system prompts to keep it on track: "Don't do X", "If you encounter this situation, do Y", etc.

But humans can be just as temperamental, if not more so. Sometimes we're under pressure, or we're just tired, and we think, 'TDD will take too long, let me just get this thing working first.'

Maybe our system prompts could include not only guard rails to stop the AI going off track, but also instructions to guide the human. For example, if the AI receives a prompt like 'we need a CLI tool to do X', the instructions could have it come back with 'OK, are you sure you want me to just go off and do this? In the past we've established that creating a decision tree and doing TDD is most effective'. Or: 'I see that you just ran git commit -am "fix" --no-verify on more than 100 files. Do you think maybe we should take a break?'

Maybe I shouldn't design chatbots, because both of those sound super annoying, but that's the idea.
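To make that a bit more concrete, here is a minimal sketch of the two nudges as plain functions. Everything in it is an assumption for illustration: the function names, the "we need a ..." pattern used to spot a vague request, and the 100-file threshold are hypothetical, and a real version would more likely live in the system prompt itself or in a wrapper around the chat tool.

```python
import re
from typing import Optional

# Hypothetical heuristics, chosen only to mirror the two examples above.
VAGUE_REQUEST = re.compile(r"^\s*we need (a|an|some)\b", re.IGNORECASE)
MAX_FILES_UNVERIFIED = 100


def nudge_for_prompt(prompt: str, has_tests: bool) -> Optional[str]:
    """Return a question for the human if the request looks like a shortcut."""
    if VAGUE_REQUEST.search(prompt) and not has_tests:
        return (
            "OK, are you sure you want me to just go off and do this? "
            "Creating a decision tree and doing TDD has worked best before."
        )
    return None


def nudge_for_commit(command: str, files_changed: int) -> Optional[str]:
    """Return a question if a big commit skipped the pre-commit hooks."""
    if "--no-verify" in command and files_changed > MAX_FILES_UNVERIFIED:
        return "That commit skipped the hooks on a lot of files. Time for a break?"
    return None


if __name__ == "__main__":
    print(nudge_for_prompt("we need a CLI tool to do X", has_tests=False))
    print(nudge_for_commit('git commit -am "fix" --no-verify', files_changed=150))
```

Both functions return None when there is nothing to flag, so the caller can stay silent most of the time, which might make the whole thing slightly less annoying.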

There are equivalent tools in other industries. Vigilance control devices on trains come to mind: the device makes an annoying sound and the driver must press a button to prove they are still awake, otherwise the train comes to a stop.
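A software version of that could be as small as a timer that wants an acknowledgement. The sketch below is only an analogy, not how any real train system or AI tool works; the class name VigilanceCheck and its methods are made up for this example, and the clock is injectable purely so the behaviour can be tested without waiting.

```python
import time
from typing import Callable


class VigilanceCheck:
    """Hypothetical dead man's switch for a long AI coding session."""

    def __init__(
        self,
        interval_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.interval = interval_seconds
        self.clock = clock
        self.last_ack = clock()

    def acknowledge(self) -> None:
        """The human 'presses the button' to show they are still paying attention."""
        self.last_ack = self.clock()

    def should_stop(self) -> bool:
        """True once the interval has passed without an acknowledgement."""
        return self.clock() - self.last_ack > self.interval
```

A wrapper around the assistant could call should_stop() before each request and refuse to carry on until the human has called acknowledge(), for example by confirming that the last change was actually reviewed.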

