Token limits have been the number one problem of vibecoding. Aside from heavy usage, vague prompting and inefficient context management are the two key causes of this problem. Tokens are counted from two things: the user’s input and the agent’s output. You can have the best prompt in the world, but if your AI pukes out a Shakespearean essay for each response, you will hit your token limit faster. Similarly, if your prompt is too long, your token usage will skyrocket.

Prompting

Prompting seems easy enough, but the reality is that if you want quality output, the input must be of similar quality. Here are some tips you can try for a better, more token-efficient prompt.

Stick to English language only

Certain languages may use more tokens than English for the same content. Switching between languages might not be a good idea either. Sure, you can control your input prompt, but the output may still use more tokens.

Extremely short input

Do not use ‘please’ or ‘thank you’. Get rid of filler words. Straight to the point....
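To make the "extremely short input" tip concrete, here is a minimal sketch that strips courtesy words from a prompt and compares a rough token estimate before and after. Everything in it is an assumption for illustration: the filler list, the function names `strip_filler` and `rough_token_estimate`, and the "about 4 characters per token" rule of thumb for English text. Real counts depend on the tokenizer of the model you use, so treat the numbers as a ballpark only.

```python
import re

# Hypothetical filler list for illustration; extend it to match your own habits.
FILLER_PATTERNS = [
    r"\bplease\b",
    r"\bthank you\b",
    r"\bthanks\b",
    r"\bkindly\b",
]


def strip_filler(prompt: str) -> str:
    """Remove courtesy words, then tidy the whitespace and punctuation left behind."""
    cleaned = prompt
    for pattern in FILLER_PATTERNS:
        cleaned = re.sub(pattern, "", cleaned, flags=re.IGNORECASE)
    # Collapse runs of whitespace created by the removed words.
    cleaned = re.sub(r"\s+", " ", cleaned)
    # Drop spaces that now sit directly before punctuation.
    cleaned = re.sub(r"\s+([,.!?])", r"\1", cleaned)
    # Drop commas that now sit directly before sentence-ending punctuation.
    cleaned = re.sub(r",+([.!?])", r"\1", cleaned)
    return cleaned.strip(" ,")


def rough_token_estimate(text: str) -> int:
    """Ballpark only: assumes roughly 4 characters per token for English text."""
    return max(1, len(text) // 4)


if __name__ == "__main__":
    original = "Please fix the login bug, thank you."
    shorter = strip_filler(original)
    print(shorter)
    print(rough_token_estimate(original), "->", rough_token_estimate(shorter))
```

The saving on one prompt is small, but it repeats on every message in a long session. The same estimate can be run on the agent's replies to see whether the output side is where most of the tokens go.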
I'm just sharing some of my notes here.