As you give the LLM more instructions, it doesn't simply ignore the newer ("further down in the file") instructions - it begins to ignore all of them uniformly
@krzyzanowskim Marcin Krzyzanowski yeah. Context is usually only lightly ordered. Even getting the tokens to be interpreted as sequential is a bit of work. The model interprets all context "in parallel," which means confusion can be global. https://huggingface.co/blog/designing-positional-encoding
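(A minimal sketch of the "in parallel" point, not from the thread: self-attention by itself is permutation-equivariant, so token order carries no information until a positional signal is added. The toy dimensions and the classic sinusoidal encoding here are assumptions for illustration; the linked blog post surveys other designs.)

```python
import torch

torch.manual_seed(0)
d = 8                       # toy embedding dimension
tokens = torch.randn(5, d)  # 5 token embeddings, no positional information yet

def self_attention(x):
    # single-head scaled dot-product attention, no masking
    scores = x @ x.T / d ** 0.5
    return torch.softmax(scores, dim=-1) @ x

perm = torch.randperm(5)

# Without positional encodings, shuffling the input only shuffles the output:
# the model literally cannot tell one ordering from another.
out = self_attention(tokens)
out_shuffled = self_attention(tokens[perm])
print(torch.allclose(out[perm], out_shuffled, atol=1e-6))  # True

# Add the classic sinusoidal positional encoding by position.
pos = torch.arange(5).unsqueeze(1)
div = 10000 ** (torch.arange(0, d, 2) / d)
pe = torch.zeros(5, d)
pe[:, 0::2] = torch.sin(pos / div)
pe[:, 1::2] = torch.cos(pos / div)

# Now reordering the tokens changes each token's representation:
# order finally carries information, but only because we injected it.
out_pe = self_attention(tokens + pe)
out_pe_shuffled = self_attention(tokens[perm] + pe)
print(torch.allclose(out_pe[perm], out_pe_shuffled, atol=1e-6))  # False
```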