There are plenty of language models you can run on your own machines or servers, and they can process as much data as your hardware can physically handle. That's roughly how LLMs themselves are built: large clusters of high-end computers churn through immense amounts of text, learning statistical patterns from all of it. (Tokenization is just the first step, converting raw text into the numeric tokens a model works with.)
But ChatGPT isn't running on your own servers or computers, and you're sharing it with millions of other users, each with their own demands and needs. So limits are placed on what any one user can send it at a time, such as context-window and rate limits, to keep the whole system from buckling.
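One common way to work around per-request limits is to split a large document into pieces that each fit under the cap. Here is a minimal sketch of that idea; the 4,000-word limit and the `chunk_text` helper are illustrative inventions (real services measure limits in tokens, not words, and the actual numbers vary by model and plan):

```python
# Hypothetical chunker: split a long text into pieces that each stay
# under a per-request size limit. Real limits are counted in tokens,
# but words serve as a rough stand-in for this sketch.

def chunk_text(text: str, max_words: int = 4000) -> list[str]:
    words = text.split()
    # Slice the word list into consecutive windows of at most max_words each.
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]

document = "word " * 10_000          # a stand-in for a large document
chunks = chunk_text(document, max_words=4000)
print(len(chunks))                   # → 3 (4000 + 4000 + 2000 words)
```

Each chunk can then be sent as its own request, with the responses stitched back together afterward, which is exactly the kind of bookkeeping you avoid when the model runs locally on your own hardware.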