How can you index an entire codebase and what can I use it for?

GPT-4 only provides a context window of around 8k tokens. How can you index an entire codebase, and how is it indexed? I'm guessing the codebase isn't kept in context when it is much larger than 8k tokens?

What exactly can I use it for? Let’s say I have a codebase with different games like backgammon and Yahtzee. Those games are written in the same framework/libraries/style. Could I use the codebase chat to get Cursor to write me another game in the same style as the two games already in the codebase?

It's probably building vectors (embeddings) and/or compressing the code into some format with pointers back to the source, which it then searches using your prompt.
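
To make that guess concrete, here is a minimal sketch of vector-based retrieval over a codebase. It is not Cursor's actual implementation, which isn't described in this thread. Everything in it is an assumption for illustration: the function names (`chunk_lines`, `embed`, `build_index`, `search`), the file paths and contents, and above all `embed`, which uses plain word counts as a toy stand-in for a real embedding model. The idea is that only the few best-matching chunks get pasted into the prompt, so the whole codebase never has to fit in the context window.

```python
import math
import re
from collections import Counter


def chunk_lines(path, text, size=20):
    """Split a file into chunks of `size` lines, remembering where each starts."""
    lines = text.splitlines()
    return [(path, i + 1, "\n".join(lines[i:i + size]))
            for i in range(0, len(lines), size)]


def embed(text):
    """Toy 'embedding': word counts. A real tool would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(files):
    """Index every chunk as (path, start_line, chunk_text, vector)."""
    index = []
    for path, text in files.items():
        for p, start, chunk in chunk_lines(path, text):
            index.append((p, start, chunk, embed(chunk)))
    return index


def search(index, query, k=2):
    """Return the k chunks most similar to the query, as (path, start_line, text)."""
    q = embed(query)
    ranked = sorted(index, key=lambda e: cosine(q, e[3]), reverse=True)
    return [(p, start, chunk) for p, start, chunk, _ in ranked[:k]]


# Hypothetical two-file codebase, loosely based on the games in the question.
files = {
    "games/backgammon.py": (
        "class Backgammon(Game):\n"
        "    def roll_dice(self):\n"
        "        return self.rng.roll(2)\n"
    ),
    "games/yahtzee.py": (
        "class Yahtzee(Game):\n"
        "    def score_full_house(self, dice):\n"
        "        return 25\n"
    ),
}

index = build_index(files)
top = search(index, "how is a full house scored", k=1)
print(top[0][0], "line", top[0][1])
```

For the "write me another game in the same style" case, the same lookup would presumably pull in chunks from the existing games so the model can imitate their structure, but that only works as well as the retrieval step picks the right examples.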