What is Chunk Viz?
Language Models do better when they're focused. One strategy is to pass a relevant subset (chunk) of your full data. There are many ways to chunk text.
This is a tool to understand different chunking/splitting strategies.
Language Models have context windows. This is the length of text that they can process in a single pass.
Although context lengths are getting larger, it has been shown that language models perform better on tasks when they are given less (but more relevant) information.
But which relevant subset of data do you pick? This is easy when a human is doing it by hand, but it turns out to be difficult to instruct a computer to do this.
One common way to do this is by chunking, or subsetting, your large data into smaller pieces. In order to do this you need to pick a chunking strategy.
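To make the idea concrete, here is a minimal sketch of one of the simplest strategies: splitting by a fixed number of characters, with an optional overlap between neighboring chunks. The function name `chunk_text` and the parameters `chunk_size` and `overlap` are made up for this illustration; this is not necessarily how this tool or any particular library implements it.

```python
def chunk_text(text: str, chunk_size: int = 25, overlap: int = 0) -> list[str]:
    """Split text into chunks of at most chunk_size characters.

    Hypothetical helper for illustration only. Neighboring chunks share
    `overlap` characters, so context at a boundary is not lost entirely.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    step = chunk_size - overlap  # how far the window moves each time
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk already reaches the end of the text,
        # so we never emit a trailing chunk made only of overlap.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


if __name__ == "__main__":
    for chunk in chunk_text("abcdefghij", chunk_size=4, overlap=2):
        print(repr(chunk))
```

With `chunk_size=4` and `overlap=0`, the string `"abcdefghij"` becomes `"abcd"`, `"efgh"`, `"ij"`; with `overlap=2` it becomes `"abcd"`, `"cdef"`, `"efgh"`, `"ghij"`. Character counts ignore sentence and paragraph boundaries, which is one reason other strategies exist.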
Pick different chunking strategies above to see how they impact the text. Add your own text if you'd like.
You'll see different colors that represent different chunks.