LEGO color themes as topic models

Last updated: 04-15-2018

So I’m back to the LEGO dataset. In a previous post, a plot of the relative frequency of LEGO colors showed that, although the overall range of colors is wide, just a few colors account for the majority of bricks. Text behaves in a similar way: common words (articles and prepositions, for example) occur frequently, but they add little to a statistical understanding of the text.

In this post, I borrow a few text-mining techniques to explore the color themes of LEGO sets.

