Classifying Blood Bowl teams using clustered heatmaps
Posted on October 31, 2022 by Statistics | Gertjan Verhoeven in R bloggers
[This article was first published on Statistics | Gertjan Verhoeven, and kindly contributed to R-bloggers.]
(Photo by Erik Cats)
“If you graph the numbers of any system, patterns emerge” is one of my favorite movie quotes (from Darren Aronofsky’s cult movie about mathematics, \(\pi\) (“pi”)). In this post we’ll graph the numbers from the Blood Bowl fantasy football game and see what patterns emerge. Blood Bowl is a board game that can be summarized as “fantasy-chess-with-dice”, but this would hardly do the game justice. For example, in chess both players play with the same pieces, but in Blood Bowl almost 30 different teams (e.g. orcs and elves) are available to choose from, each with different skills that require a different playing style. In addition, Blood Bowl coaches must assemble and paint their playing pieces themselves, making it a creative hobby as well.
For this blog post, we have a look at similarities and differences between the different Blood Bowl 2020 teams, and see where the newly introduced Black Orc and Khorne teams fit in. Using data analysis, we can cluster teams that have similar (average) match performance statistics and graph the data using heatmaps.
Heatmaps are a graphical representation of the data in which, for example, darker colors represent higher numbers. This allows patterns to emerge visually, and deviations from those patterns are also easy to spot. (Fun fact: a hundred years ago, people already “shaded matrices” but did not yet call the result a heatmap (Wilkinson and Friendly 2009).)
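To make the idea concrete, here is a minimal sketch in base R of clustering a few teams on their statistics and drawing a clustered heatmap. The team names, the two statistics and all the numbers are invented for illustration; they are not FUMBBL data, and this is not necessarily how the plots in this post were made.

```r
# Toy matrix: rows are (hypothetical) teams, columns are (hypothetical) match stats.
# All values are made up for illustration only.
stats <- matrix(
  c(3.0, 0.5,
    2.8, 0.6,
    0.9, 2.1,
    1.0, 2.3),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("team_a", "team_b", "team_c", "team_d"),
                  c("stat_blocks", "stat_passes"))
)

# Standardize each column so both statistics count equally in the distance
scaled <- scale(stats)

# Hierarchical clustering on Euclidean distances between teams
hc <- hclust(dist(scaled), method = "complete")

# Cut the tree into two groups of similar teams
groups <- cutree(hc, k = 2)

# Draw the heatmap with rows ordered by the dendrogram (interactive sessions only)
if (interactive()) {
  heatmap(scaled, Rowv = as.dendrogram(hc), Colv = NA, scale = "none")
}
```

Teams that end up next to each other in the dendrogram have similar rows in the heatmap, which is what makes the visual patterns easy to spot.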
We use match performance data from FUMBBL.com, where Blood Bowl 2020 can be played online. A previous blog post describes the process of scraping the data. I made the match data (currently from August 2020 up to June 2022) publicly available in a GitHub repository.
Team play style categories
The most common way to classify the 25-30 different Blood Bowl teams is to distinguish four categories:
Bash (e.g. Orcs)
Agile (or Dash) (e.g. Wood Elf)
Hybrid (e.g. Humans)
Stunty (e.g. Halflings)
To formally classify a team we can use the following decision tree, taking as input the most common roster choices for a team:
Does the team roster have a lot of Stunty players and a few Big Guys with a negatrait?
yes: classify as Stunty
no: continue
Does the team roster have 4+ players with Strength skill access but < 4 players with Agility access?
yes: classify as Bash
no: continue
Does the team roster have 4+ players with Agility skill access but < 4 players with Strength access?
yes: classify as Agile / Dash
no: classify as Hybrid
As an example, Shambling Undead are typically played with 2 wights and 2 blitzers (4 players with Strength access), but also with four Ghoul runners (4 players with Agility access), so this team is classified as Hybrid.
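The decision tree can be written down as a small R function. This is only a sketch: the function name and its arguments are my own invention here, and because “a lot of Stunty players and a few Big Guys” is a judgment call, the first question is passed in as a simple TRUE/FALSE flag.

```r
# Hypothetical helper, not part of the original analysis code.
# is_stunty_roster: TRUE if the roster is mostly Stunty players plus a few Big Guys
# n_strength:       number of players with Strength skill access
# n_agility:        number of players with Agility skill access
classify_team <- function(is_stunty_roster, n_strength, n_agility) {
  if (is_stunty_roster) {
    return("Stunty")
  }
  if (n_strength >= 4 && n_agility < 4) {
    return("Bash")
  }
  if (n_agility >= 4 && n_strength < 4) {
    return("Agile")
  }
  "Hybrid"
}

# The Shambling Undead example: 4 players with Strength access, 4 with Agility access
classify_team(FALSE, n_strength = 4, n_agility = 4)  # "Hybrid"
```

Note that a roster with fewer than four players of either kind also falls through to Hybrid, exactly as in the decision tree.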
The main source for this scheme is (Dode 2017), but the bash/dash/hybrid/stunty categorization is widespread and also appears in, for example, (Breidr 2015), (Amiral 2017) and (Schlice 2018).
It will be interesting to compare the patterns in the data with this categorization. In the next section we’ll discuss the various match statistics available from FUMBBL, but first we need to prep the data.
Read and prep the data
We start by reading in the scraped FUMBBL match data; see the previous blog post mentioned above for details.
# Load packages
library(tidyverse)
library(ggfortify)
library(ggrepel)

df_mbt [...] %>%
  filter(race_type != "") %>%
  arrange(race_type))
This blog post focuses on the Blood Bowl 2020 ruleset, for which we need the “Competitive” division from FUMBBL. (I also performed the analysis for the older divisions, which use the 2016 ruleset; the plots can be found at the end of this blog post.)
A blog post by Schlice (Schlice 2018) got me interested in Blood Bowl team classification using data. In his post, he makes heavy use of functional programming with R’s purrr package. This allows us to write a function once, apply it to every element of a list of objects, and get the results back in list form as well.
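For readers who, like me, are new to purrr, here is a minimal illustration of that pattern using map(). The list and its contents are made up; the only point is that the same function is applied to each element and the names of the list are carried over to the result.

```r
library(purrr)

# A named list of toy inputs (invented for illustration)
numbers <- list(a = 1:3, b = 4:6)

# Apply sum() to every element; the result is again a named list
sums <- map(numbers, sum)

# The typed variant map_dbl() returns a named numeric vector instead of a list
means <- map_dbl(numbers, mean)
```

Here `sums$a` holds the sum of the first element and `means[["b"]]` the mean of the second, so results can be looked up by the same names as the inputs.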
As this was new to me, I decided to adapt his code to process the four divisions simultaneously. To do so, I wrote a function filter_division() that takes the source data and selects only matches from a particular division:
divisions [...] %>%
  filter(race_name != "Treeman") %>%
  filter(race_name != "Simyin")
}

data_tables [...] %>%
  left_join(yorder, by = "variable")

df_long
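To show the overall shape of this approach, here is a self-contained sketch of a function like filter_division() being mapped over several divisions. It is not the original code: the toy data frame, the column name `division_name`, and the division labels other than “Competitive” are assumptions made purely for illustration.

```r
library(dplyr)
library(purrr)

# Toy stand-in for the match data; columns and values are invented
df_toy <- data.frame(
  match_id = 1:5,
  division_name = c("Competitive", "Division B", "Competitive",
                    "Division C", "Division B")
)

# Keep only the matches from one division
# (assumes the division is stored in a column called division_name)
filter_division <- function(data, division_wanted) {
  data %>% filter(division_name == division_wanted)
}

# Placeholder division labels; only "Competitive" is taken from the text
division_names <- c("Competitive", "Division B", "Division C")

# One filtered table per division, returned as a named list
tables_toy <- division_names %>%
  set_names() %>%
  map(~ filter_division(df_toy, .x))
```

Because the result is a named list, `tables_toy$Competitive` gives the matches for that division, and any later step can be applied to all divisions at once with another map() call.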
