100 Toasty Tofu(s) – Another Triple J Hottest 100 Predictor

Update: Think you can do better than my prediction? Prove it by filling out your prediction here: Triple J Hottest 100 Prediction tracker submission. Also, you can look at the leaderboard of predictions over here.

100 Toasty Tofu(s) is another Triple J Hottest 100 Predictor, made for your entertainment with no guarantees whatsoever.

Since 2012, various people have been predicting the Hottest 100 using social media scrapes and OCR. This started with The Warmest 100 and was continued by 100 Warm Tunas. I’ve long thought it’s an awesome experiment because the conditions are good for using social media as a predictor. Two factors make this work: the average person is willing to share their Hottest 100 votes, and the stakes are so low, unlike political elections, that there aren’t hordes of true believers/trolls/Russian government agents trying to manipulate public sentiment.

I use instagram-scraper to scrape the hashtags (the same as 100 Warm Tunas) and then a Python script that uses Tesseract OCR to convert the images to text. The text is then matched against the Triple J song list (PDF) and saved. I removed any duplicate votes I found, that is, images listing the same songs in the same order where there are more than 3 songs in the image (very unlikely to happen by chance). I figure these are probably the same person uploading the same image twice.
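To make the duplicate rule concrete, here is a minimal sketch of how it could be applied once each image has been reduced to an ordered list of matched songs. This is not the actual script: the function names (`dedupe_ballots`, `tally`), the `min_songs` parameter and the choice to count a song at most once per image are all assumptions made for illustration.

```python
from collections import Counter


def dedupe_ballots(ballots, min_songs=4):
    """Drop repeat ballots (hypothetical helper).

    A ballot is the ordered list of songs matched in one image. A ballot
    is treated as a duplicate only if an identical ordered list was seen
    before AND it has more than 3 songs (min_songs=4), since short lists
    can plausibly repeat by chance.
    """
    seen = set()
    unique = []
    duplicates = 0
    for ballot in ballots:
        key = tuple(ballot)  # order matters, so keep it as a tuple
        if len(key) >= min_songs and key in seen:
            duplicates += 1
            continue
        seen.add(key)
        unique.append(ballot)
    return unique, duplicates


def tally(ballots):
    """Count votes per song, counting a song once per ballot (assumption)."""
    counts = Counter()
    for ballot in ballots:
        counts.update(set(ballot))
    return counts
```

Calling `dedupe_ballots` on the full list of ballots gives the de-duplicated ballots plus a duplicate count, and `tally` on each version would give the two vote columns (with and without dupes).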

This is an initial cut. There’s still some extra work to do, including:

  • Manually add songs that would be in the hottest 100 to the song list
  • Tune the OCR, including doing some pre-processing to images if needed
  • Tune the matching algorithm – currently using Levenshtein distance
  • Do more analysis on voting combinations (e.g. are there factions who vote for particular songs together, and what can we learn from this?).
  • Make the table pretty like the other ones.
  • Make a form for people to upload their own predictions and show a leaderboard as they come in on the 27th.
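On the matching item above: a rough sketch of what Levenshtein-based matching of an OCR line against the song list could look like. The real script may well use a library and different thresholds; the `best_match` helper, the `max_ratio` cut-off of 0.3 and the lower-casing step are assumptions for illustration only.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def best_match(ocr_line, song_list, max_ratio=0.3):
    """Return the closest song-list entry, or None if nothing is close.

    max_ratio (assumed value) is the largest allowed edit distance as a
    fraction of the longer string, so unrelated OCR text is rejected.
    """
    text = ocr_line.strip().lower()
    if not text:
        return None
    best, best_dist = None, None
    for song in song_list:
        dist = levenshtein(text, song.lower())
        if best_dist is None or dist < best_dist:
            best, best_dist = song, dist
    if best is None:
        return None
    if best_dist / max(len(text), len(best)) > max_ratio:
        return None
    return best
```

Tuning then mostly comes down to the cut-off: too strict and OCR typos drop real votes, too loose and captions or timestamps get matched to songs.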

The results are quite different to 100 Warm Tunas: I seem to be picking up more votes. I’m not sure if this is due to some sort of filtering I’m not doing or just algorithm differences, but we will see on January 27 whether 100 Warm Tunas is still the internet’s most accurate prediction of Triple J’s Hottest 100 for 2017!

This table is updated automatically every few hours.
Total number of images: loading…
Total number of duplicates: loading…
Total number of votes: loading…

# Title Artist Votes % Votes Inc dupes %
Loading… Loading… Loading… Loading… Loading… Loading… Loading…
