<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>python Archives - JasonGi</title>
	<atom:link href="https://jasongi.com/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>https://jasongi.com/tag/python/</link>
	<description>Jason Giancono</description>
	<lastBuildDate>Mon, 04 Mar 2024 13:36:15 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>
	hourly	</sy:updatePeriod>
	<sy:updateFrequency>
	1	</sy:updateFrequency>
	<generator>https://wordpress.org/?v=6.9.4</generator>

<image>
	<url>https://i0.wp.com/jasongi.com/wp-content/uploads/2024/03/cropped-jg-2.png?fit=32%2C32&#038;ssl=1</url>
	<title>python Archives - JasonGi</title>
	<link>https://jasongi.com/tag/python/</link>
	<width>32</width>
	<height>32</height>
</image> 
<site xmlns="com-wordpress:feed-additions:1">56842507</site>	<item>
		<title>Speed up Django&#8217;s collectstatic command with Collectfasta</title>
		<link>https://jasongi.com/2024/03/04/speed-up-djangos-collectstatic-command-with-collectfasta/</link>
					<comments>https://jasongi.com/2024/03/04/speed-up-djangos-collectstatic-command-with-collectfasta/#respond</comments>
		
		<dc:creator><![CDATA[jasongi]]></dc:creator>
		<pubDate>Mon, 04 Mar 2024 13:31:31 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[collectfasta]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>
		<guid isPermaLink="false">https://jasongi.com/?p=7505</guid>

					<description><![CDATA[<p>Django&#8217;s collectstatic command (added in Django 1.3 &#8211; March 23, 2011) was designed for storage backends where file retrieval was cheap because it was on your local disk. In Django 1.4 (March 23, 2012) Django introduced CachedStaticFilesStorage which would append md5 hashes to the end of files so that you could have multiple versions of &#8230; <a href="https://jasongi.com/2024/03/04/speed-up-djangos-collectstatic-command-with-collectfasta/" class="more-link">Continue reading<span class="screen-reader-text"> "Speed up Django&#8217;s collectstatic command with Collectfasta"</span></a></p>
<p>The post <a href="https://jasongi.com/2024/03/04/speed-up-djangos-collectstatic-command-with-collectfasta/">Speed up Django&#8217;s collectstatic command with Collectfasta</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></description>
										<content:encoded><![CDATA[
<p>Django&#8217;s <code>collectstatic</code> command (<a href="https://docs.djangoproject.com/en/5.0/releases/1.3/#extended-static-files-handling" target="_blank" rel="noreferrer noopener">added in Django 1.3 </a>&#8211; March 23, 2011) was designed for storage backends where file retrieval was cheap because it was on your local disk. </p>



<p>In <a href="https://docs.djangoproject.com/en/5.0/releases/1.4/#cachedstaticfilesstorage-storage-backend" target="_blank" rel="noreferrer noopener">Django 1.4</a> (March 23, 2012) Django introduced <code>CachedStaticFilesStorage</code>, which adds md5 hashes to filenames so that multiple versions of a file can stick around while you do a blue/green deployment. It also meant you could put your app in front of a CDN, and the filename hashes would ensure that when a file changed, so did the cache key. This meant you didn&#8217;t need to worry about invalidating the CDN assets or users&#8217; browser caches.</p>
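


<p>As a rough illustration of the hashed-filename idea (a minimal sketch, not Django&#8217;s actual code): the helper name <code>hashed_name</code> is hypothetical, and the 12-character digest placed before the extension is an assumption modelled on how Django&#8217;s hashed names usually look.</p>

```python
import hashlib
import posixpath


def hashed_name(name, content):
    """Return `name` with a short md5 digest of `content` before the extension."""
    digest = hashlib.md5(content).hexdigest()[:12]  # assumed digest length
    root, ext = posixpath.splitext(name)
    return f"{root}.{digest}{ext}"


# Different content yields a different filename, and so a different cache key.
old = hashed_name("css/site.css", b"body { color: red }")
new = hashed_name("css/site.css", b"body { color: blue }")
```

<p>Because <code>old</code> and <code>new</code> differ, both versions can sit side by side in the bucket or CDN during a deployment.</p>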



<p>Later on (<a href="https://docs.djangoproject.com/en/5.0/releases/1.7/#django-contrib-staticfiles" target="_blank" rel="noreferrer noopener">Django 1.7 &#8211; September 2, 2014</a>) we got <code>ManifestStaticFilesStorage</code>, which stores the filenames in a JSON file, assisting with hosting on remote storage like S3.</p>
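


<p>To show why a manifest helps with remote storage, here is a small sketch of a lookup against such a JSON file. The manifest contents and the <code>resolve</code> helper are made up for illustration; the <code>paths</code> key and the fall-back to the original name are simplifying assumptions, not a description of Django&#8217;s exact behaviour.</p>

```python
import json

# Hypothetical manifest contents; a real manifest is written by collectstatic.
manifest_text = json.dumps(
    {"paths": {"css/site.css": "css/site.55e7cbb9ba48.css"}}
)


def resolve(name, manifest_text):
    """Look up the hashed name locally, with no request to remote storage."""
    paths = json.loads(manifest_text).get("paths", {})
    # Assumption for this sketch: unknown names fall back to the original.
    return paths.get(name, name)
```

<p>The point is that resolving a name needs only the manifest, so no request has to be made to S3 to work out which hashed file to serve.</p>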



<p>The original <code>django-storages</code> is even older than <code>collectstatic</code> &#8211; <a href="https://github.com/jschneier/django-storages/commit/020143f5d4b128b5e02ea2003d5da4ccb7067f48" target="_blank" rel="noreferrer noopener">the initial commit</a> was on Jun 12, 2008. Its purpose was to provide a storage backend for AWS S3, which has since taken over the world. It also provides <code>S3ManifestStaticStorage</code>, which is great for static file serving &#8211; you don&#8217;t even need to set up a static web server to serve them &#8211; they can come straight from the bucket or CDN.</p>



<p>The big problem with all of this is that running <code>collectstatic</code> on S3-based storage is painfully <strong>slow</strong>. This is especially true of hashing storage, which uses the post-process hook to modify and re-upload files to update file references (which can then trigger further updates). There used to be a solution to this &#8211; <a href="https://github.com/antonagestam/collectfast" target="_blank" rel="noreferrer noopener">Collectfast</a> (released May 2013) was an awesome drop-in replacement for the <code>collectstatic</code> management command which would auto-magically speed things up. Unfortunately, it <a href="https://github.com/antonagestam/collectfast/issues/212">has been archived and is no longer maintained</a> &#8211; the last release being in 2020. Waiting for <code>collectstatic</code> to run has become tiring.</p>



<p>I&#8217;ve spent the past few weekends forking the original <code>Collectfast</code>, trying to get the repo up-to-date and working again. It has been an interesting challenge and I&#8217;ve finally got it to a state where I am happy with the performance improvements it provides over the Django command and am confident it works. Introducing&#8230; <strong><a href="https://github.com/jasongi/collectfasta">Collectfasta</a></strong> &#8211; an updated fork of Collectfast &#8211; even faster than before.</p>



<h2 class="wp-block-heading">What&#8217;s new in Collectfasta?</h2>



<h3 class="wp-block-heading">You can now run all tests without connecting to cloud services</h3>



<p>One of the reasons <code>Collectfast</code> was archived was because it was difficult to find a new maintainer, as most tests, specifically the &#8216;live tests&#8217;, required real Google Cloud Platform (GCP) and AWS credentials for execution.</p>



<p>I have now set up popular mocking tools <a href="https://www.localstack.cloud/">LocalStack</a> and <a href="https://github.com/fsouza/fake-gcs-server">fake-gcs-server</a> to allow these tests to run without any AWS or GCP credentials. This has also opened up a new avenue of testing since you can run these mocks for free: testing for performance on many files rather than just a single file. I&#8217;m observing performance improvements of 5x-10x with local mocks, and these improvements are even more significant with remote APIs.</p>



<p>I&#8217;ve kept both the live tests and the docker tests running on master for better coverage.</p>



<h3 class="wp-block-heading">AWS_PRELOAD_METADATA reimplemented</h3>



<p><code>AWS_PRELOAD_METADATA</code> was <a href="https://github.com/jschneier/django-storages/pull/636">removed in django-storages 1.10 (2020-08-30)</a>, and hard-coding <code>preload_metadata = True</code> was a key performance optimisation that <code>collectfast</code> made in the boto3 strategy. The reason was straightforward: during <code>collectstatic</code> the <code>exists</code> method checks if a file already exists. This is fine when <code>exists</code> is cheap &#8211; but for <code>S3Storage</code>, <code>exists</code> will do a <a href="https://docs.aws.amazon.com/AmazonS3/latest/API/API_HeadObject.html">HeadObject</a> request to the S3 API every time, for every file.</p>



<div class="wp-block-group"><div class="wp-block-group__inner-container is-layout-constrained wp-block-group-is-layout-constrained">
<div class="wp-block-group is-vertical is-layout-flex wp-container-core-group-is-layout-8cf370e7 wp-block-group-is-layout-flex">
<p>In contrast, when <code>preload_metadata</code> was working:</p>



<ol class="wp-block-list">
<li>it would initially call <a href="https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjectsV2.html">ListObjectsV2</a> to see what is already there,</li>



<li>store the results in a <code>dict</code>,</li>



<li>then have <code>exists</code> check the <code>dict</code> first, returning <code>True</code> if the key exists &#8211; otherwise deferring to the original implementation.</li>
</ol>



<p>This significantly speeds up subsequent <code>collectstatic</code> runs on the same files, since you&#8217;re replacing hundreds of API calls with one.</p>
</div>
</div></div>



<p>Removing this feature from <code>django-storages</code> made sense &#8211; it&#8217;s not the kind of thing you want people enabling on a web server &#8211; because it will cause memory leaks and is not concurrent-safe. However, for a management command like <code>collectstatic</code> &#8211; concurrency doesn&#8217;t matter.</p>



<p>Re-implementing the functionality was nasty &#8211; I wrapped the storage object in my own storage subclass, overriding the key methods so that the preloaded data is kept up to date on <code>save</code>, <code>delete</code> etc. There&#8217;s surely a better pattern than what I ended up with &#8211; but I was optimising for replicating the removed logic rather than beautiful code &#8211; this is ripe for a refactor.</p>
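


<p>The preloading behaviour described above can be sketched roughly as follows. This is not the Collectfasta implementation: <code>FakeBucket</code> and <code>PreloadedStorage</code> are hypothetical stand-ins, with the bucket counting calls in place of real <code>ListObjectsV2</code> and <code>HeadObject</code> requests.</p>

```python
class FakeBucket:
    """Hypothetical stand-in for an S3 client that counts API calls."""

    def __init__(self, keys):
        self.keys = set(keys)
        self.list_calls = 0
        self.head_calls = 0

    def list_objects(self):
        # Stands in for one ListObjectsV2 request.
        self.list_calls += 1
        return sorted(self.keys)

    def head_object(self, key):
        # Stands in for one HeadObject request.
        self.head_calls += 1
        return key in self.keys


class PreloadedStorage:
    """Hypothetical wrapper that answers exists() from a preloaded dict."""

    def __init__(self, bucket):
        self.bucket = bucket
        self._entries = None  # loaded lazily on first use

    @property
    def entries(self):
        if self._entries is None:
            self._entries = {key: True for key in self.bucket.list_objects()}
        return self._entries

    def exists(self, name):
        if name in self.entries:
            return True
        # Not in the preloaded dict: defer to the per-file check.
        return self.bucket.head_object(name)

    def save(self, name):
        self.bucket.keys.add(name)
        self.entries[name] = True  # keep the preloaded data current

    def delete(self, name):
        self.bucket.keys.discard(name)
        self.entries.pop(name, None)  # keep the preloaded data current


bucket = FakeBucket(["a.css", "b.js"])
storage = PreloadedStorage(bucket)
hits = [storage.exists("a.css"), storage.exists("b.js")]
```

<p>In this sketch both lookups are answered from the dict after a single listing call, which is the saving that matters on repeat runs over the same files.</p>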



<h3 class="wp-block-heading">The two-pass strategy</h3>



<p>After I got the <code>preload_metadata</code> working again, I found that my code was still pretty slow. The culprit was the multiple post-processing hashing passes that occur when the files reference each other. It confused me a lot because there are comments in ManifestFilesMixin that specifically mention consideration for S3:</p>



<blockquote class="wp-block-quote is-layout-flow wp-block-quote-is-layout-flow"><div class="wp-block-syntaxhighlighter-code "><pre class="brush: plain; title: ; notranslate">
            # use the original, local file, not the copied-but-unprocessed
            # file, which might be somewhere far away, like S3
            storage, path = paths&#x5B;name]
</pre></div><cite><a href="https://github.com/django/django/blame/3d4fe39bac082b835a2d82b717b6ae88ea70ea15/django/contrib/staticfiles/storage.py#L347">django/contrib/staticfiles/storage.py#L341</a></cite></blockquote>



<p>Upon further investigation, I discovered the cause was worse than I thought. Staticfiles does an <code>exists</code> check here on <a href="https://github.com/django/django/blame/3d4fe39bac082b835a2d82b717b6ae88ea70ea15/django/contrib/staticfiles/storage.py#L358">L358</a> and then deletes the file that exists on <a href="https://github.com/django/django/blame/3d4fe39bac082b835a2d82b717b6ae88ea70ea15/django/contrib/staticfiles/storage.py#L378C24-L378C42">L378</a>, which means we need to re-upload it &#8211; this happens when there are references between the static files. As a result, the system re-uploads these files every time, even with the <code>preload_metadata</code> optimisations. I wanted to find a better way.</p>



<p>I thought of a simple solution: a two-pass strategy. It works by running <code>collectstatic</code> using the <code>InMemoryStorage</code> or <code>FileSystemStorage</code> mixed in with <code>ManifestFilesMixin</code>. This means all the post-processing happens locally. Then for the second pass, we just iterate over the storage used in the first pass and copy the files as-is to S3. It is still quite a bit slower than other strategies, because the first pass has to run every time. But the first pass is quite fast, and on subsequent runs the second pass copies 0 files if they haven&#8217;t changed. It also only does a single ListObjectsV2 call at the start, as we re-use the preload strategy for the second pass.</p>
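


<p>Here is a heavily simplified sketch of the two-pass idea, using plain dicts in place of real storages. The function names are hypothetical, and the first pass only handles one level of <code>url(...)</code> references in CSS, which is far less than Django&#8217;s post-processing does.</p>

```python
import hashlib
import posixpath
import re


def hashed_name(name, content):
    """Insert a short md5 digest of the content before the extension."""
    digest = hashlib.md5(content.encode()).hexdigest()[:12]
    root, ext = posixpath.splitext(name)
    return f"{root}.{digest}{ext}"


def local_pass(sources):
    """First pass: do all hashing and reference rewriting in memory."""
    processed = {}
    renamed = {}
    # Hash the files that reference nothing (everything except CSS here).
    for name, content in sources.items():
        if not name.endswith(".css"):
            renamed[name] = hashed_name(name, content)
            processed[renamed[name]] = content
    # Rewrite url(...) references in CSS, then hash the rewritten result.
    for name, content in sources.items():
        if name.endswith(".css"):
            rewritten = re.sub(
                r"url\(([^)]+)\)",
                lambda m: f"url({renamed.get(m.group(1), m.group(1))})",
                content,
            )
            processed[hashed_name(name, rewritten)] = rewritten
    return processed


def remote_pass(processed, remote):
    """Second pass: upload only the files the remote does not already have."""
    existing = set(remote)  # stands in for a single ListObjectsV2 call
    uploaded = [name for name in processed if name not in existing]
    for name in uploaded:
        remote[name] = processed[name]
    return uploaded


sources = {"logo.png": "PNGDATA", "site.css": "body { background: url(logo.png) }"}
remote = {}
first_run = remote_pass(local_pass(sources), remote)
second_run = remote_pass(local_pass(sources), remote)
```

<p>In this sketch the first run uploads both files and the second run uploads nothing, because every hashed name is already present on the remote side.</p>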



<h3 class="wp-block-heading">What needs work?</h3>



<ol class="wp-block-list">
<li>The tests could be refactored to be a bit simpler &#8211; as raised in <a href="https://github.com/antonagestam/collectfast/issues/217">#217</a></li>



<li>The two-pass strategy only works for AWS &#8211; the Google version doesn&#8217;t even have a manifest files version in <code>django-storages</code></li>



<li>I haven&#8217;t touched the filesystem strategies at all &#8211; but in my experience filesystem storages are usually fast anyway. Potentially they (and the threading vars) could be removed &#8211; the main bottleneck I think has always been network requests.</li>



<li>I fought the current Strategy abstraction quite a bit &#8211; especially for two-pass &#8211; there&#8217;s an opportunity to refactor this to something simpler.</li>
</ol>



<p>PRs are accepted / encouraged &#8211; <a href="https://github.com/jasongi/collectfasta">github.com/jasongi/collectfasta</a></p>



]]></content:encoded>
					
					<wfw:commentRss>https://jasongi.com/2024/03/04/speed-up-djangos-collectstatic-command-with-collectfasta/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">7505</post-id>	</item>
		<item>
		<title>Hottest 100 Predictions &#8211; A Comparison</title>
		<link>https://jasongi.com/2018/01/28/hottest-100-predictions-a-comparison/</link>
					<comments>https://jasongi.com/2018/01/28/hottest-100-predictions-a-comparison/#respond</comments>
		
		<dc:creator><![CDATA[jasongi]]></dc:creator>
		<pubDate>Sun, 28 Jan 2018 11:49:32 +0000</pubDate>
				<category><![CDATA[Music]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[hottest 100]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[Triple J]]></category>
		<guid isPermaLink="false">http://jasongi.com/?p=3099</guid>

					<description><![CDATA[<p>This Hottest 100 I made a program to scrape Instagram for hottest 100 votes. I then collated the predictions from other programs (100 Warm Tunas and ZestfullyGreen&#8217;s Twitter scraper) and scored them based on performance, you can see the results here (I also opened this up to manual entries, one which outscored all the predictors). &#8230; <a href="https://jasongi.com/2018/01/28/hottest-100-predictions-a-comparison/" class="more-link">Continue reading<span class="screen-reader-text"> "Hottest 100 Predictions &#8211; A Comparison"</span></a></p>
<p>The post <a href="https://jasongi.com/2018/01/28/hottest-100-predictions-a-comparison/">Hottest 100 Predictions &#8211; A Comparison</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>This Hottest 100 I made a program to scrape Instagram for hottest 100 votes. I then collated the predictions from other programs (<a href="https://100-warm-tunas.nickwhyte.com/2017/">100 Warm Tunas</a> and <a href="https://www.reddit.com/r/triplej/comments/7sugnl/i_wrote_a_program_to_predict_the_outcome_of/">ZestfullyGreen&#8217;s Twitter scraper</a>) and scored them based on performance, <a href="https://jasongi.com/100-toasty-tofus-hottest-100-predictor/100-toasty-tofus-prediction-leaderboard/">you can see the results here</a> (I also opened this up to manual entries, one which outscored all the predictors).</p>
<p>I also decided to combine the results of the Twitter scraper and my Instagram scraper, which turned out to be a better predictor than any of them. Next year I will have to incorporate a Twitter scraper into my predictor.</p>
<p>Below is a summary of some interesting stats about the three automated prediction methods, plus the combination of 100 Toasty Tofu(s) and ZestfullyGreen&#8217;s Twitter scraper. I decided to take the results from ZestfullyGreen&#8217;s twitter scrape and add them to my results to see if this would be any better. I had a look at my predictions that included duplicate votes, however these performed worse than everything except the twitter prediction, so I have excluded them. This means my hypothesis on excluding duplicate votes (that they make the prediction less accurate) seems confirmed.</p>
<p>The final question that remains is, who truly is the internet&#8217;s most accurate Hottest 100 predictor? As you can see below, there isn&#8217;t really an answer for this. By my (somewhat arbitrary) scoring system, 100 Warm Tunas and I have very similar accuracy. I think we will have to wait until next year to really test them.</p>
<div style="overflow-x: auto;">
<table>
<tbody>
<tr class="table-heading">
<th style="padding-right: 8px;"></th>
<th>JG</th>
<th>100 Tunas</th>
<th>ZG</th>
<th>JG + ZG</th>
</tr>
<tr>
<td class="page-title">Points</td>
<td class="orange">7289/10000</td>
<td class="orange">7288/10000</td>
<td class="red">5679/10000</td>
<td class="green">7297/10000</td>
</tr>
<tr>
<td class="page-title">Number of Songs in Correct Position</td>
<td class="green">7/100</td>
<td class="orange">4/100</td>
<td class="red">1/100</td>
<td class="orange">3/100</td>
</tr>
<tr>
<td class="page-title">Number of Correct Songs in any Position</td>
<td class="green">83/100</td>
<td class="green">83/100</td>
<td class="red">70/100</td>
<td class="green">83/100</td>
</tr>
<tr>
<td class="page-title">Number of Correct Top 5 Songs in Correct Position</td>
<td class="green">2/5</td>
<td class="green">2/5</td>
<td class="red">1/5</td>
<td class="green">2/5</td>
</tr>
<tr>
<td class="page-title">Number of Correct Top 5 Songs in any Top 5 Position</td>
<td class="green">4/5</td>
<td class="green">4/5</td>
<td class="green">4/5</td>
<td class="green">4/5</td>
</tr>
<tr>
<td class="page-title">Number of Correct Top 10 Songs in Correct Position</td>
<td class="green">2/10</td>
<td class="green">2/10</td>
<td class="red">1/10</td>
<td class="green">2/10</td>
</tr>
<tr>
<td class="page-title">Number of Correct Top 10 Songs in any Top 10 Position</td>
<td class="green">8/10</td>
<td class="green">8/10</td>
<td class="red">5/10</td>
<td class="green">8/10</td>
</tr>
<tr>
<td class="page-title">Number of Correct Top 20 Songs in Correct Position</td>
<td class="orange">2/20</td>
<td class="green">3/20</td>
<td class="red">1/20</td>
<td class="orange">2/20</td>
</tr>
<tr>
<td class="page-title">Number of best predictions (see below)</td>
<td class="orange">45</td>
<td class="green">50</td>
<td class="red">34</td>
<td class="orange">45</td>
</tr>
<tr>
<td class="page-title">Number of worst predictions (see below)</td>
<td class="orange">16</td>
<td class="orange">20</td>
<td class="red">65</td>
<td class="green">15</td>
</tr>
<tr>
<td class="page-title">Number of Correct Top 20 Songs in any Top 20 Position</td>
<td class="green">16/20</td>
<td class="green">16/20</td>
<td class="red">11/20</td>
<td class="green">16/20</td>
</tr>
<tr>
<td class="page-title">Guessed #1?</td>
<td class="green">Yes</td>
<td class="green">Yes</td>
<td class="red">No</td>
<td class="green">Yes</td>
</tr>
</tbody>
</table>
</div>
<div style="overflow-x: auto;">
<h2>Song-by-song comparison of predictors</h2>
<table>
<tbody style="overflow-x: auto;">
<tr class="table-heading">
<th style="padding-right: 4px;">#</th>
<th style="padding-right: 4px;">JG</th>
<th style="padding-right: 4px;">100 Tunas</th>
<th style="padding-right: 4px;">ZG</th>
<th style="padding-right: 4px;">JG + ZG</th>
<th>Title</th>
<th>Artist</th>
</tr>
<tr>
<td style="font-weight: bold;">1</td>
<td class="green">1</td>
<td class="green">1</td>
<td class="red">2</td>
<td class="green">1</td>
<td>HUMBLE.</td>
<td>Kendrick Lamar</td>
</tr>
<tr>
<td style="font-weight: bold;">2</td>
<td class="green">2</td>
<td class="orange">3</td>
<td class="red">4</td>
<td class="green">2</td>
<td>Let Me Down Easy</td>
<td>Gang Of Youths</td>
</tr>
<tr>
<td style="font-weight: bold;">3</td>
<td class="green">6</td>
<td class="green">6</td>
<td class="red">25</td>
<td class="green">6</td>
<td>Chateau</td>
<td>Angus &amp; Julia Stone</td>
</tr>
<tr>
<td style="font-weight: bold;">4</td>
<td class="red">3</td>
<td class="green">4</td>
<td class="red">3</td>
<td class="red">3</td>
<td>Ubu</td>
<td>Methyl Ethel</td>
</tr>
<tr>
<td style="font-weight: bold;">5</td>
<td class="orange">4</td>
<td class="red">2</td>
<td class="green">5</td>
<td class="orange">4</td>
<td>The Deepest Sighs, The Frankest Shadows</td>
<td>Gang Of Youths</td>
</tr>
<tr>
<td style="font-weight: bold;">6</td>
<td class="orange">10</td>
<td class="green">8</td>
<td class="red">1</td>
<td class="orange">10</td>
<td>Green Light</td>
<td>Lorde</td>
</tr>
<tr>
<td style="font-weight: bold;">7</td>
<td class="green">5</td>
<td class="green">5</td>
<td class="red">13</td>
<td class="green">5</td>
<td>Go Bang</td>
<td>PNAU</td>
</tr>
<tr>
<td style="font-weight: bold;">8</td>
<td class="orange">11</td>
<td class="green">10</td>
<td class="red">43</td>
<td class="orange">11</td>
<td>Sally {Ft. Mataya}</td>
<td>Thundamentals</td>
</tr>
<tr>
<td style="font-weight: bold;">9</td>
<td class="orange">16</td>
<td class="green">15</td>
<td class="red">33</td>
<td class="orange">16</td>
<td>Lay It On Me</td>
<td>Vance Joy</td>
</tr>
<tr>
<td style="font-weight: bold;">10</td>
<td class="green">9</td>
<td class="orange">13</td>
<td class="red">14</td>
<td class="green">9</td>
<td>What Can I Do If The Fire Goes Out?</td>
<td>Gang Of Youths</td>
</tr>
<tr>
<td style="font-weight: bold;">11</td>
<td class="green">7</td>
<td class="green">7</td>
<td class="red">29</td>
<td class="green">7</td>
<td>SWEET</td>
<td>BROCKHAMPTON</td>
</tr>
<tr>
<td style="font-weight: bold;">12</td>
<td class="green">15</td>
<td class="orange">16</td>
<td class="red">39</td>
<td class="green">15</td>
<td>Fake Magic</td>
<td>Peking Duk &amp; AlunaGeorge</td>
</tr>
<tr>
<td style="font-weight: bold;">13</td>
<td class="green">23</td>
<td class="orange">24</td>
<td class="red">30</td>
<td class="green">23</td>
<td>Young Dumb &amp; Broke</td>
<td>Khalid</td>
</tr>
<tr>
<td style="font-weight: bold;">14</td>
<td class="orange">29</td>
<td class="red">30</td>
<td class="green">6</td>
<td class="orange">29</td>
<td>Homemade Dynamite</td>
<td>Lorde</td>
</tr>
<tr>
<td style="font-weight: bold;">15</td>
<td class="green">12</td>
<td class="orange">11</td>
<td class="red">24</td>
<td class="green">12</td>
<td>Regular Touch</td>
<td>Vera Blue</td>
</tr>
<tr>
<td style="font-weight: bold;">16</td>
<td class="green">30</td>
<td class="orange">32</td>
<td class="red">36</td>
<td class="green">30</td>
<td>Feel The Way I Do</td>
<td>Jungle Giants, The</td>
</tr>
<tr>
<td style="font-weight: bold;">17</td>
<td class="orange">13</td>
<td class="red">12</td>
<td class="green">20</td>
<td class="orange">13</td>
<td>Marryuna {Ft. Yirrmal}</td>
<td>Baker Boy</td>
</tr>
<tr>
<td style="font-weight: bold;">18</td>
<td class="green">14</td>
<td class="green">14</td>
<td class="red">9</td>
<td class="green">14</td>
<td>Exactly How You Are</td>
<td>Ball Park Music</td>
</tr>
<tr>
<td style="font-weight: bold;">19</td>
<td class="orange">17</td>
<td class="green">19</td>
<td class="red">15</td>
<td class="orange">17</td>
<td>The Man</td>
<td>Killers, The</td>
</tr>
<tr>
<td style="font-weight: bold;">20</td>
<td class="green">35</td>
<td class="orange">38</td>
<td class="red">59</td>
<td class="green">35</td>
<td>Let You Down {Ft. Icona Pop}</td>
<td>Peking Duk</td>
</tr>
<tr>
<td style="font-weight: bold;">21</td>
<td class="red">8</td>
<td class="orange">9</td>
<td class="green">22</td>
<td class="red">8</td>
<td>Birthdays</td>
<td>Smith Street Band, The</td>
</tr>
<tr>
<td style="font-weight: bold;">22</td>
<td class="green">26</td>
<td class="green">26</td>
<td class="red">27</td>
<td class="green">26</td>
<td>Lemon To A Knife Fight</td>
<td>Wombats, The</td>
</tr>
<tr>
<td style="font-weight: bold;">23</td>
<td class="green">19</td>
<td class="orange">18</td>
<td class="red">10</td>
<td class="green">19</td>
<td>Not Worth Hiding</td>
<td>Alex The Astronaut</td>
</tr>
<tr>
<td style="font-weight: bold;">24</td>
<td class="orange">78</td>
<td class="orange">86</td>
<td class="red">N/A</td>
<td class="green">77</td>
<td>rockstar {Ft. 21 Savage}</td>
<td>Post Malone</td>
</tr>
<tr>
<td style="font-weight: bold;">25</td>
<td class="red">34</td>
<td class="green">31</td>
<td class="orange">18</td>
<td class="orange">33</td>
<td>Weekends</td>
<td>Amy Shark</td>
</tr>
<tr>
<td style="font-weight: bold;">26</td>
<td class="red">39</td>
<td class="red">39</td>
<td class="green">23</td>
<td class="red">39</td>
<td>Feel It Still</td>
<td>Portugal. The Man</td>
</tr>
<tr>
<td style="font-weight: bold;">27</td>
<td class="orange">43</td>
<td class="green">41</td>
<td class="red">N/A</td>
<td class="orange">43</td>
<td>Be About You</td>
<td>Winston Surfshirt</td>
</tr>
<tr>
<td style="font-weight: bold;">28</td>
<td class="green">47</td>
<td class="orange">51</td>
<td class="red">76</td>
<td class="green">47</td>
<td>Mystik</td>
<td>Tash Sultana</td>
</tr>
<tr>
<td style="font-weight: bold;">29</td>
<td class="green">28</td>
<td class="orange">27</td>
<td class="red">37</td>
<td class="green">28</td>
<td>Mended</td>
<td>Vera Blue</td>
</tr>
<tr>
<td style="font-weight: bold;">30</td>
<td class="red">36</td>
<td class="orange">35</td>
<td class="green">26</td>
<td class="red">36</td>
<td>Low Blows</td>
<td>Meg Mac</td>
</tr>
<tr>
<td style="font-weight: bold;">31</td>
<td class="green">25</td>
<td class="green">25</td>
<td class="red">48</td>
<td class="green">25</td>
<td>Lay Down</td>
<td>Touch Sensitive</td>
</tr>
<tr>
<td style="font-weight: bold;">32</td>
<td class="orange">27</td>
<td class="green">28</td>
<td class="red">91</td>
<td class="orange">27</td>
<td>NUMB {Ft. GRAACE}</td>
<td>Hayden James</td>
</tr>
<tr>
<td style="font-weight: bold;">33</td>
<td class="orange">22</td>
<td class="green">23</td>
<td class="red">58</td>
<td class="orange">22</td>
<td>Slow Mover</td>
<td>Angie McMahon</td>
</tr>
<tr>
<td style="font-weight: bold;">34</td>
<td class="green">37</td>
<td class="green">37</td>
<td class="red">19</td>
<td class="green">37</td>
<td>DNA.</td>
<td>Kendrick Lamar</td>
</tr>
<tr>
<td style="font-weight: bold;">35</td>
<td class="red">51</td>
<td class="orange">46</td>
<td class="green">31</td>
<td class="red">51</td>
<td>Passionfruit</td>
<td>Drake</td>
</tr>
<tr>
<td style="font-weight: bold;">36</td>
<td class="green">18</td>
<td class="orange">17</td>
<td class="red">12</td>
<td class="green">18</td>
<td>I Haven&#8217;t Been Taking Care Of Myself</td>
<td>Alex Lahey</td>
</tr>
<tr>
<td style="font-weight: bold;">37</td>
<td class="orange">63</td>
<td class="red">70</td>
<td class="green">52</td>
<td class="orange">62</td>
<td>Slide {Ft. Frank Ocean/Migos}</td>
<td>Calvin Harris</td>
</tr>
<tr>
<td style="font-weight: bold;">38</td>
<td class="orange">46</td>
<td class="red">48</td>
<td class="green">34</td>
<td class="orange">46</td>
<td>Bellyache</td>
<td>Billie Eilish</td>
</tr>
<tr>
<td style="font-weight: bold;">39</td>
<td class="orange">53</td>
<td class="green">49</td>
<td class="red">N/A</td>
<td class="orange">52</td>
<td>Got On My Skateboard</td>
<td>Skegss</td>
</tr>
<tr>
<td style="font-weight: bold;">40</td>
<td class="orange">24</td>
<td class="red">21</td>
<td class="green">44</td>
<td class="orange">24</td>
<td>True Lovers</td>
<td>Holy Holy</td>
</tr>
<tr>
<td style="font-weight: bold;">41</td>
<td class="green">41</td>
<td class="orange">40</td>
<td class="red">35</td>
<td class="green">41</td>
<td>Blood {triple j Like A Version 2017}</td>
<td>Gang Of Youths</td>
</tr>
<tr>
<td style="font-weight: bold;">42</td>
<td class="orange">59</td>
<td class="green">56</td>
<td class="red">N/A</td>
<td class="orange">59</td>
<td>Cola</td>
<td>CamelPhat &amp; Elderbrook</td>
</tr>
<tr>
<td style="font-weight: bold;">43</td>
<td class="red">91</td>
<td class="green">74</td>
<td class="green">74</td>
<td class="red">91</td>
<td>Murder To The Mind</td>
<td>Tash Sultana</td>
</tr>
<tr>
<td style="font-weight: bold;">44</td>
<td class="orange">49</td>
<td class="red">50</td>
<td class="green">42</td>
<td class="orange">49</td>
<td>In Motion {Ft. Japanese Wallpaper}</td>
<td>Allday</td>
</tr>
<tr>
<td style="font-weight: bold;">45</td>
<td class="green">21</td>
<td class="orange">20</td>
<td class="red">7</td>
<td class="green">21</td>
<td>Every Day&#8217;s The Weekend</td>
<td>Alex Lahey</td>
</tr>
<tr>
<td style="font-weight: bold;">46</td>
<td class="orange">57</td>
<td class="green">54</td>
<td class="red">17</td>
<td class="orange">57</td>
<td>Better</td>
<td>Mallrat</td>
</tr>
<tr>
<td style="font-weight: bold;">47</td>
<td class="green">45</td>
<td class="orange">52</td>
<td class="red">16</td>
<td class="green">45</td>
<td>Want You Back</td>
<td>HAIM</td>
</tr>
<tr>
<td style="font-weight: bold;">48</td>
<td class="orange">54</td>
<td class="green">47</td>
<td class="red">N/A</td>
<td class="orange">53</td>
<td>The Comedown</td>
<td>Ocean Alley</td>
</tr>
<tr>
<td style="font-weight: bold;">49</td>
<td class="orange">33</td>
<td class="green">34</td>
<td class="red">82</td>
<td class="green">34</td>
<td>Passiona</td>
<td>Smith Street Band, The</td>
</tr>
<tr>
<td style="font-weight: bold;">50</td>
<td class="orange">77</td>
<td class="red">84</td>
<td class="red">84</td>
<td class="green">74</td>
<td>On Your Way Down</td>
<td>Jungle Giants, The</td>
</tr>
<tr>
<td style="font-weight: bold;">51</td>
<td class="red">N/A</td>
<td class="red">N/A</td>
<td class="green">56</td>
<td class="red">N/A</td>
<td>Man&#8217;s Not Hot</td>
<td>Big Shaq</td>
</tr>
<tr>
<td style="font-weight: bold;">52</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>Glorious {Ft. Skylar Grey}</td>
<td>Macklemore</td>
</tr>
<tr>
<td style="font-weight: bold;">53</td>
<td class="green">62</td>
<td class="orange">68</td>
<td class="red">87</td>
<td class="orange">63</td>
<td>Moments {Ft. Gavin James}</td>
<td>Bliss N Eso</td>
</tr>
<tr>
<td style="font-weight: bold;">54</td>
<td class="orange">50</td>
<td class="green">57</td>
<td class="red">N/A</td>
<td class="orange">50</td>
<td>Homely Feeling</td>
<td>Hockey Dad</td>
</tr>
<tr>
<td style="font-weight: bold;">55</td>
<td class="orange">42</td>
<td class="green">44</td>
<td class="red">N/A</td>
<td class="orange">42</td>
<td>6 Pack</td>
<td>Dune Rats</td>
</tr>
<tr>
<td style="font-weight: bold;">56</td>
<td class="orange">32</td>
<td class="red">29</td>
<td class="green">72</td>
<td class="orange">32</td>
<td>Watch Me Read You</td>
<td>Odette</td>
</tr>
<tr>
<td style="font-weight: bold;">57</td>
<td class="green">67</td>
<td class="green">67</td>
<td class="red">N/A</td>
<td class="green">67</td>
<td>Bad Dream</td>
<td>Jungle Giants, The</td>
</tr>
<tr>
<td style="font-weight: bold;">58</td>
<td class="orange">20</td>
<td class="green">22</td>
<td class="red">11</td>
<td class="orange">20</td>
<td>The Opener</td>
<td>Camp Cope</td>
</tr>
<tr>
<td style="font-weight: bold;">59</td>
<td class="orange">80</td>
<td class="green">79</td>
<td class="red">N/A</td>
<td class="orange">80</td>
<td>Used To Be In Love</td>
<td>Jungle Giants, The</td>
</tr>
<tr>
<td style="font-weight: bold;">60</td>
<td class="orange">69</td>
<td class="green">66</td>
<td class="red">8</td>
<td class="orange">69</td>
<td>Boys</td>
<td>Charli XCX</td>
</tr>
<tr>
<td style="font-weight: bold;">61</td>
<td class="green">73</td>
<td class="orange">77</td>
<td class="red">N/A</td>
<td class="green">73</td>
<td>21 Grams {Ft. Hilltop Hoods}</td>
<td>Thundamentals</td>
</tr>
<tr>
<td style="font-weight: bold;">62</td>
<td class="orange">92</td>
<td class="green">89</td>
<td class="red">N/A</td>
<td class="orange">92</td>
<td>Saved</td>
<td>Khalid</td>
</tr>
<tr>
<td style="font-weight: bold;">63</td>
<td class="orange">40</td>
<td class="green">43</td>
<td class="red">28</td>
<td class="orange">40</td>
<td>Life Goes On</td>
<td>E^ST</td>
</tr>
<tr>
<td style="font-weight: bold;">64</td>
<td class="green">60</td>
<td class="orange">58</td>
<td class="red">45</td>
<td class="green">60</td>
<td>Fool&#8217;s Gold</td>
<td>Jack River</td>
</tr>
<tr>
<td style="font-weight: bold;">65</td>
<td class="green">65</td>
<td class="orange">62</td>
<td class="red">38</td>
<td class="orange">64</td>
<td>Everything Now</td>
<td>Arcade Fire</td>
</tr>
<tr>
<td style="font-weight: bold;">66</td>
<td class="green">66</td>
<td class="orange">65</td>
<td class="red">93</td>
<td class="orange">65</td>
<td>Lemon</td>
<td>N.E.R.D. &amp; Rihanna</td>
</tr>
<tr>
<td style="font-weight: bold;">67</td>
<td class="green">38</td>
<td class="orange">36</td>
<td class="red">N/A</td>
<td class="green">38</td>
<td>Shred For Summer</td>
<td>DZ Deathrays</td>
</tr>
<tr>
<td style="font-weight: bold;">68</td>
<td class="orange">48</td>
<td class="red">45</td>
<td class="green">80</td>
<td class="orange">48</td>
<td>Golden</td>
<td>Kingswood</td>
</tr>
<tr>
<td style="font-weight: bold;">69</td>
<td class="green">44</td>
<td class="red">42</td>
<td class="red">96</td>
<td class="green">44</td>
<td>I Love You, Will You Marry Me</td>
<td>Yungblud</td>
</tr>
<tr>
<td style="font-weight: bold;">70</td>
<td class="red">31</td>
<td class="orange">33</td>
<td class="green">54</td>
<td class="red">31</td>
<td>Amsterdam</td>
<td>Nothing But Thieves</td>
</tr>
<tr>
<td style="font-weight: bold;">71</td>
<td class="red">N/A</td>
<td class="red">N/A</td>
<td class="green">21</td>
<td class="red">N/A</td>
<td>Perfect Places</td>
<td>Lorde</td>
</tr>
<tr>
<td style="font-weight: bold;">72</td>
<td class="red">88</td>
<td class="orange">85</td>
<td class="green">71</td>
<td class="red">88</td>
<td>In Cold Blood</td>
<td>alt-J</td>
</tr>
<tr>
<td style="font-weight: bold;">73</td>
<td class="orange">83</td>
<td class="green">64</td>
<td class="red">N/A</td>
<td class="green">82</td>
<td>Nuclear Fusion</td>
<td>King Gizzard &amp; The Lizard Wizard</td>
</tr>
<tr>
<td style="font-weight: bold;">74</td>
<td class="red">N/A</td>
<td class="red">N/A</td>
<td class="green">98</td>
<td class="red">N/A</td>
<td>XO TOUR Llif3</td>
<td>Lil Uzi Vert</td>
</tr>
<tr>
<td style="font-weight: bold;">75</td>
<td class="green">61</td>
<td class="orange">60</td>
<td class="red">N/A</td>
<td class="green">61</td>
<td>Braindead</td>
<td>Dune Rats</td>
</tr>
<tr>
<td style="font-weight: bold;">76</td>
<td class="green">76</td>
<td class="green">76</td>
<td class="red">N/A</td>
<td class="orange">75</td>
<td>Cloud 9 {Ft. Kian}</td>
<td>Baker Boy</td>
</tr>
<tr>
<td style="font-weight: bold;">77</td>
<td class="red">N/A</td>
<td class="orange">100</td>
<td class="green">66</td>
<td class="red">N/A</td>
<td>Million Man</td>
<td>Rubens, The</td>
</tr>
<tr>
<td style="font-weight: bold;">78</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>Electric Feel {triple j Like A Version 2017}</td>
<td>Tash Sultana</td>
</tr>
<tr>
<td style="font-weight: bold;">79</td>
<td class="red">N/A</td>
<td class="red">N/A</td>
<td class="green">69</td>
<td class="red">N/A</td>
<td>Hey, Did I Do You Wrong?</td>
<td>San Cisco</td>
</tr>
<tr>
<td style="font-weight: bold;">80</td>
<td class="green">90</td>
<td class="green">90</td>
<td class="red">61</td>
<td class="green">90</td>
<td>Say Something Loving</td>
<td>xx, The</td>
</tr>
<tr>
<td style="font-weight: bold;">81</td>
<td class="red">N/A</td>
<td class="red">N/A</td>
<td class="green">32</td>
<td class="red">N/A</td>
<td>Liability</td>
<td>Lorde</td>
</tr>
<tr>
<td style="font-weight: bold;">82</td>
<td class="red">N/A</td>
<td class="red">N/A</td>
<td class="green">46</td>
<td class="red">N/A</td>
<td>1-800-273-8255 {Ft. Alessia Cara/Khalid}</td>
<td>Logic</td>
</tr>
<tr>
<td style="font-weight: bold;">83</td>
<td class="orange">74</td>
<td class="orange">72</td>
<td class="red">60</td>
<td class="green">76</td>
<td>Blood Brothers</td>
<td>Amy Shark</td>
</tr>
<tr>
<td style="font-weight: bold;">84</td>
<td class="green">84</td>
<td class="orange">73</td>
<td class="red">N/A</td>
<td class="orange">85</td>
<td>Oceans</td>
<td>Vallis Alps</td>
</tr>
<tr>
<td style="font-weight: bold;">85</td>
<td class="orange">58</td>
<td class="green">59</td>
<td class="red">N/A</td>
<td class="orange">58</td>
<td>Does This Last</td>
<td>Boo Seeka</td>
</tr>
<tr>
<td style="font-weight: bold;">86</td>
<td class="orange">94</td>
<td class="green">91</td>
<td class="red">95</td>
<td class="orange">94</td>
<td>Maybe It&#8217;s My First Time</td>
<td>Meg Mac</td>
</tr>
<tr>
<td style="font-weight: bold;">87</td>
<td class="orange">72</td>
<td class="red">63</td>
<td class="green">78</td>
<td class="orange">71</td>
<td>The Way You Used To Do</td>
<td>Queens Of The Stone Age</td>
</tr>
<tr>
<td style="font-weight: bold;">88</td>
<td class="orange">56</td>
<td class="green">61</td>
<td class="red">N/A</td>
<td class="orange">56</td>
<td>Edge Of Town {triple j Like A Version 2017}</td>
<td>Paul Dempsey</td>
</tr>
<tr>
<td style="font-weight: bold;">89</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>Dawning</td>
<td>DMA&#8217;s</td>
</tr>
<tr>
<td style="font-weight: bold;">90</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>Hyperreal {Ft. Kučka}</td>
<td>Flume</td>
</tr>
<tr>
<td style="font-weight: bold;">91</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>Big For Your Boots</td>
<td>Stormzy</td>
</tr>
<tr>
<td style="font-weight: bold;">92</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>LOVE. {Ft. ZACARI}</td>
<td>Kendrick Lamar</td>
</tr>
<tr>
<td style="font-weight: bold;">93</td>
<td class="green">95</td>
<td class="green">95</td>
<td class="red">85</td>
<td class="orange">96</td>
<td>Do What You Want</td>
<td>Presets, The</td>
</tr>
<tr>
<td style="font-weight: bold;">94</td>
<td class="orange">99</td>
<td class="green">93</td>
<td class="red">N/A</td>
<td class="orange">98</td>
<td>Second Hand Car</td>
<td>Kim Churchill</td>
</tr>
<tr>
<td style="font-weight: bold;">95</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>Mask Off</td>
<td>Future</td>
</tr>
<tr>
<td style="font-weight: bold;">96</td>
<td class="orange">100</td>
<td class="green">97</td>
<td class="red">55</td>
<td class="orange">100</td>
<td>Chasin&#8217;</td>
<td>Cub Sport</td>
</tr>
<tr>
<td style="font-weight: bold;">97</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>LOYALTY. {Ft. RIHANNA}</td>
<td>Kendrick Lamar</td>
</tr>
<tr>
<td style="font-weight: bold;">98</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>Snow</td>
<td>Angus &amp; Julia Stone</td>
</tr>
<tr>
<td style="font-weight: bold;">99</td>
<td class="orange">64</td>
<td class="red">N/A</td>
<td class="red">N/A</td>
<td class="green">66</td>
<td>Arty Boy {Ft. Emma Louise}</td>
<td>Flight Facilities</td>
</tr>
<tr>
<td style="font-weight: bold;">100</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td class="green">N/A</td>
<td>Don&#8217;t Leave</td>
<td>Snakehips &amp; MØ</td>
</tr>
</tbody>
</table>
</div>
<p>The post <a href="https://jasongi.com/2018/01/28/hottest-100-predictions-a-comparison/">Hottest 100 Predictions &#8211; A Comparison</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://jasongi.com/2018/01/28/hottest-100-predictions-a-comparison/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">3099</post-id>	</item>
		<item>
		<title>100 Toasty Tofu(s) &#8211; Another Triple J Hottest 100 Predictor</title>
		<link>https://jasongi.com/2018/01/20/100-toasty-tofus-another-triple-j-hottest-100-predictor/</link>
					<comments>https://jasongi.com/2018/01/20/100-toasty-tofus-another-triple-j-hottest-100-predictor/#respond</comments>
		
		<dc:creator><![CDATA[jasongi]]></dc:creator>
		<pubDate>Sat, 20 Jan 2018 07:53:55 +0000</pubDate>
				<category><![CDATA[Music]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[hottest 100]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[Triple J]]></category>
		<guid isPermaLink="false">http://jasongi.com/?p=2863</guid>

					<description><![CDATA[<p>Update: Think you can do better than my prediction? Prove it by filling out your prediction here: Triple J Hottest 100 Prediction tracker submission. Also, you can look at the leaderboard of predictions over here. 100 Toasty Tofu(s) is another Triple J Hottest 100 Predictor, made for your entertainment with no guarantees what-so-ever. Since 2012, various &#8230; <a href="https://jasongi.com/2018/01/20/100-toasty-tofus-another-triple-j-hottest-100-predictor/" class="more-link">Continue reading<span class="screen-reader-text"> "100 Toasty Tofu(s) &#8211; Another Triple J Hottest 100 Predictor"</span></a></p>
<p>The post <a href="https://jasongi.com/2018/01/20/100-toasty-tofus-another-triple-j-hottest-100-predictor/">100 Toasty Tofu(s) &#8211; Another Triple J Hottest 100 Predictor</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p><strong>Update: </strong> Think you can do better than my prediction? Prove it by filling out your prediction here: <a href="https://jasongi.com/100-toasty-tofus-hottest-100-predictor/triple-j-hottest-100-prediction-tracker-submission/">Triple J Hottest 100 Prediction tracker submission.</a> Also, you can look at the leaderboard of predictions over <a href="https://jasongi.com/100-toasty-tofus-hottest-100-predictor/100-toasty-tofus-prediction-leaderboard/">here.</a></p>
<p>100 Toasty Tofu(s) is another <a href="http://www.abc.net.au/triplej/hottest100">Triple J Hottest 100</a> Predictor, made for your entertainment with no guarantees what-so-ever.</p>
<p>Since 2012, various people have been predicting the Hottest 100 using social media scrapes and OCR. This started with <a href="http://warmest100.com.au/2013/index.html">The Warmest 100</a> and was continued by <a href="https://100-warm-tunas.nickwhyte.com/2017/">100 Warm Tunas</a>. I&#8217;ve long thought it&#8217;s an awesome experiment because the conditions are good for using social media as a predictor. Two factors make this a good experiment &#8211; the average person is willing to share their Hottest 100 votes, and the stakes are so low, unlike political elections, that there aren&#8217;t hordes of true believers/trolls/Russian government agents trying to manipulate public sentiment.</p>
<p>I use <a href="https://github.com/rarcega/instagram-scraper">instagram-scraper</a> to scrape the hashtags (the same as 100 Warm Tunas) and then a Python script that uses <a href="https://github.com/tesseract-ocr/tesseract">Tesseract OCR</a> to convert the images to text. The text is then matched against the <a href="http://www.abc.net.au/triplej/hottest100/17/downloads/how-to-vote/Hottest_100_2017_Song_A-Z.pdf">Triple J song list (PDF)</a> and saved. I removed any duplicate votes I found, that is, images listing the same songs in the same order when there are more than 3 songs in the image (a very unlikely occurrence by chance). I figure these are probably the same person uploading the same image twice.</p>
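<p>A rough sketch of the duplicate check described above, assuming each image has already been turned into an ordered list of song IDs. The function name <code>split_duplicates</code> and the <code>min_songs</code> parameter are made up for illustration; this is not the actual script, just one way the rule could be written.</p>

```python
def split_duplicates(ballots, min_songs=4):
    """Separate repeated ballots from unique ones.

    Each ballot is a list of song IDs in the order they appeared in an
    image. A ballot is treated as a duplicate only when it has at least
    `min_songs` songs (i.e. more than 3) and an identical ordered ballot
    has already been seen. Shorter ballots are always kept, because two
    people could plausibly share the same couple of songs.
    """
    seen = set()
    unique, dupes = [], []
    for ballot in ballots:
        key = tuple(ballot)  # tuples are hashable and preserve order
        if len(key) >= min_songs and key in seen:
            dupes.append(ballot)
        else:
            seen.add(key)
            unique.append(ballot)
    return unique, dupes


if __name__ == "__main__":
    sample = [
        ["s1", "s2", "s3", "s4"],
        ["s1", "s2", "s3", "s4"],  # same songs, same order: duplicate
        ["s4", "s3", "s2", "s1"],  # same songs, different order: kept
        ["s1", "s2"],
        ["s1", "s2"],              # too short to call a duplicate: kept
    ]
    unique, dupes = split_duplicates(sample)
    print(len(unique), "unique,", len(dupes), "duplicate")
```

<p>Keeping the duplicates in their own list (rather than throwing them away) makes it possible to report both the de-duplicated counts and the counts including dupes.</p>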
<p>This is an initial cut; there&#8217;s still some extra work to do, including:</p>
<ul>
<li>Manually add songs that would be in the hottest 100 to the song list</li>
<li>Tune the OCR, including doing some pre-processing to images if needed</li>
<li>Tune the matching algorithm &#8211; currently using <a href="https://en.wikipedia.org/wiki/Levenshtein_distance">Levenshtein distance</a></li>
<li>Do more analysis on voting combinations (e.g. are there factions who vote for particular songs together, and what can we learn from this?).</li>
<li>Make the table pretty like the other ones.</li>
<li>Make a form for people to upload their own predictions and show a leaderboard as they come in on the 27th.</li>
</ul>
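<p>For anyone curious what Levenshtein-distance matching looks like, here is a minimal sketch in plain Python. The helper names (<code>levenshtein</code>, <code>best_match</code>) and the <code>max_ratio</code> cut-off of 0.3 are assumptions picked for the example, not values from my script.</p>

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def best_match(ocr_line, titles, max_ratio=0.3):
    """Return the title closest to an OCR'd line, or None if none is close.

    `max_ratio` is a hypothetical tolerance: the allowed number of edits
    as a fraction of the longer of the two strings.
    """
    text = ocr_line.strip().lower()
    if not text or not titles:
        return None
    best = min(titles, key=lambda title: levenshtein(text, title.lower()))
    distance = levenshtein(text, best.lower())
    if distance <= max_ratio * max(len(text), len(best)):
        return best
    return None


if __name__ == "__main__":
    titles = ["Homely Feeling", "Bad Dream", "6 Pack"]
    print(best_match("Homeiy Feelng", titles))  # a typical OCR slip
    print(best_match("#hottest100", titles))
```

<p>Scaling the cut-off by string length means short titles are not matched by almost anything, while long titles can tolerate a few misread characters.</p>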
<p>The results are quite different to 100 Warm Tunas &#8211; I seem to be picking up more votes. I&#8217;m not sure if this is due to some sort of filtering I&#8217;m not doing or just algorithm differences, but we will see if 100 Warm Tunas is still the internet&#8217;s most accurate prediction of Triple J&#8217;s Hottest 100 for 2017 on January 27!</p>
<p>This table is updated automatically every few hours.<br />
<strong>Total number of images</strong>: <span id="num-votes">loading&#8230;</span><br />
<strong>Total number of duplicates</strong>: <span id="num-dupes">loading&#8230;</span><br />
<strong>Total number of votes</strong>: <span id="num-individual-votes">loading&#8230;</span></p>
<div style="overflow-x: auto;">
<table>
<tbody id="hottest100table">
<tr class="table-heading">
<th>#</th>
<th>Title</th>
<th>Artist</th>
<th style="padding-right: 8px;">Votes</th>
<th>%</th>
<th style="padding-right: 8px;">Votes Inc dupes</th>
<th>%</th>
</tr>
<tr>
<td>Loading&#8230;</td>
<td>Loading&#8230;</td>
<td>Loading&#8230;</td>
<td>Loading&#8230;</td>
<td>Loading&#8230;</td>
<td>Loading&#8230;</td>
<td>Loading&#8230;</td>
</tr>
</tbody>
</table>
</div>
<p><script type="text/javascript">
    jQuery(document).ready(function () {
        jQuery.getJSON("/static/hottest100results/votes.json", function (data) {
            var votesElement = jQuery('#num-votes'),
                indyVotesElement = jQuery('#num-individual-votes'),
                numVotes = 0,
                numIndyVotes = 0;
            data.forEach(function (list) {
                if (list.length > 0) {
                    numVotes = numVotes + 1;
                }
                numIndyVotes = numIndyVotes + list.length;
            });
            votesElement.text(numVotes);
            indyVotesElement.text(numIndyVotes);
            jQuery.getJSON("/static/hottest100results/dupes.json", function (data) {
                var dupesElement = jQuery('#num-dupes'),
                    numDupes = 0;
                data.forEach(function (list) {
                    if (list.length > 0) {
                        numDupes = numDupes + 1;
                    }
                });
                dupesElement.text(numDupes);
                jQuery.getJSON("/static/hottest100results/songs.json", function (data) {
                    jQuery.getJSON("/static/hottest100results/songs_with_dupes.json", function (dupeData) {
                        var tableBody = jQuery('#hottest100table'),
                            firstRow = jQuery('#hottest100table tr:last');
                        for (var ii = 0; ii < 99; ii++) {
                            firstRow.clone().appendTo('#hottest100table');
                        }
                        for (var ii = 1; ii < 101; ii++) {
                            var songId = data.top100[ii - 1];
                            jQuery('#hottest100table tr:nth-child(' + (ii + 1) + ') td:nth-child(1)').text(ii);
                            jQuery('#hottest100table tr:nth-child(' + (ii + 1) + ') td:nth-child(2)').text(data.songs[songId].name);
                            jQuery('#hottest100table tr:nth-child(' + (ii + 1) + ') td:nth-child(3)').text(data.songs[songId].artist);
                            jQuery('#hottest100table tr:nth-child(' + (ii + 1) + ') td:nth-child(4)').text(data.songs[songId].votes);
                            jQuery('#hottest100table tr:nth-child(' + (ii + 1) + ') td:nth-child(5)').text((data.songs[songId].votes / numVotes * 100).toFixed(2) + '%');
                            jQuery('#hottest100table tr:nth-child(' + (ii + 1) + ') td:nth-child(6)').text((dupeData.songs[songId].votes));
                            jQuery('#hottest100table tr:nth-child(' + (ii + 1) + ') td:nth-child(7)').text((dupeData.songs[songId].votes / (numVotes + numDupes) * 100).toFixed(2) + '%');
                        }
                    });
                });
            });
        });
    });
</script></p>
<p>The post <a href="https://jasongi.com/2018/01/20/100-toasty-tofus-another-triple-j-hottest-100-predictor/">100 Toasty Tofu(s) &#8211; Another Triple J Hottest 100 Predictor</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://jasongi.com/2018/01/20/100-toasty-tofus-another-triple-j-hottest-100-predictor/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2863</post-id>	</item>
		<item>
		<title>Mac OSX frozen version of Blackboard Scraper</title>
		<link>https://jasongi.com/2015/03/28/mac-osx-frozen-version-of-blackboard-scraper/</link>
					<comments>https://jasongi.com/2015/03/28/mac-osx-frozen-version-of-blackboard-scraper/#respond</comments>
		
		<dc:creator><![CDATA[jasongi]]></dc:creator>
		<pubDate>Sat, 28 Mar 2015 20:11:11 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[University]]></category>
		<category><![CDATA[blackboard]]></category>
		<category><![CDATA[blackboard scraper]]></category>
		<category><![CDATA[Mac OSX]]></category>
		<category><![CDATA[python]]></category>
		<guid isPermaLink="false">http://jasongi.com/?p=2567</guid>

					<description><![CDATA[<p>I have made a frozen executable of the Blackboard Scraper for easy use on Mac OSX. Check it out here.</p>
<p>The post <a href="https://jasongi.com/2015/03/28/mac-osx-frozen-version-of-blackboard-scraper/">Mac OSX frozen version of Blackboard Scraper</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>I have made a frozen executable of the Blackboard Scraper for easy use on Mac OSX. Check it out <a title="Blackboard Scraper" href="https://jasongi.com/blackboard-scraper/">here</a>.</p>
<p>The post <a href="https://jasongi.com/2015/03/28/mac-osx-frozen-version-of-blackboard-scraper/">Mac OSX frozen version of Blackboard Scraper</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://jasongi.com/2015/03/28/mac-osx-frozen-version-of-blackboard-scraper/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2567</post-id>	</item>
		<item>
		<title>Updated Blackboard Scraper</title>
		<link>https://jasongi.com/2015/03/21/updated-blackboard-scraper/</link>
					<comments>https://jasongi.com/2015/03/21/updated-blackboard-scraper/#respond</comments>
		
		<dc:creator><![CDATA[jasongi]]></dc:creator>
		<pubDate>Sat, 21 Mar 2015 13:59:09 +0000</pubDate>
				<category><![CDATA[Programming]]></category>
		<category><![CDATA[University]]></category>
		<category><![CDATA[blackboard]]></category>
		<category><![CDATA[blackboard scraper]]></category>
		<category><![CDATA[Mac OSX]]></category>
		<category><![CDATA[python]]></category>
		<guid isPermaLink="false">http://jasongi.com/?p=2552</guid>

					<description><![CDATA[<p>The Blackboard Scraper has been updated to work with the new blackboard changes. Get the new version here. Also, if you want to say thanks, you could enrol to vote in the University Council elections here, it only takes 2 seconds to input your student number!</p>
<p>The post <a href="https://jasongi.com/2015/03/21/updated-blackboard-scraper/">Updated Blackboard Scraper</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>The Blackboard Scraper has been updated to work with the new blackboard changes. Get the new version <a title="Blackboard Scraper" href="https://jasongi.com/blackboard-scraper/">here</a>.</p>
<p>Also, if you want to say thanks, you could enrol to vote in the University Council elections <a href="https://jfe.qualtrics.com/form/SV_8pPsRT9MBDGung9">here</a>, it only takes 2 seconds to input your student number!</p>
<p><a href="https://i0.wp.com/jasongi.com/wp-content/uploads/2015/03/Capture3.png?ssl=1"><img data-recalc-dims="1" fetchpriority="high" decoding="async" class="aligncenter size-full wp-image-2558" src="https://i0.wp.com/jasongi.com/wp-content/uploads/2015/03/Capture3.png?resize=615%2C635&#038;ssl=1" alt="Blackboard Scraper" width="615" height="635" srcset="https://i0.wp.com/jasongi.com/wp-content/uploads/2015/03/Capture3.png?w=615&amp;ssl=1 615w, https://i0.wp.com/jasongi.com/wp-content/uploads/2015/03/Capture3.png?resize=290%2C300&amp;ssl=1 290w" sizes="(max-width: 709px) 85vw, (max-width: 909px) 67vw, (max-width: 984px) 61vw, (max-width: 1362px) 45vw, 600px" /></a></p>
<p>The post <a href="https://jasongi.com/2015/03/21/updated-blackboard-scraper/">Updated Blackboard Scraper</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://jasongi.com/2015/03/21/updated-blackboard-scraper/feed/</wfw:commentRss>
			<slash:comments>0</slash:comments>
		
		
		<post-id xmlns="com-wordpress:feed-additions:1">2552</post-id>	</item>
		<item>
		<title>Linux.conf.au: Day 3</title>
		<link>https://jasongi.com/2014/01/08/linux-conf-au-day-3/</link>
					<comments>https://jasongi.com/2014/01/08/linux-conf-au-day-3/#comments</comments>
		
		<dc:creator><![CDATA[jasongi]]></dc:creator>
		<pubDate>Wed, 08 Jan 2014 17:51:28 +0000</pubDate>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Programming]]></category>
		<category><![CDATA[data.gov.au]]></category>
		<category><![CDATA[lca2014]]></category>
		<category><![CDATA[linux.conf.au]]></category>
		<category><![CDATA[open government]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[python 3]]></category>
		<category><![CDATA[reverse engineering]]></category>
		<category><![CDATA[TPPA]]></category>
		<category><![CDATA[Trans-Pacific Partnership Agreement]]></category>
		<guid isPermaLink="false">http://jasongi.com/?p=73</guid>

					<description><![CDATA[<p>Wow. I now realise the last two days were just a warmup. The Conference really came into its own today. With talks going about twice as long as the previous two days, we really got some in-depth information from some top-notch speakers. This is a continuation of this post here about my first Linux.conf.au, &#8230; <a href="https://jasongi.com/2014/01/08/linux-conf-au-day-3/" class="more-link">Continue reading<span class="screen-reader-text"> "Linux.conf.au: Day 3"</span></a></p>
<p>The post <a href="https://jasongi.com/2014/01/08/linux-conf-au-day-3/">Linux.conf.au: Day 3</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></description>
										<content:encoded><![CDATA[<p>Wow. I now realise the last two days were just a warmup. The Conference really came into its own today. With talks going about twice as long as the previous two days, we really got some in-depth information from some top-notch speakers.</p>
<p>This is a continuation of my posts about my first Linux.conf.au; the previous post is <a title="Linux.conf.au: Day 2" href="https://jasongi.com/blog/2014/01/08/linux-conf-au-day-2/">here</a>.</p>
<p>It started off with the lightning talks in lieu of a keynote speaker. I was a bit disappointed that there weren&#8217;t more of them (they finished well before scheduled), but the ones that were given were very well delivered, on topics ranging from Raising Geek Girls to Password Storage best-practices.</p>
<p>Next up I attended the tutorial <a href="https://lca2014.linux.org.au/schedule/30197/view_talk?day=wednesday">Python 3: Making the Leap!</a> <i>by</i> Tim Leslie. This was a great tutorial where Tim went through the nuances of the differences between the two versions of Python and how the -3 flag and the 2to3.py tool can be used to convert a 2.x script to Python 3. I found this particularly helpful as someone who recently started having a go at Python with the (disclaimer: linking to horrible, horrible code) <a href="https://jasongi.com/blackboard-scraper/">blackboard scraper</a> I recently wrote. For some reason, despite all my dependencies working with Python 3, I wrote it in 2.7. Learning to write pretty much deprecated code was probably not the most intelligent thing to do, but at least now I have a good explanation of what is different when writing code for 3 in the future.</p>
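<p>To give a flavour of the kind of differences involved, here is a small sketch of my own (not taken from Tim&#8217;s tutorial) showing a few well-known behaviours that changed between 2.7 and 3. The function name <code>describe_differences</code> is just something I made up for the example; it is written for Python 3.</p>

```python
def describe_differences():
    """Collect a few Python 3 behaviours that differ from Python 2.7."""
    results = {}

    # "/" is true division in Python 3; in 2.7, 7 / 2 on ints gave 3.
    results["true_division"] = 7 / 2
    # "//" is floor division in both versions.
    results["floor_division"] = 7 // 2

    # str is Unicode text in Python 3; encoding it produces a separate
    # bytes object (in 2.7, plain str was already a byte string).
    results["encoded"] = "caf\u00e9".encode("utf-8")

    # dict.items() returns a view in Python 3 rather than a list.
    results["items_is_list"] = isinstance({"a": 1}.items(), list)

    return results


if __name__ == "__main__":
    # print is a function in Python 3; it was a statement in 2.7.
    for name, value in describe_differences().items():
        print(name, "=", repr(value))
```

<p>These are the sorts of things a conversion tool can rewrite mechanically (like print), versus the ones you have to think about yourself (like text versus bytes).</p>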
<p>Next up was <a href="http://linux.conf.au/schedule/30069/view_talk?day=wednesday">Building Effective Alliances around the Trans-Pacific Partnership Agreement</a> <i>by</i> Sky Croeser [ <a href="http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/53-Building_Effective_Alliances_around_the_Trans-Pacific_Partnership_Agreement_-_Sky_Croeser.mp4">Video</a> ]. The great thing about this talk is that it went beyond saying why the TPPA is bad, which is all we ever seem to hear from tech news sites, and actually delved into strategies for effectively combating it. The strategies were diverse with no stone left unturned, from writing letters to politicians to <a href="http://en.wikipedia.org/wiki/1999_Seattle_WTO_protests#.22N30.22">1999 Seattle WTO Protest</a> tactics &#8211; fun for the whole family! I thought the way she approached the topic was terrific. As a person who is not in the slightest bit interested in radical activism, but who has a strong interest in political issues such as this, it was nice to be acknowledged as somebody who can still contribute. If you&#8217;re at all worried about the TPPA, I would strongly recommend watching the talk when it&#8217;s uploaded on Monday.</p>
<p><a href="http://linux.conf.au/schedule/30123/view_talk?day=wednesday">Bringing More Women to Free and Open Source Software</a> <i>by</i> Karen Sandler [ <a href="http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/58-Bringing_More_Women_to_Free_and_Open_Source_Software_-_Karen_Sandler.mp4">Video</a> ] was a really eye-opening talk on the strategies used at the GNOME Foundation to bring their proportion of female contributors up to par with the rest of the computing sector (3% compared to 18%, still a woefully small number). The lack of women in computing is quite a serious problem which I think is overlooked (and sometimes even perpetuated) by many computer scientists and programmers, and perhaps society in general. I&#8217;m fairly certain (and very hopeful) this is a changing statistic, as I&#8217;ve met many people who are committed to the issue.</p>
<p><a href="http://linux.conf.au/schedule/30133/view_talk?day=wednesday">Opening up government data</a> <i>by</i> Pia Waugh [ <a href="http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/62-Opening_up_government_data_-_Pia_Waugh.mp4">Video</a> ] continued from Monday&#8217;s open government miniconf about making government datasets freely available to the public. As I mentioned on Monday, Pia Waugh is definitely someone who is &#8220;on our side&#8221; in the government sector when it comes to open data, and is in charge of the great <a href="http://data.gov.au/">data.gov.au</a> website where you can request and vote for certain datasets to be released. She talked about certain issues related to open government data, including ways to automate data collection and uploading (which would always provide up-to-date data) and issues with the format in which data is provided. I love playing with datasets, so this talk was very interesting to me. I would recommend giving it a watch when it&#8217;s online.</p>
<p><a href="http://linux.conf.au/schedule/30177/view_talk?day=wednesday">Reverse engineering vendor firmware drivers for little fun and no profit</a> <i>by</i> Matthew Garrett [ <a href="http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/56-Reverse_engineering_vendor_firmware_drivers_for_little_fun_and_no_profit_-_Matthew_Garrett.mp4">Video</a> ]* would have to be the best talk of the day (and possibly the week). In this talk we follow the protagonist Garrett as he embarks on a journey to reverse engineer a vendor tool for modifying servers, which doesn&#8217;t quite go as expected. I won&#8217;t go any further because you have to watch it to really appreciate it. The thing I really like about Garrett&#8217;s talk is not only the sprinkles of humor placed throughout, but the fact that whenever he says anything that isn&#8217;t basic computer/Linux knowledge, he stops and gives a short explanation of what it is. One mistake I feel many speakers make is assuming every attendee is as knowledgeable as they are in the domain they are speaking about, when in reality the audience has all sorts of different skillsets. By not stopping to explain things you run the risk of having half your audience feel stupid and switch off. As a student, these kinds of explanations are much appreciated and help me achieve my goal of learning more about the kernel.</p>
<p>*For those with a keen eye, you might notice that the Garrett and Open Data talks overlapped. I watched one on video after the end of today&#8217;s conference.</p>
<p>That was all for today. I&#8217;m very much looking forward to tomorrow, with a keynote by Matthew Garrett who will hopefully be beating his own record of &#8220;Best Talk so Far&#8221;. If you can&#8217;t make it but want to watch, tune in at http://timvideos.us/octagon at 9AM tomorrow.</p>
<p><strong>My linux.conf.au adventure is continued <a title="Linux.conf.au: Day 3" href="https://jasongi.com/blog/2014/01/08/linux-conf-au-day-4/">here</a>.</strong></p>
<p>The post <a href="https://jasongi.com/2014/01/08/linux-conf-au-day-3/">Linux.conf.au: Day 3</a> appeared first on <a href="https://jasongi.com">JasonGi</a>.</p>
]]></content:encoded>
					
					<wfw:commentRss>https://jasongi.com/2014/01/08/linux-conf-au-day-3/feed/</wfw:commentRss>
			<slash:comments>3</slash:comments>
		
		<enclosure url="http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/53-Building_Effective_Alliances_around_the_Trans-Pacific_Partnership_Agreement_-_Sky_Croeser.mp4" length="17806003" type="video/mp4" />
<enclosure url="http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/58-Bringing_More_Women_to_Free_and_Open_Source_Software_-_Karen_Sandler.mp4" length="232294152" type="video/mp4" />
<enclosure url="http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/62-Opening_up_government_data_-_Pia_Waugh.mp4" length="156750900" type="video/mp4" />
<enclosure url="http://mirror.linux.org.au/linux.conf.au/2014/Wednesday/56-Reverse_engineering_vendor_firmware_drivers_for_little_fun_and_no_profit_-_Matthew_Garrett.mp4" length="106484769" type="video/mp4" />

		<post-id xmlns="com-wordpress:feed-additions:1">73</post-id>	</item>
	</channel>
</rss>
