citizen428.blog()

Try to learn something about everything

GitHub Stats With Incanter

Today I stumbled upon a post called "GitHub Stats on Programming Languages", where the author uses R to create various graphs about GitHub’s top 10 programming languages. He was kind enough to post his data set, so I had a go at it with Incanter.

Load the necessary libraries:
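The original listing for this step is not preserved here, so the following is only a sketch of what it presumably looked like. It assumes Incanter is on the classpath and pulls in the three namespaces the rest of the post relies on: `incanter.core` for datasets and `view`, `incanter.io` for `read-dataset`, and `incanter.charts` for `bar-chart`.

```clojure
;; Sketch: load the Incanter namespaces used in this post.
;; core   -> datasets, view, $where, with-data
;; io     -> read-dataset
;; charts -> bar-chart and friends
(use '(incanter core io charts))
```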

Then read in the data set (the delimiter defaults to a comma, written as the character literal \, in Clojure; it can be changed with the :delim option) and create a (somewhat arbitrary) list of languages to examine:
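A minimal sketch of this step follows. The file name `github-stats.csv` and the languages in the set are placeholders I made up for illustration, since neither the original path nor the original list survives in this copy of the post. `load-stats` is a hypothetical helper, not part of Incanter.

```clojure
(use '(incanter core io))

;; Hypothetical file name: the post does not say where the CSV was saved.
(def data-file "github-stats.csv")

(defn load-stats
  "Reads a CSV file with a header row into an Incanter dataset."
  [path]
  ;; :delim \, is already the default; it is spelled out here only to
  ;; show where a different delimiter (e.g. \tab) would go.
  (read-dataset path :header true :delim \,))

;; Placeholder selection, not the author's actual list.
(def languages #{"Ruby" "Python" "Clojure"})
```

With the file in place, `(def data (load-stats data-file))` would bind the dataset for the following steps.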

Get a tabular view of the data:
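For the tabular view, Incanter's `view` function opens a dataset in a table window. The wrapper below is just a sketch; `show-table` is a hypothetical name, and it needs a graphical display to do anything useful.

```clojure
(use '(incanter core))

(defn show-table
  "Opens the dataset in a Swing table window (requires a graphical display)."
  [ds]
  (view ds))
```

At the REPL this would be called as `(show-table data)`, or simply `(view data)`.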

[Screenshot: tabular view of the data set]

Narrow the data set down to languages of interest:
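One way to do the narrowing is `$where` with an `:$in` query against a set of language names. This is a sketch under assumptions: the column name `:language`, the sample rows, and the language set are all made up so that the snippet runs on its own; the real data set's columns may be named differently.

```clojure
(use '(incanter core))

;; Made-up placeholder rows standing in for the real data set.
(def sample
  (dataset [:language :repos]
           [["Ruby" 10]
            ["Clojure" 3]
            ["Perl" 7]]))

;; Placeholder selection, not the author's actual list.
(def languages #{"Ruby" "Clojure"})

;; Keep only the rows whose :language value is in the set.
(def narrowed
  ($where {:language {:$in languages}} sample))

;; with-data binds the narrowed dataset to $data, so forms in its body
;; can refer to columns by keyword without passing the dataset around.
(def row-count
  (with-data narrowed
    (nrow $data)))
```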

Display a bar chart of repositories per user (this has to be within the with-data form shown above):
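The chart call might look roughly like the sketch below. I am assuming the data has separate repository and user counts (`:repos` and `:users`, both hypothetical column names) and deriving the ratio with `add-derived-column`; the original data set may already contain such a column. The rows are made-up placeholders. Inside `with-data`, `bar-chart` resolves its keyword arguments against the bound `$data`, which is why the call has to sit inside that form.

```clojure
(use '(incanter core charts))

;; Made-up placeholder rows; the column names are assumptions too.
(def sample
  (dataset [:language :repos :users]
           [["Ruby" 10 4]
            ["Clojure" 3 2]]))

;; Derive repositories per user from the two assumed count columns.
(def per-user
  (add-derived-column :repos-per-user [:repos :users]
                      (fn [repos users] (double (/ repos users)))
                      sample))

(defn repos-per-user-chart
  "Builds a bar chart of repositories per user for each language."
  [ds]
  (with-data ds
    ;; Keyword arguments are looked up in $data, bound by with-data.
    (bar-chart :language :repos-per-user
               :title "Repositories per user"
               :x-label "Language"
               :y-label "Repositories per user")))

;; To display it (opens a window, needs a graphical display):
;; (view (repos-per-user-chart per-user))
```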

[Screenshot: bar chart of repositories per user]

More graphs:

[Screenshots: three further charts]

The full code:

I just love how simple Incanter makes this type of thing. It’s an immensely cool and useful library, which can do much more than what I showed in this post (and previous ones). I hope that one day there’ll be a book about it!
