On a weekend trip again, but managed to squeeze in quite a bit of reading during the week.
Friends Without Benefits: Was there a paradigm shift in Silicon Valley from hard science to pointless web 2.0 startups?
How Universities Work: As the title implies, this is about universities in the US, but a lot of it also holds true for academic institutions in Europe.
First World War officially ends: 92 years after the end of the war, Germany will pay the last chunk of reparations imposed by the Treaty of Versailles.
The Shell Hater’s Handbook: Despite the name, this presentation by GitHubber Ryan Tomayko is a nice intro to shell scripting. If you know a shell hater, send them a link to this presentation.
Virtual vs. Real Protests: Twitter “revolutions” and the confusion between “mobilization” and “organization”.
Small Change: Very much in the same vein as the previous article, Malcolm Gladwell talks about hierarchies vs. networks, strong vs. weak ties and why joining a Facebook group is not the same sort of activism as putting your life at risk in a real-world conflict.
Pay The Bills: Interesting experiment in earning some money while looking for a job.
Slightly modified version of a post I originally wrote for our company blog.
When importing data at work, we often have to deal with XML. This generally works fine, but the format’s structured nature also means that you can’t just treat it like any old text file.
That’s something we recently had to work around when we wanted to generate a daily XML diff that contains only the elements that changed since the previous feed. Of course there are several open source tools for diffing XML (e.g. diffxml or xmldiff), but since we couldn’t get them to do what we wanted in a reasonable amount of time, we decided to roll our own.
The final solution is a 71-line bash script which downloads a zip, extracts it, generates MD5 sums for every element and then creates a diff between this new file and the previous list of MD5 sums. Once we know which elements have changed, we merge them into a new feed which then gets handed to our importer. The awesome xmlstarlet was a great help in this, as was battle-tested old awk.
Let’s look at an interesting snippet from the script:
Here we use xmlstarlet to iterate over all the items in the feed (the XPath “//item”), print the value of the “guid” element (-v “./guid”), output a pipe character (-o “|”) and then copy the current element followed by a newline (-c “.” -n). This then gets piped through sed for some cleaning up (which I omitted here for brevity’s sake) before awk takes the part after each “|”, generates an MD5 sum and finally produces a file that looks like this:
Here we create an array with the id of the changed elements over which we then iterate. In the loop we once again use xmlstarlet to extract the current item from the feed which contains the right guid.
I’m quite happy with the result: it does exactly what we want it to do and is also reasonably fast. This is a good example of how familiar Unix tools can be combined to create fairly concise solutions for non-trivial problems.
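The checksum-and-compare idea behind the script can be sketched in a few lines of Ruby. This is not the bash script from the post, just a minimal illustration of the same approach: hash every item, then keep the guids whose hash is new or different. The method names `build_manifest` and `changed_guids` are hypothetical, and the sketch assumes the items have already been extracted into a hash of guid to raw XML (the job xmlstarlet does in the real script).

```ruby
require "digest"

# Build a { guid => md5 } manifest from { guid => raw item XML }.
# Assumption: the items were already extracted from the feed elsewhere.
def build_manifest(items)
  items.transform_values { |xml| Digest::MD5.hexdigest(xml) }
end

# Return the guids that are new, or whose checksum differs from the
# previous run. Items that disappeared from the feed are not reported.
def changed_guids(previous, current)
  current.reject { |guid, sum| previous[guid] == sum }.keys
end

yesterday = build_manifest(
  "a" => "<item><guid>a</guid><title>one</title></item>",
  "b" => "<item><guid>b</guid><title>two</title></item>"
)
today = build_manifest(
  "a" => "<item><guid>a</guid><title>one</title></item>",
  "b" => "<item><guid>b</guid><title>two, edited</title></item>",
  "c" => "<item><guid>c</guid><title>three</title></item>"
)

puts changed_guids(yesterday, today).inspect
```

Only the changed or new guids come back, which is the list the script then uses to pull the matching items out of the feed.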
Being sick this week I had a lot of time to read, but most of it went into Bruce Sterling’s Hacker Crackdown and Joe Dunthorne’s Submarine. Anyway, here we go:
“Did ‘Star Wars’ become a toy story?":http://herocomplex.latimes.com/2010/08/12/star-wars-was-born-a-long-time-ago-but-not-all-that-far-far-away-in-1972-filmmakers-george-lucas-and-gary-kurtz-wer/ I admit to being quite a Star Wars nerd, the Han-shot-first kind who pretends Episodes I-III never happened and who actually sat through the entire Holiday Special. If you are remotely like me, this article with insider information from Gary Kurtz will be quite an interesting read.
In Arabian Desert, a Sustainable City Rises: The first residents are starting to move into Masdar, a planned city built in Abu Dhabi. There seems to be some discussion about how much of a “real” city Masdar will be, but then Dubai didn’t feel like a “real” place to me either; it felt more like a soulless agglomeration of skyscrapers and shopping malls.
If you are running a non-system Emacs on OS X and have tried to use “emacsclient”, you may have seen the following error message despite having started the Emacs server:
emacsclient: can't find socket; have you started the server?
To start the server in Emacs, type "M-x server-start".
emacsclient: No socket or alternate editor. Please use:

 --socket-name
 --server-file (or environment variable EMACS_SERVER_FILE)
 --alternate-editor (or environment variable ALTERNATE_EDITOR)
This doesn’t work because you are invoking “/usr/bin/emacsclient”, which came with the OS, instead of “/Applications/Emacs.app/Contents/MacOS/bin/emacsclient”. This can easily be fixed by symlinking the latter to “/usr/local/bin/emacsclient” and making sure that “/usr/local/bin” is listed in your PATH before “/usr/bin”.
Not a big deal, but it took me a couple of minutes to figure out and I thought I might as well save others some time…
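To check which “emacsclient” actually wins, it helps to walk the PATH in order. The following is a minimal Ruby sketch of that lookup; `first_on_path` is a hypothetical helper name, and the sketch only mimics the usual first-match rule of the shell rather than reproducing everything a shell does (hashing, aliases and so on).

```ruby
# Return the first executable file named `command` found while walking a
# PATH-style string from left to right, or nil if there is none.
def first_on_path(command, path = ENV.fetch("PATH", ""))
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(dir, command)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Prints the emacsclient that would be picked up, or nil if none is installed.
puts first_on_path("emacsclient").inspect
```

If this prints the “/usr/bin” copy even though the symlink exists, “/usr/local/bin” is either missing from the PATH or listed too late.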
Third Kim lucky? This article suggests that Kim Jong-un will become Kim Jong-il’s successor. I’d have expected a kind of military/party government after the latter’s retirement/death, let’s see what actually happens.
Itunes Remove duplicated songs It’s always nice to see something based on my blog posts, especially in this case where it was the first time ever I wrote something about Erlang.
The Myth of the Boy Wizard After the recently discovered flaw in Haystack, Austin Heap got a lot of flak. This article is an interesting take on the media’s role and responsibility in all of this.
Being kinda sick, I decided to use the weekend to empty out my Instapaper account a little. Doing so, I finally read Rubinius wants to help YOU make Ruby better on the Engine Yard blog. This reminded me that it’s been over a year since I last looked at Rubinius, so I used the excellent RVM to get the latest version and started my experiments. Basically everything I threw at it just worked, except for some of my scripts using 1.9’s new lambda syntax. Speedwise it seems to be more in the MRI 1.8.7 than the 1.9.2 range, but that’s fair enough. Getting adventurous, I decided to see how Rubinius would handle one of my all-time favorite Ruby annoyances, the inability to override to_s in subclasses of String (don’t ask, but this once cost me almost an entire afternoon).
In MRI 1.8.7, MRI 1.9.2, JRuby HEAD and MacRuby 0.6 this will output “original”, which I believe I have tracked down to rb_obj_as_string in string.c in the MRI source (no idea about the other implementations). To my great surprise, Rubinius 1.0.1 actually output “overriden”, which instantly won it a new fan. :-)
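For readers who want to try this themselves, here is a minimal sketch of the kind of test described above. It is a reconstruction under assumptions, not the original snippet: the class name `MyString` is hypothetical, and I am assuming the check went through string interpolation, which is one of the places where MRI skips to_s for anything that already is a String.

```ruby
# A String subclass that tries to override to_s.
class MyString < String
  def to_s
    "overriden" # spelling as in the output quoted in the post
  end
end

s = MyString.new("original")

# Interpolation: the post reports "original" here on MRI, because an
# object that already is a String is used as-is and to_s is never called.
interpolated = "#{s}"

# An explicit call goes through normal method dispatch and hits the override.
explicit = s.to_s

puts interpolated
puts explicit
```

The difference between the two lines is the whole annoyance: the override exists and works when called directly, but implicit conversions never reach it.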
One week ago I finally got my Kindle 3 and it’s about time for a review. Here we go: Awesome! That’s it, ’nuff said. In all seriousness, the Kindle may well be my favorite gadget and that comes from somebody who owns a MacBook Pro, an Android phone, an iPod Touch and a Nintendo DS.
First off, at around 180 Euro for the WiFi+3G version it’s quite a bargain. I love the form factor, and the software is quite OK too, especially after the latest update. The built-in dictionary has already proven useful on several occasions and I’m sure I’ll start using the annotation feature sooner rather than later. But now for the most important part, the e-paper display. It’s an absolute pleasure to read on, and I find the experience highly immersive. Since Saturday I have read most of Cory Doctorow’s novel Little Brother on it, and I am surprised how the Kindle just seems to disappear while I read. Hands-free reading without the need to keep the book open is pretty sweet too.
As a book nerd and regular reader (I usually read between 50 and 60 books a year) I still can’t entirely abandon paper books though; there’s just something magical about their feel and smell. I also tend to pick up a lot of my reading material from second-hand shops, Bookcrossing or Offener Buecherschrank, something which is not possible for eBooks (yet). However, the Kindle is great for reading all the freely available books like Peter Watts’s Rifters trilogy, or the Project Gutenberg texts which I have mostly ignored so far because I find it too annoying to read them on a computer screen (programming eBooks are a different matter, I want to read them on my screen so I can easily switch to an editor and try things out).
As should be obvious by now, I’m pretty stoked about my new toy. There is however one app that makes it even more awesome, Calibre. While it has a pretty ugly UI, it’s jam-packed with useful features every eBook user will appreciate. I especially love the ability to fetch RSS feeds and convert them into Kindle “magazines”: every morning Calibre fetches the feeds of several of my favorite online publications, converts them and emails them to my Kindle where I can later read them. Users can contribute their own recipes for this and some of them are just amazing (e.g. the one for Austrian newspaper Der Standard is basically equal to, or better than, the commercial offers in the Kindle store). Of course the program can also deal with authentication, so it’s no problem to access my subscription to the English edition of Le Monde diplomatique or my unread Instapaper items. Recipes are Python scripts by the way, so it’s easy to modify or create them. All in all an absolutely fantastic piece of software, which I happily donated money to! :-)
If you are looking for an eBook reader, it’s probably hard to find better value for money than the new Kindle. I’ve been wanting to buy one for the last six months or so, but am very happy that I waited until now. It’s everything I expected from such a device, plus a bit more.
Due to an extended weekend trip, this issue of “Information Overload” is much shorter and a bit later than usual. The next one should be back to normal:
Who Killed Prolog: A good read for programming language nerds, which also contains interesting info on how the Japanese economy was feared by the US in the 1980s.
A couple of days ago I finally started properly looking at Erlang for the first time. One aspect I find especially interesting is the bit syntax, so I wrote a small program for parsing ID3v1 tags for practice. There’s definitely room for improvement (I ignored ID3v1.1), but it was a fun little exercise.
Here’s the code:
147> % file doesn't exist
147> mp3:get_id3("./test.txt").
enoent
148> % file is not an MP3
148> mp3:get_id3("./test.clj").
no_id3
150> % get the tags
150> {id3v1,Tags} = mp3:get_id3("song.mp3").
{id3v1,[{title,"DancingShoes"},
        {artist,"CliffRichardandTheShadows"},
        {album,"(SUMMERHOLIDAY1963)"},
        {year,"2000"},
        {comment,"Rubylearningr"},
        {genre,[24]}]}
I’m too new to Erlang to judge if this is a proper use of a property list, but it allowed me to write get_tags/2 as a wrapper for proplists:get_value/2 which is rather nice:
Some initial help came from this related blog post, but I think our versions came out quite differently in the end.
All in all Erlang feels quite nice, except for minor syntactic quirks like different statement modifiers depending on context or the need to “extract” a local function with fun for the call in lists:map/2. Any feedback would be much appreciated; I’m sure there are plenty of things I could have done better.
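For comparison, the fixed-width ID3v1 layout that makes the bit syntax such a good fit can also be sketched in Ruby with String#unpack. This is not a port of the Erlang module, just a minimal illustration under the same restrictions (ID3v1 only, no ID3v1.1 track number). The names `parse_id3v1` and `get_id3` and the symbols `:enoent` and `:no_id3` are chosen to mirror the shell session above; they are not part of any library.

```ruby
# An ID3v1 tag is the last 128 bytes of the file:
# "TAG", title(30), artist(30), album(30), year(4), comment(30), genre(1).
ID3V1_SIZE = 128
ID3V1_FORMAT = "a3A30A30A30A4A30C" # "A" drops trailing NULs and spaces

# Parse a 128-byte block; returns a hash of tags or :no_id3.
def parse_id3v1(block)
  return :no_id3 unless block && block.bytesize == ID3V1_SIZE
  marker, title, artist, album, year, comment, genre = block.unpack(ID3V1_FORMAT)
  return :no_id3 unless marker == "TAG"
  { title: title, artist: artist, album: album,
    year: year, comment: comment, genre: genre }
end

# Read the trailing block of a file; returns :enoent if the file is missing.
def get_id3(path)
  return :enoent unless File.file?(path)
  return :no_id3 if File.size(path) < ID3V1_SIZE
  block = File.open(path, "rb") do |f|
    f.seek(-ID3V1_SIZE, IO::SEEK_END)
    f.read(ID3V1_SIZE)
  end
  parse_id3v1(block)
end
```

With the tags in a plain hash, the equivalent of the get_tags/2 wrapper is simply `tags.values_at(:title, :artist)`.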