citizen428.blog()

Try to learn something about everything

The Updated Ruby Reading List

A little more than a year ago, I posted a Ruby reading list consisting of blog posts I often recommend to our students at RubyLearning as well as some interesting articles. It’s time for the updated version (some links are the same as last time, some are new, some are gone):

Review: Metasploit - the Penetration Tester’s Guide

Disclaimer: The fine folks at No Starch Press were nice enough to provide me with a review copy of this book, but this has not influenced this review in any way.

Where to start? The Metasploit Framework (MSF) is a very popular penetration testing tool used by security professionals the world over. It was previously written in Perl but underwent a complete rewrite for version 3, where the developers switched the project to Ruby. The tool unifies the various stages of penetration testing in convenient interfaces (“msfconsole” for interactive use and “msfcli” for scripting purposes): information gathering and storage, exploit and payload configuration, IDS and antivirus evasion and actually exploiting the system.

From this you probably can gather that Metasploit is quite big and complex, as well as in a state of constant flux. This makes it rather hard to write a definitive book on it, which is illustrated by the fact that shortly after this volume got published, the Metasploit team released version 4 of the framework. Considering these difficulties, I’m tempted to say that the authors have done a tremendous job describing MSF as it was at the time of writing.

Now for the actual content: after a foreword by Metasploit’s main developer HD Moore, there’s a little introduction section on penetration testing and the history of the framework. This is followed by the first chapter, which covers some penetration testing basics. After this the authors give a first introduction to the MSF, before dedicating a chapter each to various phases of pen testing, namely information gathering, vulnerability scanning and the actual exploitation. After this you’ll find a whole chapter on Meterpreter, covering various aspects of post-exploitation techniques. Once you get to this point, you should have a good idea about how Metasploit works in principle and how capable it is. The authors don’t stop here though, but use the following chapters to teach you about avoiding detection, client-side exploits and Metasploit’s auxiliary modules. By this point in the book it felt like I had already learned a lot, but then I realized that I was only halfway through! There were still chapters on various topics, including the social-engineering toolkit, which is built into the MSF, and wireless exploitation with Karmetasploit. As a Ruby developer/dev ops guy I was really interested in the next couple of chapters, which deal with building your own modules and exploits as well as porting existing exploits to Metasploit and Meterpreter scripting. Wow, the authors definitely covered a lot of ground up to this point, but we are still not done, since there’s one more chapter on how to simulate your pen tests.

While the above shows what the book covered, it doesn’t say much about how it was covered. In my opinion the authors did a very good job: the text is easy to follow and to the point, and it is helped by screenshots and transcripts of “msfconsole” sessions. Sure, most of this material is also available on the Metasploit Unleashed web site, but I like having it all in the form of one compact book. I noticed 2-3 places where the textual description and the content of the screenshot/transcript didn’t exactly match, which can lead to brief moments of confusion, but nothing dramatic.

If you are new to Metasploit and want to get up to speed quickly, it’s hard to imagine that you’ll find a better book at the moment. More experienced users of the framework should flip through it in a book store to decide how much they’ll really get from it, but it’s probably still a good book to have around, even if it’s just for the cheat sheet in Appendix B.

Information Overload 2011-08-21

Chaos Communication Camp 2011

After What the Hack in 2005 and the Chaos Communication Camp 2007 this was my third hacker camp and once again it was an inspiring, funny, chaotic and productive week of talks, hacking, partying, networking and more.

Monday was mostly spent driving to the camp, which all in all went rather smoothly. All hail to Tesco where we picked up a lot of equipment, including the super awesome tent we slept in. Tuesday was a good opportunity to get a first impression of the camp, see who’s there, start socializing and have a party.

On Wednesday the talks started and I saw three of them. Strong encryption of credit card information was really quite interesting and once again I was shocked by how insecure some very important aspects of our lives are. Strahlung im Weltall dealt with space radiation and was part of the Hackers in Space track of the conference. When you think web development is difficult, how about dealing with radiation that might randomly flip a bit? Last but not least GPRS Intercept was another rather shocking talk, showing that GPRS security is as bad as GSM’s.

Thursday I started with Die psychologischen Grundlagen des Social Engineerings, which was a really good and interesting talk by Stefan Schumacher, who’s a really nice guy who spent quite a bit of time at our village. Steal Everything, Kill Everyone, Cause Total Financial Ruin! was another social engineering related talk and very entertaining. I got to talk to Jayson later that night and he really is as adorable as he says ;-) One of the talks I was most looking forward to was Hacking DNA and I wasn’t disappointed. As an added bonus Marc took some time on Saturday to come over to our village and help us with some really useful information on how to get a biohacking group going at a hackerspace.

Friday I spent mostly socializing, but I also went to see 2 talks. She-Hackers: Millennials and Gender in European F/LOSS Subculture unfortunately was the worst lecture I saw at this camp, which is a real shame considering that the topic at hand was very interesting. In the evening I watched Certified programming with dependent types, which was a fairly interesting introduction to the Coq Proof Assistant. As often with Andreas’s talks the title and abstract suggested slightly more, but it still was a solid and interesting presentation.

Saturday started rather late for me due to Friday night’s party, but I still woke up in time for Moonbounce Radio Communication, held by members of Metalab’s amateur radio crew. Later that day the Metalab was on stage again, this time in the form of Rethinking online news, which was an awesome presentation that led to a nice little meetup of interested parties the next afternoon. Stay tuned on this front, there might be interesting things happening in the not too distant future! The most amazing part of the day for me though was taking a 20 minute flight in an Antonov An-2 from 1968, which was really cool and worth every cent of the 45 Euro it cost.

On Sunday I watched Data Mining Your City, which was not as interesting as I’d hoped for. It sadly also was the day where we had to say goodbye to the camp :-(

Random thoughts: It’s super inspiring to spend an entire week around several thousand intelligent and creative people and like always after such an event I feel invigorated and full of ideas. I also should finally learn not to count on getting any work on one of my projects done at the camp, because it never really works. Sometimes I wonder why I even bother bringing my laptop when an iPad or netbook would probably be sufficient. But then why would I sit in front of my computer working on the same stuff I’m working on for the rest of the year when I’m surrounded by loads of interesting things and people I haven’t met yet?

I <3 you, Chaos Communication Camp!

Information Overload 2011-08-07

There won’t be an “Information Overload” next week as I’ll be attending the Chaos Communication Camp.

Information Overload 2011-07-31

Tom and Michael vs the Good Grief Algorithm

This week Tom (who is now a happily married man :-)) and yours truly finally read another paper, and since I’m sick and have barely left bed for the last 2 days I decided to fight my boredom by writing about it.

The paper
“Multiple Aspect Ranking using the Good Grief Algorithm” by Benjamin Snyder and Regina Barzilay from the MIT CS and AI Lab.

Summary
The paper is in the field of sentiment analysis — extracting opinions from a text — and uses restaurant reviews as a corpus. The prime assumption is that such texts will contain more than one opinion (e.g. quality of food, price range, interior, quality of service), which the authors believe is not properly reflected by previous work in this area, where one opinion per text is assumed.

The problem the authors try to solve is assigning a rank from a fixed scale (e.g. 1-5) to several related aspects. They dismiss the easy approach of treating every aspect as a separate ranking problem, since they believe that a real text relates the different aspects in a coherent way (e.g. through phrases like “but”, “one problem was” etc.).

They build on previous work in the natural language processing field, namely linear models trained with “Perceptron”, and extend this framework with a sort of “meta-model” that predicts relations (agreement or disagreement) between the individually ranked aspects. By relating the different aspects in such a way, it’s easier to reflect the contrasting views in real text like “good BUT pricey” in a meaningful way.

What follows is some math, where each input (a review) gets assigned an m-dimensional ranking vector (reflecting m aspects). For example if you try to rank 3 different aspects (food, service, price) on a scale of 1-5 a ranking vector might look like <5,5,3> (“food and service were great, but it was a bit pricey”). The joint ranking model then combines the individual ranks with an “agreement model” to introduce “grief terms” which express dissatisfaction in a certain aspect. The algorithm then tries to minimize the sum of this “grief” (which is what gave the algorithm its name) in a joint rank. This step then gets incorporated into the training of the individual rankers (there’s some pseudo-code for the joint training), where features of an aspect get represented through presence or absence of words and word bigrams. This is obviously a very simplified description of the algorithm, but it seems moot to repeat the entire math here, which is very well laid out in the presentation. There’s also a very nice illustration about decoding and relating the various aspects at the end of the PDF.
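To make the joint decoding step a bit more tangible, here is a small Python sketch of how I understand the idea. It is not the code or the exact math from the paper: the function names, the rank boundaries and the way grief is measured (as the distance a score would have to move before a model "accepts" a rank) are my own simplifying assumptions, and the scores are simply made up instead of coming from trained models.

```python
from itertools import product

# Hypothetical thresholds shared by all aspects: a score below -1.5 means
# rank 1, a score in [-1.5, -0.5) means rank 2, and so on up to rank 5.
BOUNDARIES = [-1.5, -0.5, 0.5, 1.5]


def rank_grief(score, rank, boundaries=BOUNDARIES):
    """Distance `score` has to move to land in the interval for `rank`."""
    lower = boundaries[rank - 2] if rank > 1 else float("-inf")
    upper = boundaries[rank - 1] if rank <= len(boundaries) else float("inf")
    if score < lower:
        return lower - score
    if score >= upper:
        return score - upper
    return 0.0


def agreement_grief(agreement_score, ranks):
    """Grief of the agreement model for a candidate rank vector.

    Assumption: a positive score means "all aspects get the same rank",
    a negative one means "the aspects differ".
    """
    all_equal = len(set(ranks)) == 1
    if all_equal:
        return max(0.0, -agreement_score)
    return max(0.0, agreement_score)


def joint_rank(scores, agreement_score, num_ranks=5):
    """Brute-force search for the rank vector with the least total grief."""
    best_ranks, best_grief = None, float("inf")
    for ranks in product(range(1, num_ranks + 1), repeat=len(scores)):
        grief = agreement_grief(agreement_score, ranks)
        grief += sum(rank_grief(s, r) for s, r in zip(scores, ranks))
        if grief < best_grief:
            best_ranks, best_grief = ranks, grief
    return best_ranks, best_grief


# Made-up scores for (food, service, price).
print(joint_rank([2.0, 1.8, 0.1], agreement_score=-1.0))
print(joint_rank([2.0, 1.8, 1.4], agreement_score=1.0))
```

In the first call the agreement model expects disagreement, so the individual ranks <5,5,3> are kept as they are. In the second call the third aspect would get a 4 on its own, but a confident "they agree" signal makes <5,5,5> the cheaper choice. The brute-force search is only reasonable because the number of aspects and ranks is tiny.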

What then follows is a comparison of the Good Grief algorithm with similar algorithms on a corpus of 4500 restaurant reviews ranking restaurants on 5 different aspects, where it outperforms the “competition” in a statistically significant way.

Takeaway
As someone who works in an area where sentiment analysis could come in handy I do have an interest in the topic, but alas we don’t have the time and resources to develop our own system to do this properly. I once hacked something together with JRuby and the OpenNLP library, which wasn’t really sentiment analysis but an attempt to extract useful phrases from reviews. It was crude and had its faults, but worked surprisingly well for the amount of time it took to write. It did however get abandoned in a prototype stage and after reading the paper it’s quite clear to me that that probably was a good choice. Sentiment analysis is a non-trivial problem that can’t properly be tackled in the ad-hoc fashion we tried. Should this topic ever resurface at my company I’ll definitely try to go for a more elaborate and scientific approach.

Difficulty of the paper
I know that for some people the math might look scary, but if you take some time and look at it carefully you’ll notice that it’s actually rather simple. It probably doesn’t hurt to be acquainted with some basic concepts of NLP and machine learning, but if you are, it’s quite an enjoyable paper even if you are not an absolute expert in the field.

Further reading

Information Overload 2011-07-24

Information Overload 2011-07-17