I had a really sticky problem.  I’m building a Web 2.0 application that aggregates data from disparate social media sites.  The good news is that many sites have really great REST APIs; the bad news is that many do not.  My nemesis was Google Blog Search.  Sorry Charlie, no API!

What to do?

I’ve burned several nights trying to work around this problem, with mixed to very disappointing results.  After plugging away and looking at dozens of different ways to tackle it, I finally settled on an approach suggested here by igvita.com.  While this got me 90% of the way there, I had to do a bit of twiddling to get it to work properly.

First, the igvita.com article didn’t explain how to submit data.  No big deal; you can pass the search criteria in the query string of the request URL:

@url = "http://blogsearch.google.com/blogsearch?hl=en&ie=UTF-8&q=ruby+rails"

So far, so good.
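If the search terms aren't hard-coded, the same URL can be assembled from arbitrary input. Here is a minimal sketch using Ruby's standard `uri` library; the `blog_search_url` helper and the `BLOG_SEARCH_BASE` constant are names I made up for illustration, and I'm assuming the only parameters needed are the three shown in the URL above.

```ruby
require 'uri'

# Hypothetical constant (not from the original script): the search endpoint.
BLOG_SEARCH_BASE = 'http://blogsearch.google.com/blogsearch'

# Hypothetical helper: builds a search URL for arbitrary terms.
# URI.encode_www_form escapes reserved characters and turns spaces into '+'.
def blog_search_url(terms)
  query = URI.encode_www_form('hl' => 'en', 'ie' => 'UTF-8', 'q' => terms)
  "#{BLOG_SEARCH_BASE}?#{query}"
end

puts blog_search_url('ruby rails')
```

Letting the library do the escaping means a query containing characters like `&` or `+` won't break the URL.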

The next trick is to use XPath to locate and snag the piece of data you’re interested in.  This is where Firebug completely rocks.  If you’re not using Firebug for web development, you’re either a glutton for punishment or have been living under a rock for quite a while.  In my particular case, the only piece of data that I was interested in was the number of hits for the query:

[Screenshot: Google Blog Search results for the "ruby rails" query]

The igvita.com article goes into detail regarding how to find the right XPath by leveraging Firebug.  Firebug is pretty slick and the article does a great job explaining how to use it, so I won’t go into the details here.  The problem is that the HTML needs to be converted to XML before the XPath will work properly.  Below is the Ruby code in its entirety:

require 'rubygems'
require 'open-uri'
require 'hpricot'

@url = "http://blogsearch.google.com/blogsearch?hl=en&ie=UTF-8&q=ruby+rails"

begin
  # Hpricot RDoc: http://code.whytheluckystiff.net/hpricot/
  # Fetch the page and parse it as XML so the XPath below resolves properly
  xml = Hpricot.XML(open(@url).read)

  # Retrieve the number of hits
  number_of_hits = (xml/"/html/body/div[5]/table[3]/tbody/tr/td[2]/font/b[3]").inner_html
  puts "Number of hits: #{number_of_hits}"
rescue StandardError => e
  # Network and parsing errors end up here
  print e, "\n"
end
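One follow-up worth sketching: the script leaves the hit count as a string. If you want to store or compare it as a number, something like the following could work. I'm assuming the count arrives as text with thousands separators (e.g. "1,234,567"); I haven't confirmed the exact format, and `parse_hit_count` is a hypothetical helper name, not part of Hpricot.

```ruby
# Hypothetical helper (not from the original script): converts scraped text
# into an Integer by dropping every non-digit character. Returns nil when
# the text contains no digits at all, e.g. when nothing was scraped.
def parse_hit_count(text)
  digits = text.to_s.gsub(/[^\d]/, '')
  digits.empty? ? nil : digits.to_i
end

puts parse_hit_count('1,234,567')
puts parse_hit_count('').inspect
```

Returning nil rather than 0 for empty input keeps "nothing was found on the page" distinguishable from "the query had zero hits".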

Happy hacking,

Peter


In general, I’m not a huge fan of code reviews, but they can be an effective tool if they’re kept in check.  A co-worker asked me to put together some talking points on what I think is worth keeping in mind for a code review process.  I thought they might be generally useful, so I’m posting them here.

  1. In general, I’m not a big fan of invasive, comprehensive code reviews. The more comprehensive they are, the less value they tend to deliver. I AM a fan of spot checks and quick audits concentrated mainly on those developers who may be unknown quantities, e.g. new contractors, new team members etc.
  2. AUTOMATE whenever and wherever possible. If style checks or test-coverage measurement can be automated (and they can), they should be. Tools like CheckStyle and others are invaluable and are huge time savers. They also have the benefit of being impersonal. Nobody gets offended if a tool yells at them. Before anything’s done manually, ask “Can this be automated?” Manual review *should be* the exception, not the rule…
  3. If there are formal code reviews, the purpose and the criteria to be checked should be explicitly stated up-front. I’ve seen too many code review sessions spiral into arguments that essentially boil down to “this is NOT how *I* would have implemented it!” as opposed to a constructive discussion that ultimately adds value to the process.
  4. Code efficiency can be a dicey topic. I’ve seen a lot of developers spend hours arguing about the efficiency of a piece of code that might run once per day. A strong moderator with good mediation skills is important. The reviewers must be mature enough to know when to pick a fight and when good enough is good enough.
  5. A good process with well-defined roles goes a long way toward alleviating arguments and maximizing the value of code reviews. In terms of roles, I would suggest: 1) Moderator 2) Peer 3) Developer who is being checked. The peer reviews the code, and then the three people get together in a meeting and go over the results. The moderator mediates, keeps the meeting rolling along and notes any action that needs to be taken. I would suggest a time- and scope-limited review session. Around 1-1.5 hours seems to work, with follow-up sessions as needed. Any longer and eyes glaze over and people’s interest fades…
  6. Don’t let perfect get in the way of good enough. Again, there has to be a good balance, and generally this discretion falls to the moderator, so the moderator should be somebody who has good facilitation skills.


2 Minute Tips – Learn how to set up a Google Alert!

January 23, 2009

This is a bit of an experiment. I wanted to create a series of “2 Minute Tips”: video screencasts that show how to do something specific in a relatively short amount of time.

Read the full article →

Abundance and Gratitude

January 7, 2009

You can’t escape it.  Each morning when I slide the paper out of its plastic baggie or turn on CNN, there it is: the constant drumbeat of bad news.  It rains down on us unmercifully like a torrent.  I catch my optimism waning more than I’d like to admit.  I think doubt is natural when [...]

Read the full article →

Why the U.S. Automaker Bailout is a VERY bad idea…

December 10, 2008

The US plan for bailing out the US automakers is a horrible idea.  This is yet another case of seemingly well-intentioned legislation designed to dupe the American public into thinking that the lawmakers “care” about the US economy, and it is a weak attempt at underlining the idea that the “Big Three” are too big to [...]

Read the full article →

Darden Beijing Trip: Day 7 – Making our Way Home.

October 19, 2008

Day 7, last day! Follow along as I close out my trip and head home from Beijing, China!

Read the full article →

Darden Beijing Trip: Day 6 – Chinese Economics and Shopping in Crazy town

October 16, 2008

Day #6 of our Beijing adventure: learning about Chinese marketing, economics and the consumer electronics market firsthand!

Read the full article →

Darden Beijing Trip: Day 4 & 5 – The Great Wall of China!

October 16, 2008

Click HERE for the Flickr photostream of Days 4 & 5

Getting to the Great Wall of China
It’s hard to believe we’ve been here for almost a week.  The schedule has been fast-paced, to put it mildly.  On Tuesday afternoon we took off for the Great Wall.  It was a bumpy and sometimes precarious ride out to [...]

Read the full article →

Darden Beijing Trip: Day 3 – Lecture, Chinese Juice Factory and Italian in China!

October 13, 2008

Click here for the Day #3 Flickr photostream

Day #3
It’s day #3 and I’m considering a lifetime ban on Chinese food.  It’s actually not that bad; I think others are having much more difficulty than I am.  Positioning myself as a vegetarian has saved me more than once.  If you’re a picky meat eater, you’re in trouble.  [...]

Read the full article →

Darden Beijing Trip: Day 2 – Class, Forbidden City, Exotic Food

October 12, 2008

Day #2 in Beijing. We visit a food court, Tiananmen Square and the Forbidden City.

Read the full article →