May 27th, 2007
Hpricot is a very interesting library for parsing HTML. I’ll post the example here, just to admire its beauty from time to time…
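require 'hpricot'
require 'open-uri'
# load the RedHanded home page
doc = Hpricot(open("http://redhanded.hobix.com/index.html"))
# change the CSS class on links
(doc/"span.entryPermalink").set("class", "newLinks")
# remove the sidebar
(doc/"#sidebar").remove
# print the altered HTML
puts doc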
The library is still evolving, with support for more XPath functions and other features being added.