Friday, July 5, 2013

How to install Beautiful Soup or BS4 on Windows?

BeautifulSoup is a Python module meant for web scraping. That is, using Python, you can fetch an HTML page (using a module such as urllib2) and then extract meaningful information from it (using the BeautifulSoup module).
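To make the "extract meaningful information" part concrete, here is a minimal sketch of what parsing looks like once BS4 is installed. The function name extract_title_and_links and the SAMPLE_HTML string are made up for illustration; the HTML is written inline rather than fetched, so no network access is needed. It assumes the built-in "html.parser" backend is good enough for your pages.

```python
def extract_title_and_links(html):
    """Return the page title and the href of every <a> tag in html."""
    # Imported inside the function so this file still loads
    # even if bs4 has not been installed yet.
    from bs4 import BeautifulSoup

    # "html.parser" is the parser that ships with Python itself.
    soup = BeautifulSoup(html, "html.parser")
    title = soup.title.string if soup.title else None
    links = [a.get("href") for a in soup.find_all("a")]
    return title, links


# Hypothetical sample page, standing in for HTML fetched from a real site.
SAMPLE_HTML = """<html>
<head><title>Sample page</title></head>
<body>
<a href="http://example.com/one">One</a>
<a href="http://example.com/two">Two</a>
</body>
</html>"""
```

Calling extract_title_and_links(SAMPLE_HTML) should return the title text together with the two link addresses. In a real script you would pass in the HTML you fetched instead of the sample string.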

Here are the steps to download and install BeautifulSoup on Windows. They assume you have already installed Python and know how to use an archiving tool such as 7-zip.

1) Download the BeautifulSoup compressed file from the link below:
http://www.crummy.com/software/BeautifulSoup/bs4/download/

In my setup, I downloaded a file named beautifulsoup4-4.2.1.tar.gz.

2) Extract the archive using a tool such as 7-zip. Once you have extracted the files, open a command prompt (cmd) and change to the directory named beautifulsoup4-4.2.1.

3) Run the following command from inside that directory to install BeautifulSoup:
C:\>python setup.py install

4) Test the installation by starting the Python interpreter and entering this command:
>>> from bs4 import BeautifulSoup

If the Python prompt (>>>) appears on the next line without any errors, BS4 is installed successfully.
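If you would rather run the check as a script than type it at the prompt, something like the following sketch should work. It assumes Python 3.4 or later (importlib.util.find_spec is not available in older versions), and the helper name is_installed is my own, not part of BeautifulSoup.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if is_installed("bs4"):
    from bs4 import BeautifulSoup

    # Parse a tiny inline document as a quick sanity check.
    soup = BeautifulSoup(
        "<html><head><title>Test page</title></head></html>", "html.parser"
    )
    print("bs4 is installed; parsed title:", soup.title.string)
else:
    print("bs4 was not found; repeat step 3 from inside the extracted directory")
```

Save it under any file name you like and run it with python from the command prompt. Either branch prints a one-line result, so you can tell at a glance whether the install worked.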

For detailed coverage, check out this book: Getting Started with Beautiful Soup