blog.relativt.net

Eirik's public note to self

8 July 2011

Web harvesting and GUI testing with WebDriver

by Eirik Tenold

For a spare-time project I’ve been working on lately, I had to harvest (also known as screen scraping) a couple of web pages to get the information I needed. A coworker recommended WebDriver, part of the Selenium testing framework.

In a nutshell, WebDriver is a library that controls your browser from code. This can be useful when doing GUI testing or when you want to simulate a web browser to fetch something from the web.

You can control different browsers (Firefox, Chrome, etc.). In my experiment I used Firefox. WebDriver currently only supports version 3 of Firefox, so I had to download it and point WebDriver to where it was located.

WebDriver worked very well for my project, and even though I’m not a big fan of web harvesting, the overall result was good enough.

How I pointed WebDriver to Firefox 3 and turned on autodownload for Excel spreadsheets:

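import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.firefox.FirefoxProfile;
// Tell WebDriver which Firefox binary to launch
System.setProperty("webdriver.firefox.bin", "/opt/firefox-3.6/firefox/firefox");
FirefoxProfile fp = new FirefoxProfile();
fp.setPreference("browser.download.dir", "/tmp/firefox-webdriver");
// 2 = save to the custom directory set in browser.download.dir
fp.setPreference("browser.download.folderList", 2);
// Must match the Content-Type the server sends; these are saved without a prompt
fp.setPreference("browser.helperApps.neverAsk.saveToDisk", "application/mx-msdownload");
WebDriver driver = new FirefoxDriver(fp);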
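
With auto-download turned on, the code that triggers the download still has to find out when the spreadsheet has actually landed in the download directory. Below is a minimal sketch of one way to do that with plain Java file APIs. DownloadWaiter and its methods are hypothetical names of my own, not part of WebDriver, and the sketch assumes Firefox keeps a sibling ".part" file next to the target for as long as the download is in progress.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Optional;

// Hypothetical helper (not part of WebDriver): polls a download directory
// until a finished file with the expected extension shows up.
final class DownloadWaiter {

    private DownloadWaiter() {}

    // Returns a finished download in dir whose name ends with extension,
    // or empty if there is none yet. A file counts as finished when it is
    // non-empty and has no sibling "<name>.part" file (assumption about
    // how Firefox marks in-progress downloads).
    static Optional<Path> findDownload(Path dir, String extension) throws IOException {
        if (!Files.isDirectory(dir)) {
            return Optional.empty();
        }
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir)) {
            for (Path entry : entries) {
                String name = entry.getFileName().toString();
                boolean inProgress = Files.exists(entry.resolveSibling(name + ".part"));
                if (Files.isRegularFile(entry)
                        && name.endsWith(extension)
                        && Files.size(entry) > 0
                        && !inProgress) {
                    return Optional.of(entry);
                }
            }
        }
        return Optional.empty();
    }

    // Polls findDownload every 200 ms until a file is found or the timeout
    // has passed. The directory is always checked at least once.
    static Optional<Path> waitForDownload(Path dir, String extension, long timeoutMillis)
            throws IOException, InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (true) {
            Optional<Path> found = findDownload(dir, extension);
            if (found.isPresent() || System.currentTimeMillis() >= deadline) {
                return found;
            }
            Thread.sleep(200);
        }
    }
}
```

After clicking the download link through the driver, a call such as DownloadWaiter.waitForDownload(Paths.get("/tmp/firefox-webdriver"), ".xls", 30000) would return the spreadsheet's path, or an empty Optional if nothing arrived within 30 seconds. The ".xls" extension and the timeout are example values only.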

The Maven dependency looks like this:

<dependency>
    <groupId>org.seleniumhq.webdriver</groupId>
    <artifactId>webdriver-firefox</artifactId>
    <version>0.9.7376</version>
</dependency>
tags: Java - Selenium - Firefox - WebDriver