How to Build a Web Scraper

Introduction: How to Build a Web Scraper

Many people use personal computers without utilizing them to their fullest capabilities. By learning a few basic principles and using free software, one can start to unlock the power and resources a computer has to offer. This tutorial illustrates a method of constructing a “Web-Scraping” bot, or crawler. These “crawlers” can automatically collect many different types of data from web pages, which makes them a powerful tool for any computer user.

Step 1: Required Materials:

1 Personal Computer

- I will be using Windows 10 in this demonstration, but the same code and principles can be applied across all platforms, even mobile.

Internet Connection

Google Chrome

Step 2: Previous Computer Experience:

While this tutorial does not require prior coding experience, it is recommended that users have a basic understanding of how to use a keyboard (copy and paste) and how to use a mouse.

CAUTION: Always make sure you back up your important files. Improper installation may cause data corruption.

Step 3: Starting the Project

First, we need to download and install a program called Python 2.7.14. Go to “” and click download Python 2.7.14. After it is done downloading, run the file and install Python. To check that it installed, look in the C:\ drive and find a folder called Python27. If it’s there, Python installed successfully. If it’s not there, try restarting your computer and running the installation program again.

Step 4:

Now we need to make Windows and Python play nice together.

Open Control panel and select "System and Security"

Step 5:

Select “System”.

Step 6:

Go to the left column and select “Advanced System Settings”. A new window should appear.

Step 7:

Click “Environment Variables”

Step 8:

The window has two sections: one called “User Variables” and another called “System Variables”. Navigate to “System Variables” and click “New...”. (We are going to ADD two new variables.)

Step 9:

First, ADD a variable with Variable name: PYTHON and Variable value: C:\Python27\

Step 10:

Second, ADD a variable with Variable name: Python_Scripts and Variable value: C:\Python27\Scripts

Restart your computer.

Step 11:

After the restart, open Command Prompt (press the Windows key and type “cmd”).

Enter the command: python

You should see something like:

Python 2.7.13 (v2.7.13:a06454b1afa1, Dec 17 2016, 20:42:59) [MSC v.1500 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>>

The version number shown will match the release you installed. If you don’t see this, repeat the installation and environment variable steps (Steps 3 to 10).

Press “Ctrl + Z” and then “Enter” to exit Python and return to the command line, then close Command Prompt.
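As an optional check of the setup so far, the following is a minimal sketch that reports the running Python version and whether the two variables from Steps 9 and 10 are visible. The function name `missing_variables` is a hypothetical helper introduced here for illustration; it is not part of the original tutorial code. Save it to a file and run it with python.

```python
import os
import sys

# Variable names taken from Steps 9 and 10 of this tutorial.
EXPECTED_VARIABLES = ("PYTHON", "Python_Scripts")


def missing_variables(environment):
    """Return the expected variable names that are absent from `environment`.

    Windows treats variable names case-insensitively, so names are compared
    in upper case.
    """
    present = set(name.upper() for name in environment)
    return [name for name in EXPECTED_VARIABLES if name.upper() not in present]


print("Running Python " + sys.version.split()[0])
print("Missing variables: " + str(missing_variables(os.environ)))
```

An empty list on the second line suggests both variables were saved; any names listed there were not found.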

Step 12:

We officially have Python installed. Now we have to install a couple of small programs for our “crawler” to work.

Open a new notepad file and copy and paste all the text from “”

Save the text file as “” and move it into your documents folder.

Open command prompt as administrator

Step 13:

Type “cd documents” and press Enter.

Type “python” and press Enter.

Type “pip install selenium” and press Enter.

After Selenium is successfully installed, move on to the next step.
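One way to confirm the installation worked is to ask Python whether it can import the package. This is a minimal sketch; `is_installed` is a hypothetical helper name chosen for this example.

```python
def is_installed(module_name):
    """Return True if Python can import the named module, False otherwise."""
    try:
        __import__(module_name)
    except ImportError:
        return False
    return True


print("selenium installed: " + str(is_installed("selenium")))
```

If this prints False, the pip command above most likely did not complete, and it is worth running it again.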

Step 14:

Open a new notepad file.

Copy and paste all the code from the Pastebin link into the Notepad file.

Now the fun begins…

We have to decide what kind of data we want to scrape. For the sake of demonstration, I will use eBay item prices.

Let’s say I want to sell my instrument, but I am not sure what the mean price is.

I can use the “crawler” to collect the prices for me.

Step 15:

In the notepad file, look for a line that says

landing_page_url = ''

I am going to copy and paste the URL from the page I want to scrape here.

In this case it will be

landing_page_url =



This is the eBay search result page for “MPC 2000XL” (the instrument I want to sell).

Step 16:

Every single thing you see on a webpage is called an “element”. Each element has its own “address” or “position” on the page. We want the bot to grab and record certain elements, but not others. We do this by telling the bot exactly which elements we want it to grab.
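To make the idea of an element “address” concrete, here is a minimal sketch that picks price elements out of a tiny hand-written page using only Python's standard library. It assumes Python 3 (the `html.parser` module), the sample page and its prices are invented for illustration, and `PriceSpanCollector` is a hypothetical class name. It mimics the selector `li.lvprice.prc > span` used later in this tutorial and ignores everything else on the page.

```python
from html.parser import HTMLParser

# Invented sample page; real pages are far larger.
SAMPLE_PAGE = """
<ul>
  <li class="lvtitle">MPC 2000XL sampler</li>
  <li class="lvprice prc"><span>$450.00</span></li>
  <li class="lvprice prc"><span>$525.00</span></li>
</ul>
"""


class PriceSpanCollector(HTMLParser):
    """Collect text from <span> elements directly inside <li class="lvprice prc">."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.open_tags = []  # stack of (tag, classes) for currently open elements
        self.prices = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self.open_tags.append((tag, classes))

    def handle_endtag(self, tag):
        # Simplification: assumes well-formed HTML with no self-closing tags.
        if self.open_tags:
            self.open_tags.pop()

    def handle_data(self, data):
        if len(self.open_tags) < 2 or not data.strip():
            return
        parent_tag, parent_classes = self.open_tags[-2]
        tag, _ = self.open_tags[-1]
        is_price_item = parent_tag == "li" and {"lvprice", "prc"} <= set(parent_classes)
        if tag == "span" and is_price_item:
            self.prices.append(data.strip())


collector = PriceSpanCollector()
collector.feed(SAMPLE_PAGE)
print(collector.prices)
```

The title element is skipped because its “address” does not match, which is the same filtering the bot performs on the real page.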

Go to the notepad file and locate a line that says,

Item_price_element_list = browser.find_elements_by_css_selector("xxxx") # Find the price elements

Now open Chrome and navigate to the landing page whose URL you pasted earlier.

Step 17:

Right-click the element you want to scrape and select “Inspect Element”.

A new section should open and you should be able to view the page source code.

Step 18:

The element that you clicked on is now being highlighted in the source code window.

Right-click the highlighted portion, hover over “Copy”, then select “Copy CSS Selector”.

Step 19:

Paste it into the “xxxx” portion of the Item_price_element_list line like this:

Item_price_element_list = browser.find_elements_by_css_selector("#item3f88323b1e > > li.lvprice.prc > span") # Find the price elements
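The script linked in Step 14 is not reproduced in this tutorial, so the following is only a minimal sketch of what such a script might look like, not the author's code. It assumes Selenium is installed and a ChromeDriver matching your Chrome version is available. The names `drop_leading_id` and `collect_price_texts` are hypothetical. It uses the `find_elements(By.CSS_SELECTOR, ...)` form, which newer Selenium releases use in place of `find_elements_by_css_selector`. One further assumption: a copied selector that starts with an id such as `#item3f88323b1e` points at a single listing, so the sketch strips that leading part to match the price in every listing.

```python
def drop_leading_id(css_selector):
    """Remove leading '#id' (and empty) steps so the selector is not tied to one item."""
    steps = [step.strip() for step in css_selector.split(">")]
    while steps and (not steps[0] or steps[0].startswith("#")):
        steps.pop(0)
    return " > ".join(steps)


def collect_price_texts(landing_page_url, css_selector):
    """Open the page in Chrome and return the text of every matching element."""
    # Imported here so this file still loads if Selenium is not installed yet.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Chrome()  # assumes ChromeDriver can be found
    try:
        browser.get(landing_page_url)
        elements = browser.find_elements(By.CSS_SELECTOR, drop_leading_id(css_selector))
        return [element.text for element in elements]
    finally:
        browser.quit()  # always close the browser window
```

Calling `collect_price_texts(landing_page_url, "...")` with your own URL and copied selector would return a list of price strings, which you can then print.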

Step 20: Enjoy

Believe it or not, we are done. This program will create a list of prices from the first result page for us.

Save the notepad file as and move it to your Documents Folder


Now open up CMD (does not have to be in administrative mode)

Type “cd documents”

Type “python”

You should see a list of prices

Now I can find the mean and the median and properly list my instrument for a fair price!
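Finding the mean and median can also be done in Python. This is a minimal sketch assuming Python 3.4 or later (for the `statistics` module) and assuming the scraped prices arrive as text such as "$1,234.56". The sample prices are invented, and `parse_price` is a hypothetical helper name.

```python
import re
import statistics


def parse_price(text):
    """Convert text like '$1,234.56' to a float, or return None if no number is found."""
    match = re.search(r"\d[\d,]*(?:\.\d+)?", text)
    if match is None:
        return None
    return float(match.group(0).replace(",", ""))


# Invented sample data standing in for the scraped list of prices.
scraped_texts = ["$100.00", "$200.00", "$600.00", "Free shipping"]

prices = [parse_price(text) for text in scraped_texts]
prices = [price for price in prices if price is not None]  # drop non-price text

mean_price = statistics.mean(prices)
median_price = statistics.median(prices)
print("Mean:", mean_price)
print("Median:", median_price)
```

The median is less affected by one unusually expensive listing than the mean, so comparing the two gives a rough sense of whether a few outliers are skewing the average.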


    8 Discussions


    4 months ago on Step 14

    The code from Pastebin is broken. Can you please give another link? Thanks.


    4 months ago

    Pastebin link for Step 14 is broken. Could you please re-upload?


    2 years ago

    That's a neat setup :)


    Reply 11 months ago


    Would you still have a copy of the content provided in the pastebin link? The link is no longer active and some of us are stuck. Would really appreciate help. Thanks.


    11 months ago

    Is there an updated pastebin link? I've been trying to google around but don't really even know what to search.


    Question 12 months ago

    Can we scrape multiple elements at a time? If so, please demonstrate it and how to connect it to a front-end HTML page.


    Question 2 years ago on Step 14

    Do you still have the code? The link no longer works on step 14 :(