Case Study: Implementing Data Scraping on an Online Shopping Site

Role: Data Analyst & Web Scraper

Tools: Python Libraries - BeautifulSoup | Requests | Pandas | urllib.parse | Regex101

Challenges

  • Competitor Pricing: difficulty keeping track of competitor pricing and adjusting prices accordingly.
  • Stock Availability Tracking: inability to track stock availability on different marketplaces, leading to lost sales and customer frustration.
  • Brand Performance Tracking: inability to track brand performance, making it difficult to make informed decisions about product promotions and marketing efforts.

Process

The process began with research into existing competitors based on location. The competitor in this case was laptopsdirect.co.uk. The goal was to determine the:

  • Prices at which laptops of different brands were sold
  • Type & brand of products that were sold
  • Ratings of each product
  • Review count of each product
  • Specifications of each laptop sold
  • URL of each product

To address these challenges, a data scraping solution was implemented: a website crawler that gathers competitor pricing, stock availability, and brand performance data.

The project was built with the Python libraries BeautifulSoup and Requests. It began by importing these along with supporting libraries, including Pandas and urllib.parse.

Importing Libraries & Parsing data to BeautifulSoup

The Requests library was used to send a request to the competitor's website and start the web scraping process. The request was made with the GET method and returned a status code of 200, showing that it was successful.
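The request step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function name fetch_html, the User-Agent value, and the timeout are assumptions.

```python
import requests

# Hypothetical header value; some sites reject requests without a User-Agent.
HEADERS = {"User-Agent": "Mozilla/5.0 (compatible; price-research-bot)"}


def fetch_html(url, session=None, timeout=10):
    """Send a GET request and return the page HTML as text."""
    session = session or requests.Session()
    response = session.get(url, headers=HEADERS, timeout=timeout)
    # raise_for_status() raises requests.HTTPError on 4xx/5xx responses,
    # so execution only continues when no client or server error occurred.
    response.raise_for_status()
    return response.text
```

Calling fetch_html("https://www.laptopsdirect.co.uk/") would return the HTML of the home page as a string; the exact listing pages scraped in the project are not stated here, so the site root is only a placeholder.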

The HTML returned by the request was then converted into a BeautifulSoup object, parsed with Python's built-in html.parser, so that it could be searched using the BeautifulSoup library.
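As a small sketch of this step, the snippet below parses a stand-in HTML string rather than a live response. The markup is invented for illustration; in the project the input would be the text of the GET response.

```python
from bs4 import BeautifulSoup

# Stand-in markup (an assumption); normally this is the response text.
html = "<html><head><title>Laptops</title></head><body><h1>Deals</h1></body></html>"

# "html.parser" selects Python's built-in HTML parser.
soup = BeautifulSoup(html, "html.parser")

print(soup.title.string)   # Laptops
print(soup.h1.get_text())  # Deals
```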

HTML Code

Separate functions were used to scrape each piece of information needed: prices, brand names, product URLs, product ratings, review counts, specifications, etc.
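One way such per-field functions might look is sketched below. Everything about the markup is hypothetical: the tag names, the class names (product, title, price, reviews), and the sample products are invented, and the real site's HTML will differ. Taking the brand from the first word of the product name is also an assumption.

```python
import re
from urllib.parse import urljoin

from bs4 import BeautifulSoup

BASE_URL = "https://www.laptopsdirect.co.uk/"

# Hypothetical markup: the real site's tags and class names will differ.
SAMPLE_HTML = """
<div class="product">
  <a class="title" href="/example-laptop/version.asp">Acme Book 14 Laptop</a>
  <span class="price">£1,299.97</span>
  <span class="reviews">(42 reviews)</span>
</div>
<div class="product">
  <a class="title" href="/other-laptop/version.asp">Globex Air 15 Laptop</a>
  <span class="price">£499.00</span>
</div>
"""


def get_price(card):
    """Return the price as a float, or None if no price is shown."""
    tag = card.select_one(".price")
    if tag is None:
        return None
    match = re.search(r"[\d,]+(?:\.\d+)?", tag.get_text())
    return float(match.group().replace(",", "")) if match else None


def get_review_count(card):
    """Return the review count as an int, or None if it is missing."""
    tag = card.select_one(".reviews")
    if tag is None:
        return None
    match = re.search(r"\d+", tag.get_text())
    return int(match.group()) if match else None


def get_url(card):
    """Return the absolute product URL, resolving relative links."""
    link = card.select_one("a.title")
    return urljoin(BASE_URL, link["href"]) if link else None


def scrape_products(html):
    """Collect one dict per product card found in the page."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("div.product"):
        title = card.select_one("a.title")
        name = title.get_text(strip=True) if title else None
        rows.append({
            "name": name,
            # Assumption: the brand is the first word of the product name.
            "brand": name.split()[0] if name else None,
            "price": get_price(card),
            "review_count": get_review_count(card),
            "url": get_url(card),
        })
    return rows


rows = scrape_products(SAMPLE_HTML)
```

Returning None for a missing field, rather than raising, keeps one incomplete product card from stopping the whole crawl.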

Finally, the scraped data was loaded into a Pandas DataFrame to form a table, which was later exported to an Excel file.
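A minimal sketch of this last step is shown below. The rows, the column names, the function names, and the output file name laptops.xlsx are all illustrative assumptions, not the project's real data.

```python
import pandas as pd

# Hypothetical rows, shaped like the output of the scraping functions.
scraped_rows = [
    {"brand": "Acme", "name": "Acme Book 14 Laptop", "price": 1299.97, "review_count": 42},
    {"brand": "Globex", "name": "Globex Air 15 Laptop", "price": 499.00, "review_count": None},
]


def build_table(rows):
    """Turn a list of per-product dicts into a DataFrame, one row per product."""
    return pd.DataFrame(rows)


def export_to_excel(df, path="laptops.xlsx"):
    """Write the table to an Excel workbook (.xlsx output needs openpyxl installed)."""
    df.to_excel(path, index=False, sheet_name="Laptops")


table = build_table(scraped_rows)
print(table)
```

Calling export_to_excel(table) would then write the workbook; index=False leaves the DataFrame's row index out of the sheet.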

Data Extraction

Solution & Impact

The data scraping solution improved the business by providing accurate competitor pricing data, enabling effective inventory management, and giving valuable insights into product and brand performance. This improved decision-making and reduced lost sales.

Pandas Dataframe Table

Pandas to Excel Workbook

See Full Code on Google Colaboratory.

Get Scraped Excel Workbook Here.

See Case Study on Github.
