Python Multithreading Example

Summary: in this tutorial, you’ll learn how to use the Python threading module to develop a multithreaded program.

Extending the Thread class #

We’ll develop a multithreaded program that scraps the stock prices from the Yahoo Finance website.

To do that, we’ll use two third-party packages:

requests – to get the contents of a webpage.
lxml – to select a specific element of an HTML document.

First, install the requests and lxml modules using the pip command:

pip install request lxmlCode language: Python (python)

Next, define a new class called Stock that inherits from the Thread class of the threading module. We’ll place the Stock class in stock.py module:

import threading

class Stock(threading.Thread):
   passCode language: Python (python)

Then, implement the __init__() method that accepts a symbol and initializes the url instance variable based on the symbol:

import threading
import requests
from lxml import html


class Stock(threading.Thread):
    def __init__(self, symbol: str) -> None:
        super().__init__()

        self.symbol = symbol
        self.url = f'https://finance.yahoo.com/quote/{symbol}'
        self.price = NoneCode language: Python (python)

For example, if you pass the symbol GOOG to the __init__() method, the URL will be:

https://finance.yahoo.com/quote/GOOGCode language: Python (python)

After that, override the run() method of the Thread class. The run() method gets the contents from the self.url and grabs the stock price:

class Stock(threading.Thread):
    def __init__(self, symbol: str) -> None:
        super().__init__()

        self.symbol = symbol
        self.url = f'https://finance.yahoo.com/quote/{symbol}'
        self.price = None

    def run(self):
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
        }
        response = requests.get(self.url, headers=headers)
        if response.status_code == 200:
            # parse the HTML
            tree = html.fromstring(response.text)
            # get the price in text
            price_text = tree.xpath(
                '//*[@id="quote-header-info"]/div[3]/div[1]/div[1]/fin-streamer[1]/text()')
            if price_text:
                try:
                    self.price = float(price_text[0].replace(',', ''))
                except ValueError:
                    self.price = None

    def __str__(self):
        return f'{self.symbol}\t{self.price}'Code language: Python (python)

How it works.

Make a request to the URL using the requests.get() method:

headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
}
response = requests.get(self.url, headers=headers)Code language: Python (python)

Notice that without valid headers, Yahoo will return 404 instead of 200.

If the request is successful, the HTTP status code is 200. In this case, we get the HTML contents from the response and pass it to the fromstring() function of the html module from the lxml package:

if response.status_code == 200:
   tree = html.fromstring(response.text)Code language: Python (python)

Every element on a webpage can be selected using something called XPath.

To get the XPath of an element using Google Chrome, you inspect the page, right-click the element, select copy, and Copy XPath.

The XPath of the stock price at the time of writing this tutorial is as follows:

//*[@id="quote-header-info"]/div[3]/div[1]/div[1]/fin-streamer[1]Code language: Python (python)

To get the text of the element, you append the text() at the end of the XPath:

//*[@id="quote-header-info"]/div[3]/div[1]/div[1]/fin-streamer[1]/text()Code language: Python (python)

Notice that if Yahoo changes the page structure, you need to change the XPath accordingly. Otherwise, the program won’t work as expected:

price_text = tree.xpath('//*[@id="quote-header-info"]/div[3]/div[1]/div[1]/fin-streamer[1]/text()')Code language: Python (python)

Once getting the price as text, we remove the comma and convert it to a number:

if price_text:
    try:
        self.price = float(price_text[0].replace(',', ''))
    except ValueError:
        self.price = NoneCode language: Python (python)

Finally, add the __str__() method that returns the string representation of the Stock object:

import threading
import requests
from lxml import html


class Stock(threading.Thread):
    def __init__(self, symbol: str) -> None:
        super().__init__()

        self.symbol = symbol
        self.url = f'https://finance.yahoo.com/quote/{symbol}'
        self.price = None

    def run(self):
        headers = {
            "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
        }
        response = requests.get(self.url, headers=headers)
        if response.status_code == 200:
            # parse the HTML
            tree = html.fromstring(response.text)
            # get the price in text
            price_text = tree.xpath(
                '//*[@id="quote-header-info"]/div[3]/div[1]/div[1]/fin-streamer[1]/text()')
            if price_text:
                try:
                    self.price = float(price_text[0].replace(',', ''))
                except ValueError:
                    self.price = None

    def __str__(self):
        return f'{self.symbol}\t{self.price}'
Code language: Python (python)

Using the Stock class #

The following main.py module uses the Stock class from the stock.py module:

from stock import Stock

symbols = ['MSFT', 'GOOGL', 'AAPL', 'META']
threads = []

for symbol in symbols:
    t = Stock(symbol)
    threads.append(t)    
    t.start()
    

for t in threads:
    t.join()
    print(t)
Code language: Python (python)

Output:

MSFT    253.67
GOOGL   2280.41
AAPL    145.86
META    163.27Code language: Python (python)

How it works.

First, import the Stock class from the stock.py module:

from stock import StockCode language: Python (python)

Second, initialize a list of symbols:

symbols = ['MSFT', 'GOOGL', 'AAPL', 'META']Code language: Python (python)

Third, create a thread for each symbol, start it, and append the thread to the threads list:

threads = []
for symbol in symbols:
    t = Stock(symbol)
    threads.append(t)    
    t.start()
    Code language: Python (python)

Finally, wait for all the threads in the threads list to complete and print out the stock price:

for t in threads:
    t.join()
    print(t)Code language: Python (python)

Summary #

Define a class that inherits from the threading.Thread class and override the run() method.

Was this tutorial helpful ?