Python Multi Threading

Speeding up Python code using multithreading

We often end up writing Python code that makes remote requests, reads multiple files, or processes some data. In many of those cases I have seen programmers use a simple for loop, which takes forever to finish executing. For example:
import requests
from time import time 
url_list = [
    "https://via.placeholder.com/400",
    "https://via.placeholder.com/410",
    "https://via.placeholder.com/420",
    "https://via.placeholder.com/430",
    "https://via.placeholder.com/440",
    "https://via.placeholder.com/450",
    "https://via.placeholder.com/460",
    "https://via.placeholder.com/470",
    "https://via.placeholder.com/480",
    "https://via.placeholder.com/490",
    "https://via.placeholder.com/500",
    "https://via.placeholder.com/510",
    "https://via.placeholder.com/520",
    "https://via.placeholder.com/530",
]

def download_file(url):
    # Request the URL and return only the HTTP status code
    html = requests.get(url, stream=True)
    return html.status_code

start = time()
for url in url_list:
    print(download_file(url))

print(f'Time taken: {time() - start}')

Output:
Time taken: 4.128157138824463

This is a simple example: the code opens each URL, waits for it to load, prints its status code, and only then moves on to the next URL. Code like this is a very good candidate for multithreading, because it spends most of its time waiting on the network rather than computing.
Modern systems can run a lot of threads, which means many of those waits can be in flight at once with very little overhead. Why don’t we make use of that to process these URLs faster?
We will make use of the ThreadPoolExecutor from the concurrent.futures library. It is super easy to use. Let me show you some code and then explain how it works.

import requests
from concurrent.futures import ThreadPoolExecutor, as_completed
from time import time
url_list = [
    "https://via.placeholder.com/400",
    "https://via.placeholder.com/410",
    "https://via.placeholder.com/420",
    "https://via.placeholder.com/430",
    "https://via.placeholder.com/440",
    "https://via.placeholder.com/450",
    "https://via.placeholder.com/460",
    "https://via.placeholder.com/470",
    "https://via.placeholder.com/480",
    "https://via.placeholder.com/490",
    "https://via.placeholder.com/500",
    "https://via.placeholder.com/510",
    "https://via.placeholder.com/520",
    "https://via.placeholder.com/530",
]

def download_file(url):
    html = requests.get(url, stream=True)
    return html.status_code

start = time()
processes = []
with ThreadPoolExecutor(max_workers=10) as executor:
    for url in url_list:
        # submit() schedules the call on the pool and returns a Future
        processes.append(executor.submit(download_file, url))

# Print the status codes in the order the tasks completed
for task in as_completed(processes):
    print(task.result())

print(f'Time taken: {time() - start}')

Output:
<--truncated-->
Time taken: 0.4583399295806885

We just sped up our code by a factor of about 9, and we didn’t even do anything super involved. With more URLs, the absolute time saved would be even greater.
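To see where the speedup comes from without touching the network, here is a minimal sketch that replaces the HTTP request with a short sleep. The fake_download function and the 50 ms delay are assumptions made for illustration, and executor.map is used here as a shorter alternative to submit when you only need the results in input order.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_download(delay):
    """Hypothetical stand-in for a network request: it only waits."""
    time.sleep(delay)
    return delay

delays = [0.05] * 10  # ten simulated requests of 50 ms each

# Sequential: every wait happens one after the other, so they add up.
start = time.perf_counter()
sequential_results = [fake_download(d) for d in delays]
sequential_time = time.perf_counter() - start

# Threaded: the waits overlap, because a sleeping thread lets the others run.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=10) as executor:
    threaded_results = list(executor.map(fake_download, delays))
threaded_time = time.perf_counter() - start

print(f"Sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

The exact numbers depend on your machine, but the threaded run should take roughly as long as a single wait, while the sequential run takes the sum of all of them.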
So what is happening? Each call to executor.submit adds a new task to the thread pool and returns a Future object that represents it. We store these objects in the processes list (despite the name, the tasks run in threads, not processes). Later we iterate over the list and print out each result.
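Because tasks finish in an unpredictable order, it often helps to remember which input each one belongs to. The sketch below keeps a dictionary from each Future to its URL. The fake_status function and the example.com URLs are hypothetical placeholders so that the snippet runs without a network connection.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def fake_status(url):
    """Hypothetical stand-in for download_file: no network involved."""
    return 200 if url.startswith("https://") else 400

urls = [
    "https://example.com/a",
    "https://example.com/b",
    "ftp://example.com/c",
]

results = {}
with ThreadPoolExecutor(max_workers=3) as executor:
    # submit() returns a Future immediately; map it back to its URL.
    future_to_url = {executor.submit(fake_status, url): url for url in urls}
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        try:
            # result() re-raises any exception raised inside the worker.
            results[url] = future.result()
        except Exception as exc:
            results[url] = exc

for url in urls:
    print(url, results[url])
```

Wrapping result() in try/except matters with real requests: one failed download then shows up as a stored exception instead of stopping the whole loop.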
The as_completed function yields the tasks from the processes list as soon as they complete. A task can reach the completed state in two ways: it has either finished executing or it was cancelled. We could also have passed a timeout parameter to as_completed. In that case, if some tasks are still unfinished once that many seconds have passed since the call to as_completed, the iterator raises a TimeoutError instead of yielding them.
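Here is a small sketch of what the timeout looks like in practice. The slow_task function and the chosen delays are assumptions for illustration: one task finishes well within the timeout, the other does not.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError, as_completed

def slow_task(delay):
    """Hypothetical task that just sleeps for the given number of seconds."""
    time.sleep(delay)
    return delay

finished = []
timed_out = False
with ThreadPoolExecutor(max_workers=2) as executor:
    futures = [executor.submit(slow_task, d) for d in (0.05, 0.6)]
    try:
        # The timeout is counted from this call to as_completed.
        for future in as_completed(futures, timeout=0.2):
            finished.append(future.result())
    except TimeoutError:
        # Raised because the 0.6 s task was still running after 0.2 s.
        timed_out = True

print(finished, timed_out)
```

Note that the timeout only stops the waiting. The slow task keeps running in its thread, and leaving the with block still waits for it to finish.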
You should explore multithreading a bit more. For small projects it is often the quickest way to speed up your code. If you want to learn more, read the official docs. They are super helpful.
