Python `asyncio` module with a use case

In this article, we will start with a simple use case and, as we build our knowledge, introduce more features by extending it.

We will take a practical approach to asyncio and see how asynchronous programming can improve our code's performance

From the documentation

asyncio is a library to write concurrent code using the async/await syntax.

asyncio is used as a foundation for multiple Python asynchronous frameworks that provide high-performance network and web-servers, database connection libraries, distributed task queues, etc.

asyncio is often a perfect fit for IO-bound and high-level structured network code. More information can be found here

Let's get started

Note: You need Python 3.6 or above to follow along with the examples (async/await was added in Python 3.5, but the timing code uses f-strings, which require 3.6)

Here's the simple use case:

import requests

url = "https://www.google.com"
print(requests.get(url).text)

In the above example, I have used the requests package to fetch a website's content, in this case from Google

The above code is synchronous. Python first compiles it to bytecode, which the Python Virtual Machine (PVM) then executes one instruction at a time, so the program waits at requests.get until the response arrives. For more information on how code is run behind the scenes, you can refer to this blog.

Let's see what changes we need to make in the above example to make it Asynchronous

import requests
import asyncio

async def get(url: str) -> str:
    return requests.get(url).text

coroutine = get("https://www.google.com")
loop = asyncio.get_event_loop()
print(loop.run_until_complete(coroutine))

In the above code snippet

  1. import asyncio - We imported the asyncio module, which provides the machinery for running async code

  2. async def get(url) - Used the async keyword to turn the function into a coroutine function

  3. coroutine = get("https://www.google.com") - Don't get confused by this statement. It looks as if the code inside the get function is being executed, but it is not. Because of the async keyword, calling get no longer runs its body; it returns a coroutine object, which you can check by printing type(coroutine). Coroutines are closely related to Python generators: both can be suspended and resumed

  4. loop = asyncio.get_event_loop() - Gets the event loop, which is used to run async code. The event loop is similar to a while loop that monitors the coroutines, tracks which ones are idle, and looks for tasks that can be executed in the meantime

  5. loop.run_until_complete(coroutine) - Used the run_until_complete method of the event loop to run the get coroutine until it finishes and to retrieve its result
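To make point 3 concrete, here is a minimal sketch that needs no network access. The body of get is a hypothetical stand-in for the real fetch, and asyncio.run (available from Python 3.7) is used instead of the explicit loop only to keep the sketch short.

```python
import asyncio
import inspect


async def get(url: str) -> str:
    # Hypothetical stand-in for a real fetch; no request is made.
    return f"content of {url}"


# Calling the function does not run its body; it only builds a coroutine object.
coroutine = get("https://www.example.com")
is_coroutine = inspect.iscoroutine(coroutine)
print(type(coroutine).__name__, is_coroutine)

# The body runs only when an event loop drives the coroutine.
result = asyncio.run(coroutine)
print(result)
```

Until the coroutine object is handed to an event loop, nothing inside get has run.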

Even though we have used asyncio, the code is still synchronous: there is only one coroutine, and requests.get blocks until the response arrives

Let's introduce some extra processing by fetching the same content multiple times instead of just once as in previous examples

Note: For the sake of the example, I am fetching the same content multiple times; in a real-life scenario you would fetch different content, either by changing parameters or by using different URLs

import time
import requests

def main():
    url = "https://www.google.com"
    for i in range(5):
        print("Started: %d" % i)
        data = requests.get(url).text
        print("Finished: %d" % i)

start = time.perf_counter()
main()
elapsed = time.perf_counter() - start
print(f"Executed in {elapsed:0.2f} seconds")

The above code fetches data from Google five times in a for loop and measures how long the whole run takes.

If you run the code, you will get output similar to the following, although the execution time may differ due to factors such as network, memory, and CPU.

Started: 0
Finished: 0
Started: 1
Finished: 1
Started: 2
Finished: 2
Started: 3
Finished: 3
Started: 4
Finished: 4
Executed in 5.25 seconds

Now let's rewrite the above example with asyncio and see what's the difference

import time
import requests
import asyncio

async def get(url: str) -> str:
    return requests.get(url).text

async def main():
    result = []
    for i in range(5):
        print("Started: %d" % i)
        data = await get('http://www.google.com')
        print("Finished: %d" % i)
        result.append(data)
    return result

loop = asyncio.get_event_loop()
start = time.perf_counter()
loop.run_until_complete(main())
elapsed = time.perf_counter() - start
print(f"Executed in {elapsed:0.2f} seconds")

The code is very similar to the previous example, with a few changes

  1. Added a new keyword, await, which suspends main until get(...) completes and only then moves on to the next statement

  2. Added a main coroutine, which is passed to the loop.run_until_complete method of the event loop
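The effect of awaiting inside a loop can be seen without any network access. In this sketch, fake_fetch is a hypothetical stand-in that uses asyncio.sleep in place of a real request, and asyncio.run (Python 3.7+) replaces the explicit loop for brevity.

```python
import asyncio
import time


async def fake_fetch(i: int) -> int:
    # asyncio.sleep stands in for a non-blocking network call.
    await asyncio.sleep(0.1)
    return i


async def main() -> list:
    results = []
    for i in range(3):
        # Each await must finish before the next iteration starts.
        results.append(await fake_fetch(i))
    return results


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:0.2f} seconds")
```

The total time is roughly the sum of the individual waits, because awaiting one coroutine at a time gives the event loop nothing else to run in the meantime.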

Let's run the code and see what's the difference

Started: 0
Finished: 0
Started: 1
Finished: 1
Started: 2
Finished: 2
Started: 3
Finished: 3
Started: 4
Finished: 4
Executed in 3.99 seconds

As you can see, this run took 3.99 seconds, approx 24 percent less than the synchronous version. However, the requests are still made one after another, so the difference most likely comes from network variance rather than from asyncio; if you run the same code multiple times, it may not perform any better. Let's see why.

In the above code, we await each coroutine ourselves, one at a time. It would be better if asyncio ran the coroutines concurrently for us. Let's see how we can do that

import time
import requests
import asyncio

async def get(url: str) -> str:
    return requests.get(url).text

async def call(i: int, url: str) -> str:
    print("Started i : %d" % i)
    data = await get(url)
    print("Finished i : %d" % i)
    return data

async def main():
    return await asyncio.gather(*(
        call(i, 'http://www.google.com')
        for i in range(5)
    ))

loop = asyncio.get_event_loop()
start = time.perf_counter()
loop.run_until_complete(main())
elapsed = time.perf_counter() - start
print(f"Executed in {elapsed:0.2f} seconds")

  1. Using the asyncio.gather function, we can tell asyncio to run the coroutines and collect the results for us

  2. In the above example, I have used a generator expression, which is similar to a list comprehension

  3. Introduced the call coroutine, which awaits get and prints when it starts and finishes, so we can see which part of the code is executing. The call coroutines are the ones passed to asyncio.gather
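As a rough illustration of what asyncio.gather does when the awaited work really is non-blocking, the sketch below swaps the request for asyncio.sleep. The fake_fetch name and the delay are assumptions made for the example, and asyncio.run needs Python 3.7+.

```python
import asyncio
import time


async def fake_fetch(i: int) -> int:
    # A non-blocking wait: the event loop is free to run other coroutines.
    await asyncio.sleep(0.2)
    return i * 10


async def main() -> list:
    # gather schedules all five coroutines and waits for all of them.
    return await asyncio.gather(*(fake_fetch(i) for i in range(5)))


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:0.2f} seconds")
```

Here the five waits overlap, so the run takes about as long as a single wait rather than five of them.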

There is still one problem: the requests package blocks the event loop, so the event loop cannot start processing the next coroutine until the one currently running completes. As a result, the coroutines still run one after another

There are two ways to solve the problem: either use a different package, such as aiohttp, that does not block the event loop, or call requests in a way that does not block the event loop

Let's look at the second approach, since examples of the first approach are easy to find online for whichever package you decide to use
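The blocking behaviour can be reproduced without requests. In this sketch, time.sleep plays the role of the blocking requests.get call; blocking_call and the events list are hypothetical names used only for illustration.

```python
import asyncio
import time

events = []


async def blocking_call(i: int) -> int:
    events.append(f"start {i}")
    # time.sleep blocks the whole thread, so the event loop cannot switch.
    time.sleep(0.1)
    events.append(f"finish {i}")
    return i


async def main() -> list:
    return await asyncio.gather(*(blocking_call(i) for i in range(3)))


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(events)
print(f"{elapsed:0.2f} seconds")
```

Even under asyncio.gather, each coroutine starts and finishes before the next one begins, and the total time is the sum of the waits.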

import time
import requests
import asyncio

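async def get(url: str) -> str:
    # Inside a coroutine, get_event_loop() returns the loop that is running it
    loop = asyncio.get_event_loop()
    # Run the blocking requests.get in a worker thread and wait for the Response
    response = await loop.run_in_executor(None, requests.get, url)
    return response.text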

async def call(i: int, url: str) -> str:
    print("Started: %d" % i)
    data = await get(url)
    print("Finished: %d" % i)
    return data

async def main():
    return await asyncio.gather(*(
        call(i, 'http://www.google.com')
        for i in range(5)
    ))

loop = asyncio.get_event_loop()
start = time.perf_counter()
loop.run_until_complete(main())
elapsed = time.perf_counter() - start
print(f"Executed in {elapsed:0.2f} seconds")

By calling the event loop's run_in_executor method, we move the execution of requests.get out of the main thread to a different thread, which lets the main thread continue on to the next set of steps

Note: By default, the event loop uses a ThreadPoolExecutor internally to run the passed function/method. For more info, you can check here
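Instead of passing None, an executor can also be supplied explicitly. The following is a sketch under a few assumptions: blocking_fetch is a hypothetical stand-in for requests.get, the pool size of 5 is arbitrary, and asyncio.get_running_loop and asyncio.run require Python 3.7+.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor


def blocking_fetch(i: int) -> int:
    # Stand-in for a blocking call such as requests.get.
    time.sleep(0.2)
    return i


async def main() -> list:
    loop = asyncio.get_running_loop()
    # An explicit pool; passing None instead would use the loop's default executor.
    with ThreadPoolExecutor(max_workers=5) as pool:
        futures = [loop.run_in_executor(pool, blocking_fetch, i) for i in range(5)]
        return await asyncio.gather(*futures)


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:0.2f} seconds")
```

An explicit pool is one way to cap how many blocking calls run at the same time.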

And when you run the code

Started: 0
Started: 1
Started: 2
Started: 3
Started: 4
Finished: 4
Finished: 1
Finished: 0
Finished: 3
Finished: 2
Executed in 0.59 seconds

We can see that all the coroutines started at the same time, while each one finished in an unpredictable order. In terms of performance, it is approx 89 percent faster than the synchronous version
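Although the coroutines finish in an unpredictable order, asyncio.gather returns the results in the order the awaitables were passed in. A small sketch, with the work coroutine and its delays chosen purely for illustration:

```python
import asyncio

finished = []


async def work(i: int, delay: float) -> int:
    await asyncio.sleep(delay)
    # Record the order in which the coroutines actually finish.
    finished.append(i)
    return i


async def main() -> list:
    delays = [0.3, 0.1, 0.2]
    return await asyncio.gather(*(work(i, d) for i, d in enumerate(delays)))


results = asyncio.run(main())
print("finished:", finished)
print("results:", results)
```

So the list returned by main lines up with the inputs even when the completion order differs.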

Note: If you are using Python 3.7 or above, you can replace these two lines

loop = asyncio.get_event_loop()

loop.run_until_complete(main())

with just asyncio.run(main())
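One thing to keep in mind with asyncio.run: once the loop = asyncio.get_event_loop() line is removed, there is no module-level loop variable, so a coroutine that needs the loop has to look it up itself. A minimal sketch, assuming Python 3.7+ and using blocking_fetch as a hypothetical stand-in for requests.get(url).text:

```python
import asyncio
import time


def blocking_fetch(url: str) -> str:
    # Hypothetical stand-in for requests.get(url).text; no request is made.
    time.sleep(0.1)
    return f"content of {url}"


async def get(url: str) -> str:
    # No global loop variable exists here, so ask for the running loop.
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_fetch, url)


async def main() -> list:
    return await asyncio.gather(*(get(f"https://example.com/{i}") for i in range(3)))


# asyncio.run creates a new event loop, runs main() to completion, and closes the loop.
pages = asyncio.run(main())
print(len(pages))
```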

P.S.: Let me know if something can be improved or if I have made any mistakes. Thanks for reading :) Happy learning!!