The Challenges of Async Python Observability: Introduction to FastAPI and asyncio 1/3

By Sümer Cip, on Jan 05, 2022

Blog post series index:

Introduction to FastAPI and asyncio (you are here)
Profiling Asynchronous Code
Blackfire to the Rescue (to be published)

FastAPI is a modern, high-performance Python web framework that has been gaining a lot of attention lately. Under the hood, it uses Starlette, an ASGI web framework that is itself built on asyncio. ASGI (Asynchronous Server Gateway Interface) is the successor of WSGI, the protocol traditionally used between web servers and Python web applications and frameworks. asyncio is part of the Python standard library and is the current standard way of writing concurrent code with the async/await syntax. If you are not familiar with the concept, I highly suggest reading the basics first.

asyncio has been part of the standard library since Python 3.4, but that was not always the case. As a result, there are several other ways of writing asynchronous code in Python using external libraries (greenlet is one example). In this article, when we say “asynchronous code”, we mean code that uses the asyncio library.

In asyncio terms, concurrency does not mean running code in parallel on different processors. asyncio does not give you CPU-level parallelism: an event loop runs its tasks one at a time in a single thread. What it does allow is overlapping the waiting time of blocking I/O operations without using OS threads. There is plenty of good documentation and many blog posts about how asyncio works in Python, so this article does not try to explain its internals extensively. Let’s instead take a high-level look at it from an observability perspective: how do we observe, profile, and monitor asynchronous Python code, especially in FastAPI?
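To make the distinction between concurrency and parallelism concrete, here is a minimal sketch. The fake_io coroutine is a hypothetical stand-in (asyncio.sleep plays the role of a network wait); it is not taken from any real application. Three waits of 0.2 seconds overlap in a single thread, so the total should be close to 0.2 seconds rather than 0.6.

```python
import asyncio
import threading
import time


async def fake_io(name, delay):
    # asyncio.sleep stands in for a network or disk wait (hypothetical workload).
    await asyncio.sleep(delay)
    # Report which OS thread resumed this coroutine.
    return name, threading.current_thread().name


async def main():
    start = time.perf_counter()
    # The three waits overlap, so the total time is close to the longest
    # delay, not to the sum of the delays.
    results = await asyncio.gather(
        fake_io("a", 0.2), fake_io("b", 0.2), fake_io("c", 0.2)
    )
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = asyncio.run(main())
    print(results, f"{elapsed:.2f}s")
```

Every coroutine reports the same thread name: the waits were overlapped by the event loop, not spread across processors.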

How does asynchronous code work under the hood?

Let’s take a look at an example to see what is actually happening inside the interpreter when an HTTP request reaches an ASGI web server. Let’s assume that we have the following code in our HTTP view:

import asyncio
from fastapi import FastAPI, Request

app = FastAPI()

@app.get("/")
async def home(request: Request):
    tasks = [do_http_request(), asyncio.to_thread(fetch_from_db_sync)]
    results = await asyncio.gather(*tasks)  # wait for both to finish
    return {"results": results}

We have two separate, unrelated awaitables: a coroutine that sends an HTTP request to a microservice in our network (do_http_request), and a synchronous function that fetches something from the database (fetch_from_db_sync), wrapped with asyncio.to_thread because we assume there is no async version of the DB library. Doing multiple I/O operations in a single HTTP transaction is pretty standard these days, and these kinds of situations are where ASGI shines. Let’s visualize what happens:

[Figure: asyncio flow]

Please note that, as we do not have an asynchronous version of our DB library, we use the helper function asyncio.to_thread (available since Python 3.9) to run it in another thread and await its result without blocking the event loop. This is a standard way of mixing synchronous and asynchronous code. In fact, FastAPI allows you to write synchronous views, which it executes in a separate thread in a similar fashion.
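The view can be tried without a web server. The sketch below reuses the names do_http_request and fetch_from_db_sync from the example, but their bodies are hypothetical stand-ins that only sleep; a real application would call an HTTP client and a database driver instead. Each function records the thread it ran in, which should show the synchronous function running outside the event loop thread.

```python
import asyncio
import threading
import time


async def do_http_request():
    # Stand-in for an awaitable HTTP client call (assumption: 0.3 s latency).
    await asyncio.sleep(0.3)
    return {"source": "http", "thread_id": threading.get_ident()}


def fetch_from_db_sync():
    # Stand-in for a blocking DB call. time.sleep blocks the worker thread
    # that asyncio.to_thread provides, not the event loop thread.
    time.sleep(0.3)
    return {"source": "db", "thread_id": threading.get_ident()}


async def home():
    # Same shape as the view above, minus the FastAPI decorator and request.
    tasks = [do_http_request(), asyncio.to_thread(fetch_from_db_sync)]
    results = await asyncio.gather(*tasks)
    return {"results": results}


if __name__ == "__main__":
    print(asyncio.run(home()))
```

asyncio.gather returns results in the order the awaitables were passed, so the HTTP result comes first regardless of which operation finishes first.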

The image above illustrates an HTTP request coming in and two tasks being spawned, which then immediately go into an I/O wait state. This I/O wait state is another way of saying “registering an event with the event loop and returning control to it”. Every asyncio application has an event loop where you can register events, e.g. waiting for a socket or a file. When the event loop detects that the wait is over, it switches back to the waiting task and resumes it. All these event loop interactions are transparent to the user. In our example, the heavy lifting is done inside the asyncio library when we await asyncio.gather.
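What “registering an event and returning control” looks like can be sketched with asyncio’s low-level API. This is an illustration only: a timer registered with loop.call_later plays the role of the socket or file event that a real I/O library would register, and the function names are hypothetical.

```python
import asyncio


async def wait_for_event(log):
    loop = asyncio.get_running_loop()
    # A Future represents a result that is not available yet.
    future = loop.create_future()
    # Register a callback with the event loop: after 0.1 s it completes the
    # future. A real I/O library would register a socket or file instead.
    loop.call_later(0.1, future.set_result, "data ready")
    log.append("waiter: suspended")
    result = await future  # control goes back to the event loop here
    log.append("waiter: resumed")
    return result


async def other_work(log):
    # Gets a chance to run while the waiter is suspended.
    log.append("other: running")


async def main():
    log = []
    result, _ = await asyncio.gather(wait_for_event(log), other_work(log))
    return result, log


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The log shows the other task running between the waiter’s suspension and its resumption, which is the interleaving the event loop performs on our behalf.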

In short, the event loop tries to maximize the overlap of blocking I/O operations without creating additional OS threads. At least in theory, this performs and scales better in many situations, namely for I/O-bound code.

In addition to the above, ASGI also aims to have:

  • The ability to send multiple responses for a single request;
  • The ability to receive multiple events (e.g. WebSocket frames).

With these use cases, it becomes easier to support protocols like HTTP/2 and WebSockets.
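One way to picture the first point is a bare ASGI application that answers a single request with several outgoing messages. The sketch below follows the message types defined by the ASGI HTTP specification (http.response.start and http.response.body), but the drive helper is a hypothetical stand-in for a real server such as Uvicorn, and the scope it passes is deliberately incomplete.

```python
import asyncio


async def app(scope, receive, send):
    # A minimal ASGI application: a coroutine taking scope, receive and send.
    await send({
        "type": "http.response.start",
        "status": 200,
        "headers": [(b"content-type", b"text/plain")],
    })
    # One request, several outgoing messages: each chunk is its own event.
    for chunk in (b"part 1\n", b"part 2\n"):
        await send({"type": "http.response.body", "body": chunk, "more_body": True})
    # A final message with more_body=False ends the response.
    await send({"type": "http.response.body", "body": b"", "more_body": False})


async def drive(asgi_app):
    # Hypothetical test harness standing in for an ASGI server.
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    # Real servers put more keys in the scope; this is a trimmed-down example.
    scope = {"type": "http", "method": "GET", "path": "/"}
    await asgi_app(scope, receive, send)
    return sent


if __name__ == "__main__":
    for message in asyncio.run(drive(app)):
        print(message)
```

Frameworks such as Starlette and FastAPI hide this layer behind request and response objects, so application code rarely calls send directly.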

In our next article, we’ll figure out how to find async code performance bottlenecks thanks to profiling.

Happy Performance Optimization!

Sümer Cip

Sümer Cip is currently a Senior Software Engineer at Blackfire. He has been coding for nearly 20 years and has contributed to many open source projects in his career. He has a deep love for Python and an obsession with code performance and low-level stuff.