greenback: reenter an asyncio or Trio event loop from synchronous code

Python 3.5 introduced async/await syntax for defining functions that can run concurrently in a cooperative multitasking framework such as asyncio or Trio. Such frameworks have a number of advantages over previous approaches to concurrency: they scale better than threads and are clearer about control flow than the implicit cooperative multitasking provided by gevent. They’re also being actively developed to explore some exciting new ideas about concurrent programming.

Porting an existing codebase to async/await syntax can be challenging, though, since it’s somewhat “viral”: only an async function can call another async function. That means you don’t just have to modify the functions that actually perform I/O; you also need to (trivially) modify every function that directly or indirectly calls a function that performs I/O. While the results are generally an improvement (“explicit is better than implicit”), getting there in one big step is not always feasible, especially if some of these layers are in libraries that you don’t control.

greenback is a small library that attempts to bridge this gap. It allows you to call back into async code from a syntactically synchronous function, as long as the synchronous function was originally called from an async task (running in an asyncio or Trio event loop) that set up a greenback “portal” as explained below. This is potentially useful in a number of different situations:

  • You can interoperate with some existing libraries that are not async/await aware, without pushing their work off into another thread.

  • You can migrate an existing program to async/await syntax one layer at a time, instead of all at once.

  • You can (cautiously) design async APIs that block in places where you can’t write await, such as on attribute accesses.

greenback requires Python 3.8 or later and an implementation that supports the greenlet library. Either CPython or PyPy should work. There are no known OS dependencies.

Quickstart

  • Call await greenback.ensure_portal() at least once in each async task that will be using greenback. (Additional calls in the same task do nothing.) You can think of this as creating a portal that will be used by future calls to greenback.await_() in the same task.

  • Later, use greenback.await_(foo()) as a replacement for await foo() in places where you can’t write await.

  • If all of the places where you want to use greenback.await_() are indirectly within a single function, you can eschew the await greenback.ensure_portal() and instead write a wrapper around calls to that function: await greenback.with_portal_run(...) for an async function, or await greenback.with_portal_run_sync(...) for a synchronous function. These have the advantage of cleaning up the portal (and its associated minor performance impact) as soon as the function returns, rather than leaving it open until the task terminates.

  • For more details and additional helpers, read the rest of this documentation!

Detailed documentation

Principle of operation

This section attempts to confer a basic sense of how greenback works.

Async/await basics and limitations

Python’s async/await syntax goes to some lengths to look like normal straight-line Python code: the async version of some logic looks pretty similar to the threaded or non-concurrent version, just with extra async and await keywords. Under the hood, though, an async callstack is represented as something very much like a generator. When some async function in the framework you’re using needs to suspend the current callstack and run other code for a bit, it effectively yields an object that tells the event loop about its intentions. The exact nature of these “traps” is a private implementation detail of the particular framework you’re using. For example, asyncio yields Futures, curio yields specially formatted tuples, and Trio (currently) yields internal objects representing the two primitive operations cancel_shielded_checkpoint() and wait_task_rescheduled(). For much more detail on how async/await work (more approachably explained, too!), see the excellent writeup by Brett Cannon: How the heck does async/await work in Python 3.5?
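To make the "trap" idea concrete, here is a small standard-library sketch that plays the role of a toy event loop for a single step. The trap value (a plain string) and the names `trap`, `inner`, `outer`, and `drive` are made up for this illustration; real frameworks yield their own private objects.

```python
import types


@types.coroutine
def trap(request):
    # A bare `yield` inside a generator-based coroutine is the only place
    # where a value actually leaves the coroutine stack.
    response = yield request
    return response


async def inner():
    return await trap("please reschedule me")


async def outer():
    value = await inner()
    return 2 * value


def drive(coro, response):
    """Act as a toy event loop for a coroutine that traps exactly once."""
    trapped = coro.send(None)       # runs until the first trap is yielded
    try:
        coro.send(response)         # resume, delivering the loop's answer
    except StopIteration as exc:
        return trapped, exc.value   # the coroutine's return value
    raise RuntimeError("coroutine yielded more than once")
```

Calling `drive(outer(), 21)` shows the request travelling up through both `await` levels and the response travelling back down.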

Common to both async functions and generators is the limitation that one can only directly yield out of the function containing the yield statement. If you want the yielded value to propagate multiple levels up the callstack – all the way to the event loop, in the async case – you need cooperation at each level, in the form of an await (in an async function) or yield from (in a generator) statement. This means every point where execution might be suspended is marked in the source code, which is a useful property. It also means that adding some I/O (which might block, thus needs to be able to suspend execution) at the bottom of a long chain of formerly-synchronous functions requires adding async and await keywords at every level of the chain. That property is sometimes problematic. To be sure, doing that work can reveal important bugs (read Unyielding if you don’t believe it); but sometimes it’s just an infeasible amount of work to be doing all at once in the first place, especially if you need to interoperate with a large project and/or external dependencies that weren’t written to support async/await.
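The need for cooperation at each level can be seen with plain generators. In this sketch (all names hypothetical), the caller that uses `yield from` lets the leaf's yield reach the driver, while the caller that makes an ordinary call merely receives an unstarted generator object.

```python
import types


def leaf():
    reply = yield "trap"            # the only real suspension point
    return f"leaf got {reply}"


def cooperating_caller():
    # `yield from` forwards leaf's yields to whoever drives this generator.
    result = yield from leaf()
    return result


def plain_caller():
    # An ordinary call only builds a generator object: nothing in leaf()
    # runs, and plain_caller() itself cannot be suspended.
    return leaf()


def drive(gen, reply):
    """Drive a generator that yields exactly once."""
    trapped = next(gen)
    try:
        gen.send(reply)
    except StopIteration as exc:
        return trapped, exc.value
    raise RuntimeError("generator yielded more than once")
```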

“Reentering the event loop” (letting a regular synchronous function call an async function using the asynchronous context that exists somewhere further up the callstack) has therefore been a popular Python feature request, but unfortunately it’s somewhat fundamentally at odds with how generators and async functions are implemented internally. The CPython interpreter uses a few levels of the C call stack to implement each level of the running Python call stack, as is natural. The Python frame objects and everything they reference are allocated on the heap, but the C stack is still used to track the control flow of which functions called which. Since C doesn’t have any support for suspending and resuming a callstack, Python yield necessarily turns into C return, and the later resumption of the generator or async function is accomplished by a fresh C-level call to the frame-evaluation function (using the same frame object as before). Yielding out of a 10-level callstack requires 10 times as many levels of C return, and resuming it requires 10 times as many new nested calls to the frame evaluation function. This strategy requires the Python interpreter to be aware of each place where execution of a generator or async function might be suspended, and performance and comprehensibility both argue against allowing such suspensions to occur in every operation.

Sounds like we’re out of luck, then. Or are we?

Greenlets: a different approach

Before Python had async/await or yield from, before it even had context managers or generators with send() and throw() methods, a third-party package called greenlet provided support for a very different way of suspending a callstack. This one required no interpreter support or special keyword because it worked at the C level, copying parts of the C stack to and from the heap in order to implement suspension and resumption. The approach required deep architecture-specific magic and elicited a number of subtle bugs, but those have generally been worked out in the years since the first release in 2006, such that the package is now considered pretty stable.

The greenlet package has spawned its own fair share of concurrency frameworks such as gevent and eventlet. If those meet your needs, you’re welcome to use them and never give async/await another glance. For our purposes, though, we’re interested in using just the greenlet primitive: the ability to suspend a callstack of ordinary synchronous functions that haven’t been marked with any special syntax.

Using greenlets to bridge the async/sync divide

From the perspective of someone writing an async function, your code is the only thing running in your thread until you yield or yield from or await, at which point you will be suspended until your top-level caller (such as the async event loop) sees fit to resume you. From the perspective of someone writing an async event loop, the perspective is reversed: each “step” of the async function (represented by a send() call on the coroutine object) cedes control to an async task until the task feels like yielding control back to the event loop. Our goal is to allow something other than a yield statement to make this send() call return.

greenback achieves this by introducing a “shim” coroutine in between the async event loop and your task’s “real” coroutine. Whenever the event loop wants to run the next step of your task, it runs a step of this “shim” coroutine, which creates a greenlet for running the next step of the underlying “real” coroutine. This greenlet terminates when the “real” send() call does (corresponding to a yield statement inside your async framework), but because it’s a greenlet, we can also suspend it at a time of our choosing even in the absence of any yield statements. greenback.await_() makes use of that capability by repeatedly calling send() on its argument coroutine, using the greenlet-switching machinery to pass the yielded traps up to the event loop and get their responses sent back down.
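The following is a heavily simplified sketch of this idea, not greenback's actual implementation. It assumes the greenlet package is installed, wraps a synchronous function rather than each step of a task's coroutine, and does not forward exceptions. The names `trap`, `toy_await`, `toy_shim`, `sync_code`, and `drive` are hypothetical.

```python
import types

import greenlet


@types.coroutine
def trap(request):
    # Stand-in for a framework-private trap.
    return (yield request)


def toy_await(coro):
    """Step `coro` from synchronous code running in a child greenlet."""
    loop_side = greenlet.getcurrent().parent
    to_send = None
    while True:
        try:
            trapped = coro.send(to_send)
        except StopIteration as exc:
            return exc.value
        # Suspend this synchronous callstack. The shim yields the trap to
        # the event loop and later switches back with the loop's response.
        to_send = loop_side.switch(trapped)


@types.coroutine
def toy_shim(sync_fn):
    """Run sync_fn in a greenlet, yielding its traps like a coroutine would."""
    child = greenlet.greenlet(sync_fn)
    value = child.switch()
    while not child.dead:
        response = yield value
        value = child.switch(response)
    return value                    # sync_fn's return value


def sync_code():
    # No async or await keywords anywhere in this function.
    return toy_await(trap("first")) + toy_await(trap("second"))


def drive(coro, responses):
    """Toy event loop: answer each trap with the next canned response."""
    traps, to_send = [], None
    responses = iter(responses)
    while True:
        try:
            traps.append(coro.send(to_send))
        except StopIteration as exc:
            return traps, exc.value
        to_send = next(responses)
```

From the driver's point of view, `toy_shim(sync_code)` behaves like any other coroutine, even though the suspension happens inside ordinary synchronous calls.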

Once you understand the approach, most of the remaining trickery is in the answer to the question: “how do we install this shim coroutine?” In Trio you can directly replace trio.lowlevel.Task.coro with a wrapper of your choosing, but in asyncio the ability to modify the analogous field is not exposed publicly, and on CPython it’s not even exposed to Python at all. It’s necessary to use ctypes to edit the coroutine pointer in the C task object, and fix up the reference counts accordingly. This works well once the details are ironed out. There’s some additional glue to deal with exception propagation, non-coroutine awaitables, and so forth, but the core of the implementation is not very much changed from the gist sketch that originally inspired it.

What can I do with this, anyway?

You really can switch greenlets almost anywhere, which means you can call greenback.await_() almost anywhere too. For example, you can perform async operations in magic methods: operator implementations, property getters and setters, object initializers, you name it. You can use combinators that were written for synchronous code (map, filter, sum, everything in itertools, and so forth) with asynchronous operations (though they’ll still execute serially within that call – the only concurrency you obtain is with other async tasks). You can use libraries that support a synchronous callback, and actually run async code inside the callback. And when this async code blocks waiting for something, it will play nice and allow all your other async tasks to run.

You may find greenback.autoawait useful in some of these situations: it’s a decorator that turns an async function into a synchronous one. There are also greenback.async_context and greenback.async_iter for sync-ifying async context managers and async iterators, respectively.

If you’re feeling reckless, you can even use greenback to run async code in places you might think impossible, such as finalizers (__del__ methods), weakref callbacks, or perhaps even signal handlers. This is not recommended (your async library will not be happy, to put it mildly, if the signal arrives or GC occurs in the middle of its delicate task bookkeeping) but it seems that you can get away with it some reasonable fraction of the time. Don’t try these in production, though!

All of these are fun to play with, but in most situations the ergonomic benefit is not going to be worth the “spooky action at a distance” penalty. The real benefits probably come mostly when working with large established non-async projects. For example, you could write a pytest plugin that surrounds the entire run in a call to trio.run(), with greenback.await_() used at your leisure to escape back into a shared async context. Perhaps this could allow running multiple async tests in parallel in the same thread. At this point such things are only vague ideas, which may well fail to work out. The author’s hope is that greenback gives you the tool to pursue whichever ones seem worthwhile to you.

What’s the performance impact?

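Running anything with a greenback portal available incurs some slowdown, and actually using await_() incurs some more. Neither is extreme: in the microbenchmarks below, each adds a microsecond or two to a checkpoint that already takes roughly 13 microseconds.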

The slowdown due to greenback is mostly proportional to the number of times you yield to the event loop with a portal active, as well as the number of portal creations and await_() calls you perform. You can run the microbenchmark.py script from the Git repository to see the numbers on your machine. On a 2023 MacBook Pro (ARM64), with CPython 3.12, greenlet 3.0.3, and Trio 0.24.0, I get:

  • Baseline: The simplest possible async operation is what Trio calls a checkpoint: yield to the event loop and ask to immediately be rescheduled again. This takes about 13.6 microseconds on Trio and 12.9 microseconds on asyncio. (asyncio is able to take advantage of some C acceleration here.)

  • Adding the greenback portal, without making any await_() calls yet, adds about 1 microsecond per checkpoint.

  • Executing each of those checkpoints through a separate await_() adds about another 2 microseconds per await_(). (Surrounding the entire checkpoint loop in a single await_(), by contrast, has negligible impact.)

  • Creating a new portal for each of those await_(checkpoint()) invocations adds another 16 microseconds or so per portal creation. If you use with_portal_run_sync(), portal creation gets about 10 microseconds faster (so the portal is only adding about 6 microseconds of overhead).

Keep in mind that these are microbenchmarks: your actual program is probably not executing checkpoints in a tight loop! The more work you’re doing each time you’re scheduled, the less overhead greenback will entail.

API reference

Creating a portal

In order to use greenback in a particular async task, you must first create a greenback portal for that task to use. You may choose between:

  • ensure_portal(): Create a portal to be used by the current task, which lasts for the lifetime of that task. Use case: minimally invasive code change to allow greenback.await_() in a particular task.

  • bestow_portal(): Create a portal to be used by some other specified task, which lasts for the lifetime of that task. Use case: enabling greenback in a task without that task’s cooperation, which may be useful in some debugging and instrumentation situations. (with_portal_run_tree() is implemented using a Trio instrument that calls bestow_portal() on certain newly spawned tasks.)

  • with_portal_run(): Run an async function (in the current task) that might eventually make calls to await_(), with a portal available for at least the duration of that call. Use case: less “magical” than ensure_portal(); keeps the portal (and its performance impact) scoped to just the portion of a task that needs it.

  • with_portal_run_sync(): Run a synchronous function (in the current task) that might eventually make calls to await_(), with a portal available for at least the duration of that call. Use case: same as with_portal_run(), but the implementation is simpler and will be a bit faster (probably only noticeable if the function you’re running is very short).

  • with_portal_run_tree(): Run an async function (in the current task) that can make calls to await_() both itself and in all of its child tasks, recursively. Available on Trio only, since asyncio lacks a clear task tree and also lacks the instrumentation features required to implement this. Use case: minimally invasive code change to allow greenback.await_() in an entire subsystem of your Trio program.

You can use has_portal() to determine whether a portal has already been set up.

await greenback.ensure_portal()

Ensure that the current async task is able to use greenback.await_().

If the current task has called ensure_portal() previously, calling it again is a no-op. Otherwise, ensure_portal() interposes a “coroutine shim” provided by greenback in between the event loop and the coroutine being used to run the task. For example, when running under Trio, trio.lowlevel.Task.coro is replaced with a wrapper around the coroutine it previously referred to. (The same thing happens under asyncio, but asyncio doesn’t expose the coroutine field publicly, so some additional trickery is required in that case.)

After installation of the coroutine shim, each task step passes through greenback on its way into and out of your code. At some performance cost, this effectively provides a portal that allows later calls to greenback.await_() in the same task to access an async environment, even if the function that calls await_() is a synchronous function.

This function is a cancellation point and a schedule point (a checkpoint, in Trio terms) even if the calling task already had a portal set up.

greenback.bestow_portal(task)

Ensure that the given async task is able to use greenback.await_().

This works like calling ensure_portal() from within task, with one exception: if you pass the currently running task, then the portal will not become usable until after the task yields control to the event loop.

await greenback.with_portal_run(async_fn, *args, **kwds)

Execute await async_fn(*args, **kwds) in a context that is able to use greenback.await_().

If the current task already has a greenback portal set up via a call to one of the other greenback.*_portal() functions, then with_portal_run() simply calls async_fn. If async_fn uses greenback.await_(), the existing portal will take care of it.

Otherwise (if there is no portal already available to the current task), with_portal_run() creates a new portal which lasts only for the duration of the call to async_fn. If async_fn then calls ensure_portal(), an additional portal will not be created: the task will still have just the portal installed by with_portal_run(), which will be removed when async_fn returns.

This function does not add any cancellation point or schedule point beyond those that already exist inside async_fn.

await greenback.with_portal_run_sync(sync_fn, *args, **kwds)

Execute sync_fn(*args, **kwds) in a context that is able to use greenback.await_().

If the current task already has a greenback portal set up via a call to one of the other greenback.*_portal() functions, then with_portal_run_sync() simply calls sync_fn. If sync_fn uses greenback.await_(), the existing portal will take care of it.

Otherwise (if there is no portal already available to the current task), with_portal_run_sync() creates a new portal which lasts only for the duration of the call to sync_fn.

This function does not add any cancellation point or schedule point beyond those that already exist due to any await_()s inside sync_fn.

await greenback.with_portal_run_tree(async_fn, *args, **kwds)

Execute await async_fn(*args, **kwds) in a context that allows use of greenback.await_() both in async_fn itself and in any tasks that are spawned into child nurseries of async_fn, recursively.

You can use this to create an entire Trio run (except system tasks) that runs with greenback.await_() available: say trio.run(with_portal_run_tree, main).

This function does not add any cancellation point or schedule point beyond those that already exist inside async_fn.

Availability: Trio only.

Note

The automatic “portalization” of child tasks is implemented using a Trio instrument, which has a small performance impact on task spawning for the entire Trio run. To minimize this impact, a single instrument is used even if you have multiple with_portal_run_tree() calls running simultaneously, and the instrument will be removed as soon as all such calls have completed.

greenback.has_portal(task=None)

Return true if the given task is currently able to use greenback.await_(), false otherwise. If no task is specified, query the currently executing task.

Using the portal

Once you’ve set up a portal using any of the above functions, you can use it to run async functions by making calls to greenback.await_():

greenback.await_(awaitable)

Run an async function or await an awaitable from a synchronous function, using the portal set up for the current async task by ensure_portal(), bestow_portal(), with_portal_run(), with_portal_run_sync(), or with_portal_run_tree().

greenback.await_(foo()) is equivalent to await foo(), except that the greenback version can be written in a synchronous function while the native version cannot.

Additional utilities

greenback comes with a few tools (built atop await_()) which may be helpful when adapting async code to work with synchronous interfaces.

@greenback.autoawait

Decorator for an async function which allows (and requires) it to be called from synchronous contexts without await.

For example, this can be used for magic methods, property setters, and so on.

@greenback.decorate_as_sync(decorator: Callable[[F], F]) -> Callable[[AF], AF]
@greenback.decorate_as_sync(decorator: Callable[..., Any]) -> Callable[[Callable[..., Awaitable[Any]]], Callable[..., Awaitable[Any]]]

Wrap the synchronous function decorator decorator so that it can be used to decorate an async function.

This can be used, for example, to apply an async-naive decorator such as @functools.lru_cache() to an async function:

@greenback.decorate_as_sync(functools.lru_cache(maxsize=128))
async def some_fn(...): ...

Without the wrapping in decorate_as_sync(), the LRU cache would treat the inner function as a synchronous function, and would therefore unhelpfully cache the coroutine object that is returned when an async function is called without await.
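The problem can be reproduced with the standard library alone. In this sketch (`lookup` and `main` are hypothetical names), the second call hits the cache and gets back the same, already-awaited coroutine object, which cannot be awaited again.

```python
import asyncio
import functools


@functools.lru_cache(maxsize=128)
async def lookup(key):
    await asyncio.sleep(0)
    return key.upper()


async def main():
    first = await lookup("a")       # awaits a fresh coroutine object
    try:
        await lookup("a")           # cache hit: the same, spent object
    except RuntimeError as exc:
        return first, type(exc).__name__
    return first, None
```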

Internally, the “inner” async function is wrapped in a synchronous function that invokes that async function using greenback.await_(). This synchronous function is then decorated with the decorator. decorate_as_sync() returns an “outer” async function which invokes the internal decorated synchronous function using greenback.with_portal_run_sync().

In other words, the following two calls behave identically:

result = await greenback.decorate_as_sync(decorator)(async_fn)(*args, **kwds)
result = await greenback.with_portal_run_sync(
    decorator(greenback.autoawait(async_fn)), *args, **kwds,
)

with greenback.async_context(async_cm)

Wraps an async context manager so it is usable in a synchronous with statement. That is, with async_context(foo) as bar: behaves equivalently to async with foo as bar: as long as a portal has been created somewhere up the callstack.

for ... in greenback.async_iter(async_iterable)

Wraps an async iterable so it is usable in a synchronous for loop, yield from statement, or similar synchronous iteration context. That is, for elem in async_iter(foo): behaves equivalently to async for elem in foo: as long as a portal has been created somewhere up the callstack.

If the obtained async iterator implements the full async generator protocol (asend(), athrow(), and aclose() methods), then the returned synchronous iterator implements the corresponding methods send(), throw(), and close(). This allows for better interoperation with yield from, for example.

Release history

greenback 1.2.1 (2024-02-20)

Bugfixes
  • greenback now uses deferred evaluation for its type hints. This resolves an incompatibility with less-than-bleeding-edge versions of outcome that was inadvertently introduced in the 1.2.0 release. (#30)

greenback 1.2.0 (2024-02-07)

With this release, greenback now requires at least Python 3.8.

Features
  • greenback’s internals have been reorganized to improve the performance of executing ordinary checkpoints (await statements, approximately) in a task that has a greenback portal active. On the author’s laptop with CPython 3.12, the overhead is only about one microsecond compared to the performance without greenback involved, versus four microseconds before this change. For comparison, the non-greenback cost of executing a checkpoint is 12-13 microseconds. (#26)

Bugfixes
  • greenback now properly handles cases where a task spawns another greenlet (not managed by greenback) that in turn calls greenback.await_(). This improves interoperability with other greenback-like systems that do not use the greenback library, such as SQLAlchemy’s async ORM support. (#22)

  • greenback.has_portal() now returns False if run in a task that has called greenback.bestow_portal() on itself but has not yet made the portal usable by executing a checkpoint. This reflects the fact that greenback.await_() in such a task will fail. The exception message for such an await_() failure has also been updated to more precisely describe the problem, rather than the previous generic “you must create a portal first”. (#26)

greenback 1.1.2 (2023-12-28)

Bugfixes
  • Public exports now use from ._submod import X as X syntax so that type checkers will know they’re public exports. (#23)

greenback 1.1.1 (2023-03-01)

Bugfixes
  • greenback.has_portal() now returns False, instead of raising an error, if it is called within an asyncio program in a context where no task is running (such as a file descriptor readability callback). (#16)

  • Fixed a bug that could result in inadvertent sharing of context variables. Specifically, when one task that already had a greenback portal set up called greenback.bestow_portal() on a different task, the second task could wind up sharing the first task’s contextvars context. (#17)

greenback 1.1.0 (2022-01-05)

Features
Bugfixes

greenback 1.0.0 (2021-11-23)

Features
Bugfixes
  • Add support for newer (1.0+) versions of greenlet, which expose a gr_context attribute directly, allowing us to remove the hacks that were added to support 0.4.17. greenlet 0.4.17 is no longer supported, but earlier (contextvar-naive) versions should still work. (#8)

  • We no longer assume that greenback.bestow_portal() is invoked from the “main” greenlet of the event loop. This was not a safe assumption: any task running with access to a greenback portal runs in a separate greenlet, and it is quite plausible that such a task might want to bestow_portal() on another task. (#9)

greenback 0.3.0 (2020-10-13)

Features
Bugfixes
  • Work around a regression introduced by greenlet 0.4.17’s attempt at adding contextvars support. (#5)

Documentation improvements

greenback 0.2.0 (2020-06-29)

Features
  • Added greenback.bestow_portal(), which enables greenback for a task from outside of that task. (#1)

  • Added support for newer versions of Trio with a trio.lowlevel module rather than trio.hazmat. Older versions of Trio remain supported.

greenback 0.1.0 (2020-05-02)

Initial release.
