greenback: reenter an asyncio or Trio event loop from synchronous code¶
Python 3.5 introduced async/await syntax for defining
functions that can run concurrently in a cooperative multitasking
framework such as asyncio or Trio. Such frameworks have a number of advantages
over previous approaches to concurrency: they scale better than threads and are
clearer about control flow than the implicit cooperative multitasking provided
by gevent. They’re also being actively developed to explore some exciting new
ideas about concurrent programming.
Porting an existing codebase to async/await syntax can be
challenging, though, since it’s somewhat “viral”: only an async
function can call another async function. That means you don’t just have
to modify the functions that actually perform I/O; you also need to
(trivially) modify every function that directly or indirectly calls a
function that performs I/O. While the results are generally an improvement
(“explicit is better than implicit”), getting there in one big step is not
always feasible, especially if some of these layers are in libraries that
you don’t control.
greenback is a small library that attempts to bridge this gap. It
allows you to call back into async code from a syntactically
synchronous function, as long as the synchronous function was
originally called from an async task (running in an asyncio or Trio
event loop) that set up a greenback “portal” as explained
below. This is potentially useful in a number of different situations:
- You can interoperate with some existing libraries that are not async/await aware, without pushing their work off into another thread.
- You can migrate an existing program to async/await syntax one layer at a time, instead of all at once.
- You can (cautiously) design async APIs that block in places where you can’t write await, such as on attribute accesses.
greenback requires Python 3.8 or later and an implementation that
supports the greenlet library. Either CPython or PyPy should work.
There are no known OS dependencies.
Quickstart¶
- Call await greenback.ensure_portal() at least once in each async task that will be using greenback. (Additional calls in the same task do nothing.) You can think of this as creating a portal that will be used by future calls to greenback.await_() in the same task.
- Later, use greenback.await_(foo()) as a replacement for await foo() in places where you can’t write await.
- If all of the places where you want to use greenback.await_() are indirectly within a single function, you can eschew the await greenback.ensure_portal() and instead write a wrapper around calls to that function: await greenback.with_portal_run(...) for an async function, or await greenback.with_portal_run_sync(...) for a synchronous function. These have the advantage of cleaning up the portal (and its associated minor performance impact) as soon as the function returns, rather than leaving it open until the task terminates.
For more details and additional helpers, read the rest of this documentation!
Detailed documentation¶
Principle of operation¶
This section attempts to convey a basic sense of how greenback works.
Async/await basics and limitations¶
Python’s async/await syntax goes to some lengths to look like
normal straight-line Python code: the async version of some logic
looks pretty similar to the threaded or non-concurrent version, just
with extra async and await keywords. Under the hood, though,
an async callstack is represented as something very much like a
generator. When some async function in the framework you’re using
needs to suspend the current callstack and run other code for a bit,
it effectively yields an object that tells the event loop about
its intentions. The exact nature of these “traps” is a private
implementation detail of the particular framework you’re using. For
example, asyncio yields Futures, curio yields specially
formatted tuples, and Trio (currently) yields internal objects
representing the two primitive operations
cancel_shielded_checkpoint() and wait_task_rescheduled(). For much more
detail on how async/await work (more approachably explained, too!), see
the excellent writeup by Brett Cannon: How the heck does async/await
work in Python 3.5?
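To make the idea of a “trap” concrete, here is a toy sketch that uses only the standard library. It is not how asyncio or Trio are implemented: the trap helper, the string it yields, and the drive loop are all invented for illustration.

```python
import types


@types.coroutine
def trap(request):
    # A generator-based coroutine: the bare `yield` hands `request` to
    # whoever is driving the coroutine and waits for a reply.
    reply = yield request
    return reply


async def inner():
    return await trap("please reschedule me")


async def outer():
    reply = await inner()
    return f"resumed with {reply!r}"


def drive(coro):
    """Toy 'event loop': answer every trap with the string 'ok'."""
    traps = []
    to_send = None
    while True:
        try:
            request = coro.send(to_send)
        except StopIteration as done:
            # The coroutine finished; its return value rides on StopIteration.
            return done.value, traps
        traps.append(request)
        to_send = "ok"
```

Calling drive(outer()) shows the trap travelling from inner() through outer() to the driver, and the reply travelling back down the same chain of await expressions.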
Common to both async functions and generators is the limitation that one
can only directly yield out of the function containing the yield
statement. If you want the yielded value to propagate multiple levels up the
callstack – all the way to the event loop, in the async case – you need
cooperation at each level, in the form of an await (in an async function)
or yield from (in a generator) statement.
This means every point where execution might be suspended is marked
in the source code, which is a useful property. It also means that adding
some I/O (which might block, thus needs to be able to suspend execution)
at the bottom of a long chain of formerly-synchronous functions requires
adding async and await keywords at every level of the chain.
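The need for cooperation at each level can be seen with plain generators. In this small sketch (all names are illustrative), the caller that uses yield from lets the leaf suspend through it, while the caller written as an ordinary function cannot.

```python
def leaf():
    # The only place that actually suspends.
    reply = yield "trap from leaf"
    return f"leaf got {reply}"


def cooperating_caller():
    # `yield from` forwards leaf's yields upward and replies downward.
    return (yield from leaf())


def plain_caller():
    # Without `yield from`, calling leaf() only creates a generator object;
    # nothing inside leaf runs, and nothing can suspend through this frame.
    return leaf()


def drive(gen):
    """Run a generator to completion, answering each yield with 'ok'."""
    yielded = []
    try:
        value = gen.send(None)
        while True:
            yielded.append(value)
            value = gen.send("ok")
    except StopIteration as done:
        return yielded, done.value
```

Making plain_caller() participate means rewriting it as a generator too, and the same goes for every one of its callers. That is the “viral” property, in miniature.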
That property is sometimes problematic. To be sure, doing that work can
reveal important bugs (read Unyielding if you don’t
believe it); but sometimes it’s just an infeasible amount of work to be
doing all at once in the first place, especially if you need to interoperate
with a large project and/or external dependencies that weren’t written to
support async/await.
“Reentering the event loop” (letting a regular synchronous function
call an async function using the asynchronous context that exists
somewhere further up the callstack) has therefore been a popular
Python feature request, but unfortunately it’s somewhat fundamentally
at odds with how generators and async functions are implemented
internally. The CPython interpreter uses a few levels of the C call
stack to implement each level of the running Python call stack, as is
natural. The Python frame objects and everything they reference are
allocated on the heap, but the C stack is still used to track the
control flow of which functions called which. Since C doesn’t have any
support for suspending and resuming a callstack, Python yield
necessarily turns into C return, and the later resumption of the
generator or async function is accomplished by a fresh C-level call to
the frame-evaluation function (using the same frame object as
before). Yielding out of a 10-level callstack requires 10 times as
many levels of C return, and resuming it requires 10 times as many
new nested calls to the frame evaluation function. This strategy
requires the Python interpreter to be aware of each place where
execution of a generator or async function might be suspended, and
performance and comprehensibility both argue against allowing such
suspensions to occur in every operation.
Sounds like we’re out of luck, then. Or are we?
Greenlets: a different approach¶
Before Python had async/await or yield from, before it even had
context managers or generators with send() and throw() methods,
a third-party package called greenlet
provided support for a very different way of suspending a callstack.
This one required no interpreter support or special keyword because it worked
at the C level, copying parts of the C stack to and from the heap in order
to implement suspension and resumption. The approach required deep architecture-specific
magic and elicited a number of subtle bugs, but those have generally been worked out
in the years since the first release in 2006, such that the package is now considered
pretty stable.
The greenlet package has spawned its own fair share of concurrency frameworks
such as gevent and eventlet. If those meet your needs, you’re welcome
to use them and never give async/await another glance. For our purposes, though,
we’re interested in using just the greenlet primitive: the ability to
suspend a callstack of ordinary synchronous functions that haven’t been marked
with any special syntax.
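A small sketch of that primitive, assuming the greenlet package is installed. The function names are made up; the point is that none of them is a generator or an async function, yet the whole nested call stack is suspended and later resumed.

```python
from greenlet import getcurrent, greenlet


def deep_helper(depth):
    # Ordinary recursion, no `yield` or `await` anywhere.
    if depth > 0:
        return deep_helper(depth - 1)
    # Suspend the entire call stack by switching to the parent greenlet;
    # whatever the parent later switches back with becomes the result.
    return getcurrent().parent.switch("paused inside nested calls")


def worker():
    return deep_helper(2).upper()


def demo():
    child = greenlet(worker)
    message = child.switch()          # run worker() until it switches back
    result = child.switch("resume")   # resume it; worker() then finishes
    return message, result, child.dead
```

When a greenlet's function returns, its return value is delivered to the parent as the result of the pending switch() call, which is how demo() ends up seeing the value returned by worker().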
Using greenlets to bridge the async/sync divide¶
From the perspective of someone writing an async function, your code
is the only thing running in your thread until you yield or
yield from or await, at which point you will be suspended
until your top-level caller (such as the async event loop) sees fit to
resume you. From the perspective of someone writing an async event
loop, the perspective is reversed: each “step” of the async function
(represented by a send() call on the coroutine object) cedes
control to an async task until the task feels like yielding control
back to the event loop. Our goal is to allow something other than a
yield statement to make this send() call return.
greenback achieves this by introducing a “shim” coroutine in
between the async event loop and your task’s “real”
coroutine. Whenever the event loop wants to run the next step of your
task, it runs a step of this “shim” coroutine, which creates a
greenlet for running the next step of the underlying “real” coroutine.
This greenlet terminates when the “real” send() call does
(corresponding to a yield statement inside your async framework),
but because it’s a greenlet, we can also suspend it at a time of our
choosing even in the absence of any yield statements.
greenback.await_() makes use of that capability by
repeatedly calling send() on its argument coroutine, using the
greenlet-switching machinery to pass the yielded traps up to the event
loop and get their responses sent back down.
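The following toy model may help picture the mechanism. It is emphatically not greenback's implementation: there is no real event loop or shim coroutine here, the run function plays both roles, and toy_await, trap, fetch, and the ("fetch", x) request format are all invented for this sketch. It assumes the greenlet package is installed.

```python
import types

from greenlet import getcurrent, greenlet


@types.coroutine
def trap(request):
    # Yield a request to whoever drives the coroutine; return the reply.
    reply = yield request
    return reply


async def fetch(x):
    reply = await trap(("fetch", x))
    return reply * 2


def toy_await(coro):
    """Drive `coro` from synchronous code, forwarding its traps upward."""
    parent = getcurrent().parent
    to_send = None
    while True:
        try:
            request = coro.send(to_send)
        except StopIteration as done:
            return done.value
        # Suspend this synchronous call stack; the parent answers the trap.
        to_send = parent.switch(request)


def sync_code():
    # No `await` in sight, yet two async operations run to completion.
    return toy_await(fetch(1)) + toy_await(fetch(10))


def run(sync_fn):
    """Stand-in for shim plus event loop: answers ("fetch", x) with x + 1."""
    child = greenlet(sync_fn)
    seen = []
    result = child.switch()
    while not child.dead:
        seen.append(result)
        result = child.switch(result[1] + 1)
    return result, seen
```

In the real library the parent side is a coroutine that re-yields each trap to asyncio or Trio instead of answering it directly, but the back-and-forth between send() and greenlet switches has the same shape.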
Once you understand the approach, most of the remaining trickery is in
the answer to the question: “how do we install this shim coroutine?”
In Trio you can directly replace trio.lowlevel.Task.coro with a
wrapper of your choosing, but in asyncio the ability to modify the
analogous field is not exposed publicly, and on CPython it’s not even
exposed to Python at all. It’s necessary to use ctypes to edit the
coroutine pointer in the C task object, and fix up the reference counts
accordingly. This works well once the details are ironed out. There’s
some additional glue to deal with exception propagation, non-coroutine
awaitables, and so forth, but the core of the implementation is not very
much changed from the gist sketch that originally inspired it.
What can I do with this, anyway?¶
You really can switch greenlets almost anywhere, which means you can
call greenback.await_() almost anywhere too. For example, you can
perform async operations in magic methods: operator implementations, property
getters and setters, object initializers, you name it. You can use combinators
that were written for synchronous code (map, filter, sum, everything in
itertools, and so forth) with asynchronous operations (though they’ll still
execute serially within that call – the only concurrency you obtain is with
other async tasks). You can use libraries that support a synchronous callback,
and actually run async code inside the callback. And when this async code blocks
waiting for something, it will play nice and allow all your other async tasks to run.
You may find greenback.autoawait useful in some of these situations: it’s
a decorator that turns an async function into a synchronous one. There are also
greenback.async_context and greenback.async_iter for sync-ifying async
context managers and async iterators, respectively.
If you’re feeling reckless, you can even use greenback to run
async code in places you might think impossible, such as finalizers
(__del__ methods), weakref callbacks, or perhaps even signal
handlers. This is not recommended (your async library will not be
happy, to put it mildly, if the signal arrives or GC occurs in the
middle of its delicate task bookkeeping) but it seems that you can
get away with it some reasonable fraction of the time. Don’t try these
in production, though!
All of these are fun to play with, but in most situations the
ergonomic benefit is not going to be worth the “spooky action at a
distance” penalty. The real benefits probably come mostly when working
with large established non-async projects. For example, you could
write a pytest plugin that surrounds the entire run in a call to
trio.run(), with greenback.await_() used at your leisure
to escape back into a shared async context. Perhaps this could allow
running multiple async tests in parallel in the same thread. At this
point such things are only vague ideas, which may well fail to work
out. The author’s hope is that greenback gives you the tool to
pursue whichever ones seem worthwhile to you.
What’s the performance impact?¶
Running anything with a greenback portal available incurs some slowdown,
and actually using await_() incurs some more. The slowdown is not
extreme.
The slowdown due to greenback is mostly proportional to the
number of times you yield to the event loop with a portal active, as well
as the number of portal creations and await_() calls you perform.
You can run the microbenchmark.py script from the Git repository
to see the numbers on your machine. On a 2023 MacBook Pro (ARM64), with
CPython 3.12, greenlet 3.0.3, and Trio 0.24.0, I get:
- Baseline: The simplest possible async operation is what Trio calls a checkpoint: yield to the event loop and ask to immediately be rescheduled again. This takes about 13.6 microseconds on Trio and 12.9 microseconds on asyncio. (asyncio is able to take advantage of some C acceleration here.)
- Adding the greenback portal, without making any await_() calls yet, adds about 1 microsecond per checkpoint.
- Executing each of those checkpoints through a separate await_() adds about another 2 microseconds per await_(). (Surrounding the entire checkpoint loop in a single await_(), by contrast, has negligible impact.)
- Creating a new portal for each of those await_(checkpoint()) invocations adds another 16 microseconds or so per portal creation. If you use with_portal_run_sync(), portal creation gets about 10 microseconds faster (so the portal is only adding about 6 microseconds of overhead).
Keep in mind that these are microbenchmarks: your actual program is
probably not executing checkpoints in a tight loop! The more work
you’re doing each time you’re scheduled, the less overhead greenback will entail.
API reference¶
Creating a portal¶
In order to use greenback in a particular async task, you must first create
a greenback portal for that task to use. You may choose between:
- ensure_portal(): Create a portal to be used by the current task, which lasts for the lifetime of that task. Use case: minimally invasive code change to allow greenback.await_() in a particular task.
- bestow_portal(): Create a portal to be used by some other specified task, which lasts for the lifetime of that task. Use case: enabling greenback in a task without that task’s cooperation, which may be useful in some debugging and instrumentation situations. (with_portal_run_tree() is implemented using a Trio instrument that calls bestow_portal() on certain newly spawned tasks.)
- with_portal_run(): Run an async function (in the current task) that might eventually make calls to await_(), with a portal available for at least the duration of that call. Use case: less “magical” than ensure_portal(); keeps the portal (and its performance impact) scoped to just the portion of a task that needs it.
- with_portal_run_sync(): Run a synchronous function (in the current task) that might eventually make calls to await_(), with a portal available for at least the duration of that call. Use case: same as with_portal_run(), but the implementation is simpler and will be a bit faster (probably only noticeable if the function you’re running is very short).
- with_portal_run_tree(): Run an async function (in the current task) that can make calls to await_() both itself and in all of its child tasks, recursively. Available on Trio only, since asyncio lacks a clear task tree and also lacks the instrumentation features required to implement this. Use case: minimally invasive code change to allow greenback.await_() in an entire subsystem of your Trio program.
You can use has_portal() to determine whether a portal has already
been set up.
- await greenback.ensure_portal()¶
Ensure that the current async task is able to use greenback.await_().
If the current task has called ensure_portal() previously, calling it again is a no-op. Otherwise, ensure_portal() interposes a “coroutine shim” provided by greenback in between the event loop and the coroutine being used to run the task. For example, when running under Trio, trio.lowlevel.Task.coro is replaced with a wrapper around the coroutine it previously referred to. (The same thing happens under asyncio, but asyncio doesn’t expose the coroutine field publicly, so some additional trickery is required in that case.)
After installation of the coroutine shim, each task step passes through greenback on its way into and out of your code. At some performance cost, this effectively provides a portal that allows later calls to greenback.await_() in the same task to access an async environment, even if the function that calls await_() is a synchronous function.
This function is a cancellation point and a schedule point (a checkpoint, in Trio terms) even if the calling task already had a portal set up.
- greenback.bestow_portal(task)¶
Ensure that the given async task is able to use greenback.await_().
This works like calling ensure_portal() from within task, with one exception: if you pass the currently running task, then the portal will not become usable until after the task yields control to the event loop.
- await greenback.with_portal_run(async_fn, *args, **kwds)¶
Execute await async_fn(*args, **kwds) in a context that is able to use greenback.await_().
If the current task already has a greenback portal set up via a call to one of the other greenback.*_portal() functions, then with_portal_run() simply calls async_fn. If async_fn uses greenback.await_(), the existing portal will take care of it.
Otherwise (if there is no portal already available to the current task), with_portal_run() creates a new portal which lasts only for the duration of the call to async_fn. If async_fn then calls ensure_portal(), an additional portal will not be created: the task will still have just the portal installed by with_portal_run(), which will be removed when async_fn returns.
This function does not add any cancellation point or schedule point beyond those that already exist inside async_fn.
- await greenback.with_portal_run_sync(sync_fn, *args, **kwds)¶
Execute sync_fn(*args, **kwds) in a context that is able to use greenback.await_().
If the current task already has a greenback portal set up via a call to one of the other greenback.*_portal() functions, then with_portal_run_sync() simply calls sync_fn. If sync_fn uses greenback.await_(), the existing portal will take care of it.
Otherwise (if there is no portal already available to the current task), with_portal_run_sync() creates a new portal which lasts only for the duration of the call to sync_fn.
This function does not add any cancellation point or schedule point beyond those that already exist due to any await_()s inside sync_fn.
- await greenback.with_portal_run_tree(async_fn, *args, **kwds)¶
Execute await async_fn(*args, **kwds) in a context that allows use of greenback.await_() both in async_fn itself and in any tasks that are spawned into child nurseries of async_fn, recursively.
You can use this to create an entire Trio run (except system tasks) that runs with greenback.await_() available: say trio.run(with_portal_run_tree, main).
This function does not add any cancellation point or schedule point beyond those that already exist inside async_fn.
Availability: Trio only.
Note: The automatic “portalization” of child tasks is implemented using a Trio instrument, which has a small performance impact on task spawning for the entire Trio run. To minimize this impact, a single instrument is used even if you have multiple with_portal_run_tree() calls running simultaneously, and the instrument will be removed as soon as all such calls have completed.
- greenback.has_portal(task=None)¶
Return true if the given task is currently able to use greenback.await_(), false otherwise. If no task is specified, query the currently executing task.
Using the portal¶
Once you’ve set up a portal using any of the above functions, you can use it
to run async functions by making calls to greenback.await_():
- greenback.await_(awaitable)¶
Run an async function or await an awaitable from a synchronous function, using the portal set up for the current async task by ensure_portal(), bestow_portal(), with_portal_run(), or with_portal_run_sync().
greenback.await_(foo()) is equivalent to await foo(), except that the greenback version can be written in a synchronous function while the native version cannot.
Additional utilities¶
greenback comes with a few tools (built atop await_()) which may
be helpful when adapting async code to work with synchronous interfaces.
- @greenback.autoawait¶
Decorator for an async function which allows (and requires) it to be called from synchronous contexts without await.
For example, this can be used for magic methods, property setters, and so on.
- @greenback.decorate_as_sync(decorator: Callable[[F], F]) Callable[[AF], AF] ¶
- @greenback.decorate_as_sync(decorator: Callable[[...], Any]) Callable[[Callable[[...], Awaitable[Any]]], Callable[[...], Awaitable[Any]]]
Wrap the synchronous function decorator decorator so that it can be used to decorate an async function.
This can be used, for example, to apply an async-naive decorator such as @functools.lru_cache() to an async function:
    @greenback.decorate_as_sync(functools.lru_cache(maxsize=128))
    async def some_fn(...):
        ...
Without the wrapping in decorate_as_sync(), the LRU cache would treat the inner function as a synchronous function, and would therefore unhelpfully cache the coroutine object that is returned when an async function is called without await.
Internally, the “inner” async function is wrapped in a synchronous function that invokes that async function using greenback.await_(). This synchronous function is then decorated with the decorator. decorate_as_sync() returns an “outer” async function which invokes the internal decorated synchronous function using greenback.with_portal_run_sync().
In other words, the following two calls behave identically:
    result = await greenback.decorate_as_sync(decorator)(async_fn)(*args, **kwds)
    result = await greenback.with_portal_run_sync(
        decorator(greenback.autoawait(async_fn)), *args, **kwds,
    )
- with greenback.async_context(async_cm)¶
Wraps an async context manager so it is usable in a synchronous with statement. That is, with async_context(foo) as bar: behaves equivalently to async with foo as bar: as long as a portal has been created somewhere up the callstack.
- for ... in greenback.async_iter(async_iterable)¶
Wraps an async iterable so it is usable in a synchronous for loop, yield from statement, or similar synchronous iteration context. That is, for elem in async_iter(foo): behaves equivalently to async for elem in foo: as long as a portal has been created somewhere up the callstack.
If the obtained async iterator implements the full async generator protocol (asend(), athrow(), and aclose() methods), then the returned synchronous iterator implements the corresponding methods send(), throw(), and close(). This allows for better interoperation with yield from, for example.
Release history¶
greenback 1.2.1 (2024-02-20)¶
Bugfixes¶
- greenback now uses deferred evaluation for its type hints. This resolves an incompatibility with less-than-bleeding-edge versions of outcome that was inadvertently introduced in the 1.2.0 release. (#30)
greenback 1.2.0 (2024-02-07)¶
With this release, greenback now requires at least Python 3.8.
Features¶
- greenback’s internals have been reorganized to improve the performance of executing ordinary checkpoints (await statements, approximately) in a task that has a greenback portal active. On the author’s laptop with CPython 3.12, the overhead is only about one microsecond compared to the performance without greenback involved, versus four microseconds before this change. For comparison, the non-greenback cost of executing a checkpoint is 12-13 microseconds. (#26)
Bugfixes¶
- greenback now properly handles cases where a task spawns another greenlet (not managed by greenback) that in turn calls greenback.await_(). This improves interoperability with other greenback-like systems that do not use the greenback library, such as SQLAlchemy’s async ORM support. (#22)
- greenback.has_portal() now returns False if run in a task that has called greenback.bestow_portal() on itself but has not yet made the portal usable by executing a checkpoint. This reflects the fact that greenback.await_() in such a task will fail. The exception message for such an await_() failure has also been updated to more precisely describe the problem, rather than the previous generic “you must create a portal first”. (#26)
greenback 1.1.2 (2023-12-28)¶
Bugfixes¶
- Public exports now use from ._submod import X as X syntax so that type checkers will know they’re public exports. (#23)
greenback 1.1.1 (2023-03-01)¶
Bugfixes¶
- greenback.has_portal() now returns False, instead of raising an error, if it is called within an asyncio program in a context where no task is running (such as a file descriptor readability callback). (#16)
- Fixed a bug that could result in inadvertent sharing of context variables. Specifically, when one task that already had a greenback portal set up called greenback.bestow_portal() on a different task, the second task could wind up sharing the first task’s contextvars context. (#17)
greenback 1.1.0 (2022-01-05)¶
Features¶
- Added @greenback.decorate_as_sync(), which wraps a synchronous function decorator such as functools.lru_cache() so that it can be used to decorate an async function. (#14)
Bugfixes¶
- greenback.has_portal() now returns False instead of raising an error if called outside async context. (#12)
- greenback.has_portal() now properly respects its task argument; previously it erroneously would always inspect the current task. (#13)
greenback 1.0.0 (2021-11-23)¶
Features¶
- New function greenback.with_portal_run_tree() is like greenback.with_portal_run() for an entire Trio task subtree: it will enable greenback.await_() not only in the given async function but also in any child tasks spawned inside that function. This feature relies on the Trio instrumentation API and is thus unavailable on asyncio. (#9)
- New function greenback.has_portal() determines whether the current task, or another specified task, has a greenback portal set up already. (#9)
Bugfixes¶
- Add support for newer (1.0+) versions of greenlet, which expose a gr_context attribute directly, allowing us to remove the hacks that were added to support 0.4.17. greenlet 0.4.17 is no longer supported, but earlier (contextvar-naive) versions should still work. (#8)
- We no longer assume that greenback.bestow_portal() is invoked from the “main” greenlet of the event loop. This was not a safe assumption: any task running with access to a greenback portal runs in a separate greenlet, and it is quite plausible that such a task might want to bestow_portal() on another task. (#9)
greenback 0.3.0 (2020-10-13)¶
Features¶
- Add greenback.with_portal_run() and greenback.with_portal_run_sync(), which let you scope the greenback portal (and its performance impact) to a single function call rather than an entire task. greenback.with_portal_run_sync() provides somewhat reduced portal setup/teardown overhead in cases where the entire function you want to provide the portal to is syntactically synchronous. (#6)
Bugfixes¶
Work around a regression introduced by greenlet 0.4.17’s attempt at adding contextvars support. (#5)
Documentation improvements¶
- Add a more detailed discussion of the performance impacts of using greenback.
greenback 0.2.0 (2020-06-29)¶
Features¶
- Added greenback.bestow_portal(), which enables greenback for a task from outside of that task. (#1)
- Added support for newer versions of Trio with a trio.lowlevel module rather than trio.hazmat. Older versions of Trio remain supported.
greenback 0.1.0 (2020-05-02)¶
Initial release.