Skip to content

fix: allow Ctrl+C to interrupt hanging lifespan#2752#2839

Open
theseriff wants to merge 1 commit into
Kludex:mainfrom
theseriff:fix/hanging-lifespan
Open

fix: allow Ctrl+C to interrupt hanging lifespan#2752#2839
theseriff wants to merge 1 commit into
Kludex:mainfrom
theseriff:fix/hanging-lifespan

Conversation

@theseriff

Copy link
Copy Markdown

Summary

reopen #2752

Checklist

  • I understand that this PR may be closed in case there was no previous discussion. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.

@hrv-dys hrv-dys left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This solves a genuine pain point. If the lifespan handler hangs (broken app, stuck dependency, etc.), a second Ctrl+C should force-exit instead of also hanging.

The approach is solid — wrapping lifespan calls in _wait_safe_lifespan() with an asyncio.Event that gets set from the signal handler via call_soon_threadsafe. A few observations:

  1. call_soon_threadsafe in handle_exit — this is correct because signal handlers run in the main thread but the event loop may be running in a different context. loop.call_soon_threadsafe(self.exit_event.set) ensures the event is set safely without races.

  2. The finally block in _wait_safe_lifespan properly cancels whichever task didn't finish first. If the exit event fires, the lifespan task gets cancelled; otherwise the waiter is cleaned up. This prevents leaked tasks.

  3. One thing to consider: when the lifespan task gets cancelled mid-shutdown, the app's lifespan.shutdown handler may not have completed cleanup (e.g., closing database pools, flushing buffers). This is inherent to a force-exit though — the user explicitly chose to interrupt, so partial cleanup is the expected tradeoff. Worth documenting in the commit message or a comment if it isn't already.

  4. The pragma: no cover on the waiter_task.done() branch — this makes sense since it's hard to reliably trigger a second SIGINT in a test, but it's the more critical code path. If there's a way to test it (maybe by directly calling exit_event.set() instead of sending a signal), that'd strengthen confidence.

One thing this codebase has historically been careful about: lifespan shutdown failures don't automatically propagate to the server as an exit signal. This PR doesn't change that contract — it just provides an escape hatch when the lifespan call itself never returns. That's the right scope.

Tests all pass (16 lifespan + 10 main = 26/26).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants