Skip to content

Remove interrupt tasks entirely#1222

Open
DavisVaughan wants to merge 14 commits into
mainfrom
feature/task-timing
Open

Remove interrupt tasks entirely#1222
DavisVaughan wants to merge 14 commits into
mainfrom
feature/task-timing

Conversation

@DavisVaughan
Copy link
Copy Markdown
Contributor

@DavisVaughan DavisVaughan commented May 19, 2026

  • Wait for the next Positron release, merge 1 day after that so we have time to test

Closes #1187

In #1187 we realized that our "interrupt time" tasks have never been running on Windows, and...no one has complained.

Practically, this means that LSP features like completions or hover have not been running on Windows when R is busy, i.e. when R is running some long running computation, even if that code is checking for interrupts periodically via R_CheckUserInterrupt(). Unix machines, on the other hand, do return completions or hover results at interrupt time, but it is fairly unreliable and highly subject to the R code being run, i.e. this video shows how when you run a long running lm(), you can basically hang the LSP on a Mac because it almost never calls R_CheckUserInterrupt():

Screen.Recording.2026-05-15.at.12.21.07.PM.mov

We've long through that doing extensive work at R interrupt time is rather dangerous (you could accidentally load a namespace mid-R execution, which would be wild), so we've wanted to remove these. And with oak, we are going to rely on interrupt time less and less anyways, because more things can be done statically without talking to the main R session.

So this PR removes interrupt tasks entirely. I'll leave more notes inline.

We no longer need this. The potential issue was that we'd call some R code while on the LSP thread, and behind our back while on that LSP thread R would run finalizers that were only meant to be run on the main R thread. This can no longer happen because `r_task()`s are always run on the main R thread. There is no longer any R code that runs on the LSP thread.

We could run `R_RunPendingFinalizers()` ourselves more aggressively after each top level command finishes, but RStudio does not do this, so I don't currently see a need for us to either.
We now only run interrupt tasks at idle time, so in a follow up commit we can remove interrupt tasks as a concept.

We still run activity handlers regularly, particularly important for the later R package

We still call `R_ProcessEvents()` somewhat regularly while idling, mostly out of good faith to whatever R's default methods do, but this is also where our `self.debug_filter` draining occurs now - and this is newly now also being called on Windows, so that's good. I think if we do use `R_ProcessEvents()` for more going forward, it should only be for "non critical" things that can happen at any arbitrary time.
So LSP didChange notifications can invalidate breakpoints while paused in the debugger
We have some indirect tests that get at this too, but a direct test feels quite useful and explanatory
Comment on lines -509 to 514
r_task::spawn(RTask::interrupt({
//
// Use `idle_any_prompt` so DAP breakpoint invalidation still fires when
// the LSP signals a document change while R is paused at `browser()`.
r_task::spawn(RTask::idle_any_prompt({
let dap_clone = console.debug_dap.clone();
async move || {
async move |_| {
Console::process_console_notifications(console_notification_rx, dap_clone).await
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a test for this as well in dap_breakpoints.rs

Comment on lines +1960 to +1962
// For Sync tasks (i.e. only `r_task()`s), we want to log excessive waiting,
// because we are blocking the calling thread
if matches!(task, QueuedRTask::Sync(_)) {
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We used to log "excessive waiting" for interrupt tasks

We don't have interrupt tasks anymore, but we do have Sync tasks, which are only used by r_task().

I thought it was appropriate to change this to log excessive waiting for r_task()s, since those are blocking, and since they previously were interrupt tasks. So it felt like the spirit was the same.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea

Comment on lines 2496 to 2505
@@ -2475,33 +2503,6 @@ impl Console {
if let Some(text) = self.debug_filter.check_timeout() {
self.emit_stdout(text);
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have kept the debug_filter check as part of an R_ProcessEvents() hook, which is now also being run on Windows.

This is now the ONLY THING special that we do at interrupt time. And it doesn't talk to the R API in any way, so should be very safe.

It's important we run this regularly-ish because the whole point is to not hold onto user output that might (for whatever reason) look like debug: output too long during long running computation

We talked about moving this to another thread, which would let us remove interrupt time hooks altogether. I still think that idea is kind of nice in theory, but I wasn't quite sure how that thread would work.

  • Does it just busy loop with a 25ms sleep between checks?
  • Also, it would force DebugFilterState to be wrapped in a Mutex so it could be shared between Console and this extra thread, and that felt gross

I think we should keep this as is for this PR and if we still feel like we want to move it to a separate thread, we should do a follow up PR and aim for the simplest strategy possible (or, again, explore removing this check entirely)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main thing that gives me pause about this thread is using up 2mb stack space just for that. If a tokio task, that becomes much nicer, with the huge caveat that now Console depends on tokio :P

Comment on lines -2501 to -2504
// Run pending finalizers. We need to do this eagerly as otherwise finalizers
// might end up being executed on the LSP thread.
// https://github.com/rstudio/positron/issues/431
unsafe { R_RunPendingFinalizers() };
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer needed since we don't run R code off the main R thread

Comment on lines -110 to +131
// NOTE: This is a legit use of interrupt-time task. No R access here, and
// we need to go through Console since it owns the UI comm.
r_task(|| {
if let Some(ui) = Console::get().ui_comm() {
ui.show_message(String::from(user_message));
}
let request = client.send_request::<request::ShowMessageRequest>(ShowMessageRequestParams {
typ: MessageType::ERROR,
message: String::from(user_message),
actions: None,
});
match tokio::time::timeout(Duration::from_secs(5), request).await {
Ok(result) => {
result.log_err();
},
Err(_) => {
log::warn!("Timed out waiting for frontend to acknowledge LSP crash notification");
},
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

report_crash() moves off needing to use the ui comm!

Comment thread crates/ark/src/r_task.rs
Comment on lines +331 to +334
/// For rare performance sensitive cases where you'd like to avoid the cost of
/// poking R options via [Self::pin_with_capture()] and you know you don't need
/// the safety of capturing output because you aren't running R code
pub(crate) fn send_idle_without_capture<F, Fut>(fun: F) -> Self
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did end up wanting this for Drop in RThreadSafe

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer relevant since R_PolledEvents is a no-op if no one sets it, and we don't set this anymore

Comment thread crates/harp/src/exec.rs
Comment on lines +512 to 514
// TODO: These days this just means `RLocalInterruptsSuspended`.
// Should we simplify the name from `r_sandbox()` to `r_interrupts_suspended()`?
let _scope = crate::raii::RLocalSandbox::new();
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was curious about your thoughts on this, since r_sandbox() does less now

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather keep the name distinct from implementation details?

@DavisVaughan DavisVaughan requested a review from lionel- May 19, 2026 16:05
Copy link
Copy Markdown
Contributor

@lionel- lionel- left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a step in the right direction!

The main thing I'd like us to be careful about is to not break native infrastructure that could potentially be used outside Positron:

  • Native graphics devices: GraphApp-based on Windows, Cocoa/Quartz on macOS, X11 on Linux.
  • Less important but comes up in practice: tcl/tk GUIs. R prompts user for a CRAN repo with a tcl/tk menu.

For some reason the tcltk package refuses to even load on my machine (R freezes) so I couldn't verify that it still works. But I could verify the Quartz graphics device no longer works, even though it does work on main. That's because Ark now sets ptr_R_ProcessEvents, see inline comment.

I did a deep dive in R's messy ecosystem of events loop (see below). I think we could simplify things and keep everything working with the following setup:

Windows:

pub fn pump_idle_events() {
    // R_ProcessEvents on Windows contains peekevent/doevent (GraphApp pump),
    // time limits, UserBreak, ptr_ProcessEvents (our hook), and R_Tcl_do.
    // No R_runHandlers / input-handler concept on Windows.
    unsafe { R_ProcessEvents() };
}

pub fn setup_r(args: &Vec<String>) {
    (*params).CallBack = Some(r_flush_debug_filter);
}

Unix:

pub fn pump_idle_events() {
    // ptr_R_ProcessEvents (Quartz), R_PolledEvents (debug_filter + tcltk),
    // R_CheckTimeLimits.
    unsafe { R_ProcessEvents() };

    // Drain ready input handlers (Quartz pipe, X11, later, help server).
    let mut fdset = unsafe { R_checkActivity(0, 1) };
    while !fdset.is_null() {
        unsafe { R_runHandlers(libr::get(R_InputHandlers), fdset) };
        fdset = unsafe { R_checkActivity(0, 1) };
    }

    // Match R's sys-std.c:1136 idiom: one final call with NULL fires
    // Rg_PolledEvents (Cairo X11 buffered display flush) and R_PolledEvents
    // (re-fires debug_filter + tcltk; both idempotent).
    unsafe { R_runHandlers(libr::get(R_InputHandlers), std::ptr::null_mut()) };
}

pub fn setup_r(args: &Vec<String>) {
    libr::set(R_PolledEvents, Some(r_flush_debug_filter));
    libr::set(R_wait_usec, 10000);
    // ptr_R_ProcessEvents intentionally not set as it's reserved for Quartz.
}

The other thing I'm wondering for a follow-up PR is if we should wait on the activity hanlder fds directly, on Unix. R_InputHandlers is a public linked list so it should be possible, I think Kevin had a comment about this (might have been deleted since): https://github.com/r-devel/r-svn/blob/714af4bf22bebd387437e31c397167996b4d5fbc/src/include/R_ext/eventloop.h#L93

The advantage would be that we'd run {later} callbacks and R's help server requests as soon as the events come in, without any timeout delays.


Notes from deep dive:

Interrupt-time: R_CheckUserInterrupt() calls R_ProcessEvents()

Unix R_ProcessEvents()

  • Ptr_R_ProcessEvents()
  • R_PolledEvents()
  • R_CheckTimeLimits()

Windows R_ProcessEvents()

  • GraphApp events
  • R_CheckTimeLimits()
  • Interrupt check
  • Ptr_R_ProcessEvents()
  • R_Tcl_do()

R_runHandlers (Unix only)

  • Run activity/input handlers registered by packages (e.g. grDevices on mac, later on linux and mac)
  • Runs only when R is idle, never when busy. Called by ReadConsole's Unix implementation.

Ptr_R_ProcessEvents

  • macOS: Set by Cocoa, only if pointer still NULL. We can't set this without breaking Cocoa GUI.
  • Windows: Mapped to Rp->Callback

R_PolledEvents (Unix only): Called from ProcessEvents()
-> Global pointer, only set by tcl/tk in stack-like fashion (calls old pointer)

Rg_PolledEvents: Separate hook for X11, called alongside R_PolledEvents when R_runHandlers is passed a NULL readMask (when R_checkActivity() returns NULL).

tcl/tk (Unix):

  • Interrupt-time: Sets R_PolledEvents on Unix (in a stack-like pattern, calling the old handler if any is set), R_Tcl_do on Windows
  • Idle: Does not register an activity input handler
  • Bidirectional: Sets a TCL event loop handler to call R_runHandlers when TCL is itself busy (e.g. running a modal dialog).

X11 (Unix): Runs on the R thread

  • Interrupt-time: nothing. X11 does not register Ptr_R_ProcessEvents, so X11
    windows freeze during R computation. (Cairo-buffered X11 devices additionally
    set Rg_PolledEvents = CairoHandler for periodic buffer flushes, but
    Rg_PolledEvents only fires from the legacy R_runHandlers(handlers, NULL)
    branch, which ark never invokes.)
  • Idle: Registers an activity input handler on the X11 connection fd
    (addInputHandler(R_InputHandlers, ConnectionNumber(display), R_ProcessX11Events, XActivity)). The X
    server pushes events to the connection fd, select() sees the fd ready,
    R_runHandlers dispatches to R_ProcessX11Events.

Quartz/Cocoa: Runs on the R thread

  • Interrupt-time: Registers Ptr_R_ProcessEvents if NULL
  • Idle-time: Registers an activity input handler for idle time events. Tickled by a background thread that sets a byte on the write end every 100ms.
  • If Ptr_R_ProcessEvents is NULL, also initialises Cocoa GUI in the current process, which sets it as a full-blown foreground process visible in the Dock.

HTML help server (Unix): Server runs on the R thread

  • Interrupt-time: Nothing, Help server is unresponsive when R is busy
  • Idle-time: Registers an activity input handler on the listening TCP socket
    (srv_input_handler).
    When a connection arrives, that handler calls
    accept() and registers another input handler on the per-connection
    socket (worker_input_handler) to serve the request. So each in-flight
    request has its own input handler.

HTML help server (Windows): Server runs in a separate thread.

later (Unix):

  • Interrupt-time: Nothing
  • Idle-time: Registers an activity input handler (that checks call stack is empty in case of debugger read-prompt)

later (Windows):

  • Spawns a message-only window that checks every 10ms


// Async messages for the Console. Processed at interrupt time.
// Async messages for the Console. Polled at any idle prompt (top-level or
// browser) so DAP doc-change handling continues during debug sessions.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doc-change handling?

let polled_events_rx = crossbeam::channel::tick(Duration::from_millis(50));
let activity_handlers_rx = crossbeam::channel::tick(Duration::from_millis(50));

// Run `R_ProcessEvents()` regularly out of good faith. We don't think this
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here is what it does:

  • Checks for interrupts on Windows (not needed at idle time)
  • Checks time limits (not really supported by Ark)
  • Run events callbacks: Cocoa/Quartz (graphics device), Windows GraphApp-based events (graphics device), TCL/TK (CRAN repo selection)

Native events can potentially be used in other contexts than Positron: CRAN repo selection, plot GUI, etc. To try this, set options(device = "quartz") in a Jupyter app and create plots. Note you'll need to try it outside of this branch because Quartz has been broken by setting ptr_R_ProcessEvents, see other comment.

Since it's involved in GUI processing, we should keep polling at idle time every 50ms.

Comment on lines 2496 to 2505
@@ -2475,33 +2503,6 @@ impl Console {
if let Some(text) = self.debug_filter.check_timeout() {
self.emit_stdout(text);
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main thing that gives me pause about this thread is using up 2mb stack space just for that. If a tokio task, that becomes much nicer, with the huge caveat that now Console depends on tokio :P

Comment on lines +1960 to +1962
// For Sync tasks (i.e. only `r_task()`s), we want to log excessive waiting,
// because we are blocking the calling thread
if matches!(task, QueuedRTask::Sync(_)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea

libr::set(ptr_R_ShowMessage, Some(r_show_message));
libr::set(ptr_R_Busy, Some(r_busy));
libr::set(ptr_R_Suicide, Some(r_suicide));
libr::set(ptr_R_ProcessEvents, Some(r_process_events));
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This breaks the quartz device on macOS because it checks that this hook is still NULL before initialising: https://github.com/r-devel/r-svn/blob/714af4bf22bebd387437e31c397167996b4d5fbc/src/library/grDevices/src/qdCocoa.m#L712-L713

options(device = "quartz")
plot(1)

While none of this is used in Positron, I think we should avoid unnecessary regressions on this front to make it easier to integrate Ark in other contexts, e.g. jupyter-based REPLs.

if let Err(err) = r_sandbox(|| Console::get_mut().polled_events()) {
panic!("Unexpected longjump while polling events: {err:?}");
pub unsafe extern "C-unwind" fn r_process_events() {
if let Err(err) = r_sandbox(|| Console::get_mut().process_events()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we remove the sandbox here, and document prominently that no R code should ever run in process_events()?

Or we keep it but mention here that is purely defensive against future changes.

Comment thread crates/harp/src/exec.rs
Comment on lines +512 to 514
// TODO: These days this just means `RLocalInterruptsSuspended`.
// Should we simplify the name from `r_sandbox()` to `r_interrupts_suspended()`?
let _scope = crate::raii::RLocalSandbox::new();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'd rather keep the name distinct from implementation details?

This seems a bit odd, because the location that `R_wait_usec` is defined at makes me think it would only be used by `R_PolledEvents()`, which we no longer set, but in reality it is also used in `Rsleep()` as the interval to check interrupts on
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

On Windows, call some equivalent to R_PolledEvents

2 participants