Fix libusb threads when backgrounded #3437

Draft: dougnazar wants to merge 2 commits into networkupstools:master from dougnazar:fix_libusb_threads_when_backgrounded

@dougnazar
Contributor

This is needed in particular with libusb on Linux, as it creates
a thread to handle either netlink or udev messages. If that thread is created
before we background, the driver hangs during exit while trying to join
the nonexistent thread in the new process. Stopping the driver then shows this
warning and is delayed until it switches to SIGKILL:

Stopping /run/nut/usbhid-ups-xxxx.pid failed, retrying harder: Success
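
To illustrate the underlying behavior, here is a minimal sketch that does not use libusb at all: a helper thread (a stand-in for the hotplug event thread) is started, then the process forks, and the child checks whether the helper is still making progress. All names here (`ticker`, `thread_survives_fork`, `sleep_ms`) are made up for the demo and are not part of NUT or libusb. The sketch deliberately does not call `pthread_join()` in the child, since that is the call that would be expected to block.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static atomic_int ticks;           /* incremented by the helper thread */
static atomic_int stop_requested;  /* tells the helper thread to finish */

static void sleep_ms(long ms)
{
	struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };
	nanosleep(&ts, NULL);
}

/* Hypothetical stand-in for a library's background event thread. */
static void *ticker(void *arg)
{
	(void)arg;
	while (!atomic_load(&stop_requested)) {
		atomic_fetch_add(&ticks, 1);
		sleep_ms(5);
	}
	return NULL;
}

/* Returns 1 if the helper thread keeps running in the forked child,
 * 0 if it does not, -1 on error. */
int thread_survives_fork(void)
{
	pthread_t tid;
	pid_t pid;
	int status;

	atomic_store(&ticks, 0);
	atomic_store(&stop_requested, 0);
	if (pthread_create(&tid, NULL, ticker, NULL) != 0)
		return -1;
	while (atomic_load(&ticks) == 0)
		sleep_ms(1);               /* wait until the helper is running */

	pid = fork();
	if (pid == 0) {
		/* Child: only the thread that called fork() is duplicated. */
		int before = atomic_load(&ticks);
		sleep_ms(100);
		_exit(atomic_load(&ticks) != before);
	}

	/* Parent: stop and join its own (real) helper thread. */
	atomic_store(&stop_requested, 1);
	pthread_join(tid, NULL);
	if (pid < 0 || waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

On a POSIX system `thread_survives_fork()` should return 0: the child inherits the memory that describes the thread, but not the thread itself, which matches the situation the driver ends up in after backgrounding.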

I picked closing and then re-opening as the approach that is simpler to implement
and test, but it has the biggest impact on each driver. Quite possibly some or all
of them will need to be modified to allow re-opening, like the usb-hid driver.

The other option I considered was breaking up the backgrounding into stages:
creating the background process early, but leaving the foreground process
running to report startup messages until initialization is finished.

Or perhaps there is another option?

dougnazar added 2 commits May 12, 2026 17:12
This is needed in particular with libusb on linux as it will create
a thread to handle either netlink or udev messages. If this is created
before we background, the driver will hang during exit trying to join
the nonexistent thread in the new process. Shows this warning while
stopping the driver.

Stopping /run/nut/usbhid-ups-xxxx.pid failed, retrying harder: Success

Signed-off-by: Doug Nazar <nazard@nazar.ca>
@github-actions

github-actions Bot commented May 12, 2026

A ZIP file with standard source tarball and another tarball with pre-built docs for commit f7ba488 is temporarily available: NUT-tarballs-PR-3437.zip.

@AppVeyorBot

Build nut 2.8.5.4708-master completed (commit 11c4a94a6a by @dougnazar)

@jimklimov added the USB, service/daemon start/stop, and Linux labels May 13, 2026
@jimklimov added this to the 2.8.6 milestone May 13, 2026
@jimklimov
Member

I haven't seen the described behavior, I think. Is this something that changed in recent libusb releases and/or Linux distros/kernel?

At what point does libusb start that thread - some method we call (so might defer that until after backgrounding), or somehow automatically during library load?

@dougnazar
Contributor Author

dougnazar commented May 13, 2026

Looks like the thread creation was added in v1.0.16 (6853291ed Add hotplug support to the Linux backend.) and changed to pthread_join for v1.0.17 (3107f30ba linux_netlink: Remove use of pthread_cancel). I'm currently running v1.0.21.

The thread is created as soon as libusb is initialized, and it is stopped when libusb is closed. The call trace is:

upsdrv_callbacks.upsdrv_initups();
	upsdrv_initups()  [usbhid-ups.c]
		ret = comm_driver->open_dev(&udev, &curDevice, subdriver_matcher, &callback);
			nut_libusb_open()
				libusb_init()
					libusb_init_context()
						usbi_backend.init();
							op_init()
								linux_start_event_monitor()
									linux_netlink_start_event_monitor()
									or
									linux_udev_start_event_monitor()

Edit: It just occurred to me that this might only happen on sysvinit-style systems. I'm guessing that systemd handles backgrounding itself and won't see this issue.

@jimklimov
Member

Thanks for the details, never knew that nuance.

As for systemd vs. init systems, I suppose this would have popped up in BSDs etc. as well? Otherwise, the difference is not so much in systemd itself, as in the *.service unit definitions which let the NUT daemons stay "foregrounded" since systemd does the forking. In services for NUT v2.7.x IIRC this was not the case, so there was double (or more) forking with systemd, backgrounding/detachment from terminal, possibly the root/nonroot split in upsmon, which blurred the lines about which PID(s) to monitor as part of the service unit.

Previously the only option to stay foregrounded was to hack the init script or similar, to add -D for debugging (which coincidentally also left the process foregrounded). With NUT v2.8.0 this was resolved to keep debugging and fore-/back-grounding separable (-F, -B) with -D just retaining legacy behavior about this by default; later also -FF to stay foregrounded and still save the PID file (so signals can be sent easily).

FWIW, the retrying harder wording comes from upsdrvctl.

@jimklimov
Member

jimklimov commented May 13, 2026

Looking around for context, the idea about upsdrv_initups happening long before backgrounding is:

  • sometimes we do not background at all, e.g. to run just to handle a killpower request quickly, or to dump_data, or to debug with otherwise default settings, or if the platform won't let us (e.g. Windows)
  • if the device was not found etc. we do not want to start the driver but rather fail early - hard(er) to do so visibly when detached - but this might be solvable with an upsmon-style pipe I suppose (keep a parent process around for a while, initups in the child, signal back to parent whether it should exit with success or error; maybe even pass the text message to the parent to print in case of error), all this complexity subject to whether we want to background at all. I think this PR should morph towards something like this approach?.. (UPDATE: I re-read the original post and this is essentially your "other option", so that path looks good to me)
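
A rough sketch of that pipe-based handshake, to make the idea concrete. This is not NUT code: `start_staged`, `init_fn_t`, `fake_init_ok` and `fake_init_missing_device` are hypothetical names, the device initialization is reduced to a callback that returns 0 on success, and the child simply exits where a real driver would continue into its main loop.

```c
#include <errno.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef int (*init_fn_t)(void);   /* hypothetical: returns 0 on success */

/* Hypothetical stand-ins for a device initialization routine. */
int fake_init_ok(void)             { return 0; }
int fake_init_missing_device(void) { return -1; }

/* Fork first, initialize the device in the child, and let the parent
 * wait for a one-byte verdict so it can still fail early and visibly.
 * Returns the exit code the foreground (parent) process should use. */
int start_staged(init_fn_t init_device)
{
	int fds[2];
	pid_t pid;
	unsigned char status = 1;
	ssize_t got;

	if (pipe(fds) != 0)
		return EXIT_FAILURE;

	pid = fork();
	if (pid < 0) {
		close(fds[0]);
		close(fds[1]);
		return EXIT_FAILURE;
	}

	if (pid == 0) {
		/* Child (future daemon): any library threads start here,
		 * in the process that will later have to join them. */
		close(fds[0]);
		status = (init_device() == 0) ? 0 : 1;
		if (write(fds[1], &status, 1) != 1)
			status = 1;
		close(fds[1]);
		/* A real driver would enter its main loop here on success. */
		_exit(status);
	}

	/* Parent: block until the child reports, then relay the result. */
	close(fds[1]);
	do {
		got = read(fds[0], &status, 1);
	} while (got < 0 && errno == EINTR);
	close(fds[0]);
	if (got != 1)
		status = 1;               /* child died before reporting */

	waitpid(pid, NULL, 0);        /* demo only: the demo child exits */
	return status == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

With these stand-ins, `start_staged(fake_init_ok)` yields `EXIT_SUCCESS` and `start_staged(fake_init_missing_device)` yields `EXIT_FAILURE`. Passing an error message text back to the parent would need a slightly richer protocol than a single byte.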

As for the libusb threads, I wonder if they should not be inherited by the child process? I guess when the parent exits, the library closes the context and reaps the threads - so there's none to join for the child (and also maybe this is what breaks driver reconnections for some people)? Maybe there's a way to just NOT cleanly close libusb when the parent exits?

To this end, currently drivers/main.c goes like this (in broader strokes, since at least 2008; exit_upsdrv_cleanup() was separated to wrap upsdrv_cleanup() in 2023):

        dstate_setinfo("driver.state", "init.starting");

        atexit(exit_cleanup);
...
        dstate_setinfo("driver.state", "init.device");
        upsdrv_callbacks.upsdrv_initups();
        dstate_setinfo("driver.state", "init.quiet");

        /* UPS is detected now, cleanup upon exit */
        atexit(exit_upsdrv_cleanup);

        /* now see if things are very wrong out there */
        if (upsdrv_callbacks.upsdrv_info->status == DRV_BROKEN) {
                fatalx(EXIT_FAILURE, "Fatal error: broken driver. It probably needs to be converted.\n");
        }
...
        if (do_forceshutdown) {
                dstate_setinfo("driver.state", "fsd.killpower");
                forceshutdown();
        }
...
        switch (foreground) {
                case 0:
                        background();
...

There are several potential exit() points between that atexit(exit_upsdrv_cleanup); and the possible background(); call, so this hook would be useful there to un-initialize the device when the mono-process exits before it gets to backgrounding. I think exit_upsdrv_cleanup() and some global variables to hint about initial/current PID of the process can be used to just not-call upsdrv_cleanup() if this is a parent process and we were just about to process a background() call (so some global variable about that, not the foreground intent variable).

I don't know if that would actually help (e.g. libusb may well have its own atexit registrations to reap those threads and close netlink etc. listeners), but this looks like something that can be tested quickly - and if it does not help, then go to re-architecting how the backgrounding works.
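
Something along these lines, as an untested sketch of the "skip cleanup in the exiting parent" idea. Only `upsdrv_cleanup()` and `exit_upsdrv_cleanup()` are names from the existing code (and `upsdrv_cleanup()` is stubbed here); `initial_pid`, `handing_over_to_child`, `cleanup_calls` and `should_cleanup_device()` are hypothetical, and nothing here shows that libusb would actually cooperate.

```c
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical globals: a real change would set these in main.c,
 * initial_pid before upsdrv_initups() and handing_over_to_child in
 * the parent branch right after the fork() done for backgrounding. */
static pid_t initial_pid;
static int   handing_over_to_child;

static int cleanup_calls;         /* demo counter, not part of NUT */

/* Stub standing in for the driver's real upsdrv_cleanup(). */
static void upsdrv_cleanup(void)
{
	cleanup_calls++;
}

/* Skip the device cleanup only when the original process is exiting
 * because it has just handed the device over to a forked child. */
int should_cleanup_device(pid_t initial, pid_t current, int handing_over)
{
	return !(handing_over && current == initial);
}

/* Would be registered with atexit() as it is today. */
void exit_upsdrv_cleanup(void)
{
	if (should_cleanup_device(initial_pid, getpid(), handing_over_to_child))
		upsdrv_cleanup();
}
```

The early `exit()` paths before `background()` would still un-initialize the device, because `handing_over_to_child` stays 0 there; only the parent that exits right after forking would skip the call.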
