Can a CPU be notified that it's about to lose power?

Question

If I have a desktop computer and I pull out the power cord, are there a few million CPU cycles that still happen before it completely runs out of energy? There could even be a capacitor on the motherboard to extend that to a few billion. Do CPUs detect that this is about to happen so that they can shut down more cleanly? Is this a type of interrupt? Does Linux have any code specific to this situation?

No, how can they. But, they can usually inform the code that starts what caused a reset. — Andy aka, Mar 24 '23 at 23:31
By noticing that voltage is dropping quickly or whatever. I found an answer to my question by reinventing the term “power loss interrupt” https://stackoverflow.com/questions/25946448/what-exactly-happens-in-a-power-fail-interrupt — , Mar 24 '23 at 23:33
You said this --> Do CPUs know when they’ve lost power? Noticing that the voltage is dropping is "whilst they are losing power". — Andy aka, Mar 24 '23 at 23:34
I’m having a hard time imagining what you’re imagining that I’m asking. — , Mar 24 '23 at 23:36
This is far too broad a question, as it's asking multiple questions all about a vague and undefined 'desktop computer'. Assuming you mean a PC, that can use many different motherboards from many different manufacturers so that's far too generic. Please edit your question to be specific and remove all the ambiguity. — TonyM, Mar 24 '23 at 23:36
Lots of devices detect when the supply is failing, especially nonvolatile storage since you don't want to corrupt your hard disk every time you lose power. — user1850479, Mar 24 '23 at 23:36
@TonyM not really, on the duplicate question I linked someone commented about SIM cards and that’s also exactly the kind of answer I wanted. — , Mar 24 '23 at 23:39
Many microcontrollers do have this feature, which may be called a "power-fail interrupt" or "brown-out interrupt". I'd bet that desktop CPUs don't because there is not much use for it - what would the CPU do in that time? — user253751, Mar 24 '23 at 23:52
In a desktop, the interrupt would have to come from the power supply and not the CPU, because whatever you want the CPU to do is going to rely on the whole system having power, not just the CPU. You want it to start doing that while the power supply unit still has some milliseconds of power left in its capacitors. — user253751, Mar 24 '23 at 23:54
@Andyaka Power!=power. The device loses power, the CPU still is running when drawing down DC link capacitors in the power supply, or there's still some battery charge left, etc. The question is awkwardly worded, but pretty clear. It could definitely be edited to be better :) — Kuba hasn't forgotten Monica, Mar 25 '23 at 00:02
System is becoming unstable and you want it to validly save the present state. Seems dangerous. — StainlessSteelRat, Mar 25 '23 at 00:43
@BorisVerkhovskiy use a UPS ... it will give you plenty of time to perform an orderly shutdown — jsotola, Mar 25 '23 at 01:45
@BorisVerkhovskiy I have not seen many that do not have a serial port or a USB port for connecting to a host PC ... some have a relay contact output that could even be used by a microcontroller — jsotola, Mar 25 '23 at 02:41

frr · Accepted Answer · 2023-03-26T11:28:02.943

On custom-designed boards with an MCU, obviously the designer can do anything she wants. The rest of my answer will be related to PC-compatible computers.

The ATX PSU (and before it the AT PSU) only has a single signal, to tell the host computer that it is doing fine: it's called Power Good or Power OK. On startup, first all the output rails of the main stage have to reach full voltage and be stable for some time (milliseconds), before Power Good comes alive. The motherboard then waits another interval before releasing itself from the RESET state. There are actually multiple RESET signals / traces / circuit nodes within a motherboard, and the chipset and SuperIO datasheets tend to contain RESET sequencing diagrams - on power-up, after Power Good becomes active.

But, let me first focus on the ATX PSU a bit:

ATX PSU

During a short power glitch, the Power Good signal remains happy. The ATX spec says that the PSU must sustain power output for a single period of the 60Hz mains (17 ms) at full steam ahead. Elsewhere in the spec, the Power Good must deassert 1 ms before the power rails actually go out of tolerance. Which means, that the threshold on energy remaining, for Power Good deasserting, is fairly low = Power Good only deasserts when the capacitors in the PSU are "almost depleted, as far as staying within tolerance is your goal".

Others have already pointed out that, given a particular PC system, featuring some characteristic power consumption, all other factors unchanged, different PSU models can give very different "holdover capability" = time to deplete the energy left in the PSU capacitors. This matches my observations, done the other day on several ATX PSU models: consumer-grade desktop, server-class and industrial-grade, of different wattages.

In that respect, it is interesting to note, where exactly this energy gets stored. Which of the myriad capacitors in the PC motherboard and PSU are relevant for the holdover capability. And the answer is: only the primary-side capacitor in the PSU really matters, for hold-over time. The characteristic largest can in the PSU, a couple hundred microFarad rated at 400-450 Volt. The one that presents a shock hazard for some time after pulling the cord. Why this one? It's a combination of several factors:

E = (1/2) C * U^2 . The square of Volts is a monster. A milliFarad at 12V has a tiny energy equivalent compared to a milliFarad at 100-350 V.
the primaries run at mains Voltage = lower hundreds of Volts
it so happens that the isolation strength of the dielectric layer inside the capacitor scales in an approx. direct proportion to mechanical thickness
the energy conversions (transformations) in the power cascade are done purposefully with minimal loss.

Not sure how dielectric layer thickness is related to insulation strength, and capacitance per spatial volume - somehow, in practice, elyts rated at a higher voltage achieve a higher energy density. The elyts at the low-volt secondaries are big enough to filter the choppy current character at the RF operating frequencies of the SMPS stages (actually also the regulation loop response interval), but they cannot do much at the desired timescales of a power outage holdover - preferably up to a second or so.
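To put numbers on the square-law point above, here is a minimal sketch. It assumes ideal capacitors and uses the illustrative values from the text (one milliFarad at 12 V versus 100 to 350 V); the function name is made up for this example.

```python
def capacitor_energy_joules(capacitance_farads: float, volts: float) -> float:
    """Energy stored in an ideal capacitor: E = (1/2) * C * U^2."""
    return 0.5 * capacitance_farads * volts ** 2


if __name__ == "__main__":
    one_millifarad = 1e-3
    # Same capacitance, three different voltages: energy grows with U squared.
    for volts in (12, 100, 350):
        energy = capacitor_energy_joules(one_millifarad, volts)
        print(f"1 mF at {volts:>3} V stores {energy:.3f} J")
```

The same milliFarad holds roughly 70 times more energy at 100 V than at 12 V, and several hundred times more at 350 V, which is why the primary capacitor dominates.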

Also note that, in modern PSU's featuring active PFC, the PFC pre-stage (boost-mode upconverter) charges the aforementioned primary capacitor to a stabilized level around 400 V DC. That, regardless of the actual mains level. These PSU's with active PFC typically have a full-range AC input, i.e. work on anything between 100 and 240V AC nominal, in reality they start and shut off around 80-90 V AC. Technically they also work on DC input in a similar range (although for practical applications, note that DC-input operation has specific safety hazards / additional requirements). I mean to imply that with active PFC, the mains voltage in your wall socket has no influence on the energy available in the primary capacitor, and therefore on holdover duration.

The only way to achieve a longer hold-over in a PSU, consumption unchanged, is by designing in a larger primary capacitor. Note that the size of that cap has to do with inrush when plugging in the mains cord. In practice, any self-respecting PC PSU (especially the higher-powered models) needs to contain an inrush-limiting circuit in the input section, before that primary capacitor. On cold power-up, that capacitor gets directly charged through a Graetz bridge, before the PFC stage even starts up.

In my simple experiments years ago, I evaluated several PSU's between 200 and 700W nominal, and I did not load them to the nominal wattage - my load was around 100 Watt, which in reality is not a very modest PC. It's a fairly practical load level. My measured values of hold-over were between some 60 ms and almost a second. When comparing a cheap consumer-grade and an industrial PSU of similar nominal wattage, the industrial models tend to last significantly longer. Some server machines are known to keep running for a few seconds after the mains cord gets pulled. It certainly depends on how "loaded" the chassis is, with CPU's and disk drives.

So, to summarize the sub-topic on PSU's, picture this:

The output rails have a couple of milliFarads each, and need to stay within a tolerance of +/- 5% of nominal voltage. In practice, no logic in a PC has ever run directly on +12V, which is the main "bus voltage" inside the PC. The silicon runs on lower voltages produced by step-down from 12V, so the tolerance for 12V is actually wider... but that doesn't matter much. A 1 milliFarad capacitor drops by 1 V when it supplies 1 A for just 1 ms. And you have a primary capacitor, say half a milliFarad, at 400V. The PC is in operation. When you pull the mains cord, you'll see the output rails remain rock solid throughout the holdover period, while the primary voltage gradually slopes down (and the slope actually gets steeper with the duration of the outage, because a constant-power load draws more current as the voltage falls). Then at some point, when the primary cap is almost down to the threshold of the main stage shutdown (say 100V), the Power Good signal goes inactive. And a relatively short moment later, the main stage PWM shuts off, and the low-volt output rails sharply break off, supplying just the energy remaining in the secondaries.
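The picture above can be turned into a rough estimate. This is a sketch under stated assumptions, not a PSU model: it treats the PC as a constant-power load, uses the example figures from the text (0.5 mF primary capacitor, 400 V start, 100 V cutoff, 100 W load), and assumes a lossless converter unless an efficiency is passed in. The function names are made up for illustration.

```python
import math


def holdover_seconds(c_farads, v_start, v_cutoff, load_watts, efficiency=1.0):
    """Time for a constant-power load to drain C from v_start down to v_cutoff."""
    usable_joules = 0.5 * c_farads * (v_start ** 2 - v_cutoff ** 2)
    return usable_joules * efficiency / load_watts


def primary_voltage(c_farads, v_start, load_watts, t_seconds, efficiency=1.0):
    """Capacitor voltage after t_seconds of constant-power discharge."""
    remaining = v_start ** 2 - 2.0 * load_watts * t_seconds / (efficiency * c_farads)
    return math.sqrt(max(remaining, 0.0))


if __name__ == "__main__":
    c, v0, v_min, load = 0.5e-3, 400.0, 100.0, 100.0
    print(f"holdover: {holdover_seconds(c, v0, v_min, load) * 1000:.0f} ms")
    # The voltage drop per 100 ms grows as the capacitor empties.
    for t in (0.0, 0.1, 0.2, 0.3):
        print(f"t = {t:.1f} s: {primary_voltage(c, v0, load, t):.0f} V")
```

With these numbers the estimate lands at a few hundred milliseconds, inside the measured range of 60 ms to almost a second quoted earlier, and the printed voltages fall faster in each successive interval.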

At the output of the PSU, you get a factual holdover, but pretty much nothing in the way of advance notice. No flag to indicate that the PSU is running on "vapour in the tank".

There are PSU models, and I've only ever seen this in a redundant PSU, where an individual PSU module detects and indicates "mains out". But, even there, the delay between onset of the outage and the buzzer going on, is relatively long (a second or so). This could be a matter of some debouncing / timers in the detection logic, and could be worked upon... or it's just based on the primary cap getting totally discharged, i.e. the good ole "Power Good" logic.

The motherboard's perspective

= the motherboard's behavior in response to Power Good deasserting:

I've quickly checked some datasheets and Google lore, and although lack of proof is not exactly a "disproof", I have to say that the various docs available to me say next to nothing about what happens after the "power OK" signal goes inactive. In my practical observation, the motherboard pretty much just activates all the RESET lines. This means "no more computing operations". Which to me makes the following sense: going on with unstable power, and trying to rescue things, might do more damage than it is worth.

Theoretically, you could imagine some sort of interrupt, a standard IRQ or an SMI, or some sorta ACPI callback, to give advance notice to the OS that there's a mains outage, and that the OS only has a couple dozen ms left. In practice, I don't know about any such thing. There is a standard arrangement where, in response to the power button input, mediated by ACPI, the OS can perform a graceful shutdown or a suspend to disk. Which can take a second or two for a bare OS, but possibly much longer for a loaded server or even desktop (minutes). Laptops actually have ACPI objects for the lid switch and for the battery (including the current gas gauge status).

I've seen custom vehicle-mount PCs that had an extra input for the "ignition switch". The PC would automatically turn on when the ignition switch got engaged, and did a graceful shutdown (in a couple seconds) in response to the ignition input disengaging.

And as another side note, modern PC CPU's have a signal called "bi-directional processor hot" aka BD-PROCHOT. It's a shared active low signal with a pullup. The CPU drives this internally at times of overheat. But, the motherboard is allowed to tug at it too, e.g. when overheating is detected somewhere on the motherboard, or, some notebooks have been known to do this when the battery was running low. The CPU responds by PWM-throttling the CPU clock, thus decreasing consumption and heat production. Probably not what you want ;-) at a time of emergency shutdown. This signal has gained some infamy in broken hardware, where it triggers for no apparent reason or due to board-level engineering mistakes of various sorts...

I could imagine a completely custom arrangement where a service, running in the OS, would watch a GPIO, preferably with an IRQ, and would try to handle some critical cleanup if a mains outage was signaled that way. Within a timespan of a second, it probably wouldn't be able to do much about a larger volume of dirty write-back buffers, but if there are things more critical than storage I/O, and easy to handle immediately, that would be the way to go. Think industrial process control applications. Actually there are more proper ways to handle this scenario, I guess...
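A minimal sketch of such a watcher, assuming a Linux host and a hypothetical board that routes a "mains out" signal to a GPIO. The function names, the callback and the wiring are all assumptions for illustration; only `select.poll` and `os.sync` are real library calls. With the legacy sysfs GPIO interface an edge shows up as POLLPRI | POLLERR on the pin's value file, so that is the default event mask here.

```python
import os
import select
from typing import Callable, Optional

# Assumption: a sysfs GPIO "value" file reports an edge as POLLPRI | POLLERR.
GPIO_EDGE_EVENTS = select.POLLPRI | select.POLLERR


def wait_for_event(fd: int, events: int = GPIO_EDGE_EVENTS,
                   timeout_s: Optional[float] = None) -> bool:
    """Sleep in poll() until fd reports one of `events`; False on timeout."""
    poller = select.poll()
    poller.register(fd, events)
    timeout_ms = None if timeout_s is None else int(timeout_s * 1000)
    return bool(poller.poll(timeout_ms))


def flush_to_disk() -> None:
    """Example cleanup action: ask the kernel to write out dirty buffers."""
    os.sync()


def watch_power_fail(fd: int, on_power_fail: Callable[[], None],
                     events: int = GPIO_EDGE_EVENTS,
                     timeout_s: Optional[float] = None) -> bool:
    """Run the cleanup callback once if the power-fail event arrives."""
    if wait_for_event(fd, events, timeout_s):
        on_power_fail()
        return True
    return False
```

On real hardware the descriptor would come from opening the pin's value file after configuring its edge trigger, with one initial read to clear any pending state. As the paragraph above notes, `flush_to_disk` is unlikely to finish within the holdover window if much dirty data is pending, so the callback is better reserved for small, critical actions.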

Storage perspective

Damage to persistent data can be prevented in various ways at the filesystem level, during healthy operation, under normal power conditions: journaling, ordering of write operations, generally minimizing the "critical section" of inconsistency in time - with guarantees on ordering all the way to the spinning rusty surface. Upon a power outage, this won't save your data buffered in RAM for writeback, but at least it will leave your filesystem on the HDD in a consistent state as of some moment in the past, or allowing you to roll back an unfinished transaction to the last consistent state (a few write operations back).
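The same idea of write ordering can be applied at the application level. The sketch below is not filesystem journaling; it is the related write-to-temporary-file, fsync, then rename pattern, assuming a POSIX system where a rename within one directory is atomic. The function name is made up for this example.

```python
import os
import tempfile


def atomic_write(path: str, data: bytes) -> None:
    """Replace `path` so that a reader sees either the old or the new content."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # new data reaches the device before the rename
        os.replace(tmp_path, path)  # the switch-over is a single atomic step
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Sync the directory so the rename itself survives a power cut (POSIX only).
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

If power fails at any point, the file at `path` holds either the complete old content or the complete new content, which is the "consistent state as of some moment in the past" guarantee described above, scaled down to one file.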

Flash drives in general may have a problem with surprise power-off, even when the writes from the host computer are correctly sequenced, or when there is no traffic in the queue, if the drive is doing some internal janitoring / wear leveling, just when the power outage happens. Some flash drives are safer in this respect than others - safety is likely at the cost of performance. There are also flash drives with extra capacitors and corresponding firmware measures, giving them ability to shut down in a consistent enough state on a surprise power outage - any drive having such capability will have that mentioned very visibly in its datasheet, as it's an important selling point (and increases the cost of the product).

A couple milliseconds worth of power are barely enough to tell the spinning hard drive to cancel all operations and park its heads. This is a standard command in any disk drive interface, but some software would need to get to know and react - see above. Also note that for decades, disk drives have had an inherent (mechanical?) mechanism that parks the heads long before the platters stop spinning, to prevent a head crash against the magnetic surface. IIRC, electric power goes out long before the rotational inertia of the spindle with platters even starts to lose some momentum. After the drive loses control due to an electric outage, the assembly of deflection coil + arms with heads gets parked by the preload spring, as power to the deflection coil has been shut off - while the heads are still gliding on the airflow, long before the platters even noticeably spin down. Not sure if the drives also monitor their input power levels to prevent trying to operate under brownout conditions. Quite possibly they do.

General OS perspective

It takes many seconds, up to minutes, to shut down a general-purpose full-fledged OS.

You have a disk drive or RAID with "mounted" filesystems, you have some files open, dirty write-back buffers in RAM, and processes running that keep those files open, and produce the flow of writes. Multiple processes (and the system swapping) produce a mixed flow of reads and writes. Writes can be postponed for later, resulting in "dirty writeback buffers". Upon a request to shut down, first the user space processes are asked to perform their individual graceful shutdown. On memory-constrained systems (or due to a crappy swapping policy in the OS), some processes may need to be lifted out of swap first, in order to be able to gracefully terminate - producing a flurry of more disk IO prompted by the shutdown request. A desktop disk drive (spinning rust) of traditional PMR pedigree can be capable of around 70 random IOps. A large RAM full of pending dirty writes, scattered across random locations on the disk, can easily take several minutes to flush. I've been able to measure those "perfectly random IOps" by producing such a nasty load pattern deliberately - the jury is out on how close your practical scenario is, to my worst case experiment. After all the user space processes terminate, and all the dirty buffers get "written back", the filesystem can be unmounted, and the disk drives get instructed to do a final sync of their internal cache as well (this is probably a part of some barrier operations during unmounting a FS). Then the drives get a command to spin down, and then finally, the "last man standing" on the software front, i.e. the "init" process in Linux, or an equivalent in Windows NT, tells the hardware (or calls ACPI) to perform a power off (transition to ACPI S5 state).
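The "several minutes to flush" figure follows from simple arithmetic. A rough sketch, assuming the worst case described above (every dirty block lands at a random disk location) and an assumed write size of 4 KiB; the 70 IOps figure is from the text, while the 100 MiB of dirty data is an illustrative number.

```python
def worst_case_flush_seconds(dirty_bytes: int, io_size_bytes: int,
                             random_iops: float) -> float:
    """Time to write back dirty data when every write is a separate random IO."""
    writes = -(-dirty_bytes // io_size_bytes)  # ceiling division
    return writes / random_iops


if __name__ == "__main__":
    dirty = 100 * 1024 * 1024  # 100 MiB of dirty buffers (illustrative)
    seconds = worst_case_flush_seconds(dirty, 4096, 70)
    print(f"{seconds:.0f} s, about {seconds / 60:.1f} minutes")
```

Even this modest amount of dirty data takes around six minutes under fully random access; sequential or merged writes would be far faster, so real workloads fall somewhere in between.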

This whole business of a graceful shutdown can take minutes. Especially if this is a networked machine, and upon shutdown, for instance some processes try to close sessions with peers who may already be unreachable due to the power outage... If you're a server admin, of Windows or Linux, you've seen this before. The machine stuck on shutdown for unknown reasons... then finally turning off after minutes of sitting there idle.

It may take debugging and tweaking, in the OS and in the network, to achieve a reliable swift shutdown. It is easier to achieve on simple machines that normally don't have much to do, such as some "human machine interface" box with a pretty bare system and just a visual app with no dirty writeback. There are "embedded" editions of Windows that are austere with what they do in the background, allow you to turn off updates etc. Most Linux distros can be installed in a minimalistic fashion - misconfigured systemd services are a traditional source of pointless waiting and timeouts during startup and shutdown, but a well configured standard issue distro can shut down in lower single digit seconds. Windows Embedded can perform similarly. In Linux or Windows equally, a flash drive as a system drive helps to achieve quick shutdown.

With deeper customization, Linux can shut down gracefully well under a second. And, minimalistic Linux and Windows systems can use the disk in read-only fashion, which naturally prevents almost all issues with surprise shutdown with respect to storage (bar some autonomous janitoring going on in the flash SSD or shingled HDD).

Conclusions and a final summary

If you need to gracefully shut down a PC upon mains power outage, consider getting a UPS. Their battery capacity tends to be dimensioned exactly for this purpose: to allow a busy machine to shut down gracefully within some minutes. A decent UPS will also have a communication link to inform host-based software that the mains power is down, and only a limited time remains.

The bare PC PSU can bridge some shorter outages (dozens of milliseconds up to a second or so) but does not seem to give any notice that the holdover timeout is running. And when the Power Good signal does drop, the OS probably never gets to know, as the motherboard just "presses the reset button".

The machine stuck on shutdown for unknown reasons... “Working on updates Don't turn off your computer” likes to present itself at the worst possible times! — GB540, Mar 25 '23 at 19:30
I'm pretty sure dielectric thickness goes at the voltage. Since power goes at the square of voltage that means the density in the dielectric goes at the voltage. (It won't be completely linear because the conductors aren't infinitely thin.) — Loren Pechtel, Mar 26 '23 at 04:35
@LorenPechtel thanks for comment, agreed. Just on a napkin: Panasonic FR 470u/16V d8x11mm @ 16V = 0.1 J/cm^3 ; Nichicon LG 470u/450V d30x45mm @ 450V = 1.5 J/cm^3 . Admittedly, the low-volt capacitor will be subject to much higher current, prone to heating (Ir^2 * ESR). Perhaps this is why the conductive cross-section of the electrodes has to be thicker, and hence the lower energy density... In practice it gets worse: the 16V elyt will run at just 12V (= lower energy) within a tight tolerance (= lots of deadweight). The primary cap runs at 90% voltage and gets depleted to 20% voltage... — frr, Mar 26 '23 at 09:46
If dielectric thickness is proportional to voltage, and we'd ignore electrode thickness: e.g. twice the voltage => twice the necessary dielectric thickness. To keep capacitance constant, we'd need twice the surface of the electrodes (and dielectric). That's four times the volume of the dielectric. So, in terms of energy per volume, the square of voltage would get divided by the quadratic growth of dielectric material volume. — frr, Mar 26 '23 at 09:52
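The napkin figures in the comment above can be reproduced with a few lines. A sketch that treats each can as a plain cylinder using the quoted capacitance, rated voltage and case dimensions; the function name is made up for this example.

```python
import math


def energy_density_j_per_cm3(c_farads, volts, diameter_mm, length_mm):
    """Stored energy (1/2 C U^2) divided by the volume of a cylindrical can."""
    energy_j = 0.5 * c_farads * volts ** 2
    radius_cm = diameter_mm / 20.0
    volume_cm3 = math.pi * radius_cm ** 2 * (length_mm / 10.0)
    return energy_j / volume_cm3


if __name__ == "__main__":
    low_volt = energy_density_j_per_cm3(470e-6, 16, 8, 11)      # 470u/16V, d8x11mm
    high_volt = energy_density_j_per_cm3(470e-6, 450, 30, 45)   # 470u/450V, d30x45mm
    print(f"16 V part:  {low_volt:.2f} J/cm^3")
    print(f"450 V part: {high_volt:.2f} J/cm^3")
```

Both results round to the quoted 0.1 and 1.5 J/cm^3, a ratio of roughly 14 in favour of the high-voltage part.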

hacktastical · Answer 2 · 2023-03-25T20:23:01.187

17

Yes, the CPU can indeed be notified of impending power loss, but this needs to be designed into the system power supply. By the time the CPU’s own local power goes out of regulation it’s generally too late.

The PC ATX power supply creates a ‘power good’ (PWR_OK) signal that the host monitors. The ATX supply ensures a certain amount of hold-up time, so that when PWR_OK drops the host can take action to protect itself before the power really goes away.

The hold-up time is not very long: 17ms in the Intel ATX spec. This is roughly the time it takes to discharge the ATX supply’s input capacitors below a workable voltage while at max power delivery. That 17ms corresponds to one 60Hz AC cycle (16.67 ms). Still, that’s enough time to flush writes and pull in disk heads.
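To get a feel for what 17 ms at full load demands of the input capacitor, here is a rough sizing sketch. Every number in it is an assumption for illustration: a hypothetical 500 W load, a capacitor charged to 400 V, a converter that stops working below 100 V, and no conversion losses. The function name is also made up.

```python
def required_capacitance_farads(load_watts, holdup_seconds, v_start, v_cutoff,
                                efficiency=1.0):
    """Capacitance whose usable energy between v_start and v_cutoff covers the hold-up."""
    energy_needed = load_watts * holdup_seconds / efficiency
    return 2.0 * energy_needed / (v_start ** 2 - v_cutoff ** 2)


if __name__ == "__main__":
    c = required_capacitance_farads(500.0, 0.017, 400.0, 100.0)
    print(f"about {c * 1e6:.0f} uF")
```

Under these assumptions a little over 100 microFarads is enough, so a supply with a larger input capacitor, or one running well below its rated power, holds up correspondingly longer.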

Other systems can monitor raw DC in (e.g., 12V from a wall wart or brick) and detect a brown-out condition that way.

edited Mar 25 '23 at 20:23

answered Mar 25 '23 at 00:03


The ATX PWR_OK is not a warning that power will soon go away. It is a signal which indicates that power is not good any more, so it is used simply as a reset signal, which mostly prevents e.g. incorrect execution of code. – Justme Mar 25 '23 at 09:45
By “the host” you mean the motherboard, not the CPU, right? – Mar 25 '23 at 19:10
Same thing. CPU is on the motherboard. – hacktastical Mar 25 '23 at 19:49

score 11 · Answer 3 · answered Mar 24 '23 at 23:57

Most microcontrollers (at least those with a single supply) have brown out detectors that can let the uC know that it's come back from a power down.

In modern desktops there are many voltages and it's almost impossible for the CPU to tell if one is failing. Possibly, depending on the motherboard design, there may be a circuit that can tell the uC that voltages are failing.

Instead, though, most modern operating systems depend on robust file system techniques like journaling in order to save the file system from being corrupted (usually).

There are many examples of test equipment with embedded Windows, or embedded Linux, that are tolerant of sudden power outages. Also it is possible to design a system with a small backup battery, like a mini UPS, or perhaps a supercapacitor, that can run the system for long enough to support a graceful shutdown in the case of mains supply failure. As far as I can tell you are on your own for writing an interrupt routine for Linux to react in some way to a detected power failure.

score 3 · Answer 4 · answered Mar 25 '23 at 01:14

Something like Linux or Windows (running a file system and all that rot) requires a great deal of time to prepare for an unexpected shutdown. More like seconds than milliseconds. "Usually" Windows and "A bit less usually" Linux will recover from an unexpected reset or shutdown fairly transparently to the user.

A UPS with hefty batteries can detect such a situation and call for a shutdown. I don't think there is any special code that calls for an unusually fast "emergency shutdown" in case of an impending power loss (of the file system in particular) - faster than something like 'sudo shutdown now' (which can take a minute or so), but it would make sense in some situations. The MCU does not have any direct way of knowing power is about to fail so the detection would have to be a bit ad hoc.

With microcontrollers, a brownout detection scheme can be used (many chips have the function built-in). Usually it just forces the chip into reset immediately upon the power going out of spec, and removes the reset only after a sizeable delay (hundreds of milliseconds). Although that may seem rude, and it does force a cold start (and it could cause file system corruption if writing is going on), there are worse things. The MCU may have millions or billions of cycles to get into mischief with an out-of-spec power source, perhaps trashing data in non-volatile memory stores or doing very undesirable things with external actuators. The out-of-spec power means that the MCU is free to do random things, such as ignore branch and return instructions.
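The reset behaviour described above can be modelled in a few lines. This is a simulation sketch, not code for any particular chip: the 2.7 V threshold and 200 ms release delay are illustrative assumptions, and the function name is made up.

```python
def reset_asserted(samples, threshold_v, release_delay_s):
    """For each (time_s, volts) sample, report whether reset is held.

    Reset asserts immediately when the supply is below the threshold and is
    released only after the supply has stayed good for release_delay_s.
    """
    states = []
    good_since = None  # time at which the supply last became good
    for t, v in samples:
        if v < threshold_v:
            good_since = None
            states.append(True)
        else:
            if good_since is None:
                good_since = t
            states.append(t - good_since < release_delay_s)
    return states


if __name__ == "__main__":
    trace = [(0.0, 3.3), (0.3, 3.3), (0.4, 2.0), (0.5, 3.3), (0.6, 3.3), (0.8, 3.3)]
    print(reset_asserted(trace, threshold_v=2.7, release_delay_s=0.2))
```

In the example trace the chip is held in reset at power-up, runs once the supply has been good for 200 ms, drops straight back into reset on the dip to 2.0 V, and is released again only after the supply has stayed good for another full delay.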

In some cases, it may be necessary to provide energy storage and early warning of impending failure for protection reasons. For example, an expensive sensor array may require power sequencing on shutdown as well as power-up.

score 2 · Answer 5 · answered Mar 25 '23 at 09:08

For MCUs with byte-addressable FRAM non-volatile memory, such as Texas Instruments' MSP430FR series, there is Intelligent System State Restoration after Power Failure with the Compute Through Power Loss (CTPL) Utility, which is a library for use in the application. CTPL can use the MCU's built-in VCC monitor to detect power loss, and to save the CPU and peripheral states into non-volatile FRAM before powering down. CTPL then automatically restores the application to where it last executed when power returns.

For a desktop computer, Intel® Optane™ Persistent Memory may in theory allow such a scheme. However, given the write endurance of Optane memory, the larger volatile state in a desktop computer, and the need for the BIOS and operating system to support save/restore of state to persistent memory, I'm not sure how easy it would be to create an application for a desktop computer which could handle the AC power being removed, and carry on where it left off when AC power returned.

answered Mar 25 '23 at 09:08

Chester Gillon

That Optane reference is indeed interesting. Initially on the drawing board, the 3D-Xpoint memory was supposed to be an "EPROM with DRAM-like access characteristics". Yet in SSD's, Optane behaves more like a new, faster generation of Flash. It does have finer-grained access, but still has a limited number of TBW / erase cycles. The DIMM modules possibly work more like an ultra-fast Memory Technology Device = not as fast as RAM, and perhaps you need to do your own bad block management? And I suspect it's more for the server segment (ECC registered, also the sizes). And is possibly EOL by now. – frr Mar 25 '23 at 19:20
Micro Memory, and then Curtiss-Wright, did a NVRAM card which had battery-backed DDR memory as a card with PCI then PCIe form-factor. Curtiss-Wright / VMETRO / Micro Memory MM-5453J PCI Express x4 Low Profile NVRAM Card has some manuals of this EOL card (from a 2nd hand reseller as no longer appears on the Curtiss-Wright site). Am in the process of using the card to write a user mode driver under AlmaLinux 8.7 to use it, as no longer supported by the OS. – Chester Gillon Mar 25 '23 at 20:54
@frr: Yes, Optane DC-PM is non-volatile DIMMs. (Or was; discontinued now). They're connected to the memory bus, not PCIe, so can be memory-mapped directly by applications, which can use clflushopt and sfence to control the persistence order of stores. BTW, I thought it was only supported in Xeon CPUs; you could have one on your desktop, but the target market is database servers and similar transaction processing. It won't work in normal desktops with "client" CPUs, AFAIK. – Peter Cordes Mar 26 '23 at 02:56
The same journaling techniques that work for databases and filesystems can be used to maintain consistency. https://blocksandfiles.com/2019/04/04/enduring-optane-dimm-question-is-its-endurance-good-enough-yes-intel-has-delivered/ suggests they do have wear leveling, on top of 3D XPoint's high write endurance. Or it can be used the boring way, as a faster SSD with DRAM as a write-back cache like for normal memory-mapped files. But I don't think it's realistic to consider a machine with only NV DIMMs, especially not Optane due to performance; possibly battery-backed DRAM. – Peter Cordes Mar 26 '23 at 02:57