Beyond Spectre & Meltdown CPU Bugs
by Jean-Louis Gassée

We hear so much about software security vulnerabilities that we forget how buggy microprocessors can be. Some of these are “honest” bugs — mistakes in firmware code, errors in execution. But others stem from industry-wide design principles. These bugs are very subtle, hard to detect and cure, and they should cause us to reconsider our trust in CPU hardware.
Most computer bugs go unnoticed by the general public — as opposed to our tech milieu. An update fixes the problem and the typical user is none the wiser. But two recent security vulnerabilities, Meltdown and Spectre, were momentous enough for headlines to spill outside of the tech world.
First surprise: Meltdown and Spectre aren’t flaws in an operating system, nor in cryptography or networking; they’re microprocessor hardware security bugs. Second revelation: these defects impact most CPU chips based on x86 (Intel, AMD) or ARM architectures, in other words, nearly every device we use.
This unusual combination invites contemplation of the bugs’ cause (note the singular) and history, and leads us to a worried musing.
Let’s start with the root cause: Modern microprocessors are very fast, with clock speeds reaching 2, 3, even 4 GHz, but the world around them is much slower. This discrepancy led to sophisticated designs that bridge the speed gap — designs that have subtle, deeply buried security cracks.
As an example, fast memory needs 100 nanoseconds to respond to a read request. This amounts to hundreds of clock cycles wasted while the CPU is waiting for data. The speed differential gets much worse when waiting for external devices, disk drives, network connections, sensors…
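To make the “hundreds of clock cycles” concrete, here is the arithmetic as a few lines of Python. The 100 ns latency and the 2, 3, and 4 GHz clock speeds come from the text above; the function name is just an illustrative choice.

```python
# One GHz means one clock cycle per nanosecond,
# so cycles lost = latency in ns * clock speed in GHz.

MEMORY_LATENCY_NS = 100  # figure quoted in the text


def stalled_cycles(latency_ns: int, clock_ghz: int) -> int:
    """Cycles the CPU sits idle while a single read is outstanding."""
    return latency_ns * clock_ghz


for ghz in (2, 3, 4):
    cycles = stalled_cycles(MEMORY_LATENCY_NS, ghz)
    print(f"{ghz} GHz: {cycles} cycles lost per memory read")
```

At 3 GHz, a single uncached read costs the CPU about 300 cycles of potential work.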
To mitigate this waste, system designers invented devices such as the data cache, “a hardware or software component that stores data so future requests for that data can be served faster” [Wikipedia]. While a cache can help, it isn’t proactive — if the requested data isn’t in the cache, cycles are still wasted.
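A toy model shows both the benefit and the limitation. This is only a sketch: the class name and the cycle costs (4 for a hit, 300 for a miss) are assumptions picked for illustration, not measurements of any real chip.

```python
class ToyCache:
    """A dictionary standing in for a data cache in front of slow memory."""

    HIT_COST = 4     # assumed cycles when the data is already cached
    MISS_COST = 300  # assumed cycles for a round trip to main memory

    def __init__(self, memory):
        self.memory = memory  # the slow backing store
        self.lines = {}       # what the cache currently holds
        self.cycles = 0       # running total of cycles spent on reads

    def read(self, address):
        if address in self.lines:
            self.cycles += self.HIT_COST
        else:
            # Not proactive: the first access still pays the full price.
            self.cycles += self.MISS_COST
            self.lines[address] = self.memory[address]
        return self.lines[address]


memory = {address: address * 2 for address in range(8)}
cache = ToyCache(memory)
cache.read(5)  # miss, 300 cycles
cache.read(5)  # hit, 4 cycles
print(cache.cycles)  # 304
```

The second read is nearly free, but the first one is as slow as ever. That remaining cost is what the next trick goes after.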
To recover still more of those lost cycles, designers came up with speculative execution, “an optimization technique where a computer system performs some task that may not be needed” [Wikipedia, again]. An example, admittedly simplistic, will help. While it’s waiting for data to arrive, the CPU could “fake it”: It could pretend that the data has actually arrived, and execute as much of the “happy path” code as possible. When the data finally arrives, the program continues, already a few steps farther along the path. If the wait is long enough, the CPU could even hedge its bet and execute both the success branch and the error branch (“sorry, no data”), and then throw away the unneeded branch when the data arrives (or doesn’t).
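The same simplistic example can be written down as a sketch. Everything here is hypothetical (the function names, the dictionary standing in for machine state), and Python runs the steps one after the other where real hardware overlaps them with the wait. What it does show is the shape of the bet: do the work on private copies, then keep one copy and discard the other.

```python
def run_with_speculation(fetch, on_success, on_failure, state):
    """Prepare both outcomes on private copies of the state, then keep
    only the copy that matches what the slow fetch really returned."""
    success_copy = on_success(dict(state))  # "happy path", computed early
    failure_copy = on_failure(dict(state))  # "sorry, no data" path
    data = fetch()                          # the slow operation resolves
    if data is not None:
        success_copy["data"] = data
        return success_copy                 # failure_copy is thrown away
    return failure_copy                     # success_copy is thrown away


def happy_path(state):
    state["buffer_ready"] = True
    return state


def error_path(state):
    state["error"] = "sorry, no data"
    return state


result = run_with_speculation(lambda: 42, happy_path, error_path, {"step": 1})
print(result)  # {'step': 1, 'buffer_ready': True, 'data': 42}
```

Note that each branch gets its own copy of the state. In this toy, a discarded branch leaves no trace at all; real processors have a much harder time of it.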
Unfortunately, speculative execution isn’t quite as simple or as clean as we’d like it to be.
First, care must be taken to ensure that the two branches don’t step on each other’s data. This requires sophisticated memory management, and it may not be possible at all if an operation, such as a signal to the outside world, can’t be isolated and reversed.
Second, the sophisticated house-cleaning required to make it look like the “bad” branch never happened isn’t perfect; it leaves tiny cracks in the system’s logic, such as traces of the discarded work left behind in the cache. This is where the headline-making CPU bugs occur: By exploiting these cracks, a sophisticated attacker can gain access to protected memory containing valuable information such as system or user keys.
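A toy simulation gives a feel for how such a crack can be exploited. It is loosely modeled on the bounds-check-bypass pattern publicly described for Spectre, but it is a sketch and not an exploit: the class, its methods, and the “cache” (a plain Python set) are all invented for illustration. A real attack has to train the branch predictor and infer cache state from timing measurements, none of which is modeled here.

```python
class ToyCPU:
    """Toy machine: the secret sits in memory right after a public array."""

    def __init__(self, public, secret):
        self.memory = list(public) + list(secret)
        self.public_len = len(public)
        self.cached = set()  # probe lines currently "in the cache"

    def victim_read(self, index):
        # Speculation: the load and a dependent access run BEFORE the
        # bounds check has resolved.
        value = self.memory[index]    # possibly out of bounds
        self.cached.add(value)        # touches probe line number `value`
        if index >= self.public_len:  # the bounds check finally resolves
            return None               # the result is squashed, the cache is not
        return value

    def probe_is_fast(self, line):
        # Stand-in for timing a read: a cached line would come back quickly.
        return line in self.cached


def leak_byte(cpu, index):
    """Recover one out-of-bounds byte purely from the cache footprint."""
    cpu.cached.clear()                # flush the probe lines
    cpu.victim_read(index)            # architecturally returns None
    for guess in range(256):
        if cpu.probe_is_fast(guess):
            return guess
    return None


cpu = ToyCPU(public=b"public", secret=b"key")
leaked = bytes(leak_byte(cpu, cpu.public_len + i) for i in range(3))
print(leaked)  # b'key'
```

The victim never hands the secret to the caller; every out-of-bounds read returns nothing. The secret escapes anyway, through the side effect that the house-cleaning forgot.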
This is an intriguing type of bug. It is, in a manner of speaking, a bug in principle rather than in execution. The flaw isn’t due to a mistyped operator in programming code; it’s the result of a complicated set of thoughts, an industry-wide way of thinking about arcane processor design decisions.
(For a much more detailed discussion of speculative execution and its woes, see this Eben Upton précis. Thanks to Steve Sinofsky [@stevesi] for the pointer.)
That’s only part of the story. It now appears that “in principle” trouble with microprocessors was understood more than 20 years ago. In a 1995 paper titled The Intel 80x86 Processor Architecture: Pitfalls for Secure Systems, authors Sibert, Porras, and Lindell described architecture subtleties and implementation errors that made many x86 processors undesirable for secure systems. In particular, the paper’s authors pointed to memory architecture flaws that allow unwanted peeks into “protected” processes — precisely the sort of trouble that we’re seeing today.
The concern that Sibert et al. expressed decades ago, and its realization as Meltdown and Spectre, should shake our old habits of the mind. We’ve come to believe that while software is a petri dish of deadly germs, the CPU is a reliable, antiseptic “hard truth”. We should have remembered Marvin Minsky’s (perhaps apocryphal) apothegm: Hardware and software aren’t fundamentally different; they’re merely different levels of crystallization of logic.
A final, disquieting thought.
The ARM architecture, combined with CAD (Computer-Aided Design, or Electronic Design Automation), allows an individual or a small team to design a microprocessor dedicated to a specific set of tasks such as home automation or industrial process monitoring. Once your design is finished, you have a choice of hundreds of “Pure Play foundries”, all ready to manufacture your chip. Great.
But…
A malicious designer can hide undocumented and virtually undetectable functions inside their chip. This would amount to a mole inside any home or industrial system that uses devices built with the diabolical processor.
We’ll recall how thousands of Internet-connected cameras, compromised by design, were once harnessed to unleash DDoS (Distributed Denial of Service) attacks on business websites. Now imagine more sophisticated operations on a much broader scale when the moles “call home” and the modern Fifth Column attacks our infrastructure. How does one vet a “modest” dedicated microprocessor that uses hundreds of thousands of transistors?
Things will get increasingly interesting.
_________________________________________________________________
Wishing Monday Note readers a Happy New Year, I’m happy to be back in Palo Alto after three weeks in France visiting family, friends and monuments such as the very cold, wet, and monastic Mont Saint Michel.
— JLG@mondaynote.com