‘Magical’ Error Correction Scheme Proved Inherently Inefficient

If you’ve ever sent a text message, played a CD, or stored a file in the cloud, you’ve benefited from error correction. This revolutionary idea dates back to the 1940s, when researchers first realized that it’s possible to rewrite any message in a form that allows later corruption to be easily reversed.

Over the years, researchers have developed many ingenious schemes, called error-correcting codes, that encode data in different ways and use different procedures to fix errors. But to theoretical computer scientists, few are as compelling as so-called locally correctable codes. These codes have two simultaneous properties that sound almost contradictory: Any error can be corrected by reading the encoded data in just a few places, yet no attacker can foil this correction procedure by selectively tampering with the code. It’s as if you could recover any page torn out of a book by just glancing at a few others.

“It’s quite a magical phenomenon,” said Tom Gur, a computer scientist at the University of Cambridge. “A priori, it’s not obvious that such a mathematical object could exist at all.”

But this magic comes at a steep cost. The only known examples of locally correctable codes are extremely inefficient — encoding any message also makes it exponentially longer. Entire books encoded this way would be far too unwieldy.

Computer scientists have long wondered whether better locally correctable codes are possible. They’ve focused especially on codes that use only three queries to correct any error, hoping that this severe restriction might make these codes easier to understand. But even this simple case has stumped researchers for over 20 years.

Now the computer scientist Pravesh Kothari of Carnegie Mellon University and his graduate student Peter Manohar have finally proved that it’s impossible to build a three-query locally correctable code that avoids that exponential cost. It may be a negative result, but anything that clarifies the limits of error correction is exciting to researchers, especially because the mathematics of locally correctable codes crops up in areas far removed from communication.

“This result is amazing,” said Shubhangi Saraf, a computer scientist at the University of Toronto. “It’s a huge breakthrough.”

Strength in Numbers

To understand error correction, imagine the data you’d like to protect as a sequence of bits, or 0s and 1s. An error, in this model, can be any unwanted flip of a 0 into a 1 or vice versa, whether it’s due to a random fluctuation or deliberate tampering.

Suppose you want to send a message to a friend, but you’re concerned that errors might change the meaning. One simple strategy is to replace each 0 in your message with 000 and each 1 with 111. If your friend sees a part of the message that doesn’t contain three identical bits in a row, they’ll know that an error has occurred. And if errors are random and relatively rare, a triplet that reads 110 is much more likely to be a corrupted 111 than a corrupted 000. A simple majority vote within each triplet will suffice to correct most errors.
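The repetition scheme described above is simple enough to sketch in a few lines of code (the function names here are illustrative, not from any standard library):

```python
def encode(bits):
    """Repetition code: replace each bit with three copies of itself."""
    return [b for bit in bits for b in (bit, bit, bit)]

def decode(encoded):
    """Correct errors by majority vote within each triplet of bits."""
    return [1 if sum(encoded[i:i + 3]) >= 2 else 0
            for i in range(0, len(encoded), 3)]

message = [1, 0, 1]
sent = encode(message)   # [1, 1, 1, 0, 0, 0, 1, 1, 1]
sent[4] = 1              # a random bit flip corrupts one triplet
print(decode(sent))      # majority vote recovers [1, 0, 1]
```

Note that each corrected bit depends on only three stored bits, which is exactly the locality that makes this code easy to decode but easy to sabotage.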

This scheme, called the repetition code, has the virtue of simplicity, but little else to recommend it. For one thing, it requires tripling the length of every message just to deal with relatively infrequent errors, and if there’s a decent chance of two adjacent errors, we’ll need even more redundancy. Worse still, it quickly becomes useless if errors aren’t random, such as when attackers actively try to sabotage the code. In the repetition code, all the information needed to correct a given bit is stored in just a few other bits, leaving it vulnerable to a targeted attack.

Fortunately, many error-correcting codes fare better. One famous example, called the Reed-Solomon code, works by transforming messages into polynomials — mathematical expressions like x² + 3x + 2 that consist of different terms added together, each with a variable (such as x) raised to a different power. Encoding a message using a Reed-Solomon code involves building a polynomial with one term for each character in the message, then plotting the polynomial as a curve on a graph and storing the coordinates of points that lie on the curve (taking at least one more point than the number of characters). Errors might push a few of these points off the curve, but if there aren’t too many errors, only one polynomial curve will pass through most of the points. That curve almost certainly corresponds to the true message.
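The idea above — store points on a polynomial curve, then recover the curve that passes through most of them — can be sketched directly. This toy decoder brute-forces over subsets of points rather than using an efficient algorithm like Berlekamp-Welch, and the function names and example polynomial are illustrative assumptions, not part of any real Reed-Solomon implementation:

```python
from fractions import Fraction
from itertools import combinations

def interpolate(points, x):
    """Evaluate the unique polynomial through `points` at x,
    using Lagrange interpolation with exact rational arithmetic."""
    total = Fraction(0)
    for xi, yi in points:
        term = Fraction(yi)
        for xj, _ in points:
            if xj != xi:
                term *= Fraction(x - xj, xi - xj)
        total += term
    return total

def correct(points, k):
    """Find the k points whose curve agrees with the most stored points.
    Brute force for illustration; real decoders are far more efficient."""
    best, best_hits = None, -1
    for subset in combinations(points, k):
        hits = sum(interpolate(subset, x) == y for x, y in points)
        if hits > best_hits:
            best, best_hits = subset, hits
    return best

# The message (2, 3, 1) becomes the polynomial 2 + 3x + x², stored as
# its values at x = 0..5 — six points for a three-character message.
poly = lambda x: 2 + 3 * x + x * x
points = [(x, poly(x)) for x in range(6)]
points[2] = (2, 99)  # an error pushes one point off the curve
good = correct(points, 3)
# The recovered curve matches the original polynomial everywhere:
assert all(interpolate(good, x) == poly(x) for x in range(6))
```

Because the decoder considers every stored point, a single corrupted coordinate cannot hide: the true curve still agrees with the majority. That global view is the code's strength — and, as the next section explains, also its weakness.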

Reed-Solomon codes are hyperefficient — you only need to store a few extra points to correct errors, so any encoded message is only marginally longer than the original. They’re also less vulnerable to the sort of targeted disruption that would spell disaster for the repetition code, because the information used to correct an error anywhere is distributed across the entire encoded message.

Think Globally, Act Locally

The strength of the Reed-Solomon code stems from interconnectedness. But precisely because of that interconnectedness, there’s no way to fix a single error in an encoded message without reading the whole thing. That may not sound like a problem in the context of communication: If you’re sending a message, you probably want the recipient to read all of it. But it can be a liability in data storage — another major application of error correction.

Consider a company that stores users’ emails in the cloud — that is, on a vast array of servers. You can think of the whole collection of emails as one long message. Now suppose one server crashes. With a Reed-Solomon code, you’d need to perform a massive computation involving all the encoded data to recover your emails from that one lost server. “You would have to look at everything,” said Zeev Dvir, a computer scientist at Princeton University. “That could be billions and billions of emails — it could take a really long time.”

Researchers use the term “local” to describe codes that use only a fraction of the encoded message to spot errors or correct them. The simple repetition code has something of this local character, but that’s precisely what makes it so vulnerable to tampering. A locally correctable code, by contrast, gets the best of both worlds — it can correct an error in any bit with just a few queries, all without losing the interconnectedness that makes Reed-Solomon codes so resilient.

“This is a really stringent notion,” Kothari said.
