May 11, 2015

Bits, Eubits, Logfreedom, and Entropy


Typically we express entropy content in units of bits (zeroes and ones). This makes sense in light of the ease of constructing binary computers. But we could just as easily express entropy in units of hexadecimal digits, or even (gag!) decimal digits. Those are all examples of integer-base number systems. We could, in fact, have an irrational base for our number system, say, Euler's number. Just as one hexadecimal digit is worth 4 bits because log2(16) = 4, one "eubit" (YOO-bit) is worth log2(e) = 1.442695... bits. The motivation for this is that logfreedom is computed with natural logarithms, so expressing entropy in eubits saves us the trouble of converting back and forth between logfreedom and bits, which is wasteful because they measure the same thing.
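
To make the unit concrete, here is a minimal Python sketch (the function names are my own) of the conversion in both directions:

import math

BITS_PER_EUBIT = math.log2(math.e)  # 1.442695... bits per eubit

def eubits_to_bits(eubits):
    # Convert entropy measured in eubits (natural-log units) to bits.
    return eubits * BITS_PER_EUBIT

def bits_to_eubits(bits):
    # Convert entropy measured in bits to eubits (natural-log units).
    return bits / BITS_PER_EUBIT

print(eubits_to_bits(1.0))            # 1.4426950408889634
print(bits_to_eubits(math.log2(16)))  # one hex digit = 4 bits = 2.7725... eubits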

Thus what logfreedom is expressing is the amount of entropy required to exactly represent one of a huge number (the way count) of possible mask lists compliant with a given population list and mask span. (If you absolutely must think in bits, then simply multiply by log2(e).) Logfreedom is not concerned with whether or not we literally wish to represent the mask list that way, which is the business of combinatorial data compression, but rather with quantifying the smallest amount of information into which it could be reversibly compressed, absent any statistical bias toward one possible mask list versus another -- in other words, the entropy content of the mask list. (In practice, of course, we would need to round up the storage requirement to an integer number of bits, but this is a trivial consideration.)
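
For concreteness, here is a minimal Python sketch of that quantity, assuming the definitions used so far: a mask list of Q masks, each drawn from a span of Z possible values, and a population list in which H[k] counts the mask values occurring exactly k times. Under those definitions the way count is (Q! / (product over k of (k!)^H[k])) * (Z! / (product over k of H[k]!)), and logfreedom is simply its natural log, that is, its size in eubits:

import math
from collections import Counter

def ln_factorial(n):
    # Natural log of n!, via the log-gamma function.
    return math.lgamma(n + 1)

def logfreedom(mask_list, mask_span):
    # Logfreedom, in eubits, of the population list implied by mask_list,
    # where each mask is an integer in [0, mask_span).
    q = len(mask_list)                 # mask count Q
    freq = Counter(mask_list)          # frequency of each mask value seen
    pop = Counter(freq.values())       # population list: H[k]
    pop[0] = mask_span - len(freq)     # mask values that never occur
    result = ln_factorial(q) + ln_factorial(mask_span)
    for k, h_k in pop.items():
        result -= ln_factorial(h_k) + h_k * ln_factorial(k)
    return result

# 5 masks on a span of 8: population list H[1] = 3, H[2] = 1; way count 16800.
print(logfreedom([3, 1, 4, 1, 5], 8))  # ln(16800) = 9.729... eubits, about 14.04 bits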

Put another way, we can choose to quantify entropy as the logfreedom of a population list corresponding to a particular mask list and mask span.

Note that this fundamentally differs from the definition of entropy from the perspective of compressed sensing. Compressed sensing begins with the assertion that most complicated real world signals are dominated by a small number of simple signals of high amplitude -- such that the signal can be, for all practical purposes, precisely reconstructed from a small set of symbols. This is equivalent to a large mask list dominated by a "zero mask" (corresponding to zero amplitude) with a small number of nonzero masks mixed in. But viewed from the perspective of logfreedom, the simplest systems are those with the least way count -- often, but not necessarily, those which are dominated by a single such zero mask.

But which definition of entropy is more reasonable? The trouble with the compressed sensing definition is that it emphasizes the quantity of "large" components (essentially, the L0 norm of a list subjected to a zeroing threshold), not their diversity of amplitude or lack thereof. It ignores another empirical fact about real world signals, which is that they tend to exhibit components of equal amplitude (once digitized). For example, a realistic signal might have 10 components each of amplitude 3. From the compressed sensing perspective, this would be more complicated than, say, 4 components of amplitudes 3, 17, 19, and 24. Depending on the number of samples in the digitized raw data, the former might actually have lesser logfreedom (and therefore consume less storage) than the latter, even if we assume that all other components would be ignored.
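
As a purely hypothetical illustration (the span of 256 amplitude levels and the total of 64 samples are my own arbitrary choices, not anything inherent to the argument), the following sketch compares the two cases, with all remaining samples set to zero; with these particular parameters, the 10 equal components do indeed yield the lesser logfreedom:

import math

def ln_factorial(n):
    return math.lgamma(n + 1)

Z = 256                          # hypothetical span: 8-bit amplitudes
Q = 64                           # hypothetical total sample (mask) count
BITS_PER_EUBIT = math.log2(math.e)

def logfreedom_bits(pop):
    # Logfreedom, in bits, for a population list pop = {frequency: count},
    # with the zero-frequency entry filled in from the span Z.
    pop = dict(pop)
    pop[0] = Z - sum(pop.values())
    eubits = ln_factorial(Q) + ln_factorial(Z) - sum(
        ln_factorial(h) + h * ln_factorial(k) for k, h in pop.items())
    return eubits * BITS_PER_EUBIT

# Case A: 10 samples of amplitude 3, the other 54 samples zero.
print(logfreedom_bits({10: 1, 54: 1}))   # roughly 53 bits
# Case B: one sample each of amplitudes 3, 17, 19, 24, the other 60 zero.
print(logfreedom_bits({1: 4, 60: 1}))    # roughly 59 bits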

We can therefore imagine a revitalization in the field of compressed sensing, focused not on L1, L-fraction, or L0 minimization, but rather on the minimization of logfreedom subject to some metric of signal reconstruction quality. I think this would lead to smaller compressed files and perhaps much higher performance than the former approach, not least because logfreedom can be accurately estimated from a subsample of the raw data, and computed with maximum precision in one pass over the data, without the expensive matrix operations required by matrix norm minimization algorithms such as Lasso and Homotopy.

To bring this all full circle, look at the following binary mask lists. Which one contains more entropy?

G0 = {0, 0, 0, 0, 0, 0, 0, 0, 1, 0}
G1 = {1, 0, 1, 0, 0, 1, 1, 1, 0, 1}

If you guessed G1, you're wrong. The answer is "neither" because the definition of "entropy" depends on one's expectation. With no expectation of any bias one way or the other, G0 and G1 contain the same amount of entropy, namely, 10 bits' worth. Granted, had I told you to expect mundane real world signals, then G1 would be the correct answer. On the other hand, had I told you that mask lists with 4 zeroes were disproportionately popular, then at a certain point G1, which has exactly 4 zeroes, would have less entropy than G0. Entropy content is truly in the eye of the beholder.
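
To see "in the eye of the beholder" in code, here is a small sketch (the biased model with a 10% chance of a one is my own toy choice, not part of the lesson) which scores both mask lists under two different expectations:

import math

G0 = [0, 0, 0, 0, 0, 0, 0, 0, 1, 0]
G1 = [1, 0, 1, 0, 0, 1, 1, 1, 0, 1]

def surprisal_bits(mask_list, p_one):
    # Self-information, in bits, of mask_list under a model in which each
    # mask is independently 1 with probability p_one.
    return -sum(math.log2(p_one if m else 1.0 - p_one) for m in mask_list)

# No expected bias: every length-10 binary mask list costs exactly 10 bits.
print(surprisal_bits(G0, 0.5), surprisal_bits(G1, 0.5))   # 10.0 10.0

# Expect sparse, mostly-zero signals: now G0 is cheap and G1 is expensive.
print(surprisal_bits(G0, 0.1), surprisal_bits(G1, 0.1))   # about 4.7 vs 20.5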

Logfreedom reflects the philosophy that in the real world, entropy varies monotonically with the number of states which a system can assume, once its population list and mask span have been chosen. Dyspoissonism, which is the topic of the next lesson, is the fractional difference between the entropy content as measured through the lens of logfreedom, and the entropy content obtained by assuming that all mask lists of the same mask count and mask span contain equal amounts of entropy.
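
Taken literally, with Q masks on a span of Z, the equal-entropy assumption assigns every mask list Q * ln(Z) eubits, so the description above suggests something along the lines of the following sketch (the next lesson pins the definition down precisely):

import math

def dyspoissonism(logfreedom_eubits, q, z):
    # Fractional shortfall of a logfreedom relative to q * ln(z) eubits, the
    # entropy of q masks on span z when every mask list is assumed equally
    # likely. A sketch of the description above only.
    return 1.0 - logfreedom_eubits / (q * math.log(z))

# E.g., Case A from the earlier sketch: logfreedom of about 36.8 eubits
# with Q = 64 and Z = 256.
print(dyspoissonism(36.8, 64, 256))   # about 0.90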
