May 12, 2015

Information Sparsity vs. Dyspoissonism

HomePrevious Lesson / Next Lesson

Related Dyspoissometer functions: dyspoissometer_sparsity_get().

Dyspoissonism provides an answer to the question "How compressible is this mask list?" Information sparsity answers a slightly different question: "How does this mask list compare to the most random possible mask list?" Like dyspoissonism, information sparsity is expressed as a fraction on [0, 1], but unlike dyspoissonism, it can easily reach either extreme in finite test cases.

The information sparsity, S, is given by:

S = 1 - (L / Lmax)

where L is the logfreedom of the mask list in question and Lmax is the maximum possible logfreedom, given Q and Z. (The ratio on the right is called "information density", but because it is usually close to one for real world data, sparsity is a more practical metric.) So more formally (and awkwardly), we have:

S = 1 - (L(H, Z) / Lmax(Q, Z))

See dyspoissometer_sparsity_get() in the source code for an implementation of S.

So S is a fraction which is zero when the logfreedom is maximal (Lmax), and zero when it's minimal (zero). The wonderful thing about S is that it allows us to compare various mask sets with different mask counts and mask spans in a normalized and reasonably fair manner, as opposed to dyspoissonism which, but for some trivial corner cases, has a nonzero lower bound which depends upon Q and Z. So we can conclude, for example, that mask set A is closer to its state of maximum randomness than mask set B, even though the latter is larger and involves more masks.

In theory, armed with S, we would have little or no use for D. Unfortunately, finding Lmax appears to involve a combinatorial problem of complexity which is bounded on the low side by Q -- yikes!

But all is not lost. We can, in practice, estimate Lmax to several digits in a matter of seconds on a cheap computer. Indeed, this is is Dyspoissometer for Files (command: "dyspoissofile") gives us the option to compute S, or not, and based on how many approximating iterations for Lmax. How to do that is the topic of the next lesson.


No comments:

Post a Comment