May 10, 2015

Way Count: Counting the Number of Configurations


Central to the discussion of information content in discrete sets is "way count", or the number of mask lists which comply with a given population list, H, and mask span, Z. For example, suppose we have:

H = {5, 1, 2, 1}

Thus we have five masks with frequency 1, one mask with frequency 2, two masks with frequency 3, and one mask with frequency 4. (K, the maximum frequency with nonzero population, is 4.) Remember that the population of frequency zero (H0) is omitted by convention, which means that we must provide the mask span as a given:

Z = 15

(In the real world, Z is almost always a power of 2, and usually a power of 256, because masks usually consist of one or a few bytes each. So compliance with Z is a foregone conclusion in practice, and not something we need to test for. But this is a math lesson.)

Quiz:

1. Find H0, the population of masks with zero frequency.

2. Find Q, the mask count.

Answers:

1.

H0 = Z - Σ(F = (1, K), H[F])
H0 = 6

("Σ(F = (1, K), H[F])" means "the sum of H[F] with F running from 1 through K, inclusive".)

2.

Q = Σ(F = (1, K), F H[F])
Q = (1 * 5) + (2 * 1) + (3 * 2) + (4 * 1)
Q = 17
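
For concreteness, here is a minimal Python sketch of the same two computations. (The variable names are mine, not Dyspoissometer's; the list is zero-indexed, so H[0] below holds the lesson's H[1].)

H = [5, 1, 2, 1]  # population list for frequencies 1 through K
Z = 15            # mask span

H0 = Z - sum(H)                                       # population of frequency zero
Q = sum(F * H_F for F, H_F in enumerate(H, start=1))  # mask count

print(H0, Q)  # 6 17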

Look at the frequency list, J, below. Does it comply with H?

J = {3, 4, 1, 1, 1, 3, 1, 2, 1}

We have: 5 (F = 1)s, 1 (F = 2), 2 (F = 3)s, and 1 (F = 4). So its population list is indeed H.

Does J also comply with Z? No, because J contains only 9 items corresponding to 9 masks. Let's try another frequency list, J':

J' = {0, 0, 1, 3, 1, 4, 0, 0, 1, 1, 2, 3, 0, 1, 0}

As you can see if you count the frequencies in J', it also complies with H. And this time, we have 6 masks with frequency zero (H0 = 6), so J' also complies with Z. Thus any mask list G having frequency list J' is a "way" of H and Z.
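
In code, deriving the population list from a frequency list is a one-pass counting job. Here is a minimal Python sketch (population_list is my own hypothetical helper, not part of Dyspoissometer):

from collections import Counter

def population_list(J):
    # H[F - 1] is the number of masks with frequency F, for F running from 1 through K.
    counts = Counter(J)
    K = max(J)
    return [counts[F] for F in range(1, K + 1)], counts[0]

J_prime = [0, 0, 1, 3, 1, 4, 0, 0, 1, 1, 2, 3, 0, 1, 0]
H, H0 = population_list(J_prime)
print(H, H0, len(J_prime))  # [5, 1, 2, 1] 6 15: complies with H, and with Z = 15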

The "way count" W of H and Z is the number of unique mask lists G complying with H and Z. We could calculate W by brute force:

W = 0
For every J compliant with H and Z
    For every G compliant with J
        W = W + 1
    Loop
Loop
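
To make this concrete, here is a minimal Python rendering of the same brute force, applied to a tiny case so that it actually finishes. (brute_force_way_count is my own hypothetical name, not Dyspoissometer's; the search is exponential in Q.)

from collections import Counter
from itertools import product

def brute_force_way_count(H, Z):
    # Count mask lists G of length Q over masks 0 through (Z - 1) whose population list is H.
    K = len(H)
    Q = sum(F * H_F for F, H_F in enumerate(H, start=1))
    W = 0
    for G in product(range(Z), repeat=Q):
        pops = Counter(Counter(G).values())
        if [pops[F] for F in range(1, K + 1)] == H:
            W += 1
    return W

# Tiny example: H = {2, 1} and Z = 4, so Q = 4 and there are only 4^4 = 256 candidates.
print(brute_force_way_count([2, 1], Z=4))  # 144, matching the formulas derived below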

Unfortunately, for all but the most trivial cases, this method rapidly becomes intractable. (Nonetheless, brute force testing proved quite useful for verifying Dyspoissometer's notion of W in simple test cases.) Fortunately, high school combinatorics can help us out here.

First of all, how many (J)s are compliant with H and Z? Well, we know that J contains Z items -- one for each mask. How many ways, X1, can we choose the masks with frequency 1? (There are H[1] such masks.)

X1 = Z nCr H[1]

where the right side means "Z combination H[1]".

And how many ways, Y1, can we choose the mask indexes corresponding to those masks? The least mask could occur at any of Q different mask indexes; the 2nd least mask could occur at any of (Q - 1); the 3rd least could occur at any of (Q - 2), etc. All possible combinations are allowed, so we need to take the product of these terms. So we have:

Y1 = Q nPr H[1]

where the right side means "Q permutation H[1]".

But what about masks with frequency 2? How many ways, X2, can we choose them? Well, after choosing the frequency-one masks, we have only (Z - H[1]) masks left from which to select a subset of size H[2]:

X2 = (Z - H[1]) nCr H[2]

Similarly, we can quantify X3 and X4, the number of ways in which we could choose masks of frequency 3 and 4, respectively:

X3 = (Z - H[1] - H[2]) nCr H[3]
X4 = (Z - H[1] - H[2] - H[3]) nCr H[4]
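
Plugging in the example values (Z = 15, H = {5, 1, 2, 1}), these are easy to spot-check in Python:

import math

X2 = math.comb(15 - 5, 1)          # 10
X3 = math.comb(15 - 5 - 1, 2)      # 36
X4 = math.comb(15 - 5 - 1 - 2, 1)  # 7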

But how many ways, Y2, can H[2] different masks each have frequency 2, thereby occurring a total of 2H[2] times? Recall that we have only (Q - H[1]) mask indexes remaining from which to choose. The first pair of mask indexes containing masks with frequency 2 can thus be chosen in ((Q - H[1]) nCr 2) ways. The next pair can then be chosen in ((Q - H[1] - 2) nCr 2) ways, the 3rd in ((Q - H[1] - 4) nCr 2) ways, the 4th in ((Q - H[1] - 6) nCr 2) ways, etc. So we have:

Y2 = Π(N = (1, H[2]), ((Q - H[1] - 2(N - 1)) nCr 2))

where "Π" means "product of" with the index, N, running from 1 to H[2]. Now, look at the "nCr 2" part. This term, first of all, implies that we have H[2] factors of (2!) (where "!" means "factorial"). (By the way, (0!) and (1!) are both defined to be one.) So we could simplify this to:

Y2 = Π(N = (1, H[2]), (Q - H[1] - 2(N - 1))(Q - H[1] - 2(N - 1) - 1)) / ((2!) ^ H[2])

Where "A ^ B" means (A raised to the power of B) -- not to be confused with the xor operator in C. And notice that we have expanded the pair of terms in the numerator of the combination into an explicit product of neighboring whole numbers. But if you look at this expression, we can simply replace the product series with a generic permutation:

Y2 = ((Q - H[1]) nPr (2H[2])) / ((2!) ^ H[2])
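
If you want to convince yourself that nothing was lost in that collapse, here is a quick numerical check in Python for the example at hand (variable names are mine):

import math

Q, H1, H2 = 17, 5, 1

# Product-of-combinations form: one "nCr 2" factor per frequency-2 mask.
Y2_product = math.prod(math.comb(Q - H1 - 2 * (N - 1), 2) for N in range(1, H2 + 1))

# Collapsed form: ((Q - H[1]) nPr 2H[2]) / ((2!) ^ H[2]).
Y2_collapsed = math.perm(Q - H1, 2 * H2) // math.factorial(2) ** H2

print(Y2_product, Y2_collapsed)  # 66 66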

For Y3, we have (Q - H[1] - 2H[2]) mask indexes available, from which to choose the 3H[3] that will contain all the frequency-3 masks. Using a very similar process, we can easily derive:

Y3 = ((Q - H[1] - 2H[2]) nPr (3H[3])) / ((3!) ^ H[3])

and then, analogously:

Y4 = ((Q - H[1] - 2H[2] - 3H[3]) nPr (4H[4])) / ((4!) ^ H[4])
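
Spot-checking these with the example values (Q = 17, H = {5, 1, 2, 1}):

import math

Y3 = math.perm(17 - 5 - 2 * 1, 3 * 2) // math.factorial(3) ** 2          # 4200
Y4 = math.perm(17 - 5 - 2 * 1 - 3 * 2, 4 * 1) // math.factorial(4) ** 1  # 1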

By the way, look again at Y1:

Y1 = Q nPr H[1]

Note that we can rewrite it in the same pattern as the higher-order (Y)s:

Y1 = (Q nPr 1H[1]) / ((1!) ^ H[1])

Finding W is then a simple matter of plugging in the appropriate items from:

H = {5, 1, 2, 1}

then taking products:

W = X1 * X2 * X3 * X4 * Y1 * Y2 * Y3 * Y4
W = 3003 * 10 * 36 * 7 * 742,560 * 66 * 4200 * 1
W = 1,557,688,630,417,920,000
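
A short Python sketch makes the bookkeeping explicit (again, the names are mine; the loop walks the frequencies from 1 through K, peeling off masks and mask indexes as it goes):

import math

H = {1: 5, 2: 1, 3: 2, 4: 1}  # population list keyed by frequency
Z = 15
Q = sum(F * H_F for F, H_F in H.items())  # 17

W = 1
masks_left, indexes_left = Z, Q
for F, H_F in sorted(H.items()):
    X_F = math.comb(masks_left, H_F)                                    # which masks get frequency F
    Y_F = math.perm(indexes_left, F * H_F) // math.factorial(F) ** H_F  # where their F * H_F instances land
    W *= X_F * Y_F
    masks_left -= H_F
    indexes_left -= F * H_F

print(W)  # 1557688630417920000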

So there you have it: we can map every unique mask list compliant with H and Z to an integer on [0, (1,557,688,630,417,920,000 - 1)]. The mechanics of performing such a mapping are nontrivial and computationally expensive. Nevertheless, this concept is the basis of combinatorial data compression: given Z (usually implied) and H (explicitly specified in a usually-negligible amount of information), we can encode a single integer of ceiling(log2(W)) bits which exactly implies -- but is (maybe vastly) smaller than -- the mask list. Assuming that all compliant mask lists occur with equal probability, combinatorial compression is maximally efficient (minimum size).
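
For the example above, the integer alone is slightly smaller than the raw mask list (ignoring the cost of specifying H, which the toy size of this example exaggerates). A minimal Python sketch, assuming the naive alternative stores each of the Q masks in a fixed ceiling(log2(Z)) bits:

W = 1557688630417920000

code_bits = (W - 1).bit_length()         # ceiling(log2(W)) = 61
naive_bits = 17 * (15 - 1).bit_length()  # Q * ceiling(log2(Z)) = 17 * 4 = 68

print(code_bits, naive_bits)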

But for the purposes of statistical analysis as opposed to data compression, we have an obvious problem: W is huge, approaching the capacity of a 64-bit integer -- all from an H with only 4 small items! In practice, standard integer types are useless for this reason, but even quad precision floating-point is rapidly exhausted by computing way count. We have to do something about this.

First of all, we can simplify the expression for W. Generically, we have:

W = X * Y

where:

X = Π(F = (1, K), (Z - Σ(N = (1, (F - 1)), H[N])) nCr H[F])

and

Y = Π(F = (1, K), ((Q - Σ(N = (1, (F - 1)), N H[N])) nPr (F H[F])) / ((F!) ^ H[F]))

But by expanding the combinations and permutations, we can simplify X and Y as follows:

X = (Z!) / ((H0!) * Π(F = (1, K), (H[F]!)))
Y = (Q!) / Π(F = (1, K), (F!) ^ H[F])

This is somewhat intuitive: (Q!) reflects that all mask indexes are used, but ((Z!) / (H0)!) reflects that only masks with nonzero frequency are used. But is this really correct? Try the example again:

X = (15!) / ((6!) * (5!) * (1!) * (2!) * (1!))
Y = (17!) / (((1!) ^ 5) * ((2!) ^ 1) * ((3!) ^ 2) * ((4!) ^ 1))

X = 7,567,560
Y = 205,837,632,000
W = XY
W = 1,557,688,630,417,920,000
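
Wrapped up as a function, the factored forms take only a few lines of Python (way_count is my own hypothetical helper, not Dyspoissometer's API; Python's unbounded integers sidestep the overflow problem for the moment):

from math import factorial, prod

def way_count(H, Z):
    # H[F - 1] is the population of frequency F, for F running from 1 through K.
    K = len(H)
    Q = sum(F * H[F - 1] for F in range(1, K + 1))
    H0 = Z - sum(H)
    X = factorial(Z) // (factorial(H0) * prod(factorial(H_F) for H_F in H))
    Y = factorial(Q) // prod(factorial(F) ** H[F - 1] for F in range(1, K + 1))
    return X * Y

print(way_count([5, 1, 2, 1], Z=15))  # 1557688630417920000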

Looks good! Now proceed to the next lesson to see how we deal with the problem of huge W.

