Wednesday, March 6, 2013

Lexicographic Index of Subsets


Suppose we have a set of $n$ items labelled $\{1,2,...,n\}$. We want to enumerate the subsets of $s$ items somehow. A natural way to do this is to use a lexicographic ordering of the sorted sets. So for example if $n=10$ and $s=4$, we would get the following assignments:

$\{1, 2, 3, 4 \} : 1$
$\{1, 2, 3, 5 \} : 2$
$\{1, 2, 3, 6 \} : 3$
...
$\{7, 8, 9, 10 \} : {10 \choose 4 }$

Solution 1 - Counting Above

The best way to show this is by example. The lexicographic index is the total number of subsets minus the number of subsets lexically larger. Suppose $n = 10$ and $s=4$ and our subset $u = \{2,4,5,9\}$. The subsets that are lexically larger than $u$ look like the following:

$ \#\{ > 2, ... \} = {8 \choose 4} $
$ \#\{ 2, >4, ... \} = {6 \choose 3} $
$ \#\{ 2, 4, >5, ... \} = {5 \choose 2} $
$ \#\{ 2, 4, 5, >9 \} = {1 \choose 1} $

It is easy to see the pattern and generalize. Our formula is then
\[ enum(u) = {n \choose s} - \sum_{j=1}^{s}{n-u_j \choose s-j+1} \]

Solution 2 - Counting Below

The lexicographic index is one plus the number of subsets lexically smaller. Suppose $n = 10$ and $s=4$ and our subset $u = \{2,4,5,9\}$. The subsets that are lexically smaller look like the following:

$ \# \{< 2, ... \} = \# \{ ... \} - \# \{ \geq 2, ... \} = { 10 \choose 4} - {9 \choose 4} $
$ \# \{2, < 4, ... \} = \# \{ 2, ... \} - \# \{ 2, \geq 4, ... \} = { 8 \choose 3} - {7 \choose 3} $
$\# \{2, 4, <5, ...\} = \# \{ 2, 4, ... \} - \# \{ 2, 4, \geq5, ... \} = { 6 \choose 2} - {6 \choose 2}$
$\# \{2, 4, 5, <9\} = \# \{ 2, 4, 5, ... \} - \# \{ 2,4,5, \geq 9 \} = { 5 \choose 1} - {2 \choose 1} $

It is easy to see the pattern and generalize. Defining $u_0 = 0$, our formula is then 
\[ enum(u) = 1 + \sum_{j=1}^{s} \left[ {n-u_{j-1} \choose s-j+1} - {n-u_j+1 \choose s-j+1} \right] \]
If you telescope this sum, you end up with formula given by Solution 1.
Exercise: How can we calculate the inverse? Given a number, what subset does it correspond to?



___________________________________________________________________________________

No comments:

Post a Comment