Neill Robson's Website

Lucas's Theorem

Let’s explore a theorem that breaks massive combinatorial problems down into hand-computable pieces.

Combination questions take the form of “how many ways can you choose $y$ items from $x$ options?” If you’ve got six flowers, all different colors, how many ways can you choose three of them for a small bouquet?

The standard syntax and factorial definition of combinations (better known as binomial coefficients) are:

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!}$$

While accurate, the above formula is difficult to work with for even moderately large values. The value of $\binom{482}{176}$ is practically intractable to compute using factorials. Most applications don’t even have a use for a number that large.
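For a sense of scale, Python’s arbitrary-precision integers can still brute-force the exact value even though a human cannot; the result runs to well over a hundred decimal digits:

```python
from math import comb

# comb(n, k) computes the exact binomial coefficient using big integers.
value = comb(482, 176)
print(len(str(value)))  # number of decimal digits in the raw value
```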

Taking the residue constrains the magnitude of the final result, but the computation is still prodigious:

$$\binom{n}{k} \equiv x \pmod{m}$$

Assuming the modulus $m$ is prime, Lucas’s theorem gives us a useful decomposition of this problem.

A binomial coefficient under a prime modulus $p$ is congruent to the product of the binomial coefficients created by the base-$p$ digits of the original coefficient. In mathematical notation:

$$\binom{n}{k} \equiv \prod_{i=0}^{d} \binom{n_i}{k_i} \pmod{p}$$

For example, take the $\binom{482}{176}$ example from above. Convert the two values to base 5:

$$\begin{align*}
482 &= 3 \cdot 5^3 + 4 \cdot 5^2 + 1 \cdot 5^1 + 2 \cdot 5^0\\
176 &= 1 \cdot 5^3 + 2 \cdot 5^2 + 0 \cdot 5^1 + 1 \cdot 5^0
\end{align*}$$
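A base conversion like this is easy to script. Here is a minimal sketch (the function name is my own) that returns the base-$p$ digits, least significant first:

```python
def base_p_digits(n: int, p: int) -> list[int]:
    """Base-p digits of n, least significant first."""
    digits = []
    while n > 0:
        digits.append(n % p)
        n //= p
    return digits or [0]

print(base_p_digits(482, 5))  # [2, 1, 4, 3] -> 2*5^0 + 1*5^1 + 4*5^2 + 3*5^3
print(base_p_digits(176, 5))  # [1, 0, 2, 1]
```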

The coefficients to the powers of 5 are inserted into the binomial coefficient product:

$$\begin{align*}
\binom{482}{176} &\equiv \binom{3}{1} \binom{4}{2} \binom{1}{0} \binom{2}{1} \pmod 5\\
&\equiv 3 \cdot 6 \cdot 1 \cdot 2 \pmod 5\\
&\equiv 36 \pmod 5\\
&\equiv 1 \pmod 5
\end{align*}$$

Considering the fact that the non-residue value is approximately $9.2 \times 10^{135}$, computing a residue so efficiently with Lucas’s theorem is quite a feat.
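The whole procedure translates directly into code. The sketch below (names are my own, not from any particular library) peels off base-$p$ digits and multiplies the small binomial coefficients together, then checks the result against a brute-force computation:

```python
from math import comb

def lucas(n: int, k: int, p: int) -> int:
    """C(n, k) mod p for prime p, via Lucas's theorem."""
    result = 1
    while n > 0 or k > 0:
        # Peel off the lowest base-p digit of each number.
        n_i, k_i = n % p, k % p
        result = result * comb(n_i, k_i) % p  # comb returns 0 when k_i > n_i
        n //= p
        k //= p
    return result

print(lucas(482, 176, 5))   # 1
print(comb(482, 176) % 5)   # 1 -- the brute-force check agrees
```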

We use the “binomial” characteristic of binomial coefficients as the basis of the proof.

$$\begin{align*}
\sum_{c=0}^{r} \binom{r}{c} x^c &= (1+x)^r\\
&= \prod_{m=0}^{k} (1+x)^{r_m p^m}
\end{align*}$$

The product is simply using the base-$p$ expansion of $r$.

To go much further we need to justify a binomial quirk. For prime $p$:

$$(1+x)^{p^m} \equiv 1 + x^{p^m} \pmod{p}$$

Let’s look at the binomial coefficients. The factorial definition gives that $\binom{p^m}{k} = \frac{p^m}{k} \binom{p^m-1}{k-1}$. Multiply $k$ over and the result is:

$$k \binom{p^m}{k} = p^m \binom{p^m-1}{k-1}$$

The right-hand side is divisible by $p$ at least $m$ times, and for $0 < k < p^m$, $k$ is divisible by $p$ at most $m-1$ times. Binomial coefficients are integers, so $p$ must divide $\binom{p^m}{k}$ for every $0 < k < p^m$. The boundary cases $k = 0$ and $k = p^m$ are each $1$ by definition.

Any coefficient divisible by $p$ is congruent to $0 \bmod p$, which leaves only the first and last terms in the binomial expansion: $1$ and $x^{p^m}$. The aforementioned congruence follows.
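This divisibility claim is easy to spot-check numerically. Here is a small verification for $p = 5$, $m = 2$ (arbitrary illustrative choices):

```python
from math import comb

p, m = 5, 2
# Every interior coefficient of (1+x)^{p^m} should vanish mod p,
# leaving only the boundary terms 1 and x^{p^m}.
interior = [comb(p**m, k) % p for k in range(1, p**m)]
print(all(c == 0 for c in interior))  # True
```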

Returning to the main equation:

$$\begin{align*}
\prod_{m=0}^{k} (1+x)^{r_m p^m} &= \prod_{m=0}^{k} \left((1+x)^{p^m}\right)^{r_m}\\
&\equiv \prod_{m=0}^{k} (1+x^{p^m})^{r_m} \pmod p
\end{align*}$$

Expand that inner binomial out into a sum:

$$\begin{align*}
&\prod_{m=0}^{k} \sum_{s=0}^{r_m} \binom{r_m}{s} (x^{p^m})^s\\
=&\prod_{m=0}^{k} \sum_{s=0}^{r_m} \binom{r_m}{s} x^{s p^m} \pmod p
\end{align*}$$

The distributive law allows us to move the inner sum outside, turning it into an iterated sum:

$$\begin{align*}
&\prod_{m=0}^{k} \sum_{s=0}^{r_m} \binom{r_m}{s} x^{s p^m}\\
=&\sum_{s_0=0}^{r_0} \sum_{s_1=0}^{r_1} \dotsm \sum_{s_k=0}^{r_k} \prod_{m=0}^{k} \binom{r_m}{s_m} x^{s_m p^m} \pmod p
\end{align*}$$

Consider the set of $(s_0, s_1, \dots, s_k)$. The $x$ term in the inner product becomes $x^{s_0 p^0} \cdot x^{s_1 p^1} \dotsm x^{s_k p^k}$, making the combined exponent $s_0 p^0 + s_1 p^1 + \dots + s_k p^k$.

In other words, the sets of $s$ are the base-$p$ representations of each term’s exponent.

  • When all $s_m = 0$, the exponent is $0$.
  • When all $s_m = r_m$, the exponent is $r$.

Now, recall our original summation variable $c$: $0 \leq c \leq r$. It would be nice to replace the iterated sum with a simple $\sum_{c=0}^{r}$, but not every number from $0$ to $r$ is necessarily represented in the iterated sum. Consider, for example, $c = p^e - 1$ where $e < k$. All of the $c_m = p-1$, the maximum valid coefficient value, but not all $r_m$ necessarily go that high.

Nonetheless, the replacement sum can still be justified if, for each of those extra cases where $c_m > r_m$, the resulting term is $0$. Consider the binomial coefficient at one of those cases: $\binom{r_m}{c_m} = 0$ when the inequality holds (after all, one cannot “choose” more items than exist in the set).

Let’s collapse that iterated sum using that justification:

$$\begin{align*}
&\sum_{s_0=0}^{r_0} \sum_{s_1=0}^{r_1} \dotsm \sum_{s_k=0}^{r_k} \prod_{m=0}^{k} \binom{r_m}{s_m} x^{s_m p^m}\\
=&\sum_{c=0}^{r} \prod_{m=0}^{k} \binom{r_m}{c_m} x^{c_m p^m}\\
=&\sum_{c=0}^{r} \Bigg[\prod_{m=0}^{k} \binom{r_m}{c_m} \cdot \prod_{m=0}^{k} x^{c_m p^m}\Bigg]\\
=&\sum_{c=0}^{r} \Bigg[\prod_{m=0}^{k} \binom{r_m}{c_m} \cdot x^c\Bigg] \pmod p
\end{align*}$$

We now have a structure that closely parallels our original expression:

$$\sum_{c=0}^{r} \binom{r}{c} x^c \equiv \sum_{c=0}^{r} \prod_{m=0}^{k} \binom{r_m}{c_m} x^c \pmod p$$

Compare each coefficient, and you arrive at the theorem statement.
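That coefficient-by-coefficient comparison can itself be spot-checked numerically. The sketch below (with illustrative values $p = 5$, $r = 37$, and helper names of my own) checks every coefficient $0 \leq c \leq r$ against the digit-product form:

```python
from math import comb, prod

def digits(n: int, p: int) -> list[int]:
    """Base-p digits of n, least significant first."""
    ds = []
    while n > 0:
        ds.append(n % p)
        n //= p
    return ds or [0]

p, r = 5, 37
r_digits = digits(r, p)
for c in range(r + 1):
    c_digits = digits(c, p)
    c_digits += [0] * (len(r_digits) - len(c_digits))  # pad with leading zeros
    rhs = prod(comb(rm, cm) for rm, cm in zip(r_digits, c_digits)) % p
    assert comb(r, c) % p == rhs
print("all coefficients agree")
```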

$\square$

Problems where Lucas's theorem can be directly applied are tricky to invent, and rarely practical. If you have a quantity of items expressible as a binomial coefficient, and wish to find the number left over after divvying the items into prime-numbered groups, Lucas can get you there easily. (Brilliant has an example of counting oranges stacked in a pyramid structure. That’s about as practical of an example as I can find.)

However, binomial coefficients in general tend to appear in many unexpected places throughout both number theory and applied math. Knowing a little trick exists when a binomial and prime modulus are involved can be the ticket you need to drastically reduce complexity.

Larry Riddle’s writeup of the theorem’s proof was instrumental in helping me understand it. Most of the steps above, down to the variable names, are taken from his summary: I simply fleshed out several steps that caught me off-guard when I was learning. The numeric example he works through is worth a read, in particular. I wouldn’t have understood the general proof without it!


Tags: math, LaTeX