Notes on the counting techniques (basic and advanced) I learned in the course Discrete Math.

Basic counting principles

(Product Rule, Addition Rule, Division Rule)

The Pigeonhole Principle

Theorem. If there are k > 0 boxes and k + 1 or more objects, there is at least one box containing more than one object.

Theorem. If N objects are placed into k boxes, then there is at least one box containing at least \(\left\lceil \frac Nk \right\rceil\) objects.

Some applications

Corollary. A function f from a set with k+1 or more elements to a set with k elements is not injective.

Theorem. Every sequence of \(n^2+1\) distinct real numbers contains a monotonic (increasing or decreasing) subsequence of length n+1.

Theorem. (Erdös-Szekeres Theorem). For given r, s they showed that any sequence of distinct real numbers with length at least (r − 1)(s − 1) + 1 contains a monotonically increasing subsequence of length r or a monotonically decreasing subsequence of length s.

▶ Proof

Assign to each element of the sequence \(\{u_i\}\) a pair of numbers \((a_i, b_i)\) where:

  •  \(a_i\) indicates the length of the longest increasing subsequence ending at \(u_i\)
  •  \(b_i\) indicates the length of the longest decreasing subsequence ending at \(u_i\).

For some \(i<j\), if \(u_i<u_j\) then \(a_i < a_j\) (since we can extend the increasing subsequence ending at \(u_i\) to ending at \(u_j\)). Similarly, if \(u_i > u_j\) then \(b_j < b_j\). Basically, there does not exist any two indicies \(i,j\) such that \((a_i,b_i)=(a_j,b_j)\). (*)

We will prove by contradiction that there exist some \(i\) such that \(a_i\geq r\) or \(b_i\geq s\). Suppose that \(a_i\leq r-1\) and \(b_i\leq s-1\) for all \(i\). There are at most \((r-1)(s-1)\) possible pairs, but there are \((r-1)(s-1)+1\) numbers in the sequence. Hence there exist some two indicies \(i < j\) such that \((a_i, b_i) = (a_j, b_j)\), contradicting to what we have shown at (*). Hence, the theorem.

Theorem. In a simple undirected graph, there are at least 2 vertices having the same degree.

Bionomial Coefficients and Identities

Theorem. Let x and y be variables and n be a non-negative integer, then:

\[(x+y)^n=\sum_{i=0}^n\binom ni x^{n-i}y^i = \sum_{i=0}^n\binom n{n-i} x^{n-i}y^i\]

If we substitute \(x=1, y=1\) in the \((x+y)^n\) expansion, we get the result that:

\[\sum_{i=0}^n \binom ni=2^n\]

If we substitute \(x=1, y=-1\) in the \((x+y)^n\) expansion, we get the result that:

\[\sum_{i=0}^n (-1)^n\binom ni=0\]

If we substitute \(x=1, y=k\) in the \((x+y)^n\) expansion, we get the result that:

\[\sum_{i=0}^n k^n\binom ni=(k+1)^n\]

Pascal’s identity

Theorem. (Pascal’s Identity). Let n and k be positive number with \(n\geq k\), then:

\[\binom {n+1}k = \binom n{k-1}+\binom nk\]

One can construct what is called a Pascal triangle:

n\k 0 1 2 3 4 5 6 7
0 1              
1 1 1            
2 1 2 1          
3 1 3 3 1        
4 1 4 6 4 1      
5 1 5 10 10 5 1    
6 1 6 15 20 15 6 1  
7 1 7 21 35 35 21 7 1

Although we can use factorial-formula to calculate the combination, Pascal identity is rather more useful if we need to work with integers exclusively (avoid division).

Vandermonde’s identity

Theorem. (Vandermonde’s Identity). Let \(m,n,r\) be non-negative integers and \(r\leq \mathrm{min}(m,n)\). Then:

\[\binom {m+n}r = \sum_{k=0}^r \binom m{r-k} \binom nk\]

This means the number of ways to choose r objects from the set one of m objects and the set two of n objects (all objects are distinct). We have two equivalent ways of counting:

  • combine two sets together and calculate the combination. (LHS)
  • choose k objects from the set two and then r-k objects from the set one, which can be done in \(\binom m{r-k} \binom nk\) ways. Moreover, k ranges from 0 to r, hence the sum. (RHS)

In a special case where \(m=r=n\), we have a nicer result:

\[\binom {2n}n=\sum_{k=0}^n \binom nk ^2\]

This is called a central binomial coefficient. A closely related term is a Catalan number, which is

\(\binom {2n}n\frac{1}{n+1}=\binom {2n}n-\binom {2n}{n+1}\)u

Another identity

Theorem. Let \(k\leq n\) be two positive integers. Then:

\[\binom nk = \sum_{j=k-1}^{n-1}\binom j{k-1}\]


n\k 0 1 2 3 4 5 6 7
0 1              
1 1 1            
2 1 2 1          
3 1 3 3 1        
4 1 4 6 4 1      
5 1 5 10 10 5 1    
6 1 6 15 20 15 6 1  
7 1 7 21 35 35 21 7 1

Generalized Permutations and Combinations

Theorem. The number of \(k\)-permutations with repititon from the set of \(n\) objects is \(n^k\).

▶ Interpretation

Subsequently choose \(k\) objects from the set of \(n\) types of objects. There are \(n\) choices at each step. Using the product rule, the total number of ways to choose \(k\) objects is \(nn\dots n=n^k\).

Theorem. The number of \(k\)-combinations with repitition from the set of \(n\) objects is \(\binom {n+k-1} k = \binom {n+k-1}{n-1}\).

▶ Interpretation

Choose \(k\) objects from the set of \(n\) different types of objects but the order of choosing is not important. Basically we care about how many objects of each type we choose. It is equivalent to choosing the number of non-negative integer solutions to the equation

\[x_1+x_2+\dots+x_k = n\]

whereas \(x_i\) is the number of objects of type \(i\) that we should choose.

A more visual way to look at this problem is putting \(k-1\) walls among \(n\) objects to create \(k\) groups of objects. The \(i\)-th group contains \(x_i\) objects.

For example, \(n=5, k=3\):

  • The arrangement \(*\vert **\vert **\) tells us that we should choose 1 object of type 1, 2 objects of type 2, and 2 objects of type 3.
  • The arrangement \(***\vert **\vert\vert\) tells us that we should choose 3 objects of type 1, 2 objects of type 2, and 0 object of type 3.
  •  \(\vdots\)

There are \(n+k-1\) positions in an arrangement, and we need to choose \(k-1\) positions among them to be the walls and the rest will be objects. Hence there are \(\binom {n+k-1}{k-1}\) ways.

Permutations with identical objects

Theorem. The number of different permutations of \(n\) objects, where there are \(n_i\) objects of type \(i\), with \(i=1\dots k\) (\(k\) is the total number of types) is:

\[\frac{n!}{n_1!n_2!\dots n_k!}\]

Of course, \(n_1+n_2+\dots+n_k=n\).

▶ Interpretation

Suppose that we have a string of length \(n\) which contains \(k\) different letters. The frequency of letter \(i\) is denoted by \(n_i\). We are interested in the number of different anagrams. Assume that we “over-generated” \(n!\) anagrams, some of which may be duplicate and need to be removed. Note that, for each string, if we permute the positions of identical characters, we still get the same string. The number of ways to permute a string and still get the same string is \(n_1!n_2!\dots n_k!\). Using the division rule, we get the theorem above.

Distribute Objects into Boxes

Before we start, I want to introduce a few new terms.

A \(k\)-partition of a set A is a collection of \(k\) non-empty subsets of A. The number of \(k\)-partition of a set of \(n\) elements is denoted by \(S(n,k)\). This value is called a Stirling number of the second kind. (The forumula will be discussed later.)

A \(k\)-partition of a positive integer \(n\) is an integer solution to the equation \(x_1+x_2+\dots+x_k=n\), where \(x_1\geq x_2\geq\dots\geq x_k\geq 1\). The number of solutions is denoted by \(P(n,k)\). (The closed formula is, however, not yet discovered.)

In the four following cases, the number of objects is \(k\) and the number of boxes is \(n\).

Different objects and Different boxes

The boxes are numbered from 1 to n. The objects are numbered from 1 to k. Each object \(i\) is assigned with a number \(b_i\) from 1 to n, denoting the box into which this object will be put. An arrangement is the sequence \((1,b_1), (2,b_2),\dots, (k, b_k)\). Count the number of arrangements.

Condition #
None \(n^k\)
Each box has \(\leq\) 1 object
(only if \(k\leq n\))
\(\frac{n!}{(n-k)!}\)
Each box has \(\geq\) 1 object
(only if \(k\geq n\))
\(n!\times S(k,n)\)
Box \(i\) contains \(n_i\) objects \(\frac{k!}{k_1!k_2!\dots k_n!}\)

Identical objects and Different boxes

The boxes are numbered from 1 to n. Each box \(i\) is assigned with a number \(c_i\leq k\), denoting the number of objects put into the box. An arrangement is the sequence \(c_1,c_2,\dots,c_n\) such that \(c_1+c_2+\dots+c_n=k\).

Condition #
None \(\binom{k+n-1}{n-1}\)
Each box has \(\leq\) 1 object
(only if \(k\leq n\))
\(\binom nk\)
Each box has \(\geq\) 1 object
(only if \(k\geq n\))
\(\binom{k-1}{n-1}\)

Different objects and Identical boxes

The objects are numbered from 1 to k. An arrangement is a \(t\)-partition of the set \(\{1,2,\dots,k\}\) (\(t\) depends on the condition).

Condition #
None \(\sum_{t=1}^n S(k,t)\)
Each box has \(\leq\) 1 object
(only if \(k\leq n\))
\(1\)
Each box has \(\geq\) 1 object
(only if \(k\geq n\))
\(S(k,n)\)

Identical objects and Identical boxes

An arrangement is a \(t\)-partition of \(k\) (\(t\) depends on the condition).

Condition #
None \(\sum_{t=1}^n P(k,t)\)
Each box has \(\leq\) 1 object
(only if \(k\leq n\))
\(1\)
Each box has \(\geq\) 1 object
(only if \(k\geq n\))
\(P(k,n)\)



Ordinary Generating Function of a Sequence

A sequence of numbers \(a_0, a_1, a_2, \dots\) can be represented as a formal power series \(a_0 + a_1x + a_2x^2 + a_3x^3 + \dots\)

This series is called a generating function. The sequence can have infinitely or finitely many terms. A formal power series is very much like a polynomial, except that a series can have infinitely many terms, while a polynomial cannot; and a formal series means that the convergence of this series can be completely ignored. We can look at \(x\) as a placeholder / a bookkeeping device, we generally do not care about what value it may be, we are only interested in the coefficients.

Let’s begin with a simple sequence, \(\{a_n=1\}_{n=0}^{\infty}\). What is the generating function for this sequence? Let’s first call it \(A(x)\). Straightforward from the definition, we have:

\[A(x) = 1+x+x^2+x^3+x^4+\dots\]

However, we can try to find a “closed form” of A(x). Notice that:

\[xA(x) = x+x^2+x^3+x^4+\dots=A(x)-1\]

which implies that \(A(x) = \frac{1}{1-x}\). However, shortening a generating function like this is only valid when \(A(x)\) converges to some value, in this case, when \(\vert x\vert < 1\). As I said above, we generally do not care about the convergence, but in some occasions it is important to consider at which values of \(x\) that the series converges.

Sometimes, we are given a generating function and we need to find the terms of the sequence. For example, we are given \(A(x) = \frac{1}{1-x}\). In this case, we can apply the Taylor’s series expansion of a function. Recall that a Taylor’s series expansion of a function \(f(x)\) around the neighborhood of \(x=a\) is:

\[T(x)=f(a) + \frac{f'(a)}{1!}(x-a) + \frac{f''(a)}{2!}(x-a)^2 + \dots + \frac{f^{(n)}(a)}{n!}(x-a)^n + \dots\]

It is somewhat a notion / a convention to use \(a=0\) (a Taylor’s series at \(x=0\) is also known as a Maclaurin’s series), because it is easier to evaluate the coefficients.

Some calculating rules

There are some basic techniques to work with generating functions. Let \(f(x)\) and \(g(x)\) be the generating function for \(\{u_n\}_{n=0}^\infty\) and \(\{v_n\}_{n=0}^\infty\), respectively. We have the following rules:

  • Addition rule: \(f(x)+g(x) \longleftrightarrow \{u_n+v_n\}_{n=0}^\infty\)
  • Scaling rule: \(kf(x) \longleftrightarrow \{ku_n\}_{n=0}^\infty\)
  • Right-shifting rule: \(x^kf(x) \longleftrightarrow (0^1, 0^2, \dots, 0^k, u_0, u_1, u_2, \dots)\)
  • Derivative rule: \(f'(x) \longleftrightarrow (u_1, 2u_2, 3u_3, \dots)\)

Convolution rule

This deserves its own section. Let \(\{w_n\}_{n=0}^\infty\) be a sequence where \(w_n=\sum_{i+j=n}u_iv_j\). The generating function for \(\{w_n\}\) will be \(h(x)=f(x)g(x)\).

Other interpretation of the convolution rule is: If \(f(x)\) is the generating function for selecting items from set A and \(g(x)\) is the generating function for selecting items from set B. If A and B are disjoint, then the generating function for selecting items from \(A\cup B\) is \(f(x)g(x)\).

Combination with repitition

One classic example is to find the generating function of the combination with repitition, i.e. select a \(n\) objects from \(r\) types of objects. Let \(\{u_n\}_{n=0}^\infty\) be a sequence where \(u_n=\binom {n+r-1}{r-1}\). What is the generating function for \(\{u_n\}\)? We approach this problem as follows:

How many ways to select \(n\) objects from only type \(i\)? Obviously: one, i.e. choose \(n\) objects of type \(i\). The generating function for this selection is \(1+x+x^2+\dots=\frac{1}{1-x}\).

Since the sets of types are disjoint (there are \(r\) sets in total), we can use the convolution rule to answer the question: “How many ways to select \(n\) objects from types \(\{1, 2, \dots, r\}\)?”. The answer is:

\[\frac{1}{(1-x)^r}=\sum_{n=0}^\infty\left(\sum_{i_1+i_2+\dots+i_r=n}1\right) x^n\]

which essentially counts the number of non-negative solutions of the equation \(i_1+i_2+\dots+i_r=n\), precisely what we are looking for.

Generating Function Sequence
\(\frac{1}{1-x}=1+x+x^2+\dots\) \(u_n=1,\quad\forall n\)
\(\frac{1}{1-ax}=1+ax+a^2x^2+\dots\) \(u_n=a^n,\quad\forall n\)
\(\frac{1}{1-x^r}=1+x^r+x^{2r}+\dots\) \(u_n=1,\quad n\ \vdots\ r\)
\(u_n=0\) otherwise
\(\frac{1}{(1-x)^2}=1+2x+3x^2+\dots\) \(u_n=n+1,\quad \forall n\)
\(\frac{1}{(1-x)^r}=1+\binom r1 x+\binom {r+1}2 x^2+\dots\) \(u_n=\binom{r+n-1}{n},\quad \forall n\)
or \(u_n=\binom{r+n-1}{r-1},\quad \forall n\)
\(e^x=1+x+\frac{x^2}{2!}+\dots\) \(u_n=\frac{1}{n!},\quad \forall n\)

Exchanging money

How many ways to draw \(n\) dollars using the $1, $2, $5 tokens?

The order of the tokens are not important

Each draw is a natural solution \((x_1, x_2, x_3)\) to the equation \(x_1 + 2x_2 + 5x_3 = n\).

It is close to the combination with repetition. We choose \(n\) objects from three types of objects, but the condition is that:

  • The number of the first-type objects is arbitrary.
  • The number of the second-type objects is divisible by 2.
  • The number of the third-type objects is divisible by 5.

Using the convolution rule, the generating function for drawing \(n\) dollars using the $1, $2, $5 tokens are:

\[(1+x+x^2+\dots)(1+x^2+x^4+\dots)(1+x^5+x^{10}+\dots)\\ =\frac{1}{1-x}\cdot\frac{1}{1-x^2}\cdot\frac{1}{1-x^5}\]

The order of the tokens are important

We add an extra parameter to the problem. Let’s denote \(k\) the number of tokens used. Each draw is a natural solution \((x_1,x_2,\dots,x_k)\) to the equation \(x_1+x_2+\dots+x_k=n\) where each \(x_i\) is either 1, 2, or 5.

We use the convolution rule \(k\) times. Observe that the generating function for drawing \(n\) dollars using only one token is \((x+x^2+x^5)\), it means you can only draw when \(n\) is equal to 1, 2, or 5, and in such cases there is only one way. If you can choose \(k\) tokens, the generating function is \((x+x^2+x^5)^k\), i.e. the number of ways to draw \(n\) dollars is the coefficient of \(x^n\).

However, the original problem permits any amount of tokens, so the number of ways to draw \(n\) dollars is the coefficient of \(x^n\) in:

\[\sum_{k=0}^\infty (x+x^2+x^5)^k=\frac{1}{1-(x+x^2+x^5)}\]

Ultimately, the coefficient of \(x^n\) in this case is also \((\sum_{i_1+2i_2+5i_3=n} 1)\), very similar to the result of the case where the order is ignored, but the constructions of the generating functions are very much different and hence the results are also different.

Partition of a number

Recall: A \(k\)-partition of a positive integer \(n\) is an integer solution to the equation \(x_1+x_2+\dots+x_k=n\), where \(x_1\geq x_2\geq\dots\geq x_k\geq 1\). The number of solutions is denoted by \(P(n,k)\).

What is the generating function for the sequence \(\{u_n\}_{n=0}^\infty\) and \(u_n=\sum_{k=1}^nP(n,k)\)? (Assume that \(u_0=1\)) In other words, find the generating function for the number of partitions of \(n\).

The only thing that matters in each partition is how many number 1 are used, how many number 2 are used, etc. Each partition is a natural solution to the equation \(y_1 + 2y_2 + 3y_3 + \dots + ny_n= n\). It’s very similar to the exchanging money problem, which would yield:

\[\prod_{i=1}^n \frac{1}{1-x^i}\]

Since we are allowed to use infinitely many types of tokens, i.e. \(\{1,2,3,\dots\}=\mathbb{N}^*\), the number of ways to partition \(n\) into positive integers is the coefficient of \(x^n\) in:

\[\prod_{i=1}^\infty \frac{1}{1-x^i}\]

Solving Recurrence Relations

Using Generating Function

Generating function is a tool to find the closed formula of a sequence whose recursive formula is known. We take the Fibonacci sequence \(\{u_n\}_{n=0}^\infty\) as an example:

\[u_0 = 0\\ u_1 = 1\\ u_n=u_{n-1}+u_{n-2}\ \ (n\geq 2)\]

We first need to find the generating function, then the Maclaurin’s series expansion, and finally the closed formula of the coefficients.

Denote the generating function by \(f(x)\).

\[\begin{aligned} f(x) &= u_0 + u_1x + u_2x^2 + \dots = \sum_{n=0}^\infty u_nx^n \\ &= u_0 + u_1x + (u_1+u_0)x^2 + \dots\\ &= u_0 + u_1x + \sum_{n=2}^\infty (u_{n-1}+u_{n-2})x^n\\ &= u_0 + u_1x + x\sum_{n=1}^\infty u_nx^n + x^2\sum_{n=0}^\infty u_nx^n\\ &= 0+x+x(f(x) - u_0) + x^2f(x)\\ &= x+(x+x^2)f(x)\\ \end{aligned}\]

Hence:

\[f(x)=\frac{x}{1-x-x^2}\]

We then decompose \(f(x)\) into partial fractions:

\[f(x)=\frac{\alpha_1}{1-\beta_1 x} + \frac{\alpha_2}{1-\beta_2 x}\]

We can easily find the roots of the denominator \(1-x-x^2=0\), which are \(\frac{-1+\sqrt 5}{2}\) and \(\frac{-1-\sqrt 5}{2}\). \(\beta_1\) and \(\beta_2\) are the inverses of the roots, hence:

\[\beta_1 = \frac{1+\sqrt 5}2,\quad \beta_2=\frac{1-\sqrt 5}2\]

Further more, \(\alpha_1(1-\beta_2x)+\alpha_2(1-\beta_1x) = 1\) the numerator. We solve the following system:

\[\begin{cases} \alpha_1+\alpha_2 = 0\\ \alpha_1\beta_2 + \alpha_2\beta_1 = 1 \end{cases}\]

We have \(\alpha_1\beta_2 -\alpha_1\beta_1 = 1\) \(\Leftrightarrow\) \(\alpha_1=\frac{1}{\beta_2-\beta_1} = \frac{-1}{\sqrt 5}\) and \(\alpha_2=\frac{1}{\sqrt 5}\).

The closed formula of \(u_n\) will be therefore:

\[u_n=\alpha_1\beta_1^n + \alpha_2\beta_2^n\]


\[u_n=\alpha_2\beta_2^n - \alpha_2\beta_1^n = \frac{1}{\sqrt 5}\left(\left(\frac{1-\sqrt 5}{2}\right)^n - \left(\frac{1+\sqrt 5}{2}\right)^n\right)\]

Using Characteristic Equation

Homogeneous

Suppose we have a homogeneous recurrence relation:

\[a_n=c_1a_{n-1}+c_2a_{n-2}+\dots+c_ka_{n-k}\]

The characteristic equation is defined to be:

\[x^k-c_1x^{k-1}-c_2x^{k-2}-\dots-c_k=0\]

If the equation has \(k\) distinct roots \(r_1, r_2,\dots, r_k\), then the closed form of \(a_n\) is:

\[a_n = \alpha_1r_1^n + \alpha_2r_2^n + \dots + \alpha_kr_k^n\]

where \(\alpha_1,\alpha_2, \dots,\alpha_k\) are constants. They can be found by substituting some known values of \(a_n\) and solving the system of equation.

If the equation has \(t < k\) distinct roots \(r_1, r_2, \dots, r_t\) with multiplicity \(m_1, m_2,\dots,m_t\), respectively, then the closed form of \(a_n\) is:

\[a_n=P_1(n)r_1^n + P_2(n)r_2^n + \dots + P_k(n)r_k^n\]

where \(P_i(n)\) denotes a polynomial of \(n\) with constant coefficients and its degree is less than \(m_i\), i.e.

\[P_i(n) = \alpha_{i,0} + \alpha_{i,1}n + \alpha_{i,2}n^2 + \dots + \alpha_{i,m_i-1}n^{m_i-1}\]

Again, the coefficients can be found by substituting some known values of \(a_n\) and solving the system of equation.

Non-homogeneous

Suppose we have a non-homogeneous recurrence relation:

\[a_n=c_1a_{n-1}+c_2a_{n-2}+\dots+c_ka_{n-k} + F(n)\]

where

\[F(n)=b_1^nf_1(n) + b_2^nf_2(n)+\dots+b_q^nf_q(n)\]

Suppose that \(f_i(n)\) is a polynomial of \(n\) of degree \(d_i\) and \(b_i\) is a constant (for \(1\leq i\leq q\)).

The characteristic equation is defined to be:

\[(x^k-c_1x^{k-1}-c_2x^{k-2}-\dots-c_k)\prod_{i=1}^q(x-b_i)^{d_i+1}=0\]

We solve this equation to obtain the roots. Depends on whether the roots are distinct or not, we apply the same techniques as above to find the closed formula.

The Characteristic Equation method can be shown to be derived from the Generating Function method, which I believe more intuitive.