General Idea

A general two-level circuit can be seen as either a $CNF$ or a $DNF$. If the fan-in is restricted, then the circuit becomes a $k-CNF$ or $k-DNF$. We can describe how a circuit simplifies under a random restriction. This would allow us to reduce the depth of a circuit.

Formalization and Proof

Lemma 1(Håstad’s Switching Lemma) Suppose $f$ is an $n$ input function expressible as a $k-DNF$, after a random restriction $\rho$ of $t$ randomly selected inputs, for every $s \ge 2$: \begin{equation} \mathop{\mathrm{P}}\limits_{\rho} (f|_\rho\text{ is not expressible as an } s-CNF) \le \left(\frac{(n-k)k^{10}}n\right)^{\frac{s}2} \end{equation} in which $f|_\rho$ is $f$ under the restriction $\rho$.

Proof. To prove this lemma, we need to prove that the number of “bad restrictions” that restrict $f$ to functions not expressible as an $s-CNF$ is relatively small. The way we do this is to find a one-to-one mapping from such a restriction $\rho$ to a member in $R_{n-t-s} \times [2]^c$, in which $c = O(s \log k)$ and $R_m$ is the set of restrictions on $m$ inputs. We could see this as mapping $\rho$ to $(\rho\pi, \text{ some auxiliary information})$, where the auxiliary information is only necessary to compute the invert of the mapping.

Before we construct such a mapping, we need to point out an important fact. Since $f|_\rho$ is inexpressible as a $s-CNF$, there must be a set of variables $\{x_i\}$ such that by a proper restriction on these variables we could make $f|_\rho$ a zero function with the size of the set $>s$ (or we could find all such sets and $\wedge$ the clauses derived from each such set). Additionally, there isn’t a subset of this set that has the same property (exists a restriction that turns the original function into zero function). We denote by $\pi$ the corresponding restriction on this set.

Next, we try to construct such a mapping. Suppose $C_1 \dots C_{m}$ are the clauses (ordered) in the $k-DNF$ that is equal to $f$. Since $f|_\rho \neq \mathbf{1}$, the clauses are either fixed to 0 or undecided. Thus, from the left to the right, there is a clause $C_{i_1}$ that is the first to not be set to 0. Let $\pi_1$ be a restriction that only restricts the free variables $\pi$ restricts in this clause. Since $\pi$ makes the function zero, $\pi_1$ must also set $C_{i_1}$ to zero. We iterate this process to get $\pi_2\dots$ until the total number of variables restricted is equal to $s$. The last $\pi_p$ might be trimmed arbitrarily. Let $\sigma_i = \neg \pi_i$, we define the restriction in the mapping of $\rho$ as $\rho\sigma_1\dots\sigma_p$.

Next, we show how to revert this process. Define $s_i$ as the number of variables restricted by each $\sigma_i$. We find from the left to the right the first clause that is fixed to 1. By the process we get the $\sigma_i$’s, all the variable $\sigma_1$ restricts is in this clause. We then use $s_1 \log n$ bits to record all the variables restricted by $\sigma_1$. We could then compute $\rho\pi_1\sigma_2\dots\sigma_p$ and repeat this process to find all the $\sigma_i$. For $\sigma_p$, the first clause might be undecided, but this would not tamper with the inversion. Thus, we need at most $\sum_{i}s_i \log n = s\log n$ auxiliary bits.

With such a one-to-one mapping, we know that there are at most $\binom{n}{t+s}2^{t+s}\cdot 2^{s\log n} = \binom{n}{t+s}2^{t+s}n^s$ such “bad restrictions”. Thus, the probability that $\rho$ is a bad restriction is at most

\begin{equation} \frac{\binom{n}{t+s}2^{t+s}n^s}{\binom{n}{t}2^t} \le \left(\frac{(n-k)k^{10}}n\right)^{\frac{s}2} \end{equation}

The last inequality needs some estimate techniques. $\tag*{$\blacksquare$}$

Remarks

This lemma also stands if $CNF$ and $DNF$ switch place.
This lemma does not concern the size of $f$.
In reality $t$ is often close to $n$, say, $n-\sqrt{n}$.

Example Applications

One of the most well-known applications of this lemma is that

Theorem 1 (Parity is not in AC0) $\oplus \notin \mathrm{\mathbf{AC}}^0$, in which $\oplus (x_1,\dots x_n) = \sum_{i} x_i \text{ (mod 2)}$.

Proof. Suppose we have a circuit in $\mathrm{\mathbf{AC}}^0$ that could compute $\oplus$. Without loss of generality, suppose the circuit satisfies some conditions:

All fan-outs are 1
There is no $\neg$ gates in the circuit
The $\wedge$ and $\vee$ gates alternate between each layer
The bottom level has $\wedge$ gates of fan-in 1

We show that we could use the switching lemma to gradually reduce the depth of the circuit to 2. Suppose the size of the circuit has upper bound $n^b$. Consider the two bottom layers of the circuit, $\vee$ and $\wedge$. Each $\vee$ computes a $CNF$. For the $i^{th}$ step, let $t_i = n_i - \sqrt{n_i}$ and $k_i = 10b2^i$. Thus, after the restriction, with probability at least $1 - \frac1{10n^b}$, we could express the new function with a $k_{i+1}-CNF$. We could then merge the top layer of the new $CNF$ with the third layer in the circuit, which is also a $\wedge$ layer. We could then repeatedly use the switching lemma. With high probability (at least $1-\frac{dep(C)}{10n^b}$ by union bound), we could get a two layer circuit that has a small width $k$. Thus, we could fix the $k$ free variables to make the function constant. However, $\oplus$ could not be made constant by fixing $k < n$ variables, which is a contradiction.

Thus, $\oplus \notin \mathrm{\mathbf{AC}}^0$ $\tag*{$\blacksquare$}$