# Finite Playground

## New improvement to fast matrix multiplication

Again the exponent for the fastest matrix multiplication algorithm is improved.  This time the magical number is 2.3728639, 0.0000630 better than Williams’ 2.3729269.  (Last time it goes from the famous CW bound 2.3754770 to Stothers’ 2.3736898 then to Williams’, a 0.0025501 improvement.)

Powers of Tensors and Fast Matrix Multiplication, François Le Gall, 2014.

## Compilation of basic results related to regular language

* Results related to Pumping Lemma

Theorem.  A language $L$ accepted by a DFA $A$ of $n$ states is

1. nonempty if and only if $A$ accepts a sentence of length less lean $n$.
2. infinite if and only if $A$ accepts a sentence of length $\ell$ with $n \le \ell < 2n$.

The union of the following languages, $L_1 \cup L_2$, is non-regular, but nevertheless it cannot be proven using pumping lemma.

$L_1 = \{u_1 v_1 w v_2 u_2 : u_1,u_2,v_1,v_2,w \in [4]^*,$ and either $v_1=w$, $v_1=w$, or $v_2=v_2\}$

$L_2 = \{w : w\in [4]^*,$ and precisely $1/7$ of symbols in $w$ are 3′s $\}$

* Regular languages are precisely constant-space languages

Theorem.  A regular language can be recognized by a two-way read-only Turing machine.

Proof sketch.  Record the configuration of each time we first visit the $i$-th cell.  Define a finite automaton to simulate the same behavior.

Theorem.  $\textsc{Space}(o(\log\log n))$ is the same as $\textsc{Space}(1)$, both are the class of regular languages.  That is, having an algorithm that uses $o(\log\log n)$-space is equal to not using any space.

Proof sketch.  If a Turing machine uses $s(n)$ space, there are $2^{O(s(n))}$ possible configurations and $2^{2^{O(s(n))}}$ possible crossing sequences.  If $s(n) = o(\log\log n)$ then there are two crossing sequences that are the same, and we can further shorten a shortest string accepted by the machine by the following lemma:

Lemma.  Let $M$ be a machine and $x$ be the input.  Suppose two crossing sequence $C_i$ and $C_j$ are equal; then by removing the substring of $x$ from index $i+1$ to $j-i$ (assume $i < j$), we get another string that is accepted by the machine $M$.

This is a contradiction.

The following language is non-regular but can be decided in $O(\log\log n)$ space:

$L = \{\#(0)_2\#(1)_2\# \ldots \#(2^k-1)_2\# : k \ge 0\}$,

where $(i)_2$ is the binary representation of $i$.

* Generating function of regular language

For a given language $L \subseteq \Sigma^*$,

$S_L(z) = \sum_{n\ge 0} |L \cap \Sigma^n| z^n$

is the generating function of $L$.  Using analytic combinatorics we can prove the following useful facts.

Theorem.  For a regular language $L$, the generating function $S_L(z)$ is rational.

Theorem.  Let $A=(Q,\Sigma,\delta,q_0,F)$ be a finite automaton of a language $L$.  Then the generating function of $L$ is the following rational function determined under matrix form

$S_L(z) = [q_0]^T(I-z[\delta])^{-1}[F]$,

where $[\delta](i,j) = | \{$ number of different labels between $q_i$ and $q_j \}|$; $[q_0]$ and $[F]$ as characteristic vectors.

* Minimizing finite automata

Brzozowski’s algorithm use only power set construction $\textsc{Power}$ and edge reversal $\textsc{Rev}$.  One can observe that reversing edges of a DFA gives an NFA of the reverse language.  Then the power set construction gives a minimum DFA.  This algorithm takes exponential time in the worst case.

Hopcroft’s algorithm is the fastest algorithm known; runs in $O(n \log n)$ time.  Try to partition the states using the Myhill-Nerode equivalence relation.

* Relation with $AC^0$

Theorem.  $\mathsf{REG}$ and $\mathsf{AC}^0$ are incomparable.

Proof.  Parity is not in $\mathsf{AC}^0$.  Palindrome/addition/$\{ww : w \in \Sigma^*\}$ are not regular.

## Finite automata, monoids, and Myhill-Nerode theorem

Kleene-Buchi Theorem.  For a language L, the following are equivalent:
(1) L is recognized by a finite automaton,
(2) L is defined by a regular expression,
(3) L is definable in Monadic second order logic,
(4) L is recognized by a finite monoid.

The syntactic monoid $M(L) = \Sigma^*/\cong_L$ is defined by the equivalence $\cong_L$ with

$u \cong_L v$ if and only if $xuy \in L \iff xvy \in L$ for every $x,y \in \Sigma^*$.

Myhill Theorem.  A language is regular if and only if the syntactic monoid is finite.

If we define another equivalence relation $\sim_L$ with

$u \sim_L v$ if and only if $uy \in L \iff vy \in L$ for every $y \in \Sigma^*$,

$\Sigma^*/\sim_L$ may not be a monoid in general, but we have the following:

Myhill-Nerode Theorem.  A language $L$ can be recognized by a finite automaton with $k$ states if and only if the relation $\sim_L$ has index $k$; that is, $\Sigma^*/\sim_L$ has $k$ classes.

## Practice on maximum coverage problem

Recently I just realized that the analysis of approximation algorithms of covering typed problems are unintuitive to me.  I don’t know whether it’s because the analysis-based nature which always gets me, or there’s just that I’m not familiar with the proof enough.  In any case, this is my attempt on decoding the proofs.  Let me take maximum coverage problem as an example, which given a set system with a number $k$, we ask what is the largest amount of elements we can cover using $k$ sets.

Theorem.  Greedy covering gives a $(1-1/e)$-approximation to the maximum coverage problem.

Proof.  Let $\textit{new}(i)$ be the number of elements newly covered at step i.  Let $\textit{covered}(i)$ be the number of elements covered from step $1$ to step $i$; we have

$\textit{covered}(i) = \textit{new}(1) + \dots + \textit{new}(i)$

Let $\textit{gap}(i)$ be the number of elements we still need to cover in order to cover opt elements; therefore $\textit{gap}(i) = \textit{opt} - \textit{covered}(i)$.

Key observation.  At step $i$, there is always a set that covers at least $k$-th fraction of the uncovered elements in $\textit{OPT}$.
Proof.  This is exactly the part I feel uncomfortable with; I find the following formulation helps my intuition:
“No matter the removal of any subset of elements in $\textit{OPT}$, the original $k$ sets in the optimal cover still covers $\textit{OPT}$; and because $k$ sets are enough to cover a partial of $\textit{OPT}$, there is one set among the $k$ sets that covers at least $k$-th fraction of the partial $\textit{OPT}$.”

Therefore $\textit{new}(i) \ge \textit{gap}(i-1)/k$; from this point on things becomes easy.

$\textit{gap}(i) \le \textit{gap}(i-1) - \textit{gap}(i-1)/k = {k-1 \over k} \textit{gap}(i-1)$,

so each step will shrink the gap by $(k-1)/k$.  After $k$ steps the gap has size at most $(1-1/k)^k \le 1/e$-th fraction of $\textit{opt}$, which proves the statement.

## Dilworth Theorem and equivalence to Konig-Egervary Theorem

Dilworth’s Theorem. If $P$ is a finite poset, then the maximum size of an antichain in $P$ equals the minimum number of chains needed to cover the elements of $P$.

Proof. Induction on $|P|$. We separate into two cases.

(a) Some maximum antichain $A$ omits both a maximal element and a minimal element of $P$. Then neither $D[A]$ or $U[A]$ contains all of $P$; the maximality of $A$ implies $P = D[A] \cup U[A]$. Both $D[A]$ and $U[A]$ has width $w(P)$ because they contain $A$, and they only intersect in $A$. Apply induction hypothesis on both sets, we obtain $k$ chains covering $D[A]$ and $k$ chains covering $U[A]$; combining these chains form $k$ chains covering $P$.

(b) Every maximum antichain of $P$ consists of all maximal elements or all minimal elements. Thus $w(P-\{x,y\}) \le k-1$ if $x$ is a minimal element and $y$ is a maximal element below $x$. Apply induction hypothesis on $P-\{x,y\}$ gives us $k-1$ chains; add chain $\{x,y\}$ to get a covering of $P$.

Theorem.  Dilworth’s Theorem is equivalent to Konig-Egervary Theorem.

Proof.  Dilworth to Konig-Egervary: View an $n$-node bipartite graph $G$ as a bipartite poset. The nodes of one part are maximal elements, and nodes of the other part are minimal. Every covering of the poset of size $n-k$ uses $k$ chains of size 2, which is actually a matching. Each antichain of size $n-k$ corresponds to an independent set in $G$, and the rest of $k$ nodes forms a vertex cover.

Konig-Egervary to Dilworth: Consider an arbitrary poset $P$ of size $n$. Define a bipartite graph $G$ as follow: For each element $x \in P$, create two nodes $x^-$ and $x^+$. The two parts of the graph are $\{x^- : x \in P\}$ and $\{x^+ : x \in P\}$. The edge set is $\{x^- y^+ : x <_P y\}$.

Every matching in $G$ yields a chain-covering in $P$ as follow: We start we $n$ chain, each contains a unique element. If $x^- y^+$ is in the matching then $x$ and $y$ are combined in the same chain. Therefore because each node in $G$ appears in at most on edge of the matching, a matching of $k$ edges forms a covering of $n-k$ chains.

Given a vertex cover $R$ in $G$ of size $k$, define a corresponding antichain $A = \{x \in P : x^-, x^+ \notin R\}$; this is indeed an antichain because if there is a relation $x <_P y$ in $A$ then edge $x^- y^+$ will be presented, and vertex cover $R$ needs to contain at least one of $x^-, y^+$.

Claim. No minimal vertex cover of $G$ uses both $\{x^-,x^+\}$, because by transitivity the sets $\{z^- : z \in D(x)\}$ and $\{y^+ : y \in U(x)\}$ induce a complete bipartite subgraph in $G$. A vertex cover of a complete bipartite graph must use all of one part. Since $x^-,x^+$ have all the neighbors in these two sets, we can remove one of $x^-,x^+$ that is not in the right part and decrease the size of the vertex cover.

Thus a minimum vertex cover of size $k$ yields an antichain of size $n-k$.

## Graph coloring theorems revisited II – Vizing’s Theorem

Alexandr Kostochka found the following new proof to the classical theorem.

Vizing’s Theorem.  For every simple graph $G$, $\chi'(G) \le \Delta(G) + 1$.

Proof.  We prove by contradiction; assume that $G$ is a minimal counter-example (in terms of number of edges) to the statement; that is, $G$ has no edge-($\Delta+1$)-coloring.

Let $\phi$ be an edge-($\Delta+1$)-coloring of $G-xy$. For every $v \in V(G)$, let $O(v)=O_\phi(v)$ denote the set of colors in $[\Delta+1]$ not used to color the edges incident to $v$; one can see that $O(x) \cap O(y) = \emptyset$.

Construct the auxiliary multidigraph $H$ as follows: $V(H) = N_G(y)$ and there is a directed edge from $u$ to $v$ in $E(H)$ if $\phi(vy) \in O(u)$. Intuitively an edge from $u$ to $v$ means that if we can recolor $vy$, then we can recolor $uy$ as well. Let $X$ be the set of vertices reachable in $H$ from $x$ ($x \in X$ as well); any recoloring of an edge $wy$ for $w\in X$ will imply that we can color $xy$. Since the out-neighbors of a vertex reachable from $x$ also are reachable from $x$, $N^+_H(v) \subseteq X$ for every $v \in X$.  Therefore

$\sum_{v \in X} d^+_{H[X]}(v) = |{E(H[X])}| = \sum_{v \in X} d^-_{H[X]}(v) \text,$

and the theorem will be proven if we can show that

1. $d^+_{H[X]}(v) \ge \Delta - d(v) + 1 + [v = x]$ and
2. $d^-_{H[X]}(v) \le 1-[v=x]$ for all $v \in X$;

this implies

$0 = \sum_{v \in X} \{d^+_{H[X]}(v) - d^-_{H[X]}(v)\} \ge 2 + \sum_{v \in X} (\Delta - d(v)) \ge 2$,

a contradiction.  Items (1) and (2) follow from the following claims. Let $\alpha \in O(y)$.

Claim 1 (Exchange argument).  $\alpha\notin O(v)$ for every $v\in X$.
Proof sketch.  Recolor the edges of the form $wy$ for each $w$ on the path from $x$ to $v$.

Claim 1 yields that for every $v \in X$ and $\beta \in O(v)$, there is some $w \in N(y)$ with $\phi(yw)=\beta$. Then by the definition of $H$, $vw\in H[X]$. Because $|{O(v)}| \ge \Delta+1-d(v)$ for each $v\in V(G)$ and $|{O(u)}| \ge \Delta+1-d_{G-xy}(u) = \Delta-d(u)+2$ for $u \in\{x,y\}$; so

$d^+_{H[X]}(v) \ge \Delta+1-d(v)$ for every $v \in X$ and $d^+_{H[X]}(x) \ge \Delta-d(x)+2$.

Claim 2 (Distinct $O$-sets).  For all distinct $v,w \in X$, $O(v) \cap O(w) = \emptyset$.
Proof sketch.  A $[\beta,\gamma]$-path in $G$ is a path whose edges are alternately colored with $\beta$ and $\gamma$. A $[\beta,\gamma](a,b)$-path is a $[\beta,\gamma]$-path from $a$ to $b$ in $G$.

Claim: If $v \in X$ and $\beta \in O(v)$, then $G$ contains an $[\alpha,\beta](v,y)$-path.  The claim follows from a Kempe-chain-like argument to recolor $vy$ with $\alpha$.

Then suppose there are $v,w\in X$ with $\beta\in O(v)\cap O(w)$, then the $[\beta,\alpha]$-path starting at $y$ must end at both $v$ and $w$, a contradiction.

Claim 2 yields that $d^-_{H[X]}(v) \le 1$ for every $v \in X$ and $d^-_{H[X]}(x) = 0$, because the color on edge $vy$ corresponds to a unique in-neighbor of $v$ (and there are no color on edge $xy$).

## Graph coloring theorems revisited I – Brook’s Theorem and Rubin’s Block Theorem

Rubin’s Block Theorem.  令 G 為一雙連通圖; 若 G 不是完全圖或奇圈, 則 G 中有一個含有最多一條弦 (chord) 的偶圈.

Lemma 1.  給定連通圖 G 與一色組函數 L, (i) 若有某個點 v 的顏色選擇多於 degree, 則圖 G 可順利著色; (ii) 若圖 G 為雙連通, 且有兩個點上的顏色組不同, 則圖 G 可順利著色.

Lemma 2.  連通圖 G 若存在一 degree-choosable 生成子圖 H, 則圖 G 也是 degree-choosable.
Proof.  By Lemma 1(a).

Theorem.  圖 G 不為 degree-choosable 若且唯若每個雙連通集 (block) 皆為完全圖或奇圈.
Proof.  由 Lemma 2 與 Block Theorem 我們只需要證明含有最多一條弦的偶圈是 degree-choosable; 但利用 Lemma 1(b) 這很明顯.

Brooks’ Theorem (extended).  若圖 G 不是完全圖或奇圈, 則 $\chi_{\ell}(G) \le \Delta(G)$.
Proof.  利用上面的 Theorem 我們只需要處理當所有雙連通集都是完全圖或奇圈的情形.  如果有個點的 degree 比 $\Delta(G)$ 小, 利用 Lemma 1(a) 就結束了; 所以 G 一定是 $\Delta(G)$-正則 (regular); 但在這種情況下表示我們只有一個 block (不然斷點的 degree 會有問題), 證明完成.