Finite Playground

On earth as it is in Heaven.

Leave a comment

New improvement to fast matrix multiplication

最近花了許多時間讀資格考, 寫了不少筆記, 沒有時間寫網誌…  難得有新聞來寫一下!

Again the exponent for the fastest matrix multiplication algorithm is improved.  This time the magical number is 2.3728639, 0.0000630 better than Williams’ 2.3729269.  (Last time it goes from the famous CW bound 2.3754770 to Stothers’ 2.3736898 then to Williams’, a 0.0025501 improvement.)

Powers of Tensors and Fast Matrix Multiplication, François Le Gall, 2014.

Leave a comment

Compilation of basic results related to regular language

* Results related to Pumping Lemma

Theorem.  A language L accepted by a DFA A of n states is

  1. nonempty if and only if A accepts a sentence of length less lean n.
  2. infinite if and only if A accepts a sentence of length \ell with n \le \ell < 2n.

The union of the following languages, L_1 \cup L_2, is non-regular, but nevertheless it cannot be proven using pumping lemma.

L_1 = \{u_1 v_1 w v_2 u_2 : u_1,u_2,v_1,v_2,w \in [4]^*, and either v_1=w, v_1=w, or v_2=v_2\}

L_2 = \{w : w\in [4]^*, and precisely 1/7 of symbols in w are 3′s \}

* Regular languages are precisely constant-space languages

Theorem.  A regular language can be recognized by a two-way read-only Turing machine.

Proof sketch.  Record the configuration of each time we first visit the i-th cell.  Define a finite automaton to simulate the same behavior.

Theorem.  \textsc{Space}(o(\log\log n)) is the same as \textsc{Space}(1), both are the class of regular languages.  That is, having an algorithm that uses o(\log\log n)-space is equal to not using any space.

Proof sketch.  If a Turing machine uses s(n) space, there are 2^{O(s(n))} possible configurations and 2^{2^{O(s(n))}} possible crossing sequences.  If s(n) = o(\log\log n) then there are two crossing sequences that are the same, and we can further shorten a shortest string accepted by the machine by the following lemma:

Lemma.  Let M be a machine and x be the input.  Suppose two crossing sequence C_i and C_j are equal; then by removing the substring of x from index i+1 to j-i (assume i < j), we get another string that is accepted by the machine M.

This is a contradiction.

The following language is non-regular but can be decided in O(\log\log n) space:

L = \{\#(0)_2\#(1)_2\# \ldots \#(2^k-1)_2\# : k \ge 0\},

where (i)_2 is the binary representation of i.

* Generating function of regular language

For a given language L \subseteq \Sigma^*,

S_L(z) = \sum_{n\ge 0} |L \cap \Sigma^n| z^n

is the generating function of L.  Using analytic combinatorics we can prove the following useful facts.

Theorem.  For a regular language L, the generating function S_L(z) is rational.

Theorem.  Let A=(Q,\Sigma,\delta,q_0,F) be a finite automaton of a language L.  Then the generating function of L is the following rational function determined under matrix form

S_L(z) = [q_0]^T(I-z[\delta])^{-1}[F],

where [\delta](i,j) = | \{ number of different labels between q_i and q_j \}|; [q_0] and [F] as characteristic vectors.

* Minimizing finite automata

Brzozowski’s algorithm use only power set construction \textsc{Power} and edge reversal \textsc{Rev}.  One can observe that reversing edges of a DFA gives an NFA of the reverse language.  Then the power set construction gives a minimum DFA.  This algorithm takes exponential time in the worst case.

Hopcroft’s algorithm is the fastest algorithm known; runs in O(n \log n) time.  Try to partition the states using the Myhill-Nerode equivalence relation.

* Relation with AC^0

Theorem.  \mathsf{REG} and \mathsf{AC}^0 are incomparable.

Proof.  Parity is not in \mathsf{AC}^0.  Palindrome/addition/\{ww : w \in \Sigma^*\} are not regular.

Leave a comment

Finite automata, monoids, and Myhill-Nerode theorem

Kleene-Buchi Theorem.  For a language L, the following are equivalent:
(1) L is recognized by a finite automaton,
(2) L is defined by a regular expression,
(3) L is definable in Monadic second order logic,
(4) L is recognized by a finite monoid.

The syntactic monoid M(L) = \Sigma^*/\cong_L is defined by the equivalence \cong_L with

u \cong_L v if and only if xuy \in L \iff xvy \in L for every x,y \in \Sigma^*.

Myhill Theorem.  A language is regular if and only if the syntactic monoid is finite.

If we define another equivalence relation \sim_L with

u \sim_L v if and only if uy \in L \iff vy \in L for every y \in \Sigma^*,

\Sigma^*/\sim_L may not be a monoid in general, but we have the following:

Myhill-Nerode Theorem.  A language L can be recognized by a finite automaton with k states if and only if the relation \sim_L has index k; that is, \Sigma^*/\sim_L has k classes.

Leave a comment

Practice on maximum coverage problem

Recently I just realized that the analysis of approximation algorithms of covering typed problems are unintuitive to me.  I don’t know whether it’s because the analysis-based nature which always gets me, or there’s just that I’m not familiar with the proof enough.  In any case, this is my attempt on decoding the proofs.  Let me take maximum coverage problem as an example, which given a set system with a number k, we ask what is the largest amount of elements we can cover using k sets.

Theorem.  Greedy covering gives a (1-1/e)-approximation to the maximum coverage problem.

Proof.  Let \textit{new}(i) be the number of elements newly covered at step i.  Let \textit{covered}(i) be the number of elements covered from step 1 to step i; we have

\textit{covered}(i) = \textit{new}(1) + \dots + \textit{new}(i)

Let \textit{gap}(i) be the number of elements we still need to cover in order to cover opt elements; therefore \textit{gap}(i) = \textit{opt} - \textit{covered}(i).

Key observation.  At step i, there is always a set that covers at least k-th fraction of the uncovered elements in \textit{OPT}.
Proof.  This is exactly the part I feel uncomfortable with; I find the following formulation helps my intuition:
“No matter the removal of any subset of elements in \textit{OPT}, the original k sets in the optimal cover still covers \textit{OPT}; and because k sets are enough to cover a partial of \textit{OPT}, there is one set among the k sets that covers at least k-th fraction of the partial \textit{OPT}.”

Therefore \textit{new}(i) \ge \textit{gap}(i-1)/k; from this point on things becomes easy.

\textit{gap}(i) \le \textit{gap}(i-1) - \textit{gap}(i-1)/k = {k-1 \over k} \textit{gap}(i-1),

so each step will shrink the gap by (k-1)/k.  After k steps the gap has size at most (1-1/k)^k \le 1/e-th fraction of \textit{opt}, which proves the statement.

Leave a comment

Dilworth Theorem and equivalence to Konig-Egervary Theorem

Dilworth’s Theorem. If P is a finite poset, then the maximum size of an antichain in P equals the minimum number of chains needed to cover the elements of P.

Proof. Induction on |P|. We separate into two cases.

(a) Some maximum antichain A omits both a maximal element and a minimal element of P. Then neither D[A] or U[A] contains all of P; the maximality of A implies P = D[A] \cup U[A]. Both D[A] and U[A] has width w(P) because they contain A, and they only intersect in A. Apply induction hypothesis on both sets, we obtain k chains covering D[A] and k chains covering U[A]; combining these chains form k chains covering P.

(b) Every maximum antichain of P consists of all maximal elements or all minimal elements. Thus w(P-\{x,y\}) \le k-1 if x is a minimal element and y is a maximal element below x. Apply induction hypothesis on P-\{x,y\} gives us k-1 chains; add chain \{x,y\} to get a covering of P.

Theorem.  Dilworth’s Theorem is equivalent to Konig-Egervary Theorem.

Proof.  Dilworth to Konig-Egervary: View an n-node bipartite graph G as a bipartite poset. The nodes of one part are maximal elements, and nodes of the other part are minimal. Every covering of the poset of size n-k uses k chains of size 2, which is actually a matching. Each antichain of size n-k corresponds to an independent set in G, and the rest of k nodes forms a vertex cover.

Konig-Egervary to Dilworth: Consider an arbitrary poset P of size n. Define a bipartite graph G as follow: For each element x \in P, create two nodes x^- and x^+. The two parts of the graph are \{x^- : x \in P\} and \{x^+ : x \in P\}. The edge set is \{x^- y^+ : x <_P y\}.

Every matching in G yields a chain-covering in P as follow: We start we n chain, each contains a unique element. If x^- y^+ is in the matching then x and y are combined in the same chain. Therefore because each node in G appears in at most on edge of the matching, a matching of k edges forms a covering of n-k chains.

Given a vertex cover R in G of size k, define a corresponding antichain A = \{x \in P : x^-, x^+ \notin R\}; this is indeed an antichain because if there is a relation x <_P y in A then edge x^- y^+ will be presented, and vertex cover R needs to contain at least one of x^-, y^+.

Claim. No minimal vertex cover of G uses both \{x^-,x^+\}, because by transitivity the sets \{z^- : z \in D(x)\} and \{y^+ : y \in U(x)\} induce a complete bipartite subgraph in G. A vertex cover of a complete bipartite graph must use all of one part. Since x^-,x^+ have all the neighbors in these two sets, we can remove one of x^-,x^+ that is not in the right part and decrease the size of the vertex cover.

Thus a minimum vertex cover of size k yields an antichain of size n-k.

Leave a comment

Graph coloring theorems revisited II – Vizing’s Theorem

Alexandr Kostochka found the following new proof to the classical theorem.

Vizing’s Theorem.  For every simple graph G, \chi'(G) \le \Delta(G) + 1.

Proof.  We prove by contradiction; assume that G is a minimal counter-example (in terms of number of edges) to the statement; that is, G has no edge-(\Delta+1)-coloring.

Let \phi be an edge-(\Delta+1)-coloring of G-xy. For every v \in V(G), let O(v)=O_\phi(v) denote the set of colors in [\Delta+1] not used to color the edges incident to v; one can see that O(x) \cap O(y) = \emptyset.

Construct the auxiliary multidigraph H as follows: V(H) = N_G(y) and there is a directed edge from u to v in E(H) if \phi(vy) \in O(u). Intuitively an edge from u to v means that if we can recolor vy, then we can recolor uy as well. Let X be the set of vertices reachable in H from x (x \in X as well); any recoloring of an edge wy for w\in X will imply that we can color xy. Since the out-neighbors of a vertex reachable from x also are reachable from x, N^+_H(v) \subseteq X for every v \in X.  Therefore

\sum_{v \in X} d^+_{H[X]}(v) = |{E(H[X])}| = \sum_{v \in X} d^-_{H[X]}(v) \text,

and the theorem will be proven if we can show that

  1. d^+_{H[X]}(v) \ge \Delta - d(v) + 1 + [v = x] and
  2. d^-_{H[X]}(v) \le 1-[v=x] for all v \in X;

this implies

0 = \sum_{v \in X} \{d^+_{H[X]}(v) - d^-_{H[X]}(v)\} \ge 2 + \sum_{v \in X} (\Delta - d(v)) \ge 2,

a contradiction.  Items (1) and (2) follow from the following claims. Let \alpha \in O(y).

Claim 1 (Exchange argument).  \alpha\notin O(v) for every v\in X.
Proof sketch.  Recolor the edges of the form wy for each w on the path from x to v.

Claim 1 yields that for every v \in X and \beta \in O(v), there is some w \in N(y) with \phi(yw)=\beta. Then by the definition of H, vw\in H[X]. Because |{O(v)}| \ge \Delta+1-d(v) for each v\in V(G) and |{O(u)}| \ge \Delta+1-d_{G-xy}(u) = \Delta-d(u)+2 for u \in\{x,y\}; so

d^+_{H[X]}(v) \ge \Delta+1-d(v) for every v \in X and d^+_{H[X]}(x) \ge \Delta-d(x)+2.

Claim 2 (Distinct O-sets).  For all distinct v,w \in X, O(v) \cap O(w) = \emptyset.
Proof sketch.  A [\beta,\gamma]-path in G is a path whose edges are alternately colored with \beta and \gamma. A [\beta,\gamma](a,b)-path is a [\beta,\gamma]-path from a to b in G.

Claim: If v \in X and \beta \in O(v), then G contains an [\alpha,\beta](v,y)-path.  The claim follows from a Kempe-chain-like argument to recolor vy with \alpha.

Then suppose there are v,w\in X with \beta\in O(v)\cap O(w), then the [\beta,\alpha]-path starting at y must end at both v and w, a contradiction.

Claim 2 yields that d^-_{H[X]}(v) \le 1 for every v \in X and d^-_{H[X]}(x) = 0, because the color on edge vy corresponds to a unique in-neighbor of v (and there are no color on edge xy).

Leave a comment

Graph coloring theorems revisited I – Brook’s Theorem and Rubin’s Block Theorem

這學期在修學校著名的組合數學課程, 課本當然是使用 Douglas B. West 的新書草稿.  其中有許多內容是全新的, 我想這會成為新一代的組合學以及圖論教科書吧?

今天要談的是圖著色中著名的定理之一 — Brooks’ Theorem, 說明只要不是完全圖或奇圈 (odd cycle), 我們便可以稍稍改進單純的上界 \Delta(G) +1\Delta.  但當初學到的證明簡直是魔術, 看不出什麼東西.  在 West 的新書中使用了 [Entringer85] 的結果, 說明為何這兩種圖比較特別.

Rubin’s Block Theorem.  令 G 為一雙連通圖; 若 G 不是完全圖或奇圈, 則 G 中有一個含有最多一條弦 (chord) 的偶圈.

很奇怪的圖結構定理, 但我們來看看要怎麼使用它!  定義圖 G 為 degree-choosable 如果對於任意的色組函數將每個圖中的點對應到一組可選擇的顏色, 且每個點上的顏色至少跟點的 degree 相同, 我們都保證可以從每組顏色中選取一種將圖適當的著色.  底下這兩個 Lemma 不難證:

Lemma 1.  給定連通圖 G 與一色組函數 L, (i) 若有某個點 v 的顏色選擇多於 degree, 則圖 G 可順利著色; (ii) 若圖 G 為雙連通, 且有兩個點上的顏色組不同, 則圖 G 可順利著色.

Lemma 2.  連通圖 G 若存在一 degree-choosable 生成子圖 H, 則圖 G 也是 degree-choosable.
Proof.  By Lemma 1(a).

利用 Lemma 1,2 與 Rubin’s Block Theorem 我們可以證明底下的若且唯若條件, 不過我們只需要唯若方向就是; 所以暫時先證一邊!

Theorem.  圖 G 不為 degree-choosable 若且唯若每個雙連通集 (block) 皆為完全圖或奇圈.
Proof.  由 Lemma 2 與 Block Theorem 我們只需要證明含有最多一條弦的偶圈是 degree-choosable; 但利用 Lemma 1(b) 這很明顯.

Brooks’ Theorem (extended).  若圖 G 不是完全圖或奇圈, 則 \chi_{\ell}(G) \le \Delta(G).
Proof.  利用上面的 Theorem 我們只需要處理當所有雙連通集都是完全圖或奇圈的情形.  如果有個點的 degree 比 \Delta(G) 小, 利用 Lemma 1(a) 就結束了; 所以 G 一定是 \Delta(G)-正則 (regular); 但在這種情況下表示我們只有一個 block (不然斷點的 degree 會有問題), 證明完成.

不過, 這其實不是我最喜歡的證明; degree-choosable 的部分可以被另一個定理取代, 並且能更增強證明的結果; 我們留到下次再介紹 Alon-Tarsi Theorem.


Get every new post delivered to your Inbox.