Finite Playground

On earth as it is in Heaven.


Hindman’s Theorem

While reading the posts on ultrafilters by Terry Tao [1,2], I found the following result interesting:

Hindman’s theorem (a special case).  Suppose we color the natural numbers using k colors.  Then there exist a color c and an infinite set S, all of whose elements are colored c, such that every finite sum of elements of S has color c as well.

This theorem is also said to be rather unpleasant to prove if one insists on avoiding ultrafilters.
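The theorem itself is about infinite sets, but a finite sanity check is easy to script.  The following Python sketch (the function names are my own) brute-forces a small subset of \{1,\dots,N\} whose nonempty sums of distinct elements are monochromatic under a given coloring:

```python
from itertools import combinations

def finite_sums(s):
    """All nonempty sums of distinct elements of s."""
    sums = set()
    for r in range(1, len(s) + 1):
        for comb in combinations(s, r):
            sums.add(sum(comb))
    return sums

def monochromatic_sum_set(color, n, size):
    """Search [1..n] for a `size`-element set whose finite sums all
    receive the same color.  Brute force; for illustration only.
    Sets whose sums exceed n are skipped, so `color` only needs to
    be defined on [1..n]."""
    for s in combinations(range(1, n + 1), size):
        sums = finite_sums(s)
        if max(sums) <= n and len({color(x) for x in sums}) == 1:
            return s
    return None

# Example: color by parity.
s = monochromatic_sum_set(lambda x: x % 2, 20, 3)
```

With the parity coloring this finds (2, 4, 6), whose sums 2, 4, 6, 8, 10, 12 are all even.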


New improvement to fast matrix multiplication

I have been spending a lot of time lately studying for the qualifying exam and writing notes, leaving no time for blogging…  For once there is news worth writing about!

Once again the exponent for the fastest matrix multiplication algorithm has been improved.  This time the magic number is 2.3728639, which is 0.0000003 better than Williams’ 2.3728642.  (Last time the bound went from the famous Coppersmith-Winograd bound 2.3754770 to Stothers’ 2.3736898 and then to Williams’, a total improvement of 0.0026128.)

Powers of Tensors and Fast Matrix Multiplication, François Le Gall, 2014.


Compilation of basic results related to regular languages

* Results related to Pumping Lemma

Theorem.  A language L accepted by a DFA A of n states is

  1. nonempty if and only if A accepts a sentence of length less than n.
  2. infinite if and only if A accepts a sentence of length \ell with n \le \ell < 2n.
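These two criteria give a naive but complete decision procedure: enumerate all words up to the stated length bound.  A minimal Python sketch, assuming the DFA is given as a transition dictionary (my own encoding, not from the post):

```python
from itertools import product

def accepts(delta, start, accepting, word):
    """Run a DFA given as a dict delta[(state, symbol)] -> state."""
    state = start
    for ch in word:
        state = delta[(state, ch)]
    return state in accepting

def is_nonempty(delta, start, accepting, alphabet, n):
    """L(A) is nonempty iff A accepts some word of length < n."""
    return any(accepts(delta, start, accepting, w)
               for l in range(n)
               for w in product(alphabet, repeat=l))

def is_infinite(delta, start, accepting, alphabet, n):
    """L(A) is infinite iff A accepts some word of length l, n <= l < 2n."""
    return any(accepts(delta, start, accepting, w)
               for l in range(n, 2 * n)
               for w in product(alphabet, repeat=l))

# Example: 2-state DFA for "even number of a's" over {a, b}.
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}
```

For this DFA both tests succeed: the empty word witnesses nonemptiness, and the word bb (length 2 = n) witnesses infiniteness.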

The union L_1 \cup L_2 of the following two languages is non-regular, but nevertheless this cannot be proven using the pumping lemma.

L_1 = \{u_1 v_1 w v_2 u_2 : u_1,u_2,v_1,v_2,w \in [4]^*, and either v_1=w, v_2=w, or v_1=v_2\}

L_2 = \{w : w\in [4]^*, and precisely 1/7 of the symbols in w are 3's\}

* Regular languages are precisely constant-space languages

Theorem.  A language is regular if and only if it can be recognized by a two-way read-only Turing machine (a two-way finite automaton).

Proof sketch (of the nontrivial direction).  For each cell i, record the sequence of states in which the head crosses the boundary of the i-th cell (the crossing sequence).  There are only finitely many possible crossing sequences, so a one-way finite automaton can be defined to simulate the same behavior.

Theorem.  \textsc{Space}(o(\log\log n)) equals \textsc{Space}(1); both are exactly the class of regular languages.  That is, an algorithm using o(\log\log n) space is no more powerful than one using no space at all.

Proof sketch.  If a Turing machine uses s(n) space, there are 2^{O(s(n))} possible configurations and 2^{2^{O(s(n))}} possible crossing sequences.  If s(n) = o(\log\log n), then on a sufficiently long input this count is smaller than the input length, so by pigeonhole two crossing sequences coincide, and we can further shorten a shortest string accepted by the machine using the following lemma:

Lemma.  Let M be a machine and x an input.  Suppose two crossing sequences C_i and C_j are equal (with i < j); then by removing the substring of x from index i+1 to j, we obtain another string accepted by M.

This contradicts the minimality of the chosen string.

The following language is non-regular but can be decided in O(\log\log n) space:

L = \{\#(0)_2\#(1)_2\# \ldots \#(2^k-1)_2\# : k \ge 0\},

where (i)_2 is the binary representation of i.
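As a sanity check, membership in L is easy to test directly (the O(\log\log n) space bound is a property of the recognizing machine, not of this script).  A Python sketch, under the assumption that each (i)_2 is written with exactly k bits (the post does not fix the width, so this convention is mine):

```python
def in_L(s):
    """Membership test for L = {#(0)_2#(1)_2#...#(2^k-1)_2# : k >= 1},
    assuming each number is written with exactly k bits."""
    if not (s.startswith('#') and s.endswith('#')):
        return False
    blocks = s[1:-1].split('#')
    k = len(blocks[0])
    if k == 0 or len(blocks) != 2 ** k:
        return False
    # The i-th block must be i written in binary, zero-padded to k bits.
    return all(b == format(i, '0%db' % k) for i, b in enumerate(blocks))
```

For example, in_L('#00#01#10#11#') holds, while any permutation of the blocks fails.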

* Generating functions of regular languages

For a given language L \subseteq \Sigma^*,

S_L(z) = \sum_{n\ge 0} |L \cap \Sigma^n| z^n

is the generating function of L.  Using analytic combinatorics we can prove the following useful facts.

Theorem.  For a regular language L, the generating function S_L(z) is rational.

Theorem.  Let A=(Q,\Sigma,\delta,q_0,F) be a deterministic finite automaton recognizing the language L.  Then the generating function of L is the rational function given in matrix form by

S_L(z) = [q_0]^T(I-z[\delta])^{-1}[F],

where [\delta](i,j) is the number of distinct symbols labeling a transition from q_i to q_j, and [q_0] and [F] are the characteristic vectors of the initial state and of the set of accepting states.
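The formula can be checked coefficient-wise: expanding (I - z[\delta])^{-1} as a geometric series \sum_n z^n [\delta]^n shows that the coefficient of z^n is [q_0]^T [\delta]^n [F], i.e. the number of length-n words in L.  A small Python sketch (the DFA encoding is my own):

```python
def transfer_matrix(delta, n_states):
    """[delta](i, j) = number of symbols whose transition takes q_i to q_j."""
    M = [[0] * n_states for _ in range(n_states)]
    for (i, _ch), j in delta.items():
        M[i][j] += 1
    return M

def count_words(delta, start, accepting, n_states, length):
    """Number of length-n words in L, computed as [q0]^T [delta]^n [F]."""
    M = transfer_matrix(delta, n_states)
    v = [1 if q in accepting else 0 for q in range(n_states)]  # vector [F]
    for _ in range(length):  # v becomes [delta]^length [F]
        v = [sum(M[i][j] * v[j] for j in range(n_states)) for i in range(n_states)]
    return v[start]

# DFA over {a, b} for "even number of a's"; states 0 (even) and 1 (odd).
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}
```

For this DFA the counts are 1, 1, 2, 4, 8, …: of the 2^n words of length n \ge 1, exactly half have an even number of a's, matching S_L(z) = (1-z)/(1-2z).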

* Minimizing finite automata

Brzozowski’s algorithm uses only the power set construction \textsc{Power} and edge reversal \textsc{Rev}.  One can observe that reversing the edges of a DFA (and swapping initial and accepting states) gives an NFA for the reverse language, and that applying the power set construction to it, keeping only reachable states, yields a minimal DFA for the reverse language.  Applying the two operations twice, \textsc{Power}(\textsc{Rev}(\textsc{Power}(\textsc{Rev}(A)))), therefore produces a minimal DFA for L.  This algorithm takes exponential time in the worst case.
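A compact (and, as noted, worst-case exponential) implementation sketch in Python; the NFA encoding is my own: a tuple (states, alphabet, trans, starts, finals) where trans maps (state, symbol) to a set of successor states.

```python
from itertools import chain

def reverse(nfa):
    """Rev: flip every edge and swap the start and final sets."""
    states, alphabet, trans, starts, finals = nfa
    rtrans = {}
    for (q, ch), targets in trans.items():
        for t in targets:
            rtrans.setdefault((t, ch), set()).add(q)
    return (states, alphabet, rtrans, set(finals), set(starts))

def determinize(nfa):
    """Power: subset construction restricted to reachable subset-states."""
    states, alphabet, trans, starts, finals = nfa
    start = frozenset(starts)
    dtrans, seen, stack = {}, {start}, [start]
    while stack:
        S = stack.pop()
        for ch in alphabet:
            T = frozenset(chain.from_iterable(trans.get((q, ch), ()) for q in S))
            dtrans[(S, ch)] = {T}
            if T not in seen:
                seen.add(T)
                stack.append(T)
    return (seen, alphabet, dtrans, {start}, {S for S in seen if S & finals})

def brzozowski(dfa):
    """Minimal DFA as Power(Rev(Power(Rev(A))))."""
    return determinize(reverse(determinize(reverse(dfa))))
```

For instance, a 3-state DFA for a^+ (with a redundant cycle between two accepting states) minimizes to the expected 2 states.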

Hopcroft’s algorithm is the fastest algorithm known; it runs in O(n \log n) time.  It partitions the states into the classes of the Myhill-Nerode equivalence relation by iterated partition refinement.

* Relation with AC^0

Theorem.  \mathsf{REG} and \mathsf{AC}^0 are incomparable.

Proof.  Parity is regular but not in \mathsf{AC}^0, so \mathsf{REG} \not\subseteq \mathsf{AC}^0.  Conversely, palindromes, addition, and \{ww : w \in \Sigma^*\} are in \mathsf{AC}^0 but not regular.


Finite automata, monoids, and Myhill-Nerode theorem

Kleene-Büchi Theorem.  For a language L, the following are equivalent:
(1) L is recognized by a finite automaton,
(2) L is defined by a regular expression,
(3) L is definable in monadic second-order logic,
(4) L is recognized by a finite monoid.

The syntactic monoid M(L) = \Sigma^*/\cong_L is defined by the equivalence \cong_L with

u \cong_L v if and only if xuy \in L \iff xvy \in L for every x,y \in \Sigma^*.

Myhill Theorem.  A language is regular if and only if the syntactic monoid is finite.
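Myhill’s theorem can be illustrated by computing the transition monoid of a DFA; for the minimal DFA of L this monoid is isomorphic to the syntactic monoid M(L).  A Python sketch with states numbered 0..n-1 (the encoding is my own):

```python
def transition_monoid(delta, n_states, alphabet):
    """Generate all functions on states induced by words: each word u acts
    by q -> delta*(q, u), and these functions form a monoid under
    composition.  Functions are stored as tuples (f[0], ..., f[n-1])."""
    ident = tuple(range(n_states))  # action of the empty word
    gens = {ch: tuple(delta[(q, ch)] for q in range(n_states)) for ch in alphabet}
    monoid, frontier = {ident}, [ident]
    while frontier:
        f = frontier.pop()
        for g in gens.values():  # extend the word by one more symbol
            h = tuple(g[f[q]] for q in range(n_states))
            if h not in monoid:
                monoid.add(h)
                frontier.append(h)
    return monoid

# "Even number of a's" DFA; its transition monoid is Z/2Z.
delta = {(0, 'a'): 1, (0, 'b'): 0, (1, 'a'): 0, (1, 'b'): 1}
```

Here the monoid has exactly two elements, the identity and the swap induced by a, matching the syntactic monoid Z/2Z of the even-a's language.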

If we define another equivalence relation \sim_L with

u \sim_L v if and only if uy \in L \iff vy \in L for every y \in \Sigma^*,

\Sigma^*/\sim_L may not be a monoid in general, but we have the following:

Myhill-Nerode Theorem.  A language L can be recognized by a finite automaton with k states if and only if the relation \sim_L has index k; that is, \Sigma^*/\sim_L has k classes.


Practice on maximum coverage problem

Recently I realized that the analyses of approximation algorithms for covering-type problems are unintuitive to me.  I don’t know whether it’s the analysis-heavy nature of these proofs that always gets me, or simply that I’m not familiar enough with them.  In any case, this is my attempt at decoding the proofs.  Let me take the maximum coverage problem as an example: given a set system and a number k, we ask for the largest number of elements we can cover using k sets.

Theorem.  Greedy covering gives a (1-1/e)-approximation to the maximum coverage problem.

Proof.  Let \textit{new}(i) be the number of elements newly covered at step i.  Let \textit{covered}(i) be the number of elements covered from step 1 to step i; we have

\textit{covered}(i) = \textit{new}(1) + \dots + \textit{new}(i)

Let \textit{gap}(i) be the number of elements we still need to cover in order to cover opt elements; therefore \textit{gap}(i) = \textit{opt} - \textit{covered}(i).

Key observation.  At step i, there is always a set that covers at least a 1/k fraction of the still-uncovered elements of \textit{OPT}.
Proof.  This is exactly the part I felt uncomfortable with; I find the following formulation helps my intuition:
“No matter which elements of \textit{OPT} have already been covered, the k sets of the optimal solution still cover \textit{OPT}; and because k sets suffice to cover the remaining part of \textit{OPT}, one of these k sets covers at least a 1/k fraction of that part.”

Therefore \textit{new}(i) \ge \textit{gap}(i-1)/k; from this point on things become easy.

\textit{gap}(i) \le \textit{gap}(i-1) - \textit{gap}(i-1)/k = {k-1 \over k} \textit{gap}(i-1),

so each step shrinks the gap by a factor of (1-1/k).  After k steps the gap is at most a (1-1/k)^k \le 1/e fraction of \textit{opt}, so at least a (1-1/e) fraction of \textit{opt} is covered, which proves the statement.
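The greedy algorithm itself is only a few lines.  A Python sketch (function and variable names are my own):

```python
def greedy_max_coverage(sets, k):
    """Repeatedly pick the set covering the most not-yet-covered elements.
    Returns the covered elements and the k chosen sets."""
    covered, chosen = set(), []
    for _ in range(k):
        # The key observation guarantees this gains >= gap/k new elements.
        best = max(sets, key=lambda s: len(s - covered))
        chosen.append(best)
        covered |= best
    return covered, chosen
```

For instance, on the sets {1,2,3,4}, {3,4,5}, {5,6} with k=2, greedy first takes {1,2,3,4} and then {5,6}, covering all six elements.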


Dilworth Theorem and equivalence to Konig-Egervary Theorem

Dilworth’s Theorem. If P is a finite poset, then the maximum size of an antichain in P equals the minimum number of chains needed to cover the elements of P.

Proof. Induction on |P|.  Let k = w(P) denote the width of P, the maximum size of an antichain.  We separate into two cases.

(a) Some maximum antichain A omits both a maximal element and a minimal element of P. Then neither D[A] nor U[A] contains all of P; the maximality of A implies P = D[A] \cup U[A]. Both D[A] and U[A] have width k because they contain A, and they intersect only in A. Applying the induction hypothesis to both sets, we obtain k chains covering D[A] and k chains covering U[A]; each chain of D[A] ends in A and each chain of U[A] starts in A, so gluing them along A forms k chains covering P.

(b) Every maximum antichain of P consists of all maximal elements or of all minimal elements. Pick a minimal element x and a maximal element y with x \le y; then w(P-\{x,y\}) \le k-1. Applying the induction hypothesis to P-\{x,y\} gives k-1 chains; adding the chain \{x,y\} yields a covering of P by k chains.

Theorem.  Dilworth’s Theorem is equivalent to Konig-Egervary Theorem.

Proof.  Dilworth to Konig-Egervary: View an n-node bipartite graph G as a bipartite poset, where the nodes of one part are the maximal elements and the nodes of the other part are the minimal elements. A covering of the poset by n-k chains uses exactly k chains of size 2, which form a matching of size k. An antichain of size n-k corresponds to an independent set in G, and the remaining k nodes form a vertex cover.

Konig-Egervary to Dilworth: Consider an arbitrary poset P of size n. Define a bipartite graph G as follows: For each element x \in P, create two nodes x^- and x^+. The two parts of the graph are \{x^- : x \in P\} and \{x^+ : x \in P\}, and the edge set is \{x^- y^+ : x <_P y\}.

Every matching in G yields a chain covering of P as follows: Start with n chains, each containing a single element; whenever x^- y^+ is in the matching, merge the chains of x and y. Since each node of G appears in at most one edge of the matching, a matching of k edges yields a covering by n-k chains.

Given a vertex cover R of G of size k, define the corresponding antichain A = \{x \in P : x^-, x^+ \notin R\}. This is indeed an antichain: if some relation x <_P y held with x, y \in A, the edge x^- y^+ would be present but uncovered, since R contains neither x^- nor y^+.

Claim. No minimal vertex cover of G uses both x^- and x^+: by transitivity the sets \{z^- : z \in D(x)\} and \{y^+ : y \in U(x)\} induce a complete bipartite subgraph of G, and a vertex cover of a complete bipartite graph must contain all of one part. Since all neighbors of x^- and x^+ lie in these two sets, whichever of x^-, x^+ is adjacent only to the fully covered part is redundant and can be removed, decreasing the size of the cover.

Thus a minimum vertex cover of size k yields an antichain of size n-k.
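The Konig-Egervary-to-Dilworth direction is directly implementable: split each element into x^- and x^+, find a maximum bipartite matching (Kuhn’s augmenting-path algorithm below is a standard choice, not from the post), and merge matched pairs into chains.  A Python sketch:

```python
def max_matching(adj, left):
    """Kuhn's augmenting-path bipartite matching.
    adj[u] lists the right-side neighbors of left vertex u;
    returns a dict mapping each matched right vertex to its left vertex."""
    match = {}
    def augment(u, seen):
        for v in adj.get(u, ()):
            if v not in seen:
                seen.add(v)
                if v not in match or augment(match[v], seen):
                    match[v] = u
                    return True
        return False
    for u in left:
        augment(u, set())
    return match

def min_chain_cover(elements, less):
    """Chain cover of a poset via the x^- / x^+ bipartite graph:
    edge x^- y^+ whenever x < y; each matched edge merges two chains,
    so a matching of size m yields a cover by n - m chains."""
    adj = {x: [y for y in elements if less(x, y)] for x in elements}
    match = max_matching(adj, elements)      # maps y (i.e. y^+) -> x (i.e. x^-)
    succ = {x: y for y, x in match.items()}  # next element in the chain of x
    chains = []
    for x in elements:
        if x not in match:                   # x^+ unmatched: a chain starts at x
            chain = [x]
            while chain[-1] in succ:
                chain.append(succ[chain[-1]])
            chains.append(chain)
    return chains
```

On the divisors of 12 ordered by divisibility (width 2), this produces two chains covering all six elements.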

