I recently realized that the analyses of approximation algorithms for covering-type problems are unintuitive to me. I don’t know whether it’s the analysis-flavored nature of the arguments, which always gets me, or whether I’m just not familiar enough with the proofs. In any case, this is my attempt at decoding them. Let me take the maximum coverage problem as an example: given a set system $(U, \mathcal{S})$ and a number $k$, we ask for the largest number of elements of $U$ that can be covered using $k$ sets from $\mathcal{S}$.
Theorem. Greedy covering gives a $(1 - 1/e)$-approximation to the maximum coverage problem.
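To make "greedy covering" concrete, here is a minimal sketch of the usual rule: at each of the $k$ steps, take the set that covers the most not-yet-covered elements. The function name `greedy_max_coverage` and the tiny instance are my own illustration, not part of the theorem.

```python
from typing import Hashable, List, Set, Tuple


def greedy_max_coverage(
    sets: List[Set[Hashable]], k: int
) -> Tuple[List[int], Set[Hashable]]:
    """Pick up to k sets, each time taking the one covering the most new elements."""
    covered: Set[Hashable] = set()
    chosen: List[int] = []
    for _ in range(k):
        # Index of the set with the largest number of not-yet-covered elements.
        best = max(
            range(len(sets)),
            key=lambda j: len(sets[j] - covered),
            default=None,
        )
        if best is None or not (sets[best] - covered):
            break  # no set adds anything new
        chosen.append(best)
        covered |= sets[best]
    return chosen, covered


# Hypothetical toy instance, chosen only for illustration.
sets = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 6}]
chosen, covered = greedy_max_coverage(sets, k=2)
print(chosen, len(covered))
```

Ties are broken by taking the first best set; the analysis does not depend on how ties are broken.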
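Proof. Let $x_i$ be the number of elements newly covered at step $i$. Let $y_i$ be the number of elements covered from step $1$ to step $i$; we have
$$y_i = x_1 + x_2 + \cdots + x_i.$$
Let $z_i$ be the number of elements we still need to cover after step $i$ in order to cover $\mathrm{opt}$ elements, where $\mathrm{opt}$ is the number of elements covered by an optimal solution; therefore $z_i = \mathrm{opt} - y_i$.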
Key observation. At step $i+1$, there is always a set that covers at least a $1/k$ fraction of the uncovered elements in $\mathrm{OPT}$, the set of elements covered by an optimal solution.
Proof. This is exactly the part I feel uncomfortable with; I find the following formulation helps my intuition:
“No matter which subset of elements is removed from $\mathrm{OPT}$, the original $k$ sets in the optimal cover still cover what remains of $\mathrm{OPT}$; and because $k$ sets are enough to cover this remaining part of $\mathrm{OPT}$, there is one set among the $k$ sets that covers at least a $1/k$ fraction of it.”
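The quoted formulation is a pigeonhole argument, and it can be checked numerically. The sketch below assumes a made-up optimal cover of $k = 3$ sets, removes random subsets of the covered elements, and checks that some optimal set still covers at least a $1/k$ fraction of what remains. The names `optimal_sets` and `opt_elements` are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

k = 3
# Hypothetical optimal cover: k sets whose union plays the role of OPT.
optimal_sets = [set(range(0, 5)), set(range(4, 9)), set(range(8, 12))]
opt_elements = set().union(*optimal_sets)

for _ in range(1000):
    # Remove an arbitrary subset of OPT (think: elements already covered).
    removed = {e for e in opt_elements if random.random() < 0.5}
    remaining = opt_elements - removed
    # The k optimal sets still cover `remaining`, so the best of them
    # must cover at least a 1/k fraction of it.
    best_gain = max(len(s & remaining) for s in optimal_sets)
    assert best_gain * k >= len(remaining)
```

The assertion can never fail: the $k$ intersections together cover `remaining`, so their sizes sum to at least `len(remaining)`, and the largest is at least the average.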
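Therefore, since at least $z_i$ elements of $\mathrm{OPT}$ are still uncovered after step $i$ and greedy picks the set covering the most new elements, $x_{i+1} \ge z_i / k$; from this point on things become easy:
$$z_{i+1} = z_i - x_{i+1} \le \left(1 - \frac{1}{k}\right) z_i,$$
so each step shrinks the gap by a factor of $1 - 1/k$. Since $z_0 = \mathrm{opt}$, after $k$ steps the gap $z_k$ is at most a $(1 - 1/k)^k \le 1/e$ fraction of $\mathrm{opt}$, so greedy covers $y_k = \mathrm{opt} - z_k \ge (1 - 1/e)\,\mathrm{opt}$ elements, which proves the statement.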
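As a sanity check on the shrinking gap, the following sketch runs greedy on a small made-up instance, computes $\mathrm{opt}$ by brute force, and verifies $z_{i+1} \le (1 - 1/k)\, z_i$ at every step. The instance and the list name `gaps` are assumptions for illustration; brute force is only feasible for tiny inputs.

```python
from itertools import combinations

# Hypothetical toy instance.
sets = [{1, 2, 3, 4}, {3, 4, 5}, {5, 6}, {1, 6}, {7, 8, 9}, {2, 7}]
k = 3

# Brute-force opt: best coverage over all choices of k sets.
opt = max(len(set().union(*combo)) for combo in combinations(sets, k))

covered = set()
gaps = [opt]  # gaps[i] is z_i = opt - y_i, with z_0 = opt
for _ in range(k):
    best = max(sets, key=lambda s: len(s - covered))  # greedy choice
    covered |= best
    gaps.append(opt - len(covered))

for i in range(k):
    # z_{i+1} <= (1 - 1/k) * z_i, multiplied through by k to stay in integers
    assert k * gaps[i + 1] <= (k - 1) * gaps[i]

print(opt, gaps)
```

On this instance greedy happens to be optimal, so the gap reaches zero; the inequality is what the proof guarantees in general.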