Discrete Math & Algorithms
I took these notes in winter 2017 for CME 305, taught by Professor Reza Zadeh.
A graph consists of a set of vertices and edges.
A graph is bipartite iff it contains no odd cycles iff its vertices can be partitioned into two independent sets.
A graph is a tree iff any two of the three conditions hold:
Every tree has a leaf. When removed, the result is another tree, making this useful for induction.
Trees are bipartite.
Handshaking lemma: Hall’s theorem: for a bipartite graph, there is a matching that covers one side if and only if for every subset of , we have , where is the neighborhood of . (Proof uses induction on , weak if possible, strong in the remaining case.)
A clique is a subset of vertices whose induced subgraph is complete.
A cut is a partition of the vertices into two sets; its cut size is the number of edges that cross it.
A vertex cover is a subset of vertices such that every edge has at least one endpoint in the set.
An independent set is a subset of vertices in which no two vertices are adjacent.
A dominating set is a subset of vertices in which every vertex is adjacent to a vertex in the set (or is in the set).
A matching (independent edge set) is a subset of edges in which no two edges are incident.
A set is a vertex cover iff its complement is an indepedent set.
A set is an independent set iff it is a clique in the complement graph.
König’s theorem: in a bipartite graph, the size of the minimum vertex cover is the size of the maximum matching. 
A Hamiltonian path/cycle is a path/cycle that visits each vertex exactly once.
A Eulerian path/circuit is a path/circuit that visits each edge exactly once.
An Eulerian circuit exists iff is connected and every vertex has even degree.
iff asymptotically, is bounded above by a multiple of iff asymptotically, is bounded above and below by multiples of iff asymptotically, is bounded below by a multiple of
Pigeonhole principle: if items are put into fewer than containers, then at least one of the containers contains more than one.
The max is at least the average: Union bound: Exponential bound: Try to bound with Markov inequality: for and non-negative, Chernoff bound: let be the sum of independent Bernoulli random variables with . Then the simplified bounds requiring that . An Erdős-Rényi graph is a random graph on vertices, with each edge selected independently with probability . The distribution of degrees is binomial, which may be approximated as a Poisson distribution . If , then is almost surely connected.
A matroid is a generalization of linear independence in vector spaces. It’s a ground set with a family of subsets (independent sets) of that satisfy the following axioms:
A maximal independent set is called a basis. All bases have the same cardinality.
Examples of matroids abound:
First, note that a graph’s vertices and its independent sets do not form a matroid.
Graphic matroid: A graph’s edges and acyclic sets. The bases are spanning trees (when the graph is connected)
Vector matroid: A finite subset of a vector space and its linearly independent sets. The bases are bases. (Note: we can present the finite subset as a matrix, hence “matroid”.)
Uniform matroid: A finite set and its subsets of cardinality or less. The bases are sets of cardinality .
The greedy matroid algorithm, a generalization of Kruskal’s algorithm, optimizes weighted matroids.
Divide and conquer
hitting time: is expected number of steps to go from vertex to vertex commute time: is the expected number of steps to visit all nodes, starting from Cover time , where is the number of edges, is effective resistance is voltage from to if we inject units of current in to each node and extract units of current from . look at a spanning tree, and in order traversal where
Spectral graph theory
Edmonds’s blossom algorithm
For bipartite graphs, use max-flow. There is a bijection between (non-infinite) cuts and vertex covers. See .
Minimum spanning tree
The minimum spanning tree problem finds a spanning tree with minimum weight. A spanning tree is a cycleless connected subgraph with vertices. Adding any edge to a spanning tree yields a cycle; deleting any edge from a spanning tree partitions the vertices into two sets.
Some important properties of minimum spanning trees:
If the edge weights of a graph are distinct, there is a unique minimum spanning tree.
Cut property: out of all edges crossing a particular cut, each minimum-weight edge is part of some MST.
Cycle property: in any cycle of the graph, the maximum-weight edge (if unique) cannot be part of a MST.
Kruskal’s algorithm is an algorithm that builds a minimum spanning tree by greedily taking increasingly heavy edges that don’t create a cycle. Concretely:
Sort the list of edges in order of increasing weight.
Go through the list once, incrementally building a tree, taking an edge if it does not create a cycle.
The dominating cost is the sort, with complexity. The building of the tree can be done in or time with the union-find data structure, where is the inverse Ackermann function.
The maximum flow problem involves finding the maximum possible - flow in a weighted graph. An - flow is a flow from a source to a sink in a weighted graph (with weights interpreted as capacities), and its size is the amount leaving (or indeed, the net amount leaving any subset of vertices containing ). The minimum cut problem involves finding an - cut of minimum cut size. An - cut is a partition of vertices with and , and its cut size is the sum of the capacities of the directed edges from to . Notice that the size of any - flow is at most the size of any - cut, since all the flow must cross the cut and cannot exceed the capacity of the edges crossing the cut. In particular, this means that if the size of an - flow equals the size of an - cut, the flow is a maximum flow, and the cut is a minimum cut.
Ford-Fulkerson is the following algorithm for computing the max flow:
Start with zero flow.
For each edge, add an opposing edge if it doesn’t already exist, with capacity. While there is a directed path from to with positive capacity, we: send as much additional flow along as possible, reduce the capacities of the edges of by , increase the capacities of edges that oppose by .
The modified graph is called the residual graph, and the backwards edges allows us to “undo” choices of flow. When this terminates, return the resulting flow.
If the capacities are integers, then the algorithm always terminates, since the flow size (the amount leaving ) increases but is bounded by the total capacity of the edges leaving . (In this case, we see that the max flow is always integral.) If the algorithm terminates, then it returns a maximum flow. To show this, let be the nodes reachable from in the residual graph at the end of the algorithm, ignoring edges of capacity . Consider the edges crossing the cut in the original graph. The edges from to are fully saturated with flow, since the corresponding edges in the residual graph have been reduced to capacity. The edges from to are empty, since the residual edges from to are empty.
Hence the net flow (the flow size) equals this cut’s cut size, so this flow is a maximum flow, and this cut is a minimum cut.
If is the maximum flow, then there are at most iterations, each with work. This algorithm is .
Edmonds-Karp is Ford-Fulkerson, where the paths are chosen in order of increasing length using breadth-first search. It runs in time, which comes from [todo].
Note: in applications, use ’s liberally! They make analysis easier.
Karger’s algorithm for global min-cut is the following probabilistic algorithm. Generate random cuts as follows, keeping track of the minimum one seen so far. Until there are only two supervertices, contract a uniformly chosen edge (preserving multi-edges but removing self-edges). Return the cut that partitions the two supervertices.
(This is equivalent to assigning random weights and cutting a random edge of the MST.)
Let’s calculate the probability of returning a global minimum cut. Let be a global minimum cut, with cut size . Note that the degree of each vertex is at least (otherwise we could take just that vertex to find a minimum cut), and by the handshake lemma. The probability that we manage to avoid contracting an edge in on the first step is , and on the th step, combining into a probability of at least when there are only nodes left. Thus we require iterations of work each. Instead of generating each sequence of cuts independently, we can generate sequences at once with recursion. This process yields a algorithm.
Important complexity classes of decision problems:
P: easy to solve (polynomial time).
NP: easy to check (polynomial time verifiable). (Important: NP includes P!)
NP-hard: can solve any NP problem (any NP problem reduces to it in polynomial time).
NP-complete: NP-hard and NP.
When reduces to in polynomial time (we can use to solve ), we write , since we consider to be harder than . To prove that an algorithm is NP-hard, show that an NP-hard problem reduces to it. If a problem is NP-hard, there are two main strategies to make it feasible: approximation algorithms, or restricting to an easy case.
The boolean satisfiability problem (SAT) is the problem of determining whether a boolean statement is satisfiable. The Cook-Levin theorem says that SAT is NP-complete. SAT reduces to 3SAT, where the boolean statement is restricted to -conjunctive normal form, e.g. by noting that can be satisfied iff can be. There is a -approximation for Max-3SAT: choose all true or all false. Each clause is satisfied by at least one of the two assignments; hence at least half of the clauses are satisfied by one of two assignments. This can be improved to by choosing randomly, if each clause has exactly . There is also a polynomial-time randomized algorithm for 2SAT: at each step, choose an unsatisfied clause and flip one literal in it (choose either literal with probability ). If we compare the current assignment to a correct assignment, one or both literals will be wrong, so flipping one of them will increase the number of correct variables by with probability at least . Thus this is a random walk on a chain from to , where is the number of variables and the current position is the number of correct variables. The cover time is polynomial, making this algorithm polynomial-time. This algorithm isn’t the most efficient, but is easily generalizable to 3SAT (although, in that case, the algorithm becomes exponential, since the transition probabilities becomes and ).
Covering and packing
The following problems are all related and NP-complete:
Vertex Cover: is there a vertex cover of (at most) size ? Independent Set: is there an independent set of (at least) size ? Clique: is there a clique of (at least) size ? Dominating Set: is there a dominating set of (at most) size ? Set Cover: given a universe and a set of subsets of , is there a cover of using of the subsets?
The reductions are:
3SAT to Independent Set: introduce a triangle for each clause and connect and .
Vertex Cover, Independent Set, and Clique: straightforward.
Vertex Cover to Dominating Set: introduce an intermediate node for each edge , and connect them into a triangle.
Dominating Set to Set Cover: closed neighborhoods for each vertex are the subsets; set of vertices is the universe.
Vertex Cover to Set Cover: sets of incident edges for each vertex are the subsets; set of edges is the universe.
Minimum vertex cover/maximum independent set
There is a -approximation to Minimum Vertex Cover: find a maximal matching. An optimal cover contains at least one endpoint for each matching edge, so the maximal matching can only have twice as many. Another -approximation is by integer program relaxation: write the associated integer program; solve the linear program by relaxing the integer constraint; round the variables. Interestingly, the dual to this linear program is corresponds to maximum matching . Minimum vertex cover cannot be approximated better than (unless unique games conjecture is false). Maximum independent set cannot be approximated to within a constant factor (unless P = NP).
For a bipartite graph, use max-flow (see above). For a forest (which is bipartite!), you can also take a leaf as part of the cover, remove it and its parent, and recurse.
Set cover/dominating set
Minimum Set Cover has an -approximation: greedily choose the set that contains the most uncovered elements. Proof #1: it’s more intuitive to work with the weighted version, where the sets have weights (“prices”), and at each step, we choose (“purchase”) the set with the lowest price per uncovered (“unobtained”) element. We try to minimize the total price (in set cover, the price is per set). Number the elements in the order they are chosen by the algorithm. Then the th item is purchased for some effective price (the cheapest possible, by greediness). However, we could have purchased that item for an effective price of , where is the number of remaining items, by just purchasing all the sets in the optimal set cover. Thus the cost of the th item is so . Minimum Dominating Set can use the same algorithm. Proof #2: At a particular step, suppose there are uncovered elements. Because choosing all the optimal sets would cover those elements with sets, there exists one set that covers at least elements (probabilistic method). Therefore, the algorithm’s choice covers at least that many elements, so that after the step there are at most elements remaining. After steps, there are at most elements remaining. Setting makes this number less than , indicating that the algorithm will have terminated by then.
The following problems are NP-hard:
Hamiltonian Cycle: Is there a Hamiltonian cycle in the graph?
Hamiltonian Path: Is there a Hamiltonian cycle in the graph?
Traveling Salesman: In a weighted complete graph, find a tour of all the nodes of minimum distance.
The reductions are:
Traveling Salesman cannot be approximated, since any approximation will still find Hamiltonian cycles. If the distance is metric (triangle inequality and symmetric), then the Christofides algorithm is a -approximation algorithm. It consists of the following steps: Compute a MST a . Note that , since contains a spanning tree, so going around the MST and skipping (shortcutting) duplicate nodes is already a -approximation. Let be the odd-degree vertices in the subgraph . By the handshaking lemma, is even. Compute a minimum-weight perfect matching of the vertices in . The intuition is that it might be cheaper to connect these up another way instead of just blindly shortcutting. Indeed, shortcutting the optimal tour to a tour on just yields two disjoint perfect matchings on , so . Combine and into a multigraph and compute an Eulerian cycle on it. Shortcut it to yield the final tour.
Integer linear programming?
Bin packing - any fit
Randomized max-cut approx, deterministic max-cut approx, ILP, Goemans-Williamson
Linear programming (totally unimodular)
 Motwani and Raghavan, pg. 68.