Another go at blogging

2020-01-29

Blogging

Undirected Graphs

Some problems

Path
Shortest path
Cycle
Ehler tour: A cycle that uses each edge excatly once.
Hamilton tour: A cycle that uses each vertex exactly once
- classical NP-complete problem.
Connectivity
MST:
Biconnectivity: A vertex whose removal disconnects the graph
Planarity
Graph isomorphism: Are two graphs identical?
- No one knows so far. A lonstanding open problem

Representations

Real-world graphs tend to be sparse (huge number of vertices, small average vertex degree).

Set-of-edges representation
- unefficient
Adjacency-matrix representation
- space cost is prohibitive
Adjacency-list array representation
- GOOD

Adjacency-list Data structure

Space usage proportional to V + E
Constant time to add an edge
Time proportional to the degree of v to iterate through vertices adjacent to v

Depth-first Search (DFS)

Typical applications:

Find all vertices connected to a given source vertex
Find a path between two vertices

Algorithm:

Use recursion (a function-call stack) or an explicit stack.
Mark each visited vertex (and keep track of edge taken to visit it)
Return (retrace steps) when no unvisited options

public class DepthFirstPaths{
    private blloean[] marked;
    private int[] edgeTO;
    private int s;
    public DepthFirstPaths(Graph G, int s)
    {
        // ...
        dfs(G, s);
    }

    private void dfs(Graph Gm int v)
    {
        marked[v] = true;
        for (int w : G.adj(v))
            if (!marked[v])
            {
                dfs(G, w)
                edgeTo[w] = v;
            }
    }
}

Propositions:

DFS marks all vertices connected to s in time proportional to the sum of their degrees.
After DFS, can find vertices connected to s in constant time and can find a path to s in time proportional to its length.

Breadth-first Search (BFS)

Typical applications:

shortest path

Algorithm:

Put s onto a queue, and mark s as visited
Take the next vertex v from the queue and mark it
Put onto the queue all unmarked vertices that are adjacent to v

public class BreadthFirstPaths
{
    private boolean[] marked;
    private int[] edgeTo;
    // ...
    private void bfs(Graph G, int s)
    {
        Queue<Integer> q = new Queue<>();
        q.enqueue(s);
        marked[s] = ture;
        while (!q.isEmpty())
        {
            int v = q.dequeue();
            for (int w: G.adj(v))
            {
                if (!marked[w])
                {
                    q.enqueue(w);
                    marked[w] = true;
                    edgeTo[w] = v;
                }
            }
        }
    }
}

Proposition:

BFS computes shortest paths (fewest number of edges) from s to all other vertices in a graph in time proportional to E + V

Applications of DFS

Connected components

The goal is to preprocess graph to answer queries of the form is v connected to w? in constant time.

The relation is connected to is an equivalence relation:

Reflexive: v is connected to v
Symmetric: if v is connected to w, then w is connected to v
Transitive: if v connected to w and w connected to x, then v connected to x

public class CC {
    private boolean[] marked;
    private int[] id;
    private int count;

    public CC(Graph G) {
        marked = new boolean[G.V()];
        id = new int[G.V()];
        for (int v = 0, v < G.V(); v++) {
            if (!marked[v]) {
                dfs(G, v);
                count++;
            }
        }
    }

    // ...

    private void dfs(Graph G, int v) {
        marked[v] = true;
        id[v] = count;
        for (int w : G.adj(v)) {
            if (!marked[w]) {
                dfs(G, w)
            }
        }
    }
}

Cycle detection

Problem: Is a given graph acylic?

TODO

Two-colorability

Problem: Is the graph bipartite?

TODO

Symbol graphs

TODO

Degrees of separation

TODO

Directed Graphs

A directed graph (or digraph) is a set of vertices and a collection of directed edges. Each directed edge connects an ordered pair of vertices.

outdegree: the number of edges going from it
indegree: the number fo edges going into it
directed path: a sequence of vertices in which there is a (directed) edge pointing from each vertex in the sequence to its successor in the sequence
directed cycle
simple cycle: a cycle with no repeated edges or vertices

Representations

Again, use adjacency-lists representation

Based on iterating over vertices pointing from v
Real-world digraphs tend to be sparse

public class Digraph {
    private final int V;
    private final Bag<Integer>[] adj;

    public Digraph(int V) {
        this.V = V;
        adj = (Bag<Integer>[]) new Bag[V];
        for (int v = 0; v < V; v++) {
            adj[v] = new Bag<Integer>[];
        }
    }

    public void addEdge(int v, int w) {
        adj[v].add(w);
    }

    public Iterable<Integer> adj(int v) {
        return adj[v];
    }
}

Digraph search

Reachabiliity problem: Find all vertices reachable from s along a directed path.

We can use the same dfs method as for undirected graphs.

Every undirected graph is a digraph with edges in both directions.
DFS is a digraph algorithm,

Reachability applications:

program control-flow analysis
- Dead-code elimination
- infinite-loop detection
mark-sweep garbage collector

Topological Sort

Topological sort: Given a digraph, put the vertices in order such that all its directed edges point from a vertix earlier in the order to a vertex later in the order (or report impossible).

A digraph has a topological order if and only if it is a directed acyclic graph (DAG).
Topological sort redraws DAG so all edges poitn upwards.

use DFS again. It can be proved that reverse postorder of a DAG is a topological order.
(check P578 for the definition of Preorder/Postorder)

public class DepthFirstOrder {
    private boolean[] marked;
    private Stack<Integer> reversePost;

    publiv DepthFirstOrder(Digraph G) {
        reversePost = new Stack<Integer>();
        marked = new boolean[G.V()];
        for (int v = 0; v < G.V(); v++) {
            if (!marked[v]) dfs(G, v);
        }
    }

    private void dfs(Digrapg G, int v) {
        marked[v] = true;
        for (int w : G.adj(v)) {
            if (!marked[w]) dfs(G, w)
        }
        reversePost.push(v);
    }
}

Directed cycle detection

To find out if a given digraph is a DAG, we can try to find a directec cycle in the digraph.
Use DFS and a stack to track the cycle.

// TODO

Some very typical applications of directed cycle detection and topological sort:
(A directed cycle means the problem is infeasible)

job schedule
course scuedule
inheritance
spreadsheet
- vertex: cell
- edge: formula
symbolic links

Strong components

Vertices v and w are strongly connected if there is both a directed path from v to w and a directed path from w to v.
Strong connectivity is an equvicalence relation.

Kosaraju-Sharir Algorithm

Kosaraju-Sharir is easy to implement but difficutl to understand. It runs DFS twice:

Given a digraph G, run DFS to compute the topological order of its reverse $G^R$
Run DFS on G in the order given by first DFS

TODO: ADD Proof

https://algs4.cs.princeton.edu/code/edu/princeton/cs/algs4/KosarajuSharirSCC.java.html

public class KosarajuSharirSCC {
    private boolean[] marked;     // marked[v] = has vertex v been visited?
    private int[] id;             // id[v] = id of strong component containing v
    private int count;            // number of strongly-connected components

    /**
     * Computes the strong components of the digraph {@code G}.
     * @param G the digraph
     */
    public KosarajuSharirSCC(Digraph G) {

        // compute reverse postorder of reverse graph
        DepthFirstOrder dfs = new DepthFirstOrder(G.reverse());

        // run DFS on G, using reverse postorder to guide calculation
        marked = new boolean[G.V()];
        id = new int[G.V()];
        for (int v : dfs.reversePost()) {
            if (!marked[v]) {
                dfs(G, v);
                count++;
            }
        }
    }

    // DFS on graph G
    private void dfs(Digraph G, int v) { 
        marked[v] = true;
        id[v] = count;
        for (int w : G.adj(v)) {
            if (!marked[w]) dfs(G, w);
        }
    }

    // ...
}

Minimum Spanning Trees

An edge-weighted-graph is a graph where we associate weight or costs with each edge.
A spanning tree of an undirected edge-weighted graph G is a subgraph T that is both a tree (conneted and acyclic) and spanning (includes all of the vertices).
Given an (connected) undirected edge-weighted graph G with V vertices and E edges, the MST of it must have V - 1 edges.
If the graph is not connceted, we compute minimum spanning forest (MST of each component).

A cut in a graph is a partition of its vertices into two (nonempty) sets
A crossing edge connects a vertex in one set with a vertex in the other.
Cut property: Given any cut, the crossing edge of min weight is in the MST.

Edge-weight Graph Data Type

Edge:
https://algs4.cs.princeton.edu/code/edu/princeton/cs/algs4/Edge.java.html

EdgeWeigthedGraph:
https://algs4.cs.princeton.edu/code/edu/princeton/cs/algs4/EdgeWeightedGraph.java.html

Greedy MST Algorithm:

Start with all edges colored gray.
Find cut with no blacked crossing edges; color its min-weight edge black.
Repeat until V-1 edges are colored black.

Implementations 1: Kruskal’s algorithm

For edges in ascending order of weight:

Add next edge to Tree unless doing so would create a cycle.

To efficiently solve this problem, use union-find :

use a priority queue to maintain all the edges in V
union-find data structure:
- maintain a set for each connected component in T.
- if v and w are in saome set, then adding v->w would create a cycle
- to add v>w to T, merge sets containing v and w.

TODO: Add code

Implementations 2: Prim’s algorithm

Start with vertex 0 and greedily grow tree T.
Add To T the min weight edge with exactly oue endpoint in T.
Reapeat unitl V - 1 edges.

The key to solve this problem is how do we find the crossing edge of minimal weight efficiently.

A lazy solution (in time proportional to $ElogE$, fair enough):

Maintain a PQ of edges with (at least) one endpoint in T
- Key = edge, priority = weight
Delete-min to determine next edge e = v->w to add to T
Disregard if both endpoints v and w are marked (both in T)
Otherwise, let w be the unmarked vertex (not in T)
- add to PQ and edge incident to w (assuming other endpoint not in T)
- add e to T and mark w

TODO: add code

A eager solution (in time proprotional to $ElogV$, better):

Maintain a PQ of vertices connected by an edge to T, where priority of v = weight of shortedt edge connecting v to T
Delete min vertex v and add its associated edge e = v->w to T
Update PQ by considering all edges e = v->x incident to v
- ignore if x is already in T
- add x to PQ if not alread on it
- decrease priority of x if v->x becomes shortest edge connecting x to T

This solution uses an indexed priority queue data structure.

TODO: add code

Shortest Paths

Some variants:

Which vertices?
- Single source
- Source-sink
- All pairs
Edge weights
- Nonegative weights
- Euclidean weights
- Arbitrary weights
Cycles?
- No directed cycles
- No negative cycles

Edge-weighted digraph data strcuture

Weighted directed edge:
https://algs4.cs.princeton.edu/code/edu/princeton/cs/algs4/DirectedEdge.java.html

Edge-weighted digraph:
https://algs4.cs.princeton.edu/code/edu/princeton/cs/algs4/EdgeWeightedDigraph.java.html

Use adjacency-lists implementation same as EdgeWeightedGraph

Generic Single-source Shortest paths

Our goal is to find the shortest path from s to every other vertex. As a result, what we find will be the shortest-paths tree (SPT) for source s.

Relax edge e = v->w

distTo[v] is length of shortest known path from s to v
distTo[w] is length of shortest known path from s to w
esgeTo[w] is last edge on shortest known pathh from s to w
if e = v->w gives shorter path to w through v, update both distTo[w] and edgeTo[w]

private void relax(DirectedEdge e) {
    int v = e.from(), w = e.to();
    if (distTo[w] > distTo[v] + e.weight()) {
        distTo[w] = distTo[v] + e.weight();
        edgeTo[w] = e;
    }
}

Optimality conditions

Given an edge-weighted digraph G, distTo[] are the shortest path distances from s iff:

distTo[s] = 0
For each vertex v, distTo[v] is the length of some path from s to v.
For each edge e = v->w, distTo[w] <= distTo[v] + e.weight()

Generic algorithm

Generic algorithm (to compute SPT from s) {
    Initialize distTo[s] = 0 and distTo[v] = $\infty$

    Repeat until optimality conditions are satisfied:
        - Relax any edge
}

Efficient implementations:

Nonnegative weights: Dijkstra’s algorithm
No directed cycles (DAGs): Topological sort algorithm
No negative cycles: Bellman-Ford

Implement 1: Dijkstra’s algorithm

When there is no nonnegative weight exists, we can use Dijkstra’s algorithm.

Consider vertices in increasing order of distance from s (non-tree vertex with the lowest distTo[] value)
add vertex to tree and relax all edges pointing from that vertex

public class DijkstraSP{
        // ...
    public DijkstraSP(EdgeWeightedDigraph G, int s) {
        edgeTo = new DirectedEdge[G.V()];
        distTo = new double[G.V()];
        pq = new IndexMinPQ<Double>(G.V());

        for (int v = 0; v < G.V(); v++) {
            distTo[v] = Double.POSITIVE_INFINITY;
        }
        distTo[s] = 0;

        pq.insert(s, 0.0);
        while(!pq.isEmpty()) {
            int v= pq.delMin();
            for (DirectedEdge e : G.adj(v)) {
                relax(e);
            }
        }
    }

    private void relax(DirectedEdge e) {
        int v = e.from(), w = e.to();
        if (distTo[w] > distTo[v] + e.weight()) {
            distTo[w] = distTo[v] + e.weight();
            edgeTo[w] = e;
            if (pq.contains(w)) pq.decreaseKey(w, distTo[w]);
            else pq.insert(w, distTo[w]);
        }
    }
}

Compare to Prim’s algorithm:

Both are computing a graph’s spanning tree
Prim’s algorithm choose closest vertex to tree as next vertex, while Dijkstra’s algorithm choose closest vertex to the source

Implement 2: Topological sort algorithm

When the graph is a DAG, we can consider vertices in topological order and do relaxing.

    // ...
    Topological topological = new Topological(G);
    for (int v : topological.order()) {
        for (DirectedEdge e : G.adj(v)) {
            relax(e);
        }
    }

Seam carving: Resize an image without distortion.

Longest paths:

Formuate as a shortest paths problem in edge-weighted DAGs
- Negate all weights
- Find shortest paths
- Negate weights in result
Allpication: Parallel job scheduling (Critical path method, CPM).

Implement 3: Bellman-Ford algorithm

A SPT exists iff no negative cycles (a directed cycle whose sum of edge weights is negative).

When we want to find shortest paths with nagative weights, Dijkstra’s algorithms doesn’t work.
We can use Bellman-Ford algorithm as long as there is no negative cycle in the graph.
(Bellman-Ford algorithm is a dynamic programming algorithm)

Initialize distTo[s] = 0 and distTo[v] = $\infty$
Maintain a queue and repeat until the queue is empty or find a cycle:
- Pop vertex v from q
- Relax each edge pointing from v to any vertex w:
  - if distTo[w] can be de decreased, update distTo[w] and add w to the queue

// ...
    public BellmanFordSP(EdgeWeightedDigraph G, int s) {
        distTo  = new double[G.V()];
        edgeTo  = new DirectedEdge[G.V()];
        onQueue = new boolean[G.V()];
        for (int v = 0; v < G.V(); v++)
            distTo[v] = Double.POSITIVE_INFINITY;
        distTo[s] = 0.0;

        // Bellman-Ford algorithm
        queue = new Queue<Integer>();
        queue.enqueue(s);
        onQueue[s] = true;
        while (!queue.isEmpty() && !hasNegativeCycle()) {
            int v = queue.dequeue();
            onQueue[v] = false;
            relax(G, v);
        }
    }

    private void relax(EdgeWeightedDigraph G, int v) {
        for (DirectedEdge e : G.adj(v)) {
            int w = e.to();
            if (distTo[w] > distTo[v] + e.weight()) {
                distTo[w] = distTo[v] + e.weight();
                edgeTo[w] = e;
                if (!onQueue[w]) {
                    queue.enqueue(w);
                    onQueue[w] = true;
                }
            }
            if (++cost % G.V() == 0) {
                findNegativeCycle();
                if (hasNegativeCycle()) return;  // found a negative cycle
            }
        }
    }

Bellman-Ford algorithm can also be used for finding a negative cycle.

Negative cycle application: arbitrage detection.