CS2040S Notes

AY2021/2022 Semester 2, written by See Toh Jin Wei
#nus #cs2040s

Finals Cheat Sheet

Available here in PDF.

Algorithm Analysis

Loop Invariants

Should be the same throughout the algorithm.

Used to prove correctness of algorithm and to understand how an algorithm works.

There are stronger and weaker invariants, stronger invariants are more detailed.

Some examples:

"At the end of iteration j: the first j items in the array are in sorted order." (insertion sort) -- also true for selection sort (but there is a stronger one for it)

After iteration j: the first j items in the array are in sorted order and are the j smallest items in the array." (selection sort)

Big O Definition

T(n)=O(f(n))T(n) = O(f(n)) if T grows no faster than f.

There exists constants c>0c > 0 and n0>0n_0 > 0 such that for all n>n0n > n_0: T(n)<=cf(n)T(n)<=cf(n).

Big Omega Definition

T(n)=Ω(f(n))T(n) = \Omega(f(n)) if T grows no slower than f.

There exists constants c>0c > 0 and n0>0n_0 > 0 such that for all n>n0n > n_0: T(n)>=cf(n)T(n)>=cf(n).

Big Theta Definition

T(n)=Θ(f(n))T(n) = \Theta(f(n)) if T grows at the same rate as f.

T(n)=Θ(f(n))T(n) = \Theta(f(n)) if and only if T(n)=O(f(n))T(n) = O(f(n)) and T(n)=Ω(f(n))T(n) = \Omega(f(n))

loglogn\log\log ndouble log
logn\log nlogarithmic
log2n\log^2 npoly-logarithmic
sqrt(n)sqrt(n)square root
nlognn\log nlog-linear
n3lognn^3\log n
22n2^{2n}different (bigger) than 2n2^n

Note: log(n!)\log(n!) is O(nlogn)O(n\log n)

Sterling's Approximation: n!2πn(ne)nn! \approx \sqrt{2\pi n}(\frac{n}{e})^n

Amortised Analysis

"Average" cost but NOT average.

Definition: Operation has amortised cost of T(n) if for every integer k, the cost of k operations is <=kT(n)

Must be true for ALL operations.

Sorting Algorithms

Properties of Different Sorting Algorithms

Different sorting algorithms have their place! Sometimes you want an algorithm with specific properties.

Absolute speed is usually not a good indicator (by itself)


Asymptotic run time Best/worst/average run time

Space Usage

In-place algorithm = O(1)O(1) space usage


Whether the algorithm keeps the same order for identical keys.

Performance on Special Inputs

Inputs like sorted, unsorted, reverse sorted arrays.

An array where Bubble Sort is slow but Selection Sort is fast: a sorted array that moves the smallest element to the end of the array. e.g. [2, 3, 4, 5, 6, 7, 8, 9, 1]


Randomly generate a permutation and check if sorted. O(nn!)O(n*n!)

Not in-place. Not stable.

Bubble Sort

repeat n times:
  for j <- 1 to n-1:
    if A[j] > A[j+1] then swap(A[j], A[j+1])
  break if no swap is done (means sorted)

Compare and swap each element.

Average: O(n2)O(n^2) Best: O(n)O(n) -- sorted (no swaps done) Worst: O(n2)O(n^2) -- reverse sorted n iterations of the array

Bubble sort iterates through the entire array (maximum of) n times and terminates early if no swap is done.

Loop Invariant: At the end of iteration j, the biggest j items are correctly sorted in the final j positions of the array.

Correctness: after n iterations, the array is sorted

In-place. Stable algorithm.

Selection Sort

Checks for the minimum element n times and swaps. Always: Θ(n2)\Theta(n^2)

Loop Invariant: At the end of the iteration j, the smallest j items are correctly sorted in the first j positions of the array.

In-place. Not stable algorithm, swaps change element. (but possible to be stable if careful) At most nn swaps.

Insertion Sort

Average: O(n2)O(n^2) Best: O(n)O(n) -- sorted (no swaps done) Worst: O(n2)O(n^2) -- reverse sorted

O(nk)O(nk) where k is the maximum distance of element from its sorted index

Loop Invariant: At the end of the iteration j, the first j items in the array are in sorted order.

Fast if the array is (almost) sorted.

In-place. Can be stable, iterate while A[i] > key and not >=

Merge Sort

Divide and conquer algorithm that splits the array into half recursively.

Average: O(nlogn)O(n\log n) Worst: T(n)=2T(n/2)+cnT(n)=2T(n/2)+cn -> O(nlogn)O(n\log n)

In practice, Insertion Sort is faster than Merge Sort when arrays are < 1000 in size. Could be due to the need to copy array. So, it is possible to combine merge and insertion sort together to have a faster sorting algorithm than each of them alone.

Not in-place (possible but very tricky or slower, using array shifting) But can minimise memory used by only copying array at merge portion instead of at recursive call. Stable algorithm.

Quick Sort

Divide and conquer algorithm.

Using pivots to partitioning the array into 2: elements that are <= or > compared to the pivot. There is a double-pivot variant of quicksort that is faster than the single-pivot variant.

Average case: O(nlogn)O(n\log n) Worst case: O(n2)O(n^2) -- sorted or reverse sorted array (only for head or tail pivots)

Loop invariant: For every i < low: B[i] < pivot For every i >= high: B[i] > pivot

This version handles duplicates well. 3-way partitioning of the array (easy in theory but a lot of corner cases!)

< pivot | = pivot | in-progress | > pivot

Choosing a pivot

Pivot can be head, tail or random or an arbitrary index. In the worst case, most simple (deterministic) pivots are bad. Partition around the median element is always O(nlogn)O(n\log n) (but need work to compute the median, especially since it is obviously unsorted)

At least 1:9 ratio split still O(nlogn)O(n\log n)! Pick the pivot randomly will produce at least this split most of the time (use Probability Theorem to analyse)! 8/10 probability of a good pivot (only picking the lowest 10% or highest 10% will result in a bad pivot)

Almost every single time its O(nlogn)O(n\log n) if picking random pivot.

Switching to insertion sort for small arrays is better than recursing all the way (these smaller arrays are already mostly sorted)! OR stop the recursion early (when each chunk is around < 20), then insertion sort on the entire array.

In-place. Not stable.

All in-place quick-sorts => not stable.

Non-Comparison Based Sorts:

Counting Sort, Radix Sort, Bucket Sort

Counting Sort

Good for small ranges of values with high amount of repetition.

Maintain another array as a counter. Count in O(n)O(n). Then, re-construct the array using the counter in O(n)O(n).

Can be stable if re-construct using a mix of original array and counter.

Stable but not in-place.

Radix Sort

Good for sorting integers or strings. Sort by first index of each container (), then second, etc. Or least significant to most significant digit.

O(nm)O(nm) where n is the size of the array and m is the size of the largest container.

Only faster than quicksort if small container sizes. n>2mn>2^m to be faster.

Makes use of counting sort as a subroutine.

So, stable (if sort in correct order of least to most significant bits) but not in-place.

Other Notes

(For CS2040S) Total space ever allocated. Ignore memory allocators, garbage collectors, stack frames etc.

Order Statistics/Quick Selection

Finding the kthk^{th} smallest element in an unsorted array.

Option 1

  1. Sort the array
  2. Return the element number k O(nlogn)O(n\log n)

Option 2

Only do the minimum amount of sorting necessary.

  1. Pick a random pivot
  2. Partition around the pivot
  3. Then we know the pivot is the ithi^{th} element.
  4. Choose left or right half depending on if our target element is less than or more than the current element.
  5. Recurse on the chosen side (not both!).

Worst "best case" of a 9:1 split. T(n)=T(9n/10)+O(n)T(n)=T(9n/10)+O(n) O(n)O(n)

// https://leetcode.com/problems/kth-largest-element-in-an-array/
class Solution {
  Random rng = new Random();

  private int randomInt(int lo, int hi) {
    // hi is inclusive
    return lo + rng.nextInt(hi - lo + 1);

  private void swap(int[] nums, int a, int b) {
    int temp = nums[a];
    nums[a] = nums[b];
    nums[b] = temp;

  // this is a simple 2-way partitioning

  // there are better (and more complex) partitioning algorithms, notably:
  // 3-way partitioning same complexities but has better average real time (when duplicate values are involved)
  // Median of medians worst case O(n log n) https://en.wikipedia.org/wiki/Median_of_medians
  private int partition(int[] nums, int lo, int hi) {
    // returns the rank of the pivot
    int pivot = randomInt(lo, hi);
    int value = nums[pivot];
    swap(nums, lo, pivot);
    int pointer = lo + 1;
    for (int i = lo + 1; i <= hi; i++) {
      int x = nums[i];
      if (x < value) {
        swap(nums, i, pointer++);
    swap(nums, lo, pointer - 1);
    return pointer - 1;

  private int findKthSmallest(int[] nums, int lo, int hi, int k) {
    int index = partition(nums, lo, hi);
    if (index > k) {
      return findKthSmallest(nums, lo, index - 1, k);
    } else if (index < k) {
      return findKthSmallest(nums, index + 1, hi, k);
    } else {
      return nums[k];

  public int findKthLargest(int[] nums, int k) {
    // quick select => O(n)
    int N = nums.length;
    return findKthSmallest(nums, 0, N - 1, N - k);

Finding the largest kk elements in an unsorted array.

  1. Quick select to find the kthk^{th} largest element in the array.
  2. 2 Options
  3. Scan through the array and collect those >= kthk^{th} largest item as found in (1).
  4. Select everything from kthk^{th} and to the right of it in the quick selected array.

Finding the smallest kk elements is similar.


Trees can have very different shapes! Performance is dependent on shape of tree.

Number of ways to order insertions: n!n! Number of shapes of a binary tree: ~4n4^n (by Catalan numbers) Since n!>4nn!>4^n, by PHP, order of insertions will not result in a unique shape.

Height of Tree

The height of a rooted tree is the maximum level of any vertex of the tree. The leaves have height 0. Empty node has height -1.

  1. Pre-order
    • Apply function on root (or current vertex)
    • Traverse left subtree
    • Traverse right subtree
  2. In-order
    • Traverse left subtree
    • Apply function on root (or current vertex)
    • Traverse right subtree
  3. Post-order
    • Traverse left subtree
    • Traverse right subtree
    • Apply function on root (or current vertex)

Easy with recursion. Or use a stack to simulate recursion. O(n)O(n) to traverse the entire tree. Each node is visited at most once.

Breadth-First Search (Level Order Traversal)

BFS starts at the root and visits its adjacent vertices, and then moves to the next level. Use a queue to traverse.

Binary Search Trees (BST)

Property: All in left subtree < key < all in right subtree Possible to reconstruct from just pre-order traversal.


Recurse left or right depending on less or more respectively to find a key or insertion position.


search(x) then place there

Finding Successor

If searching for a key not in the tree. search(x) will find either predecessor or successor.

  1. result = search(x)
  2. If result > x, return result
  3. If result < x, return search(result)
  4. If result == x
  5. if there is right subtree, return minimum in right subtree.
  6. if no right subtree, go up the tree until you find a value > x, then return value

But if x is the maximum value, then this algorithm will simply return the root (which is wrong).

Removing Node

If leaf (no child), just remove.

If only has 1 child, simply replace with child.

If there are 2 children swap it with the successor and then remove the node.

Successor of a node (must be the minimum in the right subtree of node) being deleted must have only 1 child.

This will make sure that BST property is upheld.


AVL Trees (A Self-Balancing BST)

BST is balanced if h=O(logn)h=O(\log n) (not exactly). A node v is height-balanced if v.left.heightv.right.height<=1|v.left.height - v.right.height |<=1. A BST is height-balanced if every node in the tree is height-balanced.

Every height-balanced tree is balanced. NOT all balanced trees are height-balanced.

2logn2\log n is the maximum height of a balanced tree.

Storing the height in the node to avoid having to recompute the height multiple times. Note: must update height on insert/delete this.height = max(left.height, right.height) + 1

A height-balanced tree with n nodes has height h<2lognh<2\log n. More accurately, hlogn/logϕ1.44lognh\approx \log n/\log \phi \approx 1.44\log n

Invariant of AVL Trees: Trees are height-balanced (don't need to be perfect).

Minimum number of nodes n>2h/2n>2^{h/2}. Proof: nh>=1+nh1+nh2n_h>=1+n_{h-1}+n_{h-2} nh>=2nh2n_h>=2n_{h-2} nh>=2h/2n_h>=2^{h/2}

Tree Rotations

Right rotation requires a left child. Left rotation requires a right child.

Rotations always have inverses!

Right rotation:

(not inclusive of checking if valid)
w = v.left
w.parent = v.parent
v.parent = w
v.left = w.right
w.right = v

After an insertion, use tree rotations to re-balance the tree. Maximum of 1 node being out of balanced. So, balance the first unbalanced node.

Left-heavy if left sub-tree has larger height than right sub-tree. Right-heavy if right sub-tree has larger height than left sub-tree.

Details on Tree Rotation

Rotations are O(1)O(1) because only changing pointers in constant time. Rotations maintain ordering of keys, aka maintains BST property.

What to Rotate

If v is out of balance and left-heavy:

  1. v.left is balanced, right-rotate(v)
  2. v.left is left-heavy, right-rotate(v)
  3. v.left is right-heavy, left-rotate(v.left) and right-rotate(v)

If v is out of balance and right-heavy:

  1. v.right is balanced, left-rotate(v)
  2. v.right is right-heavy, left-rotate(v)
  3. v.right is left-heavy, right-rotate(v.right) and left-rotate(v)

Inserting into AVL Trees

Check if height-balanced when unrolling recursion (and updating heights).

Rebalance trees using tree rotation.

Steps for inserting into AVL Tree.

  1. Insert key in BST.
  2. Walk up the tree
  • At every step, check for balance.
  • If out-of-balance, use rotations to re-balance and return.
    • Only need to fix the lowest out-of-balance node.

Minimum number of rotations after an insertion is 0. Maximum number of rotations after an insertion is 2. This is because inserting increases the height and rotations reduce the height, so they effectively cancel each other out.

Deleting in AVL Trees

  1. If v has two children, swap it with its ancestor.
  2. Delete node v from binary tree (and reconnect children)
  3. Rotate to balance (similarly to insertion BUT have to keep balancing all the way up to the root)

Maximum number of rotations is O(logn)O(\log n). This is because deleting reduces the height but rotations also reduce the height. Thus, it is not sufficient to only fix lowest out-of-balance node.


Regular BSTs are not great for storing words because it takes min(A.length,B.length)min(A.length, B.length) to compare each string. So, each tree operation takes O(hL)O(hL). Trees that are "optimised/better" for storing words.

Tries can also be used to store binary bits or IP addresses.

Search: O(L)O(L) Worst case space usage (0 overlap): O(nL)O(nL) * overhead (need some array or other structure to store pointers to children, asymptotically same but practically more) A lot of space wasted (for *potential* children), especially for large character sets (unicode characters set). Binary Trie don't have this problem :D


  1. Searching strings
  2. Sorting/enumerating strings

Partial String Operations

  1. Prefix queries
  2. Longest prefix
  3. Wildcards (pi??le)


Compressed tries (only unchain the nodes if needed). Using hash table instead of array.


// can also use a null-terminating character as a node to indicate the end (instead of a boolean)
public class Trie {
    static class TrieNode {
        char value;
        boolean isEnd;
        // can replace this array with hash table
        TrieNode[] nextNodes;

        TrieNode(char value, boolean isEnd) {
            this.value = value;
            this.isEnd = isEnd;
            this.nextNodes = new TrieNode[26];

    private final TrieNode root;

    Trie() {
        this.root = new TrieNode('0', false);

    void insert(String word) {
        char[] chars = word.toCharArray();
        int N = chars.length;
        TrieNode node = this.root;
        for (int i = 0; i < N; i++) {
            char c = chars[i];
            if (node.nextNodes[c - 'a'] == null) {
                node.nextNodes[c - 'a'] = new TrieNode(c, false);
            node = node.nextNodes[c - 'a'];
            if (i == N - 1) {
                node.isEnd = true;

    boolean search(String word) {
        TrieNode node = this.root;
        for (char c : word.toCharArray()) {
            if (node.nextNodes[c - 'a'] == null) {
                return false;
            node = node.nextNodes[c - 'a'];
        return node.isEnd;

    boolean startsWith(String word) {
        TrieNode node = this.root;
        for (char c : word.toCharArray()) {
            if (node.nextNodes[c - 'a'] == null) {
                return false;
            node = node.nextNodes[c - 'a'];
        return true;

M-Way Search Tree

Each node has up to m1m - 1 keys and up to mm children nodes.

Binary Search Tree is simply a M-Way Search tree with degree of 2.

Child Pointer 0 | k1k_1 | Child Pointer 1 | k2k_2 | ... | km1k_{m-1} | Child Pointer mm with increasing values

Problem with m-way search tree, no algorithm for proper insertion. So, B-Trees are m-way search trees with extra rules.

(a, b)-Tree

  1. Child policy Root: 1 to b-1 keys, 2 to b children Other nodes: a-1 to b-1 keys, a to b children (except leaf nodes)

  2. All non-leaf nodes must have 1 more child than its number of keys.

  3. All leaf nodes must be at the same depth (from root).

B-Tree (B Tree)


B does NOT stand for binary!

B-Tree are M-Way Search Trees with extra rules! B-Tree is (b, 2b)-tree.

Rules of B-Tree:

  1. Every node must have at least ceil(m/2) children.
  2. Root has minimum of 2 children (exception to rule 1). This is so that root-splitting is valid!
  3. All leaf nodes must be at same level.
  4. Keys are sorted.
  5. Creation process is bottom-up.

Creating tree:

  1. Start from the bottom
  2. Once node is full, split the node (left or right biased arbitrary)

Time Complexity

Theoretically O(logn)O(\log n) Practically O(1)O(1) because of memory access times

B+ Trees

Similar to B-Trees with few differences.

  1. Record pointers only in leaf nodes.
  2. Every key is duplicated in the leaf node too.
  3. Leaf nodes have pointer to the leaf nodes on its right.

Augmenting Data Structures

Most of the time, you need to add/store some extra data in an existing data structure to solve a new problem.

AVL Trees are an augmented BST.

  1. Maintain a set of items
  2. Modify the set of items
  3. Answer queries
  4. Develop new operations (as needed)

Need to modify data structure to maintain additional information when the structure changes (subject to insert/delete or whatever else the data structure was supposed to handle).

Use properties that only depend on the subtree (so that it is easy to maintain)

Also do need to take note that storing properties also take up space.

Dynamic Order Statistics

Augmenting AVL Tree.

Store weight of the subtree in every node. Still takes O(1)O(1) (same as rotation itself) to update weights during rotation.



def select(v, k):
  rank = v.left.weight + 1
  if (k == rank):
    return v
  elif k < rank:
    return select(v.left, k)
  else if k > rank:
    return select(v.right, k - rank)



def rank(node):
  rank = node.left.weight + 1
  while node != null:
    if node is left child:
      do nothing
    elif node is right child:
      rank += node.parent.left.weight (left sibling) + 1
    node = node.parent
  return rank

Interval Trees

aka Interval Search Trees

Interval trees are used to find if a value is captured in a set of intervals.

Intervals are sorted by the left endpoints.

Augment BST with maximum endpoint in subtree self.max = max(self.max, left.max, right.max)

Insert similarly to regular BST, but need to update max (only in root to leaf path) Need to maintain max during rotations (if using AVL tree)!

Key is important even though not directly used in algorithm (the relation between the nodes are important)

def interval_search(x):
  c = root
  while (c != null and x is not in c.interval):
    if c.left == null:
      c = c.right
    else if x > c.left.max:
      c = c.right
      // always search left if left.max > x
      c = c.left
  return c.interval

O(logn)O(\log n) if balanced

If search goes left, if there is no interval in left subtree, then there is no interval in right subtree either. If search goes right, there is no overlap in left subtree.

Interval Search (all intervals that work)

  1. Repeat until no more intervals found
  2. Search for interval
  3. Add to list
  4. Delete interval
  5. Repeat for all intervals on list
  6. Add interval back to tree

O(klogn)O(k\log n) where k is the number of intervals that work but a faster solution (albeit more complicated) works in O(k+logn)O(k+\log n)

Orthogonal Range Searching

One Dimensional Range Queries

  1. Use a BST
  2. Store all the points in the leaves of the tree. (Internal nodes only store copies)
  3. Each internal node stores the MAX of any leaf in the left subtree

Querying a range

Split Node is the highest node where lo < node.val <= hi.

## find the first node that is bounded within the query
def findSplit(lo, hi):
  v = root
  while true:
    if hi <= v.key:
      v = v.left
    else if low > v.key:
      v = v.right

def query(lo, hi):
  v = findSplit(lo, hi)
  leftTraversal(v, lo, hi)
  rightTraversal(v, lo, hi)

def collect_all_leaf(v):
  collects if its in the range

def leftTraversal(v, lo, hi):
  if lo <= v.key:
    leftTraversal(v.left, lo, hi)
    leftTraversal(v.right, lo, hi)

def rightTraversal(v, lo, hi):
  if v.key <= hi:
    rightTraversal(v.right, lo, hi)
    rightTraversal(v.left, lo, hi)

O(k+logn)O(k + \log n) where k is the number of points found O(nlogn)O(n\log n) time to build tree O(n)O(n) space complexity

Can augment the tree with the count (if only want the count and not the exact values)

For rotations, not much changes need to be done.

Two Dimensional Range Queries

Can build a x-tree using only x-coordinates For every node in the x-tree, build a y-tree out of nodes in subtree using only y-coordinates.

Query: O(log2n+k)O(\log^2 n + k)

  • O(logn)O(\log n) to find split node
  • O(logn)O(\log n) recursing steps
  • O(logn)O(\log n) y-tree-searches of cost O(logn)O(\log n)
  • O(k)O(k) enumerating output

Space: O(nlogn)O(n\log n)

  • Each point appears in at most one y-tree per level
  • There are O(logn)O(\log n) levels
  • Each node appears in at most O(logn)O(\log n) y-trees The rest of the x-tree takes O(n)O(n) space.

Building: O(nlogn)O(n\log n) but not easy

No longer a dynamic data structure because rotations are no longer efficient O(n)O(n) thus cannot maintain balance.

n-Dimensional Range Queries

Possible to extend to higher dimensions. BUT have to store a d-1 dimensional range tree in each node of a 1D range tree.


Good for storing coordinates.

Much better idea than n-dimensional range queries because this is a dynamic data structure. Alternating x and y (and z and ...) levels



Example here for 2D.

Alternate x and y. WLOG, treat as x findMedian using quick select (and remove the coord after selecting it)

  1. findMedian(points) based on x
  2. Partition around the median
  3. Then, recurse on left and right partitions (on y now) and take them as left and right children

T(n)=2T(n/2)+O(n)T(n) = 2T(n/2) + O(n) T(n)=(nlogn)T(n) = (n\log n)

Minimum/Maximum Coordinates

For minimum x-coordinate. Finding maximum (and the equivalents for y-coordinate) is similar.

  1. If leaf node, return itself.
  2. If at x-level, search left child.
  3. If at y-level, search BOTH left and right children and return the minimum. (because either could bear the minimum x-coordinate)

For a perfectly balanced kd-tree Even: T(n)=T(n/2)+O(1)T(n)=T(n/2)+O(1) Odd: T(n)=2T(n/2)+O(1)T(n)=2T(n/2)+O(1) Overall: T(n)=2T(n/4)+O(1)T(n)=2T(n/4)+O(1) T(n)=nT(n)=\sqrt n


Symbol Table Operations

  • void insert(Key k, Value v)
  • Value search(Key k)
  • void delete
  • boolean contains(Key k)
  • int size()

Dictionary are ordered symbol tables.

  • Query minimum
  • Query successor

HashTables are an implementation of a symbol table (need not be ordered). By implementing symbol table with AVL Trees, can O(logn)O(\log n) operations AND maintain order.

Implementing sorting with symbol tables take O(n2)O(n^2) because cannot find successor quickly.

Symbol tables are not comparison-based because they can work in O(1)O(1).

Can use hashing to prove that you have the password without sending the password! (cryptography)

Problems with Direct Access Tables

Treating everything as a number. Longest english word is 28 letters and 5 bits per letter => 140 bits max So, 2^140 size array can store EVERY english words (approx number of atoms in observable universe, so obviously too much space!)

Hash Functions

m = number of buckets Usually, m = 2n is chosen to keep probability of collision low

  • if m = n, 1/3 buckets have 2-3 items in them

Huge universe U of possible keys. Smaller number n of actual keys.

Store key k in bucket h(k) Time: time to compute h + time to access bucket In CS2040S, can assume hash function takes O(1) unless otherwise mentioned

Hash Collisions

Two distinct keys k1k_1 and k2k_2 collide if h(k1)=h(k2)h(k_1) = h(k_2) Always have collisions eventually by PHP because Universe >>> number of buckets.

A few methods to avoid problems of collisions (hash collision resolution)

  1. Choose new hash function
  2. Very bad because eventually collide again, copying over is very expensive.
  3. Chaining
  4. Open Addressing

Probability of Buckets

Assuming every key is equally likely to map to every bucket. (for simpler calculation)

Let X(i, j) = 1 if item i is put in bucket j = 0 otherwise P(X(i, j) == 1) = 1/m E(X(i, j)) = 1/m

By linearity of expectation, for n items, expected number of items per bucket: n/mn/m

Hash Collision Resolution


If m == n, can still insert but will be less efficient.

n > m is better for enumerating all the keys in a hash table (not normally prioritised) As long as m > n, equally efficient. (unlike open addressing)

Put all the items that collide in the same bucket using a linkedlist of items in that bucket.

Total Space: O(m+n) Table size: O(m) Linkedlist size: O(n)

Linkedlist wander all over the memory (bad for cache access) Needs to keep allocating list nodes into memory


  1. Calculate h(key)
  2. Lookup h(key) and chain if necessary

Expected cost: O(1+cost(h))O(1+cost(h)) because just insert into the front of linkedlist but if want to avoid duplicates, have to search for duplicates OR end up having many duplicates in linkedlist

Expected maximum cost of inserting n items: O(logn/loglogn)O(\log n/\log\log n) (proof beyond CS2040S)

Problem with Hash with Chaining

If all keys hash to the same bucket, then effectively become a linkedlist! Just assume that the hash function satisfies the simple uniform hashing assumption.


  1. Calculate h(key)
  2. Lookup h(key) and search through the chain

Expected: h + n/m = O(h) Worst case: O(h + n)

Table Size

Assume hashing with chaining Assume simple uniform hashing Expected search time: O(1 + n/m) Optimal size: m=Θ(n)m=\Theta(n)

if m < 2n: too many collisions if m > 10n: too much wasted space note that: 2 and 10 above are arbitrary

PROBLEM: don't know what is n in advance

IDEA: grow and shrink the table size as necessary (similar to dynamic arrays)

Growing Table

  1. Choose new table size m
  2. Choose new hash function h
  3. Hash function depends on table size!
  4. Java does this auto-magically
  5. For each item in the old hash table
  6. Compute new hash value
  7. Copy item to new bucket

Time Complexity Cost: O(m1+m2+n)O(m_1 + m_2 + n) This is because allocating memory takes a lot of time!

Design Question

Do you care that some insertions take a lot longer than others?

When you need it to work, then suddenly need to rebuild, gg.

Growing by Adding a Constant

If incrementing by size 1 Cost of resize: O(n) For inserting n items: O(n2)O(n^2)

Even for constants > 1: bad!

Growing by Doubling

Cost of resize: O(n) Cost of inserting n items + resizing: O(n)

Most insertions: O(1) Some insertions: O(n) (to rebuild) Average cost: O(1)


Growing by Squaring

Cost of resize: O(n2)O(n^2) Cost: O(n+n2+n)O(n + n^2 + n)

High cost of resize AND a lot of space wasted. Potentially only sqrt(n)sqrt(n) of the table used.

Shrinking Table Size

When too many items are deleted, table is too big.

Shrinking by Halving

if n == m, m = 2m if n < m/2, m = m/2

Cost of resize: O(n)

Imagine that at the border, delete 1 item, shrink, add 1 item, grow. VERY BAD!!!!!

if n == m, m = 2m if n < m/4, m = m/2 Shrink when 3/4 empty instead of 1/2 half! And leave empty space. To avoid the problem above!


  • Every time you grow a table of size m, at least m/2 new items are added
  • Every time you shrink a table of size m, at least m/4 items were deleted THUS, can "pay" for growing and shrinking the table

This will then be of amortised expected costs of O(1) for insertion and deletion.

Open Addressing

If m == n, can no longer insert!

Idea: Find another bucket.


  • Saves space
    • Empty slots vs linked lists
  • Rarely allocate new memory
    • No linkedlist nodes to allocate
  • Better cache performance
    • Table is stored in contiguous memory
    • Fewer accesses to the cache
    • Compare to linkedlist being in different places in the memory


  • More sensitive to choice of hash functions
    • Clustering is a common problem
  • Performance degrades badly as α\alpha approaches 1
    • Even around a quarter full is already slow

Hash Function

Hash Function has to return a SEQUENCE of values for some key. h(key, i):

  • hash function takes in the number of tries as well

Two properties of a good hash function

  • must enumerate all possible buckets else it might return table-full even when there is space left
  • (Stronger) Uniform Hashing Assumption
    • Each of the n! permutations of probe sequence are equally probable
    • BUT linear probing does NOT satisfy this uniform assumption due to clusters being formed + bigger clusters will grow faster
    • there will be clusters of O(log n) -- proof beyond CS2040S

Performance of Open Addressing

For n items, in a table of size m, assuming uniform hashing, the expected cost of an operation is <=11α<=\frac{1}{1-\alpha} where α=n/m<1\alpha=n/m <1

because you will never probe the same bucket twice (because probe sequence does not repeat)

1+nm(1+n1m1(1+n2m2(...)))1+\frac{n}{m}(1+\frac{n-1}{m-1}(1+\frac{n-2}{m-2}(...))) nimi<=nm<=α\frac{n-i}{m-i}<=\frac{n}{m}<=\alpha so <=1+α(1+α(...))<=1+\alpha(1+\alpha(...)) <=1+α+α2+α3+...<=1+\alpha +\alpha ^2+\alpha ^3+... <=11α<=\frac{1}{1-\alpha}

Linear Probing

Very fast because of caching despite the clusters (because the clusters are mostly in the same cache, so negligible difference)

for i in 0..m:
  if buckets[hash(x) + i % m] is empty:
    insert x into this bucket

Keep trying adjacent buckets until an empty bucket.

h(k, 0), h(k, 1), h(k, 2), ... until it finds an empty bucket, then insert there in general: h(k, i) = h(k, 1) + i mod m (i is number of tries, because circular)

If DELETED cell, overwrite the deleted cell


Keep probing by incrementing i until the correct key is found OR null (empty)

If DELETED cell, continue probing


Cannot just set to null, because will affect searching other elements (e.g. expected to be 3rd try but if null, will terminate before 3rd try)

Tombstone value: Set bucket to be DELETED

Quadratic Probing

Similar to linear probing but increment by squares instead

increment by 1, 4, 9, 16, ..., i2i^2

for i in 0..m:
  if buckets[(hash(x) + i * i) % m] is empty:
    insert x into this bucket

Quadratic probing can fail to insert even when table is not full When table size is of n21n^2-1 and constantly insert into slot 0.

Double Hashing

Start with two original hash functions f(k) and g(k) let h(k,i)=f(k)+ig(k)modmh(k,i)=f(k) + i*g(k) \mod m

Since f(k) is good, f(k, 1) is almost random Since g(k) is good, the probe sequence is almost random

If g(k) is relatively prime to m, then h(k, i) hits all buckets Can prove by contradiction.

Example: m=2r2^r, and g(k) output always odd Example: m = 330, g(k) = 7^k

Tabulation Hashing

Use a table lookup with XOR operations to generate a hash. This lookup table is a table of size [x, N] where x is the numerical base of the key and N is the length (in bits) of the key.

Assuming the key is a N-bit binary value, so x = 2, N = N.

hash = 0
for j in range(N):
  hash = hash XOR T[key[j]][j]

Java Hashing

Java Map

Standard is to use HashMap<Key,Value> interface java.util.Map<Key,Value>

  • void clear()

  • boolean containsKey(Object k)

  • boolean containsValue(Object v) // may not be efficient!

  • Value get(Object k)

  • Value put(Key k, Value v)

  • Value remove(Object k)

  • int size()

  • Set<Map.Entry<Key,Value>> entrySet()

  • Set<Key> keySet()

  • Collection<Value> values() Above 3 are not sorted and not necessarily efficient

Java Map Rules

  • No duplicate keys allowed (checked during insertion)
    • need to search on insertion
  • No mutable keys
    • no longer clear which key to use (original vs modified)
  • Must redefine equals() method to be consistent
    • Reflexive
    • Symmetric
    • Transitive
    • Consistent
    • Null is null

More Java HashMap Behaviour

  • Returns null when key is not in map
  • Returns null when value is null

It is implemented with chaining. It will hash the hashCode of your object again to a value that can fit in the internal table! Because of this, table_hash == table_hash does not imply that hash == hash.

TreeMap (dictionary in Java)

  • Supports most HashMap operations with MORE operations! (because dictionary is an ordered data structure)

Design decisions

  • Allow duplicate keys
    • No: need to search on insertion
    • Yes: faster insertion
  • What to do if user inserts duplicate key
    • Override existing key
    • Add new value (key will have 2 values)
    • Error
  • Insert empty/null
    • Deletes existing key, value pair
    • Inserts a null
    • Error

Wide Interfaces

Lots of functionality Easy to use No guarantee of efficiency (some operations are not optimised) Java mostly has wide interfaces

Narrow Interfaces

Limited functionality Enforces proper usage Restricts usage

Hash Code Method

Every Object in Java has a hashCode() method By default Java hashCode returns the memory location of the object. Every Object hashes to a different location. Should override this!


  1. Always returns the same value, if the object has not changed
  2. If two objects are equal, then they return the same hash code
  • If two objects have the same hash code, they might not be equal

For Integer:

public int hashCode() {
  return value;

For Long:

public int hashCode() {
  return (int) (value ^ (value >>> 32))

For String:

s[0] * 31^(n-1) +
s[1] * 31^(n-2) +
s[n-1] * 31^(0)

// 31 is just a prime number that happens to work well


A set of edges and a set of vertices.

These should not be conditional and should work regardless of direction, distance, etc...

Graph Definitions

Multi Graphs: Has more than 1 edge between same pair of nodes

Hyper graphs: Has edges that connect > 2 nodes

Multi graphs and hyper graphs not covered in CS2040S. (both not considered regular graphs for CS2040S).

(Simple) Path: Set of edges connecting 2 nodes, path intersects each node at most once (no repeated nodes).

Connected: Every pair of nodes is connected by a path.

Disconnected: Some pair of nodes not connected by a path.

Connected Components: Set of nodes that are reachable (exists some path) from each other.

Cycle: "Path" where first and last node are the same.

(Unrooted) Tree: Connected graph with no cycles. Tree with N vertices has N - 1 edges.

Forest: Graph with no cycles (>= 1 trees)

Degree of Node: Number of adjacent edges. * can draw example picture

Degree of Graph: Maximum degree of all the nodes.

Diameter: The maximum distance between two nodes, following the shortest path. Only interested in reachable nodes (not defined for unreachable).

Finding diameter:

  1. Take every pair of nodes
  2. Find shortest path distance between each pair
  3. Return longest distance found in step 2

Sparse: Average degree of nodes VS total number of nodes. E = O(V) is sparse

Special Graphs

Star: One central node, all edges connect center to edges. (also a tree) Degree = n - 1 Diameter = 2

Clique (Complete Graph): All pairs are connected by edges. Degree = n - 1 Diameter = 1

Line (or path): Line graph Degree = 2 Diameter = n - 1 Line is also a bipartite graph.

Cycle: Circle Degree = 2 Diameter = n / 2 or (n - 1) / 2 based on odd/even Even length cycle is a bipartite graph. Odd length cycle cannot be a bipartite graph.

Bipartite Graph: Nodes divided into two sets with no edges between nodes in the same set. Maximum Diameter = n - 1 (essentially a Line Graph) To prove bipartite graph: re-arrange/colour into a bipartite graph

A graph does NOT contain an odd length cycle <=> it is a bipartite graph. https://math.stackexchange.com/questions/311665/proof-a-graph-is-bipartite-if-and-only-if-it-contains-no-odd-cycles

Modelling Problems

4-Colouring Problem Nodes: tiles Edges: pairs of adjacent tiles A map in a plane can be coloured using 4 colours such that adjacent tiles sharing a common boundary do not share the same colour.

Sliding Puzzle Every state is a node. Moves = edges

For n = 9 Maximum degree: 4 (maximum possible of moves) Nodes = 9! Edges: 4*9! (4 from maximum degree)

Diameter = scrambled -> unscrambled

Rubik's Cube Every state is a node. Moves = edges

Number of nodes (for 2x2x2 cube) = 7! * 3^8 (7! and not 8! because symmetry rule, fix one of the pieces) ^ upper bound that includes illegal states

For 3x3x3 cube: 43 quintillion vertices (approximately)

Diameter of n x n x n cube: Θ(n2/logn)\Theta(n^2 /\log n)

Graph Representation

Chosen based on if graph is dense or sparse and what type of queries to be optimised for.

If a graph is dense, then use an adjacency matrix, else use an adjacency list. Dense: |E| = Θ(V2)\Theta(V^2)

Adjacency List

Nodes: stored in an array Edges: linked list per node

// or LinkedList<Integer>
class NeighbourList extends LinkedList<Node> {}

class Node {
  int key;
  NeighbourList neighbours;

class Graph {
  Node[] nodes;

// Don't do this:
List<List<Integer>> nodes;
// ^ harder to read, debug, extend

Memory usage for G = (V, E) array of size: |V| linkedlist of size: |E|

Total O(V + E)

For a cycle: O(V) For a clique: O(V^2)

Fast query: find a neighbour of a node Fast query: enumerate all neighbours Slower query: whether 2 nodes are neighbours

If 2 nodes are connected: O(min(|V|, |E|))

Adjacency Matrix

n * n matrix where matrix[i][i] = 1 iff there is an edge between the two nodes

n-length neighbours, matrix^n

infinite-length approximately equals Google page rank (simulating random walks)

class Graph {
  Node[][] matrix;
  boolean[][] matrix;
  // use boolean if don't care about path lengths

Memory usage for G = (V, E) array of size: V * V Total: O(V^2)

For a cycle: O(V^2) For a clique: O(V^2)

Fast query: whether 2 nodes are neighbours Slow query: find a neighbour of a node Slow query: enumerate all neighbours

If 2 nodes are connected: O(1)

Graph Searching

  1. Start at s -> end at f
  2. Visit all nodes in the graph

Only visit every node once!

BFS/DFS visits every node and edge in the graph, but NOT every path in the graph.

Visiting every possible path is exponential (an undirected graph), so we clearly don't want to do this unless there is no other better choice (which there IS).

DON'T modify BFS/DFS to backtrack/re-visit, probably will be end up being exponential or infinite loops.

BFS/DFS forms a tree/forest (depends on number of connected components).

For a connected graph, this is because n - 1 new nodes will be discovered from the source node. With each discovery, the parent pointer is an edge. The final structure will thus consist of n nodes and n - 1 edges and will form one connected component. Thus, it is a tree by definition of a tree.

Shortest Path

Shortest path NEVER contains a cycle (ONLY IF shortest path is unique -> but most algorithms avoids this to only construct trees) IF there exists a cycle, then it would NOT be a shortest path (prove by contradiction)

Use BFS.

Breadth-First Search (BFS)

Use a queue. Takes edge from vertex that was discovered least recently.

Explore level by level. Keep track of visited nodes with a boolean[]. And only visit if not visited before.

Crossed edges = edges that will visit a node that has already been visited before

Parent edges = edges that are used in a BFS

  1. Parent edges form a tree (no cycles) ALWAYS (even if shortest path is non-unique because of checking for "visited" nodes)
  • if there ARE cycles => re-visited a node
  1. Parent edges are shortest paths from source
  • if there IS a shorter path, it would already have been taken in the previous iteration
  1. possibly high degree (star) and possibly high diameter (straight line)
// need to restart for nodes not visited (if not connected)

BFS(Node[] nodeList, int startId) {
  // keep track of visited nodes
  boolean[] visited = new boolean[nodeList.length];
  // even though Java boolean[] default is already false
  Arrays.fill(visited, false);

  int[] parent = new int[nodeList.length];
  Arrays.fill(parent, -1);

  Collection<Integer> frontier = new Collection<>();

  while (!frontier.isEmpty()) {
    // can reuse same queue if using queue (just keep track of number of nodes in a level)
    Collection<Integer> nextFrontier = new Collection<>();
    for (Integer v : frontier) {
      // do things to v here
      for (Integer w : nodeList[v].neighbours) {
        if (!visited[w]) {
          // only visit each node once
          visited[w] = true;
          parent[w] = v;
    frontier = nextFrontier;

For Adjacency List: Each vertex only added to frontier ONCE (aka only visited once) -> O(V) Each v.neighbours only enumerated ONCE -> O(E), each edge is examined twice, once from each end Time Complexity of BFS: O(V + E)

For Adjacency Matrix Time Complexity of BFS: O(V^2)

Depth-First Search (DFS)

Use a stack or recursion stack. Takes edge from vertex that was discovered most recently.

  1. Follow path under you get stuck
  2. Backtrack until you find a new edge
  3. Recursively explore it
  4. Don't repeat vertices

DFS does not necessarily yield the shortest path

Use recursion as the stack :D

DFS parent graph is a tree (because never visit same node twice) parent edges logic similar to BFS

DFS_visit(Node[] nodeList, boolean[] visisted, int startId) {
  for (Integer v : nodeList[startId].neighbours) {
    if (!visited[v]) {
      visited[v] = true;
      DFS_visit(nodeList, visited, v);

DFS(Node[] nodeList) {
  boolean[] visited = new boolean[nodeList.length];
  Arrays.fill(visited, false);

  for (int start = 0; start < nodeList.length; start++) {
    if (!visisted[start]) {
      visisted[start] = true;
      DFS_visit(nodeList, visisted, start);

For Adjacency List Time complexity: O(V + E) (same reason as BFS)

For Adjacency Matrix Time Complexity: O(V^2)

Directed Graph (Digraph)

Undirected graphs NOT considered a directed graph (in CS2040S).

Each edge is unique (to not be a multi graph).

(v,w)(w,v)(v, w) \neq (w, v)

In-degree: number of incoming edges Out-degree: number of outgoing edges

Adjacency List

Exactly the same but only store outgoing edges in linked list

Adjacency Matrix

Exactly the same but no longer symmetrical matrix


Follow outgoing edges Ignore incoming edges


Follow outgoing edges Backtrack (through incoming edges) when unrolling recursion

Topological Ordering

Every directed acyclic graph (DAG) has a topological ordering. DAG must have a partial order, hence antisymmetric, reflexive, transitive.

Topological orderings are not unique.

Must have:

  1. Sequential total ordering of all nodes
  2. Every edge only point forward (NO bidirectional) -- antisymmetric
  3. Must have no cycle

Pre-order traversal of trees is topological ordering of the tree because it respects the direction of edges where parents are ordered before children.

DFS for Topological Ordering

Can use post-order DFS to find Topological Ordering. Process each node when it is last visited and put it at the BACK of the ordering. Also, need to start the DFS multiple times because we don't know wheres the actual start(s).

for (int start = 0; start < nodeList.length; start++) {
  if (!visisted[start]) {
    visisted[start] = true;
    DFS_visit(nodeList, visisted, start);
    // post-order
    add to end of ordering

O(V + E) time complexity because its just a DFS with an extra O(1) on each node

Reverse of post-order DFS.

Kahn's Algorithm for Topological Ordering

Kahn's Algorithm is a form of BFS. A bit more difficult to implement than DFS.


  1. S = all nodes in G that have no incoming edges
  2. Add nodes in S to the topological ordering
  3. Remove all edges adjacent to nodes in S
  4. Remove nodes in S from the graph

O(V + E) time complexity

Strongly Connected Components

For every vertex v and w in the component:

  1. There is a path from v to w.
  2. There is a path from w to v.

A single node is trivially a strongly connected component.

Cycles are strongly connected components.

Graph of multiple strongly connected components is acyclic if connecting the strongly connected components in a non-cyclic way.

DAG has no cycles and hence there are no (u, v) in the same strongly connected component. Thus, the DAG will have n strongly connected components.

DAG: Directed Acyclic Graph

Weighted Graphs

CANNOT use BFS to find shortest path in weighted graphs because it does not attempt all possible paths

δ(u,v)\delta(u,v) = shortest distance from u to v

Triangle Inequality δ(S,C)δ(S,A)+δ(A,C)\delta(S,C)\leq \delta(S,A)+\delta(A,C)

EVERY shortest path algorithm makes use of this triangle inequality.

The need to maintain distance estimate for each distance.

int[] dist = new int[V.length];
Arrays.fill(dist, INFINITY);
dist[start] = 0;

At every iteration:

  1. Reduce estimate
  2. Invariant: estimate >= actual distance

If all the edges added some constant factor of C > 0: shortest path will change (because not every path has the same number of edges). If all the edges multiplied some constant factor of C > 0: shortest path will not change.

// application of triangle inequality
void relax(int S, int A) {
  // essentially, take the shortest path from S to A
  if (dist[v] > dist[u] + weight(u, v)) {
    dist[v] = dist[u] + weight(u, v);

Need to relax edges in the correct order to only relax each edge once. Look at Dijkstra's Algorithm below.

Bellman-Ford Algorithm

n = V.length;
for (i = 0; i < n; i++) {
  if during an iteration, all the relax does nothing, then terminate early
  for (Edge e : graph) {

If relax every edge in the graph (n - 1) times, will have correct final answer. Can terminate early if an entire sequence of relax has no effect. n - 1 times because the longest path is n - 1 edges (if no negative weight cycle)

If P is the shortest path from S to D, and if P goes through X, then P is also the shortest path from S to X (and from X to D). After i iteration of Bellman-Ford, i hop estimate (i hops) on shortest path is correct. After 1 iteration of Bellman-Ford, NOT ALL nodes 1 hop away from source is correct (only the one along the shortest path is guaranteed)!

Thus, by induction, after n iterations, shortest path reaches the destination.

Time complexity: O(E * V) in the worst case but can terminate early!

Negative Weights

Should work because triangle inequality still holds.

Notion of shortest path is not well-defined with negative weight cycles (negative infinity essentially).

Bellman-Ford does not work for negative weights (with cycles). If after |V| iterations and there is still a change => negative weight cycle exist. Thus, can use Bellman-Ford to detect if negative weight cycles exist

All Same Weight

Bellman-Ford will still work.

Effectively unweighted so just use BFS easier.

Alternative: Queue-Based Bellman-Ford

These queue-based alternative allows one to easily detect paths that include negative weight cycles while not interfering with the results of the nodes whose paths do not include a negative weight cycle.

  1. boolean[] onQ
  2. Queue<Node>
  3. int[] distances

One way is to loop while queue is not empty AND number of loops do not exceed V * V. Another way is to check for existence of negative weight cycles every V iterations (parent pointers are needed). While looping, relax all the edges of the polled node and add those adjacent edges to the queue, if onQ[dest] is False.

After said loop, every node left in the queue (and every node which is linked to those nodes and so forth) has a path that includes a negative weight cycle. Thus, can set the distances of these nodes to NEGINF.

Dijkstra Algorithm

Right Order of Relax

Proof that a right order of relaxing edges exists

  1. Find shortest path tree
  2. Relax tree edges in BFS order
  3. Relax non-tree edges in any order

If relax edges in the correct order, each edge is only relaxed once.

Thus, it is possible for an algorithm with O(E) time complexity. BUT can't use the proof as the algorithm itself as you need the answer to solve the answer.


Relax the edges in the right order such that each edge is only relaxed once! Use priority queue. Take edge from vertex that is closest to source.

Necessary assumptions:

  1. All edges weight >= 0 (NO negative weight edges)
  2. Extending a path does not make it shorter

Idea Only relax an edge once its estimate is correct (and will never change again)!

  1. Maintain distance estimate for every node
  2. Begin with empty shortest path tree
  3. Repeat:
  • Consider node with minimum estimate
  • We will show that this node has a good estimate
  • Add node to shortest path tree
  • Relax all outgoing edges

Connecting all the edges which last caused each node to change value (while relaxing) will form a shortest path tree

Data Structure

Use a priority queue

  • void insert(Key k, Priority p) O(log n)
  • Data extractMin()
  • void deleteMin() O(log n)
  • decreaseKey(Key k, Priority p) O(log n)
  • boolean contains(Key k) O(1)
  • boolean isEmpty()

Can use AVL Tree (with a hash table) or heap to implement a priority queue

Indexed by priority but still need to do fast contains() One solution is to use a hash table to point to the nodes in the AVL Tree

decreaseKey(key, priority): delete, then decrease, then re-insert back -> still O(log n)

public Dijkstra {
  private Graph g;
  private IPriorityQueue pq = new PriorityQueue();
  private double[] distTo;

  void searchPath(int start) {
    pq.insert(start, 0.0);
    distTo = new double[G.size()];
    Arrays.fill(distTo, INFINITY);
    distTo[start] = 0;
    while (!pq.isEmpty()) {
      int w = pq.deleteMin();
      for (Edge e : G[w].neighbours) {
        // this loop is ran <= V times
        // but since each edge is relaxed only once

  // relax is O(log V)
  void relax(Edge e) {
    int v = e.from();
    int w = e.to();
    double weight = e.weight();
    if (distTo[w] > distTo[v] + weight) {
      distTo[w] = distTo[v] + weight;
      parent[w] = v;
      if (pq.contains(w)) {
        pq.decreaseKey(w, distTo[w]);
      } else {
        pq.insert(w, distTo[w]);

Time complexity if implemented with an AVL Tree Priority Queue O(E + V) in the main loop insert O(log n) -> at most V times extractMin O(log n) -> at most V times decreaseKey O(log n) -> at most E times isEmpty() O(1) -> at most V times

Overall: O((V + E) _ log V) = O(E _ log V)

V = number of nodes in the connected component of source

E >= V (because every node in the connected component is connected) thus can remove V from the final asymptotic running time

Why Dijkstra Algorithm Works

Every edge crossing the boundary has been relaxed.

Fringe vertices: neighbour of a finished vertex (in priority queue)

Claim: When taking a node out of the priority queue, the estimate will be correct and it can join the finished set of vertices.

Proof by induction: Base:

  1. Every finished vertex has correct estimate
  2. Initially, only finished vertex is start

Inductive step:

  1. Remove vertex from priority queue
  2. Relax its edges
  3. Add it to finished
  4. Claim: It has a correct estimate

To prove the claim that "It has a correct estimate": Prove by contradiction, if fringe vertex has another shorter path then by triangle inequality, will find some contradiction.


Can terminate Dijkstra Algorithm immediately once the destination is dequeued from the priority queue. This is because once a node is dequeued, the estimate is correct.

Dijkstra's Performance

PQ ImplementationinsertdeleteMindecreaseKeyDijkstra
AVL TreelogV\log VlogV\log VlogV\log VO(ElogV)O(E\log V)
d-way HeapdlogdVd\log_d VdlogdVd\log_d VlogdV\log_d VO(ElogE/VV)O(E\log_{E/V}V)
Fibonacci Heap1logV\log V1O(E+VlogV)O(E+V\log V)

Fibonacci Heap has good amortised performance but real-world performance is slower great YouTube video on fibonacci heap

Negative Weight Edges

Invariant of estimate being correct when removed from priority queue no longer holds true. Thus Dijkstra Algorithm cannot be performed on a graph with negative weight edges.

Re-weighing Weights

Re-weigh weights by increasing each edge's weight by some constant to make it non-negative will NOT WORK.

This will favour paths with less edges, which may not be the actual shortest path.

There is a way to re-weight weights properly for this to work, but it increases the time complexity so much that running Bellman-Ford is faster.

Heap & Heap Trees

Heap trees are an implementation of priority queues.

Array Representation of Heap Tree

For minimum heap tree: For an array of indices [1, 2, ..., n]

  • root: ii
  • left child at 2i2i
  • right child at 2i+12i+1
  • parent at i/2\lfloor i/2\rfloor

Full Binary Tree

  1. Every level of the tree must be full.

In array representation: heap has array of 2n2^n items, every item in this array is filled (no spaces). Use resizeable array if heap is resizable too.

Every full binary tree is also a complete binary tree.

Complete Binary Tree

  1. Every level of the tree must be full except possibly the leaves.
  2. Leaf nodes are as far left as possible.

In array representation: heap has array of xx items, every item in this array is filled (no gaps in the array).

Maximum Heap Tree

  1. Value at root node is greater than or equal to both its children (local property, does not apply to "nephews").
  2. The heap tree is a complete binary tree.

Operations have to maintain the complete binary tree property.

Claim: For heap with unique values, second largest element must be a child of the root node. Proof: Suppose it is not, then there is a contradiction because that means there exists a node > second largest yet < largest.

Height is logn\lfloor \log n\rfloor because complete binary tree

Minimum Heap Tree is the same except it is "lesser than or equal to" instead.

Bubbling Up and Down

bubbleUp and bubbleDown

bubbleUp(Node u):
  // root = no parent = no violation
  while u is not the root:
    if (u.priority > u.parent.priority) {
      swap(u, u.parent)
    u = u.parent
bubbleDown(Node u):
  // leaf = no children = no violation
  while u is not a leaf:
    maxP = max(u.left.priority, u.right.priority)
    if (maxP > u.priority) {
      // choose larger side to bubble down to
      if (maxP == u.left.priority) {
        swap(u, u.left)
        u = u.left
      } else if (maxP == u.right.priority) {
        swap(u, u.right)
        u = u.right

Each iteration of bubble up and down raises/lowers the level of the invariant violation. Once each the root/leaf levels, no more invariant violation by definition!

They have different complexities based on where you start it (nearer to leaf or root)!


  1. Insert the new node at the far left of the leaf height
  2. Bubble up to fix the first property of maximum heap tree

Maximum number of swaps is O(log n) Time complexity: O(log n)

To find insertion point:

  1. use size of tree
  2. end pointer to insertion point

Note: to get O(1) search for the tree, use hashtable to store

  1. pointer (need to maintain when add/deleting elements)
  2. position (need to maintain when add/deleting/swapping elements)

Extracting Maximum

Only can remove the root (because its the maximum)

  1. Remove the root (it is the result)
  2. Move the last element at the leaf height to the root
  3. Bubble down (pick either child node that is > current) to fix the first property of maximum heap tree
Deleting a General Node
  1. increase priority of the node to infinity
  2. deleteMax()

Heap Sort

  1. Add all the elements one by one into the heap
  2. Extract maximum n times.
  3. While extracting, fill in the extracted value at the end of the array
  4. After this, the array is sorted.

O(n log n)

In-place Not stable


Inserting n items into a heap in O(n) time.

  1. Start off with a complete tree (that does not satisfy the first property of maximum heap tree)
  2. Check through each node from the bottom right most to the root
  3. If it is a heap at its subtree, good! (only need to check left and right children)
  4. Else, bubble down (and NOT bubble up).

Invariant: After i iterations, last i items are in the heap because after the bubble down on x, tree rooted at x is a heap.

Intuition: Bubbling down results in an overall O(n) because there are much more nodes below than on top.


Cost of bubbleDown on item j is O(log(nj+1)+1)O(\log(\frac{n}{j+1})+1) because height of item jj is O(log(nj+1))O(\log(\frac{n}{j+1})) and cost = height + 1

Heapfiy cost=j=1nlog(nj+1+1)\text{Heapfiy cost}=\sum_{j=1}^{n}\log(\frac{n}{j+1}+1) =n+logn1+logn2+logn3++lognn=n+\log\frac{n}{1}+\log\frac{n}{2}+\log\frac{n}{3}+\cdots+\log\frac{n}{n} =n+lognnn!=n+\log\frac{n^n}{n!} =n+lognnnen by Stirling’s Approximation=n+\log\frac{n^n}{\frac{n}{e}^n}\text{ by Stirling's Approximation} =n+logen=n+nloge=n(1+loge)=O(n)=n+\log e^n=n+n\log e=n(1+\log e)=O(n)

Alternate heapify cost analysis using GP of GP.

NOTE: Can skip all the leaf nodes because they have no children so definitely nothing to bubbleDown. (Doesn't change complexity but is an optimisation by about half)

Summary of Single Source Shortest Path (SSSP)

Single-Source Shortest Paths BFS: Unweighted graphs Bellman-Ford: simple, general Dijkstra: Faster but only for non-negative weights

DAGs: topological sort (using either reverse DFS post-order OR Kahn's Algorithm) + relax DAG Longest Path: negation of weights + shortest path on DAG

Shortest Path on DAG

  1. Topological order
  2. Relax edges in any order

Topological ordering because when reach any node, all nodes pointing into it definitely has the correct distance (because topologically before this node).

Time complexity (assuming adjacency list) Finding topological ordering: O(V + E) Relaxing E edges: O(E) Overall: O(V + E)

Longest Path on DAG

Negate all the edges to reduce this problem to be a Shortest Path on DAG problem

For a cyclic graph, CANNOT do this negation trick because it leads to negative weight cycles and everything breaks.

In general, finding longest path on a general graph is a NP-hard problem (no polynomial solution).

Logarithm Trick to Avoid Multiplication

In general, modifying graph/input is better than modifying the algorithm because modifying algorithms could easily introduce bugs or change time complexities.

log(a1a2an)=loga1+loga2++logan\log(a_1*a_2* \cdots *a_n)=\log a_1+\log a_2+\cdots +\log a_n log(1a1a2an)=log(a1a2an)=loga1+loga2++logan\log(\frac{1}{a_1*a_2* \cdots *a_n})=-\log(a_1*a_2* \cdots *a_n)=-\log a_1+-\log a_2+\cdots +-\log a_n

Note that 0 < x < 1, then logx-\log x > 0. So, can use Dijkstra.

Simpler Travelling Salesman

Start, Destination and single checkpoint in the middle. Run SSSP on start and SSSP on destination. Choose the best checkpoint and paths.

If need 2 different type of checkpoints, modify the graph such that each vertex will store whether a type A checkpoint and type B checkpoint has been visited respectively in the path.

If n checkpoints, its NP-hard!


Union Find

Is Connected?

Naive Version (without wall breaking and static set) isConnected: return true if A and B are in the same connected component Preprocess: Identify connected components. Label each location with its component number

Dynamic Connectivity

Given a set of objects: union(): connect two objects find(): is there a path connecting 2 objects?

Quick-Find Algorithm

Optimised for quick finding

If objects are not integers, convert them to integers.

  1. Hash function (if collision don't matter)
  • UUID generator
  1. Hash table + open addressing (if collisions matter)

Data Structure int[] componentId Two objects are connected if they have the same component identifier.

Essentially very flat trees (tree with height of max 2) Same component => in the same subtree Can model the above int array as a flat tree.

find(int p, int q):
  // O(1)
  return componentId[p] == componentId[q]

// O(n) because need to search through ALL objects
union(int p, int q):
  // choose either to be the new id
  updateComponent = componentId[q]
  for (int i = 0; i < componentId.length; i++) {
    if (componentId[i] == updateComponent) {
      componentId[i] = componentId[p];

Quick-Union Algorithm

int[] parent (parent pointers) Two objects are connected if they are part of the same tree

These non-binary trees will not be balanced.

find(int p, int q) {
  // O(n) if a very tall tree (all objects in same component)
  while (parent[p] != p) p = parent[p];
  while (parent[q] != q) q = parent[q];
  return root of p == root of q;

union(int p, int q) {
  // O(n) if a very tall tree (all objects in same component)
  while (parent[p] != p) p = parent[p];
  while (parent[q] != q) q = parent[q];
  // pick either node to be the parent
  parent[p] = parent[q];

This version is very bad because trees are tall (unbalanced).

Need to compress the tree to achieve fast union!

Optimisation: Weighted Union

IDEA: Correctly choose which tree to make the root. Pick the taller tree to be the new root. (make shorter tree a child)

union(int p, int q) {
  while (parent[p] != p) p = parent[p];
  while (parent[q] != q) q = parent[q];
  // pick taller tree to be the root
  if (size[p] > size[q]) {
    // p is taller
    parent[q] = parent[p];
    size[p] = size[p] + size[q];
  } else {
    // q is taller
    parent[p] = parent[q];
    size[q] = size[p] + size[q];

Tree will end up being much shorter!

Maximum depth of tree: O(log n) IDEA: Height only increases when total size doubles. Claim: A tree of height k has size at least 2k2^k. Thus, conversely, height of tree of size n is at most log n.

Base case: Tree of height 0 contains 1 item Induction: Let T1 be the bigger tree (in terms of size) and T2 be the smaller tree.

  1. Assume a tree of height k - 1 has size at least 2k12^{k-1}
  2. T2 has height < T1 and T2 has size at least 2k12^{k-1} by induction. size[T1] >= size[T2] >= 2k12^{k-1} by union-by-weight rule

Time complexity Find: O(log n) Union: O(log n)

Union-Find Summary

AKA Union-By-Rank (rank = log(size)) AKA Union-By-Height

Important Properties:

  1. weight/rank/size/height of subtree does not change except at the root (so only update root on union)
  2. weight/rank/size/height only increases when tree size doubles

Optimisation: Path Compression

After finding the root: set the parent of each traversed node to the root

When path compressing, height changes but weight doesn't :D

findRoot(int p) {
  root = p
  while (parent[root] != root) root = parent[root]
  while (parent[p] != p) {
    temp = parent[p]
    parent[p] = root
    p = temp
  return root

// alternative: point each node to their grandparent instead
findRoot(int p) {
  root = p
  while (parent[root] != root) {
    // pointing everything to their grandparent achieves the same effect as directly pointing to the root
    root = parent[root]
    parent[root] = parent[parent[root]]
  return root

Do path compression whenever you area traversing up the tree. (doesn't affect complexity)

Time complexity Find: O(log n) Union: O(log n)

First find operation will be O(n) because haven't do any path compression.

Optimisation: Weighted Union + Path Compression

Theorem: Starting from empty, any sequence of m union/find operations on n objects takes: O(n+mα(m,n))O(n + m\alpha (m,n))

ALPHA IS NOT CONSTANT but is always < 5 in our universe

IMPOSSIBLE to achieve linear time

Time complexity Find: O(α(m,n))O(\alpha (m,n)) Union: O(α(m,n))O(\alpha (m,n))

Alpha is the Inverse Ackermann function

Union-Find Summary

Weighted Union is faster:

  1. Trees are flat: O(log n)
  2. Union and find are both O(log n)

Weighted Union + Path Compression very fast:

  1. Trees very flat
  2. On average, almost linear performance per operation
weighted-unionO(log n)O(log n)
path-compressionO(log n)O(log n)
weighted-union with path compressionα(m,n)\alpha(m,n)α(m,n)\alpha(m,n)


  1. Maze solving
  2. Game states
  3. Network connections
  4. Lowest Common Ancestors (LCA)
  5. Finite state automata
  6. Image processing in Matlab
  7. Physics

Union-Split-Find Insert and delete edges. Very difficult problem.

Binomial Tree

B4 = (root + B0 + B1 + B2 + B3) = B3 + B3 size(Bk) = Θ(2k)\Theta(2^k) height(Bk) = k - 1

  1. Build Binomial Tree using union operations
  2. Union: create new root in O(1)
  3. Find deepest leaf in O(log n)
  4. Perform path compression
  5. Still a binomial tree
  6. Goto step 2

Minimum Spanning Trees (MST)

For weighted, undirected graphs. (For directed graphs, it gets much more complicated and is not well-defined)

A spanning tree is an acyclic subset of the edges that connect all nodes. Note: If there were cyclic edges, can remove one edge and reduce the weight.

A minimum spanning tree is a spanning tree with minimum weight.

MST cannot be used for finding shortest path (entirely unrelated!) Because the shortest path might not be included in the MST.

To avoid issue of duplicate edge weights, use a unique identifier like a node ID as a tie-breaker.

Applications for MST

  • Network design problems
    • Telephone networks
    • Electrical networks
    • Computer networks
    • Road networks
    • etc

Minimum cost network to stream a movie to all the users.

Properties of MST

A cut of a graph G = (V, E) is a partition of the vertices V into two disjoint subsets. An edge crosses a cut if it has one vertex in each of the two sets.

  1. MSTs have no cycles (it is a tree).
  2. When cutting a MST into 2 pieces, they will become 2 MSTs for their connected components.
  3. Cycle property
  • for every cycle, the maximum weight edge is not in the MST.
  • proof by contradiction:
    • if it is in the MST, add another edge in the cycle into the MST
    • then, there is a cycle
    • thus, we need to remove an edge
    • so we'll remove the heavier edge
    • thus, maximum weight edge is not in the MST
  • BUT for every cycle, the minimum weight edge may or may not be in the MST because the lightest edge in one cycle can be the heaviest edge in another cycle.
  1. Cut property
  • For every cut/partition of the nodes, the minimum weight edge across the cut is in the MST.
  • proof by contradiction
  • if it is not, add minimum weight edge on cut
  • then, there is a cycle
  • thus, we need to remove the heaviest edge on cycle
  • so the minimum weight edge on cut will be left

For every node, the minimum outgoing edge is always part of the MST. (this is because of the cut property, cut the one node away from the rest of the graph) For every node, the maximum outgoing edge can be part of the MST (if there is only 1 outgoing edge).

Generic MST Algorithm

Red rule (based on cycle property): If C is a cycle with no red arcs, then colour the max-weight edge in C red. Blue rule (based on cut property): If D is a cut with no blue arcs, then colour the minimum-weight edge in D blue.

Greedily apply red rule and blue rule to an arbitrary edges until no more edges can be coloured.

On termination: The blue edges are an MST.

  1. Every cycle has a red edge. No blue cycles => forest of blue edges
  2. Blue edges form a tree => spanning tree
  • if not, there is a cut with no blue edge
  1. Every edge is coloured
  2. Every blue edge is in the MST (because of cut property)

This generic MST algorithm is the basis of ALL MST algorithms.

Note: If all edges have different costs, there is a unique MST. https://math.stackexchange.com/questions/352163/show-that-theres-a-unique-minimum-spanning-tree-if-all-edges-have-different-cos

A divide-and-conquer implementation of this does not work because converse of cut property is not true!

BUT divide-and-conquer along the median does work.

Prim's Algorithm

Basic Idea: Identify cuts and keep growing the set S. Fairly similar to Dijkstra's Algorithm's idea with some small changes that completely change the outcome!

S: set of nodes connected by blue edges, initially, some random starting node Next, add the minimum weight edge on cut from S to the rest of the nodes (cut property) and relax the edges. Repeat this until all nodes are added into S.

Use a priority queue to find the minimum weight edge on cut. In this case, priority queue is used to store nodes and not edges.

// pq stores nodes
pq = new PQ();
for (Node v : G.V()) {
  pq.insert(v, infinity);
pq.decreaseKey(start, 0);

HashSet<Node> S = new HashSet<Node>();

HashMap<Node,Node> parent = new HashMap<Node, Node>();
parent.put(start, null);

while (!pq.isEmpty()) {
  Node src = pq.deleteMin();
  for (Edge e : v.edgeList()) {
    Node dest = e.otherNode(src);
    if (!S.contains(dest)) {
      // assume decreaseKey does nothing if weight is > original
      double weight = e.getWeight();
      pq.decreaseKey(dest, weight);
      if (weight < currentWeight of dest) {
        parent.put(dest, src);


  1. Each added edge is the lightest on some cut.
  2. Hence each edge is in the MST.

Time complexity of using a AVL Tree for PQ: O(E log V) because each vertex added/removed once from the priority queue -> O(V log V) each edge, <= 1 decreaseKey -> O(E log V) E >= V because connected -> O(E log V)

with fibonacci heap O(V) -> insert all nodes at the start (fibonacci heap is of size O(V)) O(E) -> decreaseKey called for each edge O(V log V) -> extractMin for each node O(V log V + E)

Kruskal's Algorithm

Basic Idea:

  • Sort edges by weight from smallest to biggest
  • consider edges in ascending order:
    • if both endpoints are in the same blue tree, then colour the edge red (must be the heaviest edge on the cycle, because doing this algorithm in sorted weight order)
    • otherwise, colour the edge blue

Can terminate early once V - 1 edges are obtained

Use Union-Find data structure. Connect two nodes if they are in the same blue-tree.

// sort edges
Edge[] sortedEdges = sort(G.E());
ArrayList<Edge> mstEdges = new ArrayList<Edge>();
UnionFind uf = new UnionFind(G.V());

for (int i = 0; i < sortedEdges.length; i++) {
  Edge e = sortedEdges[i];
  Node src = e.first();
  Node dest = e.second();
  if (!uf.find(src, dest)) {
    uf.union(src, dest);


  1. Each added edge crosses a cut
  2. Each edge is the lightest edge across the cut (all other lighter edges across the cut have already been considered because sorted)

Time complexity on a connected graph: O(E log E) or O(E log V) because sorting edges -> O(E log E) = O(E log V) because O(E) = O(V) for E edges: Find and (sometimes) Union -> O(E α\alpha(E))

Boruvka's Algorithm

Basic Idea: Add all "obvious" blue edges Parallelisable MST algorithm.

For every node: add minimum adjacent edge (at least n/2 edges added, at most n/2 connected components)

Repeat: for every connected component, add minimum outgoing edge

Initially, n connected components, one for each node

Each iteration:

  1. For each connected component, search for the minimum weight outgoing edge
  2. Add selected edges
  3. Merge connected components

At most O(log V) iterations. Each iteration costs O(E+ V) = O(E) Thus, overall O(E log V)

MST Variants

All edges same weight

Run BFS or DFS because any spanning tree is a MST. O(E) in connected graph

Cost would be x*(V-1) where x is the cost of an edge.

All edges have weight from {1.. 10}

For Kruskal Algorithm Use an array of size 10 to sort. Each A[i] holds a linked list of edges of weight i. Then, run the second portion of Kruskal's Algorithm as usual.

Then, sorting is O(E). Overall: O(E α\alpha(E))

For Prim's Algorithm Inserting/removing nodes from PQ: O(V) decreaseKey: O(E) Total: O(V + E) = O(E)

CANNOT do this similar technique for Dijkstra's Algorithm

Directed MST

No root don't make sense. A rooted spanning tree: every node is reachable from the root.

Prim's, Kruskal's, Boruvka's do not work for minimum rooted spanning tree because cut property and cycle property do not work.

VERY hard problem (not covered in CS2040S)

Special Case

Special case that makes it very easy :D

A directed acyclic graph with one root (no incoming edges)

  1. For every node except the root: add minimum weight incoming edge

  2. No cycles because acyclic graph

  3. Each edge is chosen only once (thus overall V - 1 edges) Thus, must be a tree.

Since every node has to have at least one incoming edge in the MST, this is the minimum spanning tree!

Re-Weighing Edges

By adding/subtract all edges by some constant k, the MST does not change. Only relative weights of edges matter. Not the absolute edge weights.

Maximum Spanning Trees

MST with negative weights have no problem! So, just multiply each edge by -1 and run Minimum Spanning Tree algorithm and take the most negative.

OR run Kruskal's Algorithm in reverse (start from largest edge weight to smallest)

Smallest Maximum Edge


  1. Get MST
  2. BFS/DFS from A to B to find a path (it will definitely have the smallest maximum edge)

Dynamic Programming

Optimal Sub-Substructure

Optimal solution can be constructed from optimal solutions to smaller sub-problems.

It is a property of many problems. Greedy (Dijkstra's and MST) and divide-and-conquer (merge sort, Fast Fourier Transform).

Overlapping subproblems The same smaller problem is used to solve multiple different bigger problems. Divide and conquer problems do not have overlapping subproblems.

Basic Ideas

Strategy 1

  1. Solve smallest problems then go up layer by layer

Strategy 2

Most DP problems can be converted into graph problems and then topologically sort and relax.

  1. Topologically sort DAG (root point to its dependencies)
  2. Solve problems in reverse order

Strategy 3

  1. Start at root and recurse
  2. Recurse.
  3. Recurse.
  4. ...
  5. Solve and memoize (make sure each subproblem is only solved once).

Strategy 4

Table that captures all subproblems.

DP Recipe

  1. Identify optimal substructure
  2. Define subproblems important
  3. Solve problem using subproblems
  4. Write the code!

DP Problems

Longest Increasing Subsequence

Find the length of the longest increasing subsequence.

DAG Solution

Node => element Edge u -> v => u to v is increasing

Now, use negated SSSP from every node and take the best + 1.


This causes a lot of repeated computation.

DP Solution 1 (Suffix Version with Graph)

This is defined as a suffix.

The sub problem is S[i] = LIS(A[i..n]) starting at A[i]

S[n] = 0 S[i] = max(all adjacent) + 1

// assuming V is already topo sorted
  // memo array
  int[] S = new int[V.length];
  for (i = 0; i < V.length; i++)
    S[i] = 0;
  S[n-1] = 1;
  for (int v = A.length - 2; v >= 0; v--) {
    // find maximum S for any outgoing edge
    int max = 0;
    for (Node w : v.neighbours) {
      if (S[w] > max) {
        max = S[w];
    S[v] = max + 1;
DP Solution 2 (Prefix Version)

This is defined as a prefix.

S[i] = LIS(A[1..i]) ending at A[i]

S[1] = 1 S[i] = max(all previous) + 1

static int lengthOfLIS (int[] nums) {
  int[] dp = new int[nums.length];
  dp[0] = 1;
  for (int i = 1; i < nums.length; i++) {
    dp[i] = 1;
    for (int j = 0; j < i; j ++) {
      if (nums[j] < nums[i])
        dp[i] = Math.max(dp[i], dp[j] + 1);
  int res = 0;
  for (int x : dp) res = Math.max(res, x);
  return res;

Both DP Solution 1 and 2 are O(n2)O(n^2)

Can solve in O(n log n) by binary searching subproblems


  1. Maintain a tails array storing the smallest tail of each subsequence of lengths 0 to n - 1
  2. This tails array is non-decreasing or increasing (if unique values), and thus is monotonic. So we can do binary search!
  3. Iterate through each value in the array and binary search for the position to add to the tails array.
  4. At the end, return the size of the tails array (not the actual size, but the size that has values). The size will represent the existence of a subsequence at that length.

Prize Collecting


  • directed graph G = (V, E)
  • Edge weights w = prizes on each edge

Find maximum prize possible.

  1. Check for positive weight cycles. If yes, then return infinity.
  2. Negate the edges and run Bellman-Ford

Lazy Prize Collecting

Find maximum prize possible in at most k edges.

Idea 1 (treat as graph problem)
  1. Transform into a DAG
  2. Make k copies of every node
  3. Solve prize collecting via DAG_SSSP (longest path) for each node (in the original graph)

O(kVE) Transformed graph has kV nodes and kE edges (transforming graph need to recompute number of nodes and edges). Build transformed graph in O(kV + kE) = O(kE) because connected graph Topo-sort/longest path -> O(kV + kE) Once per source: repeat V times -> O(kVE)

Optimisation Realise that you don't need to run DAG_SSSP V times! Create super-source that is connected to every other node in the original graph (with 0 weight) Run DAG_SSSP ONCE from this super-source node.

O(kV + kE)

Idea 2 (DP problem)

P[v, k] = maximum prize that you can collect starting at v and taking exactly k steps P[v, 0] = 0

P[v, k] = max(P[w1, k-1] + w(v, w1), P[w2, k-1] + w(v, w2), ...) where w1, w2, ... are the neighbours of v

int LazyPrizeCollecting(V, E, kMax) {
  int[][] P = new int[V.length][kMax+1];
  for (int i = 0; i < V.length; i++)
    for (int j = 0; j < kMax + 1; j++)
      P[i][j] = 0;
  for (int k = 1; k < kMax + 1; k++) {
    for (int v = 0; v < V.length; v++) {
      int max = -INFINITY;
      // find max prize in next step
      for (int w : V[v].neighbours) {
        if (P[w, k-1] + E[v, w] > max) {
          max = P[w, k-1] + E[v, w];
      P[v, k] = max;
  return maxEntry(P);

Total Cost:

  1. number of subproblems: kV
  2. cost to solve each subproblem: |V.neighbours| O(kV2)O(kV^2)

Longest Common Subsequence

let X(j) = X[0..j]

LCS(A(n), B(n)) = A[n] == B[n] ? LCS(A(n-1), B(n-1)) + 1 : max(LCS(A(n), B(n-1)), LCS(A(n-1), B(n)))

O(mn)O(mn) where m = A.length, n - B.length

Minimum Vertex Cover

https://en.wikipedia.org/wiki/Vertex_cover Find set of nodes C (C = vertex covers) where every edge is adjacent to at least one node in C.

NP-Complete problem (no polynomial algorithm) but there is an easy 2-approximation algorithm (but easy solution better (read: lower) than 2 approximation, also provably no polynomial approximation better than 2ϵ for ϵ>0\sqrt 2-\epsilon \text{ for }\epsilon > 0)

Simpler Problem Minimum vertex cover on a tree (maybe not binary tree)

Need 2 subproblems: consider 2 cases: v is covered and not covered S[v, 0] = size of vertex cover in subtree rooted at node v, if v is not covered S[v, 1] = size of vertex cover in subtree rooted at node v, if v is covered

Base case is leaf S[leaf, 0] = 0 S[leaf, 1] = 1

If v is not covered, all of v's children has to be covered. S[v, 0] = S[w1, 1] + S[w2, 1] + ... S[v, 1] = 1 + min(S[w1, 0], S[w1, 1]) + ... where w1, w2, ... are children of v

Total 2V subproblems

int treeVertexCover(V) {
  int[][] S = new int[V.length][2];
  for (int v = V.length - 1; v >= 0; --) {
    // start from leaves
    if (v.children.size() == 0) {
      // v is leaf
      S[v][0] = 0;
      S[v][1] = 1;
    } else {
      S[v][0] = 0;
      S[v][1] = 1;
      for (int w : V[v].children) {
        S[v][0] += S[w][1];
        S[v][1] += Math.min(S[w][0], S[w][1]);
  // return min at root
  return Math.min(S[0][0], S[0][1]);


All Pairs Shortest Path (APSP)

Input of weighted, directed graph. Finding shortest path between any 2 nodes.

Trivial version: run SSSP on each vertex. Dynamic programming is the efficient way to compute APSP.

Naive Solution

For positive weights only Run Dijkstra's Algorithm on every query and memoize. Preprocessing: 0 Responding to q queries: O(VE * log V) sparse graph: O(V2logV)O(V^2logV)

For identical weights, O(V(E + V)) + O(VE) in dense graph: O(V3)O(V^3) in sparse graph: O(V2)O(V^2)

Preprocessing Solution

Preprocessing: all-pairs-shortest-path Responding to q queries: O(q)

There are also some versions that sacrifice some response time for faster preprocessing.

let S[v, w, P] be the shortest path from v to w that only uses intermediate nodes in the set P.


Limit ourselves to V+1 different sets P (out of 2V2^V)

P0 = {} P1 = {1} P2 = {1, 2} ... PnP_n = {1, 2, ..., n}

Either use or don't use a node S[v, w, P8P_8] = min(S[v, w, P7P_7], S[v, 8, P7P_7] + S[8, w, P7P_7]) S[v, w, Pi+1P_{i+1}] = min(S[v, w, PiP_i], S[v, i+1, PiP_i] + S[i+1, w, PiP_i])

In the end, you find the path using with PnP_n which allows you to use ALL intermediate nodes.

int[][] APSP(E) {
  // 3D matrix is unnecessary (only the previous set is used), thus only need a 2D matrix
  int[][] S = new int[V.length][V.length];
  for (int v = 0; v < V.length; v++) {
    for (int w = 0; w < V.length; w+) {
      // initialise with original weights
      S[v][w] = E[v][w];

  // for sets P0, P1, ...
  for (int k = 0; k < V.length; k++) {
    // for every pair of nodes
    for (int v = 0; v < V.length; v++) {
      for (int w = 0; w < V.length; w++) {
        // either use or don't use the node
        S[v][w] = Math.min(S[v][w], S[v][k] + S[k][w]);
  return S;

O(V3)O(V^3) because triply nested loop (better than naive Dijkstra -> O(V3logV)O(V^3\log V) when dense)

V3V^3 subproblems

Path Reconstruction

Return actual path and not just the distance. Don't store all the shortest paths: this takes V3V^3 space (V choose 2 pairs * V hops)

Only store the first hop for each destination, similarly to a routing table. In Floyd-Warshall, store "intermediate node" whenever modifying/updating the matrix entry for a pair. Then you can easily reconstruct the actual path in O(V).

O(V2)O(V^2) space

Transitive Closure

Return a matrix M where M[v, w] = 1 if there exists a path from v to w M[v, w] = 0 otherwise

Similarly to Floyd-Warshall but without the actual distance.

M[v, w, k + 1] = OR(M[v, w, k], AND(M[v, k, k], M[k, w, k]))

Minimum Bottleneck Edge

For (v, w), the bottleneck is the heaviest edge on a path between v and w

Return a matrix B where B[v, w] = weight of the minimum bottleneck.