Homework 7 (due 5/26)
CSC 321



We've talked more about dynamic programming, including edit distance (Section 9.5) and fast modular exponentiation via repeated squaring (see the wikipedia entry for various implementations and variants). We also saw the Floyd Warshall algorithm for all-pairs shortest path (Section 8.5), and started talking about greedy and dynamic solutions to coin changing.

Next, we will discuss various text searching algorithms, including naive text-searching (9.1), and Rabin-Karp Search (9.2). Knuth-Morris-Pratt (9.3), and Boyer-Moore-Horspool (9.4). After that, we'll move on to NP-completeness (Section 10.2).



Submission
: Submit to the d2l dropbox. Use some standard format for your homework (Word or pdf, for example), and make sure to include your name and assignment number. Include screenshots for code and testruns. You do not need to include the code (or any executables).

1. (Reading Assignment) Read Sections 9.5, 8.5. 

2. (Best Approximate Match) In this problem you will implemen a variant of the edit distance algorithm, namely the best approximate match. As explained below (and on pages 402-404 in the book), the two differ only in the initialization. The rest of the dynamic program (and algorithm) is the same.

  1. [15pt] Implement the Best Approximate Match algorithm (Algorithm 9.5.10 in the book) as a function bap(p,t). It calculates the smallest edit distance between p and any substring of t. The calculation is the same as for the edit distance problem (compare formula on page 400 for edit distance to formula on page 403 for best approximate match), the only difference is in the initialization phase. Test-run this with several small examples (e.g. Example 9.5.12 in the book [there is one incorrect entry in that table, see whether you can find it]). You do not need to find the best approximate match, just its distance.
  2. [5pt] You recall a phrase 'I wandered the streets alone and happy' from your favorite Mark Twain book, but you don't recall which one your favorite Mark Twain book is. It could have been any of the following four: Innocents Abroad, Mysterious Stranger, Roughing It, or Tramp Abroad. Use your implementation for Best Approximate Match to find out which book it is (calculate, and show, the distances of the phrase to each of the four books using your algorithm in a). To save space you may want to consider not creating the whole table, but just two lines. Hint: my program takes about 2-3 minutes to run through all four texts.
  3. [Extra Credit, 6pt] Extend your algorithm so it finds the exact phrase that you misremembered.  

Hint: you may want to start with implementing the edit distance algorithm (Algorithm 9.5.7) and test that with some examples before moving on to Algorithm 9.5.10 if you think you'll benefit from additional examples and testing.

3. (Transitive Closure, 15pt) Given a directed graph G = (V,E), the transitive closure tc(G) is a graph on the same vertex set V as G, and an edge (u,v) if there is a directed path from u to v in G. For example, the transitive closure tc(({1,2,3}, {(1,2),(2,3)})) is ({1,2,3}, {(1,2),(1,3), (2,3)}).

a) [3pt] What is the transitive closure of a directed path of length n, i.e. if you have vertices V = {1, ..., n} and edges E = {(1,2), (2,3), (3,4), ..., (n-1, n)} ?

b) [2pt] What is the transitive closure of a star graph, that is, one special (center) vertex 1 with edges to all other vertices, that is, E = {(1,2), (1,3), ..., (1,n)}, and no other edges ? Hint: draw it, with 1 in the center to see why it's called a star graph.

b) [7pt] We know that we can compute the transitive closure tc(G) of a graph using Floyd-Warshall,  but that takes time O(|V|3). Suppose we know that G is a dag (directed acyclic graph). Sketch (precise description, or pseudocode) the fastest algorithm for calculating the transitive closure tc(G) of the dag G.

c) [3pt] What is the running time of your algoritm in b)? Hint: be careful.


Marcus Schaefer
Last updated: May 20th, 2015.