## Homework 4 (due 10/9) CSC 491

In class we covered more dynamic programming; in particular we saw how to solve the longest common subsequence problem, the all-pairs shortest path problem (using the Floyd-Warshall algorithm, see Section 25.2), and the optimal binary search tree (15.5), also see optimal binary search tree example  (the formula in class was incorrect: the parentheses were wrong). Next week we will finish dynamic programming (0/1 knapsack) and start talking about greedy algorithms (16.1, 16.3).

Midterm will be in sixth week, first half of class. There will be a lecture during the second half. Exam is open book, open notes. Coverage includes everything up to and including this week.

Submission: you can submit the homework to me either by hardcopy in class or email it to me.

1. [Line-breaking] You want to write an algorithm that typesets a set of words forming a paragraph. Your input is an array w[1..n] of the lengths of the words. Punctuation is considered part of the preceding word. For example,

"For example, we might have to typeset this sentence. Optimally."

we would have w[1] = 3, w[2] = 8 (including the ","), w[3] = 2, ... w[10] = 10 (including the final period). You are also given a maximum number k of symbols in a line. For k = 15, the earlier example could be written as

```For example, we     // 15 symbols
might               // 5 symbols
have to typeset     // 15 symbols
this sentence.      // 14 symbols
Optimally.          // 10 symbols
```

(with symbol count at the end). For this exercise we assume a typewriter-like font, i.e. the font is mono-spaced: every symbol has the same width. The particular typesetting shown for the example is not very good. Better would be:

```For example, we     // 15 symbols
might have to       // 13 symbols
typeset this        // 12 symbols
sentence.           // 9 symbols
Optimally.          // 10 symbols
```

Still, not very nice, but with a width of at most 15 letters, we might not be able to do much better. How do we measure the badness of a solution: we sum up the cubes of the number of letters missing to a full line (excluding the last line of the paragraph; last lines may be short); so the first example has badness 03 + 103 + 03 + 13 = 1001. The second, in comparison, had badness 03 + 23 + 33 + 63 = 251, which is much better.

a) [25pt] Write an algorithm in pseudocode (and include an explanation of your algorithm) that computes the minimal badness of a paragraph of text, given w[1..n] and k. Be careful in dealing with spaces.

b) [5pt] What is the running time of your algorithm?

c) [15pt] Implement your algorithm and modify it so it actually produces the optimal layout. Test it with one or two texts, include printouts.

2. [LCS, 10pt] Using the LCS algorithm, find the longest common subsequence of "SLWOVNNDK" and "ALWGQVNBBK". Compute the full table.

3. [Edit-Distance, 15pt] One notion of edit distance between two strings is defined as follows: given a string you can perform the following operations on it: (D)elete a letter, and (I)nsert a letter. The edit distance is the smallest number of operations that turns one word into the other word. For example,

```ALGORITHM (delete 2nd letter)
AGORITHM  (delete 8th letter)
AGORITH   (delete 7th letter)
AORITH    (delete 2nd letter)
AORTH     (delete 4th letter)
AORTHA    (insert A after 5th letter)
AORTA     (delete 5th letter)
```

Showing that the edit distance between ALGORITHM and AORTA is at most 6. (Can you make do with less operations?) Show that with the edit distance ED(x,y) of two words x and y defined like this it is true that

n + m - 2 LCS(x,y) = ED(x,y)

where n = |x| and m = |y|. For example, ED(ALGORITHM, AORTA) = 6, while LCS(ALGORITHM, AORTA) = 4, n = 9, m = 5. So 9 + 5 - 2*4 = 6. Hint: show <= and >= separately.

4. [Floyd-Warshall, 20pt] Trace the Floyd-Warshall algorithm for the weighted digraph G = (V,E, w), where V = {1,2,3,4} and E = {12, 13, 14, 21, 24, 31, 34, 42, 43} and w(12) = 2, w(13) = 4, w(14) = 3, w(21) = 3, w(24) = 3, w(31) = 5, w(34) = -3, w(42) = -1, w(43) = 4. Edges that are not present are of infinite weight.

5. [Optimal binary search tree, 15pt] Complete the example on the handout. (Double-check that the values I computed are correct.)

6. [Extra Credit] The farthest distance problem from the last homework is open as extra credit.

Marcus Schaefer
Last updated: October 2nd, 2006.