CSC 321

We've talked about dynamic programming, starting with Fibonacci sequences (Section 8.1), Longest Common Subsequences (Section 8.4), and the Optimal Sequence Alignment Problem. Next week, we will talk about Approximate Pattern Matching (Section 9.5), and Coin changing, both greedy (Section 7.1) and dynamic programming.

Submission

1. (Reading Assignment) Read Sections 8.1, 8.4, and the wikipedia entry on Optimal Sequence Alignment Problem (in class we described it as a minimization problem D(i,j), here it is described as a maximization problem F(i,j), the code is nearly the same, except the max in the pseudocode would be a min for us, and for us a high number is a large penality, whereas in the online presentation a low number signals bad alignment).

2. (Binomial Coefficient, Done Right, 17pt) In your discrete math classes you should have encountered the binomial coefficient, typically written and defined as:

(the symbol on the left is read "n choose k", the exclamation mark is the
factorial function). To avoid awkward typesetting, I'll write it as a
function B(n,k). It counts the number of ways you can choose k elements out
of a set of n (distinct) items. E.g., B(n,2) is the number of (unordered)
pairs of n items, or, in other words, the number of edges in a complete
graph K_{n}.

a) [2pt] Implement B(n,k) using the factorial definition. Do you see any
issues with this way of calculating the binomial coefficient? *Hint*:
try values of k which are very small, or very close to n. What happens in
those cases? If you're using Python, use "from math import factorial" to get
the factorial function.

b) [3pt] A better definition of B(n,k) is the recursive one: B(n, 0) = 1
for all n, and B(n,k) = B(n-1,k-1) + B(n-1,k), where we assume n >= k
(otherwise, B(n,k) = 0). Implement a recursive function for B(n,k) using
this definition. Do you see any issues with this way of calculating the
binomial coefficient? *Hint*: when do the running times start blowing
up?

c) [8pt] Implement a dynamic program that calculates B(n,k) based on the
recursive definition in b), and see whether it can deal with the inputs that
caused trouble in a) and b). *Hint:* this is a two-dimensional
problem, so you'll need a two-dimensional table.

d) [2pt] What is the asymptotic running time of your program in c)? (Express in terms of n and k.)

e) [2pt] How much space does your program in c) use? (Express in terms of n and k). Can this be improved?

3. (Word Chain, 18pt) A word chain is a sequence of words in which the last letter of a word is the same as the first letter of the next word, for example, ['hello', 'oracle', 'exit', 'train', 'night'].

You are given a list of words, and find a longest subsequence of that list which is a word chaing. For example, you may have the list

lst = ['dowser', 'rogue', 'buddy', 'eel', 'yarn', 'elven', 'rocky', 'spoonerisms', 'premise', 'conflicts', 'compacter', 'backwood', 'rubbly', 'groat', 'nights', 'swapped', 'eagle', 'dilapidated', 'break', 'simple']

In that list, there is a six word chain, namely ['dowser', 'rogue', 'elven', 'nights', 'swapped', 'dilapidated'].

We want to develop, and implement a dynamic program that finds the length of such a longest word chain.

a) [5pt] Here are two possible choices for D[i], one good, one bad:

D1[i] = length of the longest word chain in the first i words of lst (lst[0], lst[1], ..., lst[i-1]).

D2[i] = length of the longest word chain in the first i words of lst (lst[0], lst[1], ..., lst[i-1]) which ends at lst[i-1].

Using the list lst above, compute, by hand, a couple of the values of D1[i], and D2[i], for small i, sufficiently many so you can answer the following question:

- Which of D1/D2 (possibly both) allows you to determine the length of the longest word chain in lst?
- Which of D1/D2 (possible both) can be calculated dynamically (i.e. has an easy recursive definition).

b) [5pt] With your choice of D from a), write down the recursive definition of D (including the initial values). Verify with the values you hand-computed in a) that the definition is correct.

- if you had all the values of D1, can you determine the length of the longest word chain?
- if you had all the values of D2, can you determine the length of the longest word chain?

c) [5pt] Implement your dynamic program from b) so that it takes as input a list, and as output returns the length of a longest wordchain within that list. You can test with my (randomly generated) word lists (solutions in comments).

d) [2pt] What is the asymptotic running time of your dynamic program in the parameter n = len(lst)?

e) [1pt] How much space does your dynamic program use (express in the parameter n = len(lst))?

f) [Extra Credit, 6pt] Extend the program from c) so it actually returns a longest word chain, not just its length.

Marcus Schaefer

Last updated: May 14th, 2015.