Homework 6 (due 5/19)
CSC 321



We've talked about dynamic programming, starting with Fibonacci sequences (Section 8.1), Longest Common Subsequences (Section 8.4),  and the Optimal Sequence Alignment Problem. Next week, we will talk about Approximate Pattern Matching (Section 9.5), and Coin changing, both greedy (Section 7.1) and dynamic programming.



Submission
: Submit to the d2l dropbox. Use some standard format for your homework (Word or pdf, for example), and make sure to include your name and assignment number. Include screenshots for code and testruns. You do not need to include the code (or any executables).

1. (Reading Assignment) Read Sections 8.1, 8.4, and the wikipedia entry on Optimal Sequence Alignment Problem (in class we described it as a minimization problem D(i,j), here it is described as a maximization problem F(i,j), the code is nearly the same, except the max in the pseudocode would be a min for us, and for us a high number is a large penality, whereas in the online presentation a low number signals bad alignment).

2. (Binomial Coefficient, Done Right, 17pt) In your discrete math classes you should have encountered the binomial coefficient, typically written and defined as:

 

(the symbol on the left is read "n choose k", the exclamation mark is the factorial function). To avoid awkward typesetting, I'll write it as a function B(n,k). It counts the number of ways you can choose k elements out of a set of n (distinct) items. E.g., B(n,2) is the number of (unordered) pairs of n items, or, in other words, the number of edges in a complete graph Kn.

a) [2pt] Implement B(n,k) using the factorial definition. Do you see any issues with this way of calculating the binomial coefficient? Hint: try values of k which are very small, or very close to n. What happens in those cases? If you're using Python, use "from math import factorial" to get the factorial function.

b) [3pt] A better definition of B(n,k) is the recursive one: B(n, 0) = 1 for all n, and B(n,k) = B(n-1,k-1) + B(n-1,k), where we assume n >= k (otherwise, B(n,k) = 0). Implement a recursive function for B(n,k) using this definition. Do you see any issues with this way of calculating the binomial coefficient? Hint: when do the running times start blowing up?

c) [8pt] Implement a dynamic program that calculates B(n,k) based on the recursive definition in b), and see whether it can deal with the inputs that caused trouble in a) and b). Hint: this is a two-dimensional problem, so you'll need a two-dimensional table.

d) [2pt] What is the asymptotic running time of your program in c)? (Express in terms of n and k.)

e) [2pt] How much space does your program in c) use? (Express in terms of n and k). Can this be improved?

3. (Word Chain, 18pt) A word chain is a sequence of words in which the last letter of a word is the same as the first letter of the next word, for example, ['hello', 'oracle', 'exit', 'train', 'night'].

You are given a list of words, and find a longest subsequence of that list which is a word chaing. For example, you may have the list

lst = ['dowser', 'rogue', 'buddy', 'eel', 'yarn', 'elven', 'rocky', 'spoonerisms', 'premise', 'conflicts', 'compacter', 'backwood', 'rubbly', 'groat', 'nights', 'swapped', 'eagle', 'dilapidated', 'break', 'simple']

In that list, there is a six word chain, namely ['dowser', 'rogue', 'elven', 'nights', 'swapped', 'dilapidated'].

We want to develop, and implement a dynamic program that finds the length of such a longest word chain.

a) [5pt] Here are two possible choices for D[i], one good, one bad:

D1[i] = length of the longest word chain in the first i words of lst (lst[0], lst[1], ..., lst[i-1]).

D2[i] = length of the longest word chain in the first i words of lst (lst[0], lst[1], ..., lst[i-1]) which ends at lst[i-1].

Using the list lst above, compute, by hand, a couple of the values of D1[i], and D2[i], for small i, sufficiently many so you can answer the following question:

b) [5pt] With your choice of D from a), write down the recursive definition of D (including the initial values). Verify with the values you hand-computed in a) that the definition is correct.

c) [5pt] Implement your dynamic program from b) so that it takes as input a list, and as output returns the length of a longest wordchain within that list. You can test with my (randomly generated) word lists (solutions in comments).

d) [2pt] What is the asymptotic running time of your dynamic program in the parameter n = len(lst)?

e) [1pt] How much space does your dynamic program use (express in the parameter n = len(lst))?

f) [Extra Credit, 6pt] Extend the program from c) so it actually returns a longest word chain, not just its length.


Marcus Schaefer
Last updated: May 14th, 2015.