1. (Reading Assignment) Read Sections 8.1, 8.4, and the wikipedia entry on Optimal Sequence Alignment Problem (in class we described it as a minimization problem D(i,j), here it is described as a maximization problem F(i,j), the code is nearly the same, except the max in the pseudocode would be a min for us, and for us a high number is a large penality, whereas in the online presentation a low number signals bad alignment).
2. (Binomial Coefficient, Done Right, 17pt) In your discrete math classes you should have encountered the binomial coefficient, typically written and defined as:
(the symbol on the left is read "n choose k", the exclamation mark is the factorial function). To avoid awkward typesetting, I'll write it as a function B(n,k). It counts the number of ways you can choose k elements out of a set of n (distinct) items. E.g., B(n,2) is the number of (unordered) pairs of n items, or, in other words, the number of edges in a complete graph Kn.
a) [2pt] Implement B(n,k) using the factorial definition. Do you see any issues with this way of calculating the binomial coefficient? Hint: try values of k which are very small, or very close to n. What happens in those cases? If you're using Python, use "from math import factorial" to get the factorial function.
b) [3pt] A better definition of B(n,k) is the recursive one: B(n, 0) = 1 for all n, and B(n,k) = B(n-1,k-1) + B(n-1,k), where we assume n >= k (otherwise, B(n,k) = 0). Implement a recursive function for B(n,k) using this definition. Do you see any issues with this way of calculating the binomial coefficient? Hint: when do the running times start blowing up?
c) [8pt] Implement a dynamic program that calculates B(n,k) based on the recursive definition in b), and see whether it can deal with the inputs that caused trouble in a) and b). Hint: this is a two-dimensional problem, so you'll need a two-dimensional table.
d) [2pt] What is the asymptotic running time of your program in c)? (Express in terms of n and k.)
e) [2pt] How much space does your program in c) use? (Express in terms of n and k). Can this be improved?
3. (Word Chain, 18pt) A word chain is a sequence of words in which the last letter of a word is the same as the first letter of the next word, for example, ['hello', 'oracle', 'exit', 'train', 'night'].
You are given a list of words, and find a longest subsequence of that list which is a word chaing. For example, you may have the list
lst = ['dowser', 'rogue', 'buddy', 'eel', 'yarn', 'elven', 'rocky', 'spoonerisms', 'premise', 'conflicts', 'compacter', 'backwood', 'rubbly', 'groat', 'nights', 'swapped', 'eagle', 'dilapidated', 'break', 'simple']
In that list, there is a six word chain, namely ['dowser', 'rogue', 'elven', 'nights', 'swapped', 'dilapidated'].
We want to develop, and implement a dynamic program that finds the length of such a longest word chain.
a) [5pt] Here are two possible choices for D[i], one good, one bad:
D1[i] = length of the longest word chain in the first i words of lst (lst, lst, ..., lst[i-1]).
D2[i] = length of the longest word chain in the first i words of lst (lst, lst, ..., lst[i-1]) which ends at lst[i-1].
Using the list lst above, compute, by hand, a couple of the values of D1[i], and D2[i], for small i, sufficiently many so you can answer the following question:
b) [5pt] With your choice of D from a), write down the recursive definition of D (including the initial values). Verify with the values you hand-computed in a) that the definition is correct.
c) [5pt] Implement your dynamic program from b) so that it takes as input a list, and as output returns the length of a longest wordchain within that list. You can test with my (randomly generated) word lists (solutions in comments).
d) [2pt] What is the asymptotic running time of your dynamic program in the parameter n = len(lst)?
e) [1pt] How much space does your dynamic program use (express in the parameter n = len(lst))?
f) [Extra Credit, 6pt] Extend the program from c) so it actually returns a longest word chain, not just its length.