We saw more examples of loops and strings, and talked about formatted output (Sections 4.1 and 4.2) and file input and output (Section 4.3). We also practiced using loops for processing lists and strings (Section 5.1 covers some of that). We'll continue with exceptions (Section 4.4) and programming patterns (Chapter 5).
If you are working with text files on a Mac, do the following: add
import codecs
at the top of your program, and then use codecs.open instead of open:
infile = codecs.open(fname, 'r', 'UTF-8')
That should work.
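As a minimal sketch of how this fits into a program, the following reads a whole file with codecs.open. The helper name read_text and the temporary sample file are made up for this illustration; they are not part of the homework.

```python
import codecs
import os
import tempfile

def read_text(fname):
    """Return the full contents of fname, decoded as UTF-8."""
    infile = codecs.open(fname, 'r', 'UTF-8')
    text = infile.read()
    infile.close()
    return text

# Demo: create a small UTF-8 file (hypothetical content) and read it back.
path = os.path.join(tempfile.mkdtemp(), 'sample.txt')
outfile = codecs.open(path, 'w', 'UTF-8')
outfile.write('caf\u00e9 na\u00efve')
outfile.close()

text = read_text(path)
print(len(text))  # counts characters, not bytes
```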
Submission: The homework is due by midnight (I will not accept late homeworks). You can submit your homework through d2l into the drop-box for this homework. Please prepare your homework as a single file containing all answers (e.g. doc, docx, or pdf, not a zip file).
When submitting programs, include the programs, together with screenshots of test-runs of your program (make sure screenshots are sized so the text is legible; resize and/or crop the images).
1. (Reading Assignment) Read Sections 4.1, 4.2, and 4.3, as well as 5.1 and 5.2 (up to page 137). If you want to read ahead, check out Section 4.4 and continue with chapter 5.
2. (Strings, 10pt) We talked about title case in class (typesetting a text as the title of a book or a section in a book). We now want to implement a slightly fuller set of rules:
The first and last words in the text are always upper case.
The remaining words are upper case, unless they
have one or two letters,
or are on the following list of common words: the, and, but, for.
Write a function title(s) which implements these rules. Below are some test runs. Note that your title function should work even if the initial s has weird upper/lower case spelling. Hint: initially, to simplify, assume that s is all lower case. What to do if it's not? Hint 2: Implement the conditions one at a time, starting with the title code we wrote earlier.
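One possible way to put the rules together is sketched below. It assumes that "upper case" means capitalizing the first letter of a word, and that words are separated by whitespace; the name COMMON for the word list is made up for this sketch. This is not necessarily how the title code from class was written.

```python
COMMON = ['the', 'and', 'but', 'for']  # common words that stay lower case

def title(s):
    # Normalize weird upper/lower case spelling first.
    words = s.lower().split()
    result = []
    for i in range(len(words)):
        w = words[i]
        first_or_last = (i == 0 or i == len(words) - 1)
        # Capitalize the first and last word always; other words only if
        # they have more than two letters and are not on the common list.
        if first_or_last or (len(w) > 2 and w not in COMMON):
            w = w.capitalize()
        result.append(w)
    return ' '.join(result)

print(title('tHe lord OF the rings'))
```

Note that split() followed by join() collapses runs of blanks into a single blank, which is usually what you want for a title.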
3. (Strings and Loops, 15pt) One of the basic rules of writing (well, some people think) is that small numbers should be written out as words, so you should not write "I have 4 cards", but "I have four cards", and "The temperature is 0 degrees" should be "The temperature is zero degrees." We want to implement a function that converts text in this way. We'll do this in two steps.
a) [7pt] Implement a function numb_word(s) that takes as an argument a string s (representing a word). If s contains an integer number between 0 and 12, then numb_word should return the English name of that number. For any other value, simply return s itself, unchanged. To test whether a string s consists of digits only, you can use s.isnumeric(). Hint: do not use an if/elif cascade, use a list.
b) [8pt] Implement a function edit(s) that does the following: it takes the string s and replaces all occurrences of numbers n with 0 <= n <= 12 with the corresponding English words. You can assume that your text does not contain any punctuation. Hint: use a loop, and use your function from a) to process one word at a time.
Hint: if you did not get part a) to work, simply write a function numb_word(s) that always returns 'one', so you can work on part b).
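The two hints (a list instead of an if/elif cascade, and a loop over the words) could be combined roughly as follows. This is only a sketch: the list name NAMES is made up, and it assumes the numbers in the text are written with ordinary digits 0-9.

```python
# The position of each name in the list is the number it stands for.
NAMES = ['zero', 'one', 'two', 'three', 'four', 'five', 'six',
         'seven', 'eight', 'nine', 'ten', 'eleven', 'twelve']

def numb_word(s):
    # isnumeric() is True only if s is non-empty and all characters are
    # numeric; this sketch assumes those characters are plain digits.
    if s.isnumeric() and int(s) < len(NAMES):
        return NAMES[int(s)]
    return s

def edit(s):
    words = []
    for w in s.split():
        words.append(numb_word(w))  # process one word at a time
    return ' '.join(words)

print(edit('I have 4 cards'))
print(edit('The temperature is 0 degrees'))
```

Using the list index as the number keeps numb_word short, and extending the range later only means adding names to the list.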
4. (File reading, 15pt) We want to find long words in some text. I tested with innocents.txt; you can work with that or download your own text, but it should be reasonably long to make testing interesting. Check out https://www.gutenberg.org/ for a good source of public domain texts.
a) [6pt] Write a function prep(s) which takes a string s and replaces all punctuation symbols in s by blanks. Do not use the .replace() method for this. Instead, work through s, one character at a time, and if the character is punctuation, use a blank instead. Use from string import punctuation to get the string punctuation, which contains all Python punctuation characters.
b) [9pt] Write a function long(fname, k) which lists all words of length at least k in fname, following the format used in the following sample runs. For the output use formatted output (i.e. the format function). After reading the text from the file, use the prep() function from part a) to strip the text of all punctuation (otherwise you'll get some spurious long words, including hyphens, say).
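A rough sketch of how the two parts could fit together is shown below. The output layout (word left-aligned in 20 columns, followed by its length) is only a placeholder and not the format from the sample runs. Returning the list of words is an extra added here to make testing easier, and the sample file is made up.

```python
import codecs
import os
import tempfile
from string import punctuation

def prep(s):
    result = ''
    for ch in s:  # work through s one character at a time
        if ch in punctuation:
            result += ' '
        else:
            result += ch
    return result

def long(fname, k):
    infile = codecs.open(fname, 'r', 'UTF-8')
    text = infile.read()
    infile.close()
    found = []
    for word in prep(text).split():
        if len(word) >= k:
            found.append(word)
            # Placeholder layout, not the assignment's required format.
            print('{0:<20}{1:>3}'.format(word, len(word)))
    return found

# Demo with a tiny made-up file.
path = os.path.join(tempfile.mkdtemp(), 'tiny.txt')
outfile = codecs.open(path, 'w', 'UTF-8')
outfile.write('The well-known, extraordinary traveller!')
outfile.close()
words = long(path, 9)
```

Without prep(), the hyphenated well-known would count as a single word of length 10 and show up in the list.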