
CSC 241 Lab 5

Submission: Submit the Python or text file that contains all your answers to the submission folder for this lab by the end of the lab. To simplify the work of Drew (our tutor), please include your name and the number of the lab at the top of the file (if you're submitting a Python file, use comments: start the line with a #).

1. (Files)

1. Download a book from Project Gutenberg (http://www.gutenberg.org/) in Plain Text format (and save it with extension .txt). (In class we used innocents.txt, and I'll use it for the sample runs, but find a new book for this problem.)

2. Write a function empty(fname) that counts how many of the lines in file fname are empty.
Hint: the length of an empty line is 1 (the '\n' character).
Hint: start with one of the occurs() functions, which list all lines containing a word.

3. Write a morality checker for books. The morality checker is a function moral(fname) that takes as input the name of a file and compares how often the file contains the words 'good' and 'evil' (upper or lower case). If there are more mentions of good, the book is moral; otherwise it is immoral. Here are some sample runs I did with some books from Project Gutenberg.

2. (Files and Searching) In class we worked with the file GMATLarge.csv, which lists fictitious GMAT scores for fictitious students by last name. Solve the following problems for this file:
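Looking back at problem 1 for a moment: the hint about the length of an empty line, and the upper/lower case remark for 'good' and 'evil', can be tried out with a small sketch like the one below. The function names count_empty_lines and count_word are made up for this illustration (they are not the functions from class), and matching whole words with punctuation stripped is only one possible reading of "contains the word".

```python
import string


def count_empty_lines(fname):
    """Return how many lines of the file hold only the newline character."""
    count = 0
    with open(fname) as infile:
        for line in infile:
            # An empty line is read as '\n', so its length is 1.
            # Comparing with '\n' (instead of len(line) == 1) avoids
            # miscounting a one-character last line with no newline.
            if line == '\n':
                count += 1
    return count


def count_word(fname, word):
    """Count occurrences of word as a whole word, ignoring case.

    ASSUMPTION: 'good,' and 'Good.' count, 'goodness' does not.
    """
    target = word.lower()
    count = 0
    with open(fname) as infile:
        for line in infile:
            for token in line.lower().split():
                # Strip surrounding punctuation before comparing.
                if token.strip(string.punctuation) == target:
                    count += 1
    return count
```

Comparing count_word(fname, 'good') with count_word(fname, 'evil') then gives the kind of comparison problem 1 asks for. A plain substring count (line.lower().count(word)) would be simpler but would also count words such as 'goodness'.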

a) Write a function percentile(myscore) which calculates your percentile in the whole population. What's the percentile? It's the percentage of people who have a score equal to or lower than yours, so the top score is at the 100th percentile. Some sample runs below. E.g. this means that (in our sample) only 4% of the students score higher than 750. Again, this is a modification of a program we wrote in class.

b) Write a function average(prefix) which calculates the average score of all students whose last name starts with prefix. You can start with the program you wrote for a), or the score() function we saw in class. Sample runs:
Hint: there are various ways of doing it. You could accumulate the scores in a list. You can also accumulate them in a number, but then you also need to count the number of scores.
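The percentile definition in a) and the "sum plus count" hint in b) can be sketched as follows. This is only an illustration under assumptions: the helper names read_scores, percentile_of and average_for_prefix are hypothetical, and the file layout (one row per student of the form last name, comma, score, with no header row) is a guess about GMATLarge.csv, so adjust it to the actual file.

```python
import csv


def read_scores(fname):
    """Return a list of (last_name, score) pairs.

    ASSUMPTION: every row looks like  Smith,640  and there is no header.
    """
    pairs = []
    with open(fname, newline='') as infile:
        for row in csv.reader(infile):
            if len(row) >= 2:
                pairs.append((row[0].strip(), int(row[1])))
    return pairs


def percentile_of(pairs, myscore):
    """Percentage of scores equal to or lower than myscore.

    Assumes pairs is not empty.
    """
    at_or_below = 0
    for name, score in pairs:
        if score <= myscore:
            at_or_below += 1
    return 100 * at_or_below / len(pairs)


def average_for_prefix(pairs, prefix):
    """Average score of students whose last name starts with prefix.

    Accumulates a running total and a count, as in the hint.
    Returns None if no name matches.
    """
    total = 0
    count = 0
    for name, score in pairs:
        if name.startswith(prefix):
            total += score
            count += 1
    if count == 0:
        return None
    return total / count
```

With four made-up students scoring 500, 600, 700 and 800, a score of 700 has three of the four scores at or below it, so percentile_of gives 75.0, and the top score of 800 gives 100.0.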

3. (All Words) Implement a function occurs_all(fname, words) which prints all the lines in file fname which contain all the words in the list words.
Hint: In class we implemented something similar: a function occur(fname, words) which lists all the lines that contain any of the words. Start by reading and understanding the two functions involved in that. Once you understand that code, modify it so that it solves the occurs_all version.

4. (Extra Credit, only if you have time, 3EC) Modify 3) so that instead of printing the lines to the screen, they get written to a file (named using the name of the original file).
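The step from "any of the words" to "all of the words" in problem 3, and the switch from printing to writing in problem 4, might look roughly like the sketch below. It does not reproduce the code from class; the names contains_all, lines_with_all and write_matches are hypothetical, the test is a simple substring test, and the output file name (the original name with '.matches.txt' appended) is just one possible naming scheme.

```python
def contains_all(line, words):
    """True if every word in words occurs somewhere in line."""
    # all() stops at the first missing word; the "any" version
    # would use any() here instead.
    return all(word in line for word in words)


def lines_with_all(fname, words):
    """Return the lines of the file that contain all the words."""
    matches = []
    with open(fname) as infile:
        for line in infile:
            if contains_all(line, words):
                matches.append(line)
    return matches


def write_matches(fname, words):
    """Write the matching lines to a new file and return its name."""
    outname = fname + '.matches.txt'  # hypothetical naming scheme
    with open(outname, 'w') as outfile:
        # The lines still end in '\n', so they can be written as they are.
        outfile.writelines(lines_with_all(fname, words))
    return outname
```

To print instead of write, loop over lines_with_all(fname, words) and call print(line, end='') so the newline already at the end of each line is not doubled.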

Marcus Schaefer
Last updated: April 29th, 2019.