Homework 3 (due 1/29)
CSC 233



We finished talking about monoalphabetic substitution ciphers, including Sukhotin's algorithm. We then took a more mathematical look at the shift cipher and started looking at an extension of it, the affine cipher. Next week, we will continue with the affine cipher (Section 2.12) and move into the Renaissance, when nomenclators (combinations of code and cipher) were popular and polyalphabetic substitution ciphers were invented. The class examples are all available on d2l, as are the handouts and spreadsheets we used.
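As a reminder of the mathematical view of these two ciphers, here is a minimal sketch in Python. It assumes the numbering A=0, ..., Z=25 and the encryption rule y = (a*x + b) mod 26; the book or the class handouts may number the letters differently. The function names are hypothetical. A shift cipher is the special case a = 1.

```python
from math import gcd

def affine_encrypt(plaintext, a, b):
    """Encrypt with y = (a*x + b) mod 26, assuming A=0, ..., Z=25."""
    if gcd(a, 26) != 1:
        raise ValueError("a must be coprime to 26, otherwise decryption is ambiguous")
    out = []
    for ch in plaintext.upper():
        if "A" <= ch <= "Z":
            x = ord(ch) - ord("A")
            out.append(chr((a * x + b) % 26 + ord("A")))
        else:
            out.append(ch)  # leave spaces and punctuation unchanged
    return "".join(out)

def affine_decrypt(ciphertext, a, b):
    """Decrypt with x = a^(-1) * (y - b) mod 26."""
    a_inv = pow(a, -1, 26)  # modular inverse; needs Python 3.8 or later
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            y = ord(ch) - ord("A")
            out.append(chr((a_inv * (y - b)) % 26 + ord("A")))
        else:
            out.append(ch)
    return "".join(out)

print(affine_encrypt("HELLO", 1, 3))  # shift by 3: KHOOR
print(affine_encrypt("HELLO", 5, 8))  # RCLLA
```

The check on gcd(a, 26) reflects the fact that a multiplier sharing a factor with 26 maps different plaintext letters to the same ciphertext letter.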

Things I mentioned in class, or should have mentioned:


Submission: The homework is due by midnight (I will not accept late homework). You can submit it through d2l into the drop-box for this homework.

Please prepare your homework as a single file containing all answers (e.g. doc, docx, or pdf, not a zip file). For an example, see hwexample.docx. How to take screenshots? Check out screenshots for Mac, Windows, Linux.


1. (Reading Assignment) Finish reading Chapter 2 of the book (Sections 2.7-2.17). We did not (and most likely will not) cover material from Sections 2.13 and 2.17, so you can skip these if you want, but if you have the time, I recommend both of them. If you want to read ahead, continue with Sections 3.1-3.3.

2. (Recognizing languages, 10pt) You intercepted the three ciphertexts a) through c) below. You know that the original plaintexts are in English, German and French; indeed, you know that they are excerpts from Jules Verne's Voyage au centre de la terre, Mark Twain's Connecticut Yankee, and Johann Goethe's Leiden des jungen Werther. You also know that the ciphers used were simple transposition ciphers. For each ciphertext, determine the source (Verne, Twain, Goethe) by determining which language the underlying plaintext is in, without breaking the cipher. In particular, you do not have to find out which particular cryptosystem was used.

You cannot assume that there is an example in each language, so for each of the following cases, make your best and most convincing argument why it is in the language you determined, and not in the other two. Argue with more than just one or two letter frequencies.

a) "ENTUV LNCMN VRMNM TDEEG SDMNN CERAE RITTC MEONT RISCM UEEAI REAZC TTNNN RLNAI EUENA HURIN AAOIL RKENH NLUTT DRIIG AVRNU DSAGE ZUDIS ADEDD SKANE MIGOI AEAII NEEOH NTZHR DZIZN ZTNHA EGDNT GNNFN EEIEE ZEEFE MASRN SNRAS SAERC ERSEM CHAES HAHER UNEEE CEGWS NNMBS NITLF NEHKI HRPSN SRANN URNEU RNDIH TTAZI KFIMM HELNH OADAF IRIAG NEMUD HOLIE GSCNT NEOIO HESNU NSCIV NENRE HTTKN HHCLG ECLIN FDESI NSNHA TDIAG EESIL FUREG THUSS IPEHI TZEEI ZDUEE ESEGG WUUCR ICILD CVEER TMUDL NUVON DCREG BESHE DAHSN DMVTE WETSC LUREB EAWIM SMNEI DHEIE ETAEU EEAEF ERZOD CUNRM UTESN EREER NSTRA GZECR ZIHOI"

b) "IEEEO ATEEN EIRAF AUTST AHSUE NISAE OTQLL EUSNC NQELN EIMVA LELMS SESDO ANSAE VBETA EDSMS LLENM EDEII RNMAJ RSYUP RLSSN LTECE CESUE TALEB OLRSU SUSSA EASND UGCEE ERNVR ETTQL MRIIN MBSAS UCORA LMOSN EAEEU ATIUT SUOUC STATA PEEYN EAETE AAEUU ICDRN LNSRU TLITL ERVUL SOSLH SIASI DGNEN LUEET UATET SIEOP DLEET CEMST UBONA EGROV LSXAC SMISP ISOMR SPRCU UIUUU LDEME DSDMT PAEEN EEVLT ARNSM DPUIE ASATM IENIE ECEBU ONTON RSVNF ATEIS AIDOE EDLCA NROTD URICI LRONA EUAOE UEELI AOEEN UEQQE IEACV RBELS OATAA RDLOD INSOA EETES AUTUE TMNEO ILATS EDUEJ UAAEM AAHMD IDEEE EISSE PEEOE PSILD LHOII TAESV ENTNA GDETN TSFAY SDAPD ECULN ELEED BENEC PSESO TETLI RS"

c) "DRCII LONTG ADCUN HENLO UITNL HDATU IMSHL EDCLE VMLII ENERU ZOIHA CAGCI LGIET UPERH ECSDH AZICD NELSS FMERI TBEHI NEHLC ETAFB SRCSS RHSUI NLEEM FLMRR ELHAU ELREI IERET ARUIE CSETU WLIAI HLETE NTEHS EBEEN ESEFM HOLET UDGSH NSSEB HRGDI GHADN HKLRT MSASL WTERR UIDHG ULHWG DBEUU NOEIN EZENO TIEDH ECDAU DENGB ECELI IGESR SHMGM EEEIS EABCI NDAIF NIMIC ISTNU UECGI SRUAS STTES OBENO CBOTS IECER UTRSE ENGSI EEEHN RDNEW HEBUN ONTHS EEBAS UOFRG TUSHT MIHNT SUTGN EERML SSTSR ZOA"

Hint: think frequency analysis; you can use the substitution cipher spreadsheet to perform frequency analysis. I've uploaded an updated version which contains frequencies for French and German. (There's also the frequency analysis tool from the last homework.) You can also use Wikipedia's frequency distribution page, which has frequency counts in many languages and allows you to sort by both letter and frequency (for each language separately). Look for signal letters that help you tell two languages apart.
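A transposition cipher only rearranges the letters of the plaintext, so the letter frequencies of the ciphertext are exactly those of the plaintext. If you would rather count letters with a short script than with the spreadsheet, the following Python sketch shows one way to do it. The function name is hypothetical, and the sample is just the first three groups of ciphertext a); paste in a full ciphertext to get usable numbers.

```python
from collections import Counter

def letter_frequencies(ciphertext):
    """Return {letter: percentage} over A-Z, most frequent letter first."""
    letters = [ch for ch in ciphertext.upper() if "A" <= ch <= "Z"]
    total = len(letters)
    if total == 0:
        return {}
    counts = Counter(letters)
    return {letter: 100 * n / total for letter, n in counts.most_common()}

sample = "ENTUV LNCMN VRMNM"  # first three groups of ciphertext a)
for letter, pct in letter_frequencies(sample).items():
    print(f"{letter}: {pct:.1f}%")
```

Compare the resulting percentages against the reference frequencies for English, German and French, and base your argument on several letters rather than one.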

3. (Substitution Cipher, 20pt) Your goal is to break a substitution cipher. Enter your student ID below (if your ID starts with a 0, include the 0) and hit return to get your ciphertext (based on text from a 19th-century novel). The ciphertext will be different each time you submit the form (since I use a random substitution), but your plaintext does not change. Attack the cipher with the tools we saw and mentioned in class. You can do this by hand on paper, or, more easily, using the substitution cipher spreadsheet we used in class.

Student ID:
When writing up your solution, include details about how you proceeded, including what decisions you made, why you made them, and, if necessary, why you went back on an earlier decision.

Include intermediary snapshots of partially restored plaintext as you reconstruct it.

Note: do submit your solution even if you don't manage to break the cipher entirely. There will be partial credit.
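If you work outside the spreadsheet, a small helper like the following Python sketch can produce the snapshots of partially restored plaintext. It is only an illustration: the function name, the dash for unsolved letters, the convention of uppercase ciphertext and lowercase plaintext, and the made-up example ciphertext are all assumptions, not part of the class tools.

```python
def apply_partial_key(ciphertext, key, unknown="-"):
    """Replace each ciphertext letter by its guessed plaintext letter.

    key maps uppercase ciphertext letters to lowercase plaintext guesses;
    letters without a guess are shown as `unknown`.
    """
    out = []
    for ch in ciphertext.upper():
        if "A" <= ch <= "Z":
            out.append(key.get(ch, unknown))
        else:
            out.append(ch)  # keep spaces and punctuation
    return "".join(out)

# Made-up ciphertext with two guesses so far: X -> t, Q -> h
print(apply_partial_key("XQZ XQBX", {"X": "t", "Q": "h"}))  # th- th-t
```

Printing the text after each new guess (or each retracted guess) gives you a record of your decisions for the write-up.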

4. (Sukhotin's Algorithm, 10pt) Run Sukhotin's algorithm on the ciphertext

'KUJLVKNMXURKJMCI'

to identify the 5 ciphertext letters corresponding to vowels in the plaintext. Do this by hand on paper, or, as we did in class, using a text editor. You can use the spreadsheets or templates to double-check your work if you want, but your primary work needs to be done without those. Document each step of how the matrix develops, and which letter or letters you identify as vowels in each step.

Separately, and explicitly, state the list of vowel letters you found.
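For double-checking only (your primary work still needs to be done by hand), here is a minimal Python sketch of Sukhotin's algorithm as it is usually described: build a symmetric matrix of adjacent-letter counts with a zero diagonal, start with every letter as a consonant, and repeatedly move the consonant with the largest positive row sum to the vowels, subtracting twice its column from the remaining sums. The function name and the `max_vowels` parameter are hypothetical, and ties are broken alphabetically here, which may differ from the convention used in class.

```python
from collections import defaultdict

def sukhotin_vowels(text, max_vowels=None):
    """Return the letters classified as vowels, in the order they were found."""
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]

    # Symmetric adjacency counts; pairs of identical letters are ignored,
    # which keeps the diagonal at zero.
    counts = defaultdict(int)
    for x, y in zip(letters, letters[1:]):
        if x != y:
            counts[(x, y)] += 1
            counts[(y, x)] += 1

    alphabet = sorted(set(letters))
    sums = {x: sum(counts[(x, y)] for y in alphabet) for x in alphabet}

    vowels = []
    consonants = set(alphabet)
    while consonants and (max_vowels is None or len(vowels) < max_vowels):
        # Largest row sum among consonants; alphabetical order breaks ties.
        best = max(sorted(consonants), key=lambda x: sums[x])
        if sums[best] <= 0:
            break  # no consonant with a positive sum is left
        vowels.append(best)
        consonants.remove(best)
        for c in consonants:
            sums[c] -= 2 * counts[(c, best)]
    return vowels

print(sukhotin_vowels("banana"))  # ['A']
```

In the "banana" example the initial row sums are A=5, B=1, N=4; after A is chosen, the sums for B and N drop to -1 and -4, so the algorithm stops with A as the only vowel.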

5. [Extra Credit, 10pt] Decrypt the second ciphertext from the Sukhotin worksheet. Include details of your process (as described earlier):

MGIHRAUCS UB R LDCNDMBBUTM GUBHCTMDP CK CID COS UNSCDRSHM


Marcus Schaefer
Last updated: January 23rd, 2020