2014-06-04 Final Project Specifications

posted Jun 4, 2014, 6:07 AM by Samuel Konstantinovich   [ updated Jun 12, 2014, 5:52 AM ]

Final Project Specifications:

Deadline: Monday June 16th. Optional workdays: June 17,18,19,20.

Do not put an index.html in any of your folders; I must be able to see all the files.

1. Store all your images in one place: (do not have duplicates)

2. Store your current version of the project in:

3. Save revisions in separate directories:  (these revisions must work. If there is an error, they do not count)
user.Name/public_html/finalproject/05/  (june 5th version)
user.Name/public_html/finalproject/06/  (june 6th version)
ONE group member must have all of the revisions. The OTHER must link me to the other person's directory.

4. Each version requires a changes.txt that tells me who did what on this version.
finalproject/06/changes.txt :
tim: allowed user to stay logged in, made links from main page to other pages.
mitt: fixed a bug in the account creation. made it so logging in erases old magic number and replaces with new one.
finalproject/05/changes.txt :
tim: generated a working login, added css to make it look decent.
mitt: generated a working main page that requires a login, formatted pages
(one file per version folder! It must be timestamped that day, or the next day. You may not fill in the changes on the last day and receive credit.)

You are required to have at least these versions (you may have more if you want!)
(07/ OR 08/), (at least one update over the weekend)
09/ through 13/ (updates every day during the week)
(14/ OR 15/), (at least one update over the weekend)
16/ through 20/ 

You can save additional versions like so:

5. In public_html/finalproject/ you need to make a file revisions.txt. It must include the contents of all of the changes.txt files in chronological order:
tim: generated a working login, added css to make it look decent.
mitt: generated a working main page that requires a login, formatted pages
tim: allowed user to stay logged in, made links from main page to other pages.
mitt: fixed a bug in the account creation. made it so logging in erases old magic number and replaces with new one.
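
One way to assemble revisions.txt is to concatenate each version folder's changes.txt in folder order. This is only a sketch: the function name build_revisions is made up for illustration, and it assumes the version folders use two-digit day names (05, 06, ...) within one month so that sorting by name is chronological. It is written in Python 3 syntax.

```python
import os

def build_revisions(project_dir):
    """Concatenate every NN/changes.txt under project_dir, in folder order."""
    # Two-digit day folders such as 05, 06, ... sort chronologically by name.
    versions = sorted(
        name for name in os.listdir(project_dir)
        if name.isdigit()
        and os.path.isfile(os.path.join(project_dir, name, "changes.txt"))
    )
    lines = []
    for name in versions:
        with open(os.path.join(project_dir, name, "changes.txt")) as f:
            # Keep non-blank lines only, without their trailing newline.
            lines.extend(line.rstrip("\n") for line in f if line.strip())
    # 'w' is deliberate here: revisions.txt is rebuilt from scratch each time.
    with open(os.path.join(project_dir, "revisions.txt"), "w") as out:
        out.write("\n".join(lines) + "\n")
    return lines
```

Running this does not replace writing each day's changes.txt on that day; it only collects what is already there.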

2014-06-02 Finalize Lab20

posted Jun 2, 2014, 5:32 AM by Samuel Konstantinovich   [ updated Jun 2, 2014, 5:32 AM ]

Lab 20 final component:

Create a text file for posts. Make it writeable. 

PostID,UserName,IP,Post Content
0,bob,,This is my first post,
1,bob,,OMG!!! Blargh!
2,dave,,This is my first post
3,bob,,If I fits - i sits.
4,dave,,Bob is dumb...

Everyone's main page has a form to add a new post to their own site. They are the only one that can see this. Just append a new line to the posts.txt file.
Make a viewProfile.py  that will allow you to view any other person's posts. There will be no form on this, just display all posts by that person.
On your main page, have a link to every other user's profile.
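
The two file operations described above (append a post, list one person's posts) might look roughly like this. The function names add_post and posts_by are invented for this sketch, and it assumes the posts file already exists, starts with the header row, and ends with a newline. It is written in Python 3 syntax.

```python
def add_post(path, user, ip, content):
    """Append one post to the posts file and return its PostID."""
    # Count existing posts; the first line of the file is the header row.
    with open(path) as f:
        post_id = max(len(f.readlines()) - 1, 0)
    # Each post must stay on one line, so flatten any newlines.
    content = content.replace("\n", " ")
    with open(path, "a") as f:  # 'a' appends instead of erasing the file
        f.write("%d,%s,%s,%s\n" % (post_id, user, ip, content))
    return post_id

def posts_by(path, user):
    """Return the content of every post written by user, oldest first."""
    result = []
    with open(path) as f:
        next(f, None)  # skip the header row
        for line in f:
            # Split on the first 3 commas only, so commas inside the
            # post content stay part of the content.
            fields = line.rstrip("\n").split(",", 3)
            if len(fields) == 4 and fields[1] == user:
                result.append(fields[3])
    return result
```

Because the post content is the last field, splitting on only the first three commas lets a post contain commas without breaking the format.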

Optional stuff:
4: Make a separate file userList.py  that allows non-logged-in users to view the list.
5: Make viewProfile.py allow non-logged-in users to view.
6. Make userlist.py and viewProfile.py 


posted May 28, 2014, 7:52 AM by Samuel Konstantinovich   [ updated May 28, 2014, 7:52 AM ]

Adding to a file:

f=open("pass.dat",'a') #'a' mode is append. You use this to add to a file.

#DO NOT use 'w' mode, or you ERASE the file and start from the beginning.

Getting the IP address of the computer connecting to your website:

import os
ip = os.environ["REMOTE_ADDR"]
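
REMOTE_ADDR is set by the web server, so os.environ["REMOTE_ADDR"] raises a KeyError when the script is run by hand from a terminal. A small sketch of a safer lookup follows; the helper name client_ip is made up here, and returning an empty string when the variable is missing is just one possible choice.

```python
import os

def client_ip(environ=os.environ):
    """Return the visitor's address, or an empty string when it is not set."""
    # .get() avoids the KeyError that environ["REMOTE_ADDR"] would raise
    # when the script is not running under a web server.
    return environ.get("REMOTE_ADDR", "")
```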


posted May 27, 2014, 6:22 AM by Samuel Konstantinovich   [ updated May 27, 2014, 6:51 AM ]


You should already have a login.py that looks at a password file. That password file should be located in your lab20 directory.

Add a new user/hash to your file: (I will use this to try to log in!)
user = konstans 
hash = 50d96e01ef03c80de481f38fe55d0955


Make a link from your login.py to create.py.

create.py will have a way to create an account using a user field, and a password field + submit button.

1. If either field is blank, you should print an error.
2. If the username has special characters in it, print an error. (Store all users as lower case; when people log in, you should convert the username to lower case.)
3. If the user name is already in your password file, print an error.
4. If the password isn't good enough print an error. Good enough includes:
   10 or more characters
   at least 1 uppercase letter, lowercase letter, number, and special character (anything that is not an uppercase letter, lowercase letter, or number is a special character)
5. When everything is good we will print "success"  but won't do anything else.
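
The five checks above could be collected into one function, roughly as sketched below. The name check_new_account and the exact error messages are invented for illustration. The sketch also assumes that a "special character" in a username means anything that is not a letter or digit, and that existing_users is a collection of lower-case names read from your password file. It is written in Python 3 syntax.

```python
def check_new_account(user, password, existing_users):
    """Return an error message for a bad sign-up, or "success"."""
    user = user.lower()  # users are stored in lower case
    if user == "" or password == "":
        return "error: both fields are required"
    if not user.isalnum():
        return "error: username has special characters"
    if user in existing_users:
        return "error: username already taken"
    if len(password) < 10:
        return "error: password needs 10 or more characters"
    has_upper = any(c.isupper() for c in password)
    has_lower = any(c.islower() for c in password)
    has_digit = any(c.isdigit() for c in password)
    # Anything that is not a letter or digit counts as special.
    has_special = any(not c.isalnum() for c in password)
    if not (has_upper and has_lower and has_digit and has_special):
        return "error: password needs upper, lower, number, and special characters"
    return "success"
```

Returning the message instead of printing it keeps the checks easy to try out in the interpreter before wiring them into create.py.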

2014-05-23 + HW + Lab20

posted May 23, 2014, 5:38 AM by Samuel Konstantinovich   [ updated May 23, 2014, 7:44 AM ]

A hash function is a one way function. There is no inverse so you cannot undo the operation. 
An example of a one-way function that converts numbers and cannot be reversed is the remainder.
3031%15 -> 1
1000%15 ->10
There is no way to undo the change.

An example of a one-way function that converts strings to values is len.
len("ham") -> 3  
len("fish") -> 4

These are not very good hash functions because many values have the same hashed value.
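
To see the problem concretely, here is a short sketch of two inputs colliding under each toy hash. The function names remainder_hash and length_hash are made up for this illustration.

```python
def remainder_hash(n):
    """Toy hash: the remainder after dividing by 15."""
    return n % 15

def length_hash(s):
    """Toy hash: the length of the string."""
    return len(s)

# Different inputs, same hashed value: these are collisions.
print(remainder_hash(3031) == remainder_hash(16))   # True (both give 1)
print(length_hash("ham") == length_hash("egg"))     # True (both give 3)
```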

Hashing a password is really important if you want to store information securely.

MD5 is a way we will hash passwords:


You can use it this way:

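import md5
s = "my password" #example text to hash (any string works)
m = md5.new()
m.update(s) #feed the text into the hash
hashed = m.hexdigest()
print s,hashed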

The variable hashed contains the text you would store in a website's password database.
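
The md5 module exists only in Python 2. The hashlib module, available since Python 2.5 and in Python 3, produces the same digest. A minimal sketch in Python 3 syntax follows; the helper name md5_hex is made up here.

```python
import hashlib

def md5_hex(text):
    """Return the MD5 hex digest of a text string."""
    # hashlib works on bytes, so encode the text first.
    return hashlib.md5(text.encode("utf-8")).hexdigest()

print(md5_hex("hello"))  # 5d41402abc4b2a76b9719d911017c592
```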

Store users and passwords in a plain text file:

Your goal:
1. make a user/password file in the format shown. You can use any separator you choose, like , : ; etc.
2. Make a login page that has a user text field, and a password field, along with a submit button.
3. When you click submit, you should check the user/password file for a matching user and hashed password. If there is a match display 
"You logged in" 
if not 
"username and password do not match"

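The check in step 3 might be sketched like this. The function name check_login and the ":" separator are assumptions (the assignment lets you pick any separator), and the sketch uses hashlib in Python 3 syntax rather than the md5 module.

```python
import hashlib

def check_login(password_lines, user, password, sep=":"):
    """Return the message to display for a login attempt."""
    # Hash what the visitor typed; the file only ever stores hashes.
    hashed = hashlib.md5(password.encode("utf-8")).hexdigest()
    for line in password_lines:
        parts = line.strip().split(sep)
        if len(parts) == 2 and parts[0] == user and parts[1] == hashed:
            return "You logged in"
    return "username and password do not match"
```

Here password_lines can be the result of open(...).readlines() on your user/password file.
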

posted May 19, 2014, 7:56 AM by Samuel Konstantinovich   [ updated May 19, 2014, 7:56 AM ]

Notes from class:

import cgi,cgitb

formData = cgi.FieldStorage()

#if you get a submitted form:
if 'filename' in formData and 'page' in formData:
    print '<h3>Do stuff with the form</h3>\n'

    pageNumber = formData['page'].value #string of the page number
    fileName = formData['filename'].value #string of the fileName

    print "the page number is:",pageNumber,"<br>"
    print "the file name is:",fileName,"<br>"

    nextPage = str(int(pageNumber)+1) #add one to the page number

    #build a link and don't forget the & between the variables
    linkToNextPage = '''<a href = "displayForm.py?filename='''+ fileName + "&page="+nextPage+'''">next page</a><br>'''

    print linkToNextPage
else: #you didn't get a submitted form, so show the form:
    print '''
 <form action="displayForm.py">
 filename: <input type="text" name="filename"><br>
 <input type="hidden" name="page" value="1">
 <input type="submit">
 </form>'''

2014-05-16 + HW

posted May 16, 2014, 5:44 AM by Samuel Konstantinovich   [ updated May 16, 2014, 4:27 PM ]

Goal: Processing form data.

1. After you send form data to a website like this:

we can retrieve the variable names and their values in python using cgi: (the following should be mypyfile.py)

print 'Content-type: text/html\n\n'
import cgi
form=cgi.FieldStorage() #this gets all the variables in a dictionary

if len(form)>0:
        for key in form: #just like a dictionary
            value=form[key].value #this is an extra step
            print key+','+value+'<br>'

1. Remember to finish your data analysis project.
2. Lab18
Make form.py  and process.py  such that:
form.py is a website that has a form that is linked to process.py.
process.py is a website that not only displays the information but does different things when you use the checkboxes or radio buttons. 

2014-05-13 HW

posted May 13, 2014, 5:11 AM by Samuel Konstantinovich   [ updated May 13, 2014, 8:23 AM ]

1a. Download and test the web server found here:
1b. Make sure you can run py files and html files.

2. Read about forms on w3schools:


posted May 12, 2014, 5:09 AM by Samuel Konstantinovich   [ updated May 12, 2014, 5:32 AM ]

Pick a partner. Everyone should have one partner; if there is an odd number of people in the class, then there can be ONE group of three.
You are responsible for your partner's actions today. If one of you doesn't do what you are supposed to do, my wrath will fall upon both of you. 

Warmup 1-3
1. What is the range of values of random.random() ?

2. What is the range of values of random.randint(a,b) ?

3. What is the range of values of random.randint(a,b)+random.random() ?

Real Question: 
4. Show a trace through each of these functions on paper (sorry, no tablets/laptops this time). Determine which of them correctly inverts the dictionary. If one does not, explain what happens. Do not run IDLE to check your answers; instead, compare your answers with another pair of students.

Here is a sample dictionary and the results: (remember order doesn't matter)
 { 'a':99 , 'b':34,  'c':99, 'd':100, 'e':34 }  --> {34: ['b', 'e'], 99: ['a', 'c'], 100: ['d']}

def invert1(D):
    newD = {}
    for key in D:
    return newD

def invert2(D):
    newValues = D.keys()
    newKeys = D.values()
    newD = {}
    for i in range(len(newValues)):
        if newKeys[i] in newD:
            newD[newKeys[i]]= [ newValues[i] ]
    return newD

def invert3(D):
    q = {}
    for h in D:
    for h in D:
    return q

def invert4(D):
    z = {}
    for x in D:
        if D[x] in z:
            z[D[x]]= [ x ]
    return z

def invert5(D):
    newD = {}
    for key in D:
    for key in D:
    return newD

def invert6(D):
    newD = {}
    for key in D:
        newKey = D[key]
        newVal = key
        if newKey in newD:
            newD[newKey]= [ newVal ]
    return newD  

Time for a big project (Due Sunday, May 19th, by morning)
Everyone can share ideas on what data to look at and how to look at it, as well as share cool data sources.
Everyone must document what kind of help was received for the project.

1. This will count as a significant project, more than your labs. (The time stamp of your py file on the web server should not be later than 05-19-2013 8:00 AM.)

2. You will have a different file every day you work, so I can see what you did.
You will need sub-directories   v13,v14,v15,v16 or v17,v18  (You need at least two versions for the weekend)
Every night you will work on the assignment, but I only expect the 1st version Tuesday night.
Timestamps matter on each version. Runnability matters on each version. It should do something, like display the data, display the totals or something. Each version NEEDS to display something. If you break your code, fix it by commenting out the broken parts before you save it for the day. At no point should you leave your code in a broken state!!!

You may optionally work on this with a partner and collaborate in several ways:
A:    Just share ideas on data, but write separate code. Of course you can help each other with debugging code, but the code you submit should be what you wrote. This is Conceptual Collaboration, but independent implementation.
B:    Work on the coding aspect together. You need to produce more (extra features, more analysis), and it is necessary to delegate tasks. You must document who wrote what portions of the code. You may not just write one program with two of you sitting at the keyboard. You are producing more than one person's worth of code. (I have higher expectations.)
C:    Ask me if you want to work with some other arrangement. 

1. You will make a folder for this project and all of the data you use
There will be subfolders for every day of work. Each subfolder should have some working code. 

2. You will save one or more data sets in the directory as data1.___, data2.___, etc. They can be txt, csv, or other file types. You may optionally make some smaller test data sets that work with older versions of your program for debugging purposes.

Heading To Use:


LastName, FirstName




DATASET: _______

DATA SOURCE: _______

Chosen because...



3. You will make a file data.py that will generate an HTML page containing:

  • Heading & explanatory paragraphs (Why did you choose this dataset to explore? Background the reader should know? Etc...)

  • Data table (you can format and colorize it as you see fit, but don't randomize colors)

  • Citation & link to data source. (Name of source and link to page where CSV file can be downloaded)

  • Link to analysis.py

4. You will make a file analysis.py that will generate an HTML page containing:
  • Heading & background paragraphs (How did your exploration evolve? Key insights? Turning points? Obstacles encountered/overcome? Etc...)

  • Summary table of newly created data (avgs, sums, areas of overlap between datasets, etc. that you used python to calculate)

  • Link to data.py

  • Summary/Conclusion paragraphs. (Aim for at least 1 pattern, notable intersection, or bizarre-or-otherwise-notable-phenomenon extracted from the data.) (What did you eventually find? Obstacles encountered/overcome? Ideas for further exploration?)

Finally, you will also link your project files in your assignments.html.

2014-05-07 Lab17

posted May 7, 2014, 5:40 AM by Samuel Konstantinovich   [ updated May 7, 2014, 6:23 AM ]

Your goal: Lab17
-Make a web page (From now on, there is no html file in any assignment. You are making python programs that generate web pages)
-The web page opens two or more files that contain plain text books. (don't spend time finding books in class, do that at home when you don't need my help)
-The output of the webpage is a comparison of word frequencies, as outlined in Part1, and Part2 below.
-When you download two books in plain text:
  a. After you save your book, remove the gutenberg header / footer. Just include the book itself.
  b. Replace all hyphens '-' with spaces ' '. This will fix issues like "end.--Next" showing up as a word.

Part I. You need to make 3 tables.
(make a table around the 3 tables, so you get side by side tables)
Wrapper table contains:
Hamlet                      Othello                     Highest %:
#Table1:                    #Table2:                    #Table3:
Word    Count   %           Word    Count   %           Book      Dif
a       700     3.0%        a       650     2.0%        Hamlet    1.0%
an      200     1.4%        an      198     0.7%        Hamlet    0.7%
at      133     0.3%        at      232     0.7%        Othello   0.4%
...                         ...
fish    2       0.001%      fish    0       0.0%        Hamlet    0.001%
...                         ...
The words should be in alphabetical order, and the rows of table 1 and table 2 have to correspond to each other. If the other book does not have the word, the other book needs to have a zero in its table.

The 3rd table is which book has the higher %, and the difference between the percentages.
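
A rough sketch of how one row of the 3rd table could be computed from two tally dictionaries follows. The function names percent and table3_row are made up, and breaking ties in favor of the first book is an arbitrary choice made for this example. It is written in Python 3 syntax.

```python
def percent(tallies, word):
    """Return the share of all words in the book that are `word`, as a %."""
    total = sum(tallies.values())
    return 100.0 * tallies.get(word, 0) / total

def table3_row(word, name_a, tallies_a, name_b, tallies_b):
    """Return [book with the higher %, difference between the two %]."""
    pa = percent(tallies_a, word)
    pb = percent(tallies_b, word)
    if pa >= pb:  # ties go to the first book (arbitrary choice)
        return [name_a, pa - pb]
    return [name_b, pb - pa]
```

Formatting the numbers (for example, rounding and adding the % sign) can be left to whatever builds the HTML table.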

Specifics that can help you:

1a. You should have a function that reads a book and makes a dictionary of tallies.
1b. You should have a function that takes a dictionary and makes an inverted dictionary.

2. Make a function fillMissing(A,B) that takes two dictionaries of words+tallies, checks for all the keys of A that are not in B, and adds them to B with a tally of 0. (You can use this twice to fill in all the missing words.)

3a. Make a function that takes a dictionary and makes a list of lists in the format of tables 1 and 2.

3b. Make a function that takes two dictionaries, and makes a list of lists in the format of table 3.
4. Make a function that takes a list of lists and makes an HTML table out of it. (This makes the functions in 3a/3b much cleaner by moving the tags to a separate place.)
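
A possible starting point for helpers 1a, 2, and 4 is sketched below. Only fillMissing is named in the assignment; tally_words and make_table are invented names. The sketch tallies a string (for a book you would pass in open(...).read()), and the punctuation it strips is a guess at what a book might need. It is written in Python 3 syntax.

```python
def tally_words(text):
    """Count how often each lower-cased word appears in text."""
    tallies = {}
    for word in text.lower().split():
        word = word.strip('.,;:!?"()[]')  # drop punctuation at the edges
        if word:
            tallies[word] = tallies.get(word, 0) + 1
    return tallies

def fillMissing(A, B):
    """Add every key of A that B lacks to B, with a tally of 0."""
    for word in A:
        if word not in B:
            B[word] = 0

def make_table(rows):
    """Turn a list of lists into the text of an HTML table."""
    html = "<table border='1'>\n"
    for row in rows:
        cells = "".join("<td>%s</td>" % cell for cell in row)
        html += "<tr>" + cells + "</tr>\n"
    return html + "</table>\n"
```

Calling fillMissing(A, B) and then fillMissing(B, A) leaves both dictionaries with the same set of keys, so their sorted rows line up.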

Part II.  (You need an inverted dictionary to get these)
After you complete the big table, you will put a summary of stats on top. 
You should list (but are not limited to):
1. A table of the 20 most common words in each book, and their tallies.
2. The number of unique words in each book
3. A list of those words.
4. At least one more statistic you calculated.
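
One way the inverted dictionary can produce the most-common-words table is sketched below. The names invert and most_common are made up, and listing tied words alphabetically is just one reasonable choice. It is written in Python 3 syntax.

```python
def invert(tallies):
    """Map each tally to the alphabetized list of words that have it."""
    inverted = {}
    for word in sorted(tallies):
        count = tallies[word]
        if count in inverted:
            inverted[count].append(word)
        else:
            inverted[count] = [word]
    return inverted

def most_common(tallies, n):
    """Return up to n [word, tally] pairs, highest tally first."""
    inverted = invert(tallies)
    result = []
    # Walk the tallies from largest to smallest until n words are found.
    for count in sorted(inverted, reverse=True):
        for word in inverted[count]:
            if len(result) < n:
                result.append([word, count])
    return result
```

The result of most_common(tallies, 20) is already a list of lists, so it can go straight into a function that builds an HTML table.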
