Saturday, March 26, 2011

CIS(theta) Meeting XII (2010-2011) - Instant Cluster, just add water!

Aim: 
Instant Cluster, just add water!

Attending: 
CIS(theta) 2010-2011: DavidG, HerbertK, RyanH

Absent: 
CIS(theta) 2010-2011: JoshG

Reading:
NA

Parallel Python
IPython
Large Integer number crunching Mersenne Primes
http://www.hoise.com/primeur/03/articles/weekly/AE-PR-01-04-37.html
Large Integer number crunching Beal Conjecture
http://www.bealconjecture.com/

We had a great meeting this week! We finally broke down and decided to install our own software stack for MPI on our existing 32bit Ubuntu 10.04 Desktop! We reviewed what works on our LAN (openSSH) and what doesn't work (PXE). Maybe we can add the functionality from pelicanHPC such as Octave and MPITB? We could even write our own C++ code for fractals. What about mpi4py or parallel python?


We can learn from our experience this year. I think we need to get openMPI working over openSSH using public-key authentication (as detailed below). Then we can run povray or blender or even our own C++ applications on a more permanent cluster installation! So, for example, it's MPIPOV over openMPI over openSSH with public keys over Ubuntu over 64bit dual-core AMD Athlons with 750MB RAM over gigE, right? Here's what we discussed.

InstantCluster Step 1: Infrastructure
Make sure your cores have enough ventilation. The room has to have powerful air conditioning too. These two factors may seem trivial but will become crucial when running the entire cluster for extended periods of time! Also, you need to have enough electrical power, preferably with cabling out of the way, to run all cores simultaneously. Don't forget to do the same with all your Ethernet cabling. We have CAT6E cables to support our gigE Ethernet cards and switches. We are lucky that this step was taken care of for us already!

InstantCluster Step 2: Hardware
You need some up-to-date, fast Ethernet switches plus Ethernet cards and cores as well as plenty of RAM in each Linux box. As stated above, our gigE LAN was set up for us. Also, we have 64bit dual-core AMD Athlons and our HP boxes have 750 MB of RAM. I'd rather have 1 or 2 GB of RAM, but that will have to wait for an upgrade!

InstantCluster Step 3: Firmware
We wasted way too much time last year trying out all kinds of Linux distros looking for a good 64bit base for our cluster. This year we spent way too much time testing out different liveCD distros. Recently, we downgraded from 64bit Ubuntu 10.04 Desktop edition to the 32bit version on our Linux partitions. 64bit gives us access to more RAM and a larger maxint, but was proving to be a pain to maintain. Just to name one problem, the JRE and Flash were hard to install and update on Firefox. Last year we tried Fedora, Rocks, Oscar, CentOS, Scientific Linux and, finally, Ubuntu. 32bit Ubuntu has proven very easy to use and maintain, so I think we'll stick with it for the cluster!

InstantCluster Step 4: Software Stack
On top of Ubuntu we need to add openSSH, public-key authentication and openMPI. In step 5 we can discuss an application to scatter/gather over the cluster whether it be graphical (fractals, povray, blender, openGL) or number crunching (C++ or python app for Mersenne Primes or Beal's Conjecture). So, what follows is a summary of what we did to get up to public-key authentication. We didn't have time to do more than that, but we made some good progress getting this much working so far. This summary is based on http://cs.calvin.edu/curriculum/cs/374/MPI/ and I added some notes in [...]. First, we installed openSSH-server from http://packages.ubuntu.com using our proxy server, then:
  1. If you have no .ssh directory in your home directory, ssh to some other machine in the lab; then Ctrl-d to close the connection, creating .ssh and some related files. 
  2. From your home directory, make .ssh secure by entering:
    chmod 700 .ssh
  3. Next, make .ssh your working directory by entering:
    cd .ssh
  4. To list/view the contents of the directory, enter:
    ls -a [we used ls -l]
  5. To generate your public and private keys, enter:
    ssh-keygen -t rsa
    The first prompt is for the name of the file in which your private key will be stored; press Enter to accept the default name (id_rsa). The next two prompts are for the passphrase you want, and since we are trying to avoid entering passwords, just press Enter at both prompts, returning you to the system prompt.
  6. To compare the previous output of ls and see what new files have been created, enter:
    ls -a [we used ls -l]
    You should see id_rsa containing your private key, and id_rsa.pub containing your public key.
  7. To make your public key the only thing needed for you to ssh to a different machine, enter:
    cat id_rsa.pub >> authorized_keys
    [The Linux boxes on our LAN, soon to be a cluster, have IPs ranging from 10.5.129.1 to 10.5.129.24. So, we copied each id_rsa.pub file to temp01-temp24 and uploaded these files to our ftp server. On the ftp server we just ran cat tempnn >> authorized_keys for each temp file to generate one master authorized_keys file for all nodes that we could just download to each node's .ssh dir.]
  8. [optional] To make it so that only you can read or write the file containing your private key, enter:
    chmod 600 id_rsa
  9. [optional] To make it so that only you can read or write the file containing your authorized keys, enter:
    chmod 600 authorized_keys
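The merge we did by hand on the ftp server (cat tempnn >> authorized_keys, once per node) could also be scripted. Here is a minimal python sketch of that idea, not what we actually ran. The function name merge_public_keys, the "temp*" file pattern and the fake key strings in the demo are my own assumptions for illustration.

```python
import stat
import tempfile
from pathlib import Path


def merge_public_keys(key_dir, out_file, pattern="temp*"):
    """Concatenate every public-key file matching pattern into one authorized_keys file."""
    seen = set()
    lines = []
    for path in sorted(Path(key_dir).glob(pattern)):
        for line in path.read_text().splitlines():
            key = line.strip()
            if key and key not in seen:   # skip blank lines and duplicate keys
                seen.add(key)
                lines.append(key)
    out = Path(out_file)
    out.write_text("\n".join(lines) + "\n")
    out.chmod(0o600)                      # same effect as: chmod 600 authorized_keys
    return len(lines)


if __name__ == "__main__":
    # Demo with made-up keys in a scratch directory (not real key material).
    with tempfile.TemporaryDirectory() as d:
        for n in range(1, 25):
            Path(d, f"temp{n:02d}").write_text(f"ssh-rsa FAKEKEY{n:02d} user@node{n:02d}\n")
        master = Path(d, "authorized_keys")
        count = merge_public_keys(d, master)
        print(count, "keys merged")
        print(oct(stat.S_IMODE(master.stat().st_mode)))
```

Skipping duplicate keys is a small extra over plain cat: running the merge twice does not bloat the master file.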
Well, that's all for now, enjoy!

Saturday, March 19, 2011

Edmodo update: "The Social Network for your Class!"

Wow, using Edmodo has been great these last few weeks! I only just started using Edmodo after February Break around 2/21/11. I set it up, literally, in seconds one day during break. It couldn't have been easier to set up or use. The students took to it right away as it's modeled after Facebook. I strongly recommend you try it for your classes!

I used to upload files to share with my classes to a private ftp site I set up at school that the students could access in class and at home. I also uploaded screen-casts from http://screencast-o-matic.com to my http://www.youtube.com/calcpage2009 channel. However, this year my school had to change all its external IPs but my ftp server has yet to get a new IP. So, my students couldn't use this resource from home. When I found Edmodo, I thought it would be a great way to share files from class, but it's a lot more! You can start threads or discussions on topics extending what was done in class. Students can post questions any time of the day or night and you can answer them. You can even make announcements about due dates and other important reminders for class such as Multiple Choice Mondays or Free Response Fridays. You can even make links to YouTube Wednesday videos!

Take a look at the Edmodo links I have on the right side of this blog. I made a few posts from each class public so you could see what it is like. I will probably not be adding much more content publicly, but I hope you get the idea. Here's a sample of what I've done in my 4 preps these past few weeks.

Advanced Computer Math:
http://www.edmodo.com/public/feed/group_id/208308
You can see here that there are several threads where students are voicing their concerns about preparing for a recent test and I was able to help them out. Also, we had some jokes and Pi Day stuff! Can you play Rock, Paper, Scissors, Lizard, Spock?

AP Computer Science:
http://www.edmodo.com/public/feed/group_id/208307
In this group, I usually post copyrighted material that I could not make public such as chapters in pdf format from the author of our text and labs from that book. However, this is a great place to share code we write in class to study new syntax at home! I also share College Board materials my students may need for a particular unit.

preCalculus 4 Juniors:
http://www.edmodo.com/public/precalculus-math-4r-11/group_id/208305
and AP Calculus BC:
http://www.edmodo.com/public/ap-calculus-bc-math-5hbc/group_id/208306
My math classes typically generate a powerpoint or pdf from my SmartNotes and a quicktime or mp4 screen-cast recording every day. Edmodo is a great place to share these files too! I even posted some fun stuff about a math conference I just went to where I participated in honoring one of my own students with a $1000 scholarship! Gratz go to MichaelS and thanx go to LIMACON!

That's all folks....

Friday, March 11, 2011

CIS(theta) Meeting XI (2010-2011) - If you can't beat them....

Aim: 
If you can't beat them....

Attending: 
CIS(theta) 2010-2011: DavidG, HerbertK, JoshG, RyanH

Reading:
NA

Parallel Python
IPython
Large Integer number crunching Mersenne Primes
http://www.hoise.com/primeur/03/articles/weekly/AE-PR-01-04-37.html
Large Integer number crunching Beal Conjecture
http://www.bealconjecture.com/

MPIPOV
http://comp.uark.edu/~ewe/misc/povray.html
POVRAY
http://himiko.dnsalias.net/wordpress/2010/03/29/persistence-of-vision-ray-tracer-povray/
MPI and blender
http://www.blender.org/forum/viewtopic.php?t=10244&view=next&sid=3003d90233a27b81c5093a374c2b0e31
More MPI and blender
http://wiki.vislab.usyd.edu.au/moinwiki/BlenderMods/




If you can't beat them, join them! In other words, I think the consensus from our last meeting is that we are done with liveCD Linux distro solutions for HPC. What's to stop us from installing openMPI on our Linux partitions? So, we will have to install openSSH and openMPI directly on our Linux partitions to set up a more permanent cluster. 


Fret not, all is not lost as we can learn from our experience this year. Cluster By Night should serve as a proof of concept. Namely, we CAN run openMPI over openSSH! pelicanHPC worked fine with a crossover Ethernet cable, but I am done with PXE boot! However, we can emulate all the number crunching apps that pelicanHPC has by running Octave with MPITB and openMPI. Let's make sure not to repeat the mistakes from last year trying out a million distros on our Linux partitions. Let's stick to our 32bit Ubuntu 10.10 Desktop. We should take stock of where we've been.

1995-2001: Our First Cluster
Ethernet (10Mbps), 
Gateway PCs, 128MB RAM, 
Mixed Intel Pentium II 400MHz and Intel Pentium III 800MHz
****************************************
2001-2007: Our Second Cluster
Fast Ethernet (100Mbps), 
Dell PCs, 1GB RAM, 
Pentium IV 2.6 GHz
****************************************
2007-now: Our Current Cluster
gigE (1000Mbps), 
HP PCs, 750MB RAM, 
AMD Athlon 64bit dualcore 2GHz


CIS(theta) (2007-2008) used openMOSIX and C++ to make fractal graphs on the complex plane. We took our cue from these blog posts:
http://nullprogram.com/blog/2007/09/02/
http://nullprogram.com/blog/2007/09/17/
http://nullprogram.com/blog/2007/10/01/
http://nullprogram.com/projects/mandel/

CIS(theta) (2008-2009) used public key authenticated ssh and a pile of bash scripts to scatter/gather povray jobs. In fact, this blog started out as a site to record our results and document how we did it!
http://shadowfaxrant.blogspot.com/2009/05/poor-mans-cluster-step-0.html
http://shadowfaxrant.blogspot.com/2009/06/poor-mans-cluster-step-1.html
http://shadowfaxrant.blogspot.com/2009/07/poor-mans-cluster-step-2.html
http://shadowfaxrant.blogspot.com/2009/08/poor-mans-cluster-step-3.html
http://shadowfaxrant.blogspot.com/2009/08/poor-mans-cluster-step-4.html
http://shadowfaxrant.blogspot.com/2009/08/poor-mans-cluster-step-5.html
http://shadowfaxrant.blogspot.com/2009/08/poor-mans-cluster-step-6.html
http://shadowfaxrant.blogspot.com/2009/08/poor-mans-cluster-step-7.html
http://shadowfaxrant.blogspot.com/2009/08/poor-mans-cluster-step-8.html
http://shadowfaxrant.blogspot.com/2009/08/poor-mans-cluster-step-9.html

CIS(theta) (2009-2010) got bogged down finding a stable 64bit Linux distro to use on the Linux partitions of our dual boot PCs. We used 32bit and 64bit Fedora 11 and 12. We tried CentOS, Scientific Linux, OSCAR and Rocks! We got a torque server working for openMPI and helloMPI.c but didn't get much farther than that, I'm afraid.

CIS(theta) (2010-2011) has to switch gears and take the best of all the above. Many in the HPC community talk about a software stack. So, let's come up with our own! I think we can run our application (a C++ fractal program we design, povray, blender or openGL) on top of openMPI on top of openSSH (with public key authentication) on top of Ubuntu. What about the hardware stack (64bit dual-core AMD Athlons on top of a gigE switch)? I think we can make a stack like this work, what do you think?
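To make the scatter/gather idea at the top of that stack concrete, here is a minimal python sketch using the Lucas-Lehmer test for Mersenne primes. It runs on one box only: the built-in map() stands in for sending chunks to nodes over openMPI, so this is the shape of the job, not a cluster program. The function names and the node count of 24 are assumptions for illustration.

```python
def is_prime(n):
    """Trial division; fine for the small exponents used here."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def lucas_lehmer(p):
    """Return True if the Mersenne number 2**p - 1 is prime (p must itself be prime)."""
    if p == 2:
        return True                 # 2**2 - 1 = 3 is prime
    m = (1 << p) - 1
    s = 4
    for _ in range(p - 2):
        s = (s * s - 2) % m
    return s == 0


def scatter(items, n_nodes):
    """Deal the work out round-robin, one chunk per node."""
    return [items[i::n_nodes] for i in range(n_nodes)]


def work(chunk):
    """What each node would run on its own chunk of exponents."""
    return [p for p in chunk if is_prime(p) and lucas_lehmer(p)]


def gather(partials):
    """Merge the per-node answers back into one sorted list."""
    return sorted(p for part in partials for p in part)


if __name__ == "__main__":
    exponents = list(range(2, 100))
    chunks = scatter(exponents, 24)      # 24 is an assumed node count
    partials = map(work, chunks)         # plain map() stands in for the MPI send/receive
    print(gather(partials))
```

Dealing exponents out round-robin, rather than in contiguous blocks, spreads the expensive large exponents across nodes instead of piling them all on the last one.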

BTW, openGL sounds interesting. Take a look at the links above from Thomas Jefferson High School. They've been running a clustering course for quite some time. In fact, they got into it when they won some sort of computing competition in the late 1980s and Cray donated a supercomputer to their school! More recently they've been playing with openMosix and openMPI as well as fractals and povray just like us. They have a lot of notes on openGL too! Also, if you want a good overview of things MPI, take a look at the ualberta link, it's a very good overview in ppt style even though it's a pdf!


Then we should update our Poor Man's Cluster notes to Poor Man's Cluster 2.0 notes! Step1 can be about the physical plant including electrical power, ventilation and AC. Step2 can be about the hardware stack including our gigE LAN and Athlon boxes. Step3 can be about the firmware stack including Ubuntu, openSSH and setting up public keys. Step4 can be about the software stack whether it be openMPI+MPITB+Octave or mpi4py or povray or blender or openGL. Step5 can be about various applications and results.

Well, that's all for now, enjoy!

Wednesday, March 2, 2011

The Great Computer Science vs Computing Science Debate!

The College Board is proposing a new course entitled CS Principles, http://csprinciples.org, that students could take before the current AP CS course or instead of it. Wouldn't a course in Computing Science or Scientific Computing plus Discrete Mathematics be more useful to students wanting an alternative or an intro to the current AP CS course? 


I have spent this whole school year reworking my intro course to include a Computer Algebra System (SAGE) and a new programming language (python, new to my school anyway) and a new text on Discrete Mathematics (the Litvins' Mathematics for the Digital Age, http://www.skylit.com). I think my introCS students are getting a lot out of this approach. It certainly sounds more rigorous than the proposed CS Principles.


Sorry, I don't mean to offend anyone. I know a lot of the people developing this course are top notch. I just don't see this as a good direction for the future of APs in Computer Science. I know the intent is a good one. We want more students to do more Computer Science in High School and Undergrad. If a student's first exposure to Computer Science is the current AP course, it may not always go well. Also, if a student only does one year of Computer Science, I think they are missing out. So, a program like CS Principles does have potential, but I would define the curriculum differently (see below).


If I had my druthers, I'd love to see the following sequence of courses.
 
APCS1: 
Discrete Math with CAS as described above. My school used to call this course Computer Math. Some call it Computing Science or Scientific Computing. On the python edu-sig forum they are coining the phrase Digital Math. I've been teaching a course like this since the late 1980s. In fact, I've been teaching programming with some sort of BASIC (IBM BASICA, MS QBASIC, Visual BASIC, REALbasic, yabasic, etc) since 1975! I only just switched to python this year. Over the years, I've also used Fortran, Pascal, C, C++, Java, SAGE, Octave and R in this course. We've even played with HTML and applets for the web.
 
APCS2: 
The current APCS course, aka APCS A, is mostly about algorithms and Object Oriented Programming. The current course, using Java, is sufficient. We even do a little computer history, computer literacy and computer ethics! This course is especially beautiful given the right text. I highly recommend anything by Cay Horstmann. See http://www.horstmann.com


APCS1 and APCS2 are usually followed by a Data Structures, Networking or Operating Systems course. Some schools do robotics or website design. I would follow APCS1 and APCS2, as described above, with APCS3 or APCS4 as described below.

APCS3: 
The old Data Structures course, aka APCS AB, was canceled and should be brought back. A standard course in linked lists and binary trees would be great!
 
APCS4: 
A new course in multicore, cluster, grid and cloud computing would be timely! I've been running a Computing Independent Study course, aka CIS(theta), for seniors who have already taken AP Computer Science as juniors. Lately we've been learning about SAGE, Octave and R. We've also set up clusters using PVM, MPI and openMosix doing some applications involving fractals, povray, blender and even some number crunching. We've also played with various Linux distros to get our cluster running: clusterKnoppix, BCCD, Quantian, OSCAR, Rocks, CentOS, Fedora, Ubuntu, Cluster By Night, parallelKnoppix and pelicanHPC.

What do ya think? 

Teaching with Technology, 

Tuesday, March 1, 2011

Spock, Birthdays and Pheasants!

Here's a random pic from class where GrahamS and ChrisS explain how to play "Rock, Paper, Scissors, Lizard, Spock!" So, Rock crushes Scissors and Rock smashes Lizard. Also, Paper covers Rock and Paper disproves Spock. Now, Scissors cuts Paper and Scissors decapitates Lizard. Then, Lizard eats Paper and Lizard poisons Spock. Finally, Spock vaporizes Rock and Spock phasers Scissors? IDK about that last one....
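Since this is a CS class, the rules fit nicely in a python dictionary. This is just a sketch for fun; the names BEATS and play are my own. The verbs follow the game as it is usually told, where Rock crushes Lizard and Spock smashes Scissors, so that last one is not a phaser after all.

```python
# Each hand maps to the two hands it beats, with the verb that says how.
BEATS = {
    "rock":     {"scissors": "crushes", "lizard": "crushes"},
    "paper":    {"rock": "covers", "spock": "disproves"},
    "scissors": {"paper": "cuts", "lizard": "decapitates"},
    "lizard":   {"paper": "eats", "spock": "poisons"},
    "spock":    {"rock": "vaporizes", "scissors": "smashes"},
}


def play(a, b):
    """Return a sentence describing who wins, or 'tie' if both hands match."""
    a, b = a.lower(), b.lower()
    if a == b:
        return "tie"
    if b in BEATS[a]:
        return f"{a} {BEATS[a][b]} {b}"
    return f"{b} {BEATS[b][a]} {a}"


if __name__ == "__main__":
    print(play("Spock", "Scissors"))
    print(play("Paper", "Lizard"))
    print(play("Rock", "Rock"))
```

Every hand beats exactly two others and loses to the other two, which is why five hands give fewer ties than the classic three.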


Here's another pic from last weekend outside my daughter's Sunday School. Can you make out the wild turkeys? Actually, I think they are declared as a protected species on Long Island. Aren't those Ring-Necked Pheasants? All kinds of beasts come out of those woods when you least expect it. Last month it was a whole deer family! This is a tribe of pheasants. I think there were 13 or 14 altogether!


Finally, a pic from last month on my birthday 2/8/11. BTW, my students owe me a 100% on a test or quiz for my birthday! It was the 20th anniversary of my 29th birthday. Gotta love Carvel....


Generally Speaking,