Wednesday, March 15, 2017

CIS(theta), 2016-2017: March Meeting: MPI4PY + hello.py!



Before going on to 
Step 7: Coding I - Quadrature, 
we decided to install mpi4py and to write the programs helloSEQ.py and helloMPI.py to test the cluster using python! 

So we added a new
Step 6: Software Stack III - MPI4PY
(see below).

InstantCluster Step 1: 
Infrastructure - Power, Wiring and AC

InstantCluster Step 2: 
Hardware - PCs

InstantCluster Step 3: 
Firmware - Ubuntu

InstantCluster Step 4: 
Software Stack I - openSSH:

01) Install openssh-server from USC (the Ubuntu Software Center) or 
http://packages.ubuntu.com

02) Create the same new user on every box in the cluster

03) Log in as the new user; we used 
userid: jaeger, passwd: galaga

04) If you have no .ssh directory in your home directory, ssh to some other machine in the lab; then Ctrl-d to close the connection, creating .ssh and some related files. 

05) From your home directory, make .ssh secure by entering:
chmod 700 .ssh

06) Next, make .ssh your working directory by entering:
cd .ssh

07) To list/view the contents of the directory, enter:
ls -a [we used ls -l]

08) To generate your public and private keys, enter:
ssh-keygen -t rsa

The first prompt is for the name of the file in which your private key will be stored; press Enter to accept the default name (id_rsa). The next two prompts are for the passphrase you want, and since we are trying to avoid entering passwords, just press Enter at both prompts, returning you to the system prompt.

09) To compare the previous output of ls and see what new files have been created, enter:
ls -a [we used ls -l]
You should see id_rsa containing your private key, and id_rsa.pub containing your public key.

10) To make your public key the only thing needed for you to ssh to a different machine, enter:
cat id_rsa.pub >> authorized_keys

NOTE: The Linux boxes on our LAN, soon to be cluster, have IPs ranging from 10.5.129.1 to 10.5.129.24. So, we copied each id_rsa.pub file to temp01-temp24 and uploaded these files via ssh to the teacher station. Then we just ran cat tempnn >> authorized_keys for each temp file to generate one master authorized_keys file for all the nodes, which we could then download to each node's .ssh directory.
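
For example, the merge on the teacher station could be done with a quick loop like this (a rough sketch; it assumes the uploaded key files really are named temp01 through temp24 and sit in the current directory):

for f in temp*; do cat "$f" >> authorized_keys; done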

[optional] To make it so that only you can read or write the file containing your private key, enter:
chmod 600 id_rsa 

[optional] To make it so that only you can read or write the file containing your authorized keys, enter: 
chmod 600 authorized_keys

InstantCluster Step 5: 
Software Stack II - openMPI

We finally have openSSH installed with public key authentication on 10.5.129.17-10.5.129.20, and we tested that today. We also installed openmpi-bin, libopenmpi-dev and gfortran on the same machines: 

sudo apt-get install openmpi-bin
sudo apt-get install libopenmpi-dev
sudo apt-get install gfortran

Then we compiled flops.f:

mpif77 -o flops flops.f

Then we ran flops on our quadcores:

mpirun -np 4 flops

We got about 8 GFLOPS! So, we have multicore working on individual PCs; now it's time to scale our job over the whole cluster:


mpirun -np 16 --hostfile machines flops

We got up to nearly 32 GFLOPS! We made sure all four PCs are identical COTS boxes, have identical firmware, have public-key-authenticated ssh for the user jaeger, and have these 3 files:

/home/jaeger/.ssh/authorized_keys
/home/jaeger/machines
/home/jaeger/flops

The /home/jaeger/machines file is a txt file that looks like this:
10.5.129.17
10.5.129.18
10.5.129.19
10.5.129.20

InstantCluster Step 6: 
Software Stack III - MPI4PY + hello.py

Python is a very easy language to learn. It's an interpreted language, so all you have to do is write a text file and run it through the python interpreter, as long as it's installed (we have it in Ubuntu). hello.py is just one line (unlike Java): print "hello!" 

Save this line to a file called hello.py with your fave text editor and that's it! Or you could call it helloSEQ.py since it's a sequential, not an MPI, script. 

To run it, open a shell and type:

python helloSEQ.py

Alternatively, you could add a line at the top of helloSEQ.py, #!/usr/bin/python (or whatever path points to your installation of python). Then make the file executable and run it from the command line in a terminal:

chmod 755 helloSEQ.py
./helloSEQ.py
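
Putting that together, the whole helloSEQ.py is just two lines, the shebang plus the print:

#!/usr/bin/python
print "hello!"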

Then we installed the "python-mpi4py" package as the main user.
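
If you'd rather use the command line, the equivalent apt-get install (the same approach we used for the openMPI packages above) should be:

sudo apt-get install python-mpi4py
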
Make sure to install that package on every Linux box in the cluster. Our helloMPI.py looks like this:

#!/usr/bin/python
# helloMPI.py: test MPI over the cluster using mpi4py
from mpi4py import MPI

comm = MPI.COMM_WORLD              # the default communicator (all MPI processes)
id = comm.Get_rank()               # this process's rank: 0, 1, 2, ...
name = MPI.Get_processor_name()    # hostname of the node this process runs on
p = comm.Get_size()                # total number of MPI processes

if id == 0:
    print "This is id=0, I am the Master Process!"
    print "HELLO_MPI: there are ", p, " MPI processes running."
print
print "Hello World, from process: ", id
print "Hello World, my name is ", name

The first line (the shebang) is probably optional, since mpirun invokes python explicitly when you run this file on one PC (quadcore):

mpirun -np 4 python helloMPI.py

or on the whole cluster (4 quadcores):

mpirun -np 16 --hostfile machines python helloMPI.py

After you edit and save helloMPI.py, before running mpirun, make sure to:

scp helloMPI.py jaeger@10.5.129.XXX:~/

making an identical copy of the file helloMPI.py in the home folder of every PC in the cluster. That's it, you have a working MPI cluster using python!
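
To save some typing, you could push helloMPI.py to every node in one shot by looping over the same machines file from Step 5 (a rough sketch; adjust the path if your machines file lives elsewhere):

for ip in $(cat machines); do scp helloMPI.py jaeger@$ip:~/; done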

We are also studying sample MPI code.
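
For instance, here is a minimal point-to-point send/recv sketch in mpi4py (our own illustrative example, not necessarily the code we're studying). Save it as sendrecv.py and run it with mpirun -np 2 python sendrecv.py:

#!/usr/bin/python
# sendrecv.py: illustrative mpi4py point-to-point example (needs at least 2 processes)
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank == 0:
    data = [i * i for i in range(8)]     # the master process builds some data
    comm.send(data, dest=1, tag=11)      # send the python object to process 1
    print "Process 0 sent: ", data
elif rank == 1:
    data = comm.recv(source=0, tag=11)   # process 1 receives it
    print "Process 1 received: ", data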



InstantCluster Step 7: 
Coding I - Quadrature

InstantCluster Step 8: 
Coding II - Mandelbrot

InstantCluster Step 9: 
Coding III - Mandel Zoom

InstantCluster Step 10: 
Coding IV - POVRay

InstantCluster Step 11: 
Coding V - Blender

InstantCluster Step 12: 
Coding VI - 3D Animation

2016-2017 MANDATORY MEETINGS
09/14/2016 (organizational meeting)
10/26/2016 (installing Ubuntu 16.10 64bit)
11/09/2016 (installing Ubuntu 16.10 64bit)
12/14/2016 (Pelican HPC DVD)
01/11/2017 (openSSH Public Keys)
02/08/2017 (openMPI Software Stack)
03/08/2017 (Quadrature)
03/22/2017 (Fractal Plots + Zoom Movie)
(03/29/2017 is a make up day)
04/26/2017 (POVRAY 3D Stills + Animation)
05/10/2017 (Blender 3D Animation)
(05/24/2017 is a make up day)

So, what's all this good for aside from making a Fractal Zoom or Shrek Movie?

SETI Search
Econometrics
Bioinformatics
Protein Folding
Beal Conjecture
Scientific Computing
Computational Physics
Mersenne Prime Search
Computational Chemistry
Computational Astronomy
Computer Aided Design (CAD)
Computer Algebra Systems (CAS)

These are but a few examples of using Computer Science to solve problems in Mathematics and the Sciences (STEAM). In fact, many of these applications fall under the heading of Cluster Programming or Super Computing. These problems typically take too long to process on a single PC, so we need a lot more horsepower. Next time, maybe we'll just use Titan!

====================

Membership (alphabetic by first name):
CIS(theta) 2016-2017: 
DanielD(12), JevanyI(12), JuliaL(12), MichaelC(12), MichaelS(12), YaminiN(12)

CIS(theta) 2015-2016: 
BenR(11), BrandonL(12), DavidZ(12), GabeT(12), HarrisonD(11), HunterS(12), JacksonC(11), SafirT(12), TimL(12)

CIS(theta) 2014-2015: 
BryceB(12), CheyenneC(12), CliffordD(12), DanielP(12), DavidZ(12), GabeT(11), KeyhanV(11), NoelS(12), SafirT(11)

CIS(theta) 2013-2014: 
BryanS(12), CheyenneC(11), DanielG(12), HarineeN(12), RichardH(12), RyanW(12), TatianaR(12), TylerK(12)

CIS(theta) 2012-2013: 
Kyle Seipp(12)

CIS(theta) 2011-2012: 
Graham Smith(12), George Abreu(12), Kenny Krug(12), Lucas Eager-Leavitt(12)

CIS(theta) 2010-2011: 
David Gonzalez(12), Herbert Kwok(12), Jay Wong(12), Josh Granoff(12), Ryan Hothan(12)

CIS(theta) 2009-2010: 
Arthur Dysart(12), Devin Bramble(12), Jeremy Agostino(12), Steve Beller(12)

CIS(theta) 2008-2009: 
Marc Aldorasi(12), Mitchel Wong(12)

CIS(theta) 2007-2008: 
Chris Rai(12), Frank Kotarski(12), Nathaniel Roman(12)

CIS(theta) 1988-2007: 
A. Jorge Garcia, Gabriel Garcia, James McLurkin, Joe Bernstein, ... too many to mention here!
====================


Well, that's all folks,
A. Jorge Garcia

 Applied Math, Physics and CS
2017 NYS Secondary Math PAEMST Nominee


