How to Build a Supercomputer



So you want to build a supercomputer? Well, if you checked out my video on quantum computing, you’ll know that scientists are working to make computers stronger and faster than ever. Unfortunately, building a quantum computer is not something you can do at home, unless your home is a laboratory of course. All hope isn’t lost though: you can still build your very own supercomputer. To do this, we will be making a Linux cluster. A cluster is a set of loosely or tightly connected computers that work together so that they can be viewed as a single system.

For this Linux cluster, we will be using a program called MPICH. MPICH is an implementation of the Message Passing Interface (MPI) standard, which is widely used in parallel computing to pass information between the worker nodes on a network. One thing to note for this tutorial: I will not be making use of a shared file system, although using one will make your life a lot easier. Without a shared file system (as I will show below), the file paths and usernames must be the same on every system. Now, let’s take a look at how to build a supercomputer.

What You Will Need:

  • At least two devices running Linux
  • MPICH

**I am going to assume that you are running a Debian-based distro**

Step 1: Download Prerequisites

Before we get started, you will need to make sure that you have GCC, G++, the OpenSSH server and a Fortran compiler installed. Type in the following:

sudo apt-get update

sudo apt-get upgrade

sudo apt-get install gcc

sudo apt-get install g++

sudo apt-get install openssh-server

sudo apt-get install gfortran

Step 2: Create New User

Next, you need to create a new user and add them to the sudo group. Keep in mind, the username must be exactly the same on both systems (this is where using a shared file system would save some time).

sudo adduser dave

sudo adduser dave sudo

[Screenshots: adduser dave; add user to sudo group]

This will create a new user called dave and add him to the sudo group. Remember to repeat this process on all the other computers that you are planning to use in the final supercomputer (use the same username). Now log out and log back in as that user.

Step 3: Edit Hosts File

To make life a little easier, we are going to give names to all the computers on the network instead of just referring to them by their IP address. To do this, you must edit the hosts file. I am first going to install the nano text editor:

sudo apt-get install nano

sudo nano /etc/hosts

Make sure that your hosts file looks similar to mine. Of course, replace the IP addresses (and names if you desire) with the appropriate values for your network. Hit Ctrl + O to save and Ctrl + X to exit.

[Screenshot: /etc/hosts file]
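As a rough guide, the entries for a two-machine cluster could look like the ones below. The IP addresses and the name master are placeholders I made up for illustration; worker0 is the name used in the rest of this tutorial. The sketch writes to a scratch file so you can check the entries before typing them into /etc/hosts.

```shell
# Sketch only: the addresses and the name "master" are assumed placeholders.
# Writes to a scratch file instead of /etc/hosts so nothing is changed by accident.
scratch_hosts="$(mktemp)"

cat > "$scratch_hosts" <<'EOF'
192.168.1.10    master
192.168.1.11    worker0
EOF

# Review the entries, then add them to the real file with: sudo nano /etc/hosts
cat "$scratch_hosts"
```

Every computer in the cluster needs the same entries so that each one can find the others by name.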

To test it, type in the following:

ping worker0

Of course, replace worker0 with whatever you called the other computer. It should be able to communicate with the other computer without failure.

[Screenshot: ping test]

Step 4: Configure MPICH Program

Download the mpich.tar.gz file for your system. Extract it using whatever method you like (I prefer just to use the GUI).

[Screenshot: extracting the mpich tar.gz]

Make a directory in your home folder called mpich.

mkdir ~/mpich

Then navigate to the mpich folder that you just extracted and run the configure script.

cd Downloads

cd mpich.your.version

./configure --prefix=/home/dave/mpich

This configures the build so that the files are installed into the mpich folder that you made in your home directory. Now, build and install the program:

make

make install

Finally, copy the examples folder (located inside the mpich.your.version folder that you extracted) into the mpich folder inside of your home directory. I am going to use the GUI (file explorer) rather than doing it from the terminal.

**Remember, all of these steps must be repeated on all systems on the network**

Step 5: Bashrc

Next, you need to add the MPICH paths to your .bashrc file. In your home directory, type in the following:

nano .bashrc

At the bottom of the file, add in the following lines:

export PATH=/home/dave/mpich/bin:$PATH

export PATH

LD_LIBRARY_PATH="/home/dave/mpich/lib:$LD_LIBRARY_PATH"

export LD_LIBRARY_PATH

Then, type Ctrl + O to save and Ctrl + X to exit.

[Screenshot: export lines in .bashrc]

To test that it works type in:

which mpicc

You should see the folder path. Do it again with mpiexec:

which mpiexec

[Screenshot: .bashrc test]
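If you would rather check both tools in one go, a small helper can confirm that each command resolves to a path inside your install folder. This is only a sketch: the function name in_prefix is my own invention, and the folder /home/dave/mpich assumes you used the same prefix as in Step 4.

```shell
# Succeeds when command $1 resolves to a path under directory $2.
in_prefix() {
    case "$(command -v "$1")" in
        "$2"/*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Report on both tools (prefix assumed from Step 4).
for tool in mpicc mpiexec; do
    in_prefix "$tool" /home/dave/mpich && echo "$tool: ok" || echo "$tool: not found in /home/dave/mpich"
done
```

If either tool is reported as not found, open a new terminal (so that .bashrc is read again) and double-check the export lines.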

Step 6: Processor file

We need a file that tells MPICH which computers are on the network and the number of processes that we want each of them to handle. Navigate to the mpich directory in your home folder. Create a file (I’ll call it hosts) and within that file list each computer name with its number of processes.

cd ~/mpich

nano hosts

Then, type Ctrl + O to save and Ctrl + X to exit.

[Screenshot: processor file]
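For reference, the MPICH launcher reads this file one machine per line in the form name:count, where count is the number of processes to start on that machine. The names and counts below are assumptions for a two-machine setup (master is a made-up name for the computer you launch from), so adjust them to match your /etc/hosts entries and the number of cores you have.

```shell
# Sketch: create a file called "hosts" in the current directory.
# "master" and the process counts are assumed values; use your own.
cat > hosts <<'EOF'
master:2
worker0:2
EOF

cat hosts
```

Run from inside the mpich folder in your home directory, this does the same job as typing the two lines into nano.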

Step 7: Password-less SSH

The final thing that we need to do is make sure that you can connect via ssh to the other computer(s) without needing a password. Let’s generate the ssh key.

ssh-keygen

Keep all of the values at their defaults and don’t specify a passphrase. Finally, type in:

ssh-copy-id worker0

Replace “worker0” with the appropriate computer on your network. This will copy the SSH key to that computer. It might prompt you to log in using the password the first time. In any case, try to ssh into that computer and make sure that you don’t need to enter a password.

ssh worker0

If it works, type “exit” to close the SSH session.
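With more than a couple of computers, checking each one by hand gets tedious. The sketch below reads the names from the hosts file made in Step 6 and tries a non-interactive login to each. The function names are my own, and the file path in the usage line assumes the layout used in this tutorial.

```shell
# Print the host names from an MPICH-style hosts file (name or name:count per line),
# skipping blank lines and comments.
hosts_from_file() {
    sed -e 's/#.*//' -e 's/:.*//' -e 's/[[:space:]]//g' -e '/^$/d' "$1"
}

# Try a non-interactive login to every host. BatchMode makes ssh fail
# instead of prompting when a password would be required.
check_passwordless_ssh() {
    for host in $(hosts_from_file "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true; then
            echo "$host: ok"
        else
            echo "$host: password-less login failed"
        fi
    done
}

# Usage:
# check_passwordless_ssh /home/dave/mpich/hosts
```

Any computer that reports a failure needs ssh-copy-id run against it again.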

Step 8: Testing

Now it’s time for us to test the supercomputer. Just to summarize, the above steps were all so that the computers can communicate effectively with each other on the network using MPICH. Navigate to the examples folder inside of the mpich directory in your home folder and run the pi example program. Within the run command, you can specify the number of processes that you want to use:

cd ~/mpich

cd examples

mpiexec -n 4 -f /home/dave/mpich/hosts ./cpi

If everything works, you should see output similar to mine.

[Screenshot: MPICH output of the pi program]

As you can see, the work is split across the individual computers on the network. Keep in mind that this is just a simple program that calculates an approximation of pi. Imagine what we could do with a more complex program and many more devices on the network. That would be extraordinary!
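To see that split for yourself, one common trick is to launch the ordinary hostname command through mpiexec, so that every process prints the name of the computer it landed on. The wrapper below is a sketch: the function name where_do_processes_run is made up, and the hosts file path in the usage line assumes the layout used in this tutorial.

```shell
# Launch `hostname` as $2 processes using hosts file $1.
# Returns 127 with a message if mpiexec is not on the PATH.
where_do_processes_run() {
    if ! command -v mpiexec >/dev/null 2>&1; then
        echo "mpiexec not found; check the PATH lines in .bashrc" >&2
        return 127
    fi
    mpiexec -n "$2" -f "$1" hostname
}

# Usage:
# where_do_processes_run /home/dave/mpich/hosts 4
```

Each line of output is one process, so counting how often each name appears shows how the processes were shared out.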