BeagleBoneBlackCluster

This work is part of an ongoing project to build a scalable micro computing cluster for building energy management research. In this tutorial, I will iterate upon the BeagleBone Black Cluster: Demo Build, with the primary intention of improving the efficiency of the setup process and the usability of the cluster system. Specifically, the tutorial will introduce cluster SSH for repeated commands, BitTorrent Sync for large file sharing, and PostgreSQL for database storage. Also, instead of using a BBB as the manager node, I will use a laptop running a desktop version of Ubuntu.

Many of the steps in this tutorial are the same as in the Demo Build. For efficiency, I will assume that you have worked through the Demo Build, so I will leave out many of the comments for those steps. Also, I will use Note to provide additional information or explanations and Optional to indicate steps that may be helpful but are not required for the build. If you find any errors or solve any bugs, please let me know. I would greatly appreciate it.

Last Tested With:

  • BeagleBone Black Rev C
  • Debian 7.5 (2014-05-14)
  • Ubuntu 14.04.1 LTS
  • BitTorrent Sync 1.4
  • MPICH 3.1.3

I. Preparation

1.1 Requirements

To set up this cluster, you will need two or more BeagleBone Blacks, a computer running Debian or Ubuntu, and a switch to provide Ethernet access. I will assume each BBB is running an officially supported Debian image. Also, to set up the boards, you will need a computer with an SSH client (I will be using an Ubuntu laptop).

1.2 Troubleshooting

After setting up a 24-node cluster, I have a few troubleshooting tips.

  • When turning on a BBB, if the PWR light turns on but none of the USR lights do, disconnect and reconnect the board.
  • If the Ethernet lights do not turn on, reboot the board.
  • If a BBB hangs on startup (lights are on/blinking but you are never able to access the board over USB/SSH), re-flash the Debian OS to the eMMC. If the problem persists, re-flash the SD card and then re-flash the eMMC.
  • In my experience, it is pretty easy to corrupt an install. You should avoid inserting/removing the SD card with power connected and minimize contact with the board as much as possible.

II. Manager Node Setup

In this build, a laptop running Ubuntu will serve as the manager node and each BBB will be a worker node. For the Ubuntu laptop, I will assume that you have already set up a user, chosen a hostname, and updated the OS. Feel free to change the hostname to nodem or to leave the hostname unchanged. However, I will be referring to the manager node as nodem.

2.1 Static IP

There are a number of tools and graphical interfaces that will allow you to set a static IP in Ubuntu; however, I have had the best results by simply editing the /etc/network/interfaces file (see step 2 in the Demo Build for more about determining addresses). I am using 192.168.1.39 as the static IP of nodem.

ubuntu@nodem:~$ sudo nano /etc/network/interfaces
# The primary network interface

auto eth0
iface eth0 inet static
  address 192.168.1.39
  netmask 255.255.255.0
  gateway 192.168.1.1
  dns-nameservers 8.8.8.8 8.8.4.4
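
Note: To apply the new settings without rebooting, you can bring the interface down and back up (this assumes eth0 is the interface you edited and that you are working locally rather than over SSH on that same interface):

ubuntu@nodem:~$ sudo ifdown eth0 && sudo ifup eth0
ubuntu@nodem:~$ ip addr show eth0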

2.2 Hosts

Update the /etc/hosts file to include all the nodes that will be in your system. If using nano, press Ctrl+X to exit, press Y to confirm saving the changes, and press Enter to keep the same filename.

ubuntu@nodem:~$ sudo nano /etc/hosts
127.0.0.1 localhost

192.168.1.39 nodem
192.168.1.40 node0
192.168.1.41 node1
192.168.1.42 node2
192.168.1.43 node3
192.168.1.44 node4
192.168.1.45 node5
192.168.1.46 node6
192.168.1.47 node7
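
Optional: You can confirm that a hostname resolves to the address you expect with getent, which reads /etc/hosts (node0 here is just an example):

ubuntu@nodem:~$ getent hosts node0
192.168.1.40 node0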

2.3 Cluster User

Create a new user called cluster and assign sudo privileges. Set a password but feel free to leave the user information blank.

ubuntu@nodem:~$ sudo adduser cluster
ubuntu@nodem:~$ sudo adduser cluster sudo
ubuntu@nodem:~$ su - cluster

III. Worker Nodes Setup

Repeat steps 3.1 through 3.4 for each BBB worker node.

Note: If you use an Ubuntu laptop to SSH via USB into the worker nodes, you may need to remove 192.168.7.2 from the list of known hosts before connecting another BBB.

cluster@nodem:~$ ssh-keygen -f "/home/cluster/.ssh/known_hosts" -R 192.168.7.2

3.1 Change Password

debian@beaglebone:~$ passwd

3.2 Static IP

Next, edit the /etc/network/interfaces file, as shown below. For each BBB, replace X with a different number (see step 2 in the Demo Build for more about determining addresses).

debian@beaglebone:~$ sudo nano /etc/network/interfaces
# The primary network interface

#allow-hotplug eth0
#iface eth0 inet dhcp
auto eth0
iface eth0 inet static
  address 192.168.1.X
  netmask 255.255.255.0
  network 192.168.1.0
  broadcast 192.168.1.255
  gateway 192.168.1.1

Optional: Verify that the static IP is working. Reboot or shutdown/reconnect the board (wait for the lights to turn off). Once rebooted, ping the board from the manager node.

debian@beaglebone:~$ sudo shutdown -h now
cluster@nodem:~$ ping 192.168.1.X

3.3 Cluster User

Create a new user called cluster and assign sudo privileges. Set a password but feel free to leave the user information blank.

debian@beaglebone:~$ sudo adduser cluster
debian@beaglebone:~$ sudo adduser cluster sudo

3.4 Hostname

Change the board’s hostname (node0, node1, node2, etc.).

debian@beaglebone:~$ sudo hostname node0

And update the /etc/hostname file.

debian@beaglebone:~$ sudo nano /etc/hostname
node0

Note: If you are seeing the message unable to resolve host node0, this is because we have not yet updated the list of hosts (step 5.2 below).

IV. Cluster SSH

Cluster SSH is a tool for making the same changes to multiple systems at the same time. It opens a terminal window for each node as well as a console window that sends input to every node simultaneously. This will enable us to configure many nodes at once and keep them in sync.

4.1 Install

Install cluster SSH on the manager node (Ubuntu laptop).

cluster@nodem:~$ sudo apt-get install clusterssh

The easiest way to start cluster SSH is with the command syntax:

cssh <username>@<host1> <username>@<host2> <username>@<host3> 

For example:

cluster@nodem:~$ sudo cssh cluster@node0 cluster@node1 cluster@node2 cluster@node3 

Or, since the username for the SSH client and server are the same, simply:

cluster@nodem:~$ sudo cssh node0 node1 node2 node3 
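
Optional: Rather than typing out the host list every time, cluster SSH can read named groups of hosts from a cluster configuration file. The sketch below assumes your version of clusterssh reads /etc/clusters (check man cssh for the exact file name and syntax on your system). It defines a group called workers and then opens every worker node with one short command:

cluster@nodem:~$ sudo nano /etc/clusters
workers node0 node1 node2 node3
cluster@nodem:~$ sudo cssh workers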

V. Synced Setup

Access the worker nodes using cluster SSH and complete the following steps on each node at the same time. Be very careful with cluster SSH and make sure that each node is always at the correct step.

Note: I will use nodeX to jointly refer to each of the worker nodes (node0, node1, node2, etc.).

5.1 Update

Update and upgrade the boards (this might take a few minutes).

cluster@nodeX:~$ sudo apt-get update
cluster@nodeX:~$ sudo apt-get upgrade --yes
cluster@nodeX:~$ sudo apt-get clean

Note: If you are using the Debian Image 2014-05-14, you may need to remove one of the links from the apt sources list (GPG Error).

cluster@nodeX:~$ sudo nano /etc/apt/sources.list

Change:

deb [arch=armhf] http://debian.beagleboard.org/packages wheezy-bbb main

#deb-src [arch=armhf] http://debian.beagleboard.org/packages wheezy-bbb main

To:

#deb [arch=armhf] http://debian.beagleboard.org/packages wheezy-bbb main

#deb-src [arch=armhf] http://debian.beagleboard.org/packages wheezy-bbb main

Note: If you are seeing the error E: Sub-process /usr/bin/dpkg returned an error code (1), I suggest re-flashing the eMMC and starting over.

5.2 Hosts

Update the /etc/hosts file (same content as the manager’s hosts file).

cluster@nodeX:~$ sudo nano /etc/hosts

5.3 Timezone

Optional: To change the local timezone, use the following command and follow the prompts.

cluster@nodeX:~$ sudo dpkg-reconfigure tzdata
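
If you would rather skip the interactive prompts (for example, when configuring many nodes at once through cluster SSH), one non-interactive approach is to write the timezone directly and then reconfigure tzdata (America/New_York is just an example; substitute your own zone):

cluster@nodeX:~$ echo "America/New_York" | sudo tee /etc/timezone
cluster@nodeX:~$ sudo dpkg-reconfigure -f noninteractive tzdata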

5.4 Remove Resources

Optional: The Debian distribution for BBB comes with many out-of-the-box services that are unnecessary for a headless computer cluster. If you are using the BBB Rev C or later, space probably won't be an issue during the setup process. For the BBB Rev B or earlier, you will at least need to remove the X11 GUI in order to have the space needed to complete the rest of the setup. In my experience, these steps can shrink the Debian install to about 1 GB.

To check how much free space you have in the file system:

cluster@nodeX:~$ df -h
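
To see which installed packages are taking up the most space (helpful when deciding what else to remove), you can sort the package list by installed size:

cluster@nodeX:~$ dpkg-query -Wf '${Installed-Size}\t${Package}\n' | sort -rn | head -20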

Note: For each package removed with apt-get, we will use autoremove to uninstall unused dependencies. However, some of these packages will be re-installed later as dependencies of other software.

Remove the X11 GUI:

cluster@nodeX:~$ sudo apt-get purge x11-common --yes
cluster@nodeX:~$ sudo apt-get autoremove --yes

Remove the contents of the documentation directory:

cluster@nodeX:~$ sudo rm -rf /usr/share/doc/*

Remove GTK+, a multi-platform toolkit for creating graphical user interfaces:

cluster@nodeX:~$ sudo apt-get purge libgtk2.0-common --yes
cluster@nodeX:~$ sudo apt-get autoremove --yes

Remove the Apache2 Web Server:

cluster@nodeX:~$ sudo apt-get purge apache2 --yes
cluster@nodeX:~$ sudo apt-get autoremove --yes
cluster@nodeX:~$ sudo rm -rf /etc/apache2

5.5 Reboot

Reboot the boards and exit SSH.

cluster@nodeX:~$ sudo reboot && exit

5.6 Check

In my experience, a number of problems with SSH, NFS, and MPI are simply caused by an incorrect network configuration. Before moving on, verify that the nodes are accessible by their hostname at the correct IP address.

cluster@nodeX:~$ ping nodem
cluster@nodem:~$ ping node0
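
Optional: With many workers, checking each node by hand gets tedious. A small shell loop on the manager node can test all of them at once (this assumes the node0 through node7 naming used above):

cluster@nodem:~$ for i in $(seq 0 7); do ping -c 1 -W 2 node$i > /dev/null && echo "node$i ok" || echo "node$i unreachable"; done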

VI. SSH Keys

6.1 For Manager

For this build, a single public and private key pair will be copied to each node in the system, allowing for secure password-less access. To begin, install SSH and generate a public and private key. When asked where to save the files, accept the default location in the cluster user's home directory (just press Enter). When asked for a passphrase, leave it blank (just press Enter).

cluster@nodem:~$ sudo apt-get install ssh
cluster@nodem:~$ ssh-keygen

Authorize the key for logins to nodem by copying the contents of the public key into the authorized keys file.

cluster@nodem:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Verify that you can log into nodem (from nodem) without entering a password.

cluster@nodem:~$ ssh cluster@nodem
cluster@nodem:~$ exit

Note: If you are having difficulties, it might be helpful to delete the contents of the .ssh directory and start over.

cluster@nodem:~$ rm -rf ~/.ssh

6.2 For Workers

Install SSH.

cluster@nodeX:~$ sudo apt-get install ssh

Copy the public and private keys from nodem to each of the worker nodes. The following command reads “copy the file from nodem to nodeX while logged into nodeX”.

cluster@nodeX:~$ scp cluster@nodem:~/.ssh/id_rsa.pub ~/.ssh/id_rsa.pub
cluster@nodeX:~$ scp cluster@nodem:~/.ssh/id_rsa ~/.ssh/id_rsa

Again, copy the contents of the public key into the authorized keys file.

cluster@nodeX:~$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

You should now be able to SSH from any node into any node of the system without entering a password.
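
Optional: A quick way to confirm password-less access from the manager node is to run a command on every worker over SSH; each node should print its hostname without prompting for a password (again assuming the node0 through node7 names):

cluster@nodem:~$ for i in $(seq 0 7); do ssh node$i hostname; done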

VII. NFS

In this build, we will be mounting a single directory using NFS. See Demo Build: Step 7 for details.

7.1 For Manager

cluster@nodem:~$ sudo apt-get install nfs-kernel-server
cluster@nodem:~$ mkdir /home/cluster/nfs_shared
cluster@nodem:~$ echo '/home/cluster/nfs_shared *(rw,sync,fsid=0,no_subtree_check)' | sudo tee -a /etc/exports
cluster@nodem:~$ sudo service nfs-kernel-server restart

7.2 For Workers

cluster@nodeX:~$ sudo apt-get install nfs-common
cluster@nodeX:~$ mkdir /home/cluster/nfs_shared
cluster@nodeX:~$ sudo mount nodem:/home/cluster/nfs_shared /home/cluster/nfs_shared
cluster@nodeX:~$ echo 'nodem:/home/cluster/nfs_shared /home/cluster/nfs_shared nfs defaults 0 0' | sudo tee -a /etc/fstab

Ensure that NFS directories are mounted after the network connection is established by adding mount -a to /etc/rc.local (before exit 0):

cluster@nodeX:~$ sudo nano /etc/rc.local
mount -a

Reboot and check if the directory is mounted (if nothing appears, the directory is not mounted).

cluster@nodeX:~$ sudo reboot && exit
cluster@nodeX:~$ mount -l | grep /home/cluster
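
Optional: To confirm that the share is working end to end, create a test file on the manager node and check that it appears on a worker:

cluster@nodem:~$ touch ~/nfs_shared/nfs_test
cluster@nodeX:~$ ls ~/nfs_shared
cluster@nodem:~$ rm ~/nfs_shared/nfs_test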

VIII. MPI

8.1 Install

Next, we will install MPI on the manager and the worker nodes. While apt-get is convenient for installing packages and dependencies, the version of MPICH in Debian's Wheezy repository is outdated (version 1.4.1). In case you were wondering, MPICH2 was renamed simply MPICH in 2012 and began supporting MPI-3.0 in 2013 with the release of version 3.0. At the time of this writing, version 3.1.3 is the stable release at mpich.org. In this build, we will compile MPICH on each node in the system (install MPI on the manager node first and then use cluster SSH to install it on the worker nodes).

First, install the dependencies (specifically, C, C++, and Fortran compilers). Then download and unpack MPICH.

cluster@nodem:~$ sudo apt-get install build-essential gfortran
cluster@nodem:~$ wget http://www.mpich.org/static/downloads/3.1.3/mpich-3.1.3.tar.gz
cluster@nodem:~$ tar -xzf mpich-3.1.3.tar.gz
cluster@nodem:~$ cd mpich-3.1.3

Configure and install MPICH. This may take close to an hour.

cluster@nodem:~/mpich-3.1.3$ ./configure
cluster@nodem:~/mpich-3.1.3$ make; sudo make install

Clean up.

cluster@nodem:~$ cd ~
cluster@nodem:~$ rm -rf mpich-*

Run a couple of checks. The mpichversion command should show version 3.1.3, and mpicc, the MPI C compiler, should be in /usr/local/bin.

cluster@nodem:~$ mpichversion
cluster@nodem:~$ which mpicc

Next, install MPI4PY. This will also take some time.

cluster@nodem:~$ sudo apt-get install python-dev python-pip
cluster@nodem:~$ sudo pip install mpi4py

Note: If the MPI4PY install is unsuccessful, you can try passing in the location of mpicc.

cluster@nodem:~$ sudo env MPICC=/usr/local/bin/mpicc pip install mpi4py
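
To quickly confirm that MPI4PY is importable, print its version from Python:

cluster@nodem:~$ python -c "import mpi4py; print mpi4py.__version__"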

8.2 Test

Next, we will test the MPICH and MPI4PY installation by running a couple of simple examples from the manager node. Remember to work in the manager node’s nfs_shared directory so that all the nodes will have access to the files. 

cluster@nodem:~$ cd ~/nfs_shared

Create a machines.txt file and list the hostnames for each node in the cluster. In this build, I am not including the manager node in the list (If I include nodem, I get assertion errors when trying to use MPI bcast or gather).

cluster@nodem:~/nfs_shared$ sudo nano machines.txt
node0
node1
node2
node3

Next, create and save the hello.py example.

cluster@nodem:~/nfs_shared$ sudo nano hello.py
from mpi4py import MPI
from sys import stdout
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
name = MPI.Get_processor_name()
stdout.write("Hello World! I am process " \
"%d of %d on %s.\n" % (rank, size, name))

Now, run the program using the following command syntax:

mpiexec -n <number of processes> -f <hostlist> python <file>

For example:

cluster@nodem:~/nfs_shared$ mpiexec -n 4 -f machines.txt python hello.py

You should see an output similar to:

Hello World! I am process 1 of 4 on node1.
Hello World! I am process 2 of 4 on node2.
Hello World! I am process 3 of 4 on node3.
Hello World! I am process 0 of 4 on node0.

To truly test that MPI is working, we need to try broadcasting and gathering. The next example will do both. 

cluster@nodem:~/nfs_shared$ sudo nano bcast_gather.py
from mpi4py import MPI
from sys import stdout
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
name = MPI.Get_processor_name()
if rank == 0:
    data = { 'key1' : [7, 2.72, 2+3j] }
else:
    data = None
data = comm.bcast(data, root=0) # Process 0 will broadcast
data2 = [rank,size]
data2 = comm.gather(data2, root=0) # Process 0 will gather
stdout.write("Process %d of %d on %s:\n" \
" Received: %s\n Gathered: %s\n" \
% (rank,size,name,data,data2))
cluster@nodem:~/nfs_shared$ mpiexec -n 4 -f machines.txt python bcast_gather.py

If you are not receiving any errors, consider yourself lucky. In my experience, errors related to bcast or gather are really difficult to solve. 

If you are having trouble, try rebooting everything, using ping to check that every node is reachable at the expected IP address, SSH-ing into each node and back into the manager node, and excluding nodes from the machines.txt file to isolate problem nodes.
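
It can also help to take Python out of the picture and run a plain command through mpiexec; if the command below works but the MPI4PY examples do not, the network and host file are probably fine and the problem is on the Python side:

cluster@nodem:~/nfs_shared$ mpiexec -n 4 -f machines.txt hostname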

8.3 Tips

Optional: Below are a few more tips for working with MPI.

Instead of passing the machines.txt file every time, we can set the default machines list using Hydra (included in MPICH 1.3.0+).

cluster@nodem:~$ export HYDRA_HOST_FILE=/home/cluster/nfs_shared/machines.txt
cluster@nodem:~$ mpiexec -n 4 python hello.py
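
Note that export only lasts for the current shell session. To make the default host file persistent, you can append the export to the cluster user's ~/.bashrc on the manager node:

cluster@nodem:~$ echo 'export HYDRA_HOST_FILE=/home/cluster/nfs_shared/machines.txt' >> ~/.bashrc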

To disable the SSH “message of the day” from appearing every time a node is accessed, use cluster SSH to open /etc/ssh/sshd_config and comment out the line starting with Banner (restart the SSH service or reboot for the change to take effect).

cluster@nodeX:~$ sudo nano /etc/ssh/sshd_config
#Banner /etc/issue.net

Alternatively, you can add LogLevel quiet to the ~/.ssh/config file of just the manager node. However, be aware that this will also silence all SSH error messages.

cluster@nodem:~$ sudo nano /home/cluster/.ssh/config
LogLevel quiet

IX. BitTorrent Sync

Optional: In this build, I will try to augment the system's file sharing capabilities with BitTorrent Sync. BitTorrent Sync has the advantage of transferring files to each of the nodes rather than mounting a remote directory. This makes it better suited for sharing larger files like datasets, and because the worker nodes will be given read-only access, the end result is similar to NFS. However, at the time of this writing, I have had only mild success getting BitTorrent Sync to actually transfer files in a timely manner (read-only access combined with 8 to 24 nodes booting up at once seems to cause trouble). In many cases, it is easier to just transfer files using Secure Copy:

scp cluster@nodem:~/path/to/file ~/path/to/file

Nonetheless, for those interested in BitTorrent Sync, here is how I set it up (last tested with BitTorrent Sync 1.4).

9.1 For Manager

Go to getsync.com and download BitTorrent Sync for your system, or download it from the command line (replace x64 with your system):

cluster@nodem:~$ wget -O btsync.tar.gz http://download.getsyncapp.com/endpoint/btsync/os/linux-x64/track/stable

Create a hidden sync directory and a directory to share. Extract the BitTorrent Sync tar ball into the .sync directory.

cluster@nodem:~$ mkdir /home/cluster/.sync
cluster@nodem:~$ mkdir /home/cluster/btsync_shared
cluster@nodem:~$ tar -zxvf btsync.tar.gz -C /home/cluster/.sync
cluster@nodem:~$ rm btsync.tar.gz

Normally, we would start the BitTorrent Sync daemon with the command:

cluster@nodem:~$ /home/cluster/.sync/btsync

Then, we could open a web browser and go to localhost:8888 to access the WebUI (web user interface).

However, in this project, we are going to take a more automated approach and start BitTorrent Sync on each node using a config file.

To do this, we first need to generate a secret using the following command.

cluster@nodem:~$ /home/cluster/.sync/btsync --generate-secret

Next, in the .sync directory, create a file called btsync_config and add the contents below. Replace the secret value in the shared_folders list with your secret. This config file will be used by the manager and each of the worker nodes to access the shared folder (next step). See this article for details.

cluster@nodem:~$ sudo nano /home/cluster/.sync/btsync_config
{
  "device_name": "node",
  "listening_port" : 12345,
  "storage_path" : "/home/cluster/.sync",
  "check_for_updates" : false,
  "use_upnp" : false,
  "download_limit" : 0,
  "upload_limit" : 0,
  "lan_encrypt_data": true,
  "webui" :{ },
  "shared_folders" :
  [ {
    "secret" : "YOUR SECRET HERE",
    "dir" : "/home/cluster/btsync_shared",
    "use_relay_server" : true,
    "use_tracker" : true,
    "use_dht" : false,
    "search_lan" : true,
    "use_sync_trash" : true
  } ]
}

To automatically launch BitTorrent Sync when the system boots, open crontab and add a line to start btsync on reboot.

cluster@nodem:~$ env EDITOR=nano crontab -e
@reboot /home/cluster/.sync/btsync --config /home/cluster/.sync/btsync_config

Note: In my experience, crontab is sensitive to extra whitespace and empty lines.

To check that btsync runs on boot, restart the system and verify from the command line:

cluster@nodem:~$ ps aux | grep "btsync"

9.2 For Workers

Download the ARM version of BitTorrent Sync and extract the contents.

cluster@nodeX:~$ wget -O btsync.tar.gz http://download.getsyncapp.com/endpoint/btsync/os/linux-arm/track/stable
cluster@nodeX:~$ mkdir /home/cluster/.sync
cluster@nodeX:~$ mkdir /home/cluster/btsync_shared
cluster@nodeX:~$ tar -zxvf btsync.tar.gz -C /home/cluster/.sync
cluster@nodeX:~$ rm btsync.tar.gz

Copy the btsync_config file we previously created from nodem to each node.

cluster@nodeX:~$ scp cluster@nodem:~/.sync/btsync_config ~/.sync/btsync_config

Now, there is a problem: each worker node has an identical configuration file, but we need the device_name in each file to be unique. To fix this, use the manager node to save the following Python code to a file in the nfs_shared directory. Then, use cluster SSH to run the code on each worker node, replacing the device_name with the node's hostname.

cluster@nodem:~$ sudo nano /home/cluster/nfs_shared/btsync_device_config.py
import json
import socket
data = {}
with open('/home/cluster/.sync/btsync_config', 'r') as jsf:
    data = json.load(jsf)
data['device_name'] = socket.gethostname()
with open('/home/cluster/.sync/btsync_config', 'w') as jsf:
    json.dump(data, jsf, indent=4)
cluster@nodeX:~$ python /home/cluster/nfs_shared/btsync_device_config.py

Again, open crontab and add a line to start btsync on reboot using the configuration file.

cluster@nodeX:~$ env EDITOR=nano crontab -e
@reboot /home/cluster/.sync/btsync --config /home/cluster/.sync/btsync_config

Reboot the workers. Confirm that btsync is running and that the btsync_shared directory is being synchronized. You can also check the Peer List in the WebUI, though it seems to be slow to update.

cluster@nodeX:~$ sudo reboot && exit
cluster@nodeX:~$ ps aux | grep "btsync"

X. Scientific Python Packages

10.1 Install

Next, I will install several Python packages on the manager and worker nodes. Specifically, NumPy, SciPy, SciKit Learn, Pandas, StatsModels, FANN, and CVXOPT.

cluster@nodem:~$ sudo apt-get install python-numpy python-scipy python-sklearn
cluster@nodem:~$ sudo apt-get install python-pandas python-statsmodels python-pyfann python-cvxopt
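
Optional: As a quick sanity check that the packages are importable, print a few of their versions (run this on the manager and, via cluster SSH, on the workers):

cluster@nodem:~$ python -c "import numpy, scipy, sklearn, pandas; print numpy.__version__, scipy.__version__, sklearn.__version__, pandas.__version__"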

10.2 Test SciKit-Learn

To demonstrate that SciKit-Learn is successfully installed, execute a basic linear regression by running the following code on one of the worker nodes.

cluster@nodem:~$ sudo nano /home/cluster/nfs_shared/sklearn_demo.py
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.utils import check_random_state
n = 100
x = np.arange(n)
rs = check_random_state(0)
y = rs.randint(-50, 50, size=(n,)) + 50. * np.log(1 + np.arange(n))
lr = LinearRegression()
lr.fit(x[:, np.newaxis], y) # x needs to be 2d for LinearRegression
x_p = lr.predict(x[:, np.newaxis])
rmse = np.sqrt( np.mean( np.square( np.subtract( x_p, y) ) ) ) # RMSE between predictions and y
print "RMSE:", rmse
cluster@nodeX:~$ python /home/cluster/nfs_shared/sklearn_demo.py

10.3 Test CVXOPT

Optional: To demonstrate that CVXOPT is successfully installed, use MPI to run a linear program optimization on each of the worker nodes.

cluster@nodem:~$ cd ~/nfs_shared
cluster@nodem:~/nfs_shared$ sudo nano mpi_cvxopt_demo.py
from mpi4py import MPI
from sys import stdout
from cvxopt import matrix, solvers
comm = MPI.COMM_WORLD
rank = comm.Get_rank()
size = comm.Get_size()
name = MPI.Get_processor_name()
if rank == 0:
    data = {'key1' : [7, 2.72, 2+3j]}
else:
    data = None
data = comm.bcast(data, root=0) # Process 0 will broadcast
# Solve a Linear Program
A = matrix([ [-1.0, -1.0, 0.0, 1.0], [1.0, -1.0, -1.0, -2.0] ])
b = matrix([ 1.0, -2.0, 0.0, 4.0 ])
c = matrix([ 2.0, 1.0 ])
solvers.options['show_progress'] = False
sol=solvers.lp(c,A,b)
data2 = [rank,sol['x']]
data2 = comm.gather(data2, root=0) # Process 0 will gather
stdout.write("Process %d of %d on %s:\n" \
" Received: %s\n Gathered: %s\n" \
% (rank,size,name,data,data2))
cluster@nodem:~/nfs_shared$ mpiexec -n 5 -f machines.txt python mpi_cvxopt_demo.py

XI. PostgreSQL

Optional: To serve and collect data, I find it useful to have a PostgreSQL database running on the manager node. The database is fairly easy to set up, though it might take some practice to use effectively. The following steps are based on posts from digitalocean.com and help.ubuntu.com.

11.1 For Manager

Install PostgreSQL, Psycopg (database adapter for Python), and pgAdmin (graphical admin tool).

cluster@nodem:~$ sudo apt-get install postgresql postgresql-contrib
cluster@nodem:~$ sudo apt-get install python-psycopg2 pgadmin3

Create a PostgreSQL user (called a role in PostgreSQL) named cluster. Assign privileges to create databases and other roles, and add a password for remote access. Then, create a database owned by cluster called clusterdb.

cluster@nodem:~$ sudo -u postgres createuser --createdb --createrole --pwprompt cluster
cluster@nodem:~$ sudo -u postgres createdb --owner=cluster clusterdb

To allow network access to the database, we need to edit the postgresql.conf file so that PostgreSQL listens on all IP addresses (replace 9.4 with the version of PostgreSQL you are using).

cluster@nodem:~$ sudo nano /etc/postgresql/9.4/main/postgresql.conf

Change:

listen_addresses = 'localhost'   # what IP address(es) to listen on;

To:

listen_addresses = '*'    # what IP address(es) to listen on;

We also need to add the following line to pg_hba.conf to enable authenticated access to the database. The IP address should match the network address used in the static IP setup (again, replace 9.4 with the version of PostgreSQL you are using).

cluster@nodem:~$ sudo nano /etc/postgresql/9.4/main/pg_hba.conf
# Allow anyone from the local network who is authenticated
host all all 192.168.1.0/24 md5

Note: The line we added to pg_hba.conf uses the syntax:

host <database> <user> <CIDR-address> <auth-method>

According to the PostgreSQL Docs, typical examples of a CIDR-address are 192.168.1.39/32 for a single host, or 192.168.1.0/24 for a small network, or 10.6.0.0/16 for a larger one. To specify a single host, use a CIDR mask of 32 for IPv4 or 128 for IPv6.

To apply these changes, restart the database service.

cluster@nodem:~$ sudo /etc/init.d/postgresql restart

We can test that everything is working using the command syntax:

psql -U <username> -p <port> -h <host> <database>

For example:

cluster@nodem:~$ psql -U cluster -p 5432 -h 192.168.1.39 clusterdb
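
Optional: While connected, you can create a throwaway table to confirm that the cluster role can write to clusterdb (test_readings is just a hypothetical name):

clusterdb=# CREATE TABLE test_readings (id serial PRIMARY KEY, node text, value real);
clusterdb=# INSERT INTO test_readings (node, value) VALUES ('node0', 1.23);
clusterdb=# SELECT * FROM test_readings;
clusterdb=# DROP TABLE test_readings;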

Exit the PostgreSQL console.

clusterdb=# \q

Optional: You can start the graphical admin tool, pgAdmin, using the command, pgadmin3. There is a useful video walk-through at EnterpriseDB.

cluster@nodem:~$ pgadmin3

11.2 For Workers

Install Psycopg, the database adapter for Python.

cluster@nodeX:~$ sudo apt-get install python-psycopg2

To test Psycopg, create a Python file in the nfs_shared directory with the following code. Make sure to replace YOUR_PASSWORD with the password you set for the cluster role. Use cluster SSH to run the code on each of the worker nodes.

cluster@nodem:~$ sudo nano /home/cluster/nfs_shared/postgresql_demo.py
# Small script to show PostgreSQL and Psycopg together
import psycopg2
try:
    conn = psycopg2.connect("dbname='clusterdb' user='cluster' host='nodem' password='YOUR_PASSWORD'")
    print "Connected"
except:
    print "I am unable to connect to the database"
cluster@nodeX:~$ python /home/cluster/nfs_shared/postgresql_demo.py
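
If the connection succeeds, you can extend the test to run an actual query. Below is a minimal sketch that asks the server for its version string (postgresql_query_demo.py is just a hypothetical filename; it assumes the same connection settings as above):

cluster@nodem:~$ sudo nano /home/cluster/nfs_shared/postgresql_query_demo.py
# Hypothetical follow-up script: run a simple query through Psycopg
import psycopg2
conn = psycopg2.connect("dbname='clusterdb' user='cluster' host='nodem' password='YOUR_PASSWORD'")
cur = conn.cursor()
cur.execute("SELECT version()")
print cur.fetchone()[0]
cur.close()
conn.close()
cluster@nodeX:~$ python /home/cluster/nfs_shared/postgresql_query_demo.py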

XII. Resources & Credits

If you have made it this far, CONGRATS! You now have a BeagleBone Black computing cluster.

As I mentioned in the Demo Build, I very highly recommend working through the tutorials at mpitutorial.com to learn the basics of MPI. The website is completely free, filled with well-written tutorials, and includes numerous detailed examples (written in C).

The development of this tutorial was supported as part of a crowdfunding campaign through Experiment.com. I am incredibly thankful for the generous support I have received!

2 Comments

  1. aurelien
    October 11, 2015

    Hi,
    Really impressed by this experimentation / course thanks for all!

    I see you use the USB as power. Where/How do you connect them?

    Can you present a bit more the hardware / stuff part?

    Thanks and Congratulations once again

    aurelien

    • Eric
      October 13, 2015

      Thanks for the compliment. I am still working on a “final” build that covers powering and connecting the nodes. The short answer to your question is that I use a 60W USB hub to power the BBBs and a 48 port Ethernet switch to connect and communicate between all the nodes. I’m still working on a post to detail the specifics of the hardware, but the BeagleBones, the USB hub, and the Ethernet switch are about all you need.
