Posted on September 1, 2019
This post is mostly here so that future me can remember how I did this, but it may also be useful to others. Just a heads up: I use older versions of PyTorch and Torchvision to make sure that PySyft works.
PySyft is a library designed to help with federated learning. It allows tensors to be serialized into binary form so that they can more easily be sent over the network, for example across ports or HTTPS.
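To make the idea of turning a tensor into bytes concrete, here is a minimal sketch that uses only the Python standard library. It is not PySyft's actual wire format: the names pack_tensor and unpack_tensor and the byte layout are made up for illustration.

```python
import struct


def pack_tensor(values, shape):
    """Serialize a flat list of floats plus its shape into bytes.

    Hypothetical layout: number of dimensions, then each dimension,
    then the values as little-endian 32-bit floats.
    """
    header = struct.pack("<I", len(shape)) + struct.pack(f"<{len(shape)}I", *shape)
    body = struct.pack(f"<{len(values)}f", *values)
    return header + body


def unpack_tensor(payload):
    """Rebuild the flat values and the shape from the bytes."""
    (ndim,) = struct.unpack_from("<I", payload, 0)
    shape = struct.unpack_from(f"<{ndim}I", payload, 4)
    count = 1
    for dim in shape:
        count *= dim
    values = struct.unpack_from(f"<{count}f", payload, 4 + 4 * ndim)
    return list(values), tuple(shape)


# Round trip: these bytes are what would travel over the network.
payload = pack_tensor([0.5, 1.5, -2.0, 4.0], (2, 2))
values, shape = unpack_tensor(payload)
```

The point is only that a tensor becomes a flat run of bytes that the other side can decode again. PySyft handles this for you.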
First, get yourself an account at https://aws.amazon.com
You will need to give them a credit card, because a free instance does not have enough memory to do the job and you will get memory failures.
Next, in the AWS panel, select “Launch Instance” to begin setup for your new AWS instance.
Then, select the type of OS you would like to run. My preference is for Ubuntu instances or the “Linux” (Amazon Linux) instances, which behave much like CentOS.
The next question will ask you what size machine you want. The more processors and RAM you select, the more expensive the instance. Unfortunately for this project, you cannot choose the free instance model. Running the server code alone will overwhelm the memory on the machine. I recommend large, just due to the amount of data that needs to be processed for machine learning. The task on the client side will run much faster if you choose this option. Medium will also work.
If you simply select and launch, you will be taken to the final screen, where you can review the type of AWS server you have set up.
When you select launch, it will ask you for a key. Use the top pull-down menu to have it give you a new key, and download it to a safe place on your machine. Make sure that you will have access to that location. If your machine appends .txt onto the end of the key's file name, remove it so that the name ends in .pem.
Now you are ready to open a terminal and ssh into your new instance. If ssh complains that the permissions on your key are too open, use chmod 400 (name of your pem) so that only you can read it.
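If you would rather script the two key fixes above (dropping a stray .txt and restricting the permissions), here is a small sketch. The function name prepare_key is my own, and the demo runs against a throwaway placeholder file, not a real key.

```python
import os
import stat
import tempfile
from pathlib import Path


def prepare_key(path):
    """Strip a stray .txt suffix and make the key readable only by its owner."""
    key = Path(path)
    if key.name.endswith(".pem.txt"):
        fixed = key.with_name(key.name[: -len(".txt")])
        key.rename(fixed)
        key = fixed
    os.chmod(key, stat.S_IRUSR)  # same effect as: chmod 400 <key>
    return key


# Demo on a throwaway file so that no real key is touched.
with tempfile.TemporaryDirectory() as tmp:
    downloaded = Path(tmp) / "nameofyourkey.pem.txt"
    downloaded.write_text("placeholder, not a real key\n")
    key = prepare_key(downloaded)
    demo_name = key.name
    demo_mode = stat.S_IMODE(key.stat().st_mode)
```

On a real machine you would call prepare_key with the path where your browser saved the key.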
To enter your instance, if the instance is Ubuntu, type into the terminal:
ssh -i nameofyourkey.pem ubuntu@your.aws.ip.address
If the instance is CentOS or another Linux:
ssh -i nameofyourkey.pem ec2-user@your.aws.ip.address
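The only difference between the two commands is the login user. As a sketch, the choice can be captured in a small lookup. The names DEFAULT_USERS and build_ssh_command are hypothetical, the mapping simply mirrors the two commands above, and 203.0.113.10 is a documentation-only placeholder address.

```python
import shlex

# Login user per OS choice, taken from the two ssh commands above.
DEFAULT_USERS = {"ubuntu": "ubuntu", "linux": "ec2-user"}


def build_ssh_command(key_path, host, os_name):
    """Return the ssh argument list for the given key, host, and OS choice."""
    user = DEFAULT_USERS[os_name.lower()]
    return ["ssh", "-i", key_path, f"{user}@{host}"]


command = build_ssh_command("nameofyourkey.pem", "203.0.113.10", "ubuntu")
printable = shlex.join(command)  # a copy-and-paste friendly string
```

If the login is refused, check the connection details that AWS shows for your instance, since some images use a different default user.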
Now you have a new instance. QUICK! Update! As with all cloud machines, you should expect that it may not be entirely up to date. So don’t forget to update whenever you start up something new.
Ubuntu:
sudo apt update
sudo apt upgrade -y
CentOS:
sudo yum update
And then do a quick reboot and log back in.
The next part is pulled mostly from https://github.com/NovaVic/PySyft/wiki but includes several additions for new AWS machines.
To begin, just as in the tutorial, set up Miniconda:
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
chmod 755 Miniconda3-latest-Linux-x86_64.sh
./Miniconda3-latest-Linux-x86_64.sh
Answer yes to the various prompts and run the conda init step as shown in the prompt. With that, conda is added to ~/.bashrc. To activate the base conda virtual env:
source ~/.bashrc
Install PyTorch, cudatoolkit, and Torchvision. You will need older versions of PyTorch and Torchvision to make this work:
conda install pytorch=1.0 torchvision cudatoolkit=9.0 -c pytorch
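Before moving on, it can be worth checking which versions actually ended up in the environment. Here is a minimal sketch. The helper names and the expected prefixes ("1.0." and "0.2.") are my assumptions based on the versions this post targets, and it relies on each package exposing a __version__ attribute.

```python
import importlib

# Assumed version prefixes, based on the versions used in this post.
EXPECTED_PREFIXES = {"torch": "1.0.", "torchvision": "0.2."}


def module_version(name):
    """Return the module's __version__, or None if it cannot be imported."""
    try:
        module = importlib.import_module(name)
    except (ImportError, OSError):
        # OSError covers installs whose shared libraries fail to load.
        return None
    return getattr(module, "__version__", None)


def check_versions(expected):
    """Map each module name to (found_version, matches_expected_prefix)."""
    report = {}
    for name, prefix in expected.items():
        found = module_version(name)
        report[name] = (found, found is not None and found.startswith(prefix))
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_versions(EXPECTED_PREFIXES).items():
        status = "ok" if ok else "unexpected"
        print(f"{name}: {found} ({status})")
```

Run it with the conda environment active. A version of None means the package could not be imported at all.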
Now you will need to download PySyft and build it from source, but to do that, we will first need git. On Ubuntu:
sudo apt-get install git
On CentOS:
sudo yum install git
Now you can clone the PySyft GitHub repository:
git clone https://github.com/OpenMined/PySyft.git
Next you will need to go into the PySyft directory that you just downloaded and change the requirements.txt file slightly, so that it accepts the correct versions of PyTorch and Torchvision.
cd PySyft
nano requirements.txt
Change these two lines so that they read:
torch==1.0
torchvision==0.2.2
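If you prefer not to edit the file by hand, the same change can be scripted. This is a sketch, not part of PySyft: pin_requirements and PINS are names I made up, and the sample text below is hypothetical, since the real requirements.txt depends on which commit you cloned.

```python
import re

# The two pins from this post.
PINS = {"torch": "1.0", "torchvision": "0.2.2"}

# Leading package name of a requirement line (comments do not match).
NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def pin_requirements(text, pins=PINS):
    """Rewrite the lines for the pinned packages and keep every other line."""
    out = []
    for line in text.splitlines():
        match = NAME.match(line)
        if match and match.group(1).lower() in pins:
            name = match.group(1).lower()
            line = f"{name}=={pins[name]}"
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical file contents, for illustration only.
sample = "numpy>=1.14\ntorch>=1.1\ntorchvision>=0.3.0\n# a comment\n"
pinned = pin_requirements(sample)
```

To apply it for real, read requirements.txt, pass its text through pin_requirements, and write the result back. Matching on the full package name keeps the torch pin from touching the torchvision line.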
Now we can install PySyft properly onto our new machine, but first we will need to install a gcc compiler.
On Ubuntu:
sudo apt install gcc
On CentOS:
sudo yum group install "Development Tools"
sudo yum install man-pages
And finally we can install PySyft!
python setup.py install
You may also wish to add a newer version of NumPy and Jupyter Notebook, depending on which client you run.
conda install numpy
conda install jupyter notebook
When I was working on this setup, I was also working with a group in a “Secure and Private AI” class on Udacity. We had some working code for creating a client and a server, sending models from the client to the server, and then having the server calculate the overall model from the models it acquired from the clients. This tutorial uses the MNIST data set. You can also download the code from this project onto the machine to get started quickly. For that, run cd .. until you are back in the home area of the file system. Then:
git clone https://github.com/jess-s/SPAIC-Scorchers.git
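As a rough picture of what "calculating the overall model" on the server can look like, here is a sketch of plain averaging of client weights in pure Python. It is an assumption that simple averaging is the combination step, and this is not the code from the repository above. The name average_models and the toy weights are made up.

```python
def average_models(client_models):
    """Average each named weight list element by element across clients.

    Every client model is a dict that maps a parameter name to a flat
    list of floats, and all clients are assumed to share the same names
    and list lengths.
    """
    if not client_models:
        raise ValueError("need at least one client model")
    averaged = {}
    for name in client_models[0]:
        columns = zip(*(model[name] for model in client_models))
        averaged[name] = [sum(column) / len(client_models) for column in columns]
    return averaged


# Two toy clients, each with a tiny "model".
clients = [
    {"w": [1.0, 2.0], "b": [0.0]},
    {"w": [3.0, 4.0], "b": [1.0]},
]
overall = average_models(clients)
```

With real models the values would be tensors rather than lists, but the shape of the computation is the same: collect one set of weights per client, then combine them position by position.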
And now you’re ready to start a server and a client!