Connecting to your Amazon instance

Michael Matschiner, Milan Malinsky, and the Workshop Team. 6 June 2022


Presentation slides

AMI introduction


Amazon Web Services and Amazon instances

For this workshop we are going to use a high-performance cloud platform called Amazon Web Services (AWS). This is a flexible resource to get computational work done quickly and relatively inexpensively. We have set up an individual virtual machine, called an “Amazon instance”, for each participant. Each instance has all the software and data you will need, along with sufficient CPU, RAM, and storage resources. The IP address of your instance can be found on this list, which is also linked from the main workshop page (in the box with links).

The IP addresses are going to change regularly during the workshop, as we’ll stop the instances overnight and resume them the next day to reduce the cost that we pay to Amazon. Whenever instances are restarted, their IP addresses will change; however, the content of each instance will remain the same. So the files that you write on your instance on any given day should usually still be present on the next. We are going to announce when we stop instances overnight, and we will accept requests to leave instances running in case you’d like to keep working on an analysis.

In addition to stopping instances overnight, we may also need to issue completely new instances at some point. This may be the case when/if we need to install additional software or add datasets for activities. In such cases, we’ll terminate the old instances, meaning that the files on these instances will no longer be accessible to you. We are going to announce if/when that happens, and we will give you time to finish analyses or download data if you request it.

Generally, however, we discourage downloading data — at least large datasets — from the Amazon instances, because we have to pay for download volume. We are going to provide a Dropbox directory at the end of the workshop, which will contain the dataset and scripts used during the Workshop, and we ask you to download this Dropbox directory (if interested) instead of copying large volumes of data from the Amazon instances.

Logging in to your Amazon instance

There are different ways to interact with a running Amazon instance, and the most convenient method usually differs between activities and depends on your own operating system. Below, we describe the three most important ways of accessing the instance: via programs that support SSH and SCP, and through web browsers with Apache Guacamole and RStudio. At the beginning of each activity description, we specify the most convenient method for that activity.

Using SSH / SCP

A number of the activities require only (or mostly) command-line access to the Amazon instance. In these cases we recommend that you access the instance directly from your machine via a terminal program and the “shell”. The shell is often the only way to run genomic analyses and interact with high-performance compute clusters. Interaction with a remote server through the shell takes place primarily by means of two programs, called SSH (Secure Shell) and SCP (Secure Copy Protocol). In brief, the difference between the two programs is that SSH allows access to a remote server while SCP allows copying of files from and to a remote server. The shell, including SSH and SCP, can be used through terminal programs that are available for all operating systems, natively or through additional installation.
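To make the difference between the two programs concrete, here is a minimal sketch of how SSH and SCP commands are typically written. The user name, the IP address, and the file names are hypothetical placeholders and not the actual workshop values; use the IP address from the workshop list and the login details provided by the instructors. The commands are only printed (a dry run), so nothing is contacted until you remove the leading "echo".

```shell
# Hypothetical placeholders: replace them with the IP address from the
# workshop list and the user name provided by the instructors.
INSTANCE_IP="203.0.113.10"   # reserved documentation address, not a real instance
USER_NAME="participant"      # assumed user name, not the actual workshop login
REMOTE="${USER_NAME}@${INSTANCE_IP}"

# Dry run: each line below only prints the command.
# Remove the leading "echo" to actually run it.

# SSH: open an interactive shell on the remote instance.
echo ssh "${REMOTE}"

# SCP: copy a (hypothetical) local file to your home directory on the instance.
echo scp notes.txt "${REMOTE}:"

# SCP: copy a (hypothetical) small file from the instance to the current
# local directory.
echo scp "${REMOTE}:results.txt" .
```

Because the IP addresses change whenever the instances are restarted, the value of INSTANCE_IP would need to be updated from the list each day.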

  • Identify a suitable terminal program installed or installable on your own machine.

    Help needed?

    The following terminal programs are available on Mac OS X, Linux, and Windows:
  • Apple Mac
    Mac OS X comes preinstalled with an application called “Terminal”; you can find it with Spotlight or in Finder.
  • Linux
    All Linux distributions come with a terminal application. Where to find it depends on the type of Linux system you have. If you are using Linux you probably already know how to find it; if not, please ask one of the instructors for help.
  • Windows
    Most Windows systems unfortunately do not have a terminal program that would be useful for our purposes. There are several free options that you could use, including PuTTY and MobaXterm. The newest Windows operating system also comes with Linux Bash Shell and Windows Terminal; however, we have been unable to test these terminal programs. Below, we provide instructions on the download and installation of PuTTY, a terminal program that has previously been used successfully in the Workshop.

To download PuTTY, go to http://www.chiark.greenend.org.uk/~sgtatham/putty/download.html and select either the 32-bit or 64-bit Windows installer, depending on the version of Windows on your machine.

Double-click on the downloaded file to install PuTTY, then start the program. In the field below “Host Name” (marked with an orange rectangle), enter the IP address of your Amazon instance, as shown in the screenshot below. Then click “Open”.

If the warning message shown below appears, click “Yes”. This is only a check that you trust the computer you are connecting to.

Make sure you know how to find these programs quickly on your computer; you’ll need them frequently during the Workshop.