This is a list of questions related to the technical infrastructure required to do the labs.
You can use the computers in BC 07-08. These computers run virtual machines; choose one of the following two images, both of which contain the software that you will use during the course:
IC-CO-IN-SC
IC-BLC-IN-SC
Many programs that will be useful during the course (such as Python, Jupyter) are not the "default" ones found in $PATH. You can find them in /opt/anaconda3/bin. (The same holds for the cluster.)
Basically, you will find these two commands useful:
/opt/anaconda3/bin/jupyter console # Launch a command-line interpreter
/opt/anaconda3/bin/jupyter notebook # Launch a notebook server
Be careful: launching jupyter notebook (without the absolute path as described above) seems to work at first, but many of the libraries that we use in the course will not be available (e.g., matplotlib).
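One way to confirm that you launched the right Jupyter is to run a quick check in a cell or console. The sketch below is only an illustration: the helper name missing_modules is made up for this example, and matplotlib is the one library the text above names, so extend the list as needed.

```python
import importlib.util
import shutil
import sys


def missing_modules(names):
    """Return the module names that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Which interpreter is running this code, and which jupyter does $PATH resolve to?
print("Interpreter:", sys.executable)
print("jupyter on PATH:", shutil.which("jupyter"))

# An empty list means every listed library is importable from this interpreter.
print("Missing libraries:", missing_modules(["matplotlib"]))
```

If the interpreter path does not start with /opt/anaconda3, or matplotlib shows up as missing, you are probably running the default installation rather than the Anaconda one.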
You need to have python3 installed on your machine to do the labs. We recommend installing Python with Anaconda.
To install Anaconda, go to https://www.anaconda.com/distribution/, choose your distribution, and download the Python 3.7 graphical installer. You will then be walked through the installation steps by the installer.
The server is iccluster040.iccluster.epfl.ch. You need to be on the EPFL network to be able to reach it (either on-campus or connected via VPN). You can connect to the server using your GASPAR credentials. On the command line, type:
ssh -l USERNAME iccluster040.iccluster.epfl.ch
where USERNAME is your GASPAR username.
Linux and Apple OS X users can use scp to transfer files. The basic syntax of scp is scp [from] [to]. The [from] portion can be a filename or a directory/folder. The [to] portion will contain your username, the hostname of the cluster login node, and the destination directory/folder. For example:
scp /SOME/LOCAL/FILE ${USER}@iccluster040.iccluster.epfl.ch:/SOME/REMOTE/DIRECTORY
It is possible to transfer a directory using scp with options -r -p. The -r indicates that the copy is recursive. The -p preserves dates/times/permissions of the files.
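If you prefer to script your transfers from Python, the scp syntax described above can be assembled as an argument list. This is a minimal sketch, not a course-provided tool: the function name scp_upload_command and the example paths are placeholders, and only the hostname and the -r -p options come from the text.

```python
import shlex

HOST = "iccluster040.iccluster.epfl.ch"  # server name given in this document


def scp_upload_command(local_path, username, remote_dir, recursive=False):
    """Build the scp argument list for copying local_path to the cluster."""
    cmd = ["scp"]
    if recursive:
        # -r copies directories recursively, -p preserves dates/times/permissions.
        cmd += ["-r", "-p"]
    cmd += [local_path, f"{username}@{HOST}:{remote_dir}"]
    return cmd


# Print the command instead of running it, so you can inspect it first.
cmd = scp_upload_command("labs", "USERNAME", "/SOME/REMOTE/DIRECTORY", recursive=True)
print(shlex.join(cmd))
```

To actually run the transfer, pass the list to subprocess.run(cmd, check=True); scp will prompt for your password in the terminal.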
You can also transfer files between your local computer and the cluster using an SFTP client, such as Cyberduck (OSX), FileZilla (Linux), or WinSCP (Windows).
Step-by-step instructions are available here.
If you get an error that looks like this:
17/02/28 19:30:19 ERROR SparkUI: Failed to bind SparkUI
java.net.BindException: Address already in use: Service 'SparkUI' failed
after 16 retries! Consider explicitly setting the appropriate port for the
service 'SparkUI' (for example spark.ui.port for SparkUI) to an available
port or increasing spark.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
...
then you most likely did not set the spark.ui.port number properly in your .profile. To solve the issue, edit your .profile and add the line
alias pyspark='pyspark --conf spark.ui.port=xxxx0'
where xxxx0 is the last four digits of your SCIPER number followed by 0. If the first of these four digits is 0, or if the four-digit number is larger than 6553 (so that xxxx0 would exceed the highest valid port, 65535), then the port number will not be set properly. In this case, replace xxxx with any number between 1024 and 6553.
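The port rule can be checked with a few lines of Python. The sketch below only illustrates the rule as stated here: the function names are made up for this example, and the SCIPER number 123456 is a placeholder, not a real one.

```python
import random

MAX_PREFIX = 6553  # largest xxxx for which xxxx0 is still a valid port (<= 65535)


def spark_ui_port(sciper):
    """Return the port xxxx0 derived from a SCIPER number, or None if the rule fails."""
    digits = str(sciper)
    if not digits.isdigit() or len(digits) < 4:
        raise ValueError("expected a SCIPER number with at least four digits")
    last_four = digits[-4:]
    # The rule does not apply if xxxx starts with 0 or is larger than 6553.
    if last_four[0] == "0" or int(last_four) > MAX_PREFIX:
        return None
    return int(last_four) * 10


def fallback_port():
    """Pick xxxx at random between 1024 and 6553 and append a 0."""
    return random.randint(1024, MAX_PREFIX) * 10


port = spark_ui_port(123456)  # placeholder SCIPER number
if port is None:
    port = fallback_port()
print(f"alias pyspark='pyspark --conf spark.ui.port={port}'")
```

The printed line is the alias to put in your .profile. If you end up using the fallback, choose the number once and keep it, rather than generating a new one at every login.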
If too many users are connected to the cluster and have requested resources from YARN (e.g., using large values in the call to pyspark, see above), there may be no more resources left for you. There are two symptoms of this:
- The Jupyter kernel seems to be "working" permanently, and code cells never execute.
- The Spark context (variable sc) is empty.
You might also see messages like this in the terminal:
17/02/28 19:34:07 INFO Client: Application report for
application_1484292377252_0269 (state: ACCEPTED)
In this case, be patient and try again a bit later when the server is less crowded. You can also try to run Spark without requesting resources from YARN, by simply typing pyspark in the shell, without arguments (results are not guaranteed).
I am getting permission errors when editing files with vim on the BC07-08 machines, what should I do?
When saving a file with vim on the BC07-08 machines, you get the following error:
E137: Viminfo file is not writable: /home/USERNAME/.viminfo
You can safely ignore this error. It is caused by a permission issue on your Myfile directory, but it does not prevent you from saving the file you were editing.
When connecting to the cluster with ssh, if you get the error
ssh: Could not resolve hostname [hostname]: nodename nor servname provided, or not known
try disconnecting and reconnecting your Wi-Fi, or connecting through VPN.
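The error above means that your machine could not turn the server's name into an address. A short Python check can tell you whether name resolution works again after you reconnect. This is only a sketch; the helper name can_resolve is made up for this example.

```python
import socket


def can_resolve(hostname):
    """Return True if the hostname can be resolved to an address."""
    try:
        socket.getaddrinfo(hostname, 22)  # 22 is the standard SSH port
        return True
    except socket.gaierror:
        return False


print(can_resolve("iccluster040.iccluster.epfl.ch"))
```

Note that a successful lookup only shows that the name resolves; you still need to be on the EPFL network (on-campus or via VPN) to reach the server.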
I am getting an error about Malformed HTTP message in the jupyter logs and can't access my notebooks, what should I do?
Try clearing the cache of your web browser and reloading the page.