SESYNC provides access to remote JupyterLab sessions via a web browser. The Jupyter Project provides an environment for Python development, and SESYNC’s Jupyter Server adds direct connections to resources like shared file storage, databases, GitLab, and a compute cluster.
You can access Jupyter by pointing any web browser to https://jupyter.sesync.org/ and logging in with your SESYNC username and password. If you forgot your username or password, please go to https://pwm.sesync.org/. If your SESYNC credentials do not give you access, please email cyberhelp@sesync.org and ask to enable this resource for your account or whole team.
Once you log in, you will see a JupyterLab interface. To learn how to use all the features of the JupyterLab IDE, check out the complete documentation.
User contributed Python packages can be installed on JupyterLab by using the
“pip3” or “pip3.8” utility from the Terminal, depending on whether you are
using a Python 3.6 or Python 3.8 kernel. To avoid a permission error, please use the
--user
flag. For instance, using the Terminal launched from Jupyter:
<USERNAME>@sesync01:~$ pip3 install --user <PACKAGE>
If you receive an error stating that the package cannot be installed, there is a chance that some underlying system library is not installed. Please email the error message to cyberhelp@sesync.org, and explain which package you need to install.
Note for advanced users: Instead of installing individual packages with the
--user
flag, we recommend creating a project-specific kernel with a specific version of Python and
installing only the packages you need for your project into that environment.
This is good for reproducibility because it ensures that your code is always executed in
the same environment. See the FAQ on creating a virtual environment
on the Jupyter server for more details.
“Stuff” usually belongs in one of three places:
When you first open JupyterLab, you will be working in your home directory which is located at “/research-home/USERNAME” or equivalently “~/”. This is a private directory, and only you have access to the files in it. We strongly recommend that you save source code in your home directory. This will protect against multiple group members attempting to update a project file at the same time. If you need to share code between project members please use a version control application such as GitLab or GitHub.
If you have requested it, your group will have a data directory available. Your research data directory appears on https://files.sesync.org/ as PROJECTNAME-data, where PROJECTNAME is the short name assigned for your project by SESYNC, and is accessed from https://jupyter.sesync.org at the path “/nfs/PROJECTNAME-data”. You can add to this directory either by saving output from Python to folders there, or by using one of the options for uploading described in Quick Start: Research Data Directory. You should store all shared data here. Examples of data types that should be placed here include csv files, landsat imagery, hdf5 data files–anything that makes sense to have only a single copy, with access shared by your project members.
Note that to connect to the research data directory, you will need to create a symlink to your root project directory by opening the terminal in JupyterLab and typing:
ln -s /nfs/PROJECTNAME-data PROJECTNAME-data
See the FAQ on creating a symlink for more information.
Since everyone will be working off of the same set of code, there are three options for working with data. If your data is quite small (i.e. a CSV with a few hundred rows, also known as “small-batch artisanal data”) you can include it in your project, push it to your remote repository, and everyone will have a clone. Larger datasets should be in your Research Data Directory so that everyone is able to work off one shared copy of the data. Very large datasets may need to be loaded into a RDBMS, and SESYNC provides both MySQL and PostgreSQL servers for this purpose.
Let’s assume that J. Smith (with USERNAME “jsmith”) is part of the “Trees and Urban Heat Island Mitigation” working group. When J. Smith logs in to https://files.sesync.org, the directory “cooltrees-data” will indicate that the PROJECTNAME mentioned above is cooltrees. Data can be written and read into JupyterLab. For example, a script saved as “~/example_script.py” could include:
import os
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
# Parameters and arguments
out_dir = '/nfs/cooltrees-data'
# Generate example data
x = np.arange(0., 10., 0.4)
y = x**2
plt.plot(x, y, 'bs')
plt.xlabel('x')
plt.ylabel('y')
out_plot_filename = os.path.join(out_dir, 'x_y_plot.png')
plt.savefig(out_plot_filename)
# Write out test data to current directory
dataset = pd.DataFrame({'x':x, 'y':y})
## print(dataset)
out_filename = os.path.join(out_dir, 'dataset.csv')
## write out to csv
dataset.to_csv(out_filename)