# DOCKER UTILS FOR WORKING WITH CUDA AND PYTHON

## Introduction

The Bash scripts in this directory are meant to simplify working with Docker for users with no prior Docker experience. The idea is to imitate the Jupyter `lab/notebook` workflow while keeping the CUDA installation inside the isolated environment of a Docker container. That way, users do not depend on the system-wide CUDA install and are protected from unexpected updates and changes.
All scripts work with the `cuda_conda` Docker image, which includes CUDA 11 and a conda installation.
For ease of use, you can put all the scripts in your home directory and call them right after logging in with ssh.


## Initialization

In order to initialize everything for the intended workflow, run `init_container.sh`. The initialization might take approximately 5 minutes. After that, you can use other provided scripts.

What it does:
1. It creates a `docker` directory in your home directory. That directory is mounted into the container and serves as persistent storage, so you can work with large datasets and keep your computation results and source files. The script also creates the `docker/work` directory, which is intended to be the main working directory. Other subdirectories in `docker` are populated automatically by the initialization procedure.
2. It creates a conda environment in the `docker/env` directory and makes it the default conda environment. That way, all Python and conda packages installed during work sessions with the container are preserved for future use. You can freely stop containers and run new ones without losing your packages, and you do not need to rebuild the whole image after adding new packages.
3. Both PyTorch and TensorFlow are installed in the new environment, along with the Jupyter server.
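
The directory layout described in the list above can be checked with a few lines of shell. This is a minimal sketch, not part of the provided scripts: the function name `docker_dirs_ready` is hypothetical, and it only tests for the two directories named above (`docker/work` and `docker/env`), not for a working conda environment.

```shell
#!/usr/bin/env bash
# Illustration only: init_container.sh creates these directories for you.

# Succeeds if the persistent directories described above exist.
# base_dir defaults to $HOME; the parameter exists only to make testing easy.
docker_dirs_ready() {
    local base_dir="${1:-$HOME}"
    [ -d "$base_dir/docker/work" ] && [ -d "$base_dir/docker/env" ]
}
```

For example, `docker_dirs_ready || echo "run init_container.sh first"` prints a reminder when either directory is missing.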

## Proposed workflow for Jupyter

(First, the container should be initialized with `init_container.sh`. Do this once.)

1. Run `jupyter_container.sh`. After the server starts, you can detach from the container using `Ctrl+P, Ctrl+Q` (this hint will be printed by the command) so that it keeps running in the background.

The script will try to use port 8888. If that port is already in use, it will look for a free port and print the one it ends up using. To choose a different port for Jupyter, use the `-p | --port` flag, e.g. `-p 8885`. If you need several ports, for example for TensorBoard, specify each one using the `<host port>:<container port>` notation. For example, `jupyter_container.sh -p 8885 -p 6666:6006` will forward container ports 8885 and 6006 to ports 8885 and 6666 on the host. You will then be able to access Jupyter at `localhost:8885` and TensorBoard (if it uses port 6006 in the container) at `localhost:6666`.
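
The `<host port>:<container port>` notation is easy to misread, so here is a small sketch of how a `-p` value maps to a URL on the host. The functions `normalize_port_spec` and `host_url` are hypothetical helpers written for this illustration; they are not taken from `jupyter_container.sh`, which may parse its arguments differently.

```shell
#!/usr/bin/env bash
# Illustration only: shows how to read a -p value, not how the script parses it.

# Turn one -p value into a "<host port>:<container port>" pair.
# A bare port such as 8885 is treated as 8885:8885.
normalize_port_spec() {
    local spec="$1"
    case "$spec" in
        *:*) echo "$spec" ;;
        *)   echo "$spec:$spec" ;;
    esac
}

# Print the URL to open on the host for a given -p value.
# The host port is the part before the colon.
host_url() {
    local pair
    pair="$(normalize_port_spec "$1")"
    echo "http://localhost:${pair%%:*}"
}
```

With these helpers, `host_url 6666:6006` prints `http://localhost:6666`, because 6666 is the host side of the mapping and 6006 is the port used inside the container.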

2. Go to `localhost:8888` (or another port, if you specified one or if 8888 was already in use) and start working.

3. By default, Jupyter will open the `docker/work` folder. You may also put any necessary data and source files inside this directory. For example, you can create folders `src` and `data` for the source files and data respectively. Inside the container, you can access this data using the path `/$USER/work/...`, where `$USER` is your username.
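
The relationship between host paths and container paths can be sketched as a small helper. This assumes that `~/docker` on the host appears as `/$USER` inside the container, which is what the `/$USER/work/...` path above implies; the function name `host_to_container_path` is hypothetical and is not part of the provided scripts.

```shell
#!/usr/bin/env bash
# Illustration only. Assumption: <home>/docker on the host is mounted at /<user>.

# Map a host path under <home>/docker to the matching path in the container.
# user and home default to $USER and $HOME; they are parameters for testing.
host_to_container_path() {
    local host_path="$1"
    local user="${2:-$USER}"
    local home="${3:-$HOME}"
    local prefix="$home/docker"
    case "$host_path" in
        "$prefix"/*)
            # Strip the host prefix and prepend the container mount point.
            echo "/$user${host_path#"$prefix"}"
            ;;
        *)
            echo "not under $prefix: $host_path" >&2
            return 1
            ;;
    esac
}
```

For a user `alice` with home `/home/alice`, the file `/home/alice/docker/work/src/train.py` maps to `/alice/work/src/train.py`.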

## General information

By default, all scripts work with a container named `$USER_work` (e.g. `ekaterina_work` for user ekaterina, or `fedor_work` for user fedor). If you need a different container name (for example, to run several containers simultaneously), specify it with the `-n` or `--name` flag, e.g. `bash_container.sh -n my_container`. Do not forget to stop and remove containers that you no longer need (you may use `stop_remove_container.sh`).
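
If you build the default container name in your own shell commands, note that the `$USER_work` notation is shorthand. A minimal sketch, with `default_container_name` as a hypothetical helper that is not part of the provided scripts:

```shell
#!/usr/bin/env bash
# Illustration only: builds the default name "<username>_work".

# The braces matter: written as "$USER_work", the shell would look up a
# variable called USER_work instead of appending "_work" to $USER.
default_container_name() {
    local user="${1:-$USER}"
    echo "${user}_work"
}
```

For example, `default_container_name ekaterina` prints `ekaterina_work`, and calling it without arguments uses the current `$USER`.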

Please keep in mind that containers of other people are also visible to you. DO NOT stop or remove containers of other users, unless specifically given permission by the creator of the container.

By default, the scripts will try to run a new container, but if the container is already running, they will execute the command inside it instead. You can change this behaviour with the `-r | --run` or `-e | --execute` flags.
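
The run-or-execute decision described above could look roughly like the following. This is a sketch under assumptions, not the code of the provided scripts: the function names and the `auto` mode label are hypothetical, and only the `docker ps` call with its `--filter` and `--format` options is a real Docker CLI feature.

```shell
#!/usr/bin/env bash
# Illustration only: the real scripts may make this decision differently.

# Succeeds if a container with exactly this name is currently running.
container_is_running() {
    local name="$1"
    docker ps --filter "name=^${name}\$" --format '{{.Names}}' | grep -qx "$name"
}

# Print "run" or "execute" for a container name and a mode.
# mode: "auto" (default behaviour), "run" (like -r), "execute" (like -e).
choose_action() {
    local name="$1" mode="${2:-auto}"
    case "$mode" in
        run|execute)
            echo "$mode"
            ;;
        auto)
            if container_is_running "$name"; then
                echo execute
            else
                echo run
            fi
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}
```

In `auto` mode the helper prints `execute` when `docker ps` lists the named container and `run` otherwise. An explicit mode skips the check entirely.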

Help is available by calling any script with the `-h | --help` flag.

## Scripts overview

`attach_container.sh` - attach to a running container. For example, if you have started the Jupyter server using `jupyter_container.sh`, you might want to attach to it in order to stop the server (and, by extension, the container itself).

`bash_container.sh` - run an interactive bash terminal inside a container. 

`execute_in_container.sh` - execute a command inside a container. You have to specify the command, and only commands available inside the container can be used. For example, to run a Python file, put it in the `docker/work` directory, e.g. `docker/work/test.py`, and run it using the path `/$USER/work/test.py`. The full command will look like this: `execute_in_container.sh python /$USER/work/test.py` (or `execute_in_container.sh python /ekaterina/work/test.py` if your username is `ekaterina`).

`init_container.sh` - initialize container with a conda environment on a mountable drive. Call this before calling other scripts. Call this only once.

`jupyter_container.sh` - run Jupyter Lab in a container. It uses port 8888 by default; you can change it with the `-p` or `--port` flag. After the server starts, you can detach from the container using `Ctrl+P, Ctrl+Q` (this hint will be printed by the command) so that it keeps running in the background. If you need to reattach to stop it or check its output, use `attach_container.sh`.
 
`run_background_container.sh` - start a container and detach from it immediately. The container will run in the background until you stop it with `stop_remove_container.sh` (or with docker commands). This might be useful if you want to open a bash terminal in it (e.g. using `bash_container.sh`) and run a long process in the background using the `screen` command. Overall, the use of this script is discouraged.

`stop_remove_container.sh` - stops and removes a running container.