Commit 3de00c2a authored by Jakub Yaghob

Additional info on CUDA troubleshooting

parent 84619bd3
......@@ -226,7 +226,7 @@ SLURM will send you an email, when the job finishes. There are many other mail-t
Charliecloud provides user-defined software stacks (UDSS) for HPC.
It allows you to run nearly any software stack (like TensorFlow) on the cluster even if it is not installed and available system-wide.
All informations about Charlicloud can be found on its [Charliecloud documentation]( page.
All informations about Charliecloud can be found on its [Charliecloud documentation]( page.
### Basic workflow
......@@ -304,7 +304,7 @@ You may disable this by specifying `--no-home` option.
Moreover, you may bind additional directories by using options `--bind=/some/dir`
(which will appear as `/mnt/0` in your UDSS environment) or by `--bind=/source/dir:/dest/dir`.
### Advanced techniques and notes
### Advanced techniques, troubleshooting, and notes
#### Builders
......@@ -342,3 +342,14 @@ It can happen, executing `ch-fromhost` from the 3rd step will produce some error
You can safely ignore these errors; they are harmless.
This warning/notice was contributed by Vít Kabele.
#### CUDA is not working inside your UDSS
If you suspect that CUDA is not working inside your container, run the `nvidia-smi` command from the container command line.
If `nvidia-smi` prints the CUDA version correctly, then CUDA is functional. However, if it prints "ERR", CUDA does not work.
In this case, follow the checklist:
- Did you correctly import the CUDA libraries? See step 3 of the basic workflow.
- Is `` (or ``) loadable? Check the `LD_LIBRARY_PATH` environment variable inside your container.
If it does not include the CUDA library directory, set it to the correct path, e.g. `export LD_LIBRARY_PATH=/usr/local/cuda/lib64`.
Be careful if the variable already contains other paths: add the CUDA directory to them instead of overwriting the whole value.
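
The last checklist item can be sketched as a short shell snippet to run inside the container. It is only a sketch: it assumes the CUDA libraries live in `/usr/local/cuda/lib64` (the example path above), and the variable name `CUDA_LIB_DIR` is made up for illustration. Adjust the path to wherever the libraries actually are in your UDSS.

```shell
# Assumed location of the CUDA libraries inside the container (example path only).
CUDA_LIB_DIR=/usr/local/cuda/lib64

# Prepend the directory only if it is not already listed,
# so that any paths already present in LD_LIBRARY_PATH are preserved.
case ":${LD_LIBRARY_PATH:-}:" in
  *":${CUDA_LIB_DIR}:"*) ;;  # already present, nothing to do
  *) export LD_LIBRARY_PATH="${CUDA_LIB_DIR}${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}" ;;
esac

echo "LD_LIBRARY_PATH=${LD_LIBRARY_PATH}"

# Re-run the check if nvidia-smi is available in this environment.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi || echo "nvidia-smi failed; CUDA is probably still not working"
else
  echo "nvidia-smi not found in PATH"
fi
```

Running the snippet twice does not duplicate the entry, because the `case` pattern checks for the directory before adding it.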