KSI / clusters
Commit 3de00c2a, authored Jan 12, 2021 by Jakub Yaghob
More additional info on CUDA troubleshooting
Parent: 84619bd3
README.md
...
...
@@ -226,7 +226,7 @@ SLURM will send you an email, when the job finishes. There are many other mail-t
Charliecloud provides user-defined software stacks (UDSS) for HPC.
It allows you to run nearly any software stack (like TensorFlow) on the cluster even if it is not installed system-wide.
-All information about Charlicloud can be found on its [Charliecloud documentation](https://hpc.github.io/charliecloud/) page.
+All information about Charliecloud can be found on its [Charliecloud documentation](https://hpc.github.io/charliecloud/) page.
### Basic workflow
...
...
@@ -304,7 +304,7 @@ You may disable this by specifying `--no-home` option.
Moreover, you may bind additional directories with the option `--bind=/some/dir` (which will appear as `/mnt/0` in your UDSS environment) or with `--bind=/source/dir:/dest/dir`.
-### Advanced techniques and notes
+### Advanced techniques, troubleshooting, and notes
#### Builders
...
...
@@ -342,3 +342,14 @@ It can happen, executing `ch-fromhost` from the 3rd step will produce some error
You can safely ignore these errors; they are harmless.
This warning/notice was contributed by Vít Kabele.
#### CUDA is not working inside your UDSS
If you suspect that CUDA is not working inside your container, run the `nvidia-smi` command from the container command line.
If `nvidia-smi` prints the CUDA version correctly, CUDA is functional. If it prints "ERR" instead, CUDA does not work.
In that case, go through the following checklist:
- Did you correctly import the CUDA libraries? See step 3 of the basic workflow.
- Is `libcuda.so` (or `libcuda.so.1`) loadable? Check the `LD_LIBRARY_PATH` environment variable inside your container.
  If it is not set to the CUDA library directory, set it to the correct path, e.g. `export LD_LIBRARY_PATH=/usr/local/cuda/lib64`.
  Be careful if the variable already contains other paths: extend the existing value rather than overwriting it.
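
The checklist above can be sketched as two small shell helpers. This is only an illustration, not part of Charliecloud or CUDA: the function names `cuda_status` and `add_ld_path` are hypothetical, and the check assumes that `nvidia-smi` reports the version after a `CUDA Version:` label in its header. Prepending the CUDA directory (rather than appending it) is also just one possible choice.

```shell
# Hypothetical helpers for the checklist above; names are illustrative only.

# Classify the text printed by `nvidia-smi` (read from stdin).
# Prints "ok" when a numeric CUDA version is reported, "broken" otherwise
# (for example when the version field shows "ERR").
cuda_status() {
    if grep -Eq 'CUDA Version: *[0-9]+\.[0-9]+'; then
        echo ok
    else
        echo broken
    fi
}

# Prepend a directory to LD_LIBRARY_PATH unless it is already listed,
# keeping any paths that were set before.
add_ld_path() {
    dir="$1"
    case ":${LD_LIBRARY_PATH:-}:" in
        *":$dir:"*) ;;  # already present, nothing to do
        *) LD_LIBRARY_PATH="$dir${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}" ;;
    esac
    export LD_LIBRARY_PATH
}

# Possible use inside the container:
#   nvidia-smi | cuda_status
#   add_ld_path /usr/local/cuda/lib64
```

Unlike a plain `export LD_LIBRARY_PATH=/usr/local/cuda/lib64`, `add_ld_path` does not discard paths that are already in the variable, and calling it twice with the same directory leaves the value unchanged.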