• 1 Post
  • 9 Comments
Joined 1 year ago
Cake day: July 27th, 2023



  • Thanks!

    As I understand it, it bind-mounts the /dev/nvidia* devices and the CUDA toolkit binaries into the container, giving it direct access just as if it were running on the host. It’s not virtualized; the container just runs in a different namespace, so the VRAM is still managed by the host driver. The same restrictions that apply to running CUDA applications normally on the host should apply in containers too. Personally, I’ve had up to 4 containers running GPU processes at the same time on 1 card.

    And yes, Nvidia hosts its own GPU-accelerated container images for PyTorch, TensorFlow, and a bunch of others on NGC. They also publish images with the full CUDA SDK on their Docker Hub.
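    A rough sketch of what this looks like in practice, assuming the NVIDIA Container Toolkit is installed on the host (image tags are illustrative examples, not recommendations):

    ```shell
    # Expose all host GPUs to a container; the toolkit bind-mounts the
    # /dev/nvidia* devices and driver libraries into the container's namespace
    docker run --rm --gpus all nvidia/cuda:12.4.1-base-ubuntu22.04 nvidia-smi

    # Pull a GPU-accelerated PyTorch image from Nvidia's NGC registry
    docker pull nvcr.io/nvidia/pytorch:24.05-py3
    ```

    Since the containers share the host driver, `nvidia-smi` on the host will show processes from every container using the card, just like ordinary host processes.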



  • I try to find ways to make my setup more bulletproof or faster whenever I get the itch. For example, I recently switched to openSUSE and Podman to take advantage of the LTO-optimized packages and rootless containers.

    I tried to run my online life through self-hosting, but I found a lot of the services weren’t reliable or capable enough to get real work done. So I went from 30 containers to about 7 and have a lot less to tinker with.
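    The rootless part is the main draw: Podman runs containers as the invoking user inside a user namespace, with no root daemon. A minimal sketch (the alpine image is just an example):

    ```shell
    # No daemon and no sudo: the container runs under the calling user's uid,
    # remapped via /etc/subuid into a user namespace
    podman run --rm docker.io/library/alpine:latest id -u
    # "root" (uid 0) inside the container maps to an unprivileged uid on the host
    ```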