
Installing NVIDIA and Docker Tools

1. Installing the NVIDIA Driver

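First, edit /etc/modprobe.d/blacklist.conf and blacklist the open-source nouveau driver so that it is not loaded alongside the NVIDIA driver. Add the following line: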

        blacklist nouveau
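
The edit above can also be scripted so that running it twice does not duplicate the entry. The following is a minimal sketch; `ensure_line` is a hypothetical helper name, not something the original post defines, and it assumes the target file ends with a newline.

```shell
# Append a line to a file only if that exact line is not already present.
# Hypothetical helper: on a real system the target file is root-owned,
# so the script that calls this needs to run as root.
ensure_line() {
    file=$1
    line=$2
    # -x matches the whole line, -F treats the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

A typical call would be `ensure_line /etc/modprobe.d/blacklist.conf "blacklist nouveau"`. On Ubuntu the initramfs usually has to be rebuilt with `sudo update-initramfs -u` before the blacklist takes effect at the next boot.
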
Remove any previously installed NVIDIA drivers, then refresh the package index:
        $ sudo apt-get remove --purge 'nvidia-*'
        $ sudo apt-get update
Find a suitable driver version; here we choose 'nvidia-driver-535 - distro non-free':
        $ ubuntu-drivers devices
        == /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0 ==
        modalias : pci:*************************************************
        vendor   : NVIDIA Corporation
        driver   : nvidia-driver-470-server - distro non-free
        driver   : nvidia-driver-525-server - distro non-free
        driver   : nvidia-driver-525-open - distro non-free
        driver   : nvidia-driver-525 - distro non-free
        driver   : nvidia-driver-470 - distro non-free
        driver   : nvidia-driver-535-open - distro non-free recommended
        driver   : nvidia-driver-535 - distro non-free
        driver   : xserver-xorg-video-nouveau - distro free builtin
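
When the list is long, the entry tagged "recommended" can be pulled out with a short filter. This is only a sketch that assumes the output layout shown above; `recommended_driver` is a hypothetical name. Note that this post installs nvidia-driver-535 rather than the tagged nvidia-driver-535-open, so the helper reports a suggestion and does not make the choice.

```shell
# Read `ubuntu-drivers devices` output on stdin and print the package name
# from the line tagged "recommended" (prints nothing if no line is tagged).
recommended_driver() {
    awk '$1 == "driver" && $NF == "recommended" { print $3 }'
}
```

Usage: `ubuntu-drivers devices | recommended_driver`.
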

Install it with the following commands, then reboot:

        $ sudo apt-get install nvidia-driver-535
        $ sudo reboot
After the reboot, run the following command to check that the driver is installed and loaded:
        $ nvidia-smi
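
The banner at the top of the nvidia-smi output shows both the driver version and a "CUDA Version" field. That field is the highest CUDA version the driver supports, not an installed toolkit. The sketch below extracts either value; it assumes the `Label: value` banner layout that recent drivers print, and `smi_field` is a hypothetical helper name.

```shell
# Extract the numeric value that follows "<label>:" in the nvidia-smi banner.
# Reads nvidia-smi output on stdin; $1 is the label, e.g. "CUDA Version".
# Assumes the banner format "Driver Version: X.Y.Z    CUDA Version: A.B".
smi_field() {
    sed -n "s/.*$1: *\([0-9.][0-9.]*\).*/\1/p" | head -n 1
}
```

Usage: `nvidia-smi | smi_field "CUDA Version"` or `nvidia-smi | smi_field "Driver Version"`. For the driver version alone, `nvidia-smi --query-gpu=driver_version --format=csv,noheader` avoids parsing the banner.
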

2. Installing CUDA

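Based on the CUDA version reported by nvidia-smi, find the matching release in the CUDA Toolkit Archive (cuda-toolkit-archive); here we choose 12.2. The commands below use the Ubuntu 20.04 (x86_64) packages.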

        wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2004/x86_64/cuda-ubuntu2004.pin
        sudo mv cuda-ubuntu2004.pin /etc/apt/preferences.d/cuda-repository-pin-600
        wget https://developer.download.nvidia.com/compute/cuda/12.2.0/local_installers/cuda-repo-ubuntu2004-12-2-local_12.2.0-535.54.03-1_amd64.deb
        sudo dpkg -i cuda-repo-ubuntu2004-12-2-local_12.2.0-535.54.03-1_amd64.deb
        sudo cp /var/cuda-repo-ubuntu2004-12-2-local/cuda-*-keyring.gpg /usr/share/keyrings/
        sudo apt-get update
        sudo apt-get -y install cuda
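
After the packages are installed, the toolkit binaries are normally not on PATH yet. The sketch below prints the two export lines commonly used for this. The prefix /usr/local/cuda-12.2 is an assumption about where the package installs the toolkit, and `cuda_env_lines` is a hypothetical helper name; neither comes from the original post.

```shell
# Print the export lines that put a CUDA toolkit on PATH and on the
# dynamic loader path. $1 is the install prefix; the default
# /usr/local/cuda-12.2 is an assumed location, adjust it if yours differs.
cuda_env_lines() {
    prefix=${1:-/usr/local/cuda-12.2}
    # The ${VAR:+:${VAR}} part is emitted literally so the target shell
    # appends the old value only when it is non-empty.
    printf 'export PATH=%s/bin${PATH:+:${PATH}}\n' "$prefix"
    printf 'export LD_LIBRARY_PATH=%s/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}\n' "$prefix"
}
```

One way to use it is `cuda_env_lines >> ~/.bashrc`, then open a new shell and run `nvcc --version` to confirm that the 12.2 toolkit is found.
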

3. Docker Setup

3.1. Installing Docker

Install Docker with the official convenience script (see the Docker installation guide):

        # Docker installation docs: https://docs.docker.com/engine/install/
        curl -fsSL https://get.docker.com -o get-docker.sh
        sh get-docker.sh
        sudo systemctl --now enable docker
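
Before moving on to the container toolkit, it can help to confirm that the commands installed so far are actually on PATH. This is a small sketch with a hypothetical helper name, `missing_tools`; it only checks for the presence of commands, not that they work.

```shell
# Print the name of every command in the argument list that is not on PATH.
# Prints nothing when all of them are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For example, `missing_tools docker nvidia-smi` should print nothing at this point. To check the service itself, `systemctl is-active docker` reports whether the daemon is running, and `sudo docker run --rm hello-world` exercises a full image pull and container start.
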

3.2. Installing nvidia-container-toolkit

To make the NVIDIA driver and CUDA available inside Docker containers, install nvidia-container-toolkit with the following commands:

        distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
              && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
              && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
                    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
                    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
        sudo apt-get update
        sudo apt-get install -y nvidia-container-toolkit
        sudo nvidia-ctk runtime configure --runtime=docker
        sudo systemctl restart docker
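
The `nvidia-ctk runtime configure` step is expected to register the NVIDIA runtime in Docker's daemon configuration. The sketch below checks for that entry. It assumes the configuration lives in /etc/docker/daemon.json and that the entry names the `nvidia-container-runtime` binary; `has_nvidia_runtime` is a hypothetical helper, and the plain-text match is deliberately crude rather than a real JSON parse.

```shell
# Succeed if the given Docker daemon config mentions the NVIDIA runtime binary.
# $1 defaults to /etc/docker/daemon.json (assumed location of the config).
# This is a plain-text match, not a JSON parse.
has_nvidia_runtime() {
    grep -q 'nvidia-container-runtime' "${1:-/etc/docker/daemon.json}" 2>/dev/null
}
```

Usage: `has_nvidia_runtime && echo "NVIDIA runtime registered"`. For an end-to-end check, `sudo docker run --rm --gpus all ubuntu nvidia-smi` should print the same GPU table inside a container as on the host; the `ubuntu` image is just an example.
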



Copyright @ 高乙超. All Rights Reserved. 京ICP备16033081号-1