2021-07-12 Setting up a deep learning environment on a GeForce RTX 3090: CUDA 11.1 + TensorFlow 2.5.0 + Python 3.8.3
The working environment from this post has been exported to:
//download.csdn.net/download/Julse/20687132?spm=1001.2014.3001.5501
I ran into many problems while setting up the environment. The combination that finally worked, tested on my own machine, is:
tensorflow-gpu 2.5.0
cudnn 8.1.0.77
python 3.8.3
cuda 11.1
The version compatibility table I referred to:
//www.tensorflow.org/install/source
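To compare an environment against the versions listed above, the installed versions can be read programmatically. The following is a minimal sketch using only the standard library (Python 3.8+). The helper names `installed_version` and `environment_report` are illustrative and not part of any library.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a pip/conda package, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def environment_report(packages=("tensorflow-gpu", "tensorflow", "keras")):
    """Collect the Python version and the versions of the given packages."""
    report = {"python": "%d.%d.%d" % sys.version_info[:3]}
    for name in packages:
        report[name] = installed_version(name)
    return report


if __name__ == "__main__":
    for name, version in environment_report().items():
        print(name, version or "not installed")
```

If both `tensorflow` and `tensorflow-gpu` show a version, two copies are installed side by side, which is the kind of conflict described later in this post.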
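Details of the successful installation
Installing tensorflow-gpu 2.5.0
conda activate <env_name>
pip install tensorflow-gpu==2.5.0 # use pip if conda install tensorflow-gpu==2.5.0 cannot find the package
Check whether the installation succeeded. If /device:GPU:0 shows up, you can safely move on to the next step.
>>> import tensorflow as tf
>>> tf.__version__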
'2.5.0'
>>> tf.test.gpu_device_name()
The following is printed:
'/device:GPU:0'
The "skip gpu..." message no longer appears.
2021-11-21 09:11:11.576578: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-11-21 09:11:11.586861: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-11-21 09:11:12.301775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:3b:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2021-11-21 09:11:12.302507: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 1 with properties:
pciBusID: 0000:5e:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2021-11-21 09:11:12.303282: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 2 with properties:
pciBusID: 0000:b1:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2021-11-21 09:11:12.303954: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 3 with properties:
pciBusID: 0000:d9:00.0 name: GeForce RTX 3090 computeCapability: 8.6
coreClock: 1.695GHz coreCount: 82 deviceMemorySize: 23.70GiB deviceMemoryBandwidth: 871.81GiB/s
2021-11-21 09:11:12.304010: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-11-21 09:11:12.322885: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-11-21 09:11:12.322999: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-11-21 09:11:12.337252: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-11-21 09:11:12.342694: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-11-21 09:11:12.348923: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-11-21 09:11:12.354146: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-11-21 09:11:12.356096: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-11-21 09:11:12.362607: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0, 1, 2, 3
2021-11-21 09:11:12.363212: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-11-21 09:11:17.044150: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-11-21 09:11:17.044213: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0 1 2 3
2021-11-21 09:11:17.044240: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N N N N
2021-11-21 09:11:17.044245: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 1: N N N N
2021-11-21 09:11:17.044249: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 2: N N N N
2021-11-21 09:11:17.044254: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 3: N N N N
2021-11-21 09:11:17.050196: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:0 with 3793 MB memory) -> physical GPU (device: 0, name: GeForce RTX 3090, pci bus id: 0000:3b:00.0, compute capability: 8.6)
2021-11-21 09:11:17.053391: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:1 with 3665 MB memory) -> physical GPU (device: 1, name: GeForce RTX 3090, pci bus id: 0000:5e:00.0, compute capability: 8.6)
2021-11-21 09:11:17.054353: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:2 with 3663 MB memory) -> physical GPU (device: 2, name: GeForce RTX 3090, pci bus id: 0000:b1:00.0, compute capability: 8.6)
2021-11-21 09:11:17.055315: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/device:GPU:3 with 3771 MB memory) -> physical GPU (device: 3, name: GeForce RTX 3090, pci bus id: 0000:d9:00.0, compute capability: 8.6)
'/device:GPU:0'
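The `Created TensorFlow device` lines at the end of the log summarise which GPUs TensorFlow set up and how much memory each logical device received. As a rough sketch, they can be extracted from a saved log like this. The regular expression and the function name `parse_created_devices` are my own assumptions, based only on the log format shown above; other TensorFlow versions may format these lines differently.

```python
import re

# Matches lines such as:
# "... Created TensorFlow device (/device:GPU:0 with 3793 MB memory) ->
#  physical GPU (device: 0, name: GeForce RTX 3090, pci bus id: ...)"
DEVICE_LINE = re.compile(
    r"Created TensorFlow device \((?P<device>/device:GPU:\d+) "
    r"with (?P<mem>\d+) MB memory\).*?name: (?P<name>[^,]+),"
)


def parse_created_devices(log_text):
    """Return (device, memory_mb, gpu_name) for every device creation line."""
    return [
        (m.group("device"), int(m.group("mem")), m.group("name").strip())
        for m in DEVICE_LINE.finditer(log_text)
    ]
```

Applied to the log above, this would list four devices with roughly 3.7 GB each, well below the 23.70 GiB reported per card.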
Installing keras
pip install keras
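With the conda virtual environment activated, tensorflow was installed with pip, so keras is installed with pip too; otherwise conda may install a second tensorflow and cause conflicts.
If the keras imports in the code are changed to tensorflow.keras, the standalone keras package is no longer used and does not need to be installed.
Installing cudnn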
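Switching from keras to tensorflow.keras is mostly a matter of changing import lines. The sketch below rewrites the two most common import forms in a source string. The function name `use_tf_keras` is hypothetical, and the patterns deliberately cover only simple top-of-line imports, so the result should be reviewed by hand.

```python
import re

# "from keras.layers import Dense" or "from keras import layers"
_FROM_KERAS = re.compile(r"^([ \t]*)from keras(\.|[ \t])", re.MULTILINE)
# a bare "import keras" on its own line
_IMPORT_KERAS = re.compile(r"^([ \t]*)import keras[ \t]*$", re.MULTILINE)


def use_tf_keras(source):
    """Rewrite simple keras imports so they point at tensorflow.keras."""
    source = _FROM_KERAS.sub(r"\1from tensorflow.keras\2", source)
    source = _IMPORT_KERAS.sub(r"\1from tensorflow import keras", source)
    return source
```

Imports of other packages whose names merely start with `keras` (for example `keras_preprocessing`) are left untouched.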
conda install -c nvidia cudnn=8.1.0
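Problem 1: testing whether tensorflow installed successfully
Some blog posts say this message can simply be ignored, but in my tests the GPU could not be used, which shows the environment was not set up correctly.
import tensorflow as tf
tf.test.gpu_device_name()
Message shown: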
I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: SSE4.1 SSE4.2 AVX AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags
To work out a fix, I first looked up what oneDNN is:
//01.org/onednn
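It turned out that tensorflow itself was not installed correctly and had to be uninstalled and reinstalled.
One article says the message can be ignored, but that only hides the message; the GPU still cannot be used.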
//blog.csdn.net/qq_39096123/article/details/100575784
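Problem 2: tensorflow vs. tensorflow-gpu
The official website says the CPU and GPU packages were separate only in early releases, so I assumed that installing tensorflow 2.5 directly would be enough. In practice, the CPU-built tensorflow I ended up with could not use the GPU on this machine.
tf.config.list_physical_devices('GPU')
This returned an empty list, meaning no GPU was found. To check whether the installed build has CUDA support:
tf.test.is_built_with_cuda()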
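The two checks above can be combined into one small diagnostic. This is only a sketch: `gpu_status` and `describe` are names I made up, and the import is guarded so that the script also runs in an environment where TensorFlow is missing. The TensorFlow calls used are `tf.config.list_physical_devices` and `tf.test.is_built_with_cuda`.

```python
def gpu_status():
    """Describe TensorFlow's GPU support, or return None if it is not importable."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    gpus = tf.config.list_physical_devices("GPU")
    return {
        "version": tf.__version__,
        "built_with_cuda": tf.test.is_built_with_cuda(),
        "gpus": [gpu.name for gpu in gpus],
    }


def describe(status):
    """Turn the status dict into a one-line diagnosis."""
    if status is None:
        return "TensorFlow is not installed in this environment"
    if not status["built_with_cuda"]:
        return "CPU-only build, the GPU cannot be used"
    if not status["gpus"]:
        return "CUDA build, but no GPU is visible"
    return "Visible GPUs: " + ", ".join(status["gpus"])


if __name__ == "__main__":
    print(describe(gpu_status()))
```

The second case corresponds to the situation in this section; the third one points at the driver, CUDA or cuDNN rather than at the TensorFlow package.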
Problem 3: none of the conda channels has tensorflow-gpu=2.5.0, but pip does
Versions at the time: conda 4.9.2
pip 21.1.3 from /home/username/miniconda3/envs/envnames/lib/python3.8/site-packages/pip (python 3.8)
After installing with pip, the matching CUDA version is not installed automatically.
Installing the tensorflow wheel built for a specific Python version:
pip install //storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-2.4.0-cp38-cp38-manylinux2010_x86_64.whl
After installation the package list shows: tensorflow-gpu 2.4.0
The GPU still could not be used.
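The wheel filename used above encodes which interpreter it was built for (`cp38` means CPython 3.8). Before installing a wheel by URL, the tag can be compared with the running interpreter. A minimal sketch, assuming the standard wheel naming scheme and CPython-tagged wheels only; `wheel_tags` and `matches_python` are hypothetical helper names.

```python
import sys


def wheel_tags(filename):
    """Split a wheel filename into its python, abi and platform tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel filename: %r" % filename)
    stem = filename[: -len(".whl")]
    # The last three dash-separated fields are the tags.
    _name_and_version, python_tag, abi_tag, platform_tag = stem.rsplit("-", 3)
    return {"python": python_tag, "abi": abi_tag, "platform": platform_tag}


def matches_python(filename, version_info=sys.version_info):
    """True if the wheel's python tag equals cpXY for the given interpreter."""
    expected = "cp%d%d" % (version_info[0], version_info[1])
    return wheel_tags(filename)["python"] == expected
```

A matching tag only means pip will accept the wheel. As this section shows, it says nothing about whether the CUDA and cuDNN libraries the wheel expects are present.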
Problem 4: tensorflow is the GPU version; does keras need a GPU version too?
keras-gpu
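keras-gpu can be installed through conda.
After installing it, conda updates tensorflow automatically.
In other words, installing keras-gpu is enough, and the matching tensorflow-gpu is installed along with it.
However, in the Python console tensorflow no longer worked, probably because pip had installed one tensorflow and conda installed another.
In addition, the installed keras-gpu cannot be imported with import keras, which does not work for my current program, so I dropped this approach.
Problem 5: tensorflow 2.5 and keras 2.4.3 may be incompatible
Running the code raised an error that came from keras. Related reading: the relationship between keras and tf.keras.
Problem 6: cudnn error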
Failed to get convolution algorithm.
This is probably because cuDNN failed to initialize,
so try looking to see if a warning log message was printed above.
Relevant part of the log:
Loaded runtime CuDNN library: 8.0.5
but source was compiled with: 8.1.0.
CuDNN library needs to have matching major version and equal or higher minor version. If using a binary install, upgrade your CuDNN library. If building from sources, make sure the library loaded at runtime is compatible with the version specified during compile configuration.
Installing cudnn 8.1.0 fixes this.
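The rule stated in the error message (same major version, equal or higher minor version) is easy to check for any pair of versions. The sketch below encodes just that rule; the function names are illustrative.

```python
def parse_version(text):
    """Turn a dotted version string such as '8.1.0.77' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def cudnn_is_compatible(loaded, compiled):
    """Apply the rule from the error message to the runtime and build versions."""
    loaded_v = parse_version(loaded)
    compiled_v = parse_version(compiled)
    same_major = loaded_v[0] == compiled_v[0]
    minor_not_older = loaded_v[1] >= compiled_v[1]
    return same_major and minor_not_older


print(cudnn_is_compatible("8.0.5", "8.1.0"))     # the failing case from the log
print(cudnn_is_compatible("8.1.0.77", "8.1.0"))  # the version that worked
```

The first call reproduces the mismatch reported in the log (runtime 8.0.5 against a build compiled for 8.1.0), and the second corresponds to the cudnn 8.1.0.77 listed at the top of this post.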
pip install cudnn
ERROR: Could not find a version that satisfies the requirement cudnn (from versions: none)
ERROR: No matching distribution found for cudnn
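conda does have a cudnn package,
but the default version does not meet the requirement.
In the end, the correct cudnn version has to be installed from the nvidia channel with the command below: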
//anaconda.org/nvidia/cudnn
conda install -c nvidia cudnn=8.1.0
Installing other CUDA versions
Configuring multiple CUDA and cuDNN versions on a server (different Linux accounts using different CUDA and cuDNN versions):
//www.cnblogs.com/sddai/p/10278005.html
Download link:
//developer.nvidia.com/cuda-toolkit-archive
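Download CUDA from the official site, extract it, and set the environment variables.
That is, append the following lines to the end of .bashrc in your home directory, then run source .bashrc.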
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/home/user/cuda/lib64
export PATH=$PATH:/home/user/cuda/bin
export CUDA_HOME=/home/user/cuda
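After sourcing .bashrc, it is worth confirming that the variables actually contain what was intended. The following sketch checks a mapping of environment variables against a CUDA install directory. `check_cuda_env` is a made-up helper, `/home/user/cuda` is the example path from above, and the check assumes CUDA_HOME should hold exactly one directory rather than a colon-separated list.

```python
import os


def split_paths(value):
    """Split a PATH-style variable into its non-empty entries."""
    return [entry for entry in value.split(os.pathsep) if entry]


def check_cuda_env(env, cuda_root):
    """Return a list of problems found in the CUDA-related variables."""
    problems = []
    lib_dir = os.path.join(cuda_root, "lib64")
    bin_dir = os.path.join(cuda_root, "bin")
    if lib_dir not in split_paths(env.get("LD_LIBRARY_PATH", "")):
        problems.append("LD_LIBRARY_PATH is missing " + lib_dir)
    if bin_dir not in split_paths(env.get("PATH", "")):
        problems.append("PATH is missing " + bin_dir)
    if env.get("CUDA_HOME") != cuda_root:
        problems.append("CUDA_HOME should be exactly " + cuda_root)
    return problems
```

For the current shell, call it as `check_cuda_env(os.environ, "/home/user/cuda")`; an empty list means all three variables look as expected.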
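Unresolved questions
What do an all-N matrix and a matrix with some Y entries mean, and does the difference affect model training?
My earlier understanding was that Y means the two GPUs can communicate with each other. Here every entry is N, yet a single program can still use multiple GPUs, so Y versus N has not made any visible difference so far.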
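To compare machines on this point, the matrix can be pulled out of the log and reduced to the list of GPU pairs marked Y. This is a sketch only: it parses the row format seen in the log earlier in this post, and it does not settle what Y and N actually mean. `parse_edge_matrix` and `linked_pairs` are hypothetical names.

```python
import re

# Matches matrix rows such as "...gpu_device.cc:1277] 0: N N N N"
ROW = re.compile(r"\]\s*(\d+):\s*((?:[YN]\s*)+)$")


def parse_edge_matrix(log_lines):
    """Map each GPU index to its row of 'Y'/'N' flags."""
    matrix = {}
    for line in log_lines:
        match = ROW.search(line.rstrip())
        if match:
            matrix[int(match.group(1))] = match.group(2).split()
    return matrix


def linked_pairs(matrix):
    """List each (i, j) pair with i < j whose entry is 'Y'."""
    return [
        (i, j)
        for i, row in sorted(matrix.items())
        for j, flag in enumerate(row)
        if flag == "Y" and i < j
    ]
```

For the all-N log in this post, `linked_pairs` returns an empty list.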
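Other problems:
The package file could not be found when installing tensorflow-gpu==2.4.
After finding it on anaconda.org (link below), open the file details and copy the source_url: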
conda install <source_url>
This appears to install tensorflow-gpu in an instant.
//anaconda.org/anaconda/tensorflow-gpu/files
Downloading the file shows that it only contains some basic metadata and no actual package content.
So this shortcut does not work.