RTX 2060 Super for Machine Learning
by Mathieu Poliquin
The RTX 2060 Super is the card I currently use. I think it’s the best bang for the buck for my use case, and for most use cases, provided your models fit inside 8GB. With the 16-bit precision mode offered in RTX cards you can almost double the memory available for your model, with no precision-related issues in most cases. That makes it a better choice than the previous generation, for example the GTX 1080 8GB, which doesn’t have proper 16-bit support.
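As a rough illustration of where the "almost double" comes from, the sketch below compares the storage needed for the same number of values in 32-bit and 16-bit floats. The parameter count is a made-up figure for illustration, not a measurement from this card, and the helper name `tensor_megabytes` is hypothetical.

```python
# Rough storage estimate for the same tensor in FP32 vs FP16.
BYTES_PER_VALUE = {"fp32": 4, "fp16": 2}


def tensor_megabytes(num_values: int, precision: str) -> float:
    """Return the size in MiB of num_values floats at the given precision."""
    return num_values * BYTES_PER_VALUE[precision] / (1024 ** 2)


# Hypothetical model size, chosen only for illustration.
num_params = 25_000_000

fp32_mb = tensor_megabytes(num_params, "fp32")
fp16_mb = tensor_megabytes(num_params, "fp16")
print(f"FP32: {fp32_mb:.1f} MiB, FP16: {fp16_mb:.1f} MiB")
```

In practice the saving is usually somewhat less than 2x, because mixed-precision training typically keeps some tensors (such as a master copy of the weights) in 32-bit.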
My specific model is a MaxSun iCraft RTX 2060 Super. The only complaint I have is the high temperatures: it reaches 80C quite quickly under 100% load (38C at rest) even though the airflow in the computer case is very good. I think it’s related to the effectiveness of the GPU fans. When I set the fan speed to 100% the temperatures are OK, but the card is quite noisy.
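If you want to watch temperature and fan speed under load yourself, one option is to poll `nvidia-smi`. The sketch below is a minimal example, assuming `nvidia-smi` is on your PATH and Python 3.7 or newer. The function names are hypothetical, and values are kept as strings because some cards report fields such as fan speed as not available.

```python
import subprocess

# Fields requested from nvidia-smi, in the order they come back.
QUERY_FIELDS = ["temperature.gpu", "fan.speed", "utilization.gpu"]


def parse_status_line(line: str) -> dict:
    """Turn one CSV line from nvidia-smi into a {field: value} dict of strings."""
    values = [v.strip() for v in line.split(",")]
    return dict(zip(QUERY_FIELDS, values))


def read_gpu_status() -> list:
    """Query nvidia-smi once and return one dict per GPU."""
    cmd = [
        "nvidia-smi",
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    return [parse_status_line(line) for line in out.splitlines() if line.strip()]
```

Calling `read_gpu_status()` in a loop with a short sleep while a training job runs gives a simple temperature log.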
Performance test results
Test | RTX 2060 Super |
---|---|
PPO2 Atari Pong | ~1670 frames/sec |
Resnet50 batch=32 | 181.88 images/sec |
Resnet50 batch=32 (16 bit) | 292.80 images/sec |
Resnet50 batch=64 (16 bit) | 324.83 images/sec |
Isaac gym/OpenAI - Shadowhand | 30952 steps/s |
Host to Device | 1122 MB/s |
Device to Host | 1218.3 MB/s |
Device to Device | 168540.0 MB/s |
Quake 2 RTX | 55 fps |
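To put the ResNet-50 rows in perspective, the short sketch below computes the speedup of the two 16-bit runs relative to the 32-bit run, using only the numbers from the table. The dictionary keys are labels made up for this example.

```python
# Throughput figures copied from the results table (images/sec).
resnet50 = {
    "fp32, batch=32": 181.88,
    "fp16, batch=32": 292.80,
    "fp16, batch=64": 324.83,
}

# Speedup of each run relative to the 32-bit, batch=32 baseline.
baseline = resnet50["fp32, batch=32"]
speedups = {name: rate / baseline for name, rate in resnet50.items()}

for name, speedup in speedups.items():
    print(f"{name}: {speedup:.2f}x")
```

That works out to roughly 1.6x at the same batch size and roughly 1.8x with the larger batch that 16-bit mode makes room for.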
Software
- Windows 10
- Ubuntu 18.04
- TensorFlow 1.14
- PyTorch 1.8
- CUDA 10.0
- cuDNN 7.36
- NVIDIA driver 430
Hardware
- MaxSun iCraft RTX 2060 Super
- E5 2678 v3 ES (30MB cache, 12C/24T)
- 16 GB ECC DDR4 2133 MHz
- Dell T7810 dual socket motherboard (but only one CPU is used)
OpenAI baselines + retro
When training ML models on games, the CPU is also heavily used for simulation, so the GPU is not 100% utilized but is used in spikes. That said, you can still get a big performance boost of about 500 fps on the same CPU using OpenAI’s baselines and Retro frameworks with their default CNN model.
Details of the setup are here: https://www.videogames.ai/2019/01/29/Setup-OpenAI-baselines-retro.html
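One way to see how much the CPU-side simulation limits training is to time the environment loop on its own, with no model involved. The sketch below is a generic timing helper. `dummy_step` is a hypothetical stand-in for a real environment step (for example a Retro `env.step(action)` call) and is not part of baselines or Retro.

```python
import time


def measure_fps(step_fn, n_steps: int = 1000) -> float:
    """Call step_fn n_steps times and return the achieved steps per second."""
    start = time.perf_counter()
    for _ in range(n_steps):
        step_fn()
    elapsed = time.perf_counter() - start
    return n_steps / elapsed


def dummy_step():
    # Hypothetical stand-in for a real environment step; it only burns some CPU time.
    sum(range(1000))


fps = measure_fps(dummy_step, n_steps=1000)
print(f"{fps:.0f} steps/sec")
```

If the environment alone cannot go much faster than the training frame rate, a faster GPU will not help much for that workload.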
Isaac gym
I get 30952 steps/s on the Isaac Gym ShadowHand example. I actually did a video of Isaac Gym on the RTX 2060; you can see it here, along with a comparison with the P106-100:
Quake 2 RTX
I get around 55 fps at the beginning of the first level (demo version). You can see the GPU profiling details in the screenshot.
Here are the options I used:
ResNet-50 test
For the ResNet-50 test I use TensorFlow’s benchmarks repo on GitHub: https://github.com/tensorflow/benchmarks/tree/master/scripts/tf_cnn_benchmarks
I used TensorFlow version 1.14.
python3 tf_cnn_benchmarks.py --num_gpus=1 --batch_size=64 --model=resnet50 --use_fp16
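
The command above corresponds to the batch=64, 16-bit row of the results table. As a sketch, the snippet below builds the command lines for all three ResNet-50 rows, assuming the other two runs differ only in `--batch_size` and in whether `--use_fp16` is passed (the post does not state this). The helper name `benchmark_command` is hypothetical.

```python
def benchmark_command(batch_size: int, use_fp16: bool) -> list:
    """Build the tf_cnn_benchmarks argument list for one ResNet-50 run."""
    cmd = [
        "python3",
        "tf_cnn_benchmarks.py",
        "--num_gpus=1",
        f"--batch_size={batch_size}",
        "--model=resnet50",
    ]
    if use_fp16:
        cmd.append("--use_fp16")
    return cmd


# (batch size, use 16-bit) for the three table rows; assumed, not confirmed.
configs = [(32, False), (32, True), (64, True)]

for batch_size, use_fp16 in configs:
    print(" ".join(benchmark_command(batch_size, use_fp16)))
```

Each printed line can be run from the `scripts/tf_cnn_benchmarks` directory of the benchmarks repo.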