Indie Machine Learning and Video Game Dev

26 January 2019

AMD RX550 + ROCM ultra budget Machine Learning

by Mathieu Poliquin

Testing the ultra budget AMD RX 550 gaming card for machine learning

Recently I saw this test: RX470 + ROCM (it's in Japanese, but there are many screenshots). The author reports ~7600 examples/sec with cifar10 on an RX470 8GB mining card. This is on par with a GTX 1060 at less than half the price on the second-hand market, and with 8GB of VRAM. Many neural nets require 8GB or more for training, so that is a significant advantage. Moreover, unlike NVIDIA cards, AMD RX cards let you mod the BIOS to boost the clock and squeeze out even more performance.

All of this made me curious to see how much performance the AMD RX550, one of the cheapest Polaris 11 GPU-based cards, can deliver.

Conclusion for people in a hurry

I bought one second-hand for under 168 yuan on taobao (25 US$/32 CAD$). Even at that very low price, the P106-100 and the Sapphire RX470 8GB mining card deliver better bang for the buck. That said, if you want a very cheap first card to play with the cifar10 dataset, it could be a good choice, but keep in mind that you will need to upgrade soon after, as 2GB of VRAM is far from enough for most machine learning tasks.

Important note: I unfortunately got a faulty card, which crashed often in Ubuntu, with or without the ROCm drivers, and was also unstable in Windows 10.

Test             | P106-100           | RX550 (2GB)
Cifar10 peak     | ~8000 examples/sec | ~3200 examples/sec
Alexnet forward  | 0.083 s/batch      | 0.209 s/batch
Alexnet backward | 0.193 s/batch      | 0.736 s/batch
PPO2 Atari Pong  | ~1110 frames/sec   | ~280 frames/sec
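
To make the gap between the two cards easier to read, here is a small sketch that turns the numbers in the table into "how many times faster" ratios. The figures are copied from the table; the names `RESULTS` and `p106_advantage` are made up for this illustration. Note that cifar10 and Pong are throughput numbers (higher is better) while the Alexnet rows are times per batch (lower is better), so the ratio is flipped for those.

```python
# Figures copied from the summary table. Each entry is
# (P106-100 value, RX550 value, True if a higher number is better).
RESULTS = {
    "Cifar10 peak (examples/sec)": (8000, 3200, True),
    "Alexnet forward (s/batch)": (0.083, 0.209, False),
    "Alexnet backward (s/batch)": (0.193, 0.736, False),
    "PPO2 Atari Pong (frames/sec)": (1110, 280, True),
}


def p106_advantage(p106, rx550, higher_is_better):
    """Return how many times faster the P106-100 is than the RX550."""
    if higher_is_better:
        return p106 / rx550   # throughput: bigger is faster
    return rx550 / p106       # time per batch: smaller is faster


advantages = {name: p106_advantage(*row) for name, row in RESULTS.items()}

for name, factor in advantages.items():
    print(f"{name}: P106-100 is about {factor:.1f}x faster")
```

With these inputs the P106-100 comes out roughly 2.5x faster on cifar10 and Alexnet forward, and close to 4x faster on Alexnet backward and Atari Pong.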

Software

Hardware

Details

Hardware Info

gpu-z

rocm-smi

cifar10

Atari Pong baselines

I would have expected somewhere between 400 and 500 fps, considering that cifar10 performance is quite good relative to the specs of the card. Since ROCm is still in early development, it is possible that OpenAI baselines uses some TensorFlow features that are not yet fully optimized by ROCm. I have not had the time to profile it yet.
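
One way to arrive at a figure in that range is to assume that Pong throughput should scale between the two cards the same way cifar10 throughput does. This is only a rough sketch under that assumption (it is not necessarily how the estimate above was made), using the numbers from the summary table; the variable names are made up for this illustration.

```python
# Measured figures from the summary table.
p106_cifar10 = 8000    # examples/sec
rx550_cifar10 = 3200   # examples/sec
p106_pong = 1110       # frames/sec
rx550_pong = 280       # frames/sec

# Assumption: the RX550 keeps the same fraction of P106-100
# performance on Pong as it does on cifar10.
cifar10_scale = rx550_cifar10 / p106_cifar10
expected_pong_fps = cifar10_scale * p106_pong

# Fraction of that naive expectation the RX550 actually reached.
measured_vs_expected = rx550_pong / expected_pong_fps

print(f"cifar10 scale factor: {cifar10_scale:.2f}")
print(f"expected Pong fps:    {expected_pong_fps:.0f}")
print(f"measured / expected:  {measured_vs_expected:.0%}")
```

Under that assumption the RX550 would land at about 444 fps, so the measured ~280 fps is only around 63% of what the cifar10 result suggests.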

clinfo details

tags: gpu - rocm - machine learning - rx550 - hardware