When I participated in Google Summer of Code last summer, I worked on Ganglia, a very popular open source distributed monitoring system. I worked on its NVIDIA GPU module, which is written in Python. This was my first official contribution to a widely used piece of open source software.
While working on it, I realized that there aren’t many resources available online. Also, the original NVML API reference PDF I had was for the C programming language, not for Python. However, I managed to translate the C API calls into Python calls, mainly because Python is so easy to work with. Eventually I became comfortable with it, implemented new features, and enhanced existing ones in this plugin module.
So, in this NVML Python tutorial, which is intended for complete beginners, I will share some common ways to interact with an NVIDIA GPU from Python, so that newcomers to this platform can get a better start.
First of all, make sure you have an NVIDIA GPU in the machine you will be working on (a note for anyone who might mix them up: a GPU and a graphics card are two different things). To install the Python dependency, you can use the latest package from here:
When you are starting your first NVML application, the first thing you need to do is set up a connection to the GPU(s). This is a one-line method call. However, you should handle exceptions around this statement, because if it doesn’t find a compatible GPU on the system it is running on, it will throw an error. Here is a code sample:
import os
from pynvml import *

try:
    nvmlInit()
except NVMLError as err:
    # Raised when no compatible GPU or driver is found
    print("Failed to initialize NVML:", err)
    print("Exiting...")
    os._exit(1)
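If exiting the whole process is too drastic (for example inside a monitoring plugin), a gentler variant is to report the failure to the caller. The following is only a sketch: `try_init` is a hypothetical helper name, and it takes the bindings module as a parameter (normally the imported `pynvml` module, assumed to expose `nvmlInit` and `NVMLError`) so that it can be tried out without a GPU.

```python
def try_init(nvml):
    """Try to initialize NVML and report the outcome instead of exiting.

    `nvml` is assumed to be the imported pynvml module, or any object
    exposing nvmlInit() and an NVMLError exception class.
    Returns a (success, message) tuple.
    """
    try:
        nvml.nvmlInit()
    except nvml.NVMLError as err:
        return False, "Failed to initialize NVML: %s" % err
    return True, "NVML initialized"
```

With the real bindings this would be called as `ok, message = try_init(pynvml)` after `import pynvml`, and the caller can decide whether to continue without GPU metrics.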
Terminate A Connection
The same rule applies to terminating a connection. You will need to handle exceptions here too:
try:
    nvmlShutdown()
except NVMLError as err:
    print("Error shutting down NVML:", err)
    # This snippet lives inside a function; 1 signals failure to the caller
    return 1
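Since every initialization should be paired with a shutdown, one possible way to keep the two together is a context manager. This is a minimal sketch, not part of the bindings: `nvml_session` is a hypothetical name, and the optional `nvml` parameter (defaulting to the `pynvml` module, which is assumed to be installed) exists only so the pattern can be tested with a stand-in object.

```python
from contextlib import contextmanager


@contextmanager
def nvml_session(nvml=None):
    """Initialize NVML on entry and always shut it down on exit."""
    if nvml is None:
        # Assumption: the NVML Python bindings are importable as pynvml
        import pynvml as nvml
    nvml.nvmlInit()
    try:
        yield nvml
    finally:
        # Runs even if the body of the with-block raises
        nvml.nvmlShutdown()
```

Used as `with nvml_session() as nvml:`, the shutdown call runs even when the code inside the block raises an exception. If initialization itself fails, the error propagates and no shutdown is attempted.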
Get Number Of GPUs:
Several GPUs can be connected to the machine, and NVML can interact with all of them as you need. To find out how many GPUs are connected, call:
numOfGPUs = int(nvmlDeviceGetCount())
Get Reference To A GPU Object By Index:
Once you have the number of GPUs, the easiest way to reference them is by a 0-based index. You just need to pass an index from 0 to N-1 (where N is the number of GPUs obtained from the code above) to the ‘nvmlDeviceGetHandleByIndex’ method, like below:
gpuObj = nvmlDeviceGetHandleByIndex(gpu_id)
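Putting the count and the index lookup together, the handles for all GPUs can be collected in one loop. The helper below is a sketch under assumptions: `list_gpu_handles` is a hypothetical name, and `nvml` stands for the imported `pynvml` module (or any object with the same two functions), on which `nvmlInit()` has already been called.

```python
def list_gpu_handles(nvml):
    """Return one device handle per GPU, ordered by index 0..N-1.

    Assumes NVML has already been initialized on `nvml`.
    """
    count = int(nvml.nvmlDeviceGetCount())
    return [nvml.nvmlDeviceGetHandleByIndex(i) for i in range(count)]
```

On a machine with no GPUs reported, the function simply returns an empty list, so callers can loop over the result without a special case.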
A Few Other Useful Operations:
Now that you have a reference to a specific GPU, you can start interacting with it and gather the information you need.
Some common operations are given here as a starting point. Use the reference manual to find the other calls you need.
# gpu_device is a handle returned by nvmlDeviceGetHandleByIndex
# get GPU temperature (degrees Celsius)
temperature = nvmlDeviceGetTemperature(gpu_device, NVML_TEMPERATURE_GPU)
# get GPU memory total (bytes)
totalMemory = nvmlDeviceGetMemoryInfo(gpu_device).total
# get GPU memory used (bytes)
usedMemory = nvmlDeviceGetMemoryInfo(gpu_device).used
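The memory values come back in bytes, so it is often handy to convert them before displaying them. Here is a small sketch of that conversion: `summarize_memory` is a hypothetical helper, and it only assumes an object with `total` and `used` attributes, such as the result of a single `nvmlDeviceGetMemoryInfo` call.

```python
def summarize_memory(mem_info):
    """Convert a memory info object (bytes) into MiB and a usage percentage.

    `mem_info` needs `.total` and `.used` attributes, both in bytes.
    """
    mib = 1024.0 * 1024.0
    if mem_info.total:
        used_percent = 100.0 * mem_info.used / mem_info.total
    else:
        # Avoid dividing by zero if no memory is reported
        used_percent = 0.0
    return {
        "total_mib": mem_info.total / mib,
        "used_mib": mem_info.used / mib,
        "used_percent": used_percent,
    }
```

Calling `summarize_memory(nvmlDeviceGetMemoryInfo(gpu_device))` also reads the memory info only once, so the total and used figures come from the same snapshot.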
I hope this basic NVML Python tutorial helps you start interfacing with NVIDIA GPUs from Python. To debug or verify your calls, you should also run a CLI tool alongside your Python script; you can learn more about it in the official documentation. If you have any issues with the topics covered in this tutorial, please let me know in the comments. Happy coding 🙂