LHC Tunnel

LHC Tunnel

Wednesday 9 May 2018

Introducing GPUs to the CERN Cloud

High-energy physics workloads can benefit from massive parallelism -- and as a matter of fact, the domain faces an increasing adoption of deep learning solutions. Take for example the newly-announced TrackML challenge [7], already running in Kaggle! This context motivates CERN to consider GPU provisioning in our OpenStack cloud, as computation accelerators, promising access to powerful GPU computing resources to developers and batch processing alike.

What are the options?

Given the nature of our workloads, our focus is on discrete PCI-E Nvidia cards, like the GTX1080Ti and the Tesla P100. There are 2 ways to provision these GPUs to VMs: PCI passthrough and virtualized GPU. The first method is not specific to GPUs, but applies to any PCI device. The device is claimed by a generic driver, VFIO, on the hypervisor (which cannot use it anymore) and exclusive access to it is given to a single VM [1]. Essentially, from the host’s perspective the VM becomes a userspace driver [2], while the VM sees the physical PCI device and can use normal drivers, expecting no functionality limitation and no performance overhead.
Visualizing passthrough vs mdev vGPU [9]
In fact, perhaps some “limitation in functionality” is warranted, so that the untrusted VM can’t do low-level hardware configuration changes on the passed-through device, like changing power settings or even its firmware! In fact, security-wise PCI passthrough leaves a lot to be desired. Apart from allowing the VM to change the device’s configuration, it might leave a possibility for side-channel attacks on the hypervisor (although we have not observed this, and a hardware “IOMMU” protects against DMA attacks from the passed-through device). Perhaps more importantly, the device’s state won’t be automatically reset after deallocating from a VM. In the case of a GPU, data from a previous use may persist on the device’s global memory when it is allocated to a new VM. The first concern may be mitigated by improving VFIO, while the latter, the issue of device reset or “cleanup”, provides a use case for a more general accelerator management framework in OpenStack -- the nascent Cyborg project may fit the bill.
Virtualized GPUs are a vendor-specific option, promising better manageability and alleviating the previous issues. Instead of having to pass through entire physical devices, we can split physical devices into virtual pieces on demand (well, almost on demand; there needs to be no vGPU allocated in order to change the split) and hand out a piece of GPU to any VM. This solution is indeed more elegant. In Intel and Nvidia’s case, virtualization is implemented as a software layer in the hypervisor, which provides “mediated devices” (mdev [3]), virtual slices of GPU that appear like virtual PCI devices to the host and can be given to the VMs individually. This requires a special vendor-specific driver on the hypervisor (Nvidia GRID, Intel GVT-g), unfortunately not yet supporting KVM. AMD is following a different path, implementing SR-IOV at a hardware level.

CERN’s implementation

PCI passthrough has been supported in Nova for several releases, so it was the first solution we tried. There is a guide in the OpenStack docs [4], as well as previous summit talks on the subject [1]. Once everything is configured, the users will see special VM flavors (“g1.large”), whose extra_specs field includes passthrough of a particular kind of gpu. For example, to deploy a GTX 1080Ti, we use the following configuration:
nova-compute
pci_passthrough_whitelist={"vendor_id":"10de"}
nova-scheduler
add PciPassthroughFilter to enabled/default filters
nova-api
pci_alias={"vendor_id":"10de",”product_id”:”1b06”,”device_type”:”type-PCI”,”name”:”nvP1080ti_VGA”}
pci_alias={"vendor_id":"10de",”product_id”:”10ef”,”device_type”:”type-PCI”,”name”:”nvP1080ti_SND”}
flavor extra_specs
--property "pci_passthrough:alias"="nvP1080ti_VGA:1,nvP1080ti_SND:1"
A detail here is that most GPUs appear as 2 pci devices, the VGA and the sound device, both of which must be passed through at the same time (they are in the same IOMMU group; basically an IOMMU group [6] is the smallest passable unit).
Our cloud was in Ocata at the time, using CellsV1, and there were a few hiccups, such as the Puppet modules not parsing an option syntax correctly (MulitStrOpt) and CellsV1 dropping the pci requests. For Puppet, we were simply missing some upstream commits [15]. From Pike on and in CellsV2, these issues shouldn’t exist. As soon as we had worked around them and puppetized our hypervisor configuration, we started offering cloud GPUs with PCI passthrough and evaluating the solution. We created a few GPU flavors, following the AWS example of keeping the amount of vCPUs the same as the corresponding normal flavors.
From the user’s perspective, there proved to be no functionality issues. CUDA applications, like TensorFlow, run normally; the users are very happy that they finally have exclusive access to their GPUs (there is good tenant isolation). And there is no performance penalty in the VM, as measured by the SHOC benchmark [5] -- admittedly quite old, we preferred this benchmark because it also measures low-level details, apart from just application performance.
From the cloud provider’s perspective, there’s a few issues. Apart from the potential security problems identified before, since the hypervisor has no control over the passed-through device, we can’t monitor the GPU. We can’t measure its actual utilization, or get warnings in case of critical events, like overheating.
Normalized performance of VMs vs. hypervisor on some SHOC benchmarks. First 2: low-level gpu features, Last 2: gpu algorithms [8]. There are different test cases of VMs, to check if other parameters play a role. The “Small VM” has 2 vCPUs, “Large VM” has 4, “Pinned VM” has 2 pinned vCPUs (thread siblings), “2 pin diff N” and “2 pin same N” measure performance in 2 pinned VMs running simultaneously, in different vs the same NUMA nodes

Virtualized GPU experiments

The allure of vGPUs amounts largely to finer-grained distribution of resources, less security concerns (debatable) and monitoring. Nova support for provisioning vGPUs is offered in Queens as an experimental feature. However, our cloud is running on KVM hypervisors (on CERN CentOS 7.4 [14]), which Nvidia does not support as of May 2018 (Nvidia GRID v6.0). When it does, the hypervisor will be able to split the GPU into vGPUs according to one of many possible profiles, such as in 4 or in 16 pieces. Libvirt then assigns these mdevs to VMs in a similar way to hostdevs (passthrough). Details are in the OpenStack docs at [16].
Despite this promise, it remains to be seen if virtual GPUs will turn out to be an attractive offering for us. This depends on vendors’ licensing costs (such as per VM pricing), which, for the compute-compatible offering, can be significant. Added to that is the fact that only a subset of standard CUDA is supported (not supported are the unified memory and “CUDA tools” [11], probably referring to tools like the Nvidia profiler). vGPUs are also oversubscribing the GPU’s compute resources, which can be seen in either a positive or negative light. On the one hand, this guarantees higher resource utilization, especially for bursting workloads, like developers. On the other hand, we may expect a lower quality of service [12].

And the road goes on...

Our initial cloud GPU offering is very limited, and we intend to gradually increase it. Before that, it will be important to address (or at least be conscious about) the security repercussions of PCI passthrough. But even more significant is to address GPU accounting in a straightforward manner, by enforcing quotas on GPU resources. So far we haven’t tested the case of GPU P2P, with multi-GPU VMs, which is supposed to be problematic [13].
Another direction we’ll be researching is offering GPU-enabled container clusters, backed by pci-passthrough VMs. It may be that, with this approach, we can emulate a behavior similar to vGPUs and circumvent some of the bigger problems with pci passthrough.

References

[5]: SHOC benchmark suite: https://github.com/vetter/shoc
[11]: CUDA Unified memory and tooling not supported on Nvidia vGPU: https://docs.nvidia.com/grid/latest/grid-vgpu-user-guide/index.html#features-grid-vgpu

23 comments:

  1. Replies
    1. Openstack In Production - Archives: Introducing Gpus To The Cern Cloud >>>>> Download Now

      >>>>> Download Full

      Openstack In Production - Archives: Introducing Gpus To The Cern Cloud >>>>> Download LINK

      >>>>> Download Now

      Openstack In Production - Archives: Introducing Gpus To The Cern Cloud >>>>> Download Full

      >>>>> Download LINK Mg

      Delete
  2. Nice post, Good work. I would like to share this post on my blog.
    Servicenow service portal training

    ReplyDelete
  3. Thanks for delivering a good stuff related to SharePoint, Explination is good, Nice Article.
    Openstack Training
    Openstack Training Online
    Openstack Training in Hyderabad

    ReplyDelete
  4. Thanks for a wonderful sharing with us. Your many blogs has importations your hard work and experience you have got in This subject content area. Brilliant reading and, I love it. You really so knowledge person and I have bookmarked it and I am looking forward to reading new articles. Essay Help - statistics homework help - college homework help

    ReplyDelete
  5. I am really impressed with your writing abilities and also with the layout to your blog. Is this a paid topic or did you customize it yourself? Either way stay up with the nice quality writing, it’s uncommon to see a nice blog like this one nowadays. And if you are a gambling enthusiast like me, kindly visit 바카라사이트 as well as 블랙잭게임I am really impressed with your writing abilities and also with the layout to your blog. Is this a paid topic or did you customize it yourself? Either way stay up with the nice quality writing, it’s uncommon to see a nice blog like this one nowadays. And if you are a gambling enthusiast like me, kindly visit 바카라사이트 as well as 블랙잭게임

    ReplyDelete
  6. Hi, I love to see your recent post for expanding the knowledge set as much as I can. We like to see latest work with the proper linking of Assignment Help Service service.

    ReplyDelete
  7. Great post! For getting high-quality Assignment Help in Melbourne, you can visit our assignment writing platform. Hire experienced professionals from there and get your assignments on time.Read More:- Assignment Help Melbourne

    ReplyDelete


  8. Hey friend, it is very well written article, thank you for the valuable and useful information you provide in this post. Keep up the good work! FYI, Pet Care adda
    Sita Warrior Of Mithila Pdf Download , IDFC First Select Credit Card Benefits,Poem on Green and Clean Energy

    ReplyDelete
  9. Thanks for sharing this best stuff with us! Keep sharing! I am new in the blog writing. All types blogs and posts are not helpful for the readers. Here the author is giving good thoughts and suggestions to each and every reader through this article. Quality of the content is the main element of the blog and this is the help with essay uk way of writing and presenting.

    ReplyDelete
  10. This comment has been removed by the author.

    ReplyDelete
  11. Your style is really unique compared to other people I have read stuff from. Thank you for posting when you’ve got the opportunity, Guess I’ll just book mark this site.

    BSc 1st Year Hall Ticket 2022
    BSc 2nd Year Hall Ticket 2022
    BSc 3rd Year Hall Ticket 2022

    ReplyDelete
  12. Openstack In Production - Archives: Introducing Gpus To The Cern Cloud >>>>> Download Now

    >>>>> Download Full

    Openstack In Production - Archives: Introducing Gpus To The Cern Cloud >>>>> Download LINK

    >>>>> Download Now

    Openstack In Production - Archives: Introducing Gpus To The Cern Cloud >>>>> Download Full

    >>>>> Download LINK FL

    ReplyDelete
  13. I have never seen this type of information before. Thanks for sharing this. Please also visit 19가이드03

    ReplyDelete
  14. It is good to hear that your store is now expanding to new locations. I have been a patron of Fantastic Eyes because of all the wonderful work that you guys do. I hope that this expansion move of yours will turn dead redemption john varston leather vest out to be successful. I will definitely go and see this new store of yours

    ReplyDelete
  15. I believe this will help your website become more organized because you have decided to set a part on this site for the inquiries regarding tax and as well as the helpful discussions. To be honest, this is one of the few sites that are doing this kind of strategy. Also, help with assignment crux I think that this will not only benefit your clients or the potential ones but you most especially because you will be able to see the questions easier.

    ReplyDelete
  16. Excellent blog! Such clever work and exposure! Keep up the very good work. Akira Jacket

    ReplyDelete
  17. "Congratulations on the exciting move to a new home for OpenStack in production! It's always refreshing to see advancements in technology infrastructure. On a side note, while we're discussing transitions, it's essential to acknowledge the various facets of our digital lives. Speaking of changes, have you ever thought about how even our WhatsApp display pictures (DP) can reflect our emotions during transitions? Sometimes, a 'sad whatsapp dp' might capture those moments of nostalgia or adjustment. Wishing the OpenStack team continued success in their new environment!"

    ReplyDelete