LHC Tunnel

LHC Tunnel

Monday 9 January 2017

Containers on the CERN cloud

We have recently made the Container-Engine-as-a-Service (Magnum) available in production at CERN as part of the CERN IT department services for the LHC experiments and other CERN communities. This gives the OpenStack cloud users Kubernetes, Mesos and Docker Swarm on demand within the accounting, quota and project permissions structures already implemented for virtual machines.

We shared the latest news on the service with the CERN technical staff (link). This is the follow up on the tests presented at the OpenStack Barcelona (link) and covered in the blog from IBM. The work has been helped by collaborations with Rackspace in the framework of the CERN openlab and the European Union Horizon 2020 Indigo Datacloud project.

Performance

At the Barcelona summit, we presented with Rackspace and IBM regarding our additional performance tests after the previous blog post. We expanded beyond the 2M requests/s to reach around 7M where some network infrastructure issues unrelated to OpenStack limited the scaling further.

As we created the clusters, the deployment time increased only slightly with the number of nodes as most of the work is done in parallel. But for 128 node or larger clusters, the increase in time started to scale almost linearly. At the Barcelona summit, the Heat and Magnum teams worked together to develop proposals for how to improve further in future releases, although a 1000 node cluster in 23 minutes is still a good result


Cluster Size (Nodes)
Concurrency
Deployment Time (min)
2
50
2.5
16
10
4
32
10
4
128
5
5.5
512
1
14
1000
1
23

Storage

With the LHC producing nearly 50PB this year, High Energy Physics has some custom storage technologies for specific purposes, EOS for physics data, CVMFS for read-only, highly replicated storage such as applications.

One of the features of providing a private cloud service to the CERN users is to combine the functionality of open source community software such as OpenStack with the specific needs for high energy physics. For these to work, some careful driver work is needed to ensure appropriate access while ensuring user rights. In particular,
  • EOS provides a disk-based storage system providing high-capacity and low-latency access for users at CERN. Typical use cases are where scientists are analysing data from the experiments.
  • CVMFS is used for a scalable, reliable and low-maintenance for read-only data such as software.
There are also other storage solutions we use at CERN such as
  • HDFS for long term archiving of data using Hadoop which uses an HDFS driver within the container.  HDFS works in user space, so no particular integration was required to use it from inside (unprivileged) containers
  • Cinder provides additional disk space using volumes if the basic flavor does not have sufficient. This Cinder integration is offered by upstream Magnum, and work was done in the last OpenStack cycle to improve security by adding support for Keystone trusts.
CVMFS was more straightforward as there is no need to authenticate the user. The data is read-only and can be exposed to any container. The access to the file system is provided using a driver (link) which has been adapted to run inside a container. This saves having to run additional software inside the VM hosting the container.

EOS requires authentication through mechanisms such as Kerberos to identify the user and thus determine what files they have access to. Here a container is run per user so that there is no risk of credential sharing. The details are in the driver (link).

Service Model

One interesting question that came up during the discussions of the container service was how to deliver the service to the end users. There are several scenarios:
  1. The end user launches a container engine with their specifications but they rely on the IT department to maintain the engine availability. This implies that the VMs running the container engine are not accessible to the end user.
  2. The end user launches the engine within a project that they administer. While the IT department maintains the templates and basic functions such as the Fedora Atomic images, the end user is in control of the upgrades and availability.
  3. A variation of option 2., where the nodes running containers are reachable and managed by the end user, but the container engine master nodes are managed by the IT department. This is similar to the current offer from the Google Container Engine and requires some coordination and policies regarding upgrades
Currently, the default Magnum model is for the 2nd option and adding option 3 is something we could do in the near future. As users become more interested in consuming containers, we may investigate the 1st option further

Applications

Many applications at use in CERN are in the process of being reworked for a microservices based architecture. A choice of different container engines is attractive for the software developer. One example of this is the file transfer service which ensures that the network to other high energy physics sites is kept busy but not overloaded with data transfers. The work to containerise this application was described at the recent CHEP 2016 FTS poster.

While deploying containers is an area of great interest for the software community, the key value comes from the physics applications exploiting containers to deliver a new way of working. The Swan project provides a tool for running ROOT, the High Energy Physics application framework, in a browser with easy access to the storage outlined above. A set of examples can be found at https://swan.web.cern.ch/notebook-galleries. With the academic paper, the programs used and the data available from the notebook, this allows easy sharing with other physicists during the review process using CERNBox, CERN's owncloud based file sharing solution.



Another application being studied is http://opendata.cern.ch/?ln=en which allows the general public to run analyses on LHC open data. Typical applications are Citizen Science and outreach for schools.

Ongoing Work

There are a few major items where we are working with the upstream community:
  • Cluster upgrades will allow us to upgrade the container software. Examples of this would be a new version of Fedora Atomic, Docker or the container engine. With a load balancer, this can be performed without downtime (spec)
  • Heterogeneous cluster support will allow nodes to have different flavors (cpu vs gpu, different i/o patterns, different AZs for improved failure scenarios). This is done by splitting the cluster nodes into node groups (blueprint)
  • Cluster monitoring to deploy Prometheus and cAdvisor with Grafana dashboards for easy monitoring of a Magnum cluster (blueprint).

References

39 comments:

  1. This comment has been removed by the author.

    ReplyDelete


  2. I am really happy to say it’s an interesting post to read . I learn new information from your article , you are doing a great job . Keep it up and a i also want to share some information regarding best selenium online training and selenium training videos

    ReplyDelete
  3. Thanks, Admin for sharing such a useful post, I hope it’s useful to many individuals for developing their skills to get a good career.
    Docker and Kubernetes Training in Hyderabad
    Kubernetes Online Training
    Docker Online Training

    ReplyDelete
  4. Awesome Post. Great Content. It is very inspiring to read your post. Waiting for your updates.
    Kubernetes Online Training
    Docker Online Training
    Docker Training

    ReplyDelete
  5. Whatever we gathered information from the blogs, we should implement that in practically then only we can understand that exact thing clearly, but it’s no need to do it, because you have explained the concepts very well. It was crystal clear, keep sharing kubernetes tutorial for beginners and learn devops online. We provide open stack training also.

    ReplyDelete
  6. Poker online situs terbaik yang kini dapat dimainkan seperti Bandar Poker yang menyediakan beberapa situs lainnya seperti http://62.171.128.49/hondaqq/ , kemudian http://62.171.128.49/gesitqq/, http://62.171.128.49/gelangqq/, dan http://62.171.128.49/seniqq. yang paling akhir yaitu http://62.171.128.49/pokerwalet/. Jangan lupa mendaftar di panenqq silakan dicoba ya boss

    ReplyDelete
  7. Its good and Informative.Thank you for posting this article.
    Kubernetes Online Training

    ReplyDelete

  8. Wow. That is so elegant and logical and clearly explained. Brilliantly goes through what could be a complex process and makes it obvious.I want to refer about the best tableau server training

    ReplyDelete
  9. Up until now, I concur with you on a significant part of the data you have composed here. I should think some on it, yet generally speaking this is a brilliant article.


    SEO services in kolkata
    Best SEO services in kolkata
    SEO company in kolkata
    Best SEO company in kolkata
    Top SEO company in kolkata
    Top SEO services in kolkata
    SEO services in India
    SEO copmany in India

    ReplyDelete
  10. This is a great inspiring tutorials on hadoop.I am pretty much pleased with your good work.You put really very helpful information. Keep it up

    python training in bangalore

    python training in hyderabad

    python online training

    python training

    python flask training

    python flask online training

    python training in coimbatore

    ReplyDelete
  11. Aw, this was a very nice post. Taking the time and actual effort to produce a superb article… but what can I say… I procrastinate a whole lot and never manage to get anything done.shipping containers near me for sale

    ReplyDelete
  12. This comment has been removed by the author.

    ReplyDelete
  13. Thanks for sharing your wealthy information. This is one of the excellent posts which I have seen. I go through your all of your blog, but this blog is the best one. It is really what I wanted to see hope in future you will continue for sharing such an excellent post
    vé máy bay từ đà nẵng đi seoul

    giá vé máy bay đi nhật bản bao nhiêu

    ve may bay di tokyo

    vé máy bay eva đi đài loan

    vé máy bay giá rẻ đi taipei

    vé máy bay tphcm đi cao hùng

    ReplyDelete
  14. Thank you for sharing wonderful information with us to get some idea about it.
    workday integration course india
    workday online integration course

    ReplyDelete
  15. Nice Post! It's Really awesome please keep writing these types of content

    ReplyDelete
  16. Situs Nonton movie, film dan tv series terbaru dengan subtitle indonesia diupdate setiap hari, dari situs terpopuler nonton disini link
    di bawah ini
    layarkaca21
    bioskopkeren
    daymovie
    terbit21
    boomthis

    Situs judi online terpercaya
    http://199.188.201.133
    http://162.213.251.13
    bonekaqq
    indoqq99
    sahabatqq

    ReplyDelete
  17. Thanks again for the article post.Really thank you! Fantastic.
    devops online training
    devops training

    ReplyDelete
  18. Very awesome post! I like that and very interesting content.
    workday course
    workday online course

    ReplyDelete
  19. thanks due to the fact you have been precise-natured to percentage opinion subsequent to us. we are able to continually recognize all you have finished here because I understand you are selected worried thinking about our.! Users can save their personalized presentations locally and online into their accounts using this software. Prezi Pro Full Crack

    ReplyDelete
  20. i'm able to see which you are an capable at your pitch! i'm launching a internet site quickly, and your inform could be very useful for me.. thanks for all your benefit taking place and wishing you all the triumph for your issue.! MS Office 2010 Cracks

    ReplyDelete
  21. Great Article, Thank you for sharing this awesome blog.

    Keep Sharing more thoughts with us.

    ServiceNow Training

    ReplyDelete
  22. wordpress website design agency in united states Need professional WordPress Web Design Services? We're experts in developing attractive mobile-friendly WordPress websites for businesses. Contact us today!

    ReplyDelete
  23. Fx Sound Crack enhances sound quality, loudness, clarity, and thunderous bass on your PC. This application offers you a wide variety of audio content, including stunning music, films, Internet radio, websites, games, and video chats

    ReplyDelete