Wednesday, 14 August 2013

Managing identities in the cloud

CERN has 11,000 physicists who use the lab's facilities, including the central IT department resources. As with any research environment, there are many students, PhDs and other project members who join one of the experiments at CERN. They need computing accounts to access CERN's cloud, but we also need to make sure these resources are handled correctly when their owners are no longer affiliated with the organisation.

Managing Users

For the CERN OpenStack cloud, we wanted complete integration with the site identity management system. With around 200 arrivals/departures per month, managing identities within OpenStack would have been a major effort.

CERN's users are stored in our Active Directory system, which provides a single central store for passwords and user attributes such as full name, organisational unit and location. We also define our user groups in Active Directory so that the membership lists of an experiment can be managed centrally, and applications share this master source of data when allocating roles to user groups.

Keystone provides the OpenStack authentication service including an LDAP back end. Working with the community during the Folsom release, we developed a number of patches so that Keystone was able to use the LDAP interface to Active Directory (see for details). This allows users from both the command line and Horizon GUI to use OpenStack with their standard credentials.

Where we can, we leave the LDAP schema read-only since there are many other dependencies and major schema changes can cause significant disruption.

Multiple Identities

Historically, users have multiple accounts at CERN.
  • Primary accounts, which are used for the user's main activity.
  • Secondary accounts, which are used where a different identity is needed. Typical examples are an account that gives an administrator standard user rights for writing documentation, or an ultra account which is rarely used.
  • Service accounts, which are shared. Here the user is responsible for the account but is able to transfer it to another user. Typical examples would be an account used for running a daemon or an application's internal resource.
For example, I have timbell (my primary account for my day-to-day work), timothybell (my secondary account, which simulates a typical low-privilege user profile for documentation) and owncloud (a service account related to a specific application).

The cloud identities are structured so that we use primary accounts, with roles within projects reducing the need for secondary accounts. As far as resources are concerned, a project with several members who manage it covers the service account scenario.

Thus, the cloud can potentially simplify both identity/roles and authentication by focusing on the one user, one account model. We expect exceptions but since one of the aims of the move to the cloud was to simplify our environment, we hope these can be limited to very special circumstances.

Managing Roles

We use the standard conventions for OpenStack roles.
  • Admin is a global role providing 'super-user' access to OpenStack. This is allocated to a group within Active Directory and the only members are the staff who support the cloud within IT.
  • For each project, there is a members list defined. When a project is set up, a group is provided as part of the request which defines the people who are able to perform actions within the project such as VM creation/deletion/reboot.
A script runs regularly to ensure that the groups in Keystone stay synchronised with those in Active Directory.
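
The core of such a synchronisation can be pictured as a set difference between the two membership lists. The following is a minimal sketch of that idea, not the production script: the function name plan_sync is hypothetical, and the membership lists are passed in as plain Python collections rather than being fetched over LDAP or from the Keystone API.

```python
def plan_sync(ad_members, keystone_members):
    """Return (to_add, to_remove) so that Keystone matches Active Directory.

    Active Directory is treated as the master source: anyone in the AD group
    but not in Keystone is added, anyone only in Keystone is removed.
    """
    ad = set(ad_members)
    ks = set(keystone_members)
    to_add = sorted(ad - ks)      # in AD, missing from Keystone
    to_remove = sorted(ks - ad)   # in Keystone, no longer in AD
    return to_add, to_remove


if __name__ == "__main__":
    # Illustrative account names only.
    add, remove = plan_sync(["alice", "bob"], ["bob", "carol"])
    print("grant role to:", add)
    print("revoke role from:", remove)
```

Computing the plan separately from applying it makes it easy to log or dry-run the changes before touching Keystone.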

User Lifecycle

With over 200 arrivals and departures every month, it is important to track the owner of each resource so that it can be retired when that person is no longer working on a CERN-related activity.

We use Microsoft Forefront Identity Manager (FIM) as an engine to automatically create users when someone is registered in the CERN Human Resources database and to expire them when they leave.

Users who wish to use the cloud can subscribe via the CERN accounts and resources portal. This creates an account and a personal project for them in a few minutes so they can already start investigating cloud technologies.

The general approach is that personal resources (such as the Personal project in OpenStack) are removed when a user leaves: VMs are stopped and then deleted. Departing users are removed from their roles. Ownership of shared resources, such as projects, can be transferred before the user leaves; otherwise it passes automatically to the supervisor.
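
The rules above can be sketched as a small planning function. This is an illustration under stated assumptions rather than the actual implementation: the Project class, the plan_departure function, the action names and the new_owners argument are all hypothetical, and the sketch only produces a list of actions without calling OpenStack.

```python
from dataclasses import dataclass


@dataclass
class Project:
    name: str
    owner: str
    personal: bool  # True for a user's Personal project


def plan_departure(user, supervisor, projects, new_owners=None):
    """Return the ordered actions for a departing user.

    Each action is a tuple (action, project_name, target_user).
    new_owners maps shared project names to an owner chosen before departure;
    shared projects without an entry fall back to the supervisor.
    """
    new_owners = new_owners or {}
    actions = []
    for project in projects:
        if project.owner != user:
            continue  # not owned by the departing user
        if project.personal:
            # Personal resources are removed: stop, then delete.
            actions.append(("stop_vms", project.name, None))
            actions.append(("delete_vms", project.name, None))
            actions.append(("delete_project", project.name, None))
        else:
            target = new_owners.get(project.name, supervisor)
            actions.append(("transfer_project", project.name, target))
    # Finally, the user is removed from all roles.
    actions.append(("remove_roles", None, user))
    return actions


if __name__ == "__main__":
    owned = [
        Project("Personal timbell", "timbell", personal=True),
        Project("owncloud", "timbell", personal=False),
    ]
    for step in plan_departure("timbell", "supervisor1", owned):
        print(step)
```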

With this lifecycle, OpenStack resources follow the same rules as other computing resources and no resources are left orphaned.

To allow FIM and OpenStack to integrate, we developed a service called Cornerstone. It exposes a SOAP interface to FIM with operations such as create personal project and create shared project, and then performs the automated operations behind the scenes.

One interesting issue was propagation delay. When a new project is created in FIM, Active Directory is updated, but there is a small delay before the change reaches all the Active Directory replicas. Thus, for project creation, we use a single Active Directory server to receive the information, which avoids inconsistency at the expense of availability if that server is down.


Now that we have rolled out Grizzly, work is under way on the CERN Grizzly OpenStack cloud to enhance user access. Specifically:
  • Kerberos and X.509 certificates are widely used for user authentication in High Energy Physics. Kerberos is often used for interactive user authentication. X.509 certificates are also used for users, but increasingly as a way to identify services such as automated job submission factories. Now that Keystone supports REMOTE_USER authentication, we can use the Apache Kerberos and certificate authentication methods as a front end to the Keystone service. This will avoid having to source a profile and enter passwords.
  • Integration of CERN's web based Single Sign On is an attractive option for Horizon. While common passwords are used, the user of Horizon still needs to enter their password to get access to the dashboard. CERN uses Microsoft ADFS to provide a Single Sign On capability which is used for most web applications.
  • We have a team of system administrators who perform the standard operations tasks when there are alarms in our monitoring system. These sysadmins need to be able to start, stop and reboot instances across the cloud but not to perform other operations such as create and delete. We will investigate how to model this within the existing JSON policy files.
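
One possible shape for such a sysadmin policy is sketched below as a Python dictionary that serialises to a policy-file fragment. This is an assumption-laden illustration, not a tested configuration: the role name operator is hypothetical, and the action names (compute:start, compute:stop, compute:reboot, compute:create, compute:delete) and the admin_or_owner rule are assumed to match the Nova policy file of the deployed release, so they should be checked against the installed policy.json.

```python
import json

OPERATOR_ROLE = "operator"  # hypothetical role name for the sysadmin team

# Actions the sysadmins need on any instance (names assumed, see above).
OPERATOR_ACTIONS = ["compute:start", "compute:stop", "compute:reboot"]

# Actions that stay restricted to admins and project members.
RESTRICTED_ACTIONS = ["compute:create", "compute:delete"]


def build_policy(role=OPERATOR_ROLE):
    """Build a policy fragment granting the operator role only limited actions."""
    policy = {}
    for action in OPERATOR_ACTIONS:
        # Existing access is kept, and the operator role is added.
        policy[action] = "rule:admin_or_owner or role:%s" % role
    for action in RESTRICTED_ACTIONS:
        # The operator role is deliberately not mentioned here.
        policy[action] = "rule:admin_or_owner"
    return policy


if __name__ == "__main__":
    print(json.dumps(build_policy(), indent=4, sort_keys=True))
```

Generating the fragment from two explicit lists keeps the intent reviewable: anything not listed under OPERATOR_ACTIONS is never granted to the operator role.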
A number of ongoing activities in Havana will make further integration easier:
  • The Keystone V3 API is progressing and will include additional functionality for mapping groups to roles. We will investigate how to map OpenStack roles onto Active Directory groups and thus avoid synchronisation scripts.
  • Domains will add an extra level of project handling, allowing us to group projects together. This will also open up the possibility of a structured set of roles within our user communities.
We'll be participating in the Havana design discussions around these areas so that we can further streamline our user and identity management in future.