Distinguished Names (DNs) identify TeraGrid users. Certificate Authorities (CAs) perform an identity vetting process to assign DNs and issue digitally signed certificates containing the DNs (in the subject field) to users. Users run programs that present the digitally signed certificates to online services on TeraGrid and elsewhere as part of an authentication protocol, enabling the services to make authorization decisions based on the users' authenticated distinguished names. DNs contain the user's name (in the common name field), together with other qualifying components that ensure the DN uniquely identifies the user even when other people have the same name (such as John Wilson). For example:
/C=US/O=National Center for Supercomputing Applications/CN=Jim Basney
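To make the structure of such a DN concrete, the following is a minimal Python sketch that splits the slash-separated string form into attribute/value pairs and extracts the common name. The function names `parse_dn` and `common_name` are hypothetical, and the sketch assumes that no value contains a literal "/", which real DNs are allowed to do.

```python
from typing import List, Optional, Tuple


def parse_dn(dn: str) -> List[Tuple[str, str]]:
    """Split a slash-separated DN string into (attribute, value) pairs.

    Assumption: no value contains a literal "/". Real DNs can, so this
    is for illustration only.
    """
    if not dn.startswith("/"):
        raise ValueError(f"expected a leading '/': {dn!r}")
    pairs = []
    for component in dn[1:].split("/"):
        attribute, _, value = component.partition("=")
        pairs.append((attribute, value))
    return pairs


def common_name(dn: str) -> Optional[str]:
    """Return the value of the last CN component, or None if there is none."""
    names = [value for attribute, value in parse_dn(dn) if attribute == "CN"]
    return names[-1] if names else None


example = "/C=US/O=National Center for Supercomputing Applications/CN=Jim Basney"
print(parse_dn(example))
print(common_name(example))
```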
The purpose of DNs in the TeraGrid is to enable users to have a consistent identity across all TeraGrid sites, which may each have a different login name for the same user. This allows the user, to some extent, to be unaware of site-specific login names and enables single sign-on, which is the ability to use the same credentials to authenticate to multiple TeraGrid systems.
This document describes how distinguished names are used on the TeraGrid: how they are assigned, how they are associated with TeraGrid accounts, and how they are managed to meet TeraGrid's security requirements.
This document currently captures TeraGrid's existing practice fairly well. The next step is to flesh out the issues and then work with the TeraGrid community to reach conclusions about how to proceed.
- AMIE: Account Management Information Exchange (documentation)
- CA: Certificate Authority
- CN: Common Name
- DN: Distinguished Name
- RP: TeraGrid Resource Provider
- TG: TeraGrid
- TGCDB: TeraGrid Central Database
- TGUP: TeraGrid User Portal
Distinguished Name Flow
The figure at right shows how DNs are created, distributed and consumed on the TeraGrid. In this section we discuss each entity in the figure and its role in this process.
The TGCDB contains the "master copy" of the TG-wide mapping of users to their DNs. Each user record includes the set of DNs for that user. Every user has two canonical, immutable DNs for the NCSA CAs that are generated during account creation. A user may also have additional DNs as added by RP sites or by the user, as described subsequently.
The TGCDB also contains the following information for each user:
- For each RP site, a site person ID that uniquely identifies that user to that site. The site person ID may or may not be a login name for that user at a site. When an AMIE packet is sent to a particular RP, the site person ID for that site is included and the site is responsible (using infrastructure out of scope of this document) for determining the pertinent login name(s) from the site person ID.
- For each RP resource, one or more login names (i.e., the first field in /etc/passwd) for that user. Users may have multiple login names on a resource if an RP site decides to assign them. In this case it is up to the RP site to decide which account or accounts the DNs from the TGCDB are mapped to (the TGCDB maps DNs only to the site person ID, not to individual user names).
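The relationships described above can be sketched as a small data model. This is only an illustration: the class `UserRecord`, the function `dns_for_site_person`, the site and resource names, the site person ID, and the login names are all hypothetical and do not reflect the actual TGCDB schema.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class UserRecord:
    """Hypothetical in-memory view of one user's TGCDB record."""
    dns: Set[str] = field(default_factory=set)
    # RP site name -> site person ID (not necessarily a login name)
    site_person_ids: Dict[str, str] = field(default_factory=dict)
    # RP resource name -> login names for this user on that resource
    logins: Dict[str, List[str]] = field(default_factory=dict)


def dns_for_site_person(records: Iterable[UserRecord], site: str,
                        site_person_id: str) -> Set[str]:
    """Collect the DNs of every record matching a site person ID.

    Choosing which local login(s) receive these DNs is left to the site.
    """
    found: Set[str] = set()
    for record in records:
        if record.site_person_ids.get(site) == site_person_id:
            found |= record.dns
    return found


record = UserRecord(
    dns={
        "/C=US/O=National Center for Supercomputing Applications/OU=People/CN=Jim Basney",
        "/C=US/O=National Center for Supercomputing Applications/CN=Jim Basney",
    },
    site_person_ids={"site-a": "p-1001"},                 # hypothetical values
    logins={"site-a-cluster": ["jbasney", "jbasney2"]},   # hypothetical values
)
print(sorted(dns_for_site_person([record], "site-a", "p-1001")))
```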
As part of creating a user entry in the TGCDB, the NCSA Allocations Group, which handles allocations for TeraGrid, assigns two NCSA DNs for each user (one for the MyProxy CA and one for the CACL-based CA; these are described subsequently). These DNs are generated at the time the user's record in the TGCDB is generated, are crafted to be globally unique, and barring exceptional circumstances (e.g. some error in creation) are immutable.
Some TG sites that run CAs for their users push user-to-DN mappings back into the TGCDB using AMIE or by writing directly to the database. Some sites do this automatically when they receive an AMIE account creation notification; others do it only once a user requests a credential from the local site CA. Known instances of this include:
- PSC automatically pushes a PSC KCA DN for each TG user into the TGCDB via AMIE upon being notified of the user's account creation.
- SDSC uses gx-request to push an SDSC CACL DN for each user that requests a credential from the SDSC CACL CA, and pushes a removal for each SDSC CACL DN that expires or is revoked.
- SDSC uses gx-request to push a removal for each NPACI CACL DN that expires or is revoked. (The NPACI CA no longer issues new certificates.)
User-driven DN Management
There are two processes by which users can manage their DN mappings. Both work by sending SQL commands directly to the TGCDB.
- gx-request: gx-request (part of the gx-map package) can be run by a user from the command-line at TG sites to add or delete DN mappings for their TG account, or it can be run by an administrator.
- TGUP: TG users can view their registered DNs on the My TeraGrid tab of the TeraGrid User Portal. In the future, users will also be able to add and delete DN mappings through this interface.
Note that the TGUP process is still pending activation, and it is unclear whether all RP sites have gx-map configured to work in the manner described above. It is also unclear whether a TG-wide decision has been made to activate this gx-map feature.
There are two mechanisms by which user-to-DN mapping information is communicated between the TGCDB and RP sites:
- AMIE: Any relevant change in the TGCDB automatically generates an AMIE packet to affected TG site(s) containing information about the change. For more detail see #AMIE Details.
- SQL: TG sites may use a grid-mapfile management application which pulls (once given permission by the TGCDB admins) user-to-DN mapping information from the TGCDB. This pull operation may be initiated by the receipt of an AMIE packet. gx-map is currently the only application that does this.
Grid-mapfile Management Application
Each RP site has an application which handles AMIE-driven account management (which is out of scope of this document) and management of user-to-DN mappings. This application can also pull information as described in the previous section.
A site may also have local DN-to-user mappings (e.g. from Grid users who are not in the TGCDB). The application at each site handles the merging of these local mappings with mappings from the TGCDB.
gx-map is an example of such an application.
TG services map DNs to local accounts via /etc/grid-security/grid-mapfile, which contains lines of the format:
"distinguished name" login-name
DNs can be mapped to multiple logins as follows:
"distinguished name" login-name-1,login-name-2
See the gridmap specification for more details.
Note that grid-mapfiles will differ across the TG sites and systems for the following reasons:
- Login names are not consistent across the TG sites, so a user's DN(s) will map to a different login name at each site.
- Not all users have accounts at all TG systems, and each local grid-mapfile will only contain entries for users with accounts on that system.
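As an illustration of how a grid-mapfile can be read, the following Python sketch parses lines consisting of a quoted DN followed by whitespace and one or more comma-separated login names. It is not the Globus parser: the function name `parse_grid_mapfile` is hypothetical, the sketch assumes that blank lines and lines starting with "#" are ignored, and it does not handle escaped quotes inside a DN. The second sample entry (DN and logins) is invented for the example.

```python
import re
from typing import Dict, List

# A quoted DN, whitespace, then a comma-separated list of login names.
LINE = re.compile(r'^\s*"([^"]+)"\s+(\S+)\s*$')


def parse_grid_mapfile(text: str) -> Dict[str, List[str]]:
    """Return a mapping of DN -> login names, in file order."""
    mapping: Dict[str, List[str]] = {}
    for number, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # assumption: blank lines and comments are skipped
        match = LINE.match(line)
        if match is None:
            raise ValueError(f"line {number}: expected a quoted DN and a login")
        dn, logins = match.groups()
        names = [name for name in logins.split(",") if name]
        mapping.setdefault(dn, []).extend(names)
    return mapping


sample = '''
# hypothetical grid-mapfile contents
"/C=US/O=National Center for Supercomputing Applications/CN=Jonathan Siwek" jsiwek
"/O=Example Org/CN=Jane Doe" jdoe,jdoe_dev
'''
print(parse_grid_mapfile(sample))
```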
Multiple CAs issue certificates to TeraGrid users. The TeraGrid Security Working Group determines which CAs' certificates will be accepted across all TeraGrid sites, as documented at http://security.teragrid.org/TG-CAs.html. The configuration files for these CAs are published at http://security.teragrid.org/docs/teragrid-certs.tar.gz.
All TeraGrid users can obtain certificates from the NCSA CAs. Other CAs issue certificates to TeraGrid users based on their local policies.
The NCSA CAs
NCSA currently runs two CAs, as shown in the figure at right: the CACL CA, which issues long-lived certificates valid for up to one year, and the MyProxy CA, which issues short-lived certificates valid for up to one week. A complete set of policy documents and other technical information for the NCSA CAs can be found at http://security.ncsa.uiuc.edu/CA.
The NCSA Allocations group creates canonical DNs automatically for each user for both CAs in the NCSA User Database. NCSA DNs for long-lived certificates are of the form:
/C=US/O=National Center for Supercomputing Applications/OU=People/CN=Jim Basney
NCSA DNs for short-lived certificates are of the form:
/C=US/O=National Center for Supercomputing Applications/CN=Jim Basney
In the case of name conflicts, an integer is added to the CN field to differentiate between DNs.
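The conflict rule can be sketched as follows. The source says only that an integer is added to the CN field, so the exact form used here (a space followed by a counter starting at 2) is an assumption, as is the function name `assign_unique_dn`.

```python
from typing import Set

# Prefix taken from the short-lived DN form shown above.
NCSA_SHORT_LIVED_PREFIX = "/C=US/O=National Center for Supercomputing Applications"


def assign_unique_dn(full_name: str, issued: Set[str],
                     prefix: str = NCSA_SHORT_LIVED_PREFIX) -> str:
    """Return a DN for full_name that is not yet in `issued`, and record it.

    Assumption: conflicts are resolved by appending " 2", " 3", ... to the CN.
    """
    candidate = f"{prefix}/CN={full_name}"
    counter = 1
    while candidate in issued:
        counter += 1
        candidate = f"{prefix}/CN={full_name} {counter}"
    issued.add(candidate)
    return candidate


issued: Set[str] = set()
first = assign_unique_dn("John Wilson", issued)
second = assign_unique_dn("John Wilson", issued)
print(first)
print(second)
```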
The short-lived NCSA certificates are issued by the NCSA MyProxy CA (myproxy.teragrid.org). The MyProxy server consults a grid-mapfile routinely generated from the NCSA DNs from the NCSA User Database to map from the user's Kerberos principal name to their DN. As documented at http://www.teragrid.org/userinfo/access/, TG users can obtain these certificates from the MyProxy CA using their TeraGrid-wide username and password (as stored in the TeraGrid Kerberos domain) via the following methods:
- by running myproxy-logon from the command-line;
- by performing MyProxy authentication via the GSI-SSHTerm application;
- by logging into the TeraGrid User Portal (the TGUP uses the NCSA MyProxy CA to authenticate users and obtain grid credentials for them).
Users are highly encouraged to use short-lived certificates from the MyProxy CA. For users that require long-lived certificates, such certificates are issued by the NCSA CACL CA via the ncsa-cert-request client. Users are authenticated using their NCSA Kerberos username and password as well as their "NCSA Default Password" stored in the NCSA User Database.
The PSC KCA
The Purdue CA
The SDSC/NPACI CAs
The TACC CA
Other IGTF CAs
The TeraGrid also recognizes (some of the) CAs approved by the International Grid Trust Federation (IGTF), a policy management authority that works to harmonize and synchronize CA policies for global trust relationships. TeraGrid participates in the IGTF via The Americas Grid Policy Management Authority (TAGPMA) as a relying party, and individual TG sites participate as identity providers (i.e., CA operators). IGTF authentication profiles establish a common policy framework for grid CAs.
TeraGrid sites exchange account management information with the TeraGrid Central Database (TGCDB) via AMIE. AMIE defines a set of transactions for exchanging account management information. The following AMIE transactions include DN information:
- Requests a site to create a project and an account for the PI. The TGCDB sends a request_project_create packet with all known DNs for the PI. The site responds with a notify_project_create packet with all DNs in its grid-mapfiles for the PI, and the TGCDB adds any new DNs to the user's record. Then the TGCDB sends a data_project_create packet with all known DNs for the PI, and the site responds with an inform_transaction_complete packet. Sites must add to their grid-mapfiles all the DNs listed for the PI in the request_project_create and data_project_create packets.
- Requests a site to create an account for a user on a project. The TGCDB sends a request_account_create packet with all known DNs for the user. The site responds with a notify_account_create packet with all DNs in its grid-mapfiles for the user, and the TGCDB adds any new DNs to the user's record. Then the TGCDB sends a data_account_create packet with all known DNs for the user, and the site responds with an inform_transaction_complete packet. Sites must add to their grid-mapfiles all the DNs listed for the user in the request_account_create and data_account_create packets.
- Requests a site to update information about a user. The TGCDB will send this message to the sites whenever a DN is added/removed from a user's record. The TGCDB sends a request_user_modify packet with ActionType equal to replace or delete. For replace, the DNs listed must be added to the site's grid-mapfile. DNs in the grid-mapfile which are not listed must be preserved. For delete, the DNs listed must be deleted from the site's grid-mapfiles. The site responds with inform_transaction_complete.
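The replace and delete semantics of request_user_modify can be sketched as set operations on the DNs a site holds for one user. This is a hedged illustration only: real AMIE packets are not Python values, and the function name `apply_user_modify` and the example DNs are hypothetical.

```python
from typing import Iterable, Set


def apply_user_modify(current_dns: Set[str], action_type: str,
                      listed_dns: Iterable[str]) -> Set[str]:
    """Return the user's DN set after a request_user_modify packet.

    replace: listed DNs are added; DNs not listed are preserved.
    delete:  listed DNs are removed.
    """
    listed = set(listed_dns)
    if action_type == "replace":
        return set(current_dns) | listed
    if action_type == "delete":
        return set(current_dns) - listed
    raise ValueError(f"unsupported ActionType: {action_type!r}")


# Hypothetical DNs for one user.
current = {"/O=Example Org/CN=Jane Doe", "/O=Other Org/CN=Jane Doe"}
after_replace = apply_user_modify(current, "replace", ["/O=Third Org/CN=Jane Doe"])
after_delete = apply_user_modify(after_replace, "delete", ["/O=Other Org/CN=Jane Doe"])
print(sorted(after_replace))
print(sorted(after_delete))
```

Note that "replace" here does not remove unlisted DNs, which matches the requirement that DNs in the grid-mapfile which are not listed be preserved.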
Sites can also initiate transactions:
- Creates a TG project. The site sends a notify_project_create packet with all locally-known DNs for the PI, and the TGCDB adds any new DNs to the user's record. Then the TGCDB sends a data_project_create packet with all known DNs for the PI, and the site responds with an inform_transaction_complete packet. The site must add to its grid-mapfiles all the DNs listed for the PI in the data_project_create packet.
- Notify the TGCDB of a new TG account for a project. The site sends a notify_account_create packet with all DNs in its grid-mapfiles for the user, and the TGCDB adds any new DNs to the user's record. Then the TGCDB sends a data_account_create packet with all known DNs for the user, and the site responds with an inform_transaction_complete packet. The site must add to its grid-mapfiles all the DNs listed for the user in the data_account_create packet.
- Notifies the TGCDB of updated information about a user. Similar to the TGCDB-initiated transaction by the same name.
If the AMIE transactions are processed correctly, DNs should be mapped to user accounts consistently across the TG sites. In other words, for each system where a TG user has an account, the same DNs should be mapped to that user account. Note that there will be delays between the time a DN is installed in the TGCDB and the time it shows up in the grid-mapfiles. The distribution process is driven by timers (cron jobs) that either poll the TGCDB (e.g., gx-map) or push DNs to the sites (e.g., AMIE).
Managing Distinguished Names: The Challenges
In this section we discuss the challenges that TeraGrid has experienced, and is still experiencing, in managing distinguished names. A subsequent section discusses recommendations for addressing these issues.
NOTE: This page subsumes DN Propagation and Management.
Multiple Mappings and Ordering
Both the grid-mapfile format and the TGCDB allow multiple DNs to be bound to a single login name, and allow a single DN to map to multiple login names.
It is common for TG users to have multiple DNs. Many TG users have NCSA and PSC DNs assigned to them when their accounts are created, and they may prefer to use a certificate obtained from another TG or IGTF CA.
In contrast, it is uncommon and problematic for a TG DN to map to multiple login names on a system, but nothing currently prevents this. When it happens, whichever mapping is listed first in the grid-mapfile is the effective mapping for the user. If the order is inconsistent across sites or resources, the same DN may therefore map to accounts for different users at different resources or sites.
Reverse Mappings and Ordering
Using the grid-mapfile to determine a DN from a login name is considered reverse mapping. When multiple DNs map to a login name in a grid-mapfile, the Globus Toolkit uses the first appearance of the login name in the file for the reverse mapping, while GPFS-WAN uses the last appearance.
Current TG practice does not guarantee the ordering of grid-mapfile lines, which makes reverse mapping problematic for users with multiple DNs mapped to a single login, because different sites or resources may list the mappings in a different order. As a result, the same user may map to two different DNs on two different resources.
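The ordering problem can be illustrated with a short Python sketch that reverse-maps a login name using either the first or the last appearance in the file, mirroring the two behaviours described above. The function name `reverse_map`, the login name, and the two site listings are hypothetical; this is not the Globus Toolkit or GPFS-WAN implementation.

```python
from typing import List, Optional, Tuple

LONG_LIVED = "/C=US/O=National Center for Supercomputing Applications/OU=People/CN=Jim Basney"
SHORT_LIVED = "/C=US/O=National Center for Supercomputing Applications/CN=Jim Basney"


def reverse_map(entries: List[Tuple[str, str]], login: str,
                use: str = "first") -> Optional[str]:
    """Return the DN for `login` from (dn, login) entries in file order.

    use="first" takes the first appearance of the login, use="last" the last.
    """
    matches = [dn for dn, name in entries if name == login]
    if not matches:
        return None
    return matches[0] if use == "first" else matches[-1]


# The same two mappings, listed in a different order at two hypothetical sites.
site_a = [(SHORT_LIVED, "jbasney"), (LONG_LIVED, "jbasney")]
site_b = [(LONG_LIVED, "jbasney"), (SHORT_LIVED, "jbasney")]

print(reverse_map(site_a, "jbasney", use="first"))
print(reverse_map(site_b, "jbasney", use="first"))
```

With the same policy ("first") the two sites return different DNs for the same login, and at a single site the "first" and "last" policies disagree as well.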
GPFS-WAN currently performs reverse DN mapping to determine whether two users on different systems are the same user; that is, GPFS-WAN needs to make a user's files appear to be owned by that user's local UID wherever they are accessed. To do this, it needs to know how to map between, for example, SDSC UIDs and NCSA UIDs.
Should reverse mapping be supported and if so, what requirements does this place on the propagation mechanism?
- Identifying Users: When a TeraGrid resource is accessed via a community account, it is more difficult for the TG RP to identify the individual user making the access. For accounting and reporting reasons, TeraGrid RPs would like statistics about the individuals accessing their resources. Additionally, RP security personnel would like the ability to trace back accesses to individuals in the case of a security incident. Furthermore, RPs may want to deny access to individuals or classes of users. The AAA Testbed effort aims to include user attributes in community proxy credentials to provide this information to RPs when community credentials are used to access TG resources.
- Isolating Users: When multiple individual users access community accounts, their activities in the shared account may conflict. For example, if a community user delegates credentials to a community account, other community users who are also accessing the community account should not have access to those delegated credentials. Community users should not be able to attack each other via the community account (for example, by installing Trojan executables in the account). Finally, we must recognize that science is a competitive pursuit and community members will have a reasonable expectation that their pre-publication results will be protected from disclosure to other community members. Thus, we must protect access to community user data between users in community accounts.
- Restricting Access: Community accounts are intended for use by members of an identified scientific community for identified scientific goals. They should not provide unfettered access to TG resources to anyone on the Internet. In particular, we must limit the ability for community accounts to be used for malicious purposes, by both limiting who has access (i.e., only valid community members) and what they have access to (i.e., a controlled set of scientific applications).
The security issues for community credentials and community accounts are closely related. Science Gateways and RPs must work together to provide appropriate access to TG resources. For portal gateways, the portal can control directly how the community credential is used; for desktop gateways, restricting access via the gateway is more difficult, because a credential is delivered to the user's desktop. On the RP side, restricted community accounts are an essential security measure for restricting access. See Science Gateway Credential Management for more discussion of these issues.
Differences in Textual Representations
A distinguished name in a certificate is actually a binary-encoded set of attribute and value pairs, where the attribute is an object identifier (OID) and the value is a string. For example, in the following DN:
/C=US/O=National Center for Supercomputing Applications/CN=Von S. Welch
In "C=US", "C" is the string form of an OID representing Country, and "US" is the value encoded as a string. When the DN is converted to a string, the OID is mapped to the string "C".
Over the years, the specifications for mapping OIDs to strings have changed, and the implementations in the OpenSSL software on which the Globus Toolkit and other TG software are built have followed these changing specifications. Unfortunately, this means that different versions of the software produce different strings when converting the same DN to a string. This can cause problems when trying to match the DN in the grid-mapfile.
Globus Toolkit version 4.0.0 and later performs some internal conversion when checking the grid-mapfile to mask these changes in OpenSSL. Specifically, as documented in :
- If DN has "/e=" or "/E=", it is replaced with "/emailAddress="
- If DN has "/Email=", it is replaced with "/emailAddress=". The comparison is NOT case sensitive.
- If DN has "/uid=", it is replaced with "/userid=". The comparison is NOT case sensitive.
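The three rules above can be sketched as a small normalization function. This is an illustration of the listed rules, not the Globus Toolkit code: the function name `canonicalize_dn` and the example DNs are hypothetical, and the sketch assumes the "=" is kept in each replacement so the result remains a well-formed DN string.

```python
import re

# "/e=", "/E=" and "/Email=" (any case) become "/emailAddress=".
_EMAIL = re.compile(r"/(?:e|email)=", re.IGNORECASE)
# "/uid=" (any case) becomes "/userid=".
_UID = re.compile(r"/uid=", re.IGNORECASE)


def canonicalize_dn(dn: str) -> str:
    """Rewrite old-style attribute names to the newer string forms."""
    dn = _EMAIL.sub("/emailAddress=", dn)
    return _UID.sub("/userid=", dn)


old_style = "/O=Example Org/CN=Jane Doe/Email=jane@example.org"
print(canonicalize_dn(old_style))
print(canonicalize_dn("/O=Example Org/UID=jdoe"))
```

Applying the function to an already canonical DN leaves it unchanged, so both sides of a comparison can be normalized before matching.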
Now that CTSS 3 is deployed (with Globus Toolkit 4.0) and CTSS 2 (with Globus Toolkit 2.4.3) is no longer supported, it should no longer be necessary to maintain old-style DNs in TeraGrid grid-mapfiles. They should be purged from the TGCDB.
Some accounts exist for administrative purposes and should not be accessible via grid credentials. E.g. "globus" and "condor" are accounts for managing software installation. Sites may, as part of their normal account creation process, automatically generate DNs for these accounts and then propagate them to the TGCDB, which then propagates them back out.
AMIE and gx-map Overlap
There is some overlap in functionality between AMIE and gx-map. Does TeraGrid have an official recommendation on when sites should use one versus the other?
Validating a TeraGrid Grid-mapfile
How does one tell when a grid-mapfile is "correct"? Today one can find problems and tell that a grid-mapfile is incorrect, but is there a test for correctness? Expanding on this, how can we validate that a TG resource has a correct grid-mapfile? Should this be an Inca test?
- Notes from Eric Roberts on DN Propagation and Management seem to indicate he had tests for incorrectness at least.
There is an impression at least that DN propagation to all sites is not working in a timely manner. What is the desired timeframe for propagation and what are the issues preventing TeraGrid from achieving this propagation?
One issue made apparent by discussions on the accounts-wg email list is that when automated grid-mapfile propagation fails (i.e., gx-request fails), recovery becomes a manual process: email is sent to an administrator and no automated follow-up occurs. This makes it very easy for propagation failures to "fall through the cracks".
The authors suggest taking a number of steps to address the challenges laid out in the previous section. Some of these actions are decisions to constrain the contents and the allowed usage of the grid-mapfile in order to simplify maintenance, while others are the development of technical procedures to ensure the grid-mapfiles at all RP resources are being correctly maintained.
Steps to Simplify Grid-mapfile Maintenance
- Purge old-style DNs from the TGCDB and grid-mapfiles.
- DNs with "/e=" or "/E=" or "/Email=" or "/USERID=" have been maintained in TG grid-mapfiles for backward compatibility with older Globus Toolkit versions, but now that supported CTSS software is based on Globus Toolkit 4, this backward compatibility is no longer required. Purging these old-style DNs will simplify grid-mapfile management.
- Disallow the use of Reverse Mapping
- The mapping of login names to DNs requires grid-mapfiles to be ordered consistently, which is very difficult to maintain. GPFS-WAN currently performs reverse mapping with no plans to change. By disallowing any reverse mappings, TeraGrid eases the grid-mapfile maintenance burden by removing the dependency on the ordering of entries in the grid-mapfile.
- Specify that a DN shall only map to a single user ID.
- Create policy such that each DN only appears once in each grid-mapfile. This means defining policies such as the following:
- A DN can only be mapped to a single user.
- If a user has multiple user IDs on an RP resource, the DN only maps to one of those user IDs.
- If a site has a local mapping for a DN that differs from the TGCDB mapping, the conflict must be resolved outside of the grid-mapfile.
- All mappings for TeraGrid users shall be maintained in the TGCDB.
- If a site creates or is aware of a DN mapping for a TeraGrid user, it should push that mapping (via AMIE) to the TGCDB rather than maintaining it locally and merging it into the grid-mapfile. This gives the TGCDB a complete view of all the mappings that should be in place for its users and allows for validation as discussed in the next section.
- Enable DN add/delete via the TGUP as planned.
- Since gx-map is not a required CTSS component, the TGUP is the only automated means for DN management available to all TG users.
Steps to Validate Grid-mapfile Maintenance
- Implement routine, automated DN consistency checks in the TGCDB.
- A DN should be associated with only one user in the TGCDB.
- Only community DNs (with CN containing "Community User") should be mapped to community accounts and community DNs should only be mapped to community accounts.
- Implement automated grid-mapfile validation across RP sites.
- Inconsistent grid-mapfiles cause user confusion and security problems. TG needs a mechanism to detect inconsistencies so they are promptly discovered and fixed. A grid-mapfile Inca test should be developed and deployed to perform the following verifications daily and report problems:
- For each account on the TG resource registered in the TGCDB, the DN mapping must match the DNs for that person in the TGCDB.
- No DN should be mapped to multiple accounts on a TG resource.
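The two verifications listed above can be sketched in Python as follows. This is not the Inca reporter: the function name `validate_grid_mapfile`, the problem labels, and all DNs and logins in the example are hypothetical, and the TGCDB is represented simply as a dictionary from local login name to the DNs expected for that account.

```python
from typing import Dict, List, Set, Tuple

Problem = Tuple[str, str, List[str]]


def validate_grid_mapfile(gridmap: Dict[str, List[str]],
                          tgcdb: Dict[str, Set[str]]) -> List[Problem]:
    """Compare a parsed grid-mapfile (DN -> logins) with TGCDB data
    (login -> expected DNs) and return a list of problems."""
    problems: List[Problem] = []

    # No DN should be mapped to multiple accounts on the resource.
    for dn, logins in gridmap.items():
        unique = sorted(set(logins))
        if len(unique) > 1:
            problems.append(("multiply-mapped", dn, unique))

    # Invert the grid-mapfile: login -> DNs mapped locally.
    local: Dict[str, Set[str]] = {}
    for dn, logins in gridmap.items():
        for login in logins:
            local.setdefault(login, set()).add(dn)

    # For each account known to the TGCDB, local DNs must match TGCDB DNs.
    for login, expected in tgcdb.items():
        actual = local.get(login, set())
        for dn in sorted(expected - actual):
            problems.append(("missing", dn, [login]))
        for dn in sorted(actual - expected):
            problems.append(("unexpected", dn, [login]))
    return problems


DN_A, DN_B = "/O=Example Org/CN=User A", "/O=Example Org/CN=User B"
DN_C, DN_D = "/O=Example Org/CN=User C", "/O=Example Org/CN=User D"
gridmap = {DN_A: ["alice"], DN_B: ["alice", "bob"], DN_C: ["bob"]}
tgcdb = {"alice": {DN_A, DN_B}, "bob": {DN_C, DN_D}}
for problem in validate_grid_mapfile(gridmap, tgcdb):
    print(problem)
```

Because propagation is delayed, a real check would probably need to tolerate "missing" results that disappear on a later run.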
Distillation of Above into Inca Tests
Viewable at Inca Status Pages
This Inca reporter checks only the Distinguished Names (DNs) in the TeraGrid Central Database (TGCDB) for any of the following problems:
- A community user DN mapping to a non-community username.
- A non-community DN mapping to a community username.
- A DN mapping to multiple non-community person_ids.
- A DN that does not conform to the GT4 canonicalization standard: /E or /EMAIL (case insensitive) replaced with /emailAddress and /UID replaced with /userid
- A subset of error-OldDN's where the corresponding, canonicalized form of the DN is not in the TGCDB.
- A DN mapping to multiple community person_ids.
- A DN that indicates a proxy certificate.
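Some of the checks above can be sketched in Python. This is not the Inca reporter's code: the function names, the finding labels, the row layout of (DN, person_id, whether the account is a community account), and the example DNs are all assumptions made for illustration. Only the "Community User" marker in the CN comes from the convention described on this page.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

COMMUNITY_MARKER = "Community User"

Row = Tuple[str, int, bool]  # (dn, person_id, account_is_community)


def is_community_dn(dn: str) -> bool:
    """True when any CN component of the DN contains the community marker."""
    return any(
        part.startswith("CN=") and COMMUNITY_MARKER in part
        for part in dn.split("/")
    )


def check_rows(rows: Iterable[Row]) -> List[Tuple[str, str]]:
    """Return (finding, dn) pairs for the community and uniqueness checks."""
    findings: List[Tuple[str, str]] = []
    people_by_dn: Dict[str, Set[int]] = defaultdict(set)
    for dn, person_id, account_is_community in rows:
        people_by_dn[dn].add(person_id)
        if is_community_dn(dn) and not account_is_community:
            findings.append(("community DN on non-community account", dn))
        elif account_is_community and not is_community_dn(dn):
            findings.append(("non-community DN on community account", dn))
    for dn, people in people_by_dn.items():
        if len(people) > 1:
            findings.append(("DN mapped to multiple person_ids", dn))
    return findings


COMMUNITY_DN = "/O=Example Org/CN=Community User"
PERSONAL_DN = "/O=Example Org/CN=Jane Doe"
rows = [
    (COMMUNITY_DN, 1, True),
    (COMMUNITY_DN, 2, False),
    (PERSONAL_DN, 3, True),
]
for finding in check_rows(rows):
    print(finding)
```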
This Inca reporter finds inconsistencies between a system's local grid-mapfile and the TeraGrid Central DataBase (TGCDB). The grid-mapfile contains a mapping of Distinguished Names (DNs) to usernames (Unix) on the local resource.
Since the grid-mapfile is in constant flux and there exists a propagation delay for changes, some reported errors may be transient and resolve without intervention, but others may be persistent and represent real security or usability issues. When looking at the additional, descriptive error messages on the Inca Status Pages, results may be truncated, but they can all be viewed if the reporter is executed from the command line. The different error codes that can occur are listed and described as follows.
- A user is listed in the TGCDB with an active allocation for the local resource, and the user is in the resource's passwd file, but no DN->username mappings exist in the resource's grid-mapfile.
- Without any mappings, they cannot use X.509 certificates to log in to the resource via GSI-enabled applications, like GSI-OpenSSH.
- This can be a transient error resulting from delays in the site accounting process, but it may also be a persistent usability issue for users who are consistently without DN->username mappings. The resource will need to add all DNs listed in the TGCDB to the grid-mapfile.
- A user is listed in the TGCDB with an active allocation for the local resource, the user is in the resource's passwd file, and they have some entries in the grid-mapfile, but one of the user's DNs as listed in the TGCDB is not among them.
- Without coherent mappings, the user may experience problems logging into the resource with a particular X.509 certificate, but not with others.
- The underlying cause of this error may not originate at the site for which it shows up, but the list of user DNs in the TGCDB should be trusted as accurate and any missing DNs should be mapped into the grid-mapfile. Other times, this error results from a difference in textual representation of the DN, which is explained in the error below.
- This is a subset of all the errorPartiallyMapped's and results from when the DN string in the grid-mapfile has been normalized according to Globus Toolkit (GT) 4.0 standard, but the DN string in the TGCDB conforms to older GT standards.
- GT4 canonicalization replaces /E or /EMAIL (case insensitive) with /emailAddress and /UID (case insensitive) with /userid.
- The old-style DN in the TGCDB can be purged and replaced with the corresponding GT4 canonicalized version, if necessary. Since globus canonicalizes internally, there should be no usability issues from this error.
- Occurs when a DN->username mapping has been entered into a resource's grid-mapfile, but was never propagated back to the TGCDB and back out to other TeraGrid sites.
- If this propagation does not happen, the user will be able to log in to the resource, but the Single Sign-On (SSO) capabilities of the TeraGrid are defeated and they will not be able to use their personal X.509 Certificate across TeraGrid sites.
- A DN in the resource's grid-mapfile maps to more than a single username.
- This may represent a mistake in the accounting process: a single person has been given multiple accounts at a site. In that case, the user may log in to the resource via a GSI-enabled application but be logged in as a different user than expected. There is usually no reason for a user to have multiple accounts at a site, and they will need to be consolidated into one account.
- This error may also represent a security issue: it can indicate users that are sharing accounts.
- Currently, the first username listed will be used as the default for GSI-enabled applications, but a given DN in the grid-mapfile must map to a single local username to avoid the two above issues.
- Occurs when a DN->username mapping appears more than once in the resource's grid-mapfile. The redundancies can be deleted.
- A community DN is mapped to a username that is not associated with a community.
- Both the local passwd file and the TGCDB are checked to see if the given username is associated with a real name that matches "Community User", as is the standard. If it does not, then either the standard naming convention for community users needs to be applied (CN="Community User" and the passwd entry or last_name of the people table in the TGCDB is "Community User") or a personal username needs to be unmapped because it poses a security issue.
- A poorly formatted line exists in the resource's grid-mapfile. A typical line is a quoted DN string followed by a space and the local username, for example:
- "/C=US/O=National Center for Supercomputing Applications/CN=Jonathan Siwek" jsiwek
- See here for details on how the grid-mapfile is parsed.