In Figure 1-4, the directory entry outlined in the dashed square has an RDN of cn=gerald carter. Note that the attribute name as well as the value are included in the RDN. The DN for this node would be cn=gerald carter,ou=people, dc=plainjoe,dc=org.
Functional model
The functional model is the LDAP protocol itself. This protocol provides the means for accessing the data in the directory tree. Access is implemented by authentication operations (bindings), query operations (searches and reads), and update operations (writes).
Security model
The security model provides a mechanism for clients to prove their identity (authentication) and for the server to control an authenticated client's access to data (authorization). LDAPv3 provides several authentication methods not available in previous protocol versions. Some features, such as access control lists, have not been standardized yet, leaving vendors to their own devices.
Figure 1-4. Example LDAP directory tree
At this high level, LDAP is relatively simple. It is a protocol for building highly distributed directories. In the next chapter, we will examine certain LDAP concepts such as schemas, referrals, and replication in much more depth.
Chapter 2. LDAPv3 Overview
Chapter 1 should have helped you understand the characteristics of a directory in general, and an LDAP directory in particular. If you still feel a little uncomfortable about LDAP, relax. This chapter is designed to flesh out some of the details that we glossed over. Your immediate goal should be to understand the basic building blocks of any LDAPv3 directory server. In the next chapter, we will start building an LDAP directory.
LDIF
Most system administrators prefer to use plain-text files for server configuration information, as opposed to some binary store of bits. It is more comfortable to deal with data in vi, Emacs, or notepad than to dig though raw bits and bytes. Therefore, it seems fitting to begin an exploration of LDAP internals with a discussion of representing directory data in text form.
The LDAP Interchange Format (LDIF), defined in RFC 2849, is a standard text file format for storing LDAP configuration information and directory contents. In its most basic form, an LDIF file is:
A collection of entries separated from each other by blank lines
A mapping of attribute names to values
A collection of directives that instruct the parser how to process the information
The first two characteristics provide exactly what is needed to describe the contents of an LDAP directory. We'll return to the third characteristic when we discuss modifying the information in the directory in Chapter 4.
LDIF files are often used to import new data into your directory or make changes to existing data. The data in the LDIF file must obey the schema rules of your LDAP directory. You can think of the schema as a data definition for your directory. Every item that is added or changed in the directory is checked against the schema for correctness. A schema violation occurs if the data does not correspond to the existing rules.
Figure 2-1 shows a simple directory information tree. Each entry in the directory is represented by an entry in the LDIF file. Let's begin with the topmost entry in the tree labeled with the distinguished name (DN) dc=plainjoe,dc=org:
# LDIF listing for the entry dn: dc=plainjoe,dc=org
dn: dc=plainjoe,dc=org
objectClass: domain
dc: plainjoe
Figure 2-1. An LDAP directory tree
We can make a few observations about LDIF syntax on the basis of this short listing:
Comments in an LDIF file begin with a pound character (#) at position one and continue to the end of the current line.
Attributes are listed on the lefthand side of the colon (:), and values are presented on the righthand side. The colon character is separated from the value by a space.
The dn attribute uniquely identifies the DN of the entry.
Distinguished Names and Relative Distinguished Names
It is important to realize that the full DN of an entry does not actually need to be stored as an attribute within that entry, even though this seems to be implied by the previous LDIF extract; it can be generated on the fly as needed. This is analogous to how a filesystem is organized. A file or directory does not store the absolute path to itself from the root of the filesystem. Think how hard it would be to move files if this were true.
If the DN is like the absolute path between the root of a filesystem and a file, a relative distinguished name (RDN) is like a filename. We've already seen that a DN is formed by stringing together the RDNs of every entity from the element in question to the root of the directory tree. In this sense, an RDN works similarly to a filename. However, unlike a filename, an RDN can be made up of multiple attributes. This is similar to a compound index in a relational database system in which two or more fields are used in combination to generate a unique index key.
While a multivalued RDN is not shown in our example, it is not hard to imagine. Suppose that there are two employees named Jane Smith in your company: one in the Sales Department and one in the Engineering Department. Now suppose the entries for these employees have a common parent. Neither the common name (cn) nor the organizational unit (ou) attribute is unique in its own right. However, both can be used in combination to generate a unique RDN. This would look like:
# Example of two entries with a multivalued RDN
dn: cn=Jane Smith+ou=Sales,dc=plainjoe,dc=org
cn: Jane Smith
ou: Sales
<...remainder of entry deleted...>
dn: cn=Jane Smith+ou=Engineering,dc=plainjoe,dc=org
cn: Jane Smith
ou: Engineering
<...remainder of entry deleted...>
For both of these entries, the first component of the DN is an RDN composed of two values: cn=Jane Smith+ou=Sales and cn=Jane Smith+ou=Engineering.
In the multivalued RDN, the plus character (+) separates the two attribute values used to form the RDN. What if one of the attributes used in the RDN contained the + character? To prevent the + character from being interpreted as a special character, we need to escape it using a backslash (). The other special characters that require a backslash-escape if used within an attribute value are:
A space or pound (#) character occurring at the beginning of the string
A space occurring at the end of the string
A comma (,), a plus character (+), a double quote ("), a backslash (), angle brackets (< or >), or a semicolon (;)
Although multivalued RDNs have their place, using them excessively can become confusing, and can often be avoided by a better namespace design. In the previous example, it is obvious that the multivalued RDN could be avoided by creating different organizationalUnits (ou) in the directory for both Sales and Engineering, as illustrated in Figure 2-2. Using this strategy, the DN for the first entry would be cn=Jane Smith,ou=Sales,dc=plainjoe,dc=org. This design does not entirely eliminate the need for multivalued RDNs; we could still have two people named Jane Smith in the Engineering organization. But that will occur much less frequently than having two Jane Smiths in the company. Look for ways to organize namespaces to avoid multivalued RDNs as much as is possible and logical.
Figure 2-2. A namespace that represents Jane Smith with a unique, multivalued RDN
One final note about DNs. RFC 2253 defines a method of unambiguously representing a DN using a UTF-8 string representation. This normalization process boils down to:
Removing all nonescaped whitespace surrounding the equal sign (=) in each RDN
Making sure the appropriate characters are escaped
Removing all nonescaped spaces surrounding the multi-value RDN join character (+)
Removing all nonescaped trailing spaces on RDNs
Therefore, the normalized version of:
cn=gerald carter + ou=sales, dc=plainjoe ,dc=org
would be:
cn=gerald carter+ou=sales,dc=plainjoe,dc=org
Without getti
ng ahead of ourselves, I should mention that the string representation of a distinguished name is normally case-preserving, and the logic used to determine if two DNs are equal is usually a case-insensitive match. Therefore:
cn=Gerald Carter,ou=People,dc=plainjoe,dc=org
would be equivalent to:
cn=gerald carter,ou=people,dc=plainjoe,dc=org
However, this case-preserving, case-insensitive behavior is based upon the syntax and matching rules (see Section 2.2 later in this chapter) of the attribute type used in each relative component of the complete DN. So while DNs are often case-insensitive, do not assume that they will always be so.
Subsequent examples use the normalized versions of all DNs to prevent confusion, although I may tend to be lax on capitalization.
Back to Our Regularly Scheduled Program . . .
Going back to Figure 2-1, your next question is probably, "Where did the extra lines in the LDIF listing come from?" After all, the top entry in Figure 2-1 is simply dc=plainjoe,dc=org. But the LDIF lines corresponding to this entry also contain an objectClass: line and a dc: line. These extra lines provide additional information stored inside each entry. The next few sections answer the following questions:
What is an attribute?
What does the value of the objectClass attribute mean?
What is the dc attribute?
If dc=plainjoe,dc=org is the top entry in the directory, where is the entry for dc=org?
What Is an Attribute?
The concepts of attribute types and attribute syntax were mentioned briefly in the previous chapter. Attribute types and the associated syntax rules are similar to variable and data type declarations found in many programming languages. The comparison is not that big of a stretch. Attributes are used to hold values. Variables in programs perform a similar task—they store information.
When a variable is declared in a program, it is defined to be of a certain data type. This data type specifies what type of information can be stored in the variable, along with certain other rules, such as how to compare the variable's value to the data stored in another variable of the same type. For example, declaring a 16-bit integer variable in a program and then assigning it a value of 1,000,000 would make no sense (the maximum value represented by a signed 16-bit integer is 32,767). The data type of a 16-bit integer determines what data can be stored. The data type also determines how values of like type can be compared. Is 3 < 5? Yes, of course it is. How do you know? Because there exists a set of rules for comparing integers with other integers. The syntax of LDAP attribute types performs a similar function as the data type in these examples.
Unlike variables, however, LDAP attributes can be multivalued. Most procedural programming languages today enforce "store and replace" semantics of variable assignment, and so my analogy falls apart. That is, when you assign a new value to a variable, its old value is replaced. As you'll see, this isn't true for LDAP; assigning a new value to an attribute adds the value to the list of values the attribute already has. Here's the LDIF listing for the ou=devices,dc=plainjoe,dc=org entry from Figure 2-1; it demonstrates the purpose of multivalued attributes:
# LDIF listing for dn: ou=devices,dc=plainjoe,dc=org
dn: ou=devices,dc=plainjoe,dc=org
objectclass: organizationalUnit
ou: devices
telephoneNumber: +1 256 555-5446
telephoneNumber: +1 256 555-5447
description: Container for all network enabled
devices existing within the plainjoe.org domain
* * *
Tip
Note that the description attribute spans two lines. Line continuation in LDIF is implemented by leaving exactly one space at the beginning of a line. LDIF does not require a backslash () to continue one line to the next, as is common in many Unix configuration files.
* * *
The LDIF file lists two values for the telephoneNumber attribute. In real life, it's common for an entity to be reachable via two or more phone numbers. Be aware that some attributes can contain only a single value at any given time. Whether an attribute is single- or multivalued depends on the attribute's definition in the server's schema. Examples of single-valued attributes include an entry's country (c), displayable name (displayName), or a user's Unix numeric ID (uidNumber).
Attribute Syntax
An attribute type's definition lays the groundwork for answers to questions such as, "What type of values can be stored in this attribute?", "Can these two values be compared?", and, if so, "How should the comparison take place?"
Continuing with our telephoneNumber example, suppose you search the directory for the person who owns the phone number 555-5446. This may seem easy when you first think about it. However, RFC 2252 explains that a telephone number can contain characters other than digits (0-9) and a hyphen (-). A telephone number can include:
a-z
A-Z
0-9
Various punctuation characters such as commas, periods, parentheses, hyphens, colons, question marks, and spaces
555.5446 or 555 5446 are also correct matches to 555-5446. What about the area code? Should we also use it in a comparison of phone numbers?
Attribute type definitions include matching rules that tell an LDAP server how to make comparisons—which, as we've seen, isn't as easy as it seems. In Figure 2-3, taken from RFC 2256, the telephoneNumber attribute has two associated matching rules. The telephoneNumberMatch rule is used for equality comparisons. While RFC 2552 defines telephoneNumberMatch as a whitespace-insensitive comparison only, this rule is often implemented to be case-insensitive as well. The telephoneNumberSubstringsMatch rule is used for partial telephone number matches—for example, when the search criteria includes wildcards, such as "555*5446".
Figure 2-3. telephoneNumber attribute type definition
The SYNTAX keyword specifies the object identifier (OID) of the encoding rules used for storing and transmitting values of the attribute type. The number enclosed by curly braces ({ }) specifies the minimum recommended maximum length of the attribute's value that a server should support.
* * *
Object Identifiers (OIDs)
LDAPv3 uses OIDs such as those used in SNMP MIBs. SNMP OIDs are allocated by the Internet Assigned Numbers Authority (IANA) under the mgmt(2) branch of the number space displayed in Figure 2-4. Newly created LDAPv3 OIDs generally fall under the private(4), enterprise(1) branch of the tree. However, it is also common to see numbers under the joint-ISO-ccitt(2) branch of the number tree. OIDs beginning with 2.5.4 come from the user attribute specifications defined by X.500.
An OID is a string of dotted numbers that uniquely identifies items such as attributes, syntaxes, object classes, and extended controls. The allocation of enterprise numbers by IANA is similar to the central distribution of IP address blocks; once you have been assigned an enterprise number by the IANA, you can create your own OIDs underneath that number. Unlike the IP address space, there is no limit to the number of OIDs you can create because there's no limit to the length of an OID.
For example, assume that you were issued the enterprise number 55555. Therefore, all OIDs belonging to your branch of the OID tree would begin with 1.3.6.1.4.1.55555. How this subtree is divided is at your discretion. You may choose to allocate 1.3.6.1.4.1.55555.1 to department A and 1.3.6.1.4.1.55555.2 to department B. Each allocated branch of your OID is referred to as an arc. The local administrators of these departments could then subdivide their arcs according to the needs of their network.
OID assignments must be unique worldwide. If you ever need to make custom schema files for your directory (a common practice), go to http://www.iana.org/cgi-bin/enterprise.pl and request a private enterprise number. The form is short and normally takes one to two weeks to be processed. Once you have your own enterprise number, you can create your own OIDs without worrying about conflicts with OIDs that have already been assigned. RFC 3383 describes some best practices for registering new LDAP values with
IANA.
* * *
Figure 2-4. Private enterprise OID number space
What Does the Value of the objectClass Attribute Mean?
All entries in an LDAP directory must have an objectClass attribute, and this attribute must have at least one value. Multiple values for the objectClass attribute are both possible and common given certain requirements, as you shall soon see. Each objectClass value acts as a template for the data to be stored in an entry. It defines a set of attributes that must be present in the entry and a set of optional attributes that may or may not be present.
Let's go back and reexamine the LDIF representation of the ou=devices,dc=plainjoe,dc=org entry:
# LDIF listing for dn: ou=devices,dc=plainjoe,dc=org
dn: ou=devices,dc=plainjoe,dc=org
objectclass: organizationalUnit
ou: devices
telephoneNumber: +1 256 555-5446
telephoneNumber: +1 256 555-5447
description: Container for all network enabled
devices existing within the plainjoe.org domain
In this case, the entry's objectClass is an organizationalUnit. (The schema definition for this is illustrated by two different representations in Figure 2-5.) The listing on the right shows the actual definition of the objectClass from RFC 2256; the box on the left summarizes the required and optional attributes.
LDAP System Administration Page 3