LDAP

December 23, 2008 by radhakrishnak

The Lightweight Directory Access Protocol, or LDAP is an application protocol for querying and modifying directory services running over TCP/IP.[1]

A directory is a set of objects with similar attributes organised in a logical and hierarchical manner. The most common example is the telephone directory, which consists of a series of names (either of persons or organizations) organized alphabetically, with each name having an address and phone number attached.

An LDAP directory tree often reflects various political, geographic, and/or organizational boundaries, depending on the model chosen. LDAP deployments today tend to use Domain name system (DNS) names for structuring the topmost levels of the hierarchy. Deeper inside the directory might appear entries representing people, organizational units, printers, documents, groups of people or anything else that represents a given tree entry (or multiple entries).

Origin and influences

Telecommunication companies introduced the concept of directory services to information technology and computer networking, as their understanding of directory requirements was well-developed after some 70 years of producing and managing telephone directories. The culmination of this input was the comprehensive X.500 specification[2], a suite of protocols produced by the International Telecommunication Union (ITU) in the 1980s.

X.500 directory services were traditionally accessed via the X.500 Directory Access Protocol (DAP), which required the Open Systems Interconnection (OSI) protocol stack. LDAP was originally intended to be a lightweight alternative protocol for accessing X.500 directory services through the simpler (and now widespread) TCP/IP protocol stack. This model of directory access was borrowed from the DIXIE and Directory Assistance Service protocols.

Standalone LDAP directory servers soon followed, as did directory servers supporting both DAP and LDAP. The latter has become popular in enterprises, as LDAP removed any need to deploy an OSI network. Today, X.500 directory protocols including DAP can also be used directly over TCP/IP.

The protocol was originally created by Tim Howes of the University of Michigan, Steve Kille of Isode Limited, and Wengyik Yeong of Performance Systems International, circa 1993. Further development has been done via the Internet Engineering Task Force.

In the early engineering stages of LDAP, it was known as Lightweight Directory Browsing Protocol, or LDBP. It was renamed as the scope of the protocol was expanded to include not only directory browsing and searching functions, but also directory update functions.

LDAP has influenced subsequent Internet protocols, including later versions of X.500, XML Enabled Directory (XED), Directory Service Markup Language (DSML), Service Provisioning Markup Language (SPML), and the Service Location Protocol (SLP)

Protocol overview

A client starts an LDAP session by connecting to an LDAP server, by default on TCP port 389. The client then sends an operation request to the server, and the server sends responses in turn. With some exceptions, the client need not wait for a response before sending the next request, and the server may send the responses in any order.

The client may request the following operations:

  • Start TLS — use the LDAPv3 Transport Layer Security (TLS) extension for a secure connection
  • Bind — authenticate and specify LDAP protocol version
  • Search — search for and/or retrieve directory entries
  • Compare — test if a named entry contains a given attribute value
  • Add a new entry
  • Delete an entry
  • Modify an entry
  • Modify Distinguished Name (DN) — move or rename an entry
  • Abandon — abort a previous request
  • Extended Operation — generic operation used to define other operations
  • Unbind — close the connection (not the inverse of Bind)

In addition the server may send “Unsolicited Notifications” that are not responses to any request, e.g. before it times out a connection.

A common alternate method of securing LDAP communication is using an SSL tunnel. This is denoted in LDAP URLs by using the URL scheme “ldaps”. The default port for LDAP over SSL is 636. The use of LDAP over SSL was common in LDAP Version 2 (LDAPv2) but it was never standardized in any formal specification. This usage has been deprecated along with LDAPv2, which was officially retired in 2003.

LDAP is defined in terms of ASN.1, and protocol messages are encoded in the binary format BER. It uses textual representations for a number of ASN.1 fields/types, however.

Directory structure

The protocol accesses LDAP directories, which follow the 1993 edition of the X.500 model:

  • A directory is a tree of directory entries.
  • An entry consists of a set of attributes.
  • An attribute has a name (an attribute type or attribute description) and one or more values. The attributes are defined in a schema (see below).
  • Each entry has a unique identifier: its Distinguished Name (DN). This consists of its Relative Distinguished Name (RDN) constructed from some attribute(s) in the entry, followed by the parent entry’s DN. Think of the DN as a full filename and the RDN as a relative filename in a folder.

Be aware that a DN may change over the lifetime of the entry, for instance, when entries are moved within a tree. To reliably and unambiguously identify entries, a UUID might be provided in the set of the entry’s operational attributes.

An entry can look like this when represented in LDAP Data Interchange Format (LDIF) (LDAP itself is a binary protocol):

 dn: cn=John Doe,dc=example,dc=com
 cn: John Doe
 givenName: John
 sn: Doe
 telephoneNumber: +1 888 555 6789
 telephoneNumber: +1 888 555 1232
 mail: john@example.com
 manager: cn=Barbara Doe,dc=example,dc=com
 objectClass: inetOrgPerson
 objectClass: organizationalPerson
 objectClass: person
 objectClass: top

dn is the name of the entry; it’s not an attribute nor part of the entry. “cn=John Doe” is the entry’s RDN, and “dc=example,dc=com” is the DN of the parent entry, where dc denotes Domain Component. The other lines show the attributes in the entry. Attribute names are typically mnemonic strings, like “cn” for common name, “dc” for domain component, “mail” for e-mail address and “sn” for surname.

A server holds a subtree starting from a specific entry, e.g. “dc=example,dc=com” and its children. Servers may also hold references to other servers, so an attempt to access “ou=department,dc=example,dc=com” could return a referral or continuation reference to a server which holds that part of the directory tree. The client can then contact the other server. Some servers also support chaining, which means the server contacts the other server and returns the results to the client.

LDAP rarely defines any ordering: The server may return the values of an attribute, the attributes in an entry, and the entries found by a search operation in any order. This follows from the formal definitions – an entry is defined as a set of attributes, and an attribute is a set of values, and sets need not be ordered.

Operations

The client gives each request a positive Message ID, and the server response has the same Message ID. The response includes a numeric result code which indicates success, some error condition or some other special cases. Before the response, the server may send other messages with other result data – for example each entry found by the Search operation is returned in such a message.

Expand discussion of referral responses to various operations, especially modify, for example where all modifies must be directed from replicas to a master directory.

StartTLS

The StartTLS operation establishes Transport Layer Security (the descendant of SSL) on the connection. That can provide data confidentiality (cannot be read by third parties) and/or data integrity protection (protect from tampering). During TLS negotiation the server sends its X.509 certificate to prove its identity. The client may also send a certificate to prove its identity. After doing so, the client may then use SASL/EXTERNAL to have this identity used in determining the identity used in making LDAP authorization decisions.

Servers also often support the non-standard “LDAPS” (“Secure LDAP”, commonly known as “LDAP over SSL”) protocol on a separate port, by default 636. LDAPS differs from LDAP in two ways: 1) upon connect, the client and server establish TLS before any LDAP messages are transferred (without a Start TLS operation) and 2) the LDAPS connection must be closed upon TLS closure.

LDAPS was primarily used with LDAPv2, because the StartTLS operation had not yet been defined. The use of LDAPS is deprecated, and modern software should only use StartTLS.

Bind (authenticate)

The Bind operation authenticates the client to the server. Simple Bind can send the user’s DN and password in plaintext, so the connection should be protected using Transport Layer Security (TLS). The server typically checks the password against the userPassword attribute in the named entry. Anonymous Bind (with empty DN and password) resets the connection to anonymous state. SASL (Simple Authentication and Security Layer) Bind provides authentication services through a wide range of mechanisms, e.g. Kerberos or the client certificate sent with TLS.

Bind also sets the LDAP protocol version. Normally clients should use LDAPv3, which is the default in the protocol but not always in LDAP libraries.

Bind had to be the first operation in a session in LDAPv2, but is not required in LDAPv3 (the current LDAP version).

Search and Compare

The Search operation is used to both search for and read entries. Its parameters are:

baseObject
The DN (Distinguished Name) of the entry at which to start the search,
scope
BaseObject (search just the named entry, typically used to read one entry), singleLevel (entries immediately below the base DN), or wholeSubtree (the entire subtree starting at the base DN).
filter
How to examine each entry in the scope. E.g. (&(objectClass=person)(|(givenName=John)(mail=john*))) – search for persons who either have given name John or an e-mail address starting with john.
derefAliases
Whether and how to follow alias entries (entries which refer to other entries),
attributes
Which attributes to return in result entries.
sizeLimit, timeLimit
Max number of entries, and max search time.
typesOnly
Return attribute types only, not attribute values.

The server returns the matching entries and maybe continuation references (in any order), followed by the final result with the result code.

The Compare operation takes a DN, an attribute name and an attribute value, and checks if the named entry contains that attribute with that value.

Update Data

Add, Delete, and Modify DN – all require the DN of the entry that is to be changed.

Modify takes a list of attributes to modify and the modifications to each: Delete the attribute or some values, add new values, or replace the current values with the new ones.

Add operations also can have additional attributes and values for those attributes.

Modify DN (move/rename entry) takes the new RDN (Relative Distinguished Name), optionally the new parent’s DN, and a flag which says whether to delete the value(s) in the entry which match the old RDN. The server may support renaming of entire directory subtrees.

An update operation is atomic: Other operations will see either the new entry or the old one. On the other hand, LDAP does not define transactions of multiple operations: If you read an entry and then modify it, another client may have updated the entry in the mean time. Servers may implement extensions which support this, however.

Extended operations

The Extended Operation is a generic LDAP operation which can be used to define new operations. Examples include the Cancel, Password Modify and Start TLS operations.

Abandon

The Abandon operation requests that the server aborts an operation named by a message ID. The server need not honor the request. Unfortunately, neither Abandon nor a successfully abandoned operation send a response. A similar Cancel extended operation has therefore been defined which does send responses, but not all implementations support this.

Unbind

The Unbind operation abandons any outstanding operations and closes the connection. It has no response. The name is of historical origin: It is not the opposite of the Bind operation.

Clients can abort a session by simply closing the connection, but they should use Unbind. Otherwise the server cannot tell the difference between a failed network connection (or a truncation attack) and a discourteous client.

Softlink and Hardlinks

December 23, 2008 by radhakrishnak

Hard links and Soft links

Links

As was mentioned in the section on file system structure, every file has a data structure (record) known as an i-node that stores information about the file, and the filename is simply used as a reference to that data structure. A link is simply a way to refer to the contents of a file. There are two types of links:

  • Hard links: a hard link is a pointer to the file’s i-node. For example, suppose that we have a file a-file.txt that contains the string “The file a-file.txt”:
    % cat a-file.txt
    The file a-file.txt
    %

    Now we use the ln command to create a link to a-file.txt called b-file.txt:

    % ls
    ./  ../  a-file.txt
    % ln a-file.txt b-file.txt
    % ls
    ./  ../  a-file.txt  b-file.txt

    Hard LinksThe two names a-file.txt and b-file.txt now refer to the same data:

    % cat b-file.txt
    The file a-file.txt
    %

    If we modify the contents of file b-file.txt, then we also modify the contents of file a-file.txt:

    % vi b-file.txt
    ...
    % cat b-file.txt
    The file a-file.txt has been modified.
    % cat a-file.txt
    The file a-file.txt has been modified.
    %

    and vice versa:

    % vi a-file.txt
    ...
    % cat a-file.txt
    The file a-file.txt has been modified again!
    % cat b-file.txt
    The file a-file.txt has been modified again!
    %
  • Soft links (symbolic links): a soft link, also called symbolic link, is a file that contains the name of another file. We can then access the contents of the other file through that name. That is, a symbolic link is like a pointer to the pointer to the file’s contents. For instance, supposed that in the previous example, we had used the -s option of the ln to create a soft link:
    % ln -s a-file.txt b-file.txt

    On disk, the file system would look like the following picture:

    Soft Links

But what are the differences between the two types of links, in practice? Let us look at an example that highlights these differences. The directory currently looks like this (let us assume that a-file.txt b-file.txt are both hard links to the same file):

% ls
./  ../  a-file.txt  b-file.txt

Let us first add another symbolic link using the -s option:

% ln -s a-file.txt Symbolicb-file.txt
% ls -F
./  ../  a-file.txt  b-file.txt  Symbolicb-file.txt@

A symbolic link, that ls -F displays with a @ symbol, has been added to the directory. Let us examine the contents of the file:

% cat Symbolicb-file.txt
The file a-file.txt has been modified again!

If we change the file Symbolicb-file.txt, then the file a-file.txt is also modified.

% vi Symbolicb-file.txt
...
% cat Symbolicb-file.txt
The file a-file.txt has been modified a third time!
% cat a-file.txt
The file a-file.txt has been modified a third time!
% cat b-file.txt
The file a-file.txt has been modified a third time!
%

If we remove the file a-file.txt, we can no longer access the data through the symbolic link Symbolicb-file.txt:

% ls -F
./  ../  a-file.txt  b-file.txt  Symbolicb-file.txt@
% rm a-file.txt
rm: remove `a-file.txt'? y
% ls -F
./  ../  b-file.txt  Symbolicb-file.txt@
% cat Symbolicb-file.txt
cat: Symbolicb-file.txt: No such file or directory

The link Symbolicb-file.txt contains the name a-file.txt, and there no longer is a file with that name. On the other hand, b-file.txt has its own pointer to the contents of the file we called a-file.txt, and hence we can still use it to access the data.

% cat b-file.txt
The file a-file.txt has been modified a third time!

Although it may seem like symbolic links are not particularly useful, hard links have their drawbacks. The most significant drawback is that hard links cannot be created to link a file from one file system to another file on another file system. A Unix file structure hierarchy can consist of several different file systems (possibly on several physical disks). Each file system maintains its own information regarding the internal structure of the system and the individual files on the system. Hard links only know this system-specific information, which make hard links unable to span file systems. Soft links, on the other hand, know the name of the file, which is more general, and are able to span file systems.

For a concrete analogy, suppose that our friend Joel User is a student at both UBC and SFU. Both universities assign him a student number. If he tries to use his UBC student number at SFU, he will not meet with any success. He will also fail if he tries to use his SFU student number at UBC. But if he uses his legal name, Joel User, he will probably be successful. The student numbers are system-specific (like hard links), while his legal name spans both of the systems (like soft links).

Here is an example that demonstrates a situation where a hard link cannot be used and a symbolic link is needed. Suppose that we try to create a hard link from the current working directory to the C header stdio.h.

% ln /usr/include/stdio.h stdio.h
ln: creating hard link `stdio.h' to `/usr/include/stdio.h': Invalid cross-device link
%

The ln command fails because stdio.h is stored on a different file system. If we want to create a link to it, we will have to use a symbolic link:

% ln -s /usr/include/stdio.h stdio.h
% ls -l
lrwxrwxrwx    1 a1a1 guest          20 Apr 20 11:58 stdio.h -> /usr/include/stdio.h
% ls
./  ../  stdio.h@
%

Now we can view the file stdio.h just as if it was located in the working directory. For example:

% cat stdio.h
/* Copyright (c) 1988 AT&T */
/* All Rights Reserved */ 

/* THIS IS UNPUBLISHED PROPRIETARY SOURCE CODE OF AT&T */
/* The copyright notice above does not evidence any */
/* actual or intended publication of such source code. */ 

/*
* User-visible pieces of the ANSI C standard I/O package.
*/ 

#ifndef _STDIO_H
#define _STDIO_H
...
%

The entire output of the cat command was not included to save space.

Note that the long listing (ls -l) of a soft link does not accurately reflect its associated permissions. To view the permissions of the file or directory that the symbolic link references, the -L options of the ls command can be used. For example:

% ln -s /usr/include/stdio.h stdio.h

% ls -l stdio.h
lrwxrwxrwx   1 a1a1     undergrad     20 May 10 15:13 stdio.h -> /usr/include/stdio.h

% ls -l /usr/include/stdio.h
-rw-r--r--   1 root     bin        11066 Jan  5  2000 /usr/include/stdio.h

% ls -lL stdio.h
-rw-r--r--   1 root     bin        11066 Jan  5  2000 stdio.h

Knowledge of links becomes doubly important when working as part of a software development team, and even more so when using a source code management tool such as RCS. The man page for the ln command contains more information on Unix links and the ln.

IMAP

December 23, 2008 by radhakrishnak

History

IMAP was designed by Mark Crispin in 1986 as a remote mailbox protocol, in contrast to the widely used POP, a protocol for retrieving the contents of a mailbox.[4]

Original IMAP

The original Interim Mail Access Protocol was implemented as a Xerox Lisp machine client and a TOPS-20 server.

No copies of the original interim protocol or its software exist; all known installations of the original protocol were updated to IMAP2. Although some of its commands and responses were similar to IMAP2, the interim protocol lacked command/response tagging and thus its syntax was incompatible with all other versions of IMAP.

IMAP2

The interim protocol was quickly replaced by the Interactive Mail Access Protocol (IMAP2), defined in RFC 1064 and later updated by RFC 1176. IMAP2 introduced command/response tagging and was the first publicly distributed version.

IMAP2bis

With the advent of MIME, IMAP2 was extended to support MIME body structures and add mailbox management functionality (create, delete, rename, message upload) that was absent in IMAP2. This experimental revision was called IMAP2bis; its specification was never published in non-draft form. Early versions of Pine were widely distributed with IMAP2bis support (Pine 4.00 and later supports IMAP4rev1).

IMAP4

An IMAP Working Group formed in the IETF in the early 1990s and took over responsibility for the IMAP2bis design. The IMAP WG decided to rename IMAP2bis to IMAP4 to avoid confusion with a competing IMAP3 proposal from another group that never got off the ground. The expansion of the IMAP acronym also changed to the Internet Message Access Protocol.

Some design flaws in the original IMAP4 (defined by RFC 1730) that came out in implementation experience led to its revision and replacement by IMAP4rev1 two years later. There were very few IMAP4 client or server implementations due to its short lifetime.

IMAP4rev1

The current version of IMAP since 1996, IMAP version 4 revision 1 (IMAP4rev1), is defined by RFC 3501 which revised the earlier RFC 2060.

IMAP4rev1 is backwards compatible with IMAP2 and IMAP2bis; and is largely backwards compatible with IMAP4. However, the older versions are either extinct or nearly so.

Unlike many older Internet protocols, IMAP4 natively supports encrypted login mechanisms. Plain-text transmission of passwords in IMAP4 is also possible. Because the encryption mechanism to be used must be agreed between the server and client, plain-text passwords are used in some combinations of clients and servers (typically Microsoft Windows clients and non-Windows servers). It is also possible to encrypt IMAP4 traffic using SSL, either by tunneling IMAP4 communications over SSL on port 993, or by issuing STARTTLS within an established IMAP4 session (see RFC 2595).

Advantages over POP3

Connected and disconnected modes of operation

When using POP3, clients typically connect to the e-mail server briefly, only as long as it takes to download new messages. When using IMAP4, clients often stay connected as long as the user interface is active and download message content on demand. For users with many or large messages, this IMAP4 usage pattern can result in faster response times.

Multiple clients simultaneously connected to the same mailbox

The POP3 protocol requires the currently connected client to be the only client connected to the mailbox. In contrast, the IMAP protocol specifically allows simultaneous access by multiple clients and provides mechanisms for clients to detect changes made to the mailbox by other, concurrently connected, clients.

Access to MIME message parts and partial fetch

Nearly all internet e-mail is transmitted in MIME format, allowing messages to have a tree structure where the leaf nodes are any of a variety of single part content types and the non-leaf nodes are any of a variety of multipart types. The IMAP4 protocol allows clients to separately retrieve any of the individual MIME parts and also to retrieve portions of either individual parts or the entire message. These mechanisms allow clients to retrieve the text portion of a message without retrieving attached files or to stream content as it is being fetched.

Message state information

Through the use of flags defined in the IMAP4 protocol, clients can keep track of message state; for example, whether or not the message has been read, replied to, or deleted. These flags are stored on the server, so different clients accessing the same mailbox at different times can detect state changes made by other clients. POP3 provides no mechanism for clients to store such state information on the server so if a single user accesses a mailbox with two different POP3 clients, state information–such as whether a message has been accessed–cannot be synchronized between the clients. The IMAP4 protocol supports both pre-defined system flags and client defined keywords. System flags indicate state information such as whether a message has been read. Keywords, which are not supported by all IMAP servers, allow messages to be given one or more tags whose meaning is up to the client. Adding user created tags to messages is an operation supported by some web-based email services, such as Gmail.

Multiple mailboxes on the server

IMAP4 clients can create, rename, and/or delete mailboxes (usually presented to the user as folders) on the server, and move messages between mailboxes. Multiple mailbox support also allows servers to provide access to shared and public folders.

Server-side searches

IMAP4 provides a mechanism for a client to ask the server to search for messages meeting a variety of criteria. This mechanism avoids requiring clients to download every message in the mailbox in order to perform these searches.

Built-in extension mechanism

Reflecting the experience of earlier Internet protocols, IMAP4 defines an explicit mechanism by which it may be extended. Many extensions to the base protocol have been proposed and are in common use. IMAP2bis did not have an extension mechanism, and POP3 now has one defined by RFC 2449.

Disadvantages of IMAP

While IMAP remedies many of the shortcomings of POP, this inherently introduces additional complexity. Much of this complexity (e.g., multiple clients accessing the same mailbox at the same time) is compensated for by server-side workarounds such as maildir or database backends.

Unless the mail store and searching algorithms on the server are carefully implemented, a client can potentially consume large amounts of server resources when searching massive mailboxes.

IMAP4 clients need to explicitly request new email message content potentially causing additional delays on slow connections such as those commonly used by mobile devices. A private proposal, push IMAP, would extend IMAP to implement push e-mail by sending the entire message instead of just a notification. However, push IMAP has not been generally accepted and current IETF work has addressed the problem in other ways (see the Lemonade Profile for more information).

Unlike some proprietary protocols which combine sending and retrieval operations, sending a message and saving a copy in a server-side folder with a base-level IMAP client requires transmitting the message content twice, once to SMTP for delivery and a second time to IMAP to store in a sent mail folder. This is remedied by a set of extensions defined by the IETF LEMONADE Working Group for mobile devices: URLAUTH (RFC 4467) and CATENATE (RFC 4469) in IMAP and BURL (RFC 4468) in SMTP-SUBMISSION. POP3 servers don’t support server-side folders so clients have no choice but to store sent items on the client. Many IMAP clients can be configured to store sent mail in a client-side folder. In addition to the LEMONADE “trio”, Courier Mail Server offers a non-standard method of sending using IMAP by copying an outgoing message to a dedicated outbox folder.

What is RAID ?

December 23, 2008 by radhakrishnak

What is RAID ?


  1. What does RAID stand for ?In 1987, Patterson, Gibson and Katz at the University of California Berkeley, published a paper entitled “A Case for Redundant Arrays of Inexpensive Disks (RAID)” . This paper described various types of disk arrays, referred to by the acronym RAID. The basic idea of RAID was to combine multiple small, inexpensive disk drives into an array of disk drives which yields performance exceeding that of a Single Large Expensive Drive (SLED). Additionally, this array of drives appears to the computer as a single logical storage unit or drive.The Mean Time Between Failure (MTBF) of the array will be equal to the MTBF of an individual drive, divided by the number of drives in the array. Because of this, the MTBF of an array of drives would be too low for many application requirements. However, disk arrays can be made fault-tolerant by redundantly storing information in various ways.Five types of array architectures, RAID-1 through RAID-5, were defined by the Berkeley paper, each providing disk fault-tolerance and each offering different trade-offs in features and performance. In addition to these five redundant array architectures, it has become popular to refer to a non-redundant array of disk drives as a RAID-0 array.
  2. Data StripingFundamental to RAID is “striping”, a method of concatenating multiple drives into one logical storage unit. Striping involves partitioning each drive’s storage space into stripes which may be as small as one sector (512 bytes) or as large as several megabytes. These stripes are then interleaved round-robin, so that the combined space is composed alternately of stripes from each drive. In effect, the storage space of the drives is shuffled like a deck of cards. The type of application environment, I/O or data intensive, determines whether large or small stripes should be used.Most multi-user operating systems today, like NT, Unix and Netware, support overlapped disk I/O operations across multiple drives. However, in order to maximize throughput for the disk subsystem, the I/O load must be balanced across all the drives so that each drive can be kept busy as much as possible. In a multiple drive system without striping, the disk I/O load is never perfectly balanced. Some drives will contain data files which are frequently accessed and some drives will only rarely be accessed. In I/O intensive environments, performance is optimized by striping the drives in the array with stripes large enough so that each record potentially falls entirely within one stripe. This ensures that the data and I/O will be evenly distributed across the array, allowing each drive to work on a different I/O operation, and thus maximize the number of simultaneous I/O operations which can be performed by the array.In data intensive environments and single-user systems which access large records, small stripes (typically one 512-byte sector in length) can be used so that each record will span across all the drives in the array, each drive storing part of the data from the record. This causes long record accesses to be performed faster, since the data transfer occurs in parallel on multiple drives. Unfortunately, small stripes rule out multiple overlapped I/O operations, since each I/O will typically involve all drives. However, operating systems like DOS which do not allow overlapped disk I/O, will not be negatively impacted. Applications such as on-demand video/audio, medical imaging and data acquisition, which utilize long record accesses, will achieve optimum performance with small stripe arrays.A potential drawback to using small stripes is that synchronized spindle drives are required in order to keep performance from being degraded when short records are accessed. Without synchronized spindles, each drive in the array will be at different random rotational positions. Since an I/O cannot be completed until every drive has accessed its part of the record, the drive which takes the longest will determine when the I/O completes. The more drives in the array, the more the average access time for the array approaches the worst case single-drive access time. Synchronized spindles assure that every drive in the array reaches its data at the same time. The access time of the array will thus be equal to the average access time of a single drive rather than approaching the worst case access time.
  3. The different RAID levels
    RAID-0
    RAID Level 0 is not redundant, hence does not truly fit the “RAID” acronym. In level 0, data is split across drives, resulting in higher data throughput. Since no redundant information is stored, performance is very good, but the failure of any disk in the array results in data loss. This level is commonly referred to as striping.
    RAID-1
    RAID Level 1 provides redundancy by writing all data to two or more drives. The performance of a level 1 array tends to be faster on reads and slower on writes compared to a single drive, but if either drive fails, no data is lost. This is a good entry-level redundant system, since only two drives are required; however, since one drive is used to store a duplicate of the data, the cost per megabyte is high. This level is commonly referred to as mirroring.
    RAID-2
    RAID Level 2, which uses Hamming error correction codes, is intended for use with drives which do not have built-in error detection. All SCSI drives support built-in error detection, so this level is of little use when using SCSI drives.
    RAID-3
    RAID Level 3 stripes data at a byte level across several drives, with parity stored on one drive. It is otherwise similar to level 4. Byte-level striping requires hardware support for efficient use.
    RAID-4
    RAID Level 4 stripes data at a block level across several drives, with parity stored on one drive. The parity information allows recovery from the failure of any single drive. The performance of a level 4 array is very good for reads (the same as level 0). Writes, however, require that parity data be updated each time. This slows small random writes, in particular, though large writes or sequential writes are fairly fast. Because only one drive in the array stores redundant data, the cost per megabyte of a level 4 array can be fairly low.
    RAID-5
    RAID Level 5 is similar to level 4, but distributes parity among the drives. This can speed small writes in multiprocessing systems, since the parity disk does not become a bottleneck. Because parity data must be skipped on each drive during reads, however, the performance for reads tends to be considerably lower than a level 4 array. The cost per megabyte is the same as for level 4.
      Summary:

    • RAID-0 is the fastest and most efficient array type but offers no fault-tolerance.
    • RAID-1 is the array of choice for performance-critical, fault-tolerant environments. In addition, RAID-1 is the only choice for fault-tolerance if no more than two drives are desired.
    • RAID-2 is seldom used today since ECC is embedded in almost all modern disk drives.
    • RAID-3 can be used in data intensive or single-user environments which access long sequential records to speed up data transfer. However, RAID-3 does not allow multiple I/O operations to be overlapped and requires synchronized-spindle drives in order to avoid performance degradation with short records.
    • RAID-4 offers no advantages over RAID-5 and does not support multiple simultaneous write operations.
    • RAID-5 is the best choice in multi-user environments which are not write performance sensitive. However, at least three, and more typically five drives are required for RAID-5 arrays.
  4. Possible aproaches to RAID
    • Hardware RAID
      The hardware based system manages the RAID subsystem independently from the host and presents to the host only a single disk per RAID array. This way the host doesn’t have to be aware of the RAID subsystems(s).

      • The controller based hardware solution
        DPT’s SCSI controllers are a good example for a controller based RAID solution.
        The intelligent contoller manages the RAID subsystem independently from the host. The advantage over an external SCSI—SCSI RAID subsystem is that the contoller is able to span the RAID subsystem over multiple SCSI channels and and by this remove the limiting factor external RAID solutions have: The transfer rate over the SCSI bus.
      • The external hardware solution (SCSI—SCSI RAID)
        An external RAID box moves all RAID handling “intelligence” into a contoller that is sitting in the external disk subsystem. The whole subsystem is connected to the host via a normal SCSI controller and apears to the host as a single or multiple disks.
        This solution has drawbacks compared to the contoller based solution: The single SCSI channel used in this solution creates a bottleneck.
        Newer technologies like Fiber Channel can ease this problem, especially if they allow to trunk multiple channels into a Storage Area Network.
        4 SCSI drives can already completely flood a parallel SCSI bus, since the average transfer size is around 4KB and the command transfer overhead – which is even in Ultra SCSI still done asynchonously – takes most of the bus time.
    • Software RAID
      • The MD driver in the Linux kernel is an example of a RAID solution that is completely hardware independent.
        The Linux MD driver supports currently RAID levels 0/1/4/5 + linear mode.
      • Under Solaris you have the Solstice DiskSuite and Veritas Volume Manager which offer RAID-0/1 and 5.
      • Adaptecs AAA-RAID controllers are another example, they have no RAID functionality whatsoever on the controller, they depend on external drivers to provide all external RAID functionality.
        They are basically only multiple single AHA2940 controllers which have been integrated on one card. Linux detects them as AHA2940 and treats them accordingly.
        Every OS needs its own special driver for this type of RAID solution, this is error prone and not very compatible.
    • Hardware vs. Software RAID
      Just like any other application, software-based arrays occupy host system memory, consume CPU cycles and are operating system dependent. By contending with other applications that are running concurrently for host CPU cycles and memory, software-based arrays degrade overall server performance. Also, unlike hardware-based arrays, the performance of a software-based array is directly dependent on server CPU performance and load.Except for the array functionality, hardware-based RAID schemes have very little in common with software-based implementations. Since the host CPU can execute user applications while the array adapter’s processor simultaneously executes the array functions, the result is true hardware multi-tasking. Hardware arrays also do not occupy any host system memory, nor are they operating system dependent.

      Hardware arrays are also highly fault tolerant. Since the array logic is based in hardware, software is NOT required to boot. Some software arrays, however, will fail to boot if the boot drive in the array fails. For example, an array implemented in software can only be functional when the array software has been read from the disks and is memory-resident. What happens if the server can’t load the array software because the disk that contains the fault tolerant software has failed? Software-based implementations commonly require a separate boot drive, which is NOT included in the array.

  5. What are the advantages of a multichannel contoller ?
  6. Hardware vs. Software caching ?

Install packages on Linux using yum

December 16, 2008 by radhakrishnak

Cleaning of the system.

Yum leaves as result of its use heads and packages RPM stored in the interior of the directory located in the route /var/cache/yum/. Particularly the packages RPM that have settled can occupy much space and is by such reason agrees to eliminate them once no longer they have utility. Also it agrees to do the same with the old heads of packages that no longer are in the data base. In order to make the corresponding cleaning, the following thing can be executed:

Code:
 yum clean all

Hello world!

November 21, 2008 by radhakrishnak

Welcome to WordPress.com. This is your first post. Edit or delete it and start blogging!

Compiling Linux kernel

Step # 1 Get Latest Linux kernel code

Visit http://kernel.org/ and download the latest source code. File name would be linux-x.y.z.tar.bz2, where x.y.z is actual version number. For example file inux-2.6.25.tar.bz2 represents 2.6.25 kernel version. Use wget command to download kernel source code:
$ cd /tmp
$ wget http://www.kernel.org/pub/linux/kernel/v2.6/linux-x.y.z.tar.bz2

Note: Replace x.y.z with actual version number.

Step # 2 Extract tar (.tar.bz3) file

Type the following command:
# tar -xjvf linux-2.6.25.tar.bz2 -C /usr/src
# cd /usr/src

Step # 3 Configure kernel

Before you configure kernel make sure you have development tools (gcc compilers and related tools) are installed on your system. If gcc compiler and tools are not installed then use apt-get command under Debian Linux to install development tools.
# apt-get install gcc

Now you can start kernel configuration by typing any one of the command:

  • $ make menuconfig – Text based color menus, radiolists & dialogs. This option also useful on remote server if you wanna compile kernel remotely.
  • $ make xconfig – X windows (Qt) based configuration tool, works best under KDE desktop
  • $ make gconfig – X windows (Gtk) based configuration tool, works best under Gnome Dekstop.

For example make menuconfig command launches following screen:
$ make menuconfig

You have to select different options as per your need. Each configuration option has HELP button associated with it so select help button to get help.

Step # 4 Compile kernel

Start compiling to create a compressed kernel image, enter:
$ make
Start compiling to kernel modules:
$ make modules

Install kernel modules (become a root user, use su command):
$ su -
# make modules_install

Step # 5 Install kernel

So far we have compiled kernel and installed kernel modules. It is time to install kernel itself.
# make install

It will install three files into /boot directory as well as modification to your kernel grub configuration file:

  • System.map-2.6.25
  • config-2.6.25
  • vmlinuz-2.6.25

Step # 6: Create an initrd image

Type the following command at a shell prompt:
# cd /boot
# mkinitrd -o initrd.img-2.6.25 2.6.25

initrd images contains device driver which needed to load rest of the operating system later on. Not all computer requires initrd, but it is safe to create one.

Step # 7 Modify Grub configuration file – /boot/grub/menu.lst

Open file using vi:
# vi /boot/grub/menu.lst

title           Debian GNU/Linux, kernel 2.6.25 Default
root            (hd0,0)
kernel          /boot/vmlinuz root=/dev/hdb1 ro
initrd          /boot/initrd.img-2.6.25
savedefault
boot

Remember to setup correct root=/dev/hdXX device. Save and close the file. If you think editing and writing all lines by hand is too much for you, try out update-grub command to update the lines for each kernel in /boot/grub/menu.lst file. Just type the command:
# update-grub
Neat. Huh?

Step # 8 : Reboot computer and boot into your new kernel

Just issue reboot command:

# reboot