    Unix File System: Directories, Filenames and Inode

    Everything in the UNIX system is a file. That is less of an oversimplification than you might think. When the first version of the system was being designed, before it even had a name, the discussions focused on the structure of a file system that would be clean and easy to use. The file system is central to the success and convenience of the UNIX system. It is one of the best examples of the "keep it simple" philosophy, showing the power achieved by the careful implementation of a few well-chosen ideas.

    To talk comfortably about commands and their interrelationships, we need a good background in the structure and outer workings of the file system. This chapter covers most of the details of using the file system - what files are, how they are represented, directories and the file system hierarchy, permissions, inodes (the system's internal record of files), and device files. Because most users of the UNIX system deal with manipulating files, there are many commands for file investigation or rearrangement; this chapter introduces the more commonly used ones.

    The basics of files

    A file is a sequence of bytes. (A byte is a small chunk of information, typically 8 bits long. For our purposes, a byte is equivalent to a character.) No structure is imposed on a file by the system, and no meaning is attached to its contents - the meaning of the bytes depends solely on the programs that interpret the file. Furthermore, as we shall see, this is true not just of disc files but of peripheral devices as well. Magnetic tapes, mail messages, characters typed on the keyboard, line printer output, data flowing in pipes - each of these files is just a sequence of bytes as far as the system and the programs in it are concerned. The best way to learn about files is to play with them, so start by creating a small file:

    $ ed

    a

    now is the time

    for all good people

    .

    w junk

    36

    q

    $ ls -l junk

    -rw-r--r-- 1 you                              36 Sep 27 06:11 junk

    junk is a file with 36 bytes - the 36 characters you typed while appending (except, of course, for correction of any typing mistakes). To see the file,

    $ cat junk

     now is the time

    for all good people

    $

    cat shows what the file looks like. The command od (octal dump) prints a visible representation of all the bytes of a file:

    $ od -c junk

    0000000   n   o   w       i   s       t   h   e       t   i   m   e  \n

    0000020   f   o   r       a   l   l       g   o   o   d       p   e   o

    0000040   p   l   e  \n

    0000044

    $

    The -c option means "interpret bytes as characters." Turning on the -b option will show the bytes as octal (base 8) numbers as well:

    $ od -cb junk

    0000000   n   o   w       i   s       t   h   e       t   i   m   e  \n

            156 157 167 040 151 163 040 164 150 145 040 164 151 155 145 012

    0000020   f   o   r       a   l   l       g   o   o   d       p   e   o

            146 157 162 040 141 154 154 040 147 157 157 144 040 160 145 157

    0000040   p   l   e  \n

            160 154 145 012

    0000044

    $

    The 7-digit numbers down the left side are positions in the file, that is, the ordinal number of the next character shown, in octal. By the way, the emphasis on octal numbers is a holdover from the PDP-11, for which octal was the preferred notation. Hexadecimal is better suited for other machines; the -x option tells od to print in hex. Notice that there is a character after each line, with an octal value of 012. This is the ASCII newline character; it is what the system places in the input when you press the RETURN key. By a convention borrowed from C, the character representation of a newline is \n, but this is only a convention used by programs like od to make it easy to read - the value stored in the file is the single byte 012.
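
    The layout of the byte dump can be imitated in a few lines. The following is a minimal Python sketch, not od itself; the function name octal_dump is made up for illustration. It prints an octal offset followed by the octal value of each byte, 16 bytes per line, and finishes with the offset just past the last byte.

```python
def octal_dump(data: bytes, width: int = 16) -> list[str]:
    """Return lines resembling `od -b`: an octal offset, then each byte in octal."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        # Offsets are 7 octal digits; byte values are 3 octal digits.
        lines.append(f"{offset:07o} " + " ".join(f"{b:03o}" for b in chunk))
    # Like od, finish with the offset just past the last byte.
    lines.append(f"{len(data):07o}")
    return lines


# The same 36 bytes that were written to junk.
junk = b"now is the time\nfor all good people\n"
for line in octal_dump(junk):
    print(line)
```

    Each line of the text ends in 012, the single byte stored for a newline, whatever notation a program chooses when displaying it.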

    Newline is the most common example of a special character. Other characters associated with some terminal control operation include backspace (octal value 010, printed as \b), tab (011, \t), and carriage return (015, \r). It is important in each case to distinguish between how the character is stored in a file and how it is interpreted in various situations. For example, when you type a backspace on your keyboard (and assuming that your erase character is backspace), the kernel interprets it to mean that you want to discard whatever character you typed previously. Both that character and the backspace disappear, but the backspace is echoed to your terminal, where it makes the cursor move one position backward. If you type a \ followed by a backspace, however, the kernel interprets that to mean that you want a literal backspace in your input, so the \ is discarded and the byte 010 winds up in your file. When the backspace is echoed on your terminal, it moves the cursor to sit on top of the \.

    When you print a file that contains a backspace, the backspace is passed uninterpreted to your terminal, which again will move the cursor one position backward. When you use od to display a file that contains a backspace, it appears as a byte with value 010, or, with the -c option, as \b.

    The story for tabs is much the same: on input, a tab character is echoed to your terminal and sent to the program that is reading; on output, the tab is simply sent to the terminal for interpretation there. There is a difference, though: you can tell the kernel that you want it to interpret tabs for you on output; in that case, each tab that would be printed is replaced by the right number of blanks to get to the next tab stop. Tab stops are set at columns 9, 17, 25, etc. The command

    $ stty -tabs

    causes tabs to be replaced by spaces when printed on your terminal. See stty(1).

    The treatment of RETURN is analogous. The kernel echoes RETURN as a carriage return and a newline, but stores only the newline in the input. On output, the newline is expanded into carriage return and newline.

    The UNIX system is unusual in its approach to representing control information, particularly its use of newlines to terminate lines. Many systems instead provide "records," one per line, each of which contains not only your data but also a count of the number of characters in the line (and no newline). Other systems terminate each line with a carriage return and a newline because that sequence is necessary for output on most terminals. (The word "linefeed" is a synonym for newline, so this sequence is often called "CRLF," which is nearly pronounceable.)

    The UNIX system does neither: there are no records, no record counts, and no bytes in any file that you or your programs did not put there. A newline is expanded into a carriage return and a newline when sent to a terminal, but programs need only deal with the single newline character because that is all they see. For most purposes, this simple scheme is exactly what is wanted. When a more complicated structure is needed, it can easily be built on top of this; the converse, creating simplicity from complexity, is harder to achieve.

    Since the end of a line is marked by a newline character, you might expect a file to be terminated by another special character, say \e for "end of the file." Looking at the output of od, though, you will see no special character at the end of the file - it just stops. Rather than using a special code, the system signifies the end of a file by simply saying there is no more data in the file. The kernel keeps track of file lengths, so a program encounters end-of-file when it has processed all the bytes in a file.

    Programs retrieve the data in a file by a system call (a subroutine in the kernel) called read. Each time read is called, it returns the next part of a file - the next line of text typed on the terminal, for example. read also says how many bytes of the file were returned, so the end of the file is assumed when a read says "zero bytes are being returned." If there were any bytes left, read would have returned some of them. It makes sense not to represent the end of the file by a special byte value, because, as we said earlier, the meaning of the bytes depends on the interpretation of the file. But all files must end, and since all files must be accessed through read, returning zero is an interpretation-independent way to represent the end of a file without introducing a new special character.
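
    The zero-byte convention is easy to see from a program. The sketch below uses Python's os.read, a thin wrapper around the read system call; the helper name count_bytes and the 16-byte chunk size are arbitrary choices for illustration. The loop stops when read hands back nothing, and no special end-of-file byte is ever examined.

```python
import os
import tempfile


def count_bytes(fd: int, chunk_size: int = 16) -> int:
    """Read from a file descriptor until read returns zero bytes (end of file)."""
    total = 0
    while True:
        chunk = os.read(fd, chunk_size)  # thin wrapper around the read system call
        if not chunk:                    # zero bytes returned: end of file
            return total
        total += len(chunk)


# Demonstration on a temporary file holding the same 36 bytes as junk.
with tempfile.TemporaryFile() as f:
    f.write(b"now is the time\nfor all good people\n")
    f.flush()
    f.seek(0)
    print(count_bytes(f.fileno()))
```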

    When a program reads from your terminal, each input line is given to the program by the kernel only when you type its newline (i.e., press RETURN). Therefore if you make a typing mistake, you can back up and correct it if you realize the mistake before you type newline. If you type newline before realizing the error, the line has been read by the system and you cannot correct it. We can see how this line-at-a-time input works using cat. cat normally saves up or buffers its output to write in large chunks for efficiency, but cat -u "unbuffers" the output, so it is printed immediately as it is read:

    $ cat                    Buffered output from cat

    123

    456

    789

    Ctrl-d

    123

    456

    789

    $ cat -u

    123

    123

    456

    456

    789

    789

    Ctrl-d

    $

    cat receives each line when you press RETURN; without buffering, it prints the data as it is received. Now try something different: type some characters and then a ctl-d rather than a RETURN:

    $ cat -u

    123ctl-d123

    cat prints the characters out immediately. ctl-d says, "immediately send the characters I have typed to the program that is reading from my terminal." The ctl-d itself is not sent to the program, unlike a newline. Now type a second ctl-d, with no other characters:

    $ cat -u

    123ctl-d123ctl-d$

    The shell responds with a prompt, because cat read no characters, decided that meant end of file, and stopped. ctl-d sends whatever you have typed to the program that is reading from the terminal. If you haven't typed anything, the program will therefore read no characters, and that looks like the end of the file. That is why typing ctl-d logs you out - the shell sees no more input. Of course, ctl-d is usually used to signal an end-of-file but, interestingly, it has a more general function.

    Exercise 2-1. What happens when you type ctl-d to ed? Compare this to the command

    $ ed

    What's in a file?

    The format of a file is determined by the programs that use it; there is a wide variety of file types, perhaps because there is a wide variety of programs. But since file types are not determined by the file system, the kernel can't tell you the type of a file: it doesn't know it. The file command makes an educated guess (we'll explain how shortly):

    $ file /bin /bin/ed /usr/src/cmd/ed.c /usr/man/man1/ed.1

    /bin:                   directory

    /bin/ed:                       pure executable

    /usr/src/cmd/ed.c:                c program text

    /usr/man/man1/ed.1:                    roff, nroff, or eqn input text

    These are four fairly typical files, all related to the editor: the directory in which it resides (/bin), the "binary" or runnable program itself (/bin/ed), the "source" or C statements that define the program (/usr/src/cmd/ed.c) and the manual page (/usr/man/man1/ed.1).

    To determine the types, file didn't pay attention to the names (although it could have), because naming conventions are just conventions, and thus not perfectly reliable. For example, files suffixed .c are almost always C source, but there is nothing to prevent you from creating a .c file with arbitrary contents. Instead, file reads the first few hundred bytes of a file and looks for clues to the file type. (As we will show later on, files with special system properties, such as directories, can be identified by asking the system, but file could identify a directory by reading it.)

    Sometimes the clues are obvious. A runnable program is marked by a binary "magic number" at its beginning. od with no options dumps the file in 16-bit, or 2-byte, words and makes the magic number visible:

    $ od /bin/ed

    0000000      000410     025000      000462      011444      000000      000000      000000      000001

    0000020      170011      016600     000002      005060      177776      010600      162706     000004

    0000040      016616      000004     005720     010066      000002       005720      001376      020076

    The octal value 410 marks a pure executable program, one for which the executing code may be shared by several processes. (Specific magic numbers are system dependent.) The bit pattern represented by 410 is not ASCII text, so this value could not be created inadvertently by a program like an editor. But you could certainly create such a file by running a program of your own, and the system understands the convention that such files are program binaries.

    For text files, the clues may be deeper in the file, so file looks for words like #include to identify C source, or lines beginning with a period to identify nroff or troff input.
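
    The guessing strategy described here can be sketched in Python. This is a toy under stated assumptions, not the real file command: the function name guess_type, the fallback labels "data" and "ascii text", and the little-endian reading of the first 16-bit word (as on the PDP-11) are choices made for illustration, and magic numbers differ between systems.

```python
import struct


def guess_type(head: bytes) -> str:
    """Guess a file's type from its first bytes, loosely in the manner of file."""
    # A leading 16-bit word of octal 410 marks a pure executable in the
    # example above. Assumption: the word is stored little-endian.
    if len(head) >= 2 and struct.unpack("<H", head[:2])[0] == 0o410:
        return "pure executable"
    try:
        text = head.decode("ascii")
    except UnicodeDecodeError:
        return "data"  # hypothetical label for unrecognized binary content
    lines = text.splitlines()
    if any(line.startswith("#include") for line in lines):
        return "c program text"
    if any(line.startswith(".") for line in lines):
        return "roff, nroff, or eqn input text"
    return "ascii text"


print(guess_type(struct.pack("<H", 0o410) + b"\x00\x2a"))
print(guess_type(b"#include <stdio.h>\nint main() { return 0; }\n"))
print(guess_type(b".TH ED 1\n.SH NAME\n"))
```

    Because the clues are only conventions, a guesser like this can be fooled, which is why the result is an educated guess rather than a fact recorded by the file system.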

    You might wonder why the system doesn't track file types more carefully, so that, for example, sort is never given /bin/ed as input. One reason is to avoid foreclosing some useful computation. Although

     $ sort /bin/ed

    doesn't make much sense, there are many commands that can operate on any file at all, and there's no reason to restrict their capabilities. od, wc, cp, cmp, file and many others process files regardless of their contents. But the formatless idea goes deeper than that. If, say, nroff input were distinguished from C source, the editor would be forced to make the distinction when it created a file, and probably when it read in a file for editing again. And it would certainly make it harder for us to typeset the C programs in Chapters 6 through 8!

    Instead of creating distinctions, the UNIX system tries to efface them. All text consists of lines terminated by newline characters, and most programs understand this simple format. Many times while writing this book, we ran commands to create text files, processed them with commands like those listed above, and used an editor to merge them into the troff input for the book. The transcripts you see on almost every page are made by commands like

    $ od -c junk >temp

    $ ed ch2.1

    1534

    r temp

    168

    od produces text on its standard output, which can then be used anywhere text can be used. This uniformity is unusual; most systems have several file formats, even for text, and require negotiation by a program or a user to create a file of a particular type. In UNIX systems there is just one kind of file, and all that is required to access a file is its name.

    The lack of file formats is an advantage overall - programmers needn't worry about file types, and all the standard programs will work on any file - but there are a handful of drawbacks. Programs that sort and search and edit expect text as input: grep can't examine binary files correctly, nor can sort sort them, nor can any standard editor manipulate them.

    There are implementation limitations with most programs that expect text as input. We tested some programs on a 30,000-byte text file containing no newlines, and surprisingly few behaved properly because most programs make unadvertised assumptions about the maximum length of a line of text (for an exception, see the BUGS section of sort(1)).

    Non-text files have their place. For example, very large databases usually need extra address information for rapid access; this has to be binary for efficiency. But every file format that is not text must have its own family of support programs to do things that the standard tools could perform if the format were text. Text files may be a little less efficient in machine cycles, but this must be balanced against the cost of extra software to maintain more specialized formats. If you design a file format, you should think carefully before choosing a non-textual representation. (You should also think about making your programs robust in the face of long input lines.)

    Directories and filenames

    All the files you own have unambiguous names, starting with /usr/you, but if the only file you have is junk, and you type ls, it doesn't print /usr/you/junk; the filename is printed without any prefix:

    $ ls

    junk

    $

    That is because each running program, that is, each process, has a current directory, and all filenames are implicitly assumed to start with the name of that directory unless they begin directly with a slash. Your login shell, and ls, therefore have a current directory. The command pwd (print working directory) identifies the current directory:

    $ pwd

    /usr/you

    $

    The current directory is an attribute of a process, not a person or a program - people have login directories, and processes have current directories. If a process creates a child process, the child inherits the current directory of its parent. But if the child then changes to a new directory, the parent is unaffected - its current directory remains the same no matter what the child does.
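
    Inheritance of the current directory can be observed directly. In this Python sketch the parent starts a child process that reports the directory it inherited, changes to another directory, and reports again; the parent then checks its own directory. Using the system temporary directory as the child's destination is an arbitrary choice for the demonstration.

```python
import os
import subprocess
import sys

# The child prints the directory it inherited, changes directory, and prints again.
CHILD = (
    "import os, tempfile;"
    "print(os.getcwd());"
    "os.chdir(tempfile.gettempdir());"
    "print(os.getcwd())"
)

before = os.getcwd()
result = subprocess.run(
    [sys.executable, "-c", CHILD], capture_output=True, text=True, check=True
)
inherited, moved = result.stdout.splitlines()
after = os.getcwd()

print("child started in:", inherited)
print("child moved to:  ", moved)
print("parent still in: ", after)
```

    This is also why cd has to be carried out by the shell itself: a separate cd program would change only its own current directory and then exit.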

    The notion of a current directory is certainly a notational convenience because it can save a lot of typing, but its real purpose is organizational. Related files belong together in the same directory. /usr is often the top directory of the user file system. (user is abbreviated to usr in the same spirit as cmp, ls, etc.) /usr/you is your login directory, your current directory when you first log in. /usr/src contains source for system programs, /usr/src/cmd contains source for UNIX commands, /usr/src/cmd/sh contains the source files for the shell, and so on. Whenever you embark on a new project, or whenever you have a set of related files, say a set of recipes, you could create a new directory with mkdir and put the files there.

    $ pwd

    /usr/you

    $ mkdir recipes

    $ cd recipes

    $ pwd

    /usr/you/recipes

    $ mkdir pie cookie

    $ ed pie/apple

    $ ed cookie/choco.chip

    $

    Notice that it is simple to refer to subdirectories. pie/apple has an obvious meaning: the apple pie recipe, in directory /usr/you/recipes/pie. You could instead have put the recipe in, say, recipes/apple.pie, rather than in a subdirectory of recipes, but it seems better organized to put all the pies together, too. For example, the crust recipe could be kept in recipes/pie/crust rather than duplicated in each pie recipe. Although the file system is a powerful organizational tool, you can forget where you put a file, or even what files you've got. The obvious solution is a command or two to rummage around in directories. The ls command is certainly helpful for finding files, but it doesn't look in sub-directories.

    $ cd

    $ ls

    junk

    recipes

    $ file *

    junk:          ASCII text

    recipes:         directory

    $ ls recipes

    cookie

    pie

    $ ls recipes/pie

    apple

    crust

    $

    This piece of the file system can be shown pictorially as:

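    [Figure: tree diagram of /usr/you, which contains junk and recipes; recipes contains the directories pie and cookie.]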

    The command du (disk usage) was written to tell how much disc space is consumed by the files in a directory, including all its subdirectories.

    $ du

    6         ./recipes/pie

    4         ./recipes/cookie

    11       ./recipes

    13        .

    The filenames are obvious; the numbers are the number of disc blocks - typically 512 or 1024 bytes each - of storage for each file. The value for a directory indicates how many blocks are consumed by all the files in that directory and its subdirectories, including the directory itself. du has an option -a, for "all," that causes it to print out all the files in a directory. If one of those is a directory, du processes that as well:

    $ du -a

    2             ./recipes/pie/apple

    3            ./recipes/pie/crust

    6            ./recipes/pie

    3           ./recipes/cookie/choco.chip

    4           ./recipes/cookie

    11        ./recipes

    1          ./junk

    13        .

    $
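
    The bottom-up totalling that du performs can be sketched in Python. This is an approximation under stated assumptions, not du itself: it rounds each directory's byte total up to blocks of an assumed 512 bytes, whereas a real du reports the blocks the file system actually allocated. The function name disk_usage and the small demonstration tree are made up for illustration.

```python
import os
import tempfile


def disk_usage(top: str, block_size: int = 512) -> list[tuple[int, str]]:
    """Return (blocks, path) pairs for top and every directory beneath it."""
    totals = {}
    # Walk bottom-up so each subdirectory's total is ready before its parent's.
    for dirpath, dirnames, filenames in os.walk(top, topdown=False):
        size = os.lstat(dirpath).st_size  # the directory itself takes space too
        for name in filenames:
            size += os.lstat(os.path.join(dirpath, name)).st_size
        for name in dirnames:
            size += totals.get(os.path.join(dirpath, name), 0)
        totals[dirpath] = size
    # Round each byte count up to whole blocks.
    return [(-(-size // block_size), path) for path, size in totals.items()]


# Build a small tree in a temporary directory and report on it.
with tempfile.TemporaryDirectory() as top:
    os.makedirs(os.path.join(top, "recipes", "pie"))
    with open(os.path.join(top, "recipes", "pie", "apple"), "w") as f:
        f.write("x" * 1000)
    for blocks, path in disk_usage(top):
        print(blocks, os.path.relpath(path, top), sep="\t")
```

    As with du, the deepest directories are listed first and the starting directory comes last, since its total depends on everything below it.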

    The output of du -a can be piped through grep to look for specific files:

    $ du -a | grep choc

    3         ./recipes/cookie/choco.chip

     $

    Recall from Chapter 1 that the name '.' is a directory entry that refers to the directory itself; it permits access to a directory without having to know the full name. du looks in a directory for files; if you don't tell it which directory, it assumes '.', the directory you are in now. Therefore, junk and ./junk are names for the same file.

    Despite their fundamental properties inside the kernel, directories sit in the file system as ordinary files. They can be read as ordinary files. But they can't be created or written as ordinary files - to preserve their sanity and the users' files, the kernel reserves to itself all control over the contents of directories.

    The time has come to look at the bytes in a directory:

    $ od -cb .

    0000000   4   ;   .  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0

            064 073 056 000 000 000 000 000 000 000 000 000 000 000 000 000

    0000020 273   (   .   .  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0

            273 050 056 056 000 000 000 000 000 000 000 000 000 000 000 000

    0000040 252   ;   r   e   c   i   p   e   s  \0  \0  \0  \0  \0  \0  \0

            252 073 162 145 143 151 160 145 163 000 000 000 000 000 000 000

    0000060 230   =   j   u   n   k  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0

            230 075 152 165 156 153 000 000 000 000 000 000 000 000 000 000

    0000100

    $

    See the filenames buried in there? The directory format is a combination of binary and textual data. A directory consists of 16-byte chunks, the last 14 bytes of which hold the filename, padded with ASCII NULs (which have value 0), and the first two of which tell the system where the administrative information for the file resides - we'll come back to that. Every directory begins with the two entries . ("dot") and .. ("dot-dot").
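
    The 16-byte layout can be decoded mechanically. The Python sketch below rebuilds the 64 bytes from the dump above and splits them into entries; it assumes the two leading bytes form a 16-bit number stored low byte first, as on the PDP-11, and the names RAW and parse_directory are made up for illustration. Directories on current systems use other formats, so this applies only to the historical layout shown here.

```python
import struct

# The 64 bytes shown in the dump: four 16-byte directory entries.
RAW = (
    bytes([0o064, 0o073]) + b".".ljust(14, b"\0")
    + bytes([0o273, 0o050]) + b"..".ljust(14, b"\0")
    + bytes([0o252, 0o073]) + b"recipes".ljust(14, b"\0")
    + bytes([0o230, 0o075]) + b"junk".ljust(14, b"\0")
)


def parse_directory(raw: bytes) -> list[tuple[int, str]]:
    """Split raw directory bytes into (number, name) pairs."""
    entries = []
    for offset in range(0, len(raw), 16):
        # "<H14s": a little-endian 16-bit number, then 14 bytes of name.
        number, name = struct.unpack_from("<H14s", raw, offset)
        entries.append((number, name.rstrip(b"\0").decode("ascii")))
    return entries


for number, name in parse_directory(RAW):
    print(number, name)
```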

    $ cd        Home

    $ cd recipes

    $ pwd

    /usr/you/recipes

    $ cd ..; pwd          Up one level

    /usr/you

    $ cd ..; pwd          Up another level

    /usr

    $ cd ..; pwd          Up another level

    /

    $ cd ..; pwd          Up another level

    /                                  Can't go any higher

    $

    The directory / is called the root of the file system. Every file in the system is in the root directory or one of its subdirectories, and the root is its own parent directory.

    Exercise 2-2. Given the information in this section, you should be able to understand roughly how the ls command operates. Hint: cat . >foo; ls -f foo.

    Exercise 2-3. (Harder) How does the pwd command operate?

    Exercise 2-4. du was written to monitor disc usage. Using it to find files in a directory hierarchy is at best a strange idiom, and perhaps inappropriate. As an alternative, look at the manual page for find(1) and compare the two commands. In particular, compare the command du -a | grep ... with the corresponding invocation of find. Which runs faster? Is it better to build a new tool or use a side effect of an old one?

    Permissions

    Every file has a set of permissions associated with it, which determine who can do what with the file. If you're so organized that you keep your love letters on the system, perhaps hierarchically arranged in a directory, you probably don't want other people to be able to read them. You could therefore change the permissions on each letter to frustrate gossip (or only on some of the letters, to encourage it), or you might just change the permissions on the directory containing the letters, and thwart snoopers that way.

    But we must warn you: there is a special user on every UNIX system, called the super-user, who can read or modify any file on the system. The special login name root carries super-user privileges; it is used by system administrators when they do system maintenance. There is also a command called su that grants super-user status if you know the root password. Thus anyone who knows the super-user password can read your love letters, so don't keep sensitive material in the file system. If you need more privacy, you can change the data in a file so that even the super-user cannot read (or at least understand) it, using the crypt command (crypt(1)). Of course, even crypt isn't perfectly secure. A super-user can change the crypt command itself, and there are cryptographic attacks on the crypt algorithm. The former requires malfeasance and the latter takes hard work, however, so crypt is in practice fairly secure. In real life, most security breaches are due to passwords that are given away or easily guessed. Occasionally, system administrative lapses make it possible for a malicious user to gain super-user permission. Security issues are discussed further in some of the papers cited in the bibliography at the end of this chapter.

    When you log in, you type a name and then verify that you are that person by typing a password. The name is your login identification or login-id. But the system recognizes you by a number, called your user-id, or uid. Different login-ids may have the same uid, making them indistinguishable to the system, although that is relatively rare and perhaps undesirable for security reasons. Besides a uid, you are assigned a group identification, or group-id, which places you in a class of users. On many systems, all ordinary users (as opposed to those with login-ids like root) are placed in a single group called other, but your system may be different.

    The file system, and therefore the UNIX system in general, determines what you can do by the permissions granted to your uid and group-id. The file /etc/passwd is the password file; it contains all the login information about each user. You can discover your uid and group-id, as does the system, by looking up your name in /etc/passwd:

    $ grep you /etc/passwd

    The fields in the password file are separated by colons and are laid out like this (as seen in passwd(5)):

    login-id:encrypted-password:uid:group-id:miscellany:login-directory:shell

    The file is ordinary text, but the field definitions and separator are a convention agreed upon by the programs that use the information in the file. The shell field is often empty, implying that you use the default shell, /bin/sh. The miscellany field may contain anything; often, it has your name and address or phone number.
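
    Because the password file is plain text with a fixed separator, any program can pick it apart. The following Python sketch splits one line into the seven fields listed above; the class and function names are made up, and the sample entry is hypothetical (its uid, group-id and miscellany values are invented for the demonstration).

```python
from typing import NamedTuple


class PasswdEntry(NamedTuple):
    login_id: str
    encrypted_password: str
    uid: int
    group_id: int
    miscellany: str
    login_directory: str
    shell: str


def parse_passwd_line(line: str) -> PasswdEntry:
    """Split one colon-separated password-file line into its seven fields."""
    login, pw, uid, gid, misc, home, shell = line.rstrip("\n").split(":")
    # An empty shell field means the default shell, /bin/sh.
    return PasswdEntry(login, pw, int(uid), int(gid), misc, home, shell or "/bin/sh")


# Hypothetical entry; the uid, group-id and miscellany values are made up.
entry = parse_passwd_line("you:gkmbCTrJ04COM:1001:1:Your Name:/usr/you:\n")
print(entry.login_id, entry.uid, entry.group_id, entry.shell)
```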

    Note that your password appears here in the second field, but only in an encrypted form. Anybody can read the password file (you just did), so if your password itself were there, anyone would be able to use it to masquerade as you. When you give your password to login, it encrypts it and compares the result against the encrypted password in /etc/passwd. If they agree, it lets you log in. The mechanism works because the encryption algorithm has the property that it's easy to go from the clear form to the encrypted form, but very hard to go backward. For example, if your password is ka-boom, it might be encrypted as gkmbCTrJ04COM, but given the latter, there's no easy way to get back to the original.
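
    The encrypt-and-compare check can be sketched as follows. This Python example uses SHA-256 purely as a stand-in one-way function; it is not the algorithm the password file used, and the function names are made up for illustration. The point is only that the stored value is never turned back into the password: the typed password is transformed and the two transformed values are compared.

```python
import hashlib
import hmac


def encrypt(password: str) -> str:
    """One-way transform of a password (SHA-256 stand-in, not the real algorithm)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def check_login(typed: str, stored: str) -> bool:
    """Encrypt what was typed and compare it with the stored encrypted form."""
    return hmac.compare_digest(encrypt(typed), stored)


# The stored value stands in for the second field of a password-file entry.
stored = encrypt("ka-boom")
print(check_login("ka-boom", stored))
print(check_login("kaboom", stored))
```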

    The kernel decided that you should be allowed to read /etc/passwd by looking at the permissions associated with the file. There are three kinds of permissions for each file: read (i.e., examine its contents), write (i.e., change its contents), and execute (i.e., run it as a program). Furthermore, different permissions can apply to different people. As file owner, you have one set of reading, writing, and executing permissions. Your "group" has a separate set. Everyone else has a third set.

    The -l option of ls prints the permissions information, among other things:

    $ ls -l /etc/passwd

    -rw-r--r-- 1 root                  5115 Aug 30 10:40 /etc/passwd

    $ ls -lg /etc/passwd

    -rw-r--r-- 1 adm                 5115 Aug 30 10:40 /etc/passwd

    These two lines may be collectively interpreted as: /etc/passwd is owned by login-id root, group adm, is 5115 bytes long, was last modified on August 30 at 10:40 AM, and has one link (one name in the file system; we'll discuss links in the next section). Some versions of ls give both owner and group in one invocation.

    The string -rw-r--r-- is how ls represents the permissions on the file. The first - indicates that it is an ordinary file. If it were a directory, there would be a d there. The next three characters encode the file owner's (based on uid) read, write and execute permissions. rw- means that root (the owner) may read or write, but not execute the file. An executable file would have an x instead of a dash. The next three characters (r--) encode group permissions, in this case that people in group adm, presumably the system administrators, can read the file but not write or execute it. The next three (also r--) define the permissions for everyone else - the rest of the users on the system. On this machine, then, only root can change the login information for a user, but anybody may read the file to discover the information. A plausible alternative would be for group adm to also have write permission on /etc/passwd.
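
    The ten-character string is a rendering of a number the system keeps for each file. As an illustration, Python's stat.filemode produces the same notation from a numeric mode, and the string can then be cut into the pieces described above. The helper name describe is made up; the octal modes 644 and 775 are example values.

```python
import stat


def describe(mode: int) -> dict[str, str]:
    """Split an ls-style mode string into file type, owner, group and other fields."""
    text = stat.filemode(mode)  # e.g. '-rw-r--r--'
    return {
        "string": text,
        "type": text[0],
        "owner": text[1:4],
        "group": text[4:7],
        "other": text[7:10],
    }


# Ordinary file: read/write for the owner, read-only for group and everyone else.
print(describe(stat.S_IFREG | 0o644))
# Directory that its owner and group may modify.
print(describe(stat.S_IFDIR | 0o775))
```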

    The file /etc/group encodes group names and group-ids, and defines which users are in which groups. /etc/passwd identifies only your login group; the newgrp command changes your group permissions to another group.

    Anybody can say

    $ ed /etc/passwd

    and edit the password file, but only root can write back the changes. You might therefore wonder how you can change your password since that involves editing the password file. The program to change passwords is called passwd; you will probably find it in /bin:

    $ ls -l /bin/passwd

    -rwsr-xr-x 1 root             8454 Jan 4 1983 /bin/passwd

    $

    (Note that /etc/passwd is the text file containing the login information, while /bin/passwd, in a different directory, is a file containing an executable program that lets you change the password information.) The permissions here state that anyone may execute the command, but only root can change the passwd command. But the s instead of an x in the execute field for the file owner states that, when the command is run, it is to be given the permissions corresponding to the file owner, in this case, root. Because /bin/passwd is "set-uid" to root, any user can run the passwd command to edit the password file. The set-uid bit is a simple but elegant idea that solves some security problems. For example, the author of a game program can make the program set-uid to the owner, so that it can update a score file that is otherwise protected from other users' access. But the set-uid concept is potentially dangerous. /bin/passwd has to be correct; if it were not, it could destroy system information under root's auspices. If it had the permissions -rwsrwxrwx, it could be overwritten by any user, who could therefore replace the file with a program that does anything. This is particularly serious for a set-uid program because root has access permissions to every file on the system. (Some UNIX systems turn the set-uid bit off whenever a file is modified, to reduce the danger of a security hole.)

    The set-uid bit is powerful but used primarily for a few system programs such as passwd. Let's look at a more ordinary file.

    $ ls -l /bin/who

    $

    who is executable by everybody, and writable by root and the owner's group. What "executable" means is this: when you type

    $ who

    to the shell, it looks in a set of directories, one of which is /bin, for a file named "who." If it finds such a file, and if the file has execute permission, the shell calls the kernel to run it. The kernel checks the permissions, and, if they are valid, runs the program. Note that a program is just a file with execute permission. In the next chapter, we will show you programs that are just text files, but that can be executed as commands because they have execute permission set.
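
The lookup described above can be sketched with a throwaway command. The command name hello_demo and the scratch directory are hypothetical; the point is only that a text file becomes a command once it has execute permission and sits in a directory the shell searches.

```shell
#!/bin/sh
# Sketch: a text file becomes a command once it is executable and on the search path.
tmpdir=$(mktemp -d)            # scratch directory (assumes mktemp is available)
cmd="$tmpdir/hello_demo"       # hypothetical command name

# The "program" is just two lines of text.
printf '%s\n' '#!/bin/sh' 'echo hello from a text file' > "$cmd"

chmod +x "$cmd"                # give it execute permission
PATH="$tmpdir:$PATH"           # add its directory to the shell's search path
export PATH

found=$(command -v hello_demo) # where the shell finds the command
output=$(hello_demo)           # run it by name, like who

echo "found at: $found"
echo "output:   $output"

rm -rf "$tmpdir"
```

Without the chmod +x line the shell would find the file but refuse to run it.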

    Directory permissions operate a little differently, but the basic idea is the same.

    $ ls -ld .

    drwxrwxr-x 3 you             80 Sep 27 06:11 .

    $

    The -d option of ls asks it to tell you about the directory itself, rather than its contents, and the leading d in the output signifies that '.' is indeed a directory. An r field means that you can read the directory, so you can find out what files are in it with ls (or od, for that matter). A w means that you can create and delete files in this directory, because that requires modifying and therefore writing the directory file.

    You cannot simply write in a directory; even root is forbidden to do so.

    $ who >.                          Try to overwrite "."

    .: cannot create              You can't

    $

    Instead, there are system calls that create and remove files, and only through them is it possible to change the contents of a directory. The permissions idea, however, still applies: the w fields tell who can use the system routines to modify the directory.

    Permission to remove a file is independent of the file itself. If you have write permission in a directory, you may remove files there, even files that are protected against writing. The rm command asks for confirmation before removing a protected file, however, to check that you really want to do so; this is one of the rare occasions that a UNIX program double-checks your intentions. (The -f flag to rm forces it to remove files without question.)

    The x field in the permissions on a directory does not mean execution; it means "search." Execute permission on a directory determines whether the directory may be searched for a file. It is therefore possible to create a directory with mode --x for other users, implying that users may access any file that they know about in that directory, but may not run ls on it or read it to see what files are there. Similarly, with directory permissions r--, users can see (ls) but not use the contents of a directory. Some installations use this device to turn off /usr/games during busy hours.

    The chmod (change mode) command changes permissions on files.

    chmod permissions filenames...

    The syntax of the permissions is clumsy, however. They can be specified in two ways, either as octal numbers or by symbolic description. The octal numbers are easier to use, although the symbolic descriptions are sometimes convenient because they can specify relative changes in the permissions. It would be nice if you could say

    $ chmod rw-rw-rw- junk             Doesn't work this way!

    rather than

    $ chmod 666 junk

    but you cannot. The octal modes are specified by adding together a 4 for read, 2 for write, and 1 for execute permission. The three digits specify, as in ls, permissions for the owner, group, and everyone else. The symbolic codes are difficult to explain; you must look in chmod(1) for a proper description. For our purposes, it is sufficient to note that + turns a permission on and that - turns it off. For example,

    $ chmod +x command

    allows everyone to execute the command, and

    $ chmod -w file

    turns off write permission for everyone, including the file's owner. Except for the usual disclaimer about super-users, only the owner of a file may change the permissions on a file, regardless of the permissions themselves. Even if somebody else allows you to write a file, the system will not allow you to change its permission bits.

    $ ls -ld /usr/mary

    drwxrwxrwx 5 mary           704 Sep 25 10:18 /usr/mary

    $ chmod 444 /usr/mary

    chmod: can't change /usr/mary

    $

    If a directory is writable, however, people can remove files in it regardless of the permissions on the files themselves. If you want to make sure that you or your friends never delete files from a directory, remove write permission from it:

    $ cd

    $ date >temp

    $ chmod -w .                      Make directory unwritable

    $ ls -ld .

    dr-xr-xr-x 3 you             80 Sep 27 11:48 .

    $ rm temp

    rm: temp not removed              Can't remove file

    $ chmod 775 .                     Restore permission

    $ ls -ld .

    drwxrwxr-x 3 you             80 Sep 27 11:48 .

    $ rm temp

    $                                 Now you can

    temp is now gone. Notice that changing the permissions on the directory didn't change its modification date. The modification date reflects changes to the file's contents, not its modes. The permissions and dates are not stored in the file itself, but in a system structure called an index node, or i-node, the subject of the next section.

    Exercise 2-5. Experiment with chmod. Try different simple modes, like 0 and 1. Be careful not to damage your login directory!
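
As an aid for the exercise, the octal arithmetic (4 for read, 2 for write, 1 for execute, one digit each for owner, group, and others) can be sketched as a small shell function. The name mode_to_octal is a hypothetical helper, not a standard command; it simply turns the nine characters that ls prints into the three digits that chmod accepts.

```shell
#!/bin/sh
# Sketch: convert a nine-character permission string such as rw-rw-rw- to octal.
mode_to_octal() {              # hypothetical helper, not a standard command
    perms=$1
    octal=""
    while [ -n "$perms" ]; do
        triple=$(printf '%s' "$perms" | cut -c1-3)   # next r/w/x group
        perms=$(printf '%s' "$perms" | cut -c4-)     # remaining groups
        digit=0
        case $triple in r??) digit=$((digit + 4)) ;; esac   # 4 for read
        case $triple in ?w?) digit=$((digit + 2)) ;; esac   # 2 for write
        case $triple in ??x) digit=$((digit + 1)) ;; esac   # 1 for execute
        octal="$octal$digit"
    done
    printf '%s\n' "$octal"
}

mode_to_octal rw-rw-rw-        # prints 666
mode_to_octal rwxrwxr-x        # prints 775
mode_to_octal r--r--r--        # prints 444
```

With such a helper, chmod "$(mode_to_octal rw-rw-rw-)" junk would have the same effect as chmod 666 junk.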

    Inodes

    A file has several components: a name, contents, and administrative information such as permissions and modification times. The administrative information is stored in the inode (over the years, the hyphen fell out of "i-node"), along with essential system data such as how long the file is, where on the disc its contents are stored, and so on.

    There are three times in the inode: the time that the contents of the file were last modified (written); the time that the file was last used (read or executed); and the time that the inode itself was last changed, for example, to set the permissions.

    $ date

    Tue Sep 27 12:07:24 EDT 1983

    $ date >junk

    $ ls -l junk

    -rw-rw-rw- 1 you             29 Sep 27 12:07 junk

    $ ls -lu junk

    -rw-rw-rw- 1 you             29 Sep 27 06:11 junk

    $ ls -lc junk

    -rw-rw-rw- 1 you             29 Sep 27 12:07 junk

    $

    Changing the contents of a file does not affect its usage time, as reported by ls -lu, and changing the permissions affects only the inode change time, as reported by ls -lc.

    $ chmod 444 junk

    $ ls -lu junk

    -r--r--r-- 1 you          29 Sep 27 06:11 junk

    $ ls -lc junk

    -r--r--r-- 1 you         29 Sep 27 12:11 junk

    $ chmod 666 junk

    $
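
The same behaviour can be checked numerically. This is a sketch that assumes GNU stat, as found on most Linux systems, where -c %Y prints the modification time and -c %Z the inode change time in seconds since the epoch; other systems spell these options differently. The scratch directory is hypothetical.

```shell
#!/bin/sh
# Sketch (assumes GNU stat): chmod updates the inode change time only.
tmpdir=$(mktemp -d)            # scratch directory (assumes mktemp is available)
f="$tmpdir/junk"
date > "$f"

mtime_before=$(stat -c %Y "$f")   # contents last modified
ctime_before=$(stat -c %Z "$f")   # inode last changed

sleep 1                        # make sure the clock moves on
chmod 444 "$f"                 # changes the inode, not the contents

mtime_after=$(stat -c %Y "$f")
ctime_after=$(stat -c %Z "$f")

echo "modified:      $mtime_before -> $mtime_after"
echo "inode changed: $ctime_before -> $ctime_after"

rm -rf "$tmpdir"
```

The modification time stays put while the inode change time advances, matching what ls -l and ls -lc report.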

    The -t option to ls, which sorts the files according to time, by default that of the last modification, can be combined with -c or -u to report the order in which inodes were changed or files were read:

    $ ls recipes

    cookie

    pie

    $ ls -lut

    total 2

    drwxrwxrwx 4 you             64 Sep 27 12:11 recipes

    -rw-rw-rw- 1 you                 29 Sep 27 06:11 junk

    recipes is the most recently used, because we just looked at its contents.

    It is important to understand inodes, not only to appreciate the options on ls, but because in a strong sense the inodes are the files. All the directory hierarchy does is provide convenient names for files. The system's internal name for a file is its i-number: the number of the inode holding the file's information. ls -i reports the i-number in decimal:

    $ date >x

    $ ls -i

    15768 junk

    15274 recipes

    15852 x

    It is the i-number that is stored in the first two bytes of a directory, before the name. od -d will dump the data in decimal by byte pairs rather than octal by bytes and thus make the i-number visible.

    $ od -c .

    0000000   4   ;   .  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0

    0000020 273   (   .   .  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0

    0000040 252   ;   r   e   c   i   p   e   s  \0  \0  \0  \0  \0  \0  \0

    0000060 230   =   j   u   n   k  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0

    0000100 354   =   x  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0  \0

    0000120

    $ od -d .

    0000000  15156  00046  00000  00000  00000  00000  00000  00000

    0000020  10427  11822  00000  00000  00000  00000  00000  00000

    0000040  15274  25970  26979  25968  00115  00000  00000  00000

    0000060  15768  30058  27502  00000  00000  00000  00000  00000

    0000100  15852  00120  00000  00000  00000  00000  00000  00000

    0000120

    $
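
The two dumps show the same bytes, so the decimal i-number can be worked out by hand from the od -c line: the first byte is the low-order half and the second byte the high-order half, as the figures in these dumps imply. A minimal sketch of that arithmetic, using a hypothetical helper named inumber:

```shell
#!/bin/sh
# Sketch: combine the two bytes shown by od -c into the number shown by od -d.
inumber() {                    # hypothetical helper: inumber LOW HIGH
    echo $(( $1 + $2 * 256 ))
}

# Entry for "..": first byte is octal 273, second is the character "(".
low=$(( 0273 ))                # a leading 0 means octal in shell arithmetic
high=$(printf '%d' "'(")       # numeric code of the character (
inumber "$low" "$high"         # prints 10427

# Entry for junk: octal 230 followed by "=".
inumber $(( 0230 )) "$(printf '%d' "'=")"    # prints 15768
```

Octal 273 is 187 and "(" is 40, so 187 + 40 × 256 = 10427, which is the first number on the second line of the od -d output.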

    The first two bytes in each directory entry are the only connection between the name of a file and its contents. A filename in a directory is therefore called a link, because it links a name in the directory hierarchy to the inode, and hence to the data. The same i-number can appear in more than one directory. The rm command does not remove inodes; it removes directory entries or links. Only when the last link to a file disappears does the system remove the inode, and hence the file itself.

    If the i-number in a directory entry is zero, it means that the link has been removed, but not necessarily the contents of the file; there may still be a link somewhere else. You can verify that the i-number goes to zero by removing the file:

    $ rm x

    $ od -d .

    0000000         15156   00046   00000  00000   00000   00000   00000   00000

    0000020         10427  11822  00000  00000   00000  00000   00000  00000

    0000040         15274  25970  26979  25968  00115  00000    00000  00000

    0000060         15768   30058  27502   00000  00000  00000   00000   00000

    0000100         00000   00120   00000   00000  00000   00000   00000  00000

    0000120

    $

    The next file created in this directory will go into the unused slot, although it will probably have a different i-number.

    The ln command makes a link to an existing file, with the syntax

    $ ln old-file new-file

    The purpose of a link is to give two names to the same file, often so it can appear in two different directories. On many systems, there is a link to /bin/ed called /bin/e, so that people can call the editor e. Two links to a file point to the same inode, and hence have the same i-number:

    $ ln junk linktojunk

    $ ls -li

    total 3

    15768 -rw-rw-rw- 2 you             29 Sep 27 12:07 junk

    15768 -rw-rw-rw- 2 you             29 Sep 27 12:07 linktojunk

    15274 drwxrwxrwx 4 you         64 Sep 27 09:34 recipes

    The integer printed between the permissions and the owner is the number of links to the file. Because each link just points to the inode, each link is equally important; there is no difference between the first link and subsequent ones. (Notice that the total disc space computed by ls is wrong because of double counting.)

    When you change a file, access to the file by any of its names will reveal the changes, since all the links point to the same file.

    $ echo x > junk

    $ ls -l

    total 3

    -rw-rw-rw- 2 you              2 Sep 27 12:37 junk

    -rw-rw-rw- 2 you              2 Sep 27 12:37 linktojunk

    drwxrwxrwx 4 you             64 Sep 27 09:34 recipes

    $ rm linktojunk

    $ ls -l

    total 2

    -rw-rw-rw- 1 you              2 Sep 27 12:37 junk

    drwxrwxrwx 4 you             64 Sep 27 09:34 recipes

    $

    After linktojunk is removed the link count goes back to one. As we said before, rm'ing a file just breaks a link; the file remains until the last link is removed. In practice, of course, most files only have one link, but again we see a simple idea providing great flexibility.
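
The whole life cycle of a link can be replayed in a scratch directory. This is a sketch that reuses the names junk and linktojunk from the session; the scratch directory and the variable names are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: two links share one inode; removing one leaves the file intact.
tmpdir=$(mktemp -d)            # scratch directory (assumes mktemp is available)
cd "$tmpdir" || exit 1

date > junk
ln junk linktojunk             # second name for the same inode

# ls -i prints the i-number first; ls -l prints the link count second.
ino_junk=$(ls -i junk | awk '{print $1}')
ino_link=$(ls -i linktojunk | awk '{print $1}')
links_two=$(ls -l junk | awk '{print $2}')

echo x > junk                  # change the file through one name...
via_link=$(cat linktojunk)     # ...and read it back through the other

rm linktojunk                  # removes one link, not the file
links_one=$(ls -l junk | awk '{print $2}')

echo "i-numbers: $ino_junk $ino_link"
echo "link count: $links_two, then $links_one"
echo "read through link: $via_link"

cd / && rm -rf "$tmpdir"
```

Both names report the same i-number, the count drops from 2 to 1 after rm, and junk is still there afterwards.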

    A word to the hasty: once the last link to a file is gone, the data is irretrievable. Deleted files go into the incinerator, rather than the waste basket, and there is no way to call them back from the ashes. (There is a faint hope of resurrection. Most large UNIX systems have a formal backup procedure that periodically copies changed files to some safe place like magnetic tape, from which they can be retrieved. For your protection and peace of mind, you should know just how much backup is provided on your system. If there is none, watch out; some mishap to the discs could be a catastrophe.)

    Links to files are handy when two people wish to share a file, but sometimes you really want a separate copy: a different file with the same information. You might copy a document before making extensive changes to it, for example, so you can restore the original if you decide you don't like the changes. Making a link wouldn't help, because when the data changed, both links would reflect the change. cp makes copies of files:

    $ cp junk copyofjunk

    $ ls -li

    total 3

    15850 -rw-rw-rw- 1 you              2 Sep 27 13:13 copyofjunk

    15768 -rw-rw-rw- 1 you              2 Sep 27 12:37 junk

    15274 drwxrwxrwx 4 you                64 Sep 27 09:34 recipes

    $

    The i-numbers of junk and copyofjunk are different, because they are different files, even though they currently have the same contents. It's often a good idea to change the permissions on a backup copy so it's harder to remove it accidentally.

    $ chmod -w copyofjunk              Turn off write permission

    $ ls -li

    total 3

    15850 -r--r--r-- 1 you              2 Sep 27 13:13 copyofjunk

    15768 -rw-rw-rw- 1 you              2 Sep 27 12:37 junk

    15274 drwxrwxrwx 4 you             64 Sep 27 09:34 recipes

    $ rm copyofjunk

    rm: copyofjunk 444 mode n          No! It's precious

    $ date >junk

    $ ls -li

    total 3

    15850 -r--r--r-- 1 you              2 Sep 27 13:13 copyofjunk

    15768 -rw-rw-rw- 1 you             29 Sep 27 13:16 junk

    15274 drwxrwxrwx 4 you             64 Sep 27 09:34 recipes

    $ rm copyofjunk

    rm: copyofjunk 444 mode y          Well, maybe not so precious

    $ ls -li

    total 2

    15768 -rw-rw-rw- 1 you              29 Sep 27 13:16 junk

    15274 drwxrwxrwx 4 you         64 Sep 27 09:34 recipes

    $
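
The difference between a copy and a link can be confirmed with a short script. This is a sketch in a scratch directory; the variable names are hypothetical, and the exact mode string of the copy depends on your umask, so the script only reports it.

```shell
#!/bin/sh
# Sketch: a copy has its own inode and does not follow changes to the original.
tmpdir=$(mktemp -d)            # scratch directory (assumes mktemp is available)
cd "$tmpdir" || exit 1

echo x > junk
cp junk copyofjunk             # a separate file with the same contents

ino_junk=$(ls -i junk | awk '{print $1}')
ino_copy=$(ls -i copyofjunk | awk '{print $1}')

chmod -w copyofjunk            # guard the backup against accidents
copy_mode=$(ls -l copyofjunk | cut -c1-10)

date > junk                    # change the original...
copy_contents=$(cat copyofjunk)   # ...the copy keeps the old contents

echo "i-numbers: $ino_junk $ino_copy"
echo "copy mode: $copy_mode"
echo "copy still contains: $copy_contents"

cd / && rm -rf "$tmpdir"
```

The two i-numbers differ, and the copy still holds x after junk has been overwritten.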

