In this chapter, we learn more about the Unix facilities for manipulating files and directories.
So far, we have seen how to create files but the only file-related command
we have seen is
ls
which displays our list of files.
$ ls
date
dateoutput
trout
wcout
$
These files were created in chapter two.
The next command we learn is the
rm
command which removes files.
For example:
$ rm dateoutput trout
$ ls
date
wcout
$
The most important point about this is that Unix assumes we know exactly
what we are doing and that we will not make any mistakes.
The
rm
command
does not ask us to confirm that we really do want to delete the two files.
There is no way for a file to be brought back into existence after it has been deleted. This is because Unix is a multi-user system - the disk is shared between all the users and processes, so the space on disk that the deleted file occupied will have been re-used before we could try to get it back. Of course, people do make mistakes so it is up to us to make sure that we have copies of precious files before we change or delete them.
Notice that we can put as many file names as we wish after the
rm
itself,
and that
rm
does not display any output in the normal course of events.
If we forget to put any file names after
rm, we get this error message:
$ rm
usage: rm [-rif] file ...
$
For now, just accept that
usage:
followed by a command name
says that we have typed a command wrongly.
When we have learned how to read the Unix manual pages, we
will be able to understand the
usage
messages completely.
If we try to remove a non-existent file,
rm
grumbles:
$ rm nonsuch
rm: nonsuch: No such file or directory
$
Don't worry about what
or directory
means; we will see later in the chapter.
To copy an existing file we use the
cp
command:
$ cp wcout newfile
$ ls
date
newfile
wcout
$ cp wcout newfile
$
In the
cp
command, the name of the existing file comes straight after
cp
and the name the copy will have comes last.
It may help you to think of the command as:
copy from to.
Notice that in the example, the copy was done twice - once
when
newfile
did not exist, and once when it did.
Unix treated the two attempts exactly the same.
In other words, Unix assumes you know what you are doing -
if you want to make a copy with the name of an existing file
you can do, but the existing file will be overwritten and its
contents lost without any warning.
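Because cp overwrites without warning, one cautious habit is to check whether the destination name already exists before copying. The following is only a minimal sketch, not part of the session above: it works in a scratch directory created with mktemp, the file contents are invented for illustration, and the variable names (start, work, status, kept) are our own.

```shell
#!/bin/sh
# Sketch: refuse to copy over an existing file.
# Everything happens in a scratch directory, so no real files are touched.
start=$(pwd)
work=$(mktemp -d)
cd "$work" || exit 1

echo "original contents" > wcout
echo "precious contents" > newfile

# -e is true if the name already exists in the current directory.
if [ -e newfile ]; then
    status="skipped"        # leave the existing file alone
else
    cp wcout newfile
    status="copied"
fi
echo "cp was $status"

kept=$(cat newfile)         # still the precious contents
cd "$start" && rm -rf "$work"
```

Many versions of cp and mv also accept an -i option that asks before overwriting, in the same way as the -i shown in the usage message for rm; check the manual page on your own system.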
So far all our file names have been very simple - they just
consisted of lower case letters.
In fact Unix is very flexible about what is allowed in file names
and the length of file names.
We can include upper case letters (capitals), digits and most punctuation
symbols, including dot (.), minus (-) and underscore (_).
Some operating systems insist that file names are in two parts
separated by a dot and where the second part is two or three characters long.
For example:
name.ext.
Unix lets us have names like that if we wish but does not insist.
The following are all valid file names:
1
A
a1
A1
aNotInconsiderablyLongName
name.e
name.extension
name.e.extension
---..._-_._
.name
However,
a1
and
A1
refer to different files.
Also, file names that begin with a dot are treated slightly differently.
We can rename a file using the
mv
command:
$ mv newfile abettername
$ ls
abettername
date
wcout
$
As you may by now expect, Unix leaves it to us to ensure that
abettername
does not exist; if it does, Unix will treat it as unwanted and
remove it.
If we wish to see a list of any files whose names contain certain
characters, we can use
ls
like this:
$ ls a*
abettername
$
The asterisk
(*)
means any string of characters, so
a*
means any string of characters beginning with
a
and the command gave us a list of files to suit.
We can use the asterisk anywhere in a file name as often as we wish:
$ ls *te*
abettername
date
$
Notice that it matched no characters after
te
in
date.
Asterisk on its own matches all file names except those whose names
begin with a dot.
The facility is known as file name generation because Unix generates the names of files whose names match a pattern. Asterisk is not the only character with a special meaning.
The question mark character matches any single character in a file name. For example:
$ cp date data
$ ls dat?
data
date
$ ls ???e
date
$
The second part of the example shows the names of all files
with four characters in the name and ending with
e.
The remaining file name generation facility allows us to specify one character from a list of possibilities. For example:
$ ls dat[ae]
data
date
$ ls [a-c]*
abettername
$
The
[ae]
matches an
a
or an
e.
The
[a-c]
matches one letter between
a
and
c.
As you can see from the second part of the example,
the file name generation facilities can be used
in conjunction with each other.
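A quick way to experiment with file name generation is to let echo show the names a pattern produces. The sketch below is an illustration under assumptions: it recreates the file names from the examples (plus one invented hidden file, .hidden) in a scratch directory made with mktemp, and the variable names are our own.

```shell
#!/bin/sh
# Sketch: see which names each pattern generates.
start=$(pwd)
work=$(mktemp -d)
cd "$work" || exit 1

# Recreate the file names used in the examples (contents do not matter).
for name in abettername data date wcout .hidden; do
    : > "$name"
done

# The shell replaces each pattern with the matching names before echo runs.
star=$(echo *te*)        # any name containing "te"
quest=$(echo dat?)       # "dat" followed by exactly one character
class=$(echo [a-c]*)     # names starting with a, b or c
all=$(echo *)            # every name except those starting with a dot

echo "$star"
echo "$quest"
echo "$class"
echo "$all"
cd "$start" && rm -rf "$work"
```

Trying a pattern with echo first is a harmless way to check what a command such as rm would be given.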
If you have used computers before you will be familiar with the idea of
directories.
A directory
feels
like a part of the disk where your files are stored.
We talk about
moving
from one directory to another.
On some systems, this is indeed the case.
In Unix, however,
a directory is very simple and elegant - it is just a list of files
and it is held
in a file
called
. (a single dot).
When Unix refers to an existing file, it looks in the current list or
directory for the file name;
when Unix creates a new file, it adds the file name to the current
list or directory;
when Unix says that a file is not found, it means that it is not
in the current list or directory.
The list actually contains a pair of items for each file in the list:
the first item is the name of the file and the second is the address
of the file on the disk.
The
ls
command simply displays the file names in the current list, leaving out
the addresses.
We will now drop the expression "list or directory" and simply use
"directory" but remember that in Unix a directory is just a file used
for the special purpose of holding a list of
files.
Now that we have seen there is little difference between files and
directories, it should be obvious that some Unix commands can
operate on files and directories.
That is why, when
rm
grumbles about non-existent files, it says:
No such file or directory.
In Unix the user always has a current or working directory; you can leave it to go to another, but you can't just leave it. You always have a directory which is your current, working directory.
Every user has a special directory called their home directory; it becomes the working directory when they log into the system. Each user's home directory belongs exclusively to that one user. This is why we could get through three chapters of this book without knowing about directories!
You may ask: if there is a file called
., why didn't it appear when we used
ls?
The answer is that files whose names begin with a dot are called
hidden
files and Unix does not normally display them, to
avoid clutter in lists of files.
Most of the time, users do not wish to refer to hidden files.
We can see the hidden files using the following version of
ls:
$ ls -a
.
..
abettername
date
wcout
$
The
-a
is known as an
option
.
Many commands have options;
we will study them in more detail later.
As you see, as well as the file called
. (dot), which holds the directory's list of files, there is also one called
.. (dot dot).
We will see what it is later.
Every directory has a name which can be displayed by using the
pwd
command.
Just after logging onto my system, I see this when I use the command:
$ pwd
/homedir/cms/ps
$
It looks as if my directory is called
/homedir/cms/ps.
When file or directory names have a slash
(/)
in them, they are known as
path-names.
A path-name consists of the name of a file or directory
preceded by a slash and one or more directory names separated by slashes.
There are two kinds of path-names:
full
path-names and
relative
path-names.
Full path-names begin with a slash;
relative path-names do not.
We can now see that
pwd
displays the full path-name of the working directory which, just after logging in, is the user's home directory.
Path-names are used to access files that are outside the current
directory;
they tell Unix how to navigate from one directory to another
to find a particular file.
My home directory is called
ps
and it is in a directory called
cms;
cms
is in a directory called
homedir
which is in a directory called
/.
The directory called
/
is known as the
root
directory;
it is the first one created on every Unix system and all other
directories are created in it or in a directory
already linked, directly or indirectly, into root.
So, to find my home directory, we begin at the root, from there
we branch to
homedir, from there we branch to
cms
and from there we branch to
ps.
If we had many sheets of paper we would not keep them in one big pile. We could put related sheets together in one document wallet. If we had many document wallets we could put them in a filing cabinet. Creating directories is like starting a new document wallet or a new filing cabinet drawer - but much quicker. Instead of filling in a requisition form and waiting for ages, we simply issue this command:
$ mkdir red
$
Where are directories created? Since they are simply special purpose files, the answer is the same as if we had asked: where are files created? In other words, in the current directory. Let's have a look:
$ ls
abettername
date
red
wcout
$
As you can see, we have a new file called
red
; it is our new directory.
Obviously we can't give a new directory the same name as an existing file or directory:
$ mkdir abettername
mkdir: abettername: File exists
$
Unix complains if we try.
When we log in, we begin using our home directory.
We can change to another with the
cd
command:
$ cd red
$ pwd
/homedir/cms/ps/red
$ ls
$
As you see, the new directory's name has been added to
the full path-name displayed by
pwd; also the new directory appears to be empty.
If we create some files, they will be created in this new directory:
$ date > flag
$ date > blood
$ ls
blood
flag
$
We could create files that weren't red in this directory too!
If we want to get back to our home directory, we could do this:
$ cd /homedir/cms/ps
$
but it is a rather long command to use over and over again.
Since the purpose of computers is to make
life easier, the authors of Unix gave
cd
the default behaviour of changing to the user's home directory.
So all we have to do to get back is:
$ cd
$ pwd
/homedir/cms/ps
$
That is
cd
without a parameter.
We often talk about a directory
hierarchy.
When we changed directory from
red
to the home directory, we might have referred to going up a level.
The
cd red
command could be said to take us down a level.
The directory above a directory is known as the directory's
parent.
Since a directory has only one parent, Unix uses
the hidden file
..
to refer to it.
Therefore, in any directory,
cd ..
can be used to go up one level in the directory hierarchy.
For example:
$ pwd
/homedir/cms/ps
$ cd ..
$ pwd
/homedir/cms
$
We can go up more than one level at once:
$ pwd
/homedir/cms
$ cd ../..
$ pwd
/
$ cd homedir/cms/ps
$ pwd
/homedir/cms/ps
$
Notice we used relative path-names in the
cd
commands.
The
../..
and
homedir/cms/ps
were not full path-names because they did not begin with a slash.
They are relative path-names - they tell us how to get to
a file starting from the
current
directory whereas full path-names tell us how to get to
a file starting from the
root
directory.
So
cd ../..
means from the current directory, go to the parent and then to its parent;
cd homedir/cms/ps
means look for
homedir
in the current directory, and from there, go to
cms
and from there to
ps.
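The difference between the two kinds of path-name can be checked by reaching the same directory both ways. This is a sketch under assumptions: the directories red and green are recreated inside a scratch directory made with mktemp (which we assume returns a full path-name, as it usually does), and the variable names are our own.

```shell
#!/bin/sh
# Sketch: the same directory reached by a full and by a relative path-name.
start=$(pwd)
work=$(mktemp -d)
mkdir "$work/red" "$work/green"

cd "$work/red" || exit 1       # full path-name: begins with a slash
here_full=$(pwd)

cd ../green || exit 1          # relative: up to the parent, then down to green
sideways=$(basename "$(pwd)")  # last component of the working directory

cd ../red || exit 1            # relative again, back to where we were
here_rel=$(pwd)

echo "$here_full"
echo "$here_rel"
cd "$start" && rm -rf "$work"
```

Both echo commands print the same full path-name, because a relative path-name is simply followed from wherever we happen to be.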
We can rename a directory using the same command as we use for files. For example:
$ pwd
/homedir/cms/ps
$ ls
abettername
date
red
wcout
$ mv red rouge
$ ls
abettername
date
rouge
wcout
$ mv rouge red
$
In the previous section we had to rely on our memory to know that
red
was a directory.
The
-l
option of the
ls
command can be used to show more information about files.
For example:
$ ls -l
total 10
-rw------- 1 cmsps cms 25 Jul 10 10:48 abettername
-rw------- 1 cmsps cms 29 Jul 10 14:40 date
drwx------ 2 cmsps cms 512 Jul 10 16:05 red
-rw------- 1 cmsps cms 25 Jun 21 10:19 wcout
$
As you can see, the files are still displayed one per line and the
file name is placed at the end of each line.
Several columns of information about the files are displayed using this option.
For now the only information we are concerned with is the first
character on the line; a
d
tells us the file is a directory;
a
-
tells us the file is not a directory.
We will come back to the other columns later.
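The same question, "is this name a directory?", can also be asked from a shell script. The sketch below is an illustration rather than anything from the session above: it uses the shell's -d test, works in a scratch directory made with mktemp, and the variable names are our own.

```shell
#!/bin/sh
# Sketch: report which names in the current directory are directories.
start=$(pwd)
work=$(mktemp -d)
cd "$work" || exit 1
: > date
mkdir red

dirs=""
for name in *; do
    if [ -d "$name" ]; then       # true for the names ls -l marks with a d
        dirs="$dirs $name"
    fi
done
echo "directories:$dirs"

# The -d option makes ls describe the directory itself, not its contents.
first=$(ls -ld red | cut -c1)     # first character of the ls -l line
echo "$first"
cd "$start" && rm -rf "$work"
```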
So far,
mv
has been used for renaming files and directories.
It can also be used to move files into another directory.
Here we create some files and then move them into the
red
directory:
$ date > cherry
$ date > rose
$ mv cherry rose red
$ ls
abettername
date
red
wcout
$ cd red
$ ls
blood
cherry
flag
rose
$
When the last name in the
mv
command is that of an existing directory,
mv
moves the preceding files into the directory.
The
mv
command can be used from within a directory:
$ mv cherry rose ..
$ mv ../cherry ../rose .
$
The first
mv
moves the files to the parent directory;
the second moves the files in the parent directory back to the current
directory.
Notice that, in the second
mv,
using a path-name allows us to refer to files outside the current directory.
Notice also how using
.
and
..
saves us having to type the names of the directories.
This facility is very important so let's spell it out: any Unix command that works on named files can be made to refer to files outside the current directory simply by specifying a path-name rather than a file name. With a plain file-name, Unix assumes the file is in the current directory; a path-name tells Unix where to find (or place) the file outside the current directory.
It is possible to move a file and rename it in one operation:
$ mv blood ../blueblood
$
Unix would delete
blueblood
in the parent directory if it already existed.
It is easy to refer to a directory at the same level as the current one. First we need some more directories to play with:
$ pwd
/homedir/cms/ps
$ mkdir orange yellow green
$ cd red
$ cd ../green
$ cp ../red/flag .
$ ls
flag
$
The path-name
../green
is noteworthy, because it changes direction:
it goes
up
to the parent before going
down
to another directory.
The up and down path-name makes
cd
step sideways from
red
to
green.
Notice how an up and down path-name makes it easy to copy
flag
from the
red
directory to the current one.
So far, we have used
ls
without parameters but it is possible to supply the names or path-names
of directories and see what they contain.
For example:
$ pwd
/homedir/cms/ps/green
$ ls .. ../red ../date
../date

..:
abettername
date
green
orange
red
wcout
yellow

../red:
blood
cherry
flag
rose
$
When given a mixture of files and directories,
ls
does the files first and the directories last.
The files simply have their names displayed but the directories
have their contents shown under a one line heading ending with
a colon
(:)
.
All the output from
ls
is ordered alphabetically.
We can be as organised in life as we can be bothered to be; it's just the same with Unix directories - we can have directories in directories in directories almost to whatever depth we like. This example shows several directories being created.
$ mkdir red/plant red/plant/fruit red/plant/fruit/soft
$ cd red/plant/fruit/soft
$ pwd
/homedir/cms/ps/red/plant/fruit/soft
$
Obviously, the order of the directory names is important;
we can't make the
fruit
directory inside
red/plant
until we have created
red/plant
itself.
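The importance of the order can be seen by asking for a nested directory before its parent exists. The sketch below is an illustration under assumptions: it runs in a scratch directory made with mktemp, hides the error message by redirecting it to /dev/null, and also shows the -p option of mkdir, which is not used elsewhere in this chapter.

```shell
#!/bin/sh
# Sketch: mkdir needs the parent directory to exist first.
start=$(pwd)
work=$(mktemp -d)
cd "$work" || exit 1

# red does not exist yet, so red/plant cannot be made.
if mkdir red/plant 2>/dev/null; then
    early="made"
else
    early="refused"
fi

# With -p, mkdir creates any missing parents itself.
mkdir -p red/plant/fruit/soft
cd red/plant/fruit/soft || exit 1
leaf=$(basename "$(pwd)")     # last component of the working directory

echo "first attempt: $early"
echo "now in: $leaf"
cd "$start" && rm -rf "$work"
```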
The
flag
file was copied into the
green
directory in an earlier example.
The trouble with copies is they often end up different
to each other and every one takes up space on the disk.
If we really want the same file in more than one directory, Unix
lets us do that safely and economically.
Here is how:
$ pwd
/homedir/cms/ps/green
$ mv flag flag.copy
$ ln ../red/flag .
$ ls -l
total 4
-rw------- 2 cmsps cms 29 Jul 10 16:05 flag
-rw------- 1 cmsps cms 29 Jul 14 13:44 flag.copy
$
The copy of flag was renamed and then
ln
was used to link the existing file
../red/flag
into the current directory.
When we list the directory, we see two files.
The second is a copy; if we modified it, the version in
red
would be unchanged.
The first is the original file; it is now
equally
in both the
red
and
green
directories; we can't change one version and leave the other alone.
Lecturers often allow students to link to files belonging to the lecturer to avoid wasting space and so that the lecturer can change everyone's file at once.
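The difference between a link and a copy shows up as soon as one of the names is used to change the file. The following is a minimal sketch, assuming a scratch directory made with mktemp and invented file contents; the variable names are our own, and awk is used only to pick out the second column of the ls -l output.

```shell
#!/bin/sh
# Sketch: a link shares changes with the original, a copy does not.
start=$(pwd)
work=$(mktemp -d)
mkdir "$work/red" "$work/green"
cd "$work/green" || exit 1

echo "first line" > ../red/flag
cp ../red/flag flag.copy      # an independent copy
ln ../red/flag .              # the same file, linked into this directory

echo "added later" >> flag    # change the file through its green name

red_lines=$(wc -l < ../red/flag | tr -d ' ')    # the red name sees the change
copy_lines=$(wc -l < flag.copy | tr -d ' ')     # the copy does not
links=$(ls -l flag | awk '{print $2}')          # number of links

echo "red/flag has $red_lines lines, flag.copy has $copy_lines"
echo "flag has $links links"
cd "$start" && rm -rf "$work"
```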
Now is a good time to look at the meaning of the extra columns of
output from
ls -l.
This diagram shows the output with column headings:
directory
|permissions
||         number of links
||         | owner
||         | |     owner's group
||         | |     |   size in characters
||         | |     |   |  modification date/time
||         | |     |   |  |            file name
||         | |     |   |  |            |
-rw------- 2 cmsps cms 29 Jul 10 16:05 flag
-rw------- 1 cmsps cms 29 Jul 14 13:44 flag.copy
The number of links tells us how many directories a file
occurs in;
it is always greater than one for directories because the
file called
.
is linked into the parent directory under another name.
The owner is shown by the login code and a similar code shows the group the file's owner belongs to. Thus we can see that the two files shown are the same size but they were created at different times and one of them is in another directory as well as this one. We will learn more about permissions in a later section.
Removing directories is nearly as easy as removing files. If the directory is empty you can do this:
$ pwd
/homedir/cms/ps
$ rmdir orange yellow
$
Unix grumbles if you try to remove a directory which is not empty. For example:
$ rmdir red/plant
rmdir: red/plant: Directory not empty
$
If you are sure you do not need any of the files or directories,
you can use the
-r
option of
rm
to do the job and any files or directories contained in the directory
will be deleted as well:
$ rm -r red/plant
$
Ordinarily,
rm
complains if you try to make it delete a directory:
$ rm red
rm: red is a directory
$
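The three behaviours just described can be tried safely away from real files. This is only a sketch: it builds a small tree in a scratch directory made with mktemp, hides the error message from rmdir, and uses variable names of our own.

```shell
#!/bin/sh
# Sketch: rmdir removes only empty directories; rm -r removes whole trees.
start=$(pwd)
work=$(mktemp -d)
cd "$work" || exit 1
mkdir red red/plant red/plant/fruit orange
: > red/plant/fruit/cherry

rmdir orange                            # empty, so this succeeds
if rmdir red/plant 2>/dev/null; then    # not empty, so this is refused
    plain="removed"
else
    plain="refused"
fi

rm -r red/plant                         # removes plant and everything below it
if [ -e red/plant ]; then after="still there"; else after="gone"; fi
if [ -d red ]; then parent="kept"; else parent="lost"; fi

echo "rmdir: $plain, after rm -r: $after, red: $parent"
cd "$start" && rm -rf "$work"
```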
Sometimes we know we have a file but can't remember which directory it is in:
find
can help here, for
example:
$ find . -name valuable -print
./student/valuable
$
In the example,
find
is told to look in the working directory and its
subdirectories for a file whose name is
valuable.
(Notice that the first parameter in the example is dot; it could have been
any other directory.)
When
find
finds the requested file, it is made to display the file's path-name, which begins with the directory we told it to search.
The
-name
and
-print
are options but they are unusual in two ways.
They don't come straight after the command name and they do not consist
of a single letter.
find
can work with
the size, file type, date of last change, permissions and so on, as well
as the name of the file.
It is not restricted to showing the path-name of the files it finds.
For instance, files could be moved or deleted.
Also, it can search a list of directories -- not just one as in the example.
You have to consult its manual entry to see all it is capable of.
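Two of these abilities, searching several directories and selecting by file type, can be sketched briefly. The example is an illustration under assumptions: the directories student and staff and their contents are invented, everything is created in a scratch directory made with mktemp, and only the -name, -type and -print options are used.

```shell
#!/bin/sh
# Sketch: find with more than one starting directory, and with a file type.
start=$(pwd)
work=$(mktemp -d)
cd "$work" || exit 1
mkdir student staff
: > student/valuable
: > staff/valuable
mkdir staff/cat               # a directory called cat, not an ordinary file

# Search two starting directories at once, by name.
count=$(find student staff -name valuable -print | wc -l | tr -d ' ')

# -type d selects directories only; -type f selects ordinary files only.
dirs_named_cat=$(find . -type d -name cat -print)
files_named_cat=$(find . -type f -name cat -print)

echo "found $count files called valuable"
echo "directories called cat: $dirs_named_cat"
cd "$start" && rm -rf "$work"
```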
It is handy to get an indication of what is in a file without looking at
the contents with
more
or
vi
; the
file
command does this:
$ file mystery
mystery: English text
$
It is not always perfectly correct, but it usually is. A particularly useful command is:
file *
which tells you about all the files in your current directory.
Unix is a multi-user system and it is possible to share files
but first, we need to know about file permissions.
They are shown in the ten characters at the start of each line by
ls -l
.
The first character tells us if the file is a directory;
the other nine are in three groups of three telling us what
the user (us), the user's group (the user's colleagues) and others
(anyone else using the system)
have permission to do to the file.
For example:
directory
|user permission
||  group permission
||  |  other permission
||  |  |
-rwxr-x--x
People may have permission to read, write or execute the file;
these are shown as
r,
w
and
x.
Lack of permission is shown as a minus sign
(-).
Thus we can see the permission above allows the user
to read, write and execute the file;
his or her group to read and execute it;
and others just to execute it.
Read, write and execute have different meanings depending on whether we are talking about files or directories.
Read permission allows people who have it for a file, to copy the file or
use it as the input to commands such as
more.
Write permission allows people to modify the file or replace it.
Execute permission allows people to run the program held in the file.
With directories, read permission allows people to use
ls
to list the contents of the directory.
Write permission allows people to create and delete files in the directory.
Execute permission allows people to
cd
to the directory and refer to files in or below it.
To read someone else's file:
we have to know its path-name;
if we are in the same group as the file's owner, they must have given group read permission; if we are not in the same group as the owner, they must have given read permission for others;
the owner must have made all the directories in the path-name executable by others or by the group as appropriate.
Here we see a file belonging to a stranger being read:
$ ls -l /homedir/urs/tc/public_html/index.shtml
-rwxr-xr-x 1 urstc urs 331 Dec 8 1994 /h ...
$ more /homedir/urs/tc/public_html/index.shtml
<HTML>
...
$
Note: the output from
ls
has been shortened to make it fit on the page;
only the first line of output from
more
has been shown to save space.
There are two ways of using the
chmod
command to alter the permissions (or mode, as it is sometimes called)
of a file.
The method we cover here uses octal (base eight) numbers;
the other does not.
Unfortunately, the other method is not completely comprehensive,
so you have to
understand octal numbers for some tasks anyway.
That is why it is not used here.
Each individual permission must be changed to an octal number. For the user, read is 400; write is 200 and execute is 100. To combine them we add them up, so that read and write for the user is 600 and read and execute is 500. For the group, the numbers are 40, 20 and 10. For others they are 4, 2 and 1. For example, the following set of permissions:
-rwxr-x--x
is 751 (700 + 50 + 1) in octal.
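The adding up can be checked mechanically. The function below is a hypothetical helper of our own (perm_to_octal is not a Unix command); it is a sketch that assumes it is given exactly the nine permission characters, without the leading directory character, and it works on one group of three at a time, adding 4 for r, 2 for w and 1 for x.

```shell
#!/bin/sh
# Sketch: turn a permission string such as rwxr-x--x into its octal number.
perm_to_octal() {
    perms=$1
    [ ${#perms} -eq 9 ] || return 1    # expect exactly nine characters
    result=""
    while [ -n "$perms" ]; do
        rest=${perms#???}              # everything after the first three
        triple=${perms%"$rest"}        # the first three characters
        digit=0
        case $triple in r??) digit=$((digit + 4)) ;; esac   # read
        case $triple in ?w?) digit=$((digit + 2)) ;; esac   # write
        case $triple in ??x) digit=$((digit + 1)) ;; esac   # execute
        result="$result$digit"
        perms=$rest
    done
    echo "$result"
}

perm_to_octal rwxr-x--x    # user 7, group 5, others 1
perm_to_octal r--r-----    # user 4, group 4, others 0
```

Each digit stands for one group of three, which is why 700 + 50 + 1 and the three separate digits 7, 5 and 1 give the same answer.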
We use an octal representation of the required permission like this:
$ chmod 440 abettername
$ ls -l abettername
-r--r----- 1 cmsps cms 25 Jul 10 10:48 abettername
$ date > abettername
abettername: Permission denied
$ rm abettername
rm: override protection 440 for abettername? n
$
In the example, we give the user and the group read permission only on
the file.
Then we try to alter the file and, of course, Unix does not let us.
The example then demonstrates why we can't avoid octal numbers.
Without write permission on a file,
rm
issues a question which should simply say that the file is
write protected but instead tells us the file's permission in octal.
If our reply begins with
y,
rm
deletes the file.
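To see the link between the octal number and the letters shown by ls -l, we can set a mode and read it straight back. This is a sketch under assumptions: it uses a scratch directory made with mktemp, and cut keeps only the first ten characters of the ls -l line; the variable names are our own.

```shell
#!/bin/sh
# Sketch: set permissions with octal numbers and read them back with ls -l.
start=$(pwd)
work=$(mktemp -d)
cd "$work" || exit 1
date > abettername

chmod 440 abettername                          # read only for user and group
mode_440=$(ls -l abettername | cut -c1-10)

chmod 751 abettername                          # rwx for user, r-x group, --x others
mode_751=$(ls -l abettername | cut -c1-10)

echo "$mode_440"
echo "$mode_751"
cd "$start" && rm -rf "$work"
```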
Find the full path-name of your home directory.
Answer
cd
pwd
List all the files in your directory -- hidden ones too.
Answer
ls -a
The
-a
option lists hidden files too -- those whose names begin
with a dot.
Make two copies of the
bicycle
file -- called
bicycle1
and
bicycle2.
Answer
cp bicycle bicycle1
cp bicycle bicycle2
Change the name of the
bicycle
file to
bicycle4two.
Confirm this change by listing the filenames in your directory.
Answer
mv bicycle bicycle4two
ls bicycle*
There should be three files beginning with "bicycle" --
bicycle1,
bicycle2
and
bicycle4two.
Remove
bicycle1
and
bicycle2
using file name generation and avoiding the
bicycle4two
file.
Confirm that they have been removed.
Answer
rm bicycle?
ls
Note that:
rm bicycle*
would delete all three files.
Find the size of the biggest file in your home directory.
Answer
ls -l
And look for the biggest size in column 5!
What command would you use to see if you had any directories in your home directory?
Answer
ls -l
And look for lines starting with
d
Execute
mkdirs
BEFORE you try the remaining questions; it will
create some files and directories for you to play with. NB:
mkdirs
will overwrite any files called
janet,
john,
freda
and
michael
that you have in your home directory. It will
generate lots of error messages if you run it twice.
Execute:
ls freda michael janet john
Can you explain the output?
Answer
Files are listed first followed by directories. For directories, the contents of the directory are also listed.
Execute this command:
ls -R freda michael # note the capital R
Can you explain the output?
Answer
The same as the previous question but
-R
makes
ls
recursive and it does all the subdirectories too.
Look at
freda
and
michael
with the CDE File Manager, or
with the Windows file manager. (It's called Windows
Explorer!)
OR
Draw a diagram to represent the hierarchy of directories
from your home directory down. (Just
janet
and
john
--
exclude your own directories.)
(You will need the diagrammatic view to do the next few questions.)
With one
cd
command, change from where you are to the directory
called
flowers
.
Answer
cd freda/likes/flowers
With one
cd
command, change from where you are to the directory
called
freda.
Answer
cd ../..
With one
cd
command, change from where you are to the directory
called
copies.
Answer
cd ../michael/copies
With one
cd
command, change back to your home directory.
Answer
cd
Make a directory called
newdir1
in your home directory.
Answer
cd; mkdir newdir1
Make a directory called
newdir2
in
newdir1.
Answer
mkdir newdir1/newdir2
OR
cd newdir1; mkdir newdir2
What is the difference between
ls -ld
and
ls -l?
Answer
The
-d
option causes
ls
to give information about the directory
itself instead of showing the contents of the directory.
Look at the file called
whale
in the
michael
directory; then
add the date to the end of the file
(date >> whale).
Look at the file called
michael/copies/whale; what do you notice? That is:
modify one
whale
file and look at the other.
Try the same thing with the files called
cat.
Can you explain the difference in the behaviour of the two pairs of files?
Answer
cd
cd michael
more whale
date >> whale
more copies/whale
(Notice: date added to
copies/whale
too!)
date >> cat
more copies/cat
(Notice:
copies/cat
unchanged.)
You should notice that the date has been added to both
whale
files.
The same thing does not happen with the
cat
files.
The explanation is that
whale
is just one file but it is shared between
the two directories;
cat
and
copies/cat
are two separate files.
You can tell that by doing an
ls -l whale
and looking at column two
of the output.
Remove
freda,
michael,
janet,
john
and their contents.
Answer
rm -r freda michael janet john
What command would you use to delete all the empty directories in the current directory?
Answer
rmdir *
Unix will grumble about the directories that aren't empty and will leave them untouched. Empty directories will be deleted.
Use a Unix tool to find if you have a file called
cat
in
any of your directories.
Answer
cd
find . -name cat -print
http://homepages.shu.ac.uk/~cmsps/unix/filesys.html
Last updated: Thursday 05 April 2012 at 17:45