Most Linux commands read input, such as a file or another argument to the command, and write output. By default, input comes from the keyboard, and output is displayed on your screen. Your keyboard is your "standard input" (stdin) device, and the screen or a particular terminal window is the "standard output" (stdout) device.
However, since Linux is a flexible system, these default settings don't necessarily have to be applied. The standard output, for example, on a heavily monitored server in a large environment may be a printer.
Sometimes you will want to put the output of a command in a file, or you may want to run another command on the output of one command. This is known as redirecting output. Redirection is done using either the ">" (greater-than symbol), or using the "|" (pipe) operator, which sends the standard output of one command to another command as standard input.
As we saw before, the cat command concatenates files and writes them to the standard output. By redirecting this output to a file, that file will be created - or overwritten if it already exists, so take care.
nancy:~> cat test1
some words
nancy:~> cat test2
some other words
nancy:~> cat test1 test2 > test3
nancy:~> cat test3
some words
some other words
Redirecting "nothing" to an existing file is equal to emptying the file:
nancy:~> ls -l list
-rw-rw-r-- 1 nancy nancy 117 Apr 2 18:09 list
nancy:~> > list
nancy:~> ls -l list
-rw-rw-r-- 1 nancy nancy 0 Apr 4 12:01 list
This process is called truncating.
The same redirection to a nonexistent file will create a new empty file with the given name:
nancy:~> ls -l newlist
ls: newlist: No such file or directory
nancy:~> > newlist
nancy:~> ls -l newlist
-rw-rw-r-- 1 nancy nancy 0 Apr 4 12:05 newlist
Chapter 7 gives some more examples on the use of this sort of redirection.
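The three behaviors of ">" described above (create, overwrite, truncate) can be tried safely in a scratch directory. This is a minimal sketch assuming a Bash-like shell; the scratch directory and the replacement text are made up for illustration.

```shell
# Work in a scratch directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'some words\n' > test1          # ">" creates test1
printf 'some other words\n' > test2    # ">" creates test2
cat test1 test2 > test3                # concatenate both into a new file

printf 'replacement\n' > test3         # ">" silently replaces the old content
overwritten=$(cat test3)

> test3                                # redirecting "nothing" truncates the file
> newlist                              # on a missing name, it creates an empty file

ls -l test3 newlist                    # both are listed with size 0
```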
Some examples using piping of commands:
To find a word within some text, display all lines matching "pattern1", and exclude lines also matching "pattern2" from being displayed:
grep pattern1 file | grep -v pattern2
To display output of a directory listing one page at a time:
ls -la | less
To find a file in a directory:
ls -l | grep part_of_file_name
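To see what these pipelines do on known input, the sketch below builds a small sample file and a few empty files in a scratch directory. The file names and log lines are hypothetical, chosen only to make the filtering visible.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample data: three log lines.
printf '%s\n' 'error: disk full' 'error: ignored test' 'warning: low memory' > messages.log

# Keep lines containing "error", then drop those that also contain "ignored".
matches=$(grep error messages.log | grep -v ignored)
printf '%s\n' "$matches"

# Find files by part of their name: count directory entries containing "report".
touch report_jan.txt report_feb.txt notes.txt
report_count=$(ls | grep -c report)
printf '%s\n' "$report_count"
```

Each command in the pipeline only sees the lines that the previous one let through, which is why the order of the two grep commands does not matter here but the second one never sees the "warning" line.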
In another case, you may want a file to be the input for a command that normally wouldn't accept a file as an option. This redirecting of input is done using the "<" (less-than symbol) operator.
Below is an example of sending a file to somebody, using input redirection.
andy:~> mail mike@somewhere.org < to_do
If the user mike exists on the system, you don't need to type the full address. If you want to reach somebody on the Internet, enter the fully qualified address as an argument to mail.
This is a bit harder to read than the beginner's cat file | mail someone, but it is of course a much more elegant way of using the available tools.
The following example combines input and output redirection. The file text.txt is first checked for spelling mistakes, and the output is redirected to an error log file:
aspell < text.txt > error.log
It might be that you have to use the following syntax to spell-check a file (check with the man pages):
aspell -H list < file.txt | sort -u
This also combines both directions: the file is redirected to the standard input of aspell, and the output of aspell is piped to sort.
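Since mail and aspell may not be installed everywhere, the same pattern can be tried with tools that are always present. The sketch below uses tr, which reads only from standard input and takes no file name argument, and sort; the file fruit.txt and its content are made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' pear apple pear banana > fruit.txt

# tr cannot open files itself, so the file is handed over with "<".
tr '[:lower:]' '[:upper:]' < fruit.txt > upper.txt

# Input and output redirection combined: read fruit.txt, write sorted.txt.
sort -u < fruit.txt > sorted.txt

cat upper.txt sorted.txt
```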
The following command lists all commands that you can issue to examine another file when using less:
mike:~> less --help | grep -i examine
:e [file] Examine a new file.
:n * Examine the (N-th) next file from the command line.
:p * Examine the (N-th) previous file from the command line.
:x * Examine the first (or N-th) file from the command line.
The -i option is used for case-insensitive searches - remember that UNIX systems are very case-sensitive.
If you want to save output of this command for future reference, redirect the output to a file:
mike:~> less --help | grep -i examine > examine-files-in-less
mike:~> cat examine-files-in-less
:e [file] Examine a new file.
:n * Examine the (N-th) next file from the command line.
:p * Examine the (N-th) previous file from the command line.
:x * Examine the first (or N-th) file from the command line.
Output of one command can be piped into another command virtually as many times as you want, just as long as these commands would normally read input from standard input and write output to the standard output. Sometimes they don't, but then there may be special options that instruct these commands to behave according to the standard definitions; so read the documentation (man and info pages) of the commands you use if you should encounter errors.
Don't overwrite! | |
---|---|
Be careful not to overwrite existing (important) files when redirecting output. Many shells, including Bash, have a built-in feature to protect you from that risk: noclobber. See the Info pages for more information. In Bash, you would want to add the set -o noclobber command to your .bashrc configuration file in order to prevent accidental overwriting of files. |
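A short sketch of how noclobber behaves, assuming Bash (the ">|" override operator is also part of the POSIX shell language). The file name keep.txt and the variable names are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'important\n' > keep.txt

set -o noclobber                       # from now on, ">" refuses to overwrite

# The redirection fails, so the command is not run and the file is untouched.
if { printf 'oops\n' > keep.txt; } 2>/dev/null; then
    result=overwritten
else
    result=protected
fi
after_attempt=$(cat keep.txt)

# ">|" overrides noclobber for a single, deliberate overwrite.
printf 'forced\n' >| keep.txt
after_force=$(cat keep.txt)

set +o noclobber                       # switch the protection off again
printf '%s\n' "$result" "$after_attempt" "$after_force"
```

Appending with ">>" is not affected by noclobber, since it does not destroy existing content.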
Instead of overwriting file data, you can also append text to an existing file using two subsequent greater-than signs:
Example:
mike:~> date >> wishlist
mike:~> cat wishlist
more money
less work
Thu Feb 28 20:23:07 CET 2002
The date command would normally put the last line on the screen; now it is appended to the file wishlist.
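The difference between ">" and ">>" is easiest to see by counting lines. A minimal sketch in a scratch directory, reusing the wishlist content from the example above:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'more money\nless work\n' > wishlist   # start with two lines
date >> wishlist                              # ">>" adds a third line at the end

line_count=$(wc -l < wishlist)
first_line=$(head -n 1 wishlist)              # the original content is still there
cat wishlist
```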
There are three types of I/O, each of which has its own identifier, called a file descriptor:
standard input: 0
standard output: 1
standard error: 2
In the following descriptions, if the file descriptor number is omitted, and the first character of the redirection operator is <, the redirection refers to the standard input (file descriptor 0). If the first character of the redirection operator is >, the redirection refers to the standard output (file descriptor 1).
Some practical examples will make this clearer:
ls > dirlist 2>&1
will direct both standard output and standard error to the file dirlist, while the command
ls 2>&1 > dirlist
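will only direct standard output to dirlist, because standard error was made a copy of standard output before standard output was redirected to the file. Redirections are processed from left to right, so their order matters. This can be a useful option for programmers.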
All this is explained in detail in the Bash Info pages.
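The effect of the order can be checked with a command that is guaranteed to write to both streams. The sketch below assumes Bash and uses a hypothetical helper function, both, that prints one line to standard output and one to standard error; the log file names are also made up.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical helper: one line on each output stream.
both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout goes to all.log first, then stderr is pointed at the same place.
both > all.log 2>&1

# stderr is first pointed at the current stdout (captured here by $(...)),
# and only then is stdout moved to the file.
leaked=$(both 2>&1 > only_out.log)

cat all.log
printf 'not in the file: %s\n' "$leaked"
```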
You can use the tee command to copy input to standard output and one or more output files in one move. Using the -a option to tee results in appending input to the file(s). This command is useful if you want to both see and save output. The > and >> operators do not allow you to perform both actions simultaneously.
This tool is usually called through a pipe (|), as demonstrated in the example below:
mireille ~/test> date | tee file1 file2
Thu Jun 10 11:10:34 CEST 2004
mireille ~/test> cat file1
Thu Jun 10 11:10:34 CEST 2004
mireille ~/test> cat file2
Thu Jun 10 11:10:34 CEST 2004
mireille ~/test> uptime | tee -a file2
11:10:51 up 21 days, 21:21, 57 users, load average: 0.04, 0.16, 0.26
mireille ~/test> cat file2
Thu Jun 10 11:10:34 CEST 2004
11:10:51 up 21 days, 21:21, 57 users, load average: 0.04, 0.16, 0.26
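The same session can be reproduced with fixed text instead of date and uptime, which makes the result predictable. In this sketch the variable seen captures what tee passes on to standard output; the text and variable name are illustrative only.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# tee writes its input to both files and also passes it on to stdout.
seen=$(printf 'first\n' | tee file1 file2)

# With -a, tee appends to file2 instead of overwriting it.
printf 'second\n' | tee -a file2

cat file1
cat file2
```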
When a program performs operations on input and writes the result to the standard output, it is called a filter. One of the most common uses of filters is to restructure output. We'll discuss a couple of the most important filters below.
As we saw in Section 3.3.3.4, grep scans the output line by line, searching for matching patterns. All lines containing the pattern will be printed to standard output. This behavior can be reversed using the -v option.
Some examples: suppose we want to know which files in a certain directory have been modified in February:
jenny:~> ls -la | grep Feb
The grep command, like most commands, is case sensitive. Use the -i option to ignore the difference between upper and lower case. A lot of GNU extensions are available as well, such as --colour, which highlights search terms in long lines, and --after-context, which prints a given number of lines of context after each matching line. You can issue a recursive grep that searches all subdirectories of encountered directories using the -r option. As usual, options can be combined.
Regular expressions can be used to specify more precisely which character sequences you want to select from the input lines. The best way to start with regular expressions is to read the grep documentation; an excellent chapter is included in the info grep page. Discussing the ins and outs of regular expressions would lead us too far, so start there if you want to know more about them.
Play around a bit with grep; it is worth putting some time into this most basic but very powerful filtering command. The exercises at the end of this chapter will help you get started, see Section 5.3.
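As a starting point for experimenting, the sketch below runs the grep options mentioned above against a small made-up file. It assumes a grep that supports the -A and -r extensions (GNU grep does; other implementations may differ), and all file names and contents are hypothetical.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir sub
printf '%s\n' 'Feb report' 'feb draft' 'Mar report' 'trailing line' > notes.txt
printf 'Feb archive\n' > sub/old.txt

exact=$(grep -c Feb notes.txt)          # case sensitive: counts "Feb report" only
nocase=$(grep -ci feb notes.txt)        # -i also matches "feb draft"
inverted=$(grep -vci feb notes.txt)     # -v counts the lines that do NOT match
context=$(grep -A 1 Mar notes.txt)      # -A 1: the match plus one following line
files=$(grep -rl Feb . | sort)          # -r: search subdirectories, -l: list file names

printf '%s\n' "$exact" "$nocase" "$inverted" "$context" "$files"
```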
The command sort arranges lines in alphabetical order by default:
thomas:~> cat people-I-like | sort
Auntie Emmy
Boyfriend
Dad
Grandma
Mum
My boss
But there are many more things sort can do, such as sorting on file size. With this command, directory content is sorted smallest files first, biggest files last:
ls -la | sort -nk 5
Old sort syntax | |
---|---|
You might obtain the same result with ls -la | sort +4n, but this is an old form which does not comply with the current standards. |
The sort command is also used in combination with the uniq program (or sort -u) to sort output and filter out duplicate entries.
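Numeric sorting on a column and duplicate removal can be tried on a small two-column file instead of a real directory listing. The file sizes.txt and its content are invented for this sketch; the second field plays the role of the size column.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'notes 120' 'archive 9' 'photo 3500' 'notes 120' > sizes.txt

# -n compares numerically, -k 2 sorts on the second field.
smallest=$(sort -n -k 2 sizes.txt | head -n 1)
largest=$(sort -n -k 2 sizes.txt | tail -n 1)

# Two equivalent ways of dropping the repeated "notes 120" line.
with_uniq=$(sort sizes.txt | uniq)
with_u=$(sort -u sizes.txt)

printf '%s\n' "$smallest" "$largest" "$with_u"
```

Without -n, sort would compare the sizes as text, and "3500" would be placed before "9".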