Awk

awk

Search for and process a pattern in a file.

Format

awk [-Fc] -f program-file [file-list]
awk program [file-list]

Summary

The awk utility is a pattern-scanning and processing language. It searches one or more files to see if they contain lines that match specified patterns and then performs actions, such as writing the line to the standard output or incrementing a counter, each time it finds a match. You can use awk to generate reports or filter text. It works equally well with numbers and text; when you mix the two, awk will almost always come up with the right answer. The authors of awk (Alfred V. Aho, Peter J. Weinberger, and Brian W. Kernighan) designed it to be easy to use and, to this end, they sacrificed execution speed.

The awk utility takes many of its constructs from the C programming language. It includes the following features: flexible format, conditional execution, looping statements, numeric variables, string variables, regular expressions, and C's printf.

The awk utility takes its input from files you specify on the command line or from its standard input.

Arguments

The first format uses a program-file, which is the pathname of a file containing an awk program. See Description, below. The second format uses a program, which is an awk program included on the command line. This format allows you to write simple, short awk programs without having to create a separate program-file. To prevent the shell from interpreting the awk commands as shell commands, it is a good idea to enclose the program in single quotation marks. The file-list contains pathnames of the ordinary files that awk processes. These are the input files.
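The two formats described above can be compared side by side. The following is only a minimal sketch: the file names demo.txt and demo.awk and their contents are made up for illustration, and the files live in a scratch directory created with mktemp.

```shell
# Work in a scratch directory so no existing files are touched.
tmp=$(mktemp -d)

# A small input file (hypothetical name and contents).
printf 'alpha 1\nbeta 2\n' > "$tmp/demo.txt"

# Second format: the program is given on the command line, in single quotes.
inline=$(awk '{ print $2 }' "$tmp/demo.txt")

# First format: the same program stored in a program-file and named with -f.
printf '%s\n' '{ print $2 }' > "$tmp/demo.awk"
fromfile=$(awk -f "$tmp/demo.awk" "$tmp/demo.txt")

echo "$inline"
echo "$fromfile"
rm -r "$tmp"
```

Both invocations run the same program, so they should print the same second-field values.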

Options

If you do not use the -f option, awk uses the first command line argument as its program.

-f program-file    This option causes awk to read its program from the file named program-file instead of from the command line.

-Fc    This option specifies an input field separator, c, to be used in place of the default separators ([SPACE] and [TAB]). The field separator can be any single character.

Description

An awk program consists of one or more program lines containing a pattern and/or action in the following format:

pattern { action }

The pattern selects lines from the input file. The awk utility performs the action on all lines that the pattern selects. You must enclose the action within braces so that awk can differentiate it from the pattern. If a program line does not contain a pattern, awk selects all lines in the input file. If a program line does not contain an action, awk copies the selected lines to its standard output.

To start, awk compares the first line in the input file (from the file-list) with each pattern in the program-file or program. If a pattern selects the line (if there is a match), awk takes the action associated with the pattern. If the line is not selected, awk takes no action. When awk has completed its comparisons for the first line of the input file, it repeats the process for the next line of input. It continues this process, comparing subsequent lines in the input file, until it has read the entire file-list. If several patterns select the same line, awk takes the actions associated with each of the patterns in the order in which they appear. It is therefore possible for awk to send a single line from the input file to its standard output more than once.
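The last point, that one input line can be written more than once, can be seen in a small sketch. The three input words are made up for illustration; the program consists of two pattern-only lines, so each match triggers the default action of printing the line.

```shell
# Two pattern-only program lines; the default action prints the line.
# "ford" matches both /f/ and /o/, so awk prints it twice.
out=$(printf 'ford\nfiat\nvolvo\n' | awk '/f/
/o/')
echo "$out"
```

The lines fiat and volvo each match only one of the two patterns, so each appears once.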

Patterns

You can use a regular expression (refer to Appendix A), enclosed within slashes, as a pattern. The ~ operator tests to see if a field or variable matches a regular expression. The !~ operator tests for no match.

You can process arithmetic and character relational expressions with the following relational operators.

Operator    Meaning
<           less than
<=          less than or equal to
==          equal to
!=          not equal to
>=          greater than or equal to
>           greater than

You can combine any of the patterns described above using the Boolean operators || (OR) or && (AND).
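As a rough illustration of how these pattern pieces combine, the sketch below uses the no-match operator, a relational operator, and && in a single pattern. The names and scores in the input are invented for the example.

```shell
# Sample records: a name and a numeric score (made-up data).
data='ann 70
bob 45
cara 90'

# Select records whose first field does not start with "b"
# and whose second field is at least 60. No action is given,
# so awk prints each selected record.
out=$(printf '%s\n' "$data" | awk '$1 !~ /^b/ && $2 >= 60')
echo "$out"
```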

Actions

The action portion of an awk command causes awk to take action when it matches a pattern. If you do not specify an action, awk performs the default action, which is the Print command (explicitly represented as {print}). This action copies the record (normally a line; see Variables, below) from the input file to awk's standard output. You can follow a Print command with arguments, causing awk to print just the arguments you specify. The arguments can be variables or string constants. Using awk, you can send the output from a Print command to a file (>), append it to a file (>>), or pipe it to the input of another program (|). Unless you separate items in a Print command with commas, awk concatenates them. Commas cause awk to separate the items with the output field separator (normally a [SPACE]; see Variables, below). You can include several actions on one line within a set of braces by separating them with semicolons.

The comma is the range operator. If you separate two patterns with a comma on a single awk program line, awk selects a range of lines beginning with the first line that contains the first pattern. The last line awk selects is the next subsequent line that contains the second pattern. After awk finds the second pattern, it starts the process over by looking for the first pattern again. Two unique patterns, BEGIN and END, allow you to execute commands before awk starts its processing and after it finishes. The awk utility executes the actions associated with the BEGIN pattern before, and with the END pattern after, it processes all the files in the file-list.
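The range operator and the BEGIN and END patterns described above can be tried together in a short sketch. The input lines and the words start and stop are made up; the second range never meets a stop line, so it runs to the end of the input.

```shell
out=$(printf 'a\nstart\nb\nstop\nc\nstart\nd\n' |
    awk 'BEGIN { print "begin" }
         /start/, /stop/ { print NR ": " $0 }
         END { print "end, " NR " records" }')
echo "$out"
```

Line c falls between the two ranges and is not selected. Without commas, print concatenates its items, which is why the record number and the text are joined by the literal ": " string.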

Comments

The awk utility disregards anything on a program line following a pound sign (#). You can document an awk program by preceding comments with this symbol.

Variables

You declare and initialize user variables when you use them (that is, you do not have to declare them before you use them). In addition, awk maintains program variables for your use. You can use both user and program variables in the pattern and in the action portion of an awk program. Following is a list of program variables.

Variable    Represents
NR          record number of current record
$0          the current record (as a single variable)
NF          number of fields in the current record
$1-$n       fields in the current record
FS          input field separator (default: [SPACE] or [TAB])
OFS         output field separator (default: [SPACE])
RS          input record separator (default: [NEWLINE])
ORS         output record separator (default: [NEWLINE])
FILENAME    name of the current input file
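Several of the program variables listed above can be seen at work in one short sketch. The colon-separated input is made up; the program sets FS and OFS in a BEGIN action and then prints NR, NF, and two fields.

```shell
out=$(printf 'root:x:0\nmark:x:107\n' |
    awk 'BEGIN { FS = ":"; OFS = "-" }
         { print NR, NF, $1, $3 }')
echo "$out"
```

Because the items in the Print command are separated by commas, awk joins them with the value of OFS, here a hyphen.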

The input and output record separators are, by default, [NEWLINE] characters. Thus, awk takes each line in the input file to be a separate record and appends a [NEWLINE] to the end of each record that it sends to its standard output. The input field separators are, by default, [SPACE]s and [TAB]s. The output field separator is a [SPACE]. You can change the value of any of the separators at any time by assigning a new value to its associated variable. Also, the input field separator can be set on the command line using the -F option.

Functions

The functions that awk provides for manipulating numbers and strings follow.

Name                   Function
length(str)            returns the number of characters in str; if you do not supply an argument, it returns the number of characters in the current input record
int(num)               returns the integer portion of num
index(str1, str2)      returns the index of str2 in str1, or 0 if str2 is not present
split(str, arr, del)   places elements of str, delimited by del, in the array arr[1] through arr[n]; returns the number of elements in the array
sprintf(fmt, args)     formats args according to fmt and returns the formatted string; mimics the C programming language function of the same name
substr(str, pos, len)  returns a substring of str that begins at pos and is len characters long
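A quick way to get a feel for these functions is to call each one from a BEGIN action, which needs no input file. The following is only a sketch; the sample string and the variable names s, n, and parts are arbitrary.

```shell
out=$(awk 'BEGIN {
    s = "ford mustang"
    print length(s)                 # number of characters in s
    print int(3.9)                  # integer portion
    print index(s, "mus")           # position of "mus" within s
    n = split("a:b:c", parts, ":")  # fills parts[1] through parts[3]
    print n, parts[2]
    print substr(s, 6, 4)           # 4 characters starting at position 6
    print sprintf("%03d", 7)        # formatted string, zero padded
}')
echo "$out"
```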

Operators

The following awk arithmetic operators are from the C programming language.

Operator    Function
*           multiplies the expression preceding the operator by the expression following it
/           divides the expression preceding the operator by the expression following it
%           takes the remainder after dividing the expression preceding the operator by the expression following it
+           adds the expression preceding the operator and the expression following it
-           subtracts the expression following the operator from the expression preceding it
=           assigns the value of the expression following the operator to the variable preceding it
++          increments the variable preceding the operator
--          decrements the variable preceding the operator
+=          adds the expression following the operator to the variable preceding it and assigns the result to the variable preceding the operator
-=          subtracts the expression following the operator from the variable preceding it and assigns the result to the variable preceding the operator

Operator    Function
*=          multiplies the variable preceding the operator by the expression following it and assigns the result to the variable preceding the operator
/=          divides the variable preceding the operator by the expression following it and assigns the result to the variable preceding the operator
%=          takes the remainder, after dividing the variable preceding the operator by the expression following it, and assigns the result to the variable preceding the operator

Associative Arrays

An associative array is one of awk's most powerful features. An associative array uses strings as its indexes. Using an associative array, you can mimic a traditional array by using numeric strings as indexes. You assign a value to an element of an associative array just as you would assign a value to any other awk variable. The format is shown below.

array[string] = value

The array is the name of the array, string is the index of the element of the array you are assigning a value to, and value is the value you are assigning to the element of the array.

There is a special For structure you can use with an awk array. The format is:

for (elem in array) action

The elem is a variable that takes on the index of each of the elements in the array as the For structure loops through them, array is the name of the array, and action is the action that awk takes for each element in the array. You can use the elem variable in this action. The Examples section contains programs that use associative arrays.

Printf

You can use the Printf command in place of Print to control the format of the output that awk generates. The awk version of Printf is similar to that of the C language. A Printf command takes the following format:

printf "control-string", arg1, arg2, ..., argn

The control-string determines how Printf will format arg1-n. The arg1-n can be variables or other expressions. Within the control-string, you can use \n to indicate a [NEWLINE] and \t to indicate a [TAB]. The control-string contains conversion specifications, one for each argument (arg1-n). A conversion specification has the following format:

%[-][x[.y]]conv

The - causes Printf to left justify the argument. The x is the minimum field width, and the .y is the number of places to the right of a decimal point in a number. The conv is a letter from the following list.

conv    Conversion
d       decimal
e       exponential notation
f       floating-point number
g       use f or e, whichever is shorter
o       unsigned octal
s       string of characters
x       unsigned hexadecimal

Refer to the following Examples section for examples of how to use printf.
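Before the longer examples, a minimal sketch may help tie together associative arrays, the For structure, and Printf. The input data, the array names count and total, and the loop variable make are made up for illustration. The For structure visits indexes in no particular order, so the output is piped through sort to make it predictable.

```shell
# Made-up input: a make and a price, separated by a space.
data='ford 100
chevy 250
ford 50
chevy 25
fiat 10'

out=$(printf '%s\n' "$data" | awk '
    { count[$1]++; total[$1] += $2 }     # arrays indexed by the make string
    END {
        for (make in count)              # make takes on each index in turn
            printf "%-6s %2d %8.2f\n", make, count[make], total[make]
    }' | sort)
echo "$out"
```

In the control-string, %-6s left justifies the make in a field six characters wide, %2d prints the count as a decimal number, and %8.2f prints the total with two places to the right of the decimal point.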

Examples

A simple awk program is shown below.

{ print }

This program consists of one program line that is an action. It uses no pattern. Because the pattern is missing, awk selects all lines in the input file. Without any arguments, the Print command prints each selected line in its entirety. This program copies the input file to its standard output. The following program has a pattern part without an explicit action.

/jenny/

In this case, awk selects all lines from the input file that contain the string jenny. When you do not specify an action, awk assumes the action to be Print. This program copies all the lines in the input file that contain jenny to its standard output. The following examples work with the cars data file. From left to right, the columns in the file contain each car's make, model, year of manufacture, mileage, and price. All white space in this file is composed of single [TAB]s (there are no [SPACE]s in the file).

$ cat cars
plym    fury    77      73      2500
chevy   nova    79      60      3000
ford    mustang 65      45      10000
volvo   gl      78      102     9850
ford    ltd     83      15      10500
chevy   nova    80      50      3500
fiat    600     65      115     450
honda   accord  81      30      6000
ford    thundbd 84      10      17000
toyota  tercel  82      180     750
chevy   impala  65      85      1550
ford    bronco  83      25      9500

The first example below selects all lines that contain the string chevy. The slashes indicate that chevy is a regular expression. This example has no action part. Although neither awk nor shell syntax requires single quotation marks on the command line, it is a good idea to use them, because they prevent many problems. If the awk program you create on the command line includes [SPACE]s or any special characters that the shell will interpret, you must quote them. Always enclosing the program in single quotation marks is the easiest way of making sure you have quoted any characters that need to be quoted.

$ awk '/chevy/' cars
chevy   nova    79      60      3000
chevy   nova    80      50      3500
chevy   impala  65      85      1550

The next example selects all lines from the file (it has no pattern part). The braces enclose the action part. You must always use braces to delimit the action part, so that awk can distinguish the pattern part from the action part. This example prints the third field ($3), a [SPACE] (indicated by the comma), and the first field ($1) of each selected line.

$ awk '{print $3, $1}' cars
77 plym
79 chevy
65 ford
78 volvo
83 ford
80 chevy
65 fiat
81 honda
84 ford
82 toyota
65 chevy
83 ford

The next example includes both a pattern and an action part. It selects all lines that contain the string chevy and prints the third and first fields from the lines it selects.

$ awk '/chevy/ {print $3, $1}' cars
79 chevy
80 chevy
65 chevy

The next example selects lines that contain a match for the regular expression h. Because there is no explicit action, it prints all the lines it selects.

$ awk '/h/' cars
chevy   nova    79      60      3000
chevy   nova    80      50      3500
honda   accord  81      30      6000
ford    thundbd 84      10      17000
chevy   impala  65      85      1550

The next pattern uses the matches operator (~) to select all lines that contain the letter h in the first field.

$ awk '$1 ~ /h/' cars
chevy   nova    79      60      3000
chevy   nova    80      50      3500
honda   accord  81      30      6000
chevy   impala  65      85      1550

The caret (^) in a regular expression forces a match at the beginning of the line or, in this case, the beginning of the first field.

$ awk '$1 ~ /^h/' cars
honda   accord  81      30      6000

A pair of brackets surrounds a character class definition (refer to Appendix A, Regular Expressions). Below, awk selects all lines that have a second field that begins with t or m. Then it prints the third and second fields, a dollar sign, and the fifth field.

$ awk '$2 ~ /^[tm]/ {print $3, $2, "$" $5}' cars
65 mustang $10000
84 thundbd $17000
82 tercel $750

The next example shows three roles that a dollar sign can play in an awk program. A dollar sign followed by a number forms the name of a field. Within a regular expression, a dollar sign forces a match at the end of a line or field (5$). Within a string, you can use a dollar sign as itself.

$ awk '$3 ~ /5$/ {print $3, $1, "$" $5}' cars
65 ford $10000
65 fiat $450
65 chevy $1550

Below, the equals relational operator (==) causes awk to perform a numeric comparison between the third field in each line and the number 65. The awk command takes the default action, Print, on each line that matches.

$ awk '$3 == 65' cars
ford    mustang 65      45      10000
fiat    600     65      115     450
chevy   impala  65      85      1550

The next example finds all cars priced at or under $3000.

$ awk '$5 <= 3000' cars
plym    fury    77      73      2500
chevy   nova    79      60      3000
fiat    600     65      115     450
toyota  tercel  82      180     750
chevy   impala  65      85      1550

When you use double quotation marks, awk performs textual comparisons, using the ASCII collating sequence as the basis of the comparison. Below, awk shows that the strings 450 and 750 fall in the range that lies between the strings 2000 and 9000.

$ awk '$5 >= "2000" && $5 < "9000"' cars
plym    fury    77      73      2500
chevy   nova    79      60      3000
chevy   nova    80      50      3500
fiat    600     65      115     450
honda   accord  81      30      6000
toyota  tercel  82      180     750

When you need a numeric comparison, do not use quotation marks. The next example gives the correct results. It is the same as the previous example but omits the double quotation marks.

$ awk '$5 >= 2000 && $5 < 9000' cars
plym    fury    77      73      2500
chevy   nova    79      60      3000
chevy   nova    80      50      3500
honda   accord  81      30      6000

Next, the range operator (,) selects a group of lines. The first line it selects is the one specified by the pattern before the comma. The last line is the one selected by the pattern after the comma. If there is no line that matches the pattern after the comma, awk selects every line up to the end of the file. The example selects all lines starting with the line that contains volvo and concluding with the line that contains fiat.

$ awk '/volvo/ , /fiat/' cars
volvo   gl      78      102     9850
ford    ltd     83      15      10500
chevy   nova    80      50      3500
fiat    600     65      115     450

After the range operator finds its first group of lines, it starts the process over, looking for a line that matches the pattern before the comma. In the following example, awk finds three groups of lines that fall between chevy and ford. Although the fifth line in the file contains ford, awk does not select it because, at the time it is processing the fifth line, it is searching for chevy.

$ awk '/chevy/ , /ford/' cars
chevy   nova    79      60      3000
ford    mustang 65      45      10000
chevy   nova    80      50      3500
fiat    600     65      115     450
honda   accord  81      30      6000
ford    thundbd 84      10      17000
chevy   impala  65      85      1550
ford    bronco  83      25      9500

When you are writing a longer awk program, it is convenient to put the program in a file and reference the file on the command line. Use the -f option, followed by the name of the file containing the awk program. Following is an awk program that has two actions and uses the BEGIN pattern. The awk utility performs the action associated with BEGIN before it processes any of the lines of the data file. The pr_header awk program uses BEGIN to print a header. The second action, {print}, has no pattern part and prints all the lines in the file.

$ cat pr_header
BEGIN {print "Make    Model   Year    Miles   Price"}
{print}

$ awk -f pr_header cars
Make    Model   Year    Miles   Price
plym    fury    77      73      2500
chevy   nova    79      60      3000
ford    mustang 65      45      10000
volvo   gl      78      102     9850
ford    ltd     83      15      10500
chevy   nova    80      50      3500
fiat    600     65      115     450
honda   accord  81      30      6000
ford    thundbd 84      10      17000
toyota  tercel  82      180     750
chevy   impala  65      85      1550
ford    bronco  83      25      9500

In the previous and following examples, the white space in the headers is composed of single [TAB]s, so that the titles line up with the columns of data.

$ cat pr_header2
BEGIN {
print "Make    Model   Year    Miles   Price"
print "-------------------------------------"
}
{print}

$ awk -f pr_header2 cars
Make    Model   Year    Miles   Price
-------------------------------------
plym    fury    77      73      2500
chevy   nova    79      60      3000
ford    mustang 65      45      10000
volvo   gl      78      102     9850
ford    ltd     83      15      10500
chevy   nova    80      50      3500
fiat    600     65      115     450
honda   accord  81      30      6000
ford    thundbd 84      10      17000
toyota  tercel  82      180     750
chevy   impala  65      85      1550
ford    bronco  83      25      9500

When you call the length function without an argument, it returns the number of characters in the current line, including field separators. The $0 variable always contains the value of the current line. In the next example, awk prepends the length to each line, and then a pipe sends the output from awk to sort, so that the lines of the cars file appear in order of length. Because the formatting of the report depends on [TAB]s, including three extra characters at the beginning of each line throws off the column alignment of some of the lines. A remedy for this situation will be covered shortly.

$ awk '{print length, $0}' cars | sort
19 fiat 600     65      115     450
20 ford ltd     83      15      10500
20 plym fury    77      73      2500
20 volvo        gl      78      102     9850
21 chevy        nova    79      60      3000
21 chevy        nova    80      50      3500
22 ford bronco  83      25      9500
23 chevy        impala  65      85      1550
23 honda        accord  81      30      6000
24 ford mustang 65      45      10000
24 ford thundbd 84      10      17000
24 toyota       tercel  82      180     750

The NR variable contains the record (line) number of the current line. The following pattern selects all lines that contain more than 23 characters. The action prints the line number of all the selected lines.

$ awk 'length > 23 {print NR}' cars
3
9
10

You can combine the range operator (,) and the NR variable to display a group of lines of a file based on their line numbers. The next example displays lines 2 through 4.

$ awk 'NR == 2 , NR == 4' cars
chevy   nova    79      60      3000
ford    mustang 65      45      10000
volvo   gl      78      102     9850

The END pattern works in a manner similar to the BEGIN pattern, except awk takes the actions associated with it after it has processed the last of its input lines. The following report displays information only after it has processed the entire data file. The NR variable retains its value after awk has finished processing the data file, so that an action associated with an END pattern can use it.

$ awk 'END {print NR, "cars for sale." }' cars
12 cars for sale.

The next example uses If commands to change the values of some of the first fields. As long as awk does not make any changes to a record, it leaves the entire record, including separators, intact. Once it makes a change to a record, it changes all separators in that record to the default. The default output field separator is a [SPACE].

$ cat separ_demo
{
if ($1 ~ /ply/) $1 = "plymouth"
if ($1 ~ /chev/) $1 = "chevrolet"
print
}

$ awk -f separ_demo cars
plymouth fury 77 73 2500
chevrolet nova 79 60 3000
ford    mustang 65      45      10000
volvo   gl      78      102     9850
ford    ltd     83      15      10500
chevrolet nova 80 50 3500
fiat    600     65      115     450
honda   accord  81      30      6000
ford    thundbd 84      10      17000
toyota  tercel  82      180     750
chevrolet impala 65 85 1550
ford    bronco  83      25      9500

You can change the default value of the output field separator by assigning a value to the OFS variable. The [TAB] between the quotation marks in the following example stands for a single [TAB] character (awk also accepts the escape sequence "\t"). This fix improves the appearance of the report but does not properly line up the columns.

$ cat ofs_demo
BEGIN {OFS = "[TAB]"}
{
if ($1 ~ /ply/) $1 = "plymouth"
if ($1 ~ /chev/) $1 = "chevrolet"
print
}

$ awk -f ofs_demo cars
plymouth        fury    77      73      2500
chevrolet       nova    79      60      3000
ford    mustang 65      45      10000
volvo   gl      78      102     9850
ford    ltd     83      15      10500
chevrolet       nova    80      50      3500
fiat    600     65      115     450
honda   accord  81      30      6000
ford    thundbd 84      10      17000
toyota  tercel  82      180     750
chevrolet       impala  65      85      1550
ford    bronco  83      25      9500

You can use Printf to refine the output format (refer to the Printf section, above). The following example uses a backslash at the end of a program line to mask the following [NEWLINE] from awk. You can use this technique to continue a long line over one or more lines without affecting the outcome of the program.

$ cat printf_demo
BEGIN {
print "                         Miles"
print "Make       Model    Year (000)      Price"
print \
"-----------------------------------------"
}
{
if ($1 ~ /ply/) $1 = "plymouth"
if ($1 ~ /chev/) $1 = "chevrolet"
printf "%-10s %-8s 19%2d %5d $ %8.2f\n",\
$1, $2, $3, $4, $5
}

$ awk -f printf_demo cars
                         Miles
Make       Model    Year (000)      Price
-----------------------------------------
plymouth   fury     1977    73 $  2500.00
chevrolet  nova     1979    60 $  3000.00
ford       mustang  1965    45 $ 10000.00
volvo      gl       1978   102 $  9850.00
ford       ltd      1983    15 $ 10500.00
chevrolet  nova     1980    50 $  3500.00
fiat       600      1965   115 $   450.00
honda      accord   1981    30 $  6000.00
ford       thundbd  1984    10 $ 17000.00
toyota     tercel   1982   180 $   750.00
chevrolet  impala   1965    85 $  1550.00
ford       bronco   1983    25 $  9500.00

The next example creates two new files, one with all the lines that contain chevy and the other with lines containing ford.

$ cat redirect_out
/chevy/ {print > "chevfile"}
/ford/ {print > "fordfile"}
END {print "done."}

$ awk -f redirect_out cars
done.

$ cat chevfile
chevy   nova    79      60      3000
chevy   nova    80      50      3500
chevy   impala  65      85      1550

The summary program produces a summary report on all cars and newer cars. The first two lines of declarations are not required; awk automatically declares and initializes variables as you use them. After awk reads all the input data, it computes and displays averages.

$ cat summary
BEGIN {
yearsum = 0 ; costsum = 0
newcostsum = 0 ; newcount = 0
}
{
yearsum += $3
costsum += $5
}
$3 > 80 {newcostsum += $5 ; newcount ++}
END {
printf "Average age of cars is %3.1f years\n",\
90 - (yearsum/NR)
printf "Average cost of cars is $%7.2f\n",\
costsum/NR
printf "Average cost of newer cars is $%7.2f\n",\
newcostsum/newcount
}

$ awk -f summary cars
Average age of cars is 13.2 years
Average cost of cars is $6216.67
Average cost of newer cars is $8750.00

Following, grep shows the format of a line from the passwd file that the next example uses.

$ grep mark /etc/passwd
mark:4zvDGYGEbYHJg:107:ext 112:/home/mark:/bin/csh

The next example demonstrates a technique for finding the largest number in a field. Because it works with the passwd file, which delimits fields with colons (:), it changes the input field separator (FS) before reading any data. (Alternatively, the -F option could be used on the command line to change the input field separator.) This example reads the passwd file and determines the next available user ID number (field 3). The numbers do not have to be in order in the passwd file for this program to work. The pattern causes awk to select records that contain a user ID number greater than any previous user ID number that it has processed. Each time it selects a record, it assigns the value of the new user ID number to the saveit variable. Then awk uses the new value of saveit to test the user ID of all subsequent records. Finally, awk adds 1 to the value of saveit and displays the result.

$ cat find_uid
BEGIN {FS = ":"
saveit = 0}
$3 > saveit {saveit = $3}
END {print "Next available UID is " saveit + 1}

$ awk -f find_uid /etc/passwd
Next available UID is 192

The next example shows another report based on the cars file. This report uses nested If Else statements to substitute values based on the contents of the price field. The program has no pattern part; it processes every record.

$ cat price_range
{
if ($5 <= 5000) $5 = "inexpensive"
else if ($5 > 5000 && $5 < 10000) $5 = "please ask"
else if ($5 >= 10000) $5 = "expensive"
printf "%-10s %-8s 19%2d %5d %-12s\n",\
$1, $2, $3, $4, $5
}

$ awk -f price_range cars
plym       fury     1977    73 inexpensive
chevy      nova     1979    60 inexpensive
ford       mustang  1965    45 expensive
volvo      gl       1978   102 please ask
ford       ltd      1983    15 expensive
chevy      nova     1980    50 inexpensive
fiat       600      1965   115 inexpensive
honda      accord   1981    30 please ask
ford       thundbd  1984    10 expensive
toyota     tercel   1982   180 inexpensive
chevy      impala   1965    85 inexpensive
ford       bronco   1983    25 please ask

Problem 1) Find the number of annotated genes in each strand of the E. coli genome sequences.

Problem 2) Find the number of putatively identified, hypothetical, and unknown genes from the E. coli genome sequences.
