Linux awk command
AWK is a language for processing text files and a powerful tool for text analysis.
The name AWK comes from the first letters of the family names of its three creators: Alfred Aho, Peter Weinberger, and Brian Kernighan.
Syntax
awk [options] 'script' var=value file(s) or awk [options] -f scriptfile var=value file(s)
Option descriptions:
- -F fs or --field-separator fs
  Specifies the input field separator; fs is a string or a regular expression, such as -F:.
- -v var=value or --assign var=value
  Assigns a user-defined variable.
- -f scriptfile or --file scriptfile
  Reads the awk commands from a script file.
- -mf nnn and -mr nnn
  Set internal limits to the value nnn: -mf limits the maximum number of fields and -mr limits the maximum record size. These two options are extensions from the Bell Labs version of awk and are not part of standard awk.
- -W compat or --compat, -W traditional or --traditional
  Runs awk in compatibility mode, so that gawk behaves exactly like standard awk and all gawk extensions are ignored.
- -W copyleft or --copyleft, -W copyright or --copyright
  Prints brief copyright information.
- -W help or --help, -W usage or --usage
  Prints a short summary of all awk options.
- -W lint or --lint
  Prints warnings about constructs that are not portable to traditional Unix platforms.
- -W lint-old or --lint-old
  Prints warnings about constructs that are not portable to the original Unix version of awk.
- -W posix
  Turns on compatibility mode with additional restrictions: \x escape sequences are not recognized; func is not recognized as a synonym for the keyword function; a newline does not act as a field separator when FS is a single space; the operators ** and **= cannot be used in place of ^ and ^=; fflush is not available.
- -W re-interval or --re-interval
  Allows interval expressions in regular expressions (see also the POSIX character classes used in grep, such as the bracket expression [[:alpha:]]).
- -W source program-text or --source program-text
  Uses program-text as awk source code; it can be mixed with the -f option.
- -W version or --version
  Prints version and bug-reporting information.
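The syntax line above allows variables to be set either with -v or with a var=value operand placed among the file names. The two are not equivalent: a -v assignment takes effect before the BEGIN block runs, while a var=value operand is processed only when awk reaches it in the argument list. A minimal sketch, assuming a one-line throwaway input and the illustrative variable names a and b:

```shell
# -v a=1 is already set inside BEGIN; the operand b=2 is assigned later,
# just before the following input ("-" means standard input) is read.
result=$(printf 'x\n' | awk -v a=1 '
    BEGIN { print "BEGIN: a=" a ", b=" b }
          { print "main:  a=" a ", b=" b }
' b=2 -)
printf '%s\n' "$result"
```

In this sketch b is still empty inside BEGIN and only has its value once input is being processed, which is why -v is usually the safer choice for values that BEGIN needs.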
Basic Usage
The file log.txt contains:
    2 this is a test
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
Usage 1:
    awk '[pattern] {action}' {filenames}   # line-matching statement; enclose the awk program in single quotes
Example:
    # Split each line on spaces or tabs and print fields 1 and 4
    $ awk '{print $1,$4}' log.txt
    ---------------------------------------------
    2 a
    3 like
    This's
    10 orange,apple,mongo

    # Formatted output
    $ awk '{printf "%-8s %-10s\n",$1,$4}' log.txt
    ---------------------------------------------
    2        a
    3        like
    This's
    10       orange,apple,mongo
Usage 2:
    awk -F   # -F is equivalent to the built-in variable FS and specifies the field separator
Example:
    # Split on ","
    $ awk -F, '{print $1,$2}' log.txt
    ---------------------------------------------
    2 this is a test
    3 Are you like awk
    This's a test
    10 There are orange apple

    # Or use the built-in variable
    $ awk 'BEGIN{FS=","} {print $1,$2}' log.txt
    ---------------------------------------------
    2 this is a test
    3 Are you like awk
    This's a test
    10 There are orange apple

    # Use multiple separators: split on either a space or ","
    $ awk -F '[ ,]' '{print $1,$2,$5}' log.txt
    ---------------------------------------------
    2 this test
    3 Are awk
    This's a
    10 There apple
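Because the separator given to -F may be a regular expression, it matters whether the expression matches one separator character or a run of them. The following sketch compares the two on a hypothetical input line in which a comma is followed by a space; the variable names line, single, and multi are illustrative only:

```shell
# Hypothetical input with a comma followed by a space.
line='a, b,c'

# One separator character per split: ", " produces an empty field.
single=$(printf '%s\n' "$line" | awk -F '[ ,]' '{ print NF }')

# One or more separator characters per split: no empty fields.
multi=$(printf '%s\n' "$line" | awk -F '[ ,]+' '{ print NF ": " $1 "|" $2 "|" $3 }')

printf '%s\n%s\n' "$single" "$multi"
```

With the single-character class the line splits into four fields, one of them empty; adding + collapses adjacent separators so that only the three visible values remain.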
Usage 3:
    awk -v   # set a variable
Example:
    $ awk -va=1 '{print $1,$1+a}' log.txt
    ---------------------------------------------
    2 3
    3 4
    This's 1
    10 11

    $ awk -va=1 -vb=s '{print $1,$1+a,$1b}' log.txt
    ---------------------------------------------
    2 3 2s
    3 4 3s
    This's 1 This'ss
    10 11 10s
Usage 4:
    awk -f {awk script} {filename}
Example:
$ awk -f cal.awk log.txt
Operators
Operator | Description |
---|---|
= += -= *= /= %= ^= **= | Assignment |
?: | C-style conditional expression |
|| | Logical OR |
&& | Logical AND |
~ !~ | Matches a regular expression; does not match a regular expression |
< <= > >= != == | Relational operators |
space | String concatenation |
+ - | Addition, subtraction |
* / % | Multiplication, division, and remainder |
+ - ! | Unary plus, unary minus, and logical negation |
^ ** | Exponentiation |
++ -- | Increment and decrement, as prefix or postfix |
$ | Field reference |
in | Array membership |
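Several of these operators can be tried without any input file by putting them in a BEGIN block. This is only an illustrative sketch; the variable names n, seen, m, and nm are made up for the example:

```shell
# Each statement exercises one row of the operator table.
result=$(awk 'BEGIN {
    n = 7
    print n % 3                                  # remainder
    print 2 ^ 10                                 # exponentiation
    print (n > 5 ? "big" : "small")              # conditional expression
    print "a" "b" n                              # concatenation by juxtaposition
    seen["x"] = 1
    print (("x" in seen) ? "found" : "missing")  # array membership
    m  = ("awk" ~ /^a/)                          # 1 when the regex matches
    nm = ("awk" !~ /^a/)                         # 1 when it does not match
    print m, nm
}')
printf '%s\n' "$result"
```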
Select the rows whose first column is greater than 2
    $ awk '$1>2' log.txt    # command
    # output
    3 Are you like awk
    This's a test
    10 There are orange,apple,mongo
Select the rows whose first column equals 2
    $ awk '$1==2 {print $1,$3}' log.txt    # command
    # output
    2 is
Select the rows whose first column is greater than 2 and whose second column equals 'Are'
    $ awk '$1>2 && $2=="Are" {print $1,$2,$3}' log.txt    # command
    # output
    3 Are you
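Note that the pattern $1>2 also selects the line starting with This's: when a field does not look like a number, awk compares it as a string, and in the usual collation letters sort after digits. If only numeric first columns should count, one possible approach is to add a regular-expression test. The sketch below recreates the sample file in a temporary directory, whose location is an assumption of the example:

```shell
# Recreate the sample file in a temporary directory (hypothetical location).
dir=$(mktemp -d)
cat > "$dir/log.txt" <<'EOF'
2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo
EOF

# Keep a row only if column 1 consists of digits and is greater than 2.
result=$(awk '$1 ~ /^[0-9]+$/ && $1 > 2' "$dir/log.txt")
printf '%s\n' "$result"
rm -r "$dir"
```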
Built-in variables
Variable | Description |
---|---|
$n | The n-th field of the current record; fields are separated by FS |
$0 | The complete input record |
ARGC | The number of command-line arguments |
ARGIND | Position on the command line of the current file, counting from 0 (gawk extension) |
ARGV | Array containing the command-line arguments |
CONVFMT | Number conversion format (default %.6g) |
ENVIRON | Associative array of environment variables |
ERRNO | Description of the last system error (gawk extension) |
FIELDWIDTHS | List of field widths, separated by spaces (gawk extension) |
FILENAME | Name of the current file |
FNR | Like NR, but relative to the current file |
FS | Field separator (default is any whitespace) |
IGNORECASE | If true, matching is case-insensitive (gawk extension) |
NF | The number of fields in the current record |
NR | The number of records read so far |
OFMT | Output format for numbers (default %.6g) |
OFS | Output field separator (default is a space) |
ORS | Output record separator (default is a newline) |
RLENGTH | Length of the string matched by the match function |
RS | Record separator (default is a newline) |
RSTART | Start position of the string matched by the match function |
SUBSEP | Array subscript separator (default \034) |
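The difference between NR and FNR only shows up when awk reads more than one file. A small sketch, assuming two throwaway files named one.txt and two.txt in a temporary directory (both names are hypothetical):

```shell
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\n'    > "$dir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
result=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, $0 }' one.txt two.txt)
printf '%s\n' "$result"
rm -r "$dir"
```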
    $ awk 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}' log.txt
    FILENAME ARGC  FNR   FS   NF   NR  OFS  ORS   RS
    ---------------------------------------------
    log.txt    2    1         5    1
    log.txt    2    2         5    2
    log.txt    2    3         3    3
    log.txt    2    4         4    4

    $ awk -F\' 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}' log.txt
    FILENAME ARGC  FNR   FS   NF   NR  OFS  ORS   RS
    ---------------------------------------------
    log.txt    2    1    '    1    1
    log.txt    2    2    '    1    2
    log.txt    2    3    '    2    3
    log.txt    2    4    '    1    4

    # Print the record number NR and the per-file line number FNR
    $ awk '{print NR,FNR,$1,$2,$3}' log.txt
    ---------------------------------------------
    1 1 2 this is
    2 2 3 Are you
    3 3 This's a test
    4 4 10 There are

    # Specify the output field separator
    $ awk '{print $1,$2,$5}' OFS=" $ " log.txt
    ---------------------------------------------
    2 $ this $ test
    3 $ Are $ awk
    This's $ a $
    10 $ There $
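RSTART and RLENGTH from the table are set as a side effect of calling match, and NF can be used with $ to reach the last field. A minimal sketch on a made-up input line (the text of the line is an assumption of the example):

```shell
result=$(printf 'error code=404 at /index\n' | awk '{
    # match() returns the start position (also stored in RSTART), or 0 if nothing matched
    if (match($0, /[0-9]+/))
        print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
    # NF is the field count, so $NF is the last field
    print NF, $NF
}')
printf '%s\n' "$result"
```

Passing RSTART and RLENGTH to substr is a common way to extract exactly the text that matched.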
Matching with regular expressions
    # Print the second and fourth columns of the lines whose second column contains "th"
    $ awk '$2 ~ /th/ {print $2,$4}' log.txt
    ---------------------------------------------
    this a
The ~ operator introduces a pattern match; the pattern itself is written between the slashes //.
    # Print the lines that contain "re"
    $ awk '/re/ ' log.txt
    ---------------------------------------------
    3 Are you like awk
    10 There are orange,apple,mongo
Ignoring case
    $ awk 'BEGIN{IGNORECASE=1} /this/' log.txt
    ---------------------------------------------
    2 this is a test
    This's a test
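IGNORECASE is a gawk extension, so other awk implementations treat it as an ordinary variable and the match stays case-sensitive. One portable alternative, shown here only as a sketch, is to lowercase the record with tolower before matching. The sample file is recreated in a temporary directory, whose location is an assumption of the example:

```shell
dir=$(mktemp -d)
cat > "$dir/log.txt" <<'EOF'
2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo
EOF

# Compare a lowercased copy of the line; $0 itself is printed unchanged.
result=$(awk 'tolower($0) ~ /this/' "$dir/log.txt")
printf '%s\n' "$result"
rm -r "$dir"
```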
Negating a pattern
    $ awk '$2 !~ /th/ {print $2,$4}' log.txt
    ---------------------------------------------
    Are like
    a
    There orange,apple,mongo

    $ awk '!/th/ {print $2,$4}' log.txt
    ---------------------------------------------
    Are like
    a
    There orange,apple,mongo
awk scripts
In awk scripts, two keywords deserve attention: BEGIN and END.
- BEGIN { statements to execute before any input is processed }
- END { statements to execute after all lines have been processed }
- { statements to execute for each line }
Assume the following file (a table of student scores):
    $ cat score.txt
    Marry 2143 78 84 77
    Jack 2321 66 78 45
    Tom 2122 48 77 71
    Mike 2537 87 97 95
    Bob 2415 40 57 62
The awk script is as follows:
    $ cat cal.awk
    #!/bin/awk -f
    # Before processing
    BEGIN {
        math = 0
        english = 0
        computer = 0

        printf "NAME    NO.   MATH  ENGLISH  COMPUTER   TOTAL\n"
        printf "---------------------------------------------\n"
    }
    # While processing each line
    {
        math+=$3
        english+=$4
        computer+=$5
        printf "%-6s %-6s %4d %8d %8d %8d\n", $1, $2, $3,$4,$5, $3+$4+$5
    }
    # After processing
    END {
        printf "---------------------------------------------\n"
        printf "  TOTAL:%10d %8d %8d \n", math, english, computer
        printf "AVERAGE:%10.2f %8.2f %8.2f\n", math/NR, english/NR, computer/NR
    }
Running the script gives:
    $ awk -f cal.awk score.txt
    NAME    NO.   MATH  ENGLISH  COMPUTER   TOTAL
    ---------------------------------------------
    Marry  2143     78       84       77      239
    Jack   2321     66       78       45      189
    Tom    2122     48       77       71      196
    Mike   2537     87       97       95      279
    Bob    2415     40       57       62      159
    ---------------------------------------------
      TOTAL:       319      393      350
    AVERAGE:     63.80    78.60    70.00
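The same BEGIN, per-line, and END structure can carry state other than running sums. As an illustrative variation (not part of cal.awk), the sketch below tracks the student with the highest total and checks NR in END so that an empty input file does not print a misleading result. The variable names total, best, and name are made up for the example:

```shell
dir=$(mktemp -d)
cat > "$dir/score.txt" <<'EOF'
Marry 2143 78 84 77
Jack 2321 66 78 45
Tom 2122 48 77 71
Mike 2537 87 97 95
Bob 2415 40 57 62
EOF

result=$(awk '
    {
        total = $3 + $4 + $5
        # Remember the first row, then any row that beats the current best.
        if (NR == 1 || total > best) { best = total; name = $1 }
    }
    END {
        if (NR > 0) print name, best
        else        print "no input"
    }
' "$dir/score.txt")
printf '%s\n' "$result"
rm -r "$dir"
```

The same NR > 0 check would also protect the averages in a script like cal.awk, where dividing by NR fails on an empty file.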
More examples
A hello-world program in AWK:
BEGIN { print "Hello, world!" }
Calculating the total size of files
    $ ls -l *.txt | awk '{sum+=$5} END {print sum}'
    --------------------------------------------------
    666581
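Summing a column of ls -l output depends on which column holds the size. In the common ls -l layout (permissions, links, owner, group, size, date, name) the size is the fifth field, which is what this sketch assumes. It creates two small files of known size in a temporary directory so the result can be checked by hand:

```shell
dir=$(mktemp -d)
printf 'hello'  > "$dir/a.txt"    # 5 bytes
printf 'text\n' > "$dir/b.txt"    # 5 bytes

# Assumes the size is field 5; "sum + 0" prints 0 instead of an empty line
# when no input line was read.
result=$(ls -l "$dir"/*.txt | awk '{ sum += $5 } END { print sum + 0 }')
printf '%s\n' "$result"
rm -r "$dir"
```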
Find the lines in a file that are longer than 80 characters
    awk 'length>80' log.txt
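When checking a file for long lines it is often more useful to see where they are than to print the lines themselves. A small sketch that reports the line number and length, with the limit passed in through -v; the variable name max and the two sample lines are assumptions of the example:

```shell
# Report "line number: length" for every line longer than max characters.
result=$(printf 'short\na considerably longer line\n' |
    awk -v max=10 'length($0) > max { print NR ": " length($0) }')
printf '%s\n' "$result"
```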
Print a multiplication table
seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'
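The pipeline above uses seq and sed to feed awk one growing paragraph per row. The same table can also be produced by awk alone with two nested loops in a BEGIN block; this is an alternative sketch, not a replacement for the original one-liner, and the variable names row and col are illustrative:

```shell
result=$(awk 'BEGIN {
    for (row = 1; row <= 9; row++)
        for (col = 1; col <= row; col++)
            # A tab separates entries; the last entry of each row ends the line.
            printf "%dx%d=%d%s", col, row, col * row, (col == row ? "\n" : "\t")
}')
printf '%s\n' "$result"
```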
For more details, see the official AWK manual: http://www.gnu.org/software/gawk/manual/gawk.html