
Linux awk command



AWK is a language for processing text files and a powerful tool for text analysis.

AWK takes its name from the first letters of the family names of its three creators: Alfred Aho, Peter Weinberger, and Brian Kernighan.

Syntax

awk [options] 'script' var=value file(s)
or
awk [options] -f scriptfile var=value file(s)

Option descriptions:

  • -F fs or --field-separator fs
    Specifies the input field separator; fs is a string or a regular expression, for example -F:.
  • -v var=value or --assign var=value
    Assigns a value to a user-defined variable.
  • -f scriptfile or --file scriptfile
    Reads the awk program from a script file.
  • -mf nnn and -mr nnn
    Set built-in limits to the value nnn: -mf limits the maximum number of fields, and -mr limits the maximum record size. Both are extensions from the Bell Labs version of awk and do not apply to standard awk.
  • -W compat or --compat, -W traditional or --traditional
    Runs gawk in compatibility mode, so that it behaves exactly like standard awk and all gawk extensions are ignored.
  • -W copyleft or --copyleft, -W copyright or --copyright
    Prints brief copyright information.
  • -W help or --help, -W usage or --usage
    Prints a short summary of all awk options.
  • -W lint or --lint
    Prints warnings about constructs that are not portable to traditional Unix awk implementations.
  • -W lint-old or --lint-old
    Prints warnings about constructs that are not portable to traditional Unix platforms.
  • -W posix
    Turns on compatibility mode with these additional restrictions: \x escape sequences are not recognized; when FS is a single space, only spaces and tabs act as field separators and a newline does not; the keyword func is not recognized as a synonym for function; the operators ** and **= cannot be used in place of ^ and ^=; fflush is not available.
  • -W re-interval or --re-interval
    Allows interval expressions in regular expressions; see the POSIX character classes in grep, such as the bracket expression [[:alpha:]].
  • -W source program-text or --source program-text
    Uses program-text as awk source code; it can be mixed with the -f option.
  • -W version or --version
    Prints version and bug-reporting information.
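
As a rough illustration of how the two most common options combine, the sketch below uses -F to set the separator and -v to pass a value into the program. The passwd-style sample data and the variable name min_uid are made up for this example.

```shell
# Hypothetical colon-separated sample data (not a real system file).
data='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/bash
bob:x:1001:1001:Bob:/home/bob:/bin/sh'

# -F: sets the field separator to a colon.
# -v min_uid=1000 defines an awk variable before the program starts.
users=$(printf '%s\n' "$data" | awk -F: -v min_uid=1000 '$3 >= min_uid {print $1, $7}')
printf '%s\n' "$users"
```

Only the records whose third field is at least min_uid are printed, here the alice and bob lines.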

Basic Usage

The file log.txt contains the following text:

2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo

Usage 1:

awk '[pattern] {action}' {filenames}   # line-matching statement; enclose the awk program in single quotes

Example:

# Split each line on spaces or tabs and print fields 1 and 4
 $ awk '{print $1,$4}' log.txt
 ---------------------------------------------
 2 a
 3 like
 This's
 10 orange,apple,mongo
 # Formatted output
 $ awk '{printf "%-8s %-10s\n",$1,$4}' log.txt
 ---------------------------------------------
 2        a
 3        like
 This's
 10       orange,apple,mongo
 

Usage 2:

awk -F  # -F corresponds to the built-in variable FS and sets the field separator

Example:

# Split on ","
 $  awk -F, '{print $1,$2}'   log.txt
 ---------------------------------------------
 2 this is a test
 3 Are you like awk
 This's a test
 10 There are orange apple
 # Or use the built-in variable
 $ awk 'BEGIN{FS=","} {print $1,$2}'     log.txt
 ---------------------------------------------
 2 this is a test
 3 Are you like awk
 This's a test
 10 There are orange apple
 # Use multiple separators: the bracket expression splits on either a space or a ","
 $ awk -F '[ ,]'  '{print $1,$2,$5}'   log.txt
 ---------------------------------------------
 2 this test
 3 Are awk
 This's a
 10 There apple
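
One detail worth checking when using a bracket expression as the separator: [ ,] matches a single character, so a comma followed by a space produces an empty field between them. The sketch below compares it with [ ,]+, which treats a run of separators as one. The sample line is made up for this example.

```shell
# Hypothetical line where each comma is followed by a space.
line='10 There are orange, apple, mango'

# [ ,] matches exactly one separator character, so ", " leaves an empty field in between.
single=$(printf '%s\n' "$line" | awk -F '[ ,]' '{print NF}')

# [ ,]+ matches a run of separator characters, so ", " counts as a single separator.
run=$(printf '%s\n' "$line" | awk -F '[ ,]+' '{print NF}')

echo "single=$single run=$run"
```

The first count includes the two empty fields, so it is larger than the second.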

Usage 3:

awk -v  # set a variable

Example:

 $ awk -va=1 '{print $1,$1+a}' log.txt
 ---------------------------------------------
 2 3
 3 4
 This's 1
 10 11
 $ awk -va=1 -vb=s '{print $1,$1+a,$1b}' log.txt
 ---------------------------------------------
 2 3 2s
 3 4 3s
 This's 1 This'ss
 10 11 10s
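
A common use of -v is handing a shell variable to awk without splicing it into the program text. The following is a minimal sketch; the names threshold and limit are chosen for illustration only.

```shell
# Hypothetical shell variable holding a threshold.
threshold=5

# Pass the shell value in with -v; inside the program it is the awk variable "limit".
big=$(printf '2\n3\n10\n' | awk -v limit="$threshold" '$1 > limit {print $1}')
echo "$big"
```

Because the program stays inside single quotes, the shell never expands anything in it, and the value arrives only through -v.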

Usage 4:

awk -f {awk script} {filename}

Example:

 $ awk -f cal.awk log.txt

Operators

Operator                 Description
= += -= *= /= %= ^= **=  Assignment
?:                       C-style conditional expression
||                       Logical OR
&&                       Logical AND
~ !~                     Matches a regular expression, does not match a regular expression
< <= > >= != ==          Relational operators
(space)                  Concatenation
+ -                      Addition, subtraction
* / %                    Multiplication, division, and remainder
+ - !                    Unary plus, unary minus, and logical negation
^ **                     Exponentiation
++ --                    Increment or decrement, as a prefix or suffix
$                        Field reference
in                       Array membership
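
To make a few of these operators concrete, here is a small sketch that runs entirely in a BEGIN block, so it needs no input file. The variable names are arbitrary.

```shell
ops=$(awk 'BEGIN {
    n = 7
    parity = (n % 2 == 0) ? "even" : "odd"   # remainder and conditional expression
    label = "n=" n                            # placing values side by side concatenates them
    sq = n ^ 2                                # exponentiation
    seen["apple"] = 1
    has = ("apple" in seen) ? "yes" : "no"    # array membership test
    print label, parity, sq, has
}')
echo "$ops"
```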

Filter for rows whose first column is greater than 2

$ awk '$1>2' log.txt    # command
# output
3 Are you like awk
This's a test
10 There are orange,apple,mongo
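
The line This's a test appears in this output because its first field is not a number, so awk compares it with 2 as a string rather than numerically. If only numeric first fields should count, one possible approach is to force a numeric comparison by adding 0, as in this sketch (it recreates log.txt in a temporary directory so that it runs on its own):

```shell
# Recreate the sample log.txt in a temporary directory.
dir=$(mktemp -d)
cat > "$dir/log.txt" <<'EOF'
2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo
EOF

# Adding 0 forces a numeric comparison; non-numeric text evaluates to 0.
numeric=$(awk '$1+0 > 2' "$dir/log.txt")
printf '%s\n' "$numeric"
rm -r "$dir"
```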

Filter for rows whose first column equals 2

$ awk '$1==2 {print $1,$3}' log.txt    # command
# output
2 is

Filter for rows whose first column is greater than 2 and whose second column equals 'Are'

$ awk '$1>2 && $2=="Are" {print $1,$2,$3}' log.txt    # command
# output
3 Are you

Built-in variables

Variable     Description
$n           The n-th field of the current record; fields are separated by FS
$0           The complete input record
ARGC         Number of command-line arguments
ARGIND       Index in ARGV of the file currently being processed (ARGV is indexed from 0; gawk extension)
ARGV         Array containing the command-line arguments
CONVFMT      Number conversion format (default %.6g)
ENVIRON      Associative array of environment variables
ERRNO        Description of the last system error (gawk extension)
FIELDWIDTHS  List of field widths, separated by spaces (gawk extension)
FILENAME     Name of the current file
FNR          Like NR, but relative to the current file
FS           Field separator (default is a single space, which splits on runs of whitespace)
IGNORECASE   If true, matching is case-insensitive (gawk extension)
NF           Number of fields in the current record
NR           Number of records read so far
OFMT         Output format for numbers (default %.6g)
OFS          Output field separator (default is a space)
ORS          Output record separator (default is a newline)
RLENGTH      Length of the string matched by the match function
RS           Input record separator (default is a newline)
RSTART       Start position of the string matched by the match function
SUBSEP       Array subscript separator (default \034)
$ awk 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}'  log.txt
FILENAME ARGC  FNR   FS   NF   NR  OFS  ORS   RS
---------------------------------------------
log.txt    2    1         5    1
log.txt    2    2         5    2
log.txt    2    3         3    3
log.txt    2    4         4    4
$ awk -F\' 'BEGIN{printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n","FILENAME","ARGC","FNR","FS","NF","NR","OFS","ORS","RS";printf "---------------------------------------------\n"} {printf "%4s %4s %4s %4s %4s %4s %4s %4s %4s\n",FILENAME,ARGC,FNR,FS,NF,NR,OFS,ORS,RS}'  log.txt
FILENAME ARGC  FNR   FS   NF   NR  OFS  ORS   RS
---------------------------------------------
log.txt    2    1    '    1    1
log.txt    2    2    '    1    2
log.txt    2    3    '    2    3
log.txt    2    4    '    1    4
# Print the record number NR and the per-file line number FNR
$ awk '{print NR,FNR,$1,$2,$3}' log.txt
---------------------------------------------
1 1 2 this is
2 2 3 Are you
3 3 This's a test
4 4 10 There are
# Specify the output field separator
$  awk '{print $1,$2,$5}' OFS=" $ "  log.txt
---------------------------------------------
2 $ this $ test
3 $ Are $ awk
This's $ a $
10 $ There $
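
With a single input file NR and FNR are always equal, so the difference only shows up with several files. The sketch below uses two small made-up files (one.txt and two.txt) and also prints $NF, the last field of each record.

```shell
# Two hypothetical input files in a temporary directory.
dir=$(mktemp -d)
printf 'a b c\nd e\n' > "$dir/one.txt"
printf 'f g h i\n' > "$dir/two.txt"

# NR keeps counting across files, FNR restarts for each file, and $NF is the last field.
report=$(cd "$dir" && awk '{print FILENAME, NR, FNR, NF, $NF}' one.txt two.txt)
printf '%s\n' "$report"
rm -r "$dir"
```

On the first line of two.txt, NR is 3 while FNR is back to 1.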

Regular expression matching

# Select lines whose second column contains "th", and print the second and fourth columns
$ awk '$2 ~ /th/ {print $2,$4}' log.txt
---------------------------------------------
this a

~ marks the start of a pattern match; the pattern itself is written between the slashes //.

# Print lines containing "re"
$ awk '/re/ ' log.txt
---------------------------------------------
3 Are you like awk
10 There are orange,apple,mongo

Ignore case

$ awk 'BEGIN{IGNORECASE=1} /this/' log.txt
---------------------------------------------
2 this is a test
This's a test
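
IGNORECASE is specific to gawk. Where another awk implementation may be in use, one possible alternative is to lowercase the line with tolower before matching, as in this sketch (it recreates log.txt in a temporary directory so that it runs on its own):

```shell
# Recreate the sample log.txt in a temporary directory.
dir=$(mktemp -d)
cat > "$dir/log.txt" <<'EOF'
2 this is a test
3 Are you like awk
This's a test
10 There are orange,apple,mongo
EOF

# tolower() returns a lowercase copy of the line, so the lowercase pattern matches any casing.
matches=$(awk 'tolower($0) ~ /this/' "$dir/log.txt")
printf '%s\n' "$matches"
rm -r "$dir"
```

The original line is printed unchanged, because tolower only affects the copy used for the comparison.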

Negating a pattern

$ awk '$2 !~ /th/ {print $2,$4}' log.txt
---------------------------------------------
Are like
a
There orange,apple,mongo
$ awk '!/th/ {print $2,$4}' log.txt
---------------------------------------------
Are like
a
There orange,apple,mongo

awk scripts

For awk scripts, two keywords deserve attention: BEGIN and END.

  • BEGIN { statements to execute before any input is processed }
  • END { statements to execute after all lines have been processed }
  • { statements to execute for each line }

Suppose we have the following file (a table of student scores):
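
The three kinds of block can be seen in a minimal line counter. This is only a sketch with made-up input fed in through printf.

```shell
summary=$(printf 'a\nb\nc\n' | awk '
BEGIN { print "start" }          # runs once, before any input is read
      { count++ }                # runs for every input line
END   { print "lines:", count }  # runs once, after the last line
')
printf '%s\n' "$summary"
```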

$ cat score.txt
Marry   2143 78 84 77
Jack    2321 66 78 45
Tom     2122 48 77 71
Mike    2537 87 97 95
Bob     2415 40 57 62

Our awk script is as follows:

$ cat cal.awk
#!/bin/awk -f
# Before processing
BEGIN {
    math = 0
    english = 0
    computer = 0
 
    printf "NAME    NO.   MATH  ENGLISH  COMPUTER   TOTAL\n"
    printf "---------------------------------------------\n"
}
# During processing
{
    math+=$3
    english+=$4
    computer+=$5
    printf "%-6s %-6s %4d %8d %8d %8d\n", $1, $2, $3,$4,$5, $3+$4+$5
}
# After processing
END {
    printf "---------------------------------------------\n"
    printf "  TOTAL:%10d %8d %8d \n", math, english, computer
    printf "AVERAGE:%10.2f %8.2f %8.2f\n", math/NR, english/NR, computer/NR
}

Let's look at the result:

$ awk -f cal.awk score.txt
NAME    NO.   MATH  ENGLISH  COMPUTER   TOTAL
---------------------------------------------
Marry  2143     78       84       77      239
Jack   2321     66       78       45      189
Tom    2122     48       77       71      196
Mike   2537     87       97       95      279
Bob    2415     40       57       62      159
---------------------------------------------
  TOTAL:       319      393      350
AVERAGE:     63.80    78.60    70.00
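
The same pattern of accumulating in the main block and reporting in END can answer other questions about this file. As one hypothetical variation, not part of cal.awk, the sketch below finds the student with the highest total; the variable names best and name are chosen for illustration.

```shell
# Recreate score.txt in a temporary directory.
dir=$(mktemp -d)
cat > "$dir/score.txt" <<'EOF'
Marry   2143 78 84 77
Jack    2321 66 78 45
Tom     2122 48 77 71
Mike    2537 87 97 95
Bob     2415 40 57 62
EOF

# Track the highest total seen so far and report it once all lines are read.
top=$(awk '{ total = $3 + $4 + $5; if (total > best) { best = total; name = $1 } }
           END { print name, best }' "$dir/score.txt")
echo "$top"
rm -r "$dir"
```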

Further examples

An AWK hello world program:

BEGIN { print "Hello, world!" }

File Size Calculation

$ ls -l *.txt | awk '{sum+=$5} END {print sum}'
--------------------------------------------------
666581

Find the lines in a file that are longer than 80 characters

awk 'length>80' log.txt
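
To also see which lines are too long, the line number and length can be printed alongside. The following sketch builds its own input with printf: one short line and one line of 81 characters. It writes length($0), which is the explicit form of the bare length used above.

```shell
# printf pads the number 0 to a width of 81 characters to create one over-long line.
long=$(printf 'short\n%081d\n' 0 | awk 'length($0) > 80 {print NR ": " length($0) " chars"}')
echo "$long"
```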

Print multiplication table

seq 9 | sed 'H;g' | awk -v RS='' '{for(i=1;i<=NF;i++)printf("%dx%d=%d%s", i, NR, i*NR, i==NR?"\n":"\t")}'
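
The pipeline above relies on seq and sed to shape the input. A similar table can also be produced by awk alone with two nested loops in a BEGIN block, sketched here with the arbitrary loop variables row and col:

```shell
table=$(awk 'BEGIN {
    for (row = 1; row <= 9; row++)
        for (col = 1; col <= row; col++)
            # a tab separates entries; a newline ends each row
            printf "%dx%d=%d%s", col, row, col * row, (col == row ? "\n" : "\t")
}')
printf '%s\n' "$table"
```

Since everything happens in BEGIN, no input file is needed.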

For more details, see the official AWK manual: http://www.gnu.org/software/gawk/manual/gawk.html
