AWK command in Unix/Linux

Awk is mostly used for pattern scanning and processing. It searches one or more files for lines that match the specified patterns and then performs the associated actions.

Some of the key features of Awk are:

  1. Awk views a text file as records and fields.
  2. Like most programming languages, Awk has variables, conditionals and loops.
  3. Awk has arithmetic and string operators.
  4. Awk can generate formatted reports.

 



 
The syntax of awk is:

awk 'pattern {action}' file

Consider a text file, example.txt, to be processed:
$ cat example.txt

Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
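Either half of the pattern {action} pair is optional: an action with no pattern runs on every record, and a pattern with no action prints each matching record in full. A minimal sketch of the second case, assuming a POSIX shell with mktemp available (the scratch directory and the variable names are illustrative, not part of the original example):

```shell
# Recreate the sample file in a scratch directory (the location is arbitrary).
workdir=$(mktemp -d)
cat > "$workdir/example.txt" <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# A pattern with no action: every record containing "Manager" is printed whole.
managers=$(awk '/Manager/' "$workdir/example.txt")
printf '%s\n' "$managers"

# Clean up the scratch directory.
rm -r "$workdir"
```

This should print the two Manager records (Thomas and Nisha) unchanged.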

We can instruct AWK to print only certain columns from the input file:
$ awk '{print $1 "\t" $4}' example.txt

Name Salary
Thomas $5,000
Jason $5,500
Sanjay $7,000
Nisha $9,500
Rady $6,000

In our example.txt file, the 1st column contains the Name and the 4th column contains the corresponding Salary. We print these two columns using AWK’s print command. In the above example, $1 and $4 represent the first and fourth fields of the input record.
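When the columns need to line up, as in a formatted report, printf can be used in place of print. The following is only a sketch: the column widths (8 and 12) and the choice of fields are assumptions made for illustration.

```shell
# Recreate the sample file in a scratch directory (the location is arbitrary).
workdir=$(mktemp -d)
cat > "$workdir/example.txt" <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# NR > 1 skips the header line.
# %-8s and %-12s left-justify a field and pad it to a fixed width.
# Unlike print, printf adds no newline, so "\n" is written explicitly.
report=$(awk 'NR > 1 {printf "%-8s %-12s %s\n", $1, $3, $4}' "$workdir/example.txt")
printf '%s\n' "$report"

# Clean up the scratch directory.
rm -r "$workdir"
```

Each output line holds the name, department and salary, with the first two padded so that the columns start at the same position on every line.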
 
To omit the header record and print only the names:
$ awk 'NR!=1{print $1}' example.txt

Thomas
Jason
Sanjay
Nisha
Rady

NR holds the number of the current record (line), counting from 1 up to the total line count.
The condition 'NR!=1' tells Awk not to execute the action for the first line of the file.
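Conditions like this can be combined with && and ||, and they can test field values as well as NR. A small sketch, assuming we want the names of everyone in the Technology department (the department filter is just an illustration):

```shell
# Recreate the sample file in a scratch directory (the location is arbitrary).
workdir=$(mktemp -d)
cat > "$workdir/example.txt" <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# Skip the header (NR != 1) and keep only records whose 3rd field is "Technology".
tech=$(awk 'NR != 1 && $3 == "Technology" {print $1}' "$workdir/example.txt")
printf '%s\n' "$tech"

# Clean up the scratch directory.
rm -r "$workdir"
```

With the sample data this should list Jason, Sanjay and Rady, one per line.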
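
To print the data after the comma (the header line has no comma, so its second field is empty and Awk prints a blank line for it):
$ awk -F, '{print $2}' example.txt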

000
500
000
500
000

To display fields separated by commas:
$ awk 'BEGIN {OFS=","} {print $1,$2}' example.txt

Name,Designation
Thomas,Manager
Jason,Developer
Sanjay,Sysadmin
Nisha,Manager
Rady,DBA

To display fields separated by colons:
$ awk 'BEGIN {OFS=":"} {print $1,$2,$3,$4}' example.txt

Name:Designation:Department:Salary
Thomas:Manager:Sales:$5,000
Jason:Developer:Technology:$5,500
Sanjay:Sysadmin:Technology:$7,000
Nisha:Manager:Marketing:$9,500
Rady:DBA:Technology:$6,000
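Listing every field by hand becomes tedious for wide files. A common idiom is to assign a field to itself, which makes Awk rebuild the whole record using OFS. A sketch of that approach, under the same assumptions about the sample file as above:

```shell
# Recreate the sample file in a scratch directory (the location is arbitrary).
workdir=$(mktemp -d)
cat > "$workdir/example.txt" <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# Assigning to any field ($1 = $1 changes nothing) forces Awk to rebuild $0,
# joining the fields with OFS. A bare print then outputs the rebuilt record.
colon=$(awk 'BEGIN {OFS=":"} {$1 = $1; print}' "$workdir/example.txt")
printf '%s\n' "$colon"

# Clean up the scratch directory.
rm -r "$workdir"
```

The result should match the colon-separated listing produced by naming all four fields explicitly.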

NF – Number of fields variable:
NF holds the number of fields in the current line:
$ awk '{print NF}' example.txt

4
4
4
4
4
4
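NF is also handy in two other ways: $NF refers to the last field of a record, and a condition on NF can flag lines that do not have the expected number of columns. The sketch below adds one deliberately incomplete line ("Priya Analyst", a made-up record that is not in the original file) to show both uses.

```shell
# Sample file plus one hypothetical, incomplete record at the end.
workdir=$(mktemp -d)
cat > "$workdir/example.txt" <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
Priya Analyst
EOF

# Report every line that does not have exactly 4 fields.
short=$(awk 'NF != 4 {print "line " NR " has " NF " fields"}' "$workdir/example.txt")
printf '%s\n' "$short"

# For complete data records, print the first and the last field ($NF).
pairs=$(awk 'NR > 1 && NF == 4 {print $1, $NF}' "$workdir/example.txt")
printf '%s\n' "$pairs"

# Clean up the scratch directory.
rm -r "$workdir"
```

The first command should report only the incomplete line; the second should print each name followed by its salary.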

NR – number of records variable:
NR can be used to print the line number of each record or, in an END block, the count of lines in a file.
$ awk '{print NR}' example.txt
This will display the line numbers, starting from 1.

1
2
3
4
5
6

To find the total number of records:

$ awk 'END {print NR}' example.txt

This will display the total number of lines in the file.
6

To display the records whose first field is longer than 5 characters:

$ awk 'length($1) > 5' example.txt

Thomas Manager Sales $5,000
Sanjay Sysadmin Technology $7,000
 



 
To count the records that match a pattern (here, lines containing the letter a):

$ awk '/a/{++cnt} END {print "Count =", cnt}' example.txt

Count = 6
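A single counter can be generalized to one counter per group by using an Awk array indexed by a field value. The following sketch counts employees per department; piping through sort is an addition here, because the order of a for-in loop over an Awk array is not defined.

```shell
# Recreate the sample file in a scratch directory (the location is arbitrary).
workdir=$(mktemp -d)
cat > "$workdir/example.txt" <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# count[] is indexed by department name ($3); the header is skipped with NR > 1.
# The END block runs once after the last record and prints one line per key.
per_dept=$(awk 'NR > 1 {count[$3]++}
                END {for (dept in count) print dept, count[dept]}' \
           "$workdir/example.txt" | sort)
printf '%s\n' "$per_dept"

# Clean up the scratch directory.
rm -r "$workdir"
```

With the sample data, Technology should come out with 3 employees and the other two departments with 1 each.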
 
To extract the first column from a file:
$ awk '{print $1}' example.txt

Name
Thomas
Jason
Sanjay
Nisha
Rady

To rename files (append .new to every file whose name starts with "example"):

$ ls example* | awk '{print "mv "$1" "$1".new"}' | sh

To remove only files (everything that is not a directory):

$ ls -l | awk 'NR>1 && $1!~/^d/{print $9}' | xargs rm

To remove only directories:

$ ls -l | grep '^d' | awk '{print "rm -r "$9}' | sh

OR

$ ls -p | grep '/$' | awk '{print "rm -r "$1}' | sh

OR

$ ls -l | awk '$1~/^d.*x/{print $9}' | xargs rm -r
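Because these pipelines delete data, it can help to preview the generated commands before handing them to sh. The sketch below works in a throwaway directory with made-up names (logs, cache, notes.txt) and only prints the rm commands. Note that $1 holds just the first word of a name, so this approach, like the commands above, is not safe for names containing spaces.

```shell
# Build a throwaway directory with two subdirectories and one regular file.
workdir=$(mktemp -d)
mkdir "$workdir/logs" "$workdir/cache"
touch "$workdir/notes.txt"

# ls -p appends "/" to directory names; the Awk pattern keeps only those lines.
# Nothing is executed here: the commands are printed for review.
preview=$(cd "$workdir" && ls -p | awk '/\/$/ {print "rm -r " $1}')
printf '%s\n' "$preview"

# Clean up the throwaway directory.
rm -r "$workdir"
```

Once the printed commands look right, appending | sh to the pipeline would run them.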

 



 

Difference between grep, sed and awk

grep: searches for specific terms in a file
#usage
$ cat file.txt

Every line containing “This”
Every line containing “This”
Every line containing “That”
Every line containing “This”
Every line containing “This”

$ grep This file.txt

Every line containing “This”
Every line containing “This”
Every line containing “This”
Every line containing “This”

Now, awk and sed are completely different from grep: they are text processors. Besides finding what you are looking for in text, they can also remove, add and modify text.
awk is mostly used for data extraction and reporting. sed is a stream editor.
 

Example:
Sed

$ sed -i 's/cat/dog/g' file.txt

# this will replace every occurrence of the characters 'cat' with 'dog', editing file.txt in place
Awk

$ awk '{print $2}' file.txt

# this will print the second column of file.txt
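Awk can perform the same kind of substitution with its built-in sub() and gsub() functions; by default it prints the result to standard output. A short sketch with made-up input lines:

```shell
# gsub() replaces every match in the record; sub() replaces only the first one.
all=$(printf 'the cat sat\ncat and cat\n' | awk '{gsub(/cat/, "dog"); print}')
first_only=$(printf 'the cat sat\ncat and cat\n' | awk '{sub(/cat/, "dog"); print}')

printf '%s\n' "$all"
printf '%s\n' "$first_only"
```

On the second input line, gsub() should change both occurrences of cat, while sub() should leave the second one untouched.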
Basic awk usage:
Compute sum/average/max/min/etc.
$ cat file.txt

A 10
B 20
C 60

 

$ awk 'BEGIN {sum=0; count=0; OFS="\t"} {sum+=$2; count++} END {print "Average:", sum/count}' file.txt

Average: 30
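Maximum and minimum follow the same pattern: keep a running value and update it on each record. This is one possible sketch, assuming the second column is always numeric; the + 0 forces a numeric value so the comparisons are numeric rather than string-based.

```shell
# Recreate the three-line file.txt in a scratch directory (the location is arbitrary).
workdir=$(mktemp -d)
printf 'A 10\nB 20\nC 60\n' > "$workdir/file.txt"

# Seed max and min from the first record, then update them as records arrive.
stats=$(awk 'NR == 1  {max = min = $2 + 0}
             $2 > max {max = $2 + 0}
             $2 < min {min = $2 + 0}
             END      {print "Max:", max, "Min:", min}' "$workdir/file.txt")
printf '%s\n' "$stats"   # prints: Max: 60 Min: 10

# Clean up the scratch directory.
rm -r "$workdir"
```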
 

Grep is useful if you want to quickly search for lines that match in a file. It can also return some other simple information, like matching line numbers, match counts, and lists of file names.
Awk is an entire programming language built around reading delimited, record-oriented (CSV-style) text files, processing the records, and optionally printing out a result data set. It can do many things, but it is not the easiest tool to use for simple tasks.
Sed is useful when you want to make changes to a file based on regular expressions. It allows you to easily match parts of lines, make modifications, and print out results. It is less expressive than awk, which makes it somewhat easier to use for simple tasks.