AWK command in Unix/Linux
Awk is mostly used for pattern scanning and processing. It searches one or more files for lines that match the specified patterns and then performs the associated actions.
Some of the key features of Awk are:
- Awk views a text file as records and fields.
- Like common programming languages, Awk has variables, conditionals and loops.
- Awk has arithmetic and string operators.
- Awk can generate formatted reports.
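As a rough illustration of these features, the sketch below recreates the example.txt file used in the rest of this article and combines a variable, a conditional, a loop and printf in one program. The names tech, longest and report are made up for this example.

```shell
# Recreate the sample file used throughout this article.
cat > example.txt <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# Variable (tech), conditional (if), loop (for) and formatted output (printf).
report=$(awk '
NR > 1 {
    if ($3 == "Technology")
        tech++
    # Loop over every field and remember the longest one in this record.
    longest = ""
    for (i = 1; i <= NF; i++)
        if (length($i) > length(longest))
            longest = $i
    printf "%-8s %s\n", $1, longest
}
END { printf "Technology staff: %d\n", tech }
' example.txt)
echo "$report"
```

Each data line prints the name padded to eight characters followed by the longest field of that record, and the END block prints how many records belong to the Technology department.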
The syntax of awk is:
awk 'pattern {action}' file
Consider we have a text file example.txt to be processed:
$ cat example.txt
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
We can instruct Awk to print only certain columns from the input:
$ awk '{print $1 "\t" $4}' example.txt
Name Salary
Thomas $5,000
Jason $5,500
Sanjay $7,000
Nisha $9,500
Rady $6,000
In our example.txt file, the 1st column contains the Name and the 4th column contains the corresponding Salary. We print these two columns using Awk’s print command. In the above example, $1 and $4 represent the first and fourth fields of the input record.
To omit the header record and get only the names printed:
$ awk 'NR!=1 {print $1}' example.txt
Thomas
Jason
Sanjay
Nisha
Rady
NR holds the number of the current record (line), counting from 1 up to the total line count.
The condition 'NR!=1' tells Awk not to execute the action for the first line of the file.
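NR can also be combined with comparisons to select a range of lines. The following is a small sketch, assuming the same example.txt as above (recreated here so the snippet runs on its own); the variable name rows is only for illustration.

```shell
# Recreate the sample file used throughout this article.
cat > example.txt <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# Print records 2 through 4, each prefixed with its record number.
rows=$(awk 'NR >= 2 && NR <= 4 {print NR ": " $1}' example.txt)
echo "$rows"
```

This prints "2: Thomas", "3: Jason" and "4: Sanjay", one per line.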
To print the data after the comma (skipping the header line, which contains no comma):
$ awk -F, 'NR!=1 {print $2}' example.txt
000
500
000
500
000
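The comma split above cuts each salary in two. If the goal is to treat the salary as a number instead, one possible approach is to strip the "$" and "," characters with gsub before doing arithmetic. This is a sketch on the same example.txt (recreated here); the variable name total is an assumption made for the example.

```shell
# Recreate the sample file used throughout this article.
cat > example.txt <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# Remove "$" and "," from the 4th field, then add it to a running total.
total=$(awk 'NR > 1 { gsub(/[$,]/, "", $4); total += $4 } END { print total }' example.txt)
echo "$total"
```

With the five salaries in the sample file this prints 33000.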
To display the data separated by commas:
$ awk 'BEGIN {OFS=","} {print $1,$2}' example.txt
Name,Designation
Thomas,Manager
Jason,Developer
Sanjay,Sysadmin
Nisha,Manager
Rady,DBA
To display the data separated by colons:
$ awk 'BEGIN {OFS=":"} {print $1,$2,$3,$4}' example.txt
Name:Designation:Department:Salary
Thomas:Manager:Sales:$5,000
Jason:Developer:Technology:$5,500
Sanjay:Sysadmin:Technology:$7,000
Nisha:Manager:Marketing:$9,500
Rady:DBA:Technology:$6,000
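When every field should be printed, listing them one by one is not required. Assigning a field to itself makes Awk rebuild the record with the current OFS, so a plain print then uses the new separator. A minimal sketch on the same example.txt (recreated here; the variable name rebuilt is only for illustration):

```shell
# Recreate the sample file used throughout this article.
cat > example.txt <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# $1 = $1 forces the record to be rebuilt with OFS between the fields.
rebuilt=$(awk 'BEGIN {OFS=":"} {$1 = $1; print}' example.txt)
echo "$rebuilt"
```

The output is the same colon-separated listing as above, regardless of how many fields each line has.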
NF – Number of fields variable:
NF holds the number of fields in the current line:
$ awk '{print NF}' example.txt
4
4
4
4
4
4
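Because NF is the number of fields, $NF refers to the last field of a line and $(NF-1) to the one before it. A short sketch on the same example.txt (recreated here; the variable names last and second_last are made up for the example):

```shell
# Recreate the sample file used throughout this article.
cat > example.txt <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# $NF is the last field of each record (the Salary column here).
last=$(awk '{print $NF}' example.txt)
echo "$last"

# $(NF-1) is the second-to-last field (the Department column), header skipped.
second_last=$(awk 'NR > 1 {print $(NF-1)}' example.txt)
echo "$second_last"
```

This is handy when lines have a varying number of fields and only the last ones are of interest.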
NR – number of records variable:
NR can be used to get the current line number or, in an END block, the count of lines in a file.
$ awk '{print NR}' example.txt
This will display the line numbers from 1.
1
2
3
4
5
6
To find the total number of records:
$ awk 'END {print NR}' example.txt
This will display the total number of lines in the file.
6
To display the records where the length of the first field is greater than 5:
$ awk 'length($1) > 5' example.txt
Thomas Manager Sales $5,000
Sanjay Sysadmin Technology $7,000
To count the records that contain the letter "a":
$ awk '/a/ {++cnt} END {print "Count =", cnt}' example.txt
Count = 6
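The same counting idea can be extended to one counter per value by using an Awk array indexed by a field. The sketch below, on the same example.txt (recreated here), counts the records per department; the names count and per_dept are assumptions made for the example, and the result is piped through sort because the order of a "for (d in count)" loop is not defined.

```shell
# Recreate the sample file used throughout this article.
cat > example.txt <<'EOF'
Name Designation Department Salary
Thomas Manager Sales $5,000
Jason Developer Technology $5,500
Sanjay Sysadmin Technology $7,000
Nisha Manager Marketing $9,500
Rady DBA Technology $6,000
EOF

# count[] is indexed by the department name found in the 3rd field.
per_dept=$(awk 'NR > 1 { count[$3]++ } END { for (d in count) print d, count[d] }' example.txt | sort)
echo "$per_dept"
```

For the sample file this gives Marketing 1, Sales 1 and Technology 3.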
To extract the first column from a file with the command:
$ awk '{print $1}' example.txt
Name
Thomas
Jason
Sanjay
Nisha
Rady
To rename files (append .new to every file whose name starts with "example"):
ls example* | awk '{print "mv " $1 " " $1 ".new"}' | sh
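Since the generated text is executed by sh, it can be worth looking at the commands first by leaving off the final "| sh". The sketch below does that in a scratch directory; the directory and the two file names are hypothetical and exist only for this example.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
touch "$workdir/example.txt" "$workdir/example.log"

# Without the trailing "| sh" the mv commands are only printed, not run.
preview=$(cd "$workdir" && ls example* | awk '{print "mv " $1 " " $1 ".new"}')
echo "$preview"

# Clean up the scratch directory.
rm -r "$workdir"
```

Once the printed mv commands look right, the same pipeline can be run again with "| sh" appended. Like the original one-liner, this assumes file names without spaces.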
To remove only the files (not the directories) in the current directory:
ls -l | awk '$1 !~ /^d/ {print $9}' | xargs rm
To remove only directories:
ls -l | grep '^d' | awk '{print "rm -r " $9}' | sh
OR
ls -p | grep '/$' | awk '{print "rm -r " $1}' | sh
OR
ls -l | awk '$1 ~ /^d.*x/ {print $9}' | xargs rm -r
Difference between grep, sed and awk
grep: search for specific terms in a file
# usage
$ cat file.txt
Every line containing "This"
Every line containing "This"
Every line containing "That"
Every line containing "This"
Every line containing "This"
$ grep This file.txt
Every line containing "This"
Every line containing "This"
Every line containing "This"
Every line containing "This"
Awk and sed are completely different from grep. They are text processors: besides finding what you are looking for in text, they can also remove, add and modify text.
awk is mostly used for data extraction and reporting. sed is a stream editor.
Example:
Sed
$ sed -i 's/cat/dog/' file.txt
# this will replace the first occurrence of 'cat' on each line with 'dog', editing file.txt in place (use 's/cat/dog/g' to replace every occurrence)
Awk
$ awk '{print $2}' file.txt
# this will print the second column of file.txt
Basic awk usage:
Compute sum/average/max/min/etc.
$ cat file.txt
A 10
B 20
C 60
$ awk 'BEGIN {sum=0; count=0; OFS="\t"} {sum+=$2; count++} END {print "Average:", sum/count}' file.txt
Average: 30
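The sum, minimum and maximum can be computed in the same way. The following is one possible sketch on the same three-line file.txt (recreated here); the variable names v, min, max and stats are chosen for the example, and "$2 + 0" is used to make sure the comparison is numeric.

```shell
# Recreate the small data file from the average example.
cat > file.txt <<'EOF'
A 10
B 20
C 60
EOF

stats=$(awk '
{ v = $2 + 0 }               # force a numeric value
NR == 1 { min = max = v }    # start from the first record
v < min { min = v }
v > max { max = v }
{ sum += v }
END { print "Sum:", sum, "Min:", min, "Max:", max }
' file.txt)
echo "$stats"
```

For this input the line printed is "Sum: 90 Min: 10 Max: 60".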
Grep is useful if you want to quickly search for lines that match in a file. It can also return some other simple information like matching line numbers, match count, and file name lists.
Awk is an entire programming language built around reading delimiter-separated text files, processing the records, and optionally printing out a result data set. It can do many things but it is not the easiest tool to use for simple tasks.
Sed is useful when you want to make changes to a file based on regular expressions. It allows you to easily match parts of lines, make modifications, and print out results. It is less expressive than awk, which makes it somewhat easier to use for simple tasks.
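To make the comparison concrete, here is a rough side-by-side sketch that runs all three tools on the same small file. The file pets.txt and the variable names are hypothetical and used only for this illustration.

```shell
# A tiny sample file: a word and a number on each line.
printf 'cat 1\ndog 2\ncat 3\n' > pets.txt

# grep: find lines (here, count the lines that contain "cat").
matches=$(grep -c cat pets.txt)

# sed: edit the stream (replace "cat" with "dog"; the file itself is unchanged).
replaced=$(sed 's/cat/dog/' pets.txt)

# awk: process fields (add up the 2nd field of the lines that contain "cat").
cat_total=$(awk '/cat/ {sum += $2} END {print sum}' pets.txt)

echo "$matches"
echo "$replaced"
echo "$cat_total"
```

grep reports 2 matching lines, sed prints the three lines with "cat" replaced by "dog", and awk prints 4, the sum of the numbers on the "cat" lines.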