simple-awk. A practical and simple awk guide.

I wrote a simple but practical guide to awk. There are some surprisingly powerful, mind-blowing techniques towards the end.

Here’s a snippet from the end of the guide (the beginning starts much simpler):

#!/usr/bin/awk -f

# Declare custom func outside
function throw_ring(who){
	if (who=="gollum"){
		return 0
	}
	else if (who=="frodo"){
		return 1
	}
}

# On all records matching /ring/
/ring/{
	# Find gollum or frodo
	if (match($0,"gollum")){
		was_thrown=throw_ring("gollum")
		# Use the ternary operator: if was_thrown is nonzero print "THROWN", else print "NOT THROWN"
		print "the ring throw status is", (was_thrown?"THROWN":"NOT THROWN")
	}

	if (match($0,"frodo")){
		was_thrown=throw_ring("frodo")
		print "the ring throw status is", (was_thrown?"THROWN":"NOT THROWN")
	}
}

This is a rather long script, but it is simple when you break it into pieces.

Start by declaring a custom “throw_ring” function.

On records that match /ring/, run the following operations:

Check if that record (which contains “ring”) also contains “gollum”. If yes, call throw_ring and store the result in “was_thrown”.

Use the ternary operator to print either “THROWN” or “NOT THROWN”.

Do the same for “frodo”.

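For a record that mentions both “ring” and “frodo”, the output will look like: the ring throw status is THROWN. For a record that mentions “gollum” instead, the status is NOT THROWN, because throw_ring returns 0 for him.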
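To see the steps above in action, here is a minimal sketch that runs the same logic inline against three made-up input lines. The sample sentences and the "frodo:" / "gollum:" labels are my own assumptions for illustration, not part of the guide. The parentheses around the ternary are a cautious habit, not a requirement.

```shell
# Hypothetical sample input: only the lines containing "ring" are processed.
out=$(printf '%s\n' \
  'frodo carries the ring' \
  'gollum wants the ring' \
  'sam cooks potatoes' |
awk '
# Same function as in the script above.
function throw_ring(who) {
    if (who == "gollum") {
        return 0
    } else if (who == "frodo") {
        return 1
    }
}
/ring/ {
    # The label before the colon is only there to tell the two cases apart.
    if (match($0, "gollum")) {
        print "gollum:", (throw_ring("gollum") ? "THROWN" : "NOT THROWN")
    }
    if (match($0, "frodo")) {
        print "frodo:", (throw_ring("frodo") ? "THROWN" : "NOT THROWN")
    }
}')
echo "$out"
# prints:
# frodo: THROWN
# gollum: NOT THROWN
```

The third line produces nothing because it never matches /ring/, so the action block is skipped entirely.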

This is another of my favourites: the inverse of a regex match, that is, getting the text before and after the match.

#!/usr/bin/awk -f
{
	reg="[Bb]ilbo"
	if (match($0,reg)){
		bef=substr($0,1,RSTART-1)
		aft=substr($0,RSTART+RLENGTH)
		pat=substr($0,RSTART,RLENGTH)
		print bef,"|",pat,"|",aft
	}
	else print $0
}

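We start by checking for a match. match() returns the position where the match starts, or 0 if there is no match, so the commands inside the if run only when a match exists. It also sets RSTART (start position of the match) and RLENGTH (length of the match).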
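As a quick check of what match() reports, the following sketch prints its return value together with RSTART and RLENGTH for one line that matches and one that does not. Both input lines are made up for illustration.

```shell
# Hypothetical input: the first line contains "Bilbo", the second does not.
out=$(printf '%s\n' 'old Bilbo Baggins' 'no hobbits here' |
awk '{
    pos = match($0, "[Bb]ilbo")   # position of the match, or 0 if none
    print pos, RSTART, RLENGTH
}')
echo "$out"
# prints:
# 5 5 5
# 0 0 -1
```

On a failed match RSTART is 0 and RLENGTH is -1, which is why testing the return value of match() in an if works.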

‘bef’ is the substring before the match. We cut it from the beginning of the record (string indexing starts at 1) and take RSTART-1 characters, so it ends on the character just before the match.

‘aft’ is the substring after the match. We cut it from the first character after the match up to the end of the record. Note how we use “RSTART+RLENGTH” to calculate that first character. It can look like an off-by-one error, but no extra 1 is needed: the match occupies positions RSTART through RSTART+RLENGTH-1, so RSTART+RLENGTH is already the first position after it. ‘pat’ is the matched text itself, cut from RSTART for RLENGTH characters.
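A worked example may make the index arithmetic easier to follow. This sketch uses the same three substr() calls on two made-up lines and wraps each piece in square brackets (my own choice, so leading and trailing spaces stay visible).

```shell
# Hypothetical input: a match in the middle of a line and one at the very start.
out=$(printf '%s\n' 'old Bilbo Baggins' 'Bilbo left' |
awk '{
    if (match($0, "[Bb]ilbo")) {
        bef = substr($0, 1, RSTART - 1)      # everything before the match
        pat = substr($0, RSTART, RLENGTH)    # the match itself
        aft = substr($0, RSTART + RLENGTH)   # everything after the match
        print "[" bef "][" pat "][" aft "]"
    }
}')
echo "$out"
# prints:
# [old ][Bilbo][ Baggins]
# [][Bilbo][ left]
```

In the first line RSTART is 5 and RLENGTH is 5, so ‘aft’ starts at position 10, which is the space right after “Bilbo”. In the second line RSTART is 1, so ‘bef’ has length 0 and comes out empty.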

Note that in awk everything from # to the end of the line is a comment, as in the first script.

Check out the rest of this cool awk guide here.
