9

I have a couple of commands in an awk script I'm writing:

print "Here are some players and their numbers, sorted by last name"
if(sum[x] > 500) {print x, $2}

Which outputs:

Here are some players and their numbers, sorted by last name
Lebron James 23
Kevin Durant 35
Kobe Bryant 24
Blake Griffin 32
Dikembe Mutumbo 55

How can I use the sort command in my awk script to sort the players and their numbers ONLY?

muru
Anonymous

7 Answers

12

You can add | sort -k2 to your command. This sorts alphabetically on a key that starts at the second column and runs to the end of the line.

Example:

$ echo "Lebron James 23
Kevin Durant 35
Kobe Bryant 24
Blake Griffin 32
Dikembe Mutumbo 55" | sort -k2

results in

Kobe Bryant 24
Kevin Durant 35
Blake Griffin 32
Lebron James 23
Dikembe Mutumbo 55
Wayne_Yux
9

Although I wouldn't recommend it (given the relative simplicity of piping the result through an external sort command), you can do this with recent versions of GNU awk (at least 4.0, IIRC), as described in Sorting Array Values and Indices with gawk.

Here's how you could implement it, assuming you have the data in an associative array in which the index is Firstname Lastname. First, define a custom comparison function that splits each index and compares first on Lastname, then (as a tie breaker) on Firstname, e.g.

function mycmp(ia, va, ib, vb, sa, sb) {
  if(split(toupper(ia), sa) && split(toupper(ib), sb)) {
    if(sa[2] < sb[2]) return -1;
    else if (sa[2] > sb[2]) return 1;
    else {
      # compare first names
      if(sa[1] < sb[1]) return -1;
      else if (sa[1] > sb[1]) return 1;
      else return 0;
    }
  }
  else return 0;
}

Now you can use the PROCINFO["sorted_in"] array sorting method mentioned in the comments by @zwets:

PROCINFO["sorted_in"] = "mycmp";
for(i in a) print i, a[i];

Putting it together

#!/usr/bin/gawk -f

function mycmp(ia, va, ib, vb, sa, sb) {
  if(split(toupper(ia), sa) && split(toupper(ib), sb)) {
    if(sa[2] < sb[2]) return -1;
    else if (sa[2] > sb[2]) return 1;
    else {
      # compare first names
      if(sa[1] < sb[1]) return -1;
      else if (sa[1] > sb[1]) return 1;
      else return 0;
    }
  }
  else return 0;
}

{
  a[$1" "$2] = $3;
}

END {
  PROCINFO["sorted_in"] = "mycmp";
  for(i in a) print i, a[i];
}

Testing:

$ ./namesort.awk yourfile
Kobe Bryant 24
Kevin Durant 35
Blake Griffin 32
Lebron James 23
Dikembe Mutumbo 55

In older versions of gawk that lack PROCINFO["sorted_in"], your best bet may be to store the data indexed by Lastname Firstname instead, sort the indices with asorti (itself a gawk extension), then split and swap the fields of each index as you traverse the array to print it:

awk '
  {a[$2" "$1]=$3}
  END {
    n=asorti(a,b)
    for (i=1;i<=n;i++) {split(b[i],s); print s[2], s[1], a[b[i]]}
  }' yourfile
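
For comparison, the same "last name first, first name as tie breaker" ordering can be sketched with the external sort by giving it two keys. The rows below are made-up sample data, and LC_ALL=C is an assumption added only to make the byte-wise ordering predictable:

```shell
# Made-up sample rows in "Firstname Lastname Number" form.
rows='Bob Lee 3
Kobe Bryant 24
Ann Lee 7'

# Primary key: field 2 (last name). Secondary key: field 1 (first name).
by_name=$(printf '%s\n' "$rows" | LC_ALL=C sort -k2,2 -k1,1)
printf '%s\n' "$by_name"
```

This works with any POSIX sort and needs no gawk-specific features.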
dessert
steeldriver
5

To sort only by the whitespace-separated second field, use the key -k2,2:

... | sort -k2,2

By default, sort compares keys lexicographically.

Note that if you omit the end of the key, i.e. you use just -k2, you might not get the desired result, because the key then runs from the second field to the end of the line.

See man sort for details.
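
A small sketch of the difference, using two made-up rows that share a last name (the names and numbers are purely illustrative, and LC_ALL=C is set only so the comparison is byte-wise and predictable):

```shell
# Made-up rows: two players share a last name.
rows='Zed Smith 10
Adam Smith 99'

# -k2: the key runs from field 2 to the end of the line,
# so the number after the name decides the order.
open_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -k2)

# -k2,2: the key is field 2 only; on a tie, sort falls back
# to comparing the whole line.
closed_key=$(printf '%s\n' "$rows" | LC_ALL=C sort -k2,2)

printf '%s\n' "-k2:" "$open_key" "-k2,2:" "$closed_key"
```

With -k2 the row ending in 10 comes first; with -k2,2 the tie on Smith is broken by the whole line, so Adam comes first.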

heemayl
1
print "Here are some players and their numbers, sorted by last name"
if(sum[x] > 500) {print x, $2 | "sort -k2,2"}

To write the sorted output to a file:

print "Here are some players and their numbers, sorted by last name"
if(sum[x] > 500) {print x, $2 | "sort -k2,2 > sortedFile"}
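
A self-contained sketch of this pattern, under a few assumptions: the input is already in "Firstname Lastname Number" form (the sum[x] test from the question is left out), the header text is shortened, and fflush() and close() are added so that the header reliably appears before the sorted lines even when awk's output is not a terminal:

```shell
players='Lebron James 23
Kevin Durant 35
Kobe Bryant 24'

sorted=$(printf '%s\n' "$players" | LC_ALL=C awk '
  BEGIN {
    cmd = "sort -k2,2"
    print "Players sorted by last name"
    fflush()              # push the header out before sort writes anything
  }
  { print $1, $2, $3 | cmd }   # data lines go to sort, not to stdout
  END { close(cmd) }           # sort emits its output when the pipe closes
')
printf '%s\n' "$sorted"
```

Only the lines sent through the pipe are sorted; the header bypasses sort entirely.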
1

Try

awk -f myscript.awk | sort -k2

where myscript.awk contains only awk commands.

If your actual script is a shell script, you have several options, including:

  • Pipe the output through sort: ./myscript.bash | sort -k2
  • Rewrite the code as a function inside the script and pipe the function call through sort.
    Instead of

    $ cat t1
    #!/bin/bash
    for i in 2 4 3 1 5;
    do
      echo $i
    done
    
    $ ./t1
    2
    4
    3
    1
    5
    

    Do

    $ cat t2
    #!/bin/bash
    function foo {
      for i in 2 4 3 1 5;
      do
        echo $i
      done
    }
    foo | sort
    
    $ ./t2
    1
    2
    3
    4
    5
    

Note that you can also pipe the do...done loop itself to sort instead of defining a function:

    do
       echo $i
    done | sort
1

To sort the data that awk prints:

  • To print only the second (whitespace-separated) field, sorted, use:

    awk '{print $2}' data.txt | sort
    

    e.g.:

    $ cat > data.txt
    1 Kedar 20
    2 Amit 30
    3 Rahul 21
    ^D
    
    $ awk '{print $2}' data.txt | sort
    Amit
    Kedar
    Rahul
    
  • To print the whole of data.txt, sorted on column 2:

    $ awk '{print}' data.txt | sort -k2
    2 Amit 30
    1 Kedar 20
    3 Rahul 21
    

Adapt this approach to your requirements.

See man sort for more of sort's features.

Melebius
0

What about the following?

 awk 'BEGIN{str="1\n2\n3\n4"; system("echo -e \""str"\" | sort -r")}'

It worked when I tested it.
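
A variant of the same idea, sketched with printf in place of echo -e. The assumption here is that system() runs the command through /bin/sh, which may be a shell such as dash whose echo does not accept -e:

```shell
reversed=$(awk 'BEGIN {
  str = "1\n2\n3\n4"
  # printf "%s\n" prints the string followed by a newline in any POSIX sh
  system("printf \"%s\\n\" \"" str "\" | sort -r")
}')
printf '%s\n' "$reversed"
```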