grep to return Nth and Mth lines before and after the match

Question

I know that with grep I can use the options -A and -B to pull in the lines after and before a match.

However, they pull in every line between the match and the specified distance, not just the line at that distance.

grep -r -i -B 5 -A 5 "match"

I'd like to receive only the 5th line before a match and the 5th line after it, in addition to the matched line, without the lines in between.

Is there a way to do this with grep?

glenn jackman · Answer 1 · 2018-05-10T17:00:59.283

12

If:

cat file

a
b
c
d
e
f match
g
h
i match
j
k
l
m
n
o

Then:

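awk '
    {line[NR] = $0}          # store every line under its line number
    /match/ {matched[NR]}    # referencing the element records the match
    END {
        for (nr in matched)
            for (n=nr-5; n<=nr+5; n+=5)    # offsets -5, 0, +5
                print line[n]
    }
' file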

a
f match
k
d
i match
n
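The awk approach above hard-codes the offset 5, prints an empty line when an offset falls outside the file, and relies on for (nr in matched), whose iteration order awk leaves unspecified. A minimal sketch of a variant that takes the "Nth before" and "Mth after" distances as arguments might look like this. The function name nth_context and its argument order are made up for illustration, and it still reads the whole file into memory:

```shell
# Hypothetical helper (the name nth_context is an assumption, not a standard tool).
# usage: nth_context PATTERN BEFORE AFTER FILE
nth_context() {
    awk -v pat="$1" -v before="$2" -v after="$3" '
        { line[NR] = $0 }                 # keep every line, indexed by number
        $0 ~ pat { hits[++count] = NR }   # remember match positions in file order
        END {
            for (i = 1; i <= count; i++) {
                nr = hits[i]
                # print the offset lines only if they exist in the file
                if ((nr - before) in line) print line[nr - before]
                print line[nr]
                if ((nr + after) in line) print line[nr + after]
            }
        }
    ' "$4"
}
```

With the sample file above, nth_context match 5 5 file should print the same six lines, and a match on line 1 prints only the match and the line 5 below it.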

edited May 10 '18 at 17:00

answered May 10 '18 at 16:36

wjandrea · Answer 2 · 2018-05-10T19:34:20.517

This is basically Glenn's solution, but implemented with Bash, grep, and sed.

grep -n match file |
    while IFS=: read -r nr _; do
        sed -ns "$((nr-5))p; $((nr))p; $((nr+5))p" file
    done

Note that a line number less than 1 makes sed fail with an error, while a line number greater than the number of lines in the file simply prints nothing.

This is just the bare minimum. Making it work recursively and handling the line-number edge cases above would take more work.
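As a rough sketch of what handling the "less than 1" case could look like, the loop can be wrapped in a function that leaves the "before" address out when it would fall in front of the first line. The function name context_lines and its arguments are hypothetical, and the pattern is passed to grep as-is:

```shell
# Hypothetical helper (name and argument order are assumptions).
# usage: context_lines PATTERN OFFSET FILE
context_lines() {
    local pattern="$1" offset="$2" file="$3" nr before
    grep -n -- "$pattern" "$file" | while IFS=: read -r nr _; do
        before=$((nr - offset))
        if [ "$before" -ge 1 ]; then
            # all three addresses are valid line numbers
            sed -n "${before}p;${nr}p;$((nr + offset))p" "$file"
        else
            # the "before" line does not exist, so skip that address
            sed -n "${nr}p;$((nr + offset))p" "$file"
        fi
    done
}
```

For recursive use, one option is to let grep list the matching files and call the function once per file, for example grep -rl -- match . | while IFS= read -r f; do context_lines match 5 "$f"; done. That loop assumes file names without newlines.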

JoL · Answer 3 · 2018-05-11T00:27:37.293

6

It can't be done with only grep. If ed's an option:

ed -s file << 'EOF' 
g/match/-5p\
+5p\
+5p
EOF

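The script basically says: for every line matching /match/, print the line 5 lines before it, then the line 5 lines after that one (the match itself), then the line 5 lines after that. Each p makes the printed line the current line, so every offset is relative to the line just printed.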

edited May 11 '18 at 00:27

answered May 10 '18 at 23:25

αғsнιη · Answer 4 · 2018-05-10T20:07:32.853

awk '/match/{system("sed -n \"" NR-5 "p;" NR "p;" NR+5 "p\" " FILENAME)}' infile

Here we use awk's system(command) function to call an external sed command, which prints each line that awk matched against the pattern match, along with the 5th line before and the 5th line after it.

The syntax is simple: put the external command and its switches inside double quotes, and escape anything that must be passed to the command literally; everything that belongs to awk itself (such as NR and FILENAME) stays outside the quotes. So the sed command below:

"sed -n \"" NR-5 "p;" NR "p;" NR+5 "p\" " FILENAME

translates into (with NR and FILENAME replaced by their values):

sed -n "NR-5p; NRp; NR+5p" FILENAME

NR is the number of the line that matched the pattern match, and FILENAME is the name of the file awk is currently processing.
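One way to check the quoting is to print the string instead of handing it to system(), which shows exactly what command would be run for each match. This is only a debugging sketch; the function name show_sed_commands is made up, and the parentheses around NR-5 and NR+5 are added just to make the arithmetic explicit:

```shell
# Hypothetical debugging helper: print the sed command that system() would receive.
# usage: show_sed_commands FILE
show_sed_commands() {
    awk '/match/ {
        # same concatenation as in the answer, but printed rather than executed
        print "sed -n \"" (NR-5) "p;" NR "p;" (NR+5) "p\" " FILENAME
    }' "$1"
}
```

For a match on line 6 of a file called infile, this prints sed -n "1p;6p;11p" infile. Note that FILENAME is inserted unquoted, so a file name containing spaces would need extra quoting before the string is passed to system().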

score 2 · Answer 5 · edited May 11 '18 at 13:46

2

Using @glenn's example text file, and Perl instead of awk:

$ perl -n0E 'say /(.*\n)(?=(?:.*\n){4}(.*match.*\n)(?:.*\n){4}(.*\n))/g' ex

gives the same results, but runs faster:

a
f match
k
d
i match
n

edited May 11 '18 at 13:46

Fabby

answered May 11 '18 at 10:18

Brandon Haberfeld · Accepted Answer · 2019-07-25T18:33:56.853

The tool you want is called sift. It is basically grep on steroids: a grep that searches in parallel. Sift has a large number of options for exactly this kind of task, specifically for returning a particular line relative to one or more matches, which may or may not be followed or preceded by some text.

It amazes me that sift is not more mainstream. It is written in Go and installs on Linux just fine. It searches huge quantities of text in parallel on all CPUs, where grep can take far longer to do the same.

Sift website - see examples
