1. Best solution: Python
Using bash for a task such as this is more complex than it needs to be, because bash has no built-in tools for working with dates. It can certainly be done, but it takes more effort. What we need is a set of tools that lets us parse a log file in a simpler way, and Python offers one in its datetime module.
The Python script below takes three command-line arguments: the beginning timestamp, the ending timestamp (each single- or double-quoted), and the file to read. Both timestamps must follow the 'Mon day HH:MM:SS' format, for example 'Feb 8 13:57:00'.
#!/usr/bin/env python3
import datetime as dt
import sys
def convert_to_seconds(timestring):
    # Syslog timestamps carry no year, so assume the current one
    year = str(dt.date.today().year)
    dtobj = dt.datetime.strptime(year + ' ' + timestring, '%Y %b %d %H:%M:%S')
    # timestamp() is portable, unlike strftime('%s'), which depends on the platform's C library
    return int(dtobj.timestamp())
beginning = convert_to_seconds(sys.argv[1])
ending = convert_to_seconds(sys.argv[2])
with open(sys.argv[3]) as log:
    for line in log:
        # The first three fields are the timestamp: month, day, time
        logstamp = " ".join(line.split()[0:3])
        s_logstamp = convert_to_seconds(logstamp)
        if s_logstamp < beginning:
            continue
        if s_logstamp > ending:
            # The log is chronological, so nothing later can match
            break
        print(line.strip())
        sys.stdout.flush()
Test run on /var/log/syslog:
$ ./read_log_range.py 'Feb 8 13:57:00' 'Feb 8 14:00:00' /var/log/syslog
Feb 8 13:57:59 eagle gnome-session[28631]: (nm-applet:28825): GdkPixbuf-CRITICAL **: gdk_pixbuf_composite: assertion 'dest_x >= 0 && dest_x + dest_width <= dest->width' failed
Feb 8 13:59:55 eagle org.gtk.vfs.Daemon[28480]: ** (process:2259): WARNING **: Couldn't create directory monitor on smb://x-gnome-default-workgroup/. Error: Operation not supported by backend
Feb 8 13:59:59 eagle gnome-session[28631]: (nm-applet:28825): GdkPixbuf-CRITICAL **: gdk_pixbuf_composite: assertion 'dest_x >= 0 && dest_x + dest_width <= dest->width' failed
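A note on the conversion step: the script prepends a year because syslog timestamps do not include one, and it then compares epoch seconds. The comparison itself does not strictly need integers, since datetime objects can be compared directly. Below is a minimal sketch of that idea, not part of the script above; the helper name `parse_stamp` and the year 2016 are assumptions made for illustration.

```python
import datetime as dt

STAMP_FORMAT = "%Y %b %d %H:%M:%S"

def parse_stamp(timestring, year):
    """Parse a 'Mon day HH:MM:SS' stamp, supplying the year explicitly.

    parse_stamp is a hypothetical helper, not a function from the script.
    """
    return dt.datetime.strptime(f"{year} {timestring}", STAMP_FORMAT)

year = 2016  # assumed year, chosen only for this example
beginning = parse_stamp("Feb 8 13:57:00", year)
ending = parse_stamp("Feb 8 14:00:00", year)
logstamp = parse_stamp("Feb 8 13:59:55", year)

# datetime objects support chained comparison, bounds included
in_range = beginning <= logstamp <= ending
print(in_range)  # True

# Subtracting two datetimes gives a timedelta
print((logstamp - beginning).total_seconds())  # 175.0
```

Passing the year as an argument, instead of always taking the current one, would also make it possible to read a log left over from a previous year.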
2. Bash
Of course, it is possible to do the same in bash, using the date and awk utilities to extract and convert the timestamps. Below is a bash implementation of the same Python script.
#!/usr/bin/env bash
#set -x
# Convert a date string to seconds since the epoch (relies on GNU date's -d option)
str_to_seconds(){
    date -d "$1" +%s
}
main(){
    local date1=$1
    local date2=$2
    local logfile=$3
    local s_date1 s_date2 line timestamp s_timestamp
    s_date1=$(str_to_seconds "$date1")
    s_date2=$(str_to_seconds "$date2")
    while IFS= read -r line
    do
        # The first three fields are the timestamp: month, day, time
        timestamp=$(awk '{print $1,$2,$3}' <<< "$line")
        s_timestamp=$(str_to_seconds "$timestamp")
        [ "$s_timestamp" -lt "$s_date1" ] && continue
        # The log is chronological, so nothing later can match
        [ "$s_timestamp" -gt "$s_date2" ] && break
        printf "%s\n" "$line"
    done < "$logfile"
}
main "$@"
3. Comparison of the two approaches
Naturally, the bash version takes much longer. The shell isn't made for processing large amounts of data, such as logs. For instance, on my machine with an SSD and a dual-core processor, the shell script took about 39 seconds to read a file of almost 13,000 lines:
$ time ./read_log_range.sh 'Feb 8 13:56:00' 'Feb 8 14:00:00' '/var/log/syslog' &> /dev/null
0m39.18s real 0m02.48s user 0m02.68s system
$ wc -l /var/log/syslog
12878 /var/log/syslog
Even several optimizations of the if statements didn't help. Compare that with its Python alternative:
$ time ./read_log_range.py 'Feb 8 13:56:00' 'Feb 8 14:00:00' '/var/log/syslog' &> /dev/null
0m00.60s real 0m00.53s user 0m00.07s system
$ wc -l /var/log/syslog
12878 /var/log/syslog
As you can see, Python was about 65 times faster than its bash counterpart (0.60 s versus 39.18 s of real time).
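One likely contributor to this gap, offered as an assumption rather than something measured above, is that the bash loop starts external processes (awk and date) for every line, while the Python script converts each timestamp inside a single process. The sketch below is a rough way to compare the two costs on your own machine. The names `time_per_call`, `parse_in_process`, and `spawn_date` are hypothetical, the year 2016 is arbitrary, and it only measures process startup, not the full bash loop.

```python
import datetime as dt
import subprocess
import time

def time_per_call(func, repeats):
    """Return the average wall-clock seconds taken by one call to func."""
    start = time.perf_counter()
    for _ in range(repeats):
        func()
    return (time.perf_counter() - start) / repeats

def parse_in_process():
    # The kind of conversion the Python script does, with no new process
    dt.datetime.strptime("2016 Feb 8 13:57:00", "%Y %b %d %H:%M:%S")

def spawn_date():
    # Starts one external date process, discarding its output
    subprocess.run(["date", "+%s"], stdout=subprocess.DEVNULL, check=True)

parse_cost = time_per_call(parse_in_process, 1000)
spawn_cost = time_per_call(spawn_date, 50)

print(f"strptime in-process: {parse_cost * 1e6:.1f} microseconds per call")
print(f"external date process: {spawn_cost * 1e6:.1f} microseconds per call")
```

The exact numbers will differ from machine to machine, so no expected output is given here.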