regex | List of result from grep


The following grep command gives me the number of requests from July 1st to July 31st between 8 a.m. and 4 p.m.

zgrep -E "[01-31]/Jul/2021:[08-16]" localhost_access.log* | wc -l

I don’t want to get all requests in the month, but the requests per day. I could of course enter the command 31 times, but that’s tedious. Is there a way to display the requests per day one below the other, so that I get the following as a result (ideally sorted by number), for example

543

432

321

etc.

How to do that?

You want to count lines based on a certain value in each line. That’s a good job for awk. With grep alone, you would have to process the input files once for each day. Either way, we need to fix your regex first:

zgrep -E "[01-31]/Jul/2021:[08-16]" localhost_access.log* | wc -l

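A bracket expression matches a single character, not a range of numbers. [01-31] matches one of the characters 0, 1, 2 or 3, and [08-16] contains the reversed range 8-1, which is invalid (GNU grep typically reports "Invalid range end"). What you want for the hour is (0[89]|1[0-6]); that is, 0 followed by 8 or 9, or 1 followed by a digit from 0 to 6. To keep it simple, we assume the log only contains valid dates and match the day with [0-9]{2} (two digits).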
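A quick way to check the hour pattern is to run it against a few lines with grep -E. This is only a sketch; the sample lines are made up for illustration and are not taken from a real log:

```shell
# Hypothetical sample lines; only the hours 08 through 16 should match.
sample='01/Jul/2021:07 too early
01/Jul/2021:08 match
01/Jul/2021:16 match
01/Jul/2021:17 too late'

# Count the lines whose hour falls in 08-16.
matches=$(printf '%s\n' "$sample" | grep -cE '[0-9]{2}/Jul/2021:(0[89]|1[0-6])')
echo "$matches"
```

Only the two lines with hours 08 and 16 are counted.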

Here’s a complete awk for your task:

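awk -F/ '/[0-9]{2}\/Jul\/2021:(0[89]|1[0-6])/{a[$1]++}END{for (i in a) print "day " i ": " a[i]}' localhost_access.log*

Explanation:

  • /[0-9]{2}\/Jul\/2021:(0[89]|1[0-6])/ matches date + hour for every day in July (hours 08 to 16). The slashes are escaped because / also delimits the regex, and the alternation is grouped so that 1[0-6] applies only to the hour.
  • {a[$1]++} builds an array with key=first field (the day in the example below) and a counter of occurrences.
  • END{for (i in a) print "day " i ": " a[i]} prints the array once all input files have been processed. for (i in a) visits the keys in no particular order.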

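Because we’ve set the field separator to /, $1 is everything before the first slash on the line. If your log lines have other text before the date, $1 includes that text as well, and you may need to change a[$1] to address the correct position (for two more slashes before the actual date: a[$3]). (Of course this can be solved in a more dynamic way.)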
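One possible "more dynamic" variant, as a sketch: let awk's match() locate the date and take the day from the matched position, so the field separator no longer matters. The log layout below is an assumption (a common access-log format), not the asker's actual data:

```shell
# Hypothetical sample in a common access-log layout (an assumption).
log=$(mktemp)
cat > "$log" <<'EOF'
127.0.0.1 - - [01/Jul/2021:08:15:02 +0200] "GET /a/b HTTP/1.1" 200 12
127.0.0.1 - - [01/Jul/2021:12:00:00 +0200] "GET /c HTTP/1.1" 200 12
127.0.0.1 - - [02/Jul/2021:16:59:59 +0200] "GET / HTTP/1.1" 200 12
127.0.0.1 - - [02/Jul/2021:17:00:00 +0200] "GET / HTTP/1.1" 200 12
EOF

# match() sets RSTART to the position where the date starts, so the day is
# the two characters at that position, wherever the date sits in the line.
# [0-9][0-9] is written out so it also works in awks without {2} support.
per_day=$(awk 'match($0, /[0-9][0-9]\/Jul\/2021:(0[89]|1[0-6])/) {
    a[substr($0, RSTART, 2)]++
}
END { for (i in a) print "day " i ": " a[i] }' "$log" | sort)
echo "$per_day"
rm -f "$log"
```

The trailing sort is only there to make the output order predictable, since for (i in a) does not guarantee one.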

Example:

$ cat localhost_access.log
01/Jul/2021:08 log message
01/Jul/2021:08 log message
02/Jul/2021:08 log message
02/Jul/2021:07 log message
$ awk -F/ '/[0-9]{2}\/Jul\/2021:(0[89]|1[0-6])/{a[$1]++}END{for (i in a) print "day " i ": " a[i]}' localhost_access.log*
day 01: 2
day 02: 1

If your log files are compressed, run zcat localhost_access.log* | awk ... instead of passing the files to awk directly. Remember that the regex above searches for “Jul/2021” only.
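To get the counts sorted by number, as asked in the question, one option is to print the count first and pipe the result through sort -rn. The sketch below also mixes a plain and a compressed file. It assumes a gzip whose -cdf passes uncompressed input through unchanged (GNU gzip documents this), and the file contents are made up:

```shell
# Scratch directory with one plain and one gzip-compressed sample log.
dir=$(mktemp -d)
printf '%s\n' '01/Jul/2021:08 a' '02/Jul/2021:09 b' '02/Jul/2021:07 c' \
    > "$dir/localhost_access.log"
printf '%s\n' '02/Jul/2021:10 d' '02/Jul/2021:16 e' '01/Jul/2021:17 f' \
    | gzip > "$dir/localhost_access.log.1.gz"

# gzip -cdf decompresses .gz files and copies plain files as they are.
# Printing the count first lets sort -rn order the days by request count.
sorted=$(gzip -cdf "$dir"/localhost_access.log* |
    awk -F/ '/[0-9][0-9]\/Jul\/2021:(0[89]|1[0-6])/ { a[$1]++ }
             END { for (i in a) print a[i], "day " i }' |
    sort -rn)
echo "$sorted"
rm -rf "$dir"
```

Here day 02 has three requests between 08 and 16 and day 01 has one, so day 02 is listed first.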