Posts Tagged ‘awk’

Interview questions … Get IP address from Apache Logs

Monday, August 15th, 2011

TL;DR: Some handy Apache log processing one-liners for pulling IP addresses out of the access and error logs.

Log processing questions sometimes come up during an interview, so I am going to concentrate on a couple that pull the IP addresses out of the Apache log files. If you do get log processing questions, I sincerely hope you get to work at a command line, because answering off the top of your head can be difficult unless you're good at visualizing commands.

Q: How would you get the IP address from the access logs?
A: This one is fairly straightforward. I would run:
cat access.log | awk '{print $1}' | sort | uniq

This will output each client address once (uniq only collapses adjacent duplicate lines, which is why the sort comes first; if you leave them off you will see the same IP many times):

127.0.0.1
127.0.0.2
127.0.0.3

The breakdown is as follows:

    1. read through the contents of the file (cat access.log)
    2. pipe the output to awk to print the first field ($1)
    3. sort the addresses so that duplicates end up next to each other (sort)
    4. pipe the output to uniq to show each address only once (uniq)
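
A common follow-up is to ask which addresses hit the server most often. The sketch below is one way to do that, assuming the client address is the first field as in the one-liner above. The name count_ips is a hypothetical helper, not part of the original answer.

```shell
# count_ips: print "<hits> <ip>" for every client, busiest first.
# Reads log lines from the named files, or from stdin when none are given.
count_ips() {
    awk '{ hits[$1]++ } END { for (ip in hits) print hits[ip], ip }' "$@" |
        sort -rn
}

# Usage (assumes access.log is in the current directory):
#   count_ips access.log | head -n 10
```

Counting inside awk avoids the sort-before-uniq step, because the array already groups identical addresses no matter where they appear in the file.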

Another question might be to parse the IP addresses out of error.log. This is a bit more difficult, but it is still fairly straightforward using readily available system tools.

Q: Can you please extract the IP address from the error.log?
A: Here is my solution (Thanks Malcolm!)
cat error.log | awk '{ if($7 == "[client") {print $8} }' | sed -e 's/]$//g' | uniq

This one is a bit more complex, so I will walk through each step of it:

1. First part: cat error.log - read through the log file.
 
[Sun Aug 14 13:27:14 2011] [error] [client 96.126.120.254] Invalid method in request \x80e\x01\x03\x01
[Sun Aug 14 13:27:14 2011] [error] [client 96.126.120.254] Invalid method in request \x80e\x01\x03\x01
[Sun Aug 14 13:27:14 2011] [error] [client 96.126.120.254] Invalid method in request \x80e\x01\x03\x01
[Sun Aug 14 13:27:14 2011] [error] [client 96.126.120.254] Invalid method in request \x80e\x01\x03\x01
[Mon Aug 15 15:16:21 2011] [error] [client 194.72.238.62] Invalid method in request \x16\x03\x01
[Mon Aug 15 15:50:27 2011] [notice] caught SIGWINCH, shutting down gracefully
[Mon Aug 15 15:50:37 2011] [notice] mod_python: Creating 8 session mutexes based on 75 max processes and 0 max threads.
[Mon Aug 15 15:50:37 2011] [notice] mod_python: using mutex_directory /tmp 
PHP Warning:  Module 'gd' already loaded in Unknown on line 0
PHP Warning:  Module 'mysql' already loaded in Unknown on line 0
PHP Warning:  Module 'mysqli' already loaded in Unknown on line 0
[Mon Aug 15 15:50:38 2011] [warn] mod_wsgi: Compiled for Python/2.5.1.
[Mon Aug 15 15:50:38 2011] [warn] mod_wsgi: Runtime using Python/2.5.2.
 
2. Second part: cat error.log | awk '{ if ($7 == "[client") {print $8} }' - if the 7th field is "[client" (this seems to be pretty standard, though YMMV), print the 8th field, which should be the IP address. Notice the trailing "]" character.
 
96.126.120.254]
96.126.120.254]
96.126.120.254]
96.126.120.254]
194.72.238.62]
 
3. Third part: cat error.log | awk '{ if ($7 == "[client") {print $8} }' | sed -e 's/]$//g' - use sed to remove the trailing "]" character.
 
96.126.120.254
96.126.120.254
96.126.120.254
96.126.120.254
194.72.238.62
 
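4. Fourth part: cat error.log | awk '{ if ($7 == "[client") {print $8} }' | sed -e 's/]$//g' | uniq - output each IP once instead of every match. Note that uniq only collapses adjacent duplicates; if the same IP can show up in non-adjacent lines, use sort -u in place of uniq.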
 
96.126.120.254
194.72.238.62
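
The four steps above can also be folded into a single awk program. This is a minimal sketch, not the original solution: it searches every field for "[client" instead of assuming it is the 7th (an assumption on my part that the position can vary between setups), and error_log_ips is a hypothetical helper name.

```shell
# error_log_ips: print each client IP found in an Apache error log once.
# Reads the named files, or stdin when none are given.
error_log_ips() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "[client") {
                ip = $(i + 1)
                sub(/\]$/, "", ip)   # drop the trailing "]"
                print ip
                break
            }
        }
    }' "$@" | sort -u                # sort -u also removes non-adjacent repeats
}

# Usage (assumes error.log is in the current directory):
#   error_log_ips error.log
```

Because of the sort, the addresses come out in sorted order rather than in the order they first appear in the log.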

Addendum:
If you like, you can add another awk on the end (or pipe to any other command you feel like). For instance, the following passes each address to the host command:

[Edit: for host you might want to specify the -W <time> flag, just in case; without it some lookups could keep trying for a very long time]
cat error.log | awk '{ if ($7 == "[client") {print $8} }' | sed -e 's/]$//g' | uniq | awk '{ print | "host -W 3 " $1 }'
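
As an alternative to the trailing awk, xargs can run a command once per address. This is a sketch under the assumption that the input is one address per line; lookup_each is a hypothetical helper name, and it falls back to host -W 3 only when no command is given.

```shell
# lookup_each: run a command once for every line read from stdin.
# With no arguments it runs "host -W 3 <line>".
lookup_each() {
    if [ "$#" -eq 0 ]; then
        set -- host -W 3
    fi
    xargs -n 1 "$@"
}

# Usage, reusing the pipeline from the post (needs working DNS):
#   cat error.log | awk '{ if ($7 == "[client") {print $8} }' | sed -e 's/]$//g' | uniq | lookup_each
```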

I hope this helps, share and enjoy! Thanks for reading!

-Scott.

Disclaimer: I make no claims to the viability of the code/script/commands and make no guarantees that it will work on your system, use at your own risk.

Normalize MAC address for DHCP reservations

Sunday, August 7th, 2011

Part of what I am doing at my current job is helping one of the Unix admins with DNS and DHCP. To set up the DHCP reservations we need the MAC address in a certain format, which the people making the requests never seem to get consistently right. So I wrote a small shell script (it's still rough) that normalizes the MAC address to what we need (colon separated, alpha characters lower case).

[Edit: the below only works on FreeBSD and Linux; for Solaris swap the awk '{print tolower($0)}' for tr '[:upper:]' '[:lower:]', or you could just use tr on all of them.]

#!/usr/bin/env sh

mac="$1"

# check to see if input is empty
if [ ! -n "$mac" ]
then
    # report the problem on stderr and exit with a non-zero status
    echo "Please enter a MAC address." >&2
    exit 1
else
    # echo the input, strip out dot(.), strip out colon(:), strip out dash(-)
    # add colon(:) every two chars, remove last colon(:)
    # awk to lowercase characters [EDIT: updated the sed for . seps and escape]
    echo "$mac" | sed -e 's/\.//g' -e 's/\://g' -e 's/\-//g' -e 's/../&:/g' -e 's/:$//g' \
        | awk '{print tolower($0)}'
fi

Some of the standard formats we see are:
1A:BC:35:57:33:08
1A-BC-35-57:33:08
1ABC35573308
1abc35573308
1a.bc.35.57.33.08
1A.BC.35.57.33.08

The script will take all of these and make them look like: 1a:bc:35:57:33:08.

Here is the breakdown for those that are curious:
1. if [ ! -n "$mac" ] checks to make sure first input variable is not empty.
2. echo $mac outputs the first argument (does not do any input validation)
3. The sed is in five parts:

    [Edited to escape the sequences, just in case they should have special meaning]
        a. 's/\.//g' strips out the period should it be in that format
        b. 's/\://g' strips out the colons should they be there (to avoid extra colons)
        c. 's/\-//g' strips out the dashes should they be in that format
        d. 's/../&:/g' every two characters append a colon
        e. 's/:$//g' remove the last trailing colon
    

4. awk '{print tolower($0)}' makes all of the upper case alpha chars lower case to match our format needs.
4a. tr '[:upper:]' '[:lower:]' for Solaris (yes, I know tr '[A-Z]' '[a-z]' will work as well, but this is easier to read IMHO)

This will also work on a file full of MAC addresses (one per line); just throw it in a for loop (this assumes you named the script macnorm.sh):

for i in $(cat filename); do ./macnorm.sh "$i"; done
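
Since sed already works line by line, the loop is not strictly needed. A minimal sketch of the same normalization applied to a whole stream follows; normalize_mac_stream is a hypothetical name, and the single bracket expression is my shorthand for the three separate strip steps in the script.

```shell
# normalize_mac_stream: normalize every MAC address (one per line) read
# from the named files, or from stdin when none are given.
normalize_mac_stream() {
    sed -e 's/[-.:]//g' -e 's/../&:/g' -e 's/:$//' "$@" |
        tr '[:upper:]' '[:lower:]'
}

# Usage (filename is the same placeholder as in the loop above):
#   normalize_mac_stream filename
```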

Again, it is important to realize that there is no checking of the input, so if you ran it over /etc/passwd it would happily add a colon (:) every two characters. Hope this helps.
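
If you do want some checking, one possible approach is to strip the separators first and then insist on exactly 12 hex digits. This is only a sketch; normalize_mac is a hypothetical function name and the error message wording is my own.

```shell
# normalize_mac: print the MAC address given as $1 in lower case,
# colon separated form, or fail if it is not 12 hex digits.
normalize_mac() {
    # strip ".", ":" and "-" separators, then lowercase
    hex=$(printf '%s\n' "$1" | tr -d '.:-' | tr '[:upper:]' '[:lower:]')

    # reject anything containing a non-hex character
    case "$hex" in
        *[!0-9a-f]*) echo "invalid MAC address: $1" >&2; return 1 ;;
    esac

    # reject anything that is not exactly 12 digits long
    if [ "${#hex}" -ne 12 ]; then
        echo "invalid MAC address: $1" >&2
        return 1
    fi

    # add a colon every two characters, then drop the trailing one
    printf '%s\n' "$hex" | sed -e 's/../&:/g' -e 's/:$//'
}

# Usage:
#   normalize_mac 1A-BC-35-57:33:08
```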

-Scott.

Translate lower to upper (or vice versa) on Linux command line

Friday, September 17th, 2010

An acquaintance of mine asked a really good interview question, so I thought I would share. For the Linux wonks: describe three ways to translate from lower case to upper case on the command line (or vice versa). I have to admit I only came up with one, tr. I missed sed (which should have been obvious), and last but not least there is one I would never have thought of, dd. *I later thought of another way using awk, so I'll add that as well. Below are the examples:

I created a file called test with the following contents: test test test (to go from upper to lower with tr, flip the arguments).

Tr:
tr: cat test | tr '[:lower:]' '[:upper:]'
Output: cat test | tr '[:lower:]' '[:upper:]'
TEST TEST TEST
 
tr: cat test | tr "a-z" "A-Z"
Output: cat test | tr "a-z" "A-Z"
TEST TEST TEST
 
Sed:
*With sed, replace \U with \L if you want lower case (\U and \L are GNU sed extensions; there are other ways as well):
 
sed: cat test | sed 's/\(.*\)/\U\1/'
Output: cat test | sed 's/\(.*\)/\U\1/'
TEST TEST TEST
 
dd: (bonus points if you know where dd got its name**)
 
dd if=test conv=ucase (use conv=lcase if you want lower case)
Output: dd if=test conv=ucase
TEST TEST TEST
0+1 records in
0+1 records out
15 bytes (15 B) copied, 0.000155 s, 96.8 kB/s
 
Awk:
cat test | awk '{print toupper($0)}' (use tolower($0) if you want to lcase)
Output: cat test | awk '{print toupper($0)}'
TEST TEST TEST
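
To compare the approaches side by side, each can be wrapped in a small filter that reads stdin. This is a sketch with hypothetical function names; the sed variant is left out because \U is not available in every sed, and dd's transfer statistics are sent to /dev/null so only the converted text is printed.

```shell
# Three interchangeable lower-to-upper filters; each reads stdin.
upper_tr()  { tr '[:lower:]' '[:upper:]'; }
upper_awk() { awk '{ print toupper($0) }'; }
upper_dd()  { dd conv=ucase 2>/dev/null; }

# Usage (test is the file created above):
#   upper_tr < test
#   upper_awk < test
#   upper_dd < test
```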

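** From the history files: one popular story is that the name comes from “convert and copy”, but some wily compiler designers had already taken the “cc” command. The name is also commonly traced to the DD (Data Definition) statement in IBM's JCL.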

Getting IP address from shell

Thursday, September 9th, 2010

Someone asked earlier on one of the social networking sites I follow how to get the IP address from the shell. Here are two quick and dirty ways, using just command line utilities that could be called from a shell script. Use at your own risk :).

Linux: You can specify a device; if you do not, it returns them all.
ifconfig | grep inet | awk -F: '{print $2}' | grep -i bcast | sed -e 's/Bcast//g'
 
Output: 
 
$> ifconfig | grep inet | awk -F: '{print $2}' | grep -i bcast | sed -e 's/Bcast//g'
192.168.1.2
192.168.1.3
 
or
 
OSX / BSD: If you leave off the final grep it will return them all (IPv6 as well)
 
ifconfig | grep inet | awk '{print $2}' | grep -v ':'
(drops every address that contains a colon, which leaves only the IPv4 addresses)
 
Output: 
 
$> ifconfig | grep inet | awk '{print $2}' | grep -v ':'
127.0.0.1
192.168.1.3
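
On Linux systems that ship the ip command instead of ifconfig, a similar result can be had from its one-line output mode. This is a sketch, not part of the original post: it assumes the address with its prefix length is the fourth field of `ip -o -4 addr show`, and ipv4_from_ip_o is a hypothetical helper name.

```shell
# ipv4_from_ip_o: read "ip -o -4 addr show" output on stdin and print
# just the addresses, without the /prefix length.
ipv4_from_ip_o() {
    awk '$3 == "inet" { split($4, parts, "/"); print parts[1] }'
}

# Usage on a Linux box with iproute2 installed:
#   ip -o -4 addr show | ipv4_from_ip_o
```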