Posts Tagged ‘awk’

Printing column headers from commands

Thursday, March 22nd, 2012

As with most things in the *nix world, there is more than one way of doing things. Here is a nifty trick I picked up from a former coworker. When I needed to do disk reporting, it was handy to keep the column header (remember, this is my use case) for those who don’t necessarily know what the columns mean (read: upper management) when running df or similar and grepping for a particular volume or disk. I am fairly certain that there are even more ways of accomplishing this, but here are two that I’ve used. Also be aware that you need to know what you’re looking for, that is, the pattern you need to match.

Note: this was done from my MBP, so in Linux ymmv*.

sed:

Before:
df -h | grep disk
 
/dev/disk0s2    465Gi  123Gi  342Gi    27%    /
/dev/disk1s2    110Mi   94Mi   16Mi    86%    /Volumes/VirtualBox
/dev/disk2s0s2   25Mi   22Mi  2.7Mi    90%    /Volumes/VZAccess Manager
 
After:
df -h | sed -n '/^Filesystem/p;/disk/p'
 
Filesystem       Size   Used  Avail Capacity  Mounted on
/dev/disk0s2    465Gi  123Gi  342Gi    27%    /
/dev/disk1s2    110Mi   94Mi   16Mi    86%    /Volumes/VirtualBox
/dev/disk2s0s2   25Mi   22Mi  2.7Mi    90%    /Volumes/VZAccess Manager

awk:

Before:
df -h | grep disk
 
/dev/disk0s2    465Gi  123Gi  342Gi    27%    /
/dev/disk1s2    110Mi   94Mi   16Mi    86%    /Volumes/VirtualBox
/dev/disk2s0s2   25Mi   22Mi  2.7Mi    90%    /Volumes/VZAccess Manager
 
After:
df -h | awk '/^Filesystem/ {print;}; /disk/ {print;}; {next;};'
 
Filesystem       Size   Used  Avail Capacity  Mounted on
/dev/disk0s2    465Gi  123Gi  342Gi    27%    /
/dev/disk1s2    110Mi   94Mi   16Mi    86%    /Volumes/VirtualBox
/dev/disk2s0s2   25Mi   22Mi  2.7Mi    90%    /Volumes/VZAccess Manager

sed breakdown:
With the sed line we are matching any line that begins with Filesystem (“/^Filesystem/”) and printing it with the “p” command, then matching any line that contains disk (“/disk/”) and printing it the same way. The “-n” option tells sed to print only what we explicitly ask for. From the sed manpage on OSX: “-n By default, each line of input is echoed to the standard output after all of the commands have been applied to it. The -n option suppresses this behavior.” If you were to remove the “-n” switch it would look something like this:

 
df -h | sed '/^Filesystem/p;/disk/p'
 
Filesystem       Size   Used  Avail Capacity  Mounted on
Filesystem       Size   Used  Avail Capacity  Mounted on
/dev/disk0s2    465Gi  123Gi  342Gi    27%    /
/dev/disk0s2    465Gi  123Gi  342Gi    27%    /
devfs           199Ki  199Ki    0Bi   100%    /dev
map -hosts        0Bi    0Bi    0Bi   100%    /net
map auto_home     0Bi    0Bi    0Bi   100%    /home
/dev/disk1s2    110Mi   94Mi   16Mi    86%    /Volumes/VirtualBox
/dev/disk1s2    110Mi   94Mi   16Mi    86%    /Volumes/VirtualBox
/dev/disk2s0s2   25Mi   22Mi  2.7Mi    90%    /Volumes/VZAccess Manager
/dev/disk2s0s2   25Mi   22Mi  2.7Mi    90%    /Volumes/VZAccess Manager

Every line is echoed once by default, and the lines matching “/^Filesystem/” or “/disk/” are then printed a second time by the “p” command, which is why the matches show up twice.

awk breakdown:
With the awk one, we are matching the line that begins with Filesystem (“/^Filesystem/”) and printing it, then matching any line that contains disk (“/disk/”) and printing it. The final “{next;}” statement moves on to the next line of input; since awk only prints when a rule tells it to, lines that match neither pattern are skipped either way.
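
If you find yourself doing this a lot, the same idea can be wrapped in a small shell function. This is only a sketch: the name with_header is made up for this example, and it assumes the column header is always the first line of output (true for df, not true for netstat, which prints a title line first).

```shell
# with_header PATTERN
# Reads command output on stdin and prints the first line (assumed to be the
# column header) plus every later line that matches PATTERN.
# Hypothetical helper, not part of any standard tool.
with_header() {
    awk -v pat="$1" 'NR == 1 || $0 ~ pat'
}

# Example usage:
#   df -h | with_header disk
```

Because the two conditions are joined with || in a single rule, a header line that also happens to match the pattern is still printed only once.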

While writing the post I thought of another case where this might be useful, though it really depends on what you’re running on your system. Here is a little netstat example, with the caveat that netstat’s own options can easily make this a moot point, so take it with a grain of salt.

Before:
 
netstat -anl | grep -i listen
 
tcp46      0      0  *.9292                 *.*                    LISTEN     
tcp46      0      0  *.9302                 *.*                    LISTEN     
tcp46      0      0  *.9301                 *.*                    LISTEN     
tcp46      0      0  *.9200                 *.*                    LISTEN     
tcp46      0      0  *.9300                 *.*                    LISTEN     
tcp46      0      0  *.62056                *.*                    LISTEN     
tcp4       0      0  *.62056                *.*                    LISTEN     
tcp4       0      0  127.0.0.1.51093        *.*                    LISTEN     
tcp4       0      0  127.0.0.1.26164        *.*                    LISTEN     
tcp4       0      0  *.17500                *.*                    LISTEN     
tcp4       0      0  127.0.0.1.631          *.*                    LISTEN     
tcp6       0      0  ::1.631                                       *.*                                           LISTEN     
ffffff80147a27d0 stream      0      0 ffffff8017e13f00                0                0                0 /tmp/launch-H5oHbg/Listeners
 
After:
 
netstat -anl | sed -n '/^Proto/p;/LISTEN/p'
 
Proto Recv-Q Send-Q  Local Address          Foreign Address        (state)    
tcp46      0      0  *.9292                 *.*                    LISTEN     
tcp46      0      0  *.9302                 *.*                    LISTEN     
tcp46      0      0  *.9301                 *.*                    LISTEN     
tcp46      0      0  *.9200                 *.*                    LISTEN     
tcp46      0      0  *.9300                 *.*                    LISTEN     
tcp46      0      0  *.62056                *.*                    LISTEN     
tcp4       0      0  *.62056                *.*                    LISTEN     
tcp4       0      0  127.0.0.1.51093        *.*                    LISTEN     
tcp4       0      0  127.0.0.1.26164        *.*                    LISTEN     
tcp4       0      0  *.17500                *.*                    LISTEN     
tcp4       0      0  127.0.0.1.631          *.*                    LISTEN     
tcp6       0      0  ::1.631                                       *.*                                           LISTEN

Again, this presumes that you know what you’re looking for, and if you tried this with ESTABLISHED in netstat the header may be moot unless you narrowed the output down to less than a screen. I’ll leave the awk version and looking for specific ports / port ranges as an exercise for the reader. Astute readers will also notice that we lost the last line of output from the first netstat command: grep -i matched “Listeners” case-insensitively, while the sed pattern only matches the upper case LISTEN.
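
For anyone who wants something to compare their answer to the exercise against, here is one possible awk sketch. The function name listen_ports and its two arguments are made up for this example, and it assumes the OSX / BSD netstat layout shown above, where the local address is the fourth column and the port follows the last dot. On Linux the port follows a colon instead, so the split would need adjusting.

```shell
# listen_ports LOW HIGH
# Reads "netstat -an" output on stdin, keeps the Proto header and prints the
# LISTEN lines whose local port falls between LOW and HIGH (inclusive).
# Hypothetical helper; assumes BSD/OSX style "address.port" in column 4.
listen_ports() {
    awk -v lo="$1" -v hi="$2" '
        /^Proto/ { print; next }              # keep the column header
        $NF == "LISTEN" {
            n = split($4, parts, "[.]")       # port is the piece after the last dot
            port = parts[n] + 0
            if (port >= lo && port <= hi) print
        }'
}

# Example usage:
#   netstat -anl | listen_ports 9200 9302
```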

I hope this information comes in handy, share and enjoy!

*For those that may not know, ymmv = Your Mileage May Vary.

Interview questions … Get IP address from Apache Logs

Monday, August 15th, 2011

TL;DR Some Apache log processing one liners to get IP addresses from the access and error logs that I have found handy.

There may be some log processing questions asked during the course of an interview, so I am going to concentrate on a couple that will get the IP addresses from the Apache log files. If you do have log processing questions I sincerely hope you get to play at a command line, as doing it off the top of your head can be difficult unless you’re good at visualizing commands.

Q: How would you get the IP address from the access logs?
A: This one is fairly straightforward, I would:
cat access.log | awk '{print $1}' | sort | uniq

This will output (uniq only collapses adjacent duplicate lines, hence the sort before it; if you choose not to use sort and uniq you will see multiples of the same ip, I’ll leave that to the discretion of the reader):

127.0.0.1
127.0.0.2
127.0.0.3

The breakdown is as follows:

    1. read through the contents of the file (cat access.log)
    2. pipe output to awk to print the first field ($1)
    3. pipe output to sort so that identical addresses end up next to each other (sort)
    4. pipe output to only show unique data (uniq)
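
A common follow-up question is how many requests each address made. A minimal sketch, assuming the usual access log layout with the client address in the first field (the function name count_ips is made up for this example):

```shell
# count_ips FILE...
# Prints "count address" pairs, busiest client first.
# Hypothetical helper; assumes the client IP is the first field of each line.
count_ips() {
    awk '{ print $1 }' "$@" | sort | uniq -c | sort -rn
}

# Example usage:
#   count_ips access.log | head
```

The first sort groups identical addresses so uniq -c can count them, and the second sort orders the result by that count.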

Another question might be to parse out the IP address from the error.log. While this is a bit more difficult, it is still fairly straightforward using readily available system tools.

Q: Can you please extract the IP address from the error.log?
A: Here is my solution (Thanks Malcolm!)
cat error.log | awk '{ if($7 == "[client") {print $8} }' | sed -e 's/]$//g' | uniq

The breakdown is as follows; this one is a bit more complex so I will walk through each step of it:

1. First part: cat error.log - read through the log file.
 
[Sun Aug 14 13:27:14 2011] [error] [client 96.126.120.254] Invalid method in request \x80e\x01\x03\x01
[Sun Aug 14 13:27:14 2011] [error] [client 96.126.120.254] Invalid method in request \x80e\x01\x03\x01
[Sun Aug 14 13:27:14 2011] [error] [client 96.126.120.254] Invalid method in request \x80e\x01\x03\x01
[Sun Aug 14 13:27:14 2011] [error] [client 96.126.120.254] Invalid method in request \x80e\x01\x03\x01
[Mon Aug 15 15:16:21 2011] [error] [client 194.72.238.62] Invalid method in request \x16\x03\x01
[Mon Aug 15 15:50:27 2011] [notice] caught SIGWINCH, shutting down gracefully
[Mon Aug 15 15:50:37 2011] [notice] mod_python: Creating 8 session mutexes based on 75 max processes and 0 max threads.
[Mon Aug 15 15:50:37 2011] [notice] mod_python: using mutex_directory /tmp 
PHP Warning:  Module 'gd' already loaded in Unknown on line 0
PHP Warning:  Module 'mysql' already loaded in Unknown on line 0
PHP Warning:  Module 'mysqli' already loaded in Unknown on line 0
[Mon Aug 15 15:50:38 2011] [warn] mod_wsgi: Compiled for Python/2.5.1.
[Mon Aug 15 15:50:38 2011] [warn] mod_wsgi: Runtime using Python/2.5.2.
 
2. Second part: cat error.log | awk '{ if ($7 == "[client") {print $8} }' - if the 7th field is "[client" (this seems to be pretty standard though ymmv), print out the eighth field (which should be the IP address; also notice the trailing "]" character).
 
96.126.120.254]
96.126.120.254]
96.126.120.254]
96.126.120.254]
194.72.238.62]
 
3. Third Part: cat error.log | awk '{ if ($7 == "[client") {print $8} }' | sed -e 's/]$//g' - use sed to remove the trailing "]" character.
 
96.126.120.254
96.126.120.254
96.126.120.254
96.126.120.254
194.72.238.62
 
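4. Fourth Part: cat error.log | awk '{ if ($7 == "[client") {print $8} }' | sed -e 's/]$//g' | uniq - let’s output unique ips and not all of them (multiple matches). Note that uniq only collapses adjacent duplicates, which is enough for the sample above; put a sort in front of it if the same ip can show up in different parts of the log.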
 
96.126.120.254
194.72.238.62
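
The four steps above can also be folded into a single awk call that does not care which field the address lands in. This is a sketch rather than a drop-in replacement: the function name error_log_ips is made up for this example, and it assumes the "[client a.b.c.d]" form shown in the sample log (newer Apache versions may append a port inside the brackets, which this would print along with the address).

```shell
# error_log_ips [FILE...]
# Prints each client address from an Apache error log once, in order of
# first appearance. Reads stdin when no file is given.
# Hypothetical helper; assumes entries of the form "[client 1.2.3.4]".
error_log_ips() {
    awk '
        match($0, /\[client [^]]*\]/) {
            # skip the 8 characters of "[client " and drop the closing "]"
            ip = substr($0, RSTART + 8, RLENGTH - 9)
            if (!seen[ip]++) print ip
        }' "$@"
}

# Example usage:
#   error_log_ips error.log
```

Tracking addresses in the seen array removes duplicates even when they are not adjacent, so no sort or uniq is needed.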

Addendum:
If you would like, you can then add another awk on the end (or pipe to any other command you feel like). For instance, the following pipes the output to the host command:

[Edit: for host you might want to specify the -W <time> flag just in case; it could try forever on some addresses unless specified]
cat error.log | awk '{ if ($7 == "[client") {print $8} }' | sed -e 's/]$//g' | uniq | awk '{ print | "host -W 3 " $1 }'

I hope this helps, share and enjoy! Thanks for reading!

-Scott.

Disclaimer: I make no claims to the viability of the code/script/commands and make no guarantees that it will work on your system, use at your own risk.

Normalize MAC address for DHCP reservations

Sunday, August 7th, 2011

So part of what I am doing at my current job is helping one of the Unix admins with DNS and DHCP. To set up the DHCP reservations we need the MAC address in a certain format, which the people requesting never seem to get consistently right, so I wrote a small shell script (it’s still rough) that will normalize the MAC address to what we need (colon separated, alpha characters lower case).

[Edit: the below only works on FreeBSD and Linux; for Solaris swap the awk '{print tolower($0)}' with tr '[:upper:]' '[:lower:]', or you could just do that for all of them as well.]

Original:

#!/usr/bin/env sh
 
mac="$1"
 
#check to see if input is empty
if [ ! -n "$mac" ]
    then
        echo "Please enter a MAC address."
        exit
else 
    # echo the input, strip out dot(.), strip out colon(:), strip out dash(-)
    # add colon(:) every two chars, remove last colon(:)
    # awk to lowercase characters [EDIT: updated the sed for . seps and escape]
    echo $mac | sed -e 's/\.//g' -e 's/\://g' -e 's/\-//g' -e 's/../&:/g' -e 's/:$//g' \
| awk '{print tolower($0)}'
fi

I have updated the code to check for length. A MAC address in its longest common form (twelve hex digits plus five separators) is 17 characters, so anything longer than that is rejected.

#!/usr/bin/env sh
 
mac="$1"
len="${#mac}"
 
# check to see if input is empty or too long to be a valid mac; less than is ok,
# just make sure it's not greater than 17 (12 hex digits + 5 separators)
 
# plain [ ] joined with || is used here because [[ ]] is a bash/ksh extension
# and is not guaranteed to exist in sh
if [ -z "$mac" ] || [ "$len" -gt 17 ]
    then
        echo "Please enter a MAC address."
        exit 1
else
    # echo the input, strip out dot(.), strip out colon(:), strip out dash(-)
    # add colon(:) every two chars, remove last colon(:)
    # awk to lowercase characters [EDIT: updated the sed for . seps and escape]
    echo "$mac" | sed -e 's/\.//g' -e 's/\://g' -e 's/\-//g' -e 's/../&:/g' -e 's/:$//g' \
| awk '{print tolower($0)}'
fi

Some of the standard formats we see are:
1A:BC:35:57:33:08
1A-BC-35-57:33:08
1abc35573308
1a.bc.35.57.33.08
1A.BC.35.57.33.08

The script will take all of these and make them look like: 1a:bc:35:57:33:08.

Here is the breakdown for those that are curious:
1. if [ ! -n "$mac" ] checks to make sure first input variable is not empty.
2. echo $mac outputs the first argument (does not do any input validation)
3. The sed is in five parts:

    [Edited to escape the sequences, just in case they should have special meaning]
        a. 's/\.//g' strips out the period should it be in that format
        b. 's/\://g' strips out the colons should they be there (to avoid extra colons)
        c. 's/\-//g' strips out the dashes should they be in that format
        d. 's/../&:/g' every two characters append a colon
        e. 's/:$//g' remove the last trailing colon
    

4. awk '{print tolower($0)}' makes all of the upper case alpha chars lower case to match our format needs.
4a. tr '[:upper:]' '[:lower:]' for Solaris (yes I know tr '[A-Z]' '[a-z]' will work as well, but this is easier to read IMHO)

This will work on a file full of MAC addresses, just throw it in a for loop (this is if you name the file macnorm.sh):

for i in `cat filename`; do ./macnorm.sh $i; done

Again it is important to realize that there is no checking of the input, so if you did it over /etc/passwd it would add a colon (:) every two characters. Hope this helps.
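
If you do want some checking of the input, one way to go about it is to strip the separators first and then insist on exactly twelve hex digits. The following is a rough sketch of that idea and not the script we actually use; the function name normalize_mac is made up for this example.

```shell
# normalize_mac ADDRESS
# Prints ADDRESS as lower case, colon separated hex (1a:bc:35:57:33:08) and
# returns 1 without printing anything if it is not a plausible MAC address.
# Hypothetical helper; plain sh has no local variables, so "hex" is global.
normalize_mac() {
    # strip dot, colon and dash separators, then lower case what is left
    hex=$(printf '%s' "$1" | tr -d '.:-' | tr '[:upper:]' '[:lower:]')

    # reject anything containing a character that is not a hex digit
    case "$hex" in
        *[!0-9a-f]*) return 1 ;;
    esac

    # a MAC address is exactly 12 hex digits once the separators are gone
    [ "${#hex}" -eq 12 ] || return 1

    # add a colon after every two characters, then drop the trailing one
    printf '%s\n' "$hex" | sed -e 's/../&:/g' -e 's/:$//'
}

# Example usage:
#   normalize_mac 1A-BC-35-57:33:08
```

With the check in place, a line from /etc/passwd is rejected instead of being chopped into colon separated pairs.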

-Scott.

Translate lower to upper (or vice versa) on Linux command line

Friday, September 17th, 2010

An acquaintance of mine asked a really good interview question, so I thought I would share. For the Linux wonks: describe three ways to translate from lower case to upper case on the command line (or vice versa). I have to admit I only came up with one, tr. I missed sed (which should have been obvious) and, last but not least, one I would never have thought of, dd. *I actually thought of another way using awk, so I’ll add that as well. Below are the examples:

Created a file called test with the following contents: test test test (to go from upper to lower on the tr flip the options).

Tr:
tr: cat test | tr '[:lower:]' '[:upper:]'
Output: cat test | tr '[:lower:]' '[:upper:]'
TEST TEST TEST
 
tr: cat test | tr "a-z" "A-Z"
Output: cat test | tr "a-z" "A-Z"
TEST TEST TEST
 
Sed:
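*With sed replace U with L if you want to lower (there are other ways as well). Note that \U and \L are GNU sed extensions, so this one will not work with the stock BSD / OSX sed: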
 
sed: cat test | sed 's/\(.*\)/\U\1/'
Output: cat test | sed 's/\(.*\)/\U\1/'
TEST TEST TEST
 
dd: (bonus points if you know where dd got its name**)
 
dd if=test conv=ucase
Output: dd if=test conv=ucase
TEST TEST TEST
0+1 records in
0+1 records out
15 bytes (15 B) copied, 0.000155 s, 96.8 kB/s
 
Awk:
cat test | awk '{print toupper($0)}' (use tolower($0) if you want to lcase)
Output: cat test | awk '{print toupper($0)}'
TEST TEST TEST

** From the history files, the name is said to come from “convert and copy”, but some wily compiler designers had already taken the “cc” command. Another commonly told origin is the DD (Data Definition) statement from IBM’s JCL.

Getting IP address from shell

Thursday, September 9th, 2010

So someone asked earlier on one of the social networking sites I follow how to get the ip address from the shell, here are two quick and dirty ways (using just cmd line utils that could be utilized from shell) – Use at your own risk :).

Linux: - You can specify a device or if you do not it returns them all.
ifconfig | grep inet | awk -F: '{print $2}' | grep -i bcast | sed -e 's/Bcast//g'
 
Output: 
 
$> ifconfig | grep inet | awk -F: '{print $2}' | grep -i bcast | sed -e 's/Bcast//g'
192.168.1.2
192.168.1.3
 
or
 
OSX / BSD: - If you leave off the final grep it will return them all (IPv6 as well) 
 
ifconfig | grep inet | awk '{print $2}' | grep -v ':'
(grep -v ':' drops every address containing a colon, which leaves only the IPv4 ones)
 
Output: 
 
$> ifconfig | grep inet | awk '{print $2}' | grep -v ':'
127.0.0.1
192.168.1.3
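
Both variants can be squeezed into a single awk call that copes with either output style. This is only a sketch under the assumption that ifconfig prints IPv4 lines as either "inet addr:1.2.3.4 ..." (older Linux net-tools) or "inet 1.2.3.4 ..." (OSX / BSD and newer Linux); the function name ipv4_addrs is made up for this example.

```shell
# ipv4_addrs
# Reads ifconfig output on stdin and prints one IPv4 address per line.
# Hypothetical helper; inet6 lines are skipped because their first field
# is "inet6", not "inet".
ipv4_addrs() {
    awk '$1 == "inet" { sub(/^addr:/, "", $2); print $2 }'
}

# Example usage:
#   ifconfig | ipv4_addrs
```

Unlike the Linux one-liner above, this also prints the loopback address, since it does not filter on Bcast.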