Archive for the ‘System Administration’ Category

About Junior Systems Admins…

Thursday, May 12th, 2011

This is a subject that is near and dear to most of us, as most of the systems admins I know have come up through the ranks. I believe that as we get more experienced, and hence take on more responsibility, we as systems admins tend to go with the mindset* of “Screw it, I can do this faster myself than show it and explain it to someone else” because of project constraints, deadlines** and the myriad other reasons we collectively don’t have the time. But I think this can be detrimental to those who are more junior and in need of good mentorship. Yes, I understand some of the difficulties here, but I think that where possible we should take a little extra time and effort to help them grow***. I honestly believe that I missed out on a lot of basic stuff when I was learning because the person I worked with, a very smart systems engineer, was under such time constraints that most often he would just do the task himself, not giving me the chance to learn. As a side note, I think it is also important to let juniors know the scope of their responsibilities, and that in the beginning it’s OK to ask questions.

Things I wish I would have known as a Junior****:

1. Ask questions – ask! ask! ask! But here is the flip side: I don’t want to have to answer the same question 19 times. Get a notebook, take notes and go google if you need to. I am not into hand-holding or doing someone’s job for them.

2. Knowledge – as more experienced admins we do not expect you to know everything as you are just building the foundation of knowledge you need to succeed in this field, I promise you there will be epiphanous moments, and the layers of abstraction will become clearer as time passes.

3. Expect mistakes – you will make mistakes, plain and simple. Do not be afraid of them, embrace them, learn from them and carry on. It’s not about whether you’re going to fall down, you will, it’s about how you pick yourself up and carry on which matters more.

4. Build up your toolsets – I don’t mean physically (though those help as well), I mean learn as much as you can, don’t limit yourself, read and learn on your own.

5. Listen – in the beginning you don’t know everything, listen to the more senior people, especially the good ones. They will not always be right (dammit you mean we don’t know it all?), but they will generally have been working with the environment long enough to know the common pitfalls.

6. Stand up for yourself – as a caveat to #5, even the most senior or those that have been doing it for years are not always right, they are human and make mistakes.

7. Stay away from poisonous people – this rule applies in life in general as well, but there are a lot of bitter, jaded systems administrators who will spew venom, you’ll be able to spot them a mile away. Try to keep your distance if you can (though I know it’s not always possible).

This is just my opinion, take of it what you will, thank you for reading.

* YMMV
** As Douglas Adams said “I love deadlines, I like the whooshing sound they make as they fly by.”
*** There are of course those that will never learn, and probably shouldn’t be in this business to begin with, but I digress…
**** This applies to programmers, network wonks, security freaks, etc…just s/systems admin/tech job/g

Couple of rules I follow in my job…

Wednesday, May 11th, 2011

I am a Systems Administrator (mainly Linux and Storage, some Windows) by trade, and there are a couple of rules that I have learned in the time I have been doing this job:

1. “Just because you can, doesn’t mean you should.” There are a ton of places where this is applicable, especially since most of us are responsible for a lot of sensitive data, and have the keys to the proverbial kingdom.

2. “You can do it right, or you can do it again.” (This one is mine, though probably not original.) I have seen, in my career, a lot of quickly done, rough, overly complex, and poorly designed solutions. I do believe that a lot of the time it is about simplicity, though there are times when a complex solution is necessary. Now don’t mistake me, I am just as apt to hack something together to get it working, but when it comes to actually designing for a production environment, it is my not so humble opinion that you should take a little extra time up front and design to the best of your ability.

ssh escape sequences

Wednesday, September 22nd, 2010

I use one of these escape sequences all the time as I tend to leave my connections going when I leave from work (and then have to VPN back in) so the ssh session hangs. ~. will disconnect a hung ssh session. Here are some of the other common ssh escape sequences:

~?
Supported escape sequences:
  ~.  - terminate connection (and any multiplexed sessions)
  ~B  - send a BREAK to the remote system
  ~C  - open a command line
  ~R  - Request rekey (SSH protocol 2 only)
  ~^Z - suspend ssh
  ~#  - list forwarded connections
  ~&  - background ssh (when waiting for connections to terminate)
  ~?  - this message
  ~~  - send the escape character by typing it twice
(Note that escapes are only recognized immediately after newline.)
 
Here is what they look like from my command line
(note these have to be done after a newline):
 
~. 
Output: Connection to somehost.somedomain.com closed.
 
~C
Output:
ssh> ?
Commands:
      -L[bind_address:]port:host:hostport    Request local forward
      -R[bind_address:]port:host:hostport    Request remote forward
      -D[bind_address:]port                  Request dynamic forward
      -KR[bind_address:]port                 Cancel remote forward
 
~^Z
Output:
[Wed Sep 22]{snyce@somehost in ~}$ ~^Z [suspend ssh]
 
[1]+  Stopped                 ssh -X somehost.somedomain.com
 
(you can resume by typing fg)
 
~#
Output:
The following connections are open:
  #0 client-session (t4 r0 i0/0 o0/0 fd 5/6 cfd -1)

Hope this helps.

Translate lower to upper (or vice versa) on Linux command line

Friday, September 17th, 2010

An acquaintance of mine asked a really good interview question, so I thought I would share. For the Linux wonks: describe three ways to translate from lower case to upper case on the command line (or vice versa). I have to admit I only thought of one, tr. I missed sed (which should have been obvious) and, last but not least, one I would never have thought of, dd. *I actually thought of another way using awk, so I’ll add that as well. Below are the examples:

I created a file called test with the following contents: test test test (to go from upper to lower with tr, swap the two arguments).

Tr:
tr: cat test | tr '[:lower:]' '[:upper:]'
Output: cat test | tr '[:lower:]' '[:upper:]'
TEST TEST TEST
 
tr: cat test | tr "a-z" "A-Z"
Output: cat test | tr "a-z" "A-Z"
TEST TEST TEST
 
Sed:
*With sed, replace U with L if you want to lower (\U and \L are GNU sed extensions; there are other ways as well):
 
sed: cat test | sed 's/\(.*\)/\U\1/'
Output: cat test | sed 's/\(.*\)/\U\1/'
TEST TEST TEST
 
dd: (bonus points if you know where dd got its name**)
 
dd if=test conv=ucase (use conv=lcase if you want to lower)
Output: dd if=test conv=ucase
TEST TEST TEST
0+1 records in
0+1 records out
15 bytes (15 B) copied, 0.000155 s, 96.8 kB/s
 
Awk:
cat test | awk '{print toupper($0)}' (use tolower($0) if you want to lcase)
Output: cat test | awk '{print toupper($0)}'
TEST TEST TEST

** From the history files, the name comes from “convert and copy” but some wily compiler designers had already taken the “cc” command.
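For anyone who wants to try the four answers side by side, here is a minimal sketch that wraps each one in a small function reading stdin. The function names are made up for this example, and the sed variant assumes GNU sed, since \U is not part of POSIX sed.

```shell
#!/bin/sh
# Each helper reads text on stdin and writes the upper-cased text to stdout.
# The function names are hypothetical, chosen only for this sketch.

upper_tr()  { tr '[:lower:]' '[:upper:]'; }   # classes quoted so the shell cannot glob them
upper_sed() { sed 's/.*/\U&/'; }              # \U is a GNU sed extension
upper_dd()  { dd conv=ucase 2>/dev/null; }    # dd prints its record counts on stderr
upper_awk() { awk '{ print toupper($0) }'; }  # $0 is the whole input line

# Run the same input through every variant and label the result.
for fn in upper_tr upper_sed upper_dd upper_awk; do
    printf '%-10s ' "$fn"
    printf 'test test test\n' | "$fn"
done
```

Swapping the two tr classes, using \L in sed, conv=lcase in dd, or tolower in awk gives the lower-case versions.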

Removing files by inode

Friday, September 17th, 2010

This is by no means original, nor is it mine, but I thought I would share. If you want a longer explanation go here. So basically what I did this morning was try to write a configuration file out of vim; I wasn’t paying sufficient attention and ended up writing a new file called ‘:wq. Instead of trying to figure out the quoting I just ran ls -i (to show the inode), which returned 42076104 ‘ (as well as a bunch of others; it was the xorg.conf file if you’re really interested). So to get rid of the file just use find . -inum inode* -exec rm -f '{}' \;, so for our file it would be:

find . -inum 42076104 -exec rm -f '{}' \;

Share and enjoy!
* had to update, sorry forgot angle brackets (or inequality signs) are interpolated as tags.
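Here is a small sketch of the same trick as a reusable function, tried out in a scratch directory so nothing real gets deleted. The function name rm_by_inode and the scratch file are assumptions made for this example. Note that every name under the directory that shares the inode (hard links) would be removed too.

```shell
#!/bin/sh
# rm_by_inode DIR INODE: delete whatever name(s) under DIR point at INODE.
# (Hypothetical helper name, not a standard tool.)
rm_by_inode() {
    dir=$1
    inode=$2
    # -xdev keeps find on one filesystem, because inode numbers are only
    # unique within a single filesystem.
    find "$dir" -xdev -inum "$inode" -exec rm -f -- {} +
}

# Demo: create an awkwardly named file, look up its inode, remove it.
scratch=$(mktemp -d)
touch "$scratch/':wq"
inode=$(ls -i "$scratch" | awk '{ print $1; exit }')
rm_by_inode "$scratch" "$inode"
ls -A "$scratch"    # lists nothing once the file is gone
rmdir "$scratch"
```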

Re-scan SCSI bus for Qlogic Card

Thursday, September 9th, 2010

Short Story: With Qlogic driver you can use:

#> echo "scsi-qlascan" > /proc/scsi/qla2300/3 (or your device/port)

You must do this for every port (if you have devices installed or balanced across ports). This is with the QLogic driver, so YMMV, and your numbers under /proc/scsi/qla2xxx/ (or whatever your version is) will probably vary.
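Since the rescan has to be repeated per port, a loop saves some typing. This is only a sketch: the function name is made up, and it assumes the port files are the all-numeric entries in the driver directory (like 3 and 4 on my box), which may not hold for other driver versions.

```shell
#!/bin/sh
# qla_rescan_all [DIR]: write the rescan keyword to every port file in DIR.
# DIR defaults to /proc/scsi/qla2300. Assumption: port files have purely
# numeric names; anything else (e.g. HbaApiNode) is skipped.
qla_rescan_all() {
    dir=${1:-/proc/scsi/qla2300}
    for port in "$dir"/*; do
        case ${port##*/} in
            ''|*[!0-9]*) continue ;;   # not a port number, leave it alone
        esac
        echo "scsi-qlascan" > "$port" && echo "rescanned $port"
    done
}

# Usage (as root, on a host that really has the QLogic driver loaded):
#   qla_rescan_all
#   qla_rescan_all /proc/scsi/qla2xxx
```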

Long Story: Sorry for the hiatus, been on vacation then got back and simultaneously had a tape library install and a virtual tape library install. So during the VTL install there was a bit of a hiccup with the main backup server seeing the SCSI devices that we presented from the VTL setup (over Fibre of course ;)). Now I am in the process of replacing the backup server (a hyper-threaded P4 w/6GB of RAM) with a dual quad-core w/32GB of RAM and a metric ton of disk space, for restores and such. Any whos, we needed the HBA to rescan the bus, and with the QLogic driver you can do the following (for as many ports as you have):

So on my system (the qla2342 is the HBA (yes old I know) with QLogic driver installed) when I do an ls on the /proc/scsi dir I get the following:

[Mon Jun 28 @ 22:45:27] # ls -CF /proc/scsi/
megaraid/ mptscsih/ qla2300/ scsi sg/

Since we know that the HBA is of the qlogic variety we then ls on the /proc/scsi/qla2300 dir:

[Mon Jun 28 @ 22:48:02] # ls -CF /proc/scsi/qla2300/
3 4 HbaApiNode

so if I then cat /proc/scsi/qla2300/3 it returns (this is only partial output)

[Mon Jun 28 @ 22:49:10] # cat /proc/scsi/qla2300/3
QLogic PCI to Fibre Channel Host Adapter for QLA2342:
Firmware version: 3.03.19, Driver version 7.07.06
Entry address = f8c82060
HBA: QLA2312 , Serial# Q20331
Request Queue = 0x36ce0000, Response Queue = 0x36cd0000
Request Queue count= 512, Response Queue count= 512
Total number of active commands = 1
Total number of interrupts = 679679694
Total number of active IP commands = 0
<...>
Commands retried with dropped frame(s) = 0
Configured characteristic impedence: 50 ohms
Configured data rate: 1-2 Gb/sec auto-negotiate
<...>
[Mon Jun 28 @ 22:55:02] # echo "scsi-qlascan" >/proc/scsi/qla2300/3

And that will force the SCSI bus rescan. For those that are interested, we also had to add the devices manually, which if you’ve never done it is fairly interesting. Here is how we did it for the devices on 3. After running the bus rescan the devices showed up starred (*) in /proc/scsi/qla2300/3, which meant they weren’t yet registered: the server saw them just fine, but they still had to be added. (I am writing this post-install so they are now registered, but the * shows up after the flags, as in flags 0x0*.)

SCSI LUN Information:
(Id:Lun)  * - indicates lun is not registered with the OS.
<...>
(10: 0): Total reqs 5056, Pending reqs 0, flags 0x0, 0:0:8b,
(10: 1): Total reqs 31, Pending reqs 0, flags 0x0, 0:0:8b,
(10: 2): Total reqs 31, Pending reqs 0, flags 0x0, 0:0:8b,
(10: 3): Total reqs 5, Pending reqs 0, flags 0x0, 0:0:8b,
(10: 4): Total reqs 5, Pending reqs 0, flags 0x0, 0:0:8b,

So you would then add each device:

echo 'scsi add-single-device 3 0 10 0' > /proc/scsi/scsi 
echo 'scsi add-single-device 3 0 10 1' > /proc/scsi/scsi 
echo 'scsi add-single-device 3 0 10 2' > /proc/scsi/scsi 
echo 'scsi add-single-device 3 0 10 3' > /proc/scsi/scsi 
echo 'scsi add-single-device 3 0 10 4' > /proc/scsi/scsi
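With more than a handful of LUNs, typing those lines gets old. Here is a sketch that only prints the commands for a LUN range so you can eyeball them before pasting them into a root shell. The function name and argument order are my own invention; the host, channel and ID values are the ones from this post.

```shell
#!/bin/sh
# scsi_add_cmds HOST CHANNEL ID FIRST_LUN LAST_LUN
# Prints one add-single-device command per LUN. Nothing is written to /proc,
# this is a dry run. (Hypothetical helper, not part of any driver package.)
scsi_add_cmds() {
    host=$1
    channel=$2
    id=$3
    lun=$4
    last=$5
    while [ "$lun" -le "$last" ]; do
        printf "echo 'scsi add-single-device %s %s %s %s' > /proc/scsi/scsi\n" \
            "$host" "$channel" "$id" "$lun"
        lun=$((lun + 1))
    done
}

# Reproduces the five lines above: host 3, channel 0, ID 10, LUNs 0 to 4.
scsi_add_cmds 3 0 10 0 4
```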

Here’s the snippet from the SCSI programming Howto:

Direct Link applicable section.

If a newer kernel and the /proc file system is running, a non-busy device can be removed and installed ‘on the fly’.

To remove a SCSI device:

echo "scsi remove-single-device a b c d" > /proc/scsi/scsi

and similar, to add a SCSI device, do:

echo "scsi add-single-device a b c d" > /proc/scsi/scsi

where

a == hostadapter id (first one being 0)
b == SCSI channel on hostadapter (first one being 0)
c == ID
d == LUN (first one being 0)

So in order to swap the /dev/sgc and /dev/sgd mappings from the previous example, we could do:

echo "scsi remove-single-device 0 0 4 0" > /proc/scsi/scsi 
echo "scsi remove-single-device 0 0 5 0" > /proc/scsi/scsi 
echo "scsi add-single-device 0 0 5 0" > /proc/scsi/scsi 
echo "scsi add-single-device 0 0 4 0" > /proc/scsi/scsi

since generic devices are mapped in the order of their insertion.

Hope this helps! Till next time, be kind, share and enjoy.

Getting IP address from shell

Thursday, September 9th, 2010

So someone asked earlier on one of the social networking sites I follow how to get the ip address from the shell, here are two quick and dirty ways (using just cmd line utils that could be utilized from shell) – Use at your own risk :).

Linux: - You can specify a device or if you do not it returns them all.
ifconfig | grep inet | awk -F: '{print $2}' | grep -i bcast | sed -e 's/Bcast//g'
 
Output: 
 
$> ifconfig | grep inet | awk -F: '{print $2}' | grep -i bcast | sed -e 's/Bcast//g'
192.168.1.2
192.168.1.3
 
or
 
OSX / BSD: - If you leave off the final grep it will return them all (IPv6 as well)
 
ifconfig | grep inet | awk '{print $2}' | grep -v ':'
(the final grep drops every address that contains a colon, which leaves only the IPv4 ones)
 
Output: 
 
$> ifconfig | grep inet | awk '{print $2}' | grep -v ':'
127.0.0.1
192.168.1.3
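If you would rather have one pipeline for both flavours, the following sketch does the filtering in a single awk call. It assumes the two ifconfig layouts shown in this post (the older Linux "inet addr:..." form and the BSD "inet ..." form); the function name is made up, and unlike the Linux one-liner above it does not drop the loopback address.

```shell
#!/bin/sh
# inet4_addrs: read ifconfig-style text on stdin, print one IPv4 address per line.
# Only lines whose first field is exactly "inet" are used, so inet6 lines are
# skipped without a separate grep. (Hypothetical helper name.)
inet4_addrs() {
    awk '$1 == "inet" {
        addr = $2
        sub(/^addr:/, "", addr)   # older Linux ifconfig prefixes the address with addr:
        print addr
    }'
}

# Sample input covering both layouts; on a real box use: ifconfig | inet4_addrs
inet4_addrs <<'EOF'
          inet addr:192.168.1.2  Bcast:192.168.1.255  Mask:255.255.255.0
          inet6 addr: fe80::1/64 Scope:Link
        inet 127.0.0.1 netmask 0xff000000
EOF
```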

NetApp PAM quick trick

Thursday, September 9th, 2010

PAM Trick: So today at work we did a massive head upgrade (from 6070s to 6080s) as well as some shelf swaps. The new heads have Flash Cache cards in them, half a terabyte of memory on a card used for caching, which ideally should speed stuff up and keep the overall IOPS down on the busiest shelves. Now I will give credit where credit is due: my coworker Dalvenjah (yes that is a handle) did a very slick thing to get a MySQL db into cache quickly. The database was stopped, he cd’d into the data dir, then ran cat * > /dev/null, which sequentially read the files, essentially pumping them into the cache. As the first read comes from disk and all subsequent reads come from cache, the speedup was awesome. Just thought I would share :). I am quite sure you can think of other just as slick ways to get data into cache.
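A slightly more general take on the same idea, as a sketch: cat * only reads the files in the current directory, so this version uses find to walk subdirectories as well. The function name and the example path are assumptions, not what Dalvenjah actually ran.

```shell
#!/bin/sh
# warm_cache DIR: read every regular file under DIR once and discard the bytes,
# so that later reads can be served from cache. (Hypothetical helper name.)
warm_cache() {
    find "$1" -type f -exec cat {} + > /dev/null
}

# Usage, with the database stopped (the path is only an example):
#   warm_cache /var/lib/mysql
```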

Screen FTW!

Thursday, September 9th, 2010

Screen is, in my obviously not so humble opinion (this is my blog as it were), one of the most important tools a Systems Admin can have in their toolkit. As a Systems and Backup administrator I have learned quite a few things over the years, but one of the most important is the screen command. If you don’t have this on your system, see about getting it as soon as possible (sorry, it’s GNU so no Windows that I’m currently aware of); get the latest version here. Here are a few of the things that I like and have learned.

Since I tend to do most of my restores from the command line, and some of them can be rather large (yes I know this is subjective – usually between a few GB’s and multi-terabyte), I have learned to always run my restores in a screen session, as if my terminal gets killed / cancelled / blowed up, the restore session continues (I have been bitten by this before, 3TB’s into a restore and my box got b0rked, blargle restore died).
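One way to make the "always in screen" habit stick is a tiny guard around long-running commands. This is a sketch only: it relies on screen exporting the STY variable to the shells it starts, and the function names are made up for the example.

```shell
#!/bin/sh
# in_screen: succeed only when this shell was started inside a screen session.
# screen sets STY (the session name) in the environment of its windows.
in_screen() {
    [ -n "${STY:-}" ]
}

# run_in_screen_only CMD...: refuse to start a long job from a bare terminal.
# (Hypothetical wrapper, not part of screen itself.)
run_in_screen_only() {
    if in_screen; then
        "$@"
    else
        echo "not inside screen; start one first with: screen -S <name>" >&2
        return 1
    fi
}

if in_screen; then
    echo "inside screen session $STY"
else
    echo "not inside screen"
fi
```

You would then prefix the restore command with run_in_screen_only, and it simply refuses to start unless you are inside a session.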

Here are some of the neato things you can do with screen:
 
Basics: In this regard when you see cap C, it stands for the CTRL key.
 
Start a new screen session: shell#> screen -S <name>
 
Detach from screen session: C - a d (means press CTRL + a (lower case), then d)
 
List screen sessions: screen -ls
 
Re-attach to screen session: screen -r pid or screen -r name
 
One of the coolest aspects of screen is the ability to do multiuser sessions.
 
Make screen multiuser: C - a then type : (colon) multiuser on
 
See who is attached to multiuser: C - a * or C - a :displays 
 
Split the screen: C - a S or C - a :split
 
Switch windows (regions): C - a <Tab> or C - a :focus
 
So when you create a new region it will be blank. You can then type C - a C - c to create a new window, then C - a <Tab> to move to the new region, then type C - a 1 to switch the new region to window 1 (the one you just created; the first is always 0).
 
Close all other regions: C - a Q or C - a :only

That’s just a small bit of what screen can do. Go check out the manual; there are lots more things you can do.

Share and enjoy.

One cool Bashism

Thursday, September 9th, 2010

Sometimes when working the cli you need to quickly run the last command, and either:

1. change something or 2. remove something

Using ^ (caret) substitution makes that easy: it re-runs the previous command, substituting or removing what you specify between the carets (^^). Here are a couple of real *examples that I use frequently (and I’m sure you can think of quite a few more):

1. Change something.

Since I work on the cli a lot, and jump from host to host to get information, do stuff, get files, etc. I myself will sometimes jump to the wrong host, and as long as they are part of the same domain you can easily rerun the ssh command without having to type it all out again, e.g.:

shell#> ssh host1.mydomain.com

Now if host2 lives on the same domain you can CTRL+C, then run

shell#> ^host1^host2

and the shell will run ssh host2.mydomain.com.

2. Remove something.

Sometimes I want to remove an [option|name|file|pipe|blah] from the previous command:

Test to see if all dependencies for package are met:

shell#> rpm -ivh --test really-long-named-rpm.with.version.number.rpm

I don’t necessarily want to hit the up arrow (or k if you’re leet enough to use vi mode in cli (set -o vi)), backspace (no I’m not going to cover all the vi commands), remove --test and re-run the command. Here’s where ^ substitution comes into play, if I run:

shell#> ^--test^

It will happily re-run the previous command sans the --test option (you have to be specific: if you just did ^test^ it would leave the -- (double dash) and might cause havoc, depending on the command). The substitution works from left to right, and only on the first occurrence. It works like this: ^thing_to_replace^replacement. So in the above example we are substituting nothing for --test, essentially removing it. Obviously the one major caveat is that it only works on the previous command in the history.
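History expansion only happens in an interactive shell, so to show the "first occurrence only" rule in a script, here is a sketch that mimics it with plain parameter expansion. caret_sub is a made-up name and is not how bash implements it; it just produces the same text the shell would re-run.

```shell
#!/bin/sh
# caret_sub PREV OLD NEW: print PREV with the first occurrence of OLD replaced
# by NEW, which is what ^OLD^NEW does to the previous command line.
# (Hypothetical helper for illustration only.)
caret_sub() {
    prev=$1
    old=$2
    new=$3
    [ -n "$old" ] || return 2              # nothing to search for
    case $prev in
        *"$old"*)
            before=${prev%%"$old"*}        # text left of the first match
            after=${prev#*"$old"}          # text right of the first match
            printf '%s\n' "$before$new$after"
            ;;
        *)
            return 1                       # OLD not found, like a failed substitution
            ;;
    esac
}

caret_sub 'ssh host1.mydomain.com' host1 host2
caret_sub 'rpm -ivh --test really-long-named-rpm.with.version.number.rpm' '--test ' ''
```

In an interactive bash, ^old^new is shorthand for !!:s/old/new/, and !!:gs/old/new/ should replace every occurrence rather than just the first.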

Share and enjoy.

*Yes these examples may be lame, but they give you an idea.

Meta:
Contact:
Scott:
Systems Engineer, Geek, Horror fan.


Barry:
Systems Administrator / Geek.
