It is a source of constant amazement to me that most SysAdmins (shorthand for System Administrators) have so little understanding of the applications running on their iron, apart from a passing “that’s the mail server”. Knowing exactly what your server is doing in normal operation makes it easier to troubleshoot when things aren’t “normal”.
Baselining
Everyone hates documenting system builds. It’s as much a truism as “the sky is blue”, “politicians always lie” and “whatever can go wrong will go wrong in the most spectacular way at the most inopportune moment”. Yet something as simple as a capture of what a server is doing just prior to deployment can make firefighting much easier later on. There’s a bare minimum of information that I like to have on Dropbox/Google Drive/Evernote for each server that I manage.
- Hostname
- DNS and LDAP/Active Directory domain names
- DNS server
- Authentication server (LDAP / Kerberos / Local)
- Local administration username/password
- Edited output of “netstat -ao” (Windows) or “netstat -tunlp” (Linux)
- Edited output of “tasklist” (Windows) or “ps -ef” (Linux)
You’re probably wondering “Why is the output of netstat so helpful?”. In one very concise text file, I can see:
- what network ports are open, which gives me an idea of what applications are running
- what network addresses an application listens on
- what hosts a server is talking to, which gives me an idea of the rest of the infrastructure, and what other dependencies might be at play
- which executable and process name sits behind each port, once combined with the “tasklist” / “ps -ef” output
Adding the “-o” (Windows) or “-p” (Linux) flag gives one really useful piece of information – the Process ID (PID) of the application that is holding that network port open. On Windows, you can then use “tasklist” to match the network port+PID to the application, whereas on Linux the “netstat -p” output already lists the program name next to the PID.
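If you'd rather not match PIDs by eye, the join described above can be scripted. What follows is a minimal sketch, not a finished tool: it assumes the text was captured on Windows with “netstat -ano” (the extra “-n” keeps addresses numeric) and “tasklist /FO CSV /NH”, and that the output is in English. The sample data, the process name “mailsrv.exe” and all function names are made up for illustration.

```python
import csv
import io

# Hypothetical sample data, trimmed to the shape the two commands produce.
SAMPLE_NETSTAT = """\
Active Connections

  Proto  Local Address          Foreign Address        State           PID
  TCP    0.0.0.0:25             0.0.0.0:0              LISTENING       2104
  TCP    0.0.0.0:445            0.0.0.0:0              LISTENING       4
  TCP    192.168.1.10:25        192.168.1.50:51234     ESTABLISHED     2104
  UDP    0.0.0.0:123            *:*                                    1100
"""

SAMPLE_TASKLIST = '''\
"System","4","Services","0","1,024 K"
"w32tm.exe","1100","Services","0","4,512 K"
"mailsrv.exe","2104","Services","0","58,300 K"
'''


def parse_netstat(text):
    """Return (proto, local, remote, state, pid) tuples from netstat text."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[0] == "TCP":
            proto, local, remote, state, pid = fields
        elif len(fields) == 4 and fields[0] == "UDP":
            # UDP lines have no State column.
            proto, local, remote, pid = fields
            state = ""
        else:
            continue  # banner, header, or blank line
        if pid.isdigit():
            rows.append((proto, local, remote, state, int(pid)))
    return rows


def parse_tasklist(text):
    """Return {pid: image name} from tasklist CSV text (no header row)."""
    names = {}
    for row in csv.reader(io.StringIO(text)):
        if len(row) >= 2 and row[1].isdigit():
            names[int(row[1])] = row[0]
    return names


def listeners_by_process(netstat_text, tasklist_text):
    """Group listening TCP sockets and bound UDP sockets by process name."""
    names = parse_tasklist(tasklist_text)
    result = {}
    for proto, local, remote, state, pid in parse_netstat(netstat_text):
        if proto == "TCP" and state != "LISTENING":
            continue  # skip established/closing connections
        name = names.get(pid, "unknown")
        result.setdefault(name, []).append(f"{proto} {local}")
    return result


baseline = listeners_by_process(SAMPLE_NETSTAT, SAMPLE_TASKLIST)
for name, sockets in sorted(baseline.items()):
    print(name, sockets)
```

Saving that grouped output alongside the rest of the server notes gives you something to compare against when a port shows up that wasn't there at deployment.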
Another really helpful tool for Windows servers is Process Explorer – it’s a more interactive way of drilling down into processes, ports, open files, and a bunch of other useful bits that I may talk about in a future post.
Traffic Analysis
No, that is not a typo – if you run a server, you should know what the traffic going in and out looks like. A good SysAdmin should be able to:
- understand (and be able to explain clearly) the TCP/IP addressing and subnetting in use on the network
- capture packet traces using tools like “Wireshark” or tcpdump
- filter a packet capture to narrow down a particular “stream” of traffic for analysis
- read through and understand, at a basic level, the session setup, protocol and data communications, and session tear down
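For the addressing and subnetting item in the list above, a quick way to check your own arithmetic is Python's standard “ipaddress” module. This is only a sketch: the 192.168.10.x addresses and the function names are invented for the example, and the “usable hosts” figure uses the conventional minus-two rule for IPv4, which doesn't hold for /31 and /32 networks.

```python
import ipaddress


def describe_subnet(interface_cidr):
    """Summarise the subnet that an address like '192.168.10.77/26' lives in."""
    iface = ipaddress.ip_interface(interface_cidr)
    net = iface.network
    return {
        "address": str(iface.ip),
        "network": str(net),
        "netmask": str(net.netmask),
        "broadcast": str(net.broadcast_address),
        # Conventional IPv4 rule: drop the network and broadcast addresses.
        "usable_hosts": max(net.num_addresses - 2, 0),
    }


def same_subnet(interface_cidr, other_ip):
    """True if other_ip is on the same subnet, i.e. reachable without a router."""
    net = ipaddress.ip_interface(interface_cidr).network
    return ipaddress.ip_address(other_ip) in net


info = describe_subnet("192.168.10.77/26")
for key, value in info.items():
    print(f"{key:>12}: {value}")

print(same_subnet("192.168.10.77/26", "192.168.10.100"))
print(same_subnet("192.168.10.77/26", "192.168.10.130"))
```

The second function answers the question that matters most when reading a trace: whether two hosts should be talking directly or through a gateway.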
Before you come after me with torches and pitchforks, let me explain why this is a useful skill. At some point in time, any SysAdmin is going to hear “Why is the <insert application> server running so slow?”. Now an entry-level SysAdmin would take a look at the monitoring systems (you do have something monitoring your servers, right?) or jump onto the server in question and check all the basics such as CPU load, memory usage, disk I/O and network I/O, then blame it on the NetAdmins and wash their hands of it. Until the NetAdmins dump that little burning coal right back into their lap.
But you’re not an entry-level SysAdmin, so you take it to a “Whole Nutha Level” and start a packet capture on the <insert application> server AND on a client system that is trying to access it. After a little more digging, your Wireshark analysis shows a high number of TCP retransmissions on the server’s NIC (Network Interface Card), even though the NIC is nowhere near capacity. Send that burning coal to the NetAdmins, with the packet capture attached, and you’ve now a) got some useful hard data to back up your hypothesis, and b) put the NetAdmins on the back foot before they can come back with “It’s a server issue.”
In all seriousness, the judicious use of Wireshark can help nail down those tricky application performance issues by detecting network congestion, faulty NICs, faulty or misconfigured network switch ports, dodgy cabling, or latency issues.
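To make the retransmission idea concrete, here is a deliberately simplified sketch. It assumes you have already exported the TCP segments from a capture into plain records. The “Segment” fields, the sample addresses and both function names are hypothetical. The rule it applies (a data segment whose flow, sequence number and length have been seen before counts as a retransmission) is a rough heuristic of my own for illustration, not the analysis Wireshark itself performs.

```python
from collections import Counter, namedtuple

# Hypothetical record for one TCP segment exported from a capture.
Segment = namedtuple("Segment", "src sport dst dport seq length")


def count_retransmissions(segments):
    """Count repeated data segments per flow (simplified heuristic)."""
    seen = set()
    retransmissions = Counter()
    for seg in segments:
        if seg.length == 0:
            continue  # pure ACKs carry no data, so repeats are expected
        flow = (seg.src, seg.sport, seg.dst, seg.dport)
        key = (flow, seg.seq, seg.length)
        if key in seen:
            retransmissions[flow] += 1
        else:
            seen.add(key)
    return retransmissions


def retransmission_rate(segments):
    """Fraction of data-carrying segments that were repeats."""
    data_segments = [s for s in segments if s.length > 0]
    if not data_segments:
        return 0.0
    repeats = sum(count_retransmissions(segments).values())
    return repeats / len(data_segments)


# Made-up trace: one client upload with a single repeated segment.
trace = [
    Segment("10.0.0.5", 51000, "10.0.0.9", 443, 1000, 500),
    Segment("10.0.0.5", 51000, "10.0.0.9", 443, 1500, 500),
    Segment("10.0.0.9", 443, "10.0.0.5", 51000, 7000, 0),
    Segment("10.0.0.5", 51000, "10.0.0.9", 443, 1500, 500),  # sent again
    Segment("10.0.0.5", 51000, "10.0.0.9", 443, 2000, 500),
    Segment("10.0.0.9", 443, "10.0.0.5", 51000, 7000, 0),
]

counts = count_retransmissions(trace)
rate = retransmission_rate(trace)
print(dict(counts), rate)
```

A rate like this, taken from both the server-side and the client-side capture, is the kind of hard number that makes the conversation with the NetAdmins much shorter.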
I’d also like to point you at another blog called PacketBomb – it’s written by a colleague of mine who REALLY knows how to fly Wireshark. Drop by and say “Hi!” to Kary. The “PacketBomb Fundamentals” is a great place to start with Wireshark.
If you’re feeling a little out of your depth with the addressing, subnetting and other arcana of TCP/IP networking, please take a look at the links below:
- Networks 101 – AP Lawrence
- TCP 3-Way Handshaking (and other useful articles) – InetDaemon
- Wireshark – Wireshark.org
As always, I’d love to hear from you about:
- how do you document your systems?
- other online resources for network analysis