Monday, August 9, 2010

How I learned to stop worrying and love the sniffer

Once upon a time one used sockets to speak to the network. You basically pushed your little bits out onto the network by hand, and you could almost physically feel them sail away into the void. Today I tend to use high-level libraries, and it is often not trivial to figure out what calling a method like createObjectOnRemoteSystem actually results in. Logs are good, but sniffing the network is sometimes the best way to figure out what is really going on. On many occasions a good network sniffer is the difference between being totally stuck and solving a problem.

My favorite sniffer is Wireshark (formerly known as Ethereal). This sniffer is free and has a very good protocol analyzer while still giving you convenient access to the raw bits. The user interface may take some time to get used to, so I thought I should write up a quick introduction to some simple tasks.
Let me use an actual incident as an example of how to use a sniffer to find the root cause of a service outage in a complex environment.

The system outage first showed up in an application that basically provides an interface that translates web service calls into LDAP queries against an AD database. The application suddenly started failing, complaining in the logs that it couldn’t bind to the AD server.

The first step was to check that it was possible to bind to the AD server that was the primary login controller for the application server that hosted the service. That worked fine.

The second step was to look in the logs on the AD server to see if there were any entries about the failed binds. Unfortunately, everything looked fine.

Now things looked a bit confusing. Had the application gone totally off the rails? I decided to install Wireshark and see if the app was communicating with the DC at all.

Sniffing traffic with Wireshark is easy:
  1. Start up Wireshark
  2. Pick Capture -> Interfaces
  3. Pick the Network Interface Card that you want to listen to and press Start.
  4. Generate the traffic you want to listen to
  5. Press Capture -> Stop when you are done
You have now sniffed the traffic. Next up is analyzing it.
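
If you would rather script the capture than click through the GUI, the same steps can be roughly sketched with tshark, the command-line companion that ships with Wireshark. The little Python helper below only builds the command line; the interface name "eth0", the file name "ldap.pcap" and the helper function itself are my own placeholders, not anything from the incident.

```python
import shlex


def build_capture_command(interface, output_file, capture_filter=None, duration_s=None):
    """Build a tshark command line that mirrors the GUI capture steps.

    -i picks the network interface, -w writes the packets to a file,
    -f applies a capture filter and -a duration:N stops after N seconds.
    """
    cmd = ["tshark", "-i", interface, "-w", output_file]
    if capture_filter:
        cmd += ["-f", capture_filter]
    if duration_s is not None:
        cmd += ["-a", f"duration:{duration_s}"]
    return cmd


if __name__ == "__main__":
    # "eth0" and "ldap.pcap" are placeholder names for illustration only.
    cmd = build_capture_command("eth0", "ldap.pcap", "tcp port 389", 30)
    print(shlex.join(cmd))
    # To actually capture (usually needs elevated privileges):
    # import subprocess; subprocess.run(cmd, check=True)
```

The resulting file can be opened in Wireshark for the analysis step.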

Analyzing can be a bit challenging. This is especially true on a network with a lot of traffic, where your traffic will simply drown in the background noise. The trick here is to figure out a good filter that lets you find your signal. Below are a couple of filters that I have found useful over the years.

  • "tcp.port==389" gets you all TCP traffic on that port (LDAP in this case)
  • "ip.addr==192.168.1.30" gets you all IP-based traffic to and from that specific host
Once you have typed in a good filter and pressed the Apply button you should be able to find your traffic. If you can’t filter directly for the traffic you want to look at, you can filter on the triggering event instead, select that packet, and then remove the filter by pressing Clear. Usually you can find your packet of interest just below the triggering event.
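
Filters like the two above can also be combined with && (and) and || (or), which is handy when one condition alone still leaves too much noise. Here is a small sketch that builds such filter strings in Python; the helper names are made up for this post, and it uses ip.addr, the Wireshark field that matches either the source or the destination address.

```python
def port_filter(port):
    """Match TCP traffic where the source or destination port equals `port`."""
    return f"tcp.port=={port}"


def host_filter(address):
    """Match IP traffic to or from `address`."""
    return f"ip.addr=={address}"


def all_of(*filters):
    """Combine filters so that every one of them must match."""
    return " && ".join(f"({f})" for f in filters)


def any_of(*filters):
    """Combine filters so that at least one of them must match."""
    return " || ".join(f"({f})" for f in filters)


if __name__ == "__main__":
    # LDAP traffic to or from one specific host
    print(all_of(port_filter(389), host_filter("192.168.1.30")))
```

Paste the printed string into the Wireshark filter box and press Apply.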

In the lower part of the screen you can see the protocol layer stack. Depending on what you are doing you may be more interested in the application layer or the network layer, so you can expand or collapse the different layers by clicking on the + signs on the left.
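
To get a feeling for what that layer stack represents, here is a minimal sketch that peels the Ethernet, IP and TCP layers off a raw frame by hand. It is nowhere near what Wireshark's dissectors do (no VLAN tags, no IPv6, no TCP options), and the frame is a made-up sample that I build in the code, not a packet from the incident.

```python
import socket
import struct


def format_mac(raw):
    """Render 6 raw bytes as a colon-separated MAC address."""
    return ":".join(f"{b:02x}" for b in raw)


def parse_frame(frame):
    """Return a dict with one entry per layer found: ethernet, ip, tcp."""
    dst_mac, src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    layers = {"ethernet": {"src": format_mac(src_mac), "dst": format_mac(dst_mac)}}
    if ethertype != 0x0800:  # only IPv4 is handled in this sketch
        return layers
    ip = frame[14:]
    header_len = (ip[0] & 0x0F) * 4  # IHL is counted in 32-bit words
    protocol = ip[9]
    layers["ip"] = {
        "src": socket.inet_ntoa(ip[12:16]),
        "dst": socket.inet_ntoa(ip[16:20]),
        "protocol": protocol,
    }
    if protocol != 6:  # 6 is TCP
        return layers
    src_port, dst_port = struct.unpack("!HH", ip[header_len:header_len + 4])
    layers["tcp"] = {"src_port": src_port, "dst_port": dst_port}
    return layers


def sample_frame():
    """Build a fake TCP SYN to port 389; all addresses are invented."""
    eth = struct.pack("!6s6sH", bytes.fromhex("001122334455"),
                      bytes.fromhex("66778899aabb"), 0x0800)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, 6, 0,
                     socket.inet_aton("192.168.1.10"),
                     socket.inet_aton("192.168.1.30"))
    tcp = struct.pack("!HHIIHHHH", 49152, 389, 0, 0, (5 << 12) | 0x002, 65535, 0, 0)
    return eth + ip + tcp


if __name__ == "__main__":
    for name, fields in parse_frame(sample_frame()).items():
        print(name, fields)
```

Each dictionary entry corresponds to one of the expandable rows in Wireshark's packet details pane.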

In this specific example I discovered that my app was talking to a completely different AD domain controller than the one I had checked. Once that DC was taken down, my application rebound to another DC and suddenly started working again.
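
A quick way to spot this kind of surprise is to count which servers the LDAP traffic in a capture is actually going to. The sketch below assumes you export the destination addresses with tshark (-r reads a capture file, -Y applies a display filter, -T fields with -e ip.dst prints one destination per packet) and then tallies them in Python. The file name and the sample addresses are invented for illustration.

```python
from collections import Counter


def destinations_command(capture_file, display_filter="tcp.dstport==389"):
    """Build a tshark command that prints the destination IP of each matching packet."""
    return ["tshark", "-r", capture_file, "-Y", display_filter,
            "-T", "fields", "-e", "ip.dst"]


def count_destinations(tshark_output):
    """Count packets per destination address, ignoring blank lines."""
    return Counter(line.strip() for line in tshark_output.splitlines() if line.strip())


if __name__ == "__main__":
    # Invented sample output; in practice it would come from something like
    # subprocess.run(destinations_command("ldap.pcap"),
    #                capture_output=True, text=True).stdout
    sample_output = "10.0.0.5\n10.0.0.5\n192.168.1.30\n10.0.0.5\n"
    for address, packets in count_destinations(sample_output).most_common():
        print(address, packets)
```

If the busiest address is not the DC you expected, you have found your suspect.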

There are many more nifty tricks you can use in Wireshark but I think this is enough for one posting. Stay tuned for more! (cue "We'll Meet Again")

2 comments:

  1. Hi Martin,

    Nice post.

    Along with what you discussed, we also use "tcpdump" (it is flexible, can monitor two boxes, etc.), and it has an option for writing a capture file that Wireshark can read (I think -w).

    Keep writing more. I am now following you on Google reader.

    Cheers,
    Vijay Chinnasamy

  2. Tcpdump is another great sniffer, especially if you are in a Unix environment.
