Internet based NETWORK TOOL FOR DOMAIN
http://www.internic.net http://www.checkdns.net/quickcheckdomainf.aspx
http://network-tools.com/default.asp?prog=dnsrec&host=niec.edu.np
http://www.kloth.net/services/ http://openspf.org http://dnsstuff.com www.dnsgoodies.com/
Spam Database check
www.dnsbl.info/advanced.asp ,www.sorbs.net/lookup.shtml http://tools.web-max.ca/dsbl.php www.spamcop.net
An Introduction to TCP/IP
TCP/IP is a universal standard suite of protocols used to provide connectivity between networked devices. Its layers are commonly mapped onto the larger OSI reference model, which is used to describe most data communications.
One component of TCP/IP is the Internet Protocol (IP), which is responsible for addressing data and routing it between two addresses. IP protects only its own header with a checksum; detecting corrupted or missing data is left to the protocols that ride on top of it.
For manageability, the data is usually split into multiple pieces or packets, each with its own error detection bytes in the control section or header of the packet. The remote computer then receives the packets and reassembles the data and checks for errors. It then passes the data to the program that expects to receive it.
How does the computer know what program needs the data? Each IP packet also contains a piece of information in its header called the protocol field. This informs the computer receiving the data about the type of layer 4 transport mechanism being used.
The two most popular transport mechanisms used on the Internet are Transmission Control Protocol (TCP) and User Datagram Protocol (UDP).
When the type of transport protocol has been determined, the TCP/UDP header is then inspected for the port value, which is used to determine which network application on the computer should process the data. This is explained in more detail later.
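The two-step lookup described above (transport protocol first, then port) can be sketched in a few lines of Python. This is only an illustration: the function names, the sample addresses, and the hand-built packet are assumptions made for the example, and the IP checksum is left at zero.

```python
import struct

# IANA protocol numbers carried in the IPv4 header's protocol field.
PROTOCOLS = {1: "ICMP", 6: "TCP", 17: "UDP"}

def describe_packet(packet: bytes) -> dict:
    """Return the transport protocol and ports found in a raw IPv4 packet."""
    # The low 4 bits of the first byte give the header length in 32-bit words.
    header_len = (packet[0] & 0x0F) * 4
    protocol = packet[9]  # byte 9 of the IPv4 header is the protocol field
    info = {"protocol": PROTOCOLS.get(protocol, str(protocol))}
    if protocol in (6, 17):
        # TCP and UDP headers both begin with 16-bit source and destination ports.
        src_port, dst_port = struct.unpack("!HH", packet[header_len:header_len + 4])
        info.update(src_port=src_port, dst_port=dst_port)
    return info

def build_sample_packet() -> bytes:
    """Hand-build a 20-byte IPv4 header plus the first 4 bytes of a TCP header.

    Addresses and ports are hypothetical; the header checksum is left at 0.
    """
    ip_header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, 24, 0, 0, 64, 6, 0,          # version/IHL ... TTL=64, protocol=6 (TCP)
        bytes([192, 168, 1, 102]), bytes([192, 168, 1, 100]),
    )
    tcp_ports = struct.pack("!HH", 2049, 80)  # source port 2049, destination port 80
    return ip_header + tcp_ports

print(describe_packet(build_sample_packet()))
```

A real stack performs the same lookup in the kernel; the sketch only shows which header bytes drive the decision.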
TCP Is a Connection-Oriented Protocol
TCP opens up a virtual connection between the client and server programs running on separate computers so that multiple and/or sporadic streams of data can be sent over an indefinite period of time between them. TCP keeps track of the packets sent by giving each one a sequence number with the remote server sending back acknowledgment packets confirming correct delivery. Programs that use TCP therefore have a means of detecting connection failures and requesting the retransmission of missing packets. TCP is a good example of a connection-oriented protocol.
How TCP Establishes a Connection
Any form of communication requires some form of acknowledgment for it to become meaningful. Someone knocks on the door to a house, the person inside asks, “Who is it?”, to which the visitor replies, “It’s me!” Then the door opens. Both persons knew who was on the other side of the door before it opened and a conversation can now begin.
TCP acts in a similar way. The server initiating the connection sends a segment with the SYN bit set in the TCP header. The target replies with a segment with the SYN and ACK bits set, to which the originating server replies with a segment with only the ACK bit set. This SYN, SYN-ACK, ACK mechanism is often called the “three-way handshake.”
The communication then continues with a series of segment exchanges, each with only the ACK bit set. When one of the servers needs to end the communication, it sends a segment to the other with the FIN and ACK bits set, to which the other server also replies with a FIN-ACK segment. The communication terminates with a final ACK from the server that wanted to end the session.
This is the equivalent of ending a conversation by saying “I really have to go now, I have to go for lunch,” to which the reply is, “I think I’m finished here too, see you tomorrow.” The conversation ends with a final “bye” from the hungry person.
Here is a modified packet trace obtained with the tethereal program, which is covered in “Simple Network Troubleshooting.” You can clearly see the three-way handshake that opens the session and the FIN-ACK exchange that closes it:
hostA -> hostB TCP 1443 > http [SYN] Seq=9766 Ack=0 Win=5840 Len=0
hostB -> hostA TCP http > 1443 [SYN, ACK] Seq=8404 Ack=9767 Win=5792 Len=0
hostA -> hostB TCP 1443 > http [ACK] Seq=9767 Ack=8405 Win=5840 Len=0
hostA -> hostB HTTP HEAD / HTTP/1.1
hostB -> hostA TCP http > 1443 [ACK] Seq=8405 Ack=9985 Win=54 Len=0
hostB -> hostA HTTP HTTP/1.1 200 OK
hostA -> hostB TCP 1443 > http [ACK] Seq=9985 Ack=8672 Win=6432 Len=0
hostB -> hostA TCP http > 1443 [FIN, ACK] Seq=8672 Ack=9985 Win=54 Len=0
hostA -> hostB TCP 1443 > http [FIN, ACK] Seq=9985 Ack=8673 Win=6432 Len=0
hostB -> hostA TCP http > 1443 [ACK] Seq=8673 Ack=9986 Win=54
In this trace, the sequence number represents the serial number of the first byte of data in the segment. So in the first line, a random value of 9766 was assigned to the first byte, and all subsequent bytes for the connection from this host will be sequentially tracked. This makes the second byte in the segment number 9767, the third number 9768, etc. The acknowledgment number, or Ack, not to be confused with the ACK bit, is the sequential serial number of the first byte of the next segment it expects to receive from the other end, and the total number of bytes cannot exceed the Win or window value that follows it. If data isn’t received correctly, the receiver will re-send the requesting segment asking for the information to be sent again. The TCP code keeps track of all this along with the source and destination ports and IP addresses to ensure that each unique connection is serviced correctly.
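The sequence and acknowledgment arithmetic in the trace can be checked with a short Python sketch. The function names are hypothetical; the numbers come straight from the trace, and the only extra rule assumed is the standard one that a SYN or FIN consumes one sequence number even though it carries no data.

```python
def expected_ack(seq: int, payload_len: int, syn: bool = False, fin: bool = False) -> int:
    """Acknowledgment number the peer should send after receiving this segment."""
    # SYN and FIN each consume one sequence number even though they carry no data.
    return (seq + payload_len + int(syn) + int(fin)) % 2**32

def payload_length(seq: int, ack_from_peer: int) -> int:
    """Bytes of data covered by an acknowledgment (segment without SYN or FIN)."""
    return (ack_from_peer - seq) % 2**32

# hostA's SYN (Seq=9766) is answered with Ack=9767.
print(expected_ack(9766, 0, syn=True))
# hostA's HEAD request starts at Seq=9767 and hostB answers with Ack=9985.
print(payload_length(9767, 9985))
# hostB's FIN (Seq=8672) is answered with Ack=8673.
print(expected_ack(8672, 0, fin=True))
```

Working backward from the acknowledgments, the HEAD request occupied 218 bytes and the 200 OK response 267 bytes (8672 minus 8405).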
UDP, TCP’s “Connectionless” Cousin
UDP is a connectionless protocol. Data is sent on a “best effort” basis with the machine that sends the data having no means of verifying whether the data was correctly received by the remote machine. UDP is usually used for applications in which the data sent is not mission-critical. It is also used when data needs to be broadcast to all available servers on a locally attached network where the creation of dozens of TCP connections for a short burst of data is considered resource-hungry.
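To make the “best effort” behavior concrete, here is a minimal Python sketch that sends a single datagram over the loopback interface. The function name is made up for the example. Notice that there is no handshake and the sender gets no confirmation; the receiver uses a timeout because nothing guarantees the datagram will arrive.

```python
import socket

def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram over loopback and return what the receiver got."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))   # port 0 lets the OS pick a free port
        receiver.settimeout(2.0)          # UDP gives no delivery guarantee
        # No connection setup: sendto() returns once the datagram is queued.
        sender.sendto(message, receiver.getsockname())
        data, _addr = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()

print(udp_round_trip(b"status update"))
```

On a real network the recvfrom() call could simply time out, and the sending program would never know unless the application adds its own acknowledgment.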
TCP and UDP Ports
The data portion of the IP packet contains a TCP or UDP segment sandwiched inside. Only the TCP segment header contains sequence information, but both the UDP and the TCP segment headers track the port being used. The source/destination port and the source/destination IP addresses of the client and server computers are then combined to uniquely identify each data flow.
Certain programs are assigned specific ports that are internationally recognized. For example, port 80 is reserved for HTTP Web traffic, and port 25 is reserved for SMTP e-mail. Ports below 1024 are reserved for privileged system functions, and those from 1024 upward are generally available to non-system third-party applications.
Usually when a connection is made from a client computer requesting data to the server that contains the data:
- The client selects a random previously unused source port greater than 1024 and queries the server on the destination port specific to the application. If it is an HTTP request, the client will use a source port of, say, 2049 and query the server on port 80 (HTTP).
- The server recognizes the port 80 request as an HTTP request and passes on the data to be handled by the Web server software. When the Web server software replies to the client, it tells the TCP application to respond back to port 2049 of the client using a source port of port 80.
- The client keeps track of all its requests to the server’s IP address and will recognize that the reply on port 2049 isn’t a request initiation for “NFS,” but a response to the initial port 80 HTTP query.
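The way source and destination addresses and ports combine to identify a flow can be observed with a small Python sketch. It opens one TCP connection over loopback and prints the four values as seen from each end. The function name is hypothetical, and the server binds to an OS-chosen port rather than a well-known one so that the example runs without root privileges.

```python
import socket

def connection_four_tuples():
    """Open one loopback TCP connection and return both ends' view of it."""
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))          # stand-in for a well-known port such as 80
    server.listen(1)
    server.settimeout(2.0)
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.settimeout(2.0)
    client.connect(server.getsockname())   # the OS picks the client's source port
    accepted, _ = server.accept()
    try:
        # (client IP, client port, server IP, server port) from each side.
        client_view = client.getsockname() + client.getpeername()
        server_view = accepted.getpeername() + accepted.getsockname()
        return client_view, server_view
    finally:
        for sock in (accepted, client, server):
            sock.close()

client_view, server_view = connection_four_tuples()
print(client_view)
print(server_view)
```

Both ends report the same four values, which is what lets each machine match a reply to the request that caused it.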
The TCP/IP Time to Live Feature
Each IP packet has a Time to Live (TTL) section that keeps track of the number of network devices the packet has passed through to reach its destination. The server sending the packet sets the initial TTL value, and each network device that the packet passes through then reduces this value by 1. If the TTL value reaches 0, the network device will discard the packet.
This mechanism helps to ensure that bad routing on the Internet won’t cause packets to aimlessly loop around the network without being removed. TTLs therefore help reduce the clogging of data circuits with unnecessary traffic.
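The effect of the TTL on a looping packet can be simulated in a few lines of Python. This is a toy model with made-up device names, not a description of any router's implementation: each forwarding device subtracts 1 and discards the packet when the value reaches 0.

```python
def forward(path, ttl):
    """Walk a packet along `path` (a list of device names) with a starting TTL.

    Returns ("delivered", device) or ("dropped", device).
    """
    for device in path[:-1]:      # every device except the final destination
        ttl -= 1                  # each forwarding device decrements the TTL
        if ttl <= 0:
            return ("dropped", device)
    return ("delivered", path[-1])

# A healthy three-router path: the packet arrives with TTL to spare.
print(forward(["r1", "r2", "r3", "server"], ttl=64))

# A routing loop between r1 and r2: the packet is discarded after 64 hops
# instead of circulating forever.
print(forward(["r1", "r2"] * 100 + ["server"], ttl=64))
```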
Remember this concept because it will be helpful in understanding the traceroute troubleshooting technique outlined in the material that covers network troubleshooting.
The ICMP Protocol and Its Relationship to TCP/IP
There is another commonly used protocol called the Internet Control Message Protocol (ICMP). It is neither TCP nor UDP, but it is a core part of the IP suite, and TCP/IP-based applications rely on it frequently.
ICMP provides a suite of error, control, and informational messages for use by the operating system. For example, IP packets will occasionally arrive at a server with corrupted data due to any number of reasons including a bad connection, electrical interference, or even misconfiguration. The server will usually detect this by checking the packet against the checksum in its header. A corrupted packet is silently discarded, and it is left to TCP to notice the gap and retransmit. ICMP is used for conditions the sender can act on: a router that cannot deliver a packet, for instance, returns an ICMP destination unreachable message to the original sending machine.
ICMP also includes echo and echo reply messages used by the Linux ping command to confirm network connectivity. ICMP TTL expired messages are also sent by network devices back to the originating server whenever the TTL in a packet is decremented to 0.
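As an illustration of what an echo request looks like on the wire, the following Python sketch builds the 8-byte ICMP header (type 8, code 0) with a valid Internet checksum. It only constructs the bytes; actually sending them would require a raw socket and root privileges, which is outside this sketch. The function names and the identifier, sequence, and payload values are arbitrary choices for the example.

```python
import struct

def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, as described in RFC 1071."""
    if len(data) % 2:
        data += b"\x00"               # pad odd-length input with a zero byte
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Return an ICMP echo request (type 8, code 0) with its checksum filled in."""
    # The checksum is computed with the checksum field itself set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload

packet = build_echo_request(identifier=1, sequence=1, payload=b"ping")
print(packet.hex())
# Re-running the checksum over a correct packet yields 0.
print(internet_checksum(packet))
```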
===================================================================
Viewing Packet Flows with tcpdump
The tcpdump command is one of the most popular packages for viewing the flow of packets through your Linux box’s NIC. It is installed by default on Red Hat/Fedora Linux and has very simple syntax, especially if you are doing simpler types of troubleshooting.
One of the most common uses of tcpdump is to determine whether you are getting basic two-way communication. Lack of communication could be due to the following:
- Bad routing
- Faulty cables or interfaces on devices in the packet flow
- The server not listening on the port because the software isn’t installed or started
- A network device in the packet path blocking traffic; common culprits are firewalls, routers with access control lists, and even your Linux box running iptables
Analyzing tcpdump in much greater detail is beyond the scope of this section.
Like most Linux commands, tcpdump uses command-line switches to modify the output. Some of the more useful command-line switches are listed in the following table.
| Switch | Description |
|---|---|
| -c | Stop after viewing count packets. |
| -t | Don’t print a timestamp at the beginning of each line. |
| -i | Listen on interface. If this is not specified, tcpdump uses the lowest numbered interface that is UP. |
| -w | Dump the output to a specially formatted tcpdump dump file. |
| -C | Specify the size the dump file must reach before a new one with a numeric extension is created. |
You can also add expressions after all the command-line switches. These act as filters to limit the volume of data presented on the screen. You can also use keywords such as “and” or “or” between expressions to further fine-tune your selection criteria. Some useful expressions are listed in the following table.
| tcpdump Command Expression | Description |
|---|---|
| host host-address | View packets to or from the IP address host-address. |
| icmp | View ICMP packets. |
| tcp port port-number | View TCP packets with either a source or destination TCP port of port-number. |
| udp port port-number | View UDP packets with either a source or destination UDP port of port-number. |
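If you run tcpdump from scripts, it can help to assemble the command from the switches and expressions described above rather than typing it by hand. The following Python sketch is one possible helper; the function and parameter names are invented for the example, and it only builds the argument list without running anything.

```python
import shlex

def tcpdump_command(interface, expression, count=None, no_timestamps=False, dump_file=None):
    """Assemble a tcpdump argument list from the switches described above."""
    args = ["tcpdump", "-i", interface]
    if count is not None:
        args += ["-c", str(count)]       # stop after this many packets
    if no_timestamps:
        args.append("-t")                # omit the timestamp column
    if dump_file is not None:
        args += ["-w", dump_file]        # write packets to a dump file
    # The filter expression always comes after the switches.
    args += shlex.split(expression)
    return args

print(" ".join(tcpdump_command("wlan0", "host 192.168.1.102 and tcp port 22", no_timestamps=True)))
```

The resulting list can be handed to subprocess.run() on a machine where tcpdump is installed and you have the privileges to capture traffic.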
The following is an example of tcpdump being used to view ICMP ping packets going through interface wlan0:
[root@bigboy tmp]# tcpdump -i wlan0 icmp
tcpdump: listening on wlan0
21:48:58.927091 smallfry > bigboy: icmp: echo request (DF)
21:48:58.927510 bigboy > smallfry: icmp: echo reply
21:48:58.928257 smallfry > bigboy.my-web-site.org: icmp: echo request (DF)
21:48:58.928365 bigboy. > smallfry: icmp: echo reply
21:48:58.943926 smallfry > bigboy.my-web-site.org: icmp: echo request (DF)
21:48:58.944034 bigboy > smallfry: icmp: echo reply
21:48:58.962244 bigboy > smallfry: icmp: echo reply
21:48:58.963966 bigboy > smallfry: icmp: echo reply
21:48:58.968556 bigboy > smallfry: icmp: echo reply
9 packets received by filter
0 packets dropped by kernel
[root@bigboy tmp]#
In this example:
- The first column of data is a packet timestamp.
- The second column of data shows the packet source and then the destination IP address or server name of the packet.
- The third column shows the packet type.
- Two-way communication is occurring as each echo gets an echo reply.
The following example shows tcpdump being used to view packets on interface wlan0 to/from host 192.168.1.102 on TCP port 22 with no timestamps in the output (-t switch):
[root@bigboy tmp]# tcpdump -i wlan0 -t host 192.168.1.102 and tcp port 22
tcpdump: listening on wlan0
smallfry.32938 > bigboy.ssh: S 2013297020:2013297020(0) win 5840 <mss 1460,sackOK,timestamp 75227931 0,nop,wscale 0> (DF) [tos 0x10]
bigboy.ssh > smallfry.32938: R 0:0(0) ack 2013297021 win 0 (DF) [tos 0x10]
smallfry.32938 > bigboy.ssh: S 2013297020:2013297020(0) win 5840 <mss 1460,sackOK,timestamp 75227931 0,nop,wscale 0> (DF) [tos 0x10]
bigboy.ssh > smallfry.32938: R 0:0(0) ack 1 win 0 (DF) [tos 0x10]
smallfry.32938 > bigboy.ssh: S 2013297020:2013297020(0) win 5840 <mss 1460,sackOK,timestamp 75227931 0,nop,wscale 0> (DF) [tos 0x10]
7 packets received by filter
0 packets dropped by kernel
[root@bigboy tmp]#
In this example:
- The first column of data shows the packet source and then the destination IP address or server name of the packet.
- The second column shows the TCP flags within the packet.
- The client named smallfry is using port 32938 to communicate with the server named bigboy on the TCP SSH port 22. Each SYN (S) from smallfry is answered with a reset (R) from bigboy, so the connection is being refused.
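When a capture is long, picking out the source, destination, and flags column by eye gets tedious. The following Python sketch parses lines in the timestamp-free format shown in this example. The regular expression and the names are assumptions written for this particular output format; other tcpdump versions print flags differently and would need a different pattern.

```python
import re

# Matches "src.port > dst.port: FLAGS ..." as printed with the -t switch.
LINE = re.compile(
    r"^(?P<src>\S+)\.(?P<sport>[^.\s]+) > "
    r"(?P<dst>\S+)\.(?P<dport>[^.\s:]+): (?P<flags>\S+)"
)
# "." means that none of the S, F, R, or P flags is set.
FLAG_NAMES = {"S": "SYN", "F": "FIN", "R": "RST", "P": "PSH", ".": "none"}

def parse_line(line):
    """Return the endpoints and flags from one tcpdump line, or None if no match."""
    match = LINE.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["flag_names"] = [FLAG_NAMES.get(ch, ch) for ch in fields["flags"]]
    return fields

print(parse_line("smallfry.32938 > bigboy.ssh: S 2013297020:2013297020(0) win 5840"))
print(parse_line("bigboy.ssh > smallfry.32938: R 0:0(0) ack 2013297021 win 0 (DF) [tos 0x10]"))
```

A SYN from one host answered by an RST from the other, as in these two lines, is the signature of a refused connection.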
Analyzing tcpdump files
By using the -w filename option you can send the entire Ethernet frame, not just the brief IP information that normally goes to the screen, to a file. The file can then be analyzed by graphical analysis tools such as Ethereal, which is available for both Windows and Linux. Ethereal offers customized filters, colorization of packet records based on criteria you deem interesting, and automatic highlighting of certain error conditions such as data retransmissions:
tcpdump -i eth1 -w /tmp/packets.dump tcp port 22
Covering Ethereal is beyond the scope of this book, but that shouldn’t discourage you from using it. The application is part of the Fedora RPM suite, and a Windows version is also available.
Common Problems with tcpdump
By default tcpdump will attempt to determine the DNS names of all the IP addresses it sees while logging data. This can slow down tcpdump so much that it appears not to be working at all. The -n switch stops DNS name lookups and makes tcpdump work more reliably.
The following are examples of how the -n switch affects the output.
Without the -n switch:
[root@bigboy tmp]# tcpdump -i eth1 tcp port 22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
02:24:34.818398 IP 192-168-1-242.my-web-site.org.1753 > bigboy-100.my-web-site.org.ssh: . ack 318574223 win 65471
02:24:34.818478 IP bigboy-100.my-web-site.org.ssh > 192-168-1-242.my-web-site.org.1753: P 1:165(164) ack 0 win 6432
02:24:35.019042 IP 192-168-1-242.my-web-site.org.1753 > bigboy-100.my-web-site.org.ssh: . ack 165 win 65307
02:24:35.019118 IP bigboy-100.my-web-site.org.ssh > 192-168-1-242.my-web-site.org.1753: P 165:401(236) ack 0 win 6432
02:24:35.176299 IP 192-168-1-242.my-web-site.org.1753 > bigboy-100.my-web-site.org.ssh: P 0:20(20) ack 401 win 65071
02:24:35.176337 IP bigboy-100.my-web-site.org.ssh > 192-168-1-242.my-web-site.org.1753: P 401:629(228) ack 20 win 6432
6 packets captured
7 packets received by filter
0 packets dropped by kernel
[root@bigboy tmp]#
With the -n switch:
[root@bigboy tmp]# tcpdump -i eth1 -n tcp port 22
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth1, link-type EN10MB (Ethernet), capture size 96 bytes
02:25:53.068511 IP 192.168.1.242.1753 > 192.168.1.100.ssh: . ack 318576011 win 65163
02:25:53.068606 IP 192.168.1.100.ssh > 192.168.1.242.1753: P 1:165(164) ack 0 win 6432
02:25:53.269152 IP 192.168.1.242.1753 > 192.168.1.100.ssh: . ack 165 win 64999
02:25:53.269205 IP 192.168.1.100.ssh > 192.168.1.242.1753: P 165:353(188) ack 0 win 6432
02:25:53.408556 IP 192.168.1.242.1753 > 192.168.1.100.ssh: P 0:20(20) ack 353 win 64811
02:25:53.408589 IP 192.168.1.100.ssh > 192.168.1.242.1753: P 353:541(188) ack 20 win 6432
6 packets captured
7 packets received by filter
0 packets dropped by kernel
[root@bigboy tmp]#
====================================================================
Viewing Packet Flows with tethereal
The tethereal program is a text version of the graphical Ethereal product that is part of the Fedora Linux RPM suite. The command-line options and screen output mimic those of tcpdump in many ways, but tethereal has a number of advantages. Like tcpdump, the tethereal command can dump data to a file and create new files with new filename extensions when a size limit has been reached. It can additionally limit the total number of files created before overwriting the first one in the queue, an arrangement known as a ring buffer. The tethereal screen output is also more intuitive to read, though the dump file format is identical to that of tcpdump. Tables 4.4 and 4.5 show some popular command switches and expressions that can be used with tethereal.
| Switch | Description |
|---|---|
| -c | Stop after viewing count packets. |
| -i | Listen on interface. If this is not specified, tethereal will use the lowest numbered interface that is UP. |
| -w | Dump the output to a specially formatted tethereal dump file. |
| -C | Specify the size the dump file must reach before a new one with a numeric extension is created. |
| -b | Determine the size of the ring buffer when the -C switch is selected. |
| tethereal Command Expression | Description |
|---|---|
| host host-address | View packets to or from the IP address host-address. |
| icmp | View ICMP packets. |
| tcp port port-number | View TCP packets with either a source or destination TCP port of port-number. |
| udp port port-number | View UDP packets with either a source or destination UDP port of port-number. |
In the next example we’re trying to observe an HTTP (TCP port 80) packet flow between server smallfry at address 192.168.1.102 and bigboy at IP address 192.168.1.100. The tethereal output groups the IP addresses and TCP ports together and then provides the TCP flags, followed by the sequence numbering. It may not be apparent on this page, but the formatting lines up in neat columns on your screen, making analysis much easier. Also notice how the command line mimics that of tcpdump:
[root@smallfry tmp]# tethereal -i eth0 tcp port 80 and host 192.168.1.100
Capturing on eth0
0.000000 192.168.1.102 -> 192.168.1.100 TCP 1442 > http [SYN] Seq=3325831828 Ack=0 Win=5840 Len=0
0.000157 192.168.1.100 -> 192.168.1.102 TCP http > 1442 [SYN, ACK] Seq=3291904936 Ack=3325831829 Win=5792 Len=0
0.000223 192.168.1.102 -> 192.168.1.100 TCP 1442 > http [ACK] Seq=3325831829 Ack=3291904937 Win=5840 Len=0
2.602804 192.168.1.102 -> 192.168.1.100 TCP 1442 > http [FIN, ACK] Seq=3325831829 Ack=3291904937 Win=5840 Len=0
2.603211 192.168.1.100 -> 192.168.1.102 TCP http > 1442 [ACK] Seq=3291904937 Ack=3325831830 Win=46 Len=0
2.603356 192.168.1.100 -> 192.168.1.102 TCP http > 1442 [FIN, ACK] Seq=3291904937 Ack=3325831830 Win=46 Len=0
2.603398 192.168.1.102 -> 192.168.1.100 TCP 1442 > http [ACK] Seq=3325831830 Ack=3291904938 Win=5840 Len=0
[root@smallfry tmp]#
Using graphical Ethereal to analyze tethereal dump files is beyond the scope of this book, but that shouldn’t discourage you from using it. The application is part of the Fedora RPM suite and a Windows version is also available.
Using MTR to Detect Network Congestion
Matt’s Traceroute (MTR) is an application you can use to do a repeated traceroute in real time; it dynamically shows the round-trip time to reach each hop along the traceroute path. The constant updates enable you not only to visually determine which hops are slow, but also to determine when they appear to be slow. It is a good tool to use whenever you suspect there is some intermittent network congestion.
You type mtr followed by the target IP address to get output similar to the following:
[root@bigboy tmp]# mtr 192.168.25.26
Matt's traceroute [v0.52]
Bigboy Fri Feb 20 17:19:17 2004
Keys: D - Display mode R - Restart statistics Q - Quit
Packets Pings
Hostname %Loss Rcv Snt Last Best Avg Worst
1. 192.168.1.1 0% 17 17 32 10 15 32
2. 192.168.2.254 0% 17 17 12 11 18 41
3. 192.168.3.15 0% 17 17 23 14 18 25
4. 192.168.18.35 0% 16 16 24 23 29 42
5. 192.168.25.26 0% 16 16 23 21 26 37
^C
[root@bigboy tmp]#
One of the nice features of MTR is that it gives you the best, worst, and average round-trip times in milliseconds for the probe packets to each hop along the way to the final destination. The advantage of this is that you can let MTR run for an extended period of time, acting as a constant monitor of communication path quality. The constant refreshing of the screen also enables you to spot transient changes in quality fairly easily, making it much more convenient than a regular traceroute.
MTR is automatically installed as part of Fedora Linux. If MTR isn’t installed on your system, you can download the RPM software installation package from many of the Fedora download sites; see “Installing RPM Software.” There is even a free Windows version called WinMTR.
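The columns in the MTR display are simple running statistics over the probes sent to each hop. The following Python sketch shows how such a summary could be computed from a list of round-trip samples. It is an illustration of the arithmetic only, not MTR's actual code; the function name and the use of None to mark a lost probe are assumptions for the example.

```python
def hop_statistics(samples):
    """Summarize round-trip samples (in ms) for one hop; None marks a lost probe."""
    received = [s for s in samples if s is not None]
    sent = len(samples)
    loss_pct = 100.0 * (sent - len(received)) / sent if sent else 0.0
    if not received:
        return {"loss_pct": loss_pct, "rcv": 0, "snt": sent}
    return {
        "loss_pct": loss_pct,
        "rcv": len(received),
        "snt": sent,
        "last": received[-1],
        "best": min(received),
        "avg": sum(received) / len(received),
        "worst": max(received),
    }

# Four probes to one hop, one of which went unanswered.
print(hop_statistics([10, 32, None, 12]))
```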
Using nmap
You can use nmap to determine all the TCP/IP ports on which a remote server is listening. It isn’t usually an important tool in the home environment, but it can be used in a corporate environment to detect vulnerabilities in your network, such as servers running unauthorized network applications. It is a favorite tool of malicious surfers and therefore should be used to test external as well as internal servers under your control.
Whenever you are in doubt, you can get a list of available nmap options by just entering the command without arguments at the command prompt:
[root@bigboy tmp]# nmap
Nmap V. 3.00 Usage: nmap [Scan Type(s)] [Options] <host or net list>
Some Common Scan Types ('*' options require root privileges)
* -sS TCP SYN stealth port scan (default if privileged (root))
-sT TCP connect() port scan (default for unprivileged users)
* -sU UDP port scan
-sP ping scan (Find any reachable machines)
...
...
[root@bigboy tmp]#
Some of the more common nmap options are listed in the following table, but you should also refer to the nmap man pages for full descriptions of them all.
| Argument | Description |
|---|---|
| -P0 | By default, nmap attempts to ping a host before scanning it. If the server is being protected from ping queries, you can use this option to skip the ping and force the scan anyway. |
| -T | Defines the timing between the packets sent during a port scan. Some firewalls can detect the arrival of too many nonstandard packets within a predetermined time frame. Values range from 0, “paranoid mode,” which is the slowest and waits a long time between probes, to 5, “insane mode,” which is the fastest and waits only a fraction of a second. |
| -O | Tries to detect the operating system of the remote server based on known responses to various types of packets. |
| -p | Lists the TCP/IP port range to scan. |
| -s | Defines a variety of scan methods that use either packets that comply with the TCP/IP standard or are in violation of it. |
Here is an example of a scan that uses valid TCP connections (-sT) in the very fast insane mode (-T 5) on ports 1 to 5000:
[root@bigboy tmp]# nmap -sT -T 5 -p 1-5000 192.168.1.153
Starting nmap V. 3.00 ( www.insecure.org/nmap/ )
Interesting ports on whoknows.my-site-int.com (192.168.1.153):
(The 4981 ports scanned but not shown below are in state: closed)
Port State Service
21/tcp open ftp
25/tcp open smtp
139/tcp open netbios-ssn
199/tcp open smux
2105/tcp open eklogin
2301/tcp open compaqdiag
3300/tcp open unknown
Nmap run completed -- 1 IP address (1 host up) scanned in 8 seconds
[root@bigboy tmp]#
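The idea behind a connect-style scan such as -sT can be sketched in Python: try an ordinary TCP connection to each port and note which ones accept it. This is a simplified illustration, not nmap's implementation; the function name and the timeout value are assumptions, and like nmap itself it should only be pointed at servers under your control.

```python
import socket

def connect_scan(host, ports, timeout=0.5):
    """Return the ports in `ports` that accept a TCP connection on `host`."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex() returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports

# Demonstration against a listener created on the local machine.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(5)
demo_port = listener.getsockname()[1]
print(connect_scan("127.0.0.1", [demo_port]))
listener.close()
```

Because each probe completes a full three-way handshake, this style of scan needs no special privileges but is also easy for the target to log.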
Full coverage of the possibilities of nmap as a security scanning tool is beyond the scope of this book, but you should go the extra mile and purchase a text specifically on Linux security to help protect you against attempts at malicious security breaches.
Using traceroute to Test Connectivity
Another tool for network troubleshooting is the traceroute command. It gives a listing of all the router hops between your server and the target server. This helps you verify that routing over the networks in between is correct.
The traceroute command works by sending a UDP packet destined to the target with a TTL of 1. The first router on the route decrements the TTL to 0, discards the packet, and sends an ICMP time exceeded message back to the source. The traceroute program records the IP address of the router that sent the message and knows that it is the first hop on the path to the final destination. The traceroute program tries again, with a TTL of 2. The first hop sees nothing wrong with the packet, decrements the TTL to 1 as expected, and forwards the packet to the second hop on the path. Router 2 decrements the TTL to 0, drops the packet, and replies with an ICMP time exceeded message. The traceroute program now knows the IP address of the second router. This continues, with the TTL increasing by 1 each time, until the final destination is reached.
Note: In Linux the traceroute command is traceroute. In Windows it is tracert.
Note: You receive traceroute responses only from functioning devices. If a device responds, it is less likely to be the source of your problems.
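The hop-by-hop discovery that traceroute performs can be modeled with a short Python simulation. This sketch assumes the first probe is sent with a TTL of 1, as common traceroute implementations do, and that every device on the path answers. The function name and the sample path are hypothetical, and no packets are actually sent.

```python
def simulate_traceroute(path, max_hops=30):
    """Return (ttl, responder) pairs for probes sent with TTL 1, 2, 3, ...

    `path` lists the routers in order, ending with the target host.
    """
    discovered = []
    for ttl in range(1, max_hops + 1):
        remaining = ttl
        responder = path[-1]          # the target answers if the probe reaches it
        for router in path[:-1]:
            remaining -= 1            # each router decrements the TTL
            if remaining == 0:
                responder = router    # this router returns ICMP time exceeded
                break
        discovered.append((ttl, responder))
        if responder == path[-1]:
            break                     # the probe reached the target, so stop
    return discovered

for ttl, device in simulate_traceroute(["192.168.1.1", "192.168.2.254", "192.168.25.26"]):
    print(ttl, device)
```

Each pass raises the TTL by one, so each pass reveals exactly one more device until the target itself replies.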
Sample traceroute Output
Here is a sample output for a query to 144.232.20.158. Notice that all the hop times are under 50 milliseconds (ms) which is acceptable:
[root@bigboy tmp]# traceroute -I 144.232.20.158
traceroute to 144.232.20.158 (144.232.20.158), 30 hops max, 38 byte packets
1 adsl-67-120-221-110.dsl.sntc01.my-isp-provider.net (67.120.221.110) 14.408 ms 14.064 ms 13.111 ms
2 dist3-vlan50.sntc01.my-isp-provider.net (63.203.35.67) 13.018 ms 12.887 ms 13.146 ms
3 bb1-g1-0.sntc01.my-isp-provider.net (63.203.35.17) 12.854 ms 13.035 ms 13.745 ms
4 bb2-p11-0.snfc21.my-isp-provider.net (64.161.124.246) 16.260 ms 15.618 ms 15.663 ms
5 bb1-p14-0.snfc21.my-isp-provider.net (64.161.124.53) 15.897 ms 15.785 ms 17.164 ms
6 sl-gw11-sj-3-0.another-isp-provider.net (144.228.44.49) 14.443 ms 16.279 ms 15.189 ms
7 sl-bb25-sj-6-1.another-isp-provider.net (144.232.3.133) 16.185 ms 15.857 ms 15.423 ms
8 sl-bb23-ana-6-0.another-isp-provider.net (144.232.20.158) 27.482 ms 26.306 ms 26.487 ms
[root@bigboy tmp]#
Possible traceroute Messages
There are a number of possible message codes traceroute can give; these are listed in the following table.
| traceroute Symbol | Description |
|---|---|
| * * * | Expected 5-second response time exceeded. Possible causes are discussed in the sections that follow. |
| !H, !N, or !P | Host, network, or protocol unreachable. |
| !X or !A | Communication administratively prohibited. A router Access Control List (ACL) or firewall is in the way. |
| !S | Source route failed. Source routing attempts to force traceroute to use a certain path. Failure may be due to a router security setting. |
traceroute Time Exceeded False Alarms
If there is no response within a 5-second timeout interval, an asterisk (*) is printed for that probe, as seen in the following example:
[root@bigboy tmp]# traceroute 144.232.20.158
traceroute to 144.232.20.158 (144.232.20.158), 30 hops max, 38 byte packets
1 adsl-67-120-221-110.dsl.sntc01.my-isp-provider.net (67.120.221.110) 14.304 ms 14.019 ms 16.120 ms
2 dist3-vlan50.sntc01.my-isp-provider.net (63.203.35.67) 12.971 ms 14.000 ms 14.627 ms
3 bb1-g1-0.sntc01.my-isp-provider.net (63.203.35.17) 15.521 ms 12.860 ms 13.179 ms
4 bb2-p11-0.snfc21.my-isp-provider.net (64.161.124.246) 13.991 ms 15.842 ms 15.728 ms
5 bb1-p14-0.snfc21.my-isp-provider.net (64.161.124.53) 16.133 ms 15.510 ms 15.909 ms
6 sl-gw11-sj-3-0.another-isp-provider.net (144.228.44.49) 16.510 ms 17.469 ms 18.116 ms
7 sl-bb25-sj-6-1.another-isp-provider.net (144.232.3.133) 16.212 ms 14.274 ms 15.926 ms
8 * * *
9 * * *
[root@bigboy tmp]#
Some devices block the UDP packets that traceroute sends by default but allow ICMP packets. Using traceroute with the -I flag forces it to use ICMP packets, which may get through. In this case the * * * status messages disappear:
[root@bigboy tmp]# traceroute -I 144.232.20.158
traceroute to 144.232.20.158 (144.232.20.158), 30 hops max, 38 byte packets
1 adsl-67-120-221-110.dsl.sntc01.my-isp-provider.net (67.120.221.110) 14.408 ms 14.064 ms 13.111 ms
2 dist3-vlan50.sntc01.my-isp-provider.net (63.203.35.67) 13.018 ms 12.887 ms 13.146 ms
3 bb1-g1-0.sntc01.my-isp-provider.net (63.203.35.17) 12.854 ms 13.035 ms 13.745 ms
4 bb2-p11-0.snfc21.my-isp-provider.net (64.161.124.246) 16.260 ms 15.618 ms 15.663 ms
5 bb1-p14-0.snfc21.my-isp-provider.net (64.161.124.53) 15.897 ms 15.785 ms 17.164 ms
6 sl-gw11-sj-3-0.another-isp-provider.net (144.228.44.49) 14.443 ms 16.279 ms 15.189 ms
7 sl-bb25-sj-6-1.another-isp-provider.net (144.232.3.133) 16.185 ms 15.857 ms 15.423 ms
8 sl-bb23-ana-6-0.another-isp-provider.net (144.232.20.158) 27.482 ms 26.306 ms 26.487 ms
[root@bigboy tmp]#
traceroute Internet Slowness False Alarm
The following traceroute gives the impression that a Web site at 80.40.118.227 might be slow because there is congestion along the way at hops 6 and 7 where the response time is over 200ms:
C:\>tracert 80.40.118.227
1 1 ms 2 ms 1 ms 66.134.200.97
2 43 ms 15 ms 44 ms 172.31.255.253
3 15 ms 16 ms 8 ms 192.168.21.65
4 26 ms 13 ms 16 ms 64.200.150.193
5 38 ms 12 ms 14 ms 64.200.151.229
6 239 ms 255 ms 253 ms 64.200.149.14
7 254 ms 252 ms 252 ms 64.200.150.110
8 24 ms 20 ms 20 ms 192.174.250.34
9 91 ms 89 ms 60 ms 192.174.47.6
10 17 ms 20 ms 20 ms 80.40.96.12
11 30 ms 16 ms 23 ms 80.40.118.227
Trace complete.
C:\>
This indicates only that the devices at hops 6 and 7 were slow to respond with ICMP TTL exceeded messages. It is not an indication of congestion, latency, or packet loss. If any of those conditions existed, all points past the problematic link would also show high latency.
Many Internet routing devices give very low priority to traffic related to traceroute in favor of revenue-generating traffic.
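The rule of thumb above (real congestion shows up at every hop past the problem link, while an isolated spike is just a slow ICMP responder) can be sketched in a few lines of Python. This is only an illustration: the function name, the 200 ms threshold, and the use of the first probe time of each hop are assumptions made for the example, not part of traceroute itself.

```python
def first_sustained_slow_hop(rtts_ms, threshold_ms=200.0):
    """Return the first hop number from which every remaining hop is slow.

    rtts_ms[i] is a round-trip time in milliseconds for hop i + 1.
    Returns None when no hop starts a run of slow responses that lasts
    to the end of the trace (isolated spikes are ignored).
    """
    for index in range(len(rtts_ms)):
        if all(rtt > threshold_ms for rtt in rtts_ms[index:]):
            return index + 1
    return None


# First probe time of each hop in the tracert output above.
false_alarm = [1, 43, 15, 26, 38, 239, 254, 24, 91, 17, 30]
print(first_sustained_slow_hop(false_alarm))  # None: hops 6 and 7 are isolated spikes
```

A trace in which the latency rises at one hop and stays high all the way to the destination would return that hop number instead, which is the case worth investigating.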
traceroute Dies at the Router Just Before the Server
In this case the last device to respond to the traceroute just happens to be the router that acts as the default gateway of the server. The problem is not with the router, but with the server. Remember, you will only receive traceroute responses from functioning devices.
Possible causes of this problem include the following:
- The server has a bad default gateway.
- The server is running some type of firewall software that blocks traceroute.
- The server is shut down or disconnected from the network, or it has an incorrectly configured NIC.
C:\>tracert 80.40.100.18
Tracing route to 80.40.100.18 over a maximum of 30 hops
  1    33 ms    49 ms    28 ms  192.168.1.1
  2    33 ms    49 ms    28 ms  65.14.65.19
  3    33 ms    32 ms    32 ms  81.25.68.252
  4    47 ms    32 ms    31 ms  80.40.97.1
  5    29 ms    28 ms    32 ms  80.40.96.114
  6     *        *        *     Request timed out.
  7  ^C
C:\>
Always Get a Bidirectional traceroute
It is always best to get a traceroute from the source IP to the target IP and also from the target IP to the source IP. This is because the packet’s return path from the target is sometimes not the same as the path taken to get there. The time traceroute reports for each hop is a round-trip time: it covers both the probe’s trip to that hop and the trip of that hop’s response back to the source.
Here is an example of one such case, using disguised IP addresses and provider names. There was once a routing issue between telecommunications carriers FastNet and SlowNet. When a user at IP address 64.25.175.200 did a traceroute to 40.16.106.32, a problem seemed to appear at the 10th hop with OtherNet. However, when a user at 40.16.106.32 did a traceroute to 64.25.175.200, latency showed up at hop 7 with the return path being very different.
In this case, the real traffic congestion was occurring where FastNet handed off traffic to SlowNet in the second trace. The latency appeared to be caused at hop 10 on the first trace not because that hop was slow, but because that was the first hop at which the return packet traveled back to the source via the congested route. Remember, traceroute gives the packet round-trip time:
Trace route to 40.16.106.32 from 64.25.175.200
 1     0 ms     0 ms     0  [64.25.175.200]
 2     0 ms     0 ms     0  [64.25.175.253]
 3     0 ms     0 ms     0  border-from-40-tesser.my-isp-provider.net [207.174.144.169]
 4     0 ms     0 ms     0  [64.25.128.126]
 5     0 ms     0 ms     0  p3-0.dnvtco1-cr3.another-isp-provider.net [4.25.26.53]
 6     0 ms     0 ms     0  p2-1.dnvtco1-br1.another-isp-provider.net [4.24.11.25]
 7     0 ms     0 ms     0  p15-0.dnvtco1-br2.another-isp-provider.net [4.24.11.38]
 8    30 ms    30 ms    30  p15-0.snjpca1-br2.another-isp-provider.net [4.0.6.225]
 9    30 ms    30 ms    30  p1-0.snjpca1-cr4.another-isp-provider.net [4.24.9.150]
10  1252 ms  1212 ms  1202  h0.webhostinc2.another-isp-provider.net [4.24.236.38]
11  1252 ms  1212 ms  1192  [40.16.96.11]
12  1262 ms  1212 ms  1192  [40.16.96.162]
13  1102 ms  1091 ms  1092  [40.16.106.32]

Trace route to 64.25.175.200 from 40.16.106.32
 1     1 ms     1 ms     1 ms  [40.16.106.3]
 2     1 ms     1 ms     1 ms  [40.16.96.161]
 3     2 ms     1 ms     1 ms  [40.16.96.2]
 4     1 ms     1 ms     1 ms  [40.16.96.65]
 5     2 ms     2 ms     1 ms  border8.p4-2.webh02-1.sfj.fastnet.net [216.52.19.77]
 6     2 ms     1 ms     1 ms  core1.ge0-1-net2.sfj.fastnet.net [216.52.0.65]
 7   993 ms   961 ms   999 ms  sjo-edge-03.inet.slownet.net [208.46.223.33]
 8  1009 ms  1008 ms   971 ms  sjo-core-01.inet.slownet.net [205.171.22.29]
 9   985 ms   947 ms   983 ms  svl-core-03.inet.slownet.net [205.171.5.97]
10  1028 ms  1010 ms   953 ms  [205.171.205.30]
11   989 ms   988 ms   985 ms  p4-3.paix-bi1.another-isp-provider.net [4.2.49.13]
12  1002 ms  1001 ms   973 ms  p6-0.snjpca1-br1.another-isp-provider.net [4.24.7.61]
13  1031 ms   989 ms   978 ms  p9-0.snjpca1-br2.another-isp-provider.net [4.24.9.130]
14  1031 ms  1017 ms  1017 ms  p3-0.dnvtco1-br2.another-isp-provider.net [4.0.6.226]
15  1027 ms  1025 ms  1023 ms  p15-0.dnvtco1-br1.another-isp-provider.net [4.24.11.37]
16  1045 ms  1037 ms  1050 ms  p1-0.dnvtco1-cr3.another-isp-provider.net [4.24.11.26]
17  1030 ms  1020 ms  1045 ms  p0-0.cointcorp.another-isp-provider.net [4.25.26.54]
18  1038 ms  1031 ms  1045 ms  gw234.my-isp-provider.net [64.25.128.99]
19  1050 ms  1094 ms  1034 ms  [64.25.175.253]
20  1050 ms  1094 ms  1034 ms  [64.25.175.200]
ping and traceroute Troubleshooting Example
In this example, a ping to 186.9.17.153 gave a TTL timeout message. Ping TTLs will usually time out only if there is a routing loop in which the packet bounces between two routers on the way to the target. Each bounce causes the TTL to decrease by a count of 1 until the TTL reaches 0, at which point you get the timeout.
The routing loop was confirmed by the traceroute, in which the packet was proven to be bouncing between routers at 186.40.64.94 and 186.40.64.93:
G:\>ping 186.9.17.153
Pinging 186.9.17.153 with 32 bytes of data:
Reply from 186.40.64.94: TTL expired in transit.
Reply from 186.40.64.94: TTL expired in transit.
Reply from 186.40.64.94: TTL expired in transit.
Reply from 186.40.64.94: TTL expired in transit.
Ping statistics for 186.9.17.153:
    Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
Approximate round trip times in milli-seconds:
    Minimum = 0ms, Maximum = 0ms, Average = 0ms
G:\>tracert 186.9.17.153
Tracing route to lostserver.my-isp-provider.net [186.9.17.153]
over a maximum of 30 hops:
  1   <10 ms   <10 ms   <10 ms  186.217.33.1
  2    60 ms    70 ms    60 ms  rtr-2.my-isp-provider.net [186.40.64.94]
  3    70 ms    71 ms    70 ms  rtr-1.my-isp-provider.net [186.40.64.93]
  4    60 ms    70 ms    60 ms  rtr-2.my-isp-provider.net [186.40.64.94]
  5    70 ms    70 ms    70 ms  rtr-1.my-isp-provider.net [186.40.64.93]
  6    60 ms    70 ms    61 ms  rtr-2.my-isp-provider.net [186.40.64.94]
  7    70 ms    70 ms    70 ms  rtr-1.my-isp-provider.net [186.40.64.93]
  8    60 ms    70 ms    60 ms  rtr-2.my-isp-provider.net [186.40.64.94]
  9    70 ms    70 ms    70 ms  rtr-1.my-isp-provider.net [186.40.64.93]
... ... ...
Trace complete.
This problem was solved by resetting the routing process on both routers. The problem was initially triggered by an unstable network link that caused frequent routing recalculations. The constant activity eventually corrupted the routing tables of one of the routers.
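Spotting a loop like this by eye is easy in a short trace, but the check can also be automated. The following Python sketch assumes you have already extracted the responding address of each hop into a list (with None for hops that timed out); the function name and input format are hypothetical choices for this example.

```python
def find_routing_loop(hop_addresses):
    """Return the repeating sequence of addresses if the trace loops.

    hop_addresses is the responding address of each hop, in order, with
    None for hops that did not answer. Returns None if no address repeats.
    """
    first_seen = {}
    for position, address in enumerate(hop_addresses):
        if address is None:
            continue  # a timed-out hop tells us nothing about a loop
        if address in first_seen:
            # Everything between the two sightings is one lap of the loop.
            return hop_addresses[first_seen[address]:position]
        first_seen[address] = position
    return None


# Hop addresses from the tracert output above.
hops = ["186.217.33.1", "186.40.64.94", "186.40.64.93",
        "186.40.64.94", "186.40.64.93", "186.40.64.94"]
print(find_routing_loop(hops))  # ['186.40.64.94', '186.40.64.93']
```

The same address appearing at two different hop counts is the telltale sign, because each additional hop means the packet was forwarded at least once more before returning to a router it had already visited.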
traceroute Web Sites
Many ISPs will provide their subscribers with the facility to do a traceroute from purpose-built servers called looking glasses. A simple Web search for the phrase Internet looking glass will provide a long list of alternatives. Doing a traceroute from a variety of locations can help identify whether the problem is with the ISP of your Web server or the ISP used at home/work to provide you with Internet access. A more convenient way of doing this is to use a site like traceroute.org, which provides a list of looking glasses sorted by country.
Possible Reasons for a Failed traceroute
A traceroute can fail to reach its intended destination for a number of reasons, including the following:
- traceroute packets are being blocked or rejected by a router in the path. The router immediately after the last visible one is usually the culprit. It’s usually good to check the routing table and/or other status of this next hop device.
- The target server doesn’t exist on the network. It could be disconnected or turned off. (!H or !N messages may be produced.)
- The network on which you expect the target host to reside doesn’t exist in the routing table of one of the routers in the path. (!H or !N messages may be produced.)
- You may have a typographical error in the IP address of the target server.
- You may have a routing loop in which packets bounce between two routers and never get to the intended destination.
- The packets don’t have a proper return path to your server. The last visible hop is the last hop from which the packets return correctly. The router immediately after the last visible one is the one at which the routing changes. It’s usually good to do the following:
  - Log on to the last visible router.
  - Look at the routing table to determine what the next hop is to your intended traceroute target.
  - Log on to this next hop router.
  - Do a traceroute from this router to your intended target server.
  - If this works: Routing to the target server is OK. Do a traceroute back to your source server. The traceroute will probably fail at the bad router on the return path.
  - If it doesn’t work: Test the routing table and/or other status of all the hops between it and your intended target.
Note: If there is nothing blocking your traceroute traffic, the last visible router of an incomplete trace is either the last good router on the path or the last router that has a valid return path to the server issuing the traceroute.
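Because so much of this troubleshooting starts from the last visible hop, it can be handy to pull that hop out of a saved trace automatically. The Python sketch below is one possible approach, not a standard tool: it assumes each hop line starts with the hop number and that a responding hop shows an IPv4 address somewhere on the line, which holds for the traceroute and tracert outputs shown in this chapter.

```python
import re

# A dotted-quad IPv4 address; response times such as "14.408" do not match.
IPV4 = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
# A hop line begins with optional spaces, the hop number, then the rest.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")


def last_visible_hop(trace_output):
    """Return (hop_number, address) of the last hop that answered, or None."""
    last = None
    for line in trace_output.splitlines():
        hop = HOP_LINE.match(line)
        if not hop:
            continue  # header, prompt, or blank line
        address = IPV4.search(hop.group(2))
        if address:
            last = (int(hop.group(1)), address.group(0))
    return last


sample = """Tracing route to 80.40.100.18 over a maximum of 30 hops
  4    47 ms    32 ms    31 ms  80.40.97.1
  5    29 ms    28 ms    32 ms  80.40.96.114
  6     *        *        *     Request timed out.
"""
print(last_visible_hop(sample))  # (5, '80.40.96.114')
```

The address returned is the router to log on to first when following the steps above.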
The netstat Command
Like curl and wget, netstat can be very useful in helping determine the source of problems. Using netstat with the -an option lists all the TCP ports on which your Linux server is listening, including all the active network connections to and from your server. This can be very helpful in determining whether slowness is due to high traffic volumes:
[root@bigboy tmp]# netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 127.0.0.1:25            0.0.0.0:*               LISTEN
tcp        0      0 :::80                   :::*                    LISTEN
... ... ...
[root@bigboy tmp]#
Most TCP-based applications keep their connections open for long periods. HTTP is different because its connections are shut down on their own after a predefined inactivity timeout on the Web server, after which they linger briefly in the time_wait state. It is therefore a good idea to focus on these types of short-lived connections. You can determine the number of established and time_wait TCP connections on your server by using the netstat command filtered by the grep and egrep commands, with the number of matches being counted by the wc command, which in this case shows 14 connections:
[root@bigboy tmp]# netstat -an | grep tcp | egrep -i \
  'established|time_wait' | wc -l
14
[root@bigboy tmp]#
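If you want a breakdown by state rather than a single total, the same netstat output can be tallied with a short script. This Python sketch assumes the Linux netstat -an column layout shown earlier, in which the state is the sixth field of each tcp line; the sample addresses are made up for illustration.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state from 'netstat -an' style text."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # TCP lines have six columns; the last one is the state.
        if len(fields) >= 6 and fields[0].startswith("tcp"):
            states[fields[5]] += 1
    return states


# Hypothetical sample in the same layout as the netstat -an output above.
sample = """Proto Recv-Q Send-Q Local Address        Foreign Address      State
tcp        0      0 127.0.0.1:25         0.0.0.0:*            LISTEN
tcp        0      0 192.168.1.100:80     192.168.1.50:1234    ESTABLISHED
tcp        0      0 192.168.1.100:80     192.168.1.51:2345    TIME_WAIT
tcp        0      0 192.168.1.100:80     192.168.1.52:3456    TIME_WAIT
udp        0      0 0.0.0.0:123          0.0.0.0:*
"""
counts = count_tcp_states(sample)
print(counts["ESTABLISHED"], counts["TIME_WAIT"])  # 1 2
```

A TIME_WAIT count that dwarfs the ESTABLISHED count is normal for a busy Web server; it is a sudden jump in either number that is worth a closer look.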
The netstat -nr command can also be used to view your routing table. It is always good to ensure that your routes are correct and that you can ping all the gateways in your routing table. The traceroute command can then be used to verify that your routing table is correct by displaying the path a packet takes to get to a remote destination. If the first hop is incorrect, then your routing table needs to be examined more carefully.
[root@bigboy tmp]# netstat -nr
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
172.16.68.64    172.16.69.193   255.255.255.224 UG       40 0          0 eth1
172.16.11.96    172.16.69.193   255.255.255.224 UG       40 0          0 eth1
172.16.68.32    172.16.69.193   255.255.255.224 UG       40 0          0 eth1
172.16.67.0     172.16.67.135   255.255.255.224 UG       40 0          0 eth0
172.16.69.192   0.0.0.0         255.255.255.192 U        40 0          0 eth1
172.16.67.128   0.0.0.0         255.255.255.128 U        40 0          0 eth0
172.160.0       172.16.67.135   255.255.0.0     UG       40 0          0 eth0
172.16.0.0      172.16.67.131   255.240.0.0     UG       40 0          0 eth0
127.0.0.0       0.0.0.0         255.0.0.0       U        40 0          0 lo
0.0.0.0         172.16.69.193   0.0.0.0         UG       40 0          0 eth1
[root@bigboy tmp]#
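To predict what the first hop should be for a given destination, remember that the most specific matching route (the one with the longest netmask) wins, and that the 0.0.0.0 entry is the default used when nothing else matches. The Python sketch below works through that rule with the standard ipaddress module. The function name and tuple layout are choices made for this example, and only a few rows of the table above are reproduced.

```python
import ipaddress


def pick_route(routes, destination):
    """Return (gateway, iface) of the most specific route to destination.

    routes is a list of (destination, genmask, gateway, iface) tuples in
    the same order as the columns of 'netstat -nr'. Returns None if no
    route matches.
    """
    target = ipaddress.ip_address(destination)
    best_network = None
    best_route = None
    for dest, genmask, gateway, iface in routes:
        network = ipaddress.ip_network(f"{dest}/{genmask}", strict=False)
        if target not in network:
            continue
        # Prefer the route with the longest prefix (most specific match).
        if best_network is None or network.prefixlen > best_network.prefixlen:
            best_network = network
            best_route = (gateway, iface)
    return best_route


# A subset of the routing table shown above.
routes = [
    ("172.16.67.0", "255.255.255.224", "172.16.67.135", "eth0"),
    ("172.16.69.192", "255.255.255.192", "0.0.0.0", "eth1"),
    ("172.16.0.0", "255.240.0.0", "172.16.67.131", "eth0"),
    ("0.0.0.0", "0.0.0.0", "172.16.69.193", "eth1"),
]
print(pick_route(routes, "172.16.67.10"))  # ('172.16.67.135', 'eth0')
print(pick_route(routes, "192.0.2.1"))     # ('172.16.69.193', 'eth1')
```

A gateway of 0.0.0.0 in the result means the destination is on a directly connected network, so no router should appear as the first hop of a traceroute.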
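The netstat -ntp command lists TCP connections with numeric addresses and shows the PID and name of the program that owns each one.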
Testing Link Status from the Command Line
Both the mii-tool and ethtool commands provide reports on the link status and duplex settings for supported NICs.
When used without any switches, mii-tool gives a very brief report. Use it with the -v switch because it provides more information on the supported autonegotiation speeds of the NIC, and this can be useful in troubleshooting speed and duplex issues.
The ethtool command provides much more information than mii-tool and should be your command of choice, especially because mii-tool will soon be deprecated in Linux. In both of the following examples, the NICs are operating at 100Mbps, full duplex, and the link is OK.
Link Status Output from mii-tool
[root@bigboy tmp]# mii-tool -v
eth0: 100 Mbit, full duplex, link ok
  product info: Intel 82555 rev 4
  basic mode:   100 Mbit, full duplex
  basic status: link ok
  capabilities: 100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD
  advertising:  100baseTx-FD 100baseTx-HD 10baseT-FD 10baseT-HD flow-control
  link partner: 100baseTx-HD
[root@bigboy tmp]#
Link Status Output from ethtool
[root@bigboy tmp]# ethtool eth0
Settings for eth0:
        Supported ports: [ TP MII ]
        Supported link modes:   10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Supports auto-negotiation: Yes
        Advertised link modes:  10baseT/Half 10baseT/Full
                                100baseT/Half 100baseT/Full
        Advertised auto-negotiation: No
        Speed: 100Mb/s
        Duplex: Full
        Port: MII
        PHYAD: 1
        Transceiver: internal
        Auto-negotiation: off
        Supports Wake-on: g
        Wake-on: g
        Current message level: 0x00000007 (7)
        Link detected: yes
[root@bigboy tmp]#
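When you need to check many servers, it helps to reduce the ethtool report to the three values that matter most for link troubleshooting: speed, duplex, and link state. The following Python sketch assumes the "Name: value" layout of the ethtool output shown above; the function names are hypothetical.

```python
def parse_ethtool(output):
    """Turn 'Name: value' lines of ethtool output into a dictionary.

    Lines without a value (such as 'Settings for eth0:') and continuation
    lines without a colon are skipped.
    """
    settings = {}
    for line in output.splitlines():
        name, separator, value = line.partition(":")
        if separator and value.strip():
            settings[name.strip()] = value.strip()
    return settings


def link_summary(settings):
    """Return (speed, duplex, link_detected) from parsed ethtool settings."""
    return (settings.get("Speed"),
            settings.get("Duplex"),
            settings.get("Link detected"))


# Excerpt of the ethtool output shown above.
sample = """Settings for eth0:
        Speed: 100Mb/s
        Duplex: Full
        Auto-negotiation: off
        Link detected: yes
"""
print(link_summary(parse_ethtool(sample)))  # ('100Mb/s', 'Full', 'yes')
```

Comparing these summaries across both ends of a link is a quick way to catch a speed or duplex setting that does not match.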
ethtool Error Output
The ethtool command can provide a much more detailed report when used with the -S switch:
[root@probe-001 root]# ethtool -S eth0
NIC statistics:
     rx_packets: 1669993
     tx_packets: 627631
     rx_bytes: 361714034
     tx_bytes: 88228145
     rx_errors: 0
     tx_errors: 0
     rx_dropped: 0
     tx_dropped: 0
     multicast: 0
     collisions: 0
     rx_length_errors: 0
     rx_over_errors: 0
     rx_crc_errors: 0
     rx_frame_errors: 0
     rx_fifo_errors: 0
     rx_missed_errors: 0
     tx_aborted_errors: 0
     tx_carrier_errors: 0
     tx_fifo_errors: 0
     tx_heartbeat_errors: 0
     tx_window_errors: 0
     tx_deferred: 0
     tx_single_collisions: 0
     tx_multi_collisions: 0
     tx_flow_control_pause: 0
     rx_flow_control_pause: 0
     rx_flow_control_unsupported: 0
     tx_tco_packets: 0
     rx_tco_packets: 0
[root@probe-001 root]#
Possible Causes of Ethernet Errors
The following are possible causes of Ethernet errors:
- Collisions: The NIC detects itself and another server on the LAN attempting data transmissions at the same time. Collisions can be expected as a normal part of Ethernet operation and are typically below 0.1% of all frames sent. Higher error rates are likely to be caused by faulty NICs or poorly terminated cables.
- Single Collisions: The Ethernet frame went through after only one collision.
- Multiple Collisions: The NIC had to attempt multiple times before successfully sending the frame due to collisions.
- CRC Errors: Frames were sent but were corrupted in transit. The presence of CRC errors, but not many collisions, usually is an indication of electrical noise. Make sure that you are using the correct type of cable, that the cabling is undamaged, and that the connectors are securely fastened.
- Frame Errors: An incorrect CRC and a noninteger number of bytes are received. This is usually the result of collisions or a bad Ethernet device.
- FIFO and Overrun Errors: The number of times that the NIC was blocked from transferring data from the network to its memory buffers because of the speed limitations of the hardware. This is usually a sign of excessive traffic.
- Length Errors: The received frame length was shorter than the minimum or longer than the maximum allowed by the Ethernet standard. This is most frequently due to incompatible duplex settings.
- Carrier Errors: Errors are caused by the NIC losing its link connection to the hub or switch. Check for faulty cabling or faulty interfaces on the NIC and networking equipment.
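The 0.1% guideline for collisions is easy to check against the counters that ethtool -S reports. The Python sketch below assumes the "name: number" layout shown earlier and the counter names tx_packets and collisions from that output; other NIC drivers may use different names, and the sample numbers here are invented to show a failing case.

```python
def parse_nic_stats(output):
    """Parse 'name: number' lines from 'ethtool -S' into a dictionary."""
    stats = {}
    for line in output.splitlines():
        name, separator, value = line.partition(":")
        value = value.strip()
        if separator and value.isdigit():
            stats[name.strip()] = int(value)
    return stats


def collision_rate(stats):
    """Return collisions as a fraction of frames sent (0.0 if none sent)."""
    sent = stats.get("tx_packets", 0)
    if sent == 0:
        return 0.0
    return stats.get("collisions", 0) / sent


# Hypothetical counters; the real output above shows zero collisions.
sample = """NIC statistics:
     tx_packets: 627631
     collisions: 1200
     rx_crc_errors: 0
"""
stats = parse_nic_stats(sample)
rate = collision_rate(stats)
print(f"collision rate: {rate:.3%}")
if rate > 0.001:  # 0.1% of frames sent
    print("Higher than expected: check the NIC and cabling")
```

The counters are cumulative since the driver was loaded, so for an ongoing problem it is more telling to compare two readings taken a few minutes apart.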
Who Has Used My System?
It is always important to know who has logged into your Linux box. This isn’t just to help track the activities of malicious users, but mostly to figure out who made the mistake that crashed the system or blew up Apache with a typographical error in the httpd.conf file.
The last Command
The most common command to determine who has logged into your system is last, which lists the last users who logged into the system. Here are some examples:
[root@bigboy tmp]# last -100
root     pts/0        reggae.my-web-site.org Thu Jun 19 09:26   still logged in
root     pts/0        reggae.my-web-site.org Wed Jun 18 01:07 - 09:26 (1+08:18)
reboot   system boot  2.4.18-14              Wed Jun 18 01:07         (1+08:21)
root     pts/0        reggae.my-web-site.org Tue Jun 17 21:57 - down   (03:07)
root     pts/0        reggae.my-web-site.org Mon Jun 16 07:24 - 00:35  (17:10)
wtmp begins Sun Jun 15 16:29:18 2003
[root@bigboy tmp]#
In this example someone from reggae.my-web-site.org logged into bigboy as user root. I generally prefer not to give out the root password and let all the systems administrators log in with their own individual logins. They can then get root privileges by using sudo. This makes it easier to track down individuals rather than groups of users.
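On a busy server the output of last can run to hundreds of lines, so a quick tally of logins per user and remote host makes shared accounts stand out. This Python sketch assumes every login line has the remote host in the third column, as in the output above; console logins with no host column would need extra handling, and the function name is hypothetical.

```python
from collections import Counter


def logins_by_user_and_host(last_output):
    """Count logins per (user, host) pair from 'last' style output."""
    counts = Counter()
    for line in last_output.splitlines():
        fields = line.split()
        # Skip blank lines, reboot records, and the 'wtmp begins' footer.
        if len(fields) < 3 or fields[0] in ("reboot", "wtmp"):
            continue
        counts[(fields[0], fields[2])] += 1
    return counts


# Excerpt of the last output shown above.
sample = """root     pts/0   reggae.my-web-site.org Thu Jun 19 09:26   still logged in
root     pts/0   reggae.my-web-site.org Wed Jun 18 01:07 - 09:26 (1+08:18)
reboot   system boot  2.4.18-14        Wed Jun 18 01:07         (1+08:21)
wtmp begins Sun Jun 15 16:29:18 2003
"""
for (user, host), total in logins_by_user_and_host(sample).items():
    print(user, host, total)
```

If most of the entries are for root, that is a sign that individual logins combined with sudo would make the audit trail far more useful.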
The who Command
The who command is used to see who is currently logged in to your computer. Here we see a user logged in as root from server reggae.my-web-site.org:
[root@bigboy tmp]# who
root     pts/0        Jun 19 09:26 (reggae.my-web-site.org)
[root@bigboy tmp]#