Introduction to TCP Window Troubleshooting
The sliding window mechanism and flow control are critical components of TCP: they govern how much data a TCP endpoint is prepared to receive, which regulates the data flow and keeps communication efficient. When troubleshooting TCP, it is equally important to understand why resets occur, because a reset can be part of normal protocol operation or a sign of a problem. This section explores common window and reset problems and how to analyze and address them.
This chapter covers both the sliding window mechanism and troubleshooting TCP reset scenarios.
More information
TCP window problem:
TCP zero window, zero window probe, zero window violation
A TCP zero window occurs when the receiver advertises a receive window of zero bytes in the window field of the TCP header. This tells the sender to stop sending data because the receiver’s buffer is full. It also indicates that one of the following problems may have occurred on the receiver:
- The server cannot allocate enough memory for the process
- The application has run short of memory, so TCP has to tell the sender to stop sending data
- The application consumes too much memory, so the operating system has to limit the application’s resources
The sender transmits a TCP zero window probe to check whether the receiver’s window is still zero. The probe carries the next byte of data. If the receiver replies that the window size is still zero, the sender doubles its probe timer before probing again.
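The doubling of the probe timer can be sketched in a few lines of Python. This is a minimal illustration, not any protocol stack's actual implementation: the function name `probe_intervals`, the 0.5 second starting interval, and the 60 second cap are assumptions chosen for the example, and real values depend on the operating system.

```python
from typing import List


def probe_intervals(initial_s: float = 0.5, max_s: float = 60.0, probes: int = 8) -> List[float]:
    """Return the wait (in seconds) before each successive zero window probe.

    Assumes every probe is answered with "window still zero", so the timer
    doubles each time until it reaches the (assumed) cap max_s.
    """
    intervals = []
    wait = initial_s
    for _ in range(probes):
        intervals.append(wait)
        wait = min(wait * 2, max_s)  # double the timer, but never exceed the cap
    return intervals


print(probe_intervals())
```

With these assumed defaults the waits grow from half a second to the cap within eight probes, which is why a long zero window shows up in a capture as probes that are spaced further and further apart.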
TCP zero window violation: the sender ignores the receiver’s zero window and sends additional bytes of data. A zero window violation indicates an error in the sender’s TCP protocol stack. To narrow down the problem, check for the following events:
- An end device (server or client) reports a device failure
- An application reports a general application error
- An error occurs when performing an action in the application, such as opening a form, sending a file to a printer, creating a report, or other actions. In this case, the problem is with the application.
TCP Window Update
TCP sends a window update to the other end of the connection to indicate that the buffer size has changed and that it is prepared to accept a higher or lower data rate (the buffer size determines the rate at which the sender is allowed to send). This happens when:
- The TCP receiver recovers from a zero window and tells the sender that it can resume sending. In this case, no further action is required, except to look for the problem that caused the zero window in the first place.
- The TCP receiver frequently changes the window size. In this case, find out why the receiver is unstable. It may be an application problem, a memory problem, or another problem on the end device.
Don’t be alarmed if you see this kind of behavior; it’s just how TCP works.
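As a rough sketch of how zero windows and window updates can be told apart in a list of acknowledgments, the following Python labels each one. The `Ack` record and the `label_acks` function are hypothetical names for this example, and the rule used here (no payload, same acknowledgment number, different window) is a simplification and not necessarily the exact logic of any analyzer.

```python
from typing import List, NamedTuple


class Ack(NamedTuple):
    ack: int      # acknowledgment number
    window: int   # advertised receive window in bytes
    payload: int  # payload length of the segment carrying the ACK


def label_acks(acks: List[Ack]) -> List[str]:
    """Label each ACK as 'zero window', 'window update', or plain 'ack'."""
    labels = []
    prev = None
    for a in acks:
        if a.window == 0:
            labels.append("zero window")
        elif (prev is not None and a.payload == 0
              and a.ack == prev.ack and a.window != prev.window):
            # Nothing new is acknowledged; only the advertised window changed.
            labels.append("window update")
        else:
            labels.append("ack")
        prev = a
    return labels


trace = [Ack(1000, 8192, 0), Ack(2000, 4096, 0), Ack(3000, 0, 0), Ack(3000, 8192, 0)]
print(label_acks(trace))
```

In this made-up trace the last entry is the window update that ends the zero window: the acknowledgment number stays at 3000 while the window reopens to 8192 bytes.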
TCP Window Full
This message indicates that the segment being sent will completely fill the receiver’s receive buffer. This happens when the receiver has not acknowledged previously received data, so this is the last segment the sender can transmit until it receives an ACK from the receiver.
This event has the same causes as a zero window and is a sign of an unresponsive server or application. A typical example is shown in the figure below:
In the figure, you can see:
- In message 183816, 192.168.2.138 sends data that fills the window advertised by 192.168.1.58 (window full).
- In the next message, 192.168.1.58 tells 192.168.2.138 to stop sending data. This is a zero window signal.
- Both parties continue to send zero windows and zero window probes.
- The last message of the connection is the RST message sent by 192.168.2.138, which tears down the connection.
- In some cases, a zero window is cleared by a window update message. In other cases, the connection is closed by a reset (possibly because the application stayed at a zero window and therefore could not receive any data).
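The window full condition described above can be imitated with a small simulation. This is only a sketch under simple assumptions: the receiver sends no ACKs at all, the sender uses a fixed maximum segment size of 1460 bytes, and `send_until_full` is a hypothetical helper written for this example.

```python
from typing import List, Tuple


def send_until_full(window: int, mss: int = 1460) -> List[Tuple[int, bool]]:
    """Return (segment_size, window_full) for each segment sent while no ACK arrives.

    The last segment is the one that uses up the remaining advertised window,
    which is the segment an analyzer would mark as "window full".
    """
    segments = []
    in_flight = 0  # bytes sent but not yet acknowledged
    while in_flight < window:
        size = min(mss, window - in_flight)  # never exceed the advertised window
        in_flight += size
        segments.append((size, in_flight == window))
    return segments


for size, full in send_until_full(4096):
    print(size, "window full" if full else "")
```

After the flagged segment the sender has to wait; if the receiver's application still does not read its buffer, the next thing to expect in the capture is a zero window advertisement.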
How it works
The TCP sliding window mechanism is as follows:
- After the connection is established, the sender sends data to the receiver, filling the receive window.
- After several messages, the receiver sends an ACK to the sender to confirm that the bytes sent were received. The ACK frees space in the receive window.
- This process continues, with the sender filling the window with data and the receiver clearing it and sending an acknowledgment.
- Increasing the receive window size tells the sender that it may increase throughput, and reducing the window tells it to reduce throughput. This mechanism follows the WS/RTT rule: throughput is bounded by the window size (WS) divided by the round-trip time (RTT). The details change with different TCP versions.
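The WS/RTT relationship can be made concrete with a short calculation. The sketch below assumes the simplest form of the rule (at most one full window per round trip, no loss, no congestion control effects); the function names and the sample numbers are illustrative only.

```python
def max_throughput_bps(window_bytes: int, rtt_s: float) -> float:
    """Upper bound on throughput in bits per second: one window per round trip."""
    if rtt_s <= 0:
        raise ValueError("RTT must be positive")
    return window_bytes * 8 / rtt_s


def window_for_rate(rate_bps: float, rtt_s: float) -> float:
    """Window size in bytes needed to sustain rate_bps over a path with this RTT."""
    return rate_bps * rtt_s / 8


# A 65,535-byte window over a 50 ms round trip, expressed in Mbit/s.
print(round(max_throughput_bps(65535, 0.05) / 1e6, 2))

# Window needed to fill a 100 Mbit/s path with the same round-trip time.
print(round(window_for_rate(100e6, 0.05)))
```

The second figure is much larger than 65,535 bytes, which illustrates why a small receive window can limit throughput on a fast, high-latency path even when nothing is actually broken.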
You can also examine the problem in the TCP throughput graph and the IO graph. Among the TCP stream graphs, the time-sequence (tcptrace) graph draws the receive window as the upper line, and its distance from the lower line indicates how much of the window remains. No distance means a zero window.
A fixed distance between the two lines indicates that the receiver is working well. As the two lines get closer, it means that the sender is faster than the receiver. As long as the two lines do not overlap, TCP will continue to send data.
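The same reading of the graph can be done numerically. The sketch below assumes you have exported two series of sequence-number values sampled at the same instants, one for the upper (window) line and one for the lower line; the sample values and function names are invented for illustration.

```python
from typing import List


def headroom(window_line: List[int], data_line: List[int]) -> List[int]:
    """Distance between the upper and lower line at each sample."""
    return [w - d for w, d in zip(window_line, data_line)]


def stalled_samples(window_line: List[int], data_line: List[int]) -> List[int]:
    """Indexes of samples where the lines touch, meaning no window is left."""
    return [i for i, gap in enumerate(headroom(window_line, data_line)) if gap <= 0]


window_line = [8000, 12000, 14000, 15000, 15000]
data_line = [2000, 8000, 12000, 15000, 15000]

print(headroom(window_line, data_line))
print(stalled_samples(window_line, data_line))
```

A steadily shrinking gap, as in this sample, corresponds to the two lines converging on the graph: the sender is outpacing the receiver until the window is exhausted.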
TCP reset and reasons:
Connect Wireshark to both ends of the suspected link or server, start capturing packets, and examine the packets in each capture. A TCP reset can be sent in several situations. Some are part of the normal operation of the protocol, while others indicate possible problems. In this section, we identify these problems and analyze how to solve them.
Reset is a TCP signal used to tell the receiver to disconnect. It is sent by setting the RST flag to 1. During normal operation, TCP opens a connection with the SYN signal and closes it with the FIN signal. One of the great features of TCP is that it can quickly close a connection when there is a problem or just for better performance.
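To make the flag handling concrete, here is a small sketch that decodes the flag bits of a TCP header into names. The bit values are the standard ones for the six classic flags; the `decode_flags` helper itself is just an illustration and ignores the newer ECN-related bits.

```python
from typing import List

# Bit positions of the classic TCP flags in the header's flags field.
TCP_FLAGS = {
    0x01: "FIN",
    0x02: "SYN",
    0x04: "RST",
    0x08: "PSH",
    0x10: "ACK",
    0x20: "URG",
}


def decode_flags(flags: int) -> List[str]:
    """Return the names of the flags that are set, lowest bit first."""
    return [name for bit, name in TCP_FLAGS.items() if flags & bit]


def is_reset(flags: int) -> bool:
    """True when the RST bit is set to 1."""
    return bool(flags & 0x04)


print(decode_flags(0x14))  # a reset that also carries an acknowledgment
print(is_reset(0x11))      # FIN plus ACK: an orderly close, not a reset
```

In Wireshark the equivalent check is the display filter `tcp.flags.reset == 1`.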
Resets sent when there is no fault
The standard way TCP closes a connection is through FIN and FIN-ACK signals. Closing a connection takes four messages: a FIN/ACK from one side answered by an ACK, and the same exchange in the other direction. When you open a web page, you may have dozens of connections open at the same time (home page, news, ads, regularly updated pictures, etc.), and closing all of them can require hundreds of FIN and FIN-ACK messages. To avoid this, web servers in many cases close the connection with a reset after sending the requested data. This is standard practice and depends on the application.
Resets sent when there is a fault
In some cases, a reset indicates a fault (not necessarily a communication fault):
- Reset sent by a firewall: When an attempt to open a connection to a remote server fails, you may see an RST returned. This happens when a firewall blocks the connection. In the figure below, you can see that every SYN sent is answered with an RST.
- Reset sent because of a problem on the sender or receiver side. Possible reasons include:
  - Five consecutive retransmissions without receiving an ACK response. When the sender receives no response to its retransmissions, it sends a reset to the peer to tell it to disconnect.
  - No data on the connection for several minutes (the number of minutes depends on the system defaults). The party that opened the connection usually sends the reset (but not always, depending on the implementation).
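The reset scenarios listed above can be turned into a rough triage heuristic. The sketch below is an assumption-heavy illustration, not a diagnostic tool: the `Pkt` record, the `reset_hint` function, its labels, and the 120 second idle threshold are all hypothetical, and the retransmission limit of five is taken from the text above.

```python
from typing import List, NamedTuple


class Pkt(NamedTuple):
    t: float                      # capture time in seconds
    flags: str                    # e.g. "SYN", "SYN,ACK", "PSH,ACK", "RST"
    retransmission: bool = False  # True if the analyzer marked it as a retransmission


def reset_hint(stream: List[Pkt], idle_s: float = 120.0, max_retx: int = 5) -> str:
    """Guess why the first RST in a stream was sent, based on what preceded it."""
    rst_index = next((i for i, p in enumerate(stream) if "RST" in p.flags), None)
    if rst_index is None:
        return "no-reset"
    before = stream[:rst_index]
    # Only bare SYNs before the RST: the connection attempt itself was rejected.
    if before and all(p.flags == "SYN" for p in before):
        return "blocked-or-refused"
    # Count the unbroken run of retransmissions right before the RST.
    trailing_retx = 0
    for p in reversed(before):
        if not p.retransmission:
            break
        trailing_retx += 1
    if trailing_retx >= max_retx:
        return "retransmission-timeout"
    # A long silent gap before the RST suggests an idle timeout.
    if before and stream[rst_index].t - before[-1].t >= idle_s:
        return "idle-timeout"
    return "other"


blocked = [Pkt(0.0, "SYN"), Pkt(0.001, "RST,ACK")]
idle = [Pkt(0.0, "SYN"), Pkt(0.1, "SYN,ACK"), Pkt(0.2, "ACK"), Pkt(300.0, "RST")]
print(reset_hint(blocked), reset_hint(idle))
```

Anything that lands in the "other" bucket, such as a reset right after the requested data was delivered, is more likely the normal application behavior described earlier than a fault.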