Understanding TCP, HTTP, Socket and Socket Connection Pools
Preface

As developers, we often hear the terms HTTP protocol, TCP/IP protocol, UDP protocol, Socket, long-lived Socket connection, Socket connection pool, and so on. Yet not everyone clearly understands how they relate, how they differ, and how they work. This article starts from the basics of network protocols and works its way up to Socket connection pools, explaining the relationships step by step.
Seven-Layer Network Model
First, let's start with the layered model of network communication: the seven-layer model, also known as the OSI (Open System Interconnection) model. From bottom to top, the layers are: physical, data link, network, transport, session, presentation, and application. Every network communication passes through this stack. The following picture shows some of the protocols and hardware corresponding to each layer.
From the figure above, we can see that the IP protocol belongs to the network layer, TCP and UDP belong to the transport layer, and HTTP belongs to the application layer. The OSI model says nothing about Socket. So what is a Socket? We will look at it in detail, with code, later.
TCP and UDP Connections
Of the transport-layer protocols, TCP and UDP are the ones we encounter most often. People say that TCP is reliable while UDP is not, and that UDP transmits faster than TCP. Why is that? Let's start with how a TCP connection is established, and then explain the differences between UDP and TCP.
TCP's Three-Way Handshake and Four-Way Teardown
We know that TCP establishes a connection with a three-way handshake and closes it with a four-way teardown. What exactly does each step do?
First handshake: to establish the connection, the client sends a connection-request segment with the SYN flag set to 1 and Sequence Number = x. The client then enters the SYN_SENT state and waits for the server's confirmation.
Second handshake: when the server receives the client's SYN segment, it must acknowledge it, setting Acknowledgment Number = x + 1 (the received Sequence Number + 1). At the same time it sends a SYN request of its own, with the SYN flag set to 1 and Sequence Number = y. The server puts all of this into a single segment (the SYN+ACK segment) and sends it to the client, then enters the SYN_RECV state.
Third handshake: the client receives the server's SYN+ACK segment, sets Acknowledgment Number = y + 1, and sends an ACK segment back to the server. Once this ACK segment is sent, both client and server enter the ESTABLISHED state, completing the three-way handshake.
After the three-way handshake, client and server can begin transmitting data. That is the outline of TCP connection setup. At the end of the communication, when the connection is torn down, client and server must confirm the shutdown in four separate steps.
First teardown: Host 1 (which can be either the client or the server) sends a FIN segment to Host 2, with Sequence Number and Acknowledgment Number set, and enters the FIN_WAIT_1 state; this means Host 1 has no more data to send to Host 2.
Second teardown: Host 2 receives the FIN segment from Host 1 and returns an ACK segment whose Acknowledgment Number is the received Sequence Number plus 1. Host 1 enters the FIN_WAIT_2 state. With this ACK, Host 2 tells Host 1 that it "agrees" to the shutdown request.
Third teardown: Host 2 sends a FIN segment of its own to Host 1, requesting that the connection be closed, and enters the LAST_ACK state.
Fourth teardown: Host 1 receives the FIN segment from Host 2 and replies with an ACK segment, entering the TIME_WAIT state. Host 2 closes the connection as soon as it receives that ACK. If Host 1 then waits for 2MSL without hearing anything further, it concludes that Host 2 has closed normally, and Host 1 can close the connection as well.
As you can see, setting up and tearing down one TCP connection takes at least seven packets, not counting the data transfer itself, whereas UDP needs neither the three-way handshake nor the four-way teardown.
Differences between TCP and UDP
- TCP is connection-oriented. Although no number of handshakes can fully guarantee reliability on an insecure, unstable network, TCP's three-way handshake establishes the connection's reliability at a minimum (in practice, to a large extent). UDP is connectionless: it does not establish a connection before transmitting data and does not send acknowledgments for the data it receives, so the sender never knows whether its data arrived correctly and, accordingly, never retransmits. UDP is therefore a connectionless, unreliable transport protocol.
- For the same reasons, UDP has lower overhead and a higher data transfer rate: since sending and receiving need no acknowledgment, its real-time performance is better. Knowing this difference, it is easy to see why MSN, which transfers files over TCP, is slower than QQ, which uses UDP. That does not mean QQ's communication is unsafe: programmers can verify UDP delivery manually, for example by having the sender number each packet and the receiver check them. Even so, because UDP does not build a TCP-style "three-way handshake" into the protocol itself, it achieves a transmission efficiency that TCP cannot match.
Common Questions
Here are some questions we often hear about the transport layer.
1. What is the maximum number of concurrent connections a TCP server can hold?
There is a common misconception about this: "because the port number range tops out at 65535, a TCP server can theoretically carry at most 65535 concurrent connections." First, understand what identifies a TCP connection: the tuple of client IP, client port, server IP, and server port. So the number of clients a single TCP server process can serve simultaneously is not limited by the number of available ports: in theory, the number of connections a server can hold on one port is the number of global IP addresses multiplied by the number of ports on each client machine. The practical limit is the number of files a Linux process may have open, which is configurable and can be very large, so the real constraint is system performance. Check the maximum number of file handles with `ulimit -n`, and raise it with `ulimit -n XXX`, where XXX is the number you want. The system-wide kernel parameters can also be modified.
2. Why must the TIME_WAIT state wait 2MSL before returning to CLOSED?
Since both sides have agreed to close the connection and all four teardown segments have been sent and coordinated, it might seem reasonable to go straight back to the CLOSED state (just as SYN_SENT moves directly to ESTABLISHED). But we have to assume the network is unreliable: you cannot guarantee that the last ACK you sent was actually received. The peer's socket, sitting in the LAST_ACK state, may retransmit its FIN when that ACK times out, so the TIME_WAIT state exists precisely so the lost ACK can be retransmitted.
3. What problems does a large number of TIME_WAIT connections cause?
After the two sides establish a TCP connection, the side that actively closes it enters the TIME_WAIT state, which is held for two MSLs, roughly 1-4 minutes (4 minutes on Windows). Typically it is the client that enters TIME_WAIT, and each connection in that state occupies a local port. A machine has at most 65535 port numbers, so if tens of thousands of client requests are simulated in a stress test from one machine using short connections, the machine will accumulate on the order of 4000 TIME_WAIT sockets, and subsequent short connections will fail with an "address already in use" exception. The TIME_WAIT state also needs to be considered when Nginx is used as a reverse proxy. If you find a large number of TIME_WAIT connections in a system, you can mitigate the problem by tuning kernel parameters.
Edit the sysctl configuration file (by default /etc/sysctl.conf) and add the following:
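The configuration fragment itself did not survive this copy of the article; based on the four parameter descriptions below, it would look like this (the tcp_fin_timeout value of 30 is a commonly used choice, not taken from the source):

```
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_fin_timeout = 30
```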
Then run `/sbin/sysctl -p` to make the parameters take effect.
- `net.ipv4.tcp_syncookies = 1` enables SYN cookies: when the SYN backlog overflows, cookies are used to handle the excess, which protects against small-scale SYN attacks. The default is 0 (disabled).
- `net.ipv4.tcp_tw_reuse = 1` enables reuse, allowing sockets in the TIME_WAIT state to be reused for new TCP connections. The default is 0 (disabled).
- `net.ipv4.tcp_tw_recycle = 1` enables fast recycling of TIME_WAIT sockets. The default is 0 (disabled).
- `net.ipv4.tcp_fin_timeout` changes the system's default FIN timeout.
HTTP protocol
Regarding the relationship between TCP/IP and HTTP, there is a fairly accessible explanation online: "When we transmit data, we could use only the (transport-layer) TCP/IP protocol, but then, without an application layer, the receiver could not interpret the content of the data. To make the transmitted data meaningful, you must use an application-layer protocol. There are many application-layer protocols, such as HTTP, FTP, and TELNET, and you can also define your own."
The HTTP protocol, that is, the Hypertext Transfer Protocol, is the foundation of the Web and one of the protocols most commonly used for mobile networking. The Web uses HTTP as the application-layer protocol to encapsulate HTTP text, and then uses TCP/IP as the transport-layer protocol to send it over the network.
Because HTTP actively releases the connection after each request, HTTP connections are short connections. To keep a client program "online", it must keep initiating connection requests to the server. The usual pattern: even when the client does not currently need any data, it still sends a "keep-alive" request to the server at regular intervals, and the server replies to confirm that it knows the client is online. If the server does not hear from the client for a long time, the client is considered offline; if the client gets no reply from the server for a long time, the network is considered down.
Here is a simple HTTP POST request with an application/json body:
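The original example did not survive the copy; a minimal raw HTTP POST carrying a JSON body would look like this (the host and payload are invented for illustration):

```
POST /user HTTP/1.1
Host: www.example.com
Content-Type: application/json
Content-Length: 17

{"name": "admin"}
```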
About Socket
Now we know that TCP/IP is just a protocol stack. Like an operating system, it needs a concrete implementation, and that implementation must expose an interface for external use. Just as an operating system offers standard programming interfaces such as the Win32 API, the TCP/IP stack exposes a programming interface too, and that is Socket. We also now know that Socket is not inseparably tied to TCP/IP: the Socket programming interface was designed to accommodate other network protocols as well. Socket simply makes the TCP/IP stack more convenient to use, abstracting it into a handful of basic function interfaces such as create, listen, accept, connect, read, and write.
Different languages have their own libraries for building Socket servers and clients. Here is an example of how Node.js creates a server and a client.
Server:
The server listens on port 9000.
Next, send requests from the command line, for example with `curl http://127.0.0.1:9000` and with `telnet 127.0.0.1 9000`.
Notice that curl processes only one response message and then closes.
Client
Socket Long Connections
A long (persistent) connection means that multiple data packets are sent in succession over a single TCP connection. While the connection is held open, if no data is flowing, both sides need to send detection packets (heartbeat packets) to keep it alive, and this upkeep generally has to be handled by the application itself. A short connection, by contrast, is established only when the two sides have data to exchange and is torn down as soon as the transfer completes. HTTP is an example: connect, request, close, all in a short span, and if the server receives no requests for a while it can close the connection. "Long connection" is simply relative to the usual short connection: the client and server keep the connection open for a long time.
The typical short-connection sequence is:
connect → data transmission → close the connection;
while a long connection usually goes:
connect → data transmission → keep-alive (heartbeat) → data transmission → keep-alive (heartbeat) → … → close the connection.
When should you use long connections, and when short ones?
Long connections are mostly used for frequent, point-to-point communication where the number of connections cannot grow too large. Every TCP connection requires a three-way handshake, which takes time; if every operation had to connect first and then operate, processing would be much slower. With a long connection, the connection is not torn down after each operation completes; the next operation simply sends its packets directly, with no new TCP connection to establish. Database connections, for example, use long connections: talking to a database frequently over short connections would cause socket errors, and creating sockets over and over is a waste of resources.
What is a heartbeat packet, and why is it needed?
A heartbeat packet is a self-defined command word that the client and server send to each other at regular intervals to report their state; it resembles a heartbeat, hence the name. Sending and receiving data over the network is done with sockets, but if a socket has already been disconnected (for example, one side dropped off), sending and receiving will fail. So how do you judge whether a socket is still usable? You build a heartbeat mechanism into the system. TCP actually implements such a mechanism for us: if you enable it, TCP will send the configured number of probes (say, 2) within a certain time (say, 3 seconds), and this traffic does not interfere with your own application protocol. You can also define your own: a "heartbeat" simply means periodically sending a custom structure (a heartbeat packet or frame) so the other side knows you are still "online", ensuring the link remains valid.
Implementation:
Server:
The server's log output:
Client code:
The client's log output:
Define your own protocol
To make transmitted data meaningful, you must use an application-layer protocol, such as HTTP, MQTT, or Dubbo. Building a custom application-layer protocol on top of TCP requires solving several problems:
- the definition and handling of the heartbeat packet format;
- the definition of the message header: when sending data, the header is sent first and carries the length of the payload, so the receiver can parse out a complete message;
- the serialization format of the payload: JSON or something else.
Let's define our own protocol and write a server and a client that use it.
Message header format: `length:000000xxxx`, where the digits give the length of the data; the total header length is 20 bytes (the example is not rigorous).
Data serialization: JSON.
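As a sketch of this framing in Node.js (the helper names are mine; the 20-byte header is filled with the literal prefix `length:` plus a zero-padded byte count):

```javascript
const HEADER_LEN = 20;    // fixed header size defined above
const PREFIX = 'length:'; // 7 chars + 13 digits = 20 bytes

// Frame a JS object: 20-byte length header followed by a JSON body.
function encode(obj) {
  const body = JSON.stringify(obj);
  const header =
    PREFIX + String(Buffer.byteLength(body)).padStart(HEADER_LEN - PREFIX.length, '0');
  return header + body;
}

// Parse one complete frame out of an accumulated buffer string.
// Returns { message, rest } or null if the frame is not complete yet.
function decode(buffered) {
  if (buffered.length < HEADER_LEN) return null;
  const bodyLen = parseInt(buffered.slice(PREFIX.length, HEADER_LEN), 10);
  if (buffered.length < HEADER_LEN + bodyLen) return null; // body still in flight
  return {
    message: JSON.parse(buffered.slice(HEADER_LEN, HEADER_LEN + bodyLen)),
    rest: buffered.slice(HEADER_LEN + bodyLen), // bytes of the next frame, if any
  };
}
```

With these helpers, a server's data handler can keep appending incoming chunks to a buffer and repeatedly call decode until it returns null; that is what makes the framing safe against TCP's stream semantics.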
Server:
Log output:
Client:
Log output:
Here you can see that a client handles one request at a time. But imagine a scenario where the same client issues several calls to the server concurrently, sending multiple header segments and content segments. The data arriving in the server's data event can then hardly be attributed to the right request: if two headers arrive at the same time, the server will ignore one of them, and the content data that follows does not necessarily belong to the header it gets paired with. So if you want to reuse long connections while handling concurrent requests to the server, you need a connection pool.
Socket connection pool
What is a socket connection pool? A pool suggests a collection of resources: a socket connection pool is a collection that maintains a certain number of long-lived socket connections. It can automatically detect whether those connections are still valid, evict the invalid ones, and top the pool back up to the configured number. At the code level, it is simply a class that implements this behavior. A connection pool generally has the following attributes:
- a queue of idle long connections available for reuse;
- a queue of long connections currently handling communication;
- a queue of requests waiting for an idle long connection;
- eviction of invalid long connections;
- configuration of the pool size;
- creation of new long-connection resources.
The flow: when a request arrives, it first asks the pool for a long-connection resource. If the idle queue has one, that socket is taken and moved to the running queue. If the idle queue is empty and the running queue is shorter than the configured pool size, a new long connection is created and added to the running queue. If the running queue has already reached the configured size, the request joins the waiting queue. When a running socket finishes its request, it moves from the running queue back to the idle queue and, if any requests are waiting, triggers the first of them to take the idle resource.
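The life cycle above can be sketched as a tiny in-memory pool. This is a toy illustration of the idle/running/waiting queues, not generic-pool's actual implementation; every name here is invented:

```javascript
// Toy connection pool illustrating the idle / running / waiting queues described above.
class TinyPool {
  constructor(factory, max) {
    this.factory = factory;   // creates a new "connection" (here: any resource)
    this.max = max;           // configured size of the resource pool
    this.idle = [];           // idle connections, ready for reuse
    this.running = new Set(); // connections currently serving a request
    this.waiting = [];        // resolvers for requests that must wait
  }

  async acquire() {
    if (this.idle.length > 0) {
      const conn = this.idle.pop();      // reuse an idle connection
      this.running.add(conn);
      return conn;
    }
    if (this.running.size < this.max) {
      const conn = await this.factory(); // grow the pool
      this.running.add(conn);
      return conn;
    }
    // Pool exhausted: park the request until a connection is released.
    return new Promise((resolve) => this.waiting.push(resolve));
  }

  release(conn) {
    this.running.delete(conn);
    const next = this.waiting.shift();
    if (next) {
      this.running.add(conn);
      next(conn);           // hand the connection straight to a waiter
    } else {
      this.idle.push(conn); // otherwise back to the idle queue
    }
  }
}
```

A production pool such as generic-pool adds validation, eviction of dead connections, and timeouts on top of this same idea.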
Below is a brief introduction to generic-pool, a general-purpose connection pool module for Node.js.
Main file directory structure
Initialize connection pool
Using the connection pool
Below is how the pool is used; the protocol is the one we defined earlier.
Log output:
You can see that the first two requests each establish a new socket connection (the `socket_pool 127.0.0.1 9000 connection` log line). When the two requests are issued again after the timer fires, no new socket connection is established; the socket resources are obtained directly from the connection pool.
Source code analysis
The main code is in Pool.js in the lib folder.
Constructor:
lib/Pool.js
You can see the idle resource queue, the queue of resources currently serving requests, the waiting request queue, and so on.
Next, let's look at the Pool.acquire method.
lib/Pool.js
The code above works through these cases until it finally obtains a long-connection resource; you can dig into the rest of the source yourself for a deeper understanding.
About the author: six years of server-side development experience; led the evolution and deployment of technical solutions for a startup project from zero to high-concurrency traffic; accumulated experience in platform development, high concurrency, and high availability; currently responsible for the architecture and development of the gateway layer in the company's microservice framework.
Source: https://developpaper.com/understanding-tcp-http-socket-socket-connection-pool/