Architecture.md
Lines changed: 34 additions & 34 deletions
@@ -4,15 +4,15 @@ In the field of server-based request replay, there are generally two main approa

 For real-time request replication, there are generally two types:

-1) Application-layer request replication
-2) Packet-level request replication
+- Application-layer request replication
+- Packet-level request replication

 Traditional approaches often replicate requests at the application layer, as seen in tools like Oracle Database Replay. Although easier to implement, this approach has several drawbacks:

-1) Replicating requests from the application layer requires traversing the entire protocol stack, which can consume resources, such as valuable connection resources.
-2) Testing becomes coupled with the actual application, increasing the potential impact on online systems. Server-based replication, for instance, can cause request processing times to depend on the slowest request (e.g., `max(actual request time, replicated request time)`).
-3) Supporting high-stress replication is difficult and may severely impact online systems, according to feedback from some users.
-4) Network latency is challenging to control.
+- Replicating requests from the application layer requires traversing the entire protocol stack, which can consume resources, such as valuable connection resources.
+- Testing becomes coupled with the actual application, increasing the potential impact on online systems. Server-based replication, for instance, can cause request processing times to depend on the slowest request (e.g., `max(actual request time, replicated request time)`).
+- Supporting high-stress replication is difficult and may severely impact online systems, according to feedback from some users.
+- Network latency is challenging to control.

 Packet-level request replication, however, can avoid traversing the entire protocol stack. The shortest path can capture and send packets directly from the data link layer, or alternatively, at the IP layer. As long as TCP is not involved, the impact on online systems is significantly reduced.

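The `max(actual request time, replicated request time)` coupling mentioned in the drawbacks above can be illustrated with a small sketch. The function names and the sample timings are hypothetical and chosen only for illustration; they are not measurements from any real system.

```python
def coupled_latency(actual_ms: float, replicated_ms: float) -> float:
    """Latency seen by the real client when the server waits for both requests."""
    return max(actual_ms, replicated_ms)


def decoupled_latency(actual_ms: float, replicated_ms: float) -> float:
    """Latency seen by the real client when replication is off the request path."""
    return actual_ms


# Hypothetical (actual, replicated) processing times in milliseconds.
samples = [(12.0, 9.0), (15.0, 80.0), (10.0, 10.5)]

coupled = [coupled_latency(a, r) for a, r in samples]
decoupled = [decoupled_latency(a, r) for a, r in samples]

for (a, r), c, d in zip(samples, coupled, decoupled):
    print(f"actual={a} replicated={r} coupled={c} decoupled={d}")
```

In the second sample, a slow test server turns a 15 ms online request into an 80 ms one when the two are coupled, which is the kind of impact packet-level replication is meant to avoid.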
@@ -42,16 +42,16 @@ Returning to the architecture, this early version generally functioned only with

 **Advantages:**

-1) Simple and direct
-2) Suitable for smoke testing
-3) Relatively realistic testing outcomes
+- Simple and direct
+- Suitable for smoke testing
+- Relatively realistic testing outcomes

 **Disadvantages:**

-1) Higher impact on the online environment due to response packets returning to the online server (though still less than application-layer replication).
-2) Network segment limitations.
-3) For web applications, it is challenging to utilize multiple live flows, which limits its value for stress testing.
-4) Internal applications are heavily restricted because the client IP of requests cannot match the replicated online server’s IP address.
+- Higher impact on the online environment due to response packets returning to the online server (though still less than application-layer replication).
+- Network segment limitations.
+- For web applications, it is challenging to utilize multiple live flows, which limits its value for stress testing.
+- Internal applications are heavily restricted because the client IP of requests cannot match the replicated online server’s IP address.

 ## The Second Architecture

@@ -65,9 +65,9 @@ As shown in the diagram, `tcpcopy` now captures packets from the IP layer and al

 To analyze the interception of response packets, in theory, we could capture response packets at the IP layer or data link layer on the target server. Let’s examine these options:

-1) Capturing at the data link layer: If no routing is configured, the response packet would return to the actual client initiating the request, which would affect the client’s TCP module (frequent resets) and, under high load, could cause unnecessary interference to the switch, router, and even the entire network.
+- Capturing at the data link layer: If no routing is configured, the response packet would return to the actual client initiating the request, which would affect the client’s TCP module (frequent resets) and, under high load, could cause unnecessary interference to the switch, router, and even the entire network.

-2) Capturing at the IP layer: The netlink technology offers a solution to the above issues. Netlink is a communication method for interaction between user-space processes and the kernel. Specifically, we can use kernel modules such as ip_queue (for kernel versions below 3.5) or nfqueue (for kernel 3.5 or above) to capture response packets.
+- Capturing at the IP layer: The netlink technology offers a solution to the above issues. Netlink is a communication method for interaction between user-space processes and the kernel. Specifically, we can use kernel modules such as ip_queue (for kernel versions below 3.5) or nfqueue (for kernel 3.5 or above) to capture response packets.

 We chose the second method, which captures response packets at the IP layer. Once a response packet is passed to `intercept`, we can retrieve the essential response packet information (generally TCP/IP header information) and transmit it to `tcpcopy`. We can also use a verdict to instruct the kernel on handling these response packets. If there is no whitelist setting, these response packets will be dropped at the IP layer, making them undetectable by tcpdump (which operates at the data link layer).

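The flow described above (read the TCP/IP header of a captured response, then return a verdict) can be sketched in Python. This is an illustration only, not the `intercept` source: the function names, the `Verdict` enum, and the whitelist modeled as a set of destination IP addresses are all assumptions, and a real implementation would receive packets from ip_queue or nfqueue rather than from a hand-built byte string.

```python
import socket
import struct
from enum import Enum


class Verdict(Enum):
    """Hypothetical stand-in for the verdict returned to the kernel."""
    DROP = 0
    ACCEPT = 1


def parse_response_headers(packet: bytes) -> dict:
    """Extract the IPv4 and TCP header fields from a raw IP packet."""
    if len(packet) < 20 or packet[0] >> 4 != 4:
        raise ValueError("not an IPv4 packet")
    ip_header_len = (packet[0] & 0x0F) * 4          # IHL is in 32-bit words
    (total_len,) = struct.unpack("!H", packet[2:4])
    if packet[9] != socket.IPPROTO_TCP:
        raise ValueError("not a TCP segment")
    tcp = packet[ip_header_len:ip_header_len + 20]
    if len(tcp) < 20:
        raise ValueError("truncated TCP header")
    src_port, dst_port, seq, ack, off_flags, window = struct.unpack(
        "!HHIIHH", tcp[:16]
    )
    tcp_header_len = (off_flags >> 12) * 4          # data offset, 32-bit words
    return {
        "src_ip": socket.inet_ntoa(packet[12:16]),
        "dst_ip": socket.inet_ntoa(packet[16:20]),
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "flags": off_flags & 0x01FF,
        "window": window,
        "payload_len": total_len - ip_header_len - tcp_header_len,
    }


def decide_verdict(info: dict, whitelist: set) -> Verdict:
    """Accept responses addressed to whitelisted IPs; drop everything else."""
    return Verdict.ACCEPT if info["dst_ip"] in whitelist else Verdict.DROP


def build_sample_response() -> bytes:
    """Build a header-only SYN+ACK from 10.0.0.2:80 to 10.0.0.1:51000."""
    ip = struct.pack(
        "!BBHHHBBH4s4s", 0x45, 0, 40, 0, 0, 64, socket.IPPROTO_TCP, 0,
        socket.inet_aton("10.0.0.2"), socket.inet_aton("10.0.0.1"),
    )
    tcp = struct.pack("!HHIIHHHH", 80, 51000, 1000, 2000,
                      (5 << 12) | 0x12, 65535, 0, 0)
    return ip + tcp


info = parse_response_headers(build_sample_response())
print(info, decide_verdict(info, whitelist=set()))
```

With an empty whitelist every response is dropped, which matches the default behavior described above; only the parsed header fields would be forwarded to `tcpcopy`.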
@@ -77,17 +77,17 @@ This design allows for the replication of traffic from multiple online servers o

 **Advantages:**

-1) Supports replicating traffic from multiple online servers
-2) Minimizes impact on online servers, typically only returning TCP/IP header information
+- Supports replicating traffic from multiple online servers
+- Minimizes impact on online servers, typically only returning TCP/IP header information

 **Disadvantages:**

-1) More complex than the first architecture
-2) Performance limits are often tied to ip_queue or nfqueue
-3) `intercept` lacks scalability, restricted by ip_queue and nfqueue’s inability to support multi-process response packet capture
-4) `intercept` affects the final test results on the target server, especially under high-stress conditions
-5) Incomplete testing on the target server (no coverage of data link layer egress)
-6) Less convenient for maintenance
+- More complex than the first architecture
+- Performance limits are often tied to ip_queue or nfqueue
+- `intercept` lacks scalability, restricted by ip_queue and nfqueue’s inability to support multi-process response packet capture
+- `intercept` affects the final test results on the target server, especially under high-stress conditions
+- Incomplete testing on the target server (no coverage of data link layer egress)
+- Less convenient for maintenance

 ## The Third Architecture

@@ -111,20 +111,20 @@ It’s important to note that in certain scenarios, pcap packet capture may expe

 **Advantages:**

-1. Provides a more realistic testing environment
-2. Highly scalable
-3. Suitable for high concurrency scenarios
-4. Avoids the limitations of ip_queue and nfqueue
-5. Virtually no performance impact on the target server
-6. Easier maintenance on the target server running services
-7. Will not crash alongside the service-running server in the event of a failure
+- Provides a more realistic testing environment
+- Highly scalable
+- Suitable for high concurrency scenarios
+- Avoids the limitations of ip_queue and nfqueue
+- Virtually no performance impact on the target server
+- Easier maintenance on the target server running services
+- Will not crash alongside the service-running server in the event of a failure

 **Disadvantages:**

-1. More challenging to operate
-2. Requires additional machine resources
-3. Demands more knowledge
-4. The assistant server (running `intercept`) should ideally be on the same network segment as the target server to simplify deployment
+- More challenging to operate
+- Requires additional machine resources
+- Demands more knowledge
+- The assistant server (running `intercept`) should ideally be on the same network segment as the target server to simplify deployment

 All three architectures have their merits. Currently, only the second and third architectures are open-source, and tcpcopy defaults to the third architecture.