Commit 42ab093

Update Architecture.md
1 parent cf3c63d commit 42ab093

1 file changed: +34 -34 lines changed

Architecture.md

Lines changed: 34 additions & 34 deletions
@@ -4,15 +4,15 @@ In the field of server-based request replay, there are generally two main approa

 For real-time request replication, there are generally two types:

-1) Application-layer request replication
-2) Packet-level request replication
+- Application-layer request replication
+- Packet-level request replication

 Traditional approaches often replicate requests at the application layer, as seen in tools like Oracle Database Replay. Although easier to implement, this approach has several drawbacks:

-1) Replicating requests from the application layer requires traversing the entire protocol stack, which can consume resources, such as valuable connection resources.
-2) Testing becomes coupled with the actual application, increasing the potential impact on online systems. Server-based replication, for instance, can cause request processing times to depend on the slowest request (e.g., `max(actual request time, replicated request time)`).
-3) Supporting high-stress replication is difficult and may severely impact online systems, according to feedback from some users.
-4) Network latency is challenging to control.
+- Replicating requests from the application layer requires traversing the entire protocol stack, which can consume resources, such as valuable connection resources.
+- Testing becomes coupled with the actual application, increasing the potential impact on online systems. Server-based replication, for instance, can cause request processing times to depend on the slowest request (e.g., `max(actual request time, replicated request time)`).
+- Supporting high-stress replication is difficult and may severely impact online systems, according to feedback from some users.
+- Network latency is challenging to control.

 Packet-level request replication, however, can avoid traversing the entire protocol stack. The shortest path can capture and send packets directly from the data link layer, or alternatively, at the IP layer. As long as TCP is not involved, the impact on online systems is significantly reduced.
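
The `max(actual request time, replicated request time)` coupling named in the drawbacks above can be illustrated with a small sketch. It is not taken from tcpcopy; the function names are hypothetical, and it assumes the simplest model, in which an application-layer replicator makes the server wait for both the real and the replicated request before answering the client.

```python
def coupled_latency(actual_s: float, replicated_s: float) -> float:
    """Client-visible latency when the server waits for both requests.

    Assumed model: the online server forwards a copy of the request and
    only answers once both the real and the replicated request finish.
    """
    return max(actual_s, replicated_s)


def decoupled_latency(actual_s: float, replicated_s: float) -> float:
    """Client-visible latency when replication happens off the request path.

    Assumed model: the copy is made outside the application, so the
    replicated request's duration never delays the real response.
    """
    return actual_s


if __name__ == "__main__":
    actual, replicated = 0.02, 0.15  # seconds; illustrative values only
    print(coupled_latency(actual, replicated))
    print(decoupled_latency(actual, replicated))
```

Under this model a slow test server directly slows online responses in the coupled case, which is the impact that packet-level replication is meant to avoid.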

@@ -42,16 +42,16 @@ Returning to the architecture, this early version generally functioned only with

 **Advantages:**

-1) Simple and direct
-2) Suitable for smoke testing
-3) Relatively realistic testing outcomes
+- Simple and direct
+- Suitable for smoke testing
+- Relatively realistic testing outcomes

 **Disadvantages:**

-1) Higher impact on the online environment due to response packets returning to the online server (though still less than application-layer replication).
-2) Network segment limitations.
-3) For web applications, it is challenging to utilize multiple live flows, which limits its value for stress testing.
-4) Internal applications are heavily restricted because the client IP of requests cannot match the replicated online server’s IP address.
+- Higher impact on the online environment due to response packets returning to the online server (though still less than application-layer replication).
+- Network segment limitations.
+- For web applications, it is challenging to utilize multiple live flows, which limits its value for stress testing.
+- Internal applications are heavily restricted because the client IP of requests cannot match the replicated online server’s IP address.

 ## The Second Architecture

@@ -65,9 +65,9 @@ As shown in the diagram, `tcpcopy` now captures packets from the IP layer and al

 To analyze the interception of response packets, in theory, we could capture response packets at the IP layer or data link layer on the target server. Let’s examine these options:

-1) Capturing at the data link layer: If no routing is configured, the response packet would return to the actual client initiating the request, which would affect the client’s TCP module (frequent resets) and, under high load, could cause unnecessary interference to the switch, router, and even the entire network.
+- Capturing at the data link layer: If no routing is configured, the response packet would return to the actual client initiating the request, which would affect the client’s TCP module (frequent resets) and, under high load, could cause unnecessary interference to the switch, router, and even the entire network.

-2) Capturing at the IP layer: The netlink technology offers a solution to the above issues. Netlink is a communication method for interaction between user-space processes and the kernel. Specifically, we can use kernel modules such as ip_queue (for kernel versions below 3.5) or nfqueue (for kernel 3.5 or above) to capture response packets.
+- Capturing at the IP layer: The netlink technology offers a solution to the above issues. Netlink is a communication method for interaction between user-space processes and the kernel. Specifically, we can use kernel modules such as ip_queue (for kernel versions below 3.5) or nfqueue (for kernel 3.5 or above) to capture response packets.

 We chose the second method, which captures response packets at the IP layer. Once a response packet is passed to `intercept`, we can retrieve the essential response packet information (generally TCP/IP header information) and transmit it to `tcpcopy`. We can also use a verdict to instruct the kernel on handling these response packets. If there is no whitelist setting, these response packets will be dropped at the IP layer, making them undetectable by tcpdump (which operates at the data link layer).
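
The two steps described above (extracting the TCP/IP header information and issuing a verdict) can be sketched as follows. This is an illustration only, not the `intercept` implementation: `HeaderInfo`, `parse_headers`, and `verdict` are hypothetical names, the packet is assumed to be a raw IPv4 packet carrying TCP, and the whitelist is assumed to be a set of destination (client) IP addresses whose responses are allowed through.

```python
import socket
import struct
from typing import NamedTuple, Set


class HeaderInfo(NamedTuple):
    """The TCP/IP header fields a replayer would need (hypothetical type)."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    seq: int
    ack: int
    flags: int


def parse_headers(packet: bytes) -> HeaderInfo:
    """Extract header fields from a raw IPv4 packet that carries TCP."""
    ihl = (packet[0] & 0x0F) * 4  # IPv4 header length in bytes
    src_ip = socket.inet_ntoa(packet[12:16])
    dst_ip = socket.inet_ntoa(packet[16:20])
    # First 14 bytes of the TCP header: ports, seq, ack, offset and flags.
    src_port, dst_port, seq, ack, off_flags = struct.unpack(
        "!HHIIH", packet[ihl:ihl + 14]
    )
    return HeaderInfo(src_ip, dst_ip, src_port, dst_port, seq, ack,
                      off_flags & 0x01FF)


def verdict(info: HeaderInfo, whitelist: Set[str]) -> str:
    """Decide what the kernel should do with a queued response packet.

    Assumption: responses to whitelisted clients pass through; everything
    else is dropped at the IP layer after its headers have been recorded.
    """
    return "ACCEPT" if info.dst_ip in whitelist else "DROP"
```

With an empty whitelist every queued response yields `"DROP"`, matching the default behaviour described above; only the parsed header fields would then be forwarded to `tcpcopy`.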

@@ -77,17 +77,17 @@ This design allows for the replication of traffic from multiple online servers o

 **Advantages:**

-1) Supports replicating traffic from multiple online servers
-2) Minimizes impact on online servers, typically only returning TCP/IP header information
+- Supports replicating traffic from multiple online servers
+- Minimizes impact on online servers, typically only returning TCP/IP header information

 **Disadvantages:**

-1) More complex than the first architecture
-2) Performance limits are often tied to ip_queue or nfqueue
-3) `intercept` lacks scalability, restricted by ip_queue and nfqueue’s inability to support multi-process response packet capture
-4) `intercept` affects the final test results on the target server, especially under high-stress conditions
-5) Incomplete testing on the target server (no coverage of data link layer egress)
-6) Less convenient for maintenance
+- More complex than the first architecture
+- Performance limits are often tied to ip_queue or nfqueue
+- `intercept` lacks scalability, restricted by ip_queue and nfqueue’s inability to support multi-process response packet capture
+- `intercept` affects the final test results on the target server, especially under high-stress conditions
+- Incomplete testing on the target server (no coverage of data link layer egress)
+- Less convenient for maintenance

 ## The Third Architecture

@@ -111,20 +111,20 @@ It’s important to note that in certain scenarios, pcap packet capture may expe

 **Advantages:**

-1. Provides a more realistic testing environment
-2. Highly scalable
-3. Suitable for high concurrency scenarios
-4. Avoids the limitations of ip_queue and nfqueue
-5. Virtually no performance impact on the target server
-6. Easier maintenance on the target server running services
-7. Will not crash alongside the service-running server in the event of a failure
+- Provides a more realistic testing environment
+- Highly scalable
+- Suitable for high concurrency scenarios
+- Avoids the limitations of ip_queue and nfqueue
+- Virtually no performance impact on the target server
+- Easier maintenance on the target server running services
+- Will not crash alongside the service-running server in the event of a failure

 **Disadvantages:**

-1. More challenging to operate
-2. Requires additional machine resources
-3. Demands more knowledge
-4. The assistant server (running `intercept`) should ideally be on the same network segment as the target server to simplify deployment
+- More challenging to operate
+- Requires additional machine resources
+- Demands more knowledge
+- The assistant server (running `intercept`) should ideally be on the same network segment as the target server to simplify deployment

 All three architectures have their merits. Currently, only the second and third architectures are open-source, and tcpcopy defaults to the third architecture.