Request for review: Accuracy check of my understanding of Linkerd's traffic interception and communication flow #14676
seal90 asked this question in Show and tell
Based on Linkerd's official documentation, I have analyzed Linkerd Traffic Interception via iptables and Linkerd Communication Workflow, using actual running Kubernetes objects as reference. The findings are compiled in the document below.
I would greatly appreciate your review of the workflow I've documented to verify its accuracy. Please feel free to point out any inaccuracies or gaps.
During this analysis, I identified two potential optimization opportunities for further discussion:
Dynamic Routing Rules
Extract field values (e.g., version) from request context—such as headers or query parameters—normalize them (e.g., convert to uppercase), and assign them to named variables (e.g., VERSION). These variables can then be interpolated in backendRefs to dynamically construct the target service name (e.g., hello-world-service-${VERSION}). If the resolved service does not exist, the request automatically falls back to a predefined default backend, ensuring it is always handled.
This pattern is already partially reflected in the Dynamic Routing Rule Selection phase of the Linkerd_Communication_Workflow below and could enable advanced use cases like canary releases or version-based routing. A hypothetical sketch follows.
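To make the proposal concrete, here is a purely hypothetical sketch of what such a route could look like. The `${VERSION}` interpolation and the implied "extract and normalize a header into a variable" step do not exist in today's HTTPRoute spec; only the surrounding fields follow the current policy.linkerd.io schema, and all names (hello-world-service, VERSION) are illustrative.

```yaml
# Hypothetical only: ${VERSION} interpolation is NOT supported by Linkerd or
# the Gateway API today; this sketch just illustrates the proposal above.
apiVersion: policy.linkerd.io/v1beta3
kind: HTTPRoute
metadata:
  name: hello-world-dynamic
  namespace: default
spec:
  parentRefs:
  - name: hello-world-service
    kind: Service
    group: core
    port: 8080
  rules:
  - backendRefs:
    # VERSION would be extracted from the "version" request header and uppercased.
    - name: hello-world-service-${VERSION}   # hypothetical interpolation
      port: 8080
    # Fallback when hello-world-service-${VERSION} does not resolve to a Service
    # (fallback-on-missing-Service semantics are also part of the proposal,
    # not current backendRefs behavior).
    - name: hello-world-service
      port: 8080
```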
Deferring DNS Resolution to the Linkerd Proxy
Instead of resolving the destination at the application layer, the initial DNS query from the application could return a placeholder (e.g., 192.0.2.1), while the actual target address is resolved by the linkerd-proxy using the original hostname at request time. This decouples the application from service discovery logic. However, it introduces complexity in HTTPS scenarios—particularly around SNI handling and certificate validation.
I did not find explicit documentation in Linkerd's official resources regarding “double DNS lookups” or such deferred resolution strategies.
To the Linkerd team: Have you evaluated DNS optimization approaches like this? What are the key design trade-offs involved?
Linkerd_Traffic_Interception_via_iptables
Inter-Pod (Different-Pod) Communication
https://linkerd.io/2.18/reference/iptables/
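For reference, here is a condensed sketch of the rules linkerd-init installs, approximating the iptables reference linked above; the UID (2102) and ports (4140/4143) match the walk-through below, and the exact skip-port list varies by configuration.

```sh
# Outbound: application traffic is redirected to the proxy's outbound listener
# on 4140; the proxy's own traffic (UID 2102) and loopback traffic are skipped.
iptables -t nat -N PROXY_INIT_OUTPUT
iptables -t nat -A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -j RETURN
iptables -t nat -A PROXY_INIT_OUTPUT -o lo -j RETURN
iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140
iptables -t nat -A OUTPUT -j PROXY_INIT_OUTPUT

# Inbound: traffic arriving at the pod is redirected to the proxy's inbound
# listener on 4143; the proxy's own admin/inbound ports are skipped.
iptables -t nat -N PROXY_INIT_REDIRECT
iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -m multiport --dports 4190,4191 -j RETURN
iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143
iptables -t nat -A PREROUTING -j PROXY_INIT_REDIRECT
```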
hello-world-call → hello-world-service via Linkerd

[hello-world-call Request Package]
[hello-world-call] Generates the request package:
src=10.244.0.1:9003 → dst=10.96.0.2:8080

[1] — (The package passes through here.)
[OUTPUT] Target address redirected to 127.0.0.1:4140.
Applied iptables rule: iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140
Resulting packet (DNAT): src=10.244.0.1:9003 → dst=127.0.0.1:4140
Tip: IPVS hooks at NF_IP_PRI_NAT_DST + 1, so IPVS does not interfere here.
[2] — (The package passes through here.)
[POSTROUTING] — (The package passes through here.)
[3] — Go to INPUT (destination is 127.0.0.1 or the pod IP) → bypasses PREROUTING.
[INPUT] — (The package passes through here.)
[4] — (The package passes through here.)

[hello-world-call's Linkerd Request Package]
[Linkerd outbound] Resolves the destination to the endpoint 10.244.0.2:8080 and opens a new connection:
src=10.244.0.1:19003 → dst=10.244.0.2:8080
Rule: traffic from the proxy (UID 2102) is excluded from NAT.
[PREROUTING (destination pod)] Target port redirected to 127.0.0.1:4143.
Applied rule: iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143
DNAT: dst=127.0.0.1:4143

[hello-world-service's Linkerd Request Package]
[Linkerd inbound] Linkerd forwards the request to the local application:
src=10.244.0.2:29003 → dst=10.244.0.2:8080
[10] — (The package passes through here.)
[OUTPUT] Do nothing; the applied rule excludes the proxy's own traffic (UID 2102) from NAT.
[11] — (The package passes through here.)
[POSTROUTING] — (The package passes through here.)
[12] — Go to INPUT
[INPUT] — (The package passes through here.)
[13] — (The package passes through here.)

[hello-world-service Response Package]
src=10.244.0.2:8080 → dst=10.244.0.2:29003
The outbound redirect rules (iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140) do not apply, because this traffic travels over the lo interface.

[hello-world-service's Linkerd Response Package]
src=10.244.0.2:8080 → dst=10.244.0.1:19003
The inbound redirect rule (iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143) is not applied (replies on the established connection are handled by conntrack rather than traversing the nat chains again).

[hello-world-call's Linkerd Response Package]
src=127.0.0.1:4140 → dst=10.244.0.1:9003
After conntrack reverses the earlier DNAT, the application sees:
src=10.96.0.2:8080 → dst=10.244.0.1:9003

Intra-Pod (Same-Pod) Communication
In typical Kubernetes deployments, a Pod hosts a single service instance. When analyzing communication within the same Pod, we assume the workload calls its own service (e.g., via the service's ClusterIP or localhost).
When multiple services are deployed within a single Pod for communication, Linkerd may only partially support—or even completely fail to provide—service mesh capabilities (such as traffic management and observability). Therefore, deploying multiple services in a single Pod is not recommended.
Loopback Communication:
hello-world-call → hello-world-call via 127.0.0.1

[hello-world-call Request Package]
src=127.0.0.1:9003 → dst=127.0.0.1:8080
The iptables rules explicitly skip loopback traffic, so no redirection or NAT is applied.
Local delivery (destination 127.0.0.1) → bypasses PREROUTING entirely.

[hello-world-call Response Package]
src=127.0.0.1:8080 → dst=127.0.0.1:9003

ClusterIP Communication:
hello-world-call → hello-world-call via ClusterIP 10.96.0.1

[hello-world-call Request Package]
src=10.244.0.1:9003 → dst=10.96.0.1:8080
[OUTPUT] Target address redirected to 127.0.0.1:4140.
Applied iptables rule: iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140
DNAT result: src=10.244.0.1:9003 → dst=127.0.0.1:4140
Local delivery (destination 127.0.0.1) → PREROUTING is bypassed.

[hello-world-call's Linkerd Request Package]
[Linkerd outbound] Resolves 10.96.0.1:8080 → 10.244.0.1:8080 (self) and opens a new connection:
src=10.244.0.1:19003 → dst=10.244.0.1:8080
Rule: traffic from the proxy (UID 2102) is excluded from NAT.

[hello-world-call Response Package]
src=10.244.0.1:8080 → dst=10.244.0.1:19003
The outbound redirect rules (REDIRECT --to-port 4140) do not apply.

[hello-world-call's Linkerd Response Package]
src=127.0.0.1:4140 → dst=10.244.0.1:9003
After conntrack reverses the earlier DNAT, the application sees:
src=10.96.0.1:8080 → dst=10.244.0.1:9003

Linkerd_Communication_Workflow
CRDs and Core Feature Implementation in Linkerd's Communication Workflow
Linkerd Communication Workflow: Single-Cluster and Multi-Cluster Scenarios
Linkerd provides transparent, secure, and observable service-to-service communication in Kubernetes environments. Its workflow can be categorized into two primary scenarios: single-cluster communication and multi-cluster communication, both requiring zero application code changes.
I. Single-Cluster Communication Workflow
https://linkerd.io/2.18/getting-started/
Linkerd usage within a single cluster consists of three phases: control plane installation, service deployment (with automatic sidecar injection), and secure inter-service communication, enabling seamless integration of mTLS, traffic management, and observability.
1. Control Plane Installation
The Linkerd control plane is deployed as a lightweight, highly available set of components in a dedicated namespace (e.g., linkerd), including:
- CRDs: define extended resources such as ServiceProfile, HTTPRoute, AuthorizationPolicy, and Server to support advanced traffic policies.
- linkerd-proxy-injector: a Mutating Admission Webhook responsible for automatic sidecar injection.
- linkerd-identity: issues short-lived mTLS certificates to workloads based on the SPIFFE standard and handles automatic certificate rotation.
- linkerd-destination: provides dynamic service discovery, load balancing, and policy distribution (e.g., routing rules, authorization policies).
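Assuming the linkerd CLI is already installed, the install flow from the getting-started guide referenced above boils down to roughly:

```sh
linkerd check --pre                          # verify the cluster meets the prerequisites
linkerd install --crds | kubectl apply -f -  # install the CRDs (ServiceProfile, HTTPRoute, ...)
linkerd install | kubectl apply -f -         # install the control plane into the linkerd namespace
linkerd check                                # verify the control plane is healthy
```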
2. Service Deployment: Automatic Sidecar Injection
When users deploy applications via kubectl or Helm, Linkerd transparently enhances workloads if the injection criteria are met:
Injection Trigger
The namespace or Pod template is annotated with linkerd.io/inject: enabled, for example:
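For example, enabling injection for a whole namespace (the same annotation can be placed on a Deployment's Pod template instead; the namespace name is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo
  annotations:
    linkerd.io/inject: enabled   # every Pod created here gets the sidecar injected
```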
Injection Process
During Pod creation, the Kubernetes API server invokes the linkerd-proxy-injector webhook. The webhook receives the original Pod spec and returns a JSON Patch that dynamically injects the following:
- linkerd-init (Init Container)
  - Installs the iptables rules that redirect the Pod's traffic to linkerd-proxy
  - Requires NET_ADMIN and NET_RAW capabilities (declared via SecurityContext)
- linkerd-proxy (Sidecar Container)
  - Obtains mTLS certificates from linkerd-identity at startup and auto-rotates them
  - Uses SPIFFE-style identities (identity.l5d.io) for identity authentication
  - Sets LINKERD2_PROXY_* environment variables (e.g., control plane address, log level)
  - Mounts the TLS trust anchors (trust-anchors)
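A much-simplified sketch of what the resulting patched Pod spec looks like; images, versions, and the full env/volume list are omitted, and the exact names and mount paths vary by Linkerd version:

```yaml
spec:
  initContainers:
  - name: linkerd-init                 # installs the iptables rules, then exits
    securityContext:
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]
  containers:
  - name: linkerd-proxy                # sidecar that handles all inbound/outbound traffic
    env:
    - name: LINKERD2_PROXY_LOG         # one of the LINKERD2_PROXY_* settings
      value: warn,linkerd=info
    volumeMounts:
    - name: linkerd-identity-trust-roots        # TLS trust anchors (name/path vary by version)
      mountPath: /var/run/linkerd/identity/trust-roots
```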
3. Service Invocation: Secure, Intelligent, and Observable
Example: hello-world-call invoking hello-world-service (with multi-version canary support).

Scenario Setup
- hello-world-service runs three versions simultaneously: v1 (stable), v2 (canary A), and v3 (canary B).
- The stable Service selects version=v1, ensuring non-canary traffic remains stable.
- Pods are labeled app=hello-world-service, version=vX to enable metric collection by dimensions like service + version (e.g., via kube-state-metrics).

Invocation Flow
- hello-world-call sends an HTTP request → transparently captured by the co-located linkerd-proxy (outbound).
- Dynamic Routing Rule Selection: if a TrafficSplit or HTTPRoute exists, it determines the backend. Following the proposal above, a field extracted from the request context would be normalized into a named variable (e.g., VERSION); then, in backendRefs, the target service name would be constructed via variable interpolation, such as hello-world-service-${VERSION}. If the resulting service does not exist, fall back to a default service to handle the request. For example, extract the version field from the request header, name it VERSION, and use it to form the service name hello-world-service-${VERSION}.
- mTLS: the two linkerd-proxy instances exchange certificates using SPIFFE IDs (e.g., spiffe://cluster.local/ns/default/sa/hello-world-service). Certificates are issued by linkerd-identity, bound to Kubernetes ServiceAccounts, and short-lived (default: 24 hours).
- The destination linkerd-proxy (inbound) performs:
  - Authentication via MeshTLSAuthentication (mTLS) or NetworkAuthentication (IP-based)
  - Authorization via an AuthorizationPolicy, which binds a Server (defining port/protocol) + authentication method, or an HTTPRoute (defining request matchers) + authentication method
- After successful authentication, the inbound proxy forwards the request to the application process.
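A sketch of the inbound policy objects described above, using the policy.linkerd.io CRDs; apiVersions and the identity string format should be checked against the installed Linkerd version, and all names follow the hello-world example:

```yaml
apiVersion: policy.linkerd.io/v1beta3
kind: Server
metadata:
  name: hello-world-service-http
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: hello-world-service
  port: 8080
  proxyProtocol: HTTP/1
---
apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: hello-world-callers
  namespace: default
spec:
  identities:
  - "hello-world-call.default.serviceaccount.identity.linkerd.cluster.local"
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: hello-world-service-authz
  namespace: default
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: hello-world-service-http
  requiredAuthenticationRefs:
  - group: policy.linkerd.io
    kind: MeshTLSAuthentication
    name: hello-world-callers
```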
II. Multi-Cluster Communication Workflow (Linkerd Multicluster)
https://linkerd.io/2.18/tasks/multicluster/
Linkerd's multi-cluster capability extends the single-cluster model to enable cross-cluster service discovery and transparent invocation, also structured into installation, deployment, and invocation phases.
1. Installation Phase: Core Components and Permissions
Global Prerequisites
Each cluster has the Linkerd multicluster extension installed, which defines the CRD Link.
Per-Cluster Deployment
- Link CR instance: declares connectivity to peer clusters.
- linkerd-gateway Deployment: injected with linkerd-proxy via linkerd inject to enable mTLS proxying; the main process is a pause container, while the actual functionality is handled by the sidecar.
- linkerd-service-mirror-remote-access ServiceAccount + ClusterRoleBinding: grants the peer cluster read access (get/list/watch services/endpoints); exposed as linkerd-service-mirror-remote-access-<cluster-name>.
- linkerd-local-service-mirror Deployment: uses the peer cluster's remote-access credentials to watch exported services and create local mirror services.
Establishing Connections
Run linkerd multicluster link to generate the connection configuration (see the sketch below).
Clusters become mutually aware, forming a multi-cluster topology.
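With kubectl contexts named west and east (as in the multicluster task linked above), the linking step looks roughly like:

```sh
# Make west aware of east, then east aware of west.
linkerd --context=east multicluster link --cluster-name east | kubectl --context=west apply -f -
linkerd --context=west multicluster link --cluster-name west | kubectl --context=east apply -f -
```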
2. Deployment Phase: Service Export and Mirroring
Example with west and east clusters:
Export Services in East Cluster
Deploy frontend and podinfo (and their Services) in the test namespace.
Annotate (label) the services to be exported, for example:
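The service mirror selects services by label; a minimal sketch, with the label key taken from the multicluster docs and the context/namespace names from the example above:

```sh
kubectl --context=east -n test label svc/frontend svc/podinfo \
  mirror.linkerd.io/exported=true
```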
Automatic Synchronization in West Cluster
- linkerd-local-service-mirror (in west) uses the linkerd-service-mirror-remote-access permissions from east to detect newly exported services.
- Mirror services are created in the test namespace: frontend-east and podinfo-east.
- Their endpoints point to the east cluster's linkerd-gateway.
Symmetric Mirroring
Services exported from west are similarly mirrored into east.
Traffic Splitting (Optional)
Create a TrafficSplit in west to blend local and remote traffic, for example:
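A sketch of such a TrafficSplit, assuming the frontend/podinfo example above; the SMI apiVersion may differ depending on the Linkerd version:

```yaml
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: podinfo
  namespace: test
spec:
  service: podinfo            # the apex service that clients address
  backends:
  - service: podinfo          # local (west) backend
    weight: 50
  - service: podinfo-east     # mirrored (east) backend
    weight: 50
```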
3. Invocation Phase: Transparent Cross-Cluster Communication
- A frontend Pod in the west cluster calls podinfo.
- The linkerd-proxy in frontend intercepts the request.
- Based on the TrafficSplit rule, it selects podinfo-east with 50% probability.
- The request is sent (over mTLS) to the east cluster's linkerd-gateway.
- The gateway forwards the request to a podinfo Pod in east.
- The inbound proxy in that Pod delivers the request to the podinfo application in east.

Linkerd_DNS_resolution_optimisation
Optimizing DNS Resolution in Linkerd Service Communication on Kubernetes
In a standard Kubernetes + Linkerd environment, inter-service communication typically follows this flow (using hello-world-call invoking hello-world-service as an example):
1. hello-world-call initiates a request to hello-world-service (e.g., http://hello-world-service).
2. The application's DNS lookup resolves hello-world-service to a ClusterIP (e.g., 10.96.0.10).
3. The request is intercepted by the linkerd-proxy in the same Pod (via iptables-based transparent redirection).
4. linkerd-proxy performs a second resolution of the destination (via the control plane's destination service, to a concrete endpoint rather than the ClusterIP).
5. linkerd-proxy establishes a connection to the final resolved destination.

📌 Problem Identification: DNS resolution occurs twice: once by the application (to the ClusterIP) and again by linkerd-proxy (to the real backend). The first resolution is redundant in Linkerd's data plane, as traffic never actually traverses the Service's ClusterIP.

Optimization Idea: Defer DNS Resolution to the Proxy Layer
To eliminate redundant resolution, consider the following approach:
Potential Benefits
- The redundant application-side DNS lookup is eliminated.
- All destination resolution is centralized in the linkerd-proxy, enabling consistent enforcement of access control, auditing, rate limiting, and security policies.
Potential Drawbacks and Challenges
- For HTTPS traffic, the linkerd-proxy must extract the Server Name Indication (SNI) from the TLS ClientHello to identify the target domain.
  ■ Note: Since DNS is intercepted, clients cannot retrieve ECH public keys via DNS (e.g., HTTPS SVCB/HTTPSSVC records), so ECH is effectively non-functional in this model today.
- An explicit forward-proxy model would rely on HTTP CONNECT, which most non-browser clients (e.g., gRPC, database drivers) do not support. Transparent MITM proxying would require dynamic certificate generation, increasing complexity and trust management overhead.
- Per-request destination control already exists in Linkerd via HTTPRoute and custom headers (e.g., l5d-dst-override).
- The proxy's own DNS lookups would need to be exempted from interception (e.g., a skip-dns-resolve-uid-owner style rule), increasing operational complexity.

Validate the feasibility of HTTP proxying with "DNS resolution deferred to the proxy layer" using curl, CoreDNS, and Nginx
To validate the feasibility of “deferring DNS resolution to the proxy layer,” the following local simulation can be performed:
1. Simulate an application (e.g., curl) accessing the target service: http://hello-world-service.io.
2. Configure a custom CoreDNS instance to resolve hello-world-service.io to a predefined virtual IP, e.g., 192.0.2.1 (an address that maps to no real service).
3. curl attempts to connect to 192.0.2.1:80 based on the DNS response.
4. Use iptables or network policies to transparently redirect traffic destined for 192.0.2.1:80 to a local proxy port (e.g., 127.0.0.1:4140), mimicking Linkerd's sidecar interception.
5. The local proxy (e.g., Nginx), upon receiving the connection, resolves hello-world-service.io (via upstream DNS or static mapping) and forwards the request to the real destination.
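A minimal local sketch of this simulation; the ports, IPs, and the hello-world-service.io name are illustrative, and curl needs a c-ares build for --dns-servers (otherwise point the system resolver at the local CoreDNS):

```sh
# 1. Local CoreDNS answering hello-world-service.io with the placeholder 192.0.2.1.
cat > Corefile <<'EOF'
.:1053 {
    hosts {
        192.0.2.1 hello-world-service.io
        fallthrough
    }
    forward . /etc/resolv.conf
}
EOF
coredns -conf Corefile &

# 2. Redirect traffic aimed at the placeholder IP to the local proxy,
#    mimicking linkerd-init's interception.
sudo iptables -t nat -A OUTPUT -p tcp -d 192.0.2.1 --dport 80 -j REDIRECT --to-ports 4140

# 3. Nginx as the "proxy layer": it re-resolves the Host header at request time.
cat > nginx-defer-dns.conf <<'EOF'
events {}
http {
    server {
        listen 127.0.0.1:4140;
        location / {
            resolver 1.1.1.1;                       # the proxy's own upstream DNS
            proxy_set_header Host $http_host;
            proxy_pass http://$http_host$request_uri;
        }
    }
}
EOF
nginx -c "$PWD/nginx-defer-dns.conf"

# 4. The client resolves via the custom CoreDNS, connects to 192.0.2.1:80,
#    and is transparently redirected to Nginx.
curl -v --dns-servers 127.0.0.1:1053 http://hello-world-service.io
```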
Validate the feasibility of HTTPS proxying with "DNS resolution deferred to the proxy layer" using curl, CoreDNS, and Envoy:
1. curl accesses https://linkerd.io/.
2. CoreDNS resolves linkerd.io to a fixed virtual IP address, 192.0.2.1.
3. iptables (or an equivalent mechanism) on the local host redirects traffic destined for 192.0.2.1:443 to 127.0.0.1:4140, where Envoy is listening.
4. Envoy uses the tls_inspector listener filter to extract the SNI (i.e., linkerd.io) and dynamically resolves its real IP address.
5. Envoy connects to the real linkerd.io and then performs transparent TCP forwarding without inspecting or modifying application-layer content.
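A sketch of an Envoy config for this HTTPS case: tls_inspector extracts the SNI, an SNI dynamic-forward-proxy cluster resolves it at connect time, and the TLS bytes are passed through untouched. Field names follow the Envoy v3 API; the exact typed_config versions should be verified against the Envoy release in use.

```yaml
static_resources:
  listeners:
  - name: https_intercept
    address:
      socket_address: { address: 127.0.0.1, port_value: 4140 }
    listener_filters:
    - name: envoy.filters.listener.tls_inspector
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains:
    - filters:
      # Resolve the SNI (e.g., linkerd.io) at connection time instead of trusting
      # the placeholder IP the client connected to.
      - name: envoy.filters.network.sni_dynamic_forward_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.sni_dynamic_forward_proxy.v3.FilterConfig
          port_value: 443
          dns_cache_config:
            name: dynamic_forward_proxy_cache
            dns_lookup_family: V4_ONLY
      # Pass the TLS bytes through unmodified (no MITM, no decryption).
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: tls_passthrough
          cluster: dynamic_forward_proxy
  clusters:
  - name: dynamic_forward_proxy
    lb_policy: CLUSTER_PROVIDED
    cluster_type:
      name: envoy.clusters.dynamic_forward_proxy
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.clusters.dynamic_forward_proxy.v3.ClusterConfig
        dns_cache_config:
          name: dynamic_forward_proxy_cache
          dns_lookup_family: V4_ONLY
```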