Request for review: Accuracy check of my understanding of Linkerd's traffic interception and communication flow #14676
seal90 asked this question in Show and tell
Based on Linkerd's official documentation, I have analyzed Linkerd Traffic Interception via iptables and Linkerd Communication Workflow, using actual running Kubernetes objects as reference. The findings are compiled in the document below.
I would greatly appreciate your review of the workflow I've documented to verify its accuracy. Please feel free to point out any inaccuracies or gaps.
During this analysis, I identified two potential optimization opportunities for further discussion:
Dynamic Routing Rules
Extract field values (e.g., version) from request context—such as headers or query parameters—normalize them (e.g., convert to uppercase), and assign them to named variables (e.g., VERSION). These variables can then be interpolated in backendRefs to dynamically construct the target service name (e.g., hello-world-service-${VERSION}). If the resolved service does not exist, the request automatically falls back to a predefined default backend, ensuring it is always handled.
This pattern is already partially reflected in the Dynamic Routing Rule Selection phase of the Linkerd_Communication_Workflow below and could enable advanced use cases like canary releases or version-based routing. A hypothetical sketch follows.
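To make the proposal concrete, here is a purely hypothetical sketch of what such a route could look like. The `${VERSION}` interpolation and the implied "extract and normalize a header into a variable" step do not exist in today's HTTPRoute spec; only the surrounding fields follow the current policy.linkerd.io schema, and all names (hello-world-service, VERSION) are illustrative.

```yaml
# Hypothetical only: ${VERSION} interpolation is NOT supported by Linkerd or
# the Gateway API today; this sketch just illustrates the proposal above.
apiVersion: policy.linkerd.io/v1beta3
kind: HTTPRoute
metadata:
  name: hello-world-dynamic
  namespace: default
spec:
  parentRefs:
  - name: hello-world-service
    kind: Service
    group: core
    port: 8080
  rules:
  - backendRefs:
    # VERSION would be extracted from the "version" request header and uppercased.
    - name: hello-world-service-${VERSION}   # hypothetical interpolation
      port: 8080
    # Fallback when hello-world-service-${VERSION} does not resolve to a Service
    # (fallback-on-missing-Service semantics are also part of the proposal,
    # not current backendRefs behavior).
    - name: hello-world-service
      port: 8080
```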
Deferring DNS Resolution to the Linkerd Proxy
Instead of resolving the destination at the application layer, the initial DNS query from the application could return a placeholder (e.g., 192.0.2.1), while the actual target address is resolved by the linkerd-proxy using the original hostname at request time. This decouples the application from service discovery logic. However, it introduces complexity in HTTPS scenarios—particularly around SNI handling and certificate validation.
I did not find explicit documentation in Linkerd's official resources regarding “double DNS lookups” or such deferred resolution strategies.
To the Linkerd team: Have you evaluated DNS optimization approaches like this? What are the key design trade-offs involved?
Linkerd_Traffic_Interception_via_iptables
Inter-Pod (Different-Pod) Communication
https://linkerd.io/2.18/reference/iptables/
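For reference, here is a condensed sketch of the rules linkerd-init installs, approximating the iptables reference linked above; the UID (2102) and ports (4140/4143) match the walk-through below, and the exact skip-port list varies by configuration.

```sh
# Outbound: application traffic is redirected to the proxy's outbound listener
# on 4140; the proxy's own traffic (UID 2102) and loopback traffic are skipped.
iptables -t nat -N PROXY_INIT_OUTPUT
iptables -t nat -A PROXY_INIT_OUTPUT -m owner --uid-owner 2102 -j RETURN
iptables -t nat -A PROXY_INIT_OUTPUT -o lo -j RETURN
iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140
iptables -t nat -A OUTPUT -j PROXY_INIT_OUTPUT

# Inbound: traffic arriving at the pod is redirected to the proxy's inbound
# listener on 4143; the proxy's own admin/inbound ports are skipped.
iptables -t nat -N PROXY_INIT_REDIRECT
iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -m multiport --dports 4190,4191 -j RETURN
iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143
iptables -t nat -A PREROUTING -j PROXY_INIT_REDIRECT
```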
hello-world-call → hello-world-service via Linkerd

[hello-world-call Request Package]
[hello-world-call] Generates the request package:
src=10.244.0.1:9003 → dst=10.96.0.2:8080

[1] — (The package passes through here.)
[OUTPUT] Target address redirected to 127.0.0.1:4140.
Applied iptables rule: iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140
Resulting packet (DNAT): src=10.244.0.1:9003 → dst=127.0.0.1:4140
Tip: IPVS hooks at NF_IP_PRI_NAT_DST + 1, so IPVS does not interfere here.
[2] — (The package passes through here.)
[POSTROUTING] — (The package passes through here.)
[3] — Go to INPUT (destination is 127.0.0.1 or the pod IP) → bypasses PREROUTING.
[INPUT] — (The package passes through here.)
[4] — (The package passes through here.)

[hello-world-call's Linkerd Request Package]
[Linkerd outbound] Resolves the destination to the endpoint 10.244.0.2:8080 and opens a new connection:
src=10.244.0.1:19003 → dst=10.244.0.2:8080
Rule: traffic from the proxy (UID 2102) is excluded from NAT.
[PREROUTING (destination pod)] Target port redirected to 127.0.0.1:4143.
Applied rule: iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143
DNAT: dst=127.0.0.1:4143

[hello-world-service's Linkerd Request Package]
[Linkerd inbound] Linkerd forwards the request to the local application:
src=10.244.0.2:29003 → dst=10.244.0.2:8080
[10] — (The package passes through here.)
[OUTPUT] Do nothing; the applied rule excludes the proxy's own traffic (UID 2102) from NAT.
[11] — (The package passes through here.)
[POSTROUTING] — (The package passes through here.)
[12] — Go to INPUT
[INPUT] — (The package passes through here.)
[13] — (The package passes through here.)

[hello-world-service Response Package]
src=10.244.0.2:8080 → dst=10.244.0.2:29003
The outbound redirect rules (iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140) do not apply, because this traffic travels over the lo interface.

[hello-world-service's Linkerd Response Package]
src=10.244.0.2:8080 → dst=10.244.0.1:19003
The inbound redirect rule (iptables -t nat -A PROXY_INIT_REDIRECT -p tcp -j REDIRECT --to-port 4143) is not applied (replies on the established connection are handled by conntrack rather than traversing the nat chains again).

[hello-world-call's Linkerd Response Package]
src=127.0.0.1:4140 → dst=10.244.0.1:9003
After conntrack reverses the earlier DNAT, the application sees:
src=10.96.0.2:8080 → dst=10.244.0.1:9003

Intra-Pod (Same-Pod) Communication
In typical Kubernetes deployments, a Pod hosts a single service instance. When analyzing communication within the same Pod, we assume the workload calls its own service (e.g., via the service's ClusterIP or localhost).
When multiple services are deployed within a single Pod for communication, Linkerd may only partially support—or even completely fail to provide—service mesh capabilities (such as traffic management and observability). Therefore, deploying multiple services in a single Pod is not recommended.
Loopback Communication:
hello-world-call → hello-world-call via 127.0.0.1

[hello-world-call Request Package]
src=127.0.0.1:9003 → dst=127.0.0.1:8080
The iptables rules explicitly skip loopback traffic, so no redirection or NAT is applied.
Local delivery (destination 127.0.0.1) → bypasses PREROUTING entirely.

[hello-world-call Response Package]
src=127.0.0.1:8080 → dst=127.0.0.1:9003

ClusterIP Communication:
hello-world-call → hello-world-call via ClusterIP 10.96.0.1

[hello-world-call Request Package]
src=10.244.0.1:9003 → dst=10.96.0.1:8080
[OUTPUT] Target address redirected to 127.0.0.1:4140.
Applied iptables rule: iptables -t nat -A PROXY_INIT_OUTPUT -p tcp -j REDIRECT --to-port 4140
DNAT result: src=10.244.0.1:9003 → dst=127.0.0.1:4140
Local delivery (destination 127.0.0.1) → PREROUTING is bypassed.

[hello-world-call's Linkerd Request Package]
[Linkerd outbound] Resolves 10.96.0.1:8080 → 10.244.0.1:8080 (self) and opens a new connection:
src=10.244.0.1:19003 → dst=10.244.0.1:8080
Rule: traffic from the proxy (UID 2102) is excluded from NAT.

[hello-world-call Response Package]
src=10.244.0.1:8080 → dst=10.244.0.1:19003
The outbound redirect rules (REDIRECT --to-port 4140) do not apply.

[hello-world-call's Linkerd Response Package]
src=127.0.0.1:4140 → dst=10.244.0.1:9003
After conntrack reverses the earlier DNAT, the application sees:
src=10.96.0.1:8080 → dst=10.244.0.1:9003

Linkerd_Communication_Workflow
CRDs and Core Feature Implementation in Linkerd's Communication Workflow
Linkerd Communication Workflow: Single-Cluster and Multi-Cluster Scenarios
Linkerd provides transparent, secure, and observable service-to-service communication in Kubernetes environments. Its workflow can be categorized into two primary scenarios: single-cluster communication and multi-cluster communication, both requiring zero application code changes.
I. Single-Cluster Communication Workflow
https://linkerd.io/2.18/getting-started/
Linkerd usage within a single cluster consists of three phases: control plane installation, service deployment (with automatic sidecar injection), and secure inter-service communication, enabling seamless integration of mTLS, traffic management, and observability.
1. Control Plane Installation
The Linkerd control plane is deployed as a lightweight, highly available set of components in a dedicated namespace (e.g., linkerd), including:
- CRDs: define extended resources such as ServiceProfile, HTTPRoute, AuthorizationPolicy, and Server to support advanced traffic policies.
- linkerd-proxy-injector: a Mutating Admission Webhook responsible for automatic sidecar injection.
- linkerd-identity: issues short-lived mTLS certificates to workloads based on the SPIFFE standard and handles automatic certificate rotation.
- linkerd-destination: provides dynamic service discovery, load balancing, and policy distribution (e.g., routing rules, authorization policies).
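Assuming the linkerd CLI is already installed, the install flow from the getting-started guide referenced above boils down to roughly:

```sh
linkerd check --pre                          # verify the cluster meets the prerequisites
linkerd install --crds | kubectl apply -f -  # install the CRDs (ServiceProfile, HTTPRoute, ...)
linkerd install | kubectl apply -f -         # install the control plane into the linkerd namespace
linkerd check                                # verify the control plane is healthy
```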
2. Service Deployment: Automatic Sidecar Injection
When users deploy applications via kubectl or Helm, Linkerd transparently enhances workloads if the injection criteria are met:
Injection Trigger
The namespace or Pod template is annotated with linkerd.io/inject: enabled, for example:
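For example, enabling injection for a whole namespace (the same annotation can be placed on a Deployment's Pod template instead; the namespace name is illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: demo
  annotations:
    linkerd.io/inject: enabled   # every Pod created here gets the sidecar injected
```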
Injection Process
During Pod creation, the Kubernetes API server invokes the linkerd-proxy-injector webhook. The webhook receives the original Pod spec and returns a JSON Patch that dynamically injects the following:
- linkerd-init (Init Container)
  - Installs the iptables rules that redirect the Pod's traffic to linkerd-proxy
  - Requires NET_ADMIN and NET_RAW capabilities (declared via SecurityContext)
- linkerd-proxy (Sidecar Container)
  - Obtains mTLS certificates from linkerd-identity at startup and auto-rotates them
  - Uses SPIFFE-style identities (identity.l5d.io) for identity authentication
  - Sets LINKERD2_PROXY_* environment variables (e.g., control plane address, log level)
  - Mounts the TLS trust anchors (trust-anchors)
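A much-simplified sketch of what the resulting patched Pod spec looks like; images, versions, and the full env/volume list are omitted, and the exact names and mount paths vary by Linkerd version:

```yaml
spec:
  initContainers:
  - name: linkerd-init                 # installs the iptables rules, then exits
    securityContext:
      capabilities:
        add: ["NET_ADMIN", "NET_RAW"]
  containers:
  - name: linkerd-proxy                # sidecar that handles all inbound/outbound traffic
    env:
    - name: LINKERD2_PROXY_LOG         # one of the LINKERD2_PROXY_* settings
      value: warn,linkerd=info
    volumeMounts:
    - name: linkerd-identity-trust-roots        # TLS trust anchors (name/path vary by version)
      mountPath: /var/run/linkerd/identity/trust-roots
```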
3. Service Invocation: Secure, Intelligent, and Observable
Example: hello-world-call invoking hello-world-service (with multi-version canary support).

Scenario Setup
- hello-world-service runs three versions simultaneously: v1 (stable), v2 (canary A), and v3 (canary B).
- The stable Service selects version=v1, ensuring non-canary traffic remains stable.
- Pods are labeled app=hello-world-service, version=vX to enable metric collection by dimensions like service + version (e.g., via kube-state-metrics).

Invocation Flow
- hello-world-call sends an HTTP request → transparently captured by the co-located linkerd-proxy (outbound).
- Dynamic Routing Rule Selection: if a TrafficSplit or HTTPRoute exists, it determines the backend. Following the proposal above, a field extracted from the request context would be normalized into a named variable (e.g., VERSION); then, in backendRefs, the target service name would be constructed via variable interpolation, such as hello-world-service-${VERSION}. If the resulting service does not exist, fall back to a default service to handle the request. For example, extract the version field from the request header, name it VERSION, and use it to form the service name hello-world-service-${VERSION}.
- mTLS: the two linkerd-proxy instances exchange certificates using SPIFFE IDs (e.g., spiffe://cluster.local/ns/default/sa/hello-world-service). Certificates are issued by linkerd-identity, bound to Kubernetes ServiceAccounts, and short-lived (default: 24 hours).
- The destination linkerd-proxy (inbound) performs:
  - Authentication via MeshTLSAuthentication (mTLS) or NetworkAuthentication (IP-based)
  - Authorization via an AuthorizationPolicy, which binds a Server (defining port/protocol) + authentication method, or an HTTPRoute (defining request matchers) + authentication method
- After successful authentication, the inbound proxy forwards the request to the application process.
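A sketch of the inbound policy objects described above, using the policy.linkerd.io CRDs; apiVersions and the identity string format should be checked against the installed Linkerd version, and all names follow the hello-world example:

```yaml
apiVersion: policy.linkerd.io/v1beta3
kind: Server
metadata:
  name: hello-world-service-http
  namespace: default
spec:
  podSelector:
    matchLabels:
      app: hello-world-service
  port: 8080
  proxyProtocol: HTTP/1
---
apiVersion: policy.linkerd.io/v1alpha1
kind: MeshTLSAuthentication
metadata:
  name: hello-world-callers
  namespace: default
spec:
  identities:
  - "hello-world-call.default.serviceaccount.identity.linkerd.cluster.local"
---
apiVersion: policy.linkerd.io/v1alpha1
kind: AuthorizationPolicy
metadata:
  name: hello-world-service-authz
  namespace: default
spec:
  targetRef:
    group: policy.linkerd.io
    kind: Server
    name: hello-world-service-http
  requiredAuthenticationRefs:
  - group: policy.linkerd.io
    kind: MeshTLSAuthentication
    name: hello-world-callers
```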
II. Multi-Cluster Communication Workflow (Linkerd Multicluster)
https://linkerd.io/2.18/tasks/multicluster/
Linkerd's multi-cluster capability extends the single-cluster model to enable cross-cluster service discovery and transparent invocation, also structured into installation, deployment, and invocation phases.
1. Installation Phase: Core Components and Permissions
Global Prerequisites
Each cluster has the Linkerd multicluster extension installed, which defines the CRD Link.
Per-Cluster Deployment
- Link CR instance: declares connectivity to peer clusters.
- linkerd-gateway Deployment: injected with linkerd-proxy via linkerd inject to enable mTLS proxying; the main process is a pause container, while the actual functionality is handled by the sidecar.
- linkerd-service-mirror-remote-access ServiceAccount + ClusterRoleBinding: grants the peer cluster read access (get/list/watch services/endpoints); exposed as linkerd-service-mirror-remote-access-<cluster-name>.
- linkerd-local-service-mirror Deployment: uses the peer cluster's remote-access credentials to watch exported services and create local mirror services.
Establishing Connections
Run linkerd multicluster link to generate the connection configuration (see the sketch below).
Clusters become mutually aware, forming a multi-cluster topology.
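With kubectl contexts named west and east (as in the multicluster task linked above), the linking step looks roughly like:

```sh
# Make west aware of east, then east aware of west.
linkerd --context=east multicluster link --cluster-name east | kubectl --context=west apply -f -
linkerd --context=west multicluster link --cluster-name west | kubectl --context=east apply -f -
```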
2. Deployment Phase: Service Export and Mirroring
Example with west and east clusters:
Export Services in East Cluster
Deploy frontend and podinfo (and their Services) in the test namespace.
Annotate (label) the services to be exported, for example:
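The service mirror selects services by label; a minimal sketch, with the label key taken from the multicluster docs and the context/namespace names from the example above:

```sh
kubectl --context=east -n test label svc/frontend svc/podinfo \
  mirror.linkerd.io/exported=true
```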
Automatic Synchronization in West Cluster
- linkerd-local-service-mirror (in west) uses the linkerd-service-mirror-remote-access permissions from east to detect newly exported services.
- Mirror services are created in the test namespace: frontend-east and podinfo-east.
- Their endpoints point to the east cluster's linkerd-gateway.
Symmetric Mirroring
Services exported from west are similarly mirrored into east.
Traffic Splitting (Optional)
Create a TrafficSplit in west to blend local and remote traffic, for example:
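A sketch of such a TrafficSplit, assuming the frontend/podinfo example above; the SMI apiVersion may differ depending on the Linkerd version:

```yaml
apiVersion: split.smi-spec.io/v1alpha2
kind: TrafficSplit
metadata:
  name: podinfo
  namespace: test
spec:
  service: podinfo            # the apex service that clients address
  backends:
  - service: podinfo          # local (west) backend
    weight: 50
  - service: podinfo-east     # mirrored (east) backend
    weight: 50
```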
3. Invocation Phase: Transparent Cross-Cluster Communication
- A frontend Pod in the west cluster calls podinfo.
- The linkerd-proxy in frontend intercepts the request.
- Based on the TrafficSplit rule, it selects podinfo-east with 50% probability.
- The request is sent (over mTLS) to the east cluster's linkerd-gateway.
- The gateway forwards the request to a podinfo Pod in east.
- The inbound proxy in that Pod delivers the request to the podinfo application in east.

Linkerd_DNS_resolution_optimisation
Optimizing DNS Resolution in Linkerd Service Communication on Kubernetes
In a standard Kubernetes + Linkerd environment, inter-service communication typically follows this flow (using hello-world-call invoking hello-world-service as an example):
1. hello-world-call initiates a request to hello-world-service (e.g., http://hello-world-service).
2. The application's DNS lookup resolves hello-world-service to a ClusterIP (e.g., 10.96.0.10).
3. The request is intercepted by the linkerd-proxy in the same Pod (via iptables-based transparent redirection).
4. linkerd-proxy performs a second resolution of the destination (via the control plane's destination service, to a concrete endpoint rather than the ClusterIP).
5. linkerd-proxy establishes a connection to the final resolved destination.

📌 Problem Identification: DNS resolution occurs twice: once by the application (to the ClusterIP) and again by linkerd-proxy (to the real backend). The first resolution is redundant in Linkerd's data plane, as traffic never actually traverses the Service's ClusterIP.

Optimization Idea: Defer DNS Resolution to the Proxy Layer
To eliminate redundant resolution, consider the following approach:
Potential Benefits
- The redundant application-side DNS lookup is eliminated.
- All destination resolution is centralized in the linkerd-proxy, enabling consistent enforcement of access control, auditing, rate limiting, and security policies.
Potential Drawbacks and Challenges
- For HTTPS traffic, the linkerd-proxy must extract the Server Name Indication (SNI) from the TLS ClientHello to identify the target domain.
  ■ Note: Since DNS is intercepted, clients cannot retrieve ECH public keys via DNS (e.g., HTTPS SVCB/HTTPSSVC records), so ECH is effectively non-functional in this model today.
- An explicit forward-proxy model would rely on HTTP CONNECT, which most non-browser clients (e.g., gRPC, database drivers) do not support. Transparent MITM proxying would require dynamic certificate generation, increasing complexity and trust management overhead.
- Per-request destination control already exists in Linkerd via HTTPRoute and custom headers (e.g., l5d-dst-override).
- The proxy's own DNS lookups would need to be exempted from interception (e.g., a skip-dns-resolve-uid-owner style rule), increasing operational complexity.

Validate the feasibility of HTTP proxying with "DNS resolution deferred to the proxy layer" using curl, CoreDNS, and Nginx
To validate the feasibility of “deferring DNS resolution to the proxy layer,” the following local simulation can be performed:
1. Simulate an application (e.g., curl) accessing the target service: http://hello-world-service.io.
2. Configure a custom CoreDNS instance to resolve hello-world-service.io to a predefined virtual IP, e.g., 192.0.2.1 (an address that maps to no real service).
3. curl attempts to connect to 192.0.2.1:80 based on the DNS response.
4. Use iptables or network policies to transparently redirect traffic destined for 192.0.2.1:80 to a local proxy port (e.g., 127.0.0.1:4140), mimicking Linkerd's sidecar interception.
5. The local proxy (e.g., Nginx), upon receiving the connection, resolves hello-world-service.io (via upstream DNS or static mapping) and forwards the request to the real destination.
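A minimal local sketch of this simulation; the ports, IPs, and the hello-world-service.io name are illustrative, and curl needs a c-ares build for --dns-servers (otherwise point the system resolver at the local CoreDNS):

```sh
# 1. Local CoreDNS answering hello-world-service.io with the placeholder 192.0.2.1.
cat > Corefile <<'EOF'
.:1053 {
    hosts {
        192.0.2.1 hello-world-service.io
        fallthrough
    }
    forward . /etc/resolv.conf
}
EOF
coredns -conf Corefile &

# 2. Redirect traffic aimed at the placeholder IP to the local proxy,
#    mimicking linkerd-init's interception.
sudo iptables -t nat -A OUTPUT -p tcp -d 192.0.2.1 --dport 80 -j REDIRECT --to-ports 4140

# 3. Nginx as the "proxy layer": it re-resolves the Host header at request time.
cat > nginx-defer-dns.conf <<'EOF'
events {}
http {
    server {
        listen 127.0.0.1:4140;
        location / {
            resolver 1.1.1.1;                       # the proxy's own upstream DNS
            proxy_set_header Host $http_host;
            proxy_pass http://$http_host$request_uri;
        }
    }
}
EOF
nginx -c "$PWD/nginx-defer-dns.conf"

# 4. The client resolves via the custom CoreDNS, connects to 192.0.2.1:80,
#    and is transparently redirected to Nginx.
curl -v --dns-servers 127.0.0.1:1053 http://hello-world-service.io
```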
Validate the feasibility of HTTPS proxying with "DNS resolution deferred to the proxy layer" using curl, CoreDNS, and Envoy:
1. curl accesses https://linkerd.io/.
2. CoreDNS resolves linkerd.io to a fixed virtual IP address, 192.0.2.1.
3. iptables (or an equivalent mechanism) on the local host redirects traffic destined for 192.0.2.1:443 to 127.0.0.1:4140, where Envoy is listening.
4. Envoy uses the tls_inspector listener filter to extract the SNI (i.e., linkerd.io) and dynamically resolves its real IP address.
5. Envoy connects to the real linkerd.io and then performs transparent TCP forwarding without inspecting or modifying application-layer content.
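A sketch of an Envoy config for this HTTPS case: tls_inspector extracts the SNI, an SNI dynamic-forward-proxy cluster resolves it at connect time, and the TLS bytes are passed through untouched. Field names follow the Envoy v3 API; the exact typed_config versions should be verified against the Envoy release in use.

```yaml
static_resources:
  listeners:
  - name: https_intercept
    address:
      socket_address: { address: 127.0.0.1, port_value: 4140 }
    listener_filters:
    - name: envoy.filters.listener.tls_inspector
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.filters.listener.tls_inspector.v3.TlsInspector
    filter_chains:
    - filters:
      # Resolve the SNI (e.g., linkerd.io) at connection time instead of trusting
      # the placeholder IP the client connected to.
      - name: envoy.filters.network.sni_dynamic_forward_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.sni_dynamic_forward_proxy.v3.FilterConfig
          port_value: 443
          dns_cache_config:
            name: dynamic_forward_proxy_cache
            dns_lookup_family: V4_ONLY
      # Pass the TLS bytes through unmodified (no MITM, no decryption).
      - name: envoy.filters.network.tcp_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.tcp_proxy.v3.TcpProxy
          stat_prefix: tls_passthrough
          cluster: dynamic_forward_proxy
  clusters:
  - name: dynamic_forward_proxy
    lb_policy: CLUSTER_PROVIDED
    cluster_type:
      name: envoy.clusters.dynamic_forward_proxy
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.clusters.dynamic_forward_proxy.v3.ClusterConfig
        dns_cache_config:
          name: dynamic_forward_proxy_cache
          dns_lookup_family: V4_ONLY
```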