BPF Trace (bpftrace)

NOTE: retina bpftrace is an experimental feature. The flags and behavior may change in future versions.

The retina bpftrace command traces network issues on a Kubernetes node in real time using eBPF/bpftrace.

This is useful for debugging connectivity problems such as:

  • Packet drops (with reason codes)
  • TCP RST events (connection resets)
  • Socket errors (ECONNREFUSED, ETIMEDOUT, etc.)
  • TCP retransmissions (packet loss indicators)

Getting Started

Trace network issues on a node:

# Basic usage - trace all network issues on a node
kubectl retina bpftrace <node-name>

# With custom duration
kubectl retina bpftrace <node-name> --duration 60s

# Filter by IP address
kubectl retina bpftrace <node-name> --ip 10.224.0.5

# Filter by CIDR
kubectl retina bpftrace <node-name> --cidr 10.224.0.0/16

# Output as JSON (for parsing)
kubectl retina bpftrace <node-name> -o json

# Trace only specific event types
kubectl retina bpftrace <node-name> --drops --rst

# Specify custom timeout for trace pod startup
kubectl retina bpftrace <node-name> --startup-timeout 120s

Run kubectl retina bpftrace -h for full documentation and examples.
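
When the trace is launched from a script rather than typed by hand, it can help to assemble the command line programmatically. The following is a minimal sketch, not part of Retina: build_bpftrace_command is a hypothetical helper, and it only uses flags listed on this page. Whether --ip and --cidr can be combined in one invocation is not stated here, so check kubectl retina bpftrace -h before doing so.

```python
import shlex
from typing import List, Optional, Sequence

# Event-type flags documented on this page.
EVENT_FLAGS = ("--drops", "--rst", "--errors", "--retransmits", "--all")


def build_bpftrace_command(
    node: str,
    duration: Optional[str] = None,
    ip: Optional[str] = None,
    cidr: Optional[str] = None,
    events: Sequence[str] = (),
    json_output: bool = False,
) -> List[str]:
    """Build an argv list for `kubectl retina bpftrace` (hypothetical helper)."""
    cmd = ["kubectl", "retina", "bpftrace", node]
    if duration:
        cmd += ["--duration", duration]
    if ip:
        cmd += ["--ip", ip]
    if cidr:
        cmd += ["--cidr", cidr]
    for flag in events:
        if flag not in EVENT_FLAGS:
            raise ValueError(f"unknown event flag: {flag}")
        cmd.append(flag)
    if json_output:
        cmd += ["-o", "json"]
    return cmd


if __name__ == "__main__":
    argv = build_bpftrace_command(
        "aks-nodepool1-12345678-vmss000000",
        duration="60s",
        events=["--drops", "--rst"],
        json_output=True,
    )
    # Print a copy-pasteable command line; pass argv to subprocess.run to execute it.
    print(shlex.join(argv))
```

Returning a list instead of a single string avoids shell quoting problems when the list is handed to subprocess.run.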

Event Types

The bpftrace command captures several types of network events:

DROP - Packet Drops

Captures packets dropped by the kernel with reason codes. Common reasons include:

Code  Name            Description
0     NOT_SPECIFIED   Unspecified reason
1     NO_SOCKET       No listening socket
3     TCP_CSUM        TCP checksum error
6     NETFILTER_DROP  Dropped by NetworkPolicy/iptables
8     IP_CSUM         IP checksum error

The full list of drop reasons is kernel-version specific and is printed at the start of each trace.
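
When post-processing DROP events, a small lookup table can turn the numeric code back into a name. The sketch below copies only the codes listed in the table above; because the numbering is kernel-version specific, treat the mapping as an assumption and prefer the list printed at the start of your own trace. DROP_REASONS and drop_reason_name are illustrative names, not part of Retina.

```python
# Codes copied from the table on this page; other kernels may number reasons differently.
DROP_REASONS = {
    0: "NOT_SPECIFIED",
    1: "NO_SOCKET",
    3: "TCP_CSUM",
    6: "NETFILTER_DROP",
    8: "IP_CSUM",
}


def drop_reason_name(code: int) -> str:
    """Return the drop reason name, or a placeholder for codes not in the table."""
    return DROP_REASONS.get(code, f"UNKNOWN({code})")


if __name__ == "__main__":
    for code in (6, 1, 42):
        print(code, drop_reason_name(code))
```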

RST_SENT / RST_RECV - TCP Reset Events

Captures TCP RST packets sent or received. These indicate:

  • Connection refused (no service listening)
  • Connection reset by peer
  • Firewall rejecting connections

SOCK_ERR - Socket Errors

Captures socket-level errors reported to applications:

Code  Name          Description
104   ECONNRESET    Connection reset by peer
110   ETIMEDOUT     Connection timed out
111   ECONNREFUSED  Connection refused
113   EHOSTUNREACH  No route to host
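
The codes in this table are Linux errno values, so codes that are not listed can be decoded with Python's standard errno module. This is a sketch that assumes it runs on a Linux machine; other operating systems number some of these errors differently. describe_sock_err is an illustrative name.

```python
import errno
import os


def describe_sock_err(code: int) -> str:
    """Return 'NAME (message)' for an errno value, as numbered on this machine."""
    name = errno.errorcode.get(code, f"UNKNOWN({code})")
    try:
        message = os.strerror(code)
    except ValueError:
        # Some platforms reject unknown error numbers instead of returning a message.
        message = "no description available"
    return f"{name} ({message})"


if __name__ == "__main__":
    for code in (errno.ECONNRESET, errno.ETIMEDOUT, errno.ECONNREFUSED, errno.EHOSTUNREACH):
        print(code, describe_sock_err(code))
```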

RETRANS - TCP Retransmissions

Captures TCP segment retransmissions, which indicate:

  • Packet loss in the network
  • Network congestion
  • Slow or unresponsive peers

For RETRANS events, the reason code shows the TCP connection state at the time of the retransmission.
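
This page does not list the state values. The sketch below assumes the reason code uses the Linux kernel's numeric TCP states (as defined in include/net/tcp_states.h); verify that against your own trace output before relying on it. Under that assumption, a RETRANS event with code 2 is a retransmitted SYN, meaning the connection never completed its handshake.

```python
from enum import IntEnum


class TcpState(IntEnum):
    """Linux kernel TCP state numbers (include/net/tcp_states.h)."""
    ESTABLISHED = 1
    SYN_SENT = 2
    SYN_RECV = 3
    FIN_WAIT1 = 4
    FIN_WAIT2 = 5
    TIME_WAIT = 6
    CLOSE = 7
    CLOSE_WAIT = 8
    LAST_ACK = 9
    LISTEN = 10
    CLOSING = 11
    NEW_SYN_RECV = 12


def tcp_state_name(code: int) -> str:
    """Map a RETRANS reason code to a state name (assumes kernel numbering)."""
    try:
        return TcpState(code).name
    except ValueError:
        return f"UNKNOWN({code})"


if __name__ == "__main__":
    print(tcp_state_name(2))
    print(tcp_state_name(1))
```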

Output Format

Table Format (default)

TIME         TYPE       REASON             PROBE              SRC -> DST
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
18:28:12     SOCK_ERR   ECONNREFUSED       inet_sk_error_report 127.0.0.1:35779 -> 127.0.0.1:9999
18:28:27     DROP       6                  kfree_skb            10.224.0.60:41929 -> 10.224.0.39:80
18:28:28     RETRANS    2                  tcp_retransmit_skb   10.224.0.60:41929 -> 10.224.0.39:80
18:28:33     RST_SENT   -                  tcp_send_reset       10.224.0.47:38470 -> 20.161.216.95:443

JSON Format (-o json)

{"time":"18:28:12","type":"SOCK_ERR","reason_code":111,"probe":"inet_sk_error_report","src_ip":"127.0.0.1","src_port":35779,"dst_ip":"127.0.0.1","dst_port":9999}
{"time":"18:28:27","type":"DROP","reason_code":6,"probe":"kfree_skb","src_ip":"10.224.0.60","src_port":41929,"dst_ip":"10.224.0.39","dst_port":80}
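
Each event is a single JSON object on its own line, so the output can be consumed line by line. The following sketch counts events by type. It assumes that any line not starting with "{" (a banner or a blank line, for example) can be ignored; parse_events and count_by_type are illustrative names, and the sample input is the two events shown above.

```python
import json
from collections import Counter
from typing import Dict, Iterable, Iterator

# The two sample events shown above, one JSON object per line.
SAMPLE = [
    '{"time":"18:28:12","type":"SOCK_ERR","reason_code":111,"probe":"inet_sk_error_report","src_ip":"127.0.0.1","src_port":35779,"dst_ip":"127.0.0.1","dst_port":9999}',
    '{"time":"18:28:27","type":"DROP","reason_code":6,"probe":"kfree_skb","src_ip":"10.224.0.60","src_port":41929,"dst_ip":"10.224.0.39","dst_port":80}',
]


def parse_events(lines: Iterable[str]) -> Iterator[Dict]:
    """Yield one dict per JSON event line, skipping anything else."""
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # blank line or other non-event output
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # truncated or malformed line
        if "type" in event:
            yield event


def count_by_type(lines: Iterable[str]) -> Counter:
    """Count events per event type."""
    return Counter(event["type"] for event in parse_events(lines))


if __name__ == "__main__":
    for event_type, count in count_by_type(SAMPLE).items():
        print(event_type, count)
```

To process a live trace, pass sys.stdin in place of SAMPLE and pipe the output of kubectl retina bpftrace <node-name> -o json into the script.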

Flags

Flag                          Type      Default    Description
--duration                    duration  0          Duration to run the trace (0 = until Ctrl-C)
--startup-timeout             duration  30s        Timeout for trace pod startup
--ip                          string    ""         Filter events by IP address (matches src or dst)
--cidr                        string    ""         Filter events by CIDR (matches src or dst)
--drops                       bool      false      Enable only packet drop events
--rst                         bool      false      Enable only TCP RST events
--errors                      bool      false      Enable only socket error events
--retransmits                 bool      false      Enable only retransmit events
--all                         bool      false      Enable all event types (default when no event flags specified)
-o, --output                  string    table      Output format: table or json
--retina-shell-image-repo     string    (default)  Override the retina-shell image repository
--retina-shell-image-version  string    (default)  Override the retina-shell image version
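
Both --ip and --cidr match an event when either its source or its destination qualifies. The sketch below reproduces that rule on already-captured JSON events using Python's ipaddress module. It only illustrates the matching semantics and says nothing about how Retina implements the filter; matches_ip and matches_cidr are illustrative names.

```python
import ipaddress
from typing import Dict


def _endpoints(event: Dict):
    """Return the parsed source and destination addresses of an event."""
    return (
        ipaddress.ip_address(event["src_ip"]),
        ipaddress.ip_address(event["dst_ip"]),
    )


def matches_ip(event: Dict, ip: str) -> bool:
    """True if the event's source or destination equals the given address."""
    return ipaddress.ip_address(ip) in _endpoints(event)


def matches_cidr(event: Dict, cidr: str) -> bool:
    """True if the event's source or destination falls inside the given network."""
    network = ipaddress.ip_network(cidr, strict=False)
    return any(addr in network for addr in _endpoints(event))


if __name__ == "__main__":
    event = {"src_ip": "10.224.0.60", "dst_ip": "10.224.0.39"}
    print(matches_ip(event, "10.224.0.39"))
    print(matches_cidr(event, "10.224.0.0/16"))
```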

Example: Debugging NetworkPolicy Drops

When pods can't communicate due to NetworkPolicy:

# Start tracing on the node where the destination pod runs
kubectl retina bpftrace aks-nodepool1-12345678-vmss000000 --drops --duration 60s

# In another terminal, attempt the connection
kubectl exec -it client-pod -- curl http://server-service:80

You'll see output like:

18:14:41     DROP       6                  kfree_skb            10.224.0.34:33061 -> 10.224.0.49:80
18:14:42     RETRANS    2                  tcp_retransmit_skb   10.224.0.34:33061 -> 10.224.0.49:80
18:14:42     DROP       6                  kfree_skb            10.224.0.34:33061 -> 10.224.0.49:80

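The DROP events with reason code 6 (NETFILTER_DROP) show that a netfilter rule is dropping the traffic, which in this scenario points to the NetworkPolicy. The RETRANS event on the same flow is the client retrying after its packet was dropped.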
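
On a busy node it can be easier to aggregate drops per destination than to read individual rows. The sketch below counts NETFILTER_DROP events by destination IP and port. The sample events are hand-written from the rows above using the field names of the JSON format; the code value 6 comes from the drop reason table on this page and may differ on other kernels.

```python
from collections import Counter
from typing import Dict, Iterable

NETFILTER_DROP = 6  # from the drop reason table on this page; kernel-version specific

# Hand-written JSON-style events mirroring the three rows above.
SAMPLE = [
    {"time": "18:14:41", "type": "DROP", "reason_code": 6, "probe": "kfree_skb",
     "src_ip": "10.224.0.34", "src_port": 33061, "dst_ip": "10.224.0.49", "dst_port": 80},
    {"time": "18:14:42", "type": "RETRANS", "reason_code": 2, "probe": "tcp_retransmit_skb",
     "src_ip": "10.224.0.34", "src_port": 33061, "dst_ip": "10.224.0.49", "dst_port": 80},
    {"time": "18:14:42", "type": "DROP", "reason_code": 6, "probe": "kfree_skb",
     "src_ip": "10.224.0.34", "src_port": 33061, "dst_ip": "10.224.0.49", "dst_port": 80},
]


def netfilter_drops_by_destination(events: Iterable[Dict]) -> Counter:
    """Count NETFILTER_DROP events per (dst_ip, dst_port)."""
    counts: Counter = Counter()
    for event in events:
        if event.get("type") == "DROP" and event.get("reason_code") == NETFILTER_DROP:
            counts[(event["dst_ip"], event["dst_port"])] += 1
    return counts


if __name__ == "__main__":
    for (ip, port), count in netfilter_drops_by_destination(SAMPLE).most_common():
        print(f"{ip}:{port} dropped {count} time(s)")
```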

Example: Debugging Connection Refused

When connecting to a service that's not listening:

kubectl retina bpftrace <node-name> --rst --errors --ip 10.224.0.5

You'll see output like:

18:19:05     RST_RECV   -                  tcp_receive_reset    10.224.0.10:35267 -> 10.224.0.5:8080
18:19:05     SOCK_ERR   ECONNREFUSED       inet_sk_error_report 10.224.0.10:35267 -> 10.224.0.5:8080

This shows the TCP RST and corresponding socket error, indicating no service is listening on port 8080.
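
The same correlation can be automated by grouping events per flow and keeping the flows that show both a received RST and an ECONNREFUSED error. This is a sketch under assumptions: the sample events are hand-written from the rows above using the field names of the JSON format, the RST_RECV event is assumed to carry no meaningful reason code, and refused_flows is an illustrative name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

ECONNREFUSED = 111  # from the socket error table on this page

Flow = Tuple[str, int, str, int]

# Hand-written JSON-style events mirroring the two rows above.
SAMPLE = [
    {"time": "18:19:05", "type": "RST_RECV", "probe": "tcp_receive_reset",
     "src_ip": "10.224.0.10", "src_port": 35267, "dst_ip": "10.224.0.5", "dst_port": 8080},
    {"time": "18:19:05", "type": "SOCK_ERR", "reason_code": 111, "probe": "inet_sk_error_report",
     "src_ip": "10.224.0.10", "src_port": 35267, "dst_ip": "10.224.0.5", "dst_port": 8080},
]


def flow_key(event: Dict) -> Flow:
    """Identify a flow by its source and destination address and port."""
    return (event["src_ip"], event["src_port"], event["dst_ip"], event["dst_port"])


def refused_flows(events: Iterable[Dict]) -> List[Flow]:
    """Return flows that show both a received RST and an ECONNREFUSED socket error."""
    seen = defaultdict(set)
    for event in events:
        if event.get("type") == "RST_RECV":
            seen[flow_key(event)].add("rst")
        elif event.get("type") == "SOCK_ERR" and event.get("reason_code") == ECONNREFUSED:
            seen[flow_key(event)].add("err")
    return [flow for flow, marks in seen.items() if marks == {"rst", "err"}]


if __name__ == "__main__":
    for src_ip, src_port, dst_ip, dst_port in refused_flows(SAMPLE):
        print(f"{src_ip}:{src_port} was refused by {dst_ip}:{dst_port}")
```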

Requirements

  • Linux nodes (Windows not supported)
  • Kernel with BTF support (5.x+ recommended)

Limitations

  • IPv6 filtering not currently supported
  • Cilium CNI: DROP events won't capture Cilium policy drops (Cilium uses eBPF datapath, not netfilter/kfree_skb)