
[BUG] vpc-gw cannot ping pods #4864

Open
lygopher opened this issue Dec 23, 2024 · 4 comments
Labels
bug Something isn't working subnet vpc

Comments

@lygopher

Kube-OVN Version

v1.12.19

Kubernetes Version

v1.27.13

Operation-system/Kernel Version

5.10.149-1.el7.x86_64

Description

I created a VPC, and under it a subnet (10.10.15.0/24) and a vpc-nat-gateway (10.10.15.254). Several pods were created in this subnet. From the gw pod, pinging the other pods in the subnet only works for some of them, and curl requests fail. All pods can ping the gw pod. Pod to pod: the pods the gw can ping can also be pinged by other pods, and the pods the gw cannot ping are unreachable from other pods as well.
[image]
Packet capture in the gw pod:
[image]
Packet capture in ovn-csi on the gw's node:
[image]
Packet capture in ovn-csi on the pod's node:
[image]
Packet capture in the pod:
[image]

Steps To Reproduce

vpc

---
apiVersion: kubeovn.io/v1
kind: Vpc
metadata:
  labels:
    ovn.kubernetes.io/appNamespace: '464446840910139393'
  name: vpc-464446840910139393-default
  resourceVersion: '17840'
spec:
  namespaces:
    - apaas-464446840910139393
  staticRoutes:
    - bfdId: ''
      cidr: 0.0.0.0/0
      ecmpMode: ''
      nextHopIP: 10.10.15.254
      policy: policyDst
      routeTable: ''

subnet

---
apiVersion: kubeovn.io/v1
kind: Subnet
metadata:
  finalizers:
    - kube-ovn-controller
  labels:
    ovn.kubernetes.io/appNamespace: '464446840910139393'
  name: subnet-464446840910139393-default
  resourceVersion: '976360'
spec:
  cidrBlock: 10.10.15.0/24
  default: false
  enableDHCP: true
  enableLb: true
  excludeIps:
    - 10.10.15.1..10.10.15.10
    - 10.10.15.200..10.10.15.254
  gateway: 10.10.15.1
  gatewayNode: ''
  gatewayType: distributed
  namespaces:
    - apaas-464446840910139393
  natOutgoing: false
  private: false
  protocol: IPv4
  provider: ovn
  vpc: vpc-464446840910139393-default

vpc-gateway-gw

---
apiVersion: kubeovn.io/v1
kind: VpcNatGateway
metadata:
  labels:
    ovn.kubernetes.io/appNamespace: '464446840910139393'
    ovn.kubernetes.io/subnet: subnet-464446840910139393-default
    ovn.kubernetes.io/vpc: vpc-464446840910139393-default
  name: gw-464446840910139393-default
  resourceVersion: '17925'
spec:
  affinity: {}
  externalSubnets:
    - ovn-vpc-external-network
  lanIp: 10.10.15.254
  qosPolicy: ''
  selector:
    - 'node-role.paas/gatewayIn: true'
  subnet: subnet-464446840910139393-default
  vpc: vpc-464446840910139393-default

Current Behavior

The gw pod cannot ping pods in the subnet and cannot send curl requests to them.

Expected Behavior

The gw pod can communicate normally with the pods in the subnet.

@lygopher lygopher added the bug Something isn't working label Dec 23, 2024

dosubot bot commented Dec 23, 2024

I found a similar closed issue, but no solution was found in the comments [1].

To troubleshoot the connectivity issue where the VPC gateway cannot ping or send curl requests to some pods, you can follow these steps:

  1. Check Network Policies and Firewall Rules: Ensure that there are no network policies or firewall rules blocking traffic between the gateway and the pods. Verify that the security groups or firewall settings allow ICMP and HTTP/HTTPS traffic.

  2. Verify Subnet and Gateway Configuration: Double-check the subnet and gateway configurations to ensure they are correctly set up. Make sure the gateway IP and subnet CIDR are correctly defined and that there are no IP conflicts.

  3. Inspect Routing and NAT Rules: Ensure that the routing and NAT rules are correctly configured. The static route in your VPC configuration should direct traffic to the correct next hop. Verify that the NAT gateway is properly handling SNAT/DNAT rules if applicable.

  4. Use Diagnostic Tools: Utilize the kubectl ko diagnose command to check for connectivity issues. This tool can help identify problems with node connectivity, subnet configurations, and specific IP/port issues [2][3].

  5. Capture and Analyze Traffic: Use tcpdump to capture traffic on the gateway, nodes, and pods to identify where packets are being dropped. This can help pinpoint whether the issue is with the gateway, the network, or the pod itself.

  6. Check Pod and Node Logs: Review the logs of the pods and nodes involved to see if there are any error messages or warnings that could indicate the source of the problem.

By following these steps, you should be able to identify and resolve the connectivity issue between the VPC gateway and the pods.
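The ping triage in steps 4–5 can be sketched as a small POSIX-shell helper. Everything cluster-specific here is an assumption: the gateway pod name and target IPs are placeholders taken from this issue's configs, and the probe command is injectable so the reporting logic also runs outside a cluster.

```shell
#!/bin/sh
# Sketch: probe each workload IP from the gateway pod and report which
# targets answer ping. PROBE is injectable; in a real cluster you might
# set something like (pod name is hypothetical):
#   PROBE="kubectl -n kube-system exec vpc-nat-gw-gw-464446840910139393-default-0 -- ping -c1 -W1"
: "${PROBE:=ping -c 1 -W 1}"

classify() {
  # $@: target IPs; prints "REACHABLE <ip>" or "UNREACHABLE <ip>" per target
  for ip in "$@"; do
    if $PROBE "$ip" >/dev/null 2>&1; then
      echo "REACHABLE $ip"
    else
      echo "UNREACHABLE $ip"
    fi
  done
}

# Example (placeholder IPs from the 10.10.15.0/24 subnet):
# classify 10.10.15.11 10.10.15.12 10.10.15.13
```

Comparing the REACHABLE/UNREACHABLE split against the subnet's pod list narrows the problem to the specific pods whose traffic should then be captured with tcpdump.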


@lygopher (Author)

[image]

@lygopher (Author)

Comparing ovn-trace output against a pod that can be pinged:
Cannot ping:

ct_lb_mark /* default (use --ct to customize) */
------------------------------------------------
 3. ls_out_acl_hint (northd.c:6405): !ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0, priority 4, uuid 1400995e
    reg0[8] = 1;
    reg0[10] = 1;
    next;
 4. ls_out_acl (northd.c:6748): reg0[10] == 1 && (outport == @mongodb.475.apaas.464446840910139393 && ip), priority 3000, uuid 46b8263a
    ct_commit { ct_mark.blocked = 1; };

Can ping:

ct_lb_mark /* default (use --ct to customize) */
------------------------------------------------
 3. ls_out_acl_hint (northd.c:6405): !ct.new && ct.est && !ct.rpl && ct_mark.blocked == 0, priority 4, uuid 1400995e
    reg0[8] = 1;
    reg0[10] = 1;
    next;
 8. ls_out_check_port_sec (northd.c:5845): 1, priority 0, uuid 6932340a
    reg0[15] = check_out_port_sec();
    next;
 9. ls_out_apply_port_sec (northd.c:5850): 1, priority 0, uuid f798250a
    output;
    /* output to "mysql-468-748d48f66b-j7vqh.apaas-464446840910139393", type "" */
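The difference between the two traces is the `ls_out_acl` stage committing `ct_mark.blocked = 1` for the failing pod, which points at a deny ACL on the port group `@mongodb.475.apaas.464446840910139393` (in Kube-OVN such port-group ACLs are typically generated from a NetworkPolicy or security group selecting that pod). As a triage sketch, saved trace output can be scanned for the offending flow's uuid; the awk patterns below are an assumption about the trace format shown above:

```shell
#!/bin/sh
# Sketch: scan saved ovn-trace output for an ls_out_acl flow that commits
# ct_mark.blocked = 1, and print that flow's uuid. The sample below is a
# shortened copy of the failing trace from this issue.
blocked_acl() {
  awk '/ls_out_acl \(/ { uuid = $NF }
       /ct_mark\.blocked = 1/ { print uuid }' "$1"
}

cat > /tmp/trace.txt <<'EOF'
 4. ls_out_acl (northd.c:6748): reg0[10] == 1 && (outport == @mongodb.475.apaas.464446840910139393 && ip), priority 3000, uuid 46b8263a
    ct_commit { ct_mark.blocked = 1; };
EOF

blocked_acl /tmp/trace.txt   # prints 46b8263a
```

With the uuid in hand, listing the port group's ACLs (e.g. via the `kubectl ko nbctl` passthrough, `acl-list <port-group>`) should show which NetworkPolicy or security-group rule produced the drop.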

@lygopher (Author)

lygopher commented Jan 6, 2025

@oilbeater
