help request: apisix don't sync data from etcd #11390

Open
jujiale opened this issue Jul 5, 2024 · 4 comments

@jujiale
Contributor

jujiale commented Jul 5, 2024

Description

Hello, I ran into the following situation in our production APISIX cluster and on one dev APISIX node.
Our production setup has 4 environments, each with 3 APISIX instances deployed via RPM. One cluster (we call it A here) shows odd behavior. Let me describe:

  1. We modify the cluster A config in apisix-dashboard and submit it. In etcd, the change is stored correctly, but when I call /v1/route/route_id, the whole config in the cluster A instance is still the old version. No matter how many times we modify the config, the config in etcd is correct and its update_time is correct, but the config in the instance is old, and its update_time is very old and never changes.
    For example, the etcd config:

```
/test/apisix/routes/515483732765836994
{"id":"515483732765836994","create_time":1716781847,"update_time":1720085553,"uris":["/menu.service.query/m","/menu.service.query/pm/*"],"name":"aaa","priority":10,"methods":["GET","POST","PUT","DELETE","PATCH","HEAD","OPTIONS","CONNECT","TRACE"],"host":"xxx.com","upstream_id":"515483516306196172","status":1}
```

When I invoke /v1/route/route_id, the config looks like this:

```
{
    "key": "/test/apisix/routes/515483732765836994",
    "createdIndex": 946,
    "has_domain": false,
    "clean_handlers": {},
    "modifiedIndex": 946,
    "update_count": 0,
    "orig_modifiedIndex": 946,
    "value": {
        "priority": 10,
        "host": "xxx.com",
        "name": "aaa",
        "methods": [
            "GET",
            "POST",
            "PUT",
            "DELETE",
            "PATCH",
            "HEAD",
            "OPTIONS",
            "CONNECT",
            "TRACE"
        ],
        "id": "515483732765836994",
        "uris": [
            "/menu.service.query/m",
            "/menu.service.query/w",
            "/menu.service.query/pm/*"
        ],
        "update_time": 1716781847,
        "create_time": 1716781847,
        "status": 1,
        "upstream_id": "515483516306196172"
    }
}
```

We can see that the uris differ and the update_time differs, but in the other clusters everything works well.

2. The APISIX log shows the following. Note that the error log is output continuously, so the issue seems to occur all the time.

```
172.xx.61.52, server: _, request: "POST /menu.service.query/w HTTP/1.1", host: "xxx.com"
2024/07/04 16:00:55 [error] 16235#16235: *143253446 [lua] config_util.lua:86: failed to find clean_handler with idx 1, client: 172.xx.61.47, server: _, request: "POST /menu.service.query/w HTTP/1.1", host: "xxx.com"
2024/07/04 16:00:55 [error] 16234#16234: *143283913 [lua] config_etcd.lua:584: failed to fetch data from etcd: /test/apisix/apisix/core/config_util.lua:104: attempt to index local 'item' (a boolean value)
stack traceback:
  /test/apisix/apisix/core/config_util.lua:104: in function 'fire_all_clean_handlers'
  /test/apisix/apisix/core/config_etcd.lua:315: in function 'sync_data'
  /test/apisix/apisix/core/config_etcd.lua:541: in function </test/apisix/apisix/core/config_etcd.lua:532>
  [C]: in function 'xpcall'
  /test/apisix/apisix/core/config_etcd.lua:532: in function </test/apisix/apisix/core/config_etcd.lua:513>,  etcd key: /test/apisix/upstreams, context: ngx.timer
2024/07/04 16:00:55 [error] 16235#16235: *143280176 [lua] config_util.lua:86: failed to find clean_handler with idx 1, client: 172.xx.61.47, server: _, request: "POST /menu.service.validate/w HTTP/1.1", host: "xxx.com"
2024/07/04 16:00:55 [error] 16240#16240: *143284010 [lua] config_etcd.lua:584: failed to fetch data from etcd: /test/apisix/apisix/core/config_util.lua:104: attempt to index local 'item' (a boolean value)
stack traceback:
  /test/apisix/apisix/core/config_util.lua:104: in function 'fire_all_clean_handlers'
  /test/apisix/apisix/core/config_etcd.lua:315: in function 'sync_data'
  /test/apisix/apisix/core/config_etcd.lua:541: in function </test/apisix/apisix/core/config_etcd.lua:532>
  [C]: in function 'xpcall'
  /test/apisix/apisix/core/config_etcd.lua:532: in function </test/apisix/apisix/core/config_etcd.lua:513>,  etcd key: /test/apisix/janus/routes, context: ngx.timer
```

3. Capturing traffic on port 2379 on the APISIX instance, I found:

```
66
{"error":{"grpc_code":1,"http_code":408,"message":"context canceled","http_status":"Request Timeout"}}
0
```

I also found many requests timing out after more than 30 s, as shown below:
[image]

4. I can confirm that etcd is healthy; even after I restart etcd, the problem persists. The network between APISIX and etcd is fine: some /v3/watch requests return correctly, but APISIX does not seem to apply the returned config.
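
The mismatch described in step 1 can be checked mechanically by diffing the value stored in etcd against the `value` object returned by /v1/route/route_id. Below is a minimal Python sketch that uses only fields quoted in this report. The helper name `route_drift` is hypothetical, and fetching the two documents (from etcd and from the control API) is deliberately left out.

```python
def route_drift(etcd_value: dict, apisix_value: dict) -> dict:
    """Return {field: (etcd, apisix)} for every field whose values differ."""
    drift = {}
    for field in sorted(set(etcd_value) | set(apisix_value)):
        in_etcd = etcd_value.get(field)
        in_apisix = apisix_value.get(field)
        if in_etcd != in_apisix:
            drift[field] = (in_etcd, in_apisix)
    return drift


# Values copied from the report above (unchanged fields trimmed for brevity).
etcd_route = {
    "id": "515483732765836994",
    "create_time": 1716781847,
    "update_time": 1720085553,
    "uris": ["/menu.service.query/m", "/menu.service.query/pm/*"],
    "upstream_id": "515483516306196172",
}
apisix_route = {
    "id": "515483732765836994",
    "create_time": 1716781847,
    "update_time": 1716781847,
    "uris": [
        "/menu.service.query/m",
        "/menu.service.query/w",
        "/menu.service.query/pm/*",
    ],
    "upstream_id": "515483516306196172",
}

drift = route_drift(etcd_route, apisix_route)
for field, (in_etcd, in_apisix) in drift.items():
    print(f"{field}: etcd={in_etcd!r} apisix={in_apisix!r}")
```

List fields such as `uris` are compared order-sensitively here, so a reordered but otherwise identical list would also be reported as drift.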

Because we run 2.15.0 in production, we cannot upgrade it casually.

I would like to know whether this is an APISIX bug and why the config is not synced to the APISIX instances. If it is a bug, we plan to merge the relevant changes to solve it.

Environment

  • APISIX version (run apisix version): 2.15.0
  • Operating system (run uname -a):
  • OpenResty / Nginx version (run openresty -V or nginx -V):
  • etcd version, if relevant (run curl http://127.0.0.1:9090/v1/server_info):3.5.0
  • APISIX Dashboard version, if relevant:
  • Plugin runner version, for issues related to plugin runners:
  • LuaRocks version, for installation issues (run luarocks --version):
@jujiale
Contributor Author

jujiale commented Jul 5, 2024

I found that #8493 shows the same error log, but it does not seem to mention the data sync issue, so I don't know whether it is the same issue.

@jujiale
Contributor Author

jujiale commented Jul 5, 2024

I tried changing config_util.fire_all_clean_handlers(val) to config_util.fire_all_clean_handlers(false) in config_etcd.lua. This produces the same error as the one I mentioned above, and the data in etcd and APISIX is then out of sync.
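
To illustrate why such an error could leave an instance on stale config, here is a toy Python model. It is not APISIX code: the function names only echo the stack trace, and the cache layout, the `False` tombstone for a removed item, and the guard are all assumptions made for illustration. The point it sketches is that if a clean-up step indexes a boolean and raises, the sync pass aborts before later updates are applied.

```python
def fire_all_clean_handlers_unguarded(item):
    # Indexes the item directly; fails if the item is a boolean tombstone.
    for handler in item["clean_handlers"]:
        handler(item)


def fire_all_clean_handlers_guarded(item):
    # Hypothetical guard: skip anything that is not a config table.
    if not isinstance(item, dict):
        return
    for handler in item.get("clean_handlers", []):
        handler(item)


def sync_data(cache, updates, fire):
    """Apply updates in order; an exception aborts the rest of the pass."""
    for key, new_value in updates:
        old = cache.get(key)
        if old is not None:
            fire(old)  # clean up the old item before replacing it
        cache[key] = new_value


def run(fire):
    cache = {
        "upstreams/1": False,  # assumed tombstone left by an earlier delete
        "routes/1": {"update_time": 1716781847, "clean_handlers": []},
    }
    updates = [
        ("upstreams/1", {"nodes": {}}),
        ("routes/1", {"update_time": 1720085553}),
    ]
    try:
        sync_data(cache, updates, fire)
    except TypeError as exc:
        print("sync aborted:", exc)
    return cache["routes/1"]["update_time"]


stale = run(fire_all_clean_handlers_unguarded)
fresh = run(fire_all_clean_handlers_guarded)
print(stale, fresh)
```

In this model the unguarded variant keeps the old `update_time` while the guarded one picks up the new value. Whether APISIX 2.15.0 fails in exactly this way, and whether a guard like this is the right fix, is not confirmed by this thread.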

@yydance

yydance commented Oct 15, 2024

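I seem to have hit a similar problem today: a route was added through the dashboard and stored in etcd correctly, but APISIX could never find that route. It returned to normal after I deleted the original APISIX pod. So far I have not seen any related messages in the logs.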

@akshayparseja

Do we have any update on this? We are also actively facing sync issues, which we work around with a rollout restart of the APISIX pod deployment, but it would be helpful to know whether this is fixed in newer versions or whether a fix is planned.
