[Bug]: find no available rootcoord, check rootcoord state #39246

Open
1 task done
LiSuiTech opened this issue Jan 14, 2025 · 8 comments
Labels: kind/bug (issues or changes related to a bug), triage/needs-information (indicates an issue needs more information in order to work on it)

Comments

@LiSuiTech

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: v2.5.3
- Deployment mode(standalone or cluster):
- MQ type(rocksmq, pulsar or kafka):   rocksmq 
- SDK version(e.g. pymilvus v2.0.0rc2): 
- OS(Ubuntu or CentOS): Ubuntu
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

root@etcd-0:/# etcdctl get --prefix ""
by-dev/kv/gid/idTimestamp
�l�
by-dev/kv/gid/timestamp
�m�PG
by-dev/kv/querycoord-id-allocator/idTimestamp
�l['�
by-dev/meta/idTimestamp
�l�%
by-dev/meta/queryCoord-ResourceGroup/__default_resource_group

__default_resource_group�="
�=
by-dev/meta/root-coord/credential/grantee-id/250dd41b686083b0/PrivilegeDescribeCollection
root
by-dev/meta/root-coord/credential/grantee-id/250dd41b686083b0/PrivilegeListAliases
root
by-dev/meta/root-coord/credential/grantee-id/e03326696e8a3b16/PrivilegeIndexDetail
root
by-dev/meta/root-coord/credential/grantee-privileges/public/Collection/.
e03326696e8a3b16
by-dev/meta/root-coord/credential/grantee-privileges/public/Global/.
250dd41b686083b0
by-dev/meta/root-coord/credential/roles/admin

by-dev/meta/root-coord/credential/roles/public

by-dev/meta/root-coord/credential/users/root
{"encrypted_password":"$2a$04$qGsn3BycCS0vYf9i1mjY7u.uqvS3vcDWKcOjXh./MAYWqcRb7.b2S"}
by-dev/meta/root-coord/database/db-info/1
default (鍮��
by-dev/meta/session/datacoord
{"ServerID":18,"ServerName":"datacoord","Address":"127.0.0.1:13333","Exclusive":true,"TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415509,"HostName":"milvus-579df544f9-nbrb5"}
by-dev/meta/session/datanode-18
{"ServerID":18,"ServerName":"datanode","Address":"127.0.0.1:21124","TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415552,"HostName":"milvus-579df544f9-nbrb5"}
by-dev/meta/session/id
19
by-dev/meta/session/indexcoord
{"ServerID":18,"ServerName":"indexcoord","Address":"127.0.0.1:13333","Exclusive":true,"TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415506,"HostName":"milvus-579df544f9-nbrb5"}
by-dev/meta/session/indexnode-18
{"ServerID":18,"ServerName":"indexnode","Address":"127.0.0.1:21121","TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415451,"HostName":"milvus-579df544f9-nbrb5","EnableDisk":true}
by-dev/meta/session/proxy-18
{"ServerID":18,"ServerName":"proxy","Address":"127.0.0.1:19529","TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415543,"HostName":"milvus-579df544f9-nbrb5"}
by-dev/meta/session/querycoord
{"ServerID":18,"ServerName":"querycoord","Address":"127.0.0.1:19531","Exclusive":true,"TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415522,"HostName":"milvus-579df544f9-nbrb5"}
by-dev/meta/session/querynode-18
{"ServerID":18,"ServerName":"querynode","Address":"127.0.0.1:21123","TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{"CurrentIndexVersion":6},"LeaseID":966748226050415454,"HostName":"milvus-579df544f9-nbrb5"}
by-dev/meta/session/rootcoord
{"ServerID":18,"ServerName":"rootcoord","Address":"127.0.0.1:53100","Exclusive":true,"TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415495,"HostName":"milvus-579df544f9-nbrb5"}
by-dev/meta/snapshots/root-coord/database/db-info/1_ts455303077165531137
default (鍮��
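
The dump above lists a `by-dev/meta/session/rootcoord` entry, while the log below warns that the `rootcoord` key does not exist, so it can help to extract just the session records and compare them with what the clients report. The following is a minimal sketch, not part of Milvus: `parse_sessions` is a hypothetical helper, and it assumes the `etcdctl get --prefix` output alternates one key line with one value line, as in the dump above. The sample records are copied from that dump.

```python
import json


def parse_sessions(dump: str, prefix: str = "by-dev/meta/session/") -> dict:
    """Map each session key suffix to its decoded JSON record.

    Assumes (hypothetically) that every key line is followed directly by
    its value line. Non-JSON values such as the session id counter are skipped.
    """
    lines = dump.splitlines()
    sessions = {}
    for i, line in enumerate(lines[:-1]):
        if not line.startswith(prefix):
            continue
        value = lines[i + 1].strip()
        if not value.startswith("{"):
            continue
        try:
            sessions[line[len(prefix):]] = json.loads(value)
        except json.JSONDecodeError:
            # Binary or truncated values are ignored rather than guessed at.
            pass
    return sessions


SAMPLE = """by-dev/meta/session/id
19
by-dev/meta/session/proxy-18
{"ServerID":18,"ServerName":"proxy","Address":"127.0.0.1:19529","TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415543,"HostName":"milvus-579df544f9-nbrb5"}
by-dev/meta/session/rootcoord
{"ServerID":18,"ServerName":"rootcoord","Address":"127.0.0.1:53100","Exclusive":true,"TriggerKill":true,"Version":"2.5.3","IndexEngineVersion":{},"LeaseID":966748226050415495,"HostName":"milvus-579df544f9-nbrb5"}
"""

sessions = parse_sessions(SAMPLE)
for name, record in sorted(sessions.items()):
    print(name, record["Address"], record["HostName"], record["LeaseID"])
```

Comparing the `HostName` and `LeaseID` fields against the currently running pod is one way to tell whether a session record belongs to the current process or to an earlier one; the dump alone does not say when each record was written.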

Expected Behavior

No response

Steps To Reproduce

No response

Milvus Log

2025/01/14 09:08:24 maxprocs: Updating GOMAXPROCS=4: determined from CPU quota
Set runtime dir at /run/milvus failed, set it to /tmp/milvus directory

    __  _________ _   ____  ______
   /  |/  /  _/ /| | / / / / / __/
  / /|_/ // // /_| |/ / /_/ /\ \
 /_/  /_/___/____/___/\____/___/

Welcome to use Milvus!
Version: v2.5.3
Built: Mon Jan 13 07:33:44 UTC 2025
GitCommit: ac730be
GoVersion: go version go1.22.0 linux/arm64

TotalMem: 8589934592
UsedMem: 27688960

open pid file: /tmp/milvus/standalone.pid
lock pid file: /tmp/milvus/standalone.pid
[2025/01/14 09:08:24.517 +00:00] [INFO] [roles/roles.go:345] ["starting running Milvus components"]
[2025/01/14 09:08:24.517 +00:00] [INFO] [roles/roles.go:187] ["Enable Jemalloc"] ["Jemalloc Path"=/milvus/lib/libjemalloc.so]
[2025/01/14 09:08:24.524 +00:00] [DEBUG] [runtime/asm_arm64.s:1222] ["start refreshing configurations"] [source=FileSource]
[2025/01/14 09:08:24.525 +00:00] [DEBUG] [paramtable/base_table.go:213] ["init etcd source"] [etcdInfo="{"UseEmbed":false,"EnableAuth":false,"UserName":"","PassWord":"","UseSSL":false,"Endpoints":["http://etcd.shangshan-testing.svc.cluster.local:2379"],"KeyPrefix":"by-dev","CertFile":"","KeyFile":"","CaCertFile":"","MinVersion":"1.3","RefreshInterval":5000000000}"]
[2025/01/14 09:08:24.525 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
[2025/01/14 09:08:24.538 +00:00] [DEBUG] [config/etcd_source.go:92] ["etcd refreshConfigurations"] [prefix=by-dev/config] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"]
[2025/01/14 09:08:24.538 +00:00] [DEBUG] [runtime/asm_arm64.s:1222] ["start refreshing configurations"] [source=EtcdSource]
[2025/01/14 09:08:24.542 +00:00] [DEBUG] [runtime/asm_arm64.s:1222] ["start refreshing configurations"] [source=FileSource]
[2025/01/14 09:08:24.542 +00:00] [INFO] [paramtable/hook_config.go:21] ["hook config"] [hook={}]
[2025/01/14 09:08:24.542 +00:00] [INFO] [tracer/tracer.go:50] ["Init tracer finished"] [Exporter=stdout]
[2025/01/14 09:08:24.593 +00:00] [WARN] [client/client.go:104] ["RootCoordClient mess key not exist"] [key=rootcoord]
[2025/01/14 09:08:24.593 +00:00] [WARN] [grpcclient/client.go:262] ["failed to get client address"] [error="find no available rootcoord, check rootcoord state"]
[2025/01/14 09:08:24.593 +00:00] [WARN] [grpcclient/client.go:473] ["fail to get grpc client"] [client_role=rootcoord] [error="find no available rootcoord, check rootcoord state"]
[2025/01/14 09:08:24.593 +00:00] [WARN] [grpcclient/client.go:494] ["grpc client is nil, maybe fail to get client in the retry state"] [client_role=rootcoord] [error="empty grpc client: find no available rootcoord, check rootcoord state"] [errorVerbose="empty grpc client: find no available rootcoord, check rootcoord state\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).call.func2\n | \t/workspace/source/internal/util/grpcclient/client.go:493\n | github.com/milvus-io/milvus/pkg/util/retry.Handle\n | \t/workspace/source/pkg/util/retry/retry.go:128\n | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).call\n | \t/workspace/source/internal/util/grpcclient/client.go:486\n | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n | \t/workspace/source/internal/util/grpcclient/client.go:573\n | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n | \t/workspace/source/internal/util/grpcclient/client.go:589\n | github.com/milvus-io/milvus/internal/distributed/rootcoord/client.wrapGrpcCall[...]\n | \t/workspace/source/internal/distributed/rootcoord/client/client.go:121\n | github.com/milvus-io/milvus/internal/distributed/rootcoord/client.(*Client).GetComponentStates\n | \t/workspace/source/internal/distributed/rootcoord/client/client.go:135\n | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentStates[...].func1\n | \t/workspace/source/internal/util/componentutil/componentutil.go:39\n | github.com/milvus-io/milvus/pkg/util/retry.Do\n | \t/workspace/source/pkg/util/retry/retry.go:44\n | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentStates[...]\n | \t/workspace/source/internal/util/componentutil/componentutil.go:64\n | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentHealthy[...]\n | \t/workspace/source/internal/util/componentutil/componentutil.go:85\n | 
github.com/milvus-io/milvus/internal/distributed/proxy.(*Server).init\n | \t/workspace/source/internal/distributed/proxy/service.go:486\n | github.com/milvus-io/milvus/internal/distributed/proxy.(*Server).Run\n | \t/workspace/source/internal/distributed/proxy/service.go:412\n | github.com/milvus-io/milvus/cmd/components.(*Proxy).Run\n | \t/workspace/source/cmd/components/proxy.go:60\n | github.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n | \t/workspace/source/cmd/roles/roles.go:130\n | runtime.goexit\n | \t/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_arm64.s:1222\nWraps: (2) empty grpc client\nWraps: (3) find no available rootcoord, check rootcoord state\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString"]
[2025/01/14 09:08:24.595 +00:00] [WARN] [client/client.go:104] ["RootCoordClient mess key not exist"] [key=rootcoord]
[2025/01/14 09:08:24.595 +00:00] [WARN] [grpcclient/client.go:262] ["failed to get client address"] [error="find no available rootcoord, check rootcoord state"]
[2025/01/14 09:08:24.595 +00:00] [WARN] [grpcclient/client.go:480] ["fail to get grpc client in the retry state"] [client_role=rootcoord] [error="find no available rootcoord, check rootcoord state"]
[2025/01/14 09:08:24.595 +00:00] [WARN] [retry/retry.go:130] ["retry func failed"] [retried=0] [error="empty grpc client: find no available rootcoord, check rootcoord state"] [errorVerbose="empty grpc client: find no available rootcoord, check rootcoord state\n(1) attached stack trace\n -- stack trace:\n | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).call.func2\n | \t/workspace/source/internal/util/grpcclient/client.go:493\n | github.com/milvus-io/milvus/pkg/util/retry.Handle\n | \t/workspace/source/pkg/util/retry/retry.go:128\n | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).call\n | \t/workspace/source/internal/util/grpcclient/client.go:486\n | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n | \t/workspace/source/internal/util/grpcclient/client.go:573\n | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n | \t/workspace/source/internal/util/grpcclient/client.go:589\n | github.com/milvus-io/milvus/internal/distributed/rootcoord/client.wrapGrpcCall[...]\n | \t/workspace/source/internal/distributed/rootcoord/client/client.go:121\n | github.com/milvus-io/milvus/internal/distributed/rootcoord/client.(*Client).GetComponentStates\n | \t/workspace/source/internal/distributed/rootcoord/client/client.go:135\n | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentStates[...].func1\n | \t/workspace/source/internal/util/componentutil/componentutil.go:39\n | github.com/milvus-io/milvus/pkg/util/retry.Do\n | \t/workspace/source/pkg/util/retry/retry.go:44\n | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentStates[...]\n | \t/workspace/source/internal/util/componentutil/componentutil.go:64\n | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentHealthy[...]\n | \t/workspace/source/internal/util/componentutil/componentutil.go:85\n | github.com/milvus-io/milvus/internal/distributed/proxy.(*Server).init\n | 
\t/workspace/source/internal/distributed/proxy/service.go:486\n | github.com/milvus-io/milvus/internal/distributed/proxy.(*Server).Run\n | \t/workspace/source/internal/distributed/proxy/service.go:412\n | github.com/milvus-io/milvus/cmd/components.(*Proxy).Run\n | \t/workspace/source/cmd/components/proxy.go:60\n | github.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n | \t/workspace/source/cmd/roles/roles.go:130\n | runtime.goexit\n | \t/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_arm64.s:1222\nWraps: (2) empty grpc client\nWraps: (3) find no available rootcoord, check rootcoord state\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString"]
[2025/01/14 09:08:24.671 +00:00] [WARN] [client/client.go:104] ["RootCoordClient mess key not exist"] [key=rootcoord]

Anything else?

No response

@LiSuiTech added the kind/bug and needs-triage labels on Jan 14, 2025
@xiaofan-luan
Collaborator

Please provide the rootcoord log for debugging, thanks.

Check which error message in the rootcoord log is blocking it from starting.

@LiSuiTech
Author

rootCoord:
  dmlChannelNum: 16 # The number of DML-Channels to create at the root coord startup.
  # The maximum number of partitions in each collection.
  # New partitions cannot be created if this parameter is set as 0 or 1.
  # Range: [0, INT64MAX]
  maxPartitionNum: 1024
  # The minimum row count of a segment required for creating index.
  # Segments with smaller size than this parameter will not be indexed, and will be searched with brute force.
  minSegmentSizeToEnableIndex: 1024
  enableActiveStandby: false
  maxDatabaseNum: 64 # Maximum number of databases
  maxGeneralCapacity: 65536 # upper limit for the sum of the product of partitionNumber and shardNumber
  gracefulStopTimeout: 5 # seconds. force stop node without graceful stop
  ip: # TCP/IP address of rootCoord. If not specified, use the first unicastable address
  port: 53100 # TCP port of rootCoord
  grpc:
    serverMaxSendSize: 536870912 # The maximum size of each RPC request that the rootCoord can send, unit: byte
    serverMaxRecvSize: 268435456 # The maximum size of each RPC request that the rootCoord can receive, unit: byte
    clientMaxSendSize: 268435456 # The maximum size of each RPC request that the clients on rootCoord can send, unit: byte
    clientMaxRecvSize: 536870912 # The maximum size of each RPC request that the clients on rootCoord can receive, unit: byte
log:
  level: info # Only debug, info, warn, error, panic, or fatal are supported. Default is 'info'.
  path: stdout # Write logs to standard output

Logs are written to the terminal, so the log above is the complete log.

@LiSuiTech
Author

  2025/01/15 08:30:32 maxprocs: Updating GOMAXPROCS=4: determined from CPU quota
  
      __  _________ _   ____  ______    
     /  |/  /  _/ /| | / / / / / __/    
    / /|_/ // // /_| |/ / /_/ /\ \    
   /_/  /_/___/____/___/\____/___/     
  
  Welcome to use Milvus!
  Version:   v2.5.3
  Built:     Mon Jan 13 07:33:44 UTC 2025
  GitCommit: ac730be
  GoVersion: go version go1.22.0 linux/arm64
  
  TotalMem: 17179869184
  UsedMem: 26562560
  
  open pid file: /run/milvus/standalone.pid
  lock pid file: /run/milvus/standalone.pid
  [2025/01/15 08:30:32.370 +00:00] [INFO] [roles/roles.go:345] ["starting running Milvus components"]
  [2025/01/15 08:30:32.370 +00:00] [INFO] [roles/roles.go:187] ["Enable Jemalloc"] ["Jemalloc Path"=/milvus/lib/libjemalloc.so]
  [2025/01/15 08:30:32.379 +00:00] [DEBUG] [runtime/asm_arm64.s:1222] ["start refreshing configurations"] [source=FileSource]
  [2025/01/15 08:30:32.381 +00:00] [DEBUG] [paramtable/base_table.go:213] ["init etcd source"] [etcdInfo="{\"UseEmbed\":false,\"EnableAuth\":false,\"UserName\":\"\",\"PassWord\":\"\",\"UseSSL\":false,\"Endpoints\":[\"http://etcd.shangshan-testing.svc.cluster.local:2379\"],\"KeyPrefix\":\"by-dev\",\"CertFile\":\"/path/to/etcd-client.pem\",\"KeyFile\":\"/path/to/etcd-client-key.pem\",\"CaCertFile\":\"/path/to/ca.pem\",\"MinVersion\":\"1.3\",\"RefreshInterval\":5000000000}"]
  [2025/01/15 08:30:32.381 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
  [2025/01/15 08:30:32.391 +00:00] [DEBUG] [config/etcd_source.go:92] ["etcd refreshConfigurations"] [prefix=by-dev/config] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"]
  [2025/01/15 08:30:32.391 +00:00] [DEBUG] [runtime/asm_arm64.s:1222] ["start refreshing configurations"] [source=EtcdSource]
  [2025/01/15 08:30:32.395 +00:00] [DEBUG] [runtime/asm_arm64.s:1222] ["start refreshing configurations"] [source=FileSource]
  [2025/01/15 08:30:32.395 +00:00] [INFO] [paramtable/hook_config.go:21] ["hook config"] [hook={}]
  [2025/01/15 08:30:32.396 +00:00] [INFO] [tracer/tracer.go:50] ["Init tracer finished"] [Exporter=noop]
  [2025/01/15 08:30:32.396 +00:00] [INFO] [logutil/logutil.go:163] ["Log directory"] [configDir=]
  [2025/01/15 08:30:32.396 +00:00] [INFO] [logutil/logutil.go:164] ["Set log file to "] [path=]
  [2025/01/15 08:30:32.396 +00:00] [INFO] [roles/roles.go:294] [setupPrometheusHTTPServer]
  [2025/01/15 08:30:32.396 +00:00] [INFO] [http/server.go:240] ["management listen"] [addr=:9091]
  [2025/01/15 08:30:32.397 +00:00] [INFO] [gc/gc_tuner.go:138] ["GC Helper initialized."] ["Initial GoGC"=100] [minimumGOGC=30] [maximumGOGC=200] [memoryThreshold=15461882265]
  [2025/01/15 08:30:32.398 +00:00] [INFO] [rootcoord/root_coord.go:165] ["update rootcoord state"] [state=Abnormal]
  [2025/01/15 08:30:32.398 +00:00] [INFO] [rootcoord/service.go:155] ["RootCoord listen on"] [address="[::]:53100"] [port=53100]
  [2025/01/15 08:30:32.398 +00:00] [INFO] [rootcoord/service.go:185] ["init params done.."]
  [2025/01/15 08:30:32.398 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
  [2025/01/15 08:30:32.399 +00:00] [INFO] [datacoord/service.go:97] ["DataCoord listen on"] [address="[::]:13333"] [port=13333]
  [2025/01/15 08:30:32.399 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
  [2025/01/15 08:30:32.400 +00:00] [INFO] [components/index_coord.go:42] ["IndexCoord running ..."]
  [2025/01/15 08:30:32.400 +00:00] [INFO] [querycoord/service.go:104] ["QueryCoord listen on"] [address="[::]:19531"] [port=19531]
  [2025/01/15 08:30:32.400 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
  [2025/01/15 08:30:32.401 +00:00] [INFO] [querynode/service.go:102] ["QueryNode listen on"] [address="[::]:21123"] [port=21123]
  [2025/01/15 08:30:32.401 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
  [2025/01/15 08:30:32.403 +00:00] [INFO] [datanode/service.go:102] ["DataNode listen on"] [address="[::]:21124"] [port=21124]
  [2025/01/15 08:30:32.403 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
  [2025/01/15 08:30:32.404 +00:00] [INFO] [indexnode/service.go:78] ["IndexNode listen on"] [address="[::]:21121"] [port=21121]
  [2025/01/15 08:30:32.404 +00:00] [INFO] [utils/util.go:60] ["Internal TLS Enabled"] [value=false]
  [2025/01/15 08:30:32.404 +00:00] [INFO] [proxy/look_aside_balancer.go:233] ["Start check query node health loop"]
  [2025/01/15 08:30:32.405 +00:00] [INFO] [hookutil/hook.go:70] ["empty so path, skip to load plugin"]
  [2025/01/15 08:30:32.405 +00:00] [INFO] [vecindexmgr/vector_index_mgr.go:124] ["init vector indexes with features : BINFLAT : 720897,BIN_FLAT : 720897,BIN_IVF_FLAT : 524289,DISKANN : 2097166,FLAT : 720910,HNSW : 1572910,HNSWLIB_DEPRECATED : 1572879,HNSW_PQ : 524334,HNSW_PRQ : 524334,HNSW_SQ : 524334,IVFBIN : 524289,IVFFLAT : 524302,IVFFLATCC : 14,IVFPQ : 524302,IVFSQ : 524302,IVF_FLAT : 524302,IVF_FLAT_CC : 14,IVF_PQ : 524302,IVF_SQ : 524302,IVF_SQ8 : 524302,IVF_SQ_CC : 14,SCANN : 524302,SPARSE_INVERTED_INDEX : 524304,SPARSE_INVERTED_INDEX_CC : 524304,SPARSE_WAND : 524304,SPARSE_WAND_CC : 524304,"]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [proxy/listener_manager.go:53] ["Proxy listen on external grpc listener"] [address=100.101.175.111:19530] [port=19530]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [proxy/listener_manager.go:63] ["Proxy listen on internal grpc listener"] [address=100.101.175.111:19529] [port=19529]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [proxy/listener_manager.go:98] ["Proxy server(http) and external grpc server share the same port"]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [datacoord/server.go:1170] ["register metrics actions finished"]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [rootcoord/service.go:205] ["etcd connect done ..."]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [dependency/factory.go:86] ["try to init mq"] [standalone=true] [mqType=rocksmq]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [utils/util.go:60] ["Internal TLS Enabled"] [value=false]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [coordclient/registry.go:66] ["register query coord server"] [enableLocalClient="{\"ServerType\":\"standalone\",\"EnableQueryCoord\":true,\"EnableDataCoord\":true,\"EnableRootCoord\":true}"]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [datanode/service.go:257] ["DataNode address"] [address=100.101.175.111:21124]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [utils/util.go:60] ["Internal TLS Enabled"] [value=false]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [utils/util.go:60] ["Internal TLS Enabled"] [value=false]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [proxy/service.go:411] ["init Proxy server"]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [proxy/service.go:431] ["Proxy init service's parameter table done"]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [proxy/service.go:433] ["Proxy init http server's parameter table done"]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [accesslog/global.go:146] ["Init access logger success"]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [proxy/service.go:437] ["init Proxy's tracer done"] ["service name"="Proxy ip: 100.101.175.111, port: 19530"]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [rootcoord/service.go:219] ["RootCoord start to create DataCoord client"]
  [2025/01/15 08:30:32.407 +00:00] [INFO] [etcd/etcd_util.go:52] ["create etcd client"] [useEmbedEtcd=false] [useSSL=false] [endpoints="[http://etcd.shangshan-testing.svc.cluster.local:2379]"] [minVersion=1.3]
  [2025/01/15 08:30:32.416 +00:00] [INFO] [utils/util.go:60] ["Internal TLS Enabled"] [value=false]
  [2025/01/15 08:30:32.416 +00:00] [INFO] [proxy/service.go:387] ["create Proxy internal grpc server"] ["enforcement policy"="{\"MinTime\":5000000000,\"PermitWithoutStream\":true}"] ["server parameters"="{\"MaxConnectionIdle\":0,\"MaxConnectionAge\":0,\"MaxConnectionAgeGrace\":0,\"Time\":60000000000,\"Timeout\":10000000000}"]
  [2025/01/15 08:30:32.419 +00:00] [INFO] [rootcoord/service.go:228] ["RootCoord start to create QueryCoord client"]
  [2025/01/15 08:30:32.421 +00:00] [INFO] [rootcoord/root_coord.go:524] ["register metrics actions finished"]
  [2025/01/15 08:30:32.421 +00:00] [INFO] [dependency/factory.go:86] ["try to init mq"] [standalone=true] [mqType=rocksmq]
  [2025/01/15 08:30:32.423 +00:00] [WARN] [client/client.go:104] ["RootCoordClient mess key not exist"] [key=rootcoord]
  [2025/01/15 08:30:32.423 +00:00] [WARN] [grpcclient/client.go:262] ["failed to get client address"] [error="find no available rootcoord, check rootcoord state"]
  [2025/01/15 08:30:32.423 +00:00] [WARN] [grpcclient/client.go:473] ["fail to get grpc client"] [client_role=rootcoord] [error="find no available rootcoord, check rootcoord state"]
  [2025/01/15 08:30:32.423 +00:00] [WARN] [grpcclient/client.go:494] ["grpc client is nil, maybe fail to get client in the retry state"] [client_role=rootcoord] [error="empty grpc client: find no available rootcoord, check rootcoord state"] [errorVerbose="empty grpc client: find no available rootcoord, check rootcoord state\n(1) attached stack trace\n  -- stack trace:\n  | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).call.func2\n  | \t/workspace/source/internal/util/grpcclient/client.go:493\n  | github.com/milvus-io/milvus/pkg/util/retry.Handle\n  | \t/workspace/source/pkg/util/retry/retry.go:128\n  | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).call\n  | \t/workspace/source/internal/util/grpcclient/client.go:486\n  | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).Call\n  | \t/workspace/source/internal/util/grpcclient/client.go:573\n  | github.com/milvus-io/milvus/internal/util/grpcclient.(*ClientBase[...]).ReCall\n  | \t/workspace/source/internal/util/grpcclient/client.go:589\n  | github.com/milvus-io/milvus/internal/distributed/rootcoord/client.wrapGrpcCall[...]\n  | \t/workspace/source/internal/distributed/rootcoord/client/client.go:121\n  | github.com/milvus-io/milvus/internal/distributed/rootcoord/client.(*Client).GetComponentStates\n  | \t/workspace/source/internal/distributed/rootcoord/client/client.go:135\n  | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentStates[...].func1\n  | \t/workspace/source/internal/util/componentutil/componentutil.go:39\n  | github.com/milvus-io/milvus/pkg/util/retry.Do\n  | \t/workspace/source/pkg/util/retry/retry.go:44\n  | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentStates[...]\n  | \t/workspace/source/internal/util/componentutil/componentutil.go:64\n  | github.com/milvus-io/milvus/internal/util/componentutil.WaitForComponentHealthy[...]\n  | 
\t/workspace/source/internal/util/componentutil/componentutil.go:85\n  | github.com/milvus-io/milvus/internal/distributed/proxy.(*Server).init\n  | \t/workspace/source/internal/distributed/proxy/service.go:486\n  | github.com/milvus-io/milvus/internal/distributed/proxy.(*Server).Run\n  | \t/workspace/source/internal/distributed/proxy/service.go:412\n  | github.com/milvus-io/milvus/cmd/components.(*Proxy).Run\n  | \t/workspace/source/cmd/components/proxy.go:60\n  | github.com/milvus-io/milvus/cmd/roles.runComponent[...].func1\n  | \t/workspace/source/cmd/roles/roles.go:130\n  | runtime.goexit\n  | \t/go/pkg/mod/golang.org/[email protected]/src/runtime/asm_arm64.s:1222\nWraps: (2) empty grpc client\nWraps: (3) find no available rootcoord, check rootcoord state\nError types: (1) *withstack.withStack (2) *errutil.withPrefix (3) *errors.errorString"]
  [2025/01/15 08:30:32.425 +00:00] [WARN] [client/client.go:104] ["RootCoordClient mess key not exist"] [key=rootcoord]
  [2025/01/15 08:30:32.425 +00:00] [WARN] [grpcclient/client.go:262] ["failed to get client address"] [error="find no available rootcoord, check rootcoord state"]
  [2025/01/15 08:30:32.425 +00:00] [WARN] [grpcclient/client.go:480] ["fail to get grpc client in the retry state"] [client_role=rootcoord] [error="find no available rootcoord, check rootcoord state"]
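
Because all components write to the same stdout stream in this deployment, the rootcoord lines are interleaved with everything else. One way to see how far rootcoord got is to keep only the entries whose source location starts with `rootcoord/`. The sketch below is an illustration under assumptions: `entries_from` and `LINE_RE` are hypothetical names, and the regular expression assumes the `[timestamp] [LEVEL] [file:line] ...` layout seen in the log above. The sample lines are copied from that log.

```python
import re

# Assumed layout: [timestamp] [LEVEL] [source_file:line] rest-of-entry
LINE_RE = re.compile(
    r"^\s*\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<src>[^\]]+)\] (?P<rest>.*)$"
)


def entries_from(log_text: str, src_prefix: str) -> list:
    """Return (timestamp, level, source, rest) for entries from one source prefix."""
    found = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if match and match.group("src").startswith(src_prefix):
            found.append(
                (match.group("ts"), match.group("level"),
                 match.group("src"), match.group("rest"))
            )
    return found


SAMPLE = """
  [2025/01/15 08:30:32.398 +00:00] [INFO] [rootcoord/root_coord.go:165] ["update rootcoord state"] [state=Abnormal]
  [2025/01/15 08:30:32.398 +00:00] [INFO] [rootcoord/service.go:155] ["RootCoord listen on"] [address="[::]:53100"] [port=53100]
  [2025/01/15 08:30:32.406 +00:00] [INFO] [rootcoord/service.go:205] ["etcd connect done ..."]
  [2025/01/15 08:30:32.421 +00:00] [INFO] [rootcoord/root_coord.go:524] ["register metrics actions finished"]
  [2025/01/15 08:30:32.423 +00:00] [WARN] [client/client.go:104] ["RootCoordClient mess key not exist"] [key=rootcoord]
"""

rootcoord_entries = entries_from(SAMPLE, "rootcoord/")
for ts, level, src, rest in rootcoord_entries:
    print(ts, level, src, rest)
```

In the log pasted above, the last entry with a `rootcoord/` source is "register metrics actions finished", and no later rootcoord state update appears in the excerpt, so whatever rootcoord logs after that point is the part worth capturing.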

@yanliang567
Contributor

I did not see any blocking errors in the log excerpts above. @LiSuiTech could you please attach the complete Milvus logs? For Milvus installed with docker-compose, you can use docker-compose logs > milvus.log to export the logs.

/assign @LiSuiTech
/unassign

@yanliang567 added the triage/needs-information label and removed the needs-triage label on Jan 16, 2025
@LiSuiTech
Author

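I am currently running Milvus on k8s, deployed with the Helm chart, and this is already the complete milvus.log. The errors are shown above; after that, the log only repeats the attempts to fetch the rootcoord connection information. I don't know why it cannot fetch it, because I can already see that the data exists in etcd!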

@yanliang567
Contributor

/assign @liliu-z
any ideas?

@xiaofan-luan
Collaborator

I don't see a critical reason why the cluster cannot start.
I guess part of the log is missing.

The main focus is to read the rootcoord log and understand why it cannot start.
It could be due to an etcd or Pulsar connectivity issue.
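
A quick way to rule connectivity in or out is a plain TCP connect test from inside the Milvus pod. The sketch below is only an illustration: `can_connect` is a hypothetical helper, and the endpoints are taken from the logs and config earlier in this thread (the etcd endpoint and the rootCoord port 53100). The logs report mqType=rocksmq, so no Pulsar endpoint is listed. A successful connect only shows that the port accepts connections, not that the service behind it is healthy.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Endpoints copied from the logs and config in this thread.
ENDPOINTS = [
    ("etcd.shangshan-testing.svc.cluster.local", 2379),
    ("127.0.0.1", 53100),
]

if __name__ == "__main__":
    for host, port in ENDPOINTS:
        status = "reachable" if can_connect(host, port) else "NOT reachable"
        print(f"{host}:{port} {status}")
```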

@xiaofan-luan
Collaborator

/assign @LiSuiTech
lack of evidence.


4 participants