Is there a problem with the cluster Leader logic? What does it mean? #7158
Comments
kuaile-zc commented on Nov 1, 2021 (see the issue description under "Describe the bug" below)
chenshun00 commented on Nov 2, 2021
It looks like some data has become inconsistent.
kuaile-zc commented on Nov 2, 2021
Does this data inconsistency mean the cluster is in an abnormal state?
MajorHe1 commented on Nov 2, 2021
I am not sure whether "one leader per kind of business logic" is what the Nacos design intends, but I can confirm that naming_instance_metadata, naming_persistent_service, naming_persistent_service_v2 and naming_service_metadata are indeed four scenarios that use the Raft protocol. My guess is that each of them uses the CP-protocol module independently, which is why each group has its own leader.

As for whether the cluster is in a normal state: check whether service registration, service discovery and cluster synchronization still work, and search the logs for keywords such as "sync failed". If nothing shows up, the cluster should be fine; the metadata posted in this issue is not conclusive on its own.

As for what these four Raft scenarios are used for: in Nacos 2.X the authoritative node for a service registration is computed from the client connection, yet any Nacos node may modify that service's metadata. To support this, some service metadata (for example the enabled flag and the weight) must be synchronized through the Raft protocol.

Reference: https://my.oschina.net/u/3585447/blog/4818143
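For readers wondering how "each scenario uses the CP module independently" yields four different leaders: Nacos's CP layer is built on SOFAJRaft, and in a JRaft multi-group setup every group runs its own election even when all groups span the same nodes, so uncorrelated leader placement is expected. Below is a minimal, hypothetical sketch in plain JRaft, not Nacos's actual wiring: the no-op state machine, host names, port and storage paths are made up; only the four group ids come from this issue.

```java
import java.io.File;

import com.alipay.sofa.jraft.Iterator;
import com.alipay.sofa.jraft.Node;
import com.alipay.sofa.jraft.RaftGroupService;
import com.alipay.sofa.jraft.conf.Configuration;
import com.alipay.sofa.jraft.core.StateMachineAdapter;
import com.alipay.sofa.jraft.entity.PeerId;
import com.alipay.sofa.jraft.option.NodeOptions;
import com.alipay.sofa.jraft.rpc.RaftRpcServerFactory;
import com.alipay.sofa.jraft.rpc.RpcServer;

public class MultiRaftGroupSketch {

    /** Placeholder state machine; in Nacos each group has its own business FSM. */
    static class NoopStateMachine extends StateMachineAdapter {
        @Override
        public void onApply(final Iterator iter) {
            while (iter.hasNext()) {
                iter.next(); // a real FSM would apply the replicated entry to its state here
            }
        }
    }

    public static void main(String[] args) {
        // The same three peers participate in every group, as in the issue's JSON.
        final Configuration initialConf = new Configuration();
        initialConf.parse("nacos-0:7848,nacos-1:7848,nacos-2:7848");
        final PeerId self = PeerId.parsePeer("nacos-0:7848");

        // One RPC server on this node, shared by all four groups.
        final RpcServer rpcServer = RaftRpcServerFactory.createAndStartRaftRpcServer(self.getEndpoint());

        for (final String groupId : new String[] { "naming_instance_metadata", "naming_persistent_service",
                "naming_persistent_service_v2", "naming_service_metadata" }) {
            final String dir = "/tmp/raft-demo/" + groupId;
            new File(dir).mkdirs();

            final NodeOptions opts = new NodeOptions();
            opts.setFsm(new NoopStateMachine());
            opts.setInitialConf(initialConf);
            opts.setLogUri(dir + "/log");         // per-group replicated log
            opts.setRaftMetaUri(dir + "/meta");   // per-group term and vote records
            opts.setSnapshotUri(dir + "/snapshot");

            // Every group keeps its own election timer, term and votes, so the
            // four groups can (and often do) elect different leaders.
            final Node node = new RaftGroupService(groupId, self, opts, rpcServer, true).start();
            System.out.println("started " + groupId + " as node " + node.getNodeId());
        }
    }
}
```

Since each group persists its own term and vote independently, differing leaders across the four groups are normal as long as every node reports the same membership and leaders.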
kuaile-zc commented on Nov 3, 2021
Thank you for the explanation; I will study the code along these lines. Much appreciated!
stale[bot] commented on May 2, 2022
Thanks for your feedback and contribution. This issue/pull request has had no activity for more than 180 days and will be closed if no further activity occurs within the next 7 days.
jsonpang commented on Jul 22, 2022
Inconsistent node metadata causes instance synchronization to fail; if you keep refreshing the console you will see the listed service instances change between refreshes.
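One way to confirm the symptom described here is to skip the load balancer and query every node directly for the same service. A rough sketch using the v1 open API /nacos/v1/ns/instance/list; the node addresses and the service name are placeholders, and it assumes auth is disabled (otherwise an accessToken parameter would be needed):

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class InstanceListDiff {
    public static void main(String[] args) throws Exception {
        // Hypothetical node addresses; query each member directly, not via a VIP.
        List<String> nodes = List.of(
                "http://nacos-0:8848", "http://nacos-1:8848", "http://nacos-2:8848");
        String service = "demo-service"; // hypothetical service name

        HttpClient client = HttpClient.newHttpClient();
        Map<String, String> byNode = new LinkedHashMap<>();
        for (String node : nodes) {
            HttpRequest req = HttpRequest.newBuilder(URI.create(
                    node + "/nacos/v1/ns/instance/list?serviceName=" + service)).GET().build();
            byNode.put(node, client.send(req, HttpResponse.BodyHandlers.ofString()).body());
        }

        // NOTE: fields such as lastRefTime differ per call even on a healthy
        // cluster; in practice parse the JSON and compare only the hosts array.
        long distinct = byNode.values().stream().distinct().count();
        System.out.println(distinct == 1 ? "nodes agree" : "nodes DISAGREE");
        if (distinct != 1) {
            byNode.forEach((node, body) -> System.out.println(node + " -> " + body));
        }
    }
}
```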
Scofield-D commented on Nov 1, 2023
These four cluster leaders do correspond to four Raft scenarios. Do not judge by the metadata of a single node: what matters is that this entire metadata blob is identical across every node in the cluster. However, Nacos clusters older than 2.1.0 still carry 1.X compatibility logic, which easily triggers inconsistent synchronization of registered service instances between nodes. On the console this shows up as a different list of registered services on repeated refreshes (queries are load-balanced to different nodes); on the client side it shows up as endless deleteIps messages (heartbeats land on different nodes). Make sure to upgrade to 2.1.0 or above, otherwise a production cluster can easily run into trouble, and eventually move to 2.2.0 or above, since the first few releases after 2.1.0 also carry quite a few bugs.
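The check "the whole metadata blob must be identical on every node" can be scripted. A sketch follows, assuming the v1 cluster endpoint /nacos/v1/core/cluster/nodes (whose per-member extendInfo is the JSON quoted in the issue description below) and Jackson for parsing; the node addresses are placeholders:

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

public class RaftMetaDiff {
    public static void main(String[] args) throws Exception {
        // Hypothetical node addresses; ask every member for its view of the cluster.
        List<String> nodes = List.of(
                "http://nacos-0:8848", "http://nacos-1:8848", "http://nacos-2:8848");

        HttpClient client = HttpClient.newHttpClient();
        ObjectMapper mapper = new ObjectMapper();
        Set<JsonNode> views = new HashSet<>();

        for (String node : nodes) {
            HttpRequest req = HttpRequest.newBuilder(URI.create(
                    node + "/nacos/v1/core/cluster/nodes")).GET().build();
            JsonNode members = mapper.readTree(
                    client.send(req, HttpResponse.BodyHandlers.ofString()).body()).path("data");
            // Compare only raftMetaData (leaders, members, terms); it excludes
            // per-node fields such as lastRefreshTime that always differ.
            for (JsonNode member : members) {
                views.add(member.path("extendInfo").path("raftMetaData"));
            }
        }

        // Every member, as reported by every node, should carry one identical blob.
        // A transient mismatch right after a re-election is normal; a persistent
        // mismatch is the inconsistency discussed in this thread.
        System.out.println(views.size() == 1 ? "raftMetaData consistent" : "INCONSISTENT: " + views);
    }
}
```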
Describe the bug

I checked the Nacos console and found the cluster status as follows:

```json
{
  "lastRefreshTime": 1635749568396,
  "raftMetaData": {
    "metaDataMap": {
      "naming_instance_metadata": {
        "leader": "nacos-0.nacos-nacos-hs.performance.svc.cluster.local:7848",
        "raftGroupMember": [
          "nacos-0.nacos-nacos-hs.performance.svc.cluster.local:7848",
          "nacos-2.nacos-nacos-hs.performance.svc.cluster.local:7848",
          "nacos-1.nacos-nacos-hs.performance.svc.cluster.local:7848"
        ],
        "term": 1
      },
      "naming_persistent_service": {
        "leader": "nacos-1.nacos-nacos-hs.performance.svc.cluster.local:7848",
        "raftGroupMember": [
          "nacos-0.nacos-nacos-hs.performance.svc.cluster.local:7848",
          "nacos-2.nacos-nacos-hs.performance.svc.cluster.local:7848",
          "nacos-1.nacos-nacos-hs.performance.svc.cluster.local:7848"
        ],
        "term": 1
      },
      "naming_persistent_service_v2": {
        "leader": "nacos-0.nacos-nacos-hs.performance.svc.cluster.local:7848",
        "raftGroupMember": [
          "nacos-0.nacos-nacos-hs.performance.svc.cluster.local:7848",
          "nacos-2.nacos-nacos-hs.performance.svc.cluster.local:7848",
          "nacos-1.nacos-nacos-hs.performance.svc.cluster.local:7848"
        ],
        "term": 1
      },
      "naming_service_metadata": {
        "leader": "nacos-0.nacos-nacos-hs.performance.svc.cluster.local:7848",
        "raftGroupMember": [
          "nacos-0.nacos-nacos-hs.performance.svc.cluster.local:7848",
          "nacos-2.nacos-nacos-hs.performance.svc.cluster.local:7848",
          "nacos-1.nacos-nacos-hs.performance.svc.cluster.local:7848"
        ],
        "term": 1
      }
    }
  },
  "raftPort": "7848",
  "readyToUpgrade": true,
  "site": "unknown",
  "version": "2.0.3",
  "weight": 2
}
```

I noticed that the four entries naming_instance_metadata, naming_persistent_service, naming_persistent_service_v2 and naming_service_metadata do not all have the same leader. Is this the expected result? Could you explain in detail what these four cluster leaders mean?

It feels unreasonable to me that each kind of business logic would have its own leader.