Problems about Cluster Data Environment #373

Open
Coder-Qian opened this issue Sep 21, 2020 · 2 comments

Comments

@Coder-Qian

Question 1:
Lucene allows at most about 2^31 (roughly 2.1 billion) documents per index, so by default a single Cassandra node can store only about 2.1 billion documents per index. Is there a way to exceed 2.1 billion documents on a single node without using virtual indexes? I am asking because I do not know whether virtual indexes cause performance problems.

Question 2:
I have 50 billion rows of time-series data. With 3 replicas, how many nodes do I need so that queries return within seconds? How many CPU cores and how much memory are recommended for each node, and how much off-heap memory should I leave for each node? In the production environment there are 7 servers, each with 256 GB of RAM and 32 CPU cores. Can they meet these requirements if the nodes run in Docker?
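For anyone sizing a similar cluster, here is a rough lower bound on the node count implied by the document limit alone. It is only a sketch under several assumptions that are not stated in the question: all rows go into a single index, each row becomes one Lucene document, each node holds one Lucene index for that Elasticsearch index, and data is spread evenly. The helper name `min_nodes` is made up for illustration, and the result says nothing about query latency.

```python
# Per-index document limit as quoted in the question (approximate).
LUCENE_MAX_DOCS = 2**31


def min_nodes(total_rows: int, replication_factor: int,
              max_docs_per_node: int = LUCENE_MAX_DOCS) -> int:
    """Smallest node count that keeps every node under the per-index limit.

    Assumes one Lucene document per row and an even spread of all
    replicas across the nodes (hypothetical simplifications).
    """
    total_copies = total_rows * replication_factor
    # Integer ceiling division, avoids float rounding on large counts.
    return -(-total_copies // max_docs_per_node)


if __name__ == "__main__":
    # 50 billion rows, 3 replicas, one index.
    print(min_nodes(50_000_000_000, 3))
```

Splitting the data across several indexes (for example by time period) lowers the per-index count on each node, so this bound applies only to the single-index case.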

Please forgive my poor English. Thank you for your advice.

@vroyer
Collaborator

vroyer commented Sep 21, 2020 via email

@Coder-Qian
Author

In our production environment we have three tables. Each table has about 70 fields and needs to store 20 billion rows. The date, in yyyyMMdd format, is used as the partition key. I have read that each Cassandra node can use a 30 GB heap, so with a 2:1 ratio of heap to off-heap memory, one server with 256 GB of memory could host 5 Cassandra nodes (256 / (30 + 15) ≈ 5.7, rounded down to 5). We want to size the cluster correctly when we build the production environment, because migrating data after adding nodes to Elassandra is too slow. Thank you all for your help. Please do not let this post sink.
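The memory arithmetic above can be written out as a small sketch. The function name `nodes_per_server` is hypothetical, and the calculation deliberately mirrors the reasoning in this comment: it does not reserve anything explicitly for the operating system or Docker, so whatever is left over is all the host gets.

```python
def nodes_per_server(server_ram_gb: float, heap_gb: float,
                     heap_to_offheap_ratio: float = 2.0):
    """Return (node_count, gb_per_node, leftover_gb) for one server.

    Off-heap memory per node is derived from the heap size and the
    heap:off-heap ratio (2:1 in the comment above).
    """
    offheap_gb = heap_gb / heap_to_offheap_ratio
    gb_per_node = heap_gb + offheap_gb
    node_count = int(server_ram_gb // gb_per_node)
    leftover_gb = server_ram_gb - node_count * gb_per_node
    return node_count, gb_per_node, leftover_gb


if __name__ == "__main__":
    count, per_node, leftover = nodes_per_server(256, 30)
    print(count, per_node, leftover)  # nodes, GB per node, GB left for the host
    print(count * 7)                  # total nodes across the 7 servers
```

With these inputs each node takes 45 GB, five nodes fit on one server, and 31 GB remains for everything else on the host. Whether that margin is enough is a judgment call that this sketch does not answer.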
