Is there an existing issue for the same bug?
RAGFlow workspace code commit ID
none
RAGFlow image version
v0.13.0
Other environment information
No response
Actual behavior
I launch the model with the built-in image on two GPUs, so FlagEmbedding also runs its model on the GPU. After serving dozens of users at this scale for a period of time (roughly two days), the memory usage of both GPUs grows to a certain level but stays within normal bounds, with no out-of-memory errors. However, the utilization of one GPU remains pinned at 100% regardless of whether it is handling requests, and when questions are asked again the model produces no output. At that point, testing the vLLM-backed large model on its own shows no problem, so the issue must lie in the earlier retrieval or reranking step.
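For anyone trying to confirm the same symptom, a small helper like the sketch below (hypothetical; not part of RAGFlow) can poll per-GPU utilization by parsing `nvidia-smi --query-gpu=... --format=csv,noheader` output, making it easy to see that one GPU stays at 100% even while idle:

```python
import subprocess


def parse_gpu_stats(csv_text):
    """Parse 'index, utilization.gpu [%], memory.used [MiB]' CSV rows
    as emitted by nvidia-smi into a list of dicts."""
    stats = []
    for line in csv_text.strip().splitlines():
        idx, util, mem = [field.strip() for field in line.split(",")]
        stats.append({
            "index": int(idx),
            "util_pct": int(util.rstrip(" %")),   # "100 %" -> 100
            "mem_mib": int(mem.split()[0]),       # "20000 MiB" -> 20000
        })
    return stats


def query_gpus():
    """Call nvidia-smi; return [] if the binary is unavailable or fails."""
    try:
        out = subprocess.run(
            ["nvidia-smi",
             "--query-gpu=index,utilization.gpu,memory.used",
             "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (FileNotFoundError, subprocess.CalledProcessError):
        return []
    return parse_gpu_stats(out)


if __name__ == "__main__":
    for gpu in query_gpus():
        flag = "  <-- pinned?" if gpu["util_pct"] >= 99 else ""
        print(f"GPU {gpu['index']}: {gpu['util_pct']}% util, "
              f"{gpu['mem_mib']} MiB used{flag}")
```

Running this in a loop (e.g. every few seconds) while no requests are in flight should show whether the stuck GPU is the one hosting the embedding/rerank model or the vLLM backend.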
Expected behavior
No response
Steps to reproduce
The issue only appears after dozens of users have been using the system for some time; it has now occurred reliably three times.
Additional information
No response