Server gets stuck after invalid request #5724
Maybe it's also related to #5246
No, it's not fixed. In fact, the issue is because if … In short, you will get the same bug with … I checked on OpenAI, and in fact we can get an embedding even if the input prompt is empty; I believe the output vector means "empty document". The simple fix is to enforce that the input has at least one token. If it's empty, we add a space to it.
I think I pinpointed the bug: in the loop where we try different batch sizes, if the …
Also, the …
Still, I'm not sure it's worth fixing this, because @ggerganov said that the new llama_decode will handle batch sizes itself. Also, a sequence should always have at least a BOS token.
Repro: