Server gets stuck after invalid request #5724

Closed
ggerganov opened this issue Feb 26, 2024 · 4 comments · Fixed by #5733

ggerganov (Owner) commented Feb 26, 2024

Repro:

./server -m models/bert-bge-small/ggml-model-f16.gguf --embedding
# send invalid request
curl http://localhost:8080/v1/embeddings -H "Content-Type: application/json" -H "Authorization: Bearer no-key" -d '{ }'

# the next request makes the server hang
curl http://localhost:8080/v1/embeddings -H "Content-Type: application/json" -H "Authorization: Bearer no-key" -d '{ "input": "hello" }'

# need to kill it
killall server
ggerganov added the bug and server/webui labels on Feb 26, 2024
z80maniac (Contributor) commented:

Maybe it's also related to #5246

phymbert (Collaborator) commented Feb 26, 2024

I will add a test scenario later today; I guess @ngxson already fixed it in #5710.

ngxson (Collaborator) commented Feb 26, 2024

No, it's not fixed. The issue is that if "input" is missing from the request body, it defaults to an empty string. That prevents the slot from being cleaned up correctly (it never moves back to the IDLE state).

In short, you get the same bug with { "input": "" }.

I checked with OpenAI's API: it does return an embedding even when the input prompt is empty; I believe the output vector represents an "empty document".

The simple fix is to enforce that the input has at least one token. If it's empty, we set prompt = " " (a single space). Do you think that's enough, @ggerganov?
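For illustration, a minimal sketch of that fallback, assuming the request body is parsed with nlohmann::json as in the server example; the helper name is hypothetical and this is not the actual llama.cpp server code:

```cpp
#include <string>
#include "json.hpp" // nlohmann::json, as vendored by the server example

using json = nlohmann::json;

// Hypothetical helper: read the embedding input and make sure it is never
// empty, so tokenization yields at least one token and the slot can move
// back to IDLE after processing.
static std::string get_embedding_input(const json & body) {
    std::string prompt = body.value("input", std::string());
    if (prompt.empty()) {
        prompt = " "; // force at least one token, as suggested above
    }
    return prompt;
}
```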

ngxson (Collaborator) commented Feb 26, 2024

I think I pinpointed the bug: in the loop where the prompt is processed in chunks of n_batch tokens, if n_tokens = 0 then the loop body is skipped entirely:

// batch.n_tokens = 0
for (int32_t i = 0; i < (int32_t) batch.n_tokens; i += n_batch)

Also, the has_prompt condition checks whether the prompt is non-empty in order to decide when to release the slot:

const bool has_prompt = slot.prompt.is_array() || (slot.prompt.is_string() && !slot.prompt.get<std::string>().empty()) || !slot.images.empty();

Still, I'm not sure it's worth fixing this, because @ggerganov said that the new llama_decode will handle the batch size itself. Also, a sequence should always have at least a BOS token.
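As a rough sketch of the kind of guard described above, assuming a simplified slot structure; the names are illustrative and this is not the actual server code:

```cpp
#include <cstdint>

// Hypothetical, simplified slot: just enough state to illustrate the problem.
struct server_slot_sketch {
    int32_t n_tokens = 0; // tokens in the prompt
    bool    idle     = true;
};

// Hypothetical processing step: with n_tokens == 0 the batching loop never
// runs, so the slot must be released explicitly or it stays busy forever.
static void process_prompt(server_slot_sketch & slot, int32_t n_batch) {
    if (slot.n_tokens == 0) {
        slot.idle = true; // nothing to decode, release the slot immediately
        return;
    }
    for (int32_t i = 0; i < slot.n_tokens; i += n_batch) {
        // decode up to n_batch tokens starting at position i ...
    }
    slot.idle = true; // prompt fully processed, slot returns to IDLE
}
```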
