diff --git a/docs/openai.md b/docs/openai.md
index d9265c6b3..8e7a5a5a9 100644
--- a/docs/openai.md
+++ b/docs/openai.md
@@ -152,6 +152,40 @@ curl http://localhost:11434/v1/completions \
 
 - `prompt` currently only accepts a string
 
+### `/v1/completions`
+
+#### Supported features
+
+- [x] Completions
+- [x] Streaming
+- [x] JSON mode
+- [x] Reproducible outputs
+- [ ] Logprobs
+
+#### Supported request fields
+
+- [x] `model`
+- [x] `prompt`
+- [x] `frequency_penalty`
+- [x] `presence_penalty`
+- [x] `seed`
+- [x] `stop`
+- [x] `stream`
+- [x] `temperature`
+- [x] `top_p`
+- [x] `max_tokens`
+- [ ] `best_of`
+- [ ] `echo`
+- [ ] `suffix`
+- [ ] `logit_bias`
+- [ ] `user`
+- [ ] `n`
+
+#### Notes
+
+- `prompt` currently only accepts a string
+- `usage.prompt_tokens` will be 0 for completions where prompt evaluation is cached
+
 ## Models
 
 Before using a model, pull it locally `ollama pull`:
diff --git a/llm/llama.cpp b/llm/llama.cpp
index 7c26775ad..a8db2a9ce 160000
--- a/llm/llama.cpp
+++ b/llm/llama.cpp
@@ -1 +1 @@
-Subproject commit 7c26775adb579e92b59c82e8084c07a1d0f75e9c
+Subproject commit a8db2a9ce64cd4417f6a312ab61858f17f0f8584
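
The supported-fields list added by this patch can be exercised with a request body like the sketch below. It uses only the fields the table marks as supported; the model name `llama3` is an assumption (substitute any model you have pulled locally), and the `POST` itself is left commented out since it requires a running Ollama server on `localhost:11434`.

```python
import json

# Request body for POST /v1/completions, restricted to the fields the
# patch documents as supported. The model name is an assumption.
payload = {
    "model": "llama3",           # assumed: any locally pulled model works
    "prompt": "Say hello",       # per the Notes, this must be a string
    "max_tokens": 32,
    "temperature": 0.7,
    "top_p": 0.9,
    "seed": 42,                  # fixed seed for reproducible outputs
    "stop": ["\n"],
    "stream": False,
    "frequency_penalty": 0.0,
    "presence_penalty": 0.0,
}

body = json.dumps(payload)
print(body)

# To actually send it (requires a running server):
#   import urllib.request
#   req = urllib.request.Request(
#       "http://localhost:11434/v1/completions",
#       data=body.encode("utf-8"),
#       headers={"Content-Type": "application/json"},
#   )
#   print(urllib.request.urlopen(req).read().decode("utf-8"))
```

Unsupported fields such as `suffix`, `logit_bias`, or `n` are simply omitted here; per the patch they are not yet honored by the endpoint.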