Compare commits
138 Commits
upgrade-al
Only the commit SHAs survived this export (authors, dates, and messages were not captured):

```
aae31dc6ed 199e79ec0c 8125ce4cb6 636d6eea99 df56f1ee5e 0b6c6c9092
cb60389de7 ce0c95d097 a9bc1e1c37 62c71f4cb1 41aca5c2d0 753724d867
e4576c2ee1 9a7a4b9533 2653191222 b338c0635f 4fcbf1cde6 9220b4fa91
fc39a6cd7a 1e23e82324 f9fd08040b 4318e35ee3 9754c6d9d8 a497235a55
df6dc4fd96 88622847c6 9774663013 a468ae0459 c3e62ba38a 117369aa73
1ba734de67 5208cf09b1 bb9de6037c 272e53a1f5 db2a9ad1fe c9ab1aead3
4a10e7a7fa 86808f80a8 4240b045e6 e547378893 fd77dbec4d fefb3e77d1
ed5489a96e 76113742cf 57e60c836f 622b1f3e67 7ad9844ac0 e43648afe5
823a520266 66ef308abd 29e90cc13b f397e0e988 9da9e8fb72 42e77e2a69
9241a29336 f7231ad9ad 6920964b87 2f9ed52bbd caf2b13c10 1d263449ff
48a273f80b 939c60473f f76ca04f9e 76b8728f0c 1f9078d6ae 6d84f07505
26b13fc33c 1c8435ffa9 6680761596 42b797ed9c 336aa43f3c 69f392c9b7
a1dfab43b9 a0a199b108 ab0d37fde4 14e71350c8 453f572f83 c9dfa6e571
3dcbcd367d e805ac1d59 b9229ffca5 46c847c4ad 92b1a21f79 de76b95dd4
59ec837ef6 f06b99a461 128fce5495 27aa2d4a19 b9f91a0b36 b538dc3858
f0e9496c85 09a6f76f4c e135167484 38296ab352 f43dea68d1 e1f50377f4
7913104527 bfbf2f7cf7 fe3cbd014f 3d6f48507a f3761405c8 e49dc9f3d8
d125510b4b 1ca386aa9e fb56988014 d046bee790 f11bf0740b 8450bf66e6
b4e11be8ef a896079705 583950c828 8ac08a0eec 60f47be64c 6e56077ada
98ae9467bb b7a24af083 c8b1f2369e 72b12c3be7 0632dff3f8 509e2dec8a
78a48de804 e7dbb00331 c3f9538636 2e06ed01d5 4072b5879b 15562e887d
f2245c7c77 e4b9b72f2a 311f8e0c3f f07f8b7a9e 4c4c730a0a e02ecfb6c8
c8059b4dcf 59d87127f5 6eb3cddcb6 a4564232a4 a447a083f2 681a914990
```
.github/workflows/test.yaml (4 changes, vendored)

```diff
@@ -34,7 +34,7 @@ jobs:
       matrix:
         cuda-version:
           - '11.8.0'
-    runs-on: ubuntu-latest
+    runs-on: linux
     container: nvidia/cuda:${{ matrix.cuda-version }}-devel-ubuntu20.04
     steps:
       - run: |
@@ -64,7 +64,7 @@ jobs:
       matrix:
         rocm-version:
-          - '5.7.1'
+          - '6.0'
-    runs-on: ubuntu-latest
+    runs-on: linux
     container: rocm/dev-ubuntu-20.04:${{ matrix.rocm-version }}
     steps:
       - run: |
```
.gitignore (3 changes, vendored)

```diff
@@ -9,4 +9,5 @@ ggml-metal.metal
 .cache
 *.exe
 .idea
-test_data
+test_data
+*.crt
```
README.md (38 changes)

````diff
@@ -10,16 +10,16 @@ Get up and running with large language models locally.
 
 ### macOS
 
-[Download](https://ollama.ai/download/Ollama-darwin.zip)
+[Download](https://ollama.com/download/Ollama-darwin.zip)
 
-### Windows
+### Windows preview
 
-Coming soon! For now, you can install Ollama on Windows via WSL2.
+[Download](https://ollama.com/download/OllamaSetup.exe)
 
-### Linux & WSL2
+### Linux
 
 ```
-curl https://ollama.ai/install.sh | sh
+curl -fsSL https://ollama.com/install.sh | sh
 ```
 
 [Manual install instructions](https://github.com/jmorganca/ollama/blob/main/docs/linux.md)
@@ -35,7 +35,7 @@ The official [Ollama Docker image](https://hub.docker.com/r/ollama/ollama) `olla
 
 ## Quickstart
 
-To run and chat with [Llama 2](https://ollama.ai/library/llama2):
+To run and chat with [Llama 2](https://ollama.com/library/llama2):
 
 ```
 ollama run llama2
@@ -43,9 +43,9 @@ ollama run llama2
 
 ## Model library
 
-Ollama supports a list of open-source models available on [ollama.ai/library](https://ollama.ai/library 'ollama model library')
+Ollama supports a list of models available on [ollama.com/library](https://ollama.com/library 'ollama model library')
 
-Here are some example open-source models that can be downloaded:
+Here are some example models that can be downloaded:
 
 | Model | Parameters | Size | Download |
 | ------------------ | ---------- | ----- | ------------------------------ |
@@ -200,18 +200,21 @@ brew install cmake go
 ```
 
 Then generate dependencies:
 
 ```
 go generate ./...
 ```
 
 Then build the binary:
 
 ```
 go build .
 ```
 
 More detailed instructions can be found in the [developer guide](https://github.com/jmorganca/ollama/blob/main/docs/development.md)
 
+### Running local builds
+
 Next, start the server:
 
 ```
@@ -253,19 +256,21 @@ See the [API documentation](./docs/api.md) for all endpoints.
 
 ## Community Integrations
 
 ### Web & Desktop
 
 - [Bionic GPT](https://github.com/bionic-gpt/bionic-gpt)
 - [HTML UI](https://github.com/rtcfirefly/ollama-ui)
 - [Chatbot UI](https://github.com/ivanfioravanti/chatbot-ollama)
 - [Typescript UI](https://github.com/ollama-interface/Ollama-Gui?tab=readme-ov-file)
 - [Minimalistic React UI for Ollama Models](https://github.com/richawo/minimal-llm-ui)
-- [Web UI](https://github.com/ollama-webui/ollama-webui)
+- [Open WebUI](https://github.com/open-webui/open-webui)
 - [Ollamac](https://github.com/kevinhermawan/Ollamac)
 - [big-AGI](https://github.com/enricoros/big-agi/blob/main/docs/config-ollama.md)
 - [Cheshire Cat assistant framework](https://github.com/cheshire-cat-ai/core)
 - [Amica](https://github.com/semperai/amica)
 - [chatd](https://github.com/BruceMacD/chatd)
 - [Ollama-SwiftUI](https://github.com/kghandour/Ollama-SwiftUI)
+- [MindMac](https://mindmac.app)
+- [NextJS Web Interface for Ollama](https://github.com/jakobhoeg/nextjs-ollama-llm-ui)
 
 ### Terminal
 
@@ -274,10 +279,14 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [Emacs client](https://github.com/zweifisch/ollama)
 - [gen.nvim](https://github.com/David-Kunz/gen.nvim)
 - [ollama.nvim](https://github.com/nomnivore/ollama.nvim)
+- [ollama-chat.nvim](https://github.com/gerazov/ollama-chat.nvim)
 - [ogpt.nvim](https://github.com/huynle/ogpt.nvim)
+- [gptel Emacs client](https://github.com/karthink/gptel)
 - [Oatmeal](https://github.com/dustinblackman/oatmeal)
 - [cmdh](https://github.com/pgibler/cmdh)
+- [tenere](https://github.com/pythops/tenere)
 - [llm-ollama](https://github.com/taketwo/llm-ollama) for [Datasette's LLM CLI](https://llm.datasette.io/en/stable/).
+- [ShellOracle](https://github.com/djcopley/ShellOracle)
 
 ### Database
 
@@ -286,12 +295,14 @@ See the [API documentation](./docs/api.md) for all endpoints.
 ### Package managers
 
 - [Pacman](https://archlinux.org/packages/extra/x86_64/ollama/)
 - [Helm Chart](https://artifacthub.io/packages/helm/ollama-helm/ollama)
 
 ### Libraries
 
 - [LangChain](https://python.langchain.com/docs/integrations/llms/ollama) and [LangChain.js](https://js.langchain.com/docs/modules/model_io/models/llms/integrations/ollama) with [example](https://js.langchain.com/docs/use_cases/question_answering/local_retrieval_qa)
 - [LangChainGo](https://github.com/tmc/langchaingo/) with [example](https://github.com/tmc/langchaingo/tree/main/examples/ollama-completion-example)
 - [LlamaIndex](https://gpt-index.readthedocs.io/en/stable/examples/llm/ollama.html)
 - [LangChain4j](https://github.com/langchain4j/langchain4j/tree/main/langchain4j-ollama)
 - [LiteLLM](https://github.com/BerriAI/litellm)
 - [OllamaSharp for .NET](https://github.com/awaescher/OllamaSharp)
 - [Ollama for Ruby](https://github.com/gbaptista/ollama-ai)
@@ -304,7 +315,8 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [LangChainDart](https://github.com/davidmigloz/langchain_dart)
 - [Semantic Kernel - Python](https://github.com/microsoft/semantic-kernel/tree/main/python/semantic_kernel/connectors/ai/ollama)
 - [Haystack](https://github.com/deepset-ai/haystack-integrations/blob/main/integrations/ollama.md)
 - [Ollama for R - rollama](https://github.com/JBGruber/rollama)
 - [Ollama-ex for Elixir](https://github.com/lebrunel/ollama-ex)
 
 ### Mobile
 
@@ -326,3 +338,5 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [Llama Coder](https://github.com/ex3ndr/llama-coder) (Copilot alternative using Ollama)
 - [Obsidian BMO Chatbot plugin](https://github.com/longy2k/obsidian-bmo-chatbot)
 - [Open Interpreter](https://docs.openinterpreter.com/language-model-setup/local-models/ollama)
+- [twinny](https://github.com/rjmacarthy/twinny) (Copilot and Copilot chat alternative using Ollama)
+- [Wingman-AI](https://github.com/RussellCanfield/wingman-ai) (Copilot code and chat alternative using Ollama and HuggingFace)
````
```diff
@@ -415,8 +415,7 @@ func (d *Duration) UnmarshalJSON(b []byte) (err error) {
 	switch t := v.(type) {
 	case float64:
 		if t < 0 {
-			t = math.MaxFloat64
-			d.Duration = time.Duration(t)
+			d.Duration = time.Duration(math.MaxInt64)
 		} else {
 			d.Duration = time.Duration(t * float64(time.Second))
 		}
@@ -426,8 +425,7 @@ func (d *Duration) UnmarshalJSON(b []byte) (err error) {
 			return err
 		}
 		if d.Duration < 0 {
-			mf := math.MaxFloat64
-			d.Duration = time.Duration(mf)
+			d.Duration = time.Duration(math.MaxInt64)
 		}
 	}
```
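Both hunks above replace a `math.MaxFloat64`-based clamp with `math.MaxInt64`. A minimal standalone sketch of why, using only the standard library (illustrative, not code from this changeset):

```go
package main

import (
	"fmt"
	"math"
	"time"
)

func main() {
	// time.Duration is an int64 count of nanoseconds. Converting
	// math.MaxFloat64 overflows int64, so the old clamp produced a
	// garbage value rather than a usable "maximum" timeout.
	fmt.Println(time.Duration(math.MaxFloat64)) // overflowed, not meaningful
	fmt.Println(time.Duration(math.MaxInt64))   // ~292 years: an effective "forever"
}
```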
app/.gitignore (93 changes, vendored): the Node/Electron ignore list is replaced by a single entry.

```diff
@@ -1,92 +1 @@
-# Logs
-logs
-*.log
-npm-debug.log*
-yarn-debug.log*
-yarn-error.log*
-lerna-debug.log*
-
-# Diagnostic reports (https://nodejs.org/api/report.html)
-report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json
-
-# Runtime data
-pids
-*.pid
-*.seed
-*.pid.lock
-.DS_Store
-
-# Directory for instrumented libs generated by jscoverage/JSCover
-lib-cov
-
-# Coverage directory used by tools like istanbul
-coverage
-*.lcov
-
-# nyc test coverage
-.nyc_output
-
-# node-waf configuration
-.lock-wscript
-
-# Compiled binary addons (https://nodejs.org/api/addons.html)
-build/Release
-
-# Dependency directories
-node_modules/
-jspm_packages/
-
-# TypeScript v1 declaration files
-typings/
-
-# TypeScript cache
-*.tsbuildinfo
-
-# Optional npm cache directory
-.npm
-
-# Optional eslint cache
-.eslintcache
-
-# Optional REPL history
-.node_repl_history
-
-# Output of 'npm pack'
-*.tgz
-
-# Yarn Integrity file
-.yarn-integrity
-
-# dotenv environment variables file
-.env
-.env.test
-
-# parcel-bundler cache (https://parceljs.org/)
-.cache
-
-# next.js build output
-.next
-
-# nuxt.js build output
-.nuxt
-
-# vuepress build output
-.vuepress/dist
-
-# Serverless directories
-.serverless/
-
-# FuseBox cache
-.fusebox/
-
-# DynamoDB Local files
-.dynamodb/
-
-# Webpack
-.webpack/
-
-# Vite
-.vite/
-
-# Electron-Forge
-out/
+ollama.syso
```
````diff
@@ -1,21 +1,22 @@
-# Desktop
+# Ollama App
 
-This app builds upon Ollama to provide a desktop experience for running models.
+## Linux
 
-## Developing
+TODO
 
-First, build the `ollama` binary:
+## MacOS
+
+TODO
+
+## Windows
+
+If you want to build the installer, you'll need to install
+- https://jrsoftware.org/isinfo.php
+
+In the top directory of this repo, run the following powershell script
+to build the ollama CLI, ollama app, and ollama installer.
 
 ```
-cd ..
-go build .
+powershell -ExecutionPolicy Bypass -File .\scripts\build_windows.ps1
 ```
-
-Then run the desktop app with `npm start`:
-
-```
-cd app
-npm install
-npm start
-```
````
BIN app/assets/app.ico (new file, 7.3 KiB)
app/assets/assets.go (new file, 17 lines)

```go
package assets

import (
	"embed"
	"io/fs"
)

//go:embed *.ico
var icons embed.FS

func ListIcons() ([]string, error) {
	return fs.Glob(icons, "*")
}

func GetIcon(filename string) ([]byte, error) {
	return icons.ReadFile(filename)
}
```
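A minimal sketch of how this embedded-asset package might be consumed; the caller below is hypothetical, not part of the diff:

```go
package main

import (
	"fmt"
	"log"

	"github.com/jmorganca/ollama/app/assets"
)

func main() {
	// List everything compiled in via //go:embed *.ico.
	names, err := assets.ListIcons()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("bundled icons:", names)

	// Read one icon's bytes, e.g. to hand to the tray API.
	data, err := assets.GetIcon("tray.ico")
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("tray.ico: %d bytes\n", len(data))
}
```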
BIN app/assets/setup.bmp (new file, 76 KiB)

BIN app/assets/tray.ico (new file, 89 KiB)

BIN app/assets/tray_upgrade.ico (new file, 91 KiB)
app/lifecycle/getstarted_nonwindows.go (new file, 9 lines)

```go
//go:build !windows

package lifecycle

import "fmt"

func GetStarted() error {
	return fmt.Errorf("GetStarted not implemented")
}
```
app/lifecycle/getstarted_windows.go (new file, 44 lines)

```go
package lifecycle

import (
	"fmt"
	"log/slog"
	"os"
	"os/exec"
	"path/filepath"
	"syscall"
)

func GetStarted() error {
	const CREATE_NEW_CONSOLE = 0x00000010
	var err error
	bannerScript := filepath.Join(AppDir, "ollama_welcome.ps1")
	args := []string{
		// TODO once we're signed, the execution policy bypass should be removed
		"powershell", "-noexit", "-ExecutionPolicy", "Bypass", "-nologo", "-file", bannerScript,
	}
	args[0], err = exec.LookPath(args[0])
	if err != nil {
		return err
	}

	// Make sure the script actually exists
	_, err = os.Stat(bannerScript)
	if err != nil {
		return fmt.Errorf("getting started banner script error %s", err)
	}

	slog.Info(fmt.Sprintf("opening getting started terminal with %v", args))
	attrs := &os.ProcAttr{
		Files: []*os.File{os.Stdin, os.Stdout, os.Stderr},
		Sys:   &syscall.SysProcAttr{CreationFlags: CREATE_NEW_CONSOLE, HideWindow: false},
	}
	proc, err := os.StartProcess(args[0], args, attrs)
	if err != nil {
		return fmt.Errorf("unable to start getting started shell %w", err)
	}

	slog.Debug(fmt.Sprintf("getting started terminal PID: %d", proc.Pid))
	return proc.Release()
}
```
app/lifecycle/lifecycle.go (new file, 92 lines)

```go
package lifecycle

import (
	"context"
	"fmt"
	"log"
	"log/slog"
	"os"
	"os/signal"
	"syscall"

	"github.com/jmorganca/ollama/app/store"
	"github.com/jmorganca/ollama/app/tray"
)

func Run() {
	InitLogging()

	ctx, cancel := context.WithCancel(context.Background())
	var done chan int

	t, err := tray.NewTray()
	if err != nil {
		log.Fatalf("Failed to start: %s", err)
	}
	callbacks := t.GetCallbacks()

	signals := make(chan os.Signal, 1)
	signal.Notify(signals, syscall.SIGINT, syscall.SIGTERM)

	go func() {
		slog.Debug("starting callback loop")
		for {
			select {
			case <-callbacks.Quit:
				slog.Debug("quit called")
				t.Quit()
			case <-signals:
				slog.Debug("shutting down due to signal")
				t.Quit()
			case <-callbacks.Update:
				err := DoUpgrade(cancel, done)
				if err != nil {
					slog.Warn(fmt.Sprintf("upgrade attempt failed: %s", err))
				}
			case <-callbacks.ShowLogs:
				ShowLogs()
			case <-callbacks.DoFirstUse:
				err := GetStarted()
				if err != nil {
					slog.Warn(fmt.Sprintf("Failed to launch getting started shell: %s", err))
				}
			}
		}
	}()

	// Are we first use?
	if !store.GetFirstTimeRun() {
		slog.Debug("First time run")
		err = t.DisplayFirstUseNotification()
		if err != nil {
			slog.Debug(fmt.Sprintf("XXX failed to display first use notification %v", err))
		}
		store.SetFirstTimeRun(true)
	} else {
		slog.Debug("Not first time, skipping first run notification")
	}

	if IsServerRunning(ctx) {
		slog.Info("Detected another instance of ollama running, exiting")
		os.Exit(1)
	} else {
		done, err = SpawnServer(ctx, CLIName)
		if err != nil {
			// TODO - should we retry in a backoff loop?
			// TODO - should we pop up a warning and maybe add a menu item to view application logs?
			slog.Error(fmt.Sprintf("Failed to spawn ollama server %s", err))
			done = make(chan int, 1)
			done <- 1
		}
	}

	StartBackgroundUpdaterChecker(ctx, t.UpdateAvailable)

	t.Run()
	cancel()
	slog.Info("Waiting for ollama server to shutdown...")
	if done != nil {
		<-done
	}
	slog.Info("Ollama app exiting")
}
```
app/lifecycle/logging.go (new file, 46 lines)

```go
package lifecycle

import (
	"fmt"
	"log/slog"
	"os"
	"path/filepath"
)

func InitLogging() {
	level := slog.LevelInfo

	if debug := os.Getenv("OLLAMA_DEBUG"); debug != "" {
		level = slog.LevelDebug
	}

	var logFile *os.File
	var err error
	// Detect if we're a GUI app on windows, and if not, send logs to console
	if os.Stderr.Fd() != 0 {
		// Console app detected
		logFile = os.Stderr
		// TODO - write one-line to the app.log file saying we're running in console mode to help avoid confusion
	} else {
		logFile, err = os.OpenFile(AppLogFile, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0755)
		if err != nil {
			slog.Error(fmt.Sprintf("failed to create server log %v", err))
			return
		}
	}
	handler := slog.NewTextHandler(logFile, &slog.HandlerOptions{
		Level:     level,
		AddSource: true,
		ReplaceAttr: func(_ []string, attr slog.Attr) slog.Attr {
			if attr.Key == slog.SourceKey {
				source := attr.Value.Any().(*slog.Source)
				source.File = filepath.Base(source.File)
			}
			return attr
		},
	})

	slog.SetDefault(slog.New(handler))

	slog.Info("ollama app started")
}
```
app/lifecycle/logging_nonwindows.go (new file, 9 lines)

```go
//go:build !windows

package lifecycle

import "log/slog"

func ShowLogs() {
	slog.Warn("ShowLogs not yet implemented")
}
```
app/lifecycle/logging_windows.go (new file, 19 lines)

```go
package lifecycle

import (
	"fmt"
	"log/slog"
	"os/exec"
	"syscall"
)

func ShowLogs() {
	cmd_path := "c:\\Windows\\system32\\cmd.exe"
	slog.Debug(fmt.Sprintf("viewing logs with start %s", AppDataDir))
	cmd := exec.Command(cmd_path, "/c", "start", AppDataDir)
	cmd.SysProcAttr = &syscall.SysProcAttr{HideWindow: false, CreationFlags: 0x08000000}
	err := cmd.Start()
	if err != nil {
		slog.Error(fmt.Sprintf("Failed to open log dir: %s", err))
	}
}
```
app/lifecycle/paths.go (new file, 79 lines)

```go
package lifecycle

import (
	"errors"
	"fmt"
	"log/slog"
	"os"
	"path/filepath"
	"runtime"
	"strings"
)

var (
	AppName    = "ollama app"
	CLIName    = "ollama"
	AppDir     = "/opt/Ollama"
	AppDataDir = "/opt/Ollama"
	// TODO - should there be a distinct log dir?
	UpdateStageDir = "/tmp"
	AppLogFile     = "/tmp/ollama_app.log"
	ServerLogFile  = "/tmp/ollama.log"
	UpgradeLogFile = "/tmp/ollama_update.log"
	Installer      = "OllamaSetup.exe"
)

func init() {
	if runtime.GOOS == "windows" {
		AppName += ".exe"
		CLIName += ".exe"
		// Logs, configs, downloads go to LOCALAPPDATA
		localAppData := os.Getenv("LOCALAPPDATA")
		AppDataDir = filepath.Join(localAppData, "Ollama")
		UpdateStageDir = filepath.Join(AppDataDir, "updates")
		AppLogFile = filepath.Join(AppDataDir, "app.log")
		ServerLogFile = filepath.Join(AppDataDir, "server.log")
		UpgradeLogFile = filepath.Join(AppDataDir, "upgrade.log")

		// Executables are stored in APPDATA
		AppDir = filepath.Join(localAppData, "Programs", "Ollama")

		// Make sure we have PATH set correctly for any spawned children
		paths := strings.Split(os.Getenv("PATH"), ";")
		// Start with whatever we find in the PATH/LD_LIBRARY_PATH
		found := false
		for _, path := range paths {
			d, err := filepath.Abs(path)
			if err != nil {
				continue
			}
			if strings.EqualFold(AppDir, d) {
				found = true
			}
		}
		if !found {
			paths = append(paths, AppDir)

			pathVal := strings.Join(paths, ";")
			slog.Debug("setting PATH=" + pathVal)
			err := os.Setenv("PATH", pathVal)
			if err != nil {
				slog.Error(fmt.Sprintf("failed to update PATH: %s", err))
			}
		}

		// Make sure our logging dir exists
		_, err := os.Stat(AppDataDir)
		if errors.Is(err, os.ErrNotExist) {
			if err := os.MkdirAll(AppDataDir, 0o755); err != nil {
				slog.Error(fmt.Sprintf("create ollama dir %s: %v", AppDataDir, err))
			}
		}
	} else if runtime.GOOS == "darwin" {
		// TODO
		AppName += ".app"
		// } else if runtime.GOOS == "linux" {
		// TODO
	}
}
```
app/lifecycle/server.go (new file, 139 lines)

```go
package lifecycle

import (
	"context"
	"errors"
	"fmt"
	"io"
	"log/slog"
	"os"
	"os/exec"
	"path/filepath"
	"time"

	"github.com/jmorganca/ollama/api"
)

func getCLIFullPath(command string) string {
	cmdPath := ""
	appExe, err := os.Executable()
	if err == nil {
		cmdPath = filepath.Join(filepath.Dir(appExe), command)
		_, err := os.Stat(cmdPath)
		if err == nil {
			return cmdPath
		}
	}
	cmdPath, err = exec.LookPath(command)
	if err == nil {
		_, err := os.Stat(cmdPath)
		if err == nil {
			return cmdPath
		}
	}
	pwd, err := os.Getwd()
	if err == nil {
		cmdPath = filepath.Join(pwd, command)
		_, err = os.Stat(cmdPath)
		if err == nil {
			return cmdPath
		}
	}

	return command
}

func SpawnServer(ctx context.Context, command string) (chan int, error) {
	done := make(chan int)

	logDir := filepath.Dir(ServerLogFile)
	_, err := os.Stat(logDir)
	if errors.Is(err, os.ErrNotExist) {
		if err := os.MkdirAll(logDir, 0o755); err != nil {
			return done, fmt.Errorf("create ollama server log dir %s: %v", logDir, err)
		}
	}

	cmd := getCmd(ctx, getCLIFullPath(command))
	// send stdout and stderr to a file
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return done, fmt.Errorf("failed to spawn server stdout pipe %s", err)
	}
	stderr, err := cmd.StderrPipe()
	if err != nil {
		return done, fmt.Errorf("failed to spawn server stderr pipe %s", err)
	}
	stdin, err := cmd.StdinPipe()
	if err != nil {
		return done, fmt.Errorf("failed to spawn server stdin pipe %s", err)
	}

	// TODO - rotation
	logFile, err := os.OpenFile(ServerLogFile, os.O_APPEND|os.O_WRONLY|os.O_CREATE, 0755)
	if err != nil {
		return done, fmt.Errorf("failed to create server log %w", err)
	}
	go func() {
		defer logFile.Close()
		io.Copy(logFile, stdout) //nolint:errcheck
	}()
	go func() {
		defer logFile.Close()
		io.Copy(logFile, stderr) //nolint:errcheck
	}()

	// run the command and wait for it to finish
	if err := cmd.Start(); err != nil {
		return done, fmt.Errorf("failed to start server %w", err)
	}
	if cmd.Process != nil {
		slog.Info(fmt.Sprintf("started ollama server with pid %d", cmd.Process.Pid))
	}
	slog.Info(fmt.Sprintf("ollama server logs %s", ServerLogFile))

	go func() {
		// Keep the server running unless we're shutting down the app
		crashCount := 0
		for {
			cmd.Wait() //nolint:errcheck
			stdin.Close()
			var code int
			if cmd.ProcessState != nil {
				code = cmd.ProcessState.ExitCode()
			}

			select {
			case <-ctx.Done():
				slog.Debug(fmt.Sprintf("server shutdown with exit code %d", code))
				done <- code
				return
			default:
				crashCount++
				slog.Warn(fmt.Sprintf("server crash %d - exit code %d - respawning", crashCount, code))
				time.Sleep(500 * time.Millisecond)
				if err := cmd.Start(); err != nil {
					slog.Error(fmt.Sprintf("failed to restart server %s", err))
					// Keep trying, but back off if we keep failing
					time.Sleep(time.Duration(crashCount) * time.Second)
				}
			}
		}
	}()
	return done, nil
}

func IsServerRunning(ctx context.Context) bool {
	client, err := api.ClientFromEnvironment()
	if err != nil {
		slog.Info("unable to connect to server")
		return false
	}
	err = client.Heartbeat(ctx)
	if err != nil {
		slog.Debug(fmt.Sprintf("heartbeat from server: %s", err))
		slog.Info("unable to connect to server")
		return false
	}
	return true
}
```
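For orientation, a minimal sketch of how `IsServerRunning` and `SpawnServer` compose; the caller below is hypothetical (the real wiring is in `lifecycle.Run` above):

```go
package main

import (
	"context"
	"log"

	"github.com/jmorganca/ollama/app/lifecycle"
)

func main() {
	ctx, cancel := context.WithCancel(context.Background())

	// Refuse to double-start: another instance already answers heartbeats.
	if lifecycle.IsServerRunning(ctx) {
		log.Fatal("ollama server already running")
	}

	done, err := lifecycle.SpawnServer(ctx, "ollama")
	if err != nil {
		log.Fatalf("spawn failed: %s", err)
	}

	// Cancelling the context ends the respawn loop; the exit code of the
	// final server process arrives on done.
	cancel()
	log.Printf("server exited with code %d", <-done)
}
```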
app/lifecycle/server_unix.go (new file, 12 lines)

```go
//go:build !windows

package lifecycle

import (
	"context"
	"os/exec"
)

func getCmd(ctx context.Context, cmd string) *exec.Cmd {
	return exec.CommandContext(ctx, cmd, "serve")
}
```
app/lifecycle/server_windows.go (new file, 13 lines)

```go
package lifecycle

import (
	"context"
	"os/exec"
	"syscall"
)

func getCmd(ctx context.Context, exePath string) *exec.Cmd {
	cmd := exec.CommandContext(ctx, exePath, "serve")
	cmd.SysProcAttr = &syscall.SysProcAttr{HideWindow: true, CreationFlags: 0x08000000}
	return cmd
}
```
app/lifecycle/updater.go (new file, 238 lines)

```go
package lifecycle

import (
	"context"
	"crypto/rand"
	"encoding/json"
	"errors"
	"fmt"
	"io"
	"log/slog"
	"mime"
	"net/http"
	"net/url"
	"os"
	"path"
	"path/filepath"
	"runtime"
	"strings"
	"time"

	"github.com/jmorganca/ollama/auth"
	"github.com/jmorganca/ollama/version"
)

var (
	UpdateCheckURLBase  = "https://ollama.com/api/update"
	UpdateDownloaded    = false
	UpdateCheckInterval = 60 * 60 * time.Second
)

// TODO - maybe move up to the API package?
type UpdateResponse struct {
	UpdateURL     string `json:"url"`
	UpdateVersion string `json:"version"`
}

func getClient(req *http.Request) http.Client {
	proxyURL, err := http.ProxyFromEnvironment(req)
	if err != nil {
		slog.Warn(fmt.Sprintf("failed to handle proxy: %s", err))
		return http.Client{}
	}

	return http.Client{
		Transport: &http.Transport{
			Proxy: http.ProxyURL(proxyURL),
		},
	}
}

func IsNewReleaseAvailable(ctx context.Context) (bool, UpdateResponse) {
	var updateResp UpdateResponse

	requestURL, err := url.Parse(UpdateCheckURLBase)
	if err != nil {
		return false, updateResp
	}

	query := requestURL.Query()
	query.Add("os", runtime.GOOS)
	query.Add("arch", runtime.GOARCH)
	query.Add("version", version.Version)
	query.Add("ts", fmt.Sprintf("%d", time.Now().Unix()))

	nonce, err := auth.NewNonce(rand.Reader, 16)
	if err != nil {
		return false, updateResp
	}

	query.Add("nonce", nonce)
	requestURL.RawQuery = query.Encode()

	data := []byte(fmt.Sprintf("%s,%s", http.MethodGet, requestURL.RequestURI()))
	signature, err := auth.Sign(ctx, data)
	if err != nil {
		return false, updateResp
	}

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, requestURL.String(), nil)
	if err != nil {
		slog.Warn(fmt.Sprintf("failed to check for update: %s", err))
		return false, updateResp
	}
	req.Header.Set("Authorization", signature)
	req.Header.Set("User-Agent", fmt.Sprintf("ollama/%s (%s %s) Go/%s", version.Version, runtime.GOARCH, runtime.GOOS, runtime.Version()))
	client := getClient(req)

	slog.Debug("checking for available update", "requestURL", requestURL)
	resp, err := client.Do(req)
	if err != nil {
		slog.Warn(fmt.Sprintf("failed to check for update: %s", err))
		return false, updateResp
	}
	defer resp.Body.Close()

	if resp.StatusCode == 204 {
		slog.Debug("check update response 204 (current version is up to date)")
		return false, updateResp
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		slog.Warn(fmt.Sprintf("failed to read body response: %s", err))
	}
	err = json.Unmarshal(body, &updateResp)
	if err != nil {
		slog.Warn(fmt.Sprintf("malformed response checking for update: %s", err))
		return false, updateResp
	}
	// Extract the version string from the URL in the github release artifact path
	updateResp.UpdateVersion = path.Base(path.Dir(updateResp.UpdateURL))

	slog.Info("New update available at " + updateResp.UpdateURL)
	return true, updateResp
}

func DownloadNewRelease(ctx context.Context, updateResp UpdateResponse) error {
	// Do a head first to check etag info
	req, err := http.NewRequestWithContext(ctx, http.MethodHead, updateResp.UpdateURL, nil)
	if err != nil {
		return err
	}
	client := getClient(req)
	resp, err := client.Do(req)
	if err != nil {
		return fmt.Errorf("error checking update: %w", err)
	}
	if resp.StatusCode != 200 {
		return fmt.Errorf("unexpected status attempting to download update %d", resp.StatusCode)
	}
	resp.Body.Close()
	etag := strings.Trim(resp.Header.Get("etag"), "\"")
	if etag == "" {
		slog.Debug("no etag detected, falling back to filename based dedup")
		etag = "_"
	}
	filename := Installer
	_, params, err := mime.ParseMediaType(resp.Header.Get("content-disposition"))
	if err == nil {
		filename = params["filename"]
	}

	stageFilename := filepath.Join(UpdateStageDir, etag, filename)

	// Check to see if we already have it downloaded
	_, err = os.Stat(stageFilename)
	if err == nil {
		slog.Info("update already downloaded")
		return nil
	}

	cleanupOldDownloads()

	req.Method = http.MethodGet
	resp, err = client.Do(req)
	if err != nil {
		return fmt.Errorf("error checking update: %w", err)
	}
	defer resp.Body.Close()
	etag = strings.Trim(resp.Header.Get("etag"), "\"")
	if etag == "" {
		slog.Debug("no etag detected, falling back to filename based dedup") // TODO probably can get rid of this redundant log
		etag = "_"
	}

	stageFilename = filepath.Join(UpdateStageDir, etag, filename)

	_, err = os.Stat(filepath.Dir(stageFilename))
	if errors.Is(err, os.ErrNotExist) {
		if err := os.MkdirAll(filepath.Dir(stageFilename), 0o755); err != nil {
			return fmt.Errorf("create ollama dir %s: %v", filepath.Dir(stageFilename), err)
		}
	}

	payload, err := io.ReadAll(resp.Body)
	if err != nil {
		return fmt.Errorf("failed to read body response: %w", err)
	}
	fp, err := os.OpenFile(stageFilename, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o755)
	if err != nil {
		return fmt.Errorf("write payload %s: %w", stageFilename, err)
	}
	defer fp.Close()
	if n, err := fp.Write(payload); err != nil || n != len(payload) {
		return fmt.Errorf("write payload %s: %d vs %d -- %w", stageFilename, n, len(payload), err)
	}
	slog.Info("new update downloaded " + stageFilename)

	UpdateDownloaded = true
	return nil
}

func cleanupOldDownloads() {
	files, err := os.ReadDir(UpdateStageDir)
	if err != nil && errors.Is(err, os.ErrNotExist) {
		// Expected behavior on first run
		return
	} else if err != nil {
		slog.Warn(fmt.Sprintf("failed to list stage dir: %s", err))
		return
	}
	for _, file := range files {
		fullname := filepath.Join(UpdateStageDir, file.Name())
		slog.Debug("cleaning up old download: " + fullname)
		err = os.RemoveAll(fullname)
		if err != nil {
			slog.Warn(fmt.Sprintf("failed to cleanup stale update download %s", err))
		}
	}
}

func StartBackgroundUpdaterChecker(ctx context.Context, cb func(string) error) {
	go func() {
		// Don't blast an update message immediately after startup
		// time.Sleep(30 * time.Second)
		time.Sleep(3 * time.Second)

		for {
			available, resp := IsNewReleaseAvailable(ctx)
			if available {
				err := DownloadNewRelease(ctx, resp)
				if err != nil {
					slog.Error(fmt.Sprintf("failed to download new release: %s", err))
				}
				err = cb(resp.UpdateVersion)
				if err != nil {
					slog.Warn(fmt.Sprintf("failed to register update available with tray: %s", err))
				}
			}
			select {
			case <-ctx.Done():
				slog.Debug("stopping background update checker")
				return
			default:
				time.Sleep(UpdateCheckInterval)
			}
		}
	}()
}
```
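The checker's wire contract, restated as a small runnable sketch; the endpoint shape and the 204 behavior come from the code above, while the sample URL is made up:

```go
package main

import (
	"fmt"
	"path"
)

func main() {
	// GET https://ollama.com/api/update?os=...&arch=...&version=...&nonce=...
	// answers 204 when the client is current, or a JSON body of the form
	// {"url": "...", "version": "..."}. The version string is then
	// recovered from the release artifact path:
	updateURL := "https://example.com/releases/v0.1.25/OllamaSetup.exe" // made-up sample
	fmt.Println(path.Base(path.Dir(updateURL)))                         // "v0.1.25"
}
```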
app/lifecycle/updater_nonwindows.go (new file, 12 lines)

```go
//go:build !windows

package lifecycle

import (
	"context"
	"fmt"
)

func DoUpgrade(cancel context.CancelFunc, done chan int) error {
	return fmt.Errorf("DoUpgrade not yet implemented")
}
```
app/lifecycle/updater_windows.go (new file, 80 lines)

```go
package lifecycle

import (
	"context"
	"fmt"
	"log/slog"
	"os"
	"os/exec"
	"path/filepath"
)

func DoUpgrade(cancel context.CancelFunc, done chan int) error {
	files, err := filepath.Glob(filepath.Join(UpdateStageDir, "*", "*.exe")) // TODO generalize for multiplatform
	if err != nil {
		return fmt.Errorf("failed to lookup downloads: %s", err)
	}
	if len(files) == 0 {
		return fmt.Errorf("no update downloads found")
	} else if len(files) > 1 {
		// Shouldn't happen
		slog.Warn(fmt.Sprintf("multiple downloads found, using first one %v", files))
	}
	installerExe := files[0]

	slog.Info("starting upgrade with " + installerExe)
	slog.Info("upgrade log file " + UpgradeLogFile)

	// When running in debug mode, we'll be "verbose" and let the installer pop up and prompt
	installArgs := []string{
		"/CLOSEAPPLICATIONS",                    // Quit the tray app if it's still running
		"/LOG=" + filepath.Base(UpgradeLogFile), // Only relative seems reliable, so set pwd
		"/FORCECLOSEAPPLICATIONS",               // Force close the tray app - might be needed
	}
	// When we're not in debug mode, make the upgrade as quiet as possible (no GUI, no prompts)
	// TODO - temporarily disable since we're pinning in debug mode for the preview
	// if debug := os.Getenv("OLLAMA_DEBUG"); debug == "" {
	installArgs = append(installArgs,
		"/SP", // Skip the "This will install... Do you wish to continue" prompt
		"/SUPPRESSMSGBOXES",
		"/SILENT",
		"/VERYSILENT",
	)
	// }

	// Safeguard in case we have requests in flight that need to drain...
	slog.Info("Waiting for server to shutdown")
	cancel()
	if done != nil {
		<-done
	} else {
		// Shouldn't happen
		slog.Warn("done chan was nil, not actually waiting")
	}

	slog.Debug(fmt.Sprintf("starting installer: %s %v", installerExe, installArgs))
	os.Chdir(filepath.Dir(UpgradeLogFile)) //nolint:errcheck
	cmd := exec.Command(installerExe, installArgs...)

	if err := cmd.Start(); err != nil {
		return fmt.Errorf("unable to start ollama app %w", err)
	}

	if cmd.Process != nil {
		err = cmd.Process.Release()
		if err != nil {
			slog.Error(fmt.Sprintf("failed to release server process: %s", err))
		}
	} else {
		// TODO - some details about why it didn't start, or is this a pedantic error case?
		return fmt.Errorf("installer process did not start")
	}

	// TODO should we linger for a moment and check to make sure it's actually running by checking the pid?

	slog.Info("Installer started in background, exiting")

	os.Exit(0)
	// Not reached
	return nil
}
```
app/main.go (new file, 12 lines)

```go
package main

// Compile with the following to get rid of the cmd pop up on windows
// go build -ldflags="-H windowsgui" .

import (
	"github.com/jmorganca/ollama/app/lifecycle"
)

func main() {
	lifecycle.Run()
}
```
app/ollama.iss (new file, 153 lines)

```
; Inno Setup Installer for Ollama
;
; To build the installer use the build script invoked from the top of the source tree
;
; powershell -ExecutionPolicy Bypass -File .\scripts\build_windows.ps

#define MyAppName "Ollama"
#if GetEnv("PKG_VERSION") != ""
  #define MyAppVersion GetEnv("PKG_VERSION")
#else
  #define MyAppVersion "0.0.0"
#endif
#define MyAppPublisher "Ollama"
#define MyAppURL "https://ollama.com/"
#define MyAppExeName "ollama app.exe"
#define MyIcon ".\assets\app.ico"

[Setup]
; NOTE: The value of AppId uniquely identifies this application. Do not use the same AppId value in installers for other applications.
; (To generate a new GUID, click Tools | Generate GUID inside the IDE.)
AppId={{44E83376-CE68-45EB-8FC1-393500EB558C}
AppName={#MyAppName}
AppVersion={#MyAppVersion}
VersionInfoVersion={#MyAppVersion}
;AppVerName={#MyAppName} {#MyAppVersion}
AppPublisher={#MyAppPublisher}
AppPublisherURL={#MyAppURL}
AppSupportURL={#MyAppURL}
AppUpdatesURL={#MyAppURL}
ArchitecturesAllowed=x64
ArchitecturesInstallIn64BitMode=x64
DefaultDirName={localappdata}\Programs\{#MyAppName}
DefaultGroupName={#MyAppName}
DisableProgramGroupPage=yes
PrivilegesRequired=lowest
OutputBaseFilename="OllamaSetup"
SetupIconFile={#MyIcon}
UninstallDisplayIcon={uninstallexe}
Compression=lzma2
SolidCompression=no
WizardStyle=modern
ChangesEnvironment=yes
OutputDir=..\dist\

; Disable logging once everything's battle tested
; Filename will be %TEMP%\Setup Log*.txt
SetupLogging=yes
CloseApplications=yes
RestartApplications=no

; Make sure they can at least download llama2 as a minimum
ExtraDiskSpaceRequired=3826806784

; https://jrsoftware.org/ishelp/index.php?topic=setup_wizardimagefile
WizardSmallImageFile=.\assets\setup.bmp

; TODO verify actual min windows version...
; OG Win 10
MinVersion=10.0.10240

; First release that supports WinRT UI Composition for win32 apps
; MinVersion=10.0.17134
; First release with XAML Islands - possible UI path forward
; MinVersion=10.0.18362

; quiet...
DisableDirPage=yes
DisableFinishedPage=yes
DisableReadyMemo=yes
DisableReadyPage=yes
DisableStartupPrompt=yes
DisableWelcomePage=yes

; TODO - percentage can't be set less than 100, so how to make it shorter?
; WizardSizePercent=100,80

#if GetEnv("KEY_CONTAINER")
SignTool=MySignTool
SignedUninstaller=yes
#endif

SetupMutex=OllamaSetupMutex

[Languages]
Name: "english"; MessagesFile: "compiler:Default.isl"

[LangOptions]
DialogFontSize=12

[Files]
Source: ".\app.exe"; DestDir: "{app}"; DestName: "{#MyAppExeName}" ; Flags: ignoreversion 64bit
Source: "..\ollama.exe"; DestDir: "{app}"; Flags: ignoreversion 64bit
Source: "..\dist\windeps\*.dll"; DestDir: "{app}"; Flags: ignoreversion 64bit
Source: "..\dist\ollama_welcome.ps1"; DestDir: "{app}"; Flags: ignoreversion
Source: ".\assets\app.ico"; DestDir: "{app}"; Flags: ignoreversion

[Icons]
Name: "{group}\{#MyAppName}"; Filename: "{app}\{#MyAppExeName}"; IconFilename: "{app}\app.ico"
Name: "{userstartup}\{#MyAppName}"; Filename: "{app}\{#MyAppExeName}"; IconFilename: "{app}\app.ico"
Name: "{userprograms}\{#MyAppName}"; Filename: "{app}\{#MyAppExeName}"; IconFilename: "{app}\app.ico"

[Run]
Filename: "{cmd}"; Parameters: "/C set PATH={app};%PATH% & ""{app}\{#MyAppExeName}"""; Flags: postinstall nowait runhidden

[UninstallRun]
; Filename: "{cmd}"; Parameters: "/C ""taskkill /im ''{#MyAppExeName}'' /f /t"; Flags: runhidden
; Filename: "{cmd}"; Parameters: "/C ""taskkill /im ollama.exe /f /t"; Flags: runhidden
Filename: "taskkill"; Parameters: "/im ""{#MyAppExeName}"" /f /t"; Flags: runhidden
Filename: "taskkill"; Parameters: "/im ""ollama.exe"" /f /t"; Flags: runhidden
; HACK! need to give the server and app enough time to exit
; TODO - convert this to a Pascal code script so it waits until they're no longer running, then completes
Filename: "{cmd}"; Parameters: "/c timeout 5"; Flags: runhidden

[UninstallDelete]
Type: filesandordirs; Name: "{%TEMP}\ollama*"
Type: filesandordirs; Name: "{%LOCALAPPDATA}\Ollama"
Type: filesandordirs; Name: "{%LOCALAPPDATA}\Programs\Ollama"
Type: filesandordirs; Name: "{%USERPROFILE}\.ollama"
; NOTE: if the user has a custom OLLAMA_MODELS it will be preserved

[Messages]
WizardReady=Ollama Windows Preview
ReadyLabel1=%nLet's get you up and running with your own large language models.
SetupAppRunningError=Another Ollama installer is running.%n%nPlease cancel or finish the other installer, then click OK to continue with this install, or Cancel to exit.

;FinishedHeadingLabel=Run your first model
;FinishedLabel=%nRun this command in a PowerShell or cmd terminal.%n%n%n    ollama run llama2
;ClickFinish=%n

[Registry]
Root: HKCU; Subkey: "Environment"; \
    ValueType: expandsz; ValueName: "Path"; ValueData: "{olddata};{app}"; \
    Check: NeedsAddPath('{app}')

[Code]

function NeedsAddPath(Param: string): boolean;
var
  OrigPath: string;
begin
  if not RegQueryStringValue(HKEY_CURRENT_USER,
    'Environment',
    'Path', OrigPath)
  then begin
    Result := True;
    exit;
  end;
  { look for the path with leading and trailing semicolon }
  { Pos() returns 0 if not found }
  Result := Pos(';' + ExpandConstant(Param) + ';', ';' + OrigPath + ';') = 0;
end;
```
app/ollama.rc (new file, 29 lines)

```
#include <winver.h>

VS_VERSION_INFO VERSIONINFO
FILEFLAGSMASK 0x3fL
#ifdef _DEBUG
FILEFLAGS 0x1L
#else
FILEFLAGS 0x0L
#endif
FILEOS 0x40004L
FILETYPE 0x1L
FILESUBTYPE 0x0L
BEGIN
    BLOCK "StringFileInfo"
    BEGIN
        BLOCK "040904b0"
        BEGIN
            VALUE "FileDescription", "Ollama"
            VALUE "InternalName", "Ollama"
            VALUE "OriginalFilename", "ollama app.exe"
            VALUE "ProductName", "Ollama"
        END
    END

    BLOCK "VarFileInfo"
    BEGIN
        VALUE "Translation", 0x409, 1200
    END
END
```
app/ollama_welcome.ps1 (new file, 8 lines)

```powershell
# TODO - consider ANSI colors and maybe ASCII art...
write-host ""
write-host "Welcome to Ollama!"
write-host ""
write-host "Run your first model:"
write-host ""
write-host "`tollama run llama2"
write-host ""
```
app/store/store.go (new file, 98 lines)

```go
package store

import (
	"encoding/json"
	"errors"
	"fmt"
	"log/slog"
	"os"
	"path/filepath"
	"sync"

	"github.com/google/uuid"
)

type Store struct {
	ID           string `json:"id"`
	FirstTimeRun bool   `json:"first-time-run"`
}

var (
	lock  sync.Mutex
	store Store
)

func GetID() string {
	lock.Lock()
	defer lock.Unlock()
	if store.ID == "" {
		initStore()
	}
	return store.ID
}

func GetFirstTimeRun() bool {
	lock.Lock()
	defer lock.Unlock()
	if store.ID == "" {
		initStore()
	}
	return store.FirstTimeRun
}

func SetFirstTimeRun(val bool) {
	lock.Lock()
	defer lock.Unlock()
	if store.FirstTimeRun == val {
		return
	}
	store.FirstTimeRun = val
	writeStore(getStorePath())
}

// lock must be held
func initStore() {
	storeFile, err := os.Open(getStorePath())
	if err == nil {
		defer storeFile.Close()
		err = json.NewDecoder(storeFile).Decode(&store)
		if err == nil {
			slog.Debug(fmt.Sprintf("loaded existing store %s - ID: %s", getStorePath(), store.ID))
			return
		}
	} else if !errors.Is(err, os.ErrNotExist) {
		slog.Debug(fmt.Sprintf("unexpected error searching for store: %s", err))
	}
	slog.Debug("initializing new store")
	store.ID = uuid.New().String()
	writeStore(getStorePath())
}

func writeStore(storeFilename string) {
	ollamaDir := filepath.Dir(storeFilename)
	_, err := os.Stat(ollamaDir)
	if errors.Is(err, os.ErrNotExist) {
		if err := os.MkdirAll(ollamaDir, 0o755); err != nil {
			slog.Error(fmt.Sprintf("create ollama dir %s: %v", ollamaDir, err))
			return
		}
	}
	payload, err := json.Marshal(store)
	if err != nil {
		slog.Error(fmt.Sprintf("failed to marshal store: %s", err))
		return
	}
	fp, err := os.OpenFile(storeFilename, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, 0o755)
	if err != nil {
		slog.Error(fmt.Sprintf("write store payload %s: %v", storeFilename, err))
		return
	}
	defer fp.Close()
	if n, err := fp.Write(payload); err != nil || n != len(payload) {
		slog.Error(fmt.Sprintf("write store payload %s: %d vs %d -- %v", storeFilename, n, len(payload), err))
		return
	}
	slog.Debug("Store contents: " + string(payload))
	slog.Info(fmt.Sprintf("wrote store: %s", storeFilename))
}
```
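A minimal sketch of the store's intended use, mirroring the first-run check in `lifecycle.Run`; the caller below is hypothetical:

```go
package main

import (
	"fmt"

	"github.com/jmorganca/ollama/app/store"
)

func main() {
	// The first accessor lazily loads (or creates) config.json at the
	// platform-specific path from store_{darwin,linux,windows}.go.
	fmt.Println("install id:", store.GetID())

	if !store.GetFirstTimeRun() {
		fmt.Println("first run: show the welcome notification")
		store.SetFirstTimeRun(true) // persisted immediately
	}
}
```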
app/store/store_darwin.go (new file, 13 lines)

```go
package store

import (
	"os"
	"path/filepath"
)

func getStorePath() string {
	// TODO - system wide location?

	home := os.Getenv("HOME")
	return filepath.Join(home, "Library", "Application Support", "Ollama", "config.json")
}
```
app/store/store_linux.go (new file, 16 lines)

```go
package store

import (
	"os"
	"path/filepath"
)

func getStorePath() string {
	if os.Geteuid() == 0 {
		// TODO where should we store this on linux for system-wide operation?
		return "/etc/ollama/config.json"
	}

	home := os.Getenv("HOME")
	return filepath.Join(home, ".ollama", "config.json")
}
```
app/store/store_windows.go (new file, 11 lines)

```go
package store

import (
	"os"
	"path/filepath"
)

func getStorePath() string {
	localAppData := os.Getenv("LOCALAPPDATA")
	return filepath.Join(localAppData, "Ollama", "config.json")
}
```
app/tray/commontray/types.go (new file, 24 lines)

```go
package commontray

var (
	Title   = "Ollama"
	ToolTip = "Ollama"

	UpdateIconName = "tray_upgrade"
	IconName       = "tray"
)

type Callbacks struct {
	Quit       chan struct{}
	Update     chan struct{}
	DoFirstUse chan struct{}
	ShowLogs   chan struct{}
}

type OllamaTray interface {
	GetCallbacks() Callbacks
	Run()
	UpdateAvailable(ver string) error
	DisplayFirstUseNotification() error
	Quit()
}
```
app/tray/tray.go (new file, 33 lines)

```go
package tray

import (
	"fmt"
	"runtime"

	"github.com/jmorganca/ollama/app/assets"
	"github.com/jmorganca/ollama/app/tray/commontray"
)

func NewTray() (commontray.OllamaTray, error) {
	extension := ".png"
	if runtime.GOOS == "windows" {
		extension = ".ico"
	}
	iconName := commontray.UpdateIconName + extension
	updateIcon, err := assets.GetIcon(iconName)
	if err != nil {
		return nil, fmt.Errorf("failed to load icon %s: %w", iconName, err)
	}
	iconName = commontray.IconName + extension
	icon, err := assets.GetIcon(iconName)
	if err != nil {
		return nil, fmt.Errorf("failed to load icon %s: %w", iconName, err)
	}

	tray, err := InitPlatformTray(icon, updateIcon)
	if err != nil {
		return nil, err
	}

	return tray, nil
}
```
app/tray/tray_nonwindows.go (new file, 13 lines)

```go
//go:build !windows

package tray

import (
	"fmt"

	"github.com/jmorganca/ollama/app/tray/commontray"
)

func InitPlatformTray(icon, updateIcon []byte) (commontray.OllamaTray, error) {
	return nil, fmt.Errorf("NOT IMPLEMENTED YET")
}
```
app/tray/tray_windows.go (new file, 10 lines)

```go
package tray

import (
	"github.com/jmorganca/ollama/app/tray/commontray"
	"github.com/jmorganca/ollama/app/tray/wintray"
)

func InitPlatformTray(icon, updateIcon []byte) (commontray.OllamaTray, error) {
	return wintray.InitTray(icon, updateIcon)
}
```
184
app/tray/wintray/eventloop.go
Normal file
@@ -0,0 +1,184 @@
|
||||
//go:build windows
|
||||
|
||||
package wintray
|
||||
|
||||
import (
|
||||
"fmt"
|
||||
"log/slog"
|
||||
"sync"
|
||||
"unsafe"
|
||||
|
||||
"golang.org/x/sys/windows"
|
||||
)
|
||||
|
||||
var (
|
||||
quitOnce sync.Once
|
||||
)
|
||||
|
||||
func (t *winTray) Run() {
|
||||
nativeLoop()
|
||||
}
|
||||
|
||||
func nativeLoop() {
|
||||
// Main message pump.
|
||||
slog.Debug("starting event handling loop")
|
||||
m := &struct {
|
||||
WindowHandle windows.Handle
|
||||
Message uint32
|
||||
Wparam uintptr
|
||||
Lparam uintptr
|
||||
Time uint32
|
||||
Pt point
|
||||
LPrivate uint32
|
||||
}{}
|
||||
for {
|
||||
ret, _, err := pGetMessage.Call(uintptr(unsafe.Pointer(m)), 0, 0, 0)
|
||||
|
||||
// If the function retrieves a message other than WM_QUIT, the return value is nonzero.
|
||||
// If the function retrieves the WM_QUIT message, the return value is zero.
|
||||
// If there is an error, the return value is -1
|
||||
// https://msdn.microsoft.com/en-us/library/windows/desktop/ms644936(v=vs.85).aspx
|
||||
switch int32(ret) {
|
||||
case -1:
|
||||
slog.Error(fmt.Sprintf("get message failure: %v", err))
|
||||
return
|
||||
case 0:
|
||||
return
|
||||
default:
|
||||
pTranslateMessage.Call(uintptr(unsafe.Pointer(m))) //nolint:errcheck
|
||||
pDispatchMessage.Call(uintptr(unsafe.Pointer(m))) //nolint:errcheck
|
||||
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// WindowProc callback function that processes messages sent to a window.
|
||||
// https://msdn.microsoft.com/en-us/library/windows/desktop/ms633573(v=vs.85).aspx
|
||||
func (t *winTray) wndProc(hWnd windows.Handle, message uint32, wParam, lParam uintptr) (lResult uintptr) {
|
||||
const (
|
||||
WM_RBUTTONUP = 0x0205
|
||||
WM_LBUTTONUP = 0x0202
|
||||
WM_COMMAND = 0x0111
|
||||
WM_ENDSESSION = 0x0016
|
||||
WM_CLOSE = 0x0010
|
||||
WM_DESTROY = 0x0002
|
||||
WM_MOUSEMOVE = 0x0200
|
||||
WM_LBUTTONDOWN = 0x0201
|
||||
)
|
||||
switch message {
|
||||
case WM_COMMAND:
|
||||
menuItemId := int32(wParam)
|
||||
// https://docs.microsoft.com/en-us/windows/win32/menurc/wm-command#menus
|
||||
switch menuItemId {
|
||||
case quitMenuID:
|
||||
select {
|
||||
case t.callbacks.Quit <- struct{}{}:
|
||||
// should not happen but in case not listening
|
||||
default:
|
||||
slog.Error("no listener on Quit")
|
||||
}
|
||||
case updateMenuID:
|
||||
select {
|
||||
case t.callbacks.Update <- struct{}{}:
|
||||
// should not happen but in case not listening
|
||||
default:
|
||||
slog.Error("no listener on Update")
|
||||
}
|
||||
case diagLogsMenuID:
|
||||
select {
|
||||
case t.callbacks.ShowLogs <- struct{}{}:
|
||||
// should not happen but in case not listening
|
||||
default:
|
||||
slog.Error("no listener on ShowLogs")
|
||||
}
|
||||
default:
|
||||
slog.Debug(fmt.Sprintf("Unexpected menu item id: %d", menuItemId))
|
||||
}
|
||||
case WM_CLOSE:
|
||||
boolRet, _, err := pDestroyWindow.Call(uintptr(t.window))
|
||||
if boolRet == 0 {
|
||||
slog.Error(fmt.Sprintf("failed to destroy window: %s", err))
|
||||
}
|
||||
        err = t.wcex.unregister()
        if err != nil {
            slog.Error(fmt.Sprintf("failed to unregister window class: %s", err))
        }
    case WM_DESTROY:
        // same as WM_ENDSESSION, but throws 0 exit code after all
        defer pPostQuitMessage.Call(uintptr(int32(0))) //nolint:errcheck
        fallthrough
    case WM_ENDSESSION:
        t.muNID.Lock()
        if t.nid != nil {
            err := t.nid.delete()
            if err != nil {
                slog.Error(fmt.Sprintf("failed to delete nid: %s", err))
            }
        }
        t.muNID.Unlock()
    case t.wmSystrayMessage:
        switch lParam {
        case WM_MOUSEMOVE, WM_LBUTTONDOWN:
            // Ignore these...
        case WM_RBUTTONUP, WM_LBUTTONUP:
            err := t.showMenu()
            if err != nil {
                slog.Error(fmt.Sprintf("failed to show menu: %s", err))
            }
        case 0x405: // TODO - how is this magic value derived for the notification left click
            if t.pendingUpdate {
                select {
                case t.callbacks.Update <- struct{}{}:
                // should not happen but in case not listening
                default:
                    slog.Error("no listener on Update")
                }
            } else {
                select {
                case t.callbacks.DoFirstUse <- struct{}{}:
                // should not happen but in case not listening
                default:
                    slog.Error("no listener on DoFirstUse")
                }
            }
        case 0x404: // Middle click or close notification
            // slog.Debug("doing nothing on close of first time notification")
        default:
            // 0x402 also seems common - what is it?
            slog.Debug(fmt.Sprintf("unmanaged app message, lParm: 0x%x", lParam))
        }
    case t.wmTaskbarCreated: // on explorer.exe restarts
        t.muNID.Lock()
        err := t.nid.add()
        if err != nil {
            slog.Error(fmt.Sprintf("failed to refresh the taskbar on explorer restart: %s", err))
        }
        t.muNID.Unlock()
    default:
        // Calls the default window procedure to provide default processing for any window messages that an application does not process.
        // https://msdn.microsoft.com/en-us/library/windows/desktop/ms633572(v=vs.85).aspx
        lResult, _, _ = pDefWindowProc.Call(
            uintptr(hWnd),
            uintptr(message),
            uintptr(wParam),
            uintptr(lParam),
        )
    }
    return
}

func (t *winTray) Quit() {
    quitOnce.Do(quit)
}

func quit() {
    boolRet, _, err := pPostMessage.Call(
        uintptr(wt.window),
        WM_CLOSE,
        0,
        0,
    )
    if boolRet == 0 {
        slog.Error(fmt.Sprintf("failed to post close message on shutdown %s", err))
    }
}
71
app/tray/wintray/menus.go
Normal file
@@ -0,0 +1,71 @@
//go:build windows

package wintray

import (
    "fmt"
    "log/slog"
    "unsafe"

    "golang.org/x/sys/windows"
)

const (
    updatAvailableMenuID = 1
    updateMenuID         = updatAvailableMenuID + 1
    separatorMenuID      = updateMenuID + 1
    diagLogsMenuID       = separatorMenuID + 1
    diagSeparatorMenuID  = diagLogsMenuID + 1
    quitMenuID           = diagSeparatorMenuID + 1
)

func (t *winTray) initMenus() error {
    if err := t.addOrUpdateMenuItem(diagLogsMenuID, 0, diagLogsMenuTitle, false); err != nil {
        return fmt.Errorf("unable to create menu entries: %w", err)
    }
    if err := t.addSeparatorMenuItem(diagSeparatorMenuID, 0); err != nil {
        return fmt.Errorf("unable to create menu entries: %w", err)
    }
    if err := t.addOrUpdateMenuItem(quitMenuID, 0, quitMenuTitle, false); err != nil {
        return fmt.Errorf("unable to create menu entries: %w", err)
    }
    return nil
}

func (t *winTray) UpdateAvailable(ver string) error {
    if !t.updateNotified {
        slog.Debug("updating menu and sending notification for new update")
        if err := t.addOrUpdateMenuItem(updatAvailableMenuID, 0, updateAvailableMenuTitle, true); err != nil {
            return fmt.Errorf("unable to create menu entries: %w", err)
        }
        if err := t.addOrUpdateMenuItem(updateMenuID, 0, updateMenutTitle, false); err != nil {
            return fmt.Errorf("unable to create menu entries: %w", err)
        }
        if err := t.addSeparatorMenuItem(separatorMenuID, 0); err != nil {
            return fmt.Errorf("unable to create menu entries: %w", err)
        }
        iconFilePath, err := iconBytesToFilePath(wt.updateIcon)
        if err != nil {
            return fmt.Errorf("unable to write icon data to temp file: %w", err)
        }
        if err := wt.setIcon(iconFilePath); err != nil {
            return fmt.Errorf("unable to set icon: %w", err)
        }
        t.updateNotified = true

        t.pendingUpdate = true
        // Now pop up the notification
        t.muNID.Lock()
        defer t.muNID.Unlock()
        copy(t.nid.InfoTitle[:], windows.StringToUTF16(updateTitle))
        copy(t.nid.Info[:], windows.StringToUTF16(fmt.Sprintf(updateMessage, ver)))
        t.nid.Flags |= NIF_INFO
        t.nid.Timeout = 10
        t.nid.Size = uint32(unsafe.Sizeof(*wt.nid))
        err = t.nid.modify()
        if err != nil {
            return err
        }
    }
    return nil
}
15
app/tray/wintray/messages.go
Normal file
@@ -0,0 +1,15 @@
//go:build windows

package wintray

const (
    firstTimeTitle   = "Ollama is running"
    firstTimeMessage = "Click here to get started"
    updateTitle      = "Update available"
    updateMessage    = "Ollama version %s is ready to install"

    quitMenuTitle            = "Quit Ollama"
    updateAvailableMenuTitle = "An update is available"
    updateMenutTitle         = "Restart to update"
    diagLogsMenuTitle        = "View logs"
)
66
app/tray/wintray/notifyicon.go
Normal file
@@ -0,0 +1,66 @@
//go:build windows

package wintray

import (
    "unsafe"

    "golang.org/x/sys/windows"
)

// Contains information that the system needs to display notifications in the notification area.
// Used by Shell_NotifyIcon.
// https://msdn.microsoft.com/en-us/library/windows/desktop/bb773352(v=vs.85).aspx
// https://msdn.microsoft.com/en-us/library/windows/desktop/bb762159
type notifyIconData struct {
    Size                       uint32
    Wnd                        windows.Handle
    ID, Flags, CallbackMessage uint32
    Icon                       windows.Handle
    Tip                        [128]uint16
    State, StateMask           uint32
    Info                       [256]uint16
    // Timeout, Version uint32
    Timeout uint32

    InfoTitle   [64]uint16
    InfoFlags   uint32
    GuidItem    windows.GUID
    BalloonIcon windows.Handle
}

func (nid *notifyIconData) add() error {
    const NIM_ADD = 0x00000000
    res, _, err := pShellNotifyIcon.Call(
        uintptr(NIM_ADD),
        uintptr(unsafe.Pointer(nid)),
    )
    if res == 0 {
        return err
    }
    return nil
}

func (nid *notifyIconData) modify() error {
    const NIM_MODIFY = 0x00000001
    res, _, err := pShellNotifyIcon.Call(
        uintptr(NIM_MODIFY),
        uintptr(unsafe.Pointer(nid)),
    )
    if res == 0 {
        return err
    }
    return nil
}

func (nid *notifyIconData) delete() error {
    const NIM_DELETE = 0x00000002
    res, _, err := pShellNotifyIcon.Call(
        uintptr(NIM_DELETE),
        uintptr(unsafe.Pointer(nid)),
    )
    if res == 0 {
        return err
    }
    return nil
}
485
app/tray/wintray/tray.go
Normal file
@@ -0,0 +1,485 @@
//go:build windows

package wintray

import (
    "crypto/md5"
    "encoding/hex"
    "fmt"
    "log/slog"
    "os"
    "path/filepath"
    "sort"
    "sync"
    "unsafe"

    "github.com/jmorganca/ollama/app/tray/commontray"
    "golang.org/x/sys/windows"
)

// Helpful sources: https://github.com/golang/exp/blob/master/shiny/driver/internal/win32

// Contains information about loaded resources
type winTray struct {
    instance,
    icon,
    cursor,
    window windows.Handle

    loadedImages   map[string]windows.Handle
    muLoadedImages sync.RWMutex

    // menus keeps track of the submenus keyed by the menu item ID, plus 0
    // which corresponds to the main popup menu.
    menus    map[uint32]windows.Handle
    muMenus  sync.RWMutex
    menuOf   map[uint32]windows.Handle
    muMenuOf sync.RWMutex
    // menuItemIcons maintains the bitmap of each menu item (if applies). It's
    // needed to show the icon correctly when showing a previously hidden menu
    // item again.
    // menuItemIcons map[uint32]windows.Handle
    // muMenuItemIcons sync.RWMutex
    visibleItems   map[uint32][]uint32
    muVisibleItems sync.RWMutex

    nid   *notifyIconData
    muNID sync.RWMutex
    wcex  *wndClassEx

    wmSystrayMessage,
    wmTaskbarCreated uint32

    pendingUpdate  bool
    updateNotified bool // Only pop up the notification once - TODO consider daily nag?
    // Callbacks
    callbacks  commontray.Callbacks
    normalIcon []byte
    updateIcon []byte
}

var wt winTray

func (t *winTray) GetCallbacks() commontray.Callbacks {
    return t.callbacks
}

func InitTray(icon, updateIcon []byte) (*winTray, error) {
    wt.callbacks.Quit = make(chan struct{})
    wt.callbacks.Update = make(chan struct{})
    wt.callbacks.ShowLogs = make(chan struct{})
    wt.callbacks.DoFirstUse = make(chan struct{})
    wt.normalIcon = icon
    wt.updateIcon = updateIcon
    if err := wt.initInstance(); err != nil {
        return nil, fmt.Errorf("unable to init instance: %w", err)
    }

    if err := wt.createMenu(); err != nil {
        return nil, fmt.Errorf("unable to create menu: %w", err)
    }

    iconFilePath, err := iconBytesToFilePath(wt.normalIcon)
    if err != nil {
        return nil, fmt.Errorf("unable to write icon data to temp file: %w", err)
    }
    if err := wt.setIcon(iconFilePath); err != nil {
        return nil, fmt.Errorf("unable to set icon: %w", err)
    }

    return &wt, wt.initMenus()
}

func (t *winTray) initInstance() error {
    const (
        className  = "OllamaClass"
        windowName = ""
    )

    t.wmSystrayMessage = WM_USER + 1
    t.visibleItems = make(map[uint32][]uint32)
    t.menus = make(map[uint32]windows.Handle)
    t.menuOf = make(map[uint32]windows.Handle)

    t.loadedImages = make(map[string]windows.Handle)

    taskbarEventNamePtr, _ := windows.UTF16PtrFromString("TaskbarCreated")
    // https://msdn.microsoft.com/en-us/library/windows/desktop/ms644947
    res, _, err := pRegisterWindowMessage.Call(
        uintptr(unsafe.Pointer(taskbarEventNamePtr)),
    )
    if res == 0 { // on success, the registered message ID falls in 0xC000-0xFFFF
        return fmt.Errorf("failed to register window message: %w", err)
    }
    t.wmTaskbarCreated = uint32(res)

    instanceHandle, _, err := pGetModuleHandle.Call(0)
    if instanceHandle == 0 {
        return err
    }
    t.instance = windows.Handle(instanceHandle)

    // https://msdn.microsoft.com/en-us/library/windows/desktop/ms648072(v=vs.85).aspx
    iconHandle, _, err := pLoadIcon.Call(0, uintptr(IDI_APPLICATION))
    if iconHandle == 0 {
        return err
    }
    t.icon = windows.Handle(iconHandle)

    // https://msdn.microsoft.com/en-us/library/windows/desktop/ms648391(v=vs.85).aspx
    cursorHandle, _, err := pLoadCursor.Call(0, uintptr(IDC_ARROW))
    if cursorHandle == 0 {
        return err
    }
    t.cursor = windows.Handle(cursorHandle)

    classNamePtr, err := windows.UTF16PtrFromString(className)
    if err != nil {
        return err
    }

    windowNamePtr, err := windows.UTF16PtrFromString(windowName)
    if err != nil {
        return err
    }

    t.wcex = &wndClassEx{
        Style:      CS_HREDRAW | CS_VREDRAW,
        WndProc:    windows.NewCallback(t.wndProc),
        Instance:   t.instance,
        Icon:       t.icon,
        Cursor:     t.cursor,
        Background: windows.Handle(6), // (COLOR_WINDOW + 1)
        ClassName:  classNamePtr,
        IconSm:     t.icon,
    }
    if err := t.wcex.register(); err != nil {
        return err
    }

    windowHandle, _, err := pCreateWindowEx.Call(
        uintptr(0),
        uintptr(unsafe.Pointer(classNamePtr)),
        uintptr(unsafe.Pointer(windowNamePtr)),
        uintptr(WS_OVERLAPPEDWINDOW),
        uintptr(CW_USEDEFAULT),
        uintptr(CW_USEDEFAULT),
        uintptr(CW_USEDEFAULT),
        uintptr(CW_USEDEFAULT),
        uintptr(0),
        uintptr(0),
        uintptr(t.instance),
        uintptr(0),
    )
    if windowHandle == 0 {
        return err
    }
    t.window = windows.Handle(windowHandle)

    pShowWindow.Call(uintptr(t.window), uintptr(SW_HIDE)) //nolint:errcheck

    boolRet, _, err := pUpdateWindow.Call(uintptr(t.window))
    if boolRet == 0 {
        slog.Error(fmt.Sprintf("failed to update window: %s", err))
    }

    t.muNID.Lock()
    defer t.muNID.Unlock()
    t.nid = &notifyIconData{
        Wnd:             windows.Handle(t.window),
        ID:              100,
        Flags:           NIF_MESSAGE,
        CallbackMessage: t.wmSystrayMessage,
    }
    t.nid.Size = uint32(unsafe.Sizeof(*t.nid))

    return t.nid.add()
}

func (t *winTray) createMenu() error {

    menuHandle, _, err := pCreatePopupMenu.Call()
    if menuHandle == 0 {
        return err
    }
    t.menus[0] = windows.Handle(menuHandle)

    // https://msdn.microsoft.com/en-us/library/windows/desktop/ms647575(v=vs.85).aspx
    mi := struct {
        Size, Mask, Style, Max uint32
        Background             windows.Handle
        ContextHelpID          uint32
        MenuData               uintptr
    }{
        Mask: MIM_APPLYTOSUBMENUS,
    }
    mi.Size = uint32(unsafe.Sizeof(mi))

    res, _, err := pSetMenuInfo.Call(
        uintptr(t.menus[0]),
        uintptr(unsafe.Pointer(&mi)),
    )
    if res == 0 {
        return err
    }
    return nil
}

// Contains information about a menu item.
// https://msdn.microsoft.com/en-us/library/windows/desktop/ms647578(v=vs.85).aspx
type menuItemInfo struct {
    Size, Mask, Type, State     uint32
    ID                          uint32
    SubMenu, Checked, Unchecked windows.Handle
    ItemData                    uintptr
    TypeData                    *uint16
    Cch                         uint32
    BMPItem                     windows.Handle
}

func (t *winTray) addOrUpdateMenuItem(menuItemId uint32, parentId uint32, title string, disabled bool) error {
    titlePtr, err := windows.UTF16PtrFromString(title)
    if err != nil {
        return err
    }

    mi := menuItemInfo{
        Mask:     MIIM_FTYPE | MIIM_STRING | MIIM_ID | MIIM_STATE,
        Type:     MFT_STRING,
        ID:       uint32(menuItemId),
        TypeData: titlePtr,
        Cch:      uint32(len(title)),
    }
    mi.Size = uint32(unsafe.Sizeof(mi))
    if disabled {
        mi.State |= MFS_DISABLED
    }

    var res uintptr
    t.muMenus.RLock()
    menu := t.menus[parentId]
    t.muMenus.RUnlock()
    if t.getVisibleItemIndex(parentId, menuItemId) != -1 {
        // The menu item already exists; update it in place based on the menuID.
        // Assign to res so the create branch below is skipped on success.
        res, _, err = pSetMenuItemInfo.Call(
            uintptr(menu),
            uintptr(menuItemId),
            0,
            uintptr(unsafe.Pointer(&mi)),
        )
        if res == 0 {
            return fmt.Errorf("failed to set menu item: %w", err)
        }
    }

    if res == 0 {
        // Menu item does not already exist, create it
        t.muMenus.RLock()
        submenu, exists := t.menus[menuItemId]
        t.muMenus.RUnlock()
        if exists {
            mi.Mask |= MIIM_SUBMENU
            mi.SubMenu = submenu
        }
        t.addToVisibleItems(parentId, menuItemId)
        position := t.getVisibleItemIndex(parentId, menuItemId)
        res, _, err = pInsertMenuItem.Call(
            uintptr(menu),
            uintptr(position),
            1,
            uintptr(unsafe.Pointer(&mi)),
        )
        if res == 0 {
            t.delFromVisibleItems(parentId, menuItemId)
            return err
        }
        t.muMenuOf.Lock()
        t.menuOf[menuItemId] = menu
        t.muMenuOf.Unlock()
    }

    return nil
}

func (t *winTray) addSeparatorMenuItem(menuItemId, parentId uint32) error {

    mi := menuItemInfo{
        Mask: MIIM_FTYPE | MIIM_ID | MIIM_STATE,
        Type: MFT_SEPARATOR,
        ID:   uint32(menuItemId),
    }

    mi.Size = uint32(unsafe.Sizeof(mi))

    t.addToVisibleItems(parentId, menuItemId)
    position := t.getVisibleItemIndex(parentId, menuItemId)
    t.muMenus.RLock()
    menu := uintptr(t.menus[parentId])
    t.muMenus.RUnlock()
    res, _, err := pInsertMenuItem.Call(
        menu,
        uintptr(position),
        1,
        uintptr(unsafe.Pointer(&mi)),
    )
    if res == 0 {
        return err
    }

    return nil
}

// func (t *winTray) hideMenuItem(menuItemId, parentId uint32) error {
//     const ERROR_SUCCESS syscall.Errno = 0

//     t.muMenus.RLock()
//     menu := uintptr(t.menus[parentId])
//     t.muMenus.RUnlock()
//     res, _, err := pRemoveMenu.Call(
//         menu,
//         uintptr(menuItemId),
//         MF_BYCOMMAND,
//     )
//     if res == 0 && err.(syscall.Errno) != ERROR_SUCCESS {
//         return err
//     }
//     t.delFromVisibleItems(parentId, menuItemId)

//     return nil
// }

func (t *winTray) showMenu() error {
    p := point{}
    boolRet, _, err := pGetCursorPos.Call(uintptr(unsafe.Pointer(&p)))
    if boolRet == 0 {
        return err
    }
    boolRet, _, err = pSetForegroundWindow.Call(uintptr(t.window))
    if boolRet == 0 {
        slog.Warn(fmt.Sprintf("failed to bring menu to foreground: %s", err))
    }

    boolRet, _, err = pTrackPopupMenu.Call(
        uintptr(t.menus[0]),
        TPM_BOTTOMALIGN|TPM_LEFTALIGN,
        uintptr(p.X),
        uintptr(p.Y),
        0,
        uintptr(t.window),
        0,
    )
    if boolRet == 0 {
        return err
    }

    return nil
}

func (t *winTray) delFromVisibleItems(parent, val uint32) {
    t.muVisibleItems.Lock()
    defer t.muVisibleItems.Unlock()
    visibleItems := t.visibleItems[parent]
    for i, itemval := range visibleItems {
        if val == itemval {
            t.visibleItems[parent] = append(visibleItems[:i], visibleItems[i+1:]...)
            break
        }
    }
}

func (t *winTray) addToVisibleItems(parent, val uint32) {
    t.muVisibleItems.Lock()
    defer t.muVisibleItems.Unlock()
    if visibleItems, exists := t.visibleItems[parent]; !exists {
        t.visibleItems[parent] = []uint32{val}
    } else {
        newvisible := append(visibleItems, val)
        sort.Slice(newvisible, func(i, j int) bool { return newvisible[i] < newvisible[j] })
        t.visibleItems[parent] = newvisible
    }
}

func (t *winTray) getVisibleItemIndex(parent, val uint32) int {
    t.muVisibleItems.RLock()
    defer t.muVisibleItems.RUnlock()
    for i, itemval := range t.visibleItems[parent] {
        if val == itemval {
            return i
        }
    }
    return -1
}

func iconBytesToFilePath(iconBytes []byte) (string, error) {
    bh := md5.Sum(iconBytes)
    dataHash := hex.EncodeToString(bh[:])
    iconFilePath := filepath.Join(os.TempDir(), "ollama_temp_icon_"+dataHash)

    if _, err := os.Stat(iconFilePath); os.IsNotExist(err) {
        if err := os.WriteFile(iconFilePath, iconBytes, 0644); err != nil {
            return "", err
        }
    }
    return iconFilePath, nil
}

// Loads an image from file and shows it in tray.
// Shell_NotifyIcon: https://msdn.microsoft.com/en-us/library/windows/desktop/bb762159(v=vs.85).aspx
func (t *winTray) setIcon(src string) error {

    h, err := t.loadIconFrom(src)
    if err != nil {
        return err
    }

    t.muNID.Lock()
    defer t.muNID.Unlock()
    t.nid.Icon = h
    t.nid.Flags |= NIF_ICON
    t.nid.Size = uint32(unsafe.Sizeof(*t.nid))

    return t.nid.modify()
}

// Loads an image from file to be shown in tray or menu item.
// LoadImage: https://msdn.microsoft.com/en-us/library/windows/desktop/ms648045(v=vs.85).aspx
func (t *winTray) loadIconFrom(src string) (windows.Handle, error) {

    // Save and reuse handles of loaded images
    t.muLoadedImages.RLock()
    h, ok := t.loadedImages[src]
    t.muLoadedImages.RUnlock()
    if !ok {
        srcPtr, err := windows.UTF16PtrFromString(src)
        if err != nil {
            return 0, err
        }
        res, _, err := pLoadImage.Call(
            0,
            uintptr(unsafe.Pointer(srcPtr)),
            IMAGE_ICON,
            0,
            0,
            LR_LOADFROMFILE|LR_DEFAULTSIZE,
        )
        if res == 0 {
            return 0, err
        }
        h = windows.Handle(res)
        t.muLoadedImages.Lock()
        t.loadedImages[src] = h
        t.muLoadedImages.Unlock()
    }
    return h, nil
}

func (t *winTray) DisplayFirstUseNotification() error {
    t.muNID.Lock()
    defer t.muNID.Unlock()
    copy(t.nid.InfoTitle[:], windows.StringToUTF16(firstTimeTitle))
    copy(t.nid.Info[:], windows.StringToUTF16(firstTimeMessage))
    t.nid.Flags |= NIF_INFO
    t.nid.Size = uint32(unsafe.Sizeof(*wt.nid))

    return t.nid.modify()
}
89
app/tray/wintray/w32api.go
Normal file
@@ -0,0 +1,89 @@
//go:build windows

package wintray

import (
    "runtime"

    "golang.org/x/sys/windows"
)

var (
    k32 = windows.NewLazySystemDLL("Kernel32.dll")
    u32 = windows.NewLazySystemDLL("User32.dll")
    s32 = windows.NewLazySystemDLL("Shell32.dll")

    pCreatePopupMenu       = u32.NewProc("CreatePopupMenu")
    pCreateWindowEx        = u32.NewProc("CreateWindowExW")
    pDefWindowProc         = u32.NewProc("DefWindowProcW")
    pDestroyWindow         = u32.NewProc("DestroyWindow")
    pDispatchMessage       = u32.NewProc("DispatchMessageW")
    pGetCursorPos          = u32.NewProc("GetCursorPos")
    pGetMessage            = u32.NewProc("GetMessageW")
    pGetModuleHandle       = k32.NewProc("GetModuleHandleW")
    pInsertMenuItem        = u32.NewProc("InsertMenuItemW")
    pLoadCursor            = u32.NewProc("LoadCursorW")
    pLoadIcon              = u32.NewProc("LoadIconW")
    pLoadImage             = u32.NewProc("LoadImageW")
    pPostMessage           = u32.NewProc("PostMessageW")
    pPostQuitMessage       = u32.NewProc("PostQuitMessage")
    pRegisterClass         = u32.NewProc("RegisterClassExW")
    pRegisterWindowMessage = u32.NewProc("RegisterWindowMessageW")
    pSetForegroundWindow   = u32.NewProc("SetForegroundWindow")
    pSetMenuInfo           = u32.NewProc("SetMenuInfo")
    pSetMenuItemInfo       = u32.NewProc("SetMenuItemInfoW")
    pShellNotifyIcon       = s32.NewProc("Shell_NotifyIconW")
    pShowWindow            = u32.NewProc("ShowWindow")
    pTrackPopupMenu        = u32.NewProc("TrackPopupMenu")
    pTranslateMessage      = u32.NewProc("TranslateMessage")
    pUnregisterClass       = u32.NewProc("UnregisterClassW")
    pUpdateWindow          = u32.NewProc("UpdateWindow")
)

const (
    CS_HREDRAW          = 0x0002
    CS_VREDRAW          = 0x0001
    CW_USEDEFAULT       = 0x80000000
    IDC_ARROW           = 32512 // Standard arrow
    IDI_APPLICATION     = 32512
    IMAGE_ICON          = 1          // Loads an icon
    LR_DEFAULTSIZE      = 0x00000040 // Loads default-size icon for windows(SM_CXICON x SM_CYICON) if cx, cy are set to zero
    LR_LOADFROMFILE     = 0x00000010 // Loads the stand-alone image from the file
    MF_BYCOMMAND        = 0x00000000
    MFS_DISABLED        = 0x00000003
    MFT_SEPARATOR       = 0x00000800
    MFT_STRING          = 0x00000000
    MIIM_BITMAP         = 0x00000080
    MIIM_FTYPE          = 0x00000100
    MIIM_ID             = 0x00000002
    MIIM_STATE          = 0x00000001
    MIIM_STRING         = 0x00000040
    MIIM_SUBMENU        = 0x00000004
    MIM_APPLYTOSUBMENUS = 0x80000000
    NIF_ICON            = 0x00000002
    NIF_INFO            = 0x00000010
    NIF_MESSAGE         = 0x00000001
    SW_HIDE             = 0
    TPM_BOTTOMALIGN     = 0x0020
    TPM_LEFTALIGN       = 0x0000
    WM_CLOSE            = 0x0010
    WM_USER             = 0x0400
    WS_CAPTION          = 0x00C00000
    WS_MAXIMIZEBOX      = 0x00010000
    WS_MINIMIZEBOX      = 0x00020000
    WS_OVERLAPPED       = 0x00000000
    WS_OVERLAPPEDWINDOW = WS_OVERLAPPED | WS_CAPTION | WS_SYSMENU | WS_THICKFRAME | WS_MINIMIZEBOX | WS_MAXIMIZEBOX
    WS_SYSMENU          = 0x00080000
    WS_THICKFRAME       = 0x00040000
)

// Not sure if this is actually needed on windows
func init() {
    runtime.LockOSThread()
}

// The POINT structure defines the x- and y- coordinates of a point.
// https://msdn.microsoft.com/en-us/library/windows/desktop/dd162805(v=vs.85).aspx
type point struct {
    X, Y int32
}
45
app/tray/wintray/winclass.go
Normal file
@@ -0,0 +1,45 @@
//go:build windows

package wintray

import (
    "unsafe"

    "golang.org/x/sys/windows"
)

// Contains window class information.
// It is used with the RegisterClassEx and GetClassInfoEx functions.
// https://msdn.microsoft.com/en-us/library/ms633577.aspx
type wndClassEx struct {
    Size, Style                        uint32
    WndProc                            uintptr
    ClsExtra, WndExtra                 int32
    Instance, Icon, Cursor, Background windows.Handle
    MenuName, ClassName                *uint16
    IconSm                             windows.Handle
}

// Registers a window class for subsequent use in calls to the CreateWindow or CreateWindowEx function.
// https://msdn.microsoft.com/en-us/library/ms633587.aspx
func (w *wndClassEx) register() error {
    w.Size = uint32(unsafe.Sizeof(*w))
    res, _, err := pRegisterClass.Call(uintptr(unsafe.Pointer(w)))
    if res == 0 {
        return err
    }
    return nil
}

// Unregisters a window class, freeing the memory required for the class.
// https://msdn.microsoft.com/en-us/library/ms644899.aspx
func (w *wndClassEx) unregister() error {
    res, _, err := pUnregisterClass.Call(
        uintptr(unsafe.Pointer(w.ClassName)),
        uintptr(w.Instance),
    )
    if res == 0 {
        return err
    }
    return nil
}
61
auth/auth.go
Normal file
@@ -0,0 +1,61 @@
package auth

import (
    "bytes"
    "context"
    "crypto/rand"
    "encoding/base64"
    "fmt"
    "io"
    "log/slog"
    "os"
    "path/filepath"

    "golang.org/x/crypto/ssh"
)

const defaultPrivateKey = "id_ed25519"

func NewNonce(r io.Reader, length int) (string, error) {
    nonce := make([]byte, length)
    if _, err := io.ReadFull(r, nonce); err != nil {
        return "", err
    }

    return base64.RawURLEncoding.EncodeToString(nonce), nil
}

func Sign(ctx context.Context, bts []byte) (string, error) {
    home, err := os.UserHomeDir()
    if err != nil {
        return "", err
    }

    keyPath := filepath.Join(home, ".ollama", defaultPrivateKey)

    privateKeyFile, err := os.ReadFile(keyPath)
    if err != nil {
        slog.Info(fmt.Sprintf("Failed to load private key: %v", err))
        return "", err
    }

    privateKey, err := ssh.ParsePrivateKey(privateKeyFile)
    if err != nil {
        return "", err
    }

    // get the pubkey, but remove the type
    publicKey := ssh.MarshalAuthorizedKey(privateKey.PublicKey())
    parts := bytes.Split(publicKey, []byte(" "))
    if len(parts) < 2 {
        return "", fmt.Errorf("malformed public key")
    }

    signedData, err := privateKey.Sign(rand.Reader, bts)
    if err != nil {
        return "", err
    }

    // signature is <pubkey>:<signature>
    return fmt.Sprintf("%s:%s", bytes.TrimSpace(parts[1]), base64.StdEncoding.EncodeToString(signedData.Blob)), nil
}
146
cmd/cmd.go
@@ -14,7 +14,6 @@ import (
    "net"
    "net/http"
    "os"
    "os/exec"
    "os/signal"
    "path/filepath"
    "runtime"
@@ -22,9 +21,12 @@ import (
    "syscall"
    "time"

    "github.com/containerd/console"

    "github.com/olekukonko/tablewriter"
    "github.com/spf13/cobra"
    "golang.org/x/crypto/ssh"
    "golang.org/x/exp/slices"
    "golang.org/x/term"

    "github.com/jmorganca/ollama/api"
@@ -146,19 +148,68 @@ func RunHandler(cmd *cobra.Command, args []string) error {
    }

    name := args[0]

    // check if the model exists on the server
    _, err = client.Show(cmd.Context(), &api.ShowRequest{Name: name})
    show, err := client.Show(cmd.Context(), &api.ShowRequest{Name: name})
    var statusError api.StatusError
    switch {
    case errors.As(err, &statusError) && statusError.StatusCode == http.StatusNotFound:
        if err := PullHandler(cmd, []string{name}); err != nil {
            return err
        }

        show, err = client.Show(cmd.Context(), &api.ShowRequest{Name: name})
        if err != nil {
            return err
        }
    case err != nil:
        return err
    }

    return RunGenerate(cmd, args)
    interactive := true

    opts := runOptions{
        Model:       args[0],
        WordWrap:    os.Getenv("TERM") == "xterm-256color",
        Options:     map[string]interface{}{},
        MultiModal:  slices.Contains(show.Details.Families, "clip"),
        ParentModel: show.Details.ParentModel,
    }

    format, err := cmd.Flags().GetString("format")
    if err != nil {
        return err
    }
    opts.Format = format

    prompts := args[1:]
    // prepend stdin to the prompt if provided
    if !term.IsTerminal(int(os.Stdin.Fd())) {
        in, err := io.ReadAll(os.Stdin)
        if err != nil {
            return err
        }

        prompts = append([]string{string(in)}, prompts...)
        opts.WordWrap = false
        interactive = false
    }
    opts.Prompt = strings.Join(prompts, " ")
    if len(prompts) > 0 {
        interactive = false
    }

    nowrap, err := cmd.Flags().GetBool("nowordwrap")
    if err != nil {
        return err
    }
    opts.WordWrap = !nowrap

    if !interactive {
        return generate(cmd, opts)
    }

    return generateInteractive(cmd, opts)
}

func PushHandler(cmd *cobra.Command, args []string) error {
@@ -410,51 +461,6 @@ func PullHandler(cmd *cobra.Command, args []string) error {
    return nil
}

func RunGenerate(cmd *cobra.Command, args []string) error {
    interactive := true

    opts := runOptions{
        Model:    args[0],
        WordWrap: os.Getenv("TERM") == "xterm-256color",
        Options:  map[string]interface{}{},
    }

    format, err := cmd.Flags().GetString("format")
    if err != nil {
        return err
    }
    opts.Format = format

    prompts := args[1:]
    // prepend stdin to the prompt if provided
    if !term.IsTerminal(int(os.Stdin.Fd())) {
        in, err := io.ReadAll(os.Stdin)
        if err != nil {
            return err
        }

        prompts = append([]string{string(in)}, prompts...)
        opts.WordWrap = false
        interactive = false
    }
    opts.Prompt = strings.Join(prompts, " ")
    if len(prompts) > 0 {
        interactive = false
    }

    nowrap, err := cmd.Flags().GetBool("nowordwrap")
    if err != nil {
        return err
    }
    opts.WordWrap = !nowrap

    if !interactive {
        return generate(cmd, opts)
    }

    return generateInteractive(cmd, opts)
}

type generateContextKey string

type runOptions struct {
@@ -630,10 +636,18 @@ func generate(cmd *cobra.Command, opts runOptions) error {
        return nil
    }

    if opts.MultiModal {
        opts.Prompt, opts.Images, err = extractFileData(opts.Prompt)
        if err != nil {
            return err
        }
    }

    request := api.GenerateRequest{
        Model:    opts.Model,
        Prompt:   opts.Prompt,
        Context:  generateContext,
        Images:   opts.Images,
        Format:   opts.Format,
        System:   opts.System,
        Template: opts.Template,
@@ -672,7 +686,7 @@ func generate(cmd *cobra.Command, opts runOptions) error {
}

func RunServer(cmd *cobra.Command, _ []string) error {
    host, port, err := net.SplitHostPort(os.Getenv("OLLAMA_HOST"))
    host, port, err := net.SplitHostPort(strings.Trim(os.Getenv("OLLAMA_HOST"), "\"'"))
    if err != nil {
        host, port = "127.0.0.1", "11434"
        if ip := net.ParseIP(strings.Trim(os.Getenv("OLLAMA_HOST"), "[]")); ip != nil {
@@ -741,22 +755,8 @@ func initializeKeypair() error {
    return nil
}

func startMacApp(ctx context.Context, client *api.Client) error {
    exe, err := os.Executable()
    if err != nil {
        return err
    }
    link, err := os.Readlink(exe)
    if err != nil {
        return err
    }
    if !strings.Contains(link, "Ollama.app") {
        return fmt.Errorf("could not find ollama app")
    }
    path := strings.Split(link, "Ollama.app")
    if err := exec.Command("/usr/bin/open", "-a", path[0]+"Ollama.app").Run(); err != nil {
        return err
    }
//nolint:unused
func waitForServer(ctx context.Context, client *api.Client) error {
    // wait for the server to start
    timeout := time.After(5 * time.Second)
    tick := time.Tick(500 * time.Millisecond)
@@ -770,6 +770,7 @@ func startMacApp(ctx context.Context, client *api.Client) error {
        }
    }
}

}

func checkServerHeartbeat(cmd *cobra.Command, _ []string) error {
@@ -778,15 +779,11 @@ func checkServerHeartbeat(cmd *cobra.Command, _ []string) error {
        return err
    }
    if err := client.Heartbeat(cmd.Context()); err != nil {
        if !strings.Contains(err.Error(), "connection refused") {
        if !strings.Contains(err.Error(), " refused") {
            return err
        }
        if runtime.GOOS == "darwin" {
            if err := startMacApp(cmd.Context(), client); err != nil {
                return fmt.Errorf("could not connect to ollama app, is it running?")
            }
        } else {
            return fmt.Errorf("could not connect to ollama server, run 'ollama serve' to start it")
        if err := startApp(cmd.Context(), client); err != nil {
            return fmt.Errorf("could not connect to ollama app, is it running?")
        }
    }
    return nil
@@ -816,6 +813,11 @@ func NewCLI() *cobra.Command {
    log.SetFlags(log.LstdFlags | log.Lshortfile)
    cobra.EnableCommandSorting = false

    if runtime.GOOS == "windows" {
        // Enable colorful ANSI escape code in Windows terminal (disabled by default)
        console.ConsoleFromFile(os.Stdout) //nolint:errcheck
    }

    rootCmd := &cobra.Command{
        Use:   "ollama",
        Short: "Large language model runner",
@@ -6,6 +6,7 @@ import (
    "io"
    "net/http"
    "os"
    "path/filepath"
    "regexp"
    "sort"
    "strings"
@@ -98,6 +99,11 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
        fmt.Fprintln(os.Stderr, "  /? shortcuts    Help for keyboard shortcuts")
        fmt.Fprintln(os.Stderr, "")
        fmt.Fprintln(os.Stderr, "Use \"\"\" to begin a multi-line message.")

        if opts.MultiModal {
            fmt.Fprintf(os.Stderr, "Use %s to include .jpg or .png images.\n", filepath.FromSlash("/path/to/file"))
        }

        fmt.Fprintln(os.Stderr, "")
    }

@@ -207,6 +213,7 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
            switch multiline {
            case MultilineSystem:
                opts.System = sb.String()
                opts.Messages = append(opts.Messages, api.Message{Role: "system", Content: opts.System})
                fmt.Println("Set system message.")
                sb.Reset()
            case MultilineTemplate:
@@ -226,7 +233,6 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
                fmt.Fprintln(&sb)
                multiline = MultilinePrompt
                scanner.Prompt.UseAlt = true
                break
            }
        case scanner.Pasting:
            fmt.Fprintln(&sb, line)
@@ -348,11 +354,21 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
                }

                if args[1] == "system" {
                    opts.System = sb.String()
                    opts.System = sb.String() // for display in modelfile
                    newMessage := api.Message{Role: "system", Content: sb.String()}
                    // Check if the slice is not empty and the last message is from 'system'
                    if len(opts.Messages) > 0 && opts.Messages[len(opts.Messages)-1].Role == "system" {
                        // Replace the last message
                        opts.Messages[len(opts.Messages)-1] = newMessage
                    } else {
                        opts.Messages = append(opts.Messages, newMessage)
                    }
                    fmt.Println("Set system message.")
                    sb.Reset()
                } else if args[1] == "template" {
                    opts.Template = sb.String()
                    fmt.Println("Set prompt template.")
                    sb.Reset()
                }

                sb.Reset()
@@ -454,7 +470,7 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
            } else {
                usage()
            }
        case line == "/exit", line == "/bye":
        case strings.HasPrefix(line, "/exit"), strings.HasPrefix(line, "/bye"):
            return nil
        case strings.HasPrefix(line, "/"):
            args := strings.Fields(line)
@@ -487,29 +503,18 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
            if err != nil {
                return err
            }
            newMessage.Content = msg

            // reset the context if we find another image
            // clear all previous images for better responses
            if len(images) > 0 {
                newMessage.Images = append(newMessage.Images, images...)
                // reset the context for the new image
                opts.Messages = []api.Message{}
            } else {
                if len(opts.Messages) > 1 {
                    newMessage.Images = append(newMessage.Images, opts.Messages[len(opts.Messages)-2].Images...)
                for i := range opts.Messages {
                    opts.Messages[i].Images = nil
                }
            }
            if len(newMessage.Images) == 0 {
                fmt.Println("This model requires you to add a jpeg, png, or svg image.")
                fmt.Println()
                sb.Reset()
                continue
            }

            newMessage.Content = msg
            newMessage.Images = images
        }

        if opts.System != "" {
            opts.Messages = append(opts.Messages, api.Message{Role: "system", Content: opts.System})
        }
        opts.Messages = append(opts.Messages, newMessage)

        assistant, err := chat(cmd, opts)
@@ -603,10 +608,10 @@ func extractFileData(input string) (string, []api.ImageData, error) {
            if os.IsNotExist(err) {
                continue
            }
            fmt.Printf("Couldn't process image: %q\n", err)
            fmt.Fprintf(os.Stderr, "Couldn't process image: %q\n", err)
            return "", imgs, err
        }
        fmt.Printf("Added image '%s'\n", nfp)
        fmt.Fprintf(os.Stderr, "Added image '%s'\n", nfp)
        input = strings.ReplaceAll(input, fp, "")
        imgs = append(imgs, data)
    }
@@ -627,7 +632,7 @@ func getImageData(filePath string) ([]byte, error) {
    }

    contentType := http.DetectContentType(buf)
    allowedTypes := []string{"image/jpeg", "image/jpg", "image/svg+xml", "image/png"}
    allowedTypes := []string{"image/jpeg", "image/jpg", "image/png"}
    if !slices.Contains(allowedTypes, contentType) {
        return nil, fmt.Errorf("invalid image type: %s", contentType)
    }
30
cmd/start_darwin.go
Normal file
@@ -0,0 +1,30 @@
package cmd

import (
    "context"
    "fmt"
    "os"
    "os/exec"
    "strings"

    "github.com/jmorganca/ollama/api"
)

func startApp(ctx context.Context, client *api.Client) error {
    exe, err := os.Executable()
    if err != nil {
        return err
    }
    link, err := os.Readlink(exe)
    if err != nil {
        return err
    }
    if !strings.Contains(link, "Ollama.app") {
        return fmt.Errorf("could not find ollama app")
    }
    path := strings.Split(link, "Ollama.app")
    if err := exec.Command("/usr/bin/open", "-a", path[0]+"Ollama.app").Run(); err != nil {
        return err
    }
    return waitForServer(ctx, client)
}
14
cmd/start_default.go
Normal file
@@ -0,0 +1,14 @@
//go:build !windows && !darwin

package cmd

import (
    "context"
    "fmt"

    "github.com/jmorganca/ollama/api"
)

func startApp(ctx context.Context, client *api.Client) error {
    return fmt.Errorf("could not connect to ollama server, run 'ollama serve' to start it")
}
58
cmd/start_windows.go
Normal file
@@ -0,0 +1,58 @@
package cmd

import (
    "context"
    "errors"
    "fmt"
    "os"
    "os/exec"
    "path/filepath"
    "strings"
    "syscall"

    "github.com/jmorganca/ollama/api"
)

func startApp(ctx context.Context, client *api.Client) error {
    // log.Printf("XXX Attempting to find and start ollama app")
    AppName := "ollama app.exe"
    exe, err := os.Executable()
    if err != nil {
        return err
    }
    appExe := filepath.Join(filepath.Dir(exe), AppName)
    _, err = os.Stat(appExe)
    if errors.Is(err, os.ErrNotExist) {
        // Try the standard install location
        localAppData := os.Getenv("LOCALAPPDATA")
        appExe = filepath.Join(localAppData, "Ollama", AppName)
        _, err := os.Stat(appExe)
        if errors.Is(err, os.ErrNotExist) {
            // Finally look in the path
            appExe, err = exec.LookPath(AppName)
            if err != nil {
                return fmt.Errorf("could not locate ollama app")
            }
        }
    }
    // log.Printf("XXX attempting to start app %s", appExe)

    cmd_path := "c:\\Windows\\system32\\cmd.exe"
    cmd := exec.Command(cmd_path, "/c", appExe)
    // TODO - these hide flags aren't working - still pops up a command window for some reason
    cmd.SysProcAttr = &syscall.SysProcAttr{CreationFlags: 0x08000000, HideWindow: true}

    // TODO this didn't help either...
    cmd.Stdin = strings.NewReader("")
    cmd.Stdout = os.Stdout
    cmd.Stderr = os.Stderr

    if err := cmd.Start(); err != nil {
        return fmt.Errorf("unable to start ollama app %w", err)
    }

    if cmd.Process != nil {
        defer cmd.Process.Release() //nolint:errcheck
    }
    return waitForServer(ctx, client)
}
@@ -10,7 +10,7 @@ Create new models or modify models already in the library using the Modelfile. L

Import models using source model weights found on Hugging Face and similar sites by referring to the **[Import Documentation](./import.md)**.

Installing on Linux in most cases is easy using the script on Ollama.ai. To get more detail about the install, including CUDA drivers, see the **[Linux Documentation](./linux.md)**.
Installing on Linux in most cases is easy using the script on [ollama.com/download](https://ollama.com/download). To get more detail about the install, including CUDA drivers, see the **[Linux Documentation](./linux.md)**.

Many of our users like the flexibility of using our official Docker Image. Learn more about using Docker with Ollama using the **[Docker Documentation](https://hub.docker.com/r/ollama/ollama)**.
|
64
docs/api.md
@@ -49,7 +49,8 @@ Advanced parameters (optional):
- `template`: the prompt template to use (overrides what is defined in the `Modelfile`)
- `context`: the context parameter returned from a previous request to `/generate`, this can be used to keep a short conversational memory
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
- `raw`: if `true` no formatting will be applied to the prompt. You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API.
- `raw`: if `true` no formatting will be applied to the prompt. You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API
- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)

#### JSON mode

@@ -246,6 +247,23 @@ curl http://localhost:11434/api/generate -d '{
}'
```

#### Request (Reproducible outputs)

For reproducible outputs, set `temperature` to 0 and `seed` to a number:

##### Request

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "[INST] why is the sky blue? [/INST]",
  "options": {
    "seed": 101,
    "temperature": 0
  }
}'
```

##### Response

```json
@@ -379,6 +397,7 @@ Advanced parameters (optional):
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `template`: the prompt template to use (overrides what is defined in the `Modelfile`)
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)

### Examples

@@ -542,7 +561,7 @@ curl http://localhost:11434/api/chat -d '{
    "role": "user",
    "content": "what is in this image?",
"images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF
169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
      },
    }
  ]
}'
```
@@ -568,6 +587,46 @@ curl http://localhost:11434/api/chat -d '{
}
```

#### Chat request (Reproducible outputs)

##### Request

```shell
curl http://localhost:11434/api/chat -d '{
  "model": "llama2",
  "messages": [
    {
      "role": "user",
      "content": "Hello!"
    }
  ],
  "options": {
    "seed": 101,
    "temperature": 0
  }
}'
```

##### Response

```json
{
  "model": "registry.ollama.ai/library/llama2:latest",
  "created_at": "2023-12-12T14:13:43.416799Z",
  "message": {
    "role": "assistant",
    "content": "Hello! How are you today?"
  },
  "done": true,
  "total_duration": 5191566416,
  "load_duration": 2154458,
  "prompt_eval_count": 26,
  "prompt_eval_duration": 383809000,
  "eval_count": 298,
  "eval_duration": 4799921000
}
```

## Create a Model

```shell
@@ -958,6 +1017,7 @@ Generate embeddings from a model
Advanced parameters:

- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)

### Examples

@@ -50,7 +50,8 @@ development and runtime packages.
Typically the build scripts will auto-detect CUDA, however, if your Linux distro
or installation approach uses unusual paths, you can specify the location by
specifying an environment variable `CUDA_LIB_DIR` to the location of the shared
libraries, and `CUDACXX` to the location of the nvcc compiler.
libraries, and `CUDACXX` to the location of the nvcc compiler. You can customize
the set of target CUDA architectures by setting `CMAKE_CUDA_ARCHITECTURES` (e.g. "50;60;70").
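For instance, a minimal sketch of setting these variables, assuming a typical CUDA install under `/usr/local/cuda` (the paths and architecture list are assumptions — adjust them for your system and how your build scripts pick them up):

```shell
# Location of the CUDA shared libraries (assumed path)
export CUDA_LIB_DIR=/usr/local/cuda/lib64
# Location of the nvcc compiler (assumed path)
export CUDACXX=/usr/local/cuda/bin/nvcc
# Build only for Maxwell, Pascal, and Volta GPUs
export CMAKE_CUDA_ARCHITECTURES="50;60;70"
```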

Then generate dependencies:

82
docs/faq.md
@@ -2,12 +2,40 @@

## How can I upgrade Ollama?

To upgrade Ollama, run the installation process again. On the Mac, click the Ollama icon in the menubar and choose the restart option if an update is available.
Ollama on macOS and Windows will automatically download updates. Click on the taskbar or menubar item and then click "Restart to update" to apply the update. Updates can also be installed by downloading the latest version [manually](https://ollama.com/download/).

On Linux, re-run the install script:

```
curl -fsSL https://ollama.com/install.sh | sh
```

## How can I view the logs?

Review the [Troubleshooting](./troubleshooting.md) docs for more about using logs.

## How can I specify the context window size?

By default, Ollama uses a context window size of 2048 tokens.

To change this when using `ollama run`, use `/set parameter`:

```
/set parameter num_ctx 4096
```

When using the API, specify the `num_ctx` parameter:

```
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Why is the sky blue?",
  "options": {
    "num_ctx": 4096
  }
}'
```

## How do I configure Ollama server?

Ollama server can be configured with environment variables.
@@ -46,6 +74,21 @@ If Ollama is run as a systemd service, environment variables should be set using
   systemctl restart ollama
   ```

### Setting environment variables on Windows

On Windows, Ollama inherits your user and system environment variables.

1. First, quit Ollama by clicking on it in the taskbar

2. Edit system environment variables from the control panel

3. Edit or create new variable(s) for your user account for `OLLAMA_HOST`, `OLLAMA_MODELS`, etc.

4. Click OK/Apply to save

5. Run `ollama` from a new terminal window


## How can I expose Ollama on my network?

Ollama binds 127.0.0.1 port 11434 by default. Change the bind address with the `OLLAMA_HOST` environment variable.
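For example, a minimal sketch assuming a Linux shell and the default port — binding to all interfaces for a single run might look like:

```shell
OLLAMA_HOST=0.0.0.0:11434 ollama serve
```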
|
||||
@@ -60,8 +103,9 @@ Refer to the section [above](#how-do-i-configure-ollama-server) for how to set e
|
||||
|
||||
## Where are models stored?
|
||||
|
||||
- macOS: `~/.ollama/models`.
|
||||
- macOS: `~/.ollama/models`
|
||||
- Linux: `/usr/share/ollama/.ollama/models`
|
||||
- Windows: `C:\Users\<username>\.ollama\models`
|
||||
|
||||
### How do I set them to a different location?
|
||||
|
||||
@@ -115,3 +159,37 @@ This can impact both installing Ollama, as well as downloading models.

Open `Control Panel > Networking and Internet > View network status and tasks` and click on `Change adapter settings` on the left panel. Find the `vEthernet (WSL)` adapter, right click and select `Properties`.
Click on `Configure` and open the `Advanced` tab. Search through each of the properties until you find `Large Send Offload Version 2 (IPv4)` and `Large Send Offload Version 2 (IPv6)`. *Disable* both of these properties.

## How can I pre-load a model to get faster response times?

If you are using the API you can preload a model by sending the Ollama server an empty request. This works with both the `/api/generate` and `/api/chat` API endpoints.

To preload the mistral model using the generate endpoint, use:
```shell
curl http://localhost:11434/api/generate -d '{"model": "mistral"}'
```

To use the chat completions endpoint, use:
```shell
curl http://localhost:11434/api/chat -d '{"model": "mistral"}'
```

## How do I keep a model loaded in memory or make it unload immediately?

By default models are kept in memory for 5 minutes before being unloaded. This allows for quicker response times if you are making numerous requests to the LLM. You may, however, want to free up the memory before the 5 minutes have elapsed or keep the model loaded indefinitely. Use the `keep_alive` parameter with either the `/api/generate` or `/api/chat` API endpoint to control how long the model is left in memory.

The `keep_alive` parameter can be set to:
* a duration string (such as "10m" or "24h")
* a number in seconds (such as 3600)
* any negative number, which will keep the model loaded in memory (e.g. -1 or "-1m")
* '0', which will unload the model immediately after generating a response

For example, to preload a model and leave it in memory use:
```shell
curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": -1}'
```

To unload the model and free up memory use:
```shell
curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": 0}'
```
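Duration strings from the list above work the same way; for example, to keep a model resident for a day:

```shell
# keep the model loaded for 24 hours after each request
curl http://localhost:11434/api/generate -d '{"model": "llama2", "keep_alive": "24h"}'
```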
120
docs/import.md
@@ -15,7 +15,7 @@ FROM ./mistral-7b-v0.1.Q4_0.gguf

(Optional) many chat models require a prompt template in order to answer correctly. A default prompt template can be specified with the `TEMPLATE` instruction in the `Modelfile`:

```
FROM ./q4_0.bin
FROM ./mistral-7b-v0.1.Q4_0.gguf
TEMPLATE "[INST] {{ .Prompt }} [/INST]"
```

@@ -37,55 +37,69 @@ ollama run example "What is your favourite condiment?"

## Importing (PyTorch & Safetensors)

### Supported models
> Importing from PyTorch and Safetensors is a longer process than importing from GGUF. Improvements that make it easier are a work in progress.

Ollama supports a set of model architectures, with support for more coming soon:
### Setup

- Llama & Mistral
- Falcon & RW
- BigCode
First, clone the `ollama/ollama` repo:

To view a model's architecture, check the `config.json` file in its HuggingFace repo. You should see an entry under `architectures` (e.g. `LlamaForCausalLM`).
```
git clone git@github.com:ollama/ollama.git ollama
cd ollama
```

### Step 1: Clone the HuggingFace repository (optional)
and then fetch its `llama.cpp` submodule:

```shell
git submodule init
git submodule update llm/llama.cpp
```

Next, install the Python dependencies:

```
python3 -m venv llm/llama.cpp/.venv
source llm/llama.cpp/.venv/bin/activate
pip install -r llm/llama.cpp/requirements.txt
```

Then build the `quantize` tool:

```
make -C llm/llama.cpp quantize
```

### Clone the HuggingFace repository (optional)

If the model is currently hosted in a HuggingFace repository, first clone that repository to download the raw model.

Install [Git LFS](https://docs.github.com/en/repositories/working-with-files/managing-large-files/installing-git-large-file-storage), verify it's installed, and then clone the model's repository:

```
git lfs install
git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1
cd Mistral-7B-Instruct-v0.1
git clone https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1 model
```

### Step 2: Convert and quantize to a `.bin` file (optional, for PyTorch and Safetensors)
### Convert the model

If the model is in PyTorch or Safetensors format, a [Docker image](https://hub.docker.com/r/ollama/quantize) with the tooling required to convert and quantize models is available.

First, install [Docker](https://www.docker.com/get-started/).

Next, to convert and quantize your model, run:
> Note: some model architectures require using specific convert scripts. For example, Qwen models require running `convert-hf-to-gguf.py` instead of `convert.py`

```
docker run --rm -v .:/model ollama/quantize -q q4_0 /model
python llm/llama.cpp/convert.py ./model --outtype f16 --outfile converted.bin
```
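For the architectures called out in the note above, a sketch of the alternate script (its flags are assumed here to mirror `convert.py`):

```shell
# e.g. Qwen: use the HF-specific converter instead of convert.py
python llm/llama.cpp/convert-hf-to-gguf.py ./model --outtype f16 --outfile converted.bin
```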
This will output two files into the directory:
### Quantize the model

- `f16.bin`: the model converted to GGUF
- `q4_0.bin`: the model quantized to 4 bits (Ollama will use this file to create the Ollama model)
```
llm/llama.cpp/quantize converted.bin quantized.bin q4_0
```

### Step 3: Write a `Modelfile`

Next, create a `Modelfile` for your model:

```
FROM ./q4_0.bin
```

(Optional) many chat models require a prompt template in order to answer correctly. A default prompt template can be specified with the `TEMPLATE` instruction in the `Modelfile`:

```
FROM ./q4_0.bin
FROM quantized.bin
TEMPLATE "[INST] {{ .Prompt }} [/INST]"
```

@@ -109,9 +123,9 @@ ollama run example "What is your favourite condiment?"

Publishing models is in early alpha. If you'd like to publish your model to share with others, follow these steps:

1. Create [an account](https://ollama.ai/signup)
2. Run `cat ~/.ollama/id_ed25519.pub` to view your Ollama public key. Copy this to the clipboard.
3. Add your public key to your [Ollama account](https://ollama.ai/settings/keys)
1. Create [an account](https://ollama.com/signup)
2. Run `cat ~/.ollama/id_ed25519.pub` (or `type %USERPROFILE%\.ollama\id_ed25519.pub` on Windows) to view your Ollama public key. Copy this to the clipboard.
3. Add your public key to your [Ollama account](https://ollama.com/settings/keys)

Next, copy your model to your username's namespace:

@@ -125,7 +139,7 @@ Then push the model:

ollama push <your username>/example
```

After publishing, your model will be available at `https://ollama.ai/<your username>/example`.
After publishing, your model will be available at `https://ollama.com/<your username>/example`.

## Quantization reference

@@ -149,47 +163,3 @@ The quantization options are as follows (from highest to lowest levels of
- `q6_K`
- `q8_0`
- `f16`

## Manually converting & quantizing models

### Prerequisites

Start by cloning the `llama.cpp` repo to your machine in another directory:

```
git clone https://github.com/ggerganov/llama.cpp.git
cd llama.cpp
```

Next, install the Python dependencies:

```
pip install -r requirements.txt
```

Finally, build the `quantize` tool:

```
make quantize
```

### Convert the model

Run the correct conversion script for your model architecture:

```shell
# LlamaForCausalLM or MistralForCausalLM
python convert.py <path to model directory>

# FalconForCausalLM
python convert-falcon-hf-to-gguf.py <path to model directory>

# GPTBigCodeForCausalLM
python convert-starcoder-hf-to-gguf.py <path to model directory>
```

### Quantize the model

```
quantize <path to model dir>/ggml-model-f32.bin <path to model dir>/q4_0.bin q4_0
```
@@ -3,9 +3,11 @@

## Install

Install Ollama by running this one-liner:

>

```bash
curl https://ollama.ai/install.sh | sh
curl -fsSL https://ollama.com/install.sh | sh
```

## Manual install

@@ -15,7 +17,7 @@ curl https://ollama.ai/install.sh | sh

Ollama is distributed as a self-contained binary. Download it to a directory in your PATH:

```bash
sudo curl -L https://ollama.ai/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
```

@@ -75,13 +77,13 @@ sudo systemctl start ollama

Update ollama by running the install script again:

```bash
curl https://ollama.ai/install.sh | sh
curl -fsSL https://ollama.com/install.sh | sh
```

Or by downloading the ollama binary:

```bash
sudo curl -L https://ollama.ai/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo curl -L https://ollama.com/download/ollama-linux-amd64 -o /usr/bin/ollama
sudo chmod +x /usr/bin/ollama
```

@@ -110,6 +112,7 @@ sudo rm $(which ollama)

```

Remove the downloaded models and Ollama service user and group:

```bash
sudo rm -r /usr/share/ollama
sudo userdel ollama
|
@@ -67,13 +67,13 @@ To use this:
|
||||
|
||||
More examples are available in the [examples directory](../examples).
|
||||
|
||||
### `Modelfile`s in [ollama.ai/library][1]
|
||||
### `Modelfile`s in [ollama.com/library][1]
|
||||
|
||||
There are two ways to view `Modelfile`s underlying the models in [ollama.ai/library][1]:
|
||||
There are two ways to view `Modelfile`s underlying the models in [ollama.com/library][1]:
|
||||
|
||||
- Option 1: view a details page from a model's tags page:
|
||||
1. Go to a particular model's tags (e.g. https://ollama.ai/library/llama2/tags)
|
||||
2. Click on a tag (e.g. https://ollama.ai/library/llama2:13b)
|
||||
1. Go to a particular model's tags (e.g. https://ollama.com/library/llama2/tags)
|
||||
2. Click on a tag (e.g. https://ollama.com/library/llama2:13b)
|
||||
3. Scroll down to "Layers"
|
||||
- Note: if the [`FROM` instruction](#from-required) is not present,
|
||||
it means the model was created from a local file
|
||||
@@ -86,7 +86,7 @@ There are two ways to view `Modelfile`s underlying the models in [ollama.ai/libr
|
||||
# FROM llama2:13b
|
||||
|
||||
FROM /root/.ollama/models/blobs/sha256:123abc
|
||||
TEMPLATE """[INST] {{ if and .First .System }}<<SYS>>{{ .System }}<</SYS>>
|
||||
TEMPLATE """[INST] {{ if .System }}<<SYS>>{{ .System }}<</SYS>>
|
||||
|
||||
{{ end }}{{ .Prompt }} [/INST] """
|
||||
SYSTEM """"""
|
||||
@@ -154,31 +154,23 @@ PARAMETER <parameter> <parametervalue>
|
||||
|
||||
### TEMPLATE
|
||||
|
||||
`TEMPLATE` of the full prompt template to be passed into the model. It may include (optionally) a system message and a user's prompt. This is used to create a full custom prompt, and syntax may be model specific. You can usually find the template for a given model in the readme for that model.
|
||||
`TEMPLATE` of the full prompt template to be passed into the model. It may include (optionally) a system message, a user's message and the response from the model. Note: syntax may be model specific. Templates use Go [template syntax](https://pkg.go.dev/text/template).
|
||||
|
||||
#### Template Variables
|
||||
|
||||
| Variable | Description |
|
||||
| ----------------- | ------------------------------------------------------------------------------------------------------------- |
|
||||
| `{{ .System }}` | The system message used to specify custom behavior, this must also be set in the Modelfile as an instruction. |
|
||||
| `{{ .Prompt }}` | The incoming prompt, this is not specified in the model file and will be set based on input. |
|
||||
| `{{ .Response }}` | The response from the LLM, if not specified response is appended to the end of the template. |
|
||||
| `{{ .First }}` | A boolean value used to render specific template information for the first generation of a session. |
|
||||
| Variable | Description |
|
||||
| ----------------- | --------------------------------------------------------------------------------------------- |
|
||||
| `{{ .System }}` | The system message used to specify custom behavior. |
|
||||
| `{{ .Prompt }}` | The user prompt message. |
|
||||
| `{{ .Response }}` | The response from the model. When generating a response, text after this variable is omitted. |
|
||||
|
||||
```modelfile
|
||||
TEMPLATE """
|
||||
{{- if .First }}
|
||||
### System:
|
||||
{{ .System }}
|
||||
{{- end }}
|
||||
|
||||
### User:
|
||||
{{ .Prompt }}
|
||||
|
||||
### Response:
|
||||
```
|
||||
TEMPLATE """{{ if .System }}<|im_start|>system
|
||||
{{ .System }}<|im_end|>
|
||||
{{ end }}{{ if .Prompt }}<|im_start|>user
|
||||
{{ .Prompt }}<|im_end|>
|
||||
{{ end }}<|im_start|>assistant
|
||||
"""
|
||||
|
||||
SYSTEM """<system message>"""
|
||||
```
|
||||
|
||||
### SYSTEM
|
||||
@@ -225,4 +217,4 @@ MESSAGE assistant yes
|
||||
- the **`Modelfile` is not case sensitive**. In the examples, uppercase instructions are used to make it easier to distinguish it from arguments.
|
||||
- Instructions can be in any order. In the examples, the `FROM` instruction is first to keep it easily readable.
|
||||
|
||||
[1]: https://ollama.ai/library
|
||||
[1]: https://ollama.com/library
|
||||
|
141
docs/openai.md
Normal file
@@ -0,0 +1,141 @@

# OpenAI compatibility

> **Note:** OpenAI compatibility is experimental and is subject to major adjustments including breaking changes. For fully-featured access to the Ollama API, see the Ollama [Python library](https://github.com/ollama/ollama-python), [JavaScript library](https://github.com/ollama/ollama-js) and [REST API](https://github.com/jmorganca/ollama/blob/main/docs/api.md).

Ollama provides experimental compatibility with parts of the [OpenAI API](https://platform.openai.com/docs/api-reference) to help connect existing applications to Ollama.

## Usage

### OpenAI Python library

```python
from openai import OpenAI

client = OpenAI(
    base_url='http://localhost:11434/v1/',

    # required but ignored
    api_key='ollama',
)

chat_completion = client.chat.completions.create(
    messages=[
        {
            'role': 'user',
            'content': 'Say this is a test',
        }
    ],
    model='llama2',
)
```

### OpenAI JavaScript library

```javascript
import OpenAI from 'openai'

const openai = new OpenAI({
  baseURL: 'http://localhost:11434/v1/',

  // required but ignored
  apiKey: 'ollama',
})

const chatCompletion = await openai.chat.completions.create({
  messages: [{ role: 'user', content: 'Say this is a test' }],
  model: 'llama2',
})
```

### `curl`

```
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```

## Endpoints

### `/v1/chat/completions`

#### Supported features

- [x] Chat completions
- [x] Streaming
- [x] JSON mode
- [x] Reproducible outputs
- [ ] Vision
- [ ] Function calling
- [ ] Logprobs

#### Supported request fields

- [x] `model`
- [x] `messages`
  - [x] Text `content`
  - [ ] Array of `content` parts
- [x] `frequency_penalty`
- [x] `presence_penalty`
- [x] `response_format`
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
- [ ] `logit_bias`
- [ ] `tools`
- [ ] `tool_choice`
- [ ] `user`
- [ ] `n`

#### Notes

- Setting `seed` will always set `temperature` to `0`
- `finish_reason` will always be `stop`
- `usage.prompt_tokens` will be 0 for completions where prompt evaluation is cached

## Models

Before using a model, pull it locally with `ollama pull`:

```shell
ollama pull llama2
```

### Default model names

For tooling that relies on default OpenAI model names such as `gpt-3.5-turbo`, use `ollama cp` to copy an existing model name to a temporary name:

```
ollama cp llama2 gpt-3.5-turbo
```

Afterwards, this new model name can be specified in the `model` field:

```shell
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "gpt-3.5-turbo",
        "messages": [
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'
```
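Streaming is listed as supported above; a minimal sketch of requesting it over the same endpoint:

```shell
# stream chunks back instead of a single response body
curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama2",
        "stream": true,
        "messages": [{"role": "user", "content": "Hello!"}]
    }'
```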
@@ -1,53 +1,72 @@

# How to troubleshoot issues

Sometimes Ollama may not perform as expected. One of the best ways to figure out what happened is to take a look at the logs. Find the logs on Mac by running the command:

```shell
cat ~/.ollama/logs/server.log
```

On Linux systems with systemd, the logs can be found with this command:

```shell
journalctl -u ollama
```

If manually running `ollama serve` in a terminal, the logs will be on that terminal.

Join the [Discord](https://discord.gg/ollama) for help interpreting the logs.

## LLM libraries

Ollama includes multiple LLM libraries compiled for different GPUs and CPU
vector features. Ollama tries to pick the best one based on the capabilities of
your system. If this autodetection has problems, or you run into other problems
(e.g. crashes in your GPU) you can work around this by forcing a specific LLM
library. `cpu_avx2` will perform the best, followed by `cpu_avx`; the slowest
but most compatible is `cpu`. Rosetta emulation under MacOS will work with the
`cpu` library.

In the server log, you will see a message that looks something like this (varies
from release to release):

```
Dynamic LLM libraries [rocm_v6 cpu cpu_avx cpu_avx2 cuda_v11 rocm_v5]
```

**Experimental LLM Library Override**

You can set OLLAMA_LLM_LIBRARY to any of the available LLM libraries to bypass
autodetection, so for example, if you have a CUDA card, but want to force the
CPU LLM library with AVX2 vector support, use:

```
OLLAMA_LLM_LIBRARY="cpu_avx2" ollama serve
```

You can see what features your CPU has with the following:
```
cat /proc/cpuinfo | grep flags | head -1
```

## Known issues

# How to troubleshoot issues

Sometimes Ollama may not perform as expected. One of the best ways to figure out what happened is to take a look at the logs. Find the logs on **Mac** by running the command:

```shell
cat ~/.ollama/logs/server.log
```

On **Linux** systems with systemd, the logs can be found with this command:

```shell
journalctl -u ollama
```

When you run Ollama in a **container**, the logs go to stdout/stderr in the container:

```shell
docker logs <container-name>
```
(Use `docker ps` to find the container name)

If manually running `ollama serve` in a terminal, the logs will be on that terminal.

When you run Ollama on **Windows**, there are a few different locations. You can view them in the explorer window by hitting `<cmd>+R` and typing in:
- `explorer %LOCALAPPDATA%\Ollama` to view logs
- `explorer %LOCALAPPDATA%\Programs\Ollama` to browse the binaries (The installer adds this to your user PATH)
- `explorer %HOMEPATH%\.ollama` to browse where models and configuration are stored
- `explorer %TEMP%` where temporary executable files are stored in one or more `ollama*` directories

To enable additional debug logging to help troubleshoot problems, first **Quit the running app from the tray menu**, then in a powershell terminal:
```powershell
$env:OLLAMA_DEBUG="1"
& "ollama app.exe"
```

Join the [Discord](https://discord.gg/ollama) for help interpreting the logs.

## LLM libraries

Ollama includes multiple LLM libraries compiled for different GPUs and CPU
vector features. Ollama tries to pick the best one based on the capabilities of
your system. If this autodetection has problems, or you run into other problems
(e.g. crashes in your GPU) you can work around this by forcing a specific LLM
library. `cpu_avx2` will perform the best, followed by `cpu_avx`; the slowest
but most compatible is `cpu`. Rosetta emulation under MacOS will work with the
`cpu` library.

In the server log, you will see a message that looks something like this (varies
from release to release):

```
Dynamic LLM libraries [rocm_v6 cpu cpu_avx cpu_avx2 cuda_v11 rocm_v5]
```

**Experimental LLM Library Override**

You can set OLLAMA_LLM_LIBRARY to any of the available LLM libraries to bypass
autodetection, so for example, if you have a CUDA card, but want to force the
CPU LLM library with AVX2 vector support, use:

```
OLLAMA_LLM_LIBRARY="cpu_avx2" ollama serve
```

You can see what features your CPU has with the following:
```
cat /proc/cpuinfo | grep flags | head -1
```

## Known issues

* N/A
@@ -17,7 +17,7 @@ Prerequisites:

Here are the steps:

- Install Ollama via standard Linux command (ignore the 404 error): `curl https://ollama.ai/install.sh | sh`
- Install Ollama via standard Linux command (ignore the 404 error): `curl https://ollama.com/install.sh | sh`
- Stop the Ollama service: `sudo systemctl stop ollama`
- Start Ollama serve in a tmux session called ollama_jetson and reference the CUDA libraries path: `tmux has-session -t ollama_jetson 2>/dev/null || tmux new-session -d -s ollama_jetson 'LD_LIBRARY_PATH=/usr/local/cuda/lib64 ollama serve'`
46
docs/windows.md
Normal file
@@ -0,0 +1,46 @@

# Ollama Windows Preview

Welcome to the Ollama Windows preview.

No more WSL required!

Ollama now runs as a native Windows application, including NVIDIA GPU support.
After installing Ollama Windows Preview, Ollama will run in the background and
the `ollama` command line is available in `cmd`, `powershell` or your favorite
terminal application. As usual the Ollama [api](./api.md) will be served on
`http://localhost:11434`.

As this is a preview release, you should expect a few bugs here and there. If
you run into a problem you can reach out on
[Discord](https://discord.gg/ollama), or file an
[issue](https://github.com/ollama/ollama/issues).
Logs will often be helpful in diagnosing the problem (see
[Troubleshooting](#troubleshooting) below)

## System Requirements

* Windows 10 or newer, Home or Pro
* NVIDIA 452.39 or newer Drivers if you have an NVIDIA card

## API Access

Here's a quick example showing API access from `powershell`:
```powershell
(Invoke-WebRequest -method POST -Body '{"model":"llama2", "prompt":"Why is the sky blue?", "stream": false}' -uri http://localhost:11434/api/generate ).Content | ConvertFrom-json
```

## Troubleshooting

While we're in preview, `OLLAMA_DEBUG` is always enabled, which adds
a "view logs" menu item to the app, and increases logging for the GUI app and
server.

Ollama on Windows stores files in a few different locations. You can view them in
the explorer window by hitting `<cmd>+R` and typing in:
- `explorer %LOCALAPPDATA%\Ollama` contains logs, and downloaded updates
  - *app.log* contains logs from the GUI application
  - *server.log* contains the server logs
  - *upgrade.log* contains log output for upgrades
- `explorer %LOCALAPPDATA%\Programs\Ollama` contains the binaries (The installer adds this to your user PATH)
- `explorer %HOMEPATH%\.ollama` contains models and configuration
- `explorer %TEMP%` contains temporary executable files in one or more `ollama*` directories
@@ -8,7 +8,7 @@

"outputs": [],
"source": [
"# Download and run the Ollama Linux install script\n",
"!curl https://ollama.ai/install.sh | sh\n",
"!curl -fsSL https://ollama.com/install.sh | sh\n",
"!command -v systemctl >/dev/null && sudo systemctl stop ollama"
]
},
@@ -2,28 +2,28 @@

## Prerequisites

- Ollama: https://ollama.ai/download
- Ollama: https://ollama.com/download
- Kubernetes cluster. This example will use Google Kubernetes Engine.

## Steps

1. Create the Ollama namespace, daemon set, and service

   ```bash
   kubectl apply -f cpu.yaml
   ```

1. Port forward the Ollama service to connect and use it locally

   ```bash
   kubectl -n ollama port-forward service/ollama 11434:80
   ```

1. Pull and run a model, for example `orca-mini:3b`

   ```bash
   ollama run orca-mini:3b
   ```

## (Optional) Hardware Acceleration
@@ -1,6 +1,6 @@

# LangChain Web Summarization

This example summarizes the website, [https://ollama.ai/blog/run-llama2-uncensored-locally](https://ollama.ai/blog/run-llama2-uncensored-locally)
This example summarizes the website, [https://ollama.com/blog/run-llama2-uncensored-locally](https://ollama.com/blog/run-llama2-uncensored-locally)

## Running the Example
@@ -2,7 +2,7 @@ from langchain.llms import Ollama

from langchain.document_loaders import WebBaseLoader
from langchain.chains.summarize import load_summarize_chain

loader = WebBaseLoader("https://ollama.ai/blog/run-llama2-uncensored-locally")
loader = WebBaseLoader("https://ollama.com/blog/run-llama2-uncensored-locally")
docs = loader.load()

llm = Ollama(model="llama2")
@@ -40,13 +40,13 @@ You are a log file analyzer. You will receive a set of lines from a log file for

"""
```

This model is available at https://ollama.ai/mattw/loganalyzer. You can customize it and add to your own namespace using the command `ollama create <namespace/modelname> -f <path-to-modelfile>` then `ollama push <namespace/modelname>`.
This model is available at https://ollama.com/mattw/loganalyzer. You can customize it and add to your own namespace using the command `ollama create <namespace/modelname> -f <path-to-modelfile>` then `ollama push <namespace/modelname>`.

Then loganalysis.py scans all the lines in the given log file and searches for the word 'error'. When the word is found, the 10 lines before and after are set as the prompt for a call to the Generate API.

```python
data = {
    "prompt": "\n".join(error_logs),
    "model": "mattw/loganalyzer"
}
```
@@ -29,9 +29,9 @@ You can also add your own character to be chosen at random when you ask a questi

   ```bash
   ollama pull stablebeluga2:70b-q4_K_M
   ```

2. Create a new character:

   ```bash
   npm run charactergen "Lorne Greene"
   ```

@@ -41,15 +41,15 @@ You can also add your own character to be chosen at random when you ask a questi

3. Now you can create a model with this command:

   ```bash
   ollama create <YourNamespace>/lornegreene -f lornegreene/Modelfile
   ollama create <username>/lornegreene -f lornegreene/Modelfile
   ```

`YourNamespace` is whatever name you set up when you signed up at [https://ollama.ai/signup](https://ollama.ai/signup).
`username` is whatever name you set up when you signed up at [https://ollama.com/signup](https://ollama.com/signup).

4. To add this to your mentors, you will have to update the code as follows. On line 8 of `mentors.ts`, add an object to the array, replacing `<YourNamespace>` with the namespace you used above.
4. To add this to your mentors, you will have to update the code as follows. On line 8 of `mentors.ts`, add an object to the array, replacing `<username>` with the username you used above.

   ```bash
   {ns: "<YourNamespace>", char: "Lorne Greene"}
   {ns: "<username>", char: "Lorne Greene"}
   ```

## Review the Code
14
go.mod
@@ -12,12 +12,26 @@ require (

)

require (
	github.com/containerd/console v1.0.3 // indirect
	github.com/cratonica/2goarray v0.0.0-20190331194516-514510793eaa // indirect
	github.com/davecgh/go-spew v1.1.1 // indirect
	github.com/getlantern/context v0.0.0-20190109183933-c447772a6520 // indirect
	github.com/getlantern/errors v0.0.0-20190325191628-abdb3e3e36f7 // indirect
	github.com/getlantern/golog v0.0.0-20190830074920-4ef2e798c2d7 // indirect
	github.com/getlantern/hex v0.0.0-20190417191902-c6586a6fe0b7 // indirect
	github.com/getlantern/hidden v0.0.0-20190325191715-f02dbb02be55 // indirect
	github.com/getlantern/ops v0.0.0-20190325191751-d70cb0d6f85f // indirect
	github.com/getlantern/systray v1.2.2 // indirect
	github.com/go-stack/stack v1.8.0 // indirect
	github.com/google/uuid v1.0.0 // indirect
	github.com/mattn/go-runewidth v0.0.14 // indirect
	github.com/oxtoacart/bpool v0.0.0-20190530202638-03653db5a59c // indirect
	github.com/pborman/uuid v1.2.1 // indirect
	github.com/pmezard/go-difflib v1.0.0 // indirect
	github.com/rivo/uniseg v0.2.0 // indirect
)

require (
	github.com/bytedance/sonic v1.9.1 // indirect
	github.com/chenzhuoyu/base64x v0.0.0-20221115062448-fe3a3abad311 // indirect
33
go.sum
@@ -4,7 +4,11 @@ github.com/bytedance/sonic v1.9.1/go.mod h1:i736AoUSYt75HyZLoJW9ERYxcy6eaN6h4BZX
github.com/chenzhuoyu/base64x v0.0.0-20211019084208-fb5309c8db06/go.mod h1:DH46F32mSOjUmXrMHnKwZdA8wcEefY7UVqBKYGjpdQY=
github.com/chenzhuoyu/base64x v0.0.0-20221115062448-fe3a3abad311 h1:qSGYFH7+jGhDF8vLC+iwCD4WpbV1EBDSzWkJODFLams=
github.com/chenzhuoyu/base64x v0.0.0-20221115062448-fe3a3abad311/go.mod h1:b583jCggY9gE99b6G5LEC39OIiVsWj+R97kbl5odCEk=
github.com/containerd/console v1.0.3 h1:lIr7SlA5PxZyMV30bDW0MGbiOPXwc63yRuCP0ARubLw=
github.com/containerd/console v1.0.3/go.mod h1:7LqA/THxQ86k76b8c/EMSiaJ3h1eZkMkXar0TQ1gf3U=
github.com/cpuguy83/go-md2man/v2 v2.0.2/go.mod h1:tgQtvFlXSQOSOSIRvRPT7W67SCa46tRHOmNcaadrF8o=
github.com/cratonica/2goarray v0.0.0-20190331194516-514510793eaa h1:Wg+722vs7a2zQH5lR9QWYsVbplKeffaQFIs5FTdfNNo=
github.com/cratonica/2goarray v0.0.0-20190331194516-514510793eaa/go.mod h1:6Arca19mRx58CA7OWEd7Wu1NpC1rd3uDnNs6s1pj/DI=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
@@ -13,6 +17,20 @@ github.com/emirpasic/gods v1.18.1 h1:FXtiHYKDGKCW2KzwZKx0iC0PQmdlorYgdFG9jPXJ1Bc
github.com/emirpasic/gods v1.18.1/go.mod h1:8tpGGwCnJ5H4r6BWwaV6OrWmMoPhUl5jm/FMNAnJvWQ=
github.com/gabriel-vasile/mimetype v1.4.2 h1:w5qFW6JKBz9Y393Y4q372O9A7cUSequkh1Q7OhCmWKU=
github.com/gabriel-vasile/mimetype v1.4.2/go.mod h1:zApsH/mKG4w07erKIaJPFiX0Tsq9BFQgN3qGY5GnNgA=
github.com/getlantern/context v0.0.0-20190109183933-c447772a6520 h1:NRUJuo3v3WGC/g5YiyF790gut6oQr5f3FBI88Wv0dx4=
github.com/getlantern/context v0.0.0-20190109183933-c447772a6520/go.mod h1:L+mq6/vvYHKjCX2oez0CgEAJmbq1fbb/oNJIWQkBybY=
github.com/getlantern/errors v0.0.0-20190325191628-abdb3e3e36f7 h1:6uJ+sZ/e03gkbqZ0kUG6mfKoqDb4XMAzMIwlajq19So=
github.com/getlantern/errors v0.0.0-20190325191628-abdb3e3e36f7/go.mod h1:l+xpFBrCtDLpK9qNjxs+cHU6+BAdlBaxHqikB6Lku3A=
github.com/getlantern/golog v0.0.0-20190830074920-4ef2e798c2d7 h1:guBYzEaLz0Vfc/jv0czrr2z7qyzTOGC9hiQ0VC+hKjk=
github.com/getlantern/golog v0.0.0-20190830074920-4ef2e798c2d7/go.mod h1:zx/1xUUeYPy3Pcmet8OSXLbF47l+3y6hIPpyLWoR9oc=
github.com/getlantern/hex v0.0.0-20190417191902-c6586a6fe0b7 h1:micT5vkcr9tOVk1FiH8SWKID8ultN44Z+yzd2y/Vyb0=
github.com/getlantern/hex v0.0.0-20190417191902-c6586a6fe0b7/go.mod h1:dD3CgOrwlzca8ed61CsZouQS5h5jIzkK9ZWrTcf0s+o=
github.com/getlantern/hidden v0.0.0-20190325191715-f02dbb02be55 h1:XYzSdCbkzOC0FDNrgJqGRo8PCMFOBFL9py72DRs7bmc=
github.com/getlantern/hidden v0.0.0-20190325191715-f02dbb02be55/go.mod h1:6mmzY2kW1TOOrVy+r41Za2MxXM+hhqTtY3oBKd2AgFA=
github.com/getlantern/ops v0.0.0-20190325191751-d70cb0d6f85f h1:wrYrQttPS8FHIRSlsrcuKazukx/xqO/PpLZzZXsF+EA=
github.com/getlantern/ops v0.0.0-20190325191751-d70cb0d6f85f/go.mod h1:D5ao98qkA6pxftxoqzibIBBrLSUli+kYnJqrgBf9cIA=
github.com/getlantern/systray v1.2.2 h1:dCEHtfmvkJG7HZ8lS/sLklTH4RKUcIsKrAD9sThoEBE=
github.com/getlantern/systray v1.2.2/go.mod h1:pXFOI1wwqwYXEhLPm9ZGjS2u/vVELeIgNMY5HvhHhcE=
github.com/gin-contrib/cors v1.4.0 h1:oJ6gwtUl3lqV0WEIwM/LxPF1QZ5qe2lGWdY2+bz7y0g=
github.com/gin-contrib/cors v1.4.0/go.mod h1:bs9pNM0x/UsmHPBWT2xZz9ROh8xYjYkiURUfmBoMlcs=
github.com/gin-contrib/sse v0.1.0 h1:Y/yl/+YNO8GZSjAhjMsSuLt29uWRFHdHYUb5lYOV9qE=
@@ -31,6 +49,8 @@ github.com/go-playground/universal-translator v0.18.1/go.mod h1:xekY+UJKNuX9WP91
github.com/go-playground/validator/v10 v10.10.0/go.mod h1:74x4gJWsvQexRdW8Pn3dXSGrTK4nAUsbPlLADvpJkos=
github.com/go-playground/validator/v10 v10.14.0 h1:vgvQWe3XCz3gIeFDm/HnTIbj6UGmg/+t63MyGU2n5js=
github.com/go-playground/validator/v10 v10.14.0/go.mod h1:9iXMNT7sEkjXb0I+enO7QXmzG6QCsPWY4zveKFVRSyU=
github.com/go-stack/stack v1.8.0 h1:5SgMzNM5HxrEjV0ww2lTmX6E2Izsfxas4+YHWRs3Lsk=
github.com/go-stack/stack v1.8.0/go.mod h1:v0f6uXyyMGvRgIKkXu+yp6POWl0qKG85gN/melR3HDY=
github.com/goccy/go-json v0.9.7/go.mod h1:6MelG93GURQebXPDq3khkgXZkazVtN9CRI+MGFi0w8I=
github.com/goccy/go-json v0.10.2 h1:CrxCmQqYDkv1z7lO7Wbh2HN93uovUHgrECaO5ZrCXAU=
github.com/goccy/go-json v0.10.2/go.mod h1:6MelG93GURQebXPDq3khkgXZkazVtN9CRI+MGFi0w8I=
@@ -39,6 +59,8 @@ github.com/google/go-cmp v0.5.5/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/
github.com/google/go-cmp v0.5.9 h1:O2Tfq5qg4qc4AmwVlvv0oLiVAGB7enBSJ2x2DqQFi38=
github.com/google/go-cmp v0.5.9/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
github.com/google/uuid v1.0.0 h1:b4Gk+7WdP/d3HZH8EJsZpvV7EtDOgaZLtnaNGIu1adA=
github.com/google/uuid v1.0.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8=
github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=
github.com/json-iterator/go v1.1.12 h1:PV8peI4a0ysnczrg+LtxykD8LfKY9ML6u2jnxaEnrnM=
@@ -57,6 +79,8 @@ github.com/kr/text v0.2.0/go.mod h1:eLer722TekiGuMkidMxC/pM04lWEeraHUUmBw8l2grE=
github.com/leodido/go-urn v1.2.1/go.mod h1:zt4jvISO2HfUBqxjfIshjdMTYS56ZS/qv49ictyFfxY=
github.com/leodido/go-urn v1.2.4 h1:XlAE/cm/ms7TE/VMVoduSpNBoyc2dOxHs5MZSwAN63Q=
github.com/leodido/go-urn v1.2.4/go.mod h1:7ZrI8mTSeBSHl/UaRyKQW1qZeMgak41ANeCNaVckg+4=
github.com/lxn/walk v0.0.0-20210112085537-c389da54e794/go.mod h1:E23UucZGqpuUANJooIbHWCufXvOcT6E7Stq81gU+CSQ=
github.com/lxn/win v0.0.0-20210218163916-a377121e959e/go.mod h1:KxxjdtRkfNoYDCUP5ryK7XJJNTnpC8atvtmTheChOtk=
github.com/mattn/go-isatty v0.0.14/go.mod h1:7GGIvUiUoEMVVmxf/4nioHXj79iQHKdU27kJ6hsGG94=
github.com/mattn/go-isatty v0.0.19 h1:JITubQf0MOLdlGRuRq+jtsDlekdYPia9ZFsB8h/APPA=
github.com/mattn/go-isatty v0.0.19/go.mod h1:W+V8PltTTMOvKvAeJH7IuucS94S2C6jfK/D7dTCTo3Y=
@@ -70,8 +94,12 @@ github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9G
github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
github.com/olekukonko/tablewriter v0.0.5 h1:P2Ga83D34wi1o9J6Wh1mRuqd4mF/x/lgBS7N7AbDhec=
github.com/olekukonko/tablewriter v0.0.5/go.mod h1:hPp6KlRPjbx+hW8ykQs1w3UBbZlj6HuIJcUGPhkA7kY=
github.com/oxtoacart/bpool v0.0.0-20190530202638-03653db5a59c h1:rp5dCmg/yLR3mgFuSOe4oEnDDmGLROTvMragMUXpTQw=
github.com/oxtoacart/bpool v0.0.0-20190530202638-03653db5a59c/go.mod h1:X07ZCGwUbLaax7L0S3Tw4hpejzu63ZrrQiUe6W0hcy0=
github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58 h1:onHthvaw9LFnH4t2DcNVpwGmV9E1BkGknEliJkfwQj0=
github.com/pbnjay/memory v0.0.0-20210728143218-7b4eea64cf58/go.mod h1:DXv8WO4yhMYhSNPKjeNKa5WY9YCIEBRbNzFFPJbWO6Y=
github.com/pborman/uuid v1.2.1 h1:+ZZIw58t/ozdjRaXh/3awHfmWRbzYxJoAdNJxe/3pvw=
github.com/pborman/uuid v1.2.1/go.mod h1:X/NO0urCmaxf9VXbdlT7C2Yzkj2IKimNn4k+gtPdI/k=
github.com/pelletier/go-toml/v2 v2.0.1/go.mod h1:r9LEWfGN8R5k0VXJ+0BkIe7MYkRdwZOjgMj2KwnJFUo=
github.com/pelletier/go-toml/v2 v2.0.8 h1:0ctb6s9mE31h0/lhu+J6OPmVeDxJn+kYnJc2jZR9tGQ=
github.com/pelletier/go-toml/v2 v2.0.8/go.mod h1:vuYfssBdrU2XDZ9bYydBu6t+6a6PYNcZljzZR9VXg+4=
@@ -84,6 +112,7 @@ github.com/rogpeppe/go-internal v1.6.1/go.mod h1:xXDCJY+GAPziupqXw64V24skbSoqbTE
github.com/rogpeppe/go-internal v1.8.0 h1:FCbCCtXNOY3UtUuHUYaghJg4y7Fd14rXifAYUAtL9R8=
github.com/rogpeppe/go-internal v1.8.0/go.mod h1:WmiCO8CzOY8rg0OYDC4/i/2WRWAB6poM+XZ2dLUbcbE=
github.com/russross/blackfriday/v2 v2.1.0/go.mod h1:+Rmxgy9KzJVeS9/2gXHxylqXiyQDYRxCVz55jmeOWTM=
github.com/skratchdot/open-golang v0.0.0-20200116055534-eef842397966/go.mod h1:sUM3LWHvSMaG192sy56D9F7CNvL7jUJVXoqM1QKLnog=
github.com/spf13/cobra v1.7.0 h1:hyqWnYt1ZQShIddO5kBpj3vu05/++x6tJ6dg8EC572I=
github.com/spf13/cobra v1.7.0/go.mod h1:uLxZILRyS/50WlhOIKD7W6V5bgeIt+4sICxh6uRMrb0=
github.com/spf13/pflag v1.0.5 h1:iy+VFUOCP1a+8yFto/drg2CJ5u0yRoB7fZw3DKv/JXA=
@@ -120,11 +149,14 @@ golang.org/x/net v0.17.0 h1:pVaXccu2ozPjCXewfr1S7xza/zcXTity9cCdXQYSjIM=
golang.org/x/net v0.17.0/go.mod h1:NxSsAGuq816PNPmqtQdLE42eU2Fs7NoRIZrHJAlaCOE=
golang.org/x/sync v0.3.0 h1:ftCYgMx6zT/asHUrPw8BLLscYtGznsLAnjq5RH9P66E=
golang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y=
golang.org/x/sys v0.0.0-20201018230417-eeed37f84f13/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20201119102817-f84b799fce68/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210124154548-22da62e12c0c/go.mod h1:h1NjWce9XRLGQEsW7wpKNCjG9DtNlClVuFLEZdDNbEs=
golang.org/x/sys v0.0.0-20210615035016-665e8c7367d1/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20210630005230-0f9fa26af87c/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20210806184541-e5e7981a1069/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.0.0-20220704084225-05e143d24a9e/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.1.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.6.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
golang.org/x/sys v0.13.0 h1:Af8nKPmuFypiUBjVoU9V20FiaFXOcuZI21p0ycVYYGE=
golang.org/x/sys v0.13.0/go.mod h1:oPkhp1MJrh7nUepCBck5+mAzfO9JrbApNNgaTdGDITg=
@@ -141,6 +173,7 @@ google.golang.org/protobuf v1.26.0-rc.1/go.mod h1:jlhhOSvTdKEhbULTjvd4ARK9grFBp0
google.golang.org/protobuf v1.28.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I=
google.golang.org/protobuf v1.30.0 h1:kPPoIgf3TsEvrm0PFe15JQ+570QVxYzEvvHqChK+cng=
google.golang.org/protobuf v1.30.0/go.mod h1:HV8QOd/L58Z+nl8r43ehVNZIU/HEI6OcFqwMG9pJV4I=
gopkg.in/Knetic/govaluate.v3 v3.0.0/go.mod h1:csKLBORsPbafmSCGTEh3U7Ozmsuq8ZSIlKk1bcqph0E=
gopkg.in/check.v1 v0.0.0-20161208181325-20d25e280405/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20180628173108-788fd7840127/go.mod h1:Co6ibVJAznAaIkqp8huTwlJQCZ016jof/cbN4VW5Yz0=
gopkg.in/check.v1 v1.0.0-20201130134442-10cb98267c6c h1:Hei/4ADfdWqJk1ZMxUNpqntNwaWcugrBjAiHlqqRiVk=
101
gpu/amd.go
Normal file
@@ -0,0 +1,101 @@

package gpu

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"log/slog"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// TODO - windows vs. non-windows vs darwin

// Discovery logic for AMD/ROCm GPUs

const (
	DriverVersionFile     = "/sys/module/amdgpu/version"
	GPUPropertiesFileGlob = "/sys/class/kfd/kfd/topology/nodes/*/properties"
	// TODO probably break these down per GPU to make the logic simpler
	GPUTotalMemoryFileGlob = "/sys/class/kfd/kfd/topology/nodes/*/mem_banks/*/properties" // size_in_bytes line
	GPUUsedMemoryFileGlob  = "/sys/class/kfd/kfd/topology/nodes/*/mem_banks/*/used_memory"
)

func AMDDetected() bool {
	// Some driver versions (older?) don't have a version file, so just lookup the parent dir
	sysfsDir := filepath.Dir(DriverVersionFile)
	_, err := os.Stat(sysfsDir)
	if errors.Is(err, os.ErrNotExist) {
		slog.Debug("amd driver not detected " + sysfsDir)
		return false
	} else if err != nil {
		slog.Debug(fmt.Sprintf("error looking up amd driver %s %s", sysfsDir, err))
		return false
	}
	return true
}

func AMDDriverVersion() (string, error) {
	_, err := os.Stat(DriverVersionFile)
	if err != nil {
		return "", fmt.Errorf("amdgpu file stat error: %s %w", DriverVersionFile, err)
	}
	fp, err := os.Open(DriverVersionFile)
	if err != nil {
		return "", err
	}
	defer fp.Close()
	verString, err := io.ReadAll(fp)
	if err != nil {
		return "", err
	}
	return strings.TrimSpace(string(verString)), nil
}

func AMDGFXVersions() []Version {
	res := []Version{}
	matches, _ := filepath.Glob(GPUPropertiesFileGlob)
	for _, match := range matches {
		fp, err := os.Open(match)
		if err != nil {
			slog.Debug(fmt.Sprintf("failed to open sysfs node file %s: %s", match, err))
			continue
		}
		defer fp.Close()

		scanner := bufio.NewScanner(fp)
		// optionally, resize scanner's capacity for lines over 64K, see next example
		for scanner.Scan() {
			line := strings.TrimSpace(scanner.Text())
			if strings.HasPrefix(line, "gfx_target_version") {
				ver := strings.Fields(line)
				if len(ver) != 2 || len(ver[1]) < 5 {
					slog.Debug("malformed " + line)
					continue
				}
				l := len(ver[1])
				patch, err1 := strconv.ParseUint(ver[1][l-2:l], 10, 32)
				minor, err2 := strconv.ParseUint(ver[1][l-4:l-2], 10, 32)
				major, err3 := strconv.ParseUint(ver[1][:l-4], 10, 32)
				if err1 != nil || err2 != nil || err3 != nil {
					slog.Debug("malformed int " + line)
					continue
				}

				res = append(res, Version{
					Major: uint(major),
					Minor: uint(minor),
					Patch: uint(patch),
				})
			}
		}
	}
	return res
}

func (v Version) ToGFXString() string {
	return fmt.Sprintf("gfx%d%d%d", v.Major, v.Minor, v.Patch)
}
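To see the raw values this discovery code parses, the sysfs files named in the constants above can be inspected directly (a sketch; output varies by driver and GPU):

```shell
# driver version read by AMDDriverVersion
cat /sys/module/amdgpu/version
# per-node values scanned by AMDGFXVersions; e.g. "gfx_target_version 90000"
# parses to major=9, minor=0, patch=0, i.e. "gfx900"
grep -h gfx_target_version /sys/class/kfd/kfd/topology/nodes/*/properties
```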
101
gpu/gpu.go
@@ -30,8 +30,8 @@ type handles struct {

var gpuMutex sync.Mutex
var gpuHandles *handles = nil

// With our current CUDA compile flags, 5.2 and older will not work properly
const CudaComputeMajorMin = 6
// With our current CUDA compile flags, older than 5.0 will not work properly
var CudaComputeMin = [2]C.int{5, 0}

// Possible locations for the nvidia-ml library
var CudaLinuxGlobs = []string{

@@ -122,70 +122,93 @@ func GetGPUInfo() GpuInfo {

	initGPUHandles()
}

// All our GPU builds have AVX enabled, so fallback to CPU if we don't detect at least AVX
// All our GPU builds on x86 have AVX enabled, so fallback to CPU if we don't detect at least AVX
cpuVariant := GetCPUVariant()
if cpuVariant == "" {
if cpuVariant == "" && runtime.GOARCH == "amd64" {
	slog.Warn("CPU does not have AVX or AVX2, disabling GPU support.")
}

var memInfo C.mem_info_t
resp := GpuInfo{}
if gpuHandles.cuda != nil && cpuVariant != "" {
if gpuHandles.cuda != nil && (cpuVariant != "" || runtime.GOARCH != "amd64") {
	C.cuda_check_vram(*gpuHandles.cuda, &memInfo)
	if memInfo.err != nil {
		slog.Info(fmt.Sprintf("error looking up CUDA GPU memory: %s", C.GoString(memInfo.err)))
		C.free(unsafe.Pointer(memInfo.err))
	} else {
	} else if memInfo.count > 0 {
		// Verify minimum compute capability
		var cc C.cuda_compute_capability_t
		C.cuda_compute_capability(*gpuHandles.cuda, &cc)
		if cc.err != nil {
			slog.Info(fmt.Sprintf("error looking up CUDA GPU compute capability: %s", C.GoString(cc.err)))
			C.free(unsafe.Pointer(cc.err))
		} else if cc.major >= CudaComputeMajorMin {
		} else if cc.major > CudaComputeMin[0] || (cc.major == CudaComputeMin[0] && cc.minor >= CudaComputeMin[1]) {
			slog.Info(fmt.Sprintf("CUDA Compute Capability detected: %d.%d", cc.major, cc.minor))
			resp.Library = "cuda"
		} else {
			slog.Info(fmt.Sprintf("CUDA GPU is too old. Falling back to CPU mode. Compute Capability detected: %d.%d", cc.major, cc.minor))
		}
	}
} else if gpuHandles.rocm != nil && cpuVariant != "" {
	C.rocm_check_vram(*gpuHandles.rocm, &memInfo)
	if memInfo.err != nil {
		slog.Info(fmt.Sprintf("error looking up ROCm GPU memory: %s", C.GoString(memInfo.err)))
		C.free(unsafe.Pointer(memInfo.err))
	} else if memInfo.igpu_index >= 0 && memInfo.count == 1 {
		// Only one GPU detected and it appears to be an integrated GPU - skip it
		slog.Info("ROCm unsupported integrated GPU detected")
} else if AMDDetected() && gpuHandles.rocm != nil && (cpuVariant != "" || runtime.GOARCH != "amd64") {
	ver, err := AMDDriverVersion()
	if err == nil {
		slog.Info("AMD Driver: " + ver)
	} else {
		if memInfo.igpu_index >= 0 {
			// We have multiple GPUs reported, and one of them is an integrated GPU
			// so we have to set the env var to bypass it
			// If the user has specified their own ROCR_VISIBLE_DEVICES, don't clobber it
			val := os.Getenv("ROCR_VISIBLE_DEVICES")
			if val == "" {
				devices := []string{}
				for i := 0; i < int(memInfo.count); i++ {
					if i == int(memInfo.igpu_index) {
						continue
		// For now this is benign, but we may eventually need to fail compatibility checks
		slog.Debug("error looking up amd driver version: %s", err)
	}
	gfx := AMDGFXVersions()
	tooOld := false
	for _, v := range gfx {
		if v.Major < 9 {
			slog.Info("AMD GPU too old, falling back to CPU " + v.ToGFXString())
			tooOld = true
			break
		}

		// TODO - remap gfx strings for unsupported minor/patch versions to supported for the same major
		// e.g. gfx1034 works if we map it to gfx1030 at runtime

	}
	if !tooOld {
		// TODO - this algo can be shifted over to use sysfs instead of the rocm info library...
		C.rocm_check_vram(*gpuHandles.rocm, &memInfo)
		if memInfo.err != nil {
			slog.Info(fmt.Sprintf("error looking up ROCm GPU memory: %s", C.GoString(memInfo.err)))
			C.free(unsafe.Pointer(memInfo.err))
		} else if memInfo.igpu_index >= 0 && memInfo.count == 1 {
			// Only one GPU detected and it appears to be an integrated GPU - skip it
			slog.Info("ROCm unsupported integrated GPU detected")
		} else if memInfo.count > 0 {
			if memInfo.igpu_index >= 0 {
				// We have multiple GPUs reported, and one of them is an integrated GPU
				// so we have to set the env var to bypass it
				// If the user has specified their own ROCR_VISIBLE_DEVICES, don't clobber it
				val := os.Getenv("ROCR_VISIBLE_DEVICES")
				if val == "" {
					devices := []string{}
					for i := 0; i < int(memInfo.count); i++ {
						if i == int(memInfo.igpu_index) {
							continue
						}
						devices = append(devices, strconv.Itoa(i))
					}
					devices = append(devices, strconv.Itoa(i))
					val = strings.Join(devices, ",")
					os.Setenv("ROCR_VISIBLE_DEVICES", val)
				}
				val = strings.Join(devices, ",")
				os.Setenv("ROCR_VISIBLE_DEVICES", val)
				slog.Info(fmt.Sprintf("ROCm integrated GPU detected - ROCR_VISIBLE_DEVICES=%s", val))
			}
			slog.Info(fmt.Sprintf("ROCm integrated GPU detected - ROCR_VISIBLE_DEVICES=%s", val))
			resp.Library = "rocm"
			var version C.rocm_version_resp_t
			C.rocm_get_version(*gpuHandles.rocm, &version)
			verString := C.GoString(version.str)
			if version.status == 0 {
				resp.Variant = "v" + verString
			} else {
				slog.Info(fmt.Sprintf("failed to look up ROCm version: %s", verString))
			}
			C.free(unsafe.Pointer(version.str))
		}
		resp.Library = "rocm"
		var version C.rocm_version_resp_t
		C.rocm_get_version(*gpuHandles.rocm, &version)
		verString := C.GoString(version.str)
		if version.status == 0 {
			resp.Variant = "v" + verString
		} else {
			slog.Info(fmt.Sprintf("failed to look up ROCm version: %s", verString))
		}
		C.free(unsafe.Pointer(version.str))
	}
}
if resp.Library == "" {
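A couple of manual checks that correspond to the gating logic in this hunk (sketches; the `compute_cap` query assumes a reasonably recent NVIDIA driver):

```shell
# compare a card's compute capability against the 5.0 minimum enforced above
nvidia-smi --query-gpu=compute_cap --format=csv,noheader
# pre-set the ROCm device list yourself; the code above only sets it when unset
ROCR_VISIBLE_DEVICES=0 ollama serve
```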
@@ -178,7 +178,7 @@ void rocm_get_version(rocm_handle_t h, rocm_version_resp_t *resp) {

  const int buflen = 256;
  char buf[buflen + 1];
  if (h.handle == NULL) {
    resp->str = strdup("nvml handle not initialized");
    resp->str = strdup("rocm handle not initialized");
    resp->status = 1;
    return;
  }

@@ -195,4 +195,4 @@ void rocm_get_version(rocm_handle_t h, rocm_version_resp_t *resp) {

  resp->str = strdup(buf);
}

#endif // __APPLE__
@@ -16,3 +16,9 @@ type GpuInfo struct {

	// TODO add other useful attributes about the card here for discovery information
}

type Version struct {
	Major uint
	Minor uint
	Patch uint
}
@@ -4,7 +4,7 @@ package llm

#cgo CFLAGS: -I${SRCDIR}/ext_server -I${SRCDIR}/llama.cpp -I${SRCDIR}/llama.cpp/common -I${SRCDIR}/llama.cpp/examples/server
#cgo CFLAGS: -DNDEBUG -DLLAMA_SERVER_LIBRARY=1 -D_XOPEN_SOURCE=600 -DACCELERATE_NEW_LAPACK -DACCELERATE_LAPACK_ILP64
#cgo CFLAGS: -Wmissing-noreturn -Wextra -Wcast-qual -Wno-unused-function -Wno-array-bounds
#cgo CPPFLAGS: -Ofast -Wextra -Wno-unused-function -Wno-unused-variable -Wno-deprecated-declarations -Wno-unused-but-set-variable
#cgo CPPFLAGS: -Ofast -Wextra -Wno-unused-function -Wno-unused-variable -Wno-deprecated-declarations
#cgo darwin CFLAGS: -D_DARWIN_C_SOURCE
#cgo darwin CPPFLAGS: -DGGML_USE_ACCELERATE
#cgo darwin CPPFLAGS: -DGGML_USE_METAL -DGGML_METAL_NDEBUG
@@ -161,13 +161,10 @@ func newDynExtServer(library, model string, adapters, projectors []string, opts

func (llm *dynExtServer) Predict(ctx context.Context, predict PredictOpts, fn func(PredictResult)) error {
	resp := newExtServerResp(128)
	defer freeExtServerResp(resp)
	var imageData []ImageData

	if len(predict.Images) > 0 {
		for cnt, i := range predict.Images {
			imageData = append(imageData, ImageData{Data: i, ID: cnt})
		}
		slog.Info(fmt.Sprintf("loaded %d images", len(predict.Images)))
	}
	slog.Info(fmt.Sprintf("loaded %d images", len(imageData)))

	request := map[string]any{
		"prompt": predict.Prompt,

@@ -189,7 +186,7 @@ func (llm *dynExtServer) Predict(ctx context.Context, predict PredictOpts, fn fu

		"penalize_nl":  predict.Options.PenalizeNewline,
		"seed":         predict.Options.Seed,
		"stop":         predict.Options.Stop,
		"image_data":   imageData,
		"image_data":   predict.Images,
		"cache_prompt": true,
	}

@@ -197,6 +194,7 @@ func (llm *dynExtServer) Predict(ctx context.Context, predict PredictOpts, fn fu

		request["grammar"] = jsonGrammar
	}

	var whitespace int
	retryDelay := 100 * time.Microsecond
	for retries := 0; retries < maxRetries; retries++ {
		if retries > 0 {

@@ -255,13 +253,31 @@ func (llm *dynExtServer) Predict(ctx context.Context, predict PredictOpts, fn fu

			break out
		}

		// detect if p.Content is entirely whitespace
		if predict.Format == "json" && strings.TrimSpace(p.Content) == "" {
			whitespace++

			// if we get 100 consecutive whitespace responses, cancel
			if whitespace > 100 {
				slog.Debug("cancelling due to excessive whitespace")
				C.dyn_llama_server_completion_cancel(llm.s, resp.id, &resp)
				if resp.id < 0 {
					return extServerResponseToErr(resp)
				}

				return nil
			}
		} else {
			whitespace = 0
		}

		if p.Content != "" {
			fn(PredictResult{
				Content: p.Content,
			})
		}

		if p.Stop {
		if p.Stop || bool(result.stop) {
			fn(PredictResult{
				Done:            true,
				PromptEvalCount: p.Timings.PromptN,
@@ -1,4 +1,5 @@
#include "ext_server.h"
#include <atomic>

// Necessary evil since the server types are not defined in a header
#include "server.cpp"
@@ -26,13 +27,29 @@

// Expose the llama server as a callable extern "C" API
llama_server_context *llama = NULL;
std::atomic<bool> ext_server_running(false);
std::thread ext_server_thread;
bool shutting_down = false;
std::atomic_int recv_counter;

// RAII wrapper for tracking in-flight recv calls
class atomicRecv {
 public:
  atomicRecv(std::atomic<int> &atomic) : atomic(atomic) {
    ++this->atomic;
  }
  ~atomicRecv() {
    --this->atomic;
  }
 private:
  std::atomic<int> &atomic;
};
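atomicRecv bumps recv_counter on construction and decrements it on destruction, so every blocking call into queue_results.recv is counted for exactly as long as it runs and shutdown can wait for the counter to drain. A rough Go analogue of the same in-flight-counter idea, with defer standing in for the destructor (names are illustrative):

```go
package main

import (
    "fmt"
    "sync/atomic"
)

var recvCounter atomic.Int64

// trackRecv plays the role of the atomicRecv RAII wrapper: increment on entry,
// and the returned func (run via defer) decrements on exit, even on panic.
func trackRecv() func() {
    recvCounter.Add(1)
    return func() { recvCounter.Add(-1) }
}

func recvResult() {
    defer trackRecv()()
    // ... a blocking receive would happen here ...
}

func main() {
    recvResult()
    fmt.Println("in-flight after return:", recvCounter.Load()) // prints 0
}
```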
void llama_server_init(ext_server_params *sparams, ext_server_resp_t *err) {
  recv_counter = 0;
  assert(err != NULL && sparams != NULL);
  log_set_target(stderr);
  if (!sparams->verbose_logging) {
    server_verbose = true;
    log_disable();
  }

@@ -122,18 +139,23 @@ void llama_server_start() {
  assert(llama != NULL);
  // TODO mutex to protect thread creation
  ext_server_thread = std::thread([&]() {
    ext_server_running = true;
    try {
      LOG_TEE("llama server main loop starting\n");
      ggml_time_init();
      while (ext_server_running.load()) {
        if (!llama->update_slots()) {
          LOG_TEE(
              "unexpected error in llama server update_slots - exiting main "
              "loop\n");
          break;
        }
      }
      llama->queue_tasks.on_new_task(std::bind(
          &llama_server_context::process_single_task, llama, std::placeholders::_1));
      llama->queue_tasks.on_finish_multitask(std::bind(
          &llama_server_context::on_finish_multitask, llama, std::placeholders::_1));
      llama->queue_tasks.on_all_tasks_finished(std::bind(
          &llama_server_context::run_on_all_tasks_finished, llama));
      llama->queue_results.on_multitask_update(std::bind(
          &llama_server_queue::update_multitask,
          &llama->queue_tasks,
          std::placeholders::_1,
          std::placeholders::_2,
          std::placeholders::_3
      ));
      llama->queue_tasks.start_loop();
    } catch (std::exception &e) {
      LOG_TEE("caught exception in llama server main loop: %s\n", e.what());
    } catch (...) {
@@ -146,17 +168,22 @@ void llama_server_start() {

void llama_server_stop() {
  assert(llama != NULL);
  // TODO - too verbose, remove once things are solid
  LOG_TEE("requesting llama server shutdown\n");
  ext_server_running = false;
  // Shutdown any in-flight requests and block incoming requests.
  LOG_TEE("\ninitiating shutdown - draining remaining tasks...\n");
  shutting_down = true;

  // unblocks the update_slots() loop so it can clean up and exit
  llama->request_cancel(0);
  while (recv_counter.load() > 0) {
    std::this_thread::sleep_for(std::chrono::milliseconds(50));
  }

  // This may take a while for any pending tasks to drain
  // TODO - consider a timeout to cancel tasks if it's taking too long
  llama->queue_tasks.terminate();
  ext_server_thread.join();
  delete llama;
  llama = NULL;
  LOG_TEE("llama server shutdown complete\n");
  shutting_down = false;
}

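The reworked llama_server_stop follows a common graceful-shutdown recipe: set a shutting-down flag so new requests are refused, cancel outstanding work, poll until the in-flight counter drains, then stop the task loop and join the worker thread. A compact Go rendering of that sequence under the same assumptions (all names hypothetical):

```go
package main

import (
    "fmt"
    "sync/atomic"
    "time"
)

type server struct {
    shuttingDown atomic.Bool
    inFlight     atomic.Int64
    done         chan struct{} // closed when the worker loop exits
}

// stop mirrors the drain-then-join shutdown above.
func (s *server) stop() {
    s.shuttingDown.Store(true) // refuse new requests from here on

    // wait for in-flight receives to drain, polling like the C++ loop
    for s.inFlight.Load() > 0 {
        time.Sleep(50 * time.Millisecond)
    }

    <-s.done // "join" the worker
    fmt.Println("shutdown complete")
}

func main() {
    s := &server{done: make(chan struct{})}
    close(s.done) // pretend the worker already exited, for the demo
    s.stop()
}
```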
void llama_server_completion(const char *json_req, ext_server_resp_t *resp) {
@@ -164,8 +191,13 @@ void llama_server_completion(const char *json_req, ext_server_resp_t *resp) {
  resp->id = -1;
  resp->msg[0] = '\0';
  try {
    if (shutting_down) {
      throw std::runtime_error("server shutting down");
    }
    json data = json::parse(json_req);
    resp->id = llama->request_completion(data, false, false, -1);
    resp->id = llama->queue_tasks.get_new_id();
    llama->queue_results.add_waiting_task_id(resp->id);
    llama->request_completion(resp->id, data, false, false, -1);
  } catch (std::exception &e) {
    snprintf(resp->msg, resp->msg_len, "exception %s", e.what());
  } catch (...) {
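The completion path now allocates the task ID and registers it as a waiting task before the work is submitted, rather than taking whatever ID request_completion returns, presumably so a fast result cannot arrive before any waiter is registered. The same ordering in a small Go sketch, with channels standing in for the queues (names are illustrative):

```go
package main

import "fmt"

type results struct{ waiting map[int]chan string }

func (r *results) addWaiting(id int) { r.waiting[id] = make(chan string, 1) }

func (r *results) deliver(id int, v string) {
    if ch, ok := r.waiting[id]; ok {
        ch <- v // by construction a waiter is already registered
    }
}

func main() {
    r := &results{waiting: map[int]chan string{}}
    id := 42              // stands in for queue_tasks.get_new_id()
    r.addWaiting(id)      // register the waiter first...
    r.deliver(id, "done") // ...then submit the work that produces a result
    fmt.Println(<-r.waiting[id])
}
```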
@@ -183,16 +215,28 @@ void llama_server_completion_next_result(const int task_id,
  resp->json_resp = NULL;
  std::string result_json;
  try {
    task_result result = llama->next_result(task_id);
    atomicRecv ar(recv_counter);
    task_result result = llama->queue_results.recv(task_id);
    result_json =
        result.result_json.dump(-1, ' ', false, json::error_handler_t::replace);
    resp->id = result.id;
    resp->stop = result.stop;
    resp->error = result.error;
    if (result.error) {
      LOG_TEE("next result cancel on error\n");
      llama->request_cancel(task_id);
      LOG_TEE("next result removing waiting task ID: %d\n", task_id);
      llama->queue_results.remove_waiting_task_id(task_id);
    } else if (result.stop) {
      LOG_TEE("next result cancel on stop\n");
      llama->request_cancel(task_id);
      LOG_TEE("next result removing waiting task ID: %d\n", task_id);
      llama->queue_results.remove_waiting_task_id(task_id);
    } else if (shutting_down) {
      LOG_TEE("aborting completion due to shutdown %d\n", task_id);
      llama->request_cancel(task_id);
      llama->queue_results.remove_waiting_task_id(task_id);
      resp->stop = true;
    }
  } catch (std::exception &e) {
    resp->error = true;
@@ -223,6 +267,7 @@ void llama_server_completion_cancel(const int task_id, ext_server_resp_t *err) {
  err->msg[0] = '\0';
  try {
    llama->request_cancel(task_id);
    llama->queue_results.remove_waiting_task_id(task_id);
  } catch (std::exception &e) {
    err->id = -1;
    snprintf(err->msg, err->msg_len, "exception %s", e.what());
@@ -240,6 +285,9 @@ void llama_server_tokenize(const char *json_req, char **json_resp,
  err->id = 0;
  err->msg[0] = '\0';
  try {
    if (shutting_down) {
      throw std::runtime_error("server shutting down");
    }
    const json body = json::parse(json_req);
    std::vector<llama_token> tokens;
    if (body.count("content") != 0) {
@@ -273,6 +321,9 @@ void llama_server_detokenize(const char *json_req, char **json_resp,
  err->id = 0;
  err->msg[0] = '\0';
  try {
    if (shutting_down) {
      throw std::runtime_error("server shutting down");
    }
    const json body = json::parse(json_req);
    std::string content;
    if (body.count("tokens") != 0) {
@@ -300,6 +351,9 @@ void llama_server_embedding(const char *json_req, char **json_resp,
  err->id = 0;
  err->msg[0] = '\0';
  try {
    if (shutting_down) {
      throw std::runtime_error("server shutting down");
    }
    const json body = json::parse(json_req);
    json prompt;
    if (body.count("content") != 0) {
@@ -307,13 +361,16 @@
  } else {
    prompt = "";
  }
  const int task_id = llama->request_completion(
      {{"prompt", prompt}, {"n_predict", 0}}, false, true, -1);
  task_result result = llama->next_result(task_id);
  const int task_id = llama->queue_tasks.get_new_id();
  llama->queue_results.add_waiting_task_id(task_id);
  llama->request_completion(task_id, {{"prompt", prompt}, {"n_predict", 0}}, false, true, -1);
  atomicRecv ar(recv_counter);
  task_result result = llama->queue_results.recv(task_id);
  std::string result_json = result.result_json.dump();
  const std::string::size_type size = result_json.size() + 1;
  *json_resp = new char[size];
  snprintf(*json_resp, size, "%s", result_json.c_str());
  llama->queue_results.remove_waiting_task_id(task_id);
} catch (std::exception &e) {
  err->id = -1;
  snprintf(err->msg, err->msg_len, "exception %s", e.what());

@@ -39,6 +39,9 @@ init_vars() {
  *)
    ;;
  esac
  if [ -z "${CMAKE_CUDA_ARCHITECTURES}" ] ; then
    CMAKE_CUDA_ARCHITECTURES="50;52;61;70;75;80"
  fi
}

git_module_setup() {
@@ -62,15 +65,17 @@ apply_patches() {
    echo 'include (../../../ext_server/CMakeLists.txt) # ollama' >>${LLAMACPP_DIR}/examples/server/CMakeLists.txt
  fi

  # apply temporary patches until fix is upstream
  for patch in ../patches/*.diff; do
    for file in $(grep "^+++ " ${patch} | cut -f2 -d' ' | cut -f2- -d/); do
      (cd ${LLAMACPP_DIR}; git checkout ${file})
  if [ -n "$(ls -A ../patches/*.diff)" ]; then
    # apply temporary patches until fix is upstream
    for patch in ../patches/*.diff; do
      for file in $(grep "^+++ " ${patch} | cut -f2 -d' ' | cut -f2- -d/); do
        (cd ${LLAMACPP_DIR}; git checkout ${file})
      done
    done
    done
    for patch in ../patches/*.diff; do
      (cd ${LLAMACPP_DIR} && git apply ${patch})
    done
    for patch in ../patches/*.diff; do
      (cd ${LLAMACPP_DIR} && git apply ${patch})
    done
  fi

  # Avoid duplicate main symbols when we link into the cgo binary
  sed -e 's/int main(/int __main(/g' <${LLAMACPP_DIR}/examples/server/server.cpp >${LLAMACPP_DIR}/examples/server/server.cpp.tmp &&
@@ -109,4 +114,12 @@ compress_libs() {
# Keep the local tree clean after we're done with the build
cleanup() {
  (cd ${LLAMACPP_DIR}/examples/server/ && git checkout CMakeLists.txt server.cpp)

  if [ -n "$(ls -A ../patches/*.diff)" ]; then
    for patch in ../patches/*.diff; do
      for file in $(grep "^+++ " ${patch} | cut -f2 -d' ' | cut -f2- -d/); do
        (cd ${LLAMACPP_DIR}; git checkout ${file})
      done
    done
  fi
}

@@ -21,7 +21,6 @@ amdGPUs() {
    return
  fi
  GPU_LIST=(
    "gfx803"
    "gfx900"
    "gfx906:xnack-"
    "gfx908:xnack-"
@@ -128,6 +127,11 @@ if [ -z "${CUDA_LIB_DIR}" ] && [ -d /opt/cuda/targets/x86_64-linux/lib ]; then
  CUDA_LIB_DIR=/opt/cuda/targets/x86_64-linux/lib
fi

# Allow override in case libcudart is in the wrong place
if [ -z "${CUDART_LIB_DIR}" ]; then
  CUDART_LIB_DIR="${CUDA_LIB_DIR}"
fi

if [ -d "${CUDA_LIB_DIR}" ]; then
  echo "CUDA libraries detected - building dynamic CUDA library"
  init_vars
@@ -135,7 +139,7 @@ if [ -d "${CUDA_LIB_DIR}" ]; then
  if [ -n "${CUDA_MAJOR}" ]; then
    CUDA_VARIANT=_v${CUDA_MAJOR}
  fi
  CMAKE_DEFS="-DLLAMA_CUBLAS=on ${COMMON_CMAKE_DEFS} ${CMAKE_DEFS}"
  CMAKE_DEFS="-DLLAMA_CUBLAS=on -DLLAMA_CUDA_FORCE_MMQ=on -DCMAKE_CUDA_ARCHITECTURES=${CMAKE_CUDA_ARCHITECTURES} ${COMMON_CMAKE_DEFS} ${CMAKE_DEFS}"
  BUILD_DIR="${LLAMACPP_DIR}/build/linux/${ARCH}/cuda${CUDA_VARIANT}"
  EXTRA_LIBS="-L${CUDA_LIB_DIR} -lcudart -lcublas -lcublasLt -lcuda"
  build
@@ -151,6 +155,8 @@ if [ -d "${CUDA_LIB_DIR}" ]; then
      cp "${CUDA_LIB_DIR}/${DEP}" "${BUILD_DIR}/lib/"
    elif [ -e "${CUDA_LIB_DIR}/${lib}.${CUDA_MAJOR}" ]; then
      cp "${CUDA_LIB_DIR}/${lib}.${CUDA_MAJOR}" "${BUILD_DIR}/lib/"
    elif [ -e "${CUDART_LIB_DIR}/${lib}" ]; then
      cp -d ${CUDART_LIB_DIR}/${lib}* "${BUILD_DIR}/lib/"
    else
      cp -d "${CUDA_LIB_DIR}/${lib}*" "${BUILD_DIR}/lib/"
    fi

@@ -3,8 +3,9 @@
$ErrorActionPreference = "Stop"

function init_vars {
    $script:SRC_DIR = $(resolve-path "..\..\")
    $script:llamacppDir = "../llama.cpp"
    $script:cmakeDefs = @("-DBUILD_SHARED_LIBS=on", "-DLLAMA_NATIVE=off", "-A","x64")
    $script:cmakeDefs = @("-DBUILD_SHARED_LIBS=on", "-DLLAMA_NATIVE=off", "-A", "x64")
    $script:cmakeTargets = @("ext_server")
    $script:ARCH = "amd64" # arm not yet supported.
    if ($env:CGO_CFLAGS -contains "-g") {
@@ -19,12 +20,23 @@ function init_vars {
        $d=(get-command -ea 'silentlycontinue' nvcc).path
        if ($d -ne $null) {
            $script:CUDA_LIB_DIR=($d| split-path -parent)
            $script:CUDA_INCLUDE_DIR=($script:CUDA_LIB_DIR|split-path -parent)+"\include"
        }
    } else {
        $script:CUDA_LIB_DIR=$env:CUDA_LIB_DIR
    }
    $script:GZIP=(get-command -ea 'silentlycontinue' gzip).path
    $script:DUMPBIN=(get-command -ea 'silentlycontinue' dumpbin).path
    if ($null -eq $env:CMAKE_CUDA_ARCHITECTURES) {
        $script:CMAKE_CUDA_ARCHITECTURES="50;52;61;70;75;80"
    } else {
        $script:CMAKE_CUDA_ARCHITECTURES=$env:CMAKE_CUDA_ARCHITECTURES
    }
    # Note: 10 Windows Kit signtool crashes with GCP's plugin
    ${script:SignTool}="C:\Program Files (x86)\Windows Kits\8.1\bin\x64\signtool.exe"
    if ("${env:KEY_CONTAINER}") {
        ${script:OLLAMA_CERT}=$(resolve-path "${script:SRC_DIR}\ollama_inc.crt")
    }
}

function git_module_setup {
@@ -51,8 +63,8 @@ function apply_patches {
    }

    # Checkout each file
    Set-Location -Path ${script:llamacppDir}
    foreach ($file in $filePaths) {
        Set-Location -Path ${script:llamacppDir}
        git checkout $file
    }
}
@@ -84,13 +96,23 @@ function install {
    md "${script:buildDir}/lib" -ea 0 > $null
    cp "${script:buildDir}/bin/${script:config}/ext_server.dll" "${script:buildDir}/lib"
    cp "${script:buildDir}/bin/${script:config}/llama.dll" "${script:buildDir}/lib"

    # Display the dll dependencies in the build log
    if ($script:DUMPBIN -ne $null) {
        & "$script:DUMPBIN" /dependents "${script:buildDir}/bin/${script:config}/ext_server.dll" | select-string ".dll"
    }
}

function sign {
    if ("${env:KEY_CONTAINER}") {
        write-host "Signing ${script:buildDir}/lib/*.dll"
        foreach ($file in (get-childitem "${script:buildDir}/lib/*.dll")){
            & "${script:SignTool}" sign /v /fd sha256 /t http://timestamp.digicert.com /f "${script:OLLAMA_CERT}" `
                /csp "Google Cloud KMS Provider" /kc "${env:KEY_CONTAINER}" $file
            if ($LASTEXITCODE -ne 0) { exit($LASTEXITCODE)}
        }
    }
}

function compress_libs {
    if ($script:GZIP -eq $null) {
        write-host "gzip not installed, not compressing files"
@@ -104,8 +126,23 @@ function compress_libs {
}

function cleanup {
    $patches = Get-ChildItem "../patches/*.diff"
    foreach ($patch in $patches) {
        # Extract file paths from the patch file
        $filePaths = Get-Content $patch.FullName | Where-Object { $_ -match '^\+\+\+ ' } | ForEach-Object {
            $parts = $_ -split ' '
            ($parts[1] -split '/', 2)[1]
        }

        # Checkout each file
        Set-Location -Path ${script:llamacppDir}
        foreach ($file in $filePaths) {
            git checkout $file
        }
    }
    Set-Location "${script:llamacppDir}/examples/server"
    git checkout CMakeLists.txt server.cpp

}

init_vars
@@ -124,6 +161,7 @@ $script:buildDir="${script:llamacppDir}/build/windows/${script:ARCH}/cpu"
write-host "Building LCD CPU"
build
install
sign
compress_libs

$script:cmakeDefs = $script:commonCpuDefs + @("-DLLAMA_AVX=on", "-DLLAMA_AVX2=off", "-DLLAMA_AVX512=off", "-DLLAMA_FMA=off", "-DLLAMA_F16C=off") + $script:cmakeDefs
@@ -131,6 +169,7 @@ $script:buildDir="${script:llamacppDir}/build/windows/${script:ARCH}/cpu_avx"
write-host "Building AVX CPU"
build
install
sign
compress_libs

$script:cmakeDefs = $script:commonCpuDefs + @("-DLLAMA_AVX=on", "-DLLAMA_AVX2=on", "-DLLAMA_AVX512=off", "-DLLAMA_FMA=on", "-DLLAMA_F16C=on") + $script:cmakeDefs
@@ -138,25 +177,22 @@ $script:buildDir="${script:llamacppDir}/build/windows/${script:ARCH}/cpu_avx2"
write-host "Building AVX2 CPU"
build
install
sign
compress_libs

if ($null -ne $script:CUDA_LIB_DIR) {
    # Then build cuda as a dynamically loaded library
    $nvcc = (get-command -ea 'silentlycontinue' nvcc)
    if ($null -ne $nvcc) {
        $script:CUDA_VERSION=(get-item ($nvcc | split-path | split-path)).Basename
    }
    $nvcc = "$script:CUDA_LIB_DIR\nvcc.exe"
    $script:CUDA_VERSION=(get-item ($nvcc | split-path | split-path)).Basename
    if ($null -ne $script:CUDA_VERSION) {
        $script:CUDA_VARIANT="_"+$script:CUDA_VERSION
    }
    init_vars
    $script:buildDir="${script:llamacppDir}/build/windows/${script:ARCH}/cuda$script:CUDA_VARIANT"
    $script:cmakeDefs += @("-DLLAMA_CUBLAS=ON", "-DLLAMA_AVX=on")
    $script:cmakeDefs += @("-DLLAMA_CUBLAS=ON", "-DLLAMA_AVX=on", "-DLLAMA_AVX2=off", "-DCUDAToolkit_INCLUDE_DIR=$script:CUDA_INCLUDE_DIR", "-DCMAKE_CUDA_ARCHITECTURES=${script:CMAKE_CUDA_ARCHITECTURES}")
    build
    install
    cp "${script:CUDA_LIB_DIR}/cudart64_*.dll" "${script:buildDir}/lib"
    cp "${script:CUDA_LIB_DIR}/cublas64_*.dll" "${script:buildDir}/lib"
    cp "${script:CUDA_LIB_DIR}/cublasLt64_*.dll" "${script:buildDir}/lib"
    sign
    compress_libs
}
# TODO - actually implement ROCm support on windows
@@ -167,4 +203,4 @@ md "${script:buildDir}/lib" -ea 0 > $null
echo $null >> "${script:buildDir}/lib/.generated"

cleanup
write-host "`ngo generate completed"
write-host "`ngo generate completed"

@@ -62,7 +62,7 @@ const maxRetries = 3
type PredictOpts struct {
    Prompt string
    Format string
    Images []api.ImageData
    Images []ImageData
    Options api.Options
}

14
llm/llm.go
@@ -120,7 +120,7 @@ func New(workDir, model string, adapters, projectors []string, opts api.Options)

    opts.RopeFrequencyBase = 0.0
    opts.RopeFrequencyScale = 0.0
    return newLlmServer(info, model, adapters, projectors, opts)
    return newLlmServer(info, workDir, model, adapters, projectors, opts)
}

// Give any native cgo implementations an opportunity to initialize
@@ -128,7 +128,7 @@ func Init(workdir string) error {
    return nativeInit(workdir)
}

func newLlmServer(gpuInfo gpu.GpuInfo, model string, adapters, projectors []string, opts api.Options) (LLM, error) {
func newLlmServer(gpuInfo gpu.GpuInfo, workDir, model string, adapters, projectors []string, opts api.Options) (LLM, error) {
    dynLibs := getDynLibs(gpuInfo)

    // Check to see if the user has requested a specific library instead of auto-detecting
@@ -143,6 +143,16 @@ func newLlmServer(gpuInfo gpu.GpuInfo, model string, adapters, projectors []stri
        }
    }

    // We stage into a temp directory, and if we've been idle for a while, it may have been reaped
    _, err := os.Stat(dynLibs[0])
    if err != nil {
        slog.Info(fmt.Sprintf("%s has disappeared, reloading libraries", dynLibs[0]))
        err = nativeInit(workDir)
        if err != nil {
            return nil, err
        }
    }

    err2 := fmt.Errorf("unable to locate suitable llm library")
    for _, dynLib := range dynLibs {
        srv, err := newDynExtServer(dynLib, model, adapters, projectors, opts)
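newLlmServer now re-checks that the staged library file still exists before using it, since the temp staging directory can be reaped while the process sits idle. A tiny Go sketch of that stat-then-reinit guard; the path and the reinit function are hypothetical stand-ins for nativeInit:

```go
package main

import (
    "fmt"
    "os"
)

// ensureStaged re-extracts payloads when the staged file has been reaped,
// mirroring the os.Stat guard added above.
func ensureStaged(path string, reinit func() error) error {
    if _, err := os.Stat(path); err != nil {
        fmt.Printf("%s has disappeared, reloading libraries\n", path)
        return reinit()
    }
    return nil
}

func main() {
    _ = ensureStaged("/tmp/ollama-libs/cpu.so", func() error { return nil })
}
```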
@@ -1,30 +1,21 @@
diff --git a/examples/server/server.cpp b/examples/server/server.cpp
index 0462fbd2..4fa7b57f 100644
index d86d7e04..2694e92e 100644
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -1857,12 +1857,6 @@ struct llama_server_context
LOG_TEE("slot %d : in cache: %i tokens | to process: %i tokens\n", slot.id, slot.n_past, slot.num_prompt_tokens_processed);
}
@@ -901,13 +901,15 @@ struct llama_server_context
slot.sent_count += result.text_to_send.size();
// add the token to slot queue and cache
}
- slot.add_token_string(result);
+
if (slot.params.stream)
{
send_partial_response(slot, result);
}
}

- LOG_TEE("slot %d : kv cache rm - [%d, end)\n", slot.id, (int) system_tokens.size() + slot.n_past);
-
- llama_kv_cache_seq_rm(ctx, slot.id, system_tokens.size() + slot.n_past, -1);
-
- slot.cache_tokens = prompt_tokens;
-
if (slot.n_past == slot.num_prompt_tokens && slot.n_past > 0)
{
// we have to evaluate at least 1 token to generate logits.
@@ -1870,6 +1864,12 @@ struct llama_server_context
slot.n_past--;
}

+ LOG_TEE("slot %d : kv cache rm - [%d, end)\n", slot.id, (int) system_tokens.size() + slot.n_past);
+ slot.add_token_string(result);
+
+ llama_kv_cache_seq_rm(ctx, slot.id, system_tokens.size() + slot.n_past, -1);
+
+ slot.cache_tokens = prompt_tokens;
+
LOG_VERBOSE("prompt ingested", {
{"n_past", slot.n_past},
{"cached", tokens_to_str(ctx, slot.cache_tokens.cbegin(), slot.cache_tokens.cbegin() + slot.n_past)},
if (incomplete)
{
slot.has_next_token = true;

96
llm/patches/02-shutdown.diff
Normal file
@@ -0,0 +1,96 @@
diff --git a/examples/server/server.cpp b/examples/server/server.cpp
index a0b46970..7800c6e7 100644
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -28,6 +28,7 @@
#include <chrono>
#include <condition_variable>
#include <atomic>
+#include <signal.h>

using json = nlohmann::json;

@@ -2511,6 +2512,9 @@ static void append_to_generated_text_from_generated_token_probs(llama_server_con
}
}

+std::function<void(int)> shutdown_handler;
+inline void signal_handler(int signal) { shutdown_handler(signal); }
+
int main(int argc, char **argv)
{
#if SERVER_VERBOSE != 1
@@ -3128,8 +3132,25 @@ int main(int argc, char **argv)
std::placeholders::_2,
std::placeholders::_3
));
- llama.queue_tasks.start_loop();

+ shutdown_handler = [&](int) {
+ llama.queue_tasks.terminate();
+ };
+
+#if defined (__unix__) || (defined (__APPLE__) && defined (__MACH__))
+ struct sigaction sigint_action;
+ sigint_action.sa_handler = signal_handler;
+ sigemptyset (&sigint_action.sa_mask);
+ sigint_action.sa_flags = 0;
+ sigaction(SIGINT, &sigint_action, NULL);
+#elif defined (_WIN32)
+ auto console_ctrl_handler = +[](DWORD ctrl_type) -> BOOL {
+ return (ctrl_type == CTRL_C_EVENT) ? (signal_handler(SIGINT), true) : false;
+ };
+ SetConsoleCtrlHandler(reinterpret_cast<PHANDLER_ROUTINE>(console_ctrl_handler), true);
+#endif
+ llama.queue_tasks.start_loop();
+ svr.stop();
t.join();

llama_backend_free();
diff --git a/examples/server/utils.hpp b/examples/server/utils.hpp
index 54854896..0ee670db 100644
--- a/examples/server/utils.hpp
+++ b/examples/server/utils.hpp
@@ -220,6 +220,7 @@ inline std::string format_chatml(std::vector<json> messages)
struct llama_server_queue {
int id = 0;
std::mutex mutex_tasks;
+ bool running;
// queues
std::vector<task_server> queue_tasks;
std::vector<task_server> queue_tasks_deferred;
@@ -278,9 +279,18 @@ struct llama_server_queue {
queue_tasks_deferred.clear();
}

- // Start the main loop. This call is blocking
- [[noreturn]]
+ // end the start_loop routine
+ void terminate() {
+ {
+ std::unique_lock<std::mutex> lock(mutex_tasks);
+ running = false;
+ }
+ condition_tasks.notify_all();
+ }
+
+ // Start the main loop.
void start_loop() {
+ running = true;
while (true) {
// new task arrived
LOG_VERBOSE("have new task", {});
@@ -324,8 +334,12 @@ struct llama_server_queue {
{
std::unique_lock<std::mutex> lock(mutex_tasks);
if (queue_tasks.empty()) {
+ if (!running) {
+ LOG_VERBOSE("ending start_loop", {});
+ return;
+ }
condition_tasks.wait(lock, [&]{
- return !queue_tasks.empty();
+ return (!queue_tasks.empty() || !running);
});
}
}
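The utils.hpp hunk converts start_loop from a [[noreturn]] loop into one that can be asked to stop: terminate() flips running under the queue mutex and wakes every waiter, and the wait predicate now also returns on !running. The same stoppable consumer loop in Go, using sync.Cond; this is a sketch, and the field names are made up:

```go
package main

import (
    "fmt"
    "sync"
    "time"
)

type taskQueue struct {
    mu      sync.Mutex
    cond    *sync.Cond
    tasks   []string
    running bool
}

// terminate mirrors llama_server_queue::terminate: flip the flag under the
// lock, then wake every waiter so the loop can observe it.
func (q *taskQueue) terminate() {
    q.mu.Lock()
    q.running = false
    q.mu.Unlock()
    q.cond.Broadcast()
}

func (q *taskQueue) startLoop() {
    q.mu.Lock()
    q.running = true
    for {
        for len(q.tasks) == 0 && q.running {
            q.cond.Wait() // releases the lock while blocked
        }
        if !q.running {
            q.mu.Unlock()
            fmt.Println("ending start_loop")
            return
        }
        q.tasks = q.tasks[1:] // pop a task; processing would happen here
    }
}

func main() {
    q := &taskQueue{tasks: []string{"t1"}}
    q.cond = sync.NewCond(&q.mu)
    done := make(chan struct{})
    go func() { q.startLoop(); close(done) }()
    time.Sleep(10 * time.Millisecond) // let the loop start (demo only)
    q.terminate()
    <-done
}
```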
116
llm/patches/03-cudaleaks.diff
Normal file
@@ -0,0 +1,116 @@
diff --git a/examples/server/server.cpp b/examples/server/server.cpp
index 3102762c..568ac1d0 100644
--- a/examples/server/server.cpp
+++ b/examples/server/server.cpp
@@ -307,6 +307,10 @@ struct llama_client_slot
}
};

+#ifdef GGML_USE_CUBLAS
+extern "C" GGML_CALL void ggml_free_cublas(void);
+#endif
+
struct llama_server_context
{
llama_model *model = nullptr;
@@ -353,6 +357,10 @@ struct llama_server_context
llama_free_model(model);
model = nullptr;
}
+#ifdef GGML_USE_CUBLAS
+ ggml_free_cublas();
+#endif
+
}

bool load_model(const gpt_params &params_)
@@ -3093,6 +3101,7 @@ int main(int argc, char **argv)
sigemptyset (&sigint_action.sa_mask);
sigint_action.sa_flags = 0;
sigaction(SIGINT, &sigint_action, NULL);
+ sigaction(SIGUSR1, &sigint_action, NULL);
#elif defined (_WIN32)
auto console_ctrl_handler = +[](DWORD ctrl_type) -> BOOL {
return (ctrl_type == CTRL_C_EVENT) ? (signal_handler(SIGINT), true) : false;
@@ -3106,3 +3115,4 @@ int main(int argc, char **argv)
llama_backend_free();
return 0;
}
+
diff --git a/ggml-cuda.cu b/ggml-cuda.cu
index 96976f24..3543920e 100644
--- a/ggml-cuda.cu
+++ b/ggml-cuda.cu
@@ -39,6 +39,7 @@
#define __shfl_xor_sync(mask, var, laneMask, width) __shfl_xor(var, laneMask, width)
#define cublasComputeType_t hipblasDatatype_t //deprecated, new hipblasComputeType_t not in 5.6
#define cublasCreate hipblasCreate
+#define cublasDestroy hipblasDestroy
#define cublasGemmEx hipblasGemmEx
#define cublasGemmBatchedEx hipblasGemmBatchedEx
#define cublasGemmStridedBatchedEx hipblasGemmStridedBatchedEx
@@ -7928,10 +7929,11 @@ GGML_CALL bool ggml_cublas_loaded(void) {
return g_cublas_loaded;
}

+static bool g_cublas_initialized = false;
+
GGML_CALL void ggml_init_cublas() {
- static bool initialized = false;

- if (!initialized) {
+ if (!g_cublas_initialized) {

#ifdef __HIP_PLATFORM_AMD__
// Workaround for a rocBLAS bug when using multiple graphics cards:
@@ -7941,7 +7943,7 @@ GGML_CALL void ggml_init_cublas() {
#endif

if (cudaGetDeviceCount(&g_device_count) != cudaSuccess) {
- initialized = true;
+ g_cublas_initialized = true;
g_cublas_loaded = false;
return;
}
@@ -8011,7 +8013,7 @@ GGML_CALL void ggml_init_cublas() {
// configure logging to stdout
// CUBLAS_CHECK(cublasLoggerConfigure(1, 1, 0, nullptr));

- initialized = true;
+ g_cublas_initialized = true;
g_cublas_loaded = true;
}
}
@@ -11528,3 +11530,17 @@ GGML_CALL int ggml_backend_cuda_reg_devices() {
}
return device_count;
}
+
+extern "C" GGML_CALL void ggml_free_cublas(void);
+GGML_CALL void ggml_free_cublas(void) {
+ for (int id = 0; id < g_device_count; ++id) {
+#if !defined(GGML_USE_HIPBLAS)
+ CU_CHECK(cuMemUnmap(g_cuda_pool_addr[id], g_cuda_pool_size[id]));
+ g_cuda_pool_size[id] = 0;
+ g_cuda_pool_addr[id] = 0;
+#endif
+ CUBLAS_CHECK(cublasDestroy(g_cublas_handles[id]));
+ g_cublas_handles[id] = nullptr;
+ }
+ g_cublas_initialized = false;
+}
\ No newline at end of file
diff --git a/ggml-cuda.h b/ggml-cuda.h
index b1ebd61d..b4c80c2c 100644
--- a/ggml-cuda.h
+++ b/ggml-cuda.h
@@ -20,6 +20,9 @@ extern "C" {
// Always success. To check if CUDA is actually loaded, use `ggml_cublas_loaded`.
GGML_API GGML_CALL void ggml_init_cublas(void);

+// Release CUDA resources
+GGML_API GGML_CALL void ggml_free_cublas(void);
+
// Returns `true` if there are available CUDA devices and cublas loads successfully; otherwise, it returns `false`.
GGML_API GGML_CALL bool ggml_cublas_loaded(void);

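The leak fix promotes a function-local `static bool initialized` to a file-scope flag so that the new ggml_free_cublas can reset it and a later ggml_init_cublas can run again. The general init-free-reinit shape of that change, sketched in Go with a package-level flag and mutex (names are illustrative, not the ggml API):

```go
package main

import (
    "fmt"
    "sync"
)

var (
    mu          sync.Mutex
    initialized bool // package-level, so freeBackend can reset it
)

func initBackend() {
    mu.Lock()
    defer mu.Unlock()
    if initialized {
        return // idempotent, like the guarded ggml_init_cublas
    }
    fmt.Println("allocating handles")
    initialized = true
}

func freeBackend() {
    mu.Lock()
    defer mu.Unlock()
    fmt.Println("releasing handles")
    initialized = false // allows a later re-init
}

func main() {
    initBackend()
    freeBackend()
    initBackend() // runs again after free
}
```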
@@ -90,6 +90,7 @@ func getDynLibs(gpuInfo gpu.GpuInfo) []string {
    if len(dynLibs) == 0 {
        dynLibs = []string{availableDynLibs["cpu"]}
    }
    slog.Debug(fmt.Sprintf("ordered list of LLM libraries to try %v", dynLibs))
    return dynLibs
}

92
macapp/.gitignore
vendored
Normal file
@@ -0,0 +1,92 @@
# Logs
logs
*.log
npm-debug.log*
yarn-debug.log*
yarn-error.log*
lerna-debug.log*

# Diagnostic reports (https://nodejs.org/api/report.html)
report.[0-9]*.[0-9]*.[0-9]*.[0-9]*.json

# Runtime data
pids
*.pid
*.seed
*.pid.lock
.DS_Store

# Directory for instrumented libs generated by jscoverage/JSCover
lib-cov

# Coverage directory used by tools like istanbul
coverage
*.lcov

# nyc test coverage
.nyc_output

# node-waf configuration
.lock-wscript

# Compiled binary addons (https://nodejs.org/api/addons.html)
build/Release

# Dependency directories
node_modules/
jspm_packages/

# TypeScript v1 declaration files
typings/

# TypeScript cache
*.tsbuildinfo

# Optional npm cache directory
.npm

# Optional eslint cache
.eslintcache

# Optional REPL history
.node_repl_history

# Output of 'npm pack'
*.tgz

# Yarn Integrity file
.yarn-integrity

# dotenv environment variables file
.env
.env.test

# parcel-bundler cache (https://parceljs.org/)
.cache

# next.js build output
.next

# nuxt.js build output
.nuxt

# vuepress build output
.vuepress/dist

# Serverless directories
.serverless/

# FuseBox cache
.fusebox/

# DynamoDB Local files
.dynamodb/

# Webpack
.webpack/

# Vite
.vite/

# Electron-Forge
out/

21
macapp/README.md
Normal file
@@ -0,0 +1,21 @@
# Desktop

This app builds upon Ollama to provide a desktop experience for running models.

## Developing

First, build the `ollama` binary:

```
cd ..
go build .
```

Then run the desktop app with `npm start`:

```
cd app
npm install
npm start
```

(image diff: 8 binary icon assets moved; width, height, and size unchanged for each, ranging 402 B to 891 B)