lint fix

api: return structured error on unauthorized push
This commit implements a structured error response system for the Ollama API, replacing ad-hoc error handling and string parsing with proper error types and codes through a new ErrorResponse struct. Instead of relying on regex to parse error messages for SSH keys, the API now passes this data in a structured format with standardized fields for error messages, codes, and additional data. This structured approach makes the API more maintainable and reliable while improving the developer experience by enabling programmatic error handling, consistent error formats, and better error documentation.
2024-12-12 13:07:09 -08:00 · 2024-12-12 11:49:36 -08:00 · 2024-11-27 15:52:30 -08:00 · 2024-11-27 15:45:45 -08:00 · 2024-11-27 15:24:08 -08:00 · 2024-11-27 15:22:27 -08:00
33 changed files with 694 additions and 488 deletions
--- a/README.md
+++ b/README.md
@@ -298,7 +298,7 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [AnythingLLM (Docker + MacOs/Windows/Linux native app)](https://github.com/Mintplex-Labs/anything-llm)
 - [Ollama Basic Chat: Uses HyperDiv Reactive UI](https://github.com/rapidarchitect/ollama_basic_chat)
 - [Ollama-chats RPG](https://github.com/drazdra/ollama-chats)
- [QA-Pilot](https://github.com/reid41/QA-Pilot) (Chat with Code Repository)
+- [QA-Pilot](https://github.com/reid41/QA-Pilot) (Interactive chat tool that can leverage Ollama models for rapid understanding and navigation of GitHub code repositories)
 - [ChatOllama](https://github.com/sugarforever/chat-ollama) (Open Source Chatbot based on Ollama with Knowledge Bases)
 - [CRAG Ollama Chat](https://github.com/Nagi-ovo/CRAG-Ollama-Chat) (Simple Web Search with Corrective RAG)
 - [RAGFlow](https://github.com/infiniflow/ragflow) (Open-source Retrieval-Augmented Generation engine based on deep document understanding)
@@ -308,11 +308,17 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [Ollama RAG Chatbot](https://github.com/datvodinh/rag-chatbot.git) (Local Chat with multiple PDFs using Ollama and RAG)
 - [BrainSoup](https://www.nurgo-software.com/products/brainsoup) (Flexible native client with RAG & multi-agent automation)
 - [macai](https://github.com/Renset/macai) (macOS client for Ollama, ChatGPT, and other compatible API back-ends)
+- [RWKV-Runner](https://github.com/josStorer/RWKV-Runner) (RWKV offline LLM deployment tool, also usable as a client for ChatGPT and Ollama)
+- [Ollama Grid Search](https://github.com/dezoito/ollama-grid-search) (app to evaluate and compare models)
 - [Olpaka](https://github.com/Otacon/olpaka) (User-friendly Flutter Web App for Ollama)
 - [OllamaSpring](https://github.com/CrazyNeil/OllamaSpring) (Ollama Client for macOS)
 - [LLocal.in](https://github.com/kartikm7/llocal) (Easy to use Electron Desktop Client for Ollama)
+- [Shinkai Desktop](https://github.com/dcSpark/shinkai-apps) (Two click install Local AI using Ollama + Files + RAG)
 - [AiLama](https://github.com/zeyoyt/ailama) (A Discord User App that allows you to interact with Ollama anywhere in discord )
 - [Ollama with Google Mesop](https://github.com/rapidarchitect/ollama_mesop/) (Mesop Chat Client implementation with Ollama)
+- [R2R](https://github.com/SciPhi-AI/R2R) (Open-source RAG engine)
+- [Ollama-Kis](https://github.com/elearningshow/ollama-kis) (A simple easy to use GUI with sample custom LLM for Drivers Education)
+- [OpenGPA](https://opengpa.org) (Open-source offline-first Enterprise Agentic Application)
 - [Painting Droid](https://github.com/mateuszmigas/painting-droid) (Painting app with AI integrations)
 - [Kerlig AI](https://www.kerlig.com/) (AI writing assistant for macOS)
 - [AI Studio](https://github.com/MindWorkAI/AI-Studio)
@@ -320,6 +326,8 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [LLMStack](https://github.com/trypromptly/LLMStack) (No-code multi-agent framework to build LLM agents and workflows)
 - [BoltAI for Mac](https://boltai.com) (AI Chat Client for Mac)
 - [Harbor](https://github.com/av/harbor) (Containerized LLM Toolkit with Ollama as default backend)
+- [PyGPT](https://github.com/szczyglis-dev/py-gpt) (AI desktop assistant for Linux, Windows and Mac)
+- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT/blob/master/docs/content/platform/ollama.md) (AutoGPT Ollama integration)
 - [Go-CREW](https://www.jonathanhecl.com/go-crew/) (Powerful Offline RAG in Golang)
 - [PartCAD](https://github.com/openvmp/partcad/) (CAD model generation with OpenSCAD and CadQuery)
 - [Ollama4j Web UI](https://github.com/ollama4j/ollama4j-web-ui) - Java-based Web UI for Ollama built with Vaadin, Spring Boot and Ollama4j
@@ -329,18 +337,32 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [ConfiChat](https://github.com/1runeberg/confichat) (Lightweight, standalone, multi-platform, and privacy focused LLM chat interface with optional encryption)
 - [Archyve](https://github.com/nickthecook/archyve) (RAG-enabling document library)
 - [crewAI with Mesop](https://github.com/rapidarchitect/ollama-crew-mesop) (Mesop Web Interface to run crewAI with Ollama)
+- [Tkinter-based client](https://github.com/chyok/ollama-gui) (Python tkinter-based Client for Ollama)
 - [LLMChat](https://github.com/trendy-design/llmchat) (Privacy focused, 100% local, intuitive all-in-one chat interface)
+- [Local Multimodal AI Chat](https://github.com/Leon-Sander/Local-Multimodal-AI-Chat) (Ollama-based LLM Chat with support for multiple features, including PDF RAG, voice chat, image-based interactions, and integration with OpenAI.)
 - [ARGO](https://github.com/xark-argo/argo) (Locally download and run Ollama and Huggingface models with RAG on Mac/Windows/Linux)
+- [OrionChat](https://github.com/EliasPereirah/OrionChat) - OrionChat is a web interface for chatting with different AI providers
 - [G1](https://github.com/bklieger-groq/g1) (Prototype of using prompting strategies to improve the LLM's reasoning through o1-like reasoning chains.)
+- [Web management](https://github.com/lemonit-eric-mao/ollama-web-management) (Web management page)
+- [Promptery](https://github.com/promptery/promptery) (desktop client for Ollama.)
 - [Ollama App](https://github.com/JHubi1/ollama-app) (Modern and easy-to-use multi-platform client for Ollama)
+- [ollamarama-matrix](https://github.com/h1ddenpr0cess20/ollamarama-matrix) (Ollama chatbot for the Matrix chat protocol)
+- [ollama-chat-app](https://github.com/anan1213095357/ollama-chat-app) (Flutter-based chat app)
 - [Perfect Memory AI](https://www.perfectmemory.ai/) (Productivity AI assists personalized by what you have seen on your screen, heard and said in the meetings)
 - [Hexabot](https://github.com/hexastack/hexabot) (A conversational AI builder)
 - [Reddit Rate](https://github.com/rapidarchitect/reddit_analyzer) (Search and Rate Reddit topics with a weighted summation)
- [OpenTalkGpt](https://github.com/adarshM84/OpenTalkGpt)
+- [OpenTalkGpt](https://github.com/adarshM84/OpenTalkGpt) (Chrome Extension to manage open-source models supported by Ollama, create custom models, and chat with models from a user-friendly UI)
 - [VT](https://github.com/vinhnx/vt.ai) (A minimal multimodal AI chat app, with dynamic conversation routing. Supports local models via Ollama)
- [Witsy](https://github.com/nbonamy/witsy) (An AI Desktop application avaiable for Mac/Windows/Linux) 
+- [Nosia](https://github.com/nosia-ai/nosia) (Easy to install and use RAG platform based on Ollama)
+- [Witsy](https://github.com/nbonamy/witsy) (An AI Desktop application avaiable for Mac/Windows/Linux)
 - [Abbey](https://github.com/US-Artificial-Intelligence/abbey) (A configurable AI interface server with notebooks, document storage, and YouTube support)

+### Cloud
+
+- [Google Cloud](https://cloud.google.com/run/docs/tutorials/gpu-gemma2-with-ollama)
+- [Fly.io](https://fly.io/docs/python/do-more/add-ollama/)
+- [Koyeb](https://www.koyeb.com/deploy/ollama)
+
 ### Terminal

 - [oterm](https://github.com/ggozad/oterm)
@@ -355,7 +377,7 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [Oatmeal](https://github.com/dustinblackman/oatmeal)
 - [cmdh](https://github.com/pgibler/cmdh)
 - [ooo](https://github.com/npahlfer/ooo)
- [shell-pilot](https://github.com/reid41/shell-pilot)
+- [shell-pilot](https://github.com/reid41/shell-pilot)(Interact with models via pure shell scripts on Linux or macOS)
 - [tenere](https://github.com/pythops/tenere)
 - [llm-ollama](https://github.com/taketwo/llm-ollama) for [Datasette's LLM CLI](https://llm.datasette.io/en/stable/).
 - [typechat-cli](https://github.com/anaisbetts/typechat-cli)
@@ -363,13 +385,19 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [tlm](https://github.com/yusufcanb/tlm)
 - [podman-ollama](https://github.com/ericcurtin/podman-ollama)
 - [gollama](https://github.com/sammcj/gollama)
+- [ParLlama](https://github.com/paulrobello/parllama)
 - [Ollama eBook Summary](https://github.com/cognitivetech/ollama-ebook-summary/)
 - [Ollama Mixture of Experts (MOE) in 50 lines of code](https://github.com/rapidarchitect/ollama_moe)
 - [vim-intelligence-bridge](https://github.com/pepo-ec/vim-intelligence-bridge) Simple interaction of "Ollama" with the Vim editor
+- [x-cmd ollama](https://x-cmd.com/mod/ollama)
+- [bb7](https://github.com/drunkwcodes/bb7)
 - [SwollamaCLI](https://github.com/marcusziade/Swollama) bundled with the Swollama Swift package. [Demo](https://github.com/marcusziade/Swollama?tab=readme-ov-file#cli-usage)
 - [aichat](https://github.com/sigoden/aichat) All-in-one LLM CLI tool featuring Shell Assistant, Chat-REPL, RAG, AI tools & agents, with access to OpenAI, Claude, Gemini, Ollama, Groq, and more.
+- [PowershAI](https://github.com/rrg92/powershai) PowerShell module that brings AI to terminal on Windows, including support for Ollama
+- [orbiton](https://github.com/xyproto/orbiton) Configuration-free text editor and IDE with support for tab completion with Ollama.

 ### Apple Vision Pro
+
 - [Enchanted](https://github.com/AugustDev/enchanted)

 ### Database
@@ -391,9 +419,11 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [LangChain](https://python.langchain.com/docs/integrations/llms/ollama) and [LangChain.js](https://js.langchain.com/docs/integrations/chat/ollama/) with [example](https://js.langchain.com/docs/tutorials/local_rag/)
 - [Firebase Genkit](https://firebase.google.com/docs/genkit/plugins/ollama)
 - [crewAI](https://github.com/crewAIInc/crewAI)
+- [Spring AI](https://github.com/spring-projects/spring-ai) with [reference](https://docs.spring.io/spring-ai/reference/api/chat/ollama-chat.html) and [example](https://github.com/tzolov/ollama-tools)
 - [LangChainGo](https://github.com/tmc/langchaingo/) with [example](https://github.com/tmc/langchaingo/tree/main/examples/ollama-completion-example)
 - [LangChain4j](https://github.com/langchain4j/langchain4j) with [example](https://github.com/langchain4j/langchain4j-examples/tree/main/ollama-examples/src/main/java)
 - [LangChainRust](https://github.com/Abraxas-365/langchain-rust) with [example](https://github.com/Abraxas-365/langchain-rust/blob/main/examples/llm_ollama.rs)
+- [LLPhant](https://github.com/theodo-group/LLPhant?tab=readme-ov-file#ollama)
 - [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/ollama/) and [LlamaIndexTS](https://ts.llamaindex.ai/modules/llms/available_llms/ollama)
 - [LiteLLM](https://github.com/BerriAI/litellm)
 - [OllamaFarm for Go](https://github.com/presbrey/ollamafarm)
@@ -425,6 +455,8 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [High-level function abstraction in Go](https://gitlab.com/tozd/go/fun)
 - [Ollama PHP](https://github.com/ArdaGnsrn/ollama-php)
 - [Agents-Flex for Java](https://github.com/agents-flex/agents-flex) with [example](https://github.com/agents-flex/agents-flex/tree/main/agents-flex-llm/agents-flex-llm-ollama/src/test/java/com/agentsflex/llm/ollama)
+- [Parakeet](https://github.com/parakeet-nest/parakeet) is a GoLang library, made to simplify the development of small generative AI applications with Ollama.
+- [Haverscript](https://github.com/andygill/haverscript) with [examples](https://github.com/andygill/haverscript/tree/main/examples)
 - [Ollama for Swift](https://github.com/mattt/ollama-swift)
 - [Swollama for Swift](https://github.com/marcusziade/Swollama) with [DocC](https://marcusziade.github.io/Swollama/documentation/swollama/)
 - [GoLamify](https://github.com/prasad89/golamify)
@@ -466,10 +498,13 @@ See the [API documentation](./docs/api.md) for all endpoints.
 - [AI Telegram Bot](https://github.com/tusharhero/aitelegrambot) (Telegram bot using Ollama in backend)
 - [AI ST Completion](https://github.com/yaroslavyaroslav/OpenAI-sublime-text) (Sublime Text 4 AI assistant plugin with Ollama support)
 - [Discord-Ollama Chat Bot](https://github.com/kevinthedang/discord-ollama) (Generalized TypeScript Discord Bot w/ Tuning Documentation)
+- [ChatGPTBox: All in one browser extension](https://github.com/josStorer/chatGPTBox) with [Integrating Tutorial](https://github.com/josStorer/chatGPTBox/issues/616#issuecomment-1975186467)
 - [Discord AI chat/moderation bot](https://github.com/rapmd73/Companion) Chat/moderation bot written in python. Uses Ollama to create personalities.
 - [Headless Ollama](https://github.com/nischalj10/headless-ollama) (Scripts to automatically install ollama client & models on any OS for apps that depends on ollama server)
+- [Terraform AWS Ollama & Open WebUI](https://github.com/xuyangbocn/terraform-aws-self-host-llm) (A Terraform module to deploy on AWS a ready-to-use Ollama service, together with its front end Open WebUI service.)
+- [node-red-contrib-ollama](https://github.com/jakubburkiewicz/node-red-contrib-ollama)
 - [Local AI Helper](https://github.com/ivostoykov/localAI) (Chrome and Firefox extensions that enable interactions with the active tab and customisable API endpoints. Includes secure storage for user prompts.)
- [vnc-lm](https://github.com/jk011ru/vnc-lm) (A containerized Discord bot with support for attachments and web links)
+- [vnc-lm](https://github.com/jake83741/vnc-lm) (Discord bot for messaging with LLMs through Ollama and LiteLLM. Seamlessly move between local and flagship models.)
 - [LSP-AI](https://github.com/SilasMarvin/lsp-ai) (Open-source language server for AI-powered functionality)
 - [QodeAssist](https://github.com/Palm1r/QodeAssist) (AI-powered coding assistant plugin for Qt Creator)
 - [Obsidian Quiz Generator plugin](https://github.com/ECuiDev/obsidian-quiz-generator)
@@ -480,3 +515,7 @@ See the [API documentation](./docs/api.md) for all endpoints.

 - [llama.cpp](https://github.com/ggerganov/llama.cpp) project founded by Georgi Gerganov.

+### Observability
+
+- [OpenLIT](https://github.com/openlit/openlit) is an OpenTelemetry-native tool for monitoring Ollama Applications & GPUs using traces and metrics.
+- [HoneyHive](https://docs.honeyhive.ai/integrations/ollama) is an AI observability and evaluation platform for AI agents. Use HoneyHive to evaluate agent performance, interrogate failures, and monitor quality in production. 
--- a/api/client.go
+++ b/api/client.go
@@ -163,24 +163,29 @@ func (c *Client) stream(ctx context.Context, method, path string, data any, fn f
 	scanBuf := make([]byte, 0, maxBufferSize)
 	scanner.Buffer(scanBuf, maxBufferSize)
 	for scanner.Scan() {
-		var errorResponse struct {
-			Error string `json:"error,omitempty"`
-		}
-
 		bts := scanner.Bytes()
+
+		var errorResponse ErrorResponse
 		if err := json.Unmarshal(bts, &errorResponse); err != nil {
 			return fmt.Errorf("unmarshal: %w", err)
 		}

-		if errorResponse.Error != "" {
-			return errors.New(errorResponse.Error)
+		switch errorResponse.Code {
+		case ErrCodeUnknownKey:
+			return ErrUnknownOllamaKey{
+				Message: errorResponse.Message,
+				Key:     errorResponse.Data["key"].(string),
+			}
+		}
+		if errorResponse.Message != "" {
+			return errors.New(errorResponse.Message)
 		}

 		if response.StatusCode >= http.StatusBadRequest {
 			return StatusError{
 				StatusCode:   response.StatusCode,
 				Status:       response.Status,
-				ErrorMessage: errorResponse.Error,
+				ErrorMessage: errorResponse.Message,
 			}
 		}

--- a/api/client_test.go
+++ b/api/client_test.go
@@ -1,6 +1,12 @@
 package api

 import (
+	"context"
+	"errors"
+	"fmt"
+	"net/http"
+	"net/http/httptest"
+	"net/url"
 	"testing"
 )

@@ -43,3 +49,117 @@ func TestClientFromEnvironment(t *testing.T) {
 		})
 	}
 }
+
+func TestStream(t *testing.T) {
+	tests := []struct {
+		name           string
+		serverResponse []string
+		statusCode     int
+		expectedError  error
+	}{
+		{
+			name: "unknown key error",
+			serverResponse: []string{
+				`{"error":"unauthorized access","code":"unknown_key","data":{"key":"test-key"}}`,
+			},
+			statusCode: http.StatusUnauthorized,
+			expectedError: &ErrUnknownOllamaKey{
+				Message: "unauthorized access",
+				Key:     "test-key",
+			},
+		},
+		{
+			name: "general error message",
+			serverResponse: []string{
+				`{"error":"something went wrong"}`,
+			},
+			statusCode:    http.StatusInternalServerError,
+			expectedError: fmt.Errorf("something went wrong"),
+		},
+		{
+			name: "malformed json response",
+			serverResponse: []string{
+				`{invalid-json`,
+			},
+			statusCode:    http.StatusOK,
+			expectedError: fmt.Errorf("unmarshal: invalid character 'i' looking for beginning of object key string"),
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			server := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+				w.Header().Set("Content-Type", "application/x-ndjson")
+				w.WriteHeader(tt.statusCode)
+				for _, resp := range tt.serverResponse {
+					fmt.Fprintln(w, resp)
+				}
+			}))
+			defer server.Close()
+
+			baseURL, err := url.Parse(server.URL)
+			if err != nil {
+				t.Fatalf("failed to parse server URL: %v", err)
+			}
+
+			client := &Client{
+				http: server.Client(),
+				base: baseURL,
+			}
+
+			var responses [][]byte
+			err = client.stream(context.Background(), "POST", "/test", "test", func(bts []byte) error {
+				responses = append(responses, bts)
+				return nil
+			})
+
+			// Error checking
+			if tt.expectedError == nil {
+				if err != nil {
+					t.Fatalf("unexpected error: %v", err)
+				}
+				return
+			}
+
+			if err == nil {
+				t.Fatalf("expected error %v, got nil", tt.expectedError)
+			}
+
+			// Check for specific error types
+			var unknownKeyErr ErrUnknownOllamaKey
+			if errors.As(tt.expectedError, &unknownKeyErr) {
+				var gotErr ErrUnknownOllamaKey
+				if !errors.As(err, &gotErr) {
+					t.Fatalf("expected ErrUnknownOllamaKey, got %T", err)
+				}
+				if unknownKeyErr.Key != gotErr.Key {
+					t.Errorf("expected key %q, got %q", unknownKeyErr.Key, gotErr.Key)
+				}
+				if unknownKeyErr.Message != gotErr.Message {
+					t.Errorf("expected message %q, got %q", unknownKeyErr.Message, gotErr.Message)
+				}
+				return
+			}
+
+			var statusErr StatusError
+			if errors.As(tt.expectedError, &statusErr) {
+				var gotErr StatusError
+				if !errors.As(err, &gotErr) {
+					t.Fatalf("expected StatusError, got %T", err)
+				}
+				if statusErr.StatusCode != gotErr.StatusCode {
+					t.Errorf("expected status code %d, got %d", statusErr.StatusCode, gotErr.StatusCode)
+				}
+				if statusErr.ErrorMessage != gotErr.ErrorMessage {
+					t.Errorf("expected error message %q, got %q", statusErr.ErrorMessage, gotErr.ErrorMessage)
+				}
+				return
+			}
+
+			// For other errors, compare error strings
+			if err.Error() != tt.expectedError.Error() {
+				t.Errorf("expected error %q, got %q", tt.expectedError, err)
+			}
+		})
+	}
+}
--- a/api/errors.go
+++ b/api/errors.go
@@ -0,0 +1,74 @@
+package api
+
+import (
+	"fmt"
+	"slices"
+	"strings"
+)
+
+const InvalidModelNameErrMsg = "invalid model name"
+
+// API error responses
+// ErrorCode represents a standardized error code identifier
+type ErrorCode string
+
+const (
+	ErrCodeUnknownKey ErrorCode = "unknown_key"
+	ErrCodeGeneral    ErrorCode = "general" // Generic fallback error code
+)
+
+// ErrorResponse implements a structured error interface
+type ErrorResponse struct {
+	Message string         `json:"error"` // Human-readable error message, uses 'error' field name for backwards compatibility
+	Code    ErrorCode      `json:"code"`  // Machine-readable error code for programmatic handling, not response code
+	Data    map[string]any `json:"data"`  // Additional error specific data, if any
+}
+
+func (e ErrorResponse) Error() string {
+	return e.Message
+}
+
+type ErrUnknownOllamaKey struct {
+	Message string
+	Key     string
+}
+
+func (e ErrUnknownOllamaKey) Error() string {
+	return fmt.Sprintf("unauthorized: unknown ollama key %q", strings.TrimSpace(e.Key))
+}
+
+func (e *ErrUnknownOllamaKey) FormatUserMessage(localKeys []string) string {
+	// The user should only be told to add the key if it is the same one that exists locally
+	if slices.Index(localKeys, e.Key) == -1 {
+		return e.Message
+	}
+
+	return fmt.Sprintf(`%s
+
+Your ollama key is:
+%s
+Add your key at:
+https://ollama.com/settings/keys`, e.Message, e.Key)
+}
+
+// StatusError is an error with an HTTP status code and message,
+// it is parsed on the client-side and not returned from the API
+type StatusError struct {
+	StatusCode   int    // e.g. 200
+	Status       string // e.g. "200 OK"
+	ErrorMessage string `json:"error"`
+}
+
+func (e StatusError) Error() string {
+	switch {
+	case e.Status != "" && e.ErrorMessage != "":
+		return fmt.Sprintf("%s: %s", e.Status, e.ErrorMessage)
+	case e.Status != "":
+		return e.Status
+	case e.ErrorMessage != "":
+		return e.ErrorMessage
+	default:
+		// this should not happen
+		return "something went wrong, please see the ollama server logs for details"
+	}
+}
--- a/api/types.go
+++ b/api/types.go
@@ -12,27 +12,6 @@ import (
 	"time"
 )

-// StatusError is an error with an HTTP status code and message.
-type StatusError struct {
-	StatusCode   int
-	Status       string
-	ErrorMessage string `json:"error"`
-}
-
-func (e StatusError) Error() string {
-	switch {
-	case e.Status != "" && e.ErrorMessage != "":
-		return fmt.Sprintf("%s: %s", e.Status, e.ErrorMessage)
-	case e.Status != "":
-		return e.Status
-	case e.ErrorMessage != "":
-		return e.ErrorMessage
-	default:
-		// this should not happen
-		return "something went wrong, please see the ollama server logs for details"
-	}
-}
-
 // ImageData represents the raw binary data of an image file.
 type ImageData []byte

--- a/app/store/store.go
+++ b/app/store/store.go
@@ -64,7 +64,7 @@ func initStore() {
 		slog.Debug(fmt.Sprintf("unexpected error searching for store: %s", err))
 	}
 	slog.Debug("initializing new store")
-	store.ID = uuid.New().String()
+	store.ID = uuid.NewString()
 	writeStore(getStorePath())
 }

--- a/app/tray/wintray/menus.go
+++ b/app/tray/wintray/menus.go
@@ -39,7 +39,7 @@ func (t *winTray) UpdateAvailable(ver string) error {
 		if err := t.addOrUpdateMenuItem(updateAvailableMenuID, 0, updateAvailableMenuTitle, true); err != nil {
 			return fmt.Errorf("unable to create menu entries %w", err)
 		}
-		if err := t.addOrUpdateMenuItem(updateMenuID, 0, updateMenutTitle, false); err != nil {
+		if err := t.addOrUpdateMenuItem(updateMenuID, 0, updateMenuTitle, false); err != nil {
 			return fmt.Errorf("unable to create menu entries %w", err)
 		}
 		if err := t.addSeparatorMenuItem(separatorMenuID, 0); err != nil {
--- a/app/tray/wintray/messages.go
+++ b/app/tray/wintray/messages.go
@@ -10,6 +10,6 @@ const (

 	quitMenuTitle            = "Quit Ollama"
 	updateAvailableMenuTitle = "An update is available"
-	updateMenutTitle         = "Restart to update"
+	updateMenuTitle          = "Restart to update"
 	diagLogsMenuTitle        = "View logs"
 )
--- a/cmd/cmd.go
+++ b/cmd/cmd.go
@@ -19,7 +19,6 @@ import (
 	"os"
 	"os/signal"
 	"path/filepath"
-	"regexp"
 	"runtime"
 	"strconv"
 	"strings"
@@ -41,7 +40,6 @@ import (
 	"github.com/ollama/ollama/parser"
 	"github.com/ollama/ollama/progress"
 	"github.com/ollama/ollama/server"
-	"github.com/ollama/ollama/types/errtypes"
 	"github.com/ollama/ollama/types/model"
 	"github.com/ollama/ollama/version"
 )
@@ -456,6 +454,10 @@ func RunHandler(cmd *cobra.Command, args []string) error {
 	if len(prompts) > 0 {
 		interactive = false
 	}
+	// Be quiet if we're redirecting to a pipe or file
+	if !term.IsTerminal(int(os.Stdout.Fd())) {
+		interactive = false
+	}

 	nowrap, err := cmd.Flags().GetBool("nowordwrap")
 	if err != nil {
@@ -512,45 +514,22 @@ func RunHandler(cmd *cobra.Command, args []string) error {
 	return generate(cmd, opts)
 }

-func errFromUnknownKey(unknownKeyErr error) error {
-	// find SSH public key in the error message
-	sshKeyPattern := `ssh-\w+ [^\s"]+`
-	re := regexp.MustCompile(sshKeyPattern)
-	matches := re.FindStringSubmatch(unknownKeyErr.Error())
-
-	if len(matches) > 0 {
-		serverPubKey := matches[0]
-
-		localPubKey, err := auth.GetPublicKey()
-		if err != nil {
-			return unknownKeyErr
-		}
-
-		if runtime.GOOS == "linux" && serverPubKey != localPubKey {
-			// try the ollama service public key
-			svcPubKey, err := os.ReadFile("/usr/share/ollama/.ollama/id_ed25519.pub")
-			if err != nil {
-				return unknownKeyErr
-			}
-			localPubKey = strings.TrimSpace(string(svcPubKey))
-		}
-
-		// check if the returned public key matches the local public key, this prevents adding a remote key to the user's account
-		if serverPubKey != localPubKey {
-			return unknownKeyErr
-		}
-
-		var msg strings.Builder
-		msg.WriteString(unknownKeyErr.Error())
-		msg.WriteString("\n\nYour ollama key is:\n")
-		msg.WriteString(localPubKey)
-		msg.WriteString("\nAdd your key at:\n")
-		msg.WriteString("https://ollama.com/settings/keys")
-
-		return errors.New(msg.String())
+func localPubKeys() ([]string, error) {
+	usrKey, err := auth.GetPublicKey()
+	if err != nil {
+		return nil, err
 	}

-	return unknownKeyErr
+	keys := []string{usrKey}
+
+	if runtime.GOOS == "linux" {
+		// try the ollama service public key if on Linux
+		if svcKey, err := os.ReadFile("/usr/share/ollama/.ollama/id_ed25519.pub"); err == nil {
+			keys = append(keys, strings.TrimSpace(string(svcKey)))
+		}
+	}
+
+	return keys, nil
 }

 func PushHandler(cmd *cobra.Command, args []string) error {
@@ -599,25 +578,36 @@ func PushHandler(cmd *cobra.Command, args []string) error {
 	}

 	request := api.PushRequest{Name: args[0], Insecure: insecure}
+
+	n := model.ParseName(args[0])
+	isOllamaHost := strings.HasSuffix(n.Host, ".ollama.ai") || strings.HasSuffix(n.Host, ".ollama.com")
 	if err := client.Push(cmd.Context(), &request, fn); err != nil {
 		if spinner != nil {
 			spinner.Stop()
 		}
+		var ke api.ErrUnknownOllamaKey
+		if errors.As(err, &ke) && isOllamaHost {
+			// the user has not added their ollama key to ollama.com
+			// return an error with a more user-friendly message
+			locals, _ := localPubKeys()
+			return errors.New(ke.FormatUserMessage(locals))
+		}
 		if strings.Contains(err.Error(), "access denied") {
 			return errors.New("you are not authorized to push to this namespace, create the model under a namespace you own")
 		}
-		host := model.ParseName(args[0]).Host
-		isOllamaHost := strings.HasSuffix(host, ".ollama.ai") || strings.HasSuffix(host, ".ollama.com")
-		if strings.Contains(err.Error(), errtypes.UnknownOllamaKeyErrMsg) && isOllamaHost {
-			// the user has not added their ollama key to ollama.com
-			// re-throw an error with a more user-friendly message
-			return errFromUnknownKey(err)
-		}
-
-		return err
+		return fmt.Errorf("yoyoyo: %w", err)
 	}

+	p.Stop()
 	spinner.Stop()
+
+	destination := n.String()
+	if isOllamaHost {
+		destination = "https://ollama.com/" + strings.TrimSuffix(n.DisplayShortest(), ":latest")
+	}
+	fmt.Printf("\nYou can find your model at:\n\n")
+	fmt.Printf("\t%s\n", destination)
+
 	return nil
 }

--- a/cmd/cmd_test.go
+++ b/cmd/cmd_test.go
@@ -4,6 +4,7 @@ import (
 	"bytes"
 	"context"
 	"encoding/json"
+	"io"
 	"net/http"
 	"net/http/httptest"
 	"os"
@@ -369,3 +370,143 @@ func TestGetModelfileName(t *testing.T) {
 		})
 	}
 }
+
+func TestPushHandler(t *testing.T) {
+	tests := []struct {
+		modelName      string
+		serverResponse map[string]func(w http.ResponseWriter, r *http.Request)
+		expectedError  string
+		expectedOutput string
+	}{
+		{
+			modelName: "successful-push",
+			serverResponse: map[string]func(w http.ResponseWriter, r *http.Request){
+				"/api/push": func(w http.ResponseWriter, r *http.Request) {
+					if r.Method != http.MethodPost {
+						t.Errorf("expected POST request, got %s", r.Method)
+					}
+
+					var req api.PushRequest
+					if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
+						http.Error(w, err.Error(), http.StatusBadRequest)
+						return
+					}
+
+					if req.Name != "successful-push" {
+						t.Errorf("expected model name 'successful-push', got %s", req.Name)
+					}
+
+					// Simulate progress updates
+					responses := []api.ProgressResponse{
+						{Status: "preparing manifest"},
+						{Digest: "sha256:abc123456789", Total: 100, Completed: 50},
+						{Digest: "sha256:abc123456789", Total: 100, Completed: 100},
+					}
+
+					for _, resp := range responses {
+						if err := json.NewEncoder(w).Encode(resp); err != nil {
+							http.Error(w, err.Error(), http.StatusInternalServerError)
+							return
+						}
+						w.(http.Flusher).Flush()
+					}
+				},
+			},
+			expectedOutput: "\nYou can find your model at:\n\n\thttps://ollama.com/successful-push\n",
+		},
+		{
+			modelName: "unauthorized-push",
+			serverResponse: map[string]func(w http.ResponseWriter, r *http.Request){
+				"/api/push": func(w http.ResponseWriter, r *http.Request) {
+					w.Header().Set("Content-Type", "application/json")
+					w.WriteHeader(http.StatusUnauthorized)
+					err := json.NewEncoder(w).Encode(map[string]string{
+						"error": "access denied",
+					})
+					if err != nil {
+						t.Fatal(err)
+					}
+				},
+			},
+			expectedError: "you are not authorized to push to this namespace, create the model under a namespace you own",
+		},
+		{
+			modelName: "unknown-key-err",
+			serverResponse: map[string]func(w http.ResponseWriter, r *http.Request){
+				"/api/push": func(w http.ResponseWriter, r *http.Request) {
+					w.Header().Set("Content-Type", "application/json")
+					w.WriteHeader(http.StatusUnauthorized)
+					uerr := api.ErrUnknownOllamaKey{
+						Key: "aaa",
+					}
+					err := json.NewEncoder(w).Encode(map[string]string{
+						"error": uerr.Error(),
+					})
+					if err != nil {
+						t.Fatal(err)
+					}
+				},
+			},
+			expectedError: "unauthorized: unknown ollama key \"aaa\"",
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.modelName, func(t *testing.T) {
+			mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
+				if handler, ok := tt.serverResponse[r.URL.Path]; ok {
+					handler(w, r)
+					return
+				}
+				http.Error(w, "not found", http.StatusNotFound)
+			}))
+			defer mockServer.Close()
+
+			t.Setenv("OLLAMA_HOST", mockServer.URL)
+
+			cmd := &cobra.Command{}
+			cmd.Flags().Bool("insecure", false, "")
+			cmd.SetContext(context.TODO())
+
+			// Redirect stderr to capture progress output
+			oldStderr := os.Stderr
+			r, w, _ := os.Pipe()
+			os.Stderr = w
+
+			// Capture stdout for the "Model pushed" message
+			oldStdout := os.Stdout
+			outR, outW, _ := os.Pipe()
+			os.Stdout = outW
+
+			err := PushHandler(cmd, []string{tt.modelName})
+
+			// Restore stderr
+			w.Close()
+			os.Stderr = oldStderr
+			// drain the pipe
+			if _, err := io.ReadAll(r); err != nil {
+				t.Fatal(err)
+			}
+
+			// Restore stdout and get output
+			outW.Close()
+			os.Stdout = oldStdout
+			stdout, _ := io.ReadAll(outR)
+
+			if tt.expectedError == "" {
+				if err != nil {
+					t.Errorf("expected no error, got %v", err)
+				}
+				if tt.expectedOutput != "" {
+					if got := string(stdout); got != tt.expectedOutput {
+						t.Errorf("expected output %q, got %q", tt.expectedOutput, got)
+					}
+				}
+			} else {
+				if err == nil || !strings.Contains(err.Error(), tt.expectedError) {
+					t.Errorf("expected error containing %q, got %v", tt.expectedError, err)
+				}
+			}
+		})
+	}
+}
--- a/cmd/interactive.go
+++ b/cmd/interactive.go
@@ -19,7 +19,6 @@ import (
 	"github.com/ollama/ollama/envconfig"
 	"github.com/ollama/ollama/parser"
 	"github.com/ollama/ollama/readline"
-	"github.com/ollama/ollama/types/errtypes"
 )

 type MultilineState int
@@ -220,7 +219,7 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
 			fn := func(resp api.ProgressResponse) error { return nil }
 			err = client.Create(cmd.Context(), req, fn)
 			if err != nil {
-				if strings.Contains(err.Error(), errtypes.InvalidModelNameErrMsg) {
+				if strings.Contains(err.Error(), api.InvalidModelNameErrMsg) {
 					fmt.Printf("error: The model name '%s' is invalid\n", args[1])
 					continue
 				}
@@ -319,8 +318,6 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
 						opts.Messages = append(opts.Messages, newMessage)
 					}
 					fmt.Println("Set system message.")
-					sb.Reset()
-
 					sb.Reset()
 					continue
 				default:
@@ -516,7 +513,7 @@ func extractFileNames(input string) []string {
 	// Regex to match file paths starting with optional drive letter, / ./ \ or .\ and include escaped or unescaped spaces (\ or %20)
 	// and followed by more characters and a file extension
 	// This will capture non filename strings, but we'll check for file existence to remove mismatches
-	regexPattern := `(?:[a-zA-Z]:)?(?:\./|/|\\)[\S\\ ]+?\.(?i:jpg|jpeg|png|svg)\b`
+	regexPattern := `(?:[a-zA-Z]:)?(?:\./|/|\\)[\S\\ ]+?\.(?i:jpg|jpeg|png)\b`
 	re := regexp.MustCompile(regexPattern)

 	return re.FindAllString(input, -1)
--- a/cmd/interactive_test.go
+++ b/cmd/interactive_test.go
@@ -12,44 +12,45 @@ import (
 func TestExtractFilenames(t *testing.T) {
 	// Unix style paths
 	input := ` some preamble 
- ./relative\ path/one.png inbetween1 ./not a valid two.jpg inbetween2
-/unescaped space /three.jpeg inbetween3 /valid\ path/dir/four.png "./quoted with spaces/five.svg`
+ ./relative\ path/one.png inbetween1 ./not a valid two.jpg inbetween2 ./1.svg
+/unescaped space /three.jpeg inbetween3 /valid\ path/dir/four.png "./quoted with spaces/five.JPG`
 	res := extractFileNames(input)
 	assert.Len(t, res, 5)
 	assert.Contains(t, res[0], "one.png")
 	assert.Contains(t, res[1], "two.jpg")
 	assert.Contains(t, res[2], "three.jpeg")
 	assert.Contains(t, res[3], "four.png")
-	assert.Contains(t, res[4], "five.svg")
+	assert.Contains(t, res[4], "five.JPG")
 	assert.NotContains(t, res[4], '"')
-	assert.NotContains(t, res, "inbtween")
+	assert.NotContains(t, res, "inbetween1")
+	assert.NotContains(t, res, "./1.svg")

 	// Windows style paths
 	input = ` some preamble
 c:/users/jdoe/one.png inbetween1 c:/program files/someplace/two.jpg inbetween2 
 /absolute/nospace/three.jpeg inbetween3 /absolute/with space/four.png inbetween4
-./relative\ path/five.svg inbetween5 "./relative with/spaces/six.png inbetween6
-d:\path with\spaces\seven.svg inbetween7 c:\users\jdoe\eight.png inbetween8 
- d:\program files\someplace\nine.png inbetween9 "E:\program files\someplace\ten.svg some ending
+./relative\ path/five.JPG inbetween5 "./relative with/spaces/six.png inbetween6
+d:\path with\spaces\seven.JPEG inbetween7 c:\users\jdoe\eight.png inbetween8 
+ d:\program files\someplace\nine.png inbetween9 "E:\program files\someplace\ten.PNG some ending
 `
 	res = extractFileNames(input)
 	assert.Len(t, res, 10)
-	assert.NotContains(t, res, "inbtween")
+	assert.NotContains(t, res, "inbetween2")
 	assert.Contains(t, res[0], "one.png")
 	assert.Contains(t, res[0], "c:")
 	assert.Contains(t, res[1], "two.jpg")
 	assert.Contains(t, res[1], "c:")
 	assert.Contains(t, res[2], "three.jpeg")
 	assert.Contains(t, res[3], "four.png")
-	assert.Contains(t, res[4], "five.svg")
+	assert.Contains(t, res[4], "five.JPG")
 	assert.Contains(t, res[5], "six.png")
-	assert.Contains(t, res[6], "seven.svg")
+	assert.Contains(t, res[6], "seven.JPEG")
 	assert.Contains(t, res[6], "d:")
 	assert.Contains(t, res[7], "eight.png")
 	assert.Contains(t, res[7], "c:")
 	assert.Contains(t, res[8], "nine.png")
 	assert.Contains(t, res[8], "d:")
-	assert.Contains(t, res[9], "ten.svg")
+	assert.Contains(t, res[9], "ten.PNG")
 	assert.Contains(t, res[9], "E:")
 }

--- a/docs/troubleshooting.md
+++ b/docs/troubleshooting.md
@@ -104,6 +104,12 @@ If you are experiencing problems getting Ollama to correctly discover or use you
 - `OLLAMA_DEBUG=1` During GPU discovery additional information will be reported
 - Check dmesg for any errors from amdgpu or kfd drivers `sudo dmesg | grep -i amdgpu` and `sudo dmesg | grep -i kfd`

+## Multiple AMD GPUs
+
+If you experience gibberish responses when models load across multiple AMD GPUs on Linux, see the following guide.
+
+- https://rocm.docs.amd.com/projects/radeon/en/latest/docs/install/native_linux/mgpu.html#mgpu-known-issues-and-limitations
+
 ## Windows Terminal Errors

 Older versions of Windows 10 (e.g., 21H1) are known to have a bug where the standard terminal program does not display control characters correctly.  This can result in a long string of strings like `←[?25h←[?25l` being displayed, sometimes erroring with `The parameter is incorrect`  To resolve this problem, please update to Win 10 22H1 or newer.
--- a/docs/tutorials/fly-gpu.md
+++ b/docs/tutorials/fly-gpu.md
@@ -1,83 +0,0 @@
-# Running Ollama on Fly.io GPU Instances
-
-Ollama runs with little to no configuration on [Fly.io GPU instances](https://fly.io/docs/gpus/gpu-quickstart/). If you don't have access to GPUs yet, you'll need to [apply for access](https://fly.io/gpu/) on the waitlist. Once you're accepted, you'll get an email with instructions on how to get started.
-
-Create a new app with `fly apps create`:
-
-```bash
-fly apps create
-```
-
-Then create a `fly.toml` file in a new folder that looks like this:
-
-```toml
-app = "sparkling-violet-709"
-primary_region = "ord"
-vm.size = "a100-40gb" # see https://fly.io/docs/gpus/gpu-quickstart/ for more info
-
-[build]
-  image = "ollama/ollama"
-
-[http_service]
-  internal_port = 11434
-  force_https = false
-  auto_stop_machines = true
-  auto_start_machines = true
-  min_machines_running = 0
-  processes = ["app"]
-
-[mounts]
-  source = "models"
-  destination = "/root/.ollama"
-  initial_size = "100gb"
-```
-
-Then create a [new private IPv6 address](https://fly.io/docs/reference/private-networking/#flycast-private-load-balancing) for your app:
-
-```bash
-fly ips allocate-v6 --private
-```
-
-Then deploy your app:
-
-```bash
-fly deploy
-```
-
-And finally you can access it interactively with a new Fly.io Machine:
-
-```
-fly machine run -e OLLAMA_HOST=http://your-app-name.flycast --shell ollama/ollama
-```
-
-```bash
-$ ollama run openchat:7b-v3.5-fp16
->>> How do I bake chocolate chip cookies?
- To bake chocolate chip cookies, follow these steps:
-
-1. Preheat the oven to 375°F (190°C) and line a baking sheet with parchment paper or silicone baking mat.
-
-2. In a large bowl, mix together 1 cup of unsalted butter (softened), 3/4 cup granulated sugar, and 3/4
-cup packed brown sugar until light and fluffy.
-
-3. Add 2 large eggs, one at a time, to the butter mixture, beating well after each addition. Stir in 1
-teaspoon of pure vanilla extract.
-
-4. In a separate bowl, whisk together 2 cups all-purpose flour, 1/2 teaspoon baking soda, and 1/2 teaspoon
-salt. Gradually add the dry ingredients to the wet ingredients, stirring until just combined.
-
-5. Fold in 2 cups of chocolate chips (or chunks) into the dough.
-
-6. Drop rounded tablespoons of dough onto the prepared baking sheet, spacing them about 2 inches apart.
-
-7. Bake for 10-12 minutes, or until the edges are golden brown. The centers should still be slightly soft.
-
-8. Allow the cookies to cool on the baking sheet for a few minutes before transferring them to a wire rack
-to cool completely.
-
-Enjoy your homemade chocolate chip cookies!
-```
-
-When you set it up like this, it will automatically turn off when you're done using it. Then when you access it again, it will automatically turn back on. This is a great way to save money on GPU instances when you're not using them. If you want a persistent wake-on-use connection to your Ollama instance, you can set up a [connection to your Fly network using WireGuard](https://fly.io/docs/reference/private-networking/#discovering-apps-through-dns-on-a-wireguard-connection). Then you can access your Ollama instance at `http://your-app-name.flycast`.
-
-And that's it!
--- a/docs/tutorials/langchainjs.md
+++ b/docs/tutorials/langchainjs.md
@@ -1,77 +0,0 @@
-# Using LangChain with Ollama using JavaScript
-
-In this tutorial, we are going to use JavaScript with LangChain and Ollama to learn about something just a touch more recent. In August 2023, there was a series of wildfires on Maui. There is no way an LLM trained before that time can know about this, since their training data would not include anything as recent as that. So we can find the [Wikipedia article about the fires](https://en.wikipedia.org/wiki/2023_Hawaii_wildfires) and ask questions about the contents.
-
-To get started, let's just use **LangChain** to ask a simple question to a model. To do this with JavaScript, we need to install **LangChain**:
-
-```bash
-npm install @langchain/community
-```
-
-Now we can start building out our JavaScript:
-
-```javascript
-import { Ollama } from "@langchain/community/llms/ollama";
-
-const ollama = new Ollama({
-  baseUrl: "http://localhost:11434",
-  model: "llama3.2",
-});
-
-const answer = await ollama.invoke(`why is the sky blue?`);
-
-console.log(answer);
-```
-
-That will get us the same thing as if we ran `ollama run llama3.2 "why is the sky blue"` in the terminal. But we want to load a document from the web to ask a question against. **Cheerio** is a great library for ingesting a webpage, and **LangChain** uses it in their **CheerioWebBaseLoader**. So let's install **Cheerio** and build that part of the app.
-
-```bash
-npm install cheerio
-```
-
-```javascript
-import { CheerioWebBaseLoader } from "langchain/document_loaders/web/cheerio";
-
-const loader = new CheerioWebBaseLoader("https://en.wikipedia.org/wiki/2023_Hawaii_wildfires");
-const data = await loader.load();
-```
-
-That will load the document. Although this page is smaller than the Odyssey, it is certainly bigger than the context size for most LLMs. So we are going to need to split into smaller pieces, and then select just the pieces relevant to our question. This is a great use for a vector datastore. In this example, we will use the **MemoryVectorStore** that is part of **LangChain**. But there is one more thing we need to get the content into the datastore. We have to run an embeddings process that converts the tokens in the text into a series of vectors. And for that, we are going to use **Tensorflow**. There is a lot of stuff going on in this one. First, install the **Tensorflow** components that we need.
-
-```javascript
-npm install @tensorflow/tfjs-core@3.6.0 @tensorflow/tfjs-converter@3.6.0 @tensorflow-models/universal-sentence-encoder@1.3.3 @tensorflow/tfjs-node@4.10.0
-```
-
-If you just install those components without the version numbers, it will install the latest versions, but there are conflicts within **Tensorflow**, so you need to install the compatible versions.
-
-```javascript
-import { RecursiveCharacterTextSplitter } from "langchain/text_splitter"
-import { MemoryVectorStore } from "langchain/vectorstores/memory";
-import "@tensorflow/tfjs-node";
-import { TensorFlowEmbeddings } from "langchain/embeddings/tensorflow";
-
-// Split the text into 500 character chunks. And overlap each chunk by 20 characters
-const textSplitter = new RecursiveCharacterTextSplitter({
- chunkSize: 500,
- chunkOverlap: 20
-});
-const splitDocs = await textSplitter.splitDocuments(data);
-
-// Then use the TensorFlow Embedding to store these chunks in the datastore
-const vectorStore = await MemoryVectorStore.fromDocuments(splitDocs, new TensorFlowEmbeddings());
-```
-
-To connect the datastore to a question asked to a LLM, we need to use the concept at the heart of **LangChain**: the chain. Chains are a way to connect a number of activities together to accomplish a particular tasks. There are a number of chain types available, but for this tutorial we are using the **RetrievalQAChain**.
-
-```javascript
-import { RetrievalQAChain } from "langchain/chains";
-
-const retriever = vectorStore.asRetriever();
-const chain = RetrievalQAChain.fromLLM(ollama, retriever);
-const result = await chain.call({query: "When was Hawaii's request for a major disaster declaration approved?"});
-console.log(result.text)
-```
-
-So we created a retriever, which is a way to return the chunks that match a query from a datastore. And then connect the retriever and the model via a chain. Finally, we send a query to the chain, which results in an answer using our document as a source. The answer it returned was correct, August 10, 2023.
-
-And that is a simple introduction to what you can do with **LangChain** and **Ollama.**
--- a/docs/tutorials/langchainpy.md
+++ b/docs/tutorials/langchainpy.md
@@ -1,85 +0,0 @@
-# Using LangChain with Ollama in Python
-
-Let's imagine we are studying the classics, such as **the Odyssey** by **Homer**. We might have a question about Neleus and his family. If you ask llama2 for that info, you may get something like:
-
-> I apologize, but I'm a large language model, I cannot provide information on individuals or families that do not exist in reality. Neleus is not a real person or character, and therefore does not have a family or any other personal details. My apologies for any confusion. Is there anything else I can help you with?
-
-This sounds like a typical censored response, but even llama2-uncensored gives a mediocre answer:
-
-> Neleus was a legendary king of Pylos and the father of Nestor, one of the Argonauts. His mother was Clymene, a sea nymph, while his father was Neptune, the god of the sea.
-
-So let's figure out how we can use **LangChain** with Ollama to ask our question to the actual document, the Odyssey by Homer, using Python.
-
-Let's start by asking a simple question that we can get an answer to from the **Llama3** model using **Ollama**. First, we need to install the **LangChain** package:
-
-`pip install langchain_community`
-
-Then we can create a model and ask the question:
-
-```python
-from langchain_community.llms import Ollama
-ollama = Ollama(
-    base_url='http://localhost:11434',
-    model="llama3"
-)
-print(ollama.invoke("why is the sky blue"))
-```
-
-Notice that we are defining the model and the base URL for Ollama.
-
-Now let's load a document to ask questions against. I'll load up the Odyssey by Homer, which you can find at Project Gutenberg. We will need **WebBaseLoader** which is part of **LangChain** and loads text from any webpage. On my machine, I also needed to install **bs4** to get that to work, so run `pip install bs4`.
-
-```python
-from langchain.document_loaders import WebBaseLoader
-loader = WebBaseLoader("https://www.gutenberg.org/files/1727/1727-h/1727-h.htm")
-data = loader.load()
-```
-
-This file is pretty big. Just the preface is 3000 tokens. Which means the full document won't fit into the context for the model. So we need to split it up into smaller pieces.
-
-```python
-from langchain.text_splitter import RecursiveCharacterTextSplitter
-
-text_splitter=RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
-all_splits = text_splitter.split_documents(data)
-```
-
-It's split up, but we have to find the relevant splits and then submit those to the model. We can do this by creating embeddings and storing them in a vector database. We can use Ollama directly to instantiate an embedding model. We will use ChromaDB in this example for a vector database. `pip install chromadb`
-We also need to pull embedding model: `ollama pull nomic-embed-text`
-```python
-from langchain.embeddings import OllamaEmbeddings
-from langchain.vectorstores import Chroma
-oembed = OllamaEmbeddings(base_url="http://localhost:11434", model="nomic-embed-text")
-vectorstore = Chroma.from_documents(documents=all_splits, embedding=oembed)
-```
-
-Now let's ask a question from the document. **Who was Neleus, and who is in his family?** Neleus is a character in the Odyssey, and the answer can be found in our text.
-
-```python
-question="Who is Neleus and who is in Neleus' family?"
-docs = vectorstore.similarity_search(question)
-len(docs)
-```
-
-This will output the number of matches for chunks of data similar to the search.
-
-The next thing is to send the question and the relevant parts of the docs to the model to see if we can get a good answer. But we are stitching two parts of the process together, and that is called a chain. This means we need to define a chain:
-
-```python
-from langchain.chains import RetrievalQA
-qachain=RetrievalQA.from_chain_type(ollama, retriever=vectorstore.as_retriever())
-res = qachain.invoke({"query": question})
-print(res['result'])
-```
-
-The answer received from this chain was:
-
-> Neleus is a character in Homer's "Odyssey" and is mentioned in the context of Penelope's suitors. Neleus is the father of Chloris, who is married to Neleus and bears him several children, including Nestor, Chromius, Periclymenus, and Pero. Amphinomus, the son of Nisus, is also mentioned as a suitor of Penelope and is known for his good natural disposition and agreeable conversation.
-
-It's not a perfect answer, as it implies Neleus married his daughter when actually Chloris "was the youngest daughter to Amphion son of Iasus and king of Minyan Orchomenus, and was Queen in Pylos".
-
-I updated the chunk_overlap for the text splitter to 20 and tried again and got a much better answer:
-
-> Neleus is a character in Homer's epic poem "The Odyssey." He is the husband of Chloris, who is the youngest daughter of Amphion son of Iasus and king of Minyan Orchomenus. Neleus has several children with Chloris, including Nestor, Chromius, Periclymenus, and Pero.
-
-And that is a much better answer.
--- a/docs/tutorials/nvidia-jetson.md
+++ b/docs/tutorials/nvidia-jetson.md
@@ -1,15 +0,0 @@
-# Running Ollama on NVIDIA Jetson Devices
-
-Ollama runs well on [NVIDIA Jetson Devices](https://www.nvidia.com/en-us/autonomous-machines/embedded-systems/) and should run out of the box with the standard installation instructions. 
-
-The following has been tested on [JetPack 5.1.2](https://developer.nvidia.com/embedded/jetpack), but should also work on JetPack 6.0.
-
- Install Ollama via standard Linux command (ignore the 404 error): `curl https://ollama.com/install.sh | sh`
- Pull the model you want to use (e.g. mistral): `ollama pull mistral`
- Start an interactive session: `ollama run mistral`
-
-And that's it!
-
-# Running Ollama in Docker
-
-When running GPU accelerated applications in Docker, it is highly recommended to use [dusty-nv jetson-containers repo](https://github.com/dusty-nv/jetson-containers).
--- a/examples/langchain-python-simple/main.py
+++ b/examples/langchain-python-simple/main.py
@@ -1,6 +1,6 @@
 from langchain.llms import Ollama

-input = input("What is your question?")
+input = input("What is your question?\n> ")
 llm = Ollama(model="llama3.2")
-res = llm.predict(input)
+res = llm.invoke(input)
 print (res)
--- a/go.mod
+++ b/go.mod
@@ -7,7 +7,7 @@ require (
 	github.com/emirpasic/gods v1.18.1
 	github.com/gin-gonic/gin v1.10.0
 	github.com/golang/protobuf v1.5.4 // indirect
-	github.com/google/uuid v1.1.2
+	github.com/google/uuid v1.6.0
 	github.com/olekukonko/tablewriter v0.0.5
 	github.com/spf13/cobra v1.7.0
 	github.com/stretchr/testify v1.9.0
@@ -29,7 +29,7 @@ require (
 	github.com/apache/arrow/go/arrow v0.0.0-20211112161151-bc219186db40 // indirect
 	github.com/bytedance/sonic/loader v0.1.1 // indirect
 	github.com/chewxy/hm v1.0.0 // indirect
-	github.com/chewxy/math32 v1.10.1 // indirect
+	github.com/chewxy/math32 v1.11.0 // indirect
 	github.com/cloudwego/base64x v0.1.4 // indirect
 	github.com/cloudwego/iasm v0.2.0 // indirect
 	github.com/davecgh/go-spew v1.1.1 // indirect
--- a/go.sum
+++ b/go.sum
@@ -21,8 +21,8 @@ github.com/census-instrumentation/opencensus-proto v0.2.1/go.mod h1:f6KPmirojxKA
 github.com/chewxy/hm v1.0.0 h1:zy/TSv3LV2nD3dwUEQL2VhXeoXbb9QkpmdRAVUFiA6k=
 github.com/chewxy/hm v1.0.0/go.mod h1:qg9YI4q6Fkj/whwHR1D+bOGeF7SniIP40VweVepLjg0=
 github.com/chewxy/math32 v1.0.0/go.mod h1:Miac6hA1ohdDUTagnvJy/q+aNnEk16qWUdb8ZVhvCN0=
-github.com/chewxy/math32 v1.10.1 h1:LFpeY0SLJXeaiej/eIp2L40VYfscTvKh/FSEZ68uMkU=
-github.com/chewxy/math32 v1.10.1/go.mod h1:dOB2rcuFrCn6UHrze36WSLVPKtzPMRAQvBvUwkSsLqs=
+github.com/chewxy/math32 v1.11.0 h1:8sek2JWqeaKkVnHa7bPVqCEOUPbARo4SGxs6toKyAOo=
+github.com/chewxy/math32 v1.11.0/go.mod h1:dOB2rcuFrCn6UHrze36WSLVPKtzPMRAQvBvUwkSsLqs=
 github.com/client9/misspell v0.3.4/go.mod h1:qj6jICC3Q7zFZvVWo7KLAzC3yx5G7kyvSDkc90ppPyw=
 github.com/cloudwego/base64x v0.1.4 h1:jwCgWpFanWmN8xoIUHa2rtzmkd5J2plF/dnLS6Xd/0Y=
 github.com/cloudwego/base64x v0.1.4/go.mod h1:0zlkT4Wn5C6NdauXdJRhSKRlJvmclQ1hhJgA0rcu/8w=
@@ -113,8 +113,9 @@ github.com/google/go-cmp v0.5.6/go.mod h1:v8dTdLbMG2kIc/vJvl+f65V22dbkXbowE6jgT/
 github.com/google/go-cmp v0.6.0 h1:ofyhxvXcZhMsU5ulbFiLKl/XBFqE1GSq7atu8tAmTRI=
 github.com/google/go-cmp v0.6.0/go.mod h1:17dUlkBOakJ0+DkrSSNjCkIjxS6bF9zb3elmeNGIjoY=
 github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
-github.com/google/uuid v1.1.2 h1:EVhdT+1Kseyi1/pUmXKaFxYsDNy9RQYkMWRH68J/W7Y=
 github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
+github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
+github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
 github.com/grpc-ecosystem/grpc-gateway v1.16.0/go.mod h1:BDjrQk3hbvj6Nolgz8mAMFbcEtjT1g+wF4CSlocrBnw=
 github.com/inconshreveable/mousetrap v1.1.0 h1:wN+x4NVGpMsO7ErUn/mUI3vEoE6Jt13X2s0bqwp9tc8=
 github.com/inconshreveable/mousetrap v1.1.0/go.mod h1:vpF70FUmC8bwa3OWnCshd2FqLfsEA9PFc4w1p2J65bw=
@@ -230,8 +231,6 @@ golang.org/x/image v0.0.0-20200430140353-33d19683fad8/go.mod h1:FeLwcggjj3mMvU+o
 golang.org/x/image v0.0.0-20200618115811-c13761719519/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
 golang.org/x/image v0.0.0-20201208152932-35266b937fa6/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
 golang.org/x/image v0.0.0-20210216034530-4410531fe030/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
-golang.org/x/image v0.14.0 h1:tNgSxAFe3jC4uYqvZdTr84SZoM1KfwdC9SKIFrLjFn4=
-golang.org/x/image v0.14.0/go.mod h1:HUYqC05R2ZcZ3ejNQsIHQDQiwWM4JBqmm6MKANTp4LE=
 golang.org/x/image v0.22.0 h1:UtK5yLUzilVrkjMAZAZ34DXGpASN8i8pj8g+O+yd10g=
 golang.org/x/image v0.22.0/go.mod h1:9hPFhljd4zZ1GNSIZJ49sqbp45GKK9t6w+iXvGqZUz4=
 golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE=
@@ -267,8 +266,6 @@ golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJ
 golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
 golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
 golang.org/x/sync v0.0.0-20210220032951-036812b2e83c/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
-golang.org/x/sync v0.3.0 h1:ftCYgMx6zT/asHUrPw8BLLscYtGznsLAnjq5RH9P66E=
-golang.org/x/sync v0.3.0/go.mod h1:FU7BRWz2tNW+3quACPkgCx/L+uEAv1htQ0V83Z9Rj+Y=
 golang.org/x/sync v0.9.0 h1:fEo0HyrW1GIgZdpbhCRO0PkJajUS5H9IFUztCgEo2jQ=
 golang.org/x/sync v0.9.0/go.mod h1:Czt+wKu1gCyEFDUtn0jG5QVvpJ6rzVqr5aXyt9drQfk=
 golang.org/x/sys v0.0.0-20180830151530-49385e6e1522/go.mod h1:STP8DvDyc/dI5b8T5hshtkjS+E42TnysNCUPdjciGhY=
@@ -295,8 +292,6 @@ golang.org/x/text v0.3.0/go.mod h1:NqM8EUOU14njkJ3fqMW+pc6Ldnwhi/IjpwHt7yyuwOQ=
 golang.org/x/text v0.3.3/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
 golang.org/x/text v0.3.5/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
 golang.org/x/text v0.3.6/go.mod h1:5Zoc/QRtKVWzQhOtBMvqHzDpF6irO9z98xDceosuGiQ=
-golang.org/x/text v0.15.0 h1:h1V/4gjBv8v9cjcR6+AR5+/cIYK5N/WAgiv4xlsEtAk=
-golang.org/x/text v0.15.0/go.mod h1:18ZOQIKpY8NJVqYksKHtTdi31H5itFRjB5/qKTNYzSU=
 golang.org/x/text v0.20.0 h1:gK/Kv2otX8gz+wn7Rmb3vT96ZwuoxnQlY+HlJVj7Qug=
 golang.org/x/text v0.20.0/go.mod h1:D4IsuqiFMhST5bX19pQ9ikHC2GsaKyk/oF+pn3ducp4=
 golang.org/x/tools v0.0.0-20180525024113-a5b4c53f6e8b/go.mod h1:n7NCudcB/nEzxVGmLbDWY5pfWTLqBcC2KZ6jyYvM4mQ=
--- a/integration/max_queue_test.go
+++ b/integration/max_queue_test.go
@@ -16,7 +16,6 @@ import (
 	"github.com/stretchr/testify/require"

 	"github.com/ollama/ollama/api"
-	"github.com/ollama/ollama/envconfig"
 )

 func TestMaxQueue(t *testing.T) {
@@ -27,12 +26,8 @@ func TestMaxQueue(t *testing.T) {

 	// Note: This test can be quite slow when running in CPU mode, so keep the threadCount low unless your on GPU
 	// Also note that by default Darwin can't sustain > ~128 connections without adjusting limits
-	threadCount := 32
-	if maxQueue := envconfig.MaxQueue(); maxQueue != 0 {
-		threadCount = int(maxQueue)
-	} else {
-		t.Setenv("OLLAMA_MAX_QUEUE", strconv.Itoa(threadCount))
-	}
+	threadCount := 16
+	t.Setenv("OLLAMA_MAX_QUEUE", strconv.Itoa(threadCount))

 	req := api.GenerateRequest{
 		Model:  "orca-mini",
--- a/llama/README.md
+++ b/llama/README.md
@@ -55,7 +55,7 @@ go build -tags avx,cuda .

 ### ROCm

-Install the [CUDA toolkit v11.3.1](https://developer.nvidia.com/cuda-11-3-1-download-archive):
+Install [ROCm](https://rocm.docs.amd.com/en/latest/).

 ```shell
 make ggml_hipblas.so
@@ -77,7 +77,7 @@ go build -tags avx,cuda .

 ### ROCm

-Install [ROCm 5.7.1](https://rocm.docs.amd.com/en/docs-5.7.1/).
+Install [ROCm](https://rocm.docs.amd.com/en/latest/).

 ```shell
 make ggml_hipblas.dll
--- a/llama/runner/cache.go
+++ b/llama/runner/cache.go
@@ -199,6 +199,20 @@ func countCommonPrefix(a []input, b []input) int {
 	return count
 }

+func (c *InputCache) ShiftDiscard(inputLen int, numKeep int) int {
+	targetFree := (c.numCtx - numKeep) / 2
+	targetFree = max(targetFree, 1)
+
+	currentFree := c.numCtx - inputLen
+	discard := targetFree - currentFree
+
+	if discard < 0 {
+		discard = 0
+	}
+
+	return discard
+}
+
 // Frees up space in the KV cache by deleting the oldest half of history and shifting
 // the newest half into that space (saving numKeep inputs at the beginning).
 //
@@ -208,11 +222,7 @@ func (c *InputCache) ShiftCacheSlot(slot *InputCacheSlot, numKeep int) error {
 		return fmt.Errorf("unable to shift context - keep exceeds context (keep: %v context: %v)", numKeep, c.numCtx)
 	}

-	targetFree := (c.numCtx - numKeep) / 2
-	targetFree = max(targetFree, 1)
-
-	currentFree := c.numCtx - len(slot.Inputs)
-	discard := targetFree - currentFree
+	discard := c.ShiftDiscard(len(slot.Inputs), numKeep)

 	if discard <= 0 {
 		return nil
--- a/llama/runner/cache_test.go
+++ b/llama/runner/cache_test.go
@@ -227,3 +227,66 @@ func TestFindCacheSlot(t *testing.T) {
 		})
 	}
 }
+
+func TestShiftDiscard(t *testing.T) {
+	tests := []struct {
+		name     string
+		numCtx   int
+		numKeep  int
+		inputLen int
+		expected int
+	}{
+		{
+			name:     "Shift",
+			numCtx:   2048,
+			numKeep:  5,
+			inputLen: 2048,
+			expected: 1021,
+		},
+		{
+			name:     "Max Keep",
+			numCtx:   2048,
+			numKeep:  2047,
+			inputLen: 2048,
+			expected: 1,
+		},
+		{
+			name:     "No Keep",
+			numCtx:   2048,
+			numKeep:  0,
+			inputLen: 2048,
+			expected: 1024,
+		},
+		{
+			name:     "Truncate",
+			numCtx:   2048,
+			numKeep:  5,
+			inputLen: 5000,
+			expected: 3973,
+		},
+		{
+			name:     "Truncate Keep",
+			numCtx:   2048,
+			numKeep:  2047,
+			inputLen: 5000,
+			expected: 2953,
+		},
+		{
+			name:     "No Op",
+			numCtx:   2048,
+			numKeep:  5,
+			inputLen: 512,
+			expected: 0,
+		},
+	}
+
+	for _, tt := range tests {
+		t.Run(tt.name, func(t *testing.T) {
+			c := InputCache{numCtx: tt.numCtx}
+			result := c.ShiftDiscard(tt.inputLen, tt.numKeep)
+			if result != tt.expected {
+				t.Errorf("shiftDiscard(ctx: %v, keep: %v input: %v): have %v; want %v", tt.numCtx, tt.numKeep, tt.inputLen, result, tt.expected)
+			}
+		})
+	}
+}
--- a/llama/runner/runner.go
+++ b/llama/runner/runner.go
@@ -122,9 +122,11 @@ func (s *Server) NewSequence(prompt string, images []ImageData, params NewSequen
 	params.numKeep = min(params.numKeep, s.cache.numCtx-1)

 	if len(inputs) > s.cache.numCtx {
-		slog.Warn("truncating input prompt", "limit", s.cache.numCtx, "prompt", len(inputs), "numKeep", params.numKeep)
+		discard := len(inputs) - s.cache.numCtx
 		newInputs := inputs[:params.numKeep]
-		newInputs = append(newInputs, inputs[len(inputs)-s.cache.numCtx+params.numKeep:]...)
+		newInputs = append(newInputs, inputs[params.numKeep+discard:]...)
+
+		slog.Warn("truncating input prompt", "limit", s.cache.numCtx, "prompt", len(inputs), "keep", params.numKeep, "new", len(newInputs))
 		inputs = newInputs
 	}

@@ -162,10 +164,16 @@ func (s *Server) NewSequence(prompt string, images []ImageData, params NewSequen
 // generating image embeddings for each image
 func (s *Server) inputs(prompt string, images []ImageData) ([]input, error) {
 	var inputs []input
+	var parts []string
+	var matches [][]string

-	re := regexp.MustCompile(`\[img-(\d+)\]`)
-	parts := re.Split(prompt, -1)
-	matches := re.FindAllStringSubmatch(prompt, -1)
+	if s.image != nil {
+		re := regexp.MustCompile(`\[img-(\d+)\]`)
+		parts = re.Split(prompt, -1)
+		matches = re.FindAllStringSubmatch(prompt, -1)
+	} else {
+		parts = []string{prompt}
+	}

 	for i, part := range parts {
 		// text - tokenize
@@ -300,6 +308,7 @@ func (s *Server) removeSequence(seqIndex int, reason string) {
 	close(seq.embedding)
 	seq.cache.InUse = false
 	s.seqs[seqIndex] = nil
+	s.seqsSem.Release(1)
 }

 func (s *Server) run(ctx context.Context) {
@@ -649,14 +658,18 @@ func (s *Server) completion(w http.ResponseWriter, r *http.Request) {
 		return
 	}

-	// Ensure that a place to put the sequence is available
+	// Ensure there is a place to put the sequence, released when removed from s.seqs
 	if err := s.seqsSem.Acquire(r.Context(), 1); err != nil {
-		slog.Error("Failed to acquire semaphore", "error", err)
+		if errors.Is(err, context.Canceled) {
+			slog.Info("aborting completion request due to client closing the connection")
+		} else {
+			slog.Error("Failed to acquire semaphore", "error", err)
+		}
 		return
 	}
-	defer s.seqsSem.Release(1)

 	s.mu.Lock()
+	found := false
 	for i, sq := range s.seqs {
 		if sq == nil {
 			seq.cache, seq.inputs, err = s.cache.LoadCacheSlot(seq.inputs, req.CachePrompt)
@@ -670,11 +683,17 @@ func (s *Server) completion(w http.ResponseWriter, r *http.Request) {

 			s.seqs[i] = seq
 			s.cond.Signal()
+			found = true
 			break
 		}
 	}
 	s.mu.Unlock()

+	if !found {
+		http.Error(w, "could not find an available sequence", http.StatusInternalServerError)
+		return
+	}
+
 	for {
 		select {
 		case <-r.Context().Done():
@@ -738,14 +757,18 @@ func (s *Server) embeddings(w http.ResponseWriter, r *http.Request) {
 		return
 	}

-	// Ensure that a place to put the sequence is available
+	// Ensure there is a place to put the sequence, released when removed from s.seqs
 	if err := s.seqsSem.Acquire(r.Context(), 1); err != nil {
-		slog.Error("Failed to acquire semaphore", "error", err)
+		if errors.Is(err, context.Canceled) {
+			slog.Info("aborting embeddings request due to client closing the connection")
+		} else {
+			slog.Error("Failed to acquire semaphore", "error", err)
+		}
 		return
 	}
-	defer s.seqsSem.Release(1)

 	s.mu.Lock()
+	found := false
 	for i, sq := range s.seqs {
 		if sq == nil {
 			seq.cache, seq.inputs, err = s.cache.LoadCacheSlot(seq.inputs, req.CachePrompt)
@@ -756,11 +779,17 @@ func (s *Server) embeddings(w http.ResponseWriter, r *http.Request) {
 			}
 			s.seqs[i] = seq
 			s.cond.Signal()
+			found = true
 			break
 		}
 	}
 	s.mu.Unlock()

+	if !found {
+		http.Error(w, "could not find an available sequence", http.StatusInternalServerError)
+		return
+	}
+
 	embedding := <-seq.embedding

 	if err := json.NewEncoder(w).Encode(&EmbeddingResponse{
--- a/llm/filetype.go
+++ b/llm/filetype.go
@@ -32,9 +32,10 @@ const (
 	fileTypeIQ1_S
 	fileTypeIQ4_NL
 	fileTypeIQ3_S
+	fileTypeIQ3_M
 	fileTypeIQ2_S
-	fileTypeIQ4_XS
 	fileTypeIQ2_M
+	fileTypeIQ4_XS
 	fileTypeIQ1_M
 	fileTypeBF16

@@ -93,6 +94,8 @@ func ParseFileType(s string) (fileType, error) {
 		return fileTypeIQ4_NL, nil
 	case "IQ3_S":
 		return fileTypeIQ3_S, nil
+	case "IQ3_M":
+		return fileTypeIQ3_M, nil
 	case "IQ2_S":
 		return fileTypeIQ2_S, nil
 	case "IQ4_XS":
@@ -160,6 +163,8 @@ func (t fileType) String() string {
 		return "IQ4_NL"
 	case fileTypeIQ3_S:
 		return "IQ3_S"
+	case fileTypeIQ3_M:
+		return "IQ3_M"
 	case fileTypeIQ2_S:
 		return "IQ2_S"
 	case fileTypeIQ4_XS:
--- a/llm/server.go
+++ b/llm/server.go
@@ -687,7 +687,11 @@ type CompletionResponse struct {

 func (s *llmServer) Completion(ctx context.Context, req CompletionRequest, fn func(CompletionResponse)) error {
 	if err := s.sem.Acquire(ctx, 1); err != nil {
-		slog.Error("Failed to acquire semaphore", "error", err)
+		if errors.Is(err, context.Canceled) {
+			slog.Info("aborting completion request due to client closing the connection")
+		} else {
+			slog.Error("Failed to acquire semaphore", "error", err)
+		}
 		return err
 	}
 	defer s.sem.Release(1)
@@ -865,7 +869,11 @@ type EmbeddingResponse struct {

 func (s *llmServer) Embedding(ctx context.Context, input string) ([]float32, error) {
 	if err := s.sem.Acquire(ctx, 1); err != nil {
-		slog.Error("Failed to acquire semaphore", "error", err)
+		if errors.Is(err, context.Canceled) {
+			slog.Info("aborting embedding request due to client closing the connection")
+		} else {
+			slog.Error("Failed to acquire semaphore", "error", err)
+		}
 		return nil, err
 	}
 	defer s.sem.Release(1)
--- a/openai/openai.go
+++ b/openai/openai.go
@@ -571,7 +571,7 @@ type EmbedWriter struct {
 	model string
 }

-func (w *BaseWriter) writeError(code int, data []byte) (int, error) {
+func (w *BaseWriter) writeError(data []byte) (int, error) {
 	var serr api.StatusError
 	err := json.Unmarshal(data, &serr)
 	if err != nil {
@@ -630,7 +630,7 @@ func (w *ChatWriter) writeResponse(data []byte) (int, error) {
 func (w *ChatWriter) Write(data []byte) (int, error) {
 	code := w.ResponseWriter.Status()
 	if code != http.StatusOK {
-		return w.writeError(code, data)
+		return w.writeError(data)
 	}

 	return w.writeResponse(data)
@@ -679,7 +679,7 @@ func (w *CompleteWriter) writeResponse(data []byte) (int, error) {
 func (w *CompleteWriter) Write(data []byte) (int, error) {
 	code := w.ResponseWriter.Status()
 	if code != http.StatusOK {
-		return w.writeError(code, data)
+		return w.writeError(data)
 	}

 	return w.writeResponse(data)
@@ -704,7 +704,7 @@ func (w *ListWriter) writeResponse(data []byte) (int, error) {
 func (w *ListWriter) Write(data []byte) (int, error) {
 	code := w.ResponseWriter.Status()
 	if code != http.StatusOK {
-		return w.writeError(code, data)
+		return w.writeError(data)
 	}

 	return w.writeResponse(data)
@@ -730,7 +730,7 @@ func (w *RetrieveWriter) writeResponse(data []byte) (int, error) {
 func (w *RetrieveWriter) Write(data []byte) (int, error) {
 	code := w.ResponseWriter.Status()
 	if code != http.StatusOK {
-		return w.writeError(code, data)
+		return w.writeError(data)
 	}

 	return w.writeResponse(data)
@@ -755,7 +755,7 @@ func (w *EmbedWriter) writeResponse(data []byte) (int, error) {
 func (w *EmbedWriter) Write(data []byte) (int, error) {
 	code := w.ResponseWriter.Status()
 	if code != http.StatusOK {
-		return w.writeError(code, data)
+		return w.writeError(data)
 	}

 	return w.writeResponse(data)
--- a/scripts/env.sh
+++ b/scripts/env.sh
@@ -5,7 +5,6 @@ export GOFLAGS="'-ldflags=-w -s \"-X=github.com/ollama/ollama/version.Version=$V
 # TODO - consider `docker buildx ls --format=json` to autodiscover platform capability
 PLATFORM=${PLATFORM:-"linux/arm64,linux/amd64"}
 DOCKER_ORG=${DOCKER_ORG:-"ollama"}
-RELEASE_IMAGE_REPO=${RELEASE_IMAGE_REPO:-"${DOCKER_ORG}/release"}
 FINAL_IMAGE_REPO=${FINAL_IMAGE_REPO:-"${DOCKER_ORG}/ollama"}
 OLLAMA_COMMON_BUILD_ARGS="--build-arg=VERSION \
    --build-arg=GOFLAGS \
--- a/server/images.go
+++ b/server/images.go
@@ -5,7 +5,6 @@ import (
 	"cmp"
 	"context"
 	"crypto/sha256"
-	"encoding/base64"
 	"encoding/hex"
 	"encoding/json"
 	"errors"
@@ -31,8 +30,8 @@ import (
 	"github.com/ollama/ollama/llm"
 	"github.com/ollama/ollama/parser"
 	"github.com/ollama/ollama/template"
-	"github.com/ollama/ollama/types/errtypes"
 	"github.com/ollama/ollama/types/model"
+	"github.com/ollama/ollama/types/registry"
 	"github.com/ollama/ollama/version"
 )

@@ -983,39 +982,7 @@ func GetSHA256Digest(r io.Reader) (string, int64) {
 	return fmt.Sprintf("sha256:%x", h.Sum(nil)), n
 }

-var errUnauthorized = errors.New("unauthorized: access denied")
-
-// getTokenSubject returns the subject of a JWT token, it does not validate the token
-func getTokenSubject(token string) string {
-	parts := strings.Split(token, ".")
-	if len(parts) != 3 {
-		return ""
-	}
-
-	payload := parts[1]
-	payloadBytes, err := base64.RawURLEncoding.DecodeString(payload)
-	if err != nil {
-		slog.Error(fmt.Sprintf("failed to decode jwt payload: %v", err))
-		return ""
-	}
-
-	var payloadMap map[string]interface{}
-	if err := json.Unmarshal(payloadBytes, &payloadMap); err != nil {
-		slog.Error(fmt.Sprintf("failed to unmarshal payload JSON: %v", err))
-		return ""
-	}
-
-	sub, ok := payloadMap["sub"]
-	if !ok {
-		slog.Error("jwt does not contain 'sub' field")
-		return ""
-	}
-
-	return fmt.Sprintf("%s", sub)
-}
-
 func makeRequestWithRetry(ctx context.Context, method string, requestURL *url.URL, headers http.Header, body io.ReadSeeker, regOpts *registryOptions) (*http.Response, error) {
-	anonymous := true // access will default to anonymous if no user is found associated with the public key
 	for range 2 {
 		resp, err := makeRequest(ctx, method, requestURL, headers, body, regOpts)
 		if err != nil {
@@ -1036,7 +1003,6 @@ func makeRequestWithRetry(ctx context.Context, method string, requestURL *url.UR
 			if err != nil {
 				return nil, err
 			}
-			anonymous = getTokenSubject(token) == "anonymous"
 			regOpts.Token = token
 			if body != nil {
 				_, err = body.Seek(0, io.SeekStart)
@@ -1053,23 +1019,33 @@ func makeRequestWithRetry(ctx context.Context, method string, requestURL *url.UR
 			if err != nil {
 				return nil, fmt.Errorf("%d: %s", resp.StatusCode, err)
 			}
+
+			var re registry.Errs
+			if err := json.Unmarshal(responseBody, &re); err == nil && len(re.Errors) > 0 {
+				if re.HasCode(registry.ErrCodeAnonymous) {
+					// if the error is due to anonymous access return a custom error
+					// this error is used by the CLI to direct a user to add their key to an account
+					pubKey, nestedErr := auth.GetPublicKey()
+					if nestedErr != nil {
+						slog.Error(fmt.Sprintf("couldn't get public key: %v", nestedErr))
+						return nil, re
+					}
+					return nil, api.ErrUnknownOllamaKey{
+						Key: pubKey,
+					}
+				}
+				return nil, re
+			}
+
+			// Fallback to returning the raw response if parsing fails
 			return nil, fmt.Errorf("%d: %s", resp.StatusCode, responseBody)
 		default:
 			return resp, nil
 		}
 	}

-	if anonymous {
-		// no user is associated with the public key, and the request requires non-anonymous access
-		pubKey, nestedErr := auth.GetPublicKey()
-		if nestedErr != nil {
-			slog.Error(fmt.Sprintf("couldn't get public key: %v", nestedErr))
-			return nil, errUnauthorized
-		}
-		return nil, &errtypes.UnknownOllamaKey{Key: pubKey}
-	}
-	// user is associated with the public key, but is not authorized to make the request
-	return nil, errUnauthorized
+	// should never be reached
+	return nil, fmt.Errorf("failed to make upload request")
 }

 // testMakeRequestDialContext specifies the dial function for the http client in
@@ -1120,17 +1096,15 @@ func makeRequest(ctx context.Context, method string, requestURL *url.URL, header
 		req.ContentLength = contentLength
 	}

-	resp, err := (&http.Client{
-		Transport: &http.Transport{
-			DialContext: testMakeRequestDialContext,
-		},
+	c := &http.Client{
 		CheckRedirect: regOpts.CheckRedirect,
-	}).Do(req)
-	if err != nil {
-		return nil, err
 	}
-
-	return resp, nil
+	if testMakeRequestDialContext != nil {
+		tr := http.DefaultTransport.(*http.Transport).Clone()
+		tr.DialContext = testMakeRequestDialContext
+		c.Transport = tr
+	}
+	return c.Do(req)
 }

 func getValue(header, key string) string {
--- a/server/routes.go
+++ b/server/routes.go
@@ -36,7 +36,6 @@ import (
 	"github.com/ollama/ollama/runners"
 	"github.com/ollama/ollama/server/imageproc"
 	"github.com/ollama/ollama/template"
-	"github.com/ollama/ollama/types/errtypes"
 	"github.com/ollama/ollama/types/model"
 	"github.com/ollama/ollama/version"
 )
@@ -610,7 +609,7 @@ func (s *Server) PushHandler(c *gin.Context) {
 		defer cancel()

 		if err := PushModel(ctx, model, regOpts, fn); err != nil {
-			ch <- gin.H{"error": err.Error()}
+			ch <- newErr(err)
 		}
 	}()

@@ -650,7 +649,7 @@ func (s *Server) CreateHandler(c *gin.Context) {

 	name := model.ParseName(cmp.Or(r.Model, r.Name))
 	if !name.IsValid() {
-		c.AbortWithStatusJSON(http.StatusBadRequest, gin.H{"error": errtypes.InvalidModelNameErrMsg})
+		c.AbortWithStatusJSON(http.StatusBadRequest, gin.H{"error": api.InvalidModelNameErrMsg})
 		return
 	}

@@ -1141,7 +1140,7 @@ func (s *Server) GenerateRoutes() http.Handler {
 	config.AllowWildcard = true
 	config.AllowBrowserExtensions = true
 	config.AllowHeaders = []string{"Authorization", "Content-Type", "User-Agent", "Accept", "X-Requested-With"}
-	openAIProperties := []string{"lang", "package-version", "os", "arch", "runtime", "runtime-version", "async"}
+	openAIProperties := []string{"lang", "package-version", "os", "arch", "retry-count", "runtime", "runtime-version", "async"}
 	for _, prop := range openAIProperties {
 		config.AllowHeaders = append(config.AllowHeaders, "x-stainless-"+prop)
 	}
@@ -1550,3 +1549,24 @@ func handleScheduleError(c *gin.Context, name string, err error) {
 		c.JSON(http.StatusInternalServerError, gin.H{"error": err.Error()})
 	}
 }
+
+// newErr creates a structured API ErrorResponse from an existing error
+func newErr(err error) api.ErrorResponse {
+	if err == nil {
+		return api.ErrorResponse{}
+	}
+	// Default to just returning the generic error message
+	resp := api.ErrorResponse{
+		Code:    api.ErrCodeGeneral,
+		Message: err.Error(),
+	}
+	// Add additional error specific data, if any
+	var keyErr api.ErrUnknownOllamaKey
+	if errors.As(err, &keyErr) {
+		resp.Code = api.ErrCodeUnknownKey
+		resp.Data = map[string]any{
+			"key": keyErr.Key,
+		}
+	}
+	return resp
+}
--- a/types/errtypes/errtypes.go
+++ b/types/errtypes/errtypes.go
@@ -1,21 +0,0 @@
-// Package errtypes contains custom error types
-package errtypes
-
-import (
-	"fmt"
-	"strings"
-)
-
-const (
-	UnknownOllamaKeyErrMsg = "unknown ollama key"
-	InvalidModelNameErrMsg = "invalid model name"
-)
-
-// TODO: This should have a structured response from the API
-type UnknownOllamaKey struct {
-	Key string
-}
-
-func (e *UnknownOllamaKey) Error() string {
-	return fmt.Sprintf("unauthorized: %s %q", UnknownOllamaKeyErrMsg, strings.TrimSpace(e.Key))
-}
--- a/types/registry/error.go
+++ b/types/registry/error.go
@@ -0,0 +1,37 @@
+package registry
+
+import (
+	"fmt"
+	"slices"
+	"strings"
+)
+
+const ErrCodeAnonymous = "ANONYMOUS_ACCESS_DENIED"
+
+type Err struct {
+	Code    string `json:"code"`
+	Message string `json:"message"`
+}
+
+// Errs represents the structure of error responses from the registry
+// TODO (brucemacd): this struct should be imported from some shared package that is used between the registry and ollama
+type Errs struct {
+	Errors []Err `json:"errors"`
+}
+
+func (e Errs) Error() string {
+	if len(e.Errors) == 0 {
+		return "unknown registry error"
+	}
+	var msgs []string
+	for _, err := range e.Errors {
+		msgs = append(msgs, fmt.Sprintf("%s: %s", err.Code, err.Message))
+	}
+	return strings.Join(msgs, "; ")
+}
+
+func (e Errs) HasCode(code string) bool {
+	return slices.ContainsFunc(e.Errors, func(err Err) bool {
+		return err.Code == code
+	})
+}
Author	SHA1	Message	Date
Bruce MacDonald	e19c64e047	lint fix	2024-12-12 13:07:09 -08:00
Bruce MacDonald	9e190ac4d9	api: return structured error on unauthorized push This commit implements a structured error response system for the Ollama API, replacing ad-hoc error handling and string parsing with proper error types and codes through a new ErrorResponse struct. Instead of relying on regex to parse error messages for SSH keys, the API now passes this data in a structured format with standardized fields for error messages, codes, and additional data. This structured approach makes the API more maintainable and reliable while improving the developer experience by enabling programmatic error handling, consistent error formats, and better error documentation.	2024-12-12 11:49:36 -08:00
Bruce MacDonald	ae9165d661	remove images_test.go (uses filesystem key)	2024-11-27 15:52:30 -08:00
Bruce MacDonald	a262b86a5e	fix lint checks	2024-11-27 15:45:45 -08:00
Bruce MacDonald	4d5d3c3276	Update error.go	2024-11-27 15:24:08 -08:00
Bruce MacDonald	ea90ee7415	Update cmd.go	2024-11-27 15:22:27 -08:00
Bruce MacDonald	40134c6587	server: show user feedback when key is anonymous When an ollama key is not registered with any account on ollama.com this is not obvious. In the current CLI an error message that the user is not authorized is displayed. This change brings back previous behavior to show the user their key and where they should add it. It protects against adding unexpected keys by checking that the key is available locally. A follow-up change should add structured errors from the API. This change just relies on a known error message.	2024-11-27 15:01:12 -08:00
Bruce MacDonald	940e62772e	openai: remove unused error code (#7850 ) The writeError takes a code argument which is no longer used. Remove it for clarity.	2024-11-26 16:08:09 -08:00
Jesse Gross	71e6a0d0d1	runner.go: Don't try to extract image tags for text models When processing a prompt, we look for image tags of the form [img-0], which are inserted by the Ollama server process. However, this can cause errors if the original prompt has these tags - typically an image not found error is returned. This changes tag searching behavior to be similar to the 0.3.x series, which will largely avoid these problems. However,they can still happen when input text with these tags is used with image models. The correct solution is to escape the tags but this is a larger issue with special sequences in general so this is an incremental fix that should avoid the problem for the majority of cases.	2024-11-26 13:23:24 -08:00
Jesse Gross	2cd11ae365	runner.go: Add unit tests for context shifting This also makes it easier to truncate long inputs the same as shifting but does not actually implement it. This type of truncation has a trade off between quality and time to first token.	2024-11-26 11:21:35 -08:00
jake83741	52bbad12f9	readme: update description for vnc-lm community integration (#7832 )	2024-11-25 17:56:30 -08:00
frob	30e88d7f31	cmd: don't submit svg files as images for now (#7830 )	2024-11-25 16:43:29 -08:00
Blake Mizerany	2b7ed61ca2	server: fix Transport override (#7834 ) This changes makeRequest to update the http client Transport if and only if testMakeRequestDialContext is set. This is to avoid overriding the default Transport when testMakeRequestDialContext is nil, which broke existing behavior, included proxies, timeouts, and other behaviors. Fixes #7829 Fixes #7788	2024-11-25 15:08:34 -08:00
Shikhar Bakhda	647513a7d4	readme: add HoneyHive to community integrations (#7831 )	2024-11-25 09:55:33 -08:00
Bruce MacDonald	a210ec74d2	cmd: print location of model after pushing (#7695 ) After a user pushes their model it is not clear what to do next. Add a link to the output of `ollama push` that tells the user where their model can now be found.	2024-11-25 09:40:16 -08:00
Simon Schampijer	cfb1ddd6fc	examples: update langchain-python-simple (#3591 ) - better formatting of input prompt - use invoke instead of predict	2024-11-24 16:06:22 -08:00
reid41	3987acd7ec	readme: add descriptions for QA-Pilot and shell-pilot community integrations (#4303 )	2024-11-24 15:55:09 -08:00
frob	fda1e6b563	llm: bring fileTypes into alignment with llama.cpp (#7819 )	2024-11-24 10:33:33 -08:00
Adarsh Mishra	3440ffb37b	readme: add description for OpenTalkGpt in community integrations (#7818 )	2024-11-24 10:32:23 -08:00
Patcher	a820d2b267	readme: add observability section with OpenLIT to community-integrations	2024-11-23 18:03:12 -08:00
Meng Zhuo	2ebdb54fb3	all: update math32 go mod to v1.11.0 (#6627 )	2024-11-23 15:21:54 -08:00
josc146	bb52abfa55	readme: add ChatGPTBox and RWKV-Runner to community integrations (#4118 )	2024-11-23 13:31:27 -08:00
oza6ut0ne	31cb1ca9e5	openai: accept X-Stainless-Retry-Count header (#6910 )	2024-11-23 12:39:05 -08:00
Rodrigo Ribeiro Gomes	78f779a323	readme: add powershai, a powershell module with ollama support to community integrations (#7438 )	2024-11-23 10:08:59 -08:00
Jesse Gross	3478b2cf14	runner.go: Fix deadlock with many concurrent requests If there are no avilable slots for new sequences then a request will not be added to the processing queue but will continue on to wait for a response that never comes. Besides never giving a response to the request, this prevents the model from being unloaded due to the outstanding request. To prevent this, there are semaphores that prevent more requests from being processed than there are slots - one in the Ollama server and one in the runner. - The Ollama server one works but it is not designed to protect the runner's data internal structures and the runner can return a final response before clearing its data structures. - The internal runner semaphore has similar behavior where it can release the semaphore when it issues a response. This is wrong - it should only release the semaphore after it has cleared the data structure. In addition, we should return an error if a slot is not found rather than deadlocking in the event we ever get to this spot. Fixes #7779	2024-11-22 16:14:51 -08:00
Bruce MacDonald	7b5585b9cb	server: remove out of date anonymous access check (#7785 ) In the past the ollama.com server would return a JWT that contained information about the user being authenticated. This was used to return different error messages to the user. This is no longer possible since the token used to authenticate does not contain information about the user anymore. Removing this code that no longer works. Follow up changes will improve the error messages returned here, but good to clean up first.	2024-11-22 11:57:35 -08:00
Daniel Hiltgen	f0a351810c	tests: fix max queue integration test (#7782 ) This had fallen out of sync with the envconfig behavior, where max queue default was not zero.	2024-11-22 08:05:45 -08:00
Daniel Hiltgen	b85520bfb9	logs: explain client aborts better (#7783 ) Users get confused by "Failed to acquire semaphore" error="context canceled" messages in the logs, which are actually clients giving up. While there could be a legitimate hang bug in the system, sometimes this is just short client timeouts with an overloaded system, so this should help users understand what's going on better.	2024-11-22 08:05:32 -08:00
Daniel Hiltgen	d88972ea48	Be quiet when redirecting output (#7360 ) This avoids emitting the progress indicators to stderr, and the interactive prompts to the output file or pipe. Running "ollama run model > out.txt" now exits immediately, and "echo hello \| ollama run model > out.txt" produces zero stderr output and a typical response in out.txt	2024-11-22 08:04:54 -08:00
Leon Sander	25c9339e2d	readme: add Local Multimodal AI Chat app to community integrations (#6931 )	2024-11-21 20:39:38 -08:00
Mikel Olasagasti Uranga	597072ef1b	readme: update google/uuid module (#7310 ) update uuid.New().String() to uuid.NewString()	2024-11-21 19:37:04 -08:00
Dustin	84b3e07f1b	readme: add ollamarama-matrix to community integrations (#7325 )	2024-11-21 17:49:30 -08:00
Edwin.JH.Lee	422d52858c	readme: add x-cmd ollama module to community integrations (#5191 )	2024-11-21 16:55:25 -08:00
Elias	723f285813	readme: add OrionChat to community integrations (#7084 ) OrionChat is a free web-based chat interface that simplifies interactions with multiple AI model providers. It provides a unified platform for chatting and exploring multiple large language models (LLMs).	2024-11-21 11:23:42 -08:00
湛露先生	eaaf5d309d	cmd: delete duplicated call to sb.Reset() (#7308 ) Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>	2024-11-21 11:20:48 -08:00
Jeffrey Morgan	27d9c749d5	docs: remove tutorials, add cloud section to community integrations (#7784 )	2024-11-21 09:59:53 -08:00
R0CKSTAR	b7bddeebc1	env.sh: cleanup unused RELEASE_IMAGE_REPO (#6855 ) Signed-off-by: Xiaodong Ye <xiaodong.ye@mthreads.com>	2024-11-21 08:28:04 -08:00
Paul Robello	6a0c2ec50f	readme: add terminal tool ParLlama to community integrations (#5623 )	2024-11-21 02:55:35 -08:00
毛巳煜	baa41be2aa	readme: add a community made ollama web management tool (#7126 )	2024-11-21 02:51:45 -08:00
xuyangbocn	2157b1232e	readme: add Terraform AWS Ollama & Open WebUI community example (#5633 )	2024-11-21 02:28:57 -08:00
emrgnt-cmplxty	37711578a2	readme: add R2R to community integrations (#5587 )	2024-11-21 02:09:36 -08:00
Cyril Blaecke	fb2c9594e0	readme: Add Nosia to Community Integrations (#5381 )	2024-11-21 02:07:17 -08:00
Christian Tzolov	7fbcd55da3	readme: Add Spring AI library reference (#5981 )	2024-11-21 02:02:14 -08:00
Philippe Charrière	b4348bdd25	readme: add Parakeet to community integrations Parakeet is a GoLang SDK for Ollama --------- Co-authored-by: Parth Sareen <parth.sareen@ollama.com>	2024-11-21 02:00:32 -08:00
Marcin Szczygliński	155734e09a	readme: add community integration py-gpt (#6503 )	2024-11-21 01:54:39 -08:00
Michael	883d80e097	readme: add Promptery to community integrations (#7093 )	2024-11-21 01:46:20 -08:00
Jakub Burkiewicz	e4c9f75b23	readme: add node-red-contrib-ollama to community integrations (#4648 )	2024-11-21 01:09:37 -08:00
Dezoito	f5ec7cc872	readme: add ollama grid search, a community project (#4301 )	2024-11-21 01:02:46 -08:00
Franco Lombardo	811bafba82	readme: Add LLPhant to community integrations (#5679 )	2024-11-21 00:54:26 -08:00
Aarushi	431075fcbb	readme: add autogpt integration to list of community integrations (#6459 )	2024-11-21 00:51:38 -08:00
Kevin Brake	c4f27225ac	readme: add community contribution to readme ollama-kis (#5575 )	2024-11-21 00:31:27 -08:00
chyok	b7aa5ee06c	readme: Add tkinter-based client to community based integrations (#5412 )	2024-11-21 00:19:24 -08:00
Nico	3f87f71755	readme: add Shinkai Desktop to community integrations (#4877 )	2024-11-21 00:16:18 -08:00
Laurent Eschenauer	20623cec13	readme: add OpenGPA to community integrations (#5497 )	2024-11-21 00:13:54 -08:00
Andy Gill	0e5f31a86d	readme: add Haverscript to community integrations (#6945 ) Haverscript uses classical functional programming techniques to provide a composable interface for interacting with ollama-hosted LLMs.	2024-11-21 00:11:39 -08:00
drunkwcodes	7e92091751	readme: Terminal app bb7 to community integrations (#7064 )	2024-11-21 00:03:11 -08:00
boessu	1a742f54c9	readme: update AMD ROCm links (#7213 )	2024-11-20 23:48:55 -08:00
奶茶叔叔	6a89dcf848	readme: flutter-based chat app to community integrations (#7221 )	2024-11-20 23:30:10 -08:00
Alexander F. Rødseth	c5e238e8e5	readme: orbiton to community integrations (#7770 )	2024-11-20 23:24:05 -08:00
Nikita Ganzikov	fce30f407a	app: typo in wintray messages const (#7705 )	2024-11-20 22:01:58 -08:00
Daniel Hiltgen	d863298210	docs: Link to AMD guide on multi-GPU guidance (#7744 )	2024-11-20 16:00:46 -08:00