Compare commits

...

113 Commits

Author SHA1 Message Date
ParthSareen
b4de2e9189 change name to context_length 2025-02-07 11:50:38 -08:00
ParthSareen
61a5254115 context_window and addressing comments 2025-02-05 11:26:55 -08:00
ParthSareen
53d2cf37d2 update docs 2025-02-04 15:17:16 -08:00
ParthSareen
75f88e7aac Update docs 2025-02-04 10:47:32 -08:00
ParthSareen
4982089c84 Fix formatting 2025-01-30 13:53:24 -08:00
Parth Sareen
8c231b0826 Update openai/openai.go
Co-authored-by: Michael Yang <mxyng@pm.me>
2025-01-30 13:50:25 -08:00
ParthSareen
16abd181a9 remove context shifting with max tokens and update docs 2025-01-30 13:48:24 -08:00
ParthSareen
5c2f35d846 Add tests 2025-01-30 13:16:15 -08:00
ParthSareen
6de3227841 Cleanup api 2025-01-30 13:15:57 -08:00
ParthSareen
35e97db03b set num_ctx through extra body 2025-01-29 13:13:11 -08:00
Xiaofu Huang
2ef3c803a1 readme: add AI Toolkit for VSCode to community integrations (#8604) 2025-01-27 00:36:23 -08:00
Matěj Štágl
453e4d090b readme: add LlmTornado to community integrations (#8551) 2025-01-25 01:04:07 -08:00
Daniel Jalkut
ca2f9843c8 docs: remove reference to the deleted examples folder (#8524) 2025-01-22 22:52:15 -08:00
frob
294b6f5a22 docs: remove tfs_z option from documentation (#8515) 2025-01-21 09:28:59 -08:00
EndoTheDev
7bb356c680 docs: update suspend header in gpu.md (#8487) 2025-01-19 18:45:35 -08:00
Jannik Maierhöfer
021817e59a readme: add link to Langfuse (#8455) 2025-01-16 22:41:12 -08:00
Patrick Devine
a420a453b4 fix default modelfile for create (#8452) 2025-01-16 01:14:04 -08:00
Jeffrey Morgan
42cf4db601 parser: fix parsing Modelfiles with multiple FROM commands (#8449) 2025-01-16 00:14:04 -08:00
Josh
93a8daf285 convert: import support for command-r models from safetensors (#6063)
---------

Co-authored-by: Patrick Devine <patrick@infrahq.com>
2025-01-15 16:31:22 -08:00
Gloryjaw
a041b4df7c docs: fix path to examples (#8438) 2025-01-15 11:49:12 -08:00
Patrick Devine
2539f2dbf9 Fix absolute path names + gguf detection (#8428) 2025-01-14 19:01:24 -08:00
Jeffrey Morgan
61676fb506 llama: move grammar tests to llama_test.go (#8411) 2025-01-14 12:55:45 -08:00
Bruce MacDonald
f6f3713001 convert: qwen2 from safetensors (#8408)
Add native support for converting Qwen2 family models (including Qwen2.5)
from safetensors to gguf format so we can run it.
2025-01-14 10:34:37 -08:00
Steve Berdy
a30f347201 readme: add LangChain for .NET to community integrations (#8352) 2025-01-14 09:37:35 -08:00
Jeffrey Morgan
74ea4fb604 remove .prettierrc.json (#8413) 2025-01-14 09:30:34 -08:00
Jeffrey Morgan
6982e9cc96 readme: remove link to missing page 2025-01-13 18:56:31 -08:00
Patrick Devine
ab39872cb4 add new create api doc (#8388) 2025-01-13 17:30:24 -08:00
Parth Sareen
84a2314463 examples: remove codified examples (#8267) 2025-01-13 11:26:22 -08:00
Jeffrey Morgan
17fcdea698 readme: move discord link 2025-01-12 22:45:47 -08:00
Patrick Devine
32bd37adf8 make the modelfile path relative for ollama create (#8380) 2025-01-10 16:14:08 -08:00
Michael Yang
9446c2c902 Merge pull request #8196 from ollama/mxyng/gods-v2
chore: upgrade to gods v2
2025-01-10 13:50:11 -08:00
Jeffrey Morgan
9aa141d023 readme: remove discord badge image for now 2025-01-09 22:02:18 -08:00
Patrick Devine
8bccae4f92 show a more descriptive error in the client if it is newer than the server (#8351) 2025-01-09 10:12:30 -08:00
isamu arimoto
6ae2adc1af openai: accept additional headers to fix CORS errors (#8343) 2025-01-08 11:28:11 -08:00
Jeffrey Morgan
1deafd8254 llama: update vendored code to commit 46e3556 (#8308) 2025-01-08 11:22:01 -08:00
Michael
57f038ec7b readme: add phi4 model (#8350) 2025-01-08 11:21:39 -08:00
frob
cdf3a181dc Add CUSTOM_CPU_FLAGS to Dockerfile. (#8284)
* Add CUSTOM_CPU_FLAGS.

* fix golangci-lint error.

---------

Co-authored-by: Richard Lyons <rick@frob.com.au>
2025-01-06 09:17:19 -08:00
Ubaldo Porcheddu
3919f4ba3d llama: fix runner api example url in README.md (#8307) 2025-01-04 15:45:16 -08:00
Bruce MacDonald
2d33c4e97d discover: remove leading new-line for linter 2025-01-03 12:03:58 -08:00
Bruce MacDonald
29a8975c66 api: remove unused create fields
These fields are deprecated, but specifying them will not do anything. Removing them as the other deprecated fields will still work, but these do not, so they dont match our existing pattern.
2025-01-03 12:03:58 -08:00
Patrick Devine
86a622cbdc Update the /api/create endpoint to use JSON (#7935)
Replaces `POST /api/create` to use JSON instead of a Modelfile.

This is a breaking change.
2024-12-31 18:02:30 -08:00
Jeffrey Morgan
459d822b51 readme: link header to ollama.com 2024-12-29 17:36:07 -05:00
Simon Schampijer
844899440a examples: updated deprecated imports (#3602) 2024-12-29 14:36:25 -05:00
Anas Khan
103db4216d docs: add /api/version endpoint documentation (#8082)
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com>
2024-12-29 14:33:44 -05:00
Jeffrey Morgan
6daddcde01 readme: update import header 2024-12-29 14:12:23 -05:00
Emilien Lancelot
07f7e69b36 readme: add Yacana multi-agent framework to community integrations (#7259) 2024-12-28 15:05:57 -05:00
CIIDMike
b68e8e5727 docs: add syntax highlighting on Go template code blocks (#8215) 2024-12-27 13:17:49 -05:00
Adarsh Mishra
369fb529e2 readme: add TextLLaMA to community integrations 2024-12-27 13:16:06 -05:00
Jared Donnell
023e4bca14 readme: add neollama to terminal section of community integrations (#8242) 2024-12-25 17:16:11 -05:00
aritra saha
51af455f62 readme: add alpaca client application to community integrations (#8227) 2024-12-24 23:05:35 -05:00
Emanuil Rusev
ffe3549064 readme: add IntelliBar to community integrations (#7950) 2024-12-23 12:04:18 -05:00
湛露先生
928de9050e server: reuse InvalidModelNameErrMsg type (#8163) 2024-12-23 10:38:34 -05:00
ItzCrazyKns
36aea6154a readme: add Perplexica to community-integrations (#8198) 2024-12-22 20:04:01 -05:00
Patrick Devine
dd352ab27f fix crash bug with /save when quotes are used (#8208) 2024-12-21 22:31:37 -08:00
Michael Yang
cb40d60469 chore: upgrade to gods v2
gods v2 uses go generics rather than interfaces which simplifies the
code considerably
2024-12-21 00:05:16 -08:00
Patrick Devine
d8bab8ea44 remove tutorials.md which pointed to removed tutorials (#8189) 2024-12-20 14:04:20 -08:00
Squishedmac
9ab62eb96f update golang.org/x dependencies (#8172) 2024-12-20 09:29:30 -08:00
Parth Sareen
290cf2040a llama: test key order preservation in schema_to_grammar (#8078)
This change adds a test to catch a regression in schema_to_grammar where
the order of keys in the JSON schema is not preserved in the generated
grammar, which is critical for step-by-step reasoning.
2024-12-18 19:44:50 -08:00
Jeffrey Morgan
a72f2dce45 scripts: sign renamed macOS binary (#8131) 2024-12-17 18:03:49 -08:00
Jesse Gross
08a832b482 llama: Ensure KV cache is fully defragmented.
Sometimes the KV cache requires defragmentation even without
triggering the threshold heuristic. In this case, decoding
will not being able to find a KV cache slot. This is particularly
difficult for the caller to handle if it happens in between
ubatches. To avoid this, we should immediately trigger a defrag.

In addition, a heavily fragmented cache can require more than
max_moves to defragment. Currently, we stop when we hit the limit
but this can leave a cache that still does not have adequate space
even after defragmentation is triggered. Instead, we should do
multiple batches of processing until everything is complete.

Fixes #7949
2024-12-17 14:01:19 -08:00
Blake Mizerany
2ddc32d5c5 llm: do not error on "null" format (#8139)
This fixes another regression in the previous commit that fixed other
known bugs.
2024-12-17 09:49:37 -08:00
Jascha Beste
2cde4b8817 readme: change getting started guide link for pgai (#8119) 2024-12-16 22:13:23 -08:00
Blake Mizerany
87f0a49fe6 llm: do not silently fail for supplied, but invalid formats (#8130)
Changes in #8002 introduced fixes for bugs with mangling JSON Schemas.
It also fixed a bug where the server would silently fail when clients
requested invalid formats. It also, unfortunately, introduced a bug
where the server would reject requests with an empty format, which
should be allowed.

The change in #8127 updated the code to allow the empty format, but also
reintroduced the regression where the server would silently fail when
the format was set, but invalid.

This commit fixes both regressions. The server does not reject the empty
format, but it does reject invalid formats. It also adds tests to help
us catch regressions in the future.

Also, the updated code provides a more detailed error message when a
client sends a non-empty, but invalid format, echoing the invalid format
in the response.

This commits also takes the opportunity to remove superfluous linter
checks.
2024-12-16 21:57:49 -08:00
Jeffrey Morgan
0f06a6daa7 llm: loosen format check to default to no format (#8127) 2024-12-16 18:45:46 -08:00
Daniel Hiltgen
8f805dd74b darwin: restore multiple runners for x86 (#8125)
In 0.5.2 we simplified packaging to have avx only for macos x86.  It looks like
there may still be some non-AVX systems out there, so this puts back the prior
logic of building no-AVX for the primary binary, and now 2 runners for avx and avx2.
These will be packaged in the App bundle only, so the stand-alone binary will now be
without AVX support on macos.  On arm, we'll also see these runners reported
as available in the log, but they're dormant and will never be used at runtime.
2024-12-16 18:45:02 -08:00
Michael
89d5e2f2fd readme: example/get started guide for pgai with Ollama (#8115)
readme: example/get started guide for pgai with Ollama
2024-12-16 17:14:37 +08:00
Jascha Beste
297ada6c87 readme: add pgai to readme for semantic search (#8028)
* docs: switch around database integrations order and link to quickstart

* docs: link to blog post in example readme

* chore: link to main readme

* readme: removing example to link externally

readme: removing example to link externally so we don't have to keep this example up-to-date

---------
2024-12-16 17:02:28 +08:00
Patrick Devine
8c9fb8eb73 imageproc mllama refactor (#7537)
Refactor mllama image processing code, and add pixtral and qwen2vl
2024-12-14 19:50:15 -08:00
Daniel Hiltgen
b75ccfc5ec ci: be more aggressive on parallelism in build (#8102) 2024-12-14 14:56:05 -08:00
Jeffrey Morgan
7a81daf026 llama: update vendor code to commit ba1cb19c (#8101) 2024-12-14 14:55:51 -08:00
Daniel Hiltgen
60f75560a2 runner: switch logging back to stderr (#8091)
This puts the low-level runner logging back on stderr for consistency with prior releases
2024-12-13 14:36:50 -08:00
Anuraag (Rag) Agrawal
e28f2d4900 openai: return usage as final chunk for streams (#6784)
* openai: return usage as final chunk for streams

---------

Co-authored-by: ParthSareen <parth.sareen@ollama.com>
2024-12-12 17:09:30 -08:00
Pascal Patry
c216850523 llama: parse JSON schema using nlohmann::ordered_json to maintain ordering (#8071) 2024-12-12 09:57:28 -08:00
Parth Sareen
18f6a98bd6 llama: enable JSON schema key ordering for generating grammars (#8055) 2024-12-11 17:17:36 -08:00
Blake Mizerany
b1fd7fef86 server: more support for mixed-case model names (#8017)
Fixes #7944
2024-12-11 15:29:59 -08:00
Daniel Hiltgen
36d111e788 ci: fix linux version (#8054)
Pass through the version override so the makefiles use it
2024-12-11 14:09:57 -08:00
Blake Mizerany
9039c821a2 llama: preserve field order in user-defined JSON schemas (#8002)
Previously we decoded and re-encoded JSON schemas during validation,
which served no purpose since json.RawMessage already validates JSON
syntax. Worse, the re-encoding lost field ordering from the original
schema, which affects inference quality during step-by-step reasoning.

While fixing this ordering issue by using json.RawMessage directly,
testing revealed that schema_to_grammar (from llama.cpp) also fails to
preserve field order during grammar generation. This appears to be the
root cause of inference degradation.

This change prevents us from mangling the user's original schema order,
but we still need to address the ordering issue in schema_to_grammar.
That will be a separate change.

Updates #7978
2024-12-11 14:07:30 -08:00
Daniel Hiltgen
581a4a5553 ci: fix artifact path prefix for missing windows payloads (#8052)
upload-artifacts strips off leading common paths so when
the ./build/ artifacts were removed, the ./dist/windows-amd64
prefix became common and was stripped, making the
later download-artifacts place them in the wrong location
2024-12-11 10:59:32 -08:00
Daniel Hiltgen
cf4d7c52c4 win: builtin arm runner (#8039)
The new build embeds the arm runner in the
main binary, so there is no longer a lib/ollama
2024-12-11 08:32:13 -08:00
Daniel Hiltgen
6a6328a5e9 ci: build dir changed (#8037)
Remove no longer relevant build log dir
2024-12-10 20:33:34 -08:00
Jeffrey Morgan
527cc97899 llama: update vendored code to commit 40c6d79f (#7875) 2024-12-10 19:21:34 -08:00
Blake Mizerany
a37f4a86a7 go.mod: go 1.22.8 -> 1.23.4 (#8036) 2024-12-10 18:16:16 -08:00
湛露先生
46f74e0cb5 Return err when NewHipLib() detect error. (#8012)
Signed-off-by: zhanluxianshen <zhanluxianshen@163.com>
2024-12-10 16:32:29 -08:00
Phil Wornath
7622ea21af readme: add AI summary helper plugin to community-integrations (#7202) 2024-12-10 16:13:06 -08:00
Tao Zuhong
c5d3947084 readme: add Kangaroo, an AI-powered SQL admin tool to community integrations (#7948) 2024-12-10 13:48:32 -08:00
frob
757eeacc1b server: lowercase hostname for Host header check (#5851) 2024-12-10 13:43:22 -08:00
Dr. Daniel Bender
dd42acf737 readme: add aidful-ollama-model-delete to community integrations (#8024) 2024-12-10 13:03:19 -08:00
Daniel Hiltgen
b9ccb3741e Remove unused runner CpuFeatures (#8032)
The final implementation of #7499 removed dynamic vector requirements
in favor of a simpler filename based model, and this was left over logic that
is no longer needed.
2024-12-10 12:59:39 -08:00
Stefan Weil
abfdc4710f all: fix typos in documentation, code, and comments (#7021) 2024-12-10 12:58:06 -08:00
Daniel Hiltgen
82a02e18d9 build: fix typo in override variable (#8031)
The "F" was missing.
2024-12-10 10:51:16 -08:00
Daniel Hiltgen
4879a234c4 build: Make target improvements (#7499)
* llama: wire up builtin runner

This adds a new entrypoint into the ollama CLI to run the cgo built runner.
On Mac arm64, this will have GPU support, but on all other platforms it will
be the lowest common denominator CPU build.  After we fully transition
to the new Go runners more tech-debt can be removed and we can stop building
the "default" runner via make and rely on the builtin always.

* build: Make target improvements

Add a few new targets and help for building locally.
This also adjusts the runner lookup to favor local builds, then
runners relative to the executable, and finally payloads.

* Support customized CPU flags for runners

This implements a simplified custom CPU flags pattern for the runners.
When built without overrides, the runner name contains the vector flag
we check for (AVX) to ensure we don't try to run on unsupported systems
and crash.  If the user builds a customized set, we omit the naming
scheme and don't check for compatibility.  This avoids checking
requirements at runtime, so that logic has been removed as well.  This
can be used to build GPU runners with no vector flags, or CPU/GPU
runners with additional flags (e.g. AVX512) enabled.

* Use relative paths

If the user checks out the repo in a path that contains spaces, make gets
really confused so use relative paths for everything in-repo to avoid breakage.

* Remove payloads from main binary

* install: clean up prior libraries

This removes support for v0.3.6 and older versions (before the tar bundle)
and ensures we clean up prior libraries before extracting the bundle(s).
Without this change, runners and dependent libraries could leak when we
update and lead to subtle runtime errors.
2024-12-10 09:47:19 -08:00
frob
63269668c0 Prevent underflow when FreeMemory < overhead (#8014)
Co-authored-by: Richard Lyons <frob@cloudstaff.com>
2024-12-10 09:10:40 -08:00
Jesse Gross
900f64e6be prompt: Don't trim whitespace from prompts
New lines can be an important part of a user's prompt and trimming
it can alter the results. We previously only trimmed prompts with
images but refactoring brought this behavior to all prompts, where
it became more noticable.

The /generate endpoint adds less whitespace and therefore doesn't
need to trim it out - this brings the same behavior to /chat.

Thanks to @gabe-l-hart for spotting the issue!

Fixes #7795
2024-12-09 11:02:55 -08:00
Yannick Gloster
da09488fbf docs: remove comment regarding tool streaming in openai.md (#7960) 2024-12-07 22:16:21 -08:00
湛露先生
7f0ccc8a9d docs: fix syntax error in openai.md (#7986) 2024-12-07 22:14:36 -08:00
Parth Sareen
de52b6c2f9 bugfix: "null" value json mode (#7979) 2024-12-06 14:13:15 -08:00
Michael
acd7d03266 readme: add llama3.3 to readme (#7975)
readme: add llama3.3 to readme
2024-12-06 14:05:11 -05:00
Parth Sareen
f6e87fd628 docs: update readmes for structured outputs (#7962) 2024-12-06 10:35:37 -08:00
Jeffrey Morgan
aed1419c64 ci: skip go build for tests (#7899) 2024-12-04 21:22:36 -08:00
Parth Sareen
c6c526275d api: add generate endpoint for structured outputs (#7939) 2024-12-04 17:37:12 -08:00
Parth Sareen
630e7dc6ff api: structured outputs - chat endpoint (#7900)
Adds structured outputs to chat endpoint
---------

Co-authored-by: Michael Yang <mxyng@pm.me>
Co-authored-by: Hieu Nguyen <hieunguyen1053@outlook.com>
2024-12-04 16:31:19 -08:00
Michael Yang
eb8366d658 Merge pull request #7932 from ollama/mxyng/fix-merges 2024-12-04 10:04:52 -08:00
Michael Yang
4456012956 fix unmarshaling merges 2024-12-04 09:21:56 -08:00
Sam
539be43640 llm: normalise kvct parameter handling (#7926) 2024-12-03 16:30:40 -08:00
Sam
1bdab9fdb1 llm: introduce k/v context quantization (vRAM improvements) (#6279) 2024-12-03 15:57:19 -08:00
owboson
2b82c5a8a1 docs: correct default num_predict value in modelfile.md (#7693) 2024-12-03 15:00:05 -08:00
Tigran
55c3efa900 docs: remove extra quote in modelfile.md (#7908) 2024-12-02 09:28:56 -08:00
David Mayboroda
1aedffad93 readme: add minima to community integrations (#7906) 2024-12-02 01:14:47 -08:00
Jeffrey Morgan
ff6c2d6dc8 cmd: don't rely on reading repo file for test (#7898) 2024-11-30 14:12:53 -08:00
Jeffrey Morgan
d543b282a7 server: add warning message for deprecated context field (#7878) 2024-11-30 14:05:50 -08:00
Parth Sareen
5f8051180e Enable index tracking for tools - openai api support (#7888) 2024-11-29 20:00:09 -08:00
Jeffrey Morgan
39e29ae5dd llama: fix typo and formatting in readme (#7876) 2024-11-28 17:27:11 -08:00
TheCookingSenpai
30a9f063c9 readme: add SpaceLlama, YouLama, and DualMind to community integrations (#7216) 2024-11-28 15:16:27 -08:00
545 changed files with 77685 additions and 61988 deletions

View File

@@ -85,13 +85,12 @@ jobs:
import-module 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\Common7\Tools\Microsoft.VisualStudio.DevShell.dll'
Enter-VsDevShell -vsinstallpath 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise' -skipautomaticlocation -DevCmdArguments '-arch=x64 -no_logo'
if (!(gcc --version | select-string -quiet clang)) { throw "wrong gcc compiler detected - must be clang" }
make
make dist
name: make
- uses: actions/upload-artifact@v4
with:
name: generate-windows-cpu
path: |
build/**/*
dist/windows-amd64/**
# ROCm generation step
@@ -143,13 +142,12 @@ jobs:
import-module 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\Common7\Tools\Microsoft.VisualStudio.DevShell.dll'
Enter-VsDevShell -vsinstallpath 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise' -skipautomaticlocation -DevCmdArguments '-arch=x64 -no_logo'
if (!(gcc --version | select-string -quiet clang)) { throw "wrong gcc compiler detected - must be clang" }
make -C llama print-HIP_PATH print-HIP_LIB_DIR
make rocm
make help-runners
make dist_rocm
- uses: actions/upload-artifact@v4
with:
name: generate-windows-rocm
path: |
build/**/*
dist/windows-amd64/**
# CUDA generation step
@@ -226,12 +224,11 @@ jobs:
import-module 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\Common7\Tools\Microsoft.VisualStudio.DevShell.dll'
Enter-VsDevShell -vsinstallpath 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise' -skipautomaticlocation -DevCmdArguments '-arch=x64 -no_logo'
if (!(gcc --version | select-string -quiet clang)) { throw "wrong gcc compiler detected - must be clang" }
make cuda_v$(($env:CUDA_PATH | split-path -leaf) -replace 'v(\d+).*', '$1')
make dist_cuda_v$(($env:CUDA_PATH | split-path -leaf) -replace 'v(\d+).*', '$1')
- uses: actions/upload-artifact@v4
with:
name: generate-windows-cuda-${{ matrix.cuda.version }}
path: |
build/**/*
dist/windows-amd64/**
# windows arm64 generate, go build, and zip file (no installer)
@@ -450,20 +447,23 @@ jobs:
- uses: actions/download-artifact@v4
with:
name: generate-windows-cpu
path: dist/windows-amd64/
- uses: actions/download-artifact@v4
with:
name: generate-windows-cuda-11.3
path: dist/windows-amd64/
- uses: actions/download-artifact@v4
with:
name: generate-windows-cuda-12.4
path: dist/windows-amd64/
- uses: actions/download-artifact@v4
with:
name: generate-windows-rocm
path: dist/windows-amd64/
- uses: actions/download-artifact@v4
with:
name: windows-arm64
path: dist
- run: dir build
- run: |
import-module 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise\Common7\Tools\Microsoft.VisualStudio.DevShell.dll'
Enter-VsDevShell -vsinstallpath 'C:\Program Files (x86)\Microsoft Visual Studio\2019\Enterprise' -skipautomaticlocation -DevCmdArguments '-arch=x64 -no_logo'

View File

@@ -243,7 +243,7 @@ jobs:
$env:PATH="$gopath;$gccpath;$env:PATH"
echo $env:PATH
if (!(gcc --version | select-string -quiet clang)) { throw "wrong gcc compiler detected - must be clang" }
make -j 4
make -j 4
- name: 'Build Unix Go Runners'
if: ${{ ! startsWith(matrix.os, 'windows-') }}
run: make -j 4
@@ -269,6 +269,15 @@ jobs:
- uses: actions/checkout@v4
with:
submodules: recursive
- name: Add msys paths
if: ${{ startsWith(matrix.os, 'windows-') }}
run: |
echo "c:\msys64\usr\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
echo "C:\msys64\clang64\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
- name: Install msys2 tools
if: ${{ startsWith(matrix.os, 'windows-') }}
run: |
Start-Process "c:\msys64\usr\bin\pacman.exe" -ArgumentList @("-S", "--noconfirm", "mingw-w64-clang-x86_64-gcc-compat", "mingw-w64-clang-x86_64-clang") -NoNewWindow -Wait
- uses: actions/setup-go@v5
with:
go-version-file: go.mod
@@ -300,6 +309,15 @@ jobs:
- uses: actions/checkout@v4
with:
submodules: recursive
- name: Add msys paths
if: ${{ startsWith(matrix.os, 'windows-') }}
run: |
echo "c:\msys64\usr\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
echo "C:\msys64\clang64\bin" | Out-File -FilePath $env:GITHUB_PATH -Encoding utf8 -Append
- name: Install msys2 tools
if: ${{ startsWith(matrix.os, 'windows-') }}
run: |
Start-Process "c:\msys64\usr\bin\pacman.exe" -ArgumentList @("-S", "--noconfirm", "mingw-w64-clang-x86_64-gcc-compat", "mingw-w64-clang-x86_64-clang") -NoNewWindow -Wait
- uses: actions/setup-go@v5
with:
go-version-file: go.mod
@@ -310,8 +328,7 @@ jobs:
arm64) echo ARCH=arm64 ;;
esac >>$GITHUB_ENV
shell: bash
- run: go build
- run: go test -v ./...
- run: go test ./...
patches:
needs: [changes]
@@ -323,4 +340,4 @@ jobs:
submodules: recursive
- name: Verify patches carry all the changes
run: |
make apply-patches sync && git diff --compact-summary --exit-code llama
make apply-patches sync && git diff --compact-summary --exit-code llama

3
.gitignore vendored
View File

@@ -10,9 +10,6 @@ ollama
.idea
test_data
*.crt
llm/build
build/*/*/*
!build/**/placeholder
llama/build
__debug_bin*
llama/vendor

View File

@@ -8,8 +8,6 @@ linters:
- containedctx
- contextcheck
- errcheck
- exportloopref
- gci
- gocheckcompilerdirectives
- gofmt
- gofumpt
@@ -30,8 +28,6 @@ linters:
- wastedassign
- whitespace
linters-settings:
gci:
sections: [standard, default, localmodule]
staticcheck:
checks:
- all

View File

@@ -1,10 +0,0 @@
{
"trailingComma": "es5",
"tabWidth": 2,
"useTabs": false,
"semi": false,
"singleQuote": true,
"jsxSingleQuote": true,
"printWidth": 120,
"arrowParens": "avoid"
}

View File

@@ -1,9 +1,6 @@
ARG GOLANG_VERSION=1.22.8
ARG CMAKE_VERSION=3.22.1
ARG CUDA_VERSION_11=11.3.1
ARG CUDA_V11_ARCHITECTURES="50;52;53;60;61;62;70;72;75;80;86"
ARG CUDA_VERSION_12=12.4.0
ARG CUDA_V12_ARCHITECTURES="60;61;62;70;72;75;80;86;87;89;90;90a"
ARG ROCM_VERSION=6.1.2
ARG JETPACK_6=r36.2.0
ARG JETPACK_5=r35.4.1
@@ -15,24 +12,22 @@ ARG JETPACK_5=r35.4.1
#
### Then incremental builds will be much faster in this container
#
# make -j 10 && go build -trimpath -o dist/linux-amd64/ollama .
# make -j 10 dist
#
FROM --platform=linux/amd64 rocm/dev-centos-7:${ROCM_VERSION}-complete AS unified-builder-amd64
ARG CMAKE_VERSION
ARG GOLANG_VERSION
ARG CUDA_VERSION_11
ARG CUDA_VERSION_12
COPY ./scripts/rh_linux_deps.sh /
ENV PATH /opt/rh/devtoolset-10/root/usr/bin:/usr/local/cuda/bin:$PATH
ENV LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64
ENV LIBRARY_PATH=/usr/local/cuda/lib64/stubs:/opt/amdgpu/lib64
RUN CMAKE_VERSION=${CMAKE_VERSION} GOLANG_VERSION=${GOLANG_VERSION} sh /rh_linux_deps.sh
RUN GOLANG_VERSION=${GOLANG_VERSION} sh /rh_linux_deps.sh
RUN yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel7/x86_64/cuda-rhel7.repo && \
dnf clean all && \
dnf install -y \
zsh \
cuda-$(echo ${CUDA_VERSION_11} | cut -f1-2 -d. | sed -e "s/\./-/g") \
cuda-$(echo ${CUDA_VERSION_12} | cut -f1-2 -d. | sed -e "s/\./-/g")
cuda-toolkit-$(echo ${CUDA_VERSION_11} | cut -f1-2 -d. | sed -e "s/\./-/g") \
cuda-toolkit-$(echo ${CUDA_VERSION_12} | cut -f1-2 -d. | sed -e "s/\./-/g")
# TODO intel oneapi goes here...
ENV GOARCH amd64
ENV CGO_ENABLED 1
@@ -46,12 +41,11 @@ ENTRYPOINT [ "zsh" ]
# docker run --platform linux/arm64 --rm -it -v $(pwd):/go/src/github.com/ollama/ollama/ builder-arm64
#
FROM --platform=linux/arm64 rockylinux:8 AS unified-builder-arm64
ARG CMAKE_VERSION
ARG GOLANG_VERSION
ARG CUDA_VERSION_11
ARG CUDA_VERSION_12
COPY ./scripts/rh_linux_deps.sh /
RUN CMAKE_VERSION=${CMAKE_VERSION} GOLANG_VERSION=${GOLANG_VERSION} sh /rh_linux_deps.sh
RUN GOLANG_VERSION=${GOLANG_VERSION} sh /rh_linux_deps.sh
RUN yum-config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/rhel8/sbsa/cuda-rhel8.repo && \
dnf config-manager --set-enabled appstream && \
dnf clean all && \
@@ -62,37 +56,30 @@ RUN yum-config-manager --add-repo https://developer.download.nvidia.com/compute/
ENV PATH /opt/rh/gcc-toolset-10/root/usr/bin:$PATH:/usr/local/cuda/bin
ENV LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64
ENV LIBRARY_PATH=/usr/local/cuda/lib64/stubs:/opt/amdgpu/lib64
ENV GOARCH amd64
ENV GOARCH arm64
ENV CGO_ENABLED 1
WORKDIR /go/src/github.com/ollama/ollama/
ENTRYPOINT [ "zsh" ]
FROM --platform=linux/amd64 unified-builder-amd64 AS runners-amd64
FROM --platform=linux/amd64 unified-builder-amd64 AS build-amd64
COPY . .
ARG OLLAMA_SKIP_CUDA_GENERATE
ARG OLLAMA_SKIP_CUDA_11_GENERATE
ARG OLLAMA_SKIP_CUDA_12_GENERATE
ARG OLLAMA_SKIP_ROCM_GENERATE
ARG CUDA_V11_ARCHITECTURES
ARG CUDA_V12_ARCHITECTURES
ARG OLLAMA_FAST_BUILD
ARG VERSION
ARG CUSTOM_CPU_FLAGS
RUN --mount=type=cache,target=/root/.ccache \
if grep "^flags" /proc/cpuinfo|grep avx>/dev/null; then \
make -j $(expr $(nproc) / 2 ) ; \
make -j $(nproc) dist ; \
else \
make -j 5 ; \
make -j 5 dist ; \
fi
RUN cd dist/linux-$GOARCH && \
tar -cf - . | pigz --best > ../ollama-linux-$GOARCH.tgz
RUN if [ -z ${OLLAMA_SKIP_ROCM_GENERATE} ] ; then \
cd dist/linux-$GOARCH-rocm && \
tar -cf - . | pigz --best > ../ollama-linux-$GOARCH-rocm.tgz ;\
fi
FROM --platform=linux/arm64 unified-builder-arm64 AS runners-arm64
COPY . .
ARG OLLAMA_SKIP_CUDA_GENERATE
ARG OLLAMA_SKIP_CUDA_11_GENERATE
ARG OLLAMA_SKIP_CUDA_12_GENERATE
ARG CUDA_V11_ARCHITECTURES
ARG CUDA_V12_ARCHITECTURES
ARG OLLAMA_FAST_BUILD
RUN --mount=type=cache,target=/root/.ccache \
make -j 5
# Jetsons need to be built in discrete stages
FROM --platform=linux/arm64 nvcr.io/nvidia/l4t-jetpack:${JETPACK_5} AS runners-jetpack5-arm64
@@ -106,11 +93,11 @@ WORKDIR /go/src/github.com/ollama/ollama/
COPY . .
ARG CGO_CFLAGS
ENV GOARCH arm64
ARG VERSION
RUN --mount=type=cache,target=/root/.ccache \
make -j 5 cuda_v11 \
make -j 5 dist_cuda_v11 \
CUDA_ARCHITECTURES="72;87" \
GPU_RUNNER_VARIANT=_jetpack5 \
CGO_EXTRA_LDFLAGS_LINUX=-L/usr/local/cuda/lib64/stubs \
DIST_LIB_DIR=/go/src/github.com/ollama/ollama/dist/linux-arm64-jetpack5/lib/ollama \
DIST_GPU_RUNNER_DEPS_DIR=/go/src/github.com/ollama/ollama/dist/linux-arm64-jetpack5/lib/ollama/cuda_jetpack5
@@ -125,70 +112,29 @@ WORKDIR /go/src/github.com/ollama/ollama/
COPY . .
ARG CGO_CFLAGS
ENV GOARCH arm64
ARG VERSION
RUN --mount=type=cache,target=/root/.ccache \
make -j 5 cuda_v12 \
make -j 5 dist_cuda_v12 \
CUDA_ARCHITECTURES="87" \
GPU_RUNNER_VARIANT=_jetpack6 \
CGO_EXTRA_LDFLAGS_LINUX=-L/usr/local/cuda/lib64/stubs \
DIST_LIB_DIR=/go/src/github.com/ollama/ollama/dist/linux-arm64-jetpack6/lib/ollama \
DIST_GPU_RUNNER_DEPS_DIR=/go/src/github.com/ollama/ollama/dist/linux-arm64-jetpack6/lib/ollama/cuda_jetpack6
# Intermediate stages used for ./scripts/build_linux.sh
FROM --platform=linux/amd64 centos:7 AS builder-amd64
ARG CMAKE_VERSION
ARG GOLANG_VERSION
COPY ./scripts/rh_linux_deps.sh /
RUN CMAKE_VERSION=${CMAKE_VERSION} GOLANG_VERSION=${GOLANG_VERSION} sh /rh_linux_deps.sh
ENV PATH /opt/rh/devtoolset-10/root/usr/bin:$PATH
ENV CGO_ENABLED 1
ENV GOARCH amd64
WORKDIR /go/src/github.com/ollama/ollama
FROM --platform=linux/amd64 builder-amd64 AS build-amd64
FROM --platform=linux/arm64 unified-builder-arm64 AS build-arm64
COPY . .
COPY --from=runners-amd64 /go/src/github.com/ollama/ollama/dist/ dist/
COPY --from=runners-amd64 /go/src/github.com/ollama/ollama/build/ build/
ARG GOFLAGS
ARG CGO_CFLAGS
ARG OLLAMA_SKIP_ROCM_GENERATE
ARG OLLAMA_SKIP_CUDA_GENERATE
ARG OLLAMA_FAST_BUILD
ARG VERSION
RUN --mount=type=cache,target=/root/.ccache \
go build -trimpath -o dist/linux-amd64/bin/ollama .
RUN cd dist/linux-$GOARCH && \
tar --exclude runners -cf - . | pigz --best > ../ollama-linux-$GOARCH.tgz
RUN if [ -z ${OLLAMA_SKIP_ROCM_GENERATE} ] ; then \
cd dist/linux-$GOARCH-rocm && \
tar -cf - . | pigz --best > ../ollama-linux-$GOARCH-rocm.tgz ;\
fi
FROM --platform=linux/arm64 rockylinux:8 AS builder-arm64
ARG CMAKE_VERSION
ARG GOLANG_VERSION
COPY ./scripts/rh_linux_deps.sh /
RUN CMAKE_VERSION=${CMAKE_VERSION} GOLANG_VERSION=${GOLANG_VERSION} sh /rh_linux_deps.sh
ENV PATH /opt/rh/gcc-toolset-10/root/usr/bin:$PATH
ENV CGO_ENABLED 1
ENV GOARCH arm64
WORKDIR /go/src/github.com/ollama/ollama
FROM --platform=linux/arm64 builder-arm64 AS build-arm64
COPY . .
COPY --from=runners-arm64 /go/src/github.com/ollama/ollama/dist/ dist/
COPY --from=runners-arm64 /go/src/github.com/ollama/ollama/build/ build/
make -j 5 dist
COPY --from=runners-jetpack5-arm64 /go/src/github.com/ollama/ollama/dist/ dist/
COPY --from=runners-jetpack5-arm64 /go/src/github.com/ollama/ollama/build/ build/
COPY --from=runners-jetpack6-arm64 /go/src/github.com/ollama/ollama/dist/ dist/
COPY --from=runners-jetpack6-arm64 /go/src/github.com/ollama/ollama/build/ build/
ARG GOFLAGS
ARG CGO_CFLAGS
RUN --mount=type=cache,target=/root/.ccache \
go build -trimpath -o dist/linux-arm64/bin/ollama .
RUN cd dist/linux-$GOARCH && \
tar --exclude runners -cf - . | pigz --best > ../ollama-linux-$GOARCH.tgz
tar -cf - . | pigz --best > ../ollama-linux-$GOARCH.tgz
RUN cd dist/linux-$GOARCH-jetpack5 && \
tar --exclude runners -cf - . | pigz --best > ../ollama-linux-$GOARCH-jetpack5.tgz
tar -cf - . | pigz --best > ../ollama-linux-$GOARCH-jetpack5.tgz
RUN cd dist/linux-$GOARCH-jetpack6 && \
tar --exclude runners -cf - . | pigz --best > ../ollama-linux-$GOARCH-jetpack6.tgz
tar -cf - . | pigz --best > ../ollama-linux-$GOARCH-jetpack6.tgz
FROM --platform=linux/amd64 scratch AS dist-amd64
COPY --from=build-amd64 /go/src/github.com/ollama/ollama/dist/ollama-linux-*.tgz /
@@ -197,30 +143,13 @@ COPY --from=build-arm64 /go/src/github.com/ollama/ollama/dist/ollama-linux-*.tgz
FROM dist-$TARGETARCH AS dist
# Optimized container images do not cary nested payloads
FROM --platform=linux/amd64 builder-amd64 AS container-build-amd64
WORKDIR /go/src/github.com/ollama/ollama
COPY . .
ARG GOFLAGS
ARG CGO_CFLAGS
RUN --mount=type=cache,target=/root/.ccache \
go build -trimpath -o dist/linux-amd64/bin/ollama .
FROM --platform=linux/arm64 builder-arm64 AS container-build-arm64
WORKDIR /go/src/github.com/ollama/ollama
COPY . .
ARG GOFLAGS
ARG CGO_CFLAGS
RUN --mount=type=cache,target=/root/.ccache \
go build -trimpath -o dist/linux-arm64/bin/ollama .
# For amd64 container images, filter out cuda/rocm to minimize size
FROM runners-amd64 AS runners-cuda-amd64
FROM build-amd64 AS runners-cuda-amd64
RUN rm -rf \
./dist/linux-amd64/lib/ollama/libggml_hipblas.so \
./dist/linux-amd64/lib/ollama/runners/rocm*
FROM runners-amd64 AS runners-rocm-amd64
FROM build-amd64 AS runners-rocm-amd64
RUN rm -rf \
./dist/linux-amd64/lib/ollama/libggml_cuda*.so \
./dist/linux-amd64/lib/ollama/libcu*.so* \
@@ -230,15 +159,15 @@ FROM --platform=linux/amd64 ubuntu:22.04 AS runtime-amd64
RUN apt-get update && \
apt-get install -y ca-certificates && \
apt-get clean && rm -rf /var/lib/apt/lists/*
COPY --from=container-build-amd64 /go/src/github.com/ollama/ollama/dist/linux-amd64/bin/ /bin/
COPY --from=build-amd64 /go/src/github.com/ollama/ollama/dist/linux-amd64/bin/ /bin/
COPY --from=runners-cuda-amd64 /go/src/github.com/ollama/ollama/dist/linux-amd64/lib/ /lib/
FROM --platform=linux/arm64 ubuntu:22.04 AS runtime-arm64
RUN apt-get update && \
apt-get install -y ca-certificates && \
apt-get clean && rm -rf /var/lib/apt/lists/*
COPY --from=container-build-arm64 /go/src/github.com/ollama/ollama/dist/linux-arm64/bin/ /bin/
COPY --from=runners-arm64 /go/src/github.com/ollama/ollama/dist/linux-arm64/lib/ /lib/
COPY --from=build-arm64 /go/src/github.com/ollama/ollama/dist/linux-arm64/bin/ /bin/
COPY --from=build-arm64 /go/src/github.com/ollama/ollama/dist/linux-arm64/lib/ /lib/
COPY --from=runners-jetpack5-arm64 /go/src/github.com/ollama/ollama/dist/linux-arm64-jetpack5/lib/ /lib/
COPY --from=runners-jetpack6-arm64 /go/src/github.com/ollama/ollama/dist/linux-arm64-jetpack6/lib/ /lib/
@@ -251,7 +180,7 @@ COPY --from=build-amd64 /go/src/github.com/ollama/ollama/dist/linux-amd64-rocm/l
RUN apt-get update && \
apt-get install -y ca-certificates && \
apt-get clean && rm -rf /var/lib/apt/lists/*
COPY --from=container-build-amd64 /go/src/github.com/ollama/ollama/dist/linux-amd64/bin/ /bin/
COPY --from=build-amd64 /go/src/github.com/ollama/ollama/dist/linux-amd64/bin/ /bin/
COPY --from=runners-rocm-amd64 /go/src/github.com/ollama/ollama/dist/linux-amd64/lib/ /lib/
EXPOSE 11434

107
Makefile
View File

@@ -1,4 +1,103 @@
GOALS := $(or $(MAKECMDGOALS),all)
.PHONY: $(GOALS)
$(GOALS):
$(MAKE) -C llama $@
# top level makefile for Ollama
include make/common-defs.make
# Determine which if any GPU runners we should build
include make/cuda-v11-defs.make
include make/cuda-v12-defs.make
include make/rocm-defs.make
ifeq ($(CUSTOM_CPU_FLAGS),)
ifeq ($(ARCH),amd64)
RUNNER_TARGETS=cpu
endif
# Without CUSTOM_CPU_FLAGS we default to build both v11 and v12 if present
ifeq ($(OLLAMA_SKIP_CUDA_GENERATE),)
ifneq ($(CUDA_11_COMPILER),)
RUNNER_TARGETS += cuda_v11
endif
ifneq ($(CUDA_12_COMPILER),)
RUNNER_TARGETS += cuda_v12
endif
endif
else # CUSTOM_CPU_FLAGS is set, we'll build only the latest cuda version detected
ifneq ($(CUDA_12_COMPILER),)
RUNNER_TARGETS += cuda_v12
else ifneq ($(CUDA_11_COMPILER),)
RUNNER_TARGETS += cuda_v11
endif
endif
ifeq ($(OLLAMA_SKIP_ROCM_GENERATE),)
ifneq ($(HIP_COMPILER),)
RUNNER_TARGETS += rocm
endif
endif
all: runners exe
dist: $(addprefix dist_, $(RUNNER_TARGETS)) dist_exe
dist_%:
@$(MAKE) --no-print-directory -f make/Makefile.$* dist
runners: $(RUNNER_TARGETS)
$(RUNNER_TARGETS):
@$(MAKE) --no-print-directory -f make/Makefile.$@
exe dist_exe:
@$(MAKE) --no-print-directory -f make/Makefile.ollama $@
help-sync apply-patches create-patches sync sync-clean:
@$(MAKE) --no-print-directory -f make/Makefile.sync $@
test integration lint:
@$(MAKE) --no-print-directory -f make/Makefile.test $@
clean:
rm -rf $(BUILD_DIR) $(DIST_LIB_DIR) $(OLLAMA_EXE) $(DIST_OLLAMA_EXE)
go clean -cache
help:
@echo "The following make targets will help you build Ollama"
@echo ""
@echo " make all # (default target) Build Ollama llm subprocess runners, and the primary ollama executable"
@echo " make runners # Build Ollama llm subprocess runners; after you may use 'go build .' to build the primary ollama exectuable"
@echo " make <runner> # Build specific runners. Enabled: '$(RUNNER_TARGETS)'"
@echo " make dist # Build the runners and primary ollama executable for distribution"
@echo " make help-sync # Help information on vendor update targets"
@echo " make help-runners # Help information on runner targets"
@echo ""
@echo "The following make targets will help you test Ollama"
@echo ""
@echo " make test # Run unit tests"
@echo " make integration # Run integration tests. You must 'make all' first"
@echo " make lint # Run lint and style tests"
@echo ""
@echo "For more information see 'docs/development.md'"
@echo ""
help-runners:
@echo "The following runners will be built based on discovered GPU libraries: '$(RUNNER_TARGETS)'"
@echo ""
@echo "GPU Runner CPU Flags: '$(GPU_RUNNER_CPU_FLAGS)' (Override with CUSTOM_CPU_FLAGS)"
@echo ""
@echo "# CUDA_PATH sets the location where CUDA toolkits are present"
@echo "CUDA_PATH=$(CUDA_PATH)"
@echo " CUDA_11_PATH=$(CUDA_11_PATH)"
@echo " CUDA_11_COMPILER=$(CUDA_11_COMPILER)"
@echo " CUDA_12_PATH=$(CUDA_12_PATH)"
@echo " CUDA_12_COMPILER=$(CUDA_12_COMPILER)"
@echo ""
@echo "# HIP_PATH sets the location where the ROCm toolkit is present"
@echo "HIP_PATH=$(HIP_PATH)"
@echo " HIP_COMPILER=$(HIP_COMPILER)"
.PHONY: all exe dist help help-sync help-runners test integration lint runners clean $(RUNNER_TARGETS)
# Handy debugging for make variables
print-%:
@echo '$*=$($*)'

View File

@@ -1,11 +1,11 @@
<div align="center">
 <img alt="ollama" height="200px" src="https://github.com/ollama/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7">
  <a href="https://ollama.com" />
<img alt="ollama" height="200px" src="https://github.com/ollama/ollama/assets/3325447/0d0b44e2-8f4a-4e99-9b52-a5c1c741c8f7">
</a>
</div>
# Ollama
[![Discord](https://dcbadge.vercel.app/api/server/ollama?style=flat&compact=true)](https://discord.gg/ollama)
Get up and running with large language models.
### macOS
@@ -33,6 +33,11 @@ The official [Ollama Docker image](https://hub.docker.com/r/ollama/ollama) `olla
- [ollama-python](https://github.com/ollama/ollama-python)
- [ollama-js](https://github.com/ollama/ollama-js)
### Community
- [Discord](https://discord.gg/ollama)
- [Reddit](https://reddit.com/r/ollama)
## Quickstart
To run and chat with [Llama 3.2](https://ollama.com/library/llama3.2):
@@ -49,15 +54,15 @@ Here are some example models that can be downloaded:
| Model | Parameters | Size | Download |
| ------------------ | ---------- | ----- | -------------------------------- |
| Llama 3.3 | 70B | 43GB | `ollama run llama3.3` |
| Llama 3.2 | 3B | 2.0GB | `ollama run llama3.2` |
| Llama 3.2 | 1B | 1.3GB | `ollama run llama3.2:1b` |
| Llama 3.2 Vision | 11B | 7.9GB | `ollama run llama3.2-vision` |
| Llama 3.2 Vision | 90B | 55GB | `ollama run llama3.2-vision:90b` |
| Llama 3.1 | 8B | 4.7GB | `ollama run llama3.1` |
| Llama 3.1 | 70B | 40GB | `ollama run llama3.1:70b` |
| Llama 3.1 | 405B | 231GB | `ollama run llama3.1:405b` |
| Phi 4 | 14B | 9.1GB | `ollama run phi4` |
| Phi 3 Mini | 3.8B | 2.3GB | `ollama run phi3` |
| Phi 3 Medium | 14B | 7.9GB | `ollama run phi3:medium` |
| Gemma 2 | 2B | 1.6GB | `ollama run gemma2:2b` |
| Gemma 2 | 9B | 5.5GB | `ollama run gemma2` |
| Gemma 2 | 27B | 16GB | `ollama run gemma2:27b` |
@@ -97,7 +102,7 @@ Ollama supports importing GGUF models in the Modelfile:
ollama run example
```
### Import from PyTorch or Safetensors
### Import from Safetensors
See the [guide](docs/import.md) on importing models for more information.
@@ -132,7 +137,7 @@ ollama run mario
Hello! It's your friend Mario.
```
For more examples, see the [examples](examples) directory. For more information on working with a Modelfile, see the [Modelfile](docs/modelfile.md) documentation.
For more information on working with a Modelfile, see the [Modelfile](docs/modelfile.md) documentation.
## CLI Reference
@@ -298,6 +303,7 @@ See the [API documentation](./docs/api.md) for all endpoints.
- [AnythingLLM (Docker + MacOs/Windows/Linux native app)](https://github.com/Mintplex-Labs/anything-llm)
- [Ollama Basic Chat: Uses HyperDiv Reactive UI](https://github.com/rapidarchitect/ollama_basic_chat)
- [Ollama-chats RPG](https://github.com/drazdra/ollama-chats)
- [IntelliBar](https://intellibar.app/) (AI-powered assistant for macOS)
- [QA-Pilot](https://github.com/reid41/QA-Pilot) (Interactive chat tool that can leverage Ollama models for rapid understanding and navigation of GitHub code repositories)
- [ChatOllama](https://github.com/sugarforever/chat-ollama) (Open Source Chatbot based on Ollama with Knowledge Bases)
- [CRAG Ollama Chat](https://github.com/Nagi-ovo/CRAG-Ollama-Chat) (Simple Web Search with Corrective RAG)
@@ -327,6 +333,7 @@ See the [API documentation](./docs/api.md) for all endpoints.
- [BoltAI for Mac](https://boltai.com) (AI Chat Client for Mac)
- [Harbor](https://github.com/av/harbor) (Containerized LLM Toolkit with Ollama as default backend)
- [PyGPT](https://github.com/szczyglis-dev/py-gpt) (AI desktop assistant for Linux, Windows and Mac)
- [Alpaca](https://github.com/Jeffser/Alpaca) (An Ollama client application for linux and macos made with GTK4 and Adwaita)
- [AutoGPT](https://github.com/Significant-Gravitas/AutoGPT/blob/master/docs/content/platform/ollama.md) (AutoGPT Ollama integration)
- [Go-CREW](https://www.jonathanhecl.com/go-crew/) (Powerful Offline RAG in Golang)
- [PartCAD](https://github.com/openvmp/partcad/) (CAD model generation with OpenSCAD and CadQuery)
@@ -346,6 +353,9 @@ See the [API documentation](./docs/api.md) for all endpoints.
- [Web management](https://github.com/lemonit-eric-mao/ollama-web-management) (Web management page)
- [Promptery](https://github.com/promptery/promptery) (desktop client for Ollama.)
- [Ollama App](https://github.com/JHubi1/ollama-app) (Modern and easy-to-use multi-platform client for Ollama)
- [SpaceLlama](https://github.com/tcsenpai/spacellama) (Firefox and Chrome extension to quickly summarize web pages with ollama in a sidebar)
- [YouLama](https://github.com/tcsenpai/youlama) (Webapp to quickly summarize any YouTube video, supporting Invidious as well)
- [DualMind](https://github.com/tcsenpai/dualmind) (Experimental app allowing two models to talk to each other in the terminal or in a web interface)
- [ollamarama-matrix](https://github.com/h1ddenpr0cess20/ollamarama-matrix) (Ollama chatbot for the Matrix chat protocol)
- [ollama-chat-app](https://github.com/anan1213095357/ollama-chat-app) (Flutter-based chat app)
- [Perfect Memory AI](https://www.perfectmemory.ai/) (Productivity AI assists personalized by what you have seen on your screen, heard and said in the meetings)
@@ -354,8 +364,12 @@ See the [API documentation](./docs/api.md) for all endpoints.
- [OpenTalkGpt](https://github.com/adarshM84/OpenTalkGpt) (Chrome Extension to manage open-source models supported by Ollama, create custom models, and chat with models from a user-friendly UI)
- [VT](https://github.com/vinhnx/vt.ai) (A minimal multimodal AI chat app, with dynamic conversation routing. Supports local models via Ollama)
- [Nosia](https://github.com/nosia-ai/nosia) (Easy to install and use RAG platform based on Ollama)
- [Witsy](https://github.com/nbonamy/witsy) (An AI Desktop application avaiable for Mac/Windows/Linux)
- [Witsy](https://github.com/nbonamy/witsy) (An AI Desktop application available for Mac/Windows/Linux)
- [Abbey](https://github.com/US-Artificial-Intelligence/abbey) (A configurable AI interface server with notebooks, document storage, and YouTube support)
- [Minima](https://github.com/dmayboroda/minima) (RAG with on-premises or fully local workflow)
- [aidful-ollama-model-delete](https://github.com/AidfulAI/aidful-ollama-model-delete) (User interface for simplified model cleanup)
- [Perplexica](https://github.com/ItzCrazyKns/Perplexica) (An AI-powered search engine & an open-source alternative to Perplexity AI)
- [AI Toolkit for Visual Studio Code](https://aka.ms/ai-tooklit/ollama-docs) (Microsoft-official VSCode extension to chat, test, evaluate models with Ollama support, and use them in your AI applications.)
### Cloud
@@ -368,6 +382,7 @@ See the [API documentation](./docs/api.md) for all endpoints.
- [oterm](https://github.com/ggozad/oterm)
- [Ellama Emacs client](https://github.com/s-kostyaev/ellama)
- [Emacs client](https://github.com/zweifisch/ollama)
- [neollama](https://github.com/paradoxical-dev/neollama) UI client for interacting with models from within Neovim
- [gen.nvim](https://github.com/David-Kunz/gen.nvim)
- [ollama.nvim](https://github.com/nomnivore/ollama.nvim)
- [ollero.nvim](https://github.com/marco-souza/ollero.nvim)
@@ -402,8 +417,11 @@ See the [API documentation](./docs/api.md) for all endpoints.
### Database
- [pgai](https://github.com/timescale/pgai) - PostgreSQL as a vector database (Create and search embeddings from Ollama models using pgvector)
- [Get started guide](https://github.com/timescale/pgai/blob/main/docs/vectorizer-quick-start.md)
- [MindsDB](https://github.com/mindsdb/mindsdb/blob/staging/mindsdb/integrations/handlers/ollama_handler/README.md) (Connects Ollama models with nearly 200 data platforms and apps)
- [chromem-go](https://github.com/philippgille/chromem-go/blob/v0.5.0/embed_ollama.go) with [example](https://github.com/philippgille/chromem-go/tree/v0.5.0/examples/rag-wikipedia-ollama)
- [Kangaroo](https://github.com/dbkangaroo/kangaroo) (AI-powered SQL client and admin tool for popular databases)
### Package managers
@@ -419,10 +437,12 @@ See the [API documentation](./docs/api.md) for all endpoints.
- [LangChain](https://python.langchain.com/docs/integrations/llms/ollama) and [LangChain.js](https://js.langchain.com/docs/integrations/chat/ollama/) with [example](https://js.langchain.com/docs/tutorials/local_rag/)
- [Firebase Genkit](https://firebase.google.com/docs/genkit/plugins/ollama)
- [crewAI](https://github.com/crewAIInc/crewAI)
- [Yacana](https://remembersoftwares.github.io/yacana/) (User-friendly multi-agent framework for brainstorming and executing predetermined flows with built-in tool integration)
- [Spring AI](https://github.com/spring-projects/spring-ai) with [reference](https://docs.spring.io/spring-ai/reference/api/chat/ollama-chat.html) and [example](https://github.com/tzolov/ollama-tools)
- [LangChainGo](https://github.com/tmc/langchaingo/) with [example](https://github.com/tmc/langchaingo/tree/main/examples/ollama-completion-example)
- [LangChain4j](https://github.com/langchain4j/langchain4j) with [example](https://github.com/langchain4j/langchain4j-examples/tree/main/ollama-examples/src/main/java)
- [LangChainRust](https://github.com/Abraxas-365/langchain-rust) with [example](https://github.com/Abraxas-365/langchain-rust/blob/main/examples/llm_ollama.rs)
- [LangChain for .NET](https://github.com/tryAGI/LangChain) with [example](https://github.com/tryAGI/LangChain/blob/main/examples/LangChain.Samples.OpenAI/Program.cs)
- [LLPhant](https://github.com/theodo-group/LLPhant?tab=readme-ov-file#ollama)
- [LlamaIndex](https://docs.llamaindex.ai/en/stable/examples/llm/ollama/) and [LlamaIndexTS](https://ts.llamaindex.ai/modules/llms/available_llms/ollama)
- [LiteLLM](https://github.com/BerriAI/litellm)
@@ -462,6 +482,7 @@ See the [API documentation](./docs/api.md) for all endpoints.
- [GoLamify](https://github.com/prasad89/golamify)
- [Ollama for Haskell](https://github.com/tusharad/ollama-haskell)
- [multi-llm-ts](https://github.com/nbonamy/multi-llm-ts) (A Typescript/JavaScript library allowing access to different LLM in unified API)
- [LlmTornado](https://github.com/lofcz/llmtornado) (C# library providing a unified interface for major FOSS & Commercial inference APIs)
### Mobile
@@ -508,8 +529,10 @@ See the [API documentation](./docs/api.md) for all endpoints.
- [LSP-AI](https://github.com/SilasMarvin/lsp-ai) (Open-source language server for AI-powered functionality)
- [QodeAssist](https://github.com/Palm1r/QodeAssist) (AI-powered coding assistant plugin for Qt Creator)
- [Obsidian Quiz Generator plugin](https://github.com/ECuiDev/obsidian-quiz-generator)
- [AI Summmary Helper plugin](https://github.com/philffm/ai-summary-helper)
- [TextCraft](https://github.com/suncloudsmoon/TextCraft) (Copilot in Word alternative using Ollama)
- [Alfred Ollama](https://github.com/zeitlings/alfred-ollama) (Alfred Workflow)
- [TextLLaMA](https://github.com/adarshM84/TextLLaMA) A Chrome Extension that helps you write emails, correct grammar, and translate into any language
### Supported backends
@@ -518,4 +541,5 @@ See the [API documentation](./docs/api.md) for all endpoints.
### Observability
- [OpenLIT](https://github.com/openlit/openlit) is an OpenTelemetry-native tool for monitoring Ollama Applications & GPUs using traces and metrics.
- [HoneyHive](https://docs.honeyhive.ai/integrations/ollama) is an AI observability and evaluation platform for AI agents. Use HoneyHive to evaluate agent performance, interrogate failures, and monitor quality in production.
- [HoneyHive](https://docs.honeyhive.ai/integrations/ollama) is an AI observability and evaluation platform for AI agents. Use HoneyHive to evaluate agent performance, interrogate failures, and monitor quality in production.
- [Langfuse](https://langfuse.com/docs/integrations/ollama) is an open source LLM observability platform that enables teams to collaboratively monitor, evaluate and debug AI applications.

17
api/examples/README.md Normal file
View File

@@ -0,0 +1,17 @@
# Ollama API Examples
Run the examples in this directory with:
```
go run example_name/main.go
```
## Chat - Chat with a model
- [chat/main.go](chat/main.go)
## Generate - Generate text from a model
- [generate/main.go](generate/main.go)
- [generate-streaming/main.go](generate-streaming/main.go)
## Pull - Pull a model
- [pull-progress/main.go](pull-progress/main.go)

View File

@@ -67,7 +67,7 @@ type GenerateRequest struct {
Raw bool `json:"raw,omitempty"`
// Format specifies the format to return a response in.
Format string `json:"format"`
Format json.RawMessage `json:"format,omitempty"`
// KeepAlive controls how long the model will stay loaded in memory following
// this request.
@@ -94,7 +94,7 @@ type ChatRequest struct {
Stream *bool `json:"stream,omitempty"`
// Format is the format to return the response in (e.g. "json").
Format string `json:"format"`
Format json.RawMessage `json:"format,omitempty"`
// KeepAlive controls how long the model will stay loaded into memory
// following the request.
@@ -146,6 +146,7 @@ type ToolCall struct {
}
type ToolCallFunction struct {
Index int `json:"index,omitempty"`
Name string `json:"name"`
Arguments ToolCallFunctionArguments `json:"arguments"`
}
@@ -215,7 +216,6 @@ type Options struct {
TopK int `json:"top_k,omitempty"`
TopP float32 `json:"top_p,omitempty"`
MinP float32 `json:"min_p,omitempty"`
TFSZ float32 `json:"tfs_z,omitempty"`
TypicalP float32 `json:"typical_p,omitempty"`
RepeatLastN int `json:"repeat_last_n,omitempty"`
Temperature float32 `json:"temperature,omitempty"`
@@ -225,7 +225,6 @@ type Options struct {
Mirostat int `json:"mirostat,omitempty"`
MirostatTau float32 `json:"mirostat_tau,omitempty"`
MirostatEta float32 `json:"mirostat_eta,omitempty"`
PenalizeNewline bool `json:"penalize_newline,omitempty"`
Stop []string `json:"stop,omitempty"`
}
@@ -295,17 +294,21 @@ type EmbeddingResponse struct {
// CreateRequest is the request passed to [Client.Create].
type CreateRequest struct {
Model string `json:"model"`
Modelfile string `json:"modelfile"`
Stream *bool `json:"stream,omitempty"`
Quantize string `json:"quantize,omitempty"`
Model string `json:"model"`
Stream *bool `json:"stream,omitempty"`
Quantize string `json:"quantize,omitempty"`
From string `json:"from,omitempty"`
Files map[string]string `json:"files,omitempty"`
Adapters map[string]string `json:"adapters,omitempty"`
Template string `json:"template,omitempty"`
License any `json:"license,omitempty"`
System string `json:"system,omitempty"`
Parameters map[string]any `json:"parameters,omitempty"`
Messages []Message `json:"messages,omitempty"`
// Deprecated: set the model name with Model instead
Name string `json:"name"`
// Deprecated: set the file content with Modelfile instead
Path string `json:"path"`
// Deprecated: use Quantize instead
Quantization string `json:"quantization,omitempty"`
}
@@ -594,7 +597,6 @@ func DefaultOptions() Options {
Temperature: 0.8,
TopK: 40,
TopP: 0.9,
TFSZ: 1.0,
TypicalP: 1.0,
RepeatLastN: 64,
RepeatPenalty: 1.1,
@@ -603,7 +605,6 @@ func DefaultOptions() Options {
Mirostat: 0,
MirostatTau: 5.0,
MirostatEta: 0.1,
PenalizeNewline: true,
Seed: -1,
Runner: Runner{

View File

@@ -97,7 +97,6 @@ Source: "..\dist\windows-amd64\lib\ollama\*"; DestDir: "{app}\lib\ollama\"; Chec
Source: "..\dist\windows-arm64\vc_redist.arm64.exe"; DestDir: "{tmp}"; Check: IsArm64() and vc_redist_needed(); Flags: deleteafterinstall
Source: "..\dist\windows-arm64-app.exe"; DestDir: "{app}"; DestName: "{#MyAppExeName}" ;Check: IsArm64(); Flags: ignoreversion 64bit
Source: "..\dist\windows-arm64\ollama.exe"; DestDir: "{app}"; Check: IsArm64(); Flags: ignoreversion 64bit
Source: "..\dist\windows-arm64\lib\ollama\*"; DestDir: "{app}\lib\ollama\"; Check: IsArm64(); Flags: ignoreversion 64bit recursesubdirs
#endif
Source: "..\dist\ollama_welcome.ps1"; DestDir: "{app}"; Flags: ignoreversion

View File

@@ -98,7 +98,7 @@ func (t *winTray) wndProc(hWnd windows.Handle, message uint32, wParam, lParam ui
}
err = t.wcex.unregister()
if err != nil {
slog.Error(fmt.Sprintf("failed to uregister windo %s", err))
slog.Error(fmt.Sprintf("failed to unregister window %s", err))
}
case WM_DESTROY:
// same as WM_ENDSESSION, but throws 0 exit code after all

View File

@@ -1 +0,0 @@
This is here to make sure the build/ directory exists for the go:embed command

View File

@@ -1 +0,0 @@
This is here to make sure the build/ directory exists for the go:embed command

View File

@@ -1,8 +0,0 @@
package build
import "embed"
// Darwin payloads separated by architecture to avoid duplicate payloads when cross compiling
//go:embed darwin/amd64/*
var EmbedFS embed.FS

View File

@@ -1,8 +0,0 @@
package build
import "embed"
// Darwin payloads separated by architecture to avoid duplicate payloads when cross compiling
//go:embed darwin/arm64/*
var EmbedFS embed.FS

View File

@@ -1,6 +0,0 @@
package build
import "embed"
//go:embed linux/*
var EmbedFS embed.FS

View File

@@ -1,8 +0,0 @@
//go:build !linux && !darwin
package build
import "embed"
// unused on windows
var EmbedFS embed.FS

View File

@@ -1 +0,0 @@
This is here to make sure the build/ directory exists for the go:embed command

View File

@@ -1 +0,0 @@
This is here to make sure the build/ directory exists for the go:embed command

View File

@@ -1,13 +1,11 @@
package cmd
import (
"archive/zip"
"bufio"
"bytes"
"context"
"crypto/ed25519"
"crypto/rand"
"crypto/sha256"
"encoding/json"
"encoding/pem"
"errors"
"fmt"
@@ -36,6 +34,8 @@ import (
"github.com/ollama/ollama/api"
"github.com/ollama/ollama/envconfig"
"github.com/ollama/ollama/format"
"github.com/ollama/ollama/llama"
"github.com/ollama/ollama/llama/runner"
"github.com/ollama/ollama/parser"
"github.com/ollama/ollama/progress"
"github.com/ollama/ollama/server"
@@ -43,15 +43,11 @@ import (
"github.com/ollama/ollama/version"
)
var (
errModelNotFound = errors.New("no Modelfile or safetensors files found")
errModelfileNotFound = errors.New("specified Modelfile wasn't found")
)
var errModelfileNotFound = errors.New("specified Modelfile wasn't found")
func getModelfileName(cmd *cobra.Command) (string, error) {
fn, _ := cmd.Flags().GetString("file")
filename, _ := cmd.Flags().GetString("file")
filename := fn
if filename == "" {
filename = "Modelfile"
}
@@ -63,7 +59,7 @@ func getModelfileName(cmd *cobra.Command) (string, error) {
_, err = os.Stat(absName)
if err != nil {
return fn, err
return "", err
}
return absName, nil
@@ -99,68 +95,52 @@ func CreateHandler(cmd *cobra.Command, args []string) error {
return err
}
home, err := os.UserHomeDir()
status := "gathering model components"
spinner := progress.NewSpinner(status)
p.Add(status, spinner)
req, err := modelfile.CreateRequest(filepath.Dir(filename))
if err != nil {
return err
}
spinner.Stop()
status := "transferring model data"
spinner := progress.NewSpinner(status)
p.Add(status, spinner)
defer p.Stop()
req.Name = args[0]
quantize, _ := cmd.Flags().GetString("quantize")
if quantize != "" {
req.Quantize = quantize
}
client, err := api.ClientFromEnvironment()
if err != nil {
return err
}
for i := range modelfile.Commands {
switch modelfile.Commands[i].Name {
case "model", "adapter":
path := modelfile.Commands[i].Args
if path == "~" {
path = home
} else if strings.HasPrefix(path, "~/") {
path = filepath.Join(home, path[2:])
}
if !filepath.IsAbs(path) {
path = filepath.Join(filepath.Dir(filename), path)
}
fi, err := os.Stat(path)
if errors.Is(err, os.ErrNotExist) && modelfile.Commands[i].Name == "model" {
continue
} else if err != nil {
if len(req.Files) > 0 {
fileMap := map[string]string{}
for f, digest := range req.Files {
if _, err := createBlob(cmd, client, f, digest, p); err != nil {
return err
}
if fi.IsDir() {
// this is likely a safetensors or pytorch directory
// TODO make this work w/ adapters
tempfile, err := tempZipFiles(path)
if err != nil {
return err
}
defer os.RemoveAll(tempfile)
path = tempfile
}
digest, err := createBlob(cmd, client, path, spinner)
if err != nil {
return err
}
modelfile.Commands[i].Args = "@" + digest
fileMap[filepath.Base(f)] = digest
}
req.Files = fileMap
}
if len(req.Adapters) > 0 {
fileMap := map[string]string{}
for f, digest := range req.Adapters {
if _, err := createBlob(cmd, client, f, digest, p); err != nil {
return err
}
fileMap[filepath.Base(f)] = digest
}
req.Adapters = fileMap
}
bars := make(map[string]*progress.Bar)
fn := func(resp api.ProgressResponse) error {
if resp.Digest != "" {
spinner.Stop()
bar, ok := bars[resp.Digest]
if !ok {
bar = progress.NewBar(fmt.Sprintf("pulling %s...", resp.Digest[7:19]), resp.Total, resp.Completed)
@@ -180,145 +160,23 @@ func CreateHandler(cmd *cobra.Command, args []string) error {
return nil
}
quantize, _ := cmd.Flags().GetString("quantize")
request := api.CreateRequest{Name: args[0], Modelfile: modelfile.String(), Quantize: quantize}
if err := client.Create(cmd.Context(), &request, fn); err != nil {
if err := client.Create(cmd.Context(), req, fn); err != nil {
if strings.Contains(err.Error(), "path or Modelfile are required") {
return fmt.Errorf("the ollama server must be updated to use `ollama create` with this client")
}
return err
}
return nil
}
func tempZipFiles(path string) (string, error) {
tempfile, err := os.CreateTemp("", "ollama-tf")
func createBlob(cmd *cobra.Command, client *api.Client, path string, digest string, p *progress.Progress) (string, error) {
realPath, err := filepath.EvalSymlinks(path)
if err != nil {
return "", err
}
defer tempfile.Close()
detectContentType := func(path string) (string, error) {
f, err := os.Open(path)
if err != nil {
return "", err
}
defer f.Close()
var b bytes.Buffer
b.Grow(512)
if _, err := io.CopyN(&b, f, 512); err != nil && !errors.Is(err, io.EOF) {
return "", err
}
contentType, _, _ := strings.Cut(http.DetectContentType(b.Bytes()), ";")
return contentType, nil
}
glob := func(pattern, contentType string) ([]string, error) {
matches, err := filepath.Glob(pattern)
if err != nil {
return nil, err
}
for _, safetensor := range matches {
if ct, err := detectContentType(safetensor); err != nil {
return nil, err
} else if ct != contentType {
return nil, fmt.Errorf("invalid content type: expected %s for %s", ct, safetensor)
}
}
return matches, nil
}
var files []string
if st, _ := glob(filepath.Join(path, "model*.safetensors"), "application/octet-stream"); len(st) > 0 {
// safetensors files might be unresolved git lfs references; skip if they are
// covers model-x-of-y.safetensors, model.fp32-x-of-y.safetensors, model.safetensors
files = append(files, st...)
} else if st, _ := glob(filepath.Join(path, "adapters.safetensors"), "application/octet-stream"); len(st) > 0 {
// covers adapters.safetensors
files = append(files, st...)
} else if st, _ := glob(filepath.Join(path, "adapter_model.safetensors"), "application/octet-stream"); len(st) > 0 {
// covers adapter_model.safetensors
files = append(files, st...)
} else if pt, _ := glob(filepath.Join(path, "pytorch_model*.bin"), "application/zip"); len(pt) > 0 {
// pytorch files might also be unresolved git lfs references; skip if they are
// covers pytorch_model-x-of-y.bin, pytorch_model.fp32-x-of-y.bin, pytorch_model.bin
files = append(files, pt...)
} else if pt, _ := glob(filepath.Join(path, "consolidated*.pth"), "application/zip"); len(pt) > 0 {
// pytorch files might also be unresolved git lfs references; skip if they are
// covers consolidated.x.pth, consolidated.pth
files = append(files, pt...)
} else {
return "", errModelNotFound
}
// add configuration files, json files are detected as text/plain
js, err := glob(filepath.Join(path, "*.json"), "text/plain")
if err != nil {
return "", err
}
files = append(files, js...)
// bert models require a nested config.json
// TODO(mxyng): merge this with the glob above
js, err = glob(filepath.Join(path, "**/*.json"), "text/plain")
if err != nil {
return "", err
}
files = append(files, js...)
if tks, _ := glob(filepath.Join(path, "tokenizer.model"), "application/octet-stream"); len(tks) > 0 {
// add tokenizer.model if it exists, tokenizer.json is automatically picked up by the previous glob
// tokenizer.model might be a unresolved git lfs reference; error if it is
files = append(files, tks...)
} else if tks, _ := glob(filepath.Join(path, "**/tokenizer.model"), "text/plain"); len(tks) > 0 {
// some times tokenizer.model is in a subdirectory (e.g. meta-llama/Meta-Llama-3-8B)
files = append(files, tks...)
}
zipfile := zip.NewWriter(tempfile)
defer zipfile.Close()
for _, file := range files {
f, err := os.Open(file)
if err != nil {
return "", err
}
defer f.Close()
fi, err := f.Stat()
if err != nil {
return "", err
}
zfi, err := zip.FileInfoHeader(fi)
if err != nil {
return "", err
}
zfi.Name, err = filepath.Rel(path, file)
if err != nil {
return "", err
}
zf, err := zipfile.CreateHeader(zfi)
if err != nil {
return "", err
}
if _, err := io.Copy(zf, f); err != nil {
return "", err
}
}
return tempfile.Name(), nil
}
func createBlob(cmd *cobra.Command, client *api.Client, path string, spinner *progress.Spinner) (string, error) {
bin, err := os.Open(path)
bin, err := os.Open(realPath)
if err != nil {
return "", err
}
@@ -331,18 +189,11 @@ func createBlob(cmd *cobra.Command, client *api.Client, path string, spinner *pr
}
fileSize := fileInfo.Size()
hash := sha256.New()
if _, err := io.Copy(hash, bin); err != nil {
return "", err
}
if _, err := bin.Seek(0, io.SeekStart); err != nil {
return "", err
}
var pw progressWriter
status := "transferring model data 0%"
spinner.SetMessage(status)
status := fmt.Sprintf("copying file %s 0%%", digest)
spinner := progress.NewSpinner(status)
p.Add(status, spinner)
defer spinner.Stop()
done := make(chan struct{})
defer close(done)
@@ -353,15 +204,14 @@ func createBlob(cmd *cobra.Command, client *api.Client, path string, spinner *pr
for {
select {
case <-ticker.C:
spinner.SetMessage(fmt.Sprintf("transferring model data %d%%", int(100*pw.n.Load()/fileSize)))
spinner.SetMessage(fmt.Sprintf("copying file %s %d%%", digest, int(100*pw.n.Load()/fileSize)))
case <-done:
spinner.SetMessage("transferring model data 100%")
spinner.SetMessage(fmt.Sprintf("copying file %s 100%%", digest))
return
}
}
}()
digest := fmt.Sprintf("sha256:%x", hash.Sum(nil))
if err = client.CreateBlob(cmd.Context(), digest, io.TeeReader(bin, &pw)); err != nil {
return "", err
}
@@ -598,7 +448,7 @@ func ListHandler(cmd *cobra.Command, args []string) error {
var data [][]string
for _, m := range models.Models {
if len(args) == 0 || strings.HasPrefix(m.Name, args[0]) {
if len(args) == 0 || strings.HasPrefix(strings.ToLower(m.Name), strings.ToLower(args[0])) {
data = append(data, []string{m.Name, m.Digest[:12], format.HumanBytes(m.Size), format.HumanTime(m.ModifiedAt, "Never")})
}
}
@@ -1035,10 +885,14 @@ func chat(cmd *cobra.Command, opts runOptions) (*api.Message, error) {
return nil
}
if opts.Format == "json" {
opts.Format = `"` + opts.Format + `"`
}
req := &api.ChatRequest{
Model: opts.Model,
Messages: opts.Messages,
Format: opts.Format,
Format: json.RawMessage(opts.Format),
Options: opts.Options,
}
@@ -1120,12 +974,16 @@ func generate(cmd *cobra.Command, opts runOptions) error {
}
}
if opts.Format == "json" {
opts.Format = `"` + opts.Format + `"`
}
request := api.GenerateRequest{
Model: opts.Model,
Prompt: opts.Prompt,
Context: generateContext,
Images: opts.Images,
Format: opts.Format,
Format: json.RawMessage(opts.Format),
System: opts.System,
Options: opts.Options,
KeepAlive: opts.KeepAlive,
@@ -1411,6 +1269,19 @@ func NewCLI() *cobra.Command {
RunE: DeleteHandler,
}
runnerCmd := &cobra.Command{
Use: "runner",
Short: llama.PrintSystemInfo(),
Hidden: true,
RunE: func(cmd *cobra.Command, args []string) error {
return runner.Execute(os.Args[1:])
},
FParseErrWhitelist: cobra.FParseErrWhitelist{UnknownFlags: true},
}
runnerCmd.SetHelpFunc(func(cmd *cobra.Command, args []string) {
_ = runner.Execute(args[1:])
})
envVars := envconfig.AsMap()
envs := []envconfig.EnvVar{envVars["OLLAMA_HOST"]}
@@ -1445,6 +1316,7 @@ func NewCLI() *cobra.Command {
envVars["OLLAMA_SCHED_SPREAD"],
envVars["OLLAMA_TMPDIR"],
envVars["OLLAMA_FLASH_ATTENTION"],
envVars["OLLAMA_KV_CACHE_TYPE"],
envVars["OLLAMA_LLM_LIBRARY"],
envVars["OLLAMA_GPU_OVERHEAD"],
envVars["OLLAMA_LOAD_TIMEOUT"],
@@ -1466,6 +1338,7 @@ func NewCLI() *cobra.Command {
psCmd,
copyCmd,
deleteCmd,
runnerCmd,
)
return rootCmd

View File

@@ -8,7 +8,6 @@ import (
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"strings"
"testing"
@@ -180,18 +179,14 @@ Weigh anchor!
t.Run("license", func(t *testing.T) {
var b bytes.Buffer
license, err := os.ReadFile(filepath.Join("..", "LICENSE"))
if err != nil {
t.Fatal(err)
}
license := "MIT License\nCopyright (c) Ollama\n"
if err := showInfo(&api.ShowResponse{
Details: api.ModelDetails{
Family: "test",
ParameterSize: "7B",
QuantizationLevel: "FP16",
},
License: string(license),
License: license,
}, &b); err != nil {
t.Fatal(err)
}
@@ -298,7 +293,7 @@ func TestGetModelfileName(t *testing.T) {
name: "modelfile specified, no modelfile exists",
modelfileName: "crazyfile",
fileExists: false,
expectedName: "crazyfile",
expectedName: "",
expectedErr: os.ErrNotExist,
},
{
@@ -343,8 +338,8 @@ func TestGetModelfileName(t *testing.T) {
t.Fatalf("couldn't set file flag: %v", err)
}
} else {
expectedFilename = tt.expectedName
if tt.modelfileName != "" {
expectedFilename = tt.modelfileName
err := cmd.Flags().Set("file", tt.modelfileName)
if err != nil {
t.Fatalf("couldn't set file flag: %v", err)
@@ -494,3 +489,130 @@ func TestPushHandler(t *testing.T) {
})
}
}
func TestCreateHandler(t *testing.T) {
tests := []struct {
name string
modelName string
modelFile string
serverResponse map[string]func(w http.ResponseWriter, r *http.Request)
expectedError string
expectedOutput string
}{
{
name: "successful create",
modelName: "test-model",
modelFile: "FROM foo",
serverResponse: map[string]func(w http.ResponseWriter, r *http.Request){
"/api/create": func(w http.ResponseWriter, r *http.Request) {
if r.Method != http.MethodPost {
t.Errorf("expected POST request, got %s", r.Method)
}
req := api.CreateRequest{}
if err := json.NewDecoder(r.Body).Decode(&req); err != nil {
http.Error(w, err.Error(), http.StatusBadRequest)
return
}
if req.Name != "test-model" {
t.Errorf("expected model name 'test-model', got %s", req.Name)
}
if req.From != "foo" {
t.Errorf("expected from 'foo', got %s", req.From)
}
responses := []api.ProgressResponse{
{Status: "using existing layer sha256:56bb8bd477a519ffa694fc449c2413c6f0e1d3b1c88fa7e3c9d88d3ae49d4dcb"},
{Status: "writing manifest"},
{Status: "success"},
}
for _, resp := range responses {
if err := json.NewEncoder(w).Encode(resp); err != nil {
http.Error(w, err.Error(), http.StatusInternalServerError)
return
}
w.(http.Flusher).Flush()
}
},
},
expectedOutput: "",
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
mockServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
handler, ok := tt.serverResponse[r.URL.Path]
if !ok {
t.Errorf("unexpected request to %s", r.URL.Path)
http.Error(w, "not found", http.StatusNotFound)
return
}
handler(w, r)
}))
t.Setenv("OLLAMA_HOST", mockServer.URL)
t.Cleanup(mockServer.Close)
tempFile, err := os.CreateTemp("", "modelfile")
if err != nil {
t.Fatal(err)
}
defer os.Remove(tempFile.Name())
if _, err := tempFile.WriteString(tt.modelFile); err != nil {
t.Fatal(err)
}
if err := tempFile.Close(); err != nil {
t.Fatal(err)
}
cmd := &cobra.Command{}
cmd.Flags().String("file", "", "")
if err := cmd.Flags().Set("file", tempFile.Name()); err != nil {
t.Fatal(err)
}
cmd.Flags().Bool("insecure", false, "")
cmd.SetContext(context.TODO())
// Redirect stderr to capture progress output
oldStderr := os.Stderr
r, w, _ := os.Pipe()
os.Stderr = w
// Capture stdout for the "Model pushed" message
oldStdout := os.Stdout
outR, outW, _ := os.Pipe()
os.Stdout = outW
err = CreateHandler(cmd, []string{tt.modelName})
// Restore stderr
w.Close()
os.Stderr = oldStderr
// drain the pipe
if _, err := io.ReadAll(r); err != nil {
t.Fatal(err)
}
// Restore stdout and get output
outW.Close()
os.Stdout = oldStdout
stdout, _ := io.ReadAll(outR)
if tt.expectedError == "" {
if err != nil {
t.Errorf("expected no error, got %v", err)
}
if tt.expectedOutput != "" {
if got := string(stdout); got != tt.expectedOutput {
t.Errorf("expected output %q, got %q", tt.expectedOutput, got)
}
}
}
})
}
}

View File

@@ -13,11 +13,9 @@ import (
"strings"
"github.com/spf13/cobra"
"golang.org/x/exp/maps"
"github.com/ollama/ollama/api"
"github.com/ollama/ollama/envconfig"
"github.com/ollama/ollama/parser"
"github.com/ollama/ollama/readline"
"github.com/ollama/ollama/types/errtypes"
)
@@ -213,10 +211,7 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
return err
}
req := &api.CreateRequest{
Name: args[1],
Modelfile: buildModelfile(opts),
}
req := NewCreateRequest(args[1], opts)
fn := func(resp api.ProgressResponse) error { return nil }
err = client.Create(cmd.Context(), req, fn)
if err != nil {
@@ -459,36 +454,25 @@ func generateInteractive(cmd *cobra.Command, opts runOptions) error {
}
}
func buildModelfile(opts runOptions) string {
var f parser.File
f.Commands = append(f.Commands, parser.Command{Name: "model", Args: cmp.Or(opts.ParentModel, opts.Model)})
func NewCreateRequest(name string, opts runOptions) *api.CreateRequest {
req := &api.CreateRequest{
Name: name,
From: cmp.Or(opts.ParentModel, opts.Model),
}
if opts.System != "" {
f.Commands = append(f.Commands, parser.Command{Name: "system", Args: opts.System})
req.System = opts.System
}
keys := maps.Keys(opts.Options)
slices.Sort(keys)
for _, k := range keys {
v := opts.Options[k]
var cmds []parser.Command
switch t := v.(type) {
case []string:
for _, s := range t {
cmds = append(cmds, parser.Command{Name: k, Args: s})
}
default:
cmds = append(cmds, parser.Command{Name: k, Args: fmt.Sprintf("%v", t)})
}
f.Commands = append(f.Commands, cmds...)
if len(opts.Options) > 0 {
req.Parameters = opts.Options
}
for _, msg := range opts.Messages {
f.Commands = append(f.Commands, parser.Command{Name: "message", Args: fmt.Sprintf("%s: %s", msg.Role, msg.Content)})
if len(opts.Messages) > 0 {
req.Messages = opts.Messages
}
return f.String()
return req
}
func normalizeFilePath(fp string) string {

View File

@@ -3,10 +3,7 @@ package cmd
import (
"testing"
"github.com/google/go-cmp/cmp"
"github.com/stretchr/testify/assert"
"github.com/ollama/ollama/api"
)
func TestExtractFilenames(t *testing.T) {
@@ -53,56 +50,3 @@ d:\path with\spaces\seven.JPEG inbetween7 c:\users\jdoe\eight.png inbetween8
assert.Contains(t, res[9], "ten.PNG")
assert.Contains(t, res[9], "E:")
}
func TestModelfileBuilder(t *testing.T) {
opts := runOptions{
Model: "hork",
System: "You are part horse and part shark, but all hork. Do horklike things",
Messages: []api.Message{
{Role: "user", Content: "Hey there hork!"},
{Role: "assistant", Content: "Yes it is true, I am half horse, half shark."},
},
Options: map[string]any{
"temperature": 0.9,
"seed": 42,
"penalize_newline": false,
"stop": []string{"hi", "there"},
},
}
t.Run("model", func(t *testing.T) {
expect := `FROM hork
SYSTEM You are part horse and part shark, but all hork. Do horklike things
PARAMETER penalize_newline false
PARAMETER seed 42
PARAMETER stop hi
PARAMETER stop there
PARAMETER temperature 0.9
MESSAGE user Hey there hork!
MESSAGE assistant Yes it is true, I am half horse, half shark.
`
actual := buildModelfile(opts)
if diff := cmp.Diff(expect, actual); diff != "" {
t.Errorf("mismatch (-want +got):\n%s", diff)
}
})
t.Run("parent model", func(t *testing.T) {
opts.ParentModel = "horseshark"
expect := `FROM horseshark
SYSTEM You are part horse and part shark, but all hork. Do horklike things
PARAMETER penalize_newline false
PARAMETER seed 42
PARAMETER stop hi
PARAMETER stop there
PARAMETER temperature 0.9
MESSAGE user Hey there hork!
MESSAGE assistant Yes it is true, I am half horse, half shark.
`
actual := buildModelfile(opts)
if diff := cmp.Diff(expect, actual); diff != "" {
t.Errorf("mismatch (-want +got):\n%s", diff)
}
})
}

15
cmd/runner/main.go Normal file
View File

@@ -0,0 +1,15 @@
package main
import (
"fmt"
"os"
"github.com/ollama/ollama/llama/runner"
)
func main() {
if err := runner.Execute(os.Args[1:]); err != nil {
fmt.Fprintf(os.Stderr, "error: %s\n", err)
os.Exit(1)
}
}

View File

@@ -187,8 +187,12 @@ func ConvertModel(fsys fs.FS, ws io.WriteSeeker) error {
conv = &gemma2Model{}
case "Phi3ForCausalLM":
conv = &phi3Model{}
case "Qwen2ForCausalLM":
conv = &qwen2Model{}
case "BertModel":
conv = &bertModel{}
case "CohereForCausalLM":
conv = &commandrModel{}
default:
return errors.New("unsupported architecture")
}

View File

@@ -0,0 +1,76 @@
package convert
import (
"cmp"
"github.com/ollama/ollama/llm"
)
type commandrModel struct {
ModelParameters
MaxPositionEmbeddings uint32 `json:"max_position_embeddings"`
HiddenSize uint32 `json:"hidden_size"`
HiddenLayers uint32 `json:"num_hidden_layers"`
IntermediateSize uint32 `json:"intermediate_size"`
NumAttentionHeads uint32 `json:"num_attention_heads"`
NumKeyValueHeads uint32 `json:"num_key_value_heads"`
LayerNormEPS float32 `json:"layer_norm_eps"`
RopeTheta float32 `json:"rope_theta"`
UseQKNorm bool `json:"use_qk_norm"`
MaxLength uint32 `json:"model_max_length"`
LogitScale float32 `json:"logit_scale"`
NCtx uint32 `json:"n_ctx"`
}
var _ ModelConverter = (*commandrModel)(nil)
func (p *commandrModel) KV(t *Tokenizer) llm.KV {
kv := p.ModelParameters.KV(t)
kv["general.architecture"] = "command-r"
kv["general.name"] = "command-r"
kv["command-r.context_length"] = cmp.Or(p.MaxLength, p.MaxPositionEmbeddings, p.NCtx)
kv["command-r.embedding_length"] = p.HiddenSize
kv["command-r.block_count"] = p.HiddenLayers
kv["command-r.feed_forward_length"] = p.IntermediateSize
kv["command-r.attention.head_count"] = p.NumAttentionHeads
kv["command-r.attention.head_count_kv"] = p.NumKeyValueHeads
kv["command-r.attention.layer_norm_epsilon"] = p.LayerNormEPS
kv["command-r.rope.freq_base"] = p.RopeTheta
kv["command-r.max_position_embeddings"] = cmp.Or(p.MaxLength, p.MaxPositionEmbeddings)
kv["command-r.logit_scale"] = p.LogitScale
kv["command-r.rope.scaling.type"] = "none"
return kv
}
func (p *commandrModel) Tensors(ts []Tensor) []llm.Tensor {
var out []llm.Tensor
for _, t := range ts {
out = append(out, llm.Tensor{
Name: t.Name(),
Kind: t.Kind(),
Shape: t.Shape(),
WriterTo: t,
})
}
return out
}
func (p *commandrModel) Replacements() []string {
return []string{
"self_attn.q_norm", "attn_q_norm",
"self_attn.k_norm", "attn_k_norm",
"model.layers", "blk",
"input_layernorm", "attn_norm",
"mlp.down_proj", "ffn_down",
"mlp.gate_proj", "ffn_gate",
"mlp.up_proj", "ffn_up",
"self_attn.k_proj", "attn_k",
"self_attn.o_proj", "attn_output",
"self_attn.q_proj", "attn_q",
"self_attn.v_proj", "attn_v",
"model.norm", "output_norm",
"model.embed_tokens", "token_embd",
}
}

78
convert/convert_qwen2.go Normal file
View File

@@ -0,0 +1,78 @@
package convert
import "github.com/ollama/ollama/llm"
type qwen2Model struct {
ModelParameters
MaxPositionEmbeddings uint32 `json:"max_position_embeddings"`
HiddenSize uint32 `json:"hidden_size"`
HiddenLayers uint32 `json:"num_hidden_layers"`
IntermediateSize uint32 `json:"intermediate_size"`
NumAttentionHeads uint32 `json:"num_attention_heads"`
NumKeyValueHeads uint32 `json:"num_key_value_heads"`
RopeTheta float32 `json:"rope_theta"`
RopeScaling struct {
Type string `json:"type"`
Factor ropeFactor `json:"factor"`
OriginalMaxPositionEmbeddings uint32 `json:"original_max_position_embeddings"`
} `json:"rope_scaling"`
RMSNormEPS float32 `json:"rms_norm_eps"`
}
var _ ModelConverter = (*qwen2Model)(nil)
func (q *qwen2Model) KV(t *Tokenizer) llm.KV {
kv := q.ModelParameters.KV(t)
kv["general.architecture"] = "qwen2"
kv["qwen2.block_count"] = q.HiddenLayers
kv["qwen2.context_length"] = q.MaxPositionEmbeddings
kv["qwen2.embedding_length"] = q.HiddenSize
kv["qwen2.feed_forward_length"] = q.IntermediateSize
kv["qwen2.attention.head_count"] = q.NumAttentionHeads
kv["qwen2.attention.head_count_kv"] = q.NumKeyValueHeads
kv["qwen2.rope.freq_base"] = q.RopeTheta
kv["qwen2.attention.layer_norm_rms_epsilon"] = q.RMSNormEPS
switch q.RopeScaling.Type {
case "":
// no scaling
case "yarn":
kv["qwen2.rope.scaling.type"] = q.RopeScaling.Type
kv["qwen2.rope.scaling.factor"] = q.RopeScaling.Factor
default:
panic("unknown rope scaling type")
}
return kv
}
func (q *qwen2Model) Tensors(ts []Tensor) []llm.Tensor {
var out []llm.Tensor
for _, t := range ts {
out = append(out, llm.Tensor{
Name: t.Name(),
Kind: t.Kind(),
Shape: t.Shape(),
WriterTo: t,
})
}
return out
}
func (p *qwen2Model) Replacements() []string {
return []string{
"lm_head", "output",
"model.embed_tokens", "token_embd",
"model.layers", "blk",
"input_layernorm", "attn_norm",
"self_attn.k_proj", "attn_k",
"self_attn.v_proj", "attn_v",
"self_attn.q_proj", "attn_q",
"self_attn.o_proj", "attn_output",
"mlp.down_proj", "ffn_down",
"mlp.gate_proj", "ffn_gate",
"mlp.up_proj", "ffn_up",
"post_attention_layernorm", "ffn_norm",
"model.norm", "output_norm",
}
}

View File

@@ -108,6 +108,8 @@ func TestConvertModel(t *testing.T) {
"Phi-3-mini-128k-instruct",
"all-MiniLM-L6-v2",
"gemma-2-9b-it",
"Qwen2.5-0.5B-Instruct",
"c4ai-command-r-v01",
}
for i := range cases {

View File

@@ -331,7 +331,7 @@ type TrainerSpec struct {
// Reserved special meta tokens.
// * -1 is not used.
// * unk_id must not be -1.
// Id must starts with 0 and be contigous.
// Id must start with 0 and be contiguous.
UnkId *int32 `protobuf:"varint,40,opt,name=unk_id,json=unkId,def=0" json:"unk_id,omitempty"` // <unk>
BosId *int32 `protobuf:"varint,41,opt,name=bos_id,json=bosId,def=1" json:"bos_id,omitempty"` // <s>
EosId *int32 `protobuf:"varint,42,opt,name=eos_id,json=eosId,def=2" json:"eos_id,omitempty"` // </s>

View File

@@ -213,7 +213,7 @@ message TrainerSpec {
// Reserved special meta tokens.
// * -1 is not used.
// * unk_id must not be -1.
// Id must starts with 0 and be contigous.
// Id must start with 0 and be contiguous.
optional int32 unk_id = 40 [default = 0]; // <unk>
optional int32 bos_id = 41 [default = 1]; // <s>
optional int32 eos_id = 42 [default = 2]; // </s>

View File

@@ -0,0 +1,314 @@
{
"general.architecture": "qwen2",
"general.file_type": "1",
"general.parameter_count": "494032768",
"general.quantization_version": "2",
"output_norm.weight": "93a01a6db3419e85320a244bbf8ae81c43033b1d10c342bea3797ff2ce348390",
"qwen2.attention.head_count": "14",
"qwen2.attention.head_count_kv": "2",
"qwen2.attention.layer_norm_rms_epsilon": "1e-06",
"qwen2.block_count": "24",
"qwen2.context_length": "32768",
"qwen2.embedding_length": "896",
"qwen2.feed_forward_length": "4864",
"qwen2.rope.freq_base": "1e+06",
"token_embd.weight": "d74257dc547b48be5ae7b93f1c9af072c0c42dbbb85503078e25c59cd09e68d0",
"tokenizer.ggml.add_eos_token": "false",
"tokenizer.ggml.add_padding_token": "false",
"tokenizer.ggml.eos_token_id": "151645",
"tokenizer.ggml.merges": "6b1b1c58f1223d74f9095929d3e6416cdd74784440221a5507b87b8197f2bfd2",
"tokenizer.ggml.model": "gpt2",
"tokenizer.ggml.padding_token_id": "151643",
"tokenizer.ggml.pre": "qwen2",
"tokenizer.ggml.scores": "94e247e531e8b0fa3d248f3de09c9beae0c87da8106208a8edfaac0b8ec4b53d",
"tokenizer.ggml.token_type": "b178dbc9d1b2e08f84d02918e00fc2de2619a250e6c188c91a6605f701860055",
"tokenizer.ggml.tokens": "1d93f6679b23a1152b725f7f473792d54d53c1040c5250d3e46b42f81e0a1a34",
"blk.0.attn_k.bias": "5ce6617845f66c34515978d23d52e729c298d8bffa28c356a0428bef17142cf1",
"blk.0.attn_k.weight": "a960832a9e0e83e4d95402e5d1a01cc74300fcca0c381237162126330e1a7af8",
"blk.0.attn_norm.weight": "32c7d51cd0958f1f1771174192db341f9770516d7595a2f0fd18a4d78bd5aba3",
"blk.0.attn_output.weight": "c67e6e7e868354a11bf9121c70ee56c140b20eec611a8955e7dfe54a21d40a98",
"blk.0.attn_q.bias": "3e9e994eb1f03bccfc82f8bb3c324c920d42d547e07de5be83be12c428645063",
"blk.0.attn_q.weight": "dc12132f789b97cfa1e3f5775ceb835247fa67aa47400fd09c8f9f3769208583",
"blk.0.attn_v.bias": "a3fd0757b31fdc78af5ec320332d239c1a79d34e8804df06c5454e86955e8cc9",
"blk.0.attn_v.weight": "f43094a2134c7ee2dcc52aac3c8b7d9d64fb0295a8adb94cabfd49213f017b84",
"blk.0.ffn_down.weight": "18c2aec92db14f21976838a8c35d5575f80d0e4b1e05ccc0d8388d5877e80147",
"blk.0.ffn_gate.weight": "a3a1c4ef38f8f750eabadfe3d83bbb0f77941eec1cc1a388e51852e99c8691f6",
"blk.0.ffn_norm.weight": "b59b779c42d44b5c4cec41e39b4eb61e0491a07c1b3e946ccb5b8d5c657eda3f",
"blk.0.ffn_up.weight": "db64f09987ea59449e90abae5a2ffcc20efd9203f0eebec77a6aacb5809d6cff",
"blk.1.attn_k.bias": "a5c8c5671703ec0aa0143ff70a20ffdd67b5d5790ca1dfa5bba4e87e4071ed9f",
"blk.1.attn_k.weight": "835c7c7cc95b3cb2e55bd9cac585aa0760a033896621d3e06421f3378c540f7d",
"blk.1.attn_norm.weight": "f4c36fb6c14fce721fab0de78cc118d6f66e3a3d3ea0017bb14aade24c3c5434",
"blk.1.attn_output.weight": "cc1e80310c97cef068e48e40b7096f32fa2138519d6209c6a1a9994985999016",
"blk.1.attn_q.bias": "bc332780e66b0aac80ec5e63ac32344919a840db2fcc8f87bcef16a43a54138e",
"blk.1.attn_q.weight": "d766f06c925cce38d4b31b2165b3448e1fb49a7d561985f95d9cd2fcba52367a",
"blk.1.attn_v.bias": "9f486626fb6ed9ac84970a71e9b9818dd2758501fd3f61bb1c08540dcc7a8631",
"blk.1.attn_v.weight": "e873d1e5bd4f4d6abfd47c0f55119c2c111105838753ee273a03c5ccea25ce5c",
"blk.1.ffn_down.weight": "b3ce82b093f187344de04284b1783a452de1b72640914609b8f830dc81580521",
"blk.1.ffn_gate.weight": "5cd44ad237edaca525a28a3ac13975d1b565f576d6a8003237a341ae0d156f2e",
"blk.1.ffn_norm.weight": "4ac774ee8afaee119610c46aa1ff89fc6c9084a29d226075bc4aa4d2f15f746c",
"blk.1.ffn_up.weight": "042d81ab5f1983d85c81213232f3bfc05a9302d9dfaa98d931ebba326b6058b8",
"blk.10.attn_k.bias": "767ecfeacd60a2c2221ac4d76c357190849dd9cdf64ced418d9d0c7949101401",
"blk.10.attn_k.weight": "a9f3df343227537636be8202303453086375091944e498bad11e0b91e45e8c71",
"blk.10.attn_norm.weight": "01acd0e7b3e363f873dbfde6f0995ffcce83f5aaa10ff91c31dbf775035f6d5a",
"blk.10.attn_output.weight": "a531fe660769604ab869f01b203eb115e025cad4c0baeacdd1bcca99cf6d0264",
"blk.10.attn_q.bias": "356a02c9163dd660c1340fbe1e049b335ac6178891e00996131bba9ab4cb3e59",
"blk.10.attn_q.weight": "81be0cfb227339d83f954cd8dcf35828441211c6e1d184060e3eb76085041e2f",
"blk.10.attn_v.bias": "ed0450653284b62f8bf2c2db19c0ff7a6cf3cda1324d0a044c5e3db7bb692bd3",
"blk.10.attn_v.weight": "c1247ff7092babd2ed979883095b9aa022b2996cab1c77fb9e6176ddc1498d16",
"blk.10.ffn_down.weight": "fda7544965dc9af874f1062c22151c6cefc8ba08cbe15dc67aa89979e77b2de4",
"blk.10.ffn_gate.weight": "9f2632b1dee7304d10c70bd38d85bb1f148a628a8468f894f57975b8a2f1d945",
"blk.10.ffn_norm.weight": "94f8cbd6b17a4d5aabd93fa32930a687db3b11f086142f1cd71c535c11adcad4",
"blk.10.ffn_up.weight": "8dc2f8db0474939a277a3d89db34c3bcc3381cfea57bd05a8426a164634d9112",
"blk.11.attn_k.bias": "3b8e5a662b19411e3f6530714b766aad2ee41eebc8161bec9db0bc82d383a6e0",
"blk.11.attn_k.weight": "2c29f1ed1ce53ce9604e9ea3663c2c373157e909a0d6064a8920005f6d15dad9",
"blk.11.attn_norm.weight": "48f68a99c3da4ab4c9e492677b606d1b8e0e3de1fdbf6a977523f97b8c21ec31",
"blk.11.attn_output.weight": "5859f3838a94898b020c23040941ed88f4fcb132db400d0849f30a01f62c0f1c",
"blk.11.attn_q.bias": "c5ad89a5628f2bd81252ef44ef6bbcbff15c33ad16fba66435509b959c2af6d3",
"blk.11.attn_q.weight": "d102104e5d61c1e3219564f1d0149fd593db6c6daa9f3872460c84403323cfef",
"blk.11.attn_v.bias": "8653f7d48c5f75a5b55630819f99ecf01c932f12d33fd1a3ee634613e70edde8",
"blk.11.attn_v.weight": "e0a7c7d89b9f2d0d781ce85330022229126e130a8600a09d4a5f920f0bbd50b2",
"blk.11.ffn_down.weight": "4a22b3361eba8bbe1d9a6fda1812618e894c49f13bcacb505defa9badb6b96a6",
"blk.11.ffn_gate.weight": "484698b206760d3fd8df68b252a3c5bae65c8bf6392fb53a5261b021b6f39144",
"blk.11.ffn_norm.weight": "da69e96338cbe30882cf5a9544004387f5bbc0bcb6038e61ba2baabbd2623bac",
"blk.11.ffn_up.weight": "26ec74f1f504d1281715680dfbcc321db4e9900c53932fa40955daceb891b9aa",
"blk.12.attn_k.bias": "f94b49ec3e498f14f6bc3ebefe1f82018935bbe594df03253bfffae36bc20751",
"blk.12.attn_k.weight": "ae6323d0bbcfcea01f598d308993d1a7530317e78c1f64923e36d4b1649e9e73",
"blk.12.attn_norm.weight": "3784536a7611a839a42a29a5cc538c74ee4f9793092e5efe1b227b48f8c4d37f",
"blk.12.attn_output.weight": "46826c00b066829355db78293ab216e890f5eaaed3a70499ee68785189a6b0d9",
"blk.12.attn_q.bias": "b14db2d327ce0deec97beda7d3965a56c43e1e63dc9181840fb176b114cf643a",
"blk.12.attn_q.weight": "30f67df52ced06f76b6c85531657584276a454d6ec9bb7d0c7d2ca8f067f5551",
"blk.12.attn_v.bias": "57ab4b7e43f4fc5853bca7bfbb2702f8c2c391a49252a760abbb7b26330dc4aa",
"blk.12.attn_v.weight": "3ccd9da0cfe241cd33a63310f3ca6d81c5bc5a50d200bfea6612ac376166aca2",
"blk.12.ffn_down.weight": "a095774413198a83c549ce132d7c9684c0baef33145eaa889be370ef9c881c81",
"blk.12.ffn_gate.weight": "bb3b2bbdfb065d2a0a795909c53beec327781a4a7e974bf9f99c436cea459991",
"blk.12.ffn_norm.weight": "3b486c6cd97eb4b17967d9d6c0cc3821a1a6ad73d96b4d8fbf980101b32b8dab",
"blk.12.ffn_up.weight": "d020b82dd39a5d5a9d3881397bf53a567790a07f395284e6eb0f5fe0fef53de3",
"blk.13.attn_k.bias": "69381f8254586eba3623eceb18697fe79f9b4d8f2c30136acb10d5926e3ba1d0",
"blk.13.attn_k.weight": "c4d7a31495d71269f81b586203a50abea3a9e2985667faf258c9306ec6030f1d",
"blk.13.attn_norm.weight": "907da11075d16eda668dabe548af3cfd794df26b8ab53939af1344d91bec6fba",
"blk.13.attn_output.weight": "ca01cf6d2b8ece2fb3b0f56f1eb76194471ac27b54fe264f99c909f5eb7fef4a",
"blk.13.attn_q.bias": "2f5ecebafe03b1d485b93c41cff756ca57fb65b02e9d8336f14a3d26ab5d159a",
"blk.13.attn_q.weight": "f557f8acad7f0fa62da06b5da134182fe04a5bed8bdb269e316f970c9cc440fb",
"blk.13.attn_v.bias": "a492a88ae131e95714b092545a8752eaea7c7d2f9cb77852628ca8296c415525",
"blk.13.attn_v.weight": "d1220b1fe9f1cc0a5a88ee239d65fec900f5eaf6c448b6c2cbe74c81e15ed333",
"blk.13.ffn_down.weight": "53184e33440b49848a896304eb16a983efbc6b8bee0b93de8c8de716e1585fcb",
"blk.13.ffn_gate.weight": "684bf8896f148c851506c62717e45c426921b93c10d536ecdeb0fb28259a106d",
"blk.13.ffn_norm.weight": "6cb4e547ad8665eb7c174855c08afe1e5490fece66122522c1e9e8132d9064eb",
"blk.13.ffn_up.weight": "c64107897e38c06727075aba4ea7940b2cdd0e278b5c555dffb2790ef553bb57",
"blk.14.attn_k.bias": "2814ca9b160b16ae39557c9b629482fbe3a7592d372c1e1bf1ac59a2d578fde1",
"blk.14.attn_k.weight": "3377177396463afba667742972920ebb45dfdc37e9950e1f0e1d60a2f936b27d",
"blk.14.attn_norm.weight": "5cae870477d51dd35a6d22aaeacfce4dff218ffba693820ede6a4e11f02afd6d",
"blk.14.attn_output.weight": "3cfe9ccf3d48ae9e95b93a132a1c6240189a277d764f58590fb36fdbb714cad0",
"blk.14.attn_q.bias": "6a75acc2f090b2e67bfc26f7fca080ae8bd7c7aa090ec252e694be66b8b8f038",
"blk.14.attn_q.weight": "5ef45c86d7dda1df585aa1b827b89823adf679a6bb9c164bd0f97b2aa6eb96f1",
"blk.14.attn_v.bias": "5534480443e10ed72c31a917f3d104b0f49df5e6dbfa58d0eb5e7318120e3aee",
"blk.14.attn_v.weight": "58f45cf3240c4623626ec415c7d5441eaa8d2fb184f101aba973f222989422d1",
"blk.14.ffn_down.weight": "2dc82a0f20c05b77512458738130d8d05ce150cc078680ae7ee6dd7ed68d955d",
"blk.14.ffn_gate.weight": "d4a6c6f0fcccddfd1fcaa074846622f4a74cb22b9a654ab497abdc1d0dde9450",
"blk.14.ffn_norm.weight": "777e444932a0212ff3feac98442444e17bd8a98cb758ea3356697d0846d12c56",
"blk.14.ffn_up.weight": "6b75f6bd00195198447b69a417ed9d98f8ca28b3cb8be82f4bad908be0777d57",
"blk.15.attn_k.bias": "2d07211a58e6c2f23aa3a6dc03c80a7d135dfb28726b60b0e0fdd0f35ea5c37b",
"blk.15.attn_k.weight": "e77f3c0075a1810e70df956cc51fd08612f576cc09b6de8708dcae5daedb0739",
"blk.15.attn_norm.weight": "379a10d90609a5d5ba67d633803eda1424fc61ba5cca8d3bffe70c8b18b58ebf",
"blk.15.attn_output.weight": "402751c12ee9dbc9db5e3bf66a7b23ebe7d36c0500e0be67be4c8b1c4357fa62",
"blk.15.attn_q.bias": "acb37fc409ee725ceedf7a3a41b40106086abc47b76780728f781942c5120208",
"blk.15.attn_q.weight": "89cd3047a09b46ed2bb57c69dd687f67a1f0235149b30376fa31b525898e4a55",
"blk.15.attn_v.bias": "f081a37289cbe811978feb4da3ef543bdeb7355414d476f44e09b498da10cb2c",
"blk.15.attn_v.weight": "8404f242a11e6d512c9ead9b2f083cda031e9b269f8a0a83f57ee4c56934764e",
"blk.15.ffn_down.weight": "93438f43ee8cc4f1a7fd3840a6afdd5f02123e76db4f0d9474430c0100d148fc",
"blk.15.ffn_gate.weight": "ff935a2698843e87fad9dbf7125f53e460190ec71ee128b650b3fc027fe37bfc",
"blk.15.ffn_norm.weight": "4be80f199841cba831982e988451e1833c3c938a4d6ca1169319087bf0bd723e",
"blk.15.ffn_up.weight": "ee9ba63c66d71053e33551ddd519878bb30b88eeb03cfe047119c5c4000fb0a6",
"blk.16.attn_k.bias": "3f5fbabed4510c620b99d9d542739295fa6a262a7157f3a00a4889253f8341b8",
"blk.16.attn_k.weight": "8ca6eb139b281c257324cddea97a8e9aa7c048b53075cf00153123b967c27ee5",
"blk.16.attn_norm.weight": "290157f005e5aa7dddf4bd60100e7ee7b0baa7f11ec5c2cea5e0ead2aad3a4c6",
"blk.16.attn_output.weight": "b1f4d80a7447f08f1c331712527f750d00147f35c042442ade96fd029dadc5a1",
"blk.16.attn_q.bias": "e3e4e442ad4416791b468cad8de0d0d2d68c7e7df8d06002f4d49b4da9cb25e4",
"blk.16.attn_q.weight": "cc7392fa5bb1107d3816e7e7363de252d37efd4165d065e258806291ce0a147b",
"blk.16.attn_v.bias": "a7629830f2f6293e018916849614636d40b1bcd11245f75dbc34d38abae8f324",
"blk.16.attn_v.weight": "b6c7856c7d594437630929c8cf3b31d476e817875daf1095334ec08e40c5e355",
"blk.16.ffn_down.weight": "f9c0a777a00170990a4982d5a06717511bf9b0dd08aeaab64d9040d59bcbebba",
"blk.16.ffn_gate.weight": "ed88f11bc3176c9f22004e3559ccb9830a278b75edd05e11971d51c014bd5cd2",
"blk.16.ffn_norm.weight": "ab24abdcc4957895e434c6bb3a5237a71ff5044efb9f76c1a9e76e280c128410",
"blk.16.ffn_up.weight": "99f594dc8db37f554efa606e71d215fbc3907aa464a54038d6e40e9229a547ff",
"blk.17.attn_k.bias": "f236625676f9b2faa6781c7184d12d84c089c130d2a9350a6cf70210990f6bf1",
"blk.17.attn_k.weight": "c2a4f20cd3e98538308a13afe9cc5880bdd90d543449c6072dedd694b511ee1a",
"blk.17.attn_norm.weight": "5a9da4ee168311f487a79fc9d065a035432c6cafa8adb963a84954cf32f57a2a",
"blk.17.attn_output.weight": "d5df7031e354186ce65dc09d6f8a92eb721c0319816f8596b0c8a5d148ed0a2a",
"blk.17.attn_q.bias": "3212d5eeaa7ed7fac93cc99e16544de93c01bb681ae9391256ed4a8671fc6b00",
"blk.17.attn_q.weight": "d18cd9aa7ee10c551cb705549fa1ae974aea233f86471c9a19022dc29b63d0d5",
"blk.17.attn_v.bias": "a74ad11a1f8357742f80e2a0c0b3a2578fc8bbaf14c8223000767e07a5d79703",
"blk.17.attn_v.weight": "da18ac0e90884436a1cb0ad6a067f97a37f321b03c70b8b03bf481339fef5c80",
"blk.17.ffn_down.weight": "81a8a5d7a194fb53d976558e0347efbe9fdb1effffde9634c70162e1a20eff51",
"blk.17.ffn_gate.weight": "72870d83ab62f2dcd45f593924e291a45e4ae1b87f804b5b88aa34cfd76dd15e",
"blk.17.ffn_norm.weight": "cae39ac69b9bdaeefab7533796fdf11dbb7a4bdbdeed601e20f209503aafe008",
"blk.17.ffn_up.weight": "e7cb40b0842468507cec0e502bbed8a86428b51d439e3466bc12f44b2754e28f",
"blk.18.attn_k.bias": "8bfc02b94f9587aa125e2d8bbc2b15f0a5eb8f378d8b3e64a8150ae0a8ca3df2",
"blk.18.attn_k.weight": "434bc3b3332ea48afee890aa689eb458a75c50bc783492b0cbf64d42db40e8ad",
"blk.18.attn_norm.weight": "d6ffc09396c42a70d1f0e97d81113eee704d3bfc9eeae2bed022075a5dd08075",
"blk.18.attn_output.weight": "133f001f81f3b082468a7de67cb2e7a76508fce34bcc4dee7f0858e06eee082c",
"blk.18.attn_q.bias": "758d0e28bf5e660b3090aafb70e2a3191b4f3bb218d65e9139a086ceacaf599f",
"blk.18.attn_q.weight": "12d7b86fc1b09b9fa7f8b7ed43d8a410892cec8672d0c752f8346f6193343696",
"blk.18.attn_v.bias": "9efd15bab0519462431d6c6e8a5b7dd4e151dc449468097ee0ddca369c0ecc2e",
"blk.18.attn_v.weight": "f631231a79d4a2e9730fb2e386d8c18621eb3fb7900fbfdff5e6d52cc42db122",
"blk.18.ffn_down.weight": "874a2dddf456f3ab56b958b0860d71c8c680a6f89322c9bf6b2f32a113592300",
"blk.18.ffn_gate.weight": "4549ef8976c345a511df4a7133bdaf6fe387335f52dfd8a4605a8ae3f728c403",
"blk.18.ffn_norm.weight": "80c258a2536a860e19bfcbd9f29afa13214fbb4c34bde0d4da51287d354e9a59",
"blk.18.ffn_up.weight": "8b03308a581457a3c038b7a086f3cdf14941d7ad4107c4bd6d9d6b062fd00d73",
"blk.19.attn_k.bias": "e77f7b0c8e3e0a9b0d61918cd88371047752a1b02b1576936f4ec807d4d870ee",
"blk.19.attn_k.weight": "a2a318e93355230c0d0f95c441b080bf9c4914507255f363fb67a5e771d4d1e6",
"blk.19.attn_norm.weight": "9a4bdeb3970be21ac74a94c2c81eb36986533db81b78db6edec48d9802910d59",
"blk.19.attn_output.weight": "2369b103dd3947e2cef02b2669b405af5957fb3a7f9d0ff40646078c4b4317ad",
"blk.19.attn_q.bias": "e20bf427bef69059ae84a5d9f98f7d688489627f198fb6153def018ff9fd2e34",
"blk.19.attn_q.weight": "45a3bb3bdfd2f29dd76e5f78ddae73678b9a2a85dfaf609e460240ef5b7be2ad",
"blk.19.attn_v.bias": "a441f58a3e02ed86ee1819eefc9bd4e8b70d11b864a929d58a2c2ac0aeb8203d",
"blk.19.attn_v.weight": "30b0b04480c510450a7abb2ce9fa05c65b150a3cc4dc76f8916bf8d013f1b6be",
"blk.19.ffn_down.weight": "eebb9ab8fdb6a6efcfff8cf383adac9ec2d64aeeff703d16ed60d3621f86c395",
"blk.19.ffn_gate.weight": "3fef1493029298378886586478410b3d2e4e879f6aa83c07e210a7ce6481817f",
"blk.19.ffn_norm.weight": "e1be99ea1e8fb9678f7b8ba200f3f37e03878f3574d65d57bcd3a9fd796e2112",
"blk.19.ffn_up.weight": "f07cf25e09394fb69fe3ef324bdc0df9a4cecf3dc53070b8acc39e6d1689bf82",
"blk.2.attn_k.bias": "b29baa8221f125eff6b8ac1a950fa1d7cfc1bce7bdc636bf3df7d4065ab6466c",
"blk.2.attn_k.weight": "4bd0c179bced8bc37a09f5748c394e0cf50273942fb38a866e5cf50b6c96c437",
"blk.2.attn_norm.weight": "07b3edc6a6325c3428aa12f29bcae0be0de363ce61a6af487bc5c93fb8c468d9",
"blk.2.attn_output.weight": "056b5b31dbc81087c81b9d41c25960aa66c7190004c842ba343979644d7f4d88",
"blk.2.attn_q.bias": "479b6212401e097767c9d52b12a1adb8961c0fce9fcaaab81f202a9d85744376",
"blk.2.attn_q.weight": "f89196076f446a6dd8a9eee017f303504f9c03094c326449cee5a7fc0a97fade",
"blk.2.attn_v.bias": "ef9b1b986dbd9d7291027a88b67dc31434435b20e76e4f1e9d6273ebd31224f0",
"blk.2.attn_v.weight": "9322f4f00e85f8c0936845c51ca64b202a93df104f36886986a8452a8e4967a5",
"blk.2.ffn_down.weight": "7beac0d2440dc49af33ededb85a6cc3ba23ab33ad3ffa5760714b2ef84d94f6e",
"blk.2.ffn_gate.weight": "818a93864a5890c1f4dc66429004fad07645a50142350e9bff9a68fe24608a52",
"blk.2.ffn_norm.weight": "152c924d5514942ad274aafb8cc91b35c1db3627c3d973d92f60ff75f3daf9ba",
"blk.2.ffn_up.weight": "9c9579e600f209546db6015c9acfeda4f51b6d3cca6e8db4d20a04285fe61a37",
"blk.20.attn_k.bias": "fd22bfeffb63d818ce2ff1ea2ace0db5d940f7a9489b6bfc1ec4a5398848d7fe",
"blk.20.attn_k.weight": "f74439bc74c2f9252130c9c28384fd7352368b58bb7ce3f2444cf0288dfff861",
"blk.20.attn_norm.weight": "5c15d2613df87be6495fb7546b7dcedd2801d12fa5ecc02c877df889330e8f37",
"blk.20.attn_output.weight": "6731a39286a67f6859832f96695732e579e14e0c36956eccd1edce3db11595b8",
"blk.20.attn_q.bias": "04466e5a3f454a19b9b433fc2585396feac780027ece7ccb4e4bb3e406fc14d8",
"blk.20.attn_q.weight": "ead4c71daaeb17bf20d014a34c88b97f238456488e815ae0f281a5daf6fc99b8",
"blk.20.attn_v.bias": "adcc848e043025de9bd55ccb14dd8fb6343e8b5185ed07e12964be41d0faf99f",
"blk.20.attn_v.weight": "81bfc23f83526386a4761c2c16b6a93cd0bbf9d846c1a51b82c71f1474a465f1",
"blk.20.ffn_down.weight": "9bf660af3bafad919d03173c89a65fc9c89440a76c42c9e55e4d171076f3c17f",
"blk.20.ffn_gate.weight": "c04b4f3ccce44917ee228b998e2c19dd702aef10a43413afb152e808b5ac5c42",
"blk.20.ffn_norm.weight": "3d5b555d7746a71220143c6b8fff5ce4eb63283d9d9c772f1233d848f69f4ff4",
"blk.20.ffn_up.weight": "d7a196505c39e5469dfc7c6958bdbb54e93629ac1a047a6663ed96b318753094",
"blk.21.attn_k.bias": "4db1f48e5c6a3bc5720a5da813bbef08283e6269e12d83f8a9c54e52715d8011",
"blk.21.attn_k.weight": "c687b2f0e132a5e220a2a059b61aa2a537f37d8a674d7709f87880637b263b31",
"blk.21.attn_norm.weight": "ec23b0ff847a4b45585ab8e04f10fc20bb1637c5f1fbcdc4d73f336bcb5d1bd0",
"blk.21.attn_output.weight": "01255390576316c1731ef201e32c6e934eba356c28438cd06d9027ac6a3ff84f",
"blk.21.attn_q.bias": "3098f37205a15418e1681e407c82b7ce7c6fda6c6826b0590a13e1b68a38a1ea",
"blk.21.attn_q.weight": "30ea62cbb702a5359229dc96819df17ee535e2e9988d044b005c73ea536e1005",
"blk.21.attn_v.bias": "7bbedb2c22a04737f21993115701d4a06b985b7ca3b64681f53cd1be8d7ea39e",
"blk.21.attn_v.weight": "e11905e63579e36fbee978062af7599339ae29633765a4835628d79a795ec8df",
"blk.21.ffn_down.weight": "84def2ffd8aca766f9ce12ed9ac76919ab81eb34bdeae44fa4224417c38af527",
"blk.21.ffn_gate.weight": "4e99f05377b4a0b8d875045530a5c59dee6a46ac8a45597f6579f6fdfa800787",
"blk.21.ffn_norm.weight": "af48f13d03fba38ff8794a5f5005e666e501f971ca2e30bbded2777a8096f37d",
"blk.21.ffn_up.weight": "a29541c39a6acbc364be86994632a5bf55d701027cb7f23320f8c6d55ee42c91",
"blk.22.attn_k.bias": "c97f84db6c75422df6ef5768676d4e9abefaa3b8337aa2730ff260f8fc350480",
"blk.22.attn_k.weight": "af9a0c56f68779513e95be11611b7be6175ddae27d48bee9dd72fdbf05f6cbfa",
"blk.22.attn_norm.weight": "1c7518eb5bcff4a202c6f4a2827f14abd76f9bcc64ce75fe9db60b69437a5c9c",
"blk.22.attn_output.weight": "1abcf1f3caa2f59dd018646b93f9cf8fd30d49e98a473e6a8704419a751be46f",
"blk.22.attn_q.bias": "7221e01cb692faf2f7f8c2eb6e2fac38a1b751a9c9fdb6a21a0a936eb0bf4b96",
"blk.22.attn_q.weight": "faaf8fb7b6c19f343d47f3ea6b57151fb46c787e0b3bd2c292fd327d3d4d8e35",
"blk.22.attn_v.bias": "3ec05942e82d735de99dfd0d8228d8425e63e2fc584da98b3326bdef89ecb2e5",
"blk.22.attn_v.weight": "42e7b0ad06db76227837da9d4e74b2db97f3df4050ecb3a87cb9b55e08dfcb42",
"blk.22.ffn_down.weight": "87ef98ad2d0e824b0fa5ad8aa18787162922e527c9b1b721a99bc07d3bf97c82",
"blk.22.ffn_gate.weight": "562d6e5a1654b03aaa0e33864d23c10297fd4bcaa72d30fac69fb771ee1df9d6",
"blk.22.ffn_norm.weight": "f8a405dee467749d59427ce05cdd4b9c11bb18934a89258ea461f013b7d251f5",
"blk.22.ffn_up.weight": "90e1f4ae4062649d4d838399eb353e8bb8d56a49982b6a7f64aa3945377f7187",
"blk.23.attn_k.bias": "9ad22178a85f3be7e25f5aff462f31627466364f2f5e92f265cc91db0da9a8a8",
"blk.23.attn_k.weight": "d813beffb10f03278f5b58eea0f9d73cdcb7b5b4045ae025c379592e854f7dfd",
"blk.23.attn_norm.weight": "f583c9836044bdb056d6f8911088ac28add68e500043ae1f97b5d9158fe3d769",
"blk.23.attn_output.weight": "02789911ac3b97f6b761e958b7dd6dc7da61a46a1be92bd0b346039ca7ecd2b2",
"blk.23.attn_q.bias": "38c4970fb9b4f7e4a139258a45639d848653814b4bc89ea9849709b13f16414b",
"blk.23.attn_q.weight": "eb694be9a5ab5858b8dab064ee4cce247dc757424e65282989bd4d015b8580ce",
"blk.23.attn_v.bias": "0a25f6533aa7e7a152a4b198cf6c411c2408a34afa4f161bb4d5ffba2f74e33f",
"blk.23.attn_v.weight": "187e1bac6b70f74e6364de226565aa8275ee2854d09cbe5895451a689596049e",
"blk.23.ffn_down.weight": "88880dd9ba7ee80ade972927f810b5d2c30a69520c615190b27f9daabc0a8c5a",
"blk.23.ffn_gate.weight": "5abec63197935ab3eb8e6de0a5307396ec46cdb1cc5de25d87c845f3c4a3e887",
"blk.23.ffn_norm.weight": "60e1f5e6310c3a531c554a6bb7cd883aed58db1e51853f739436ea461c1843d7",
"blk.23.ffn_up.weight": "3d7f502771743f4a634188dfcd8b8a384fb07467ca8528366aee59ddb25b7bce",
"blk.3.attn_k.bias": "0b6b442ebbac29c8c4b67e8e3876d0382dd2dc52efdf4ab0ebbc6f71b6252393",
"blk.3.attn_k.weight": "480f40584fbda692c26f2cee45f5923780b236f8b4e8ec7bbee0237777a0918d",
"blk.3.attn_norm.weight": "39872be2af31bc9cd6b583ebba6fb759f621d586d66e5a2fc0b85991615a8923",
"blk.3.attn_output.weight": "924b2c80d8513bf637f8ebb3756a340d9cf2243de723fd08d7f5dccd46b3f8b6",
"blk.3.attn_q.bias": "863c9d848156847a3fe9bbc44415a4395245b5d13e95673c014fdb71e494ab0a",
"blk.3.attn_q.weight": "bff73ee5de92fba8f6c089bbb19ce57e17ab3c9c29295712804bb752711b882e",
"blk.3.attn_v.bias": "e1b6fea126e86189112fcdfee79ffc66a087461527bc9c2dc52dc80f3b7de95e",
"blk.3.attn_v.weight": "7812b7f5133636f06cdbb4dcc48ef7803206538641b6c960777b37f60a8e6752",
"blk.3.ffn_down.weight": "00b393d6a7e3ad9b5224211ccdbc54a96aae151f24ed631764ac224972a6bc82",
"blk.3.ffn_gate.weight": "cfd63fa3a038af05dc53c6eeb3c192f1602f26ff24cb840bcf1510fcb37b5513",
"blk.3.ffn_norm.weight": "7389fc240a282949580ea2f5b0d7973ac79f32f76dc0155b537bb6b751f8e27a",
"blk.3.ffn_up.weight": "2a945f47090df9cb16f92f1f06c520f156f8e232182eaaed09f257b8947a2a62",
"blk.4.attn_k.bias": "62533c31f0de498187593f238c6597503fef2a92e920cd540a96bc5311b3b2a0",
"blk.4.attn_k.weight": "93e829868bffd980a8e589b9c4566cd81e6ce4296a5f357a2ae93febe1284156",
"blk.4.attn_norm.weight": "9e0aaa4bbdd1389890f8abec20533f3ab16d61b872b1a8dbd623023921c660a9",
"blk.4.attn_output.weight": "74467d6f44357d67f452ac49da861468b38e98057017bd38bc9a449f9d3538e6",
"blk.4.attn_q.bias": "8e6d9026fd69b314c1773c5946be2e11daf806ef22a5d91d744344fd30c58c59",
"blk.4.attn_q.weight": "e5bfbafd94a4d530f3769f5edbba8cc08d9b5bee8f66ebf4cb54e69bc0b7f63b",
"blk.4.attn_v.bias": "20c570f92022d9905eb85c0e41d1fdb30db22007a9628b51f512f8268d6c34a2",
"blk.4.attn_v.weight": "9638d459d61da03c9dd34dad985e03c43b4f8a5bc9701a82153478329b0517e0",
"blk.4.ffn_down.weight": "9d91b06e89d52f4365dece7eaeec50f81e52cb2407b333248a81e6e2f84c05b8",
"blk.4.ffn_gate.weight": "bf6350a79c6a6ee9146edfd788b88d4a4c2b54db1aa0adcc1464dbba8a84b646",
"blk.4.ffn_norm.weight": "11a70a6b9f7ce336292f4e3a2c6c92d366d4ee4306ad4fdb1870fde107e9cc31",
"blk.4.ffn_up.weight": "64f23f493d02b147a72a59605e6b7dd1c4c74f6813a38a2a60818bd66f697347",
"blk.5.attn_k.bias": "f6c2c279c0ed686f298ad1e5514b5cd882199341f896abbb2c2129d4c64ce9c5",
"blk.5.attn_k.weight": "0e682f75870abf9efaca10dac5f04c580f42820ecf4e234d43af967019acb86f",
"blk.5.attn_norm.weight": "01efae7653705e741932fcd79dff3be643d7e97f4b5719b887835dffe44b3a82",
"blk.5.attn_output.weight": "69e841d00d196acc489cd70bc5ffbbb63530ac5fabb169d40c4fb3a32ebb8ed8",
"blk.5.attn_q.bias": "f3304d76ccd44fed887565857c8e513b1211d89a5d3e81782de507ab3f6fc045",
"blk.5.attn_q.weight": "98612a6b7920a247853ada95c240807d4ca8e43604279e7a2fc9bb265ae40469",
"blk.5.attn_v.bias": "39940a9b353ceed3edfd4a39b985c9520490aa1b9f11749c94fdf6d879d1a259",
"blk.5.attn_v.weight": "839f84b828cf83aecf479a0dc7bc86cce05145ef77dcf29916dc3e0680f5b665",
"blk.5.ffn_down.weight": "1f48cbb0960f15e06ab8a3754ade792995a655856389ddbca629c07e89d1b114",
"blk.5.ffn_gate.weight": "33d8219fce3189e1aab376039896eebd4ad36ebd26a8278cd19b26e4357e4f81",
"blk.5.ffn_norm.weight": "0f4a0f83d37127fa4483f2905cb4f38ef6ddc71584b6cb05632c62a9af313dda",
"blk.5.ffn_up.weight": "22a64a11e5f0a1ff45ca327bf9e1efa258f085ff6a96edc398b7474f725b4514",
"blk.6.attn_k.bias": "baa91df99d4df2d25e8d590bca4e334b97f2d9aa3df8e748fedc8a6188499111",
"blk.6.attn_k.weight": "121f3b9f4b9491996499392e2688a929cafe102a67920b4cb2a039349c43d8eb",
"blk.6.attn_norm.weight": "b4cf987e923d71f2f84c58d20ea8af7576b225bf61952145b489fdd395e3d411",
"blk.6.attn_output.weight": "a112642150a138d54b2a4038042fd33619035a35694771e966f3575856c635d6",
"blk.6.attn_q.bias": "a97ea10469cdfa3fdddf8bad6de683ef99f6170eb8d29d15dcf6bf4bce37c5a3",
"blk.6.attn_q.weight": "d80c787019317a87361de6bbc7df6701357216bdd9b404522cede34a719a5500",
"blk.6.attn_v.bias": "d846269db9cd77ae28da26ba0914cace1b6754bd5301af9c44607085dfcbd2d7",
"blk.6.attn_v.weight": "06567c433e8a391647633291b50828a076ad7c2436106bb9278c60a3f8fccb3b",
"blk.6.ffn_down.weight": "f15f66f56b3c474eac8c6315c5fff07c3e29c6e483d7efd4d303c7f43814be91",
"blk.6.ffn_gate.weight": "47768f89c6da8eefb29adb766ff4eb38c9dfd79320bbc1386248319fcbcf567f",
"blk.6.ffn_norm.weight": "7f8195e6b148212967145fc9d86ce36b699cff0de026042245c2d344f1ef8510",
"blk.6.ffn_up.weight": "53d7707ae4347aadb445289f9f87a008b72df5cb855b00080a605442fdd8edf3",
"blk.7.attn_k.bias": "63e274df3217dde25b8369a383e480fe4f6b403a74385f15ac0b5db71dce2744",
"blk.7.attn_k.weight": "f6fce88602f5945eee09767acbcad387d132614e6da39ae359f2bbf380d94b1f",
"blk.7.attn_norm.weight": "bbf5dc7336c0f9a511afef6bf5efeffd78f1b83940850c3eb7eb20c621b75656",
"blk.7.attn_output.weight": "d9fb907a138396a859cecbfcb377927308dc93c24c7fb52dba5eb59265feadec",
"blk.7.attn_q.bias": "f02ba1318346af77e309f40aee716e2de7ee8cab67e67b17636db9bf40894fb0",
"blk.7.attn_q.weight": "54a691e824be287a61c35c172edc01922ed792d2addeee029afc17ba6c7e11b9",
"blk.7.attn_v.bias": "3a4f182f51e84ce862d558fb2751b91802b65d74596bb14d624808513a8a83ec",
"blk.7.attn_v.weight": "a142fe6e106d3ab484e2dc6f9c72b8fc0a385279dde08deb1ad1fd05ac25deb1",
"blk.7.ffn_down.weight": "8daf7e8c430d183a4d6ab3eb575fafa4b5e31689f68b290c8b370411ad9d0f12",
"blk.7.ffn_gate.weight": "a2a786b45eb660994254b48e2aaf22f3e9821cfb383dee0ba04cc4350a2f8e72",
"blk.7.ffn_norm.weight": "73828bbc8c9610cc139fcf03e96272648cdc291263251fe3a67367408deb69e1",
"blk.7.ffn_up.weight": "e85dd0f63fed449ce16893c5795ea6a050a2d7a66d9534410a227e22c905dafa",
"blk.8.attn_k.bias": "91a752a6e2c364e5ee6a015770fe289aece4911ae6c6bbfe74ac52f465465f93",
"blk.8.attn_k.weight": "99c069e92c43a2efb74e23188256b3cabbbe06399878e681ce203a05d5da378a",
"blk.8.attn_norm.weight": "c76d36d3cc06aa2a9edb1abf9f602bb7ed61ac9d61f8ef7ed736a1e619abe717",
"blk.8.attn_output.weight": "ee5ff156a2625e1f203f65e69b514f9df04bd9a5e82b28e3876e16cf1c6f65c5",
"blk.8.attn_q.bias": "8fbd868a93b330c8b0418b488c5301f42a7eb0c58445a4e515d56777f1d96ed5",
"blk.8.attn_q.weight": "9f20ef86e80098ba52a3a31ebcc315bea3a614dac9cba7ac1db02f156db9b577",
"blk.8.attn_v.bias": "c4813571d5d618742183a7890c0b89cd7f18e210c758f63aad564659bc38a26d",
"blk.8.attn_v.weight": "ea88e1a4cf8bd56e9a88ada427d2b0cd352234827640757ee2a9ed594fb67a53",
"blk.8.ffn_down.weight": "b0d1a7495811580b189aaa3e20ea871d6d01ed7b6c23e59825078ef786944ff2",
"blk.8.ffn_gate.weight": "0a17c0caa0b06721c49b59b2a63a5dcbf744dd1cffa55962b404ba910c658a62",
"blk.8.ffn_norm.weight": "f15f109d4a8e9d1ff7c71fa5bc6373df7ee80c5f7d1de3fa0d4849d747e36bcb",
"blk.8.ffn_up.weight": "bbf4c5c4c5c8a0f9ae8b88e3cc8b86f81b98148722d5a350995af176c0b774f2",
"blk.9.attn_k.bias": "a7f60d962686b8ca60f69643e0e0fa8614688be738fb0b1c6bd54de35c2beb5e",
"blk.9.attn_k.weight": "dd80ce4adb00e338fc04b307e4c18a27071f4ba4397184a24d765e6e4a268ef4",
"blk.9.attn_norm.weight": "721e6487547e2b3986ab4b4e2500ceade59d908bccf4436e1e8031f246deb2bd",
"blk.9.attn_output.weight": "5a800af39107b363861e5f5173483cdcd644d8ac3b0c8a443b9c759d71285db8",
"blk.9.attn_q.bias": "0a19b4925ea8ca8067acc909b058adc327de3874cfc94cc9eb4a106d3f370123",
"blk.9.attn_q.weight": "93e84906684c0c7ede79967236d9fc8344da84a9f1daa04e8295c2c9b6b26a24",
"blk.9.attn_v.bias": "615421f812f821e230ecde4e6da35d868823248355ce7e4e51e2d650ead565f9",
"blk.9.attn_v.weight": "7f4913e289aefd9ceecbdaf9767b1e95303f5d59dd67ecb2cc15768477f4d08e",
"blk.9.ffn_down.weight": "95d1b3933221e87dc4af70dd566daec9498bf358070b8d26f1fc70766a84a152",
"blk.9.ffn_gate.weight": "530f2d04f6a1fbffaaa5f2fbc3a328ebed7b330e3af14b4fc7d8a51b13ad8d42",
"blk.9.ffn_norm.weight": "28077de416217ea1df94b96017bef4cc562ab62e51b1a03a671c70abc29ce52a",
"blk.9.ffn_up.weight": "b87b6190778aaee4695938e24ac6c90dbbee6dce7c5c2ab5bc26ba4564581822"
}

344
convert/testdata/c4ai-command-r-v01.json vendored Normal file
View File

@@ -0,0 +1,344 @@
{
"general.architecture": "command-r",
"general.name": "command-r",
"command-r.attention.head_count": "64",
"command-r.attention.head_count_kv": "64",
"command-r.attention.layer_norm_epsilon": "1e-05",
"command-r.block_count": "40",
"command-r.context_length": "131072",
"command-r.embedding_length": "8192",
"command-r.feed_forward_length": "22528",
"command-r.logit_scale": "0.0625",
"command-r.rope.freq_base": "8e+06",
"command-r.rope.scaling.type": "none",
"tokenizer.ggml.add_bos_token": "true",
"tokenizer.ggml.add_eos_token": "false",
"tokenizer.ggml.bos_token_id": "5",
"tokenizer.ggml.eos_token_id": "255001",
"tokenizer.ggml.merges": "902a060cac8884a5793d2a857dd2e53a259de46c8d08c4deb243c239671e1350",
"tokenizer.ggml.model": "gpt2",
"tokenizer.ggml.padding_token_id": "0",
"tokenizer.ggml.token_type": "b7a352ccd1c99d4413bcf452c2db707b0526d0e1216616b865560fab80296462",
"tokenizer.ggml.tokens": "815ac90ff23565081522d7258f46648c8a0619eb847a9c7c31b238a9b984e4ae",
"blk.0.attn_k.weight": "6fcfdb466f9ceb1229404ce4ec4e480751b8d00da12707a11783dad7256cb864",
"blk.0.attn_norm.weight": "6063317f731371864049c7704a70772f1eb632194201ebdc2ed0f8e483507c72",
"blk.0.attn_output.weight": "920f49716a1e2fc73b6794ec777947f1c122701e63ed302422ac89e90f06e9da",
"blk.0.attn_q.weight": "ddbcd7cde197e632564ac58e4f25d9e3a8ca52917329eeb6081eb41a797932ab",
"blk.0.attn_v.weight": "318fc02a189d87420f0cbf57f47f11e00c21ec1ed472ce0a2a895b44f7fa0fca",
"blk.0.ffn_down.weight": "aa71975b6eb1f4c77b03d2ac4a194cf8d95718efac741bb12f0f3ff79a27f9bc",
"blk.0.ffn_gate.weight": "42967702fa0bc738b88dc50007ace26dbe74a5a9e0978124dd093f818241a9e1",
"blk.0.ffn_up.weight": "5282c8788b086bd30f46525e7995a17464882a72703fd27165491afdd8bfd4af",
"blk.1.attn_k.weight": "cd248882e64fd2c3402c44790ebe12440133dc671b6893fdad0564c461973adc",
"blk.1.attn_norm.weight": "ba84e1c8fd30af6ec94208db4078befac8c921aad3acb887812887f3282ea2be",
"blk.1.attn_output.weight": "2efa3ef7c5666ccceb05e339b83ad680cc0d2c3ec78203f5da5959f23a80e14f",
"blk.1.attn_q.weight": "5106f2e255358a1303c22e8b5f0ec044852bb30a866c52cabefd30017a7a6b7d",
"blk.1.attn_v.weight": "a211a634a1a5df1d5f973645438be0461dd922210f9747c6b04e386c7f1ebe95",
"blk.1.ffn_down.weight": "37093afe48d32c578ec956c9ed85242cd000d6aa979e60526aafa10c822dbb10",
"blk.1.ffn_gate.weight": "469860819e9159caefb1aad0bc66db790f3393f05fd87b08e52256a7ed256543",
"blk.1.ffn_up.weight": "736742c97d35d1a011f9cafd3c0ce947ad559bb2fba6da73c816f6bfd0fa9aeb",
"blk.2.attn_k.weight": "92c219d92804d832ab404bd6dc7339c90877bb7cf405dd030c121f8b27757739",
"blk.2.attn_norm.weight": "61e4466069474b76b6d1e702566420eb669faf3556b00ff7b824784aca13a2d6",
"blk.2.attn_output.weight": "d2fb38a2b2171fd91caf037faa585a62225819aa232d86fd4f7f9d2c3c8a45e9",
"blk.2.attn_q.weight": "f6faf5cc6844e3daa4f9f68d90f5458c64879de68a7728860e38374e30c3429d",
"blk.2.attn_v.weight": "f340ef8f7341d987a6f37c0e9afe0aef5be67be00c0ce5f57612daf73319cce1",
"blk.2.ffn_down.weight": "c7be61a701d779860b621b143fb6365b607bf99ec7c0f153b07908ac8120885a",
"blk.2.ffn_gate.weight": "b64f0878187bd3392abfa4c3e8ad2f8b4c133903e54246747ff8f3b4639ad83e",
"blk.2.ffn_up.weight": "50b11c712652e90ee7428dbb45cffebb80662ac982bc72bd9eafff361b5eb5a8",
"blk.3.attn_k.weight": "2b7bcbe9ee5c9c630c8c8d7483887e78b73581016f4cbb6933db2a147a25f431",
"blk.3.attn_norm.weight": "0181dac7f4eee7252980323e8032cf339bef2046ce0a16c0fd72af7c98a8a37b",
"blk.3.attn_output.weight": "aef8843b636ce231da9e7c9acbee197883cc15df0e2887709324c6a50f16da7b",
"blk.3.attn_q.weight": "55404130fa10e81322d33eb378aa0de31a92990ce7730f1338c0ace0406bb1b1",
"blk.3.attn_v.weight": "76f7fb8040d82b957d689ce34fea2302a6640ad5bbaa0052ad2b7ebce270c33d",
"blk.3.ffn_down.weight": "648628933eff3b357c3729c33c5b1ae51c28e59b9c19acd1601a2ff7c5d5d9a5",
"blk.3.ffn_gate.weight": "6a588885d16e98d5f50ebed05af089154f680085ca9c97691e5b489088630a4a",
"blk.3.ffn_up.weight": "e12455a1d702f4986e1a663493e3d5102b367af74d45557522002a35d63ecac2",
"blk.4.attn_k.weight": "40d943380a8a85e4eab147934bf6e16f23cc8ab753f6636526382c074d182288",
"blk.4.attn_norm.weight": "4ab2c098983d4599fe540eef624c4df954adb7473faebda7471ef0ba4134814c",
"blk.4.attn_output.weight": "d14b91e40f58bf4a3c8c2eca0b12bb541de406574af39027d56f6c588a147082",
"blk.4.attn_q.weight": "e1224960a3562107488589f883fa32414bae41712fa8dbd47c5f3e3a7801452f",
"blk.4.attn_v.weight": "063f297bc4aa6e709fc32c4c32e35af7d07d80e83cb939b76adbba858006c03d",
"blk.4.ffn_down.weight": "f88a18020c5e1caaa29596895eb348e76ee5bfad27ed57651a86cd8cd1f9b5aa",
"blk.4.ffn_gate.weight": "48e7e1eed3fb52e92e61d3557dd0ec002418327090e034ce4322fd68542266f8",
"blk.4.ffn_up.weight": "1ca8a7aa17355b6ce0d9ad5539fdad3899fa47fd359c285fbfb31f19f47bf073",
"blk.5.attn_k.weight": "2bdf15f8e73d068d972380f25d207004cf0bf3b5bfa46946803ba6fba07d9175",
"blk.5.attn_norm.weight": "60448d7cde6e1b6467aa31bdea012e39cdb08c88081cee7d102dca4f93f766ef",
"blk.5.attn_output.weight": "f9f687d7c457537f9fca8a4087a59f1c3bebfaf5537b94e42c831a13224f7799",
"blk.5.attn_q.weight": "987db7a2ad68657a92625e1980effbb1f79697c2183f2b9f3b3a0570c51b0ab9",
"blk.5.attn_v.weight": "cf696891148f3e4783ad1d20f93462ae091eb8651c656bba9b662253b6263e02",
"blk.5.ffn_down.weight": "c0662b0bd0929136005fb9d691fdd9b2c33867d9ce9622339a6a456b720b059a",
"blk.5.ffn_gate.weight": "200bbdfab615d7a3a84719b6ced7751e3ce52757ef212d96f87798bc1de5e987",
"blk.5.ffn_up.weight": "df5d23e7e035fb1b9d163da7ddfdfe38da6a37e86e96534dc02ad20f011b55b3",
"blk.6.attn_k.weight": "c0dae2d272a7c5a2fa004bbb8475dbab362fc1f6d008e73d5a4434a9382ac6ba",
"blk.6.attn_norm.weight": "51c57ac8b55e04354d5dca6bb9c0cf4177639d3b038e80209e33036209688f64",
"blk.6.attn_output.weight": "229d97892c62f85bcdf431675250e01c976ad69ffa450b01fb543bf88f14a2fb",
"blk.6.attn_q.weight": "c20e49621821bd46ed156e6823864a5bda4f317750e71ab8dc54e44eb48cf7c2",
"blk.6.attn_v.weight": "53ceb1a2ee43fce3c7b5b33c58a9fc5ee7f44dc1c6f29bc9dbefc37582102dc9",
"blk.6.ffn_down.weight": "7923c943b7629d560a032d1efa210d1d75c6692140f1be94464ee7ed24f44ed0",
"blk.6.ffn_gate.weight": "57593d350361af753a6a39f53b066282634c0fb44f396f6f2966a574b01d8f8c",
"blk.6.ffn_up.weight": "327b6a7a387098b8899d3ded04a4d4e7c658ca61b80d4e7b17594be232721602",
"blk.7.attn_k.weight": "9ca48b87a10116fd8868e62b76f211d4bb91f166096be9061439ee2e1c3a5c20",
"blk.7.attn_norm.weight": "cd56cfcc4e2ad6b96e23ea7b0d32b4caf236107d99a0b22c56760b62e63c8cfd",
"blk.7.attn_output.weight": "7352b509a03cae2491ffc060e577d189341a0f861233f18c96f9d275dc4234bf",
"blk.7.attn_q.weight": "2b3791c8c008c33ddbe12bedba8191322ceea2dcce5cf0eb7a93d40ad254e672",
"blk.7.attn_v.weight": "3ae721d52466487a3d48150581e57f6d64ea1e83ab929f23b28c3d777422eeb6",
"blk.7.ffn_down.weight": "3b6fa8ececdb3c34af3a5363863d6f94289c1c95bf47fce3a3ddcf184c5f0848",
"blk.7.ffn_gate.weight": "dbd7df6c5ae5eb4adb859f0d36453813a4e289a359a1ba8f72d67fcbf21c3e22",
"blk.7.ffn_up.weight": "de68380a334b4c5cfd4c318b0e9854aec59bd79aa0f0c30af3f56414f83482b0",
"blk.8.attn_k.weight": "7303c4e4480abc72a7ee271811311199245fb5c2ea27a2bd3b8cad3a53a03c27",
"blk.8.attn_norm.weight": "2e3d1921898d1b943ce1a1b6818546c8b471d6d542da24f51a8b514b8c3dd4ef",
"blk.8.attn_output.weight": "30421520887b66bf97a18dbcdc283bc8d0b60590b612fd638a319a6eae923227",
"blk.8.attn_q.weight": "73e064d5433c9b500068a1c31744dbd53f4ade298fb450a0e8c97f62cf1f8a8d",
"blk.8.attn_v.weight": "27e21f8b9a9a8533e8178ca34a72aa1d786393d57302b7806dcdf3e51de511a8",
"blk.8.ffn_down.weight": "bf694bd8e00047982108000e7b3dee7b225db8b19abc595e5697b6bbefd92e7c",
"blk.8.ffn_gate.weight": "d55fdbf8606d9141b774b0500c58944fd1253b9e69d1f765eaa9a680b9f2ca40",
"blk.8.ffn_up.weight": "1ae3f580655e7c8e8dd6c34fa4ac574fdfc5e3f1a8536da0c5442d3a2976f0e7",
"blk.9.attn_k.weight": "b18080626012d8aabcf78542d6c7bf31c712bf55a70172fbfe173fcf34481036",
"blk.9.attn_norm.weight": "2e3620620dc09998c6d3063a7d5de5433fbbae8c11e5b00d13f145d39140e162",
"blk.9.attn_output.weight": "69c3c0e27ef1c0fc933eeb7b612b70909f18cde238873c0d576a2ba9714ef174",
"blk.9.attn_q.weight": "68330e5aa28a28873c9a6e67f032186ef651df2df5844e0f27094ba349fbe4ab",
"blk.9.attn_v.weight": "3df8d45a102be082d0793a51cb82aa62a43cd0e9d047ba4115ca0f2414b39325",
"blk.9.ffn_down.weight": "1d6cc162b73745b135b4f040a0aac3c06d5135a3dc5b2421e7ee2af48662fd7f",
"blk.9.ffn_gate.weight": "034a9d40fb1e32b534b45f4bccd65cbe43c4a6a3f5d01132bd245ca0005de5fc",
"blk.9.ffn_up.weight": "c838c38d0e1a0ac0da17eb2a66023ed31929f07d8fcfe1cc546df26096c91f0c",
"blk.10.attn_k.weight": "a78507cb72f744b86ceaa032596e74e5571c822d0226d334881169addb32cbd5",
"blk.10.attn_norm.weight": "35f48d0b28ee0e6b4cad4e983925737562d64824be5b168b3e26df3d6b260cf1",
"blk.10.attn_output.weight": "53712db06796de39b131323e7abf9a58551b6d52da6db66a471580386d396252",
"blk.10.attn_q.weight": "efe08429ba196026b81cd1c471e1c7418afd9e966659feb3936b674aa0803b58",
"blk.10.attn_v.weight": "7ec6055e134f89da0cbe79ec9f13ef2e442ac584b1f03c3e13e7d0cdad0078bd",
"blk.10.ffn_down.weight": "37e66af4bcd1f3079e841e892255b8255070655901864ea3a8c602a7f681a640",
"blk.10.ffn_gate.weight": "1825282bc34830d371c6edcc3c1e73e6ecc1e10f4aea0122dbb7acc1d6f7b1bc",
"blk.10.ffn_up.weight": "819b3b276a4d4c14a35ed6682d5ef18a5e8ed468e5ce3f12e8c75ec18ac20ec4",
"blk.11.attn_k.weight": "5327e6a2af82dfff0619a14971f5864a15553c36fead84e1af42c7630f2729c6",
"blk.11.attn_norm.weight": "fec363b3c4a43036d2c635fb8aa9e122dd87ee79811839f2f6cd955be3373e7b",
"blk.11.attn_output.weight": "ccf7b38f18ee8798b8a6a35018e2df3eb3e007de62876befb68025dd66c79763",
"blk.11.attn_q.weight": "da8c4a1c824ffe174e39f126cd72f7ef83c56aff1259d452a1212de80f98f5e9",
"blk.11.attn_v.weight": "d17ae6bb77f03982b55d341eb67acb5969e9ad3da5994b96eafc09793dcfe3a0",
"blk.11.ffn_down.weight": "a6bac521e2791345f22c57205fa1c2f2f687794dfd24d0e98d50ae0d0eb6088a",
"blk.11.ffn_gate.weight": "5ed902c488cb51ba5635f3df08258c5f84f31a679a00211ea5f9d8b824ef6d9d",
"blk.11.ffn_up.weight": "ee9f1437eb890d2cf9df2574afa1cecf20aafdd847cd75b152d7eb74419afd34",
"blk.12.attn_k.weight": "5a069c06e1019b0f889088e67458f7a11ec77fa190ada6069e46211f62219947",
"blk.12.attn_norm.weight": "194d7e5fcc8c49aea62daf1940532419cf3c505afdce6be377286b677db5db8f",
"blk.12.attn_output.weight": "6534995fd4d6fecb55e317add4b1723aba4d825e1e9471d0b08813dfdc247176",
"blk.12.attn_q.weight": "4ab51ca519b5995581fa34f846276feca3b907ef2b51f192f6cc0b3263c3f5a2",
"blk.12.attn_v.weight": "5652ca3fa81ef9a1ac1543d71fc6813f8517f8ec54b25c701f6f98061614830f",
"blk.12.ffn_down.weight": "4b2c263f54c88516b8eb273bb8d9615b01c5c8b484dc70358adb91b50b300edd",
"blk.12.ffn_gate.weight": "8f50c3c3e3e8568991d6c1b0e74b500cf4f208e7700bbb8e87c3f6a6d359b6b5",
"blk.12.ffn_up.weight": "1c1a581fec1fbe959e1427fa513f400100b5e1ee9d83932630be9905fb49c231",
"blk.13.attn_k.weight": "efd7a38c46f08d8376d82974f33c644e3a02220e142d63b1704718699a8a884c",
"blk.13.attn_norm.weight": "d28fa4f1bd75abbd063b0e622e08f579c89cd0c0c5ce63c1952ec9f944f8ee13",
"blk.13.attn_output.weight": "71e0068a639288718bdb70a6cfdefd50bc8b3ec3993347a65129e70001ca5827",
"blk.13.attn_q.weight": "b97077adc92cff07a2e07d80ee38f214ad8713571c69cd5c70ebd43dc501ac87",
"blk.13.attn_v.weight": "79b3e2749ab4b459c81e96e322b215f1e8af645eb346e176c326bd00cf6ed2fd",
"blk.13.ffn_down.weight": "9f8687d11effa1db7cfecf7bec5631734bcf2962aad74a9f519144491e08ec85",
"blk.13.ffn_gate.weight": "7d14dfa0543852e7777fe8fff29ca533744cbcf1ebcf10067e5adfc4eb345e65",
"blk.13.ffn_up.weight": "852b9527b97fdab211ff3f832a660ee1d93ccb56906144c50f01319a6e8ee615",
"blk.14.attn_k.weight": "79e926b20f36f66d58226cb358881f2f68ae7b468787d33cafae5110287a14a0",
"blk.14.attn_norm.weight": "97d481b63deb0df6142c2c6cd23043720c62eb609e390f47a7113751c79974ec",
"blk.14.attn_output.weight": "aa6e94d7176d5c79fbb89b96e5f13ce75702ce3dd23ee52986446da436a6c3d6",
"blk.14.attn_q.weight": "214becb6d1bb460da9fb8ace0f99b9a5afa9edf7aa7acc19606c7401b11d6305",
"blk.14.attn_v.weight": "488b0e6d7f1a7a2ed0972aaa6d10ef9c775ee5373460324efcf5b3e3da9311df",
"blk.14.ffn_down.weight": "29c7ad16cf9542e30996a1a01ab95b844533b28051f04cc7949c371afb796471",
"blk.14.ffn_gate.weight": "b7ef208f2b054803665b377f5a5980c122c026841809cf855c6ba06d1c3a885a",
"blk.14.ffn_up.weight": "76a5cc28100748d79c4398ce7b9176aab4d661548b6293a82f99144812e5b70e",
"blk.15.attn_k.weight": "a6b8f9e98ab878fa7ebc5d080978ebf2d050acc2ab2fa8ea9188eb10e27702c8",
"blk.15.attn_norm.weight": "a26d07a9752d6dccb68e3a8a2a49fd0752cdd0a415e05547819bc37d9ba63d5e",
"blk.15.attn_output.weight": "c63616c69048ccbee801e05be4f56d21fda21aa0cc470f41d57c31b4d9283a4d",
"blk.15.attn_q.weight": "fd595a67bf96c6ba16eb148a9d02fa52fa3c1d33ed10be28a08f851409fd6e64",
"blk.15.attn_v.weight": "1c5c9d33fa07c05d5f4ed0032c6c4aa83d863f0d31c94a66109d239dcd03cea3",
"blk.15.ffn_down.weight": "585ea62ab8aff7d7d212ea5c1a03226fda6b68370c890b776834af70c948dcbc",
"blk.15.ffn_gate.weight": "a13c63f86f879b03a573d5dd2a25cfd1f4dc73e8132e6454ecc23e538b4cdf6f",
"blk.15.ffn_up.weight": "f7112450f57c12fcd511f049e0dc0b541625a107a7901c3261ed9e984299f65c",
"blk.16.attn_k.weight": "2d2c8b11dd71fba6d1c106aa1673c113a5448653cca7eab897c8739212ed5003",
"blk.16.attn_norm.weight": "95c2ec7be9469690e18a9a1779684acb3e9da44b13e263a0da840305646fbf8a",
"blk.16.attn_output.weight": "31a65046e677f54dae654ded4e733479fcc0f7283d83076b7dc7cbcae8528230",
"blk.16.attn_q.weight": "bfc6292b9c6d49b7118d08060242a138182eb182d136ba5dfaf469437c16081d",
"blk.16.attn_v.weight": "68f81d037340217d87c7853ff4d6edfbc46d9e827ee6d5bff7c3f6238e3a95ad",
"blk.16.ffn_down.weight": "bbd6629691950cef4d5113e1c6670e91b216a9b872cb92cee02dfda4d6c4f7b8",
"blk.16.ffn_gate.weight": "63cb56f282b7401ed6c76e5bb6fdf1bf68a64f9af0c82c014209b55bcb5191d0",
"blk.16.ffn_up.weight": "b54f39a2541063cbfb6f713aa81c3b69a04100e999aa2ebbeec195dc382eceec",
"blk.17.attn_k.weight": "3d9ba49799cc56664ec30a002bcad61eb651294212a68c3ddb573eb042aef5a4",
"blk.17.attn_norm.weight": "42ee0db4b9d63257bca0012a30b12737ead1caafeb5ed3d93c8f48ffec4b46de",
"blk.17.attn_output.weight": "a38fd100f05c9041c592bc739e287de0b10d08ef2bda41a879225bdca9002f71",
"blk.17.attn_q.weight": "8a3bee285b0180a9eb35662e449ee4cbe16d992bdd48fb3a94bc4a347728cfa2",
"blk.17.attn_v.weight": "d7f8f1b8b863494ed4392a1656775912e9b264ad36016547b12e832a1d6757d6",
"blk.17.ffn_down.weight": "bb7ee58f61da8630972e25b621996fbe8ec06f4dc9ab1e268ab5b120c526ca28",
"blk.17.ffn_gate.weight": "6b652dbf167fee09a45ebfd78d500ff6548fb2756dbe5343ffec3f7e6207179f",
"blk.17.ffn_up.weight": "3b67f727e55e742715de978fab80457781e7a3762bc48f79d13b45dcb8de664c",
"blk.18.attn_k.weight": "ff7fe57c57b90c6fcc0aefc39ec24593c3a7d1ea1c23770480075a015450e0f5",
"blk.18.attn_norm.weight": "1d40faca082d2633ef0ccf19e121870dd6c7c3e2154607c7f3543fa96e99cb2d",
"blk.18.attn_output.weight": "9adfecaaa397a92db4687efd5fcabfa0daef9e6b0493763b7ff5ebc185c43a6c",
"blk.18.attn_q.weight": "ad1803eb9b291948639277afe981e666b07167eb3fcae903ba5b73bf86d8f50b",
"blk.18.attn_v.weight": "308cf23399adccf27401a4ab60d74dac6fb9d4cd4b9c5940d9145118d1881b34",
"blk.18.ffn_down.weight": "7de4ac9a561fb580619b745687dfd7ca8a69ef70471dee978741b80e9ff7bead",
"blk.18.ffn_gate.weight": "0c66970f696b33bd5ee8f1f2fbcb41fd78fa5ccabdc927e11a4d5a4089f19c69",
"blk.18.ffn_up.weight": "66a42e988e8a1f468fabf976c48e9e4bb045eaac6916ef16555ac101cd674abc",
"blk.19.attn_k.weight": "a928ab50390bacbcebe2e4b66922498134ce22d7b93beaa87d6cf4ab52eb7174",
"blk.19.attn_norm.weight": "b4a02c55b46c2a96aec9c64a254087cf48e6c1d4b6f31782c77a46fc4daebad1",
"blk.19.attn_output.weight": "b768319c641dff1eac5d1f8ceb960c9899c795bf2b24c1d6bf70aa24fda45f77",
"blk.19.attn_q.weight": "79ef3f57d187d3954a26362096e1b6c222d76f537dff73e034d6e9999935b8bc",
"blk.19.attn_v.weight": "ce13d6b13e24fcb2d5bc6a2662e5bd295b31b12db10a6d0307f86cf29b8d5001",
"blk.19.ffn_down.weight": "cf90d7e2137482cfd50934a8223ad774621d08554969da80a9712df5e6227eb0",
"blk.19.ffn_gate.weight": "71ce30150f003b6eeb3bf7464e05b6ae615f135110d8e47f0a47fd973e537c0f",
"blk.19.ffn_up.weight": "7f92aca0cc29866633feec701ec01a85a8ee2fd4e2b9630173a6cffb1d9d50ee",
"blk.20.attn_k.weight": "a2df23159d6fb74ef28e14b61028fe8b00a693a2fc9234a980be74f20b958682",
"blk.20.attn_norm.weight": "c6cd5f1b096fc5efa4eb59ca1c8c4bd28730f3dcedd59a63601663eccc6724ed",
"blk.20.attn_output.weight": "896a8a166d0f006d4b09867ae4345426303cbc3fb13a18d3d4e1bde00f16dbdf",
"blk.20.attn_q.weight": "01eb79588fe61baea0da43e99f4dc5939590e1bafd01e12dadb8326f102bfea2",
"blk.20.attn_v.weight": "bd39630fdd5a7c859ac1addaf53e63faf524c3f32f5f4896d86b6e746b1d5c06",
"blk.20.ffn_down.weight": "0304a5d39957a0e3f031c4bcc4549a135d396c8d97c8d276fd1c823ce86560c2",
"blk.20.ffn_gate.weight": "117b79d595b1dca0c8b37586beaecc4d84411507276212dc286cde7fc36c9bef",
"blk.20.ffn_up.weight": "6e799346db145c125f01783539749d3828fcc451cd4f10c5352f047a47e28714",
"blk.21.attn_k.weight": "1c37e4c0664147e775bb006b226b9553e3421140cd96288ea755f81731ab80ba",
"blk.21.attn_norm.weight": "00ae783a29000ccda5e4bdbff03df0752fb82805dc3f9b987500ebd80714476e",
"blk.21.attn_output.weight": "7588b84f9fb19f15095b5265c60b4a4e7ae74bcc47d4607dfa5d0bfab6f136cb",
"blk.21.attn_q.weight": "a65f1c0dd06d45bb97532d3e932689c1eecfe7359089b39174a96a149335cbc1",
"blk.21.attn_v.weight": "4220b77e7d5e8709b4eef33a679b5dad11f297085ef44c9977f9e54ef08f7a2d",
"blk.21.ffn_down.weight": "b8c082a0530d4b5328e67db0df84c5498f2af956de23c639fa0198ffea853950",
"blk.21.ffn_gate.weight": "cd1b656ee72d00e9835ef667c19ef89a88de261eb8eb7c0e936e0f9ddf83ef9f",
"blk.21.ffn_up.weight": "dc445f73e36ec7a3bd86884186b728f8e0187f32848c3b8b69d4d41f8571bf31",
"blk.22.attn_k.weight": "e37cf0b893ec8b9ee8c78dd139b8d9c45cb997a3bc0c3d93a70ca1c3f6af8859",
"blk.22.attn_norm.weight": "248a27838d3c46cc03a5c312facc84e2e0e2c990ef8401e93da25918497f88d1",
"blk.22.attn_output.weight": "fc191a18f6d18332c66761f7ab28008bfe295dd1f5c8741a2488442f9e00d0f5",
"blk.22.attn_q.weight": "4b193a2ab8bc2b085db18f2bf3eeba26e02b537b2cdd738160c8f14b165d0f5a",
"blk.22.attn_v.weight": "7a60ce5ccac7e045e55ba1e1e85bd2a0f93f8c781daee96c5223665e22f0c666",
"blk.22.ffn_down.weight": "e0a34fb4244e2c7168f3dbaa1904c15d339ec39999cdf27128bbaf619ee0a237",
"blk.22.ffn_gate.weight": "8bac872d4b8549c8812f927efa309f1792b524f33601095fff61b826de5a5615",
"blk.22.ffn_up.weight": "b67fa2b94dd901b6ec64c0853ce8ca2d86fe9cb1cc6d2f15fbbbe0e691c0c648",
"blk.23.attn_k.weight": "2c32e66ad01942b819ac09a197c71579fe66f02226a264fdd72ad1e02c67a27e",
"blk.23.attn_norm.weight": "825fdc94deb439cb93c713eeb077c1052b90ed658d6d464fc4ad3d611e911d48",
"blk.23.attn_output.weight": "95ca6707a95b8750b0c7c5d379d368f0f2e7ebef631954e7d4d8ec0f41f13a3a",
"blk.23.attn_q.weight": "6eccc84faca5fac015d1b26e2854501edcfd292a302228fe14cf99f5eb59a34b",
"blk.23.attn_v.weight": "b343ac3d226040f1033ee049668aa1d89b1774bc18431965682e5dbdce78ccdc",
"blk.23.ffn_down.weight": "9fc599befea8d3b1e342d564a110074f66d2542df406c4b90b6bdc5828fbb2b2",
"blk.23.ffn_gate.weight": "488556c1b0c9f0b20b0c99b4bac2e0f4046b81edb601d7b91e7e5b3bab47d667",
"blk.23.ffn_up.weight": "1088e291d7008dd9c7c2dd6830af686a8a84b724d123a016209bd5156d6898f1",
"blk.24.attn_k.weight": "a923fbe35e61e009a53927d7828818e0592bb737d6a1106c4b0b5a1efc367e07",
"blk.24.attn_norm.weight": "9b51aaaa939cefafdd9b13a7e5b74ac7fa2d603427e55a16a909d6f3f353750a",
"blk.24.attn_output.weight": "1beb2baba56f8409466434b037771248c2f620ec5f53e15f44c271d5a2d9ecf4",
"blk.24.attn_q.weight": "4b0194fe5bfae0c6bf6131dcf8cb6e2b994f6ea10b27cb03574f0f4f8cc0c950",
"blk.24.attn_v.weight": "6ac34b1ab0f66226d85bca1194a7c212cd93d384ecbc8b8395de48aec0970a61",
"blk.24.ffn_down.weight": "5508f74cb732a662c2936b32ac5e90742d172b9f961a747b0e5cba0e5906a89d",
"blk.24.ffn_gate.weight": "095e39b8584403835f9bb1ac33e0e81f54175575e4800273d281b845bff381e7",
"blk.24.ffn_up.weight": "2d43ec21637dda12973de367b0113ee9840b0d815bf6fce042f7c3f270b0b530",
"blk.25.attn_k.weight": "9e2aee029f3d2c7f67dfc7926e72c8228fb978382c8e5a4701bbf82c93801419",
"blk.25.attn_norm.weight": "220cd7164fb4cdbe22d26058e4153b26c27c7b5ce2bec8e95bf2c0ea08d23103",
"blk.25.attn_output.weight": "a17f4a5dc6aa51f03dbd75602d98e9491767c205cdc2c3a5f8667fc54bbf7c64",
"blk.25.attn_q.weight": "f60827496835c440c794bf57ce9780704d10a59d8229886bf75ebb18900ba4ef",
"blk.25.attn_v.weight": "9cac217e9e9f4f4c85f14ee51165a77c580165bd4a34b202389169bbe61a1ced",
"blk.25.ffn_down.weight": "a0f36949b663e80849581dfb71e7babcc73580793bbcb0c80ab26d5a6e000359",
"blk.25.ffn_gate.weight": "df4d1be4d50d6afe5ad3ef0d0e0fac76a33e85c963dea769641d612dd53e7d13",
"blk.25.ffn_up.weight": "992da76be762632e25ebc5ef4d03728eece1b43f7c4e31827df19ca724aea694",
"blk.26.attn_k.weight": "34199ff856ac32a500c754539d070258574192a34ecba87a182897cb59fdff52",
"blk.26.attn_norm.weight": "a8e9dfb2dae5d22b5c0aec5f3675991c0e3c3e6a44153db2579136b73f456e00",
"blk.26.attn_output.weight": "1c4f257ffb0d7db0f11cfb275e38b4af736917b43ad82de1badce3f1d227da4d",
"blk.26.attn_q.weight": "33d55786274c2e718cf61e8fbecf3dfa5ee0c208f0b716d42b061f55459acb3c",
"blk.26.attn_v.weight": "684b636939cd4ffcfec5a6238a0790ffa43d853c95783af9b9e8275e74071a7a",
"blk.26.ffn_down.weight": "89d0bf066db154e6d312b5433aed1714f6a28b40f4c52e3e1530ee07703303c8",
"blk.26.ffn_gate.weight": "393d649bebe5e2940e1b043649f6c860b4b8b9f380f30e9da1744a830f358156",
"blk.26.ffn_up.weight": "179edc85ababd9d8440cc6093eecd1004290aa1cb96434b26ecf7585b6cca17b",
"blk.27.attn_k.weight": "334841445a7f1e14731b08f56eb0b1f0938c63823d28bc6d078c4c5f05b36f19",
"blk.27.attn_norm.weight": "57344471bbda2e9deffdfdb2dd05a07aa47f8761e24de53525588639145bf551",
"blk.27.attn_output.weight": "506126af9ee54b535d49f97e36f630e74834f480329f098d6d62e96246d8d65a",
"blk.27.attn_q.weight": "dd984df1acb4783849e25ba7ae378bfd385cd9efc540fb798cd5bdd873f0118f",
"blk.27.attn_v.weight": "b4b3fe9a4455d34c297ff20a2f537b647cef424741d840a747b265f23d320ac0",
"blk.27.ffn_down.weight": "621fdb185ba0d35ba5476dae73d2c81ec1482a0e878d5bfd5c3b29fe837af013",
"blk.27.ffn_gate.weight": "e4fbab45f2ec506fa374103251a0bdb7baa6f576080bdd796f3e9db92098e08f",
"blk.27.ffn_up.weight": "a0c57e463e988002bbd6a6c6792baa21a65e6f89ae303a2c301951b0ae6e4bbe",
"blk.28.attn_k.weight": "bac36cbd52ec5056841663865e1291ddab4b47ef9a2544dd285d4503bfb0e4a0",
"blk.28.attn_norm.weight": "5774a9df2bbb2e86d1f70179c7b92d81e1f401160148b3328fb64db6646a5425",
"blk.28.attn_output.weight": "e8712622d1569557000c75f26c3f55fad267fd300463c2c2cfe3afbfa1c8f908",
"blk.28.attn_q.weight": "11677751fddee52cc739699c02836f7be54d96038be4240be5d4f53d00161608",
"blk.28.attn_v.weight": "e5ee459b8958d65e1445997b9aa1e90e2f5d17761ebcf5357313119a45322507",
"blk.28.ffn_down.weight": "3934518f9f85292da8475fe38a8edcbfc4e24ac56c351b472d6351f98750871e",
"blk.28.ffn_gate.weight": "6ba735d57e98d0847e487f25ffaa25256deaa8abec76f428cb70bd9774279d83",
"blk.28.ffn_up.weight": "977fae6e1e5353114fc645dd98429464749758765cbc6e6457593d596e57850c",
"blk.29.attn_k.weight": "8122a457307d580ad6f1e0acea09a2f593d97f595ba0d6737f5fea16d2433642",
"blk.29.attn_norm.weight": "d626f721e05aa1202439b01027031d4caf1adace61ed37870a277cb6297c77cc",
"blk.29.attn_output.weight": "7fb7122ab1b6b1e6615ca746897da27bc52c92cb70d3147183cdde61795b72b3",
"blk.29.attn_q.weight": "be43e94ff6b6e391024dc824101efa0ddf4005d5b002ac26cb03765c0c73c2fa",
"blk.29.attn_v.weight": "af93c85ebff908f74f9935b81bde0516ca487c84139868a1ce079c3ae20036b1",
"blk.29.ffn_down.weight": "39dae12340ed3120bd19c495fe0872b559613641e41fde69d02d8631900b84c0",
"blk.29.ffn_gate.weight": "36fd482439840ef197c9f3b8905d86acfcea49bcf018544106ca465d4bf8d5c7",
"blk.29.ffn_up.weight": "5243fbdfdc1e2a1dd84b6210a9869d18a014db9088897e345240cdc99990bd5d",
"blk.30.attn_k.weight": "948f263616bd3788b2b968baafd69b9c5bd1b77578665f096c4b7e247b4cea42",
"blk.30.attn_norm.weight": "e168df981e744874ff303faf2eb470e5f6868c2040ba5f383f6c5148669975e7",
"blk.30.attn_output.weight": "4cf0ccca04b792573b756655a24fc89cfb1f272da8305633f0bc66ef14990b93",
"blk.30.attn_q.weight": "21e07d6cba6c50d65350289258209717174a13c42be57e8141d69712cbaf32c1",
"blk.30.attn_v.weight": "65a8ca29c7237b3182ccf03e2fc94e84f9a53d0e160fb679ab401c853170dd9c",
"blk.30.ffn_down.weight": "8b00500a6d00d84058f6658ee1d6f06fb4fcae2f90d4341792259362923b3c13",
"blk.30.ffn_gate.weight": "5bc0e19ab7a31b50ac2118ad1b36e31055271a322cd8ff661d47c3ac0210703c",
"blk.30.ffn_up.weight": "f37a0561955725bd59ee2d064fa9f4e00a12a1b620b624db3bc3add5330bc321",
"blk.31.attn_k.weight": "9a5663edda227f5d87533897146764f8e8a7481b9e71fae197c39204f8463221",
"blk.31.attn_norm.weight": "060a4f438a1ee5e220b5b5278ad2f5c085a428bf38c515766781815597c87529",
"blk.31.attn_output.weight": "6ada5d3cad9dea4780ffbb43302bb6ccc2f24eddd0fc4f5f84c9ce0fc0c6e5dd",
"blk.31.attn_q.weight": "bb5d08c08603907981ad388d5d8b70fcc9b98034ba264b8474c8890cc0297af0",
"blk.31.attn_v.weight": "e01b4252ea9c6a889c32b21144b441a347464d04536ef4f6572425be55759796",
"blk.31.ffn_down.weight": "8ba4d679c36e93ba65ba03180385ef35ea86b3b7cdf2fded9df59369f1c09630",
"blk.31.ffn_gate.weight": "e5b41dc93645f8b5e8eebae3ada3ea43a18f97ce2654228655170b07b463ccb0",
"blk.31.ffn_up.weight": "25b88cdddc8b547af294ed107d3d1312e90b983cae87936fa6062ecd8ea02539",
"blk.32.attn_k.weight": "4bcf86dc0858c8ca2fbdf6aa76674d43eb698f78979fdc1a38f556a7af1facc4",
"blk.32.attn_norm.weight": "cdcc12f3b8b9773c6722736bfb748a2729230b21478cbcc4104859d3148df815",
"blk.32.attn_output.weight": "d43f1196822995ed89a9365c97054753a8b30ce20b6e273c8edcc42673a1e141",
"blk.32.attn_q.weight": "ebf2972bb3865cbc5be4840113a322089752038344beab2a0122c7cb4fb399b6",
"blk.32.attn_v.weight": "714db81704ff34fa137512903c1013acee7877467473e46600728b9240582eb7",
"blk.32.ffn_down.weight": "2cde3da1258bb170a79d5d3cdfe10c86a71eb34b77da46b74c5ed71e7f4fe274",
"blk.32.ffn_gate.weight": "c7e1ed792532613ff9d4e5834b6536e2e0f47df2303bc0fdaa90aac0c1f4e8db",
"blk.32.ffn_up.weight": "d8d6f13fe66a716e28f79101a29817f0c0d6f99969a6f017d51bafd1a16c600c",
"blk.33.attn_k.weight": "a0a28f6cbca88da00cab2ca37094d9b0503bf9defdae77b91895b911c408cbb6",
"blk.33.attn_norm.weight": "0251200c24cc8445607ace6dc8c5aa0566567997262b7cca53a11ac23cc564b2",
"blk.33.attn_output.weight": "b2423205bdf6a1096d43c44d8d12f1a84fcd4e1bb70fcf6dc8542b8b8a71a13c",
"blk.33.attn_q.weight": "00b425c3ef71065ce5e0234e702bf38143b4952da78a85f52ab2c2e3073d97ab",
"blk.33.attn_v.weight": "035edd2335df816c42c765a5e66b9d9b9e15a822a8dc1863508145499c942c14",
"blk.33.ffn_down.weight": "4894a923a3db75bae4496ba3ce5f28796ad31fe33996a066271fb8654964310e",
"blk.33.ffn_gate.weight": "8f6c819b8bbfbe3357fae89e1ac5a3d58be85b3b04be3bacf7b62775869046ff",
"blk.33.ffn_up.weight": "257c3544b5b544fd5d839665bf5caf107a329b59dbc3751efcaa24ae63c56179",
"blk.34.attn_k.weight": "b6cd8bba892e38dac4a2ebc3ba1bce49e71b967fc436fde30c6d76f54a18935f",
"blk.34.attn_norm.weight": "2b3c8e60a064cba9955752bbbbdd92c71ba5c2f1bd721097bdbe88b5abc68787",
"blk.34.attn_output.weight": "8cc272551c9aaca9db5a660c6927bab94a0243d74a30b2bc165f06bd577714ea",
"blk.34.attn_q.weight": "74b561eb4792484e6a94b58fe2583848c3ae28ff2f1bf3d02939a0cfdfa49990",
"blk.34.attn_v.weight": "dba19e24ff05154dc5a1f55c023729303a583d13d68732ce22ea74d4410dc8f0",
"blk.34.ffn_down.weight": "76eca5dfeb274c35774e0bf9f22ee420ed9085c8e99aa2cd5a236e4918b44c61",
"blk.34.ffn_gate.weight": "9af0862d5fcbc24732846488e653db8242a467765c0cdbc00332b3a40256b4a6",
"blk.34.ffn_up.weight": "2a03126bf73587eaba99ece2066103d12e47bcd4ce30ff6c17b2f383b81d40df",
"blk.35.attn_k.weight": "52513fc0cd4e997a842729af7d21dd09399bce0a339558374738be266d0fa2f0",
"blk.35.attn_norm.weight": "e5281fa911964263ccf1630b14762edbd41d0b9472d6ec695fc600fed4892c35",
"blk.35.attn_output.weight": "b391d6705d5dc6f48326b5fd16573f679edf64109d86fb729a498819676590ca",
"blk.35.attn_q.weight": "d16446921966db9b0e0539626ad22a2511ace780e59379d6a4162d8c5441440b",
"blk.35.attn_v.weight": "9d8cdf23ffdb0c5c74106843390b94b24c9f33ef0eb9998d39f78c73390101ea",
"blk.35.ffn_down.weight": "938eb6301f7bbf162d7dd965682a5ed11d0a4a530c6fedd7e5469ce80012fc17",
"blk.35.ffn_gate.weight": "5ad84f5a0c8edcfea1ecf1a3e3d21d85ceda0c4ad9e3c6ca68885eeff8ed3c2f",
"blk.35.ffn_up.weight": "1c4330d9dc71bf4c98812c34356c51f520f47610a534152aa6d29284b758090d",
"blk.36.attn_k.weight": "ef720655e5ca2465f13db2dfc4732fb4ef2c9d53acde52f514fd4f301e974081",
"blk.36.attn_norm.weight": "88f4b9310b3c8c2644e3029160cd35678c79dfa59280430e03f5c29a6fe84a58",
"blk.36.attn_output.weight": "aec6f915fffd7bb72cd783273e871b4f09605950089d45e72059d1316b6c4b01",
"blk.36.attn_q.weight": "72f9408a2405d42f8db6ce5fcf1d26a3660b6f225fc60e77d0277109cfcb82ed",
"blk.36.attn_v.weight": "0f3b3d851dc44b3893ef53f6cca5b4acc9658bacfe1cc2d13c3d704ddd409b67",
"blk.36.ffn_down.weight": "470aec48ce8c5129a6654d9fd26fcae72776f9fc1429a8bb05818072a876475d",
"blk.36.ffn_gate.weight": "7f5f296d09cf55679767b5d15de3eff489c456782119f25204be4b1647f18dcf",
"blk.36.ffn_up.weight": "b7ef74a1f7ffb4982711d93f1787be3a70edc3d2358d5203c41d8900508037d4",
"blk.37.attn_k.weight": "c4ffa5412e4ff2dcfe1aed991c1f54169fd171a4c7638e4b9f21a1ca64c5e1d6",
"blk.37.attn_norm.weight": "4eb6c888d841cccfacf5b963f8611120f6ff24b84af0b5714fd9ab36dcda422f",
"blk.37.attn_output.weight": "db2a7bbf9682f9f6eea672dae8e150738f1bf74dbc80edc7022017a3f040c8ac",
"blk.37.attn_q.weight": "e38c0462aff139afcbab289189823527e453abc9e541154adde5e7af88cacf0b",
"blk.37.attn_v.weight": "952eb2492ed452a72f96bcc12d4b2affad9dfdf46ee39ce4a5d7b57a5dc301e5",
"blk.37.ffn_down.weight": "25f23a8fbc44febf6dc4848fd7fe03a580e2822bd3b3b5a51f4990826bfe3e4e",
"blk.37.ffn_gate.weight": "707da5eb40118b035305d3262444382351f170a20a537386a70e90c5a83a7817",
"blk.37.ffn_up.weight": "d2d2ba5cfc4ef47338dd7384219e22bf030a5a2209e0354d88f5bbaaafd20e87",
"blk.38.attn_k.weight": "abc4bb189dedf7ce661e79028427623a4f91ac091c2cd60e31b58bc62b1cda71",
"blk.38.attn_norm.weight": "9f4803a7d03fd40fcb83d85f84eb1d5682ea4e5bb084f210c02850675d804c3d",
"blk.38.attn_output.weight": "77cb66007f1a41df7135d0e7f900ceb499c2f667dfc3f1a6ac01a3203bbd3ccf",
"blk.38.attn_q.weight": "d94a8b26cd375bf2bcaa76597e314aa8268ee50a479d00931e5e0e021feadb5d",
"blk.38.attn_v.weight": "660c907888bc5016dc69b7d35fe6f55c7ded697c93be0e2d332a2f17aff88758",
"blk.38.ffn_down.weight": "6f06173bae5b00ffaf88ef383619a8b9c6a8d0d5c6494695d17f6c1de1a68a13",
"blk.38.ffn_gate.weight": "89f99be149d03f116527bfcabe073c50001c874de40fb6e817f6619027f3cd05",
"blk.38.ffn_up.weight": "8d57557c8d5e2d2688b73f01dddf1ce8d5194990cda6358153320aea88aac7f8",
"blk.39.attn_k.weight": "21be09c988b46c8393e6c2ec9230f3b5136eb7607dd1953ba92d0811c2f0dd75",
"blk.39.attn_norm.weight": "ba7c1912dd1c4e2d16917201f62396fd0600e4a451137eaddff255548c209abd",
"blk.39.attn_output.weight": "acfaf4abb3fd27fd899b5563c3877f176b597d8f6cdb2f2fd3f3a0bd4da15ed6",
"blk.39.attn_q.weight": "e8adbc140d4c8f0db2a27ca584c5531d5b1e080555fe627e34d80d0814a92bed",
"blk.39.attn_v.weight": "92f96b0e1f724e73a0f90a76c145654418844c04a6d4b14c05eb5af8a62bf8dc",
"blk.39.ffn_down.weight": "4d9ee7c65fc16fe95d10c47b79ac6a525741947600a64b5fcea5d300a82c50de",
"blk.39.ffn_gate.weight": "7e18507989f39b32191133d2657c2ee3b74f42f070579204d727eb72215793d1",
"blk.39.ffn_up.weight": "22cda752269c9757ba918abede1df95bb0f83a5c772dea13c8deea3d5f2723d9",
"output_norm.weight": "2858cf0e39d32caf52b7861378ace076000241e147f10b9eb21d8a5cd149e3cb"
}

View File

@@ -10,6 +10,7 @@ import (
"log/slog"
"os"
"slices"
"strings"
"golang.org/x/exp/maps"
)
@@ -60,7 +61,25 @@ func parseTokenizer(fsys fs.FS, specialTokenTypes []string) (*Tokenizer, error)
addedTokens[t.Content] = t
}
t.Merges = tt.Model.Merges
if len(tt.Model.Merges) == 0 {
// noop; merges is empty
} else if err := json.Unmarshal(tt.Model.Merges, &t.Merges); err == nil {
// noop; merges is []string
} else if merges, err := func() ([][]string, error) {
var merges [][]string
if err := json.Unmarshal(tt.Model.Merges, &merges); err != nil {
return nil, err
}
return merges, nil
}(); err == nil {
t.Merges = make([]string, len(merges))
for i := range merges {
t.Merges[i] = strings.Join(merges[i], " ")
}
} else {
return nil, fmt.Errorf("could not parse tokenizer merges. expected []string or [][]string: %w", err)
}
sha256sum := sha256.New()
for _, pt := range tt.PreTokenizer.PreTokenizers {
@@ -81,6 +100,8 @@ func parseTokenizer(fsys fs.FS, specialTokenTypes []string) (*Tokenizer, error)
t.Pre = "deepseek-llm"
case "21cde974d587f0d54dc8d56b183cc1e6239600172035c68fbd6d4b9f8da0576e":
t.Pre = "deepseek-coder"
case "1ff7f41064896984db5d1bb6ff64fa4bc29007d08c1b439e505b7392777a319e":
t.Pre = "qwen2"
case "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855":
// noop, empty pretokenizer
default:
@@ -156,9 +177,9 @@ func parseTokenizer(fsys fs.FS, specialTokenTypes []string) (*Tokenizer, error)
type tokenizer struct {
AddedTokens []token `json:"added_tokens"`
Model struct {
Type string `json:"type"`
Vocab map[string]int `json:"vocab"`
Merges []string `json:"merges"`
Type string `json:"type"`
Vocab map[string]int `json:"vocab"`
Merges json.RawMessage `json:"merges"`
} `json:"model"`
PreTokenizer struct {

View File

@@ -191,6 +191,62 @@ func TestParseTokenizer(t *testing.T) {
Pre: "default",
},
},
{
name: "list string merges",
fsys: createTokenizerFS(t, t.TempDir(), map[string]io.Reader{
"tokenizer.json": strings.NewReader(`{
"model": {
"merges": [
"a b",
"c d",
"e f"
]
}
}`),
}),
want: &Tokenizer{
Vocabulary: &Vocabulary{
Model: "gpt2",
},
Merges: []string{
"a b",
"c d",
"e f",
},
Pre: "default",
},
},
{
name: "list list string merges",
fsys: createTokenizerFS(t, t.TempDir(), map[string]io.Reader{
"tokenizer.json": strings.NewReader(`{
"model": {
"merges": [
[
"a", "b"
],
[
"c", "d"
],
[
"e", "f"
]
]
}
}`),
}),
want: &Tokenizer{
Vocabulary: &Vocabulary{
Model: "gpt2",
},
Merges: []string{
"a b",
"c d",
"e f",
},
Pre: "default",
},
},
}
for _, tt := range cases {

View File

@@ -77,6 +77,7 @@ func AMDGetGPUInfo() ([]RocmGPUInfo, error) {
gfxOverride := envconfig.HsaOverrideGfxVersion()
var supported []string
depPaths := LibraryDirs()
libDir := ""
// The amdgpu driver always exposes the host CPU(s) first, but we have to skip them and subtract
@@ -300,8 +301,11 @@ func AMDGetGPUInfo() ([]RocmGPUInfo, error) {
})
continue
}
if int(major) < RocmComputeMin {
minVer, err := strconv.Atoi(RocmComputeMajorMin)
if err != nil {
slog.Error("invalid RocmComputeMajorMin setting", "value", RocmComputeMajorMin, "error", err)
}
if int(major) < minVer {
reason := fmt.Sprintf("amdgpu too old gfx%d%x%x", major, minor, patch)
slog.Warn(reason, "gpu", gpuID)
unsupportedGPUs = append(unsupportedGPUs, UnsupportedGPUInfo{
@@ -349,8 +353,9 @@ func AMDGetGPUInfo() ([]RocmGPUInfo, error) {
})
return nil, err
}
depPaths = append(depPaths, libDir)
}
gpuInfo.DependencyPath = []string{libDir}
gpuInfo.DependencyPath = depPaths
if gfxOverride == "" {
// Only load supported list once

View File

@@ -50,12 +50,14 @@ func AMDGetGPUInfo() ([]RocmGPUInfo, error) {
slog.Info(err.Error())
return nil, err
}
depPaths := LibraryDirs()
libDir, err := AMDValidateLibDir()
if err != nil {
err = fmt.Errorf("unable to verify rocm library: %w", err)
slog.Warn(err.Error())
return nil, err
}
depPaths = append(depPaths, libDir)
var supported []string
gfxOverride := envconfig.HsaOverrideGfxVersion()
@@ -111,7 +113,7 @@ func AMDGetGPUInfo() ([]RocmGPUInfo, error) {
UnreliableFreeMemory: true,
ID: strconv.Itoa(i), // TODO this is probably wrong if we specify visible devices
DependencyPath: []string{libDir},
DependencyPath: depPaths,
MinimumMemory: rocmMinimumMemory,
Name: name,
Compute: gfx,
@@ -182,7 +184,7 @@ func (gpus RocmGPUInfoList) RefreshFreeMemory() error {
hl, err := NewHipLib()
if err != nil {
slog.Debug(err.Error())
return nil
return err
}
defer hl.Release()

View File

@@ -5,21 +5,8 @@ import (
"path/filepath"
"runtime"
"strings"
"golang.org/x/sys/cpu"
)
func GetCPUCapability() CPUCapability {
if cpu.X86.HasAVX2 {
return CPUCapabilityAVX2
}
if cpu.X86.HasAVX {
return CPUCapabilityAVX
}
// else LCD
return CPUCapabilityNone
}
func IsNUMA() bool {
if runtime.GOOS != "linux" {
// numa support in llama.cpp is linux only

View File

@@ -16,12 +16,14 @@ import (
"os"
"path/filepath"
"runtime"
"strconv"
"strings"
"sync"
"unsafe"
"github.com/ollama/ollama/envconfig"
"github.com/ollama/ollama/format"
"github.com/ollama/ollama/runners"
)
type cudaHandles struct {
@@ -45,7 +47,6 @@ const (
var (
gpuMutex sync.Mutex
bootstrapped bool
cpuCapability CPUCapability
cpus []CPUInfo
cudaGPUs []CudaGPUInfo
nvcudaLibPath string
@@ -64,9 +65,13 @@ var (
)
// With our current CUDA compile flags, older than 5.0 will not work properly
var CudaComputeMin = [2]C.int{5, 0}
// (string values used to allow ldflags overrides at build time)
var (
CudaComputeMajorMin = "5"
CudaComputeMinorMin = "0"
)
var RocmComputeMin = 9
var RocmComputeMajorMin = "9"
// TODO find a better way to detect iGPU instead of minimum memory
const IGPUMemLimit = 1 * format.GibiByte // 512G is what they typically report, so anything less than 1G must be iGPU
@@ -101,9 +106,9 @@ func initCudaHandles() *cudaHandles {
localAppData := os.Getenv("LOCALAPPDATA")
cudartMgmtPatterns = []string{filepath.Join(localAppData, "Programs", "Ollama", CudartMgmtName)}
}
libDir := LibraryDir()
if libDir != "" {
cudartMgmtPatterns = []string{filepath.Join(libDir, CudartMgmtName)}
libDirs := LibraryDirs()
for _, d := range libDirs {
cudartMgmtPatterns = append(cudartMgmtPatterns, filepath.Join(d, CudartMgmtName))
}
cudartMgmtPatterns = append(cudartMgmtPatterns, CudartGlobs...)
@@ -219,16 +224,23 @@ func GetGPUInfo() GpuInfoList {
if !bootstrapped {
slog.Info("looking for compatible GPUs")
cudaComputeMajorMin, err := strconv.Atoi(CudaComputeMajorMin)
if err != nil {
slog.Error("invalid CudaComputeMajorMin setting", "value", CudaComputeMajorMin, "error", err)
}
cudaComputeMinorMin, err := strconv.Atoi(CudaComputeMinorMin)
if err != nil {
slog.Error("invalid CudaComputeMinorMin setting", "value", CudaComputeMinorMin, "error", err)
}
bootstrapErrors = []error{}
needRefresh = false
cpuCapability = GetCPUCapability()
var memInfo C.mem_info_t
mem, err := GetCPUMem()
if err != nil {
slog.Warn("error looking up system memory", "error", err)
}
depPath := LibraryDir()
depPaths := LibraryDirs()
details, err := GetCPUDetails()
if err != nil {
slog.Warn("failed to lookup CPU details", "error", err)
@@ -238,24 +250,14 @@ func GetGPUInfo() GpuInfoList {
GpuInfo: GpuInfo{
memInfo: mem,
Library: "cpu",
Variant: cpuCapability.String(),
Variant: runners.GetCPUCapability().String(),
ID: "0",
DependencyPath: []string{depPath},
DependencyPath: depPaths,
},
CPUs: details,
},
}
// Fallback to CPU mode if we're lacking required vector extensions on x86
if cpuCapability < GPURunnerCPUCapability && runtime.GOARCH == "amd64" {
err := fmt.Errorf("CPU does not have minimum vector extensions, GPU inference disabled. Required:%s Detected:%s", GPURunnerCPUCapability, cpuCapability)
slog.Warn(err.Error())
bootstrapErrors = append(bootstrapErrors, err)
bootstrapped = true
// No need to do any GPU discovery, since we can't run on them
return GpuInfoList{cpus[0].GpuInfo}
}
// Load ALL libraries
cHandles = initCudaHandles()
@@ -292,19 +294,23 @@ func GetGPUInfo() GpuInfoList {
gpuInfo.DriverMajor = driverMajor
gpuInfo.DriverMinor = driverMinor
variant := cudaVariant(gpuInfo)
if depPath != "" {
gpuInfo.DependencyPath = []string{depPath}
if depPaths != nil {
gpuInfo.DependencyPath = depPaths
// Check for variant specific directory
if variant != "" {
if _, err := os.Stat(filepath.Join(depPath, "cuda_"+variant)); err == nil {
gpuInfo.DependencyPath = []string{filepath.Join(depPath, "cuda_"+variant), depPath}
for _, d := range depPaths {
if _, err := os.Stat(filepath.Join(d, "cuda_"+variant)); err == nil {
// Put the variant directory first in the search path to avoid runtime linking to the wrong library
gpuInfo.DependencyPath = append([]string{filepath.Join(d, "cuda_"+variant)}, gpuInfo.DependencyPath...)
break
}
}
}
}
gpuInfo.Name = C.GoString(&memInfo.gpu_name[0])
gpuInfo.Variant = variant
if memInfo.major < CudaComputeMin[0] || (memInfo.major == CudaComputeMin[0] && memInfo.minor < CudaComputeMin[1]) {
if int(memInfo.major) < cudaComputeMajorMin || (int(memInfo.major) == cudaComputeMajorMin && int(memInfo.minor) < cudaComputeMinorMin) {
unsupportedGPUs = append(unsupportedGPUs,
UnsupportedGPUInfo{
GpuInfo: gpuInfo.GpuInfo,
@@ -370,7 +376,7 @@ func GetGPUInfo() GpuInfoList {
gpuInfo.FreeMemory = uint64(memInfo.free)
gpuInfo.ID = C.GoString(&memInfo.gpu_id[0])
gpuInfo.Name = C.GoString(&memInfo.gpu_name[0])
gpuInfo.DependencyPath = []string{depPath}
gpuInfo.DependencyPath = depPaths
oneapiGPUs = append(oneapiGPUs, gpuInfo)
}
}
@@ -385,6 +391,8 @@ func GetGPUInfo() GpuInfoList {
if len(cudaGPUs) == 0 && len(rocmGPUs) == 0 && len(oneapiGPUs) == 0 {
slog.Info("no compatible GPUs were discovered")
}
// TODO verify we have runners for the discovered GPUs, filter out any that aren't supported with good error messages
}
// For detected GPUs, load library if not loaded
@@ -509,7 +517,10 @@ func FindGPULibs(baseLibName string, defaultPatterns []string) []string {
slog.Debug("Searching for GPU library", "name", baseLibName)
// Start with our bundled libraries
patterns := []string{filepath.Join(LibraryDir(), baseLibName)}
patterns := []string{}
for _, d := range LibraryDirs() {
patterns = append(patterns, filepath.Join(d, baseLibName))
}
switch runtime.GOOS {
case "windows":
@@ -531,7 +542,6 @@ func FindGPULibs(baseLibName string, defaultPatterns []string) []string {
patterns = append(patterns, defaultPatterns...)
slog.Debug("gpu library search", "globs", patterns)
for _, pattern := range patterns {
// Nvidia PhysX known to return bogus results
if strings.Contains(pattern, "PhysX") {
slog.Debug("skipping PhysX cuda library path", "path", pattern)
@@ -705,32 +715,26 @@ func (l GpuInfoList) GetVisibleDevicesEnv() (string, string) {
}
}
func LibraryDir() string {
// On Windows/linux we bundle the dependencies at the same level as the executable
func LibraryDirs() []string {
// dependencies can exist wherever we found the runners (e.g. build tree for developers) and relative to the executable
// This can be simplified once we no longer carry runners as payloads
paths := []string{}
appExe, err := os.Executable()
if err != nil {
slog.Warn("failed to lookup executable path", "error", err)
} else {
appRelative := filepath.Join(filepath.Dir(appExe), envconfig.LibRelativeToExe(), "lib", "ollama")
if _, err := os.Stat(appRelative); err == nil {
paths = append(paths, appRelative)
}
}
cwd, err := os.Getwd()
rDir := runners.Locate()
if err != nil {
slog.Warn("failed to lookup working directory", "error", err)
slog.Warn("unable to locate gpu dependency libraries", "error", err)
} else {
paths = append(paths, filepath.Dir(rDir))
}
// Scan for any of our dependeices, and pick first match
for _, root := range []string{filepath.Dir(appExe), filepath.Join(filepath.Dir(appExe), envconfig.LibRelativeToExe()), cwd} {
libDep := filepath.Join("lib", "ollama")
if _, err := os.Stat(filepath.Join(root, libDep)); err == nil {
return filepath.Join(root, libDep)
}
// Developer mode, local build
if _, err := os.Stat(filepath.Join(root, runtime.GOOS+"-"+runtime.GOARCH, libDep)); err == nil {
return filepath.Join(root, runtime.GOOS+"-"+runtime.GOARCH, libDep)
}
if _, err := os.Stat(filepath.Join(root, "dist", runtime.GOOS+"-"+runtime.GOARCH, libDep)); err == nil {
return filepath.Join(root, "dist", runtime.GOOS+"-"+runtime.GOARCH, libDep)
}
}
slog.Warn("unable to locate gpu dependency libraries")
return ""
return paths
}
func GetSystemInfo() SystemInfo {

View File

@@ -15,6 +15,7 @@ import (
"syscall"
"github.com/ollama/ollama/format"
"github.com/ollama/ollama/runners"
)
const (
@@ -27,7 +28,7 @@ func GetGPUInfo() GpuInfoList {
return []GpuInfo{
{
Library: "cpu",
Variant: GetCPUCapability().String(),
Variant: runners.GetCPUCapability().String(),
memInfo: mem,
},
}
@@ -50,7 +51,7 @@ func GetCPUInfo() GpuInfoList {
return []GpuInfo{
{
Library: "cpu",
Variant: GetCPUCapability().String(),
Variant: runners.GetCPUCapability().String(),
memInfo: mem,
},
}

View File

@@ -209,7 +209,7 @@ func processSystemLogicalProcessorInforationList(buf []byte) []*winPackage {
}
}
// Sumarize the results
// Summarize the results
for i, pkg := range packages {
slog.Info("", "package", i, "cores", pkg.coreCount, "efficiency", pkg.efficiencyCoreCount, "threads", pkg.threadCount)
}

View File

@@ -5,6 +5,7 @@ import (
"log/slog"
"github.com/ollama/ollama/format"
"github.com/ollama/ollama/runners"
)
type memInfo struct {
@@ -47,6 +48,13 @@ type GpuInfo struct { // TODO better name maybe "InferenceProcessor"?
// TODO other performance capability info to help in scheduling decisions
}
func (gpu GpuInfo) RunnerName() string {
if gpu.Variant != "" {
return gpu.Library + "_" + gpu.Variant
}
return gpu.Library
}
type CPUInfo struct {
GpuInfo
CPUs []CPU
@@ -99,7 +107,7 @@ func (l GpuInfoList) ByLibrary() []GpuInfoList {
for _, info := range l {
found := false
requested := info.Library
if info.Variant != CPUCapabilityNone.String() {
if info.Variant != runners.CPUCapabilityNone.String() {
requested += "_" + info.Variant
}
for i, lib := range libs {
@@ -140,29 +148,6 @@ func (a ByFreeMemory) Len() int { return len(a) }
func (a ByFreeMemory) Swap(i, j int) { a[i], a[j] = a[j], a[i] }
func (a ByFreeMemory) Less(i, j int) bool { return a[i].FreeMemory < a[j].FreeMemory }
type CPUCapability uint32
// Override at build time when building base GPU runners
var GPURunnerCPUCapability = CPUCapabilityAVX
const (
CPUCapabilityNone CPUCapability = iota
CPUCapabilityAVX
CPUCapabilityAVX2
// TODO AVX512
)
func (c CPUCapability) String() string {
switch c {
case CPUCapabilityAVX:
return "avx"
case CPUCapabilityAVX2:
return "avx2"
default:
return "no vector extensions"
}
}
type SystemInfo struct {
System CPUInfo `json:"system"`
GPUs []GpuInfo `json:"gpus"`
@@ -183,3 +168,17 @@ func (si SystemInfo) GetOptimalThreadCount() int {
return coreCount
}
// For each GPU, check if it does NOT support flash attention
func (l GpuInfoList) FlashAttentionSupported() bool {
for _, gpu := range l {
supportsFA := gpu.Library == "metal" ||
(gpu.Library == "cuda" && gpu.DriverMajor >= 7) ||
gpu.Library == "rocm"
if !supportsFA {
return false
}
}
return true
}

View File

@@ -2,7 +2,7 @@
### Getting Started
* [Quickstart](../README.md#quickstart)
* [Examples](../examples)
* [Examples](./examples.md)
* [Importing models](./import.md)
* [Linux Documentation](./linux.md)
* [Windows Documentation](./windows.md)

View File

@@ -13,6 +13,7 @@
- [Push a Model](#push-a-model)
- [Generate Embeddings](#generate-embeddings)
- [List Running Models](#list-running-models)
- [Version](#version)
## Conventions
@@ -45,14 +46,18 @@ Generate a response for a given prompt with a provided model. This is a streamin
Advanced parameters (optional):
- `format`: the format to return a response in. Currently the only accepted value is `json`
- `format`: the format to return a response in. Format can be `json` or a JSON schema
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `system`: system message to (overrides what is defined in the `Modelfile`)
- `template`: the prompt template to use (overrides what is defined in the `Modelfile`)
- `context`: the context parameter returned from a previous request to `/generate`, this can be used to keep a short conversational memory
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
- `raw`: if `true` no formatting will be applied to the prompt. You may choose to use the `raw` parameter if you are specifying a full templated prompt in your request to the API
- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)
- `context` (deprecated): the context parameter returned from a previous request to `/generate`, this can be used to keep a short conversational memory
#### Structured outputs
Structured outputs are supported by providing a JSON schema in the `format` parameter. The model will generate a response that matches the schema. See the [structured outputs](#request-structured-outputs) example below.
#### JSON mode
@@ -185,6 +190,52 @@ curl http://localhost:11434/api/generate -d '{
}
```
#### Request (Structured outputs)
##### Request
```shell
curl -X POST http://localhost:11434/api/generate -H "Content-Type: application/json" -d '{
"model": "llama3.1:8b",
"prompt": "Ollama is 22 years old and is busy saving the world. Respond using JSON",
"stream": false,
"format": {
"type": "object",
"properties": {
"age": {
"type": "integer"
},
"available": {
"type": "boolean"
}
},
"required": [
"age",
"available"
]
}
}'
```
##### Response
```json
{
"model": "llama3.1:8b",
"created_at": "2024-12-06T00:48:09.983619Z",
"response": "{\n \"age\": 22,\n \"available\": true\n}",
"done": true,
"done_reason": "stop",
"context": [1, 2, 3],
"total_duration": 1075509083,
"load_duration": 567678166,
"prompt_eval_count": 28,
"prompt_eval_duration": 236000000,
"eval_count": 16,
"eval_duration": 269000000
}
```
#### Request (JSON mode)
> [!IMPORTANT]
@@ -337,7 +388,6 @@ curl http://localhost:11434/api/generate -d '{
"top_k": 20,
"top_p": 0.9,
"min_p": 0.0,
"tfs_z": 0.5,
"typical_p": 0.7,
"repeat_last_n": 33,
"temperature": 0.8,
@@ -456,11 +506,15 @@ The `message` object has the following fields:
Advanced parameters (optional):
- `format`: the format to return a response in. Currently the only accepted value is `json`
- `format`: the format to return a response in. Format can be `json` or a JSON schema.
- `options`: additional model parameters listed in the documentation for the [Modelfile](./modelfile.md#valid-parameters-and-values) such as `temperature`
- `stream`: if `false` the response will be returned as a single response object, rather than a stream of objects
- `keep_alive`: controls how long the model will stay loaded into memory following the request (default: `5m`)
### Structured outputs
Structured outputs are supported by providing a JSON schema in the `format` parameter. The model will generate a response that matches the schema. See the [Chat request (Structured outputs)](#chat-request-structured-outputs) example below.
### Examples
#### Chat Request (Streaming)
@@ -551,6 +605,54 @@ curl http://localhost:11434/api/chat -d '{
}
```
#### Chat request (Structured outputs)
##### Request
```shell
curl -X POST http://localhost:11434/api/chat -H "Content-Type: application/json" -d '{
"model": "llama3.1",
"messages": [{"role": "user", "content": "Ollama is 22 years old and busy saving the world. Return a JSON object with the age and availability."}],
"stream": false,
"format": {
"type": "object",
"properties": {
"age": {
"type": "integer"
},
"available": {
"type": "boolean"
}
},
"required": [
"age",
"available"
]
},
"options": {
"temperature": 0
}
}'
```
##### Response
```json
{
"model": "llama3.1",
"created_at": "2024-12-06T00:46:58.265747Z",
"message": { "role": "assistant", "content": "{\"age\": 22, \"available\": false}" },
"done_reason": "stop",
"done": true,
"total_duration": 2254970291,
"load_duration": 574751416,
"prompt_eval_count": 34,
"prompt_eval_duration": 1502000000,
"eval_count": 12,
"eval_duration": 175000000
}
```
#### Chat request (With History)
Send a chat message with a conversation history. You can use this same approach to start the conversation using multi-shot or chain-of-thought prompting.
@@ -826,14 +928,25 @@ A single JSON object is returned:
POST /api/create
```
Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `modelfile` to the content of the Modelfile rather than just set `path`. This is a requirement for remote create. Remote model creation must also create any file blobs, fields such as `FROM` and `ADAPTER`, explicitly with the server using [Create a Blob](#create-a-blob) and the value to the path indicated in the response.
Create a model from:
* another model;
* a safetensors directory; or
* a GGUF file.
If you are creating a model from a safetensors directory or from a GGUF file, you must [create a blob](#create-a-blob) for each of the files and then use the file name and SHA256 digest associated with each blob in the `files` field.
### Parameters
- `model`: name of the model to create
- `modelfile` (optional): contents of the Modelfile
- `from`: (optional) name of an existing model to create the new model from
- `files`: (optional) a dictionary of file names to SHA256 digests of blobs to create the model from
- `adapters`: (optional) a dictionary of file names to SHA256 digests of blobs for LORA adapters
- `template`: (optional) the prompt template for the model
- `license`: (optional) a string or list of strings containing the license or licenses for the model
- `system`: (optional) a string containing the system prompt for the model
- `parameters`: (optional) a dictionary of parameters for the model (see [Modelfile](./modelfile.md#valid-parameters-and-values) for a list of parameters)
- `messages`: (optional) a list of message objects used to create a conversation
- `stream`: (optional) if `false` the response will be returned as a single response object, rather than a stream of objects
- `path` (optional): path to the Modelfile
- `quantize` (optional): quantize a non-quantized (e.g. float16) model
#### Quantization types
@@ -859,14 +972,15 @@ Create a model from a [`Modelfile`](./modelfile.md). It is recommended to set `m
#### Create a new model
Create a new model from a `Modelfile`.
Create a new model from an existing model.
##### Request
```shell
curl http://localhost:11434/api/create -d '{
"model": "mario",
"modelfile": "FROM llama3\nSYSTEM You are mario from Super Mario Bros."
"from": "llama3.2",
"system": "You are Mario from Super Mario Bros."
}'
```
@@ -897,7 +1011,7 @@ Quantize a non-quantized model.
```shell
curl http://localhost:11434/api/create -d '{
"model": "llama3.1:quantized",
"modelfile": "FROM llama3.1:8b-instruct-fp16",
"from": "llama3.1:8b-instruct-fp16",
"quantize": "q4_K_M"
}'
```
@@ -917,52 +1031,112 @@ A stream of JSON objects is returned:
{"status":"success"}
```
#### Create a model from GGUF
### Check if a Blob Exists
Create a model from a GGUF file. The `files` parameter should be filled out with the file name and SHA256 digest of the GGUF file you wish to use. Use [/api/blobs/:digest](#push-a-blob) to push the GGUF file to the server before calling this API.
##### Request
```shell
curl http://localhost:11434/api/create -d '{
"model": "my-gguf-model",
"files": {
"test.gguf": "sha256:432f310a77f4650a88d0fd59ecdd7cebed8d684bafea53cbff0473542964f0c3"
}
}'
```
##### Response
A stream of JSON objects is returned:
```
{"status":"parsing GGUF"}
{"status":"using existing layer sha256:432f310a77f4650a88d0fd59ecdd7cebed8d684bafea53cbff0473542964f0c3"}
{"status":"writing manifest"}
{"status":"success"}
```
#### Create a model from a Safetensors directory
The `files` parameter should include a dictionary of files for the safetensors model which includes the file names and SHA256 digest of each file. Use [/api/blobs/:digest](#push-a-blob) to first push each of the files to the server before calling this API. Files will remain in the cache until the Ollama server is restarted.
##### Request
```shell
curl http://localhost:11434/api/create -d '{
"model": "fred",
"files": {
"config.json": "sha256:dd3443e529fb2290423a0c65c2d633e67b419d273f170259e27297219828e389",
"generation_config.json": "sha256:88effbb63300dbbc7390143fbbdd9d9fa50587b37e8bfd16c8c90d4970a74a36",
"special_tokens_map.json": "sha256:b7455f0e8f00539108837bfa586c4fbf424e31f8717819a6798be74bef813d05",
"tokenizer.json": "sha256:bbc1904d35169c542dffbe1f7589a5994ec7426d9e5b609d07bab876f32e97ab",
"tokenizer_config.json": "sha256:24e8a6dc2547164b7002e3125f10b415105644fcf02bf9ad8b674c87b1eaaed6",
"model.safetensors": "sha256:1ff795ff6a07e6a68085d206fb84417da2f083f68391c2843cd2b8ac6df8538f"
}
}'
```
##### Response
A stream of JSON objects is returned:
```shell
{"status":"converting model"}
{"status":"creating new layer sha256:05ca5b813af4a53d2c2922933936e398958855c44ee534858fcfd830940618b6"}
{"status":"using autodetected template llama3-instruct"}
{"status":"using existing layer sha256:56bb8bd477a519ffa694fc449c2413c6f0e1d3b1c88fa7e3c9d88d3ae49d4dcb"}
{"status":"writing manifest"}
{"status":"success"}
```
## Check if a Blob Exists
```shell
HEAD /api/blobs/:digest
```
Ensures that the file blob used for a FROM or ADAPTER field exists on the server. This is checking your Ollama server and not ollama.com.
Ensures that the file blob (Binary Large Object) used with create a model exists on the server. This checks your Ollama server and not ollama.com.
#### Query Parameters
### Query Parameters
- `digest`: the SHA256 digest of the blob
#### Examples
### Examples
##### Request
#### Request
```shell
curl -I http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
```
##### Response
#### Response
Return 200 OK if the blob exists, 404 Not Found if it does not.
### Create a Blob
## Push a Blob
```shell
POST /api/blobs/:digest
```
Create a blob from a file on the server. Returns the server file path.
Push a file to the Ollama server to create a "blob" (Binary Large Object).
#### Query Parameters
### Query Parameters
- `digest`: the expected SHA256 digest of the file
#### Examples
### Examples
##### Request
#### Request
```shell
curl -T model.bin -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
curl -T model.gguf -X POST http://localhost:11434/api/blobs/sha256:29fdb92e57cf0827ded04ae6461b5931d01fa595843f55d36f5b275a52087dd2
```
##### Response
#### Response
Return 201 Created if the blob was successfully created, 400 Bad Request if the digest used is not expected.
@@ -1425,3 +1599,29 @@ curl http://localhost:11434/api/embeddings -d '{
]
}
```
## Version
```shell
GET /api/version
```
Retrieve the Ollama version
### Examples
#### Request
```shell
curl http://localhost:11434/api/version
```
#### Response
```json
{
"version": "0.5.1"
}
```

View File

@@ -3,35 +3,24 @@
Install required tools:
- go version 1.22 or higher
- gcc version 11.4.0 or higher
- OS specific C/C++ compiler (see below)
- GNU Make
## Overview
Ollama uses a mix of Go and C/C++ code to interface with GPUs. The C/C++ code is compiled with both CGO and GPU library specific compilers. A set of GNU Makefiles are used to compile the project. GPU Libraries are auto-detected based on the typical environment variables used by the respective libraries, but can be overridden if necessary. The default make target will build the runners and primary Go Ollama application that will run within the repo directory. Throughout the examples below `-j 5` is suggested for 5 parallel jobs to speed up the build. You can adjust the job count based on your CPU Core count to reduce build times. If you want to relocate the built binaries, use the `dist` target and recursively copy the files in `./dist/$OS-$ARCH/` to your desired location. To learn more about the other make targets use `make help`
Once you have built the GPU/CPU runners, you can compile the main application with `go build .`
### MacOS
[Download Go](https://go.dev/dl/)
Optionally enable debugging and more verbose logging:
```bash
# At build time
export CGO_CFLAGS="-g"
# At runtime
export OLLAMA_DEBUG=1
```
Get the required libraries and build the native LLM code: (Adjust the job count based on your number of processors for a faster build)
```bash
make -j 5
```
Then build ollama:
```bash
go build .
```
Now you can run `ollama`:
```bash
@@ -51,64 +40,42 @@ _Your operating system distribution may already have packages for NVIDIA CUDA. D
Install `make`, `gcc` and `golang` as well as [NVIDIA CUDA](https://developer.nvidia.com/cuda-downloads)
development and runtime packages.
Typically the build scripts will auto-detect CUDA, however, if your Linux distro
or installation approach uses unusual paths, you can specify the location by
specifying an environment variable `CUDA_LIB_DIR` to the location of the shared
libraries, and `CUDACXX` to the location of the nvcc compiler. You can customize
a set of target CUDA architectures by setting `CMAKE_CUDA_ARCHITECTURES` (e.g. "50;60;70")
Then generate dependencies: (Adjust the job count based on your number of processors for a faster build)
Typically the makefile will auto-detect CUDA, however, if your Linux distro
or installation approach uses alternative paths, you can specify the location by
overriding `CUDA_PATH` to the location of the CUDA toolkit. You can customize
a set of target CUDA architectures by setting `CUDA_ARCHITECTURES` (e.g. `CUDA_ARCHITECTURES=50;60;70`)
```
make -j 5
```
Then build the binary:
If both v11 and v12 tookkits are detected, runners for both major versions will be built by default. You can build just v12 with `make cuda_v12`
```
go build .
```
#### Older Linux CUDA (NVIDIA)
To support older GPUs with Compute Capability 3.5 or 3.7, you will need to use an older version of the Driver from [Unix Driver Archive](https://www.nvidia.com/en-us/drivers/unix/) (tested with 470) and [CUDA Toolkit Archive](https://developer.nvidia.com/cuda-toolkit-archive) (tested with cuda V11). When you build Ollama, you will need to set two make variable to adjust the minimum compute capability Ollama supports via `make -j 5 CUDA_ARCHITECTURES="35;37;50;52" EXTRA_GOLDFLAGS="\"-X=github.com/ollama/ollama/discover.CudaComputeMajorMin=3\" \"-X=github.com/ollama/ollama/discover.CudaComputeMinorMin=5\""`. To find the Compute Capability of your older GPU, refer to [GPU Compute Capability](https://developer.nvidia.com/cuda-gpus).
#### Linux ROCm (AMD)
_Your operating system distribution may already have packages for AMD ROCm and CLBlast. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!_
_Your operating system distribution may already have packages for AMD ROCm. Distro packages are often preferable, but instructions are distro-specific. Please consult distro-specific docs for dependencies if available!_
Install [CLBlast](https://github.com/CNugteren/CLBlast/blob/master/doc/installation.md) and [ROCm](https://rocm.docs.amd.com/en/latest/) development packages first, as well as `make`, `gcc`, and `golang`.
Install [ROCm](https://rocm.docs.amd.com/en/latest/) development packages first, as well as `make`, `gcc`, and `golang`.
Typically the build scripts will auto-detect ROCm, however, if your Linux distro
or installation approach uses unusual paths, you can specify the location by
specifying an environment variable `ROCM_PATH` to the location of the ROCm
install (typically `/opt/rocm`), and `CLBlast_DIR` to the location of the
CLBlast install (typically `/usr/lib/cmake/CLBlast`). You can also customize
the AMD GPU targets by setting AMDGPU_TARGETS (e.g. `AMDGPU_TARGETS="gfx1101;gfx1102"`)
Then generate dependencies: (Adjust the job count based on your number of processors for a faster build)
specifying an environment variable `HIP_PATH` to the location of the ROCm
install (typically `/opt/rocm`). You can also customize
the AMD GPU targets by setting HIP_ARCHS (e.g. `HIP_ARCHS=gfx1101;gfx1102`)
```
make -j 5
```
Then build the binary:
```
go build .
```
ROCm requires elevated privileges to access the GPU at runtime. On most distros you can add your user account to the `render` group, or run as root.
#### Advanced CPU Settings
By default, running `make` will compile a few different variations
of the LLM library based on common CPU families and vector math capabilities,
including a lowest-common-denominator which should run on almost any 64 bit CPU
somewhat slowly. At runtime, Ollama will auto-detect the optimal variation to
load.
Custom CPU settings are not currently supported in the new Go server build but will be added back after we complete the transition.
#### Containerized Linux Build
If you have Docker available, you can build linux binaries with `./scripts/build_linux.sh` which has the CUDA and ROCm dependencies included. The resulting binary is placed in `./dist`
If you have Docker and buildx available, you can build linux binaries with `./scripts/build_linux.sh` which has the CUDA and ROCm dependencies included. The resulting artifacts are placed in `./dist` and by default the script builds both arm64 and amd64 binaries. If you want to build only amd64, you can build with `PLATFORM=linux/amd64 ./scripts/build_linux.sh`
### Windows
@@ -126,12 +93,8 @@ The following tools are required as a minimal development environment to build C
> [!NOTE]
> Due to bugs in the GCC C++ library for unicode support, Ollama should be built with clang on windows.
Then, build the `ollama` binary:
```powershell
$env:CGO_ENABLED="1"
make -j 8
go build .
```
make -j 5
```
#### GPU Support
@@ -173,3 +136,30 @@ pacman -S mingw-w64-clang-aarch64-clang mingw-w64-clang-aarch64-gcc-compat mingw
```
You will need to ensure your PATH includes go, cmake, gcc and clang mingw32-make to build ollama from source. (typically `C:\msys64\clangarm64\bin\`)
## Advanced CPU Vector Settings
On x86, running `make` will compile several CPU runners which can run on different CPU families. At runtime, Ollama will auto-detect the best variation to load. If GPU libraries are present at build time, Ollama also compiles GPU runners with the `AVX` CPU vector feature enabled. This provides a good performance balance when loading large models that split across GPU and CPU with broad compatibility. Some users may prefer no vector extensions (e.g. older Xeon/Celeron processors, or hypervisors that mask the vector features) while other users may prefer turning on many more vector extensions to further improve performance for split model loads.
To customize the set of CPU vector features enabled for a CPU runner and all GPU runners, use CUSTOM_CPU_FLAGS during the build.
To build without any vector flags:
```
make CUSTOM_CPU_FLAGS=""
```
To build with both AVX and AVX2:
```
make CUSTOM_CPU_FLAGS=avx,avx2
```
To build with AVX512 features turned on:
```
make CUSTOM_CPU_FLAGS=avx,avx2,avx512,avx512vbmi,avx512vnni,avx512bf16
```
> [!NOTE]
> If you are experimenting with different flags, make sure to do a `make clean` between each change to ensure everything is rebuilt with the new compiler flags

20
docs/examples.md Normal file
View File

@@ -0,0 +1,20 @@
# Examples
This directory contains different examples of using Ollama.
## Python examples
Ollama Python examples at [ollama-python/examples](https://github.com/ollama/ollama-python/tree/main/examples)
## JavaScript examples
Ollama JavaScript examples at [ollama-js/examples](https://github.com/ollama/ollama-js/tree/main/examples)
## OpenAI compatibility examples
Ollama OpenAI compatibility examples at [ollama/examples/openai](../docs/openai.md)
## Community examples
- [LangChain Ollama Python](https://python.langchain.com/docs/integrations/chat/ollama/)
- [LangChain Ollama JS](https://js.langchain.com/docs/integrations/chat/ollama/)

View File

@@ -151,7 +151,7 @@ Refer to the section [above](#how-do-i-configure-ollama-server) for how to set e
Ollama runs an HTTP server and can be exposed using a proxy server such as Nginx. To do so, configure the proxy to forward requests and optionally set required headers (if not exposing Ollama on the network). For example, with Nginx:
```
```nginx
server {
listen 80;
server_name example.com; # Replace with your domain or IP
@@ -285,4 +285,28 @@ Note: Windows with Radeon GPUs currently default to 1 model maximum due to limit
## How does Ollama load models on multiple GPUs?
Installing multiple GPUs of the same brand can be a great way to increase your available VRAM to load larger models. When you load a new model, Ollama evaluates the required VRAM for the model against what is currently available. If the model will entirely fit on any single GPU, Ollama will load the model on that GPU. This typically provides the best performance as it reduces the amount of data transfering across the PCI bus during inference. If the model does not fit entirely on one GPU, then it will be spread across all the available GPUs.
When loading a new model, Ollama evaluates the required VRAM for the model against what is currently available. If the model will entirely fit on any single GPU, Ollama will load the model on that GPU. This typically provides the best performance as it reduces the amount of data transferring across the PCI bus during inference. If the model does not fit entirely on one GPU, then it will be spread across all the available GPUs.
## How can I enable Flash Attention?
Flash Attention is a feature of most modern models that can significantly reduce memory usage as the context size grows. To enable Flash Attention, set the `OLLAMA_FLASH_ATTENTION` environment variable to `1` when starting the Ollama server.
## How can I set the quantization type for the K/V cache?
The K/V context cache can be quantized to significantly reduce memory usage when Flash Attention is enabled.
To use quantized K/V cache with Ollama you can set the following environment variable:
- `OLLAMA_KV_CACHE_TYPE` - The quantization type for the K/V cache. Default is `f16`.
> Note: Currently this is a global option - meaning all models will run with the specified quantization type.
The currently available K/V cache quantization types are:
- `f16` - high precision and memory usage (default).
- `q8_0` - 8-bit quantization, uses approximately 1/2 the memory of `f16` with a very small loss in precision, this usually has no noticeable impact on the model's quality (recommended if not using f16).
- `q4_0` - 4-bit quantization, uses approximately 1/4 the memory of `f16` with a small-medium loss in precision that may be more noticeable at higher context sizes.
How much the cache quantization impacts the model's response quality will depend on the model and the task. Models that have a high GQA count (e.g. Qwen2) may see a larger impact on precision from quantization than models with a low GQA count.
You may need to experiment with different quantization types to find the best balance between memory usage and quality.

View File

@@ -28,6 +28,7 @@ Check your compute compatibility to see if your card is supported:
| 5.0 | GeForce GTX | `GTX 750 Ti` `GTX 750` `NVS 810` |
| | Quadro | `K2200` `K1200` `K620` `M1200` `M520` `M5000M` `M4000M` `M3000M` `M2000M` `M1000M` `K620M` `M600M` `M500M` |
For building locally to support older GPUs, see [developer.md](./development.md#linux-cuda-nvidia)
### GPU Selection
@@ -37,7 +38,7 @@ Numeric IDs may be used, however ordering may vary, so UUIDs are more reliable.
You can discover the UUID of your GPUs by running `nvidia-smi -L` If you want to
ignore the GPUs and force CPU usage, use an invalid GPU ID (e.g., "-1")
### Laptop Suspend Resume
### Linux Suspend Resume
On linux, after a suspend/resume cycle, sometimes Ollama will fail to discover
your NVIDIA GPU, and fallback to running on the CPU. You can workaround this

View File

@@ -10,6 +10,9 @@ curl -fsSL https://ollama.com/install.sh | sh
## Manual install
> [!NOTE]
> If you are upgrading from a prior version, you should remove the old libraries with `sudo rm -rf /usr/lib/ollama` first.
Download and extract the package:
```shell

View File

@@ -63,12 +63,10 @@ SYSTEM You are Mario from super mario bros, acting as an assistant.
To use this:
1. Save it as a file (e.g. `Modelfile`)
2. `ollama create choose-a-model-name -f <location of the file e.g. ./Modelfile>'`
2. `ollama create choose-a-model-name -f <location of the file e.g. ./Modelfile>`
3. `ollama run choose-a-model-name`
4. Start using the model!
More examples are available in the [examples directory](../examples).
To view the Modelfile of a given model, use the `ollama show --modelfile` command.
```bash
@@ -155,8 +153,7 @@ PARAMETER <parameter> <parametervalue>
| temperature | The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8) | float | temperature 0.7 |
| seed | Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0) | int | seed 42 |
| stop | Sets the stop sequences to use. When this pattern is encountered the LLM will stop generating text and return. Multiple stop patterns may be set by specifying multiple separate `stop` parameters in a modelfile. | string | stop "AI assistant:" |
| tfs_z | Tail free sampling is used to reduce the impact of less probable tokens from the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (default: 1) | float | tfs_z 1 |
| num_predict | Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context) | int | num_predict 42 |
| num_predict | Maximum number of tokens to predict when generating text. (Default: -1, infinite generation) | int | num_predict 42 |
| top_k | Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40) | int | top_k 40 |
| top_p | Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9) | float | top_p 0.9 |
| min_p | Alternative to the top_p, and aims to ensure a balance of quality and variety. The parameter *p* represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with *p*=0.05 and the most likely token having a probability of 0.9, logits with a value less than 0.045 are filtered out. (Default: 0.0) | float | min_p 0.05 |

View File

@@ -59,6 +59,40 @@ embeddings = client.embeddings.create(
input=["why is the sky blue?", "why is the grass green?"],
)
```
#### Structured outputs
```py
from pydantic import BaseModel
from openai import OpenAI
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
# Define the schema for the response
class FriendInfo(BaseModel):
name: str
age: int
is_available: bool
class FriendList(BaseModel):
friends: list[FriendInfo]
try:
completion = client.beta.chat.completions.parse(
temperature=0,
model="llama3.1:8b",
messages=[
{"role": "user", "content": "I have two friends. The first is Ollama 22 years old busy saving the world, and the second is Alonso 23 years old and wants to hang out. Return a list of friends in JSON format"}
],
response_format=FriendList,
)
friends_response = completion.choices[0].message
if friends_response.parsed:
print(friends_response.parsed)
elif friends_response.refusal:
print(friends_response.refusal)
except Exception as e:
print(f"Error: {e}")
```
### OpenAI JavaScript library
@@ -170,6 +204,45 @@ curl http://localhost:11434/v1/embeddings \
}'
```
## Extra arguments
### Setting context length
- `context_length` parameter can be used to set the context length for the model
#### OpenAI python library
- OpenAI python library does not support setting context length, however this can be set for Ollama through the `extra_body` parameter
```py
completion = client.chat.completions.create(
model="llama3.1:8b",
messages=[{"role": "user", "content": "Say this is a test"}],
extra_body={"context_length": 4096},
)
```
#### OpenAI JavaScript library
- OpenAI JavaScript library does not support setting context length, however this can be set for Ollama by passing `context_length` directly with a `@ts-expect-error` as an undocumented parameter in the OpenAI JavaScript library. [See documentation here](https://github.com/openai/openai-node?tab=readme-ov-file#making-customundocumented-requests)
```ts
const chatCompletion = await openai.chat.completions.create({
messages: [{ role: 'user', content: 'Say this is a test' }],
model: 'llama3.2',
// @ts-expect-error context_length is an additional parameter
context_length: 4096,
})
```
#### `curl`
```shell
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama3.2",
"messages": [{"role": "user", "content": "Say this is a test"}],
"context_length": 4096
}'
```
## Endpoints
### `/v1/chat/completions`
@@ -179,9 +252,10 @@ curl http://localhost:11434/v1/embeddings \
- [x] Chat completions
- [x] Streaming
- [x] JSON mode
- [x] Structured outputs
- [x] Reproducible outputs
- [x] Vision
- [x] Tools (streaming support coming soon)
- [x] Tools
- [ ] Logprobs
#### Supported request fields
@@ -199,6 +273,8 @@ curl http://localhost:11434/v1/embeddings \
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `stream_options`
- [x] `include_usage`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
@@ -227,6 +303,8 @@ curl http://localhost:11434/v1/embeddings \
- [x] `seed`
- [x] `stop`
- [x] `stream`
- [x] `stream_options`
- [x] `include_usage`
- [x] `temperature`
- [x] `top_p`
- [x] `max_tokens`
@@ -301,27 +379,3 @@ curl http://localhost:11434/v1/chat/completions \
}'
```
### Setting the context size
The OpenAI API does not have a way of setting the context size for a model. If you need to change the context size, create a `Modelfile` which looks like:
```modelfile
FROM <some model>
PARAMETER num_ctx <context size>
```
Use the `ollama create mymodel` command to create a new model with the updated context size. Call the API with the updated model name:
```shell
curl http://localhost:11434/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "mymodel",
"messages": [
{
"role": "user",
"content": "Hello!"
}
]
}'
```

View File

@@ -111,7 +111,7 @@ Keep the following tips and best practices in mind when working with Go template
ChatML is a popular template format. It can be used for models such as Databrick's DBRX, Intel's Neural Chat, and Microsoft's Orca 2.
```gotmpl
```go
{{- range .Messages }}<|im_start|>{{ .Role }}
{{ .Content }}<|im_end|>
{{ end }}<|im_start|>assistant
@@ -125,7 +125,7 @@ Tools support can be added to a model by adding a `{{ .Tools }}` node to the tem
Mistral v0.3 and Mixtral 8x22B supports tool calling.
```gotmpl
```go
{{- range $index, $_ := .Messages }}
{{- if eq .Role "user" }}
{{- if and (le (len (slice $.Messages $index)) 2) $.Tools }}[AVAILABLE_TOOLS] {{ json $.Tools }}[/AVAILABLE_TOOLS]
@@ -151,7 +151,7 @@ Fill-in-middle support can be added to a model by adding a `{{ .Suffix }}` node
CodeLlama [7B](https://ollama.com/library/codellama:7b-code) and [13B](https://ollama.com/library/codellama:13b-code) code completion models support fill-in-middle.
```gotmpl
```go
<PRE> {{ .Prompt }} <SUF>{{ .Suffix }} <MID>
```

View File

@@ -80,7 +80,7 @@ If you are using a container to run Ollama, make sure you've set up the containe
Sometimes the Ollama can have difficulties initializing the GPU. When you check the server logs, this can show up as various error codes, such as "3" (not initialized), "46" (device unavailable), "100" (no device), "999" (unknown), or others. The following troubleshooting techniques may help resolve the problem
- If you are using a container, is the container runtime working? Try `docker run --gpus all ubuntu nvidia-smi` - if this doesn't work, Ollama wont be able to see your NVIDIA GPU.
- If you are using a container, is the container runtime working? Try `docker run --gpus all ubuntu nvidia-smi` - if this doesn't work, Ollama won't be able to see your NVIDIA GPU.
- Is the uvm driver loaded? `sudo nvidia-modprobe -u`
- Try reloading the nvidia_uvm driver - `sudo rmmod nvidia_uvm` then `sudo modprobe nvidia_uvm`
- Try rebooting

View File

@@ -1,9 +0,0 @@
# Tutorials
Here is a list of ways you can use Ollama with other tools to build interesting applications.
- [Using LangChain with Ollama in JavaScript](./tutorials/langchainjs.md)
- [Using LangChain with Ollama in Python](./tutorials/langchainpy.md)
- [Running Ollama on NVIDIA Jetson Devices](./tutorials/nvidia-jetson.md)
Also be sure to check out the [examples](../examples) directory for more ways to use Ollama.

View File

@@ -83,3 +83,6 @@ If you'd like to install or integrate Ollama as a service, a standalone
and GPU library dependencies for Nvidia and AMD. This allows for embedding
Ollama in existing applications, or running it as a system service via `ollama
serve` with tools such as [NSSM](https://nssm.cc/).
> [!NOTE]
> If you are upgrading from a prior version, you should remove the old directories first.

View File

@@ -153,6 +153,8 @@ var (
Debug = Bool("OLLAMA_DEBUG")
// FlashAttention enables the experimental flash attention feature.
FlashAttention = Bool("OLLAMA_FLASH_ATTENTION")
// KvCacheType is the quantization type for the K/V cache.
KvCacheType = String("OLLAMA_KV_CACHE_TYPE")
// NoHistory disables readline history.
NoHistory = Bool("OLLAMA_NOHISTORY")
// NoPrune disables pruning of model blobs on startup.
@@ -173,7 +175,6 @@ func String(s string) func() string {
var (
LLMLibrary = String("OLLAMA_LLM_LIBRARY")
TmpDir = String("OLLAMA_TMPDIR")
CudaVisibleDevices = String("CUDA_VISIBLE_DEVICES")
HipVisibleDevices = String("HIP_VISIBLE_DEVICES")
@@ -234,6 +235,7 @@ func AsMap() map[string]EnvVar {
ret := map[string]EnvVar{
"OLLAMA_DEBUG": {"OLLAMA_DEBUG", Debug(), "Show additional debug information (e.g. OLLAMA_DEBUG=1)"},
"OLLAMA_FLASH_ATTENTION": {"OLLAMA_FLASH_ATTENTION", FlashAttention(), "Enabled flash attention"},
"OLLAMA_KV_CACHE_TYPE": {"OLLAMA_KV_CACHE_TYPE", KvCacheType(), "Quantization type for the K/V cache (default: f16)"},
"OLLAMA_GPU_OVERHEAD": {"OLLAMA_GPU_OVERHEAD", GpuOverhead(), "Reserve a portion of VRAM per GPU (bytes)"},
"OLLAMA_HOST": {"OLLAMA_HOST", Host(), "IP Address for the ollama server (default 127.0.0.1:11434)"},
"OLLAMA_KEEP_ALIVE": {"OLLAMA_KEEP_ALIVE", KeepAlive(), "The duration that models stay loaded in memory (default \"5m\")"},
@@ -247,7 +249,6 @@ func AsMap() map[string]EnvVar {
"OLLAMA_NUM_PARALLEL": {"OLLAMA_NUM_PARALLEL", NumParallel(), "Maximum number of parallel requests"},
"OLLAMA_ORIGINS": {"OLLAMA_ORIGINS", Origins(), "A comma separated list of allowed origins"},
"OLLAMA_SCHED_SPREAD": {"OLLAMA_SCHED_SPREAD", SchedSpread(), "Always schedule model across all GPUs"},
"OLLAMA_TMPDIR": {"OLLAMA_TMPDIR", TmpDir(), "Location for temporary files"},
"OLLAMA_MULTIUSER_CACHE": {"OLLAMA_MULTIUSER_CACHE", MultiUserCache(), "Optimize prompt caching for multi-user scenarios"},
// Informational

174
examples/.gitignore vendored
View File

@@ -1,174 +0,0 @@
node_modules
bun.lockb
.vscode
# OSX
.DS_STORE
# Models
models/
# Local Chroma db
.chroma/
db/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

View File

@@ -1,3 +0,0 @@
# Examples
This directory contains different examples of using Ollama.

View File

@@ -1 +0,0 @@
fly.toml

View File

@@ -1,67 +0,0 @@
# Deploy Ollama to Fly.io
> Note: this example exposes a public endpoint and does not configure authentication. Use with care.
## Prerequisites
- Ollama: https://ollama.com/download
- Fly.io account. Sign up for a free account: https://fly.io/app/sign-up
## Steps
1. Login to Fly.io
```bash
fly auth login
```
1. Create a new Fly app
```bash
fly launch --name <name> --image ollama/ollama --internal-port 11434 --vm-size shared-cpu-8x --now
```
1. Pull and run `orca-mini:3b`
```bash
OLLAMA_HOST=https://<name>.fly.dev ollama run orca-mini:3b
```
`shared-cpu-8x` is a free-tier eligible machine type. For better performance, switch to a `performance` or `dedicated` machine type or attach a GPU for hardware acceleration (see below).
## (Optional) Persistent Volume
By default Fly Machines use ephemeral storage which is problematic if you want to use the same model across restarts without pulling it again. Create and attach a persistent volume to store the downloaded models:
1. Create the Fly Volume
```bash
fly volume create ollama
```
1. Update `fly.toml` and add `[mounts]`
```toml
[mounts]
source = "ollama"
destination = "/mnt/ollama/models"
```
1. Update `fly.toml` and add `[env]`
```toml
[env]
OLLAMA_MODELS = "/mnt/ollama/models"
```
1. Deploy your app
```bash
fly deploy
```
## (Optional) Hardware Acceleration
Fly.io GPU is currently in waitlist. Sign up for the waitlist: https://fly.io/gpu
Once you've been accepted, create the app with the additional flags `--vm-gpu-kind a100-pcie-40gb` or `--vm-gpu-kind a100-pcie-80gb`.

View File

@@ -1,29 +0,0 @@
package main
import (
"bytes"
"fmt"
"io"
"log"
"net/http"
"os"
)
func main() {
body := []byte(`{"model":"mistral"}`)
resp, err := http.Post("http://localhost:11434/api/generate", "application/json", bytes.NewBuffer(body))
if err != nil {
fmt.Print(err.Error())
os.Exit(1)
}
defer resp.Body.Close()
responseData, err := io.ReadAll(resp.Body)
if err != nil {
log.Fatal(err)
}
fmt.Println(string(responseData))
}

View File

@@ -1,5 +0,0 @@
# Ollama Jupyter Notebook
This example downloads and installs Ollama in a Jupyter instance such as Google Colab. It will start the Ollama service and expose an endpoint using `ngrok` which can be used to communicate with the Ollama instance remotely.
For best results, use an instance with GPU accelerator.

View File

@@ -1,102 +0,0 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"id": "93f59dcb-c588-41b8-a792-55d88ade739c",
"metadata": {},
"outputs": [],
"source": [
"# Download and run the Ollama Linux install script\n",
"!curl -fsSL https://ollama.com/install.sh | sh\n",
"!command -v systemctl >/dev/null && sudo systemctl stop ollama"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "658c147e-c7f8-490e-910e-62b80f577dda",
"metadata": {},
"outputs": [],
"source": [
"!pip install aiohttp pyngrok\n",
"\n",
"import os\n",
"import asyncio\n",
"from aiohttp import ClientSession\n",
"\n",
"# Set LD_LIBRARY_PATH so the system NVIDIA library becomes preferred\n",
"# over the built-in library. This is particularly important for \n",
"# Google Colab which installs older drivers\n",
"os.environ.update({'LD_LIBRARY_PATH': '/usr/lib64-nvidia'})\n",
"\n",
"async def run(cmd):\n",
" '''\n",
" run is a helper function to run subcommands asynchronously.\n",
" '''\n",
" print('>>> starting', *cmd)\n",
" p = await asyncio.subprocess.create_subprocess_exec(\n",
" *cmd,\n",
" stdout=asyncio.subprocess.PIPE,\n",
" stderr=asyncio.subprocess.PIPE,\n",
" )\n",
"\n",
" async def pipe(lines):\n",
" async for line in lines:\n",
" print(line.strip().decode('utf-8'))\n",
"\n",
" await asyncio.gather(\n",
" pipe(p.stdout),\n",
" pipe(p.stderr),\n",
" )\n",
"\n",
"\n",
"await asyncio.gather(\n",
" run(['ollama', 'serve']),\n",
" run(['ngrok', 'http', '--log', 'stderr', '11434']),\n",
")"
]
},
{
"cell_type": "markdown",
"id": "e7735a55-9aad-4caf-8683-52e2163ba53b",
"metadata": {},
"source": [
"The previous cell starts two processes, `ollama` and `ngrok`. The log output will show a line like the following which describes the external address.\n",
"\n",
"```\n",
"t=2023-11-12T22:55:56+0000 lvl=info msg=\"started tunnel\" obj=tunnels name=command_line addr=http://localhost:11434 url=https://8249-34-125-179-11.ngrok.io\n",
"```\n",
"\n",
"The external address in this case is `https://8249-34-125-179-11.ngrok.io` which can be passed into `OLLAMA_HOST` to access this instance.\n",
"\n",
"```bash\n",
"export OLLAMA_HOST=https://8249-34-125-179-11.ngrok.io\n",
"ollama list\n",
"ollama run mistral\n",
"```"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}

View File

@@ -1,38 +0,0 @@
# Deploy Ollama to Kubernetes
## Prerequisites
- Ollama: https://ollama.com/download
- Kubernetes cluster. This example will use Google Kubernetes Engine.
## Steps
1. Create the Ollama namespace, deployment, and service
```bash
kubectl apply -f cpu.yaml
```
## (Optional) Hardware Acceleration
Hardware acceleration in Kubernetes requires NVIDIA's [`k8s-device-plugin`](https://github.com/NVIDIA/k8s-device-plugin) which is deployed in Kubernetes in form of daemonset. Follow the link for more details.
Once configured, create a GPU enabled Ollama deployment.
```bash
kubectl apply -f gpu.yaml
```
## Test
1. Port forward the Ollama service to connect and use it locally
```bash
kubectl -n ollama port-forward service/ollama 11434:80
```
1. Pull and run a model, for example `orca-mini:3b`
```bash
ollama run orca-mini:3b
```

View File

@@ -1,42 +0,0 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: ollama
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: ollama
namespace: ollama
spec:
selector:
matchLabels:
name: ollama
template:
metadata:
labels:
name: ollama
spec:
containers:
- name: ollama
image: ollama/ollama:latest
ports:
- name: http
containerPort: 11434
protocol: TCP
---
apiVersion: v1
kind: Service
metadata:
name: ollama
namespace: ollama
spec:
type: ClusterIP
selector:
name: ollama
ports:
- port: 80
name: http
targetPort: http
protocol: TCP

View File

@@ -1,58 +0,0 @@
---
apiVersion: v1
kind: Namespace
metadata:
name: ollama
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: ollama
namespace: ollama
spec:
strategy:
type: Recreate
selector:
matchLabels:
name: ollama
template:
metadata:
labels:
name: ollama
spec:
containers:
- name: ollama
image: ollama/ollama:latest
env:
- name: PATH
value: /usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
- name: LD_LIBRARY_PATH
value: /usr/local/nvidia/lib:/usr/local/nvidia/lib64
- name: NVIDIA_DRIVER_CAPABILITIES
value: compute,utility
ports:
- name: http
containerPort: 11434
protocol: TCP
resources:
limits:
nvidia.com/gpu: 1
tolerations:
- key: nvidia.com/gpu
operator: Exists
effect: NoSchedule
---
apiVersion: v1
kind: Service
metadata:
name: ollama
namespace: ollama
spec:
type: ClusterIP
selector:
name: ollama
ports:
- port: 80
name: http
targetPort: http
protocol: TCP

View File

@@ -1,29 +0,0 @@
# LangChain Document QA
This example provides an interface for asking questions to a PDF document.
## Setup
1. Ensure you have the `llama3.2` model installed:
```
ollama pull llama3.2
```
2. Install the Python Requirements.
```
pip install -r requirements.txt
```
## Run
```
python main.py
```
A prompt will appear, where questions may be asked:
```
Query: How many locations does WeWork have?
```

View File

@@ -1,61 +0,0 @@
from langchain.document_loaders import OnlinePDFLoader
from langchain.vectorstores import Chroma
from langchain.embeddings import GPT4AllEmbeddings
from langchain import PromptTemplate
from langchain.llms import Ollama
from langchain.callbacks.manager import CallbackManager
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.chains import RetrievalQA
import sys
import os
class SuppressStdout:
def __enter__(self):
self._original_stdout = sys.stdout
self._original_stderr = sys.stderr
sys.stdout = open(os.devnull, 'w')
sys.stderr = open(os.devnull, 'w')
def __exit__(self, exc_type, exc_val, exc_tb):
sys.stdout.close()
sys.stdout = self._original_stdout
sys.stderr = self._original_stderr
# load the pdf and split it into chunks
loader = OnlinePDFLoader("https://d18rn0p25nwr6d.cloudfront.net/CIK-0001813756/975b3e9b-268e-4798-a9e4-2a9a7c92dc10.pdf")
data = loader.load()
from langchain.text_splitter import RecursiveCharacterTextSplitter
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
all_splits = text_splitter.split_documents(data)
with SuppressStdout():
vectorstore = Chroma.from_documents(documents=all_splits, embedding=GPT4AllEmbeddings())
while True:
query = input("\nQuery: ")
if query == "exit":
break
if query.strip() == "":
continue
# Prompt
template = """Use the following pieces of context to answer the question at the end.
If you don't know the answer, just say that you don't know, don't try to make up an answer.
Use three sentences maximum and keep the answer as concise as possible.
{context}
Question: {question}
Helpful Answer:"""
QA_CHAIN_PROMPT = PromptTemplate(
input_variables=["context", "question"],
template=template,
)
llm = Ollama(model="llama3.2", callback_manager=CallbackManager([StreamingStdOutCallbackHandler()]))
qa_chain = RetrievalQA.from_chain_type(
llm,
retriever=vectorstore.as_retriever(),
chain_type_kwargs={"prompt": QA_CHAIN_PROMPT},
)
result = qa_chain({"query": query})

View File

@@ -1,109 +0,0 @@
absl-py==1.4.0
aiohttp==3.8.5
aiosignal==1.3.1
anyio==3.7.1
astunparse==1.6.3
async-timeout==4.0.3
attrs==23.1.0
backoff==2.2.1
beautifulsoup4==4.12.2
bs4==0.0.1
cachetools==5.3.1
certifi==2023.7.22
cffi==1.15.1
chardet==5.2.0
charset-normalizer==3.2.0
Chroma==0.2.0
chroma-hnswlib==0.7.2
chromadb==0.4.5
click==8.1.6
coloredlogs==15.0.1
cryptography==41.0.3
dataclasses-json==0.5.14
fastapi==0.99.1
filetype==1.2.0
flatbuffers==23.5.26
frozenlist==1.4.0
gast==0.4.0
google-auth==2.22.0
google-auth-oauthlib==1.0.0
google-pasta==0.2.0
gpt4all==1.0.8
grpcio==1.57.0
h11==0.14.0
h5py==3.9.0
httptools==0.6.0
humanfriendly==10.0
idna==3.4
importlib-resources==6.0.1
joblib==1.3.2
keras==2.13.1
langchain==0.0.261
langsmith==0.0.21
libclang==16.0.6
lxml==4.9.3
Markdown==3.4.4
MarkupSafe==2.1.3
marshmallow==3.20.1
monotonic==1.6
mpmath==1.3.0
multidict==6.0.4
mypy-extensions==1.0.0
nltk==3.8.1
numexpr==2.8.5
numpy==1.24.3
oauthlib==3.2.2
onnxruntime==1.15.1
openapi-schema-pydantic==1.2.4
opt-einsum==3.3.0
overrides==7.4.0
packaging==23.1
pdf2image==1.16.3
pdfminer==20191125
pdfminer.six==20221105
Pillow==10.0.0
posthog==3.0.1
protobuf==4.24.0
pulsar-client==3.2.0
pyasn1==0.5.0
pyasn1-modules==0.3.0
pycparser==2.21
pycryptodome==3.18.0
pydantic==1.10.12
PyPika==0.48.9
python-dateutil==2.8.2
python-dotenv==1.0.0
python-magic==0.4.27
PyYAML==6.0.1
regex==2023.8.8
requests==2.31.0
requests-oauthlib==1.3.1
rsa==4.9
six==1.16.0
sniffio==1.3.0
soupsieve==2.4.1
SQLAlchemy==2.0.19
starlette==0.27.0
sympy==1.12
tabulate==0.9.0
tenacity==8.2.2
tensorboard==2.13.0
tensorboard-data-server==0.7.1
tensorflow==2.13.0
tensorflow-estimator==2.13.0
tensorflow-hub==0.14.0
tensorflow-macos==2.13.0
termcolor==2.3.0
tokenizers==0.13.3
tqdm==4.66.1
typing-inspect==0.9.0
typing_extensions==4.5.0
unstructured==0.9.2
urllib3==1.26.16
uvicorn==0.23.2
uvloop==0.17.0
watchfiles==0.19.0
websockets==11.0.3
Werkzeug==2.3.6
wrapt==1.15.0
yarl==1.9.2

View File

@@ -1,170 +0,0 @@
# OSX
.DS_STORE
# Models
models/
# Local Chroma db
.chroma/
db/
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
.pybuilder/
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock
# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml
# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/
# Celery stuff
celerybeat-schedule
celerybeat.pid
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
# pytype static type analyzer
.pytype/
# Cython debug symbols
cython_debug/
# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

View File

@@ -1,201 +0,0 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

View File

@@ -1,91 +0,0 @@
# PrivateGPT with Llama 2 uncensored
https://github.com/ollama/ollama/assets/3325447/20cf8ec6-ff25-42c6-bdd8-9be594e3ce1b
> Note: this example is a slightly modified version of PrivateGPT using models such as Llama 2 Uncensored. All credit for PrivateGPT goes to Iván Martínez who is the creator of it, and you can find his GitHub repo [here](https://github.com/imartinez/privateGPT).
### Setup
Set up a virtual environment (optional):
```
python3 -m venv .venv
source .venv/bin/activate
```
Install the Python dependencies:
```shell
pip install -r requirements.txt
```
Pull the model you'd like to use:
```
ollama pull llama2-uncensored
```
### Getting WeWork's latest quarterly earnings report (10-Q)
```
mkdir source_documents
curl https://d18rn0p25nwr6d.cloudfront.net/CIK-0001813756/975b3e9b-268e-4798-a9e4-2a9a7c92dc10.pdf -o source_documents/wework.pdf
```
### Ingesting files
```shell
python ingest.py
```
Output should look like this:
```shell
Creating new vectorstore
Loading documents from source_documents
Loading new documents: 100%|██████████████████████| 1/1 [00:01<00:00, 1.73s/it]
Loaded 1 new documents from source_documents
Split into 90 chunks of text (max. 500 tokens each)
Creating embeddings. May take some minutes...
Using embedded DuckDB with persistence: data will be stored in: db
Ingestion complete! You can now run privateGPT.py to query your documents
```
### Ask questions
```shell
python privateGPT.py
Enter a query: How many locations does WeWork have?
> Answer (took 17.7 s.):
As of June 2023, WeWork has 777 locations worldwide, including 610 Consolidated Locations (as defined in the section entitled Key Performance Indicators).
```
### Try a different model:
```
ollama pull llama2:13b
MODEL=llama2:13b python privateGPT.py
```
## Adding more files
Put any and all your files into the `source_documents` directory
The supported extensions are:
- `.csv`: CSV,
- `.docx`: Word Document,
- `.doc`: Word Document,
- `.enex`: EverNote,
- `.eml`: Email,
- `.epub`: EPub,
- `.html`: HTML File,
- `.md`: Markdown,
- `.msg`: Outlook Message,
- `.odt`: Open Document Text,
- `.pdf`: Portable Document Format (PDF),
- `.pptx` : PowerPoint Document,
- `.ppt` : PowerPoint Document,
- `.txt`: Text file (UTF-8),

View File

@@ -1,11 +0,0 @@
import os
from chromadb.config import Settings
# Define the folder for storing database
PERSIST_DIRECTORY = os.environ.get('PERSIST_DIRECTORY', 'db')
# Define the Chroma settings
CHROMA_SETTINGS = Settings(
persist_directory=PERSIST_DIRECTORY,
anonymized_telemetry=False
)

View File

@@ -1,170 +0,0 @@
#!/usr/bin/env python3
import os
import glob
from typing import List
from multiprocessing import Pool
from tqdm import tqdm
from langchain.document_loaders import (
CSVLoader,
EverNoteLoader,
PyMuPDFLoader,
TextLoader,
UnstructuredEmailLoader,
UnstructuredEPubLoader,
UnstructuredHTMLLoader,
UnstructuredMarkdownLoader,
UnstructuredODTLoader,
UnstructuredPowerPointLoader,
UnstructuredWordDocumentLoader,
)
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.vectorstores import Chroma
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.docstore.document import Document
from constants import CHROMA_SETTINGS
# Load environment variables
persist_directory = os.environ.get('PERSIST_DIRECTORY', 'db')
source_directory = os.environ.get('SOURCE_DIRECTORY', 'source_documents')
embeddings_model_name = os.environ.get('EMBEDDINGS_MODEL_NAME', 'all-MiniLM-L6-v2')
chunk_size = 500
chunk_overlap = 50
# Custom document loaders
class MyElmLoader(UnstructuredEmailLoader):
"""Wrapper to fallback to text/plain when default does not work"""
def load(self) -> List[Document]:
"""Wrapper adding fallback for elm without html"""
try:
try:
doc = UnstructuredEmailLoader.load(self)
except ValueError as e:
if 'text/html content not found in email' in str(e):
# Try plain text
self.unstructured_kwargs["content_source"]="text/plain"
doc = UnstructuredEmailLoader.load(self)
else:
raise
except Exception as e:
# Add file_path to exception message
raise type(e)(f"{self.file_path}: {e}") from e
return doc
# Map file extensions to document loaders and their arguments
LOADER_MAPPING = {
".csv": (CSVLoader, {}),
# ".docx": (Docx2txtLoader, {}),
".doc": (UnstructuredWordDocumentLoader, {}),
".docx": (UnstructuredWordDocumentLoader, {}),
".enex": (EverNoteLoader, {}),
".eml": (MyElmLoader, {}),
".epub": (UnstructuredEPubLoader, {}),
".html": (UnstructuredHTMLLoader, {}),
".md": (UnstructuredMarkdownLoader, {}),
".odt": (UnstructuredODTLoader, {}),
".pdf": (PyMuPDFLoader, {}),
".ppt": (UnstructuredPowerPointLoader, {}),
".pptx": (UnstructuredPowerPointLoader, {}),
".txt": (TextLoader, {"encoding": "utf8"}),
# Add more mappings for other file extensions and loaders as needed
}
def load_single_document(file_path: str) -> List[Document]:
if os.path.getsize(file_path) != 0:
filename, ext = os.path.splitext(file_path)
if ext in LOADER_MAPPING:
loader_class, loader_args = LOADER_MAPPING[ext]
try:
loader = loader_class(file_path, **loader_args)
if loader:
return loader.load()
except:
print(f"Corrupted file {file_path}. Ignoring it.")
else:
print(f"Unsupported file {file_path}. Ignoring it.")
else:
print(f"Empty file {file_path}. Ignoring it.")
def load_documents(source_dir: str, ignored_files: List[str] = []) -> List[Document]:
"""
Loads all documents from the source documents directory, ignoring specified files
"""
all_files = []
for ext in LOADER_MAPPING:
all_files.extend(
glob.glob(os.path.join(source_dir, f"**/*{ext}"), recursive=True)
)
filtered_files = [file_path for file_path in all_files if file_path not in ignored_files]
with Pool(processes=os.cpu_count()) as pool:
results = []
with tqdm(total=len(filtered_files), desc='Loading new documents', ncols=80) as pbar:
for i, docs in enumerate(pool.imap_unordered(load_single_document, filtered_files)):
if docs:
results.extend(docs)
pbar.update()
return results
def process_documents(ignored_files: List[str] = []) -> List[Document]:
"""
Load documents and split in chunks
"""
print(f"Loading documents from {source_directory}")
documents = load_documents(source_directory, ignored_files)
if not documents:
print("No new documents to load")
exit(0)
print(f"Loaded {len(documents)} new documents from {source_directory}")
text_splitter = RecursiveCharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
texts = text_splitter.split_documents(documents)
print(f"Split into {len(texts)} chunks of text (max. {chunk_size} tokens each)")
return texts
def does_vectorstore_exist(persist_directory: str) -> bool:
"""
Checks if vectorstore exists
"""
if os.path.exists(os.path.join(persist_directory, 'index')):
if os.path.exists(os.path.join(persist_directory, 'chroma-collections.parquet')) and os.path.exists(os.path.join(persist_directory, 'chroma-embeddings.parquet')):
list_index_files = glob.glob(os.path.join(persist_directory, 'index/*.bin'))
list_index_files += glob.glob(os.path.join(persist_directory, 'index/*.pkl'))
# At least 3 documents are needed in a working vectorstore
if len(list_index_files) > 3:
return True
return False
def main():
# Create embeddings
embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
if does_vectorstore_exist(persist_directory):
# Update and store locally vectorstore
print(f"Appending to existing vectorstore at {persist_directory}")
db = Chroma(persist_directory=persist_directory, embedding_function=embeddings, client_settings=CHROMA_SETTINGS)
collection = db.get()
texts = process_documents([metadata['source'] for metadata in collection['metadatas']])
print(f"Creating embeddings. May take some minutes...")
db.add_documents(texts)
else:
# Create and store locally vectorstore
print("Creating new vectorstore")
texts = process_documents()
print(f"Creating embeddings. May take some minutes...")
db = Chroma.from_documents(texts, embeddings, persist_directory=persist_directory)
db.persist()
db = None
print(f"Ingestion complete! You can now run privateGPT.py to query your documents")
if __name__ == "__main__":
main()

File diff suppressed because it is too large Load Diff

View File

@@ -1,74 +0,0 @@
#!/usr/bin/env python3
from langchain.chains import RetrievalQA
from langchain.embeddings import HuggingFaceEmbeddings
from langchain.callbacks.streaming_stdout import StreamingStdOutCallbackHandler
from langchain.vectorstores import Chroma
from langchain.llms import Ollama
import chromadb
import os
import argparse
import time
model = os.environ.get("MODEL", "llama2-uncensored")
# For embeddings model, the example uses a sentence-transformers model
# https://www.sbert.net/docs/pretrained_models.html
# "The all-mpnet-base-v2 model provides the best quality, while all-MiniLM-L6-v2 is 5 times faster and still offers good quality."
embeddings_model_name = os.environ.get("EMBEDDINGS_MODEL_NAME", "all-MiniLM-L6-v2")
persist_directory = os.environ.get("PERSIST_DIRECTORY", "db")
target_source_chunks = int(os.environ.get('TARGET_SOURCE_CHUNKS',4))
from constants import CHROMA_SETTINGS
def main():
# Parse the command line arguments
args = parse_arguments()
embeddings = HuggingFaceEmbeddings(model_name=embeddings_model_name)
db = Chroma(persist_directory=persist_directory, embedding_function=embeddings)
retriever = db.as_retriever(search_kwargs={"k": target_source_chunks})
# activate/deactivate the streaming StdOut callback for LLMs
callbacks = [] if args.mute_stream else [StreamingStdOutCallbackHandler()]
llm = Ollama(model=model, callbacks=callbacks)
qa = RetrievalQA.from_chain_type(llm=llm, chain_type="stuff", retriever=retriever, return_source_documents= not args.hide_source)
# Interactive questions and answers
while True:
query = input("\nEnter a query: ")
if query == "exit":
break
if query.strip() == "":
continue
# Get the answer from the chain
start = time.time()
res = qa(query)
answer, docs = res['result'], [] if args.hide_source else res['source_documents']
end = time.time()
# Print the result
print("\n\n> Question:")
print(query)
print(answer)
# Print the relevant sources used for the answer
for document in docs:
print("\n> " + document.metadata["source"] + ":")
print(document.page_content)
def parse_arguments():
parser = argparse.ArgumentParser(description='privateGPT: Ask questions to your documents without an internet connection, '
'using the power of LLMs.')
parser.add_argument("--hide-source", "-S", action='store_true',
help='Use this flag to disable printing of source documents used for answers.')
parser.add_argument("--mute-stream", "-M",
action='store_true',
help='Use this flag to disable the streaming StdOut callback for LLMs.')
return parser.parse_args()
if __name__ == "__main__":
main()

View File

@@ -1,26 +0,0 @@
[tool.poetry]
name = "privategpt"
version = "0.1.0"
description = ""
authors = ["Ivan Martinez <ivanmartit@gmail.com>"]
license = "Apache Version 2.0"
readme = "README.md"
[tool.poetry.dependencies]
python = "^3.10"
langchain = "0.0.261"
gpt4all = "^1.0.3"
chromadb = "^0.3.26"
PyMuPDF = "^1.22.5"
python-dotenv = "^1.0.0"
unstructured = "^0.8.0"
extract-msg = "^0.41.5"
tabulate = "^0.9.0"
pandoc = "^2.3"
pypandoc = "^1.11"
tqdm = "^4.65.0"
sentence-transformers = "^2.2.2"
[build-system]
requires = ["poetry-core"]
build-backend = "poetry.core.masonry.api"

View File

@@ -1,15 +0,0 @@
langchain==0.0.274
gpt4all==1.0.8
chromadb==0.5.0
llama-cpp-python==0.1.81
urllib3==2.0.4
PyMuPDF==1.23.5
python-dotenv==1.0.0
unstructured==0.10.8
extract-msg==0.45.0
tabulate==0.9.0
pandoc==2.3
pypandoc==1.11
tqdm==4.66.1
sentence_transformers==2.2.2
numpy>=1.22.2 # not directly required, pinned by Snyk to avoid a vulnerability

View File

@@ -1,23 +0,0 @@
# LangChain Web Summarization
This example summarizes the website, [https://ollama.com/blog/run-llama2-uncensored-locally](https://ollama.com/blog/run-llama2-uncensored-locally)
## Running the Example
1. Ensure you have the `llama3.2` model installed:
```bash
ollama pull llama3.2
```
2. Install the Python Requirements.
```bash
pip install -r requirements.txt
```
3. Run the example:
```bash
python main.py
```

View File

@@ -1,12 +0,0 @@
from langchain_community.llms import Ollama
from langchain_community.document_loaders import WebBaseLoader
from langchain.chains.summarize import load_summarize_chain
loader = WebBaseLoader("https://ollama.com/blog/run-llama2-uncensored-locally")
docs = loader.load()
llm = Ollama(model="llama3.2")
chain = load_summarize_chain(llm, chain_type="stuff")
result = chain.invoke(docs)
print(result)

View File

@@ -1 +0,0 @@
langchain==0.0.259

View File

@@ -1,23 +0,0 @@
# LangChain
This example is a basic "hello world" of using LangChain with Ollama.
## Running the Example
1. Ensure you have the `llama3.2` model installed:
```bash
ollama pull llama3.2
```
2. Install the Python Requirements.
```bash
pip install -r requirements.txt
```
3. Run the example:
```bash
python main.py
```

View File

@@ -1,6 +0,0 @@
from langchain.llms import Ollama
input = input("What is your question?\n> ")
llm = Ollama(model="llama3.2")
res = llm.invoke(input)
print (res)

View File

@@ -1 +0,0 @@
langchain==0.0.259

View File

@@ -1,23 +0,0 @@
# LangChain
This example is a basic "hello world" of using LangChain with Ollama using Node.js and Typescript.
## Running the Example
1. Install the prerequisites:
```bash
npm install
```
2. Ensure the `mistral` model is available:
```bash
ollama pull mistral
```
3. Run the example:
```bash
npm start
```

View File

@@ -1,25 +0,0 @@
import { Ollama } from 'langchain/llms/ollama';
import * as readline from "readline";
async function main() {
const ollama = new Ollama({
model: 'mistral'
// other parameters can be found at https://js.langchain.com/docs/api/llms_ollama/classes/Ollama
});
const rl = readline.createInterface({
input: process.stdin,
output: process.stdout,
});
rl.question("What is your question: \n", async (user_input) => {
const stream = await ollama.stream(user_input);
for await (const chunk of stream) {
process.stdout.write(chunk);
}
rl.close();
})
}
main();

View File

@@ -1,997 +0,0 @@
{
"name": "langchain-typescript-simple",
"lockfileVersion": 3,
"requires": true,
"packages": {
"": {
"dependencies": {
"langchain": "^0.0.165"
},
"devDependencies": {
"typescript": "^5.2.2"
}
},
"node_modules/@anthropic-ai/sdk": {
"version": "0.6.2",
"resolved": "https://registry.npmjs.org/@anthropic-ai/sdk/-/sdk-0.6.2.tgz",
"integrity": "sha512-fB9PUj9RFT+XjkL+E9Ol864ZIJi+1P8WnbHspN3N3/GK2uSzjd0cbVIKTGgf4v3N8MwaQu+UWnU7C4BG/fap/g==",
"dependencies": {
"@types/node": "^18.11.18",
"@types/node-fetch": "^2.6.4",
"abort-controller": "^3.0.0",
"agentkeepalive": "^4.2.1",
"digest-fetch": "^1.3.0",
"form-data-encoder": "1.7.2",
"formdata-node": "^4.3.2",
"node-fetch": "^2.6.7"
}
},
"node_modules/@types/node": {
"version": "18.18.4",
"resolved": "https://registry.npmjs.org/@types/node/-/node-18.18.4.tgz",
"integrity": "sha512-t3rNFBgJRugIhackit2mVcLfF6IRc0JE4oeizPQL8Zrm8n2WY/0wOdpOPhdtG0V9Q2TlW/axbF1MJ6z+Yj/kKQ=="
},
"node_modules/@types/node-fetch": {
"version": "2.6.6",
"resolved": "https://registry.npmjs.org/@types/node-fetch/-/node-fetch-2.6.6.tgz",
"integrity": "sha512-95X8guJYhfqiuVVhRFxVQcf4hW/2bCuoPwDasMf/531STFoNoWTT7YDnWdXHEZKqAGUigmpG31r2FE70LwnzJw==",
"dependencies": {
"@types/node": "*",
"form-data": "^4.0.0"
}
},
"node_modules/@types/retry": {
"version": "0.12.0",
"resolved": "https://registry.npmjs.org/@types/retry/-/retry-0.12.0.tgz",
"integrity": "sha512-wWKOClTTiizcZhXnPY4wikVAwmdYHp8q6DmC+EJUzAMsycb7HB32Kh9RN4+0gExjmPmZSAQjgURXIGATPegAvA=="
},
"node_modules/@types/uuid": {
"version": "9.0.5",
"resolved": "https://registry.npmjs.org/@types/uuid/-/uuid-9.0.5.tgz",
"integrity": "sha512-xfHdwa1FMJ082prjSJpoEI57GZITiQz10r3vEJCHa2khEFQjKy91aWKz6+zybzssCvXUwE1LQWgWVwZ4nYUvHQ=="
},
"node_modules/abort-controller": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/abort-controller/-/abort-controller-3.0.0.tgz",
"integrity": "sha512-h8lQ8tacZYnR3vNQTgibj+tODHI5/+l06Au2Pcriv/Gmet0eaj4TwWH41sO9wnHDiQsEj19q0drzdWdeAHtweg==",
"dependencies": {
"event-target-shim": "^5.0.0"
},
"engines": {
"node": ">=6.5"
}
},
"node_modules/agentkeepalive": {
"version": "4.5.0",
"resolved": "https://registry.npmjs.org/agentkeepalive/-/agentkeepalive-4.5.0.tgz",
"integrity": "sha512-5GG/5IbQQpC9FpkRGsSvZI5QYeSCzlJHdpBQntCsuTOxhKD8lqKhrleg2Yi7yvMIf82Ycmmqln9U8V9qwEiJew==",
"dependencies": {
"humanize-ms": "^1.2.1"
},
"engines": {
"node": ">= 8.0.0"
}
},
"node_modules/ansi-styles": {
"version": "5.2.0",
"resolved": "https://registry.npmjs.org/ansi-styles/-/ansi-styles-5.2.0.tgz",
"integrity": "sha512-Cxwpt2SfTzTtXcfOlzGEee8O+c+MmUgGrNiBcXnuWxuFJHe6a5Hz7qwhwe5OgaSYI0IJvkLqWX1ASG+cJOkEiA==",
"engines": {
"node": ">=10"
},
"funding": {
"url": "https://github.com/chalk/ansi-styles?sponsor=1"
}
},
"node_modules/argparse": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/argparse/-/argparse-2.0.1.tgz",
"integrity": "sha512-8+9WqebbFzpX9OR+Wa6O29asIogeRMzcGtAINdpMHHyAg10f05aSFVBbcEqGf/PXw1EjAZ+q2/bEBg3DvurK3Q=="
},
"node_modules/asynckit": {
"version": "0.4.0",
"resolved": "https://registry.npmjs.org/asynckit/-/asynckit-0.4.0.tgz",
"integrity": "sha512-Oei9OH4tRh0YqU3GxhX79dM/mwVgvbZJaSNaRk+bshkj0S5cfHcgYakreBjrHwatXKbz+IoIdYLxrKim2MjW0Q=="
},
"node_modules/base-64": {
"version": "0.1.0",
"resolved": "https://registry.npmjs.org/base-64/-/base-64-0.1.0.tgz",
"integrity": "sha512-Y5gU45svrR5tI2Vt/X9GPd3L0HNIKzGu202EjxrXMpuc2V2CiKgemAbUUsqYmZJvPtCXoUKjNZwBJzsNScUbXA=="
},
"node_modules/base64-js": {
"version": "1.5.1",
"resolved": "https://registry.npmjs.org/base64-js/-/base64-js-1.5.1.tgz",
"integrity": "sha512-AKpaYlHn8t4SVbOHCy+b5+KKgvR4vrsD8vbvrbiQJps7fKDTkjkDry6ji0rUJjC0kzbNePLwzxq8iypo41qeWA==",
"funding": [
{
"type": "github",
"url": "https://github.com/sponsors/feross"
},
{
"type": "patreon",
"url": "https://www.patreon.com/feross"
},
{
"type": "consulting",
"url": "https://feross.org/support"
}
]
},
"node_modules/binary-extensions": {
"version": "2.2.0",
"resolved": "https://registry.npmjs.org/binary-extensions/-/binary-extensions-2.2.0.tgz",
"integrity": "sha512-jDctJ/IVQbZoJykoeHbhXpOlNBqGNcwXJKJog42E5HDPUwQTSdjCHdihjj0DlnheQ7blbT6dHOafNAiS8ooQKA==",
"engines": {
"node": ">=8"
}
},
"node_modules/binary-search": {
"version": "1.3.6",
"resolved": "https://registry.npmjs.org/binary-search/-/binary-search-1.3.6.tgz",
"integrity": "sha512-nbE1WxOTTrUWIfsfZ4aHGYu5DOuNkbxGokjV6Z2kxfJK3uaAb8zNK1muzOeipoLHZjInT4Br88BHpzevc681xA=="
},
"node_modules/camelcase": {
"version": "6.3.0",
"resolved": "https://registry.npmjs.org/camelcase/-/camelcase-6.3.0.tgz",
"integrity": "sha512-Gmy6FhYlCY7uOElZUSbxo2UCDH8owEk996gkbrpsgGtrJLM3J7jGxl9Ic7Qwwj4ivOE5AWZWRMecDdF7hqGjFA==",
"engines": {
"node": ">=10"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/charenc": {
"version": "0.0.2",
"resolved": "https://registry.npmjs.org/charenc/-/charenc-0.0.2.tgz",
"integrity": "sha512-yrLQ/yVUFXkzg7EDQsPieE/53+0RlaWTs+wBrvW36cyilJ2SaDWfl4Yj7MtLTXleV9uEKefbAGUPv2/iWSooRA==",
"engines": {
"node": "*"
}
},
"node_modules/combined-stream": {
"version": "1.0.8",
"resolved": "https://registry.npmjs.org/combined-stream/-/combined-stream-1.0.8.tgz",
"integrity": "sha512-FQN4MRfuJeHf7cBbBMJFXhKSDq+2kAArBlmRBvcvFE5BB1HZKXtSFASDhdlz9zOYwxh8lDdnvmMOe/+5cdoEdg==",
"dependencies": {
"delayed-stream": "~1.0.0"
},
"engines": {
"node": ">= 0.8"
}
},
"node_modules/commander": {
"version": "10.0.1",
"resolved": "https://registry.npmjs.org/commander/-/commander-10.0.1.tgz",
"integrity": "sha512-y4Mg2tXshplEbSGzx7amzPwKKOCGuoSRP/CjEdwwk0FOGlUbq6lKuoyDZTNZkmxHdJtp54hdfY/JUrdL7Xfdug==",
"engines": {
"node": ">=14"
}
},
"node_modules/crypt": {
"version": "0.0.2",
"resolved": "https://registry.npmjs.org/crypt/-/crypt-0.0.2.tgz",
"integrity": "sha512-mCxBlsHFYh9C+HVpiEacem8FEBnMXgU9gy4zmNC+SXAZNB/1idgp/aulFJ4FgCi7GPEVbfyng092GqL2k2rmow==",
"engines": {
"node": "*"
}
},
"node_modules/decamelize": {
"version": "1.2.0",
"resolved": "https://registry.npmjs.org/decamelize/-/decamelize-1.2.0.tgz",
"integrity": "sha512-z2S+W9X73hAUUki+N+9Za2lBlun89zigOyGrsax+KUQ6wKW4ZoWpEYBkGhQjwAjjDCkWxhY0VKEhk8wzY7F5cA==",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/delayed-stream": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/delayed-stream/-/delayed-stream-1.0.0.tgz",
"integrity": "sha512-ZySD7Nf91aLB0RxL4KGrKHBXl7Eds1DAmEdcoVawXnLD7SDhpNgtuII2aAkg7a7QS41jxPSZ17p4VdGnMHk3MQ==",
"engines": {
"node": ">=0.4.0"
}
},
"node_modules/digest-fetch": {
"version": "1.3.0",
"resolved": "https://registry.npmjs.org/digest-fetch/-/digest-fetch-1.3.0.tgz",
"integrity": "sha512-CGJuv6iKNM7QyZlM2T3sPAdZWd/p9zQiRNS9G+9COUCwzWFTs0Xp8NF5iePx7wtvhDykReiRRrSeNb4oMmB8lA==",
"dependencies": {
"base-64": "^0.1.0",
"md5": "^2.3.0"
}
},
"node_modules/event-target-shim": {
"version": "5.0.1",
"resolved": "https://registry.npmjs.org/event-target-shim/-/event-target-shim-5.0.1.tgz",
"integrity": "sha512-i/2XbnSz/uxRCU6+NdVJgKWDTM427+MqYbkQzD321DuCQJUqOuJKIA0IM2+W2xtYHdKOmZ4dR6fExsd4SXL+WQ==",
"engines": {
"node": ">=6"
}
},
"node_modules/eventemitter3": {
"version": "4.0.7",
"resolved": "https://registry.npmjs.org/eventemitter3/-/eventemitter3-4.0.7.tgz",
"integrity": "sha512-8guHBZCwKnFhYdHr2ysuRWErTwhoN2X8XELRlrRwpmfeY2jjuUN4taQMsULKUVo1K4DvZl+0pgfyoysHxvmvEw=="
},
"node_modules/expr-eval": {
"version": "2.0.2",
"resolved": "https://registry.npmjs.org/expr-eval/-/expr-eval-2.0.2.tgz",
"integrity": "sha512-4EMSHGOPSwAfBiibw3ndnP0AvjDWLsMvGOvWEZ2F96IGk0bIVdjQisOHxReSkE13mHcfbuCiXw+G4y0zv6N8Eg=="
},
"node_modules/flat": {
"version": "5.0.2",
"resolved": "https://registry.npmjs.org/flat/-/flat-5.0.2.tgz",
"integrity": "sha512-b6suED+5/3rTpUBdG1gupIl8MPFCAMA0QXwmljLhvCUKcUvdE4gWky9zpuGCcXHOsz4J9wPGNWq6OKpmIzz3hQ==",
"bin": {
"flat": "cli.js"
}
},
"node_modules/form-data": {
"version": "4.0.0",
"resolved": "https://registry.npmjs.org/form-data/-/form-data-4.0.0.tgz",
"integrity": "sha512-ETEklSGi5t0QMZuiXoA/Q6vcnxcLQP5vdugSpuAyi6SVGi2clPPp+xgEhuMaHC+zGgn31Kd235W35f7Hykkaww==",
"dependencies": {
"asynckit": "^0.4.0",
"combined-stream": "^1.0.8",
"mime-types": "^2.1.12"
},
"engines": {
"node": ">= 6"
}
},
"node_modules/form-data-encoder": {
"version": "1.7.2",
"resolved": "https://registry.npmjs.org/form-data-encoder/-/form-data-encoder-1.7.2.tgz",
"integrity": "sha512-qfqtYan3rxrnCk1VYaA4H+Ms9xdpPqvLZa6xmMgFvhO32x7/3J/ExcTd6qpxM0vH2GdMI+poehyBZvqfMTto8A=="
},
"node_modules/formdata-node": {
"version": "4.4.1",
"resolved": "https://registry.npmjs.org/formdata-node/-/formdata-node-4.4.1.tgz",
"integrity": "sha512-0iirZp3uVDjVGt9p49aTaqjk84TrglENEDuqfdlZQ1roC9CWlPk6Avf8EEnZNcAqPonwkG35x4n3ww/1THYAeQ==",
"dependencies": {
"node-domexception": "1.0.0",
"web-streams-polyfill": "4.0.0-beta.3"
},
"engines": {
"node": ">= 12.20"
}
},
"node_modules/humanize-ms": {
"version": "1.2.1",
"resolved": "https://registry.npmjs.org/humanize-ms/-/humanize-ms-1.2.1.tgz",
"integrity": "sha512-Fl70vYtsAFb/C06PTS9dZBo7ihau+Tu/DNCk/OyHhea07S+aeMWpFFkUaXRa8fI+ScZbEI8dfSxwY7gxZ9SAVQ==",
"dependencies": {
"ms": "^2.0.0"
}
},
"node_modules/is-any-array": {
"version": "2.0.1",
"resolved": "https://registry.npmjs.org/is-any-array/-/is-any-array-2.0.1.tgz",
"integrity": "sha512-UtilS7hLRu++wb/WBAw9bNuP1Eg04Ivn1vERJck8zJthEvXCBEBpGR/33u/xLKWEQf95803oalHrVDptcAvFdQ=="
},
"node_modules/is-buffer": {
"version": "1.1.6",
"resolved": "https://registry.npmjs.org/is-buffer/-/is-buffer-1.1.6.tgz",
"integrity": "sha512-NcdALwpXkTm5Zvvbk7owOUSvVvBKDgKP5/ewfXEznmQFfs4ZRmanOeKBTjRVjka3QFoN6XJ+9F3USqfHqTaU5w=="
},
"node_modules/js-tiktoken": {
"version": "1.0.7",
"resolved": "https://registry.npmjs.org/js-tiktoken/-/js-tiktoken-1.0.7.tgz",
"integrity": "sha512-biba8u/clw7iesNEWLOLwrNGoBP2lA+hTaBLs/D45pJdUPFXyxD6nhcDVtADChghv4GgyAiMKYMiRx7x6h7Biw==",
"dependencies": {
"base64-js": "^1.5.1"
}
},
"node_modules/js-yaml": {
"version": "4.1.0",
"resolved": "https://registry.npmjs.org/js-yaml/-/js-yaml-4.1.0.tgz",
"integrity": "sha512-wpxZs9NoxZaJESJGIZTyDEaYpl0FKSA+FB9aJiyemKhMwkxQg63h4T1KJgUGHpTqPDNRcmmYLugrRjJlBtWvRA==",
"dependencies": {
"argparse": "^2.0.1"
},
"bin": {
"js-yaml": "bin/js-yaml.js"
}
},
"node_modules/jsonpointer": {
"version": "5.0.1",
"resolved": "https://registry.npmjs.org/jsonpointer/-/jsonpointer-5.0.1.tgz",
"integrity": "sha512-p/nXbhSEcu3pZRdkW1OfJhpsVtW1gd4Wa1fnQc9YLiTfAjn0312eMKimbdIQzuZl9aa9xUGaRlP9T/CJE/ditQ==",
"engines": {
"node": ">=0.10.0"
}
},
"node_modules/langchain": {
"version": "0.0.165",
"resolved": "https://registry.npmjs.org/langchain/-/langchain-0.0.165.tgz",
"integrity": "sha512-CpbNpjwaE+9lzjdw+pZz0VgnRrFivEgr7CVp9dDaAb5JpaJAA4V2v6uQ9ZPN+TSqupTQ79HFn2sfyZVEl2EG7Q==",
"dependencies": {
"@anthropic-ai/sdk": "^0.6.2",
"ansi-styles": "^5.0.0",
"binary-extensions": "^2.2.0",
"camelcase": "6",
"decamelize": "^1.2.0",
"expr-eval": "^2.0.2",
"flat": "^5.0.2",
"js-tiktoken": "^1.0.7",
"js-yaml": "^4.1.0",
"jsonpointer": "^5.0.1",
"langchainhub": "~0.0.6",
"langsmith": "~0.0.31",
"ml-distance": "^4.0.0",
"object-hash": "^3.0.0",
"openai": "~4.4.0",
"openapi-types": "^12.1.3",
"p-queue": "^6.6.2",
"p-retry": "4",
"uuid": "^9.0.0",
"yaml": "^2.2.1",
"zod": "^3.22.3",
"zod-to-json-schema": "^3.20.4"
},
"engines": {
"node": ">=18"
},
"peerDependencies": {
"@aws-crypto/sha256-js": "^5.0.0",
"@aws-sdk/client-bedrock-runtime": "^3.422.0",
"@aws-sdk/client-dynamodb": "^3.310.0",
"@aws-sdk/client-kendra": "^3.352.0",
"@aws-sdk/client-lambda": "^3.310.0",
"@aws-sdk/client-s3": "^3.310.0",
"@aws-sdk/client-sagemaker-runtime": "^3.310.0",
"@aws-sdk/client-sfn": "^3.310.0",
"@aws-sdk/credential-provider-node": "^3.388.0",
"@azure/storage-blob": "^12.15.0",
"@clickhouse/client": "^0.0.14",
"@cloudflare/ai": "^1.0.12",
"@elastic/elasticsearch": "^8.4.0",
"@getmetal/metal-sdk": "*",
"@getzep/zep-js": "^0.7.0",
"@gomomento/sdk": "^1.23.0",
"@google-ai/generativelanguage": "^0.2.1",
"@google-cloud/storage": "^6.10.1",
"@huggingface/inference": "^1.5.1",
"@mozilla/readability": "*",
"@notionhq/client": "^2.2.10",
"@opensearch-project/opensearch": "*",
"@pinecone-database/pinecone": "^1.1.0",
"@planetscale/database": "^1.8.0",
"@qdrant/js-client-rest": "^1.2.0",
"@raycast/api": "^1.55.2",
"@smithy/eventstream-codec": "^2.0.5",
"@smithy/protocol-http": "^3.0.6",
"@smithy/signature-v4": "^2.0.10",
"@smithy/util-utf8": "^2.0.0",
"@supabase/postgrest-js": "^1.1.1",
"@supabase/supabase-js": "^2.10.0",
"@tensorflow-models/universal-sentence-encoder": "*",
"@tensorflow/tfjs-converter": "*",
"@tensorflow/tfjs-core": "*",
"@upstash/redis": "^1.20.6",
"@vercel/postgres": "^0.5.0",
"@writerai/writer-sdk": "^0.40.2",
"@xata.io/client": "^0.25.1",
"@xenova/transformers": "^2.5.4",
"@zilliz/milvus2-sdk-node": ">=2.2.7",
"apify-client": "^2.7.1",
"axios": "*",
"cassandra-driver": "^4.6.4",
"cheerio": "^1.0.0-rc.12",
"chromadb": "*",
"cohere-ai": ">=6.0.0",
"d3-dsv": "^2.0.0",
"epub2": "^3.0.1",
"faiss-node": "^0.3.0",
"fast-xml-parser": "^4.2.7",
"firebase-admin": "^11.9.0",
"google-auth-library": "^8.9.0",
"googleapis": "^126.0.1",
"hnswlib-node": "^1.4.2",
"html-to-text": "^9.0.5",
"ignore": "^5.2.0",
"ioredis": "^5.3.2",
"jsdom": "*",
"llmonitor": "*",
"lodash": "^4.17.21",
"mammoth": "*",
"mongodb": "^5.2.0",
"mysql2": "^3.3.3",
"neo4j-driver": "*",
"node-llama-cpp": "*",
"notion-to-md": "^3.1.0",
"pdf-parse": "1.1.1",
"peggy": "^3.0.2",
"pg": "^8.11.0",
"pg-copy-streams": "^6.0.5",
"pickleparser": "^0.1.0",
"playwright": "^1.32.1",
"portkey-ai": "^0.1.11",
"puppeteer": "^19.7.2",
"redis": "^4.6.4",
"replicate": "^0.18.0",
"sonix-speech-recognition": "^2.1.1",
"srt-parser-2": "^1.2.2",
"typeorm": "^0.3.12",
"typesense": "^1.5.3",
"usearch": "^1.1.1",
"vectordb": "^0.1.4",
"voy-search": "0.6.2",
"weaviate-ts-client": "^1.4.0",
"web-auth-library": "^1.0.3",
"youtube-transcript": "^1.0.6",
"youtubei.js": "^5.8.0"
},
"peerDependenciesMeta": {
"@aws-crypto/sha256-js": {
"optional": true
},
"@aws-sdk/client-bedrock-runtime": {
"optional": true
},
"@aws-sdk/client-dynamodb": {
"optional": true
},
"@aws-sdk/client-kendra": {
"optional": true
},
"@aws-sdk/client-lambda": {
"optional": true
},
"@aws-sdk/client-s3": {
"optional": true
},
"@aws-sdk/client-sagemaker-runtime": {
"optional": true
},
"@aws-sdk/client-sfn": {
"optional": true
},
"@aws-sdk/credential-provider-node": {
"optional": true
},
"@azure/storage-blob": {
"optional": true
},
"@clickhouse/client": {
"optional": true
},
"@cloudflare/ai": {
"optional": true
},
"@elastic/elasticsearch": {
"optional": true
},
"@getmetal/metal-sdk": {
"optional": true
},
"@getzep/zep-js": {
"optional": true
},
"@gomomento/sdk": {
"optional": true
},
"@google-ai/generativelanguage": {
"optional": true
},
"@google-cloud/storage": {
"optional": true
},
"@huggingface/inference": {
"optional": true
},
"@mozilla/readability": {
"optional": true
},
"@notionhq/client": {
"optional": true
},
"@opensearch-project/opensearch": {
"optional": true
},
"@pinecone-database/pinecone": {
"optional": true
},
"@planetscale/database": {
"optional": true
},
"@qdrant/js-client-rest": {
"optional": true
},
"@raycast/api": {
"optional": true
},
"@smithy/eventstream-codec": {
"optional": true
},
"@smithy/protocol-http": {
"optional": true
},
"@smithy/signature-v4": {
"optional": true
},
"@smithy/util-utf8": {
"optional": true
},
"@supabase/postgrest-js": {
"optional": true
},
"@supabase/supabase-js": {
"optional": true
},
"@tensorflow-models/universal-sentence-encoder": {
"optional": true
},
"@tensorflow/tfjs-converter": {
"optional": true
},
"@tensorflow/tfjs-core": {
"optional": true
},
"@upstash/redis": {
"optional": true
},
"@vercel/postgres": {
"optional": true
},
"@writerai/writer-sdk": {
"optional": true
},
"@xata.io/client": {
"optional": true
},
"@xenova/transformers": {
"optional": true
},
"@zilliz/milvus2-sdk-node": {
"optional": true
},
"apify-client": {
"optional": true
},
"axios": {
"optional": true
},
"cassandra-driver": {
"optional": true
},
"cheerio": {
"optional": true
},
"chromadb": {
"optional": true
},
"cohere-ai": {
"optional": true
},
"d3-dsv": {
"optional": true
},
"epub2": {
"optional": true
},
"faiss-node": {
"optional": true
},
"fast-xml-parser": {
"optional": true
},
"firebase-admin": {
"optional": true
},
"google-auth-library": {
"optional": true
},
"googleapis": {
"optional": true
},
"hnswlib-node": {
"optional": true
},
"html-to-text": {
"optional": true
},
"ignore": {
"optional": true
},
"ioredis": {
"optional": true
},
"jsdom": {
"optional": true
},
"llmonitor": {
"optional": true
},
"lodash": {
"optional": true
},
"mammoth": {
"optional": true
},
"mongodb": {
"optional": true
},
"mysql2": {
"optional": true
},
"neo4j-driver": {
"optional": true
},
"node-llama-cpp": {
"optional": true
},
"notion-to-md": {
"optional": true
},
"pdf-parse": {
"optional": true
},
"peggy": {
"optional": true
},
"pg": {
"optional": true
},
"pg-copy-streams": {
"optional": true
},
"pickleparser": {
"optional": true
},
"playwright": {
"optional": true
},
"portkey-ai": {
"optional": true
},
"puppeteer": {
"optional": true
},
"redis": {
"optional": true
},
"replicate": {
"optional": true
},
"sonix-speech-recognition": {
"optional": true
},
"srt-parser-2": {
"optional": true
},
"typeorm": {
"optional": true
},
"typesense": {
"optional": true
},
"usearch": {
"optional": true
},
"vectordb": {
"optional": true
},
"voy-search": {
"optional": true
},
"weaviate-ts-client": {
"optional": true
},
"web-auth-library": {
"optional": true
},
"youtube-transcript": {
"optional": true
},
"youtubei.js": {
"optional": true
}
}
},
"node_modules/langchainhub": {
"version": "0.0.6",
"resolved": "https://registry.npmjs.org/langchainhub/-/langchainhub-0.0.6.tgz",
"integrity": "sha512-SW6105T+YP1cTe0yMf//7kyshCgvCTyFBMTgH2H3s9rTAR4e+78DA/BBrUL/Mt4Q5eMWui7iGuAYb3pgGsdQ9w=="
},
"node_modules/langsmith": {
"version": "0.0.42",
"resolved": "https://registry.npmjs.org/langsmith/-/langsmith-0.0.42.tgz",
"integrity": "sha512-sFuN+e7E+pPBIRaRgFqZh/BRBWNHTZNAwi6uj4kydQawooCZYoJmM5snOkiQrhVSvAhgu6xFhLvmfvkPcKzD7w==",
"dependencies": {
"@types/uuid": "^9.0.1",
"commander": "^10.0.1",
"p-queue": "^6.6.2",
"p-retry": "4",
"uuid": "^9.0.0"
},
"bin": {
"langsmith": "dist/cli/main.cjs"
}
},
"node_modules/md5": {
"version": "2.3.0",
"resolved": "https://registry.npmjs.org/md5/-/md5-2.3.0.tgz",
"integrity": "sha512-T1GITYmFaKuO91vxyoQMFETst+O71VUPEU3ze5GNzDm0OWdP8v1ziTaAEPUr/3kLsY3Sftgz242A1SetQiDL7g==",
"dependencies": {
"charenc": "0.0.2",
"crypt": "0.0.2",
"is-buffer": "~1.1.6"
}
},
"node_modules/mime-db": {
"version": "1.52.0",
"resolved": "https://registry.npmjs.org/mime-db/-/mime-db-1.52.0.tgz",
"integrity": "sha512-sPU4uV7dYlvtWJxwwxHD0PuihVNiE7TyAbQ5SWxDCB9mUYvOgroQOwYQQOKPJ8CIbE+1ETVlOoK1UC2nU3gYvg==",
"engines": {
"node": ">= 0.6"
}
},
"node_modules/mime-types": {
"version": "2.1.35",
"resolved": "https://registry.npmjs.org/mime-types/-/mime-types-2.1.35.tgz",
"integrity": "sha512-ZDY+bPm5zTTF+YpCrAU9nK0UgICYPT0QtT1NZWFv4s++TNkcgVaT0g6+4R2uI4MjQjzysHB1zxuWL50hzaeXiw==",
"dependencies": {
"mime-db": "1.52.0"
},
"engines": {
"node": ">= 0.6"
}
},
"node_modules/ml-array-mean": {
"version": "1.1.6",
"resolved": "https://registry.npmjs.org/ml-array-mean/-/ml-array-mean-1.1.6.tgz",
"integrity": "sha512-MIdf7Zc8HznwIisyiJGRH9tRigg3Yf4FldW8DxKxpCCv/g5CafTw0RRu51nojVEOXuCQC7DRVVu5c7XXO/5joQ==",
"dependencies": {
"ml-array-sum": "^1.1.6"
}
},
"node_modules/ml-array-sum": {
"version": "1.1.6",
"resolved": "https://registry.npmjs.org/ml-array-sum/-/ml-array-sum-1.1.6.tgz",
"integrity": "sha512-29mAh2GwH7ZmiRnup4UyibQZB9+ZLyMShvt4cH4eTK+cL2oEMIZFnSyB3SS8MlsTh6q/w/yh48KmqLxmovN4Dw==",
"dependencies": {
"is-any-array": "^2.0.0"
}
},
"node_modules/ml-distance": {
"version": "4.0.1",
"resolved": "https://registry.npmjs.org/ml-distance/-/ml-distance-4.0.1.tgz",
"integrity": "sha512-feZ5ziXs01zhyFUUUeZV5hwc0f5JW0Sh0ckU1koZe/wdVkJdGxcP06KNQuF0WBTj8FttQUzcvQcpcrOp/XrlEw==",
"dependencies": {
"ml-array-mean": "^1.1.6",
"ml-distance-euclidean": "^2.0.0",
"ml-tree-similarity": "^1.0.0"
}
},
"node_modules/ml-distance-euclidean": {
"version": "2.0.0",
"resolved": "https://registry.npmjs.org/ml-distance-euclidean/-/ml-distance-euclidean-2.0.0.tgz",
"integrity": "sha512-yC9/2o8QF0A3m/0IXqCTXCzz2pNEzvmcE/9HFKOZGnTjatvBbsn4lWYJkxENkA4Ug2fnYl7PXQxnPi21sgMy/Q=="
},
"node_modules/ml-tree-similarity": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/ml-tree-similarity/-/ml-tree-similarity-1.0.0.tgz",
"integrity": "sha512-XJUyYqjSuUQkNQHMscr6tcjldsOoAekxADTplt40QKfwW6nd++1wHWV9AArl0Zvw/TIHgNaZZNvr8QGvE8wLRg==",
"dependencies": {
"binary-search": "^1.3.5",
"num-sort": "^2.0.0"
}
},
"node_modules/ms": {
"version": "2.1.3",
"resolved": "https://registry.npmjs.org/ms/-/ms-2.1.3.tgz",
"integrity": "sha512-6FlzubTLZG3J2a/NVCAleEhjzq5oxgHyaCU9yYXvcLsvoVaHJq/s5xXI6/XXP6tz7R9xAOtHnSO/tXtF3WRTlA=="
},
"node_modules/node-domexception": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/node-domexception/-/node-domexception-1.0.0.tgz",
"integrity": "sha512-/jKZoMpw0F8GRwl4/eLROPA3cfcXtLApP0QzLmUT/HuPCZWyB7IY9ZrMeKw2O/nFIqPQB3PVM9aYm0F312AXDQ==",
"funding": [
{
"type": "github",
"url": "https://github.com/sponsors/jimmywarting"
},
{
"type": "github",
"url": "https://paypal.me/jimmywarting"
}
],
"engines": {
"node": ">=10.5.0"
}
},
"node_modules/node-fetch": {
"version": "2.7.0",
"resolved": "https://registry.npmjs.org/node-fetch/-/node-fetch-2.7.0.tgz",
"integrity": "sha512-c4FRfUm/dbcWZ7U+1Wq0AwCyFL+3nt2bEw05wfxSz+DWpWsitgmSgYmy2dQdWyKC1694ELPqMs/YzUSNozLt8A==",
"dependencies": {
"whatwg-url": "^5.0.0"
},
"engines": {
"node": "4.x || >=6.0.0"
},
"peerDependencies": {
"encoding": "^0.1.0"
},
"peerDependenciesMeta": {
"encoding": {
"optional": true
}
}
},
"node_modules/num-sort": {
"version": "2.1.0",
"resolved": "https://registry.npmjs.org/num-sort/-/num-sort-2.1.0.tgz",
"integrity": "sha512-1MQz1Ed8z2yckoBeSfkQHHO9K1yDRxxtotKSJ9yvcTUUxSvfvzEq5GwBrjjHEpMlq/k5gvXdmJ1SbYxWtpNoVg==",
"engines": {
"node": ">=8"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/object-hash": {
"version": "3.0.0",
"resolved": "https://registry.npmjs.org/object-hash/-/object-hash-3.0.0.tgz",
"integrity": "sha512-RSn9F68PjH9HqtltsSnqYC1XXoWe9Bju5+213R98cNGttag9q9yAOTzdbsqvIa7aNm5WffBZFpWYr2aWrklWAw==",
"engines": {
"node": ">= 6"
}
},
"node_modules/openai": {
"version": "4.4.0",
"resolved": "https://registry.npmjs.org/openai/-/openai-4.4.0.tgz",
"integrity": "sha512-JN0t628Kh95T0IrXl0HdBqnlJg+4Vq0Bnh55tio+dfCnyzHvMLiWyCM9m726MAJD2YkDU4/8RQB6rNbEq9ct2w==",
"dependencies": {
"@types/node": "^18.11.18",
"@types/node-fetch": "^2.6.4",
"abort-controller": "^3.0.0",
"agentkeepalive": "^4.2.1",
"digest-fetch": "^1.3.0",
"form-data-encoder": "1.7.2",
"formdata-node": "^4.3.2",
"node-fetch": "^2.6.7"
},
"bin": {
"openai": "bin/cli"
}
},
"node_modules/openapi-types": {
"version": "12.1.3",
"resolved": "https://registry.npmjs.org/openapi-types/-/openapi-types-12.1.3.tgz",
"integrity": "sha512-N4YtSYJqghVu4iek2ZUvcN/0aqH1kRDuNqzcycDxhOUpg7GdvLa2F3DgS6yBNhInhv2r/6I0Flkn7CqL8+nIcw=="
},
"node_modules/p-finally": {
"version": "1.0.0",
"resolved": "https://registry.npmjs.org/p-finally/-/p-finally-1.0.0.tgz",
"integrity": "sha512-LICb2p9CB7FS+0eR1oqWnHhp0FljGLZCWBE9aix0Uye9W8LTQPwMTYVGWQWIw9RdQiDg4+epXQODwIYJtSJaow==",
"engines": {
"node": ">=4"
}
},
"node_modules/p-queue": {
"version": "6.6.2",
"resolved": "https://registry.npmjs.org/p-queue/-/p-queue-6.6.2.tgz",
"integrity": "sha512-RwFpb72c/BhQLEXIZ5K2e+AhgNVmIejGlTgiB9MzZ0e93GRvqZ7uSi0dvRF7/XIXDeNkra2fNHBxTyPDGySpjQ==",
"dependencies": {
"eventemitter3": "^4.0.4",
"p-timeout": "^3.2.0"
},
"engines": {
"node": ">=8"
},
"funding": {
"url": "https://github.com/sponsors/sindresorhus"
}
},
"node_modules/p-retry": {
"version": "4.6.2",
"resolved": "https://registry.npmjs.org/p-retry/-/p-retry-4.6.2.tgz",
"integrity": "sha512-312Id396EbJdvRONlngUx0NydfrIQ5lsYu0znKVUzVvArzEIt08V1qhtyESbGVd1FGX7UKtiFp5uwKZdM8wIuQ==",
"dependencies": {
"@types/retry": "0.12.0",
"retry": "^0.13.1"
},
"engines": {
"node": ">=8"
}
},
"node_modules/p-timeout": {
"version": "3.2.0",
"resolved": "https://registry.npmjs.org/p-timeout/-/p-timeout-3.2.0.tgz",
"integrity": "sha512-rhIwUycgwwKcP9yTOOFK/AKsAopjjCakVqLHePO3CC6Mir1Z99xT+R63jZxAT5lFZLa2inS5h+ZS2GvR99/FBg==",
"dependencies": {
"p-finally": "^1.0.0"
},
"engines": {
"node": ">=8"
}
},
"node_modules/retry": {
"version": "0.13.1",
"resolved": "https://registry.npmjs.org/retry/-/retry-0.13.1.tgz",
"integrity": "sha512-XQBQ3I8W1Cge0Seh+6gjj03LbmRFWuoszgK9ooCpwYIrhhoO80pfq4cUkU5DkknwfOfFteRwlZ56PYOGYyFWdg==",
"engines": {
"node": ">= 4"
}
},
"node_modules/tr46": {
"version": "0.0.3",
"resolved": "https://registry.npmjs.org/tr46/-/tr46-0.0.3.tgz",
"integrity": "sha512-N3WMsuqV66lT30CrXNbEjx4GEwlow3v6rr4mCcv6prnfwhS01rkgyFdjPNBYd9br7LpXV1+Emh01fHnq2Gdgrw=="
},
"node_modules/typescript": {
"version": "5.2.2",
"resolved": "https://registry.npmjs.org/typescript/-/typescript-5.2.2.tgz",
"integrity": "sha512-mI4WrpHsbCIcwT9cF4FZvr80QUeKvsUsUvKDoR+X/7XHQH98xYD8YHZg7ANtz2GtZt/CBq2QJ0thkGJMHfqc1w==",
"dev": true,
"bin": {
"tsc": "bin/tsc",
"tsserver": "bin/tsserver"
},
"engines": {
"node": ">=14.17"
}
},
"node_modules/uuid": {
"version": "9.0.1",
"resolved": "https://registry.npmjs.org/uuid/-/uuid-9.0.1.tgz",
"integrity": "sha512-b+1eJOlsR9K8HJpow9Ok3fiWOWSIcIzXodvv0rQjVoOVNpWMpxf1wZNpt4y9h10odCNrqnYp1OBzRktckBe3sA==",
"funding": [
"https://github.com/sponsors/broofa",
"https://github.com/sponsors/ctavan"
],
"bin": {
"uuid": "dist/bin/uuid"
}
},
"node_modules/web-streams-polyfill": {
"version": "4.0.0-beta.3",
"resolved": "https://registry.npmjs.org/web-streams-polyfill/-/web-streams-polyfill-4.0.0-beta.3.tgz",
"integrity": "sha512-QW95TCTaHmsYfHDybGMwO5IJIM93I/6vTRk+daHTWFPhwh+C8Cg7j7XyKrwrj8Ib6vYXe0ocYNrmzY4xAAN6ug==",
"engines": {
"node": ">= 14"
}
},
"node_modules/webidl-conversions": {
"version": "3.0.1",
"resolved": "https://registry.npmjs.org/webidl-conversions/-/webidl-conversions-3.0.1.tgz",
"integrity": "sha512-2JAn3z8AR6rjK8Sm8orRC0h/bcl/DqL7tRPdGZ4I1CjdF+EaMLmYxBHyXuKL849eucPFhvBoxMsflfOb8kxaeQ=="
},
"node_modules/whatwg-url": {
"version": "5.0.0",
"resolved": "https://registry.npmjs.org/whatwg-url/-/whatwg-url-5.0.0.tgz",
"integrity": "sha512-saE57nupxk6v3HY35+jzBwYa0rKSy0XR8JSxZPwgLr7ys0IBzhGviA1/TUGJLmSVqs8pb9AnvICXEuOHLprYTw==",
"dependencies": {
"tr46": "~0.0.3",
"webidl-conversions": "^3.0.0"
}
},
"node_modules/yaml": {
"version": "2.3.2",
"resolved": "https://registry.npmjs.org/yaml/-/yaml-2.3.2.tgz",
"integrity": "sha512-N/lyzTPaJasoDmfV7YTrYCI0G/3ivm/9wdG0aHuheKowWQwGTsK0Eoiw6utmzAnI6pkJa0DUVygvp3spqqEKXg==",
"engines": {
"node": ">= 14"
}
},
"node_modules/zod": {
"version": "3.22.4",
"resolved": "https://registry.npmjs.org/zod/-/zod-3.22.4.tgz",
"integrity": "sha512-iC+8Io04lddc+mVqQ9AZ7OQ2MrUKGN+oIQyq1vemgt46jwCwLfhq7/pwnBnNXXXZb8VTVLKwp9EDkx+ryxIWmg==",
"funding": {
"url": "https://github.com/sponsors/colinhacks"
}
},
"node_modules/zod-to-json-schema": {
"version": "3.21.4",
"resolved": "https://registry.npmjs.org/zod-to-json-schema/-/zod-to-json-schema-3.21.4.tgz",
"integrity": "sha512-fjUZh4nQ1s6HMccgIeE0VP4QG/YRGPmyjO9sAh890aQKPEk3nqbfUXhMFaC+Dr5KvYBm8BCyvfpZf2jY9aGSsw==",
"peerDependencies": {
"zod": "^3.21.4"
}
}
}
}

View File

@@ -1,13 +0,0 @@
{
"scripts": {
"start": "tsx main.ts"
},
"devDependencies": {
"tsx": "^4.6.2",
"typescript": "^5.3.3"
},
"dependencies": {
"langchain": "^0.0.165",
"readline": "^1.3.0"
}
}

View File

@@ -1,5 +0,0 @@
FROM llama3.2
PARAMETER temperature 1
SYSTEM """
You are Mario from super mario bros, acting as an assistant.
"""

Binary file not shown.

Before

Width:  |  Height:  |  Size: 446 KiB

View File

@@ -1,43 +0,0 @@
<img src="logo.png" alt="image of Italian plumber" height="200"/>
# Example character: Mario
This example shows how to create a basic character using Llama 3.2 as the base model.
To run this example:
1. Download the Modelfile
2. `ollama pull llama3.2` to get the base model used in the model file.
3. `ollama create NAME -f ./Modelfile`
4. `ollama run NAME`
Ask it some questions like "Who are you?" or "Is Peach in trouble again?"
## Editing this file
What the model file looks like:
```
FROM llama3.2
PARAMETER temperature 1
SYSTEM """
You are Mario from Super Mario Bros, acting as an assistant.
"""
```
What if you want to change its behaviour?
- Try changing the prompt
- Try changing the parameters [Docs](https://github.com/ollama/ollama/blob/main/docs/modelfile.md)
- Try changing the model (e.g. An uncensored model by `FROM wizard-vicuna` this is the wizard-vicuna uncensored model )
Once the changes are made,
1. `ollama create NAME -f ./Modelfile`
2. `ollama run NAME`
3. Iterate until you are happy with the results.
Notes:
- This example is for research purposes only. There is no affiliation with any entity.
- When using an uncensored model, please be aware that it may generate offensive content.

View File

@@ -1,20 +0,0 @@
FROM mistral
SYSTEM """
You are an experienced Devops engineer focused on docker. When given specifications for a particular need or application you know the best way to host that within a docker container. For instance if someone tells you they want an nginx server to host files located at /web you will answer as follows
---start
FROM nginx:alpine
COPY /myweb /usr/share/nginx/html
EXPOSE 80
---end
Notice that the answer you should give is just the contents of the dockerfile with no explanation and there are three dashes and the word start at the beginning and 3 dashes and the word end. The full output can be piped into a file and run as is. Here is another example. The user will ask to launch a Postgres server with a password of abc123. And the response should be
---start
FROM postgres:latest
ENV POSTGRES_PASSWORD=abc123
EXPOSE 5432
---end
Again it's just the contents of the dockerfile and nothing else.
"""

View File

@@ -1,31 +0,0 @@
# DockerIt
DockerIt is a tool to help you build and run your application in a Docker container. It consists of a model that defines the system prompt and model weights to use, along with a python script to then build the container and run the image automatically.
## Running the Example
1. Ensure you have the `mattw/dockerit` model installed:
```bash
ollama pull mattw/dockerit
```
2. Make sure Docker is running on your machine.
3. Install the Python Requirements.
```bash
pip install -r requirements.txt
```
4. Run the example:
```bash
python dockerit.py "simple postgres server with admin password set to 123"
```
5. Enter the name you would like to use for your container image.
## Caveats
This is a simple example. It's assuming the Dockerfile content generated is going to work. In many cases, even with simple web servers, it fails when trying to copy files that don't exist. It's simply an example of what you could possibly do.

View File

@@ -1,17 +0,0 @@
import requests, json, docker, io, sys
inputDescription = " ".join(sys.argv[1:])
imageName = input("Enter the name of the image: ")
client = docker.from_env()
s = requests.Session()
output=""
with s.post('http://localhost:11434/api/generate', json={'model': 'mattw/dockerit', 'prompt': inputDescription}, stream=True) as r:
for line in r.iter_lines():
if line:
j = json.loads(line)
if "response" in j:
output = output +j["response"]
output = output[output.find("---start")+9:output.find("---end")-1]
f = io.BytesIO(bytes(output, 'utf-8'))
client.images.build(fileobj=f, tag=imageName)
container = client.containers.run(imageName, detach=True)
print("Container named", container.name, " started with id: ",container.id)

View File

@@ -1 +0,0 @@
docker

Some files were not shown because too many files have changed in this diff Show More