jmorganca
201a987ff9
some more menu options...
2024-04-28 12:40:52 -04:00
jmorganca
2d8125042a
Touch ID for cli install; server restarts
2024-04-27 22:42:38 -04:00
jmorganca
776e7bb5e4
app: fix status item icons
2024-04-27 15:57:57 -04:00
jmorganca
b8d7ca1a7b
Native implementation of macOS app
2024-04-27 14:20:10 -04:00
Blake Mizerany
2bed62926e
types/model: remove Digest (for now) ( #3970 )
...
The Digest type needs more thought and is not necessary at the moment.
2024-04-26 21:14:28 -07:00
Jeffrey Morgan
aad8d128a0
also look at cwd as a root for windows runners ( #3959 )
2024-04-26 19:14:08 -04:00
Daniel Hiltgen
ec1acbb867
Merge pull request #3968 from dhiltgen/win_generate
...
Fine grain control over windows generate steps
2024-04-26 16:03:38 -07:00
Daniel Hiltgen
e4859c4563
Fine grain control over windows generate steps
...
This will speed up CI which already tries to only build static for unit tests
2024-04-26 15:49:46 -07:00
Nataly Merezhuk
8e30eb26bd
Updates the setup command to use llama3. ( #3962 )
2024-04-26 18:41:01 -04:00
Daniel Hiltgen
0b5c589ca2
Merge pull request #3966 from dhiltgen/bump
...
Fix target in gen_windows.ps1
2024-04-26 15:36:53 -07:00
Michael Yang
65fadddc85
Merge pull request #3964 from ollama/mxyng/weights
...
fix gemma, command-r layer weights
2024-04-26 15:23:33 -07:00
Daniel Hiltgen
ed5fb088c4
Fix target in gen_windows.ps1
2024-04-26 15:10:42 -07:00
Michael Yang
f81f308118
fix gemma, command-r layer weights
2024-04-26 15:00:55 -07:00
Blake Mizerany
b1390a7b37
types/model: export ParseNameBare and Merge ( #3957 )
...
These are useful outside this package.
2024-04-26 14:58:07 -07:00
Michael Yang
11d83386a5
Merge pull request #3951 from ollama/mxyng/zip
...
check file type before zip
2024-04-26 14:51:23 -07:00
Jeffrey Morgan
bb31def011
return code `499` when user cancels request while a model is loading ( #3955 )
2024-04-26 17:38:29 -04:00
Michael Yang
41e03ede95
check file type before zip
2024-04-26 14:18:07 -07:00
Michael Yang
7fea1ecdf6
Merge pull request #3958 from ollama/mxyng/fix-workflow
...
use merge base for diff-tree
2024-04-26 14:17:56 -07:00
Blake Mizerany
054894271d
.github/workflows/test.yaml: add in-flight cancellations on new push ( #3956 )
...
Also, remove a superfluous 'go get'
2024-04-26 13:54:24 -07:00
Michael Yang
6fef042f0b
use merge base for diff-tree
2024-04-26 13:54:15 -07:00
Daniel Hiltgen
5c0c2d1d09
Merge pull request #3954 from dhiltgen/ci_fixes
...
Put back non-avx CPU build for windows
2024-04-26 13:09:03 -07:00
Blake Mizerany
37f9c8ad99
types/model: overhaul Name and Digest types ( #3924 )
2024-04-26 13:08:32 -07:00
Quinten van Buul
2a80f55e2a
Update windows.md ( #3855 )
...
Fixed a typo
2024-04-26 16:04:15 -04:00
Daniel Hiltgen
421c878a2d
Put back non-avx CPU build for windows
2024-04-26 12:44:07 -07:00
Daniel Hiltgen
36666c2142
Merge pull request #3925 from dhiltgen/bump
...
Bump llama.cpp to b2737
2024-04-26 10:09:38 -07:00
Daniel Hiltgen
85801317d1
Fix clip log import
2024-04-26 09:43:46 -07:00
Daniel Hiltgen
2ed0d65948
Bump llama.cpp to b2737
2024-04-26 09:43:28 -07:00
Daniel Hiltgen
d459dc4ad1
Merge pull request #3950 from dhiltgen/windows_packaging
...
Fix exe name for zip packaging on windows
2024-04-26 09:27:37 -07:00
Daniel Hiltgen
40bc4622ef
Fix exe name for zip packaging on windows
...
The zip file encodes the OS and architecture, so keep the short exe name
2024-04-26 09:18:05 -07:00
Daniel Hiltgen
c0f818a07a
Merge pull request #3948 from dhiltgen/win_generate
...
Refactor windows generate for more modular usage
2024-04-26 09:17:20 -07:00
Daniel Hiltgen
8671fdeda6
Refactor windows generate for more modular usage
2024-04-26 08:35:50 -07:00
Daniel Hiltgen
2619850fb4
Merge pull request #3933 from dhiltgen/ci_fixes
...
Move cuda/rocm dependency gathering into generate script
2024-04-26 07:01:24 -07:00
Daniel Hiltgen
8feb97dc0d
Move cuda/rocm dependency gathering into generate script
...
This will make it simpler for CI to accumulate artifacts from prior steps
2024-04-25 22:38:44 -07:00
Daniel Hiltgen
4e1ff6dcbb
Merge pull request #3926 from dhiltgen/ci_fixes
...
Fix release CI
2024-04-25 17:42:31 -07:00
Daniel Hiltgen
8589d752ac
Fix release CI
...
download-artifact path was being used incorrectly. It is where to
extract the zip not the files in the zip to extract. Default is
workspace dir which is what we want, so omit it
2024-04-25 17:27:11 -07:00
Michael Yang
de4ded68b0
Merge pull request #3923 from ollama/mxyng/mem
...
only count output tensors
2024-04-25 16:34:17 -07:00
Daniel Hiltgen
9b5a3c5991
Merge pull request #3914 from dhiltgen/mac_perf
...
Improve mac parallel performance
2024-04-25 16:28:31 -07:00
Jeffrey Morgan
00b0699c75
Reload model if `num_gpu` changes ( #3920 )
...
* reload model if `num_gpu` changes
* dont reload on -1
* fix tests
2024-04-25 19:02:40 -04:00
Jeffrey Morgan
993cf8bf55
llm: limit generation to 10x context size to avoid run on generations ( #3918 )
...
* llm: limit generation to 10x context size to avoid run on generations
* add comment
* simplify condition statement
2024-04-25 19:02:30 -04:00
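The 10x cap described in the commit above can be sketched as a small clamp. This is a minimal illustration, not the server's code: `clampNumPredict` is a hypothetical name, and treating a negative request as "no explicit limit" is an assumption of the sketch.

```go
package main

import "fmt"

// clampNumPredict caps the requested number of generated tokens at 10x the
// context size. A negative request is treated as "no explicit limit" and is
// capped as well; that interpretation is an assumption of this sketch.
func clampNumPredict(numPredict, numCtx int) int {
	limit := 10 * numCtx
	if numPredict < 0 || numPredict > limit {
		return limit
	}
	return numPredict
}

func main() {
	fmt.Println(clampNumPredict(-1, 2048))  // 20480
	fmt.Println(clampNumPredict(128, 2048)) // 128
}
```

Requests below the cap pass through unchanged, so only runaway generations are affected.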
Michael Yang
7bb7cb8a60
only count output tensors
2024-04-25 15:24:08 -07:00
Daniel Hiltgen
b123be5b71
Adjust context size for parallelism
2024-04-25 13:58:54 -07:00
jmorganca
ddf5c09a9b
use matrix multiplication kernels in more cases
2024-04-25 13:58:54 -07:00
Roy Yang
5f73c08729
Remove trailing spaces ( #3889 )
2024-04-25 14:32:26 -04:00
Daniel Hiltgen
f503a848c2
Merge pull request #3895 from brycereitano/shiftloading
...
Move ggml loading to when attempting to fit
2024-04-25 09:24:08 -07:00
Bryce Reitano
36a6daccab
Restructure loading conditional chain
2024-04-24 17:37:03 -06:00
Bryce Reitano
ceb0e26e5e
Provide variable ggml for TestLoad
2024-04-24 17:19:55 -06:00
Bryce Reitano
284e02bed0
Move ggml loading to when we attempt fitting
2024-04-24 17:17:24 -06:00
Michael Yang
3450a57d4a
Merge pull request #3713 from ollama/mxyng/modelname
...
update copy handler to use model.Name
2024-04-24 16:00:32 -07:00
Michael Yang
592dae31c8
update copy to use model.Name
2024-04-24 15:54:54 -07:00
Michael Yang
2010cbc5fa
Merge pull request #3833 from ollama/mxyng/fix-from
...
fix: from blob
2024-04-24 15:13:47 -07:00
Michael Yang
ac0801eced
only replace if it matches command
2024-04-24 14:49:26 -07:00
Michael Yang
ad66e5b060
split temp zip files
2024-04-24 14:18:01 -07:00
Blake Mizerany
ade4b55520
types/model: make ParseName use default without question ( #3886 )
2024-04-24 11:52:55 -07:00
Daniel Hiltgen
a6d62e0617
Merge pull request #3882 from dhiltgen/amd_gfx
...
AMD gfx patch rev is hex
2024-04-24 11:07:49 -07:00
Daniel Hiltgen
6e76348df7
Merge pull request #3834 from dhiltgen/not_found_in_path
...
Report errors on server lookup instead of path lookup failure
2024-04-24 10:50:48 -07:00
Daniel Hiltgen
0d6687f84c
AMD gfx patch rev is hex
...
Correctly handle gfx90a discovery
2024-04-24 09:43:52 -07:00
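To illustrate why the patch revision has to be read as hex, here is a hedged sketch of parsing a target such as `gfx90a`. `parseGfx` is a hypothetical helper, and the assumed layout (last character is the patch, the one before it the minor version, the remainder a decimal major version) is an assumption of this sketch rather than something stated in the commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseGfx splits a target such as "gfx90a" into major, minor, and patch.
// Assumed layout for this sketch: the last character is the patch revision,
// the character before it is the minor version, and the rest is the major
// version in decimal.
func parseGfx(target string) (major, minor, patch uint64, err error) {
	v := strings.TrimPrefix(target, "gfx")
	if v == target || len(v) < 3 {
		return 0, 0, 0, fmt.Errorf("unexpected gfx target %q", target)
	}
	n := len(v)
	if major, err = strconv.ParseUint(v[:n-2], 10, 32); err != nil {
		return 0, 0, 0, err
	}
	if minor, err = strconv.ParseUint(v[n-2:n-1], 16, 32); err != nil {
		return 0, 0, 0, err
	}
	// Parsing the patch in base 16 is what lets the "a" in gfx90a succeed;
	// a base-10 parse would reject it.
	if patch, err = strconv.ParseUint(v[n-1:], 16, 32); err != nil {
		return 0, 0, 0, err
	}
	return major, minor, patch, nil
}

func main() {
	major, minor, patch, err := parseGfx("gfx90a")
	fmt.Println(major, minor, patch, err) // 9 0 10 <nil>
}
```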
Patrick Devine
74d2a9ef9a
add OLLAMA_KEEP_ALIVE env variable to FAQ ( #3865 )
2024-04-23 21:06:51 -07:00
Patrick Devine
14476d48cc
fixes for gguf ( #3863 )
2024-04-23 20:57:20 -07:00
Patrick Devine
ce8ce82567
add mixtral 8x7b model conversion ( #3859 )
2024-04-23 20:17:04 -07:00
Blake Mizerany
4dc4f1be34
types/model: restrict digest hash part to a minimum of 2 characters ( #3858 )
...
This allows users of a valid Digest to know it has a minimum of 2
characters in the hash part for use when sharding.
This is a reasonable restriction as the hash part is a SHA256 hash which
is 64 characters long, which is the common hash used. There is no
anticipation of using a hash with less than 2 characters.
Also, add MustParseDigest.
Also, replace Digest.Type with Digest.Split for getting both the type
and hash parts together, which is the most common case when asking for
either.
2024-04-23 18:24:17 -07:00
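The sharding use case behind the two-character minimum can be sketched as follows. `shardKey` is a hypothetical helper, not part of `types/model`, and the `<type>-<hash>` form with `-` as the separator is an assumption taken from the separator change mentioned elsewhere in this log.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// shardKey builds a slash-separated key of the form <type>/<first two hash
// characters>/<hash>. Slicing hash[:2] is only safe because the hash part is
// guaranteed to have at least 2 characters, which this sketch checks itself.
func shardKey(digest string) (string, error) {
	typ, hash, ok := strings.Cut(digest, "-")
	if !ok || typ == "" || len(hash) < 2 {
		return "", fmt.Errorf("digest %q: want <type>-<hash> with a hash of at least 2 characters", digest)
	}
	return path.Join(typ, hash[:2], hash), nil
}

func main() {
	key, err := shardKey("sha256-" + strings.Repeat("ab", 32))
	fmt.Println(key, err)
}
```

With the minimum enforced at parse time, a caller holding a valid digest could skip the length check before slicing.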
Daniel Hiltgen
16b52331a4
Merge pull request #3857 from dhiltgen/mem_escape_valve
...
Add back memory escape valve
2024-04-23 17:32:24 -07:00
Daniel Hiltgen
5445aaa94e
Add back memory escape valve
...
If we get our predictions wrong, this can be used to
set a lower memory limit as a workaround. Recent multi-gpu
refactoring accidentally removed it, so this adds it back.
2024-04-23 17:09:02 -07:00
Daniel Hiltgen
2ac3dd6853
Merge pull request #3850 from dhiltgen/windows_packaging
...
Move nested payloads to installer and zip file on windows
2024-04-23 16:35:20 -07:00
Daniel Hiltgen
d8851cb7a0
Harden sched TestLoad
...
Give the go routine a moment to deliver the expired event
2024-04-23 16:14:47 -07:00
Daniel Hiltgen
058f6cd2cc
Move nested payloads to installer and zip file on windows
...
Now that the llm runner is an executable and not just a dll, more users are facing
problems with security policy configurations on windows that prevent users
writing to directories and then executing binaries from the same location.
This change removes payloads from the main executable on windows and shifts them
over to be packaged in the installer and discovered based on the executable's location.
This also adds a new zip file for people who want to "roll their own" installation model.
2024-04-23 16:14:47 -07:00
Daniel Hiltgen
790cf34d17
Merge pull request #3846 from dhiltgen/missing_runner
...
Detect and recover if runner removed
2024-04-23 13:14:12 -07:00
Michael
928d844896
adding phi-3 mini to readme
...
adding phi-3 mini to readme
2024-04-23 13:58:31 -04:00
Daniel Hiltgen
939d6a8606
Make CI lint verbose
2024-04-23 10:17:42 -07:00
Daniel Hiltgen
58888a74bc
Detect and recover if runner removed
...
Tmp cleaners can nuke the file out from underneath us. This detects the missing
runner, and re-initializes the payloads.
2024-04-23 10:05:26 -07:00
Daniel Hiltgen
cc5a71e0e3
Merge pull request #3709 from remy415/custom-gpu-defs
...
Adds support for customizing GPU build flags in llama.cpp
2024-04-23 09:28:34 -07:00
Michael Yang
e83bcf7f9a
Merge pull request #3836 from ollama/mxyng/mixtral
...
fix: mixtral graph
2024-04-23 09:15:10 -07:00
Daniel Hiltgen
5690e5ce99
Merge pull request #3418 from dhiltgen/concurrency
...
Request and model concurrency
2024-04-23 08:31:38 -07:00
Daniel Hiltgen
f2ea8470e5
Local unicode test case
2024-04-22 19:29:12 -07:00
Daniel Hiltgen
34b9db5afc
Request and model concurrency
...
This change adds support for multiple concurrent requests, as well as
loading multiple models by spawning multiple runners. The default
settings are currently set at 1 concurrent request per model and only 1
loaded model at a time, but these can be adjusted by setting
OLLAMA_NUM_PARALLEL and OLLAMA_MAX_LOADED_MODELS.
2024-04-22 19:29:12 -07:00
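A minimal sketch of reading the two settings named above with a default of 1 each. Only the variable names and the defaults come from the commit message; `limitFromEnv` and its fallback rules are hypothetical and may differ from how the server actually parses them.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// limitFromEnv parses a positive integer setting and falls back to def when
// the value is unset, not a number, or below 1. The fallback behavior is an
// assumption of this sketch.
func limitFromEnv(raw string, def int) int {
	n, err := strconv.Atoi(raw)
	if err != nil || n < 1 {
		return def
	}
	return n
}

func main() {
	numParallel := limitFromEnv(os.Getenv("OLLAMA_NUM_PARALLEL"), 1)
	maxLoaded := limitFromEnv(os.Getenv("OLLAMA_MAX_LOADED_MODELS"), 1)
	fmt.Printf("parallel requests per model: %d, loaded models: %d\n", numParallel, maxLoaded)
}
```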
Daniel Hiltgen
8711d03df7
Report errors on server lookup instead of path lookup failure
2024-04-22 19:08:47 -07:00
Daniel Hiltgen
ee448deaba
Merge pull request #3835 from dhiltgen/harden_llm_override
...
Trim spaces and quotes from llm lib override
2024-04-22 19:06:54 -07:00
Bruce MacDonald
6e8db04716
tidy community integrations
...
- move some popular integrations to the top of the lists
2024-04-22 17:29:08 -07:00
Bruce MacDonald
658e60cf73
Revert "stop running model on interactive exit"
...
This reverts commit fad00a85e5.
2024-04-22 17:23:11 -07:00
Bruce MacDonald
4c78f028f8
Merge branch 'main' of https://github.com/ollama/ollama
2024-04-22 17:22:28 -07:00
Michael Yang
435cc866a3
fix: mixtral graph
2024-04-22 17:19:44 -07:00
Hao Wu
c7d3a558f6
docs: update README to add chat (web UI) for LLM ( #3810 )
...
* add chat (web UI) for LLM
I have used chat with llama3 locally with success, and the code is MIT licensed.
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:19:39 -04:00
Maple Gao
089cdb2877
docs: Update README for Lobe-chat integration. ( #3817 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:18:15 -04:00
Võ Đình Đạt
ea1e9aa36b
Update README.md ( #3655 )
2024-04-22 20:16:55 -04:00
Jonathan Smoley
d0d28ef90d
Update README.md with Discord-Ollama project ( #3633 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com>
2024-04-22 20:14:20 -04:00
Eric Curtin
6654186a7c
Add podman-ollama to terminal apps ( #3626 )
...
The goal of podman-ollama is to make AI even more boring.
Signed-off-by: Eric Curtin <ecurtin@redhat.com>
2024-04-22 20:13:23 -04:00
Daniel Hiltgen
aa72281eae
Trim spaces and quotes from llm lib override
2024-04-22 17:11:14 -07:00
reid41
74bcbf828f
add qa-pilot link ( #3612 )
...
* add qa-pilot link
* format the link
* add shell-pilot
2024-04-22 20:10:34 -04:00
Christian Neff
fe39147e64
Add Chatbot UI v2 to Community Integrations ( #3503 )
2024-04-22 20:09:55 -04:00
Bruce MacDonald
fad00a85e5
stop running model on interactive exit
2024-04-22 16:22:14 -07:00
Jeremy
9c0db4cc83
Update gen_windows.ps1
...
Fixed improper env references
2024-04-21 16:13:41 -04:00
Cheng
62be2050dd
chore: use errors.New instead of fmt.Errorf ( #3789 )
2024-04-20 22:11:06 -04:00
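As a hedged illustration of the distinction this chore commit relies on: `errors.New` suits fixed messages, while `fmt.Errorf` is still the natural choice when the message needs formatting. The names `errEmptyName` and `checkName` are hypothetical and do not come from the repository.

```go
package main

import (
	"errors"
	"fmt"
)

// errEmptyName has a fixed message, so errors.New is sufficient and the value
// can be compared against directly.
var errEmptyName = errors.New("model name is empty")

// checkName is a hypothetical validator used only for this sketch.
func checkName(name string, maxLen int) error {
	if name == "" {
		return errEmptyName
	}
	if len(name) > maxLen {
		// The message depends on runtime values, so fmt.Errorf is appropriate.
		return fmt.Errorf("model name %q exceeds %d characters", name, maxLen)
	}
	return nil
}

func main() {
	fmt.Println(errors.Is(checkName("", 8), errEmptyName)) // true
	fmt.Println(checkName("a-very-long-name", 8))
}
```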
Blake Mizerany
56f8aa6912
types/model: export IsValidNamePart ( #3788 )
2024-04-20 18:26:34 -07:00
Sri Siddhaarth
e6f9bfc0e8
Update api.md ( #3705 )
2024-04-20 15:17:03 -04:00
Jeremy
6f18297b3a
Update gen_windows.ps1
...
Forgot a " on the write-host
2024-04-18 19:47:44 -04:00
Jeremy
15016413de
Update gen_windows.ps1
...
Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS to customize GPU builds on Windows
2024-04-18 19:27:16 -04:00
Jeremy
440b7190ed
Update gen_linux.sh
...
Added OLLAMA_CUSTOM_CUDA_DEFS and OLLAMA_CUSTOM_ROCM_DEFS instead of OLLAMA_CUSTOM_GPU_DEFS
2024-04-18 19:18:10 -04:00
Daniel Hiltgen
8d1995c625
Merge pull request #3708 from remy415/arm64static
...
move Ollama static build to its own flag
2024-04-18 16:04:12 -07:00
Daniel Hiltgen
fd01fbf038
Merge pull request #3710 from remy415/update-jetson-docs
...
update jetson tutorial
2024-04-18 16:02:08 -07:00
Blake Mizerany
0408205c1c
types/model: accept former `:` as a separator in digest ( #3724 )
...
This also converges the old sep `:` to the new sep `-`.
2024-04-18 14:17:46 -07:00
Jeffrey Morgan
63a7edd771
Update README.md
2024-04-18 16:09:38 -04:00
Michael
554ffdcce3
add llama3 to readme
...
add llama3 to readme
2024-04-18 15:18:48 -04:00
Jeremy
9850a4ce08
Merge branch 'ollama:main' into update-jetson-docs
2024-04-18 09:55:17 -04:00
Jeremy
3934c15895
Merge branch 'ollama:main' into custom-gpu-defs
2024-04-18 09:55:10 -04:00
Jeremy
fd048f1367
Merge branch 'ollama:main' into arm64static
2024-04-18 09:55:04 -04:00
Michael Yang
8645076a71
Merge pull request #3712 from ollama/mxyng/mem
...
add stablelm graph calculation
2024-04-17 15:57:51 -07:00
Michael Yang
05e9424824
Merge pull request #3664 from ollama/mxyng/fix-padding-2
...
fix padding to only return padding
2024-04-17 15:57:40 -07:00
Michael Yang
52ebe67a98
Merge pull request #3714 from ollama/mxyng/model-name-host
...
types/model: support : in PartHost for host:port
2024-04-17 15:34:03 -07:00
Michael Yang
889b31ab78
types/model: support : in PartHost for host:port
2024-04-17 15:16:07 -07:00
Michael Yang
3cf483fe48
add stablelm graph calculation
2024-04-17 13:57:19 -07:00
Jeremy
8dca03173d
Merge remote-tracking branch 'upstream/main' into update-jetson-docs
2024-04-17 16:18:50 -04:00
Jeremy
85bdf14b56
update jetson tutorial
2024-04-17 16:17:42 -04:00
Jeremy
d524e5ef5e
Merge branch 'custom-gpu-defs' of https://github.com/remy415/ollama into custom-gpu-defs
2024-04-17 16:01:03 -04:00
Jeremy
52f5370c48
add support for custom gpu build flags for llama.cpp
2024-04-17 16:00:48 -04:00
Jeremy
da8a0c7657
Merge branch 'ollama:main' into arm64static
2024-04-17 15:22:34 -04:00
Jeremy
1b42b4b59a
Merge branch 'ollama:main' into custom-gpu-defs
2024-04-17 15:21:56 -04:00
Jeremy
7c000ec3ed
adds support for OLLAMA_CUSTOM_GPU_DEFS to customize GPU build flags
2024-04-17 15:21:05 -04:00
jmorganca
c8afe7168c
use correct extension for feature and model request issue templates
2024-04-17 15:18:40 -04:00
jmorganca
28d3cd0148
simpler feature and model request forms
2024-04-17 15:17:08 -04:00
jmorganca
eb5554232a
simpler feature and model request forms
2024-04-17 15:14:49 -04:00
Jeremy
ea4c284a48
Merge branch 'ollama:main' into arm64static
2024-04-17 15:11:38 -04:00
jmorganca
2bdc320216
add descriptions to issue templates
2024-04-17 15:08:36 -04:00
jmorganca
32561aed09
simplify github issue templates a bit
2024-04-17 15:07:03 -04:00
Michael Yang
71548d9829
Merge pull request #3706 from ollama/mxyng/mem
...
account for all non-repeating layers
2024-04-17 11:58:20 -07:00
Jeremy
8aec92fa6d
rearranged conditional logic for static build, dockerfile updated
2024-04-17 14:43:28 -04:00
Michael Yang
a8b9b930b4
account for all non-repeating layers
2024-04-17 11:21:21 -07:00
Michael
9755cf9173
acknowledge the amazing work done by Georgi and team!
2024-04-17 13:48:14 -04:00
Jeremy
70261b9bb6
move static build to its own flag
2024-04-17 13:04:28 -04:00
Blake Mizerany
9df6c85c3a
types/model: add FilepathNoBuild ( #3680 )
...
Also, add test for DisplayLongest.
Also, plumb fill param to ParseName in MustParseName
2024-04-16 18:35:43 -07:00
Michael Yang
e74163af4c
fix padding to only return padding
2024-04-16 15:43:26 -07:00
Michael Yang
fb9580df85
Merge pull request #3684 from ollama/mxyng/scale-graph
...
scale graph based on gpu count
2024-04-16 14:57:09 -07:00
Michael Yang
26df674785
scale graph based on gpu count
2024-04-16 14:44:13 -07:00
Jeffrey Morgan
7c9792a6e0
Support unicode characters in model path ( #3681 )
...
* parse wide argv characters on windows
* cleanup
* move cleanup to end of `main`
2024-04-16 17:00:12 -04:00
Michael Yang
7afb2e125a
Merge pull request #3678 from ollama/mxyng/fix-darwin-partial-offloading
...
darwin: no partial offloading if required memory greater than system
2024-04-16 12:05:56 -07:00
Michael Yang
41a272de9f
darwin: no partial offloading if required memory greater than system
2024-04-16 11:22:38 -07:00
Jeffrey Morgan
f335722275
update llama.cpp submodule to `7593639` ( #3665 )
2024-04-15 23:04:43 -04:00
Michael Yang
6d53b67c2c
Merge pull request #3663 from ollama/mxyng/fix-padding
2024-04-15 17:44:54 -07:00
Michael Yang
969238b19e
fix padding in decode
...
TODO: update padding() to _only_ return the padding
2024-04-15 17:27:06 -07:00
Blake Mizerany
949d7832cf
Revert "cmd: provide feedback if OLLAMA_MODELS is set on non-serve command ( #3470 )" ( #3662 )
...
This reverts commit 7d05a6ee8f.
This proved to be more painful than useful.
See: https://github.com/ollama/ollama/issues/3624
2024-04-15 16:58:00 -07:00
Sung Kim
99d227c9db
Added Solar example at README.md ( #3610 )
...
Added just one line
| Solar | 10.7B | 6.1GB | `ollama run solar` |
2024-04-15 19:54:23 -04:00
Carlos Gamez
a27e419b47
Update langchainjs.md ( #2030 )
...
Changed ollama.call() for ollama.invoke() as per deprecated documentation from langchain
2024-04-15 18:37:30 -04:00
Chandre Van Der Westhuizen
e4d0db5a97
Added MindsDB information ( #3595 )
...
* Added MindsDB information
Added more details to MindsDB so that Ollama users can know that they can connect their Ollama model with 200+ databases and apps
* updated text for mindsdb
2024-04-15 18:35:29 -04:00
Eli Bendersky
ba460802c2
examples: add more Go examples using the API ( #3599 )
...
* examples: go-multimodal
* examples: add go-pull-progress
* examples: add go-chat
* fix
2024-04-15 18:34:54 -04:00
Jeffrey Morgan
e54a3c7fcd
Update modelfile.md
...
Remove Modelfile parameters that are decided at runtime
2024-04-15 15:35:44 -04:00
Patrick Devine
9f8691c6c8
Add llama2 / torch models for `ollama create` ( #3607 )
2024-04-15 11:26:42 -07:00
Jeffrey Morgan
a0b8a32eb4
Terminate subprocess if receiving `SIGINT` or `SIGTERM` signals while model is loading ( #3653 )
...
* terminate subprocess if receiving `SIGINT` or `SIGTERM` signals while model is loading
* use `unload` in signal handler
2024-04-15 12:09:32 -04:00
Jeffrey Morgan
7027f264fb
app: gracefully shut down `ollama serve` on windows ( #3641 )
...
* app: gracefully shut down `ollama serve` on windows
* fix linter errors
* bring back `HideWindow`
* remove creation flags
* restore `windows.CREATE_NEW_PROCESS_GROUP`
2024-04-14 18:33:25 -04:00
Blake Mizerany
9bee3b63b1
types/model: add path helpers ( #3619 )
...
This commit adds path helpers for working with Names in URL and file
paths. The new helpers are ParseNameFromPath, ParseNameFromFilePath,
Name.Path, and Name.FilePath.
This commit also adds Name.DisplayLongest, and Name.DisplayLong.
Also, update a place where strings.StripPrefix is more consistent
with the surrounding code.
Also, replace Parts with specific methods
2024-04-13 12:59:19 -07:00
Jeffrey Morgan
309aef7fee
update llama.cpp submodule to `4bd0f93` ( #3627 )
2024-04-13 10:43:02 -07:00
Blake Mizerany
08655170aa
types/model: make ParseName variants less confusing ( #3617 )
...
Also, fix http stripping bug.
Also, improve upon docs about fills and masks.
2024-04-12 13:57:57 -07:00
Blake Mizerany
2b341069a7
types/model: remove (*Digest).Scan and Digest.Value ( #3605 )
2024-04-11 13:32:31 -07:00
Daniel Hiltgen
c00fee6936
Merge pull request #3604 from dhiltgen/fix_rocm_deps
...
Fix rocm deps with new subprocess paths
2024-04-11 13:08:29 -07:00
Daniel Hiltgen
c2d813bdc3
Fix rocm deps with new subprocess paths
2024-04-11 12:52:06 -07:00
Michael Yang
786f3a1c44
Merge pull request #3600 from ollama/mxyng/mixtral
2024-04-11 12:23:37 -07:00
Michael Yang
3397eff0cd
mixtral mem
2024-04-11 11:10:41 -07:00
Blake Mizerany
0efb7931c7
Revert "types/model: remove (*Digest).Scan and Digest.Value ( #3589 )"
...
This reverts commit 42f2cc408e.
2024-04-11 00:45:07 -07:00
Blake Mizerany
42f2cc408e
types/model: remove (*Digest).Scan and Digest.Value ( #3589 )
2024-04-11 00:37:26 -07:00
Blake Mizerany
9446b795b5
types/model: remove DisplayLong ( #3587 )
2024-04-10 16:55:12 -07:00
Blake Mizerany
62f8cda3b3
types/model: remove MarshalText/UnmarshalText from Digest ( #3586 )
2024-04-10 16:52:49 -07:00
Blake Mizerany
6a1de23175
types/model: init with Name and Digest types ( #3541 )
2024-04-10 16:30:05 -07:00
Blake Mizerany
a7b431e743
server: provide helpful workaround hint when stalling on pull ( #3584 )
...
This is a quick fix to help users who are stuck on the "pull" step at
99%.
In the near future we're introducing a new registry client that
should/will hopefully be smarter. In the meantime, this should unblock
the users hitting issue #1736 .
2024-04-10 16:24:37 -07:00
Michael Yang
5a25f93522
Merge pull request #3478 from ollama/mxyng/tensor-layer
...
refactor tensor query
2024-04-10 12:45:03 -07:00
Michael Yang
7e33a017c0
partial offloading
2024-04-10 11:37:20 -07:00
Michael Yang
8b2c10061c
refactor tensor query
2024-04-10 11:37:20 -07:00
Michael Yang
c5c451ca3b
Merge pull request #3579 from ollama/mxyng/fix-ci
...
fix ci
2024-04-10 11:37:01 -07:00
Michael Yang
2b4ca6cf36
fix ci
2024-04-10 11:35:12 -07:00
Eli Bendersky
ad90b9ab3d
api: start adding documentation to package api ( #2878 )
...
* api: start adding documentation to package api
Updates #2840
* Fix lint typo report
2024-04-10 13:31:55 -04:00
Eli Bendersky
4340f8eba4
examples: start adding Go examples using api/ ( #2879 )
...
We can have the same examples as e.g. https://github.com/ollama/ollama-python/tree/main/examples
here. Using consistent naming and renaming the existing example to have -http-
since it uses direct HTTP requests rather than api/
Updates #2840
2024-04-10 13:26:45 -04:00
Daniel Hiltgen
4c7db6b7e9
Merge pull request #3566 from dhiltgen/more_time
...
Handle very slow model loads
2024-04-09 16:53:49 -07:00
Michael Yang
c03f0e3c3d
Merge pull request #3565 from ollama/mxyng/rope
...
fix: rope
2024-04-09 16:36:55 -07:00
Daniel Hiltgen
c5ff443b9f
Handle very slow model loads
...
During testing, we're seeing some models take over 3 minutes.
2024-04-09 16:35:10 -07:00
Michael Yang
01114b4526
fix: rope
2024-04-09 16:15:24 -07:00
Blake Mizerany
1524f323a3
Revert "build.go: introduce a friendlier way to build Ollama ( #3548 )" ( #3564 )
2024-04-09 15:57:45 -07:00
Blake Mizerany
fccf3eecaa
build.go: introduce a friendlier way to build Ollama ( #3548 )
...
This commit introduces a more friendly way to build Ollama dependencies
and the binary without abusing `go generate` and removing the
unnecessary extra steps it brings with it.
This script also provides nicer feedback to the user about what is
happening during the build process.
At the end, it prints a helpful message to the user about what to do
next (e.g. run the new local Ollama).
2024-04-09 14:18:47 -07:00
Michael Yang
c77d45d836
Merge pull request #3506 from ollama/mxyng/quantize-redux
...
cgo quantize
2024-04-09 12:32:53 -07:00
Jeffrey Morgan
5ec12cec6c
update llama.cpp submodule to `1b67731` ( #3561 )
2024-04-09 15:10:17 -04:00
Michael Yang
d9578d2bad
Merge pull request #3559 from ollama/mxyng/ci
...
ci: use go-version-file
2024-04-09 11:03:18 -07:00
Michael Yang
cb8352d6b4
ci: use go-version-file
2024-04-09 09:50:12 -07:00
Alex Mavrogiannis
fc6558f47f
Correct directory reference in macapp/README ( #3555 )
2024-04-09 09:48:46 -04:00
Michael Yang
9502e5661f
cgo quantize
2024-04-08 15:31:08 -07:00
Michael Yang
e1c9a2a00f
no blob create if already exists
2024-04-08 15:09:48 -07:00
writinwaters
1341ee1b56
Update README.md ( #3539 )
...
RAGFlow now supports integration with Ollama.
2024-04-08 10:58:14 -04:00
Jeffrey Morgan
63efa075a0
update generate scripts with new `LLAMA_CUDA` variable, set `HIP_PLATFORM` to avoid compiler errors ( #3528 )
2024-04-07 19:29:51 -04:00
Thomas Vitale
cb03fc9571
Docs: Remove wrong parameter for Chat Completion ( #3515 )
...
Fixes gh-3514
Signed-off-by: Thomas Vitale <ThomasVitale@users.noreply.github.com>
2024-04-06 09:08:35 -07:00
Michael Yang
a5ec9cfc0f
Merge pull request #3508 from ollama/mxyng/rope
2024-04-05 18:46:06 -07:00
Michael Yang
be517e491c
no rope parameters
2024-04-05 18:05:27 -07:00
Michael Yang
fc8e108642
Merge pull request #3496 from ollama/mxyng/cmd-r-graph
...
add command-r graph estimate
2024-04-05 12:26:21 -07:00
Daniel Hiltgen
c5d5c4a96c
Merge pull request #3491 from dhiltgen/context_bust_test
...
Add test case for context exhaustion
2024-04-04 16:20:20 -07:00
Daniel Hiltgen
dfe330fa1c
Merge pull request #3488 from mofanke/fix-windows-dll-compress
...
fix dll compress in windows building
2024-04-04 16:12:13 -07:00
Michael Yang
01f77ae25d
add command-r graph estimate
2024-04-04 14:07:24 -07:00
Daniel Hiltgen
483b81a863
Merge pull request #3494 from dhiltgen/ci_release
...
Fail fast if mingw missing on windows
2024-04-04 10:15:40 -07:00
Daniel Hiltgen
36bd967722
Fail fast if mingw missing on windows
2024-04-04 09:51:26 -07:00
Jeffrey Morgan
b0e7d35db8
use an older version of the mac os sdk in release ( #3484 )
2024-04-04 09:48:54 -07:00
Daniel Hiltgen
aeb1fb5192
Add test case for context exhaustion
...
Confirmed this fails on 0.1.30 with known regression
but passes on main
2024-04-04 07:42:17 -07:00
Daniel Hiltgen
a2e60ebcaf
Merge pull request #3490 from dhiltgen/ci_fixes
...
CI missing archive
2024-04-04 07:24:24 -07:00
Daniel Hiltgen
883ec4d1ef
CI missing archive
2024-04-04 07:23:27 -07:00
mofanke
4de0126719
fix dll compress in windows building
2024-04-04 21:27:33 +08:00
Daniel Hiltgen
9768e2dc75
Merge pull request #3481 from dhiltgen/ci_fixes
...
CI subprocess path fix
2024-04-03 19:29:09 -07:00
Daniel Hiltgen
08600d5bec
CI subprocess path fix
2024-04-03 19:12:53 -07:00
Daniel Hiltgen
a624e672d2
Merge pull request #3479 from dhiltgen/ci_fixes
...
Fix CI release glitches
2024-04-03 18:42:27 -07:00
Daniel Hiltgen
e4a7e5b2ca
Fix CI release glitches
...
The subprocess change moved the build directory
arm64 builds weren't setting cross-compilation flags when building on x86
2024-04-03 16:41:40 -07:00
Michael Yang
a0a15cfd5b
Merge pull request #3463 from ollama/mxyng/graph-estimate
...
update graph size estimate
2024-04-03 14:27:30 -07:00
Michael Yang
12e923e158
update graph size estimate
2024-04-03 13:34:12 -07:00
Jeffrey Morgan
cd135317d2
Fix macOS builds on older SDKs ( #3467 )
2024-04-03 10:45:54 -07:00
Michael Yang
4f895d633f
Merge pull request #3466 from ollama/mxyng/head-kv
...
default head_kv to 1
2024-04-03 10:41:00 -07:00
Blake Mizerany
7d05a6ee8f
cmd: provide feedback if OLLAMA_MODELS is set on non-serve command ( #3470 )
...
This also moves the checkServerHeartbeat call out of the "RunE" Cobra
stuff (that's the only word I have for that) to on-site where it's after
the check for OLLAMA_MODELS, which allows the helpful error message to
be printed before the server heartbeat check. This also arguably makes
the code more readable without the magic/superfluous "pre" function
caller.
2024-04-02 22:11:13 -07:00
Daniel Hiltgen
464d817824
Merge pull request #3464 from dhiltgen/subprocess
...
Fix numgpu opt miscomparison
2024-04-02 20:10:17 -07:00
Pier Francesco Contino
531324a9be
feat: add OLLAMA_DEBUG in ollama server help message ( #3461 )
...
Co-authored-by: Pier Francesco Contino <pfcontino@gmail.com>
2024-04-02 18:20:03 -07:00
Daniel Hiltgen
6589eb8a8c
Revert options as a ref in the server
2024-04-02 16:44:10 -07:00
Michael Yang
90f071c658
default head_kv to 1
2024-04-02 16:37:59 -07:00
Michael Yang
a039e383cd
Merge pull request #3465 from ollama/mxyng/fix-metal
...
fix metal gpu
2024-04-02 16:29:58 -07:00
Michael Yang
80163ebcb5
fix metal gpu
2024-04-02 16:06:45 -07:00
Daniel Hiltgen
a57818d93e
Merge pull request #3343 from dhiltgen/bump_more2
...
Bump llama.cpp to b2581
2024-04-02 15:08:26 -07:00
Daniel Hiltgen
841adda157
Fix windows lint CI flakiness
2024-04-02 12:22:16 -07:00
Daniel Hiltgen
0035e31af8
Bump to b2581
2024-04-02 11:53:07 -07:00
Daniel Hiltgen
c863c6a96d
Merge pull request #3218 from dhiltgen/subprocess
...
Switch back to subprocessing for llama.cpp
2024-04-02 10:49:44 -07:00
Daniel Hiltgen
1f11b52511
Refined min memory from testing
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
526d4eb204
Release gpu discovery library after use
...
Leaving the cudart library loaded kept ~30m of memory
pinned in the GPU in the main process. This change ensures
we don't hold GPU resources when idle.
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
0a74cb31d5
Safeguard for noexec
...
We may have users that run into problems with our current
payload model, so this gives us an escape valve.
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
10ed1b6292
Detect too-old cuda driver
...
"cudart init failure: 35" isn't particularly helpful in the logs.
2024-04-01 16:48:33 -07:00
Daniel Hiltgen
4fec5816d6
Integration test improvements
...
Cleaner shutdown logic, a bit of response hardening
2024-04-01 16:48:18 -07:00
Daniel Hiltgen
0a0e9f3e0f
Apply 01-cache.diff
2024-04-01 16:48:18 -07:00
Daniel Hiltgen
58d95cc9bd
Switch back to subprocessing for llama.cpp
...
This should resolve a number of memory leak and stability defects by allowing
us to isolate llama.cpp in a separate process and shutdown when idle, and
gracefully restart if it has problems. This also serves as a first step to be
able to run multiple copies to support multiple models concurrently.
2024-04-01 16:48:18 -07:00
Patrick Devine
3b6a9154dd
Simplify model conversion ( #3422 )
2024-04-01 16:14:53 -07:00
Michael Yang
d6dd2ff839
Merge pull request #3241 from ollama/mxyng/mem
...
update memory estimations for gpu offloading
2024-04-01 13:59:14 -07:00
Michael Yang
e57a6ba89f
Merge pull request #2926 from ollama/mxyng/decode-ggml-v2
...
refactor model parsing
2024-04-01 13:58:13 -07:00
Michael Yang
12ec2346ef
Merge pull request #3442 from ollama/mxyng/generate-output
...
fix generate output
2024-04-01 13:56:09 -07:00
Michael Yang
1ec0df1069
fix generate output
2024-04-01 13:47:34 -07:00
Michael Yang
91b3e4d282
update memory calculations
...
count each layer independently when deciding gpu offloading
2024-04-01 13:16:32 -07:00
Michael Yang
d338d70492
refactor model parsing
2024-04-01 13:16:15 -07:00
Philipp Gillé
011bb67351
Add chromem-go to community integrations ( #3437 )
2024-04-01 11:17:37 -04:00
Saifeddine ALOUI
d124627202
Update README.md ( #3436 )
2024-04-01 11:16:31 -04:00
Jesse Zhang
b0a8246a69
Community Integration: CRAG Ollama Chat ( #3423 )
...
Corrective Retrieval Augmented Generation Demo, powered by Langgraph and Streamlit 🤗
Support:
- Ollama
- OpenAI APIs
2024-04-01 11:16:14 -04:00
Yaroslav
e6fb39c182
Update README.md ( #3378 )
...
Plugins list updated
2024-03-31 13:10:05 -04:00
sugarforever
e1f1c374ea
Community Integration: ChatOllama ( #3400 )
...
* Community Integration: ChatOllama
* fixed typo
2024-03-30 22:46:50 -04:00
Jeffrey Morgan
06a1508bfe
Update 90_bug_report.yml
2024-03-29 10:11:17 -04:00
Patrick Devine
5a5efee46b
Add gemma safetensors conversion ( #3250 )
...
Co-authored-by: Michael Yang <mxyng@pm.me >
2024-03-28 18:54:01 -07:00
Daniel Hiltgen
97ae517fbf
Merge pull request #3398 from dhiltgen/release_latest
...
CI automation for tagging latest images
2024-03-28 16:25:54 -07:00
Daniel Hiltgen
44b813e459
Merge pull request #3377 from dhiltgen/rocm_v6_bump
...
Bump ROCm to 6.0.2 patch release
2024-03-28 16:07:54 -07:00
Daniel Hiltgen
539043f5e0
CI automation for tagging latest images
2024-03-28 16:07:37 -07:00
Daniel Hiltgen
dbcace6847
Merge pull request #3392 from dhiltgen/ci_build_win_cuda
...
CI windows gpu builds
2024-03-28 16:03:52 -07:00
Daniel Hiltgen
c91a4ebcff
Bump ROCm to 6.0.2 patch release
2024-03-28 15:58:57 -07:00
Daniel Hiltgen
b79c7e4528
CI windows gpu builds
...
If we're doing generate, test windows cuda and rocm as well
2024-03-28 14:39:10 -07:00
Michael Yang
035b274b70
Merge pull request #3379 from ollama/mxyng/origins
...
fix: trim quotes on OLLAMA_ORIGINS
2024-03-28 14:14:18 -07:00
Michael Yang
9c6a254945
Merge pull request #3391 from ollama/mxyng-patch-1
2024-03-28 13:15:56 -07:00
Michael Yang
f31f2bedf4
Update troubleshooting link
2024-03-28 12:05:26 -07:00
Michael Yang
756c257553
Merge pull request #3380 from ollama/mxyng/conditional-generate
...
fix: workflows
2024-03-28 00:35:27 +01:00
Michael Yang
5255d0af8a
fix: workflows
2024-03-27 16:30:01 -07:00
Michael Yang
af8a8a6b59
fix: trim quotes on OLLAMA_ORIGINS
2024-03-27 15:24:29 -07:00
Michael Yang
461ad25015
Merge pull request #3376 from ollama/mxyng/conditional-generate
...
only generate on changes to llm subdirectory
2024-03-27 22:12:53 +01:00
Michael Yang
8838ae787d
stub stub
2024-03-27 13:59:12 -07:00
Michael Yang
db75402ade
mangle arch
2024-03-27 13:44:50 -07:00
Michael Yang
1e85a140a3
only generate on changes to llm subdirectory
2024-03-27 12:45:35 -07:00
Michael Yang
c363282fdc
Merge pull request #3375 from ollama/mxyng/conditional-generate
...
only generate cuda/rocm when changes to llm detected
2024-03-27 20:40:55 +01:00
Michael Yang
5b0c48d29e
only generate cuda/rocm when changes to llm detected
2024-03-27 12:23:09 -07:00
Jeffrey Morgan
913306f4fd
Detect arrow keys on windows ( #3363 )
...
* detect arrow keys on windows
* add some helpful comments
2024-03-26 18:21:56 -04:00
Jeffrey Morgan
f5ca7f8c8e
add license in file header for vendored llama.cpp code ( #3351 )
2024-03-26 16:23:23 -04:00
Jeffrey Morgan
856b8ec131
remove need for `$VSINSTALLDIR` since build will fail if `ninja` cannot be found ( #3350 )
2024-03-26 16:23:16 -04:00
Patrick Devine
1b272d5bcd
change `github.com/jmorganca/ollama` to `github.com/ollama/ollama` ( #3347 )
2024-03-26 13:04:17 -07:00
Christophe Dervieux
29715dbca7
malformed markdown link ( #3358 )
2024-03-26 10:46:36 -04:00
Daniel Hiltgen
54a028d07f
Merge pull request #3356 from dhiltgen/fix_arm_linux
...
Switch runner for final release job
2024-03-25 20:54:46 -07:00
Daniel Hiltgen
f83e4db365
Switch runner for final release job
...
The manifest and tagging step use a lot of disk space
2024-03-25 20:51:40 -07:00
Daniel Hiltgen
3b5866a233
Merge pull request #3353 from dhiltgen/fix_arm_linux
...
Use Rocky Linux Vault to get GCC 10.2 installed
2024-03-25 19:38:56 -07:00
Daniel Hiltgen
b8c2be6142
Use Rocky Linux Vault to get GCC 10.2 installed
...
This should hopefully only be a temporary workaround until Rocky 8
picks up GCC 10.4 which fixes the NVCC bug
2024-03-25 19:18:50 -07:00
Daniel Hiltgen
e0319bd78d
Revert "Switch arm cuda base image to centos 7"
...
This reverts commit 5dacc1ebe8.
2024-03-25 19:01:11 -07:00
Daniel Hiltgen
b31ed7f031
Merge pull request #3352 from dhiltgen/fix_arm_linux
...
Switch arm cuda base image to centos 7
2024-03-25 16:13:10 -07:00
Daniel Hiltgen
5dacc1ebe8
Switch arm cuda base image to centos 7
...
We had started using rocky linux 8, but they've updated to GCC 10.3,
which breaks NVCC. 10.2 is compatible (or 10.4, but that's not
available from rocky linux 8 repos yet)
2024-03-25 15:57:08 -07:00
Daniel Hiltgen
c2712b5566
Merge pull request #3348 from dhiltgen/bump_llamacpp
...
Bump llama.cpp to b2527
2024-03-25 14:15:53 -07:00
Daniel Hiltgen
8091ef2eeb
Bump llama.cpp to b2527
2024-03-25 13:47:44 -07:00
Jeffrey Morgan
f38b705dc7
Fix ROCm link in development.md
2024-03-25 16:32:44 -04:00
Daniel Hiltgen
560be5e0b6
Merge pull request #3308 from dhiltgen/bump_more
...
Bump llama.cpp to b2510
2024-03-25 12:56:12 -07:00
Daniel Hiltgen
4a1c76b3aa
Merge pull request #3331 from dhiltgen/integration_testing
...
Integration tests conditionally pull
2024-03-25 12:48:51 -07:00
Daniel Hiltgen
28a64e23ca
Merge pull request #2279 from remy415/main
...
Add support for libcudart.so for CUDA devices (Adds Jetson support)
2024-03-25 12:46:28 -07:00
Niclas Pahlfer
92d74e2f59
adds ooo to community integrations ( #1623 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 15:08:33 -04:00
Herval Freire
6f8f57dd1d
Add cliobot to ollama supported list ( #1873 )
...
* Update README.md
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 15:07:19 -04:00
Chenhe Gu
b2fa68b0ea
Add Dify.AI to community integrations ( #1944 )
...
Dify.AI is a model-agnostic LLMOps platform for building and managing LLM applications.
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 15:06:39 -04:00
Marco Antônio
3767d5ef0d
enh: add ollero.nvim to community applications ( #1905 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 15:06:08 -04:00
Ani Betts
9fed85bc8b
Add typechat-cli to Terminal apps ( #2428 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 15:05:04 -04:00
Miguel
4501bc0913
add new Web & Desktop link in readme for alpaca webui ( #2881 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 15:00:18 -04:00
Danny Avila
57ba519e63
Add LibreChat to Web & Desktop Apps ( #2918 )
2024-03-25 14:59:18 -04:00
enoch1118
d98d322d24
Add Community Integration: OllamaGUI ( #2927 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 14:58:28 -04:00
fly2tomato
0c3ec74cf1
Add Community Integration: OpenAOE ( #2946 )
...
* Update README.md
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 14:57:40 -04:00
tusharhero
42ae8359fa
docs: Add AI telegram to Community Integrations. ( #3033 )
2024-03-25 14:56:42 -04:00
Timothy Carambat
e4b76dfb76
docs: Add AnythingLLM to README as integration option ( #3145 )
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 14:54:48 -04:00
Jikku Jose
2c56517494
Add Saddle ( #3178 )
2024-03-25 14:54:09 -04:00
Yusuf Can Bayrak
cfbc1b152b
tlm added to README.md terminal section. ( #3274 )
2024-03-25 14:53:26 -04:00
RAPID ARCHITECT
9305ac1b2e
Update README.md ( #3288 )
...
Added Ollama Basic chat based on hyperdiv
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-03-25 14:52:25 -04:00
drazdra
45d6292959
Update README.md ( #3338 )
...
adding drazdra/ollama-chats to the list of UI :)
2024-03-25 14:50:51 -04:00
Blake Mizerany
22921a3969
doc: specify ADAPTER is optional ( #3333 )
2024-03-25 09:43:19 -07:00
Daniel Hiltgen
7b6cbc10ec
Integration tests conditionally pull
...
If images aren't present, pull them.
Also fixes the expected responses
2024-03-25 08:57:45 -07:00
Jeremy
dfc6721b20
add support for libcudart.so for CUDA devices (adds Jetson support)
2024-03-25 11:07:44 -04:00
Blake Mizerany
acfa2b9422
llm: prevent race appending to slice ( #3320 )
2024-03-24 11:35:54 -07:00
Daniel Hiltgen
2c390a73ac
Merge pull request #3282 from dhiltgen/gpu_docs
...
Add docs for GPU selection and nvidia uvm workaround
2024-03-24 19:15:03 +01:00
Daniel Hiltgen
3e30c75f3e
Bump llama.cpp to b2510
2024-03-23 19:55:56 +01:00
Eddú Meléndez Gonzales
7e430ff352
Add Testcontainers into Libraries section ( #3291 )
...
Testcontainers provides a module for Ollama.
2024-03-23 19:55:25 +01:00
Daniel Hiltgen
1784113ef5
Merge pull request #3309 from dhiltgen/integration_testing
...
Revamp go based integration tests
2024-03-23 19:08:49 +01:00
Daniel Hiltgen
949b6c01e0
Revamp go based integration tests
...
This uplevels the integration tests to run the server which can allow
testing an existing server, or a remote server.
2024-03-23 14:24:18 +01:00
jmorganca
38daf0a252
rename .gitattributes
2024-03-23 12:40:31 +01:00
Daniel Hiltgen
43799532c1
Bump llama.cpp to b2474
...
The release just before ggml-cuda.cu refactoring
2024-03-23 09:54:56 +01:00
Daniel Hiltgen
d8fdbfd8da
Add docs for GPU selection and nvidia uvm workaround
2024-03-21 11:52:54 +01:00
Bruce MacDonald
a5ba0fcf78
doc: faq gpu compatibility ( #3142 )
2024-03-21 05:21:34 -04:00
Jeffrey Morgan
3a30bf56dc
Update faq.md
2024-03-20 17:48:39 +01:00
Daniel Hiltgen
a1c0a48524
Merge pull request #3122 from dhiltgen/better_tmp_cleanup
...
Better tmpdir cleanup
2024-03-20 16:28:03 +01:00
Daniel Hiltgen
74788b487c
Better tmpdir cleanup
...
If expanding the runners fails, don't leave a corrupt/incomplete payloads dir.
We now write a pid file out to the tmpdir, which allows us to scan for stale tmpdirs
and remove them as long as there isn't still a process running.
2024-03-20 16:03:19 +01:00
Jeffrey Morgan
7ed3e94105
Update faq.md
2024-03-18 10:24:39 +01:00
jmorganca
2297ad39da
update faq.md
2024-03-18 10:17:59 +01:00
Michael Yang
01cff6136d
Merge pull request #3217 from ollama/mxyng/cleanup
...
remove global
2024-03-18 02:13:30 -07:00
Michael Yang
3c4ad0ecab
dyn global
2024-03-18 09:45:45 +01:00
Michael Yang
22f326464e
Merge pull request #3083 from ollama/mxyng/refactor-readseeker
...
refactor readseeker
2024-03-16 12:08:56 -07:00
Jeffrey Morgan
e95ffc7448
llama: remove server static assets ( #3174 )
2024-03-15 19:24:12 -07:00
Jeffrey Morgan
2dce1ab40b
add `llm/ext_server` directory to `linguist-vendored` ( #3173 )
2024-03-15 17:46:46 -07:00
Daniel Hiltgen
f4b31c2d53
Merge pull request #3111 from alitrack/main
...
Update ollama.iss
2024-03-15 16:46:59 -07:00
Daniel Hiltgen
ab3456207b
Merge pull request #3028 from ollama/ci_release
...
CI release process
2024-03-15 16:40:54 -07:00
Daniel Hiltgen
6ad414f31e
Merge pull request #3086 from dhiltgen/import_server
...
Import server.cpp to retain llava support
2024-03-15 16:10:35 -07:00
Daniel Hiltgen
052b5a3b77
Merge pull request #3171 from dhiltgen/rocm_94x
...
Add Radeon gfx940-942 GPU support
2024-03-15 15:58:33 -07:00
Daniel Hiltgen
d4c10df2b0
Add Radeon gfx940-942 GPU support
2024-03-15 15:34:58 -07:00
Daniel Hiltgen
540f4af45f
Wire up more complete CI for releases
...
Flesh out our github actions CI so we can build official releases.
2024-03-15 12:37:36 -07:00
Blake Mizerany
6ce37e4d96
llm,readline: use errors.Is instead of simple == check ( #3161 )
...
This fixes some brittle, simple equality checks to use errors.Is. Since
go1.13, errors.Is is the idiomatic way to check for errors.
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
2024-03-15 07:14:12 -07:00
Blake Mizerany
703684a82a
server: replace blob prefix separator from ':' to '-' ( #3146 )
...
This fixes blob file names that contain ':' characters being rejected by file systems that do not support them.
2024-03-14 20:18:06 -07:00
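Note on 703684a82a: a minimal sketch of the separator change, assuming digests of the form `sha256:<hex>`. The function name `blobFileName` is hypothetical, and the real server code may handle this differently.

```go
package main

import (
	"fmt"
	"strings"
)

// blobFileName is a hypothetical helper that turns a digest such as
// "sha256:abc123" into a file name that avoids ':', which some file
// systems (for example those used on Windows) reject in file names.
func blobFileName(digest string) string {
	// Only the first ':' separates the algorithm prefix from the hex value.
	return strings.Replace(digest, ":", "-", 1)
}

func main() {
	fmt.Println(blobFileName("sha256:abc123"))
}
```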
Daniel Hiltgen
6459377ae0
Add ROCm support to linux install script ( #2966 )
2024-03-14 18:00:16 -07:00
Blake Mizerany
8546dd3d72
.github: fix model and feature request yml ( #3155 )
2024-03-14 15:26:06 -07:00
Blake Mizerany
87100be5e0
.github: add issue templates ( #3143 )
2024-03-14 15:19:10 -07:00
Michael Yang
e87c780ff9
Merge pull request #3149 from ollama/mxyng/fix-memory-leak
...
fix: clip memory leak
2024-03-14 13:34:15 -07:00
Michael Yang
291c663865
fix: clip memory leak
2024-03-14 13:12:42 -07:00
Daniel Hiltgen
da20786e3e
Merge pull request #3068 from dhiltgen/win_pipe
...
Use stdin for term discovery on windows
2024-03-14 11:55:19 -07:00
Jeffrey Morgan
5ce997a7b9
Update README.md
2024-03-13 21:12:17 -07:00
Jeffrey Morgan
672ffe9b7d
add `OLLAMA_KEEP_ALIVE` to environment variable docs for `ollama serve` ( #3127 )
2024-03-13 14:35:33 -07:00
Patrick Devine
47cfe58af5
Default Keep Alive environment variable ( #3094 )
...
---------
Co-authored-by: Chris-AS1 <8493773+Chris-AS1@users.noreply.github.com >
2024-03-13 13:29:40 -07:00
Daniel Hiltgen
c1a81c6fe3
Use stdin for term discovery on windows
...
When you feed input to the cmd via a pipe it no longer reports a warning
2024-03-13 10:37:31 -07:00
Steven Lee
152ab524c2
Update ollama.iss
...
add arm64 support
2024-03-13 20:15:45 +08:00
Jeffrey Morgan
e72c567cfd
restore locale patch ( #3091 )
2024-03-12 22:08:13 -07:00
Bruce MacDonald
3e22611200
token repeat limit for prediction requests ( #3080 )
2024-03-12 22:08:25 -04:00
Daniel Hiltgen
a54d4a28dc
Merge pull request #3088 from dhiltgen/rocm_igpu_linux
...
Fix iGPU detection for linux
2024-03-12 17:20:27 -07:00
Daniel Hiltgen
82b0c7c27e
Fix iGPU detection for linux
...
This fixes a few bugs in the new sysfs discovery logic. iGPUs are now
correctly identified by their <1G VRAM reported. The sysfs IDs are off
by one compared to what HIP wants due to the CPU being reported
in amdgpu, but HIP only cares about GPUs.
2024-03-12 16:57:19 -07:00
Patrick Devine
ba7cf7fb66
add more docs for the modelfile message command ( #3087 )
2024-03-12 16:41:41 -07:00
Bruce MacDonald
2f804068bd
warn when json format is expected but not mentioned in prompt ( #3081 )
2024-03-12 19:07:11 -04:00
Daniel Hiltgen
85129d3a32
Adapt our build for imported server.cpp
2024-03-12 14:57:15 -07:00
Daniel Hiltgen
9ac6440da3
Import server.cpp as of b2356
2024-03-12 13:58:06 -07:00
Michael Yang
0085297928
refactor readseeker
2024-03-12 12:54:18 -07:00
Daniel Hiltgen
34d00f90b1
Merge pull request #3070 from dhiltgen/visible_devices
...
Add docs explaining GPU selection env vars
2024-03-12 11:36:46 -07:00
Daniel Hiltgen
b53229a2ed
Add docs explaining GPU selection env vars
2024-03-12 11:33:06 -07:00
racerole
53c107e20e
chore: fix typo ( #3073 )
...
Signed-off-by: racerole <jiangyifeng@outlook.com >
2024-03-12 14:09:22 -04:00
mofanke
51578d8573
fix gpu_info_cuda.c compile warning ( #3077 )
2024-03-12 14:08:40 -04:00
Jeffrey Morgan
b5fcd9d3aa
use `-trimpath` when building releases ( #3069 )
2024-03-11 15:58:46 -07:00
Bruce MacDonald
b80661e8c7
relay load model errors to the client ( #3065 )
2024-03-11 16:48:27 -04:00
Jeffrey Morgan
6d3adfbea2
Update troubleshooting.md
2024-03-11 13:22:28 -07:00
Jeffrey Morgan
369eda65f5
update llama.cpp submodule to `ceca1ae` ( #3064 )
2024-03-11 12:57:48 -07:00
Michael Yang
f878e91070
Merge pull request #3044 from ollama/mxyng/fix-convert-shape
...
convert: fix shape
2024-03-11 09:56:57 -07:00
Daniel Hiltgen
0d651478e4
Merge pull request #3056 from dhiltgen/rocm_link_clash
...
Avoid rocm runner and dependency clash
2024-03-11 09:48:48 -07:00
Michael Yang
9ea492f1ce
convert: fix shape
2024-03-11 09:41:01 -07:00
Daniel Hiltgen
bc13da2bfe
Avoid rocm runner and dependency clash
...
Putting the rocm symlink next to the runners is risky. This moves
the payloads into a subdir to avoid potential clashes.
2024-03-11 09:33:22 -07:00
Jeffrey Morgan
41b00b9856
fix 03-locale.diff
2024-03-10 16:21:05 -07:00
Daniel Hiltgen
c2a8ed48e7
Merge pull request #3048 from dhiltgen/harden_rocm_deps
...
Harden for deps file being empty (or short)
2024-03-10 15:17:22 -07:00
Daniel Hiltgen
3dc1bb6a35
Harden for deps file being empty (or short)
2024-03-10 14:45:38 -07:00
Daniel Hiltgen
7865a6996a
Merge pull request #3046 from dhiltgen/rocm_search_paths
...
Add ollama executable peer dir for rocm
2024-03-10 12:30:56 -07:00
Daniel Hiltgen
00ec269321
Add ollama executable peer dir for rocm
...
This allows people who package up ollama on their own to place
the rocm dependencies in a peer directory to the ollama executable
much like our windows install flow.
2024-03-10 12:16:30 -07:00
Jeffrey Morgan
908005d90b
patch: use default locale in wpm tokenizer ( #3034 )
2024-03-09 21:12:12 -08:00
Jeffrey Morgan
cdf65e793f
only copy deps for `amd64` in `build_linux.sh`
2024-03-09 17:55:22 -08:00
Daniel Hiltgen
82ca694d68
Rename ROCm deps file to avoid confusion ( #3025 )
2024-03-09 17:48:38 -08:00
Jeffrey Morgan
5017a15bcb
add `macapp` to `.dockerignore`
2024-03-09 16:07:06 -08:00
Jeffrey Morgan
e11668aa07
add `bundle_metal` and `cleanup_metal` functions to `gen_darwin.sh`
2024-03-09 16:04:57 -08:00
Jeffrey Morgan
0bd0f4a29c
tidy cleanup logs
2024-03-09 15:56:48 -08:00
Jeffrey Morgan
1ffb1e2874
update llama.cpp submodule to `77d1ac7` ( #3030 )
2024-03-09 15:55:34 -08:00
Daniel Hiltgen
0a7844413c
Merge pull request #3026 from dhiltgen/win_rocm_docs
...
Doc how to set up ROCm builds on windows
2024-03-09 14:17:19 -08:00
Jeffrey Morgan
f9cd55c70b
disable gpu for certain model architectures and fix divide-by-zero on memory estimation
2024-03-09 12:51:38 -08:00
Daniel Hiltgen
0fdebb34a9
Doc how to set up ROCm builds on windows
2024-03-09 11:29:45 -08:00
Daniel Hiltgen
ac64cd4ef9
Merge pull request #3008 from dhiltgen/no_more_idempotent
...
Finish unwinding idempotent payload logic
2024-03-09 09:13:24 -08:00
Daniel Hiltgen
4a5c9b8035
Finish unwinding idempotent payload logic
...
The recent ROCm change partially removed idempotent
payloads, but the ggml-metal.metal file for mac was still
idempotent. This finishes switching to always extract
the payloads, and now that idempotency is gone, the
version directory is no longer useful.
2024-03-09 08:34:39 -08:00
Jeffrey Morgan
efe5617b64
update llama.cpp submodule to `c2101a2` ( #3020 )
2024-03-09 00:44:50 -08:00
Jeffrey Morgan
5b3fad9636
separate out isLocalIP
2024-03-09 00:22:08 -08:00
Jeffrey Morgan
bfec2c6e10
simplify host checks
2024-03-08 23:29:53 -08:00
Jeffrey Morgan
5c143af726
add additional allowed hosts
2024-03-08 23:23:59 -08:00
Jeffrey Morgan
6c0af2599e
Update docs `README.md` and table of contents
2024-03-08 22:45:11 -08:00
Jeffrey Morgan
fc8c044584
add allowed host middleware and remove `workDir` middleware ( #3018 )
2024-03-08 22:23:47 -08:00
Michael Yang
ecc133d843
Merge pull request #3014 from ollama/mxyng/decode-ggla
2024-03-08 16:14:53 -08:00
Michael Yang
76bdebbadf
decode ggla
2024-03-08 15:46:25 -08:00
Michael Yang
18979ad4a1
convert: fix default shape
2024-03-08 15:42:48 -08:00
Michael Yang
8e0ef931d8
Merge pull request #2990 from ollama/mxyng/default-term-size
...
fix: default terminal width, height
2024-03-08 15:20:54 -08:00
Daniel Hiltgen
280da44522
Merge pull request #2988 from dhiltgen/rocm_docs
...
Refined ROCm troubleshooting docs
2024-03-08 13:33:30 -08:00
Bruce MacDonald
0cebc79cba
fix: allow importing a model from name reference ( #3005 )
2024-03-08 12:27:47 -05:00
Jeffrey Morgan
0e4669b04f
update llama.cpp submodule to `6cdabe6` ( #2999 )
2024-03-08 00:26:20 -08:00
Jeffrey Morgan
b886bec3f9
Update api.md
2024-03-07 23:27:51 -08:00
Jeffrey Morgan
fc06205971
Revert "adjust download and upload concurrency based on available bandwidth" ( #2995 )
2024-03-07 18:10:16 -08:00
Blake Mizerany
2ada81e068
cmd: tighten up env var usage sections ( #2962 )
...
Also, document OLLAMA_HOST client semantics per command that honors it.
This looks nicer than having a general purpose environment variable
section in the root usage which was showing up after the "additional help
topics" section output by Cobra's default template.
It was decided this was easier to work with than using a custom template
for Cobra right now.
2024-03-07 13:57:07 -08:00
Michael Yang
b1e74d4fda
default terminal width, height
2024-03-07 11:35:42 -08:00
Michael Yang
f678f5c5c3
Merge pull request #2991 from ollama/mxyng/fix-ci
...
fix ci
2024-03-07 11:35:06 -08:00
Michael Yang
2cb74e23fb
fix ci
2024-03-07 11:33:49 -08:00
Daniel Hiltgen
69f0227813
Refined ROCm troubleshooting docs
2024-03-07 11:22:37 -08:00
Daniel Hiltgen
3c8df3808b
Merge pull request #2885 from dhiltgen/rocm_v6_only
...
Revamp ROCm support
2024-03-07 10:51:00 -08:00
Michael Yang
7d564835c2
Merge pull request #2985 from ollama/rm-empty-examples
...
remove empty examples
2024-03-07 10:49:40 -08:00
Michael Yang
72431031d9
no ci test on docs, examples
2024-03-07 10:44:48 -08:00
Michael Yang
6041abb5b2
remove empty examples
2024-03-07 10:40:32 -08:00
Daniel Hiltgen
6c5ccb11f9
Revamp ROCm support
...
This refines where we extract the LLM libraries to by adding a new
OLLAMA_HOME env var, that defaults to `~/.ollama`. The logic was already
idempotent, so this should speed up startups after the first time a
new release is deployed. It also cleans up after itself.
We now build only a single ROCm version (latest major) on both windows
and linux. Given the large size of ROCm's tensor files, we split the
dependency out. It's bundled into the installer on windows, and a
separate download on linux. The linux install script is now smart and
detects the presence of AMD GPUs and looks to see if rocm v6 is already
present, and if not, then downloads our dependency tar file.
For Linux discovery, we now use sysfs and check each GPU against what
ROCm supports so we can degrade to CPU gracefully instead of having
llama.cpp+rocm assert/crash on us. For Windows, we now use go's windows
dynamic library loading logic to access the amdhip64.dll APIs to query
the GPU information.
2024-03-07 10:36:50 -08:00
Michael Yang
2e20110e50
Merge pull request #2221 from ollama/mxyng/up-down-ccy
...
adjust download and upload concurrency based on available bandwidth
2024-03-07 09:27:33 -08:00
Daniel Hiltgen
82ddc3e441
Merge pull request #2964 from dhiltgen/mem_limit_var
...
Allow setting max vram for workarounds
2024-03-07 09:25:44 -08:00
Jeffrey Morgan
d481fb3cc8
update go to 1.22 in other places ( #2975 )
2024-03-07 07:39:49 -08:00
DJ Johnson
23ee633252
docs: Add LLM-X to Web Integration section ( #2759 )
2024-03-07 10:11:53 -05:00
John
23ebe8fe11
fix some typos ( #2973 )
...
Signed-off-by: hishope <csqiye@126.com >
2024-03-06 22:50:11 -08:00
Patrick Devine
2c017ca441
Convert Safetensors to an Ollama model ( #2824 )
2024-03-06 21:01:51 -08:00
Daniel Hiltgen
be330174dd
Allow setting max vram for workarounds
...
Until we get all the memory calculations correct, this can provide
an escape valve for users to work around out of memory crashes.
2024-03-06 17:15:06 -08:00
Blake Mizerany
0ded7fdc4b
cmd: document environment variables for serve command
...
Updates #2944
2024-03-06 13:48:46 -08:00
Leo
2103a5073c
Add Odin Runes, a Feature-Rich Java UI for Ollama, to README ( #2440 )
...
* Add Odin Runes to README
Add Odin Runes to README
This commit adds Odin Runes to the "Community Integrations" section of the README. Odin Runes is a Java-based GPT client designed to provide seamless interaction with GPT models, enhancing productivity in prompt engineering and text generation tasks. This addition highlights the integration between Odin Runes and Ollama, offering users the flexibility to leverage large language models locally within their development workflow.
* Update README.md
this commit applies the comments of the reviewer.
2024-03-06 11:57:49 -08:00
Jeffrey Morgan
ce9f7c4674
Update api.md
2024-03-05 13:13:23 -08:00
Anders Rex
e5596c1944
Add NotesOllama to Community Integrations ( #2909 )
2024-03-04 01:18:10 -08:00
Timothy Graupmann
9bc3fee694
Added community link for Ollama Copilot ( #2582 )
...
* Added community link for Ollama Copilot
* Update README.md
---------
Co-authored-by: Michael <mchiang0610@users.noreply.github.com >
2024-03-04 00:40:36 -08:00
Jeffrey Morgan
21347e1ed6
update llama.cpp submodule to `c29af7e` ( #2868 )
2024-03-01 15:26:04 -08:00
Jeffrey Morgan
3b4bab3dc5
Fix embeddings load model behavior ( #2848 )
2024-02-29 17:40:56 -08:00
Daniel Hiltgen
cbd6e3b38e
Merge pull request #2838 from dhiltgen/opensuse
...
Add ollama user to video group
2024-02-29 15:47:56 -08:00
Daniel Hiltgen
b830afa716
Merge pull request #2837 from dhiltgen/podman_image_support
...
Add env var so podman will map cuda GPUs
2024-02-29 15:47:37 -08:00
Daniel Hiltgen
bd1d8b0d14
Merge pull request #2836 from bmwiedemann/gzip
...
Omit build date from gzip headers
2024-02-29 15:46:46 -08:00
fred-bf
25c2912120
Add Community Integration: NextChat ( #2780 )
2024-02-29 12:12:13 -08:00
Michael Yang
0e19476b56
prepend image tags ( #2789 )
...
instead of appending image tags, prepend them - this generally produces better results
2024-02-29 11:30:14 -08:00
tylinux
fa2f2b3563
fix: print usedMemory size right ( #2827 )
2024-02-29 11:11:04 -08:00
Jeffrey Morgan
cbf4970e0f
bump submodule to `87c91c07663b707e831c59ec373b5e665ff9d64a` ( #2828 )
2024-02-29 09:42:08 -08:00
Daniel Hiltgen
74468513bd
Add ollama user to video group
...
On OpenSUSE, ollama needs to be a member of the video group
to access the GPU
2024-02-29 08:50:10 -08:00
Daniel Hiltgen
794a916a72
Add env var so podman will map cuda GPUs
...
Without this env var, podman's GPU logic doesn't map the GPU through
2024-02-29 08:43:08 -08:00
Bernhard M. Wiedemann
76e5d9ec88
Omit build date from gzip headers
...
See https://reproducible-builds.org/ for why this is good.
This patch was done while working on reproducible builds for openSUSE.
2024-02-29 16:48:19 +01:00
Daniel Hiltgen
076237b8ea
Merge pull request #2771 from dhiltgen/toggle_models
...
Bump llama.cpp to b2276
2024-02-27 11:29:53 -08:00
Daniel Hiltgen
53d694c67f
Merge pull request #2772 from dhiltgen/container_image
...
Refine container image build script
2024-02-27 11:29:08 -08:00
Daniel Hiltgen
5aa6bfea94
Merge pull request #2785 from dhiltgen/win_download
...
Log unexpected server errors checking for update
2024-02-27 10:43:14 -08:00
Daniel Hiltgen
1cde63dd64
Log unexpected server errors checking for update
...
This should unmask some failure modes that likely
show up in app logs as unmarshal errors
2024-02-27 09:17:04 -08:00
Daniel Hiltgen
98e0b7e94f
Refine container image build script
...
Allow overriding the platform, image name, and tag latest for
standard and rocm images.
2024-02-26 17:26:49 -08:00
Daniel Hiltgen
061e8f6abc
Bump llama.cpp to b2276
2024-02-26 16:49:24 -08:00
peanut256
a189810df6
Determine max VRAM on macOS using `recommendedMaxWorkingSetSize` ( #2354 )
...
* read iogpu.wired_limit_mb on macOS
Fix for https://github.com/ollama/ollama/issues/1826
* improved determination of available vram on macOS
read the recommended maximal vram on macOS via Metal API
* Removed macOS-specific logging
* Remove logging from gpu_darwin.go
* release Core Foundation object
fixes a possible memory leak
2024-02-25 18:16:45 -05:00
Ikko Eltociear Ashimine
e95b896790
Update types.go ( #2744 )
...
specfied -> specified
2024-02-25 13:41:25 -05:00
elthommy
1f087c4d26
Update langchain python tutorial ( #2737 )
...
Remove unused GPT4all
Use nomic-embed-text as embedded model
Fix a deprecation warning (__call__)
2024-02-25 00:31:36 -05:00
Jeffrey Morgan
5d7ea6616f
no extra disk space for windows installation ( #2739 )
2024-02-25 00:20:35 -05:00
Michael Yang
2a4b128ae3
Merge pull request #2719 from ollama/mxyng/format-private-key
...
remove format private key
2024-02-23 17:15:14 -08:00
Michael Yang
fc483274ad
clean up go.mod
2024-02-23 16:53:36 -08:00
Michael Yang
fd10a2ad4b
remove format/openssh.go
...
this is unnecessary now that x/crypto/ssh.MarshalPrivateKey has been
added
2024-02-23 16:52:23 -08:00
Benn Huang
b291f63188
Add Community Integration: Chatbox
...
Co-authored-by: bennhuang <bennhuang@tencent.com >
2024-02-23 07:17:28 -05:00
Jeffrey Morgan
f58856bf6f
better directory cleanup in ollama.iss
2024-02-23 07:14:59 -05:00
Jeffrey Morgan
275ea01587
restore windows build flags and compression
2024-02-22 18:07:18 -05:00
Jeffrey Morgan
8782dd5628
fix `build_windows.ps1` script to run `go build` with the correct flags
2024-02-22 17:41:43 -05:00
Jeffrey Morgan
11bfff8ee1
update llama.cpp submodule to 96633eeca1265ed03e57230de54032041c58f9cd
2024-02-22 16:44:26 -05:00
Logan Yang
7c0167a8f6
Add copilot for obsidian plugin to community integration ( #1918 )
2024-02-22 14:17:20 -05:00
LangChain4j
74d898e37d
Added LangChain4j links ( #1690 )
2024-02-22 14:09:08 -05:00
Yuan-Man
c6e8b00718
Add README.md ( #2249 )
2024-02-22 14:03:44 -05:00
B-Tocs.org Community
be9980ef13
Update README.md - Ollama for SAP ABAP ( #2510 )
2024-02-22 13:12:27 -05:00
Augustinas Malinauskas
646a0dedb9
Update README.md ( #2504 )
...
- Enchanted is now supported for desktop on macOS
2024-02-22 13:09:29 -05:00
Azhar Khan
7f964d938c
update README to add Gemma 2B, 7B model in Model Library Table ( #2686 )
2024-02-22 13:07:47 -05:00
Pavel Frankov
e6b8a139ff
Update README.md ( #2138 )
2024-02-22 10:52:36 -05:00
Jeffrey Morgan
bdc0ea1ba5
Update import.md
2024-02-22 02:08:03 -05:00
Jeffrey Morgan
7fab7918cc
Update import.md
2024-02-22 02:06:24 -05:00
Michael Yang
74c1bdba0d
Merge pull request #2657 from joshyan1/patch-1
...
Update install.sh success message
2024-02-21 15:55:20 -08:00
Josh
f983ef7f5f
Update install.sh success message
2024-02-21 18:30:01 -05:00
Jeffrey Morgan
1ae1c33651
Windows build + installer adjustments ( #2656 )
...
* remove `-w -s` linker flags on windows
* use `zip` for windows installer compression
2024-02-21 18:21:26 -05:00
Michael Yang
084d846621
refactor
2024-02-21 13:42:48 -08:00
Michael Yang
6a4b994433
lint
2024-02-21 13:42:48 -08:00
Michael Yang
bea007deb7
use LimitGroup for uploads
2024-02-21 13:42:48 -08:00
Michael Yang
074934be03
adjust group limit based on download speed
2024-02-21 13:42:48 -08:00
Michael Yang
0de12368a0
add new LimitGroup for dynamic concurrency
2024-02-21 13:42:48 -08:00
Michael Yang
917bd61084
refactor download run
2024-02-21 13:42:46 -08:00
Jeffrey Morgan
efe040f8c0
reset with `init_vars` ahead of each cpu build in `gen_windows.ps1` ( #2654 )
2024-02-21 16:35:34 -05:00
Jeffrey Morgan
2a7553ce09
update llama.cpp submodule to c14f72d
2024-02-21 09:03:14 -05:00
Sun Bo
10af6070a9
Update big-AGI config file link ( #2626 )
...
Co-authored-by: bo.sun <bo.sun@cotticoffee.com >
2024-02-21 01:24:48 -05:00
Jeffrey Morgan
92423b0600
add `dist` directory in `build_windows.ps`
2024-02-21 00:05:05 -05:00
Jeffrey Morgan
b3eac61cac
update llama.cpp submodule to f0d1fafc029a056cd765bdae58dcaa12312e9879
2024-02-20 22:56:51 -05:00
Jeffrey Morgan
287ba11500
better error message when calling `/api/generate` or `/api/chat` with embedding models
2024-02-20 21:53:45 -05:00
Jeffrey Morgan
63861f58cc
Support for `bert` and `nomic-bert` embedding models
2024-02-20 21:37:29 -05:00
Jeffrey Morgan
f0425d3de9
Update faq.md
2024-02-20 20:44:45 -05:00
Michael Yang
210b65268e
replace strings buffer with hasher ( #2437 )
...
the buffered value is going into the hasher eventually so write directly
to the hasher instead
2024-02-20 19:07:50 -05:00
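Note on 210b65268e: a minimal sketch of the idea, assuming a SHA-256 digest over a list of string parts. Both function names and the sample input are hypothetical; the commit itself does not show which hash or which inputs are involved.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"io"
	"strings"
)

// digestBuffered collects all parts in a strings.Builder first and
// hashes the combined string afterwards.
func digestBuffered(parts []string) string {
	var sb strings.Builder
	for _, p := range parts {
		sb.WriteString(p)
	}
	return fmt.Sprintf("%x", sha256.Sum256([]byte(sb.String())))
}

// digestStreamed writes each part directly into the hasher, so no
// intermediate copy of the whole input is kept in memory.
func digestStreamed(parts []string) string {
	h := sha256.New()
	for _, p := range parts {
		io.WriteString(h, p) // writes to a hash.Hash never return an error
	}
	return fmt.Sprintf("%x", h.Sum(nil))
}

func main() {
	parts := []string{"FROM llama2\n", "PARAMETER temperature 0\n"}
	// Both approaches hash the same byte stream, so the digests match.
	fmt.Println(digestBuffered(parts) == digestStreamed(parts))
}
```

Since a hasher is itself an `io.Writer`, the intermediate buffer adds an allocation and a copy without changing the result.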
Michael Yang
949d7b1c48
add gguf file types ( #2532 )
2024-02-20 19:06:29 -05:00
Michael Yang
897b213468
use http.DefaultClient ( #2530 )
...
default client already handles proxy
2024-02-20 18:34:47 -05:00
Jeffrey Morgan
4613a080e7
update llama.cpp submodule to `66c1968f7` ( #2618 )
2024-02-20 17:42:31 -05:00
Muhammed Nazeem
ace2cdf1c6
Add Page Assist to the community integrations ( #2447 )
2024-02-20 14:03:58 -05:00
Nikesh Parajuli
eed92bc19a
docs: add Msty app in readme ( #1775 )
...
* docs: add Msty app in readme
* docs: update msty url
2024-02-20 14:03:33 -05:00
Michael Edoror
e0a2f46466
Update README.md to include Elixir LangChain Library ( #2180 )
...
The Elixir LangChain Library now supports Ollama Chat with this [PR](https://github.com/brainlid/langchain/pull/70)
2024-02-20 14:03:02 -05:00
Taras Tsugrii
01ff2e14db
[nit] Remove unused msg local var. ( #2511 )
2024-02-20 14:02:34 -05:00
BADR
199e79ec0c
docs: add tenere to terminal clients ( #2329 )
2024-02-19 23:13:03 -05:00
Jeffrey Morgan
8125ce4cb6
Update import.md
...
Add instructions to get public key on windows
2024-02-19 22:48:24 -05:00
Daniel
636d6eea99
Add ShellOracle to community terminal integrations ( #1767 )
2024-02-19 22:18:05 -05:00
Jeffrey Morgan
df56f1ee5e
Update faq.md
2024-02-19 22:16:42 -05:00
Jean-Baptiste Detroyes
0b6c6c9092
feat: add Helm Chart link to Package managers list ( #1673 )
2024-02-19 22:05:14 -05:00
Jakob Hoeg Mørk
cb60389de7
NextJS web interface for Ollama ( #2466 )
2024-02-19 21:57:36 -05:00
lulz
ce0c95d097
[fix] /bye and /exit are now treated as prefixes ( #2381 )
...
* [fix] /bye and /exit are now treated as prefixes
instead of being treated as entire lines which doesn't align with the way the rest of the commands are treated
* Update cmd/interactive.go
Fixing whitespace
---------
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
2024-02-19 21:56:49 -05:00
Eddú Meléndez Gonzales
a9bc1e1c37
Add LangChain4J ( #2164 )
2024-02-19 21:17:32 -05:00
Branislav Gerazov
62c71f4cb1
add ollama-chat.nvim ( #2188 )
2024-02-19 21:14:29 -05:00
Jeffrey Morgan
41aca5c2d0
Update faq.md
2024-02-19 21:11:01 -05:00
Jeffrey Morgan
753724d867
Update api.md to include examples for reproducible outputs
2024-02-19 20:36:16 -05:00
Jeffrey Morgan
e4576c2ee1
Update README.md
2024-02-19 20:15:24 -05:00
Patrick Devine
9a7a4b9533
add faqs for memory pre-loading and the keep_alive setting ( #2601 )
2024-02-19 14:45:25 -08:00
Daniel Hiltgen
2653191222
Merge pull request #2600 from dhiltgen/refined_win_docs
...
Document setting server vars for windows
2024-02-19 13:46:37 -08:00
Daniel Hiltgen
b338c0635f
Document setting server vars for windows
2024-02-19 13:30:46 -08:00
Daniel Hiltgen
4fcbf1cde6
Merge pull request #2599 from dhiltgen/fix_avx
...
Explicitly disable AVX2 on GPU builds
2024-02-19 13:13:05 -08:00
Daniel Hiltgen
9220b4fa91
Merge pull request #2585 from dhiltgen/cuda_leaks
...
Fix cuda leaks
2024-02-19 12:48:00 -08:00
Daniel Hiltgen
fc39a6cd7a
Fix cuda leaks
...
This should resolve the problem where we don't fully unload from the GPU
when we go idle.
2024-02-18 18:37:20 -08:00
Justin Hayes
1e23e82324
Update Web UI link to new project name ( #2563 )
...
Ollama WebUI is now known as Open WebUI.
2024-02-17 20:05:20 -08:00
Daniel Hiltgen
f9fd08040b
Merge pull request #2552 from dhiltgen/dup_update_menus
...
Fix duplicate menus on update and exit on signals
2024-02-16 17:23:37 -08:00
Daniel Hiltgen
4318e35ee3
Merge pull request #2553 from dhiltgen/amdgpu_version
...
Harden AMD driver lookup logic
2024-02-16 17:23:12 -08:00
Daniel Hiltgen
9754c6d9d8
Harden AMD driver lookup logic
...
It looks like the version file doesn't exist on older(?) drivers
2024-02-16 16:20:16 -08:00
Daniel Hiltgen
a497235a55
Fix view logs menu
2024-02-16 15:42:53 -08:00
Daniel Hiltgen
df6dc4fd96
Fix duplicate menus on update and exit on signals
...
Also fixes a few fit-and-finish items for better developer experience
2024-02-16 15:33:16 -08:00
Bruce MacDonald
88622847c6
fix: chat system prompting overrides ( #2542 )
2024-02-16 14:42:43 -05:00
Tristan Rhodes
9774663013
Update faq.md with the location of models on Windows ( #2545 )
2024-02-16 11:04:19 -08:00
Daniel Hiltgen
a468ae0459
Merge pull request #2499 from ollama/windows-preview
...
Windows Preview
2024-02-15 16:06:32 -08:00
Daniel Hiltgen
c3e62ba38a
Merge pull request #2516 from dhiltgen/single_tray_app
...
Fix a couple duplicate instance bugs
2024-02-15 15:52:43 -08:00
Daniel Hiltgen
117369aa73
Exit if we detect another copy of Ollama running
2024-02-15 14:58:29 -08:00
Daniel Hiltgen
1ba734de67
typo
2024-02-15 14:56:55 -08:00
Daniel Hiltgen
5208cf09b1
clean up some logging
2024-02-15 14:56:55 -08:00
Daniel Hiltgen
bb9de6037c
Prevent multiple installers running concurrently
2024-02-15 14:56:55 -08:00
Daniel Hiltgen
272e53a1f5
Prepare to distribute standalone windows executable
...
This will be useful for our automated test rigging, and may be useful for
advanced users who want to "roll their own" system service
2024-02-15 14:56:55 -08:00
Daniel Hiltgen
db2a9ad1fe
Explicitly disable AVX2 on GPU builds
...
Even though we weren't setting it to on, somewhere in the cmake config
it was getting toggled on. By explicitly setting it to off, we get `/arch:AVX`
as intended.
2024-02-15 14:50:11 -08:00
Daniel Hiltgen
c9ab1aead3
Merge pull request #2526 from dhiltgen/harden_for_quotes
...
Harden the OLLAMA_HOST lookup for quotes
2024-02-15 14:13:40 -08:00
Daniel Hiltgen
4a10e7a7fa
Harden the OLLAMA_HOST lookup for quotes
2024-02-15 13:46:56 -08:00
Michael Yang
86808f80a8
remove unused import
2024-02-15 12:09:11 -08:00
Michael Yang
4240b045e6
always enable view logs
2024-02-15 12:08:27 -08:00
Michael Yang
e547378893
disable default debug
2024-02-15 12:05:13 -08:00
Michael Yang
fd77dbec4d
do not print update request headers
2024-02-15 11:36:35 -08:00
Michael
fefb3e77d1
Update README.md
2024-02-15 10:32:40 -08:00
Jeffrey Morgan
ed5489a96e
higher resolution tray icons
2024-02-14 22:55:03 -08:00
jmorganca
76113742cf
update installer title
2024-02-15 05:56:45 +00:00
Jeffrey Morgan
57e60c836f
better windows app and tray icons
2024-02-15 05:56:45 +00:00
jmorganca
622b1f3e67
update installer and app.exe metadata
2024-02-15 05:56:45 +00:00
jmorganca
7ad9844ac0
set exe metadata using resource files
2024-02-15 05:56:45 +00:00
Michael Yang
e43648afe5
rerefactor
2024-02-15 05:56:45 +00:00
Daniel Hiltgen
823a520266
Fix lint error on ignored error for win console
2024-02-15 05:56:45 +00:00
vinjn
66ef308abd
Import "containerd/console" lib to support colorful output in Windows terminal
2024-02-15 05:56:45 +00:00
Daniel Hiltgen
29e90cc13b
Implement new Go based Desktop app
...
This focuses on Windows first, but could be used for Mac
and possibly Linux in the future.
2024-02-15 05:56:45 +00:00
Daniel Hiltgen
f397e0e988
Move hub auth out to new package
2024-02-15 05:56:45 +00:00
Daniel Hiltgen
9da9e8fb72
Move Mac App to a new dir
2024-02-15 05:56:45 +00:00
Patrick Devine
42e77e2a69
handle race condition while setting raw mode in windows ( #2509 )
2024-02-14 21:28:35 -08:00
Jeffrey Morgan
9241a29336
Revert "Revert "bump submodule to 6c00a06 ( #2479 )"" ( #2485 )
...
This reverts commit 6920964b87.
2024-02-13 18:18:41 -08:00
Jeffrey Morgan
f7231ad9ad
set shutting_down to false once shutdown is complete ( #2484 )
2024-02-13 17:48:41 -08:00
Jeffrey Morgan
6920964b87
Revert "bump submodule to 6c00a06 ( #2479 )"
...
This reverts commit 2f9ed52bbd.
2024-02-13 17:23:05 -08:00
Jeffrey Morgan
2f9ed52bbd
bump submodule to 6c00a06 ( #2479 )
2024-02-13 17:12:42 -08:00
bnorick
caf2b13c10
Fix infinite keep_alive ( #2480 )
2024-02-13 15:40:32 -08:00
lebrunel
1d263449ff
Update README.md to include link to Ollama-ex Elixir library ( #2477 )
2024-02-13 11:40:44 -08:00
Jeffrey Morgan
48a273f80b
Fix issues with templating prompt in chat mode ( #2460 )
2024-02-12 15:06:57 -08:00
Daniel Hiltgen
939c60473f
Merge pull request #2422 from dhiltgen/better_kill
...
More robust shutdown
2024-02-12 14:05:06 -08:00
Jeffrey Morgan
f76ca04f9e
update submodule to 099afc6 ( #2468 )
2024-02-12 14:01:16 -08:00
Daniel Hiltgen
76b8728f0c
Merge pull request #2465 from dhiltgen/block_rocm_pre_9
...
Detect AMD GPU info via sysfs and block old cards
2024-02-12 12:41:43 -08:00
Jeffrey Morgan
1f9078d6ae
Check image filetype in api handlers ( #2467 )
2024-02-12 11:16:20 -08:00
Daniel Hiltgen
6d84f07505
Detect AMD GPU info via sysfs and block old cards
...
This wires up some new logic to start using sysfs to discover AMD GPU
information and detects old cards we can't yet support so we can fall back to CPU mode.
2024-02-12 08:19:41 -08:00
Jeffrey Morgan
26b13fc33c
patch: always add token to cache_tokens ( #2459 )
2024-02-12 08:10:16 -08:00
Jeffrey Morgan
1c8435ffa9
Update domain name references in docs and install script ( #2435 )
2024-02-09 15:19:30 -08:00
Daniel Hiltgen
6680761596
Shutdown faster
...
Make sure that when a shutdown signal comes, we shut down quickly instead
of waiting for a potentially long exchange to wrap up.
2024-02-08 22:22:50 -08:00
Jeffrey Morgan
42b797ed9c
Update openai.md
2024-02-08 15:03:23 -05:00
Jeffrey Morgan
336aa43f3c
Update openai.md
2024-02-08 12:48:28 -05:00
Daniel Hiltgen
69f392c9b7
Merge pull request #2403 from dhiltgen/handle_tmp_cleanup
...
Ensure the libraries are present
2024-02-07 17:55:31 -08:00
Daniel Hiltgen
a1dfab43b9
Ensure the libraries are present
...
When we store our libraries in a temp dir, a reaper might clean
them when we are idle, so make sure to check for them before
we reload.
2024-02-07 17:27:49 -08:00
Jeffrey Morgan
a0a199b108
Fix hanging issue when sending empty content ( #2399 )
2024-02-07 19:30:33 -05:00
Jeffrey Morgan
ab0d37fde4
Update openai.md
2024-02-07 17:25:33 -05:00
Jeffrey Morgan
14e71350c8
Update openai.md
2024-02-07 17:25:24 -05:00
Jeffrey Morgan
453f572f83
Initial OpenAI /v1/chat/completions API compatibility ( #2376 )
2024-02-07 17:24:29 -05:00
Daniel Hiltgen
c9dfa6e571
Merge pull request #2377 from dhiltgen/bump_llamacpp
...
Bump llama.cpp to b2081
2024-02-07 12:04:38 -08:00
Michael Yang
3dcbcd367d
Merge pull request #2394 from ollama/mxyng/fix-error-response
2024-02-07 11:47:31 -08:00
Michael Yang
e805ac1d59
fix response on token error
2024-02-07 11:05:49 -08:00
Michael Yang
b9229ffca5
Merge pull request #2378 from ollama/mxyng/runners
...
runners
2024-02-06 13:49:58 -08:00
Michael Yang
46c847c4ad
enable rocm builds
2024-02-06 13:36:13 -08:00
Michael Yang
92b1a21f79
use linux runners
2024-02-06 13:36:04 -08:00
Daniel Hiltgen
de76b95dd4
Bump llama.cpp to b2081
2024-02-06 12:06:43 -08:00
Michael Yang
59ec837ef6
Merge pull request #2374 from ollama/mxyng/rocm-builds
...
disable rocm builds
2024-02-06 09:41:02 -08:00
Michael Yang
f06b99a461
disable rocm builds
2024-02-06 09:29:42 -08:00
Bruce MacDonald
128fce5495
docs: keep_alive ( #2258 )
2024-02-06 11:00:05 -05:00
Daniel Hiltgen
27aa2d4a19
Merge pull request #1849 from mraiser/main
...
Accommodate split cuda lib dir
2024-02-05 16:01:16 -08:00
Jeffrey Morgan
b9f91a0b36
Update import instructions to use convert and quantize tooling from llama.cpp submodule ( #2247 )
2024-02-05 00:50:44 -05:00
Erik S
b538dc3858
Add llm-ollama plugin for Datasette's LLM CLI to README ( #2340 )
...
Co-authored-by: Erik Sp <git@aschwa.com >
2024-02-03 15:40:50 -08:00
Jeffrey Morgan
f0e9496c85
Update api.md
2024-02-02 12:17:24 -08:00
Jeffrey Morgan
09a6f76f4c
fix error on ollama run with a non-existent model
2024-02-01 23:11:52 -08:00
Jeffrey Morgan
e135167484
Add multimodal support to ollama run in noninteractive mode ( #2317 )
2024-02-01 21:33:06 -08:00
Jeffrey Morgan
38296ab352
clear previous images when submitting an image to ollama run ( #2316 )
2024-02-01 21:30:26 -08:00
Daniel Hiltgen
f43dea68d1
Merge pull request #2318 from dhiltgen/more_clean
...
Harden generate patching model
2024-02-01 20:41:29 -08:00
Daniel Hiltgen
e1f50377f4
Harden generate patching model
...
Only apply patches if we have any, and make sure to clean up
every file we patched at the end to leave the tree clean
2024-02-01 19:34:36 -08:00
Jeffrey Morgan
7913104527
Improvements to ollama run for multimodal models ( #2300 )
2024-02-01 17:09:51 -08:00
Michael Yang
bfbf2f7cf7
Merge pull request #2296 from ollama/mxyng/img-tags
...
append image tags to user content
2024-02-01 13:16:59 -08:00
Michael Yang
fe3cbd014f
Merge pull request #2298 from ollama/mxyng/debug-prompt
...
structured debug prompt
2024-02-01 13:16:49 -08:00
Michael Yang
3d6f48507a
structured debug prompt
2024-02-01 11:56:28 -08:00
Michael Yang
f3761405c8
use image id
2024-02-01 11:52:42 -08:00
Michael Yang
e49dc9f3d8
fix tests
2024-02-01 11:48:11 -08:00
Michael Yang
d125510b4b
remove image tags
2024-02-01 11:32:51 -08:00
Russell Canfield
1ca386aa9e
Feature - Add Wingman Extension ( #2313 )
2024-02-01 11:16:24 -08:00
Michael Yang
fb56988014
account for image projection in token count
2024-02-01 09:50:48 -08:00
Michael Yang
d046bee790
use llm.ImageData for chat
2024-01-31 19:18:25 -08:00
Jeffrey Morgan
f11bf0740b
use llm.ImageData
2024-01-31 19:13:48 -08:00
Michael Yang
8450bf66e6
trim images
2024-01-31 19:13:47 -08:00
Michael Yang
b4e11be8ef
append image tags to user content
2024-01-31 19:13:10 -08:00
Bruce MacDonald
a896079705
preserve last system message from modelfile ( #2289 )
2024-01-31 21:45:01 -05:00
Michael Yang
583950c828
Merge pull request #2294 from ollama/mxyng/slog-source
...
update slog handler options
2024-01-31 15:29:11 -08:00
Michael Yang
8ac08a0eec
update slog handler options
...
- consistent format by using text handler for debug and non-debug
- truncate source file to just the file name
2024-01-31 15:15:00 -08:00
Michael Yang
60f47be64c
Merge pull request #2284 from ollama/mxyng/parse-raw
...
remove unnecessary parse raw
2024-01-31 09:40:48 -08:00
Daniel Hiltgen
6e56077ada
Merge pull request #2263 from dhiltgen/bump_llamacpp
...
Bump llama.cpp to b1999
2024-01-31 08:39:41 -08:00
Hoang Nguyen
98ae9467bb
Added MindMac to Community Integrations -> Web & Desktop section ( #1957 )
2024-01-31 07:48:37 -08:00
Richard Macarthy
b7a24af083
Add twinny vscode extension to Extensions and Plugins ( #1950 )
2024-01-31 06:25:06 -08:00
Michael Yang
c8b1f2369e
remove unnecessary parse raw
2024-01-30 17:00:53 -08:00
Daniel Hiltgen
72b12c3be7
Bump llama.cpp to b1999
...
This requires an upstream change to support graceful termination,
carried as a patch.
2024-01-30 16:52:12 -08:00
Bruce MacDonald
0632dff3f8
trim chat prompt based on llm context size ( #1963 )
2024-01-30 15:59:29 -05:00
Maximilian Weber
509e2dec8a
Update README.md ( #2252 )
...
Added - [Ollama for R - rollama](https://github.com/JBGruber/rollama ) in Libraries in README.md
2024-01-30 11:56:51 -08:00
Daniel Hiltgen
78a48de804
Merge pull request #2256 from dhiltgen/container_logs
...
Add container hints for troubleshooting
2024-01-30 08:12:48 -08:00
Daniel Hiltgen
e7dbb00331
Add container hints for troubleshooting
...
Some users are new to containers and unsure where the server logs go
2024-01-29 08:53:41 -08:00
Marc Raiser
c3f9538636
remove default.nix
2024-01-29 00:05:07 -05:00
Jeffrey Morgan
2e06ed01d5
remove unknown CPPFLAGS option
2024-01-28 17:51:23 -08:00
Daniel Hiltgen
4072b5879b
Merge pull request #2246 from dhiltgen/reject_cuda_without_avx
...
Don't disable GPUs on arm without AVX
2024-01-28 16:26:55 -08:00
Daniel Hiltgen
15562e887d
Don't disable GPUs on arm without AVX
...
AVX is an x86 feature, so ARM should be excluded from
the check.
2024-01-28 15:22:38 -08:00
Jeffrey Morgan
f2245c7c77
print prompt with OLLAMA_DEBUG=1 ( #2245 )
2024-01-28 15:22:35 -08:00
Jeffrey Morgan
e4b9b72f2a
Do not repeat system prompt for chat templating ( #2241 )
2024-01-28 14:15:56 -08:00
Daniel Hiltgen
311f8e0c3f
Merge pull request #2243 from dhiltgen/harden_zero_gpus
...
Harden for zero detected GPUs
2024-01-28 13:30:44 -08:00
Daniel Hiltgen
f07f8b7a9e
Harden for zero detected GPUs
...
At least with the ROCm libraries, it's possible to have the library
present with zero GPUs. This fix avoids a divide by zero bug in llm.go
when we try to calculate GPU memory with zero GPUs.
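A minimal sketch of the kind of guard described above, written in Python for brevity rather than the project's Go. The function name and the choice to return 0 are assumptions made for illustration; the actual code in llm.go is not shown in this log.

```python
def average_gpu_memory(total_bytes, gpu_count):
    """Hypothetical helper: average memory per detected GPU, in bytes."""
    if gpu_count <= 0:
        # The management library loaded but reported no devices.
        # Return 0 so the caller can treat this as "no usable GPU"
        # instead of dividing by zero.
        return 0
    return total_bytes // gpu_count
```

A caller could treat a result of 0 as the cue to stay in CPU mode.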
2024-01-28 13:13:10 -08:00
mraiser
4c4c730a0a
Merge branch 'ollama:main' into main
2024-01-27 21:56:11 -05:00
Daniel Hiltgen
e02ecfb6c8
Merge pull request #2116 from dhiltgen/cc_50_80
...
Add support for CUDA 5.0 cards
2024-01-27 10:28:38 -08:00
Daniel Hiltgen
c8059b4dcf
Merge pull request #2224 from jaglinux/fix_rocm_get_version_message
...
ROCm: Correct the response string in rocm_get_version function
2024-01-27 07:29:32 -08:00
Jagadish Krishnamoorthy
59d87127f5
Update gpu_info_rocm.c
2024-01-26 22:08:27 -08:00
Patrick Devine
b5cf31b460
add keep_alive to generate/chat/embedding api endpoints ( #2146 )
2024-01-26 14:28:02 -08:00
Daniel Hiltgen
cc4915e262
Merge pull request #2214 from dhiltgen/reject_cuda_without_avx
...
Detect lack of AVX and fallback to CPU mode
2024-01-26 12:06:44 -08:00
Daniel Hiltgen
667a2ba18a
Detect lack of AVX and fallback to CPU mode
...
We build the GPU libraries with AVX enabled to ensure that if not all
layers fit on the GPU we get better performance in a mixed mode.
If the user is using a virtualization/emulation system that lacks AVX
this used to result in an illegal instruction error and crash before this
fix. Now we will report a warning in the server log, and just use
CPU mode to ensure we don't crash.
2024-01-26 11:36:03 -08:00
Michael Yang
e054ebe059
Merge pull request #2212 from ollama/mxyng/fix-build
...
fix build
2024-01-26 11:19:08 -08:00
Michael Yang
9d3dcfd0ec
fix logging
2024-01-26 11:04:27 -08:00
Michael Yang
6e0ea5ecc8
Merge pull request #1916 from ollama/mxyng/inactivity-monitor
...
download: add inactivity monitor
2024-01-26 10:56:00 -08:00
Daniel Hiltgen
a47d8b2557
Merge pull request #2197 from dhiltgen/remove_rocm_image
...
Add back ROCm container support
2024-01-26 09:34:23 -08:00
Daniel Hiltgen
30c43c285c
Merge pull request #2195 from dhiltgen/rocm_real_gpus
...
Ignore AMD integrated GPUs
2024-01-26 09:30:24 -08:00
Daniel Hiltgen
23a7ea593b
Merge pull request #2209 from dhiltgen/harden_mgmt
...
Fix crash on cuda ml init failure
2024-01-26 09:30:13 -08:00
Daniel Hiltgen
75c44aa319
Add back ROCm container support
...
This adds ROCm support back as a discrete image.
2024-01-26 09:24:29 -08:00
Daniel Hiltgen
9d7b5d6c91
Ignore AMD integrated GPUs
...
Detect and ignore integrated GPUs reported by rocm.
2024-01-26 09:21:35 -08:00
Daniel Hiltgen
5d9c4a5f5a
Fix crash on cuda ml init failure
...
The new driver lookup code was triggering after init failure due to a missing return
2024-01-26 09:18:33 -08:00
Daniel Hiltgen
197e420a97
Merge pull request #2196 from dhiltgen/remove_rocm_image
...
Switch back to ubuntu base
2024-01-25 16:50:32 -08:00
Daniel Hiltgen
a34e1ad3cf
Switch back to ubuntu base
...
The size increase for rocm support in the standard image is problematic.
We'll revisit multiple tags for rocm support in a follow up PR.
2024-01-25 16:46:01 -08:00
Michael Yang
2ae0556292
Merge pull request #1679 from ollama/mxyng/build-gpus
...
build cuda and rocm
2024-01-25 16:38:14 -08:00
Jeffrey Morgan
5be9bdd444
Update modelfile.md
2024-01-25 16:29:48 -08:00
Jeffrey Morgan
b706794905
Update modelfile.md to include MESSAGE
2024-01-25 16:29:32 -08:00
Michael Yang
a8c5413d06
only generate gpu libs
2024-01-25 15:41:56 -08:00
Michael Yang
5580de4571
archive ollama binaries
2024-01-25 15:40:16 -08:00
Michael Yang
946431d5b0
build cuda and rocm
2024-01-25 15:40:15 -08:00
Michael Yang
0610126049
remove env setting
2024-01-25 15:39:43 -08:00
Jeffrey Morgan
3ebd6a83fc
update submodule to cd4fddb29f81d6a1f6d51a0c016bc6b486d68def
2024-01-25 13:54:11 -08:00
Jeffrey Morgan
a64570dcae
Fix clearing kv cache between requests with the same prompt ( #2186 )
...
* Fix clearing kv cache between requests with the same prompt
* fix powershell script
2024-01-25 13:46:20 -08:00
Patrick Devine
7c40a67841
Save and load sessions ( #2063 )
2024-01-25 12:12:36 -08:00
Michael Yang
e64b5b07a2
Merge pull request #2181 from ollama/mxyng/stub-lint
...
stub generate outputs for lint
2024-01-25 11:55:15 -08:00
Michael Yang
9e1e295cdc
Merge pull request #2175 from ollama/mxyng/refactor-tensor-read
...
refactor tensor read
2024-01-25 09:22:42 -08:00
Marc Raiser
6eb3cddcb6
To build on NixOS: nix-shell --run 'go generate ./... && go build .'
2024-01-25 10:17:22 -05:00
mraiser
a4564232a4
Update gen_linux.sh to find libcudart in separate directory
2024-01-25 09:49:35 -05:00
Jeffrey Morgan
a643823f86
Update README.md
2024-01-24 21:36:56 -08:00
Michael Yang
8e5d359a03
stub generate outputs for lint
2024-01-24 17:36:10 -08:00
Daniel Hiltgen
a170888dd4
Merge pull request #2174 from dhiltgen/rocm_real_gpus
...
More logging for gpu management
2024-01-24 11:09:17 -08:00
Michael Yang
cd22855ef8
refactor tensor read
2024-01-24 10:48:31 -08:00
Daniel Hiltgen
013fd07139
More logging for gpu management
...
Fix an ordering glitch of dlerr/dlclose and add more logging to help
root cause some crashes users are hitting. This also refines the
function pointer names to use the underlying function names instead
of simplified names for readability.
2024-01-24 10:32:36 -08:00
Daniel Hiltgen
f63dc2db5c
Merge pull request #2162 from dhiltgen/rocm_real_gpus
...
Report more information about GPUs in verbose mode
2024-01-23 17:45:40 -08:00
Jeffrey Morgan
eaa5a396d9
Update README.md
2024-01-23 16:08:15 -08:00
Jeffrey Morgan
8ed22f5d72
Update README.md
2024-01-23 14:38:01 -08:00
Daniel Hiltgen
987c16b2f7
Report more information about GPUs in verbose mode
...
This adds additional calls to both CUDA and ROCm management libraries to
discover additional attributes about the GPU(s) detected in the system, and
wires up runtime verbosity selection. When users hit problems with GPUs we can
ask them to run with `OLLAMA_DEBUG=1 ollama serve` and share the results.
2024-01-23 11:37:02 -08:00
Jeffrey Morgan
950f636d64
Update README.md
2024-01-23 10:29:10 -08:00
Jeffrey Morgan
4458efb73a
Load all layers on arm64 macOS if model is small enough ( #2149 )
2024-01-22 17:40:06 -08:00
Daniel Hiltgen
ceea599494
Merge pull request #2150 from dhiltgen/default_version
...
Set a default version using git describe
2024-01-22 17:38:27 -08:00
Daniel Hiltgen
3005ec74b3
Set a default version using git describe
...
If a VERSION is not specified, this will generate a version string that
represents the state of the repo. For example `0.1.21-12-gffaf52e-dirty`
representing 12 commits away from the 0.1.21 tag, on commit ffaf52e (the `g` prefix stands for git),
and the tree is dirty.
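To make the format concrete, here is a small sketch that splits such a version string into its parts. It is illustrative Python, not part of the build scripts; the helper name and the returned keys are hypothetical, and the pattern assumes the `TAG-N-gSHA[-dirty]` layout of the example above.

```python
import re

# Assumed layout: TAG-N-gSHA with an optional -dirty suffix.
DESCRIBE_RE = re.compile(
    r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)(?P<dirty>-dirty)?$"
)

def parse_describe(version):
    """Hypothetical helper: split a describe-style version into fields."""
    m = DESCRIBE_RE.match(version)
    if m is None:
        # Exactly on a tag: there is no commit count or hash suffix.
        dirty = version.endswith("-dirty")
        tag = version[: -len("-dirty")] if dirty else version
        return {"tag": tag, "commits_ahead": 0, "sha": None, "dirty": dirty}
    return {
        "tag": m["tag"],
        "commits_ahead": int(m["count"]),
        "sha": m["sha"],  # abbreviated hash, without the leading "g"
        "dirty": m["dirty"] is not None,
    }
```

For the example string this yields tag `0.1.21`, 12 commits ahead, hash `ffaf52e`, and a dirty tree.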
2024-01-22 17:12:20 -08:00
Daniel Hiltgen
0759d8996e
Merge pull request #2148 from dhiltgen/intel_mac
...
Refine Accelerate usage on mac
2024-01-22 16:56:58 -08:00
Daniel Hiltgen
0f5b843319
Refine Accelerate usage on mac
...
For old macs, accelerate seems to cause crashes, but for
AVX2 capable macs, it does not.
2024-01-22 16:25:56 -08:00
Jeffrey Morgan
ffaf52e1e9
update submodule to 011e8ec577fd135cbc02993d3ea9840c516d6a1c
2024-01-22 15:16:54 -08:00
Michael Yang
940b10b036
Merge pull request #2144 from jmorganca/mxyng/update-faq
...
faq: update to use launchctl setenv
2024-01-22 13:46:57 -08:00
Daniel Hiltgen
3bc28736cd
Merge pull request #2143 from dhiltgen/llm_verbosity
...
Refine debug logging for llm
2024-01-22 13:19:16 -08:00
Michael Yang
93a756266c
faq: update to use launchctl setenv
2024-01-22 13:10:13 -08:00
Daniel Hiltgen
a0a829bf7a
Merge pull request #2142 from dhiltgen/debug_on_fail
...
Debug logging on init failure
2024-01-22 12:29:22 -08:00
Daniel Hiltgen
730dcfcc7a
Refine debug logging for llm
...
This wires up logging in llama.cpp to always go to stderr, and also
turns up logging if OLLAMA_DEBUG is set.
2024-01-22 12:26:49 -08:00
Daniel Hiltgen
27a2d5af54
Debug logging on init failure
2024-01-22 12:08:22 -08:00
Jeffrey Morgan
5f81a33f43
update submodule to 6f9939d ( #2115 )
2024-01-22 11:56:40 -08:00
Michael Yang
6225fde046
Merge pull request #2102 from jmorganca/mxyng/fix-create-override
...
fix: remove overwritten model layers
2024-01-22 09:37:48 -08:00
Meng Zhuo
069184562b
readline: drop unused min function ( #2134 )
2024-01-22 08:15:08 -08:00
Daniel Hiltgen
5576bb2348
Merge pull request #2130 from dhiltgen/more_faster
...
Make CPU builds parallel and customizable AMD GPUs
2024-01-21 16:14:12 -08:00
Daniel Hiltgen
2738837786
Merge pull request #2131 from dhiltgen/probe_cards_at_init
...
Probe GPUs before backend init
2024-01-21 16:13:47 -08:00
Daniel Hiltgen
ec3764538d
Probe GPUs before backend init
...
Detect potential error scenarios so we can fall back to CPU mode without
hitting asserts.
2024-01-21 15:59:38 -08:00
Daniel Hiltgen
df54c723ae
Make CPU builds parallel and customizable AMD GPUs
...
The linux build now supports parallel CPU builds to speed things up.
This also exposes AMD GPU targets as an optional setting for advanced
users who want to alter our default set.
2024-01-21 15:12:21 -08:00
Daniel Hiltgen
fa8c990e58
Merge pull request #2127 from dhiltgen/rocm_container
...
Combine the 2 Dockerfiles and add ROCm
2024-01-21 11:49:01 -08:00
Daniel Hiltgen
da72235ebf
Combine the 2 Dockerfiles and add ROCm
...
This renames Dockerfile.build to Dockerfile, and adds some new stages
to support 2 modes of building - the build_linux.sh script uses
intermediate stages to extract the artifacts for ./dist, and the default
build generates a container image usable by both cuda and rocm cards.
This required transitioning the x86 base to the rocm image to avoid
layer bloat.
2024-01-21 11:37:11 -08:00
Jeffrey Morgan
89c4aee29e
Unlock mutex when failing to load model ( #2117 )
2024-01-20 20:54:46 -05:00
Daniel Hiltgen
a447a083f2
Add compute capability 5.0, 7.5, and 8.0
2024-01-20 14:24:05 -08:00
Jeffrey Morgan
f32ea81b21
increase minimum overhead to 1024MiB ( #2114 )
2024-01-20 17:11:38 -05:00
Daniel Hiltgen
681a914990
Add support for CUDA 5.2 cards
2024-01-20 10:48:43 -08:00
Jeffrey Morgan
4c54f0ddeb
sign dylibs on macOS ( #2101 )
2024-01-19 19:24:11 -05:00
Michael Yang
c08dfaa23d
fix: remove overwritten model layers
...
if create overrides a manifest, first add the older manifest's layers to
the delete map so they can be cleaned up
2024-01-19 14:58:37 -08:00
Daniel Hiltgen
3b76e736ae
Merge pull request #2100 from dhiltgen/more_wsl_globs
...
More WSL paths
2024-01-19 13:41:08 -08:00
Daniel Hiltgen
552db98bf1
More WSL paths
2024-01-19 13:23:29 -08:00
Daniel Hiltgen
fdcdfef620
Merge pull request #2099 from dhiltgen/fix_cuda_model_swap
...
Switch to local dlopen symbols
2024-01-19 12:22:04 -08:00
Daniel Hiltgen
6a042438af
Switch to local dlopen symbols
2024-01-19 11:37:02 -08:00
Jeffrey Morgan
dc88cc3981
use gzip for runner embedding ( #2067 )
2024-01-19 13:23:03 -05:00
Daniel Hiltgen
62976087c6
Merge pull request #1999 from lainedfles/termux_android_cpu_only
...
Fix CPU-only build under Android Termux environment.
2024-01-18 17:16:53 -08:00
Self Denial
344342abdf
Restore dyn_ext_server.c since RTLD_DEEPBIND has been removed
2024-01-18 17:30:42 -07:00
Self Denial
eb76f3e379
Fix CPU-only build under Android Termux environment.
...
Update gpu.go initGPUHandles() to declare gpuHandles variable before
reading it. This resolves an "invalid memory address or nil pointer
dereference" error.
Update dyn_ext_server.c to avoid setting the RTLD_DEEPBIND flag under
__TERMUX__ (Android).
2024-01-18 17:25:33 -07:00
Michael Yang
d017e3d0a6
Merge pull request #2060 from jmorganca/mxyng/fix-show
...
fix show handler
2024-01-18 16:02:27 -08:00
Michael Yang
aac9ab4db7
fix show handler
2024-01-18 15:36:50 -08:00
Michael Yang
1f5b7ff976
Merge pull request #1932 from jmorganca/mxyng/api-fields
...
api: add model for all requests
2024-01-18 14:56:51 -08:00
Michael Yang
e299831e2c
Merge pull request #1958 from purificant/ci
...
ci: update setup-go action
2024-01-18 14:53:36 -08:00
Michael Yang
745b5934fa
add model to ModelResponse
2024-01-18 14:32:55 -08:00
Michael Yang
a38d88d828
api: add model for all requests
...
prefer using req.Model and fall back to req.Name
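The lookup order can be sketched as follows. This is illustrative Python rather than the Go handler code; the lowercase `model` and `name` keys are assumed JSON field names, and the helper is hypothetical.

```python
def resolve_model(request):
    """Hypothetical helper: pick the model identifier from a decoded request.

    `request` is a plain dict standing in for the JSON body. The newer
    "model" field wins; the older "name" field is used only when "model"
    is missing or empty.
    """
    return request.get("model") or request.get("name") or ""
```

Clients that still send only the older field keep working, while requests that set both resolve to the newer one.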
2024-01-18 14:31:37 -08:00
Daniel Hiltgen
abec7f06e5
Merge pull request #2056 from dhiltgen/slog
...
Mechanical switch from log to slog
2024-01-18 14:27:24 -08:00
Michael Yang
e5da190bac
Merge pull request #2020 from jmorganca/mxyng/install-fedora
...
install: pin fedora to max 37
2024-01-18 14:23:42 -08:00
Daniel Hiltgen
ecbfc0182f
Go bump to v1.21 to pick up slog
2024-01-18 14:12:57 -08:00
Daniel Hiltgen
fedd705aea
Mechanical switch from log to slog
...
A few obvious levels were adjusted, but generally everything mapped to "info" level.
2024-01-18 14:12:57 -08:00
Mike Bird
82ee019bfc
add open interpreter to list of extensions ( #2016 )
2024-01-18 13:59:39 -08:00
Sachin Sachdeva
ad9dbc2a04
Haystack Ollama Integration ( #2021 )
...
Updated readme with the web link for haystack ollama integration
2024-01-18 13:38:32 -08:00
Daniel Hiltgen
fccdf4c635
Merge pull request #1987 from xyproto/archlinux
...
Let gpu.go and gen_linux.sh also find CUDA on Arch Linux
2024-01-18 13:32:10 -08:00
Daniel Hiltgen
d450fb1d1e
Merge pull request #2055 from dhiltgen/cuda_docs
...
Refine the linux cuda/rocm developer docs
2024-01-18 12:07:31 -08:00
Daniel Hiltgen
df40b11d03
Merge pull request #2007 from dhiltgen/cpu_fallback
...
Add multiple CPU variants for Intel Mac
2024-01-18 11:32:29 -08:00
Daniel Hiltgen
9cd20b0ec8
Refine the linux cuda/rocm developer docs
2024-01-18 09:44:44 -08:00
Daniel Hiltgen
b992bf65fc
Disable arm64 for test phase
...
The runners are x86 so we can only run binaries that match.
2024-01-17 19:26:13 -08:00
Daniel Hiltgen
1b249748ab
Add multiple CPU variants for Intel Mac
...
This also refines the build process for the ext_server build.
2024-01-17 15:08:54 -08:00
Alexander F. Rødseth
cbe2adc78a
Merge branch 'main' into archlinux
2024-01-17 12:50:11 +01:00
Michael Yang
d5a7353357
Merge pull request #2026 from jmorganca/mxyng/fix-windows
...
fix: normalize name path before splitting
2024-01-16 16:58:42 -08:00
Michael Yang
96cfb62641
fix: normalize name path before splitting
2024-01-16 16:48:29 -08:00
Daniel Hiltgen
7d00b5d110
Merge pull request #1915 from dhiltgen/bump_llama_with_new_dep
...
Bump llama.cpp to b1842 and add new cuda lib dep
2024-01-16 13:36:49 -08:00
Daniel Hiltgen
795674dd90
Bump llama.cpp to b1842 and add new cuda lib dep
...
Upstream llama.cpp has added a new dependency with the
NVIDIA CUDA Driver Libraries (libcuda.so) which is part of the
driver distribution, not the general cuda libraries, and is not
available as an archive, so we can not statically link it. This may
introduce some additional compatibility challenges which we'll
need to keep an eye on.
2024-01-16 12:53:52 -08:00
Daniel Hiltgen
e282bdccdd
Merge pull request #1990 from dhiltgen/ci_mac_cross
...
Add macos cross-compile CI coverage
2024-01-16 12:31:37 -08:00
Michael Yang
d9bfb2f08f
install: pin fedora to max 37
...
repos for fedora 38 and newer do not exist as of this commit
```
$ dnf config-manager --add-repo https://developer.download.nvidia.com/compute/cuda/repos/fedora38/x86_64/cuda-fedora38.repo
Adding repo from: https://developer.download.nvidia.com/compute/cuda/repos/fedora38/x86_64/cuda-fedora38.repo
Status code: 404 for https://developer.download.nvidia.com/compute/cuda/repos/fedora38/x86_64/cuda-fedora38.repo (IP: 152.195.19.142)
Error: Configuration of repo failed
```
2024-01-16 11:45:21 -08:00
Michael Yang
598d6d5572
Merge pull request #1937 from jmorganca/mxyng/remove-client-py
...
remove client.py
2024-01-16 11:01:41 -08:00
Bruce MacDonald
a897e833b8
do not cache prompt ( #2018 )
...
- prompt cache causes inference to hang after some time
2024-01-16 13:48:05 -05:00
Patrick Devine
eef50accb4
Fix show parameters ( #2017 )
2024-01-16 10:34:44 -08:00
Michael Yang
05d53de7a1
Merge pull request #1968 from jmorganca/mxyng/fix-request-retry
...
fix: request retry with error
2024-01-16 10:33:50 -08:00
Daniel Hiltgen
8795447dad
Merge pull request #1966 from fpreiss/fpreiss/gen_linux_cuda_detection
...
improve cuda detection (rel. issue #1704 )
2024-01-14 18:00:11 -08:00
Daniel Hiltgen
b3035112a1
Add macos cross-compile CI coverage
2024-01-14 10:38:59 -08:00
Daniel Hiltgen
95ad9a9fc8
Merge pull request #1988 from dhiltgen/fix_intel_mac
...
Fix typo in arm mac arch script
2024-01-14 08:45:18 -08:00
Daniel Hiltgen
3ca5f69ce8
Fix typo in arm mac arch script
2024-01-14 08:32:57 -08:00
Daniel Hiltgen
cfa6337960
Merge pull request #1982 from dhiltgen/fix_intel_mac
...
Fix intel mac build
2024-01-14 08:26:46 -08:00
Alexander F. Rødseth
f4bf1d514f
Let gpu.go and gen_linux.sh also find CUDA on Arch Linux
2024-01-14 13:40:36 +01:00
Jeffrey Morgan
557110d0ba
Disable mmap with lora layers ( #1985 )
2024-01-13 23:36:31 -05:00
Daniel Hiltgen
2ecb247276
Fix intel mac build
...
Make sure we're building an x86 ext_server lib when cross-compiling
2024-01-13 14:46:34 -08:00
Jeffrey Morgan
288ef8ff95
add gcc -lstdc++ flag for linux cpu ( #1974 )
2024-01-13 03:53:00 -05:00
Jeffrey Morgan
4cf17990f7
use g++ to build libext_server.so on linux ( #1972 )
2024-01-13 03:12:42 -05:00
Michael Yang
27331ae3a8
download: add inactivity monitor
...
if a download part is inactive for some time, restart it
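One way to picture the idea is a per-part stall check. The sketch below is illustrative Python, not the project's Go downloader; the class name, method names, and timeout are hypothetical. Timestamps are passed in explicitly so the logic is easy to test.

```python
class PartMonitor:
    """Hypothetical stall detector for a single download part."""

    def __init__(self, timeout_seconds, now):
        self.timeout_seconds = timeout_seconds
        self.last_progress = now

    def record_progress(self, now):
        # Call whenever bytes arrive for this part.
        self.last_progress = now

    def stalled(self, now):
        # True once no bytes have arrived for the whole timeout window;
        # the caller would then cancel and restart the part.
        return now - self.last_progress >= self.timeout_seconds
```

In a real downloader the timestamps would come from a monotonic clock, and a restart would resume from the part's last completed offset.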
2024-01-12 15:23:15 -08:00
Michael Yang
b6c0ef1e70
Merge pull request #1961 from jmorganca/mxyng/rm-double-newline
...
remove double newlines in /set parameter
2024-01-12 15:18:19 -08:00
Michael Yang
356d178f6e
Merge pull request #1971 from jmorganca/mxyng/max-context-length
...
add max context length check
2024-01-12 15:10:25 -08:00
Michael Yang
eaed6f8c45
add max context length check
2024-01-12 14:54:07 -08:00
purificant
6a5bfc2ed6
update actions/setup-go
2024-01-12 22:27:25 +00:00
Michael Yang
cf29bd2d72
fix: request retry with error
...
this fixes a subtle bug with makeRequestWithRetry where an HTTP status
error on a retried request will potentially not return the right err
2024-01-12 13:32:27 -08:00
Fabian Preiss
905862e17b
improve cuda detection (rel. issue #1704 )
2024-01-12 21:59:19 +01:00
Patrick Devine
565f8a3c44
Convert the REPL to use /api/chat for interactive responses ( #1936 )
2024-01-12 12:05:52 -08:00
Michael Yang
5121b7ac9c
remove double newlines in /set parameter
2024-01-12 11:21:15 -08:00
Michael Yang
a70262c6b2
Update README.md
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
2024-01-12 09:43:04 -08:00
Tristram Oaten
40a0a90a88
Add group delete to uninstall instructions ( #1924 )
...
After executing the `userdel ollama` command, I saw this message:
```sh
$ sudo userdel ollama
userdel: group ollama not removed because it has other members.
```
Which reminded me that I had to remove the dangling group too. For completeness, the uninstall instructions should do this too.
Thanks!
2024-01-12 00:07:00 -05:00
Michael Yang
cbe20c4375
update readme
2024-01-11 16:24:37 -08:00
Michael Yang
5ffbbea1d7
remove client.py
2024-01-11 15:53:10 -08:00
Daniel Hiltgen
3773fb6465
Merge pull request #1935 from dhiltgen/cpu_fallback
...
Fix up the CPU fallback selection
2024-01-11 15:52:32 -08:00
Daniel Hiltgen
7427fa1387
Fix up the CPU fallback selection
...
The memory changes and multi-variant change had some merge
glitches I missed. This fixes them so we actually get the cpu llm lib
and best variant for the given system.
2024-01-11 15:27:06 -08:00
Michael Yang
f84537e0e0
Merge pull request #1934 from jmorganca/mxyng/fix-slices
...
fix build and lint
2024-01-11 14:36:20 -08:00
Michael Yang
d2be6387c9
fix typo
2024-01-11 14:25:21 -08:00
Michael Yang
d7af35d3d0
import fmt
2024-01-11 14:22:32 -08:00
Michael Yang
defc1dbd6e
use x/exp/slices
2024-01-11 14:20:13 -08:00
Daniel Hiltgen
de2fbdec99
Merge pull request #1819 from dhiltgen/multi_variant
...
Support multiple LLM libs; ROCm v5 and v6; Rosetta, AVX, and AVX2 compatible CPU builds
2024-01-11 14:00:48 -08:00
Eduard van Valkenburg
f5faf79aa1
Add semantic kernel to Readme ( #1931 )
2024-01-11 14:40:23 -05:00
Michael Yang
f4f939de28
Merge pull request #1552 from jmorganca/mxyng/lint-test
...
add lint and test on pull_request
2024-01-11 09:37:45 -08:00
Daniel Hiltgen
39928a42e8
Always dynamically load the llm server library
...
This switches darwin to dynamic loading, and refactors the code now that no
static linking of the library is used on any platform
2024-01-11 08:42:47 -08:00
Daniel Hiltgen
d88c527be3
Build multiple CPU variants and pick the best
...
This reduces the built-in linux version to not use any vector extensions
which enables the resulting builds to run under Rosetta on MacOS in
Docker. Then at runtime it checks for the actual CPU vector
extensions and loads the best CPU library available
2024-01-11 08:42:47 -08:00
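The runtime selection described in d88c527be3 can be pictured with a small sketch. This is an illustration in Python, not the project's code: the variant names, the `VARIANTS` table, and the `rank_cpu_variants` helper are all hypothetical, and the real implementation may detect CPU features and order libraries differently.

```python
from typing import Iterable, List

# Hypothetical variant table, most capable first: (library name, required CPU feature).
VARIANTS = [
    ("cpu_avx2", "avx2"),
    ("cpu_avx", "avx"),
    ("cpu", None),  # baseline build with no vector extensions
]


def rank_cpu_variants(cpu_features: Iterable[str]) -> List[str]:
    """Return the usable variants, best first, for the detected CPU features."""
    features = {f.lower() for f in cpu_features}
    return [name for name, needs in VARIANTS if needs is None or needs in features]
```

Returning the whole ranked list rather than a single name leaves room to try the next variant if the preferred one fails to load; the baseline entry is always present, so the list is never empty.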
Fabian Preiß
3bc8b9832b
fix gpu_test.go Error (same type) uint64->uint32 ( #1921 )
2024-01-11 08:22:23 -05:00
Jeffrey Morgan
ab6be852c7
revisit memory allocation to account for full kv cache on main gpu
2024-01-11 01:45:31 -05:00
Daniel Hiltgen
052b33b81b
DRY out the Dockerfile.build
2024-01-10 17:27:51 -08:00
Daniel Hiltgen
8da7bef05f
Support multiple variants for a given llm lib type
...
In some cases we may want multiple variants for a given GPU type or CPU.
This adds logic to have an optional Variant which we can use to select
an optimal library, but also allows us to try multiple variants in case
some fail to load.
This can be useful for scenarios such as ROCm v5 vs v6 incompatibility
or potentially CPU features.
2024-01-10 17:27:51 -08:00
Jeffrey Morgan
b24e8d17b2
Increase minimum CUDA memory allocation overhead and fix minimum overhead for multi-gpu ( #1896 )
...
* increase minimum cuda overhead and fix minimum overhead for multi-gpu
* fix multi gpu overhead
* limit overhead to 10% of all gpus
* better wording
* allocate fixed amount before layers
* fixed only includes graph alloc
2024-01-10 19:08:51 -05:00
Jeffrey Morgan
f83881390f
revert submodule back to 328b83de23b33240e28f4e74900d1d06726f5eb1
2024-01-10 18:42:39 -05:00
Daniel Hiltgen
ac70ab6761
Merge pull request #1914 from dhiltgen/smarter_cuda_detection
...
Smarter GPU Management library detection
2024-01-10 15:21:56 -08:00
Daniel Hiltgen
3c49c3ab0d
Harden GPU mgmt library lookup
...
When there are multiple management libraries installed on a system
not every one will be compatible with the current driver. This change
improves our management library algorithm to build up a set of discovered
libraries based on glob patterns, and then try all of them until we're able to
load one without error.
2024-01-10 15:06:41 -08:00
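The lookup strategy described in 3c49c3ab0d (collect candidates from glob patterns, then try each until one loads) might look roughly like the sketch below. It is written in Python purely for illustration; the `load_first_working` name, the `loader` parameter, and any patterns passed to it are assumptions, not code or paths from the repository.

```python
import glob
from typing import Callable, Iterable, Optional, Tuple


def load_first_working(
    patterns: Iterable[str],
    loader: Callable[[str], object],
) -> Optional[Tuple[str, object]]:
    """Expand each glob pattern, then try every candidate until one loads."""
    candidates = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            if path not in candidates:  # keep discovery order, drop duplicates
                candidates.append(path)

    for path in candidates:
        try:
            return path, loader(path)
        except OSError:
            # This copy is incompatible or unreadable; try the next one.
            continue
    return None
```

With `ctypes.CDLL` as the loader, a library that cannot be loaded raises `OSError`, so the loop simply moves on to the next discovered path and returns `None` only when every candidate has failed.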
Daniel Hiltgen
9754ae4c89
Support optional override of the target architectures
...
This can help speed up incremental builds when you're only testing one
architecture, like amd64. E.g.
BUILD_ARCH=amd64 ./scripts/build_linux.sh && scp ./dist/ollama-linux-amd64 test-system:
2024-01-10 14:43:24 -08:00
Jeffrey Morgan
224fbf2795
update submodule to commit 1fc2f265ff9377a37fd2c61eae9cd813a3491bea until its main branch is fixed
2024-01-10 17:03:15 -05:00
Jeffrey Morgan
2c6e8f5248
Update submodule to 6efb8eb30e7025b168f3fda3ff83b9b386428ad6 ( #1885 )
...
* update submodule to `6efb8eb30e7025b168f3fda3ff83b9b386428ad6`
* unblock condition variable in `update_slots` when closing server
2024-01-10 16:48:38 -05:00
Jeffrey Morgan
34344d801c
clean up cmake build directory when cross compiling macOS builds
2024-01-09 17:13:56 -05:00
Robin Glauser
e868c8a5c7
Update api.md ( #1878 )
...
Fixed assistant in the example response.
2024-01-09 16:21:17 -05:00
Jeffrey Morgan
c336693f07
calculate overhead based on number of gpu devices ( #1875 )
2024-01-09 15:53:33 -05:00
Daniel Hiltgen
e89dc1d54b
Merge pull request #1874 from dhiltgen/correct_cuda_min
...
Set correct CUDA minimum compute capability version
2024-01-09 11:37:22 -08:00
Daniel Hiltgen
1961a81f03
Set correct CUDA minimum compute capability version
...
If you attempt to run the current CUDA build on compute capability 5.2
cards, you'll hit the following failure:
cuBLAS error 15 at ggml-cuda.cu:7956: the requested functionality is not supported
2024-01-09 11:28:24 -08:00
Jeffrey Morgan
8a8c7e7f8d
only build for metal on arm64
2024-01-09 13:51:08 -05:00
Jeffrey Morgan
6df83e6daa
update rough cuda overhead estimate to 15% + 384MiB
2024-01-09 13:51:08 -05:00
Michael Yang
f921e2696e
typo
2024-01-09 09:45:42 -08:00
Michael Yang
4a33cede20
remove unused fields and functions
2024-01-09 09:37:40 -08:00
Michael Yang
f95d2f25f3
fix temporary history file permissions
2024-01-09 09:36:58 -08:00
Michael Yang
2b9892a808
fix(windows): modelpath and list
2024-01-09 09:36:58 -08:00
Michael Yang
2bb2bdd5d4
fix lint
2024-01-09 09:36:58 -08:00
Michael Yang
acfc376efd
add .golangci.yaml
2024-01-09 09:36:58 -08:00
Michael Yang
997253143f
add lint and test on pull_request
2024-01-09 09:36:58 -08:00
Michael Yang
62023177f6
Merge pull request #1614 from jmorganca/mxyng/fix-set-template
...
fix: set template without triple quotes
2024-01-09 09:36:24 -08:00
Jeffrey Morgan
6164f378f2
revert cuda overhead to 20%
2024-01-09 00:54:29 -05:00
Jeffrey Morgan
f387e9631b
use runner if cuda alloc won't fit
2024-01-09 00:44:34 -05:00
Jeffrey Morgan
6566387ae3
add TODO for cuda overhead
2024-01-09 00:28:03 -05:00
Jeffrey Morgan
37708931fb
update cuda overhead to 20% to fix crashes when switching between models and large context sizes
2024-01-09 00:05:23 -05:00
Jeffrey Morgan
f6cb0a553c
update cuda overhead to 15% or 400MiB
2024-01-08 23:45:45 -05:00
Jeffrey Morgan
2680078c13
fix build on linux
2024-01-08 23:44:13 -05:00
Jeffrey Morgan
f1b7e5f560
update overhead to 15%
2024-01-08 23:37:45 -05:00
Jeffrey Morgan
cb534e6ac2
use 10% vram overhead for cuda
2024-01-08 23:17:44 -05:00
Jeffrey Morgan
58ce2d8273
better estimate scratch buffer size
2024-01-08 21:32:44 -05:00
Jeffrey Morgan
18ddf6d57d
fix windows build
2024-01-08 20:04:01 -05:00
Michael Yang
61e6502449
Merge pull request #1818 from jmorganca/mxyng/fix-alt-prompt
...
fix(cmd): history in alt prompt
2024-01-08 13:48:34 -08:00
Jeffrey Morgan
08f1e18965
Offload layers to GPU based on new model size estimates ( #1850 )
...
* select layers based on estimated model memory usage
* always account for scratch vram
* don't load +1 layers
* better estimation for graph alloc
* Update gpu/gpu_darwin.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
* Update llm/llm.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
* Update llm/llm.go
* add overhead for cuda memory
* Update llm/llm.go
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
* fix build error on linux
* address comments
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2024-01-08 16:42:00 -05:00
Bruce MacDonald
7e8f7c8358
remove ggml automatic re-pull ( #1856 )
2024-01-08 14:41:01 -05:00
Bruce MacDonald
3f3eb19a3b
document response in modelfile template variables ( #1428 )
2024-01-08 14:38:51 -05:00
Daniel Hiltgen
059ae4585e
Merge pull request #1834 from dhiltgen/old_cuda
...
Detect very old CUDA GPUs and fall back to CPU
2024-01-07 10:39:49 -08:00
Daniel Hiltgen
6347f501ca
Merge pull request #1828 from dhiltgen/fix_llava
...
Accept windows paths for image processing
2024-01-07 09:05:46 -08:00
Jeffrey Morgan
5feec959ad
don't use -Wall in static build ( #1833 )
2024-01-07 10:39:19 -05:00
Jeffrey Morgan
dbdd50b283
add -DCMAKE_SYSTEM_NAME=Darwin cmake flag ( #1832 )
2024-01-07 00:46:17 -05:00
Daniel Hiltgen
d74ce6bd4f
Detect very old CUDA GPUs and fall back to CPU
...
If we try to load the CUDA library on an old GPU, it panics and crashes
the server. This checks the compute capability before we load the
library so we can gracefully fall back to CPU mode.
2024-01-06 21:40:29 -08:00
Guilherme Baptista
57942b4676
Update README.md - Community Integrations - Ollama for Ruby ( #1830 )
2024-01-06 22:31:39 -05:00
Daniel Hiltgen
e0d05b0f1e
Accept windows paths for image processing
...
This enhances our regex to support windows style paths. The regex will
match invalid path specifications, but we'll still validate file
existence and filter out mismatches
2024-01-06 10:50:27 -08:00
Daniel Hiltgen
2d9dd14f27
Merge pull request #1697 from dhiltgen/win_docs
...
Add windows native build instructions
2024-01-05 19:34:20 -08:00
Jeffrey Morgan
1caa56128f
add cuda lib path for nvidia container toolkit
2024-01-05 21:10:37 -05:00
Michael Yang
0101e76dbe
Merge pull request #1797 from sublimator/nd-allow-extension-origins-still-needs-explicit-listing-2024-01-05
...
fix: allow extension origins (still needs explicit listing), fixes #1686
2024-01-05 17:20:09 -08:00
Michael Yang
2ef9352b94
fix(cmd): history in alt mode
2024-01-05 16:20:02 -08:00
Michael Yang
5580ae2472
fix: set template without triple quotes
2024-01-05 15:51:33 -08:00
Bruce MacDonald
3a9f447141
only pull gguf model if already exists ( #1817 )
2024-01-05 18:50:00 -05:00
Patrick Devine
9c2941e61b
switch api for ShowRequest to use the name field ( #1816 )
2024-01-05 15:06:43 -08:00
Patrick Devine
238ac5e765
Add unit tests for Parser ( #1815 )
2024-01-05 14:04:31 -08:00
Bruce MacDonald
4f4980b66b
simplify ggml update logic ( #1814 )
...
- additional information is now available in show response, use this to pull gguf before running
- make gguf updates cancellable
2024-01-05 15:22:32 -05:00
Patrick Devine
22e93efa41
add show info command and fix the modelfile
2024-01-05 12:20:05 -08:00
Patrick Devine
2909dce894
split up interactive generation
2024-01-05 12:20:05 -08:00
Jeffrey Morgan
df32537312
gpu: read memory info from all cuda devices ( #1802 )
...
* gpu: read memory info from all cuda devices
* add `LOOKUP_SIZE` constant
* better constant name
* address comments
2024-01-05 11:25:58 -05:00
Bruce MacDonald
3367b5f3df
remove unused generate patches ( #1810 )
2024-01-05 11:25:45 -05:00
Matt Williams
46edbbc518
Merge pull request #1801 from jmorganca/mattw/correctdockerlink
2024-01-04 19:20:45 -08:00
Michael Yang
d2ff18cd6b
Merge pull request #1791 from jmorganca/mxyng/update-build
...
update Dockerfile.build
2024-01-04 19:13:44 -08:00
Matt Williams
df086d3c8c
fix docker doc to point to hub
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2024-01-04 18:42:23 -08:00
Nicholas Dudfield
8baaaa39c0
Allow extension origins (still needs explicit listing), fixes #1686
2024-01-05 09:06:47 +07:00
Michael Yang
f9961c70ae
update build
2024-01-04 17:34:38 -08:00
Daniel Hiltgen
cd8fad3398
Merge pull request #1790 from dhiltgen/llm_code_shuffle
...
Cleanup stale submodule
2024-01-04 13:47:25 -08:00
Daniel Hiltgen
9983fa5f4e
Cleanup stale submodule
...
If the tree has a stale submodule, make sure we clean it up first
2024-01-04 13:40:16 -08:00
Daniel Hiltgen
dfda91c2ee
Merge pull request #1788 from dhiltgen/llm_code_shuffle
...
Revamp code layout for the llm directory and llama.cpp submodule
2024-01-04 13:14:28 -08:00
Daniel Hiltgen
fac9060da5
Init submodule with new path
2024-01-04 13:00:13 -08:00
Daniel Hiltgen
a554616f8e
remove old llama.cpp submodule path
2024-01-04 12:12:21 -08:00
Daniel Hiltgen
77d96da94b
Code shuffle to clean up the llm dir
2024-01-04 12:12:05 -08:00
Brian Murray
0d6e3565ae
Add embeddings to API ( #1773 )
2024-01-04 15:00:52 -05:00
Daniel Hiltgen
b5939008a1
Merge pull request #1785 from dhiltgen/win_native_cli
...
Load dynamic cpu lib on windows
2024-01-04 08:55:01 -08:00
Daniel Hiltgen
e9ce91e9a6
Load dynamic cpu lib on windows
...
On linux, we link the CPU library in to the Go app and fall back to it
when no GPU match is found. On windows we do not link in the CPU library
so that we can better control our dependencies for the CLI. This fixes
the logic so we correctly fall back to the dynamic CPU library
on windows.
2024-01-04 08:41:41 -08:00
Bruce MacDonald
4ad6c9b11f
fix: pull either original model or from model on create ( #1774 )
2024-01-04 01:34:38 -05:00
Jeffrey Morgan
c0285158a9
tweak memory requirements error text
2024-01-03 19:47:18 -05:00
Jeffrey Morgan
77a66df72c
add macOS memory check for 47B models
2024-01-03 19:46:16 -05:00
Jeffrey Morgan
5b4837f881
remove unused filetype check
2024-01-03 19:45:39 -05:00
Jeffrey Morgan
29340c2e62
update cmake flags for amd64 macOS ( #1780 )
...
* update cmake flags for intel macOS
* remove `LLAMA_K_QUANTS`
* put back `CMAKE_OSX_DEPLOYMENT_TARGET` and disable `LLAMA_F16C`
2024-01-03 19:22:15 -05:00
Daniel Hiltgen
d5ec730354
Merge pull request #1779 from dhiltgen/refined_amd_gpu_list
...
Improve maintainability of Radeon card list
2024-01-03 16:18:57 -08:00
Daniel Hiltgen
8bed487aba
Merge pull request #1778 from dhiltgen/wsl1
...
Fail fast on WSL1 while allowing on WSL2
2024-01-03 16:18:41 -08:00
Daniel Hiltgen
c1a10a6e9b
Merge pull request #1781 from dhiltgen/cpu_only_build
...
Fix CPU only builds
2024-01-03 16:18:25 -08:00
Daniel Hiltgen
ddbfa6fe31
Fix CPU only builds
...
Go embed doesn't like it when there are no matching files, so put
a dummy placeholder in to allow building without any GPU support.
If no "server" library is found, it's safely ignored at runtime.
2024-01-03 16:08:34 -08:00
Daniel Hiltgen
2fcd41ef81
Fail fast on WSL1 while allowing on WSL2
...
This prevents users from accidentally installing on WSL1 with instructions
guiding how to upgrade their WSL instance to version 2. Once running WSL2
if you have an NVIDIA card, you can follow their instructions to set up
GPU passthrough and run models on the GPU. This is not possible on WSL1.
2024-01-03 16:02:32 -08:00
Daniel Hiltgen
16f4603b67
Improve maintainability of Radeon card list
...
This moves the list of AMD GPUs to an easier to maintain list which
should make it easier to update over time.
2024-01-03 15:16:56 -08:00
Daniel Hiltgen
1184686649
Merge pull request #1776 from dhiltgen/render_group
...
Add ollama user to render group for Radeon support
2024-01-03 13:07:54 -08:00
Daniel Hiltgen
2588cb2daa
Add ollama user to render group for Radeon support
...
For the ROCm libraries to access the driver, we need to add the ollama user
to the render group.
2024-01-03 12:56:31 -08:00
Jeffrey Morgan
c7ea8f237e
set num_gpu to 1 only by default on darwin arm64 ( #1771 )
2024-01-03 14:10:29 -05:00
Bruce MacDonald
0b3118e0af
fix: relay request opts to loaded llm prediction ( #1761 )
2024-01-03 12:01:42 -05:00
Daniel Hiltgen
05face44ef
Merge pull request #1683 from dhiltgen/fix_windows_test
...
Fix windows system memory lookup
2024-01-03 09:00:39 -08:00
Daniel Hiltgen
a2ad952440
Fix windows system memory lookup
...
This refines the gpu package error handling and fixes a bug with the
system memory lookup on windows.
2024-01-03 08:50:01 -08:00
Daniel Hiltgen
5fea4410be
Merge pull request #1680 from dhiltgen/better_patching
...
Refactor how we augment llama.cpp and refine windows native build
2024-01-03 08:10:17 -08:00
Bruce MacDonald
b846eb64d0
Fix template api doc description ( #1661 )
2024-01-03 11:00:59 -05:00
Cole Gillespie
3c5dd9ed1d
Update README.md ( #1766 )
2024-01-03 10:44:22 -05:00
Jeffrey Morgan
b17ccd0542
Update import.md
2024-01-02 22:28:18 -05:00
Patrick Devine
d0409f772f
keyboard shortcut help ( #1764 )
2024-01-02 18:04:12 -08:00
Jeffrey Morgan
ec261422af
use docker build in build scripts
2024-01-02 19:32:54 -05:00
Daniel Hiltgen
0498f7ce56
Get rid of one-line llama.log
...
This one log line was triggering a single line llama.log to be generated
in the pwd of the server
2024-01-02 15:36:16 -08:00
Daniel Hiltgen
738a8d12eb
Rename the ollama cmakefile
2024-01-02 15:36:16 -08:00
Daniel Hiltgen
d966b730ac
Switch windows build to fully dynamic
...
Refactor where we store build outputs, and support a fully dynamic loading
model on windows so the base executable has no special dependencies thus
doesn't require a special PATH.
2024-01-02 15:36:16 -08:00
Daniel Hiltgen
9a70aecccb
Refactor how we augment llama.cpp
...
This changes the model for llama.cpp inclusion so we're not applying a patch,
but instead have the C++ code directly in the ollama tree, which should make it
easier to refine and update over time.
2024-01-02 15:35:55 -08:00
Karim ElGhandour
22cd5eaab6
Added Ollama-SwiftUI to integrations ( #1747 )
2024-01-02 09:47:50 -05:00
Dane Madsen
304a8799ca
Update README.md ( #1757 )
2024-01-02 09:47:08 -05:00
Jeffrey Morgan
2a2fa3c329
api.md cleanup & formatting
2023-12-27 14:32:35 -05:00
Jeffrey Morgan
55978c1dc9
clean up cache api option
2023-12-27 14:27:45 -05:00
Jeffrey Morgan
d4ebdadbe7
enable cache_prompt by default
2023-12-27 14:23:42 -05:00
Daniel Hiltgen
e201efa14b
Add windows native build instructions
2023-12-25 08:31:34 -08:00
Icelain
c5f21f73a4
follow best practices by adding resp.Body.Close() ( #1708 )
2023-12-25 09:01:37 -05:00
Jeffrey Morgan
371bc73531
Update README.md
2023-12-24 11:54:08 -05:00
Jeffrey Morgan
c651d8b824
Update README.md
2023-12-23 11:18:12 -05:00
Daniel Hiltgen
cf50ef5b51
Merge pull request #1684 from dhiltgen/tag_integration_tests
...
Guard integration tests with a tag
2023-12-22 16:43:41 -08:00
Daniel Hiltgen
697bea6939
Guard integration tests with a tag
...
This should help CI avoid running the integration test logic in a
container where it's not currently possible.
2023-12-22 16:33:27 -08:00
K0IN
10da41d677
Add Cache flag to api ( #1642 )
2023-12-22 17:16:20 -05:00
Bruce MacDonald
db356c8519
post-response templating ( #1427 )
2023-12-22 17:07:05 -05:00
Jeffrey Morgan
b80081022f
cache docker builds in build_linux.sh
2023-12-22 16:01:20 -05:00
Matt Williams
790457398a
Merge pull request #1677 from jmorganca/mattw/docrunupdate
...
update where are models stored q
2023-12-22 09:56:27 -08:00
Matt Williams
511069a2a5
update where are models stored q
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-12-22 09:48:44 -08:00
Matt Williams
5a85070c22
Update readmes, requirements, packagejsons, etc for all examples ( #1452 )
...
Most of the examples needed updates of Readmes to show how to run them. Some of the requirements.txt files had extra content that wasn't needed, or were missing altogether. Apparently some folks like to run npm start to run typescript, so a script was added to all typescript examples which
hadn't been done before.
Basically just a lot of cleanup.
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-12-22 09:10:41 -08:00
Matt Williams
291700c92d
Clean up documentation ( #1506 )
...
* Clean up documentation
Will probably need to update with PRs for new release.
Signed-off-by: Matt Williams <m@technovangelist.com >
* Correcting to fit in 0.1.15 changes
Signed-off-by: Matt Williams <m@technovangelist.com >
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* addressing comments
Signed-off-by: Matt Williams <m@technovangelist.com >
* more api cleanup
Signed-off-by: Matt Williams <m@technovangelist.com >
* its llava not llama
Signed-off-by: Matt Williams <m@technovangelist.com >
* Update docs/troubleshooting.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Updated hosting to server and documented all env vars
Signed-off-by: Matt Williams <m@technovangelist.com >
* remove last of the cli descriptions
Signed-off-by: Matt Williams <m@technovangelist.com >
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* update further per conversation with jeff earlier today
Signed-off-by: Matt Williams <m@technovangelist.com >
* cleanup the doc readme
Signed-off-by: Matt Williams <m@technovangelist.com >
* move upgrade to faq
Signed-off-by: Matt Williams <m@technovangelist.com >
* first change
Signed-off-by: Matt Williams <m@technovangelist.com >
* updated
Signed-off-by: Matt Williams <m@technovangelist.com >
* Update docs/faq.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* examples in parent
Signed-off-by: Matt Williams <m@technovangelist.com >
* add example for create model.
Signed-off-by: Matt Williams <m@technovangelist.com >
* update faq
Signed-off-by: Matt Williams <m@technovangelist.com >
* update create model api
Signed-off-by: Matt Williams <m@technovangelist.com >
* Update docs/api.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/faq.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/troubleshooting.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* update the readme in docs
Signed-off-by: Matt Williams <m@technovangelist.com >
* update a few more things
Signed-off-by: Matt Williams <m@technovangelist.com >
* Update docs/troubleshooting.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/faq.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update README.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/modelfile.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
* Update docs/troubleshooting.md
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
---------
Signed-off-by: Matt Williams <m@technovangelist.com >
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
2023-12-22 09:10:01 -08:00
Daniel Hiltgen
9db28af84e
Merge pull request #1675 from dhiltgen/less_verbose
...
Quiet down llama.cpp logging by default
2023-12-22 08:57:17 -08:00
Daniel Hiltgen
e5202eb687
Quiet down llama.cpp logging by default
...
By default builds will now produce non-debug and non-verbose binaries.
To enable verbose logs in llama.cpp and debug symbols in the
native code, set `CGO_CFLAGS=-g`
2023-12-22 08:47:18 -08:00
Daniel Hiltgen
96fb441abd
Merge pull request #1146 from dhiltgen/ext_server_cgo
...
Add cgo implementation for llama.cpp
2023-12-22 08:16:31 -08:00
Daniel Hiltgen
495c06e4a6
Fix doc glitch
2023-12-21 18:21:31 -08:00
Daniel Hiltgen
fa24e73b82
Remove CPU build, fixup linux build script
2023-12-21 18:21:31 -08:00
Daniel Hiltgen
325d74985b
Fix CPU performance on hyperthreaded systems
...
The default thread count logic was broken and produced twice as many
threads as it should on a hyperthreading CPU,
resulting in thrashing and poor performance.
2023-12-21 16:23:36 -08:00
Bruce MacDonald
fabf2f3467
allow for starting llava queries with filepath ( #1549 )
2023-12-21 13:20:59 -05:00
Daniel Hiltgen
d9cd3d9667
Revive windows build
...
The windows native setup still needs some more work, but this gets it building
again and if you set the PATH properly, you can run the resulting exe on a cuda system.
2023-12-20 17:21:54 -08:00
Patrick Devine
a607d922f0
add FAQ for slow networking in WSL2 ( #1646 )
2023-12-20 16:27:24 -08:00
Daniel Hiltgen
7555ea44f8
Revamp the dynamic library shim
...
This switches the default llama.cpp to be CPU based, and builds the GPU variants
as dynamically loaded libraries which we can select at runtime.
This also bumps the ROCm library to version 6 given 5.7 builds don't work
on the latest ROCm library that just shipped.
2023-12-20 14:45:57 -08:00
Jeffrey Morgan
df06812494
Update api.md
2023-12-20 08:47:53 -05:00
Daniel Hiltgen
1d1eb1688c
Additional nvidia-ml path to check
2023-12-19 15:52:34 -08:00
Michael Yang
23dc179350
Merge pull request #1619 from jmorganca/mxyng/fix-version-test
...
fix(test): use real version string for comparison
2023-12-19 15:48:52 -08:00
Michael Yang
63aac0edc5
fix(test): use real version string for comparison
2023-12-19 15:03:02 -08:00
Daniel Hiltgen
6558f94ed0
Fix darwin intel build
2023-12-19 13:32:24 -08:00
Erick Ghaumez
1ca484f67e
Add Langchain Dart library ( #1564 )
...
* Add Langchain Dart
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-19 14:04:52 -05:00
Jeffrey Morgan
72b0c32fe9
Update README.md
2023-12-19 12:59:22 -05:00
Jeffrey Morgan
68c28224f8
Update README.md
2023-12-19 12:59:03 -05:00
Daniel Hiltgen
54dbfa4c4a
Carry ggml-metal.metal as payload
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
5646826a79
Add WSL2 path to nvidia-ml.so library
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
3269535a4c
Refine handling of shim presence
...
This allows the CPU only builds to work on systems with Radeon cards
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
1b991d0ba9
Refine build to support CPU only
...
If someone checks out the ollama repo and doesn't install the CUDA
library, this will ensure they can build a CPU only version
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
51082535e1
Add automated test for multimodal
...
A simple test case that verifies llava:7b can read text in an image
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
9adca7f711
Bump llama.cpp to b1662 and set n_parallel=1
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
89bbaafa64
Build linux using ubuntu 20.04
...
This changes the container-based linux build to use an older Ubuntu
distro to improve our compatibility matrix for older user machines
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
35934b2e05
Adapted rocm support to cgo based llama.cpp
2023-12-19 09:05:46 -08:00
65a
f8ef4439e9
Use build tags to generate accelerated binaries for CUDA and ROCm on Linux.
...
The build tags rocm or cuda must be specified to both go generate and go build.
ROCm builds should have both ROCM_PATH set (and the ROCM SDK present) as well
as CLBlast installed (for GGML) and CLBlast_DIR set in the environment to the
CLBlast cmake directory (likely /usr/lib/cmake/CLBlast). Build tags are also
used to switch VRAM detection between cuda and rocm implementations, using
added "accelerator_foo.go" files which contain architecture specific functions
and variables. accelerator_none is used when no tags are set, and a helper
function addRunner will ignore it if it is the chosen accelerator. Fix go
generate commands, thanks @deadmeu for testing.
2023-12-19 09:05:46 -08:00
Daniel Hiltgen
d4cd695759
Add cgo implementation for llama.cpp
...
Run the server.cpp directly inside the Go runtime via cgo
while retaining the LLM Go abstractions.
2023-12-19 09:05:46 -08:00
Bruce MacDonald
5e7fd6906f
Update images.go
2023-12-19 09:05:46 -08:00
Bruce MacDonald
811b1f03c8
deprecate ggml
...
- remove ggml runner
- automatically pull gguf models when ggml detected
- tell users to update to gguf in the case automatic pull fails
Co-Authored-By: Jeffrey Morgan <jmorganca@gmail.com >
2023-12-19 09:05:46 -08:00
Matt Williams
ed195f3562
Merge pull request #1595 from pgibler/main
...
Added cmdh to community section in README
2023-12-18 20:55:18 -08:00
Matt Williams
e0d0072ef1
Merge pull request #1592 from jmorganca/mattw/examplepruning
...
Let's get rid of these old modelfile examples
2023-12-18 20:29:48 -08:00
pgibler
620a2ffcfb
Added cmdh to community section in README
2023-12-18 22:04:40 -05:00
Matt Williams
d287013f24
Let's get rid of these old modelfile examples
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-12-18 17:47:33 -08:00
Jeffrey Morgan
6b5bdfa6c9
update runner submodule
2023-12-18 17:33:46 -05:00
Jeffrey Morgan
c063ee4af0
update runner submodule to fix hipblas build
2023-12-18 15:41:13 -05:00
Bruce MacDonald
d99fa6ce0a
send empty messages on last chat response ( #1530 )
2023-12-18 14:23:38 -05:00
Patrick Devine
3948c6ea06
add magic header for unit tests ( #1558 )
2023-12-18 10:41:02 -08:00
Jeffrey Morgan
b85982eb91
update runner submodule
2023-12-18 12:43:31 -05:00
Patrick Devine
86b0dd4b16
add API create/copy handlers ( #1541 )
2023-12-15 11:59:18 -08:00
Augustinas Malinauskas
f728738427
README with Enchanted iOS App ( #1529 )
...
* feat(docs): README with Enchanted iOS app
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-15 14:37:29 -05:00
Ian Purton
115048a0d8
Added Bionic GPT as a front end. ( #1463 )
...
* Added Bionic GPT as a front end.
* Update README.md
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-15 14:33:04 -05:00
Bruce MacDonald
1b417a7836
use exp slices for go 1.20 compatibility ( #1544 )
2023-12-15 14:15:56 -05:00
Patrick Devine
0174665d0e
add API tests for list handler ( #1535 )
2023-12-14 18:18:25 -08:00
Patrick Devine
630518f0d9
Add unit test of API routes ( #1528 )
2023-12-14 16:47:40 -08:00
Bruce MacDonald
6e16098a60
remove sample_count from docs ( #1527 )
...
this info has not been returned from these endpoints in some time
2023-12-14 17:49:00 -05:00
Bruce MacDonald
6ee8c80199
restore model load duration on generate response ( #1524 )
...
* restore model load duration on generate response
- set model load duration on generate and chat done response
- calculate createAt time when response created
* remove checkpoints predict opts
* Update routes.go
2023-12-14 12:15:50 -05:00
Jeffrey Morgan
31f0551dab
Update runner to support mixtral and mixture of experts (MoE) ( #1475 )
2023-12-13 17:15:10 -05:00
Jeffrey Morgan
4a1abfe4fa
fix tests
2023-12-13 14:42:30 -05:00
Jeffrey Morgan
bbd41494bf
add multimodal to README.md
2023-12-13 14:38:47 -05:00
Jeffrey Morgan
fedba24a63
Docs for multimodal support ( #1485 )
...
* add multimodal docs
* add chat api docs
* consistency between `/api/generate` and `/api/chat`
* simplify docs
2023-12-13 13:59:33 -05:00
pepperoni21
e3b090dbc5
Added message format for chat api ( #1488 )
2023-12-13 11:21:23 -05:00
Patrick Devine
d9e60f634b
add image support to the chat api ( #1490 )
2023-12-12 13:28:58 -08:00
Michael Yang
4251b342de
Merge pull request #1469 from jmorganca/mxyng/model-types
...
remove per-model types
2023-12-12 12:27:03 -08:00
Jeffrey Morgan
0a9d348023
Fix issues with /set template and /set system ( #1486 )
2023-12-12 14:43:19 -05:00
Bruce MacDonald
3144e2a439
exponential back-off ( #1484 )
2023-12-12 12:33:02 -05:00
Bruce MacDonald
c0960e29b5
retry on concurrent request failure ( #1483 )
...
- remove parallel
2023-12-12 12:14:35 -05:00
ruecat
5314fc9b63
Fix Readme "Database -> MindsDB" link ( #1479 )
2023-12-12 10:26:13 -05:00
Jorge Torres
a36b5fef3b
Update README.md ( #1412 )
2023-12-11 18:05:10 -05:00
Patrick Devine
910e9401d0
Multimodal support ( #1216 )
...
---------
Co-authored-by: Matt Apperson <mattapperson@Matts-MacBook-Pro.local >
2023-12-11 13:56:22 -08:00
Michael Yang
56ffc3023a
remove per-model types
...
mostly replaced by decoding tensors except ggml models which only
support llama
2023-12-11 09:40:21 -08:00
Bruce MacDonald
7a1b37ac64
os specific ctrl-z ( #1420 )
2023-12-11 10:48:14 -05:00
Jeffrey Morgan
5d4d2e2c60
update docs with chat completion api
2023-12-10 13:53:36 -05:00
Jeffrey Morgan
7db5bcf73b
fix go-staticcheck warning
2023-12-10 11:44:27 -05:00
Jeffrey Morgan
fa2f095bd9
fix model name returned by /api/generate being different than the model name provided
2023-12-10 11:42:15 -05:00
Jeffrey Morgan
045b855db9
fix error on accumulating final chat response
2023-12-10 11:24:39 -05:00
Jeffrey Morgan
32064a0646
fix empty response when receiving runner error
2023-12-10 10:53:38 -05:00
Jeffrey Morgan
d9a250e9b5
seek to end of file when decoding older model formats
2023-12-09 21:14:35 -05:00
Jeffrey Morgan
944519ed16
seek to eof for older model binaries
2023-12-09 20:48:57 -05:00
Jeffrey Morgan
2dd040d04c
do not use --parallel 2 for old runners
2023-12-09 20:17:33 -05:00
Bruce MacDonald
bbe41ce41a
fix: parallel queueing race condition caused silent failure ( #1445 )
...
* fix: queued request failures
- increase parallel requests to 2 to complete queued request, queueing is managed in ollama
* log stream errors
2023-12-09 14:14:02 -05:00
Jeffrey Morgan
9e1406e4ed
Don't expose model information in /api/generate
2023-12-09 02:05:43 -08:00
Jeffrey Morgan
b74580c913
Update api.md
2023-12-08 16:02:07 -08:00
Bruce MacDonald
7e9405fd07
fix: encode full previous prompt in context ( #1424 )
2023-12-08 16:53:51 -05:00
Bruce MacDonald
3b0b8930d4
fix: only flush template in chat when current role encountered ( #1426 )
2023-12-08 16:44:24 -05:00
Bruce MacDonald
e3f925fc1b
fix: restore modelfile system in prompt template ( #1425 )
2023-12-08 14:20:19 -05:00
Jeffrey Morgan
2a2289fb6b
Update api.md
2023-12-08 09:36:45 -08:00
Matt Williams
dd427f499a
Merge pull request #1419 from jmorganca/mattw/typescript-simplechat
...
Simple chat example for typescript
2023-12-07 14:42:24 -08:00
Michael Yang
2ae573c7ed
Merge pull request #1421 from jmorganca/mxyng/fix-newline
...
fix redundant newline
2023-12-07 13:47:23 -08:00
Matt Williams
02fe26c44b
update the readme as per bruce
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-12-07 13:46:30 -08:00
Michael Yang
16c7548460
fix redundant newline
2023-12-07 13:44:45 -08:00
Matt Williams
fa75998c0d
Update examples/typescript-simplechat/readme.md
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-07 13:40:54 -08:00
Matt Williams
5344f886c8
Update examples/typescript-simplechat/client.ts
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-07 13:40:37 -08:00
Matt Williams
6cc823c9b5
Update examples/typescript-simplechat/client.ts
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-07 13:39:59 -08:00
Matt Williams
b84d34e632
Update examples/typescript-simplechat/readme.md
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-07 13:39:33 -08:00
Matt Williams
30229a913c
Update examples/typescript-simplechat/client.ts
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-07 13:39:24 -08:00
Matt Williams
1ade380bd7
Simple chat example for typescript
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-12-07 11:48:25 -08:00
Jeffrey Morgan
ba264e9da8
add future version note to chat api docs
2023-12-07 09:42:15 -08:00
Matt Williams
a2405ec831
Merge pull request #1409 from jmorganca/mattw/python-simplechat
...
Simple chat example
2023-12-06 15:49:45 -08:00
Matt Williams
ce809bb529
Merge branch 'mattw/python-simplechat' of github.com:jmorganca/ollama into mattw/python-simplechat
2023-12-06 15:48:42 -08:00
Matt Williams
76bc4d0458
Cleanup as per Bruce
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-12-06 15:44:40 -08:00
Bruce MacDonald
4a02945a15
Update examples/python-simplechat/client.py
2023-12-06 18:36:45 -05:00
Matt Williams
aec742b6d2
Update examples/python-simplechat/readme.md
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-06 15:30:45 -08:00
Matt Williams
f337642e94
Update examples/python-simplechat/readme.md
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-06 15:30:35 -08:00
Matt Williams
51131cc6e2
Update examples/python-simplechat/client.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-12-06 15:30:10 -08:00
Matt Williams
43027789dc
Simple chat example
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-12-06 14:35:58 -08:00
Xe Iaso
f9b7d65e2b
docs/tutorials: add bit on how to use Fly GPUs on-demand with Ollama ( #1406 )
...
Signed-off-by: Xe Iaso <xe@camellia.finch-kitefin.ts.net >
2023-12-06 14:14:02 -08:00
Michael Yang
1f05d77110
Merge pull request #1244 from jmorganca/brucemacd/no-fail-template
...
do not fail on unsupported template variables
2023-12-06 13:23:04 -08:00
Michael Yang
c3ff36088b
Merge pull request #774 from jmorganca/mxyng/server-version
...
add version api and show server version in cli
2023-12-06 13:22:55 -08:00
Samuel Calderon
13524b5e72
List "Send chat messages" in table of contents ( #1399 )
...
Thank you @calderonsamuel
2023-12-06 12:34:27 -08:00
Michael Yang
f1b049fed8
Merge pull request #1377 from jmorganca/mxyng/qwen
...
update for qwen
2023-12-06 12:31:51 -08:00
Jeffrey Morgan
97c5696945
fix base urls in chat examples
2023-12-06 12:10:20 -08:00
Bruce MacDonald
47d4e22673
use missingkey in set empty interface when missing
2023-12-05 15:49:05 -08:00
Michael Yang
32f62fbb8e
Merge pull request #1334 from jmorganca/mxyng/load-projectors
...
load projectors
2023-12-05 14:40:53 -08:00
Michael Yang
5d75505ebd
return model configuration in generate
2023-12-05 14:39:02 -08:00
Michael Yang
b9495ea162
load projectors
2023-12-05 14:36:12 -08:00
Michael Yang
409bb9674e
Merge pull request #1308 from jmorganca/mxyng/split-from
...
split from into one or more models
2023-12-05 14:33:03 -08:00
Michael Yang
d3479c07a1
Merge pull request #1250 from jmorganca/mxyng/create-layer
...
refactor layer creation
2023-12-05 14:32:52 -08:00
Michael Yang
b12f1b984f
Merge pull request #1393 from jmorganca/mxyng/fix-whitespace
...
fix: trim space in modelfile fields
2023-12-05 12:18:01 -08:00
Bruce MacDonald
195e3d9dbd
chat api endpoint ( #1392 )
2023-12-05 14:57:33 -05:00
Michael Yang
38fe1a368b
fix: trim space in modelfile fields
2023-12-05 11:57:29 -08:00
Michael Yang
4b77fcb2b9
comments
2023-12-05 09:43:50 -08:00
Michael Yang
cde13bcdea
cmd: only print server version when different
2023-12-05 09:36:01 -08:00
Michael Yang
0f0cd265a7
cmd: add server version
2023-12-05 09:36:01 -08:00
Michael Yang
0db4706ec2
api: add version api handler
2023-12-05 09:36:01 -08:00
Michael Yang
1ebdbd9694
server: add version handler
2023-12-05 09:36:01 -08:00
Michael Yang
5c59455b59
cmd: use existing cmd context
2023-12-05 09:36:01 -08:00
Jeffrey Morgan
00d06619a1
Revert "chat api ( #991 )" while context variable is fixed
...
This reverts commit 7a0899d62d.
2023-12-04 21:16:27 -08:00
Matt Williams
f1ef3f9947
remove mention of gpt-neox in import ( #1381 )
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-12-04 20:58:10 -08:00
Michael Yang
5a5dca13b2
comments
2023-12-04 16:59:23 -08:00
Michael Yang
7232f1fa41
go mod tidy
2023-12-04 16:59:23 -08:00
Michael Yang
72e7a49aa9
seek instead of copyn
2023-12-04 16:59:23 -08:00
Michael Yang
a3737cbd33
use NewLayer for CreateBlobHandler
2023-12-04 16:59:23 -08:00
Michael Yang
998f1785b6
add modelfamilies
2023-12-04 16:59:23 -08:00
Michael Yang
70a93057cd
refactor layer creation
...
previous layer creation was not ideal because:
1. it required reading the input file multiple times, once to calculate
the sha256 checksum, another to write it to disk, and potentially one
more to decode the underlying gguf
2. used io.ReadSeeker which is prone to user error. if the file isn't
reset correctly or in the right place, it could end up reading an
empty file
there is also some brittleness when reading existing layers; otherwise,
writing the inherited layers will error reading an already closed file
this commit aims to fix these issues by restructuring layer creation.
1. it will now write the layer to a temporary file as well as the hash
function and move it to the final location on Commit
2. layers are read once when copied to the destination. the exception
is raw model files, which still require a second read to decode the
model metadata
2023-12-04 16:59:23 -08:00
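The restructuring described in the "refactor layer creation" message above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not the code from the commit: `pendingLayer`, `newPendingLayer`, `Commit`, and `storeString` are hypothetical names, and the digest-derived destination filename is an assumption. It shows the input being read once, written to a temporary file and a SHA-256 hash at the same time, and moved into place only on commit.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"io"
	"os"
	"path/filepath"
	"strings"
)

// pendingLayer is a hypothetical type for this sketch; it is not Ollama's Layer.
type pendingLayer struct {
	Digest  string // "sha256:<hex>" of the content
	tmpPath string // where the content sits until Commit is called
}

// newPendingLayer streams r exactly once: the same bytes go to a temporary
// file in dir and to a SHA-256 hash via io.MultiWriter.
func newPendingLayer(dir string, r io.Reader) (*pendingLayer, error) {
	tmp, err := os.CreateTemp(dir, "layer-*")
	if err != nil {
		return nil, err
	}
	defer tmp.Close()

	h := sha256.New()
	if _, err := io.Copy(io.MultiWriter(tmp, h), r); err != nil {
		os.Remove(tmp.Name())
		return nil, err
	}
	return &pendingLayer{
		Digest:  "sha256:" + hex.EncodeToString(h.Sum(nil)),
		tmpPath: tmp.Name(),
	}, nil
}

// Commit moves the temporary file to a digest-named file in dir.
// The naming scheme here is an assumption made for the example.
func (l *pendingLayer) Commit(dir string) (string, error) {
	dst := filepath.Join(dir, strings.ReplaceAll(l.Digest, ":", "-"))
	if err := os.Rename(l.tmpPath, dst); err != nil {
		return "", err
	}
	return dst, nil
}

// storeString is a helper for trying the sketch: it stores s in a throwaway
// directory, commits it, and returns the digest and the stored content.
func storeString(s string) (digest, stored string, err error) {
	dir, err := os.MkdirTemp("", "blobs-")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	l, err := newPendingLayer(dir, strings.NewReader(s))
	if err != nil {
		return "", "", err
	}
	dst, err := l.Commit(dir)
	if err != nil {
		return "", "", err
	}
	b, err := os.ReadFile(dst)
	if err != nil {
		return "", "", err
	}
	return l.Digest, string(b), nil
}
```

Because the hash is computed while the file is being written, no seek or reset of the input is needed, which is the `io.ReadSeeker` pitfall the message calls out. Calling `storeString` from a `main` function is enough to exercise the sketch.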
Michael Yang
2cb0fa7d40
split from into one or more models
2023-12-04 16:59:23 -08:00
Michael Yang
b2816bca67
unnecessary ReadSeeker for DecodeGGML
2023-12-04 16:59:23 -08:00
Patrick Devine
bf704423c5
revert cli to use /api/generate ( #1383 )
2023-12-04 16:35:29 -08:00
Bruce MacDonald
7a0899d62d
chat api ( #991 )
...
- update chat docs
- add messages chat endpoint
- remove deprecated context and template generate parameters from docs
- context and template are still supported for the time being and will continue to work as expected
- add partial response to chat history
2023-12-04 18:01:06 -05:00
Michael Yang
0cca1486dd
Merge pull request #1376 from jmorganca/mxyng/rocky-install
...
install: fix rocky kernel packages
2023-12-04 14:23:43 -08:00
Patrick Devine
2113c9d31a
make linewrap still work when the terminal width has changed ( #1350 )
2023-12-04 14:14:56 -08:00
Michael Yang
6deebf2489
update for qwen
2023-12-04 11:38:05 -08:00
Michael Yang
95cb38ae47
install: fix rocky kernel packages
2023-12-04 11:10:42 -08:00
ruecat
1f126afb2d
Ollama Telegram Bot ( #1364 )
...
* Add "ollama-telegram" to Extensions & Plugins
* Update README.md
2023-12-03 11:19:55 -08:00
Jeffrey Morgan
f6201a7a6c
remove duplicate community integration in README.md
2023-12-02 21:18:13 -08:00
Michael Yang
b3f6c6598f
Merge pull request #1349 from jmorganca/mxyng/ctrl-z
...
handle ctrl+z
2023-12-01 16:21:49 -08:00
Michael Yang
88620e983a
handle ctrl+z
2023-12-01 16:15:20 -08:00
Michael Yang
cedae0d17a
Merge pull request #1347 from jshph/adapter-hash
...
Fix adapter loading from SHA hash
2023-12-01 11:08:25 -08:00
Joshua Pham
bb80a597db
Fix adapter loading from SHA hash
2023-12-01 13:50:55 -05:00
Patrick Devine
6681d37861
allow setting the system and template for prompts in the repl ( #1335 )
2023-12-01 09:28:35 -08:00
Michael Yang
0409c1fa59
docker: set PATH, LD_LIBRARY_PATH, and capabilities ( #1336 )
...
* docker: set PATH, LD_LIBRARY_PATH, and capabilities
* example: update k8s gpu manifest
2023-11-30 21:16:56 -08:00
Michael Yang
b56e92470a
Merge pull request #1229 from jmorganca/mxyng/calculate-as-you-go
...
revert checksum calculation to calculate-as-you-go
2023-11-30 10:54:38 -08:00
Jeffrey Morgan
5687f1a0cf
fix unexpected end of response errors when cancelling in ollama run
2023-11-30 00:30:21 -05:00
James Radtke
7eda3d0c55
Corrected transposed 129 to 192 for OLLAMA_ORIGINS example ( #1325 )
2023-11-29 22:44:17 -05:00
Bruce MacDonald
7194a07d4d
Add chatd to example projects
2023-11-29 21:18:21 -05:00
Michael Yang
13efd5f218
upload: fix PUT retry
2023-11-29 16:38:35 -08:00
Michael Yang
c4bdfffd96
upload: separate progress tracking
2023-11-29 16:38:33 -08:00
Michael Yang
26c63418e0
new hasher
2023-11-29 14:52:41 -08:00
Michael Yang
2799784ac8
revert checksum calculation to calculate-as-you-go
2023-11-29 13:47:58 -08:00
Alec Hammond
91897a606f
Add OllamaEmbeddings to python LangChain example ( #994 )
...
* Add OllamaEmbeddings to python LangChain example
* typo
---------
Co-authored-by: Alec Hammond <alechammond@fb.com >
2023-11-29 16:25:39 -05:00
Bruce MacDonald
96122b7271
validate model tags on copy ( #1323 )
2023-11-29 15:54:29 -05:00
jeremiahbuckley
39be7fdb98
fix rhel cuda install ( #1321 )
...
Co-authored-by: Cloud User <azureuser@testgpu2.hqzwom21okjenksna4y3c4ymjd.phxx.internal.cloudapp.net >
2023-11-29 14:55:15 -05:00
Timothy Jaeryang Baek
c2e3b89176
fix: disable ':' in tag names ( #1280 )
...
Co-authored-by: rootedbox
2023-11-29 13:33:45 -05:00
Patrick Devine
cde31cb220
Allow setting parameters in the REPL ( #1294 )
2023-11-29 09:56:42 -08:00
ToasterUwU
63097607b2
Correct MacOS Host port example ( #1301 )
2023-11-29 11:44:03 -05:00
Michael
2ae80e1e27
Update README.md
...
add new recent models as examples
2023-11-28 22:16:37 -05:00
Michael Yang
b173cfc558
Merge pull request #1195 from jmorganca/mxyng/fix-bar-rate
...
progress: fix bar rate
2023-11-28 11:55:23 -08:00
Michael Yang
424d53ac70
progress: fix bar rate
2023-11-28 11:44:56 -08:00
ftorto
e1a69d44c9
Update faq.md ( #1299 )
...
Fix a typo in the CA update command
2023-11-28 09:54:42 -05:00
Jason Jacobs
3d620f9462
ignore jetbrain ides ( #1287 )
2023-11-27 15:57:45 -05:00
Bruce MacDonald
928950fcc6
update python client create example ( #1227 )
...
* add remote create to python example client
2023-11-27 15:36:19 -05:00
Kasumi
39c6d949fc
Add Amica to community integrations ( #1281 )
2023-11-27 10:44:37 -05:00
Jeffrey Morgan
16a9006306
add back f16c instructions on intel mac
2023-11-26 15:59:49 -05:00
Jeffrey Morgan
e9216ea459
fix readline history on linux
2023-11-26 15:59:04 -05:00
Jeffrey Morgan
9e4a316405
update submodule commit
2023-11-26 14:52:00 -05:00
Jeffrey Morgan
9fb5e8399c
Fix issues with inputting and formatting multi line strings in ollama run
...
Co-authored-by: Wen Sun <iwendellsun@gmail.com >
2023-11-26 12:54:29 -05:00
Jing Zhang
82b9b329ff
windows CUDA support ( #1262 )
...
* Support cuda build in Windows
* Enable dynamic NumGPU allocation for Windows
2023-11-24 17:16:36 -05:00
Jongwook Choi
12e8c12d2b
Disable CUDA peer access as a workaround for multi-gpu inference bug ( #1261 )
...
When CUDA peer access is enabled, multi-gpu inference will produce
garbage output. This is a known bug of llama.cpp (or nvidia). Until the
upstream bug is fixed, we can disable CUDA peer access temporarily
to ensure correct output.
See #961 .
2023-11-24 14:05:57 -05:00
Jeffrey Morgan
d77dde126b
consistent cpu instructions on macos and linux
2023-11-22 16:26:46 -05:00
Michael Yang
c7e70cd3bb
Merge pull request #1245 from jmorganca/mxyng/gguf-int
...
fix: gguf int type
2023-11-22 11:42:56 -08:00
Michael Yang
199941cd15
fix: gguf int type
2023-11-22 11:40:30 -08:00
Long Huynh
c9474f7f61
Update README.md - Community Integrations - Obsidian BMO Chatbot plugin ( #1239 )
2023-11-22 14:32:30 -05:00
Jeffrey Morgan
927e3ba4a4
tag image with correct version when building with build_docker script
2023-11-22 14:32:17 -05:00
Bruce MacDonald
37d95157df
fix relative path on create ( #1222 )
2023-11-21 15:43:17 -05:00
Jeffrey Morgan
2eaa95b417
Update api.md
2023-11-21 15:32:05 -05:00
Kevin Cao
3cd07728f4
Make alt+backspace delete word ( #1223 )
2023-11-21 12:26:47 -08:00
Michael Yang
ecf8b793f0
Merge pull request #1224 from jmorganca/mxyng/update
...
update llama.cpp
2023-11-21 12:21:59 -08:00
Matt Williams
abf294826b
Merge pull request #1221 from jmorganca/mattw/communityinstalls
...
add installation packages category to community
2023-11-21 12:12:23 -08:00
Steve Korshakov
ae06bb426b
add Llama Coder ( #1225 )
...
* add Llama Coder
* Update README.md
2023-11-21 14:08:19 -05:00
Matt Williams
d8e0f62ebb
Merge pull request #1159 from jmorganca/mattw/functioncalling
...
Example: Function Calling in Typescript
2023-11-21 10:06:55 -08:00
Michael Yang
a00fac4ec8
update llama.cpp
2023-11-21 09:50:02 -08:00
Jeffrey Morgan
f2113c1fc7
fix potential error in progress bar calculation
2023-11-21 12:48:20 -05:00
Jeffrey Morgan
6452e2ecb8
fix cases where progress bar would not be fixed size
2023-11-21 12:07:25 -05:00
Matt Williams
9a28e263a5
Update README.md
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
2023-11-21 07:25:32 -08:00
Matt Williams
0c066c9214
Update README.md
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
2023-11-21 07:25:26 -08:00
Jeffrey Morgan
aabd71aede
fix rendering and variable width issues on progress bar
2023-11-21 10:02:37 -05:00
Matt Williams
da4d7c9f9c
add installation packages category to community
...
Moved the arch package and someone has added a pr for brew.
that needs to get updated to be a link.
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-21 06:40:59 -08:00
Matt Williams
f321b13a03
Merge pull request #1178 from tusharhero/install-instructions-archlinux
...
Add Installation instructions for Archlinux
2023-11-21 06:33:22 -08:00
Matt Williams
5ebcde1541
Merge branch 'main' into install-instructions-archlinux
2023-11-21 06:32:50 -08:00
Matt Williams
45206cb7cc
Merge pull request #1218 from danemadsen/main
...
Update Maid repo
2023-11-21 06:30:33 -08:00
Matt Williams
6e65b84f54
Merge pull request #1219 from dustinblackman/main
...
docs: Add Oatmeal to terminal integrations
2023-11-21 06:28:12 -08:00
Dustin Blackman
c00ce12e83
docs: Add Oatmeal to terminal integrations
2023-11-21 06:47:43 -05:00
tusharhero
e1cd3152c9
Move Archlinux package to Community Integrations section.
2023-11-21 16:28:50 +05:30
Dane Madsen
0bef3778c9
Update README.md
2023-11-21 21:02:13 +11:00
Dane Madsen
6ebab38b89
Merge branch 'jmorganca:main' into main
2023-11-21 20:01:13 +10:00
Dane Madsen
5d8e864d44
Update Maid repo
2023-11-21 21:00:54 +11:00
Matt Williams
5f7acd0bbd
remove 'recent'
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-20 17:03:25 -08:00
Matt Williams
44b3a1ad42
Merge branch 'mattw/functioncalling' of github.com:jmorganca/ollama into mattw/functioncalling
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-20 17:01:41 -08:00
Matt Williams
0260be4414
remove 'recently'
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-20 16:57:07 -08:00
Jeffrey Morgan
a3fcecf943
only set main_gpu if value > 0 is provided
2023-11-20 19:54:04 -05:00
Jeffrey Morgan
df07e4a097
remove redundant filename parameter ( #1213 )
2023-11-20 17:05:36 -05:00
Michael Yang
0b7ade0d4c
Merge pull request #1212 from jmorganca/mxyng/metal
...
enable metal for fp32, q5_0, q5_1
2023-11-20 13:56:39 -08:00
Michael Yang
19b7a4d715
recent llama.cpp update added kernels for fp32, q5_0, and q5_1
2023-11-20 13:44:31 -08:00
Bruce MacDonald
31ab453d37
resolve FROM path before sending modelfile ( #1211 )
2023-11-20 16:43:48 -05:00
Jeffrey Morgan
35c4b5ec16
calculate hash separately from http request
2023-11-20 15:45:11 -05:00
James Braza
f24741ff39
Documenting how to view Modelfiles ( #723 )
...
* Documented viewing Modelfiles in ollama.ai/library
* Moved Modelfile in ollama.ai down per request
2023-11-20 15:24:29 -05:00
Jeffrey Morgan
8c4022b06b
fix initial progress stats
2023-11-20 14:33:46 -05:00
Jeffrey Morgan
433702f421
hide progress stats on completion
2023-11-20 14:22:39 -05:00
Matt Williams
48896f626c
Update examples/typescript-functioncalling/extractwp.ts
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-20 10:12:10 -08:00
Matt Williams
c57aee6fba
Update examples/typescript-functioncalling/readme.md
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-20 10:10:42 -08:00
Jeffrey Morgan
6066c70edd
restore progress messages for older endpoints
2023-11-20 11:37:17 -05:00
Jeffrey Morgan
f10ac5de19
restore stats updated every second to progress bar
2023-11-20 10:58:19 -05:00
Jeffrey Morgan
93a108214c
only show decimal points for smaller file size numbers
2023-11-20 10:58:19 -05:00
Purinda Gunasekara
be61a81758
main-gpu argument is not getting passed to llamacpp, fixed. ( #1192 )
2023-11-20 10:52:52 -05:00
Toni Soriano
2fdf1b5ff8
add laravel package to README.md ( #1208 )
...
Co-authored-by: Toni <cloudstudio@Tonis-Mac-mini.local >
2023-11-20 10:48:35 -05:00
Huy Le
331068b964
Adding ogpt.nvim into the list of plugins! ( #1190 )
...
* adding ollama.nvim for visibility
* adding an ogpt.nvim neovim plugin
2023-11-20 10:39:14 -05:00
Andy Brenneke
0179d8eb6b
Add Rivet to Community Integrations ( #1183 )
2023-11-20 10:36:47 -05:00
Eli Bendersky
be48741308
README: link to LangChainGo for talking to ollama, with an example ( #1206 )
2023-11-20 10:35:07 -05:00
Jeffrey Morgan
6bbd6e26fb
fix temporary newline created and removed with spinner in ollama run
2023-11-20 00:49:08 -05:00
Jeffrey Morgan
e6ad4813d3
dont crash when redirecting stderr
2023-11-19 23:50:45 -05:00
Jeffrey Morgan
13ba6df5ab
enable cpu instructions on intel macs
2023-11-19 23:20:26 -05:00
Jeffrey Morgan
9d73d3a6b5
add back part.Reset()
2023-11-19 14:32:19 -05:00
Jeffrey Morgan
72cd336410
don't retry on upload complete context cancel
2023-11-19 14:32:19 -05:00
Jeffrey Morgan
1bd594b2fa
revert to using one open file for blob uploads
2023-11-19 14:32:19 -05:00
Jeffrey Morgan
9a8c21ac3d
use exponential everywhere
2023-11-19 14:32:19 -05:00
Jeffrey Morgan
f6b317e8c9
fix sending too little data in chunk upload body
2023-11-19 14:32:19 -05:00
Jeffrey Morgan
ac5076ce1e
exponential backoff up to 30s
2023-11-19 14:32:19 -05:00
Michael Yang
42c2e3a624
upload: retry complete upload
2023-11-19 14:32:19 -05:00
Michael Yang
cb42589792
adjust download/upload parts
2023-11-19 14:32:19 -05:00
Jeffrey Morgan
258addc799
fix comment in progress.go
2023-11-19 13:46:19 -05:00
Jeffrey Morgan
c06b9b7304
update progress rendering to be closer to v0.1.10
2023-11-19 13:43:21 -05:00
Jeffrey Morgan
95b9acd324
improve pull percentage rendering
2023-11-19 11:00:43 -05:00
Jeffrey Morgan
04cbf5ccc0
progress bar styling improvements
2023-11-19 09:54:33 -05:00
Jeffrey Morgan
e1d7056496
update progress statuses
2023-11-19 09:21:13 -05:00
Jeffrey Morgan
02524a56ff
check retry for authorization error
2023-11-19 00:19:53 -05:00
Jeffrey Morgan
1657c6abc7
add note to specify JSON in the prompt when using JSON mode
2023-11-18 22:59:26 -05:00
Jeffrey Morgan
12e046f12a
remove unused function
2023-11-18 22:16:51 -05:00
Jeffrey Morgan
36a3bbf65f
Update llm/llama.go
2023-11-18 21:25:07 -05:00
Bruce MacDonald
43a726149d
fix potentially inaccurate error message
2023-11-18 21:25:07 -05:00
Jeffrey Morgan
984714f131
update status text when transferring blob on ollama create
2023-11-18 09:40:10 -05:00
Jeffrey Morgan
bab9494176
add - separator to temp file created on ollama create
2023-11-18 09:39:52 -05:00
Jeffrey Morgan
85e4441c6a
cache docker builds
2023-11-18 08:51:38 -05:00
Michael Yang
42e43736a4
Merge pull request #1186 from jmorganca/mxyng/copy-blob
...
fix cross device rename
2023-11-17 21:54:53 -08:00
Michael Yang
c6e6c8ee7e
fix cross device rename
2023-11-17 15:22:17 -08:00
Jeffrey Morgan
a185b29719
fix install script error on linux
2023-11-17 18:00:41 -05:00
Michael Yang
dc84b20d6b
Merge pull request #1104 from jmorganca/mxyng/jupyter
...
add jupyter notebook example
2023-11-17 14:46:26 -08:00
Michael Yang
ad8659b980
Merge pull request #1161 from jmorganca/mxyng/systemd-placeholder
...
placeholder environment variables
2023-11-17 14:45:38 -08:00
Michael Yang
c1bbf5ddee
Merge pull request #1134 from jmorganca/mxyng/progress
...
progress bar
2023-11-17 14:03:35 -08:00
Bruce MacDonald
0b19e24d81
only retry once on auth failure ( #1175 )
2023-11-17 14:22:35 -05:00
Michael Yang
3cb07d2773
simplify StopAndClear
2023-11-17 10:26:22 -08:00
Michael Yang
976068369b
stop all spinners on progress stop
2023-11-17 10:06:19 -08:00
Michael Yang
4d677ee389
no divide by zero
2023-11-17 10:06:19 -08:00
Michael Yang
7ea905871a
only move cursor up if pos > 0
2023-11-17 10:06:19 -08:00
Michael Yang
d6ecaa2cbf
update progress responses
2023-11-17 10:06:19 -08:00
Michael Yang
4dcf7a59b1
generate progress
2023-11-17 10:06:19 -08:00
Michael Yang
1c0e092ead
progress cmd
2023-11-17 10:06:19 -08:00
Michael Yang
c4a3ccd7ac
progress
2023-11-17 10:06:19 -08:00
Michael Yang
9f04e5a8ea
format bytes
2023-11-17 10:06:19 -08:00
Michael Yang
f91bb2f7f0
remove progressbar
2023-11-17 10:06:19 -08:00
Michael Yang
0813387414
Merge pull request #1177 from jmorganca/mxyng/faq
...
faq: fix heading and add more details
2023-11-17 10:05:21 -08:00
Michael Yang
4936b5bb37
add jupyter readme
2023-11-17 10:04:52 -08:00
tusharhero
786288829e
Make Archlinux a sub-heading of Linux.
2023-11-17 23:17:36 +05:30
tusharhero
72dcc952b6
Add Installation instructions for Archlinux
...
Pacman is the recommended installation method. And the package is in
the official repository, so makes sense to mention it in the README.
2023-11-17 23:13:40 +05:30
Michael Yang
f7f6d6c693
Update examples/jupyter-notebook/ollama.ipynb
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-17 09:30:07 -08:00
Michael Yang
a3053b66d2
add jupyter notebook example
2023-11-17 09:30:07 -08:00
Michael Yang
c82ead4d01
faq: fix heading and add more details
2023-11-17 09:02:17 -08:00
Michael Yang
90860b6a7e
update faq ( #1176 )
2023-11-17 11:42:58 -05:00
Jeffrey Morgan
81092147c4
remove unnecessary -X POST from example curl commands
2023-11-17 09:50:38 -05:00
Jeffrey Morgan
92656a74b7
Use llama2 as the model in api.md
2023-11-17 07:17:51 -05:00
Jeffrey Morgan
41434a7cdc
build intel mac with correct binary and compile flags
2023-11-16 22:14:51 -05:00
Michael Yang
71687ab809
Merge pull request #1164 from jmorganca/mxyng/faq
...
update faq
2023-11-16 17:20:18 -08:00
Michael Yang
d8842b4d4b
update faq
2023-11-16 17:07:36 -08:00
Michael Yang
32add8577d
placeholder environment variables
2023-11-16 16:57:39 -08:00
Michael Yang
585f9c01fa
Merge pull request #1160 from jmorganca/mxyng/faq
...
update faq
2023-11-16 16:48:51 -08:00
Michael Yang
c13bde962d
Update docs/faq.md
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
2023-11-16 16:48:38 -08:00
Michael Yang
ee307937fd
update faq
2023-11-16 16:46:43 -08:00
Matt Williams
ab6639bc47
Merge pull request #1074 from jmorganca/mattw/loganalysisexample
...
Log Analysis Example
2023-11-16 16:33:07 -08:00
Matt Williams
fefae84c06
example: function calling
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-16 16:26:29 -08:00
Jeffrey Morgan
dbe6e77472
Update README.md
2023-11-16 16:46:38 -05:00
Bruce MacDonald
4b3f4bc7d9
return failure details when unauthorized to push ( #1131 )
...
Co-authored-by: Jeffrey Morgan <jmorganca@gmail.com >
2023-11-16 16:44:18 -05:00
Michael Yang
a5ccf742c1
fix cross repo mounts
2023-11-16 16:33:30 -05:00
Michael Yang
e33ef391cd
fix push scope error for inherited model
2023-11-16 16:33:30 -05:00
yanndegat
75295b9528
install: fix enable contrib on debian 12 ( #1151 )
...
On debian 12, sources definitions have moved from
/etc/apt/sources.list to /etc/apt/sources.list.d/debian.sources
2023-11-16 15:53:06 -05:00
Matt Williams
db5ef3004c
Merge pull request #1079 from jmorganca/mattw/jsonexample
...
Add example using JSON format output
2023-11-16 09:13:34 -08:00
Michael Yang
b5f158f046
add faq for proxies ( #1147 )
2023-11-16 11:43:37 -05:00
Piero Savastano
30141b42e9
Add Cheshire Cat to community integrations ( #1124 )
2023-11-16 11:30:54 -05:00
Dane Madsen
5f301ece1d
Add Maid to Community Integrations ( #1120 )
2023-11-16 11:27:53 -05:00
Michael Yang
77954bea0e
Merge pull request #898 from jmorganca/mxyng/build-context
...
create remote models
2023-11-15 16:41:12 -08:00
Michael Yang
54f92f01cb
update docs
2023-11-15 15:28:15 -08:00
Michael
30ae6e731e
Update randomaddresses.py
2023-11-15 18:24:50 -05:00
Michael
b28a30f7ba
Update examples/python-json-datagenerator/predefinedschema.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-15 18:23:36 -05:00
Jeffrey Morgan
ecd71347ab
Update faq.md
2023-11-15 18:17:13 -05:00
Jeffrey Morgan
8ee4cbea0f
Remove table of contents in faq.md
2023-11-15 18:16:27 -05:00
Michael Yang
652d90e1c7
Update server/images.go
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-15 15:16:23 -08:00
Michael Yang
bc22d5a38b
no blob response
2023-11-15 15:16:23 -08:00
Michael Yang
71d71d0988
update docs
2023-11-15 15:16:23 -08:00
Michael Yang
1901044b07
use checksum reference
2023-11-15 15:16:23 -08:00
Michael Yang
d660eebf22
fix create from model tag
2023-11-15 15:16:23 -08:00
Michael Yang
cac11c9137
update api docs
2023-11-15 15:16:23 -08:00
Michael Yang
a07c935d34
ignore non blobs
2023-11-15 15:16:23 -08:00
Michael Yang
1552cee59f
client create modelfile
2023-11-15 15:16:23 -08:00
Michael Yang
3ca56b5ada
add create modelfile field
2023-11-15 15:16:23 -08:00
Michael Yang
b0d14ed51c
refactor create model
2023-11-15 15:16:23 -08:00
Matt Williams
f61f340279
FAQ: answer a few faq questions ( #1128 )
...
* faq: does ollama share my prompts
Signed-off-by: Matt Williams <m@technovangelist.com >
* faq: ollama and openai
Signed-off-by: Matt Williams <m@technovangelist.com >
* faq: vscode plugins
Signed-off-by: Matt Williams <m@technovangelist.com >
* faq: send a doc to Ollama
Signed-off-by: Matt Williams <m@technovangelist.com >
* extra spacing
Signed-off-by: Matt Williams <m@technovangelist.com >
* Update faq.md
* Update faq.md
---------
Signed-off-by: Matt Williams <m@technovangelist.com >
Co-authored-by: Michael <mchiang0610@users.noreply.github.com >
2023-11-15 18:05:13 -05:00
Michael Yang
686f85d6ca
Merge pull request #1132 from jmorganca/mxyng/human-bytes
...
replace go-humanize with format.HumanBytes
2023-11-15 09:46:21 -08:00
bnodnarb
85951d25ef
Created tutorial for running Ollama on NVIDIA Jetson devices ( #1098 )
2023-11-15 12:32:37 -05:00
Dane Madsen
779e196ef6
Merge branch 'jmorganca:main' into main
2023-11-15 21:38:07 +10:00
Michael Yang
01ea6002c4
replace go-humanize with format.HumanBytes
2023-11-14 14:57:41 -08:00
Jeffrey Morgan
423862042a
treat ollama run model < file as entire prompt, not prompt-per-line ( #1126 )
...
Previously, `ollama run` treated a non-terminal stdin (such as `ollama run model < file`) as containing one prompt per line. To run inference on a multi-line prompt, the only non-API workaround was to run `ollama run` interactively and wrap the prompt in `"""..."""`.
Now, `ollama run` treats a non-terminal stdin as containing a single prompt. For example, if `myprompt.txt` is a multi-line file, then `ollama run model < myprompt.txt` would treat `myprompt.txt`'s entire contents as the prompt.
Co-authored-by: Quinn Slack <quinn@slack.org >
2023-11-14 16:42:21 -05:00
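The difference between the two stdin behaviors described in the message above can be illustrated with a small Go sketch. It is not taken from the `ollama run` source: `promptsPerLine`, `singlePrompt`, and `compare` are hypothetical names, and details such as whitespace trimming are left out because the message does not specify them.

```go
package main

import (
	"bufio"
	"io"
	"strings"
)

// promptsPerLine models the previous behavior: every line of a
// non-terminal stdin becomes its own prompt.
func promptsPerLine(r io.Reader) ([]string, error) {
	var prompts []string
	sc := bufio.NewScanner(r)
	for sc.Scan() {
		prompts = append(prompts, sc.Text())
	}
	return prompts, sc.Err()
}

// singlePrompt models the new behavior: the whole input is one prompt,
// newlines included.
func singlePrompt(r io.Reader) (string, error) {
	b, err := io.ReadAll(r)
	return string(b), err
}

// compare runs both strategies on the same text so the results can be
// inspected side by side.
func compare(input string) (perLine []string, whole string, err error) {
	perLine, err = promptsPerLine(strings.NewReader(input))
	if err != nil {
		return nil, "", err
	}
	whole, err = singlePrompt(strings.NewReader(input))
	return perLine, whole, err
}
```

In a real command the reader would be `os.Stdin` rather than a string; the string-based `compare` helper only makes the contrast easy to try.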
Bruce MacDonald
df18486c35
Move /generate format to optional parameters ( #1127 )
...
This field is optional and should be under the `Advanced parameters` header
2023-11-14 16:12:30 -05:00
Jeffrey Morgan
4e612a2e92
use stdout fd for terminal size ( #1125 )
2023-11-14 16:09:09 -05:00
Matt Williams
47ffb81db7
Update examples/python-json-datagenerator/readme.md
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-14 10:33:34 -08:00
Matt Williams
69795d2db0
Update examples/python-json-datagenerator/readme.md
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-14 10:33:16 -08:00
Matt Williams
acde0819d9
Update examples/python-json-datagenerator/randomaddresses.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-14 10:33:02 -08:00
Matt Williams
f748331aa3
Update examples/python-json-datagenerator/predefinedschema.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-14 10:32:45 -08:00
Matt Williams
f4edc302a8
Update examples/python-loganalysis/readme.md
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-14 10:31:22 -08:00
Matt Williams
64b7e0c218
Update examples/python-loganalysis/loganalysis.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-14 10:31:05 -08:00
Matt Williams
eced0d52ab
Update examples/python-loganalysis/loganalysis.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-14 10:30:30 -08:00
Matt Williams
96bf9cafa7
Update examples/python-loganalysis/loganalysis.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-14 10:30:17 -08:00
Jeffrey Morgan
6e0f686afa
`--format json` should work in interactive mode
2023-11-14 10:22:03 -05:00
Dane Madsen
c1a5220860
Update README.md
2023-11-14 15:31:31 +10:00
Dane Madsen
3b15175a70
Add maid to community integrations
2023-11-14 15:30:03 +10:00
Jeffrey Morgan
c1844bbee2
add json mode to cli ( #1095 )
2023-11-13 21:54:02 -05:00
Huy Le
cb745965ce
adding ollama.nvim for visibility ( #1115 )
2023-11-13 17:00:17 -05:00
Enrico Ros
8d29b6a2b6
New big-AGI integration ( #1078 )
...
* New big-AGI integration
Ollama works great in big-AGI, and this document explains how to link the two projects.
* Update README.md
2023-11-13 16:59:00 -05:00
Ilya Breitburg
724aa64bee
Add Dart library to README.md ( #1106 )
2023-11-13 14:50:42 -05:00
Michael Yang
d91c103e74
Merge pull request #1055 from dansreis/946-fix-incorrect-base-model-name
...
Fixed incorrect base model name
2023-11-13 08:42:55 -08:00
Kevin Hermawan
98ec7d81e3
Add OllamaKit to the community integrations ( #1085 )
2023-11-11 14:41:42 -08:00
Matt Williams
b6817a83d8
Add gif and finish readme
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-10 16:41:48 -06:00
Matt Williams
73f3448ede
add example showing use of JSON format
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-10 16:33:56 -06:00
Daniel Reis
7c438f2c53
Replaced method
2023-11-10 20:22:03 +00:00
Daniel Reis
6e46338d44
Reverting previous changes
2023-11-10 20:21:35 +00:00
Jeffrey Morgan
cdddd3df65
add `format` to example python client
2023-11-10 10:22:21 -08:00
Daniel Hiltgen
afa61bdf45
Merge pull request #1075 from jmorganca/dhiltgen/unexpected-eof
...
Resume chunk download on UnexpectedEOF errors
2023-11-10 08:48:27 -08:00
Daniel Hiltgen
cc54a416c6
Resume chunk download on UnexpectedEOF errors
...
If the chunk download is interrupted, resume from where we left off
2023-11-10 08:29:42 -08:00
Matt Williams
c819d7f68a
Merge pull request #955 from jmorganca/mattw/example-bash-compare
...
docs: add examples using bash to compare models
2023-11-10 08:59:32 -06:00
Matt Williams
e4f59ba073
better streaming plus gif
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-10 08:55:17 -06:00
Matt Williams
5de568bffe
Add a simple log analysis example
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-10 08:28:52 -06:00
Jeffrey Morgan
5cba29b9d6
JSON mode: add `"format"` as an api parameter ( #1051 )
...
* add `"format": "json"` as an API parameter
---------
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-09 16:44:02 -08:00
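As a rough illustration of the `"format": "json"` parameter added in #1051 above, the following Python sketch builds a request body for the `/generate` endpoint mentioned elsewhere in this log. `build_generate_request` is a hypothetical helper, the model name is a placeholder, and the sketch only constructs the JSON body rather than sending it.

```python
import json


def build_generate_request(model, prompt, json_mode=False):
    """Return the JSON body for a generate request (hypothetical helper)."""
    body = {"model": model, "prompt": prompt}
    if json_mode:
        # "format": "json" is the parameter named in the commit message.
        body["format"] = "json"
    return json.dumps(body)


if __name__ == "__main__":
    # "llama2" is a placeholder model name, not taken from the commit.
    print(build_generate_request("llama2", "List three colors as JSON.", json_mode=True))
```

Leaving `format` out of the body keeps the default behavior, so the helper only adds the key when JSON mode is requested.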
Daniel Reis
d17730356a
Removed inline parse model path
2023-11-09 22:44:26 +00:00
Daniel Reis
32d79a6eea
Using 'GetShortTagname' method instead
2023-11-09 22:40:37 +00:00
Bruce MacDonald
5b39503bcd
document specifying multiple stop params ( #1061 )
2023-11-09 13:16:26 -08:00
Bruce MacDonald
1ae84bc2a2
skip gpu if less than 2GB VRAM are available ( #1059 )
2023-11-09 13:16:16 -08:00
Bruce MacDonald
db8bf336fc
Update README.md
2023-11-09 12:53:24 -08:00
Nick Anderson
d77e094a90
Added gptel to list of integrations ( #1062 )
2023-11-09 12:52:36 -08:00
Matt Williams
dd3dc47ddb
Merge pull request #992 from aashish2057/aashish2057/langchainjs_doc_update
2023-11-09 05:08:31 -08:00
Michael Yang
c5e1bbabda
instead of static number of parameters for each model family, get the real number from the tensors ( #1022 )
...
* parse tensor info
* refactor decoder
* return actual parameter count
* explicit rounding
* s/Human/HumanNumber/
2023-11-08 17:55:46 -08:00
Bruce MacDonald
a49d6acc1e
add a complete /generate options example ( #1035 )
2023-11-08 16:44:36 -08:00
Moritz Poldrack
6e9bcdb9b3
progressbar: make start and end seamless ( #1042 )
2023-11-08 16:42:40 -08:00
Matt Williams
13086363bd
Update as per bmacd
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-08 18:09:05 -06:00
Bruce MacDonald
ec2a31e9b3
support raw generation requests ( #952 )
...
- add the optional `raw` generate request parameter to bypass prompt formatting and response context
- add raw request to docs
2023-11-08 14:05:02 -08:00
Amith Koujalgi
ec84c02d54
Add Ollama4j Java library to the list of community libraries ( #1044 )
2023-11-08 11:04:32 -08:00
Kevin Hermawan
2a88b66bc9
Add Ollamac to community integrations ( #1043 )
2023-11-08 11:01:09 -08:00
Jeffrey Morgan
2d0faea96c
clean up README.md
2023-11-08 00:03:29 -08:00
Jeffrey Morgan
637142181a
clean up README.md
2023-11-07 23:52:31 -08:00
Matt Williams
bcbff421c9
Merge pull request #1023 from jmorganca/mattw/wherearemodelsfaq
2023-11-07 17:59:54 -08:00
thealhu
1359d6cf3b
Fix sudo variable in install.sh ( #1034 )
...
One occurrence of sudo had not been replaced with the sudo variable.
2023-11-07 09:59:57 -08:00
Omar Magdy
6e2d0224d9
Added logseq ollama plugin ( #1029 )
2023-11-07 09:58:13 -08:00
Ikko Eltociear Ashimine
921406f721
Update client.py ( #1026 )
...
recieve -> receive
2023-11-07 09:55:47 -08:00
Michael Yang
c7047d7353
Merge pull request #959 from jmorganca/mxyng/example-k8s
2023-11-07 10:43:21 -06:00
Matt Williams
1d155caba3
docs: clarify where the models are stored in the faq
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-11-06 14:38:49 -08:00
Michael Yang
866324b9a5
Merge pull request #943 from tjbck/patch-1
...
doc: categorised community integrations + added ollama-webui
2023-11-06 11:35:39 -08:00
Michael Yang
145e060855
Apply suggestions from code review
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-11-06 11:32:23 -08:00
Michael Yang
146072113d
Merge pull request #993 from jmorganca/mxyng/cleanup
...
cleanup upload and download errors
2023-11-06 11:32:12 -08:00
Timothy Jaeryang Baek
33d31d1b56
Merge branch 'main' into patch-1
2023-11-06 14:27:02 -05:00
Dr. David A. Kunz
274c6cbf4c
Added gen.nvim to community integrations ( #996 )
2023-11-06 10:51:41 -08:00
Elton Renda
7ebbd89bbf
add hass-ollama-conversation ( #999 )
2023-11-06 10:50:35 -08:00
Lars Grammel
9079b1bb6d
Add ModelFusion community integration ( #1020 )
2023-11-06 10:46:16 -08:00
Timothy Jaeryang Baek
6febde7200
Merge branch 'main' into patch-1
2023-11-04 19:12:18 -05:00
pepperoni21
325cfcd9ff
Added ollama-rs to community integrations ( #995 )
...
Co-authored-by: pepperoni21 <pepperoni2100@gmail.com >
2023-11-04 14:51:29 -07:00
Jeffrey Morgan
639d0fd070
Update README.md
2023-11-04 12:24:24 -07:00
Jeffrey Morgan
e21579a0f1
Restore system prompt on requests
2023-11-03 17:26:45 -07:00
Jeffrey Morgan
c44b619428
remove unused fmt.Println
2023-11-03 17:24:58 -07:00
Michael Yang
434a6f9d46
return last error
2023-11-03 16:49:51 -07:00
aashish2057
b13586cc72
update langchainjs doc
2023-11-03 18:45:19 -05:00
Jeffrey Morgan
17678b7225
Restore system prompt on requests and default `num_keep` to 0
2023-11-03 13:25:25 -07:00
Michael Yang
84725ec7e3
refactor part reset
2023-11-03 09:20:32 -07:00
Bruce MacDonald
6109bebba6
reformat api docs for more examples ( #972 )
2023-11-03 10:57:00 -04:00
Noah Gitsham
8ae8c9fa8c
Remove duplicate "install" in GPU support warning ( #984 )
2023-11-03 00:45:14 -07:00
Noah Gitsham
f39daff461
Add missing "be" to GPU support warning message ( #983 )
2023-11-02 18:37:12 -07:00
Jeffrey Morgan
c50b01bc21
check `request.Context` for initial system prompt
2023-11-02 18:17:00 -07:00
Bruce MacDonald
b9dc875401
remove modelfile context deprecated in v0.0.7 ( #974 )
2023-11-02 20:52:56 -04:00
Jeffrey Morgan
06589a3b30
Set `NumKeep` to `4` by default ( #982 )
2023-11-02 17:26:11 -07:00
Michael Yang
1fd511e661
Merge pull request #975 from jmorganca/mxyng/downloads
...
update downloads to use retry wrapper
2023-11-02 16:12:48 -07:00
Michael Yang
c01bbe94fd
Merge pull request #979 from jmorganca/mxyng/num-keep
...
update default NumKeep
2023-11-02 15:48:44 -07:00
Jeffrey Morgan
1beb5645a9
only use system prompt if context is not provided ( #978 )
2023-11-02 15:48:02 -07:00
Michael Yang
6db3691b8f
update default NumKeep
2023-11-02 15:47:35 -07:00
Michael Yang
fe5a872444
fix upload
2023-11-02 13:25:58 -07:00
Michael Yang
d39709260f
download with retry
2023-11-02 13:16:11 -07:00
Michael Yang
60bb3c03a1
use http.Method
2023-11-02 13:12:45 -07:00
Jeffrey Morgan
2e53704685
default rope params to 0 for new models ( #968 )
2023-11-02 08:41:30 -07:00
Michael Yang
527f9a7975
Merge pull request #966 from jmorganca/mxyng/fix-log
2023-11-01 17:49:10 -07:00
Michael Yang
c4cc738cbf
fix log
2023-11-01 17:18:11 -07:00
Michael Yang
2c6189f4fe
Merge pull request #750 from jmorganca/mxyng/concurrent-uploads
...
concurrent uploads
2023-11-01 15:00:01 -07:00
Michael Yang
dccac8c8fa
k8s example
2023-11-01 14:52:58 -07:00
Michael Yang
c05ab9a86e
Merge pull request #965 from jmorganca/mxyng/go-mod-tidy
...
go mod tidy
2023-11-01 11:55:43 -07:00
Michael Yang
f42f3d9b27
go fmt
2023-11-01 11:55:08 -07:00
Michael Yang
341fb7e35f
go mod tidy
2023-11-01 11:54:25 -07:00
Michael
f31961637f
Update README.md
2023-11-01 12:20:55 -04:00
Michael Yang
ec3614812a
Merge pull request #960 from jmorganca/mxyng/fix-tautology
2023-11-01 08:30:49 -07:00
Michael Yang
f14969314a
Merge pull request #958 from jmorganca/mxyng/append-ld-library-path
2023-11-01 08:30:38 -07:00
Bruce MacDonald
1fb9288661
notify that the ollama api is available after linux install ( #954 )
2023-11-01 11:28:26 -04:00
Matt Williams
01a03caa20
Merge pull request #956 from jmorganca/mattw/apidocupdate
2023-10-31 21:43:11 -07:00
Michael Yang
bf6786bb39
fix tautology
2023-10-31 20:49:48 -07:00
Michael Yang
642128b75a
append LD_LIBRARY_PATH
2023-10-31 15:54:49 -07:00
Matt Williams
f21bd6210d
docs: clarify and clean up API docs
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-10-31 13:11:33 -07:00
Matt Williams
80362fedce
better readme
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-10-31 12:40:46 -07:00
Matt Williams
5757925060
add a gif
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-10-31 11:52:01 -07:00
Michael
4512301756
Update README.md
2023-10-31 13:25:36 -04:00
Matt Williams
2236a93efc
docs: add examples using bash to compare models
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-10-31 09:12:39 -07:00
Matt Williams
ad88799411
Merge pull request #949 from jmorganca/matt/fixPrivateGPT
...
fix: private gpt example was broken due to changes in chroma
2023-10-30 17:17:00 -07:00
Bruce MacDonald
0818b5e318
readline windows terminal support ( #950 )
...
- update the readline package to have basic support on Windows; this is not yet at full feature parity with the Unix CLI
2023-10-30 16:18:12 -04:00
Matt Williams
1df6100c77
Update examples/langchain-python-rag-privategpt/privateGPT.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-10-30 12:48:17 -07:00
Matt Williams
5c48fe1fb0
Update examples/langchain-python-rag-privategpt/constants.py
...
Co-authored-by: Bruce MacDonald <brucewmacdonald@gmail.com >
2023-10-30 12:47:56 -07:00
Dirk Loss
874bb31986
Fix conversion command for gptneox ( #948 )
2023-10-30 14:34:29 -04:00
Matt Williams
f7856a57eb
fix: private gpt example was broken due to changes in chroma
...
Signed-off-by: Matt Williams <m@technovangelist.com >
2023-10-30 10:56:25 -07:00
Bruce MacDonald
f9a4281124
clean up: remove server functions from client ( #937 )
2023-10-30 11:10:18 -04:00
Timothy Jaeryang Baek
96da0792e6
doc: OllamaSharp for .NET moved to libraries
2023-10-28 16:18:38 -05:00
Timothy Jaeryang Baek
95d24262fc
doc: categorised community integrations + added web-ui
2023-10-28 16:02:13 -05:00
Jeffrey Morgan
8d03bd7b54
remove `+build` directive in term.go
2023-10-28 09:56:03 -07:00
Michael Yang
115fc56eb7
calculate and verify md5 checksum
2023-10-27 17:07:33 -07:00
Michael Yang
186f685224
retry PUT
2023-10-27 17:07:33 -07:00
Michael Yang
12efcbb057
comments
2023-10-27 17:07:33 -07:00
Michael Yang
4e09aab8b9
concurrent uploads
2023-10-27 17:07:33 -07:00