* Store and persist OS upgrade map to fix update path evaluation
The existing logic calculated OS upgrade paths inline during fetch_data,
which does not get reevaluated when the current OS is unsupported
(JobCondition.OS_SUPPORTED). E.g. after updating from 11.4 to 11.5, the
system wouldn't offer the next available update (15.2) because the
upgrade path calculation relied on fresh data from the blocked fetch
operation.
Changes:
- Add ATTR_HASSOS_UPGRADE constant and schema validation
- Store hassos-upgrade map from version JSON in updater data
- Refactor version_hassos property to use stored upgrade map instead of
inline calculation during fetch_data
- Maintain upgrade path logic: upgrade within major version first, then
jump to next major version when at the latest in current major
- Add type safety checks for version.major access
This ensures upgrade paths work correctly even when update data refresh
is blocked due to unsupported OS versions, fixing the scenario where
HAOS 11.5 wouldn't show 15.2 as the next available update.
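A minimal sketch of the stored-map lookup described above, assuming the
hassos-upgrade map keys each major version to its latest release (the
function name and map shape are illustrative, not the actual Supervisor
implementation):

```python
def next_os_version(current: str, upgrade_map: dict[str, str]) -> str | None:
    """Pick the next OS update from a stored hassos-upgrade map.

    Upgrade within the current major first; once at the latest release
    of the current major, jump to the next known major version.
    """
    major = current.split(".")[0]
    latest_in_major = upgrade_map.get(major)
    if latest_in_major and latest_in_major != current:
        return latest_in_major  # releases left within this major
    higher = sorted(int(m) for m in upgrade_map if m.isdigit() and int(m) > int(major))
    return upgrade_map[str(higher[0])] if higher else None

# e.g. next_os_version("11.5", {"11": "11.5", "15": "15.2"}) == "15.2"
```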
* Update supervisor/updater.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Address mypy issue
* Fix pytest
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Add availability API for addons
* Add cast back and test for latest version of installed addon
* Make error responses more translation/client library friendly
* Add test cases for install/update APIs
* Add background option to update/install APIs
* Refactor to use common background_task utility in backups too
* Use a validation_complete event rather than looking for bus events
* Fix NetworkManager connection name for VLANs
The connection name for VLANs should include the parent interface name
for better identification. This was originally the intention, but the
interface object's name property was used, which is empty at that
point.
* Disallow creating multiple connections for the same VLAN id
Only allow a single connection per interface and VLAN id. The regular
network commands can be used to alter the configuration.
* Fix pytest
* Simplify connection id name generation
Always rely on the Supervisor interface representation's name attribute
to generate the NetworkManager connection id. Make sure that the name
is correctly set when creating VLAN interfaces as well.
* Special case VLAN configuration
We can't use the match information when comparing the Supervisor
interface representation with D-Bus representations. Special-case VLANs
and compare using the VLAN ID and parent interface.
Note that this currently compares the connection UUID of the parent
interface.
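A minimal sketch of that comparison; the attribute names are
assumptions, not the exact Supervisor/D-Bus field names:

```python
def interfaces_match(iface, dbus_iface) -> bool:
    """Special-cased comparison; VLANs carry no usable match information."""
    if iface.type != dbus_iface.type:
        return False  # require an interface type match first
    if iface.type == "vlan":
        # Compare VLAN id plus parent; note the parent is currently the
        # connection UUID of the parent interface.
        return (
            iface.vlan.id == dbus_iface.vlan.id
            and iface.vlan.interface == dbus_iface.vlan.parent_uuid
        )
    return iface.name == dbus_iface.name
```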
* Fix pytest
* Separate VLAN creation logic from apply_changes
Apply changes is really all about updating the NetworkManager settings
of a particular network interface. The base in apply_changes() is the
NetworkInterface class, which is the NetworkManager Device abstraction.
All physical interfaces have such a Device, hence it is always present.
The only exception is when creating a VLAN: since a VLAN is a virtual
device, there is no Device at creation time.
This separates the two cases, which makes it much easier to reason
about whether a VLAN already exists or not, and to handle the case
where a VLAN needs to be created.
For all other network interfaces, the apply_changes() method can
now rely on the presence of the NetworkInterface Device abstraction.
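Illustrative control flow after the split; the helper names are
hypothetical stand-ins for the real Supervisor internals:

```python
# Hypothetical helpers standing in for the real Supervisor internals:
async def create_vlan_connection(interface) -> None: ...
async def apply_changes(interface) -> None: ...
def get_nm_device(interface): return getattr(interface, "nm_device", None)

async def setup_interface(interface) -> None:
    if interface.type == "vlan" and get_nm_device(interface) is None:
        # Virtual device: no NetworkManager Device exists yet
        await create_vlan_connection(interface)
    else:
        # All other interfaces can rely on the Device being present
        await apply_changes(interface)
```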
* Add VLAN test interface and VLAN exists test
Add a test which checks that an error gets raised when a VLAN for a
particular interface/id combination already exists.
* Address pylint
* Fix test_ignore_veth_only_changes pytest
* Make VLAN interface disabled to avoid test issues
* Reference setting 38 in mocked connection
* Make sure interface type matches
Require an interface type match before doing any comparison.
* Add Supervisor host network configuration tests
* Fix device type checking
* Fix pytest
* Fix tests by taking VLAN interface into account
* Fix test_load_with_network_connection_issues
This seems like a hack, but it turns out that the additional active
connection caused coresys.host.network.update() to be called, which
implicitly "fake" activated the connection. Now it seems that our
mocking causes the IPv4 gateway to be set.
So in a way, the test checked a particular mock behavior instead of
the actual intention.
The crucial part of this test is that we make sure the settings remain
unchanged. This is done by ensuring that the method is still auto.
* Fix test_check_network_interface_ipv4.py
Now that we have the VLAN interface active too, it will raise an issue
as well.
* Apply suggestions from code review
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
* Fix ruff check issue
---------
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
* Send progress updates during image pull for install/update
* Add extra to tests about job APIs
* Send out-of-date progress to Sentry and combine done event
* Pulling container image layer
* Storage space usage API
* Move to host API
* add tests
* fix test url
* more tests
* fix tests
* fix test
* PR comments
* update test
* tweak format and url
* add .DS_Store to .gitignore
* update tests
* test coverage
* update to new struct
* update test
* Enable IPv6 by default for new installations
Enable IPv6 by default for new Supervisor installations. Let's also
make the `enable_ipv6` attribute nullable, so we can distinguish
between "not set" and "set to false".
* Add pytest
* Add log message that system restart is required for IPv6 changes
* Fix API pytest
* Create resolution center issue when reboot is required
* Order log after actual setter call
* Rename repository fixture to test_repository
Also don't remove the built-in repositories. The list was incomplete,
and tests don't seem to require that anymore.
* Get rid of StoreType
The type doesn't have much value; we have constant strings anyway.
* Introduce types.py
* Use slug to determine which repository urls to return
* Simplify BuiltinRepository enum
* Mock GitRepo load
* Improve URL handling and repository creation logic
* Refactor update_repositories
* Get rid of get_from_url
It is no longer used in production code.
* More refactoring
* Address pylint
* Introduce is_git_based property to Repository class
Return all git-based URLs, including the Core repository.
* Revert "Introduce is_git_based property to Repository class"
This reverts commit dfd5ad79bf.
* Fold type.py into const.py
Align more with how Supervisor code is typically structured.
* Update supervisor/store/__init__.py
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
* Apply repository remove suggestion
* Fix tests
---------
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
When authentication using JSON payload or URL encoded payload fails,
use the generic HTTP response code 401 Unauthorized instead of 400
Bad Request.
This is a more appropriate response code for authentication errors
and is consistent with the behavior of other authentication methods.
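In aiohttp terms the change amounts to roughly this (a sketch;
`check_login` is an illustrative stand-in for the real credential
check):

```python
from aiohttp import web

async def check_login(request: web.Request) -> bool:
    """Illustrative stand-in for the real credential check."""
    return False

async def auth(request: web.Request) -> web.Response:
    if not await check_login(request):
        raise web.HTTPUnauthorized()  # was HTTPBadRequest (400)
    return web.Response()
```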
* Use Docker BuildKit to build addons
* Improve error message as suggested by CodeRabbit
* Fix container.remove() tests missing v=True
* Ignore squash rather than falling back to legacy builder
* Use version rather than tag to avoid confusion in run_command()
* Fix tests differently
* Use PropertyMock like other tests
* Restore position of fix_label fn
* Exempt addon builder image from unsupported checks
* Refactor tests
* Fix tests expecting wrong builder image
* Remove hardcoded paths
* Fix tests
* Remove get_addon_host_path() function
* Use docker buildx build rather than docker build
Co-authored-by: Stefan Agner <stefan@agner.ch>
---------
Co-authored-by: Stefan Agner <stefan@agner.ch>
* Rename detect-blocking-io API value to match other APIs
For the new detect-blocking-io option, use dashes instead of
underscores in `on-at-startup` for consistency with other API
endpoints.
This is a breaking change, but since the API is brand new and barely
used yet, it is fairly safe to do so.
* Fix pytest
* Fix mypy issues in store module
* Fix mypy issues in utils module
* Fix mypy issues in all remaining source files
* Fix ingress user typeddict
* Fixes from feedback
* Fix mypy issues after installing docker-types
Configurable and w/ migrations between IPv4-Only and Dual-Stack
Signed-off-by: David Rapan <david@rapan.cz>
Co-authored-by: Stefan Agner <stefan@agner.ch>
* feat: Add IPv6 address generation mode & privacy extensions
Signed-off-by: David Rapan <david@rapan.cz>
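For reference, the NetworkManager ipv6 settings involved look roughly
like this (a hedged sketch based on the nm-settings documentation; the
numeric values are the documented enum values, not Supervisor code):

```python
# Hedged example of an ipv6 settings fragment as sent to NetworkManager
ipv6_settings = {
    "method": "auto",
    "addr-gen-mode": 1,  # stable-privacy (RFC 7217)
    "ip6-privacy": 2,    # privacy extensions, prefer temporary addresses
}
```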
* Use NetworkManager fixture for settings init tests
This fixes the test, since the extended implementation can now read
the version of NetworkManager.
* Add pytest for addr_gen_mode
---------
Signed-off-by: David Rapan <david@rapan.cz>
Co-authored-by: Stefan Agner <stefan@agner.ch>
Instead of copying the backup in the main job, let's copy it in a
separate job per location. This allows using the same backup error
handling mechanism as for add-ons and folders.
This makes the stage introduced in #5784 somewhat redundant, but
before removing it, let's see if this approach works out.
* Harmonize folder and add-on backup error handling
Align add-on and folder backup error handling so that in both cases
errors are recorded on the respective backup Jobs, but not raised to
the caller. This allows the backup to complete successfully even if
some add-ons or folders fail to back up.
Along with this, also record errors in the per-add-on and per-folder
backup jobs, as well as the add-on and folder root job.
And finally, align the exception handling to only catch expected
exceptions for add-ons too.
* Fix pytest
* Recreate aiohttp ClientSession after DNS plug-in load
Create a temporary ClientSession early in case we need to load version
information from the internet. This doesn't use the final DNS setup
and hence might fail to load in certain situations since we don't have
the fallback mechanisms in place yet. But if the DNS container image
is present, we'll continue the setup and load the DNS plug-in. We can
then recreate the ClientSession such that it uses the DNS plug-in.
This works around an issue with aiodns, which today doesn't reload
`resolv.conf` automatically when it changes. This led to Supervisor
using the initial `resolv.conf` as created by Docker. It meant that
we did not use the DNS plug-in (and its fallback capabilities) in
Supervisor. It also meant that changes to the DNS setup at runtime
did not propagate to the aiohttp ClientSession (as observed in #5332).
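A condensed sketch of the sequence; the helper functions are
placeholders for the real setup steps:

```python
import asyncio

import aiohttp

async def fetch_version_info(session: aiohttp.ClientSession) -> None:
    """Placeholder for the early version JSON fetch."""

async def load_dns_plugin() -> None:
    """Placeholder for starting the DNS plug-in container."""

async def setup() -> aiohttp.ClientSession:
    # Temporary session: the DNS fallback is not in place yet, so this
    # early fetch may fail in some environments.
    async with aiohttp.ClientSession() as temp:
        await fetch_version_info(temp)
    await load_dns_plugin()
    # Recreate the session so aiodns re-reads resolv.conf and traffic
    # goes through the DNS plug-in from here on.
    return aiohttp.ClientSession()
```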
* Mock aiohttp.ClientSession for all tests
Currently in several places pytest actually uses the aiohttp
ClientSession and reaches out to the internet. This is not ideal
for unit tests and should be avoided.
This creates several new fixtures to aid this effort: The `websession`
fixture simply returns a mocked aiohttp.ClientSession, which can be
used whenever a function under test needs the global websession.
A separate new fixture named `supervisor_internet` mocks the
connectivity check, since this is often used through the Job
decorator, which requires INTERNET_SYSTEM.
And `mock_update_data` uses the already existing update JSON test
data from the fixture directory instead of loading the data from the
internet.
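A sketch of what the `websession` fixture amounts to (simplified; the
real fixture in conftest.py differs in detail):

```python
from unittest.mock import AsyncMock, MagicMock

import pytest

@pytest.fixture
def websession() -> MagicMock:
    """Mocked aiohttp.ClientSession so tests never reach the internet."""
    session = MagicMock()
    session.get = AsyncMock()
    session.post = AsyncMock()
    session.close = AsyncMock()
    return session
```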
* Log ClientSession nameserver information
When recreating the aiohttp ClientSession, log exactly which
nameservers are going to be used.
* Refuse ClientSession initialization when API is available
Previous attempts to reinitialize the ClientSession have shown use of
the ClientSession after it was closed, due to API requests being
handled in parallel to the reinitialization (see #5851).
Make sure this is not possible by refusing to reinitialize the
ClientSession when the API is available.
* Fix pytests
Also make sure we don't create aiohttp ClientSession objects unnecessarily.
* Apply suggestions from code review
Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
---------
Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
* Add basic test coverage for /auth API
* Check /auth API is called from an add-on
Currently the /auth API is only available for add-ons. Return 403
for calls not originating from an add-on.
* Handle bad json in auth API
Use the API-specific JSON load helper, which raises an APIError. This
causes the API to return a 400 error instead of a 500 error when the
JSON is invalid.
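The gist of the helper's behavior, as a sketch (not the actual
Supervisor helper):

```python
from aiohttp import web

async def api_json(request: web.Request) -> dict:
    """Return the request body as JSON, failing with a 400 instead of a 500."""
    try:
        return await request.json()
    except ValueError as err:  # json.JSONDecodeError is a ValueError
        raise web.HTTPBadRequest(reason="Invalid JSON") from err
```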
* Avoid redefining name 'mock_check_login'
* Update tests/api/test_auth.py
* Add dedicated update information reload
Currently we have the /refresh_updates endpoint, which updates the main
component versions (Core, OS, Supervisor, Plug-ins) and the add-on
store at the same time. This combined update causes more update
information reloads than necessary.
To allow fine-grained update refresh control, introduce a new endpoint
/reload_updates which asks Supervisor to only update main component
versions (learned through the version JSON files).
The /store/reload endpoint already allows updating the add-on store
separately.
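The resulting endpoint split, sketched as aiohttp routes (handler
bodies are placeholders):

```python
from aiohttp import web

async def reload_updates(request: web.Request) -> web.Response:
    return web.Response()  # placeholder: refresh main component versions

async def store_reload(request: web.Request) -> web.Response:
    return web.Response()  # placeholder: refresh the add-on store

app = web.Application()
app.add_routes([
    web.post("/reload_updates", reload_updates),  # versions only
    web.post("/store/reload", store_reload),      # store only
])
```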
* Add pytest
* Update supervisor/api/__init__.py
Similar to timezone, also add country information to the Supervisor
info. This is useful for setting country-specific configurations such
as the wireless radio regulatory setting. It is also useful for
add-ons which need country information but only have hassio API access.
Since Systemd v256 the Range header must not end with a trailing colon.
We relied on this undocumented feature when following logs, and the
frontend or CLI may still use it in requests. To fix the requests
failing with the new Systemd version, intercept the header and fill in
num_entries with the maximum possible value, which avoids
journal-gatewayd returning the response prematurely and also works on
older Systemd versions.
The journal-gatewayd would still return a response if the follow flag
is used along with num_entries, but this behavior is unchanged and
would be better fixed in the backend.
Link: https://github.com/systemd/systemd/issues/37172
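Sketched, the interception looks something like this, assuming
journal-gatewayd's `entries=cursor[[:num_skip]:num_entries]` Range
format (the maximum value is illustrative):

```python
def fix_range_header(value: str, max_entries: int = 2**31 - 1) -> str:
    """Fill in num_entries when the client sent a trailing colon."""
    if value.endswith(":"):
        return f"{value}{max_entries}"
    return value

assert fix_range_header("entries=:-99:") == f"entries=:-99:{2**31 - 1}"
```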
* Fix root path requests
Since #5759 we've tried to access the path explicitly. However, this
raises a KeyError exception when trying to access the proxied root path
(e.g. http://supervisor/core/api/). Before #5759, get was used, which
led to no exception, but instead inserted a `None` into the path.
It seems aiohttp doesn't provide a path when the root is accessed. So
simply convert this to no path as well by setting path to an empty
string.
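The fix boils down to one lookup (a sketch):

```python
from aiohttp import web

def proxied_path(request: web.Request) -> str:
    # aiohttp provides no "path" match for the root URL, so fall back
    # to an empty string instead of indexing (which raises KeyError).
    return request.match_info.get("path") or ""
```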
* Add rudimentary pytest for regular proxy requests
* Fix mypy issues in backups module
* Fix mypy issues in dbus module
* Fix mypy issues in api after rebase
* TypedDict to dataclass and other small fixes
* Finish fixing mypy errors in dbus
* local_where must exist
* Fix references to name in tests
* Improve Home Assistant Core WebSocket proxy implementation
This change removes unnecessary task creation for every WebSocket
message and instead creates just two tasks, one for each direction.
This improves performance by about a factor of 3 when measuring 1000
WebSocket requests to Core (from ~530ms to ~160ms).
While at it, also handle all WebSocket messages related to closing the
WebSocket, and report all other errors as warnings instead of just info.
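A simplified sketch of the two-task design (error handling and
close-code propagation trimmed; see the follow-up items below for how
the tasks are awaited in the end):

```python
import asyncio

from aiohttp import WSMsgType

async def _pump(source, target) -> None:
    """Forward messages in one direction until the source closes."""
    async for msg in source:
        if msg.type == WSMsgType.TEXT:
            await target.send_str(msg.data)
        elif msg.type == WSMsgType.BINARY:
            await target.send_bytes(msg.data)
        else:  # CLOSE/CLOSING/CLOSED/ERROR end the pump
            break

async def proxy(client_ws, server_ws) -> None:
    # One task per direction instead of one task per message
    await asyncio.wait(
        [
            asyncio.create_task(_pump(client_ws, server_ws)),
            asyncio.create_task(_pump(server_ws, client_ws)),
        ]
    )
```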
* Improve logging and error handling
* Add WS client error test case
* Use asyncio.gather directly
* Use asyncio.wait to handle exceptions gracefully
* Drop cancellation handling and correctly wait for the other proxy task
* Add API for swap configuration
Add an HTTP API for swap size and swappiness at /os/config/swap.
Individual options can be set in JSON and call the D-Bus API added in
OS Agent 1.7.x, available since OS 15.0. Check for the presence of an
OS of the required version and return 404 if the criteria are not met.
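A hedged sketch of the presence check and handler; the JSON field
names follow the commit text and the helper is an illustrative
placeholder:

```python
from aiohttp import web

def os_supports_swap() -> bool:
    """Placeholder for the OS Agent 1.7.x / OS 15.0 presence check."""
    return False

async def swap_options(request: web.Request) -> web.Response:
    if not os_supports_swap():
        raise web.HTTPNotFound(reason="Swap configuration requires OS 15.0 or newer")
    options = await request.json()  # e.g. {"swap_size": "1G", "swappiness": 1}
    ...
    return web.Response()
```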
* Fix type hints and reboot_required logic
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Fix formatting after adding suggestions from GH
* Address @mdegat01 review comments
- Improve swap options validation
- Add swap to the 'all' property of dbus agent
- Use APINotFound with reason instead of HTTPNotFound
- Reorder API routes
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Report stage with error in jobs
* Copy doesn't lose track of the successful copies
* Add stage to errors in api output test
* Revert unnecessary change to import
* Add tests for a bit more coverage of copy_additional_locations
* Handle unexpected WebSocket messages during auth
When an add-on does not respond or closes the WebSocket connection
during the authentication phase, Supervisor does not handle errors
gracefully. Simply log such unexpected authentication attempts to
avoid unnecessary stack traces in the log and keep such cases from
appearing on Sentry.
* Add pytest
* Introduce a timeout of 10s
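A sketch of the guarded auth phase with the 10s timeout (simplified;
the log message is illustrative):

```python
import asyncio
import logging

_LOGGER = logging.getLogger(__name__)

async def receive_auth(ws, timeout: float = 10.0):
    """Receive the auth message, logging instead of raising on surprises."""
    try:
        return await asyncio.wait_for(ws.receive(), timeout)
    except asyncio.TimeoutError:
        _LOGGER.warning("Timeout during add-on WebSocket authentication")
        return None
```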
* Add blockbuster library and find I/O from unit tests
* Fix lint and test issue
* Fixes from feedback
* Avoid modifying webapp object in executor
* Split su options validation and only validate timezone on change
* Replace non-unicode characters for add-on static files
Add-on documentation and changelogs get read and returned as text
files. However, if the original author used non-Unicode characters, or
the file is corrupted, loading currently fails with a
UnicodeDecodeError.
Let's just use Python's built-in replace error handling, so invalid
bytes appear to the user as the official Unicode replacement
character "�".
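The core of the fix is Python's built-in error handler (a sketch; the
function name is illustrative):

```python
from pathlib import Path

def read_addon_text(path: Path) -> str:
    # Corrupt or non-UTF-8 bytes become U+FFFD ("�") instead of
    # raising UnicodeDecodeError.
    return path.read_text(encoding="utf-8", errors="replace")
```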
* Remove superfluous parameter for binary files
* ruff format
* Add pytests
* Move read_text to executor
* Fix issues found by coderabbit
* formated to formatted
* switch to async_capture_exception
* Find and replace got one too many
* Update patch mock to async_capture_exception
* Drop Sentry capture from format_message
The error handling was introduced in #2052; however, #2100 essentially
makes sure a bytes object is never passed to this function. And even
if one were, the Sentry aiohttp plug-in would properly catch such an
exception.
---------
Co-authored-by: Stefan Agner <stefan@agner.ch>
* Initialize Supervisor Core state in constructor
Make sure the Supervisor Core state is set to a value early on. This
guarantees that the state is always of type CoreState, so any use of
the state can rely on it being an actual value from the CoreState
enum.
This fixes the Sentry filter during early startup, where the state
previously was None. Because of that, the Sentry filter tried to
collect more Context, which led to an exception and errors not being
reported.
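Conceptually the change is just seeding the state in the constructor
(a sketch; the real CoreState enum has more members):

```python
from enum import Enum

class CoreState(str, Enum):
    INITIALIZE = "initialize"
    SETUP = "setup"
    RUNNING = "running"

class Core:
    def __init__(self) -> None:
        # Never None: consumers can always rely on a real CoreState
        self._state: CoreState = CoreState.INITIALIZE
```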
* Fix pytest
It seems that with the state initialized early, pytest actually runs a
system evaluation with:
Starting system evaluation with state initialize
Before, it did that with:
Starting system evaluation with state None
It detects that the container is not running as privileged and declares
the system unhealthy.
It is unclear to me why coresys.core.healthy was checked in this
context; it doesn't seem useful. Just remove the check, and validate
the state through the getter instead.
* Update supervisor/core.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Make sure Supervisor container is privileged in pytest
With the Supervisor Core state being valid now, some evaluations
now actually run when loading the resolution center. This leads to
Supervisor getting declared unhealthy due to not running in a privileged
container under pytest.
Fake the host container to be privileged to make evaluations not
causing the system to be declared unhealthy under pytest.
* Avoid writing actual Supervisor run state file
With the Supervisor Core state being valid from the very start, we end
up writing a state every time.
Instead of actually writing a state file, simply validate that the
necessary calls are being made. This is more in line with typical unit
tests and avoids writing a file for every test.
* Extend WebSocket client fixture and use it consistently
Extend the ha_ws_client WebSocket client fixture to set Supervisor Core
into run state and clear all pending messages.
Currently only some tests use the ha_ws_client WebSocket client fixture.
Use it consistently for all tests.
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
When developing/testing in a Supervised environment, the
systemd-journal-gatewayd socket is actually available. Mock the
socket Path file to make the test independent of the pytest
environment.
* Avoid IO in event loop when removing backup
* Refactor backup size calculation
Currently size is lazy loaded when required via properties. This
however is blocking the async event loop.
Backup sizes don't change. Instead of lazy loading the size of a backup
simply determine it on loading/after creation.
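Sketched, the eager computation is a blocking stat() pushed off the
event loop (names and the MiB unit are illustrative):

```python
import asyncio
from pathlib import Path

def _backup_size(tar_path: Path) -> float:
    """Blocking stat(); run in an executor, never in the event loop."""
    return round(tar_path.stat().st_size / 2**20, 2)  # size in MiB

async def load_backup(tar_path: Path) -> float:
    loop = asyncio.get_running_loop()
    # Determine the size once at load time instead of lazily in a property
    return await loop.run_in_executor(None, _backup_size, tar_path)
```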
* Fix tests for backup size change
* Avoid IO in event loop when loading backups
* Avoid IO in event loop when importing a backup
* Validate Backup always before restoring
Since #5519 we check the encryption password early in the restore case.
This has the side effect that we check the file existence early too.
However, in the non-encryption case, the file is not checked early.
This PR changes the behavior to always validate the backup file before
restoring, ensuring both encryption and non-encryption cases are
handled consistently.
In particular, the last case of test_restore_immediate_errors actually
validates that behavior. That test should actually have failed so far.
But it seems that because we validate the backup shortly after freeze
anyway, the exception still got raised early enough.
A simple `await asyncio.sleep(10)` right after the freeze makes the
test case fail. With this change, the test works consistently.
* Address pylint
* Fix backup_manager tests
* Drop warning message
* Fix restoring unencrypted backup in corner case
If a backup has an encrypted and an unencrypted location, and the
encrypted location is restored first, the encryption key is still
cached. When the user then restores the unencrypted backup, it fails
because Supervisor still tries to use the encryption key.
* Add integration test for restoring backups with and without encryption
* Rename _validate_location_password to _set_location_password
* Reload backup metadata from restore location
* Revert "Reload backup metadata from restore location"
This reverts commit 9b47a1cfe9.
* Make pytest work/punt the ball on docker config restore issue
* Address pylint error
* Handle non-existing file in Backup password check too
Make sure we handle a non-existing backup file also when validating
the password.
* Update supervisor/backups/manager.py
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>
* Add test case and fix password check when multiple locations
* Mock default backup unprotected by default
Instead of setting the protected property which we might not use
everywhere, simply mock the default backup to be unprotected.
* Fix mock of protected backup
* Introduce test for validate_password
Testing showed that validate_password doesn't return anything. Extend
tests to cover this case and fix the actual code.
---------
Co-authored-by: Mike Degatano <michael.degatano@gmail.com>