* Remove release assets tarball on frontend update
* Fix path to the removed tarball
---------
Co-authored-by: Jan Čermák <sairon@users.noreply.github.com>
* Finish out effort of adding and enabling blockbuster
* Skip getting addon file size until securetar fixed
* Fix test for devcontainer and blocking I/O
* Fix docker fixture and load_config to post_init
* Use Sentry helper function to report warnings
Don't use Sentry directly but the existing helper function.
* Add pytest that Sentry is by default off
* Address ruff
* Address ruff
* Add blockbuster library and find I/O from unit tests
* Fix lint and test issue
* Fixes from feedback
* Avoid modifying webapp object in executor
* Split su options validation and only validate timezone on change
This essentially reverts PR #5685.
The Sentry `AsyncioIntegration` replaces the asyncio task factory with
its instrumented version, which reports all exceptions that
aren't handled *within* a task to Sentry.
However, we quite often run tasks and handle exceptions outside, e.g.
this common pattern (example from `MountManager` `reload()`):
```python
results = await asyncio.gather(
*[mount.update() for mount in mounts], return_exceptions=True
)
... create resolution issues from results with exceptions ...
```
Here, asyncio.gather() uses ensure_future(), which converts the
coroutines to tasks. These Sentry instrumented tasks will then report
exceptions to Sentry, even though we handle exceptions gracefully.
So the `AsyncioIntegration` doesn't work for our use case, and causes
unnecessary noise in Sentry. Disable it again.
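For illustration, a minimal self-contained sketch of the pattern described
above, where exceptions are collected by `asyncio.gather()` and handled
outside the tasks. The `update()` and `reload()` functions and the mount
names are hypothetical stand-ins, not the actual `MountManager` code:

```python
import asyncio


async def update(name: str) -> str:
    """Hypothetical stand-in for a mount update that may fail."""
    if name == "broken":
        raise RuntimeError(f"could not update {name}")
    return name


async def reload(names: list[str]) -> tuple[list[str], list[BaseException]]:
    """Run all updates and sort the results into successes and failures."""
    # return_exceptions=True hands exceptions back as results instead of
    # raising them, so they are handled here, outside of the tasks.
    results = await asyncio.gather(
        *[update(name) for name in names], return_exceptions=True
    )
    succeeded = [r for r in results if not isinstance(r, BaseException)]
    failed = [r for r in results if isinstance(r, BaseException)]
    return succeeded, failed


succeeded, failed = asyncio.run(reload(["media", "broken", "share"]))
```

From the caller's point of view the failure is fully handled, which is why
an additional report from an instrumented task is just noise.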
* Fix add-on store repository getting removed without internet
Currently, when a git command error happens in `pull()`, we declare
the repository as corrupt. Subsequent system autofix runs then execute
the reset resolution, which essentially removes the git repository from
the system.
In situations where the Internet fails right between the last
Supervisor connectivity check and the add-on store repository update
(the connectivity checks are throttled to once every 10 minutes while
connectivity is considered good), or if the outage is only partial
(e.g. reaching connectivity check works but the store repository is not
reachable), this leads to a git command error which declares the
repository as corrupt just as well, and ultimately leads to the removal
of the add-on store repository from the local system.
Run a git ls-remote first, which is used as an extra connectivity check.
This will also avoid removing the repository if Internet connectivity
works but the git provider is temporarily down or not reachable.
That said, it will also fail if the repository is no longer present.
But this case needs extra handling anyway.
* Run git ls-remote in executor
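A rough sketch of the idea of using `git ls-remote` as a reachability probe
before pulling. This is an assumption-laden illustration: it shells out to
the `git` CLI through `subprocess`, and the names `remote_reachable`,
`update_repository` and the injectable `run`/`pull` parameters are
hypothetical, not what Supervisor actually uses:

```python
import subprocess
from collections.abc import Callable


def remote_reachable(
    url: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
    timeout: float = 30.0,
) -> bool:
    """Return True if `git ls-remote <url>` succeeds (hypothetical helper)."""
    try:
        result = run(
            ["git", "ls-remote", url],
            capture_output=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        # git missing, network hanging, etc.: treat as not reachable.
        return False
    return result.returncode == 0


def update_repository(url: str, pull: Callable[[], None], run=subprocess.run) -> str:
    """Only attempt the pull (and possibly flag corruption) if the remote answers."""
    if not remote_reachable(url, run=run):
        return "skipped"  # leave the local repository untouched
    try:
        pull()
    except RuntimeError:
        return "corrupt"  # remote was reachable, so the problem is likely local
    return "updated"
```

Since the call blocks on network I/O, it would typically be run in an
executor from async code.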
* Make sure to close file stream after backup upload
Currently the file stream does not get closed before importing the
file. It seems the test case didn't catch that, presumably because
whether the bytes get flushed to disk in time is a race condition.
Properly close the stream before continuing to handle the file.
* Close file stream in executor
* Add comment about closing twice is fine
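As a small illustration of closing the stream before the file is used
further, and of why a second close is harmless, here is a sketch with a
hypothetical `save_upload()` helper (not the actual backup upload code):

```python
import tempfile
from pathlib import Path


def save_upload(chunks: list[bytes], target: Path) -> int:
    """Write chunks to target and return the resulting file size."""
    stream = target.open("wb")
    try:
        for chunk in chunks:
            stream.write(chunk)
        # Close explicitly so all buffered bytes are flushed before
        # anything else reads the file.
        stream.close()
        return target.stat().st_size
    finally:
        # Also covers the error path. Closing an already closed file
        # object is a no-op, so closing twice is fine.
        stream.close()


with tempfile.TemporaryDirectory() as tmp:
    size = save_upload([b"hello ", b"world"], Path(tmp) / "backup.tar")
```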
* Move read_text to executor
* switch to async_capture_exception
* Finish moving read_text to executor
* Cover read_bytes and some write_text calls as well
* Fix await issues
* Fix format_message
When connection is closed by the client, the journal_logs_reader
generator still returns new lines, trying to write each one of them to
the closed transport. With debug logging enabled, this can end up in an
endless loop. To fix that, break out of the loop immediately after the
connection is reset.
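The fix can be sketched roughly as follows. The generator and transport
below are simplified stand-ins (`journal_lines`, `FakeTransport` and
`stream_logs` are hypothetical names), and the built-in
`ConnectionResetError` is used in place of the aiohttp exception:

```python
import asyncio
from collections.abc import AsyncIterator


async def journal_lines() -> AsyncIterator[str]:
    """Hypothetical stand-in for a reader that keeps producing lines."""
    for i in range(1000):
        yield f"line {i}\n"


class FakeTransport:
    """Accepts a few writes, then behaves like a closed connection."""

    def __init__(self, fail_after: int) -> None:
        self.written: list[str] = []
        self.fail_after = fail_after

    async def write(self, data: str) -> None:
        if len(self.written) >= self.fail_after:
            raise ConnectionResetError("client went away")
        self.written.append(data)


async def stream_logs(transport: FakeTransport) -> int:
    """Forward lines until the source ends or the client disconnects."""
    sent = 0
    async for line in journal_lines():
        try:
            await transport.write(line)
        except ConnectionResetError:
            # Stop right away instead of failing once per remaining line.
            break
        sent += 1
    return sent


transport = FakeTransport(fail_after=3)
sent = asyncio.run(stream_logs(transport))
```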
* Exclude non-Supervisor Server Error codes from Sentry reporting
Exclude status codes 502 Bad Gateway and 503 Service Unavailable from
Sentry aiohttp integration. These are returned by Supervisor itself
when acting as a proxy for Home Assistant Core. These aren't errors of
Supervisor.
* ruff check
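To make the resulting set of reported status codes concrete, here is a
sketch of the exclusion. It assumes that all 5xx codes are reported by
default; the constant and function names are hypothetical, and how the set
is handed to the Sentry aiohttp integration is not shown:

```python
from http import HTTPStatus

# Assumption: every 5xx response counts as a failed request by default.
DEFAULT_FAILED_STATUS_CODES = frozenset(range(500, 600))

# Codes returned while proxying for Home Assistant Core.
PROXY_STATUS_CODES = frozenset(
    {int(HTTPStatus.BAD_GATEWAY), int(HTTPStatus.SERVICE_UNAVAILABLE)}
)

REPORTED_STATUS_CODES = DEFAULT_FAILED_STATUS_CODES - PROXY_STATUS_CODES


def should_report(status: int) -> bool:
    """Return True if a response with this status should be reported."""
    return status in REPORTED_STATUS_CODES
```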
* Replace non-unicode characters for add-on static files
Add-on documentation and changelog are read and returned as text files.
However, if the original author used non-unicode characters, or the
file is corrupted, loading currently fails with a UnicodeDecodeError.
Let's just use the built-in replace error handling of Python, so that
invalid bytes are shown to the user as the official unicode
replacement character "�".
* Remove superfluous parameter for binary files
* ruff format
* Add pytests
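The replace error handling mentioned above behaves as in this small sketch.
The helper name `read_text_lenient` and the file name are made up for the
example:

```python
import tempfile
from pathlib import Path


def read_text_lenient(path: Path) -> str:
    """Read a text file, replacing undecodable bytes with U+FFFD."""
    return path.read_text(encoding="utf-8", errors="replace")


with tempfile.TemporaryDirectory() as tmp:
    doc = Path(tmp) / "DOCS.md"
    # 0xE9 is "é" in Latin-1 but not a valid UTF-8 sequence on its own.
    doc.write_bytes(b"caf\xe9 au lait")
    text = read_text_lenient(doc)
```

With strict decoding the same read would raise a UnicodeDecodeError.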
When initially loading the store manager, update_repositories makes
sure that all repositories are actually present. If they are for some
reason corrupted or content is missing, we currently still try
to load them, which leads to an unnecessary warning:
```
2025-03-03 11:55:54.324 WARNING (SyncWorker_1) [supervisor.store.data] No repository information exists at /data/addons/git/a0d7b954
...
2025-03-03 11:55:54.343 INFO (MainThread) [supervisor.store.git] Cloning add-on https://github.com/hassio-addons/repository repository
```
Since update_repositories always loads the data, simply remove the
superfluous earlier loading attempt.
While at it, also improve the cloning/update log messages to make it
clear what repository is cloned/updated.
* Suppress all ClientConnectionReset when returning logs
In #5358 we started suppressing ClientConnectionReset when logs are
returned from the Journal Gateway and the client ends connection
unexpectedly. The connection can also be closed when the headers are
returned, so ignore that error as well.
Refs #5606
* Log ClientConnectionResetError as DEBUG instead of suppressing it
* Fix cloning of add-on store repository
Since #5669, the add-on store reset no longer deletes the root
directory. However, if the root directory is not present, the current
code no longer invokes cloning and instead tries to load the git
repository directly.
With this change, the code clones whenever there is no .git directory,
which works for both cases.
* Fix pytest
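The clone condition can be sketched like this. The helper name
`needs_clone` is hypothetical and only illustrates the check on the `.git`
directory, not the actual store code:

```python
import tempfile
from pathlib import Path


def needs_clone(repo_path: Path) -> bool:
    """Clone whenever there is no .git directory below repo_path."""
    return not (repo_path / ".git").is_dir()


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "a0d7b954"
    missing_root = needs_clone(root)  # root directory does not exist

    root.mkdir()
    empty_root = needs_clone(root)  # root exists, e.g. after a reset

    (root / ".git").mkdir()
    cloned = needs_clone(root)  # repository is present
```

The same check covers both a missing root directory and an empty one left
behind by a reset.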
* Move read_text to executor
* Fix issues found by coderabbit
* formated to formatted
* switch to async_capture_exception
* Find and replace got one too many
* Update patch mock to async_capture_exception
* Drop Sentry capture from format_message
The error handling got introduced in #2052, however, #2100 essentially
makes sure there will never be a byte object passed to this function.
And even if one were, the Sentry aiohttp plug-in will properly catch
such an exception.
---------
Co-authored-by: Stefan Agner <stefan@agner.ch>
Since #5696 we don't need to load the resolution center early. In fact,
with #5686 this is even problematic for pytests in devcontainer, since
the Supervisor Core state is valid and this causes AppArmor evaluations
to run (and fail).
Actually, #5696 removed the resolution center. #5686 accidentally
brought it back. This was seemingly a merge error.
By default, warnings are simply printed to stderr. This makes them
easy to miss in the log. Capture warnings and use the Python logger to
log them with warning level.
Also, if the message is an instance of Exception (which it typically
is), report the warning to Sentry. This is e.g. useful for asyncio
RuntimeWarning warnings "coroutine was never awaited".
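As an illustration of routing warnings into logging, here is a sketch based
on the standard library's `logging.captureWarnings()`. This is an
assumption about the general technique, not necessarily how Supervisor
hooks into the warnings module, and the Sentry reporting is left out.
`ListHandler` and `capture_warning_records` are hypothetical helpers:

```python
import logging
import warnings
from collections.abc import Callable


class ListHandler(logging.Handler):
    """Collects log records in memory so they can be inspected."""

    def __init__(self) -> None:
        super().__init__()
        self.records: list[logging.LogRecord] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.records.append(record)


def capture_warning_records(trigger: Callable[[], None]) -> list[logging.LogRecord]:
    """Run trigger() and return the warnings it raised as log records."""
    handler = ListHandler()
    logger = logging.getLogger("py.warnings")
    logger.addHandler(handler)
    # Redirect warnings from stderr to the "py.warnings" logger.
    logging.captureWarnings(True)
    try:
        with warnings.catch_warnings():
            warnings.simplefilter("always")
            trigger()
    finally:
        logging.captureWarnings(False)
        logger.removeHandler(handler)
    return handler.records


records = capture_warning_records(
    lambda: warnings.warn("coroutine was never awaited", RuntimeWarning)
)
```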
* Initialize Supervisor Core state in constructor
Make sure the Supervisor Core state is set to a value early on. This
makes sure that the state is always of type CoreState, and makes sure
that any use of the state can rely on it being an actual value from the
CoreState enum.
This fixes Sentry filter during early startup, where the state
previously was None. Because of that, the Sentry filter tried to
collect more Context, which led to an exception and not reporting
errors.
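The idea can be sketched with a trimmed-down, hypothetical `Core` class.
The enum below only lists two illustrative members and is not the full
Supervisor `CoreState`:

```python
from enum import Enum


class CoreState(Enum):
    """Trimmed, illustrative subset of states."""

    INITIALIZE = "initialize"
    RUNNING = "running"


class Core:
    """Hypothetical minimal holder of the core state."""

    def __init__(self) -> None:
        # Set a real enum value right away so the state is never None.
        self._state: CoreState = CoreState.INITIALIZE

    @property
    def state(self) -> CoreState:
        return self._state

    def set_state(self, new_state: CoreState) -> None:
        self._state = CoreState(new_state)


core = Core()
initial_state = core.state
core.set_state(CoreState.RUNNING)
```

Code that inspects the state, such as a filter running during early
startup, can then rely on always getting a `CoreState` member.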
* Fix pytest
It seems that with initializing the state early, the pytest actually
runs a system evaluation with:
Starting system evaluation with state initialize
Before it did that with:
Starting system evaluation with state None
It detects that the container runs as privileged, and declares the
system as unhealthy.
It is unclear to me why coresys.core.healthy was checked in this
context; it doesn't seem useful. Just remove the check, and validate
the state through the getter instead.
* Update supervisor/core.py
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Make sure Supervisor container is privileged in pytest
With the Supervisor Core state being valid now, some evaluations
now actually run when loading the resolution center. This leads to
Supervisor getting declared unhealthy due to not running in a privileged
container under pytest.
Fake the host container to be privileged so that evaluations do not
cause the system to be declared unhealthy under pytest.
* Avoid writing actual Supervisor run state file
With the Supervisor Core state being valid from the very start, we end
up writing a state every time.
Instead of actually writing a state file, simply validate that the
necessary calls are being made. This conforms better to typical unit
tests and avoids writing a file for every test.
* Extend WebSocket client fixture and use it consistently
Extend the ha_ws_client WebSocket client fixture to set Supervisor Core
into run state and clear all pending messages.
Currently only some tests use the ha_ws_client WebSocket client fixture.
Use it consistently for all tests.
---------
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
* Load resolution evaluation, check and fixups early
Before #5652, these modules were loaded in the constructor, hence early
in `initialize_coresys()`. Moving them late actually exposed an issue
where NetworkManager connectivity setter couldn't get the
`connectivity_check` evaluation, leading to an exception early in
bootstrap.
Technically, it might be safe to load the resolution modules only in
`Core.connect()`, however then we'd have to load them separately for
pytest. Let's go conservative and load them in the same place where they
got loaded before #5652.
* Load resolution modules in a single executor call
* Fix pytest