Config flow for Scrape integration (#22494)

* Update scrape.markdown

* Fix lint

* Fix error

* Make nicer

* fix url

* Inline code

* templates

* Template2

* Fix table and styling

* Update scrape.markdown

* Update scrape.markdown

* Update scrape.markdown

* Update scrape.markdown

* Update scrape.markdown

* Update scrape.markdown

* Update scrape.markdown

* Add back css selector link
G Johansson 2022-06-04 13:46:34 +02:00 committed by GitHub
parent 4e83a6e26b
commit e5ec111c57


@@ -5,8 +5,10 @@ ha_category:
- Sensor - Sensor
ha_release: 0.31 ha_release: 0.31
ha_iot_class: Cloud Polling ha_iot_class: Cloud Polling
ha_config_flow: true
ha_codeowners: ha_codeowners:
- '@fabaff' - '@fabaff'
- '@gjohansson-ST'
ha_domain: scrape ha_domain: scrape
ha_platforms: ha_platforms:
- sensor - sensor
@@ -15,85 +17,15 @@ ha_integration_type: integration
The `scrape` sensor platform is scraping information from websites. The sensor loads an HTML page and gives you the option to search and split out a value. As this is not a full-blown web scraper like [scrapy](https://scrapy.org/), it will most likely only work with simple web pages and it can be time-consuming to get the right section. The `scrape` sensor platform is scraping information from websites. The sensor loads an HTML page and gives you the option to search and split out a value. As this is not a full-blown web scraper like [scrapy](https://scrapy.org/), it will most likely only work with simple web pages and it can be time-consuming to get the right section.
Check Beautifulsoup's [CSS selectors](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors) for details on how to write a **Select**.
If you are not using Home Assistant Container or Home Assistant Operating System, this integration requires `libxml2` to be installed. On Debian based installs, run: If you are not using Home Assistant Container or Home Assistant Operating System, this integration requires `libxml2` to be installed. On Debian based installs, run:
```bash ```bash
sudo apt install libxml2 sudo apt install libxml2
``` ```
To enable this sensor, add the following lines to your `configuration.yaml` file: {% include integrations/config_flow.md %}
```yaml
# Example configuration.yaml entry
sensor:
- platform: scrape
resource: https://www.home-assistant.io
select: ".current-version h1"
```
{% configuration %}
resource:
description: The URL to the website that contains the value.
required: true
type: string
select:
description: "Defines the HTML tag to search for. Check Beautifulsoup's [CSS selectors](https://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors) for details."
required: true
type: string
attribute:
description: Get value of an attribute on the selected tag.
required: false
type: string
index:
description: Defines which of the elements returned by the CSS selector to use.
required: false
default: 0
type: integer
name:
description: Name of the sensor.
required: false
default: Web scrape
type: string
value_template:
description: Defines a template to get the state of the sensor.
required: false
type: template
unit_of_measurement:
description: Defines the units of measurement of the sensor, if any.
required: false
type: string
device_class:
description: The [type/class](/integrations/sensor/#device-class) of the sensor to set the icon in the frontend.
required: false
type: device_class
default: None
state_class:
description: The [state_class](https://developers.home-assistant.io/docs/core/entity/sensor#available-state-classes) of the sensor.
required: false
type: string
default: None
authentication:
description: Type of the HTTP authentication. Either `basic` or `digest`.
required: false
type: string
verify_ssl:
description: Enables/disables verification of SSL-certificate, for example if it is self-signed.
required: false
type: boolean
default: true
username:
description: The username for accessing the website.
required: false
type: string
password:
description: The password for accessing the website.
required: false
type: string
headers:
description: Headers to use for the web request.
required: false
type: string
{% endconfiguration %}
## Examples ## Examples
@@ -103,97 +35,67 @@ In this section you find some real-life examples of how to use this sensor. Ther
The current release of Home Assistant is published on [https://www.home-assistant.io/](/) The current release of Home Assistant is published on [https://www.home-assistant.io/](/)
{% raw %} | Field | Value |
| --- | --- |
```yaml | **Resource** | https://www.home-assistant.io |
sensor: | **Name** | Release |
# Example configuration.yaml entry | **Select** | `.current-version h1` |
- platform: scrape | **Value Template** | {% raw %}`{{ value.split(':')[1] }}`{% endraw %} |
resource: https://www.home-assistant.io
name: Release
select: ".current-version h1"
value_template: '{{ value.split(":")[1] }}'
```
{% endraw %}
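The **Value Template** above calls Python's `str.split` on the scraped text. A minimal sketch of what that expression evaluates to, assuming the selected heading has the hypothetical form `Current Version: 2022.6.0` (the real page content may differ):

```python
# Hypothetical scraped text; the actual heading on the page may differ.
value = "Current Version: 2022.6.0"

# Same expression as in the value template: keep the part after the first colon.
release = value.split(":")[1]

# repr() makes any surrounding whitespace visible.
print(repr(release))
```

Note that `split` keeps whatever whitespace follows the colon, so the resulting state may start with a space unless the template also trims it.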
### Available implementations ### Available implementations
Get the counter for all our implementations from the [Component overview](/integrations/) page. Get the counter for all our implementations from the [Component overview](/integrations/) page.
{% raw %} | Field | Value |
| --- | --- |
```yaml | **Resource** | https://www.home-assistant.io/integrations/ |
# Example configuration.yaml entry | **Name** | Home Assistant impl. |
sensor: | **Select** | `a[href="#all"]` |
- platform: scrape | **Value Template** | {% raw %}`{{ value.split('(')[1].split(')')[0] }}`{% endraw %} |
resource: https://www.home-assistant.io/integrations/
name: Home Assistant impl.
select: 'a[href="#all"]'
value_template: '{{ value.split("(")[1].split(")")[0] }}'
```
{% endraw %}
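The chained `split` calls in this **Value Template** extract the text between the first pair of parentheses. A short sketch, assuming the selected link text looks like the hypothetical `All (1900)`:

```python
# Hypothetical link text; the real counter on the page will differ.
value = "All (1900)"

# Same expression as in the value template:
# take everything after the first "(", then everything before the next ")".
count = value.split("(")[1].split(")")[0]

print(count)
```

If the selected text contains no parentheses, the `[1]` lookup fails, so the template only works while the page keeps this format.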
### Get a value out of a tag ### Get a value out of a tag
The German [Federal Office for Radiation Protection (Bundesamt für Strahlenschutz)](http://www.bfs.de/) publishes various details about optical radiation, including a UV index. This example gets the index for a region in Germany. The German [Federal Office for Radiation Protection (Bundesamt für Strahlenschutz)](http://www.bfs.de/) publishes various details about optical radiation, including a UV index. This example gets the index for a region in Germany.
```yaml | Field | Value |
# Example configuration.yaml entry | --- | --- |
sensor: | **Resource** | http://www.bfs.de/DE/themen/opt/uv/uv-index/prognose/prognose_node.html |
- platform: scrape | **Name** | Coast Ostsee |
resource: http://www.bfs.de/DE/themen/opt/uv/uv-index/prognose/prognose_node.html | **Select** | `p` |
name: Coast Ostsee | **Index** | `19` |
select: "p" | **Unit of Measurement** | `UV Index` |
index: 19
unit_of_measurement: "UV Index"
```
### IFTTT status ### IFTTT status
If you make heavy use of the [IFTTT](/integrations/ifttt/) web service for your automations and are curious about the [status of IFTTT](https://status.ifttt.com/) then you can display the current state of IFTTT in your frontend. If you make heavy use of the [IFTTT](/integrations/ifttt/) web service for your automations and are curious about the [status of IFTTT](https://status.ifttt.com/) then you can display the current state of IFTTT in your frontend.
```yaml | Field | Value |
# Example configuration.yaml entry | --- | --- |
sensor: | **Resource** | https://status.ifttt.com/ |
- platform: scrape | **Name** | IFTTT status |
resource: https://status.ifttt.com/ | **Select** | `.component-status` |
name: IFTTT status
select: ".component-status"
```
### Get the latest podcast episode file URL ### Get the latest podcast episode file URL
This example gets the file URL for the latest episode of your [favorite podcast](https://hasspodcast.io/), so you can pass it on to a compatible media player. This example gets the file URL for the latest episode of your [favorite podcast](https://hasspodcast.io/), so you can pass it on to a compatible media player.
```yaml | Field | Value |
# Example configuration.yaml entry | --- | --- |
sensor: | **Resource** | https://hasspodcast.io/feed/podcast |
- platform: scrape | **Name** | Home Assistant Podcast |
resource: https://hasspodcast.io/feed/podcast | **Select** | `enclosure` |
name: Home Assistant Podcast | **Index** | `1` |
select: "enclosure" | **Attribute** | `url` |
index: 1
attribute: url
```
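To illustrate how **Select**, **Index**, and **Attribute** combine in this example, here is a rough sketch. It does not reproduce the integration's code: it uses Python's standard `xml.etree.ElementTree` as a stand-in for the Beautiful Soup CSS selectors, matches by tag name only, and assumes the index is zero-based (as the default of `0` suggests). The feed content and the `pick_attribute` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened podcast feed; real feeds carry many more fields.
FEED = """<rss version="2.0"><channel>
  <item><enclosure url="https://example.com/ep3.mp3" type="audio/mpeg" length="1"/></item>
  <item><enclosure url="https://example.com/ep2.mp3" type="audio/mpeg" length="1"/></item>
</channel></rss>"""


def pick_attribute(document, tag, index, attribute):
    """Return `attribute` of the `index`-th `tag` element, in document order."""
    # "Select": collect every element with the given tag name.
    matches = list(ET.fromstring(document).iter(tag))
    # "Index": choose one match; "Attribute": read a value from that element.
    return matches[index].get(attribute)


episode_url = pick_attribute(FEED, "enclosure", 1, "url")
print(episode_url)
```

With **Index** set to `1`, the second matching `enclosure` element is used, and its `url` attribute becomes the sensor state instead of the element's text.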
### Energy price ### Energy price
This example tries to retrieve the price for electricity. This example tries to retrieve the price for electricity.
{% raw %} | Field | Value |
| --- | --- |
```yaml | **Resource** | https://elen.nu/timpriser-pa-el-for-elomrade-se3-stockholm/ |
# Example configuration.yaml entry | **Name** | Electricity price |
sensor: | **Select** | `.text-lg:is(span)` |
- platform: scrape | **Index** | `1` |
resource: https://elen.nu/timpriser-pa-el-for-elomrade-se3-stockholm/ | **Value Template** | {% raw %}`{{ value \| replace(',', '.') \| float }}`{% endraw %} |
name: Electricity price | **Unit of Measurement** | `öre/kWh` |
select: ".text-lg:is(span)"
index: 1
value_template: '{{ value | replace (",", ".") | float }}'
unit_of_measurement: "öre/kWh"
```
{% endraw %}
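The **Value Template** in this example converts a number written with a decimal comma into a float. A minimal Python sketch of the same two steps, assuming a hypothetical scraped value of `123,45` and ignoring how the template engine handles values that cannot be converted:

```python
# Hypothetical scraped text using a decimal comma.
value = "123,45"

# Step 1 (replace filter): swap the decimal comma for a decimal point.
normalized = value.replace(",", ".")

# Step 2 (float filter): convert the normalized text to a number.
price = float(normalized)

print(price)
```

Without the replacement step, `float("123,45")` raises a `ValueError` in Python, which is why the template normalizes the separator first.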