home-assistant.io/source/_components/sensor.scrape.markdown
2019-01-26 10:55:55 +01:00

---
layout: page
title: "Scrape Sensor"
description: "Instructions on how to integrate Web scrape sensors into Home Assistant."
date: 2016-10-12 09:10
sidebar: true
comments: false
sharing: true
footer: true
logo: home-assistant.png
ha_category: Sensor
ha_release: 0.31
ha_iot_class: "Cloud Polling"
---

The scrape sensor platform scrapes information from websites. The sensor loads an HTML page and lets you search for and split out a value. Because this is not a full-blown web scraper like scrapy, it will most likely only work with simple web pages, and finding the right section can be time-consuming.

To enable this sensor, add the following lines to your configuration.yaml file:

# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: https://www.home-assistant.io
    select: ".current-version h1"
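To give a feel for what the `select` option does, here is a minimal Python sketch that mimics the selector `.current-version h1` using only the standard library's `html.parser`. The sensor itself relies on Beautiful Soup's CSS selectors; the `DescendantH1Parser` class, the sample markup, and the version string are made up for illustration, and the sketch assumes the first match is the one that becomes the sensor value.

```python
from html.parser import HTMLParser

# Hypothetical page fragment; the real page layout may differ.
SAMPLE_HTML = """
<div class="current-version material-card">
  <h1>Current Version: 0.86.2</h1>
</div>
<h1>Not selected</h1>
"""


class DescendantH1Parser(HTMLParser):
    """Collects the text of every <h1> nested inside class="current-version"."""

    def __init__(self):
        super().__init__()
        self.open_tags = []  # (tag, has_target_class) for each open element
        self.in_h1 = False   # True while inside a matching <h1>
        self.matches = []    # text of each matching <h1>

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        inside_target = any(flag for _, flag in self.open_tags)
        if tag == "h1" and inside_target:
            self.in_h1 = True
            self.matches.append("")
        self.open_tags.append((tag, "current-version" in classes))

    def handle_endtag(self, tag):
        if tag == "h1":
            self.in_h1 = False
        # Pop back to the most recent open element with this tag name.
        for index in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[index][0] == tag:
                del self.open_tags[index:]
                break

    def handle_data(self, data):
        if self.in_h1:
            self.matches[-1] += data


parser = DescendantH1Parser()
parser.feed(SAMPLE_HTML)
parser.close()

value = parser.matches[0]  # assumption: the first match is used
print(value)
```

The second `<h1>` is ignored because it is not a descendant of an element carrying the `current-version` class, which is what the space in the selector expresses.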

{% configuration %}
resource:
  description: The URL to the website that contains the value.
  required: true
  type: string
select:
  description: "Defines the HTML tag to search for. Check Beautifulsoup's CSS selectors for details."
  required: true
  type: string
attribute:
  description: Get value of an attribute on the selected tag.
  required: false
  type: string
name:
  description: Name of the sensor.
  required: false
  default: Web scrape
  type: string
unit_of_measurement:
  description: Defines the units of measurement of the sensor, if any.
  required: false
  type: string
authentication:
  description: Type of the HTTP authentication. Either basic or digest.
  required: false
  type: string
username:
  description: The username for accessing the website.
  required: false
  type: string
password:
  description: The password for accessing the website.
  required: false
  type: string
headers:
  description: Headers to use for the web request.
  required: false
  type: string
{% endconfiguration %}
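For context on the `authentication` option: with HTTP basic authentication, the username and password are sent base64-encoded in an `Authorization` header, while digest authentication uses a challenge-response exchange instead. The following Python sketch only illustrates the basic scheme. The credentials are placeholders, and this is not meant to describe how the sensor is implemented internally.

```python
import base64

# Placeholder credentials, for illustration only.
username = "user"
password = "pass"

# Basic authentication joins both values with a colon and base64-encodes them.
token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
authorization_header = f"Basic {token}"

print(authorization_header)
```

Because base64 is an encoding rather than encryption, basic authentication is best combined with an `https` resource.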

{% linkable_title Examples %}

In this section you will find some real-life examples of how to use this sensor. There is also a Jupyter notebook available for this example to give you a bit more insight.

{% linkable_title Home Assistant %}

The current release of Home Assistant is published on https://www.home-assistant.io/

{% raw %}

# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: https://www.home-assistant.io
    name: Release
    select: ".current-version h1"
    value_template: '{{ value.split(":")[1] }}'

{% endraw %}
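The expression in the `value_template` above calls Python's `str.split` on the scraped text. As a rough sketch of what it evaluates to in plain Python, assuming the selected heading reads something like the hypothetical string below:

```python
# Hypothetical scraped text; the real heading may be worded differently.
value = "Current Version: 0.86.2"

parts = value.split(":")   # split at the colon into two pieces
release = parts[1]         # the piece after the colon, leading space included

# Stripping whitespace is shown for illustration only.
release_clean = release.strip()

print(repr(release), repr(release_clean))
```

If the heading contained no colon, `parts` would have a single element and index `1` would not exist, so the template depends on the page keeping this format.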

{% linkable_title Available implementations %}

Get the total count of our implementations from the Component overview page.

{% raw %}

# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: https://www.home-assistant.io/components/
    name: Home Assistant impl.
    select: 'a[href="#all"]'
    value_template: '{{ value.split("(")[1].split(")")[0] }}'

{% endraw %}

{% linkable_title Get a value out of a tag %}

The German Federal Office for Radiation Protection (Bundesamt für Strahlenschutz) publishes various details about optical radiation, including a UV index. This example gets the index for a region in Germany.

# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: http://www.bfs.de/DE/themen/opt/uv/uv-index/prognose/prognose_node.html
    name: Coast Ostsee
    select: 'p:nth-of-type(19)'
    unit_of_measurement: 'UV Index'

{% linkable_title IFTTT status %}

If you make heavy use of the IFTTT web service for your automations and are curious about its status, you can display the current state of IFTTT in your frontend.

# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: http://status.ifttt.com/
    name: IFTTT status
    select: '.component-status'

{% linkable_title Get the latest podcast episode file URL %}

This example gets the file URL for the latest episode of your favorite podcast, so you can pass it on to a compatible media player.

# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: https://hasspodcast.io/feed/podcast
    name: Home Assistant Podcast
    select: 'enclosure:nth-of-type(1)'
    attribute: url
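The combination of `select` and `attribute` above can be pictured with a small Python sketch. It uses the standard library's `xml.etree.ElementTree` rather than Beautiful Soup, and the feed content and URLs are invented for illustration. It assumes the newest episode is listed first in the feed and that the first matching `enclosure` element is the one used.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily shortened podcast feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Podcast</title>
    <item>
      <title>Episode 2</title>
      <enclosure url="https://example.com/episodes/2.mp3" length="456" type="audio/mpeg"/>
    </item>
    <item>
      <title>Episode 1</title>
      <enclosure url="https://example.com/episodes/1.mp3" length="123" type="audio/mpeg"/>
    </item>
  </channel>
</rss>
"""

root = ET.fromstring(SAMPLE_FEED)

# Find the first <enclosure> in document order, then read its "url" attribute
# instead of its (empty) text content.
first_enclosure = root.find(".//enclosure")
episode_url = first_enclosure.get("url")

print(episode_url)
```

Without the `attribute` option the sensor would report the text of the tag, which is empty for an `enclosure` element.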

{% linkable_title Energy price %}

This example tries to retrieve the price for electricity.

{% raw %}

# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: https://elen.nu/timpriser-pa-el-for-elomrade-se3-stockholm/
    name: Electricity price
    select: ".elspot-content"
    value_template: '{{ value.split(" ")[0] | replace(",", ".") }}'
    unit_of_measurement: "öre/kWh"

{% endraw %}
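The template above takes the first space-separated token and swaps the decimal comma for a decimal point, so the result can be treated as a number. A plain-Python sketch of the same steps, assuming the scraped text looks like the hypothetical string below:

```python
# Hypothetical scraped text; the real element may contain different content.
value = "45,67 öre/kWh"

first_token = value.split(" ")[0]           # keep only the part before the first space
normalized = first_token.replace(",", ".")  # decimal comma -> decimal point
price = float(normalized)                   # now parseable as a number

print(normalized, price)
```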

{% linkable_title BOM Weather %}

The Australian Bureau of Meteorology website returns an error if the User-Agent header is not sent.

{% raw %}

# Example configuration.yaml entry
sensor:
  - platform: scrape
    resource: http://www.bom.gov.au/vic/forecasts/melbourne.shtml
    name: Melbourne Forecast Summary
    select: ".main .forecast p"
    value_template: '{{ value | truncate(255) }}'
    # Request every hour
    scan_interval: 3600
    headers:
      User-Agent: Mozilla/5.0

{% endraw %}
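To illustrate what the `headers` option adds to the request, the following Python sketch builds (but does not send) an equivalent request with the standard library's `urllib.request`. This is only an illustration of the HTTP side and is not how the sensor performs its requests.

```python
import urllib.request

# Build the request with a custom User-Agent header; nothing is sent here.
request = urllib.request.Request(
    "http://www.bom.gov.au/vic/forecasts/melbourne.shtml",
    headers={"User-Agent": "Mozilla/5.0"},
)

# urllib stores header names via str.capitalize(), hence "User-agent".
user_agent = request.get_header("User-agent")
print(user_agent)

# urllib.request.urlopen(request) would perform the actual HTTP call.
```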