Get HTML

This activity scrapes content from an HTML page by isolating the elements that match a given (possibly complex) XPath evaluation string.

The selected value is stored as a String and output to an internally defined variable.
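
To make the idea of an XPath evaluation string concrete, here is a minimal Python sketch of the same selection step. It is not the activity's implementation: the `PAGE` fragment, the `evaluate_xpath` helper, and the two expressions are hypothetical, and Python's standard `xml.etree.ElementTree` is used only because it needs no installation. It supports a limited XPath subset and requires well-formed markup, so real pages usually need a more lenient HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed page fragment standing in for a downloaded page.
PAGE = """
<html>
  <body>
    <div class="product">
      <h1>Baby trousers</h1>
      <span class="price">19.99</span>
    </div>
  </body>
</html>
"""


def evaluate_xpath(markup: str, xpath: str) -> str:
    """Return the text of the first element matching xpath, or '' if none matches."""
    root = ET.fromstring(markup)
    match = root.find(xpath)
    if match is None or match.text is None:
        return ""
    # Like the activity's output, the result is always a string.
    return match.text.strip()


price = evaluate_xpath(PAGE, ".//span[@class='price']")
title = evaluate_xpath(PAGE, ".//div[@class='product']/h1")
```

Here `price` and `title` play the role of the output variable: each holds the text of the first matching element, or an empty string when the expression matches nothing.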

Properties

Input

  • URL: the URL of the page to scrape, entered directly or supplied through a variable that holds it.

  • Evaluator: a variable that holds the XPath evaluation string.

Output

  • Output: the variable that receives the result; select an already defined variable or define a new one.

Misc

  • Display Name: the display name of the action in your implementation project.

  • Wait Before: the number of milliseconds the robot waits before executing the action.

  • Wait After: the number of milliseconds the robot waits before moving on to the next action.

  • Abort on Error: True/False

  • Retry Times: the number of times to retry the action if it doesn't end successfully. The default value is 0 (no retry); use -1 to retry indefinitely.
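
The way these settings could interact can be sketched in Python. This is an illustration under stated assumptions, not the product's actual behavior: `run_action` and its parameter names are hypothetical, the waits are assumed to apply once per action rather than once per retry, and Abort on Error is assumed to mean that a final failure stops the workflow when True and is skipped over when False.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def run_action(
    action: Callable[[], T],
    wait_before_ms: int = 0,
    wait_after_ms: int = 0,
    retry_times: int = 0,
    abort_on_error: bool = True,
) -> Optional[T]:
    """Run action with the Misc settings described above (assumed semantics)."""
    time.sleep(wait_before_ms / 1000)  # Wait Before
    retries_used = 0
    while True:
        try:
            result: Optional[T] = action()
            break
        except Exception:
            # Retry Times: 0 means no retry, -1 means retry indefinitely.
            exhausted = retry_times != -1 and retries_used >= retry_times
            if exhausted:
                if abort_on_error:
                    raise  # assumption: the failure stops the workflow
                result = None  # assumption: the failure is ignored
                break
            retries_used += 1
    time.sleep(wait_after_ms / 1000)  # Wait After
    return result
```

With `retry_times=2`, an action that fails twice and then succeeds is called three times in total: the first attempt plus two retries.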

Use Case

Potential Use Cases

  • Scrape the listings of an online shop, export them to Excel, and generate statistics based on the scraped data.

  • From the Ministry of Finance website, scrape the list of companies that didn’t pay taxes in order to avoid them as suppliers or customers.

  • Scrape the price of some cryptocurrencies over time and determine a trend.
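
As a rough sketch of the post-processing these use cases imply, the Python snippet below converts scraped String values to numbers, writes them as CSV (a format Excel can open), and estimates a trend with a least-squares slope. The sample rows, the `to_csv` and `slope` helpers, and the choice of a linear trend are all assumptions made for illustration.

```python
import csv
import io
import statistics

# Hypothetical scraped rows: (date label, price as the String the activity outputs).
scraped = [("2024-01-01", "100.0"), ("2024-01-02", "110.0"), ("2024-01-03", "120.0")]


def to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "price"])
    writer.writerows(rows)
    return buffer.getvalue()


def slope(values):
    """Least-squares slope of values against their index (needs at least 2 values)."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = statistics.fmean(values)
    numerator = sum((i - mean_x) * (v - mean_y) for i, v in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


prices = [float(price) for _, price in scraped]  # Strings must be converted first
csv_text = to_csv(scraped)
average = statistics.fmean(prices)
trend = slope(prices)  # positive means rising, negative means falling
```

Because the activity outputs a String, the conversion to `float` is the step most likely to fail on real data, for example when the scraped text includes a currency symbol or thousands separators.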

Examples of Using Get HTML

Example

In this example, the robot finds out that one of its friends has become a parent, and it wants to buy a pair of trousers for the baby from this website.

It chooses these trousers and wants to check their price using the Scrape Webpage method.

Watch the robot in action:

You can see how the Get HTML activity is used in an example that incorporates multiple activities. You can check and download the example from here:
