HTML
Select HTML as the query type to extract data from HTML pages.
Caution
Use HTML queries only for retrieving data from legacy systems where no alternative APIs exist. Where possible, use JSON, CSV, or XML query types instead.
Parser options
When querying HTML data, use the JSONata or JQ backend parsers. These parsers convert HTML into a structured format similar to XML, allowing you to use the same selector syntax.
- JSONata parser: Use JSONata expressions to navigate and transform the parsed HTML structure. For more information, refer to JSONata parser.
- JQ parser: Use JQ expressions to navigate and transform the parsed HTML structure. For more information, refer to JQ parser.
For selector syntax and examples, refer to XML, as HTML is parsed using the same approach.
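As a rough illustration of what "a structured format similar to XML" means in practice, the following Python sketch treats an XHTML-compatible table as an XML tree, picks rows with a root selector, and reads the cell text with column selectors. It is not the plugin's parser and does not use JSONata or JQ syntax. The sample markup, the `extract_rows` helper, and the selector paths are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML-compatible page: every tag is closed and nested correctly.
SAMPLE_HTML = """<html><body><table>
<tr><td>alice</td><td>30</td></tr>
<tr><td>bob</td><td>25</td></tr>
</table></body></html>"""


def extract_rows(markup, root_selector, column_selectors):
    """Return one dict per node matched by root_selector.

    column_selectors maps a column name to a path relative to each row.
    Only text content is read, mirroring the text-only limitation of HTML queries.
    """
    tree = ET.fromstring(markup)
    rows = []
    for node in tree.findall(root_selector):
        row = {}
        for name, path in column_selectors.items():
            cell = node.find(path)
            # Join all nested text so inline tags such as <b> do not drop content.
            row[name] = "".join(cell.itertext()).strip() if cell is not None else None
        rows.append(row)
    return rows


rows = extract_rows(SAMPLE_HTML, ".//tr", {"name": "td[1]", "age": "td[2]"})
print(rows)
```

The root selector chooses the repeating element (here, each table row), and each column selector is resolved relative to that element, which is the same division of labor as the root selector and column selectors in the query editor.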
Configure an HTML query
To extract data from an HTML page:
- Select HTML as the query type.
- Select URL as the source.
- Enter the page URL.
- Select JSONata or JQ as the parser.
- Configure the root selector and column selectors.
Limitations
Be aware of the following limitations when using HTML queries:
- Symmetrical data only: Tables with colspan or rowspan attributes are not supported.
- Text content only: Retrieving HTML attributes is not supported.
- XHTML compatibility: The backend HTML parser only works with XHTML-compatible pages, that is, well-formed markup in which every tag is closed and correctly nested.
- Rate limiting: Websites may block frequent requests. Set appropriate refresh intervals.
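Before pointing a query at a page, it can help to check the page against the first and third limitations above. The Python sketch below is one possible pre-flight check and is not part of the plugin. It assumes that "XHTML-compatible" can be approximated by "parses as well-formed XML", which may be stricter than the plugin's actual parser (for example, named entities such as `&nbsp;` are rejected here). The function name `check_html_for_query` is hypothetical.

```python
import xml.etree.ElementTree as ET


def check_html_for_query(markup):
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        tree = ET.fromstring(markup)
    except ET.ParseError as err:
        # Unclosed or mis-nested tags end up here.
        return [f"not well-formed XML: {err}"]

    problems = []
    for element in tree.iter():
        # Strip an XML namespace prefix such as "{http://www.w3.org/1999/xhtml}td".
        tag = element.tag.rsplit("}", 1)[-1]
        if tag in ("td", "th"):
            for attr in ("colspan", "rowspan"):
                if attr in element.attrib:
                    problems.append(f"<{tag}> uses {attr}, so the table is not symmetrical")
    return problems


good = "<html><body><table><tr><td>a</td><td>b</td></tr></table></body></html>"
spanned = "<html><body><table><tr><td colspan='2'>a</td></tr></table></body></html>"
unclosed = "<html><body><br></body></html>"

print(check_html_for_query(good))
print(check_html_for_query(spanned))
print(check_html_for_query(unclosed))
```

A page that fails this check is a candidate for one of the alternative query types (JSON, CSV, or XML) if the source offers them.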



