Filter HTML Contents

The HtmlDocument class from the HtmlAgilityPack library can be used to filter the contents of an HTML document.

The typical workflow is to create a new HtmlDocument variable, load the HTML code (from a text file or the web), select the nodes to be parsed, convert the result into a list, and print the output.

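using HtmlAgilityPack;

HtmlDocument doc = new HtmlDocument();
doc.LoadHtml(html); // html is a placeholder string holding the markup, read beforehand from a file or the web
HtmlNodeCollection selectedHtmlNodes = doc.DocumentNode.SelectNodes("//html/body"); // null when no node matches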

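Each selected node exposes two properties for reading its contents:

“InnerText” - the text content of the node only, with the HTML tags removed

“InnerHtml” - the HTML markup contained inside the node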
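
The whole workflow (load, select, convert to a list, print) can be sketched as follows. This is a minimal example rather than a definitive implementation: the sample markup, the "//body/p" XPath expression and the variable names are assumptions made for illustration, and it assumes the HtmlAgilityPack package is referenced by the project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // Hypothetical sample markup; in practice this would come from a text file or a web response.
        string html = "<html><body><p>Hello <b>world</b></p><p>Second</p></body></html>";

        HtmlDocument doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches,
        // so guard against that before converting to a list.
        HtmlNodeCollection selected = doc.DocumentNode.SelectNodes("//body/p");
        List<HtmlNode> nodes = selected == null ? new List<HtmlNode>() : selected.ToList();

        foreach (HtmlNode node in nodes)
        {
            Console.WriteLine("InnerText: " + node.InnerText); // text only
            Console.WriteLine("InnerHtml: " + node.InnerHtml); // markup inside the node
        }
    }
}
```

For the first paragraph node in this sample, InnerText yields "Hello world" while InnerHtml yields "Hello <b>world</b>", which illustrates the difference between the two properties.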
