HTML Scraper Plugin
The HTML Scraper Plugin scrapes any HTML page and extracts only its human-readable text.
The plugin generates a set of triples like the following:
+-----------------
"" .
"" .
"" .
"" .
+-----------------
The plugin engine is based on the extractors provided by the {{{http://code.google.com/p/boilerpipe/} Boilerpipe}} library.
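The extractors referred to as DE, AE, LCE and CE are the ones defined within the Boilerpipe library (presumably its DefaultExtractor, ArticleExtractor, LargestContentExtractor and CanolaExtractor classes).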
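The idea of turning a page's readable text into a triple can be illustrated with a minimal, standard-library-only sketch. It is not the plugin's implementation: the tag-stripping method is a naive stand-in for a Boilerpipe extractor, and the class name, method names and the predicate URI under example.org are all hypothetical.

```java
import java.util.regex.Pattern;

public class HtmlTextSketch {
    // Hypothetical predicate URI, NOT the vocabulary actually used by the plugin.
    public static final String PREDICATE = "http://example.org/vocab/text";

    // Matches whole <script>...</script> and <style>...</style> elements.
    private static final Pattern SCRIPT_OR_STYLE =
            Pattern.compile("(?is)<(script|style)\\b.*?</\\1\\s*>");
    // Matches any remaining tag.
    private static final Pattern TAG = Pattern.compile("(?s)<[^>]*>");

    /** Naive stand-in for a Boilerpipe extractor: drops scripts, styles and tags. */
    public static String visibleText(String html) {
        String noScripts = SCRIPT_OR_STYLE.matcher(html).replaceAll(" ");
        String noTags = TAG.matcher(noScripts).replaceAll(" ");
        // Collapse runs of whitespace left behind by the removed markup.
        return noTags.replaceAll("\\s+", " ").trim();
    }

    /** Writes one statement in N-Triples style, escaping only backslashes and quotes. */
    public static String toTriple(String pageUri, String text) {
        String escaped = text.replace("\\", "\\\\").replace("\"", "\\\"");
        return "<" + pageUri + "> <" + PREDICATE + "> \"" + escaped + "\" .";
    }

    public static void main(String[] args) {
        String html = "<html><head><style>p{color:red}</style></head>"
                + "<body><p>Hello <b>world</b></p><script>var x = 1;</script></body></html>";
        System.out.println(toTriple("http://example.org/page", visibleText(html)));
    }
}
```

A real extractor such as those in Boilerpipe goes further than this sketch: it also tries to discard navigation, advertisements and other boilerplate, which plain tag stripping cannot do.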