Apache JMeter

org.apache.jmeter.protocol.http.parser
Class JsoupBasedHtmlParser

java.lang.Object
  extended by org.apache.jmeter.protocol.http.parser.HTMLParser
      extended by org.apache.jmeter.protocol.http.parser.JsoupBasedHtmlParser

public class JsoupBasedHtmlParser
extends HTMLParser

HTML parser based on the jsoup library.

Since:
2.10

TODO: Factor out common code between LagartoBasedHtmlParser and this one (adapter pattern)

Field Summary
 
Fields inherited from class org.apache.jmeter.protocol.http.parser.HTMLParser
ATT_BACKGROUND, ATT_CODE, ATT_CODEBASE, ATT_DATA, ATT_HREF, ATT_IS_IMAGE, ATT_REL, ATT_SRC, ATT_STYLE, ATT_TYPE, DEFAULT_PARSER, PARSER_CLASSNAME, STYLESHEET, TAG_APPLET, TAG_BASE, TAG_BGSOUND, TAG_BODY, TAG_EMBED, TAG_FRAME, TAG_IFRAME, TAG_IMAGE, TAG_INPUT, TAG_LINK, TAG_OBJECT, TAG_SCRIPT
 
Constructor Summary
JsoupBasedHtmlParser()
           
 
Method Summary
 Iterator<URL> getEmbeddedResourceURLs(byte[] html, URL baseUrl, URLCollection coll, String encoding)
          Get the URLs for all the resources that a browser would automatically download after fetching the HTML content, that is: images, stylesheets, JavaScript files, applets, etc.
protected  boolean isReusable()
          Parsers should override this method if the parser class is reusable, in which case the instance will be cached for the next getParser() call.
 
Methods inherited from class org.apache.jmeter.protocol.http.parser.HTMLParser
getEmbeddedResourceURLs, getEmbeddedResourceURLs, getParser, getParser
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

JsoupBasedHtmlParser

public JsoupBasedHtmlParser()

Method Detail

getEmbeddedResourceURLs

public Iterator<URL> getEmbeddedResourceURLs(byte[] html,
                                             URL baseUrl,
                                             URLCollection coll,
                                             String encoding)
                                      throws HTMLParseException
Description copied from class: HTMLParser
Get the URLs for all the resources that a browser would automatically download after fetching the HTML content, that is: images, stylesheets, JavaScript files, applets, etc.

All URLs should be added to the Collection.

Malformed URLs can be reported to the caller by having the Iterator return the corresponding URL String. Overall problems parsing the HTML should be reported by throwing an HTMLParseException.

N.B. The Iterator returns URLs, but the Collection will contain objects of class URLString.

Specified by:
getEmbeddedResourceURLs in class HTMLParser
Parameters:
html - HTML code
baseUrl - Base URL from which the HTML code was obtained
coll - URLCollection
encoding - Charset
Returns:
an Iterator for the resource URLs
Throws:
HTMLParseException
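
The collect-then-iterate contract described above can be illustrated with a minimal sketch that uses only the JDK. It is not JMeter's implementation and does not call jsoup: the class EmbeddedUrlSketch and the methods resolveAll and parseBase are hypothetical names invented for this example. It assumes the raw references (src/href values) have already been extracted from the HTML, resolves each one against the base URL, and skips duplicates. Unlike the contract above, which lets the Iterator hand back the offending string, this sketch reports malformed references through a separate list to keep the types simple.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of JMeter.
class EmbeddedUrlSketch {

    /** Parses a base URL, converting the checked exception into an unchecked one. */
    static URL parseBase(String base) {
        try {
            return new URL(base);
        } catch (MalformedURLException e) {
            throw new IllegalArgumentException("Bad base URL: " + base, e);
        }
    }

    /**
     * Resolves each raw reference against baseUrl and returns an iterator
     * over the unique results, in first-seen order. References that cannot
     * be turned into a URL are appended to the malformed list instead.
     */
    static Iterator<URL> resolveAll(URL baseUrl, List<String> rawRefs, List<String> malformed) {
        // De-duplicate on the string form: URL.equals/hashCode may trigger
        // host name resolution, which is undesirable in a parser.
        Set<String> seen = new LinkedHashSet<>();
        List<URL> resolved = new ArrayList<>();
        for (String ref : rawRefs) {
            try {
                URL url = new URL(baseUrl, ref); // relative refs inherit from baseUrl
                if (seen.add(url.toExternalForm())) {
                    resolved.add(url);
                }
            } catch (MalformedURLException e) {
                malformed.add(ref); // e.g. an unknown protocol
            }
        }
        return resolved.iterator();
    }
}
```

Calling resolveAll with the base http://example.com/dir/page.html and the references img/logo.png and /style.css would yield the absolute URLs for both resources; a repeated reference is returned only once. De-duplicating on strings rather than URL objects is a design choice of this sketch; it is plausibly also why the Collection in the real API holds URLString objects, but that is an assumption here.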

isReusable

protected boolean isReusable()
Description copied from class: HTMLParser
Parsers should override this method if the parser class is reusable, in which case the instance will be cached for the next getParser() call.

Overrides:
isReusable in class HTMLParser
Returns:
true if the Parser is reusable
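
The caching behaviour that isReusable() controls can be sketched with JDK classes alone. This is an illustration of the idea, not the actual HTMLParser.getParser code: SketchParser, ReusableSketchParser, OneShotSketchParser and ParserCacheSketch are hypothetical stand-ins, and the real lookup logic may differ.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a parser base class; not JMeter's HTMLParser.
abstract class SketchParser {
    /** Subclasses return true when one instance can safely serve many calls. */
    protected boolean isReusable() {
        return false;
    }
}

class ReusableSketchParser extends SketchParser {
    @Override
    protected boolean isReusable() {
        return true;
    }
}

class OneShotSketchParser extends SketchParser {
    // Inherits isReusable() == false, so it is never cached.
}

class ParserCacheSketch {
    private static final Map<String, SketchParser> CACHE = new ConcurrentHashMap<>();

    /** Returns a cached instance when available, otherwise creates a new one. */
    static SketchParser getParser(String className) {
        SketchParser cached = CACHE.get(className);
        if (cached != null) {
            return cached;
        }
        SketchParser parser;
        try {
            parser = (SketchParser) Class.forName(className)
                    .getDeclaredConstructor()
                    .newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot instantiate " + className, e);
        }
        if (parser.isReusable()) {
            // Keep whichever instance was stored first if two threads race.
            SketchParser prior = CACHE.putIfAbsent(className, parser);
            return prior != null ? prior : parser;
        }
        return parser;
    }
}
```

With this sketch, two calls to getParser for ReusableSketchParser return the same object, while two calls for OneShotSketchParser return distinct objects. A parser that reports itself as reusable must therefore not keep per-call state in its fields.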

Copyright © 1998-2013 Apache Software Foundation. All Rights Reserved.