public class URLUtil extends Object
Constructor and Description |
---|
URLUtil() |
Modifier and Type | Method and Description |
---|---|
static String | chooseRepr(String src, String dst, boolean temp): Given two urls, the source and the destination of a redirect, returns the representative url. |
static String | getDomainName(String url): Returns the domain name of the url. |
static String | getDomainName(URL url): Returns the domain name of the url. |
static DomainSuffix | getDomainSuffix(String url): Returns the DomainSuffix corresponding to the last public part of the hostname. |
static DomainSuffix | getDomainSuffix(URL url): Returns the DomainSuffix corresponding to the last public part of the hostname. |
static String | getHost(String url): Returns the lowercased hostname for the url, or null if the url is not well formed. |
static String[] | getHostSegments(String url): Partitions the hostname of the url by ".". |
static String[] | getHostSegments(URL url): Partitions the hostname of the url by ".". |
static String | getPage(String url): Returns the page for the url. |
static String | getProtocol(String url) |
static String | getProtocol(URL url) |
static String | getTopLevelDomainName(String url): Returns the top level domain name of the url. |
static String | getTopLevelDomainName(URL url): Returns the top level domain name of the url. |
static boolean | isSameDomainName(String url1, String url2): Returns whether the given urls have the same domain name. |
static boolean | isSameDomainName(URL url1, URL url2): Returns whether the given urls have the same domain name. |
static void | main(String[] args): For testing. |
static URL | resolveURL(URL base, String target): Resolves relative URLs and fixes a few java.net.URL errors in the handling of URLs with embedded params and pure query targets. |
static String | toASCII(String url) |
static String | toUNICODE(String url) |
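public static URL resolveURL(URL base, String target) throws MalformedURLException
Resolves relative URLs and fixes a few java.net.URL errors in the handling of URLs with embedded params and pure query targets.
Parameters:
base - base url
target - target url (may be relative)
Throws:
MalformedURLException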
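The pure query case is easiest to see in code. The following is a minimal sketch, not the library's implementation: the class name ResolveSketch and the method resolve are hypothetical, and it assumes that a target consisting only of a query string should keep the last path segment of the base url.

```java
import java.net.MalformedURLException;
import java.net.URL;

class ResolveSketch {
    // Hypothetical helper: resolves target against base using java.net.URL.
    // A target that is only a query string ("?a=b") is first prefixed with
    // the last path segment of base, so that segment stays in the result.
    static URL resolve(URL base, String target) throws MalformedURLException {
        target = target.trim();
        if (target.startsWith("?")) {
            String path = base.getPath();
            int lastSlash = path.lastIndexOf('/');
            String lastSegment = lastSlash >= 0 ? path.substring(lastSlash + 1) : path;
            target = lastSegment + target;
        }
        return new URL(base, target);
    }
}
```

With this sketch, resolving "?pg=1" against http://example.com/jobs/Search.aspx?co=0 gives http://example.com/jobs/Search.aspx?pg=1, and an ordinary relative target such as "../about.html" is passed to java.net.URL unchanged.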
public static String getDomainName(URL url)
Returns the domain name of the url. For example, getDomainName(new URL("http://lucene.apache.org/")) returns apache.org.
public static String getDomainName(String url) throws MalformedURLException
Returns the domain name of the url. For example, getDomainName("http://lucene.apache.org/") returns apache.org.
Throws:
MalformedURLException
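To illustrate how a domain name such as apache.org can be derived from a hostname, here is a minimal sketch. It is not the library's implementation: the class DomainNameSketch, the method domainName and the three-entry SUFFIXES set are all hypothetical, and the set is only a stand-in for a real table of public suffixes.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

class DomainNameSketch {
    // Hypothetical, tiny stand-in for a real table of public suffixes.
    static final Set<String> SUFFIXES =
        new HashSet<>(Arrays.asList("org", "com", "co.uk"));

    // Drops leading labels from the host until the part after the first
    // dot is a known suffix, then returns that suffix plus one label.
    static String domainName(String url) throws MalformedURLException {
        String candidate = new URL(url).getHost().toLowerCase(Locale.ROOT);
        int dot = candidate.indexOf('.');
        while (dot >= 0) {
            String rest = candidate.substring(dot + 1);
            if (SUFFIXES.contains(rest)) {
                return candidate;
            }
            candidate = rest;
            dot = candidate.indexOf('.');
        }
        // No known suffix matched: only the last label is left.
        return candidate;
    }
}
```

Two urls could then be compared for the same domain name by comparing the results of domainName for each of them.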
public static String getTopLevelDomainName(URL url) throws MalformedURLException
Returns the top level domain name of the url. For example, getTopLevelDomainName(new URL("http://lucene.apache.org/")) returns org.
Throws:
MalformedURLException
public static String getTopLevelDomainName(String url) throws MalformedURLException
Returns the top level domain name of the url. For example, getTopLevelDomainName("http://lucene.apache.org/") returns org.
Throws:
MalformedURLException
public static boolean isSameDomainName(URL url1, URL url2)
Returns whether the given urls have the same domain name. For example, isSameDomainName(new URL("http://lucene.apache.org"), new URL("http://people.apache.org/")) returns true.
public static boolean isSameDomainName(String url1, String url2) throws MalformedURLException
Returns whether the given urls have the same domain name. For example, isSameDomainName("http://lucene.apache.org", "http://people.apache.org/") returns true.
Throws:
MalformedURLException
public static DomainSuffix getDomainSuffix(URL url)
Returns the DomainSuffix corresponding to the last public part of the hostname.
public static DomainSuffix getDomainSuffix(String url) throws MalformedURLException
Returns the DomainSuffix corresponding to the last public part of the hostname.
Throws:
MalformedURLException
public static String[] getHostSegments(URL url)
Partitions the hostname of the url by ".".
public static String[] getHostSegments(String url) throws MalformedURLException
Partitions the hostname of the url by ".".
Throws:
MalformedURLException
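Partitioning a hostname by "." can be sketched as follows. The class HostSegmentsSketch and the method hostSegments are hypothetical names, and keeping a dotted IPv4 address as a single segment is an assumption of this sketch, not something this page states.

```java
import java.net.MalformedURLException;
import java.net.URL;

class HostSegmentsSketch {
    // Hypothetical helper: splits the hostname of url at each ".".
    static String[] hostSegments(String url) throws MalformedURLException {
        String host = new URL(url).getHost();
        // Assumption: a dotted IPv4 address is kept whole, not split.
        if (host.matches("\\d{1,3}(\\.\\d{1,3}){3}")) {
            return new String[] { host };
        }
        return host.split("\\.");
    }
}
```

For http://lucene.apache.org/ this sketch yields the three segments lucene, apache and org.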
public static String chooseRepr(String src, String dst, boolean temp)
Given two urls, the source and the destination of a redirect, returns the representative url.
This method implements an extended version of the algorithm used by the Yahoo! Slurp crawler, described here: How does the Yahoo! webcrawler handle redirects?
Parameters:
src - The source url.
dst - The destination url.
temp - Whether the redirect is a temporary redirect.
public static String getHost(String url)
Returns the lowercased hostname for the url, or null if the url is not well formed.
Parameters:
url - The url to check.
public static String getPage(String url)
Returns the page for the url.
Parameters:
url - The url to check.
public static void main(String[] args)
For testing.
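The getHost contract (a lowercased hostname, or null for a url that is not well formed) can be sketched with java.net.URL alone. The class HostSketch and the method host are hypothetical names, and the sketch assumes that "not well formed" means java.net.URL rejects the string.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Locale;

class HostSketch {
    // Hypothetical helper: returns the lowercased hostname of url,
    // or null when java.net.URL cannot parse it.
    static String host(String url) {
        try {
            return new URL(url).getHost().toLowerCase(Locale.ROOT);
        } catch (MalformedURLException e) {
            return null;
        }
    }
}
```

Returning null instead of throwing lets callers filter out bad urls without a try block, at the cost of a null check at each call site.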
Copyright © 2014 The Apache Software Foundation