|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Object org.apache.solr.util.SimplePostTool
public class SimplePostTool
A simple utility class for posting raw updates to a Solr server, has a main method so it can be run on the command line. View this not as a best-practice code example, but as a standalone example built with an explicit purpose of not having external jar dependencies.
Nested Class Summary | |
---|---|
class |
SimplePostTool.PageFetcherResult
Utility class to hold the result form a page fetch |
Constructor Summary | |
---|---|
SimplePostTool()
|
|
SimplePostTool(String mode,
URL url,
boolean auto,
String type,
int recursive,
int delay,
String fileTypes,
OutputStream out,
boolean commit,
boolean optimize,
String[] args)
Constructor which takes in all mandatory input for the tool to work. |
Method Summary | |
---|---|
static String |
appendParam(String url,
String param)
Appends a URL query parameter to a URL |
protected static URL |
appendUrlPath(URL url,
String append)
Appends to the path of the URL |
void |
commit()
Does a simple commit operation |
protected String |
computeFullUrl(URL baseUrl,
String link)
Computes the full URL based on a base url and a possibly relative link found in the href param of an HTML anchor. |
static void |
doGet(String url)
Performs a simple get on the given URL |
static void |
doGet(URL url)
Performs a simple get on the given URL |
void |
execute()
After initialization, call execute to start the post job. |
org.apache.solr.util.SimplePostTool.GlobFileFilter |
getFileFilterFromFileTypes(String fileTypes)
|
static NodeList |
getNodesFromXP(Node n,
String xpath)
Gets all nodes matching an XPath |
static String |
getXP(Node n,
String xpath,
boolean concatAll)
Gets the string content of the matching an XPath |
protected static String |
guessType(File file)
Guesses the type of a file, based on file name suffix |
protected byte[] |
inputStreamToByteArray(InputStream is)
Reads an input stream into a byte array |
protected static boolean |
isOn(String property)
Tests if a string is either "true", "on", "yes" or "1" |
static void |
main(String[] args)
See usage() for valid command line usage |
static Document |
makeDom(String in,
String inputEncoding)
Takes a string as input and returns a DOM |
protected static String |
normalizeUrlEnding(String link)
Normalizes a URL string by removing anchor part and trailing slash |
void |
optimize()
Does a simple optimize operation |
protected static SimplePostTool |
parseArgsAndInit(String[] args)
Parses incoming arguments and system params and initializes the tool |
boolean |
postData(InputStream data,
Integer length,
OutputStream output,
String type,
URL url)
Reads data from the data stream and posts it to solr, writes to the response to output |
void |
postFile(File file,
OutputStream output,
String type)
Opens the file and posts it's contents to the solrUrl, writes to response to output. |
int |
postFiles(File[] files,
int startIndexInArgs,
OutputStream out,
String type)
Post all filenames provided in args |
int |
postFiles(String[] args,
int startIndexInArgs,
OutputStream out,
String type)
Post all filenames provided in args |
int |
postWebPages(String[] args,
int startIndexInArgs,
OutputStream out)
This method takes as input a list of start URL strings for crawling, adds each one to a backlog and then starts crawling |
static InputStream |
stringToStream(String s)
Converts a string to an input stream |
protected boolean |
typeSupported(String type)
Uses the mime-type map to reverse lookup whether the file ending for our type is supported by the fileTypes option |
protected int |
webCrawl(int level,
OutputStream out)
A very simple crawler, pulling URLs to fetch from a backlog and then recurses N levels deep if recursive>0. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SimplePostTool(String mode, URL url, boolean auto, String type, int recursive, int delay, String fileTypes, OutputStream out, boolean commit, boolean optimize, String[] args)
mode
- whether to post files, web pages, params or stdinurl
- the Solr base Url to post to, should end with /updateauto
- if true, we'll guess type and add resourcename/urltype
- content-type of the data you are postingrecursive
- number of levels for file/web mode, or 0 if one file onlydelay
- if recursive then delay will be the wait time between postsfileTypes
- a comma separated list of file-name endings to accept for file/webout
- an OutputStream to write output to, e.g. stdout to print to consolecommit
- if true, will commit at end of postingoptimize
- if true, will optimize at end of postingargs
- a String[] of arguments, varies between modespublic SimplePostTool()
Method Detail |
---|
public static void main(String[] args)
args
- the params on the command linepublic void execute()
protected static SimplePostTool parseArgsAndInit(String[] args)
args
- the incoming cmd line args
public int postFiles(String[] args, int startIndexInArgs, OutputStream out, String type)
args
- array of file namesstartIndexInArgs
- offset to startout
- output stream to post data totype
- default content-type to use when posting (may be overridden in auto mode)
public int postFiles(File[] files, int startIndexInArgs, OutputStream out, String type)
files
- array of FilesstartIndexInArgs
- offset to startout
- output stream to post data totype
- default content-type to use when posting (may be overridden in auto mode)
public int postWebPages(String[] args, int startIndexInArgs, OutputStream out)
args
- the raw input args from main()startIndexInArgs
- offset for where to startout
- outputStream to write results to
protected static String normalizeUrlEnding(String link)
protected int webCrawl(int level, OutputStream out)
level
- which level to crawlout
- output stream to write to
protected byte[] inputStreamToByteArray(InputStream is) throws IOException
is
- the input stream
IOException
- If there is a low-level I/O error.protected String computeFullUrl(URL baseUrl, String link)
baseUrl
- the base url from where the link was foundlink
- the absolute or relative link
protected boolean typeSupported(String type)
type
- what content-type to lookup
protected static boolean isOn(String property)
property
- the string to test
public void commit()
public void optimize()
public static String appendParam(String url, String param)
url
- the original URLparam
- the parameter(s) to append, separated by "&"
public void postFile(File file, OutputStream output, String type)
protected static URL appendUrlPath(URL url, String append) throws MalformedURLException
url
- the URLappend
- the path to append
MalformedURLException
protected static String guessType(File file)
file
- the file
public static void doGet(String url)
public static void doGet(URL url)
public boolean postData(InputStream data, Integer length, OutputStream output, String type, URL url)
public static InputStream stringToStream(String s)
s
- the string
public org.apache.solr.util.SimplePostTool.GlobFileFilter getFileFilterFromFileTypes(String fileTypes)
public static NodeList getNodesFromXP(Node n, String xpath) throws XPathExpressionException
XPathExpressionException
public static String getXP(Node n, String xpath, boolean concatAll) throws XPathExpressionException
n
- the node (or doc)xpath
- the xpath stringconcatAll
- if true, text from all matching nodes will be concatenated, else only the first returned
XPathExpressionException
public static Document makeDom(String in, String inputEncoding) throws SAXException, IOException, ParserConfigurationException
SAXException
IOException
ParserConfigurationException
|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |