java.lang.Object org.apache.solr.internal.csv.CSVParser
public class CSVParser
Parses CSV files according to the specified configuration.
Because CSV appears in many different dialects, the parser supports many
configuration settings by allowing the specification of a CSVStrategy.
Parsing of a csv-string having tabs as separators, '"' as an optional value encapsulator, and comments starting with '#':
String[][] data = (new CSVParser(new StringReader("a\tb\nc\td"), new CSVStrategy('\t','"','#'))).getAllValues();
Parsing of a csv-string in Excel CSV format:
String[][] data = (new CSVParser(new StringReader("a;b\nc;d"), CSVStrategy.EXCEL_STRATEGY)).getAllValues();
Internal parser state is completely covered by the strategy and the reader-state.
See the package documentation for more details.
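To make the first example above more concrete, here is a minimal, self-contained sketch of what a tab delimiter, a '"' encapsulator, and a '#' comment marker mean for the input. It does not use or reproduce CSVParser: the class MiniCsvSketch and its parse method are hypothetical names, and the sketch assumes that values never span lines and that encapsulators are never escaped.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is NOT the Solr CSVParser implementation.
final class MiniCsvSketch {

    // Splits text into records (lines) and values, honoring the three
    // characters that the first example passes to CSVStrategy.
    static String[][] parse(String text, char delimiter, char encapsulator, char commentStart) {
        List<String[]> records = new ArrayList<>();
        for (String line : text.split("\n")) {
            // Assumption: blank lines and lines starting with the comment marker are skipped.
            if (line.isEmpty() || line.charAt(0) == commentStart) {
                continue;
            }
            List<String> values = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            boolean insideEncapsulator = false;
            for (int i = 0; i < line.length(); i++) {
                char ch = line.charAt(i);
                if (ch == encapsulator) {
                    // Toggle quoted mode; the encapsulator itself is not part of the value.
                    insideEncapsulator = !insideEncapsulator;
                } else if (ch == delimiter && !insideEncapsulator) {
                    // An unquoted delimiter ends the current value.
                    values.add(current.toString());
                    current.setLength(0);
                } else {
                    current.append(ch);
                }
            }
            values.add(current.toString()); // last value of the line
            records.add(values.toArray(new String[0]));
        }
        return records.toArray(new String[0][]);
    }
}
```

Calling MiniCsvSketch.parse("a\tb\n# note\n\"c\td\"\te", '\t', '"', '#') yields two records: the comment line is dropped, and the tab inside the encapsulated value is kept as part of that value.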
Field Summary | |
---|---|
protected static int | TT_EOF: Token (which can have content) when end of file is reached. |
protected static int | TT_EORECORD: Token with content when end of a line is reached. |
protected static int | TT_INVALID: Token has no valid content, i.e. |
protected static int | TT_TOKEN: Token with content, at beginning or in the middle of a line. |
Constructor Summary | |
---|---|
CSVParser(Reader input) | CSV parser using the default CSVStrategy. |
CSVParser(Reader input, char delimiter) | Deprecated. Use CSVParser(Reader,CSVStrategy). |
CSVParser(Reader input, char delimiter, char encapsulator, char commentStart) | Deprecated. Use CSVParser(Reader,CSVStrategy). |
CSVParser(Reader input, CSVStrategy strategy) | Customized CSV parser using the given CSVStrategy. |
Method Summary | |
---|---|
String[][] | getAllValues(): Parses the CSV according to the given strategy and returns the content as an array of records (where each record is an array of single values). |
String[] | getLine(): Parses from the current point in the stream until the end of the current line. |
int | getLineNumber(): Returns the current line number in the input stream. |
CSVStrategy | getStrategy(): Obtains the specified CSV strategy. |
protected org.apache.solr.internal.csv.CSVParser.Token | nextToken(): Convenience method for nextToken(null). |
protected org.apache.solr.internal.csv.CSVParser.Token | nextToken(org.apache.solr.internal.csv.CSVParser.Token tkn): Returns the next token. |
String | nextValue(): Parses the CSV according to the given strategy and returns the next csv-value as a string. |
protected int | unicodeEscapeLexer(int c): Decodes Unicode escapes. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected static final int TT_INVALID
protected static final int TT_TOKEN
protected static final int TT_EOF
protected static final int TT_EORECORD
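The token types listed above can be illustrated with a small, self-contained sketch. It is not the parser's actual lexer: the class TokenTypeSketch, its Type enum, and the classify method are hypothetical, and the sketch ignores encapsulators, comments, and TT_INVALID. It only shows which kind of token each value would correspond to under the descriptions in the field summary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of the token kinds; NOT the CSVParser lexer.
final class TokenTypeSketch {

    // Stand-ins for TT_TOKEN, TT_EORECORD and TT_EOF (the real fields are ints).
    enum Type { TOKEN, EORECORD, EOF }

    // Labels every value in the input with the kind of token that ends it.
    static List<String> classify(String input, char delimiter) {
        List<String> tokens = new ArrayList<>();
        StringBuilder content = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char ch = input.charAt(i);
            if (ch == delimiter) {
                // Value at the beginning or in the middle of a line.
                tokens.add(Type.TOKEN + ":" + content);
                content.setLength(0);
            } else if (ch == '\n') {
                // Value that ends a line (a record).
                tokens.add(Type.EORECORD + ":" + content);
                content.setLength(0);
            } else {
                content.append(ch);
            }
        }
        // End of input: the final token may or may not carry content.
        tokens.add(Type.EOF + ":" + content);
        return tokens;
    }
}
```

For the input "a,b\nc" with ',' as the delimiter, the sketch labels "a" as TOKEN, "b" as EORECORD, and "c" as EOF. If the input ends with a newline, the final EOF token has empty content.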
Constructor Detail |
---|
public CSVParser(Reader input)
CSV parser using the default CSVStrategy.
Parameters:
input - a Reader containing "csv-formatted" input

public CSVParser(Reader input, char delimiter)
Deprecated. Use CSVParser(Reader,CSVStrategy).
Follows the default CSVStrategy except for the delimiter setting.
Parameters:
input - a Reader based on "csv-formatted" input
delimiter - a char used for value separation

public CSVParser(Reader input, char delimiter, char encapsulator, char commentStart)
Deprecated. Use CSVParser(Reader,CSVStrategy).
Parameters:
input - a Reader based on "csv-formatted" input
delimiter - a char used for value separation
encapsulator - a char used as value encapsulation marker
commentStart - a char used for comment identification

public CSVParser(Reader input, CSVStrategy strategy)
Customized CSV parser using the given CSVStrategy.
Parameters:
input - a Reader containing "csv-formatted" input
strategy - the CSVStrategy used for CSV parsing

Method Detail |
---|
public String[][] getAllValues() throws IOException
The returned content starts at the current parse-position in the stream.
Throws:
IOException - on parse error or input read-failure

public String nextValue() throws IOException
Throws:
IOException - on parse error or input read-failure

public String[] getLine() throws IOException
Throws:
IOException - on parse error or input read-failure

public int getLineNumber()

protected org.apache.solr.internal.csv.CSVParser.Token nextToken() throws IOException
Convenience method for nextToken(null).
Throws:
IOException

protected org.apache.solr.internal.csv.CSVParser.Token nextToken(org.apache.solr.internal.csv.CSVParser.Token tkn) throws IOException
Parameters:
tkn - an existing Token object to reuse. The caller is responsible for initializing the Token.
Throws:
IOException - on stream access error

protected int unicodeEscapeLexer(int c) throws IOException
Parameters:
c - current char, which is discarded because it is the "\" of "\uXXXX"
Throws:
IOException - on wrong unicode escape sequence or read error

public CSVStrategy getStrategy()
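As a rough illustration of what decoding a Unicode escape involves for unicodeEscapeLexer, the sketch below reads the part of a \uXXXX sequence that follows the backslash and returns the decoded value. It is an assumption-laden stand-in, not the parser's code: the class UnicodeEscapeSketch and its decode method are hypothetical, and the sketch assumes exactly one 'u' followed by four hexadecimal digits.

```java
import java.io.IOException;
import java.io.Reader;

// Hypothetical illustration; NOT the CSVParser.unicodeEscapeLexer implementation.
final class UnicodeEscapeSketch {

    // Expects the reader to be positioned just after the backslash,
    // i.e. at the 'u' of "\uXXXX". Returns the decoded UTF-16 code unit.
    static int decode(Reader in) throws IOException {
        if (in.read() != 'u') {
            throw new IOException("expected 'u' after backslash");
        }
        int value = 0;
        for (int i = 0; i < 4; i++) {
            int ch = in.read();
            // Character.digit returns -1 for anything that is not a hex digit;
            // end of input (read() == -1) is treated the same way.
            int digit = (ch < 0) ? -1 : Character.digit(ch, 16);
            if (digit < 0) {
                throw new IOException("malformed unicode escape sequence");
            }
            value = (value << 4) | digit; // accumulate 4 bits per hex digit
        }
        return value;
    }
}
```

With a java.io.StringReader over "u0041", decode returns the code unit for 'A'; input such as "u00G1" or a truncated sequence raises an IOException, mirroring the documented failure mode.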