Apache Camel: uniVocity-parsers formats

uniVocity-parsers

Available as of Camel 2.15.0

This Data Format uses uniVocity-parsers for reading and writing 3 kinds of tabular data text files:

CSV (Comma Separated Values), where the values are separated by a symbol (usually a comma)
fixed-width, where the values have known sizes
TSV (Tab Separated Values), where the fields are separated by a tab character

Thus there are 3 data formats based on uniVocity-parsers.

If you use Maven you can just add the following to your pom.xml, substituting the version number for the latest and greatest release (see the download page for the latest versions).

<dependency>
    <groupId>org.apache.camel</groupId>
    <artifactId>camel-univocity-parsers</artifactId>
    <version>x.x.x</version>
</dependency>

Options

Most configuration options of the uniVocity-parsers are available in the data formats. If you want more information about a particular option, please refer to their documentation page.

The 3 data formats share common options and also have dedicated ones. This section presents them all.

Common options, shared by all the data formats

Parameter name	Type	Description
`nullValue`	`String`	The string representation of a `null` value. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `null`.
`skipEmptyLines`	`Boolean`	Whether or not the empty lines must be ignored. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `true`.
`ignoreTrailingWhitespaces`	`Boolean`	Whether or not the trailing white spaces must be ignored. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `true`.
`ignoreLeadingWhitespaces`	`Boolean`	Whether or not the leading white spaces must be ignored. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `true`.
`headersDisabled`	`boolean`	Whether or not the headers are disabled. When defined, this option explicitly sets the headers as `null` which indicates that there is no header. This option is `false` by default.
`headers`	`String[]`	The headers to use. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `null`. In the XML DSL, this option is configured using children `<univocity-header>` tags: <univocity-csv> <univocity-header>first</univocity-header> <univocity-header>second</univocity-header> </univocity-csv> See other marshalling and unmarshalling examples for more information.
`headerExtractionEnabled`	`Boolean`	Whether or not the headers must be read from the first line of the text document. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `false`.
`numberOfRecordsToRead`	`Integer`	The maximum number of records to read. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `-1`.
`emptyValue`	`String`	The String representation of an empty value. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `""`.
`lineSeparator`	`String`	The line separator of the files. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is the platform line separator.
`normalizedLineSeparator`	`Character`	The normalized line separator of the files. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `'\n'`.
`comment`	`Character`	The comment symbol. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `'#'`.
`lazyLoad`	`boolean`	Whether the unmarshalling should produce an iterator that reads the lines on the fly, or whether all the lines must be read at once. This option is `false` by default.
`asMap`	`boolean`	Whether the unmarshalling should produce maps for the line values instead of lists. It requires headers (either defined or extracted). This option is `false` by default.

CSV format options

Parameter name	Type	Description
`quoteAllFields`	`Boolean`	Whether or not all values must be quoted when writing them. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `false`.
`quote`	`Character`	The quote symbol. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `'"'`.
`quoteEscape`	`Character`	The quote escape symbol. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `'"'`.
`delimiter`	`Character`	The delimiter of values. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `','`.
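
To make the roles of `delimiter`, `quote` and `quoteEscape` more concrete, the following plain-Java sketch shows how a CSV writer commonly applies them when the default values listed above are used. It is an illustration only and not the uniVocity-parsers implementation: the class `CsvQuotingSketch` and its methods are hypothetical names, and the rule "quote a value only when it contains the delimiter, the quote symbol or a line break" is an assumption about typical CSV behaviour.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration; not part of Camel or uniVocity-parsers.
final class CsvQuotingSketch {

    // Writes one value, quoting it when needed, or always when quoteAllFields is true.
    static String writeValue(String value, char delimiter, char quote,
                             char quoteEscape, boolean quoteAllFields) {
        boolean needsQuotes = quoteAllFields
                || value.indexOf(delimiter) >= 0
                || value.indexOf(quote) >= 0
                || value.indexOf('\n') >= 0;
        if (!needsQuotes) {
            return value;
        }
        // Each quote symbol inside the value is prefixed with the quote escape symbol.
        String escaped = value.replace(String.valueOf(quote), "" + quoteEscape + quote);
        return quote + escaped + quote;
    }

    // Joins the written values with the delimiter to form one line.
    static String writeRow(List<String> values, char delimiter, char quote,
                           char quoteEscape, boolean quoteAllFields) {
        return values.stream()
                .map(v -> writeValue(v, delimiter, quote, quoteEscape, quoteAllFields))
                .collect(Collectors.joining(String.valueOf(delimiter)));
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("one", "two,2", "say \"hi\"");
        // Uses the defaults from the table: delimiter ',', quote '"', quoteEscape '"'.
        System.out.println(writeRow(row, ',', '"', '"', false));
    }
}
```

With the default symbols, a value containing a comma is wrapped in quotes, and a quote inside a value is doubled because the quote escape symbol is itself `'"'`.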

Fixed-width format options

Parameter name	Type	Description
`fieldLengths`	`int[]`	This option is required and defines the length of each value. In the XML DSL, this option is configured using child `<univocity-header>` tags: <univocity-fixed> <univocity-header length="3"/> <univocity-header length="8"/> </univocity-fixed> See other marshalling and unmarshalling examples for more information.
`skipTrailingCharsUntilNewline`	`Boolean`	Whether or not the trailing characters until new line must be ignored. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `false`.
`recordEndsOnNewLine`	`Boolean`	Whether or not the record ends on new line. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `false`.
`padding`	`Character`	The padding character. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `' '` (space).

TSV format options

Parameter name	Type	Description
`escapeChar`	`Character`	The escape character. This option is `null` by default. When `null`, it uses the default uniVocity-parser value which is `'\'`.

Marshalling usages

The marshalling accepts either:

A list of maps (List<Map<String, ?>>), one for each line
A single map (Map<String, ?>), for a single line

Any other body will throw an exception.
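
The two accepted body shapes can be built in plain Java as sketched below. The class `MarshalBodySketch` and its method names are hypothetical. The sketch uses `LinkedHashMap` rather than `HashMap` because `HashMap` does not guarantee an iteration order; it assumes (this is not stated above) that, when no headers are configured, the column order follows the iteration order of the map keys.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the two body shapes accepted by the marshalling.
final class MarshalBodySketch {

    // A single map: marshalled as a single line.
    static Map<String, Object> singleLine() {
        Map<String, Object> line = new LinkedHashMap<>(); // keeps insertion order A, B, C
        line.put("A", "one");
        line.put("B", "two");
        line.put("C", "three");
        return line;
    }

    // A list of maps: marshalled as one line per map.
    static List<Map<String, Object>> severalLines() {
        Map<String, Object> second = new LinkedHashMap<>();
        second.put("A", "four");
        second.put("B", "five");
        second.put("C", "six");

        List<Map<String, Object>> lines = new ArrayList<>();
        lines.add(singleLine());
        lines.add(second);
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(singleLine());
        System.out.println(severalLines());
    }
}
```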

Usage example: marshalling a Map into CSV format

Considering the following body	Map<String, Object> body = new HashMap<>(); body.put("A", "one"); body.put("B", "two"); body.put("C", "three");
and this Java route definition	from("direct:input") .marshal(new UniVocityCsvDataFormat()) .to("mock:result");
or this XML route definition	<route> <from uri="direct:input"/> <marshal> <univocity-csv/> </marshal> <to uri="mock:result"/> </route>
then it will produce	one,two,three

Usage example: marshalling a Map into fixed-width format

Considering the following body	Map<String, Object> body = new HashMap<>(); body.put("A", "one"); body.put("B", "two"); body.put("C", "three");
and this Java route definition	from("direct:input") .marshal(new UniVocityFixedWidthDataFormat() .setFieldLengths(new int[] {5, 5, 5}) .setPadding('_') ) .to("mock:result");
or this XML route definition	<route> <from uri="direct:input"/> <marshal> <univocity-fixed padding="_"> <univocity-header length="5"/> <univocity-header length="5"/> <univocity-header length="5"/> </univocity-fixed> </marshal> <to uri="mock:result"/> </route>
then it will produce	one__two__three
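
The way the field lengths and the padding character combine into `one__two__three` can be reproduced with a few lines of plain Java. This is only a sketch of the layout rule, not the uniVocity-parsers writer: `FixedWidthWriteSketch` is a hypothetical name, and cutting values that are longer than their field is an assumption made to keep the sketch simple.

```java
// Hypothetical illustration of fixed-width layout; not part of Camel or uniVocity-parsers.
final class FixedWidthWriteSketch {

    // Writes each value left-aligned in its field and fills the rest with the padding character.
    static String writeRow(String[] values, int[] fieldLengths, char padding) {
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < fieldLengths.length; i++) {
            String value = i < values.length ? values[i] : "";
            // Assumption: a value longer than its field is cut to the field length.
            if (value.length() > fieldLengths[i]) {
                value = value.substring(0, fieldLengths[i]);
            }
            line.append(value);
            for (int p = value.length(); p < fieldLengths[i]; p++) {
                line.append(padding);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) {
        String[] values = {"one", "two", "three"};
        // Three fields of 5 characters each, padded with '_'.
        System.out.println(writeRow(values, new int[] {5, 5, 5}, '_'));
    }
}
```

"one" and "two" each receive two padding characters to reach 5 characters, while "three" already fills its field.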

Usage example: marshalling a Map into TSV format

Considering the following body	Map<String, Object> body = new HashMap<>(); body.put("A", "one"); body.put("B", "two"); body.put("C", "three");
and this Java route definition	from("direct:input") .marshal(new UniVocityTsvDataFormat()) .to("mock:result");
or this XML route definition	<route> <from uri="direct:input"/> <marshal> <univocity-tsv/> </marshal> <to uri="mock:result"/> </route>
then it will produce	one two three (with tabs separating the values)

Unmarshalling usages

The unmarshalling uses an InputStream in order to read the data.

Each row produces either:

a list with all the values in it (`asMap` option set to `false`);
a map with all the values indexed by the headers (`asMap` option set to `true`).

All the rows can either:

be collected at once into a list (`lazyLoad` option set to `false`);
be read on the fly using an iterator (`lazyLoad` option set to `true`).
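
Code that consumes the unmarshalled body may need to cope with both shapes. The sketch below assumes that the eager result is a `java.util.List` of rows, that the lazy result is a `java.util.Iterator` over rows, and that each row is a `java.util.List` because `asMap` is `false`. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical consumer of an unmarshalled body; not part of Camel.
final class UnmarshalResultSketch {

    // Collects the first value of every row, whichever way the rows are delivered.
    static List<String> firstColumn(Object body) {
        Iterator<?> rows;
        if (body instanceof Iterator) {
            rows = (Iterator<?>) body;           // rows read on the fly (lazyLoad = true)
        } else if (body instanceof List) {
            rows = ((List<?>) body).iterator();  // rows collected at once (lazyLoad = false)
        } else {
            throw new IllegalArgumentException("Unexpected body: " + body);
        }
        List<String> result = new ArrayList<>();
        while (rows.hasNext()) {
            List<?> row = (List<?>) rows.next(); // each row is a list of values (asMap = false)
            result.add(String.valueOf(row.get(0)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<List<String>> rows = Arrays.asList(
                Arrays.asList("one", "two", "three"),
                Arrays.asList("four", "five", "six"));
        System.out.println(firstColumn(rows));            // eager shape
        System.out.println(firstColumn(rows.iterator())); // lazy shape
    }
}
```

The iterator form lets a large file be processed row by row without holding every row in memory at the same time.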

Usage example: unmarshalling a CSV format into maps with automatic headers

Considering the following body	A,B,C one,two,three four,five,six
and this Java route definition	from("direct:input") .unmarshal(new UniVocityCsvDataFormat() .setAsMap(true) .setHeaderExtractionEnabled(true) ) .to("mock:result");
or this XML route definition	<route> <from uri="direct:input"/> <unmarshal> <univocity-csv headerExtractionEnabled="true" asMap="true"/> </unmarshal> <to uri="mock:result"/> </route>
then it will produce	[ {A: 'one', B: 'two', C: 'three'}, {A: 'four', B: 'five', C: 'six'} ]
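
Conceptually, header extraction combined with `asMap` treats the first line as the keys and pairs every following line with them. The plain-Java sketch below illustrates that pairing on lines that have already been split into values; it is not the data format's implementation, and `HeaderMappingSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of pairing extracted headers with row values.
final class HeaderMappingSketch {

    // Uses the first line as headers and turns each remaining line into a map.
    static List<Map<String, String>> toMaps(List<List<String>> lines) {
        List<String> headers = lines.get(0);
        List<Map<String, String>> rows = new ArrayList<>();
        for (List<String> values : lines.subList(1, lines.size())) {
            Map<String, String> row = new LinkedHashMap<>(); // keeps the header order
            for (int i = 0; i < headers.size() && i < values.size(); i++) {
                row.put(headers.get(i), values.get(i));
            }
            rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        List<List<String>> lines = Arrays.asList(
                Arrays.asList("A", "B", "C"),
                Arrays.asList("one", "two", "three"),
                Arrays.asList("four", "five", "six"));
        System.out.println(toMaps(lines));
    }
}
```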

Usage example: unmarshalling a fixed-width format into lists

Considering the following body	one two three four five six
and this Java route definition	from("direct:input") .unmarshal(new UniVocityFixedWidthDataFormat() .setFieldLengths(new int[] {5,5,5}) ) .to("mock:result");
or this XML route definition	<route> <from uri="direct:input"/> <unmarshal> <univocity-fixed> <univocity-header length="5"/> <univocity-header length="5"/> <univocity-header length="5"/> </univocity-fixed> </unmarshal> <to uri="mock:result"/> </route>
then it will produce	[ ['one', 'two', 'three'], ['four', 'five', 'six'] ]
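
Reading a fixed-width line amounts to slicing it at the configured field lengths and dropping the padding. The sketch below shows this for fields of 5 characters padded with spaces, assuming the input lines are `one  two  three` and `four five six  ` (each value padded to 5 characters). It is a simplified illustration rather than the uniVocity-parsers parser, and `FixedWidthReadSketch` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of fixed-width parsing; not part of Camel or uniVocity-parsers.
final class FixedWidthReadSketch {

    // Slices the line at the given field lengths and removes trailing padding from each field.
    static List<String> readRow(String line, int[] fieldLengths, char padding) {
        List<String> values = new ArrayList<>();
        int start = 0;
        for (int length : fieldLengths) {
            int end = Math.min(start + length, line.length());
            String field = start < end ? line.substring(start, end) : "";
            int last = field.length();
            while (last > 0 && field.charAt(last - 1) == padding) {
                last--;
            }
            values.add(field.substring(0, last));
            start += length;
        }
        return values;
    }

    public static void main(String[] args) {
        int[] fieldLengths = {5, 5, 5};
        System.out.println(readRow("one  two  three", fieldLengths, ' '));
        System.out.println(readRow("four five six  ", fieldLengths, ' '));
    }
}
```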
