code
docs
tests
The Zest™ Core I/O API is completely generic and not tied to the Zest™ programming model as a whole. It can be used independently of Zest, together with the Zest™ Core Functional API, which the Core I/O API depends on.
The Zest™ Core I/O API tries to address the problem around shuffling data around from various I/O inputs and outputs, possibly with transformations and filtering along the way. It was identified that there is a general mix-up of concerns in the stereotypical I/O handling codebases that people deal with all the time. The reasoning around this, can be found in the Use I/O API, and is recommended reading.
Why does I/O operations in Java have to be so complicated, with nested try/catch/finally and loops? Don’t you wish that the operations could be expressed in a more natural way, such as;
File source = ... File destination = ... source.copyTo( destination );
It seems natural to do, yet it is not present for us. We need to involve FileInputStream/FileOutputStream, wrap them in Buffered versions of it, do our own looping, close the streams afterwards and what not. So, the java.io.File does not have this simple feature and in the Zest™ Core API, we need to work around this limitation. We also want to make the abstraction a little bit more encompassing than "just" files. So how does that look like then?
The most common inputs and outputs are collected in the org.apache.zest.io.Inputs and org.apache.zest.io.Outputs classes as static factory methods, but you can create your (more about that later).
So, we want to read a text file and write the content into another text file, right? This is how it is done;
File source = new File( "source.txt" ); File destination = new File( "destination.txt" ); Inputs.text( source ).transferTo( Outputs.text( destination ) );
Pretty much self-explanatory, wouldn’t you say? But what happened to the handling of exceptions and closing of resources? It is all handled inside the Zest™ Core I/O API. There is nothing you can forget to do.
Another simple example, where we want to count the number of lines in the text;
import org.apache.zest.io.Transforms.Counter; import static org.apache.zest.io.Transforms.map; [...snip...] File source = new File( "source.txt" ); File destination = new File( "destination.txt" ); Counter<String> counter = new Counter<String>(); Inputs.text( source ).transferTo( map(counter, Outputs.text(destination) )); System.out.println( "Lines: " + counter.count() );
The Counter is a Function which gets injected into the transfer.
Ok, so we have seen that the end result can become pretty compelling. How does it work?
I/O is defined as a process of moving data from an Input, via one or more Transforms to an Output. The Input could be a File or a String, the transformation could be a filter, conversion or a function and finally the Output destination could be a File, String or an OutputStream. It is important to note that there is a strong separation of concern between them. Let’s look at the on at a time.
This interface simply has a transferTo() method, which takes an Output. The formal definition is;
public interface Input<T, SenderThrowableType extends Throwable> { <ReceiverThrowableType extends Throwable> void transferTo( Output<? super T, ReceiverThrowableType> output ) throws SenderThrowableType, ReceiverThrowableType; }
What on earth is all this genericized Exceptions? Well, it abstracts away the explicit Exception that implementations may throw (or not). So instead of demanding that all I/O must throw the java.io.IOException, as is the case in the JDK classes, it is up to the implementation to declare what it may throw. That is found in the SenderThrowable generic of the interface.
But hold on a second. Why is an Input throwing a "sender" exception? Well, think again. The Input is feeding "something" with data. It takes it from some source and "sends it" to the downstream chain, eventually reaching an Output, which likewise is the ultimate "receiver".
So, then, the method transferTo() contains the declaration of the downstream receiver’s possible Exception (ReceiverThrowable) which the transferTo() method may also throw as the data may not be accepted and such exception will bubble up to the transferTo() method (the client’s view of the transfer).
The output interface is likewise fairly simple;
public interface Output<T, ReceiverThrowableType extends Throwable> { [...snip...] <SenderThrowableType extends Throwable> void receiveFrom( Sender<? extends T, SenderThrowableType> sender ) throws ReceiverThrowableType, SenderThrowableType; }
It can simply receive data from a org.apache.zest.io.Sender.
Hey, hold on! Why is it not receiving from an Input? Because the Input is the client’s entry point and of no use to the Output as such. Instead, the Output will tell the Sender (which is handed to from the Input or an transformation) to which Receiver the content should be sent to.
Complicated? Perhaps not trivial to get your head around it at first, but the chain is;
Input passes a Sender to the Output which tells the Sender which Receiver the data should be sent to.
The reason for this is that we need to separate the concerns properly. Input is a starting point, but it emits something which is represented by the Sender. Likewise the destination is an Output, but it receives the data via its Receiver interface. For O/S resources, they are handled purely inside the Input and Output implementations, where are the Sender/Receiver are effectively dealing with the data itself.
The 3 component in the Zest™ Core I/O API is the transformations that are possible. Interestingly enough, with the above separation of concerns, we don’t need an InputOutput type that can both receive and send data. Instead, we simply need to prepare easy to use static factory methods, which are found in the org.apache.zest.io.Transforms class. Again, it is fairly straight forward to create your own Transforms if you need something not provided here.
The current transformations available are;
There are also a couple of handy map functions available, such as
Let us take a closer look at the implementation of a map function, namely Counter mentioned above and also used in the section First Example above.
The implementation is very straight forward.
public static class Counter<T> implements Function<T, T> { private volatile long count = 0; public long count() { return count; } @Override public T apply( T t ) { count++; return t; } }
On each call to the map() method, increment the counter field. The client can then retrieve that value after the transfer is complete, or in a separate thread to show progress.
Speaking of "progress", so how is the ProgressLog implemented? Glad you asked;
public static class ProgressLog<T> implements Function<T, T> { private Counter<T> counter; private Log<String> log; private final long interval; public ProgressLog( Logger logger, String format, long interval ) { this.interval = interval; if( logger != null && format != null ) { log = new Log<>( logger, format ); } counter = new Counter<>(); } public ProgressLog( long interval ) { this.interval = interval; counter = new Counter<>(); } @Override public T apply( T t ) { counter.apply( t ); if( counter.count % interval == 0 ) { logProgress(); } return t; } // Override this to do something other than logging the progress protected void logProgress() { if( log != null ) { log.apply( counter.count + "" ); } } }
It combines the Counter and the Log implementations, so that the count is forwarded to the Log at a given interval, such as every 1000 items. This may not be what you think a ProgressLog should look like, but it serves as a good example on how you can combine the general principles found in the Zest™ Core API package.
The filter transform takes a regular java.util.function.Predicate as an argument, where the implementation needs to return true or false for whether the item passed in is within the limits. Let’s say that you have a IntegerRangeSpecification, which could then be implemented as
public static class IntegerRangeSpecification implements Predicate<Integer> { private int lower; private int higher; public IntegerRangeSpecification( int lower, int higher ) { this.lower = lower; this.higher = higher; } @Override public boolean test( Integer item ) { return item >= lower && item <= higher; } }
Input and Output implementations at first glance look quite scary. Taking a closer look and it can be followed. But to simplify for users, the org.apache.zest.io.Inputs and org.apache.zest.io.Outputs contains static factory methods for many useful sources and destinations.
The current set of ready-to-use Input implementations are;
/** * Read lines from a String. * * @param source lines * * @return Input that provides lines from the string as strings */ public static Input<String, RuntimeException> text( final String source ) [...snip...] /** * Read lines from a Reader. * * @param source lines * * @return Input that provides lines from the string as strings */ public static Input<String, RuntimeException> text( final Reader source ) [...snip...] /** * Read lines from a UTF-8 encoded textfile. * * If the filename ends with .gz, then the data is automatically unzipped when read. * * @param source textfile with lines separated by \n character * * @return Input that provides lines from the textfiles as strings */ public static Input<String, IOException> text( final File source ) [...snip...] /** * Read lines from a textfile with the given encoding. * * If the filename ends with .gz, then the data is automatically unzipped when read. * * @param source textfile with lines separated by \n character * @param encoding encoding of file, e.g. "UTF-8" * * @return Input that provides lines from the textfiles as strings */ public static Input<String, IOException> text( final File source, final String encoding ) [...snip...] /** * Read lines from a textfile at a given URL. * * If the content support gzip encoding, then the data is automatically unzipped when read. * * The charset in the content-type of the URL will be used for parsing. Default is UTF-8. * * @param source textfile with lines separated by \n character * * @return Input that provides lines from the textfiles as strings */ public static Input<String, IOException> text( final URL source ) [...snip...] /** * Read a file using ByteBuffer of a given size. Useful for transferring raw data. * * @param source The file to be read. * @param bufferSize The size of the byte array. * * @return An Input instance to be applied to streaming operations. */ public static Input<ByteBuffer, IOException> byteBuffer( final File source, final int bufferSize ) [...snip...] /** * Read an inputstream using ByteBuffer of a given size. * * @param source The InputStream to be read. * @param bufferSize The size of the byte array. * * @return An Input instance to be applied to streaming operations. */ public static Input<ByteBuffer, IOException> byteBuffer( final InputStream source, final int bufferSize ) [...snip...] /** * Combine many Input into one single Input. When a transfer is initiated from it all items from all inputs will be transferred * to the given Output. * * @param inputs An Iterable of Input instances to be combined. * @param <T> The item type of the Input * @param <SenderThrowableType> The Throwable that might be thrown by the Inputs. * * @return A combined Input, allowing for easy aggregation of many Input sources. */ public static <T, SenderThrowableType extends Throwable> Input<T, SenderThrowableType> combine( final Iterable<Input<T, SenderThrowableType>> inputs ) [...snip...] /** * Create an Input that takes its items from the given Iterable. * * @param iterable The Iterable to be used as an Input. * @param <T> The item type of the Input * * @return An Input instance that is backed by the Iterable. */ public static <T> Input<T, RuntimeException> iterable( final Iterable<T> iterable ) [...snip...] /** * Create an Input that allows a Visitor to write to an OutputStream. The stream is a BufferedOutputStream, so when enough * data has been gathered it will send this in chunks of the given size to the Output it is transferred to. The Visitor does not have to call * close() on the OutputStream, but should ensure that any wrapper streams or writers are flushed so that all data is sent. * * @param outputVisitor The OutputStream Visitor that will be backing the Input ByteBuffer. * @param bufferSize The buffering size. * * @return An Input instance of ByteBuffer, that is backed by an Visitor to an OutputStream. */ public static Input<ByteBuffer, IOException> output( final Visitor<OutputStream, IOException> outputVisitor, final int bufferSize )
The current set of ready-to-use Input implementations are;
/** * Write lines to a text file with UTF-8 encoding. Separate each line with a newline ("\n" character). If the writing or sending fails, * the file is deleted. * <p> * If the filename ends with .gz, then the data is automatically GZipped. * </p> * @param file the file to save the text to * * @return an Output for storing text in a file */ public static Output<String, IOException> text( final File file ) [...snip...] /** * Write lines to a text file. Separate each line with a newline ("\n" character). If the writing or sending fails, * the file is deleted. * <p> * If the filename ends with .gz, then the data is automatically GZipped. * </p> * @param file the file to save the text to * * @return an Output for storing text in a file */ public static Output<String, IOException> text( final File file, final String encoding ) [...snip...] /** * Write lines to a Writer. Separate each line with a newline ("\n" character). * * @param writer the Writer to write the text to * @return an Output for storing text in a Writer */ public static Output<String, IOException> text( final Writer writer ) [...snip...] /** * Write lines to a StringBuilder. Separate each line with a newline ("\n" character). * * @param builder the StringBuilder to append the text to * @return an Output for storing text in a StringBuilder */ public static Output<String, IOException> text( final StringBuilder builder ) [...snip...] /** * Write ByteBuffer data to a file. If the writing or sending of data fails the file will be deleted. * * @param file The destination file. * * @return The Output ByteBuffer instance backed by a File. */ public static Output<ByteBuffer, IOException> byteBuffer( final File file ) [...snip...] /** * Write ByteBuffer data to an OutputStream. * * @param stream Destination OutputStream * * @return The Output of ByteBuffer that will be backed by the OutputStream. */ public static Output<ByteBuffer, IOException> byteBuffer( final OutputStream stream ) [...snip...] /** * Write byte array data to a file. If the writing or sending of data fails the file will be deleted. * * @param file The File to be written to. * @param bufferSize The size of the ByteBuffer. * * @return An Output instance that will write to the given File. */ public static Output<byte[], IOException> bytes( final File file, final int bufferSize ) [...snip...] /** * Do nothing. Use this if you have all logic in filters and/or specifications * * @param <T> The item type. * * @return An Output instance that ignores all data. */ public static <T> Output<T, RuntimeException> noop() [...snip...] /** * Use given receiver as Output. Use this if there is no need to create a "transaction" for each transfer, and no need * to do batch writes or similar. * * @param <T> The item type * @param receiver receiver for this Output * * @return An Output instance backed by a Receiver of items. */ public static <T, ReceiverThrowableType extends Throwable> Output<T, ReceiverThrowableType> withReceiver( final Receiver<T, ReceiverThrowableType> receiver ) [...snip...] /** * Write objects to System.out.println. * * @return An Output instance that is backed by System.out */ public static Output<Object, RuntimeException> systemOut() [...snip...] /** * Write objects to System.err.println. * * @return An Output instance backed by System.in */ @SuppressWarnings( "UnusedDeclaration" ) public static Output<Object, RuntimeException> systemErr() [...snip...] /** * Add items to a collection */ public static <T> Output<T, RuntimeException> collection( final Collection<T> collection )