Framework implementation: org.apache.uima.java
Implementation name: $UIMA_READER_IMPL_TRAIN

Name: File System Collection Reader
Description: Reads files from the filesystem. This CollectionReader may be used
with or without a CAS Initializer. If a CAS Initializer is supplied, it will
be passed an InputStream to the file and must populate the CAS from that
InputStream. If no CAS Initializer is supplied, this CollectionReader will
read the file itself and treat the entire contents of the file as the
document to be inserted into the CAS.
Version: 1.0
Vendor: The Apache Software Foundation

Configuration parameters

Name: KnownPHINodeList
Description: List of XPaths to specific fields known to contain ONLY PHI.
Type: String
Multi-valued: true
Mandatory: false

Name: ScrubNodeList
Description: List of XPaths to scrub
Type: String
Multi-valued: true
Mandatory: true

Name: InputDirectory
Description: Directory containing input files
Type: String
Multi-valued: false
Mandatory: true

Name: Encoding
Description: Character encoding for the documents. If not specified,
the default system encoding will be used. Note that this parameter
only applies if there is no CAS Initializer provided; otherwise,
it is the CAS Initializer's responsibility to deal with character
encoding issues.
Type: String
Multi-valued: false
Mandatory: false

Name: Language
Description: ISO language code for the documents
Type: String
Multi-valued: false
Mandatory: false

Name: BrowseSubdirectories
Description: True means include files of subdirectories, recursively, of the input directory.
Type: Boolean
Multi-valued: false
Mandatory: false

Configuration parameter settings

ScrubNodeList:
/Envelope/Body/PathologyCase/FullReportData
/Envelope/Body/PathologyCase/FullReportText
/Envelope/Body/PathologyCase/GrossDescriptionText
/Envelope/Body/PathologyCase/DiagnosisText

KnownPHINodeList:
/Envelope/Header/Identifiers/FirstName
/Envelope/Header/Identifiers/LastName
/Envelope/Header/Identifiers/DateOfBirth
/Envelope/Header/Identifiers/SSN
/Envelope/Header/Identifiers/AccessionNumber
/Envelope/Header/Identifiers/LocalMRN

InputDirectory: $DIR_INPUT_TRAIN
BrowseSubdirectories: false

FS index

Type: org.spin.scrubber.uima.type.KnownPHI
Kind: sorted
Key feature: begin
Comparator: standard

Capabilities

Types:
org.apache.uima.examples.SourceDocumentInformation
org.spin.scrubber.uima.type.KnownPHI

Operational properties

Modifies CAS: true
Multiple deployment allowed: false
Outputs new CASes: true
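
The Encoding and BrowseSubdirectories parameters described above can be illustrated with a minimal Java sketch. This is not the reader's actual implementation: the class name FileSystemReaderSketch and the methods listFiles and readDocument are hypothetical, and the sketch covers only the case where no CAS Initializer is supplied, so the whole file becomes the document text.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper class; names are illustrative, not part of UIMA.
final class FileSystemReaderSketch {

    // Lists regular files under inputDir. Subdirectories are visited
    // only when browseSubdirectories is true (cf. BrowseSubdirectories).
    static List<Path> listFiles(Path inputDir, boolean browseSubdirectories) throws IOException {
        // Depth 1 covers the directory itself and its direct children.
        int depth = browseSubdirectories ? Integer.MAX_VALUE : 1;
        try (Stream<Path> paths = Files.walk(inputDir, depth)) {
            return paths.filter(Files::isRegularFile)
                        .sorted()
                        .collect(Collectors.toList());
        }
    }

    // Reads the entire file as the document text. When no encoding is
    // given, the default system charset is used (cf. Encoding).
    static String readDocument(Path file, String encoding) throws IOException {
        Charset charset = (encoding == null || encoding.isEmpty())
                ? Charset.defaultCharset()
                : Charset.forName(encoding);
        return new String(Files.readAllBytes(file), charset);
    }
}
```

With the settings listed above (BrowseSubdirectories set to false and no Encoding value), such a sketch would list only the files directly inside the input directory and decode each one with the platform default charset.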