Apache Streams (incubating)
Release Notes - Streams - Version 0.4
Bug
- [STREAMS-73] - Interfaces
- [STREAMS-132] - RegexUtils does not ensure that content is non-null
- [STREAMS-151] - Refactor Facebook Provider to have continuous AND finite mode
- [STREAMS-159] - Add facebook page feed provider
- [STREAMS-177] - new NPE in LinkResolver causing surefire to fail
- [STREAMS-258] - Prevent signing of .asc, .md5, and .sha1 artifacts
- [STREAMS-391] - streams-provider-dropwizard exceptions in test logs
- [STREAMS-405] - Link to src/site/markdown/index.md in README.md is broken
- [STREAMS-407] - add DateTimeSerializers for formats safely, don't crash if invalid
- [STREAMS-414] - Incorrect Documentation
- [STREAMS-436] - Put a timeout on all Provider Integration Tests
- [STREAMS-437] - DatumFromMetadataProcessorIT failing during 0.4 release
- [STREAMS-446] - RAT check fails in prep for 0.4-incubating release
- [STREAMS-447] - Scala-plugin failures in prep for 0.4-incubating release
Improvement
- [STREAMS-127] - JsonSchema Replication in Datasift provider
- [STREAMS-160] - Embed original source provider pojos inside datasift pojos if possible
- [STREAMS-186] - Platform-level 'detectConfiguration'
- [STREAMS-399] - Add any missing fields to tweet.json
- [STREAMS-400] - streams-persist-elasticsearch : bump to version 2.x
- [STREAMS-403] - Ensure all providers function stand-alone
- [STREAMS-404] - Integration Test for FsElasticsearchIndex
- [STREAMS-411] - ability (and instructions on how) to run providers directly from console
- [STREAMS-413] - Update dependency and plugin versions - Q4 2016
- [STREAMS-425] - better tracking completion in multi-threaded providers
- [STREAMS-426] - streams-persist-mongo : test with docker
- [STREAMS-427] - Support any jackson-compatible class as valid input to base converters and provider converters
- [STREAMS-428] - Update example markdown on website
- [STREAMS-429] - fix failing tests in streams-plugins
- [STREAMS-430] - update jenkins to run advanced integration testing
- [STREAMS-431] - Remove streams.util.RegexUtils
- [STREAMS-432] - Update to Java 8
- [STREAMS-433] - Upgrade maven and jenkins to build with jdk8
- [STREAMS-434] - Delete CustomDateTimeFormat which is not used anywhere
- [STREAMS-435] - remove incubator-streams-master-pom.xml
New Feature
- [STREAMS-213] - Publish jsonschemas to a web-accessible URL when jenkins builds snapshot and releases
- [STREAMS-389] - Support generation of scala source from jsonschemas
- [STREAMS-398] - Support generation of hive table definitions from jsonschema
- [STREAMS-418] - Flink twitter example(s)
Task
- [STREAMS-203] - Update GooglePlus TypeConverter to handle Post Activities
- [STREAMS-316] - add "apache" to the artifact name
- [STREAMS-408] - Check package names and run instructions of modules in streams-examples/local
- [STREAMS-409] - the copyright year in the NOTICE files needs to be updated for 2016.
- [STREAMS-410] - Delete any modules which have been removed from reactor from master branch
- [STREAMS-416] - Delete defunct or not-implemented provider modules
- [STREAMS-417] - Collect example AS 2.0 object and activity documents to use in test cases
- [STREAMS-419] - reboot: cleanup git branches and tags
- [STREAMS-421] - Delete defunct or not-implemented runtime modules
Test
- [STREAMS-415] - Proof of concept integration test that pulls actual data from generator
Release Notes - Streams - Version 0.3
Bug
- [STREAMS-158] - Sysomos Processor exceptions result in failed processor thread
- [STREAMS-220] - FacebookPostSerializer is broken
- [STREAMS-223] - streams-monitoring exception when streamConfig not set
- [STREAMS-227] - Array out of bounds Exception running FacebookTypeConverter
- [STREAMS-229] - GooglePlus TypeConverter needs to be able to accommodate String datums
- [STREAMS-230] - Broadcast Monitor doesn't start in all cases
- [STREAMS-236] - AbstractRegexExtensionExtractor should not allow duplicate entities
- [STREAMS-260] - FacebookPageFeedDataCollector should handle backoff strategy correctly
- [STREAMS-264] - LinkExpansion Tests are depending on (non-existent) external resources
- [STREAMS-311] - TwitterUserInformationProvider stalls with > 20 items provided
- [STREAMS-337] - WebHdfsPersistReader/Writer should allow user-specified file encoding
- [STREAMS-338] - Runtime exceptions caught by LocalStreamBuilder aren't logged
- [STREAMS-346] - persist-hdfs: unnecessary port config causes failure when reading from local filesystem
- [STREAMS-347] - persist-hdfs: incorrectly escapes output when write called on a json String
- [STREAMS-354] - pojo.json.Collection.totalItems should be an integer
- [STREAMS-355] - WebHdfsPersistReader should be serializable
- [STREAMS-356] - Metadata missing in StreamsDatum constructors and toString
- [STREAMS-357] - Streams can shut down before starting if providers haven't started yet when runtime first checks run status
- [STREAMS-358] - Refreshing indexes in EsPW.cleanUp can result in streams not terminating
- [STREAMS-374] - Fix ActivityConverterUtil detectClasses(document) exception handling
- [STREAMS-377] - can't create different LineReadWriteUtils
- [STREAMS-380] - Media_Link should be serializable
- [STREAMS-382] - Media_Link width and height should be integers, not floats
- [STREAMS-386] - Clean up streams-pojo jsonschemas to ensure all beans are serializable
- [STREAMS-395] - incubator-streams-examples COMPILATION ERROR
Improvement
- [STREAMS-126] - StreamsLocalBuilder should look for provider timeout in typesafe config
- [STREAMS-191] - Streams implement use of throughput queues
- [STREAMS-226] - Consolidate all stream-wide configuration
- [STREAMS-279] - Create Youtube User/Channel provider
- [STREAMS-294] - Allow for custom setting of QueueSize, BatchSize, and ScrollTimeout
- [STREAMS-308] - Update readme and website to recommend latest version of JDK
- [STREAMS-313] - Performant Off-line Neo4j GraphPersistWriter
- [STREAMS-320] - Update release documentation to cover build and deploy of mvn site to svn
- [STREAMS-321] - Add HOCON Converter support for input object not at root node
- [STREAMS-322] - Support gzipped files in WebHdfsReader
- [STREAMS-324] - clean up example poms
- [STREAMS-325] - Configure monitoring via typesafe
- [STREAMS-326] - Bump hadoop version in streams-persist-hdfs
- [STREAMS-328] - StreamsJacksonMapper should omit null/empty fields when serializing
- [STREAMS-329] - Processor to derive required ES metadata from standard fields
- [STREAMS-330] - make TwitterErrorHandler respect the Twitter4j RateLimitStatus reset time
- [STREAMS-332] - Restore ability to test data conversions on flat files during build
- [STREAMS-333] - ElasticsearchPersistUpdater should set parent+routing when available
- [STREAMS-335] - Publish schema and configuration resources to maven site pages
- [STREAMS-336] - consolidate common pom sections to streams-master
- [STREAMS-339] - incorporate defined extension fields for activities and actors into pojos
- [STREAMS-342] - Expose convertResultToString and processLine as public methods
- [STREAMS-343] - Make SerializationUtil spark-compatible
- [STREAMS-348] - Add get/set methods for StreamsConfiguration to Builder interface
- [STREAMS-350] - provider-twitter: derive baseURL from configuration
- [STREAMS-353] - Support removal of 'extensions' from document path structure
- [STREAMS-361] - Persist Reader/Writer for Amazon Kinesis
- [STREAMS-362] - Make shutdown timers in local runtime fully configurable
- [STREAMS-363] - upgrade persist-s3 to match persist-hdfs features
- [STREAMS-364] - Allow resolution of typesafe/stream config from url in stream-config
- [STREAMS-368] - Add String getId() to StreamsOperation
- [STREAMS-370] - Support arbitrary labels in streams-persist-graph
- [STREAMS-371] - Allow use of ids endpoints in TwitterFollowingProvider
- [STREAMS-373] - Allow specification of 'maximum_items' in twitter providers
- [STREAMS-375] - Override component configuration when valid beans supplied by runtime as prepare args
- [STREAMS-376] - When twitter providers fail authentication, log the ID that could not be accessed.
- [STREAMS-378] - incompatibility between streams binaries and spark 1.5
- [STREAMS-381] - streams-provider-twitter: User can contain a Status
- [STREAMS-383] - register DefaultScalaModule in StreamsJacksonMapper
- [STREAMS-387] - Default behavior of streams-pojo-extensions: use additionalProperties directly
- [STREAMS-390] - Eliminate dependency and plugin warnings from build
- [STREAMS-392] - Centralized logging configuration for maven build
- [STREAMS-393] - Switch any usage of System.out and System.err to slf4j
- [STREAMS-397] - streams-provider-youtube test failure in jenkins
- [STREAMS-401] - Refresh Streams Website
- [STREAMS-406] - Bring streams-master markdown files into compliance with rat plugin
Story
- [STREAMS-43] - Complete, test, and document g+ provider
- [STREAMS-44] - Complete, test, and document sysomos provider
- [STREAMS-46] - Complete, test, and document facebook API provider
- [STREAMS-261] - Create Facebook Bio Collector/Provider
- [STREAMS-272] - Create a Youtube Post Provider
Task
- [STREAMS-123] - Add twitter specific link handling for datasift
- [STREAMS-202] - Create Google Plus Deserializer and TypeConverter
- [STREAMS-274] - Create a YoutubeTypeConverter and serializer
- [STREAMS-280] - Add ability to get a final document count from the Sysomos Provider
- [STREAMS-298] - Ensure jcl-over-slf4j bridge is included where necessary
- [STREAMS-317] - release a .tar.gz artefact as well as a .zip
- [STREAMS-318] - remove test files whose license may not be compatible with an Apache release (google-gplus)
- [STREAMS-319] - update <developer> info in the pom with active contributors
- [STREAMS-331] - Set up build of master and pull requested branches on travis-ci.org
- [STREAMS-334] - version bump datasift module to remove boundary repo dependency
- [STREAMS-351] - add A2 license to site.xml files
- [STREAMS-359] - copy streams-master pom into streams-project as CI workaround
- [STREAMS-372] - Support deploy to snapshot repo by CI
Test
- [STREAMS-208] - Integration Testing capability and reference Integration Test
- [STREAMS-384] - TestLinkUnwinderProcessor.test404Link is failing
Release Notes - Streams - Version 0.2
Sub-task
- [STREAMS-275] - ActivityConverterProcessor should apply reflection mode when configuration is not provided
- [STREAMS-277] - Upgrade streams-provider-twitter to work with reflection-based conversion
- [STREAMS-305] - Add missing AL
- [STREAMS-306] - Intermittent test failures
- [STREAMS-307] - Release test-jar packaging
- [STREAMS-309] - incubator-streams-examples site plugin
- [STREAMS-312] - Remove test resources without clear licensing and ignore tests that require them
Bug
- [STREAMS-155] - Build hadoop modules with Apache artifacts
- [STREAMS-200] - StreamsProcessorTask ignores any processing
- [STREAMS-225] - Streams need to remove any of their JMX beans on shutdown/cleanup
- [STREAMS-243] - S3 Persist Writer does not flush or shutdown on stream shutdown
- [STREAMS-263] - FacebookTypeConverter should be able to handle Facebook Pages
- [STREAMS-266] - some classes/tests are using the wrong NotImplementedException
- [STREAMS-278] - Rework pig runtime as part of switch from 'Serializer' to 'Converter'
- [STREAMS-281] - enable BroadcastMessagePersister
- [STREAMS-288] - StreamsJacksonModule should not scan for DateTimeFormats by default
- [STREAMS-296] - Local Runtime doesn't allow persist writers enough time to flush and close during shutdown
- [STREAMS-299] - Sysomos Provider uses dev API URL
Improvement
- [STREAMS-147] - Platform-level type conversion
- [STREAMS-201] - Util function to remove all MXBeans for tests
- [STREAMS-214] - Create, test, and document file-backed persistence module
- [STREAMS-271] - suggest increasing JVM heap in readme
- [STREAMS-273] - Support POST endpoints in streams-http
- [STREAMS-284] - Read/write parent IDs in streams-persist-elasticsearch
- [STREAMS-285] - Add all objectTypes in spec to streams-pojo
- [STREAMS-286] - Add all verbs in spec to streams-pojo
- [STREAMS-293] - allow for missing metadata fields in streams-persist-hdfs
Story
- [STREAMS-47] - Complete, test, and document mongo persist
Task
- [STREAMS-95] - TwitterProfileProcessor needs to send a user ID in each StreamsDatum
- [STREAMS-304] - Perform 0.2-incubating release
Release Notes - Streams - Version 0.1
Sub-task
- [STREAMS-212] - Generic Type Converter Processor
- [STREAMS-215] - Add method to StreamsConfigurator to return Serializable config given a Typesafe Config
- [STREAMS-218] - Generic Activity Converter Processor
- [STREAMS-241] - Reflection-based StreamsJacksonMapper
- [STREAMS-245] - Clean-up root POM
- [STREAMS-246] - Clean-up module POMs
- [STREAMS-247] - Clean-up READMEs
- [STREAMS-248] - Bring website content in-line with released capabilities.
- [STREAMS-249] - Javadoc plugin
- [STREAMS-250] - Prepare release notes
- [STREAMS-251] - Dry-run release process, review artifacts, update documentation
- [STREAMS-254] - Resolve rat plugin failures
- [STREAMS-255] - Merge streams-master into streams-project
- [STREAMS-256] - streams-components-all version doesn't bump during release
- [STREAMS-267] - .gitignore eclipse workspace files
- [STREAMS-268] - remove streams-web from master
- [STREAMS-269] - maven-remote-resources-plugin using SNAPSHOT resource-bundles
Bug
- [STREAMS-167] - TwitterConfigurator doesn't properly create TwitterUserInfoConfiguration from hocon
- [STREAMS-219] - src/main/resource files are being created by build
- [STREAMS-252] - Monitor Executor Service in LocalStreamBuilder needs to be Flexible
- [STREAMS-262] - jars in runner folder of source release
- [STREAMS-265] - some jar artifacts from 0.1-rc1 did not contain DISCLAIMER
Improvement
- [STREAMS-68] - ActivitySerializer should be ActivityConverter
- [STREAMS-143] - Allow modules to get instances of StreamsJacksonMapper that can process additional DateTime formats
- [STREAMS-244] - Prepare for 0.1 release