Release Notes - Tajo - Version 0.9.0
Changes since Tajo 0.8.0
Sub-task
- [TAJO-215] - Catalog should allow compatible types when finding functions
- [TAJO-218] - HiveQLAnalyzer has to support cast expression.
- [TAJO-517] - Publish Tajo jar to a public maven repository
- [TAJO-529] - Fix warnings in tajo-algebra
- [TAJO-602] - WorkerResourceManager should be broken down into 3 parts
- [TAJO-615] - Implement ALTER TABLE RENAME TABLE
- [TAJO-659] - Add Tajo JDBC documentation
- [TAJO-667] - Add math function documentation
- [TAJO-668] - Add datetime function documentation
- [TAJO-669] - Add cluster setup documentation
- [TAJO-696] - Implement ALTER TABLE ADD COLUMN
- [TAJO-697] - Implement ALTER TABLE RENAME COLUMN
- [TAJO-736] - Add table management documentation
- [TAJO-761] - Implements INTERVAL type
- [TAJO-762] - Implements current date/time function
- [TAJO-783] - Remove yarn-related code from tajo-core
- [TAJO-790] - Implements ADD_MONTHS() function
- [TAJO-791] - Implements ADD_DAYS() function
- [TAJO-836] - create index support
- [TAJO-837] - Register index meta information at Catalog
- [TAJO-907] - Implement off-heap tuple block and zero-copy tuple
- [TAJO-924] - Merge the current window_function branch to master
- [TAJO-991] - Running PullServer on a dedicated JVM process which separates from worker.
- [TAJO-992] - Reduce number of hash shuffle output files.
- [TAJO-1008] - Protocol buffer De/Serialization for EvalNode
- [TAJO-1015] - Add executionblock event in worker
- [TAJO-1016] - Refactor worker rpc information
- [TAJO-1038] - Remove use of Builder variable in Schema
- [TAJO-1060] - Apply updated hadoop versions to README and BUILDING files
- [TAJO-1061] - Update build documentation
- [TAJO-1062] - Update TSQL documentation
- [TAJO-1068] - Add SQL Query documentation
- [TAJO-1069] - Add document to explain High Availability support
- [TAJO-1077] - Add Derby configuration documentation
- [TAJO-1096] - Update download source documentation
Bug
- [TAJO-427] - Empty table makes IndexOutOfBoundsException at LEFT OUTER JOIN clause.
- [TAJO-563] - INSERT OVERWRITE should not remove data before query success
- [TAJO-587] - Query is hanging when OutOfMemoryError occurs in the query master
- [TAJO-590] - Rename HiveConverter to HiveQLAnalyzer
- [TAJO-619] - SELECT count(1) after joins on text keys causes wrong plans
- [TAJO-620] - A join query can cause IndexOutOfBoundsException if one of tables is empty.
- [TAJO-624] - Incorrect progress indication
- [TAJO-628] - The second stage of distinct aggregation can be scheduled to only one node.
- [TAJO-630] - QueryMasterTask never finished when Internal error occurs.
- [TAJO-635] - Improve tests of query semantic verification
- [TAJO-638] - QueryUnitAttempt causes Invalid event error: TA_UPDATE at TA_ASSIGNED
- [TAJO-640] - In inner join clause, empty table can cause an error by order-by clause.
- [TAJO-641] - NPE in HCatalogStore.addTable()
- [TAJO-645] - Task.Reporter can cause NPE during reporting.
- [TAJO-647] - Work unbalance on disk scheduling of DefaultScheduler
- [TAJO-648] - TajoWorker does not send correct QM rpc and client rpc ports via heartbeat.
- [TAJO-650] - Repartitioner::scheduleHashShuffledFetches should adjust the number of tasks
- [TAJO-651] - HcatalogStore should support (de)serialization of RCFile
- [TAJO-652] - logical planner cannot handle alias on partition columns
- [TAJO-653] - RCFileAppender throws IOException
- [TAJO-663] - CREATE TABLE USING RAW doesn't throw ERROR
- [TAJO-672] - Wrong progress status when overwrites to partition table
- [TAJO-674] - ExplainLogicalPlan can cause NPE when a query includes derived tables
- [TAJO-682] - RangePartitionAlgorithm should be improved to handle empty texts
- [TAJO-689] - NoSuchElementException occurs during assigning the leaf tasks
- [TAJO-690] - Infinite loop occurs when assigning rack tasks
- [TAJO-693] - StatusUpdateTransition in QueryUnitAttempt handles TA_UPDATE incorrectly
- [TAJO-698] - Error occurs when FUNCTION and IN statement are used together.
- [TAJO-701] - Invalid bytes when creating BlobDatum with offset
- [TAJO-705] - CTAS always stores tables with CSV storage type into catalog
- [TAJO-706] - In the case of very quick query, client can't get query status.
- [TAJO-712] - Fix some bugs after database is supported
- [TAJO-713] - Missing INET4 in UniformRangePartition
- [TAJO-716] - Using column names actually aliased in aggregation functions can cause planning error.
- [TAJO-718] - A group-by clause with the same columns but aliased causes planning error.
- [TAJO-719] - JUnit test failures
- [TAJO-729] - PreLogicalPlanVerifier verifies distinct aggregation functions incorrectly.
- [TAJO-738] - NPE occurs when QueryMaster's GlobalPlanner.build() fails.
- [TAJO-739] - A subquery with the same column alias caused planning error.
- [TAJO-741] - GreedyHeuristicJoinOrderAlgorithm removes some join pairs.
- [TAJO-747] - BroadCastJoin omits some data.
- [TAJO-748] - Shuffle output numbers of join may be inconsistent.
- [TAJO-750] - Join order abnormally affects the result data.
- [TAJO-754] - failure of INSERT INTO may remove the target table.
- [TAJO-759] - Fix findbug errors added recently.
- [TAJO-763] - Out of range problem in utc_usec_to()
- [TAJO-765] - Incorrect Configuration Classpaths
- [TAJO-766] - Test failures in TestExecExternalShellCommand
- [TAJO-772] - TajoDump cannot dump upper/lower mixed case database names.
- [TAJO-777] - Partition column in function parameter causes NPE
- [TAJO-778] - TPC-DS Q34 causes NPE
- [TAJO-779] - TPC-DS Q46 causes NPE
- [TAJO-786] - TajoDataMetaDatabase::getSchemas creates invalid MetaDataTuple
- [TAJO-787] - FilterPushDownRule::visitSubQuery does not consider aliased columns.
- [TAJO-792] - Insert table with a qualified target table name can cause error.
- [TAJO-795] - PlannerUtil::joinJoinKeyForEachTable needs to handle theta-join.
- [TAJO-799] - Local query without FROM throws IllegalArgumentException in CLI
- [TAJO-800] - CLI's meta command should be aware of "TABLE_NAME" style.
- [TAJO-802] - No partition columns in WEB catalog page.
- [TAJO-803] - INSERT INTO without FROM throws ClassCastException.
- [TAJO-805] - Multiple constant in selection emits some columns.
- [TAJO-806] - CreateTableNode in CTAS uses a wrong schema as output schema and table schema.
- [TAJO-808] - Fix pre-commit build failure
- [TAJO-812] - Some methods of TajoDatabaseMetaData should result in an empty tuple list instead of SQLFeatureNotSupportedException.
- [TAJO-813] - CLI should support comment character with multi-line query.
- [TAJO-816] - NULL delimiter doesn't apply with HCatalogStore
- [TAJO-819] - KillQuery occasionally does not work.
- [TAJO-821] - IllegalStateException occurs when a NettyClientBase object is created within single thread.
- [TAJO-823] - Missing INET4 handling in DatumFactory.cast()
- [TAJO-827] - SUM() overflow in the case of INT4
- [TAJO-829] - Same constants in groupby clause may cause planning error.
- [TAJO-830] - Some filter conditions with a SUBQUERY are removed by optimizer.
- [TAJO-832] - NPE occurs when Exception's message is null in Task.
- [TAJO-833] - NPE occurs when using the column as an alias name in the multiple DISTINCT.
- [TAJO-848] - PreLogicalPlanVerifier::visitInsert needs to find smaller expressions than target columns for a partitioned table.
- [TAJO-850] - OUTER JOIN does not properly handle a NULL.
- [TAJO-851] - Timestamp type test of TestSQLExpression::testCastFromTable fails in jenkins CI test
- [TAJO-852] - Integration test using HCatalog as a catalog store fails
- [TAJO-861] - tajo-dump script is not executable
- [TAJO-862] - Restore failure of dumped relations
- [TAJO-863] - Column order mismatched in the JOIN query with asterisk selection.
- [TAJO-864] - JUnit test failure at TestTimestampDatum.testTimestampConstructor()
- [TAJO-866] - COUNT DISTINCT with other aggregation function throws ClassCastException.
- [TAJO-867] - OUTER JOIN with empty result subquery produces a wrong result.
- [TAJO-868] - TestDateTimeFunctions unit test occasionally fails
- [TAJO-869] - Sometimes, the unit test of testTaskRunnerHistory fails.
- [TAJO-870] - FilterPushDown ignores a partitioned column in CASE expression within WHERE clause
- [TAJO-872] - NOW() function has a different value on each task.
- [TAJO-873] - Query status is still RUNNING after session expired.
- [TAJO-874] - Sometimes InvalidOperationException occurs when aggregating TableStat.
- [TAJO-879] - Some data is missing in the case of BROADCAST JOIN and multi-column partition.
- [TAJO-880] - NULL in CASE clause causes Exception.
- [TAJO-881] - JOIN with union query causes NPE
- [TAJO-882] - CLI hangs when an error occurs in the GlobalPlanner.
- [TAJO-884] - complex join conditions should be supported in ON clause
- [TAJO-891] - Complex join conditions with UNION or inline should be supported
- [TAJO-894] - Left outer join with partitioned large table and small table returns empty result.
- [TAJO-896] - Full outer join query with empty intermediate data doesn't terminate.
- [TAJO-897] - PartitionedTableRewriter is repeated several times with same table.
- [TAJO-898] - Left outer join with union returns empty result.
- [TAJO-899] - Nested now() has different value for each task
- [TAJO-902] - Unicode delimiter does not work correctly
- [TAJO-904] - ORDER BY Null first support
- [TAJO-905] - When to_date() parses some date without day, the result will be wrong.
- [TAJO-908] - Fetcher does not retry when pull server connection is closed
- [TAJO-909] - {SortBased, Col}PartitionStoreExec should not write partition keys to files.
- [TAJO-912] - Tsql prints wrong version.
- [TAJO-913] - Add some missed tests for constant value group-by keys
- [TAJO-914] - join queries with constant values can cause schema mismatch in logical plan
- [TAJO-916] - SubQuery::computeStatFromTasks occasionally fails.
- [TAJO-917] - Using an alias name that is the same as an existing column name causes error
- [TAJO-925] - Child ExecutionBlock of JOIN node has different number of shuffle keys.
- [TAJO-926] - Join condition including column references of a row-preserving table in left outer join causes incorrect result
- [TAJO-927] - Broadcast Join with Large, Small, Large, Small tables makes a wrong plan.
- [TAJO-929] - Broadcast join with empty outer join table returns empty result.
- [TAJO-934] - Multiple DISTINCT returns null grouping key value.
- [TAJO-936] - TestStorages::testSplitable occasionally fails.
- [TAJO-939] - Refactoring the column resolver in LogicalPlan
- [TAJO-945] - Connecting to Tajo by JDBC driver failed with SQL Exception "Invalid JDBC URI"
- [TAJO-947] - ColPartitionStoreExec can cause URISyntaxException due to special characters
- [TAJO-948] - 'INSERT INTO' statement to nonexistent table causes NPE.
- [TAJO-949] - PullServer does not release files, when a channel throws an internal exception
- [TAJO-952] - Wrong default partition volume config
- [TAJO-957] - ROUND should support INT parameter.
- [TAJO-960] - TajoCli's problem does not show the current status
- [TAJO-961] - TajoCli should exit when at least one query faces error while executing a SQL script.
- [TAJO-962] - Column reference used in LIMIT clause incurs NPE.
- [TAJO-965] - Upgrade Bytes class and move some methods to others
- [TAJO-968] - Self-Join query (including partitioned table) doesn't run as expected using auto broadcast join.
- [TAJO-969] - Distributed sort on a large data set may result in incorrect results.
- [TAJO-972] - Broadcast join with left outer join returns duplicated rows.
- [TAJO-974] - Eliminate unexpected case condition in SubQuery
- [TAJO-975] - Alias name that is the same as an existing column name may cause NPE during PPD
- [TAJO-977] - INSERT into a partitioned table as SELECT statement uses a wrong schema.
- [TAJO-978] - RoundFloat8 should return Float8Datum type.
- [TAJO-979] - Dividing float value by zero should throw "Divide by zero Exception"
- [TAJO-980] - execution page in Web UI broken
- [TAJO-981] - Help command (\?) in tsql takes too long time.
- [TAJO-985] - Client API should be non-blocking
- [TAJO-994] - 'count(distinct x)' function counts first null value.
- [TAJO-995] - HiveMetaStoreClient wrapper should retry the connection
- [TAJO-996] - Sometimes, scheduleFetchesByEvenDistributedVolumes loses some FetchImpls
- [TAJO-999] - SequenceFile key class needs to be compatible.
- [TAJO-1000] - TextDatum.asChar() is incorrect, if client charset is different
- [TAJO-1004] - UniformRangePartition cannot deal with unicode ranges
- [TAJO-1006] - Fix wrong storage unit for kilo bytes and others.
- [TAJO-1009] - A binary eval for column references of the same tables should not be recognized as a join condition
- [TAJO-1013] - A complex equality condition including columns of the same table is recognized as a join condition
- [TAJO-1017] - TajoConf misuses read & write locks in some functions
- [TAJO-1020] - TajoContainerProxy::assignExecutionBlock causes NPE by race condition.
- [TAJO-1021] - Remove the member variable Builder from all classes inherited from ProtoObject.
- [TAJO-1022] - tsql does not work as background process
- [TAJO-1024] - RpcConnectionPool::getConnection can cause NPE at initialization
- [TAJO-1025] - Network disconnection during query processing can cause infinite exceptions
- [TAJO-1029] - TAJO_PULLSERVER_STANDALONE should be false in default tajo-env.sh
- [TAJO-1037] - KillQuery hang in subquery init state
- [TAJO-1047] - DefaultTaskScheduler:allocateRackTask occasionally fails on JDK 1.7
- [TAJO-1048] - Missed use of session variables in GlobalPlanner
- [TAJO-1050] - RPC client does not retry during connecting
- [TAJO-1056] - Wrong resource release or wrong task scheduling
- [TAJO-1065] - The \admin -cluster argument doesn't run as expected.
- [TAJO-1067] - INSERT OVERWRITE INTO should not remove all partitions.
- [TAJO-1072] - CLI gets stuck when wrong host/port is provided
- [TAJO-1074] - Query calculates wrong progress before subquery init
- [TAJO-1081] - Non-forwarded (simple) query shows wrong rows.
- [TAJO-1097] - IllegalArgumentException: RawFileScanner
- [TAJO-1098] - LogicalPlanVerifier should validate operations within CASE WHEN clauses.
- [TAJO-1099] - LogicalPlanner::convertDataType causes NPE in some cases.
- [TAJO-1101] - Broadcast join with a zero-length file table returns wrong result data.
- [TAJO-1102] - Self-join with a partitioned table returns wrong result data.
- [TAJO-1103] - Insert clause of partitioned table loses some FetchImpls
- [TAJO-1104] - Using asterisk with GROUP BY causes NPE.
- [TAJO-1106] - Missing session check in getFinishedQuery API
- [TAJO-1107] - Broadcast join on non-leaf node scans only first data file.
- [TAJO-1110] - JAVA_PULLSERVER_HEAP_MAX in bin/tajo should be increased
- [TAJO-1111] - TestKillQuery.testKillQueryFromInitState occasionally fails
- [TAJO-1113] - SubQuery in KILLED state should handle unexpected events.
Improvement
- [TAJO-153] - Proto(Async|Blocking)RpcClient should retry to connect a server when failed.
- [TAJO-196] - Add EngineContext to contain resource information about worker
- [TAJO-356] - Improve TajoClient to directly get query results in the first request
- [TAJO-425] - RAWFILE_SYNC_INTERVAL has no default value.
- [TAJO-589] - Add fine grained progress indicator for each task
- [TAJO-614] - Explaining a logical node should use ExplainLogicalPlanVisitor.
- [TAJO-616] - SequenceFile support
- [TAJO-617] - Rename BIN/tajo_dump to BIN/tajo-dump
- [TAJO-634] - ExecutionBlock must be sorted by start time in querydetail.jsp
- [TAJO-644] - Support quoted identifiers
- [TAJO-654] - Separate TajoWorker into yarn worker and standby worker
- [TAJO-662] - The tasks of CTAS on a partitioned table should be fine grained
- [TAJO-665] - sort buffer size must be dealt as long type values.
- [TAJO-670] - Change daemon's hostname to canonical hostname
- [TAJO-673] - Assign proper number of tasks when inserting into partitioned table
- [TAJO-675] - maximum frame size of frameDecoder should be increased
- [TAJO-691] - HashJoin or HashAggregation is too slow if there are many unique keys
- [TAJO-699] - Create a table using LIKE
- [TAJO-709] - Add .reviewboardrc and use rbt instead of post-review
- [TAJO-714] - Enable setting Parquet tuning parameters
- [TAJO-715] - hadoop version upgrade to 2.3.0
- [TAJO-717] - Improve file splitting for large number of splits
- [TAJO-725] - Broadcast JOIN should support multiple tables
- [TAJO-728] - Supports expressions in 'IN predicate'
- [TAJO-732] - Support executing LINUX shell command and HDFS command.
- [TAJO-734] - Arrange TajoCli output message.
- [TAJO-735] - Remove multiple SLF4J bindings message.
- [TAJO-737] - Change version message when daemon starts up.
- [TAJO-743] - Change the default resource allocation policy of leaf tasks
- [TAJO-745] - APIs in TajoClient and JDBC should be case sensitive.
- [TAJO-755] - ALTER TABLESPACE LOCATION support
- [TAJO-758] - Supports parameter values in the SQL file.
- [TAJO-768] - Improve the log4j configuration
- [TAJO-769] - Minor improvements for HCatalogStore
- [TAJO-789] - Improve shuffle URI
- [TAJO-793] - CLI should be able to exit when single query is failed.
- [TAJO-797] - Implicit type conversion support
- [TAJO-801] - Multiple distinct should be supported.
- [TAJO-804] - Bump up Parquet version to 1.4.2
- [TAJO-807] - Implement Round(numeric, int) function.
- [TAJO-811] - add simple fifo scheduler support
- [TAJO-824] - Improve SimpleParser to handle JSON statements
- [TAJO-840] - Improve query result print with counting empty table.
- [TAJO-842] - NULL handling in JDBC.
- [TAJO-843] - implements COALESCE for BOOLEAN, DATE, TIME, TIMESTAMP
- [TAJO-844] - JDBC should support getTime, getDate, and getTimestamp.
- [TAJO-846] - Clean up the task history in worker
- [TAJO-853] - Refactoring FilterPushDown for OUTER JOIN
- [TAJO-895] - ConstEval should not be included in target list of projectable nodes
- [TAJO-900] - Reducing memory usage during query processing
- [TAJO-903] - Some left outer join cases are not optimized as the broadcast join.
- [TAJO-906] - Runtime code generation for evaluating expression trees
- [TAJO-910] - Simple query (non-forwarded query) should be supported against partition tables.
- [TAJO-911] - Refactoring Mysql/Maria Catalog Store
- [TAJO-928] - Session variables should override query configs in TajoConf.
- [TAJO-931] - Output file can be punctuated depending on the file size.
- [TAJO-932] - Upgrade Parquet to 1.5.0.
- [TAJO-933] - Fork some classes of Parquet as builtin third-party classes
- [TAJO-937] - Should use tajo.util.VersionInfo instead of TajoConstants.TAJO_VERSION
- [TAJO-953] - RawFile should release a DirectBuffer immediately
- [TAJO-956] - CONCAT should support multiple params and null param.
- [TAJO-966] - Range partition should support split of multiple characters.
- [TAJO-983] - Worker should directly read Intermediate data stored in localhost rather than fetching
- [TAJO-984] - Improve the default data type handling in RowStoreUtil
- [TAJO-987] - Hash shuffle should be balanced according to intermediate volumes
- [TAJO-989] - Cleanup of child blocks after parent execution block is complete
- [TAJO-990] - Implement a tool to find tajo configurations.
- [TAJO-1010] - Improve multiple DISTINCT aggregation.
- [TAJO-1027] - Upgrade Hive to 0.13.0 and 0.13.1
- [TAJO-1028] - JDBC should support SET command.
- [TAJO-1030] - Not supported JDBC APIs should return empty results instead of Exception
- [TAJO-1034] - Reduce Explicit Use of JVM Internal Class
- [TAJO-1040] - Misuse of netty HashedWheelTimer
- [TAJO-1046] - Remove hadoop native dependency of pullserver
- [TAJO-1049] - Remove the parallel degree limit up to the maximum cluster capacity
- [TAJO-1052] - (Umbrella) Add and Update user documentation for 0.9.0
- [TAJO-1071] - should be possible to get long query results with no prompt
- [TAJO-1093] - DateTimeFormat.to_char() is slower than SimpleDateFormat.format()
New Feature
- [TAJO-20] - INSERT INTO ... SELECT
- [TAJO-30] - Parquet Integration
- [TAJO-353] - Add Database support to Tajo
- [TAJO-377] - Implement concat function
- [TAJO-378] - Implement concat_ws function.
- [TAJO-480] - Umbrella Jira for adding ALTER TABLE statement
- [TAJO-704] - TajoMaster HA
- [TAJO-711] - Add Avro storage support
- [TAJO-847] - Supporting MariaDB-based Store, which is compatible with MySQL.
- [TAJO-849] - Add Parquet storage to HCatalogStore
- [TAJO-860] - Implements TRUNCATE table.
- [TAJO-1105] - Add thread which detects JVM pauses like HADOOP's
Task
- [TAJO-605] - Rename Options to KeyValueList
- [TAJO-621] - Add DOAP file for Tajo
- [TAJO-632] - add intellij idea projects files into git ignore
- [TAJO-642] - Change tajo documentation tool to sphinx
- [TAJO-657] - Missing table stat in RCFile
- [TAJO-681] - Embed sphinx rtd theme into tajo-docs
- [TAJO-694] - Bump up hadoop to 2.3.0
- [TAJO-700] - Update site, wikis, pom.xml and other resources to point to the new repository location
- [TAJO-730] - Update Tajo site to reflect graduation
- [TAJO-752] - Escalate sub modules in tajo-core into the top-level modules
- [TAJO-753] - Clean up of maven dependencies
- [TAJO-788] - Update Tajo documentation and README, and BUILDING
- [TAJO-810] - Update Tajo site for 0.8.0 release
- [TAJO-814] - Set up Travis CI builds
- [TAJO-817] - tajo-core should not skip deploy.
- [TAJO-820] - Add missing license header to 0.8.0 release announcement.
- [TAJO-834] - Add Travis notification to issues@tajo.a.o and IRC.
- [TAJO-859] - Update site for new committer Alvin Henrick
- [TAJO-886] - Add IRC page to community section in site.
- [TAJO-887] - Eliminate HiveQL support feature
- [TAJO-890] - Redirect stdout of maven test to /dev/null in Travis CI script
- [TAJO-1001] - Add missed postgresql license to NOTICE.txt and LICENSE.txt
- [TAJO-1007] - Update site for new committer and new contributors
- [TAJO-1054] - Wrong comment in ByteUtils.splitWorker()
- [TAJO-1070] - BSTIndexScanExec should not seek a negative offset
- [TAJO-1078] - Update contributor list
Test