Release Notes - Tajo - Version 0.10.0
Changes since Tajo 0.9.0
Sub-task
- [TAJO-324] - Rename the prefix 'QueryUnit' to 'Task'
- [TAJO-920] - Add FIRST_VALUE and LAST_VALUE window functions
- [TAJO-1149] - Implement direct read of DelimitedTextFile
- [TAJO-1151] - Implement the ByteBuffer-based De/Serializer
- [TAJO-1152] - RawFile ByteBuffer should be reused
- [TAJO-1233] - Merge hbase_storage branch to the master branch
- [TAJO-1260] - Add ALTER TABLE ADD/DROP PARTITION statement to parser
- [TAJO-1262] - Rename the prefix 'SubQuery' to 'Stage'
- [TAJO-1287] - Repeated use of the same ORDER BY key in multiple window clauses should be supported
Bug
- [TAJO-831] - Project wrong column in the case of having same alias in subquery.
- [TAJO-930] - Could not initialize class org.apache.tajo.QueryTestCaseBase during building
- [TAJO-1003] - Wrong conversion in to_timestamp(text, text)
- [TAJO-1063] - Current_time() always returns UTC time
- [TAJO-1064] - The hour values are different between current_time() and extract from current_time()
- [TAJO-1108] - RawFile tableStats does not update in query processing
- [TAJO-1119] - JDBC driver should support TIMESTAMP type.
- [TAJO-1126] - Join condition including functions throws IllegalArgumentException.
- [TAJO-1139] - ExternalSortExec should delete the intermediate files
- [TAJO-1150] - Some weird methods in QueryClientImpl should be fixed
- [TAJO-1154] - TajoCli doesn't pause while running the non-forwarded query.
- [TAJO-1157] - Required Java version in tutorial doc needs to be updated
- [TAJO-1158] - Max Hadoop version in tutorial doc needs to be updated
- [TAJO-1162] - to_char() returns "-00" second.
- [TAJO-1166] - S3 related storage causes compilation error in Hadoop 2.6.0-SNAPSHOT
- [TAJO-1179] - Integration tests in TravisCI occasionally fail due to log size.
- [TAJO-1180] - digitValue should throw Exception when char is not in valid range
- [TAJO-1181] - Avro schema URL should support various protocols.
- [TAJO-1183] - Keep command execution even with errors.
- [TAJO-1185] - Default timezone should be UTC+0 instead of depending on JVM
- [TAJO-1188] - Fix testcase testTimestampConstructor in TestTimestampDatum
- [TAJO-1190] - INSERT INTO to partition tables may cause NPE.
- [TAJO-1191] - Change DateDatum timezone to UTC
- [TAJO-1192] - testTimestampConstructor incorrectly compares a local time with a UTC time
- [TAJO-1194] - 'INSERT OVERWRITE .. SELECT' does not remove existing data when result is empty.
- [TAJO-1197] - Unit test failed: unable to create new native thread
- [TAJO-1200] - Invalid shuffle data with multiple workers on the same server
- [TAJO-1205] - Remove possible memory leak in TajoMaster
- [TAJO-1208] - Failure to create table using textfile on hivemeta
- [TAJO-1210] - ByteBufLineReader does not handle the end of file if no newline appears
- [TAJO-1211] - Staging directory for CTAS and INSERT should be in the output dir.
- [TAJO-1219] - Files located in intermediate directories of partitioned table should be ignored
- [TAJO-1220] - Implement createStatement() and setEscapeProcessing() in JdbcConnection
- [TAJO-1223] - Wrong query verification against asterisk and more expressions in select list
- [TAJO-1224] - When there is no projected column, JSON scan can hang.
- [TAJO-1225] - Fix wrong schema name in JDBC driver
- [TAJO-1231] - Implicit table properties in session are not stored in table property.
- [TAJO-1232] - Implicit groupby queries with LIMIT lead to wrong results.
- [TAJO-1234] - Rearrange timezone in date/time types
- [TAJO-1235] - ByteBufLineReader cannot read text lines with CRLF
- [TAJO-1237] - Fix missing maven-module for pullserver
- [TAJO-1239] - ORDER BY with null column desc misses some data.
- [TAJO-1242] - JSON scanner cannot read some cases of truncated text
- [TAJO-1244] - tajo.worker.tmpdir.locations should use a validator for a list of paths.
- [TAJO-1246] - HBase info port conflict occasionally causes unit test failures in Jenkins CI
- [TAJO-1249] - Tajo should check if a file format given in DDL is supported.
- [TAJO-1250] - RawFileAppender occasionally causes BufferOverflowException
- [TAJO-1251] - Query is hanging occasionally by shuffle report
- [TAJO-1252] - PathValidator should allow hdfs paths which contain IP addresses
- [TAJO-1254] - Fix getProgress race conditions in Query
- [TAJO-1257] - ORDER BY with NULL FIRST misses some data
- [TAJO-1259] - A title in catalog configuration document is different from others
- [TAJO-1265] - min() and max() do not handle null properly
- [TAJO-1270] - Fix typos
- [TAJO-1275] - Optimizer pushes down non-equi filter as theta join qualifier
- [TAJO-1277] - GreedyHeuristicJoinOrderAlgorithm sometimes wrongly assumes associativity of joins
- [TAJO-1278] - Unit tests occasionally hang due to the invalid query status
- [TAJO-1283] - ORDER BY with the first descending order causes wrong results
- [TAJO-1289] - History reader fails to get the query information after a successful query execution
- [TAJO-1297] - Tajo Web UI does not work after TAJO-1291
- [TAJO-1299] - TB and PB representations in StorageUnit overflow
- [TAJO-1303] - CDH cannot pass hadoop version check test
- [TAJO-1304] - Cannot find TextFile in catalog
- [TAJO-1305] - With metadata storage of MySQL, columns with the same characters but different case are not allowed
- [TAJO-1308] - QueryInprogress cannot release when query is QUERY_ERROR
- [TAJO-1312] - Stage causes Invalid event error: SQ_SHUFFLE_REPORT at KILLED
- [TAJO-1313] - Tajo-dump creates DDLs for information_schema tables
- [TAJO-1315] - Invalid results are returned when a source table consists of multiple csv files
- [TAJO-1316] - NPE occurs when performing window functions after join
- [TAJO-1318] - Unit test failure after miniDFS cluster restart
- [TAJO-1319] - Tajo can't find HBase configuration file.
- [TAJO-1321] - Cli prints wrong response time
- [TAJO-1322] - Invalid stored caching on StorageManager
- [TAJO-1324] - Remove warehouse directory rewriting in Unit Test
- [TAJO-1325] - Invalid history cleaner timeout
- [TAJO-1336] - Fix task failure of stopped task
Improvement
- [TAJO-269] - Protocol buffer De/Serialization for LogicalNode
- [TAJO-784] - Improve TpchTestBase to be more general.
- [TAJO-1035] - Add default TAJO_PULLSERVER_HEAPSIZE
- [TAJO-1053] - ADD PARTITIONS for HCatalogStore
- [TAJO-1092] - Improve the function system to allow other function implementation types
- [TAJO-1109] - Separate SQL Statements from Catalog Stores
- [TAJO-1114] - Improve ConfVars (SessionVar) to take a validator interface to check its input.
- [TAJO-1125] - Separate logical plan and optimizer into a maven module
- [TAJO-1128] - Implement a select box for database at web interface
- [TAJO-1132] - More detailed version info in tsql
- [TAJO-1133] - Add 'bin/tajo version' command
- [TAJO-1140] - Separate TajoClient into fine grained parts.
- [TAJO-1143] - TajoMaster, TajoWorker, and TajoClient should have diagnosis phase at startup
- [TAJO-1145] - Add 'bin/tajo --help' command
- [TAJO-1159] - Change tsql history behavior
- [TAJO-1160] - Remove Hadoop dependency from tajo-client module
- [TAJO-1161] - Remove joda time dependency from tajo-core
- [TAJO-1163] - TableDesc should use URI instead of Path.
- [TAJO-1165] - Needs to show error messages on query_executor.jsp
- [TAJO-1169] - Some older versions of OpenJDK 1.6 do not get the default timezone id
- [TAJO-1172] - Remove Trevni storage type and its related classes
- [TAJO-1174] - Remove unnecessary code for blobdatum
- [TAJO-1176] - Implement queryable virtual tables for catalog information
- [TAJO-1177] - Reduce the use of Sun proprietary API
- [TAJO-1184] - Upgrade netty-buffer to 4.0.24.Final
- [TAJO-1186] - Table should have timezone as a table property
- [TAJO-1187] - TajoCli should print time/timestamp values with timezone
- [TAJO-1189] - *-site.xml.template should contain commented out default settings.
- [TAJO-1195] - Remove unused CachedDNSResolver Class
- [TAJO-1204] - Remove unused ServerName class
- [TAJO-1209] - Pluggable line (de)serializer for DelimitedTextFile
- [TAJO-1213] - Implement CatalogStore::updateTableStats
- [TAJO-1221] - HA TajoClient should not connect to TajoMaster at first.
- [TAJO-1227] - When a task fails, ParquetAppender::close causes NPE.
- [TAJO-1228] - TajoClient should communicate with only TajoMaster without TajoWorker
- [TAJO-1230] - Disable ipv6 support on JVM
- [TAJO-1236] - Remove slow 'new String' operation in parquet format
- [TAJO-1241] - Change default client and table time zone behavior
- [TAJO-1243] - *-site.xml.template should have default configs commented out.
- [TAJO-1245] - Add documentation about PostgreSQL and Oracle Catalog driver
- [TAJO-1247] - Store type 'TEXTFILE' should be TEXT while keeping enum 'TEXTFILE' in protobuf
- [TAJO-1258] - Close() for classes derived from FileAppender should be robust
- [TAJO-1261] - Separate query and ddl execution codes from GlobalEngine
- [TAJO-1268] - tajo-client module should not use UserGroupInformation
- [TAJO-1269] - Separate cli from tajo-client
- [TAJO-1279] - Cleanup TajoAsyncDispatcher and interrupt stop events
- [TAJO-1281] - Remove hadoop-common dependency from tajo-rpc
- [TAJO-1282] - Cleanup the relationship of QueryInProgress and QueryJobManager
- [TAJO-1285] - Refactoring Magic Number to HAConstants
- [TAJO-1286] - Remove netty dependency from tajo-jdbc
- [TAJO-1288] - Refactoring org.apache.tajo.master package
- [TAJO-1290] - Add HBase Storage Integration Documentation
- [TAJO-1291] - Rename TajoMasterProtocol to QueryCoordinatorProtocol
- [TAJO-1293] - Tajo has to accept hostnames beginning with digits.
- [TAJO-1306] - HAServiceUtil should not directly use HDFS.
- [TAJO-1307] - HBaseStorageManager needs to allow users to use an hbase-site.xml file.
- [TAJO-1309] - Add missing break point in physical operator
- [TAJO-1317] - Parallel Test Executions on Tajo Core Project
- [TAJO-1320] - HBaseStorageManager needs to support Zookeeper Client Port.
- [TAJO-1328] - Fix deprecated property names in the catalog configuration document
New Feature
- [TAJO-233] - Support PostgreSQL CatalogStore
- [TAJO-235] - Support Oracle CatalogStore
- [TAJO-1026] - Implement Query history persistency manager.
- [TAJO-1095] - Implement Json file scanner
- [TAJO-1100] - Refactor CSVFile to DelimitedLineTextFile
- [TAJO-1118] - (Umbrella) HBase Storage Integration
- [TAJO-1199] - EMR bootstrap script for Tajo
- [TAJO-1222] - DelimitedTextFile should be tolerant against parsing errors.
- [TAJO-1238] - Add SET SESSION and RESET statement
Task
- [TAJO-1032] - Improve TravisCI scripts to adjust log4j log level
- [TAJO-1129] - Remove hadoop 2.2.0 support
- [TAJO-1141] - Refactor the packages hierarchy of tajo-client
- [TAJO-1153] - Merge off-heap package in block_iteration branch to master branch
- [TAJO-1229] - Rename tajo-yarn-pullserver to tajo-pullserver
- [TAJO-1267] - Remove LazyTaskScheduler
- [TAJO-1274] - Merge separate pages of getting started document into a single page
- [TAJO-1280] - Update the roles of Hyoungjun and Jihun in web site
- [TAJO-1294] - Add index documents
- [TAJO-1295] - Remove legacy worker.dataserver package and its unit tests.
- [TAJO-1296] - Remove obsolete classes from tajo.master.container package.
- [TAJO-1323] - Cleanup the unstable test case