public class UpgradeTool
extends Object
This utility is designed to help with upgrading to Hive 3.0. The on-disk layout for transactional
tables has changed in 3.0, and these tables require pre-processing before the upgrade to ensure
they are readable by Hive 3.0. Some transactional tables (identified by this utility) require a
major compaction to be run on them before upgrading to 3.0. Once this compaction starts, no more
update/delete/merge statements may be executed on these tables until the upgrade is finished.
Additionally, a new type of transactional table was added in 3.0 - insert-only tables. These
tables support ACID semantics and work with any Input/OutputFormat. Any managed table may
be made an insert-only transactional table. These tables don't support Update/Delete/Merge commands.
This utility works in 2 modes: preUpgrade and postUpgrade.
In preUpgrade mode, the Hive 2.x jars must be on the classpath. The utility analyzes the
existing transactional tables, determines which require compaction, and generates a set of SQL
commands to launch all of these compactions.
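As a rough illustration of the kind of SQL the preUpgrade step produces, the sketch below builds
major-compaction statements using Hive's ALTER TABLE ... COMPACT syntax. The class and method names
(CompactionScriptSketch, majorCompaction) and the sample table names are hypothetical and are not
part of UpgradeTool; the exact text the utility emits may differ.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of UpgradeTool.
final class CompactionScriptSketch {

    // Builds one "ALTER TABLE ... COMPACT 'major'" statement.
    // partitionSpec is e.g. "ds='2018-01-01'", or null for an unpartitioned table.
    static String majorCompaction(String db, String table, String partitionSpec) {
        StringBuilder sb = new StringBuilder("ALTER TABLE ")
            .append('`').append(db).append("`.`").append(table).append('`');
        if (partitionSpec != null && !partitionSpec.isEmpty()) {
            sb.append(" PARTITION (").append(partitionSpec).append(')');
        }
        return sb.append(" COMPACT 'major';").toString();
    }

    public static void main(String[] args) {
        // Sample (made-up) tables standing in for the ones the analysis would flag.
        List<String> script = new ArrayList<>();
        script.add(majorCompaction("default", "orders", null));
        script.add(majorCompaction("default", "events", "ds='2018-01-01'"));
        script.forEach(System.out::println);
    }
}
```

For a partitioned table, one statement per partition is needed, which is why the number of
partitions affects how long the generated script takes to complete.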
Note that, depending on the number of tables/partitions and the amount of data in them, compactions
may take a significant amount of time and resources. The script output by this utility includes
some heuristics that may help estimate the time required. If no script is produced, no action
is needed. For compactions to run, an instance of standalone Hive Metastore must be running.
Please make sure hive.compactor.worker.threads is sufficiently high - this specifies the limit
of concurrent compactions that may be run. Each compaction job is a Map-Reduce job.
hive.compactor.job.queue may be used to set the name of the Yarn queue to which all compaction jobs
will be submitted.
In postUpgrade mode, Hive 3.0 jars/hive-site.xml should be on the classpath. This utility will
find all the tables that may be made transactional (with full CRUD support) and generate
Alter Table commands to do so. It will also find all tables that may not support full CRUD
but can be made insert-only transactional tables and generate corresponding Alter Table commands.
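The two kinds of Alter Table commands described above can be sketched as follows, using the
'transactional' and 'transactional_properties' table properties from Hive's ACID support. The class
name ConversionScriptSketch, its methods, and the supportsFullCrud flag are hypothetical: the sketch
assumes the caller has already decided whether a table qualifies for full CRUD, and it does not
reproduce the checks UpgradeTool itself performs.

```java
// Hypothetical helper for illustration only; not part of UpgradeTool.
final class ConversionScriptSketch {

    // Statement that makes a table fully transactional (full CRUD support).
    static String makeFullAcid(String db, String table) {
        return "ALTER TABLE `" + db + "`.`" + table + "`"
            + " SET TBLPROPERTIES ('transactional'='true');";
    }

    // Statement that makes a table an insert-only transactional table.
    static String makeInsertOnly(String db, String table) {
        return "ALTER TABLE `" + db + "`.`" + table + "`"
            + " SET TBLPROPERTIES ('transactional'='true',"
            + " 'transactional_properties'='insert_only');";
    }

    // Picks the conversion based on a caller-supplied flag (an assumption of this sketch).
    static String conversionFor(String db, String table, boolean supportsFullCrud) {
        return supportsFullCrud ? makeFullAcid(db, table) : makeInsertOnly(db, table);
    }

    public static void main(String[] args) {
        // Sample (made-up) tables.
        System.out.println(conversionFor("default", "orders", true));
        System.out.println(conversionFor("default", "raw_logs", false));
    }
}
```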
TODO: rename files
The "execute" option may be supplied in both modes to have the utility automatically execute the
equivalent of the generated commands.
The "location" option may be supplied, followed by a path, to set the location for the generated
scripts.