/[Apache-SVN]
ViewVC logotype

Revision 564012


Jump to revision: Previous Next
Author: stack
Date: Wed Aug 8 20:30:13 2007 UTC (17 years, 2 months ago)
Changed paths: 30
Log Message:
HADOOP-1662 Make region splits faster
Splits are now near-instantaneous.  On split, daughter splits create
'references' to store files up in the parent region using new 'HalfMapFile'
class to proxy accesses against the top-half or bottom-half of 
backing MapFile.  Parent region is deleted after all references in daughter
regions have been let go.

Below includes other cleanups and at least one bug fix for fails adding
>32k records and improvements to make it more likely TestRegionServerAbort
will complete..

A src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHStoreFile.java
    Added. Tests new Reference HStoreFiles. Test new HalfMapFileReader inner
    class of HStoreFile. Test that we do the right thing when HStoreFiles
    are smaller than a MapFile index range (i.e. there is not 'MidKey').
    Test we do right thing when key is outside of a HalfMapFile.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestGet.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestScanner.java
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestTimestamp.java
    getHRegionDir moved from HStoreFile to HRegion.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestBatchUpdate.java
    Let out exception rather than catch and call 'fail'.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java
    Refactored so can start and stop a minihbasecluster w/o having to
    subclass this TestCase. Refactored methods in this class to use the
    newly added methods listed below.
    (MasterThread, RegionServerThread, startMaster, startRegionServers
      shutdown): Added.
    Added logging of abort, close and wait.  Also on abort/close
    was doing a remove that made it so subsequent wait had nothing to
    wait on.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestSplit.java
    Added tests that assert all works properly at region level on
    multiple levels of splits and then do same on a cluster.
M src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHRegion.java
    Removed catch and 'fail()'.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java
    Javadoc to explain how split now works. Have constructors flow
    into each other rather than replicate setup per instance. Moved
    in here operations such as delete, rename, and length of store files
    (No need of clients to remember to delete map and info files).
    (REF_NAME_PARSER, Reference, HalfMapFile, isReference,
      writeReferenceFiles, writeSplitInfo, readSplitInfo,
      createOrFail, getReader, getWriter, toString): Added.
    (getMapDir, getMapFilePath, getInfoDir, getInfoFilePath): Added
    a bunch of overrides for reference handling.
    (loadHStoreFiles): Amended to load references off disk.
    (splitStoreFiles): Redone to instead write references into
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java
    Rename maps as readers and mapFiles as storefiles.
    Moved BloomFilterReader and Writer into HStoreFile. Removed
    getMapFileReader and getMapFileWriter (They are in HStoreFile now).
    (getReaders): Added.
    (HStoreSize): Added.  Data Structure to hold aggregated size
    of all HStoreFiles in HStore, the largest, its midkey, and
    if the HStore is splitable (May not be if references).
    Previous we only did largest file; less accurate.
    (getLargestFileSize): Renamed size and redone to aggregate
    sizes, etc.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HColumnDescriptor.java
    Have constructors waterfall down through each other rather than
    repeat initializations.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMerge.java
    Use new HStoreSize structure.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java
    Added delayed remove of HRegion (Now done in HMaster as part of
    meta scan). Change LOG.error and LOG.warn so they throw stack trace
    instead of just the Exception.toString as message.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConstants.java
    (COLUMN_FAMILY_STR): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLog.java
    Added why to log of splitting.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLogEdit.java
    Short is not big enough to hold edits tha could contain a sizable
    web page.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HTable.java
    (getTableName): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java
    Added constructor to BaseScanner that takes name of table we're
    scanning (ROOT or META usually). Added to scanOneRegion handling
    of split regions.  Collect splits to check while scanning and
    then outside of the scanning, so we can modify the META table
    is needed, do the checks of daughter regions and update on
    change of state.  Made LOG.warn and LOG.error print stack trace.
    (isSplitParent, cleanupSplits, hasReferences): Added. 
    Added toString to each of the PendingOperation implementations.
    In the ShutdownPendingOperation scan of meta data, removed
    check of startcode (if the server name is that of the dead
    server, it needs reassigning even if start code is good).
    Also, if server name is null -- possible if we are missing
    edits off end of log -- then the region should be reassigned
    just in case its from the dead server.  Also, if reassigning,
    clear from pendingRegions.  Server may have died after sending
    region is up but before the server confirms receipt in the
    meta scan. Added mare detail to each log.  In OpenPendingOperation
    we were trying to clear pendingRegion in wrong place -- it was
    never executed (regions were always pending). 
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java
    Add split boolean.  Output offline and split status in toString.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java
    Comments.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java
    Moved getRegionDir here from HStoreFile.
    (COL_SPLITA, COL_SPLITB): Added.
    (closeAndSplit): Refactored to use new fast split method.
       StringUtils.formatTimeDiff(System.currentTimeMillis(), startTime));
    (splitStoreFile): Moved into HStoreFile.
    (getSplitRegionDir, getSplitsDir, toString): Added.
    (needsSplit): Refactored to exploit new HStoreSize structure.
    Also manages notion of 'unsplitable' region.
    (largestHStore): Refactored.
    (removeSplitFromMETA, writeSplitToMETA, getSplit, hasReference): Added.
M src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Keying.java
    (intToBytes, getBytes): Added.
A src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Writables.java    
    Utility reading and writing Writables.


Changed paths

Path Details
Directorylucene/hadoop/trunk/src/contrib/hbase/CHANGES.txt modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HBaseAdmin.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HColumnDescriptor.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConnectionManager.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HConstants.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLog.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HLogEdit.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMaster.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMemcache.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HMerge.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegion.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionInfo.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionLocation.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HRegionServer.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStore.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HStoreFile.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/HTable.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Keying.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/java/org/apache/hadoop/hbase/util/Writables.java added
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/HBaseTestCase.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/MiniHBaseCluster.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/StaticTestEnvironment.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestBatchUpdate.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestGet.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHRegion.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestHStoreFile.java added
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestRegionServerAbort.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestScanner.java modified , text changed
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestSplit.java added
Directorylucene/hadoop/trunk/src/contrib/hbase/src/test/org/apache/hadoop/hbase/TestTimestamp.java modified , text changed

infrastructure at apache.org
ViewVC Help
Powered by ViewVC 1.1.26