List of things to do for branch, including comments from reviews not yet
implemented.

--- remaining tasks before merge ---


--- tasks to complete post merge ---

* move client to use CatalogTracker and add region admin methods
  + Yes.

* bulletproof splits.  need to be recoverable from every point including
  partial META edits over on RS
  + Should be there.  Add more tests. -- St.Ack 20100901

* review timeout semantics for client calls.  servers should generally wait
  forever on root/meta but client classes need to eventually time out.  we
  need to document new configuration parameters as well since this will now
  be a 'timeout' rather than 'retries' and 'delay'.
  TODO: Remove configs that no longer apply -- St.Ack 20100901

* finish rewriting or making any existing failing unit tests pass

* new master unit tests (failover, failing RS and Master during various
  points of regions in transition, etc)


harder stuff
---

* make final decisions on root/meta timeouts.  almost everyone is
  coordinating access through CatalogTracker which should make it easy to
  standardize.  if there are operations that should just retry indefinitely,
  they need to resubmit themselves to their executor service.
  -- Should never timeout IMO and we changed executors so root and meta
     are done separately so this should be ok? -- St.Ack 20100815

* on region open (and wherever split children notify master) should check
  if the table is disabled and should close the regions... maybe.

* figure how to handle the very rare but possible race condition where two
  RSs will update META and the later one can squash the valid one if there
  was a long gc pause

* review synchronization in AssignmentManager

* migrate TestMasterTransitions or make new?
  Make a new one -- St.Ack 20100901

* write new tests!!!


somewhat easier stuff
---

* jsp pages borked

* make sync calls for enable/disable (check and verify methods?)
  this still needs some love and testing but should be much easier to
  control now

* Add balancing unit tests (was integrate balancer -- done. St.Ack 20100901)
  implemented but need to start a thread or chore, each time, wait for no
  regions in transition, generate and iterate the plan, putting it in-memory
  and then triggering the assignment.  if the master crashes mid-balance, it
  should finish up to the point of the last CLOSE RPC it sent out to an RS.
  Regions will be sitting in CLOSED in ZK, failover master will pick them up
  and re-execute the ClosedRegionHandler() on them

* synchronize all access to the boolean in ActiveMasterManager (now this is
  probably just move it to extend ZKNodeTracker)

* update client to use new admin functions straight to rs
  possibly migrate client to use CatalogTracker?

St.Ack
-- Ensure root and meta are last to close on cluster shutdown; it should be
   the case but verify.


From Master:

  // TODO: Sync or async on this stuff?  execute means sync.  submit means later.
  // Right now this will swallow exceptions either way, might need
  // process() which throws nothing but execute() which throws IOE so
  // synchronous stuff can throw exceptions?

At the moment we have a mix of sync and async.  What's missing is a callback
mechanism (Benoit's Twisted Deferred would do nicely here).  I changed
EventHandler to remove execute(), so synchronous callers invoke process()
directly.  Also made it so process() now throws an exception.  Also, changed
handlers so they do checks in the constructor and the constructor throws IOE.
This way, we fail fast, so even for an asynchronous operation it's possible
we'll see the TableNotDisabledException.  There is more to do in here but
should be good enough for merge.
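
A minimal sketch of the fail-fast handler pattern described above, for
illustration only.  Every class name here (SketchEventHandler,
SketchTableNotDisabledException, SketchDeleteTableHandler) is hypothetical and
is not the branch's actual code; the 'tableDisabled' flag stands in for
whatever check the real handler would do.

```java
import java.io.IOException;

// Hypothetical stand-ins for illustration; these are NOT the branch's classes.
abstract class SketchEventHandler implements Runnable {
  private volatile IOException failure;

  // Synchronous callers invoke process() directly and see any exception.
  abstract void process() throws IOException;

  // Asynchronous path: an executor calls run().  Runnable.run() cannot throw
  // a checked exception, so the failure is recorded for later inspection.
  @Override
  public void run() {
    try {
      process();
    } catch (IOException e) {
      failure = e;
    }
  }

  IOException getFailure() {
    return failure;
  }
}

class SketchTableNotDisabledException extends IOException {
  SketchTableNotDisabledException(String table) {
    super("table not disabled: " + table);
  }
}

class SketchDeleteTableHandler extends SketchEventHandler {
  private final String table;
  private boolean processed;

  // Precondition is checked here, before the handler is ever queued, so the
  // caller sees the failure even if process() would later run asynchronously.
  SketchDeleteTableHandler(String table, boolean tableDisabled) throws IOException {
    if (!tableDisabled) {
      throw new SketchTableNotDisabledException(table);
    }
    this.table = table;
  }

  @Override
  void process() throws IOException {
    processed = true; // placeholder for the real work
  }

  boolean isProcessed() {
    return processed;
  }

  String getTable() {
    return table;
  }
}
```

The constructor check is what gives the submitting thread an exception to
react to; anything that fails later inside run() still needs a callback or a
polled result, which is the part noted above as missing.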
================================================================================
COMPLETED STUFF (retained to ensure final review of these issues at the end)
================================================================================

* review master startup order
  we should use cluster flag in zk to signal RS to start rather than master
  availability and between master up and cluster up the master can do bulk
  of its initialization.
  -- Yes.  CST is currently a little off in that it's homed on root location
     rather than the up/down status.  Also shutdown.  RS now watches
     /hbase/shutdown and starts shutdown when this goes down
     -- St.Ack 20100815
  -- This should be done now.  St.Ack 20100817

* in RootEditor there is a race condition between delete and watch?
  -- Didn't you say that this was a pigment of your emancipation?
     -- St.Ack 20100815

* review FileSystemManager calls


notes from 8/4 (with what i did tonight for them, which is most of what is
different in this diff)
---

* in CatalogTracker need to stabilize on one getRoot and one getMeta method
  to use that waits and uses the default wait-for-catalogs timeout.  We
  should get rid of the 'refresh' boolean that I have in there and should
  always ping the server to ensure it is serving the region before we return
  it.  If we do eventually drop root and put the meta locations into zk we
  would no longer need this, so will not always have to pay this tax.

  >> This is done.  You pass default timeout in constructor.  Two methods
     now are:

       waitForRootServerConnectionDefault()
       waitForMetaServerConnectionDefault()

* ROOT changes

  RootEditor -> RootLocationEditor, delete -> unset

  Change the way we unset the root location.  Set the data to null rather
  than deleting the node.  Requires changes to RootLocationEditor and
  RootRegionTracker.

  >> Thought there was a race condition here, but there is not.  In fact, we
     do not even need to set the watch in the delete method.  It is already
     properly being handled by RootRegionTracker.
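
A rough, in-memory sketch of the two ideas above: the default timeout is
passed once in the constructor, and "unset" means the location data becomes
null rather than the holder being deleted.  SketchLocationTracker and its
method names are hypothetical; this does not talk to ZooKeeper and does not
ping the server before returning, which the notes above say the real tracker
should do.  Returning null on timeout is also just an assumption made here.

```java
// Hypothetical illustration only; not CatalogTracker or RootRegionTracker.
class SketchLocationTracker {
  private final long defaultTimeoutMs;
  private String location; // null means "unset", mirroring null node data

  SketchLocationTracker(long defaultTimeoutMs) {
    this.defaultTimeoutMs = defaultTimeoutMs;
  }

  // Publish a new location and wake up anyone waiting for it.
  synchronized void setLocation(String serverAddress) {
    location = serverAddress;
    notifyAll();
  }

  // Unset by nulling the data; the tracker object (the "node") stays around.
  synchronized void unsetLocation() {
    location = null;
  }

  // Block until a location is set or the default timeout elapses.
  // Returns null on timeout (an assumption of this sketch).
  synchronized String waitForLocationDefault() throws InterruptedException {
    long deadline = System.nanoTime() + defaultTimeoutMs * 1_000_000L;
    while (location == null) {
      long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
      if (remainingMs <= 0) {
        return null;
      }
      wait(remainingMs); // loop re-checks the condition after every wakeup
    }
    return location;
  }
}
```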

* In AssignmentManager.processFailure() need to insert RegionState into RIT
  map

  >> This is done.  This needs tests but I think failover is all in place
     now.

* On RS-side, make separate OpenRootHandler and OpenMetaHandler

  >> Added four new handlers for open/close of root/meta and associated
     executors

* Add priorities to Opened/Closed handlers on Master

  >> Added ROOT, META, USER priorities
  -- I don't think these are used?  And priorities on close are odd in that
     on shutdown we want meta and root to close last.  St.Ack 20100817

* In RegionTransitionData, store the actual byte[] regionName rather than
  the encoded name

  >> Done.  We should also get in practice of naming variables encodedName
     if it is that.

* In EventType, completely remove differentiating between Master and RS.
  This means for a given EventType, it will map to the same handler whether
  it is on RS or Master.

  >> Done.

* Also in EventType, remove fromByte() and use an ordinal() method

  >> Done.  Can we remove even having the (int) values for the enums now?


Later:

* renaming master file manager?  MasterFS/MasterFileSystem
  -- I renamed this stuff -- St.Ack 20100815

* ServerStatus/MasterStatus
  + We now have:  Abortable as the base class (M, RS, and Client implement
    abort())
  + ServerController (M and RS implement getZK/ServerName/Info/etc)
  + RegionServerController (RS, definitely the hacky one)
  + MasterController (get/set of shutdown, close, etc)
  - These need new names to be more descriptive (ServerControl?)
  - They should have a very clear purpose that adds value beyond passing
    HMaster directly
  - Current idea is these things would just have accessors/setters to the
    server status booleans and abort() methods (like closed, closing,
    abortRequested)
  -- Done.  I removed MasterStatus/MasterController.  Not necessary.  The
     RSController was renamed RegionServer.  Not the best but until something
     better.  I got rid of a few of the calls it was doing as they didn't
     seem needed (e.g.
     call openHRegion, a static, rather than do an open on the passed
     regionservercontroller) -- St.Ack 20100815

* HBaseEventHandler/HBaseEventType/HBaseExecutorService
  X (done) After ZK changes, renamed to EventHandler/EventType
  - Currently multiple types map to a single handler, we may want 1-to-1
  - Need to do a full review of the semantics of these once bulk of master
    rewrite is done

* LoadBalancer
  - Need to finish or back out code related to block locations
    (if finish, need to use files not directory, and use right location)
  - Put notes from reviewboard/jira into LB javadoc or hbase "book"


Questions:

If region in RIT, do I need to wait on log replay if region was in OPENING or
PENDING_OPEN state?

So on assign, if we fail -- say connection refused when we try open on the
RS, regions state remains offline -- who comes along and finds all offline
and assigns?


TODO:

+ Add test to prove move region works.
+ Add test to prove enable/disable balancer works.
+ Add test for fixup if daughter edits don't make it into .META. (should be
  fixed up as part of server shutdown processing).
+ ensure root/meta are last to close on cluster shutdown
  - Add asking RS what it has when only two servers remaining... and when
    only root or meta, then send explicit close of each.  Do it this way to
    ensure correct shutdown order -- St.Ack 08/21
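
A small sketch of the shutdown ordering wanted in the last TODO item: user
regions close first, then meta, then root.  SketchShutdownOrder is a
hypothetical helper, and classifying a region by the "-ROOT-" / ".META." name
prefix is an assumption made only for this illustration; it says nothing
about how the master would actually ask each RS what it is carrying.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration only; not part of the branch.
class SketchShutdownOrder {
  // Lower rank closes earlier: user regions, then meta, then root.
  static int closeRank(String regionName) {
    if (regionName.startsWith("-ROOT-")) {
      return 2;
    }
    if (regionName.startsWith(".META.")) {
      return 1;
    }
    return 0;
  }

  // Returns a copy of the input sorted into close order.  The sort is
  // stable, so regions of the same kind keep their relative order.
  static List<String> closeOrder(List<String> regionNames) {
    List<String> ordered = new ArrayList<>(regionNames);
    ordered.sort(Comparator.comparingInt(SketchShutdownOrder::closeRank));
    return ordered;
  }
}
```

Note this is the reverse of what an open/assign priority (ROOT, META, USER)
would want, which is the oddity about close priorities raised earlier.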