Pig Change Log

Trunk (unreleased changes)

  INCOMPATIBLE CHANGES

  NEW FEATURES

  OPTIMIZATIONS

  BUG FIXES

    PIG-24 Files that were incorrectly placed under test/reports have been
    removed. ant clean now cleans test/reports. (milindb via gates)

    PIG-25 com.yahoo.pig dir left under pig/test by mistake. removed it
    (olgan@)

    PIG-23 Made pig work with java 1.5. (milindb via gates)

    PIG-8 added binary comparator (olgan)

    PIG-17 integrated with Hadoop 0.15 (olgan@)

    PIG-11 Add capability to search for jar file to register. (antmagna via
    olgan)

    PIG-20 Added custom comparator functions for order by (phunt via gates)

    PIG-33 Help was commented out - uncommented (olgan)

    PIG-31: second half of concurrent mode problem addressed (olgan)

    PIG-14: added heartbeat functionality (olgan)

    PIG-17: updated hadoop15.jar to match hadoop 0.15.1 release

    PIG-7: Added use of combiner in some restricted cases. (gates)

    PIG-29: fixed bag factory to be properly initialized (utkarsh)

    PIG-43: fixed problem where using the combiner prevented a pig alias
    from being evaluated more than once. (gates)

    PIG-45: Fixed pig.pl to not assume hodrc file is named the same as
    cluster name (gates).

    PIG-7 (more): Fixed bug in PigCombiner where it was writing
    IndexedTuples instead of Tuples, causing Reducer to crash in some cases.

    PIG-47: Added methods to DataMap to provide access to its content

    PIG-12: Added time stamps to log4j messages (phunt via gates).

    PIG-44: Added adaptive decision of the number of records to hold in
    memory before spilling (utkarsh)

    PIG-39: created more efficient version of read (spullara via olgan)

    PIG-41: Added patterns to svn:ignore

    PIG-51: Fixed combiner in the presence of flattening

    PIG-30: Rewrote DataBags to better handle decisions of when to spill to
    disk and to spill more intelligently. (gates)

    PIG-61: Fixed MapreducePlanCompiler to use PigContext to load up the
    comparator function instead of Class.forName. (gates)

    PIG-56: Made DataBag implement Iterable. (groves via gates)

    PIG-63: Fix for non-ascii UTF-8 data (breed@ and olgan@)

    PIG-77: Added eclipse specific files to svn:ignore

    PIG-57: Fixed NPE in PigContext.fixUpDomain (francisoud via gates)

    PIG-69: NPE in PigContext.setJobtrackerLocation (francisoud via gates)

    PIG-78: src/org/apache/pig/builtin/PigStorage.java doesn't compile (arun
    via olgan)

    PIG-32: Abstraction layer (olgan)

    PIG-87: Fix pig.pl to find java via JAVA_HOME instead of hardcoded
    default path. Also fix it to not die if pigclient.conf is missing.
    (craigm via gates).

    PIG-89: Fix DefaultDataBag, DistinctDataBag, SortedDataBag to close
    spill files when they are done spilling (contributions by craigm, breed,
    and gates, committed by gates).

    PIG-95: Remove System.exit() statements from inside pig (joa23 via
    gates).

    PIG-65: convert tabs to spaces (groves via olgan)

    PIG-97: Turn off combiner in the case of Cogroup, as it doesn't work
    when more than one bag is involved (gates).

    PIG-92: Fix NullPointerException in PigContext due to uninitialized conf
    reference. (francisoud via gates)

    PIG-83: Change everything except grunt and Main (PigServer on down) to
    use common logging abstraction instead of log4j. By default in grunt,
    log4j still used as logging layer. Also converted all
    System.out/err.println statements to use logging instead. (francisoud
    via gates)

    PIG-80: In a number of places stack trace information was being lost by
    an exception being caught, and a different exception then thrown. All
    those locations have been changed so that the new exception now wraps
    the old. (francisoud via gates).

    PIG-84: Converted printStackTrace calls to calls to the logger.
    (francisoud via gates).

    PIG-88: Remove unused HadoopExe import from Main. (pi_song via gates).

    PIG-99: Fix to make unit tests not run out of memory. (francisoud via
    gates).

    PIG-107: enabled several tests. (francisoud via olgan)

    PIG-46: abort processing on error for non-interactive mode (olston via
    olgan)

    PIG-109: improved exception handling (oae via olgan)

    PIG-72: Move unit tests to use MiniDFS and MiniMR so that unit tests can
    be run w/o access to a hadoop cluster. (xuzh via gates)

    PIG-68: improvements to build.xml (joa23 via olgan)

    PIG-110: Replaced code accidentally merged out in PIG-32 fix that
    handled flattening the combiner case. (gates and oae)

    PIG-213: Remove non-static references to logger from data bags and
    tuples, as it causes significant overhead (vgeschel via gates).

    PIG-284: target for building source jar (oae via olgan)

    PIG-294: string comparator unit tests (sms via pi_song)

    PIG-258: cleaning up directories on failure (daijy via olgan)

    PIG-139: command line editing (daijy via olgan)

    PIG-270: proper line number for parse errors (daijy via olgan)

    PIG-363: fix for describe to produce schema name

    PIG-367: convenience function for UDFs to name schema

    PIG-368: making JobConf available to Load/Store UDFs

    PIG-311: cross is broken

    PIG-369: support for filter UDFs

    PIG-375: support for implicit split

    PIG-301: fix for order by descending

    PIG-378: fix for GENERATE + LIMIT

    PIG-362: don't push limit above generate with flatten

    PIG-381: bincond does not handle null data

    PIG-382: bincond throws typecast exception

    PIG-352: java.lang.ClassCastException when invalid field is accessed

    PIG-329: TestStoreOld, 2 unit tests were broken

    PIG-353: parsing of complex types

    PIG-392: error handling with multiple MRjobs

    PIG-397: code defaults to single reducer

    PIG-373: unconnected load causes problem

    PIG-413: problem with float sum

    PIG-398: Expressions not allowed inside foreach (sms via olgan)

    PIG-418: divide by 0 problem

    PIG-402: order by with user comparator (shravanmn via olgan)

    PIG-415: problem with comparators (shravanmn via olgan)

    PIG-422: cross is broken (shravanmn via olgan)

    PIG-407: need to clone operators (pradeepk via olgan)

    PIG-428: TypeCastInserter does not replace projects in inner plans
    correctly (pradeepk via olgan)

    PIG-421: error with complex nested plan (sms via olgan)

    PIG-429: Self join with implicit split has the join output in wrong
    order (pradeepk via olgan)

    PIG-434: short-circuit AND and OR (pradeepk via olgan)

    PIG-333: allowing no parentheses with single column alias with flatten
    (sms via olgan)

    PIG-426: Adding result of two UDFs gives a syntax error (sms via olgan)

    PIG-436: alias is lost when single column is flattened (pradeepk via
    olgan)

    PIG-364: Limit returns incorrect records when we use multiple reducers
    (daijy via olgan)

    PIG-439: disallow alias renaming (pradeepk via olgan)

    PIG-440: Exceptions from UDFs inside a foreach are not captured
    (pradeepk via olgan)

    PIG-442: Disambiguated alias after a foreach flatten is not accessible a
    couple of statements after the foreach (sms via olgan)

    PIG-424: nested foreach with flatten and agg gives an error (sms via
    olgan)

    PIG-411: Pig leaves HOD processes behind if Ctrl-C is used before HOD
    connection is fully established (olgan)

    PIG-430: Projections in nested filter and inside foreach do not work
    (sms via olgan)

    PIG-445: Null Pointer Exceptions in the mappers leading to a lot of
    retries (shravanmn via olgan)

    PIG-444: job.jar is left behind (pradeepk via olgan)

    PIG-447: improved error messages (pradeepk via olgan)

    PIG-448: explain broken after load with types (pradeepk via olgan)

    PIG-380: invalid schema for databag constant (sms via olgan)

    PIG-451: If a field is part of group followed by flatten, then referring
    to it causes a parse error (pradeepk via olgan)

    PIG-455: "group" alias is lost after a flatten(group) (pradeepk via
    olgan)

    PIG-458: integration with Hadoop 18 (olgan)

    PIG-459: increased sleep time before checking for job progress

    PIG-462: LIMIT N should create one output file with N rows (shravanmn
    via olgan)

    PIG-443: Illustrate for the Types branch (shubham via olgan)

    PIG-376: set job name (olgan)

    PIG-463: POCast changes (pradeepk via olgan)