Pig is a platform for analyzing large data sets that consists of a high-level language for expressing data analysis programs, coupled with infrastructure for evaluating these programs. The salient property of Pig programs is that their structure is amenable to substantial parallelization, which in turn enables them to handle very large data sets.
Pig has graduated as a Hadoop subproject; see Message-ID: <f767f0600810171144w3aaff127p65766a0a2a1a9662@mail.gmail.com> on general@incubator.apache.org.
At the present time, Pig's infrastructure layer consists of a compiler that produces sequences of Map-Reduce programs, for which large-scale parallel implementations already exist (e.g., the Hadoop subproject). Pig's language layer currently consists of a textual language called Pig Latin, which has the following key properties:
This is the first phase of incubation, needed to start the project at Apache.
Item assignment is shown by the Apache id. Completed tasks are shown by the completion date (YYYY-MM-dd).
| date | item |
|---|---|
| October 2007 | Make sure that the requested project name does not already exist and check www.nameprotect.com to be sure that the name is not already trademarked for an existing software product. |
| Existing project (Lucene) pre-approved Fall 2007 | If the request to enter an existing Apache project comes from outside Apache, post a message to that project so that it can decide on acceptance. |
DONE
| date | item |
|---|---|
| October 2007 | Identify all the Mentors for the incubation, by asking all that can be Mentors. |
| November 2007 | Subscribe all Mentors on the pmc and general lists. |
| November 2007 | Give all Mentors access to the incubator SVN repository. (To be done by the Incubator PMC chair or an Incubator PMC member with karma for the authorizations file.) |
| November 2007 (re-done June 2008) | Tell Mentors to track progress in the file 'incubator/projects/{project.name}.xml' |
DONE
| date | item |
|---|---|
| Not applicable (I think) | Check and make sure that the papers that transfer rights to the ASF have been received. It is only necessary to transfer rights for the package, the core code, and any new code produced by the project. |
| Not applicable (new project) | Check and make sure that the files that have been donated have been updated to reflect the new ASF copyright. |
| date | item |
|---|---|
| August 2008 | Check and make sure that for all code included with the distribution that is not under the Apache license, we have the right to combine it with Apache-licensed code and redistribute it. |
| August 2008 | Check and make sure that all source code distributed by the project is covered by one or more of the following approved licenses: Apache, BSD, Artistic, MIT/X, MIT/W3C, MPL 1.1, or something with essentially the same terms. |
DONE
| date | item |
|---|---|
| October 2007 (and ongoing) | Check that all active committers have submitted a contributors agreement. |
| September 2008 | Add all active committers in the STATUS file. |
| September 2008 | Ask root for the creation of committers' accounts on people.apache.org. |
DONE
The active committers are:
| Committer | Email Address |
|---|---|
| Daniel Dai | daijyc@gmail.com |
| Nigel Daley | nigel@apache.org |
| Alan Gates | gates@yahoo-inc.com |
| Olga Natkovich | olgan@yahoo-inc.com |
| Owen O'Malley | omalley@apache.org |
| Chris Olston | olston@yahoo-inc.com |
| Ben Reed | breed@yahoo-inc.com |
| Pi Song | pi.songs@gmail.com |
| Utkarsh Srivastava | utkarsh@yahoo-inc.com |
| date | item |
|---|---|
| October 2007 | Ask infrastructure to create source repository modules and grant the committers karma. |
| October 2007 | Ask infrastructure to set up and archive mailing lists. |
| October 2007 | Decide on an issue-tracking system (Bugzilla, Scarab, Jira) and then ask infrastructure to set it up. |
| Not applicable (project starts in Apache) | Migrate the project to our infrastructure. |
DONE
Add project specific tasks here.
These action items have to be checked throughout the whole incubation process.
These items are not to be signed off as done during incubation, as they may change during incubation. They are to be looked into and described in the status reports, and completed in the request for incubation signoff.
Add project specific tasks here.
Things to check for before voting the project out.