2011-July Tashi Incubator Status Report

Tashi has been incubating since September 2008.

The Tashi project aims to build a software infrastructure for cloud computing on massive internet-scale datasets (what we call Big Data). The idea is to build a cluster management system that enables the Big Data that are stored in a cluster/data center to be accessed, shared, manipulated, and computed on by remote users in a convenient, efficient, and safe manner.

Tashi has previously encompassed just the tools to manage virtual machines using Xen and KVM, but is gaining the facility to hand out physical machines as well.

Development activities have included:

* Refactor primitive scheduler to be less convoluted
* Ensure that an old CM handle expires to not talk to a dead CM
* Use virtio networking by default for performance
* Enable config option for Miha Stopar's auto host registration
* Clean unused and untested modules from stable branch
* Reduce VM startup time when using scratch (old sparse files)
* Conversion of sparse file scratch space to Linux LVM2
* Work on migrating VMs between hosts
* Resource usage messages sent to clustermanager for accounting

The project is still working toward building a larger user and development community. User groups have been identified in Ireland, Slovenia and Korea, as well as at Georgia Tech. CMU usage is growing as other groups hear about the availability of the resource. Intel has restructured its research division and folded some operations into adjoining academic sites.

Items to be resolved before graduation:

* A stable branch exists which could be a release candidate, but the codebase is large and test hardware is currently in short supply. We are confident that the code in the stablefix branch will work if running QEMU emulation, Pickle or sqlite data storage, primitive scheduler. Xen, other data stores and schedulers have not been tested recently.
* Should have example accounting code
* Develop community diversity (Committers currently at Telefonica, Google and CMU)