# Licensed to the Apache Software Foundation (ASF) under one or more
# contributor license agreements.  See the NOTICE file distributed with
# this work for additional information regarding copyright ownership.
# The ASF licenses this file to You under the Apache License, Version 2.0
# (the "License"); you may not use this file except in compliance with
# the License.  You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Source: hadoop
Section: misc
Priority: extra
Maintainer: Bigtop
Build-Depends: debhelper (>= 6), ant, ant-optional, liblzo2-dev, python, libzip-dev, automake, autoconf (>= 2.61), sharutils, g++ (>= 4), git-core, libfuse-dev, libssl-dev
Standards-Version: 3.8.0
Homepage: http://hadoop.apache.org/core/

Package: hadoop
Provides: hadoop
Architecture: all
Depends: ${shlibs:Depends}, ${misc:Depends}, adduser, bigtop-utils, zookeeper (>= 3.4.0), psmisc, netcat-openbsd
Description: Hadoop is a software platform for processing vast amounts of data
 Hadoop is a software platform that lets one easily write and run
 applications that process vast amounts of data.
 .
 Here's what makes Hadoop especially useful:
  * Scalable: Hadoop can reliably store and process petabytes.
  * Economical: It distributes the data and processing across clusters
    of commonly available computers. These clusters can number into the
    thousands of nodes.
  * Efficient: By distributing the data, Hadoop can process it in
    parallel on the nodes where the data is located. This makes it
    extremely rapid.
  * Reliable: Hadoop automatically maintains multiple copies of data and
    automatically redeploys computing tasks based on failures.
 .
 Hadoop implements MapReduce, using the Hadoop Distributed File System
 (HDFS). MapReduce divides applications into many small blocks of work.
 HDFS creates multiple replicas of data blocks for reliability, placing
 them on compute nodes around the cluster. MapReduce can then process
 the data where it is located.

Package: hadoop-hdfs
Architecture: all
Depends: ${shlibs:Depends}, ${misc:Depends}, adduser, bigtop-utils, hadoop (= ${source:Version}), bigtop-jsvc
Description: The Hadoop Distributed File System
 The Hadoop Distributed File System (HDFS) is the primary storage system
 used by Hadoop applications. HDFS creates multiple replicas of data
 blocks and distributes them on compute nodes throughout a cluster to
 enable reliable, extremely rapid computations.

Package: hadoop-yarn
Architecture: all
Depends: ${shlibs:Depends}, ${misc:Depends}, adduser, bigtop-utils, hadoop (= ${source:Version})
Description: The Hadoop NextGen MapReduce (YARN)
 YARN (Hadoop NextGen MapReduce) is a general-purpose data-computation
 framework. The fundamental idea of YARN is to split up the two major
 functions of the JobTracker, resource management and job
 scheduling/monitoring, into separate daemons: the ResourceManager and
 the NodeManager.
 .
 The ResourceManager is the ultimate authority that arbitrates resources
 among all the applications in the system. The NodeManager is a per-node
 agent that manages the allocation of computational resources on a
 single node. Both work in support of the per-application
 ApplicationMaster (AM).
 .
 An ApplicationMaster is, in effect, a framework-specific library tasked
 with negotiating resources from the ResourceManager and working with
 the NodeManager(s) to execute and monitor the tasks.
Package: hadoop-mapreduce
Architecture: all
Depends: ${shlibs:Depends}, ${misc:Depends}, adduser, bigtop-utils, hadoop-yarn (= ${source:Version})
Description: The Hadoop MapReduce (MRv2)
 Hadoop MapReduce is a programming model and software framework for
 writing applications that rapidly process vast amounts of data in
 parallel on large clusters of compute nodes.

Package: hadoop-hdfs-fuse
Architecture: i386 amd64
Depends: ${shlibs:Depends}, hadoop-hdfs (= ${source:Version}), hadoop-client (= ${source:Version}), libfuse2, fuse-utils
Enhances: hadoop
Description: Mountable HDFS
 This package allows HDFS to be mounted (on most flavors of Unix) as a
 standard file system using FUSE.

Package: hadoop-doc
Provides: hadoop-doc
Architecture: all
Section: doc
Description: Hadoop Documentation
 Documentation for Hadoop.

Package: hadoop-conf-pseudo
Provides: hadoop-conf-pseudo
Architecture: all
Depends: hadoop (= ${source:Version}), hadoop-hdfs-namenode (= ${source:Version}), hadoop-hdfs-datanode (= ${source:Version}), hadoop-hdfs-secondarynamenode (= ${source:Version}), hadoop-yarn-resourcemanager (= ${source:Version}), hadoop-yarn-nodemanager (= ${source:Version}), hadoop-mapreduce-historyserver (= ${source:Version})
Description: Pseudo-distributed Hadoop configuration
 Contains configuration files for a "pseudo-distributed" Hadoop
 deployment. In this mode, each of the Hadoop components runs as a
 separate Java process, but all on the same machine.
Package: hadoop-mapreduce-historyserver
Provides: hadoop-mapreduce-historyserver
Architecture: all
Depends: hadoop-mapreduce (= ${source:Version})
Description: MapReduce History Server
 The History Server keeps records of the different activities performed
 on an Apache Hadoop cluster.

Package: hadoop-yarn-nodemanager
Provides: hadoop-yarn-nodemanager
Architecture: all
Depends: hadoop-yarn (= ${source:Version})
Description: YARN Node Manager
 The NodeManager is the per-machine framework agent that is responsible
 for containers, monitoring their resource usage (CPU, memory, disk,
 network) and reporting the same to the ResourceManager/Scheduler.

Package: hadoop-yarn-resourcemanager
Provides: hadoop-yarn-resourcemanager
Architecture: all
Depends: hadoop-yarn (= ${source:Version})
Description: YARN Resource Manager
 The ResourceManager manages the global assignment of compute resources
 to applications.

Package: hadoop-yarn-proxyserver
Provides: hadoop-yarn-proxyserver
Architecture: all
Depends: hadoop-yarn (= ${source:Version})
Description: YARN Web Proxy
 The web proxy server sits in front of the YARN ApplicationMaster web
 UI.

Package: hadoop-hdfs-namenode
Provides: hadoop-hdfs-namenode
Architecture: all
Depends: hadoop-hdfs (= ${source:Version})
Description: The Hadoop namenode manages the block locations of HDFS files
 The Hadoop Distributed File System (HDFS) requires one unique server,
 the namenode, which manages the block locations of files on the
 filesystem.

Package: hadoop-hdfs-secondarynamenode
Provides: hadoop-hdfs-secondarynamenode
Architecture: all
Depends: hadoop-hdfs (= ${source:Version})
Description: Hadoop Secondary namenode
 The Secondary NameNode periodically compacts the NameNode EditLog into
 a checkpoint. This compaction ensures that NameNode restarts do not
 incur unnecessary downtime.
Package: hadoop-hdfs-zkfc
Provides: hadoop-hdfs-zkfc
Architecture: all
Depends: hadoop-hdfs (= ${source:Version})
Description: Hadoop HDFS failover controller
 The Hadoop HDFS failover controller is a ZooKeeper client that also
 monitors and manages the state of the NameNode. Each machine that runs
 a NameNode also runs a ZKFC, and that ZKFC is responsible for health
 monitoring, ZooKeeper session management, and ZooKeeper-based election.

Package: hadoop-hdfs-datanode
Provides: hadoop-hdfs-datanode
Architecture: all
Depends: hadoop-hdfs (= ${source:Version})
Description: Hadoop Data Node
 The DataNodes in a Hadoop cluster are responsible for serving blocks of
 data over the network to Hadoop Distributed File System (HDFS) clients.

Package: libhdfs0
Architecture: any
Depends: hadoop (= ${source:Version}), ${shlibs:Depends}
Description: Hadoop Filesystem Library
 C library for accessing the Hadoop Distributed File System (HDFS).

Package: libhdfs0-dev
Architecture: any
Section: libdevel
Depends: hadoop (= ${source:Version}), libhdfs0 (= ${binary:Version})
Description: Development support for libhdfs0
 Includes examples and header files for accessing HDFS from C.

Package: hadoop-httpfs
Provides: hadoop-httpfs
Architecture: all
Depends: hadoop-hdfs (= ${source:Version}), bigtop-tomcat
Description: HTTPFS for Hadoop
 The server providing HTTP REST API support for the complete
 FileSystem/FileContext interface in HDFS.

Package: hadoop-client
Provides: hadoop-client
Architecture: all
Depends: hadoop (= ${source:Version}), hadoop-hdfs (= ${source:Version}), hadoop-yarn (= ${source:Version}), hadoop-mapreduce (= ${source:Version})
Description: Hadoop client side dependencies
 Installing this package provides all the dependencies for Hadoop
 clients.