Apache Mesos FAQ

What platforms does Mesos run on?

We run it on Linux and Mac OS X, but it could work on other POSIX systems too.

What's the quickest way to try out Mesos?

Our EC2 scripts let you launch a Mesos cluster with Hadoop, Spark and MPI on it in a few minutes on Amazon EC2.

How well does Mesos scale?

Mesos is specifically designed for scalability, and employs an efficient two-level scheduling mechanism (resource offers) and a fast, event-driven C++ implementation to support thousands of scheduling decisions per second. In our tests, Mesos can handle 50,000 slaves and 200 frameworks (clients) while keeping the task scheduling latency below 1 second. We are unaware of any other open source cluster scheduler that can scale to these levels.

What happens if the master dies?

Mesos can fail over to a backup master if you configure our failover support using ZooKeeper. Mesos is designed so that the master contains only soft state, allowing it to recover quickly from a failure. In our tests, recovery takes about 10 seconds with the default ZooKeeper ping interval.

With resource offers, do frameworks need to wait a long time to find a suitable node to launch a task on?

No. Mesos allows frameworks to provide requests (or filters) that tell the master which resources they are interested in, so that they receive only those. It also batches resources so that frameworks can see many nodes in the same offer. In general, with any scheduling system, applications wait longer if the resources they want are being used by other applications, and wait less if resources are free. Mesos is no different. However, the resource offer mechanism does simplify the master (hence improving scalability and resilience) and lets frameworks use their own policies to optimize their placement, even if these policies were not anticipated by the Mesos designers.
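To make the idea of framework-side placement concrete, here is a minimal sketch in Python. It is not the Mesos API: the Offer class, matches_request, choose_offer and the node names are all hypothetical, invented for illustration. It assumes a framework that receives a batch of offers, discards those that do not meet its resource request, and then applies its own policy (here, prefer nodes that already hold its data, then the node with the most free CPUs).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    """Hypothetical stand-in for one node's resources inside a batched offer."""
    node: str
    cpus: float
    mem_mb: int


def matches_request(offer, min_cpus, min_mem_mb):
    """Return True if the offer is large enough for the framework's task."""
    return offer.cpus >= min_cpus and offer.mem_mb >= min_mem_mb


def choose_offer(offers, min_cpus, min_mem_mb, preferred_nodes):
    """Pick one offer using a framework-specific policy.

    Policy (an assumption for this sketch): prefer nodes in preferred_nodes,
    for example nodes holding the task's input data, and break ties by
    choosing the node with the most free CPUs.
    """
    candidates = [o for o in offers if matches_request(o, min_cpus, min_mem_mb)]
    if not candidates:
        return None  # nothing suitable in this batch; wait for the next offer
    return max(candidates, key=lambda o: (o.node in preferred_nodes, o.cpus))


# One batched offer covering several nodes.
offers = [
    Offer("node-a", 1.0, 2048),
    Offer("node-b", 4.0, 8192),
    Offer("node-c", 8.0, 16384),
]

chosen = choose_offer(offers, min_cpus=2.0, min_mem_mb=4096,
                      preferred_nodes={"node-b"})
```

Because the batch covers several nodes, the framework can apply its locality preference in a single pass instead of waiting for one offer per node.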

What happens if everyone rejects an offer?

After an offer is rejected, there is a timeout (which acts like a temporary filter) before the same resources are offered again, so that Mesos does not re-offer them continuously to every framework. Frameworks can configure this timeout if they wish. In most cases, a framework that has no work to run will reject offers with an infinite timeout (passing -1 to replyToOffer), and then call reviveOffers to start receiving offers again once it has work.
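The timeout behavior can be illustrated with a small toy model in Python. This is a sketch under stated assumptions, not the Mesos implementation: the OfferTracker class and its decline, revive and can_offer methods are hypothetical names, and time is passed in explicitly as a number of seconds. The only details taken from the text above are that a rejection carries a timeout, that -1 means an infinite timeout, and that reviving offers lifts the filter.

```python
import math


class OfferTracker:
    """Toy model of per-framework refusal timeouts (hypothetical, not Mesos code)."""

    def __init__(self):
        # Maps (framework, node) to the time before which the node's
        # resources should not be re-offered to that framework.
        self._declined_until = {}

    def decline(self, framework, node, now, timeout):
        """Record a rejection. A negative timeout (such as -1) means 'indefinitely'."""
        until = math.inf if timeout < 0 else now + timeout
        self._declined_until[(framework, node)] = until

    def revive(self, framework):
        """Clear all of a framework's filters, as a revive-offers call would."""
        for key in [k for k in self._declined_until if k[0] == framework]:
            del self._declined_until[key]

    def can_offer(self, framework, node, now):
        """Return True if the node may be offered to the framework at time 'now'."""
        return now >= self._declined_until.get((framework, node), 0.0)


tracker = OfferTracker()

# fw1 rejects node-a with a 5 second timeout.
tracker.decline("fw1", "node-a", now=0.0, timeout=5.0)
blocked_during_timeout = not tracker.can_offer("fw1", "node-a", now=3.0)
offered_after_timeout = tracker.can_offer("fw1", "node-a", now=5.0)

# fw2 has no work, so it rejects with an infinite timeout, then revives later.
tracker.decline("fw2", "node-a", now=0.0, timeout=-1)
blocked_forever = not tracker.can_offer("fw2", "node-a", now=1e9)
tracker.revive("fw2")
offered_after_revive = tracker.can_offer("fw2", "node-a", now=10.0)
```

In this model an idle framework costs the master nothing after its first rejection, since its filter never expires until it explicitly asks for offers again.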

How can I write a framework on Mesos?

Mesos provides APIs to write new frameworks in Java, Python and C++. Take a look at the framework development guide and at the example frameworks in src/examples to get started.

What kind of security does Mesos provide?

Mesos runs each framework as the UNIX user who submitted it, allowing you to use UNIX permissions to control data access. It can also isolate resources on each machine using Linux Containers, so that runaway tasks or greedy users cannot monopolize the resources on each node. It does not yet protect against malicious users trying to attack Mesos itself (so, for example, you shouldn't run it as a "public cloud"), though we are working on that.
