A container orchestration platform for Mesos and DC/OS
v1.8.222 SHA Checksum · v1.8.222 Release Notes
Marathon is a production-grade container orchestration platform for Mesosphere’s Datacenter Operating System (DC/OS) and Apache Mesos.
Metrics are exposed at /metrics in JSON format; they can also be pushed to systems like Graphite, StatsD and DataDog, or scraped using Prometheus.
Running on DC/OS, Marathon gains the following additional features:
The graphic below shows how Marathon runs on Apache Mesos, acting as the orchestrator for other applications and services.
Marathon is the first framework to be launched, running directly alongside Mesos. This means the Marathon scheduler processes are started directly using init, upstart, or a similar tool.
Marathon is a powerful way to run other Mesos frameworks: in this case, Chronos. Marathon launches two instances of the Chronos scheduler using the Docker image mesosphere/chronos. The Chronos instances appear in orange on the top row.
If either of the two Chronos containers fails for any reason, Marathon will restart it on another agent. This approach ensures that two Chronos processes are always running.
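As a rough illustration, an app definition for the two Chronos instances described above might look like the following sketch. The field names (id, instances, cpus, mem, container) follow Marathon's app definition format; the app ID /chronos and the CPU and memory values are assumptions chosen for illustration and do not come from this page.

```python
import json


def chronos_app_definition(instances=2):
    """Build a Marathon app definition that runs Chronos from a Docker image.

    The app ID and resource values are illustrative assumptions.
    """
    return {
        "id": "/chronos",        # assumed app ID
        "instances": instances,  # Marathon keeps this many copies running
        "cpus": 0.5,             # assumed CPU share per instance
        "mem": 512,              # assumed memory per instance, in MiB
        "container": {
            "type": "DOCKER",
            "docker": {"image": "mesosphere/chronos"},
        },
    }


if __name__ == "__main__":
    # Print the JSON document that would be submitted to Marathon.
    print(json.dumps(chronos_app_definition(), indent=2))
```

A definition like this would typically be submitted with a POST to the /v2/apps endpoint of the Marathon REST API. The instances field is what expresses the "always two Chronos processes" guarantee.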
Since Chronos itself is a framework and receives resource offers, it can start tasks on Mesos. In the use case below, Chronos is running two scheduled jobs, shown in blue. One dumps a production MySQL database to S3, while another sends an email newsletter to all customers via Rake.
Meanwhile, Marathon also runs the other application containers (either Docker or Mesos) that make up our website: JBoss servers, Jetty, Sinatra, Rails, and so on.
We have shown that Marathon is responsible for running other frameworks, helps them maintain 100% uptime, and coexists with them as they create their own workloads in Mesos.
The next three images illustrate scaling and container placement.
Below we see Marathon running three applications, each scaled to a different number of containers: Search (1), Jetty (3), and Rails (5).
As the website gains traction, we decide to scale out the Search service and our Rails-based application.
We use the Marathon REST API to add more instances. Marathon takes care of placing the new containers on machines with spare capacity, honoring the constraints we previously set. We can see the containers are dynamically placed:
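The scaling call can be sketched as a PUT to /v2/apps/{appId} with a new instances value. In the sketch below, the host marathon.example:8080, the app ID /rails, and the target count of 8 are all assumptions for illustration; the helper name build_scale_request is hypothetical. The function only builds the request, so it can be inspected without a running cluster.

```python
import json
import urllib.request


def build_scale_request(base_url, app_id, instances):
    """Build (but do not send) a PUT request that sets an app's instance count."""
    url = f"{base_url.rstrip('/')}/v2/apps/{app_id.strip('/')}"
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    # Hypothetical Marathon address and target instance count.
    req = build_scale_request("http://marathon.example:8080", "/rails", 8)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
```

Against a real cluster, passing the request to urllib.request.urlopen would submit it. Marathon then starts a deployment that launches the additional containers.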
Finally, imagine that one of the datacenter workers trips over a power cord and a server is unplugged. No problem for Marathon: it moves the affected Search and Rails containers to a node that has spare capacity. Marathon has maintained our uptime in the face of machine failure.
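The failover behavior can be pictured with a toy model. This is not Marathon's actual placement logic, only a minimal sketch under simple assumptions: each node has a fixed number of task slots, and each task from the failed node goes to whichever surviving node has the most free slots. The function name, node names, and task names are hypothetical.

```python
def reschedule(placement, capacity, failed_node):
    """Return a new placement with the failed node's tasks moved to survivors.

    placement: dict mapping node name -> list of task names
    capacity:  dict mapping node name -> maximum number of tasks
    """
    survivors = {n: list(t) for n, t in placement.items() if n != failed_node}
    for task in placement.get(failed_node, []):
        if not survivors:
            raise RuntimeError("no surviving nodes")
        # Choose the surviving node with the most free slots.
        target = max(survivors, key=lambda n: capacity[n] - len(survivors[n]))
        if capacity[target] - len(survivors[target]) <= 0:
            raise RuntimeError(f"no spare capacity for {task}")
        survivors[target].append(task)
    return survivors


if __name__ == "__main__":
    placement = {
        "node-a": ["search-1", "rails-1"],
        "node-b": ["jetty-1"],
        "node-c": ["rails-2", "jetty-2"],
    }
    capacity = {"node-a": 3, "node-b": 3, "node-c": 3}
    print(reschedule(placement, capacity, "node-a"))
```

A real scheduler works from Mesos resource offers and honors placement constraints, so actual placement decisions can differ from this simplification.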