Apache Mesos

+ Apache YARN

= Myriad

Mohit Soni   Santosh Marella   Adam Bordelon

Agenda

  • What's up with datacenters these days?
  • Apache Mesos vs. Apache Hadoop/YARN?
  • Why would you want/need both?
  • Introducing Apache Myriad

What's running in your datacenter?

  • Tier 1 services
  • Tier 2 services
  • High Priority Batch
  • Best Effort, backfill

Requirements

  • Programming models based on resources,
    not machines
  • Custom resource types
  • Custom scheduling algorithms:
    Fast vs. careful/slow
  • Lightweight executors, fast task launch time
  • Multi-tenancy, utilization, strong isolation
  • Preemption/oversubscription, fault-tolerance

Hadoop and More

  • Support Hadoop/BigData ecosystem
  • Support arbitrary (legacy) processes/containers
  • Connect Big Data to non-Hadoop apps,
    share data, resources

Mesos from 10,000 feet

Open Source Apache project

Cluster Resource Manager

Scalable to 10,000s of nodes

Fault-tolerant, no SPOF

Multi-tenancy, Resource Isolation

Improved resource utilization

Mesos is more than

Yet Another Resource Negotiator

Long-running services; real-time jobs

Native Docker; cgroups for years;
Isolate cpu/mem/disk/net/other

Distributed systems SDK;
~200 LOC for a new app

Core written in C++ for performance,
Apps in any language

Why two resource managers?

Static Partitioning sucks

  • Hadoop teams fine with isolated clusters,
    but Ops team unhappy; slow to provision
  • Resource silos, no elasticity
  • Want to run Hadoop on the same infrastructure,
    without interrupting Tier-1 services
  • Want multi-tenancy, resource sharing/isolation

Introducing Myriad

Myriad Overview

  • Makes YARN (Yet Another) Mesos Framework
  • Mesos manages the datacenter, YARN manages Hadoop
  • Get resources from Mesos, scale YARN
  • Reclaim YARN resources, give back to Mesos
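
The get/reclaim cycle above can be sketched as a toy model. This is a minimal illustration, not Myriad's actual code: the names `Resources`, `ToyMyriad`, `flex_up`, `flex_down` and `on_offer` are hypothetical, and the one-NodeManager-per-host rule is an assumption made to keep the example small.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Resources:
    cpus: float
    mem_mb: int

    def fits_in(self, other: "Resources") -> bool:
        # True when this request can be satisfied by the offered resources.
        return self.cpus <= other.cpus and self.mem_mb <= other.mem_mb


@dataclass
class ToyMyriad:
    """Toy stand-in for a Myriad-like scheduler (hypothetical, not the real classes)."""
    nm_profile: Resources                        # size of one NodeManager
    pending: int = 0                             # NodeManagers requested, not yet launched
    running: dict = field(default_factory=dict)  # host -> resources held by its NodeManager

    def flex_up(self, instances: int) -> None:
        # Ask for more YARN capacity; satisfied later by Mesos offers.
        self.pending += instances

    def on_offer(self, host: str, offered: Resources) -> bool:
        # Accept the offer only if a NodeManager is pending, the host has
        # none yet (assumption), and the profile fits in the offer.
        if (self.pending > 0 and host not in self.running
                and self.nm_profile.fits_in(offered)):
            self.running[host] = self.nm_profile
            self.pending -= 1
            return True
        return False  # declined: the resources stay with Mesos

    def flex_down(self, instances: int) -> list:
        # Stop NodeManagers; their resources go back to Mesos.
        freed = list(self.running)[:instances]
        for host in freed:
            del self.running[host]
        return freed

    def yarn_capacity(self) -> Resources:
        # Total resources YARN currently sees through its NodeManagers.
        return Resources(
            cpus=sum(r.cpus for r in self.running.values()),
            mem_mb=sum(r.mem_mb for r in self.running.values()),
        )
```

In this sketch YARN's capacity only changes when an offer is accepted or a NodeManager is stopped, which is the sense in which Mesos keeps control of the datacenter while YARN manages Hadoop.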

Myriad improves Mesos

Tighter integration with Hadoop frameworks like HBase, Hive, Pig

Borrow resources from Hadoop
when traffic spikes for tier-1 services

Backfill unused resource capacity
with best-effort Hadoop jobs

No Mesos code changes necessary
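
To make "borrow" and "backfill" concrete, here is a small sketch of a strict-priority split of cluster capacity. It is only an illustration under stated assumptions: the function `allocate`, the workload names and the CPU numbers are hypothetical, and neither Mesos nor Myriad is claimed to allocate this way.

```python
def allocate(capacity, demands):
    """Split `capacity` CPUs among workloads in strict priority order.

    demands: list of (name, priority, cpus_wanted); a lower priority
    number means a more important workload (hypothetical convention).
    """
    remaining = capacity
    grants = {}
    for name, _priority, wanted in sorted(demands, key=lambda d: d[1]):
        granted = min(wanted, remaining)  # take what is left, up to the request
        grants[name] = granted
        remaining -= granted
    return grants


# Normal load: Hadoop backfills whatever the tier-1 service leaves unused.
normal = allocate(100, [("hadoop-batch", 3, 80), ("web-tier1", 0, 30)])

# Traffic spike: tier-1 demand doubles, so it borrows capacity from Hadoop.
spike = allocate(100, [("hadoop-batch", 3, 80), ("web-tier1", 0, 60)])
```

Under these numbers the best-effort Hadoop share shrinks from 70 to 40 CPUs during the spike and would grow back once tier-1 demand drops.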

Myriad improves Hadoop

Elastic scaling

Fault-tolerant: Maintain NodeManager (NM) capacity

Share resources with other workloads,
improve resource utilization

Multiple isolated Hadoop clusters
sharing node resources and DFS

No YARN/Hadoop code changes

Use-cases

YARN on the fly

  • Elastic: Scale up/down NMs as needed
  • Fault-tolerant: Run ResourceManager (RM) on Marathon, auto-restart on another node
  • Quick dev/QA clusters; compatibility testing
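
As a rough sketch of running the RM on Marathon, the snippet below builds a minimal Marathon app definition. The `id`, `cmd`, `cpus`, `mem` and `instances` fields are standard Marathon app fields; the app id, the install path and the resource sizes are assumptions chosen for illustration, not values from the Myriad project.

```python
import json

# Minimal Marathon app definition for a YARN ResourceManager.
# Marathon keeps `instances` copies of the app running, so if the node
# hosting the RM fails, the task is relaunched on another node.
rm_app = {
    "id": "/yarn/resourcemanager",                    # hypothetical app id
    "cmd": "/opt/hadoop/bin/yarn resourcemanager",    # install path is an assumption
    "cpus": 2.0,                                      # illustrative sizing
    "mem": 4096,                                      # MB, illustrative sizing
    "instances": 1,                                   # exactly one RM
}

# JSON body that would be submitted to Marathon's app-creation endpoint
# (POST /v2/apps); the HTTP call itself is left out of this sketch.
payload = json.dumps(rm_app, indent=2)
print(payload)
```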

Multiple YARNs

  • Multiple isolated YARN clusters:
    different versions, dev/test/prod
  • YARN version upgrade, workload migration
  • Scale up new YARN cluster, scale down old
  • Same data layer (HDFS)
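
The "scale up new, scale down old" migration can be sketched as a stepwise plan. This is an assumption-laden illustration: `migration_plan` and the batch size `step` are hypothetical names, and the sketch only tracks NodeManager counts, not running jobs.

```python
def migration_plan(old_nms, step=1):
    """Return (old_count, new_count) pairs for a rolling migration.

    Each step starts up to `step` NodeManagers in the new YARN cluster
    and stops the same number in the old one, so the combined count
    never drops below the starting capacity.
    """
    if step < 1:
        raise ValueError("step must be at least 1")
    old, new = old_nms, 0
    plan = [(old, new)]
    while old > 0:
        moved = min(step, old)
        new += moved  # scale up the new cluster first
        old -= moved  # then scale down the old one
        plan.append((old, new))
    return plan


plan = migration_plan(5, step=2)
```

Because both clusters sit on the same data layer, a plan like this only shifts compute; no data has to move between the old and new YARN versions.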

Sharing with non-Hadoop workloads

  • Web servers, data ingestion: write to HDFS
  • Followed by periodic Hadoop analytics,
    write to HDFS/HBase
  • Which may feed back into the webserver,
    or start another data ingestion cycle
  • Different needs over time, dynamically adjust

Features in Progress

  • Fine-grained scaling
  • Myriad scheduler HA, task reconciliation
  • RM discovery using Marathon/Mesos-DNS
  • Distribution of Hadoop binaries
  • Dockerization
  • Upcoming: Multiple isolated YARN clusters
  • Upcoming: Data locality optimizations

Apache Incubator Podling Update

  • Done: Proposal, Status page, mailing lists
  • Mentors: benh, tdunning, danese, lresende
  • Committers: Mohit, Santosh, Adam, KenSipe

  • Legal: Naming, donation, CLAs
  • INFRA: git repo (empty), JIRA, website, wiki
  • Release: version, vote, ShipIt!
  • Community: more users, committers

Learn More!

Apache Myriad Incubator Proposal

Apache Myriad Incubator Status Page

https://github.com/mesos/myriad

dev@myriad.incubator.apache.org

http://mesos.apache.org

https://github.com/apache/mesos

http://mesosphere.com