Engine Integration

Apache Spark

  1. Deploy Client JAR
  2. Core Spark Configuration

  3. Quick Test with spark-shell
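
The three Spark steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the original commands from this guide: `CELEBORN_HOME`, `SPARK_HOME`, the fragment file name, and the Master host names are placeholders, and the property names are the ones documented upstream by Apache Celeborn for its Spark client, so verify them against your installed version.

```shell
# Sketch only: paths and host names are placeholders, not values from this guide.
CELEBORN_HOME="${CELEBORN_HOME:-/opt/celeborn}"
SPARK_HOME="${SPARK_HOME:-/opt/spark}"
MASTERS="${MASTERS:-master1:9097,master2:9097,master3:9097}"
CONF_FRAGMENT="${CONF_FRAGMENT:-./spark-celeborn.conf}"

# Step 1: copy the shaded Spark client JAR(s), but only if both directories exist.
if [ -d "$CELEBORN_HOME/spark" ] && [ -d "$SPARK_HOME/jars" ]; then
  cp "$CELEBORN_HOME"/spark/*.jar "$SPARK_HOME/jars/"
fi

# Step 2: write the core settings to a fragment that can be merged into
# spark-defaults.conf (property names assumed from upstream Celeborn docs).
cat > "$CONF_FRAGMENT" <<EOF
spark.shuffle.manager org.apache.spark.shuffle.celeborn.SparkShuffleManager
spark.serializer org.apache.spark.serializer.KryoSerializer
spark.celeborn.master.endpoints $MASTERS
spark.shuffle.service.enabled false
EOF

# Step 3: print (rather than run) a spark-shell command that uses the fragment.
echo "$SPARK_HOME/bin/spark-shell --properties-file $CONF_FRAGMENT"
```

Note that `--properties-file` makes Spark read the given file instead of `conf/spark-defaults.conf`, which keeps a quick test isolated from the cluster-wide defaults.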

Apache Flink

  1. Deploy Client JAR
  2. Configure Flink
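
The two Flink steps above might look like the sketch below. `CELEBORN_HOME`, `FLINK_HOME`, the fragment file name, and the Master host names are placeholders, and the keys are the ones documented upstream by Apache Celeborn for its Flink plugin; treat them as assumptions to check against your Celeborn and Flink versions.

```shell
# Sketch only: paths and host names are placeholders, not values from this guide.
CELEBORN_HOME="${CELEBORN_HOME:-/opt/celeborn}"
FLINK_HOME="${FLINK_HOME:-/opt/flink}"
FLINK_MASTERS="${FLINK_MASTERS:-master1:9097,master2:9097,master3:9097}"
FLINK_FRAGMENT="${FLINK_FRAGMENT:-./flink-celeborn.yaml}"

# Step 1: copy the shaded Flink client JAR(s), but only if both directories exist.
if [ -d "$CELEBORN_HOME/flink" ] && [ -d "$FLINK_HOME/lib" ]; then
  cp "$CELEBORN_HOME"/flink/*.jar "$FLINK_HOME/lib/"
fi

# Step 2: write the shuffle settings to a fragment to merge into flink-conf.yaml
# (keys assumed from upstream Celeborn docs).
cat > "$FLINK_FRAGMENT" <<EOF
shuffle-service-factory.class: org.apache.celeborn.plugin.flink.RemoteShuffleServiceFactory
execution.batch-shuffle-mode: ALL_EXCHANGES_BLOCKING
celeborn.master.endpoints: $FLINK_MASTERS
EOF

cat "$FLINK_FRAGMENT"
```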

Apache Tez

Append the following entry to yarn.application.classpath in yarn-site.xml and to mapreduce.application.classpath in the MapReduce configuration (mapred-site.xml).

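
The classpath entry itself is not given here, so the sketch below only illustrates the append step. `CELEBORN_CLIENT_CP` is a hypothetical location and `append_classpath` is a helper introduced purely for illustration; it adds an entry to a comma-separated classpath value unless that entry is already present.

```shell
# Hypothetical JAR location; the real entry is not given in this guide.
CELEBORN_CLIENT_CP="${CELEBORN_CLIENT_CP:-/path/to/celeborn/client/*}"

# Append an entry to a comma-separated classpath value unless already present.
append_classpath() {
  local current="$1" entry="$2"
  case ",$current," in
    *",$entry,"*)
      # Entry already listed: return the value unchanged.
      printf '%s\n' "$current" ;;
    *)
      if [ -z "$current" ]; then
        printf '%s\n' "$entry"
      else
        printf '%s\n' "$current,$entry"
      fi ;;
  esac
}

# Example with a shortened, illustrative yarn.application.classpath value.
yarn_cp='$HADOOP_CONF_DIR,$HADOOP_COMMON_HOME/share/hadoop/common/*'
append_classpath "$yarn_cp" "$CELEBORN_CLIENT_CP"
```

The same helper can be applied to the current value of mapreduce.application.classpath; the result is then pasted back into the corresponding configuration field.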

MapReduce

Apply the classpath additions described above for Tez, then add the following properties to the MapReduce configuration.

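
As a rough illustration of the additional MapReduce properties, the sketch below writes a property list and turns it into `-D` options. The property names and class names are the ones documented upstream by Apache Celeborn for its MapReduce client, not values recovered from this guide, so treat them as assumptions and confirm them for your version; the Master endpoints and file name are placeholders.

```shell
# Sketch only: endpoints and file name are placeholders.
MR_MASTERS="${MR_MASTERS:-master1:9097,master2:9097,master3:9097}"
MR_PROPS="${MR_PROPS:-./mapreduce-celeborn.properties}"

# Property names assumed from upstream Celeborn docs; verify before use.
cat > "$MR_PROPS" <<EOF
mapreduce.job.reduce.slowstart.completedmaps=1
mapreduce.celeborn.master.endpoints=$MR_MASTERS
yarn.app.mapreduce.am.job.recovery.enable=false
yarn.app.mapreduce.am.command-opts=org.apache.celeborn.mapreduce.v2.app.MRAppMasterWithCeleborn
mapreduce.job.map.output.collector.class=org.apache.hadoop.mapred.CelebornMapOutputCollector
mapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.CelebornShuffleConsumer
EOF

# Turn each key=value line into a -D option for a one-off job submission.
sed 's/^/-D/' "$MR_PROPS"
```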

High Availability

Master HA Overview

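Celeborn Master achieves HA using Apache Ratis, an implementation of the Raft consensus protocol. A minimum of 3 Master nodes is required. Use an odd number of Masters (3, 5, or 7): Raft elects a leader by majority vote, so an even-sized cluster tolerates no more failures than the next smaller odd-sized one.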

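
The majority-vote arithmetic behind the odd-number guidance can be checked with a few lines of shell. The function names are illustrative; the formulas are the standard Raft quorum rules (quorum is floor(N/2) + 1, and the group survives N minus quorum failures).

```shell
# Majority (quorum) size for an N-node Raft group: floor(N/2) + 1.
quorum() { echo $(( $1 / 2 + 1 )); }

# Number of Master failures the group survives: N - quorum(N).
tolerated_failures() { echo $(( $1 - ($1 / 2 + 1) )); }

for n in 3 4 5 7; do
  echo "masters=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

Four Masters tolerate one failure, the same as three, which is why the even size adds cost without adding resilience.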

Master HA Configuration

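
A three-node HA configuration might be generated as in the sketch below. The per-node property names follow the upstream Apache Celeborn configuration reference (`celeborn.master.ha.node.<id>.*`); the host names, the ports 9097 and 9872, the output file name, and the Ratis storage directory are assumptions for illustration.

```shell
# Sketch only: hosts, ports, file name, and storage directory are assumptions.
HA_CONF="${HA_CONF:-./celeborn-ha.conf}"
RATIS_DIR="${RATIS_DIR:-/var/lib/celeborn/ratis}"   # hypothetical directory

{
  echo "celeborn.master.ha.enabled true"
  id=1
  for host in master1 master2 master3; do
    # One host / RPC port / Ratis port triple per Master node.
    echo "celeborn.master.ha.node.$id.host $host"
    echo "celeborn.master.ha.node.$id.port 9097"
    echo "celeborn.master.ha.node.$id.ratis.port 9872"
    id=$((id + 1))
  done
  echo "celeborn.master.ha.ratis.raft.server.storage.dir $RATIS_DIR"
} > "$HA_CONF"

cat "$HA_CONF"
```

The generated lines are meant to be merged into the Master's celeborn-defaults.conf on every Master node, with identical node IDs on each host.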

Ambari Mpack HA Settings

Property (in celeborn-ha)          Value
celeborn.master.ha.enabled         true
celeborn.master.ha.node.mapping    master1:1,master2:2,master3:3
celeborn.master.ha.rpc.port        9097
celeborn.master.ha.ratis.port      9872

Worker Fault Tolerance

Enable data replication on the Spark client to ensure shuffle data survives a worker node failure:

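
One way to switch replication on for a single job is sketched below. The property name is the one documented upstream by Apache Celeborn for its Spark client and should be verified for your version; the application JAR path is a placeholder, and the command is printed rather than executed.

```shell
# Property name assumed from upstream Celeborn docs; verify for your version.
REPLICATION_CONF="spark.celeborn.client.push.replicate.enabled=true"

# Print (rather than run) a spark-submit invocation; the JAR path is a placeholder.
echo "spark-submit --conf $REPLICATION_CONF /path/to/your-app.jar"
```

Replication writes each shuffle partition to a second worker, so it needs at least two workers and increases network and disk usage accordingly.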