[INFRA] Remove incubator/incubating for graduation
### What changes were proposed in this pull request?

Remove incubator/incubating for graduation including:

- Remove `incubator`/`Incubating`.
- Remove `DISCLAIMER` and corresponding link.
- Update Release scripts and template.

Fix apache#2415.
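
A minimal sketch of how leftover references could be located before and after a cleanup like this one. It is not part of the patch; the regular expression and the helper names `find_leftovers` and `scan_tree` are hypothetical and chosen only for illustration.

```python
import re
from pathlib import Path

# Hypothetical pattern: matches "incubator", "incubating", "Incubating", etc.
LEFTOVER = re.compile(r"incubat(?:or|ing)", re.IGNORECASE)


def find_leftovers(text):
    """Return (line_number, line) pairs that still mention incubation."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if LEFTOVER.search(line)
    ]


def scan_tree(root):
    """Collect leftovers per file under root, skipping .git and unreadable files."""
    hits = {}
    for path in Path(root).rglob("*"):
        if not path.is_file() or ".git" in path.parts:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # binary or unreadable file: nothing to report
        found = find_leftovers(text)
        if found:
            hits[str(path)] = found
    return hits
```

Hits still need manual review, because some matches are legitimate, such as the `apache/incubator-gluten` link that this commit itself introduces.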

### Why are the changes needed?

The ASF board has approved a resolution to graduate Celeborn into a full Top Level Project. To transition from the Apache Incubator to a new TLP, there are a few action items we need to complete.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

No.

Closes apache#2421 from SteNicholas/infra-graduation.

Authored-by: SteNicholas <[email protected]>
Signed-off-by: mingji <[email protected]>
(cherry picked from commit c9b878a)
Signed-off-by: SteNicholas <[email protected]>
SteNicholas committed May 7, 2024
1 parent 15de4e5 commit 641a802
Showing 19 changed files with 71 additions and 92 deletions.
10 changes: 0 additions & 10 deletions DISCLAIMER

This file was deleted.

2 changes: 1 addition & 1 deletion NOTICE
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
2 changes: 1 addition & 1 deletion NOTICE-binary
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
55 changes: 26 additions & 29 deletions README.md
@@ -1,7 +1,7 @@
# Apache Celeborn (Incubating)
# Apache Celeborn

[![Celeborn CI](https://github.com/apache/incubator-celeborn/actions/workflows/maven.yml/badge.svg)](https://github.com/apache/incubator-celeborn/actions/workflows/maven.yml)
Celeborn is dedicated to improving the efficiency and elasticity of
[![Celeborn CI](https://github.com/apache/celeborn/actions/workflows/maven.yml/badge.svg)](https://github.com/apache/celeborn/actions/workflows/maven.yml)
Celeborn (/ˈkeləbɔ:n/) is dedicated to improving the efficiency and elasticity of
different map-reduce engines and provides an elastic, high-efficient
management service for intermediate data including shuffle data, spilled data, result data, etc. Currently, Celeborn is focusing on shuffle data.

Expand Down Expand Up @@ -44,12 +44,12 @@ Celeborn worker's slot count decreases when a partition is allocated and increme
1. Celeborn supports Spark 2.4/3.0/3.1/3.2/3.3/3.4/3.5, Flink 1.14/1.15/1.17/1.18 and Hadoop MapReduce 2/3.
2. Celeborn tested under Scala 2.11/2.12/2.13 and Java 8/11/17 environment.

Build Celeborn
Build Celeborn via `make-distribution.sh`:
```shell
./build/make-distribution.sh -Pspark-2.4/-Pspark-3.0/-Pspark-3.1/-Pspark-3.2/-Pspark-3.3/-Pspark-3.4/-Pflink-1.14/-Pflink-1.15/-Pflink-1.17/-Pflink-1.18/-Pmr
```

package apache-celeborn-${project.version}-bin.tgz will be generated.
Package `apache-celeborn-${project.version}-bin.tgz` will be generated.

> **_NOTE:_** The following table indicates the compatibility of Celeborn Spark and Flink clients with different versions of Spark and Flink for various Java and Scala versions.
Expand All @@ -67,7 +67,7 @@ package apache-celeborn-${project.version}-bin.tgz will be generated.
| Flink 1.17 | &#x274C; | &#10004; | &#10004; | &#x274C; | &#x274C; | &#x274C; | &#x274C; |
| Flink 1.18 | &#x274C; | &#10004; | &#10004; | &#x274C; | &#x274C; | &#x274C; | &#x274C; |

To compile the client for Spark 2.4 with Scala 2.12, please use the following command
To compile the client for Spark 2.4 with Scala 2.12, please use the following command:

- Scala 2.12.8/2.12.9/2.12.10
```shell
Expand Down Expand Up @@ -107,8 +107,8 @@ Celeborn cluster composes of Master and Worker nodes, the Master supports both s

### Deploy Celeborn
#### Deploy on host
1. Unzip the tarball to `$CELEBORN_HOME`
2. Modify environment variables in `$CELEBORN_HOME/conf/celeborn-env.sh`
1. Unzip the tarball to `$CELEBORN_HOME`.
2. Modify environment variables in `$CELEBORN_HOME/conf/celeborn-env.sh`.

EXAMPLE:
```properties
Expand All @@ -117,7 +117,7 @@ CELEBORN_MASTER_MEMORY=4g
CELEBORN_WORKER_MEMORY=2g
CELEBORN_WORKER_OFFHEAP_MEMORY=4g
```
3. Modify configurations in `$CELEBORN_HOME/conf/celeborn-defaults.conf`
3. Modify configurations in `$CELEBORN_HOME/conf/celeborn-defaults.conf`.

EXAMPLE: single master cluster
```properties
Expand Down Expand Up @@ -151,7 +151,7 @@ celeborn.worker.replicate.fastFail.duration 240s
celeborn.storage.hdfs.kerberos.principal user@REALM
celeborn.storage.hdfs.kerberos.keytab /path/to/user.keytab

# If your hosts have disk raid or use lvm, set celeborn.worker.monitor.disk.enabled to false
# If your hosts have disk raid or use lvm, set `celeborn.worker.monitor.disk.enabled` to false
celeborn.worker.monitor.disk.enabled false
```

Expand Down Expand Up @@ -198,26 +198,24 @@ celeborn.worker.flusher.hdfs.buffer.size 4m
celeborn.storage.hdfs.dir hdfs://<namenode>/celeborn
celeborn.worker.replicate.fastFail.duration 240s

# If your hosts have disk raid or use lvm, set celeborn.worker.monitor.disk.enabled to false
# If your hosts have disk raid or use lvm, set `celeborn.worker.monitor.disk.enabled` to false
celeborn.worker.monitor.disk.enabled false
```

Flink engine related configurations:
```properties
# if you are using Celeborn for flink, these settings will be needed
# If you are using Celeborn for flink, these settings will be needed.
celeborn.worker.directMemoryRatioForReadBuffer 0.4
celeborn.worker.directMemoryRatioToResume 0.6
# these setting will affect performance.
celeborn.worker.directMemoryRatioToResume 0.5
# These setting will affect performance.
# If there is enough off-heap memory, you can try to increase read buffers.
# Read buffer max memory usage for a data partition is `taskmanager.memory.segment-size * readBuffersMax`
celeborn.worker.partition.initial.readBuffersMin 512
celeborn.worker.partition.initial.readBuffersMax 1024
celeborn.worker.readBuffer.allocationWait 10ms
# Currently, shuffle partitionSplit is not supported, so you should disable split in celeborn worker side or set `celeborn.client.shuffle.partitionSplit.threshold` to a high value in flink client side.
celeborn.worker.shuffle.partitionSplit.enabled false
```

4. Copy Celeborn and configurations to all nodes
4. Copy Celeborn and configurations to all nodes.
5. Start all services. If you install Celeborn distribution in the same path on every node and your
cluster can perform SSH login then you can fill `$CELEBORN_HOME/conf/hosts` and
use `$CELEBORN_HOME/sbin/start-all.sh` to start all
Expand Down Expand Up @@ -250,14 +248,14 @@ WorkerRef: null
Please refer to our [website](https://celeborn.apache.org/docs/latest/deploy_on_k8s/)

### Deploy Spark client
Copy $CELEBORN_HOME/spark/*.jar to $SPARK_HOME/jars/
Copy `$CELEBORN_HOME/spark/*.jar` to `$SPARK_HOME/jars/`.

#### Spark Configuration
To use Celeborn,the following spark configurations should be added.
To use Celeborn, the following spark configurations should be added.
```properties
# Shuffle manager class name changed in 0.3.0:
# before 0.3.0: org.apache.spark.shuffle.celeborn.RssShuffleManager
# since 0.3.0: org.apache.spark.shuffle.celeborn.SparkShuffleManager
# before 0.3.0: `org.apache.spark.shuffle.celeborn.RssShuffleManager`
# since 0.3.0: `org.apache.spark.shuffle.celeborn.SparkShuffleManager`
spark.shuffle.manager org.apache.spark.shuffle.celeborn.SparkShuffleManager
# must use kryo serializer because java serializer do not support relocation
spark.serializer org.apache.spark.serializer.KryoSerializer
Expand All @@ -272,13 +270,13 @@ spark.shuffle.service.enabled false
# Sort shuffle writer uses less memory than hash shuffle writer, if your shuffle partition count is large, try to use sort hash writer.
spark.celeborn.client.spark.shuffle.writer hash

# We recommend setting spark.celeborn.client.push.replicate.enabled to true to enable server-side data replication
# We recommend setting `spark.celeborn.client.push.replicate.enabled` to true to enable server-side data replication
# If you have only one worker, this setting must be false
# If your Celeborn is using HDFS, it's recommended to set this setting to false
spark.celeborn.client.push.replicate.enabled true

# Support for Spark AQE only tested under Spark 3
# we recommend setting localShuffleReader to false to get better performance of Celeborn
# we recommend setting localShuffleReader to false for getting better performance of Celeborn
spark.sql.adaptive.localShuffleReader.enabled false

# If Celeborn is using HDFS
Expand All @@ -296,7 +294,7 @@ spark.dynamicAllocation.shuffleTracking.enabled false
```

### Deploy Flink client
Copy $CELEBORN_HOME/flink/*.jar to $FLINK_HOME/lib/
Copy `$CELEBORN_HOME/flink/*.jar` to `$FLINK_HOME/lib/`.

#### Flink Configuration
To use Celeborn, the following flink configurations should be added.
Expand All @@ -322,9 +320,9 @@ taskmanager.memory.task.off-heap.size: 512m
```
**Note**: The config option `execution.batch-shuffle-mode` should configure as `ALL_EXCHANGES_BLOCKING`.

### Deploy mapreduce client
Add $CELEBORN_HOME/mr/*.jar to to `mapreduce.application.classpath` and `yarn.application.classpath`.
And setting the following settings in YARN and MapReduce config.
### Deploy MapReduce client
Copy `$CELEBORN_HOME/mr/*.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
Meanwhile, configure the following settings in YARN and MapReduce config.
```bash
-Dyarn.app.mapreduce.am.job.recovery.enable=false
-Dmapreduce.job.reduce.slowstart.completedmaps=1
Expand All @@ -334,7 +332,6 @@ And setting the following settings in YARN and MapReduce config.
-Dmapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.CelebornShuffleConsumer
```


### Best Practice
If you want to set up a production-ready Celeborn cluster, your cluster should have at least 3 masters and at least 4 workers.
Masters and works can be deployed on the same node but should not deploy multiple masters or workers on the same node.
Expand Down Expand Up @@ -371,7 +368,7 @@ Contact us through the following mailing list.

### Report Issues or Submit Pull Request

If you meet any questions, feel free to file a 🔗[Jira Ticket](https://issues.apache.org/jira/projects/CELEBORN/issues) or connect us and fix it by submitting a 🔗[Pull Request](https://github.com/apache/incubator-celeborn/pulls).
If you meet any questions, feel free to file a 🔗[Jira Ticket](https://issues.apache.org/jira/projects/CELEBORN/issues) or connect us and fix it by submitting a 🔗[Pull Request](https://github.com/apache/celeborn/pulls).

| IM | Contact Info |
|:---------|:------------------------------------------------------------------------------------------------------------------------------------------|
Expand Down
1 change: 0 additions & 1 deletion build/make-distribution.sh
Expand Up @@ -390,7 +390,6 @@ cp "$PROJECT_DIR/docker/Dockerfile" "$DIST_DIR/docker"
cp -r "$PROJECT_DIR/charts" "$DIST_DIR"

# Copy license files
cp "$PROJECT_DIR/DISCLAIMER" "$DIST_DIR/DISCLAIMER"
if [[ -f $"$PROJECT_DIR/LICENSE-binary" ]]; then
cp "$PROJECT_DIR/LICENSE-binary" "$DIST_DIR/LICENSE"
cp -r "$PROJECT_DIR/licenses-binary" "$DIST_DIR/licenses"
Expand Down
4 changes: 2 additions & 2 deletions build/release/release.sh
Expand Up @@ -56,8 +56,8 @@ fi

RELEASE_TAG="v${RELEASE_VERSION}-rc${RELEASE_RC_NO}"

SVN_STAGING_REPO="https://dist.apache.org/repos/dist/dev/incubator/celeborn"
SVN_RELEASE_REPO="https://dist.apache.org/repos/dist/release/incubator/celeborn"
SVN_STAGING_REPO="https://dist.apache.org/repos/dist/dev/celeborn"
SVN_RELEASE_REPO="https://dist.apache.org/repos/dist/release/celeborn"

RELEASE_DIR="${PROJECT_DIR}/tmp"
SVN_STAGING_DIR="${PROJECT_DIR}/tmp/svn-dev"
Expand Down
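
The two URL changes above follow one pattern: the `incubator/` path segment is dropped from both the `dev` and the `release` tree. A small Python sketch of that mapping, for illustration only; `svn_repos` and its `incubating` flag are hypothetical and do not exist in `release.sh`.

```python
# Root of the ASF distribution repository, as it appears in the URLs above.
DIST_ROOT = "https://dist.apache.org/repos/dist"


def svn_repos(project, incubating=False):
    """Build the staging and release SVN URLs for a project.

    Incubating podlings live under an extra "incubator/" path segment;
    top-level projects sit directly under dev/ and release/.
    """
    prefix = "incubator/" if incubating else ""
    return {
        "staging": f"{DIST_ROOT}/dev/{prefix}{project}",
        "release": f"{DIST_ROOT}/release/{prefix}{project}",
    }
```

Calling `svn_repos("celeborn", incubating=True)` reproduces the removed values, and `svn_repos("celeborn")` reproduces the added ones.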
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
2 changes: 1 addition & 1 deletion client-mr/mr-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@

Apache Celeborn (Incubating)
Apache Celeborn
Copyright 2022-2024 The Apache Software Foundation.

This product includes software developed at
Expand Down
4 changes: 2 additions & 2 deletions dev/merge_pr.py
Expand Up @@ -64,8 +64,8 @@
GITHUB_OAUTH_KEY = os.environ.get("GITHUB_OAUTH_KEY")


GITHUB_BASE = "https://github.com/apache/incubator-celeborn/pull"
GITHUB_API_BASE = "https://api.github.com/repos/apache/incubator-celeborn"
GITHUB_BASE = "https://github.com/apache/celeborn/pull"
GITHUB_API_BASE = "https://api.github.com/repos/apache/celeborn"
JIRA_BASE = "https://issues.apache.org/jira/browse"
JIRA_API_BASE = "https://issues.apache.org/jira"
# Prefix added to temporary branches
Expand Down
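
Both constants changed above differ only in the repository slug. As a hedged illustration, and not how `merge_pr.py` is actually written, they could be derived from a single value so that a future rename touches one line; `github_urls` is a hypothetical helper.

```python
def github_urls(repo_slug):
    """Derive the pull-request and API base URLs from an "owner/name" slug."""
    return {
        "GITHUB_BASE": f"https://github.com/{repo_slug}/pull",
        "GITHUB_API_BASE": f"https://api.github.com/repos/{repo_slug}",
    }


# The values this commit introduces correspond to the slug "apache/celeborn".
urls = github_urls("apache/celeborn")
```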
27 changes: 16 additions & 11 deletions docs/README.md
Expand Up @@ -20,11 +20,11 @@ license: |
---
Quick Start
===
This documentation gives a quick start guide for running Apache Spark/Flink with Apache Celeborn™(Incubating).
This documentation gives a quick start guide for running Spark/Flink/MapReduce with Apache Celeborn™.

### Download Celeborn
Download the latest Celeborn binary from the [Downloading Page](https://celeborn.apache.org/download/).
Decompress the binary and set `$CELEBORN_HOME`
Decompress the binary and set `$CELEBORN_HOME`.
```shell
tar -C <DST_DIR> -zxvf apache-celeborn-<VERSION>-bin.tgz
export CELEBORN_HOME=<Decompressed path>
Expand All @@ -37,7 +37,7 @@ cd $CELEBORN_HOME/conf
cp log4j2.xml.template log4j2.xml
```
#### Configure Storage
Configure the directory to store shuffle data, for example `$CELEBORN_HOME/shuffle`
Configure the directory to store shuffle data, for example `$CELEBORN_HOME/shuffle`.
```shell
cd $CELEBORN_HOME/conf
echo "celeborn.worker.storage.dirs=$CELEBORN_HOME/shuffle" > celeborn-defaults.conf
Expand Down Expand Up @@ -154,11 +154,15 @@ INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success wit
```

## Start MapReduce With Celeborn
### Add Celeborn client jar to MapReduce's classpath
1.Add $CELEBORN_HOME/mr/*.jar to `mapreduce.application.classpath` and `yarn.application.classpath`.
2.Restart your yarn cluster.
### Add Celeborn configurations to MapReduce's conf
Modify `${HADOOP_CONF_DIR}/yarn-site.xml`
### Copy Celeborn Client to MapReduce's classpath
1. Copy `$CELEBORN_HOME/mr/*.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
```shell
cp $CELEBORN_HOME/mr/<Celeborn Client Jar> <mapreduce.application.classpath>
cp $CELEBORN_HOME/mr/<Celeborn Client Jar> <yarn.application.classpath>
```
2. Restart your yarn cluster.
### Add Celeborn configuration to MapReduce's conf
- Modify configurations in `${HADOOP_CONF_DIR}/yarn-site.xml`.
```xml
<configuration>
<property>
Expand All @@ -173,7 +177,7 @@ Modify `${HADOOP_CONF_DIR}/yarn-site.xml`
</property>
</configuration>
```
Modify `${HADOOP_CONF_DIR}/mapred-site.xml`
- Modify configurations in `${HADOOP_CONF_DIR}/mapred-site.xml`.
```xml
<configuration>
<property>
Expand All @@ -195,10 +199,11 @@ Modify `${HADOOP_CONF_DIR}/mapred-site.xml`
</property>
</configuration>
```
Then you can run a word count to check whether your configs are correct.
Then deploy the example word count to the running cluster for verifying whether above configurations are correct.
```shell
cd $HADOOP_HOME
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /sometext /someoutput

./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /someinput /someoutput
```
During the MapReduce Job, you should see the following message in Celeborn Master's log:
```log
Expand Down
14 changes: 7 additions & 7 deletions docs/developers/glutensupport.md
Expand Up @@ -19,9 +19,9 @@ license: |
# Gluten Support
## Velox Backend

[Gluten](https://github.com/oap-project/gluten) with velox backend supports Celeborn as remote shuffle service. Below introduction is used to enable this feature
[Gluten](https://github.com/apache/incubator-gluten) with velox backend supports Celeborn as remote shuffle service. Below introduction is used to enable this feature.

First refer to this URL(https://github.com/oap-project/gluten/blob/main/docs/get-started/Velox.md) to build Gluten with velox backend.
First refer to [Get Started With Velox](https://github.com/apache/incubator-gluten/blob/main/docs/get-started/Velox.md) to build Gluten with velox backend.

When compiling the Gluten Java module, it's required to enable `rss` profile, as follows:

Expand All @@ -31,18 +31,18 @@ mvn clean package -Pbackends-velox -Pspark-3.3 -Prss -DskipTests

Then add the Gluten and Spark Celeborn Client packages to your Spark application's classpath(usually add them into `$SPARK_HOME/jars`).

- Celeborn: celeborn-client-spark-3-shaded_2.12-0.3.0-incubating.jar
- Gluten: gluten-velox-bundle-spark3.x_2.12-xx-xx-SNAPSHOT.jar, gluten-thirdparty-lib-xx.jar
- Celeborn: `celeborn-client-spark-3-shaded_2.12-[celebornVersion].jar`
- Gluten: `gluten-velox-bundle-spark3.x_2.12-xx-xx-SNAPSHOT.jar`, `gluten-thirdparty-lib-xx.jar`

Currently to use Gluten following configurations are required in `spark-defaults.conf`
Currently, to use Gluten following configurations are required in `spark-defaults.conf`.

```
spark.shuffle.manager org.apache.spark.shuffle.gluten.celeborn.CelebornShuffleManager
# celeborn master
spark.celeborn.master.endpoints clb-master:9097
# we recommend set spark.celeborn.push.replicate.enabled to true to enable server-side data replication
# we recommend set `spark.celeborn.push.replicate.enabled` to true to enable server-side data replication
# If you have only one worker, this setting must be false
spark.celeborn.client.push.replicate.enabled true
Expand All @@ -52,7 +52,7 @@ spark.shuffle.service.enabled false
spark.sql.adaptive.localShuffleReader.enabled false
# If you want to use dynamic resource allocation,
# please refer to this URL (https://github.com/apache/incubator-celeborn/tree/main/assets/spark-patch) to apply the patch into your own Spark.
# please refer to this URL (https://github.com/apache/celeborn/tree/main/assets/spark-patch) to apply the patch into your own Spark.
spark.dynamicAllocation.enabled false
```

Expand Down