From 641a802e2c14ca1da8e1651fe46f6948c4104e21 Mon Sep 17 00:00:00 2001
From: SteNicholas
Date: Wed, 27 Mar 2024 13:54:47 +0800
Subject: [PATCH] [INFRA] Remove incubator/incubating for graduation

### What changes were proposed in this pull request?

Remove incubator/incubating for graduation including:

- Remove `incubator`/`Incubating`.
- Remove `DISCLAIMER` and corresponding link.
- Update Release scripts and template.

Fix #2415.

### Why are the changes needed?

The ASF board has approved a resolution to graduate Celeborn into a full Top Level Project. To transition from the Apache Incubator to a new TLP, there's a few action items we need to do to complete the transition.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

No.

Closes #2421 from SteNicholas/infra-graduation.

Authored-by: SteNicholas
Signed-off-by: mingji

(cherry picked from commit c9b878a2f5b719d5c5fe3d68cd4e43b53ec25c14)
Signed-off-by: SteNicholas
---
 DISCLAIMER                                    | 10 ----
 NOTICE                                        |  2 +-
 NOTICE-binary                                 |  2 +-
 README.md                                     | 55 +++++++++----------
 build/make-distribution.sh                    |  1 -
 build/release/release.sh                      |  4 +-
 .../src/main/resources/META-INF/NOTICE        |  2 +-
 .../src/main/resources/META-INF/NOTICE        |  2 +-
 .../src/main/resources/META-INF/NOTICE        |  2 +-
 .../src/main/resources/META-INF/NOTICE        |  2 +-
 .../src/main/resources/META-INF/NOTICE        |  2 +-
 .../src/main/resources/META-INF/NOTICE        |  2 +-
 .../src/main/resources/META-INF/NOTICE        |  2 +-
 dev/merge_pr.py                               |  4 +-
 docs/README.md                                | 27 +++++----
 docs/developers/glutensupport.md              | 14 ++---
 docs/developers/overview.md                   |  6 +-
 mkdocs.yml                                    | 20 ++-----
 project/CelebornBuild.scala                   |  4 +-
 19 files changed, 71 insertions(+), 92 deletions(-)
 delete mode 100644 DISCLAIMER

diff --git a/DISCLAIMER b/DISCLAIMER
deleted file mode 100644
index 0e5e17ddebc..00000000000
--- a/DISCLAIMER
+++ /dev/null
@@ -1,10 +0,0 @@
-Apache Celeborn (Incubating) is an effort undergoing incubation at the Apache
-Software Foundation (ASF), sponsored by the Apache Incubator
PMC.
-
-Incubation is required of all newly accepted projects until a further review
-indicates that the infrastructure, communications, and decision making process
-have stabilized in a manner consistent with other successful ASF projects.
-
-While incubation status is not necessarily a reflection of the completeness
-or stability of the code, it does indicate that the project has yet to be
-fully endorsed by the ASF.
diff --git a/NOTICE b/NOTICE
index 34ec3f608e9..b63ca7b1997 100644
--- a/NOTICE
+++ b/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/NOTICE-binary b/NOTICE-binary
index 8e59fff41a7..4d6fb045650 100644
--- a/NOTICE-binary
+++ b/NOTICE-binary
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/README.md b/README.md
index 0a9304c48ce..ca40effad05 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
-# Apache Celeborn (Incubating)
+# Apache Celeborn
-[![Celeborn CI](https://github.com/apache/incubator-celeborn/actions/workflows/maven.yml/badge.svg)](https://github.com/apache/incubator-celeborn/actions/workflows/maven.yml)
-Celeborn is dedicated to improving the efficiency and elasticity of
+[![Celeborn CI](https://github.com/apache/celeborn/actions/workflows/maven.yml/badge.svg)](https://github.com/apache/celeborn/actions/workflows/maven.yml)
+Celeborn (/ˈkeləbɔ:n/) is dedicated to improving the efficiency and elasticity of
 different map-reduce engines and provides an elastic, high-efficient management service for intermediate data including shuffle data, spilled data, result data, etc. Currently, Celeborn is focusing on shuffle data.
@@ -44,12 +44,12 @@ Celeborn worker's slot count decreases when a partition is allocated and increme
 1.
Celeborn supports Spark 2.4/3.0/3.1/3.2/3.3/3.4/3.5, Flink 1.14/1.15/1.17/1.18 and Hadoop MapReduce 2/3.
 2. Celeborn tested under Scala 2.11/2.12/2.13 and Java 8/11/17 environment.
-Build Celeborn
+Build Celeborn via `make-distribution.sh`:
 ```shell
 ./build/make-distribution.sh -Pspark-2.4/-Pspark-3.0/-Pspark-3.1/-Pspark-3.2/-Pspark-3.3/-Pspark-3.4/-Pflink-1.14/-Pflink-1.15/-Pflink-1.17/-Pflink-1.18/-Pmr
 ```
-package apache-celeborn-${project.version}-bin.tgz will be generated.
+Package `apache-celeborn-${project.version}-bin.tgz` will be generated.
 > **_NOTE:_** The following table indicates the compatibility of Celeborn Spark and Flink clients with different versions of Spark and Flink for various Java and Scala versions.
@@ -67,7 +67,7 @@ package apache-celeborn-${project.version}-bin.tgz will be generated.
 | Flink 1.17 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
 | Flink 1.18 | ❌ | ✔ | ✔ | ❌ | ❌ | ❌ | ❌ |
-To compile the client for Spark 2.4 with Scala 2.12, please use the following command
+To compile the client for Spark 2.4 with Scala 2.12, please use the following command:
 - Scala 2.12.8/2.12.9/2.12.10
 ```shell
@@ -107,8 +107,8 @@ Celeborn cluster composes of Master and Worker nodes, the Master supports both s
 ### Deploy Celeborn
 #### Deploy on host
-1. Unzip the tarball to `$CELEBORN_HOME`
-2. Modify environment variables in `$CELEBORN_HOME/conf/celeborn-env.sh`
+1. Unzip the tarball to `$CELEBORN_HOME`.
+2. Modify environment variables in `$CELEBORN_HOME/conf/celeborn-env.sh`.
 EXAMPLE:
 ```properties
@@ -117,7 +117,7 @@ CELEBORN_MASTER_MEMORY=4g
 CELEBORN_WORKER_MEMORY=2g
 CELEBORN_WORKER_OFFHEAP_MEMORY=4g
 ```
-3. Modify configurations in `$CELEBORN_HOME/conf/celeborn-defaults.conf`
+3. Modify configurations in `$CELEBORN_HOME/conf/celeborn-defaults.conf`.
 EXAMPLE: single master cluster
 ```properties
@@ -151,7 +151,7 @@ celeborn.worker.replicate.fastFail.duration 240s
 celeborn.storage.hdfs.kerberos.principal user@REALM
 celeborn.storage.hdfs.kerberos.keytab /path/to/user.keytab
-# If your hosts have disk raid or use lvm, set celeborn.worker.monitor.disk.enabled to false
+# If your hosts have disk raid or use lvm, set `celeborn.worker.monitor.disk.enabled` to false
 celeborn.worker.monitor.disk.enabled false
 ```
@@ -198,26 +198,24 @@ celeborn.worker.flusher.hdfs.buffer.size 4m
 celeborn.storage.hdfs.dir hdfs:///celeborn
 celeborn.worker.replicate.fastFail.duration 240s
-# If your hosts have disk raid or use lvm, set celeborn.worker.monitor.disk.enabled to false
+# If your hosts have disk raid or use lvm, set `celeborn.worker.monitor.disk.enabled` to false
 celeborn.worker.monitor.disk.enabled false
 ```
 Flink engine related configurations:
 ```properties
-# if you are using Celeborn for flink, these settings will be needed
+# If you are using Celeborn for flink, these settings will be needed.
 celeborn.worker.directMemoryRatioForReadBuffer 0.4
-celeborn.worker.directMemoryRatioToResume 0.6
-# these setting will affect performance.
+celeborn.worker.directMemoryRatioToResume 0.5
+# These setting will affect performance.
 # If there is enough off-heap memory, you can try to increase read buffers.
 # Read buffer max memory usage for a data partition is `taskmanager.memory.segment-size * readBuffersMax`
 celeborn.worker.partition.initial.readBuffersMin 512
 celeborn.worker.partition.initial.readBuffersMax 1024
 celeborn.worker.readBuffer.allocationWait 10ms
-# Currently, shuffle partitionSplit is not supported, so you should disable split in celeborn worker side or set `celeborn.client.shuffle.partitionSplit.threshold` to a high value in flink client side.
-celeborn.worker.shuffle.partitionSplit.enabled false
 ```
-4. Copy Celeborn and configurations to all nodes
+4. Copy Celeborn and configurations to all nodes.
 5. Start all services.
If you install Celeborn distribution in the same path on every node and your cluster can perform SSH login then you can fill `$CELEBORN_HOME/conf/hosts` and use `$CELEBORN_HOME/sbin/start-all.sh` to start all
@@ -250,14 +248,14 @@ WorkerRef: null
 Please refer to our [website](https://celeborn.apache.org/docs/latest/deploy_on_k8s/)
 ### Deploy Spark client
-Copy $CELEBORN_HOME/spark/*.jar to $SPARK_HOME/jars/
+Copy `$CELEBORN_HOME/spark/*.jar` to `$SPARK_HOME/jars/`.
 #### Spark Configuration
-To use Celeborn,the following spark configurations should be added.
+To use Celeborn, the following spark configurations should be added.
 ```properties
 # Shuffle manager class name changed in 0.3.0:
-# before 0.3.0: org.apache.spark.shuffle.celeborn.RssShuffleManager
-# since 0.3.0: org.apache.spark.shuffle.celeborn.SparkShuffleManager
+# before 0.3.0: `org.apache.spark.shuffle.celeborn.RssShuffleManager`
+# since 0.3.0: `org.apache.spark.shuffle.celeborn.SparkShuffleManager`
 spark.shuffle.manager org.apache.spark.shuffle.celeborn.SparkShuffleManager
 # must use kryo serializer because java serializer do not support relocation
 spark.serializer org.apache.spark.serializer.KryoSerializer
@@ -272,13 +270,13 @@ spark.shuffle.service.enabled false
 # Sort shuffle writer uses less memory than hash shuffle writer, if your shuffle partition count is large, try to use sort hash writer.
 spark.celeborn.client.spark.shuffle.writer hash
-# We recommend setting spark.celeborn.client.push.replicate.enabled to true to enable server-side data replication
+# We recommend setting `spark.celeborn.client.push.replicate.enabled` to true to enable server-side data replication
 # If you have only one worker, this setting must be false
 # If your Celeborn is using HDFS, it's recommended to set this setting to false
 spark.celeborn.client.push.replicate.enabled true
 # Support for Spark AQE only tested under Spark 3
-# we recommend setting localShuffleReader to false to get better performance of Celeborn
+# we recommend setting localShuffleReader to false for getting better performance of Celeborn
 spark.sql.adaptive.localShuffleReader.enabled false
 # If Celeborn is using HDFS
@@ -296,7 +294,7 @@ spark.dynamicAllocation.shuffleTracking.enabled false
 ```
 ### Deploy Flink client
-Copy $CELEBORN_HOME/flink/*.jar to $FLINK_HOME/lib/
+Copy `$CELEBORN_HOME/flink/*.jar` to `$FLINK_HOME/lib/`.
 #### Flink Configuration
 To use Celeborn, the following flink configurations should be added.
@@ -322,9 +320,9 @@ taskmanager.memory.task.off-heap.size: 512m
 ```
 **Note**: The config option `execution.batch-shuffle-mode` should configure as `ALL_EXCHANGES_BLOCKING`.
-### Deploy mapreduce client
-Add $CELEBORN_HOME/mr/*.jar to to `mapreduce.application.classpath` and `yarn.application.classpath`.
-And setting the following settings in YARN and MapReduce config.
+### Deploy MapReduce client
+Copy `$CELEBORN_HOME/mr/*.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
+Meanwhile, configure the following settings in YARN and MapReduce config.
 ```bash
 -Dyarn.app.mapreduce.am.job.recovery.enable=false
 -Dmapreduce.job.reduce.slowstart.completedmaps=1
@@ -334,7 +332,6 @@ And setting the following settings in YARN and MapReduce config.
 -Dmapreduce.job.reduce.shuffle.consumer.plugin.class=org.apache.hadoop.mapreduce.task.reduce.CelebornShuffleConsumer
 ```
-
 ### Best Practice
 If you want to set up a production-ready Celeborn cluster, your cluster should have at least 3 masters and at least 4 workers. Masters and works can be deployed on the same node but should not deploy multiple masters or workers on the same node.
@@ -371,7 +368,7 @@ Contact us through the following mailing list.
 ### Report Issues or Submit Pull Request
-If you meet any questions, feel free to file a 🔗[Jira Ticket](https://issues.apache.org/jira/projects/CELEBORN/issues) or connect us and fix it by submitting a 🔗[Pull Request](https://github.com/apache/incubator-celeborn/pulls).
+If you meet any questions, feel free to file a 🔗[Jira Ticket](https://issues.apache.org/jira/projects/CELEBORN/issues) or connect us and fix it by submitting a 🔗[Pull Request](https://github.com/apache/celeborn/pulls).
 | IM | Contact Info |
 |:---------|:------------------------------------------------------------------------------------------------------------------------------------------|
diff --git a/build/make-distribution.sh b/build/make-distribution.sh
index 5e9b983dc6f..afd787fc732 100755
--- a/build/make-distribution.sh
+++ b/build/make-distribution.sh
@@ -390,7 +390,6 @@ cp "$PROJECT_DIR/docker/Dockerfile" "$DIST_DIR/docker"
 cp -r "$PROJECT_DIR/charts" "$DIST_DIR"
 # Copy license files
-cp "$PROJECT_DIR/DISCLAIMER" "$DIST_DIR/DISCLAIMER"
 if [[ -f $"$PROJECT_DIR/LICENSE-binary" ]]; then
 cp "$PROJECT_DIR/LICENSE-binary" "$DIST_DIR/LICENSE"
 cp -r "$PROJECT_DIR/licenses-binary" "$DIST_DIR/licenses"
diff --git a/build/release/release.sh b/build/release/release.sh
index 48bb5093c21..6ed0a96e44a 100755
--- a/build/release/release.sh
+++ b/build/release/release.sh
@@ -56,8 +56,8 @@ fi
 RELEASE_TAG="v${RELEASE_VERSION}-rc${RELEASE_RC_NO}"
-SVN_STAGING_REPO="https://dist.apache.org/repos/dist/dev/incubator/celeborn"
-SVN_RELEASE_REPO="https://dist.apache.org/repos/dist/release/incubator/celeborn"
+SVN_STAGING_REPO="https://dist.apache.org/repos/dist/dev/celeborn"
+SVN_RELEASE_REPO="https://dist.apache.org/repos/dist/release/celeborn"
 RELEASE_DIR="${PROJECT_DIR}/tmp"
 SVN_STAGING_DIR="${PROJECT_DIR}/tmp/svn-dev"
diff --git a/client-flink/flink-1.14-shaded/src/main/resources/META-INF/NOTICE b/client-flink/flink-1.14-shaded/src/main/resources/META-INF/NOTICE
index 63b5024b0e1..43452a38afe 100644
--- a/client-flink/flink-1.14-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-flink/flink-1.14-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/client-flink/flink-1.15-shaded/src/main/resources/META-INF/NOTICE b/client-flink/flink-1.15-shaded/src/main/resources/META-INF/NOTICE
index 63b5024b0e1..43452a38afe 100644
--- a/client-flink/flink-1.15-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-flink/flink-1.15-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/client-flink/flink-1.17-shaded/src/main/resources/META-INF/NOTICE b/client-flink/flink-1.17-shaded/src/main/resources/META-INF/NOTICE
index 63b5024b0e1..43452a38afe 100644
--- a/client-flink/flink-1.17-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-flink/flink-1.17-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/client-flink/flink-1.18-shaded/src/main/resources/META-INF/NOTICE b/client-flink/flink-1.18-shaded/src/main/resources/META-INF/NOTICE
index 63b5024b0e1..43452a38afe 100644
--- a/client-flink/flink-1.18-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-flink/flink-1.18-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/client-mr/mr-shaded/src/main/resources/META-INF/NOTICE b/client-mr/mr-shaded/src/main/resources/META-INF/NOTICE
index 5b5319639e9..9a5437b44ea 100644
--- a/client-mr/mr-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-mr/mr-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/client-spark/spark-2-shaded/src/main/resources/META-INF/NOTICE b/client-spark/spark-2-shaded/src/main/resources/META-INF/NOTICE
index 1fd47fe3d9e..c48952d00d9 100644
--- a/client-spark/spark-2-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-spark/spark-2-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/client-spark/spark-3-shaded/src/main/resources/META-INF/NOTICE b/client-spark/spark-3-shaded/src/main/resources/META-INF/NOTICE
index 1fd47fe3d9e..c48952d00d9 100644
--- a/client-spark/spark-3-shaded/src/main/resources/META-INF/NOTICE
+++ b/client-spark/spark-3-shaded/src/main/resources/META-INF/NOTICE
@@ -1,5 +1,5 @@
-Apache Celeborn (Incubating)
+Apache Celeborn
 Copyright 2022-2024 The Apache Software Foundation.
 This product includes software developed at
diff --git a/dev/merge_pr.py b/dev/merge_pr.py
index f46370d5952..4794e62aa41 100755
--- a/dev/merge_pr.py
+++ b/dev/merge_pr.py
@@ -64,8 +64,8 @@ GITHUB_OAUTH_KEY = os.environ.get("GITHUB_OAUTH_KEY")
-GITHUB_BASE = "https://github.com/apache/incubator-celeborn/pull"
-GITHUB_API_BASE = "https://api.github.com/repos/apache/incubator-celeborn"
+GITHUB_BASE = "https://github.com/apache/celeborn/pull"
+GITHUB_API_BASE = "https://api.github.com/repos/apache/celeborn"
 JIRA_BASE = "https://issues.apache.org/jira/browse"
 JIRA_API_BASE = "https://issues.apache.org/jira"
 # Prefix added to temporary branches
diff --git a/docs/README.md b/docs/README.md
index 2187835ca31..125b6f4a845 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -20,11 +20,11 @@ license: |
 ---
 Quick Start
 ===
-This documentation gives a quick start guide for running Apache Spark/Flink with Apache Celeborn™(Incubating).
+This documentation gives a quick start guide for running Spark/Flink/MapReduce with Apache Celeborn™.
 ### Download Celeborn
 Download the latest Celeborn binary from the [Downloading Page](https://celeborn.apache.org/download/).
-Decompress the binary and set `$CELEBORN_HOME`
+Decompress the binary and set `$CELEBORN_HOME`.
 ```shell
 tar -C -zxvf apache-celeborn--bin.tgz
 export CELEBORN_HOME=
@@ -37,7 +37,7 @@ cd $CELEBORN_HOME/conf
 cp log4j2.xml.template log4j2.xml
 ```
 #### Configure Storage
-Configure the directory to store shuffle data, for example `$CELEBORN_HOME/shuffle`
+Configure the directory to store shuffle data, for example `$CELEBORN_HOME/shuffle`.
 ```shell
 cd $CELEBORN_HOME/conf
 echo "celeborn.worker.storage.dirs=$CELEBORN_HOME/shuffle" > celeborn-defaults.conf
@@ -154,11 +154,15 @@ INFO [async-reply] Controller: CommitFiles for local-1690000152711-0 success wit
 ```
 ## Start MapReduce With Celeborn
-### Add Celeborn client jar to MapReduce's classpath
-1.Add $CELEBORN_HOME/mr/*.jar to `mapreduce.application.classpath` and `yarn.application.classpath`.
-2.Restart your yarn cluster.
-### Add Celeborn configurations to MapReduce's conf
-Modify `${HADOOP_CONF_DIR}/yarn-site.xml`
+### Copy Celeborn Client to MapReduce's classpath
+1. Copy `$CELEBORN_HOME/mr/*.jar` into `mapreduce.application.classpath` and `yarn.application.classpath`.
+```shell
+cp $CELEBORN_HOME/mr/
+cp $CELEBORN_HOME/mr/
+```
+2. Restart your yarn cluster.
+### Add Celeborn configuration to MapReduce's conf
+- Modify configurations in `${HADOOP_CONF_DIR}/yarn-site.xml`.
 ```xml
@@ -173,7 +177,7 @@ Modify `${HADOOP_CONF_DIR}/yarn-site.xml`
 ```
-Modify `${HADOOP_CONF_DIR}/mapred-site.xml`
+- Modify configurations in `${HADOOP_CONF_DIR}/mapred-site.xml`.
 ```xml
@@ -195,10 +199,11 @@ Modify `${HADOOP_CONF_DIR}/mapred-site.xml`
 ```
-Then you can run a word count to check whether your configs are correct.
+Then deploy the example word count to the running cluster for verifying whether above configurations are correct.
 ```shell
 cd $HADOOP_HOME
-hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /sometext /someoutput
+
+./bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /someinput /someoutput
 ```
 During the MapReduce Job, you should see the following message in Celeborn Master's log:
 ```log
diff --git a/docs/developers/glutensupport.md b/docs/developers/glutensupport.md
index 3879e353f6e..6092caec98f 100644
--- a/docs/developers/glutensupport.md
+++ b/docs/developers/glutensupport.md
@@ -19,9 +19,9 @@ license: |
 # Gluten Support
 ## Velox Backend
-[Gluten](https://github.com/oap-project/gluten) with velox backend supports Celeborn as remote shuffle service. Below introduction is used to enable this feature
+[Gluten](https://github.com/apache/incubator-gluten) with velox backend supports Celeborn as remote shuffle service. Below introduction is used to enable this feature.
-First refer to this URL(https://github.com/oap-project/gluten/blob/main/docs/get-started/Velox.md) to build Gluten with velox backend.
+First refer to [Get Started With Velox](https://github.com/apache/incubator-gluten/blob/main/docs/get-started/Velox.md) to build Gluten with velox backend.
 When compiling the Gluten Java module, it's required to enable `rss` profile, as follows:
@@ -31,10 +31,10 @@ mvn clean package -Pbackends-velox -Pspark-3.3 -Prss -DskipTests
 Then add the Gluten and Spark Celeborn Client packages to your Spark application's classpath(usually add them into `$SPARK_HOME/jars`).
-- Celeborn: celeborn-client-spark-3-shaded_2.12-0.3.0-incubating.jar
-- Gluten: gluten-velox-bundle-spark3.x_2.12-xx-xx-SNAPSHOT.jar, gluten-thirdparty-lib-xx.jar
+- Celeborn: `celeborn-client-spark-3-shaded_2.12-[celebornVersion].jar`
+- Gluten: `gluten-velox-bundle-spark3.x_2.12-xx-xx-SNAPSHOT.jar`, `gluten-thirdparty-lib-xx.jar`
-Currently to use Gluten following configurations are required in `spark-defaults.conf`
+Currently, to use Gluten following configurations are required in `spark-defaults.conf`.
 ```
 spark.shuffle.manager org.apache.spark.shuffle.gluten.celeborn.CelebornShuffleManager
@@ -42,7 +42,7 @@ spark.shuffle.manager org.apache.spark.shuffle.gluten.celeborn.CelebornShuffleMa
 # celeborn master
 spark.celeborn.master.endpoints clb-master:9097
-# we recommend set spark.celeborn.push.replicate.enabled to true to enable server-side data replication
+# we recommend set `spark.celeborn.push.replicate.enabled` to true to enable server-side data replication
 # If you have only one worker, this setting must be false
 spark.celeborn.client.push.replicate.enabled true
@@ -52,7 +52,7 @@ spark.shuffle.service.enabled false
 spark.sql.adaptive.localShuffleReader.enabled false
 # If you want to use dynamic resource allocation,
-# please refer to this URL (https://github.com/apache/incubator-celeborn/tree/main/assets/spark-patch) to apply the patch into your own Spark.
+# please refer to this URL (https://github.com/apache/celeborn/tree/main/assets/spark-patch) to apply the patch into your own Spark.
 spark.dynamicAllocation.enabled false
 ```
diff --git a/docs/developers/overview.md b/docs/developers/overview.md
index 9617fa98974..a57948949eb 100644
--- a/docs/developers/overview.md
+++ b/docs/developers/overview.md
@@ -18,7 +18,7 @@ license: |
 # Celeborn Architecture
-This article introduces high level Apache Celeborn™(Incubating) Architecture. For more detailed description of each module/process,
+This article introduces high level Apache Celeborn™ Architecture.
For more detailed description of each module/process,
 please refer to dedicated articles.
 ## Why Celeborn
@@ -30,13 +30,13 @@ the disk and network inefficiency (M * N between Mappers and Reducers) in tradit
 Besides inefficiency, traditional shuffle framework requires large local storage in compute node to store shuffle data, thus blocks the adoption of disaggregated architecture.
-Apache Celeborn(Incubating) solves the problems by reorganizing shuffle data in a more efficient way, and storing the data in
+Apache Celeborn solves the problems by reorganizing shuffle data in a more efficient way, and storing the data in
 a separate service. The high level architecture of Celeborn is as follows:
 ![Celeborn](../../assets/img/celeborn.svg)
 ## Components
-Celeborn(Incubating) has three primary components: Master, Worker, and Client.
+Celeborn has three primary components: Master, Worker, and Client.
 - Master manages Celeborn cluster and achieves high availability(HA) based on Raft.
 - Worker processes read-write requests.
diff --git a/mkdocs.yml b/mkdocs.yml
index 63179b9a261..8dfa083883c 100644
--- a/mkdocs.yml
+++ b/mkdocs.yml
@@ -15,9 +15,9 @@
 # limitations under the License.
 #
-site_name: Apache Celeborn™ (Incubating)
-repo_name: apache/incubator-celeborn
-repo_url: https://gitbox.apache.org/repos/asf/incubator-celeborn.git
+site_name: Apache Celeborn™
+repo_name: apache/celeborn
+repo_url: https://gitbox.apache.org/repos/asf/celeborn.git
 plugins:
   - search
@@ -53,21 +53,9 @@ extra:
 - icon: fontawesome/brands/github
 copyright: >
- ApacheCon North America
-
- Copyright © 2022 The Apache Software Foundation
+ Copyright © 2024 The Apache Software Foundation,
 Licensed under the Apache License, Version 2.0.
 Privacy Policy
-
- Apache Celeborn™, Apache Incubator, Apache, the Apache feather logo, and the Apache Incubator project logo are
- trademarks or registered trademarks of The Apache Software Foundation.
-
- Apache Celeborn™ is an effort undergoing incubation at The Apache Software Foundation (ASF), sponsored by the
- Apache Incubator. Incubation is required of all newly accepted projects until a further review indicates that
- the infrastructure, communications, and decision making process have stabilized in a manner consistent with
- other successful ASF projects. While incubation status is not necessarily a reflection of the completeness or
- stability of the code, it does indicate that the project has yet to be fully endorsed by the ASF.
-
Please visit
Apache Software Foundation for more details.

diff --git a/project/CelebornBuild.scala b/project/CelebornBuild.scala
index 0a4575b67cf..80b84b13b43 100644
--- a/project/CelebornBuild.scala
+++ b/project/CelebornBuild.scala
@@ -255,8 +255,8 @@ object CelebornCommonSettings {
 pomExtra :=
 https://celeborn.apache.org/
- git@github.com:apache/incubator-celeborn.git
- scm:git:git@github.com:apache/incubator-celeborn.git
+ git@github.com:apache/celeborn.git
+ scm:git:git@github.com:apache/celeborn.git
 )
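
Reviewer note, not part of the patch: since the change is a mechanical rename across many files, a small scan for leftovers may help when verifying the result. The sketch below is an assumption-laden illustration: the function name `find_incubator_refs` and the regular expression are hypothetical and do not exist in the Celeborn repository. The pattern is deliberately limited to Celeborn-specific strings, so the legitimate `apache/incubator-gluten` links introduced in `docs/developers/glutensupport.md` are not flagged.

```python
import re

# Hypothetical pattern (an assumption, not taken from the patch): matches only
# Celeborn-specific incubator leftovers such as the old repository name, the
# old SVN dist path, and the "(Incubating)" suffix after the project name.
LEFTOVER = re.compile(
    r"incubator-celeborn|/incubator/celeborn|Celeborn\S*\s*\(Incubating\)"
)


def find_incubator_refs(text):
    """Return (line_number, line) pairs that still use the incubator naming."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if LEFTOVER.search(line)
    ]


if __name__ == "__main__":
    # Illustrative input only; in practice the text would be read from a file.
    sample = "\n".join(
        [
            'GITHUB_BASE = "https://github.com/apache/incubator-celeborn/pull"',
            "[Gluten](https://github.com/apache/incubator-gluten)",
            "site_name: Apache Celeborn™ (Incubating)",
        ]
    )
    for number, line in find_incubator_refs(sample):
        print(f"{number}: {line}")
```

Working on text rather than on file paths keeps the helper easy to test; wrapping it in a directory walk is left to whoever runs it.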