The RDD paper defines two kinds of dependencies:
narrow dependencies, where each partition of the parent RDD is used by at most one partition of the child RDD
wide dependencies, where multiple child partitions may depend on each partition of the parent RDD
However, the definition of dependencies in the chapter JobLogicalPlan is different:
NarrowDependency: each partition of the child RDD fully depends on a small number of partitions of its parent RDD. "Fully depends" (i.e., FullDependency) means that a child partition depends on the entire parent partition.
ShuffleDependency: multiple child partitions partially depend on a parent partition. "Partially depends" (i.e., PartialDependency) means that each child partition depends on only a part of the parent partition.
This makes me really confused. Are ShuffleDependency and wide dependency the same thing?
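To make concrete where the two wordings can disagree, here is a toy sketch in Python. It is not Spark code: the edge representation and the function names `is_narrow_per_paper` and `is_narrow_per_chapter` are made up for illustration, and the chapter's "small number of partitions" condition is left out for simplicity.

```python
# Hypothetical toy model, not Spark's API. A dependency is a list of edges:
# (child_partition, parent_partition, child_reads_whole_parent_partition)

def is_narrow_per_paper(edges):
    """Paper wording: each parent partition is used by at most one child partition."""
    children_of = {}
    for child, parent, _ in edges:
        children_of.setdefault(parent, set()).add(child)
    return all(len(children) <= 1 for children in children_of.values())

def is_narrow_per_chapter(edges):
    """Chapter wording: every child partition reads its parent partitions in full."""
    return all(reads_whole for _, _, reads_whole in edges)

# One-to-one, full reads (a map-like shape).
one_to_one = [(0, 0, True), (1, 1, True)]

# Every child reads a part of every parent partition (a shuffle-like shape).
partial_fanout = [(c, p, False) for c in range(2) for p in range(2)]

# Assumed edge case: every child reads every parent partition in full.
full_fanout = [(c, p, True) for c in range(2) for p in range(2)]

for name, edges in [("one_to_one", one_to_one),
                    ("partial_fanout", partial_fanout),
                    ("full_fanout", full_fanout)]:
    print(name, is_narrow_per_paper(edges), is_narrow_per_chapter(edges))
```

In this model the two wordings agree on the first two shapes and differ only on the third, where parent partitions are shared by several children but each is read in full.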
Source of the narrow/wide definitions: the paper Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing