This repository has been archived by the owner on May 8, 2024. It is now read-only.
Releases: douban/dpark
Release 0.5.0
API change
- Remove module-level API like dpark.textFile.
- Support Streaming shuffle and Disk shuffle (experimental, compatible).
Fixes
- Bug when parsing mfs chunk info.
Improvement
- Better broadcast impl using shared memory for tasks on the same slave to reduce memory cost.
- Better offer-matching logic for MesosScheduler, which now remembers bad slaves.
- Refactor: style and layout.
New Feature
- Multi segment dump to save memory.
- Gather statistics for each stage.
- Support running tests/test_rdd on mesos.
- Add colorful progress bar for dpark.
- Support mesos role.
- Support multi named mesos master in conf.
- Loghub for admin.
Release 0.4.2
- Support Python3 & PyPy
- Support MooseFS 3.x & refactor on file-system interface
Release 0.4.1
- Enhancement for the containerizer in DPark
- Use broadcast when parallelizing a big dataset
- Fix missing line bug for bzip2 files
- Add TopByKey in RDD
- Fix other minor bugs
Release 0.4.0
- Bugfix: deserialize error of old-style class.
- Refactor beansdb RDD
- Web UI support for dpark
- Use pymesos >= 0.2.0
- Eager serialize values of ParallelCollection