Tachyon 0.1.0

@haoyuan haoyuan released this 21 Oct 21:23

Tachyon enables high-throughput memory sharing between different jobs/queries and cloud computing frameworks, such as Spark and MapReduce. Tachyon caches datasets in memory and lets different jobs, queries, and frameworks access the cached datasets at memory speed. It thus avoids going to disk to load datasets that are frequently read. It has the following main use cases:

  • Different jobs/queries (Spark/Shark/StreamingSpark/Hadoop) accessing the same datasets read them directly from memory, except for the first access, when the datasets are loaded from disk.
  • If a job that has read a dataset crashes, the restarted job does not need to read the dataset from disk again; it can read it at memory speed from Tachyon.
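
The two use cases above can be illustrated with a toy read-through cache. This is a minimal sketch in Python under stated assumptions, not Tachyon's API: `DatasetCache`, `run_job`, and `demo` are hypothetical names, and a single in-process object stands in for a cache that outlives individual jobs.

```python
import os
import tempfile


class DatasetCache:
    """Toy read-through cache; illustrates the idea only, not Tachyon's API."""

    def __init__(self):
        self._store = {}     # path -> dataset bytes held in memory
        self.disk_reads = 0  # how many times a dataset had to be loaded from disk

    def read(self, path):
        # Only the first access to a path goes to disk; later reads hit memory.
        if path not in self._store:
            with open(path, "rb") as f:
                self._store[path] = f.read()
            self.disk_reads += 1
        return self._store[path]


def run_job(cache, path):
    """Hypothetical job: count the lines in a dataset read through the cache."""
    return cache.read(path).count(b"\n")


def demo():
    # Write a small dataset to a temporary file that plays the role of "disk".
    with tempfile.NamedTemporaryFile("wb", delete=False) as f:
        f.write(b"a\nb\nc\n")
        path = f.name
    try:
        cache = DatasetCache()
        first = run_job(cache, path)   # first job: loads the dataset from disk
        second = run_job(cache, path)  # second or restarted job: served from memory
        return first, second, cache.disk_reads
    finally:
        os.remove(path)
```

Calling `demo()` runs the same job twice against one cache. The `disk_reads` counter shows whether the second run, which stands in for either a different job or a restarted one, had to touch the disk.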