Spark Release 0.6.1
Spark 0.6.1 is a maintenance release that contains several important bug fixes and performance improvements. You can download it as a source package (2.4 MB tar.gz) or a prebuilt package (48 MB tar.gz).
The fixes and improvements in this version include:
- Fixed overly aggressive message timeouts that could cause workers to disconnect from the cluster
- Fixed a bug in the standalone deploy mode where hostnames were not exposed to the scheduler, which affected HDFS locality
- Improved connection reuse in shuffle, which can greatly speed up small shuffles (contributed by Reynold Xin)
- Fixed some potential deadlocks in the block manager (contributed by Tathagata Das)
- Fixed a bug in getting the IDs of failed hosts from Mesos (contributed by Imran Rashid)
- Several EC2 script improvements, like better handling of spot instances (contributed by Josh Rosen)
- Made the local IP address that Spark binds to customizable (contributed by Mikhail Bautin)
- Support for Hadoop 2 distributions (contributed by Thomas Dudziak)
- Support for locating Scala on Debian distributions (contributed by Thomas Dudziak)
- Improved standalone cluster web UI to show more information about jobs
- Added an option to spread out jobs over the standalone cluster instead of concentrating them on a small number of nodes (spark.deploy.spreadOut)
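To illustrate the difference between spreading a job out and concentrating it on a few nodes, here is a minimal sketch in Python. It is a toy model only, not Spark's scheduler code: the function `allocate_cores`, its parameters, and the worker names are all hypothetical, and the core-by-core round-robin is an assumption chosen for simplicity.

```python
# Illustrative only: a toy model of core allocation, not Spark's scheduler.
# allocate_cores and the worker names are hypothetical.
def allocate_cores(free_cores, cores_needed, spread_out):
    """Return {worker: cores_assigned} for a single job.

    free_cores   -- dict mapping a worker name to its number of free cores
    cores_needed -- total number of cores the job asks for
    spread_out   -- True: hand out one core at a time, round-robin;
                    False: fill each worker before moving to the next
    """
    assigned = {worker: 0 for worker in free_cores}
    # Never try to hand out more cores than the cluster has free.
    remaining = min(cores_needed, sum(free_cores.values()))
    workers = list(free_cores)

    if spread_out:
        i = 0
        while remaining > 0:
            worker = workers[i % len(workers)]
            if assigned[worker] < free_cores[worker]:
                assigned[worker] += 1
                remaining -= 1
            i += 1
    else:
        for worker in workers:
            take = min(free_cores[worker], remaining)
            assigned[worker] = take
            remaining -= take

    # Report only the workers that actually received cores.
    return {w: c for w, c in assigned.items() if c > 0}


cluster = {"worker1": 4, "worker2": 4, "worker3": 4}
print(allocate_cores(cluster, 6, spread_out=True))
# {'worker1': 2, 'worker2': 2, 'worker3': 2}
print(allocate_cores(cluster, 6, spread_out=False))
# {'worker1': 4, 'worker2': 2}
```

In this model, spreading out touches every worker, which tends to help when input data is distributed across all nodes, while concentrating leaves whole workers free for other jobs.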
We recommend that all Spark 0.6 users update to this maintenance release.