Alluxio is an open-source virtual distributed file system (VDFS). It began as the research project "Tachyon" at the University of California, Berkeley's AMPLab, where it was created as Haoyuan Li's Ph.D. thesis, advised by Professors Scott Shenker and Ion Stoica. Alluxio sits between computation and storage in the big data analytics stack. It provides a data abstraction layer for computation frameworks, enabling applications to connect to numerous storage systems through a common interface. The software is published under the Apache License.
|Original author(s)||Haoyuan Li|
|Developer(s)||UC Berkeley AMPLab|
|Initial release||April 8, 2013|
|Stable release||v2.4.1 / November 20, 2020|
|Operating system||macOS, Linux|
|License||Apache License 2.0|
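The "common interface" over numerous storage systems described in the introduction can be pictured as a single virtual namespace in which path prefixes map to different backing stores. The following is a toy illustration of that idea only, not Alluxio's actual implementation or API: the `MountTable` class, its methods, and the example URIs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class MountTable:
    """Toy mount table: virtual path prefix -> backing-store URI (hypothetical)."""

    mounts: dict[str, str] = field(default_factory=dict)

    def mount(self, virtual_prefix: str, backing_uri: str) -> None:
        # Normalise trailing slashes so "/logs/" and "/logs" are the same mount.
        self.mounts[virtual_prefix.rstrip("/") or "/"] = backing_uri.rstrip("/")

    def resolve(self, virtual_path: str) -> str:
        # Choose the longest mounted prefix that contains the path.
        best = None
        for prefix in self.mounts:
            inside = (
                prefix == "/"
                or virtual_path == prefix
                or virtual_path.startswith(prefix + "/")
            )
            if inside and (best is None or len(prefix) > len(best)):
                best = prefix
        if best is None:
            raise KeyError(f"no mount covers {virtual_path}")
        # Keep the part of the path below the mount point.
        remainder = virtual_path if best == "/" else virtual_path[len(best):]
        return self.mounts[best] + remainder


# Example wiring with made-up storage locations.
table = MountTable()
table.mount("/", "hdfs://namenode.example:9000")
table.mount("/logs", "s3://example-bucket/logs")
```

With this wiring, an application that only knows the virtual path `/logs/app.txt` is directed to the object store, while every other path falls through to the HDFS root mount.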
Data-driven applications, such as data analytics, machine learning, and AI, use APIs provided by Alluxio (such as the Hadoop HDFS API, S3 API, and FUSE API) to interact with data from various storage systems at high speed. Popular frameworks running on top of Alluxio include Presto, Apache Spark, Apache Hive, and TensorFlow.
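Because a FUSE mount presents a file system as an ordinary local directory, an application using the FUSE API needs no special client library. The sketch below assumes an Alluxio FUSE mount at the hypothetical path `/mnt/alluxio-fuse`; the real location depends on how the mount was configured, and the function name is illustrative.

```python
from pathlib import Path

# Hypothetical mount point; adjust to wherever the FUSE mount actually lives.
ALLUXIO_FUSE_ROOT = Path("/mnt/alluxio-fuse")


def count_lines(relative_path: str, root: Path = ALLUXIO_FUSE_ROOT) -> int:
    """Count the lines of a text file using plain POSIX file I/O."""
    with open(root / relative_path, "r", encoding="utf-8") as handle:
        return sum(1 for _ in handle)
```

The same function works unchanged against any local directory, which is the point of exposing storage through a POSIX-style interface.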
Alluxio can be deployed on-premises, in the cloud (e.g. Microsoft Azure, AWS, Google Compute Engine), or in a hybrid cloud environment. It can run on bare metal or in containerized environments such as Kubernetes, Docker, and Apache Mesos.
Alluxio was started by Haoyuan Li at UC Berkeley's AMPLab in 2013 and open-sourced in 2014. In 2018 it had more than 1,000 contributors, making it one of the most active projects in the data ecosystem.
|Version||Original release date||Latest version||Release date||Status|
|0.2||2013-04-08||0.2.1||2013-04-25||Old version, no longer maintained|
|0.3||2013-10-21||0.3.0||2013-10-21||Old version, no longer maintained|
|0.4||2014-02-02||0.4.1||2014-02-25||Old version, no longer maintained|
|0.5||2014-07-20||0.5.0||2014-07-20||Old version, no longer maintained|
|0.6||2015-03-01||0.6.4||2015-04-23||Old version, no longer maintained|
|0.7||2015-07-17||0.7.1||2015-08-10||Old version, no longer maintained|
|0.8||2015-10-21||0.8.2||2015-11-10||Old version, no longer maintained|
|1.0||2016-02-23||1.0.1||2016-03-27||Old version, no longer maintained|
|1.1||2016-06-06||1.1.1||2016-07-04||Old version, no longer maintained|
|1.2||2016-07-17||1.2.0||2016-07-17||Old version, no longer maintained|
|1.3||2016-10-05||1.3.0||2016-10-05||Old version, no longer maintained|
|1.4||2017-01-12||1.4.0||2017-01-12||Old version, no longer maintained|
|1.5||2017-06-11||1.5.0||2017-06-11||Old version, no longer maintained|
|1.6||2017-09-24||1.6.1||2017-11-02||Old version, no longer maintained|
|1.7||2018-01-14||1.7.1||2018-03-26||Old version, no longer maintained|
|1.8||2018-07-07||1.8.2||2019-08-05||Older version, yet still maintained|
|2.0||2019-06-27||2.0.1||2019-09-03||Older version, yet still maintained|
|2.1||2019-11-06||2.1.2||2020-02-04||Older version, yet still maintained|
|2.2||2020-03-11||2.2.2||2020-06-24||Older version, yet still maintained|
|2.3||2020-06-30||2.3.0||2020-06-30||Older version, yet still maintained|
|2.4||2020-10-19||2.4.1||2020-11-20||Current stable version|
Enterprises that use Alluxio
The following is a list of notable enterprises that have used or are using Alluxio: