COLLABORATE

Developing a Strongly Consistent, Long-Lived, Fault-Tolerant, Distributed Storage System with a Failure Prediction Mechanism

Distributed Storage Systems (DSS) are the technology behind modern cloud data storage services such as Dropbox and Google Drive, which millions of users rely on as networked platforms for collaborative applications and data storage. Algorithms for DSS ensure data availability and survivability by replicating data in geographically dispersed network locations. However, a major problem with data distribution is consistency, especially when the storage is accessed concurrently by multiple processes, which is key to enabling collaboration. Numerous strategies have been devised to mitigate this problem; however, a robust and efficient solution remains elusive. In this project, we propose a novel atomic DSS built on top of asynchronous, message-passing, failure-prone commodity devices. Ultimately, the project aims to lead to the implementation of a DSS with the following characteristics.

Strong Consistency (Atomicity)

Despite the existence of concurrent operations, asynchrony, and node failures, our goal is to design algorithms for read/write objects that guarantee that each read operation returns a value no older than the value written by its latest preceding write and no older than the one returned by any preceding read. Such a consistency guarantee is known as Atomicity. Atomicity is the most natural consistency guarantee, as it provides the illusion of a centralized, sequentially accessed storage.
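To make the guarantee concrete, the following is a minimal sketch of one well-known way to obtain atomic reads and writes from replicated hosts: tagging each value and contacting a majority of replicas in two phases, in the style of the ABD algorithm. It is not the project's own algorithm. The names Replica and Register are hypothetical, the replicas live in one process, and operations run one at a time, so the sketch shows only the tag and quorum logic, not real message passing or concurrency.

```python
import random
from dataclasses import dataclass


@dataclass
class Replica:
    """One storage host (hypothetical in-memory stand-in for a networked node)."""
    tag: tuple = (0, 0)   # (timestamp, writer id); a larger tag means a newer value
    value: object = None
    crashed: bool = False

    def query(self):
        return (self.tag, self.value)

    def update(self, tag, value):
        # Keep only the newest value ever seen by this replica.
        if tag > self.tag:
            self.tag, self.value = tag, value


class Register:
    """Read/write object replicated over n hosts, using majority quorums."""

    def __init__(self, n):
        self.replicas = [Replica() for _ in range(n)]
        self.majority = n // 2 + 1

    def _quorum(self):
        # Assumption: a real client would wait for replies; here we raise instead.
        alive = [r for r in self.replicas if not r.crashed]
        if len(alive) < self.majority:
            raise RuntimeError("no majority of replicas is reachable")
        return random.sample(alive, self.majority)

    def write(self, writer_id, value):
        # Phase 1: learn the highest timestamp held by some majority.
        max_ts = max(r.query()[0][0] for r in self._quorum())
        tag = (max_ts + 1, writer_id)
        # Phase 2: store the new tagged value on a majority.
        for r in self._quorum():
            r.update(tag, value)

    def read(self):
        # Phase 1: find the newest tagged value held by some majority.
        tag, value = max((r.query() for r in self._quorum()), key=lambda tv: tv[0])
        # Phase 2: write it back, so that no later read returns an older value.
        for r in self._quorum():
            r.update(tag, value)
        return value
```

Because any two majorities intersect, a read always meets at least one replica that holds the latest completed write. With five replicas, the register in this sketch keeps working after two crashes and stops serving operations once a third replica fails.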

Fault Tolerance

The service will guarantee that read/write operations terminate despite transient or persistent failures of data hosts in the system. In this project, we focus on crash failures.

Long Liveness

To ensure that persistent faults will not affect the operation of the service in the future, the service will implement mechanisms to remove faulty data hosts, insert new healthy alternatives, and migrate the data, giving clients a seamless, uninterrupted experience. Such mechanisms are known as reconfigurations, since they update the membership of the host nodes.
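The three steps named above (remove, insert, migrate) can be illustrated with a small sketch. It assumes each host stores a tagged value, that a majority of the old hosts is still healthy, and that nothing else runs during the change. The function name reconfigure and the host names are hypothetical; a real reconfiguration protocol must also cope with concurrent reads and writes and must make all participants agree on the next configuration, which this sketch leaves out.

```python
def reconfigure(stores, faulty, spares):
    """Return the new membership as a dict: host name -> (tag, value).

    stores: dict mapping each current host to the (tag, value) pair it holds
    faulty: set of host names to remove
    spares: list of healthy host names available as replacements
    """
    healthy = {h: tv for h, tv in stores.items() if h not in faulty}
    # A majority of the old hosts is needed to be sure the latest value is seen.
    if len(healthy) < len(stores) // 2 + 1:
        raise RuntimeError("too few healthy hosts to migrate the data safely")
    # Migrate: the value with the largest tag is the newest one.
    latest = max(healthy.values(), key=lambda tv: tv[0])
    # Insert: top the membership back up to its previous size from the spares.
    missing = len(stores) - len(healthy)
    new_hosts = list(healthy) + list(spares)[:missing]
    return {h: latest for h in new_hosts}
```

For example, with hosts a, b, and c, where b is faulty and d is a spare, the call returns a membership of a, c, and d, all holding the newest value.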

Failure Prediction

It is one thing to reconfigure and another to know when to reconfigure. The last characteristic of the service is the use of Machine Learning algorithms to predict which storage devices are about to fail. This will make it possible to determine which hosts will become unavailable, and thus how the service needs to reconfigure to maintain functionality.
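As an illustration of what such a predictor might look like, the sketch below trains a tiny logistic regression classifier with plain gradient descent. The source does not say which Machine Learning algorithm or which device metrics the project uses, so both are assumptions here: the two features stand for hypothetical health indicators scaled to the range 0 to 1, and the training samples are made-up toy values, not measurements.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train(samples, labels, lr=0.5, epochs=2000):
    """Fit logistic regression weights with batch gradient descent."""
    n_features = len(samples[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(samples, labels):
            # Prediction error for this sample under the current weights.
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j, xj in enumerate(x):
                grad_w[j] += err * xj
            grad_b += err
        w = [wi - lr * g / len(samples) for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / len(samples)
    return w, b


def failure_probability(model, x):
    """Estimated probability that a device with features x is about to fail."""
    w, b = model
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


# Toy, made-up samples: (indicator 1, indicator 2), both scaled to [0, 1].
healthy = [(0.0, 0.1), (0.1, 0.0), (0.2, 0.1)]
failing = [(0.8, 0.9), (0.9, 0.7), (1.0, 0.8)]
model = train(healthy + failing, [0, 0, 0, 1, 1, 1])

# A host whose estimated probability exceeds 0.5 is a candidate for removal.
at_risk = failure_probability(model, (0.9, 0.8)) > 0.5
```

A service could feed predictions of this kind into the reconfiguration mechanism and replace a host before it actually crashes. The 0.5 threshold is an arbitrary choice for the sketch.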

Minimum Viable Prototype

Essentially, we would like to devise an efficient prototype of an atomic distributed storage system by combining the following key services:

Distributed Object Management,
Data Fragmentation,
Object Reconfiguration, and
Failure Prediction

10 Nov

Final Project Workshop

The University of Cyprus hosted the Final Workshop of the project. The workshop included three presentations, which demonstrated the overall activities and outcomes of the project. A very successful event that drew the interest of the audience to the project…

17 Jun

Presentation at IMDEA Networks Seminar Series

As part of the dissemination activities of the project, we were excited to present our work at the IMDEA Networks Seminar Series to a diverse and international audience. Unfortunately, COVID-19 forced the event to be hosted virtually. Fruitful discussion…

05 Dec

1st Working Meeting for COLLABORATE

The 1st Working Meeting for the project COLLABORATE took place on the premises of UCY from 25/11 to 29/11. It was an intense, fruitful, and productive working week with our partners at the University of Cyprus, Assoc. Professor Chryssis Georgiou,…

30 Oct

Collaborate in Paideia-News

The activities and goals of the project Collaborate have been published in the online magazine Paideia-News: https://tinyurl.com/yy35enep

01 Jul

Collaborate announced on IMDEA’s webpage

The IMDEA Networks Institute announced the beginning of the project COLLABORATE on its website: https://www.networks.imdea.org/research/projects/collaborate
