
Ma[R]s

Memory as a [Reconfigurable] service for Distributed Applications

Distributed Shared Memory
atomicity
reconfiguration
Efficient, Robust, Practical
[MaaS] Memory as a Service
PhD in Industry

Advancing Strongly Consistent Distributed Shared Memory Systems (DSM)

Supervised by experts from academia (University of Cyprus) and industry (Algolysis Ltd), our PhD student is dedicated to advancing research and industrial development in strongly consistent Distributed Shared Memory (DSM) systems.

Dual focus

Ma[R]s in Theory:
Advancing knowledge in the field of DSM by exploring efficient, robust, and practical solutions for highly dynamic environments.

Ma[R]s in Practice:
Creating a Memory-as-a-Service (MaaS) platform for deploying and managing DSM, supporting next-gen distributed applications. Real-world validation: testing DSM algorithms on stationary servers and on less powerful peer-to-peer devices.

Motivation - How Ma[R]s makes a difference

The data explosion and the increasing complexity of computational problems have driven the demand for distributed applications in recent years. Distributed applications enable organizations to harness the power of multiple computers, or nodes, in a distributed system, enhancing scalability, fault tolerance, and performance. Decentralized computation, exemplified by technologies like blockchain, promotes transparency and trust. Designing and building dependable distributed applications is challenging, however, because of communication complexities such as asynchrony, message delays and losses, and node failures.

DSMs are fundamental for creating complex, decentralized cloud applications in emerging technologies (e.g., IoT, VR/AR): they can offer a transparent cloud memory space where distributed applications store, retrieve, and coordinate over shared data. An Atomic DSM (ADSM) provides the illusion of a sequential memory space over asynchronous, fail-prone, message-passing nodes, simplifying the development process. An ADSM keeps copies of the data in multiple network locations (replica hosts or servers) to ensure data availability and survivability. The main challenge arises when those copies are accessed concurrently by different processes or clients: how can we ensure that the copies remain consistent?

Several ADSM algorithms have been proposed, some for static sets of replica hosts and others for dynamic sets. Although these algorithms are theoretically proven to satisfy atomicity guarantees, only a handful address practical concerns (e.g., support for large data objects, operation speed, and liveness), and their scalability has not been closely examined.

Main Outcomes

Latency Efficiency:
Design and implement latency-efficient solutions for dynamic ADSM algorithms.

Reconfiguration Timing:
Strategies specifying when and how an ADSM service should reconfigure.

Web Platform:
A web platform to be used as a portal for the deployment of ADSM services on a set of networked devices, the management of deployed memories, and the access to memory data.

Project Objectives

Building on state-of-the-art dynamic ADSM algorithms (e.g., CoARESf), the first objective is to devise methodologies that reduce the latency of read and write operations, making dynamic ADSM services more practical and attractive for commercial use. To reach this goal we will: (i) use distributed tracing tools within existing PoC implementations to identify performance bottlenecks during executions on the Emulab testbed, and (ii) propose solutions for eliminating (or at least reducing) the identified bottlenecks. Ideas for the optimizations can be drawn from other works that use reconfiguration mechanisms, such as RAMBO.

Utilizing the reconfiguration mechanism offered by the ADSM services, in this objective we will design and analyze (smart) Reconfiguration Orchestration Strategies (ROS) that determine when reconfigurations should be invoked on the ADSM service and how the membership of the ADSM service should change. Proposed approaches will specify which environmental parameters may affect the decisions on when and how to reconfigure, and reconfiguration decisions will rely on (existing) tools that monitor these parameters, eliminating or minimizing human intervention. These parameters will include, for example, network connectivity, server health, and storage capacity. The developed service will be designed to interact with any dynamic ADSM service via the reconfiguration mechanism.
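To make the idea of a reconfiguration strategy concrete, the following is a minimal sketch of one possible decision rule, not the strategy the project will adopt. It assumes a hypothetical `ServerMetrics` snapshot carrying the three example parameters named above (connectivity, health, storage) and a hypothetical `min_free_storage` threshold. The function answers both questions at once: it returns `None` when no reconfiguration is needed, and otherwise the proposed new membership.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ServerMetrics:
    """Hypothetical per-server snapshot, as a monitoring tool might report it."""
    reachable: bool            # network connectivity
    healthy: bool              # result of a server health check
    free_storage_ratio: float  # fraction of storage still free, in [0, 1]


def propose_configuration(
    current: frozenset[str],
    metrics: dict[str, ServerMetrics],
    min_free_storage: float = 0.10,  # assumed threshold, for illustration only
) -> Optional[frozenset[str]]:
    """Return a new member set, or None when the current one can stay."""

    def usable(server_id: str) -> bool:
        m = metrics.get(server_id)
        # A server with no report at all is treated as unusable.
        return (
            m is not None
            and m.reachable
            and m.healthy
            and m.free_storage_ratio >= min_free_storage
        )

    keep = {s for s in current if usable(s)}
    if keep == current:
        return None  # "when": every member is fine, so do not reconfigure

    # "how": replace each dropped member with a usable spare, if one exists.
    spares = sorted(s for s in metrics if s not in current and usable(s))
    proposed = frozenset(keep | set(spares[: len(current) - len(keep)]))

    # Never propose an empty configuration; keep the current one instead.
    return proposed or None
```

For example, with members `s1`, `s2`, `s3`, an unreachable `s2`, and a usable spare `s4`, the sketch proposes `{s1, s3, s4}`. A real strategy would also need to weigh the cost of the reconfiguration itself, which this rule ignores.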

The focus of this objective is to bring the developments in Objectives O1 and O2 together, yielding a dynamic ADSM service with ROS support. More precisely, we will implement an ADSM service based on the algorithms proposed in O1, and a Reconfiguration Orchestration Module (ROM) as an external service based on the outcomes of O2. The two services will be integrated through an API that exposes the reconfiguration operation of the ADSM service. An API will also be used for the interaction of the ROM service with the different monitoring tools, which will be deployed to collect the data the ROM needs to generate reconfiguration decisions. The integrated service will be deployed on a real testbed (e.g., Emulab) and evaluated against the following characteristics: (i) scalability in terms of the concurrent users and object sizes the service can support, (ii) operation speed (especially compared with previous ADSM algorithms), and (iii) fault tolerance and liveness, mainly testing the reaction of the ROM service under harsh environmental conditions.
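The integration described above can be sketched as two small interfaces and a loop. Everything here is an assumption made for illustration: the names `ReconfigurableMemory`, `Monitor`, `ReconfigurationOrchestrator`, `members`, `reconfig`, and `collect` are hypothetical and are not the project's API. The point is only that the ROM sees the ADSM service through its reconfiguration operation and the monitoring tools through a uniform read call, and that the decision logic plugs in as a separate function.

```python
from typing import Callable, Iterable, Mapping, Optional, Protocol

# Server id -> parameter name -> value (e.g. {"s1": {"up": 1.0}}).
Readings = Mapping[str, Mapping[str, float]]
# Decision logic: current members and readings in, new members or None out.
Strategy = Callable[[frozenset, Readings], Optional[frozenset]]


class ReconfigurableMemory(Protocol):
    """Hypothetical slice of an ADSM service's API that the ROM relies on."""
    def members(self) -> frozenset: ...
    def reconfig(self, new_members: frozenset) -> None: ...


class Monitor(Protocol):
    """Hypothetical adapter around one monitoring tool."""
    def collect(self) -> Readings: ...


class ReconfigurationOrchestrator:
    """External ROM: gathers readings, asks the strategy, invokes reconfig."""

    def __init__(self, memory: ReconfigurableMemory,
                 monitors: Iterable[Monitor], strategy: Strategy) -> None:
        self._memory = memory
        self._monitors = list(monitors)
        self._strategy = strategy

    def step(self) -> bool:
        """Run one decision round; return True if a reconfiguration was invoked."""
        merged: dict = {}
        for monitor in self._monitors:
            for server, params in monitor.collect().items():
                merged.setdefault(server, {}).update(params)
        current = self._memory.members()
        proposal = self._strategy(current, merged)
        if proposal is None or proposal == current:
            return False
        self._memory.reconfig(proposal)
        return True


class InMemoryMemory:
    """Stand-in for a real ADSM service, for local experiments only."""
    def __init__(self, members: frozenset) -> None:
        self._members = members

    def members(self) -> frozenset:
        return self._members

    def reconfig(self, new_members: frozenset) -> None:
        self._members = new_members


class StaticMonitor:
    """Stand-in monitor that always reports the same readings."""
    def __init__(self, readings: Readings) -> None:
        self._readings = readings

    def collect(self) -> Readings:
        return self._readings
```

Because the orchestrator depends only on the two protocols, it could in principle be pointed at any dynamic ADSM service that exposes a reconfiguration operation. A deployed ROM would call `step` periodically or in response to monitoring events.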

The final objective will involve the design and implementation of a web platform to be used as a portal for the deployment, management, and access of ADSM instances. The platform will act as a wrapper that allows users, through an intuitive UI appealing to a wide class of users, to: (i) deploy new instances of an ADSM service with ROS support by specifying a set of hosting devices; (ii) manage existing instances by getting an overview of various service parameters (e.g., hosts, memory size, memory objects); and (iii) access existing instances for reading and writing data objects, either through the platform directly or through a third-party application using appropriate security tokens. The platform will be evaluated in terms of the number of supported users, UI scalability as the number of users and ADSM instances grows, and UI usability.

Dissemination

Branding Guides

View

Logo Assets

Download

Dissemination Poster

View