João Leitão's Homepage: Research
NG-STORAGE Research Project
Project Name: NG-STORAGE: New Generation of data STORage And manaGement systEms
Funding Institution: Fundação para a Ciência e Tecnologia
Project Reference: PTDC/CCI-INF/32038/2017
Principal Investigator: João Carlos Antunes Leitão
Leading Institution: NOVA.ID.FCT - Associação para a Inovação e Desenvolvimento da FCT (NOVA.ID.FCT/FCT/UNL)
Other Institutions:
INESC-ID - Instituto de Engenharia de Sistemas e Computadores: Investigação e Desenvolvimento em Lisboa
NOVA SST - NOVA School of Science and Technology (Faculdade de Ciências e Tecnologia - Universidade NOVA de Lisboa)
NOVA LINCS - NOVA Laboratory for Informatics and Computer Science
Técnico - Instituto Superior Técnico - Universidade de Lisboa
Project Duration: 01-10-2018 to 30-09-2022
Total Funding: 239.502,56 €

Project Abstract
Data management, including data replication, is a central aspect in the design of most distributed systems and information systems, ranging from web applications to emerging Internet of Things (IoT) applications. Distributed systems keep growing in scale and in relevance to everyday life, driven by increasing popularity, user base size, data volumes, and diversity of data provenance. As a result, achieving flexible and efficient partial replication (where replicas only store and handle operations for a subset of all data) is becoming an essential aspect that must be tackled by modern data storage and management systems. The imminent availability of smart 5G towers further stresses the need for more efficient and flexible data storage solutions.
Unfortunately, existing solutions for achieving partial replication in large-scale data storage and management systems are scarce. They are either based on a centralised design, which hinders scalability, or they lack flexibility: they support only a single consistency model and offer no dynamic, automatic mechanisms to control the life-cycle of replicas so as to ensure adequate performance under runtime factors such as dynamic workloads.
In this project we propose to tackle the emergent challenges in the design and implementation of modern data storage and management systems regarding the needs of global-scale distributed systems, such as social networks and emerging large-scale IoT applications, that are expected to generate and manipulate an unprecedented amount of data. To address this need, we plan to focus on three complementary aspects of data storage and management systems: i) designing systems that can provide flexible and efficient partial replication with a large number of replicas, potentially scattered among datacenters and edge devices, such as set-top boxes or smart 5G towers that will soon become available; ii) enriching the design of the data storage and management system with mechanisms to automatically control the life-cycle of replicas, spawning and decommissioning replicas as needed according to the workload and data access patterns; and iii) empowering application developers to specify different consistency guarantees for different data objects, improving the use of the infrastructure with benefits in quality of service for end users and lower operational costs for application operators. Based on the results of this research, we plan to build prototypes that will enable us to assess the benefits that emerge from the use of data storage and management systems with these characteristics.
In particular, we plan to focus on two distinct use cases, both in the context of geo-distributed systems: web applications, in particular social network systems that manage large quantities of user data, and IoT applications.

Research Team
Publications
Book Chapters
Achieving Low Latency Transactions for Geo-Replicated Storage with Blotter.
Journals
Omega: a Secure Event Ordering Service for the Edge.
International Conferences
Deduplication vs Privacy Tradeoffs in Cloud Storage.
Babel: A Framework for Developing Performant and Dependable Distributed Protocols.
High Throughput Replication with Integrated Membership Management
Engage: Session Guarantees for the Edge
Enriching Kademlia by Partitioning
TESRAC: A Framework for Test Suite Reduction Assessment at Scale
Generalizing Wireless Ad Hoc Routing for Future Edge Applications
Cathode: A Consistency-Aware Data Placement Algorithm for the Edge.
Overlay Networks for Edge Management
Causality Tracking Tradeoffs for Distributed Storage.
Omega: a Secure Event Ordering Service for the Edge.
Enabling Wireless Ad Hoc Edge Systems with Yggdrasil
Revisiting Broadcast Algorithms for Wireless Edge Networks
Efficient Synchronization of State-based CRDTs
National Conferences
Compreender os compromissos entre algoritmos de coerência causal através de simulação.
Estudo prático de um sistema descentralizado: IPFS.
Emulador de Redes para Validação Empírica de Algoritmos Distribuídos
Ataques de Frequência em Deduplicação Cifrada na Nuvem.
Difusão Causal Flexível e Escalável para Replicação na Periferia
ResEst - Algoritmo Distribuído para a Inferência de Recursos da Rede
P-KAD: Enriquecer o Kademlia com Particionamento (Oral Presentation Only)
Generalizando o Encaminhamento Ad Hoc Sem-fios para Futuras Aplicações na Berma (Oral Presentation Only)
Technical Reports
Babel: A Framework for Developing Performant and Dependable Distributed Protocols
Theses
Rodrigo Silva
Vítor Hugo Menino
Ema Vieira
João Monteiro
Bruno Anjos
Nuno Morais
David Romão
André Rosa
Pedro Silvestre
Hugo Guerreiro
Khrystyna Fedyuk
Paulo Ricardo Moita
Prototypes and Software
Babel Framework Prototype
This prototype is fully functional and can be used through Maven. Additional details are provided in the project's public repository, which can be accessed at https://github.com/pfouto/babel-core.
Prototype of Overlay Networks for Edge Management
This prototype includes several types of fully functional distributed protocols employed in the experimental work of the paper Overlay Networks for Edge Management by Pedro Ákos Costa et al., listed above. The code is accessible at https://github.com/pedroAkos/EdgeOverlayNetworks.
ChainPaxos Prototype
This is the prototype employed in the experimental work reported in the paper High Throughput Replication with Integrated Membership Management by Pedro Fouto et al., listed above. The prototype was developed using the Babel Framework and was awarded two verification stamps by the USENIX Annual Technical Conference 2022 artifact evaluation committee. The prototype can be found at https://github.com/pfouto/chain.
Engage Prototype
This is the prototype employed in the experimental work reported in the paper Engage: Session Guarantees for the Edge by Miguel Belém et al., listed above. The prototype is available at https://github.com/pfouto/engage
Integration of Engage with Cassandra Prototype
This is the prototype that integrates the Engage solution with the Cassandra open-source distributed data storage system. The prototype is available at https://github.com/pfouto/engage-cassandra

Events
The project team was involved in the organization of the Encontro Nacional de Sistemas Distribuídos, the first Portuguese meeting dedicated to the distributed systems research community. Some of the results of this project were presented at this event.

Pedagogical Activities
The Babel framework, developed in the context of the NG-STORAGE project, has been employed as the development environment across several editions of Master's-level courses at the Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa. The courses and editions are listed below. Students in these courses have used Babel to develop several types of distributed protocols and systems, including decentralized protocols, agreement protocols, and replicated data storage systems, among others.
In addition to this, the Babel framework has been frequently used for the development of prototypes of distributed systems and protocols in the context of the research activities of the Computer Systems Group of the NOVA LINCS research laboratory.