João Leitão's Homepage : Research

NG-STORAGE Research Project

Project Name: NG-STORAGE: New Generation of data STORage And manaGement systEms
Funding Institution: Fundação para a Ciência e Tecnologia
Project Reference: PTDC/CCI-INF/32038/2017
Principal Investigator: João Carlos Antunes Leitão
Leading Institution:
NOVA.ID.FCT - Associação para a Inovação e Desenvolvimento da FCT (NOVA.ID.FCT/FCT/UNL)
Other Institutions:
INESC-ID - Instituto de Engenharia de Sistemas e Computadores: Investigação e Desenvolvimento em Lisboa
NOVA SST - NOVA School of Science and Technology (Faculdade de Ciências e Tecnologia - Universidade NOVA de Lisboa)
NOVA LINCS - NOVA Laboratory for Informatics and Computer Science
Técnico - Instituto Superior Técnico - Universidade de Lisboa
Project Duration: 01-10-2018 to 30-09-2022
Total Funding: 239.502,56 €

Project Abstract

Data management, including data replication, is a central aspect in the design of most distributed systems and information systems, ranging from web applications to emerging Internet of Things (IoT) applications. As distributed systems grow in scale and relevance in everyday life, driven by increasing popularity, user base size, data volumes, and diversity of data provenance, flexible and efficient partial replication (where replicas only store and handle operations for a subset of all data) is becoming an essential aspect that modern data storage and management systems must address. The imminent availability of smart 5G towers further stresses the need for more efficient and flexible data storage solutions. Unfortunately, existing solutions for partial replication in large-scale data storage and management systems are scarce. They either rely on a centralised design, which hinders scalability, or lack flexibility: they support only a single consistency model and offer no dynamic, automatic mechanisms to control the life-cycle of replicas so as to ensure adequate performance under runtime factors such as dynamic workloads.

In this project we propose to tackle the emerging challenges in the design and implementation of modern data storage and management systems for global-scale distributed systems, such as social networks and emerging large-scale IoT applications, which are expected to generate and manipulate an unprecedented amount of data. To address this need, we plan to focus on three complementary aspects of data storage and management systems: i) designing systems that can provide flexible and efficient partial replication with a large number of replicas, potentially scattered among datacenters and edge devices, such as set-top boxes or the smart 5G towers that will soon become available; ii) enriching the design of the data storage and management system with mechanisms to automatically control the life-cycle of replicas, spawning and decommissioning replicas as needed according to the workload and data access patterns; and iii) empowering application developers to specify different consistency guarantees for different data objects, improving the use of the infrastructure with benefits in quality of service for end users and lower operational costs for application operators.

Based on the results of this research, we plan to build prototypes that will enable us to assess the benefits of data storage and management systems with these characteristics. In particular, we plan to focus on two distinct use cases, both in the context of geo-distributed systems: web applications, in particular social network systems that manage large quantities of user data, and IoT applications.

Research Team

| Name | Institution | Role |
| João Leitão | NOVA SST and NOVA LINCS | Principal Investigator |
| Luís Rodrigues | Técnico and INESC-ID | Co-Principal Investigator |
| Nuno Preguiça | NOVA SST and NOVA LINCS | Researcher |
| Carla Ferreira | NOVA SST and NOVA LINCS | Researcher |
| João Lourenço | NOVA SST and NOVA LINCS | Researcher |
| João Nuno Silva | Técnico and INESC-ID | Researcher |
| Sérgio Duarte | NOVA SST and NOVA LINCS | Researcher |
| Miguel Matos | Técnico and INESC-ID | Researcher |
| Rui André Oliveira | NOVA SST and NOVA LINCS | Hired Researcher |
| Pedro Fouto | NOVA SST and NOVA LINCS | PhD Student |
| Pedro Ákos Costa | NOVA SST and NOVA LINCS | PhD Student |
| Albert van der Linde | NOVA SST and NOVA LINCS | PhD Student |
| André Rosa | NOVA SST and NOVA LINCS | MSc Student |
| Angel Gestoso | Técnico and INESC-ID | PhD Student |
| Cláudio Correia | Técnico and INESC-ID | PhD Student |
| Ema Vieira | NOVA SST and NOVA LINCS | MSc Student |
| Vítor Menino | NOVA SST and NOVA LINCS | MSc Student |
| André Atalaia | NOVA SST and NOVA LINCS | MSc Student |
| João Monteiro | NOVA SST and NOVA LINCS | MSc Student |
| João Antão | NOVA SST and NOVA LINCS | MSc Student |
| Sofia Braz | NOVA SST and NOVA LINCS | MSc Student |
| Pedro Coelho | NOVA SST and NOVA LINCS | Undergraduate Student |

Publications

Book Chapters

Achieving Low Latency Transactions for Geo-Replicated Storage with Blotter.
H. Moniz, J. Leitão, R. J. Dias, J. Gehrke, N. Preguiça, and R. Rodrigues.
In: Zomaya, A., Taheri, J., Sakr, S. (eds) Encyclopedia of Big Data Technologies (2nd Edition). Springer, Cham. 2022.
https://doi.org/10.1007/978-3-319-63962-8_158-2

Journals

Omega: a Secure Event Ordering Service for the Edge.
C. Correia, M. Correia, and L. Rodrigues.
IEEE Transactions on Dependable and Secure Computing. Volume 19, Issue 5, September 2022.

International Conferences

Deduplication vs Privacy Tradeoffs in Cloud Storage.
R. Silva, C. Correia, M. Correia and L. Rodrigues.
Proceedings of the 38th ACM/SIGAPP Symposium On Applied Computing (SAC), Tallinn, Estonia, March 2023.
(to appear)

Babel: A Framework for Developing Performant and Dependable Distributed Protocols.
Pedro Fouto, Pedro Ákos Costa, Nuno Preguiça, and João Leitão.
Proceedings of the 41st International Symposium on Reliable Distributed Systems (SRDS 2022), September 19-22, Vienna, Austria, 2022.

High Throughput Replication with Integrated Membership Management
Pedro Fouto, Nuno Preguiça, and João Leitão.
Proceedings of the 2022 USENIX Annual Technical Conference (USENIX ATC'22), July 11-13, Carlsbad, CA, USA.

Engage: Session Guarantees for the Edge
Miguel Belém, Pedro Fouto, Taras Lykhenko, João Leitão, Nuno Preguiça, and Luís Rodrigues.
Proceedings of the 31st International Conference on Computer Communication and Networks (ICCCN 2022). Virtual Conference, July 2022.

Enriching Kademlia by Partitioning
João Monteiro, Pedro Ákos Costa, João Leitão, Alfonso de la Rocha, and Yiannis Psaras.
Proceedings of the 1st Workshop on Decentralized Internet, Networks, Protocols, and Systems (DINPS'22), co-located with ICDCS. Bologna, Italy, July 2022.

TESRAC: A Framework for Test Suite Reduction Assessment at Scale
João Becho, Frederico Cerveira, João Leitão, and Rui A. Oliveira.
Proceedings of the 15th IEEE International Conference on Software Testing, Verification and Validation (ICST'22), Virtual Event, April 2022.

Generalizing Wireless Ad Hoc Routing for Future Edge Applications
André Rosa, Pedro Ákos Costa, and João Leitão.
Proceedings of the EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services (MobiQuitous'21), Japan, November 2021.

Cathode: A Consistency-Aware Data Placement Algorithm for the Edge.
L. Epifâneo, C. Correia and L. Rodrigues.
Proceedings of the 20th IEEE International Symposium on Network Computing and Applications (NCA 2021), Online, November, 2021.

Overlay Networks for Edge Management
P. Á. Costa, P. Fouto, and J. Leitão.
Proceedings of the 19th IEEE International Symposium on Network Computing and Applications (NCA 2020). November 24-27, 2020. Online Conference.

Causality Tracking Tradeoffs for Distributed Storage.
H. Guerreiro, L. Rodrigues, N. Preguiça and N. Quental.
Proceedings of the 19th IEEE International Symposium on Network Computing and Applications (NCA), Online, November 2020.

Omega: a Secure Event Ordering Service for the Edge.
C. Correia, L. Rodrigues, and M. Correia.
Proceedings of the 50th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), Valencia, Spain, June 2020.

Enabling Wireless Ad Hoc Edge Systems with Yggdrasil
P. Á. Costa, A. Rosa, and J. Leitão.
Proceedings of the 35th ACM/SIGAPP Symposium On Applied Computing (SAC), Brno, Czech Republic, March 30-April 3, 2020.

Revisiting Broadcast Algorithms for Wireless Edge Networks
A. Rosa, P. Á. Costa, and J. Leitão.
Proceedings of the 38th IEEE International Symposium on Reliable Distributed Systems (SRDS 2019), Lyon, France, October 2019.

Efficient Synchronization of State-based CRDTs
V. Enes, P. S. Almeida, C. Baquero, and J. Leitão.
Proceedings of the 35th IEEE International Conference on Data Engineering (ICDE 2019). 8-12 April, 2019. Macau, China.

National Conferences

Compreender os compromissos entre algoritmos de coerência causal através de simulação.
António Duarte, Pedro Fouto, João Leitão, and Nuno Preguiça.
Actas do décimo terceiro Simpósio de Informática, Guarda, Portugal, Sep 2022.

Estudo prático de um sistema descentralizado: IPFS.
Diogo Fona, Pedro Ákos Costa, and João Leitão.
Actas do décimo terceiro Simpósio de Informática, Guarda, Portugal, Sep 2022.

Emulador de Redes para Validação Empírica de Algoritmos Distribuídos
Diogo Almeida, Pedro Fouto, Pedro Ákos Costa, and João Leitão.
Actas do décimo terceiro Simpósio de Informática, Guarda, Portugal, Sep 2022.

Ataques de Frequência em Deduplicação Cifrada na Nuvem.
R. Silva, C. Correia, M. Correia and L. Rodrigues.
Actas do décimo terceiro Simpósio de Informática, Guarda, Portugal, Sep 2022.

Difusão Causal Flexível e Escalável para Replicação na Periferia
Ema Vieira, Pedro Fouto, Nuno Preguiça, and João Leitão.
Actas do décimo segundo Simpósio de Informática, Lisboa, Portugal, Sep 2021.

ResEst - Algoritmo Distribuído para a Inferência de Recursos da Rede
Vítor Hugo Menino, Pedro Ákos Costa, and João Leitão.
Actas do décimo segundo Simpósio de Informática, Lisboa, Portugal, Sep 2021.

P-KAD: Enriquecer o Kademlia com Particionamento (Oral Presentation Only)
João Monteiro, Pedro Ákos Costa, Alfonso de la Rocha, Yiannis Psaras, and João Leitão.
Actas do décimo segundo Simpósio de Informática, Lisboa, Portugal, Sep 2021.

Generalizando o Encaminhamento Ad Hoc Sem-fios para Futuras Aplicações na Berma (Oral Presentation Only)
André Rosa, Pedro Ákos Costa, and João Leitão.
Actas do décimo segundo Simpósio de Informática, Lisboa, Portugal, Sep 2021.

Technical Reports

Babel: A Framework for Developing Performant and Dependable Distributed Protocols
Pedro Fouto, Pedro Ákos Costa, Nuno Preguiça, and João Leitão.
Technical Report. April 2022.

Theses

Rodrigo Silva
Instituto Superior Técnico, Universidade de Lisboa.
MSc Thesis (2022): Deduplication vs Privacy Tradeoffs in Cloud Storage

Vítor Hugo Menino
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2022): A Novel Approach to Load Balancing in P2P Overlay Networks for Edge Systems

Ema Vieira
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2022): ECO SYNC TREE: A Causal and Dynamic Broadcast Tree for Edge-based Replication

João Monteiro
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2022): A Multi Level DHT approach through Hierarchical Naming

Bruno Anjos
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2021): LowNimbus: A decentralized autonomic cloud to edge deployment framework

Nuno Morais
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2021): DeMMon: Decentralized Management and Monitoring Framework

David Romão
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2021): Dynamic Data Placement in Cloud/Edge Environments

André Rosa
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2021): Communication Primitives For Wireless Ad Hoc Networks

Pedro Silvestre
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2021): Clonos: Consistent High-Availability for Distributed Stream Processing through Causal Logging

Hugo Guerreiro
Instituto Superior Técnico, Universidade de Lisboa.
MSc Thesis (2020): Causality Tracking Trade-offs for Distributed Storage

Khrystyna Fedyuk
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2020): Sheik: Dynamic Location and Binding of Microservices for Cloud/Edge Settings

Paulo Ricardo Moita
Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa.
Intermediate Report
MSc Thesis (2020): Modular and Adaptive Key-Value Storage Systems

Prototypes and Software

Babel Framework Prototype: This prototype is fully functional and can be easily used through Maven. Additional details are provided in the project's public repository, which can be accessed at https://github.com/pfouto/babel-core.

Prototype of Overlay Networks for Edge Management: This prototype includes several types of fully functional distributed protocols employed in the experimental work of the paper Overlay Networks for Edge Management by Pedro Ákos Costa et al. listed above. The code is accessible at https://github.com/pedroAkos/EdgeOverlayNetworks.

ChainPaxos Prototype: This is the prototype employed in the experimental work reported in the paper High Throughput Replication with Integrated Membership Management by Pedro Fouto et al. listed above. The prototype was developed using the Babel Framework and was awarded two verification stamps by the USENIX Annual Technical Conference 2022 artifact evaluation committee. The prototype can be found at https://github.com/pfouto/chain.

Engage Prototype: This is the prototype employed in the experimental work reported in the paper Engage: Session Guarantees for the Edge by Miguel Belém et al. listed above. The prototype is available at https://github.com/pfouto/engage

Integration of Engage with Cassandra Prototype: This is the prototype that integrates the Engage solution with the Cassandra open-source distributed data storage system. The prototype is available at https://github.com/pfouto/engage-cassandra

Events

The project team was involved in the organization of the Encontro Nacional de Sistemas Distribuídos, the first Portuguese meeting dedicated to the distributed systems research community. Some of the results of this project were presented at this event.

Pedagogical Activities

The Babel framework, developed in the context of the NG-STORAGE project, has been employed as the development environment across several editions of Master's-level courses at the Faculdade de Ciências e Tecnologia, Universidade NOVA de Lisboa. The courses and editions are listed below. Students in these courses have used Babel to develop several types of distributed protocols and systems, including decentralized protocols, agreement protocols, and replicated data storage systems, among others.

  • Algorithms and Distributed Systems 2020/21.
  • Algorithms and Distributed Systems 2021/22.
  • Algorithms and Distributed Systems 2022/23.
  • Dependable Distributed Systems 2022/23.

In addition, the Babel framework has been frequently used to develop prototypes of distributed systems and protocols in the research activities of the Computer Systems Group of the NOVA LINCS research laboratory.