Deduplication algorithms and models for efficient data storage - Laboratoire des sciences et techniques de l'information, de la communication et de la connaissance, UBO site
Conference paper, Year: 2020

Deduplication algorithms and models for efficient data storage

Abstract

This paper addresses data deduplication algorithms and models that reduce the amount of data both transmitted over the network and stored in data systems. Specifically, we consider the case where replicas of an original file are generated by edit errors, and we adopt a theoretical approach to explore data files. Our study applies to primary, backup, or archival storage. We introduce a new variable-length block-level deduplication algorithm that outperforms prior work and reduces computational complexity by focusing on pivots. We provide a theoretical comparative analysis of the algorithm's computational costs, together with experimental results that evaluate its performance. The proposed deduplication solution improves on prior approaches in terms of cost and achieves the same rates as brute-force or naive methods.
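To make the idea of variable-length block-level deduplication concrete, the following is a minimal sketch of generic content-defined chunking followed by hash-based chunk storage. It is not the pivot-based algorithm proposed in the paper, whose details are not given in this abstract. The window size, boundary mask, hash parameters, and the function names `chunk`, `deduplicate`, and `restore` are all illustrative assumptions.

```python
import hashlib
import random

# All constants below are illustrative assumptions, not values from the paper.
WINDOW = 16            # number of bytes covered by the rolling hash
MASK = (1 << 6) - 1    # cut when the low 6 bits are zero (about 64-byte chunks)
BASE = 257
MOD = (1 << 31) - 1


def chunk(data: bytes) -> list:
    """Split data at content-defined boundaries using a polynomial rolling hash."""
    chunks, start, h = [], 0, 0
    top = pow(BASE, WINDOW - 1, MOD)  # weight of the byte that leaves the window
    for i, b in enumerate(data):
        if i >= WINDOW:
            h = (h - data[i - WINDOW] * top) % MOD  # drop the oldest byte
        h = (h * BASE + b) % MOD                    # add the newest byte
        # A boundary depends only on the last WINDOW bytes, so boundaries
        # realign shortly after an insertion or deletion.
        if i + 1 >= WINDOW and (h & MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def deduplicate(files):
    """Store each distinct chunk once; return the store and one recipe per file."""
    store, recipes = {}, []
    for data in files:
        recipe = []
        for c in chunk(data):
            digest = hashlib.sha256(c).hexdigest()
            store.setdefault(digest, c)  # keep only the first copy of a chunk
            recipe.append(digest)
        recipes.append(recipe)
    return store, recipes


def restore(store, recipe):
    """Rebuild a file from its list of chunk digests."""
    return b"".join(store[d] for d in recipe)


# A replica produced by a single edit (an insertion) in the original file.
rng = random.Random(0)
original = bytes(rng.randrange(256) for _ in range(4096))
replica = original[:1000] + b"EDIT" + original[1000:]

store, recipes = deduplicate([original, replica])
stored_bytes = sum(len(c) for c in store.values())
print(len(original) + len(replica), "bytes in,", stored_bytes, "bytes stored")
```

Because chunk boundaries are derived from the content rather than from fixed offsets, the insertion only disturbs the chunks near the edit, and the remaining chunks of the replica are already present in the store. A fixed-length scheme would instead shift every block after the insertion point.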
Main file
CSCC2020_deduplication.pdf (230.55 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-03976889, version 1 (07-02-2023)

Identifiers

HAL Id: hal-03976889
DOI: 10.1109/CSCC49995.2020.00013

Cite

Laura Conde-Canencia, Belaid Hamoum. Deduplication algorithms and models for efficient data storage. 2020 24th International Conference on Circuits, Systems, Communications and Computers (CSCC), Jul 2020, Chania (Virtual), Greece. pp.23-28, ⟨10.1109/CSCC49995.2020.00013⟩. ⟨hal-03976889⟩
