Óbudai Egyetem Digitális Archívum

Big Data Deduplication in Data Lake

Hlavacka_Bobak_Hluchy_151.pdf (690.3Kb)
URI
http://hdl.handle.net/20.500.14044/32350
Collections
  • Acta Polytechnica Hungarica [175]
Abstract
Data lakes are a next-generation technology for processing and storing big data. As with any new technology, they bring new challenges. One of them is the occurrence of duplicate data in storage. This paper addresses that challenge during the data ingestion phase, where it is currently overlooked or insufficiently addressed. The first part discusses the design of a suitable data lake architecture and a deduplication workflow for processing structured and unstructured data. The proposed solution is then evaluated through experiments covering a flexible deduplication window, the scalability of the solution, the choice of a suitable hash function, and the advantages of an in-memory pointer repository.
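To make the terms in the abstract concrete, the following is a minimal illustrative sketch of hash-based deduplication at ingestion time. It is not the authors' implementation: the class name `DedupIndex`, the `ingest` method, the storage paths, the choice of SHA-256, and the reading of the "deduplication window" as a bound on the number of remembered hashes are all assumptions made for illustration only.

```python
import hashlib
from collections import OrderedDict


class DedupIndex:
    """Hypothetical in-memory pointer repository: content hash -> storage pointer.

    Assumption: the "deduplication window" is modelled as the maximum number
    of most recently seen hashes kept in memory. The paper may define it
    differently.
    """

    def __init__(self, window):
        if window < 1:
            raise ValueError("window must be at least 1")
        self.window = window
        self._pointers = OrderedDict()

    def ingest(self, payload, location):
        """Return (pointer, is_duplicate) for an incoming object."""
        # SHA-256 is an assumed choice; the paper evaluates hash functions itself.
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self._pointers:
            # Known content: refresh its position and reuse the stored pointer.
            self._pointers.move_to_end(digest)
            return self._pointers[digest], True
        # New content: remember where it is stored.
        self._pointers[digest] = location
        if len(self._pointers) > self.window:
            # Window exceeded: forget the least recently seen hash.
            self._pointers.popitem(last=False)
        return location, False


if __name__ == "__main__":
    index = DedupIndex(window=2)
    print(index.ingest(b"alpha", "lake/raw/a"))   # first copy is stored
    print(index.ingest(b"alpha", "lake/raw/a2"))  # duplicate maps to the first pointer
```

In this sketch a smaller window uses less memory but can miss duplicates whose first copy has already been evicted, which is one plausible reason the window size would be worth tuning experimentally.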
Title
Big Data Deduplication in Data Lake
Author
Hlavačka, Jakub
Bobák, Martin
Hluchý, Ladislav
Date issued
2024
Access rights
Open access
ISSN
1785-8860
Language
en
Pages
20 p.
Subject keywords
data lake, deduplication, big data
Version
Publisher's version
Identifiers
DOI: 10.12700/APH.21.11.2024.11.17
Journal
Acta Polytechnica Hungarica
Year
2024
Volume
21
Issue
11
Type
Scientific article
Subject area
Engineering sciences - computer science
Publisher (university)
Óbudai Egyetem

DSpace software copyright © 2002-2016  DuraSpace
Contact Us | Send Feedback
Theme by 
Atmire NV
 

 
