Óbudai Egyetem Digitális Archívum

A Lightweight Execution Manager for Training TensorFlow Models under the Slurm Queuing System

Lupion_Cruz_Romero_Sanjuan_Ortigosa_155.pdf (588.8Kb)
URI
http://hdl.handle.net/20.500.14044/32104
Collections
  • Acta Polytechnica Hungarica [193]
Abstract
Artificial neural networks are currently the flagship of Machine Learning and have spread to many fields beyond Computer Science. Models of this kind generally need massive amounts of data and high-performance computing resources, and the availability of graphics processing units is especially relevant. Thus, only institutional computing platforms and clusters satisfy such a high demand for computational power and storage. These systems rely on resource managers capable of handling multiple users and computing resources. However, users interested in working with artificial neural networks, especially those without a background in Computer Engineering, might not have mastered system administration. For them, planning executions within a resource manager designed for high-performance computing is problematic. This work presents S-TFManager, an easy-to-use open-source web manager for launching and controlling the execution of TensorFlow models consisting of artificial neural networks in a heterogeneous cluster with a Slurm queuing system. TensorFlow and Slurm are arguably the most widespread tools in their respective fields, so the proposed tool is of public interest. The tool, written in Python, includes built-in batching and visualization capabilities, and its simplicity makes it easy to extend.
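To make the task described in the abstract concrete, the following is a minimal sketch of what preparing a Slurm batch script for a TensorFlow training run might look like in Python. It is not taken from S-TFManager: the `TrainingJob` class, the `render_sbatch` function, and the example file name `train.py` are hypothetical names chosen for illustration. Only the `#SBATCH` directives and the `srun` launcher are standard Slurm, and the GPU request through `--gres` assumes the cluster is configured with GPU generic resources.

```python
from dataclasses import dataclass


@dataclass
class TrainingJob:
    """Hypothetical description of one training run (not an S-TFManager type)."""
    name: str                      # used for the job name and the log file
    script: str                    # path to the user's TensorFlow training script
    gpus: int = 1                  # number of GPUs requested per node
    time_limit: str = "01:00:00"   # wall-clock limit, HH:MM:SS


def render_sbatch(job: TrainingJob) -> str:
    """Render the text of a Slurm batch script that runs the training script."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job.name}",
        f"#SBATCH --gres=gpu:{job.gpus}",
        f"#SBATCH --time={job.time_limit}",
        f"#SBATCH --output={job.name}-%j.out",  # %j expands to the Slurm job ID
        "",
        f"srun python {job.script}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the script; on a cluster it would be saved and passed to `sbatch`.
    print(render_sbatch(TrainingJob(name="mnist", script="train.py")))
```

Writing such scripts by hand for every run is the kind of step that a web front end like the one described could hide from users who are unfamiliar with resource managers.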
Title
A Lightweight Execution Manager for Training TensorFlow Models under the Slurm Queuing System
Author
Lupión, Marcos
Cruz, C. Nicolás
Romero, Felipe
Sanjuan, F. Juan
Ortigosa, M. Pilar
Date issued
2025
Access rights
Open access
ISSN
1785-8860
Language
en
Pages
16 p.
Subject (OSZKAR)
machine learning, TensorFlow, Slurm, Resource Management
Version
Publisher's version
Identifiers
DOI: 10.12700/APH.22.3.2025.3.4
Container title
Acta Polytechnica Hungarica
Periodical year
2025
Periodical volume
Vol. 22
Periodical number
No. 3
Type
Scientific article
Subject area
Engineering sciences - multidisciplinary engineering sciences
Publisher (university)
Óbudai Egyetem

DSpace software copyright © 2002-2016  DuraSpace
Contact Us | Send Feedback
Theme by 
Atmire NV
 

 
