Kobe University Library Digital Archive
https://hdl.handle.net/20.500.14094/0100491598
Available files
File
0100491598 (fulltext)
Format
pdf
Size
2.26 MB
Views
69
Metadata
Metadata ID
0100491598
Access rights
open access
Publication type
Version of Record
Title
On the Performance of Malleable APGAS Programs and Batch Job Schedulers
Authors
Finnerty, Patrick ; Posner, Jonas ; Bürger, Janek ; Takaoka, Leo ; Kanzaki, Takuma
Author ID
A3182
Researcher ID
1000050957628
ORCID
0000-0002-9037-967X
KUID
https://kuid-rm-web.ofc.kobe-u.ac.jp/search/detail.html?systemId=e6455a08c99e01e3520e17560c007669
Author name
Finnerty, Patrick
フィネルティ, パトリック
Affiliation
Graduate School of System Informatics
Author name
Posner, Jonas
Author name
Bürger, Janek
Author name
Takaoka, Leo
Author name
Kanzaki, Takuma
Language
English
Source title
SN Computer Science
Volume (issue)
5(4)
Pages
349
Publisher
Springer Nature
Publication date
2024-03-27
Release date
2024-09-17
Abstract
Malleability, the ability for applications to dynamically adjust their resource allocations at runtime, presents great potential to enhance the efficiency and resource utilization of modern supercomputers. However, applications are rarely capable of growing and shrinking their number of nodes at runtime, and batch job schedulers provide only rudimentary support for such features. While numerous approaches have been proposed to enable application malleability, these typically focus on iterative computations and require complex code modifications. This amplifies the challenges for programmers, who already wrestle with the complexity of traditional MPI inter-node programming. Asynchronous Many-Task (AMT) programming presents a promising alternative. In AMT, computations are split into many fine-grained tasks, which are processed by workers. This makes transparent task relocation via the AMT runtime system possible, thus offering great potential for enabling efficient malleability. In this work, we propose an extension to an existing AMT system, namely APGAS for Java. We provide easy-to-use malleability programming abstractions, requiring only minor application code additions from programmers. Runtime adjustments, such as process initialization and termination, are automatically managed by our malleability extension. We validate our malleability extension by adapting a load balancing library handling multiple benchmarks. We show that both shrinking and growing operations incur low execution time overhead. In addition, we demonstrate compatibility with potential batch job schedulers by developing a prototype batch job scheduler that supports malleable jobs. Through extensive execution of real-world job batches on up to 32 nodes, involving rigid, moldable, and malleable programs, we evaluate the impact of deploying malleable APGAS applications on supercomputers.
With scheduling algorithms such as FCFS, Backfilling, Easy-Backfilling, and one exploiting malleable jobs, the experimental results highlight a significant improvement in several metrics for malleable jobs. We show a 13.09% makespan reduction (the time needed to schedule and execute all jobs), a 19.86% increase in node utilization, and a 3.61% decrease in job turnaround time (the time a job takes from its submission to completion) when using 100% malleable jobs in combination with our prototype batch job scheduler, compared to the best-performing scheduling algorithm with 100% rigid jobs.
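The three metrics named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' evaluation code. The `JobRecord` type, its field names, and the function names are hypothetical, and the sketch assumes that makespan is measured from the earliest submission to the latest completion and that each job holds a fixed number of nodes for its whole run (a rigid-job simplification; a malleable job would need a per-interval node count).

```python
from dataclasses import dataclass


@dataclass
class JobRecord:
    """Hypothetical per-job log entry; all times are in seconds."""
    submit: float  # time the job was submitted
    start: float   # time the job began executing
    end: float     # time the job completed
    nodes: int     # nodes held for the whole run (rigid simplification)


def makespan(jobs):
    # Assumption: measured from the earliest submission to the latest completion.
    return max(j.end for j in jobs) - min(j.submit for j in jobs)


def mean_turnaround(jobs):
    # Turnaround of one job is completion time minus submission time.
    return sum(j.end - j.submit for j in jobs) / len(jobs)


def node_utilization(jobs, total_nodes):
    # Fraction of the available node-seconds that were occupied by jobs.
    busy = sum((j.end - j.start) * j.nodes for j in jobs)
    return busy / (total_nodes * makespan(jobs))
```

For example, on a hypothetical 4-node cluster, two 2-node jobs submitted at t=0 that run from 0 to 10, followed by a 2-node job submitted at t=5 that runs from 10 to 20, give a makespan of 20 s and a node utilization of 0.75 under these assumptions.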
Keywords
Malleable runtime system
Malleable job scheduling
APGAS
Category
Graduate School of System Informatics
Journal article
Rights
© 2024, The Author(s)
Creative Commons Attribution 4.0 International License
Related information
DOI
https://doi.org/10.1007/s42979-024-02641-7
Resource type
journal article
eISSN
2661-8907