USCMS Researcher: Jin Zhou

Postdoc dates: Aug 2024 - Sep 2025
Home Institution: University of Notre Dame
Project: Scalable Data Analysis Applications for High Energy Physics
- Accelerate the execution of CMS analysis applications.
- Reduce storage consumption to enable more ambitious computations.
- Enhance fault tolerance by breaking long tasks into smaller ones and implementing effective checkpointing strategies.
More information: My project proposal
Mentors:
- Douglas Thain (Cooperative Computing Lab, University of Notre Dame)
- Kevin Lannon (Physics department, University of Notre Dame)
Current Status
2025 Q1
- Progress
- Developed the large-input-first (LIF) algorithm and the pruning algorithm, which reduce storage consumption by over 90% while running hundreds of thousands of tasks.
- Improved resource allocation and temp file replication in the task scheduler.
- Submitted a paper to IPDPS 2025, but it was rejected.
- Next steps
- Draft a paper on using limited storage effectively to carry out very large computations.
- Develop an algorithm that divides long-running tasks in DV5 into smaller ones. Smaller tasks reduce the cost of rerunning work after worker evictions but increase the latency of scheduling a large number of small tasks, so the goal is to strike a balance between scheduling overhead and fault tolerance.
- Develop an algorithm that checkpoints remote temp files in a timely manner to reduce the risk of losing critical files.
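The trade-off between task splitting and scheduling latency described in the next steps can be illustrated with a small cost model. This is only a sketch under simplifying assumptions, not the algorithm planned for DV5: worker evictions are assumed to follow a Poisson process, an evicted chunk restarts from scratch, each chunk pays a fixed scheduling overhead that is not itself lost to evictions, and the function names and parameter values are hypothetical.

```python
import math


def expected_makespan(total_work_s, n_chunks, eviction_rate_per_s, sched_overhead_s):
    """Expected time to finish `total_work_s` seconds of work split into equal chunks.

    Assumption: evictions are Poisson with rate `eviction_rate_per_s`, and an
    evicted chunk restarts from the beginning. Under that model the expected
    time to complete a chunk of length t is (exp(rate * t) - 1) / rate.
    """
    chunk_s = total_work_s / n_chunks
    expected_chunk_s = math.expm1(eviction_rate_per_s * chunk_s) / eviction_rate_per_s
    # Each chunk also pays a fixed (hypothetical) scheduling overhead.
    return n_chunks * (sched_overhead_s + expected_chunk_s)


def best_chunk_count(total_work_s, eviction_rate_per_s, sched_overhead_s, max_chunks=1000):
    """Brute-force search for the chunk count with the lowest expected makespan."""
    return min(
        range(1, max_chunks + 1),
        key=lambda n: expected_makespan(
            total_work_s, n, eviction_rate_per_s, sched_overhead_s
        ),
    )


if __name__ == "__main__":
    # Hypothetical numbers: a one-hour task, one eviction every 30 minutes on
    # average, and 5 seconds of scheduling overhead per chunk.
    work, rate, overhead = 3600.0, 1.0 / 1800.0, 5.0
    n = best_chunk_count(work, rate, overhead)
    print(f"best chunk count: {n}")
    print(f"unsplit:   {expected_makespan(work, 1, rate, overhead):.0f} s")
    print(f"best:      {expected_makespan(work, n, rate, overhead):.0f} s")
    print(f"oversplit: {expected_makespan(work, 1000, rate, overhead):.0f} s")
```

In this model, too few chunks lose large amounts of work on each eviction, while too many chunks are dominated by scheduling overhead, so the expected makespan has a minimum somewhere in between.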
Contact me: