USCMS Researcher: Jin Zhou



Postdoc dates: Aug 2024 - Sep 2025

Home Institution: University of Notre Dame


Project: Scalable Data Analysis Applications for High Energy Physics

- Accelerate the execution of CMS analysis applications.
- Reduce storage consumption to enable more ambitious computations.
- Enhance fault tolerance by breaking long tasks into smaller ones and implementing effective checkpointing strategies.

More information: My project proposal

Mentors:
  • Douglas Thain (Cooperative Computing Lab, University of Notre Dame)

  • Kevin Lannon (Physics department, University of Notre Dame)

Presentations

Current Status


2025 Q1

  • Progress
    • Developed the large-input first (LIF) algorithm and the pruning algorithm, which reduce storage consumption by over 90% while running hundreds of thousands of tasks.
    • Improved resource allocation and temp file replication in the task scheduler.
    • Submitted a paper to IPDPS 2025, but it was rejected.
  • Next steps
    • Draft a paper on using limited storage effectively to carry out very large computations.
    • Develop an algorithm that divides long-running tasks in DV5 into smaller ones. Smaller tasks reduce the cost of rerunning work after worker evictions but increase the latency of scheduling a large number of small tasks, so the plan is to strike a balance between scheduling overhead and fault tolerance.
    • Develop an algorithm that checkpoints remote temp files in a timely manner to reduce the risk of losing critical files.
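
The balance between task splitting and scheduling cost mentioned in the next steps can be illustrated with a toy model. This is only a sketch under stated assumptions, not the algorithm planned for DV5: it assumes worker evictions arrive as a Poisson process with a fixed rate, the pieces of a long task run one after another, each attempt pays a fixed scheduling overhead, and an evicted piece restarts from scratch. The function names and parameters (`expected_makespan`, `best_chunk_count`, `overhead`, `eviction_rate`) are hypothetical.

```python
import math


def expected_makespan(total_work, n_chunks, overhead, eviction_rate):
    """Expected wall time (seconds) to finish `total_work` seconds of
    computation split into `n_chunks` equal, sequential pieces.

    Assumptions (illustrative only): evictions are Poisson with rate
    `eviction_rate` per second, every attempt pays `overhead` seconds of
    scheduling cost, and an evicted piece is rerun from its beginning.
    """
    attempt = total_work / n_chunks + overhead
    if eviction_rate == 0:
        # Without evictions, every piece succeeds on its first attempt.
        return n_chunks * attempt
    # Expected time to obtain one eviction-free run of length `attempt`
    # under Poisson failures with restart: (exp(rate * attempt) - 1) / rate.
    return n_chunks * math.expm1(eviction_rate * attempt) / eviction_rate


def best_chunk_count(total_work, overhead, eviction_rate, max_chunks=1000):
    """Brute-force search for the chunk count with the lowest expected makespan."""
    return min(
        range(1, max_chunks + 1),
        key=lambda n: expected_makespan(total_work, n, overhead, eviction_rate),
    )


if __name__ == "__main__":
    # Hypothetical numbers: a one-hour task, 5 s of scheduling overhead per
    # attempt, and on average one eviction every 30 minutes.
    n = best_chunk_count(3600, 5, 1 / 1800)
    print(n, expected_makespan(3600, n, 5, 1 / 1800))
```

In this model, too few pieces lose a lot of work on every eviction, while too many pieces are dominated by per-task scheduling overhead, so the expected makespan has a minimum at an intermediate chunk count.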


Contact me: