Haodong Wang

Ph.D. Student, USC@NSL

  • (2023–present) Ph.D. in Computer Science, University of Southern California, Los Angeles.
  • (2022) M.S. in Computer Science, University of Chicago, Chicago.
  • (2021) B.S. in Computer Science, Peking University, Beijing.



Biography

I am a first-year Ph.D. student in the Networked Systems Lab (NSL) at the University of Southern California, where I am honored to work with Prof. Harsha Madhyastha and Prof. Ramesh Govindan. My research interests include volumetric video, video analytics, and systems for machine learning.

Prior to joining USC, I completed my pre-doctoral M.S. in Computer Science at the University of Chicago and my B.S. in Computer Science at Peking University.

Work Experience

Graduate Research Assistant (August 2023 - present)
Department of Computer Science, University of Southern California
Advisors: Harsha Madhyastha and Ramesh Govindan

Graduate Research Assistant (September 2021 - May 2023)
Department of Computer Science, University of Chicago
Advisor: Junchen Jiang

Teaching Experience

Teaching Assistant at the University of Chicago
Course: MPCS 51046: Intermediate Python Programming, Fall 2022
Course: MPCS 51044: C++ for Advanced Programmers, Winter 2022

Publications

  1. SoCC
    OneAdapt: Fast Adaptation for Deep Learning Applications via Backpropagation
    Kuntai Du, Yuhan Liu, Yitian Hao, Qizheng Zhang, Haodong Wang, Yuyang Huang, Ganesh Ananthanarayanan, and Junchen Jiang
    In Proceedings of the 2023 ACM Symposium on Cloud Computing, 2023

    Deep learning inference on streaming media data, such as object detection in video or LiDAR feeds and text extraction from audio waves, is now ubiquitous. To achieve high inference accuracy, these applications typically require significant network bandwidth to gather high-fidelity data and extensive GPU resources to run deep neural networks (DNNs). While the high demand for network bandwidth and GPU resources could be substantially reduced by optimally adapting the configuration knobs, such as video resolution and frame rate, current adaptation techniques fail to meet three requirements simultaneously: adapt configurations (i) with minimum extra GPU or bandwidth overhead (ii) to reach near-optimal decisions based on how the data affects the final DNN’s accuracy, and (iii) do so for a range of configuration knobs. This paper presents OneAdapt, which meets these requirements by leveraging a gradient-ascent strategy to adapt configuration knobs. The key idea is to embrace DNNs’ differentiability to quickly estimate the accuracy’s gradient to each configuration knob, called AccGrad. Specifically, OneAdapt estimates AccGrad by multiplying two gradients: InputGrad (i.e., how each configuration knob affects the input to the DNN) and DNNGrad (i.e., how the DNN input affects the DNN inference output). We evaluate OneAdapt across five types of configurations, four analytic tasks, and five types of input data. Compared to state-of-the-art adaptation schemes, OneAdapt cuts bandwidth usage and GPU usage by 15-59% while maintaining comparable accuracy or improves accuracy by 1-5% while using equal or fewer resources.
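    The chain-rule idea in this abstract can be sketched in a few lines of toy code. This is my illustration, not the paper's implementation: the `degrade` "knob" model and the tiny `dnn_output` function are hypothetical stand-ins, and both gradients are crude finite-difference approximations (OneAdapt instead exploits the DNN's own differentiability for DNNGrad).

    ```python
    import numpy as np

    def degrade(frame, knob):
        # Hypothetical stand-in for a configuration knob (e.g., resolution
        # scale): the DNN input varies smoothly with the knob value.
        return frame * knob

    def dnn_output(x):
        # Toy differentiable "DNN": a scalar score that grows with input fidelity.
        return np.tanh(x).mean()

    def acc_grad(frame, knob, eps=1e-4):
        """Estimate AccGrad = InputGrad * DNNGrad for one knob via the chain rule.

        InputGrad: how the knob changes the DNN input (finite difference).
        DNNGrad:   how the DNN input changes the output (here a crude
                   uniform-perturbation approximation; the paper uses
                   backpropagation through the DNN instead).
        """
        x = degrade(frame, knob)
        # InputGrad: perturb the knob, observe the per-element change in the input.
        input_grad = (degrade(frame, knob + eps) - x) / eps
        # DNNGrad: perturb the whole input, observe the change in the output.
        dnn_grad = (dnn_output(x + eps) - dnn_output(x)) / eps
        # Combine: accumulate the knob's estimated effect on accuracy.
        return (input_grad * dnn_grad).sum()

    frame = np.random.default_rng(0).random((4, 4))
    g = acc_grad(frame, knob=0.5)
    knob = 0.5 + 0.1 * np.sign(g)  # one gradient-ascent step on the knob
    ```

    With many knobs, the same two-factor estimate is computed per knob, which is what lets the adaptation reach near-optimal decisions without re-running the DNN for every candidate configuration.
    
    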

  2. SoCC
    Minimizing packet retransmission for real-time video analytics
    Haodong Wang, Kuntai Du, and Junchen Jiang
    In Proceedings of the 13th Symposium on Cloud Computing, 2022

    In smart-city and video analytics (VA) applications, high-quality data streams (video frames) must be accurately analyzed with a low delay. Since maintaining high accuracy requires compute-intensive deep neural nets (DNNs), these applications often stream massive video data to remote, more powerful cloud servers, giving rise to a strong need for low streaming delay between video sensors and cloud servers while still delivering enough data for accurate DNN inference. In response, many recent efforts have proposed distributed VA systems that aggressively compress/prune video frames deemed less important to DNN inference, with the underlying assumptions being that (1) without increasing available bandwidth, reducing delays means sending fewer bits, and (2) the most important frames can be precisely determined before streaming. This short paper challenges both views. First, in high-bandwidth networks, the delay of real-time videos is primarily bounded by packet losses and delay jitter, so reducing bitrate is not always as effective as reducing packet retransmissions. Second, for many DNNs, the impact of missing a video frame depends not only on the frame itself but also on which other frames have been received or lost. We argue that some changes must be made in the transport layer: whether to resend a packet should be determined by the packet's impact on the DNN's inference, which in turn depends on which other packets have been received. While much research is needed toward an optimal design of a DNN-driven transport layer, we believe that we have taken the first step in reducing streaming delay while maintaining a high inference accuracy.
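    The transport-layer argument can be made concrete with a toy decision rule. This is my illustration, not the paper's design: the impact model (a lost frame matters less when an adjacent frame arrived, since the DNN can often track across neighboring frames) and the `delay_cost` threshold are hypothetical.

    ```python
    def should_retransmit(lost_frame, received, impact, delay_cost=0.05):
        """Toy DNN-driven retransmission rule: resend a lost frame only if its
        marginal impact on inference accuracy, given which other frames have
        already been received, outweighs the delay cost of a retransmission.

        lost_frame: index of the lost frame
        received:   set of frame indices that arrived
        impact:     per-frame accuracy impact if the frame is missing in isolation
        """
        # Hypothetical impact model: an adjacent received frame partially
        # covers for the loss, discounting the marginal accuracy gain.
        neighbors = {lost_frame - 1, lost_frame + 1}
        covered = bool(neighbors & received)
        marginal_gain = impact[lost_frame] * (0.3 if covered else 1.0)
        return marginal_gain > delay_cost

    # A loss flanked by received frames is not worth the retransmission delay...
    skip = should_retransmit(5, received={4, 6}, impact={5: 0.1})
    # ...but the same loss with no received neighbors is.
    resend = should_retransmit(5, received=set(), impact={5: 0.1})
    ```

    The point of the sketch is only that the decision is conditional: the same packet can be worth resending or not depending on what else arrived, which is information only the transport layer has at retransmission time.
    
    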

  3. MLSys
    AccMPEG: Optimizing Video Encoding for Accurate Video Analytics
    Kuntai Du, Qizheng Zhang, Anton Arapin, Haodong Wang, Zhengxu Xia, and Junchen Jiang
    In Proceedings of Machine Learning and Systems, 2022
  4. ICC
    Cluster-based Handoff Scheme Design for Platoons in Cellular V2X Networks
    Haodong Wang, Shuhang Zhang, Boya Di, and Lingyang Song
    In ICC 2021 - IEEE International Conference on Communications, 2021