Reinforcement Learning Based Multipath QUIC Scheduler for Multimedia Streaming
Main authors: | , |
---|---|
Format: | Online Article Text |
Language: | English |
Published: | MDPI 2022 |
Subjects: | |
Online access: | https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9460924/ https://www.ncbi.nlm.nih.gov/pubmed/36080792 http://dx.doi.org/10.3390/s22176333 |
Summary: | With recent advances in computing devices such as smartphones and laptops, most devices are now equipped with multiple network interfaces such as cellular, Wi-Fi, and Ethernet. Multipath TCP (MPTCP) has been the de facto standard for using multiple paths, and Multipath QUIC (MPQUIC), an extension of the Quick UDP Internet Connections (QUIC) protocol, has become a promising replacement due to its various advantages. The multipath scheduler, which determines the path on which each packet is transmitted, is a key function that affects multipath transport performance. For example, the default minRTT scheduler typically achieves good throughput, while the redundant scheduler achieves low latency. Although the legacy schedulers generally perform well in some environments, each application has different requirements. For example, Web applications target low latency, while video streaming applications require low jitter and high video quality. In this paper, we propose a novel MPQUIC scheduler based on deep reinforcement learning using the Deep Q-Network (DQN) that enhances the quality of multimedia streaming. First, our proposal takes both delay and throughput into account in the reinforcement learning reward to achieve a low video chunk download time. Second, we propose a chunk manager that informs the scheduler of the video chunk information, and we tune the learning parameters so that new random actions are explored adequately. Finally, we implement our new scheduler in the Linux kernel and report results from Mininet experiments. The evaluation results show that our proposal outperforms legacy schedulers by at least 20%. |
---|
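
The scheduling ideas named in the summary can be sketched briefly in Python. This is a minimal illustration under stated assumptions, not the authors' kernel implementation: the `Path` fields, the linear `reward` form with its weights `alpha` and `beta`, and the `epsilon_greedy` helper are all hypothetical, since the record does not give the paper's actual reward formula or learning parameters.

```python
import random
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    srtt_ms: float    # smoothed round-trip time of the path (assumed field)
    cwnd_free: bool   # True if the congestion window has room (assumed field)


def min_rtt_select(paths):
    """minRTT idea: among usable paths, pick the one with the lowest RTT."""
    usable = [p for p in paths if p.cwnd_free]
    return min(usable, key=lambda p: p.srtt_ms) if usable else None


def reward(throughput_mbps, delay_ms, alpha=1.0, beta=0.01):
    """Hypothetical reward: grows with throughput, shrinks with delay.

    The weights alpha and beta are illustrative, not taken from the paper.
    """
    return alpha * throughput_mbps - beta * delay_ms


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore a random action with probability epsilon, else take the best."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    paths = [Path("wifi", 20.0, True), Path("lte", 50.0, True)]
    print(min_rtt_select(paths).name)
    print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1))
```

In a DQN-based scheduler the Q-values would come from a neural network evaluated on path state, and `epsilon` would typically be decayed during training; both are left out here to keep the sketch self-contained.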