Paper Title

Smart Director: An Event-Driven Directing System for Live Broadcasting

Authors

Pan, Yingwei; Chen, Yue; Bao, Qian; Zhang, Ning; Yao, Ting; Liu, Jingen; Mei, Tao

Abstract

Live video broadcasting normally requires a multitude of skills and domain expertise to enable multi-camera productions. As the number of cameras keeps increasing, directing a live sports broadcast has become more complicated and challenging than ever before. Broadcast directors need to be much more concentrated, responsive, and knowledgeable during the production. To relieve directors of this intensive effort, we develop an innovative automated sports broadcast directing system, called Smart Director, which aims to mimic the typical human-in-the-loop broadcasting process and automatically create near-professional broadcasting programs in real time by using a set of advanced multi-view video analysis algorithms. Inspired by the so-called "three-event" construction of sports broadcasts, we build our system as an event-driven pipeline consisting of three consecutive novel components: 1) the Multi-view Event Localization, which detects events by modeling multi-view correlations, 2) the Multi-view Highlight Detection, which ranks camera views by visual importance for view selection, and 3) the Auto-Broadcasting Scheduler, which controls the production of broadcasting videos. To the best of our knowledge, our system is the first end-to-end automated directing system for multi-camera sports broadcasting that is completely driven by the semantic understanding of sports events. It is also the first system to solve the novel problem of multi-view joint event detection by cross-view relation modeling. We conduct both objective and subjective evaluations on a real-world multi-camera soccer dataset, which demonstrate that the quality of our auto-generated videos is comparable to that of human-directed videos. Thanks to its faster response, our system is able to capture more fast-passing and short-duration events, which are usually missed by human directors.
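
The three-stage, event-driven pipeline described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: every function name, the score-averaging rule, the 0.5 threshold, and the camera names are assumptions made purely for illustration, standing in for the learned multi-view models that the paper describes.

```python
from typing import Dict, List, Tuple

# Hypothetical per-frame input: (event score per camera, highlight score per camera).
Frame = Tuple[Dict[str, float], Dict[str, float]]


def localize_event(event_scores: Dict[str, float], threshold: float = 0.5) -> bool:
    """Stage 1 stand-in: fuse per-view event scores by simple averaging.

    The paper models cross-view relations; averaging is only a placeholder.
    """
    return sum(event_scores.values()) / len(event_scores) >= threshold


def rank_views(highlight_scores: Dict[str, float]) -> List[str]:
    """Stage 2 stand-in: order cameras from most to least visually important."""
    return sorted(highlight_scores, key=lambda cam: highlight_scores[cam], reverse=True)


def schedule(event_detected: bool, ranked: List[str], default_view: str) -> str:
    """Stage 3 stand-in: cut to the top-ranked view during an event, else stay on default."""
    return ranked[0] if event_detected and ranked else default_view


def direct(frames: List[Frame], default_view: str = "cam_main") -> List[str]:
    """Run the three stages in order for each frame and return the chosen camera per frame."""
    program = []
    for event_scores, highlight_scores in frames:
        detected = localize_event(event_scores)
        ranked = rank_views(highlight_scores)
        program.append(schedule(detected, ranked, default_view))
    return program


if __name__ == "__main__":
    frames = [
        ({"cam_main": 0.1, "cam_goal": 0.2}, {"cam_main": 0.3, "cam_goal": 0.9}),
        ({"cam_main": 0.8, "cam_goal": 0.9}, {"cam_main": 0.3, "cam_goal": 0.9}),
    ]
    print(direct(frames))
```

In this sketch the scheduler switches away from the default camera only while an event is detected, which mirrors the event-driven ordering of the three components but none of their actual modeling.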
