Model-Free Prediction of Partially Observable Spatiotemporal Chaotic Systems

Paper Title

Semi-supervised Body Parsing and Pose Estimation for Enhancing Infant General Movement Assessment

Authors

Ni, Haomiao; Xue, Yuan; Ma, Liya; Zhang, Qian; Li, Xiaoye; Huang, Xiaolei

Abstract

General movement assessment (GMA) of infant movement videos (IMVs) is an effective method for early detection of cerebral palsy (CP) in infants. We demonstrate in this paper that end-to-end trainable neural networks for image sequence recognition can be applied to achieve good results in GMA, and more importantly, augmenting raw video with infant body parsing and pose estimation information can significantly improve performance. To solve the problem of efficiently utilizing partially labeled IMVs for body parsing, we propose a semi-supervised model, termed SiamParseNet (SPN), which consists of two branches, one for intra-frame body parts segmentation and another for inter-frame label propagation. During training, the two branches are jointly trained by alternating between using input pairs of only labeled frames and input of both labeled and unlabeled frames. We also investigate training data augmentation by proposing a factorized video generative adversarial network (FVGAN) to synthesize novel labeled frames for training. When testing, we employ a multi-source inference mechanism, where the final result for a test frame is either obtained via the segmentation branch or via propagation from a nearby key frame. We conduct extensive experiments for body parsing using SPN on two infant movement video datasets, where SPN coupled with FVGAN achieves state-of-the-art performance. We further demonstrate that SPN can be easily adapted to the infant pose estimation task with superior performance. Last but not least, we explore the clinical application of our method for GMA. We collected a new clinical IMV dataset with GMA annotations, and our experiments show that SPN models for body parsing and pose estimation trained on the first two datasets generalize well to the new clinical dataset and their results can significantly boost the CRNN-based GMA prediction performance.
