Paper Title
CAPTION: Correction by Analyses, POS-Tagging and Interpretation of Objects using only Nouns
Paper Authors
Paper Abstract
Recently, Deep Learning (DL) methods have shown excellent performance in image captioning and visual question answering. However, despite this performance, DL methods do not learn the semantics of the words used to describe a scene, which makes it difficult to spot incorrect words in captions or to interchange words with similar meanings. This work proposes a combination of DL methods for object detection and natural language processing to validate image captions. We test our method on the FOIL-COCO data set, since it provides correct and incorrect captions for various images using only objects represented in the MS-COCO image data set. Results show that our method has good overall performance, in some cases similar to human performance.
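The idea of validating a caption against detected objects can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes the object detector's output is already available as a list of label strings, and it replaces a real POS tagger with a lookup in a small, hypothetical subset of MS-COCO category names. The names `COCO_NOUNS`, `caption_nouns`, `find_foil_words`, and `is_caption_valid` are invented for illustration.

```python
import re

# Hypothetical subset of MS-COCO object categories (assumption for illustration).
COCO_NOUNS = {"person", "dog", "cat", "bicycle", "car", "bench", "pizza"}


def caption_nouns(caption, vocabulary=COCO_NOUNS):
    """Return the caption words that name a known object category.

    A vocabulary lookup stands in for a real POS tagger here, so plural
    forms and synonyms are not handled.
    """
    words = re.findall(r"[a-z]+", caption.lower())
    return [w for w in words if w in vocabulary]


def find_foil_words(caption, detected_labels, vocabulary=COCO_NOUNS):
    """Return object nouns in the caption that the detector did not report."""
    detected = set(detected_labels)
    return [w for w in caption_nouns(caption, vocabulary) if w not in detected]


def is_caption_valid(caption, detected_labels):
    """A caption is treated as valid when every object noun was detected."""
    return not find_foil_words(caption, detected_labels)


if __name__ == "__main__":
    # Labels assumed to come from an object detector run on the image.
    labels = ["cat", "bench", "bicycle"]
    print(find_foil_words("A dog sits on a bench next to a bicycle.", labels))
    print(is_caption_valid("A cat sits on a bench.", labels))
```

In this sketch, a caption noun that names an object category but is absent from the detector output is flagged as a candidate incorrect word. A fuller pipeline would need a real tagger and some handling of words with similar meanings, which the abstract identifies as part of the problem.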