Paper Title


HiTSKT: A Hierarchical Transformer Model for Session-Aware Knowledge Tracing

Authors

Fucai Ke, Weiqing Wang, Weicong Tan, Lan Du, Yuan Jin, Yujin Huang, Hongzhi Yin

Abstract


Knowledge tracing (KT) aims to leverage students' learning histories to estimate their mastery levels of a set of pre-defined skills, on the basis of which their future performance can be accurately predicted. As an important way of providing a personalized experience in online education, KT has gained increasing attention in recent years. In practice, a student's learning history comprises answers to sets of massed questions, each set known as a session, rather than merely a sequence of independent answers. Theoretically, students' learning dynamics can differ greatly within and across these sessions. Therefore, effectively modelling the dynamics of students' knowledge states both within and across sessions is crucial for handling the KT problem. Most existing KT models treat a student's learning record as a single continuous sequence, without capturing the session-level shifts in the student's knowledge state. To address this issue, we propose a novel hierarchical transformer model, named HiTSKT, which comprises an interaction(-level) encoder to capture the knowledge a student acquires within a session, and a session(-level) encoder to summarise the knowledge acquired across past sessions. To predict an interaction in the current session, a knowledge retriever integrates the summarised past-session knowledge with information from the preceding interactions into proper knowledge representations. These representations are then used to compute the student's current knowledge state. Additionally, to model a student's long-term forgetting behaviour across sessions, a power-law-decay attention mechanism is designed and deployed in the session encoder, allowing it to place more emphasis on recent sessions. Extensive experiments on three public datasets demonstrate that HiTSKT achieves new state-of-the-art performance on all of them compared with six state-of-the-art KT models.
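The abstract does not give the exact formulation of the power-law-decay attention, but the idea of down-weighting older sessions by a power law of their recency can be sketched as follows. Everything here is illustrative: the function name, the decay exponent `alpha`, and the choice to multiply the softmax numerator by `(distance + 1) ** -alpha` are assumptions, not the paper's definition.

```python
import numpy as np

def power_law_decay_attention(scores, alpha=0.5):
    """Softmax over past-session attention logits with a power-law
    recency bias (illustrative sketch, not the paper's exact form).

    scores: 1-D array of raw attention logits for one query against
            past sessions, ordered oldest -> newest.
    alpha:  decay exponent; larger alpha discounts old sessions harder.

    Each softmax numerator is multiplied by (distance + 1) ** -alpha,
    where distance counts how many sessions ago the key occurred, so
    recent sessions receive proportionally more attention weight.
    """
    scores = np.asarray(scores, dtype=float)
    n = scores.shape[0]
    distance = np.arange(n - 1, -1, -1)        # newest session has distance 0
    decay = (distance + 1.0) ** -alpha         # power-law decay factor
    numer = np.exp(scores - scores.max()) * decay
    return numer / numer.sum()                 # normalised attention weights
```

With uniform logits, the returned weights increase monotonically toward the most recent session, which is the "emphasize recent sessions" behaviour the abstract describes; with `alpha = 0` the function reduces to a plain softmax.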
