Title: Scaffolding a student to instill knowledge
Authors: Saligrama, Venkatesh; Kag, Anil; Acar, Durmus Alp Emre; Gangrade, Aditya
Type: Conference materials
Citation: V. Saligrama, A. Kag, D. A. E. Acar, A. Gangrade. 2023. "Scaffolding a student to instill knowledge." ICLR 2023.
Venue URL: https://openreview.net/group?id=ICLR.cc/2023/Conference
Handle: https://hdl.handle.net/2144/48707
Dates: issued 2023-04-25; available 2024-05-07; updated 2024-02-17
ORCID: 0000-0002-0675-2268 (Saligrama, Venkatesh)
Record ID: 902485

Abstract: We propose a novel knowledge distillation (KD) method that selectively instills teacher knowledge into a student model, motivated by situations where the student's capacity is significantly smaller than the teacher's. In vanilla KD, the teacher primarily sets a predictive target for the student to follow, and we posit that this target is overly optimistic given the student's limited capacity. We develop a novel scaffolding scheme in which the teacher, in addition to setting a predictive target, also scaffolds the student's prediction by censoring hard-to-learn examples. The student model uses the same information as in vanilla KD, namely the teacher's softmax predictions, as inputs, and in this sense our proposal can be viewed as a natural variant of vanilla KD. We show on synthetic examples that censoring hard examples smooths the student's loss landscape, so that the student encounters fewer local minima and, as a result, generalizes well. On benchmark datasets we achieve improved performance against vanilla KD and are comparable to more intrusive techniques that leverage feature matching.
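The abstract describes a censored variant of vanilla KD but does not state the censoring rule. Below is a minimal PyTorch sketch, assuming a confidence-based rule in which examples whose top teacher softmax probability falls below a threshold are excluded from the distillation term; the function name scaffolded_kd_loss and the hyperparameters temperature, conf_threshold, and alpha are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def scaffolded_kd_loss(student_logits, teacher_logits, labels,
                       temperature=4.0, conf_threshold=0.5, alpha=0.9):
    # Teacher's softened predictions: the only teacher signal used,
    # exactly as in vanilla KD.
    with torch.no_grad():
        teacher_probs = F.softmax(teacher_logits / temperature, dim=1)
        # Hypothetical censoring rule (an assumption, not from the paper):
        # drop examples the teacher itself is unsure about, i.e. whose top
        # softmax probability falls below the threshold.
        keep = (teacher_probs.max(dim=1).values >= conf_threshold).float()

    # Per-example KL divergence between student and teacher distributions.
    log_student = F.log_softmax(student_logits / temperature, dim=1)
    kd_per_example = F.kl_div(log_student, teacher_probs,
                              reduction="none").sum(dim=1)
    # Average the distillation term over non-censored examples only.
    kd_term = (kd_per_example * keep).sum() / keep.sum().clamp(min=1.0)

    # Ordinary cross-entropy on ground-truth labels for every example.
    ce_term = F.cross_entropy(student_logits, labels)

    # Standard Hinton-style weighting; T^2 rescales the soft-target
    # gradients to match the hard-label term.
    return alpha * (temperature ** 2) * kd_term + (1 - alpha) * ce_term
```

In this reading, censoring only affects which examples contribute to the distillation term, while the cross-entropy term still sees every example; this is one plausible interpretation of scaffolding the student's prediction rather than withholding data entirely.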