Conference paper

Learning Modality Knowledge Alignment for Cross-Modality Transfer

Authors

Wenxuan Ma, Shuang Li, Lincan Cai, Jingxuan Kang

Publication date

21-27 July 2024

Pages

33777-33793

Source information

Proceedings of Machine Learning Research, vol. 235, 2024, pp. 33777-33793

Abstract

Cross-modality transfer aims to leverage large pretrained models to complete tasks that may not belong to the modality of the pretraining data. Existing works have had some success in extending classical finetuning to cross-modal scenarios, yet we still lack an understanding of how the modality gap influences the transfer. In this work, we conduct a series of experiments focusing on source representation quality during transfer, revealing a connection between a larger modality gap and less knowledge reuse, which means ineffective transfer. We then formalize the gap as the knowledge misalignment between modalities using the conditional distribution $P(Y|X)$. To address this problem, we present Modality kNowledge Alignment (MoNA), a meta-learning approach that learns a target data transformation to reduce the modality knowledge discrepancy ahead of the transfer. Experiments show that the approach significantly improves upon cross-modal finetuning methods and, most importantly, leads to better reuse of source modality knowledge.

Citation

Wenxuan Ma, Shuang Li, Lincan Cai, Jingxuan Kang. Learning Modality Knowledge Alignment for Cross-Modality Transfer[C]//Volume 235: International Conference on Machine Learning, Vienna, Austria, 21-27 July 2024, AT: PMLR;ICML, 2024: 33777-33793