MovieChat: From Dense Token to Sparse Memory in Long Video Understanding
Enxin Song*, Wenhao Chai*♡, Guanhong Wang*, Yucheng Zhang, Haoyang Zhou, Feiyang Wu, Tian Ye, Jenq-Neng Hwang, Gaoang Wang✉
Computer Vision and Pattern Recognition (CVPR), 2024
[Website]
[Paper]
[Dataset]
[Code]
MovieChat achieves state-of-the-art performance in long video understanding by introducing a memory mechanism.
Devil in the Number: Towards Robust Multi-modality Data Filter
Yichen Xu, Zihan Xu, Wenhao Chai*♡, Zhonghan Zhao, Enxin Song, Gaoang Wang✉
International Conference on Computer Vision Workshop (ICCVW), 2023
[Paper]
We show that the CLIP model is not robust with respect to numbers.
Knowledge Graph Extrapolation Network with Transductive Learning for Recommendation
Ruixin Ma, Fangqing Guo, Liang Zhao, Biao Mei, Xiya Bu, Hao Wu, Enxin Song
Applied Sciences, 2022
[Paper]
Motivated by the long-tail phenomenon and data sparsity, we propose the Knowledge Graph Extrapolation Network with Transductive Learning to improve recommendation quality.