Thao Minh Le

Thao Minh Le, Vuong Le, Svetha Venkatesh and Truyen Tran. Hierarchical Conditional Relation Networks for Multimodal Video Question Answering. International Journal of Computer Vision (IJCV).
Tri Minh Nguyen, Thin Nguyen, Thao Minh Le, Truyen Tran. GEFA: Early Fusion Approach in Drug-Target Affinity Prediction. IEEE/ACM Transactions on Computational Biology and Bioinformatics.
Thao Minh Le, Long Hoang Dang, Thanh-Son Nguyen, Thi Minh Huyen Nguyen, Xuan-Son Vu. VLSP 2021 – VieCap4H Challenge: Automatic Image Caption Generation for Healthcare Domain in Vietnamese. VNU Journal of Science: Computer Science and Communication Engineering. To appear.
Long Hoang Dang, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran. Dynamic Reasoning for Movie QA: A Character-Centric Approach. IEEE Transactions on Multimedia.
Nikolaj Normann Holm, Thao Minh Le, Anne Frølich, Ove Andersen, Helle Gybel Juul-Larsen, Anders Stockmarr, Svetha Venkatesh. amVAE: Age-aware Multimorbidity clustering using Variational AutoEncoders. Computers in Biology and Medicine.

Quang-Hung Le, Long Hoang Dang, Ngan Le, Truyen Tran, Thao Minh Le. Progressive Multi-granular Alignments for Grounded Reasoning in Large Vision-Language Models. To appear at the AAAI Conference on Artificial Intelligence 2025 (AAAI-25).
Tuyen Tran, Thao Minh Le, Hung Tran, Truyen Tran. Unified Compositional Query Machine with Multimodal Consistency for Video-based Human Activity Recognition. To appear at the British Machine Vision Conference 2024.
Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran. Guiding Visual Question Answering with Attention Priors. To appear at the 2023 Winter Conference on Applications of Computer Vision (WACV'23).
Hoang-Anh Pham, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran. Video Dialog as Conversation about Objects Living in Space-Time. To appear at 2022 European Conference on Computer Vision (ECCV'22). Code is available on GitHub.
Long Hoang Dang, Thao Minh Le, Vuong Le, Truyen Tran. Hierarchical Object-oriented Spatio-Temporal Reasoning for Video Question Answering. To appear at the 2021 International Joint Conference on Artificial Intelligence (IJCAI’21).
Long Hoang Dang, Thao Minh Le, Vuong Le, Truyen Tran. Object-Centric Representation Learning for Video Question Answering. To appear at the 2021 International Joint Conference on Neural Networks (IJCNN’21).
Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran. Dynamic Language Binding in Relational Visual Reasoning. In Proceedings of the Twenty-Ninth International Joint Conference on Artificial Intelligence (IJCAI’20), pages 818-824. Code is available on GitHub.
Thao Minh Le, Vuong Le, Svetha Venkatesh, Truyen Tran. Neural Reasoning, Fast and Slow, for Video Question Answering. In Proceedings of the 2020 International Joint Conference on Neural Networks (IJCNN’20) (pp. 1-8), doi: 10.1109/IJCNN48605.2020.9207580.
Thao Minh Le, Vuong Le, Svetha Venkatesh and Truyen Tran. Hierarchical Conditional Relation Networks for Video Question Answering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR’20), pages 9972-9981 (oral acceptance). Code is available on GitHub.
Thao Minh Le, Nakamasa Inoue, Koichi Shinoda. A Fine-to-Coarse Convolutional Neural Network for 3D Human Action Recognition. In Proceedings of the British Machine Vision Conference (BMVC'18), Sep. 3, 2018.
Thao Le Minh, Nobuyuki Shimizu, Takashi Miyazaki, Koichi Shinoda. Deep Learning Based Multi-modal Addressee Recognition in Visual Scenes with Utterances. In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI'18), pp. 1546-1553, Jul. 13, 2018. Details of the released dataset are available on GitHub.
Viet Dung Nguyen, Minh Thao Le, Anh Duc Do, Hoang Hai Duong, Toan Dat Thai, and Duc Hoa Tran. An efficient camera-based surveillance for fall detection of elderly people. In Proceedings of the 2014 IEEE 9th Conference on Industrial Electronics and Applications (ICIEA'14), pp. 994-997. IEEE, 2014.

Tuyen Tran, Thao Minh Le, Truyen Tran. Promptable Iterative Visual Refinement for Video Instance Segmentation. In Instance-Level Recognition Workshop at ECCV, 2024.
Long Hoang Dang, Thao Minh Le, Vuong Le, Tu Minh Phuong, Truyen Tran. Time-Evolving Conditional Character-centric Graphs for Movie Understanding. In NeurIPS 2022 Temporal Graph Learning Workshop, 2022.
Tri Minh Nguyen, Thin Nguyen, Thao Minh Le, Truyen Tran. GEFA: Early Fusion Approach in Drug-Target Affinity Prediction. In NeurIPS 2020 Workshop on Machine Learning for Structural Biology (MLSB’20).
Long Hoang Dang, Thao Minh Le, Vuong Le, Truyen Tran. Object-Centric Relational Reasoning for Video Question Answering. In the ECCV 2nd Workshop on Video Turing Test: Toward Human-Level Video Story Understanding, Aug. 2020.
Thao Minh Le, Nakamasa Inoue, Koichi Shinoda. Skeleton-based Human Action Recognition with Fine-to-Coarse Convolutional Neural Network. Technical Reports of IEICE PRMU, vol. 118, no. 362, pp. 61-64, Dec. 13, 2018.

Truyen Tran, Vuong Le, Hung Le, Thao Minh Le. From Deep Learning to Deep Reasoning. The 27th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’21), Singapore. Tutorial website.
Truyen Tran, Vuong Le, Hung Le, Thao Minh Le. Neural Machine Reasoning. The 2021 International Joint Conference on Artificial Intelligence (IJCAI’21), Montreal, Canada. Accepted tutorial list.

Journal Papers

Conference Proceedings (peer reviewed)

Workshop Papers and Technical Reports

Tutorials