Thao Minh LE

I am an incoming tenure-track Assistant Professor of AI at the Pennsylvania State University at Great Valley, starting in August 2025. I am currently a Research Fellow at the Applied AI Institute, Deakin University, Australia. I received my Ph.D. in Computer Science from Deakin University in 2021. Prior to that, I obtained my M.Sc. in Computer Science from the Tokyo Institute of Technology, Japan, and my B.Eng. in Electronics and Communication Engineering from Hanoi University of Science and Technology, Vietnam. I have been recognized with multiple awards for my research contributions and academic excellence.

Email: thaoyd2@gmail.com

My research interests focus on deep learning and machine learning techniques for visual perception and for vision-and-language reasoning. These capabilities are key elements of the next generation of virtual assistant systems. Real-world applications of these systems include security and safety services and healthcare.


  • [Jul 29, 2025] I am joining the Pennsylvania State University at Great Valley, Pennsylvania, USA, as a tenure-track Assistant Professor of AI in August 2025. I am looking forward to working with my new colleagues and students.
  • [Jul 11, 2025] Our paper "Planner-Refiner: Dynamic Space-Time Refinement for Vision-Language Alignment in Videos" has been accepted for presentation at the European Conference on Artificial Intelligence 2025.
  • [Jul 11, 2025] Our paper "Towards Agentic AI for Multimodal-Guided Video Object Segmentation" has been accepted for presentation at the Instance-Level Recognition and Generation Workshop, ICCV, 2025.
  • [Dec 24, 2024] Our paper "amVAE: Age-aware Multimorbidity clustering using Variational AutoEncoders" has been accepted for publication in Computers in Biology and Medicine (CIBM).
  • [Dec 10, 2024] Our paper "Progressive Multi-granular Alignments for Grounded Reasoning in Large Vision-Language Models" has been accepted for presentation at the AAAI Conference on Artificial Intelligence 2025.
  • [Nov 4, 2024] I have been awarded three years of research support, starting in April 2025, for my research proposal on "Fine-grained Human Motion Understanding and Its Applications" by Deakin University as part of the Deakin University Postdoctoral Research Fellowship 2025.
  • [Oct 5-13, 2024] I gave a talk at the Ludwig Maximilian University of Munich and Fraunhofer Research Institution, Germany, on "Vision Language Intelligence: Machines That Reason About What They See". I am excited about my upcoming research collaboration with Fraunhofer on AI for surgical education and training, and on leveraging its capabilities to enhance patient safety.
  • [Sep 2, 2024] I will be visiting Ludwig Maximilian University of Munich and Fraunhofer Research Institution for Individualized and Cell-Based Medical Engineering IMTE in early October as part of my DAAD Postdoc-NeT-AI Fellowship.
  • [Aug 9, 2024] Our preliminary work on "Promptable Iterative Visual Refinement for Video Instance Segmentation" has been accepted for presentation at the Instance-Level Recognition Workshop at ECCV 2024.
  • [Jul 25, 2024] Our paper "Unified Compositional Query Machine with Multimodal Consistency for Video-based Human Activity Recognition" has been accepted for presentation at the British Machine Vision Conference 2024.
  • [Apr 3, 2024] I have been selected as a DAAD AInet fellow for the Postdoctoral Networking Tour in AI 04/2024. I will be participating in a virtual networking week (15/4-19/4/2024) and will later receive the DAAD's financial and organizational support to visit German institutions in person to learn about the German AI research community. Please say "Hi" if you are also attending!
  • [Dec 1, 2023] My grant application on video analysis for early detection of Cerebral Palsy has been successful. I will serve as the Lead Chief Investigator for the two-year project with the Cerebral Palsy Alliance Research Foundation.
