Semantic segmentation is vital for scene understanding, but it depends on expensive pixel-level annotation. Recently, Vision Foundation Models (VFMs) such as CLIP, DINO and SAM have emerged as transformative tools that bring knowledge from large-scale pre-training to visual tasks. This seminar explores how these VFMs benefit segmentation under limited supervision. By transferring their inherent visual-semantic understanding, high-performance segmentation can be achieved with minimal labels, enabling scalable applications in areas such as medical imaging and industrial inspection.
Bio
Dr. Bingfeng Zhang is an Associate Professor at China University of Petroleum (East China). A Taishan Scholar Young Expert of Shandong Province, he serves on the Council of the Shandong Association for Artificial Intelligence. He obtained his Ph.D. from the University of Liverpool in 2022, specializing in computer vision and pattern recognition. Dr. Zhang leads a National Natural Science Foundation of China (NSFC) project and contributes to two NSFC Key Programs. With more than 20 publications in top-tier venues (TPAMI, IJCV, CVPR, etc.), he serves as a Guest Editor for Remote Sensing and as a reviewer for flagship journals and conferences (TPAMI, CVPR, ICCV). His honors include Runner-up in the ECCV 2022 Defect Detection Challenge and a Natural Science Award from the Jiangsu Computer Society.
Workshop Report
The report for this workshop is coming soon; please check back later.