Why does the human brain understand people’s activities in a video far better than existing computer systems do? Human vision scientists and computer vision scientists will explore how research findings from their respective disciplines could be combined to create systems that more closely replicate human performance.
We invite participants to a multi-disciplinary workshop on “Human and Computer Models of Video Understanding”. The core research question we are concerned with is: why does the human brain understand people's activities in a video so much better than existing computer systems do? We invite participants from the science of human vision (psychology and brain sciences) and from computer vision whose work focuses on understanding activities from video.
To give some concrete examples: humans can very quickly make accurate judgements about the activity happening in a video even when the video quality is poor or the observed motions are ambiguous, for example, discriminating hugging from fighting, or smoking from eating finger food. Computers cannot match human performance on these tasks, which are critical for applications such as surveillance, monitoring safety and welfare in care settings, and removing inappropriate videos from social media. We do not yet fully understand how humans perform these feats, nor how to make computer vision systems match their performance.
INVITED SPEAKERS
Professor Shaogang Gong, Queen Mary University of London, Queen Mary Computer Vision Laboratory