|
The world is being digitized through photography at an unprecedented rate.
Aerial imagery and street-level panoramic video, once the domain of specialists, are now viewed every day by people on the Web. Photo-sharing sites now contain billions of photographs, and this growth is accelerating as cameras and cell phones merge.
How can we make sense of this incredible wealth of data? A promising approach is to match and cluster images to weave a semantic Web of the world's photographs, and then to reconstruct 3D models and scenes from the matched imagery. This allows us to navigate photographs using familiar 3D interaction techniques and to create evocative composites that exploit the richness of photographs taken at any location.
In this talk, I review our Photo Tourism project and its Web-based counterpart, Photosynth. I describe the 3D computer vision techniques used to match images, locate the cameras, and build sparse and dense 3D models of the visual world. I also show how to smoothly navigate and explore the resulting Web of photographs, including finding related images, annotating regions of interest, viewing 3D aspects of an object, and creating tours through the photo collection. The talk highlights this new generation of Internet-based computer vision and computer graphics algorithms, and shows how their union creates a new medium that takes photography and photo sharing to a whole new level.
|