Internet photo collections offer extensive coverage of landmark buildings, monuments, sculptures, and paintings, which makes it possible to build landmark recognition engines that automatically tag images with names and locations. These engines use clustering algorithms to group millions of images by the building or object they depict. This requires a clear definition of what a landmark is, robust image similarity measures that tolerate variations in viewpoint and lighting, and efficient clustering methods. The Iconoid Shift algorithm is introduced to meet these needs. It represents each landmark by an iconic image, or Iconoid, which is the image that overlaps most with the other images of the same landmark. Iconoids are found by mode search using a novel homography overlap distance measure. The thesis also presents efficient, highly parallel clustering algorithms. The growing density of online photo collections additionally makes it possible to discover building sub-structures, such as doors and spires. To exploit this, the Hierarchical Iconoid Shift algorithm is introduced, which produces a hierarchy of clusters representing these sub-structures by means of a novel hierarchical variant of Medoid Shift. Finally, a large-scale evaluation of the components of a landmark recognition system is conducted, analyzing how component choices and parameters affect overall system performance.
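To illustrate the idea of medoid-based mode search mentioned above, the following is a minimal sketch of a generic Medoid Shift step on a precomputed distance matrix. It is not the thesis's implementation: the function names (`medoid_shift_step`, `find_mode`, `cluster_by_mode`), the flat kernel, and the `bandwidth` parameter are assumptions made for illustration, and the distance matrix stands in for whatever measure is used (the thesis uses a homography overlap distance, which is not reproduced here).

```python
import numpy as np

def medoid_shift_step(dist, i, bandwidth):
    """One shift: return the index of the point that minimizes the summed
    distance to the neighbors of point i (flat kernel, an assumption)."""
    # Flat kernel: weight 1 for points within `bandwidth` of i, else 0.
    weights = (dist[i] <= bandwidth).astype(float)
    # costs[j] = sum over neighbors k of i of dist[j, k].
    costs = dist @ weights
    return int(np.argmin(costs))

def find_mode(dist, start, bandwidth, max_iter=100):
    """Repeat the shift from `start` until it reaches a fixed point
    (or max_iter shifts have been made)."""
    current = start
    for _ in range(max_iter):
        nxt = medoid_shift_step(dist, current, bandwidth)
        if nxt == current:
            break
        current = nxt
    return current

def cluster_by_mode(dist, bandwidth):
    """Label every point with the index of the mode it converges to."""
    return [find_mode(dist, i, bandwidth) for i in range(len(dist))]
```

In this sketch, points that converge to the same mode form one cluster, and the mode itself plays the role of the cluster's representative, analogous to an iconic image. In a landmark setting the rows and columns of `dist` would correspond to images rather than points on a line.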
Tobias Weyand Libri
