I’m Bryan Ressler, a Software Engineer on the PIX Team. If you’ve been reading this blog, you’ll know that the PIX team is responsible for the photo and imaging user experiences in Windows Vista, as well as Microsoft’s Digital Image Suite product line. PIX also maintains a small incubation team, called PIX Labs, whose charter is to investigate photo-related technologies, create prototype software, and learn from those prototypes to help shape the roadmap for consumer photo experiences. I’m on the PIX Labs team, and I’d like to share some details about one of our exciting projects.
Photosynth – What is it?
Photosynth is based on research carried out by the University of Washington’s Noah Snavely and Steve Seitz with MSR Principal Researcher Rick Szeliski. They envisioned and prototyped a system by which a collection of photos could be processed into an immersive, three-dimensional viewing environment. The team’s primary research is being presented this week at the international SIGGRAPH computer graphics conference.
PIX Labs saw Photo Tourism, since renamed Photosynth, as a powerful new way for everyday photographers to enjoy their photos, and saw lots of potential beyond that. So PIX Labs began collaborating with Live Labs’ recently acquired Seadragon team to create a compelling technology preview based on the original Photo Tourism idea.
The resulting application provides a “point cloud” 3D model of the scene along with the 3D locations of all the cameras (the small orange pyramids in the picture). The photos in the collection can be “projected” onto the 3D model (like a slide projector). Because so much is known about the relationships between the photographs, easy navigation mechanisms are provided, such as “show me an image to the left of this one” (the arrows around the outside of the window), and “show me the images that are similar to this one,” as shown in the “splatter” view here.
Additionally, because Photosynth was built atop Live Labs’ Seadragon technology, with Photosynth you can zoom in to arbitrarily high resolutions. Even if every photo in the collection is 12 megapixels or more, all that information is preserved.
How It Works
Photosynth collections start with a set of photographs of roughly the same subject, such as a place, object, or monument. The photographs might have been taken all at once by the same photographer, or they might be a disparate set of pictures collected from different photographers at different times. The images are then processed by a preprocessor program that identifies “features” in the photographs – identifiable points in each image. (This picture shows a photo with some of its feature points superimposed.)
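To give a feel for what “identifying features” can mean, here is a minimal sketch in Python. It uses the classic Moravec corner measure as a stand-in, since this post doesn't say which detector the preprocessor actually uses. The function names, the tiny synthetic image, and the threshold are all assumptions made for illustration.

```python
def moravec_score(img, r, c, win=1):
    """Corner strength at (r, c): the smallest sum of squared differences
    between the window around the pixel and the same window shifted by
    one pixel in each of the 8 directions. Flat areas and straight edges
    have at least one shift that changes nothing, so they score 0."""
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
              (0, 1), (1, -1), (1, 0), (1, 1)]
    best = None
    for dr, dc in shifts:
        ssd = 0
        for i in range(-win, win + 1):
            for j in range(-win, win + 1):
                diff = img[r + i + dr][c + j + dc] - img[r + i][c + j]
                ssd += diff * diff
        if best is None or ssd < best:
            best = ssd
    return best


def detect_features(img, threshold, win=1):
    """Return (row, col) positions whose corner strength reaches the
    threshold, skipping a border wide enough for the shifted windows."""
    height, width = len(img), len(img[0])
    margin = win + 1
    features = []
    for r in range(margin, height - margin):
        for c in range(margin, width - margin):
            if moravec_score(img, r, c, win) >= threshold:
                features.append((r, c))
    return features


# Hypothetical test image: a bright square (rows and columns 3..8)
# on a dark 12x12 background.
image = [[10 if 3 <= r <= 8 and 3 <= c <= 8 else 0 for c in range(12)]
         for r in range(12)]
corners = detect_features(image, threshold=200)
print(corners)  # the four corners of the square
```

Only the corners of the square survive, because a corner is the one kind of spot that looks different no matter which way you nudge the window. That distinctiveness is what makes a point easy to find again in another photo.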
Once features have been identified for all the photos, the preprocessor finds the feature point correspondences between all the images in the set. In the process, the software uses a computer vision technique called “structure from motion” to determine the three-dimensional position of each feature point. This also allows the program to determine the relative position in 3D from which each photograph was taken.
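Finding correspondences usually comes down to comparing a numeric “descriptor” for each feature point in one photo against the descriptors in another. The sketch below shows one common way to do that, nearest-neighbor matching with a ratio test. It is an illustration under assumptions, not the preprocessor's actual code: the descriptors are made-up three-number vectors, and `match_features` and its `ratio` parameter are hypothetical names.

```python
import math


def distance(a, b):
    """Euclidean distance between two descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def match_features(desc_a, desc_b, ratio=0.8):
    """For each descriptor in image A, find its nearest neighbor in
    image B. Keep the pair only if the best candidate is clearly closer
    than the second best, which filters out ambiguous matches."""
    matches = []
    for i, da in enumerate(desc_a):
        ranked = sorted((distance(da, db), j) for j, db in enumerate(desc_b))
        if len(ranked) < 2:
            continue  # the ratio test needs two candidates
        (best, j), (second, _) = ranked[0], ranked[1]
        if best < ratio * second:
            matches.append((i, j))
    return matches


# Made-up descriptors for three features in each of two photos.
photo_a = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.5, 0.5, 0.5)]
photo_b = [(0.0, 0.9, 0.1), (0.9, 0.1, 0.0), (0.0, 0.0, 1.0)]
matches = match_features(photo_a, photo_b)
print(matches)  # [(0, 1), (1, 0)]
```

The third feature in `photo_a` is about equally close to several candidates, so it is dropped rather than guessed at. Discarding ambiguous matches matters because every wrong correspondence would pull the reconstructed 3D geometry in the wrong direction.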
The point cloud you saw in the first picture is simply the complete set of 3D feature points of all the photos in the collection. It helps provide context for the individual photos in the collection. (This image below shows part of a point cloud, some of the camera locations, and the “projection” of one of the cameras that indicates what part of the model was photographed.)
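Each point in a cloud like this can be thought of as the place where viewing rays from two or more cameras meet. Here is a minimal sketch of that idea, assuming the camera positions and ray directions are already known. Real structure-from-motion software typically has to estimate the cameras and the points together, so treat this as a simplified illustration; `triangulate_midpoint` is a hypothetical helper name.

```python
def dot(u, v):
    """Dot product of two 3D vectors."""
    return sum(a * b for a, b in zip(u, v))


def triangulate_midpoint(o1, d1, o2, d2, eps=1e-12):
    """Estimate a 3D point from two viewing rays.

    Each ray starts at a camera position (o1, o2) and points along a
    direction (d1, d2) through the matched feature. Noisy rays rarely
    intersect exactly, so we find the closest point on each ray to the
    other and return the midpoint. Returns None for parallel rays."""
    w = tuple(a - b for a, b in zip(o1, o2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        return None
    s = (b * e - c * d) / denom  # distance parameter along ray 1
    t = (a * e - b * d) / denom  # distance parameter along ray 2
    p1 = tuple(o + s * k for o, k in zip(o1, d1))
    p2 = tuple(o + t * k for o, k in zip(o2, d2))
    return tuple((x + y) / 2 for x, y in zip(p1, p2))


# Two hypothetical cameras four units apart, both seeing the same feature.
point = triangulate_midpoint((0, 0, 0), (1, 2, 5), (4, 0, 0), (-3, 2, 5))
print(point)  # (1.0, 2.0, 5.0)
```

Repeat this for every matched feature and you end up with a set of 3D points, which is the raw material of a point cloud.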
It turns out that by using this technique, quite a bit of information can be gleaned from the photo set, all automatically. And because the software knows how the photographs fit together, unique navigational aids are built into Photosynth to allow the user to navigate left, right, up, down, in, and out from the currently viewed photograph. As a result, it is very easy to take a “virtual tour” of a place, letting you see a view not too different from what you’d experience strolling around the location in person. Thus we see Rick’s original “photo tourism” dream realized.
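Once the camera positions are known, a request like “show me an image to the left of this one” becomes a small geometry problem. The sketch below works on a top-down 2D map and is only a guess at the general approach, not Photosynth's actual logic. The `Camera` class, the `neighbor` function, and the 0.5 cutoff for “facing roughly the same way” are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Camera:
    name: str
    x: float   # position on a top-down map
    y: float
    fx: float  # unit vector for the direction the camera faces
    fy: float


def neighbor(current, cameras, side):
    """Return the nearest camera on the given side ("left" or "right")
    of the current one that faces roughly the same way, or None."""
    sign = {"left": 1.0, "right": -1.0}[side]
    # Rotating the facing direction by 90 degrees gives the sideways axis.
    sx, sy = -current.fy * sign, current.fx * sign
    best, best_dist = None, math.inf
    for cam in cameras:
        if cam is current:
            continue
        if cam.fx * current.fx + cam.fy * current.fy < 0.5:
            continue  # looking somewhere else entirely
        dx, dy = cam.x - current.x, cam.y - current.y
        if dx * sx + dy * sy <= 0:
            continue  # not on the requested side
        dist = math.hypot(dx, dy)
        if dist < best_dist:
            best, best_dist = cam, dist
    return best


# Five hypothetical cameras; E stands nearby but faces the opposite way.
a = Camera("A", 0.0, 0.0, 0.0, 1.0)
b = Camera("B", -1.0, 0.0, 0.0, 1.0)
c = Camera("C", 2.0, 0.0, 0.0, 1.0)
d = Camera("D", -3.0, 0.5, 0.0, 1.0)
e = Camera("E", -0.5, 0.0, 0.0, -1.0)
cams = [a, b, c, d, e]
print(neighbor(a, cams, "left").name)   # B
print(neighbor(a, cams, "right").name)  # C
```

Skipping cameras that face a different way keeps a step to the left from suddenly spinning the view around, which would make the virtual tour feel disorienting.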
What you see here is a technology preview. In the short term, we’re working hard to get a public release ready. But what’s really exciting is to think where a technology like this could take us.
What if all the world’s billions of images were woven into a single gigantic Photosynth collection? What if you could visit any place, anywhere, through the eyes of the countless people who have photographed that place in the past? What if you could take a trip through time, seeing how a place changed as time went by?
Those are a few of the big-picture projections of where we’re going with Photosynth. But in the shorter term, here are a couple of practical examples of benefits we could see from this technology. Someone takes a picture of Nelson’s Column in the middle of Trafalgar Square in London, tags that photo “Nelson’s Column,” and adds it to the web Trafalgar Square Photosynth collection. Later, when you visit London, you upload your photos of Trafalgar Square to the same web collection. Since the software can “see” that some of your pictures contain Nelson’s Column, those pictures can be automatically tagged with the proper metadata, as can your photos of other sights in London. This makes your photos immediately more searchable and thereby more valuable.
Another example: You walk up to the Trevi Fountain in Rome and wonder about the name of the big guy in the middle of the fountain. You point your camera phone at it, snap a picture, and send it to a web service that uses the photo, perhaps your GPS or cell-triangulated location, and a Photosynth collection of the Trevi Fountain to determine the content of your photo. A moment later you receive an SMS message: “Neptune, god of the sea” along with a set of reference links.
I hope you share my enthusiasm for this technology. Photosynth is just one example of Microsoft innovation aimed at creating richer, more fulfilling experiences for real-world computer users like you and me. Keep an eye on the PIX Blog for more information on Photosynth and its upcoming public Tech Preview release. In the meantime, check out Live Labs’ Photosynth site and be sure to watch the videos of Photosynth in action.
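One way the tagging scenario could work is to pass tags between photos that observe enough of the same 3D points. The sketch below is purely speculative: the data layout, the `propagate_tags` name, and the `min_shared` threshold are assumptions, and nothing here describes an actual Photosynth feature.

```python
def propagate_tags(tagged, untagged, min_shared=3):
    """Copy tags onto new photos that see enough of the same 3D points.

    tagged:   photo name -> (set of 3D point ids, set of tags)
    untagged: photo name -> set of 3D point ids
    Returns:  photo name -> sorted list of inherited tags
    """
    result = {}
    for name, points in untagged.items():
        inherited = set()
        for tagged_points, tags in tagged.values():
            if len(points & tagged_points) >= min_shared:
                inherited |= tags
        result[name] = sorted(inherited)
    return result


# Hypothetical collection: point ids stand in for reconstructed 3D points.
tagged = {"column.jpg": ({1, 2, 3, 4, 5}, {"Nelson's Column"})}
untagged = {"mine_1.jpg": {3, 4, 5, 9}, "mine_2.jpg": {9, 10, 11}}
new_tags = propagate_tags(tagged, untagged)
print(new_tags)
```

Requiring several shared points instead of just one is a cheap guard against a single bad match spreading a tag to the wrong photo.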
– Bryan Ressler (Software Engineer)