As a follow-up to my post on location standards for mobile augmented reality, I’ve been thinking about how an AR author would specify the location of virtual objects in the physical world. In my research I came up with six different ways of specifying location, and identified some interesting challenges for mobile AR design. This is still at the “thinking out loud” stage, so call me out if you don’t like where I’ve landed.
(Just to be clear, this discussion is focused on the representation of location, and doesn’t speak to interaction design of the authoring process. I plan to address that topic in a future writeup.)
A location specification can be direct or indirect. Direct location is specified in measurable quantitative terms such as spatial coordinates. Indirect location is described relative to some other entity, or as a semantic identifier of a spatial or geographic entity. Each has its charms and disillusionments.
Direct Location
1. Geospatial coordinates — The most common model for virtual object location in mobile AR is to specify a point as (longitude, latitude, altitude) in an Earth-based coordinate system. This point becomes the origin of a local coordinate system for the virtual object. This model is relatively well understood and in wide use across mobile AR systems, location-based services, GPS points of interest (POIs), geoweb services and the like.
The specific challenges for mobile AR include standardizing on data representations, supporting multiple coordinate reference systems (CRS), defining precision and accuracy, and supporting geometries beyond simple points. Hopefully, ongoing discussions around ARML, GeoAR and others will lead to reasonable convergence among the interested communities.
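To make this a bit more concrete, here’s a minimal sketch of what a point-anchored object record might look like. The field names are my own invention, not taken from ARML, GeoAR or any other proposal:

```typescript
// Hypothetical shape for a direct geospatial location record.
// Field names are illustrative, not drawn from any real standard.
interface GeoLocation {
  crs: string;           // coordinate reference system, e.g. "EPSG:4326" (WGS-84)
  longitude: number;     // decimal degrees
  latitude: number;      // decimal degrees
  altitude?: number;     // meters above the ellipsoid; optional
  accuracy?: number;     // horizontal accuracy estimate in meters
}

// The geospatial point anchors a local coordinate system for the
// virtual object; the object's geometry lives in that local frame.
interface VirtualObject {
  id: string;
  location: GeoLocation; // origin of the object's local frame
  orientation?: [number, number, number, number]; // unit quaternion, optional
  modelUrl: string;      // the 3D asset to render at this anchor
}
```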
2. Alternative spatial coordinates — It’s not too hard to imagine cases where you want something other than a ground-based spatial coordinate system. For example, what if my object is a giant billboard in (non-geosynchronous) orbit around the Earth? A geodetic system like WGS-84 does you little good in this case, so you might want to use a geocentric model. The X3D architecture supports geocentric coordinates, for example. Better yet, what if my object is a virtual planet in orbit around the Sun? Earth coordinates will be quite unhelpful for this case, and I’m not aware of any systems that have heliocentric CRS support. Yet another interesting scenario involves indoor positioning systems which establish their own CRS on a local basis.
Challenges here include identifying alternative reference systems that should be supported for AR use cases beyond ground-based scenarios, and specifying the transformations between these frames of reference and other involved CRSes.
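To give a flavor of what those transformations involve, here’s the standard closed-form conversion from WGS-84 geodetic coordinates to a geocentric Earth-centered, Earth-fixed (ECEF) frame, using the WGS-84 ellipsoid constants:

```typescript
// Convert WGS-84 geodetic coordinates to geocentric ECEF (meters).
// Standard closed-form conversion using the WGS-84 ellipsoid.
const WGS84_A = 6378137.0;                 // semi-major axis (m)
const WGS84_F = 1 / 298.257223563;         // flattening
const WGS84_E2 = WGS84_F * (2 - WGS84_F);  // first eccentricity squared

function geodeticToEcef(
  latDeg: number,
  lonDeg: number,
  altM: number
): [number, number, number] {
  const lat = (latDeg * Math.PI) / 180;
  const lon = (lonDeg * Math.PI) / 180;
  const sinLat = Math.sin(lat);
  // N: radius of curvature in the prime vertical
  const n = WGS84_A / Math.sqrt(1 - WGS84_E2 * sinLat * sinLat);
  const x = (n + altM) * Math.cos(lat) * Math.cos(lon);
  const y = (n + altM) * Math.cos(lat) * Math.sin(lon);
  const z = (n * (1 - WGS84_E2) + altM) * sinLat;
  return [x, y, z];
}
```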
Indirect Location
3. Spatial entity identifiers — The names of geographic places — Heathrow Airport, New York City, Fujiyama, the Transamerica Building, Beverly Hills 90210 — are indirect specifications of location. So are unique identifiers such as Yahoo GeoPlanet’s WOEIDs. They are indirect because the physical coordinates they represent, the centroids and bounding boxes of their ground shapes, must be looked up in some reference database.
For AR, the opportunity is obviously to embrace the human-centric context of place names, and to leverage the large investment in geoweb services by Yahoo and others. There are many AR use cases where it would be desirable to augment an entire geographic place at once. The challenge is to define a representation for virtual objects that supports identifiers as locations, and provides for appropriate services to resolve IDs to geographic coordinates. Of course for these identifiers to have meaning, the model also needs to support geometries beyond simple points.
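Here’s a rough sketch of how an identifier-based location and its resolution service might look. The endpoint and response shape are pure assumption on my part, not the actual GeoPlanet API:

```typescript
// Hypothetical indirect location: a place identifier that must be
// resolved against a gazetteer service before it yields coordinates.
interface PlaceIdLocation {
  authority: string; // e.g. "GeoPlanet" -- the namespace that minted the ID
  placeId: string;   // e.g. a WOEID such as "2487956" (illustrative)
}

// What a resolution service might hand back: more than a point, since
// augmenting a whole place needs its shape, not just a centroid.
interface ResolvedPlace {
  centroid: { latitude: number; longitude: number };
  boundingBox: { north: number; south: number; east: number; west: number };
}

// Sketch of a resolver; the URL and response shape are made up.
async function resolvePlace(loc: PlaceIdLocation): Promise<ResolvedPlace> {
  const res = await fetch(
    `https://gazetteer.example.com/${loc.authority}/${loc.placeId}`
  );
  if (!res.ok) throw new Error(`Could not resolve place ${loc.placeId}`);
  return (await res.json()) as ResolvedPlace;
}
```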
4. Client-relative location — Location is specified relative to the client device, which presumably knows its own geographic location and stands in as a proxy for the location of the human user. This is the augmented pet scenario, maybe: faithful digiRover the auggie doggie’s location is specified as a vector offset from my location, and he follows me around as I move.
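A minimal sketch of how digiRover’s location might be resolved, assuming the offset is expressed in east-north-up meters and using a flat-earth approximation (plenty accurate at pet-leash distances):

```typescript
// Client-relative location: a small east/north/up offset (meters) from
// the device's own geodetic position.
interface EnuOffset { east: number; north: number; up: number; }

const EARTH_RADIUS = 6371000; // mean Earth radius in meters

function resolveClientRelative(
  device: { latitude: number; longitude: number; altitude: number },
  offset: EnuOffset
): { latitude: number; longitude: number; altitude: number } {
  const latRad = (device.latitude * Math.PI) / 180;
  // Flat-earth approximation: valid only for small offsets.
  const dLat = (offset.north / EARTH_RADIUS) * (180 / Math.PI);
  const dLon = (offset.east / (EARTH_RADIUS * Math.cos(latRad))) * (180 / Math.PI);
  return {
    latitude: device.latitude + dLat,
    longitude: device.longitude + dLon,
    altitude: device.altitude + offset.up,
  };
}

// digiRover stays two meters north of me, at knee height:
// resolveClientRelative(myPosition, { east: 0, north: 2, up: 0.5 })
```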
5. Viewport-relative location — In computer graphics, the viewport is basically defined by the size and resolution of the viewing display. An AR application might wish to locate a virtual object model at a specific point on the user’s display, regardless of where the “magic lens” is pointed. For example, I might “click” the QR code on a physical object and a related virtual 3D object model appears pinned to the center of my display, where I can play with it without having to hold my device aimed steadily in one place. The object’s location is specified as (x%, y%) of my device’s screen size. If you like, we could have a good discussion in the comments about whether this is a valid “location” or not.
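Whatever we decide to call it, the representation itself is nearly trivial; something like this hypothetical sketch:

```typescript
// Viewport-relative location: the object is pinned to a fraction of the
// display, independent of where the camera points. Names are illustrative.
interface ViewportLocation { xPercent: number; yPercent: number; }

function toScreenPixels(
  loc: ViewportLocation,
  screen: { width: number; height: number }
): { x: number; y: number } {
  return {
    x: (loc.xPercent / 100) * screen.width,
    y: (loc.yPercent / 100) * screen.height,
  };
}

// Pin the object to the center of a 1080x1920 display:
// toScreenPixels({ xPercent: 50, yPercent: 50 }, { width: 1080, height: 1920 })
```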
6. Object-relative location — An important class of use cases that are easy to describe in language, and more difficult to represent in code. “Alice’s avatar is on a boat”. “Bob’s tweets float above his head as he walks by”. “Charlie is wearing a virtual hat”. “The virtual spider is on the virtual table, which is on the physical floor”. In each case, a virtual object’s location is specified relative to another virtual or physical object. The location of the second object may also be relative to something else, so we need to be able to follow the nested chain of relative locations all the way down to ground-based coordinates (or fail to locate). Of course, the second object might also be an AR marker, an RFID tag, a 1D or 2D barcode or similar physical hyperlink. It might be an object that is identified and tracked by an image-based recognition system. It might further be a physical object that has embedded location capability and publishes its location through an API, as Alice’s boat might do.
Clearly object-relative location poses a host of challenges for mobile AR. Among these: defining an object model for virtual and physical objects that includes representations for identification and location; defining an extensible object identification and naming scheme; and defining a location scheme that allows for a variety of methods for resolving location specifiers, including nested constructs.
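To illustrate the nested-resolution problem, here’s a sketch of a recursive resolver that walks a chain of relative locations down to a geospatial anchor, reusing the resolveClientRelative helper from the client-relative sketch above. The type shapes are my own assumptions:

```typescript
// Sketch of resolving a nested chain of object-relative locations down
// to ground coordinates. Union members and field names are assumptions.
type Location =
  | { kind: "geo"; latitude: number; longitude: number; altitude: number }
  | { kind: "relative"; parentId: string; offset: EnuOffset };

interface LocatedObject { id: string; location: Location; }

// Follow the chain: the virtual spider -> the virtual table -> the floor.
// Returns null if the chain never bottoms out in a geospatial anchor,
// dangles, or cycles -- i.e. we "fail to locate".
function resolveToGround(
  obj: LocatedObject,
  lookup: Map<string, LocatedObject>,
  seen: Set<string> = new Set()
): { latitude: number; longitude: number; altitude: number } | null {
  if (seen.has(obj.id)) return null; // cycle detected: give up
  seen.add(obj.id);

  if (obj.location.kind === "geo") {
    const { latitude, longitude, altitude } = obj.location;
    return { latitude, longitude, altitude };
  }
  const parent = lookup.get(obj.location.parentId);
  if (!parent) return null; // dangling reference
  const base = resolveToGround(parent, lookup, seen);
  if (!base) return null;
  // Apply this object's offset to the parent's resolved ground position.
  return resolveClientRelative(base, obj.location.offset);
}
```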
As I said, this is definitely “thinking out loud”, and I’d love to have your feedback in the comments below or via @genebecker. I want to acknowledge @tishshute and Thomas Wrobel, who have been leading the discussion on using Google Wave protocols as a communication & distribution mechanism for an open AR network. That collaboration stimulated these ideas, and you might find it stimulating to participate in the conversation, which appropriately is happening as a Wave. Let me know if you need a Wave invite, I have a few left for serious participants.
As always, YMMV. Peace.