Spatialite and GeoPackage

So I’d like to talk about some of the decisions we made in GeoPackage, as much work went in to discussing alternatives and possibilities that is not obvious from the current document. And I’m interested in opening up the dialog around specification development. This is all written as a private individual, not representing any organization or standards working group.

One of the toughest decisions was how to design the binary format for feature data. One of the most important things for me was that it would be possible for implementors to build code to create a GeoPackage without requiring any special SQLite machinery. The core should just be a file format. And the file should be readable by any SQLite, not one compiled with special options or additional libraries. This means the core GeoPackage is far less than SpatiaLite. Which is one of the main ‘naive’ questions about GeoPackage – why did we not just adopt SpatiaLite? SpatiaLite as a whole does not meet my core requirement: it requires a number of additional libraries. Indeed SpatiaLite is defined by those libraries; it is a full spatial database, not a core format that any software can produce.

The less naive question is why did we not just use SpatiaLite’s geometry binary format, as it’s very similar to what ended up in the spec. Both more or less use Well Known Binary for the geometry, with some header information to help in parsing. This was perhaps the only vote that was not unanimous. From my traditional outsider perspective OGC seems to have a strong case of ‘Not Invented Here’ (TMS -> WMTS to me was particularly painful, as it just seemed to overcomplicate things more than it needed to). I thought it would be really great to have some compatibility with SpatiaLite, to potentially have a built-in install base.

It would have been a slam dunk to go that way if SpatiaLite’s binary were completely compatible with the widely used Well Known Binary specification. They have some extra headers, but those would have been fine, as they were nice optimizations. But the key was to be able to point any existing Well Known Binary parser at the geometry portion and have it parse. Unfortunately SpatiaLite made some small changes so that as a Java programmer I can’t just use JTS’s Well Known Binary parser, and it’s a similar story in OGR. And it’s the same case for any GIS vendor that has a WKB parser. So I felt that we should favor future implementors over backwards compatibility with a binary format that has a growing but still relatively small install base. But I’m still not sure about the decision, and may have gotten some things wrong, so I welcome technical dialog on this.

When we were freed from backwards compatibility we also got a chance to do some small innovations on the header. Most implementations we found put some sort of bounding box in their geometry, so we did the same. Paul Ramsey had a great suggestion, which we incorporated: a special flag for points that lets us skip the bounds, since the bounding box takes more space than the point coordinates themselves, so you can save significant space.
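To make the header idea concrete, here is a rough sketch of how a reader might peel the GeoPackage header off a geometry blob and hand the rest to any standard WKB parser. The field layout (magic bytes, version, flags byte with an envelope indicator, SRS id) follows my reading of the draft spec, so treat the exact offsets and envelope sizes as assumptions, not gospel:

```python
import struct

# Envelope sizes in bytes, keyed by the 3-bit envelope indicator in the
# flags byte (layout per my reading of the draft spec; may differ).
ENVELOPE_SIZES = {0: 0, 1: 32, 2: 48, 3: 48, 4: 64}

def extract_wkb(blob):
    """Return (srs_id, wkb) from a GeoPackage geometry blob,
    skipping the 'GP' header and the optional envelope."""
    if blob[0:2] != b'GP':
        raise ValueError("not a GeoPackage geometry blob")
    flags = blob[3]                    # blob[2] is the version byte
    envelope_indicator = (flags >> 1) & 0x07
    byte_order = '<' if flags & 0x01 else '>'
    (srs_id,) = struct.unpack(byte_order + 'i', blob[4:8])
    header_len = 8 + ENVELOPE_SIZES[envelope_indicator]
    return srs_id, blob[header_len:]
```

The point optimization falls out naturally: a point row sets the envelope indicator to 0, so the header is just eight bytes and the WKB starts immediately after.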

I tried to get a ‘plugfest’ going with Luciad and Esri. Pepijn and Luciad have been amazing, and even managed to release their work open source. Our crew at OpenGeo has been busy with other projects, but got a version out in GeoTools, and Justin found it quite easy to work with. Esri has been silent about any work they may be doing, so it’s not the plugfest success I was hoping for.

I do still believe a real commitment from Esri to include GeoPackage in their products (ideally not just as a mobile transfer format, but as an input and output to ArcGIS Desktop, Server and Online) would keep me committed to the current GeoPackage binary format, so I’m hoping they step up and get at least a reader and writer for tiles and features to help further QA the specification, as a great step towards opening up. But if we get a lot of compelling arguments to just use SpatiaLite I certainly can be swayed, especially if Esri doesn’t manage to get even a prototype out. I believe it should not be hard to change the core binary format in the spec.

There’s been some good conversation on github about the technical aspects of using SpatiaLite’s binary format. I’d love to hear more from developers on whether matching the binary format will actually make us backwards compatible with existing SpatiaLite data packages. And on the flip side I’d be really interested if someone from the SpatiaLite community could explore implementing GeoPackage, doing what all the other spatial database implementors will do, which is just use a compatibility layer to read in and write out the GeoPackage binary, using their existing geometry blob format internally. I definitely admit I may not understand all the issues, and while I’ve been impressed by the technical discussion of the SWG it is still a small group with a limited perspective. So please weigh in on the ticket, and show what’s possible with real world implementations. And if you’re an Esri customer or employee encourage them to step up and release a GeoPackage implementation, as I’d really love to have this specification go to version 1.0 with at least 3 real world implementations. Luciad, GeoTools/GeoServer and Esri would be ideal. But Luciad, GeoTools, and SpatiaLite and/or GDAL/OGR would also be great.

Githubbing the GeoPackage specification

In the past few months one of the main things I’ve spent time on is the new GeoPackage specification being worked on within the OGC. I was involved in the very early days of the conception, before it was picked up for the OWS-9 testbed, as I feel it has the potential to fill a big hole in the geospatial world (James Fee sums up my feelings nicely). We discussed keeping it super lean, accessible to all, and implementable by non-GIS experts.

While I got wrapped up in other things a spec for public comment popped out in January, which I felt had strayed from what I believed were the original core design goals. I sent a huge missive on all the changes I felt could get it back to that lean core. I try to never be one to just criticize though, as it’s easy to cast stones and much harder to build something that is strong enough to hold up to others pelting rocks. And my open source roots teach me that criticism without offering to put effort into an alternative is close to worthless. So I decided to join the GeoPackage ‘Standards Working Group’, participating in weekly (and then twice a week) calls, and trying to work with the OGC workflow of wikis and massive Word documents.

One of my main goals was to learn about how the OGC process actually works, and hopefully from a place of knowledge be able to offer some suggestions for improvement from my open source software experience. That’s worth a whole post on its own, so I’ll hold off on much of that for now.

OGC staff has also been great about being open to new ways of working. It’s had to be balanced against actually getting the specification out, as many agencies need it ‘yesterday’. But we achieved an OGC first, putting this version of the specification out on GitHub. My hope is to attract developers who are comfortable with GitHub, who won’t need to learn the whole OGC feedback process just to suggest changes and fixes.

I do believe the specification is ‘pretty good’ – it has improved in almost all the ways I was hoping it would (Pepijn and Keith drove the tech push to ‘simpler’, with Paul doing a great job as Editor and Raj from OGC helping out a bunch). I do believe GeoPackage can improve even more, primarily by feedback from real implementations. My hope is it does not go to version 1.0 without 3 full implementations that don’t just handle one part of the document, but implement all the core options (particularly Tiles and Features).

For the OGC this is an experiment, so if you do think OGC specifications could benefit by a more open process you can help encourage them by spending some of your technical brainpower on improving GeoPackage using GitHub (big shout out to Even who already has a bunch of suggestions in). Fork the spec and make Pull Requests, on core tech decisions or on editorial tweaks that make it easier to understand. For those without GitHub experience we wrote some tips on how to contribute without needing to figure out git. We also tried to add a mechanism for ‘extensions’ to encourage innovations on the GeoPackage base, so I’d also love to see attempts to use that and extend for things like Styles (SLD or CSS), Rasters, UTFGrids and Photo Observations.

And if you can’t fully implement it but are interested in integrating, I do encourage you to check out the Apache licensed libgpkg implementation from Luciad. Pepijn has been the technical core of GeoPackage, researching what’s come before and continually building code to test our assumptions. Libgpkg is the result of his work, and the most complete implementation. Justin also has a GeoTools implementation and GeoServer module built on it for the Java crowd. And hopefully someone will implement for GDAL/OGR and PostGIS soon.

As always I’ve written too much before I’ve even said all I wanted to. But I will try to do at least one follow up blog post to dig in to some of the details of GeoPackage. I and the SWG welcome an open dialog on the work done and where to take it next. And I’m happy to explain the thinking that got the spec to where it is, since we wanted to keep the spec itself slim and not include all that detail. We are hoping to eventually publish some ‘best practice’ papers that can dig in to ideas we considered yet discarded as not quite ready.

Opening Esri

So I’ve been meaning to write a post on this ever since I had a great talk with Andrew Turner, who recently joined Esri. He was expressing a good bit of optimism over Esri’s willingness to embrace ‘open’. My worry is that they would embrace the language of open, but not actually do the key things that would make a real open ecosystem. It’s easy to take just enough action to convince more naive decision makers of their intentions: a public face that looks like lots of open source projects, but is mostly code samples and API examples that are useless without pieces of closed software. So I wanted to give Esri a measurable roadmap of actions to take that would signal to me a real commitment to ‘open’.

That was a couple months ago, but then life got in the way, as tends to happen with lots of planned blog posts. But in the meantime there’s been a fascinating flare-up around the GeoServices REST specification in the OGC (most conversation is behind OGC walls, but see open letters and lengthy discussion on OSGeo). And it similarly makes me want to address ‘Esri’ as an entity. The individuals matter less; I believe that most of them want to do the right thing. But the corporation as a whole takes action in the world, and that big picture matters more than what individuals within are trying to do. So here goes:

Dear Esri,
I imagine you’ve been pretty confused and flummoxed by all the push back to the GeoServices REST specification. It is a pretty solid piece of technology, and theoretically is a great step towards opening up Esri more – putting previously proprietary interfaces in to an open standard.
The key thing to understand though is that unfortunately very few people in this industry trust you. I am actually fairly unique in that I give you a whole lot more benefit of the doubt than most. In my life I try as hard as I can to only judge people based on their direct interactions with me, not on whatever people say about them. And though you aren’t exactly a person I do still try to judge in the same way. And I personally have never had a truly bad interaction with you.
But amazingly just about everyone that I’ve encountered in the broader geospatial industry has. When I bring up that I’ve never been mistreated by you they have the utmost confidence that I inevitably will. I have always been fascinated by this, as I believe it goes far beyond ‘typical competition’, which is one thing people at Esri have told me – something like ‘they’re just jealous’. The amount of venom that otherwise totally decent people will throw out is incredible, and it makes it hard for me to not judge harshly, when so many people whose judgement I trust absolutely have felt really screwed in the past by you. Multiple times. In really unpleasant and unexpected ways. And it’s not just competitors, but partners and clients.
So the fundamental point that needs making with regards to GeoServices REST was articulated well by one of my allies: ‘everyone in the OGC membership is objecting to the REST API because they believe ESRI to be fundamentally conniving and untrustworthy.’ He believes that if the REST interface had been proposed by anyone else then there wouldn’t be a problem – it would likely be an approved OGC standard by now. But people believe it’s a ‘cynical business play by an untrustworthy, well resourced, and predatory business interest.’
I have overall had a relatively positive take on the GeoServices REST API, because I don’t (yet?) believe you to be fundamentally conniving and untrustworthy. You’ve done an amazing amount of good for this industry, but I do think you’ve just lost your way a bit. I imagine you think that putting this great interface in to OGC is a great step in the right direction, and it is, but unfortunately it’s too much too fast. It’s as if the schoolyard bully is suddenly super nice to you – you wonder what’s up his sleeve, how he’s going to screw you this time.
But I believe there’s a path forward, to building up people’s trust so that something like GeoServices REST could be accepted in OGC. It just has to be slower and more incremental. This list is just what I can think of right now. There is a fundamental principle at work though, which is moving towards an architecture that encourages people to mix and match Esri components with other technology. Not merely implementing open standards to check the box, but building for hybrid environments: QGIS exporting to or editing ArcGIS Server, ArcGIS Desktop doing the same to GeoServer, TileMill styling a file geodatabase and publishing to ArcGIS Online, ArcGIS Desktop styling and publishing to Google Maps Engine or CartoDB, etc, etc. Each piece of Esri technology ideally could be used standalone with other pieces. Stated another way, there should be no lock-in of anything that users create – even their cartography rules. Anything that is created with Esri software should be able to be liberated without extreme pain, in line with the Data Liberation Front (though I think the Google Maps Engine and Earth Enterprise teams also need some help from them; they too deserve a similar letter).
I realize this is a big leap, since it is not the absolutely most efficient way to solve the needs of most of your customers, since most of them use only your software. And it is a business risk, since it opens up more potential competition. But it’s also a big business opportunity if done right. And reaches beyond mere business to being a real force for good in the world, becoming a truly loved company, with lots of friends. This list is roughly ordered by easier wins first, and so later ones should probably build on them. If these actions are taken I will start to defend you to my friends a lot more.
  • Enable W*S services by default in ArcGIS Server. You’ve done a pretty great job of doing CITE certified implementations. But as the docs show they are not enabled by default, though GeoServices REST and KML are.
  • Make GeoJSON an output option for the ArcGIS Server REST implementation (and get the clients all reading it). And then get it in the next GeoServices REST specification version.
  • Publish the file formats of file geodatabase. The open API was a really great step, and indeed goes 80% of the way. But many libraries would like to not have to depend on it. We all know the format itself is likely an embarrassing mess, but everyone will forgive, as we’ve all written embarrassing code and formats.
  • Help the coming GeoPackage be the next generation of file geodatabases. Help it evolve to be the main way Esri software moves data between one another. Your team has been great on it so far, but the key step is to make it a top level format not just in mobile but also on ArcGIS Desktop (reading and writing it, like a shapefile), Server and Online (as an output format and upload configuration format for both). And I’m hoping your team can join our plug-fest this week, to get to 3 real implementations before version 1.0 of the specification.
  • Stop the marketing posts about how open you are. Open is action and conversation, not a press release or a case study. No need to talk about how we’ll be ‘surprised’ by what you open in the future. Just start opening a lot and let others do the posts for you, and count on that reaching your customers if you’re doing it well.
  • Openly publish the .lyr file format (and any other formats that define cartography and internal data structures for maps). This is probably an internal format that isn’t pretty, but opening it would enable a lot of cool hybrid architectures.
  • Support CartoCSS as an alternative to .lyr file, and collaborate around advancing it as an open standard to ideally make it the next generation .lyr file. This likely will include a number of vendor extensions as your rendering capabilities are beyond most everyone else. But we can all collaborate on a core. Supporting SLD in desktop would also be great, but we can also just look to the future.
  • Move WFS and GML support out of the ‘data interoperability’ extension and in to core format support.
  • Bring support for WFS-T to ArcGIS Desktop, so people can use their desktop to directly edit a WFS-T server. The server side support in ArcGIS Server is great, but the client software also needs to implement the open standards for real interoperability. I think a similar thing might be needed for Desktop to edit directly with GeoServices REST, though maybe that is there now.
  • Support WFS-T in Javascript API and iOS SDK (and I guess flex and silverlight, since you tend to try to have all toolkits the same).
  • Become a true open source citizen. This is another large topic, and as is my tendency I’m already going on too long. So perhaps I will detail it more in another post. I have limited the above to mostly standards, but embracing open source can take you even further. But it is much more about contributing to existing open source libraries and building a real community of outside contributors on your projects, not just releasing lots of code on github. Karl Fogel wrote an excellent book on the subject. You are making some progress on this, but it is just a smidgen if you are actually serious about opening up. I will give props for the flex api, though my cynical friends would say it’s the easiest one to open source as it is dying. So please, do continue to surprise us on the open source front, you just don’t need to talk about it a bunch.
So I believe that doing a majority of these will send a strong signal that Esri is truly changing to be more open and that the submission of the GSR spec and putting code on github is more than a marketing move.
My personal recommendation for you on the GeoServices REST specification is to back down from the OGC standardization for now. Instead work to evolve the spec in the open, and try again in a year or two. I don’t think any argument is going to win over those against it, only real action will. And continuing to try to force it through will only hurt our global community. I do believe there has been really great progress made on the spec through the OGC process, and the resulting document has a much higher chance of being implemented by others. Perhaps we could make it an ‘incubating’ OGC standard, that evolves with OGC process, but does not yet get the marketing stamp. The key to me is to encourage real implementations of it, and continue to advance Esri’s implementation as well, but in dialog with other implementers. Everyone can now start with the latest version, but without the backwards compatibility requirement. Call it GeoServices REST 1.0 – it can be a standard that improved from the OGC process though needs more ‘incubation’ before it deserves the full stamp of interoperability. And aim for 1.1 or 2.0 to happen outside the OGC, and once there are 3 solid implementations all influencing its future then resubmit to OGC.
One thing that I think could help a lot is to publish and work on the spec on github, where people can more easily contribute. Take a more agile process to improving it, that doesn’t depend on a huge contentious vote and lots of OGC process. And when the trust has grown all around submit it to be an official OGC document. I believe this would also be one of the best signals to the wider world of Esri’s commitment to open, to dialog publicly and then iteratively address the concerns of all objectors in a collaborative process. If done right it will fly through the OGC process the next time, or else by then we will all be working together towards the next generation APIs that combine the best of GeoServices REST and W*S, while cutting down complexity. It will be difficult, but will ensure that the geospatial industry remains relevant as the whole world becomes geospatial. So let’s all be as open as we can to figure out how to get there together. 
Chris Holmes

(Written as a private citizen, who cares about our collective future and believes geospatial has a higher role to play than just an ‘industry’. Views expressed here do not reflect my employer or any other organizations I am associated with)



Collaborative Mapping: Tools

Continuing the collaborative mapping thread, I’d like to think a bit about tools to make this happen. Do a bit of dreaming, and maybe think through how we can get there. Definitely as soon as I start to talk about this people want to do all kinds of crazy synchronization and distributed editing of features. I do think we’ll get there, but I fear going for too much too soon, getting loaded down by over-designing and not addressing the immediate problems. Indeed Open Street Map has proven that if the energy is there the tools just need to do the very basics. I have been putting my energy in to getting a standards based implementation, on top of WFS-T, but that’s more because I know it and I like standards. I don’t think it’s the best way to do things, and I don’t even think it should be the default way to do things – at this point I’d prefer something more RESTful. But I believe in being compatible with as much as possible, and there are already nice clients written against WFS-T. So it should always be a route in to collaborative editing.

First off, I think we need more user friendly options for collaborative editing. Not just putting some points on a map, but being able to get a sense of the history of the map, getting logs of changes and diffs of certain actions. Editing should be a breeze, and there should be a number of tools that enable this. Google’s MyMaps starts to get at the ease of editing, but I want it collaborative, able to track the history of edits and give you a visual diff of what’s changed. Rollbacks should also be a breeze – if you have really easy tools to edit it’s also going to be easier for people to vandalize. So you need to make tools that are even easier to rollback. On the GeoServer extended WFS-T Versioning API we’ve got a rollback operation that can work against an area of the map, a certain property, or a certain user (or combinations of those). Soon we hope to be working on some tools built on top of OpenLayers to handle those operations in a nice editing environment.
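Conceptually, a rollback like that works by scanning the edit history newest-first and emitting inverse edits for everything matching the filters. The sketch below is purely illustrative; the data structures and function names are mine, not the actual GeoServer Versioning API:

```python
def intersects(a, b):
    """Axis-aligned bbox overlap test; boxes are (minx, miny, maxx, maxy)."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def rollback(history, user=None, bbox=None):
    """Walk an edit history (ordered newest-first) and produce the
    inverse edits for changes matching a user and/or area.
    Each history entry is a dict with 'feature', 'user', 'bbox',
    and 'before' (the pre-change state) keys -- hypothetical names."""
    undo = []
    for edit in history:
        if user is not None and edit["user"] != user:
            continue
        if bbox is not None and not intersects(edit["bbox"], bbox):
            continue
        # The inverse edit restores the pre-change state.
        undo.append({"feature": edit["feature"], "restore": edit["before"]})
    return undo
```

Applying the returned edits in order undoes the newest change first, which is what keeps overlapping edits from clobbering each other.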

The next step on user friendly options will be desktop applications that aren’t full GIS, but that let users easily edit. These can leverage the tools of existing open source GIS desktop environments, like uDig and qgis, but can strip down the interface to just be simple editing environments with a few hard coded background layers. You could have branded environments for specific layers of information. And ideally build other kinds of reporting tools that also leverage the same GIS tools, but in an interface geared towards the task at hand, like search and rescue or tracking birds. The other thing I hope to work on is getting some of the editing hooked up with Google Earth. I just learned there’s a COM API that might allow us to hack something in, or we can try to get Google Earth to support POSTing of KML to arbitrary URLs as Sean suggests.

Next I’d like to see integration with ‘power tools’, the full-on, expensive-ass GIS applications that are the realm of ‘professionals’. Not that I have a huge love for those tools, but I’d really like to engage as many people as possible in to collaborative mapping. GIS professionals are a great target audience, since most of them are already passionate about mapping. They have a lot of expertise to bring to the table. And while some of them can be elitist about collaborative mapping and ‘lesser’ tools, so too can many of the amateurs raise their noses at people who aren’t DIY. At the extremes it can obviously be a major divide, but I think both could have a lot to teach each other if they’re willing to listen. But I believe the first step to get there is to get the ‘power tools’ compatible with the collaborative mapping protocols, so you start them off in collaboration. This is one reason I’m an advocate of the WFS-T approach, as there are plugins for ArcGIS and other heavy desktop GIS’s. I think we could see some professionals get really excited about collaborative mapping, as it could become the thing they are passionate about and do in their free time that is fun and helps boost their resume. This is how many open source contributions work now; it’s a complex interplay that includes professional development. Perhaps one’s collaborative mapping contributions could help land jobs in the future.

I’d also like to see more automation available in the process. This is an area that could use a lot of experimentation: how much to automate, how much to let humans collaborate on. But I think there’s an untapped area of figuring out vector geometries from the aggregated tracks of GPS, cell phones and wifi positioning data. People are generating tons of data every single day, and most of it is not even recorded. It’s great when people take a GPS and decide explicitly to map an area and then go online and digitize it. But we could potentially get even more accurate than just one person’s GPS by aggregating all the data over a road. Good algorithms could extract the vector information, including turn restriction data, since it could figure out that 99% of fast moving tracks are going in the same direction. Of course we’ll still need people to add the valuable attribute information, but this way they’d have a nice geometry already in place.
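As a toy example of that direction-consensus idea, here’s a sketch that guesses one-way streets from track bearings. Everything here is illustrative: a real implementation would need map-matching, noise filtering, and proper geodesy rather than this flat-plane approximation:

```python
import math

def bearing(p1, p2):
    """Compass-style bearing in degrees from p1 to p2 (crude planar approx)."""
    return math.degrees(math.atan2(p2[0] - p1[0], p2[1] - p1[1])) % 360

def looks_one_way(tracks, threshold=0.99):
    """Guess whether a road is one-way from many GPS tracks along it.

    tracks: list of point sequences [(x, y), ...] recorded in travel
    order. Returns True if at least `threshold` of the tracks move in
    the same overall direction as the first track."""
    if not tracks:
        return False
    reference = bearing(tracks[0][0], tracks[0][-1])
    same = 0
    for t in tracks:
        b = bearing(t[0], t[-1])
        # Angular difference, wrapped into [0, 180].
        diff = abs((b - reference + 180) % 360 - 180)
        if diff < 90:
            same += 1
    return same / len(tracks) >= threshold
```

With 99 eastbound tracks and one westbound outlier the consensus holds, which is exactly the "99% of fast moving tracks" signal described above.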

You could also do feature extraction from satellite and aerial imagery. This is obviously a tough problem that many people are working on, but perhaps it could also be improved by leveraging human collaboration. In a system with good feedback people could perhaps help train the feature extraction to improve over time. It also could be valuable to do automated change detection, which then notifies people that something’s changed in the area, and then they could figure out the proper action.

The final area I think we could improve with automation is prevention of vandalism and silly mistakes. GeoServer had work done by Refractions a few years ago to do an automatic validation engine. Unfortunately this has languished with no documentation, but it’s still part of GeoServer. One can define arbitrary rules to automatically reject bad transactions – geometries that intersect badly, roads without names, etc. This could also reject things like ‘Chris Rulez’ scrawled over the whole of the US, as it could know that no real roads run in completely straight lines for over 200 miles. I could imagine a whole nice chain of rules to ensure that all edits meet certain quality criteria. And perhaps, instead of rejecting it straight up, any edit that doesn’t follow all the rules can go in to a sandbox. I could also imagine some sort of continuous integration system once there is topology to check network validity, and other quality assurance pieces that can’t take place instantly.
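A rule like the "no 200-mile ruler-straight road" check is easy to sketch. This is a hypothetical rule of my own, not part of the actual validation engine, and the distance math is a crude flat-earth approximation that is fine for a sanity check but not for surveying:

```python
import math

def suspiciously_straight(coords, max_straight_miles=200.0):
    """Flag a 'road' polyline that runs dead straight for an implausible
    distance -- e.g. 'Chris Rulez' scrawled across the whole US.

    coords: [(lon, lat), ...] in degrees."""
    def miles(p1, p2):
        # Equirectangular approximation: ~69.17 miles per degree of latitude.
        lat = math.radians((p1[1] + p2[1]) / 2)
        dx = (p2[0] - p1[0]) * math.cos(lat) * 69.17
        dy = (p2[1] - p1[1]) * 69.17
        return math.hypot(dx, dy)

    if len(coords) < 2:
        return False
    end_to_end = miles(coords[0], coords[-1])
    along_path = sum(miles(a, b) for a, b in zip(coords, coords[1:]))
    # A perfectly straight path has length equal to its endpoint distance.
    straight = along_path <= end_to_end * 1.001
    return straight and end_to_end > max_straight_miles
```

A transaction handler could chain any number of predicates like this one and divert failing edits to the sandbox rather than rejecting them outright.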

Ok, I’ll wrap this post up for now, will continue this thread soon.

Collaborative Mapping: The Business Thread, cont.

So if there is a future where collaborative mapping could be economically competitive, how do we go about actually getting there?  I actually think we’re further along than many might think, though I believe there is still a lot of work to be done, innovating with the tools, communities and workflows to make this happen.  But I’ll address that in another post; for now I just want to present a possible path for collaborative mapping to bootstrap in to the mainstream.  I’m going to focus on street maps, since that’s the information that people pay big money for, and there is already early success with Open Street Map.  Later posts will examine how the lessons learned there can feed in to other domains and back.

So step 0 is proving that it’s possible for a diverse group of people to collaborate on an openly licensed map.  I’d be hard pressed to entertain any arguments that Open Street Map has not already accomplished this.  Of course in its current state you can’t navigate a car on it, you’re not going to do emergency vehicle response with it.  But their driving principle has been they ‘just want a fscking map’, and a map they do have.  There are many contributors running around with GPS’s and creating a map.

The next point in the evolution is when the map is good enough for basic ‘context’.  Again, OSM is already there for several parts of the world.  If you’re doing a mashup of your favorite neighborhoods you don’t really care if all the streets are there.  You just need enough that it looks about like your neighborhood on other maps.  Many mashups use google maps and others in this way – which is sorta like using the same quality water to flush your toilet as comes out of your kitchen sink (USA!).  Which is to say a bit of a waste, but who really cares if someone else is paying for it.

Which speaks to another tipping point, which is when the big portals start putting ads on their maps.  Or when they start charging to use their APIs.  I concede now that this may never happen, that it’s a good loss leader to have people using your API for free as long as they put their maps out in the public.  But a part of me feels like we may be in that period of the GeoWeb like the first web bubble, when you could get $10 off coupons from CDNow and B+N, allowing you to buy any cd you wanted for a few bucks.  It wasn’t going to last, but it’s sure fun while it lasts.  But at some point there may be a shift when they need to make some money, which could drive more energy to collaborative maps as people look to get ads off their service.

The next step starts to get fun, which would be once a collaborative street map gets good enough for basic routing and navigation.  Right now it seems to be (though I could be wrong, I don’t know the OSM community intimately) mostly people who set out to add data to the map because they want to get their own area mapped.  If they go to new areas they’ll bring a GPS along, but it’s often to a totally unmapped area.  I think once large areas start to get close to completion we’ll have people cobble together ghetto car navigation kits.  A laptop with a GPS and the collaborative map, either connected over some kind of wireless internet or downloaded to the car.  One can drive around with this and it will show one’s place on the map, and directions to the end point as well.  Note that this kind of usage is currently illegal with Google Maps or any of the others who get their data from commercial providers.  From the API agreement: ‘In addition, the Service may not be used: (a) for or with real time route guidance (including without limitation, turn-by-turn route guidance and other routing that is enabled through the use of a sensor)’.  This is because the commercial mapping providers make big money off of car navigation, and license the (exact same) data to do that at a higher price.

With basic navigation on a collaborative map in place you can get people excited about heading off into a 'new frontier', going off the map and tapping into their inner Lewis and Clark.  Actively encourage people to dérive (though I'm not sure how much the Situationists would really like the idea of people using cars to dérive) into uncharted areas of the map.

On other fronts I believe we'll see niche areas getting high-quality mapping.  Governments and companies will realize that if a map is 80% done and owning the map is not their key value proposition, they can simply fund the last 20% of the collaborative map instead of doing it all themselves.  Those that can think long term will realize that this will almost always be cheaper, since they won't have to keep paying to keep it up to date; with a good collaborative structure much of that will happen on its own, and they can put a bit extra in each year.  And in areas where a few different organizations all partner up it will definitely be cheaper.  Already we're seeing some enlightened folks fund OpenStreetMap contributors to hold a mapping party and map an area.

We'll also likely see collaborative maps for niche verticals.  If you're doing walking maps, for example, you don't need the turn-restriction information required for car routing.  Someone may offer a map of the best drives in Southern California, which would be a subset of the main map, or a detailed map of which roads need to be plowed after a snowstorm that leaves out the roads that don't.

After that I think you'll see people hacking commercial nav systems to make use of the collaborative map, and then navigation companies offering low-price versions of their systems that don't rely on the commercial data.  Already we're seeing navigation companies start to 'leverage user contributions', with TomTom's 'MapShare' letting people update points of interest and the like, and Dash Navigation's ability to leverage GPS traces from other cars to see if a new road has opened up.  I think people would be even more excited about this if they knew their work was going to a common good instead of just to the advantage of one company.

Once people are able to 'correct' the map that they're driving on I believe we'll see a really big tipping point.  Build in some voice recognition so a driver can call out the name of a street while driving.  This could be billed as the 'mapping game', where one gets points for driving new areas.  One could even imagine a company that sets up a business around a sort of 'bounty navigation', where you can actually make money if you drive new areas of the map and do good reporting of road names and the like.  This could be one of the decoupled functions in the economics of collaborative map making: the navigation company partners with the company that guarantees the map is up to date, and instead of contracting another company to drive the roads they just put money rewards on driving in new areas.  People could make their navigation free, or it could even be like the electrical grid, where if you generate a lot of extra navigation information they pay you.  I haven't thought through all the details of this, but I think it could work, and it would be super cool for helping people think of geospatial data as a commons that one can contribute to, that we're all responsible for and can be a part of, not just consumers of a service.

Which speaks a bit to a further point: governments will realize that they can tap into and contribute to this as well.  The census spends a ton of money keeping road information up to date, but its data is not entirely accurate, and it doesn't include any turn restrictions.  Instead of maintaining their own database they could combine with an open map and plug into that workflow; indeed, in the US such a map likely would have started from one of their TIGER/Line files anyway.  So government organizations can join the ecosystem, likely just as funders contracting other companies to perform the work, as they are starting to do more and more with open source software.  Some may want to try to do it themselves, but the smart ones will plug into existing ecosystems.

The other tipping point, towards the end, will be when the big mapping providers decide to invest in collaborative maps.  I had initially been thinking that things would need to be really far along worldwide before they'd make the switch, but a more likely path is that they use collaborative maps in conjunction with their commercial ones.  They already make use of Tele Atlas and Navteq in different places, so as long as the collaborative map didn't have a restriction about combining with other sources they could just use it in places that have poor coverage from the major providers.  And they could see where areas of the map are close to being done and strategically fund those.  Another potential source of investment in this kind of mapping could be aid agencies working in areas that commercial providers haven't mapped.  They could hook up their GPS units to gather information, and then employ a few people to help process and QA it to make maps they can use.  Since it's not a core value proposition for them they can share it with others, and start to build really good street maps in areas no one has touched because the potential revenue doesn't justify the effort.  I would love to try a start-up in Africa that hooks the map-correcting car navigation systems up to a bunch of vehicles and just starts building the living map.  It'd be quite ironic if Africa ended up with more up to date maps than Europe.

The key with all this, for me, is the evolution of viewing mapping data as a public good that we all collaborate on to make better.  As GPS units become more and more prevalent we are all just emitting maps as we go through our lives.  All that's really needed is a structure to turn that into useful information: getting the tools better and setting up the economic reward structure.  I'm not a business person, so I don't have much more to throw out in terms of economic ideas, but I believe it is possible to set the levers right to encourage this.  And I'm going to do my best to make the tools better and better, to show what is possible and get us all moving towards a future where an up to date, accurate map is a commons available to all, and that all are a part of.

Google and the Geospatial Web: A smaller piece of a much, much larger pie.

Well, I've been back from Where 2.0 for a while, and as usual blogging hasn't been the highest priority, but there's one topic I've really been wanting to write about.

And that is that Google seems to be legitimately moving in a more open direction with regards to geospatial. I've rarely been overtly critical of their lack of openness, but it's always been a source of frustration for me. And as I got to know more people there I realized that their lack of collaboration wasn't the result of any malicious intent; it was simply a perceived lack of resources. They felt they didn't have time to put effort into standards and working with others. And I'll be the first to admit that it takes more work to be open and collaborative.

But I do say ‘perceived’, because the thing I’ve found again and again doing the ‘open’ thing is that it’s an investment that pays off in the medium and long term. Working alone definitely allows you to move faster in the short term, but working with others leaves you much better off in the longer term.

With regards to Google’s geo portfolio, the way I’ve always termed it is they could have a huge piece of a small pie or a sizeable piece of a much, much bigger pie. What pie is it we’re talking about? What I’ve referred to as the Geospatial Web, though I’m trying to call it the geoweb, since that term seems to be taking off more. They are obviously the clear leader, with Google Earth and Maps, specifically KML and mashups – as those both allow more geospatial data to get out in the world. And they could just push KML and their platform and do quite well. But it would be a silo. It wouldn’t be like the web, it’d be a greatly expanded and easier to use Geography Network. Much, much better and bigger, but still a single platform. It could potentially even become a platform like Windows, truly dominant, but the point for me is it still wouldn’t be as big as it could be. It wouldn’t be the World Wide Web, where innovation comes from all over building something far bigger than any single company could possibly make on their own.

The bigger pie is the vision of a true Geospatial Web, one that diverse individuals and organizations all contribute to, and where technical innovations come from all over. To achieve this there must be an underpinning of open standards that others can contribute to. There must be an ecosystem of companies and services, business models and startups and dot-orgs. The ecosystem can be dominated by an entity, but it can't be entirely dependent on a single entity, as would be the case if Google defines the software and the format and the search engine. But if this open geoweb is nurtured and encouraged the right way we'll get exponential growth. Citizens will start demanding that governments and organizations put their data on it, just like we've seen happen with eGovernment on the WWW. It will become a default, and people will look at you weird if you have geospatial data that's not on it.

I think it's not crazy to aim for the majority of all spatial information to be available on it. It will be a much bigger pie than one that Google owns, as more and more people will feel comfortable making their data available, since it's a public resource instead of something clearly benefiting a single company. And it also allows further innovations to come from the outside. Google has a ton of smart people, but they don't have all the smart people in the world. They can afford to let innovation come from elsewhere (though I'm sure they'll probably just buy up the best ones), because they'll start to do what the company does best: search. There's no reason to own a geoweb when you can own the way most people find information on the geoweb.

Of course, even with search Google could constrain it to their web, as they did when geo search first came out: it was called KML Search and could only find KML. What they are going for now is much more ambitious, and indeed a bit more risky. And so I applaud them for it: they are putting a stake in the ground that says 'our best ideas are not behind us'. They are going to be a leading force in a much bigger pie, and turn this open collaboration into a really good long term investment.

Ok, I've gone on speculating about things; I probably should give a bit of evidence. I admit it's pretty subtle, but based on it and a few conversations my gut tells me that they are legitimately on the level. At least for now, that is: that's not to say that some corporate decision couldn't move things in the opposite direction, such being the fate of a publicly traded company. But they seem to be trying to do some work that will be hard to undo.

First, KML Search is now referred to as ‘geo search’, and is crawling not just KML but also GeoRSS, with more formats likely coming soon. This is one of the most important pieces to me, and was the announcement that excited me much more than StreetView. It is admitting that it’s ok for people to use other formats, even though KML is super nice and easy to use. Yes, more formats may confuse my grandmother (one of the eloquent arguments used by Google folks in the past for why we should all just use KML), but more formats also means extending an olive branch that says you can work with others.
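Part of what makes GeoRSS easy for a crawler to support alongside KML is that in GeoRSS-Simple the geometry is just one whitespace-separated element inside an ordinary feed entry. As a minimal sketch (the feed entry below is invented for illustration, not taken from any real feed), here's how one might pull a point out of an Atom entry using only Python's standard library:

```python
# Sketch: extracting a GeoRSS-Simple point from an Atom entry.
# The entry content is a made-up example; coordinates are lat lon,
# as GeoRSS-Simple specifies.
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
GEORSS = "http://www.georss.org/georss"

entry_xml = f"""
<entry xmlns="{ATOM}" xmlns:georss="{GEORSS}">
  <title>Example placemark</title>
  <georss:point>45.256 -71.92</georss:point>
</entry>
"""

def parse_point(xml_text):
    """Return (lat, lon) from the first georss:point in an Atom entry."""
    root = ET.fromstring(xml_text)
    point = root.find(f"{{{GEORSS}}}point")
    lat, lon = (float(v) for v in point.text.split())
    return lat, lon

print(parse_point(entry_xml))  # (45.256, -71.92)
```

A full KML Placemark carries a lot more machinery (styles, nested folders, lon/lat/altitude coordinate order), which is exactly why supporting the simpler format too is a meaningful olive branch.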

Second, Google is an active sponsor of OGC's OWS-5. I had been a bit skeptical of their throwing KML over the fence to OGC. Yes, it's nice that the copyright is with OGC, but that's kind of meaningless to me unless KML actually aligns with the other open standards. OGC would likely try to do that, but then it remains a question whether Google would actually support the new standard, or whether they'd retain covert control over it, with the ability to veto any decisions they didn't like simply by not implementing them in Google Earth and Maps. But they are sponsoring OWS-5, which will fund several server and client implementations to flesh out a new KML spec that incorporates other OGC standards. The OWS testbeds are the best way to develop specs in the OGC, and putting real money up for this definitely indicates to me a commitment to making KML a true open standard, not just a rubber-stamped pseudo-standard. The one piece I'm not sure about is how much they'll have engineers working with OWS-5 to try out the new spec ideas in Google Earth and Maps. If they have a couple of people show up at the kickoff meeting who are set to work on it for the next few months I will be very happy.

Third, John Hanke's speech at Where 2.0 was the first time I had heard the Google geo team really tell the world that they want to work with others. Some of it was subtle, but there was definitely a flavor of openness and collaboration that I'd not felt before. Previous speeches would always come back to the innovations they're doing, how great KML is, etc. There was little acknowledgment of an outside world, which could come across as fairly arrogant: not only are we doing things the best way, we haven't even looked into how anyone else might do them, since we must already be doing them the best.

And finally, in private conversations many Googlers have talked about a more open shift in the past 6-9 months. There were always a few voices for that, but it sounds like a tipping point has been reached and there is now a critical mass: the voices are heard and effort is being oriented in that direction. I think it's an investment that will really pay off for Google, and though I'm going to continue to work to push them into ever more open directions (maybe even to be able to talk to them about what they've got in the pipeline without signing an NDA? Ah, to dream ;), count me as a skeptic who is becoming more and more convinced that we're going to build a true, open, collaborative geoweb.

Public domain imagery from iCubed for WorldWind and beyond?

So I’m watching this video about the new Java WorldWind.

And there are a couple of quotes of interest from Patrick Hogan, NASA's lead on the project:

That's access to different NASA datasets that you can leverage, public domain, so you can use and abuse that information as you like, do anything you want with it, but mostly have fun with innovating, kind of going places we haven't even dreamed of yet.

I should point out that the iLandsat is from a company called iCubed and they have provided that kind of, that dataset for the earth that typically costs about a quarter of a million plus just for internal use, and they have donated it to WorldWind for use by the public.

Public domain imagery from iCubed? Sounds like a dream come true to me. Of course this just opens up lots more questions, like what resolution, what part of the world, what year it's from, etc. But if it's truly public domain that's really great news for any collaborative mapping projects that are unsure about deriving their information from commercial imagery.

I'm hoping that someone will be able to hack in and figure out if the imagery is really available. But the server referred to in the source code seems to return 'Server is too busy' errors, and when I use WorldWind here I'm not getting any tiles. When I get some time I'll try to dig into the source a bit more and maybe get some links to the imagery.

Looking at the source code does seem to reveal some references to GeoServer for their placename layer, which we always like to see 🙂 I will encourage them to change the namespace prefix from 'topp' (which is the default, and refers to the organization I work for) to something more appropriate like 'nasa' (though keeping it does make it easier for me to know it's a GeoServer, which is nice…). And I'm curious about their ASPX cache: if you guys let me know what/how you're caching I'd be happy to try to build it as a module for GeoServer.
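For readers wondering where that prefix actually shows up: in WFS requests the namespace prefix is part of every feature type name, so a default GeoServer install advertises layers as `topp:something`. A small sketch (the host and layer name here are hypothetical, chosen just to illustrate the prefix):

```python
# Sketch: the namespace prefix appears in the typeName of a WFS
# GetFeature request. "example.com" and "placenames" are made up.
from urllib.parse import urlencode

def wfs_getfeature_url(base, prefix, layer):
    """Build a WFS 1.0.0 GetFeature URL for a prefixed feature type."""
    params = {
        "service": "WFS",
        "version": "1.0.0",
        "request": "GetFeature",
        "typeName": f"{prefix}:{layer}",  # e.g. topp:placenames
    }
    return base + "?" + urlencode(params)

# With the default prefix the layer is requested as topp:placenames;
# after renaming the namespace it would be nasa:placenames instead.
print(wfs_getfeature_url("http://example.com/geoserver/wfs", "topp", "placenames"))
```

Changing the prefix is purely cosmetic to the data, but every client request carries it, which is why the default leaks the server's identity.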