Spatialite and GeoPackage

So I’d like to talk about some of the decisions we made in GeoPackage, as much work went in to discussing alternatives and possibilities that is not obvious from the current document. And I’m interested in opening up the dialog around specification development. This is all written as a private individual, not representing any organization or standards working group.

One of the toughest decisions was how to design the binary format for feature data. One of the most important things for me was that it would be possible for implementors to build code to create a geopackage without requiring any special SQLite stuff. The core should just be a file format. And the file should be readable by any SQLite, not one compiled with special options or additional libraries. This means the core GeoPackage is far less than SpatiaLite. Which is one of the main ‘naive’ questions about GeoPackage – why did we not just adopt SpatiaLite? SpatiaLite as a whole does not meet my core requirement: it requires a number of libraries. Indeed it is the additional libraries, it is the full spatial database, not a core format that any software can produce.

The less naive question is why did we not just use SpatiaLite’s geometry binary format, as it’s very similar to what ended up in the spec. Both more or less use Well Known Binary for the geometry, with some header information to help in parsing. This was perhaps the only vote that was not unanimous. From my traditional outsider perspective OGC seems to have a strong case of ‘Not Invented Here’ (TMS -> WMTS to me was particular painful, as it just seemed to overcomplicate things more than it needed to). I thought it would be really great to have some compatibility with SpatiaLite, to potentially have a built-in install base.

It would have been a slam dunk to go that way if SpatiaLite’s binary was completely compatible with the widely used Well Known Binary specification. They have some extra headers, but those would have been fine, as they were nice optimizations. But the key was to be able to point any existing Well Known Binary parser at the geometry portion and have it read. Unfortunately SpatiaLite made some small changes so that as a Java programmer I can’t just use JTS‘s Well Known Binary parser, and it’s a similar story in OGR. And it’s the same case for any GIS vendor that has a WKB parser. So I felt that we should favor future implementors over backwards compatibility with a binary format that has a growing but still relatively small install base.  But I’m still not sure about the decision, and may have gotten some things wrong, so I welcome technical dialog on this.

When we were freed from backwards compatibility we also got a chance to do some small innovations on the header. Most implementations we found put some sort of bounding box in their geometry, so we did the same. Paul Ramsey had a great suggestion we incorporated to have a special flag for points so we could skip the bounds, since it requires four times as much information as the point itself, so you can save significant space.

I tried to get a ‘plugfest’ going with Luciad and Esri. Pepijn and Luciad have been amazing, and even managed to release their work open source. Our crew at OpenGeo has been busy with other projects, but got a version out in GeoTools, and Justin found it quite easy to work with. Esri has been silent about any work they may be doing, so it’s not the plugfest success I was hoping for.

I do still believe a real commitment from Esri to include GeoPackage in their products (ideally not just as a mobile transfer format, but as an input and output to ArcGIS Desktop, Server and Online) would keep me committed to the current GeoPackage binary format, so I’m hoping they step up and get at least a reader and writer for tiles and features to help further QA the specification, as a great step towards opening up. But if we get a lot of compelling arguments to just use SpatiaLite I certainly can be swayed, especially if Esri doesn’t manage to get even a prototype out. I believe it should not be hard to change the core binary format in the spec.

There’s been some good conversation on github about the technical aspects of using SpatiaLite’s binary format. I’d love to hear more from developers on if matching the binary format will actually make us backwards compatible with existing SpatiaLite data packages. And on the flip side I’d be really interested if someone from the SpatiaLite community could explore implementing GeoPackage, doing what all the other spatial database implementors will do, which is just use a compatibility layer to read in and write out the GeoPackage binary, using their existing geometry blob format internally. I definitely admit I may not understand all the issues, and while I’ve been impressed by the technical discussion of the SWG it is still a small group with a limited perspective. So please sound in on the ticket, and show what’s possible with real world implementations. And if you’re an Esri customer or employee encourage them to step up and release a GeoPackage implementation, as I’d really love to have this specification go to version 1.0 with at least 3 real world implementations. Luciad, GeoTools/GeoServer and Esri would be ideal. But Luciad, GeoTools, and SpatiaLite and/or GDAL/OGR would also be great.

Advertisements

Githubbing the GeoPackage specification

In the past few months one of the main things I’ve spent time on is the new GeoPackage specification being worked on within the OGC. I was involved in the very early days of the conception, before it was picked up for the OWS-9 testbed, as I feel it has the potential to fill a big hole in the geospatial world (James Fee sums up my feelings nicely). We discussed keeping it super lean, accessible to all, and implementable by non-GIS experts.

While I got wrapped up in other things a spec for public comment popped out in January, which I felt had strayed from what I believed were the original core design goals. I sent a huge missive on all the changes I felt could get it back to that lean core. I try to never be one to just criticize though, as it’s easy to cast stones and much harder to build something that is strong enough to hold up to others pelting rocks. And my open source roots teach me that a criticism without offering to put in effort to the alternative is close to worthless. So I decided to join the GeoPackage ‘Standards Working Group’, participating in weekly (and then twice a week) calls, and trying to work with the OGC workflow of wikis and massive Word documents.

One of my main goals was to learn about how the OGC process actually works, and hopefully from a place of knowledge be able to offer some suggestions for improvement from my open source software experience. That’s worth a whole post on its own, so I’ll hold off on much of that for now.

OGC staff has also been great about being open to new ways of working. It’s had to be balanced against actually getting the specification out, as many agencies need it ‘yesterday’. But we achieved an OGC first, putting this version of the specification out on GitHub. My hope is to attract developers who are comfortable with GitHub, who won’t need to learn the whole OGC feedback process just to suggest changes and fixes.

I do believe the specification is ‘pretty good’ – it has improved in almost all the ways I was hoping it would (Pepijn and Keith drove the tech push to ‘simpler’, with Paul doing a great job as Editor and Raj from OGC helping out a bunch). I do believe GeoPackage can improve even more, primarily by feedback from real implementations. My hope is it does not go to version 1.0 without 3 full implementations. That don’t just handle one part of the document, but implement all the core options (particularly Tiles and Features).

For the OGC this is an experiment, so if you do think OGC specifications could benefit by a more open process you can help encourage them by spending some of your technical brainpower on improving GeoPackage using GitHub (big shout out to Even who already has a bunch of suggestions in). Fork the spec and make Pull Requests, on core tech decisions or on editorial tweaks that make it easier to understand. For those without GitHub experience we wrote some tips on how to contribute without needing to figure out git). We also tried to add a mechanism for ‘extensions‘ to encourage innovations on the GeoPackage base, so I’d also love to see attempts to use that and extend for things like Styles (SLD or CSS), Rasters, UTFGrids and Photo Observations.

And if you can’t fully implement it but are interested in integrating, I do encourage you to check out the Apache licensed libgpkg implementation from Luciad. Pepijn has been the technical core of GeoPackage, researching what’s come before and continually building code to test our assumptions. Libgpkg is the result of his work, and the most complete implementation. Justin also has a GeoTools implementation and GeoServer module built on it for the Java crowd. And hopefully someone will implement for GDAL/OGR and PostGIS soon.

As always I’ve written too much before I’ve even said all I wanted to. But I will try to do at least one follow up blog post to dig in to some of the details of GeoPackage. I and the SWG welcome an open dialog on the work done and where to take it next. And I’m happy to explain the thinking that got the spec to where it is, since we wanted to keep the spec itself slim and not include all that detail. We are hoping to eventually publish some ‘best practice’ papers that can dig in to ideas we considered yet discarded as not quite ready.