Collaborative Mapping: Tools, cont.

The next major area of tool improvement I see is expanding the wiki notion of editing to more of a merging revision control model, with branches, versions, patches and eventually expanding in to distributed repositories.  A ‘patch’ is a small piece of code that can be applied to a computer program to fix something.  Patches are widely used in the open source software world, both to get the latest improvements, and to allow those who have commit rights to a source repository to review outside improvements before putting them in.  This helps create the meritocracy around projects, as they don’t let just anyone in to the repository as they might break the build.  Such a case is less likely with maps, but sometimes core contributors might want to see a couple sample patches before letting a new member in.  In the GeoServer versioning WFS work we have a GetDiff operation that returns a WFS Transaction that can then be applied to another WFS.  This fits in with the technical part of how a patch works – they’re really easy to apply to one’s dataset.  But unfortunately a WFS transaction is not as easy to read as a code patch.  The other great thing about patches is that when leaf nodes are updating their data they can just request the change set – the patches – instead of having to do a full check out.  So I’m still not sure how to solve this problem.  The WFS Transaction is the best I’ve got, but I think we can do better and have a nice little format that just describes what changed.
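To make the idea of a small format that just describes what changed a bit more concrete, here is a minimal sketch in Python. It is not the GetDiff output or a WFS Transaction. The dictionary layout, the `op`/`set`/`unset` keys and the function names are all made up for illustration, and features are reduced to plain attribute dictionaries keyed by a feature id.

```python
def diff_features(old, new):
    """Describe how to turn `old` into `new` as a list of small change records.

    Both arguments map a feature id to a dict of attributes (hypothetical layout).
    """
    changes = []
    for fid in sorted(set(old) | set(new)):
        if fid not in old:
            changes.append({"op": "insert", "id": fid, "values": dict(new[fid])})
        elif fid not in new:
            changes.append({"op": "delete", "id": fid})
        elif old[fid] != new[fid]:
            # Record only the attributes that actually differ.
            changed = {k: v for k, v in new[fid].items()
                       if k not in old[fid] or old[fid][k] != v}
            removed = sorted(k for k in old[fid] if k not in new[fid])
            changes.append({"op": "update", "id": fid,
                            "set": changed, "unset": removed})
    return changes


def apply_patch(features, patch):
    """Return a copy of `features` with the change records applied in order."""
    result = {fid: dict(attrs) for fid, attrs in features.items()}
    for change in patch:
        if change["op"] == "insert":
            result[change["id"]] = dict(change["values"])
        elif change["op"] == "delete":
            result.pop(change["id"], None)
        else:  # update
            result[change["id"]].update(change["set"])
            for key in change["unset"]:
                result[change["id"]].pop(key, None)
    return result


# Illustrative data only.
old = {"road.1": {"name": "Main St", "type": "residential"},
       "road.2": {"name": "Old Rd", "type": "track"}}
new = {"road.1": {"name": "Main St", "type": "secondary"},
       "road.3": {"name": "New Ave", "type": "residential"}}

patch = diff_features(old, new)
```

A patch like this is short enough to read before applying it, and a leaf node that already holds `old` only needs `patch` to catch up instead of a full check out.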

Once we’ve got patches people are going to want the ability to merge changes.  If you made a patch and I made a patch and we both submit them then we need a way to see if they’re compatible.  Ideally you could merge at the feature level – if you change the road type and I change the road length of Interstate 5 then we shouldn’t get a conflict.  Even better, merge at the geometry level, if we changed different points on the road then those should merge nicely.  This will become important as people start to ‘check out’ their geo repositories, do edits, and then try to submit back in.  We could just do locking, which is what WFS-T does, but concurrent versioning is so much nicer – we just have to be able to pull off merging.
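Merging at the feature level can be sketched as a three-way merge over attributes: compare both edits against the common base version and only report a conflict when both sides changed the same attribute to different values. This is a simplified illustration, not what the GeoServer versioning work does. The function name and the attribute values are assumptions, and geometry is left out entirely.

```python
def merge_feature(base, mine, theirs):
    """Three-way merge of attribute dicts. Returns (merged, conflicts).

    A value of None is treated the same as a missing attribute.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:        # both sides agree (or neither changed it)
            value = m
        elif m == b:      # only "theirs" changed it
            value = t
        elif t == b:      # only "mine" changed it
            value = m
        else:             # both changed it differently: keep base, flag it
            conflicts.append(key)
            value = b
        if value is not None:
            merged[key] = value
    return merged, conflicts


# Illustrative values only.
base = {"name": "Interstate 5", "type": "highway", "length_km": 2200}
mine = dict(base, type="motorway")      # I change the road type
theirs = dict(base, length_km=2223)     # you change the road length

merged, conflicts = merge_feature(base, mine, theirs)
```

Here the two edits touch different attributes of the same road, so they merge with no conflict. The same comparison could in principle run per vertex for geometry-level merging, though matching up vertices between versions is the hard part.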

Right past merging is full on branches.  Which of course are much easier to pull off if you’ve got nice merging in place.  But branches will let people try out new geographic updates in their own sandbox before putting them on the mainstream.  This can lead to better reviews of the updates.  And with nice branching and merging you would be able to let a number of people work concurrently on their own area of the map, merging them seamlessly.  This is obviously a really hard problem, one that even ArcSDE has trouble with for the things people actually want to do.  I do think we’ll be able to get there in the open source world, indeed I believe we have a better chance of achieving it since once we get close we’ll get a lot of interest in people wanting it completed and meeting their needs, funding the iterative improvements.

The final piece, that I sort of don’t even want to think of yet, since it’s damn hard, is distributed versioning.  I do think it’s extremely important though, to let everyone have their own editing repository, which can flow back in to the main one.  I like the model a lot, and think it has great wins for geospatial.  But since we’ve barely got an SVN equivalent I think it’s wiser to wait a bit on these issues till we sort out what a patch should look like.  Indeed SVK was possible because SVN already existed.  But I’m definitely excited by the possibilities, for every node of the map to have the potential to be edited.  This can be a big win for areas with low bandwidth.

The next category of tool improvements is granular security settings.  Right now there’s not even a way to limit editing the map to only some users.  I think that many maps will flourish with the open to all editing style, making use of rollbacks to prevent vandalism.  But some will likely want to keep the map to a set group of committers.  This way one could get commit rights after doing a number of good patches, perhaps ensuring higher quality for some maps.  You also might have different permissions for different users on different layers.  We should be able to get all of that with our current GeoServer security system, we just need to hook up a UI for it.  The trickier thing, which would be a nice feature and which I think is possible, is limiting users to certain geospatial areas or features with specific properties.  Since the security system is integrated at the code level, and lets us use aspects, I think that this should be possible, it will just take a bit of work to figure out.

Another area where I see a lot of potential innovation is distributed processing of tiles.  Tiles are the clear winner for how to display geospatial information, Google Maps has raised the bar so that anything that isn’t tiled just feels out of date.  But tiling takes a ton of processing power.  Google is all set up to do it, but the rest of us aren’t.  To fully cache to zoom level 17 would have taken me about 5 months.  Open Street Map has been making tremendous strides on this with their Tiles@Home initiative, which I am very impressed by.  OSM is lucky in many ways, in that they have a project that people want to devote their spare CPU cycles to.  It could be cool to set up marketplaces for processing of tiles, where companies that are going to keep their data private, or just that don’t have the reputation of OSM, can engage other nodes and give them micropayments for their work.  I think other areas of potential innovation include leveraging Amazon’s EC2 to process huge amounts of tiles.  We’re also going to need to have the collaborative mapping stuff hook up with the tiling efforts, so that when there are massive edits the tiles can expire themselves and get processors started on generating new ones.  We can likely leverage HTTP’s Conditional GET functionality to let browsers and others cache geospatial data, but also get the most up to date data when it’s available.
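To give a feel for both the cost and the expiry problem, here is a rough sketch in Python. It assumes the common Web Mercator tile scheme used by Google Maps style clients (one tile at zoom 0, four times as many at each further level), which is an assumption about the setup rather than something specific to GeoServer or Tiles@Home. The function names are made up. Under that scheme a whole-world pyramid through zoom 17 is about 22.9 billion tiles, which is why full caching takes so long, and an edit only needs to expire the handful of tiles its bounding box touches.

```python
import math


def pyramid_tile_count(max_zoom):
    """Total tiles for the whole world from zoom 0 through max_zoom."""
    return sum(4 ** z for z in range(max_zoom + 1))


def lonlat_to_tile(lon, lat, zoom):
    """Column/row of the Web Mercator tile containing a lon/lat point."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the edge of the world stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tiles_to_expire(min_lon, min_lat, max_lon, max_lat, zooms):
    """List (zoom, x, y) for every tile touching the bounding box of an edit."""
    expired = []
    for z in zooms:
        x0, y1 = lonlat_to_tile(min_lon, min_lat, z)  # bottom-left corner
        x1, y0 = lonlat_to_tile(max_lon, max_lat, z)  # top-right corner
        for x in range(x0, x1 + 1):
            for y in range(y0, y1 + 1):
                expired.append((z, x, y))
    return expired


# An edit around a few city blocks (coordinates are illustrative).
stale = tiles_to_expire(-73.99, 40.73, -73.98, 40.74, zooms=range(0, 18))
```

The resulting list is what would get handed to whatever processors regenerate tiles, and tiles outside the list can keep being served from cache.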

The last area I’d like to see improvement on is more granular notification mechanisms.  GeoRSS output is the obvious choice, but we could also do email or SMS notifications.  Speaking of which I’d love more innovation on mobile clients, and even super low tech versions like being able to SMS in a new or updated location by just entering cross streets or reading a position from GPS.  But one should be able to have the notifications based on very granular rules – ‘send updates for highways in this bounding box’, or ‘email all occurrences of the brown spotted pigeon along this river bank’.  This would be useful not only for preventing vandalism, but also to enable people to take action on up to date reports.  The map becomes not just an artifact of what has happened, but a living thing that can help create more up to date information.  If the brown spotted pigeon is seen in one area then it will alert more people who can then add updates on its location and get a more detailed map of its path.
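A granular rule of this kind boils down to a spatial filter plus an attribute filter. The sketch below shows one way that could look in Python. The `Rule` class, its field names and the feature layout are all hypothetical, and actually delivering the GeoRSS, email or SMS message is left out.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    """One subscription: who to tell, where, and about what (hypothetical)."""
    subscriber: str
    bbox: tuple  # (min_lon, min_lat, max_lon, max_lat)
    attributes: dict = field(default_factory=dict)

    def matches(self, feature):
        lon, lat = feature["point"]
        min_lon, min_lat, max_lon, max_lat = self.bbox
        inside = min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
        wanted = all(feature["attributes"].get(key) == value
                     for key, value in self.attributes.items())
        return inside and wanted


def subscribers_to_notify(rules, feature):
    """Everyone whose rule matches a newly added or edited feature."""
    return [rule.subscriber for rule in rules if rule.matches(feature)]


# Illustrative rules and sighting.
rules = [
    Rule("highway-watch", (-125.0, 32.0, -114.0, 42.0), {"type": "highway"}),
    Rule("bird-club", (-122.5, 37.7, -122.3, 37.9), {"species": "brown spotted pigeon"}),
]
sighting = {"point": (-122.4, 37.8),
            "attributes": {"species": "brown spotted pigeon"}}

to_alert = subscribers_to_notify(rules, sighting)
```

Running every incoming edit through the rules means a new pigeon sighting reaches the bird club right away, while the highway watchers are left alone.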

I’m sure there are many more innovations to be had with tools, but this is just a start of the things that we’re starting to work on and the things I’d like to work on in the future.  At TOPP we’re doing this stuff when we don’t have paid client work (or have met revenue targets for the year, since we’re a non-profit), but if there’s anyone out there who wants to see specific areas accelerate we’d be very excited to take on paid work to do any of the things talked about here <end shameless plug/>.

Collaborative Mapping: Tools

Continuing the collaborative mapping thread, I’d like to think a bit about tools to make this happen. Do a bit of dreaming, and maybe think through how we can get there. Definitely as soon as I start to talk about this people want to do all kinds of crazy synchronization and distributed editing of features. I do think we’ll get there, but I fear going for too much too soon, getting loaded down by over-designing and not addressing the immediate problems. Indeed Open Street Map has proven that if the energy is there the tools just need to do the very basics. I have been putting my energy in to getting a standards based implementation, on top of WFS-T, but that’s more because I know it and I like standards. I don’t think it’s the best way to do things, and I don’t even think it should be the default way to do things – at this point I’d prefer something more RESTful. But I believe in being compatible with as much as possible, and there are already nice clients written against WFS-T. So it should always be a route in to collaborative editing.

First off, I think we need more user friendly options for collaborative editing. Not just putting some points on a map, but being able to get a sense of the history of the map, getting logs of changes and diffs of certain actions. Editing should be a breeze, and there should be a number of tools that enable this. Google’s MyMaps starts to get at the ease of editing, but I want it collaborative, able to track the history of edits and give you a visual diff of what’s changed. Rollbacks should also be a breeze – if you have really easy tools to edit it’s also going to be easier for people to vandalize. So you need to make tools that make it even easier to roll back. On the GeoServer extended WFS-T Versioning API we’ve got a rollback operation, that can work against an area of the map, a certain property, or a certain user (or combinations of those). Soon we hope to be working on some tools built on top of OpenLayers to handle those operations in a nice editing environment.
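The idea of rolling back by area, property, user or any combination can be sketched as a filter over a change log followed by undoing the selected changes newest first. This is only an illustration in Python and not the GeoServer Versioning API. The log layout and the function names are assumptions, and it naively restores old values without checking whether someone else has edited the same attribute since.

```python
def select_changes(log, user=None, bbox=None, prop=None):
    """Pick the logged changes matching every filter that was given."""
    selected = []
    for change in log:
        if user is not None and change["user"] != user:
            continue
        if prop is not None and change["prop"] != prop:
            continue
        if bbox is not None:
            lon, lat = change["point"]
            min_lon, min_lat, max_lon, max_lat = bbox
            if not (min_lon <= lon <= max_lon and min_lat <= lat <= max_lat):
                continue
        selected.append(change)
    return selected


def rollback(features, changes):
    """Return a copy of `features` with the given changes undone, newest first."""
    restored = {fid: dict(attrs) for fid, attrs in features.items()}
    for change in sorted(changes, key=lambda c: c["rev"], reverse=True):
        restored[change["id"]][change["prop"]] = change["old"]
    return restored


# Illustrative state and history.
features = {"road.1": {"name": "Chris Rulez", "type": "residential"}}
log = [
    {"rev": 1, "user": "alice", "id": "road.1", "point": (-73.99, 40.73),
     "prop": "type", "old": "track", "new": "residential"},
    {"rev": 2, "user": "vandal", "id": "road.1", "point": (-73.99, 40.73),
     "prop": "name", "old": "Broadway", "new": "Chris Rulez"},
]

cleaned = rollback(features, select_changes(log, user="vandal"))
```

Only the vandal’s edit is undone here, so the legitimate change to the road type from the earlier revision survives.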

The next step on user friendly options will be desktop applications that aren’t full GIS, but that let users easily edit. These can leverage the tools of existing open source GIS desktop environments, like uDig and qgis, but can strip down the interface to just be simple editing environments with a few hard coded background layers. You could have branded environments for specific layers of information. And ideally build other kinds of reporting tools that also leverage the same GIS tools, but in an interface geared towards the task at hand, like search and rescue or tracking birds. The other thing I hope to work on is getting some of the editing hooked up with Google Earth. I just learned there’s a COM API that might allow us to hack something in, or we can try to get Google Earth to support POSTing of KML to arbitrary URLs as Sean suggests.

Next I’d like to see integration with ‘power tools’, the full on, expensive ass GIS applications that are the realm of ‘professionals’. Not that I have a huge love for those tools, but I’d really like to engage as many people as possible in to collaborative mapping. GIS professionals are a great target audience, since most of them are already passionate about mapping. They have a lot of expertise to bring to the table. And while some of them can be elitist about collaborative mapping and ‘lesser’ tools, so too can many of the amateurs turn up their noses at people who aren’t DIY. At the extremes it can obviously be a major divide, but I think both could have a lot to teach each other if they’re willing to listen. But I believe the first step to get there is to get the ‘power tools’ compatible with the collaborative mapping protocols, so you start them off in collaboration. This is one reason I’m an advocate of the WFS-T approach, as there are plugins for ArcGIS and other heavy desktop GIS’s. I think we could see some professionals get really excited about collaborative mapping, as it could become the thing they are passionate about and do in their free time that is fun and helps boost their resume. This is how many open source contributions work now, it’s a complex interplay that includes professional development. Perhaps one’s collaborative mapping contributions could help land jobs in the future.

I’d also like to see more automation available in the process. This is an area that could use a lot of experimentation, how much to automate, how much to let humans collaborate on. But I think there’s an untapped area of figuring out vector geometries from the aggregated tracks of GPS, cell phones and wifi positioning data. People are generating tons of data every single day, and most of it is not even recorded. It’s great when people take a GPS and decide explicitly to map an area and then go online and digitize it. But we could potentially get even more accurate than just one person’s GPS by aggregating all the data over a road. Good algorithms could extract the vector information, including turn restriction data, since it could figure out that 99% of fast moving tracks are going in the same direction. Of course we’ll still need people to add the valuable attribute information, but this way they’d have a nice geometry already in place.
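The 99% observation can be turned into a very small heuristic. The Python sketch below assumes the hard parts are already done: tracks have been matched to a road segment and reduced to a compass heading each. The function name, the threshold parameter and the sample headings are made up for illustration.

```python
def infer_direction(segment_bearing, track_headings, threshold=0.99):
    """Guess whether a road segment is one-way from aggregated track headings.

    Bearings and headings are compass degrees. Returns "forward", "backward",
    "two-way", or None when there are no tracks to judge from.
    """
    if not track_headings:
        return None
    forward = 0
    for heading in track_headings:
        # Smallest angle between the track heading and the segment bearing.
        diff = abs((heading - segment_bearing + 180) % 360 - 180)
        if diff < 90:
            forward += 1
    share = forward / len(track_headings)
    if share >= threshold:
        return "forward"
    if share <= 1 - threshold:
        return "backward"
    return "two-way"


# A segment running roughly east, with illustrative headings from many drivers.
eastbound_only = [85, 88, 92, 95] * 50
mixed_traffic = [88, 272] * 100

one_way_guess = infer_direction(90, eastbound_only)
two_way_guess = infer_direction(90, mixed_traffic)
```

A real system would also want a minimum number of tracks and some filtering by speed before trusting the result, but the core test is just this share of tracks going each way.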

You could also do feature extraction from satellite and aerial imagery. This is obviously a tough problem that many people are working on, but perhaps it could also be improved by leveraging human collaboration. In a system with good feedback people could perhaps help train the feature extraction to improve over time. It also could be valuable to do automated change detection, which then notifies people that something’s changed in the area, and then they could figure out the proper action.

The final area I think we could improve with automation is prevention of vandalism and silly mistakes. GeoServer had work done by Refractions a few years ago to do an automatic validation engine. Unfortunately this has languished with no documentation, but it’s still part of GeoServer. One can define arbitrary rules to automatically reject bad transactions – geometries that intersect badly, roads without names, etc. This could also reject things like ‘Chris Rulez’ scrawled over the whole of the US, as it could know that no real roads run in completely straight lines for over 200 miles. I could imagine a whole nice chain of rules to ensure that all edits meet certain quality criteria. And perhaps instead of being rejected straight up, any edit that doesn’t follow all the rules can go in to a sandbox. I could also imagine some sort of continuous integration system once there is topology to check network validity, and other quality assurance pieces that can’t take place instantly.
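A chain of rules like that could look roughly like the Python sketch below. It is not the GeoServer validation engine. The rule functions, the road layout and the sandbox routing are assumptions for illustration, and the straight line check only looks at the distance between consecutive vertices, so a vandal who adds extra points along the line would slip past it.

```python
import math

EARTH_RADIUS_MILES = 3958.8


def haversine_miles(a, b):
    """Great-circle distance in miles between two (lon, lat) points."""
    lon1, lat1, lon2, lat2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(h))


def has_name(road):
    if not road.get("name", "").strip():
        return "road has no name"
    return None


def no_long_straight_runs(road, limit_miles=200):
    coords = road["coords"]
    for a, b in zip(coords, coords[1:]):
        if haversine_miles(a, b) > limit_miles:
            return "straight segment longer than %d miles" % limit_miles
    return None


RULES = [has_name, no_long_straight_runs]


def validate(road):
    """Run every rule and collect the complaints (empty list means clean)."""
    return [message for message in (rule(road) for rule in RULES) if message]


def route_edit(road):
    """Clean edits go straight in, anything else waits in a sandbox for review."""
    return "accept" if not validate(road) else "sandbox"


# Illustrative edits.
good = {"name": "Pike St", "coords": [(-122.33, 47.60), (-122.32, 47.61)]}
graffiti = {"name": "", "coords": [(-120.0, 40.0), (-100.0, 40.0)]}

decisions = [route_edit(good), route_edit(graffiti)]
```

Because each rule is just a function, adding a new quality check means appending to `RULES`, and slower checks like network validity could run the same rules later against the sandbox.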

Ok, I’ll wrap this post up for now, will continue this thread soon.

Collaborative Mapping: The Business Thread, cont.

So if there is a future where collaborative mapping could be economically competitive, how do we go about actually getting there?  I actually think we’re further than many might think, though I believe there is still a lot of work to be done, innovating with the tools, communities and workflows to make this happen.  But I’ll address that in another post, for now I just want to present a possible path for collaborative mapping to bootstrap in to the mainstream.  I’m going to focus on street maps, since that’s the information that people pay big money for, and there is already early success with Open Street Map.  Later I will examine how the lessons learned there can feed in to other domains and back.

So step 0 is proving that it’s possible for a diverse group of people to collaborate on an openly licensed map.  I’d be hard pressed to entertain any arguments that Open Street Map has not already accomplished this.  Of course in its current state you can’t navigate a car on it, you’re not going to do emergency vehicle response with it.  But their driving principle has been they ‘just want a fscking map’, and a map they do have.  There are many contributors running around with GPS’s and creating a map.

The next point in the evolution is when the map is good enough for basic ‘context’.  Again, OSM is already there for several parts of the world.  If you’re doing a mashup of your favorite neighborhoods you don’t really care if all the streets are there.  You just need enough that it looks about like your neighborhood on other maps.  Many mashups use google maps and others in this way – which is sorta like using the same quality water to flush your toilet as comes out of your kitchen sink (USA!).  Which is to say a bit of a waste, but who really cares if someone else is paying for it.

Which speaks to another tipping point, which is when the big portals start putting ads on their maps.  Or when they start charging to use their APIs.  I concede now that this may never happen, that it’s a good loss leader to have people using your API for free as long as they put their maps out in the public.  But a part of me feels like we may be in that period of the GeoWeb like the first web bubble, when you could get $10 off coupons from CDNow and B+N, allowing you to buy any CD you wanted for a few bucks.  It wasn’t going to last, but it’s sure fun while it does.  But at some point there may be a shift when they need to make some money, which could drive more energy to collaborative maps as people look to get ads off their service.

The next step starts to get fun, which would be once a collaborative street map gets good enough for basic routing and navigation.  Right now it seems to be (though I could be wrong, I don’t know the OSM community intimately) people who set out to add data to the map, they want to get their area mapped.  If they go to new areas they’ll bring a GPS along, but it’s often to a totally unmapped area.  I think once large areas start to get close to completion we’ll have people cobble together ghetto car navigation kits.  A laptop with a GPS and the collaborative map, either connected over some kind of wireless internet or downloaded to the car.  One can drive around with this and it will show one’s place on the map, and directions to the end point as well.  Note that this kind of usage is currently illegal with Google Maps or any of the others who get their data from commercial providers.  From the API agreement: ‘In addition, the Service may not be used: (a) for or with real time route guidance (including without limitation, turn-by-turn route guidance and other routing that is enabled through the use of a sensor’.  This is because the commercial mapping providers make big money off of car navigation, and license the (exact same) data to do that at a higher price.

With basic navigation on a collaborative map in place you can get people excited about going off in to a ‘new frontier’, going off the map and tapping in to their inner Lewis and Clark.  Actively encourage people to Dérive (though I’m not sure how much the Situationists really would like the idea of people using cars to dérive) in to uncharted areas of the map.

On other fronts I believe that we’ll see niche areas getting high quality mapping.  Governments and companies will realize that if there’s a map that’s 80% done and they just need to fund the last 20%, and that owning the map is not their key value proposition, then they’ll just look to fund the collaborative map instead of doing it themselves.  Those that can think long term will realize that this will most always be cheaper, since they won’t have to keep paying to get it up to date.  With a good collaborative structure much of that will happen on its own.  And they may put a bit extra in each year.  And in areas where a few different organizations all partner up it will definitely be cheaper.  Already we’re seeing some enlightened folks fund Open Street Map contributors to have a mapping party and map an area.

We’ll also likely see collaborative maps for niche verticals.  If you’re doing walking maps then you don’t need the turn restriction information to do car routing, for example.  Someone may offer a map of the best drives in southern california, which would be a subset of the main map.  Or a detailed map of which roads need to be plowed after a snowstorm, that leaves out the roads that don’t.

After that I think you’ll see people hacking commercial nav systems to make use of the collaborative map, and then navigation companies offering low price versions of their systems that don’t rely on the commercial data.  Already we’re seeing navigation companies start to ‘leverage user contributions’, with TomTom’s ‘MapShare’ to let people update points of interest and the like, and Dash Navigation’s ability to leverage GPS from other cars to see if a new road has opened up.  I think you may see people even more excited about this if they knew their work was going to a common good instead of just to the advantage of one company.

Once people are able to ‘correct’ the map that they’re driving on I believe we’ll see a really big tipping point.  Build in some voice recognition to call out the name of a street while you’re driving.  This could be billed as the ‘mapping game’, where one gets points for driving new areas.  One could even imagine a company that sets up a business with sort of ‘bounty navigation’ where you can actually make money if you drive new areas of the map and do good reporting of road names and the like.  This could be one of the decoupled functions of the economics around collaborative map making, the navigation company partners with the company that guarantees the map is up to date, and instead of contracting out another company to drive the roads they just put money rewards on driving in new areas.  People could make it so their navigation is free, or even have it be like the electrical grid where if you generate a lot of extra navigation information they pay you.  I haven’t thought through all the details of this, but I think it could work, and would be super cool for helping people think of geospatial data as a commons that one can contribute to and that we’re all responsible for and can be a part of, not just consumers of a service.

Which speaks a bit to a further point, which is when governments realize that they can tap in to and contribute to this as well.  The census spends a ton of money keeping up to date road information.  But their data is not entirely accurate, and it doesn’t do any turn restrictions.  Instead of maintaining their own database they could combine with an open map, and plug in to that workflow.  Indeed such a map likely would have started from one of their TIGER line maps anyways in the US.  So government organizations can join the ecosystem, likely just as funders contracting out other companies to perform the work, as they are starting to do more and more with open source software.  Some may want to try to do it themselves, but the smart ones will plug in to existing ecosystems.

The other tipping point towards the end will be when the big mapping providers decide to invest in collaborative maps.  I had initially been thinking that things would need to be really far along worldwide before they’d make the switch, but a more likely solution might be that they use it in conjunction with their commercial maps.  They already make use of TeleAtlas and Navteq in different places.  So as long as the collaborative map didn’t have a restriction about combining with other sources they could just use it in places that have poor coverage from the major providers.  And they could see where areas of the map are close to being done and strategically fund those.  Another potential source of investment in this kind of mapping could be from aid agencies in areas that commercial providers haven’t mapped.  They could hook up their GPS’s to gather information, and then employ a few people to help process and QA it to make maps they can use.  Since it’s not a core value proposition to them they can share it with others, and start to build really good street maps in areas that no one has touched because it’s too hard for the money they would get.  I would love to try a start up in Africa that hooks up the correcting car navigation systems to a bunch of vehicles and just starts building the living map.  It’d be quite ironic if Africa ended up with more up to date maps than Europe.

The key with all this for me is the evolution of viewing mapping data as a public good, that we all collaborate on to make better.  As GPS’s become more and more prevalent we are all just emitting maps as we go through our lives.  All that’s really needed is a structure to turn that in to useful information, getting the tools better and setting up the economic reward structure.  I’m not a business person, so I don’t have much more to throw out in terms of economic ideas.  But I believe it is possible to set the levers right to encourage this.  And I’m going to do my best to get the tools better and better to show what is possible and get us all moving towards a future where an up to date accurate map is a commons available to all, and that all are a part of.