Re: Why isn’t collaborative geodata a bigger deal already?

First off, thanks everyone for the responses; it’s great to have different perspectives refine my thinking on this subject. In this post I’m going to attempt to respond to many of the comments and questions. Some of my responses won’t be complete, and will need a full post to themselves – indeed many of the issues raised are things I’ve thought about and have future posts planned for. But I like a conversation much more than a monologue, so it makes sense to address what comes up now.

The ‘FOSS vs. Commercial divide’ definitely needs its own post, but I will invoke Arnulf and say that FOSS can be commercial, and the proper divide is FOSS vs. proprietary, which I will address in a future post.

Alan, thanks a ton for the thoughtful insight. It’s great to get feedback from someone who’s been thinking about these things far longer than I have. In future posts I’m hoping to explore more fully how we can bootstrap an architecture of participation – this post was just to posit reasons why we haven’t seen collaborative geospatial data emerge already. But it’s great to hear that the ‘priesthood’ has no real power – as in much of life, the power is truly with people, and we’ve really just got to organize right to exercise it.

I agree that data is not software, and that the challenges are going to be different, but I’d be interested in your thoughts on how software and data are fundamentally different such that an architecture of participation could not be formed. I ask because a number of your points hold for open source software too, if you replace ‘data’ with ‘software’ –

– the great desire is to have software not to create it
– creating software requires intellectual effort (not just technological & physical)
– the more effort it takes the more valuable it is
– if software is valuable people want to steal it

As for your assertion that if data is valuable then people want to sabotage it, I’d refer to one of Weber’s points on what leads to a successful collaborative project – ‘The product benefits from widespread peer attention and review, and can improve through creative challenge and error correction (that is, the rate of error correction exceeds the rate of error introduction)’. So the issue is not whether people want to sabotage it; it’s whether the architecture of participation can correct errors faster than they are introduced. Of course he’s not referring only to malicious error introduction, but that is necessarily a part of it. So the question is whether the commons can resist the sabotage.

If a true commons of value is established, then people who find value there will want to protect it, and if there are tools that make this easy, it’s more likely the commons will be protected. You can have easy rollbacks; you can let people sign up to ‘watch this area of the map’, like ‘watch this page’ on Wikipedia, looking out for vandals in the areas they care about; and you can limit commit rights. I’ll go into these in more depth in a full post in the future, but note that open source software suffers very little from sabotage, as those who contribute directly are vetted beforehand. Wikipedia is more prone to it, but is also able to correct itself.

So we won’t pretend that the potential for sabotage won’t increase as a dataset grows popular; we just need to figure out the proper architecture such that the commons will be protected and fixed in a timely manner. One should also note that some datasets are very valuable to a few people, but not that valuable to everyone. So bike enthusiasts who want to map their favorite paths likely won’t have their data vandalised.
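The three safeguards mentioned above (easy rollbacks, watching an area of the map, and limited commit rights) can be sketched in a few lines. This is only a rough illustration under simplifying assumptions: features are single points, everything lives in memory, and every name here (Commons, Edit, watch, rollback) is hypothetical rather than taken from any existing project.

```python
from dataclasses import dataclass, field
from typing import Callable

# Bounding box as (min_x, min_y, max_x, max_y); a hypothetical convention.
BBox = tuple[float, float, float, float]


@dataclass(frozen=True)
class Edit:
    """One version of a point feature (illustrative only)."""
    author: str
    feature_id: str
    position: tuple[float, float]
    name: str


@dataclass
class Commons:
    """In-memory store that keeps every version of every feature."""
    committers: set[str]
    history: dict[str, list[Edit]] = field(default_factory=dict)
    watchers: list[tuple[BBox, Callable[[Edit], None]]] = field(default_factory=list)

    def watch(self, bbox: BBox, callback: Callable[[Edit], None]) -> None:
        # 'Watch this area of the map': call back on any edit inside bbox.
        self.watchers.append((bbox, callback))

    def commit(self, edit: Edit) -> None:
        # Limited commit rights: only vetted contributors may write.
        if edit.author not in self.committers:
            raise PermissionError(f"{edit.author} has no commit rights")
        self.history.setdefault(edit.feature_id, []).append(edit)
        x, y = edit.position
        for (min_x, min_y, max_x, max_y), callback in self.watchers:
            if min_x <= x <= max_x and min_y <= y <= max_y:
                callback(edit)

    def current(self, feature_id: str) -> Edit:
        return self.history[feature_id][-1]

    def rollback(self, feature_id: str, by: str) -> Edit:
        # Easy rollback: re-commit the previous version as a new edit,
        # so the bad edit stays in the history for later review.
        versions = self.history[feature_id]
        if len(versions) < 2:
            raise ValueError("nothing to roll back to")
        previous = versions[-2]
        restored = Edit(by, feature_id, previous.position, previous.name)
        self.commit(restored)
        return restored


if __name__ == "__main__":
    commons = Commons(committers={"alice", "bob"})
    commons.watch((0, 0, 10, 10), lambda e: print("edit by", e.author, "on", e.feature_id))
    commons.commit(Edit("alice", "path-1", (5, 5), "River bike path"))
    commons.commit(Edit("bob", "path-1", (5, 5), "zzz"))
    commons.rollback("path-1", by="alice")
    print(commons.current("path-1").name)
```

Re-committing the earlier version, rather than deleting the bad one, is a design choice borrowed from how wiki reverts behave: the record of what went wrong is kept, which is what lets a community judge whether correction is outpacing error introduction.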

As for people wanting to steal the valuable data, that shouldn’t be an issue, just like it’s not for open source software – the commons must be guaranteed to remain open. I take this to be a base condition for collaborative geospatial data to really succeed. I do concede that there could be other incentive structures that allow substantial collaboration around geospatial data. But at this point I’m not so much thinking about them, I’m thinking of something similar to the open source software movement, where the base case is that the collaborative data is open to all.

On the subject of Spatial Data Infrastructures, I’ve got another whole thread on SDIs: geospatial webs, applying architectures of participation to SDIs, and the like. I think Dave’s point was also mostly about SDIs, the interconnected content. For now I’m really just focusing on the micro level, creating and maintaining geospatial datasets. I certainly don’t think that all, or even the majority, of data on a true public SDI/geospatial web will be built collaboratively – we’re just talking about a small piece of the content puzzle. But I do believe that it can play an important role in helping to bootstrap a true public SDI, and it will be combined with sensors and real-time data services and the like, and I think the discovery piece that Jo points out is quite important. Dave, I’d actually disagree that historically the open source community has led the charge, but I do think it will lead the charge for collaboration on open geodata. Are the surprises from proprietary software you’re thinking of SDI related, or specifically about open geospatial collaboration?

The topic of public SDIs segues to Geoff’s great point that we’re likely going to see collaborative mapping emerge in places like Asia, where governments have restrictive terms for access to geospatial information. Thanks for the link; I’d not seen it before, and am attempting to gather examples of proto-collaborative mapping. Hopefully I can get in touch with them and learn more about how the community works and what the motivations of individuals are, but this is really one of the most advanced collaborative mapping examples that I’ve seen, and I’m quite excited about it. Previously I had actually been thinking that innovation might first come from countries with less restrictive mapping policies: that we’d first see perhaps a massive project to improve TIGER data, since you have such a jump start with over 90% of a basemap for the US complete, and that you’d have forward-looking mapping agencies collaborating with citizens on more ‘fun’ datasets, like nature areas and bike paths (MassGIS, my favorite mapping agency, has had some experiments with those layers). But in some places the need is great, and a small group of motivated individuals could make just enough of a difference to start. It looks like they’re making use of MapBuilder for their online map, along with MapServer and probably PostGIS, using strong open source projects as the base on top of which they innovate, which is definitely the path to take.

On Sean’s point, I completely agree that it’s going to take a lot of time and effort. But I actually think open geodata falls closer to software than to Wikipedia (though Wikipedia is great for proving that the root concept may work even better for domains other than software). The GNU effort couldn’t reuse any existing tools; the legal constraints forced them to build it all from scratch. And it took many years before it got critical mass, and even more until Linux supplied the kernel that turned those tools into a complete operating system.

I also think the ‘snowball point’ will be more like that of open source software – Wikipedia snowballs right when you’re past the notion that only professionals are qualified to write an encyclopedia. But software certainly doesn’t snowball at a similar point. It snowballs when the existing open source software is close enough to the needs of commercial companies that it costs them less to invest in the open source software than to buy proprietary licenses. Of course this point is different for each company, depending on many, many factors. But as one company invests in open source and gets it good enough for their needs, it may become advanced enough for other companies to invest in the next step, and thus a snowball is born. I believe the point when mapping data will snowball is when it makes economic sense for a company to invest in a collaboratively built map, improving it for their needs, instead of licensing a proprietary map. And yes, this too will be different for each company – some only need general context to overlay their specific geospatial information, others need exact info and routing and the like.

But I agree, it’s going to take time and energy, both at the meta level to make it easier to overcome the logistical problems, and at the down and dirty level of going out to re-survey, well, just about everything. Identifying what’s held it up in the past is by no means the same as building, and building is the much bigger challenge. It’s going to be an uphill battle for a while, but I do believe eventually we too will see a snowball. And I’m keeping a firm eye on you and the Pleiades project for some brilliant techno-cultural inventions.

11 thoughts on “Re: Why isn’t collaborative geodata a bigger deal already?”

  1. Chris,
    It’s not that I think an “architecture of participation” can’t be formed – my comments were in the context of “Why don’t they work?” There have been many built but, as you observed, they just don’t seem to catch on. I don’t know the answer but I have rationalised it (for my own peace of mind) along these lines:

    How is data different from software?
    1. Data is an abstracted reality – software is a created artifact.
    2. The concept of a tract of “words” written in a “language” fits well with the Copyright concept; slabs of bytes encoding stuff that everyone can already see is much harder to conceive of as “creative”.
    3. Code has a “signature” (an embedded structure that identifies it as “an individual”) whereas data is identical no matter who compiles it.

    – data is not “sexy” to most people
    – the sense of kudos that applies to the creation of a dataset seems to be much less than applies to code-cutting
    – there is a diminished sense of “ownership” or “authorship” for contributors to a bucket of data. (As a simple example you might have 10 different “opinions” on the height of a tower in a dataset – it is extremely unlikely that you’ll have 10 different suggestions for the code to extract heights from the database.)
    – data is largely an opinion and even seemingly certain things like heights can be argued for centuries; software has rules and performance metrics defined by the compiler – data doesn’t have a compiler.

    This last aspect comes from the big pool of “data is political”; even seemingly innocuous stuff like my favourite Thai restaurant can be fraught with public opinion pressures and legalities.

    To my mind these are (indicative of) the issues that need to be addressed to make a true community SDI workable, and I see the attention given to the governance structure of efforts like the Pleiades project as a demonstration of this.

    Please, keep up the good work Chris.

  2. Why yes, yes I have seen some of the maps OSM’s been making, and indeed mentioned it in the original post. This post was a reply to the responses people put up to that post, and OSM was not in anyone’s responses and did not further support any of the points I was making. I’m developing a full thread on collaborative geodata, and am planning at least a full post on OSM. But I didn’t think I had to mention OSM in every single post on collaborative geodata?

  3. Um…
    Copyright is one form of protection for what the lawyers call Intellectual Property.

    You can steal data, people do it all the time. In fact, you can’t be pursued for copyright infringement until there’s evidence that you’ve already stolen the property concerned. ;-(

    Who owns the data at openstreetmap?

  4. Chris,

    As a follow-up on Alan’s “Thai restaurant can be fraught with public opinion pressures and legalities”, one particular aspect to bear in mind when opening up the world of geodata to all is so-called SLAPPs: Strategic Lawsuits Against Public Participation.

    I just happen to have learned about SLAPPs in a local newspaper last weekend. One of the oldest ecological organizations in Québec will most likely have to close its offices this Friday because it faces a SLAPP. Organizations can launch SLAPPs to destabilize citizens or small organizations that “bug” their business model, so part of an architecture of participation for geodata has to address the protection of well-intended small data [ad hoc] providers.

    Thanx for your blog!

  5. Several years ago I was involved in an Open Source development that was specifically aimed at providing infrastructure to free up access to Urban Planning data (No names: no lawsuits.) The project was successfully completed and launched – even though all the “bishops & captains of the industry” declared it couldn’t work.

    Two months later the whole thing was dead. Political pressure (yes: power, not money) was the driver. This result was achieved simply by restructuring the “sponsoring” body and redirecting resources from “contributing” organisations. The software still existed, the data still existed, all Open Source… No-one would touch it because it was suddenly dangerous to one’s professional health.

    Hence my persistence with the theme that technology (and licensing) are not the problem. Governance is the whole issue – and it needs just as much effort as would be applied to a commercial organisation.

    Data is power – you need to be well protected if you’re going to mess with people’s power base… That kind of protection needs a lot of thought before any action.

    I still believe it can be done Chris.

