Is metadata useless?
The always-interesting Cory Doctorow some time ago wrote an article arguing that metadata will be unsuccessful. His points are several:
- Man Lies
- His point is that folks will lie in their metadata, just as searching on an old-style search engine for just about any term these days turns up porn. True — but one of the cool things about RDF and the frameworks built upon it is that anyone can annotate anything, and thus I can choose to rank metadata providers, or choose someone else’s ranking. The trouble with in-page metadata was that it was provided by the author, who had incentive to lie; out-of-band metadata may be provided by anyone. Might there perhaps arise a market for metadata marketers, much as there’s a market for Consumer Reports? I can easily imagine it.
- Man is Lazy
- His point is that authors are unlikely to annotate their own pages. This is not quite true: already, there’s a blossoming market in search engine optimisation. Now, some of it is deceitful, but much of it is concerned with presenting a page such that a spider such as the Googlebot can easily understand it. It’s a small step from that to maintaining separate metadata. Moreover, as I suggest above, it’s quite likely that there will arise a market in metadata. Yes, man is lazy — but there already exists a mechanism to get him working: it’s called a job.
- Man is Stupid
- Doctorow notes that eBay and other such sites are rife with misspellings, and that this means that folks will not accurately categorise their data. As I noted above, there will be a market, and a market is an excellent mechanism for driving some minimal level of quality. Yeah, McDonald’s is not the greatest food in the world, but one is much less likely to contract typhoid from it than from the food of a century and a half ago.
- Mission: Impossible -- Know Thyself
- He notes that every man is a rotten judge of his own character. Well, duh. But others, especially others in aggregate, are often pretty good at it. Solved, once again, by the market.
- Schemas Aren’t Neutral
- Doctorow points out that every producer will argue for a scheme of classification which represents his interests best, and denigrates those of his competitors most. Once again, he’s correct, but both in law and in finance we’ve somehow managed to overcome that to a very great degree. That’s part of the beauty of schemas which are created from the bottom up: driven by hackers such as those who brought us the World Wide Web in the first place, they become standards before any special interest can affect them over-much. Either that, or they are created from the consensus of those same interests. Whichever path they take, they tend to end up decent; those that don’t, die off, once again due to market pressure.
- Metrics Influence Results
- His point here is that whatever we rate by tends to influence that which is rated, e.g. mandatory school testing leads to teaching to the test. His greater point is that ‘it’s wishful thinking to believe that a group of people competing to advance their agendas will be universally pleased with any hierarchy of knowledge.’ Well, of course: I don’t doubt that there may be multiple competing schemas. Generally, that which delivers the most to the most will win. This doesn’t necessarily mean that it will be the best (see Windows), but it will be mostly sufficient (again, see Windows). But over time, it will improve (see Linux).
- There’s More Than One Way to Describe Something
- As he says, ‘reasonable people can disagree forever on how to describe something’; true enough. Yet somehow we all manage to agree on traffic rules; and those of us who disagree (say, on the question of whether McDonald’s or Le Central affords one better food) somehow don’t come to blows over it: we form separate communities. Where reasonable men disagree, there may be a divergence of standards, but where we agree there will be uniformity. Isn’t that the ideal anyway?
It’s an interesting article, and his points are cogent, but I believe that in the long run they become irrelevant. Most of them are defeated once there is a market for metadata, and the rest are made minor over time.
04 February 2018: with the rise of Twitter & Google markup, I think that we’re seeing the very beginnings of a market-oriented solution start to take hold. It’s taken a lot longer than I ever expected, though. Also, ummm … I was very young & enthusiastic in 2004.