Last Tuesday Google single-handedly removed an important feature from the web. And it did so by means of one small attribute, rel=nofollow.
(To my few non-geeky readers: As you know, sometimes small changes can have an unfavourable effect on great things. This is such a case, and the great thing is the Web as a participatory world.)
Google, like most other search engines, uses the collective intelligence of the web in determining what is important and what isn't. This was beautifully democratic, as many forms of participation on the web allowed everyone to express an opinion that was not only visible to the direct reader but also counted towards a statistical total. Every vote counted. From now on, you need to have a blog to make your voice count, and your opinion only counts if expressed on that very blog, not somewhere else. Not even the collective wisdom of the mighty Wikipedia editors counts anymore, as their links' weight as votes has been removed, too.
And all this is permanent, since many others spent mere hours considering the consequences before collectively joining the party. And Robert Scoble thinks that this kind of decision making should happen more often! I really hope not: this is just an example of the gorilla in the market planting a standard while ignoring the sensible community, which would have loved to put its energy and collective intelligence behind this, and repeatedly asked for it!
Apparently, at least Dave Winer was invited to put in a weekend shift on this. Hey, Dave, you know better: You immediately complain, and rightly so, when some people at an invitation-only conference cook up possible standards, because you fear that the process is not open enough. Now we have a quick fix that is half-baked, ignores the consequences and doesn't even fulfill its goal. The open dialogues you champion would have yielded a better solution.
When the partying half of the blogosphere sobers up about this, we'll all come to realise what dimensions of the web have been removed. Votes are now restricted to whoever decides the contents of brochureware and other official websites, and to people with the technical means to be their own webmasters and bloggers. And then only to the opinions expressed in their own living room.
As usual, John Battelle has something smart to say about all this.
At least, they could have called it rel=usergenerated. Semantic, and with the potential of creating a search engine that emphasizes personal opinions (I'm sure some smart chap will at some point use nofollow as useful meta information per se). I'm sure just a few weeks of public request for comments would have surfaced many such details. But that kind of open participation seems to have been lost long ago in the Google DNA :-(
And unfortunately it won't stop comment spam, not in the next few years, for simple economic reasons. Nor will it remove its effects on the search engines, although it will dampen them. Ben Hammersley points out why this will probably increase comment spam, at least until the search engine industry moves to something other than PageRank.
Oh, and linking without giving linkjuice was simple before. For example, use Google's own redirecting service. This is based on a very old standard (robots.txt), will surely work, and will survive any future re-interpretation of the real meaning of nofollow.
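For the curious, here is a minimal sketch of that redirect idea, assuming you run a redirect script on your own site rather than using Google's service. The host blog.example.org, the /out/ path and the helper names are made up for illustration; the robots.txt check uses Python's standard urllib.robotparser. The redirect script itself is not shown.

```python
from urllib.parse import quote
from urllib.robotparser import RobotFileParser

# Hypothetical site and redirect path, chosen for illustration only.
SITE = "https://blog.example.org"
REDIRECT_PATH = "/out/"

# robots.txt that asks all crawlers to stay away from the redirect script.
ROBOTS_TXT = """\
User-agent: *
Disallow: /out/
"""


def redirect_link(target_url: str) -> str:
    """Wrap an outgoing link so that it goes through the redirect script."""
    return f"{SITE}{REDIRECT_PATH}?url={quote(target_url, safe='')}"


def crawler_may_fetch(url: str, user_agent: str = "ExampleBot") -> bool:
    """Check whether a robots.txt-respecting crawler would fetch this URL."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    link = redirect_link("http://example.com/some-page")
    print(link)
    print("crawlable:", crawler_may_fetch(link))
```

A well-behaved crawler never requests the wrapped link, so it never sees the redirect target and passes no weight on, while human readers still reach the page.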
I tried to outline some possible solutions a while back. I don't think any of them would have survived a good community process, but the main point to take away from that post is that someone who crawls the whole web can rather easily detect comment spam anyway, globally and with great coverage, and thus remove its effect. And to remove the cause, Google et al. would have to speak very publicly about that measure, not about some attribute that will never be implemented everywhere.
I'm sorry, I'm sounding a bit sour today. I have to disclose that I work for a Google competitor, but I write this primarily as a netizen, albeit one with some technical background in the matter.
This is as much a story about the possible futures of the Web as it is a testament to the power Google has accumulated.
Dear Google, I appreciate the thought, and I'm sure you meant it with the best of intentions, but the outcome is evil.
Posted by seefeld at January 20, 2005 09:52
Yes, we're all very enthusiastic right now, but we also know it's not the end-all solution to spam. It's just another tool in the toolbox.
Moderation on a per-comment basis will still give legitimate postings the "linkjuice", so it's also not the end of the Web as we know it as you seem to believe ;-).
And since you mention it: search.ch currently does not detect comment spam and remove its effect, as this search shows: http://www.search.ch/search.html?q=cialis+soft+tab&loc=ch . Will it be implemented in the future? That would really set search.ch apart from the competition!
Posted by: Matthias at January 20, 2005 11:33 AM
Matthias: If you do per-comment moderation, then you don't need the rel=nofollow attribute anyway. You're right that this will be in the original sense of the web. You want those people to get their share of linkjuice! I fear for the chilling effects of all the places where this was enabled without too much thought.
And thanks for the example query. But this is a case where nofollow wouldn't prevent anything, as we find mostly the offending comments themselves, not their linkjuiced results (which seem to point outside Switzerland anyway). Actually, this is an example of how real comment spam detection would work even better than this proposal.
We try several anti-spam measures at search.ch, some working, others not at all. Comment spam is much more easily solved globally (you have more data, this gives you a stronger signal), but we'll do our best.
It is not the end of the world, but this whole thing smells too much like corporate PR. Google, Yahoo, MSN, 6A all act as if this will solve the comment spam problem. And they know better! They're all smart enough to realize that it won't change anything. Why do they claim it then? Someone stepped off the cluetrain and half the blogosphere is following them. Maybe that is why I'm a bit sour today.
Martin Roell says it well (and gets linkjuice for this link in the comments): http://www.roell.net/weblog/archiv/2005/01/20/verwunderung_ueber_google_und_weblogsoftwarehersteller_ob_relnofollow.shtml
Yes, the query is useless, I didn't think it through. Sorry.
As far as comment moderation goes: Yes, in frequently maintained blogs, spam will be removed anyway, with or without rel. But it seems that many search engines will crawl frequently changing pages more frequently; the search engine will pick up the links before the owner can remove the spam. That's where an automated nofollow is also useful.
And the rel attribute is definitely useful in abandoned or forgotten blogs, guestbooks, wikis etc. Blogger and other hosts are chock full of abandoned blogs, and these abandoned blogs are chock full of comment spam. Since there are no legitimate comments in those blogs anyway, an automated global rel=nofollow doesn't hurt anyone.
Posted by: Matthias at January 20, 2005 09:52 PM
A quick point about Wikipedia editors. I'm fairly sure a complex system like Wikipedia won't just slap this nofollow attribute on every single external link. Moderated pages will most likely have the attribute removed.
Posted by: Jon at January 21, 2005 10:38 AM
Jon: I would hope so. Unfortunately, in the current incarnation all external links just got that attribute:
http://mail.wikipedia.org/pipermail/mediawiki-cvs/2005-January/006255.html
Wikipedia has only very few moderated pages, although they seem to be working towards some reviewing model.
A solution could be to have the attribute for the first week or so, and if the link survives that week without being removed by someone, then remove the attribute. But then again, spamlinks never survived longer than a few minutes anyway and that didn't stop the spammers from trying... So you could just leave out the attribute from the beginning!
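The one-week probation idea could look roughly like this. It is only a sketch: the render_link helper, the exact seven-day window and the way link ages are tracked are assumptions for illustration, not how MediaWiki or any blog software actually does it.

```python
from datetime import datetime, timedelta
from html import escape

# "The first week or so": the exact length is an assumption.
PROBATION = timedelta(days=7)


def render_link(url: str, text: str, added_at: datetime, now: datetime) -> str:
    """Render an external link, marked nofollow only while it is still new."""
    on_probation = now - added_at < PROBATION
    rel = ' rel="nofollow"' if on_probation else ""
    return f'<a href="{escape(url, quote=True)}"{rel}>{escape(text)}</a>'


if __name__ == "__main__":
    added = datetime(2005, 1, 20)
    # One day old: still carries the attribute.
    print(render_link("http://example.com/", "example", added, datetime(2005, 1, 21)))
    # Eight days old and not removed by anyone: counts as a normal link.
    print(render_link("http://example.com/", "example", added, datetime(2005, 1, 28)))
```

A link that nobody removed during the window is then rendered without the attribute and counts as a vote again.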
I never thought about nofollow this way, I too was eager to get rid of comment/wiki spam by easy means. The points you list seem very insightful. I didn't get the point about robots.txt though, how can that help remove comment spam? Or rather: I tried to use a robots.txt file to do just that, but since the standard lacks regexps, it failed to work with my particular choice of wiki. Is that just tough luck, or is there more to that Google redirection service you mentioned, and I just didn't get it?
Posted by: Tobias at March 31, 2005 11:20 AM