November 21, 2005

What does Licensing a Specification under Creative Commons Actually Mean?

As with RSS and other formats, Microsoft today released its draft specification for Simple Sharing Extensions for RSS and OPML under a Creative Commons license. I will dig into the technical details later, but it is clear that, should this gain some traction, it will be quite a significant contribution to the web (an earlier, similar draft specification unfortunately didn't get much attention).

But for now, I'm interested in a much more basic question: What implications does licensing the specification under CC have for implementors? I mean, people are making quite a fuss about Microsoft using that license, but what is the actual benefit for me as an implementor? Sure, there are certain things I'm allowed to do with the document (i.e. the form) per se, but what can I do with SSE (i.e. the information described in the document) that I can't do with something that is specified in an "All Rights Reserved" document? E.g., I would be allowed to describe the protocol in my own words anyway, and I could choose any license for that version. It wouldn't be the official version, but neither would any CC-allowed derivative be. There seems to be no limitation (i.e. no attribution or share-alike necessary) on the license of the code that implements SSE (thankfully). So far, I don't see the special case for specifications.

I can see that it is extraordinary that Microsoft uses an essentially GPL-like license (share-alike is what Microsoft calls a viral license), but they use it in a place where there seem to be no relevant practical consequences of doing so. There is nothing really "non-commercial" about it, as some claimed. The copyright and the license cover the form of the specification, not what is specified. And licensing the text under CC also has no apparent consequences for the patentability of the underlying ideas. So essentially I can sing the specification or print it on a t-shirt, and I can modify it before doing so. What else? Why is everyone so excited about it? Even Tim Bray seems to take the CC license as a guarantee of it being legally unencumbered. And Tim obviously knows a lot about format specs. So, am I missing something?

Can someone help me out here?

Update: Philip Jacob asks the same question. No clear answer yet, but it seems that the CC part in this context is just good PR and not much more. So then, the question remains what the licensing-unrelated part about patents and Microsoft means. I find the formulation ("... agrees to offer a royalty-free patent license on reasonable and non-discriminatory terms and conditions ...") pretty vague, although it at least sounds as if it is meant in good spirit. Others have a less optimistic interpretation here.

Posted by seefeld at 19:52 | Comments (1)

November 09, 2005

The Classified Market and the Aggregators

Abstract: Off- and online classifieds business models are based on selling access to a scarce resource, readers. While classifieds aggregators (like simplyhired, Oodle and others) do deliver free traffic to those sites, it isn't nearly as valuable, as the value-generating property (scarcity) is stripped off: Those users would find the ad through any other site, too. However, blocking aggregators would be short-sighted, since even a market leader would be bound to lose market share over time. The era of selling vanilla ads to scarce audiences will be over sooner than many wish, as the base cost of publishing ads vanishes. This leaves a huge gap in monetization, and the key strategy to capture this money would be to base the value proposition on something one level up the stack from hosting ads. The article quickly discusses a few such ideas. Further, it seems like providing a strong API, i.e. helping the aggregators, would be a smart move for these sites, as this would spur necessary innovation and, most importantly, keep power concentration at the aggregator sites in check.

Recently, a number of startups have begun to tackle the aggregation and search of classifieds over multiple sites. Switzerland will see its share coming, too, as at the beginning of this year a landmark court decision cleared the cloud of uncertainty around this model -- interestingly, one of the plaintiffs (tamedia) was the first larger company to launch such a site, piazza.ch.

Naturally, the classical classifieds sites are wary of the situation. The aggregators' argument is that the classified sites profit themselves, because they get traffic for free. While this is certainly true, the argument is also a bit disingenuous, as the aggregators move themselves into an interesting position, and with relatively little effort. Interesting, because they're the ones who can capture most of the user's click flow and slowly gain the market position to sell advertising or even premium positions in the result list. In essence, they say "we give you free traffic to your classifieds but want to show the search results page in exchange". And Google demonstrated how financially interesting that page is.

That said, the current classified sites have not been very innovative with their results page. The business model is an almost 1:1 copy of the offline model: money is exchanged for exposure to an exclusive (scarce) audience. The offline model has the audience's scarcity built in. The first wave of the remaking of the classifieds market was that the audience moved online; the role of offline publications was taken over by online destination sites. Newspapers lost market share by clinging too long to their old, profitable offline business. But all in all, not much innovation has happened in this market, and surely not much of it was driven by the fact that we're online (except for the convenience of reaching the site, of course).

The aggregators are removing the scarcity, the traditional reason for paying for the ad. In principle you can put your ad on the cheapest site that is being crawled; that is, the price will quickly plummet to zero.

A rather ignorant reaction of a classified site could be to technically block all kinds of spiders (Craigslist did, but apparently more for technical reasons). While that would slow the proliferation of aggregators, smaller sites would still be happy to be crawled, at the latest when the big portals enter this business. A market-leading site would retain its scarcity for the moment, but would be bound to lose more and more market share over time. This model is not sustainable.

As a classifieds site, a simple answer would be to jump on the bandwagon. Use your brand recognition to get a head start in the search business, and show your own customers at the top of the result list - the most obvious monetization scheme. In fact, the launch of piazza.ch here points in exactly that direction, and several job sites have been doing this forever. The elephant in the room here is the question of whether a half-hearted approach, especially regarding the ranking, will be strong enough against new sites that focus on being an aggregator. Remember, the barrier to entry for starting in that market is low and getting lower. Have a good idea? With almost no investment you can try it out. These are the signs of a market ripe for a lot of small innovations. Maybe we will have thousands of aggregators, all catering to small niches. Why not, indeed? And while these innovations happen, the classified site following the aforementioned strategy is stuck with essentially the old model, this time with audience scarcity created artificially in the ranking, while simultaneously giving up all other differentiation from other sources of classified ads.

Another option would be to move the value away from exploiting the scarcity of audience and/or attention and create it with something else. What is clear is that the price of an ad is often marginal for the advertiser anyway, thus there is still a lot of money on the table, even a chance to recapture what was lost in the dwindling print market. What are the possibilities?

Service comes up as a natural example. Maybe service in publishing the ad, making this ridiculously easy ("just send in your item, we'll photograph and describe it and send you the money when it's sold"; there are companies doing this for ebay). Make great ads, compete for the highest conversion of lookers to buyers. Or offer good integration with backend systems. But most interestingly, handle the life and afterlife of the ad. Many types of ad are just there to look for the first and only customer that is willing to pay. All other leads just cost money if not properly redirected. The used car salesman will probably take any lead anyway, but if you're renting out a flat for a predetermined rate, any affluent enough tenant is good enough, and showing the flat to 20 people is only a cost factor. Find ways to qualify leads, automatically take the ad down when a good lead is found, and maybe even monetize surplus leads in novel ways? That would be something that wasn't really possible offline!

A good neighborhood would be another possibility. Become an umbrella brand for trusted/valuable/.. ads. Become a label! I remember an old statistic from isbn.nu, a price comparison engine for books. While amazon.com almost never had the cheapest offer (rather, they were in the middle of the field), they still got the most sales - by a large margin. Sure, many factors specific to e-commerce retailers play in here, but one could imagine a brand for ads, with a firm standard for who can advertise, that translates into a trust that boosts the quality of leads significantly. Or have a trust system specific to your site. Ebay and craigslist lead here in their specific ways, but not much is seen here from traditional players.

In all these models, the increased circulation is really a blessing and the sites would be well served by making the work easy for the aggregators. Publish the ads in RSS feeds, maybe even additional APIs. By using something like feedtree you can offer an up-to-the-minute current feed at virtually no additional cost. Enable the market of new and hopefully innovative aggregators.
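To give an idea of how little work publishing ads as a feed takes, here is a minimal sketch that turns a list of ads into an RSS 2.0 document with Python's standard library. The function name `ads_to_rss`, the ad fields (`title`, `url`, `text`) and the example.org URLs are assumptions made up for illustration, not any site's actual format.

```python
import xml.etree.ElementTree as ET

def ads_to_rss(site_title, site_url, ads):
    """Render ads (dicts with hypothetical keys title/url/text) as an RSS 2.0 string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires title, link and description on the channel.
    ET.SubElement(channel, "title").text = site_title
    ET.SubElement(channel, "link").text = site_url
    ET.SubElement(channel, "description").text = "Latest classified ads"
    for ad in ads:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = ad["title"]
        ET.SubElement(item, "link").text = ad["url"]
        ET.SubElement(item, "description").text = ad["text"]
    return ET.tostring(rss, encoding="unicode")

# Made-up example data.
feed = ads_to_rss(
    "Example Classifieds",
    "http://example.org/",
    [
        {"title": "3-room flat", "url": "http://example.org/ads/1", "text": "Bright, central."},
        {"title": "Used bicycle", "url": "http://example.org/ads/2", "text": "Good condition."},
    ],
)
```

An aggregator can read such a feed with any off-the-shelf RSS parser, which is the point: the site does not have to agree on anything with each aggregator individually.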

In fact, lowering their barrier to entry even further helps level the market, so that no single aggregator becomes powerful enough to demand money for very current crawls or something like that.

It also enables a forest of mini sites, tailored to very specific audiences. A single-parent community website maybe, that marks out neighborhoods with fellow members? Or a website dedicated to a single programming language, happy to point to job offerings for its favorite language?

To wrap up, just a quick example of how such a strategy could help a whole industry get out of the lock of a near-monopoly comparison site: In Switzerland, every year the new rates for health insurance are published on the same day. Many, many people flock to comparis.ch, the leading comparison site. Their lock on the market is so strong that they can demand high fees for every lead, and dropping out of their engine is not really feasible for an insurance company. Some smaller and quite cheap ones are in fact not included, but few people know. One of the larger insurance companies, CSS, built their new price calculator around web services. What if they opened it up and established it as a standard, and most other companies followed suit? Creating a site more powerful and more complete than comparis would be on the order of an assignment for a CS student. The lock of comparis could quickly crumble, saving the insurance companies the fee for the lead, maybe giving it back to the customers.

While this is probably a bit harder to see in the classifieds market, my guess is that a healthy transformation of this market could be driven by similar patterns; and surely there is still a lot of money there, you just have to figure out how to create value worth paying for. In any case, we are entering a phase of real innovation in this field, with interaction models and business models beyond almost direct translations of the offline world.

Posted by seefeld at 22:34 | Comments (0)

November 05, 2005

Jump Starting Recommendation Engines with Tagged Bookmarks

A couple of weeks ago, I came across an interesting post on tagging at topix. Especially the quote on Raw Sugar, about "value added search around the tagging done by individuals on their own data" got me thinking. There is certainly the way My Web 2.0 is integrated into Yahoo! Search (try it out, they import del.icio.us bookmarks and it's amazing how often I search for stuff I already bookmarked).

But even more intriguing could be document analysis. Services like Findory are based on finding documents similar to what the system learned were other interesting documents (grossly simplified), and the most important input for the known interesting documents is the users' previous click flow. While an unstructured bookmark collection would certainly serve as a starting vector, a tagged collection could be much more useful: My tagging reveals what aspect of the article interested me. I might have bookmarked this site on paper airplanes, but the tag fun will reveal that the reason was more a meta aspect of the document than a vivid interest in aeronautics or handicrafts. Tagging an article on Apple with innovation will emphasize the portions on their methodology as what interested me (in contrast to, say, stories about users of Apple products in general).

How would that be implemented? All these personalization engines will extract document features and look for repetitions among the pool of interesting documents (again, grossly simplified). Common document features will be emphasized, rare ones dropped. By grouping documents by their tags, the features that these documents have in common could be emphasized even more. Then we can see if other users tagged the same group of documents and again filter document features against their profiles for the tags they chose for these documents (with bonus points if the tags are actually the same, not just the tagged documents). From research in collaborative filtering, we know that it works astonishingly well as long as the prediction stays in the same domain: That we both read the Hitchhiker's Guide doesn't mean we like the same music. Tagged documents might be the element to effectively use collaborative filtering techniques to extract relevant document features.
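The first step, emphasizing features shared within a tag group and dropping the rare ones, can be sketched in a few lines. This is a toy illustration under loud assumptions: "features" are just lowercase words, the weight is simply the share of a tag's documents containing the word, and the names `features`, `tag_profiles` and `score` as well as the sample bookmarks are made up here. A real engine like Findory surely uses far richer features, and the cross-user collaborative filtering step is left out entirely.

```python
from collections import Counter, defaultdict

def features(text):
    """Crude feature extraction (an assumption): the set of lowercase alphabetic words."""
    return {w for w in text.lower().split() if w.isalpha()}

def tag_profiles(bookmarks, min_docs=2):
    """Build one feature profile per tag from (text, tags) pairs.

    A feature's weight is the fraction of the tag's documents that contain it;
    features seen in fewer than min_docs documents of the group are dropped.
    """
    groups = defaultdict(list)
    for text, tags in bookmarks:
        for tag in tags:
            groups[tag].append(features(text))
    profiles = {}
    for tag, docs in groups.items():
        counts = Counter(f for doc in docs for f in doc)
        profiles[tag] = {f: n / len(docs) for f, n in counts.items() if n >= min_docs}
    return profiles

def score(profile, text):
    """Score a new document against a tag profile by summing matched feature weights."""
    return sum(profile.get(f, 0.0) for f in features(text))

# Made-up bookmarks: (document text, tags the user chose).
bookmarks = [
    ("paper airplanes folding guide", ["fun"]),
    ("funny cat folding tricks", ["fun"]),
    ("apple design methodology", ["innovation"]),
    ("design methodology at startups", ["innovation"]),
]
profiles = tag_profiles(bookmarks)
```

Note how the innovation profile keeps "design" and "methodology" but drops "apple": the tag group says that the methodology, not the company, is what the documents have in common.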

Maybe someone like Findory will try this? While the target audience for this probably isn't larger than the web 2.0 crowd, it would surely be interesting and lessons might be applicable to other loosely structured collections.

Update 2005-12-13: Yahoo buys del.icio.us. Of course.

Posted by seefeld at 16:17 | Comments (0)

semi annual state dump

Yet again, it has been too long since my last entry. Let's see if I get a handle on this blogging thing this time. A lot has happened in the meantime. For one, my backup-brain motivation for blogging almost completely goes into my del.icio.us bookmarks. Increasingly, I am becoming aware of being read there, too, and I am starting to try to be more interesting in my bookmarking - but not always.

It has been more than a year since we launched map.search.ch. The technology, or rather the movement behind its increased usage, has since been christened AJAX, and after Google last February and Microsoft this summer, Yahoo yesterday launched their version in a similar style. I'd say the dynamic interface we did has been thoroughly validated in the market, which is very gratifying to see! And it's nice to see that principles like strong back button support are becoming a standard part of AJAX development (Oh, I wouldn't ever claim that all this happened because of us. I'm sure that the technology and idea were ripe, but it makes me happy that we were the first to pluck it).

And of course, development on map.search.ch went on. It is documented in the (German) company blog at about.search.ch. Highlights include points of interest, among them live updated ones like free parking spaces, real time water temperatures (which were more useful in summer..), webcams and, my favorite, the real time public transport schedules.

And as a special treat, map.search.ch got nice national recognition by winning the Master of Swiss Web 2005 award, and even by a large margin!

So much for the incomplete wrap-up of the last half year. I am currently very involved in a new project, which will shortly see the light of day. But now on to some stalled but hopefully more interesting posts!

Posted by seefeld at 16:09 | Comments (0)