I just returned from the first Bern flashmob. Unfortunately the timing wasn't perfect. Nevertheless, the number of people was quite high; I'd guess at least 40 or 50. They were a little bit hard to distinguish from normal tourists (although slashdot t-shirts were giveaway hints). I took two pictures for my moblog.
If you want to learn more about flashmobs, have a look at cheesebikini, who covered the original flashmob.
In Japan, Yahoo! BB is offering 12 Mbit/s DSL for $21 a month! Symmetric! They also offer VoIP services, where a call is about ten times cheaper than with NTT. A call to another Yahoo! BB subscriber is completely free. Imagine video over these pipes. Imagine person-to-person sharing and collaboration. Imagine what kind of markets for services and digital equipment this creates.
How do they do this? A whopping $2 billion investment on the one hand and cheap, commodity IP equipment (gigabit switches) on the other (Hmm, how expensive was our airline again?).
Now would somebody please unlock the last mile here?
Nutch is a nascent effort to implement an open-source web search engine.
Web search is a basic requirement for internet navigation, yet the number of web search engines is decreasing. Today's oligopoly could soon be a monopoly, with a single company controlling nearly all web search for its commercial gain. That would not be good for users of the internet.
Nutch provides a transparent alternative to commercial web search engines. Only open source search results can be fully trusted to be without bias. (Or at least their bias is public.) All existing major search engines have proprietary ranking formulas, and will not explain why a given page ranks as it does. Additionally, some search engines determine which sites to index based on payments, rather than on the merits of the sites themselves. Nutch, on the other hand, has nothing to hide and no motive to bias its results or its crawler in any way other than to try to give each user the best results possible.
Nutch aims to enable anyone to easily and cost-effectively deploy a world-class web search engine. This is a substantial challenge. To succeed, Nutch software must be able to:
- fetch several billion pages per month
- maintain an index of these pages
- search that index up to 1000 times per second
- provide very high quality search results
- operate at minimal cost
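To get a feeling for what these targets mean, here is a rough back-of-the-envelope sketch. The figure of 3 billion pages is only my stand-in for "several billion", and a 30-day month is assumed; neither number comes from the Nutch project.

```python
# Rough sizing of the targets listed above.
# Assumptions (not from the Nutch project): "several billion" is taken
# as 3 billion pages, and a month is taken as 30 days.

SECONDS_PER_DAY = 24 * 60 * 60
SECONDS_PER_MONTH = 30 * SECONDS_PER_DAY


def pages_per_second(pages_per_month, seconds_per_month=SECONDS_PER_MONTH):
    """Average fetch rate needed to crawl pages_per_month in one month."""
    return pages_per_month / seconds_per_month


def queries_per_day(queries_per_second, seconds_per_day=SECONDS_PER_DAY):
    """Daily query volume if the per-second rate were sustained all day."""
    return queries_per_second * seconds_per_day


if __name__ == "__main__":
    print(f"fetch rate: {pages_per_second(3_000_000_000):.0f} pages/s")
    print(f"query load: {queries_per_day(1000):,} queries/day")
```

So the crawler alone has to sustain on the order of a thousand fetches per second, around the clock. Note that the 1000 queries per second is stated as a peak ("up to"), so the daily figure is an upper bound rather than an expected load.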
Nutch is hosted by the Internet Archive and is backed by Mitchell Kapor (of Lotus and OSAF fame), Tim O'Reilly (of O'Reilly & Associates) and others.
Apart from the political arguments, I think that such an open source web search engine will successfully attract contributions because a web search engine is a tool that is used intensively and diversely by software engineers (hackers hack hack tools). And, of course, it is a very interesting technology to work on - but maybe I am a bit biased here.
It will be interesting to watch how they will decide what code to run on the cluster. Apart from new ranking algorithms, there could be a lot of web mining to support and/or supplement search. The open source community will come up with a lot of ideas.
As for hardware financing, this will be a challenge, but on the other hand the prices for the necessary hardware will continue to drop rapidly (up to the point where hardware is free). What will be costly is the maintenance, and most of it has to be done by full-time paid staff. But I guess that the Internet Archive is a good place to look for experience here.
What puzzles me is their current decision to implement it in Java. Maybe I am a bit old-fashioned here, but if your goal is to "operate at minimal cost" and to scale to "several billion pages" and "1000 queries per second", then hardware costs will be a major factor. In the age of JIT, Java surely isn't slower than natively compiled languages by whole factors. But optimizing things like L2 cache usage through the layers of abstraction introduced by language features, the virtual machine and the JIT compiler will be extremely hard, yet ultimately necessary (a one percent performance gain might, if they are successful, save you a couple of hundred servers). Of course, they will resort to implementing the crucial parts in C. But then they could have used a dynamic scripting language (one of the P-languages like Python, Perl or PHP) as glue instead and would have gained advantages like faster development.
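The server arithmetic behind that parenthesis can be sketched like this. The fleet size of 20,000 machines is purely my assumption, picked so that one percent lands at "a couple of hundred"; nobody has published such a number for Nutch. The sketch also assumes that load scales linearly with the number of servers.

```python
import math


def servers_saved(fleet_size, speedup):
    """Servers freed when each machine gets `speedup` faster.

    speedup is a fraction (0.01 means one percent more throughput per
    machine). Assumes load is spread evenly and scales linearly.
    """
    needed = math.ceil(fleet_size / (1 + speedup))
    return fleet_size - needed


if __name__ == "__main__":
    # Hypothetical fleet size, for illustration only.
    print(servers_saved(20_000, 0.01))
```

The point is less the exact number than the shape: at that scale even tiny per-machine gains translate into whole racks.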
A search engine is not a huge project in terms of lines of code. It is, however, quite dense in hard passages: a lot of room for smart algorithms and data structures and highly optimized code. With a language you also get a culture, and while there are a lot of smart people in the Java community, their expertise lies in big projects and clever abstractions and modeling. In my experience, people interested in the guts of a search engine don't opt for Java in their own projects. But that might just be a bad prejudice of mine.
Interestingly, they seem to mandate unit tests for submitted code and I always wanted to see a non-trivial real test-driven project from inside, so this looks like a good opportunity to brush up my Java skills...
Ah, looking closer at the list of developers I spot Doug Cutting, 10+ year IR-veteran and Lucene author (which probably explains the choice of Java). And also Ben Lutch, co-founder of Excite. Exciting!
And with the Internet Archive nearby they will surely look into time-dependent link analysis. Many nice opportunities there!
Experimenting with Bluetooth and my new T610, I downloaded Salling Clicker. This is way cool.
It allows you to remote-control almost everything on your Mac with your mobile phone. I can use my phone as a remote control for iTunes, the DVD player or Keynote.
But best of all, it sports a proximity sensor. Now, when I walk away from my Mac, the screen blanker is automatically activated and iTunes stops. When I come back, the screen is unlocked again, iTunes resumes playing and my address book and calendar are automatically synchronized. Cool.
Update: Now my computer reads out the number of unread emails when I enter the room. With AppleScript I could teach it to give me an update on almost anything else, like buddies online or upcoming meetings and deadlines.
When I think about it, the next neat thing would be voice control over the phone, which has a much better suited microphone than the one integrated in the PowerBook and is with me all the time. Feels very much like Star Trek. Very geeky stuff.
Yesterday I got my Sony Ericsson T610. I figured that if I work on bild.li, a moblogging service, I'd need one...
So, now I can do all the cool stuff like MMS moblogging, which of course leads to my first post in my brand new moblog.
Chris is asking whether blogging is really useful or just yet another way to fill the web with junk (as happened to Usenet and e-mail).
The web is 99.999% uninteresting content _for_me_ anyway :) While a lot of what is said about social software looks like the late '90s hype, a lot of the current criticism resembles the early '90s internet skepticism.
I once saw a plot about the popularity curve of new technologies: flat (most), followed by a huge hype peak (some), followed by a downfall (all), followed by a slow and steady increase when the technology finally matures (few).
I think blogs are a manifestation of personal conversations taking back the web (hmm, sounds quite cluetrainish). Not only trafficwise (they always were important, and the blogs' traffic is hardly significant) but in terms of awareness and power.
There are some crucial differences between the old communication systems and blogs. Usenet and mailing lists are organized by topic/time/author; blogs turn this around and usually scrap the topic: they are organized by author and then time. (This observation is not mine. I think it was Clay Shirky's, but I can't find the reference now.) In a sense, the whole blogosphere is a grand distributed discussion system, which uses a white-list based pattern for relevance control instead of a topic-based pattern. This seems to work quite well, and I am confident that the system will be able to handle AOL blogs as well.
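To make the difference concrete, here is a toy sketch of the two ways of organizing the same posts. The Post record and both function names are made up for illustration; no real Usenet or blog software works exactly like this.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    author: str
    topic: str
    day: int   # simplified timestamp
    text: str


def by_topic(posts):
    """Usenet style: group everything by topic, then order by time."""
    threads = defaultdict(list)
    for post in sorted(posts, key=lambda p: p.day):
        threads[post.topic].append(post)
    return dict(threads)


def reading_list(posts, whitelist):
    """Blog style: keep only the authors I chose to read, ordered by
    author and then time. The topic plays no role in the selection."""
    chosen = (p for p in posts if p.author in whitelist)
    return sorted(chosen, key=lambda p: (p.author, p.day))
```

In the first function, junk posted to a topic lands in every reader's view of that topic. In the second, a junk author simply never makes it onto anyone's white-list, which is why I expect the pattern to scale.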
Another crucial difference is that the personality of the participating persons moves into the spotlight. This is a departure from the systems where the host of a discussion is a corporation and their branded environment struggles to take the spotlight. Humans come with well developed instincts about social behavior, which can tremendously help our ability to differentiate between junk and valuable mutterings. This turf-owning also spurs the online development of your thoughts: It is always you who defines the agenda. Now and in the future. You own your data.
And the third important thing, which also explains things like moblogs, which arguably are relevant for even fewer people, is what Elke once said: she thinks that the underlying meta-trend is people's desire to rebuild their existence online.
Others know much more about it. Joi Ito even manages to connect it to The Hitchhiker's Guide to the Galaxy! :-) Or Phil Ringnalda, who argues that the public nature of blogs yields small town effects.
An interesting article on how open services like Amazon's or Google's are slowly taking off. The important insight is:
"Organizations shifting to a open service platform model have clear goals. First, they hope to encourage incremental innovation that adds value to their core offering or core service using the resources of others. It's also a realization that most of the good ideas in the world don't come from your own staff. While you might very well have lots of smart people in your organization, they are also aware of and sensitive to the inevitable organizational constraints that tend to restrain innovation. "
The article talks about possible applications in the financial industry. Easy bill handling comes to mind. Be sure to build it in a simple way, without forcibly limiting how it can be used and by whom, and you might watch others carry your work into industries and applications you wouldn't have paid attention to. PayPal learned this and already eclipses American Express in online transactions.
(Via Ben Hyde)