AoV’s Roving Reporter on Google/Blogger

Not wanting to be left out when something interesting happens, I emailed Google’s Press Center about their acquisition of Pyra (the company behind Blogger). I received a sparse, but prompt and polite reply from the Director of Corporate Communications including this statement:

“Google recently acquired Pyra Labs, developers of Blogger — a self-service weblog publishing tool used by more than one million people. We’re thrilled about the many synergies and future opportunities between our two companies. Blogs are a global self-publishing phenomenon that connect Internet users with dynamic, diverse points of view while also enabling comment and participation. In the coming weeks, we will report additional details. Blogger users can expect to see no immediate changes to the service.”

Often when webloggers are covering a developing news story, there is the tendency to wait for the commercial media outlets to get the official story. Why wait? Pick up the phone and call. You’ll probably just get the company line (like in this case), but it’s worth a shot.

Also, why did it take News.com a day and a half to catch up on this story? I still don’t see it on Salon.

 

Best Webcam Shot of the Year

When it gets this cold (-40 with the windchill in Charlottetown today), you have to talk about the weather.

From the IslandCam in downtown Charlottetown today:

today: boxers AND briefs

 

Universal Access to All Human Knowledge

A link from Matt Haughey’s a.wholelottanothing.org led me to a talk by Brewster Kahle of the Internet Archive. His organization is working on a variety of projects to make public domain content available in an “internet library”. Among these projects is the WayBack Machine, which archives the web.

The talk is part of a series at the Library of Congress and runs 1 hour and 26 minutes in RealVideo format. It is worth watching: Brewster Kahle: Public Access to Digital Materials (1hr 26min RealVideo).

Kahle’s basic idea is universal access to all human knowledge. Every book, speech, TV show, website, concert, etc. should be available to all of us. He looks at four main questions:

  • Should we? Yes!
  • Can we? Is it logistically possible from a technical and financial perspective? His answer: Yes.
  • May we? Will we be allowed to make all knowledge available under law? His answer: Yes.
  • Will we? He leaves this as an open question.

His numbers on the cost to digitize (scanning, etc.), store (disk space), and make available (bandwidth) all human knowledge are fascinating. According to Kahle, the hardware and labour costs required to make every book, television show, and piece of music ever created available are not prohibitive (in the hundreds of millions of dollars).

Taking books for example:

  • There are roughly 100,000,000 books ever created
  • The Library of Congress has about 26,000,000 books (I was impressed and amazed that the Library of Congress has 26% of all books ever created)
  • A book costs between $10 and $100 to acquire and digitize
  • A book takes up about 1 MB of space
  • 26,000,000 books would take 26 TeraBytes.
  • 1 TeraByte costs about $60,000
  • The entire Library of Congress could be stored for about $1.5 million
  • Books can be printed, cut, and bound for $1/book from a mobile book printer (~$15,000)
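
The storage arithmetic in this list is easy to check. Here is a minimal sketch in Python using only the figures quoted above; the one assumption of mine is the decimal convention that 1 terabyte equals 1,000,000 megabytes, and the constant and function names are made up for illustration.

```python
# Figures as reported from Kahle's talk in the list above.
BOOKS_IN_LIBRARY_OF_CONGRESS = 26_000_000
BOOKS_EVER_CREATED = 100_000_000
MEGABYTES_PER_BOOK = 1
DOLLARS_PER_TERABYTE = 60_000

# Assumption (not from the talk): decimal units, 1 TB = 1,000,000 MB.
MEGABYTES_PER_TERABYTE = 1_000_000


def storage_cost(book_count):
    """Return (terabytes, dollars) needed to store book_count books."""
    terabytes = book_count * MEGABYTES_PER_BOOK / MEGABYTES_PER_TERABYTE
    return terabytes, terabytes * DOLLARS_PER_TERABYTE


loc_terabytes, loc_dollars = storage_cost(BOOKS_IN_LIBRARY_OF_CONGRESS)
all_terabytes, all_dollars = storage_cost(BOOKS_EVER_CREATED)
loc_share = BOOKS_IN_LIBRARY_OF_CONGRESS / BOOKS_EVER_CREATED

print(f"Library of Congress: {loc_terabytes:.0f} TB, ${loc_dollars:,.0f}")
print(f"Every book ever created: {all_terabytes:.0f} TB, ${all_dollars:,.0f}")
print(f"Library of Congress share of all books: {loc_share:.0%}")
```

By these numbers the Library of Congress comes to 26 TB and $1,560,000, which rounds to the figure quoted in the talk. The sketch covers disk space only, not the $10 to $100 per book to acquire and digitize.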

If anyone has the right to make these claims, it would be Kahle, whose organization is storing massive amounts of data as part of their WayBack project and other projects.

 

JetsGo Frustration

Canadians, especially those of us who live in Charlottetown, have been excited about JetsGo lately. It’s a new airline which will hopefully provide less expensive airfare between Canadian (and a few US) locations.

Their website lets you shop around for flights by feeding it an origin, destination, and preferred departure time. This is great too. I like being able to get quotes right away and enjoy shopping around a bit.

But my frustration comes from a single part of the interface that could easily be fixed. If there is no flight available on the day you picked, it says as much, then offers two buttons: “One Day Before” and “One Day Further”. Now, with bigger cities, this works great, because there is always a flight coming in or out (apparently) on any given day. Toronto to Montreal, for example, has a few on each day. But try Charlottetown to anywhere. It’s a wasteland of emptiness! And you can only iterate through day by day. It takes forever!

It would be great if there were “Skip ahead to next flight” buttons. And I doubt it would be difficult to implement.
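
To back up the claim that this wouldn’t be hard, here is a rough sketch in Python. I have no idea how the JetsGo site actually stores its schedule, so this assumes the simplest case: a sorted list of the dates on which a given route has a flight. The function names and the example schedule are invented for illustration.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def next_flight_date(flight_dates, requested):
    """Return the first flight date strictly after `requested`, or None."""
    index = bisect_right(flight_dates, requested)
    return flight_dates[index] if index < len(flight_dates) else None


def previous_flight_date(flight_dates, requested):
    """Return the last flight date strictly before `requested`, or None."""
    index = bisect_left(flight_dates, requested)
    return flight_dates[index - 1] if index > 0 else None


# Made-up schedule for a sparsely served route; must be sorted ascending.
example_schedule = [date(2003, 3, 2), date(2003, 3, 9), date(2003, 3, 16)]
requested_day = date(2003, 3, 4)

print(next_flight_date(example_schedule, requested_day))      # 2003-03-09
print(previous_flight_date(example_schedule, requested_day))  # 2003-03-02
```

Each button becomes a single lookup instead of a string of empty day-by-day pages. When there is no earlier or later flight at all, the functions return None, and the site could simply say so.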

 

Cereal Impostors

Generic-brand cereal names observed at the grocery store:

  • Oatie-O’s
  • Frosted Oatie-O’s
  • Neon Crunch (or in French, Croquant Fluon)
  • Fruity Hoops (right next to the Fruit Loops)
  • Lickety Split-O’s
 

Writing semantic markup: Robots to the rescue!

Some very smart people think that the next big leap in web technology will be on the foundation of the Semantic Web. However, some other very smart people are raising concerns that this semantic utopia may be unattainable.

Matthew Thomas is an interface designer from New Zealand. Yesterday on his website, he posted a summary of a few of these smart people’s concerns about the move towards semantic markup on the web. The biggest problem is that people just don’t care about the semantic web. It takes an essay just to explain what the semantic web is – but that doesn’t mean it’s not a worthwhile idea.

I’m sympathetic to Thomas’ points here. I’ve been working to move a web-based system to the XHTML standard. On top of the usual CSS struggles (my mind still thinks in [table] tags, but I’m slowly learning to love CSS), I’m running into a difficult problem. On this particular web system (and on many, if not most, web systems), the users generate most of the content.

First of all, the web is a crappy medium for writing. It’s good for publishing what you write, but it is terrible at the actual writing stage. Spell checking, periodic backup, saving drafts, etc. – all features we’ve grown accustomed to in word processing – are sitting there, in the next window, just a few pixels away from our arcane DOS-esque text-only [textarea] form input box. Lame.

So we need the browser makers to put better text-editing tools at our disposal. However, here’s where it gets a little complicated. You’ve probably heard hot-shot web developers scoffing at WYSIWYG web-editors before. This is mostly because they produce messy and convoluted code. There is a deeper problem, though. The web is not a WYSIWYG medium. The whole idea of XHTML and CSS is that you can separate design from content – style from meaning. WYSI-not-WYG.

A simple (inane) example: I recently posted a reply to a post on the Signal vs. Noise weblog. I included a quote in my reply. I used the [blockquote] tag to indicate which part of my reply was a quote. When I submitted the post, I was pleasantly surprised to see that our friends at Signal vs. Noise had included some nice formatting for the blockquote tag in their stylesheet. As a result, my quote was nicely formatted to fit in their style and layout.

There is a powerful idea behind this simple example. When I used the [blockquote] tag, I wasn’t ‘formatting’ my post. I was adding meaning to the text – I was using machine-readable language to tell web browsers that the next few words are a quote. I didn’t know exactly what it was going to look like. (Note: there are better ways to cite a quote, but this example makes the point)

I’m not sure we can expect everyone to make this distinction. I do think, however, that people can produce writing with semantic markup if the software does the hard work.

We need a semantic-friendly WYSIWYG text editor for the web. Here are some proposed features:

  • Hide the code from the writer (but make it accessible to those who want it – as many current editors do).
  • Provide only semantic tools: lists, blockquotes, citations, links, emphasis, strong, etc.
  • Not quite WYSIWYG: show the text in real time in a typically styled format – perhaps even adopting the style of the destination website.
  • Automate the creation of meaningful markup. For example, when a link is created, prompt the author for a descriptive link title.
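
Two of the features above (semantic tools only, and prompting for link titles) can be roughed out with nothing but Python’s standard `html.parser` module. This is only a sketch under my own assumptions: the list of allowed tags and attributes, the `SemanticFilter` class, and the `keep_semantic_markup` function are all hypothetical names, and it is meant as an illustration of the idea, not a security-grade HTML sanitizer.

```python
from html import escape
from html.parser import HTMLParser

# Assumed whitelist: tags that carry meaning rather than presentation.
SEMANTIC_TAGS = {"p", "ul", "ol", "li", "blockquote", "cite", "a", "em", "strong"}
ALLOWED_ATTRIBUTES = {"a": {"href", "title"}, "blockquote": {"cite"}}


class SemanticFilter(HTMLParser):
    """Keep whitelisted tags, drop the rest, and note links with no title."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.output = []
        self.links_missing_title = []

    def handle_starttag(self, tag, attrs):
        if tag not in SEMANTIC_TAGS:
            return  # drop the tag itself; its text content is still kept
        allowed = ALLOWED_ATTRIBUTES.get(tag, set())
        kept = [(name, value or "") for name, value in attrs if name in allowed]
        if tag == "a" and not any(name == "title" and value for name, value in kept):
            # An editor could prompt the author for a title at this point.
            self.links_missing_title.append(dict(kept).get("href", ""))
        rendered = "".join(f' {name}="{escape(value)}"' for name, value in kept)
        self.output.append(f"<{tag}{rendered}>")

    def handle_endtag(self, tag):
        if tag in SEMANTIC_TAGS:
            self.output.append(f"</{tag}>")

    def handle_data(self, data):
        self.output.append(escape(data, quote=False))


def keep_semantic_markup(markup):
    """Return (filtered markup, hrefs of links that lack a title)."""
    parser = SemanticFilter()
    parser.feed(markup)
    parser.close()
    return "".join(parser.output), parser.links_missing_title


draft = '<p><font color="red">Hello</font> <em>world</em> <a href="http://example.com/">link</a></p>'
cleaned, untitled_links = keep_semantic_markup(draft)
print(cleaned)
print(untitled_links)
```

The presentational `font` tag disappears while its text survives, and the link without a title is reported so the editor can ask the author for one.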

By the way, someone has come up with an apt name for what I’m doing here. It’s called the LazyWeb – when smart-asses like me rant and rave, but don’t do anything about it. The hope is that through the LazyWeb, people willing to write code and implement can meet up with the idea (read: lazy) people.

 

I know what you’re thinking

In dealing with the emergence of artificial intelligence, Steven Berlin Johnson’s book Emergence points out that self-awareness – consciousness – may be secondary to our awareness of others. An animal, for example, is at a great evolutionary advantage if it can understand what other animals are, and are not, aware of. If I am the first to turn a corner, I am aware of what’s around the corner, while those behind me are not aware. When I am behind someone, facing their back, they cannot see what I can see behind them. Obvious, isn’t it?

Quoted from Emergence:

“[our skill at imagining other people’s mental states] comes so naturally to us and has engendered so many corollary effects that it’s hard for us to think of it as a special skill at all. And yet most animals lack the mind-reading skills of a four-year-old child. We come into the world with a genetic aptitude for building “theories of other minds,” and adjusting those theories on the fly, in response to various forms of social feedback.”

“We’re conscious of our own thoughts, the argument suggests, only because we first evolved the capacity to imagine the thoughts of others. A mind that can’t imagine external mental states is like that of a three-year-old who projects his or her own knowledge onto everyone in the room … But as philosophers have long noted, to be self-aware means recognizing the limits of selfhood. You can’t step back and reflect on your own thoughts without recognizing that your thoughts are finite, and that other combinations of thoughts are possible … Without those limits, we’d certainly be aware of the world in some basic sense – it’s just that we wouldn’t be aware of ourselves, because there’d be nothing to compare ourselves to. The self and the world would be indistinguishable.”

If you follow the development of this ability to build a mental model of what others are aware of far enough, you may eventually arrive at the abstract realization that you too have a limited point of view, just like the others. The ability to form a mental model of what others perceive is extended to the self.

Reading Emergence coincided with my introduction to the Eastern philosophy that there is no self – only a collection of actions and thoughts. When you look deep inside, there may be nothing there. You are only the sum of your thoughts and actions, without which there is nothing (I’m not sure I believe this, but it struck a chord). So it is with software. Software is nothing but a set of instructions. When you take away the instructions, there is nothing left.

Putting these two concepts together left me thinking that artificial intelligence isn’t such a distant or impossible concept. Perhaps there is no difference between our own intelligence and artificial intelligence.

 

Stylish prose in the public domain

Having read the BoingBoing.net weblog for a while, I first discovered Cory Doctorow’s fiction when Salon published his dystopian digital-rights-management-inspired sci-fi short story, 0wnz0red.

Doctorow has recently published his first novel, Down and Out in the Magic Kingdom. The novel is published on paper by Tor Books, but it is also available for free download under a Creative Commons license (see previous aov post on the Creative Commons). The Creative Commons site features an interview with Doctorow about the release.

I have been reading the HTML version, but wanted to tweak the formatting to improve readability. Since the HTML version was in beautifully structured XHTML Strict markup, it was quite simple to modify.

With the kind permission of the author, here is the HTML version of Cory Doctorow’s Down and Out in the Magic Kingdom with my alternative stylesheet.

 

The Neglected Page Website Design Test

Something I like to do when evaluating the design of a new website I’ve come across is look for a neglected page – I often use the contact page, because they tend to be easy to find. Privacy policy pages are also good for this purpose.

The idea is that these pages are often neglected during the design process – with content dumped into a template after the designer has long since left or is busy prettying up the front page.

If a site is solid and consistent, even these commonly neglected pages will be well designed. They don’t have to be flashy. There’s nothing exciting about a privacy policy or contact information. Clear, concise writing and a clean layout and design will do nicely.

Try it out next time you come across a new site.

 

Browser of Babel

Today, Apple introduced a new browser, called Safari. Also today, every web designer and developer in the world let out a big sigh and said a big collective: Crap!

As you can see from my last post, the differences between browser rendering engines give us indigestion. That said, my first impression of Safari was quite good. For those that care, it is not based on the Gecko engine as some may have expected (and hoped). Rather, it uses KDE’s KHTML library from the Linux community. More web developer info on Safari is available at dive into mark.

An interesting note about Apple’s release of Safari and a few other Apple apps this past year: they are blurring the line between beta and final versions of software. This is a beta release of Safari, but it is launched and promoted with as much fanfare as any final product would be. For better or for worse, the public is now a beta tester (not necessarily a bad thing – especially since only those who want it will bother downloading it).