Second Life OS

So the prof has us brainstorm in class about a "new and different" OS metaphor. I take the bait and throw out Second Life OS. Then he tells us we have to prototype it by Tuesday! Why oh why didn’t I stick with Google OS, which would just be a search box?


The fellow in the middle is Caspar David Friedrich, patron saint of POV. I have to say I do like the “reminder flies” buzzing around his head. I wanted to create a really annoying Twitter avatar but being short on time I used a flaming Adium bird instead. (Click through for more notes and larger images.)

Some features: Objects/files are placed in the virtual landscape, in clusters which may be the analogue of “folders”. Other users’ interactions with shared objects are visualized spatially, like the angel reading a PDF in the lower right. The level of animation detail for objects can range from traditional static icons through vanilla widgets (upper left) to biomorphic widgets like the reminder flies and full-blown avatarbots like the Adium bird.

P.S. I know that a Second Life OS is far too obvious to be an original idea. Not only is there plenty of prior art from the earliest days of VR, but it turns out that Scoble says Second Life already is an OS.

Going meta on metadata

In this week’s Organizing and Access to Information class I fear I steered the conversation into a detour on the subject of metadata. The prof’s introduction to the concept began straightforwardly enough — metadata is data about data, e.g. author, title, publication date, etc. But then as his examples got more complex he began to call things metadata that I would have considered part of the data itself. I don’t have his slides in front of me but I think it started when he put chapter 1, page 1 of Pride and Prejudice on the screen and said that its structure — chapters, paragraphs, etc. — was also metadata. That surprised me, and we spent probably far too much class time working on why. (I feel guilty but not too guilty — I kept offering to drop it, but other people had questions and comments, too.)

It boiled down to this: in my naive interpretation, metadata is information that applies to the “information object” as a whole, or is extrinsic to it in some way. I’d call anything integrated with the meat of the object “data”, not “metadata”. That includes structure and layout information: the information represented in typography and layout on a printed page, or in ordinary inline markup on the web, etc. Of course we can abstract structure away from presentation but that doesn’t mean that the structure is no longer part of the work. Where Jane Austen chose to put her paragraph breaks is as much a part of the novel as the words she chose to put inside them.

There were some interesting examples presented in class, such as whether the abbreviation and typography conventions used to identify the parts of speech in a dictionary are metadata. I argued that whether represented through an italic n. or an XML <part-of-speech> element, the part of speech is an integral part of the content. Or in multimedia terms, the string of bits representing an audio or video stream is the data and it doesn’t make sense to speak of the volume level or amount of cowbell as being a separate thing called metadata.
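To make the distinction concrete, here is a rough sketch of the dictionary example in Python. Everything in it is made up for illustration: the element names (entry, meta, body, headword, definition), the dictionary title, and the edition number are all assumptions, not anything shown in class. It only illustrates where my naive model draws the line, not where the line ought to be.

```python
import xml.etree.ElementTree as ET

# Hypothetical dictionary entry. All element names and values are invented.
ENTRY = """
<entry>
  <meta>
    <title>Example Dictionary</title>
    <edition>2</edition>
  </meta>
  <body><headword>cowbell</headword> <part-of-speech>n.</part-of-speech> <definition>a bell hung around a cow's neck</definition></body>
</entry>
""".strip()

root = ET.fromstring(ENTRY)

# "Metadata" in the naive sense: statements about the entry as a whole.
about_the_object = {child.tag: child.text for child in root.find("meta")}

# "Data" in the naive sense: everything in the flow, inline markup included.
in_the_flow = [(child.tag, child.text) for child in root.find("body")]

print(about_the_object)
print(in_the_flow)
```

In this sketch the part of speech lands in the second bucket, next to the headword and the definition. The prof's model would presumably file it under metadata instead, and nothing in the markup itself settles the question.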

The prof worked hard to explain his model and I was probably just being thickheaded not to get it. On reflection I see that the concepts of “data” and “metadata” are conventions and where to draw the line is a matter of utility; if it is helpful in a certain setting to call internal markup (or its pre-digital equivalents of layout and typography) “metadata” then so be it. Also, there are many times when it’s good for information to live in both places: you can hear the cowbell in the audio stream, and you may also want an access point in the catalog of your music library which says “Cowbell: track 6890691, timepoint 1:37”.

But I don’t think I’m alone in my naive model. The very next day’s reading assignment was chapter 3 of Erik Ray’s Learning XML, where he defines the term: “Metadata is information about the document that is not part of the flow.” What I’m wondering now is whether these diverging definitions have any consequences beyond occasional confusion.

The Semantic Web and the I Love You problem

Since I started out with a nod to Ben Hammersley, I’ll risk biting the hand that inspires me by taking issue with something he just posted to his blog.

It’s a long-lost (and nicely produced) video of an interesting and accessible talk he gave in summer 2003 entitled A Sporting Gentleman’s Guide to the Semantic Web. It seems like a useful intro for someone like me who knows very little about the subject.

However, despite a promise not to oversell the Semantic Web, Hammersley may be doing just that. He claims that it solves what he calls the “I Love You” problem, namely that the same words can mean different things when said by different people in different contexts. The Semantic Web aims to address this in two ways: (1) by using unique identifiers, or URIs, for each of the terms in a statement; and (2) by defining its verbs via an explicit reference to a standard.
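As I understand the idea, the two mechanisms look something like the sketch below. It is plain Python, not a real Semantic Web toolkit, and every URI in it is a hypothetical example.org address I invented, including the two "love" verbs. The Triple type and the who_loves helper are likewise my own illustration, not anything from the talk.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One statement: subject, verb (predicate), object, each named by a URI."""
    subject: str
    predicate: str
    obj: str


# Hypothetical vocabulary: two different verbs that plain English both spells "love".
LOVES_ROMANTICALLY = "http://example.org/vocab/lovesRomantically"
LOVES_AS_FAMILY = "http://example.org/vocab/lovesAsFamily"

# Hypothetical people, each with a unique identifier.
ALICE = "http://example.org/people/alice"
BOB = "http://example.org/people/bob"
CAROL = "http://example.org/people/carol"

statements = [
    Triple(ALICE, LOVES_ROMANTICALLY, BOB),
    Triple(CAROL, LOVES_AS_FAMILY, BOB),
]


def who_loves(obj: str, predicate: str, triples: list[Triple]) -> list[str]:
    """Return the subjects that stand in exactly this relation to obj."""
    return [t.subject for t in triples if t.obj == obj and t.predicate == predicate]


print(who_loves(BOB, LOVES_ROMANTICALLY, statements))
```

Both statements would read "I love you" in English, but a program can tell them apart because the speakers and the verbs carry different URIs. Note that the disambiguation is only as good as the vocabulary: someone still had to decide that there are two kinds of love and write down what each one means.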

So far so good, but the example he uses to illustrate his point sets off alarms for me: the Dublin Core element dc:creator. It would be hard to name a thornier category of metadata. Any cataloger will tell you that there are a thousand kinds of authorship. The Dublin Core definition he refers to in his talk does not make the concept unambiguous, but rather explicitly ambiguous:

Creator: An entity primarily responsible for making the content of the resource. Examples of a Creator include a person, an organisation, or a service. Typically, the name of a Creator should be used to indicate the entity.

So neither the Semantic Web nor the Dublin Core will tell you whether the dc:creator of the object conceived it, wrote it, directed it, produced it, acted the lead, played trombone in the pit, or did the catering. They say even less about what rights or responsibilities accrue to said creator. Those questions remain as messy with the Semantic Web (at least the Dublin Core-based version) as without it.
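Here is a small sketch of what gets lost. The creator URI is the Dublin Core 1.1 element; everything else (the film, the names, the role labels, and the to_dublin_core helper) is hypothetical and invented for illustration.

```python
# The Dublin Core 1.1 "creator" element, written out as a full URI.
DC_CREATOR = "http://purl.org/dc/elements/1.1/creator"

# Hypothetical role-specific credits for one made-up film.
credits = [
    {"resource": "http://example.org/works/film-1", "name": "A. Writer", "role": "wrote"},
    {"resource": "http://example.org/works/film-1", "name": "B. Director", "role": "directed"},
]


def to_dublin_core(records):
    """Flatten role-specific credits into (resource, dc:creator, name) statements."""
    return [(r["resource"], DC_CREATOR, r["name"]) for r in records]


statements = to_dublin_core(credits)

# Every statement now uses the same verb, so the roles cannot be recovered.
verbs = {predicate for _, predicate, _ in statements}
print(statements)
print(verbs)
```

Each statement is perfectly unambiguous about which verb it uses, and the verb points at a published definition. The definition itself just doesn't distinguish the writer from the director.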

Probably I’m overreacting and the problem isn’t with the Semantic Web but with his example. More widespread machine-parseable use of a simple metadata standard like Dublin Core could come in handy despite its limitations, and if I understood it correctly the Semantic Web is extensible to any metadata standard, including much more precise ones. But a claim that the Semantic Web eliminates issues of ambiguity is an overstatement as far-fetched as the claims of magic and mind reading his talk is intended to debunk.