November 9, 2010

The benefits of structuring your data using microformats

Google’s Rich Snippet Testing Tool

Deltina Hay

Last week we discussed how the Semantic Web relies upon markup languages that tag Web content, making it easier for machines to interpret. This can be accomplished in a number of ways, including tagging content as structured data or linked data.

Today we’ll take a look at marking up your content as structured data using microformats.

Microformats for structured data

Microformats are one of the standard markup formats used to create structured data. Like any markup language, they consist of tags and attributes that are used to “mark up” your Web content so that a search engine can recognize the content as structured data.

Content that is typically marked up using this standard includes contact and location information, reviews, products, and events. To transform your data into structured data using microformats, you simply add some additional classes and tags to your existing HTML, adhering to the microformats standard.

To demonstrate, let’s look at the “hCard” format. This format is used for marking up information about people, companies, organizations, and places. Here is how the marked-up content will look within the HTML of your Web page:


<div id="hcard-Deltina-Hay" class="vcard">
<a class="url fn" href="">Deltina Hay</a>
<div class="org">PLUMB Web Solutions</div>
<a class="email" href="mailto:[email protected]">[email protected]</a>
<div class="adr">
<div class="street-address">P.O. Box 242</div>
<span class="locality">Austin</span>,
<span class="region">Texas</span>,
<span class="postal-code">78767</span>
<span class="country-name">USA</span>
</div>
<div class="tel">512-555-9999</div>
</div>


And this is how it will appear on your website:

Deltina Hay
PLUMB Web Solutions
[email protected]
P.O. Box 242
Austin, Texas, 78767 USA

To the naked eye, there is nothing special about this content. It is nothing more than your contact information with links. Search engines and Internet browsers, however, will now be able to interpret the content as structured data (specifically, structured contact and location information about you and your company) and display it or use it accordingly. All you need to do is mark up your existing contact information using the microformats standard. Plenty of resources are available to help you out, including an hCard creator that you can use to generate code similar to that in our example.
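To get a feel for how a machine might read the marked-up contact information, here is a minimal sketch in Python. It is not how any particular search engine works; it simply collects the text of elements whose class names match the hCard properties used in our example. The names `HCardParser` and `parse_hcard` are made up for this illustration, and the sketch assumes well-formed markup with every tag closed.

```python
from html.parser import HTMLParser

# Property class names taken from the hCard example in this article.
HCARD_PROPERTIES = {
    "fn", "org", "email", "street-address", "locality",
    "region", "postal-code", "country-name", "tel",
}


class HCardParser(HTMLParser):
    """Collects the text of elements whose class is an hCard property.

    Hypothetical helper for illustration; assumes every tag is closed.
    """

    def __init__(self):
        super().__init__()
        self.card = {}   # property name -> text found
        self._open = []  # per open element: the hCard properties it carries

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        self._open.append([c for c in classes if c in HCARD_PROPERTIES])

    def handle_endtag(self, tag):
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text or not self._open:
            return
        # Text is credited to the innermost open element only.
        for prop in self._open[-1]:
            self.card[prop] = self.card.get(prop, "") + text


def parse_hcard(html):
    """Return a dict of hCard properties found in an HTML string."""
    parser = HCardParser()
    parser.feed(html)
    return parser.card


SAMPLE = """
<div class="vcard">
  <a class="url fn" href="">Deltina Hay</a>
  <div class="org">PLUMB Web Solutions</div>
  <div class="adr">
    <span class="locality">Austin</span>
    <span class="region">Texas</span>
  </div>
  <div class="tel">512-555-9999</div>
</div>
"""

card = parse_hcard(SAMPLE)
print(card["fn"], "|", card["org"], "|", card["locality"])
```

The point is that the class names, not the visible text, tell the program which piece of content is the name, which is the organization, and which is the city.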

November 1, 2010

The Semantic Web: An explanation in plain English

An example of Google Rich Snippets.

Deltina Hay

The Semantic Web is a big step toward Web 3.0, where the ultimate goal is to make Web content more machine-friendly and thus, in turn, more useful to humans.

Most websites are produced using HTML, which is a markup language used to make a website “look” a certain way. The Semantic Web, on the other hand, is based on markup languages that focus on tagging the content by what it “means.”

A more “semantic” Internet will allow search engines to produce more relevant results because the searched content will be “marked up” in such a way that the engines (machines) can make better sense of it.

The Semantic Web is not AI (artificial intelligence), as some people seem to think. It is about making the content easier for machines to interpret, not about making the machines themselves smarter. Two ways in which this is accomplished are structured data and linked data.

Structured data: Making it easier to share information

You can prepare your content in a way that will help search engines include it in very relevant search results. For instance, you can offer ways for your contact information, products or reviews to show up directly in a Google or Yahoo search result by adding a few tags to your content that will transform it into what is called “structured data.”

Contact and location information, events, products and reviews are all perfect types of structured data and can be tagged in standard formats called “markup formats” to make it easy for search engines to recognize them as such.

Structured data has been around for some time, waiting in the wings for the search engines to take it seriously. In 2009, Google introduced “Rich Snippets,” a feature that recognizes markup formats and displays the content in your search listing accordingly. See the image at top for an example.

Google supports the two most common markup formats: microformats and RDFa. Both of these standard formats are very straightforward. Anyone with experience building a website or using a content management system like WordPress can easily use them to mark up their existing Web content as structured data.

Linked data: Create apps from rich datasets

Linked Data also refers to a way of structuring data, but it does so by using the Web to create links between data from many different datasets, classifying the data using an established data commons.

By using a common reference to represent a piece of data, that data can be linked easily to and from other sources of data, creating what is referred to as a “Web of Data.”

The most impressive of these Webs of Data is the Linked Open Data (LOD) cloud. In the center of this “cloud” (only a small part of it) is “DBpedia,” a dataset extracted from Wikipedia.

Linked Open Data Cloud

The resulting “Web of Data” can be accessed by semantic Web browsers that navigate between different data sources, similar to how traditional Web browsers navigate between HTML pages.

One of the things that make Linked Data so powerful is what one can do with the data once it is linked. Given the right tools and know-how, anyone can draw from this tremendous resource to create powerful applications.
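The idea of a “common reference” can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the two datasets are invented, the `example.com` URI is hypothetical, and the predicates are plain strings where real Linked Data would use URIs for them as well. The function names `describe` and `follow` are made up for this sketch.

```python
# Two tiny, made-up datasets expressed as (subject, predicate, object) triples.
# What matters is that both refer to the city by the same URI.
AUSTIN = "http://dbpedia.org/resource/Austin,_Texas"
COMPANY = "http://example.com/plumb"  # hypothetical URI for illustration

company_data = [
    (COMPANY, "name", "PLUMB Web Solutions"),
    (COMPANY, "locatedIn", AUSTIN),
]

place_data = [
    (AUSTIN, "name", "Austin"),
    (AUSTIN, "partOf", "http://dbpedia.org/resource/Texas"),
]


def describe(uri, *datasets):
    """Gather every (predicate, object) pair stated about uri in any dataset."""
    return [(p, o) for data in datasets for (s, p, o) in data if s == uri]


def follow(uri, predicate, *datasets):
    """Return the objects reached from uri through predicate (one link hop)."""
    return [o for (p, o) in describe(uri, *datasets) if p == predicate]


# Hop from the company to its city, then read the city's name
# from the second dataset.
city = follow(COMPANY, "locatedIn", company_data, place_data)[0]
city_name = follow(city, "name", company_data, place_data)[0]
print(city_name)
```

Because both datasets use the same reference for Austin, a program can move from one to the other without anyone having merged them in advance.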

October 21, 2010

Web 3.0 demystified: An explanation in pictures

Contributor Deltina Hay now has a featured column on Technorati called You’ll Be Back: Search Optimization & Survival. The column focuses on search optimization as it applies to the entire Web: search engines, social search, mobile search, the semantic Web, etc. You can read the articles right here every week.

In this first series of articles, we discuss each of the fundamental elements that are moving us toward an application-driven, Web-based, mobile computing era, and how they will ultimately affect search optimization.

Deltina Hay

Web 3.0 aims to make online content easier for machines to understand and opens up and links large sets of data in consistent ways.

Finding a definition for Web 3.0 is no easy task when most people are still trying to grasp Web 2.0. However, it is a necessary task since Web 3.0 technologies are encroaching on the Internet quickly. Perhaps the best way is to start at the beginning.

Web 1.0: The Internet in one dimension

In the beginning, the Internet was flat. Think of it as a collection of documents (websites) lined up side by side. Though many of the sites may have linked to each other, those links simply took a user straight to the linked site, and maybe back again.

Each website was classified using metadata composed of meta-keywords, meta-descriptions, and meta-titles that described what the content of the website was about. At their simplest, search engines used established search algorithms to comb through all of the websites’ metadata to return what they considered relevant results based on your choice of keywords.
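A greatly simplified sketch of that kind of metadata matching might look like the following. It is not an actual search engine algorithm; the pages, the URLs, and the scoring rule (count how many query words appear in a page's metadata) are all assumptions made up for illustration.

```python
# Hypothetical pages described only by their metadata.
PAGES = {
    "http://example.com/austin-web-design": {
        "title": "Austin Web Design",
        "keywords": ["web design", "austin", "websites"],
        "description": "Web design services in Austin, Texas.",
    },
    "http://example.com/recipes": {
        "title": "Family Recipes",
        "keywords": ["recipes", "cooking"],
        "description": "Home cooking ideas.",
    },
}


def score(meta, query):
    """Count how many query words appear somewhere in the page's metadata."""
    haystack = " ".join(
        [meta["title"], meta["description"], *meta["keywords"]]
    ).lower()
    return sum(1 for word in query.lower().split() if word in haystack)


def search(pages, query):
    """Return URLs with at least one matching word, best match first."""
    scored = [(score(meta, query), url) for url, meta in pages.items()]
    return [url for s, url in sorted(scored, reverse=True) if s > 0]


print(search(PAGES, "austin web design"))
```

Notice that nothing here looks at what the page actually says or means; the engine trusts whatever the metadata claims.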

The inventor of the Web, Tim Berners-Lee, refers to this phase of the Internet as a “Web of Documents.”

Web 1.0

Web 2.0: A two-dimensional Internet

This next generation of the Internet added another dimension: collaboration.

This added dimension means that websites are linked in a more collaborative way. Instead of sending a visitor away from a site to view related content, the content is actually drawn into the visited site from the related site using RSS feeds or widgets.

But it isn’t only the websites that are more collaborative; so are the users of the websites’ content. Internet users tag and comment on content and collaborate and interact among themselves.
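Here is a rough sketch of how one site might draw another site's content in through an RSS feed. The feed is a made-up inline sample rather than a live URL, and the function names `latest_items` and `render_widget` are hypothetical; a real widget would also fetch the feed over the network and handle errors.

```python
import html
import xml.etree.ElementTree as ET

# A made-up RSS 2.0 feed standing in for the related site's feed.
FEED = """<rss version="2.0">
  <channel>
    <title>Example Blog</title>
    <item><title>First post</title><link>http://example.com/1</link></item>
    <item><title>Second post</title><link>http://example.com/2</link></item>
  </channel>
</rss>"""


def latest_items(feed_xml, limit=5):
    """Return (title, link) pairs for the first `limit` items in the feed."""
    root = ET.fromstring(feed_xml)
    items = root.findall("./channel/item")[:limit]
    return [
        (item.findtext("title", default=""), item.findtext("link", default=""))
        for item in items
    ]


def render_widget(items):
    """Turn the feed items into an HTML list for the host page."""
    lines = ["<ul>"]
    for title, link in items:
        lines.append(
            f'  <li><a href="{html.escape(link)}">{html.escape(title)}</a></li>'
        )
    lines.append("</ul>")
    return "\n".join(lines)


print(render_widget(latest_items(FEED)))
```

The visitor never leaves the host page; the related site's headlines simply appear there.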

Search engines have a whole new layer to consider in their searches: user-tagged Web content and the relevant connections between the users themselves.

Berners-Lee named this Internet phase the “Web of Content.”

Web 2.0

Web 3.0: The third dimension

Even with the rich metadata, collaboration between websites and users, and user-generated relationships to draw from, machines are still machines, and they still find it difficult to discern actual meaning from human-generated content. The third evolutionary step of the Internet aims to fix that by adding the dimension of “semantics.”

The goal of this phase is to make the content of the Web more easily interpreted by machines. Web content is typically written for humans, which means that it is produced with aesthetics in mind — little attention is paid to consistency or relevancy of the content itself.

Tim Berners-Lee calls this phase, rather passionately, the “Web of Data.”

May 8, 2009

Free ebook: ‘Identity in the Age of Cloud Computing’


JD Lasica

It surprises me how many people don’t know about the fabulous work being done by the Aspen Institute, the 59-year-old international nonprofit organization that works on environmental and economic concerns. It’s a sort of constantly evolving think tank perfectly suited for the new economy: The Aspen Institute convenes roundtables (in Aspen, Colo., Washington, DC, India, Israel, all around the globe) and generally gathers 25 to 30 experts and thought leaders to tackle important public policy issues. During my last two trips to Aspen I met and spoke with Al Gore and former Secretary of State Madeleine Albright.

I’ve been lucky enough to participate in three such roundtables and to write the following reports, which the institute turns into print books (available for purchase) and makes available as free ebook downloads in the PDF format:

• The Mobile Generation: Global Transformations at the Cellular Level, 72 pages, February 2007: a look at the profound changes ahead as a result of the convergence of wireless technologies and the Internet, with an emphasis on how youths use mobile technology (download ebook as PDF).

• Civic Engagement on the Move: How Mobile Media Can Serve the Public Good, 110 pages, July 2008: a look at the startling growth in the use of cell phones and other mobile devices and the ways mobile technology can be used to advance the social good (download ebook as PDF).

• And now the just-released Identity in the Age of Cloud Computing: The next-generation Internet’s impact on business, governance and social interaction (image above), 110 pages, May 2009: a look at the next-generation Internet and how it will impact all facets of society.

Download the free ebook (as a PDF). Or see the landing page. (If you came here from Twitter and are interested in the subject, my ID is @jdlasica.)

Aspen Reports now using Creative Commons licenses

I’m happy to report that Charlie Firestone, executive director of the institute’s Communications and Society Program, took up my suggestion and has agreed to release the new report under a Creative Commons Attribution Noncommercial license, the same license I’ve been using for all of my blog posts for years. That means anyone is free to republish excerpts of the report, or the report in its entirety, for noncommercial purposes. (See excerpt below.)

Not only that, but Charlie has agreed:

• to retroactively release my two earlier, still-timely reports, Civic Engagement on the Move and The Mobile Generation, under the same CC BY NC license.

• to publish all upcoming Roundtable on Information Technology reports with the CC BY NC license.

• to recommend that all of the institute’s Communications and Society Program publications be published the same way. “I will take it up with the Aspen Director of Communications, and perhaps other reports at the Institute could be published with that license as well,” he tells me.

This, to my mind, is a coup for Creative Commons, given the world-class scholarship and policy proposals that the Aspen Institute is now making freely available for redistribution and remixing.
