December 29th, 2009 | Permalink | No Comments
There has been a flurry of chatter around the potential impact of RDFa on SEO after my brief presentation at SES Chicago 2009. In subsequent conversations with SEOs at the conference and folks from around the industry, I was surprised at how many people practicing SEO weren’t involving their web developers in their solutions, focusing instead on content, linking and social strategies. While those strategies are key to any SEO effort, it surprised me that our panel discussion and presentation was the only one involving code and coding techniques. This raises an interesting question: are many SEOs missing a core element of success, namely well-structured, semantically rich web sites?
One can look at the current state of HTML on many web sites as an indicator of where people are focusing their efforts. The research performed to create the hProduct Microformat draft spec gives some good insight as to the condition of front-end HTML code. For years we have been building web sites mostly for visual, presentational (human-readable) purposes, and this is clear in many pages of source code analyzed for the hProduct spec. Luckily, search engines have done an incredible job of parsing out the junk and extracting the contextual and important data from billions of web pages. Machines have become vital to helping us learn, but up to this point there has been an imbalance in human-readable vs. machine-readable front-end code. Now there are emerging techniques and technologies that web developers can easily use to correct this by coding their pages to give them meaning to humans AND machines.
By combining rich front-end user and data experiences using RDFa, Microformats, or the emerging Microdata spec, we build direct pathways to rich datasets. These pathways let machines (mostly search engines, but also next-gen parsers, browser plugins, etc.) easily access important data, apply their algorithms to make sense of it, and index it in the ways they see fit. My personal theory is that by providing more direct access to data through front-end semantic code, machines will spend fewer CPU cycles parsing presentational code. These extra resources could then be re-allocated to better natural language processing, extending search into the “deep web”, or other efforts to make the web and its users smarter.
Of course this has implications for the SEO/SEM world. It forces SEO professionals to engage their web developers or become slightly more code savvy themselves. It shifts more emphasis to developing strong, data-driven semantic web sites that balance the visual needs of humans and the data needs of machines, rather than focusing on seemingly artificial techniques that increase “link juice” or utilize “secret sauce”. Using traditional SEO content strategies in combination with building strong data-rich web sites can lead to a more intelligent and useful web, which is ultimately good for businesses, users and consumers.
October 26th, 2009 | Permalink | No Comments
There is a good amount of chatter about the semantic web out there, but not a ton of concrete, working examples. I decided to put our Best Buy data to work and publish BBY SKUs in RDFa, using the GoodRelations e-commerce ontology. As I see it, simply publishing the RDFa is not an issue — the challenge is to apply real-world style and structure to the code to make it both machine and human readable. I’m trying to answer the question: is the RDFa model flexible enough to allow Joe Web Developer to successfully publish valid structured data while satisfying the desires of his design, business, and marketing counterparts?
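To make the idea concrete, here is a minimal sketch of what a product offer annotated with GoodRelations in RDFa can look like. It is not the markup used on the Best Buy pages: the product name, SKU, and price are placeholders, and the class names are hypothetical styling hooks. Only the `gr:` terms come from the GoodRelations vocabulary.

```html
<!-- Hypothetical product: the name, SKU, and price are placeholders, not real Best Buy data -->
<div class="product"
     xmlns:gr="http://purl.org/goodrelations/v1#"
     xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     about="#offering" typeof="gr:Offering">
  <!-- Human-readable heading doubles as the label of the offering -->
  <h1 property="rdfs:label">Example 32-inch LCD TV</h1>
  <p>SKU: <span property="gr:hasStockKeepingUnit">0000000</span></p>
  <!-- The offer is to sell the item; the empty div carries data only -->
  <div rel="gr:hasBusinessFunction" resource="http://purl.org/goodrelations/v1#Sell"></div>
  <!-- The visible price and the machine-readable price are the same text node -->
  <p class="price" rel="gr:hasPriceSpecification">
    <span typeof="gr:UnitPriceSpecification">
      <span property="gr:hasCurrency" content="USD">$</span><span property="gr:hasCurrencyValue" datatype="xsd:float">499.99</span>
    </span>
  </p>
</div>
```

The point of the sketch is that the RDFa attributes ride along on elements a designer would write anyway, so the `class` hooks for styling and the `property`/`rel` hooks for data can coexist.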
I’m pleased with the first round of results, ~460K worth of “next-gen” product detail pages. Take a look at some choice example SKUs from the Best Buy product catalog:
Interested parties can get a full URL list here (txt, gz), or split up into list 1, list 2, and list 3 (txt).
Thanks to: Martin Hepp, Andreas Radinger, Alex Stolz, Yahoo! Searchmonkey, Jason Galep (design guidance), and Best Buy Remix.
October 20th, 2009 | Permalink | 1 Comment
I saw an interesting post today via Twitter from a local person hosting an SEO competition to see who gets the highest Google search result for the key words zompire dracularius. I am officially throwing my hat in the ring. Recently I have been investigating how good semantic markup (including RDFa/Microformats) in the front-end will improve or change the way search engines work — basically how POSH (plain old semantic html) delivers data directly to the browser with particular markup. For the purposes of this test I have employed some semantic markup techniques, plus a few other tactics to raise the visibility of my post:
- Using RDFa in my posts (Dublin Core)
- Creating a “human readable” URL for this post using permalink structure built into WordPress
- Providing links to the original post and other sources (I will attempt to trackback, although no trackback link is published)
- Sending the post to the social networking sites I participate in (this happens automatically when I publish a post on my blog: Friendfeed, Twitter and Facebook)
- Attempt to leverage my blog’s overall visibility to force my post about zompire dracularius to the top
- Leave a comment on the original blog declaring the contest
- Have an adequate number of references to “zompire dracularius”
While it seems a little “Black Hat SEO”/dirty to me, I will ignore that feeling for a while for the purposes of the experiment. Results to come!
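For readers unfamiliar with the first tactic in the list, a rough sketch of Dublin Core RDFa on a blog post might look like the following. The post URL and author name are placeholders, and this is not copied from my actual theme. The `dc:` namespace is the standard Dublin Core elements namespace.

```html
<!-- Hypothetical post URL and author; only the dc: terms are real vocabulary -->
<div class="post"
     xmlns:dc="http://purl.org/dc/elements/1.1/"
     about="http://example.com/2009/10/zompire-dracularius/">
  <h2 property="dc:title">Zompire Dracularius</h2>
  <p class="meta">
    Posted by <span property="dc:creator">Example Author</span> on
    <!-- content holds the machine-readable date, the text holds the human-readable one -->
    <span property="dc:date" content="2009-10-20">October 20th, 2009</span>
  </p>
</div>
```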
September 22nd, 2009 | Permalink | 1 Comment
Over the past couple of months I’ve been thinking a lot about the semantic web, specifically how it fits in with the company’s overall future web development strategy. This has led me to tinker with RDFa markup, re-engineering current solutions with more forward-thinking, semantic approaches. I have been a big proponent of microformats as a lightweight semantic markup tool (and I still think they have merit), but I have since been intrigued and impressed by the power of RDFa. As with all new concepts, RDFa isn’t a “silver bullet” right out of the gate. Sitting down and actually coding the stuff into real-world examples has brought to light some potential issues that developers may encounter when introducing RDFa to their pages.
Ontology Explosion?
Like RDF/XML, RDFa markup depends on identifying and utilizing specific ontologies in your HTML document. One could create a custom ontology specifically for a single use case. This could result in hundreds or even thousands of custom ontologies for the same concept or object. My fear is that without proper oversight, the overabundance of ontologies will lead to a decrease in the effectiveness of RDFa, and clutter the semantic web with non-reusable ontologies. Since RDFa is recognized by the W3C, my hope is this group can provide some leadership and governance, possibly working to establish more officially recognized ontologies for use (and reuse!) on a wide scale.
Document Size
Maybe I’m just being old fashioned here, but are people still concerned about HTML document size and performance? I know my company is. In most of my examples, I saw an increase in the amount of HTML markup needed to fully code the page, usually between 5% and 20%, depending on the complexity of the solution. Developers will need to streamline their code in order to avoid performance issues due to “heavy” HTML.
Object Order
For some ontologies, object order matters: element y is in the domain of x, and should be nested under the parent. If your web page has a specific visual look and feel, will that match the object order needed to be valid RDFa? While flexible front-end development techniques using CSS will handle most of these instances, I foresee a certain amount of give and take between design/information architecture staff and developers to achieve a balance of human usability while maintaining the data structure of the document.
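One technique that may ease this tension, sketched here under the assumption that the page can carry an explicit identifier for the object: RDFa's `about` attribute names the subject directly, so statements about the same thing can live in separate parts of the page instead of under one parent. The `#store` identifier and the image filename are hypothetical.

```html
<div xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <!-- First statement about #store, placed wherever the design wants the heading -->
  <h1 about="#store" typeof="foaf:Organization" property="foaf:name">Best Buy - Richfield</h1>

  <p>Unrelated promotional copy can sit between the two blocks.</p>

  <!-- A second statement about the same #store, not nested under the heading -->
  <div about="#store" rel="foaf:depiction">
    <img src="store.jpg" alt="Store front" />
  </div>
</div>
```

This does not remove the give and take entirely, since the extra attributes add to document size, but it means visual order and data structure do not have to match one-to-one.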
Overall, I am convinced that RDFa is the right technology to fuel the semantic web, providing human usability and machine readability, even with the issues described above. In most scenarios, the benefits of delivering rich data directly to the front-end outweigh the effort to implement it. As adoption increases, new techniques will be invented that should alleviate most problems. After all, we developers are a smart bunch 
June 5th, 2009 | Permalink | 4 Comments
The experimentation continues on our Best Buy Local Stores platform. For the past year or so, I’ve been interested in going beyond the standard web experience and into the world of semantic web. I am out to create and find examples of how we annotate good store data beyond the confines of the typical web site, and what commerce sites will look like in a new semantic web. After some research, the GoodRelations Ontology seemed like an appropriate solution to provide better data around our stores and offerings.
The current solution involves publishing offering, payment methods, delivery methods, store details and store hour data in RDF/XML using GoodRelations for all 1000+ Best Buy stores in the US. Examples may be found on all Best Buy Local store pages. A couple of highlights:
A special thanks to Martin Hepp for his great work and assistance in this venture.
May 19th, 2009 | Permalink | No Comments
As part of a recent local store web site project, we’ve been experimenting with how to represent physical store locations using microformats and RDFa in a hybrid approach, giving the parser/page scraper/app the ability to choose its own format to parse while keeping it visually simple for the end user. This example (live on 1000+ store pages) uses hCard and geo microformats along with FOAF and GEO RDFa specifications. There are some interesting observations I have noted from iterating on this a couple of times:
- Does a hybrid approach make sense? Could this be applied to other microformat/RDFa instances?
- I think it would be beneficial to come up with an approach for representing store hours in the data. To my knowledge there is no microformat to handle this. I did stumble on the GoodRelations ontology, and could apply it as RDFa. It looks like GoodRelations also appears in Yahoo! SearchMonkey’s documentation.
The example. First things first: add the RDFa DOCTYPE declaration and the appropriate namespaces for validity (the Dublin Core dc: prefix is used further down in the (X)HTML of the posts and does not appear in this example).
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.0//EN" "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-1.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:foaf="http://xmlns.com/foaf/0.1/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
The core of the markup:
<div id="storeinfo">
<div class="vcard" typeof="foaf:Organization">
<div class="fn n" property="foaf:name"><h1 class="org" property="geo:lat_long" content="44.863312,-93.292557">Best Buy - Richfield</h1></div>
<span rel="foaf:depiction"><img src="http://stores.bestbuy.com/wp-content/store-images/bestbuy-store-281.jpg" border="0" alt="Best Buy - Richfield" class="photo" /></span>
<div class="info">
<div class="adr">
<div class="street-address">1000 West 78th St</div>
<span class="locality">Richfield</span>, <span class="region">MN</span> <span class="postal-code">55423</span>
</div>
<div class="tel"><span>Phone</span>: 612-861-3917</div>
<div class="geo">GEO: <span class="latitude">44.863312</span>, <span class="longitude">-93.292557</span></div>
<div class="addl"><a href="javascript:mapanddirection('281','cat12091')">Maps &amp; Directions</a> | <a href="javascript:openWeeklyAd('http://bestbuy.shoplocal.com/bestbuy/new_user_entry.aspx?adref=','55423')">Weekly Ad</a></div>
<h3>Store Hours</h3>
<div class="hours">
<strong>Mon:</strong> 10-9;
<strong>Tue:</strong> 10-9;
<strong>Wed:</strong> 10-9;
<strong>Thurs:</strong> 10-9;
<strong>Fri:</strong> 10-9;
<strong>Sat:</strong> 10-9;
<strong>Sun:</strong> 11-7
</div>
</div>
</div>
</div>
Download the src
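The store hours in the markup above are plain text only. A possible next iteration, following the GoodRelations idea from the observations list, is sketched below. It is not what is live on the store pages. It assumes that “10-9” means 10:00 AM to 9:00 PM and “11-7” means 11:00 AM to 7:00 PM, and the `#store` identifier is hypothetical. Only Monday and Sunday are shown; the other days would follow the same pattern.

```html
<div xmlns:gr="http://purl.org/goodrelations/v1#"
     xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
     about="#store" typeof="gr:LocationOfSalesOrServiceProvisioning">
  <h3>Store Hours</h3>
  <div class="hours">
    <!-- Monday: assumed to mean 10:00 AM to 9:00 PM -->
    <span rel="gr:hasOpeningHoursSpecification">
      <span typeof="gr:OpeningHoursSpecification">
        <span rel="gr:hasOpeningHoursDayOfWeek" resource="http://purl.org/goodrelations/v1#Monday"></span>
        <strong>Mon:</strong>
        <span property="gr:opens" content="10:00:00" datatype="xsd:time">10</span>-<span property="gr:closes" content="21:00:00" datatype="xsd:time">9</span>;
      </span>
    </span>
    <!-- Sunday: assumed to mean 11:00 AM to 7:00 PM -->
    <span rel="gr:hasOpeningHoursSpecification">
      <span typeof="gr:OpeningHoursSpecification">
        <span rel="gr:hasOpeningHoursDayOfWeek" resource="http://purl.org/goodrelations/v1#Sunday"></span>
        <strong>Sun:</strong>
        <span property="gr:opens" content="11:00:00" datatype="xsd:time">11</span>-<span property="gr:closes" content="19:00:00" datatype="xsd:time">7</span>
      </span>
    </span>
  </div>
</div>
```

The visible text stays exactly as before (“Mon: 10-9;”), while the `content` attributes carry unambiguous 24-hour times for parsers.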