Archive for the ‘RDF’ Category
December 29th, 2009
There has been a flurry of chatter about the potential impact of RDFa on SEO since my brief presentation at SES Chicago 2009. In subsequent conversations with SEOs at the conference and with folks from around the industry, I was surprised at how many people practicing SEO weren’t involving their web developers in their solutions, focusing instead on content, linking and social strategies. While these are key to any SEO effort, I was also surprised that our panel discussion and presentation were the only ones involving code and coding techniques. This raises an interesting question: are many SEOs missing a core element of success, namely well-structured, semantically rich core web sites?
One can look at the current state of HTML on many web sites as an indicator of where people are focusing their efforts. The research performed to create the hProduct Microformat draft spec gives some good insight into the condition of front-end HTML code. For years we have been building web sites mostly for visual, presentational (human-readable) purposes, and this is clear in many of the pages of source code analyzed for the hProduct spec. Luckily, search engines have done an incredible job of parsing out the junk and extracting the contextual and important data from billions of web pages. Machines have become vital to helping us learn, but up to this point there has been an imbalance between human-readable and machine-readable front-end code. Now there are emerging techniques and technologies that web developers can easily use to correct this by coding their pages to be meaningful to humans AND machines.
By combining rich front-end user and data experiences using RDFa, Microformats, or the emerging Microdata spec, we build direct pathways to rich datasets. These let machines (mostly search engines, but also next-gen parsers, browser plugins, etc.) easily access important data, apply their algorithms to make sense of it, and index it in the ways they see fit. My personal theory is that by providing more direct access to data through front-end semantic code, machines will spend fewer CPU cycles parsing presentational code. These extra resources could then be reallocated to better natural language processing, extending search into the “deep web”, or other efforts to make the web and its users smarter.
Of course, this has implications for the SEO/SEM world. It forces SEO professionals to engage their web developers or become slightly more code savvy themselves. It shifts more emphasis onto developing strong, data-driven semantic web sites that balance the visual needs of humans and the data needs of machines, rather than focusing on seemingly artificial techniques that increase “link juice” or utilize “secret sauce”. Using traditional SEO content strategies in combination with building strong data-rich web sites can lead to a more intelligent and useful web, which is ultimately good for businesses, users and consumers.
September 22nd, 2009
Over the past couple of months I’ve been thinking a lot about the semantic web, specifically how it fits in with the company’s overall future web development strategy. This has led me to tinker with RDFa markup, re-engineering current solutions with more forward-thinking, semantic approaches. I have been a big proponent of microformats as a lightweight semantic markup tool (and I still think they have merit), but I have since been intrigued and impressed by the power of RDFa. As with all new concepts, RDFa isn’t a “silver bullet” right out of the gate. Sitting down and actually coding the stuff into real-world examples has brought to light some potential issues that developers may encounter when introducing RDFa to their pages.
Ontology Explosion?
Like RDF/XML, RDFa markup depends on identifying and utilizing specific ontologies in your HTML document. One could create a custom ontology specifically for a single use case. Repeated across many sites, this could result in hundreds or even thousands of custom ontologies for the same concept or object. My fear is that without proper oversight, the overabundance of ontologies will decrease the effectiveness of RDFa and clutter the semantic web with non-reusable ontologies. Since RDFa is recognized by the W3C, my hope is that this group can provide some leadership and governance, possibly by working to establish more officially recognized ontologies for use (and reuse!) on a wide scale.
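To make the reuse concern concrete, here is a minimal Python sketch (standard library only) that expands the prefixed property names in two RDFa-style snippets into full URIs. It is not a conforming RDFa parser: it ignores prefix scoping and every attribute other than xmlns:* and property. The http://example.com/my-ontology# vocabulary and its companyName term are hypothetical, made up for illustration; gr:legalName is taken from the GoodRelations vocabulary.

```python
from html.parser import HTMLParser


class VocabularyUsage(HTMLParser):
    """Collect xmlns prefix declarations and the full property URIs they expand to.

    Simplification: prefixes are stored in one flat dict, so element-level
    scoping of xmlns declarations is ignored.
    """

    def __init__(self):
        super().__init__()
        self.prefixes = {}
        self.properties = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Record prefix declarations such as xmlns:gr="...".
        for name, value in attrs.items():
            if name.startswith("xmlns:"):
                self.prefixes[name[len("xmlns:"):]] = value
        # Expand each prefixed name in @property to a full URI.
        for curie in (attrs.get("property") or "").split():
            prefix, _, local = curie.partition(":")
            if prefix in self.prefixes:
                self.properties.append(self.prefixes[prefix] + local)


def property_uris(html):
    parser = VocabularyUsage()
    parser.feed(html)
    return parser.properties


# Same concept (a company's name), marked up with a shared vocabulary...
SHARED = ('<div xmlns:gr="http://purl.org/goodrelations/v1#" typeof="gr:BusinessEntity">'
          '<span property="gr:legalName">Example Co.</span></div>')

# ...and with a hypothetical one-off ontology.
CUSTOM = ('<div xmlns:my="http://example.com/my-ontology#" typeof="my:Company">'
          '<span property="my:companyName">Example Co.</span></div>')

if __name__ == "__main__":
    print(property_uris(SHARED))
    print(property_uris(CUSTOM))
```

Both snippets look the same to a human reader, but the two property URIs differ, so a consumer that only knows the shared vocabulary has no way to tell that the custom property means the same thing.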
Document Size
Maybe I’m just being old-fashioned here, but are people still concerned about HTML document size and performance? I know my company is. In most of my examples, I saw an increase in the amount of HTML markup needed to fully code the page, usually between 5% and 20%, depending on the complexity of the solution. Developers will need to streamline their code in order to avoid performance issues due to “heavy” HTML.
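One rough way to put a number on this for your own pages is to compare the byte size of a page before and after annotation. The sketch below is only an illustration: markup_overhead is a hypothetical helper, the two sample strings are invented, and the gzip figure assumes the server compresses HTML responses.

```python
import gzip


def markup_overhead(plain_html, annotated_html):
    """Return the percentage growth in bytes, both raw and gzip-compressed."""
    def pct(before, after):
        return 100.0 * (len(after) - len(before)) / len(before)

    plain = plain_html.encode("utf-8")
    annotated = annotated_html.encode("utf-8")
    return {
        "raw": pct(plain, annotated),
        "gzip": pct(gzip.compress(plain), gzip.compress(annotated)),
    }


# Invented sample: the same fragment without and with RDFa attributes (vCard vocabulary).
PLAIN = '<div><h1>Example Store</h1><p>123 Main St</p></div>'
ANNOTATED = ('<div xmlns:v="http://www.w3.org/2006/vcard/ns#" typeof="v:VCard">'
             '<h1 property="v:fn">Example Store</h1>'
             '<p property="v:street-address">123 Main St</p></div>')

if __name__ == "__main__":
    for kind, value in markup_overhead(PLAIN, ANNOTATED).items():
        print("%s: +%.1f%%" % (kind, value))
```

A tiny fragment like this one exaggerates the overhead, because nearly all of it is markup. Running the same comparison on a full page gives a more realistic figure.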
Object Order
For some ontologies, object order matters: that is, element y is in the domain of x, and should be nested under the parent. If your web page has a specific visual look and feel, will that match the object order needed to be valid RDFa? While flexible front-end development techniques using CSS will handle most of these instances, I foresee a certain amount of give and take between design/information architecture staff and developers to balance human usability with the data structure of the document.
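The effect of nesting can be sketched in a few lines of Python. This is a deliberately simplified model, not the full RDFa processing rules: it only looks at about, typeof and property, it ignores rel chaining, and it assumes every element in the input has a closing tag. The "#store" identifier and both sample fragments are made up for illustration.

```python
from html.parser import HTMLParser


class SubjectTracker(HTMLParser):
    """Record which subject each @property attaches to, based on nesting alone."""

    def __init__(self):
        super().__init__()
        self.stack = ["(page)"]   # subject in effect at each nesting level
        self.blank_count = 0
        self.pairs = []           # (subject, property) in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        subject = self.stack[-1]              # inherit from the parent element
        if attrs.get("about"):
            subject = attrs["about"]          # explicit subject
        elif "typeof" in attrs:
            self.blank_count += 1             # typeof without about: new anonymous subject
            subject = "_:b%d" % self.blank_count
        self.stack.append(subject)
        for prop in (attrs.get("property") or "").split():
            self.pairs.append((subject, prop))

    def handle_endtag(self, tag):
        if len(self.stack) > 1:
            self.stack.pop()


def subject_property_pairs(html):
    tracker = SubjectTracker()
    tracker.feed(html)
    return tracker.pairs


# Opening time nested inside the opening-hours block.
NESTED = ('<div about="#store"><span property="rdfs:label">Example Store</span>'
          '<div typeof="gr:OpeningHoursSpecification">'
          '<span property="gr:opens">10:00:00</span></div></div>')

# Same content, but the design places the opening time outside that block.
FLAT = ('<div about="#store"><span property="rdfs:label">Example Store</span>'
        '<div typeof="gr:OpeningHoursSpecification"></div>'
        '<span property="gr:opens">10:00:00</span></div>')

if __name__ == "__main__":
    print(subject_property_pairs(NESTED))
    print(subject_property_pairs(FLAT))
```

In the nested version gr:opens attaches to the anonymous opening-hours node, while in the flat version it attaches to the store itself, which is not what the ontology intends. That is the kind of mismatch that visual layout decisions can introduce.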
Overall, I am convinced that RDFa is the right technology to fuel the semantic web, providing human usability and machine readability, even with the issues described above. In most scenarios, the benefits of delivering rich data directly to the front end outweigh the effort to implement it. As adoption increases, new techniques will be invented that should alleviate most problems. After all, we developers are a smart bunch.
June 5th, 2009
The experimentation continues on our Best Buy Local Stores platform. For the past year or so, I’ve been interested in going beyond the standard web experience and into the world of the semantic web. I am out to create and find examples of how we can annotate good store data beyond the confines of the typical web site, and of what commerce sites will look like in a new semantic web. After some research, the GoodRelations Ontology seemed like an appropriate solution to provide better data around our stores and offerings.
The current solution involves publishing offering, payment methods, delivery methods, store details and store hour data in RDF/XML using GoodRelations for all 1000+ Best Buy stores in the US. Examples may be found on all Best Buy Local store pages. A couple of highlights:
A special thanks to Martin Hepp for his great work and assistance in this venture.
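For readers who have not seen GoodRelations in RDF/XML, the following Python sketch generates a small description of a store's opening hours and accepted payment methods. It is not the code or the exact data model behind the Best Buy pages: the store URI, the opening times and the store_rdf helper are hypothetical, and only the class and property names are taken from the GoodRelations vocabulary.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
GR = "http://purl.org/goodrelations/v1#"
XSD = "http://www.w3.org/2001/XMLSchema#"

# Serialize with readable rdf: and gr: prefixes instead of ns0:, ns1:.
ET.register_namespace("rdf", RDF)
ET.register_namespace("gr", GR)


def q(ns, local):
    """Build an ElementTree qualified name: {namespace}local."""
    return "{%s}%s" % (ns, local)


def store_rdf(store_uri, opens, closes, payment_methods):
    """Return RDF/XML describing one store's Monday hours and payment methods."""
    root = ET.Element(q(RDF, "RDF"))

    # The store itself, with a nested opening-hours specification.
    store = ET.SubElement(root, q(GR, "LocationOfSalesOrServiceProvisioning"),
                          {q(RDF, "about"): store_uri})
    link = ET.SubElement(store, q(GR, "hasOpeningHoursSpecification"))
    hours = ET.SubElement(link, q(GR, "OpeningHoursSpecification"))
    ET.SubElement(hours, q(GR, "hasOpeningHoursDayOfWeek"),
                  {q(RDF, "resource"): GR + "Monday"})
    for name, value in (("opens", opens), ("closes", closes)):
        el = ET.SubElement(hours, q(GR, name), {q(RDF, "datatype"): XSD + "time"})
        el.text = value

    # An offering available at that store, with its accepted payment methods.
    offering = ET.SubElement(root, q(GR, "Offering"),
                             {q(RDF, "about"): store_uri + "#offering"})
    ET.SubElement(offering, q(GR, "availableAtOrFrom"), {q(RDF, "resource"): store_uri})
    for method in payment_methods:
        ET.SubElement(offering, q(GR, "acceptedPaymentMethods"),
                      {q(RDF, "resource"): GR + method})

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical store URI and hours.
    print(store_rdf("http://example.com/stores/1", "10:00:00", "21:00:00",
                    ["VISA", "Cash"]))
```

Generating the markup from structured store data like this, rather than writing it by hand, is one plausible way to keep descriptions consistent across a large number of store pages.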