Archive

Posts Tagged ‘DotNetNuke’

Semantic Blogging Redux

2009.09.16 Comments off

A while back I posted something about WordPress’ taxonomy model.  At the time I thought it was clever and thought we should use something like it for the DotNetNuke Blog module.  Now, I’m less enamored with it.

Here’s why.

To recap, have a look at this database diagram:

The seeming coolness stemmed from the decision to make “terms” unique, regardless of their use, and to build various taxonomies from them using the wp_term_taxonomy structure.  So let’s say you have the term “point-and-shoot”, and you use that as both a tag and a category.  “Point-and-shoot” exists once in the wp_terms table and twice in the wp_term_taxonomy table – each entry indicating the term’s inclusion in a different structure.  This seems useful because the system “understands” that the tag “point-and-shoot” and the category “point-and-shoot” both mean the same thing.

But is that always a safe assumption?

Consider the case of a photo blog, where the writer is posting photos and writing a little about each.  This photographer has a professional studio, and also shoots portraits in public locations, as well as impromptu shots at parties.

This photographer has set up a category structure indicating the situation in which the photo was taken: “Studio/Location/Point-and-Shoot” (where “Point-and-Shoot” means an impromptu photograph).  A second structure, or a set of tags, indicates what sort of camera was used: “Point-and-Shoot”, as opposed to “DSLR”.

Same term.  Two completely different meanings.  Use that term as a search filter and you will get two sets of results, possibly mutually exclusive.
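To make the collision concrete, here is a minimal Python sketch of the shared-term model.  The dictionary names mirror wp_terms and wp_term_taxonomy, but the rows, the post names, and the `posts_for_term` helper are all invented for illustration; none of this is WordPress code.

```python
# Hypothetical rows: the names mirror wp_terms / wp_term_taxonomy, the data is made up.
terms = {1: "point-and-shoot"}  # the term itself is stored exactly once

# Each taxonomy entry reuses the same term_id inside a different structure.
term_taxonomy = [
    {"id": 10, "term_id": 1, "taxonomy": "category"},  # situation: impromptu shot
    {"id": 11, "term_id": 1, "taxonomy": "post_tag"},  # equipment: type of camera
]

# Which (made-up) posts are attached to which taxonomy entry.
relationships = {
    10: {"party-candids"},         # impromptu photo, shot on a DSLR
    11: {"beach-with-a-compact"},  # planned location shoot, compact camera
}


def posts_for_term(name, taxonomy=None):
    """Return the posts attached to a term, optionally limited to one taxonomy."""
    term_ids = {term_id for term_id, term in terms.items() if term == name}
    found = set()
    for row in term_taxonomy:
        if row["term_id"] in term_ids and taxonomy in (None, row["taxonomy"]):
            found |= relationships.get(row["id"], set())
    return found


everything = posts_for_term("point-and-shoot")              # both meanings mixed together
by_situation = posts_for_term("point-and-shoot", "category")
by_camera = posts_for_term("point-and-shoot", "post_tag")
```

Filtering on the bare term returns both posts, even though the two narrower result sets have nothing in common.  The term only becomes meaningful once you say which structure it belongs to.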

And so, to truly be “semantic”, a term cannot exist independently of its context (as expressed in the category hierarchy), yet that independence is exactly what WordPress attempts to implement.

Putting the “Perma” Back in Permalink

2008.11.12 Comments off

The DotNetNuke Blog module has had a checkered history with Permalinks.  The earliest versions did not use them, so old blog entries never had a Permalink created for them.  Instead, links to entries were generated programmatically, on the fly.

It’s been trouble ever since.

Permalinks were later introduced, but the old code that generated links on the fly was allowed to remain.  In theory, this shouldn’t cause any problems so long as everyone is using the same rules to create the link.  In reality, depending on how a reader navigated to a blog entry, any number of URL formats might be used.  A particular blog entry might reside at any number of URLs.

From a reader’s point of view, there is really no issue with an entry residing at various URLs.  But from an SEO perspective, it’s a bad idea for a given piece of content to reside at more than one URL: it dilutes the linkback concentration that search engines use to determine relevance.

It’s also a troubleshooting nightmare.  Since there are so many different places in the code where URLs are being created, if a user discovers an incorrect or malformed URL, the source of the problem could be any number of places.

Finally, it’s a maintenance annoyance.  If you are publishing content using the blog, you don’t want URLs that change.  You want the confidence of knowing that when you publish a blog entry, it resides at one URL, and that URL is reasonably immutable.  The old system that generated URLs on the fly was subject to generating different URLs if there were various ways for users to navigate to the blog.

The Permalink Vision

The Blog team has a vision of where we want to take URL handling:

  1. All Blog entries should reside at one URL only (the Permalink).
  2. The Permalink URL for the entry should be “permanently” stored in the database, not generated “on the fly”.
  3. The Permalink should be SEO-friendly.
  4. Once created, the system will never “automatically” change your Permalink URLs for you.

We’ve come really close to achieving this vision in 03.05.x.

With the 03.05.00 version of the Blog module, we have undertaken an effort to ensure that the Permalink (as stored in the database) is always used for every entry URL displayed by the module.  After releasing 03.05.00 we discovered a few remnants of old code, and believe that as of the 03.05.01 maintenance release we will have ensured that all URLs pointing to entries are always using the Permalink stored in the database.

But there was a problem with changing all the URLs to use the Permalink stored in the database.  Since old versions of the Blog didn’t generate Permalinks (and some versions generated broken Permalinks), how could we safely use Permalinks from the database for all entry URLs?  The answer was to force the module to regenerate all the Permalinks on first use.  When you first use the Blog module, it will automatically regenerate all of your Permalinks for the entire portal, ensuring that the database is correctly populated with the appropriate URLs for each entry.

The decision to force all users to regenerate their Permalinks was a measured one.  Obviously, automatically forcing Permalink regeneration violates the fourth rule listed above, and theoretically could cause the URLs for some entries to “move around” depending on how broken their Permalinks were.  But we believed that we required a one-time fix to get all entries on the new Permalink approach, and that this approach was only likely to “move” entries that had truly broken Permalinks in the first place.

Going forward we are confident that this represents the best approach to finally resolving the Permalink issue once and for all.

SEO-Friendly URLs and Permalinks

With version 03.05.00, we introduced SEO-friendly URLs that change the ending of our URLs from “default.aspx” to “my-post-title.aspx”.  We also introduced a 301 redirect that automatically intercepts requests for entries at the old “unfriendly” URL, and redirects to the new “friendly” URL.
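As a rough illustration of the mechanics, here is a Python sketch of a title-based URL ending plus a 301 for anything that isn’t the friendly URL.  This is not the module’s actual code: the function names, the in-memory `entries` lookup, and the /tabid/.../entryid/... URL layout are assumptions made for the example.

```python
import re


def friendly_name(title):
    """Lower-case the title and collapse runs of non-alphanumerics into hyphens."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"{slug}.aspx"


def friendly_url(tab_id, entry_id, title):
    """Build the one canonical path for an entry (assumed URL layout)."""
    return f"/tabid/{tab_id}/entryid/{entry_id}/{friendly_name(title)}"


def handle_request(path, entries):
    """Return (status, location); `entries` maps entry_id -> (tab_id, title)."""
    match = re.fullmatch(r"/tabid/(\d+)/entryid/(\d+)/([^/]+)", path, flags=re.IGNORECASE)
    if match is None:
        return 404, None
    entry_id = int(match.group(2))
    if entry_id not in entries:
        return 404, None
    tab_id, title = entries[entry_id]
    canonical = friendly_url(tab_id, entry_id, title)
    if path.lower() == canonical:
        return 200, canonical  # already at the friendly URL
    return 301, canonical      # old "unfriendly" URL: redirect permanently


entries = {302: (109, "My Post Title")}
old_style = handle_request("/tabid/109/entryid/302/default.aspx", entries)
new_style = handle_request("/tabid/109/entryid/302/my-post-title.aspx", entries)
```

The point of the sketch is that the entry ID, not the file name, identifies the content, so any request that reaches the right entry by the wrong name can be sent to the single friendly URL.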

When you install 03.05.00, it will by default still be using the old, “unfriendly” URLs.  If you want SEO-friendly URLs, you must enable them using a setting found in Module Options.

When you change the setting, only your new posts will use the new SEO-friendly URLs.  This is consistent with the fourth rule of the Permalink vision: you shouldn’t click an option and suddenly have all of your existing URLs changed for you.  If you want to make your old entries SEO-friendly, you must change the option, then use the “Regenerate Permalinks” option to apply the change to all entries.

A Couple of Issues

As I mentioned earlier, after the release of 03.05.00, we discovered a few areas in the code where the system was still generating URLs “on the fly” instead of using the Permalink.  So, if you’re using 03.05.00, and change the “SEO-Friendly” setting, you will discover that some of your existing URLs do, in fact, change to the new format.  This is a bug that is being corrected in 03.05.01.

There is one other way that a Permalink URL might change unexpectedly.  If you use the SEO-friendly URL setting, the module uses the post title to create the “friendly” portion of the link.  If, after you post an entry, you change its title, the URL will change.  Fortunately, links to the old URL will be caught by the 301 handler and redirected correctly.  This problem will not be corrected in version 03.05.01 but will probably remain until version 4.

Thoughts About Version 4

Version 4 of the Blog module is still on the back of a cocktail napkin.  No hard and fast decisions have been made yet about its feature set.  But I will preview where I think version 4 might go, at least as regards Permalinks and SEO-friendliness.

In version 4, I believe we will introduce the concept of a “slug” to the blog module.  A slug is simply a unique, SEO-friendly text string that is used to create a portion of a URL and is unchangeable except by the blog editor.  So, for example, given the URL http://www.mysite.com/tabid/109/entryid/302/my-post-title.aspx, the slug is “my-post-title”.

How are slugs different from what we have today?  The only difference is that today, the string “my-post-title” is generated automatically from the title, and if the title changes, the string changes.  With a slug, the string would not change automatically if the title changes, but could only be changed manually.  Slugs ensure that once an entry is posted, it stays put unless the publisher expressly decides to move it.
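Since version 4 is still napkin-stage, the following is only a sketch of the idea in Python, not a design.  The `Entry` class and its method names are hypothetical; what matters is that the slug is derived from the title once and then left alone.

```python
import re
from dataclasses import dataclass


def slugify(text):
    """Lower-case the text and collapse runs of non-alphanumerics into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


@dataclass
class Entry:
    title: str
    slug: str = ""

    def __post_init__(self):
        # The slug is derived from the title exactly once, when the entry is created.
        if not self.slug:
            self.slug = slugify(self.title)

    def retitle(self, new_title):
        # Changing the title deliberately leaves the slug (and so the URL) alone.
        self.title = new_title

    def set_slug(self, new_slug):
        # The only way the slug changes: an explicit action by the editor.
        self.slug = slugify(new_slug)


entry = Entry("My Post Title")
entry.retitle("A Much Better Title")  # entry.slug is still "my-post-title"
```

Contrast this with today’s behavior, where the equivalent of `slugify(entry.title)` is effectively re-run whenever the title changes.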

If we do deploy slugs, then there will have to be a few other changes.

First of all, the entire point of using slugs is that, once created, they can only be changed manually.  That means that the “Regenerate Permalinks” functions will have to be removed.  Once each entry has a slug, it can’t be “regenerated” programmatically.  The very idea of “regenerating” becomes moot.

Secondly, the point of a slug is to provide the SEO-friendly ending to each URL.  It presumes that the blog is “SEO-friendly”.  If you aren’t “SEO-friendly” there is no slug.  So for version 4, we may make “SEO-friendliness” mandatory and force it on all blog entries, old and new.

“But wait!” you cry.  “I thought that the point of Permalinks was to ensure that the system would never again change my URLs, and here you are saying that in a future version, you’re going to change all my URLs whether I like it or not!”

Well, yeah.  Guilty as charged.

First off, think of this as the very last step in achieving SEO-friendly Permalinks that are truly and finally “perma”.  Once we achieve SEO-friendly slugs, we have made it all the way to the goal.  And this is really the only way to get there, at least, the only way that is easy to support and not confusing to the end-user.

Secondly, the 301 redirection built into the module should ensure that the transition from old URL to SEO-friendly slug is completely transparent to all users and to search engines.  All the old links will work, and they will correctly report the move to search engines, which will update themselves accordingly.  Thousands of Blog module users are already testing this in version 03.05.x, and I believe that by version 4 we will be confident in this approach.

Of course, all of this is speculative, since version 4 isn’t even in the design stage yet.  But I hope that this information helps illuminate how the Blog team is thinking about the module and where it is likely to go in the future.  And, as usual, your feedback is highly encouraged.

Categories: Software & IT

Taxonomy and SEO

2008.11.03 Comments off

Taxonomy is one of the least understood weapons available for SEO.  We all know the basics of effective SEO:

  • URLs constructed with relevant terms, avoiding parameterization
  • Each page can be accessed by only one URL
  • Effective use of keywords in the title tag
  • Use of keywords in H1 tags
  • Links back to the page from other pages

How does taxonomy fit into all of this?

I started a webzine in 1998 called ProRec.com.  I built a custom CMS to run it, and spent a few years on SEO back before there was something called “SEO”.  In fact ProRec predates Google.  By the spring of 2000, ProRec consistently ranked in the top 10 search results on all relevant terms, usually in the top 3.  Due to many factors, some beyond my control, ProRec went dark in 2005 and was relaunched on DotNetNuke’s Blog module in 2007.  It no longer enjoys its former ranking glory, but I hope to use the lessons I learned to improve the Blog module in future versions.

One of the lessons I learned was how much effective taxonomy matters to SEO.  Designing and properly using an effective taxonomy solves several problems:

  1. Populates META tags appropriately
  2. Encourages or enforces consistent use of similar keywords across the site
  3. Forms the basis for navigation within the site, linking related pages
  4. Forms the basis for navigation outside the site, linking to other related information

Let’s look at these one at a time.

Populating META Tags

It’s true that META tags are not as important to search engines as they once were, but they are still used, and therefore still important.  Most blogging systems will take the keywords entered as Category or Tags and use them as META tags.  If you’re using DotNetNuke’s blog module, however, you’re out of luck.  The system simply doesn’t comprehend any kind of taxonomy and doesn’t let you inject keywords into the META tags except at the site level.  Opportunity missed.
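For what it’s worth, the missing piece is small.  Here is a hedged Python sketch of what “use the entry’s terms as META keywords” might look like; the function name and the merge order (entry terms first, then site-level keywords) are my own assumptions, not anything the Blog module does today.

```python
from html import escape


def meta_keywords_tag(site_keywords, entry_terms):
    """Build a keywords META tag from an entry's taxonomy terms plus site-level keywords."""
    seen = set()
    ordered = []
    # Entry-specific terms come first; duplicates are dropped case-insensitively.
    for word in list(entry_terms) + list(site_keywords):
        key = word.strip().lower()
        if key and key not in seen:
            seen.add(key)
            ordered.append(word.strip())
    content = escape(", ".join(ordered), quote=True)
    return f'<meta name="keywords" content="{content}" />'


tag = meta_keywords_tag(["photography", "Lumix"], ["Lumix", "TZ3", "High ISO"])
```

With those example inputs the tag lists the entry’s own terms ahead of the generic site keywords, with “Lumix” appearing only once.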

When it comes to content tagging, a structured taxonomy (categories) offers benefits over ad-hoc keywords (tags).  The obvious reason is that a predefined and well-engineered taxonomy is more likely to apply the “right” words since a user manually entering tags on the fly can easily be sloppy or forget the appropriate term to apply.   The less obvious reason is that as a search engine crawls the site, it will consistently see the same words over and over again used to describe related content on your site.

Why is it important for the search engine to see the same words over and over again?  Because “spray and pray” (applying lots of different related words to a given piece of content) doesn’t cut it.  You don’t want to be the 1922nd site on 100 different search terms.  You want to be the #1, #2, or #3 site on just a few.

So think of a search engine like a really stupid baby.  Your job is to “teach” the baby to use a few important words to describe stuff on your site.  Just like teaching a human, the more consistent you are, the more likely the search engine is to “learn” the content of your site and attach it to a small set of high-value terms.

Enforcing Keyword Usage

One of my main complaints about “tags” versus “categories” is that tags added to content on-the-fly tend to be added off the top of one’s head.  That’s fine for casual bloggers who just want to provide some simple indexing.  But if you are a content site with a lot of information about some particular subject, chances are that tagging like this can get you into trouble.  The reason is that on-the-fly tags often inadvertently split a cluster of information into several groups, because two or three (or more) terms get used interchangeably instead of just one.

Consider a site with a well-defined and structured taxonomy.  Let’s consider a very common application: a photography site primarily covering reviews of cameras and photography how-tos.  A solid taxonomy structure would probably include four indexes:

  • Manufacturer (Canon, Nikon, Lumix, etc.)
  • Product Model (EOS, D40, TZ3, etc.)
  • Product Type (DSLR, Rangefinder, micro, etc.)
  • Topic (Product Review, Lighting, Nature, Weddings, etc.)

Generally, the product reviews would be indexed by manufacturer, product model, and product type, with the “Topic” categorized as “Product Review”.  How-tos would be indexed by their topic (“Weddings”) as well as any camera information if the article covered the use of a specific camera.  For example, an article called “How to Improve Low-Light Performance of the Lumix TZ3” might be indexed thusly:

  • Manufacturer: Lumix
  • Product Model: TZ3
  • Product Type: Compact Digital
  • Topic: High ISO

Having a system that prompts the user to appropriately classify each article ensures that the correct keywords will be applied.  Getting the manufacturer and model correct is probably pretty easy.  It’s harder to remember the correct product type (“Compact Digital” versus “Compact”).  And remembering the right topic is a real challenge (“High ISO” versus “Low Light” versus “Exposure” or any of a hundred other terms I could throw at it).  Moreover, the user must remember to apply all four keywords when the article is created.

We can see the value of focused keywords from this example.  At a site level, relevant keywords are at a high abstraction level, like “camera review”.  It’s unrealistic to think a web site could own a top search engine ranking for such a broad term.  At the time of this writing, Google shows almost 14 million web pages in the search result for “camera review”.  But a search for the new Nikon laser rangefinder “nikon forestry 550” returned only 138!  An early review on this product with the right SEO terms could easily capture that search space.

Having a system with four specific prompts and some kind of list is essential to keeping these indexes accurate.  Ideally the system provides a drop down or type-ahead list that encourages reuse of existing keywords.
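Here is a small Python sketch of that idea: one controlled list per index, a type-ahead lookup that steers the author toward terms that already exist, and a check that all four prompts were answered.  The `Index` class, `missing_indexes`, and the sample terms are hypothetical names chosen for the example.

```python
REQUIRED_INDEXES = ("Manufacturer", "Product Model", "Product Type", "Topic")


class Index:
    """One taxonomy dimension (e.g. Topic) with a controlled list of terms."""

    def __init__(self, name, terms=()):
        self.name = name
        self._terms = {}  # lower-cased key -> canonical spelling
        for term in terms:
            self.add(term)

    def add(self, term):
        """Add a term, or return the existing canonical spelling if it is already there."""
        return self._terms.setdefault(term.strip().lower(), term.strip())

    def suggest(self, typed):
        """Type-ahead: existing terms that contain what the user has typed so far."""
        needle = typed.strip().lower()
        return sorted(term for key, term in self._terms.items() if needle in key)


def missing_indexes(classification):
    """List the required indexes the author has not filled in yet."""
    return [name for name in REQUIRED_INDEXES if not classification.get(name)]


topic = Index("Topic", ["High ISO", "Lighting", "Weddings"])
suggestions = topic.suggest("iso")   # offers the existing "High ISO"
canonical = topic.add("high iso")    # reuses "High ISO" rather than adding a variant
```

The case-insensitive lookup is what keeps “high iso” and “High ISO” from quietly becoming two different keywords.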

Creating a Navigation System

Here’s where it all starts to come together.  Once you have a big pile of content all indexed using the above four indexes, the next obvious step is to create entry points into your content based on the index, and to cross-link related content by index.

On ProRec, we had five entry points into the content:

  • Main view (chronological)
  • Manufacturer index
  • Product Model index
  • Product Type index
  • Topic index

Needless to say, when a search engine finds a comprehensive listing of articles on your site, categorized by major topic, it greatly increases the relevance of those articles because the engine is able to better understand your content.  Think about it: right there under the big H1 tag that says “High ISO” is this list of six articles all of which deeply cover the ins and outs of low-light photography.  It’s a search engine gold mine.  Obviously it also helps users navigate your site and find articles of interest, too.

My favorite part of the magic, however, was using the taxonomy to create a “Related Articles” list on each article.  Say you’re reading a review of a Lumix TZ3.  We can use the taxonomy to display a list of articles about other Lumix cameras as well as other Compact Digital cameras.  On ProRec this was even more valuable, because ProRec publishes reviews (and how-tos) covering many different types of gear and a lot of different topics.  Go to a review of a Shure KSM32 microphone, and here’s this list of reviews of other mics.

The “Related Articles” list immediately creates a web interconnecting each article to a set of the most similar articles on the site.  Instantly the search engine is able to make much more sense out of the site.  And, of course, readers will be encouraged to navigate to those other pages, increasing site stickiness.
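A “Related Articles” list falls out of the taxonomy almost for free.  The Python sketch below ranks other articles by how many index terms they share with the current one.  It is not ProRec’s actual code; the scoring rule, the article titles, and the data layout are assumptions for illustration.

```python
def related_articles(current, articles, limit=5):
    """Rank other articles by how many taxonomy terms they share with `current`."""

    def shared(a, b):
        # Count the indexes on which both articles carry the same term.
        return sum(1 for index, term in a["terms"].items() if b["terms"].get(index) == term)

    scored = [(shared(current, other), other["title"])
              for other in articles if other is not current]
    scored = [(score, title) for score, title in scored if score > 0]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))  # best match first, then by title
    return [title for _, title in scored[:limit]]


# Made-up articles, classified on some of the four indexes.
tz3 = {"title": "Lumix TZ3 review",
       "terms": {"Manufacturer": "Lumix", "Product Type": "Compact Digital", "Topic": "Product Review"}}
other_lumix = {"title": "Another Lumix review",
               "terms": {"Manufacturer": "Lumix", "Product Type": "DSLR", "Topic": "Product Review"}}
other_compact = {"title": "Canon compact review",
                 "terms": {"Manufacturer": "Canon", "Product Type": "Compact Digital", "Topic": "Product Review"}}
wedding = {"title": "Wedding lighting how-to", "terms": {"Topic": "Weddings"}}

related = related_articles(tz3, [tz3, other_lumix, other_compact, wedding])
```

In this example the TZ3 review links to the other Lumix review and the other compact review, and skips the wedding how-to, which shares no terms with it.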

More SEO Fun with Taxonomy

Once the system was in place I was able to extend it nicely.  For example, I created a Barnes & Noble Affiliate box that used the taxonomy to pull the most relevant book out of a list of ISBNs categorized using this same taxonomy and display it in a “Recommended Reading” box on the page.  So you’re reading an article called “Home Studio Basics” and right there on the page is “Home Studio Soundproofing for Beginners by F. Alton Everest” recommended to you.  The benefit to readers is obvious.  But there are SEO benefits, too, because search engines know “Home Studio Soundproofing for Beginners by F. Alton Everest” only shows up on pages dealing with soundproofing home studios.  Pages with that title listed on them (linked to the related page on Barnes & Noble) will rank higher than those that don’t.

You can start to see how quickly a simple “tagging” interface starts to break down.  You need the ability to create multiple index dimensions (like product, product type, and topic) as well as some system to encourage or enforce consistent use of the correct terms.  Otherwise, you’re doing most of the work, but only getting part of the benefit.

Taxonomy, Blogging, and DNN

Obviously, most casual bloggers don’t want to be forced into engineering and maintaining a predefined taxonomy.  That’s why “tagging” became popular.  Casual bloggers want to be able to add content quickly and easily and anything that makes them stop and think is a serious impediment to workflow.  So you just don’t see blog platforms with well-engineered categorization schemes, and you definitely don’t see any that allow for multiple category dimensions.

In my article “Blog Module Musings” I wondered aloud about what sort of people really use DotNetNuke as a blogging platform in the traditional sense of the word “blogging”.  My guess is that most people using DNN as a personal weblog probably have some personal reason for choosing DNN instead of any of the free and easy tools readily available like WordPress or Blogger.  So I have a belief about DNN that it isn’t a good platform for a “blog” per se, but it’s a great platform for content management and publishing.  My guess is that the DNN Blog module has much greater utility as a “publishing platform” instead of a “personal weblog”.

As such, I think it makes sense that DNN’s publishing module should offer more taxonomy power than the typical blog.  I also think that it’s possible, using well-designed user interfaces, to make a powerful taxonomy easy to manage.  My experience with ProRec demonstrated this.  It was very easy to manage ProRec’s various indices, primarily because I had a fat client to provide a rich user interface.  With Web 2.0 technologies, we can now provide these user experiences in the browser.

Blog Module Moving to Version 4

2008.10.29 Comments off

In a previous post I stated that the Blog module would offer an interim 3.6 release to provide users with a few more features before the team undertook the full-on rewrite to move the module to version 4.

Well, as it turns out, plans change.  The team has decided to go directly to version 4.  There will likely be a 3.5.1 release to patch up any bugs that surface after 3.5 is released, but no 3.6 “feature upgrade”.

This is really great news.  The team has grand plans for this module which are currently stymied by a few factors, including a lot of old deadwood in the code and poor developer productivity in the older VS 2003 environment.  Of course, the key reason is that DotNetNuke has officially left the .NET 1.1 environment so all new releases must be based on .NET 2.0.

Categories: Software & IT

New DotNetNuke MSDN-Style Help

2008.10.29 Comments off

Last night I was desperately seeking help for some DotNetNuke core classes, and I came up short.  Fortunately I was able to resolve my problem with a little help from Antonio, but I still wished I had a better help file available.

Well, today I discovered that Ernst Peter Tamminga has put together an MSDN-style help system for DotNetNuke.  Exactly what I was looking for.

If you do serious DNN development, this is a must-have.  Thanks Ernst!

Categories: Software & IT

Blog 3.5.0 Set for Release

2008.10.16 Comments off

After a few months’ delay, the Blog team is set to release the 3.5.0 version of the DNN Blog module.

I won’t go into the details of the reasons behind the holdup.  Our team leader has done a good job of that here, if you’re interested.  Suffice it to say, sometimes there are circumstances beyond one’s control.

I am not sure at this point if there will still be a 3.6 interim, or if we’ll proceed directly to version 4.  I’m sure everyone knows my opinion!  At any rate, it’s good to be back on track.

Categories: Software & IT

Blog Team Announces Interim 3.6 Release

2008.08.17 Comments off

The DNN Blog team has announced plans to release an interim 3.6 release to provide some final changes before undertaking the effort to rewrite the code for the version 4.x release.

The 3.6 feature set has not been made official, but current plans are to add support for BlogML, tagging, 301 redirects, and custom RSS URLs.

All effort will be made to minimize scope creep, since it is a high priority to move forward with 4.x, but we felt that these critical changes needed to happen sooner than could be provided by 4.x.

Categories: Software & IT