HOWTO: Google SEO fix for trailing-slash URLs

A Google Webmaster blog post (http://googlewebmastercentral.blogspot.com/2010/04/to-slash-or-not-... ) basically explains that having two URLs point to the same content is not ideal for users or Googlebot.
I have often found that you can also get SEO dilution through content duplication across two near-identical URLs (site.com/url and site.com/url/).
There is, however, a simple fix you can implement in Concrete5 to stop this behavior and help C5 be a tad more SEO friendly.

First, create config/site_process.php. This is a custom file that lets your site do some work every time it is accessed. Perfect for what we need.

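I added the following code to detect the 'bad' URLs and 301-redirect them.
// Redirect any URL ending in a slash (except the home page) to its slash-less form.
// POST requests are skipped because a redirect would drop the submitted data.
if ($_SERVER['REQUEST_METHOD'] != 'POST' && $_SERVER['REQUEST_URI'] != '/' && preg_match('#/$#',$_SERVER['REQUEST_URI']))
   {
      $newuri = preg_replace('#/$#','',$_SERVER['REQUEST_URI']);
      header( "HTTP/1.1 301 Moved Permanently" );
      header( "Location: ". BASE_URL. $newuri);
      exit; // stop here so the page is not rendered after the redirect headers
   }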
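
One gap worth noting with this kind of check: REQUEST_URI includes the query string, so a URL like /about/?page=2 does not end in a slash and slips through. A 301 in response to a POST also usually makes the browser retry as a GET, which drops the form data. Here is a minimal sketch of one way to handle both. The helper name slashlessRedirectTarget is hypothetical and is not part of concrete5.

```php
<?php
// Hypothetical helper (not a concrete5 API): returns the URI to redirect to,
// or null when the request should be left alone.
function slashlessRedirectTarget($method, $requestUri) {
    // Only redirect GET/HEAD so submitted form data is never lost.
    if ($method !== 'GET' && $method !== 'HEAD') {
        return null;
    }
    // Split the path from the query string so "/about/?page=2" is handled too.
    $parts = explode('?', $requestUri, 2);
    $path  = $parts[0];
    $query = isset($parts[1]) ? '?' . $parts[1] : '';
    // Leave the home page and slash-less paths untouched.
    if ($path === '/' || substr($path, -1) !== '/') {
        return null;
    }
    $trimmed = rtrim($path, '/');
    if ($trimmed === '') {
        return null; // path was made only of slashes, nothing sensible to redirect to
    }
    return $trimmed . $query;
}
```

In config/site_process.php you would pass in $_SERVER['REQUEST_METHOD'] and $_SERVER['REQUEST_URI'], and when the result is not null, send the two headers with BASE_URL prepended and then exit.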




Furthermore, let's also fix getLinkToCollection.
Create helpers/navigation.php:
defined('C5_EXECUTE') or die(_("Access Denied."));
class SiteNavigationHelper extends NavigationHelper {
   /** 
    * Returns a link to a page
    * @param Page $cObj
    * @return string $link
    */
   public function getLinkToCollection(&$cObj, $appendBaseURL = false, $ignoreUrlRewriting = false) {
      // basically returns a link to a collection, based on whether or not we have
      // mod_rewrite enabled, and the collection has a path
      $dispatcher = '';
      if (!defined('URL_REWRITING_ALL') || URL_REWRITING_ALL == false) {
         if ((!URL_REWRITING) || $ignoreUrlRewriting) {
            $dispatcher = '/' . DISPATCHER_FILENAME;
         }

This should fix any calls to the core API that fetch a page's URL, as well as the page links generated for sitemap.xml.
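
The idea behind the override is simply that generated page links come out without the trailing slash. This is a minimal sketch of that one step, not the author's actual override. The function name withoutTrailingSlash is hypothetical and is not a concrete5 API.

```php
<?php
// Hypothetical helper (not a concrete5 API): strips the trailing slash from a
// generated page link, leaving a bare "/" (the home page) untouched.
function withoutTrailingSlash($link) {
    $trimmed = rtrim($link, '/');
    // rtrim() empties a link made only of slashes; fall back to "/".
    return $trimmed === '' ? '/' : $trimmed;
}
```

Keeping the home page link as "/" matters because an empty href would point at the current page rather than the site root.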


Simple, eh? That's my motto: keep it simple. I was originally concerned about this code affecting single page controller arguments, but in my testing over the last month I haven't found any problems. Let me know if you do!
DavidMIRV
View Replies:
Mnkras replied:
norsemengrp replied:
DavidMIRV replied:
It'd probably be best to create your own thread in the appropriate forum for this.
norsemengrp replied:
OK, I moved it to its own thread.
glockops replied:
I believe this modification may cause a problem with certain browser/operating system combinations and server configurations.

Here's what I've found (yesterday was the first time I encountered this issue, and I've been running this mod for quite a while).

Mac users started to complain that the website was not available. I heard this from six different people, on eight different computers, across three totally different networks.

I'm a PC, so I checked it out on a Mac. The behavior was very strange. The homepage of the site (served by concrete5) would load, and any non-concrete5 page would load, but concrete5 pages would eventually fail. I say eventually because it would serve the first few, but then get stuck. It would time out attempting to retrieve a page "pre-redirect."

My server runs cPanel and Apache. I'm pretty certain the server and this script were fighting to add and remove the trailing slash. The problem only showed up for users running OS X (both Safari and Firefox had the problem). So if you have complaints from Mac users about your site, you might check whether this script in combination with your server configuration is the culprit.

I may be totally off-base blaming this modification (granted, it was likely my server configuration breaking it), but removing it seems to have fixed the problem. I'm going to have a few of the Mac users who complained check the website later and see if it is available from the other networks as well.
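
To illustrate how a server and a script could end up "fighting" over the slash, here is a small simulation. It assumes the web server appends a slash to any path that matches a real directory while the script strips it again. That setup is an assumption, not a diagnosis of the server described above, and it does not explain why only OS X clients were affected. The function name and the directory list are hypothetical.

```php
<?php
// Simulates two competing redirect rules; all names here are hypothetical.
// $directories: paths assumed to exist as real directories on disk, for which
// the web server is assumed to append a trailing slash.
function followRedirects($path, array $directories, $maxHops = 10) {
    $hops = array($path);
    for ($i = 0; $i < $maxHops; $i++) {
        $endsInSlash = substr($path, -1) === '/';
        if (!$endsInSlash && in_array($path, $directories, true)) {
            $next = $path . '/';        // server rule: add the slash
        } elseif ($endsInSlash && $path !== '/') {
            $next = rtrim($path, '/');  // script rule: strip the slash
        } else {
            return $hops;               // no rule fires, the page is served
        }
        $hops[] = $next;
        $path = $next;
    }
    return false;                       // never settled: a redirect loop
}
```

A path that is only a concrete5 page settles after one hop, while a path that also exists as a directory bounces between the two forms until the hop limit is reached.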
Tony replied:
Another approach to this problem is to use the canonical URL <link> tag in the header to let the search engines know what the official URL is.

How to set this up in concrete5 is described here:
http://inneroptics.net/concrete5_blog/canoncial-urls/...
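
For reference, a minimal sketch of what building such a tag could look like. This is not necessarily what the linked article does, and the function name canonicalLinkTag and its parameters are hypothetical rather than a concrete5 API.

```php
<?php
// Hypothetical helper (not a concrete5 API): builds a canonical <link> tag for
// a page path, always using the slash-less form of the URL.
function canonicalLinkTag($baseUrl, $path) {
    $path = rtrim($path, '/');
    $href = rtrim($baseUrl, '/') . $path;
    if ($path === '') {
        $href .= '/'; // the home page keeps its single slash
    }
    // Escape the URL so it is safe inside the href attribute.
    return '<link rel="canonical" href="' . htmlspecialchars($href, ENT_QUOTES, 'UTF-8') . '" />';
}
```

Both site.com/url and site.com/url/ would then emit the same tag, so search engines can consolidate the two without any redirect being involved.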
Tony replied:
One of my customers has something very similar to this script, and it was causing all POST/request variables to disappear from every request. I commented it out and it started working again.
DavidMIRV replied:
Yes, you actually have to expand on this further to get the core helpers/code to output URLs without the trailing slash.
modestbyte replied:
I turned this around a bit because I wanted to force the trailing slash and prevent URLs like index.php?cID=113. I'm using version 5.6.0.1, so I added this to my config/site_post.php file:

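//Force Trailing Slash
 $req = Request::get();
 // If the URI is exactly the page path (no trailing slash), redirect to the slashed form
 if ($req->getRequestCollectionPath() == $_SERVER['REQUEST_URI'])
  {
      $newuri = $_SERVER['REQUEST_URI'].'/';
      header( "HTTP/1.1 301 Moved Permanently" );
      header( "Location: ". BASE_URL. $newuri);
      exit; // stop processing once the redirect has been sent
   }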
 //Andrews SEO Fix 
 if ($req->getRequestCollectionID() > 1 && $req->getRequestPath() == ''
   && $_SERVER['REQUEST_METHOD'] != 'POST') {
      // This is a request that is directly for the cID, rather than the path
      $u = new User();
      // If the user is logged in we do NOT redirect
      if (!$u->isRegistered()) {


I thought this might help someone, and please let me know if you guys see anything wrong with how I handled this.