Weird Inbound Links From Other C5 Sites

When looking at Google Webmaster Tools, Google reports inbound links to some of the default C5 pages, like /search/ and /about/guestbook/, from weird unrelated websites, all built in concrete5 though. Is anybody else getting that?

It's just kind of weird. I wonder how Google is getting things crossed up, and whether these links count towards (or against) my PageRank.

-Guy

guythomas
 
ScottC replied:
They might. They are probably included in your sitemap XML, so make sure they are excluded. This is just a guess, though; the exclusion is set through a collection attribute.

Thanks.

-Scott
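
One way to test Scott's guess is to look at the sitemap file itself. The following is a minimal sketch, assuming the sitemap follows the standard sitemaps.org format; the helper names and the sample XML are made up for illustration and are not part of concrete5.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Namespace used by standard sitemaps.org sitemap files.
SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"


def sitemap_paths(xml_text):
    """Return the URL path of every <loc> entry in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [
        urlparse(loc.text.strip()).path
        for loc in root.iter(SITEMAP_NS + "loc")
        if loc.text
    ]


def unwanted_entries(xml_text, unwanted=("/search/", "/about/guestbook/")):
    """List sitemap paths that match the default pages we want excluded."""
    wanted = {path.rstrip("/") for path in unwanted}
    return [p for p in sitemap_paths(xml_text) if p.rstrip("/") in wanted]


# Hypothetical sitemap content, only for demonstration.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://c5site1.com/</loc></url>
  <url><loc>http://c5site1.com/search/</loc></url>
  <url><loc>http://c5site1.com/about/guestbook/</loc></url>
  <url><loc>http://c5site1.com/contact/</loc></url>
</urlset>"""

print(unwanted_entries(SAMPLE))
```

If the list comes back non-empty for a real sitemap, those pages are still being advertised to search engines and would need to be excluded.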
Kurieuo replied:
I'm getting that too! And it has me entirely puzzled.

Furthermore, let's assume contact pages exist as follows: c5site1.com/contact/ and c5site2.com/contact/ (two separate websites). A Google search for site:c5site1.com may actually show the cached page and metadata for c5site2.com/contact/.

I have half a dozen websites set up under a centralised install ('concrete' and 'packages' folders), each with its own database and site.php config file.

I don't know what is going on. They are all separate websites, but for some reason Google is confusing pages with the same path between different C5 websites. This is only affecting C5 websites.

Anyone know why this is happening? Really, really confused. And it's not good, since SEO is a service we provide. This may actually mean we need to switch to another product. =(
frz replied:
Wait, how did you centralize the setup? Typically just the /concrete directory is symlinked, but from what you're saying you may be centralizing the overrides too in some way... ?

Did you follow a particular how-to for centralization? There are a few different approaches.
Kurieuo replied:
I set up the centralisation myself using a symlink approach. However, it is aligned with Method #1 at: http://www.concrete5.org/documentation/how-tos/developers/setting-u...

Basically, the following approach was taken:
1) Installed the latest C5 version (we call this install our main install)
2) Loaded it up with free themes and packages and a standardised configuration
3) Moved the core folders (concrete and packages) to a shared folder on the server
4) Symlinked to the folders
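
Steps 3 and 4 can be sketched roughly as follows. This is only an illustration under assumed folder names (main_install, shared) inside a temporary directory, not the exact commands used on the server.

```python
import os
import shutil
import tempfile


def centralise(site_root, shared_root, folders=("concrete", "packages")):
    """Move core folders out of a site and leave symlinks in their place."""
    os.makedirs(shared_root, exist_ok=True)
    for name in folders:
        site_path = os.path.join(site_root, name)
        shared_path = os.path.join(shared_root, name)
        shutil.move(site_path, shared_path)  # step 3: move to the shared folder
        os.symlink(shared_path, site_path)   # step 4: symlink back into the site


def demo():
    """Run the two steps on throwaway folders and report what became a link."""
    with tempfile.TemporaryDirectory() as tmp:
        site = os.path.join(tmp, "main_install")  # hypothetical site root
        shared = os.path.join(tmp, "shared")      # hypothetical shared folder
        for name in ("concrete", "packages"):
            os.makedirs(os.path.join(site, name))
        centralise(site, shared)
        return {
            name: os.path.islink(os.path.join(site, name))
            for name in ("concrete", "packages")
        }


print(demo())
```

Note that only code folders are shared here; each site keeps its own database and site.php, as described in this post.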

With this done, we take a sort of snapshot of the "main install". That is, we gzip the files in the "main install" and take an SQL dump of its database.

So when we set up a new C5 account, we extract our gzip copy and import the database. Then we update the site.php config file.

This worked perfectly fine, and is great when it comes to upgrading C5. However, as with the first poster here, we noticed in our Webmaster Tools account that one particular C5 website we installed has a lot of links from other C5 websites (despite there being no interlinking between websites).

Then when we perform a site:c5site.com Google search, Google's cached results here and there turn up pages from other installed C5 sites, seemingly where they share a common path (e.g., /contact or /services). Yet, when we visit the actual location of the page Google indexed, it is the correct page of c5site.com.

It's as if what gets served to Google's bots is sometimes different for a page. I don't know how this is possible, but for privacy reasons, I will PM you a link so you can see for yourself.
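
A quick way to check whether bots are served something different is to request the same URL with a browser-like and a Googlebot-like User-Agent and compare the bodies. The sketch below is one possible approach, not a definitive test: the User-Agent strings and function names are assumptions, and pages with per-request content (timestamps, tokens) will differ even when nothing is wrong.

```python
import hashlib
import urllib.request

# Assumed User-Agent strings; a real crawler may send something different.
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64)"


def fetch(url, user_agent, timeout=10):
    """Fetch a URL while presenting the given User-Agent header."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()


def fingerprint(body):
    """Return a SHA-256 hex digest of a response body."""
    return hashlib.sha256(body).hexdigest()


def same_content(body_a, body_b):
    """True when two response bodies are byte-for-byte identical."""
    return fingerprint(body_a) == fingerprint(body_b)
```

For example, same_content(fetch(url, BROWSER_UA), fetch(url, GOOGLEBOT_UA)) for a page such as the /contact/ page on one of the affected sites would show whether the two clients receive identical bytes.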
ryan replied:
I'm fairly certain that nothing in your shared core setup would affect this. I imagine the sites that are linking inbound are not sites that you set up yourself?

I wonder if google has a bug where it's not uniquely identifying the pages to the correct sites - strange..
Kurieuo replied:
Actually, the sites the inbound links are coming from are C5 sites we've set up on the same server. That's half the issue though. The site: results being mixed with other C5 websites is a larger issue, I think.

I can only think it is a bug with Google's indexing, but I wonder why it seems to be unique to C5... The control doesn't work differently for serving pages to search engines or Google, does it?
ryan replied:
Hey Kurieuo,

I took a look at those sites you PM'd me. The only thing I noticed is that they're on the same IP address, which may confuse Google. Also, you're not using the same Google Analytics code for both, are you? Best I can say is that Google is confused and there's not much you can do about it.

Anyone who's more SEO focused care to chime in?
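
To see which sites actually share an address, the hostnames can be resolved and grouped by IP. This is a small sketch with made-up hostnames and documentation-range addresses; the resolver is injectable so the demo runs without touching DNS.

```python
import socket
from collections import defaultdict


def resolve_all(hosts, resolver=socket.gethostbyname):
    """Map each hostname to the IPv4 address returned by the resolver."""
    return {host: resolver(host) for host in hosts}


def shared_ips(host_to_ip):
    """Return only the IP addresses that serve more than one hostname."""
    groups = defaultdict(list)
    for host, ip in host_to_ip.items():
        groups[ip].append(host)
    return {ip: sorted(names) for ip, names in groups.items() if len(names) > 1}


# Offline demo with a fake lookup table (hypothetical hosts and addresses).
fake_dns = {
    "c5site1.com": "203.0.113.10",
    "c5site2.com": "203.0.113.10",
    "example.org": "198.51.100.7",
}
print(shared_ips(resolve_all(fake_dns, resolver=fake_dns.get)))
```

Sharing an IP address is normal for name-based virtual hosting, so a match here only narrows down where to look.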
carlos123 replied:
Hmm...interesting problem.

If I may suggest something...

If you are running under Linux, download and install Wireshark, a packet analyzer and sniffer.

Then start a capture on the interface used by your wireless card.

And access the site pages getting mixed up by Google while watching Wireshark.

It will tell you what requests are going out and what responses are coming back. There may be a clue as to what is going on with that.

Better yet...duplicate a couple of your sites running locally under your own copy of Apache and give them the same domain name locally (in effect rendering them unreachable from your browser).

Then access those pages and watch what happens in Wireshark. It may give you some clue.

I'd be very surprised if Google has a bug in this.

Just my two cents worth.

Carlos
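
As a lighter-weight complement to a packet capture, the same request/response check can be sketched in a few lines: connect to the server's IP directly and send different Host headers, then compare what comes back. The function names are hypothetical, and this assumes plain HTTP on port 80 with name-based virtual hosts.

```python
import http.client
import re


def fetch_with_host(ip, host, path="/", timeout=10):
    """Request a path from a specific IP while naming the site in the Host header."""
    connection = http.client.HTTPConnection(ip, 80, timeout=timeout)
    try:
        connection.request("GET", path, headers={"Host": host})
        response = connection.getresponse()
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
    finally:
        connection.close()


def extract_title(html):
    """Pull the <title> text out of an HTML page, or None if there is none."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None
```

Calling fetch_with_host with the shared IP and each site's hostname for a common path such as /contact/, then comparing extract_title on the two bodies, would show whether the server ever answers one site's request with another site's page.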