404 errors in my sitemap - can someone help?

Hi guys, I am clearly inherently stupid, but please help!

There are three URLs in my sitemap that Google says are 404 errors.

They are:

http://www.aurabassshaker.com/?cID=148

http://www.aurabassshaker.com/?cID=131

http://www.aurabassshaker.com/?cID=129

I do not know what these URLs are for. Does anyone know? I don't even know what a cID is, and Mr Google won't tell me.

Buckets and buckets of top-shelf karma to you.

Jerry

View Replies:
cgsmith replied:
Hi Jerry,

The cIDs are Concrete's unique IDs for the pages.

Where is it telling you that there is a 404? How do you get to the 404 by navigating on your site?
jerrytoga replied:
Hey CGSmith, thanks heaps for replying, you're awesome.

Thanks for telling me what cID means - I couldn't find that answer anywhere.

Now, in response to your questions:

When I go to Google Webmaster Tools, it tells me that there are:
- 3 crawl errors in my Sitemap
- 3 "not found" URLs.

Google Webmaster Tools tells me that the 3 URLs are the cID ones (that I've listed above).

Webmaster Tools provides me with those 3 URLs in a clickable list.

Beside each URL it says "Detail: 404 (not found)".

When I click on one of the links in the list, it opens the Concrete5 "page not found" page.

Let me know if you need more info! And thank you again!
arrestingdevelopment replied:
Jerry,

It might help to re-generate your sitemap file and then re-submit it in Google's Webmaster tools to see if that fixes things. It's possible that something got messed up in your sitemap (like it didn't get re-generated after you deleted those three pages or something like that) and hasn't gotten corrected.

HOW you re-generate your sitemap varies depending upon if you're on v5.4.x or v5.5.x...

In v5.4.x... go to the Dashboard, then System & Maintenance. Make sure you're on the "Jobs" tab and that the "Generate Sitemap File" option is checked... then click the "Run Checked" button in the upper right. That will get Concrete5 to re-generate your sitemap.

In 5.5.x... go to the Dashboard, then select System & Settings. From there, click on the "Automated Jobs" option under the "Optimization" category. At the bottom of the resulting window, you'll see a long hyperlink that says something about "If you wish to run these jobs...". Click that link.

Hopefully that will fix it. If not... you should be able to FTP to your server and look in the root folder for the "sitemap.xml" file. Copying that to your computer will let you open it up and maybe figure out why those entries are in your sitemap file... and, hopefully, give you an idea of how to fix it.
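
Once you have a local copy of sitemap.xml, a short script can list the entries that use a raw cID query string, which may be quicker than reading the file by eye. This is only a rough sketch: the function name `find_cid_urls` and the sample XML are made up for illustration, and it assumes the file follows the standard sitemaps.org format.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse, parse_qs

# Namespace used by the standard sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def find_cid_urls(sitemap_xml):
    """Return (url, cID) pairs for every <loc> whose query string has a cID."""
    root = ET.fromstring(sitemap_xml)
    matches = []
    for loc in root.findall("sm:url/sm:loc", SITEMAP_NS):
        url = (loc.text or "").strip()
        query = parse_qs(urlparse(url).query)
        if "cID" in query:
            matches.append((url, query["cID"][0]))
    return matches


# Hypothetical sample standing in for a downloaded sitemap.xml.
SAMPLE = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/about/</loc></url>
  <url><loc>http://www.example.com/?cID=148</loc></url>
</urlset>"""

if __name__ == "__main__":
    for url, cid in find_cid_urls(SAMPLE):
        print(cid, url)
```

To run it against the real file, pass the file's contents to `find_cid_urls` instead of `SAMPLE`.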

Hope that helps!

- John
jerrytoga replied:
Hey Arresting Development (great name by the way)

Thanks heaps for helping. I tried resubmitting as you suggested, but the nasty URLs were still there. I don't know what's going on. I really don't know what those pages are!!

UNTIL I can find out what those nutty cID URLs are (if I ever do find out) I will just tell Google they don't exist. I used the tute at http://www.denisvlasov.net/103/customizing-sitemapxml-in-concrete5/... to block the URLs from the sitemap. (If this was a dumb idea, PLEASE let me know. I have been known to commit large acts of dumbness in the past.)

Hmmmmmmmmmmmmmmmmmmmmmm. Wot could those URLS be??

Thanks to both of you guys for helping!

Jerry
arrestingdevelopment replied:
Hi Jerry,

When you say you resubmitted but "the nasty URLs were still there"... are you referring to having looked at the actual sitemap.xml file? Or the fact that Google was still reporting the errors?

If the former (they're still in the sitemap.xml file), you might want to post a copy of it here, because someone may be able to determine where in your site they're coming from.

If it was the latter (Google is still showing them in Webmaster tools), then I don't think that means anything... yet. It takes Google a while to update its information. It has to first crawl your sitemap, then crawl your site... and it can take days or weeks before that is reflected in their index or in Webmaster tools. So all may not be lost (yet).

- John
jerrytoga replied:
Hey John,

I've removed the URLs from my sitemap at least, by editing the generate_sitemap.php file. They no longer show up in my sitemap. (I should have taken a screenshot when they were still there. Didn't think.)
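
For anyone who would rather not touch generate_sitemap.php, a similar result can be sketched as a clean-up step on a copy of sitemap.xml. This is a different technique from the edit described above, not a reproduction of it, and the function name `drop_urls` and the sample data are hypothetical. Note that re-running the sitemap job would presumably overwrite the file, so the clean-up would have to be repeated.

```python
import xml.etree.ElementTree as ET

NS_URI = "http://www.sitemaps.org/schemas/sitemap/0.9"
# Keep the sitemap namespace as the default one when writing the XML back out.
ET.register_namespace("", NS_URI)


def drop_urls(sitemap_xml, unwanted):
    """Return sitemap XML with every <url> whose <loc> is in `unwanted` removed."""
    root = ET.fromstring(sitemap_xml)
    for url in list(root.findall("{%s}url" % NS_URI)):
        loc = url.find("{%s}loc" % NS_URI)
        if loc is not None and (loc.text or "").strip() in unwanted:
            root.remove(url)
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample standing in for the real sitemap.xml.
SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>http://www.example.com/about/</loc></url>
  <url><loc>http://www.example.com/?cID=148</loc></url>
</urlset>"""

if __name__ == "__main__":
    print(drop_urls(SITEMAP, {"http://www.example.com/?cID=148"}))
```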

And yes, Mr Google is still saying that I have the "nasty URLs". And thanks for the advice: hopefully this error message will just go away in time! I'll see what happens next time I get crawled.

Thanks for all the help!

Jerry
reachdigital replied:
G'Day,

We had the exact same problem. We are on 5.6.1.2.

We figured out that saved drafts are created as pages, so when you generate the XML file their cIDs end up in it. Simply delete the draft pages and that should fix the issue.

Hope that helps.

Mauricio