Import blog posts from Blogspot into concrete5

I have a Blogspot blog with hundreds of blog posts that I want to migrate into a new concrete5 site, and then move completely to concrete5 for adding posts in the future.

Is there any way to import the posts into concrete5 using the RSS feed, the Atom feed, or even the XML exported from Blogspot?

I want to import each post and images in the post, if possible.
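For reference, Blogger's "Back up content" export is a single Atom 1.0 XML file, so the post titles, dates, and HTML bodies can at least be read out with SimpleXML before handing them to concrete5. A minimal sketch (the filter on `#post` reflects how Blogger exports typically tag real posts, since comments and settings are also `<entry>` elements; verify against your own export):

```php
<?php
// Sketch: pull posts out of a Blogger Atom export string.
// Only entries whose <category term="..."> ends in "#post" are
// actual posts; comments and settings appear as entries too.
function parseBloggerExport($atomXml) {
    $feed = simplexml_load_string($atomXml);
    $posts = [];
    foreach ($feed->entry as $entry) {
        $isPost = false;
        foreach ($entry->category as $cat) {
            if (substr((string) $cat['term'], -5) === '#post') {
                $isPost = true;
            }
        }
        if (!$isPost) {
            continue;
        }
        $posts[] = [
            'title'     => (string) $entry->title,
            'html'      => (string) $entry->content,   // post body as HTML
            'published' => (string) $entry->published,
        ];
    }
    return $posts;
}

// Usage (filename is whatever Blogger's backup download gave you):
// $posts = parseBloggerExport(file_get_contents('blog-export.xml'));
```

Each returned array element could then be fed to a CLI script that creates a concrete5 page per post.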

Currently using concrete5 Version 8.3.2

Thanks for any help/advice on this

Aaron

 
mnakalay replied:
There is no tool available for this at the moment. You would have to have one built.
A3020 replied:
May I ask why you want to migrate from Blogger to concrete5?
arodden replied:
We're doing a site rewrite, and concrete5 was chosen over other CMSes for its ease of use. Migration of the existing pages (recreating them in concrete5) and some custom code for back-end data administration are already done; the handful of pages didn't take long to build. But 500+ blog posts would take a while, which is why I was hoping someone had already created something for this task.

We're moving away from Blogspot to concrete5 to have everything in one place: it's easier to make changes, the theme stays exactly the same, and backups are done in one place.
northdecoder replied:
Not exactly a complete answer, but some initial thoughts:

Copy the blog site to a static site.

After inspecting the HTML of Blogspot posts, it seems the underlying
structure is different for each theme and blog, so a one-size-fits-all
export probably doesn't exist.

Create a temporary working website
mytemporarysite.net OR 192.198.1.1/~mysite OR localhost/whatever

SSH into your site

Create a temporary working directory
$ mkdir tempwd #or whatever is a good name for you.


Use `wget` to download your blog into the temporary working directory `tempwd`:

$ cd tempwd
$ wget -E -H -k -p http://notArealBlog.blogspot.com/


Make a new directory in public_html (or whatever your
provider calls it), but NOT in your concrete5 directory.
$ cd ~/public_html
$ mkdir mysitecopy


Copy the downloaded files to be served publicly (use -r: wget saved a whole directory tree)
$ cd ~/tempwd
$ cp -r * ~/public_html/mysitecopy


You now have a complete copy, including pictures, which can be browsed at:
mytemporarysite.net/mysitecopy/notArealBlog.blogspot.com/index.html
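Once the copy is in place, the individual post files can be enumerated for the scraping step. Blogger permalinks normally follow a /YYYY/MM/slug.html layout, so a glob pattern finds every saved post page (the path below matches the wget example above; adjust to your blog's real name and check your permalink scheme):

```php
<?php
// Sketch: list every downloaded post page.
// Blogger permalinks are typically /YYYY/MM/slug.html.
$pattern = 'mysitecopy/notArealBlog.blogspot.com/[0-9][0-9][0-9][0-9]/[0-9][0-9]/*.html';
foreach (glob($pattern) as $file) {
    echo $file, "\n";   // one saved post page per line
}
```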

= = =

The next step, the hard part, will be to scrape your own data,
depending on the structure. It looks like Blogger uses
meaningful class names that might be usable for extracting
the content from each file.

Here is a rough idea, copied and pasted from a function that
worked at one time. (This is NOT the whole function.)

function getURL($url, $class) {
  // $url: web page or local file
  // $class: an XPath expression, e.g. "//div[contains(@class,'post-body')]"
  $html = file_get_contents($url);
  $dom = new DOMDocument;
  libxml_use_internal_errors(true); // Blogger pages are rarely valid XHTML
  $dom->loadHTML($html);
  libxml_clear_errors();
  $xpath = new DOMXPath($dom);
  $element = $xpath->query($class);
  print "Elements of xpath->query:<br/>"; // debug only
  var_dump($element); // debug only
  $myelement = '';
  foreach ($element as $item) {
    // loops through the elements of the query;
    // search for your blog's classed elements here
    $myelement = $item->textContent;
  }
  return $myelement;
}
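On default Blogger themes, an XPath like //div[contains(@class,'post-body')] is a reasonable starting point, since common templates put a "post-body" class on the post content div (that class name is an assumption; inspect your own theme's HTML to confirm). A self-contained sketch of that extraction:

```php
<?php
// Sketch: extract post bodies from one saved Blogger page.
// "post-body" is the class common Blogger templates use for the
// post content div; verify it against your own theme.
function extractPostBody($html) {
    $dom = new DOMDocument;
    libxml_use_internal_errors(true); // saved Blogger HTML is rarely valid XML
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);
    $bodies = [];
    foreach ($xpath->query("//div[contains(@class, 'post-body')]") as $div) {
        $bodies[] = trim($div->textContent);  // one entry per post on the page
    }
    return $bodies;
}

// Usage against the static copy made above:
// $bodies = extractPostBody(file_get_contents(
//     'mysitecopy/notArealBlog.blogspot.com/index.html'));
```

For the real import you would keep the inner HTML rather than textContent, but this is enough to check that the XPath actually hits your theme's post container.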