Bulk Import of Pages

Dear friendly forum members,

I have numerous websites with Frontier Userland as the current Content Management System that I would like to efficiently convert to Concrete5 websites.

I am wondering if there is some sort of bulk page import tool in Concrete5 (perhaps based on XML, akin to WordPress). I really do not want to duplicate so many pages from so many websites manually into Concrete5, since that would take eons.

Any advice would be greatly appreciated,


Chi

PS - Sorry for referencing two other CMSs :)

 
ScottC replied:
I have a script that imports a WordPress blog, recreates the same site structure (post hierarchy etc.), and puts the content into a content block.

It is possible, though I have no idea what kind of schema they use.
melat0nin replied:
@ScottC

Are you planning to make the script available? A good WP>c5 script could be dynamite...
frz replied:
I know there has been talk of a CMS-agnostic XML format for dealing with web content. Anyone have any thoughts on that?

I think import would be huge if done right, I'm just not sure what that really means - something totally generalized, or something platform specific.
ScottC replied:
Good point. I think that is actually a pretty good approach.

If you can get your data into "this" format, concrete5 can import it. Here's what we think that format should be. *format* Say it is open or whatever, put about 3 hours into it, get 3 peer reviews and make that the format. Hopefully everyone else adopts it, but it gives them a target reference.

That would be a wickedly fun add-on to build, though I'm not sure about the ROTI (return on time invested).
LucasAnderson replied:
Are we talking like an "Easy Button"? Turn my site into concrete5 instantly. If that could be built, I'd charge $5000 per instance.
ScottC replied:
Ah yeah, well, given a predictable format you can code against, it is certainly possible. It'd just be up to them to get their data into *the* XML format.

Or I guess they can pay the guy who created the add-on to create the XML from their DB export or whatever.
chizeng replied:
Great point, you guys. That would be a very useful feature in C5.

It would certainly allow for many sites to efficiently migrate into C5, potentially dramatically increasing the number of C5 websites out there.
okhayat replied (1 attachment):
Here's a script we created at work and used to migrate content from an old CMS.
It basically reads a sections table and a category table, matches both and adds pages based on that. Surely it needs to be modified to suit your needs.
Copy it to your /single_pages, install it, fill in the required info and try it.
hursey013 replied:
Has there been any progress on this? I am looking to import data from xml or csv into c5 pages. In addition to title, description, etc there would also be some custom attributes - anyone done this before? I looked at okhayat's script and I'm not sure it fits my needs.
ScottC replied:
I did one with XML, 1210 pages or something like that. This was work for hire, but it might be worth releasing. I'll have to ask the guys I work for if they want to put it in the marketplace.

FWIW it took about 20 hours to write.
hursey013 replied:
It would be very useful and I wouldn't hesitate to pay for it.
ScottC replied:
It has a pretty nice format for the XML, and I'll see if they want to put it out there. Either way you need to massage your data into the given format. It populates pages, maintains hierarchy, and populates collection attributes (creating them if they don't exist), including the select attribute options. It also pops in a content block for the provided area handle.

Like I said, it is totally up to them.
hursey013 replied:
Any word on this?
ScottC replied:
They want to make it into an add-on, but it has a slightly different syntax, so we are rewriting it.

The site structure is like

<page>
  <!--pagestuff-->
  <page>
    <!--pagestuff-->
  </page>
</page>

So if a page has child pages, they are page XML elements nested inside its page node. Not sure when it'll be released, but I am doing some work on it tomorrow.
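The nesting described above can be walked recursively, parents before children, so that each parent page exists before its children are created. The following is only a rough sketch in Python rather than concrete5's own PHP, and it is not ScottC's importer: the `name` attribute and the `walk_pages` and `page_paths` helpers are hypothetical, since the thread only shows `<!--pagestuff-->` as a placeholder for the real page data.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the "name" attribute is an assumption made for
# illustration; the thread does not say how page data is stored.
SAMPLE = """
<page name="Home">
  <page name="About">
    <page name="Team"/>
  </page>
  <page name="Blog"/>
</page>
"""


def walk_pages(node, parent_path=""):
    """Yield (path, element) for a page node and every nested page node.

    Parents are yielded before their children, which is the order an
    importer would need in order to create the hierarchy top-down.
    """
    path = parent_path + "/" + node.get("name", "")
    yield path, node
    # findall("page") matches direct children only, so each nesting
    # level is handled by one level of recursion.
    for child in node.findall("page"):
        yield from walk_pages(child, path)


def page_paths(xml_text):
    """Return the list of page paths in creation order."""
    root = ET.fromstring(xml_text)
    return [path for path, _ in walk_pages(root)]


if __name__ == "__main__":
    for path in page_paths(SAMPLE):
        print(path)
```

In a real import, the loop body would call whatever page-creation routine the target CMS provides instead of printing the path.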
hursey013 replied:
Sounds good, keep us posted.
dihakz replied:
How is that coming along? Anything yet?
hursey013 replied:
Also wondering, this tool would be invaluable in converting sites to c5...
keeasti replied:
Anybody come up with anything?
We are facing a mammoth 1850 page import!
JohntheFish replied:
I am kicking around some ideas based on screen scraping, essentially an extension of the tricks I have used for Magic Heading. It's too early yet to know whether it will be feasible, especially in the highly automated bulk you are looking at.

I would be interested in the overall shape/style of the pages you are looking at converting and what the key requirements are. Just text, headings and lists, or tables, or images, or complex layouts involving floating divs etc? Do you need to preserve layout? Or just content?
keeasti replied:
@JohntheFish
The data is currently held in a database but not sure of exactly how it is structured. Waiting for a sample DB dump.
ScottC replied:
I wrote a package that created something like 2k-2.6k pages. It was based on an XML export of the existing pages, with well-defined collection attributes all populated. It added only content blocks, as that was all that was relevant at that point. It ran on a large EC2 instance and got through the XML file in about 60 seconds.

If you need something like that I've written it in the past.
keeasti replied:
@ScottC
Sounds more or less what I need to do ... will know more when I get a sample DB dump.
carl101lee replied:
Hi People

I have a similar issue to this. My client has given me 800 old blog posts in Word format, as the old blog has been shut down!

Now I can convert these to CSV/SQL or whatever format, but I would like to be able to mass upload them as blog pages with thumbnails.

I can't find any add-ons to help, and was wondering if the scripts posted here would still work with the latest versions of concrete5?

regards
Carl
ScottC replied:
Hi Carl,
You can certainly do this, but I would really recommend that you bulk import the files for the thumbnails ahead of time. That is really no big deal to do. Hopefully they existed in a single flat directory before, so that all the names are unique?

If so then in your custom bulk import script you would want your script to grab the filename from the CSV format, look up the file_id associated with it, then set it as the page_attribute as it runs through the rows adding your content.

If not, then you'd want to re-import the file using the API but still keep around a file_id to file relation so you don't create unnecessary duplicate files.

If you are trying to get quotes for doing this, I would think that with testing and everything this is something one could accomplish in about half a day, 4 - 6 hours if everything goes swimmingly. The actual code would take perhaps 15 minutes to an hour, but then there's verification and everything. If you have to import the files because they are in some weird format or similarly named, then add an hour. But that's what you are looking at.

-Scott
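
The lookup Scott describes (take the filename from each CSV row, find its file_id, and import the file only when no ID is known yet) might look roughly like the sketch below. It is written in Python rather than against the concrete5 API, and everything named here is an assumption made for illustration: the `title` and `thumbnail` column names, the `attach_thumbnails` helper, and the `import_file` callback that stands in for a real file import.

```python
import csv
import io


def attach_thumbnails(csv_text, file_ids, import_file):
    """Pair each CSV row's title with the file ID of its thumbnail.

    file_ids    -- dict mapping filename -> file ID for files already imported
    import_file -- hypothetical callback that imports one file and returns its ID

    The dict is updated in place, so a filename that appears in several
    rows is imported only once and never duplicated.
    """
    results = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = row["thumbnail"]
        if name not in file_ids:
            # Unknown file: import it once and remember the resulting ID.
            file_ids[name] = import_file(name)
        results.append((row["title"], file_ids[name]))
    return results


if __name__ == "__main__":
    sample = "title,thumbnail\nFirst post,a.jpg\nSecond post,b.jpg\nThird post,b.jpg\n"
    known = {"a.jpg": 7}  # pretend a.jpg was bulk imported ahead of time
    imported = []

    def fake_import(name):
        # Stand-in for a real import; hands out made-up IDs.
        imported.append(name)
        return 100 + len(imported)

    for title, file_id in attach_thumbnails(sample, known, fake_import):
        print(title, file_id)
```

Keeping the filename-to-ID mapping in one place is what prevents the duplicate files Scott warns about when the same image is referenced by more than one row.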