Getting over the upload timeout problem

Hello.

I have been banging my head against this one for a few days now and wonder if somebody might have an answer.

The site needs to upload a large CSV file that requires line-by-line processing as it updates the database, specifically User attributes. Unfortunately I get the 'The connection to the server was reset while the page was loading.' error before the file has finished processing. It's not a question of the file being too big; it appears to be a processing timeout problem.

I have tried altering php.ini to extend the memory and timeout limits, and this appears to have no effect.
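
For the record, these are the sorts of php.ini directives I was raising (the values here are just illustrative, not a recommendation):

max_execution_time = 300
memory_limit = 256M
upload_max_filesize = 64M
post_max_size = 64M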

I have tried adding 'set_time_limit(10);' before each line of the CSV is processed in the controller, and this doesn't work either.

I tried splitting the file into chunks: first uploading it into the File Manager so I had a copy of the file on the server (no problem doing that), then using a database table to hold a pointer to the next line to process, processing twenty lines at a time, and using $this->redirect to the page itself. The idea was to start afresh on each reload and carry on from the line recorded in the database. It works beautifully, except that the failure is now Firefox telling me I have made too many redirects!
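
For what it's worth, the controller logic was roughly along these lines (simplified from memory; the 'csv_import' table, its columns and the paths are invented names for illustration):

// Simplified sketch of my chunked approach. Loader::db() and
// $this->redirect() are standard concrete5; everything else is illustrative.
$db = Loader::db();
$next = (int) $db->GetOne("SELECT next_line FROM csv_import WHERE id = 1");

$fh = fopen($pathToCsv, 'r'); // path of the copy held on the server
for ($i = 0; $i < $next; $i++) {
    fgetcsv($fh); // skip the lines already processed on earlier passes
}

$row = false;
for ($i = 0; $i < 20 && ($row = fgetcsv($fh)) !== false; $i++) {
    // ... update this User's attributes from $row ...
    $next++;
}
fclose($fh);

$db->Execute("UPDATE csv_import SET next_line = ? WHERE id = 1", array($next));
if ($row !== false) {
    $this->redirect('/import-users'); // reload the page and carry on
}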

Does anyone have any advice, please, on the best way to set a process running in the background that won't time out, and on how to monitor its progress?

tangent
JohntheFish replied on at Permalink Reply
You need to think of this as two separate issues:
- uploading the CSV file
- processing it into the database.

I have a gut feeling you could be getting bogged down in the first issue, when it is really the second issue that is causing the problem.

If this is only a one-off initial data load or a very occasional upload, you can:
- upload using the File Manager and process/import from there
- upload using FTP and then either import to the File Manager or process the file directly.

Again, if this is a simple one-off import of a CSV into a database table, once the file is uploaded you could import it using phpMyAdmin.

Perhaps you could tell us how many kilobytes or megabytes the CSV file is, how many rows you are importing, and whether this goes into just one database table or a whole structure of related tables.
jvansanten replied on at Permalink Reply
In an application, I use straight PHP code to HTTP-upload to a folder and process from there. After the usual bumping up of the settings, as you have done, we're uploading files of up to 150 MB, which works fine over reliable Internet connections with good home-user bandwidth -- nothing fancy.
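
The receiving end is nothing more exotic than the standard PHP upload handling, something like this (the field name and target folder are illustrative):

// Plain PHP upload handler; 'csvfile' is the form's file input name.
if (isset($_FILES['csvfile']) && $_FILES['csvfile']['error'] === UPLOAD_ERR_OK) {
    $dest = '/path/to/uploads/' . basename($_FILES['csvfile']['name']);
    if (move_uploaded_file($_FILES['csvfile']['tmp_name'], $dest)) {
        // the file is now on the server; process it from here
    }
}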
tangent replied on at Permalink Reply
Hey guys. I am overwhelmed by the responses. Thanks.

It's not an upload issue. I invented a system that used FileImporter() to load the CSV file into the File Manager, thinking that any redirect would lose the original file, and also that it should be on the server for performance reasons. The upload worked with no problem.
My thinking was that, by storing the 'next line' pointer in the database, all that would be required was a redirect back to the page. Each pass would start where the last one left off, and the database would record the status of the upload for each User.

This worked a treat, but as jshannon picked up, the redirect idea eventually killed the browser.

I am importing/updating a dozen attributes per User, and this may need to grow. The CSV file holds several thousand Users. On my server it takes about a second to alter the attributes of each User, so at the moment we are looking at a process that takes about 40 minutes.

The User update will be a regular occurrence. The charity in question have a complex Access database and wish to manage their Members on it rather than move everything onto the web, which I have suggested to them several times.

What would be ideal is knowing how to start a background process (not a page) that has access to all the Concrete5 MVC objects and won't time out. It would loop through the lines of the uploaded CSV file and store status information in the database. I would also need to know how to build a page that queries the database and displays the status of the upload at regular intervals.
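
To make that concrete, this is the sort of thing I picture, completely untested and with every name invented: kick off a worker script in the background, and have the status page poll a table the worker writes to:

// Launch the worker in the background (illustrative path to the script):
exec('nohup php /path/to/import_worker.php > /dev/null 2>&1 &');

// Meanwhile, the status page just reads back what the worker has logged:
$db = Loader::db();
$row = $db->GetRow("SELECT processed, total FROM import_status WHERE id = 1");
echo $row['processed'] . ' of ' . $row['total'] . ' Users updated';
// the page could refresh itself every few seconds, e.g. with
// <meta http-equiv="refresh" content="5">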

The redirect idea was my kludgy way of trying it. Putting it in the background would be nirvana.

Over to you chaps, and thanks again.
jshannon replied on at Permalink Reply
From experience writing my user importer, updating attributes in c5 can be super inefficient. If you do it the common way, it does an ->index() after each attribute change. If you do it the uncommon way, the index() can be called manually. It's not the attribute update that takes all the time, but the indexing. I can't remember exactly, but I think the common way is $user->setAttribute() and the uncommon way is $attribute->set($user). This cut my imports down from 12 minutes to 2 minutes!
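
Working purely from memory, the two styles look something like this; verify the exact method names against your c5 version before trusting any of it:

$uID = (int) $row[0];          // user ID taken from the CSV row
$firstName = $row[1];          // attribute value taken from the CSV row
$ui = UserInfo::getByID($uID);

// Common way: reindexes after every single call (slow inside a loop):
$ui->setAttribute('first_name', $firstName);

// Uncommon way: set each value without indexing, then index once per user.
// I'm unsure of the exact form of the set() call -- check your version:
$ak = UserAttributeKey::getByHandle('first_name');
$ak->set($ui, $firstName);
$ui->index(); // one manual call instead of a dozen automatic ones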

Do some searching for command-line concrete5. It's easily doable, and I think there are some tutorials. But if your host is killing processes (which I suspect they are... otherwise the 60 seconds doesn't make sense), then they're going to kill any command-line processes too.
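
If memory serves, the usual trick is to bootstrap the c5 environment from a standalone script without rendering a page, along these lines:

// import_worker.php -- run with: php import_worker.php
// C5_ENVIRONMENT_ONLY boots concrete5 (Loader, UserInfo, etc.)
// without dispatching a page; adjust the path to your install.
define('C5_ENVIRONMENT_ONLY', true);
require '/path/to/concrete5/index.php';

$db = Loader::db();
// ... open the CSV, loop the rows, update attributes, write status ...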

There are definitely some gaps between what you want to do and my user importer (updates and automation), but I'm working on version 2.0 right now...
tangent replied on at Permalink Reply
Thanks James.

I guess version 2 will have updates! I have nailed that for this organisation.

Thanks for the tip on ->index(). I will have a look at $attribute->set() and also the command line.