Optimising code 8.2.1 User Import

I am trying to import users from a CSV as efficiently as I can. I have a list of 700+ users and I've created an upload screen that takes in a CSV and processes the creation of new users. Code below.

I have run the code several times (removing the users afterwards) to test how long it will take users of the system to import a large list of customers. Currently it's taking around 16.34 seconds per user, which is obviously a huge amount of time when you're importing 700+ (the full run took 3 hrs 24 mins).

Can anyone suggest optimisations to speed things up, or point out any rookie mistakes I've made? If you can explain why, that would really help my understanding.

public function handleSubmit()
{
    $l = new Logger(LOG_TYPE_EXCEPTIONS);
    $l->info("");
    $l->info("Start User Import");
    $recordProcess = [];
    $eachUserCount = 0;
    $RegionEntities = Express::getList('region');
    $FacilityEntities = Express::getList('facility');
    $this->connImport = $this->getDB();
    $this->entityManager = \Core::make('database/orm')->entityManager();
    $userRegistrationService = \Core::make('\Concrete\Core\User\RegistrationServiceInterface');
    $userinfo = array();
    $htmlReturn = "";
    $token = $this->app->make('token');

 
mnakalay replied:
Here are some micro-optimizations you can try. They won't halve the time it takes, but they'll help.

You have this
$reg = $userRegistrationService->create($userinfo);
 // some more code
// ....
$u = \Core::make('Concrete\Core\User\UserInfoRepository')->getByName($data['id']);

You don't need that last line. When you create the user you get the UserInfo object back, so $reg is already what you need and the $u lookup is redundant.

In the while loop you have
$ui = \Core::make('Concrete\Core\User\UserInfoRepository')->getByName($data['id']);

Then later, still inside the while loop but also nested inside two for loops, you have
$this->entityManager->getRepository('Concrete\Core\Entity\Express\Entry')->findOneById($value);

You should probably do
$entity = $this->entityManager->getRepository('Concrete\Core\Entity\Express\Entry');

Put that outside, before the while loop, and then just use that variable instead of fetching the repository again on every iteration. Do the same for the UserInfoRepository class.
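
To make the hoisting idea concrete, here is a minimal, self-contained sketch. It does not use the real concrete5 container: CountingContainer is a hypothetical stand-in that only counts how many times a service is resolved, and the row data is made up. In the real code the same pattern would apply to the \Core::make(...) and getRepository(...) calls.

```php
<?php
// Hypothetical stand-in for the container: counts how often a service is resolved.
final class CountingContainer
{
    public $resolutions = 0;

    public function make($name)
    {
        $this->resolutions++;
        // A real container would build or look up the service here.
        return new ArrayObject(['service' => $name]);
    }
}

// Made-up rows standing in for the imported CSV data.
$rows = [['id' => 'alice'], ['id' => 'bob'], ['id' => 'carol']];

// Before: the service is resolved on every iteration.
$slow = new CountingContainer();
foreach ($rows as $data) {
    $repository = $slow->make('UserInfoRepository');
    // ... use $repository for $data['id'] ...
}

// After: resolve once before the loop and reuse the variable.
$fast = new CountingContainer();
$repository = $fast->make('UserInfoRepository');
foreach ($rows as $data) {
    // ... use $repository for $data['id'] ...
}

echo $slow->resolutions, ' vs ', $fast->resolutions, PHP_EOL;
```

The saving per iteration is small, which is why this counts as a micro-optimization, but inside nested loops the number of repeated lookups grows quickly.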

What I understand from your code is you first empty the core Users table. Then you load the data from your file into UserDatabase.Users table. Then you do some processing.

Then you select data from UsersDatabase.Users_combined. I am not sure how that table came into existence in the first place.

Using that data, you start checking whether those users exist and, if not, create them using the data from UserDatabase.Users_combined.

So the things that are not clear to me:
1- Am I right in assuming you are using totally separate databases for your processing?
2- Why is it first UserDatabase and then UsersDatabase with an s?
3- After loading your file's data into UserDatabase.Users, what do you do with that data? Is it needed?
4- Since you truncate the core Users table, why do you need to check for each user if it exists before adding it? It won't be there since you emptied the table. Unless your data has redundancies?
5- If I am right and you are using separate databases, do you have to? That's bound to slow things down a lot.
6- Adding users to the Core table the way you do it is slow. Isn't there a way you can load your data directly in the core Users table instead of UserDatabase.Users and then deal with attributes and other things?

Again, my assessment might be wrong but it might help.
dkw15 replied:
Hi mnakalay

1. I am using a separate database, this is so I don't interfere with the core database tables.
2. Probably a typo from copying across the code (this isn't the case in the code I have).
3. After the CSV is loaded into the database it doesn't necessarily need to be stored (although for continuity it might be a good idea to keep a log of imported users with a datetime).
4. I am truncating the external table (ready for a new CSV load), not the core Users table in the CMS database. So I still need to check the CMS Users table to see whether each user exists before adding them.
5. In an ideal world, no, but it's necessary at the moment to keep things separated.
6. Can you explain what you mean by adding users the way I do being slow? Is there a better, faster way?

I'll see if I can implement your optimisations and thanks for the suggestions and assistance! It's much appreciated.
JohntheFish replied:
For a few dollars there are user CSV importers in the marketplace.
There is also a free importer by @hissy on GitHub.
dkw15 replied:
True, but the project and the user import tool are fairly bespoke, so any free or paid add-on I used would need modifying anyway. Thanks though.
mnakalay replied:
Concerning (6), it's just that I don't understand why you need to import your data into a separate table in a separate database and from there run a series of queries to add one user at a time.

Wouldn't it be more efficient to process the data imported in a way that allows you to get rid of that intermediary table?

Also, as far as the creation of users is concerned, it might be more efficient to add them all to the table using an SQL query instead of the C5 API. Once all the users are in, you can use the C5 API for attributes, groups and so on.

I think it would be faster. Harder too probably.
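
As a rough illustration of the "one SQL query" idea, the sketch below builds a single parameterised multi-row INSERT from a batch of rows. Everything specific in it is an assumption for illustration: buildBatchInsert is a hypothetical helper, ImportStaging is a placeholder table name, and the two columns are not meant to represent the full core Users schema, which needs more fields (a hashed password, for example) than are shown here.

```php
<?php
/**
 * Build one parameterised multi-row INSERT for a batch of rows.
 * $table and $columns must be trusted identifiers chosen by the developer,
 * never user input; the row values travel as bound parameters.
 * Hypothetical helper, not part of the concrete5 API.
 */
function buildBatchInsert($table, array $columns, array $rows)
{
    if (count($rows) === 0) {
        throw new InvalidArgumentException('No rows to insert.');
    }

    // One "(?, ?, ...)" group per row, one "?" per column.
    $rowPlaceholder = '(' . implode(', ', array_fill(0, count($columns), '?')) . ')';

    // Flatten the values in the same column order as the placeholders.
    $params = [];
    foreach ($rows as $row) {
        foreach ($columns as $column) {
            $params[] = $row[$column];
        }
    }

    $sql = sprintf(
        'INSERT INTO %s (%s) VALUES %s',
        $table,
        implode(', ', $columns),
        implode(', ', array_fill(0, count($rows), $rowPlaceholder))
    );

    return [$sql, $params];
}

// Made-up rows and a placeholder table name, for illustration only.
$rows = [
    ['uName' => 'alice', 'uEmail' => 'alice@example.com'],
    ['uName' => 'bob', 'uEmail' => 'bob@example.com'],
];
list($sql, $params) = buildBatchInsert('ImportStaging', ['uName', 'uEmail'], $rows);

echo $sql, PHP_EOL;
```

With a PDO connection, the result could then be run as $pdo->prepare($sql)->execute($params), ideally in chunks of a few hundred rows so that no single statement gets too large. Writing straight to the core tables bypasses whatever the C5 API does on user creation, which is probably where the "harder" part comes in.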