How to speed up groovy scripts

I had a task to optimize how users are created and stored. Before the change, every anonymous checkout created a new user with an ID in the format RANDOM_UID|USER_EMAIL, and many customers used anonymous checkout more than once. After the change, a single user with an ID in the format USER_EMAIL is used for all anonymous checkouts made with that email. As part of the task I needed to migrate all old users to the new format. The first version of the migration script took more than 3 days to process around 7 million users.
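To make the ID change concrete, here is a minimal sketch of how old-format IDs collapse into new-format ones. The helper name `toNewId` and the sample IDs are hypothetical, and the sketch assumes the random part never contains a `|` character; the real migration worked on hybris user models, not plain strings.

```groovy
// Hypothetical helper: strip the RANDOM_UID| prefix from an old-format ID.
// Assumes the random part itself never contains '|'.
String toNewId(String oldId) {
    int sep = oldId.indexOf('|')
    return sep >= 0 ? oldId.substring(sep + 1) : oldId
}

// Made-up sample data: two anonymous checkouts by the same customer, one by another.
def oldIds = ['a1b2|jane@example.com', 'c3d4|jane@example.com', 'e5f6|john@example.com']

// Group the old users by the single new-format ID they should be merged into.
def grouped = oldIds.groupBy { toNewId(it) }
```

Each key of `grouped` is one new-format user, and its value lists the old users that have to be merged into it.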

The most significant speedup came from committing the transaction after each save through the model service. The trade-off is that rollback mode no longer works: all changes are committed and persisted even if you run the Groovy script in rollback mode.

First, close the current transaction at the very beginning of the script:

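    import de.hybris.platform.tx.Transaction

    // Commit the currently running transaction, if there is one
    Transaction tx = Transaction.current()
    if (tx.isRunning()) {
        tx.commit()
    }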

Then wrap every call to modelService.save and modelService.remove in its own transaction, beginning one before the call and committing it right after:

    // One short transaction per user: each save is committed immediately
    for (UserModel u : users) {
        tx.begin()
        modelService.save(u)
        tx.commit()
    }
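The loop above leaves a transaction open if a save throws. One possible way to handle that is a small helper that rolls back the failed item and rethrows. This is only a sketch: `inOwnTransaction` is a hypothetical name, and `FakeTx` is a stand-in written so the example runs outside hybris. In a real script you would pass `Transaction.current()` and call `modelService.save` inside the closure.

```groovy
// Stand-in for the hybris Transaction, only so this sketch runs on its own.
class FakeTx {
    List<String> log = []
    void begin()    { log << 'begin' }
    void commit()   { log << 'commit' }
    void rollback() { log << 'rollback' }
}

// Hypothetical helper: run one unit of work in its own transaction.
def inOwnTransaction(tx, Closure work) {
    tx.begin()
    try {
        work.call()
    } catch (Exception e) {
        tx.rollback()   // undo only this item; earlier commits stay persisted
        throw e
    }
    tx.commit()
}

def tx = new FakeTx()
def saved = []
['jane@example.com', 'john@example.com'].each { uid ->
    // In hybris this closure would call modelService.save(user)
    inOwnTransaction(tx) { saved << uid }
}
```

Rolling back only affects the item that failed, so the trade-off described earlier still applies: everything committed before the failure stays in the database.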

These changes cut the execution time to 8 hours. After that we brought it down to about one and a half hours by running scripts in parallel: the user list was split into 6 batches, and each batch was executed at the same time in a separate hac console (unfortunately, parallelizing with threads inside a Groovy script had no effect on execution time).
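How the list is split is not critical. One simple option, shown here as a sketch with made-up IDs, is Groovy's `collate`, which cuts a list into consecutive chunks of a given size. The variable names and the sample of 23 users are assumptions for illustration only.

```groovy
// Made-up sample: 23 user IDs standing in for the real list.
def userIds = (1..23).collect { "user${it}@example.com".toString() }

int batchCount = 6
// Round up so that no more than batchCount batches are produced.
int batchSize = (int) Math.ceil(userIds.size() / (double) batchCount)

// collate() splits the list into consecutive sublists of batchSize elements;
// the last sublist holds whatever remains.
def batches = userIds.collate(batchSize)
```

Each console would then process a single element of `batches`. Because the batch size is rounded up, some list sizes produce fewer than 6 batches.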

P.S. The remove operation is very time consuming. We marked all users that had to be deleted via a Groovy script and then removed them with a direct MySQL query. Be careful with this approach, because it bypasses hybris interceptors and the automatic removal of related database objects.
