Back up your site before using Extreme Clean. I accept no responsibility if you end up cleaning critical data!

For backing up a site, have a look at Backup Voodoo.

As a developer, I find the core Remove Old Page Versions job has its limitations:

  • It only processes 3 pages at a time, so it needs to be run multiple times
  • It leaves the 10 most recent versions intact

Such a job is eminently suitable for use by end users, but as a developer I often find I need more. Cleaning up a site can also involve more than just deleting old page versions. I regularly found myself having to truncate statistics and logs tables.

Over a number of projects that needed a more rigorous and extreme cleanup, I hacked Remove Old Page Versions to meet my own requirements and pulled in table cleanup. Extreme Clean evolved from that work.

Extreme Clean runs as a queueable job, so it can simply be set running from the dashboard jobs page and will iterate away happily by itself until it has scoured the entire site. All old page versions will be removed, and the PageStatistics, Logs and JobsLog tables will be truncated.

By default Extreme Clean only tries to clean up one table or one page within each batch step. The core Remove Old Page Versions job works in steps of 3 pages, but cleans fewer versions per page. If you have a fast server and find Extreme Clean too slow, define the site constant EXTREME_CLEAN_BATCH_SIZE with a larger value.
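For example, assuming the constant takes the number of items to process per batch step (the value here is purely illustrative — pick a size that suits your server), you could add it to your site's config/site.php:

```php
<?php
// config/site.php
// Illustrative value: process 5 tables/pages per batch step instead of 1
define('EXTREME_CLEAN_BATCH_SIZE', 5);
```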

To prevent Extreme Clean from timing out where pages have massive numbers of old versions, the maximum number removed in any one pass is 20. If you have a fast server and find this too slow, you can define the site constant EXTREME_CLEAN_VERSIONS_IN_PASS to raise (or lower) the maximum number of page versions cleaned in any one pass. Setting this to 0 removes the limit.

If you find that Extreme Clean occasionally times out on execution time available, you may want to reduce EXTREME_CLEAN_VERSIONS_IN_PASS.
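Again in config/site.php, a sketch of both directions (the values are illustrative; tune them to your server's execution time limits):

```php
<?php
// config/site.php
// Raise the per-pass limit from the default of 20 on a fast server...
define('EXTREME_CLEAN_VERSIONS_IN_PASS', 50);
// ...or lower it if the job occasionally times out, e.g.
// define('EXTREME_CLEAN_VERSIONS_IN_PASS', 10);
// Setting 0 removes the limit entirely.
```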

When cleaning of a page is incomplete because it has more old versions than the per-pass maximum, the page is added again to the end of the queueable job. When running the job from the dashboard, this shows as the last pass of the job being run multiple times (because the concrete5 core does not detect that the queue has grown while it is being processed).

If you don't like this behaviour, you can define the constant EXTREME_CLEAN_AUTO_EXTEND as false. Extreme Clean will then simply report that a page could not be fully cleaned of old versions and advise you to run the job again.
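As a config fragment, that is simply:

```php
<?php
// config/site.php
// Report incomplete pages instead of re-queueing them automatically
define('EXTREME_CLEAN_AUTO_EXTEND', false);
```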

If you want to keep more versions than just the latest, or to keep some table rows, define the constants EXTREME_CLEAN_MAX_VERSIONS and EXTREME_CLEAN_MAX_ROWS.

If you want to clean up a different set of tables, define the constant EXTREME_CLEAN_TABLES as a comma separated list. The default is equivalent to 'PageStatistics,Logs,JobsLog'.
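Putting the retention and table constants together, a sketch in config/site.php (all values here are illustrative, not recommendations):

```php
<?php
// config/site.php
define('EXTREME_CLEAN_MAX_VERSIONS', 5);    // keep the 5 most recent versions of each page
define('EXTREME_CLEAN_MAX_ROWS', 100);      // keep the 100 'most recent' rows of each table
// Comma separated list of tables to clean; this example leaves JobsLog alone
define('EXTREME_CLEAN_TABLES', 'PageStatistics,Logs');
```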

Beware that when cleaning tables, the 'most recent' rows to keep are judged by the first column/field in the table. This is a massive assumption that happens to hold true for the tables configured by default. Please make sure you understand what you are doing if you want to merely trim other tables.

If other developers have favourite cleanups they apply before deployment, please send me complete or partially worked out code and I will consider incorporating it.

If Extreme Clean appears to stall...

First, check the network tab of your browser's developer console to see whether it has really hung. It could actually be adding and evaluating new passes for pages with lots of old versions, but not showing that in the job progress bar. That is unfortunately a limitation of the way the core jobs mechanism calculates the progress bar.

If Extreme Clean really has stalled, the most likely reason is that a complex page has so many versions that they can't be cleaned within the server resources available. This can be solved by setting EXTREME_CLEAN_VERSIONS_IN_PASS to a value smaller than the default (see above).


A final reminder:

Back up your site before using Extreme Clean. I accept no responsibility if you end up cleaning critical data!

For backing up a site, have a look at Backup Voodoo.