Help with Page Popularity Addon

Permalink
I am working on an Addon that will list pages based on data derived from the PageStatistics table.

Three options so far are:
1. Most hits of all time
2. Most hits in the last month
3. Trending pages in the last month

The first two are easy: add up all of the hits for the given time range. The third is causing my head to hurt. I've come up with a SQL query for getting the 'exponentially smoothed moving average with 10% smoothing' based on calculations fromhttp://www.fourmilab.ch/hackdiet/www/subsection1_4_1_0_1.html... . That SQL query is in the attached file.

What I'd like to do is to calculate 'hot' pages with the following considerations:
* If a page is always popular (like the home page), it should be rated lower
* If a page is trending really well, but the actual page hits is low, then it should be rated lower. For instance, a page got 8 hits this week and 1 hit last week, the overall change is 800%, but the actual page hits is much lower than some other page that got 50 hits.

Any help would be appreciated as more advanced math is not my strong point.

1 Attachment

SkyBlueSofa
 
Mainio replied on at Permalink Reply
Mainio
I would consider creating some kind of "weight factor" for the page hits percentage, so count the percentage as you did 1 => 8 is 8/1=800% as much as the previous week and for the other page e.g. if it had 40 views last week and 50 views this week 50/40=125% as much as previous week.

Then you might count some kind of "weight factor" for those numbers that would make them comparable with each other. I would probably count them e.g. per 100 page views.

I would suggest counting the weight factor based on how many views your site gets overall and maybe also correcting that number somehow (e.g. don't add the views for the home page) and comparing the single page views to that number would give you figures that can be better compared with each other.

E.g. your site gets 1000 page views during the week, of which:
a) Page 1 gets 50
=> 50/1000 = 0.05% of total views
b) Page 2 gets 8
=> 8/1000 = 0.008% of total views

During the last week you got 800 page views of which:
a) Page 1 got 40
=> 40/800 = 0.05%
b) Page 2 got 1
=> 1/800 = 0.00125% of total views


And then if you compare these figures with each other, you can get a comparable "increase" number:
a) For page 1: 0.05-0.05=0% increase
b) For page 2: (0.008-0.00125)/0.00125=540% increase


I don't know if this is exactly what you were looking for but at least it gives you page specific numbers that you can compare with each other...


Best,
Antti / Mainio
SkyBlueSofa replied on at Permalink Reply 1 Attachment
SkyBlueSofa
Thanks, Antti.

The SQL that I came up with so far is in the attachment.

Nutshell:
a = hits for the week
b = total hits for the week
c = (a/b)*100 : page percent of total hits

x = average hits for the previous two weeks
y = total average hits for the previous two weeks
z = (x/y)*100 : page percent of avg total hits prev weeks

p = c-z : percent change from previous weeks to this week


I liked what you proposed, but I had problems and I ended up with a 'division by zero'. Say that there was a new page with some hits (lets say 10 hits). Last week it didn't exist and so it got zero hits. Now the calculation that you proposed would be something to the effect of ((10-0)/0) = null.

Can you think of anything else I could do to make it better?




I did what you described except for the
Mainio replied on at Permalink Reply
Mainio
Hi,

I'd just suggest not doing everything with SQL. You know, you can do some calculations programmatically in PHP, so you can avoid these problems quite easily. ;)

Also, building big SQL queries quite usually leads you to performance issues, so generally it is quite good idea to just handle quite simple-level queries in the SQL server.


Antti