All times are UTC - 8 hours




 Post subject: Re: Update to popularity score algorithm
PostPosted: Mon Sep 26, 2011 10:56 am 
Joined: Mon Mar 19, 2007 8:55 am
Posts: 993
Location: Italy
You are right. I noticed it after posting, and I'm running some more tests to understand why (the code is updated if you want to test it).
I also sent a message to the author of the second article to ask for help with the formula.

_________________
Lupo PenSuite: all-in-one and completely free selection of portable programs and games.
DropIt: personal assistant to automatically manage your files.
ArcThemALL!: application to multi-archive your files and folders.


PostPosted: Mon Sep 26, 2011 8:48 pm
Joined: Tue Mar 09, 2010 7:36 pm
Posts: 197
m^(2) wrote:
Why does 100+ 100- have lower popularity than 10+ 10-?
IMO it should work the other way...

I would think that since more people have voted on the 100+ 100- entry, the confidence is greater, so the percentage reflects that.


PostPosted: Tue Sep 27, 2011 12:46 am
Joined: Sat Feb 04, 2006 9:19 am
Posts: 1932
Just fixed a serious bug in the code.

For certain entries that are not too popular, the historical aggregate score was also added to the final score, which made some of them rise above entries that are truly more popular based on the past 30-day total.

My apologies for this error. :oops:

The new top 10 is a mixture of old and new compared to the all-time list:

Code:
Old list:
FastStone Capture => 196930
Yod'm 3D => 99423
Foxit Reader Portable => 68417
Undelete Plus => 29393
SilentNight Micro CD Burner => 27443
PixaMSN => 20691
EVEREST Home Edition => 19900
Universal Extractor => 19585
CCleaner => 17341
Mozilla Firefox, Portable Edition => 16610


Code:
New list:
FastStone Capture => 4775
Undelete Plus => 1290
EVEREST Home Edition => 911
Free PDF Compressor => 789
Disk Digger => 681
PDF-XChange Viewer => 625
DriveImage XML => 608
TrueCrypt => 524
Q-Dir => 522
Softi FreeOCR => 443


For example, the new top 3 all appear in the all-time list. Foxit Reader Portable has moved down but is still ranked between 10 and 20 (not shown), and so on.

This corrected list seems to look more reasonable now. :D


PostPosted: Tue Sep 27, 2011 7:21 am
Joined: Mon Mar 19, 2007 8:55 am
Posts: 993
Location: Italy
I'm studying the new formula. Is the idea still valid? I think the current solution has only limited value and does not correctly reflect popularity, but I'd like to hear your opinion and others' about it :P

A couple of questions about the database: does it still store all votes (so that old votes could be restored if needed)? And could you give me some example software ratings to run tests with (the + and - votes for each app, for about 20 apps)?
If plus votes generally far outnumber minus votes, the current formula could already be good enough.


PostPosted: Wed Sep 28, 2011 4:04 am
Joined: Mon Mar 19, 2007 8:55 am
Posts: 993
Location: Italy
I updated the test code with a new, better, and easier solution that uses static weights for registered and unregistered votes:
Quote:
Popularity = 0.8 * (0.8 * Value1 + 0.2 * Value2) + 0.2 * (0.8 * Value3 + 0.2 * Value4)

With:
Value1 = unregistered positive votes / unregistered total votes
Value2 = unregistered positive votes / unregistered maximum positive votes
Value3 = registered positive votes / registered total votes
Value4 = registered positive votes / registered maximum positive votes

The weights could be changed, for example to give more importance to the number of votes if the difference between positive and negative votes is less relevant.
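To make the quoted formula easier to experiment with, here is a minimal Python sketch of it. The function and parameter names are hypothetical, and it assumes that "maximum positive votes" means the highest positive-vote count held by any single app in the same user group, which the post does not state explicitly. Empty denominators are treated as a ratio of 0, which is also an assumption.

```python
def ratio(numerator, denominator):
    """Safe division: an empty denominator is treated as 0 (assumption)."""
    return numerator / denominator if denominator else 0.0


def lupo_popularity(unreg_pos, unreg_total, unreg_max_pos,
                    reg_pos, reg_total, reg_max_pos):
    """Static-weight popularity as quoted above, in the range 0..1.

    *_max_pos is assumed to be the largest positive-vote count of any
    app in that user group (hypothetical interpretation).
    """
    value1 = ratio(unreg_pos, unreg_total)    # share of positive votes
    value2 = ratio(unreg_pos, unreg_max_pos)  # volume relative to the top app
    value3 = ratio(reg_pos, reg_total)
    value4 = ratio(reg_pos, reg_max_pos)
    return (0.8 * (0.8 * value1 + 0.2 * value2)
            + 0.2 * (0.8 * value3 + 0.2 * value4))


# Two apps with the same 50% approval but different vote volumes,
# counting unregistered votes only:
big = lupo_popularity(100, 200, 100, 0, 0, 0)
small = lupo_popularity(10, 20, 100, 0, 0, 0)
print(big > small)
```

With these weights the vote volume (Value2, Value4) breaks ties between apps with the same approval share, so the 100+ 100- app ranks above the 10+ 10- one.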


Last edited by Lupo73 on Thu Sep 29, 2011 10:28 am, edited 1 time in total.

PostPosted: Wed Sep 28, 2011 6:30 am
Joined: Mon Mar 19, 2007 8:55 am
Posts: 993
Location: Italy
I updated the code and the screenshot again. There are now 5 solutions available:
1. Wilson formula (which analyzes only a single software rating and gives limited weight to the number of votes)
2. Bayes formula (which compares a software rating with the average software rating, but seems unreliable for ratings with a significant percentage of negative votes)
3. Lupo formula (the simple solution I proposed in the previous post; apparently good, but it needs to be verified)
4. positive / total formula (which doesn't consider the number of votes at all)
5. (positive - negative) / total formula (another reference formula that is fully independent of the number of votes)

I'd like some example ratings to test them with realistic values, but in any case my opinion is that the Wilson formula is the best one. Alternatively, my simple formula could be used, since it lets you specify the weight of each rating parameter.
As for the Bayes formula, I'm waiting for an answer from the author of a related article: I think I have implemented it correctly, but the results are a little strange.
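For readers who want to compare formulas 1, 4 and 5 on the same numbers, here is a small Python sketch. It assumes "Wilson formula" refers to the lower bound of the Wilson score confidence interval, which is the usual choice for ranking by votes; the thread does not show the test code, so the 95% confidence level (z = 1.96) and all function names are assumptions. The Bayes formula is left out because its exact form is not given here.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Lower bound of the Wilson score interval for the positive share.

    z = 1.96 corresponds to roughly 95% confidence (assumed value).
    """
    if total == 0:
        return 0.0
    p = positive / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


def positive_ratio(positive, total):
    """Formula 4: positive / total, ignores the number of votes."""
    return positive / total if total else 0.0


def net_ratio(positive, total):
    """Formula 5: (positive - negative) / total."""
    negative = total - positive
    return (positive - negative) / total if total else 0.0


for pos, tot in [(10, 20), (100, 200)]:
    print(pos, tot,
          round(wilson_lower_bound(pos, tot), 3),
          positive_ratio(pos, tot),
          net_ratio(pos, tot))
```

Formulas 4 and 5 give identical results for 10+ 10- and 100+ 100-, while the Wilson lower bound rates the entry with more votes higher, because more votes narrow the interval around 50%.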


PostPosted: Sat Oct 01, 2011 4:23 am
Joined: Sat Feb 04, 2006 9:19 am
Posts: 1932
@Lupo73: Thanks for your work so far!

Before we continue, let me summarize the current system as it is (after digging into the code).

Every time a browser is interested enough to click on the "Website" or "Download" link, a "+1" score is added to a "points" table. Let's call this the "activity score". For simplicity, I am not going to go into duplicate detection (another table stores the IP addresses for such actions, so if a user clicks "Website", then "Download", it will still count as "+1").

Every time a browser is interested enough to rate an entry, a "+5" or "-5" score is added to the "points" table depending on whether it's a "rocks" or "sucks" vote. Let's call this the "voting score". Separately, an internal score corresponding to the user's rank (if available) is added to an "appscore" table.

On a daily basis, the scores for each entry in the "points" table are consolidated via an SQL sum() and inserted into another "points2" table, then all entries in "points" are cleared for the next day (to save space).

As is obvious now, the popularity score is computed by summing the "points2" table for each entry (range is currently 30 days, previously it was the entire range).
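The flow described so far can be modelled in a few lines of Python. This is only an in-memory sketch of the description above, not the site's actual code: the class and method names are hypothetical, the "appscore" table is omitted, and the duplicate check is assumed to work per IP address and entry and to reset at each daily consolidation, which the post does not specify.

```python
from collections import defaultdict
from datetime import date, timedelta

ACTIVITY_POINTS = 1  # one "Website" or "Download" click
VOTE_POINTS = 5      # +5 for "rocks", -5 for "sucks"


class PopularityStore:
    def __init__(self):
        self.points = []                  # raw rows for the current day: (entry, score)
        self.points2 = defaultdict(dict)  # entry -> {day: consolidated score}
        self.seen_clicks = set()          # (ip, entry) pairs already counted (assumed rule)

    def record_click(self, entry, ip):
        """Website and Download clicks from one IP count once per entry."""
        if (ip, entry) in self.seen_clicks:
            return
        self.seen_clicks.add((ip, entry))
        self.points.append((entry, ACTIVITY_POINTS))

    def record_vote(self, entry, rocks):
        self.points.append((entry, VOTE_POINTS if rocks else -VOTE_POINTS))

    def consolidate(self, day):
        """Daily job: sum "points" per entry into "points2", then clear "points"."""
        totals = defaultdict(int)
        for entry, score in self.points:
            totals[entry] += score
        for entry, total in totals.items():
            self.points2[entry][day] = self.points2[entry].get(day, 0) + total
        self.points.clear()
        self.seen_clicks.clear()

    def popularity(self, entry, today, window_days=30):
        """Sum of the consolidated scores over the last window_days days."""
        start = today - timedelta(days=window_days)
        return sum(score for day, score in self.points2[entry].items()
                   if start < day <= today)


store = PopularityStore()
store.record_click("FastStone Capture", "10.0.0.1")  # Website click
store.record_click("FastStone Capture", "10.0.0.1")  # Download click, same visitor
store.record_vote("FastStone Capture", rocks=True)
store.consolidate(date(2011, 10, 1))
print(store.popularity("FastStone Capture", date(2011, 10, 1)))
```

Passing a very large window_days approximates the previous all-time behaviour.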

So the only "neg" score available is the "sucks" vote. (The "appscore" table does not record the "sucks" vote. It is merely a record of all the "plus" votes by registered users of each entry).

Also, the "voting score" is overwhelmed by the "activity score". Very few anonymous browsers vote. For example, at this moment, the "activity score" is in the thousands, while the "voting score" is in the tens.

Due to the negligible presence of "neg" score in the current system, I suspect the formula won't make much of a difference. We can increase the weights assigned to registered users, but unless it's some ridiculous number, they will be overwhelmed by the "activity score" of anonymous browsers.

The point of contention now, I think, is whether the list should be based on the score over the entire range, or just the past x days (or some exponentially decreasing window applied to the scores, so older scores have lower weights).

My current thinking is maybe by having two lists i.e. recent popular titles + all time favorites, as some of you have suggested, this debate can be somewhat resolved. The reason is I don't think a single list can cater to both recent scores and perpetual scores simultaneously.


PostPosted: Sat Oct 01, 2011 11:01 am
Joined: Sun Jul 23, 2006 10:45 am
Posts: 52
I'm still not clear on the concept. Should we be re-upvoting for programs we have previously up-voted?


PostPosted: Sat Oct 01, 2011 11:21 am
Joined: Sat Jul 31, 2010 1:19 am
Posts: 1101
Location: Helsinki, Finland
Andrew Lee wrote:
My current thinking is maybe by having two lists i.e. recent popular titles + all time favorites, as some of you have suggested, this debate can be somewhat resolved. The reason is I don't think a single list can cater to both recent scores and perpetual scores simultaneously.


IMHO, the "all time favorites" list isn't interesting at all. It's way too constant.

flector wrote:
I'm still not clear on the concept. Should we be re-upvoting for programs we have previously up-voted?


No.

http://www.portablefreeware.com/forums/viewtopic.php?p=39412#p39412

_________________
My YouTube channel | Release date of my sixth playlist: December 12, 2013


PostPosted: Sat Oct 01, 2011 12:04 pm
Joined: Sun Jul 23, 2006 10:45 am
Posts: 52
You have a point with "all time favorites."

But as a freeware author, I've been looking at the popularity ratings of my software for years. Those ratings are now in the toilet.


PostPosted: Sun Oct 02, 2011 5:34 pm
Joined: Tue Mar 09, 2010 7:36 pm
Posts: 197
Andrew Lee wrote:
The point of contention now, I think, is whether the list should be based on the score over the entire range, or just the past x days (or some exponentially decreasing window applied to the scores, so older scores have lower weights).

Have you considered using exponential moving averages?

http://en.wikipedia.org/wiki/Exponentia ... ng_average

I use this to indicate trends on how fast server hard disks are filling up where I work. After some tweaking it works quite well.
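As an illustration of the suggestion, here is a minimal Python sketch of an exponential moving average over a list of daily scores, oldest first. The smoothing factor alpha and the function name are assumptions; nobody in the thread proposes a specific value.

```python
def exponential_moving_average(daily_scores, alpha=0.1):
    """EMA of daily scores ordered oldest to newest.

    alpha is the weight given to each new day (0 < alpha <= 1);
    a larger alpha forgets old scores faster. The value is an assumption.
    """
    if not daily_scores:
        return 0.0
    ema = float(daily_scores[0])
    for score in daily_scores[1:]:
        ema = alpha * score + (1 - alpha) * ema
    return ema


old_spike = [10, 0, 0, 0]     # popular a few days ago
recent_spike = [0, 0, 0, 10]  # popular today
print(exponential_moving_average(old_spike, alpha=0.5))
print(exponential_moving_average(recent_spike, alpha=0.5))
```

Both series have the same total, but the recent spike ends with the higher average, which is the "older scores have lower weights" behaviour discussed above.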


PostPosted: Tue Oct 04, 2011 12:05 pm
Joined: Mon Mar 19, 2007 8:55 am
Posts: 993
Location: Italy
Now the current solution is clear. The idea of using activity in the rating is good.

I think you could use a unified formula like my solution to get a more accurate rating (giving one weight to the activity score, one to the unregistered voting score, and one to the registered voting score). It may resolve the "problem" of points being overwhelmed, because each aspect's importance has its own percentage. For example, you could evaluate popularity this way:
Code:
Popularity = 70 * (ActivityScore / MaxActivityScore) + 30 * (PositiveVotes / TotalVotes)

In ActivityScore you could add +1 for a "Website" click and +2 for a "Download" click (whichever solution you end up using, I think a Download click is more important than a Website click).
MaxActivityScore is evaluated by checking all ActivityScore counters once per day.
This solution also needs separate counters for the positive and negative votes of each app. But you could consider adding registered and unregistered votes together, giving +1 to unregistered users and +2 * UserLevel to registered users.
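A short Python sketch of this proposal, using the 70/30 split and the click and vote weights given above. The function names are hypothetical, and treating an app with no votes or no activity as contributing 0 for that part is an assumption.

```python
def activity_score(website_clicks, download_clicks):
    """+1 per Website click, +2 per Download click."""
    return 1 * website_clicks + 2 * download_clicks


def vote_weight(user_level=None):
    """+1 for an unregistered voter, +2 * UserLevel for a registered one."""
    return 1 if user_level is None else 2 * user_level


def popularity(activity, max_activity, positive_votes, total_votes):
    """70 * (ActivityScore / MaxActivityScore) + 30 * (PositiveVotes / TotalVotes)."""
    activity_part = activity / max_activity if max_activity else 0.0
    vote_part = positive_votes / total_votes if total_votes else 0.0
    return 70 * activity_part + 30 * vote_part


# One app: 10 Website clicks, 5 Download clicks; the busiest app scored 40.
# Votes: one positive from a level-1 registered user, one positive and one
# negative from unregistered users.
activity = activity_score(10, 5)
positive = vote_weight(1) + vote_weight()
total = positive + vote_weight()
print(popularity(activity, 40, positive, total))
```

Because both parts are normalised to the 0..1 range before weighting, thousands of clicks can no longer drown out a handful of votes; the 70/30 split decides their relative importance instead.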

Another good improvement could be to add a simple time factor, for example giving different weights to scores from different dates (I have something in mind, but I'll avoid writing too long a message now).

Alternatively, keeping the current rating solution, I think these aspects could be improved:
1. give different weights to Download and Website clicks (as described above)
2. give a bigger weight to the "voting score" (or possibly keep votes permanently, not only the last 30 days' votes)
3. offer the two scores you proposed (last 30 days and all time)


PostPosted: Wed Oct 05, 2011 12:59 am
Joined: Sat Feb 04, 2006 9:19 am
Posts: 1932
@Hydaral: My point is, I don't think using exponential moving averages will eliminate the need for two separate lists. Right?

@Lupo73: Some of your suggestions can be readily implemented (e.g. different scores for website and download). Others will need considerably more work and changes, which I will keep in view (KIV) until I have sorted the current situation out.

I think we need to discuss whether we can make do with only one list, or whether we need two. If we have two, what should be the popularity score for an app (past x days, or entire range)?

More to the point, would using an exponential moving average eliminate the need for two lists? If so, what is the ideal formula for the window?


PostPosted: Wed Oct 05, 2011 8:21 pm
Joined: Tue Mar 09, 2010 7:36 pm
Posts: 197
What about using an EMA over the last x days (30?) and then adding a percentage of the total votes for that app (5%?)? The EMA will favor new apps that have been voted up a lot recently, and the percentage of the total will add weight to all-time popular apps.
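A rough Python sketch of this combined score, using the 30-day window and the 5% share from the post as defaults. The smoothing factor alpha and the function name are assumptions, and "total votes" is read here as the app's all-time score, which is only one possible interpretation.

```python
def hybrid_score(daily_scores, all_time_total,
                 window=30, alpha=0.1, total_share=0.05):
    """EMA over the last `window` daily scores plus a share of the all-time total.

    daily_scores is ordered oldest to newest; alpha is an assumed
    smoothing factor.
    """
    recent = daily_scores[-window:]
    ema = 0.0
    if recent:
        ema = float(recent[0])
        for score in recent[1:]:
            ema = alpha * score + (1 - alpha) * ema
    return ema + total_share * all_time_total


# A long-standing favourite with steady traffic versus a newcomer
# that has done well for the last ten days:
veteran = hybrid_score([10] * 30, all_time_total=5000)
newcomer = hybrid_score([0] * 20 + [100] * 10, all_time_total=1000)
print(veteran, newcomer)
```

How the two kinds of apps rank against each other depends heavily on total_share and alpha, so these values would need tuning against real data before drawing conclusions.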


PostPosted: Thu Oct 06, 2011 5:36 am
Joined: Mon Mar 19, 2007 8:55 am
Posts: 993
Location: Italy
Parameters that can be considered:
- Registered user votes
- Unregistered user votes
- Download clicks (main activity)
- Website clicks (secondary activity)

Some considerations:
- a first doubt concerns the unified counter for activity and votes: the risk is that the more frequently a program is updated, the more popular it becomes (I think this is the reason for your limit on updates per month, but it may not be enough)
- a second doubt about the unified counter is that voting for an app loses its importance if it is overwhelmed by the activity score (so a separate parameter could give votes much more relevance and encourage users to vote)
- another consideration is that the current solution of separate counters for registered and unregistered users is not very useful; a unified solution could be studied for them (possibly keeping the ability to see the preferences of other registered users)

After these considerations, my opinion is that a good solution could be two Popularity Scores:
1. Rating Score without a time limit and unified for registered and unregistered users
2. Activity Score with a time limit (e.g. 30 days) or without it (using EMA formula)

For the first counter you could use the Wilson formula and give different weights to Registered and Unregistered users (for example +1 to unregistered, +2*level to registered).
For the second counter you need to decide the formula and give different weights to Download and Website clicks (for example +1 to Website, +3 to Download).
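The two-score idea can be sketched in Python as follows. It assumes the "Wilson formula" means the Wilson score lower bound at about 95% confidence (z = 1.96) and simply feeds the weighted vote totals into it; the function names and the vote representation are hypothetical.

```python
import math


def wilson_lower_bound(positive, total, z=1.96):
    """Wilson score lower bound; z = 1.96 is an assumed confidence level."""
    if total == 0:
        return 0.0
    p = positive / total
    z2 = z * z
    centre = p + z2 / (2 * total)
    margin = z * math.sqrt((p * (1 - p) + z2 / (4 * total)) / total)
    return (centre - margin) / (1 + z2 / total)


def rating_score(votes):
    """votes: iterable of (is_positive, user_level); level None = unregistered.

    Unregistered votes weigh 1, registered votes weigh 2 * level.
    """
    positive = total = 0
    for is_positive, level in votes:
        weight = 1 if level is None else 2 * level
        total += weight
        if is_positive:
            positive += weight
    return wilson_lower_bound(positive, total)


def activity_score(website_clicks, download_clicks):
    """+1 per Website click, +3 per Download click."""
    return 1 * website_clicks + 3 * download_clicks


votes = [(True, 2), (True, None), (False, None)]
print(round(rating_score(votes), 3), activity_score(10, 5))
```

One caveat of this shortcut: a weighted vote is counted as several independent votes, so the Wilson bound becomes more confident than the real number of voters justifies.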
