I have been browsing this site for the answer but I'm still a little unsure how to plan a similar system in its database structure and implementation.

In PHP and MySQL it would be clear that some achievements are earned immediately (when a specialized action is taken, in SO case: Filled out all profile fields), although I know SO updates and assigns badges after a certain amount of time. With so many users & badges wouldn't this create performance problems (in terms of scale: high number of both users & badges).

So the database structure I assume would something as simple as:

Badges     |    Badges_User      |    User
----------------------------------------------
bd_id      |    bd_id            |  user_id
bd_name    |    user_id          |  etc
bd_desc    |    assigned(bool)   |  
           |    assigned_at      |

But as some people have said it would be better to have an incremental style approach so a user who has 1,000,000 forum posts wont slow any function down.

Would it then be another table for badges that could be incremental or just a 'progress' field in the badges_user table above?

Thanks for reading and please focus on the scalability of the desired system (like SO thousands of users and 20 to 40 badges).

EDIT: to some iron out some confusion I had assigned_at as a Date/Time, the criteria for awarding the badge would be best placed inside prepared queries/functions for each badge wouldn't it? (better flexibility)

Comments

are some more popular badges checked more than others as well?

Written by bluedaniel

Could you clarify the need for "assigned(bool)"? Isn't it sort of redundant as you wouldn't have a mapping in the first case unless you are assigned the badge? And why would the number of forum posts matter?

Written by Fredrik

my first mistake! say you were assigning badges depending on the amount of posts made i.e rewarding high posts with good badges -> incentive to get users to interact

Written by bluedaniel

@Fredrik, agreed on "assigned", but "assigned_at" means you can track stuff like the 1,000,000 forum posts badge mentioned. He's basically tracking two kinds of badges in the one table.

Written by da5id

exactly, thats gotta be a bad idea hasnt it?

Written by bluedaniel

pedantically praps, but i think it quite clever :)

Written by da5id

@da5id: I still don't see why the "assigned_at" should make it any slower with a million posts unless he creates a record per post. But that would be a quite incorrect way of solving the problem. Keeping track of the number of posts (per area, topic, whatever) is just a matter of having counters somewhere, they will be equally expensive with one post as with a million posts.

Written by Fredrik

Addition to my last post, that assigned_at could be the counter but that would basically mean that all users get a badge mapping even if they have not earned it yet and that would scale quite bad with a lot of users and/or a lot of badges.

Written by Fredrik

@Fredrik, that's not what I meant, sorry for the confusion - see my answer below.

Written by da5id

@da5id:Gave you +1 for it but I think bluedaniel would benefit from a better description of tables, when they are updated and with what...

Written by Fredrik

@Fredfrick im unsure how the counter would be just as expensive, having to count all the posts by each user to see if they have earned the '1,000,000 badge' would be more expnesive than checking the counter wouldnt it?

Written by bluedaniel

And im unsure what to worry about more, size vs performance. ie big counting queries or lots of counters

Written by bluedaniel

@bluedaniel: There is no need to recount every post every time if you keep track of the number of qualifying posts. Each new post means that you spend a small amount of CPU increasing the counter. It doesn't matter if it is the second or millionth time.

Written by Fredrik

Sorry, off to a meeting or I'd expand my answer, but basically the idea is to not have to worry about either: Every time a user submits something to the site you increment the Submissions_User table in my answer - immediately after that you run the "event listener" and award a badge if applicable.

Written by da5id

...just remember to put indexes on the tables!

Written by da5id

@Fredfrick exactly so would it be fitting to have two tables, one for incremental badges and one for 'one off' achievements? also what about SO's style of running these queries to assign badges. each function would surely [check if they have the badge already] -> [check if they qualify] -> [assign the badge] but would they run for each user and each badge every few hours? good way or not?

Written by bluedaniel

@bludanniel: I'm not going to design it for you but it really isn't as hard to do as you make it sound. If a "one off" badge cold be "first post to be a favorite", just keep track of that and give the user the badge if it happens. If there is a badge assigned the first time a user gets 10 upvotes/post, keep track of the maximum number of upvotes/post so far. There is no need to check if the badge is assigned already if you only assign it when you reach a specific limit with a counter. It is late over here and I need to get up really early tomorrow so I can't help you more.

Written by Fredrik

thats fine, thanks for your help Fredrick

Written by bluedaniel

Accepted Answer

I think the structure you've suggested (without the "assigned" field as per the comments) would work, with the addition of an additional table, say "Submissions_User", containing a reference to user_id & an incrementing field for counting submissions. Then all you'd need is an "event listener" as per this post and methinks you'd be set.

EDIT: For the achievement badges, run the event listener upon each submission (only for the user making the submission of course), and award any relevant badge on the spot. For the time-based badges, I would run a CRON job each night. Loop through the complete user list once and award badges as applicable.

Written by da5id
This page was build to provide you fast access to the question and the direct accepted answer.
The content is written by members of the stackoverflow.com community.
It is licensed under cc-wiki