I can imagine this being a handy tool for subreddit mods: find blacklisted users parading under a new account. Unlike on Twitter or Facebook, measuring the similarity between users on Reddit is a bit harder. The official Reddit API gives us the following data to work with (at least, these are the endpoints that look promising):
- GET /api/v1/user/username/trophies
- GET /user/username/about
- GET /user/username/where
- → /user/username/submitted
- → /user/username/hidden (not accessible?)
- → /user/username/saved (not accessible?)
- GET /api/multi/user/username
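As a rough sketch of how the `/user/username/where` listings could be pulled, the snippet below builds the URL and unpacks the response. It assumes the public, unauthenticated `.json` variant of the endpoint on `www.reddit.com` and the usual listing envelope (`data` → `children` → `data`); the function names, the `limit` default and the User-Agent string are my own placeholders, not anything Reddit prescribes.

```python
import json
import urllib.request

# Assumption: the public JSON endpoint; an OAuth client would use a different host.
BASE_URL = "https://www.reddit.com"


def listing_url(username, where, limit=100):
    """Build the URL for GET /user/username/where as a JSON listing."""
    return f"{BASE_URL}/user/{username}/{where}.json?limit={limit}"


def extract_items(payload):
    """Unpack a listing payload into a plain list of item dicts.

    Assumes the envelope {"data": {"children": [{"data": {...}}, ...]}}.
    """
    return [child["data"] for child in payload["data"]["children"]]


def fetch_listing(username, where, user_agent="puppet-sketch/0.1"):
    """Fetch one page of a user's listing (e.g. where="submitted")."""
    request = urllib.request.Request(
        listing_url(username, where),
        headers={"User-Agent": user_agent},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_items(json.load(response))
```

Calling `fetch_listing("some_user", "submitted")` should then give one page of that user's submissions as dictionaries. Pagination and rate limiting are left out on purpose.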
A few assumptions about puppets, and possible features for spotting them:
- Age: They’re new, with the account barely a few days old. In my opinion this is not a very good indicator, though, since a user could have been using the alt account for a long time. It is a better feature for detecting trolls.
- Karma: A lot of controversial posts and links; again, a better indicator for trolls. I would rarely expect the karma of two accounts to match unless they’ve been equally active.
- Activity: The submission times of posts and links would be a good way to infer a user’s time zone. The days on which they post could reveal posting habits.
- Links they’ve submitted: a way to find similar interests.
- Upvoted and downvoted content: again, similar interests.
- Subs they post in: similar interests.
And finally, their literary style or fingerprint, which could include emojis and emoticons.
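To make the "subs they post in" feature a little more concrete, here is a minimal sketch that compares two accounts by the overlap of the subreddits they have posted in. Jaccard similarity is just one simple choice I am assuming here, not a settled method; the helper names are hypothetical, and the code assumes each item dictionary carries a `subreddit` field.

```python
def subreddits_of(items):
    """Collect the set of subreddit names from a list of item dicts.

    Assumes each item has a "subreddit" key; items without it are skipped.
    """
    return {item["subreddit"].lower() for item in items if "subreddit" in item}


def jaccard(a, b):
    """Jaccard similarity of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # two empty profiles tell us nothing, so score them as 0
    return len(a & b) / len(a | b)


# Toy data standing in for two accounts' submissions.
user_a = [{"subreddit": "Python"}, {"subreddit": "linux"}, {"subreddit": "AskReddit"}]
user_b = [{"subreddit": "python"}, {"subreddit": "linux"}, {"subreddit": "gaming"}]

similarity = jaccard(subreddits_of(user_a), subreddits_of(user_b))
```

A score near 1 means the two accounts post in almost the same subs, while a score near 0 means little overlap. On its own this would flag plenty of unrelated users with common interests, so it is only one signal among the features above.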
Breaking up the problem:
The individual problems I see are:
- Similar user interests
- Find the time zone of a user
- Troll detector
- Literary style of users
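For the time zone problem, one possible starting point is an hour-of-day activity profile built from submission timestamps. The sketch below assumes each item has a Unix timestamp in UTC (Reddit items expose one as `created_utc`); the function names and the choice of distance measure are my own illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_profile(timestamps):
    """Return a 24-element list: the fraction of posts made in each UTC hour."""
    counts = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).hour for ts in timestamps
    )
    total = sum(counts.values())
    if total == 0:
        return [0.0] * 24
    return [counts.get(hour, 0) / total for hour in range(24)]


def profile_distance(p, q):
    """Half the L1 distance between two profiles.

    0.0 means identical posting hours, 1.0 means no overlapping hours at all.
    """
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))


def quietest_hour(profile):
    """UTC hour with the least activity; a crude hint at when the user sleeps."""
    return min(range(24), key=lambda hour: profile[hour])
```

Two accounts with a small `profile_distance` keep similar hours, which is consistent with (but far from proof of) a shared time zone. With only a handful of posts the profile is too sparse to mean much.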
I’ll start with literary fingerprints for my next blog post in this series.