Thursday, February 20, 2020

The Degree of Anonymity


We are often faced with the challenge of transparency. How transparent should we be on the internet so that we don’t give away too much information and still get that free account on Spotify or a date on Tinder?


In the article Social Networking Virtues, Vallor, a philosopher at Santa Clara University, points out that "anonymity has low entry and exit barriers as compared to social environments (online and offline) but is heavily correlated with less result". Hence, we can picture a linear relationship between transparency and data concern, combined with what we are trying to achieve on the internet, be it ordering a perfume for a friend or surfing through HBO for free. But what is the Optimal Point? Maybe it varies from site to site, or maybe it depends on each individual and how worried they are about their data privacy.

Transparency vs Data Concern and Result

Netflix unintentionally stumbled upon this topic back in 2006 when it rolled out a $1M prize for a contest that challenged entrants to improve its recommendation algorithm by 10%. The Netflix Prize was an open competition for the best collaborative filtering algorithm to predict ratings for films, based on the data Netflix collected from its users. Netflix provided a data set of ratings for over 17,000 movies, generated by 480,189 anonymous users, without any other information about the users or films.

Winners of Netflix Prize
Netflix thought the mere data set it provided was anonymous enough but, in 2007, two researchers from The University of Texas at Austin were able to de-anonymize a user’s identity if that user had also left movie ratings at another site, such as IMDB. As a result, the Prize has been criticized by privacy advocates like Paul Ohm, a law professor at Georgetown University, who claim that Netflix gave away private information by focusing more on improving its software than on securing its customers' information.
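
To make the idea concrete, here is a toy Python sketch of how matching an "anonymous" data set against public ratings could work. It is only an illustration under my own assumptions, not the researchers' actual method: the user IDs, profile names, movies, ratings, and the `min_matches` threshold are all made up, and a real attack would have to cope with far larger and noisier data.

```python
# Toy illustration only: every ID, name, movie, and rating below is made up.

def overlap_score(anon_ratings, public_ratings):
    """Count the movies that both records rated with the same score."""
    return sum(
        1
        for movie, rating in anon_ratings.items()
        if public_ratings.get(movie) == rating
    )


def link_records(anonymous, public, min_matches=3):
    """Guess which public profile, if any, matches each anonymous ID."""
    links = {}
    for anon_id, anon_ratings in anonymous.items():
        scores = {
            name: overlap_score(anon_ratings, ratings)
            for name, ratings in public.items()
        }
        best_name = max(scores, key=scores.get)
        # Only claim a link when enough ratings line up (hypothetical threshold).
        if scores[best_name] >= min_matches:
            links[anon_id] = best_name
    return links


# "Anonymous" release: IDs only, no names.
anonymous = {
    "user_1001": {"Movie A": 5, "Movie B": 2, "Movie C": 4, "Movie D": 1},
    "user_1002": {"Movie A": 3, "Movie E": 5},
}
# Ratings the same people posted publicly on another site.
public = {
    "alice_reviews": {"Movie A": 5, "Movie B": 2, "Movie C": 4},
    "bob_reviews": {"Movie A": 3, "Movie F": 4},
}

print(link_records(anonymous, public))
```

In this made-up example only `user_1001` gets linked, because three of that user's ratings line up with one public profile, while `user_1002` shares just a single rating with anyone and stays unlinked. The more ratings someone posts publicly, the easier the match becomes.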

Companies like Netflix, Facebook and Google are constantly releasing "anonymous" data to the public in order to improve their software by allowing the public to build upon it, but this often leads to leaks of information. I’m sure these companies want to keep their users' data as anonymous as they can, to avoid economic ramifications and lawsuits while still getting the desired result. However, they seem to have no idea where the Optimal Point between transparency and result lies. Does anyone know how anonymous we should be on the internet to get the outcome we want? The irony is that it won’t be long until a startup becomes a tech giant by guiding people on how transparent they should be, and then accidentally ends up using and disclosing the data of its clients.

1 comment:

  1. I really enjoyed your post! I didn't notice a ton of differences between this post and the original, but I thought your discussion on anonymity and companies' responsibilities with user data was really interesting. I especially liked your example with Netflix, and it made me question what I considered "anonymous" to be, and what the ethical considerations are when "anonymous" user data could still be de-anonymized. Do you think there is a clear line between what data is ethical to release anonymously and what is too risky? In the Netflix case they were only able to recover a small portion of users' identities, and these users had already made their information public on another site, so is the problem that urgent?
