Friday, February 21, 2020

Your Hospital Records Might Be Public

Much of your private data is available online, but you probably already know that. Google is tracking your every move, Facebook is selling your data, and your car may even be monitoring you now. OK, but companies keep this information private, right? At the very least, you should be able to change your privacy settings, or even stop using the services altogether ... right? Well, that is not an option for patients of the health care industry (which is basically everyone). Health care providers have been publicly releasing patient records for use in academic research, and these records are very useful for developing machine learning algorithms. One was even able to diagnose cancer.

Even with the upsides, releasing health records seems like a major privacy concern. Philip Brey, philosophy professor and president of the International Association for Ethics and Information Technology, describes ethical approaches to emerging technologies in his paper "Anticipating Ethical Issues." One approach that he describes focuses only on issues that can be predicted reliably. He states that these issues correlate with "characteristics inherent to the technology." For machine learning, one such characteristic is the need for large datasets. To develop machine learning algorithms for use in healthcare, researchers need access to healthcare records. Since this ethical issue was easily predictable, health care providers decided to anonymize the data. Basically, they would take the data and remove some of the information that could be used to easily identify the people in the records.

Before data is anonymized

Anonymized data
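
To make the idea concrete, here is a minimal sketch of that kind of anonymization in Python. The records, the field names, and the choice of which fields count as direct identifiers are all made up for illustration; this is not the procedure any particular hospital uses.

```python
# Hypothetical hospital records; every name and value here is invented.
records = [
    {"name": "Alice Smith", "ssn": "000-00-0001", "zip": "02138",
     "birth_date": "1961-03-14", "sex": "F", "diagnosis": "asthma"},
    {"name": "Bob Jones", "ssn": "000-00-0002", "zip": "02139",
     "birth_date": "1975-11-02", "sex": "M", "diagnosis": "diabetes"},
]

# Assumption: these are the fields that identify a person directly.
DIRECT_IDENTIFIERS = {"name", "ssn"}


def anonymize(record):
    """Return a copy of the record without the direct identifiers."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


anonymized = [anonymize(r) for r in records]
for row in anonymized:
    print(row)
```

Notice that ZIP code, birth date, and sex are still in the output. They look harmless on their own, which is why they are often left in.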

At first glance this seems very secure. Reading the anonymized table, it doesn't seem possible to match up the disease with the person. With powerful computing, however, it is becoming very possible. In 1997, researcher Latanya Sweeney was able to identify Massachusetts Governor William Weld's hospital records in a dataset released for research purposes. She did this by piecing together the hospital records with voter rolls from the city of Cambridge, MA.
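
Piecing two datasets together like this is essentially a join on the fields they share. The sketch below is a rough illustration only: the rows are invented, and it assumes the shared fields are ZIP code, birth date, and sex. The helper names `key` and `link` are hypothetical.

```python
# Hypothetical "anonymized" hospital rows: names removed, other fields kept.
hospital = [
    {"zip": "02138", "birth_date": "1961-03-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_date": "1975-11-02", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02139", "birth_date": "1980-06-30", "sex": "F", "diagnosis": "flu"},
]

# Hypothetical public voter roll: names present, no medical data.
voters = [
    {"name": "Alice Smith", "zip": "02138", "birth_date": "1961-03-14", "sex": "F"},
    {"name": "Bob Jones", "zip": "02139", "birth_date": "1975-11-02", "sex": "M"},
    {"name": "Carol Lee", "zip": "02140", "birth_date": "1990-01-01", "sex": "F"},
]

# Assumption: these fields appear in both datasets.
SHARED_FIELDS = ("zip", "birth_date", "sex")


def key(row):
    """Build the tuple of shared fields used to match rows."""
    return tuple(row[f] for f in SHARED_FIELDS)


def link(hospital_rows, voter_rows):
    """Pair a hospital row with a voter when exactly one voter shares its key."""
    names_by_key = {}
    for voter in voter_rows:
        names_by_key.setdefault(key(voter), []).append(voter["name"])
    matches = []
    for row in hospital_rows:
        names = names_by_key.get(key(row), [])
        if len(names) == 1:
            matches.append((names[0], row["diagnosis"]))
    return matches


reidentified = link(hospital, voters)
print(reidentified)
```

In this toy data, two of the three "anonymous" patients get their names back, even though no name ever appeared in the hospital table.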

To prevent data re-identification, companies decided to generalize the data even more before releasing it, but this comes at a cost. The more generalized the data is, the less useful it is for machine learning. It is unethical to release sensitive data that could be traced back to specific people, but we need to find a middle ground that protects the people in the data while still releasing impactful data for machine learning research.
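
The trade-off can be sketched in a few lines of Python. This is only an illustration with invented rows. It assumes generalizing means truncating the ZIP code to three digits and the birth date to a year, and it measures protection as the size of the smallest group of rows that look identical (the idea behind what is often called k-anonymity).

```python
from collections import Counter

# Hypothetical rows with direct identifiers already removed.
rows = [
    {"zip": "02138", "birth_date": "1961-03-14", "sex": "F", "diagnosis": "asthma"},
    {"zip": "02139", "birth_date": "1961-09-01", "sex": "F", "diagnosis": "flu"},
    {"zip": "02139", "birth_date": "1975-11-02", "sex": "M", "diagnosis": "diabetes"},
    {"zip": "02134", "birth_date": "1975-02-20", "sex": "M", "diagnosis": "asthma"},
]


def generalize(row):
    """Coarsen ZIP to its first three digits and birth date to the year."""
    return {
        "zip": row["zip"][:3] + "**",
        "birth_date": row["birth_date"][:4],
        "sex": row["sex"],
        "diagnosis": row["diagnosis"],
    }


def smallest_group(table):
    """Size of the smallest group of rows sharing ZIP, birth date, and sex."""
    groups = Counter((r["zip"], r["birth_date"], r["sex"]) for r in table)
    return min(groups.values())


generalized = [generalize(r) for r in rows]
print(smallest_group(rows))         # before generalizing
print(smallest_group(generalized))  # after generalizing
```

A larger smallest group means each person hides among more look-alikes, but the generalized table no longer says exactly where anyone lives or how old they are, which is the detail a model might have needed.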

4 comments:

  1. With privacy concerns in other areas of tech like social media, the user always has the option to opt out of using the product if they're concerned with privacy. However, when you need medical intervention there really is no opt-out option. The intro of this post is really interesting, and I liked that you took a pretty balanced approach to public medical records. You do a good job of evaluating the pros and cons. You also do a good job of relating the topic to the Brey reading.

  2. I like the structure of your post. Starting with a question draws the reader into the post and ending with an action gives the reader something to think about. I also liked the use of images and links. Both features added a lot of value to the article without being too distracting or irrelevant. Your introduction and explanation of the Brey article was great because it was so detailed. I think it would be clear to any reader outside of SI410. My only suggestion would be to avoid certain "fluff" words like "very" and "basically". Pulling these words would make your argument clearer and more concise.

  3. This post does a good job of drawing the reader in. However, I was a bit confused by the constant shift in voice. I wasn't sure whether you were in support of or against the use of data in medical records. I think that something interesting to integrate into this post would be an on-balance comparison between data being used in research vs. data not being used in research. Is saving lives worth the possibility that one's medical records could be found out? Also, is there a way to have the good without the bad? Overall, I think this is a thought-provoking article, so good job.

  4. Hi Taran, this is an interesting topic, as our private information is constantly being shared through the internet. The structure of the post is very informative as it walks through the issue for the readers, which makes it very easy to understand. Also, the pictures that you incorporated helped make your point clear. Overall, nice post!

