Posts tagged ‘ethics’

In Silicon Valley, Great Power but No Responsibility

I saw a tweet today that gave me a lot to think about:

.

A rather intricate example of social adaptation to technology. If I understand correctly, the cousins in question are taking advantage of the fact that liking someone’s status/post on Facebook generates a notification for the poster that remains even if the post is immediately unliked. [1]

What’s humbling is that such minor features have the power to affect so many, and so profoundly. What’s scary is that the feature is so fickle. If Facebook starts making updates available through a real-time API, like Google Buzz does, then the ‘like’ will stick around forever on some external site and users will be none the wiser until something goes wrong. Similar things have happened: a woman was fired because sensitive information she put on Twitter and then deleted was cached by an external site. I’ve written about the privacy dangers of making public data “more public”, including the problems of real-time APIs. [2]

As complex and fascinating as the technical issues are, the moral challenges interest me more. We’re at a unique time in history in terms of technologists having so much direct power. There’s just something about the picture of an engineer in Silicon Valley pushing a feature live at the end of a week, and then heading out for some beer, while people halfway around the world wake up and start using the feature and trusting their lives to it. It gives you pause.

This isn’t just about privacy or just about people in oppressed countries. RescueTime estimates that 5.3 million hours were spent worldwide on Google’s Les Paul doodle feature. Was that a net social good? Who is making the call? Google has an insanely rigorous A/B testing process to optimize between 41 shades of blue, but do they have any kind of process in place to decide whether to release a feature that 5.3 million hours—eight lifetimes—are spent on?

For the first time in history, the impact of technology is being felt worldwide and at Internet speed. The magic of automation and ‘scale’ dramatically magnifies effort and thus bestows great power upon developers, but it also comes with the burden of social responsibility. Technologists have always been able to rely on someone else to make the moral decisions. But not anymore—there is no ‘chain of command,’ and the law is far too slow to have anything to say most of the time. Inevitably, engineers have to learn to incorporate social costs and benefits into the decision-making process.

Many people have been raising awareness of this—danah boyd often talks about how tech products make a mess of many things: privacy for one, but social nuances in general. And recently at TEDxSiliconValley, Damon Horowitz argued that technologists need a moral code.

But here’s the thing—and this is probably going to infuriate some of you—I fear that these appeals are falling on deaf ears. Hackers build things because it’s fun; we see ourselves as twiddling bits on our computers, and generally don’t even contemplate, let alone internalize, the far-away consequences of our actions. Privacy is viewed in oversimplified access-control terms and there isn’t even a vocabulary for a lot of the nuances that users expect.

The ignorant are at least teachable, but I often hear a willful disdain for moral issues. Anything that’s technically feasible is seen as fair game and those who raise objections are seen as incompetent outsiders trying to rain on the parade of techno-utopia. The pronouncements of executives like Schmidt and Zuckerberg, not to mention the writings of people like Arrington and Scoble who in many ways define the Valley culture, reflect a tone-deaf thinking and a we-make-the-rules-get-over-it attitude.

Something’s gotta give.

[1] It’s possible that the poster is talking about Twitter, and by ‘like’ they mean ‘favorite’. This makes no difference to the rest of my arguments; if anything it’s stronger because Twitter already has a Firehose.

[2] Potential bugs are another reason that this feature is fickle. As techies might recognize, ensuring that a like doesn’t show up after an item is unliked maps to the problem of update propagation in a distributed database, which the CAP theorem proves is hard. Indeed, Facebook often has glitches of exactly this sort—you might notice it because a comment notification shows up and the comment doesn’t, or vice versa, or different people see different like counts, etc.

[ETA] I see this essay as somewhat complementary to my last one on how information technology enables us to be more private contrasted with the ways in which it also enables us to publicize our lives. There I talked about the role of consumers of technology in determining its direction; this article is about the role of the creators.

[Edit 2] Changed the British spelling ‘wilful’ to American.

Thanks to Jonathan Mayer for comments on a draft.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter.

June 11, 2011 at 7:33 am 13 comments

Is Making Public Data “More Public” a Privacy Violation?

What on earth does more public mean? Technologists draw a simple distinction between data that is public and data that is not. Under this view, the notion of making data more public is meaningless. But common sense tells us otherwise: it’s hard to explain the opposition to public surveillance if you assume that it’s OK to collect, store and use “public” information indiscriminately.

There are entire philosophical theories devoted to understanding what one can and cannot do with public data in different contexts. Recently, danah boyd argued in her SXSW keynote in support of “privacy through obscurity” and how technology is destroying this comfort. According to boyd, most public data is “quasi-public” and technologists don’t have the right to “publicize” it.

Some examples. One can debate the point in the abstract, but there is no question that companies and individuals have repeatedly been bitten when applying the “it’s already public” rule. Let’s look at some examples (the list and the discussion is largely concerned with data on the web).

  1. The availability of the California Birth Index on the web caused considerable consternation about a decade ago, despite the fact that birth records in the state are public and anyone’s birth record can be obtained through official channels albeit in a cumbersome manner.
  2. IRSeek planned to launch a search engine for IRC in 2007 by monitoring and indexing public channels (chatrooms). There was a predictable privacy outcry and they were forced to shut down.
  3. The Infochimps guys crawled the Twitter graph back in 2008 and posted it on their site. Twitter forced them to take the dataset down.
  4. The story was repeated with Pete Warden and Facebook; this time it was nastier and involved the threat of a lawsuit.
  5. MySpace recently started selling user data in bulk on Infochimps. As MySpace has pointed out, the data is already public, but privacy concerns have nevertheless been raised.
  6. One reason for the backlash against Google Buzz was auto-connect: it connected your activity on Google Reader and other services and streamed it to your friends. Your Google Reader activities were already public, but Buzz took it further by broadcasting it.
  7. Spokeo is facing similar criticism. As Snopes explains, “Spokeo displays listings that sometimes contain more personal information than many people are comfortable having made publicly accessible through a single, easy-to-use search site.”

The latter four examples are all from the last couple of months. For some reason the issue has suddenly started cropping up all the time. The current situation is bad for everyone: data trustees and data analysts have no clear guidelines in place, and users/consumers are in a position of constantly having to fight back against a loss of privacy. We need to figure out some ground rules to decide what uses of public data on the web are acceptable.

Why not “none?” I don’t agree with a blanket argument against using data for purposes other than originally intended, for many reasons. The first is that users’ privacy expectations, when they go beyond the public/private dichotomy, are generally poorly articulated, frequently unreasonable and occasionally self-contradictory. (An unfortunate but inevitable consequence of the complexity of technology.) The second reason is that these complex privacy rules, even if they can be figured out, often need to be communicated to the machine.

The third reason is the “greater good.” I’ve opposed that line of reasoning when used to justify reneging on an explicit privacy promise. But when it comes to a promise that was never actually made but merely intuitively understood (or mis-understood) by users, I think the question is different, and my stance is softer. Privacy needs to be weighed against the benefit to society from “publicizing” data — disseminating, aggregating and analyzing it.

In the next article of this series, I will give a rigorous technical characterization of what constitutes publicizing data. My hope is that this will go a long way towards determining what is and is not a violation of privacy. In the meanwhile, I look forward to hearing different opinions.

Thanks to Pete Warden and Vimal Jeyakumar for comments on a draft.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter.

April 5, 2010 at 6:11 pm 13 comments

Is Anonymity Research Ethical?

A researcher who is working on writing style analysis (“stylometry”), after reading my post on related de-anonymization techniques, wonders what the positive impact of such research could be, given my statement that the malicious uses of the technology are far greater than the beneficial ones. He says:

Sometimes when I’m thinking of an interesting research topic it’s hard to forget the Patton Oswalt line “Hey, we made cancer airborne and contagious! You’re welcome! We’re science: we’re all about coulda, not shoulda.”

This was my answer:

To me, generic research on algorithms always has a positive impact (if you’re breaking a specific website or system, that’s a different story; a bioweapon is a whole different category.) I do not recognize a moral question here, and therefore it does not affect what I choose to work on.

My belief that the research will have a positive impact is not at odds with my belief that the uses of the technology are predominantly evil.  In fact, the two are positively correlated. If we’re talking about web search technology, if academics don’t invent it, then (benevolent) companies will. But if we’re talking about de-anonymization technology, if we don’t do it, then malevolent entities will invent it (if they haven’t already), and of course, keep it to themselves. It comes down to a choice between a world where everyone has access to de-anonymization techniques, and hopefully defenses against it, versus one in which only the bad guys do. I think it’s pretty clear which world most people will choose to live in.

I realize I lean toward the “coulda” side of the question of whether Science is—or should be—amoral. Someone like Prof. Benjamin Kuipers here at UT seems to be close to the other end of the spectrum: he won’t take any DARPA money.

Part of the problem with allowing morality to affect the direction of science is that it is often arbitrary. The Patton Oswalt quote above is a perfect example: he apparently said that in response to news of science enabling a 63 year old woman to give birth. The notion that something is wrong simply because it is not “natural” is one that I find most repugnant. If the freedom of a 63 year old woman to give birth is not an important issue to you, let me note that more serious issues such as stem cell research, that could save lives, fall under the same category.

Going back to anonymity, it is interesting that tools like Tor face much criticism, but for enabling the anonymity of “bad” people rather than breaking the anonymity of “good” people. Who is to be the arbiter of the line between good and bad? I share the opinion of most techies that Tor is a wonderful thing for the world to have.

There are many sides to this issue and many possible views. I’d love to hear your thoughts.

April 9, 2009 at 8:42 pm 8 comments


About 33bits.org

I'm an assistant professor in computer science at Princeton. I research (and teach) information privacy and security, and moonlight in technology policy.

This is a blog about my research on breaking data anonymization, and more broadly about information privacy, law and policy.

For an explanation of the blog title and more info, see the About page.

Subscribe

Be notified when there's a new post — subscribe to the feed, follow me on Google+ or twitter or use the email subscription box below.

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Join 155 other followers


Follow

Get every new post delivered to your Inbox.

Join 155 other followers