Posts tagged ‘academia’

Academic publishing as (ruinous) competition: Is there a way out?

Aaron Johnson invited me to speak as part of a panel on academic publishing at PETS 2013. This is a rough transcript of my talk, written from memory.

Aaron mentioned he was looking for one more speaker for this panel, so that we could hear the view of someone naive and inexperienced, and asked if I was available. I said, “Great, I do that every day!” So that will be the tone of my comments today. I don’t have any concrete proposals that can be implemented next year or in two years. Instead these are blue-sky thoughts on how things could work someday and hopeful suggestions for moving in that direction. [1]

I just finished my first year as a faculty member at Princeton. It’s still a bit surreal. I wasn’t expecting to have an academic career. In fact, back in grad school, especially the latter half, whenever someone asked me what I wanted to do after I graduated, my answer always was, “I don’t know for sure yet, but there’s one career I’m sure I don’t want — academia.”

I won’t go into the story of why that was and how it changed. But it led to some unusual behavior. I ranted a lot about academia on Twitter, as Aaron already mentioned when he introduced me. Also, many times I “published” stuff by putting up a blog post. For instance I had a series of posts on the ability of a malicious website to deanonymize visitors (1, 2, 3, 4, 5, 6). People encouraged me to turn it into a paper, and I could have done that without much extra effort. But I refused, because my primary goal was to quickly disseminate the information, and I felt my blog posts had accomplished that adequately. True, I wouldn’t get academic karma, but why would I care? I wasn’t going to be an academic!

When I eventually decided I wanted to apply for academic positions, I talked to a professor whose opinion I greatly respected. He expressed skepticism that I’d get any interviews, given that I’d been blogging instead of writing papers. I remember thinking, “oh shit, I’ve screwed up my career, haven’t I?” So I feel extremely lucky that my job search turned out successfully.

At this point a sane person would have decided to quit while they were ahead, and start playing the academic game. But I guess sanity has never really been one of my strong points. So in the last year I’ve been thinking a lot about what the process of research collaboration and publishing would look like if we somehow magically didn’t have to worry at all about furthering our individual reputations.

Polymath

Something that’s very close to my ideal model of collaboration is the Polymath project. I was fascinated when I heard about it a few years ago. It was started by mathematician Tim Gowers in a blog post titled “Is massively collaborative mathematics possible?” [2] He and Terry Tao are the leaders of the project. They’re among the world’s top mathematicians. There have been several of these collaborations so far and they’ve been quite successful, solving previously open math problems. So I’ve been telling computer scientists about these efforts and asking if our community could produce something like this. [3]

To me there are three salient aspects of Polymath. The first is that the collaboration happens online, in blog posts and comments, rather than phone or physical meetings. When I tell people this they are usually enthusiastic and willing to try something like that. The second aspect is that it is open, in that there is no vetting of participants. Now people are a bit unsure, and say, “hmm, what’s the third?” Well, the third aspect is that there’s no keeping score of who contributed what. To which they react, “whoa, whoa, wait, what??!!”

I’m sure we can all see the problem here. Gowers and Tao are famous and don’t have to worry about furthering their careers. The other participants who contribute ideas seem to do it partly altruistically and partly because of the novelty of it. But it’s hard to imagine this process being feasible on a bigger scale.

Misaligned incentives

Let’s take a step back and ask why there’s this gap between doing good research and getting credit for it. In almost every industry, every human endeavor, we’ve tried to set things up so that the incentives for individuals and the broader societal goals of the activity align with each other. But sometimes individual incentives get misaligned with the societal goals, and that leads to problems.

Let’s look at a few examples. Individual traders play the stock market with the hope of getting rich. But at the same time, it helps companies hedge against risk and improves overall financial stability. At least that’s the theory. We’ve seen it go wrong. Similarly, copyright is supposed to align the desire of creators to make money with the goal of the maximum number of people enjoying the maximum number of creative works. That’s gotten out of whack because of digital technology.

My claim is that we’re seeing the same problem in academic research. There’s a metaphor that explains what’s going on in research really well, and to me it is the root of all of the ills that I want to talk about. And that metaphor is publishing as competition. What do I mean by that? Well, peer review is a contest. Succeeding at this contest is the immediate incentive that we as researchers have. And we hope that this will somehow lead to science that benefits humanity.

To be clear, I’m far from the first one to make this observation. Let me quote someone who’s much better qualified to talk about this. Oded Goldreich, I’m sure most of you know of him, has a paper titled “On Struggle and Competition in Scientific Fields.” Here’s my favorite quote from the paper. He’s talking about the flagship theory conferences.

Eventually, FOCSTOC may become a pure competition, defined as a competition having no aim but its own existence (i.e., the existence of a competition). That is, pure competitions serve no scientific purpose. Did FOCSTOC reach this point or is close to it? Let me leave this question open, and note that my impression is that things are definitely evolving towards this direction. In any case, I think we should all be worried about the potential of such an evolution.

I’m don’t know enough about the theory community to have an opinion on how big a problem this is. Still, I’m sure we can agree with the sentiment of the last sentence.

But here’s the very next paragraph. I think it gives us hope.

Other TOC conferences seem to suffer less from the aforementioned phenomena. This is mainly because they “count” less as evidence of importance (i.e., publications in them are either not counted by other competitions or their effect on these competitions is less significant). Thus, the vicious cycle described above is less powerful, and consequently these conferences may still serve the intended scientific purposes.

We see the same thing in the security and privacy community. Something I’ve seen commonly is a situation where you have a neat result, but nothing earth-shattering, and it’s not good enough as it is for a top tier venue. So what do you do? You pad it with bullshit and submit it, and it gets in. Another trend that this encourages is deliberately making a bad or inaccurate model so that you can solve a harder problem. But PETS publications and participants seem to suffer less from these effects. That’s why I’m happy to be discussing this issue with this group of people.

Paper as final output

It seems like we’re at an impasse. We can agree that publishing-as-competition has all these problems, but hiring committees and tenure committees need competitions to identify good research and good researchers. But I claim that publishing as competition fails even at the supposed goal of identifying useful research.

The reason for that is simple. Publishing as competition encourages or even forces viewing the paper as the final output. But it’s not! The hard work begins, not ends when the paper is published. This is unlike the math and theory communities, where the paper is in fact the final output. If publishing-as-competition is so bad for theory, it’s much worse for us.

In security and privacy research, the paper is the starting point. Our goal is not to prove theorems but to more directly impact the world in some way.  By creating privacy technologies, for example. For research to have impact, authors have to do a variety of things after publication depending on the nature of the research. Build technology and get people to adopt it. Explain the work to policymakers or to other researchers who are building upon it. Or even just evangelize your ideas. Some people claim that ideas should stand on their own merit and compete with other ideas on a level playing field. I find this quite silly. I lean toward the view expressed in this famous quote you’ve probably heard: “if your ideas are any good you’ll have to shove them down people’s throats.”

The upshot of this is that impact is heavily shortchanged in the publication-as-competition model. This is partly because of what I’ve talked about, we have no incentive to do any more work after getting the paper published. But an equally important reason is that the community can’t judge the impact of research at the point of publication. Deciding who “wins the prizes” at the point of publication, before the ideas have a chance to prove themselves, has disastrous consequences.

So I hope I’ve convinced you that publication-as-competition is at the root of many of our problems. Let me give one more example. Many of us like the publish-then-filter model, where reviews are done in the open on publicly posted papers with anyone being able to comment. One major roadblock to moving to this model is that it screws up the competition aspect. The worry is that papers that receive a lot of popular attention will be reviewed favorably, and so forth. We want papers to be reviewed on a level playing field. But if the worth of a paper can’t be judged at publication time, that means all this fairness is toward an outcome that is meaningless anyway. Do we still want to keep this model at all costs?

A way forward?

So far I’ve done a lot of complaining. Let me offer some suggestions now. I want to give two sets of suggestions that are complementary. The first is targeted at committees, whether tenure committees, hiring committees, award communities, or even program committees to an extent, and to the community in general. The second is targeted at authors.

Here’s my suggestion for committees and the community: we can and should develop ways to incentivize and measure real impact. Let me give you a four examples. I have more that I’d be happy to discuss later. First, retrospective awards. That is, “best paper from this conference 10 years ago” or some such. I’ve been hearing more about these of late, and I think that’s good news. The idea is that impact is easier to evaluate 10 years after publication.

Second, overlay journals. These are online journals that are a way of “blessing” papers that have already been published or made public. There is a lag between initial publication and inclusion in the overlay journal, and that’s a good thing. Recently the math community has come up with a technical infrastructure for running overlay journals. I’m very excited about this. [4]

There are two more that are related. These are specific to our research field. For papers that are about a new tool, I think we should look at adoption numbers as an important component of the review process. Finally, such papers should also have an “incentives” section or subsection. Because all too often we write papers that we imagine unspecified parties will implement and deploy, but it turns out there isn’t the slightest economic incentive for any company or organization to do so.

I think we should also find ways to measure contributions through blog posts and sharing data and code in publications. This seems more tricky. I’d be happy to hear suggestions on how to do it.

Next, this is what I want to say to authors: the supposed lack of incentives for nontraditional ways of publishing is greatly exaggerated. I say this from my personal experience. I said earlier that I was very lucky that my job search turned out well. That’s true, but it wasn’t all luck. I found out to my surprise that my increased visibility through blogging and especially the policy work that came out of it made a huge difference to my prospects. If I’d had three times as many publications and no blog, I probably would have had about the same chances. I’m sure some departments didn’t like my style, but there are definitely others that truly value it.

My Bitcoin experiment

I have one other personal experience to share with you. This is an experiment I’ve been doing over the last month or so. I’d been thinking about the possibility of designing a prediction market on top of Bitcoin that doesn’t have a central point of control. Some of you may know the sad story of Intrade. So I tweeted my interest in this problem, and asked if others had put thought into it. Several people responded. I started an email thread for this group, and we went to work.

12,000 words and several conference calls later, we’re very happy with where we are, and we’ve started writing a paper presenting our design. What’s even better is who the participants are — Jeremy Clark at Carleton, Joe Bonneau who did his Ph.D. with Ross Anderson and is currently at Google, and Andrew Miller at UMD who is Jon Katz’s Ph.D. student. All these people are better qualified to write this paper than I am. By being proactive and reaching out online, I was able to assemble and work with this amazing team. [5]

But this experiment didn’t go all the way. While I used Twitter to find the participants and was open to accepting anyone, the actual collaboration is being done through traditional channels. My original intent was to do it in public, but I realized quite early on that we had something publication-worthy and became risk-averse.

I plan to do another experiment, this time with the explicit goal of doing it in public. This is again a Bitcoin-related paper that I want to write. Oddly enough, there is no proper tutorial of Bitcoin, nor is there a survey of the current state of research. I think combining these would make a great paper. The nature of the project makes it ideal to do online. I haven’t figured out the details yet, but I’m going to launch it on my blog and see how it goes. You’re all welcome to join me in this experiment. [6]

So that’s basically what I wanted to share with you today. I think the current model of publication as competition has gone too far, and the consequences are starting to get ruinous. It’s time we put a stop to it. I believe that committees on one hand, and authors on the other both have the incentive to start changing things unilaterally. But if the two are combined, the results can be especially powerful. In fact, I hope that it can lead to a virtuous cycle. Thank you.

[1] Aaron didn’t actually say that, of course. You probably got that. But who knows if nuances come across in transcripts.

[2] At this point I polled the room to see who’d heard of Polymath before. Only three hands went up (!)

[3] There is one example that’s closer to computer science that I’m aware of: this book on homotopy type theory written in a similar spirit as the Polymath project.

[4] During my talk I incorrectly cited the URL for this infrastructure as selectedpapers.net. That is a somewhat related but different project. It is actually the Episciences project.

[5] Since the talk, we’ve had another excellent addition to the team: Josh Kroll at Princeton, who recently published a neat paper on the economics of Bitcoin mining with Ian Davey and Ed Felten.

[6] Something that I meant to mention at the end but ran out of time for is Michael Neilsen’s excellent book Reinventing Discovery: The New Era of Networked Science. If you find the topic of this post at all interesting, you should absolutely read this book.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

July 15, 2013 at 7:13 am 10 comments

Embracing failure: How research projects are like startups

As an academic who’s spent time in the startup world, I see strong similarities between the nature of a scientific research project and the nature of a startup. This boils down the fact that most research projects fail (in a sense that I’ll describe), and even among the successful projects the variance is extremely high — most of the impact is concentrated in a few big winners.

Of course, research projects are clearly unlike startups in some important ways: in research you don’t get to capture the economic benefit of your work; your personal gain from success is not money but academic reputation (unless you commercialize your research and start an actual startup, but that’s not what this post is about at all.) The potential personal downside is also lower for various reasons. But while the differences are obvious, the similarities call for some analysis.

I hope this post is useful to grad students in particular in acquiring a long-term vision for how to approach their research and how to maximize the odds of success. But perhaps others including non-researchers will also find something useful here. There are many aspects of research that may appear confusing or pathological, and at least some of them can be better understood by focusing on the high variance in research impact.

1. Most research projects fail.

To me, publication alone does not constitute success; rather, the goal of a research project is to impact the world, either directly or by influencing future research. Under this definition, the vast majority of research ideas, even if published, are forgotten in a few years. Citation counts estimate impact more accurately [1], but I think they still significantly underestimate the skew.

The fact that most research projects don’t make a meaningful lasting impact is OK — just as the fact that most startups fail is not an indictment of entrepreneurship.

A researcher might choose to take a self-interested view and not care about impact, but even in this view, merely aiming to get papers published is not a good long-term strategy. For example, during my recent interview tour, I got a glimpse into how candidates are evaluated, and I don’t think someone with a slew of meaningless publications would have gotten very far. [2]

2. Grad students: diversify your portfolio!

Given that failure is likely (and for reasons you can’t necessarily control), spending your whole Ph.D. trying to crack one hard problem is a highly risky strategy. Instead, you should work on multiple projects during your Ph.D., at least at the beginning. This can be either sequential or parallel; the former is more similar to the startup paradigm (“fail-fast”).

I achieved diversity by accident. Halfway through my Ph.D. there were at least half a dozen disparate research topics where I’d made some headway (some publications, some works in progress, some promising ideas). Although I felt I was directionless, this turned out to be the right approach in retrospect. I caught a lucky break on one of them — anonymity in sanitized databases — because of the Netflix Prize dataset, and from then on I doubled down to focus on deanonymization. This breadth-then-depth approach paid off.

3. Go for the big hits.

Paul Graham’s fascinating essay Black Swan Farming is about how skewed the returns are in early-stage startup investing. Just two of the several hundred companies that YCombinator has funded are responsible for 75% of the returns, and in each batch one company outshines all the rest.

The returns from research aren’t quite as skewed, but they’re skewed enough to be highly counterintuitive. This means researchers must explicitly account for the skew in selecting problems to work on. Following one’s intuition and/or the crowd is likely to lead to a mediocre career filled with incremental, marginally publishable results. The goal is to do something that’s not just new and interesting, but which people will remember in ten years, and the latter can’t necessarily be predicted based on the amount of buzz a problem is generating in the community right now. Breakthroughs often come from unsexy problems (more on that below).

There’s a bit of a tension between going for the hits and diversifying your portfolio. If you work on too few projects, you incur the risk that none of them will pan out. If you work on too many, you spread yourself too thin, the quality of each one suffers, and lowers the chance that at least one of them will be a big hit. Everyone must find their own sweet spot. One piece of advice given to junior professors is to “learn to say no.”

4. Find good ideas that look like bad ideas.

How do you predict if an idea you have is likely to lead to success, especially a big one? Again let’s turn to Paul Graham in Black Swan Farming:

“the best startup ideas seem at first like bad ideas. … if a good idea were obviously good, someone else would already have done it. So the most successful founders tend to work on ideas that few beside them realize are good.”

Something very similar is true in research. There are some problems that everyone realizes are important. If you want to solve such a problem, you have to be smarter than most others working on it and be at least a little bit lucky. Craig Gentry, for example, invented Fully Homomorphic Encryption mostly by being very, very smart.

Then there are research problems that are analogous to Graham’s good ideas that initially look bad. These fall into two categories: 1. research problems that no one has realized are important 2. problems that everyone considers prohibitively difficult but which turn out to have a back door.

If you feel you are in a position to take on obviously important problems, more power to you. I try to work on problems that everyone seems to think are bad ideas (either unimportant or too difficult), but where I have some “unfair advantage” that leads me to think otherwise. Of course, a lot of the time they are right, but sometimes they are not. Let me give two examples.

I consider Adnostic (online behavioral advertising without tracking) to be moderately successful: it has had an impact on other research in the area, as well as in policy circles as an existence proof of behavioral-advertising-with-privacy.[3] Now, my coauthors started working on it before I joined them, so I can take none of the credit for problem selection. But it’s a good illustration of the principle. The main reason they decided this problem was important was that privacy advocates were up in arms about online tracking. Almost no one in the computer science community was studying the topic, because they felt that simply blocking trackers was an adequate solution. So this was a case of picking a problem that people didn’t realize was important. Three years later it’s become a very crowded research space.

Another example is my work with Shmatikov on deanonymizing social networks by being able to find a matching between the nodes of two social graphs. Most people I talked to at the time thought this was impossible — after all, it’s a much harder version of graph isomorphism, and we’re talking about graphs with millions of nodes. Here’s the catch: people intuitively think graph isomorphism is “hard,” but it is in fact not NP-complete and on real-world graphs it embarrassingly easy. We knew this, and even though the social network matching problem is harder than graph isomorphism, we thought it was still doable. In the end it took months of work, but fortunately it was just within the realm of possibility.

5. Most researchers are known for only one or two things.

Let me end with an interesting side effect of the high-skew theory: a successful researcher may have worked on many successful projects during their career, but the top one or two of those will likely be far better known than the rest. This seems to be borne out empirically, and a source of much annoyance for many researchers to be pigeonholed as “the person who did X.” Let’s take Ron Rivest who’s been prolific for several decades not just in cryptography but also in algorithms and lately in voting. Most computer scientists will recall that he’s the R in RSA, but knowledge of his work drops off sharply after that. This is also reflected in the citation counts (the first entry is a textbook, not a research paper). [4]

In summary, if you’re a researcher, think carefully about which projects to work on and what the individual and overall chances of success are. And if you’re someone who’s skeptical about academia because your friend who dropped out of a Ph.D. after their project failed convinced you that all research is useless, I hope this post got you to think twice.

I may do a follow-up post examining whether ideas are as valuable as they are held to be in the research community, or whether research ideas are more similar to startup ideas in that it’s really execution and selling that lead to success.

[1] For example, a quarter of my papers are responsible for over 80% of my citations.
[2] That said, I will get a much better idea in the next few months from the other side of the table :)
[3] Specifically, it undermines the “we can’t stop tracking because it would kill our business model” argument that companies love to make when faced with pressure from privacy advocates and regulators.
[4] To be clear, my point is that Rivest’s citation counts drop off relative to his most well-known works.

Thanks to Joe Bonneau for comments on a draft.

To stay on top of future posts, subscribe to the RSS feed or follow me on Twitter or Google+.

January 2, 2013 at 8:09 am Leave a comment

An Update on Career Plans and Some Observations on the Nature of Research

I’ve had a wonderful time at Stanford these last couple of years, but it’s time to move on. I’m currently in the middle of my job search, looking for faculty and other research positions. In the next month or two I will be interviewing at several places. It’s been an interesting journey.

My Ph.D. years in Austin were productive and blissful. When I finished and came West, I knew I enjoyed research tremendously, but there were many aspects of research culture that made me worry if I’d fit in. I hoped my postdoc would give me some clarity.

Happily, that’s exactly what happened, especially after I started being an active participant in program committees and other community activities. It’s been an enlightening and humbling experience. I’ve come to realize that in many cases, there are perfectly good reasons why frequently-criticized aspects of the culture are just the way they are. Certainly there are still facets that are far from ideal, but my overall view of the culture of scientific research and the value of research to society is dramatically more positive than it was when I graduated.

Let me illustrate. One of my major complaints when I was in grad school was that almost nobody does interdisciplinary research (which is true — the percentage of research papers that span different disciplines is tiny). Then I actually tried doing it, and came to the obvious-in-retrospect realization that collaborating with people who don’t speak your language is hard.

Make no mistake, I’m as committed to cross-disciplinary research as I ever was (I just finished writing a grant proposal with Prof’s Helen Nissenbaum and Deirdre Mulligan). I’ve gradually been getting better at it and I expect to do a lot of it in my career. But if a researcher makes a decision to stick to their sub-discipline, I can’t really fault them for that.

As another example, consider the lack of a “publish-then-filter” model for research papers, a whole two decades after the Web made it technologically straightforward. Many people find this incomprehensibly backward and inefficient. Academia.edu founder Richard Price wrote an article two days ago arguing that the future of peer review will look like a mix of Pagerank and Twitter. Three years ago, that could have been me talking. Today my view is very different.

Science is not a popularity contest; Pagerank is irrelevant as a peer-review mechanism. Basically, scientific peer review is the only process that exists for systematically separating truths from untruths. Like democracy, it has its problems, but at least it works. Social media is probably the worst analogy — it seems to be better at amplifying falsehoods than facts. Wikipedia-style crowdsourcing has its strengths, but it can hit-or-miss.

To be clear, I think peer review is probably going to change; I would like it to be done in public, for one. But even this simple change is fraught with difficulty — how would you ensure that reviewers aren’t influenced by each others’ reviews? This is an important factor in the current system. During my program committee meetings, I came to realize just how many of these little procedures for minimizing bias are built into the system and how seriously people take the spirit of this process. Revamping peer review while keeping what works is going to be slow and challenging.

Moving on, some of my other concerns have been disappearing due to recent events. Restrictive publisher copyrights are a perfect example. I have more of a problem with this than most researchers do — I did my Master’s in India, which means I’ve been on the other side of the paywall. But it looks like that pot may finally have boiled over. I think it’s only a matter of time now before open access becomes the norm in all disciplines.

There are certainly areas where the status quo is not great and not getting any better. Today if a researcher makes a discovery that’s not significant enough to write a paper about, they choose not to share that discovery at all. Unfortunately, this is the rational behavior for a self-interested researcher, because there is no way to get credit for anything other than published papers. Michael Neilsen’s excellent book exploring the future of networked science gives me some hope that change may be on the horizon.

I hope this post has given you a more nuanced appreciation of the nature of scientific research. Misconceptions about research and especially about academia seem to be widespread among the people I talk to both online and offline; I harbored a few myself during my Ph.D., as I said earlier. So I’m thinking of doing posts like this one on a semi-regular basis on this blog or on Google+. But that will probably have to wait until after my job search is done.

To stay on top of future posts, subscribe to the RSS feed or follow me on Google+.

February 7, 2012 at 11:05 am 2 comments

In which I come out: Notes from the FTC Privacy Roundtable

I was on a panel at the second FTC privacy roundtable in Berkeley on Thursday. Meeting a new community of people is always a fascinating experience. As a computer scientist, I’m used to showing up to conferences in jeans and a T-shirt; instead I found myself dressing formally and saying things like “oh, not at all, the honor is all mine!”

This post will also be the start of a new direction for this blog. So far, I’ve mostly confined myself to “doing the math” and limiting myself to factual exposition. That’s going to change, for two reasons:

  • The central theme of this blog and of my Ph.D dissertation — the failure of data anonymization — now seems to be widely accepted in policy circles. This is due in large part to Paul Ohm’s excellent paper, which is a must-read for anyone interested in this topic. I no longer have to worry about the acceptance of the technical idea being “tainted” by my opinions.
  • I’ve been learning about the various facets of privacy — legal, economic, etc. — for long enough to feel confident in my views. I have something to contribute to the larger discussion of where technological society is heading with respect to privacy.

Underrepresentation of scientists

Living up to the stereotype

As it turned out, I was the only academic computer scientist among the 35 panelists. I found this very surprising. The underrepresentation is not because computer scientists have nothing to contribute — after all, there were other CS Ph.Ds from industry groups like Mozilla. Rather, I believe it is a consequence of the general attitude of academic scientists towards policy issues. Most researchers consider it not worth their time, and a few actively disdain it.

The problem is even deeper: academics have the same disdainful attitude towards the popular exposition of science. The underlying reason is that the goal in academia is to impress one’s peers; making the world better is merely a side-effect, albeit a common one. The incentive structure in academia needs to change. I will pick up this topic in future posts.

The FTC has an admirable approach to regulation

As I found out in the course of the day’s panels, the FTC is not about prescribing or mandating what to do. Pushing a specific privacy-enhancing technology isn’t the kind of thing they are interested in doing at all. Rather, they see their role as getting the market to function better and the industry to self-regulate. The need to avoid harming innovation was repeatedly emphasized, and there was a lot of talk about not throwing the baby out with the bathwater.

The following were the potential (non baby hurting) initiatives that were most talked about:

  • Market transparency. Markets can only work well when there is full information, and when it comes to privacy the market has failed horribly. Users have no idea what happens to their data once it’s collected, and no one reads privacy policies. Regulation that promotes transparency can help the market fix itself.
  • Consumer education. This is a counterpart to the previous point. Education about privacy dangers as well as privacy technologies can help.
  • Enforcement. A few bad apples have been responsible for the most egregious privacy SNAFUs. The larger players are by and large self-regulating. The FTC needs to work with law enforcement to punish the offenders.
  • Carrots and sticks. Even the specter of regulation, corporate representatives said, is enough to get the industry to self-regulate. Many would disagree, but I think a carrots-and-sticks approach can be made to work.
  • Incentivizing adoption of PETs (privacy enhancing technologies) in general. The question of how the FTC can spur the adoption of PETs was brought up on almost every panel, but I don’t think there were any halfway convincing answers. Someone mentioned that the government in general could go into the market for PETs, which seems reasonable.

As a libertarian, I think the overall non-interventionist approach here is exactly right. I’m told that the FTC is rather unusual among US regulatory agencies in this regard (which makes sense, considering that the FCC, for example, spends its time protecting children from breasts when it is not making up lists of words.)

Facebook’s two faces

Facebook public policy director Tim Sparapani, who was previously with the ACLU, made a variety of comments on the second panel that were bizarre, to put it mildly. Take a look (my comments are in sub-bullets):

  • “We absolutely compete on privacy.”
    • That’s a weird definition of “compete.” Facebook has a history of rolling out privacy-infringing updates, such as Beacon, the ToS changes, and the recent update that made the graph public. Then they wait to see if there’s an outcry and roll back some of the changes. It is hard to think of another company has had such a cavalier approach.
  • “There are absolutely no barriers to entry to create a new social network.”
    • Except for that little thing called the network effect, which is the mother of all barriers to entry. In a later post I will analyze why Facebook has reached a critical level of penetration in most markets which makes it nearly unassailable as a  general-purpose social network.
  • “Our users have learned to trust us.”
    • I don’t even know what to say about this one.
  • “We are a walled garden.”
    • Sparapani is confusing two different senses of “walled garden” here. This was said in response to a statement by the Google rep about Google’s features to let users migrate their data to other services (which I find very commendable). In this sense, Facebook is indeed a walled garden, and doesn’t allow migration, which is a bad thing.  But Sparapani said he meant it in the sense that Facebook doesn’t sell user data wholesale to other companies. That sounds like good news, except that third party app developers end up sharing user data with other entities, because enforcement of the application developer Terms of Service is virtually non-existent.
  • “If you delete the data it’s gone.” (in the context of deleting your account)
    • That might be true in a strict sense, but it is misleading. Deleting all your data is actually impossible to achieve because most pieces of data belong to more than one user. Each of your messages will live on in the other person’s inbox (and it would be improper to delete it from theirs). Similarly, photos in which you appear, which you would probably like gone when you delete your account, still live on in the album of whoever took the picture. The same goes for your pokes, likes and other multi-user interactions. These are the very things that make a social network social.
  • “We now have controls on privacy at the moment you share data. This is an extraordinary innovation and our engineers are really proud of it.”
    • The first part of that statement is true: you can now change the privacy controls on each of your Facebook status messages independently. The second part is downright absurd. It is completely trivial to implement from an engineering perspective (and LiveJournal for instance has had it for a decade).

There were more absurd statements, but you get the picture. It’s not just the fact that Sparapani’s comments were unhinged from reality that bothers me — the general tone was belligerent and disturbing. I missed a few minutes of the panel, during which he apparently he responded to a criticism from Chris Conley of the ACLU by saying “I was at the ACLU longer than you’ve been there.” This is unprofessional, undignified and a non-answer. Amusingly, he claimed that Facebook was “very proud” of various aspects of their privacy track record at least half a dozen times in the course of the panel.

Contrast all this with Mark Zuckerberg’s comments in an interview with Michael Arrington, which can be summed up as “the age of privacy is over.” That article goes on to say that Facebook’s actions caused the shift in social norms (to the extent that they have shifted at all) rather than merely responding to them. Either way, it is unquestionable that Facebook’s true behavior at the present time pays lip service to privacy, and Zuckerberg’s statement is a more-or-less honest reflection of that. On the other hand, as I have shown, the company sings a completely different tune when the FTC is listening.

Engaging privacy skeptics

Aside from Facebook’s shenanigans, I feel that that there are two groups in the privacy debate who are talking past each other. One side is represented by consumer advocates, and is largely echoed by the official position of the FTC. The other side’s position can be summed up as “yeah, whatever.” When expressed coherently, there are three tenets of this position (with the caveats that not all privacy skeptics adhere to all three):

  • Users don’t care about privacy any more
  • Even if they do, privacy is impossible to achieve in the digital age, so get over it
  • There are no real harms arising from privacy breaches.

Click image to embiggen

To  the right is an illustrative example of a mainstream-media representative who was at the workshop covering it on Twitter through the lens of his preconceived prejudices.

Privacy scholars never engage with the skeptics because the skeptical viewpoint appears obviously false to anyone who has done some serious thinking about privacy. However, it is crucial to engage the opponents, because 1. the skeptical view is extremely common 2. many of the startups coming out of the valley fall into this group, and they are are going to have control over increasing amounts of user data in the years to come.

The “privacy is dead” view was most famously voiced by Scott McNealy. In its extreme form it is easy to argue against: “start streaming yourself live on the Internet 24/7, and then we’ll talk.” (To be sure, a few people did this 10 years ago as a publicity stunt, but it is obvious that the vast majority of people aren’t ready for this level of invasiveness of monitoring/data collection.) But engaging with skeptics isn’t about refutation, it’s about dealing with a different way of thinking and getting the message across to the other side. Unfortunately real engagement hasn’t really been happening.

I have a double life in academia and the startup world, and I think this puts me in a somewhat unusual position of being able to appreciate both sides of the argument. My own viewpoint is somewhere in the middle; I will expand on this theme in future blog posts.

January 31, 2010 at 3:49 am 13 comments


About 33bits.org

I'm an assistant professor of computer science at Princeton. I research (and teach) information privacy and security, and moonlight in technology policy.

This is a blog about my research on breaking data anonymization, and more broadly about information privacy, law and policy.

For an explanation of the blog title and more info, see the About page.

Me, elsewhere

Enter your email address to subscribe to this blog and receive notifications of new posts by email.

Join 245 other followers