Posts tagged ‘ycombinator’
Facebook has begun to accelerate the web-wide roll-out of the Instant Personalization program. The number of partner websites recently jumped from three to five, and a partnership with early stage venture firm YCombinator is set to greatly expand that number in the coming months.
Instant Personalization allows a partner website to automatically learn the identity of a visitor (as well as some data about them) without any explicit user action, provided that the visitor is a logged-in Facebook user. It is probably the most privacy-intrusive change introduced by the company this year, and could lead to a profound change in how the web works and is perceived.
Facebook’s superficially reassuring line is that only data that is already public is shared with partner sites. Even ignoring the fact that it is hard for users to figure out exactly what data is public, and is only getting harder, I find the official explanation to be a red herring. In this article I will examine the various fundamental flaws of Instant Personalization.
1. Sneakiness. All the information transmitted via Instant Personalization is available via Facebook connect; the sole purpose of Instant Personalization is to eliminate the element of user authorization from the process. Thus, I find the very raison d’etre to be questionable. If a user declines to use Facebook connect, perhaps they had a good reason for doing so. Think about a porn site — I don’t think I need to elaborate.
2. Identity. To me, what is much more worrisome than third parties getting your data is third parties getting your identity when you browse. The idea that a website knows who you are as soon as you land on it is inherently creepy because it violates users’ mental model of how the web works. The cumulative effect is worse — people are intensely uncomfortable when they feel they are being “followed around” as they browse the web.
From a technical perspective, an Instant Personalization partner could itself turn around and become an Instant Personalization provider, and so could any website that this partner provided Instant Personalization services for, ad infinitum. This is because any number of tracking devices (invisible iframes) can be nested within a page.
Implementation bugs on partner sites also have the effect of leaking your identity to other parties. In my ubercookies series, I documented a series of bugs that can be exploited by an arbitrary website to learn the visitor’s identity. All of these apply to Instant Personalization, i.e., if any one of the partner websites has such a bug, that can be exploited by an arbitrary attacker to instantly de-anonymize a visitor to his site. Security researcher theharmonyguy has a great post on cross-site scripting vulnerabilities on both Rotten Tomatoes and Scribd that compromise Instant Personalization in this fashion.
3. Facebook gets your clickstream. Instant Personalization is a two way street: while the partner site gets access to the user’s identity, Facebook learns the URLs of the pages the user visits. In a world where Instant Personalization is widely deployed, Facebook will be able to monitor a large fraction, perhaps the majority, of clicks that you make around the web.
While troubling, this is not unprecedented: the Faceook like button constitutes a very similar privacy problem — Facebook sees you whenever you visit any page with the like button (or another social plugin) installed, even if you don’t click the like button. Facebook bowed to pressure from privacy advocates and agreed to delete the logs from social plugins after 90 days; I would like to see the same policy applied to Instant Personalization logs as well.
4. Third parties could get your clickstream. Normally, an Instant Personalization partner can only see your clicks on their own site. However, think of an Instant Personalization partner whose product is a social widget or an analytics plugin that is intended to be installed on many client sites. From a technical perspective, loading a page or widget in an iframe is not fundamentally different from visiting the site directly. That means it is feasible for an Instant Personalization partner with a social widget to monitor your clicks — tied to your real identity, of course — on all sites with the widget installed.
5. Lack of enforcement. So far I have described the lack of technological barriers to various types of misuse and abuse of Instant Personalization. However, Facebook contractually prohibits partners from misusing the data. The natural question is whether this is effective.
It is too early to tell yet, because there are currently only five partners. To predict how things will turn out once numerous startups — without the resources or incentive for security testing and privacy compliance — get on board, we can look to the track-record of Facebook’s third party application platform. As you may recall, this has been rather poor, with enforcement of Terms of Service violations being haphazard at best.
Mitigation. In my opinion these flaws are inherent, and I don’t think Instant Personalization will turn out well from a security and privacy perspective. User expectations are not malleable, cross-site scripting bugs will always exist, there will soon be too many partner sites to monitor closely, and some of them will look for ways to push the boundaries of what they can do.
 YCombinator-funded companies will get “priority access” to various Facebook technologies including “Facebook Credits, Instant Personalization and upcoming beta features”. Interestingly, Instant Personalization seems to be the feature that YCombinator is most interested in.
 Yelp.com was also found vulnerable to a cross-site scripting bug soon after Instant Personalization launch. This means the majority of partner sites — 3 out of 5 — have had vulnerabilities that compromise Instant Personalization.
 In Instant Personalization, Facebook and the partner site communicate invisibly in the background each time the user visits a page on the partner site; in this way the mechanism is different from social widgets.
 Large-scale clickstream data is prone to misuse in various ways: government coercion, hacking, or being purchased as part of bankruptcy settlements (expecially when we’re talking about startups).
Thanks to Kevin Bankston for pointing me to Facebook’s log rentention policy for social plugins.