<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>33 Bits of Entropy &#187; Uncategorized</title>
	<atom:link href="http://33bits.org/category/uncategorized/feed/" rel="self" type="application/rss+xml" />
	<link>http://33bits.org</link>
	<description>The End of Anonymized Data and What to Do About It</description>
	<lastBuildDate>Mon, 30 Jan 2012 06:39:28 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='33bits.org' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>33 Bits of Entropy &#187; Uncategorized</title>
		<link>http://33bits.org</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://33bits.org/osd.xml" title="33 Bits of Entropy" />
	<atom:link rel='hub' href='http://33bits.org/?pushpress=hub'/>
		<item>
		<title>Printer Dots, Pervasive Tracking and the Transparent Society</title>
		<link>http://33bits.org/2011/10/18/printer-dotspervasive-tracking-and-the-transparent-society/</link>
		<comments>http://33bits.org/2011/10/18/printer-dotspervasive-tracking-and-the-transparent-society/#comments</comments>
		<pubDate>Tue, 18 Oct 2011 19:35:51 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[civil liberties]]></category>
		<category><![CDATA[fingerprinting]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[surveillance]]></category>
		<category><![CDATA[tracking]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=1005</guid>
		<description><![CDATA[So far in the fingerprinting series, we’ve seen how a variety of objects and physical devices [1, 2, 3, 4], often even supposedly identical ones, can be uniquely fingerprinted. This article is non-technical; it is an opinion on some philosophical questions about tracking and surveillance. Here’s a fascinating example of tracking that’s all around you but that you’re probably [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=33bits.org&amp;blog=5017838&amp;post=1005&amp;subd=33bits&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><em>So far in the fingerprinting series, we’ve seen how a variety of objects and physical devices [<a href="http://33bits.org/2011/09/13/everything-has-a-fingerprint-the-case-of-blank-paper/">1</a>, <a href="http://33bits.org/2011/09/19/digital-camera-fingerprinting/">2</a>, <a href="http://33bits.org/2011/10/04/fingerprinting-of-rfid-tags-and-high-tech-stalking/">3</a>, <a href="http://33bits.org/2011/10/11/everything-has-a-fingerprint-%e2%80%94-dont-forget-scanners-and-printers/">4</a>], often even supposedly identical ones, can be uniquely fingerprinted. This article is non-technical; it is an opinion on some philosophical questions about tracking and surveillance.</em></p>
<p>Here’s a fascinating example of tracking that’s all around you but that you’re probably unaware of:</p>
<div style="background-color:#eef;border:1px dashed #bcc;margin-left:20px;margin-bottom:15px;padding:5px;">Color laser printers and photocopiers print small yellow dots on every page for tracking purposes.</div>
<p>My source for this is the EFF’s <a href="http://en.wikipedia.org/wiki/Seth_Schoen">Seth Schoen</a>, who has made his <a href="https://www.eff.org/files/filenode/printers/ccc.pdf">presentation</a> on the subject available.</p>
<p><a href="http://33bits.files.wordpress.com/2011/10/yellowtrackingdots.png"><img class="aligncenter size-medium wp-image-1007" title="yellowtrackingdots" src="http://33bits.files.wordpress.com/2011/10/yellowtrackingdots.png?w=300&#038;h=222" alt="" width="300" height="222" /></a></p>
<p>The dots are not normally visible, but they can be revealed in several ways: by shining a blue LED flashlight on the page, by magnifying it under a microscope, or by scanning the document with a commodity scanner. The pattern of dots typically encodes the device serial number and a timestamp; some parts of the code are as yet unidentified. There are interesting differences between the codes used by different manufacturers. [1] Some examples are shown in the pictures. There’s a lot more information in the presentation.</p>
<div id="attachment_1006" class="wp-caption aligncenter" style="width: 465px"><a href="http://33bits.files.wordpress.com/2011/10/patterns.png"><img class="size-full wp-image-1006" title="patterns" src="http://33bits.files.wordpress.com/2011/10/patterns.png?w=455&#038;h=106" alt="" width="455" height="106" /></a><p class="wp-caption-text">Pattern of dots from three different printers: Epson, HP LaserJet and Canon.</p></div>
<p>Schoen says the dots could have been the result of the Secret Service pressuring printer manufacturers to cooperate, going back as far as the 1980s. The EFF’s Freedom of Information Act request on the matter from 2005 has been “mired in bureaucracy.”</p>
<p>The EFF as well as the <a href="http://seeingyellow.com/">Seeing Yellow project</a> would like to see these dots gone. The EFF has consistently argued against pervasive tracking. In <a href="https://www.eff.org/wp/biometrics-whos-watching-you">this article</a> on biometric surveillance, they say:</p>
<blockquote><p>EFF believes that perfect tracking is inimical to a free society. A society in which everyone&#8217;s actions are tracked is not, in principle, free. It may be a livable society, but would not be our society.</p></blockquote>
<p>Eloquently stated. You don’t have to be a privacy advocate to see that there are problems with mass surveillance, especially by the State. But I’d like to ask the question: can we really hope to stave off a surveillance society forever, or are efforts like the Seeing Yellow project just buying time?</p>
<p>My opinion is that it is impossible to put the genie back into the bottle: the cost of tracking every person, object and activity will continue to drop exponentially. I hope the present series of articles has convinced you that even if privacy advocates are successful in preventing the deployment of <em>explicit</em> tracking mechanisms, just about everything around you is <em>inherently</em> trackable. [2]</p>
<p>And even if we can prevent the State from setting up a surveillance infrastructure, there are undeniable commercial benefits in tracking everything that’s trackable, which means that private actors will deploy this infrastructure, as they’ve done with online tracking. If history is any indication, most people will happily allow themselves to be tracked in exchange for free or discounted services. From there it’s a simple step for the government to obtain the records of any person of interest.</p>
<p>If we accept that we cannot stop the invention and use of tracking technologies, what are our choices? Our best hope, I believe, is a world in which the ability to conduct tracking and surveillance is <strong>symmetrically distributed</strong>, a society in which ordinary citizens can and do turn the spotlight on those in power, keeping that power in check. On the other hand, a world in which only the government, large corporations and the rich are able to utilize these technologies, but themselves hide under a veil of secrecy, would be a true dystopia.</p>
<p>Another important principle is for those who do conduct tracking to be required to be <strong>transparent</strong> about it, to have social and legal processes in place to determine what uses are acceptable, and to allow opting out in contexts where that makes sense. Because ultimately what matters in terms of societal freedom is not surveillance itself, but how surveillance affects the balance of power. To be sure, the society I describe — pervasive but transparent tracking, accessible to everyone, and with limited opt-outs — would be different from ours, and would take some adjusting to, but that doesn’t make it <em>worse</em> than ours.</p>
<p>I am hardly the first to make this argument. A similar position was first prominently articulated by David Brin in his 1999 book <a href="http://www.amazon.com/Transparent-Society-Technology-Between-Privacy/dp/0738201448">The Transparent Society</a>. What the last decade has shown is just how inevitable pervasive tracking is. For example, Brin focused too much on cameras and assumed that tracking people indoors would always be infeasible. That view seems almost quaint today.</p>
<p>Let me be clear: I have absolutely no beef with efforts to oppose pervasive tracking. Even if being watched all of the time is our eventual destiny, society won’t be ready for it any time soon; these changes take decades if not generations. The pace at which the industry wants us to switch to “living in public” is far faster than we’re capable of. Buying time is therefore extremely valuable.</p>
<p>That said, embracing the Transparent Society view has important consequences for civil libertarians. It suggests working toward an achievable if sub-optimal goal instead of an ideal but impossible one. It also suggests that the “democratization of surveillance” should be <em>encouraged</em> rather than feared.</p>
<p>Here are some currently hot privacy and civil-liberties issues that I think will have a significant impact on the distribution of power in a ubiquitous-surveillance society: <a href="http://www.aclu.org/blog/free-speech/it-legal-photograph-or-videotape-police">the right to videotape on-duty police officers and other public officials</a>, transparent government initiatives including <a href="http://en.wikipedia.org/wiki/Freedom_of_Information_Act_(United_States)">FOIA</a> requests, and closer to my own interests, the <a href="http://donottrack.us">Do Not Track</a> opt-out mechanism, and tools like <a href="http://fourthparty.info/">FourthParty</a> which have helped illuminate the dark world of online tracking.</p>
<p>Let me close by calling out one battle in particular. Throughout this series, we’ve seen that fingerprinting techniques have security-enhancing applications (such as forensics) as well as privacy-infringing ones, but that most research papers on fingerprinting consider only the former. I believe the primary reason is that <em>funding</em> is for the most part available only for the former type of research and not for the latter. However, we need a culture of research into privacy-infringing technologies, whether funded by federal grants or otherwise, in order to achieve the goals of symmetry and transparency in tracking.</p>
<p>[1] Note that this is just an encoding and not encryption. The current system allows anyone to read the dots; public-key encryption would allow at least nominally restricting the decoding ability to only law-enforcement personnel, but there is no evidence that this is being done.</p>
<p>[2] This is analogous to the cookies-vs-fingerprinting issue in online tracking, and why cookie-blocking alone is not sufficient to escape tracking.</p>
<p>To stay on top of future posts, <a href="http://33bits.org/feed/">subscribe</a> to the RSS feed or <a href="http://twitter.com/random_walker">follow me on Twitter</a> or <a href="https://plus.google.com/u/0/110908828231461227679">Google+</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/10/18/printer-dotspervasive-tracking-and-the-transparent-society/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/10/yellowtrackingdots.png?w=300" medium="image">
			<media:title type="html">yellowtrackingdots</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/10/patterns.png" medium="image">
			<media:title type="html">patterns</media:title>
		</media:content>
	</item>
		<item>
		<title>Everything Has a Fingerprint — Don&#8217;t Forget Scanners and Printers</title>
		<link>http://33bits.org/2011/10/11/everything-has-a-fingerprint-%e2%80%94-dont-forget-scanners-and-printers/</link>
		<comments>http://33bits.org/2011/10/11/everything-has-a-fingerprint-%e2%80%94-dont-forget-scanners-and-printers/#comments</comments>
		<pubDate>Tue, 11 Oct 2011 18:02:25 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[fingerprinting]]></category>
		<category><![CDATA[forensics]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=994</guid>
		<description><![CDATA[Previous articles in this series looked at fingerprinting of blank paper, digital cameras and RFID chips. This article will discuss scanners and printers, rounding out the topic of physical-device fingerprinting. To readers who’ve followed the series so far, it should come as no surprise that scanners can be fingerprinted, and this can be used to match an image to [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=33bits.org&amp;blog=5017838&amp;post=994&amp;subd=33bits&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<div>
<p><em>Previous articles in this series looked at fingerprinting of <a href="http://33bits.org/2011/09/13/everything-has-a-fingerprint-the-case-of-blank-paper/">blank paper</a>, <a href="http://33bits.org/2011/09/19/digital-camera-fingerprinting/">digital cameras</a> and <a href="http://33bits.org/2011/10/04/fingerprinting-of-rfid-tags-and-high-tech-stalking/">RFID chips</a>. This article will discuss scanners and printers, rounding out the topic of physical-device fingerprinting.</em></p>
<p>To readers who’ve followed the series so far, it should come as no surprise that scanners can be fingerprinted, and this can be used to match an image to the device that scanned it. Scanners capture images via a process similar to digital cameras, so the underlying principle used in fingerprinting is the same: characteristic ‘pattern noise’ in the sensor array as well as idiosyncrasies of the algorithms used in the post-processing pipeline. The former is specific to the individual device, whereas the latter is specific to the make and model.</p>
<p>There are two important differences, however, that make scanner fingerprinting more difficult: first, scanner sensor arrays are one-dimensional (the sensor moves along the length of the device to generate the image), which means that there is much less entropy available from sensor imperfections. Second, the paper may not be placed in the same part of the scanner bed each time, which rules out a straightforward pixel-wise comparison.</p>
<p>A <a href="http://cobweb.ecn.purdue.edu/~prints/">group at Purdue</a> has been very active in this area, as well as in printer identification, which I will discuss later in this article. These <a href="http://cobweb.ecn.purdue.edu/~prints/public/papers/ei07-nitin2.pdf">two</a> <a href="http://cobweb.ecn.purdue.edu/~prints/public/papers/iwcf08_khanna.pdf">papers</a> are very relevant for our purposes. The application they have in mind is forensics; in this context, it can be assumed that the investigator has physical possession of the scanner to generate a fingerprint against which a scanned image of unknown or uncertain origin can be tested.</p>
<p>To extract 1-dimensional noise from a 2-dimensional scanned image, the authors first extract 2-dimensional noise, in a process similar to what is used in camera fingerprinting, and then they collapse each noise pattern into a single row, which is the average of all the rows. Simple enough.</p>
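<p>The collapsing step can be sketched in a few lines of Python. This is only an illustration of the row-averaging idea, not the authors’ code: the function name <code>collapse_noise_to_row</code> and the toy values are assumptions, and the 2-dimensional noise residual is taken as already extracted.</p>

```python
from statistics import fmean


def collapse_noise_to_row(noise):
    """Average a 2-D noise pattern over its rows, giving one value per column."""
    if not noise:
        raise ValueError("noise pattern must have at least one row")
    width = len(noise[0])
    if any(len(row) != width for row in noise):
        raise ValueError("all rows must have the same width")
    # zip(*noise) iterates over columns; each column corresponds to one
    # element of the scanner's one-dimensional sensor array.
    return [fmean(column) for column in zip(*noise)]


# Toy 3x4 noise residual (hypothetical values, not real scanner data).
noise = [
    [0.5, -1.0, 0.0, 2.0],
    [1.5, -1.0, 0.0, 0.0],
    [1.0, -1.0, 3.0, 1.0],
]
row_pattern = collapse_noise_to_row(noise)
print(row_pattern)
```

<p>Averaging over rows keeps whatever is consistent for a given sensor element and tends to cancel noise that varies from row to row.</p>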
<p>Dealing with the other problem, the lack of synchronization, is trickier. There are broadly two approaches: (1) try to synchronize the image by trying various alignments, or (2) extract fingerprints using statistical features of the image that are robust against desynchronization. The authors use the latter approach, mainly <a href="http://en.wikipedia.org/wiki/Standardized_moment">moment</a>-based features of the noise vector.</p>
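<p>To see why moment-based features tolerate misalignment, here is a minimal sketch. It assumes skewness and kurtosis as the features (the papers use a richer feature set) and models misalignment as a circular shift of the noise vector, which is an idealization; all names and values are hypothetical.</p>

```python
from math import sqrt
from statistics import fmean


def standardized_moment(values, k):
    """k-th standardized moment: mean of ((x - mean) / std) ** k (population form)."""
    mean = fmean(values)
    std = sqrt(fmean((x - mean) ** 2 for x in values))
    if std == 0:
        raise ValueError("a constant vector has no standardized moments")
    return fmean(((x - mean) / std) ** k for x in values)


def moment_features(noise_row):
    """Illustrative feature vector: [skewness, kurtosis] of the noise row."""
    return [standardized_moment(noise_row, 3), standardized_moment(noise_row, 4)]


# Hypothetical noise row, and the same row circularly shifted by two positions.
row = [1.0, -1.0, 1.0, -1.0, 2.0, -2.0]
shifted = row[2:] + row[:2]

print(moment_features(row))
print(moment_features(shifted))
```

<p>Both calls print the same features, because moments depend only on the distribution of values and not on where each value sits in the vector. A pixel-wise comparison of <code>row</code> and <code>shifted</code> would instead report a mismatch.</p>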
<p>Here are the results. At the native resolution of scanners, 1200–4800 dpi, they were able to distinguish between 4 scanners with an average accuracy of 96%, including a pair with identical make and model. In subsequent work, they improved the feature extraction to be able to handle images that are reduced to 200 dpi, which is typically the resolution used for saving and emailing images. While they achieved 99.9% accuracy in classifying 10 scanners, they can no longer distinguish devices of identical make and model.</p>
<p>The authors claim that a correlation-based approach (searching for the right alignment between two images, and then directly comparing the noise vectors) won’t work. I am skeptical about this claim. The fact that it hasn’t worked so far doesn’t mean it can’t be made to work. If it does work, it is likely to give far higher accuracies and be able to distinguish between a much larger number of devices.</p>
<p>The privacy implications of scanner fingerprinting are of an analogous nature to digital camera fingerprinting: a whistleblower exposing scanned documents may be deanonymized. However, I would judge the risk to be much lower: scanners usually aren’t personal devices, and a labeled corpus of images scanned by a particular device is typically not available to outsiders.</p>
<p>The Purdue group has also worked on <a href="http://cobweb.ecn.purdue.edu/~prints/public/papers/nip05-mikkilineni.pdf">printer identification</a>, both laser and inkjet. In laser printers, one prominent type of observable signature arising from printer artifacts is <em>banding</em>: alternating light and dark horizontal bands. The bands are subtle and not noticeable to the human eye, but they are easily detectable algorithmically, constituting a 1–2% deviation from average intensity.</p>
<div id="attachment_995" class="wp-caption aligncenter" style="width: 465px"><a href="http://33bits.files.wordpress.com/2011/10/printer-fingerprint.png"><img class="size-full wp-image-995 " title="Laser printer signature" src="http://33bits.files.wordpress.com/2011/10/printer-fingerprint.png?w=455&#038;h=364" alt="" width="455" height="364" /></a><p class="wp-caption-text">Fourier Transform of greyscale amplitudes of a background fill (printed with an HP LaserJet)</p></div>
<p>Banding can be demonstrated by printing a constant grey background image, scanning it, measuring the row-wise average intensities and taking the <a href="http://en.wikipedia.org/wiki/Fourier_transform">Fourier Transform</a> of the resulting 1-dimensional vector. One such plot is shown here: the two peaks (132 and 150 cycles/inch) constitute the signature of the printer. The amount of entropy here is small — the two peak frequencies — and unsurprisingly the authors believe that the technique is good enough to distinguish between printer models but not individual printers.</p>
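<p>The procedure just described can be tried on synthetic data. The sketch below is not the authors’ pipeline: it fabricates row-wise averages for one inch of a grey fill, assuming a scan of 600 rows per inch, and plants two faint sinusoidal bands at the frequencies quoted above. A plain discrete Fourier transform then recovers them. The function name and all parameters are assumptions.</p>

```python
import cmath
import math


def dominant_frequencies(row_means, samples_per_inch, count=2):
    """Return the `count` strongest frequencies (cycles/inch) in a 1-D signal."""
    n = len(row_means)
    mean = sum(row_means) / n
    centered = [v - mean for v in row_means]  # remove the constant grey level
    magnitudes = []
    for k in range(1, n // 2):  # positive frequencies only
        coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                    for t, v in enumerate(centered))
        magnitudes.append((abs(coeff), k * samples_per_inch / n))
    magnitudes.sort(reverse=True)
    return sorted(freq for _, freq in magnitudes[:count])


# Synthetic row-wise averages: mid-grey (128) plus two bands whose amplitude
# is roughly 1% of the average intensity. Values are made up for illustration.
SAMPLES_PER_INCH = 600
signal = [
    128.0
    + 1.5 * math.sin(2 * math.pi * 132 * t / SAMPLES_PER_INCH)
    + 1.0 * math.sin(2 * math.pi * 150 * t / SAMPLES_PER_INCH)
    for t in range(SAMPLES_PER_INCH)
]
peaks = dominant_frequencies(signal, SAMPLES_PER_INCH)
print(peaks)
```

<p>A real scan would add paper texture and scanner noise on top of the bands, so the peaks would stand out less cleanly than they do here.</p>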
<p>Detecting banding in printed text is difficult because the power of the signal dominates the power of the noise. Instead the authors classify <em>individual letters</em>. By extracting a set of statistical features and applying an <a href="http://en.wikipedia.org/wiki/Support_vector_machine">SVM classifier</a>, they show that instances of the letter ‘e’ from 10 different printers can be correctly classified with an accuracy of over 90%.</p>
<p>By combining the classification results from all the ‘e’s in a typical document, they were able to match documents to printers 100% of the time in their tests. Presumably the same method would apply to all other characters, but this wasn’t tested due to the additional manual effort required for different shapes.</p>
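<p>One simple way to combine per-letter results is a majority vote. I am assuming that rule for illustration; the labels and counts below are hypothetical, and the classifier producing the per-letter labels is taken as given.</p>

```python
from collections import Counter


def classify_document(letter_predictions):
    """Majority vote over per-letter printer labels; returns (label, vote share)."""
    if not letter_predictions:
        raise ValueError("need at least one classified letter")
    votes = Counter(letter_predictions)
    label, count = votes.most_common(1)[0]
    return label, count / len(letter_predictions)


# Hypothetical classifier outputs for the 20 'e's found in one document:
# 18 are attributed to the right printer and 2 are misclassified.
predictions = ["printer_A"] * 18 + ["printer_B", "printer_C"]
result = classify_document(predictions)
print(result)
```

<p>Even with a per-letter error rate of 10%, the wrong labels are spread across several printers, so the correct printer wins the vote comfortably. This is why document-level accuracy can be much higher than letter-level accuracy.</p>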
<div id="attachment_996" class="wp-caption aligncenter" style="width: 424px"><a href="http://33bits.files.wordpress.com/2011/10/printer_vertical_lines.png"><img class="size-full wp-image-996" title="Inkjet printers: vertical lines" src="http://33bits.files.wordpress.com/2011/10/printer_vertical_lines.png?w=455" alt=""   /></a><p class="wp-caption-text">Vertical lines printed by three different inkjet printers</p></div>
<p>Inkjet printers seem to be even more variable than laser printers; an example is shown in the picture taken from <a href="http://cobweb.ecn.purdue.edu/~prints/public/papers/sp_article_09_chiang.pdf">this paper</a>. I found it a bit hard to discern exactly what the state of the art is, but I’m guessing that if it isn’t already possible to detect different printer models with essentially perfect accuracy, it will soon be.</p>
<p>The privacy implications of printer identification, in the context of a whistleblower who wishes to print and mail some documents anonymously, would seem to be minimal. If you’re printing from the office, printer logs (that record a history of print jobs along with user information) would probably be a more realistic threat. If you’re using a home printer, there is typically no known set of documents that came from your printer to compare against, unless law enforcement has physical possession of your printer.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/10/11/everything-has-a-fingerprint-%e2%80%94-dont-forget-scanners-and-printers/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/10/printer-fingerprint.png" medium="image">
			<media:title type="html">Laser printer signature</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/10/printer_vertical_lines.png" medium="image">
			<media:title type="html">Inkjet printers: vertical lines</media:title>
		</media:content>
	</item>
		<item>
		<title>Fingerprinting of RFID Tags and High-Tech Stalking</title>
		<link>http://33bits.org/2011/10/04/fingerprinting-of-rfid-tags-and-high-tech-stalking/</link>
		<comments>http://33bits.org/2011/10/04/fingerprinting-of-rfid-tags-and-high-tech-stalking/#comments</comments>
		<pubDate>Tue, 04 Oct 2011 21:20:19 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[fingerprinting]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[tracking]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=989</guid>
		<description><![CDATA[Previous articles in this series looked at fingerprinting of blank paper and digital cameras. This article is about fingerprinting of RFID, a domain where research has directly investigated the privacy threat, namely tracking people in public. The principle behind RFID fingerprinting is the same as with digital cameras: Microscopic physical irregularities due to natural structure and/or manufacturing defects [...]<img alt="" border="0" src="http://stats.wordpress.com/b.gif?host=33bits.org&amp;blog=5017838&amp;post=989&amp;subd=33bits&amp;ref=&amp;feed=1" width="1" height="1" />]]></description>
			<content:encoded><![CDATA[<p><em>Previous articles in this series looked at fingerprinting of <a href="http://33bits.org/2011/09/13/everything-has-a-fingerprint-the-case-of-blank-paper/">blank paper</a> and <a href="http://33bits.org/2011/09/19/digital-camera-fingerprinting/">digital cameras</a>. This article is about fingerprinting of RFID, a domain where research has directly investigated the privacy threat, namely tracking people in public.</em></p>
<p>The principle behind RFID fingerprinting is the same as with digital cameras:</p>
<div style="background-color:#eef;border:1px dashed #bcc;margin-left:20px;margin-bottom:15px;padding:5px;">Microscopic physical irregularities due to natural structure and/or manufacturing defects cause observable, albeit tiny, behavioral differences.</div>
<p><strong>The basics.</strong> First let’s get the obvious question out of the way: why are we talking about devious methods of identifying RFID chips, when the primary raison d&#8217;être of RFID is to enable unique identification? Why not just use them in the normal way?</p>
<p>The answer is that fingerprinting, which exploits the physical properties of RFID chips rather than their logical behavior, allows identifying them in unintended ways and in unintended contexts, and this is powerful. RFID applications, for example in e-passports or smart cards, can often be <a href="http://www.schneier.com/blog/archives/2008/08/hacking_mifare.html">cloned</a> at the logical level, either because there is no authentication or because authentication is broken. Fingerprinting can make the system (more) secure, since fingerprints arise from microscopic randomness and there is no known way to create a tag with a given fingerprint.</p>
<p>If sensor patterns in digital cameras are a relatively clean example of fingerprinting, RF (and anything to do with the electromagnetic spectrum in general) is the opposite. First, the data is an arbitrary waveform instead of a fixed-size sequence of bits. This means that a simple point-by-point comparison won’t work for fingerprint verification; the task is conceptually more similar to algorithmically comparing two faces. Second, the probe signal itself is variable. RFID chips are passive: they respond to the signal produced by the reader (and draw power from it).[1] This means that the fingerprinting system is in full control of what kind of signal to interrogate the chip with. It’s a bit like being given a blank canvas to paint on.</p>
<p><strong>Techniques.</strong> A <a href="http://www.syssec.ethz.ch/research/identification">group at ETH Zurich</a> has done some impressive work in this area. In their <a href="http://www.syssec.ethz.ch/research/usenixsec09_phyid_rfid.pdf">2009 paper</a>, they report being able to compare an RFID card with a stored fingerprint and determine if they are the same, with an error rate of 2.5%–4.5% depending on settings.[2] They use two types of signals to probe the chip with — “burst” and “sweep” — and extract features from the response based on the <a href="http://en.wikipedia.org/wiki/Frequency_spectrum">spectrum</a>.</p>
<div id="attachment_990" class="wp-caption aligncenter" style="width: 465px"><a href="http://33bits.files.wordpress.com/2011/10/rfid.png"><img class="size-full wp-image-990" title="rfid" src="http://33bits.files.wordpress.com/2011/10/rfid.png?w=455&#038;h=161" alt="" width="455" height="161" /></a><p class="wp-caption-text">Chip response to different signals. Fingerprints are extracted from characteristic features of these responses.</p></div>
<p>Other papers have demonstrated different ways to generate signals/extract features. A University of Arkansas team <a href="http://comp.uark.edu/~drt/pubs/2010/Fingerprinting_RFID_Tags2010.pdf">exploited</a> the minimum power required to get a response from the tag at various frequencies. The authors achieved a 94% true-positive rate using 50 identical tags, with only a 0.1% false-positive rate. (About 6% of the time, the algorithm didn’t produce an output.)</p>
<p>Yet other techniques, namely the energy and <a href="http://en.wikipedia.org/wiki/Q_factor">Q factor</a> of higher harmonics, were studied in a <a href="http://www.nist.gov/pml/electromagnetics/rf_electronics/upload/RFID_counter_TMTT.pdf">couple</a> of <a href="http://www.nist.gov/pml/electromagnetics/rf_electronics/upload/RFID-resonance.pdf">papers</a> out of NIST. In the latter work, they experimented with 20 cards, consisting of 4 batches of 5 ‘identical’ cards each. The overall identification accuracy was 96%.</p>
<p>It seems safe to say that RFID fingerprinting techniques are still in their infancy, and there is much room for improvement by considering new categories of features, by combining different types of features, or by using different classification algorithms on the extracted features.</p>
<p><strong>Privacy.</strong> RF fingerprinting, like other types of fingerprinting, shows a duality between security-enhancing and privacy-infringing applications, but in a less direct way.  There are two types of RFID systems: “near-field” based on inductive coupling, used in contactless smartcards and the like, and “far field” based on backscatter, used in vehicle identification, inventory control, etc. <em>The papers discussed so far pertain to near-field systems.</em> There are no real privacy-infringing applications of near-field RF fingerprinting, because you can’t get close enough to extract a fingerprint without the owner of the tag knowing about it. Far-field systems, to which we will now turn, are ideally suited to high-tech stalking.</p>
<div style="background-color:#eef;border:1px dashed #bcc;margin-left:20px;margin-bottom:15px;padding:5px;">Fingerprinting provides the ability to enhance the security of near-field RFID systems and to infringe privacy in the context of far-field RFID chips.</div>
<p>In a recent <a href="http://www.syssec.ethz.ch/research/zanetti_pets11_CR.pdf">paper</a>, the Zurich team mentioned earlier investigated the possibility of tracking people in a shopping mall using strategically placed sensors, assuming that shoppers carry several (far-field) RFID tags. The point is that chips can be designed to prevent tracking at the logical level by authenticating the reader, but this is impossible at the physical level.</p>
<p>Why would people have RFID tags on them? Tags used for inventory control in stores and not deactivated at the point of sale are one <a href="http://online.wsj.com/article/SB10001424052748704421304575383213061198090.html">increasingly common possibility</a>: they would end up in shopping bags (or even on clothes being worn, although that’s less likely). RFID tags in wallets and medical devices are another source; these are tags that the user <em>wants</em> to be present and functional.</p>
<p>What makes the tracking device the authors built powerful is that it is low-cost and can be operated surreptitiously at some distance from the victim: up to 2.75 meters, or 9 feet. They show that 5.4 bits of entropy can be extracted from a single tag, which means that 5 tags on a person gives 22 bits, easily enough to distinguish everyone who might be in a particular mall.</p>
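<p>To make “easily enough” concrete, here is a rough sketch of the arithmetic, taking the 22-bit figure at face value. The crowd size of 100,000 is a hypothetical number picked for illustration, not one from the paper, and the helper names are mine.</p>

```python
import math


def bits_needed(population: int) -> float:
    """Bits of entropy needed to single out one member of a population."""
    return math.log2(population)


def distinguishable(bits: int) -> int:
    """Number of equally likely identities that `bits` of entropy can tell apart."""
    return 2 ** bits


# Hypothetical crowd size, chosen only for illustration.
mall_visitors = 100_000

print(round(bits_needed(mall_visitors), 1))  # 16.6
print(distinguishable(22))                   # 4194304
```

<p>Under that assumption, a crowd of 100,000 needs fewer than 17 bits, while 22 bits can separate about four million equally likely identities.</p>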
<p>To assess the practical privacy risk, technological feasibility is only one dimension. We also need to ask who the adversary is and what the incentives are. Tracking people, especially shoppers, in physical space has the strongest incentive of all: selling products. While online tracking is pervasive, the majority of shopping dollars are still spent offline, and there’s still no good way to automatically identify people when they are in the vicinity in order to target offers to them. Facial recognition technology is highly error-prone and creeps people out, and that’s where RF fingerprinting comes in.</p>
<p>That said, RF fingerprinting is only one of the many ways of passively tracking people <em>en masse</em> in physical space — unintentional leaks of identifiers from smartphones and logical-layer identification of RFID tags seem more likely — but it’s probably the hardest to defend against. It is possible to disable RFID tags, but this is usually irreversible and it’s difficult to be sure you haven’t missed any. RFID jammers are another option but they are far from easy to use and are probably <a href="http://consumerist.com/2007/01/protect-your-rfid-credit-card-with-a-rf-jammer.html">illegal in the U.S</a>. One of the ETH Zurich researchers <a href="http://www.mics.org/Workshop2011/Slides/Zanetti_WS11_FingerprintingRFIDTags.pdf">suggests</a> tinfoil wrapping when going out shopping :-)</p>
<p style="text-align:center;"><img class="aligncenter" title="tinfoil" src="http://33bits.files.wordpress.com/2011/10/tinfoil.png?w=162&#038;h=338" alt="" width="162" height="338" /></p>
<p>[1] Active RFID chips exist but most commercial systems use passive ones, and that’s what the fingerprinting research has focused on.</p>
<p>[2] They used a population of 50 tags, but this number is largely irrelevant since the experiment was one of binary classification rather than 1-out-of-n identification.</p>
<p><em>Thanks to Vincent Toubiana for comments on a draft.</em></p>
<p>To stay on top of future posts, <a href="http://33bits.org/feed/">subscribe</a> to the RSS feed or <a href="http://twitter.com/random_walker">follow me on Twitter</a> or <a href="https://plus.google.com/u/0/110908828231461227679">Google+</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/10/04/fingerprinting-of-rfid-tags-and-high-tech-stalking/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/10/rfid.png" medium="image">
			<media:title type="html">rfid</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/10/tinfoil.png" medium="image">
			<media:title type="html">tinfoil</media:title>
		</media:content>
	</item>
		<item>
		<title>No Two Digital Cameras Are the Same: Fingerprinting Via Sensor Noise</title>
		<link>http://33bits.org/2011/09/19/digital-camera-fingerprinting/</link>
		<comments>http://33bits.org/2011/09/19/digital-camera-fingerprinting/#comments</comments>
		<pubDate>Mon, 19 Sep 2011 17:25:56 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[anonymity]]></category>
		<category><![CDATA[de-anonymization]]></category>
		<category><![CDATA[fingerprinting]]></category>
		<category><![CDATA[signal processing]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=980</guid>
		<description><![CDATA[The previous article looked at how pieces of blank paper can be uniquely identified. This article continues the fingerprinting theme to another domain, digital cameras, and ends by speculating on the possibility of applying the technique on an Internet-wide scale. For various kinds of devices like digital cameras and RFID chips, even supposedly identical units that [...]]]></description>
			<content:encoded><![CDATA[<p class="c0"><em><span class="c4">The </span><span class="c6 c4"><a class="c2" href="http://33bits.org/2011/09/13/everything-has-a-fingerprint-the-case-of-blank-paper/">previous article</a></span><span class="c4"> looked at how pieces of blank paper can be uniquely identified. This article continues the fingerprinting theme to another domain, digital cameras, and ends by speculating on the possibility of applying the technique on an Internet-wide scale.</span></em></p>
<p class="c0">For various kinds of devices like digital cameras and RFID chips, even supposedly identical units that come out of a manufacturing plant behave slightly differently in characteristic ways, and can therefore be distinguished based on their output or behavior. How could this be? The unifying principle is this:</p>
<div style="background-color:#eef;border:1px dashed #bcc;margin-left:20px;margin-bottom:15px;padding:5px;">Microscopic physical irregularities due to natural structure and/or manufacturing defects cause observable, albeit tiny, behavioral differences.</div>
<p class="c0">Digital camera identification belongs to a class of techniques that exploits ‘pattern noise’ in the ‘sensor arrays’ that capture images. The same techniques can be used to fingerprint a scanner by analyzing pixel-level patterns in the images scanned by it, but that’ll be the focus of a later article.</p>
<p class="c0 c9" style="text-align:center;"><a href="http://33bits.files.wordpress.com/2011/09/imageds.jpg"><img class="aligncenter size-full wp-image-981" title="Dark signal" src="http://33bits.files.wordpress.com/2011/09/imageds.jpg?w=455&#038;h=303" alt="" width="455" height="303" /></a></p>
<p class="c0"><strong>A long-exposure dark frame [<span class="c6"><a class="c2" href="http://www.cameralabs.com/forum/viewtopic.php?t=1094">source</a></span>]. Click image to see full size. Three ‘hot pixels’ and some other sensor noise can be seen.</strong></p>
<p class="c0">A photo taken in the absence of any light doesn’t look completely black; a variety of factors introduce noise. There is random noise that varies in every image, but there is also ‘pattern noise’ due to inherent structural defects or irregularities in the physical sensor array. The key property of the latter kind of noise is that it manifests the same way in every image taken by the camera.[1] Thus, the total noise vector produced by a camera is not identical between images, nor is it completely independent.</p>
<div style="background-color:#eef;border:1px dashed #bcc;margin-left:20px;margin-bottom:15px;padding:5px;">The pixel-level noise components in images taken by the same camera are correlated with each other.</div>
<p class="c0">Nevertheless, separating the pattern noise from random noise and the image itself — after all, a good camera will seek to minimize the strength or ‘power’ of the noise in relation to the image — is a very difficult task, and is the primary technical challenge that camera fingerprinting techniques must address.</p>
<p class="c0"><strong><span class="c3">Security vs. privacy.</span></strong> A quick note about the applications of camera fingerprinting. We saw in the <span class="c6"><a class="c2" href="http://33bits.org/2011/09/13/everything-has-a-fingerprint-the-case-of-blank-paper/">previous article</a></span> that there are security-enhancing and privacy-infringing applications of document fingerprinting. In fact, this is almost <em><span class="c4">always</span></em> the case with fingerprinting techniques. [2]</p>
<p class="c0">Camera fingerprinting can be used on the one hand for detecting forgeries (e.g., photoshopped images), and to <span class="c6"><a class="c2" href="http://www.physorg.com/news64638499.html">aid criminal investigations</a></span> by determining who (or rather, which camera) might have taken a picture. On the other hand, it could potentially also be used for unmasking individuals who wish to disseminate photos anonymously online.</p>
<p class="c0">Sadly, most papers studying fingerprinting study only the former type of application, which is why we’ll have to speculate a bit on the privacy impact, even though the underlying math of fingerprinting is the same.</p>
<div style="background-color:#eef;border:1px dashed #bcc;margin-left:20px;margin-bottom:15px;padding:5px;">Most fingerprinting techniques have both security-enhancing and privacy-infringing applications. The underlying principles are the same but they are applied slightly differently.</div>
<p class="c0">Another point to note is that because of the focus on forensics, <em><span class="c4">most of the work in this area so far has studied distinguishing different camera models</span></em>. But there are some preliminary results on distinguishing ‘identical’ cameras, and it appears that the same techniques will work.</p>
<p class="c0"><strong><span class="c3">In more detail.</span></strong> Let’s look at what I think is the best-known <span class="c6"><a class="c2" href="http://www.ws.binghamton.edu/fridrich/Research/double.pdf">paper</a></span> on sensor pattern noise fingerprinting, by Binghamton University researchers Jan Lukáš, <span class="c6"><a class="c2" href="http://en.wikipedia.org/wiki/Jessica_Fridrich">Jessica Fridrich</a></span>, and Miroslav Golja. [3] Here’s how it works: the first step is to build a reference pattern of a camera from multiple known images taken with it, so that later an unsourced image can be compared against these reference patterns. The authors suggest using at least 50, but for good measure, they use 320 in their experiments. In the forensics context, the investigator probably has physical possession of the camera and therefore can generate an unlimited number of images. We’ll discuss what this requirement means in the privacy-breach context later.</p>
<p class="c0">There are two steps to build the reference pattern. First, for each image, a <span class="c6"><a class="c2" href="http://en.wikipedia.org/wiki/Noise_reduction#In_images">denoising filter</a></span> is applied, and the denoised image is subtracted from the original to leave only the noise. Next, the noise is averaged across all the reference images — this way the random noise cancels out and leaves the pattern noise.</p>
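<p class="c0">The two steps can be sketched in a few lines of NumPy. This is only an illustration under loud assumptions: a crude box blur stands in for the wavelet-based denoising filter the authors use, the images are synthetic, the pattern noise is modeled as purely additive, and all function names are hypothetical.</p>

```python
import numpy as np


def box_blur(image, radius=1):
    """Stand-in denoiser (an assumption, not the paper's filter): windowed mean."""
    padded = np.pad(image, radius, mode="reflect")
    size = 2 * radius + 1
    height, width = image.shape
    total = np.zeros((height, width), dtype=float)
    for dy in range(size):
        for dx in range(size):
            total += padded[dy:dy + height, dx:dx + width]
    return total / (size * size)


def noise_residual(image):
    """Step 1: subtract the denoised image from the original, leaving the noise."""
    return image - box_blur(image)


def reference_pattern(images):
    """Step 2: average the residuals so that the random noise cancels out."""
    return np.mean([noise_residual(img) for img in images], axis=0)


# Synthetic camera: a flat gray scene plus a fixed pattern plus fresh random noise.
rng = np.random.default_rng(0)
shape = (32, 32)
true_pattern = rng.normal(0.0, 1.0, shape)
shots = [100.0 + true_pattern + rng.normal(0.0, 3.0, shape) for _ in range(200)]

reference = reference_pattern(shots)
similarity = np.corrcoef(reference.ravel(), true_pattern.ravel())[0, 1]
print(f"correlation with the simulated pattern: {similarity:.2f}")
```

<p class="c0">Averaging more images shrinks the random component further, which is one way to read the recommendation to use at least 50 reference images.</p>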
<p class="c0">Comparing a new image to a reference pattern, to test if it came from that camera, is easy: extract the noise from the test image, and compare this noise pixel-by-pixel with the reference noise. The noise from the test image includes random noise, so the match won’t be close to perfect, but nevertheless the <em><span class="c4">correlation</span></em> between the two noise patterns will be roughly equal to the contribution of pattern noise towards the total noise in the test image. On the other hand, if the test image didn’t come from the same camera, the correlation will be close to zero.</p>
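<p class="c0">Here is a small simulation of that comparison, again a sketch rather than the authors’ code. The noise residuals are generated directly instead of being extracted from photos, the random noise is assumed to be three times stronger than the pattern noise, and the decision threshold of 0.1 is a made-up value for illustration.</p>

```python
import numpy as np


def correlation(a, b):
    """Normalized correlation between two noise arrays, compared pixel by pixel."""
    a = a.ravel() - a.mean()
    b = b.ravel() - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


rng = np.random.default_rng(0)
shape = (64, 64)

# Hypothetical pattern noise of two cameras (unit strength, synthetic).
pattern_a = rng.normal(0.0, 1.0, shape)
pattern_b = rng.normal(0.0, 1.0, shape)


def simulate_residual(pattern, random_sigma=3.0):
    """Noise of one test image: the camera's pattern plus stronger random noise."""
    return pattern + rng.normal(0.0, random_sigma, shape)


same = correlation(simulate_residual(pattern_a), pattern_a)
other = correlation(simulate_residual(pattern_b), pattern_a)

THRESHOLD = 0.1  # illustrative assumption, not a value from the paper
print(f"same camera:      {same:.3f} -> match: {same > THRESHOLD}")
print(f"different camera: {other:.3f} -> match: {other > THRESHOLD}")
```

<p class="c0">With these settings the same-camera correlation should come out near 1/sqrt(1 + 3<sup>2</sup>), about 0.32, which is far from a perfect match yet clearly separated from the near-zero value for the other camera.</p>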
<p class="c0">The authors experimented with nine cameras, of which two were from the same brand and model (Olympus Camedia C765). In addition, two other cameras had the same type of sensor. There was not a single error in their 2,700 tests, including those involving the two ‘identical’ cameras; in each case, the algorithm correctly identified which of the nine cameras a given image came from. By extrapolating the correlation curves, they conservatively estimate that for a False Accept Rate of 10<sup>-3</sup>, their method achieves a False Reject Rate of anywhere between 10<sup>-2</sup> and 10<sup>-10</sup> or even less, depending on the camera model and camera settings.</p>
<p class="c0">The takeaway from this seems to be that distinguishing between cameras of different models can be performed with essentially perfect accuracy. Distinguishing between cameras of the same model also seems to have very high accuracy, but it is hard to generalize because of the small sample size.</p>
<p class="c0"><strong><span class="c3">Improvements.</span></strong> Impressive as the above numbers are, there are at least two major ways in which this result can be, and has been, improved. First, the Binghamton paper is focused on a specific signal, sensor noise. But there are several stages in the image acquisition and processing pipeline in the camera, each of which could leave idiosyncratic effects on the image. <span class="c6"><a class="c2" href="http://www.busim.ee.boun.edu.tr/~sankur/SankurFolder/IEEE_IFS_Cellphon_Camera.pdf">This paper</a></span> out of Turkey incorporates many such effects by considering all patterns of certain types that occur in the lower order (least significant) bits of the image, which seems like a rather powerful technique.</p>
<p class="c0">The effects other than sensor noise seem to help more with identifying the camera model than the specific device, but to the extent that the former is a component of the latter, it is useful. They achieve a 97.5% accuracy among 16 test cameras — but with cellphone cameras with pictures at a resolution of just 640&#215;480.</p>
<p class="c0">Second is the effect of the scene itself on the noise. Denoising transformations are not perfect — sharp boundaries look like noise. The Binghamton researchers picked their denoising filter (a wavelet transform) to minimize this problem, but a recent <span class="c6"><a class="c2" href="http://wrap.warwick.ac.uk/3318/1/WRAP_Li_Source_Camera.pdf">paper</a></span> by Chang-Tsun Li claims to do it better, and shows even better numerical results: with 6 cameras (all different models), accurate (over 99%) identification for image fragments cropped to just 256 x 512.</p>
<p class="c0"><strong><span class="c3">What does this mean for privacy?</span></strong> I said earlier that there is a duality between security and privacy, but let’s examine the relationship in more detail. In privacy-infringing applications like mass surveillance, the algorithm need not always produce an answer, and it can occasionally be wrong when it does. The penalty for errors is much lower. On the other hand, the matching algorithm in surveillance-like applications needs to handle a far larger number of candidate cameras. The key point is:</p>
<div style="background-color:#eef;border:1px dashed #bcc;margin-left:20px;margin-bottom:15px;padding:5px;">The parameters of fingerprinting algorithms can usually be tweaked to handle a larger number of classes (i.e., devices) at the expense of accuracy.</div>
<p class="c0">My intuition is that state-of-the-art techniques, configured slightly differently, should allow probabilistic deanonymization from among tens of thousands of different cameras. A Flickr or Picasa profile with a few dozen images should suffice to fingerprint a camera.[4] Combined with metadata such as location, this puts us within striking distance of Internet-scale source-camera identification from anonymous images. I really hope there will be some serious research on this question.</p>
<p class="c0">Finally, a word on defenses. If you find yourself in a position where you wish to anonymously publicize a sensitive photograph you took, but your camera is publicly tied to your identity because you’ve previously shared pictures on social networks (and who hasn’t), how do you protect yourself?</p>
<p class="c0">Compressing the image is one possibility, because that destroys the &#8216;lower-order&#8217; bits that fingerprinting crucially depends on. However, it would have to be way more aggressive than most camera defaults (JPEG quality factor ~60% according to one of the studies, whereas defaults are ~95%). A different strategy is rotating the image slightly in order to ‘desynchronize’ it, throwing off the fingerprint matching. An attack that defeats this will have to be much more sophisticated and will have a far higher error rate.</p>
<p class="c0">The deanonymization threat here is analogous to <span class="c6"><a class="c2" href="http://33bits.org/2009/01/15/de-anonymizing-the-internet/">writing-style fingerprinting</a></span>: there are simple defenses, albeit not foolproof, but sadly most users are unaware of the problem, let alone solutions.</p>
<p class="c0">[1] That was a bit simplified; mathematically, there is an additive component (dark signal nonuniformity) and a multiplicative component (photoresponse nonuniformity). The former is easy to correct for, and higher-end cameras do, but the latter isn’t.</p>
<p class="c0">[2] Much has been said about the tension between security and privacy at a social/legal/political level, but I’m making a relatively uncontroversial technical statement here.</p>
<p class="c0">[3] Fridrich is incidentally one of the pioneers of <span class="c6"><a class="c2" href="http://en.wikipedia.org/wiki/Speedcubing">speedcubing</a></span>, i.e., speed-solving the Rubik’s cube.</p>
<p class="c0">[4] The Binghamton paper uses 320 images per camera for building a fingerprint (and recommends at least 50); the Turkey paper uses 100, and Li’s paper 50. I suspect that if more than one image taken from the unknown camera is available, then the number of reference images can be brought down by a corresponding factor.</p>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/09/19/digital-camera-fingerprinting/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/09/imageds.jpg" medium="image">
			<media:title type="html">Dark signal</media:title>
		</media:content>
	</item>
		<item>
		<title>Everything Has a Fingerprint: The Case of Blank Paper</title>
		<link>http://33bits.org/2011/09/13/everything-has-a-fingerprint-the-case-of-blank-paper/</link>
		<comments>http://33bits.org/2011/09/13/everything-has-a-fingerprint-the-case-of-blank-paper/#comments</comments>
		<pubDate>Tue, 13 Sep 2011 18:41:56 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[fingerprinting]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=963</guid>
		<description><![CDATA[This article is the first in a series that looks at “fingerprinting” techniques and the implications for privacy. Unique-identification techniques similar to fingerprints have been applied in an astonishing variety of contexts in recent decades. Biometrics like iris and DNA profiling are well known, but there are lesser known methods like hand geometry, as well [...]]]></description>
			<content:encoded><![CDATA[<div>
<p style="padding-left:30px;"><em>This article is the first in a series that looks at “fingerprinting” techniques and the implications for privacy.</em></p>
<p>Unique-identification techniques similar to fingerprints have been applied in an astonishing variety of contexts in recent decades. Biometrics like iris and DNA profiling are well known, but there are lesser known methods like <a href="http://en.wikipedia.org/wiki/Hand_geometry">hand geometry</a>, as well as “behavioral biometrics” like voice, handwriting, typing patterns, and even <a href="http://www.springerlink.com/content/9k91axk7lx5h6jxx/">gait analysis</a>. Many techniques for deanonymization, the principal topic of this blog, work by “fingerprinting” people’s preferences, habits, or style.</p>
<p>But this article is not about biometrics, nor is it about fingerprinting of <a href="http://en.wikipedia.org/wiki/Acoustic_fingerprint">content</a> or complex systems such as <a href="http://panopticlick.eff.org/">a web browser in conjunction with the OS and the user</a>.[1] I will instead discuss one of the most surprising domains of fingerprinting — blank paper.</p>
<p><a href="http://33bits.files.wordpress.com/2011/09/blankpaperupclose.jpeg"><img class="aligncenter size-full wp-image-964" title="Blank paper under the microscope" src="http://33bits.files.wordpress.com/2011/09/blankpaperupclose.jpeg?w=455" alt=""   /></a></p>
<p>This is what paper looks like up close — far from being smooth, it has a rich natural structure. Even considering this, the state-of-the-art <a href="http://citp.princeton.edu/pub/paper09oak.pdf">study</a> on fingerprinting of physical documents, by <a href="http://www.cs.princeton.edu/~wclarkso/">Will Clarkson</a> and colleagues at Princeton, achieves something remarkable: they show how to extract fingerprints from paper using just commodity scanners, and no microscopic technology. The fingerprint survives when the document/paper is printed on, written or scribbled on, or even soaked in water.</p>
<div class="mceTemp mceIEcenter">
<dl class="wp-caption aligncenter">
<dt class="wp-caption-dt"><a href="http://33bits.files.wordpress.com/2011/09/scannedpaper.png"><img class="size-full wp-image-965" title="Scanned paper" src="http://33bits.files.wordpress.com/2011/09/scannedpaper.png?w=455&#038;h=336" alt="" width="455" height="336" /></a></dt>
</dl>
<h5 class="wp-caption-dd">A small (10mm tall) region of paper scanned from two different angles — top-to-bottom and left-to-right</h5>
</div>
<p>The image above, taken from the Princeton paper, shows what the output of a scanner looks like. Not quite the resolution of the microscopic image, but a lot of structure is still visible. The key technique is this: by scanning the paper at different orientations and comparing the images, the height at each point is estimated, and from these heights a 3-D map of the not-so-flat surface of the paper is constructed.</p>
<p>These 3-D maps can be used as fingerprints, but for efficiency they look at the maps of only about 100 randomly picked small “patches” on the paper. To further compress the extracted information, they do a “dimensionality reduction,” resulting in a 400 byte “feature vector” for each piece of paper, which is the fingerprint.</p>
<p>To verify or compare an observed fingerprint against a stored one, they simply look at the Hamming distance between the two bit-vectors. Why does this simple comparison technique succeed? Comparison of two human fingerprints is a lot more difficult, after all. It’s because a rectangular piece of paper has a nice property that human skin doesn’t: <em>when the objects being fingerprinted have a precise, fixed geometry, fingerprint verification is easy — it is just a pointwise comparison of the corresponding features.</em></p>
<p>The result of such comparisons is this: two fingerprints from different pieces of paper match in roughly 50% of the bits, almost always in the 45%–55% range. Two fingerprints from the same piece of paper, on the other hand, differ in less than 5% of the bits, and occasionally up to 20% of bits if it has been handled particularly badly, such as by soaking. Therefore it is straightforward to infer whether or not two fingerprints came from the same piece of paper.</p>
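<p>A minimal sketch of this comparison, assuming fingerprints are handed around as 400-byte strings. The 30% decision threshold is my own hypothetical choice, picked to sit between the two ranges above, and the “rescan” is simulated by flipping each bit with 5% probability rather than by scanning anything.</p>

```python
import random


def hamming_fraction(fp_a: bytes, fp_b: bytes) -> float:
    """Fraction of bit positions in which two equal-length fingerprints differ."""
    if len(fp_a) != len(fp_b):
        raise ValueError("fingerprints must have equal length")
    diff = int.from_bytes(fp_a, "big") ^ int.from_bytes(fp_b, "big")
    return bin(diff).count("1") / (8 * len(fp_a))


def same_paper(fp_a: bytes, fp_b: bytes, threshold: float = 0.3) -> bool:
    """Declare a match when fewer than `threshold` of the bits differ (assumed cutoff)."""
    return threshold > hamming_fraction(fp_a, fp_b)


rng = random.Random(0)


def random_fingerprint(n_bytes: int = 400) -> bytes:
    """Synthetic stand-in for a fingerprint extracted from a sheet of paper."""
    return bytes(rng.getrandbits(8) for _ in range(n_bytes))


def rescan(fp: bytes, flip_prob: float = 0.05) -> bytes:
    """Simulate a noisy second scan by flipping each bit with probability flip_prob."""
    out = bytearray(fp)
    for i in range(len(out) * 8):
        if flip_prob > rng.random():
            out[i // 8] ^= 2 ** (i % 8)
    return bytes(out)


sheet = random_fingerprint()
other_sheet = random_fingerprint()

print(same_paper(sheet, rescan(sheet)))  # True
print(same_paper(sheet, other_sheet))    # False
```

<p>Because the two ranges are so far apart, the exact threshold hardly matters; anything comfortably between 20% and 45% would give the same answers here.</p>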
<p>Readers familiar with the “<a href="http://33bits.org/about/">33 bits of entropy</a>” concept might notice that the fingerprint here is 400 bytes long, or 3200 bits, which is ridiculously high. There are surely fewer than 2<sup>50</sup> pieces of paper in the world (that’s over a hundred thousand for every person), which means that these fingerprints should easily be able to uniquely identify every piece of paper in the world. [2] The authors estimate that the chance of an error is no more than 1 in 10<sup>148</sup>. In other words, for all practical purposes they achieve perfect accuracy.</p>
<p>What are the implications? As the authors point out, document identification “has a wide range of applications, including detecting forged currency and tickets, authenticating passports, and halting counterfeit goods.” On the negative side, it “could also be applied maliciously to de-anonymize printed surveys and to compromise the secrecy of paper ballots.”</p>
<p>[1] This is often referred to as <a href="http://en.wikipedia.org/wiki/Device_fingerprint">device fingerprinting</a>, but I find that a poor choice of terminology and will reserve that term for a different concept in this series.</p>
<p>[2] It is hard to estimate entropy exactly in cases like this, but the feature vector is obtained via <a href="http://en.wikipedia.org/wiki/Principal_component_analysis">Principal Component Analysis</a>, which makes it likely that the entropy is close to the maximum value of 3200 bits.</p>
<p><em>Thanks to Will Clarkson for reviewing a draft of this post.</em></p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/09/13/everything-has-a-fingerprint-the-case-of-blank-paper/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/09/blankpaperupclose.jpeg" medium="image">
			<media:title type="html">Blank paper under the microscope</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/09/scannedpaper.png" medium="image">
			<media:title type="html">Scanned paper</media:title>
		</media:content>
	</item>
		<item>
		<title>Google+ and Privacy: A Roundup</title>
		<link>http://33bits.org/2011/07/03/google-and-privacy-a-roundup/</link>
		<comments>http://33bits.org/2011/07/03/google-and-privacy-a-roundup/#comments</comments>
		<pubDate>Sun, 03 Jul 2011 19:04:52 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[privacy]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=938</guid>
		<description><![CDATA[By all accounts, Google has done a great job with Plus, both on privacy and on the closely related goal of better capturing real-life social nuances. [1] This article will summarize the privacy discussions I’ve had in the first few days of using the service and the news I’ve come across. The origin of Circles [...]]]></description>
			<content:encoded><![CDATA[<p>By all accounts, Google has done a great job with <a href="https://plus.google.com/">Plus</a>, both on privacy and on the closely related goal of better capturing real-life social nuances. [1] This article will summarize the privacy discussions I’ve had in the first few days of using the service and the news I’ve come across.</p>
<p><strong>The origin of Circles</strong></p>
<p>“Circles,” as you’re probably aware, is the big privacy-enhancing feature. A presentation titled “<a href="http://www.slideshare.net/padday/the-real-life-social-network-v2">The Real-Life Social Network</a>” by user-experience designer <a href="http://twitter.com/padday">Paul Adams</a> almost exactly a year ago went viral in the tech community; it looks <a href="http://www.readwriteweb.com/archives/google_to_launch_major_new_social_network_called_c.php">likely</a> this was the genesis, or at least a crystallization, of the Circles concept.</p>
<p>Adams defected to Facebook a few months later, which led to <a href="http://techcrunch.com/2010/12/20/paul-adams-googler-whose-presentation-foretold-facebook-groups-heads-to-facebook/">speculation</a> that it was the end of whatever plans Google may have had for the concept. But little did the world know at the time that Plus was a company-wide, bet-the-farm initiative involving <a href="http://www.wired.com/epicenter/2011/06/inside-google-plus-social/all/1">30 product teams</a> and hundreds of engineers, and that the departure of one person made no difference.</p>
<p>Meanwhile, Facebook introduced a <a href="http://www.facebook.com/help/?page=768">friend-lists feature</a> but it was DOA. When you’re staring at a giant list of several hundred “friends” — Facebook doesn’t do a good job of discouraging indiscriminate friending — categorizing them all is intimidating to say the least. My guess is that Facebook was merely playing the <a href="http://preibusch.de/publications/social_networks/privacy_jungle_dataset.htm">privacy communication game</a>.</p>
<p><strong>Why are circles effective?</strong></p>
<p>I did an informal poll to see if people are taking advantage of Circles to organize their friend groups. Admittedly, I was looking at a tech-savvy, privacy-conscious group of users, but the response was overwhelming, and it was enough to convince me that Circles will be a success. There’s a lot of excitement among the early user community as they collectively figure out the technology as well as the norms and best practices for Circles. For example, this <a href="https://plus.google.com/u/0/111661289724043424828/posts/SqaoG4Jc9rc">tip on how to copy a circle</a> has been shared over 400 times as I write this.</p>
<p>One obvious explanation is that Circles captures real-life boundaries, and this is what users have been waiting for all along. That’s no doubt true, but I think there’s more to it than that. Multiple people have pointed out how the exemplary user interface for creating circles encouraged them to explore the feature. It is gratifying to see that Google has finally learned the importance of interface and interaction design in getting social right.</p>
<p>There are several other UI features that contribute to the success of Circles. When friending someone, you’re <em>forced</em> to pick one or more circles, instead of being allowed to drop them into a generic bucket and categorize them later. But in spite of this, the UI is so good that I find it no harder than friending on Facebook.</p>
<p>In addition, you have to pick circles to share each post with (but again the interface makes it really easy). Finally, each post has a little snippet that shows who can see it, which has the effect of constantly reminding you to mind the information flow. In short, it is nearly impossible to ignore the Circles paradigm.</p>
<p><strong>The resharing bug</strong></p>
<p>Google+ tries to balance privacy with Twitter-like resharing, which is always going to be tricky. Amusing inconsistencies result if you share a post with a circle that doesn’t include the original poster. A more serious issue, pointed out by many people including an <a href="http://blogs.ft.com/fttechhub/2011/06/google-plus-privacy-flaw">FT blogger</a>, is that “limited” posts can be publicly reshared. To their credit, Google engineers acknowledged it and quickly disabled the feature.</p>
<p>Meanwhile, some have opined that this issue is “<a href="http://www.techdirt.com/articles/20110701/00262714929/first-totally-bogus-privacy-issue-over-google-raised.shtml">totally bogus</a>” and that this is <a href="http://www.buzzmachine.com/2011/06/30/social-is-for-sharing-not-hiding/">how life works</a> and how email works, in that when you tell someone a secret, they could share it with others. I strongly disagree, for two reasons.</p>
<p>First, this is <em>not</em> how the real world (or even email) works. Someone can repeat a secret you told them in real life, or forward an email, but they typically won’t <em>broadcast it to the whole world</em>. We’re talking about making something <em>public</em> here, something that will be forever associated with your real name and could very well come up in a web search.</p>
<p>Second, user-interface hints are an important and well-established way of nudging privacy-impacting behaviors. If there’s a ‘share’ button with a ‘public’ setting, many users will assume that it is OK to do just that. Twitter used to allow public retweets of protected tweets, and a <a href="http://w2spconf.com/2010/papers/p28.pdf">study</a> found that this had been done millions of times. In response, Twitter removed this ability. The <a href="http://privicons.org/">privicons</a> project seeks to embed similar hints in emails.</p>
<p>In other words, the privacy skeptics are missing the point: the goal of the feature is not to try to technologically <em>prevent</em> leakage of protected information, but to better <em>communicate</em> to users what’s OK to share and what isn’t. And in this case, the simplest way to do that is to remove the 1-click ability to share protected content publicly, and instead let users copy-paste if they really want to do that. It would also make sense to remind users to be careful when they’re sharing a limited post with their circles, which, I’m happy to see, is <a href="https://plus.google.com/u/0/103541694080221120019/posts/htTdkLezSjP">exactly what Google is doing</a>.</p>
<div id="attachment_939" class="wp-caption aligncenter" style="width: 352px"><img class="size-full wp-image-939 " title="sharingreminder" src="http://33bits.files.wordpress.com/2011/07/sharingreminder.png?w=455" alt=""   /><p class="wp-caption-text">The tip you now see when you share a limited post (with another limited group). This is my favorite Google+ feature.</p></div>
<p><strong>A window into your circles</strong></p>
<p>Paul Ohm <a href="https://plus.google.com/u/0/117949726855391305467/posts/Ykc3irss45D">points out</a> that if someone shares content with a set of circles that includes you, you get to see 21 users who are part of those circles, apparently picked at random. [2] This means that if you look at these lists of 21 over time you can figure out a lot about someone&#8217;s circles, and possibly decipher them completely. Note that by default your profile shows a list of users in your circles, but not who&#8217;s in <em>which</em> circle, which for most people is <a href="http://twitter.com/mrgunn/statuses/86531372822441984">significantly more sensitive</a>.</p>
<p>In my view, this is an interesting finding, but not anything Google needs to fix; the feature is very useful (and arguably privacy-<em>enhancing</em>) and the information leakage is an inevitable tradeoff. But it’s definitely something that users would do well to be aware of: the secrecy of your circles is far from bulletproof.</p>
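<p>To get a feel for how quickly such lists could add up, here is a minimal simulation sketch. It assumes a single circle and that each post reveals a uniformly random sample of 21 members, which is a simplification (as footnote [2] notes, the real selection may not be uniform). The function name and parameters are hypothetical, chosen only for illustration.</p>

```python
import random


def posts_until_circle_revealed(circle_size, sample_size=21, seed=0):
    """Count posts an observer must see before every member has appeared.

    Assumption: each post shows a uniformly random sample of the circle.
    """
    rng = random.Random(seed)
    members = range(circle_size)
    shown_per_post = min(sample_size, circle_size)
    seen = set()
    posts = 0
    while len(seen) != circle_size:
        # Each post leaks another random handful of circle members.
        seen.update(rng.sample(members, shown_per_post))
        posts += 1
    return posts


if __name__ == "__main__":
    for size in (21, 50, 200):
        print(size, posts_until_circle_revealed(size))
```

<p>Under these assumptions, a circle of 21 or fewer members is exposed by a single post, and larger circles take more posts. Preferential display of mutual friends would slow the process down for everyone else in the circle.</p>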
<p>Speaking of which, the network visibility of different users on their profile page confused me terribly, until I realized Google+ is A/B testing that privacy setting! These are the two possibilities you could see when you edit your profile and click the circles area in the left sidebar: <a href="http://dl.dropbox.com/u/131764/web/graphprefs1.png">A</a>, <a href="http://dl.dropbox.com/u/131764/web/graphprefs2.png">B</a>. This is very interesting and unusual. At any rate, very few users seem to have changed the defaults so far, based on a random sample of a few dozen profiles.</p>
<p><strong>Identity and distributed social networking</strong></p>
<p>Some people are peeved that Google+ discourages you from participating pseudonymously. I don’t think a social network that wants to target the mainstream and wants to capture real-world relationships has any real choice about this. In fact, I want it to go further. Right now, Google+ often suggests I add someone I’ve already added, which turns out to be because I’ve corresponded with multiple email addresses belonging to that person. Such user confusion could be minimized if the system did some graph-mining to automatically figure out which identities belong to the same person. [3]</p>
<p>A related question is what this will mean for distributed social networking, which was <a href="http://www.wired.com/epicenter/2010/05/facebook-rogue/">hailed</a> a year ago as the savior of privacy and user control. My guess is that Google+ will take the wind out of it: <a href="https://www.google.com/takeout/">Google Takeout</a> gives you a significant degree of control over your data. Further, due to the <a href="http://allthingsd.com/20110607/whats-twitters-identity-now-that-its-apples-identity-provider/">Apple-Twitter integration</a> and the success of Android, the threat of Facebook monopolizing identities has been obliterated; there are at least three strong players now.</p>
<p>Another reason why Google+ competes with distributed social networks: for people worried about the social networking service provider (or the Government) reading their posts, client-side encryption on top of Google+ could work. The Circles feature is exactly what is needed to make encrypted posts viable, because you can make a circle of those who are using a compatible encryption/decryption plugin. At least a half-dozen such plugins have been created over the years (examples: <a href="http://www.bbc.co.uk/news/technology-12215921">1</a>, <a href="https://uprotect.it/index">2</a>), but it doesn’t make much sense to use these over Facebook or Twitter. Once the <a href="http://news.cnet.com/8301-19882_3-20075974-250/developer-api-for-google-its-coming/">Google+ developer API</a> rolls out, I’m sure we’ll see yet another avatar of the encrypted status message idea, and perhaps the n-th time will be the charm.</p>
<p><strong>Concluding thoughts</strong></p>
<p>Two years ago, I <a href="http://33bits.org/2009/09/09/livejournal-done-right-the-case-for-a-social-network-with-built-in-privacy/">wrote</a> that there’s a market case for a privacy-respecting social network to fill Livejournal’s shoes. Google+ seems poised to fulfill most of what I anticipated in that essay; the asymmetric nature of relationships and the ability to present different facets of one’s life to different people are two important characteristics that the two social networks have in common. [4]</p>
<p>Many have speculated on whether, and to what extent, Google+ is a threat to Facebook. One recurring comparison is Facebook as “ghetto” compared to Plus, such as in <a href="http://i.imgur.com/OJiZu.png">this image</a> making the rounds on Reddit, reminiscent of Facebook vs. Myspace a few years ago. This perception of “coolness” and “class” is the single biggest thing Google+ has got going for it, more than any technological feature.</p>
<p>It’s funny how people see different things in Google+. While I’m planning to use Google+ as a Livejournal replacement for protected posts, since that’s what fits my needs, the majority of the commentary has compared it to Facebook. A few think it could <a href="http://venturebeat.com/2011/06/30/google-could-make-twitter-the-next-myspace/">replace Twitter</a>, generalizing from their own corner of the Google+ network where people haven’t been using the privacy options. Forbes, being a business publication, thinks <a href="http://blogs.forbes.com/quentinhardy/2011/06/29/google-other-targets/">LinkedIn is the target</a>. I’ve seen a couple of commenters saying they might use it instead of Yammer, another business tool. According to yet other articles, <a href="http://www.pixiq.com/article/google-may-not-kill-facebook-but-flickr-should-be-worried">Flickr</a>, <a href="http://gigaom.com/2011/06/28/why-google-plus-wont-hurt-facebook-but-skype-will-hate-it/">Skype</a> and various other Internet companies should be shaking in their boots. Have you heard the parable of the <a href="http://www.noogenesis.com/pineapple/blind_men_elephant.html">blind men and the elephant</a>?</p>
<p>In short, Google+ is whatever you want it to be, and probably a better version of it. It’s remarkable that they’ve pulled this off without making it a confusing, bloated mess. Myspace founder Tom Anderson seems to have the <a href="https://plus.google.com/112063946124358686266/posts/SrQrSSXeViq">most sensible view</a> so far: Google+ is simply a better … <em>Google</em>, in that the company now has a smoother, more integrated set of services. You’d think people would have figured it out from the name!</p>
<p>[1] I will use the term “privacy” in this article to encompass both senses.</p>
<p>[2] It’s actually 22 users, including yourself and the poster. It’s not clear just how random the list is; in my perusal, mutual friends seem to be preferentially picked.</p>
<p>[3] I am <em>not</em> suggesting that Google+ should prevent users from having multiple accounts, although Circles makes it much less useful/necessary to have multiple accounts.</p>
<p>[4] On the other hand, when it comes to third party data collection, I <a href="http://33bits.org/2011/03/18/privacy-and-the-market-for-lemons-or-how-websites-are-like-used-cars/">do not believe</a> that the market can fix itself.</p>
<p>I’m grateful to <a href="http://josephhall.org/">Joe Hall</a>, <a href="http://stanford.edu/~jmayer/">Jonathan Mayer</a>, and many, many others with whom I had interesting discussions, mostly via Google+ itself, on the topics that led to this post.</p>
<p>To stay on top of future posts, <a href="http://33bits.org/feed/">subscribe</a> to the RSS feed or <a href="http://twitter.com/random_walker">follow me on Twitter</a> or <a href="https://plus.google.com/u/0/110908828231461227679">Google+</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/07/03/google-and-privacy-a-roundup/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/07/sharingreminder.png" medium="image">
			<media:title type="html">sharingreminder</media:title>
		</media:content>
	</item>
		<item>
		<title>Data-mining Contests and the Deanonymization Dilemma: a Two-stage Process Could Be the Way Out</title>
		<link>http://33bits.org/2011/06/14/data-mining-contests-and-the-deanonymization-dilemma-a-two-stage-process-could-be-the-way-out/</link>
		<comments>http://33bits.org/2011/06/14/data-mining-contests-and-the-deanonymization-dilemma-a-two-stage-process-could-be-the-way-out/#comments</comments>
		<pubDate>Tue, 14 Jun 2011 18:54:27 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[machine learning]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[prizes]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=925</guid>
		<description><![CDATA[Anonymization, once the silver bullet of privacy protection in consumer databases, has been shown to be fundamentally inadequate by the work of many computer scientists including myself. One of the best defenses is to control the distribution of the data: strong acceptable-use agreements including prohibition of deanonymization and limits on data retention. These measures work [...]]]></description>
			<content:encoded><![CDATA[<p>Anonymization, once the silver bullet of privacy protection in consumer databases, has been shown to be <a href="http://33bits.org/about/sitemap/#deanonymization">fundamentally inadequate</a> by the work of many computer scientists including myself. One of the best defenses is to control the distribution of the data: strong acceptable-use agreements including prohibition of deanonymization and limits on data retention.</p>
<p>These measures work well when outsourcing data to another company or a small set of entities. But what about scientific research and data mining contests involving personal data? Prizes are big and <a href="http://33bits.org/2011/06/06/the-surprising-effectiveness-of-prizes-as-catalysts-of-innovation/">only getting bigger</a>, and by their very nature involve wide data dissemination. Are legal restrictions meaningful or enforceable in this context?</p>
<p>I believe that having participants sign and fax a data-use agreement is much better from the privacy perspective than being able to download the data with a couple of clicks. However, I am sympathetic to the argument that I hear from contest organizers that every extra step will result in a big drop-off in the participation rate. Basic human psychology suggests that <a href="http://webjackalope.com/lazy-registration/">instant gratification is crucial</a>.</p>
<p>That is a dilemma. But the more I think about it, the more I’m starting to feel that a two-step process could be a way to get the best of both worlds. Here’s how it would work.</p>
<p>For the first stage, the current minimally intrusive process is retained, but the contestants don’t get to download the full data. Instead, there are two possibilities.</p>
<ul>
<li>Release data on only a subset of users, minimizing the quantitative risk. [1]</li>
<li>Release a synthetic dataset created to mimic the characteristics of the real data. [2]</li>
</ul>
<p>For the second stage, there are various possibilities, not mutually exclusive:</p>
<ul>
<li>Require contestants to sign a data-use agreement.</li>
<li>Restrict the contest to a shortlist of best performers from the first stage.</li>
<li>Switch to an “online computation model” where participants upload code to the server (or make database queries over the network) and obtain results, rather than download data.</li>
</ul>
<p>Overstock.com recently <a href="http://www.fastcompany.com/1752913/overstock-to-offer-1-million-for-improved-recommendations">announced</a> a contest that conformed to this structure: a synthetic data release followed by a semi-final and a final round in which selected contestants upload code to be evaluated against data. The reason for this structure appears to be partly privacy and partly the fact that they are trying to improve the performance of their <em>live</em> system, and performance needs to be judged in terms of impact on real users.</p>
<p>In the long run, I really hope that an online model will take root. The privacy benefits are significant: high-tech machinery like <a href="http://en.wikipedia.org/wiki/Differential_privacy">differential privacy</a> works better in this setting. Even if such techniques are not employed, and although contestants could in theory extract all the data by issuing malicious queries, the fact that queries are logged and might be audited should serve as a strong deterrent against such mischief.</p>
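<p>As a rough illustration of what such an online model might look like, here is a minimal sketch of a query server that answers counting queries with Laplace noise (the textbook mechanism for differential privacy) and logs every query for later audit. The class, its methods and the toy records are all hypothetical, and this is not a description of any actual contest’s infrastructure; a real deployment would also need to track each contestant’s cumulative privacy budget.</p>

```python
import random


class QueryServer:
    """Toy online-computation server: noisy counts plus an audit log."""

    def __init__(self, records, epsilon=0.5, seed=None):
        self._records = list(records)   # the data never leaves the server
        self._epsilon = epsilon         # privacy parameter, must be positive
        self._rng = random.Random(seed)
        self.audit_log = []             # (contestant, query name) pairs

    def noisy_count(self, contestant, predicate):
        true_count = sum(1 for record in self._records if predicate(record))
        # A count changes by at most 1 when one person is added or removed,
        # so the Laplace noise has scale 1 / epsilon. The difference of two
        # exponential draws with rate epsilon is Laplace-distributed.
        noise = (self._rng.expovariate(self._epsilon)
                 - self._rng.expovariate(self._epsilon))
        self.audit_log.append((contestant, predicate.__name__))
        return true_count + noise


def is_senior(record):
    return record["age"] >= 65


# Made-up records, purely for demonstration.
server = QueryServer([{"age": 70}, {"age": 34}, {"age": 81}], seed=1)
answer = server.noisy_count("team_a", is_senior)
print(answer, server.audit_log)
```

<p>The contestant only ever sees the noisy answer, while the organizer retains a record of who asked what.</p>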
<p>The advantages of the online model go beyond privacy. For example, I served on the <a href="http://www.heritagehealthprize.com/c/hhp">Heritage Health Prize</a> advisory board, and we discussed mandating a limit on the amount of computation that contestants were allowed. The motivation was to rule out algorithms that needed so much hardware firepower that they couldn’t be deployed in practice, but the stipulation had to be rejected as unenforceable. In an online model, enforcement would not be a problem. Another potential benefit is the possibility of collaboration between contestants at the code level, almost like an open-source project.</p>
<p>[1] Obtaining informed consent from the subset whose data is made publicly available would essentially eliminate the privacy risk, but the caveat is the possibility of selection bias.</p>
<p>[2] Creating a synthetic dataset from a real one without leaking individual data points and at the same time retaining the essential characteristics of the data is a serious technical challenge, and whether or not it is feasible will depend on the nature of the specific dataset.</p>
<p>To stay on top of future posts, <a href="http://33bits.org/feed/">subscribe</a> to the RSS feed or <a href="http://twitter.com/random_walker">follow me on Twitter</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/06/14/data-mining-contests-and-the-deanonymization-dilemma-a-two-stage-process-could-be-the-way-out/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>
	</item>
		<item>
		<title>In Silicon Valley, Great Power but No Responsibility</title>
		<link>http://33bits.org/2011/06/11/in-silicon-valley-great-power-but-no-responsibility/</link>
		<comments>http://33bits.org/2011/06/11/in-silicon-valley-great-power-but-no-responsibility/#comments</comments>
		<pubDate>Sat, 11 Jun 2011 07:33:00 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[ethics]]></category>
		<category><![CDATA[privacy]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=895</guid>
		<description><![CDATA[I saw a tweet today that gave me a lot to think about: . A rather intricate example of social adaptation to technology. If I understand correctly, the cousins in question are taking advantage of the fact that liking someone&#8217;s status/post on Facebook generates a notification for the poster that remains even if the post [...]]]></description>
			<content:encoded><![CDATA[<p>I saw a <a href="http://twitter.com/tweets4peace/status/79286375467331584">tweet</a> today that gave me a lot to think about:</p>
<div><a href="http://33bits.files.wordpress.com/2011/06/syriatweet.png"><img class="aligncenter size-medium wp-image-898" title="syriatweet" src="http://33bits.files.wordpress.com/2011/06/syriatweet.png?w=300&#038;h=142" alt="" width="300" height="142" /></a></div>
<p>A rather intricate example of social adaptation to technology. If I understand correctly, the cousins in question are taking advantage of the fact that liking someone&#8217;s status/post on Facebook generates a notification for the poster that remains even if the post is immediately unliked. [1]</p>
<p>What’s humbling is that such minor features have the power to affect so many, and so profoundly. What’s scary is that the feature is so fickle. If Facebook starts making updates available through a real-time API, like <a href="http://code.google.com/apis/buzz/">Google Buzz does</a>, then the ‘like’ will stick around forever on some external site and users will be none the wiser until something goes wrong. Similar things have happened: a woman <a href="http://www.inc.com/news/articles/2010/05/nonprofit-fires-woman-for-blogging-about-sex.html">was fired</a> because sensitive information she put on Twitter and then deleted was cached by an external site. I’ve written about the privacy dangers of <a href="http://33bits.org/2010/07/06/what-every-developer-needs-to-know-about-public-data-and-privacy/">making public data “more public”</a>, including the problems of real-time APIs. [2]</p>
<p>As complex and fascinating as the technical issues are, the moral challenges interest me more. <strong>We’re at a unique time in history in terms of technologists having so much direct power. </strong>There’s just something about the picture of an engineer in Silicon Valley pushing a feature live at the end of a week, and then heading out for some beer, while people halfway around the world wake up and start using the feature and trusting their lives to it. It gives you pause.</p>
<p>This isn’t just about privacy or just about people in oppressed countries. RescueTime estimates that <a href="http://www.geekwire.com/2011/googles-les-paul-doodle-consumes-record-53m-hours-rescuetime-estimates">5.3 million hours</a> were spent worldwide on Google’s Les Paul doodle feature. Was that a net social good? Who is making the call? Google has an insanely rigorous A/B testing process to optimize between <a href="http://www.zeldman.com/2009/03/20/41-shades-of-blue/">41 shades of blue</a>, but do they have any kind of process in place to decide whether to release a feature that 5.3 million hours—<strong><a href="http://www.google.com/search?q=5300000+%2F+(24+*+365+*+78)">eight lifetimes</a></strong>—are spent on?</p>
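<p>The “eight lifetimes” figure is easy to check. The sketch below restates the linked calculation in Python, using the same 78-year lifetime and 365-day year as that query; the variable and function names are mine, for illustration only.</p>

```python
HOURS_SPENT = 5_300_000        # RescueTime's estimate cited above
LIFETIME_YEARS = 78            # lifetime assumed in the linked calculation
HOURS_PER_YEAR = 24 * 365


def lifetimes(hours, years_per_life=LIFETIME_YEARS):
    """Convert a number of hours into multiples of one human lifetime."""
    return hours / (HOURS_PER_YEAR * years_per_life)


estimate = lifetimes(HOURS_SPENT)
print(f"{estimate:.2f} lifetimes")   # roughly 7.76, i.e. about eight
```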
<p>For the first time in history, the impact of technology is being felt worldwide and at Internet speed. The magic of automation and ‘scale’ dramatically magnifies effort and thus bestows great power upon developers, but it also comes with the burden of social responsibility. <strong>Technologists have always been able to rely on someone else to make the moral decisions.</strong> But not anymore—there is no ‘chain of command,’ and the law is far too slow to have anything to say most of the time. Inevitably, engineers have to learn to incorporate social costs and benefits into the decision-making process.</p>
<p>Many people have been raising awareness of this—danah boyd often talks about how tech products make a mess of many things: <a href="http://techcrunch.com/2010/03/13/privacy-publicity-sxsw/">privacy</a> for one, but social nuances in general. And recently at TEDxSiliconValley, Damon Horowitz argued that <a href="http://www.ted.com/talks/damon_horowitz.html">technologists need a moral code</a>.</p>
<p>But here’s the thing—and this is probably going to infuriate some of you—I fear that these appeals are falling on deaf ears. Hackers build things because it’s fun; we see ourselves as twiddling bits on our computers, and generally don’t even contemplate, let alone internalize, the far-away consequences of our actions. Privacy is viewed in oversimplified <a href="http://33bits.org/2010/02/13/privacy-is-not-access-control/">access-control</a> terms and there isn’t even a vocabulary for a lot of the nuances that users expect.</p>
<p>The ignorant are at least teachable, but I often hear a <em>willful</em> disdain for moral issues. Anything that’s technically feasible is seen as fair game and those who raise objections are seen as incompetent outsiders trying to rain on the parade of techno-utopia. The pronouncements of executives like Schmidt and Zuckerberg, not to mention the writings of people like Arrington and Scoble who in many ways define the Valley culture, reflect tone-deaf thinking and a we-make-the-rules-get-over-it attitude.</p>
<p><em>Something’s gotta give.</em></p>
<p>[1] It’s possible that the poster is talking about Twitter, and by ‘like’ they mean ‘favorite’. This makes no difference to the rest of my argument; if anything, the argument is stronger because Twitter already has a Firehose.</p>
<p>[2] Potential bugs are another reason that this feature is fickle. As techies might recognize, ensuring that a like doesn’t show up after an item is unliked maps to the problem of update propagation in a distributed database, which is hard: the <a href="http://en.wikipedia.org/wiki/CAP_theorem">CAP theorem</a> shows that such a system cannot guarantee both consistency and availability when the network partitions. Indeed, Facebook often has glitches of exactly this sort: you might notice it because a comment notification shows up and the comment doesn’t, or vice versa, or different people see different like counts, etc.</p>
<p>[ETA] I see this essay as somewhat complementary to my last one on <a href="http://33bits.org/2011/06/08/the-many-ways-in-which-the-internet-has-given-us-more-privacy/">how information technology enables us to be more private</a> contrasted with the ways in which it also enables us to publicize our lives. There I talked about the role of <em>consumers</em> of technology in determining its direction; this article is about the role of the <em>creators</em>.</p>
<p>[Edit 2] Changed the British spelling &#8216;wilful&#8217; to American.</p>
<p>Thanks to <a href="http://www.stanford.edu/~jmayer/">Jonathan Mayer</a> for comments on a draft.</p>
<p>To stay on top of future posts, <a href="http://33bits.org/feed/">subscribe</a> to the RSS feed or <a href="http://twitter.com/random_walker">follow me on Twitter</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/06/11/in-silicon-valley-great-power-but-no-responsibility/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>

		<media:content url="http://33bits.files.wordpress.com/2011/06/syriatweet.png?w=300" medium="image">
			<media:title type="html">syriatweet</media:title>
		</media:content>
	</item>
		<item>
		<title>The Many Ways in Which the Internet Has Given Us More Privacy</title>
		<link>http://33bits.org/2011/06/08/the-many-ways-in-which-the-internet-has-given-us-more-privacy/</link>
		<comments>http://33bits.org/2011/06/08/the-many-ways-in-which-the-internet-has-given-us-more-privacy/#comments</comments>
		<pubDate>Wed, 08 Jun 2011 18:52:39 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Internet]]></category>
		<category><![CDATA[privacy]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=880</guid>
		<description><![CDATA[There are many, many things that digital technology allows us to do more privately today than we ever could. Consider: The ability of marginalized or oppressed individuals to leverage the privacy of online communication tools to unite in support of a cause, or simply to find each other, has been earth-shattering. It has played a [...]]]></description>
			<content:encoded><![CDATA[<p>There are many, many things that digital technology allows us to do more privately today than we ever could. Consider:</p>
<p>The ability of marginalized or oppressed individuals to leverage the privacy of online communication tools to unite in support of a cause, or simply to find each other, has been earth-shattering.</p>
<ul>
<li>It has <a href="http://www.cbsnews.com/stories/2011/02/15/eveningnews/main20032118.shtml">played a key role</a> in the ongoing Middle East uprisings. The Internet helps primarily by enabling rapid communication and coordination, but being able to do it <em>covertly</em>—<a href="http://www.theregister.co.uk/2011/01/25/tunisia_facebook_password_slurping/">clumsy governmental hacking attempts</a> notwithstanding—is an equally important aspect.</li>
<li>Clay Shirky <a href="http://www.amazon.com/Here-Comes-Everybody-Organizing-Organizations/dp/1594201536">tells the story</a> of how some of <a href="http://meetup.com/">meetup.com</a>’s most popular groups were (ir)religious communities that don’t find support in broader U.S. culture: Pagans, ex-Jehovah’s Witnesses, atheists, etc.</li>
<li>STD-positive individuals can use online dating sites targeted at their group. Can you imagine the Sisyphean frustration of trying to date offline and find a compatible partner if you have an STD?</li>
</ul>
<p>In the political realm, the anonymity afforded by Wikileaks is leading to a challenge to the legitimacy of high-level government actors, if not entire governments. Bitcoin is another anonymity technology that shows the potential to have serious political effects. [1]</p>
<p>Most of us benefit at an everyday level from improved privacy. When we read, search, or buy online, people around us don’t find out about it. This is vastly more private than checking out a book from a library or buying something at a store. [2]</p>
<p>We’ve benefited not only in our mundane activities, but in our kinky ones as well. We take and exchange naked pictures all the time, something we could never do back when it meant getting film developed at a store. And slightly over half of us have taken advantage of the fact that “hiding one’s porn” is trivial today compared to the bad old days of magazines.</p>
<p>I could go on (I haven’t even mentioned the uses of <a href="http://en.wikipedia.org/wiki/Tor_(anonymity_network)">Tor</a> or encryption, freely available to anyone willing to invest a little effort), but I’ve made my point. <strong>Of course, I’ve only presented one half of the story. The other half, that technology is also allowing us to <em>expose</em> ourselves in ways never before possible, has been told so many times by so many people, and so loudly, that it is drowning out meaningful conversation about privacy.</strong></p>
<p>Having presented the above evidence, I posit that technology by itself is actually largely neutral with respect to privacy, in that it enhances the privacy of some types of actions and encumbers that of others. Which direction society takes is up to us. In other words, I’m asserting the negation of <a href="http://en.wikipedia.org/wiki/Technological_determinism">technological determinism</a>, applied to privacy.</p>
<p>While I do believe that privacy-infringing technologies have been adopted more pervasively than privacy-enhancing ones, I would say that the disparity is far smaller than it is generally thought to be. Why the mismatch in perception? A curious collective cognitive bias. Observe that almost every one of the examples above is generally seen as a <em>new kind of activity</em> enabled by technology, whereas each is really an example of technology allowing us to do a <em>familiar activity, but with more privacy</em> (among other benefits).</p>
<p>Another reason for the cognitive bias is our tendency to focus on the dangers and the negatives of technology. Let’s go back to the nude pictures example: just about <em>everyone</em> does it, but only a small number (perhaps 1%?) suffer some harm from it. As Schneier says, <a href="http://www.schneier.com/essay-171.html">if it’s in the news, don’t worry about it</a>.</p>
<p>To the extent that privacy-infringing technologies have been more successful, it’s a choice we’ve collectively made. Demand for social networking has been so strong that the sector has somehow invented a halfway workable business model, even though it took several tries to get there. But demand for encryption has been so weak that the market never matured enough to make it usable to the general public.</p>
<p>The disparity could be because we don’t know what’s good for us (<a href="http://www.heinz.cmu.edu/~acquisti/papers/Acquisti-Grossklags-Chapter-Etrics.pdf">volumes</a> have been written about this), but it could also be partly because there are costs and benefits to giving up our privacy, and the benefits, in proportion to the costs, are rather higher than they are generally made out to be.</p>
<p>Those are all questions worth pondering, but I hope I have convinced you of this: the idea that information technology inherently invades privacy is oversimplified and misleading. If we’re giving up privacy, we have only ourselves to blame.</p>
<p>[1] Many privacy-enhancing technologies are morally ambiguous. I’m merely listing the ways in which people benefit from privacy, regardless of whether they’re using it for good or evil.</p>
<p>[2] It is probably true that the Internet has made it easier for government, advertisers etc. to track your activities. But it doesn’t change the fact that there’s a privacy benefit to regular people in an everyday context, who are far more concerned about keeping secrets from their family, friends and neighbors than about abstract threats.</p>
<p>[ETA] This essay examines the role of consumers in shaping the direction of technology, whereas the <a href="http://33bits.org/2011/06/11/in-silicon-valley-great-power-but-no-responsibility/">next one</a> looks at the role of creators.</p>
<p>Thanks to <a href="http://www.cs.utexas.edu/~akilzer/">Ann Kilzer</a> for comments on a draft.</p>
<p>To stay on top of future posts, <a href="http://33bits.org/feed/">subscribe</a> to the RSS feed or <a href="http://twitter.com/random_walker">follow me on Twitter</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/06/08/the-many-ways-in-which-the-internet-has-given-us-more-privacy/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>
	</item>
		<item>
		<title>Bad Internet Law: What Techies Can Do About It</title>
		<link>http://33bits.org/2011/06/07/bad-internet-law-what-techies-can-do-about-it/</link>
		<comments>http://33bits.org/2011/06/07/bad-internet-law-what-techies-can-do-about-it/#comments</comments>
		<pubDate>Tue, 07 Jun 2011 16:56:53 +0000</pubDate>
		<dc:creator>Arvind Narayanan</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[law]]></category>
		<category><![CDATA[policy]]></category>
		<category><![CDATA[technology]]></category>

		<guid isPermaLink="false">http://33bits.org/?p=876</guid>
		<description><![CDATA[From the dangerous copyright lobby-sponsored PROTECT IP to a variety of misguided social networking safety laws, the spectre of bad Internet law is rearing its ugly head with increasing frequency. And at the e-G8 forum, Sarkozy and others talked about even more ambitious plans to “civilize” the Internet that will surely have repercussions in the [...]]]></description>
			<content:encoded><![CDATA[<div><p>From the dangerous copyright lobby-sponsored <a href="http://news.cnet.com/8301-13578_3-20062419-38.html">PROTECT IP</a> to a variety of misguided <a href="http://www.mercurynews.com/breaking-news/ci_18156912">social networking safety</a> laws, the spectre of bad Internet law is rearing its ugly head with increasing frequency. And at the e-G8 forum, Sarkozy and others talked about even more ambitious plans to <a href="http://arstechnica.com/tech-policy/news/2011/05/france-attempts-to-civilize-the-internet-internet-fights-back.ars">“civilize” the Internet</a> that will surely have repercussions in the U.S. as well. Three things are common to these efforts: a general ignorance of technological reality, an attempt to preserve pre-Internet era norms and business models that don’t necessarily make sense anymore, and severe chilling effects on free speech and innovation.</p>
<p>The bad news is that fighting specific laws as they come up <a href="http://33bits.org/2011/05/19/fighting-protect-ip-congresswoman-lofgren/">is an uphill battle</a>. What has changed in the last ten years is that the Internet has thoroughly permeated society, and therefore the interest groups pushing these laws are much more determined to get their way. The good news is that lawmakers are reasonably receptive to arguments from both sides. So far, however, they are not hearing nearly enough of our side of the story. It’s time for techies to step up and get more actively involved in policy if we hope to preserve what we’ve come to see as our way of life. Here’s how you can make a difference.</p>
<p><strong>1. Stick to your strengths—explain technology.</strong> The primary reason why Washington is prone to making bad tech law is that they don’t understand tech, and don’t understand how bits are different from atoms. Not only is educating policymakers on tech more effective, as a technologist you’ll have more credibility if you stick to doing that, rather than opining on specific policy measures.</p>
<p><strong>2. Don’t go it alone.</strong> Giving equal weight to every citizen’s input on individual issues <a href="http://www.huffingtonpost.com/2009/07/24/is-this-the-stupidest-per_n_244440.html">may or may not</a> be a good idea in theory, but it certainly doesn’t work that way in practice. Money, expertise, connections and familiarity with the system all count. You’ll find it much easier to be heard and to make a difference if you combine your efforts with an existing tech policy group. You’ll also learn the ropes much more quickly by networking. Organizations like the <a href="http://eff.org/">EFF</a> are always looking for help from outside technologists.</p>
<p><strong>3. Charity begins at home—talk to your policy people.</strong> If you work at a large tech company, you’re already in a great position: your company has a policy group, a.k.a. lobbyists. Help them with their understanding of tech and business constraints, and have them explain the policy issues they’re involved in. Engineers often view the in-house policy and legal groups as a bunch of lawyers trying to impose arbitrary rules. This attitude hurts in the long run.</p>
<p><strong>4. Learn to navigate the Three Letter Agencies.</strong> “The Government” is not a monolithic entity. To a first approximation there are the two Houses, a variety of Agencies, Departments and Commissions, the state legislatures and the state Attorneys General. They differ in their responsibilities, agendas, means of citizen participation and receptiveness to input on technology. It can be bewildering at first but don’t worry too much about it; you can pick it up as you go along. Weird but true: most Internet issues in the House are handled by the “Energy and Commerce” committee!</p>
<p>While I have focused on bad Internet laws, since that is where the tech/politics disconnect is most obvious, there are certainly many laws and regulations that have a largely positive, or at least a mixed reception in technology circles. Net neutrality is a prominent example; I am myself involved in the <a href="http://donottrack.us/">Do Not Track</a> project. These are good opportunities to get involved as well, since there is always a shortage of technical folks. I would suggest picking one or two issues, even though it might be tempting to speak out about everything you have an opinion on.</p>
<p>To those of you who are about to post something like, “What’s the point? Congresspeople are all bought and paid for and aren’t going to listen to us anyway,” I have two things to say:</p>
<ul>
<li>Tech policy is certainly hard because of the huge chasm, but cynicism is unwarranted. Lawmakers are willing to listen and you will have an impact if you stick with it.</li>
<li>If you’re not interested, that’s your prerogative. But please refrain from discouraging others who’re fighting for your rights. Defeatism and apathy are part of the problem.</li>
</ul>
<p>Finally, here are some tech policy blogs and resources if you feel like “lurking” before you’re ready to jump in.</p>
<ul>
<li><a href="https://www.eff.org/deeplinks/archive">EFF: Deeplinks</a></li>
<li><a href="http://www.cdt.org/blog">Center for Democracy and Technology</a></li>
<li><a href="http://www.techpolicy.com/Blog.aspx">Technology | Academics | Policy</a></li>
<li><a href="http://cyberlaw.stanford.edu/">Stanford Center for Internet and Society</a></li>
<li><a href="http://googlepublicpolicy.blogspot.com/">Google Public Policy Blog</a></li>
</ul>
</div>
<div>Thanks to <a href="http://petewarden.typepad.com/">Pete Warden</a> and <a href="http://www.stanford.edu/~jmayer/">Jonathan Mayer</a> for comments on a draft.</div>
<div>To stay on top of future posts, <a href="http://33bits.org/feed/">subscribe</a> to the RSS feed or <a href="http://twitter.com/random_walker">follow me on Twitter</a>.</div>
]]></content:encoded>
			<wfw:commentRss>http://33bits.org/2011/06/07/bad-internet-law-what-techies-can-do-about-it/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/aa438b63ff1e9b75693aeabbeddae5eb?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">randomwalker</media:title>
		</media:content>
	</item>
	</channel>
</rss>
