March 19, 2024

We are releasing three longitudinal datasets of Yelp review recommendations with over 2.5M unique reviews.

By Ryan Amos, Roland Maio, and Prateek Mittal

Online reviews are an important source of consumer information, play an important role in consumer protection, and have a substantial impact on businesses’ economic outcomes. Some of these reviews may be problematic; for example, incentivized reviews, reviews with a conflict of interest, irrelevant reviews, and entirely fabricated […]

What should we do about re-identification? A precautionary approach to big data privacy

Computer science research on re-identification has repeatedly demonstrated that sensitive information can be inferred even from de-identified data in a wide variety of domains. This has posed a vexing problem for practitioners and policy makers. If the absence of “personally identifying information” cannot be relied on for privacy protection, what are the alternatives? Joanna Huey, […]

My Bill to #OpenPACER in memory of #aaronsw – Open for Comment and Available on Github

I unveiled a draft bill at an event on Capitol Hill this week. It is drafted in Legislative XML, allows you to comment, and the code is available on GitHub. The Open PACER Act provides for free and open access to electronic federal court records. The courts currently offer an expensive and […]

Smart Campaigns, Meet Smart Voters

Zeynep pointed to her New York Times op-ed, “Beware the Smart Campaign,” about political campaigns collecting and exploiting detailed information about individual voters. Given the emerging conventional wisdom that the Obama campaign’s technological superiority played an important role in the President’s re-election, we should expect more aggressive attempts to micro-target voters by both parties in […]

My NYT Op-Ed: "Beware the Smart Campaign"

I just published a new opinion piece in the New York Times, entitled “Beware the Smart Campaign”. I react to the Obama campaign’s successful use of highly quantitative voter targeting that is inspired by “big data” commercial marketing techniques and implemented through state-of-the-art social science knowledge and randomized field experiments. In the op-ed, I wonder […]