January 15, 2025

A Scanner Darkly: Protecting User Privacy from Perceptual Applications

“A Scanner Darkly”, a dystopian 1977 Philip K. Dick novel (adapted to a 2006 film), describes a society with pervasive audio and video surveillance. Our paper “A Scanner Darkly”, which appeared in last year’s IEEE Symposium on Security and Privacy (Oakland) and has just received the 2014 PET Award for Outstanding Research in Privacy Enhancing Technologies, takes a closer look at the soon-to-come world where ubiquitous surveillance is performed not by the drug police but by everyday devices with high-bandwidth sensors.

The age of perceptual computing is upon us. Mobile phones and laptops, programmable robots such as iRobot Create, gaming devices such as Microsoft Kinect, and augmented reality displays such as Google Glass are built around cameras and microphones, enabling software apps to “see” their physical environment. Some of these apps are mundane – motion detectors, enhanced video-chat programs, ball-chaser apps for robotic dogs – yet others are Star Trek stuff: natural user interfaces that react to gestures and sounds, sophisticated face recognizers, even room-wide, ambient, context-aware systems.

Starship Enterprise was not running untrusted apps, though. Modern perceptual computing platforms do. Mobile and robot operating systems, Kinect, and Google Glass all encourage independent developers to create software for their respective app stores. What could possibly go wrong? The security and privacy risks of a malicious or buggy app with unlimited camera and microphone access are obvious. And yes, some of these devices are capable of moving around on their own: think of a third-party app turning a robotic pet into a roving spy camera.

First, consider overcollection of data. Many perceptual apps (e.g., augmented-reality browsers) transmit the entire camera feed to the server for image recognition. It’s nice of them to be cognizant of users’ electric bills, but streaming high-def visuals of users’ rooms to a random app provider may have unpleasant privacy implications. What types of objects do they recognize? How? What if the camera accidentally captured a face, a computer screen, a drug label, a credit card, a license plate? What happens to images afterwards: are they kept somewhere intentionally or accidentally?

Second, continuous aggregation of information (for example, a security app surveilling the same room for hours at a time) raises entirely new categories of privacy risks. Even high-level, abstract information about visual scenes is risky when aggregated over time. Consider a “skeleton recognizer” app that detects the presence of a person but cannot see faces, individual items, etc. Even such a restricted app can infer that there are two individuals in the room, observe their movement and proximity patterns, etc. [*]

Darkly is our first attempt to map out the road towards a new field of systems research: privacy-preserving perceptual computing. Darkly is a multi-layered, domain-specific (this is unusual!) privacy protection system that leverages the structure of perceptual software to insert protection at the platform level. Our key observation is that virtually all applications access perceptual sensors through the platform’s abstract API. This is not surprising. It is cumbersome for a developer of a computer vision application to code algorithms such as motion detection when operating on raw pixels; much easier to employ vision libraries such as OpenCV. Similarly, Microsoft’s Kinect SDK provides library functions for detecting the outline of a human body to make it easier to write gesture-controlled applications.

These APIs are exactly where Darkly intercepts the app’s sensor accesses and inserts multiple privacy protection layers. Darkly runs as part of the platform and thus at a higher privilege level than the untrusted apps (a rough analogy is system call interposition in operating systems). This helps make privacy protection transparent and requires no changes to the apps’ code. Our Darkly prototype is integrated with OpenCV, a popular computer vision library that runs on many mobile and robot platforms.

The first layer of protection in Darkly is access control. Darkly replaces pointers to raw pixel data with opaque references that have the same format but cannot be dereferenced by applications. OpenCV functions dereference them internally and thus operate on raw pixels without any loss of fidelity. It turns out that most of our benchmark OpenCV applications still work correctly without any modifications because they never access raw pixels; they just pass pixel pointers back and forth to OpenCV library functions (this is a testament to the richness of functionality supported by OpenCV). Darkly also provides trusted GUI and storage APIs that allow an app – for example, a remotely operated security cam – to display captured images to the user, operate on user input, and store images without being able to read them.
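To illustrate the opaque-reference idea, here is a minimal Python sketch. It is not Darkly's code: the class `TrustedVisionLib`, its methods, and the `threshold` parameter are all hypothetical names invented for this example. Python also cannot enforce the boundary (nothing stops an app from peeking at `_frames`), whereas Darkly relies on running at a higher privilege level than the app. The sketch only shows the shape of the interface: the app holds handles, and the library dereferences them.

```python
import secrets


class TrustedVisionLib:
    """Hypothetical stand-in for the trusted platform layer."""

    def __init__(self):
        # Maps opaque handle -> raw pixel rows. Only library code reads this.
        self._frames = {}

    def capture(self, pixels):
        """Store a grayscale frame and return an opaque handle to it."""
        handle = secrets.token_hex(8)
        self._frames[handle] = [list(row) for row in pixels]
        return handle

    def motion_detected(self, handle_a, handle_b, threshold=10):
        """Dereference both handles internally; release only a boolean."""
        a, b = self._frames[handle_a], self._frames[handle_b]
        diffs = [abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb)]
        return sum(diffs) / len(diffs) > threshold


def untrusted_motion_app(lib, previous, current):
    """App code: passes handles back to the library, never touches pixels."""
    return lib.motion_detected(previous, current)


lib = TrustedVisionLib()
empty_room = lib.capture([[0, 0], [0, 0]])
someone_there = lib.capture([[0, 0], [0, 200]])
print(untrusted_motion_app(lib, empty_room, someone_there))  # True
```

The app's logic is unchanged by the indirection, which mirrors the observation above that apps which only pass image pointers to library functions keep working without modification.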

Some apps are only interested in certain image features: for example, a security cam may need object contours to detect movement, while a QR code scanner needs the black-and-white matrix. For such apps, Darkly applies algorithmic transforms that strip individual details from the image, leaving only simple shapes. As mentioned above, this may not prevent inferential privacy breaches!
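As a rough illustration of what a detail-removing transform can look like, the sketch below reduces a grayscale image to a binary map of strong intensity jumps. This is a stand-in chosen for brevity, not one of the transforms from the paper; the function name `sketch` and its `threshold` parameter are assumptions made for this example.

```python
def sketch(gray, threshold=40):
    """Return a binary map: 1 where the jump to the right or lower
    neighbour exceeds `threshold`, 0 elsewhere."""
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    # The last row and column have no right/lower neighbour and stay 0.
    for y in range(height - 1):
        for x in range(width - 1):
            dx = abs(gray[y][x + 1] - gray[y][x])
            dy = abs(gray[y + 1][x] - gray[y][x])
            if max(dx, dy) > threshold:
                out[y][x] = 1
    return out


# A dark left half next to a bright right half: only the boundary survives.
frame = [[0, 0, 200, 200] for _ in range(4)]
for row in sketch(frame):
    print(row)
```

Raising `threshold` discards more of the scene, so less information reaches the app; lowering it keeps finer detail.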

Finally, some apps – for example, eigenface-based face recognizers – do need access to raw pixels. For such apps, Darkly provides a special ibc language and a runtime sandbox for privacy-preserving image processing. ibc is based on GNU bc and is an almost pure computation language, with no access to system calls, network, system time, etc. Consequently, it is easy to sandbox, yet can be used to implement many image processing algorithms. ibc programs are allowed to return a single 32-bit value from the sandbox, reducing the risk of accidental data overcollection.
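The single-value output rule can be sketched in a few lines of Python. This is only an illustration of the narrow output channel: real ibc is a language derived from GNU bc with its own runtime sandbox, and Python offers no comparable isolation. The names `run_restricted`, `MASK32`, and `distance_to_template`, as well as the 2x2 template, are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # only the low 32 bits ever leave the restricted region


def run_restricted(program, pixels):
    """Run `program` on raw pixels and release a single 32-bit value."""
    frozen = tuple(tuple(row) for row in pixels)  # read-only copy
    result = program(frozen)
    if not isinstance(result, int):
        raise TypeError("restricted programs must return an integer")
    return result & MASK32


def distance_to_template(pixels):
    """Example program: sum of squared differences against a fixed template."""
    template = ((10, 10), (10, 10))  # hypothetical 2x2 reference image
    return sum((p - t) ** 2
               for row, trow in zip(pixels, template)
               for p, t in zip(row, trow))


score = run_restricted(distance_to_template, [[10, 10], [10, 12]])
print(score)  # 4
```

The program sees every pixel, but the caller receives only one number per invocation, which is the property the paragraph above describes.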

The final layer of protection in Darkly is user audit. Darkly visually shows the user what the app “sees,” now and over time. For algorithmic transforms, the privacy dial lets the user set the degree of transformation on a scale from 0 to 11, thus changing the amount of information released to the app.

There’s more in the paper: quantifying the trade-offs between privacy and utility, a few ideas on privacy-preserving transforms, explanations of why generic privacy technologies are unlikely to solve the problem, and so on. We plan to release the source code of Darkly soon.

We hope that our work on Darkly will motivate the designers of perceptual computing platforms to start providing built-in privacy protection mechanisms as an integral part of their APIs, SDKs, and OSes. From a research perspective, we also hope that Darkly will generate interest in the nascent field of privacy-preserving perceptual computing. There are numerous interesting research problems that need to be solved here: designing new privacy transforms, metrics for measuring their effectiveness, visualization of other sensor data (e.g., microphone) for user audit, etc. Please get in touch with us if you are interested in any of these problems.

[*] While Darkly addresses many types of privacy problems associated with perceptual applications, this type of inference isn’t one of them.

Comments

  1. This looks like a good start for preventing “unintended” data collection and reuse (e.g., a persistent room monitor wouldn’t also give you access to all the barcodes and text blocks visible in the room), but as you say, it doesn’t really help you with the inferential stuff (physical metadata, you might say) or situations where fine-grained vision and so forth are desired for an app (what if you want your personal assistant to read something to you or scan a barcode without explicit startup of those functions?).

    I also wonder about ways of minimizing the exposure and persistence of such data when it does go out into the cloud.

    • Suman Jana says

      If an app is using a library like OpenCV that is incorporated inside Darkly, it can still operate on fine-grained data like text using library functions without seeing the raw content of the image. For example, let’s say a text-reading personal assistant app uses OpenCV to detect the possible presence of text in an image and read its contents. With Darkly, it can read the text without ever knowing the raw pixels of the image. In fact, we ported an OCR app that was using OpenCV to Darkly. It works while learning only the recognized text and numbers. Please see the paper for more details.

      But you are right about the inferential stuff. As we mentioned in the post, such data won’t be protected by Darkly.

      • My question was a little different (although it’s still good there’s no raw-pixel access). If you have, say, a text-reading assistant that uses OpenCV to determine the possible presence of text in the image and read the text, is there any way to restrict that in some fashion so that the assistant can’t (hypothetically) slurp up all the text that’s ever visible in the field of view and ship it to somewhere evil?

        If the camera is always on, it’s going to see a lot of text (credit cards, prescription bottles, food labels, correspondence on the other side of the desk) that the user doesn’t want read, but that’s still in view for several frames at a time. Where should the control be that says something like “only give this to the app if it’s been stable near the middle of the field of view for at least one second”? Or “Only read things at an apparent point size of 12 or greater”? (I’m not sure those should be the criteria, but that should give you an idea of what I’m getting at.) At the moment, opportunistic recognition-of-everything is limited by camera resolution and CPU, but not for long.