September 26, 2018

Scan This or Scan Me? User Privacy & Barcode-Scanning Applications

[Please welcome guest bloggers Eric Smith and Nina Kollars. Eric Smith serves as the Chief Information Security Officer (CISO) for a higher ed consortium with membership consisting of Bucknell University, Franklin & Marshall College and Susquehanna University. Nina Kollars is assistant professor of government at Franklin & Marshall College, where her scholarship examines the ways in which individual user creativity affects the development of technology and practices.]

QR (Quick Response) codes, the two-dimensional barcodes designed by the Denso Wave company in 1994, were originally intended to track and inventory millions of parts on assembly lines. Since then, these nearly ubiquitous black and white squares have been applied to an ever-broader range of uses including business cards, patient-tracking systems, and mobile coupon clipping. To make use of these codes, the vast majority of consumers rely on smartphones to convert them into usable information. However, neither Apple’s iOS nor Google’s Android operating system includes a robust native capability to scan and decode printed barcodes. As a result, users of these devices must download third-party applications that will do this work for them.

Research Question and Findings:

Our research question was straightforward: are there privacy and security risks associated with this emerging QR app ecosystem? In an attempt to answer this, we installed and analyzed over twenty of the most popular QR code applications. Our findings suggest that a majority of the most popular QR code readers found in the Apple App and Google Play marketplaces are not passive systems of information routing, but instead capture and transmit additional data about the device and the user back to the application developer. (For full details see our paper.)

Our findings reveal that many smartphone barcode scanning applications represent a significant threat to the privacy and, potentially, security of their users. On both platforms studied, the most popular QR code scanning apps, according to search result rankings, were shown to transmit the contents of all scanned QR codes, as well as GPS location data, to a third-party server.

Triangulation of Behavior:

Certainly the collection of user data by app developers is part of the consumer calculus of the cost of free tools. That is, in exchange for some of the user’s data, the tool becomes available for use. For the everyday user, QR codes are likely a tool for simple information seeking. In exchange, market-minded developers are given an opportunity to determine the preferences of the user. This, for most users, constitutes a reasonable trade-off, and the use of the tool represents a transaction between the developer and the user.

However, the ethical contours and acceptable limits of this trade-off remain unsettled, particularly if the type of data taken is not made explicitly comprehensible to consumers. Moreover, contemporary privacy norms are increasingly threatened as what initially appear to be signals of consumer preference slide further into determining bigger-picture life patterns and behavior. The question is, how much and what kinds of data tip the scale from reasonable transfer to privacy violation? We feel that the collection of data that combines content, location, date, and time begins to edge toward the triangulation of private behavior.

We feel that the QR case begins to tread beyond reasonable data collection toward behavior triangulation as a result of the intersection of three variables: the expanding purposes for which codes are used; non-explicit user notification by the software; and limitations of user knowledge in comprehending potential threats as a result of seemingly benign data transfer.

Of the applications tested, only a handful required the user to accept an end-user license agreement (EULA). The majority of apps studied provided no notification whatsoever. In those instances in which the application did prompt the user, the language of the prompt was worded such that the user could not reasonably infer the immediate implications of that data collection. While many QR codes “in the wild” contain only public information, such as a web site or telephone number, others may contain confidential information such as the password to a wireless network or the code to deactivate a security alarm.
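
The distinction between public and confidential payloads can be illustrated with a minimal sketch. It assumes the decoded text of a code is already available as a string, and it relies on the widely used "WIFI:" prefix convention for network credentials. The function names and category labels are hypothetical and are not taken from any of the apps studied.

```python
from urllib.parse import urlparse


def classify_payload(payload: str) -> str:
    """Return a rough, illustrative label for a decoded QR payload."""
    text = payload.strip()
    lowered = text.lower()
    # Wi-Fi configuration codes conventionally start with "WIFI:" and
    # carry the network name and password in the clear.
    if lowered.startswith("wifi:"):
        return "wifi-credentials"
    if lowered.startswith("tel:"):
        return "phone-number"
    try:
        parsed = urlparse(text)
    except ValueError:
        # Malformed input that cannot be parsed as a URL at all.
        return "opaque-text"
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "web-url"
    # Anything else (an alarm code, a serial number) is treated as opaque.
    return "opaque-text"


def is_potentially_confidential(payload: str) -> bool:
    """Flag payloads a user would plausibly not want sent to a third party."""
    return classify_payload(payload) in {"wifi-credentials", "opaque-text"}
```

A scanner that forwards every payload to a remote server makes no such distinction, so a Wi-Fi password and a restaurant menu link are handled identically.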

A particularly egregious, though not necessarily rare, example of this intersection and confusion is the University of Alaska Anchorage’s research study on alcohol cessation and pregnancy. The study’s designers placed free pregnancy tests in the bathroom of a bar and provided a QR code that users could scan to get information and answer a questionnaire. In this case, unbeknownst to the researchers, the collection of this data works directly against the intent of the project, which hoped to reach information seekers anonymously and in the privacy of the bathroom stall. While the QR code itself may point to a location that fully intends to maintain the anonymity of the user, the scanner does not.

Comments

  1. NathanT says:

    This is just one of many reasons why I refuse to use, or even scan QR codes. Without even digging into what information is being transmitted and to whom, the use of QR codes is simply an obfuscation method used by companies. Rather than just stating what they do, instead they encode some system that only a computer can read.

    Take for instance the simple idea that a QR code will route to a website. Why not just put the web address in place of the QR code? Then it is transparent and you don’t need special software to read it. Same for anything else, I really cannot understand any use of a QR code in public that DOESN’T present real privacy concerns simply by the fact that one cannot tell what is in the QR code in the first place.

    Not to mention the hacks that can be performed by routing someone to malicious software, storing private information thereon (mentioned in this article), and on and on, including the very real fact that whoever makes the reader is simply in on the gig. I wouldn’t trust it if Google or Apple put a scanner in their own products either; they are just as bad at spying on consumers in the first place.

    Companies that use QR codes are simply tools. They believe they are presenting an easier method for their users but at the sacrifice of the user’s privacy in the first place. It is the companies that use the QR codes that need to be preached to about privacy concerns; they should just be transparent and print the information they want users to know without doing some stupid black and white boxes that take computers to read.

    On another front, but in the same vein: UPC barcodes are similar in nature, right? They provide a rapid method for computers to track product (the same thing QR codes were originally designed for); but at least with UPC barcodes you have the UPC number right there; it’s not a secret, it is open and transparent. QR codes, however, are not open, not transparent, and can be used to store anything, which causes a whole slew of concerns.

    I have no problems with QR codes being used as they were originally intended, to track products (pieces in manufacturing); but they should stay out of the public realm; they serve no useful purpose to consumers. And the only useful purpose they serve is to those who like to bait and switch (bait users with the idea of ease of getting information, switch by selling information about the users to marketers).

    • Anonnymoose says:

      Privacy-preserving QR/bar code readers (e.g., Barcode Scanner by ZXing Team) have an on-by-default option to present you with the contents of the scanned code before taking *any* action with that data. This addresses the substantive part of your objections.

      • gnaddrig says:

        Well, yes. But you have to believe them when they tell you that the app shows you the scanned data without taking any further action without your explicit authorization. Most normal users would be hard put to find out whether the app actually does what it says it does or whether it doesn’t pass on information about the device regardless.

        • Anonnymoose says:

          “But you have to believe them…”

          No. You can -as Smith and Kollars did in the research covered in their paper (discussed in this blog post)- inspect network traffic from the device running the QR scanning software as you use the software. Or you can -as is the case with ZXing’s Barcode Scanner, and (probably) some others- download the source code, inspect it, and then compile the software from that.

          Will most folks do this? No. But, for any given version of the software, you only -realistically- need to perform a thorough test once. You can make a bunch of “What if?” arguments that attempt to counter this assertion. Most of those arguments will be either tinfoil-hat type arguments “But can we REALLY trust [X]?”, or you’ve-been-targeted-by-a-government-investigation type arguments “What if the FBI/DEA/NSA wants to get *my* secrets from *my* devices?”. In the case of the latter, there’s nothing a normal user will be able to do. 😉

          • Wow, I think this may be only the second time (maybe third time; but very rare) that a comment of mine on this blog site has gotten a reply.

            I have not looked into ZXing’s Barcode Scanner to see what they have to say on the topic, or whether they are reputable enough to trust. If what you say is true about them not taking any action but just presenting the data in the barcode, that does indeed counter most of my own worries on the subject. gnaddrig counters that with the idea of trust. I too deal in means of trust, especially for anything that is offered for free [which I don’t think you said whether or not that was the case here].

            If ZXing is an open source project; that alone suggests reputation to me; while I rarely dig through other person’s code to verify everything; those who choose to open their source (rather than play the “security through obscurity” card by keeping things closed) are usually trustworthy; and usually have good intentions. That said, there are some open source projects I still do not like, and do not trust; because their manner sounds untrustworthy even if their intentions are good. HTTPS Everywhere (by a reputable group eff.org) is one of those, for instance.

            What makes me trust or not trust something is rather complex. I suffer from clinical paranoia; and yet, just because I am paranoid does not make me wrong. Even though I have been called a “tinfoil-hat” wearer; I was speaking out about exactly what the government was doing years before Snowden came forward. But, because I had no proof I was seen as a “tinfoil-hat” wearer. Snowden comes out with the proof; and well, all I can say was “I wasn’t wrong.” I read the tea leaves pretty well even when I may be clinically paranoid.

            Why don’t I trust some things that others trust? Take HTTPS Everywhere, for instance: as I understand it, it either uses the https of the website where available, or where not available it encrypts the traffic to a proxy server. To me, the idea of sending any of my communications to a “proxy” that is not under my control is not a good idea (it creates an attack vector that was not there otherwise); it [the proxy] cannot be trusted, even if the makers of the software are trustworthy. The Tor network proclaimed the same kind of thing for the realm of anonymity; and yet government authorities broke into that without any difficulty; it really does not provide anonymity despite claiming to.

            But that doesn’t mean I don’t trust anyone or anything. But, to establish trust one must need do a lot more than talking points. And, so, I come full circle. I have no trust in the barcodes themselves, nor their extended purposes (except the original purpose for tracking product on an assembly line); I would have little reason to trust ZXing Barcode Scanner if I don’t trust the manner of the use of the barcodes in the first place. My original post was attempting to suggest why I don’t trust QR barcodes; but it is hard to put into words the issues of trust. While ZXing may have some privacy aspects to it (which may be found trustworthy); they still leave me with no benefit to using their scanner. I find no benefit to a QR barcode at all; and a lot of trust issues; even if you eliminated the trust issues completely you are left with still no benefit.

            Oh, I can see many benefits to the companies that make them; but none to the consumers who use them; and that is where the source of the mistrust begins.

          • “Oh, I can see many benefits to the companies that make them; but none to the consumers who use them; and that is where the source of the mistrust begins.”

            One last thing, to expound my line of reasoning. Let me put forth a simple example of my previous statement. A simple example but it is only one of literally thousands of current uses of QR codes.

            Let’s say Company X tells Consumer A that if they scan this code, they will be given a coupon for a specified product. Consumer A sees the idea of a coupon as something valuable, a reason to scan the code. Consumer ME sees the code as a manner of hiding the full intentions of Company X. Company X may indeed point the consumer to a coupon, it may even be the product the consumer wishes, and the coupon may even be honored with the stated discount on the product. But, Consumer ME says; just give me the coupon; why should I need to scan a code to get the coupon?

            The ZXing Barcode reader may present just a URL that would lead to the coupon; but URLs are really tricky things; looking at the URL I can’t tell whether or not the information passed is going to be of a privacy concern; even if the ZXing Barcode reader doesn’t go to the URL, I still won’t know whether to trust it or not. Go to Google for instance, do any search, maybe click a few pages down, and then copy and paste the URL for the Google search; it’s just long randomness; no way to know what is being tracked there.

            What could be in that URL? Because the barcodes can contain anything and are produced with automated software; the url could pass information about the location of the barcode; the specific barcode number (in a line of barcodes). The only way I could know if that was the case is if I could scan ALL of the barcodes by that company on that product for that coupon, and compare all the URL’s until I decode what they are doing. That isn’t going to happen.
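
The comparison described above can be sketched in a few lines, assuming Python and two decoded URLs taken from different copies of the same coupon code. The URLs, parameter names, and function names are made up for illustration and do not come from any real campaign.

```python
from urllib.parse import urlparse, parse_qs


def query_parameters(url: str) -> dict:
    """Split a URL's query string into a {name: [values]} mapping."""
    return parse_qs(urlparse(url).query, keep_blank_values=True)


def differing_parameters(url_a: str, url_b: str) -> dict:
    """Return the parameters whose values differ between two URLs.

    Parameters that vary between otherwise identical codes are the ones
    most likely to identify a specific placement or print run.
    """
    a, b = query_parameters(url_a), query_parameters(url_b)
    return {
        name: (a.get(name), b.get(name))
        for name in sorted(set(a) | set(b))
        if a.get(name) != b.get(name)
    }


# Hypothetical URLs decoded from two copies of the "same" coupon code.
first = "https://coupons.example.com/c?offer=soda10&store=0412&tag=A7F3"
second = "https://coupons.example.com/c?offer=soda10&store=0977&tag=B21C"
changed = differing_parameters(first, second)
```

Here `changed` would hold only the `store` and `tag` entries, while the shared `offer` value drops out. Two samples can show that something varies, but not what the varying values mean, which is why a single scan tells the user so little.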

            Just using a url found in the QR code (one specific example) may tell Company X that I was in Walmart at a specific U.S. Address. What does Company X do with that information? Location information (even if it is not passed by the scanner from the device’s GPS coords) is still a very hot piece of information for marketers. And location information isn’t everything either. The article here mentions QR codes in bathrooms; well that would generally tell the company the person’s gender as well. QR codes in places such as pharmacies can reveal a whole lot of health information. And on and on.

            The fact is that the information presented in any random string is unknown to the user, but known to the company either creating the barcodes or using them. And that information, I KNOW darn well, is used by the companies; and the data is also generally stored and thus available to government agencies and hackers alike.

            Even worse; many companies outsource this type of thing to marketing firms. And marketing firms are even less trustworthy (because they then aggregate all of their data from many different “clients” that they in turn sell to their “partners” [e.g. whomever they can sell the data to]); making privacy even less.

            So, back to my coupon. A reputable company would just print a coupon that I could use. The same coupon with the same regular barcode on every copy; one that I can read the numbers on; that will be scanned at the checkout line. I can verify that by using the coupon my own privacy is assured [well presuming I don’t leave a paper trail with a cc card or some such]. Such a coupon is used as incentive to try a product hoping to get people to like it enough to keep buying it. And reputable companies know that is the purpose of their coupons (or store coupons may have the purpose to get people in the door).

            A non-reputable company will create QR codes that have hidden information; information unknown to the user, but known to the company; information that will identify the consumer and the consumer’s habits to be tracked and sold and used. The QR codes may actually go to the same coupon; but individualized URL’s will present all sorts of other information to the company. The main purpose then for printing a QR code and having consumers scan it [rather than just printing a coupon] isn’t for the consumer’s benefit but for the company’s benefit.

            I am NOT wrong on this point, just as I was NOT wrong on what the government was doing with phones and Internet traffic. Marketing firms have shown me what they do (now that in the course of my work I have had to deal with marketing firms) and so even while I suspected this type of thing in the past, now I have PROOF that this type of thing goes on in the Internet realm. And thus it leaves me with no trust in the QR codes at all. Hence, privacy of the scanner itself is only ONE reason not to use them. There are lots of other reasons not to use QR codes.