Speaking Publicly on Privacy: A Conversation about Digital Privacy

04/02/2019

Raphael Bostic: Hello, everyone—welcome to the latest edition of the Federal Reserve Bank of Atlanta's Public Affairs Forum podcast. I'm Raphael Bostic. I'm president and CEO of the Atlanta Fed, and I'm really excited to have you here for a conversation I'm going to have with Alessandro Acquisti. Alessandro is a professor of information technology and public policy at the Heinz College at Carnegie Mellon University. He's widely known as an expert in the emerging discipline of privacy economics. Alessandro, it's really good to have you here—welcome.

Alessandro Acquisti: Well, thank you so much for the invitation. It's my pleasure to be here.

Bostic: Well, it's my pleasure as well—this is something that I'm really interested in, and I think that there's a lot that we can learn from the work that you've done. So let's start with the basics. How do you define privacy? What is it, and has your definition evolved as the instruments that have developed to collect consumers' data change and become more common?

Acquisti: So these are great questions. They're also challenging—the first one in particular, because there are probably as many definitions of privacy as there are researchers [laughter] studying privacy. Nowadays, most people when they talk about privacy, they think about their personal data—this "informational" view of privacy. But it's important to remember that that focus on information is a rather modern phenomenon. Probably only from the 1960s, the focus of privacy research has been on personal data.

If we go back to some of the early scholarly writings about privacy—for instance, the 1890 Warren and Brandeis piece in the Harvard Law Review on the right to privacy—they were thinking of privacy in spatial terms more so than informational terms...the right to be left alone, the right to be in a space where you are not going to be intruded on by other people.

So to me, it's interesting that there are so many different definitions of privacy, because it does suggest not only that privacy is a complex concept—which it is, for sure—but it's also a universal concept. What I mean is the following—some people suggest that privacy is a modern invention, but in fact you can find references to privacy, explicitly or implicitly, already in the holy books of ancient monotheistic religions, suggesting that privacy is some need that human beings have had and explored through history, through space, cultures, and times—only the different cultures, different people, different times conceive of privacy in very different ways.

So if you ask me how my view has changed, it's changed exactly along these dimensions. When I started working in this field, when I was a student at Berkeley, I was mainly thinking of privacy in informational terms—data—and then I discovered that there was so much more related to privacy. Ultimately, for me, privacy is about this dynamic, this negotiation, between public and private.

Bostic: So that's very interesting, and I have to say, I didn't expect you to become a historical philosopher on the first question—

Acquisti: [laughter] Me neither!

Bostic: I was trying to ask a pretty basic question, but it is really interesting, listening to you, about how there are different modes of privacy. Now, when you walk around most urban places, there are cameras that people are on in many places. Sometimes they know they're on camera, sometimes not. Something that's become really common right now is the Ring system, in people's homes, where there's a camera at the front door, or at the back door, or wherever...

Acquisti: Or even inside your home now, like Nest.

Bostic: Or even inside the home—no, that's exactly right. And so that's a different notion of privacy than the stuff that people are talking about when they're talking about Facebook or much of the internet and tracking and those sorts of things. Does your scholarship treat them differently, or how would you describe your scholarship in this context?

Acquisti: That's a great question. So they are different notions. The key issue is whether we lose or we benefit from treating them in part as different aspects of the same fundamental problem. And there are scholars who disagree on this. There are some scholars who suggest that privacy is a misnomer. You go back to, for instance, Chicago School of Economics scholars like Posner and Stigler. They would suggest that the very term "privacy" is not very useful in academic discussions, and even in policy and practical discussions, because people mean so many different things.

So if I use the term "privacy," different people will hear different things, and we're actually discussing very different things: autonomy versus freedom, versus tracking, versus price discrimination, versus home invasion—very different things, right? I accept it, and I agree with that. However, other scholars suggest that yes, all these things are different, but they go back to one common core, which I was referring to earlier: these boundaries of public and private that we human beings negotiate in so many different ways, in different moments and aspects of our lives. So, much of my work is indeed on privacy as information privacy—so consumer data, Facebook, and tracking, online advertising, for sure. I try to focus on that, because I'm an economist by training—right?—but because I also grew into behavioral economics and behavioral decision research as I was studying privacy, inevitably so. Then I started also getting exposed to some of these other dimensions of privacy, so much so that one of my currently unpublished, but more interesting, experiments is about possible evolutionary roots of privacy concerns—just to give you an example of how you can start from economics and you can end up dealing with something completely different from it.

Bostic: That's really, really interesting. Let's talk about some of the dynamics around privacy. There's been a lot of conversation, and I think there were congressional hearings on this, about the fact that consumers are often at a disadvantage in the privacy space because they may not know what information is being collected, they may not know how it's being collected or how it's going to be used. This is an asymmetric information space. So when I teach my undergrads, whenever there are asymmetries in information, there are opportunities for exploitation, but there are also opportunities to get some extra benefits out of this as well. Can you talk about this? Is this a problem that we should be trying to solve, or is this something we can capitalize on?

Acquisti: I feel that it is a problem that we've been trying to address—not yet very successfully. The reason why I see it as a problem is that, as you pointed out, information asymmetries are really endemic when it comes to privacy. When we go online, we rarely know precisely what information is being collected, and even those among us who are, in theory, experts in the field—I should consider myself an expert, because I do research in this area, and even I would not know. I would have a sense of what information is being collected, but I would not know where it ends up. I may know that when I'm searching for something on Google, Google is collecting my IP address, is collecting my location, is collecting my queries, and is connecting them to a history of locations and queries and clicks I've done before, but I don't know whether Google is going to pass it to this group within Google, or that group—whether it's going to share it with an outside group or not.

By the way, in the specific case of Google I do know because Google is quite protective of the information, so they don't share it with outside companies. But for many other companies, we have much less information. We don't know ultimately what happens to our data, and we don't know what are the downstream economic consequences of all this information collection and sharing. They may be beneficial, and typically consumers are told, "The sharing of data is what makes it possible to have free content and free services on the internet, so you as consumers, you're going to benefit from this. Don't worry." But in fact, there might be also costs. Some are obvious—identity theft, some much less obvious—price discrimination, or subtle influence at the individual level, or societal-scale level—think influence on political elections. And these costs are very hard to quantify and study. That's why I do find information asymmetry a problem.

And in terms of solutions, there have been attempts to propose solutions. For instance, in the United States we seem to rely a lot on "notice and consent" regimes, basically privacy policies—the "notice" part—and privacy settings—the "consent" part. But much of the behavioral experimental research that I myself and many other colleagues at CMU [Carnegie Mellon University] and other places have done suggests that these mechanisms fail to actually address consumer privacy needs. These notices are hard to comprehend, hard to read. Very few people even bother reading them, because even if you do then you realize that you may need a law degree to understand everything that you see there.

Bostic: They're very difficult.

Acquisti: Very difficult. They're vague. They may say, "Well, we may use this data, and we may share it with some business partners." So are you really better informed after you read "may share with business partners"? You don't have actually factual evidence on which to base your decision-making, so why waste your time even reading them? So this "notice and consent" mechanism is an approach which, to me, is broken. We still need transparency, but to me it is a necessary, not a sufficient, condition for privacy protection. So we need transparency, but it's not enough.

Bostic: Well, one of the things I would...I had a couple of reactions to that. First, the existence of a disclosure form may not equal transparency.

Acquisti: Right.

Bostic: So what you're talking about—and we have the same issues. If you bought a house, or if you've gotten a credit card, there are disclosures that you sign that you say you've gotten the documents, but oftentimes they're—like in the case of mortgages, they used to be 40 pages long. And you initial every page, but you're not reading all those pages, and it would take so long, and you probably wouldn't understand what most of it meant anyway, that it's not particularly meaningful. And so this issue of disclosures, it's a common approach that we have in our policy space to the existence of any information asymmetry, or the fear that there might be. But executing it in a way that actually reduces the asymmetry is extremely difficult, and I don't think we've really mastered this at all.

Acquisti: I agree with you.

Bostic: And so you've mentioned—and I'm glad you did this, because I wanted to go there—about sharing. The other reaction I had was, there was a Wall Street Journal article just a couple weeks ago about some health apps that said they weren't sharing their data but they actually were. And so there's an issue that even if it's written in the policy, you're never sure they're actually executing the policy internally. So that's a disconnect there that also introduced itself. And because the space is not really regulated, there's no one checking these things. And so we don't have easy mechanisms to verify these things, unless you have an investigative reporter who happens to decide that they want to check on this, which is a pretty significant...

Acquisti: It is. So you are pointing at a really significant problem, which is, it's not just difficult for the end user, it's difficult overall to understand what really happens in the data economy. And this may sound like a strong claim, because we, I guess, are led to believe that the data economy is about transparency and offers more quantifiable information, more metrics, more ability to control what is happening. But in fact, look for instance at what is happening in the area of online advertising, and how both publishers—online publishers, like online newspapers—and merchants that buy ads have started complaining that they don't really know what happens in the black box of online advertising. Because in online advertising, theoretically, you can target ads very precisely based on the behavior of the consumer, and that is true. But then there are all these other nuances, such as not knowing exactly how many of those ads are seen by real humans as opposed to bots, for instance. Not knowing whether the auction you participated in as a merchant, and the price you ended up paying, were in fact affected by which other bidders, which other merchants, were at the same time trying to bid for the same visitors.

And how much of the amount you end up receiving as a publisher, for showing ads online—how much of that actually remains? How much of the money—how much of the surplus generated through this process—ends up going to the publishers which show the ads, versus how much of that surplus remains with the information intermediaries, with the ad exchanges, with the auction platforms? We don't have good measurements of these things because the system is actually rather opaque, so what's interesting to me is that there is a little bit of a lack of transparency at the individual level, but also at the level of the entire industry.

Bostic: So that's very interesting, and it really does...well, it's gotten a lot of policymakers' attention, and just recently, like in the last couple of weeks, California's governor proposed that there be a data dividend, which was basically saying, "Firms, if you want to use consumers' data, you've got to pay them." Facebook famously announced a reversal on their thinking about privacy, and talked about the emergence of a privacy-focused communications platform that's more like a digital living room than a digital town square. We're talking California, where Silicon Valley is. We're talking Facebook, the largest data-sharing platform. What do you make of these developments?

Acquisti: I have two comments. One is that I want to see in the long run how much of this talk, then, will grow and transform and evolve into action, so that only at the end of this process will we be able to truly judge these claims. The other comment is—taking the claims for what they are—they seem to suggest that there is a recognition, both among policymakers and within the data industry, that the current system is broken, that it does not provide something that consumers want, and we need to find some solutions. The tricky part is the implementation, because the devil, when it comes to privacy, is always in the details.

For instance, Facebook may say that they will provide a more private messaging system. OK, but what if they are still collecting a vast amount of other data, which makes the protection of these particular messages almost not a factor? We analyzed a similar issue in an academic article that we published a few years ago, which we called—aptly, for this conversation you and I are having—"silent listeners," the listeners who are silent. What we found was that Facebook, in particular, over time gave users more and more granular control over their personal information. In the beginning, if you were on Facebook in around, say, 2005, you could simply decide whether your profile was public or private. There were no gray areas, no in between.

And then over time you could start selectively sharing your photo with your uncle but not with your grandmother, your birthday with this friend, but not this other network of friends, etc. And people started using these features, because they allowed people to feel they were somewhat in control. But interestingly, we didn't realize that in allowing people to control how much they are disclosing to their peer or their uncle or their grandmother, their friend, etc., we are also making them forget a little how much they're disclosing actually to the other party who is listening to all these conversations and following everything, which is Facebook itself.

So the point of the silent listener as a behavioral point is that if you give people control over their data, or you make them believe they have control over their data, people will start paying less attention to the fact that there is another entity who is actually monitoring everything they are doing, and this is, to me, potentially worrisome.

Bostic: That's very interesting, and I guess where I want to go is to sort of a baseline question: Is this a good thing? As a former academic, I always get nervous when people quote things that I've written, [laughter] but I'm going to do it here. You and your coauthors wrote that the ultimate goal of tools and policies meant to enhance users' privacy and security choices is—and this is the part where I quote—"to increase individual, organizational, and societal welfare in a way that leads people to make more informed and more desirable autonomous choices." So first, do you think that's happening? Do you think that the way that these tools are being deployed is welfare enhancing, for either the individual, organization, or society?

Acquisti: Thank you, because you could have chosen a worse quote, [laughter] meaning, I can actually stand by that particular quote. I don't regret it.

Bostic: It's a good one, it's a good one.

Acquisti: Thank you. I stand by it in the sense that I still think it's worded in a broad enough manner that it should be quite uncontroversial. We refer to the fact that privacy has economic consequences, both the protection and the collection of data have economic implications. They do create economic winners and losers. This is an undeniable fact, meaning it's in the data—it's not an issue of opinions. Once we acknowledge that, the issue then becomes, can we see under what conditions there are certain degrees of data sharing or data protection which help the consumer, or help all consumers as a whole, or help publishers, or help the data industry as a whole, or help society as a whole?

As we go down that path, we realize that very rarely are the interests of all these different stakeholders aligned. So let me give you a very simple example: merchants and consumers. A consumer may be willing to share information with a merchant about what their interests and preferences are so that the merchant can make offers which are closer to the interest of the consumer. You reduce what we—in economics we refer to as "search costs," right? Better match. But the consumer may not want to share with the merchant how much they really like a given product—what we as economists call "willingness to pay"—because if the merchant knows the reservation price, they can charge exactly that price to the consumer.

So here you have an example showing how it's rational for a consumer to want to have some information shared but not other, and where the interests of the two parties are aligned in one case—they both want to have preference information—and misaligned in the other case—willingness to pay. So I'm using this as an example to point out that there are all these intricate trade-offs that differ by stakeholder, so the issue of "how do we devise policies that try to improve a given stakeholder's welfare, or the entire society's welfare?" That is still an important issue. And if you ask me, "Are we there yet?" I'm pretty sure that we are not there yet, meaning I don't think that we are doing the best we can. If you ask me how to do it, I would have no idea [laughter] because it's an incredibly complex problem. So do we do it through technology, better technology? Do we do it through better regulation? Do we do it just by letting market forces play their role? I do not know for sure, but I'm pretty skeptical that what we have now is the best of possible worlds.

And the reason I'm saying this is that right now…we'll go back to the problem of asymmetric information. The consumer data is collected in manners that are often opaque to end users, so we don't know exactly how much we are benefiting from it, we don't know how much we're paying for it. It's a very opaque system which seems to create very few, but very strong, winners—certain data companies. And some benefits to other entities, but we again—we don't have a good sense of to what extent the other stakeholders are benefiting from the system.
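[Editor's illustration, not part of the recorded conversation: the merchant-and-consumer example in this answer can be sketched with a few lines of arithmetic. Everything below is a hypothetical assumption made for illustration: the function name `trade`, the `Outcome` record, and the numbers (a willingness to pay of 100, a posted price of 70, a unit cost of 40). It is a toy model of a single sale, not a result from Acquisti's research.]

```python
from dataclasses import dataclass


@dataclass
class Outcome:
    price: float
    consumer_surplus: float
    merchant_profit: float


def trade(willingness_to_pay: float, posted_price: float,
          unit_cost: float, merchant_knows_wtp: bool) -> Outcome:
    """Toy single-sale model (hypothetical, for illustration only).

    If the merchant knows the consumer's willingness to pay, it charges
    exactly that amount; otherwise it charges its posted price.
    """
    price = willingness_to_pay if merchant_knows_wtp else posted_price
    # No sale if the price exceeds what the consumer will pay,
    # or if the merchant would sell below cost.
    if price > willingness_to_pay or price < unit_cost:
        return Outcome(price, 0.0, 0.0)
    return Outcome(price, willingness_to_pay - price, price - unit_cost)


# Hypothetical numbers: the consumer values the product at 100,
# the merchant's posted price is 70, and the product costs 40 to supply.
private = trade(100.0, 70.0, 40.0, merchant_knows_wtp=False)
revealed = trade(100.0, 70.0, 40.0, merchant_knows_wtp=True)

print(private)   # consumer keeps a surplus of 30, merchant earns 30
print(revealed)  # consumer keeps 0, merchant earns 60
```

[Under these assumed numbers the total surplus is 60 in both cases. What changes when willingness to pay is revealed is only who receives it, which is why a consumer can rationally share preferences while withholding the reservation price.]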

Bostic: Well, to listen to you talk—and I like the way you put it, in the sense that we have multiple stakeholders that are trying to accomplish different things, and success for one might mean a cost for the other. It gets to trade-offs and the notion that we may be able to maximize the total amount of welfare, but it may be distributed in ways such that some are really worse off and some are really better off.

Acquisti: Absolutely.

Bostic: And if that's the space that we evolve to, then it may say that policy may need to intervene, because we may be uncomfortable with some people getting very, or some organizations having very, large losses. That may be the thing we want to avoid, as opposed to the total surplus. And it's a conversation we haven't really been having that way, and I think part of the challenge has been how opaque it is, just to even figure out: well, how much is anyone winning, or how much is anyone losing? Because you've got to have that as a precondition before you can start quantifying trade-offs and deciding what's going on.

Acquisti: I agree with you, and indeed, if you consider the field of privacy-enhancing technologies—also known as P-E-T, PETs—so the field of research on technologies that try to protect data. And we have many of those technologies, by the way. For over 20 years now, both academic researchers and industry labs have created protocols—cryptographic protocols, security protocols—to pretty much make any online transaction that we are doing nowadays, make it more secure and more private, more private payments, more private browsing, more private search engine. Think about the transactions that we're doing now online, you can think about a more private way of doing that, OK?

What we don't know yet is, how do these technologies affect—to use economic terminology—economic surplus, and the allocation of the surplus? In other words, if I start using cryptographic protocols that decrease the amount of data available, say, to Google, what are the downstream implications of that? Who is going to pay the cost for those technologies? Is it the consumer? Because of those technologies, behavioral advertising is no longer as precise as before and not as accurate. Is it society as a whole? Because due to those technologies we now have less precise data, and therefore the next researcher investigating cancer or epidemics, they have less valuable data to find a cure.

Or is it just the company itself? For instance, Google in this scenario/example, which sees its profit margin cut? So we have three different scenarios: consumer being affected, society being affected, the company being affected—all three may be true at the same time, or only one, or one more than others, but we have no idea. And that, to me, is one of the big unknowns in the field which we need to address, because we want to understand how things may change once we deploy more private technologies.

Bostic: We've been talking at a very academic, sort of high philosophical, almost "first principles" type of level about privacy. But you've actually been on the ground, and some of your work has actually changed some practices. So I know that you've been talking with the Social Security Administration, and some of the work that you've done has changed how they assign social security numbers. Can you tell that story?

Acquisti: Yes, certainly. What we did was—it has been now about 10 years ago—to use publicly available data on social security numbers: data coming from the so-called SSDI, Social Security Death Index, which is basically a database of the Social Security numbers of people who are dead. And we did a statistical analysis on that database, and we realized that the assignment scheme that the Social Security Administration [SSA] had been using to assign the numbers contained much less randomness than previously believed.

So people—scholars and observers, and obviously the SSA itself—knew that the scheme was not entirely random, but there was a belief that it was random enough to make it hard to just predict or infer someone's SSN just starting from public data. We discovered that that was not the case—that in fact, SSNs can be predicted—and I use the term "predicted," so "statistically predicted." It's a nine-digit number, so typically we cannot predict, just with a single attempt, all of the nine digits correctly. But if we have 50 attempts, or 100 attempts, then we know we have a good degree of likelihood that the actual number lies within those—

Bostic: In that list of 50 or 100.

Acquisti: Yes. It depends also on the state in which the person is born, and what year they are born.

Bostic: But it's pretty systematic—or it used to be.

Acquisti: Yes. What happened, basically, is that in the beginning social security numbers were not assigned when someone was born, but when someone entered the workforce, typically—which happens at unpredictable moments of someone's life, or rather for each person it may happen at a different time, a time different than the next person. But during especially the '80s, the probability that social security numbers would be assigned at birth increased dramatically. That created a sort of a fixed point that can be used as leverage to both understand better how the assignment scheme worked, but also to infer the SSN of a given person based on their date of birth. Now, when we published this ourselves, truth be told, the Social Security Administration had already started considering, for independent reasons, changing the assignment scheme—so we get no credit for that.

Bostic: I bet they accelerated once the paper was published, though.

Acquisti: Maybe. Or maybe we influenced some part of their assignment scheme—I cannot know. But I know this has not been the SSA's fault or issue. The problem is that the social security numbers were designed in the 1930s to fulfill a purpose, completely alien from the purpose that now they are used for in the credit industry. With a social security number, and a person's name, and a person's date of birth, you can often create a line of credit under the person's name. Social security numbers are taken as evidence—kind of like an authentication device. Knowledge of that is taken as evidence that you are who you claim to be.

But that's not what SSNs were created to do. That's not what the Social Security Administration meant for them to be. So this is, again, another interesting story about how over time, you have function creep. Over time, a device—SSN—created for a certain goal, started being used for other goals, and this in turn created the huge problem of identity theft that we have nowadays in the United States.

Bostic: Well, it's always the case that with innovation and technology you get opportunities, but you also get risk. Trying to think about how we try to maximize the benefits from the opportunities, while also minimizing the risk that we're exposed to—it's an ongoing challenge. So I wanted to ask—well, you work a lot in this privacy space. I'm guessing you may have some tips for your average consumer about things they might do to protect their personal data. Like, what things might we think about doing to keep our data in a more private forum?

Acquisti: My answer may surprise you—hopefully not in a bad way, meaning that there are things we can do, for sure. Some things are obvious. Don't put your SSN on a public document that you put online. Don't take a copy of your credit card or your ID and post it on Facebook, obviously.

Bostic: So it's straightforward.

Acquisti: Straightforward. In fact, it takes effort and time and cost to actually do these things rather than not doing them. So basically, don't be silly. Then there are things which are actually a little bit more demanding, a little bit more consuming. They take more effort and knowledge. I was referring earlier to all these privacy-enhancing technologies that all of us can use. We can use encryption to protect our e-mails and to protect the content of our hard disk. We can try to use messaging systems which are more secure and more private than the common ones that are most popular. I will not mention names, but anyone with access to a search engine can easily find these technologies.

And yet—and this is perhaps the part of my answer that may surprise you—I'm not necessarily sure that my advice to people will be "use all of these technologies," for a simple reason. Well, maybe it's not that simple, but its initial principles [are]. Do we really want to live in a world where we push the responsibility of privacy protection back to the users, or do we want a world where the problem of privacy is addressed in a more systemic—and in a way, comprehensive—manner?

Because yes, there are all these technologies, but can we really expect end users to spend their time being privacy experts, always kept up to date about the latest technology? Because it's an arms race, right? There is a new way of tracking, and there is the new protection against the tracking. Then, after there is the protection, there is a new new way of tracking, which is even more sophisticated. Then there is the protection against that. It's an endless race that consumers can hardly win, so do we really want them to feel responsible for a problem which is really a societal problem? That's why I'm suggesting that perhaps what we have to do is to look for "societal scale" solutions, such as technological solutions (rebuild the infrastructure so as to be more privacy-protective) or policy interventions. But we go back to the issue of what exactly is the precise form that those interventions should take? Well, that's a very difficult question for which I do not have an answer.

Bostic: Well, we're almost out of time, and I certainly don't want to end with you not having an answer. [laughter] So I wanted to just ask one last question, which is, are you optimistic about the future, or do you think that the challenges that we face are so large that they're unlikely to be overcome?

Acquisti: Absolutely optimistic, and in a way, it is apt that we are ending with this question, because my answer goes back to exactly where we started. We were talking about these different definitions of privacy, and I was claiming that informational privacy is just one angle. There are many different angles, but they all fall under this dialectic, the intersection between public and private. And I even claimed that there is some evidence that privacy is a universal need of people—so is the need to disclose and socialize, by the way.

So if you believe in this argument I'm making, that suggests that privacy is a need which is a constant, a universal constant, across human beings. So to claim that privacy is dead is to confuse a contingency question with a universal need. What I mean is the following: People seem to have a need for privacy, and that need takes many different expressions. Nowadays, the technology is reducing the availability of private space for individuals, but if my assumption is correct—that people still have that need—they will always find a way to satisfy the need. It could be by leaving social media, it could be by using privacy-enhancing technologies. It could be by creating and reclaiming private spaces for themselves. That's why fundamentally I'm optimistic, because technology changes, but human nature does not.

Bostic: That's a good way to end it, a good note to end on. This has been really, really fascinating, professor—it's great to have you here. I've been talking with Alessandro Acquisti. He's a professor of information technology and public policy at the Heinz College at Carnegie Mellon University. Thank you again for being here—I've really enjoyed it.

Acquisti: Absolutely, my pleasure—thank you for having me.

Bostic: And I also want to thank you for listening. I hope you found this as enjoyable as I have, and I trust you will tune in for our next edition. Thank you.