2000 B.C. Intell. Prop. & Tech. F. 041901

Privacy and the Internet

Herman T. Tavani

April 19, 2000
(Originally presented at the Ethics & Technology Conference June 5, 1999)

1. Introduction

In the recent literature on privacy and technology, considerable attention has been paid to privacy issues and concerns involving the Internet. In the present study, we consider whether any -- and if so, which -- privacy concerns are unique, or in any way special, to the Internet. It is argued that while many privacy concerns currently associated with the Internet are essentially concerns that were introduced by information and communications technologies that predate the Internet, at least two Internet-related privacy issues have resulted from the use of tools and techniques that did not exist prior to the Internet era: "cookies" and search engines. Privacy concerns raised by these tools and techniques are labeled "Internet-specific" and are contrasted with those privacy concerns categorized as "Internet-enhanced." It is also suggested that perhaps the most significant impact that the Internet has had for personal privacy thus far has not been with respect to any Internet-specific privacy concerns that have been recently introduced, but instead can be found in the implications that certain Internet activities have for questions related to the public vs. private nature of personal information. It will be seen that both Internet-specific privacy concerns, such as those caused by certain uses of search-engine tools, and Internet-enhanced privacy concerns, such as those related to certain uses of data-mining technology on the Internet, raise serious questions for the debate over the public vs. private nature of certain kinds of personal information currently accessible on the Internet.

We begin with an attempt to gain a clearer understanding of the concept of privacy by examining some recent philosophical theories. We next set out to clarify what exactly is meant by the Internet before considering specific privacy concerns currently associated with the Internet. Privacy concerns attributable to Internet-specific and Internet-enhanced tools and techniques are then considered. Next, we examine the impact of those concerns for the debate over the public vs. private nature of personal information currently accessible to users of the Internet. We conclude with an analysis of certain Internet-related privacy issues vis-à-vis a policy model recently put forth by James Moor (1997).

2. What is Personal Privacy?

Privacy is a concept that is neither clearly understood nor easily defined. Some authors suggest that it is more useful to view privacy as either a presumed or stipulated interest that individuals have with respect to protecting personal information, personal property, or personal space than to think about privacy as a moral or legal right. For example, Posner (1978) has suggested that privacy can be viewed in terms of an economic interest and that information about individuals might be thought of in terms of personal property that could be bought and sold in the commercial sphere. Clarke (1999) has recently suggested that privacy can be thought of as an "interest individuals have in sustaining personal space free from interference by other people and organizations." From a practical point of view, it might seem fruitful to approach privacy-related issues and concerns simply from the vantage point of various stipulated interests. Many Western European nations have preferred to approach issues related to individual privacy as issues of "data protection" for individuals -- i.e., as an interest in protecting personal information -- rather than in terms of a normative concept that needs philosophical analysis. In the U.S., on the other hand, discussions involving the concept of privacy as a legal right are rooted in extensive legal and philosophical argumentation, including debate in both Constitutional and tort law.

We shall see that a brief examination of some of the philosophical and legal foundations of privacy will not only provide us with a rich perspective on privacy itself, but will also be particularly useful in helping us to understand what privacy is, why it is valued, and how it is currently threatened by certain activities on the Internet. Such an examination will also help us to differentiate between some subtle, yet significant, aspects of personal privacy. For example, it will enable us to differentiate between the condition of privacy (what is required to have privacy) and a right to privacy, and between a loss of privacy and a violation or invasion of privacy. The purpose of our brief look into privacy theories is not so much to determine whether privacy is or ought to be a right -- moral, legal, or otherwise -- but rather to understand better how one's privacy is threatened by certain activities on the Internet.

Traditional privacy theories have tended to fall into one of two broad types, which I have elsewhere described and critically examined as the "nonintrusion" and "exclusion" theories (see Tavani, 1996). The nonintrusion theory, which views privacy as "being let alone" or "being free from unauthorized intrusion," tends to confuse privacy with liberty. On the other hand, the exclusion theory, which equates privacy with "being alone," confuses privacy with solitude. Both theories could be thought of as variations of what some call "psychological privacy" (see Regan, 1995) and others call "accessibility privacy" (see DeCew, 1997), in that they seem to focus on psychological harms to a person that result from either physical intrusion into one's space or interference with one's personal affairs.

Moor (1997) points out that in the U.S. the concept of privacy has evolved from one concerned with intrusion and interference to one that has more recently been concerned with information. In support of Moor's claim, it must be noted that recent theories of privacy have tended to center on issues related to personal information and to the access and flow of that information, rather than on psychological concerns related to intrusion into one's personal space and interference with one's personal affairs. Addressing information-related privacy concerns, including access to personal information stored in computer databases, many authors now use the expression "informational privacy" as a distinct category of privacy concern. Two relatively recent theories, both of which attend closely to the concept of privacy as it relates to personal information, are the "control" and the "restricted access" theories. Let us briefly examine each theory, beginning with a look at the control theory.

2.1 Control and Restricted Access Theories of Privacy

According to the standard version of the control theory of privacy -- variations of which can be found in Fried (1970) and Rachels (1975) -- one has privacy if and only if one has control over information about oneself. One virtue of this theory is that it separates privacy from both liberty and solitude. Another of its virtues, and perhaps its major insight and contribution to the literature on privacy, is that it correctly recognizes the role of choice that an individual who has privacy enjoys in being able to grant, as well as to deny, individuals access to information about oneself. However, the control theory is beset with at least two major difficulties: one practical in nature, the other theoretical or conceptual. On a practical level, one is never able to have complete control over every piece of information about oneself. The control theory also seems flawed from a theoretical perspective because it implies that one could conceivably reveal every bit of personal information about oneself and yet also be said to retain personal privacy. The prospect of someone revealing all of his or her personal information and still somehow retaining personal privacy, merely because he or she retains control over whether to reveal that information, is indeed counter to the way we ordinarily conceive of privacy. Another weakness of the control theory is that in focusing almost exclusively on the aspect of control or choice, it confuses privacy with autonomy.

According to those who subscribe to the restricted access theory (see, for example, Allen, 1988; and Gavison, 1984), privacy consists in the condition of having access to information about oneself limited or restricted in certain contexts. This theory of privacy, unlike the control theory, correctly recognizes the importance of setting up contexts or "zones" of privacy. Another strength of this theory is that it avoids confusing privacy with autonomy as well as with liberty and solitude. One problem with most versions of the restricted access theory, however, is that they underestimate the role of control or choice that is also required in one's having privacy. That is, they ignore the fact that someone who has privacy can choose to grant as well as to limit or deny others access to information about oneself. Some variations of the restricted access theory also suggest that to the extent that access to information about a person is limited, the more privacy that person has. On this view, privacy would seem to be confused with secrecy.

So it would seem that neither the control nor the restricted access theory is itself adequate. However, since both theories address privacy issues related to personal information (and access to that information) better than earlier theories, which viewed privacy in terms of nonintrusion and exclusion, it would seem that the former pair of theories would be more useful in helping us to understand issues related to informational privacy. Although neither the control nor the restricted access theory provides a comprehensive account of privacy, each theory seems to offer an important insight into what is essential for individuals to have privacy. Can these two theories somehow be successfully combined?

2.2 Moor's Control/Restricted Access Theory

Recently, Moor (1997) has advanced an account of privacy, called the "control/restricted access theory," in which he argues that an individual "has privacy in a situation with regard to others if and only if in that situation the individual...is protected from intrusion, interference, and information access by others." Included in Moor's definition is the notion of a situation, which is central to his theory of privacy. He deliberately leaves this notion of a situation vague so that it can apply to a number of contexts which we "normally regard as private." A situation can, he says, be an "activity," a "relationship," or a "location," such as the storage, access, or manipulation of information in a computer database.

Unlike earlier theories of privacy, Moor's account enables us to distinguish clearly between the condition of privacy and a right to privacy, and between the loss of privacy and a violation of privacy. It does so by drawing a crucial distinction between what Moor calls a "naturally private situation" and a "normatively private situation." In the former situation, individuals are protected by natural means -- e.g., physical boundaries in natural settings, such as when one is hiking in the woods -- from observation or intrusion by others. In the latter situation, privacy is also protected by ethical, legal, and conventional norms. In a naturally private situation, privacy can be lost but not violated or invaded because there are no norms -- conventional, legal, or ethical -- which prescribe one's right to be protected. Moor further refines his definition of privacy by claiming that an individual "has normative privacy in a situation with regard to others if and only if in that situation the individual...is normatively protected from intrusion, interference, and information access by others" (italics added).

At first glance, Moor's theory might appear to be simply a variation of the restricted access theory since it is concerned with limiting or restricting access to situations involving personal information. However, in determining whether or not to set up normatively private situations, Moor argues that individuals must also have some sense of control or choice. To have privacy, on Moor's view, individuals need not have absolute control over every piece of information about them -- unlike the claim implicit in the control theory of privacy -- but rather control in the sense of having some choice. This sense of choice or limited control is based on what Moor calls "informed consent." As informed persons, individuals can, via public debate and public referenda, decide whether a certain situation should be declared normatively private. The rules and parameters defining the situation must, according to Moor, be explicit and public, and individuals must have the opportunity to debate whether or not a certain situation should be declared normatively private. The details for this procedure are worked out more fully in Moor's "Publicity Principle," which will be addressed in the final section of this study, where Moor's control/restricted access theory is applied to specific privacy issues involving the Internet.

3. How is Personal Privacy Threatened by Activities on the Internet?

Having briefly reviewed some influential theories of privacy, and having arrived at a working model of privacy, we next consider ways in which privacy is threatened by certain activities on the Internet. According to a study conducted by the Boston Consulting Group in 1997 (see Wright and Kakalik, 1997), over 70% of the 9,300 users who responded to an online survey said that they were "more concerned about privacy on the Internet" than about privacy threats from any other medium. And according to another recent Business Week/Harris survey (see Benassi, 1999), privacy was found to be the "number one consumer issue" -- ahead of ease of use, spam, security, and cost -- facing the Internet. It would perhaps be useful at this point to clarify what exactly we mean by the Internet.

3.1 What Exactly is the Internet?

For purposes of this study, the Internet can be understood as the network of computer networks or, as it is sometimes called, the Global Information Infrastructure (GII). As such, the Internet must be distinguished from privately owned computer networks, including local area networks (LANs) and wide area networks (WANs). A synthesis of contemporary information and communications technologies, the Internet, as it is known today, evolved from an earlier U.S. Defense Department initiative known as the ARPANET, which originated in the 1960s. The Internet is not owned by any individual(s) or any nation(s). It includes various components, each based on a different protocol. For example, it includes the World-Wide Web, based on the Hypertext Transfer Protocol (HTTP); the File Transfer Protocol (FTP); and Gopher, a non-graphical or text-only form of interface.

Is the Internet a medium or is it an entity of some sort? On the one hand, the Internet might be viewed as the repository of all information connected to its various databases and servers. This view suggests that the Internet can be understood as an entity of some sort. We often speak of downloading information "from the Internet" or of placing information "onto the Internet." So we clearly refer, at least on some occasions, to the Internet as a thing or entity. Let us call this approach to understanding the Internet the entitative or substantival view. On the other hand, the Internet might be viewed not so much as an entity but rather as a medium through which the information residing on connected servers and databases can be accessed. For example, we sometimes speak of "using the Internet" to access information in databases and servers or to interact with others by way of an Internet forum such as a "chat room." This approach to understanding the Internet, i.e., as a medium for access and interaction which provides us with a means of viewing the information from a certain perspective, can be called the perspectival view of the Internet. With these distinctions in mind, we can next consider which, if any, special implications the Internet might have for personal privacy. To answer this question, we will need to examine briefly privacy threats posed by earlier information and communications technologies.

3.2 What is New About Privacy Threats Posed by the Internet?

First, it is worth noting that concerns about personal privacy existed long before the era of the Internet. Since privacy issues regarding the use of technology to gather and communicate information about individuals predate the Internet era, it might seem that there is nothing at all new with respect to the privacy concerns currently associated with the Internet. Perhaps this recent medium has, at most, intensified the debate over concerns already introduced by earlier computing and communications technologies. Even if that assumption turned out to be correct, however, we should not underestimate the impact that Internet technology, with its various data-gathering tools and techniques, has had in terms of the magnitude of certain privacy concerns, whether or not those concerns already existed in some form. Also, and more importantly, we should not, based on what we have considered thus far, assume that no new privacy threats have been introduced by Internet technologies.

We will see that the Internet has contributed to the privacy debate in at least two ways. First, the Internet has made it possible for certain existing privacy threats to occur on a scale that would not have been possible, in a practical sense, with pre-Internet technologies. This set of privacy concerns, while not originating from the Internet, can be said to be enhanced by the Internet. Second, by virtue of certain tools and techniques unique to the Internet, specific privacy threats that were not possible with earlier information and communications technologies have now emerged. These latter privacy concerns could be said to be specific to the Internet. We consider both kinds of concerns, which we shall henceforth refer to as Internet-specific and Internet-enhanced privacy concerns, beginning with an analysis of the latter.

4. Internet-Enhanced Privacy Threats

Privacy concerns which may have arisen because of certain uses of earlier information and communications technologies, but which are now also inextricably associated with and exacerbated by the Internet, can be analyzed under two general headings: (i) dataveillance and data gathering, and (ii) data exchange and data mining.

4.1 Dataveillance and Data-Gathering Activities on the Internet

Some authors suggest that the Internet can be viewed as a new "surveillance medium." Clarke (1988) uses the term dataveillance to refer to the systematic use of systems, which would include the Internet, in the "monitoring of people's actions or communications." Some privacy advocates argue that the Internet, because of its data-monitoring and data-recording mechanisms, poses a threat to privacy on a scale that could not have been realized in the pre-Internet era. In considering such a claim we should first note the obvious, but relevant, point that privacy threats associated with surveillance are by no means unique to the Internet. Nonetheless, concerns over surveillance have, in more recent times, been aggravated by certain activities on the Internet.

In the early days of computing, when computers were owned and operated mostly by large public agencies, it was feared that strong centralized governments would be able to monitor the day-to-day activities of their citizens. Today, however, the threat of surveillance comes not so much from governments and their agencies, at least not in Western democratic societies, as from surveillance by online businesses and corporations in the private sector, which can now monitor the activities of persons who visit their Web sites, determine how frequently these persons visit those sites, and draw conclusions about which preferences those visitors have when accessing their sites. Even the number of "clickstreams" -- keystrokes and mouse clicks -- entered by a Web site visitor can be monitored and recorded.

Data about Internet users can be gathered either directly or indirectly from their online activities. One direct method of information gathering involves the use of Web forms, a data-gathering mechanism into which Web users enter information online. Forms technology, often used to collect information about Web visitors (such as a user's name and address), can, as Kotz (1998) notes, also be used to track the sequence of pages one visits within a given Web site. For the most part, the use of Web forms would seem to be uncontroversial since online users voluntarily submit the requested information. However, information gained from forms can be combined with other directly gathered information, such as information about the items a user purchases online, and can then be combined with online information about individuals that is gathered indirectly. One indirect method for gathering personal information involves the use of Internet server log files. Kotz points out that because Web browsers transmit certain kinds of data to a Web server, such as the Internet address of the user's computer system as well as the brand name and version number of the user's Web browser and operating system software, Internet server logs can be used to gather personal data in an indirect manner. Information gathered indirectly from server logs can also be combined with information gained directly from Web forms, which can then eventually be used by advertising agencies to target specific individuals.
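The combination of direct and indirect data gathering described above can be sketched briefly in code. The following is a minimal, hypothetical illustration, not an account of any actual site's practice: the log-line layout (modeled on the common server-log format of the era), the field names, and the sample visitor are all assumptions. It shows how data a server logs indirectly (network address, browser string) can be joined with data a visitor volunteers directly through a Web form.

```python
import re

# Hypothetical server-log layout: address, identity fields, timestamp,
# request line, status, size, referrer, and browser (user-agent) string.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<when>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Indirect gathering: extract the visitor's address, the page
    requested, and the browser identification the browser transmitted."""
    m = LOG_PATTERN.match(line)
    if m is None:
        return None
    parts = m.group("request").split()
    return {
        "ip": m.group("ip"),
        "page": parts[1] if len(parts) > 1 else "",
        "agent": m.group("agent"),
    }

def link_form_to_log(form_submissions, log_records):
    """Combine data volunteered in a Web form with indirectly logged
    data, keyed here (for illustration) on the visitor's address."""
    by_ip = {rec["ip"]: rec for rec in log_records if rec}
    return [
        {**form, **by_ip[form["ip"]]}
        for form in form_submissions
        if form["ip"] in by_ip
    ]
```

The point of the sketch is simply that neither record is especially revealing alone; the join is what produces a targetable profile.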

Although Web forms and Internet server logs can be used in ways that pose significant threats to the privacy of Internet users, it is worth noting that neither of these two technologies is unique to the Internet, since the development and use of both technologies predates the Internet era. However, the scale on which dataveillance and data gathering can now be carried out has increased dramatically because of the use of forms and server-log technologies on the Internet.

4.2 Data-Exchanging and Data-Mining Activities on the Internet

Whereas the dataveillance and data-gathering tools described in the preceding section are used mainly to monitor and record activities of online users, other tools are used to exchange that data on the Internet. This exchange of online personal information often involves the sale of personal data to third parties, which has resulted in commercial profits for certain online entrepreneurs, often without the knowledge and consent of individuals about whom the data is exchanged.

Techniques for exchanging personal data online are hardly new to the Internet. Data-exchange techniques such as the merging and matching of electronic records stored in databases occurred before the Internet era (see Tavani, 1996). However, Internet technology has facilitated the exchange of online personal information at a rate that was not possible in the pre-Internet era. In response to earlier data-exchange practices involving computer networks, certain privacy laws and data-protection guidelines have been enacted and implemented. These laws and principles, which specifically address the exchange of personal information between databases in computer networks, would also seem by extension to apply to the exchange of personal data on the Internet as well.

Although earlier privacy issues involving the exchange of personal information in databases have centered mainly on the transfer of such information between databases in computer networks, some recent privacy concerns have emerged because of the kind of personal information that can now be extracted or "mined" from within a single database. These concerns arise from a technique commonly referred to as data mining. Data-mining technology, which combines research in artificial intelligence (AI) and pattern recognition, is defined by Cavoukian (1998) as a "set of automated techniques used to extract buried or previously unknown pieces of information from large databases." Using data-mining techniques, it is possible to unearth previously unknown patterns and relationships and to use this "new" information -- i.e., new "facts" and relationships in the data -- to make decisions and forecasts. Through the use of data-mining algorithms, individual pieces of data about an online user's activities, which in themselves might seem innocuous, can be recorded, combined, and recombined in ways to construct profiles of individuals. As a result of data-mining applications, an individual might eventually discover that he or she belongs to some consumer category or some risk group, the existence of which he or she had been previously unaware.
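The kind of pattern extraction just described can be illustrated with a minimal sketch. This is not any vendor's data-mining algorithm; it is a toy version of one common move in such systems -- counting item co-occurrences across many records -- and the transactions, support threshold, and consumer-category names below are all hypothetical. It shows how individually innocuous purchases can place a person into a category he or she never chose.

```python
from collections import Counter
from itertools import combinations

def mine_pairs(transactions, min_support=2):
    """Count how often pairs of items co-occur across transaction
    records, keeping pairs that meet a minimum support threshold --
    a simplified instance of association-pattern mining."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(set(items)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}

def assign_category(user_items, patterns, categories):
    """Place an individual into a consumer category when his or her
    items contain a mined pattern. Category labels are illustrative."""
    owned = set(user_items)
    for pair in patterns:
        if set(pair) <= owned and pair in categories:
            return categories[pair]
    return None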

The mining of personal data in the pre-Internet era, which depended on the use of large commercial databases called data warehouses to store the data, focused mainly on transactional information. Personal data mined from the Internet, however, need not be (and frequently is not) transactional. For example, information typically included in and mined from personal Web pages, as well as noncommercial Web sites, is nontransactional. Because of Internet commerce, however, much transactional information can now also be gained from the Web as well. When an individual orders a book from Amazon.com (an online book store), for instance, transactional information is recorded about the purchase, and information about that particular transaction can be (and frequently is) used for future business decisions. What distinguishes the Internet as a mining resource from the large databases or data warehouses used in data mining, however, is the vast amount of nontransactional, personal information currently available for mining on the Internet.

Cavoukian (1998), who points out that one of the purposes of data mining is to "map the unexplored terrain of the Internet," notes that the Internet is becoming an "emerging frontier for data mining." She also notes that with access to an Internet server, it is possible to FTP (file transfer protocol) the data from the client's server and then conduct various data mining activities. Fulda (1998) points out that because data-mining software employs certain AI techniques, it can "learn" about the Web by coming to understand the content associated with common HTML tags. Eisenberg (1996) notes that intelligent agents can "sift through" the potential wealth of data on the Internet, and Etzioni (1996) describes the use of "learning techniques" or systems such as softbots (intelligent software robots or agents that use tools on a person's behalf) and metasearch engines (such as MetaCrawler and Ahoy) to uncover general patterns at individual Web sites and across multiple Web sites. So data-mining techniques that previously raised privacy concerns only at the database or data-warehouse level now raise concerns on the Internet as well.

Although data-mining techniques are currently used on the Internet, we have also seen that the use of that technology predates the Internet era. So in this sense, privacy issues associated with data mining on the Internet are similar in kind to privacy concerns associated with Web forms and Internet server log files. Like the privacy concerns raised by the use of these latter tools or technologies, concerns raised by data-mining technology are not unique to the Internet. Instead, they are instances of what we have earlier identified as Internet-enhanced privacy concerns.

5. Internet-Specific Privacy Threats

We next examine privacy issues that are specific to, rather than merely enhanced by, activities on the Internet. Privacy concerns that are attributable to tools and techniques provided by the Internet itself, which we described earlier as "Internet-specific" concerns, arise mainly from the use of two new types of online tools. One technique involves the gathering of personal data from users who visit certain Web sites, whereas the other tool can be used to locate personal information via the Internet.

5.1 Internet Cookies

Through the use of a data-gathering technique called Internet cookies, online businesses and Web-site owners can store and retrieve information about a user who visits their Web sites, typically without that user's knowledge or consent. Cookies technology has generated considerable controversy, in large part, because of the novel way in which certain information about Internet users can be collected and stored. Information about an individual's online browsing preferences can be "captured" while that user is visiting a Web site, and then stored on a file placed on the hard drive of the user's computer system. The information can then be retrieved from the user's system and resubmitted to a Web site the next time the user accesses that site. Cookies technology is the only data-gathering technique that actually stores the data it gathers about a user on the user's computer system.

The owners and operators of one Web site cannot access cookies-related information pertaining to a user's activities on another Web site. However, information about a user's activities on different Web sites can, under certain circumstances, be gathered and compiled by online advertising agencies. For example, online advertising agencies such as DoubleClick.net, who pay to place advertisements on Web sites, include a link from a host site's Web page to the advertising agency's URL. So when a user accesses a Web page that contains an advertisement from DoubleClick.net, a cookie is sent to the user's system not only from the requested Web site but also from that online advertising agency. The advertising agency can then retrieve the cookie from the user's system and use the information it acquires about that user in its marketing advertisements. The agency can also acquire information about that user from cookies retrieved from other Web sites the user has visited, assuming that the agency advertises on those sites as well. The information can then be combined and cross-referenced in ways which enable a marketing profile of that user's online activities to be constructed and used in more direct advertisements.
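The cross-site mechanism described above can be sketched abstractly. This is a simplified model, not the HTTP-level detail (real cookies travel in Set-Cookie and Cookie headers, which the sketch abstracts away), and the site names are hypothetical. It shows the two properties at issue: one Web site cannot read another's cookie, yet an advertising agency embedded on both sites accumulates a cross-site record.

```python
class Browser:
    """A toy model of a browser's cookie store ('jar')."""

    def __init__(self):
        self.jar = {}  # site -> {cookie name: value}

    def visit(self, site, set_cookies, embedded_ads=()):
        """Store cookies set by the visited site and -- because the page
        embeds an advertisement -- cookies set by the third-party ad
        server as well."""
        self.jar.setdefault(site, {}).update(set_cookies)
        for ad_site, ad_cookie in embedded_ads:
            self.jar.setdefault(ad_site, {}).update(ad_cookie)

    def cookies_sent_to(self, site):
        """Only `site`'s own cookies are resubmitted on the next visit;
        cookies belonging to other sites are not visible to it."""
        return dict(self.jar.get(site, {}))
```

A hypothetical run: visiting a bookseller and then a news site, each carrying an ad from the same agency, leaves the agency's cookie recording both visits even though neither host site can see the other's cookie.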

Several privacy advocates have argued that because cookies technology involves the monitoring and recording of a user's activities while visiting Web sites (without informing the user), as well as the subsequent downloading of that information onto a user's computer system, that technology violates the user's privacy. Defenders of cookies, who are usually owners of online businesses and Web sites, maintain that they are performing a service for repeat users of a Web site by customizing a user's means of information retrieval and by providing the user with a list of preferences for future visits to that Web site. Despite any alleged advantages provided by cookies to users who frequently visit one or more Web sites, most users surveyed indicated that they care more about not losing their privacy while visiting Web sites than about gaining customized retrieval preferences for their favorite sites. According to a 1996 Equifax/Harris Internet Consumer Privacy Survey (cited in Wright and Kakalik, 1997), 64% of respondents believed that providers of on-line services and Web sites should not be able to track users' activities on the Internet, including the Web sites they visit, regardless of whether that information is eventually used by online advertisers.

To assist Internet users in their concerns about cookies, a number of privacy-enhancing tools have recently been made available. One such product from Pretty Good Privacy (PGP) is pgpcookie.cutter, which enables users to identify and block cookies on a selective basis. In the newer versions of most Web browsers, users have an option to "disable" cookies. As such, users can either opt in or opt out of cookies, assuming that they are aware of cookies technology and assuming that they know how to enable or disable that technology on their Web browsers. The reason that privacy threats associated with cookies can be categorized as an Internet-specific privacy concern, of course, is that the privacy threat posed by that particular data-gathering technique is unique to the Internet.

5.2 Internet Search Engines

Internet technology has also provided tools that support new techniques for disseminating information about persons. Wright and Kakalik (1997) note that a certain kind of information about individuals, which was once difficult to find and even more difficult to cross-reference, is now readily accessible and collectible through the use of automated search facilities on the Internet. These facilities are called Internet search engines. Included in the list of potential topics on which search-engine users can inquire is information about individual persons. By entering the name of an individual in the search-engine program's entry box, search engine users can potentially retrieve information about that individual. However, because an individual may be unaware that his or her name is among those included in a search-engine database, or because he or she might be altogether unfamiliar with search-engine programs and their ability to retrieve information about persons, questions concerning the implications of search engines for personal privacy have been raised (see Tavani, 1997). As in the case of privacy concerns associated with Internet cookies, privacy issues involving search engines did not exist prior to the Internet era.
A search for a person's name will often return the addresses of Web pages written by that person or the addresses of Web sites that include information about that person. Kotz (1998) points out that since many email-discussion lists are stored and archived on Web pages, it is possible for a search engine to locate information that users contribute to electronic mailing lists or listservers. Search engines can also search through archives of newsgroups, such as Usenet, on which online users post and retrieve information. One such archiving service, DejaNews, is set up to save permanent copies of news postings and thus provides search engines with a comprehensive searchable database. DejaNews also provides "author profiles," which include links to all of the online articles posted by a particular person. Because the archived newsgroups contain links to information posted by a person, they can provide search-engine users with considerable insight into that person's interests and activities.
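A toy example can make the retrieval mechanism concrete. The documents below are invented, and real engines crawl Web pages and newsgroup archives on a vastly larger scale, but the same basic move, mapping each word to the documents that contain it, is what allows a name query to pull together postings scattered across years of archives.

```python
# Toy sketch of how a search engine can index archived postings and
# retrieve everything associated with a person's name (invented corpus).

def build_index(documents):
    """Map each lowercased word to the set of documents containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index

def search(index, query):
    """Return the documents matching every word of the query."""
    results = None
    for word in query.lower().split():
        docs = index.get(word, set())
        results = docs if results is None else results & docs
    return results or set()

archive = {
    "listserv/1998-04": "jane doe asks about encryption software",
    "usenet/alt.support": "posting signed jane doe about a support group",
    "webpage/club.html": "member list: john smith, treasurer",
}
index = build_index(archive)
# A name query pulls together postings the author may have long forgotten.
print(sorted(search(index, "jane doe")))
```

Note that the person indexed plays no part in this process: the index is built from whatever happens to be archived, which is precisely the source of the privacy concern raised above.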

It could be argued that information currently available on the Internet, including information about individual persons, is, by virtue of its residing on the Internet, public information. We can, of course, question whether all of the information currently available on the Internet should be viewed as public information. The following scenario (see Tavani, 1997) may cause us to question whether at least some information about individual persons, such as personal information stored on a Web server or in a database that is accessible to Internet users, should be viewed as public information. Consider a case in which an individual contributes to a cause sponsored by a homosexual organization. That individual's contribution is later acknowledged in the organization's newsletter (a hardcopy publication that has a limited distribution). The organization's publications, including its newsletter, are then converted to electronic format and included on the organization's Internet Web site. The Web site is "discovered" by a search-engine program and an entry for that site's address is recorded in the search engine's database. Suppose that you enter this individual's name in the entry box of a search-engine program and a "hit" results, identifying that person with a certain homosexual organization. Since the person identified may have no idea that such publicly available information about his or her activities exists, here the use of search-engine technology might indeed raise a privacy concern for the individual in question.

6. The Public vs. the Private Aspects of Personal Information

We have already seen that certain Internet-related privacy concerns, whether they fall under the category of concerns specific to Internet technology or under the category of concerns simply enhanced by Internet tools and techniques, may cause us to reassess some of our implicit assumptions regarding the distinction between public and private information. For example, at least some uses of search engines (an Internet-specific tool) as well as certain cases involving data mining (an Internet-enhanced technique) challenge our commonly held assumptions about which information should be regarded as public and which should be viewed as private.

6.1 Is the Internet a Public "Place"?

With respect to the public vs. private characteristics of the data that is both collected from certain Internet activities by online businesses and available to any Internet user via a range of search tools and techniques, a number of questions arise. For example, is the Internet a public "place" (assuming, of course, an "entitative" or substantive view of the Internet), in which case certain activities would come under the rules of the public sphere? If so, can visitors reasonably expect to receive any more protection in the public realm of "Internet space" or cyberspace than they currently enjoy in the public realm of physical space? On the one hand, the Internet might seem, by definition, to be in a public sphere. Does it follow, however, that all information about persons that is currently gathered from, and included in databases accessible to, the Internet is ipso facto public information? Wright and Kakalik (1997) point out that while certain types of information, e.g., real estate deeds, have always been available through public documents, in the past this information was not readily available for mass distribution. Should the fact that this publicly accessible information can now be distributed electronically, both in a manner and on a scale that was not previously possible, cause us to reconsider our sense of what is "public information"? Do we perhaps need new rules and policies regarding what counts as a "public activity" on the Internet because of the ways in which "public information" can now be exchanged?

As we saw in our earlier discussion of data-mining activities on the Internet, most privacy concerns involving data mining were not centered on the exchange of confidential or intimate information such as one's medical records, financial records, or personal relationships. Rather, the concern was over the collection of a kind of personal information that, in the past, one might have thought did not need protection. In many cases, some legal protection has been granted to personal information thought to be private (i.e., intimate, sensitive, or confidential information). As Nissenbaum (1997) so aptly puts the matter, our concern is now for personal information in "spheres other than the intimate" (italics Nissenbaum). Concerns about which kinds of personal information need protecting are now beginning to intensify because of the ways personal privacy is threatened in certain (nonconfidential and nonintimate) online activities that might easily be construed as activities in the public sphere. So it would seem that Nissenbaum is justified in her claim that we need to "protect privacy in public."
Nissenbaum also notes that because of the ways we typically use the terms "private" and "public," the expression "privacy in public" may seem oddly paradoxical. She further notes that few normative theories sufficiently attend to the public aspect of privacy protection and that philosophical work on privacy theory suffers a "theoretical blind spot" when it comes to the protection of privacy in public. Nissenbaum points out that the handful of arguments that have been advanced to protect privacy in public have met with "knock-down objections," based on the following line of reasoning: "If a person makes no effort to...[conceal]...information about themselves...then restricting what others perceive of the person, or record and do with the information thus recorded, is to place unacceptable limits on their freedom" (Nissenbaum, 1997). It would certainly seem that this line of reasoning provides a convenient rationale for those online entrepreneurs who currently engage in data-mining techniques as well as in other data-gathering activities on the Internet.

6.2 Can Moor's Theory of Privacy Help Us to Resolve These Issues?

To resolve privacy concerns related to the distinction between the private vs. public nature of personal information, we will need to proceed from an adequate theory of privacy. We can now return to the theory of privacy advanced by Moor (1997), which was considered briefly in Section 2.2 of this study. There it was argued that Moor's control/restricted access theory was the most comprehensive of the privacy theories that we examined. Because Moor's theory can account for certain anomalies concerning private situations in public contexts, his conception of privacy would, unlike earlier theories of privacy, seem to offer us a starting point for considering questions regarding the public vs. private character of personal data. Let us next consider Moor's insight into this question by briefly looking at two of his examples: one involving faculty salaries at different colleges and another involving a discussion between a married couple in a restaurant.

Moor invites us to consider a case involving the question whether information about salaries received by college faculty should be protected (i.e., declared a "normatively private situation"). He points out that at private colleges faculty salary schedules are often kept confidential, whereas at larger state colleges, faculty salaries above a certain level are sometimes published. And, Moor claims, good reasons can be given for doing so in each situation. Note that there is nothing inherent in the information concerning faculty salaries that would suggest that information about those salaries should be public information or private information, i.e., information that should be protected in a "normatively private situation." So it would seem that in the process of determining whether certain information should be viewed as public information or should be protected as normatively private information, it is neither the kind of information -- i.e., whether the information is intimate or confidential -- nor the content of the information itself that informs us as to whether the information in question should be declared private or left public. Instead, it is the situation or context in which the information is used that determines whether that information should be declared normatively private or left unprotected.

Moor (1997) also envisions a scenario in which a married couple arguing loudly in a restaurant about certain details of their marital life can respond to a waiter, eager to offer the couple some advice, by pointing out to the waiter that their discussion is really a "private matter" and that they do not wish to hear his advice. Whereas the couple's response to the waiter might seem odd or even paradoxical, Moor notes that such a response does make sense because in private situations the access to information "can be blocked in both directions." Moor points out that the couple can reasonably reject incoming information from the waiter, despite the fact that the couple had been publicly indiscreet in revealing details of their marital life to the waiter and to the patrons of the restaurant. Because Moor's examples involving the restaurant conversation and faculty salaries seem to address, at least on a certain level, issues concerning privacy in public, his control/restricted access theory would seem to provide some insight into the public vs. private distinctions regarding personal information. We next consider Moor's theory more fully by applying it to some specific privacy concerns currently associated with the Internet.

7. Applying Moor's Theory to Specific Privacy Issues on the Internet

We next consider whether Moor's control/restricted access theory can be successfully applied to the cluster of Internet-related privacy concerns. First, we ask whether any of the practices that give rise to the privacy concerns discussed in the preceding sections necessarily violates or invades the privacy of individuals. We then consider whether Moor's theory can be used to frame a coherent privacy policy that will enable us to resolve future privacy issues that may arise on the Internet as well.

7.1 The Loss of Privacy vs. the Violation or Invasion of Privacy

We begin by examining Moor's distinction between a loss of privacy and a violation or invasion of privacy, considered briefly in Section 2.2 of this study. We should recall that central to Moor's theory of privacy was the notion of a situation. Does the Internet constitute a situation? As we saw earlier, the Internet could be viewed either substantively as a repository or perspectivally as a medium. When viewed as a repository of information, consisting of all of the information contained in the databases accessible to it, the Internet might not seem to fit neatly into Moor's notion of a situation. When viewed as a medium, however, the Internet can be said to constitute multiple situations. So using Moor's notion of a situation in conjunction with our sense of "Internet as medium," the Internet can be viewed as consisting of a series of situations. Interestingly, Moor includes among his descriptions of a "situation" an "activity in a location" and the "storage and access of information" (such as that in a computer database). Each of the following Internet activities would seem to count as a legitimate situation: the use of Internet search engines to locate individuals or information about those individuals; the use of Internet cookies to gather personal information about users and to store that information on the user's computer; the mining of personal data; the use of Internet forms to (directly) gather personal information; and the use of Internet server log files to gather personal data (indirectly). Although Moor's theory could be applied to any of these situations, let us consider, for purposes of this section of the study, the situation of mining personal data on the Internet.

Does mining personal data on the Internet violate or invade an individual's privacy? Moor (1997) would concede that privacy is indeed lost by an individual, X, whenever data about X is mined on the Internet. On Moor's view, however, the mere loss of privacy by an individual in a natural situation does not necessarily constitute an invasion of that individual's privacy. So it is not yet clear whether X's privacy has been violated or invaded in a normative sense. Should all personal information contained on the Internet that is currently accessible to data-mining techniques be declared a normatively private situation? In other words, should access to that information be restricted, and, if so, how should we decide the matter? Moor notes that the boundaries of normative privacy can vary significantly from "place to place, and time to time." He further notes that situations can also vary within a group, but points out that this does not mean that privacy standards are "arbitrary or unjustified." We saw in the preceding section, in our discussion of whether information regarding faculty salaries should be construed as public information or declared normatively private, that policy decisions can vary from situation to situation, or context to context. We also saw that we can have good reasons for making certain faculty salaries public in one context, e.g., at state colleges, and declaring them normatively private situations in other contexts, such as in small private colleges.

Moor (1997) claims that we can better protect our privacy if we know exactly where the zones of privacy are, and if we know under what conditions and to whom information will be given. One strength of Moor's theory is that it requires us to state publicly what the parameters of a private situation are so that they will be "completely public" and presumably known to all those in or affected by a situation. Moor goes on to point out that privately restricted situations or zones must conform to what he calls the Publicity Principle. Let us next see how Moor's Publicity Principle can serve as the foundation for a policy regarding privacy on the Internet in general and privacy issues related to data mining in particular.

7.2 Moor's Publicity Principle and its Implications for the Internet

It would perhaps be prudent for us to begin a public dialogue on privacy concerns related to the Internet in general, and data mining in particular, while both are still relatively new. A plausible policy would, as Moor's theory rightly suggests, need to spell out clearly the requirements for all individuals (Internet users) and online businesses, including Web-site owners. Applying Moor's Publicity Principle to data-mining situations involving online businesses and online consumers, we could propose what Moor calls a rational debate on data mining in which consumers are first informed that data mining is being used by certain online businesses to gather information about them that can be used in ways that they most likely had not explicitly authorized. Following Moor's Publicity Principle, with its invitation for a rational debate, the onus would seem to be on businesses to inform consumers about data mining, and not on the consumers to discover for themselves which online businesses engage in this practice. Consumers need to be told explicitly that information about them is being used in data mining activities, since it would not be reasonable to expect that the average consumer would be aware of data-mining technologies. So by having an explicit policy in which consumers were made aware of data mining and its applications, online users could inquire into how information about them is compiled and used by the businesses with which they transact, and these consumers would thus be able to make informed choices. An opportunity for individuals to make informed choices would certainly seem to be an important ingredient in any policy that purports to be open and fair.

An open and fair policy would require the explicit consent of the online user (or data subject) to have his or her data used for data-mining purposes. And following Moor's Publicity Principle, with its notion of informed consent, consumers must also be given some say in what the acceptable rules -- e.g., the parameters and limitations of uses of the data about them -- will be in that practice. Clearly defined rules must, Moor says, be established, and individuals must be explicitly informed of those rules.
While most online users would likely opt out of data mining, some users might see certain advantages for themselves in having their personal data mined. For example, that process might result in their e-mail (or hardcopy) solicitations being more directly targeted to their individual interests as opposed to their receiving more generic forms of "junk mail." Other online consumers might be inclined, if given a choice over whether to have their personal data mined, to "opt in" if there were certain financial advantages such as consumer discounts or rebates on items purchased. The important point, of course, is that Internet users would have some say in how data about them is used. The same rules used for determining whether to declare data-mining activities on the Internet a normatively private situation, and the same considerations regarding the pros and cons of opting in to or out of Internet data mining, could be applied to Internet cookies as well as to other Internet situations or activities.
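The consent gate that such a policy requires can be sketched very simply. The registry format, the subject field, and the example records below are hypothetical illustrations; the point is only that, under an open and fair policy of the kind described above, data enters the mining process only when its subject has explicitly opted in.

```python
# Minimal sketch of the opt-in consent gate an open and fair data-mining
# policy would require. Record formats and names are hypothetical.

consent_registry = {
    "alice@example.com": True,   # opted in (e.g., for discounts)
    "bob@example.com": False,    # explicitly opted out
}

def mineable_records(records, registry):
    """Keep only records whose subjects have explicitly opted in.
    Subjects with no recorded choice are excluded by default, since
    consent must be informed and explicit, never presumed."""
    return [r for r in records if registry.get(r["subject"], False)]

purchase_records = [
    {"subject": "alice@example.com", "item": "book"},
    {"subject": "bob@example.com", "item": "software"},
    {"subject": "carol@example.com", "item": "music"},  # never asked
]
# Only Alice's data may enter the mining process.
print(mineable_records(purchase_records, consent_registry))
```

The default-deny choice in `mineable_records` mirrors the argument above: the onus is on the business to obtain consent, not on the consumer to discover and refuse the practice.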

8. Conclusion

It would seem that Moor's control/restricted access theory, with its Publicity Principle, provides us with a comprehensive and consistent, yet flexible, procedure for resolving privacy disputes involving the Internet. Because of its flexibility of application, this theory also provides us with a mechanism to resolve, via open and rational debate, future privacy concerns that may arise from the use of Internet tools and techniques that have yet to be developed and implemented. Unlike many privacy-enhancing technologies that have recently been put forth as technical solutions or "techno-fixes" to privacy threats introduced by specific Internet technologies, Moor's comprehensive theory provides a procedure for addressing and resolving privacy issues in a much more systematic manner. Because most technical solutions are aimed at eliminating threats introduced by specific Internet tools and techniques, they tend to be ad hoc and nonsystematic "quick fixes" to privacy issues that are complex in nature. An adequate solution to current and future privacy issues involving the Internet needs to take into account those complexities. Fortunately, Moor's theory enables us to do just that.


References
Allen, A.: 1988, Uneasy Access: Privacy for Women in a Free Society, (Rowman and Littlefield, Totowa, NJ).

Benassi, P.: 1999, "TRUSTe: An Online Privacy Seal Program," Communications of the ACM, vol. 42, 2, 56-59.

Cavoukian, A.: 1998, Data Mining: Staking a Claim on Your Privacy, (Information and Privacy Commissioner's Report, Ontario, Canada).

Clarke, R.: 1988, "Information Technology and Dataveillance," Communications of the ACM, vol. 31, 5, 498-512.

Clarke, R.: 1999, "Internet Privacy Concerns Confirm the Case for Intervention," Communications of the ACM, vol. 42, 2, 60-67.

DeCew, J.W.: 1997, In Pursuit of Privacy: Law, Ethics, and the Rise of Technology, (Cornell University Press, Ithaca, New York).

Eisenberg, A.: 1996, "Privacy and Data Collection on the Net," Scientific American, March, 120.

Etzioni, O.: 1996, "The World Wide Web: Quagmire or Gold Mine?" Communications of the ACM, vol. 39, 11, 65-68.

Fried, C.: 1970, "Privacy: A Rational Context," Chap. IX in Anatomy of Values, (Cambridge University Press, New York).

Fulda, J.: 1998, "Data Mining and the Web," Computers and Society, vol. 28, 1, 42-43.

Gavison, R.: 1980, "Privacy and the Limits of the Law," Yale Law Journal, vol. 89.

Kotz, D.: 1998, "Technological Implications for Privacy." In J.H. Moor, ed. Proceedings of the Conference on The Tangled Web: Ethical Dilemmas of the Internet, August 7-9, 1998, (Dartmouth College, Hanover, NH) forthcoming.

Moor, J.H.: 1997, "Towards a Theory of Privacy in the Information Age," Computers and Society, vol. 27, 3, 27-32.

Nissenbaum, H.: 1997, "Can We Protect Privacy in Public?" In M.J. van den Hoven, ed. Proceedings of the Conference on Computer Ethics: Philosophical Enquiry: CEPE '97, (Erasmus University Press, Rotterdam, The Netherlands), 191-204.

Posner, R.A.: 1978, "An Economic Theory of Privacy," Regulation, May-June, 19-26.

Rachels, J.: 1975, "Why Privacy Is Important," Philosophy and Public Affairs, vol. 4, 4.

Tavani, H.T.: 1996, "Computer Matching and Personal Privacy: Can They Be Compatible?" In C. Huff, ed. Proceedings of the Symposium on Computers and the Quality of Life: CQL '96, (ACM Press, New York), 97-101.

Tavani, H.T.: 1997, "Internet Search Engines and Personal Privacy." In M.J. van den Hoven, ed. Proceedings of the Conference on Computer Ethics: Philosophical Enquiry: CEPE '97, (Erasmus University Press, Rotterdam, The Netherlands), 214-223.

Tavani, H.T.: 1998, "Data Mining, Personal Privacy, and Public Policy." In L.D. Introna, ed. Proceedings of the Conference on Computer Ethics: Philosophical Enquiry: CEPE '98, (University of London Press, London, UK), 113-120.

Tavani, H.T.: 1999, "Informational Privacy, Data Mining, and the Internet." Ethics and Information Technology, vol. 1, 2.

Tavani, H.T.: In Press, "Privacy and Security," Chap. 4 in D. Langford, ed. Internet Ethics, (MacMillan Press, London, UK).

Wright, M. and J. Kakalik.: 1997, "The Erosion of Privacy," Computers and Society, vol. 27, 4, 22-25.


Significant portions of this paper are extracted from "Privacy and Security," Chap. 4 in Internet Ethics (ed. Duncan Langford), forthcoming from MacMillan Publishers (UK). I am grateful to Professor Langford and to MacMillan Publishers for permission to use material from that chapter in several sections of this paper.

© 2000 Herman T. Tavani. Published with permission of the copyright holder.
