Published version: IEEE Computer, January 2001


Cyberprivacy in the New Millennium - it doesn't look good from the U.S. perspective!

Hal Berghel

The right to privacy in the U.S. has always been difficult to define because it was never clearly and explicitly articulated in the Constitution. According to Justice William O. Douglas in 1965, we are all entitled to a "zone of privacy" because of a "penumbra" that lies somewhere within the wording of the First, Fourth and Fifth amendments to the Constitution. However, the extent of that zone remains a moving target and is inconsistently applied, as public officials and celebrities alike can testify. The penumbra shades some more than others.

In very general terms, for the past century the U.S. courts have operated under the principle that the right to privacy is tantamount to the "right to be left alone." Everyday experience confirms that this right certainly isn't absolute. Some of us cannot get through a meal without telemarketers invading our privacy. Many of us get spammed continuously by unscrupulous mass-marketers who view email as an invitation to electronically invade our personal computing space. In my town, our user-friendly Post Office provides direct mail dumpsters near the exits that are filled to capacity most of the day with discarded junk mail that should never have been deposited in the P.O. boxes in the first place. We are pestered in the airports by those representing causes and needs, real or imagined. Individuals' names and likenesses routinely appear on Websites without their permission. In fact, URLs that contain celebrity names yet have no connection at all with the celebrity abound. In recent years it has become something of a game to sell (perhaps extort would be more accurate) registered domain names to the individual whose name appears within. A minor typo or confusion can easily misdirect the unprepared surfer from an innocuous business or government site into a porn site. Whether in the comfort of our home or office, or in a public place, assaults on our privacy are continuous and unrelenting. But the worst is yet to come.

Email and the modern "electronic auditorium"

The metaphor of the electronic auditorium is appropriate here because modern networking technology is slowly but surely transforming our heretofore private sanctuaries into public forums. And Email started this trend.

I reflected on Email some years ago in an article entitled "Email: the Good, the Bad and the Ugly" (http://www.acm.org/~hlb/col-edit/digital_village/apr-97/dv_4-97.html). My observations remain true today. While Email is indispensable for most of us, the convenience is not without penalty. This time manager's dream, which enables us to schedule our own communication interrupts, easily dismiss geographical transmission delays, and integrate seamlessly into our digital desktops, and which sets the standard for interpersonal though not-in-person communication, has some nasty side effects. For one, its ubiquity, absence of cost, and ease of use actually encourage abuse. It should be remembered that terms like "spamming," "Email bombing," and "flaming" entered our vocabulary through the use of Email. But even more innocuous applications of email have unfortunate consequences. The collective streams of consciousness from our well-intentioned friends and associates can by themselves easily exceed our personal bandwidths. Email by its very nature is enticing to the point of communication exhaustion among participants. And this is not to mention the more pernicious aspects of email, such as embedded and attached viruses and ill-behaved or malevolent executable attachments.

But by far the most worrisome is the negative impact of Email on our individual privacy. This concern comes from two areas, one overt and one subtle. The overt one is the easier to contend with. It derives from the decision of the Philadelphia Federal District Court in the now-famous "Pillsbury Case" that an employer's reading of employee Email does not "tortiously invade" the employee's right to privacy. Unlike telephone conversations, Email is regarded as corporate property because it relies upon corporate computer systems and inhabits corporate storage facilities (presumably, the corporate disk drive is akin to the desk drawer or locker). The fact that arguments of these sorts are too specious to deserve comment does not lessen their societal impact. However, this nuisance is easily dealt with if computer users are willing to organize their lives around server backup schedules, encryption technologies, disk housekeeping and secure offsite storage. Of course, for most of us this is a bridge too far.

The subtle, negative impact of Email is harder to deal with. In this case, Email has "conditioned" us to accept a level of invasion of our privacy that we would otherwise have found unacceptable. Because it has no volume, unwanted email is easier to discard and thus normally falls below our "call-to-arms" threshold. How many of us would willingly accept the volume of unwanted snail mail that we receive as Email? The interesting twist is that today's "dynamic marketers" correlate Web hits with snail mail addresses because unsolicited mail generally falls below the individual's abuse threshold: they use Web hits to direct snail mail. The irony in this should not be overlooked.

Worse yet, the enormous efficiencies of Email encouraged us to lower our defenses against allowing outside access to our lives. No longer a cottage industry, indexing personal identifiers (e.g., Email addresses, IP numbers, fax numbers, server names) along with other personal or transaction-oriented information (e.g., cookies correlated with the above) has become a big business. It is a virtual certainty that when we click on ad banners, respond to advertisements, or use the "mailto" auto-reply embedded in Websites and Email, we are revealing something personal about ourselves - even if only the contents of the environment variables within our IP packets (cf. the CGI-BIN Bin site at www.uark.edu/wrgx if you're curious about how this works).
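To make the point about environment variables concrete, here is a minimal Python sketch of what a server-side script can read from a single request. The variable names follow the common CGI convention; the function name, the selection of variables, and all sample values are assumptions made for illustration and do not describe the site mentioned above.

```python
# A sketch, not a real server: it only filters a dictionary shaped like the
# environment a CGI script receives. All sample values are invented.

# Conventional CGI variable names that say something about the visitor.
IDENTIFYING_VARIABLES = (
    "REMOTE_ADDR",           # client IP address
    "REMOTE_HOST",           # client host name, if the server resolved it
    "HTTP_USER_AGENT",       # browser and operating system
    "HTTP_REFERER",          # page the visitor came from
    "HTTP_COOKIE",           # cookies the browser returned
    "HTTP_ACCEPT_LANGUAGE",  # preferred languages
)


def revealed_by_request(environ):
    """Return the visitor-related variables that are present and non-empty."""
    return {
        name: environ[name]
        for name in IDENTIFYING_VARIABLES
        if environ.get(name)
    }


# Hypothetical environment for one click on an ad banner.
SAMPLE_ENVIRON = {
    "SERVER_SOFTWARE": "ExampleServer/1.0",
    "REMOTE_ADDR": "192.0.2.17",
    "HTTP_USER_AGENT": "Mozilla/4.7 [en] (Win98; U)",
    "HTTP_REFERER": "http://www.example.com/news/today.html",
    "HTTP_COOKIE": "visitor=abc123",
}

if __name__ == "__main__":
    for name, value in revealed_by_request(SAMPLE_ENVIRON).items():
        print(name, "=", value)
```

Even this short list is enough to tie a click to a network address, a browser, the page being read at the time, and any identifier stored earlier in a cookie.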

Web Barbarians at the Cookie Jar

While Email started the digital assault on personal privacy, the Web accelerated it. As mentioned above, modern "dynamic marketers" maintain huge databases of sundry personal data that are correlated with names, phone numbers, email addresses, IP addresses, potentially the entire set of both client- and server-side environment variables, and, most regrettably, Social Security numbers. In the trade, this is called "profiling via clickthroughs." As a convenience, we'll label those who would use the Web to penetrate our "digital zone of privacy" Web barbarians.

The primary security hole for the Web barbarians is the cookie - digital information stored on the client computer by the browser software or network application. Originally intended as an innocuous Web extension to overcome the statelessness of the Hypertext Transfer Protocol, the cookie has become a bête noire of Internet privacy zealots and informed cybernauts.

HTTP was set up to minimize the bandwidth drain of persistent network connections. The metaphor for HTTP is "connect-process request-respond-disconnect." Unlike other TCP/IP environments such as Telnet and FTP, HTTP only enables one request/response cycle at a time. This becomes problematic for complex communication sequences. Even something as simple as a request to change directories requires a separate connection. With the advent of electronic commerce, it became obvious that some form of persistence across connections would be necessary - e.g., in filling "shopping carts." To record all of this information on a server for millions of users would be beyond the pale, so Netscape Communications Corporation came up with the concept of the client-side identifier, which they called a "cookie." For those interested in the recipe for digital cookies see www.acm.org/~hlb/publications/web99/web99.html.
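The idea of pushing state out to the client can be sketched without any networking at all. In the Python below, each call to a handler stands in for one complete connect-request-respond-disconnect cycle, and the handler remembers nothing between calls. The function names, the "cart" cookie name, and the "|" separator are assumptions made for this illustration, not part of any specification.

```python
# A sketch of a stateless exchange: the "server" function keeps no memory,
# so the shopping cart survives only because the client hands it back.

def parse_cart(cookie_header):
    """Read a hypothetical 'cart=item1|item2' cookie into a list of items."""
    name, _, value = cookie_header.partition("=")
    if name.strip() != "cart" or not value:
        return []
    return value.split("|")


def handle_request(item, cookie_header=""):
    """One request/response cycle: add an item and return the new cookie."""
    cart = parse_cart(cookie_header)
    cart.append(item)
    return {"Set-Cookie": "cart=" + "|".join(cart)}


def browse(items):
    """Play the browser: store each cookie and return it with the next request."""
    cookie = ""
    for item in items:
        response = handle_request(item, cookie)  # a fresh "connection" each time
        cookie = response["Set-Cookie"]          # the state lives on the client
    return cookie


if __name__ == "__main__":
    print(browse(["book", "lamp", "teapot"]))
```

If the client dropped the cookie between calls, the server would have no way to tell that the second request came from the same shopper as the first.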

Web cookies come in two flavors, session and persistent. Session cookies last only as long as the browser session. They are useful for shopping carts and other transaction "lists." Persistent cookies remain on the client until an expiration date is reached, they are manually deleted, or they are automatically deleted by a client-side cookie manager (e.g., Cookie Crusher, Window Washer, Cookie Cruncher, Cookie Pal, NS/IEClean). Persistent cookies are useful in "remembering" past navigation streams through Web sites, storing account names and passwords, or "personalizing" the appearance of a Web site based on recorded user preferences. From a technical point of view, the only difference between session and persistent cookies is whether the "expires=date" option is used and set.
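The distinction can be illustrated with Python's standard http.cookies module. This is a sketch under stated assumptions: the helper names, cookie names, values, and expiration date are all invented for the example.

```python
from http.cookies import SimpleCookie


def make_cookie(name, value, expires=None):
    """Build one cookie; passing an 'expires' date makes it persistent."""
    jar = SimpleCookie()
    jar[name] = value
    if expires is not None:
        jar[name]["expires"] = expires
    return jar[name]


def is_persistent(cookie):
    """A cookie with no 'expires' attribute lasts only for the session."""
    return bool(cookie["expires"])


# Hypothetical cookies: a shopping cart and a returning-visitor identifier.
session_cookie = make_cookie("cart", "item42")
persistent_cookie = make_cookie(
    "visitor", "abc123", expires="Wed, 01 Jan 2003 00:00:00 GMT"
)

if __name__ == "__main__":
    for cookie in (session_cookie, persistent_cookie):
        kind = "persistent" if is_persistent(cookie) else "session"
        print(kind + ":", cookie.OutputString())
```

The two cookies are built by the same code path; the single optional attribute decides whether the browser discards the cookie at exit or writes it to disk.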

However, not all persistent cookies are benevolent - at least from the computer user's point of view. The business part of a cookie is a sequence of delimited "name=value" strings. There are no restrictions on the content of these strings. For all the end-user knows, their Social Security number, telephone number, and email address could be included. Further, cookies are inherently "sharable" between similar domains. Cookie lists are matched against "domain tails" (the trailing strings of a domain name, separated by at least two "dots") and can be sent to the server if a match is detected. Thus, "domain=widget.com" in the cookie could match with "sales_prospects.widget.com," "share_with_hate-group.widget.com," etc. One can easily recognize the potential of malevolent cookies for abuse.
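Tail matching is simple enough to sketch in a few lines of Python. This is a simplified illustration, not a browser's actual algorithm: it compares whole dot-separated labels, omits the rule about a minimum number of dots, and uses invented function names and cookie contents.

```python
# A simplified sketch of "domain tail" matching. Real browsers apply
# additional rules; the names and the sample jar below are hypothetical.

def tail_matches(cookie_domain, host):
    """True if the host equals the cookie domain or ends with '.' + domain."""
    cookie_domain = cookie_domain.lower().lstrip(".")
    host = host.lower()
    return host == cookie_domain or host.endswith("." + cookie_domain)


def cookies_sent_to(host, jar):
    """Return the name=value strings whose domain tail matches the host."""
    return [pair for domain, pair in jar if tail_matches(domain, host)]


# Each entry pairs a cookie's domain with its name=value string.
SAMPLE_JAR = [
    ("widget.com", "customer=12345"),
    ("gadget.com", "session=xyz"),
]

if __name__ == "__main__":
    for host in ("sales_prospects.widget.com", "www.gadget.com"):
        print(host, "receives", cookies_sent_to(host, SAMPLE_JAR))
```

The user who accepted a cookie from one host under widget.com has no say in which other hosts under the same tail receive it later.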

The potential for malevolence doesn't end there. Enter the "Web bug." Modern productivity software, most especially Web browsers, routinely render multi-source documents as single pages. Coalescing disparate Web resources in a single presentation window is one of the great advantages of HTML. Few are aware, however, that any server that contributes any part of a Web page can potentially retrieve, use or share any cookie that relates to the main URL. Such cookies are called "third party" whenever they are manipulated by sites that are unidentified by the active URL. Even attempts to block "third party" cookies have met with limited success, since the third party can always add the active URL's domain tail to the end of its own domain id.

Third party cookies are less worrisome if the page "chunks" of origin are large, and their source is plainly identifiable. But Web bugs are typically one pixel in size - an image anchor in HTML need not be visible! These pixel-sized Web bugs are commonly used for a wide variety of tracking, surveillance, and monitoring activities. No one knows how widespread this practice has become because a single pixel is practically invisible to the user, and thus suspicion is seldom aroused.
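One rough way to look for such bugs is to scan a page's HTML for one-pixel images served from a host other than the page's own. The Python sketch below uses only the standard library; the class and function names, the heuristic itself, and the sample page are assumptions for illustration, and a real tracker could easily evade so simple a test.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class WebBugFinder(HTMLParser):
    """Collect 1x1 images whose source host differs from the page's host."""

    def __init__(self, page_host):
        super().__init__()
        self.page_host = page_host.lower()
        self.suspects = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        tiny = attributes.get("width") == "1" and attributes.get("height") == "1"
        # Relative URLs have no host and therefore belong to the page itself.
        host = urlparse(attributes.get("src") or "").hostname or ""
        third_party = host != "" and host != self.page_host
        if tiny and third_party:
            self.suspects.append(attributes["src"])


def find_web_bugs(html, page_host):
    """Return the image URLs on the page that look like Web bugs."""
    finder = WebBugFinder(page_host)
    finder.feed(html)
    finder.close()
    return finder.suspects


# Hypothetical page: one ordinary logo and one invisible third-party pixel.
SAMPLE_PAGE = """
<html><body>
<img src="/images/logo.gif" width="120" height="60">
<img src="http://tracker.example.net/bug.gif?page=home" width="1" height="1">
</body></html>
"""

if __name__ == "__main__":
    print(find_web_bugs(SAMPLE_PAGE, "www.example.com"))
```

The request for the pixel carries the same kind of information as any other request, so the third-party server learns of the visit even though nothing visible was displayed.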

So far, we have just nibbled away at cookies (for more complete information, see www.cookiecentral.com). Additional privacy threats arise from the Windows 98 Registration Wizard, ill-behaved HTTP servers, public-domain utilities (e.g., "Comet Cursor"), the identd identification daemon, viruses, Trojan horses, Java scripts, "hit logging," spyware that monitors use off-line and then reports the activity when the user re-connects, Internet Explorer's "phone home" feature, and even innocuous productivity apps like Word and PowerPoint. The latter embed network media in just the same way as browsers and are, in principle, just as vulnerable. The invasion of privacy in cyberspace is approaching Orwellian proportions.

The Full Monty

There is no question that technology-oriented invasions of privacy through the Internet will become a major social issue in the years to come. However, they pale in comparison to a much more mundane problem - misuse of Social Security numbers.

The Social Security Act of 1935 sought to provide a retirement "cushion" for all U.S. citizens at a time when any relief from the economic woes of the Depression was sought after. A byproduct of this Act was the Social Security Number, which was to provide the Social Security Administration with a unique record identifier for each applicant.

Unfortunately, the SSN began to take on a life of its own. In 1943, President Roosevelt allowed all Federal databases to use the SSN as a unique identifier, and that practice continues in many Federal agencies today. The Tax Reform Act of 1976 further extended the use of the SSN by state and local government agencies. By this time, the SSN had become embedded in commercial databases, credit bureaus, marketing lists, and so forth. Despite admonitions from citizen watch groups like the Better Business Bureau (see www.bbbonline.org/consumers/tips.html), Social Security Numbers are the primary ingredient of personal identification in the U.S. today.

This fact takes on cosmic significance when one considers that the seemingly innocuous SSN is now the primary tool for identity theft - the fastest growing white collar crime in the U.S. According to a 1998 General Accounting Office report, losses due to identity theft approached $1 billion annually (GAO/GGD-98-100BR). And the holy grail of the identity thief is - you guessed it - the Social Security Number.

Identity theft works in the following way. Important information is compiled on someone with good credit. Likely sources include:

If any of these sources have the SSN in their database, unique identification is a no-brainer. Consider that some states still require the SSN on driver's licenses! With just the SSN, a potential identity thief can extract enough data to create a duplicate, credit-worthy identity of virtually anyone. A "bogus" request for a credit report, together with a well-timed interception of the return mail at the curb, is all one needs to begin.

This is where the Internet comes in. The multi-billion-record database collections in cyberspace, cross-indexed and mined to the extreme, do for identity theft what cookies and their sister technology-excesses do for activity tracking and computer identification. The Internet is where the two putatively harmless ingredients, Social Security Numbers and the technology to harvest vast amounts of data inexpensively and conveniently, come together to produce a great deal of public harm.

On the technology front, there is little to be gained beyond the ad hoc software patching currently in practice. Someone discovers the latest Web bug, then an ingenious developer creates the appropriate insecticide. This spawns the next version of the bug, which in turn... and so on. One of the remarkable aspects of cyberspace is that the pendulum swings are ever so brief.

The solution is to deal with the full monty - the Social Security Number, and all other surrogate unique identifiers that society has bestowed on us. Two such proposals are H.R. 1450 and H.R. 4857, the Personal Information Privacy Act and the Social Security Number Privacy and Identification Protection Act, respectively, introduced by Congressman Jerry Kleczka (Wisconsin). Under these proposed laws, consumers would regain considerable control over the use of their personal identifiers. Credit bureaus would be prohibited from giving out any information not available in the phone book without written consent. Businesses, especially those engaged in electronic commerce, would be prohibited from requiring SSNs as conditions of doing business. As Congressman Jerry Kleczka put it, "Put simply, protecting the SSN is to identity theft as locking the door is to burglary."

But before society deals with the full monty, we are left with a variegated mix of ad hoc remedies such as the cookie managers mentioned above, Web anonymizers (e.g., www.anonymizer.com) that sanitize packet headers that pass from the client to server, re-mailers (e.g., www.zeroknowledge.com) which do as much for email, pseudonym services (www.zks.org), encrypted authentication environments (www.xs4all.nl/~freeswan/), Web monitors that keep abreast of snooping (www.privacyinc.com/) and sundry other digital appliances (www.int.c2.net/).

For sensible overviews of this significant social problem, frequent visits to websites such as these are called for: Travis Perry's Crime Alert (www.futurecrime.com), Computer Professionals for Social Responsibility (www.cpsr.org), and the Privacy Foundation (www.privacyfoundation.org).


Hal Berghel is Professor and Chair of Computer Science at the University of Nevada at Las Vegas and is a frequent contributor to the literature on cyberspace. Related articles may be found via his Website at berghel.net.