What is the definition of “your data”? The answer may determine the future of the Internet – and, more broadly, of communications media, the users that derive value from them, and the marketers that depend on them.
The combination of the word “data” or “information” with a personal possessive pronoun lies at the heart of the current debate over interactive advertising and privacy. In the Monday New York Times story “Web Privacy on the Radar in Congress,” reporter Stephanie Clifford wrote that a subject of her piece knows that companies “are collecting his data.” The Center for Democracy and Technology, the prominent Washington-based proponent of a Federally mandated “do not track list” against interactive advertising, told the Los Angeles Times recently that Americans are “uncomfortable” with “the collection of their data.” The Federal Trade Commission, in proposing principles to control “behavioral advertising,” recommends that “consumers can choose whether or not to have their information collected for such purpose.” Democratic Congressman Edward J. Markey of Massachusetts said yesterday that he expects to introduce legislation during the coming year that “includes a set of legal guarantees that consumers have with respect to their information.”
All well and good, you might say: My identity must be protected from thieves and exploiters. But guess what? The plans that these activists and their enablers are promoting have nothing to do with identity protection. To the contrary, they are agitating – some, perhaps, unwittingly – for a new property right, unique in U.S. law, that would give consumers personal ownership of all information that derives from their activities, no matter how anonymous, non-identifying, aggregated, or otherwise impersonal it may be. They are further proposing that the Government, as the codifier and protector of such rights, use this definition of “behavioral data” to assert Federal control over most Internet operations. The effect could be to cripple the architecture of the World Wide Web.
Although this effort to socialize the Web is taking place in plain sight, it involves no small degree of artifice. Rarely if ever, for example, are such phrases as “his data” or “their information” explicated – leaving readers to believe that sensitive personal records are being compromised. But on occasion, the activists slip in telling ways. Jeff Chester, the proprietor of the extremist Center for Digital Democracy and a frequent witness in regulatory hearings in Washington, has made clear his belief that no distinction exists between identifying data and impersonal data: If it can be used in any way for marketing purposes, it belongs to the individual, and Government should restrict its application. As he wrote on his blog last April, interactive publishers
… know that in today’s digital marketing era, the very tiny bits of personal behavior they have identified are parts of individual human identity. Our ‘virtual’ identities may be composed of discrete and disassembled bits of information about ourselves: what we like to read, watch, buy; our problems and concerns (such as health or our children’s education) or our political interests, but they are very much living aspects of ourselves. The goal of interactive marketing is to collect, analyze, and use such information to serve the interests of those paying for the targeting. The technique uses one, two or multiple individual data points in a variety of ways (search ads, broadband videos, virtual worlds) to get individual consumers to behave or act in ways that favor or reflect the marketer’s goals.
State legislatures – the stalking horses for the Washington lobbyists and legislators looking to constrain marketing and media – have followed the lead of Mr. Chester and his “regulate-the-Web” comrades. A New York State bill that was written to restrict what it termed “online preference marketing” actually promises explicitly to extend Government control over virtually all consumer research that has a Web component. The bill (sponsored, mystifyingly, by an Assemblyman from Westchester County, home to such consumer marketing and media giants as PepsiCo, the Reader’s Digest Association, IBM, and Starwood Hotels & Resorts) defines “online preference marketing” as “a process used by entities whereby data is typically collected over time and across Web pages to determine or predict consumer characteristics or preference for use in ad delivery, including the use of non-personally identifiable information.”
These definitions and metaphysical disquisitions help us understand how breathtakingly and unprecedentedly broad the supposedly protective proposals to restrict “behavioral targeting” actually are. They unambiguously define “behavior” as any and all consumption activity, no matter how distanced it is from one’s personal identity. Equally plainly, they say that any “data” or “information” that derives from such behavior would fall under their proposed regulatory scheme, even if it cannot compromise an individual’s identity, let alone cause him or her any harm.
Calling All Clients
Let’s be clear what’s at risk here: the Internet, and any communications activity that depends upon it. Why? Because all Internet activity throws off such non-identifying “behavioral” data all the time. Indeed, behavioral data is the center of the client/server call process that’s the essence of the Internet’s architecture, which delivers content based on information generated by user activity. As IAB Vice President for Industry Services Jeremy Fain, one of the interactive media industry’s top operations experts, puts it: “A client calling a server asking for content, and the server sending it back, is the fundamental underpinning of the Internet.”
Put this vital piece of the Web’s infrastructure under Government control, as the activists suggest, and the ad-supported innovation that has driven this communications revolution would be impaired. As Fain explains, “Within the client request there are many pieces of information, including a cookie, an IP address, and a user agent string. Cookies could be stripped out of that process, but the Web experience would change drastically. Cookie IDs are essential to user experience; as Wikipedia nicely observes, cookies give a state, a ‘memory of previous events,’ to otherwise stateless HTTP transactions.”
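To make Fain's point concrete, here is a minimal sketch of what "the client request" looks like when a server takes it apart. The host name, cookie value, and address below are invented for illustration, and the `describe_request` helper is hypothetical; real servers do this work inside their own request-handling code.

```python
from http.cookies import SimpleCookie

# Hypothetical raw request; the host, cookie value, and address are made up.
RAW_REQUEST = (
    "GET /news/article-1 HTTP/1.1\r\n"
    "Host: www.example.com\r\n"
    "User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X) Safari/525.20\r\n"
    "Cookie: visitor_id=a1b2c3\r\n"
    "\r\n"
)
# The IP address comes from the network connection itself, not from a header.
CLIENT_IP = "192.0.2.10"


def describe_request(raw, client_ip):
    """Pull out the three pieces of information named in the quote above."""
    head = raw.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    cookie = SimpleCookie()
    cookie.load(headers.get("cookie", ""))
    return {
        "path": request_line.split(" ")[1],
        "ip_address": client_ip,
        "user_agent": headers.get("user-agent", ""),
        "cookies": {key: morsel.value for key, morsel in cookie.items()},
    }


info = describe_request(RAW_REQUEST, CLIENT_IP)
```

Nothing in `info` is a name, a street address, or a Social Security number; it is a page path, a network address, a browser description, and an arbitrary ID.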
Without cookies, each new page view would be an isolated event. The Web’s relevancy engine would disappear. A news site wouldn’t be able to give you recommendations for articles you might want to read based on earlier things you’d read. Click analysis would be impossible, so retailers and brands would not be able to understand how their customers are using their sites. There could be no logged-in state beyond a single session, making it necessary for a user to log in to every site each and every time he or she visited. For retrieving email this wouldn’t be a big change, but for any news or entertainment sites that require a registration, any blog, any social media site, this would change the experience dramatically.
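The difference between a Web with cookies and one without can be sketched in a few lines. This is an illustration under stated assumptions, not how any particular site works: the `visitor_id` cookie name, the in-memory `history_by_visitor` store, and the `handle_request` function are all hypothetical.

```python
import uuid
from http.cookies import SimpleCookie

# In-memory store mapping an anonymous cookie ID to the pages it has requested.
history_by_visitor = {}


def handle_request(path, cookie_header=""):
    """Return (Set-Cookie value or None, pages seen so far) for one request."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    set_cookie = None
    if "visitor_id" in cookie:
        visitor_id = cookie["visitor_id"].value
    else:
        # First visit: mint a random ID that carries no name or address.
        visitor_id = uuid.uuid4().hex
        set_cookie = f"visitor_id={visitor_id}"
    pages = history_by_visitor.setdefault(visitor_id, [])
    pages.append(path)
    return set_cookie, list(pages)


# With a cookie, the second request is linked to the first.
first_cookie, _ = handle_request("/sports")
_, linked = handle_request("/politics", cookie_header=first_cookie)

# Without a cookie, the same request is an isolated event.
_, isolated = handle_request("/politics")
```

Here `linked` holds both pages, which is the raw material for an article recommendation or a logged-in state, while `isolated` holds only the one page the server has just been asked for.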
“IP addresses can’t be stripped out,” Fain continues. “They are fundamental to the delivery system – both client and server must know where to send the information. User agents – which are basically the identification of the browser type – should not be stripped out, either. Besides being fundamental to conducting any business online – they are the best way to distinguish human activity from machine-generated activity, and accurately count how many times content was delivered to real people – user agents are essential to delivering better experiences. At first glance, user agents may seem tangential to the consumer data and information discussion, but decisions are made by Web sites based on a consumer’s user agent strings all the time. A person using the Safari browser will regularly see something different from someone using Internet Explorer.” (An excellent example of the importance of user agents is the mobile Web: Page views will be optimized for the smaller screen based simply on the server’s ability to know that the consumer is using a mobile-device browser.)
“This is behavioral information,” Fain says, “but if companies cannot collect or store it, they cannot make business decisions on how to optimize their sites for their viewers.”
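The kind of decision Fain describes can be sketched roughly as follows. The marker lists and the `classify_user_agent` and `choose_layout` functions are assumptions made for this example; simple substring matching is a crude heuristic, and production systems rely on far more thorough detection.

```python
# Illustrative marker lists only; real detection is considerably more involved.
MOBILE_MARKERS = ("iphone", "android", "blackberry", "mobile")
BOT_MARKERS = ("bot", "crawler", "spider")


def classify_user_agent(user_agent):
    """Sort a user agent string into 'machine', 'mobile', or 'desktop'."""
    ua = user_agent.lower()
    if not ua or any(marker in ua for marker in BOT_MARKERS):
        return "machine"
    if any(marker in ua for marker in MOBILE_MARKERS):
        return "mobile"
    return "desktop"


def choose_layout(user_agent):
    """Pick what to serve, and whether the view counts as a real person."""
    layouts = {
        "mobile": "small-screen page",
        "desktop": "full page",
        "machine": "not counted as a human view",
    }
    return layouts[classify_user_agent(user_agent)]


phone = choose_layout(
    "Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en) AppleWebKit/420+ "
    "(KHTML, like Gecko) Version/3.0 Mobile/1A543a Safari/419.3"
)
desktop = choose_layout("Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 5.1)")
crawler = choose_layout("Googlebot/2.1 (+http://www.google.com/bot.html)")
```

A site that could not read or store the user agent string would have no basis for either decision: it could not fit the page to the screen, and it could not separate people from crawlers when counting its audience.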
In other words, the Internet runs on behavioral data. When a user launches her browser, behavioral data is generated that gets her to her designated home page. When a user clicks on a del.icio.us bookmark, behavioral data is generated that whisks him to the site. When users click on an article’s “go to next page” button, behavioral data is generated that positions them on the next page – in pretty much the same way a click on a “skip this ad” button will assure they don’t get the advertisement they don’t want to see. Under no normal circumstances is this behavioral data connected to an individual’s name, address, Social Security number, or other information we would conventionally associate with personal identity.
Sadly, this doesn’t seem to matter to the activists, because under the rules they are pushing in Congress, this impersonal string of otherwise meaningless symbols would still be classified as “their data,” and subject to Government regulation. And with that change, control of media and commerce would pass from the private sector to the Feds.
Think I’m being overly inflammatory? In addition to the obvious damage to interactive content customization and relevance, consider what else would be placed at risk if this de facto, all-encompassing definition of “behavioral data” were to become de jure:
- Bar-code scanners used at checkout counters. With ownership of impersonal consumption data legally enshrined as consumer property, this crucial component in retail supply-chain management could become unusable – at least if the data they collect is transmitted over the Web. Internet-based supply-chain management systems employing RFID tags could similarly be compromised.
- Lists of “most e-mailed stories” in newspapers and magazines. These popular features – and vital editorial management tools – could become illegal under the proposals floating around Washington and the states, for they depend on aggregated behavioral data.
- Search-engine competition. Kiss goodbye to any effort by Google’s competitors to challenge it. Whether small fry like Cuil or giants like Microsoft, their ability to use data to optimize their own processes or experiment with new algorithms would be gone. So would the search-engine optimization and search-engine marketing industries.
- Social science research. Academics interested in observing, say, the effect of health communications on Americans’ behavior would be restricted from utilizing the anonymous data generated by the billions of interactions daily between Web users and content. Even the American Psychological Association, which recommends “informed consent” as a standard in most research, recognizes that some forms of research, including anonymous questionnaires and “naturalistic observations” in cyberspace, don’t necessarily require it. Some legislators and activists, though, want their judgment to supersede the scientists’.
- Journalism and commentary. If people own “their data,” publishing observations of their activities online – how many times a video was watched, how many members of a social network enjoy Cream of Wheat for breakfast, what people are saying about that new Cameron Diaz movie – would fall into a legally murky area. Remember, California has for 20 years accorded people “personality rights” that prevent the unsanctioned use of anyone’s “name, voice, signature, photograph or likeness on or in products, merchandise or goods.” Extending this right to “their data” is basically what the anti-Internet proposals envision. In a profound, First Amendment-grounded critique of the FTC’s proposals against behavioral marketing, the Newspaper Association of America wrote, “The fully protected rights of news publishers are at stake. A limitation on behavioral targeting would directly affect the selection of content that is presented to readers.”
- Branded-media and small-publisher growth. Many major media companies are pinning their futures on the opportunity to gain more reach by constructing large networks of affiliated sites, whose content and demographic affinities would be abetted by network-based ad delivery. Ban the use of anonymous behavioral data, and these enterprises come tumbling down. So does network-based advertising support for small publishers, which underpins the economics of tens of thousands of sites.
It’s ironic that I’m writing this only a week after the U.S. Government broke what The New York Times described as a global, criminal “cyber-ring” that “plundered the credit card numbers of millions of Americans.” Such threats – to family information, financial records, health data – are real. In fact, exposure of such crimes pretty much requires the retrieval and storage of user agent strings and other behavioral data by e-commerce providers and other sites. But instead of zeroing in on real crimes and real harm, aggressive legislators, regulators, and their champions seem hell-bent on grouping under the same regulatory regime sensitive identifying data and the kind of impersonal behavioral data necessary to run the Web.
I – and my colleagues at the IAB – have been sounding these alarms for more than a year. I’ve testified before the Federal Trade Commission and the House Small Business Committee. Yet the call for regulation grows bewilderingly louder, from elected officials who have specified no harm and conducted little research. Even the militant Center for Democracy and Technology, which has declared that “concerns about behavioral advertising practices are widespread,” recorded zero consumer complaints filed with states’ attorneys general in 2006-2007 over privacy violations involving behavioral targeting. Zero! Indeed, in its just-released 37-page report “Online Consumers at Risk and the Role of State Attorneys General,” which documents thousands of cases of Internet-related sales fraud, spyware, phishing, data security breaches, and child solicitation, the word “privacy” comes up only once – in a Texas case filed against two Web sites that allegedly failed to protect “the privacy and safety of minors.”
Such violations are already covered by existing law (in this case, the Federal Children’s Online Privacy Protection Act, or COPPA), but the CDT, asserting that “behavioral advertising poses a growing risk to consumer privacy,” wants “a new general privacy law backed up by regulatory enforcement.”
It’s time for CMOs, media company CEOs, technology entrepreneurs, free-press advocates, independent Web publishers, retailers, e-tailers and others who depend on robust Internet communications and a thriving free media to stand up and let the world know where these recommendations are explicitly heading: Toward a Government takeover of the Internet, and a silencing of the diverse voices that make up the Web.