Ethical and Political Issues in Search Engines

Introduction

In the final months of 2004, rumors began to circulate on the Internet that the infamous prison abuse photographs from Abu Ghraib were no longer available on a Google image search, although they continued to show up on other search engines.i The implication was that political considerations might have been influencing the search engine results, an implication that Google denies.ii When I emailed Google directly about this issue, Nate Tyler, a spokesman for Google, wrote: “Basically, Google did show these images but only for a limited period of time, as our index (collection of web images) cycles through every so often to update itself. New images replace the old. At no point did we filter these images.” This explanation seems implausible, given the large number of old photos that remain in the Google database and the high level of importance (and the number of back-links) of these particular photos.

     This was not the first instance of ethical issues being raised about search engines. In the early years of search engines, the line had not always been clearly drawn between “sponsored sites” (i.e., sites that pay the search company to place them at the top of the list) and regular, non-paying sites. This has in large measure been worked out, and search results typically label those sites that have paid to be listed. This strikes a nice balance between the demands of honesty and those of business. Search engines are understandably heavily dependent on advertising revenues, so it was important to provide a solution that permitted those revenues to continue; at the same time, it was important that users find themselves directed toward the most relevant sites.

     Subtle variations on this theme, however, are now pervasive. Search engine companies sell certain keywords to advertisers in such a way that, when a user enters that term in a search, particular advertisements are displayed on the results page. The advertiser then pays the search engine company a fixed amount per click. This has given rise to “click fraud,” generated by the lure of an estimated 3.8 billion dollars annually in advertising revenues.iii A competitor may repeatedly click on a rival’s ads, thereby driving up the advertising costs the rival must pay. The average price per click for popular keywords is $1.70, and in rare cases it can range as high as $50 per click. It is easy to see how an unscrupulous competitor could drive the advertising budget of another company into the ground.
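To make the arithmetic concrete, here is a minimal sketch of the economics involved. It uses the $1.70 average cost per click cited above; the monthly budget and the daily number of fraudulent clicks are hypothetical figures invented purely for illustration.

    # Illustrative sketch of click-fraud economics (Python).
    # The $1.70 average cost per click is the figure cited in the text above;
    # the budget and the fraudulent-click rate are hypothetical assumptions.

    AVG_COST_PER_CLICK = 1.70        # dollars, average for popular keywords
    MONTHLY_AD_BUDGET = 5000.00      # dollars, hypothetical small-business budget
    FRAUDULENT_CLICKS_PER_DAY = 500  # hypothetical rate a rival could generate

    daily_drain = AVG_COST_PER_CLICK * FRAUDULENT_CLICKS_PER_DAY
    days_until_exhausted = MONTHLY_AD_BUDGET / daily_drain

    print(f"Daily cost of fraudulent clicks: ${daily_drain:,.2f}")
    print(f"Monthly budget exhausted after {days_until_exhausted:.1f} days")
    # With these assumed numbers: $850.00 per day, and the budget is gone in under six days.

At the $50-per-click extreme mentioned above, the same hypothetical budget would not survive even a single day of such activity.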

     Other issues have proved more troublesome. In a typical Google search on the word “Jew,” several of the first ten sites that come up are virulently anti-Semitic, including “Jew Watch” and “The International Jew: The World’s Foremost Problem.” Comparable searches on “Christian” or “Muslim” or “Hindu” do not yield critical sites among the top-ranked entries. In a note from Google on “Offensive Search Results,”iv The Google Team points out that anti-Semitic sites do not typically appear in a search for “Jewish people,” “Jews,” or “Judaism,” only in a search for the singular word “Jew.”

     In an international counterpart to the United States’ emphasis on local standards for judging pornography, international search engines encounter the problem that such anti-Semitic websites are illegal in some countries. Responding to the legal requirements of their home countries, Google.de and Google.fr do not list those anti-Semitic sites. A search for “Juden” (the plural; the singular in German, “Jude,” returns many entries on Jude Law) on Google.de yields over two million entries, but the first page contains no such critical entries; nor does a search on “Juif” on Google.fr yield anti-Semitic sites.

     Google’s official policy on this issue is clearly stated in the note on offensive entries:

Our search results are generated completely objectively and are independent of the beliefs and preferences of those who work at Google. Some people concerned about this issue have created online petitions to encourage us to remove particular links or otherwise adjust search results. Because of our objective and automated ranking system, Google cannot be influenced by these petitions. The only sites we omit are those we are legally compelled to remove or those maliciously attempting to manipulate our results.v

Several of the first-page sites that appear in a search on the “Ku Klux Klan” are highly critical of the Klan; no note about offensive results appears for that search.

     These cases raise interesting and extremely important ethical issues about access to information on the Web and the role of search engines. Let me begin by commenting on the public function and responsibility of search engines.

The Public Function and Responsibility of Search Engines

Search engines occupy a privileged place in the world of information technology. They are like windows onto the Web and, like windows, they tend to go largely unnoticed because our gaze focuses on what is visible through them. With physical windows, it is easy to detect when they are cloudy or distorted; with search engines, it is much more difficult to tell when they are providing a distorted or incomplete picture. Several points should be noted here.

     First, the vast amount of information available on the Web would be almost useless without search engines. They play an absolutely crucial role in the access to information.vi In the world of the Web, esse est indicato in Google: to exist is to be indexed on Google. The challenge in information retrieval is not simply to find the right piece of information, but also to avoid listing all the pieces of extraneous information. (The success of Google was precisely in its ability to help users find exactly the information they were seeking and to avoid irrelevant sites.) Search engines are the gatekeepers of the web,vii helping people to reach their desired destinations. Without them, much of the web would simply be inaccessible to us.

     Second, access to information is crucial for responsible citizenship.viii Citizens in a democracy, and indeed members of the international community in general, cannot make informed decisions without access to accurate and complete information. Within a few years, the Web has become the favored means of information retrieval. When we want to find more information about a topic, whether it be torture or tsunamis, we turn first, and often only, to the Web. The Web has become the principal source of research information for most Americans who do casual research. Typically, users turn first to Google: Machill et al. estimated that 74% of users do so.ix

     Third, search engines have become central to education. Students today perform countless web searches in an average day. They search Google far more often than they go to the library, undoubtedly more often than they look in a book for information. Search engines play a role analogous to the card catalogue in traditional libraries and the indices, such as the Reader’s Guide to Periodical Literature, that were so important to students of the previous generation. Imagine a library without a card catalogue; that would be a close analogy to the Web without search engines, but with one important difference. Books would still be written without card catalogues, but without search engines, many persons and groups would probably not develop their websites.

     Fourth, search engines are owned by private corporations, businesses that are quite properly seeking to make a profit. These companies, especially Google since it has become the search engine of choice for so many millions, have a crucial public responsibility but are accountable to shareholders, not the general public. This sets up a tension between the public role of search engines and their corporate accountability.

     Let’s now examine three areas in which we encounter difficult and persistent ethical issues in search engine technology.

The Problem of the Algorithm

The key to the success of Google was an important conceptual shift in the understanding of searches. Initially, search engines used fairly elementary algorithms to determine page rank, such as the number of visits to a page or the number of other pages that link to a given page. What was common to these initial approaches was that they depended on objective criteria such as the number of page views. A given search engine could certainly get it wrong, but that did not diminish the fact that there was an objective fact of the matter to be gotten wrong. These initial searches were at least intended to rank the most popular sites, where “popularity” had a technical and objective meaning.

     The shift in what we could call second-generation search engines involved looking much more closely at what users wanted to find, which was not always the most popular site but the site that most closely met their needs. The remarkable success of Google depends in part on its ability to offer users what they are looking for, based on the search terms they enter. This is conceptually very different from a ranking of page popularity alone; what the user wants becomes an integral part of the formula, as does the set of search terms most commonly used to express what the user wants.
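A toy sketch can make the contrast vivid. The following illustration is emphatically not Google’s actual algorithm; it simply contrasts a first-generation ranking based on a single objective count with a second-generation ranking that also weighs how well each page matches the user’s query. All of the page data are invented.

    # Toy contrast between "first-generation" and "second-generation" ranking.
    # This is an invented illustration, not Google's PageRank or any real engine.

    pages = {
        "popular-portal.example": {"inlinks": 9000, "text": "news weather sports portal"},
        "niche-howto.example":    {"inlinks": 120,  "text": "repair a bicycle chain step by step"},
        "bike-shop.example":      {"inlinks": 800,  "text": "bicycle sales and chain accessories"},
    }

    def rank_by_popularity(pages):
        """First generation: order pages purely by an objective count (here, in-links)."""
        return sorted(pages, key=lambda p: pages[p]["inlinks"], reverse=True)

    def rank_by_relevance(pages, query):
        """Second generation, schematically: rank by query-term overlap, with popularity as tie-breaker."""
        terms = set(query.lower().split())
        def score(p):
            overlap = len(terms & set(pages[p]["text"].split()))
            return (overlap, pages[p]["inlinks"])
        return sorted(pages, key=score, reverse=True)

    print(rank_by_popularity(pages))                         # popular portal ranked first
    print(rank_by_relevance(pages, "repair bicycle chain"))  # niche how-to page ranked first

The point of the toy is only this: once the query terms enter the scoring function, what the user is asking for can outrank sheer popularity.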

     The situation described above is complicated by the fact that the algorithms that govern searches are well-kept secrets, and properly so. Not only do these algorithms give some companies a competitive edge, but potential spammers can manipulate search engine results much more easily if they know the details of the algorithms used to rank those results. Consequently, the search process is not transparent: we do not know why certain sites have been included or excluded, and we do not know why some sites are ranked above others.x

The Politics of Searching: Privacy and Liberty

In the aftermath of the September 11th attacks, the Federal Bureau of Investigation in the United States proposed to develop an email intercept system that could sniff out possible terrorist threats, getting right to the “meat” of the message and disregarding the inessential. Carnivore, as it came to be known,xi was designed to monitor email traffic, but it is easy to see how the same argument could justify monitoring internet searches. (Carnivore, like most FBI computer projects, was a technical failure and was abandoned, after an expenditure of $6 to $15 million, in favor of commercial software.xii) After all, if the government is entitled by the Patriot Act of 2001 to see what books we have been borrowing from the library,xiii wouldn’t the same logic mandate access to search requests?

     The potentially chilling effects of such a situation are clear. The technical difficulties are significant but surmountable. It is admittedly virtually impossible to determine who is doing searches from a public computer, but from office or home machines it is at least possible to obtain IP addresses, and sometimes more if, for example, someone has cookies enabled. Most recently, Google has offered a voluntary search history, “My Search History,” that allows users to store and retrieve their searches. It “lets you easily view and manage your search history from any computer.”xiv Google stresses the benefits for end users, building on the fact that most of us have at one time or another been unable to retrieve a reference we originally found in a Google search. However, there is obviously an economic motive behind this helpful attitude: Google can provide advertisers with far more sophisticated consumer profiles if it maintains a comprehensive database of search histories that can be sorted by individual user. To some extent, this is already possible with cookies and with individuals signed in to a Gmail account, but the new “My Search History” feature increases accuracy dramatically and tracks users across multiple machines.
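Why a per-user search history is commercially valuable can be sketched schematically: grouping queries by account yields a crude interest profile. The log entries and the profiling method below are invented for illustration and do not describe Google’s actual systems.

    # Schematic sketch: turning a per-user search history into a crude interest profile.
    # The log entries and the profiling method are invented; this does not describe any
    # actual system, only why such a database would interest advertisers.

    from collections import Counter, defaultdict

    search_log = [                          # hypothetical (user, query) records
        ("user_17", "best bicycle lights"),
        ("user_17", "bicycle commuting san diego"),
        ("user_17", "cheap flights to boston"),
        ("user_42", "mortgage refinancing rates"),
        ("user_42", "mortgage rates san diego"),
    ]

    profiles = defaultdict(Counter)
    for user, query in search_log:
        profiles[user].update(query.lower().split())

    for user, counts in profiles.items():
        top_terms = [word for word, _ in counts.most_common(3)]
        print(user, "->", top_terms)
    # user_17's profile centers on "bicycle"; user_42's on "mortgage" and "rates".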

     Economics is driving these technological developments in tracking search engine users, but the truly frightening aspect of this is political rather than economic. We all leave countless virtual footprints as we move through the day, using credit cards, making cell phone calls, withdrawing cash from ATMs, and so on. These already provide a surprisingly detailed picture of an individual’s daily life, at least in terms of external activities. Search histories, however, go one step further: they provide an excellent source of insight into what someone is thinking, not just what that person is doing.

The danger, at least in the United States, is that such tracking may increasingly be used to monitor and eventually suppress political dissent. The terrorist attacks of September 11th were ironically effective in strengthening public support for the erosion of personal liberty in the United States, and one can easily imagine government monitoring of search engine activity being justified as a counter-terrorism measure.xv

     If such a scenario seems too implausible, and if it seems unthinkable that major search engine companies would cooperate with such an undertaking, one only has to look at Internet filtering in China today to see what the future may hold.

Local Standards in a Global Village

Perhaps the most frightening display of the power of search engines has occurred recently in China, which has made massive and highly effective efforts to prevent average Chinese citizens from accessing certain sites on the Internet. The accepted wisdom has been that the Internet is an unstoppable force for democratization, a force for liberation that cannot be tamed by local governments.

     This assumption has been proved false in the case of Internet censorship in China. The Chinese government has succeeded in blocking the access of the average Chinese computer user to political sites dealing with the Dalai Lama and a free Tibet, the Falun Gong, Tiananmen Square, and, most recently, the Chinese demonstrations against Japan’s latest attempts at revisionist history.xvi The report of the ONI on “Internet Filtering in China 2004-2005” indicates that China has been far more successful in preventing its citizens from accessing certain websites than previously imagined. China’s approach has been multi-pronged. Much of the filtering occurs at the backbone level, which is highly effective, but it is supplemented by restrictions on internet service providers and extends even down to the level of cybercafés, which are required to track customer usage.xvii Email appears to be filtered at the service-provider level rather than at the backbone level, and increasingly sophisticated anti-spam filtering software can also be modified for use in political filtering. Blog providers are carefully monitored through keyword filtering, and politically incorrect bloggers are typically removed quickly from the servers. Within China, when one looks for Google, one often reaches alternative search engines instead, such as Openfind, Globepage, chinaren.com, search.online.sh.cn, and fm365.com.xviii These search engines are easily manipulated to carry out the kind of filtering that the Chinese government mandates.xix
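The mechanics of keyword filtering, at their crudest, can be sketched in a few lines. The following is a schematic illustration only; the blocked-term list and the decision rule are invented simplifications and do not represent any provider’s actual system.

    # Schematic illustration of crude keyword filtering of the general kind described
    # above. The blocked terms and the decision rule are invented simplifications.

    BLOCKED_TERMS = {"falun gong", "tiananmen", "dalai lama"}   # hypothetical examples

    def is_blocked(text: str) -> bool:
        """Return True if the text contains any term on the blocked list."""
        lowered = text.lower()
        return any(term in lowered for term in BLOCKED_TERMS)

    posts = [
        "Spring vegetable prices in Chengdu markets",
        "Remembering Tiananmen Square, June 1989",
    ]
    for post in posts:
        status = "removed" if is_blocked(post) else "published"
        print(f"{status}: {post}")
    # Output: the first post is published, the second is removed.

Real filtering regimes are of course far more elaborate, operating at several levels at once as described above, but the underlying match-and-suppress logic is this simple.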

     It is important to realize here the degree of cooperation that China has gotten from the West in its Internet filtering programs. Certainly much of the backbone of China’s Internet has been supplied by American manufacturers. According to the ONI Country Study on China, Cisco Systems has played a pivotal role in providing the infrastructure that enables the Chinese government to filter the Internet so effectively.xx Without the technical expertise and physical infrastructure provided by American companies, China’s Internet filtering endeavors would be far less successful.

     The role of Google in this situation, at least what we know of that role, does little to quell fears about the ways in which Google may be subject to political pressure. In 2004, the Chinese government began intermittently to shut down access from within China to the China Edition of Google News. Eventually, Google decided to shape its search results within China to the expectations of the Chinese government. A Google statement describes the situation in the following terms.

There has been controversy about our new Google News China edition, specifically regarding which news sources we include. For users inside the People’s Republic of China, we have chosen not to include sources that are inaccessible from within that country.xxi

     In other words, Google decided to respect Chinese political censorship rather than allow Google News China to be shut down once again.

Although China is a vast potential market, it currently has little economic influence over Google, and presumably no political power over it. Nevertheless, Google seems to have accommodated itself to the wishes of the Chinese government. If this is the case, one cannot help but worry that Google could eventually be much more strongly influenced by the United States government, which has far greater economic and political impact on Google than does the government of China.

Conclusion

Search engines play an increasingly pivotal role in the distribution and, ultimately, the construction of knowledge, yet they are largely unnoticed, their procedures are opaque, and they are almost completely devoid of independent oversight: powerful, cloaked in secrecy, and not subject to external control. Insofar as the flourishing of deliberative democracy depends on free and undistorted access to information, and insofar as search engines are increasingly the principal gatekeepers of knowledge, we find ourselves moving in a politically dangerous direction. We risk having our access to information controlled by ever more powerful, increasingly opaque, and almost completely unregulated search engines that could shape and distort our future largely without our knowledge. For the sake of a free society, we must pursue the development of structures of accountability for search engines. Based on the cases discussed above, there is little reason to think that search engines will otherwise remain impervious to external political and economic pressures.


Footnotes

i When I did a search on “Abu Ghraib” in December 2004 on Alta Vista (http://www.altavista.com/image/results?q=abu+ghraib&mik=photo&mik=graphic&mip=all&mis=all&miwxh=all), I came across a number of the infamous photos on the first page of results; the search listed a total of 2,579 results. However, when I did a comparable search on Google (with SafeSearch turned off) (http://images.google.com/images?q=abu+ghraib&hl=en&lr=&safe=off&start=0&sa=N), I got 137 results, but almost none of them were the prison abuse photos from Abu Ghraib that so electrified the world. The same search, repeated in February 2005, yielded far more images in Google, although some of the original infamous photos still seemed not to be present.

ii Email from Mr. Tyler to me on 1/4/05.

iii Michael Liedtke, “Click Fraud Looms As Search-Engine Threat,” Associated Press, Feb. 11, 2005; http://www.miami.com/mld/miamiherald/business/national/10876986.htm?1c. Also see Jessie C. Stricchiola, “Click Fraud - An Overview,” Alchemist Media, Inc., http://www.alchemistmedia.com/CPC_Click_Fraud.htm.

iv http://www.google.com/explanation.html. They write, in part, that “If you use Google to search for “Judaism,” “Jewish” or “Jewish people,” the results are informative and relevant. So why is a search for “Jew” different? One reason is that the word “Jew” is often used in an anti-Semitic context. Jewish organizations are more likely to use the word “Jewish” when talking about members of their faith.”

v Ibid.

vi In March 2005, Google was ranked fourth among the most accessed U.S. sites by Nielsen, with a unique audience that month of 60 million viewers, equal to an audience reach of 43% (http://www.netratings.com/news.jsp?section=dat_to&country=us). The other principal mode of access to the Web has been guides compiled by individuals. These flourished in the early stages of the Web; more recently, with the increasing accuracy of search engines, they have declined in importance.

vii On the gatekeeper metaphor, see Baye, M. R. and Morgan, J. (2001), “Information Gatekeepers on the Internet and the Competitiveness of Homogeneous Product Markets,” American Economic Review 91(3): 454-474.

viii On the political dangers associated with search engines, see Introna, Lucas D. and Helen Nissenbaum (2000) “Shaping the Web: Why the Politics of Search Engines Matters”, The Information Society, Vol. 16, No.3, 1-17; available at http://www.indiana.edu/~tisj/readers/full-text/16-3%20Introna.html. On government surveillance, see “The Nature and Scope of Governmental Electronic Surveillance Activity,” Center for Democracy and Technology (2004), at http://www.cdt.org/wiretap/wiretap_overview.html; for current standards, see “CURRENT LEGAL STANDARDS FOR ACCESS TO PAPERS, RECORDS, AND COMMUNICATIONS: What Information Can the Government Get About You, and How Can They Get It?” at http://www.cdt.org/wiretap/govaccess/govaccesschart.html

ix Machill, M., Neuberger, C., Schweiger, W. and Wirth, W., “Wegweiser im Netz: Qualität und Nutzung von Suchmaschinen,” in Wegweiser im Netz: Qualität und Nutzung von Suchmaschinen, Verlag Bertelsmann Stiftung, Bielefeld, p. 397.

x For a discussion of transparency, see Carsten Welp, “Ein Code of Conduct für Suchmaschinen,” in Wegweiser im Netz: Qualität und Nutzung von Suchmaschinen, Verlag Bertelsmann Stiftung, Bielefeld, pp. 499-502.

xi Later, it was called DCS-1000.

xii “FBI cuts Carnivore Internet probe,” CNN.com, January 18, 2005.

xiii “FBI monitoring library records in terror probe,” Associated Press, June 25, 2002 (http://www.freedomforum.org/templates/document.asp?documentID=16468; last accessed 5/3/05).

xiv https://www.google.com/searchhistory/login

xv For an insightful discussion of this issue in the European context, including a discussion of the differences between the American and European contexts, see Michael Nagenborg, “Privacy and Terror: Some Remarks from Historical Perspective,” IJIE International Journal of Information Ethics, Vol. 2 (11/2004), 1-5.

xvi Jonathan Krim, “Web Censors In China Find Success,” Washington Post, Thursday, April 14, 2005; Page A20. Also see Jonathan Zittrain and Benjamin Edelman, “Empirical Analysis of Internet Filtering in China,” Berkman Center for Internet & Society, Harvard Law School: http://cyber.law.harvard.edu/filtering/china/ ; last accessed 5/2/05; this includes a complete list of the 18,931 sites blocked by the Chinese government.

xvii OpenNet Initiative (ONI), “Internet Filtering in China 2004-2005: A Country Study,” April 14, 2005, http://opennetinitiative.net/studies/china/ONI_China_Country_Study.pdf. Also see Jonathan Zittrain and Benjamin Edelman, “Internet Filtering in China,” 2003, http://unpan1.un.org/intradoc/groups/public/documents/apcity/unpan011043.pdf.

xviii Berkman Center for Internet & Society, Harvard Law School, “Replacement of Google with Alternative Search Systems in China: Documentation and Screen Shots,” http://cyber.law.harvard.edu/filtering/china/google-replacements/.

xix OpenNet Initiative: Bulletin 005, “Probing Chinese search engine filtering,” August 19, 2004 http://www.opennetinitiative.net/bulletins/005/

xx “There has been considerable debate about the complicity of Western corporations in the development and maintenance of China’s filtering system. China’s Internet infrastructure includes equipment and software from U.S. companies, including Cisco Systems, Nortel Networks, Sun Microsystems, and 3COM. Cisco Systems in particular has been integral to China’s Internet development. The core of China’s Internet relies on Cisco technology; Cisco specifically implemented the backbone networks for ChinaNet and CERNet, China’s nation-wide educational network. Cisco’s involvement continues to this day with the company’s role in the development of China’s ‘Next-Generation Network,’ known as CN2.” “Internet Filtering in China 2004-2005,” pp. 6-7.

xxi http://www.google.com/googleblog/2004/09/china-google-news-and-source-inclusion.html Google concludes, “On balance we believe that having a service with links that work and omits a fractional number is better than having a service that is not available at all. It was a difficult tradeoff for us to make, but the one we felt ultimately serves the best interests of our users located in China. We appreciate your feedback on this issue.”

Also see the links at http://www.google-watch.org/china.html .

 

By Lawrence M. Hinman 

International Review of Information Ethics, June 2005
