Algorithms of Oppression


by Safiya Umoja Noble


  Google, according to its own disclaimer, will only remove pages that are considered unlawful, as is the case in France and Germany, where selling or distributing neo-Nazi materials is prohibited. Without such limits on derogatory, racist, sexist, or homophobic materials, Google allows its algorithm—which is, as we can see, laden with what Diaz calls “sociopolitics”—to stand without debate while protesting its inability to remove pages. As recently as June 27, 2012, Google settled a claim by the French antiracism organization the International League Against Racism and Anti-Semitism over Google’s use of ethnic identity—“Jew”—in association with popular searches.42 Under French law, racial identity markers cannot be stored in databases, and the auto-complete techniques used in the Google search box link names of people to the word “Jew” on the basis of past user searches. What this recent case points to is another effort to redefine distorted images of people in new media. These cases of distortion, however, continue to accumulate.

  Figure 1.12. Explanation of results by Google. Source: www.google.com/​explanation.html (originally available in 2005).

  The public’s as well as the Jewish community’s interest in accurate information about Jewish culture and the Holocaust should be enough motivation to provoke a national discussion about consumer harm, to which my research shows we can add other cultural and gender-based identities that are misrepresented in search engines. However, Google’s assertion that its search results, though problematic, were computer generated (and thus not the company’s fault) was apparently a good-enough answer for the Anti-Defamation League (ADL), which declared, “We are extremely pleased that Google has heard our concerns and those of its users about the offensive nature of some search results and the unusually high ranking of peddlers of bigotry and anti-Semitism.”43 The ADL does acknowledge on its website its gratitude to Sergey Brin, cofounder of Google and son of Russian Jewish immigrants, for his personal letter to the organization and his mea culpa for the “Jew” search-term debacle. The ADL generously stated in its press release about the incident that Google, as a resource to the public, should be forgiven because “until the technical modifications are implemented, Google has placed text on its site that gives users a clear explanation of how search results are obtained. Google searches are automatically determined using computer algorithms that take into account thousands of factors to calculate a page’s relevance.”44

  If there is a technical fix, then what are the constraints that Google is facing such that eight years later, the issue has yet to be resolved? A search for the word “Jew” in 2012 produces a beige box at the bottom of the results page from Google linking to its lengthy disclaimer about the results—which remain a mix of both anti-Semitic and informative sites (see figure 1.13). That Google places the responsibility for bad results back on the shoulders of information searchers is a problem, since most of the results that the public gets on broad or open-ended racial and gendered searches are out of their control and entirely within the control of Google Search.

  Figure 1.13. Google’s bottom-of-the-page beige box regarding offensive results, which previously took users to “An Explanation of Our Search Results.” Source: www.google.com/​explanation (no longer available).

  It is important to note that Google has conceded the fact that anti-Semitism as the primary information result about Jewish people is a problem, despite its disclaimer that tries to put the onus for bad results on the searcher. In Germany and France, for example, it is illegal to sell Nazi memorabilia, and Google has had to put in place filters that ensure online retailers of such are not visible in search results. In 2002, Benjamin Edelman and Jonathan Zittrain at Harvard University’s Berkman Center for Internet and Society concluded that Google was filtering its search results in accordance with local law and precluding neo-Nazi organizations and content from being displayed.45 While this indicates that Google can in fact remove objectionable hits, it is equally troubling, because the company provided search results without informing searchers that information was being deleted. That is to say that the results were presented as factual and complete without mention of omission. Yahoo!, another leading U.S. search engine, was forced into a protracted legal battle in France for allowing pro-Nazi memorabilia to be sold through its search engine, in violation of French law. What these cases point to is that search results are deeply contextual and easily manipulated, rather than objective, consistent, and transparent, and that they can be legitimated only in social, political, and historical context.

  The issue of unlawfulness over the harm caused by derogatory results is a question of considerable debate. For example, in the United States, where free speech protections are afforded to all kinds of speech, including hate speech and racist or sexist depictions of people and communities, there is a higher standard of proof required to show harm toward disenfranchised or oppressed people. We need legal protections now more than ever, as automated decision-making systems wield greater power in society.

  Gaming the System: Optimizing and Co-opting Results in Search Engines

  Google’s advertising and optimization product is AdWords. AdWords allows anyone to advertise on Google’s search pages and is highly customizable. With this tool, an advertiser can set the maximum amount it is willing to spend on advertising each day. Under the AdWords model, Google displays ads on search pages that it believes are relevant to a user’s query, and if a user clicks on an ad, the advertiser pays. Google incentivizes advertisers by suggesting that their ads will show up in searches and display, but the advertiser (or Google customer) pays for the ad only when a user (Google consumer) clicks on the advertisement; this is the cost per click (CPC). The advertiser selects a series of “keywords” that it believes closely align with the product or service it is advertising, and it can use a Keyword Estimator tool to see how much the keywords it chooses to associate with its site might cost. This advertising mechanism is an essential part of how PageRank prioritizes ads on a page, and the association of certain keywords with particular industries, products, and services derives from this process, which works in tandem with PageRank.
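  The cost-per-click billing logic described above can be sketched as a toy simulation. This is an illustration only, not Google’s actual billing system: the function name, dollar figures, and budget behavior are all invented for the example.

```python
# Hypothetical sketch of the cost-per-click (CPC) model: the advertiser
# is charged only when a user clicks an ad, and ads stop being served
# once the advertiser's self-set daily budget is exhausted.

def simulate_daily_spend(clicks, cost_per_click, daily_budget):
    """Charge the advertiser per click until the daily budget runs out."""
    spend = 0.0
    served_clicks = 0
    for _ in range(clicks):
        if spend + cost_per_click > daily_budget:
            break  # budget spent: the ad is no longer displayed today
        spend += cost_per_click
        served_clicks += 1
    return served_clicks, spend

# An advertiser bidding $0.50 per click with a $10.00 daily cap sees
# at most 20 billed clicks, even if 100 users would have clicked:
billed, spend = simulate_daily_spend(clicks=100, cost_per_click=0.50,
                                     daily_budget=10.00)
print(billed, spend)  # 20 10.0
```

  The point of the sketch is the asymmetry the text describes: impressions are free to the advertiser, and money changes hands only on the click.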

  In order to make sense of the specific results in keyword searches, it is important to know how Google’s PageRank works, what commercial processes are involved in PageRank, how search engine optimization (SEO) companies have been developed to influence the process of moving up results,46 and how Google bombing47 occurs on occasion. Google bombing is the practice of excessively hyperlinking to a website (repeatedly coding HTML to link a page to a term or phrase) to cause it to rise to the top of PageRank, but it is also seen as a type of “hit and run” activity that can deliberately co-opt terms and identities on the web for political, ideological, and satirical purposes. Judit Bar-Ilan, a professor of information science at Bar-Ilan University, has studied this practice to see whether forcing results to the top of PageRank has a lasting effect on a result’s persistence, as can happen in well-orchestrated campaigns. In essence, Google bombing is the process of co-opting content or a term and redirecting it to unrelated content. Internet lore attributes the creation of the term “Google bombing” to Adam Mathes, who associated the term “talentless hack” with a friend’s website in 2001. Practices such as Google bombing (also known as Google washing) impact SEO companies and Google alike. While Google is invested in maintaining the quality of search results in PageRank and policing companies that attempt to “game the system,” as Brin and Page foreshadowed, SEO companies do not want to lose ground in pushing their clients or their brands up in PageRank.48 SEO is the process of “using a range of techniques, including augmenting HTML code, web page copy editing, site navigation, linking campaigns and more, in order to improve how well a site or page gets listed in search engines for particular search topics,”49 in contrast to “paid search,” in which the company pays Google for its ads to be displayed when specific terms are searched.
A media spectacle of this nature is the case of Senator Rick Santorum, Republican of Pennsylvania, whose website and name were associated with insults in order to drive objectionable content to the top of PageRank.50 Others who have experienced this kind of co-optation of identity or less-than-desirable association of their name with an insult include former president George W. Bush and the pop singer Justin Bieber.
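  The mechanics of a Google bomb, in which many pages hyperlink to one target to push it up the rankings, can be illustrated with a toy power-iteration sketch of the textbook PageRank formulation. This is not Google’s production ranking system: the link graph, page names, and damping factor are all hypothetical.

```python
# Toy PageRank (textbook formulation, not Google's actual system),
# used to show how a coordinated linking campaign inflates one
# page's score. All page names and link structure are invented.

def pagerank(links, damping=0.85, iterations=50):
    """Compute toy PageRank scores for a dict {page: [pages it links to]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page, outgoing in links.items():
            if not outgoing:  # dangling page: spread its rank evenly
                for p in pages:
                    new_rank[p] += damping * rank[page] / n
            else:
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
        rank = new_rank
    return rank

# A small web before the "bomb": the target has one inbound link.
before = {
    "portal": ["news", "target"],
    "news": ["portal"],
    "target": [],
}

# After a campaign of "excessive hyperlinking": twenty new pages,
# each existing only to link to the target.
after = dict(before)
for i in range(20):
    after[f"bomb{i}"] = ["target"]

print(pagerank(before)["target"])
print(pagerank(after)["target"])  # noticeably higher
```

  The sketch shows why the tactic works at all: rank flows along hyperlinks, so manufacturing inbound links manufactures rank, which is exactly the behavior SEO firms exploit and Google polices.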

  Figure 1.14. Example of a Google bomb on George W. Bush and the search terms “miserable failure,” 2005.

  All of these practices of search engine optimization and Google bombing can take place independently of and in concert with the process of crawling and indexing the web. In fact, being found gives meaning to a website and creates the conditions in which a ranking can happen. Search engine optimization is a major factor in findability on the web. What is important to note is that search engine optimization is a multibillion-dollar industry that impacts the value of specific keywords; that is, marketers are invested in using particular keywords, and keyword combinations, to optimize their rankings.

  Despite widespread belief in the Internet as a democratic space where people have the power to participate dynamically as equals, the Internet is in fact organized to the benefit of powerful elites,51 including corporations that can afford to purchase and redirect searches to their own sites. What is most popular on the Internet is not wholly a matter of what users click on and how websites are hyperlinked—there are a variety of processes at play. Max Holloway of Search Engine Watch notes, “Similarly, with Google, when you click on a result—or, for that matter, don’t click on a result—that behavior impacts future results. One consequence of this complexity is difficulty in explaining system behavior. We primarily rely on performance metrics to quantify the success or failure of retrieval results, or to tell us which variations of a system work better than others. Such metrics allow the system to be continuously improved upon.”52 The goal of combining search terms, then, in the context of the landscape of the search engine optimization logic, is only the beginning.

  Much research has now been done to dispel the notion that users of the Internet have the ability to “vote” with their clicks and express interest in individual content and information, resulting in democratic practices online.53 Research shows the ways that political news and information in the blogosphere are mediated and directed such that major news outlets surface to the top of the information pile over less well-known websites and alternative news sites in the blogosphere, to the benefit of elites.54 In the case of political information seeking, research has shown how Google directs web traffic to mainstream corporate news conglomerates, which increases their ability to shape the political discourse. Google too is a mediating platform that, at least at one moment in time, in September 2011, allowed the porn industry to take precedence in the representations of Black women and girls over other possibilities among at least eleven and a half billion documents that could have been indexed.55 That moment in 2011 is, however, emblematic of Google’s ongoing dynamic. It has since produced many more problematic results.

  As the Federal Communications Commission declares broadband “the new common medium,”56 the role of search engines is taking on even greater importance to “the widest possible dissemination of information from diverse and antagonistic sources . . . essential to the welfare of the public.”57 This political economy of search engines and traditional advertisers includes search engine optimization companies that operate in a secondary or gray market (often in opposition to Google). Ultimately, the results we get are about the financial interest that Google or SEOs have in helping their own clients optimize their rankings. In fact, Google is in the business of selling optimization. Extensive critiques of Google have been written on the political economy of search58 and the way that consolidations in the search engine industry market contribute to the erosion of public resources, in much the way that the media scholars Robert McChesney, former host of the nationally syndicated radio show Media Matters, and John Nichols, a writer for the Nation, critique the consolidation of the mass-media news markets. Others have spoken to the inherent democratizing effect of search engines, such that search is adding to the diversity of political organization and discourse because the public is able to access more information in the marketplace of ideas.59 Mounting evidence shows that automated decision-making systems are disproportionately harmful to the most vulnerable and the least powerful, who have little ability to intervene in them—from misrepresentation to prison sentencing to accessing credit and other life-impacting formulas.

  This landscape of search engines is important to consider in understanding the meaning of search for the public, and it serves as a basis for examining why information quality online is significant. We must trouble the notion of Google as a public resource, particularly as institutions become more reliant on Google when looking for high-quality, contextualized, and credible information. This shift from public institutions such as libraries and schools as brokers of information to the private sector, in projects such as Google Books, for example, is placing previously public assets in the hands of a multinational corporation for private exploitation. Information is a new commodity, and search engines can function as private information enclosures.60 We need to make more visible the commercial interests that overdetermine what we can find online.

  The Enclosure of the Public Domain through Search Engines

  At the same time that search engines have become the dominant portal for information seeking by U.S. Internet users, the rise of commercial mediation of information in those same search engines is further enclosing the public domain. Decreases in funding for public information institutions such as libraries and educational institutions and shifts of responsibility to individuals and the private sector have reframed the ways that the public conceives of what can and should be in the public domain. Yet Google Search is conceived of as a public resource, even though it is a multinational advertising company. These shifts of resources that were once considered public have been impacted by increased intellectual property rights, licensing, and publishing agreements for companies and private individuals in the domain of copyrights, patents, and other legal protections. The move of community-based assets and culture to private hands is arguably a crisis that has rolled back the common good, but there are still possible strategies that can be explored for maintaining what can remain in the public domain. Commercial control over the Internet, often considered a “commons,” has moved it further away from the public through a series of national and international regulations and intellectual and commercial borders that exist in the management of the network.61 Beyond the Internet and the control of the network, public information—whether delivered over the web or not—continues to be outsourced to the private sphere, eroding the public information commons that has been a basic tenet of U.S. democracy.

  The critical media scholar Herbert Schiller, whose work foreshadowed many of the current challenges in the information and communications landscape, provides a detailed examination of the impact of outsourcing and deregulation in the spheres of communication and public information. His words are still timely: “The practice of selling government (or any) information serves the corporate user well. Ordinarily individual users go to the end of the dissemination queue. Profoundly antidemocratic in its effect, privatizing and/or selling information, which at one time was considered public property, has become a standard practice in recent years.”62 What this critique shows is that the privatization and commercial nature of information has become so normalized that it not only becomes obscured from view but, as a result, is increasingly difficult to critique within the public domain. The Pew Internet and American Life Project corroborates that the public trusts multinational corporations that provide information over the Internet and that there is a low degree of distrust of the privatization of information.63 Part of this process of acquiescence to the increased corporatization of public life can be explained by the economic landscape, which is shaped by military-industrial projects such as the Internet that have emerged in the United States,64 increasing the challenge for scholars researching the impact of such shifts in resources and accountability. Molly Niesen at the University of Illinois has written extensively on the loss of public accountability by federal agencies such as the Federal Trade Commission (FTC), which is a major contribution to our understanding of where the public can focus attention on policy interventions.65 We should leverage her research to think about the FTC as the key agency to manage and intervene in how corporations control the information landscape.

  The Cultural Power of Algorithms

  The public is minimally aware of these shifts in the cultural power and import of algorithms. In a 2015 study by the Pew Research Center, “Americans’ Privacy Strategies Post-Snowden,” only 34% of respondents who were aware of the surveillance that happens automatically online through media platforms, such as search behavior, email use, and social media, reported that they were shifting their online behavior because of concerns about government surveillance and the potential implications or harm that could come to them.66 Few members of the American public know that online behavior matters more than ever. Indeed, Internet-based activities are dramatically affecting our notions of how democracy and freedom work, particularly in the realm of the free flow of information and communication. Our ability to engage with the information landscape subtly and pervasively impacts our understanding of the world and each other.

  Figure 1.15. Forbes’s online reporting (and critique) of the Epstein and Robertson study.
