By Brian Dominick
Ingenious innovations are not always good ideas, at least in the long run, and corporations can typically be found at the forefront along the path to ironic counter-productivity. Early adoption of cellular technology saddled the US with an expensive, clumsy mobile phone system inferior to those employed throughout Europe and much of the Third World. High-speed Internet over TV cable or enhanced telephone wires gave way to monopoly dependence that left our broadband access far slower and many times as expensive as that available in most other countries.
Likewise, as consolidation of Internet technologies continues and intellectual property claims become more pervasive, the relatively level playing field once promised by the so-called “Information Superhighway” remains at once elusive and threatened.
One recent revelation has independent online news publishers cringing for fear of losing one of the only equalizing forces capable of bringing mainstream attention to otherwise marginalized voices.
It shouldn’t seem particularly ironic that when a revolutionary idea — say one that makes web searches fundamentally faster, more useful and more powerful — thrusts a couple of regular geeks clean through the atmosphere of success, the corporate behemoth their good fortune inevitably spawns would turn sinister before too long. But whether our response to such a development takes the form of surprise, disappointment or “I told you so,” the apparent trajectory of Google from mythic cottage project to Microsoft-esque corporate monster is portentous indeed — and it is very real.
It is not at all clear that Google — the company or its much-revered search engine — was ever something to admire on principle like so many truly independent marvels of modern software such as early Linux incarnations and the more recent Mozilla Firefox browser. Relatively noncommercial, open-source projects like these are not only technologically advanced by comparison to their profit-driven rivals; they represent a politically and economically desirable alternative to the corporate method.
But nothing compares to the staggering growth of Google — the technology and the company — which has achieved an unmatched popularity, evidenced by its dominance of the search engine market and the household-verb status of the company’s very name.
In 2003, Google launched a public model of its online news aggregator service, which collects headlines from and presents well-organized links to a reported 4,500 news and commentary sites. Champions of independent media couldn’t help but celebrate. The folks at Google had created a “spider” program that “crawls” and “scrapes” the content of all those news outlets, indexing every word of every item they publish and collecting all the results together in one big archive for all the world to search and browse, free of charge.
As far as anyone assumed until recently, when a user searched that archive using her or his web browser, the results Google found would come back prioritized by the relative relevance of each news item to the search words, by the date the items were published, or by a combination of those criteria.
Google has been fairly open when deciding what “news” sites to index. Lots of small, non-corporate sites — left and right alike — not only have a presence in the Google News index, but they often appear prominently in search results. Finding stories from AlterNet, Infoshop.org, ZNet and lots of other progressive and radical sites mixed in with the New York Times and CNN is not uncommon. Sometimes you will even find links to stories on Common Dreams and Rabble.ca ranked more prominently than their counterparts on Reuters and Fox News covering the same subject.
The company explains this unusual approach to gathering and presenting the news in essentially idealistic terms on its “About Google News” page:
Google News is a highly unusual news service in that our results are compiled solely by computer algorithms, without human intervention. As a result, news sources are selected without regard to political viewpoint or ideology, enabling you to see how different news organizations are reporting the same story. This variety of perspectives and approaches is unique among online news sites, and we consider it essential in helping you stay informed about the issues that matter most to you.
Unfortunately, any moment now, that level playing field may receive a drastic tilt in favor of the corporate giants.
In April, New Scientist magazine revealed that in 2003 Google filed for a patent on what it calls “systems and methods for improving the ranking of news articles.”
Google’s plans involve establishing a supposedly “qualitative” gauge of a news outlet’s “credibility” by measuring features such as the size of the organization’s staff and how long the publication has been in existence. The new “system” even incorporates “human evaluations” of the relative worth of each outlet.
The vague description of the method Google intends to patent explains that Google is developing a way to calculate the relative value — to you and me — of a news organization based on criteria apparently deemed worthy by techies and corporate executives. What they came up with is a far cry from anything journalists or amateur news hounds would likely have produced.
Included on the list of attributes Google values in a news organization are the “number of articles produced by the news source during a first time period,” which can be combined with the “amount of important coverage that the news source provides in a second time period.” This presumably refers to an automated means of evaluating how the outlet fares in the cable-news-driven game of determining which stories will pan out as having been “important,” almost certainly assessed by software based on the sheer volume of coverage each story receives. Of course, across most news media, “importance” of this kind is simply a measurement of monetary value, since most producers focus on stories that generate revenues.
Google’s journalism experts also say they may take into account the amount of traffic the site receives, how many countries its visitors come from, circulation statistics, the size of the organization’s staff and the number of bureaus it keeps in different locations. Many of these quantitative “quality” criteria are distinctly troubling. They are merely measurements of capital, which reflect the opinions potential investors hold of the organization’s profit value and have nothing whatsoever to do with the quality of a given article the organization puts on the Web.
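To see why criteria like these tilt the field, consider a rough sketch of how such measurements might be folded into a single source score. Nothing here comes from Google: the patent application publishes no weights or formula, so the equal weights, the `Outlet` fields, the `source_scores` function and both outlets with their figures are invented purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Outlet:
    """A hypothetical news source described only by size-related metrics."""
    name: str
    articles_per_week: int
    monthly_visitors: int
    visitor_countries: int
    staff_size: int
    bureaus: int


# Assumed equal weights; the patent application does not disclose any.
WEIGHTS = {
    "articles_per_week": 0.2,
    "monthly_visitors": 0.2,
    "visitor_countries": 0.2,
    "staff_size": 0.2,
    "bureaus": 0.2,
}


def source_scores(outlets):
    """Score each outlet as a weighted sum of its share of the largest value per metric."""
    scores = {}
    for outlet in outlets:
        total = 0.0
        for metric, weight in WEIGHTS.items():
            largest = max(getattr(o, metric) for o in outlets)
            share = getattr(outlet, metric) / largest if largest else 0.0
            total += weight * share
        scores[outlet.name] = total
    return scores


# Both outlets and all of their numbers are made up for this example.
outlets = [
    Outlet("Global Wire", 5000, 40_000_000, 150, 2000, 80),
    Outlet("Hometown Watch", 40, 60_000, 12, 6, 1),
]

scores = source_scores(outlets)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score:.3f}")
```

However the weights are set, every input in this sketch grows with the size of the organization, so the larger outlet comes out on top before a single article has been read.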
Still other considerations for Google include the “breadth of coverage” a site produces and something called a “breaking news score.”
More often than not, as a rule of thumb, a specialized or local news outlet will cover a given issue or subject better than CNN or The New York Times or the BBC or any other global operation that might score very well in “breadth of coverage.” If all you do is report on genetics, or Africa, or hometown politics, or video games, wouldn’t it stand to reason you should fare considerably better in search results on that topic than an organization that dabbles lightly in everything? That is not to say that a news outlet cannot be broadly focused and still be very good, but why not let an outfit shine where it excels?
Also, the idea of providing a higher rating to outlets that offer more breaking news is like rewarding your partner for climaxing first. Breaking news is inherently subject to the most errors and the worst journalism. So it might be good to know that an outlet typically has something early on, but that is not a reliable method for evaluating the quality of its news reporting. It’s bad enough that speed is considered more important than substance in corporate media today. Why regard it as a defining component of quality?
Consider this factor: “the age of the news source may be taken as a measure of confidence by the public.” As if the sheer longevity of China’s government-run Xinhua news service, or Voice of America, or for that matter the endowment-backed London Guardian were an indication that these outlets can be “trusted.”
The most surprising aspect of Google’s new method is that “human” assessors are meant to evaluate each source. Here we see the introduction of the ever-ambiguous, all-powerful human “evaluator.”
In another implementation, evaluators may be shown a selection of articles from individual news sources and asked to assign each source a score.
So much for the hands-off approach Google (as of press time) officially touts as “essential in helping you stay informed about the issues that matter most to you.”
In keeping with its tradition of trade secrecy, Google won’t talk about what it is up to, and there is no certainty that it has implemented or will implement any of the proposed methods to alter its ranking criteria. But concern that the once-level playing field is fast on its way to favoring the corporate industry leaders has spread far and wide on the Net, with small, independent publishers concerned that a major portal from the mainstream to the marginal is about to be squeezed shut.
A Good Idea Gone Awry
Really, it doesn’t take a conspiracy theorist to see that Google is self-consciously snuggling up with establishment media outlets and hobbling alternative and independent publishers. There are sensible criteria one could apply to improve search results for more objective factors of quality... were that one’s actual goal. A few of those appear in Google’s patent application. For instance, it makes sense to favor outlets that name their sources over those that simply assert “truths” and offer no means for vetting or verifying the accuracy of statements made, so Google’s purported consideration of such factors is welcome. And it makes sense to favor hard news over commentary, if that is what they mean by writing style, since it’s called Google News and not (at risk of sparking yet another trademark) Google Views.
But part of what Google refers to in its patent application as “writing style” is less valuable. Consider, for instance, “automated tests for measuring spelling correctness, grammar, and reading levels can be used to generate a metric value that reflects writing style.”
It is not at all obvious that grammar and spelling should matter, though at least an argument could be made that better proofreading goes hand-in-hand with better editing, and that editing improves quality. But “reading level”? So Google is now trying to drive away users who read at lower levels? As much as we may hate the Neanderthal approach to news taken by the likes of Rupert Murdoch and the tabloids, it is difficult to make an argument that snooty is better.
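For a sense of what an automated “reading level” test looks like, here is a minimal sketch using the well-known Flesch-Kincaid grade formula. The patent application does not say which test Google has in mind, so the choice of formula is an assumption, as are the crude syllable counter, the function names and the two sample passages.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic: one syllable per run of vowels, and at least one per word."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def reading_grade(text: str) -> float:
    """Estimate a U.S. school grade level with the Flesch-Kincaid formula."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Two invented passages reporting the same hypothetical event.
plain = "The mayor cut the bus budget. Riders are angry."
dense = ("Municipal authorities implemented substantial reductions in "
         "transportation appropriations, precipitating considerable "
         "dissatisfaction.")

print(f"plain: {reading_grade(plain):.1f}")
print(f"dense: {reading_grade(dense):.1f}")
```

A test of this kind responds only to sentence length and word length. The plainly written passage gets the lower grade, and nothing in the calculation can tell whether either passage is accurate.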
Can the Grassroots Prevail?
The obvious solution for self-motivated independent media activists would be to develop an alternative to Google — something that is technologically comparable but grounded in grassroots rather than profit motivations. But this is astronomically easier said than done, not least because of Google’s patented, secretive methods. Billions in capital, legions of server computers, a massive staff and infinite bandwidth don’t hurt either.
That’s what folks at OpenZuka strongly believe. The OpenZuka project came about very recently, in response to the threatened change to Google News, as a project to create an alternative news search engine. Rather than starting with a technical idea or a money-making scheme, OpenZuka’s founders began with a (“tentative”) set of values — diversity, empowerment, transparency and fairness — which they say should apply to the development team and the product alike.
If people with the technological savvy to make a project like that work also value independent media, there is a good chance that a powerful grassroots news scraper and search engine could become a reality. The need for a Google News alternative may not be obvious today, and there is no telling if Google would find any success in a corporate news search engine as opposed to the more open model, but the tendency to corrupt will be overwhelming for any giant corporation with near-monopoly control over an information portal.
Brian Dominick is co-founder and co-editor of The NewStandard, a progressive, independent news website that relies heavily on Google News for its traffic.