Phil Sim

Web, media, PR and… footy

What Google can do to stop plagiarism and save journalism

When people lament the decay of journalism in the online age, you’ll almost certainly hear Google named as a primary contributor. Critics cite the echo-chamber effect of Google News and the need for journalists to spit out copy at a million miles an hour to keep up in the Google age as factors driving the craft of journalism into the ground.

Plagiarism will also often rear its ugly head. The problem is getting worse and worse. Yesterday, on our MediaConnect/ITJourno site, we wrote about how SMH.com.au journalist Asher Moses complained about a piece of his being pretty much pilfered by the Daily Mail. This piece about a local newspaper in the US made up almost entirely of plagiarised content ran prominently on the online magazine Slate, along with this post on the Publish2 blog. Now, plagiarism pre-dates Google by centuries, but it’s probably fair to suggest that the rise of Google-driven online media models has intensified the problem by an order of magnitude.

Google recently made a move to lower the ranking given to duplicated content, so if you use wire copy it’s less likely to show up prominently in Google News and on search engines. Surely that would act as a disincentive to plagiarise, you might say? No, it just forces unethical outlets to go to a little more effort to rework the plagiarised content so it’s not identical to the original. The Daily Mail story was a perfect example of a story that has been reworked, but the fact remains that it has taken slabs of content without any attribution.
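To illustrate why reworking defeats exact-duplicate detection but doesn’t really hide the copying, here is a minimal sketch in Python. The example sentences are invented and this is not a description of any actual Google system; it simply compares two texts by the overlap of their word “shingles” (short word n-grams), a standard rough measure of near-duplication.

    # Minimal sketch: exact matching misses reworded copy,
    # but shingle overlap still flags it.
    # Hypothetical example texts; not any real Google algorithm.

    import re

    def shingles(text, n=3):
        """Return the set of lowercase word n-grams ('shingles') in a text."""
        words = re.findall(r"[a-z']+", text.lower())
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def jaccard(a, b):
        """Overlap between two shingle sets: 1.0 = identical, 0.0 = nothing shared."""
        sa, sb = shingles(a), shingles(b)
        union = sa | sb
        return len(sa & sb) / len(union) if union else 0.0

    original = ("The minister confirmed the project would be delayed by two years "
                "and cost an extra forty million dollars, blaming contractor disputes.")
    reworked = ("Blaming contractor disputes, the minister confirmed the project "
                "would be delayed by two years and cost an extra forty million dollars.")

    print(original == reworked)                    # False: exact matching sees no duplicate
    print(round(jaccard(original, reworked), 2))   # high overlap still exposes the copy

The point is not that detection is impossible; it’s that a ranking tweak aimed at exact duplicates doesn’t create any real deterrent, because lightly reworked copy slips straight past it.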

Google has become such a powerful force that it heavily influences, some might say dictates, the directions and strategies of most online media companies. If Google changes its ranking algorithms, SEO gurus all over the world scramble to react.

So what would happen if Google used its power to remedy some of the issues its rise has created, especially with regard to plagiarism?

What if an outlet caught plagiarising content was immediately removed from the Google News index? And/or had its PageRank trashed?

What if Google set a benchmark for what it considered fair use and asked all those outlets who wished to be included in Google News to abide by those guidelines?

You’d solve the plagiarism problem almost overnight.

For many sites, being thrown out of Google News would instantly cripple them. No one would risk running plagiarised copy. In fact, editors might even get vigilant about checking that contributors’ pieces haven’t been overtly ‘inspired’ by the work of others. If getting busted for plagiarism potentially threatens your financial viability, what outlet wouldn’t attempt to put checks and balances in place? Who knows, sub-editors might even become employable again!

Google should demand that any re-used copy or quotes come with attribution to the original source, and that re-users adhere to any re-use guidelines published clearly and obviously alongside an article.

Some will say Google is but a search engine and it’s not its charter to decide what is and what isn’t plagiarism. But Google News already has rules laid around it. To be indexed you need to be a multi-authored content site, with easily obtainable contact information and so forth. It’s a human-managed process that would only need to be expanded to take copyright guidelines into account.

Google also has to make policies and decisions relating to black-hat linking and SEO. If sites are caught buying links they can have their PageRank reset, so again, it’s not a big leap to do the same regarding what is essentially black-hat publishing.

And even if Google didn’t want to give itself that responsibility, it could instead work with an industry body established by large media companies and publishers that could define guidelines and decide when those rules have been broken. I would suggest those guidelines should put the emphasis on publishers to state their accepted re-use policy clearly, and those who choose to re-use copy would simply need to comply with it. If a site is found not to have done so, the body would inform Google of the breach, and Google and any other news aggregators who opted in to the system would then remove the infringing site from their indexes and/or downgrade its overall site ranking.

Legal solutions to this problem won’t work: the economics don’t add up. But that shouldn’t matter. We operate in a link economy and Google is already the overlord of the link, so this really would be just a natural extension of its responsibility to the Internet community and its media partners. And I guarantee it would put an end to commercial plagiarism almost immediately.

If you think this idea could float, I encourage you to spread the word. There are a lot of movements, like Data Portability, that the web community has managed to champion, but the media community seldom seems able to come together to do anything. Yet plagiarism is one massive problem that threatens everyone’s livelihood and that, as described above, I believe is relatively easily solved. Even if the media industry doesn’t go as far as forming an official industry group, a group like dataportability.org could at least be formed to start engaging with companies like Google to work towards a solution.


Filed under: Plagiarism

8 Responses

  1. vealmince says:

    What planet do you live on, Phil? An automated system couldn’t tell the difference between plagiarism and licensed or news-feed content. It would require a human complaints system…
    “Dear Google, the Daily Mail stole my article. -Asher.”
    “Dear Daily Mail, Asher says you stole his article. That’s very naughty and we’re going to reduce your PageRank. -Google.”
    “Dear Google, No we didn’t. And besides, it was a freelance writer who we won’t use again. And we changed some of the words. And if you mention this again or change our PageRank, we’ll sue your arses. -Daily Mail.”
    It wouldn’t work without some kind of plagiarism tribunal. And since plagiarism is only poor form rather than illegal, and because many publishers benefit from plagiarism (free content!), nobody’s going to pay for that.

  2. Phil Sim says:

    Josh, yes, it would have to be a human complaints system. As I said, Google News already has a human filter on sites that are submitted. And again, as I said, it would probably be more realistic for an independent body to be set up, i.e. the plagiarism tribunal you describe, that Google outsources the process to and which is funded by media companies/wire services.

    I’m sure AP, AFP, etc. would be quite happy to put some resources behind a body like that. It would only take Google agreeing to work with it.

    If Google wants to take a site out of its Google News index (let’s leave PageRank alone), it is totally entitled to.

  3. vealmince says:

    But why would Google voluntarily reduce its traffic and piss off some of its best customers to defend an abstract principle that nobody cares much about except a) journalists and b) the kind of people who religiously watch Media Watch and read the Oz Media section (mostly journalists)?
    And how do you legally define and prove plagiarism anyway? Did the Daily Mail cover itself by putting a “told Fairfax Media” in the story? Morally no, but legally?

  4. Caitlin says:

    I wouldn’t fancy getting my PageRank trashed because of legitimate republication, such as wire copy that I’ve paid for, or republication on my own site of material that I’ve retained copyright to.

  5. Caitlin says:

    @vealmince plagiarism is actually illegal, not just poor form. If the content or structure of the article is substantially copied and you can prove that, then it’s a breach of copyright.

    If SMH could be bothered to sue the Daily Mail for breach of copyright and then present that finding to Google as evidence for why the Mail’s page ranking should be trashed, THEN maybe it would work.

  6. On August 13th, I was notified that a great deal of my copyrighted material had been plagiarized. As an author of 3 books, too many articles to name and unique site content, I naturally want my words protected. And I want justice and compensation for my loss. Two of my articles were published by the most recognized media companies in America. The publishers perhaps didn’t know, but all publishers should Google-check articles submitted to them for publication to make sure they are not plagiarized. It is easy enough to do. I am appalled by what has happened. Who cares?
    Everyone should care. As a matter of fact, infringing on someone else’s copyright (plagiarism) is a criminal offense. And there are stiff fines. The FBI can investigate.

  7. Phil Sim says:

    Ariadne, I really do feel for you. However, it really does appear that nobody does care, which I still find simply astounding.

    I still say the difficulty of conducting legal action means nobody is scared off by that route, so there absolutely need to be consequences, enforced by our industry, that punish those who don’t take the time to check for plagiarism and police it.

  8. Blackhat SEO says:

    Two of my articles were published by the most recognized media companies in America. The publishers perhaps didn’t know, but all publishers should google check articles submitted to them for publication to make sure they are not plagiarized.
