TweetMeme and Site Scrapers – Content Theft


TweetMeme finds the hottest stories from twitter for you to retweet

At least this is what Tweetmeme would want you to believe.

From what I can gather, Tweetmeme is a service that lets you retweet stories posted on Twitter .. your retweet basically points people to the story that someone else posted there.

Fair enough .. if you use Twitter, be prepared to have others copy your headlines.

Scrapers have discovered a wonderful way to abuse the Tweetmeme service: instead of pointing to the headline in the original tweet or site, the headline URL is redirected to a website other than the original author's.

(Visit the linked text above, and click on any one of the links he has and you’ll see what I mean)

The original content is lifted from the source blog or other publication and re-posted on the scraper domain, and the retweet URL is written to point to the scraper site.

I won’t name names in this case, but today’s incident involved a few boneheads in India that somehow thought that it would be really cool to steal my content and re-post it on their own domain, with a retweet pointing to it.

I’ll thank Google for pointing all of this out to me in the first place.

A while after posting, I searched the keyword(s) used in the article and found it in the top slot in the SERPs .. 5 minutes after that, the scraped article appeared two positions down from mine.

I followed up on what I considered to be a bogus listing by first notifying Google. I supplied the scraper source code, including all of the Adsense and analytics information contained within.

I then began to prepare a DMCA filing because the scraper was hosted on American servers, and made initial contact with the owners in India. I also cc'ed their web host on any emails I sent so that they would be made aware of the incident and what exactly I planned to do about it.

Long story short, the content was removed as quickly as these wanna-be scrapers could remove it.

I knew what their web host would do under the terms of the DMCA and so did they. Google, on the other hand, might be a different story. The scraper domain is destined to be banned anyway because of the abuses, and whether or not my report to Google has sped the process up remains to be seen. Either way, this scraper site is history.

Tweetmeme has a long way to go in curbing all of the abuse it is getting at the hands of site scrapers. For now the ranges are blocked, and they probably will stay that way for the duration.
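For anyone wondering what blocking ranges can look like in practice, here is a rough sketch. It assumes "ranges" means IP address ranges, which is only one way to read it, and it is not the setup used on this site. The `BLOCKED_RANGES` list and the `is_blocked` function are made-up names for illustration, and the addresses are reserved documentation ranges, not anyone's real ones.

```python
import ipaddress

# Hypothetical blocklist. These are reserved documentation ranges (RFC 5737),
# NOT the ranges actually blocked on this site.
BLOCKED_RANGES = [
    ipaddress.ip_network("192.0.2.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(remote_addr: str) -> bool:
    """Return True if remote_addr falls inside any blocked range."""
    try:
        ip = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address, so it cannot match a range.
        return False
    return any(ip in net for net in BLOCKED_RANGES)


if __name__ == "__main__":
    for addr in ("192.0.2.15", "203.0.113.9"):
        print(addr, "blocked" if is_blocked(addr) else "allowed")
```

Most people would do this at the web server or firewall level rather than in application code, but the matching logic is the same idea: check the visitor's address against a list of ranges and refuse the request if it falls inside one.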

