
Duplicate Content & Multiple Site Issues

November 15, 2011 By Jonathan Jeter

By Shari Thurow, Founder and SEO Director of Omni Marketing Interactive

What is duplicate content? Search engines don’t see layout, just copy. Exclude your internal site search via robots.txt and the meta robots tag so you aren’t submitting duplicate content to Google. Duplicate content is not considered spam, but it can hurt you, and the biggest issue is usability. IMPORTANT NOTE: Duplicate content is not penalized; it’s filtered out of the SERPs.
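For illustration, here is roughly what that site-search exclusion might look like in robots.txt, assuming the internal search results live under a /search/ path (the path is hypothetical – match it to your own URL pattern):

  # robots.txt - keep internal site-search results out of the crawl
  User-agent: *
  Disallow: /search/

The search-results template can also carry a meta robots noindex tag as a backup; that tag appears in the noindex section further down.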

How do search engines filter out duplicate content?

  • Crawl-time filters (duplicate URLs)
  • Index-time filters
  • Query-time filters

Index-Time Filters

  • Boilerplate stripping (the search engine removes boilerplate elements to determine the content fingerprint)
  • Linkage properties (# of inbound and outbound links)
  • Host name resolution (which domains resolve to which IP address?)
  • Shingle comparison (see Andrei Broder’s work on shingles, e.g. via Google Scholar)
    • Every Web document has a unique content signature or “fingerprint”
    • Content is broken down into sets of word patterns (“shingles”)
    • Word sets are created from groups of adjacent words – for example, “the quick brown fox jumps” yields the 3-word shingles “the quick brown”, “quick brown fox”, and “brown fox jumps”
Are you linking consistently to the same URLs throughout the site, with unique anchor text? Make sure you use (or don’t use) www consistently, and set your preferred domain correctly in Webmaster Tools.
Robots.txt file – Are you preventing the duplicate page from being spidered?
  • Pattern matching
    • * = matches any sequence of characters
    • $ = matches the end of a URL
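A quick sketch of those pattern-matching rules in action, using hypothetical URL patterns (a session-ID parameter and print-friendly pages):

  # robots.txt pattern matching (URL patterns are hypothetical)
  User-agent: *
  # * matches any sequence of characters: block any URL carrying a sessionid parameter
  Disallow: /*sessionid=
  # $ matches the end of a URL: block print-friendly copies whose URLs end in /print/
  Disallow: /*/print/$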
Faceted classification and duplicate content – determine the three primary ways that people find content on your site and exclude the rest from the search engines.
Canonical tag – tells the search engine what the best URL is for a given page.
301 redirects – a change-of-address card for the search engines.
The NOFOLLOW attribute is basically a hint or a cue – it doesn’t mean the search engines won’t follow the link. Don’t use it on links to pages that are important to your users. Use it on links to pages with forms that you don’t want the search engines to fill out.
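As an example, a NOFOLLOW hint on a link pointing into a form page (the URL is hypothetical):

  <!-- hint to the engines not to follow this link into the quote form -->
  <a href="/request-a-quote/" rel="nofollow">Request a quote</a>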
XML Sitemap
  • Be consistent and be proactive
  • Don’t exclude a page in robots.txt and then put it in your XML sitemap
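A minimal sitemap sketch – the domain and URL are hypothetical, and only canonical, non-excluded URLs belong in it:

  <?xml version="1.0" encoding="UTF-8"?>
  <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
      <loc>http://www.example.com/widgets/blue-widget/</loc>
    </url>
  </urlset>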

Stone Temple Consulting – How to Syndicate Content Safely

301 Redirects

A 301 redirect eliminates duplicate content altogether, but it doesn’t cause the page to cease to exist. It redirects link juice back to the canonical page (a 301 doesn’t move all of the link juice, but probably around 99%). Also change any links on your own site so they point to the canonical page.
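As a sketch, assuming an Apache server and hypothetical URLs, the redirect from a duplicate URL to the canonical page – plus a non-www-to-www canonicalization – might look like this in .htaccess:

  # permanent (301) redirect from the duplicate URL to the canonical page
  Redirect 301 /old-duplicate-page/ http://www.example.com/canonical-page/

  # canonicalize the non-www host to www so only one version gets indexed
  RewriteEngine On
  RewriteCond %{HTTP_HOST} ^example\.com$ [NC]
  RewriteRule ^(.*)$ http://www.example.com/$1 [R=301,L]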

Rel=canonical

The duplicate page remains and still gets crawled, and the tag is only a suggestion – the search engine doesn’t have to follow the canonical instruction. The reason is that search engines find webmasters make mistakes with the canonical tag a lot of the time. Google doesn’t care if you’re first; they want the best content for the user.
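For reference, the tag sits in the <head> of the duplicate page and points at the preferred URL (the URL here is hypothetical):

  <!-- in the <head> of the duplicate page -->
  <link rel="canonical" href="http://www.example.com/canonical-page/">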

Webmaster Tools

In Webmaster Tools, you can tell the search engines to ignore parameters.

noindex

If you syndicate your content, you should make sure the syndicated copy is noindexed. Search engines sometimes prefer the syndicated copy instead of the original creator of the content. Noindexing a page keeps the duplicate out of the index, but the page will still be crawled, and it can still pass link juice/PageRank through to other pages.
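The tag itself is a one-liner on the page you want kept out of the index; keeping “follow” lets the link juice continue to pass:

  <!-- keeps the page out of the index but lets its links pass PageRank -->
  <meta name="robots" content="noindex, follow">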

Pseudo Dupes – Duplicate Titles

Title tags are one of the most powerful on-page ranking factors and give the search engines a strong hint about what a page is about. Duplicate title tags may cause a page not to rank, much the same as an actual duplicate page.
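For example, two product pages that might otherwise share one boilerplate title should each get their own (the titles here are made up):

  <!-- page A --> <title>Blue Widgets | Acme Widget Co.</title>
  <!-- page B --> <title>Red Widgets | Acme Widget Co.</title>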

Syndicating Content

Syndicating exact copies of content is a bad idea. Just because it works today, doesn’t mean it will work in the future.

Syndicating content is a great way to get quality links back to your site. Don’t syndicate your content to the point that you no longer rank for it.

Good syndication creates new original content based on your existing subject-matter expertise. It’s a great way to get visibility.

Divide your writing efforts by creating new original articles: publish some of them on your own site and syndicate the rest to third parties, then watch the link juice flow. It also gives users a reason to come to your site, because the content there is different.

3 major syndication options

  • synopsize
  • rel-canonical (cross-domain – see the sketch below)
  • create new original content
Don’t spew duplicate content all over the web. It won’t help you at all.
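If you do syndicate an exact copy, the cross-domain rel-canonical option above can point credit back at the original; on the partner’s copy it might look like this (domain and path are hypothetical):

  <!-- placed on the syndication partner's copy, pointing back at the original -->
  <link rel="canonical" href="http://www.example.com/original-article/">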

Helpful tools to check for duplicate content: write a script or use Majestic SEO


