{"id":487,"date":"2011-11-14T22:01:08","date_gmt":"2011-11-14T21:01:08","guid":{"rendered":"http:\/\/sburke.eu\/blog\/?p=487"},"modified":"2022-07-25T23:27:40","modified_gmt":"2022-07-25T22:27:40","slug":"seo-tweaks-using-mod_rewrite-to-prevent-duplicate-content-being-crawled","status":"publish","type":"post","link":"http:\/\/sburke.eu\/blog\/2011\/11\/seo-tweaks-using-mod_rewrite-to-prevent-duplicate-content-being-crawled\/","title":{"rendered":"SEO tweaks using mod_rewrite to prevent duplicate content being crawled"},"content":{"rendered":"<p><a href=\"http:\/\/sburke.eu\/blog\/wp-content\/uploads\/2011\/11\/no-www.gif\"><img loading=\"lazy\" class=\"alignright size-full wp-image-490\" title=\"No WWW Logo\" src=\"http:\/\/sburke.eu\/blog\/wp-content\/uploads\/2011\/11\/no-www.gif\" alt=\"\" width=\"100\" height=\"100\" \/><\/a>Over the past few years, with the rise in popularity of <a href=\"http:\/\/no-www.org\" target=\"_blank\" rel=\"noopener\">no-www<\/a>, there have been some issues where content is crawled twice by search engines, and in some cases gets a negative score due to duplicated content. For example: 
Accessing a website at both http:\/\/www.site.com and http:\/\/site.com.<\/p>\n<p>More recently I&#8217;ve noticed that http:\/\/www.site.com\/index.html and http:\/\/www.site.com\/ (ending in just a trailing slash) are being indexed as two separate pages, and are also getting a negative score due to duplicated content.<\/p>\n<p><a href=\"http:\/\/sburke.eu\/blog\/wp-content\/uploads\/2011\/11\/apache.gif\"><img loading=\"lazy\" class=\"alignright size-full wp-image-493\" title=\"Apache Logo\" src=\"http:\/\/sburke.eu\/blog\/wp-content\/uploads\/2011\/11\/apache.gif\" alt=\"\" width=\"100\" height=\"28\" \/><\/a>I&#8217;m sure search engines are getting more intelligent and are not scoring websites negatively for these oversights. However, I decided to use mod_rewrite on the Apache web server to overcome these issues.<\/p>\n<pre># Contents of .htaccess (edit with: vi .htaccess)\r\nRewriteEngine On\r\n# Redirect the bare domain to the www host\r\nRewriteCond %{HTTP_HOST} ^domain\\.com$ [NC]\r\nRewriteRule (.*) http:\/\/www.domain.com\/$1 [R=301,L]\r\n# Redirect the site root to index.html\r\nRewriteRule ^$ http:\/\/www.domain.com\/index.html [R=301,L]<\/pre>\n<p>In your Apache vhost config, you may have to change &#8220;AllowOverride None&#8221; to &#8220;AllowOverride FileInfo&#8221; (finer-grained and safer than using &#8220;AllowOverride All&#8221;).<\/p>\n<p>So the above mod_rewrite rules redirect http:\/\/domain.com to the www host, where the root URL is in turn redirected, so the request ends up at http:\/\/www.domain.com\/index.html. They also redirect http:\/\/www.domain.com\/ to http:\/\/www.domain.com\/index.html. Both redirects are permanent (301).<br \/>\nThis was for a recent web design project for a simple static website. Your mileage may vary, so if you use the above, make sure to test thoroughly.<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Over the past few years, with the rise in popularity of using no-www\u00a0there have been some issues where content is getting crawled by search engines twice, and in some cases getting a negative score due to duplicated content. I.E. 
&hellip; <a href=\"http:\/\/sburke.eu\/blog\/2011\/11\/seo-tweaks-using-mod_rewrite-to-prevent-duplicate-content-being-crawled\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[3,12,15],"tags":[16,50,19],"_links":{"self":[{"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/posts\/487"}],"collection":[{"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/comments?post=487"}],"version-history":[{"count":8,"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/posts\/487\/revisions"}],"predecessor-version":[{"id":688,"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/posts\/487\/revisions\/688"}],"wp:attachment":[{"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/media?parent=487"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/categories?post=487"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/sburke.eu\/blog\/wp-json\/wp\/v2\/tags?post=487"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}