Multiple Homepage URLs google error

leonor

Active Member
License Active
Hello,

I don't know if anyone else has this problem too, but when I log in to my Google Analytics account and check my VL sites, I get an error/notice like this on ALL of them.

It appears in the right corner:




I have already had this in my gateway.html for some months, but it doesn't help:
Code:
<META NAME="ROBOTS" CONTENT="NOINDEX, FOLLOW">

greetz,
leonor
 

Basti

Administrator
Staff member
Yeah, normally that would solve it, as there is no use in indexing vote URLs, but it has been reported that this somehow does not work (while it should). I think that was you? Dunno.

I gave it some thought a while ago, and I think this is what happens. Maybe you want to try it before we include this as a last edit in 1.4 (since it already has other SEO fixes).

Problem
1) If the gateway is disabled, any voter or Google goes through the vote checks and a possible redirect.
When they use the Google-friendly vote URL (site.com on the member's site), no redirect happens, since they already link to the right URL.
But if they have index.php?a=in&u=username as the vote URL on their site, Google does not see a proper header once it gets redirected to the rankings page. I'll explain this in the solution.

Some members, even when Google-friendly links are enabled (vote URL as site.com), might still link from somewhere else using the non-friendly version, so that voting works not only from the member's site. A link on Facebook, for example, would only work with the non-friendly vote link.
So it's possible that issues are produced even with friendly vote links.

2) The gateway is enabled and you have set noindex, follow. So what happens if Google follows a vote link from a member?
It sees the gateway and is told not to index this URL, and it doesn't. But it can still follow the vote button, which is a URL like the non-friendly vote link.

OK, so it follows that vote button. What now?
We are back to issue 1, because it tries to vote (there are checks so that this does not happen) and gets redirected to the rankings without a proper header.


So what does this mean (points #1 and #2)?
Google sees these "non-friendly vote URLs" as "302 Moved Temporarily" due to the redirect that happens when it follows that link.
That means it will index the redirected content (the rankings) under the original URL (index.php?a=in&u=username). Normally these index entries should lose value after a while, but they won't disappear from Google. Sometimes this even results in those "soft 404" errors you see in Webmaster Tools.

Due to this redirect issue, it doesn't matter whether you have the meta tag in your HTML, because Google does not index your gateway. It indexes your rankings under a wrong URL, causing duplicate content with the real rankings URL.
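
To see which status a vote link currently answers with, a small check like the one below may help. It is only a sketch: the function name first_status_code and the example URL are made up for illustration and are not part of the script. The usage part relies on PHP's built-in get_headers() (context argument needs PHP 7.1+), with redirects switched off through a stream context so that the first response is the one being inspected.

```php
<?php
// Hypothetical helper (not part of the script): pull the status code out of
// the first status line in a header list as returned by get_headers().
function first_status_code(array $headers): ?int
{
    foreach ($headers as $line) {
        // Status lines look like "HTTP/1.1 302 Found"
        if (is_string($line) && preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $line, $m)) {
            return (int) $m[1];
        }
    }
    return null; // no status line found
}

// Usage sketch; the URL is a placeholder, swap in one of your own vote links.
// $context = stream_context_create(['http' => ['follow_location' => 0]]);
// $headers = get_headers('http://example.com/index.php?a=in&u=username', false, $context);
// if ($headers !== false) {
//     echo first_status_code($headers), "\n";
// }
```

If this prints 302 for a non-friendly vote link, that matches the situation described above.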

We should set a proper "301 Moved Permanently". That way, any old URLs should also drop out of the Google index, as they are permanently moved to the rankings.
In sources/in.php, find:
Code:
      header("Location: {$vote_url}");
Replace that with
Code:
      header("HTTP/1.1 301 Moved Permanently");
      header("Location: {$vote_url}");
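
As a side note, PHP's header() also accepts the response code as its third argument, so the status and the Location header can be sent in one call. The sketch below wraps that in a function. The name permanent_redirect and the $emit parameter are made up for illustration (the parameter only exists so the call can be tested without sending real headers); neither is part of sources/in.php.

```php
<?php
// Hypothetical wrapper, not part of the script.
// $emit defaults to PHP's own header(); pass another callable to test it.
function permanent_redirect(string $url, ?callable $emit = null): void
{
    $emit = $emit ?? 'header';
    // The third argument sets the HTTP response code, so this sends
    // a 301 status together with the Location header.
    $emit('Location: ' . $url, true, 301);
}

// Usage sketch, assuming $vote_url is set as in the snippet above:
// permanent_redirect($vote_url);
// exit;
```

Whether an exit is needed right after depends on what the surrounding code in in.php already does, so treat that line as an assumption.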
Also make sure you are not disallowing any vote URL in robots.txt, as that would prevent Google from seeing the PHP headers.
robots.txt tip
URLs disallowed via robots.txt become indexed by search engines when they appear as links in pages not disallowed via robots.txt . Google is then able to associate text from other sources with disallowed URLs to return URLs disallowed via robots.txt in search results pages. This is done without crawling pages disallowed with robots.txt. To prevent URLs from appearing in Google search results, URLs must be crawlable and not disallowed with robots.txt
While this is possibly a rare scenario, it makes disallowing vote URLs in robots.txt a bit useless.


Word of advice, open for discussion: with the gateway enabled, should the HTML meta tag be used at all?

Generally, why use it?
If you use non-friendly links, it's not guaranteed that Google follows the vote button on the gateway (see #2 + solution) and that we get a proper redirect.
So we still need to tell Google not to index this gateway URL.

We need to tell Google not to index this gateway, so why do I say no HTML tag?
How does it react if friendly vote links (site.com) are enabled?
If you click on a vote button, you end up on your site.com and see a gateway page.

Is it wise to tell Google NOT to index this gateway URL? You know, noindex removes any existing URL from Google search results. Do you really want that?
I don't think I have ever seen a duplicate content warning for exactly identical URLs.

My suggestion here would be to send a PHP noindex header instead of the HTML one, ONLY when Google-friendly vote links are disabled. Why?
Because we only want those ugly index.php?a=in&u=username gateway URLs as noindex (look at everything above :) ).

Also in sources/in.php, this
Code:
  static public function  gateway($username) {
    global $DB, $LNG, $CONF, $FORM, $TMPL;
Becomes this
Code:
  static public function  gateway($username) {
    global $DB, $LNG, $CONF, $FORM, $TMPL;

    if (empty($CONF['google_friendly_links'])) {
        header('X-Robots-Tag: noindex');
    }
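
To confirm that the header really goes out, the response headers of a gateway URL can be scanned for it. Again, this is only a sketch: has_noindex_header is a made-up name, and the header list is assumed to be in the plain (non-associative) format that PHP's get_headers() returns.

```php
<?php
// Hypothetical helper (not part of the script): true if any header line
// is an X-Robots-Tag header whose value contains "noindex".
function has_noindex_header(array $headers): bool
{
    foreach ($headers as $line) {
        if (!is_string($line) || strpos($line, ':') === false) {
            continue; // skip the status line and anything malformed
        }
        [$name, $value] = explode(':', $line, 2);
        // Header names are case-insensitive, so compare accordingly
        if (strcasecmp(trim($name), 'X-Robots-Tag') === 0
            && stripos($value, 'noindex') !== false) {
            return true;
        }
    }
    return false;
}

// Usage sketch; the URL is a placeholder for one of your gateway URLs.
// $headers = get_headers('http://example.com/index.php?a=in&u=username');
// var_dump($headers !== false && has_noindex_header($headers));
```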

Yes, SEO can be a pain! :)
 

leonor

I just added this to one of my lists, and I think it's working! =)




I will add this to the other lists now and see what happens over the next few days :)
 