patch to point search engines at current version of docs

patch to point search engines at current version of docs

adam sah
Currently, search engines often rank old versions of Postgres documentation pages higher than newer versions. This patch should fix this by having search engines always prefer the current version of the documentation. Users who want old versions can click on the links at the top of the page - today, users who want the current version have to do this.

original discussion: https://www.postgresql.org/message-id/CANNMO%2B%2BkxJmaaB7X6hq_8SqcEruySZrF%3DUkcPm-EG1JCKVascw%40mail.gmail.com


adam


Attachment: rel_canonical.diff (1K)

Re: patch to point search engines at current version of docs

Jonathan S. Katz-3
Hi Adam,

On 5/11/19 5:58 PM, adam sah wrote:

> Currently, search engines often rank old versions of Postgres
> documentation pages higher than newer versions. This patch should fix
> this by having search engines always prefer the current version of the
> documentation. Users who want old versions can click on the links at the
> top of the page - today, users who want the current version have to do this.
>
> original
> discussion: https://www.postgresql.org/message-id/CANNMO%2B%2BkxJmaaB7X6hq_8SqcEruySZrF%3DUkcPm-EG1JCKVascw%40mail.gmail.com
>
> more about rel=canonical: https://www.google.com/search?q=rel+canonical+url

Thanks for the suggestion! I remember looking at this with Magnus last
year, the last time this was brought up. I found this page to be
very helpful on the subject:

https://support.google.com/webmasters/answer/139066?hl=en

We also researched what some other open source projects were doing with
respect to their documentation pages to see how they could optimize it.

I believe the strategy you are proposing would involve setting every
anchor tag that is pointing to "/docs/current/.*" to contain the
rel="canonical" attribute. If we went down this path, the better way
would be to use the "<link rel="canonical" ...>" method that is
mentioned, i.e.

<link
  rel="canonical"
  href="https://www.postgresql.org/docs/current/the-doc-page.html"
/>

and put it in the <head> block.
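
As a rough illustration of the <link> approach, here is a minimal Python sketch that builds the tag for a versioned docs path. The function name canonical_link, the regular expression, and the assumed URL layout (/docs/<version>/<page>.html) are illustrative assumptions, not taken from the actual website code or from the attached patch.

```python
import re
from html import escape

# Assumption: docs pages live at /docs/<version>/<page>.html, where <version>
# is a release number (e.g. 9.6 or 11), "current", or "devel".
DOCS_PATH = re.compile(r"^/docs/(?P<version>[^/]+)/(?P<page>[^/]+\.html)$")


def canonical_link(path, base="https://www.postgresql.org"):
    """Build a <link rel="canonical"> tag that points at the /docs/current/
    copy of a versioned docs page. Returns None for non-docs paths."""
    match = DOCS_PATH.match(path)
    if match is None:
        return None
    # Swap whatever version the page was requested under for "current".
    href = f"{base}/docs/current/{match.group('page')}"
    return f'<link rel="canonical" href="{escape(href, quote=True)}" />'


# Hypothetical usage: the tag a 9.6 page would carry in its <head>.
print(canonical_link("/docs/9.6/sql-select.html"))
```

Note that this sketch does not check whether the page still exists in the current version.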

At the time, one drawback we found was that this could end up causing
pages that existed in older, but supported, versions to go missing from
search engines, which is not necessarily great. We may end up deciding
that this doesn't matter, and that the main point is to get people to
the documentation and then they can select the version, but OTOH, this
could end up breaking a lot of people's workflows for how they look for
info in the docs (myself included).

We might be able to incorporate rel="alternate" on the other
documentation version pages to let the crawlers know that alternate
versions exist, but from what I've seen it seems to be restricted to
i18n or media support, and does not seem to take other attributes such
as "versions."

Thanks,

Jonathan


Re: patch to point search engines at current version of docs

adam sah
Yeah, sigh, good point. I've confirmed that rel=canonical makes it impossible to find old versions in Google, even when you add (e.g.) the version number to your search keywords. As an example, Citus uses readthedocs with rel=canonical, and reaching old versions of its docs requires clicking through from the new version's page: http://docs.readthedocs.org/en/latest/canonical.html

Let's see if we can support both types of users. Some ideas:

1. nudge search engines to rank the current version higher - this would take a lot of subtle work.

2. NOT RECOMMENDED: let users pick a default version, then detect visits from search engines and set/read a cookie and then redirect to this version. Google doesn't like this and will likely punish the site for the perceived deception.

3. use a browser extension to do this trick - this should be safe from penalties. It will require users to use particular browsers (Firefox or Chrome) and perform a one-time installation.

4. (yuck) offer a keyboard shortcut to a (user selected?) old version ?

5. other ideas?
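
Ideas 2 through 4 share one core step: rewriting a docs URL so that it points at the version the user picked. Here is a minimal Python sketch of just that step. The function name rewrite_docs_version and the URL pattern are hypothetical, and a real browser extension would implement the same logic in JavaScript.

```python
import re

# Assumption: docs URLs look like https://www.postgresql.org/docs/<version>/<rest>.
DOCS_URL = re.compile(r"^(https?://www\.postgresql\.org/docs/)([^/]+)(/.*)$")


def rewrite_docs_version(url, preferred_version):
    """Return url with its docs version segment replaced by preferred_version.
    URLs that do not match the assumed docs layout are returned unchanged."""
    match = DOCS_URL.match(url)
    if match is None:
        return url
    prefix, _old_version, rest = match.groups()
    return f"{prefix}{preferred_version}{rest}"


# Hypothetical usage: a user who prefers 9.6 lands on a "current" page.
print(rewrite_docs_version(
    "https://www.postgresql.org/docs/current/sql-select.html", "9.6"))
```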

adam



Re: patch to point search engines at current version of docs

Peter Eisentraut-6
In reply to this post by adam sah
On 2019-05-11 23:58, adam sah wrote:

> Currently, search engines often rank old versions of Postgres
> documentation pages higher than newer versions. This patch should fix
> this by having search engines always prefer the current version of the
> documentation. Users who want old versions can click on the links at the
> top of the page - today, users who want the current version have to do this.
>
> original
> discussion: https://www.postgresql.org/message-id/CANNMO%2B%2BkxJmaaB7X6hq_8SqcEruySZrF%3DUkcPm-EG1JCKVascw%40mail.gmail.com
>
> more about rel=canonical: https://www.google.com/search?q=rel+canonical+url

Previous discussion:
https://www.postgresql.org/message-id/flat/38c68b83-30ae-c039-acd0-9e853997edc4%402ndquadrant.com

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services