amcheck verification for GiST

classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

amcheck verification for GiST

Andrey Borodin-2
Hi, hackers!

Here's the patch with amcheck functionality for GiST.

It basically checks two invariants:
1. Every internal tuple need no adjustment by tuples of referenced page
2. Internal page reference or only leaf pages or only internal pages

We actually cannot check all balanced tree invariants due to concurrency reasons some concurrent splits will be visible as temporary balance violations.

Are there any other invariants that we can check?

I'd be happy to hear any thought about this.

Best regards, Andrey Borodin.

0001-GiST-verification-function-for-amcheck.patch (17K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: amcheck verification for GiST

Thomas Munro-3
On Sun, Sep 23, 2018 at 10:15 PM Andrey Borodin <[hidden email]> wrote:
> Here's the patch with amcheck functionality for GiST.

Hi Andrey,

Windows doesn't like it[1]:

contrib/amcheck/verify_gist.c(163): error C2121: '#' : invalid
character : possibly the result of a macro expansion
[C:\projects\postgresql\amcheck.vcxproj]

That's:

+    MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
+ "amcheck context",
+#if PG_VERSION_NUM >= 110000
+ ALLOCSET_DEFAULT_SIZES);
+#else
+ ALLOCSET_DEFAULT_MINSIZE,
+ ALLOCSET_DEFAULT_INITSIZE,
+ ALLOCSET_DEFAULT_MAXSIZE);
+#endif

Not sure what's gong on there... perhaps it doesn't like you to do
that in the middle of a function-like-macro invocation
(AllocSetContextCreate)?

[1] https://ci.appveyor.com/project/postgresql-cfbot/postgresql/build/1.0.14056

--
Thomas Munro
http://www.enterprisedb.com

Reply | Threaded
Open this post in threaded view
|

Re: amcheck verification for GiST

Andres Freund
On 2018-09-24 15:29:38 +1200, Thomas Munro wrote:

> On Sun, Sep 23, 2018 at 10:15 PM Andrey Borodin <[hidden email]> wrote:
> > Here's the patch with amcheck functionality for GiST.
>
> Hi Andrey,
>
> Windows doesn't like it[1]:
>
> contrib/amcheck/verify_gist.c(163): error C2121: '#' : invalid
> character : possibly the result of a macro expansion
> [C:\projects\postgresql\amcheck.vcxproj]
>
> That's:
>
> +    MemoryContext mctx = AllocSetContextCreate(CurrentMemoryContext,
> + "amcheck context",
> +#if PG_VERSION_NUM >= 110000
> + ALLOCSET_DEFAULT_SIZES);
> +#else
> + ALLOCSET_DEFAULT_MINSIZE,
> + ALLOCSET_DEFAULT_INITSIZE,
> + ALLOCSET_DEFAULT_MAXSIZE);
> +#endif
>
> Not sure what's gong on there... perhaps it doesn't like you to do
> that in the middle of a function-like-macro invocation
> (AllocSetContextCreate)?

But note that the version dependent code shouldn't be present in
/contrib anyway.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: amcheck verification for GiST

Andrey Borodin-2
In reply to this post by Thomas Munro-3
Hi!

> 24 сент. 2018 г., в 8:29, Thomas Munro <[hidden email]> написал(а):
>
> On Sun, Sep 23, 2018 at 10:15 PM Andrey Borodin <[hidden email]> wrote:
>> Here's the patch with amcheck functionality for GiST.
>
> Hi Andrey,
>
> Windows doesn't like it[1]:

Thanks, Thomas! Yes, I've missed that version-dependent macro. Surely, it's redundant.

Best regards, Andrey Borodin.

0001-GiST-verification-function-for-amcheck-v2.patch (17K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: amcheck verification for GiST

Peter Geoghegan-4
On Sun, Sep 23, 2018 at 10:12 PM Andrey Borodin <[hidden email]> wrote:
> (0001-GiST-verification-function-for-amcheck-v2.patch)

Thanks for working on this. Some feedback:

* You do this:

> +/* Check of an internal page. Hold locks on two pages at a time (parent+child). */

This isn't consistent with what you do within verify_nbtree.c, which
deliberately avoids ever holding more than a single buffer lock at a
time, on general principle. That isn't necessarily a reason why you
have to do the same, but it's not clear why you do things that way.
Why isn't it enough to have a ShareLock on the relation? Maybe this is
a sign that it would be a good idea to always operate on a palloc()'d
copy of the page, by introducing something equivalent to
palloc_btree_page(). (That would also be an opportunity to do very
basic checks on every page.)

* You need to sprinkle a few CHECK_FOR_INTERRUPTS() calls around.
Certainly, there should be one at the top of the main loop.

* Maybe gist_index_check() should be called gist_index_parent_check(),
since it's rather like the existing verification function
bt_index_parent_check().

* Alternatively, you could find a way to make your function only need
an AccessShareLock -- that would make gist_index_check() an
appropriate name. That would probably require careful thought about
VACUUM.

* Why is it okay to do this?:

> +       if (GistTupleIsInvalid(idxtuple))
> +           ereport(LOG,
> +                   (errmsg("index \"%s\" contains an inner tuple marked as invalid",
> +                           RelationGetRelationName(rel)),
> +                    errdetail("This is caused by an incomplete page split at crash recovery before upgrading to PostgreSQL 9.1."),
> +                    errhint("Please REINDEX it.")));

You should probably mention the gistdoinsert() precedent for this.

* Can we check GIST_PAGE_ID somewhere? I try to be as paranoid as
possible, adding almost any check that I can think of, provided it
hasn't got very high overhead. Note that gistcheckpage() doesn't do
this for you.

* Should we be concerned about the memory used by pushStackIfSplited()?

* How about a cross-check between IndexTupleSize() and
ItemIdGetLength(), like the B-Tree code? It's a bit unfortunate that
we have this redundancy, which wastes space, but we do, so we might as
well get some small benefit from it.

--
Peter Geoghegan