found xmin from before relfrozenxid on pg_catalog.pg_authid

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
45 messages Options
123
Reply | Threaded
Open this post in threaded view
|

found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel
Getting some concerning errors in one of our databases that is on 9.5.11, on autovacuum from template0 database pg_authid and pg_auth_members.  I only saw some notes on the list about this error related to materialized views.  FWIW, we did use pg_upgrade to upgrade this database from 9.4 to 9.5.  Here is an example:

2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19 12:08:33 CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before relfrozenxid 740087784",,,,,"automatic vacuum of table ""template0.pg_catalog.pg_authid""",,,,""
2018-03-19 12:08:33.957 CDT,,,14892,,5aafee91.3a2c,2,,2018-03-19 12:08:33 CDT,59/340953,0,ERROR,XX001,"found xmin 2906288383 from before relfrozenxid 740087784",,,,,"automatic vacuum of table ""template0.pg_catalog.pg_auth_members""",,,,""


Any insight would be much appreciated.

Thanks,
Jeremy
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Álvaro Herrera
Jeremy Finzel wrote:

> Getting some concerning errors in one of our databases that is on 9.5.11,
> on autovacuum from template0 database pg_authid and pg_auth_members.  I
> only saw some notes on the list about this error related to materialized
> views.  FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
> 9.5.  Here is an example:
>
> 2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19 12:08:33
> CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before relfrozenxid
> 740087784",,,,,"automatic vacuum of table
> ""template0.pg_catalog.pg_authid""",,,,""

Can you please supply output of pg_controldata?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel
pg_control version number:            942
Catalog version number:               201510051
Database system identifier:           6351536019599012028
Database cluster state:               in production
pg_control last modified:             Mon 19 Mar 2018 12:56:10 PM CDT
Latest checkpoint location:           262BE/FE96240
Prior checkpoint location:            262BA/623D5E40
Latest checkpoint's REDO location:    262BA/F5499E98
Latest checkpoint's REDO WAL file:    00000001000262BA000000F5
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          16/3132524419
Latest checkpoint's NextOID:          1090653331
Latest checkpoint's NextMultiXactId:  2142
Latest checkpoint's NextMultiOffset:  5235
Latest checkpoint's oldestXID:        1829964553
Latest checkpoint's oldestXID's DB:   12376
Latest checkpoint's oldestActiveXID:  3131774441
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 16400
Latest checkpoint's oldestCommitTsXid:1829964553
Latest checkpoint's newestCommitTsXid:3132524418
Time of latest checkpoint:            Mon 19 Mar 2018 12:54:08 PM CDT
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    logical
wal_log_hints setting:                off
max_connections setting:              2000
max_worker_processes setting:         10
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       on
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0

On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <[hidden email]> wrote:
Jeremy Finzel wrote:
> Getting some concerning errors in one of our databases that is on 9.5.11,
> on autovacuum from template0 database pg_authid and pg_auth_members.  I
> only saw some notes on the list about this error related to materialized
> views.  FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
> 9.5.  Here is an example:
>
> 2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19 12:08:33
> CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before relfrozenxid
> 740087784",,,,,"automatic vacuum of table
> ""template0.pg_catalog.pg_authid""",,,,""

Can you please supply output of pg_controldata?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel
In reply to this post by Álvaro Herrera


On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <[hidden email]> wrote:
Jeremy Finzel wrote:
> Getting some concerning errors in one of our databases that is on 9.5.11,
> on autovacuum from template0 database pg_authid and pg_auth_members.  I
> only saw some notes on the list about this error related to materialized
> views.  FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
> 9.5.  Here is an example:
>
> 2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19 12:08:33
> CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before relfrozenxid
> 740087784",,,,,"automatic vacuum of table
> ""template0.pg_catalog.pg_authid""",,,,""

Can you please supply output of pg_controldata?

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Please forgive my accidental top-post.  Here:

pg_control version number:            942
Catalog version number:               201510051
Database system identifier:           6351536019599012028
Database cluster state:               in production
pg_control last modified:             Mon 19 Mar 2018 12:56:10 PM CDT
Latest checkpoint location:           262BE/FE96240
Prior checkpoint location:            262BA/623D5E40
Latest checkpoint's REDO location:    262BA/F5499E98
Latest checkpoint's REDO WAL file:    00000001000262BA000000F5
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          16/3132524419
Latest checkpoint's NextOID:          1090653331
Latest checkpoint's NextMultiXactId:  2142
Latest checkpoint's NextMultiOffset:  5235
Latest checkpoint's oldestXID:        1829964553
Latest checkpoint's oldestXID's DB:   12376
Latest checkpoint's oldestActiveXID:  3131774441
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 16400
Latest checkpoint's oldestCommitTsXid:1829964553
Latest checkpoint's newestCommitTsXid:3132524418
Time of latest checkpoint:            Mon 19 Mar 2018 12:54:08 PM CDT
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    logical
wal_log_hints setting:                off
max_connections setting:              2000
max_worker_processes setting:         10
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       on
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0


Thanks,
Jeremy
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Andres Freund
Hi Jeremy, Alvaro,

On 2018-03-19 13:00:13 -0500, Jeremy Finzel wrote:

> On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <[hidden email]>
> wrote:
>
> > Jeremy Finzel wrote:
> > > Getting some concerning errors in one of our databases that is on 9.5.11,
> > > on autovacuum from template0 database pg_authid and pg_auth_members.  I
> > > only saw some notes on the list about this error related to materialized
> > > views.  FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
> > > 9.5.  Here is an example:
> > >
> > > 2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19
> > 12:08:33
> > > CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before
> > relfrozenxid
> > > 740087784",,,,,"automatic vacuum of table
> > > ""template0.pg_catalog.pg_authid""",,,,""
> >
> > Can you please supply output of pg_controldata?

> Latest checkpoint's NextXID:          16/3132524419
> Latest checkpoint's NextMultiXactId:  2142
> Latest checkpoint's NextMultiOffset:  5235
> Latest checkpoint's oldestXID:        1829964553
> Latest checkpoint's oldestXID's DB:   12376
> Latest checkpoint's oldestActiveXID:  3131774441
> Latest checkpoint's oldestMultiXid:   1
> Latest checkpoint's oldestMulti's DB: 16400

Hm, based on these it doesn't look like multixacts were involved (based
on oldestMultiXid it's highly unlikley there've multi wraparound, and
there's not much multixact usage on system tables anyway).  Which
suggests that there might have been actual corrpution here.

Jeremy:
- which version of 9.4 and 9.5 ran on this? Do you know?
- Can you install the pageinspect extension? If so, it might be a
  CREATE EXTENSION pageinspect;
  CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8, OUT lp int2, OUT xmin xid)
RETURNS SETOF RECORD
LANGUAGE SQL
AS $$
    SELECT blockno, lp, t_xmin
    FROM
        generate_series(0, pg_relation_size($1::text) / 8192 - 1) blockno, -- every block in the relation
        heap_page_items(get_raw_page($1::text, blockno::int4)) -- every item on the page
    WHERE
        t_xmin IS NOT NULL -- filter out empty items
        AND t_xmin != 1 -- filter out bootstrap
        AND t_xmin != 2 -- filter out frozen transaction id
        AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' | x'0200')::int) -- filter out frozen rows with xid present
        AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHERE oid = $1)) -- xid cutoff filter
$$;
  SELECT * FROM check_rel('pg_authid') LIMIT 100;

  and then display all items for one of the affected pages like
  SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));


Alvaro:
- Hm, we talked about code adding context for these kind of errors,
  right? Is that just skipped for csvlog?
- Alvaro, does the above check_rel() function make sense?

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel


On Mon, Mar 19, 2018 at 1:17 PM, Andres Freund <[hidden email]> wrote:
Hi Jeremy, Alvaro,

On 2018-03-19 13:00:13 -0500, Jeremy Finzel wrote:
> On Mon, Mar 19, 2018 at 12:46 PM, Alvaro Herrera <[hidden email]>
> wrote:
>
> > Jeremy Finzel wrote:
> > > Getting some concerning errors in one of our databases that is on 9.5.11,
> > > on autovacuum from template0 database pg_authid and pg_auth_members.  I
> > > only saw some notes on the list about this error related to materialized
> > > views.  FWIW, we did use pg_upgrade to upgrade this database from 9.4 to
> > > 9.5.  Here is an example:
> > >
> > > 2018-03-19 12:08:33.946 CDT,,,14892,,5aafee91.3a2c,1,,2018-03-19
> > 12:08:33
> > > CDT,59/340951,0,ERROR,XX001,"found xmin 2906288382 from before
> > relfrozenxid
> > > 740087784",,,,,"automatic vacuum of table
> > > ""template0.pg_catalog.pg_authid""",,,,""
> >
> > Can you please supply output of pg_controldata?

> Latest checkpoint's NextXID:          16/<a href="tel:3132524419" value="+13132524419">3132524419
> Latest checkpoint's NextMultiXactId:  2142
> Latest checkpoint's NextMultiOffset:  5235
> Latest checkpoint's oldestXID:        1829964553
> Latest checkpoint's oldestXID's DB:   12376
> Latest checkpoint's oldestActiveXID:  3131774441
> Latest checkpoint's oldestMultiXid:   1
> Latest checkpoint's oldestMulti's DB: 16400

Hm, based on these it doesn't look like multixacts were involved (based
on oldestMultiXid it's highly unlikley there've multi wraparound, and
there's not much multixact usage on system tables anyway).  Which
suggests that there might have been actual corrpution here.

Jeremy:
- which version of 9.4 and 9.5 ran on this? Do you know?

We upgraded to 9.5.5, and today we are running 9.5.11.  And actually we upgraded from 9.3, not 9.4.  We are still trying to figure out which point release we were on at 9.3.
 
- Can you install the pageinspect extension? If so, it might be a
  CREATE EXTENSION pageinspect;
  CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8, OUT lp int2, OUT xmin xid)
RETURNS SETOF RECORD
LANGUAGE SQL
AS $$
    SELECT blockno, lp, t_xmin
    FROM
        generate_series(0, pg_relation_size($1::text) / 8192 - 1) blockno, -- every block in the relation
        heap_page_items(get_raw_page($1::text, blockno::int4)) -- every item on the page
    WHERE
        t_xmin IS NOT NULL -- filter out empty items
        AND t_xmin != 1 -- filter out bootstrap
        AND t_xmin != 2 -- filter out frozen transaction id
        AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' | x'0200')::int) -- filter out frozen rows with xid present
        AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHERE oid = $1)) -- xid cutoff filter
$$;
  SELECT * FROM check_rel('pg_authid') LIMIT 100;

Small note - Needs to be this because != is not supported for xid:

        AND NOT t_xmin = 1 -- filter out bootstrap
        AND NOT t_xmin = 2 -- filter out frozen transaction id
 

  and then display all items for one of the affected pages like
  SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));


Alvaro:
- Hm, we talked about code adding context for these kind of errors,
  right? Is that just skipped for csvlog?
- Alvaro, does the above check_rel() function make sense?

Greetings,

Andres Freund

The function does NOT show any issue with either of those tables.

One very interesting thing that is puzzling us - we have taken several san snapshots of the system real time that are running on the exact same version 9.5.11, and they do NOT show the same error when we vacuum these tables.  It makes us wonder if simply a db restart would solve the issue.

We will continue to investigate but interested in your feedback about what we have seen thus far.

Thanks,
Jeremy

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Andres Freund
On 2018-03-19 14:37:24 -0500, Jeremy Finzel wrote:
> We upgraded to 9.5.5, and today we are running 9.5.11.  And actually we
> upgraded from 9.3, not 9.4.  We are still trying to figure out which point
> release we were on at 9.3.

Ok.  IIRC there used to be a bug a few years back that sometimes lead to
highly contended pages being skipped during vacuum, and we'd still
update relfrozenxid. IIRC it required the table to be extended at the
same time or something?


>
> > - Can you install the pageinspect extension? If so, it might be a
> >   CREATE EXTENSION pageinspect;
> >   CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8, OUT
> > lp int2, OUT xmin xid)
> > RETURNS SETOF RECORD
> > LANGUAGE SQL
> > AS $$
> >     SELECT blockno, lp, t_xmin
> >     FROM
> >         generate_series(0, pg_relation_size($1::text) / 8192 - 1) blockno,
> > -- every block in the relation
> >         heap_page_items(get_raw_page($1::text, blockno::int4)) -- every
> > item on the page
> >     WHERE
> >         t_xmin IS NOT NULL -- filter out empty items
> >         AND t_xmin != 1 -- filter out bootstrap
> >         AND t_xmin != 2 -- filter out frozen transaction id
> >         AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' |
> > x'0200')::int) -- filter out frozen rows with xid present
> >         AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHERE oid
> > = $1)) -- xid cutoff filter
> > $$;
> >   SELECT * FROM check_rel('pg_authid') LIMIT 100;
> >
>
> Small note - Needs to be this because != is not supported for xid:
>
>         AND NOT t_xmin = 1 -- filter out bootstrap
>         AND NOT t_xmin = 2 -- filter out frozen transaction id

Only on older releases ;). But yea, that looks right.



> >   and then display all items for one of the affected pages like
> >   SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));
> >
> >
> > Alvaro:
> > - Hm, we talked about code adding context for these kind of errors,
> >   right? Is that just skipped for csvlog?
> > - Alvaro, does the above check_rel() function make sense?
> >
> > Greetings,
> >
> > Andres Freund
> >
>
> The function does NOT show any issue with either of those tables.

Uh, huh?  Alvaro, do you see a bug in my query?

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel


On Mon, Mar 19, 2018 at 2:41 PM, Andres Freund <[hidden email]> wrote:
On 2018-03-19 14:37:24 -0500, Jeremy Finzel wrote:
> We upgraded to 9.5.5, and today we are running 9.5.11.  And actually we
> upgraded from 9.3, not 9.4.  We are still trying to figure out which point
> release we were on at 9.3.

Ok.  IIRC there used to be a bug a few years back that sometimes lead to
highly contended pages being skipped during vacuum, and we'd still
update relfrozenxid. IIRC it required the table to be extended at the
same time or something?


>
> > - Can you install the pageinspect extension? If so, it might be a
> >   CREATE EXTENSION pageinspect;
> >   CREATE OR REPLACE FUNCTION check_rel(rel regclass, OUT blockno int8, OUT
> > lp int2, OUT xmin xid)
> > RETURNS SETOF RECORD
> > LANGUAGE SQL
> > AS $$
> >     SELECT blockno, lp, t_xmin
> >     FROM
> >         generate_series(0, pg_relation_size($1::text) / 8192 - 1) blockno,
> > -- every block in the relation
> >         heap_page_items(get_raw_page($1::text, blockno::int4)) -- every
> > item on the page
> >     WHERE
> >         t_xmin IS NOT NULL -- filter out empty items
> >         AND t_xmin != 1 -- filter out bootstrap
> >         AND t_xmin != 2 -- filter out frozen transaction id
> >         AND (t_infomask & ((x'0100' | x'0200')::int)) != ((x'0100' |
> > x'0200')::int) -- filter out frozen rows with xid present
> >         AND age(t_xmin) > age((SELECT relfrozenxid FROM pg_class WHERE oid
> > = $1)) -- xid cutoff filter
> > $$;
> >   SELECT * FROM check_rel('pg_authid') LIMIT 100;
> >
>
> Small note - Needs to be this because != is not supported for xid:
>
>         AND NOT t_xmin = 1 -- filter out bootstrap
>         AND NOT t_xmin = 2 -- filter out frozen transaction id

Only on older releases ;). But yea, that looks right.



> >   and then display all items for one of the affected pages like
> >   SELECT * FROM heap_page_items(get_raw_page('pg_authid', 34343));
> >
> >
> > Alvaro:
> > - Hm, we talked about code adding context for these kind of errors,
> >   right? Is that just skipped for csvlog?
> > - Alvaro, does the above check_rel() function make sense?
> >
> > Greetings,
> >
> > Andres Freund
> >
>
> The function does NOT show any issue with either of those tables.

Uh, huh?  Alvaro, do you see a bug in my query?

Greetings,

Andres Freund

FWIW, if I remove the last filter, I get these rows and I believe row 7/57/2906288382 is the one generating error:

SELECT * FROM check_rel('pg_authid') LIMIT 100;
 blockno | lp |    xmin
---------+----+------------
       7 |  4 | 2040863716
       7 |  5 | 2040863716
       7 |  8 | 2041172882
       7 |  9 | 2041172882
       7 | 12 | 2041201779
       7 | 13 | 2041201779
       7 | 16 | 2089742733
       7 | 17 | 2090021318
       7 | 18 | 2090021318
       7 | 47 | 2090021898
       7 | 48 | 2090021898
       7 | 49 | 2102749003
       7 | 50 | 2103210571
       7 | 51 | 2103210571
       7 | 54 | 2154640913
       7 | 55 | 2163849781
       7 | 56 | 2295315714
       7 | 57 | 2906288382
       7 | 58 | 2906329443
       7 | 60 | 3131766386
       8 |  1 | 2089844462
       8 |  2 | 2089844462
       8 |  3 | 2089844463
       8 |  6 | 2089844463
       8 |  9 | 2295318868
(25 rows)
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Andres Freund
Hi,

On 2018-03-19 14:53:58 -0500, Jeremy Finzel wrote:
> FWIW, if I remove the last filter, I get these rows and I believe row 7/57/
> 2906288382 is the one generating error:

Oh, yea, that makes sense. It's wrapped around and looks like it's from
the future.

> SELECT * FROM check_rel('pg_authid') LIMIT 100;
>  blockno | lp |    xmin
> ---------+----+------------
>        7 |  4 | 2040863716
>        7 |  5 | 2040863716
>        7 |  8 | 2041172882
>        7 |  9 | 2041172882
>        7 | 12 | 2041201779
>        7 | 13 | 2041201779
>        7 | 16 | 2089742733
>        7 | 17 | 2090021318
>        7 | 18 | 2090021318
>        7 | 47 | 2090021898
>        7 | 48 | 2090021898
>        7 | 49 | 2102749003
>        7 | 50 | 2103210571
>        7 | 51 | 2103210571
>        7 | 54 | 2154640913
>        7 | 55 | 2163849781
>        7 | 56 | 2295315714
>        7 | 57 | 2906288382
>        7 | 58 | 2906329443
>        7 | 60 | 3131766386
>        8 |  1 | 2089844462
>        8 |  2 | 2089844462
>        8 |  3 | 2089844463
>        8 |  6 | 2089844463
>        8 |  9 | 2295318868
> (25 rows)

Could you show the contents of those two pages with a query like I had
in an earlier email?

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel


On Mon, Mar 19, 2018 at 2:56 PM, Andres Freund <[hidden email]> wrote:
Hi,

On 2018-03-19 14:53:58 -0500, Jeremy Finzel wrote:
> FWIW, if I remove the last filter, I get these rows and I believe row 7/57/
> 2906288382 is the one generating error:

Oh, yea, that makes sense. It's wrapped around and looks like it's from
the future.

> SELECT * FROM check_rel('pg_authid') LIMIT 100;
>  blockno | lp |    xmin
> ---------+----+------------
>        7 |  4 | 2040863716
>        7 |  5 | 2040863716
>        7 |  8 | 2041172882
>        7 |  9 | 2041172882
>        7 | 12 | 2041201779
>        7 | 13 | 2041201779
>        7 | 16 | 2089742733
>        7 | 17 | 2090021318
>        7 | 18 | 2090021318
>        7 | 47 | 2090021898
>        7 | 48 | 2090021898
>        7 | 49 | 2102749003
>        7 | 50 | 2103210571
>        7 | 51 | 2103210571
>        7 | 54 | 2154640913
>        7 | 55 | 2163849781
>        7 | 56 | 2295315714
>        7 | 57 | 2906288382
>        7 | 58 | 2906329443
>        7 | 60 | 3131766386
>        8 |  1 | 2089844462
>        8 |  2 | 2089844462
>        8 |  3 | 2089844463
>        8 |  6 | 2089844463
>        8 |  9 | 2295318868
> (25 rows)

Could you show the contents of those two pages with a query like I had
in an earlier email?

Greetings,

Andres Freund

SELECT heap_page_items(get_raw_page('pg_authid', 7));

                                                              heap_page_items
--------------------------------------------------------------------------------------------------------------------------------------------
 (1,4720,1,108,1897434979,0,0,"(7,1)",11,2825,32,111111111000000000000000000000000000000001011000010011111100001001111000,507769370)
 (2,4608,1,108,1897442758,0,18,"(7,2)",11,2825,32,111111111000000000000000000000000000000011000011101100000010001001111000,507776451)
 (3,4496,1,108,1897442758,0,20,"(7,3)",11,2825,32,111111111000000000000000000000000000000000100011101100000010001001111000,507776452)
 (4,4384,1,108,2040863716,0,37,"(7,4)",11,2313,32,111111111000000000000000000000000000000000110110111011100011001011111000,525105004)
 (5,4272,1,108,2040863716,0,39,"(7,5)",11,2313,32,111111111000000000000000000000000000000010110110111011100011001011111000,525105005)
 (6,0,3,0,,,,,,,,,)
 (7,0,3,0,,,,,,,,,)
 (8,4160,1,108,2041172882,0,49,"(7,8)",11,2313,32,111111111000000000000000000000000000000001110100101011000111001011111000,525219118)
 (9,4048,1,108,2041172882,0,51,"(7,9)",11,2313,32,111111111000000000000000000000000000000011110100101011000111001011111000,525219119)
 (10,0,3,0,,,,,,,,,)
 (11,0,3,0,,,,,,,,,)
 (12,3936,1,108,2041201779,0,181,"(7,12)",11,2313,32,111111111000000000000000000000000000000011010100010111100111001011111000,525236779)
 (13,3824,1,108,2041201779,0,183,"(7,13)",11,2313,32,111111111000000000000000000000000000000000110100010111100111001011111000,525236780)
 (14,0,3,0,,,,,,,,,)
 (15,0,3,0,,,,,,,,,)
 (16,3712,1,108,2089742733,0,0,"(7,16)",11,2313,32,111111111000000000000000000000000000000001000101110011100000001111111000,532706210)
 (17,3600,1,108,2090021318,0,1,"(7,17)",11,2313,32,111111111000000000000000000000000000000001001100001101001000001111111000,532753458)
 (18,3488,1,108,2090021318,0,3,"(7,18)",11,2313,32,111111111000000000000000000000000000000011001100001101001000001111111000,532753459)
 (19,8080,1,108,753125991,0,2,"(7,19)",11,2825,32,111111111000000000000000000000000000000000110001010111001011100001110000,236796556)
 (20,7968,1,108,753125991,0,4,"(7,20)",11,2825,32,111111111000000000000000000000000000000010110001010111001011100001110000,236796557)
 (21,7856,1,108,753125992,0,19,"(7,21)",11,2825,32,111111111000000000000000000000000000000011001001010111001011100001110000,236796563)
 (22,7744,1,108,753125992,0,21,"(7,22)",11,2825,32,111111111000000000000000000000000000000000101001010111001011100001110000,236796564)
 (23,7632,1,108,753125993,0,36,"(7,23)",11,2825,32,111111111000000000000000000000000000000001011001010111001011100001110000,236796570)
 (24,7520,1,108,753125993,0,38,"(7,24)",11,2825,32,111111111000000000000000000000000000000011011001010111001011100001110000,236796571)
 (25,7408,1,108,753125994,0,53,"(7,25)",11,2825,32,111111111000000000000000000000000000000010000101010111001011100001110000,236796577)
 (26,7296,1,108,753125994,0,55,"(7,26)",11,2825,32,111111111000000000000000000000000000000001000101010111001011100001110000,236796578)
 (27,7184,1,108,753125995,0,70,"(7,27)",11,2825,32,111111111000000000000000000000000000000000010101010111001011100001110000,236796584)
 (28,7072,1,108,753125995,0,72,"(7,28)",11,2825,32,111111111000000000000000000000000000000010010101010111001011100001110000,236796585)
 (29,6960,1,108,753125997,0,87,"(7,29)",11,2825,32,111111111000000000000000000000000000000011001101010111001011100001110000,236796595)
 (30,6848,1,108,753125997,0,89,"(7,30)",11,2825,32,111111111000000000000000000000000000000000101101010111001011100001110000,236796596)
 (31,6736,1,108,753125998,0,104,"(7,31)",11,2825,32,111111111000000000000000000000000000000001011101010111001011100001110000,236796602)
 (32,6624,1,108,753125998,0,106,"(7,32)",11,2825,32,111111111000000000000000000000000000000011011101010111001011100001110000,236796603)
 (33,6512,1,108,753125999,0,121,"(7,33)",11,2825,32,111111111000000000000000000000000000000010000011010111001011100001110000,236796609)
 (34,6400,1,108,753125999,0,123,"(7,34)",11,2825,32,111111111000000000000000000000000000000001000011010111001011100001110000,236796610)
 (35,6288,1,108,753126000,0,138,"(7,35)",11,2825,32,111111111000000000000000000000000000000000010011010111001011100001110000,236796616)
 (36,6176,1,108,753126000,0,140,"(7,36)",11,2825,32,111111111000000000000000000000000000000010010011010111001011100001110000,236796617)
 (37,6064,1,108,753126001,0,155,"(7,37)",11,2825,32,111111111000000000000000000000000000000011110011010111001011100001110000,236796623)
 (38,5952,1,108,753126001,0,157,"(7,38)",11,2825,32,111111111000000000000000000000000000000000001011010111001011100001110000,236796624)
 (39,5840,1,108,753126002,0,172,"(7,39)",11,2825,32,111111111000000000000000000000000000000001101011010111001011100001110000,236796630)
 (40,5728,1,108,753126002,0,174,"(7,40)",11,2825,32,111111111000000000000000000000000000000011101011010111001011100001110000,236796631)
 (41,5616,1,108,753126003,0,189,"(7,41)",11,2825,32,111111111000000000000000000000000000000010111011010111001011100001110000,236796637)
 (42,5504,1,108,753126003,0,191,"(7,42)",11,2825,32,111111111000000000000000000000000000000001111011010111001011100001110000,236796638)
 (43,5392,1,108,753126004,0,206,"(7,43)",11,2825,32,111111111000000000000000000000000000000000100111010111001011100001110000,236796644)
 (44,5280,1,108,753126004,0,208,"(7,44)",11,2825,32,111111111000000000000000000000000000000010100111010111001011100001110000,236796645)
 (45,5168,1,108,753126005,0,223,"(7,45)",11,2825,32,111111111000000000000000000000000000000011010111010111001011100001110000,236796651)
 (46,5056,1,108,753126005,0,225,"(7,46)",11,2825,32,111111111000000000000000000000000000000000110111010111001011100001110000,236796652)
 (47,3376,1,108,2090021898,0,18,"(7,47)",11,2313,32,111111111000000000000000000000000000000001000111001101001000001111111000,532753634)
 (48,3264,1,108,2090021898,0,20,"(7,48)",11,2313,32,111111111000000000000000000000000000000000100111001101001000001111111000,532753636)
 (49,3152,1,108,2102749003,0,0,"(7,49)",11,2313,32,111111111000000000000000000000000000000001111011101110010000110100000100,548445662)
 (50,3040,1,108,2103210571,0,107,"(7,50)",11,2313,32,111111111000000000000000000000000000000000010110101100100100110100000100,548556136)
 (51,2928,1,108,2103210571,0,109,"(7,51)",11,2313,32,111111111000000000000000000000000000000010010110101100100100110100000100,548556137)
 (52,0,3,0,,,,,,,,,)
 (53,0,3,0,,,,,,,,,)
 (54,2816,1,108,2154640913,0,0,"(7,54)",11,2313,32,111111111000000000000000000000000000000010111111010010000000100010000100,554701565)
 (55,2704,1,108,2163849781,0,0,"(7,55)",11,2313,32,111111111000000000000000000000000000000000110000001011101001100010000100,555316236)
 (56,2592,1,108,2295315714,0,0,"(7,56)",11,2313,32,111111111000000000000000000000000000000000100101010100001000010001000100,572590756)
 (57,2480,1,108,2906288382,0,0,"(7,57)",11,2313,32,111111111000000000000000000000000000000000001000101101000100111001111100,1047670032)
 (58,2368,1,108,2906329443,0,0,"(7,58)",11,2313,32,111111111000000000000000000000000000000001111001101101000100111001111100,1047670174)
 (59,60,2,0,,,,,,,,,)
 (60,2224,1,144,3131766386,0,0,"(7,60)",32779,10507,32,111111111100000000000000000000000000000000001101010001010011111100000010,1090298544)
 (61,0,0,0,,,,,,,,,)
 (62,0,0,0,,,,,,,,,)
 (63,0,0,0,,,,,,,,,)
 (64,0,0,0,,,,,,,,,)
 (65,0,0,0,,,,,,,,,)
 (66,0,0,0,,,,,,,,,)
 (67,0,0,0,,,,,,,,,)
 (68,0,0,0,,,,,,,,,)
 (69,4944,1,108,1034607755,0,1,"(7,69)",11,2825,32,111111111000000000000000000000000000000000010111011000010010010100001000,279217896)
 (70,4832,1,108,1034607755,0,3,"(7,70)",11,2825,32,111111111000000000000000000000000000000010010111011000010010010100001000,279217897)
(70 rows)
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel


On Mon, Mar 19, 2018 at 3:01 PM, Jeremy Finzel <[hidden email]> wrote:


On Mon, Mar 19, 2018 at 2:56 PM, Andres Freund <[hidden email]> wrote:
Hi,

On 2018-03-19 14:53:58 -0500, Jeremy Finzel wrote:
> FWIW, if I remove the last filter, I get these rows and I believe row 7/57/
> 2906288382 is the one generating error:

Oh, yea, that makes sense. It's wrapped around and looks like it's from
the future.

> SELECT * FROM check_rel('pg_authid') LIMIT 100;
>  blockno | lp |    xmin
> ---------+----+------------
>        7 |  4 | 2040863716
>        7 |  5 | 2040863716
>        7 |  8 | 2041172882
>        7 |  9 | 2041172882
>        7 | 12 | 2041201779
>        7 | 13 | 2041201779
>        7 | 16 | <a href="tel:(208)%20974-2733" value="+12089742733" target="_blank">2089742733
>        7 | 17 | 2090021318
>        7 | 18 | 2090021318
>        7 | 47 | 2090021898
>        7 | 48 | 2090021898
>        7 | 49 | <a href="tel:(210)%20274-9003" value="+12102749003" target="_blank">2102749003
>        7 | 50 | <a href="tel:(210)%20321-0571" value="+12103210571" target="_blank">2103210571
>        7 | 51 | <a href="tel:(210)%20321-0571" value="+12103210571" target="_blank">2103210571
>        7 | 54 | <a href="tel:(215)%20464-0913" value="+12154640913" target="_blank">2154640913
>        7 | 55 | <a href="tel:(216)%20384-9781" value="+12163849781" target="_blank">2163849781
>        7 | 56 | <a href="tel:(229)%20531-5714" value="+12295315714" target="_blank">2295315714
>        7 | 57 | 2906288382
>        7 | 58 | 2906329443
>        7 | 60 | 3131766386
>        8 |  1 | <a href="tel:(208)%20984-4462" value="+12089844462" target="_blank">2089844462
>        8 |  2 | <a href="tel:(208)%20984-4462" value="+12089844462" target="_blank">2089844462
>        8 |  3 | <a href="tel:(208)%20984-4463" value="+12089844463" target="_blank">2089844463
>        8 |  6 | <a href="tel:(208)%20984-4463" value="+12089844463" target="_blank">2089844463
>        8 |  9 | <a href="tel:(229)%20531-8868" value="+12295318868" target="_blank">2295318868
> (25 rows)

Could you show the contents of those two pages with a query like I had
in an earlier email?

Greetings,

Andres Freund

SELECT heap_page_items(get_raw_page('pg_authid', 7));

                                                              heap_page_items
--------------------------------------------------------------------------------------------------------------------------------------------
 (1,4720,1,108,1897434979,0,0,"(7,1)",11,2825,32,111111111000000000000000000000000000000001011000010011111100001001111000,507769370)
 (2,4608,1,108,1897442758,0,18,"(7,2)",11,2825,32,111111111000000000000000000000000000000011000011101100000010001001111000,507776451)
 (3,4496,1,108,1897442758,0,20,"(7,3)",11,2825,32,111111111000000000000000000000000000000000100011101100000010001001111000,507776452)
 (4,4384,1,108,2040863716,0,37,"(7,4)",11,2313,32,111111111000000000000000000000000000000000110110111011100011001011111000,525105004)
 (5,4272,1,108,2040863716,0,39,"(7,5)",11,2313,32,111111111000000000000000000000000000000010110110111011100011001011111000,525105005)
 (6,0,3,0,,,,,,,,,)
 (7,0,3,0,,,,,,,,,)
 (8,4160,1,108,2041172882,0,49,"(7,8)",11,2313,32,111111111000000000000000000000000000000001110100101011000111001011111000,525219118)
 (9,4048,1,108,2041172882,0,51,"(7,9)",11,2313,32,111111111000000000000000000000000000000011110100101011000111001011111000,525219119)
 (10,0,3,0,,,,,,,,,)
 (11,0,3,0,,,,,,,,,)
 (12,3936,1,108,2041201779,0,181,"(7,12)",11,2313,32,111111111000000000000000000000000000000011010100010111100111001011111000,525236779)
 (13,3824,1,108,2041201779,0,183,"(7,13)",11,2313,32,111111111000000000000000000000000000000000110100010111100111001011111000,525236780)
 (14,0,3,0,,,,,,,,,)
 (15,0,3,0,,,,,,,,,)
 (16,3712,1,108,<a href="tel:(208)%20974-2733" value="+12089742733" target="_blank">2089742733,0,0,"(7,16)",11,2313,32,111111111000000000000000000000000000000001000101110011100000001111111000,532706210)
 (17,3600,1,108,2090021318,0,1,"(7,17)",11,2313,32,111111111000000000000000000000000000000001001100001101001000001111111000,532753458)
 (18,3488,1,108,2090021318,0,3,"(7,18)",11,2313,32,111111111000000000000000000000000000000011001100001101001000001111111000,532753459)
 (19,8080,1,108,753125991,0,2,"(7,19)",11,2825,32,111111111000000000000000000000000000000000110001010111001011100001110000,236796556)
 (20,7968,1,108,753125991,0,4,"(7,20)",11,2825,32,111111111000000000000000000000000000000010110001010111001011100001110000,236796557)
 (21,7856,1,108,753125992,0,19,"(7,21)",11,2825,32,111111111000000000000000000000000000000011001001010111001011100001110000,236796563)
 (22,7744,1,108,753125992,0,21,"(7,22)",11,2825,32,111111111000000000000000000000000000000000101001010111001011100001110000,236796564)
 (23,7632,1,108,753125993,0,36,"(7,23)",11,2825,32,111111111000000000000000000000000000000001011001010111001011100001110000,236796570)
 (24,7520,1,108,753125993,0,38,"(7,24)",11,2825,32,111111111000000000000000000000000000000011011001010111001011100001110000,236796571)
 (25,7408,1,108,753125994,0,53,"(7,25)",11,2825,32,111111111000000000000000000000000000000010000101010111001011100001110000,236796577)
 (26,7296,1,108,753125994,0,55,"(7,26)",11,2825,32,111111111000000000000000000000000000000001000101010111001011100001110000,236796578)
 (27,7184,1,108,753125995,0,70,"(7,27)",11,2825,32,111111111000000000000000000000000000000000010101010111001011100001110000,236796584)
 (28,7072,1,108,753125995,0,72,"(7,28)",11,2825,32,111111111000000000000000000000000000000010010101010111001011100001110000,236796585)
 (29,6960,1,108,753125997,0,87,"(7,29)",11,2825,32,111111111000000000000000000000000000000011001101010111001011100001110000,236796595)
 (30,6848,1,108,753125997,0,89,"(7,30)",11,2825,32,111111111000000000000000000000000000000000101101010111001011100001110000,236796596)
 (31,6736,1,108,753125998,0,104,"(7,31)",11,2825,32,111111111000000000000000000000000000000001011101010111001011100001110000,236796602)
 (32,6624,1,108,753125998,0,106,"(7,32)",11,2825,32,111111111000000000000000000000000000000011011101010111001011100001110000,236796603)
 (33,6512,1,108,753125999,0,121,"(7,33)",11,2825,32,111111111000000000000000000000000000000010000011010111001011100001110000,236796609)
 (34,6400,1,108,753125999,0,123,"(7,34)",11,2825,32,111111111000000000000000000000000000000001000011010111001011100001110000,236796610)
 (35,6288,1,108,753126000,0,138,"(7,35)",11,2825,32,111111111000000000000000000000000000000000010011010111001011100001110000,236796616)
 (36,6176,1,108,753126000,0,140,"(7,36)",11,2825,32,111111111000000000000000000000000000000010010011010111001011100001110000,236796617)
 (37,6064,1,108,753126001,0,155,"(7,37)",11,2825,32,111111111000000000000000000000000000000011110011010111001011100001110000,236796623)
 (38,5952,1,108,753126001,0,157,"(7,38)",11,2825,32,111111111000000000000000000000000000000000001011010111001011100001110000,236796624)
 (39,5840,1,108,753126002,0,172,"(7,39)",11,2825,32,111111111000000000000000000000000000000001101011010111001011100001110000,236796630)
 (40,5728,1,108,753126002,0,174,"(7,40)",11,2825,32,111111111000000000000000000000000000000011101011010111001011100001110000,236796631)
 (41,5616,1,108,753126003,0,189,"(7,41)",11,2825,32,111111111000000000000000000000000000000010111011010111001011100001110000,236796637)
 (42,5504,1,108,753126003,0,191,"(7,42)",11,2825,32,111111111000000000000000000000000000000001111011010111001011100001110000,236796638)
 (43,5392,1,108,753126004,0,206,"(7,43)",11,2825,32,111111111000000000000000000000000000000000100111010111001011100001110000,236796644)
 (44,5280,1,108,753126004,0,208,"(7,44)",11,2825,32,111111111000000000000000000000000000000010100111010111001011100001110000,236796645)
 (45,5168,1,108,753126005,0,223,"(7,45)",11,2825,32,111111111000000000000000000000000000000011010111010111001011100001110000,236796651)
 (46,5056,1,108,753126005,0,225,"(7,46)",11,2825,32,111111111000000000000000000000000000000000110111010111001011100001110000,236796652)
 (47,3376,1,108,2090021898,0,18,"(7,47)",11,2313,32,111111111000000000000000000000000000000001000111001101001000001111111000,532753634)
 (48,3264,1,108,2090021898,0,20,"(7,48)",11,2313,32,111111111000000000000000000000000000000000100111001101001000001111111000,532753636)
 (49,3152,1,108,<a href="tel:(210)%20274-9003" value="+12102749003" target="_blank">2102749003,0,0,"(7,49)",11,2313,32,111111111000000000000000000000000000000001111011101110010000110100000100,548445662)
 (50,3040,1,108,<a href="tel:(210)%20321-0571" value="+12103210571" target="_blank">2103210571,0,107,"(7,50)",11,2313,32,111111111000000000000000000000000000000000010110101100100100110100000100,548556136)
 (51,2928,1,108,<a href="tel:(210)%20321-0571" value="+12103210571" target="_blank">2103210571,0,109,"(7,51)",11,2313,32,111111111000000000000000000000000000000010010110101100100100110100000100,548556137)
 (52,0,3,0,,,,,,,,,)
 (53,0,3,0,,,,,,,,,)
 (54,2816,1,108,<a href="tel:(215)%20464-0913" value="+12154640913" target="_blank">2154640913,0,0,"(7,54)",11,2313,32,111111111000000000000000000000000000000010111111010010000000100010000100,554701565)
 (55,2704,1,108,<a href="tel:(216)%20384-9781" value="+12163849781" target="_blank">2163849781,0,0,"(7,55)",11,2313,32,111111111000000000000000000000000000000000110000001011101001100010000100,555316236)
 (56,2592,1,108,<a href="tel:(229)%20531-5714" value="+12295315714" target="_blank">2295315714,0,0,"(7,56)",11,2313,32,111111111000000000000000000000000000000000100101010100001000010001000100,572590756)
 (57,2480,1,108,2906288382,0,0,"(7,57)",11,2313,32,111111111000000000000000000000000000000000001000101101000100111001111100,1047670032)
 (58,2368,1,108,2906329443,0,0,"(7,58)",11,2313,32,111111111000000000000000000000000000000001111001101101000100111001111100,1047670174)
 (59,60,2,0,,,,,,,,,)
 (60,2224,1,144,3131766386,0,0,"(7,60)",32779,10507,32,111111111100000000000000000000000000000000001101010001010011111100000010,1090298544)
 (61,0,0,0,,,,,,,,,)
 (62,0,0,0,,,,,,,,,)
 (63,0,0,0,,,,,,,,,)
 (64,0,0,0,,,,,,,,,)
 (65,0,0,0,,,,,,,,,)
 (66,0,0,0,,,,,,,,,)
 (67,0,0,0,,,,,,,,,)
 (68,0,0,0,,,,,,,,,)
 (69,4944,1,108,1034607755,0,1,"(7,69)",11,2825,32,111111111000000000000000000000000000000000010111011000010010010100001000,279217896)
 (70,4832,1,108,1034607755,0,3,"(7,70)",11,2825,32,111111111000000000000000000000000000000010010111011000010010010100001000,279217897)
(70 rows)

Does the fact that a snapshot does not have this issue suggest it could be memory-related corruption and a db restart could clear it up?

Thanks,
Jeremy
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Andres Freund
Hi,

On 2018-03-19 15:37:51 -0500, Jeremy Finzel wrote:
> Does the fact that a snapshot does not have this issue suggest it could be
> memory-related corruption and a db restart could clear it up?

Could you show the page from the snapshot? I suspect it might just be a
problem that's temporarily not visible as corrupted.

Greetings,

Andres Freund

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Peter Geoghegan-4
In reply to this post by jfinzel
On Mon, Mar 19, 2018 at 1:01 PM, Jeremy Finzel <[hidden email]> wrote:
> SELECT heap_page_items(get_raw_page('pg_authid', 7));

Can you post this?

SELECT * FROM page_header(get_raw_page('pg_authid', 7));

--
Peter Geoghegan

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel


On Mon, Mar 19, 2018 at 3:53 PM, Peter Geoghegan <[hidden email]> wrote:
On Mon, Mar 19, 2018 at 1:01 PM, Jeremy Finzel <[hidden email]> wrote:
> SELECT heap_page_items(get_raw_page('pg_authid', 7));

Can you post this?

SELECT * FROM page_header(get_raw_page('pg_authid', 7));

--
Peter Geoghegan

@Peter :

staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
      lsn       | checksum | flags | lower | upper | special | pagesize | version | prune_xid
----------------+----------+-------+-------+-------+---------+----------+---------+-----------
 262B4/10FDC478 |        0 |     1 |   304 |  2224 |    8192 |     8192 |       4 |         0
(1 row)

@Andres :

This is from snapshot (on 9.5.12, but we didn't have the error either on a 9.5.11 snap):

                                                             heap_page_items
-----------------------------------------------------------------------------------------------------------------------------------------
 (1,0,0,0,,,,,,,,,)
 (2,0,0,0,,,,,,,,,)
 (3,0,0,0,,,,,,,,,)
 (4,0,0,0,,,,,,,,,)
 (5,0,0,0,,,,,,,,,)
 (6,8080,1,108,3137434815,0,0,"(7,6)",11,10505,32,111111111000000000000000000000000000000001111110111100001000010010000001,2166427518)
 (7,7936,1,144,3137434816,0,0,"(7,7)",11,10507,32,111111111100000000000000000000000000000001110111000001110101101100001011,3504005358)
 (8,0,0,0,,,,,,,,,)
 (9,0,0,0,,,,,,,,,)
 (10,7792,1,144,3137434817,0,0,"(7,10)",11,10507,32,111111111100000000000000000000000000000001010110001101000011011111101000,401353834)
 (11,7680,1,108,3137434818,0,0,"(7,11)",11,10505,32,111111111000000000000000000000000000000001100011010010010001000001100001,2248708806)
 (12,0,0,0,,,,,,,,,)
 (13,0,0,0,,,,,,,,,)
 (14,7568,1,108,3137434819,0,0,"(7,14)",11,10505,32,111111111000000000000000000000000000000001110011110011011011100010100101,2770187214)
 (15,7456,1,108,3137434820,0,0,"(7,15)",11,10505,32,111111111000000000000000000000000000000011110001010001010011110010100001,2235343503)
 (16,0,0,0,,,,,,,,,)
 (17,0,0,0,,,,,,,,,)
 (18,0,0,0,,,,,,,,,)
 (19,0,0,0,,,,,,,,,)
 (20,0,0,0,,,,,,,,,)
 (21,0,0,0,,,,,,,,,)
 (22,0,0,0,,,,,,,,,)
 (23,0,0,0,,,,,,,,,)
 (24,0,0,0,,,,,,,,,)
 (25,0,0,0,,,,,,,,,)
 (26,0,0,0,,,,,,,,,)
 (27,0,0,0,,,,,,,,,)
 (28,0,0,0,,,,,,,,,)
 (29,0,0,0,,,,,,,,,)
 (30,0,0,0,,,,,,,,,)
 (31,0,0,0,,,,,,,,,)
 (32,0,0,0,,,,,,,,,)
 (33,0,0,0,,,,,,,,,)
 (34,0,0,0,,,,,,,,,)
 (35,0,0,0,,,,,,,,,)
 (36,0,0,0,,,,,,,,,)
 (37,0,0,0,,,,,,,,,)
 (38,0,0,0,,,,,,,,,)
 (39,0,0,0,,,,,,,,,)
 (40,0,0,0,,,,,,,,,)
 (41,0,0,0,,,,,,,,,)
 (42,0,0,0,,,,,,,,,)
 (43,0,0,0,,,,,,,,,)
 (44,0,0,0,,,,,,,,,)
 (45,0,0,0,,,,,,,,,)
 (46,0,0,0,,,,,,,,,)
 (47,0,0,0,,,,,,,,,)
 (48,0,0,0,,,,,,,,,)
 (49,0,0,0,,,,,,,,,)
 (50,0,0,0,,,,,,,,,)
 (51,0,0,0,,,,,,,,,)
 (52,7344,1,108,3137434821,0,0,"(7,52)",11,10505,32,111111111000000000000000000000000000000011011011111110001010010101000001,2191859675)
 (53,7232,1,108,3137434822,0,0,"(7,53)",11,10505,32,111111111000000000000000000000000000000001101011010111100110011011100100,661027542)
 (54,0,0,0,,,,,,,,,)
 (55,0,0,0,,,,,,,,,)
 (56,0,0,0,,,,,,,,,)
 (57,0,0,0,,,,,,,,,)
 (58,0,0,0,,,,,,,,,)
 (59,0,0,0,,,,,,,,,)
 (60,0,0,0,,,,,,,,,)
 (61,7120,1,108,3137434823,0,0,"(7,61)",11,10505,32,111111111000000000000000000000000000000000010111010110000101010111010100,732568296)
 (62,6976,1,144,3137434824,0,0,"(7,62)",11,10507,32,111111111100000000000000000000000000000010100100001001001010110000010100,674571301)
 (63,6864,1,108,3137434825,0,0,"(7,63)",11,10505,32,111111111000000000000000000000000000000010011011011111011100001001010001,2319695577)
 (64,6720,1,144,3137434826,0,0,"(7,64)",11,10507,32,111111111100000000000000000000000000000001000010011001111011100100101000,345892418)
 (65,6608,1,108,3137434827,0,0,"(7,65)",11,10505,32,111111111000000000000000000000000000000001000001010100010010101011001010,1398049410)
 (66,6496,1,108,3137434828,0,0,"(7,66)",11,10505,32,111111111000000000000000000000000000000000101000111100110101001000101010,1414188820)
 (67,6384,1,108,3137434829,0,0,"(7,67)",11,10505,32,111111111000000000000000000000000000000000110101011101111011001110101001,2513301164)
 (68,0,0,0,,,,,,,,,)
 (69,0,0,0,,,,,,,,,)
 (70,0,0,0,,,,,,,,,)
(70 rows)


Thanks,
Jeremy
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Peter Geoghegan-4
On Mon, Mar 19, 2018 at 1:55 PM, Jeremy Finzel <[hidden email]> wrote:
> @Peter :
>
> staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
>       lsn       | checksum | flags | lower | upper | special | pagesize |
> version | prune_xid
> ----------------+----------+-------+-------+-------+---------+----------+---------+-----------
>  262B4/10FDC478 |        0 |     1 |   304 |  2224 |    8192 |     8192 |
> 4 |         0
> (1 row)

Thanks.

That looks normal. I wonder if the contents of that page looks
consistent with the rest of the table following manual inspection,
though. I recently saw system catalog corruption on a 9.5 instance
where an entirely different relation's page ended up in pg_attribute
and pg_depend. They were actually pristine index pages from an
application index. I still have no idea why this happened.

This is very much a guess, but it can't hurt to check if the contents
of the tuples themselves are actually sane by inspecting them with
"SELECT * FROM pg_authid". heap_page_items() doesn't actually care
about the shape of the tuples in the page, so this might have been
missed.

--
Peter Geoghegan

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel


On Mon, Mar 19, 2018 at 4:12 PM, Peter Geoghegan <[hidden email]> wrote:
On Mon, Mar 19, 2018 at 1:55 PM, Jeremy Finzel <[hidden email]> wrote:
> @Peter :
>
> staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
>       lsn       | checksum | flags | lower | upper | special | pagesize |
> version | prune_xid
> ----------------+----------+-------+-------+-------+---------+----------+---------+-----------
>  262B4/10FDC478 |        0 |     1 |   304 |  2224 |    8192 |     8192 |
> 4 |         0
> (1 row)

Thanks.

That looks normal. I wonder if the contents of that page looks
consistent with the rest of the table following manual inspection,
though. I recently saw system catalog corruption on a 9.5 instance
where an entirely different relation's page ended up in pg_attribute
and pg_depend. They were actually pristine index pages from an
application index. I still have no idea why this happened.

This is very much a guess, but it can't hurt to check if the contents
of the tuples themselves are actually sane by inspecting them with
"SELECT * FROM pg_authid". heap_page_items() doesn't actually care
about the shape of the tuples in the page, so this might have been
missed.

--
Peter Geoghegan

The data all looks fine.  I even forced the index scan on both indexes which also looks fine.

Thanks,
Jeremy
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel
In reply to this post by jfinzel


On Mon, Mar 19, 2018 at 3:55 PM, Jeremy Finzel <[hidden email]> wrote:


On Mon, Mar 19, 2018 at 3:53 PM, Peter Geoghegan <[hidden email]> wrote:
On Mon, Mar 19, 2018 at 1:01 PM, Jeremy Finzel <[hidden email]> wrote:
> SELECT heap_page_items(get_raw_page('pg_authid', 7));

Can you post this?

SELECT * FROM page_header(get_raw_page('pg_authid', 7));

--
Peter Geoghegan

@Peter :

staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
      lsn       | checksum | flags | lower | upper | special | pagesize | version | prune_xid
----------------+----------+-------+-------+-------+---------+----------+---------+-----------
 262B4/10FDC478 |        0 |     1 |   304 |  2224 |    8192 |     8192 |       4 |         0
(1 row)

@Andres :

This is from snapshot (on 9.5.12, but we didn't have the error either on a 9.5.11 snap):

                                                             heap_page_items
-----------------------------------------------------------------------------------------------------------------------------------------
 (1,0,0,0,,,,,,,,,)
 (2,0,0,0,,,,,,,,,)
 (3,0,0,0,,,,,,,,,)
 (4,0,0,0,,,,,,,,,)
 (5,0,0,0,,,,,,,,,)
 (6,8080,1,108,<a href="tel:(313)%20743-4815" value="+13137434815" target="_blank">3137434815,0,0,"(7,6)",11,10505,32,111111111000000000000000000000000000000001111110111100001000010010000001,2166427518)
 (7,7936,1,144,<a href="tel:(313)%20743-4816" value="+13137434816" target="_blank">3137434816,0,0,"(7,7)",11,10507,32,111111111100000000000000000000000000000001110111000001110101101100001011,3504005358)
 (8,0,0,0,,,,,,,,,)
 (9,0,0,0,,,,,,,,,)
 (10,7792,1,144,<a href="tel:(313)%20743-4817" value="+13137434817" target="_blank">3137434817,0,0,"(7,10)",11,10507,32,111111111100000000000000000000000000000001010110001101000011011111101000,401353834)
 (11,7680,1,108,<a href="tel:(313)%20743-4818" value="+13137434818" target="_blank">3137434818,0,0,"(7,11)",11,10505,32,111111111000000000000000000000000000000001100011010010010001000001100001,2248708806)
 (12,0,0,0,,,,,,,,,)
 (13,0,0,0,,,,,,,,,)
 (14,7568,1,108,<a href="tel:(313)%20743-4819" value="+13137434819" target="_blank">3137434819,0,0,"(7,14)",11,10505,32,111111111000000000000000000000000000000001110011110011011011100010100101,2770187214)
 (15,7456,1,108,<a href="tel:(313)%20743-4820" value="+13137434820" target="_blank">3137434820,0,0,"(7,15)",11,10505,32,111111111000000000000000000000000000000011110001010001010011110010100001,2235343503)
 (16,0,0,0,,,,,,,,,)
 (17,0,0,0,,,,,,,,,)
 (18,0,0,0,,,,,,,,,)
 (19,0,0,0,,,,,,,,,)
 (20,0,0,0,,,,,,,,,)
 (21,0,0,0,,,,,,,,,)
 (22,0,0,0,,,,,,,,,)
 (23,0,0,0,,,,,,,,,)
 (24,0,0,0,,,,,,,,,)
 (25,0,0,0,,,,,,,,,)
 (26,0,0,0,,,,,,,,,)
 (27,0,0,0,,,,,,,,,)
 (28,0,0,0,,,,,,,,,)
 (29,0,0,0,,,,,,,,,)
 (30,0,0,0,,,,,,,,,)
 (31,0,0,0,,,,,,,,,)
 (32,0,0,0,,,,,,,,,)
 (33,0,0,0,,,,,,,,,)
 (34,0,0,0,,,,,,,,,)
 (35,0,0,0,,,,,,,,,)
 (36,0,0,0,,,,,,,,,)
 (37,0,0,0,,,,,,,,,)
 (38,0,0,0,,,,,,,,,)
 (39,0,0,0,,,,,,,,,)
 (40,0,0,0,,,,,,,,,)
 (41,0,0,0,,,,,,,,,)
 (42,0,0,0,,,,,,,,,)
 (43,0,0,0,,,,,,,,,)
 (44,0,0,0,,,,,,,,,)
 (45,0,0,0,,,,,,,,,)
 (46,0,0,0,,,,,,,,,)
 (47,0,0,0,,,,,,,,,)
 (48,0,0,0,,,,,,,,,)
 (49,0,0,0,,,,,,,,,)
 (50,0,0,0,,,,,,,,,)
 (51,0,0,0,,,,,,,,,)
 (52,7344,1,108,<a href="tel:(313)%20743-4821" value="+13137434821" target="_blank">3137434821,0,0,"(7,52)",11,10505,32,111111111000000000000000000000000000000011011011111110001010010101000001,2191859675)
 (53,7232,1,108,<a href="tel:(313)%20743-4822" value="+13137434822" target="_blank">3137434822,0,0,"(7,53)",11,10505,32,111111111000000000000000000000000000000001101011010111100110011011100100,661027542)
 (54,0,0,0,,,,,,,,,)
 (55,0,0,0,,,,,,,,,)
 (56,0,0,0,,,,,,,,,)
 (57,0,0,0,,,,,,,,,)
 (58,0,0,0,,,,,,,,,)
 (59,0,0,0,,,,,,,,,)
 (60,0,0,0,,,,,,,,,)
 (61,7120,1,108,<a href="tel:(313)%20743-4823" value="+13137434823" target="_blank">3137434823,0,0,"(7,61)",11,10505,32,111111111000000000000000000000000000000000010111010110000101010111010100,732568296)
 (62,6976,1,144,<a href="tel:(313)%20743-4824" value="+13137434824" target="_blank">3137434824,0,0,"(7,62)",11,10507,32,111111111100000000000000000000000000000010100100001001001010110000010100,674571301)
 (63,6864,1,108,<a href="tel:(313)%20743-4825" value="+13137434825" target="_blank">3137434825,0,0,"(7,63)",11,10505,32,111111111000000000000000000000000000000010011011011111011100001001010001,2319695577)
 (64,6720,1,144,<a href="tel:(313)%20743-4826" value="+13137434826" target="_blank">3137434826,0,0,"(7,64)",11,10507,32,111111111100000000000000000000000000000001000010011001111011100100101000,345892418)
 (65,6608,1,108,<a href="tel:(313)%20743-4827" value="+13137434827" target="_blank">3137434827,0,0,"(7,65)",11,10505,32,111111111000000000000000000000000000000001000001010100010010101011001010,1398049410)
 (66,6496,1,108,<a href="tel:(313)%20743-4828" value="+13137434828" target="_blank">3137434828,0,0,"(7,66)",11,10505,32,111111111000000000000000000000000000000000101000111100110101001000101010,1414188820)
 (67,6384,1,108,<a href="tel:(313)%20743-4829" value="+13137434829" target="_blank">3137434829,0,0,"(7,67)",11,10505,32,111111111000000000000000000000000000000000110101011101111011001110101001,2513301164)
 (68,0,0,0,,,,,,,,,)
 (69,0,0,0,,,,,,,,,)
 (70,0,0,0,,,,,,,,,)
(70 rows)


Thanks,
Jeremy

Any suggestions as to what I can do from this point?  Is it feasible that a server restart would fix this, or are you saying it would just mask the problem?  I can't reproduce it on a snapshot which is quite odd.

FWIW, maybe this is obvious, but the pages look identical on the streamers we have to the master.

Thanks,
Jeremy
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel


On Tue, Mar 20, 2018 at 11:19 AM, Jeremy Finzel <[hidden email]> wrote:


On Mon, Mar 19, 2018 at 3:55 PM, Jeremy Finzel <[hidden email]> wrote:


On Mon, Mar 19, 2018 at 3:53 PM, Peter Geoghegan <[hidden email]> wrote:
On Mon, Mar 19, 2018 at 1:01 PM, Jeremy Finzel <[hidden email]> wrote:
> SELECT heap_page_items(get_raw_page('pg_authid', 7));

Can you post this?

SELECT * FROM page_header(get_raw_page('pg_authid', 7));

--
Peter Geoghegan

@Peter :

staging=# SELECT * FROM page_header(get_raw_page('pg_authid', 7));
      lsn       | checksum | flags | lower | upper | special | pagesize | version | prune_xid
----------------+----------+-------+-------+-------+---------+----------+---------+-----------
 262B4/10FDC478 |        0 |     1 |   304 |  2224 |    8192 |     8192 |       4 |         0
(1 row)

@Andres :

This is from snapshot (on 9.5.12, but we didn't have the error either on a 9.5.11 snap):

                                                             heap_page_items
-----------------------------------------------------------------------------------------------------------------------------------------
 (1,0,0,0,,,,,,,,,)
 (2,0,0,0,,,,,,,,,)
 (3,0,0,0,,,,,,,,,)
 (4,0,0,0,,,,,,,,,)
 (5,0,0,0,,,,,,,,,)
 (6,8080,1,108,<a href="tel:(313)%20743-4815" value="+13137434815" target="_blank">3137434815,0,0,"(7,6)",11,10505,32,111111111000000000000000000000000000000001111110111100001000010010000001,<a href="tel:(216)%20642-7518" value="+12166427518" target="_blank">2166427518)
 (7,7936,1,144,<a href="tel:(313)%20743-4816" value="+13137434816" target="_blank">3137434816,0,0,"(7,7)",11,10507,32,111111111100000000000000000000000000000001110111000001110101101100001011,3504005358)
 (8,0,0,0,,,,,,,,,)
 (9,0,0,0,,,,,,,,,)
 (10,7792,1,144,<a href="tel:(313)%20743-4817" value="+13137434817" target="_blank">3137434817,0,0,"(7,10)",11,10507,32,111111111100000000000000000000000000000001010110001101000011011111101000,401353834)
 (11,7680,1,108,<a href="tel:(313)%20743-4818" value="+13137434818" target="_blank">3137434818,0,0,"(7,11)",11,10505,32,111111111000000000000000000000000000000001100011010010010001000001100001,<a href="tel:(224)%20870-8806" value="+12248708806" target="_blank">2248708806)
 (12,0,0,0,,,,,,,,,)
 (13,0,0,0,,,,,,,,,)
 (14,7568,1,108,<a href="tel:(313)%20743-4819" value="+13137434819" target="_blank">3137434819,0,0,"(7,14)",11,10505,32,111111111000000000000000000000000000000001110011110011011011100010100101,2770187214)
 (15,7456,1,108,<a href="tel:(313)%20743-4820" value="+13137434820" target="_blank">3137434820,0,0,"(7,15)",11,10505,32,111111111000000000000000000000000000000011110001010001010011110010100001,<a href="tel:(223)%20534-3503" value="+12235343503" target="_blank">2235343503)
 (16,0,0,0,,,,,,,,,)
 (17,0,0,0,,,,,,,,,)
 (18,0,0,0,,,,,,,,,)
 (19,0,0,0,,,,,,,,,)
 (20,0,0,0,,,,,,,,,)
 (21,0,0,0,,,,,,,,,)
 (22,0,0,0,,,,,,,,,)
 (23,0,0,0,,,,,,,,,)
 (24,0,0,0,,,,,,,,,)
 (25,0,0,0,,,,,,,,,)
 (26,0,0,0,,,,,,,,,)
 (27,0,0,0,,,,,,,,,)
 (28,0,0,0,,,,,,,,,)
 (29,0,0,0,,,,,,,,,)
 (30,0,0,0,,,,,,,,,)
 (31,0,0,0,,,,,,,,,)
 (32,0,0,0,,,,,,,,,)
 (33,0,0,0,,,,,,,,,)
 (34,0,0,0,,,,,,,,,)
 (35,0,0,0,,,,,,,,,)
 (36,0,0,0,,,,,,,,,)
 (37,0,0,0,,,,,,,,,)
 (38,0,0,0,,,,,,,,,)
 (39,0,0,0,,,,,,,,,)
 (40,0,0,0,,,,,,,,,)
 (41,0,0,0,,,,,,,,,)
 (42,0,0,0,,,,,,,,,)
 (43,0,0,0,,,,,,,,,)
 (44,0,0,0,,,,,,,,,)
 (45,0,0,0,,,,,,,,,)
 (46,0,0,0,,,,,,,,,)
 (47,0,0,0,,,,,,,,,)
 (48,0,0,0,,,,,,,,,)
 (49,0,0,0,,,,,,,,,)
 (50,0,0,0,,,,,,,,,)
 (51,0,0,0,,,,,,,,,)
 (52,7344,1,108,<a href="tel:(313)%20743-4821" value="+13137434821" target="_blank">3137434821,0,0,"(7,52)",11,10505,32,111111111000000000000000000000000000000011011011111110001010010101000001,2191859675)
 (53,7232,1,108,<a href="tel:(313)%20743-4822" value="+13137434822" target="_blank">3137434822,0,0,"(7,53)",11,10505,32,111111111000000000000000000000000000000001101011010111100110011011100100,661027542)
 (54,0,0,0,,,,,,,,,)
 (55,0,0,0,,,,,,,,,)
 (56,0,0,0,,,,,,,,,)
 (57,0,0,0,,,,,,,,,)
 (58,0,0,0,,,,,,,,,)
 (59,0,0,0,,,,,,,,,)
 (60,0,0,0,,,,,,,,,)
 (61,7120,1,108,<a href="tel:(313)%20743-4823" value="+13137434823" target="_blank">3137434823,0,0,"(7,61)",11,10505,32,111111111000000000000000000000000000000000010111010110000101010111010100,732568296)
 (62,6976,1,144,<a href="tel:(313)%20743-4824" value="+13137434824" target="_blank">3137434824,0,0,"(7,62)",11,10507,32,111111111100000000000000000000000000000010100100001001001010110000010100,674571301)
 (63,6864,1,108,<a href="tel:(313)%20743-4825" value="+13137434825" target="_blank">3137434825,0,0,"(7,63)",11,10505,32,111111111000000000000000000000000000000010011011011111011100001001010001,<a href="tel:(231)%20969-5577" value="+12319695577" target="_blank">2319695577)
 (64,6720,1,144,<a href="tel:(313)%20743-4826" value="+13137434826" target="_blank">3137434826,0,0,"(7,64)",11,10507,32,111111111100000000000000000000000000000001000010011001111011100100101000,345892418)
 (65,6608,1,108,<a href="tel:(313)%20743-4827" value="+13137434827" target="_blank">3137434827,0,0,"(7,65)",11,10505,32,111111111000000000000000000000000000000001000001010100010010101011001010,1398049410)
 (66,6496,1,108,<a href="tel:(313)%20743-4828" value="+13137434828" target="_blank">3137434828,0,0,"(7,66)",11,10505,32,111111111000000000000000000000000000000000101000111100110101001000101010,1414188820)
 (67,6384,1,108,<a href="tel:(313)%20743-4829" value="+13137434829" target="_blank">3137434829,0,0,"(7,67)",11,10505,32,111111111000000000000000000000000000000000110101011101111011001110101001,<a href="tel:(251)%20330-1164" value="+12513301164" target="_blank">2513301164)
 (68,0,0,0,,,,,,,,,)
 (69,0,0,0,,,,,,,,,)
 (70,0,0,0,,,,,,,,,)
(70 rows)


Thanks,
Jeremy

Any suggestions as to what I can do from this point?  Is it feasible that a server restart would fix this, or are you saying it would just mask the problem?  I can't reproduce it on a snapshot which is quite odd.

FWIW, maybe this is obvious, but the pages look identical on the streamers we have to the master.

Thanks,
Jeremy

A server restart and upgrade to 9.5.12 (at the same time), as expected, made the issue go away.  Still doesn't give us any answers as to what happened or if it would happen again!  Thanks for the feeback.

Jeremy
Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

Peter Geoghegan-4
On Wed, Mar 21, 2018 at 1:38 PM, Jeremy Finzel <[hidden email]> wrote:
> A server restart and upgrade to 9.5.12 (at the same time), as expected, made
> the issue go away.  Still doesn't give us any answers as to what happened or
> if it would happen again!  Thanks for the feeback.

You may still want to use amcheck to look for problems. The version on
Github works with 9.5, and there are Redhat and Debian pgdg packages.
See:

https://github.com/petergeoghegan/amcheck

The "heapallindexed" option will be of particular interest to you -
that option verifies that the table has matching rows for a target
index (in addition to testing the structure of a target B-Tree index
itself). This is probably the best general test for corruption that is
available. There is a fair chance that this will reveal new
information.

--
Peter Geoghegan

Reply | Threaded
Open this post in threaded view
|

Re: found xmin from before relfrozenxid on pg_catalog.pg_authid

jfinzel


On Wed, Mar 21, 2018 at 4:29 PM, Peter Geoghegan <[hidden email]> wrote:
On Wed, Mar 21, 2018 at 1:38 PM, Jeremy Finzel <[hidden email]> wrote:
> A server restart and upgrade to 9.5.12 (at the same time), as expected, made
> the issue go away.  Still doesn't give us any answers as to what happened or
> if it would happen again!  Thanks for the feeback.

You may still want to use amcheck to look for problems. The version on
Github works with 9.5, and there are Redhat and Debian pgdg packages.
See:

https://github.com/petergeoghegan/amcheck

The "heapallindexed" option will be of particular interest to you -
that option verifies that the table has matching rows for a target
index (in addition to testing the structure of a target B-Tree index
itself). This is probably the best general test for corruption that is
available. There is a fair chance that this will reveal new
information.

--
Peter Geoghegan

Thank you for the recommendation.  I ran both amcheck functions on all 4 indexes of those 2 tables with heapallindexed = true, but no issues were found.

Thanks,
Jeremy
123