Re: [PATCHES] Cleaning up unreferenced table files

classic Classic list List threaded Threaded
2 messages Options
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHES] Cleaning up unreferenced table files

Bruce Momjian-2

Heikki, do you have any interest in completing your file checking patch
for inclusion in 8.1 by adding tablespace information and other fixes as
requested by Tom below?  The current patch version is at:

        ftp://candle.pha.pa.us/pub/postgresql/mypatches

called checkfiles*.

Anyone else want to complete it?

---------------------------------------------------------------------------

Tom Lane wrote:

> Bruce Momjian <[hidden email]> writes:
> > Applied.
>
> Now that I've had a chance to look at it, this patch is thoroughly
> broken.  Problems observed in a quick review:
>
> 1. It doesn't work at all for non-default tablespaces: it will
> claim that every file in such a tablespace is stale.  The fact
> that it does that rather than failing entirely is accidental.
> It tries to read the database's pg_class in the target tablespace
> whether it's there or not.  Because the system is still in recovery
> mode, the low-level routines allow the access to the nonexistent
> pg_class table to pass --- in fact they think they should create
> the file, so after it runs there's a bogus empty "1259" file in each
> such tablespace (which of course it complains about, too).  The code
> then proceeds to think that pg_class is empty so of course everything
> draws a warning.
>
> 2. It's not robust against stale subdirectories of a tablespace
> (ie, subdirs corresponding to a nonexistent database) --- again,
> it'll try to read a nonexistent pg_class.  Then it'll produce a
> bunch of off-target complaint messages.
>
> 3. It's assuming that relfilenode is unique database-wide, when no
> such assumption is safe.  We only have a guarantee that it's unique
> tablespace-wide.
>
> 4. It fails to examine table segment files (such as "nnn.1").  These
> should be complained of when the "nnn" doesn't match any hash entry.
>
> 5. It will load every relfilenode value in pg_class into the hashtable
> whether it's meaningful or not.  There should be a check on relkind.
>
> 6. I don't think relying on strtol to decide if a filename is entirely
> numeric is very safe.  Note all the extra defenses in pg_atoi against
> various platform-specific misbehaviors of strtol.  Personally I'd use a
> strspn test instead.
>
> 7. There are no checks for readdir failure (compare any other readdir
> loop in the backend).
>
> See also Simon Riggs' complaints that the circumstances under which it's
> done are pretty randomly selected.  (One particular thing that I think
> is a bad idea is to do this in a standalone backend.  Any sort of
> corruption in any db's pg_class would render it impossible to start up.)
>
> To fix the first three problems, and also avoid the performance problem
> of multiply rescanning a database's pg_class for each of its
> tablespaces, I would suggest that the hashtable entries be widened to
> RelFileNode structs (ie, db oid, tablespace oid, relfilenode oid).  Then
> there should be one iteration over pg_database to learn the OIDs and
> default tablespaces of each database; with that you can read pg_class
> from its correct location for each database and load all the entries
> into the hashtable.  Then you iterate through the tablespaces looking
> for stuff not present in the hashtable.  You might also want to build a
> list or hashtable of known database OIDs, so that you can recognize a
> stale subdirectory immediately and issue a direct complaint about it
> without even recursing into it.
>
> regards, tom lane
>

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  [hidden email]               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
Reply | Threaded
Open this post in threaded view
|

Re: [PATCHES] Cleaning up unreferenced table files

Heikki Linnakangas
No, not for now. Maybe for 8.2. And maybe as a contrib tool at first after
all.

- Heikki

On Mon, 27 Jun 2005, Bruce Momjian wrote:

>
> Heikki, do you have any interest in completing your file checking patch
> for inclusion in 8.1 by adding tablespace information and other fixes as
> requested by Tom below?  The current patch version is at:
>
>        ftp://candle.pha.pa.us/pub/postgresql/mypatches
>
> called checkfiles*.
>
> Anyone else want to complete it?
>
> ---------------------------------------------------------------------------
>
> Tom Lane wrote:
>> Bruce Momjian <[hidden email]> writes:
>>> Applied.
>>
>> Now that I've had a chance to look at it, this patch is thoroughly
>> broken.  Problems observed in a quick review:
>>
>> 1. It doesn't work at all for non-default tablespaces: it will
>> claim that every file in such a tablespace is stale.  The fact
>> that it does that rather than failing entirely is accidental.
>> It tries to read the database's pg_class in the target tablespace
>> whether it's there or not.  Because the system is still in recovery
>> mode, the low-level routines allow the access to the nonexistent
>> pg_class table to pass --- in fact they think they should create
>> the file, so after it runs there's a bogus empty "1259" file in each
>> such tablespace (which of course it complains about, too).  The code
>> then proceeds to think that pg_class is empty so of course everything
>> draws a warning.
>>
>> 2. It's not robust against stale subdirectories of a tablespace
>> (ie, subdirs corresponding to a nonexistent database) --- again,
>> it'll try to read a nonexistent pg_class.  Then it'll produce a
>> bunch of off-target complaint messages.
>>
>> 3. It's assuming that relfilenode is unique database-wide, when no
>> such assumption is safe.  We only have a guarantee that it's unique
>> tablespace-wide.
>>
>> 4. It fails to examine table segment files (such as "nnn.1").  These
>> should be complained of when the "nnn" doesn't match any hash entry.
>>
>> 5. It will load every relfilenode value in pg_class into the hashtable
>> whether it's meaningful or not.  There should be a check on relkind.
>>
>> 6. I don't think relying on strtol to decide if a filename is entirely
>> numeric is very safe.  Note all the extra defenses in pg_atoi against
>> various platform-specific misbehaviors of strtol.  Personally I'd use a
>> strspn test instead.
>>
>> 7. There are no checks for readdir failure (compare any other readdir
>> loop in the backend).
>>
>> See also Simon Riggs' complaints that the circumstances under which it's
>> done are pretty randomly selected.  (One particular thing that I think
>> is a bad idea is to do this in a standalone backend.  Any sort of
>> corruption in any db's pg_class would render it impossible to start up.)
>>
>> To fix the first three problems, and also avoid the performance problem
>> of multiply rescanning a database's pg_class for each of its
>> tablespaces, I would suggest that the hashtable entries be widened to
>> RelFileNode structs (ie, db oid, tablespace oid, relfilenode oid).  Then
>> there should be one iteration over pg_database to learn the OIDs and
>> default tablespaces of each database; with that you can read pg_class
>> from its correct location for each database and load all the entries
>> into the hashtable.  Then you iterate through the tablespaces looking
>> for stuff not present in the hashtable.  You might also want to build a
>> list or hashtable of known database OIDs, so that you can recognize a
>> stale subdirectory immediately and issue a direct complaint about it
>> without even recursing into it.
>>
>> regards, tom lane
>>
>
> --
>  Bruce Momjian                        |  http://candle.pha.pa.us
>  [hidden email]               |  (610) 359-1001
>  +  If your life is a hard drive,     |  13 Roberts Road
>  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: Have you checked our extensive FAQ?
>
>               http://www.postgresql.org/docs/faq
>

- Heikki

---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
    (send "unregister YourEmailAddressHere" to [hidden email])