Confused about how to enable backups (e.g. Write Ahead Log).

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
5 messages Options
Reply | Threaded
Open this post in threaded view
|

Confused about how to enable backups (e.g. Write Ahead Log).

mimble9
I have been attempting, without success, to create a Write Ahead Log (WAL)
backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS.

It might be that I am doing the wrong thing and my method will never work.
In which case: what is the best way to achieve my 'Goal' (below)?

Goal:

I want to backup a database every 15 minutes. Ideally, I want to backup
only if the database has changed (perhaps this isn't possible and backups
happen whether or not the database has changed)?

Attempts:

I edited /etc/postgresql/10/main/postgresql.conf as follows:

wal_level = replica
archive_mode = on
archive_command = 'cp %p /test/%f'
archive_timeout = 900

This should - in my opinion - every 15 minutes backup (cp) the database
(referenced by %p) to the /test/ directory using the filename (referenced
by %f) of the database.

I then:

/etc/init.d/postgresql stop
/etc/init.d/postgresql start

And checked that PostgreSQL is running with /etc/init.d/postgresql status:

postgresql.service - PostgreSQL RDBMS
   Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor
preset: enabled)
   Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago
  Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
 Main PID: 60534 (code=exited, status=0/SUCCESS)

Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset
devices.list: Operation not permitted
Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS...
Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS.

However, the /test/ directory did not show any backup files

I then read that I needed to use pg_basebackup to make an initial backup
(before editing postgresql.conf) so I tried pg_basebackup -D /etc/backup.
This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup
but I'm not sure how this relates to or effects (if it does) the Write
Ahead Log.

I still don't have any backups in /test/ (or anywhere).

Can anyone kindly explain how, exactly, I can achieve my 'Goal' whether
using a modification of this method or something different (but superior).
Many thanks!





Reply | Threaded
Open this post in threaded view
|

Re: Confused about how to enable backups (e.g. Write Ahead Log).

Stefan Fercot
Hi,

Indeed Continuous Archiving and Point-in-Time Recovery
(https://www.postgresql.org/docs/10/continuous-archiving.html) is the
best way to achieve your goal.

Have you checked the archiver process ? ('ps -ef |grep postgres |grep
archiver'). It should report success or fail.

You could also check that with the system view : 'SELECT * FROM
pg_stat_archiver;'.

Best option is to have a look at the PostgreSQL logs.

Is your /test directory writable by the postgres linux user ?

To help you, there's a few PITR tools that handles backups, restores and
even backup retention. Have a look at https://pgbackrest.org/ ;-)

Kind regards,

On 2/21/20 12:18 PM, [hidden email] wrote:

> I have been attempting, without success, to create a Write Ahead Log (WAL)
> backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS.
>
> It might be that I am doing the wrong thing and my method will never work.
> In which case: what is the best way to achieve my 'Goal' (below)?
>
> Goal:
>
> I want to backup a database every 15 minutes. Ideally, I want to backup
> only if the database has changed (perhaps this isn't possible and backups
> happen whether or not the database has changed)?
>
> Attempts:
>
> I edited /etc/postgresql/10/main/postgresql.conf as follows:
>
> wal_level = replica
> archive_mode = on
> archive_command = 'cp %p /test/%f'
> archive_timeout = 900
>
> This should - in my opinion - every 15 minutes backup (cp) the database
> (referenced by %p) to the /test/ directory using the filename (referenced
> by %f) of the database.
>
> I then:
>
> /etc/init.d/postgresql stop
> /etc/init.d/postgresql start
>
> And checked that PostgreSQL is running with /etc/init.d/postgresql status:
>
> postgresql.service - PostgreSQL RDBMS
>     Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor
> preset: enabled)
>     Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago
>    Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
>   Main PID: 60534 (code=exited, status=0/SUCCESS)
>
> Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset
> devices.list: Operation not permitted
> Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS...
> Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS.
>
> However, the /test/ directory did not show any backup files
>
> I then read that I needed to use pg_basebackup to make an initial backup
> (before editing postgresql.conf) so I tried pg_basebackup -D /etc/backup.
> This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup
> but I'm not sure how this relates to or effects (if it does) the Write
> Ahead Log.
>
> I still don't have any backups in /test/ (or anywhere).
>
> Can anyone kindly explain how, exactly, I can achieve my 'Goal' whether
> using a modification of this method or something different (but superior).
> Many thanks!
>
>
>
>
>
--
Stefan FERCOT
http://dalibo.com - http://dalibo.org



Reply | Threaded
Open this post in threaded view
|

Re: Confused about how to enable backups (e.g. Write Ahead Log).

Stephen Frost
In reply to this post by mimble9
Greetings,

* [hidden email] ([hidden email]) wrote:
> I have been attempting, without success, to create a Write Ahead Log (WAL)
> backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS.

There isn't really any such thing as a "WAL" backup.  The WAL is where
PG writes before it writes to the actual heap, but you don't typically
keep WAL since the start of the database and so you need to have a copy
of the heap files to start from and then you use WAL to replay up to a
specific point in time.

> It might be that I am doing the wrong thing and my method will never work.

The method you've outlined below won't work, no.

> In which case: what is the best way to achieve my 'Goal' (below)?

There's a few options.

> Goal:
>
> I want to backup a database every 15 minutes. Ideally, I want to backup
> only if the database has changed (perhaps this isn't possible and backups
> happen whether or not the database has changed)?

This is what's known as an 'incremental backup' and there's a few tools
out there which support that- but I suggest you first consider if your
goal is actually to backup every 15 minutes, or if your goal is to be
able to minimize data loss to, at most, 15 minutes, or if your goal is
to recover to any specific point in time.

The reason this matters is that PG supports Point-in-time-Recovery
(PITR), by using file-level backups combined with the write ahead log.
This means that you can restore a backup of PG and then play forward to
*any* point in time- 9:00, 9:15, 9:17, 10:20, whatever.  Taking more
frequent incremental backups will mean that there is less WAL to have to
go through to get to a specific point in time, but unless you have a
very write-heavy system, it won't take long to play through an hour of
WAL, so you might consider taking backups once an hour, or even less
frequently.  Reasonably common setups include daily incremental backups
and then regular differential and/or full backups.

> Attempts:
>
> I edited /etc/postgresql/10/main/postgresql.conf as follows:
>
> wal_level = replica
> archive_mode = on
> archive_command = 'cp %p /test/%f'
> archive_timeout = 900

I strongly recommend against trying to write your own backup software
for PG.  In particular, 'cp' isn't really safe to use as an
archive_command as it doesn't 'fsync' the WAL file after copying it,
meaning you might be missing WAL if the system crashes.

When using archive_command, hopefully with a proper tool, it can make
sense to have an archive_timeout set, which will cause PG to kick out a
WAL segment after that long, if it hasn't already, and there's been
changes since the last WAL segment.  This avoids the risk that a WAL
segment sits with data in it for a long and then the system crashes in
an unrecoverable way and those changes are then lost.  Note that this
primarily makes sense when you're actually archiving the WAL to a
different system.  PG, as part of each commit, will fsync the data to
the WAL and so as long as the filesystem recovers, no data will be lost.

> This should - in my opinion - every 15 minutes backup (cp) the database
> (referenced by %p) to the /test/ directory using the filename (referenced
> by %f) of the database.

No, that's not how archive_command works at all.  The archive_command
simply copies the *WAL* file (each of which are 16MB by default) to
another location (ideally, a remote one, where the file is fsync'd,
prior to returning to PG).  The database itself is *not* copied.

> I then:
>
> /etc/init.d/postgresql stop
> /etc/init.d/postgresql start
>
> And checked that PostgreSQL is running with /etc/init.d/postgresql status:
>
> postgresql.service - PostgreSQL RDBMS
>    Loaded: loaded (/lib/systemd/system/postgresql.service; enabled; vendor
> preset: enabled)
>    Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago
>   Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
>  Main PID: 60534 (code=exited, status=0/SUCCESS)
>
> Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset
> devices.list: Operation not permitted
> Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS...
> Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS.
>
> However, the /test/ directory did not show any backup files
You would just end up with WAL files in that directory, per the
discussion above regarding the purpose of archive_command.

> I then read that I needed to use pg_basebackup to make an initial backup
> (before editing postgresql.conf) so I tried pg_basebackup -D /etc/backup.
> This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup
> but I'm not sure how this relates to or effects (if it does) the Write
> Ahead Log.

pg_basebackup is one tool to create a backup, but it doesn't (currently,
at least) support incremental backups, so each backup is a full backup
of the system.  The WAL can be used to take that backup up to a specific
point in time.

> I still don't have any backups in /test/ (or anywhere).

Based on what you wrote above, the contents of '/etc/backup' should
actually be a backup of PG, with the necessary WAL to bring it up
consistently.  Note that it's absolutely essential that all of the WAL
generated during a backup be kept, otherwise the backup is not valid.  I
strongly recommend using a tool which ensures that all WAL from a backup
is verified to have been stored properly.

Thanks,

Stephen

signature.asc (836 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Confused about how to enable backups (e.g. Write Ahead Log).

mimble9
In reply to this post by Stefan Fercot
Thank you for your input,

>
> You could also check that with the system view : 'SELECT * FROM
> pg_stat_archiver;'.
>

5 | 000000010000000000000004 | 2020-02-21 01:29:56.77076+00 |            0
|                 |                  | 2020-02-21 00:31:12.898026+00


So there is some kind of archive but not one that is regularly updating.

> Is your /test directory writable by the postgres linux user ?

Permissions are: drwxr-xr-x 1 root     root        278 Feb 21 01:50 test

>
> To help you, there's a few PITR tools that handles backups, restores and
> even backup retention. Have a look at https://pgbackrest.org/ ;-)
>

I've had a look at the FAQ for this tool and it seems suitable. Have you
used it? I like the step-by-step instructions plus screenshots of the
results. And it uses the WAL.

My impression is that this is a user-friendly way for novices to backup
their databases hopefully without having the convoluted messes that I'm
currently having editing postgresql.conf, etc. Is that correct?


> Kind regards,
>
>
> On 2/21/20 12:18 PM, [hidden email] wrote:
>
>> I have been attempting, without success, to create a Write Ahead Log
>> (WAL)
>> backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS.
>>
>> It might be that I am doing the wrong thing and my method will never
>> work. In which case: what is the best way to achieve my 'Goal' (below)?
>>
>>
>> Goal:
>>
>>
>> I want to backup a database every 15 minutes. Ideally, I want to backup
>>  only if the database has changed (perhaps this isn't possible and
>> backups happen whether or not the database has changed)?
>>
>> Attempts:
>>
>>
>> I edited /etc/postgresql/10/main/postgresql.conf as follows:
>>
>>
>> wal_level = replica archive_mode = on archive_command = 'cp %p /test/%f'
>> archive_timeout = 900
>>
>> This should - in my opinion - every 15 minutes backup (cp) the database
>>  (referenced by %p) to the /test/ directory using the filename
>> (referenced
>> by %f) of the database.
>>
>> I then:
>>
>>
>> /etc/init.d/postgresql stop
>> /etc/init.d/postgresql start
>>
>>
>> And checked that PostgreSQL is running with /etc/init.d/postgresql
>> status:
>>
>>
>> postgresql.service - PostgreSQL RDBMS Loaded: loaded
>> (/lib/systemd/system/postgresql.service; enabled; vendor
>> preset: enabled)
>> Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago
>> Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
>> Main PID: 60534 (code=exited, status=0/SUCCESS)
>>
>>
>> Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset
>> devices.list: Operation not permitted
>> Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS...
>> Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS.
>>
>>
>> However, the /test/ directory did not show any backup files
>>
>>
>> I then read that I needed to use pg_basebackup to make an initial
>> backup (before editing postgresql.conf) so I tried pg_basebackup -D
>> /etc/backup.
>> This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup
>> but I'm not sure how this relates to or effects (if it does) the Write
>> Ahead Log.
>>
>>
>> I still don't have any backups in /test/ (or anywhere).
>>
>>
>> Can anyone kindly explain how, exactly, I can achieve my 'Goal' whether
>>  using a modification of this method or something different (but
>> superior). Many thanks!
>>
>>
>>
>>
>>
>>
> --
> Stefan FERCOT
> http://dalibo.com - http://dalibo.org
>
>
>
>
>




Reply | Threaded
Open this post in threaded view
|

Re: Confused about how to enable backups (e.g. Write Ahead Log).

Stefan Fercot
Hi,

On 2/22/20 12:07 AM, [hidden email] wrote:

> Thank you for your input,
>
>> You could also check that with the system view : 'SELECT * FROM
>> pg_stat_archiver;'.
>>
> 5 | 000000010000000000000004 | 2020-02-21 01:29:56.77076+00 |            0
> |                 |                  | 2020-02-21 00:31:12.898026+00
>
>
> So there is some kind of archive but not one that is regularly updating.
>
>> Is your /test directory writable by the postgres linux user ?
> Permissions are: drwxr-xr-x 1 root     root        278 Feb 21 01:50 test

That's a problem. /test has to be writable by the postgres linux user.

-> "sudo -u postgres touch /test/some_file" should work.

Below commands should solve that :

chown postgres: /test
chmod 750 /test

Of course, /test isn't really suitable in long term to store backups.
Don't let that stored locally, and give your directory a representative
name like pgbackups or something like that.

Furthermore, if the PostgreSQL cluster has a specific name or meaning
(like an application name), try not to store files in the main directory
of a mount point. It would be safer to use something like
/pgbackups/app-or-cluster-representative-name.

>> To help you, there's a few PITR tools that handles backups, restores and
>> even backup retention. Have a look at https://pgbackrest.org/ ;-)
>>
> I've had a look at the FAQ for this tool and it seems suitable. Have you
> used it? I like the step-by-step instructions plus screenshots of the
> results. And it uses the WAL.
>
> My impression is that this is a user-friendly way for novices to backup
> their databases hopefully without having the convoluted messes that I'm
> currently having editing postgresql.conf, etc. Is that correct?
>
In fact, as Stephen explained, WAL "backup" (archives) isn't enough.
Point-in-time recovery can be achieved by combining a file-system-level
copy and WAL archiving. Then, re-apply WAL archives on top of the copy
to restore the data to a specific target (time, transaction id,...).

pg_basebackup is one way to take a copy. Then indeed, you need to handle
WAL archives and pg_basebackup copies retention,...

pgBackRest is a tool that will help you for that. It will provide a
specific command for WAL archiving ( PostgreSQL archive_command) and
commands to take "backups" (file-system-level copies). Those backups can
be full, differential or even incremental. You'll still need to
configure postgresql.conf to configure WAL archiving.

pgBackRest will also help with the "restore" command.

(And yes, I already used it a lot. I'm like a "super fan" :
https://pgstef.github.io/pitr/)

>> Kind regards,
>>
>>
>> On 2/21/20 12:18 PM, [hidden email] wrote:
>>
>>> I have been attempting, without success, to create a Write Ahead Log
>>> (WAL)
>>> backup using PostgreSQL 10.0 under 18.04 Ubuntu VPS.
>>>
>>> It might be that I am doing the wrong thing and my method will never
>>> work. In which case: what is the best way to achieve my 'Goal' (below)?
>>>
>>>
>>> Goal:
>>>
>>>
>>> I want to backup a database every 15 minutes. Ideally, I want to backup
>>>   only if the database has changed (perhaps this isn't possible and
>>> backups happen whether or not the database has changed)?
>>>
>>> Attempts:
>>>
>>>
>>> I edited /etc/postgresql/10/main/postgresql.conf as follows:
>>>
>>>
>>> wal_level = replica archive_mode = on archive_command = 'cp %p /test/%f'
>>> archive_timeout = 900
>>>
>>> This should - in my opinion - every 15 minutes backup (cp) the database
>>>   (referenced by %p) to the /test/ directory using the filename
>>> (referenced
>>> by %f) of the database.
>>>
>>> I then:
>>>
>>>
>>> /etc/init.d/postgresql stop
>>> /etc/init.d/postgresql start
>>>
>>>
>>> And checked that PostgreSQL is running with /etc/init.d/postgresql
>>> status:
>>>
>>>
>>> postgresql.service - PostgreSQL RDBMS Loaded: loaded
>>> (/lib/systemd/system/postgresql.service; enabled; vendor
>>> preset: enabled)
>>> Active: active (exited) since Fri 2020-02-21 01:23:59 UTC; 9h ago
>>> Process: 60534 ExecStart=/bin/true (code=exited, status=0/SUCCESS)
>>> Main PID: 60534 (code=exited, status=0/SUCCESS)
>>>
>>>
>>> Feb 21 01:23:58 server systemd[1]: postgresql.service: Failed to reset
>>> devices.list: Operation not permitted
>>> Feb 21 01:23:58 server systemd[1]: Starting PostgreSQL RDBMS...
>>> Feb 21 01:23:59 server systemd[1]: Started PostgreSQL RDBMS.
>>>
>>>
>>> However, the /test/ directory did not show any backup files
>>>
>>>
>>> I then read that I needed to use pg_basebackup to make an initial
>>> backup (before editing postgresql.conf) so I tried pg_basebackup -D
>>> /etc/backup.
>>> This created a duplicate of /var/lib/postgresql/10/main/ in /etc/backup
>>> but I'm not sure how this relates to or effects (if it does) the Write
>>> Ahead Log.
>>>
>>>
>>> I still don't have any backups in /test/ (or anywhere).
>>>
>>>
>>> Can anyone kindly explain how, exactly, I can achieve my 'Goal' whether
>>>   using a modification of this method or something different (but
>>> superior). Many thanks!
>>>
--
Stefan FERCOT
http://dalibo.com - http://dalibo.org