Question on placing database on a network attached storage device

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
7 messages Options
Reply | Threaded
Open this post in threaded view
|

Question on placing database on a network attached storage device

N/A-5
Hi,
 
I'm putting together a mission critical application which will use a Postgresql database, and I am trying to decide if I should place my Postgresql database on a network attached storage appliance or on local disks. 
 
If I place the database on the NAS, it will get "backed up" by the NAS device's "snap" type of images.  That means the NAS will make a copy of the delta, or difference, between the current version of the files on it, and the version from the last time the "snap" was done.  The "snap" images will be dumped to tape at some interval of time.
 
I'm concerned about using a "snap" of an open and active database to restore the database if the need arises.  Now, postgresql is pretty good about keeping checkpoints and recovering, but I'm still concerned about data integrity, (who knows what state any internal pointers are in when the "snap is taken), and as my application will need to archive groups of files that have to be kept together in order for the data in those files to be meaningful, (a group of files could be in the process of being archived to the database, the NAS takes a "snap" when only some of the files have been written in the database, and then the rest of the files are written in the database).  There is also the question of speed, Network attached storage vs. local disks, (network speed is pretty good 100 Meg and will be going to 1 Gig in the future).
 
I suppose I could place the database on the NAS, shutdown the database when a backup of the database is desired, have Postgresql create the database re-creation script, let the NAS take it's "snap", and then bring the database back up.
 
Any suggestions, experiences, or observations?
Reply | Threaded
Open this post in threaded view
|

Re: Question on placing database on a network attached storage

Chris Travers
N/A wrote:

> Hi,
>  
> I'm putting together a mission critical application which will use a
> Postgresql database, and I am trying to decide if I should place my
> Postgresql database on a network attached storage appliance or on
> local disks.

Wherever possible, I think that local storage is almost always best as
it is a less complex system (hence less can go wrong).  However, you
will probably get better answers if you tell us what sort of NAS you are
looking at using, how it will connect to your system, etc.

>  
> If I place the database on the NAS, it will get "backed up" by the NAS
> device's "snap" type of images.  That means the NAS will make a copy
> of the delta, or difference, between the current version of the files
> on it, and the version from the last time the "snap" was done.  The
> "snap" images will be dumped to tape at some interval of time.
>  

How internally consistant is this snap?  I.e. if the file is changed
while the snap is being taken, what happens?  This behavior may be
product-dependnant and has serious ramifications for your data.

> I'm concerned about using a "snap" of an open and active database to
> restore the database if the need arises.  Now, postgresql is pretty
> good about keeping checkpoints and recovering, but I'm still concerned
> about data integrity, (who knows what state any internal pointers are
> in when the "snap is taken), and as my application will need to
> archive groups of files that have to be kept together in order for the
> data in those files to be meaningful, (a group of files could be in
> the process of being archived to the database, the NAS takes a "snap"
> when only some of the files have been written in the database, and
> then the rest of the files are written in the database).  There is
> also the question of speed, Network attached storage vs. local disks,
> (network speed is pretty good 100 Meg and will be going to 1 Gig in
> the future).
Personally I think that there are better approaches than NAS for this,
but again I am unwilling to say "never use it."

>  
> I suppose I could place the database on the NAS, shutdown the database
> when a backup of the database is desired, have Postgresql create the
> database re-creation script, let the NAS take it's "snap", and then
> bring the database back up.
>  
> Any suggestions, experiences, or observations?

Best Wishes,
Chris Travers
Metatron Technology Consulting


---------------------------(end of broadcast)---------------------------
TIP 1: if posting/reading through Usenet, please send an appropriate
       subscribe-nomail command to [hidden email] so that your
       message can get through to the mailing list cleanly

chris.vcf (135 bytes) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Question on placing database on a network attached storage device

N/A-5
Chris,

The NAS is a Network Appliance 920c.  I understand that an individual file
while be "intact" during a snap, (not sure I believe it, but that's what I'm
told).  The NAS may be able to monitor the state of an individual file and
not "snap" it if it is on an "open" state or has a pending write, or some
such, but as the NAS has no knowledge of groups of files that need to exist
in a known state, (it deals on the file level), it cannot guarantee a
consistent state across a group of files.  (The database will be on the NAS
and will be NFS mounted filesystem to a Linux Redhat 3 running the
application.)
----- Original Message -----
From: "Chris Travers" <[hidden email]>
To: "N/A" <[hidden email]>
Cc: <[hidden email]>
Sent: Saturday, July 30, 2005 1:44 PM
Subject: Re: [ADMIN] Question on placing database on a network attached
storage device


> N/A wrote:
>
> > Hi,
> >
> > I'm putting together a mission critical application which will use a
> > Postgresql database, and I am trying to decide if I should place my
> > Postgresql database on a network attached storage appliance or on
> > local disks.
>
> Wherever possible, I think that local storage is almost always best as
> it is a less complex system (hence less can go wrong).  However, you
> will probably get better answers if you tell us what sort of NAS you are
> looking at using, how it will connect to your system, etc.
>
> >
> > If I place the database on the NAS, it will get "backed up" by the NAS
> > device's "snap" type of images.  That means the NAS will make a copy
> > of the delta, or difference, between the current version of the files
> > on it, and the version from the last time the "snap" was done.  The
> > "snap" images will be dumped to tape at some interval of time.
> >
>
> How internally consistant is this snap?  I.e. if the file is changed
> while the snap is being taken, what happens?  This behavior may be
> product-dependnant and has serious ramifications for your data.
>
> > I'm concerned about using a "snap" of an open and active database to
> > restore the database if the need arises.  Now, postgresql is pretty
> > good about keeping checkpoints and recovering, but I'm still concerned
> > about data integrity, (who knows what state any internal pointers are
> > in when the "snap is taken), and as my application will need to
> > archive groups of files that have to be kept together in order for the
> > data in those files to be meaningful, (a group of files could be in
> > the process of being archived to the database, the NAS takes a "snap"
> > when only some of the files have been written in the database, and
> > then the rest of the files are written in the database).  There is
> > also the question of speed, Network attached storage vs. local disks,
> > (network speed is pretty good 100 Meg and will be going to 1 Gig in
> > the future).
>
> Personally I think that there are better approaches than NAS for this,
> but again I am unwilling to say "never use it."
>
> >
> > I suppose I could place the database on the NAS, shutdown the database
> > when a backup of the database is desired, have Postgresql create the
> > database re-creation script, let the NAS take it's "snap", and then
> > bring the database back up.
> >
> > Any suggestions, experiences, or observations?
>
> Best Wishes,
> Chris Travers
> Metatron Technology Consulting
>

---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org
Reply | Threaded
Open this post in threaded view
|

Re: Question on placing database on a network attached storage

Chris Travers-2
N/A wrote:

>Chris,
>
>The NAS is a Network Appliance 920c.  I understand that an individual file
>while be "intact" during a snap, (not sure I believe it, but that's what I'm
>told).
>
Depending on the way that data is journalled, this may be a possibility
(istr XFS is able to do this).

>  The NAS may be able to monitor the state of an individual file and
>not "snap" it if it is on an "open" state or has a pending write, or some
>such, but as the NAS has no knowledge of groups of files that need to exist
>in a known state, (it deals on the file level), it cannot guarantee a
>consistent state across a group of files.
>
That is a serious concern.  Every table in PostgreSQL consists of at
least one physical file.  Transactions may touch several of these
files.  While it is not impossible to solve this problem (as I
mentioned, I think XFS has this capability), it is hardly trivial or
something you would normally expect.

Also I wonder what the impact of several Postmasters trying to access
data over NFS would be.

Best Wishes,
Chris Travers

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
Reply | Threaded
Open this post in threaded view
|

Re: Question on placing database on a network attached

Jeff Frost
In reply to this post by N/A-5
If you are using postgresql-8.0.x, then you can use PITR and base backups to
achieve a fairly consistent backup on a NAS which restores rather quickly.  If
the NAS supports hard links, then you can use the --link-dest option of rsync
to make the base backups quite fast and still retain more than one base backup
at a time.  The only problem with this solution is that you need at least as
much disk space as your PGDATA directory contains, plus enough space for the
WAL archives between base backups.

If the NAS does not support hard links, but does do snapshotting as you
indicate, then using rsync with the --partial flag against the previous
base backup would also be quite fast and bandwidth efficient.  The
snapshotting on the NAS would take care of the prior version base backup for
you and all you need do is rotate out the WAL archives according to how many
snapshots you care about.

Using PITR in this way has the added benefit of allowing you to restore the DB
to some point in the past if need be.

--
Jeff Frost, Owner <[hidden email]>
Frost Consulting, LLC http://www.frostconsultingllc.com/
Phone: 650-780-7908 FAX: 650-649-1954

On Sat, 30 Jul 2005, N/A wrote:

> The NAS is a Network Appliance 920c.  I understand that an individual file
> while be "intact" during a snap, (not sure I believe it, but that's what I'm
> told).  The NAS may be able to monitor the state of an individual file and
> not "snap" it if it is on an "open" state or has a pending write, or some
> such, but as the NAS has no knowledge of groups of files that need to exist
> in a known state, (it deals on the file level), it cannot guarantee a
> consistent state across a group of files.  (The database will be on the NAS
> and will be NFS mounted filesystem to a Linux Redhat 3 running the
> application.)


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend
Reply | Threaded
Open this post in threaded view
|

Re: Question on placing database on a network attached storage

Álvaro Herrera
In reply to this post by Chris Travers-2
On Sat, Jul 30, 2005 at 02:59:30PM -0700, Chris Travers wrote:

> Also I wonder what the impact of several Postmasters trying to access
> data over NFS would be.

That will lead to corruption very quickly.

--
Alvaro Herrera (<alvherre[a]alvh.no-ip.org>)
"Estoy de acuerdo contigo en que la verdad absoluta no existe...
El problema es que la mentira sí existe y tu estás mintiendo" (G. Lama)

---------------------------(end of broadcast)---------------------------
TIP 3: Have you checked our extensive FAQ?

               http://www.postgresql.org/docs/faq
Reply | Threaded
Open this post in threaded view
|

Re: Question on placing database on a network attached

Tomeh, Husam
In reply to this post by Jeff Frost
We're using similar technique but using SAN instead of NAS. The current
approach is shutting down the source Postgres database, clone or snap it
to the target server and then start the database back up. On Oracle, I
never had to shut down the source, instead I issue alter tablespace
begin backup, snap, and alter tablespace end backup to ensure a
consistent image. These being/end backup commands, were originally there
to place the database in state that you can perform a hot backup while
the database is up and running. I think PostgreSQL has similar commands
for hot backup - it will be interesting to know if anyone used them to
perform a snap or clone while the source is up and running and not
compromising database consistency.


--
 Husam  


-----Original Message-----
From: [hidden email]
[mailto:[hidden email]] On Behalf Of Chris Travers
Sent: Saturday, July 30, 2005 1:44 PM
To: N/A
Cc: [hidden email]
Subject: Re: [ADMIN] Question on placing database on a network attached
storage

N/A wrote:

> Hi,
>  
> I'm putting together a mission critical application which will use a
> Postgresql database, and I am trying to decide if I should place my
> Postgresql database on a network attached storage appliance or on
> local disks.

Wherever possible, I think that local storage is almost always best as
it is a less complex system (hence less can go wrong).  However, you
will probably get better answers if you tell us what sort of NAS you are

looking at using, how it will connect to your system, etc.

>  
> If I place the database on the NAS, it will get "backed up" by the NAS

> device's "snap" type of images.  That means the NAS will make a copy
> of the delta, or difference, between the current version of the files
> on it, and the version from the last time the "snap" was done.  The
> "snap" images will be dumped to tape at some interval of time.
>  

How internally consistant is this snap?  I.e. if the file is changed
while the snap is being taken, what happens?  This behavior may be
product-dependnant and has serious ramifications for your data.

> I'm concerned about using a "snap" of an open and active database to
> restore the database if the need arises.  Now, postgresql is pretty
> good about keeping checkpoints and recovering, but I'm still concerned

> about data integrity, (who knows what state any internal pointers are
> in when the "snap is taken), and as my application will need to
> archive groups of files that have to be kept together in order for the

> data in those files to be meaningful, (a group of files could be in
> the process of being archived to the database, the NAS takes a "snap"
> when only some of the files have been written in the database, and
> then the rest of the files are written in the database).  There is
> also the question of speed, Network attached storage vs. local disks,
> (network speed is pretty good 100 Meg and will be going to 1 Gig in
> the future).

Personally I think that there are better approaches than NAS for this,
but again I am unwilling to say "never use it."

>  
> I suppose I could place the database on the NAS, shutdown the database

> when a backup of the database is desired, have Postgresql create the
> database re-creation script, let the NAS take it's "snap", and then
> bring the database back up.
>  
> Any suggestions, experiences, or observations?

Best Wishes,
Chris Travers
Metatron Technology Consulting
**********************************************************************
This message contains confidential information intended only for the
use of the addressee(s) named above and may contain information that
is legally privileged.  If you are not the addressee, or the person
responsible for delivering it to the addressee, you are hereby
notified that reading, disseminating, distributing or copying this
message is strictly prohibited.  If you have received this message by
mistake, please immediately notify us by replying to the message and
delete the original message immediately thereafter.

Thank you.                                       FADLD Tag
**********************************************************************


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

               http://archives.postgresql.org