Where to store Blobs?

classic Classic list List threaded Threaded
14 messages Options
Reply | Threaded
Open this post in threaded view
|

Where to store Blobs?

Thomas Güttler
Some days ago I asked "Where **not** to use PostgreSQL?" on this mailing list.

Now I realized: Nobody talked about Blobs.

I guess most people do not store Blobs in PostgresSQL.

Where do you store Blobs?

(In my case Blobs are PDF/image files with size up to 20 MByte.
I do not talk about very big blobs which are several hundret MByte)



--
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback: https://github.com/guettli/programming-guidelines

Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Adrian Klaver-4
On 3/13/19 7:28 AM, Thomas Güttler wrote:
> Some days ago I asked "Where **not** to use PostgreSQL?" on this mailing
> list.
>
> Now I realized: Nobody talked about Blobs.
>
> I guess most people do not store Blobs in PostgresSQL.
>
> Where do you store Blobs?

Probably the preferred method:
https://www.postgresql.org/docs/11/datatype-binary.html

Another method:
https://www.postgresql.org/docs/11/largeobjects.html

>
> (In my case Blobs are PDF/image files with size up to 20 MByte.
> I do not talk about very big blobs which are several hundret MByte)
>
>
>


--
Adrian Klaver
[hidden email]

Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Laurenz Albe
In reply to this post by Thomas Güttler
Thomas Güttler wrote:
> Now I realized: Nobody talked about Blobs.
>
> I guess most people do not store Blobs in PostgresSQL.
>
> Where do you store Blobs?
>
> (In my case Blobs are PDF/image files with size up to 20 MByte.
> I do not talk about very big blobs which are several hundret MByte)

Ideally outside the database, if they are many.
Large databases are harder to backup than large file systems.

If you keep 20MB binaries in the database, you'd use the "bytea" data type.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com


Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Ron-2
In reply to this post by Thomas Güttler
On 3/13/19 9:28 AM, Thomas Güttler wrote:

> Some days ago I asked "Where **not** to use PostgreSQL?" on this mailing
> list.
>
> Now I realized: Nobody talked about Blobs.
>
> I guess most people do not store Blobs in PostgresSQL.
>
> Where do you store Blobs?
>
> (In my case Blobs are PDF/image files with size up to 20 MByte.
> I do not talk about very big blobs which are several hundret MByte)

We store PDFs and TIFFs of dozens to a few hundred KB in the database as bytea.

--
Angular momentum makes the world go 'round.

Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

clmartin
In reply to this post by Laurenz Albe
I store them as bytea in the database despite the fact that there are benefits to storing them in the file system. The reason is that it is easier to secure access to the database than to secure both the database and provide secure access to the file system. 

Chuck Martin
Avondale Software


On Wed, Mar 13, 2019 at 10:34 AM Laurenz Albe <[hidden email]> wrote:
Thomas Güttler wrote:
> Now I realized: Nobody talked about Blobs.
>
> I guess most people do not store Blobs in PostgresSQL.
>
> Where do you store Blobs?
>
> (In my case Blobs are PDF/image files with size up to 20 MByte.
> I do not talk about very big blobs which are several hundret MByte)

Ideally outside the database, if they are many.
Large databases are harder to backup than large file systems.

If you keep 20MB binaries in the database, you'd use the "bytea" data type.

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com


Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Christopher Browne-3
In reply to this post by Thomas Güttler
On Wed, 13 Mar 2019 at 10:27, Thomas Güttler
<[hidden email]> wrote:
> I guess most people do not store Blobs in PostgresSQL.
>
> Where do you store Blobs?

Things have changed, but at one time, we were using RT as our
ticketing system (https://bestpractical.com/request-tracker) and it
would capture documents as database objects.

The table where RT stowed downloadable documents was one of the
largest tables in the database because of there being a few 50MB
copies of Access Databases and some many-MB spreadsheets in there.

It worked fine; no problems evident from it.  It was certainly
surprising to find such large documents there, and if people had
gotten in the habit of putting GBs of data into RT, that would have
probably led to some policy changes to prevent it, but unless you're
pretty actively trying to blow the system up, it just works.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Karsten Hilbert
On Wed, Mar 13, 2019 at 11:50:37AM -0400, Christopher Browne wrote:

>> I guess most people do not store Blobs in PostgresSQL.

- BYTEA puts practical limits on size

- LO storage happens inside the system (!) table

Nowadays, there are Foreign Data Wrappers which might
encapsulate files as if they lived inside the database.

Also, a combination of

        COPY TO FORMAT binary
        pg_read_binary_file()

and suitable plpgsql security definer functions might provide
for a Poor Man's binary file integrated external storage.

Karsten
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B

Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Achilleas Mantzios
In reply to this post by Thomas Güttler
On 13/3/19 4:28 μ.μ., Thomas Güttler wrote:
> Some days ago I asked "Where **not** to use PostgreSQL?" on this mailing list.
>
> Where do you store Blobs?
>
> (In my case Blobs are PDF/image files with size up to 20 MByte.
> I do not talk about very big blobs which are several hundret MByte)
>
bytea. Keeping those in the filesys and trying to keep filesys in sync with the db is a PITA. Also dont know what happens in the transactional dept (ACID) with lo_* large objects. In most cases bytea
are just fine.
>
>


--
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt


Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Hannes Erven
In reply to this post by Thomas Güttler
Hi,


Am 13.03.19 um 15:28 schrieb Thomas Güttler:
>
> Where do you store Blobs?


Within PostgreSQL, of course. The system I have in mind stores ZIP and
PDF files, usually a few MBs each; we're currently at a total of about
100 GB and there are no evident problems.

For this application, it is extremely important that saving/modifying
the binaries is part of a larger transaction that completes/fails
atomically.
Replication is almost real-time and access to the binaries is provided
with the same libraries (JDBC/Hibernate) as everything else.


Best regards,

        -hannes


Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Jamesie Pic
Make dump/restore of database data unnecessarily expensive in terms of time and space imho.
Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Benedict Holland
I store large models in the database because I need to have a historical data to compare to. That said, I could probably also automate a git repo but it will be just that much more work and git with binary files really doesn't make sense. Storage is really cheap and I assume the database stores bytesa types like they do text within a separate and partitioned section of the database.

Thanks,
~Ben 

On Wed, Mar 13, 2019 at 1:37 PM Jamesie Pic <[hidden email]> wrote:
Make dump/restore of database data unnecessarily expensive in terms of time and space imho.
Reply | Threaded
Open this post in threaded view
|

Re: Where to store Blobs?

Tim Cross
In reply to this post by Thomas Güttler

I don't think there is a suitable 'one size fits all' answer to this
question. A lot will depend on how you intend to use the blobs and what
sort of hardware architecture, especially storage systems, you have.

At first glance, sticking everything in the DB seems like an easy
choice. However, that can result in very large databases, which in turn
can lead to issues with respect to backup, replication etc. If all your
after is storage, then sometimes your better off using the file system
for the blobs and keeping the metadata in the db. It can potentially be
faster and easier to serve up blobs from the file system compared to the
db if that is the main use case, but if the blobs are more dynamic or
you use collections of blobs to build a master blob etc, the db has some
advantages.

If you really need database like functionality, given the relative
cheapness of storage and the wealth of options available, storing the
blobs in the database can have advantage. However, it will be important
to select the most appropriate datatype. What some people think of as a
'blob' is just an array of bytes to many DBs and as usual, you need to
make the decision as to what is the best storage representation for your
requirements, keeping in mind that the more general 'blob' like storage
type you choose often represents a loss in functionality but an increase
in flexibility wrt to what can be inserted over more precise data types,
which will be more restrictive about what can be inserted, but offer
more functionality regarding what you can do with it (at the db level).

Tim


Thomas Güttler <[hidden email]> writes:

> Some days ago I asked "Where **not** to use PostgreSQL?" on this mailing list.
>
> Now I realized: Nobody talked about Blobs.
>
> I guess most people do not store Blobs in PostgresSQL.
>
> Where do you store Blobs?
>
> (In my case Blobs are PDF/image files with size up to 20 MByte.
> I do not talk about very big blobs which are several hundret MByte)


--
Tim Cross

Reply | Threaded
Open this post in threaded view
|

s3 was: Where to store Blobs?

Thomas Güttler
In reply to this post by Thomas Güttler
I am curious. Why did nobody say:

  store blobs in a storage server (like s3) and only store the blob-id in PostgreSQL?

Regards,
   Thomas



Am 13.03.19 um 15:28 schrieb Thomas Güttler:

> Some days ago I asked "Where **not** to use PostgreSQL?" on this mailing list.
>
> Now I realized: Nobody talked about Blobs.
>
> I guess most people do not store Blobs in PostgresSQL.
>
> Where do you store Blobs?
>
> (In my case Blobs are PDF/image files with size up to 20 MByte.
> I do not talk about very big blobs which are several hundret MByte)
>
>
>

--
Thomas Guettler http://www.thomas-guettler.de/
I am looking for feedback: https://github.com/guettli/programming-guidelines

Reply | Threaded
Open this post in threaded view
|

Re: s3 was: Where to store Blobs?

Karsten Hilbert
On Mon, Mar 18, 2019 at 11:42:00AM +0100, Thomas Güttler wrote:

> I am curious. Why did nobody say:
>
>  store blobs in a storage server (like s3) and only store the blob-id in PostgreSQL?

That's been rehashed to the point of becoming a FAQ

        https://wiki.postgresql.org/wiki/BinaryFilesInDB

Karsten Hilbert
--
GPG  40BE 5B0E C98E 1713 AFA6  5BC0 3BEA AC80 7D4F C89B