Optimizing large data loads


John Wells
Hi guys,

We have a Java process that uses Hibernate to load approximately 14 GB of
data.  On a dual-proc 2.4 GHz Xeon with 2048 MB RAM, it's currently
taking over 13 hours to load (PostgreSQL 7.4.8).  We're flushing from
Hibernate every 50 records.

I've turned fsync to false in postgresql.conf, and we've turned autocommit
off in our code.  Is there anything else I could try to temporarily speed
up inserts?

Thanks very much for your help.

John


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

Re: Optimizing large data loads

Richard Huxton
John Wells wrote:

> Hi guys,
>
> We have a Java process that uses Hibernate to load approximately 14 GB of
> data.  On a dual-proc 2.4 GHz Xeon with 2048 MB RAM, it's currently
> taking over 13 hours to load (PostgreSQL 7.4.8).  We're flushing from
> Hibernate every 50 records.
>
> I've turned fsync to false in postgresql.conf, and we've turned autocommit
> off in our code.  Is there anything else I could try to temporarily speed
> up inserts?

You don't say what the limitations of Hibernate are. Usually you might
look to:
1. Use COPY, not INSERTs
2. If not, batch INSERTs into BEGIN/COMMIT transactions of, say, 100-1000
3. Turn fsync off
4. DROP/RESTORE constraints/triggers/indexes while you load your data
5. Increase sort_mem/work_mem in your postgresql.conf when recreating
indexes etc.
6. Use multiple processes to make sure the I/O is maxed out.

Any of those do-able?
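The batching in (2) is independent of Hibernate: accumulate rows and commit
once per batch rather than once per row (or per 50 rows). A rough sketch of
just the batch-boundary logic, with the actual JDBC/Hibernate calls left as
comments (class and method names here are illustrative, not from your code):

```java
public class BatchLoader {
    // Commit once every BATCH_SIZE records instead of every 50.
    static final int BATCH_SIZE = 1000;

    /** Counts the commits needed to load nRecords in batches of BATCH_SIZE. */
    static int loadAll(int nRecords) {
        int pending = 0;  // rows added since the last commit
        int commits = 0;
        for (int i = 0; i < nRecords; i++) {
            // real code would do stmt.addBatch(...) or session.save(...) here
            pending++;
            if (pending == BATCH_SIZE) {
                // real code: stmt.executeBatch(); conn.commit();
                commits++;
                pending = 0;
            }
        }
        if (pending > 0) {
            commits++;  // flush the final partial batch
        }
        return commits;
    }

    public static void main(String[] args) {
        // 10,500 records -> 10 full batches + 1 partial = 11 commits
        System.out.println(loadAll(10_500));
    }
}
```

It's the per-commit overhead you're amortizing, so going from 50 to 1000
per transaction is usually a noticeable win on its own.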

--
   Richard Huxton
   Archonet Ltd


Re: Optimizing large data loads

John Wells
Richard Huxton said:
> You don't say what the limitations of Hibernate are. Usually you might
> look to:
> 1. Use COPY not INSERTs

Not an option, unfortunately.

> 2. If not, block INSERTs into BEGIN/COMMIT transactions of say 100-1000

We're using 50 per commit...we can easily up this, I suppose.
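For what it's worth, Hibernate also has settings for this, something like
the following (values illustrative, not what we currently run):

```properties
# Group this many inserts into one JDBC batch
hibernate.jdbc.batch_size=100
# Skip second-level cache churn during the bulk load
hibernate.cache.use_second_level_cache=false
```

...plus a session.flush(); session.clear() at the same interval, so the
session doesn't hold onto every loaded entity for the whole 14 GB run.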

> 3. Turn fsync off

Done.

> 4. DROP/RESTORE constraints/triggers/indexes while you load your data

Hmmm...will have to think about this a bit...not a bad idea but not sure
how we can make it work in our situation.
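If we do try it, I gather the usual shape on 7.4 is something like the
following (index, column, and table names are placeholders):

```sql
-- before the load: drop indexes (and FK constraints/triggers similarly)
DROP INDEX my_table_col_idx;

-- ... bulk load runs here ...

-- after the load: rebuild with generous sort memory for this session only
-- (sort_mem is in KB on 7.4, so this is 256 MB)
SET sort_mem = 262144;
CREATE INDEX my_table_col_idx ON my_table (col);
ANALYZE my_table;
```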

> 5. Increase sort_mem/work_mem in your postgresql.conf when recreating
> indexes etc.
> 6. Use multiple processes to make sure the I/O is maxed out.

5 falls in line with 4.  6 is definitely doable.

Thanks for the suggestions!

John

