Fast insertion indexes: why no developments


Fast insertion indexes: why no developments

Leonardo Francalanci
Hi,


I don't see much interest in insert-efficient indexes. These are the ones I've found:

- LSM-tree (used by Cassandra and SQLite4?)
- Y-Tree (http://www.bossconsulting.com/oracle_dba/white_papers/DW%20in%20oracle/P23%20(ytree%20index%20structure%20for%20DWs).pdf )
- Fractal indexes (TokuDB, patented)

While I understand that b*trees are still the best compromise in insertion/search speed, disk size, concurrency, and more generally in OLTP workloads, they are useless when it comes to inserting random (not ordered) values into big tables (>50M rows).
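To make the contrast concrete, here is a toy sketch of the LSM approach (all names illustrative; not any real engine's code): inserts go to an in-memory memtable, which is periodically flushed as a sorted, immutable run, so the random index writes become sequential ones:

```python
# Toy LSM sketch: writes land in an in-memory "memtable"; when it fills,
# it is flushed as a sorted immutable run (a sequential write on disk).
# Lookups check the memtable first, then runs from newest to oldest.
import bisect

class ToyLSM:
    def __init__(self, memtable_limit=4):
        self.memtable = {}
        self.memtable_limit = memtable_limit
        self.runs = []  # sorted [(key, value), ...] runs, oldest first

    def insert(self, key, value):
        self.memtable[key] = value           # O(1), no random disk I/O
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        run = sorted(self.memtable.items())  # one sequential write
        self.runs.append(run)
        self.memtable = {}

    def lookup(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for run in reversed(self.runs):      # newest run wins
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None
```

The trade-off is visible even in the toy: insertion never touches more than the memtable, but a lookup may have to probe every run, which is why real LSM engines compact runs in the background.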

I would like to know if the lack of development in this area (not only in Postgresql, but in databases in general) is due to:

1) complex implementation
2) poor search performance
3) poor concurrency performance
4) not interesting for most users
5) something else???

I thought this was going to change due to the fast-insertion speed needs of "Social Applications", but TokuDB seems to be the only "successful" player in the area (I don't know how much of it is due to good marketing). Most other DB technologies claim faster insertion speed (MongoDB and the like...) but in the end they rely on the old b*tree + sharding instead of using different indexing mechanisms (with the exception of Cassandra).


Thank you in advance

Leonardo



--
Sent via pgsql-hackers mailing list ([hidden email])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Fast insertion indexes: why no developments

Craig Ringer-3
On 10/29/2013 03:53 PM, Leonardo Francalanci wrote:
> 5) something else???

Quite likely nobody has had the enthusiasm and interest to implement a
viable, quality implementation and stick with it long enough to get it
committed.

There are a great many good ideas for improvements to Pg that just don't
have the people and time behind them to make them happen.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Fast insertion indexes: why no developments

Tom Lane-2
Craig Ringer <[hidden email]> writes:
> On 10/29/2013 03:53 PM, Leonardo Francalanci wrote:
>> 5) something else???

> Quite likely nobody has had the enthusiasm and interest to implement a
> viable, quality implementation and stick with it long enough to get it
> committed.

> There are a great many good ideas for improvements to Pg that just don't
> have the people and time behind them to make them happen.

Before getting too excited about some new academic index type, it's worth
noting the sad state in which hash indexes have languished for years.
Nobody's bothered to add WAL support, let alone do any other real work
on them.  The non-btree index types that have been getting love are the
ones that offer the ability to index queries that btree can't.  I think
a new index type whose only benefit is the claim to be faster in a narrow
use-case is likely to end up like hash, not getting used enough to be
properly maintained.

                        regards, tom lane



Re: Fast insertion indexes: why no developments

Merlin Moncure-2
In reply to this post by Leonardo Francalanci
On Tue, Oct 29, 2013 at 2:53 AM, Leonardo Francalanci <[hidden email]> wrote:

> Hi,
>
>
> I don't see much interest in insert-efficient indexes. These are the ones I've found:
>
> - LSM-tree (used by Cassandra and SQLite4?)
> - Y-Tree (http://www.bossconsulting.com/oracle_dba/white_papers/DW%20in%20oracle/P23%20(ytree%20index%20structure%20for%20DWs).pdf )
> - Fractal indexes (TokuDB, patented)
>
> While I understand that b*trees are still the best compromise in insertion/search speed, disk size, concurrency, and more in general in OLTP workloads, they are useless when it comes to insertion in big data tables (>50M rows) of random values (not ordered values).
>
> I would like to know if the lack of development in this area (not only in Postgresql, but in databases in general) is due to:
>
> 1) complex implementation
> 2) poor search performance
> 3) poor concurrency performance
> 4) not interesting for most users
> 5) something else???
>
> I thought this was going to change due to the fast-insertion speeds needs of "Social Applications", but only TokuDB seems to be the only "successful" player in the area (I don't know how much of it is due to good marketing). Most other DB technology claims faster insertion speed (MongoDB and the like...) but in the end they rely on the old b*tree + sharding instead of using different indexing mechanisms (with the exception of Cassandra).

Another point to add: I don't really see btree as a barrier to
performance for most of the problems I face.  The real barriers to
database performance are storage, contention, and query planning.
Postgres btree indexes are pretty fast, and for stuff like bulk
insertions there are some optimization techniques available (such as
sharding or create index concurrently).

Stuff I'd like to see in terms of postgres indexing:
*) faster wal logged hash index
*) composite gist/gin
*) faster gist/gin (to the extent that it's possible).

merlin



Re: Fast insertion indexes: why no developments

Leonardo Francalanci
In reply to this post by Tom Lane-2
> Before getting too excited about some new academic index type, it's worth
> noting the sad state in which hash indexes have languished for years.
> Nobody's bothered to add WAL support, let alone do any other real work
> on them.  The non-btree index types that have been getting love are the
> ones that offer the ability to index queries that btree can't.  I think
> a new index type whose only benefit is the claim to be faster in a narrow
> use-case is likely to end up like hash, not getting used enough to be
> properly maintained.
>             regards, tom lane

Aren't hash indexes in a poor state because they are not faster than btree in every condition?




Re: Fast insertion indexes: why no developments

Alvaro Herrera-9
Leonardo Francalanci wrote:

> > Before getting too excited about some new academic index type, it's worth
> > noting the sad state in which hash indexes have languished for years.
> > Nobody's bothered to add WAL support, let alone do any other real work
> > on them.  The non-btree index types that have been getting love are the
> > ones that offer the ability to index queries that btree can't.  I think
> > a new index type whose only benefit is the claim to be faster in a narrow
> > use-case is likely to end up like hash, not getting used enough to be
> > properly maintained.
>
> Aren't hash indexes in a poor state because they are not faster than
> btree in every condition?

Chicken and egg.  Maybe they can be made faster than btrees (in some
situations) with enough tweaks, but because there are so many
outstanding problems, no one wants to do the huge amount of legwork to
even get to the point where such tweaks can be made in the first place.

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Fast insertion indexes: why no developments

Kenneth Marshall-3
In reply to this post by Leonardo Francalanci
On Tue, Oct 29, 2013 at 02:53:37PM +0000, Leonardo Francalanci wrote:

> > Before getting too excited about some new academic index type, it's worth
> > noting the sad state in which hash indexes have languished for years.
> > Nobody's bothered to add WAL support, let alone do any other real work
> > on them.  The non-btree index types that have been getting love are the
> > ones that offer the ability to index queries that btree can't.  I think
> > a new index type whose only benefit is the claim to be faster in a narrow
> > use-case is likely to end up like hash, not getting used enough to be
> > properly maintained.
> >             regards, tom lane
>
> Aren't hash indexes in a poor state because they are not faster than btree in every condition?
>

Hi Leonardo,

If there were ONE perfect index, better in every condition, Postgres would be
using it. As in everything else, each type has its strengths and weaknesses.
The hash index allows equality searches for very large key lengths using a
relatively very small index size. As has been mentioned before, we still do
not have WAL logging for hash indexes. But even so, for I/O-bound systems
hash indexes are twice as fast for searches as the btree equivalent.

Regards,
Ken



Re: Fast insertion indexes: why no developments

Tom Lane-2
In reply to this post by Leonardo Francalanci
Leonardo Francalanci <[hidden email]> writes:
>> Before getting too excited about some new academic index type, it's worth
>> noting the sad state in which hash indexes have languished for years.

> Aren't hash indexes in a poor state because they are not faster than btree in every condition?

They should, in theory, be faster than btrees -- O(1) not O(log N) page
fetches per lookup.  In practice they don't seem to be faster, and
nobody's bothered to find out exactly why.  Again, this isn't a terribly
encouraging precedent for implementing some other index type that's
supposed to (sometimes) be faster than btrees.

None of this is meant to discourage you from trying to write an index
type if you have the time and motivation to pursue it.  Just trying to
answer your question as to why nobody's done it already.

                        regards, tom lane



Re: Fast insertion indexes: why no developments

Leonardo Francalanci
In reply to this post by Merlin Moncure-2
> Another point to add: I don't really see btree as a barrier to
> performance for most of the problems I face.  The real barriers to
> database performance are storage, contention, and query planning.

Ehm, that's true for regular OLTP stuff, which I understand is what most (95%?) of people use/need. But if you try to insert rows into a 50M-row table with a couple of indexes, btrees just can't keep up.
Of course, you can't have it all: fast insertion into big tables, low contention, good query times...

> Postgres btreee indexes are pretty fast and for stuff like bulk
> insertions there are some optimization techniques available (such as
> sharding or create index concurrently).


At the moment I'm relying on partitioning + creating indexes in bulk on the "latest" table (the partitioning is based on time). But that means K*log(N) search times (where K is the number of partitions).
That's why I took a look at these different indexing mechanisms.
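In sketch form (a hypothetical helper, with each partition's index modeled as a plain sorted list), the K*log(N) cost comes from having to probe every partition, since a random key can live in any of them:

```python
# Hypothetical sketch of searching K time-based partitions: each
# partition has its own index, and a random key may be in any of them,
# so all K indexes must be probed, each probe costing O(log N).
import bisect

def search_partitions(partitions, key):
    """partitions: one sorted key list per time partition."""
    hits = 0
    for part in partitions:                 # K probes...
        i = bisect.bisect_left(part, key)   # ...each one O(log N)
        if i < len(part) and part[i] == key:
            hits += 1
    return hits
```

A single global btree would pay only one log(N) descent, which is exactly what the partitioning scheme gives up in exchange for cheap bulk index builds on the latest partition.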



Re: Fast insertion indexes: why no developments

Leonardo Francalanci
In reply to this post by Tom Lane-2
> They should, in theory, be faster than btrees -- O(1) not O(log N) page
> fetches per lookup.  In practice they don't seem to be faster, and
> nobody's bothered to find out exactly why.  Again, this isn't a terribly
> encouraging precedent for implementing some other index type that's
> supposed to (sometimes) be faster than btrees.

Yes, I understand. Which is also why I was curious to know if the "claims" those papers (and the databases using them) make were real...

Thank you everybody for your replies.

Leonardo



Re: Fast insertion indexes: why no developments

Peter Geoghegan-3
In reply to this post by Leonardo Francalanci
On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci <[hidden email]> wrote:
> I don't see much interest in insert-efficient indexes.

Presumably someone will get around to implementing a btree index
insertion buffer one day. I think that would be a particularly
compelling optimization for us, because we could avoid ever inserting
index tuples that are already dead when the deferred insertion
actually occurs.

--
Peter Geoghegan



Re: Fast insertion indexes: why no developments

Claudio Freire

On Tue, Oct 29, 2013 at 1:10 PM, Peter Geoghegan <[hidden email]> wrote:
> On Tue, Oct 29, 2013 at 7:53 AM, Leonardo Francalanci <[hidden email]> wrote:
>> I don't see much interest in insert-efficient indexes.
>
> Presumably someone will get around to implementing a btree index
> insertion buffer one day. I think that would be a particularly
> compelling optimization for us, because we could avoid ever inserting
> index tuples that are already dead when the deferred insertion
> actually occurs.


Well, that should be relatively easy the way web mining does it (with inverted indexes).

Have a small (presumably RAM-fitting) staging index where inserts take place, and regularly dump it into the "master index". The rationale is that, by the time you dump it, it will be more efficient to do many inserts at once, and there will be lots of dead tuples you won't even have to consider.

And when I say relatively easy, I mean it in the sense that it only needs careful juggling of existing data structures.
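A minimal sketch of that staging scheme (illustrative structures only, not a design for Postgres internals): inserts land in a small staging dict, and a periodic merge folds the surviving entries into the sorted master index in one sequential pass, dropping entries whose tuples died in the meantime:

```python
# Sketch of a staging index merged into a master index. "mark_dead"
# stands in for whatever vacuum-like logic notices dead tuples; names
# are made up for illustration.
import heapq

class StagedIndex:
    def __init__(self):
        self.master = []    # sorted list of keys (the "master index")
        self.staging = {}   # key -> dead flag (small, RAM-resident)

    def insert(self, key):
        self.staging[key] = False          # cheap in-memory insert

    def mark_dead(self, key):
        if key in self.staging:
            self.staging[key] = True       # never reaches the master index

    def merge(self):
        live = sorted(k for k, dead in self.staging.items() if not dead)
        # one sequential pass over both sorted inputs
        self.master = list(heapq.merge(self.master, live))
        self.staging = {}
```

The win is twofold, as described above: many inserts amortize into one sequential merge, and dead entries are filtered out before they ever touch the master index.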

Re: Fast insertion indexes: why no developments

Merlin Moncure-2
In reply to this post by Leonardo Francalanci
On Tue, Oct 29, 2013 at 10:49 AM, Leonardo Francalanci <[hidden email]> wrote:

>> Another point to add: I don't really see btree as a barrier to
>> performance for most of the problems I face.  The real barriers to
>> database performance are storage, contention, and query planning.
>
> Ehm, that's true for regular OLTP stuff, which I understand is what most (95%?) of people use/need. But if you try to insert rows into a 50M-row table with a couple of indexes, btrees just can't keep up.
> Of course, you can't have it all: fast insertion into big tables, low contention, good query times...
>
>> Postgres btree indexes are pretty fast and for stuff like bulk
>> insertions there are some optimization techniques available (such as
>> sharding or create index concurrently).
>
>
> At the moment I'm relying on partitioning + creating indexes in bulk on "latest" table (the partitioning is based on time). But that means K*log(N) search times (where K is the number of partitions).
> That's why I gave a look at these different indexing mechanisms.

I bet you've mis-diagnosed the problem.  Btrees don't have a problem
keeping up with 50m records; your problem is that after a certain
point your page cache can't keep up with the pseudo-random i/o
patterns and you start seeing faults to storage.  Disk storage is
several orders of magnitude slower than memory and thus performance
collapses.  This has nothing to do with the btree algorithm except to
the extent it affects i/o patterns.

With the advances in storage over the last several years, such that
commodity-priced SSD is available, I think a lot of the assumptions
underlying these trade-offs will change.

merlin



Re: Fast insertion indexes: why no developments

Leonardo Francalanci
> I bet you've mis-diagnosed the problem.  Btrees don't have a problem
> keeping up with 50m records; your problem is that after a certain
> point your page cache can't keep up with the pseudo-random i/o
> patterns and you start seeing faults to storage.
> [...]   This has nothing to do with the btree algorithm except to the
> extent it affects i/o patterns.


Of course; that's why those "different" index types aim to use more sequential than random writes.




Re: Fast insertion indexes: why no developments

Jeff Janes
In reply to this post by Tom Lane-2
On Tue, Oct 29, 2013 at 8:16 AM, Tom Lane <[hidden email]> wrote:
> Leonardo Francalanci <[hidden email]> writes:
>>> Before getting too excited about some new academic index type, it's worth
>>> noting the sad state in which hash indexes have languished for years.
>
>> Aren't hash indexes in a poor state because they are not faster than btree in every condition?
>
> They should, in theory, be faster than btrees -- O(1) not O(log N) page
> fetches per lookup.

However, all but one or two of those page fetches are almost surely cached, so if the problem is IO then the benefits are not likely to be seen.

 
> In practice they don't seem to be faster, and
> nobody's bothered to find out exactly why.

We know why, more or less.  Hash indexes use lmgr locks to protect against bucket splits conflicting with ordinary operations, and that destroys performance even in isolation, and destroys it even more in concurrent situations.  

Robert removed the lmgr lock on the meta page by using a retry loop with lightweight locks.  I've outlined how to remove the heavyweight lock on the bucket page as well, but it would require an on-disk change to the index so that each page knows how far the bucket it is in has been split, and it also might abuse the intention of lightweight locks a bit.  But I'm reluctant to put much time into that without there being any prospects of solving the problem of how to WAL bucket splits when buckets can have an unbounded number of overflow pages.

(Once each page knows its own split level, we could also remove the need for even a light-weight lock on the metapage for most operations by stuffing some of the key info from that into the relcache.)
 
Cheers,

Jeff
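For context, the bucket addressing that has to stay consistent across a split is, in sketch form, the standard linear-hashing arithmetic (illustrative code, not a transcription of the Postgres hash AM): a page that records the split level at which its bucket was last split can tell whether a given hash value still belongs to it after further splits.

```python
# Standard linear-hashing bucket addressing (illustrative sketch).
# level: number of completed doubling rounds; nbuckets may be
# mid-doubling, i.e. 2**level <= nbuckets <= 2**(level+1).
def bucket_for(hashval, nbuckets, level):
    b = hashval % (2 ** (level + 1))
    if b >= nbuckets:
        # that bucket hasn't been split yet in this round,
        # so the value still lives in its pre-split bucket
        b = hashval % (2 ** level)
    return b
```

Under this scheme only one bucket at a time is being split, which is what makes an "incompletely split" state a natural thing to represent on disk.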

Re: Fast insertion indexes: why no developments

Tom Lane-2
Jeff Janes <[hidden email]> writes:
> Robert removed the lmgr lock on the meta page by using a retry loop with
> lightweight locks.  I've outlined how to remove the heavyweight lock on the
> bucket page as well, but it would require an on-disk change to the index so
> that each page knows how far the bucket it is in has been split, and it
> also might abuse the intention of lightweight locks a bit.

FWIW, I don't think that on-disk changes to hash indexes would be a
showstopper problem at this point.  We could force people to reindex them
by means of changing the index version number on the metapage.  The
reindex downtime would be annoying for production situations --- but given
the lack of WAL support, who'd be using one in production anyway?

> But I'm
> reluctant to put much time into that without there being any prospects of
> solving the problem of how to WAL bucket splits when buckets can have an
> unbounded number of overflow pages.

Agreed, if we don't see how to implement WAL logging then it's improbable
they'll ever get to production quality :-(.

ISTM the issue here is that we'd need to acknowledge incompletely-split
buckets as a valid state, no?  But that could be a good thing anyway,
if it'd mean that we don't have to completely lock a bucket while
splitting it.  All the other index types have comparable situations
where a maintenance operation might be only partly done.

Not that I'm volunteering to put any time into this myself.

                        regards, tom lane



Re: Fast insertion indexes: why no developments

Simon Riggs
In reply to this post by Leonardo Francalanci
On 29 October 2013 07:53, Leonardo Francalanci <[hidden email]> wrote:

> I don't see much interest in insert-efficient indexes.

Hmm, you realise Alvaro is working on MinMax indexes in this release?
They are very efficient with regard to index inserts and specially
designed for use on large tables.
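The MinMax idea can be sketched roughly like this (all names illustrative): keep only a (min, max) summary per range of blocks, so an insert just widens a summary, and a scan skips every range whose summary cannot contain the key:

```python
# Rough sketch of MinMax-style indexing: one (min, max) summary per
# block range. Inserts are cheap (no tree descent, no page splits);
# scans use the summaries to skip block ranges.
class MinMaxIndex:
    def __init__(self, blocks_per_range=4):
        self.bpr = blocks_per_range
        self.summaries = {}   # range number -> [min, max]

    def insert(self, block, value):
        rng = block // self.bpr
        s = self.summaries.setdefault(rng, [value, value])
        s[0] = min(s[0], value)
        s[1] = max(s[1], value)

    def candidate_ranges(self, value):
        """Block ranges a scan must still visit; the rest are skipped."""
        return [r for r, (lo, hi) in self.summaries.items()
                if lo <= value <= hi]
```

The sketch also shows the limitation discussed later in the thread: if values are inserted in random order, every summary soon covers the whole value domain and nothing gets skipped, which is why MinMax shines on naturally clustered data such as timestamps.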

Prior work by Heikki on Grouped Item Tuples was a way of reducing the
size of indexes, yet still allowing uniqueness checks. That is
implemented in SQLServer already and is very useful.

Your comment about the lack of development in indexes seems counter to
the literature that I've seen. The main problem is people keep
patenting things, making it fairly difficult for everyone.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Fast insertion indexes: why no developments

Leonardo Francalanci
> Hmm, you realise Alvaro is working on MinMax indexes in this release?
> They are very efficient with regard to index inserts and specially
> designed for use on large tables.
>
> Prior work by Heikki on Grouped Item Tuples was a way of reducing the
> size of indexes, yet still allowing uniqueness checks. That is
> implemented in SQLServer already and is very useful.

Ah! Didn't know that!

> Your comment about the lack of development in indexes seems counter to
> the literature that I've seen. The main problem is people keep
> patenting things, making it fairly difficult for everyone.

Mmh, maybe I wasn't clear: I meant the lack of development (maybe I should have said "implementation"?), in PostgreSQL and in the other "SQL databases", of the fast-insertion indexes you can find in the literature.




Re: Fast insertion indexes: why no developments

Leonardo Francalanci
In reply to this post by Simon Riggs
> Hmm, you realise Alvaro is working on MinMax indexes in this release?
> They are very efficient with regard to index inserts and specially
> designed for use on large tables.
>
> Prior work by Heikki on Grouped Item Tuples was a way of reducing the
> size of indexes, yet still allowing uniqueness checks. That is
> implemented in SQLServer already and is very useful.


Reading the implementation of those features, I don't think they can help in the cases handled by the index types I mentioned (insertions of random values in big tables).




Re: Fast insertion indexes: why no developments

Simon Riggs
On 30 October 2013 07:55, Leonardo Francalanci <[hidden email]> wrote:

>> Hmm, you realise Alvaro is working on MinMax indexes in this release?
>> They are very efficient with regard to index inserts and specially
>> designed for use on large tables.
>>
>> Prior work by Heikki on Grouped Item Tuples was a way of reducing the
>> size of indexes, yet still allowing uniqueness checks. That is
>> implemented in SQLServer already and is very useful.
>
>
> Reading the implementation of those features, I don't think they can help in the cases handled by the index types I mentioned (insertions of random values in big tables).

Presumably the data you are inserting isn't actually random. Please
describe the use case you are considering in more detail and some view
on how frequent that is, with some examples. Once we understand the
use case and agree it is important, we might solve problems.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

