Internal key management system

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
62 messages Options
1234
Reply | Threaded
Open this post in threaded view
|

Internal key management system

Masahiko Sawada-2
Hi,

I've started a new separate thread from the previous long thread[1]
for internal key management system to PostgreSQL. As I mentioned in
the previous mail[2], we've decided to step back and focus on only
internal key management system for PG13. The internal key management
system introduces the functionality to PostgreSQL that allows user to
encrypt and decrypt data without knowing the actual key. Besides, it
will be able to be integrated with transparent data encryption in the
future.

The basic idea is that PostgreSQL generates the master encryption key
which is further protected by the user-provided passphrase. The key
management system provides two functions to wrap and unwrap the secret
by the master encryption key. A user generates a secret key locally
and send it to PostgreSQL to wrap it using by pg_kmgr_wrap() and save
it somewhere. Then the user can use the encrypted secret key to
encrypt data and decrypt data by something like:

INSERT INTO tbl VALUES (pg_encrypt('user data', pg_kmgr_unwrap('xxxxx'));
SELECT pg_decrypt(secret_column, pg_kmgr_unwrap('xxxxx')) FROM tbl;

Where 'xxxxx' is the result of pg_kmgr_wrap function.

That way we can get something encrypted and decrypted without ever
knowing the actual key that was used to encrypt it.

I'm currently updating the patch and will submit it.

On Sun, 2 Feb 2020 at 00:37, Sehrope Sarkuni <[hidden email]> wrote:

>
> On Fri, Jan 31, 2020 at 1:21 AM Masahiko Sawada
> <[hidden email]> wrote:
> > On Thu, 30 Jan 2020 at 20:36, Sehrope Sarkuni <[hidden email]> wrote:
> > > That
> > > would allow the internal usage to have a fixed output length of
> > > LEN(IV) + LEN(HMAC) + LEN(DATA) = 16 + 32 + 64 = 112 bytes.
> >
> > Probably you meant LEN(DATA) is 32? DATA will be an encryption key for
> > AES256 (master key) internally generated.
>
> No it should be 64-bytes. That way we can have separate 32-byte
> encryption key (for AES256) and 32-byte MAC key (for HMAC-SHA256).
>
> While it's common to reuse the same 32-byte key for both AES256 and an
> HMAC-SHA256 and there aren't any known issues with doing so, when
> designing something from scratch it's more secure to use entirely
> separate keys.

The HMAC key you mentioned above is not the same as the HMAC key
derived from the user provided passphrase, right? That is, individual
key needs to have its IV and HMAC key. Given that the HMAC key used
for HMAC(IV || ENCRYPT(KEY, IV, DATA)) is the latter key (derived from
passphrase), what will be the former key used for?

>
> > > For the user facing piece, padding would enabled to support arbitrary
> > > input data lengths. That would make the output length grow by up to
> > > 16-bytes (rounding the data length up to the AES block size) plus one
> > > more byte if a version field is added.
> >
> > I think the length of padding also needs to be added to the output.
> > Anyway, in the first version the same methods of wrapping/unwrapping
> > key are used for both internal use and user facing function. And user
> > input key needs to be a multiple of 16 bytes value.
>
> A separate length field does not need to be added as the
> padding-enabled output will already include it at the end[1]. This
> would be handled automatically by the OpenSSL encryption / decryption
> operations (if it's enabled):
>

Yes, right.

Regards,

[1] https://www.postgresql.org/message-id/031401d3f41d%245c70ed90%241552c8b0%24%40lab.ntt.co.jp
[2] https://www.postgresql.org/message-id/CAD21AoD8QT0TWs3ma-aB821vwDKa1X519y1w3yrRKkAWjhZcrw%40mail.gmail.com

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Fabien COELHO-3

Hello Masahiko-san,

> I've started a new separate thread from the previous long thread[1]
> for internal key management system to PostgreSQL. As I mentioned in
> the previous mail[2], we've decided to step back and focus on only
> internal key management system for PG13. The internal key management
> system introduces the functionality to PostgreSQL that allows user to
> encrypt and decrypt data without knowing the actual key. Besides, it
> will be able to be integrated with transparent data encryption in the
> future.
>
> The basic idea is that PostgreSQL generates the master encryption key
> which is further protected by the user-provided passphrase. The key
> management system provides two functions to wrap and unwrap the secret
> by the master encryption key. A user generates a secret key locally

In understand that the secret key is sent in the clear for being encrypted
by a master key.

> and send it to PostgreSQL to wrap it using by pg_kmgr_wrap() and save
> it somewhere. Then the user can use the encrypted secret key to
> encrypt data and decrypt data by something like:
>
> INSERT INTO tbl VALUES (pg_encrypt('user data', pg_kmgr_unwrap('xxxxx'));
> SELECT pg_decrypt(secret_column, pg_kmgr_unwrap('xxxxx')) FROM tbl;
>
> Where 'xxxxx' is the result of pg_kmgr_wrap function.

I'm lost. If pg_{en,de}crypt and pg_kmgr_unwrap are functions, what
prevent users to:

   SELECT pg_kmgr_unwrap('xxxx');

so as to recover the key, somehow in contradiction to "allows user to
encrypt and decrypt data without knowing the actual key".

When dealing with cryptography and key management, I can only recommand
extreme caution.

--
Fabien.


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Chris Travers-7
In reply to this post by Masahiko Sawada-2
Hi;

So I actually have tried to do carefully encrypted data in Postgres via pg_crypto.  I think the key management problems in PostgreSQL are separable from table-level encryption. In particular the largest problem right now with having encrypted attributes is accidental key disclosure.  I think if we solve key management in a way that works for encrypted attributes first, we can then add encrypted tables later.

Additionally big headaches come with key rotation.  So here are my thoughts here.  This is a fairly big topic.  And I am not sure it can be done incrementally as much as that seems to doom big things in the community, but I think it could be done with a major push by a combination of big players, such as Second Quadrant.


On Sun, Feb 2, 2020 at 3:02 AM Masahiko Sawada <[hidden email]> wrote:
Hi,

I've started a new separate thread from the previous long thread[1]
for internal key management system to PostgreSQL. As I mentioned in
the previous mail[2], we've decided to step back and focus on only
internal key management system for PG13. The internal key management
system introduces the functionality to PostgreSQL that allows user to
encrypt and decrypt data without knowing the actual key. Besides, it
will be able to be integrated with transparent data encryption in the
future.

The basic idea is that PostgreSQL generates the master encryption key
which is further protected by the user-provided passphrase. The key
management system provides two functions to wrap and unwrap the secret
by the master encryption key. A user generates a secret key locally
and send it to PostgreSQL to wrap it using by pg_kmgr_wrap() and save
it somewhere. Then the user can use the encrypted secret key to
encrypt data and decrypt data by something like:

So my understanding is that  you would then need something like:

1.  Symmetric keys for actual data storage.  These could never be stored in the clear.
2.  User public/private keys to  use to access data storage keys.  The private key would need to be encrypted with passphrases.  And the server needs to access the private key.
3.  Symmetric secret keys to encrypt private keys
4.  A key management public/private key pair used to exchange the password for the private key.

INSERT INTO tbl VALUES (pg_encrypt('user data', pg_kmgr_unwrap('xxxxx'));
SELECT pg_decrypt(secret_column, pg_kmgr_unwrap('xxxxx')) FROM tbl;

If you get anything wrong you risk logs being useful to break tne encryption keys and make data access easy.  You don't want pg_kmgr_unwrap('xxxx') in your logs.

Here what I would suggest is a protocol extension to do the key exchange.  In other words, protocol messages to:
1.  Request data exchange server public key.
2.  Send server public-key encrypted symmetric key.  Make sure it is properly padded etc.

These are safe still only over SSL with sslmode=full_verify since otherwise you might be vulnerable to an MITM attack.

Then the keys should be stored in something like CacheMemoryContext and pg_encrypt()/pg_decrypt() would have access to them along with appropriate  catalogs needed to get to the storage keys themselves.


 

Where 'xxxxx' is the result of pg_kmgr_wrap function.

That way we can get something encrypted and decrypted without ever
knowing the actual key that was used to encrypt it.

I'm currently updating the patch and will submit it.

The above though is only a small part of the problem.  What we also need are a variety of new DDL commands specifically for key management.  This is needed because without commands of this sort, we cannot make absolutely sure that the commands are never logged.  These commands MUST not have keys logged  and therefore must have keys stripped prior to logging.  If I were designing this:

1.  Users on an SSL connection would be able to:  CREATE ENCRYPTION USER KEY PAIR WITH PASSWORD 'xyz' which would automatically rotate keys.
2.  Superusers could:  ALTER SYSTEM ROTATE ENCRYPTION EXCHANGE KEY PAIR;
3.  Add an ENCRYPTED attribute to columns and disallow indexing of ENCRYPTED columns.  This would store keys for the columns encrypted with user public keys where they have access.
4. Allow superusers to ALTER TABLE foo ALTER encrypted_column ROTATE KEYS; which would naturally require a full table rewrite.

Now, what that proposal does not provide is the use of encryption to enforce finer-grained access such as per-row keys but that's another topic and maybe something we don't need.

However I hope that explains what I see as a version of a minimum viable infrastructure here.

On Sun, 2 Feb 2020 at 00:37, Sehrope Sarkuni <[hidden email]> wrote:
>
> On Fri, Jan 31, 2020 at 1:21 AM Masahiko Sawada
> <[hidden email]> wrote:
> > On Thu, 30 Jan 2020 at 20:36, Sehrope Sarkuni <[hidden email]> wrote:
> > > That
> > > would allow the internal usage to have a fixed output length of
> > > LEN(IV) + LEN(HMAC) + LEN(DATA) = 16 + 32 + 64 = 112 bytes.
> >
> > Probably you meant LEN(DATA) is 32? DATA will be an encryption key for
> > AES256 (master key) internally generated.
>
> No it should be 64-bytes. That way we can have separate 32-byte
> encryption key (for AES256) and 32-byte MAC key (for HMAC-SHA256).
>
> While it's common to reuse the same 32-byte key for both AES256 and an
> HMAC-SHA256 and there aren't any known issues with doing so, when
> designing something from scratch it's more secure to use entirely
> separate keys.

The HMAC key you mentioned above is not the same as the HMAC key
derived from the user provided passphrase, right? That is, individual
key needs to have its IV and HMAC key. Given that the HMAC key used
for HMAC(IV || ENCRYPT(KEY, IV, DATA)) is the latter key (derived from
passphrase), what will be the former key used for?

>
> > > For the user facing piece, padding would enabled to support arbitrary
> > > input data lengths. That would make the output length grow by up to
> > > 16-bytes (rounding the data length up to the AES block size) plus one
> > > more byte if a version field is added.
> >
> > I think the length of padding also needs to be added to the output.
> > Anyway, in the first version the same methods of wrapping/unwrapping
> > key are used for both internal use and user facing function. And user
> > input key needs to be a multiple of 16 bytes value.
>
> A separate length field does not need to be added as the
> padding-enabled output will already include it at the end[1]. This
> would be handled automatically by the OpenSSL encryption / decryption
> operations (if it's enabled):
>

Yes, right.

Regards,

[1] https://www.postgresql.org/message-id/031401d3f41d%245c70ed90%241552c8b0%24%40lab.ntt.co.jp
[2] https://www.postgresql.org/message-id/CAD21AoD8QT0TWs3ma-aB821vwDKa1X519y1w3yrRKkAWjhZcrw%40mail.gmail.com

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




--
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com 
Saarbrücker Straße 37a, 10405 Berlin

Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Masahiko Sawada-2
In reply to this post by Fabien COELHO-3
On Sun, 2 Feb 2020 at 17:05, Fabien COELHO <[hidden email]> wrote:

>
>
> Hello Masahiko-san,
>
> > I've started a new separate thread from the previous long thread[1]
> > for internal key management system to PostgreSQL. As I mentioned in
> > the previous mail[2], we've decided to step back and focus on only
> > internal key management system for PG13. The internal key management
> > system introduces the functionality to PostgreSQL that allows user to
> > encrypt and decrypt data without knowing the actual key. Besides, it
> > will be able to be integrated with transparent data encryption in the
> > future.
> >
> > The basic idea is that PostgreSQL generates the master encryption key
> > which is further protected by the user-provided passphrase. The key
> > management system provides two functions to wrap and unwrap the secret
> > by the master encryption key. A user generates a secret key locally
>
> In understand that the secret key is sent in the clear for being encrypted
> by a master key.

Yeah we need to be careful about the secret key not being logged in
any logs such as server logs for example when log_statement = 'all'. I
guess that wrapping key doesn't often happen during service running
but does once at development phase. So it would not be a big problem
but probably we need to have something to deal with it.

>
> > and send it to PostgreSQL to wrap it using by pg_kmgr_wrap() and save
> > it somewhere. Then the user can use the encrypted secret key to
> > encrypt data and decrypt data by something like:
> >
> > INSERT INTO tbl VALUES (pg_encrypt('user data', pg_kmgr_unwrap('xxxxx'));
> > SELECT pg_decrypt(secret_column, pg_kmgr_unwrap('xxxxx')) FROM tbl;
> >
> > Where 'xxxxx' is the result of pg_kmgr_wrap function.
>
> I'm lost. If pg_{en,de}crypt and pg_kmgr_unwrap are functions, what
> prevent users to:
>
>    SELECT pg_kmgr_unwrap('xxxx');
>
> so as to recover the key, somehow in contradiction to "allows user to
> encrypt and decrypt data without knowing the actual key".

I might be missing your point but the above 'xxxx' is the wrapped key
wrapped by the master key stored in PostgreSQL server. So user doesn't
need to know the raw secret key to encrypt/decrypt the data. Even if a
malicious user gets 'xxxx' they cannot know the actual secret key
without the master key. pg_kmgr_wrap and pg_kmgr_unwrap are functions
and it's possible for user to know the raw secret key by using
pg_kmgr_unwrap(). The master key stored in PostgreSQL server never be
revealed.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Sehrope Sarkuni
In reply to this post by Masahiko Sawada-2
On Sat, Feb 1, 2020 at 7:02 PM Masahiko Sawada <[hidden email]> wrote:

> On Sun, 2 Feb 2020 at 00:37, Sehrope Sarkuni <[hidden email]> wrote:
> >
> > On Fri, Jan 31, 2020 at 1:21 AM Masahiko Sawada
> > <[hidden email]> wrote:
> > > On Thu, 30 Jan 2020 at 20:36, Sehrope Sarkuni <[hidden email]> wrote:
> > > > That
> > > > would allow the internal usage to have a fixed output length of
> > > > LEN(IV) + LEN(HMAC) + LEN(DATA) = 16 + 32 + 64 = 112 bytes.
> > >
> > > Probably you meant LEN(DATA) is 32? DATA will be an encryption key for
> > > AES256 (master key) internally generated.
> >
> > No it should be 64-bytes. That way we can have separate 32-byte
> > encryption key (for AES256) and 32-byte MAC key (for HMAC-SHA256).
> >
> > While it's common to reuse the same 32-byte key for both AES256 and an
> > HMAC-SHA256 and there aren't any known issues with doing so, when
> > designing something from scratch it's more secure to use entirely
> > separate keys.
>
> The HMAC key you mentioned above is not the same as the HMAC key
> derived from the user provided passphrase, right? That is, individual
> key needs to have its IV and HMAC key. Given that the HMAC key used
> for HMAC(IV || ENCRYPT(KEY, IV, DATA)) is the latter key (derived from
> passphrase), what will be the former key used for?

It's not derived from the passphrase, it's unlocked by the passphrase (along with the master encryption key). The server will have 64-bytes of random data, saved encrypted in pg_control, which can be treated as two separate 32-byte keys, let's call them master_encryption_key and master_mac_key. The 64-bytes is unlocked by decrypting it with the user passphrase at startup (which itself would be split into a pair of encryption and MAC keys to do the unlocking).

The wrap and unwrap operations would use both keys:

wrap(plain_text, encryption_key, mac_key) {
    // Generate random IV:
    iv = pg_strong_random(16);
    // Encrypt:
    cipher_text = encrypt_aes256_cbc(encryption_key, iv, plain_text);
    // Compute MAC on all inputs:
    mac = hmac_sha256(mac_key, encryption_key || iv || cipher_text);
    // Concat user facing pieces together
    wrapped = mac || iv || cipher_text;
    return wrapped;
}

unwrap(wrapped, encryption_key, mac_key) {
    // Split wrapped into its pieces:
    actual_mac = wrapped.slice(0, 32);
    iv = wrapped.slice(0 + 32, 16);
    cipher_text = wrapped.slice(0 + 32 + 16);
    // Compute MAC on all inputs:
    expected_mac = hmac_sha256(mac_key, encryption_key || iv || cipher_text);
    // Compare MAC vs value in wrapped:
    if (expected_mac != actual_mac) { return Error("MAC does not match"); }
    // MAC matches so decrypt:
    plain_text = decrypt_aes256_cbc(encryption_key, iv, cipher_text);    
    return plain_text;
}


Every input to the encryption operation, including the encryption key, must be included into the HMAC calculation. If you use the same key for both encryption and MAC that's not required as it's already part of the MAC process as the key. Using separate keys requires explicitly adding in the encryption key into the MAC input to ensure that it the correct key prior to decryption in the unwrap operation. Any additional parts of the wrapped output (ex: a "version" byte for the algos or padding choices) should also be included.

The wrap / unwrap above would be used with the encryption and mac keys derived from the user passphrase to unlock the master_encryption_key and master_mac_key from pg_control. Then those would be used by the higher level functions:

pg_kmgr_wrap(plain_text) {
    return wrap(plain_text, master_encryption_key, master_mac_key);
}

pg_kmgr_unwrap(wrapped) {
    return unwrap(wrapped, master_encryption_key, master_mac_key);
}


Regards,
-- Sehrope Sarkuni
Founder & CEO | JackDB, Inc. | https://www.jackdb.com/
Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Robert Haas
In reply to this post by Masahiko Sawada-2
On Mon, Feb 3, 2020 at 10:18 PM Masahiko Sawada
<[hidden email]> wrote:

> > I'm lost. If pg_{en,de}crypt and pg_kmgr_unwrap are functions, what
> > prevent users to:
> >
> >    SELECT pg_kmgr_unwrap('xxxx');
> >
> > so as to recover the key, somehow in contradiction to "allows user to
> > encrypt and decrypt data without knowing the actual key".
>
> I might be missing your point but the above 'xxxx' is the wrapped key
> wrapped by the master key stored in PostgreSQL server. So user doesn't
> need to know the raw secret key to encrypt/decrypt the data. Even if a
> malicious user gets 'xxxx' they cannot know the actual secret key
> without the master key. pg_kmgr_wrap and pg_kmgr_unwrap are functions
> and it's possible for user to know the raw secret key by using
> pg_kmgr_unwrap(). The master key stored in PostgreSQL server never be
> revealed.

I think I have the same confusion as Fabien. Isn't it bad if somebody
just runs pg_kmgr_unwrap() and records the return value? Now they've
stolen your encryption key, it seems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Cary Huang-2
Since the user does not need to know the master secret key used to cipher the data, I don't think we should expose "pg_kmgr_unwrap("xxxx")" SQL function to the user at all.
The wrapped key "xxxx" is stored in control data and it is possible to obtain by malicious user and steal the key by running SELECT pg_kmgr_unwrap("xxxx"). 
Even the user is righteous, it may not be very straightforward for that user to obtain the wrapped key "xxxx" to use in the unwrap function.

pg_kmgr_(un)wrap function is in discussion because encrypt and decrypt function require the master secret key as input argument. 
I would suggest using cluster passphrase as input instead of master key, so the user does not have to obtain the master key using pg_kmgr_unwrap("xxxx") in order to use the encrypt and decrypt function. 
The passphrase is in fact not stored anywhere in the system and we have to be careful that this passphrase is not shown in any activity log

so instead of:
------------------
INSERT INTO tbl VALUES (pg_encrypt('user data', pg_kmgr_unwrap('xxxxx'));
SELECT pg_decrypt(secret_column, pg_kmgr_unwrap('xxxxx')) FROM tbl;

it would become:
------------------
INSERT INTO tbl VALUES (pg_encrypt('user data', 'cluster_pass_phrase');
SELECT pg_decrypt(secret_column, 'cluster_pass_phrase') FROM tbl;

pg_decrypt will then have to:

1. derive the cluster pass phrase into KEK and HMAC key 
2. verify pass phrase by comparing MAC
3. unwrap the key - Sehrope suggests a good approach to make wrap/unwrap function more secure by adding MAC verification and randomed IV instead of default. I think it is good
4. decrypt the data
5. return

Using passphrase instead of master key to encrypt and decrypt function will also make front end tool integration simpler, as the front end tool also do not need to know the master key so it does not need to derive KEK or unwrap the key...etc. 
Not sure if you guys agree?

Thanks!

Cary Huang
-------------
HighGo Software Inc. (Canada)


---- On Thu, 06 Feb 2020 12:30:02 -0800 Robert Haas <[hidden email]> wrote ----

On Mon, Feb 3, 2020 at 10:18 PM Masahiko Sawada
<[hidden email]> wrote:

> > I'm lost. If pg_{en,de}crypt and pg_kmgr_unwrap are functions, what
> > prevent users to:
> >
> > SELECT pg_kmgr_unwrap('xxxx');
> >
> > so as to recover the key, somehow in contradiction to "allows user to
> > encrypt and decrypt data without knowing the actual key".
>
> I might be missing your point but the above 'xxxx' is the wrapped key
> wrapped by the master key stored in PostgreSQL server. So user doesn't
> need to know the raw secret key to encrypt/decrypt the data. Even if a
> malicious user gets 'xxxx' they cannot know the actual secret key
> without the master key. pg_kmgr_wrap and pg_kmgr_unwrap are functions
> and it's possible for user to know the raw secret key by using
> pg_kmgr_unwrap(). The master key stored in PostgreSQL server never be
> revealed.

I think I have the same confusion as Fabien. Isn't it bad if somebody
just runs pg_kmgr_unwrap() and records the return value? Now they've
stolen your encryption key, it seems.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Masahiko Sawada-2
In reply to this post by Robert Haas
On Fri, 7 Feb 2020 at 05:30, Robert Haas <[hidden email]> wrote:

>
> On Mon, Feb 3, 2020 at 10:18 PM Masahiko Sawada
> <[hidden email]> wrote:
> > > I'm lost. If pg_{en,de}crypt and pg_kmgr_unwrap are functions, what
> > > prevent users to:
> > >
> > >    SELECT pg_kmgr_unwrap('xxxx');
> > >
> > > so as to recover the key, somehow in contradiction to "allows user to
> > > encrypt and decrypt data without knowing the actual key".
> >
> > I might be missing your point but the above 'xxxx' is the wrapped key
> > wrapped by the master key stored in PostgreSQL server. So user doesn't
> > need to know the raw secret key to encrypt/decrypt the data. Even if a
> > malicious user gets 'xxxx' they cannot know the actual secret key
> > without the master key. pg_kmgr_wrap and pg_kmgr_unwrap are functions
> > and it's possible for user to know the raw secret key by using
> > pg_kmgr_unwrap(). The master key stored in PostgreSQL server never be
> > revealed.
>
> I think I have the same confusion as Fabien. Isn't it bad if somebody
> just runs pg_kmgr_unwrap() and records the return value? Now they've
> stolen your encryption key, it seems.

This feature protects data from disk thefts but cannot protect data
from attackers who are able to access PostgreSQL server. In this
design application side still is responsible for managing the wrapped
secret in order to protect it from attackers. This is the same as when
we use pgcrypto now. The difference is that data is safe even if
attackers steal the wrapped secret and the disk. The data cannot be
decrypted either without the passphrase which can be stored to other
key management systems or without accessing postgres server. IOW for
example, attackers can get the data if they get the wrapped secret
managed by application side then run pg_kmgr_unwrap() to get the
secret and then steal the disk.

Another idea we discussed is to internally integrate pgcrypto with the
key management system. That is, the key management system has one
master key and provides a C function to pass the master key to other
postgres modules. pgcrypto uses that function and provides new
encryption and decryption functions something like
pg_encrypt_with_key() and pg_decrypt_with_key(). Which
encrypts/decrypts the given data by the master key stored in database
cluster. That way user still doesn't have to know the encryption key
and we can protect data from disk thefts. But the down side would be
that we have only one encryption key and that we might need to change
pgcrypto much.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Andres Freund
Hi,

On 2020-02-07 11:18:29 +0900, Masahiko Sawada wrote:
> Another idea we discussed is to internally integrate pgcrypto with the
> key management system.

Perhaps this has already been discussed (I only briefly looked): I'd
strongly advise against having any new infrastrure depend on
pgcrypto. Its code quality imo is well below our standards and contains
serious red flags like very outdated copies of cryptography algorithm
implementations.  I think we should consider deprecating and removing
it, not expanding its use.  It certainly shouldn't be involved in any
potential disk encryption system at a later stage.

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Masahiko Sawada-2
On Fri, 7 Feb 2020 at 11:36, Andres Freund <[hidden email]> wrote:

>
> Hi,
>
> On 2020-02-07 11:18:29 +0900, Masahiko Sawada wrote:
> > Another idea we discussed is to internally integrate pgcrypto with the
> > key management system.
>
> Perhaps this has already been discussed (I only briefly looked): I'd
> strongly advise against having any new infrastrure depend on
> pgcrypto. Its code quality imo is well below our standards and contains
> serious red flags like very outdated copies of cryptography algorithm
> implementations.  I think we should consider deprecating and removing
> it, not expanding its use.  It certainly shouldn't be involved in any
> potential disk encryption system at a later stage.

Thank you for the advise.

Yeah I'm not going to use pgcrypto for transparent data encryption.
The KMS patch includes the new basic infrastructure for cryptographic
functions (mainly AES-CBC). I'm thinking we can expand that
infrastructure so that we can also use it for TDE purpose by
supporting new cryptographic functions such as AES-CTR. Anyway, I
agree to not have it depend on pgcrypto.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Andres Freund
Hi,

On 2020-02-07 20:44:31 +0900, Masahiko Sawada wrote:
> Yeah I'm not going to use pgcrypto for transparent data encryption.
> The KMS patch includes the new basic infrastructure for cryptographic
> functions (mainly AES-CBC). I'm thinking we can expand that
> infrastructure so that we can also use it for TDE purpose by
> supporting new cryptographic functions such as AES-CTR. Anyway, I
> agree to not have it depend on pgcrypto.

I thought for a minute, before checking the patch, that you were saying
above that the KMS patch includes its *own* implementation of
cryptographic functions.  I think it's pretty crucial that it continues
not to do that...

Greetings,

Andres Freund


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Masahiko Sawada-2
On Sat, 8 Feb 2020 at 03:24, Andres Freund <[hidden email]> wrote:

>
> Hi,
>
> On 2020-02-07 20:44:31 +0900, Masahiko Sawada wrote:
> > Yeah I'm not going to use pgcrypto for transparent data encryption.
> > The KMS patch includes the new basic infrastructure for cryptographic
> > functions (mainly AES-CBC). I'm thinking we can expand that
> > infrastructure so that we can also use it for TDE purpose by
> > supporting new cryptographic functions such as AES-CTR. Anyway, I
> > agree to not have it depend on pgcrypto.
>
> I thought for a minute, before checking the patch, that you were saying
> above that the KMS patch includes its *own* implementation of
> cryptographic functions.  I think it's pretty crucial that it continues
> not to do that...

I meant that we're going to use OpenSSL for AES encryption and
decryption independent of pgcrypto's openssl code, as the first step.
That is, KMS is available only when configured --with-openssl. And
hopefully we eventually merge these openssl code and have pgcrypto use
it, like when we introduced SCRAM.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Tomas Vondra-4
On Sat, Feb 08, 2020 at 02:48:54PM +0900, Masahiko Sawada wrote:

>On Sat, 8 Feb 2020 at 03:24, Andres Freund <[hidden email]> wrote:
>>
>> Hi,
>>
>> On 2020-02-07 20:44:31 +0900, Masahiko Sawada wrote:
>> > Yeah I'm not going to use pgcrypto for transparent data encryption.
>> > The KMS patch includes the new basic infrastructure for cryptographic
>> > functions (mainly AES-CBC). I'm thinking we can expand that
>> > infrastructure so that we can also use it for TDE purpose by
>> > supporting new cryptographic functions such as AES-CTR. Anyway, I
>> > agree to not have it depend on pgcrypto.
>>
>> I thought for a minute, before checking the patch, that you were saying
>> above that the KMS patch includes its *own* implementation of
>> cryptographic functions.  I think it's pretty crucial that it continues
>> not to do that...
>
>I meant that we're going to use OpenSSL for AES encryption and
>decryption independent of pgcrypto's openssl code, as the first step.
>That is, KMS is available only when configured --with-openssl. And
>hopefully we eventually merge these openssl code and have pgcrypto use
>it, like when we introduced SCRAM.
>

I don't think it's very likely we'll ever merge any openssl code into
our repository, e.g. because of licensing. But we already have AES
implementation in pgcrypto - why not to use that? I'm not saying we
should make this depend on pgcrypto, but maybe we should move the AES
library from pgcrypto into src/common or something like that.


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Tomas Vondra-4
In reply to this post by Masahiko Sawada-2
Hi,

I wonder if this is meant to support external KMS systems/services like
Vault (from HashiCorp) or CloudHSM (from AWS) or a hardware HSM. AFAICS
the current implementation does not allow storing keys in such external
systems, right? But it seems kinda reasonable to want to do that, when
already using the HSM for other parts of the system.

Now, I'm not saying the first version we commit has to support this, or
that it necessarily makes sense. But for example MariaDB seems to
support this [1].

[1] https://mariadb.com/kb/en/encryption-key-management/

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Andres Freund
In reply to this post by Tomas Vondra-4
Hi,

On February 8, 2020 7:08:26 AM PST, Tomas Vondra <[hidden email]> wrote:

>On Sat, Feb 08, 2020 at 02:48:54PM +0900, Masahiko Sawada wrote:
>>On Sat, 8 Feb 2020 at 03:24, Andres Freund <[hidden email]> wrote:
>>>
>>> Hi,
>>>
>>> On 2020-02-07 20:44:31 +0900, Masahiko Sawada wrote:
>>> > Yeah I'm not going to use pgcrypto for transparent data
>encryption.
>>> > The KMS patch includes the new basic infrastructure for
>cryptographic
>>> > functions (mainly AES-CBC). I'm thinking we can expand that
>>> > infrastructure so that we can also use it for TDE purpose by
>>> > supporting new cryptographic functions such as AES-CTR. Anyway, I
>>> > agree to not have it depend on pgcrypto.
>>>
>>> I thought for a minute, before checking the patch, that you were
>saying
>>> above that the KMS patch includes its *own* implementation of
>>> cryptographic functions.  I think it's pretty crucial that it
>continues
>>> not to do that...
>>
>>I meant that we're going to use OpenSSL for AES encryption and
>>decryption independent of pgcrypto's openssl code, as the first step.
>>That is, KMS is available only when configured --with-openssl. And
>>hopefully we eventually merge these openssl code and have pgcrypto use
>>it, like when we introduced SCRAM.
>>
>
>I don't think it's very likely we'll ever merge any openssl code into
>our repository, e.g. because of licensing. But we already have AES
>implementation in pgcrypto - why not to use that? I'm not saying we
>should make this depend on pgcrypto, but maybe we should move the AES
>library from pgcrypto into src/common or something like that.

The code uses functions exposed by openssl, it doesn't copy there code.

And no, I don't think we should copy the implemented from pgcrypto - it's not good. We should remove it entirely.

Andres

--
Sent from my Android device with K-9 Mail. Please excuse my brevity.


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Tomas Vondra-4
On Sat, Feb 08, 2020 at 07:47:24AM -0800, Andres Freund wrote:

>Hi,
>
>On February 8, 2020 7:08:26 AM PST, Tomas Vondra <[hidden email]> wrote:
>>On Sat, Feb 08, 2020 at 02:48:54PM +0900, Masahiko Sawada wrote:
>>>On Sat, 8 Feb 2020 at 03:24, Andres Freund <[hidden email]> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 2020-02-07 20:44:31 +0900, Masahiko Sawada wrote:
>>>> > Yeah I'm not going to use pgcrypto for transparent data
>>encryption.
>>>> > The KMS patch includes the new basic infrastructure for
>>cryptographic
>>>> > functions (mainly AES-CBC). I'm thinking we can expand that
>>>> > infrastructure so that we can also use it for TDE purpose by
>>>> > supporting new cryptographic functions such as AES-CTR. Anyway, I
>>>> > agree to not have it depend on pgcrypto.
>>>>
>>>> I thought for a minute, before checking the patch, that you were
>>saying
>>>> above that the KMS patch includes its *own* implementation of
>>>> cryptographic functions.  I think it's pretty crucial that it
>>continues
>>>> not to do that...
>>>
>>>I meant that we're going to use OpenSSL for AES encryption and
>>>decryption independent of pgcrypto's openssl code, as the first step.
>>>That is, KMS is available only when configured --with-openssl. And
>>>hopefully we eventually merge these openssl code and have pgcrypto use
>>>it, like when we introduced SCRAM.
>>>
>>
>>I don't think it's very likely we'll ever merge any openssl code into
>>our repository, e.g. because of licensing. But we already have AES
>>implementation in pgcrypto - why not to use that? I'm not saying we
>>should make this depend on pgcrypto, but maybe we should move the AES
>>library from pgcrypto into src/common or something like that.
>
>The code uses functions exposed by openssl, it doesn't copy there code.
>

Sure, I know the code is currently calling ooenssl functions. I was
responding to Masahiko-san's message that we might eventually merge this
openssl code into our tree.

>And no, I don't think we should copy the implemented from pgcrypto -
>it's not good. We should remove it entirely.

OK, no opinion on the quality of this implementation.


--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Tom Lane-2
Tomas Vondra <[hidden email]> writes:
> On Sat, Feb 08, 2020 at 07:47:24AM -0800, Andres Freund wrote:
>> On February 8, 2020 7:08:26 AM PST, Tomas Vondra <[hidden email]> wrote:
>>>> I don't think it's very likely we'll ever merge any openssl code into
>>>> our repository, e.g. because of licensing. But we already have AES
>>>> implementation in pgcrypto - why not to use that? I'm not saying we
>>>> should make this depend on pgcrypto, but maybe we should move the AES
>>>> library from pgcrypto into src/common or something like that.

>> The code uses functions exposed by openssl, it doesn't copy there code.

> Sure, I know the code is currently calling ooenssl functions. I was
> responding to Masahiko-san's message that we might eventually merge this
> openssl code into our tree.

No.  This absolutely, positively, will not happen.  There will never be
crypto functions in our core tree, because then there'd be no recourse for
people who want to use Postgres in countries with restrictions on crypto
software.  It's hard enough for them that we have such code in contrib
--- but at least they can remove pgcrypto and be legal.  If it's in
src/common then they're stuck.

For the same reason, I don't think that an "internal key management"
feature in the core code is ever going to be acceptable.  It has to
be an extension.  (But, as long as it's an extension, whether it's
bringing its own crypto or relying on some other extension for that
doesn't matter from the legal standpoint.)

                        regards, tom lane


Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Masahiko Sawada-2
In reply to this post by Tomas Vondra-4
On Sun, 9 Feb 2020 at 01:53, Tomas Vondra <[hidden email]> wrote:

>
> On Sat, Feb 08, 2020 at 07:47:24AM -0800, Andres Freund wrote:
> >Hi,
> >
> >On February 8, 2020 7:08:26 AM PST, Tomas Vondra <[hidden email]> wrote:
> >>On Sat, Feb 08, 2020 at 02:48:54PM +0900, Masahiko Sawada wrote:
> >>>On Sat, 8 Feb 2020 at 03:24, Andres Freund <[hidden email]> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> On 2020-02-07 20:44:31 +0900, Masahiko Sawada wrote:
> >>>> > Yeah I'm not going to use pgcrypto for transparent data
> >>encryption.
> >>>> > The KMS patch includes the new basic infrastructure for
> >>cryptographic
> >>>> > functions (mainly AES-CBC). I'm thinking we can expand that
> >>>> > infrastructure so that we can also use it for TDE purpose by
> >>>> > supporting new cryptographic functions such as AES-CTR. Anyway, I
> >>>> > agree to not have it depend on pgcrypto.
> >>>>
> >>>> I thought for a minute, before checking the patch, that you were
> >>saying
> >>>> above that the KMS patch includes its *own* implementation of
> >>>> cryptographic functions.  I think it's pretty crucial that it
> >>continues
> >>>> not to do that...
> >>>
> >>>I meant that we're going to use OpenSSL for AES encryption and
> >>>decryption independent of pgcrypto's openssl code, as the first step.
> >>>That is, KMS is available only when configured --with-openssl. And
> >>>hopefully we eventually merge these openssl code and have pgcrypto use
> >>>it, like when we introduced SCRAM.
> >>>
> >>
> >>I don't think it's very likely we'll ever merge any openssl code into
> >>our repository, e.g. because of licensing. But we already have AES
> >>implementation in pgcrypto - why not to use that? I'm not saying we
> >>should make this depend on pgcrypto, but maybe we should move the AES
> >>library from pgcrypto into src/common or something like that.
> >
> >The code uses functions exposed by openssl, it doesn't copy there code.
> >
>
> Sure, I know the code is currently calling ooenssl functions. I was
> responding to Masahiko-san's message that we might eventually merge this
> openssl code into our tree.

Sorry for confusing you. What I wanted to say is to write AES
encryption code in src/common using openssl library as the first step
apart from pgcrypto's openssl code, and then merge these two code
library into src/common as the next step. That is, it's moving the AES
library pgcrypto to src/common as you mentioned. IIRC when we
introduced SCRAM we moved sha2 library from pgcrypto to src/common.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Reply | Threaded
Open this post in threaded view
|

RE: Internal key management system

tsunakawa.takay@fujitsu.com
In reply to this post by Andres Freund
From: Andres Freund <[hidden email]>
> Perhaps this has already been discussed (I only briefly looked): I'd
> strongly advise against having any new infrastrure depend on
> pgcrypto. Its code quality imo is well below our standards and contains
> serious red flags like very outdated copies of cryptography algorithm
> implementations.  I think we should consider deprecating and removing
> it, not expanding its use.  It certainly shouldn't be involved in any
> potential disk encryption system at a later stage.

+1


Regards
Takayuki Tsunakawa



Reply | Threaded
Open this post in threaded view
|

Re: Internal key management system

Masahiko Sawada-2
In reply to this post by Sehrope Sarkuni
On Wed, 5 Feb 2020 at 22:28, Sehrope Sarkuni <[hidden email]> wrote:

>
> On Sat, Feb 1, 2020 at 7:02 PM Masahiko Sawada <[hidden email]> wrote:
> > On Sun, 2 Feb 2020 at 00:37, Sehrope Sarkuni <[hidden email]> wrote:
> > >
> > > On Fri, Jan 31, 2020 at 1:21 AM Masahiko Sawada
> > > <[hidden email]> wrote:
> > > > On Thu, 30 Jan 2020 at 20:36, Sehrope Sarkuni <[hidden email]> wrote:
> > > > > That
> > > > > would allow the internal usage to have a fixed output length of
> > > > > LEN(IV) + LEN(HMAC) + LEN(DATA) = 16 + 32 + 64 = 112 bytes.
> > > >
> > > > Probably you meant LEN(DATA) is 32? DATA will be an encryption key for
> > > > AES256 (master key) internally generated.
> > >
> > > No it should be 64-bytes. That way we can have separate 32-byte
> > > encryption key (for AES256) and 32-byte MAC key (for HMAC-SHA256).
> > >
> > > While it's common to reuse the same 32-byte key for both AES256 and an
> > > HMAC-SHA256 and there aren't any known issues with doing so, when
> > > designing something from scratch it's more secure to use entirely
> > > separate keys.
> >
> > The HMAC key you mentioned above is not the same as the HMAC key
> > derived from the user provided passphrase, right? That is, individual
> > key needs to have its IV and HMAC key. Given that the HMAC key used
> > for HMAC(IV || ENCRYPT(KEY, IV, DATA)) is the latter key (derived from
> > passphrase), what will be the former key used for?
>
> It's not derived from the passphrase, it's unlocked by the passphrase (along with the master encryption key). The server will have 64-bytes of random data, saved encrypted in pg_control, which can be treated as two separate 32-byte keys, let's call them master_encryption_key and master_mac_key. The 64-bytes is unlocked by decrypting it with the user passphrase at startup (which itself would be split into a pair of encryption and MAC keys to do the unlocking).
>
> The wrap and unwrap operations would use both keys:
>
> wrap(plain_text, encryption_key, mac_key) {
>     // Generate random IV:
>     iv = pg_strong_random(16);
>     // Encrypt:
>     cipher_text = encrypt_aes256_cbc(encryption_key, iv, plain_text);
>     // Compute MAC on all inputs:
>     mac = hmac_sha256(mac_key, encryption_key || iv || cipher_text);
>     // Concat user facing pieces together
>     wrapped = mac || iv || cipher_text;
>     return wrapped;
> }
>
> unwrap(wrapped, encryption_key, mac_key) {
>     // Split wrapped into its pieces:
>     actual_mac = wrapped.slice(0, 32);
>     iv = wrapped.slice(0 + 32, 16);
>     cipher_text = wrapped.slice(0 + 32 + 16);
>     // Compute MAC on all inputs:
>     expected_mac = hmac_sha256(mac_key, encryption_key || iv || cipher_text);
>     // Compare MAC vs value in wrapped:
>     if (expected_mac != actual_mac) { return Error("MAC does not match"); }
>     // MAC matches so decrypt:
>     plain_text = decrypt_aes256_cbc(encryption_key, iv, cipher_text);
>     return plain_text;
> }
>
> Every input to the encryption operation, including the encryption key, must be included into the HMAC calculation. If you use the same key for both encryption and MAC that's not required as it's already part of the MAC process as the key. Using separate keys requires explicitly adding in the encryption key into the MAC input to ensure that it the correct key prior to decryption in the unwrap operation. Any additional parts of the wrapped output (ex: a "version" byte for the algos or padding choices) should also be included.
>
> The wrap / unwrap above would be used with the encryption and mac keys derived from the user passphrase to unlock the master_encryption_key and master_mac_key from pg_control. Then those would be used by the higher level functions:
>
> pg_kmgr_wrap(plain_text) {
>     return wrap(plain_text, master_encryption_key, master_mac_key);
> }
>
> pg_kmgr_unwrap(wrapped) {
>     return unwrap(wrapped, master_encryption_key, master_mac_key);
> }
Thank you for explaining the details. I had missed something.

Attached updated patch incorporated all comments I got so far. The changes are:

* Renamed data_encryption_cipher to key_management_cipher
* Renamed pg_kmgr_wrap and pg_kmgr_unwrap to pg_wrap_key and pg_unwrap_key
* Changed wrap and unwrap procedure based on the comments
* Removed the restriction of requiring the input key being a multiple
of 16 bytes.
* Created a context dedicated to wrap and unwrap data

Documentation and regression tests are still missing.

Regarding key rotation, currently we allow online key rotation by
doing pg_rotate_encryption_key after changing
cluster_passphrase_command and loading. But if the server crashed
during key rotation it might require the old passphrase in spite of
the passphrase command in postgresql.conf having been changed. We need
to deal with it but I'm not sure the best approach. Possibly having a
new frontend tool that changes the key offline would be a safe
approach.

Regards,


--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

kms_v3.patch (70K) Download Attachment
1234