Add A Glossary

Corey Huinker
Attached is a v1 patch to add a Glossary to the appendix of our current documentation.

I believe that our documentation needs a glossary for a few reasons:

1. It's hard to ask for help if you don't know the proper terminology of the problem you're having.

2. Readers who are new to databases may not understand a few of the terms that are used casually both in the documentation and in forums. This helps to make our documentation a bit more useful as a teaching tool.

3. Readers whose primary language is not English may struggle to find the correct search terms, and this glossary may help them grasp that a given term has a usage in databases that is different from common English usage.

3b. If we are not able to find the resources to translate all of the documentation into a given language, translating the glossary page would be a good first step.

4. The glossary would be web-searchable, and draw viewers to the official documentation.

5. Adding link anchors to each term would make them citable, which is useful in forum conversations.


A few notes about this patch:

1. It's obviously incomplete. There are more terms, a lot more, to add.

2. The individual definitions supplied are off-the-cuff, and should be thoroughly reviewed.

3. The definitions as a whole should be reviewed by an actual tech writer (one was initially involved but had to step back due to prior commitments), and the definitions should be normalized in terms of voice, tone, audience, etc.

4. My understanding of DocBook is not strong. The glossary vs glosslist tag issue is a bit confusing to me, and I'm not sure if the glossary tag is even appropriate for our needs.

5. I've made no effort at making each term an anchor, nor have I done any CSS styling at all.

6. I'm not quite sure how to handle terms that have different definitions in different contexts. Should that be two glossdefs following one glossterm, or two separate def/term pairs?
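
For illustration only, the two shapes asked about in note 6 could be sketched as below. This is a hedged sketch, not content from the patch: the term "Example" and both definition texts are placeholders. As far as I know, the DocBook content model permits more than one glossdef after a glossterm, but the check here only confirms that both fragments are well-formed XML; it does not validate them against the DocBook DTD.

```python
import xml.etree.ElementTree as ET

# Option A: one glossentry whose glossterm is followed by two glossdefs.
# "Example" and the definition texts are placeholders, not patch content.
OPTION_A = """
<glossary>
 <glossentry>
  <glossterm>Example</glossterm>
  <glossdef><para>Meaning in the first context.</para></glossdef>
  <glossdef><para>Meaning in the second context.</para></glossdef>
 </glossentry>
</glossary>
"""

# Option B: two separate term/definition pairs for the same word.
OPTION_B = """
<glossary>
 <glossentry>
  <glossterm>Example</glossterm>
  <glossdef><para>Meaning in the first context.</para></glossdef>
 </glossentry>
 <glossentry>
  <glossterm>Example</glossterm>
  <glossdef><para>Meaning in the second context.</para></glossdef>
 </glossentry>
</glossary>
"""


def summarize(doc):
    """Return (term, number_of_definitions) for each glossentry."""
    root = ET.fromstring(doc)
    return [(entry.findtext("glossterm"), len(entry.findall("glossdef")))
            for entry in root.iter("glossentry")]


print(summarize(OPTION_A))
print(summarize(OPTION_B))
```

Option A keeps a single entry (and so a single future anchor) per word, while Option B would need a distinct id for each entry if anchors are added later.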

Please review and share your thoughts.

Attachment: 0001-add-glossary-page-with-sample-terms-and-definitions.patch (28K)

Re: Add A Glossary

Fabien COELHO-3

Hello Corey,

My 0.02€:

On principle, I'm fine with having a glossary, i.e. word definitions,
which are expected to be rather stable in the long run.

I'm wondering whether the effort would not be made redundant by other
on-line effort such as wikipedia, wiktionary, stackoverflow, standards,
whatever.

When explaining something, the teacher I am usually provides some level of
example. This may or may not be appropriate there.

ISTM that there should be pointers to relevant sections in the
documentation, for instance "Analytics" provided definition suggests
pointing to windowing functions.

There is significant redundancy involved, because a lot of terms would be
defined in other sections anyway.

There should be cross references, eg "Column" definition talks about
Attribute, Table & View, which should be linked to.

I'd consider making SQL keywords uppercase.

Developing that is a significant undertaking. Do we have the available
energy?

Patch generates a warning on "git apply".

  sh> git apply ...
  ... terms-and-definitions.patch:159: tab in indent. [...]
  warning: 1 line adds whitespace errors.

"Record" def has a nested <para> for some unclear reason.

Basically, the drafted definitions look pretty clear and well written to
the non-native English speaker I am.

On Sun, 13 Oct 2019, Corey Huinker wrote:

> Date: Sun, 13 Oct 2019 16:52:05 -0400
> From: Corey Huinker <[hidden email]>
> To: [hidden email]
> Subject: Add A Glossary
>
> Attached is a v1 patch to add a Glossary to the appendix of our current
> documentation.
>
> I believe that our documentation needs a glossary for a few reasons:
>
> 1. It's hard to ask for help if you don't know the proper terminology of
> the problem you're having.
>
> 2. Readers who are new to databases may not understand a few of the terms
> that are used casually both in the documentation and in forums. This helps
> to make our documentation a bit more useful as a teaching tool.
>
> 3. Readers whose primary language is not English may struggle to find the
> correct search terms, and this glossary may help them grasp that a given
> term has a usage in databases that is different from common English usage.
>
> 3b. If we are not able to find the resources to translate all of the
> documentation into a given language, translating the glossary page would be
> a good first step.
>
> 4. The glossary would be web-searchable, and draw viewers to the official
> documentation.
>
> 5. adding link anchors to each term would make them cite-able, useful in
> forum conversations.
>
>
> A few notes about this patch:
>
> 1. It's obviously incomplete. There are more terms, a lot more, to add.
>
> 2. The individual definitions supplied are off-the-cuff, and should be
> thoroughly reviewed.
>
> 3. The definitions as a whole should be reviewed by an actual tech writer
> (one was initially involved but had to step back due to prior commitments),
> and the definitions should be normalized in terms of voice, tone, audience,
> etc.
>
> 4. My understanding of DocBook is not strong. The glossary vs glosslist tag
> issue is a bit confusing to me, and I'm not sure if the glossary tag is
> even appropriate for our needs.
>
> 5. I've made no effort at making each term an anchor, nor have I done any
> CSS styling at all.
>
> 6. I'm not quite sure how to handle terms that have different definitions
> in different contexts. Should that be two glossdefs following one
> glossterm, or two separate def/term pairs?
>
> Please review and share your thoughts.
>
--
Fabien Coelho - CRI, MINES ParisTech

Re: Add A Glossary

Michael Paquier-2
On Sat, Nov 09, 2019 at 09:19:16AM +0100, Fabien COELHO wrote:
> On principle, I'm fine with having a glossary, i.e. word definitions, which
> are expected to be rather stable in the long run.
>
> I'm wondering whether the effort would not be made redundant by other
> on-line effort such as wikipedia, wiktionary, stackoverflow, standards,
> whatever.
>
> When explaining something, the teacher I am usually provides some level of
> example. This may or may not be appropriate there.

That's exactly a good reason for being a reviewer here.  You have
quite some insight here.

> I'd consider making SQL keywords uppercase.
>
> Developing that is a significant undertaking. Do we have the available
> energy?

It seems like this could be a good idea. Still, the patch has been
waiting on its author for more than two weeks now, so I have marked it
as returned with feedback.
--
Michael


Re: Add A Glossary

Corey Huinker
> It seems like this could be a good idea. Still, the patch has been
> waiting on its author for more than two weeks now, so I have marked it
> as returned with feedback.

In light of feedback, I enlisted the help of an actual technical writer (Roger Harkavy, CCed) and we eventually found the time to take a second pass at this.

Attached is a revised patch.
 

Attachment: 0001-add-glossary-page-with-initial-definitions.patch (31K)

Re: Add A Glossary

Corey Huinker
This latest version is an attempt at merging the work of Jürgen Purtz into what I had posted earlier. There was relatively little overlap in the terms we had chosen to define.

Each glossary definition now has a reference id (good idea Jürgen), the form of which is "glossary-term". So we can link to the glossary from outside if we so choose.
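
As a rough sketch of that id convention, the helper below derives an id of the form "glossary-term" from a term's display text. The function name and the exact normalization rule (lowercase, with runs of non-alphanumeric characters collapsed to a hyphen) are assumptions for illustration; the patch may assign its ids by hand.

```python
import re


def glossary_id(term: str) -> str:
    """Build an id of the form "glossary-<term>" from a glossary term.

    Hypothetical helper: lowercases the term and replaces every run of
    characters other than a-z and 0-9 with a single hyphen.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", term.lower()).strip("-")
    return f"glossary-{slug}"


print(glossary_id("Foreign Data Wrapper"))
print(glossary_id("Heap"))
```

An id built this way is what a link from another page would name in its linkend attribute.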

I encourage everyone to read the definitions, and suggest fixes to any inaccuracies or awkward phrasings. Mostly, though, I'm seeking feedback on the structure itself, and hoping to get that committed.


On Tue, Feb 11, 2020 at 11:22 PM Corey Huinker <[hidden email]> wrote:
> > It seems like this could be a good idea. Still, the patch has been
> > waiting on its author for more than two weeks now, so I have marked it
> > as returned with feedback.
>
> In light of feedback, I enlisted the help of an actual technical writer (Roger Harkavy, CCed) and we eventually found the time to take a second pass at this.
>
> Attached is a revised patch.
 

Attachment: 0001-add-glossary-page.patch (64K)

Re: Add A Glossary

Roger Harkavy
Hello, everyone, I'm Roger, the tech writer who worked with Corey on the glossary file. I just thought I'd announce that I am also on the list, and I'm looking forward to any questions or comments people may have. Thanks!

On Tue, Mar 10, 2020 at 11:37 AM Corey Huinker <[hidden email]> wrote:
> This latest version is an attempt at merging the work of Jürgen Purtz into what I had posted earlier. There was relatively little overlap in the terms we had chosen to define.
>
> Each glossary definition now has a reference id (good idea Jürgen), the form of which is "glossary-term". So we can link to the glossary from outside if we so choose.
>
> I encourage everyone to read the definitions, and suggest fixes to any inaccuracies or awkward phrasings. Mostly, though, I'm seeking feedback on the structure itself, and hoping to get that committed.
>
> On Tue, Feb 11, 2020 at 11:22 PM Corey Huinker <[hidden email]> wrote:
> > > It seems like this could be a good idea. Still, the patch has been
> > > waiting on its author for more than two weeks now, so I have marked it
> > > as returned with feedback.
> >
> > In light of feedback, I enlisted the help of an actual technical writer (Roger Harkavy, CCed) and we eventually found the time to take a second pass at this.
> >
> > Attached is a revised patch.
 

Re: Add A Glossary

Jürgen Purtz
I made changes on top of 0001-add-glossary-page.patch which was supplied
by C. Huinker. This affects not only terms proposed by me but also his
original terms. If my changes are not obvious, please let me know and I
will describe my motivation.

Please note especially lines marked with question marks.

It will be helpful for diff-ing to restrict the length of lines in the
SGML files to 71 characters (as usual).

J. Purtz



Attachment: 0002-add-glossary-page.patch (25K)

Re: Add A Glossary

Corey Huinker


On Wed, Mar 11, 2020 at 12:50 PM Jürgen Purtz <[hidden email]> wrote:
> I made changes on top of 0001-add-glossary-page.patch which was supplied
> by C. Huinker. This affects not only terms proposed by me but also his
> original terms. If my changes are not obvious, please let me know and I
> will describe my motivation.
>
> Please note especially lines marked with question marks.
>
> It will be helpful for diff-ing to restrict the length of lines in the
> SGML files to 71 characters (as usual).
>
> J. Purtz

A new person replied off-list with some suggested edits, all of which seemed pretty good. I'll incorporate them myself if that person chooses to remain off-list.



 

Re: Add A Glossary

Corey Huinker
In reply to this post by Jürgen Purtz
> It will be helpful for diff-ing to restrict the length of lines in the
> SGML files to 71 characters (as usual).

I did it that way for the following reasons:
1. It aids grep-ability.
2. The committers seem to be moving towards that for SQL strings, mostly for reason #1.
3. I recall that the code is put through a linter as one of the final steps before release, so I assumed that the SGML gets the same.
4. Even if #3 is false, it's easy enough for me to do manually for this one file once we've settled on the text of the definitions.
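
To find the lines that would need rewrapping under the 71-character convention mentioned above, a small check along these lines might do. It is only a sketch: the function name is made up, and the file path in the usage comment is a hypothetical placeholder.

```python
def long_lines(text: str, limit: int = 71):
    """Return (line_number, length) for every line longer than `limit`."""
    return [(number, len(line))
            for number, line in enumerate(text.splitlines(), start=1)
            if len(line) > limit]


# Usage sketch (the path is a hypothetical placeholder):
#
#     with open("path/to/glossary.sgml", encoding="utf-8") as f:
#         for number, length in long_lines(f.read()):
#             print(f"line {number}: {length} characters")

sample = "short line\n" + "x" * 72
print(long_lines(sample))
```

Reporting the offenders rather than rewrapping them automatically keeps the decision about where to break a line with whoever edits the definitions.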

As for the changes, most things seem fine, I specifically like:
* Checkpoint - looks good
* yes, PGDATA should have been a literal
* Partition - the a/b split works for me
* Unlogged - it reads better

Points I'm not so sure about, and responses to your ???s:
* The statement that names of schema objects are unique isn't strictly true, just mostly true. Take the case of a unique constraint. The constraint has a name and the unique index has the same name, to the point where adding a unique constraint using an existing index renames that index to conform to the constraint name.
* Serializable "other way around" question - It's both. Outside the transaction you can't see changes made inside another transaction (though you can be blocked by them), and inside serializable you can't see any changes made since you started. Does that make sense? Were you asking a different question?
* Transaction - yes, all those things could be "visible" or they could be "side effects". It may be best to leave the over-simplified definition in place, and add a "For more information see <<linkref to tutorial-transactions>>".

Re: Add A Glossary

Corey Huinker

> * Transaction - yes, all those things could be "visible" or they could be "side effects". It may be best to leave the over-simplified definition in place, and add a "For more information see <<linkref to tutorial-transactions>>".

transaction-iso would be a better linkref in this case 

Re: Add A Glossary

Jürgen Purtz
In reply to this post by Corey Huinker

> The statement that names of schema objects are unique isn't strictly true, just mostly true. Take the case of a unique constraint.

Concerning CONSTRAINTS you are right. Constraints seem to be an exception:

  • Their names belong to a schema, but are not necessarily unique within this context: https://www.postgresql.org/docs/current/catalog-pg-constraint.html.
  • There is a UNIQUE index within the system catalog pg_constraint: "pg_constraint_conrelid_contypid_conname_index" UNIQUE, btree (conrelid, contypid, conname), which expresses that names are unique within the context of a single table or domain. Nevertheless, tests have shown that some stronger restrictions exist across table borders (which seem to be implemented in the CREATE statements, or to be a consequence of the correlation between constraint and index that you mentioned?).
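
What the index columns (conrelid, contypid, conname) do and do not guarantee can be illustrated with a toy model. This is a sketch under stated assumptions, not how the server implements it: the class name, the numeric ids, and the constraint name are all hypothetical, and it deliberately ignores the stronger cross-table restriction mentioned above.

```python
class ConstraintCatalog:
    """Toy model of a unique key on (conrelid, contypid, conname)."""

    def __init__(self):
        self._keys = set()

    def add(self, conrelid: int, contypid: int, conname: str) -> bool:
        """Record a constraint; return False if the key already exists."""
        key = (conrelid, contypid, conname)
        if key in self._keys:
            return False
        self._keys.add(key)
        return True


catalog = ConstraintCatalog()
# Hypothetical table ids 1001 and 1002; contypid 0 means "not on a domain".
print(catalog.add(1001, 0, "positive_price"))  # first use on table 1001
print(catalog.add(1002, 0, "positive_price"))  # same name, other table
print(catalog.add(1001, 0, "positive_price"))  # same name, same table
```

Under this key alone, the same constraint name can appear once per table, which is why the index by itself does not make constraint names unique within a schema.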

I hope that there are no more such exceptions to the global rule 'object names in a schema are unique': https://www.postgresql.org/docs/current/sql-createschema.html

These facts must be mentioned as a short note in the glossary and in more detail in the later patch about the architecture.

J. Purtz



Re: Add A Glossary

Corey Huinker
On Fri, Mar 13, 2020 at 12:18 AM Jürgen Purtz <[hidden email]> wrote:
> > The statement that names of schema objects are unique isn't strictly true, just mostly true. Take the case of a unique constraint.
>
> Concerning CONSTRAINTS you are right. Constraints seem to be an exception:
>
>   • Their names belong to a schema, but are not necessarily unique within this context: https://www.postgresql.org/docs/current/catalog-pg-constraint.html.
>   • There is a UNIQUE index within the system catalog pg_constraint: "pg_constraint_conrelid_contypid_conname_index" UNIQUE, btree (conrelid, contypid, conname), which expresses that names are unique within the context of a single table or domain. Nevertheless, tests have shown that some stronger restrictions exist across table borders (which seem to be implemented in the CREATE statements, or to be a consequence of the correlation between constraint and index that you mentioned?).
>
> I hope that there are no more such exceptions to the global rule 'object names in a schema are unique': https://www.postgresql.org/docs/current/sql-createschema.html
>
> These facts must be mentioned as a short note in the glossary and in more detail in the later patch about the architecture.


I did what I could to address the near uniqueness, as well as incorporate your earlier edits into this new, squashed patch attached.
 

Attachment: 0001-add-glossary-page-with-revisions.patch (67K)

Re: Add A Glossary

Alvaro Herrera-9
I gave this a look.  I first reformatted it so I could read it; that's
0001.  Second I changed all the long <link> items into <xref>s, which
are shorter and don't have to repeat the title of the referred-to page.
(Of course, this changes the link to be in the same style as every other
link in our documentation; some people don't like it. But it's our
style.)

There are some mistakes.  "Tupple" is the most glaring one -- not just the
typo but also the fact that it goes to sql-revoke.  A few definitions
we'll want to modify.  Nothing too big.  In general I like this work and
I think we should have it in pg13.

Please bikeshed the definition of your favorite term, and suggest what
other terms to add.  No pointing out of mere typos yet, please.

I think we should have the terms Consistency, Isolation, Durability.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachments:
0001-glossary.patch (53K)
0002-Change-all-For-more-info-see-X-links-to-xref-style.patch (11K)

Re: Add A Glossary

Corey Huinker
On Thu, Mar 19, 2020 at 8:11 PM Alvaro Herrera <[hidden email]> wrote:
> I gave this a look.  I first reformatted it so I could read it; that's
> 0001.  Second I changed all the long <link> items into <xref>s, which

Thanks! I didn't know about xrefs, that is a big improvement.

> are shorter and don't have to repeat the title of the referred-to page.
> (Of course, this changes the link to be in the same style as every other
> link in our documentation; some people don't like it. But it's our
> style.)
>
> There are some mistakes.  "Tupple" is the most glaring one -- not just the
> typo but also the fact that it goes to sql-revoke.  A few definitions
> we'll want to modify.  Nothing too big.  In general I like this work and
> I think we should have it in pg13.
>
> Please bikeshed the definition of your favorite term, and suggest what
> other terms to add.  No pointing out of mere typos yet, please.

Jürgen mentioned off-list that the man page doesn't build. I was going to look into that, but if anyone has more familiarity with that, I'm listening.

> I think we should have the terms Consistency, Isolation, Durability.

+1

Re: Add A Glossary

Corey Huinker
> Jürgen mentioned off-list that the man page doesn't build. I was going to look into that, but if anyone has more familiarity with that, I'm listening.

Looking at this some more, I'm not sure anything needs to be done for man pages. man1 is for executables, man3 seems to be dblink and SPI, and man7 is all SQL commands. This isn't any of those. The only possible thing left would be how to render the text of a <glossterm>foo</glossterm>, and so I looked to see what we do in man pages for acronyms, and the answer appears to be "nothing":

postgres/doc/src$ git grep acronym | grep -v '\/acronym'
sgml/filelist.sgml:<!ENTITY acronyms   SYSTEM "acronyms.sgml">
sgml/postgres.sgml:  &acronyms;
sgml/release.sgml:[A-Z][A-Z_ ]+[A-Z_]             <command>, <literal>, <envar>, <acronym>
sgml/stylesheet.css:acronym { font-style: inherit; }

filelist.sgml, postgres.sgml, and stylesheet.css already have the corresponding change, and the release.sgml is just an incidental mention of acronym.

Of course I could be missing something.

Re: Add A Glossary

Alvaro Herrera-9
On 2020-Mar-20, Corey Huinker wrote:

> > Jürgen mentioned off-list that the man page doesn't build. I was going to
> > look into that, but if anyone has more familiarity with that, I'm listening.

> Looking at this some more, I'm not sure anything needs to be done for man
> pages.

Yeah, I don't think he was saying that we needed to do anything to
produce a glossary man page; rather that the "make man" command failed.
I tried it here, and indeed it failed.  But on further investigation,
after a "make maintainer-clean" it no longer failed.  I'm not sure what
to make of it, but it seems that this patch needn't concern itself with
that.

I gave a read through the first few actual definitions.  It's a much
slower work than I thought!  Attached you'll find the first few edits
that I propose.

Looking at the definition of "Aggregate" it seemed weird to have it
stand as a verb infinitive.  I looked up other glossaries, found this
one
https://www.gartner.com/en/information-technology/glossary?glossaryletter=T
and realized that when they do verbs, they put the present participle
(-ing) form.  So I changed it to "Aggregating", and split out the
"Aggregate function" into its own term.

In Atomic, there seemed to be excessive use of <glossterm> in the
definitions.  Style guides seem to suggest to do that only the first
time you use a term in a definition.  I removed some markup.
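
The "mark up only the first use" rule could be applied mechanically with something like the sketch below. It is an illustration under assumptions, not the edit that was actually made: the function name is hypothetical, it compares terms case-insensitively by their exact text (so "Row" and "Rows" count as different terms), and it is meant to be run on one definition at a time.

```python
import re

GLOSSTERM = re.compile(r"<glossterm>(.*?)</glossterm>", re.DOTALL)


def mark_first_use_only(definition: str) -> str:
    """Keep <glossterm> markup on the first use of each term only."""
    seen = set()

    def replace(match):
        # Collapse whitespace so a term wrapped across lines still matches.
        key = " ".join(match.group(1).split()).lower()
        if key in seen:
            return match.group(1)  # later use: keep the text, drop the tags
        seen.add(key)
        return match.group(0)      # first use: leave the markup untouched

    return GLOSSTERM.sub(replace, definition)


text = ("A <glossterm>Row</glossterm> belongs to a "
        "<glossterm>Table</glossterm>; each <glossterm>Row</glossterm> "
        "has the same columns.")
print(mark_first_use_only(text))
```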

I'm not sure about some terms such as "analytic" and "backend server".
I put them in XML comments for now.

The other changes should be self-explanatory.

It's hard to review work from a professional tech writer.  I'm under the
constant impression that I'm ruining somebody's perfect end product,
making a fool of myself.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment: glossary-3.patch (7K)

Re: Add A Glossary

Roger Harkavy
Alvaro, I know that you are joking, but I want to impress on everyone: please don't feel like anyone here is breaking anything when it comes to modifying the content and structure of this glossary.

I do have technical writing experience, but everyone else here is a subject matter expert when it comes to the world of databases and how this one in particular functions.

On Fri, Mar 20, 2020 at 1:51 PM Alvaro Herrera <[hidden email]> wrote:
On 2020-Mar-20, Corey Huinker wrote:

> > Jürgen mentioned off-list that the man page doesn't build. I was going to
> > look into that, but if anyone has more familiarity with that, I'm listening.

> Looking at this some more, I'm not sure anything needs to be done for man
> pages.

Yeah, I don't think he was saying that we needed to do anything to
produce a glossary man page; rather that the "make man" command failed.
I tried it here, and indeed it failed.  But on further investigation,
after a "make maintainer-clean" it no longer failed.  I'm not sure what
to make of it, but it seems that this patch needn't concern itself with
that.

I gave a read through the first few actual definitions.  It's a much
slower work than I thought!  Attached you'll find the first few edits
that I propose.

Looking at the definition of "Aggregate" it seemed weird to have it
stand as a verb infinitive.  I looked up other glossaries, found this
one
https://www.gartner.com/en/information-technology/glossary?glossaryletter=T
and realized that when they do verbs, they put the present participle
(-ing) form.  So I changed it to "Aggregating", and split out the
"Aggregate function" into its own term.

In Atomic, there seemed to be excessive use of <glossterm> in the
definitions.  Style guides seem to suggest to do that only the first
time you use a term in a definition.  I removed some markup.

I'm not sure about some terms such as "analytic" and "backend server".
I put them in XML comments for now.

The other changes should be self-explanatory.

It's hard to review work from a professional tech writer.  I'm under the
constant impression that I'm ruining somebody's perfect end product,
making a fool of myself.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Re: Add A Glossary

Corey Huinker
In reply to this post by Alvaro Herrera-9
> It's hard to review work from a professional tech writer.  I'm under the
> constant impression that I'm ruining somebody's perfect end product,
> making a fool of myself.

If it makes you feel better, it's a mix of definitions I wrote that Roger proofed and restructured, ones that Jürgen had written for a separate effort which then got a Roger-pass, and then some edits of my own and some by Jürgen which I merged without consulting Roger.


Re: Add A Glossary

Justin Pryzby
In reply to this post by Alvaro Herrera-9
On Thu, Mar 19, 2020 at 09:11:22PM -0300, Alvaro Herrera wrote:
> +    <glossterm>Aggregate</glossterm>
> +    <glossdef>
> +     <para>
> +      To combine a collection of data values into a single value, whose
> +      value may not be of the same type as the original values.
> +      <glossterm>Aggregate</glossterm> <glossterm>Functions</glossterm>
> +      combine multiple <glossterm>Rows</glossterm> that share a common set
> +      of values into one <glossterm>Row</glossterm>, which means that the
> +      only data visible in the values in common, and the aggregates of the

IS the values in common ?
(or, "is the shared values")

> +    <glossterm>Analytic</glossterm>
> +    <glossdef>
> +     <para>
> +      A <glossterm>Function</glossterm> whose computed value can reference
> +      values found in nearby <glossterm>Rows</glossterm> of the same
> +      <glossterm>Result Set</glossterm>.

> +    <glossterm>Archiver</glossterm>

Can you change that to archiver process ?

> +    <glossterm>Atomic</glossterm>
..
> +     <para>
> +      In reference to an operation: An event that cannot be completed in
> +      part: it must either entirely succeed or entirely fail. A series of

Can you say: "an action which is not allowed to partially succeed and then fail,
..."

> +    <glossterm>Autovacuum</glossterm>

Say autovacuum process ?

> +    <glossdef>
> +     <para>
> +      Processes that remove outdated <acronym>MVCC</acronym>

I would say "A set of processes that remove..."

> +      <glossterm>Records</glossterm> of the <glossterm>Heap</glossterm> and

I'm not sure, can you say "tuples" ?

> +    <glossterm>Backend Process</glossterm>
> +    <glossdef>
> +     <para>
> +      Processes of an <glossterm>Instance</glossterm> which act on behalf of

Say DATABASE instance

> +    <glossterm>Backend Server</glossterm>
> +    <glossdef>
> +     <para>
> +      See <glossterm>Instance</glossterm>.
same

> +    <glossterm>Background Worker</glossterm>
> +    <glossdef>
> +     <para>
> +      Individual processes within an <glossterm>Instance</glossterm>, which
same

> +      run system- or user-supplied code. Typical use cases are processes
> +      which handle parts of an <acronym>SQL</acronym> query to take
> +      advantage of parallel execution on servers with multiple
> +      <acronym>CPUs</acronym>.

I would say "A typical use case is"

> +    <glossterm>Background Writer</glossterm>

Add "process" ?

> +    <glossdef>
> +     <para>
> +      Writes continuously dirty pages from <glossterm>Shared

Say "Continuously writes"

> +      Memory</glossterm> to the file system. It starts periodically, but

Hm, maybe "wakes up periodically"

> +    <glossterm>Cast</glossterm>
> +    <glossdef>
> +     <para>
> +      A conversion of a <glossterm>Datum</glossterm> from its current data
> +      type to another data type.

maybe just say
A conversion of a <glossterm>Datum</glossterm> to another data type.

> +    <glossterm>Catalog</glossterm>
> +    <glossdef>
> +     <para>
> +      The <acronym>SQL</acronym> standard uses this standalone term to
> +      indicate what is called a <glossterm>Database</glossterm> in
> +      <productname>PostgreSQL</productname>'s terminology.

Maybe remove "standalone" ?

> +    <glossterm>Checkpointer</glossterm>

Process

> +      A process that writes dirty pages and <glossterm>WAL
> +      Records</glossterm> to the file system and creates a special

Does the checkpointer actually write WAL ?

> +      checkpoint record. This process is initiated when predefined
> +      conditions are met, such as a specified amount of time has passed, or
> +      a certain volume of records have been collected.

collected or written?

I would say:
> +      A checkpoint is usually initiated by
> +      a specified amount of time having passed, or
> +      a certain volume of records having been written.

> +    <glossterm>Checkpoint</glossterm>
> +    <glossdef>
> +     <para>
> +      A <link linkend="sql-checkpoint"> Checkpoint</link> is a point in time

Extra space

> +   <glossentry id="glossary-connection">
> +    <glossterm>Connection</glossterm>
> +    <glossdef>
> +     <para>
> +      A <acronym>TCP/IP</acronym> or socket line for inter-process

I don't know if I've ever heard the phrase "socket line".
I guess you mean a Unix socket.

> +    <glossterm>Constraint</glossterm>
> +    <glossdef>
> +     <para>
> +      A concept of restricting the values of data allowed within a
> +      <glossterm>Table</glossterm>.

Just say: "A restriction on the values..."?

> +    <glossterm>Data Area</glossterm>

Remove this ?  I've never heard this phrase before.

> +    <glossdef>
> +     <para>
> +      The base directory on the filesystem of a
> +      <glossterm>Server</glossterm> that contains all data files and
> +      subdirectories associated with a <glossterm>Cluster</glossterm> with
> +      the exception of tablespaces. The environment variable

Should add an entry for "tablespace".

> +    <glossterm>Datum</glossterm>
> +    <glossdef>
> +     <para>
> +      The internal representation of a <acronym>SQL</acronym> data type.

I'm not sure whether it should be "a SQL" or "an SQL", but it shouldn't be both.

> +    <glossterm>Delete</glossterm>
> +    <glossdef>
> +     <para>
> +      A <acronym>SQL</acronym> command whose purpose is to remove

just say "which removes"

> +   <glossentry id="glossary-file-segment">
> +    <glossterm>File Segment</glossterm>
> +    <glossdef>
> +     <para>
> +       If a heap or index file grows in size over 1 GB, it will be split

1GB is the default "segment size", which you should define.

> +   <glossentry id="glossary-foreign-data-wrapper">
> +    <glossterm>Foreign Data Wrapper</glossterm>
> +    <glossdef>
> +     <para>
> +      A means of representing data that is not contained in the local
> +      <glossterm>Database</glossterm> as if were in local
> +      <glossterm>Table</glossterm>(s).

I'd say:

+ A means of representing data as a <glossterm>Table</glossterm>(s) even though
+ it is not contained in the local <glossterm>Database</glossterm>


> +   <glossentry id="glossary-foreign-key">
> +    <glossterm>Foreign Key</glossterm>
> +    <glossdef>
> +     <para>
> +      A type of <glossterm>Constraint</glossterm> defined on one or more
> +      <glossterm>Column</glossterm>s in a <glossterm>Table</glossterm> which
> +      requires the value in those <glossterm>Column</glossterm>s to uniquely
> +      identify a <glossterm>Row</glossterm> in the specified
> +      <glossterm>Table</glossterm>.

An FK doesn't require the values in its table to be unique, right ?
I'd say something like: "..which enforces that the values in those Columns are
also present in an(other) table."
Reference Referential Integrity?

> +    <glossterm>Function</glossterm>
> +    <glossdef>
> +     <para>
> +      Any pre-defined transformation of data. Many
> +      <glossterm>Functions</glossterm> are already defined within
> +      <productname>PostgreSQL</productname> itself, but can also be
> +      user-defined.

I would remove "pre-", since you mentioned that it can be user-defined.

> +    <glossterm>Global SQL Object</glossterm>
> +    <glossdef>
> +     <para>
> +     <!-- FIXME -->
> +      Not all <glossterm>SQL Objects</glossterm> belong to a certain
> +      <glossterm>Schema</glossterm>. Some belong to the complete
> +      <glossterm>Database</glossterm>, or even to the complete
> +      <glossterm>Cluster</glossterm>. These are referred to as
> +      <glossterm>Global SQL Objects</glossterm>. Collations and Extensions
> +      such as <glossterm>Foreign Data Wrappers</glossterm> reside at the
> +      <glossterm>Database</glossterm> level; <glossterm>Database</glossterm>
> +      names, <glossterm>Roles</glossterm>,
> +      <glossterm>Tablespaces</glossterm>, <glossterm>Replication</glossterm>
> +      origins, and subscriptions for logical
> +      <glossterm>Replication</glossterm> at the
> +      <glossterm>Cluster</glossterm> level.

I think "complete" is the wrong word.
I would say:
"An object which is not specific to a given database, but instead shared across
the entire Cluster".

> +   <glossentry id="glossary-grant">
> +    <glossterm>Grant</glossterm>
> +    <glossdef>
> +     <para>
> +      A <acronym>SQL</acronym> command that is used to enable

I'd say "allow"

> +   <glossentry id="glossary-heap">
> +    <glossterm>Heap</glossterm>
> +    <glossdef>
> +     <para>
> +      Contains the original values of <glossterm>Row</glossterm> attributes

I'm not sure what "original" means here ?

> +      (i.e. the data). The <glossterm>Heap</glossterm> is realized within
> +      <glossterm>Database</glossterm> files and mirrored in
> +      <glossterm>Shared Memory</glossterm>.

I wouldn't say mirrored, and probably just remove at least the part after "and".

> +   <glossentry id="glossary-host">
> +    <glossterm>Host</glossterm>
> +    <glossdef>
> +     <para>
> +      See <glossterm>Server</glossterm>.

Or client.  Or proxy at some layer or other intermediate thing.  Maybe just
remove this.

> +   <glossentry id="glossary-index">
> +    <glossterm>Index</glossterm>
> +    <glossdef>
> +     <para>
> +      A <glossterm>Relation</glossterm> that contains data derived from a
> +      <glossterm>Table</glossterm> (or <glossterm>Relation</glossterm> such
> +      as a <glossterm>Materialized View</glossterm>). It's internal

Its

> +      structure supports very fast retrieval of and access to the original
> +      data.

> +    <glossterm>Instance</glossterm>
> +    <glossdef>
> +     <para>
...
> +     <para>
> +      Many <glossterm>Instances</glossterm> can run on the same server as
> +      long as they use different <acronym>IP</acronym> ports and manage

I would say "as long as their TCP/IP ports or sockets don't conflict, and manage..."

> +    <glossterm>Join</glossterm>
> +    <glossdef>
> +     <para>
> +      A technique used with <command>SELECT</command> statements for
> +      correlating data in one or more <glossterm>Relations</glossterm>.

I would refer to this as a SQL keyword that allows combining data from multiple
relations.

> +    <glossterm>Lock</glossterm>
> +    <glossdef>
> +     <para>
> +      A mechanism for one process temporarily preventing data from being
> +      manipulated by any other process.

I'd say:

+      A mechanism by which a process protects simultaneous access to a resource
+      by other processes.

(I said "protects" since shared locks don't prevent all access, and it's easier
than explaining "unsafe access").


> +   <glossentry id="glossary-log-file">
> +    <glossterm>Log File</glossterm>
> +    <glossdef>
> +     <para>
> +      <link linkend="logfile-maintenance">LOG files</link> contain readable
> +      text lines about serious and non-serious events, e.g.: use of wrong
> +      password, long-running queries, ... .

Serious and non-serious?

> +    <glossterm>Log Writer</glossterm>

process

> +    <glossdef>
> +     <para>
> +      If activated and parameterized, the

I don't know what parameterized means here

> +      <link linkend="runtime-config-logging">Log Writer</link> process
> +      writes information about database events into the current
> +      <glossterm>Log file</glossterm>. When reaching certain time- or
> +      volume-dependent criterias, he <!-- FIXME "he"? --> creates a new

I think "criteria" is already the plural.

> +    <glossterm>Log Record</glossterm>

Can we remove this?
A couple of releases ago, "pg_xlog" was renamed to "pg_wal".
I'd prefer to avoid defining something called "Log Record" about WAL that's
right next to text logs.

> +    <glossterm>Logged</glossterm>
> +    <glossdef>
> +     <para>
> +      A <glossterm>Table</glossterm> is considered
> +      <glossterm>Logged</glossterm> if changes to it are sent to the
> +      <glossterm>WAL Log</glossterm>. By default, all regular
> +      <glossterm>Tables</glossterm> are <glossterm>Logged</glossterm>. A
> +      <glossterm>Table</glossterm> can be speficied as unlogged either at
> +      creation time or via the <command>ALTER TABLE</command> command. The
> +      primary use of unlogged <glossterm>Tables</glossterm> is for storing
> +      transient work data that must be shared across processes, but with a
> +      final result stored in logged <glossterm>Tables</glossterm>.
> +      <glossterm>Temporary Tables</glossterm> are always unlogged.
> +     </para>
> +    </glossdef>
> +   </glossentry>

Maybe it'd be better to define "unlogged", since 1) logged is the default; and
2) it's right next to text logs.

> +    <glossterm>Master</glossterm>
> +    <glossdef>
> +     <para>
> +      When two or more <glossterm>Databases</glossterm> are linked via
> +      <glossterm>Replication</glossterm>, the <glossterm>Server</glossterm>
> +      that is considered the authoritative source of information is called
> +      the <glossterm>Master</glossterm>.

I think it'd actually be the <<instance>> which is authoritative, in case they're
running on the same <<Server>>

> +   <glossentry id="glossary-materialized">
> +    <glossterm>Materialized</glossterm>
> +    <glossdef>
> +     <para>
> +      The act of storing information rather than just the means of accessing

remove "means of" ?

> +      the information. This term is used in <glossterm>Materialized
> +      Views</glossterm> meaning that the data derived from the
> +      <glossterm>View</glossterm> is actually stored on disk separate from

separately

> +      the sources of that data. When the term
> +      <glossterm>Materialized</glossterm> is used in speaking about
> +      mulit-step queries, it means that the data of a given step is stored

multi

> +      (in memory, but that storage may spill over onto disk).
> +     </para>
> +    </glossdef>
> +   </glossentry>
> +
> +   <glossentry id="glossary-materialized-view">
> +    <glossterm>Materialized View</glossterm>
> +    <glossdef>
> +     <para>
> +      A <glossterm>Relation</glossterm> that is defined in the same way that
> +      a <glossterm>View</glossterm> is, but it stores data in the same way

change "it stores" to stores

> +   <glossentry id="glossary-partition">
> +    <glossterm>Partition</glossterm>
> +    <glossdef>
> +     <para>
> +      <!-- FIXME should this use the style used in "atomic"? -->
> +      a) A <glossterm>Table</glossterm> that can be queried independently by
> +      its own name, but can also be queried via another

just say "on its own" or "directly"

> +      <glossterm>Table</glossterm>, a partitionend

partitioned
also, put it in parens, like "via another table (a partitioned table)..."

> +      <glossterm>Table</glossterm>, which is a collection of

Say "set" here since you later talk about "subsets" and sets.

> +   <glossentry id="glossary-primary-key">
> +    <glossterm>Primary Key</glossterm>
> +    <glossdef>
> +     <para>
> +      A special case of <glossterm>Unique Index</glossterm> defined on a
> +      <glossterm>Table</glossterm> or other <glossterm>Relation</glossterm>
> +      that also guarantees that all of the <glossterm>Attributes</glossterm>
> +      within the <glossterm>Primary Key</glossterm> do not have
> +      <glossterm>Null</glossterm> values.  As the name implies, there can be
> +      only one <glossterm>Primary Key</glossterm> per
> +      <glossterm>Table</glossterm>, though it is possible to have multiple
> +      <glossterm>Unique Indexes</glossterm> that also have no
> +      <glossterm>Null</glossterm>-capable <glossterm>Attributes</glossterm>.

I would say "multiple >>unique indexes<< on >>attributes<< defined as not
nullable".

> +    <glossterm>Procedure</glossterm>
> +    <glossdef>
> +     <para>
> +      A defined set of instructions for manipulating data within a
> +      <glossterm>Database</glossterm>. <glossterm>Procedure</glossterm> can

"procedures" or "a procedure"

> +    <glossterm>Record</glossterm>
> +    <glossdef>
> +     <para>
> +      See <link linkend="sql-revoke">Tupple</link>.

Tupple is back.  And again below.

> +      A single <glossterm>Row</glossterm> of a <glossterm>Table</glossterm>
> +      or other Relation.

I think it's commonly used to mean "an instance of a row" (in an MVCC sense),
but maybe that's too much detail for here.

> +    <glossterm>Referential Integrity</glossterm>
> +    <glossdef>
> +     <para>
> +      The means of restricting data in one <glossterm>Relation</glossterm>

A means

> +   <glossentry id="glossary-relation">
> +    <glossterm>Relation</glossterm>
> +    <glossdef>
> +     <para>
> +      The generic term for all objects in a <glossterm>Database</glossterm>

"A generic term for any object in a >>database<< that has a name and..."

> +   <glossentry id="glossary-result-set">
> +    <glossterm>Result Set</glossterm>
> +    <glossdef>
> +     <para>
> +      A data structure transmitted from a <glossterm>Server</glossterm> to
> +      client program upon the completion of a <acronym>SQL</acronym>
> +      command, usually a <command>SELECT</command> but it can be an
> +      <command>INSERT</command>, <command>UPDATE</command>, or
> +      <command>DELETE</command> command if the <literal>RETURNING</literal>
> +      clause is specified.

I'd remove everything in that sentence after "usually".

> +    <glossterm>Revoke</glossterm>
> +    <glossdef>
> +     <para>
> +      A command to reduce access to a named set of

s/reduce/prevent/ ?

> +    <glossterm>Row</glossterm>
> +    <glossdef>
> +     <para>
> +      See <link linkend="sql-revoke">Tupple</link>.

tuple

> +   <glossentry id="glossary-savepoint">
> +    <glossterm>Savepoint</glossterm>
> +    <glossdef>
> +     <para>
> +      A special mark (such as a timestamp) inside a
> +      <glossterm>Transaction</glossterm>. Data modifications after this
> +      point in time may be rolled back to the time of the savepoint.

I don't think "timestamp" is a useful or accurate analogy for this.
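
A small sketch of why a named marker is a better description than a timestamp. Again this uses Python's sqlite3 module as a stand-in for PostgreSQL, and the table name t and savepoint name before_second are invented for the example. The rollback targets the savepoint by name, and work done after the rollback still commits.

```python
import sqlite3

# Assumption: SQLite stands in for PostgreSQL; names are hypothetical.
# isolation_level=None lets us issue BEGIN/COMMIT explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE t (v INTEGER)")

conn.execute("BEGIN")
conn.execute("INSERT INTO t VALUES (1)")
conn.execute("SAVEPOINT before_second")              # named marker inside the transaction
conn.execute("INSERT INTO t VALUES (2)")
conn.execute("ROLLBACK TO SAVEPOINT before_second")  # undoes only the insert of 2
conn.execute("INSERT INTO t VALUES (3)")             # the transaction keeps going
conn.execute("COMMIT")

rows = [r[0] for r in conn.execute("SELECT v FROM t ORDER BY v")]
```

After the commit, rows holds 1 and 3: only the work done after the savepoint was undone, and no clock time was involved in selecting it.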

> +    <glossterm>Schema</glossterm>
> +    <glossdef>
> +     <para>
> +      A <link linkend="ddl-schemas">schema</link> is a namespace for
> +      <glossterm>SQL objects</glossterm>, which all reside in the same
> +      <glossterm>database</glossterm>.  Each <glossterm>SQL
> +      object</glossterm> must reside in exactly one
> +      <glossterm>Schema</glossterm>.
> +     </para>

> +     <para>
> +      In general, the names of <glossterm>SQL objects</glossterm> in the
> +      schema are unique - even across different types of objects.  The lone
> +      exception is the case of <glossterm>Unique</glossterm>
> +      <glossterm>Constraint</glossterm>s, in which case there
> +      <emphasis>must</emphasis> be a <glossterm>Unique Index</glossterm>
> +      with the same name and <glossterm>Schema</glossterm> as the
> +      <glossterm>Constraint</glossterm>.  There is no restriction on having
> +      a name used in multiple <glossterm>Schema</glossterm>s.

I think there's some confusion.  Constraints are not objects, right ?

But, constraints do have an exception (not just unique constraints, though):
the constraint is only unique on its table, not in its database/schema.

    "pg_constraint_conrelid_contypid_conname_index" UNIQUE, btree (conrelid, contypid, conname) CLUSTER

> +    <glossterm>Select</glossterm>
> +    <glossdef>
> +     <para>
> +      The command used to query a <glossterm>Database</glossterm>. Normally,
> +      <command>SELECT</command>s are not expected to modify the
> +      <glossterm>Database</glossterm> in any way, but it is possible that
> +      <glossterm>Functions</glossterm> invoked within the query could have
> +      side-effects that do modify data.  </para>

I think there should be references to the sql-* pages for this and others.

> +   <glossentry id="glossary-serializable">
> +    <glossterm>Serializable</glossterm>
> +    <glossdef>
> +     <para>
> +      Transactions defined as <literal>SERIALIZABLE</literal> are unable to
> +      see changes made within other transactions. In effect, for the
> +      initializing session the entire <glossterm>Database</glossterm>
> +      appears to be frozen duration such a
> +      <glossterm>Transaction</glossterm>.

Do you mean "for the duration of the >>Transaction<<"?

> +   <glossentry id="glossary-session">
> +    <glossterm>Session</glossterm>
> +    <glossdef>
> +     <para>
> +      A <glossterm>Connection</glossterm> to the <glossterm>Database</glossterm>.
> +     </para>
> +     <para>
> +      A description of the commands that were issued in the life cycle of a
> +      particular <glossterm>Connection</glossterm> to the
> +      <glossterm>Database</glossterm>.

I'm not sure what this <para> means.

> +    <glossterm>Sequence</glossterm>
> +    <glossdef>
> +     <para>
> +      <!-- sounds excessively complicated a definition -->
> +      An <glossterm>Database</glossterm> object which represents the

A not An

> +      mathematical concept of a numerical integral sequence. It can be
> +      thought of as a <glossterm>Table</glossterm> with exactly one
> +      <glossterm>Row</glossterm> and one <glossterm>Column</glossterm>. The
> +      value stored is known as the current value. A
> +      <glossterm>Sequence</glossterm> has a defined direction (almost always
> +      increasing) and an interval step (usually 1).  Whenever the
> +      <literal>NEXTVAL</literal> pseudo-column of a
> +      <glossterm>Sequence</glossterm> is accessed, the current value is moved
> +      in the defined direction by the defined interval step, and that value

say "given interval step"

> +    <glossterm>Shared Memory</glossterm>
> +    <glossdef>
> +     <para>
> +      <acronym>RAM</acronym> which is used by the processes common to an
> +      <glossterm>Instance</glossterm>. It mirrors parts of
> +      <glossterm>Database</glossterm> files, provides an area for
> +      <glossterm>WAL Records</glossterm>,

Do we use shared_buffers for WAL ?

> +   <glossentry id="glossary-table">
> +    <glossterm>Table</glossterm>
> +    <glossdef>
> +     <para>
> +      A collection of <glossterm>Tuples</glossterm> (also known as
> +      <glossterm>Rows</glossterm> or <glossterm>Records</glossterm>) having
> +      a common data structure (the same number of
> +      <glossterm>Attributes</glossterm>s, in the same order, having the same

Attributes has two esses.

> +      name and type per position). A <glossterm>Table</glossterm> is the

I don't think you need to say here that the columns of a table all have the
same type and order.

> +    <glossterm>Temporary Tables</glossterm>
> +    <glossdef>
> +     <para>
> +      <glossterm>Table</glossterm>s that exist either for the lifetime of a
> +      <glossterm>Session</glossterm> or a
> +      <glossterm>Transaction</glossterm>, as defined at creation time. The

I would say "as specified at the time of its creation".

> +    <glossterm>Transaction</glossterm>
> +    <glossdef>
> +     <para>
> +      A combination of one or more commands that must act as a single

Remove "one or more"

> +    <glossterm>Trigger</glossterm>
> +    <glossdef>
> +     <para>
> +      A <glossterm>Function</glossterm> which can be defined to execute
> +      whenever a certain operation (<command>INSERT</command>,
> +      <command>UPDATE</command>, or <command>DELTE</command>) is applied to
> +      that <glossterm>Relation</glossterm>. A <glossterm>Trigger</glossterm>

s/that/a/

> +   <glossentry id="glossary-unique">
> +    <glossterm>Unique</glossterm>
> +    <glossdef>
> +     <para>
> +      The condition of having no matching values in the same

s/matching/duplicate/

> +      <glossterm>Relation</glossterm>. Most often used in the concept of

s/concept/context/

> +   <glossentry id="glossary-update">
> +    <glossterm>Update</glossterm>
> +    <glossdef>
> +     <para>
> +      A command used to modify <glossterm>Rows</glossterm> that already

or 'may already'

> +    <glossterm>WAL File</glossterm>
...
> +     <para>
> +      The sequence of <glossterm>WAL Records</glossterm> in combination with
> +      the sequence of <glossterm>WAL Files</glossterm> represents the

Remove "in combination with the sequence of >WAL Files<"

> +    <glossentry id="glossary-wal-log">
> +    <glossterm>WAL Log</glossterm>

Can you just say WAL or "write-ahead log"?

> +    <glossdef>
> +     <para>
> +      A <glossterm>WAL Record</glossterm> contains either new or changed
> +      <glossterm>Heap</glossterm> or <glossterm>Index</glossterm> data or
> +      information about a <command>COMMIT</command>,
> +      <command>ROLLBACK</command>, <command>SAVEPOINT</command>, or
> +      <glossterm>Checkpointer</glossterm> operation. WAL records use a
> +      non-printabe binary format.

non-printable
Or just remove it.
Or just remove the sentence.

> +   <glossterm>WAL Writer</glossterm>

process

> +   <glossentry id="glossary-window-function">
> +    <glossterm>Window Function</glossterm>
> +    <glossdef>
> +     <para>
> +      A type of <glossterm>Function</glossterm> similar to an
> +      <glossterm>Aggregate</glossterm> in that can derive its value from a

in that IT

> +      set of <glossterm>Rows</glossterm> in a <glossterm>Result
> +      Set</glossterm>, but still retaining the original source data.

--
Justin


Re: Add A Glossary

Jürgen Purtz
In reply to this post by Corey Huinker
man pages: Sorry if I confused anyone with my poor English. In my
'offline' mail I only wanted to say that we don't have to worry about
man page generation. The patch doesn't affect files in the /ref
subdirectory, from which the man pages are created.

review process: Yes, it will be time-consuming and it may be a hard
job because a) the patch has multiple authors with divergent writing
styles and b) the terms cover two different fundamental areas: SQL basics
and PG basics. Concerning PG basics, in the past we have used a wide range
of similar terms with different meanings, as well as different terms for
the same thing, both within our documentation and in secondary
publications. The terms "backend server" / "instance" are one such
example, and there should be a clear decision in favor of one of the two.
Presumably we will see more discussion about which one is the preferred
term (remember the discussion concerning the terms
master/slave, primary/secondary some weeks ago).

ongoing: Intermediate questions for clarifications are welcome.


Kind regards, Jürgen



