Question from someone who is not trained in computer sciences

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
8 messages Options
Reply | Threaded
Open this post in threaded view
|

Question from someone who is not trained in computer sciences

Judith Lacoste
Hi,




My question is whether or not such set up is possible with PostgreSQL?  

Why am I interested in PostrgreSQL?  First, my work has made me aware of how precious open source tools are.  Our main tools for data analysis are open source.  Commercial equivalents are black boxes which we try to avoid in the name of science reproducibility and transparency.  Secondly, the commercial software we are currently using is apparently based on PostgreSQL, so I am hoping that using PostgreSQL will make migration less painful. 

Thank you in advance,

Judith
Reply | Threaded
Open this post in threaded view
|

Re: Question from someone who is not trained in computer sciences

Ben Chobot
On Sep 5, 2019, at 2:00 PM, Judith Lacoste <[hidden email]> wrote:
This ranges from very easy to technically-possible-but-very-difficult, depending upon what you and your colleagues do with you local copies of the data. If your database schema and activities are such that your local edits will trample over each other, reconciling those changes automatically when you return to your office might be a challenge. If, however, you and your colleagues each change different rows, or even better make no changes on your local copies, then this becomes much easier.

While it isn't the simpliest tool to set up, if you're planning to make edits to your local databases then bucardo is a replication package that can give you want you want. (Really any multi-master replication tool should work, but bucardo has an advantage in that it was designed with unreliable connectivity in mind.) 

If you're planning to have your local databases be read-only, then virtually any asynchronous replication strategy for postgres will work. (This means almost all of them.)

Reply | Threaded
Open this post in threaded view
|

Re: Question from someone who is not trained in computer sciences

Peter Geoghegan-4
In reply to this post by Judith Lacoste
On Thu, Sep 5, 2019 at 2:00 PM Judith Lacoste <[hidden email]> wrote:
> I think PostgreSQL is the solution for my needs, but I am not a programmer/coder.

I don't think that it's restricted to people that are computer
scientists. At least, I certainly hope it isn't. SQL was originally
supposed to be something that is usable by domain experts/analysts,
rather than by computer people (that was at a time when the divide was
far larger than it is today).

> I plan to install the database on a server in the office. Me and my four colleagues will occasionally connect to this database when we are working in other locations (usually hospitals or universities). In such remote locations, we often do not have internet/network, yet we still need to access the database.  Currently, we use a system where a copy of the database lives on each of our laptops.  We can access all the information in the database despite being offline.  This local copy of the database is synchronized with the server once network becomes available again.
>
> My question is whether or not such set up is possible with PostgreSQL?

Since you're a biologist, you may like to play around with the Mouse
Genome database using PostgreSQL:

http://www.informatics.jax.org/downloads/database_backups/

Any supported version of PostgreSQL will work. You'll need to use
pg_restore to restore the databases. Something like this will do it:

  pg_restore -d mgd  /path/to/mgd.postgres.dump

(I'm not sure what operating system you'll use -- something similar to
this invocation ought to work on Windows through cmd.exe, though.)

From there, you can play around with the database using a GUI tool
such as pgAdmin. I sometimes use this database to test certain things,
since it's the only example of a living, breathing PostgreSQL database
that you can just download that I am aware of. Its schema is probably
not an exemplar of good design, but it does seem reasonably well
thought out. I'm not a domain expert, though, so I can't really be
sure how good it is.

The nice thing about this approach is that you can figure it out using
a "top down" approach, by first understanding how the database is used
in practical terms, and then filling in the details of how the
application that it backs uses the database.

Last I checked, restoring this database will take about 30GB of disk
space on top of the dump file itself.

--
Peter Geoghegan


Reply | Threaded
Open this post in threaded view
|

Re: Question from someone who is not trained in computer sciences

Basques, Bob (CI-StPaul)
In reply to this post by Judith Lacoste
Hi Judith,

Might be more than you are looking for, but . . . We’ve done something like this with a portable mapping system installed on a Raspberry Pi.  Really!!, it works with Postgres and a webserver to serve out a replicated open software stack based mappint interface. 

Our focus was on using the approach to take large amounts of constantly changing mapping data into the field without the dependence on a network.  Since the RPis are so small, it’s easy to take them into the field as well as sync them in the office.

We used a whole service infrastructure with a postgres backend, web server and the GeoMoose product to display maps.  Our next iteration is to build some editing tools, but since the software stack is the same as a our in office version, the development works for both.

You can do some interesting tricks at the database level in order to pull things together from various edit users.  More info on request for this.  :c)

Here is our project from earlier this year:


We put on a Workshop with a slightly older version of the Raspberry Pi.  The newest version is even faster.  It can be operated in the field, in your pocket with a battery running all day, or plugged into a car charger, or both, for non-stop operation.  Just turn it on in the morning, and off at night.  We did some special database syncing processes with ours, but I’ll leave that as an exercise for you.  Basically you can use the RPi as a mobile Server stack, complete with Web FORMs and database connectivity.  you can also use them in a team approach where more than one wifi connection can be made to the RPi, because in the end, it’s just a web server advertising over it’s own broacast Wifi.

Glad to answer any questions.

bobb




On Sep 5, 2019, at 4:00 PM, Judith Lacoste <[hidden email]> wrote:

Think Before You Click: This email originated outside our organization.


Hi,


My question is whether or not such set up is possible with PostgreSQL?  

Why am I interested in PostrgreSQL?  First, my work has made me aware of how precious open source tools are.  Our main tools for data analysis are open source.  Commercial equivalents are black boxes which we try to avoid in the name of science reproducibility and transparency.  Secondly, the commercial software we are currently using is apparently based on PostgreSQL, so I am hoping that using PostgreSQL will make migration less painful. 

Thank you in advance,

Judith

Reply | Threaded
Open this post in threaded view
|

Re: Question from someone who is not trained in computer sciences

Ron-2
In reply to this post by Judith Lacoste
On 9/5/19 4:00 PM, Judith Lacoste wrote:

You the end-user won't be using Postgres; you'll be using the application.  The important question is whether Postgres has the features you need for your application.


Are you asking if you can do with PostgreSQL what you currently do with the existing database?


Why am I interested in PostrgreSQL?  First, my work has made me aware of how precious open source tools are.  Our main tools for data analysis are open source.  Commercial equivalents are black boxes which we try to avoid in the name of science reproducibility and transparency.  Secondly, the commercial software we are currently using is apparently based on PostgreSQL, so I am hoping that using PostgreSQL will make migration less painful.

If you're already using a Postgres-based system, then the presumable answer is "yes, it'll work".  However, Postgres can be modified by closed-source code, so the amount of work needed by programmer you hire might be extreme.  We don't know because we don't know what your application does.


--
Angular momentum makes the world go 'round.
Reply | Threaded
Open this post in threaded view
|

Re: Question from someone who is not trained in computer sciences

Adrian Klaver-4
In reply to this post by Judith Lacoste
On 9/5/19 2:00 PM, Judith Lacoste wrote:
> Hi,
>
> I think PostgreSQL is the solution for my needs, but I am not a
> programmer/coder.  If I can confirm PostgreSQL does what I need, I will
> have to hire someone to assist, I am willing to give the effort to learn
> myself but it may be difficult, my specialities are biology and
> microscopy.  Or perhaps the use of PostgreSQL is restricted to people
> highly trained in computer sciences?

No, I am biologist and I learned Postgres/database management. It is
about organizing things and that is a commonality with biology.


> I have been looking around a lot through the PostgreSQL website,
> searching the archives, and I even contacted PostgreSQL people locally
> but I still don’t have a clear answer to my first question.  So I am
> posting it here with the hope to move on with PostgreSQL, or abandon the
> project.

This would be the list to talk to.


> I plan to install the database on a server in the office. Me and my four
> colleagues will occasionally connect to this database when we are
> working in other locations (usually hospitals or universities). In such
> remote locations, we often do not have internet/network, yet we still
> need to access the database.  Currently, we use a system where a copy of
> the database lives on each of our laptops.  We can access all the
> information in the database despite being offline.  This local copy of
> the database is synchronized with the server once network becomes
> available again.
> question is whether or not such set up is possible with PostgreSQL?

The set up is possible, though how you would implement it would depend
on several factors:

1) What OS and versions are you using?

2) Are you working directly with the database or through an application?

3) What programming languages are you using?

There is also the option of using Sqlite(https://sqlite.org/index.html)
for your 'local' databases and then syncing them to Postgres.


>
> Why am I interested in PostrgreSQL?  First, my work has made me aware of
> how precious open source tools are.  Our main tools for data analysis
> are open source.  Commercial equivalents are black boxes which we try to
> avoid in the name of science reproducibility and transparency.
>   Secondly, the commercial software we are currently using is apparently
> based on PostgreSQL, so I am hoping that using PostgreSQL will make
> migration less painful.
>
> Thank you in advance,
>
> Judith


--
Adrian Klaver
[hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: Question from someone who is not trained in computer sciences

Samuel Gendler
If I was in a hurry to implement this, and I had a userbase that wasn't very experienced with managing relational databases, I'd write some code to automatically and periodically build a docker image with the latest data in it (however often is sufficient to meet your needs), and then I'd set up a 1-line scheduled command on the laptops that would pull the latest docker image to the user's laptop. Then I'd give them a script that runs the docker container locally, and give them a client that knows how to connect to it.  Assuming it is a read-only db when you aren't connected, I could automate all of that in just a few hours in most environments, and the changes that would be required on the individual laptops would be minimal.

If you need to be able to write to the db when disconnected, and pull those writes into the central db instance when connected, that's a tougher problem to solve which is more suited to some of the earlier suggestions.  But if you only need to read when remote and just want something that works, is easy to put together, and can likely be built by an outside consultant for minimal expense and even less ongoing support and maintenance, I would just pay someone to read rthat first paragraph and set it up for me and call it good. Any of the suggested solutions is going to require a fair amount of administrative competence to really put together, so going for one that shouldn't require much maintenance to keep synchronized is your best bet. 

On Thu, Sep 5, 2019 at 3:43 PM Adrian Klaver <[hidden email]> wrote:
On 9/5/19 2:00 PM, Judith Lacoste wrote:
> Hi,
>
> I think PostgreSQL is the solution for my needs, but I am not a
> programmer/coder.  If I can confirm PostgreSQL does what I need, I will
> have to hire someone to assist, I am willing to give the effort to learn
> myself but it may be difficult, my specialities are biology and
> microscopy.  Or perhaps the use of PostgreSQL is restricted to people
> highly trained in computer sciences?

No, I am biologist and I learned Postgres/database management. It is
about organizing things and that is a commonality with biology.


> I have been looking around a lot through the PostgreSQL website,
> searching the archives, and I even contacted PostgreSQL people locally
> but I still don’t have a clear answer to my first question.  So I am
> posting it here with the hope to move on with PostgreSQL, or abandon the
> project.

This would be the list to talk to.


> I plan to install the database on a server in the office. Me and my four
> colleagues will occasionally connect to this database when we are
> working in other locations (usually hospitals or universities). In such
> remote locations, we often do not have internet/network, yet we still
> need to access the database.  Currently, we use a system where a copy of
> the database lives on each of our laptops.  We can access all the
> information in the database despite being offline.  This local copy of
> the database is synchronized with the server once network becomes
> available again.
> question is whether or not such set up is possible with PostgreSQL?

The set up is possible, though how you would implement it would depend
on several factors:

1) What OS and versions are you using?

2) Are you working directly with the database or through an application?

3) What programming languages are you using?

There is also the option of using Sqlite(https://sqlite.org/index.html)
for your 'local' databases and then syncing them to Postgres.


>
> Why am I interested in PostrgreSQL?  First, my work has made me aware of
> how precious open source tools are.  Our main tools for data analysis
> are open source.  Commercial equivalents are black boxes which we try to
> avoid in the name of science reproducibility and transparency.
>   Secondly, the commercial software we are currently using is apparently
> based on PostgreSQL, so I am hoping that using PostgreSQL will make
> migration less painful.
>
> Thank you in advance,
>
> Judith


--
Adrian Klaver
[hidden email]


Reply | Threaded
Open this post in threaded view
|

Re: Question from someone who is not trained in computer sciences

Judith Lacoste
Hi,

I am very grateful for all your prompt replies, that’s fantastic since I got an answer to my question.  I am going to digest all of it, and then decide what to do next.  Once again, thanks a lot!

Best regards,

Judith


On 2019-Sep-05, at 19:08, Sam Gendler <[hidden email]> wrote:

If I was in a hurry to implement this, and I had a userbase that wasn't very experienced with managing relational databases, I'd write some code to automatically and periodically build a docker image with the latest data in it (however often is sufficient to meet your needs), and then I'd set up a 1-line scheduled command on the laptops that would pull the latest docker image to the user's laptop. Then I'd give them a script that runs the docker container locally, and give them a client that knows how to connect to it.  Assuming it is a read-only db when you aren't connected, I could automate all of that in just a few hours in most environments, and the changes that would be required on the individual laptops would be minimal.

If you need to be able to write to the db when disconnected, and pull those writes into the central db instance when connected, that's a tougher problem to solve which is more suited to some of the earlier suggestions.  But if you only need to read when remote and just want something that works, is easy to put together, and can likely be built by an outside consultant for minimal expense and even less ongoing support and maintenance, I would just pay someone to read rthat first paragraph and set it up for me and call it good. Any of the suggested solutions is going to require a fair amount of administrative competence to really put together, so going for one that shouldn't require much maintenance to keep synchronized is your best bet. 

On Thu, Sep 5, 2019 at 3:43 PM Adrian Klaver <[hidden email]> wrote:
On 9/5/19 2:00 PM, Judith Lacoste wrote:
> Hi,
>
> I think PostgreSQL is the solution for my needs, but I am not a
> programmer/coder.  If I can confirm PostgreSQL does what I need, I will
> have to hire someone to assist, I am willing to give the effort to learn
> myself but it may be difficult, my specialities are biology and
> microscopy.  Or perhaps the use of PostgreSQL is restricted to people
> highly trained in computer sciences?

No, I am biologist and I learned Postgres/database management. It is
about organizing things and that is a commonality with biology.


> I have been looking around a lot through the PostgreSQL website,
> searching the archives, and I even contacted PostgreSQL people locally
> but I still don’t have a clear answer to my first question.  So I am
> posting it here with the hope to move on with PostgreSQL, or abandon the
> project.

This would be the list to talk to.


> I plan to install the database on a server in the office. Me and my four
> colleagues will occasionally connect to this database when we are
> working in other locations (usually hospitals or universities). In such
> remote locations, we often do not have internet/network, yet we still
> need to access the database.  Currently, we use a system where a copy of
> the database lives on each of our laptops.  We can access all the
> information in the database despite being offline.  This local copy of
> the database is synchronized with the server once network becomes
> available again.
> question is whether or not such set up is possible with PostgreSQL?

The set up is possible, though how you would implement it would depend
on several factors:

1) What OS and versions are you using?

2) Are you working directly with the database or through an application?

3) What programming languages are you using?

There is also the option of using Sqlite(https://sqlite.org/index.html)
for your 'local' databases and then syncing them to Postgres.


>
> Why am I interested in PostrgreSQL?  First, my work has made me aware of
> how precious open source tools are.  Our main tools for data analysis
> are open source.  Commercial equivalents are black boxes which we try to
> avoid in the name of science reproducibility and transparency.
>   Secondly, the commercial software we are currently using is apparently
> based on PostgreSQL, so I am hoping that using PostgreSQL will make
> migration less painful.
>
> Thank you in advance,
>
> Judith


--
Adrian Klaver
[hidden email]