Querying PostgreSQL / PostGIS Databases in Python

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

Querying PostgreSQL / PostGIS Databases in Python

Shaozhong SHI
Hi,

What is the advantage of querying in Python?

Has anyone got much experience?

What not just use standard query?

What is the rationale for querying in Python?

Would the performance be better?

Regards,

Shao
Reply | Threaded
Open this post in threaded view
|

Re: Querying PostgreSQL / PostGIS Databases in Python

Allan Kamau-3
You may write stored procedures using PL/pgSQL,alternatively you may write your queries in python.
You may use psycopg2 to query the DB from Python.
You may have a mix of the two, it will depend on your preference.
Ideally you may not want your users running queries against the data by connecting to the database directly using database tools psql or pgadmin3 or pgadmin4.
This means that having a database access application written in Python to restrict the and encapsulate data access may be advisable.
In this case you may place all the DML statements in python and execute them or you may have much of the data access logic written into several PL/pgSQL functions, then call these functions via Python.


Below is python code illustrating the use of psycopg2. This code has not been run so expect some errors.
Here I am executing an SQL query on a table, you may modify this code to execute a PL/pgSQL function.


import psycopg2;
from psycopg2 import sql;
import psycopg2.extras;
from psycopg2.extensions import AsIs;


db__pg_conn__str="host='host_name_of_computer_hosting_pg_db_service' port=5432 dbname='your_pg_db_name' user='your_username' password='user_password'";
db__pg_conn=psycopg2.connect(db__pg_conn__str);

query_table(
    dataset_name
    ,some_value_2
    ,db__pg_conn
);


def query_table(
    dataset_name
    ,some_value_2
    ,db__pg_conn
):
    """
    """;
    table__id=-1;
    _sql_query1a="""
        SELECT {}::TEXT AS some_string,a.id AS table__id,a.*,clock_timestamp() FROM {}.{} a WHERE a.dataset_name={}::TEXT
        ;
    """;
    sqlSQL1a=None;
    sqlSQL1a=sql.SQL(_sql_query1a);
    pg_cursor1a=db__pg_conn.cursor(cursor_factory=psycopg2.extras.DictCursor);
    _sql_query1a_processed=pg_cursor1a.mogrify(
            sqlSQL1a.format(
                sql.SQL(',').join(map(psycopg2.sql.Placeholder,['some_value_1']))
                ,psycopg2.sql.Identifier("my_schema.my_table".split(".")[0])
                ,psycopg2.sql.Identifier("my_schema.my_table".split(".")[1])
                ,sql.SQL(',').join(map(psycopg2.sql.Placeholder,['some_value_2']))
            )
            ,{
                'some_value_1':'ABC'
                ,'some_value_2':dataset_name
            }
    );
    _sql_query1a_processed=_sql_query1a_processed.decode().replace("\\'","'");
   
    #LOGGER.info(" '{0}', -------------- _sql_query1a_processed is:'{1}'.".format(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')[:-1],_sql_query1a_processed));
    pg_cursor1a.execute(
            _sql_query1a_processed
    );
    rowcount1a=pg_cursor1a.rowcount;
    rows=None;
    rows=pg_cursor1a.fetchall();
    row_cnt=0;
    for row in rows:
        pass;
        row_cnt+=1;
        table__id=row["table__id"];//do something with table__id
    //do something with rows.
    rows=None;
    db__pg_conn.commit();
    sqlSQL1a=None;
    pg_cursor1a=None;



On Fri, Jul 31, 2020 at 12:30 PM Shaozhong SHI <[hidden email]> wrote:
Hi,

What is the advantage of querying in Python?

Has anyone got much experience?

What not just use standard query?

What is the rationale for querying in Python?

Would the performance be better?

Regards,

Shao
Reply | Threaded
Open this post in threaded view
|

Re: Querying PostgreSQL / PostGIS Databases in Python

Tony Shelver
We are using python on top of Postgresql / PostGIS, for a vehicle tracking system.
THis is quite data intensive, and we have some 'interesting' GIS queries where we see where a vehicle or fleet has stopped within specific areas, where it has traveled, any incidents along the way and much more.

Postgresql functions are used almost exclusively for DML and queries, as running individual SQL statements that will return all the records required to process a complex report was just too slow.  For even simple stuff, we are an order of magnitude faster than a similar system using the same data written in Java against Posgresql, but using the java ORM for queries.

All geographic functions are processed in PostGIS, other than the odd reverse geocoding call which is performed against Google Maps or similar

YMMV.

On Fri, 31 Jul 2020 at 12:50, Allan Kamau <[hidden email]> wrote:
You may write stored procedures using PL/pgSQL,alternatively you may write your queries in python.
You may use psycopg2 to query the DB from Python.
You may have a mix of the two, it will depend on your preference.
Ideally you may not want your users running queries against the data by connecting to the database directly using database tools psql or pgadmin3 or pgadmin4.
This means that having a database access application written in Python to restrict the and encapsulate data access may be advisable.
In this case you may place all the DML statements in python and execute them or you may have much of the data access logic written into several PL/pgSQL functions, then call these functions via Python.


Below is python code illustrating the use of psycopg2. This code has not been run so expect some errors.
Here I am executing an SQL query on a table, you may modify this code to execute a PL/pgSQL function.


import psycopg2;
from psycopg2 import sql;
import psycopg2.extras;
from psycopg2.extensions import AsIs;


db__pg_conn__str="host='host_name_of_computer_hosting_pg_db_service' port=5432 dbname='your_pg_db_name' user='your_username' password='user_password'";
db__pg_conn=psycopg2.connect(db__pg_conn__str);

query_table(
    dataset_name
    ,some_value_2
    ,db__pg_conn
);


def query_table(
    dataset_name
    ,some_value_2
    ,db__pg_conn
):
    """
    """;
    table__id=-1;
    _sql_query1a="""
        SELECT {}::TEXT AS some_string,a.id AS table__id,a.*,clock_timestamp() FROM {}.{} a WHERE a.dataset_name={}::TEXT
        ;
    """;
    sqlSQL1a=None;
    sqlSQL1a=sql.SQL(_sql_query1a);
    pg_cursor1a=db__pg_conn.cursor(cursor_factory=psycopg2.extras.DictCursor);
    _sql_query1a_processed=pg_cursor1a.mogrify(
            sqlSQL1a.format(
                sql.SQL(',').join(map(psycopg2.sql.Placeholder,['some_value_1']))
                ,psycopg2.sql.Identifier("my_schema.my_table".split(".")[0])
                ,psycopg2.sql.Identifier("my_schema.my_table".split(".")[1])
                ,sql.SQL(',').join(map(psycopg2.sql.Placeholder,['some_value_2']))
            )
            ,{
                'some_value_1':'ABC'
                ,'some_value_2':dataset_name
            }
    );
    _sql_query1a_processed=_sql_query1a_processed.decode().replace("\\'","'");
   
    #LOGGER.info(" '{0}', -------------- _sql_query1a_processed is:'{1}'.".format(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')[:-1],_sql_query1a_processed));
    pg_cursor1a.execute(
            _sql_query1a_processed
    );
    rowcount1a=pg_cursor1a.rowcount;
    rows=None;
    rows=pg_cursor1a.fetchall();
    row_cnt=0;
    for row in rows:
        pass;
        row_cnt+=1;
        table__id=row["table__id"];//do something with table__id
    //do something with rows.
    rows=None;
    db__pg_conn.commit();
    sqlSQL1a=None;
    pg_cursor1a=None;



On Fri, Jul 31, 2020 at 12:30 PM Shaozhong SHI <[hidden email]> wrote:
Hi,

What is the advantage of querying in Python?

Has anyone got much experience?

What not just use standard query?

What is the rationale for querying in Python?

Would the performance be better?

Regards,

Shao
Reply | Threaded
Open this post in threaded view
|

Re: Querying PostgreSQL / PostGIS Databases in Python

Tony Shelver
We use Postgresql, python and PostGIS.
Nearly all our data access is via Postgres functions, both DML and queries.  Our system is a vehicle GPS tracking and business function analysis system, with quite a lot of data.

Most of our reports require accessing hundreds to 100s of thousands of records for each vehicle, and all geo transforms (coordinates, areas / polygons, routes / lines and so on) are done in  PostGIS, and just the reporting result set is returned to Python for format, some final calculations and 

Our queries run an order of magnitude faster than the same data in a tracking system written in java on top of Postgres, where the queries are done via the java ORM system and the GIS processing is done in Java.

This is even more true where the python client is situated on physically separate servers to the database.

On Tue, 4 Aug 2020 at 14:09, Tony Shelver <[hidden email]> wrote:
We are using python on top of Postgresql / PostGIS, for a vehicle tracking system.
THis is quite data intensive, and we have some 'interesting' GIS queries where we see where a vehicle or fleet has stopped within specific areas, where it has traveled, any incidents along the way and much more.

Postgresql functions are used almost exclusively for DML and queries, as running individual SQL statements that will return all the records required to process a complex report was just too slow.  For even simple stuff, we are an order of magnitude faster than a similar system using the same data written in Java against Posgresql, but using the java ORM for queries.

All geographic functions are processed in PostGIS, other than the odd reverse geocoding call which is performed against Google Maps or similar

YMMV.

On Fri, 31 Jul 2020 at 12:50, Allan Kamau <[hidden email]> wrote:
You may write stored procedures using PL/pgSQL,alternatively you may write your queries in python.
You may use psycopg2 to query the DB from Python.
You may have a mix of the two, it will depend on your preference.
Ideally you may not want your users running queries against the data by connecting to the database directly using database tools psql or pgadmin3 or pgadmin4.
This means that having a database access application written in Python to restrict the and encapsulate data access may be advisable.
In this case you may place all the DML statements in python and execute them or you may have much of the data access logic written into several PL/pgSQL functions, then call these functions via Python.


Below is python code illustrating the use of psycopg2. This code has not been run so expect some errors.
Here I am executing an SQL query on a table, you may modify this code to execute a PL/pgSQL function.


import psycopg2;
from psycopg2 import sql;
import psycopg2.extras;
from psycopg2.extensions import AsIs;


db__pg_conn__str="host='host_name_of_computer_hosting_pg_db_service' port=5432 dbname='your_pg_db_name' user='your_username' password='user_password'";
db__pg_conn=psycopg2.connect(db__pg_conn__str);

query_table(
    dataset_name
    ,some_value_2
    ,db__pg_conn
);


def query_table(
    dataset_name
    ,some_value_2
    ,db__pg_conn
):
    """
    """;
    table__id=-1;
    _sql_query1a="""
        SELECT {}::TEXT AS some_string,a.id AS table__id,a.*,clock_timestamp() FROM {}.{} a WHERE a.dataset_name={}::TEXT
        ;
    """;
    sqlSQL1a=None;
    sqlSQL1a=sql.SQL(_sql_query1a);
    pg_cursor1a=db__pg_conn.cursor(cursor_factory=psycopg2.extras.DictCursor);
    _sql_query1a_processed=pg_cursor1a.mogrify(
            sqlSQL1a.format(
                sql.SQL(',').join(map(psycopg2.sql.Placeholder,['some_value_1']))
                ,psycopg2.sql.Identifier("my_schema.my_table".split(".")[0])
                ,psycopg2.sql.Identifier("my_schema.my_table".split(".")[1])
                ,sql.SQL(',').join(map(psycopg2.sql.Placeholder,['some_value_2']))
            )
            ,{
                'some_value_1':'ABC'
                ,'some_value_2':dataset_name
            }
    );
    _sql_query1a_processed=_sql_query1a_processed.decode().replace("\\'","'");
   
    #LOGGER.info(" '{0}', -------------- _sql_query1a_processed is:'{1}'.".format(datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S.%f')[:-1],_sql_query1a_processed));
    pg_cursor1a.execute(
            _sql_query1a_processed
    );
    rowcount1a=pg_cursor1a.rowcount;
    rows=None;
    rows=pg_cursor1a.fetchall();
    row_cnt=0;
    for row in rows:
        pass;
        row_cnt+=1;
        table__id=row["table__id"];//do something with table__id
    //do something with rows.
    rows=None;
    db__pg_conn.commit();
    sqlSQL1a=None;
    pg_cursor1a=None;



On Fri, Jul 31, 2020 at 12:30 PM Shaozhong SHI <[hidden email]> wrote:
Hi,

What is the advantage of querying in Python?

Has anyone got much experience?

What not just use standard query?

What is the rationale for querying in Python?

Would the performance be better?

Regards,

Shao