SQL help?

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
4 messages Options
Reply | Threaded
Open this post in threaded view
|

SQL help?

Dr. Drexl Spivey
Trying to do something very easy, but can't quite figure out the logic.
Typing out the email, and see if I can't figure it out just typing it
out.

Have two tables:

petitions
id = (int)primary key
name = text description

and the following

signed_petitions
signed_id = references petitions.id (Foreign key)
joined = boolean

my goal is to list those petitions that have the most signatures, so I
have to get a count of joined=true, and list the names.

What confuses me on a fundamental level, is the usage of the Foreign
key, reference. But, here is my best attempt.

Ok, this is a lot harder than I thought, because the boolean is not
actually a number. So I guess I will have to use a rank?

SELECT
    petitions.id
    petitions.name
    RANK() OVER (PARTITION BY signed_petitions.joined) AS total_signed
WHERE
    signed_petitions.joined = "TRUE"
ORDER BY total_signed DESC
LIMIT 10



I am not sure how the rank() over (partition by... works), this is my
first attempt.

Would this give me the ten highest signed petitions?


Reply | Threaded
Open this post in threaded view
|

Re: SQL help?

Saurabh Agrawal-2
SELECT signed_id, (SELECT name FROM petitions WHERE id=signed_id), count(signed_petitions.id) AS number
FROM signed_petitions
WHERE joined
GROUP BY signed_id
ORDER BY number;

Have a pk in the signed_petitions table and count those?

On Wed, May 22, 2019 at 1:48 PM Dr. Drexl Spivey <[hidden email]> wrote:
Trying to do something very easy, but can't quite figure out the logic.
Typing out the email, and see if I can't figure it out just typing it
out.

Have two tables:

petitions
id = (int)primary key
name = text description

and the following

signed_petitions
signed_id = references petitions.id (Foreign key)
joined = boolean

my goal is to list those petitions that have the most signatures, so I
have to get a count of joined=true, and list the names.

What confuses me on a fundamental level, is the usage of the Foreign
key, reference. But, here is my best attempt.

Ok, this is a lot harder than I thought, because the boolean is not
actually a number. So I guess I will have to use a rank?

SELECT
    petitions.id
    petitions.name
    RANK() OVER (PARTITION BY signed_petitions.joined) AS total_signed
WHERE
    signed_petitions.joined = "TRUE"
ORDER BY total_signed DESC
LIMIT 10



I am not sure how the rank() over (partition by... works), this is my
first attempt.

Would this give me the ten highest signed petitions?


Reply | Threaded
Open this post in threaded view
|

Re: SQL help?

David G Johnston
In reply to this post by Dr. Drexl Spivey
On Wed, May 22, 2019 at 1:19 AM Dr. Drexl Spivey <[hidden email]> wrote:
Trying to do something very easy, but can't quite figure out the logic.
Typing out the email, and see if I can't figure it out just typing it
out.

Have two tables:

petitions
id = (int)primary key
name = text description

and the following

signed_petitions
signed_id = references petitions.id (Foreign key)
joined = boolean

my goal is to list those petitions that have the most signatures, so I
have to get a count of joined=true, and list the names.

What confuses me on a fundamental level, is the usage of the Foreign
key, reference. But, here is my best attempt.

Ok, this is a lot harder than I thought, because the boolean is not
actually a number. So I guess I will have to use a rank?

 
SELECT
    petitions.id
    petitions.name
    RANK() OVER (PARTITION BY signed_petitions.joined) AS total_signed
WHERE
    signed_petitions.joined = "TRUE"
ORDER BY total_signed DESC
LIMIT 10

This query has no FROM clause and "TRUE" in double quotes would be considered a column reference...in short it is bogus.  It helps considerably to provide a self-contained minimal example with some data (say to compute top 2...) and a working attempt or even a manual expected result.

I have no idea why "joined" is important.

The rank() window function is much more reliable when the partition contents are ordered so that assigning the rank is based off of some measure and not just random.  It is also unlikely to be helpful for this query.

You may find the filter aggregate expression to be helpful though.


The count requires nothing more than a simple GROUP BY based query; you can do as Saurabh did an inline a scalar subselect for the name or take the grouped result and join it back to the petition table.

David J.

Reply | Threaded
Open this post in threaded view
|

RE: SQL help?

Vianello, Daniel A
In reply to this post by Dr. Drexl Spivey
A Foreign Key is a mechanism for the database to insure that all of the signed_id values in signed_petitions exist in the id column of the petitions table. (I am ignoring NULL handling for the moment, but look that up later once you feel that you understand foreign keys a little better).  The foreign key relationship also gives you the likely columns that you'll want to use to join the two tables.

Looking at your query below the first thing I notice is the lack of a FROM clause telling the query which tables you are looking at.  Next is that RANK() is likely not the best candidate for answering your question (not that it couldn't be used, but that there is a shorter method for doing so).

I am going to assume that "petitions" is just the list of the different petitions.  For "signed_petitions", I am unclear if this is just a subset of petitions that have at least one signature, or if the "signed_petitions" table shows all of the people who have been shown the petitions and whether or not they have signed it.  I will base the rest of this answer on the latter assumption - that "signed_petitions" includes the list of people who have signed each petition because you referenced wanting to know which petition has the most signatures.  

For the FROM clause, you told us the two tables and the foreign key relationship, so we can put that together as follows (note, this is just the first conceptual step)

SELECT petition.name, signed_petitions.joined
FROM petitions
INNER JOIN signed_petitions
ON petitions.id = signed_petitions.signed_id

This gives one row for every signature for every petition, and doesn't, yet, answer the full question. To get the number of signatures, the easiest function to use is COUNT().  Putting that in, we get this:

SELECT petitions.name,  count(signed_petitions.joined)
FROM petitions
INNER JOIN signed_petitions
ON petitions.id = signed_petitions.signed_id
GROUP BY name

This gives the count of records in signed_petitions for each petition, regardless of whether or not joined=true.  So we can add that WHERE clause back in.  NOTE, for a Boolean column, you don't actually need to put the equal sign, a very useful syntax, but harder for novices to understand.  NOTE2: "true" should not be in double quotes, nor even single quotes.

SELECT petitions.name,  count(signed_petitions.joined)
FROM petitions
INNER JOIN signed_petitions
ON petitions.id = signed_petitions.signed_id
WHERE signed_petitions.joined = true
GROUP BY name

This still ignores the case that some petitions may not have any records in signed_petitions.  We can account for that with a LEFT JOIN instead of INNER JOIN.

SELECT petitions.name,  count(signed_petitions.joined)
FROM petitions
LEFT JOIN signed_petitions
ON petitions.id = signed_petitions.signed_id
WHERE signed_petitions.joined = true
GROUP BY name

And finally, what if we want to know the counts for how many have joined=true AND joined=false all in the same query?

SELECT petitions.name
,  count(signed_petitions.joined) FILTER (WHERE joined = true) as num_joined
,  count(signed_petitions.joined) FILTER (WHERE joined = false) as num_not_joined
FROM petitions
LEFT JOIN signed_petitions
ON petitions.id = signed_petitions.signed_id
GROUP BY name

I hope this, as a tutorial both answered your intended question and gave you, or other future readers, help in conceptually putting these pieces together.

Dan

-----Original Message-----
From: Dr. Drexl Spivey [mailto:[hidden email]]
Sent: Wednesday, May 22, 2019 3:17 AM
To: pgsql-novice <[hidden email]>
Subject: SQL help?

Trying to do something very easy, but can't quite figure out the logic.
Typing out the email, and see if I can't figure it out just typing it
out.

Have two tables:

petitions
id = (int)primary key
name = text description

and the following

signed_petitions
signed_id = references petitions.id (Foreign key)
joined = boolean

my goal is to list those petitions that have the most signatures, so I
have to get a count of joined=true, and list the names.

What confuses me on a fundamental level, is the usage of the Foreign
key, reference. But, here is my best attempt.

Ok, this is a lot harder than I thought, because the boolean is not
actually a number. So I guess I will have to use a rank?

SELECT
    petitions.id
    petitions.name
    RANK() OVER (PARTITION BY signed_petitions.joined) AS total_signed
WHERE
    signed_petitions.joined = "TRUE"
ORDER BY total_signed DESC
LIMIT 10



I am not sure how the rank() over (partition by... works), this is my
first attempt.

Would this give me the ten highest signed petitions?


E-MAIL CONFIDENTIALITY NOTICE:
The contents of this e-mail message and any attachments are intended solely for the addressee(s) and may contain confidential and/or legally privileged information. If you are not the intended recipient of this message or if this message has been addressed to you in error, please immediately alert the sender by reply e-mail and then delete this message and any attachments. If you are not the intended recipient, you are notified that any use, dissemination, distribution, copying, or storage of this message or any attachment is strictly prohibited.