In my previous post I wanted to count the number of gigs that have or will be staged at a venue. But if a venue has no gigs associated it didn't appear in the results. Well to fix this I need to join the Venues table making sure that every venue appears in the output (even if there are no gigs there):
SELECT venues.name, COUNT(gigs.venueid) FROM venues LEFT OUTER JOIN gigs ON gigs.venueid = venues.id GROUP BY venues.name ORDER BY venues.name
The JOIN joins venues and gigs where gigs.venueid equals venues.id but the LEFT OUTER statement makes sure that all rows in the left table (venues) appear in the output table. There is an excellent description of how joins work in wikipedia of course. Any row in the left table (venues) that doesn't 'join' anything in the right tables (gigs) is added with NULL values for right table columns. We don't really care about this because we are only SELECTing the venue name and the count of gigs that use that venue (so we don't need any potentially NULL column entries)
Anyway, how do we get this in to SQLAlchemy:
select([Venue.c.name, func.count(Gig.c.venueid)],
from_obj=[venues_table.outerjoin(gig_table)],
group_by=[Venue.c.name, Venue.c.id],
order_by=[Venue.c.name]).execute().fetchall()
I learnt something pretty cool while working this out. You can ask SQLAlchemy to printout what query it is going to execute:
print select([Venue.c.name, func.count(Gig.c.venueid)],
from_obj=[venues_table.outerjoin(gig_table)],
group_by=[Venue.c.name, Venue.c.id],
order_by=[Venue.c.name])
and you get
SELECT venues.name, count(gigs.venueid) FROM venues LEFT OUTER JOIN gigs ON venues.id = gigs.venueid GROUP BY venues.name, venues.id ORDER BY venues.name
which I think you can agree is basically identical to the required statement. Using this feature I could test out SQLAlchemy syntax and see if it was getting close to the raw SQL. This helped me a lot.
I think the SQLAlchemy syntax here is particularly intuitive. The keyword parameter from_obj is particularly pleasing as it really describes what I am doing in my head. I'm taking the venue name and it's count from the set 'venues joined with gigs'. SQLAlchemy can easily work out that venues are joined to gigs by the foreign key in each gig row which referes to a venue. So there is no need for me to specify HOW to join (the ON clause in SQL).
