SQL Part 11

Htoo Latt
3 min readMar 29, 2022

In the previous blog post, we explore the dataset that we will be using for the JOINS. We also ended with a simple JOIN statement that connected the movies table with the links table using the movie’s id as the connector.

Using Aliases

When writing SQL queries, sometimes you might want to create an alias for a table so that you don’t have to write on the table names again.

SELECT *
FROM movies
JOIN ratings users
ON movies.movieId = users.movieId

In the above query, I gave an alias of users to the rating table. Although this doesn’t seem helpful in this case, when the table name is much longer, an alias can save a lot of time and also make the query look cleaner. For example, in the query below we will rename a table with a long title.

SELECT a.titles,
a.descriptions
b.release_date
FROM movies_database_from_imdb a
JOIN second_movie_database_from_imdb b
ON a.movie_id = b.movie_id
ORDER BY 3 DESC

In the above query, we have two different tables with rather long names. We gave the two tables the aliases of a and b which are repeated a number of times in the query. Let’s compare the above query to an identical query where aliases are not used.

SELECT movies_database_from_imdb.titles,
movies_database_from_imdb.descriptions
second_movie_database_from_imdb .release_date
FROM movies_database_from_imdb
JOIN second_movie_database_from_imdb
ON movies_database_from_imdb.movie_id =
second_movie_database_from_imdb.movie_id
ORDER BY 3 DESC

See how much cleaner the previous query looks. Giving an alias to a table or column can be a lifesaver when working with data that are not properly named.

Now let’s return to the JOIN ON statement that we were working on before and try to do an exercise on it. For this exercise let’s try to write a query that returns the movie title, the number of ratings given to it, and the average rating for the movie.

Let’s think through the problem logically. First, we know we need the movies and rating tables, therefore a JOIN statement will have to be used. Next, we know that the movie title column will not contain repeating values in the table we queried. Therefore the data has to be grouped by the title. Next, we will have to use AVG and COUNT aggregate functions on the ratings to get our answers.

SELECT movies.title,
COUNT(*),
AVG(rating)
FROM movies
JOIN ratings
ON movies.movieId = ratings.movieId
GROUP BY 1
ORDER BY 2 DESC, 3 DESC

In the above statement, I selected only three columns, two of which are aggregate functions. First is the movie title from the movies table, then the COUNT function used on all rows, and finally the AVG function which is used on the rating column which is located in the rating table. Most importantly we must not forget to group the table by the title. To make the resulting table easier to read, I ordered the table by the number of ratings given to the movies first and then the average rating in descending order.

--

--