You can access a significantly larger sample of the platform's content for free by logging in with your Gmail account. Sign in now to explore.

Assume the following schema:

Assume that we want to generate the raw data for creating a histogram of the number of songs per genre. To be specific, we are looking for the following output:n | freq |
---|---|

1 | 51 |

2 | 1 |

3 | 1 |

4 | 0 |

5 | 0 |

You can read the above as follows:

- There are 51 music genres that have 1 song
- There is 1 music genre that has 2 songs
- There is 1 music genre that has 3 songs
- There are
**zero**music genres that have 4 songs - etc.

Limit your response to five rows.

We will use a recursive CTE to generate all the numbers between the minimum number of songs per genre and the max. Then, we can left join and count to get the result:

```
WITH recursive songs_per_genre AS
(
SELECT genre, COUNT(DISTINCT trackid) as songs
FROM songs
WHERE genre != '-1'
GROUP BY genre
), freq_table as
(
SELECT (SELECT MIN(songs) FROM songs_per_genre) AS n
UNION ALL
SELECT n+1 FROM freq_table WHERE n < (SELECT MAX(songs) FROM songs_per_genre)
)
SELECT n, COUNT(songs) as freq
FROM freq_table LEFT JOIN songs_per_genre
ON n=songs
GROUP BY n
LIMIT 5;
```

n | freq |
---|---|

1 | 51 |

2 | 1 |

3 | 1 |

4 | 0 |

5 | 0 |

Recursive CTE, Join

- Artist maxranks Easy (Join)
- Songs that did not enter the charts or entered high Medium (Subquery, Join)
- Expensive house songs I Medium (Subquery, CTE, Join)
- Artists with more songs than others Hard (Subquery, CTE, Join, Window functions)
- Songs in charts with greater durations Hard (Subquery, CTE, Join, Window functions)