Saturday, January 27, 2018

Spotify Top 200 in mySQL

I do a lot of data analysis lately, and I try to find answers to questions through data for my companies pressing questions. Let's look at the past year of 2017 and answer questions for people who like music.



artist is the artist name
track is the artist's track name
list_date is which chart date the artist show up on the top200
streams is the number of plays following spotify specific rules

Let's look at the data set

select count(*) from spotify.top200 WHERE country='us' and list_date >= '2017-01-01' and list_date < '2018-01-01';
+----------+
| count(*) |
+----------+
|    74142 |
+----------+
1 row in set (0.04 sec)


How many artists made it in the top200 for the United States?

mysql> select count(DISTINCT(artist)) from spotify.top200 WHERE country='us' and list_date >= '2017-01-01' and list_date < '2018-01-01';
+-------------------------+
| count(DISTINCT(artist)) |
+-------------------------+
|                     527 |
+-------------------------+
1 row in set (0.09 sec)

Wow, it's really hard to be a musician. Only 527 broke the top200.

How many tracks in 2017 broke the top200?

 select count(DISTINCT(track)) from spotify.top200 WHERE country='us' and list_date >= '2017-01-01' and list_date < '2018-01-01';
+------------------------+
| count(DISTINCT(track)) |
+------------------------+
|                   1682 |
+------------------------+

For the entire year, 1682 songs defined the united states listing habits for the most part.


Who showed up the most in the top200 for 2017?

mysql> select artist,count(*) AS CNT from spotify.top200 WHERE country='us' and list_date >= '2017-01-01' and list_date < '2018-01-01' group by 1 order by 2 DESC LIMIT 10;
+------------------+------+
| artist           | CNT  |
+------------------+------+
| Drake            | 3204 |
| Lil Uzi Vert     | 1891 |
| Kendrick Lamar   | 1874 |
| Post Malone      | 1776 |
| Ed Sheeran       | 1581 |
| The Weeknd       | 1566 |
| Migos            | 1550 |
| Future           | 1536 |
| The Chainsmokers | 1503 |
| Kodak Black      | 1318 |
+------------------+------+
10 rows in set (0.16 sec)

Drake killed it, but Lil Uzi Vert is the star of the year, IMHO. Drake has a pedigree while Lil Uzi just started running.

Also from these artists I can tell HIP HOP dominated us charts; Let's verify this assumption.

mysql> select artist,SUM(streams) AS CNT from spotify.top200 WHERE country='us' and list_date >= '2017-01-01' and list_date < '2018-01-01' group by 1 order by 2 DESC LIMIT 10;
+------------------+------------+
| artist           | CNT        |
+------------------+------------+
| Drake            | 1253877919 |
| Kendrick Lamar   | 1161624639 |
| Post Malone      |  954546910 |
| Lil Uzi Vert     |  818889040 |
| Ed Sheeran       |  714523363 |
| Migos            |  682008192 |
| Future           |  574005011 |
| The Chainsmokers |  557708920 |
| 21 Savage        |  472043174 |
| Khalid           |  463878924 |
+------------------+------------+
10 rows in set (0.48 sec)


Yup hip hop dominated the top 10 steams.

What about tracks? What are the top 10 tracks by streams?

 select track,SUM(streams) AS CNT from spotify.top200 WHERE country='us' and list_date >= '2017-01-01' and list_date < '2018-01-01' group by 1 order by 2 DESC LIMIT 10;
+-------------------+-----------+
| track             | CNT       |
+-------------------+-----------+
| HUMBLE.           | 340136186 |
| XO TOUR Llif3     | 314758565 |
| Congratulations   | 283551832 |
| Shape of You      | 280898054 |
| Unforgettable     | 261753940 |
| Mask Off          | 242524530 |
| Despacito - Remix | 241370570 |
| rockstar          | 225517132 |
| Location          | 224879215 |
| 1-800-273-8255    | 219689749 |
+-------------------+-----------+
10 rows in set (0.43 sec)

Which tracks and artists had the most time in the top200?

 select artist,track,count(*) AS CNT from spotify.top200 WHERE country='us' and list_date >= '2017-01-01' and list_date < '2018-01-01' group by 2 order by 3 DESC LIMIT 10;
+------------------+-------------------------------------+-----+
| artist           | track                               | CNT |
+------------------+-------------------------------------+-----+
| D.R.A.M.         | Broccoli (feat. Lil Yachty)         | 485 |
| French Montana   | Unforgettable                       | 417 |
| PnB Rock         | Selfish                             | 394 |
| Travis Scott     | goosebumps                          | 365 |
| Post Malone      | Go Flex                             | 365 |
| Childish Gambino | Redbone                             | 365 |
| Post Malone      | Congratulations                     | 365 |
| Post Malone      | White Iverson                       | 365 |
| Migos            | Bad and Boujee (feat. Lil Uzi Vert) | 364 |
| Bruno Mars       | That's What I Like                  | 364 |
+------------------+-------------------------------------+-----+
10 rows in set (0.20 sec)

Also from this data I can tell that Post Malone had a fantastic year!


So, more questions can be answered, like who held the number 1 position on the top200 the most?

select artist,track,count(*) AS CNT from spotify.top200 WHERE country='us' and pos=1 and list_date >= '2017-01-01' and list_date < '2018-01-01' group by 2 order by 3 DESC LIMIT 10;
+----------------+-------------------------------------+-----+
| artist         | track                               | CNT |
+----------------+-------------------------------------+-----+
| Post Malone    | rockstar                            | 105 |
| Kendrick Lamar | HUMBLE.                             |  67 |
| Ed Sheeran     | Shape of You                        |  48 |
| Luis Fonsi     | Despacito - Remix                   |  47 |
| Migos          | Bad and Boujee (feat. Lil Uzi Vert) |  29 |
| 21 Savage      | Bank Account                        |  20 |
| Drake          | Passionfruit                        |  12 |
| Logic          | 1-800-273-8255                      |  10 |
| Taylor Swift   | Look What You Made Me Do            |  10 |
| French Montana | Unforgettable                       |   7 |
+----------------+-------------------------------------+-----+
10 rows in set (0.26 sec)


Wow can see hear that Post Malone is the star!

In summary, getting public data sources and doing simple queries can give a clearer insight into data to answer some pressing questions one may have.

With the schema above what questions would you answer?

No comments: