Spotify Web Player Failure due to Crash of Connected Device

2020-05-08

On May 6, several iOS apps including Spotify crashed due to a bug in Facebook's iOS SDK. These apps embed the SDK to implement Facebook login or other integration features. The apps attempted to connect to Facebook's servers whether or not the user actually used Facebook with them, and crashed with an NSInvalidArgumentException. The crashes were triggered when Facebook updated its servers. Facebook apparently resolved the issue by reverting those server-side changes, according to a message that was later deleted. The deleted message was followed by a general statement that didn't mention how the issue was resolved.

An interesting side effect was that Spotify's web player (https://open.spotify.com/) couldn't play songs anymore, even after my iOS app had recovered. Logging out and back in, as well as hard refreshes in the browser, didn't resolve the issue. Looking at my browser's developer console, I noticed that POST requests to the https://guc-spclient.spotify.com/track-playback/v1/devices endpoint timed out with a 504 response from the load balancer. On success, that endpoint seems to return a list of the connected Spotify devices with information per device such as the device type (e.g., computer, phone) and device capabilities (e.g., volume control support, video playback support). The 504 response was possibly related to one of my connected devices (the iOS device) having been left in a bad state by the earlier crash. I was able to fix the issue by going to the Spotify account settings and selecting "SIGN OUT EVERYWHERE". After I logged back in, the web player was able to play songs again.

Spotify could prevent similar issues in the future by implementing more robust error handling on the frontend. For example, when requests to the aforementioned devices endpoint time out, the web player could display a message explaining that connected devices cannot be found at the moment and still allow the user to play songs on the current device. In addition, the web player could offer to sign the user out on other devices in order to disconnect problematic devices. Moreover, a 504 response indicates that the load balancer didn't receive a response in time from the upstream backend server, which points to a bug in the backend that should also be fixed to address the root cause.
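
The frontend fallback described above could look roughly like the following TypeScript sketch. It is only an illustration under stated assumptions, not Spotify's actual code: the names DeviceLookup, PlayerState, classifyStatus, toPlayerState and lookUpDevices are all hypothetical, the 5-second client-side timeout is an arbitrary choice, and the request itself is passed in as a function so the sketch does not depend on how the web player really calls the devices endpoint.

```typescript
// Hypothetical result of asking the backend for connected devices.
type DeviceLookup =
  | { kind: "ok" }
  | { kind: "unavailable"; reason: string };

// Hypothetical UI state derived from the lookup result.
type PlayerState = {
  canPlayLocally: boolean;
  message: string | null;
  offerSignOutEverywhere: boolean;
};

// Map an HTTP status to a lookup result.
// `null` stands for a client-side timeout or a network error.
function classifyStatus(status: number | null): DeviceLookup {
  if (status === null) {
    return { kind: "unavailable", reason: "timeout or network error" };
  }
  if (status >= 200 && status < 300) {
    return { kind: "ok" };
  }
  // Includes the 504 seen in this incident.
  return { kind: "unavailable", reason: `HTTP ${status}` };
}

// Decide what the player shows. Local playback stays enabled in
// both cases; a failed lookup only adds a notice and a sign-out offer.
function toPlayerState(lookup: DeviceLookup): PlayerState {
  if (lookup.kind === "ok") {
    return { canPlayLocally: true, message: null, offerSignOutEverywhere: false };
  }
  return {
    canPlayLocally: true,
    message:
      `Connected devices cannot be found at the moment (${lookup.reason}). ` +
      "You can still play on this device.",
    offerSignOutEverywhere: true,
  };
}

// Run the injected request, but give up after `timeoutMs` instead of
// waiting for the load balancer to answer.
async function lookUpDevices(
  request: () => Promise<{ status: number }>,
  timeoutMs = 5000, // assumed value, not taken from Spotify
): Promise<DeviceLookup> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), timeoutMs);
  });
  try {
    const res = await Promise.race([request(), timeout]);
    return classifyStatus(res === null ? null : res.status);
  } catch {
    // The request itself failed (e.g. network error).
    return classifyStatus(null);
  } finally {
    clearTimeout(timer);
  }
}
```

The design choice here is that a failed device lookup degrades the experience rather than blocking it: playback on the current device never depends on the devices request succeeding. In a browser, the `request` argument would typically wrap a `fetch` call to the devices endpoint.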