Session by Oscar Celma (Pandora)
March 11, 2017
Pandora’s mission: be the effortless source of personalized music enjoyment and discovery
Some mind-blowing statistics at Pandora
75M monthly active users
24 hours of listening per month
12B stations created
98% of artists spinning every month
How does Pandora decide what to play next?
Content based algorithm: music genome data
Collective intelligence: mining user behavior
Personalized filtering: your thumbs up and skips
Ensemble recommender: piece together output from 75 different algorithms
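A minimal sketch of how an ensemble might combine the outputs of several recommenders, assuming each algorithm returns a score per track and the ensemble takes a weighted sum. The talk did not describe how Pandora actually combines its algorithms; the function name, algorithm names, weights, and scores below are all hypothetical.

```python
from collections import defaultdict


def ensemble_rank(candidate_scores, weights):
    """Rank tracks by the weighted sum of per-algorithm scores.

    candidate_scores: {algorithm_name: {track_id: score}}
    weights:          {algorithm_name: weight}; missing algorithms count as 0.
    """
    combined = defaultdict(float)
    for algo, scores in candidate_scores.items():
        w = weights.get(algo, 0.0)
        for track, score in scores.items():
            combined[track] += w * score
    # Highest combined score first; ties broken by track id for determinism.
    return sorted(combined, key=lambda t: (-combined[t], t))


# Hypothetical scores from three of the algorithm families listed above.
scores = {
    "content_based": {"track_a": 0.9, "track_b": 0.4},
    "collaborative": {"track_b": 0.8, "track_c": 0.6},
    "personalized": {"track_a": 0.2, "track_c": 0.9},
}
weights = {"content_based": 0.5, "collaborative": 0.3, "personalized": 0.2}
ranking = ensemble_rank(scores, weights)
```

A weighted sum is only one possible combiner; a production ensemble could just as well learn the weights or pick a different algorithm per context.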
Challenges: balance familiar with unfamiliar
Exploit: play awesome music now. Tomorrow? Who cares. Don’t play music I don’t like.
Explore: play something risky. Learning what to play. Don’t play too many WTFs (“what the freakommendation” – Paul Lamere).
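One common way to frame the exploit/explore trade-off is an epsilon-greedy policy: most of the time play the track with the best known score, and occasionally try something unfamiliar. This is a generic illustration, not a technique the talk attributed to Pandora; `pick_next_track`, `epsilon`, and the sample data are assumptions.

```python
import random


def pick_next_track(familiar, unfamiliar, epsilon=0.1, rng=random):
    """Epsilon-greedy choice between exploiting and exploring.

    familiar:   {track_id: estimated relevance} for tracks we know about
    unfamiliar: list of track ids with no feedback yet
    epsilon:    probability of exploring on any given pick
    """
    if unfamiliar and rng.random() < epsilon:
        # Explore: take a risk on a track we have no data for.
        return "explore", rng.choice(unfamiliar)
    # Exploit: play the track with the highest estimated relevance.
    return "exploit", max(familiar, key=familiar.get)


# Hypothetical listener state.
known = {"track_a": 0.9, "track_b": 0.5}
new = ["track_x", "track_y"]
mode, track = pick_next_track(known, new, epsilon=0.1)
```

Raising `epsilon` plays more risky tracks and learns faster, at the cost of more potential WTF moments.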
Novelty versus relevance
Exploit: low novelty, high relevance
Explore: high novelty, high relevance
Popular: low novelty, low relevance
Risky: high novelty, low relevance
How does Pandora test new ideas?
- Dream up an idea
- Experiment with a small group (1% of users)
- If successful, roll out 6-12 months later
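The 1% experiment group above could be selected with deterministic hash bucketing, so the same user always lands in the same group. This is a sketch under that assumption; the talk did not say how Pandora assigns users, and `in_experiment` and the experiment name are hypothetical.

```python
import hashlib


def in_experiment(user_id, experiment, percent=1.0):
    """Return True if user_id falls in the first `percent` of hash buckets.

    Hashing the experiment name together with the user id gives each
    experiment an independent, stable assignment.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest[:8], 16) % 10000  # buckets 0..9999, 0.01% each
    return bucket < percent * 100


# Hypothetical experiment name; roughly 1% of users should be selected.
selected = [u for u in range(20000) if in_experiment(u, "new_station_mix")]
```

Because the assignment depends only on the ids, no per-user state has to be stored to keep a listener in the same group across sessions.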
Metrics: did it bring new listeners? Did it avoid churn? Did they listen for longer?
Retention: time spent listening, active days
Activity: thumbs, skips, create new stations
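As an illustration, the retention metrics above (time spent listening, active days) could be computed from a session log like this. The log format, a list of `(user_id, day, hours)` tuples, and the function name are assumptions made for the example.

```python
from collections import defaultdict


def retention_metrics(sessions):
    """Aggregate per-user listening hours and distinct active days.

    sessions: iterable of (user_id, day, listened_hours) tuples
    """
    hours = defaultdict(float)
    days = defaultdict(set)
    for user_id, day, listened_hours in sessions:
        hours[user_id] += listened_hours
        days[user_id].add(day)
    return {
        u: {"hours": hours[u], "active_days": len(days[u])}
        for u in hours
    }


# Hypothetical session log.
log = [
    ("u1", "2017-03-01", 1.5),
    ("u1", "2017-03-01", 0.5),
    ("u1", "2017-03-02", 1.0),
    ("u2", "2017-03-01", 2.0),
]
metrics = retention_metrics(log)
```

Comparing these aggregates between the experiment group and everyone else is one way to answer "did they listen for longer?".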
Pandora’s Tech Stack (some of it)
Memcache, Redis, Python, Java, Scala, Hive, Spark, PostgreSQL, Hadoop (HDFS)