Vinyl Album Recognition Project for Bovik’s Image and Video Processing Class: Spring 2010
Post by Kevin Kozak
Mark Hoover (friend, lab partner) and I met on Thursday to discuss our plan of attack for this image and video processing project. To sum up our project, we are going to create software that matches a cell phone picture of a vinyl album to an entry in a database. Once the album has been identified, information about the record is displayed and audio samples can be played. The project consists of the software algorithm, the user interface, and the database that holds the album info. Our main focus for this semester is the software algorithm, especially since that is what the grade will depend on. The mobile app I mentioned in a previous post will come later. For now, let’s make it work.
This week’s goal was to take some sample pictures of select vinyl albums so we can begin implementing the software algorithm. Our first task is to get the picture into a working form. Once the picture has been taken, the background has to be removed, the picture rotated until it is perfectly straight, any skew undone, the image converted to grayscale and deblurred if needed, and the brightness and contrast adjusted.
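As a rough sketch of the last two steps (grayscale conversion and a brightness/contrast adjustment), here is what they could look like in plain Python. The image representation (rows of (r, g, b) tuples), the function names, the BT.601 luma weights, and the min-max contrast stretch are all assumptions made for illustration, not a settled design.

```python
def to_grayscale(rgb_image):
    """Convert rows of (r, g, b) tuples into rows of 0-255 gray values."""
    # Integer form of the ITU-R BT.601 luma weights (0.299, 0.587, 0.114),
    # with +500 so the integer division rounds to nearest.
    return [[(299 * r + 587 * g + 114 * b + 500) // 1000 for (r, g, b) in row]
            for row in rgb_image]


def stretch_contrast(gray_image):
    """Linearly rescale so the darkest pixel becomes 0 and the brightest 255."""
    flat = [v for row in gray_image for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # completely flat image: nothing to stretch
        return [row[:] for row in gray_image]
    return [[(v - lo) * 255 // (hi - lo) for v in row] for row in gray_image]


# Tiny made-up 2x2 "photo": red, green, blue, and mid-gray pixels.
photo = [[(200, 30, 30), (30, 200, 30)],
         [(30, 30, 200), (120, 120, 120)]]

gray = to_grayscale(photo)
stretched = stretch_contrast(gray)
print(gray)       # [[81, 130], [49, 120]]
print(stretched)  # [[100, 255], [0, 223]]
```

Note how the green pixel comes out much brighter than the blue one even though both have the same peak channel value; that is the luma weighting at work.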
Below are the 11 albums that I selected to test the software with. Mark will be taking pictures of his own. Then, we will try different cameras on different albums to see if that has any effect.
Picture 1: Stevie Wonder – Innervisions. This album was intentionally left inside the clear plastic jacket to see if that affects our algorithm. The user should not have to remove the jacket to take the picture. You will also notice the price tag is still on it (classy, I know). In a record store setting, it is not feasible to remove the price tag, so the software needs to account for that too.
Picture 2: Frank Zappa – Joe’s Garage. This picture was taken with skew. This album was chosen because (a) it’s cool, (b) it’s Zappa, (c) it’s still in its plastic jacket, (d) it has a price tag, and (e) I want to see how the software performs the black and white thresholding given the level of detail in the black oil on his face.
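For the black and white thresholding, one standard way to pick the cutoff automatically is Otsu's method, which chooses the threshold that best separates dark pixels from bright ones. The sketch below assumes Otsu's method (not something we have committed to) and a grayscale image stored as rows of 0-255 integers; the function names are made up for illustration.

```python
def otsu_threshold(gray_image):
    """Return the threshold t (0-255) that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray_image:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]               # number of pixels with value <= t
        if w_bg == 0:
            continue                  # no dark pixels yet
        w_fg = total - w_bg
        if w_fg == 0:
            break                     # no bright pixels left
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        var_between = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray_image, t):
    """Pixels brighter than t become white (255); the rest become black (0)."""
    return [[255 if v > t else 0 for v in row] for row in gray_image]


# Made-up patch: three dark pixels and three bright ones.
patch = [[10, 12, 200],
         [11, 210, 205]]
t = otsu_threshold(patch)
binary = binarize(patch, t)
print(t)       # 12
print(binary)  # [[0, 0, 255], [0, 255, 255]]
```

A single global threshold like this is the simplest case. A cover with lots of dark-on-dark detail may need something smarter, but this is a reasonable baseline to compare against.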
Picture 3: Jimmy Buffett – Volcano. This picture was intentionally rotated to see if the software can figure it out. This album was chosen because it’s bright. Really bright. I’d like to see how the software handles the color. This album makes me want to go to Florida really bad. Summer yet?
Picture 4: Elvis Presley – Elvis’ Golden Records. This album was chosen because he’s the King and because there is some writing on it that was not part of the album art. This record belonged to my grandfather, and true to form, Grandpa Jake wrote his name on everything he owned. Also, Elvis Presley wrote his name on here with a blue pen. Silly Elvis.
Picture 5: Cream – Wheels of Fire. I chose this album because it was shiny. While shiny objects frequently distract me, I wonder if the shininess will distract the software. I got this record from my dad and I can most certainly tell he listened the hell out of it. The front is worn pretty good. We shall see what effect that has on the software.
Picture 6: Creedence Clearwater Revival – Bayou Country. This album is a great example of the “what the f*** is this album?” response I’ve had in the past while record shopping. They were so innovative they forgot to put their name on the front! They did put their name on the back, but how can they expect me to turn it over? I’d rather image search it! This album cover is interesting because of how blurry the image is. Notice the wear and tear on this one in the shape of the record. Performing edge detection or template matching will be very tricky for this record.
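Here is a rough sketch of why blur hurts edge detection, assuming a Sobel operator (a common choice, picked here purely for illustration) and a grayscale image stored as rows of 0-255 integers. The function name and the two toy images are made up.

```python
def sobel_magnitude(gray):
    """Approximate gradient magnitude |gx| + |gy| with 3x3 Sobel kernels.

    Border pixels are left at 0 because the 3x3 window does not fit there.
    """
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (gray[y - 1][x + 1] + 2 * gray[y][x + 1] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y][x - 1] - gray[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (gray[y + 1][x - 1] + 2 * gray[y + 1][x] + gray[y + 1][x + 1]
                  - gray[y - 1][x - 1] - 2 * gray[y - 1][x] - gray[y - 1][x + 1])
            out[y][x] = abs(gx) + abs(gy)
    return out


# Same black-to-white transition, once as a hard step and once smeared out.
sharp = [[0, 0, 0, 255, 255, 255] for _ in range(4)]
blurry = [[0, 51, 102, 153, 204, 255] for _ in range(4)]

peak_sharp = max(max(row) for row in sobel_magnitude(sharp))
peak_blurry = max(max(row) for row in sobel_magnitude(blurry))
print(peak_sharp, peak_blurry)  # 1020 408
```

The blurred transition produces a much weaker response spread over more pixels, so any fixed edge threshold tuned on crisp covers could miss it entirely.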
Picture 7: John Denver – John Denver’s Greatest Hits. I chose this record because it has “Not for Sale: For Promotional Use Only” impressed on the upper right corner and the plastic cover is starting to tear and warp along the right edge. Also, it’s an awesome picture of John Denver.
Picture 8: Led Zeppelin – Led Zeppelin IV (or the 4 symbols that appear on the record). I chose this record because the album name appears nowhere on this record, inside or out. The band name only appears on the record itself and the record sleeve. The album cover is interesting because the main picture is very old looking and grainy. The background to that picture is fairly detailed but low contrast. This album could be challenging to template match.
Picture 9: The King Bees – The King Bees. I chose this album because it is very high contrast. It should be fairly easy to perform template matching or edge detection. There’s also a little bit of dirt or something smudged on it. The red and yellow overlays might make things interesting though. We shall see. I bought this record somewhat recently because I thought the cover art was cool, but it’s also a badass rock ’n’ roll album.
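For a feel of what template matching does, here is a minimal sketch that slides a small template over a grayscale image and scores each position by the sum of squared differences (SSD), where lower is better. SSD is just one possible score and is an assumption here, as are the function name and the toy data; real photos would also need the cleanup steps first.

```python
def match_template(image, template):
    """Return ((row, col), ssd) for the placement with the lowest SSD."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    for y in range(ih - th + 1):
        for x in range(iw - tw + 1):
            ssd = sum((image[y + dy][x + dx] - template[dy][dx]) ** 2
                      for dy in range(th) for dx in range(tw))
            if best_score is None or ssd < best_score:
                best_pos, best_score = (y, x), ssd
    return best_pos, best_score


# Made-up 4x4 image with a distinctive 2x2 pattern in the middle.
image = [[10, 10, 10, 10],
         [10, 200, 50, 10],
         [10, 50, 200, 10],
         [10, 10, 10, 10]]

exact = [[200, 50], [50, 200]]   # perfect copy of the pattern
noisy = [[190, 60], [55, 205]]   # same pattern with slightly-off pixel values

print(match_template(image, exact))  # ((1, 1), 0)
print(match_template(image, noisy))  # ((1, 1), 250)
```

The noisy template still lands on the right spot because the pattern stands out strongly from its surroundings. On a low-contrast cover, the gap between the best score and the runner-up shrinks, which is where mismatches would creep in.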
Picture 10: The Beatles – Sgt. Pepper’s Lonely Hearts Club Band. I chose this picture because it has a lot of detail, especially with the color. My mom loves the Beatles and I’m sure she would approve of choosing a Beatles album.
Picture 11: Billy Joel – The Stranger. I picked this album because it’s already in grayscale. I don’t really know what happens when you take a color picture of a grayscale image and then transform it into grayscale. There is a little wear and tear along the top that the software needs to overcome.
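A quick back-of-the-envelope check on the grayscale question, assuming the conversion is a weighted sum of the color channels using the BT.601 luma weights (an assumption; other weightings exist). The pixel values below are invented: a truly neutral pixel, and the same pixel with a slight warm tint of the kind a camera might add.

```python
def luma(r, g, b):
    """Weighted gray value using integer BT.601 weights (0.299, 0.587, 0.114)."""
    return (299 * r + 587 * g + 114 * b + 500) // 1000


neutral = luma(90, 90, 90)   # r == g == b: an ideal photo of a gray area
tinted = luma(96, 90, 82)    # hypothetical warm color cast from the camera
print(neutral, tinted)       # 90 91
```

Because the weights add up to 1, a perfectly neutral pixel passes through unchanged. A color cast shifts the result only slightly, so under these assumptions the conversion should be close to harmless for an already-gray cover.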
Next step: handle skew, rotation, and noise; deblur; transform into grayscale; and research how template matching works.
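For the noise part, one simple option is a 3x3 median filter, which replaces each pixel with the median of its neighborhood and is good at knocking out isolated specks. This is only a sketch of one candidate technique (an assumption, not a decision), again using rows of 0-255 integers and a made-up function name.

```python
def median_filter_3x3(gray):
    """Replace each interior pixel with the median of its 3x3 neighborhood.

    Border pixels are copied unchanged because the window does not fit there.
    """
    h, w = len(gray), len(gray[0])
    out = [row[:] for row in gray]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(gray[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]  # middle of the 9 sorted values
    return out


# Flat gray patch with a single bright speck in the middle.
speckled = [[100, 100, 100],
            [100, 255, 100],
            [100, 100, 100]]
cleaned = median_filter_3x3(speckled)
print(cleaned[1][1])  # 100
```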