We’re getting ready to face off with some face recognition these days, and it isn’t nearly as easy as Picasa, Facebook, and iPhoto make it seem. Using the open source Computer Vision library, we’re – as I type – going through just over 4 million JPEGs grabbed from videos in the Danish Broadcasting Corporation’s archive.
We’re hunting the well-known news anchors, celebs, and weather girls – to begin with – and turning their beautiful “mugshots” into indexable metadata for researchers and the public to use when looking for a specific clip in their cultural heritage. Unfortunately, the metadata in archives often lacks important details like who’s talking when, so this could be a great feature for all of us.
These are the steps we’re going through to make this happen:
First, we’ve generated millions of scene detection images from the videos in the archive. Each image is stored with the ID of its asset and the timecode it was grabbed at, so we can trace any generated data back to the original video clip.
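To make the bookkeeping concrete, here is a minimal sketch of how grabbed frames could be tied back to their asset and timecode. The file naming scheme (`<asset id>_<HH-MM-SS>.jpg`), the `FrameGrab` class, and both function names are assumptions made up for illustration; the real archive may well store this differently.

```python
import re
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path

# Hypothetical naming scheme: <asset id>_<HH-MM-SS>.jpg
FRAME_NAME = re.compile(
    r"^(?P<asset>[A-Za-z0-9]+)_(?P<h>\d{2})-(?P<m>\d{2})-(?P<s>\d{2})\.jpg$"
)


@dataclass(frozen=True)
class FrameGrab:
    asset_id: str   # ID of the video asset the frame came from
    timecode: str   # position in the video, as HH:MM:SS
    path: Path      # where the JPEG lives on disk


def parse_frame(path):
    """Read asset ID and timecode out of a frame's file name."""
    path = Path(path)
    match = FRAME_NAME.match(path.name)
    if match is None:
        raise ValueError(f"unexpected frame name: {path.name}")
    timecode = f"{match['h']}:{match['m']}:{match['s']}"
    return FrameGrab(match["asset"], timecode, path)


def group_by_asset(paths):
    """Group frames per asset, sorted by timecode within each asset."""
    groups = defaultdict(list)
    for path in paths:
        frame = parse_frame(path)
        groups[frame.asset_id].append(frame)
    for frames in groups.values():
        # Zero-padded HH:MM:SS strings sort correctly as plain text.
        frames.sort(key=lambda frame: frame.timecode)
    return dict(groups)
```

With something like this in place, any face found later only needs to carry its `FrameGrab` along to be traceable to a moment in a specific clip.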
Secondly, we’re letting the open source Computer Vision library detect the faces in the images. At this point, we have no idea whether two faces belong to the same person, or who that person might be. You can see the result on our friend John Malkovich (thumb from the original poster).
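For the curious, the detection step could look roughly like the sketch below. It assumes the library is OpenCV used through its Python bindings (the `opencv-python` package) with the frontal-face Haar cascade that ships with it; the post doesn't say which detector or settings are actually in use, and the function names and parameter values here are made up for illustration.

```python
from pathlib import Path

YELLOW_BGR = (0, 255, 255)  # OpenCV stores colours as blue, green, red


def box_corners(box):
    """Turn an (x, y, width, height) box into top-left and bottom-right points."""
    x, y, w, h = box
    return (x, y), (x + w, y + h)


def frame_faces(image_path, out_path):
    """Detect faces in one JPEG, draw yellow frames, and return the boxes."""
    import cv2  # imported lazily; needs the opencv-python package

    # Assumption: the frontal-face Haar cascade bundled with opencv-python.
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
    )
    image = cv2.imread(str(Path(image_path)))
    if image is None:  # unreadable or missing file
        return []

    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    found = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    boxes = [tuple(int(v) for v in box) for box in found]

    for box in boxes:
        top_left, bottom_right = box_corners(box)
        cv2.rectangle(image, top_left, bottom_right, YELLOW_BGR, 2)
    cv2.imwrite(str(out_path), image)
    return boxes
```

Note that the returned boxes only say where a face is, not whose face it is, which is exactly the gap the next step has to close.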
Thirdly, we need to group these faces into recognized people using something like Picasa, face.com, or crowdsourcing. I guess time will tell how easily we’ll be able to do this, but for now, I’ll celebrate the yellow frames on our images. Detecting a face – any face – is very doable today, but actually recognizing faces… I don’t want to get my hopes up just yet.
The fourth step will be to get the names of the people we’ve found back into our database of around a million video assets and let users navigate through the content with the tagged people as a parameter.
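As a rough idea of what that could mean in practice, here is a small sketch using SQLite from Python's standard library. The `appearance` table, its columns, and the function names are invented for illustration and say nothing about how the actual asset database is structured.

```python
import sqlite3


def open_index(path=":memory:"):
    """Open (or create) a tiny person-to-clip index."""
    conn = sqlite3.connect(path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS appearance (
            person   TEXT NOT NULL,
            asset_id TEXT NOT NULL,
            timecode TEXT NOT NULL,  -- HH:MM:SS of the frame the face was in
            PRIMARY KEY (person, asset_id, timecode)
        )
        """
    )
    return conn


def tag_person(conn, person, asset_id, timecode):
    """Record that a person was recognized in an asset at a timecode."""
    conn.execute(
        "INSERT OR IGNORE INTO appearance VALUES (?, ?, ?)",
        (person, asset_id, timecode),
    )
    conn.commit()


def clips_with(conn, person):
    """List assets featuring a person, with their first appearance in each."""
    rows = conn.execute(
        """
        SELECT asset_id, MIN(timecode)
        FROM appearance
        WHERE person = ?
        GROUP BY asset_id
        ORDER BY asset_id
        """,
        (person,),
    )
    return rows.fetchall()
```

A search for a person would then return each matching asset together with the timecode to jump to, for example `clips_with(conn, "Some Anchor")`.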
For now – while the yellow frames are being added to the grabbed JPEGs – we’ll keep researching to see if something like the Eigenface algorithm can do this recognition accurately enough, or if we’ll have to try crowdsourcing the job with some awesome gamification to lure the public in.
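To give a feel for what Eigenfaces involves, here is a textbook-style sketch with NumPy: principal component analysis on flattened face crops, followed by a nearest-neighbour lookup in the reduced "face space". It is not something we have running. It assumes all crops are grayscale, the same size, and roughly aligned, and the function names and the `max_distance` cut-off are made up for illustration.

```python
import numpy as np


def fit_eigenfaces(faces, n_components):
    """Learn a mean face and the top principal components ("eigenfaces").

    faces: array of shape (n_samples, n_pixels), one flattened crop per row.
    """
    faces = np.asarray(faces, dtype=float)
    mean = faces.mean(axis=0)
    centered = faces - mean
    # Rows of vt are the principal axes of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]


def project(face, mean, components):
    """Describe a face by its weights along each eigenface."""
    return components @ (np.asarray(face, dtype=float) - mean)


def recognize(face, mean, components, known_weights, labels, max_distance):
    """Return the label of the closest known face, or None if none is close."""
    weights = project(face, mean, components)
    distances = np.linalg.norm(known_weights - weights, axis=1)
    best = int(np.argmin(distances))
    if distances[best] > max_distance:  # hypothetical cut-off, needs tuning
        return None
    return labels[best]
```

The known weights would come from projecting a set of already named faces, so this still depends on someone labelling examples first, whether that is us or the crowd.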
If you have any good ideas or interesting experiences with face recognition, please do share below.