Taking Facial Detection to the next level (and fulfilling my monthly cat post requirement) - Kittydar (Kitty Radar Cat Detector)
According to the project's README on GitHub, Kittydar first chops the image up into many "windows" to test for the presence of a cat head. Within each window a gradient is computed, and a Histogram of Oriented Gradients (HOG) is then used as the raw feature in a learning system.
The system is based on a neural network that has, according to Heather Arthur, the researcher who created Kittydar, been pre-trained with thousands of photos of cat heads and their histograms, as well as thousands of non-cats. The neural network data is included in Kittydar in JSON format and is used to perform the classification.
How it works
Kittydar first chops the image up into many "windows" to test for the presence of a cat head. For each window, Kittydar first extracts more tractable data from the image's data. Namely, it computes the Histogram of Oriented Gradients descriptor of the image, using the hog-descriptor library. This data describes the directions of the edges in the image (where the image changes from light to dark and vice versa) and what strength they are. This data is a vector of numbers that is then fed into a neural network which gives a number from 0 to 1 on how likely the histogram data represents a cat.
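To make the window-and-histogram idea concrete, here is a rough Python sketch. It is not Kittydar's code (Kittydar is JavaScript and relies on the hog-descriptor library); the function names `sliding_windows` and `orientation_histogram`, the fixed window size, and the 9-bin histogram are all assumptions made for illustration. It also builds a single histogram per window, whereas a full HOG descriptor splits the window into cells and normalizes over blocks.

```python
import math

def sliding_windows(width, height, size, step):
    """Yield (x, y, size) squares covering a width x height image.

    Hypothetical helper: one fixed window size, moved in fixed steps.
    """
    for y in range(0, height - size + 1, step):
        for x in range(0, width - size + 1, step):
            yield x, y, size

def orientation_histogram(pixels, bins=9):
    """Simplified HOG-style feature for one window.

    pixels is a list of rows of grayscale intensities. Each interior pixel
    votes for the direction of its gradient, weighted by edge strength.
    """
    hist = [0.0] * bins
    rows, cols = len(pixels), len(pixels[0])
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = pixels[r][c + 1] - pixels[r][c - 1]  # horizontal change
            gy = pixels[r + 1][c] - pixels[r - 1][c]  # vertical change
            magnitude = math.hypot(gx, gy)            # edge strength
            if magnitude == 0:
                continue
            angle = math.atan2(gy, gx) % math.pi      # unsigned direction in [0, pi)
            index = min(int(angle / math.pi * bins), bins - 1)
            hist[index] += magnitude
    total = sum(hist)
    # Normalize so the vector sums to 1 (left as zeros for a flat window).
    return [h / total for h in hist] if total else hist

# A tiny window with one vertical light/dark edge down the middle.
window = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
features = orientation_histogram(window)
```

For this toy window every gradient points the same way, so all of the weight lands in a single bin. A real photo spreads its weight across bins, and that vector is what gets handed to the classifier.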
The neural network (the JSON of which is located in this repo) has been pre-trained with thousands of photos of cat heads and their histograms, as well as thousands of non-cats. See the repo for the node training scripts.
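Here is a minimal sketch of how a network stored as JSON could be loaded and run on a feature vector. The JSON layout below (a list of layers, each a list of neurons with `weights` and a `bias`) and the sigmoid activation are assumptions chosen for illustration; the actual file in the Kittydar repo may be organized differently, and the tiny weights here are made up rather than trained.

```python
import json
import math

# Hypothetical layout and made-up weights, not Kittydar's real network.
NETWORK_JSON = """
[
  [{"weights": [1.0, -1.0], "bias": 0.0},
   {"weights": [-1.0, 1.0], "bias": 0.0}],
  [{"weights": [2.0, -2.0], "bias": 0.0}]
]
"""

def sigmoid(x):
    """Squash any real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def run_network(layers, features):
    """Feed a feature vector through each layer and return the output activations."""
    activations = features
    for layer in layers:
        activations = [
            sigmoid(
                sum(w * a for w, a in zip(neuron["weights"], activations))
                + neuron["bias"]
            )
            for neuron in layer
        ]
    return activations

layers = json.loads(NETWORK_JSON)
score = run_network(layers, [0.9, 0.1])[0]  # single output neuron
```

Because the last layer ends in a sigmoid, `score` always falls strictly between 0 and 1, which is what lets it be read as "how cat-like is this window".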
Kittydar will miss cats sometimes, and sometimes classify non-cats as cats. It's best at detecting upright cats that are facing forward, but it can handle a small tilt or turn in the head.
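Both kinds of mistake fall out of wherever you draw the line on the classifier's score. The sketch below assumes a simple cutoff on a 0 to 1 score; the `count_errors` helper and the example scores are invented for illustration and are not Kittydar output.

```python
def count_errors(scored, threshold):
    """Count mistakes at a given cutoff.

    scored is a list of (score, is_cat) pairs.
    Returns (missed_cats, false_alarms).
    """
    missed = sum(1 for s, is_cat in scored if is_cat and s < threshold)
    false_alarms = sum(1 for s, is_cat in scored if not is_cat and s >= threshold)
    return missed, false_alarms

# Made-up scores: three real cats and two non-cats.
examples = [(0.95, True), (0.70, True), (0.40, True), (0.60, False), (0.10, False)]

strict = count_errors(examples, 0.9)   # high cutoff: more missed cats
lenient = count_errors(examples, 0.5)  # low cutoff: more false alarms
```

Raising the cutoff trades false alarms for missed cats, and lowering it does the reverse, so no single threshold removes both.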
Kittydar isn't fast. It'll take a few seconds to find the cats in one image.
There's lots of room for improvement, so fork and send requests.
Credits
This informative research paper: Cat Head Detection - How to Effectively Exploit Shape and Texture Features by Weiwei Zhang, Jian Sun, and Xiaoou Tang.
This off the hook dataset of cat images annotated with the locations of the cat's ears, eyes, and mouth.
@gdeglin for the name.
Come on! You KNOW that's cool!