The long-awaited third feature has finally arrived. I'm measuring how often keywords from a paper's reference section occur in its body, title, and abstract. Implementing the feature itself was fairly trivial, but 3D plotting with Matplotlib turned out to be somewhat tricky.
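For anyone curious what a feature like this might look like, here is a minimal sketch. The exact definition is an assumption on my part for illustration: a "keyword" is any word of at least `min_length` letters found in the reference section, and the feature value is the fraction of words in the target text (body, title and abstract joined together) that are such keywords. The names `tokenize`, `keyword_occurrence` and `min_length` are hypothetical.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def keyword_occurrence(reference_text, target_text, min_length=5):
    """Return the fraction of target tokens that are reference keywords.

    Assumption: a keyword is any reference-section word with at least
    `min_length` letters. Short words are skipped to filter out most
    function words such as "for" or "the".
    """
    keywords = {w for w in tokenize(reference_text) if len(w) >= min_length}
    tokens = tokenize(target_text)
    if not tokens:
        # An empty target text has no occurrences by definition.
        return 0.0
    hits = sum(1 for w in tokens if w in keywords)
    return hits / len(tokens)
```

Normalizing by the length of the target text keeps long papers from scoring higher just because they contain more words; a raw count would be an equally plausible choice.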
Regardless, all 100 papers are correctly classified with this new feature (still using k-nearest neighbors, now in 3 dimensions). My next step will be to get a bunch more data and evaluate various classifiers, since I'm using a fairly arbitrary k=11 right now (as Professor Magdon pointed out at the CS poster session).
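One way to make the choice of k less arbitrary is to compare several values with leave-one-out cross-validation. The sketch below is not my actual classifier code; it is a plain-Python illustration that assumes Euclidean distance and a simple majority vote, and the function names `knn_predict` and `leave_one_out_accuracy` are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=11):
    """Predict a label for `query` by majority vote among its k nearest points."""
    # Rank training indices by Euclidean distance to the query point.
    order = sorted(
        range(len(train_points)),
        key=lambda i: math.dist(train_points[i], query),
    )
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


def leave_one_out_accuracy(points, labels, k):
    """Classify each point using all the others and return the hit rate."""
    correct = 0
    for i in range(len(points)):
        # Hold out point i and train on the rest.
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        if knn_predict(rest_points, rest_labels, points[i], k) == labels[i]:
            correct += 1
    return correct / len(points)
```

With the three feature values of each paper stored as a tuple in `points`, calling `leave_one_out_accuracy(points, labels, k)` for a range of odd k values gives a rough picture of how sensitive the results are to that parameter. With only 100 papers the estimates will be noisy, which is another reason to collect more data first.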