Algorithmic Detection of Computer Generated Papers: Pruning

It turns out that pruning does not produce the U-shaped leave-one-out error curve I had expected:

For reference, a 3D plot of the pruned data with k=3:

The densities of computer generated and human papers do look much more comparable, though. The error appears to be at least slightly higher than without pruning across all values of k, which is expected. I pruned by removing only points that met two conditions: the point itself was classified correctly under leave-one-out, and its removal did not cause any previously removed point to be classified incorrectly. There are certainly other (probably better) pruning algorithms, but I would expect at least comparable results from them.
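A minimal sketch of the pruning rule described above, in Python. This is one reading of that rule, not the code actually used for the post: the single pass in index order, the Euclidean distance, the majority vote, and the names `knn_predict` and `prune` are all assumptions made for illustration.

```python
from collections import Counter
import math


def knn_predict(point, ref_points, ref_labels, k=3):
    """Majority vote among the k nearest reference points (Euclidean distance)."""
    order = sorted(range(len(ref_points)),
                   key=lambda j: math.dist(point, ref_points[j]))
    votes = Counter(ref_labels[j] for j in order[:k])
    return votes.most_common(1)[0][0]


def prune(points, labels, k=3):
    """Return the indices of the points kept after one pruning pass.

    A point is removed only if (1) it is classified correctly by the
    other kept points, and (2) every previously removed point is still
    classified correctly once it is gone.
    """
    kept = list(range(len(points)))
    removed = []
    for i in range(len(points)):
        rest = [j for j in kept if j != i]
        if len(rest) < k:
            break  # assumption: never shrink the reference set below k points
        ref_p = [points[j] for j in rest]
        ref_l = [labels[j] for j in rest]
        # Condition 1: the candidate is classified correctly (leave-one-out).
        if knn_predict(points[i], ref_p, ref_l, k) != labels[i]:
            continue
        # Condition 2: no previously removed point becomes misclassified.
        if all(knn_predict(points[r], ref_p, ref_l, k) == labels[r]
               for r in removed):
            kept = rest
            removed.append(i)
    return kept
```

The result depends on the order in which points are visited, so shuffling the data first could give a different pruned set.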

I will try using a validation set instead of leave-one-out cross-validation just for kicks, but it's looking increasingly like k=3 is the way to go.
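For comparison, here is a rough Python sketch of the two ways of estimating error as a function of k: leave-one-out over the whole data set, and a held-out validation set. The function names, the Euclidean distance, and the majority vote are assumptions for illustration, not the code behind the plots in this post.

```python
from collections import Counter
import math


def knn_predict(point, ref_points, ref_labels, k):
    """Majority vote among the k nearest reference points (Euclidean distance)."""
    order = sorted(range(len(ref_points)),
                   key=lambda j: math.dist(point, ref_points[j]))
    return Counter(ref_labels[j] for j in order[:k]).most_common(1)[0][0]


def loo_error(points, labels, k):
    """Fraction of points misclassified when each is held out in turn."""
    wrong = 0
    for i in range(len(points)):
        ref_p = points[:i] + points[i + 1:]
        ref_l = labels[:i] + labels[i + 1:]
        wrong += knn_predict(points[i], ref_p, ref_l, k) != labels[i]
    return wrong / len(points)


def validation_error(train_p, train_l, val_p, val_l, k):
    """Fraction of validation points misclassified by the training points."""
    wrong = sum(knn_predict(p, train_p, train_l, k) != y
                for p, y in zip(val_p, val_l))
    return wrong / len(val_p)
```

Evaluating either function over a range of odd k values (odd to avoid ties between the two classes) gives the error curve to compare.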

Saturday, April 10, 2010
