Comments on Algorithmic Detection of Computer Generated Papers: Getting Started with Preprocessing
Blog author: Allen. Feed last updated 2015-04-02. 2 comments, newest first.

Anonymous, 2013-07-04 17:35:
I found the source code. Ignore the above.

Anonymous, 2013-07-04 17:25:
Hi, I'm hoping you'll get a notification for this comment, despite it being over 3 years since you wrote this blog entry.

I'm doing something similar and converting PDF lecture notes to text using pdftotext.
Are you able to elaborate on your methods for filtering the 'strange symbols which come from formulas and PDF artifacts'? I have been filtering using a few regular-expression-style techniques, but maybe your methods could help me?
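
For readers with the same question, here is a minimal sketch of the kind of regular-expression filtering the commenter describes, assuming Python and plain-text output from pdftotext. It is not the blog author's method, which this thread does not describe. The function names and the thresholds (`min_words`, `min_alpha_ratio`) are hypothetical and would likely need tuning for a given set of documents. The idea is to strip control characters (pdftotext emits form feeds at page breaks), then drop lines that do not look like prose, such as formula fragments and bare page numbers.

```python
import re

# Hypothetical heuristics, not the blog author's method.
WORD_RE = re.compile(r"[A-Za-z]{2,}")                  # a "word" is 2+ ASCII letters
CONTROL_RE = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")   # control chars, incl. form feed


def clean_line(line: str) -> str:
    """Replace control characters with spaces and collapse whitespace runs."""
    line = CONTROL_RE.sub(" ", line)
    return re.sub(r"\s+", " ", line).strip()


def looks_like_prose(line: str, min_words: int = 3,
                     min_alpha_ratio: float = 0.7) -> bool:
    """Keep a line only if it has enough words and is mostly letters/spaces."""
    if not line:
        return False
    alpha = sum(ch.isalpha() or ch.isspace() for ch in line)
    return (len(WORD_RE.findall(line)) >= min_words
            and alpha / len(line) >= min_alpha_ratio)


def filter_text(text: str) -> str:
    """Clean each line, then drop lines that fail the prose heuristic."""
    cleaned = (clean_line(raw) for raw in text.splitlines())
    return "\n".join(line for line in cleaned if looks_like_prose(line))
```

A heuristic like this will also discard short legitimate lines (headings, for example), so lowering `min_words` or handling headings separately may be worth trying if those matter.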