> Does anyone know of any current plagiarism detection projects currently
> going on? I know of Malcolm Coulthard and Copycatch, but are there any other
> projects? Also, I would like to do some statistical work on plagiarised
> work, but does anyone know where I can find any data?
The following reference, and the references it cites, might be helpful:
"Syntactic Clustering of the Web" by A. Z. Broder, S. C. Glassman, M. S.
Manasse, G. Zweig, in Proc. of WWW6, available at
http://decweb.ethz.ch/WWW6/Technical/Paper205/Paper205.html
They use document fingerprinting to cluster syntactically similar documents.
Nevin Heintze has used the same technique to find similar documents on the
web; see http://www.cs.cmu.edu/afs/cs/user/nch/www/koala/main.html
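
To give a rough idea of how fingerprinting can flag similar documents, here is
a minimal sketch in Python. It is not the code from either project above: the
function names, the window size w=4, and the use of a truncated SHA-1 hash are
all my own assumptions for illustration. Each document is reduced to a set of
hashed w-word windows ("shingles"), and two documents are compared by the
overlap of those sets.

```python
import hashlib


def shingles(text, w=4):
    """Return the set of fingerprints of all contiguous w-word windows.

    w=4 and the 64-bit truncated SHA-1 are illustrative choices only.
    """
    words = text.lower().split()
    if not words:
        return set()
    if len(words) < w:
        # Document shorter than one window: treat it as a single shingle.
        windows = [tuple(words)]
    else:
        windows = [tuple(words[i:i + w]) for i in range(len(words) - w + 1)]
    return {
        int.from_bytes(
            hashlib.sha1(" ".join(win).encode("utf-8")).digest()[:8], "big"
        )
        for win in windows
    }


def resemblance(a, b, w=4):
    """Set overlap of the two shingle sets: |A & B| / |A | B|."""
    sa, sb = shingles(a, w), shingles(b, w)
    if not sa and not sb:
        return 0.0  # nothing to compare
    return len(sa & sb) / len(sa | sb)
```

A score near 1 means the two texts share most of their word sequences, and a
score near 0 means they share almost none. Where to draw the line for
"suspiciously similar" would have to be tuned on real data.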
-Anoop