The software and text similarity tester SIM

SIM tests lexical similarity in texts in C, Java, Pascal, Modula-2, Lisp,
Miranda, and natural language. It is used
- to detect potentially duplicated code fragments in large software projects,
  in program text but also in shell scripts and documentation;
- to detect plagiarism in software projects, educational and otherwise.

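To give an idea of what a lexical comparison involves, here is a minimal
sketch in Python. It is not SIM's actual algorithm (see the technical report
in the ftp directory for that); it assumes a very crude tokenizer and simply
looks for the longest run of identical tokens shared by two texts. The names
tokenize and longest_common_run are made up for this illustration.

```python
import re


def tokenize(text):
    """Split text into identifiers, numbers, and single punctuation marks.

    Assumption: this crude tokenizer stands in for a real language-specific
    lexical analyser.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", text)


def longest_common_run(a, b):
    """Return (length, start_in_a, start_in_b) of the longest shared token run."""
    best = (0, 0, 0)
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous token pair.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best[0]:
                    best = (cur[j], i - cur[j], j - cur[j])
        prev = cur
    return best
```

Because the comparison works on tokens rather than characters, differences in
layout (spaces, line breaks) do not affect the result.
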
SIM is available through ftp. The directory
    ftp.cs.vu.nl:pub/dick/similarity_tester
contains the sources (in C) and the MSDOS .EXEs.

The software similarity tester is very efficient and allows us to compare
this year's students' work with that collected from many past years (much to
the dismay of some, mostly non-CS, students). Students are told in advance
that their work is going to be compared, but some are non-believers ...

The output of the similarity tester can be processed by a number of shell
scripts by Matty Huntjens. These shell scripts take sim output and produce
lists of suspect submissions, histograms and the like.
The present version of these scripts is very much geared to the local situation
at the Vrije Universiteit, though; they are low on portability.
Matty Huntjens' email address is [email protected].

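The kind of post-processing described above might look roughly like the
following Python sketch. It is not a translation of Matty Huntjens' scripts
and it does not parse real sim output; it assumes the results have already
been reduced to (file_a, file_b, percentage) tuples, which is a hypothetical
format chosen for this illustration, as are the names suspects and histogram
and the default threshold.

```python
from collections import Counter


def suspects(pairs, threshold=50):
    """Return the pairs scoring at or above threshold, highest score first.

    Assumption: each pair is a (file_a, file_b, percentage) tuple.
    """
    return sorted((p for p in pairs if p[2] >= threshold),
                  key=lambda p: -p[2])


def histogram(pairs, width=10):
    """Count pairs per score bucket; each key is the bucket's lower bound."""
    return Counter((score // width) * width for _, _, score in pairs)
```

A histogram of this kind can help in choosing a threshold, since unusually
high scores stand out from the bulk of the pairs.
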
We are not afraid that students will try to tune their work to the
similarity tester. We reckon that if they can do that, they can also do the
exercise.

Since this piece of handicraft does not qualify as research, there are no
international papers on it. A paper, titled `Detecting copied submissions in
computer science lab work', was published in a local (i.e. Dutch) computer
science journal:

%A Dick Grune
%A Matty Huntjens
%T Het detecteren van kopie\(:en bij informatica-practica
%J Informatie (in Dutch)
%V 31
%N 11
%D Nov 1989
%P 864-867

The ftp directory contains a terse technical report about the internal
working of the program.

Dick Grune
Vrije Universiteit
de Boelelaan 1081
1081 HV Amsterdam
the Netherlands
[email protected]
+31 20 444 7744
----------------------------------------------------------------
With infinitely many exceptions, what you do makes no difference.