The best free tools for plagiarism detection are
Internet search engines. Most of them allow searching exact phrases
or even whole sentences (through 'advanced search'). Thus, if you
suspect a paper has plagiarized text, choose some unusual phrases in
the text and paste them into a search engine. The engine will return
all Internet documents in which the phrase appears AND which
were indexed in its database. Among the many search
engines currently available, the following are particularly
efficient:
1- No search engine covers all Internet
pages. Thus, you should try the same key words or phrases in
several search engines, or use meta search engines, which search
several engines at a time. Metacrawler and Dogpile, for example,
both retrieve results from Google, Yahoo, AltaVista, Ask Jeeves,
About, LookSmart, Overture, and FindWhat.
The meta search engine Mamma allows selection of search sources
and preferences. It also has very good search tips.
2- The information contained in Portable Document Format (PDF) files is
not accessed by many search engines. It is true that PDF files are
harder to plagiarize
since they cannot be incorporated into the student's paper.
However, you should not ignore them because they are a very
popular Internet file type. Note that Google, AlltheWeb,
and Gigablast, in addition to HTML files, also locate PDF, Word,
Excel, and other file formats. AlltheWeb also indexes
Macromedia Flash files.
Not
all Internet material is accessible through general Internet search
engines. Some Web files are not or cannot be indexed by search
engines and thus can only be accessed through specific tools or
directories.
To find web documents in PDF
format, use the Search Adobe PDF Online at
http://createpdf.adobe.com/. Its homepage
says: "Now there's a way to search
through more than a million summaries of Adobe® Portable Document
Format (PDF) files on the Web. Your search results will allow you to
see the summaries before deciding to view the original Adobe PDF."
The Librarians' Index to the Internet
"is a searchable, annotated subject directory of more than 12,000
Internet resources selected and evaluated by librarians for their
usefulness to users of public libraries. lii.org is used by both
librarians and the general public as a reliable and efficient guide
to Internet resources." http://lii.org/
Infomine "is a virtual library of
Internet resources relevant to faculty, students, and research staff
at the university level. It contains useful Internet resources such
as databases, electronic journals, electronic books, bulletin
boards, mailing lists, online library card catalogs, articles,
directories of researchers, and many other types of information."
http://infomine.ucr.edu/
The Invisible Web
directory "includes a
directory of some of the best resources the Invisible Web has to
offer. The directory includes resources that are informative, of
high quality, and contain worthy information from reliable
information providers that are not visible to general-purpose search
engines."
http://www.invisible-web.net/
Direct Search
"is a growing
compilation of links to the search interfaces of resources that
contain data not easily or entirely searchable/accessible from
general search tools like Alta Vista, Google, or Hotbot. Although
these "general" tools are essential for the retrieval of Internet
based data, searchers often fail to realize that a massive amount of
information is not easily or entirely searchable/accessible via
these search tools. Material "hidden" from the general search tools
is said to reside on the Invisible Web."
http://www.freepint.com/gary/direct.htm
PAPERS ABOUT THE INVISIBLE WEB
In Those Dark Hiding Places: The Invisible
Web Revealed, Robert Lackie provides links and information on
specific search tools including directories, searchable sites and
databases, and specialized search engines. Only a few of these tools
are mentioned above, so take a look at Lackie's page for many
others.
http://www.robertlackie.com/invisible/index.html
H-Net Reviews:
"H-Net Reviews in
the Humanities and Social Sciences is an online scholarly review
resource." http://www.h-net.org/reviews/
LookSmart's FindArticles (and book reviews) is user-friendly and
allows one to "search and read 3.5
million articles from over 700 publications."
http://www.findarticles.com/PI/index.jhtml
Thus, if
you suspect plagiarism in a book review prepared by one of your
students, compare the suspected review with those available on the
above sites.
TIP: Instructors can create databases of
reviews of the books they usually ask their students to
review, so they can easily compare suspected reviews with their
local databases. These databases can include both reviews found
on the Web and reviews submitted by students in previous semesters.
Library databases (e.g., ERIC) and electronic
journals can also be used as tools to detect plagiarism.
The advanced search of the e-journal
package JSTOR, for instance, allows search of phrases as well as
full paragraphs.
An impressive general database, Academic Search Premier, "provides full text
for nearly 4,000 scholarly publications, including full text for
more than 3,100 peer-reviewed journals. Coverage spans virtually
every area of academic study and offers information dating as far
back as 1975."
AUB University Libraries subscribe to
these and other online resources.
Tips for
effective use of search engines and library electronic resources as
plagiarism detection tools
Be selective in choosing
the part of the paper you are going to type into the search
engines. By picking unusual phrases (four to six words) and key words, you
narrow down the results returned by the search engine and
save time in checking them.
Enclose suspected quotes and phrases in
quotation marks. This way the search engines will retrieve only exact
matches.
Search for the material cited in the bibliography
and compare with the student's paper. Some cases of plagiarism
involve the use of large chunks of text of the material cited in
the bibliography without correct acknowledgment of the source.
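The first two tips can be sketched in a few lines of Python. This is only an illustration, not a tool described on this page: the function name `candidate_phrases` and its scoring rule (longer and rarer words count as "more unusual") are assumptions made for the example. It picks non-overlapping phrases of a chosen length from within single clauses and wraps each in double quotation marks, the exact-phrase syntax most search engines accept.

```python
import re
from collections import Counter

WORD = re.compile(r"[A-Za-z]+(?:['-][A-Za-z]+)*")

def candidate_phrases(text, size=5, top=3):
    """Return up to `top` quoted phrases of `size` words for exact-match searches.

    The "unusualness" score is a hypothetical heuristic: each word contributes
    its length divided by how often it occurs in the paper.
    """
    # Split on punctuation so that a phrase never spans two clauses.
    segments = [WORD.findall(s) for s in re.split(r"[.!?;:,()\"]+", text)]
    counts = Counter(w.lower() for seg in segments for w in seg)

    scored = []
    for s, seg in enumerate(segments):
        for i in range(len(seg) - size + 1):
            window = seg[i:i + size]
            score = sum(len(w) / counts[w.lower()] for w in window)
            scored.append((score, s, i, window))
    # Highest score first; ties are broken by position in the paper.
    scored.sort(key=lambda item: (-item[0], item[1], item[2]))

    chosen, used = [], set()
    for _score, s, i, window in scored:
        positions = {(s, j) for j in range(i, i + size)}
        if positions & used:
            continue  # skip phrases that overlap one already chosen
        used |= positions
        chosen.append('"' + " ".join(window) + '"')
        if len(chosen) == top:
            break
    return chosen
```

Each returned string can be pasted as-is into a search engine or into the phrase search of a library database. The suggestions are a starting point only; an instructor's own judgment about which phrases sound out of character for the student remains the better guide.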
What is "anti-plagiarism software" and how does it
work?
It is software designed to identify similar texts and thus to flag
possibly plagiarized ones. Each program
works in its own way, as you will see below on this page.
Are the reports presented by this type of software
100 percent
correct?
No. Anti-plagiarism software
has been criticized because of its intrinsic flaws. For
instance, it cannot distinguish original text from copied
material. To test one of these programs,
Dehnart (1999) submitted his senior
thesis. To his surprise, the
analysis claimed his work had been plagiarized in full. By
checking the links given in the report, he found that the
software had compared his thesis with a copy of the same thesis that
it had found online. These programs also cannot differentiate
plagiarized text from properly quoted text. For this reason, some
companies alert teachers not to take their reports as absolute proof
of plagiarism. The reports merely point out phrases that should be
examined more closely (Kopytoff, 2000).
Is it legal to use anti-plagiarism software to detect
students' plagiarized papers?
Some tools compare the suspected
paper with material publicly available on the Internet. I am not
aware of any case questioning the legality of this activity. Other
programs however (e.g., Turnitin.com) have been accused of
infringing upon the students' right to their intellectual property
because they collect the students' papers and use them for profit
without their permission. "Turnitin
keeps a copy of every paper submitted and adds it to their database.
Students have no choice in the matter; if a professor submits a
student's paper for a check, it's
archived -- essentially in-house-published -- for future use by the
Turnitin.com database." (Technotes,
2001).
"You can hire a vendor to check for plagiarism," says L.
Rooker, director of the U.S. Department
of Education's Family Policy Compliance Office. "But once they do that,
they can't then keep that personally identifiable document and use
it for any other purpose." (Foster, 2002).
To avoid legal problems, professors who are going to use software
like Turnitin should do one of the
following: either make sure the students know their papers will be
submitted or, even better, require the students to upload their
papers themselves. This way they cannot say they were not aware their papers
were submitted to the anti-plagiarism site.
Where can we get this
software? Is it free?
Some of these tools
are free; others are free but require you to create an account; and
most of them sell licenses. See the table
below.
"It
compares text documents with one another to determine if they
share words in phrases. When it finds two files that share
enough words in those phrases, WCopyfind
generates html report files."
Download: WCopyfind
2.5 Software -
Instructions
Papers are
sent to the Turnitin web site where they are compared to
Internet sources and to material from their own database.
Reports of originality are sent back by e-mail.
Papers are
also sent to the company's site, which prepares reports.
These reports identify exactly
which sections of text are taken verbatim from Internet
sources without proper documentation and point out passages
that have been altered.
Offers a free
trial with 10 originality reports
Papers are uploaded to the Scriptum web site
and their content is compared with material on the Internet. A
similarity report is sent to the instructor highlighting all
similarities to
Internet material.
Offers a free demo account
WCopyfind
2.5 is a free application you can download to your computer.
This software does not compare your student's paper with texts on
the Internet. It works locally, and compares different documents
downloaded on your computer. If you search the Internet and find a
paper you suspect your student used, you can create a shortcut for
the paper and indicate it among the documents you request
WCopyfind to analyze. It works very well
when you want to compare a paper with your database of papers
submitted
in electronic format by your previous students.
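The idea of comparing local documents for shared phrases can be illustrated with a short Python sketch. It is not WCopyfind's actual algorithm, which this page does not describe in detail; the function names, the six-word phrase length, and the scoring are assumptions chosen for the example. The sketch breaks each document into overlapping six-word phrases, reports the phrases two documents have in common, and ranks a folder-style archive of earlier papers by how much of the suspect paper they share.

```python
import re

def word_ngrams(text, n=6):
    """Return the set of all n-word phrases in `text`, lowercased."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

def shared_phrases(doc_a, doc_b, n=6):
    """Phrases of n words that appear in both documents."""
    return word_ngrams(doc_a, n) & word_ngrams(doc_b, n)

def overlap_ratio(suspect, other, n=6):
    """Fraction of the suspect paper's n-word phrases also found in `other`."""
    grams = word_ngrams(suspect, n)
    if not grams:
        return 0.0
    return len(grams & word_ngrams(other, n)) / len(grams)

def rank_against_archive(suspect, archive, n=6):
    """Sort archived papers (a dict of name -> text) by overlap, highest first."""
    scores = [(overlap_ratio(suspect, text, n), name)
              for name, text in archive.items()]
    return sorted(scores, reverse=True)
```

A high ratio only shows that two papers share wording. As with any such report, the matching passages still have to be read to see whether they are properly quoted.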
In addition to software designed to detect plagiarism
in text, there are also programs that detect plagiarism in computer
programming. These tools search for similar code in programming
projects.
Software
to detect plagiarism in computer programming
Free -
accessible only to instructors in programming courses
From UC
Berkeley, "Moss (for a Measure Of Software Similarity) is an
automatic system for determining the similarity of C, C++,
Java, Pascal, Ada,
ML, Lisp, or Scheme programs. To
date, the main application of Moss has been in detecting
plagiarism in programming classes."
Software
to detect plagiarism in language texts and in computer programming
From the
University of Karlsruhe, Germany,
it "finds similarities among multiple sets of source code
files. This way it can detect software plagiarism.
JPlag currently supports Java, C,
C++, Scheme, and natural language text."
Note that Moss and JPlag are not mutually
exclusive. They present different reports, and instructors can
use both to get more accurate information on possible plagiarism
in source code.
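To give a rough idea of how similar code can be found even after variable names are changed, here is a minimal Python sketch. It is not the method used by Moss or JPlag, which are far more elaborate; the keyword list, the function names, and the choice of five-token sequences are all assumptions made for illustration. The sketch replaces every identifier with a placeholder, then measures how many short token sequences two programs have in common.

```python
import re

# A small, hypothetical keyword list; a real tool would use the full
# keyword set of each supported language.
KEYWORDS = {"if", "else", "for", "while", "return", "int", "float",
            "void", "def", "class"}
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|\S")

def normalize(source):
    """Turn source text into tokens, hiding identifier names and numbers."""
    tokens = []
    for tok in TOKEN.findall(source):
        if tok in KEYWORDS:
            tokens.append(tok)
        elif re.fullmatch(r"[A-Za-z_]\w*", tok):
            tokens.append("ID")    # any variable or function name
        elif tok[0].isdigit():
            tokens.append("NUM")   # any numeric literal
        else:
            tokens.append(tok)     # operators and punctuation
    return tokens

def kgrams(tokens, k=5):
    """Set of all runs of k consecutive tokens."""
    return {tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}

def code_similarity(src_a, src_b, k=5):
    """Jaccard similarity (0.0 to 1.0) of the two programs' token runs."""
    a, b = kgrams(normalize(src_a), k), kgrams(normalize(src_b), k)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)
```

This simple version does not treat comments or string literals specially, and it treats all of a program's lines as one stream, so short assignments written from the same starter code will naturally score high. The score is a pointer to pairs worth reading side by side, not a verdict.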
Technical
reviews of plagiarism detection tools
Plagiarism and the
Internet
(2001) by F. Condron
offers reviews of software
for detecting plagiarism both in extended prose and
in source code. http://www.oucs.ox.ac.uk/ltg/reports/plag.xml
A Review of Electronic Services for Plagiarism Detection in Student Submissions from South Bank
University, UK. "Four
services are discussed: the Measure of Software Similarity (MOSS)
service for program source code and the plagiarism.org, Integriguard
and copycatch.com services for free-text submissions."
http://www.ics.heacademy.ac.uk/events/presentations/317_Culwin.pdf
Some of the
same paper mills that students might use for plagiarism can be used
by instructors to examine suspected papers. This works well
with sites that offer free papers, because Internet search
engines can locate these sites. Search engines do not have access, however, to
the databases of the sites that sell papers. These require
subscription, and it would be impossible for an instructor to
subscribe to all of them and check each one individually.
We suggest that, as a last resort,
instructors search: 1) the free sites of paper mills; 2) some sites
with closed databases but with accessible paper abstracts.
These sites allow search by topic or by keyword. Compare the
descriptions of the papers your search returns with the suspect paper.
To find these sites, open Google and other search
engines and type "term papers" or "free term papers".
TIP: To save time, open a search engine and
type "research paper" and the topic of the student's paper. The
sites retrieved will include the paper mills which have papers on
the topic.
In many
cases, the professor's own database of previous papers is a great
plagiarism detection tool. For that, always
require students to hand in both a hard
copy and an electronic version of their papers.
Colleagues teaching the same type of course (and
asking for the same type of assignments) can share a common and thus
larger database to avoid the same paper being
submitted to different instructors.