|
Plagiarism Detection in Software Projects Using Abstract Syntax Trees
Szymutko, Marek ; Seda, Pavel
Plagiarism is a hot topic in modern education and science. It requires special attention because the internet makes committing plagiarism very easy. This problem can be fought with prevention or detection methods, both of which are used in this work. This paper introduces an implementation of a submission scheme for students’ projects in classes taught at the Brno University of Technology. Scripts that create an automatic hand-in space for each student were developed. Students have restricted privileges within these spaces on the GitLab cloud service. For plagiarism detection, a tool written in Python was developed. This tool utilizes Abstract Syntax Trees compiled from the source code that is part of the students’ solutions. The output of the comparison is a tabular file in the .xlsx format, which allows a detailed view. Ongoing implementation is focused on widening the tool’s usability by adding a Python similarity comparison engine.
|
|
Detection of plagiarism in software projects in the BDS course
Szymutko, Marek ; Přinosil, Jiří (referee) ; Šeda, Pavel (advisor)
Plagiarism is a widespread problem that can be fought by prevention or detection methods. This thesis summarizes methods for automated plagiarism detection. To parse source code, an open-source abstract syntax tree compiler was employed, and its functionality is demonstrated in this thesis. To reduce the mutual visibility of students’ projects, a proposal for the submission process was created, using the GitLab cloud service. The students’ spaces are initialized via Bash scripts. Other scripts were created to archive spaces and to create spaces for groups of students in the GitLab service. A similarity-detecting tool was created in the Python programming language. The tool was tailored to the subject BPC-BDS for detecting plagiarism in students’ assignments written in Java or Python, but it can also be used in other subjects. For similarity detection, numerical metrics and abstract syntax trees were used. The result of comparing the projects and their individual parts is represented as an integer value and written to a tabular file in the xlsx format. This thesis also summarizes the strengths and weaknesses of the implemented solution and lists problems that were encountered during implementation. A case study of plagiarism in the subject BDS in the academic year 2022/2023 is also included.
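To illustrate the general idea of comparing programs through their abstract syntax trees, here is a minimal sketch using Python's standard `ast` module. It is not the thesis's tool: the function names `node_type_counts` and `ast_similarity` are hypothetical, and the multiset-overlap score is an assumed stand-in for the metrics the thesis actually uses. The sketch only shows why an AST comparison is insensitive to renamed identifiers.

```python
import ast
from collections import Counter


def node_type_counts(source: str) -> Counter:
    """Count AST node types, ignoring identifier names and literal values."""
    tree = ast.parse(source)
    return Counter(type(node).__name__ for node in ast.walk(tree))


def ast_similarity(source_a: str, source_b: str) -> int:
    """Return an integer score from 0 to 100 based on shared node-type counts.

    Assumed metric: multiset Jaccard overlap of node types (illustrative only).
    """
    a, b = node_type_counts(source_a), node_type_counts(source_b)
    shared = sum((a & b).values())  # per-type minimum
    total = sum((a | b).values())   # per-type maximum
    return round(100 * shared / total) if total else 100


original = "def add(x, y):\n    return x + y\n"
renamed = "def plus(first, second):\n    return first + second\n"
different = "for i in range(3):\n    print(i)\n"

print(ast_similarity(original, renamed))    # renaming leaves the tree shape unchanged
print(ast_similarity(original, different))  # structurally different code scores lower
```

Because only node types are counted, this sketch ignores the order and nesting of statements; a practical detector would also compare subtree structure.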
|
|
Detection of Duplicates in Huge Web Databases
Sadloň, Vladimír ; Galamboš, Leo (advisor) ; Kopecký, Michal (referee)
This master's thesis analyses the methods used for duplicate document detection and the possibilities of integrating them with a web search engine. It offers an overview of commonly used methods, from which it chooses approximation of the Jaccard similarity measure in combination with shingling. The chosen method is adapted for implementation in the Egothor web search engine environment. The aim of the thesis is to present this implementation, describe its features, and find the most suitable parameters for the detection to run in real time. An important feature of the described method is that it allows dynamic changes to the collection of indexed documents.
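The combination of shingling and an approximated Jaccard measure can be sketched as follows. This is an illustration under assumptions, not the Egothor implementation: the abstract does not name the approximation technique, so MinHash is assumed here as a common choice, and the helper names, the shingle size `k`, and the number of hash functions are all hypothetical.

```python
import hashlib


def shingles(text: str, k: int = 3) -> set:
    """Return the set of k-word shingles (overlapping word windows) of a text."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)} if words else set()
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity: |A intersect B| / |A union B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def _seeded_hash(shingle: str, seed: int) -> int:
    """One member of a family of hash functions, selected by the seed."""
    data = f"{seed}:{shingle}".encode("utf-8")
    return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")


def minhash_signature(shingle_set: set, num_hashes: int = 128) -> list:
    """Keep the minimum hash per hash function (assumes a non-empty set)."""
    if not shingle_set:
        raise ValueError("cannot build a signature for an empty shingle set")
    return [min(_seeded_hash(s, seed) for s in shingle_set)
            for seed in range(num_hashes)]


def estimate_jaccard(sig_a: list, sig_b: list) -> float:
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


doc_a = "the quick brown fox jumps over the lazy dog"
doc_b = "the quick brown fox leaps over the lazy dog"

set_a, set_b = shingles(doc_a), shingles(doc_b)
print(jaccard(set_a, set_b))  # exact value on the full shingle sets
print(estimate_jaccard(minhash_signature(set_a), minhash_signature(set_b)))
```

The fixed-length signatures are what make such a scheme attractive for a large index: documents can be compared, added, or removed without keeping every full shingle set available for comparison.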
|
|
Detection of source code plagiarism
Bláhová, Barbora ; Harabiš, Vratislav (referee) ; Kašpar, Jakub (advisor)
The purpose of this thesis is to introduce the topic of source code plagiarism and to propose methods that will be used to create an original plagiarism detector. The theoretical part of the thesis gives definitions and describes the most common methods of plagiarism, both in general and in source code. Furthermore, it presents existing types of detectors and states the principles according to which detection can be performed. The practical part deals with the implementation of the detection and its testing.
|
|
Detection of similarity in program codes
Maťašová, Kristýna ; Vítek, Martin (referee) ; Kašpar, Jakub (advisor)
The bachelor's thesis introduces the concept of plagiarism and its possible kinds. It focuses on the problem of detecting the similarity of source codes, especially those with graphical interfaces in the MATLAB environment. It also describes existing detectors. The practical part of the thesis focuses on finding appropriate flags for detecting similarity in source codes and introduces a metric for the detected flags. It also describes the internal logic of the created similarity detector and discusses the results of its testing.
|
|
Plagiarism Detection in Program Codes Using Mapping Technique
Kašpar, Jakub
The aim of this paper is to introduce the problem of plagiarism and propose a method for plagiarism detection in program codes. The first part of the paper describes the basic definition of plagiarism. The paper then introduces the principle of preprocessing and the process of localizing signs of plagiarism. The last part presents an algorithm for comparing the detected signs to get the best results possible. The detector was tested on student projects from the BTBIO study program.
|
|
Plagiarism detection in programme codes
Skoupilová, Alena ; Vítek, Martin (referee) ; Kašpar, Jakub (advisor)
The main goal of this thesis is to introduce the meaning of plagiarism, its types, and its occurrence in the academic field in the form of textual plagiarism and, mainly, source-code plagiarism. The thesis also introduces the principles and types of source-code plagiarism detection and presents existing detection tools. A detector that computes source-code similarity by detecting and counting chosen attributes is implemented and described. The reliability of the detector is tested on a database of students’ projects.
|
|
Plagiarism detection of text documents
Nezval, Jiří ; Kašpar, Jakub (referee) ; Vítek, Martin (advisor)
This thesis informs the reader about plagiarism. It explains basic methods and approaches to its detection. Furthermore, it contains a practical part, realized in the Matlab environment, which involves creating a plagiarism detector. The detector was tested on a database of real theses. A graphical user interface is also implemented in the detector.
|