Simple data processing with the R programming language in RStudio converts raw data into information. The method consists of data acquisition, data input, data selection, division by study program, data splitting with k-fold cross-validation, text word cloud modeling, and model evaluation. Secondary data were acquired in Excel format in May 2021; the dataset has 71 records with 5 variables: number, name, gender, faculty, and study program (Prodi). Data selection keeps the variables that are needed and deletes those that are not, leaving 2 variables, Name and Study Program, across the same 71 records. K-fold cross-validation yields 54 training records and 17 testing records. The text mining model is visualized as a word cloud. The word cloud for the testing data shows that the 4 most frequent words are "STI" with a frequency of 5, "Teacher" with a frequency of 4, "Law" with a frequency of 3, and "Agriculture" with a frequency of 2.
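The splitting and word-counting steps can be illustrated with a minimal sketch. It is written in Python rather than the R used in the study, and it is not the authors' code. The value k = 4 is an assumption (the source does not state k); it is chosen only because splitting 71 records into 4 folds gives one fold of 17 records and 54 records in the remaining folds, matching the reported sizes. The function names and the placeholder records are hypothetical.

```python
import random
from collections import Counter


def k_fold_indices(n_records, k, seed=0):
    """Shuffle record indices and split them into k folds of near-equal size."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    base, extra = divmod(n_records, k)
    folds, start = [], 0
    for i in range(k):
        # The first `extra` folds each take one additional record.
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def word_frequencies(texts):
    """Count whitespace-separated tokens across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(text.split())
    return counts


# Hypothetical stand-in for the 71 study program values; not the study's data.
programs = ["STI", "Teacher", "Law", "Agriculture"]
records = [programs[i % len(programs)] for i in range(71)]

folds = k_fold_indices(len(records), k=4)  # k = 4 is an assumption
test_idx = folds[-1]                       # the smallest fold, 17 records
train_idx = [i for fold in folds[:-1] for i in fold]  # the other 54 records

# Word frequencies over the testing records feed the word cloud.
test_counts = word_frequencies(records[i] for i in test_idx)
print(len(train_idx), len(test_idx))
print(test_counts.most_common(4))
```

The frequencies printed here come from the placeholder records and will not reproduce the values reported above; a word cloud library would take the resulting counts as input and scale each word by its frequency.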