A Recommendation System Framework for Educational Content

Reinforcement in Virtual Learning Environments

Adson R. P. Damasceno, Lucas C. Carneiro, Jo

ao Victor F. T. De Sampaio, Allberson B. O. Dantas,

Eudenia Magalh

aes, Paulo Henrique M. Maia and Francisco C. M. B. Oliveira

UECE - State University of Cear

a, Fortaleza, Brazil

Keywords:

Recommendation of Educational Content, Educational Content Reinforcement, Recommendation Systems,

Virtual Learning Environments.

Abstract:

In Virtual Learning Environments, tutors play a vital role by supporting students and improving their learning

through the courses. One important task is to identify content with which the students struggle and give

them suggestions for educational resources to reinforce their learning and overcome difﬁculties. However,

providing individualized suggestions for each student may be infeasible, especially for courses with many

enrolled students. In this work, we propose and validate a framework for building recommendation systems

of educational content for Virtual Learning Environments. Our proposed system identiﬁes the content that a

student needs to reinforce based on the results of his assessments and recommends resources that best relate

to the questions that he answered incorrectly, using Information Retrieval, Machine Learning, and Natural

Language Processing techniques. We validate our proposed solution by taking as a case study data collected

from DAL - Dell Accessible Learning, an distance learning platform. We built a dataset with content from 8

courses to compare the performance of different methods and text representations in our framework. Our best

result achieved an accuracy of 0.89 using a Nearest Neighbor method with TF-IDF representation.

1 INTRODUCTION

During the learning process, a critical learning oppor-

tunity is created when the student is asked in assess-

ments to use the concepts presented in a context dif-

ferent from the one they were introduced to. It is by

working with new ideas that the apprentice appropri-

ates them. This is also where his incompetence sur-

faces (McLeod, 2010). At this point, the instructor’s

intervention is precious, providing the necessary in-

formation for the practical completion of the appro-

priation process. However, in online courses medi-

ated through Virtual Learning Environments (VLE),

the sheer volume of online activities can be too much

for the tutor, and the workload on online tutors are

often reported to be signiﬁcantly greater than what it

is in a face-to-face teaching context (Bernath and Ru-

bin, 2001). It is challenging and time-consuming for

tutors to keep track of all students’ learning activities

to provide individualized recommendations and feed-

back, especially when the number of enrolled students

in a course grows. Some VLE may not even have tu-

tors available to perform such activity. Therefore, a

VLE should make available the greatest number of

resources that favor the students’ initiative, since the

main premises of distance learning are the optimiza-

tion of time, ﬂexibility, and autonomy in the knowl-

edge retention process.

One solution to mitigate this problem is the use

of Machine Learning (ML) to build recommendation

systems of educational content. Such systems aim

to provide valuable educational resource suggestions

to students to reinforce their knowledge in contents

that they lack appropriation. Suggestions are given

based on data patterns collected from the users and

the educational resources, learned by ML algorithms.

Recommendation systems of educational content al-

low the reduction of workload on online tutors and

increase the autonomy of students during their learn-

ing process in VLE.

In light of the provided scenario, we offer a

framework for developing recommendation systems

for educational content reinforcement in VLE. Our

framework consists of gathering textual data from the

courses of a VLE to build a knowledge base, and

employ Information Retrieval (IR), ML and Natu-

ral Language Processing (NLP) techniques to sug-

gest educational content. The suggestions are related

228

Damasceno, A., Carneiro, L., T. De Sampaio, J., Dantas, A., Magalhães, E., Maia, P. and Oliveira, F.

A Recommendation System Framework for Educational Content Reinforcement in Virtual Learning Environments.

DOI: 10.5220/0011032000003182

In Proceedings of the 14th International Conference on Computer Supported Education (CSEDU 2022) - Volume 1, pages 228-235

ISBN: 978-989-758-562-3; ISSN: 2184-5026

to questions incorrectly answered by students during

pedagogical exams, thus reinforcing the content with

which they struggled the most. Our approach uses

only textual data from pedagogical resources and in-

correctly answered questions and can be implemented

in any VLE, provided that texts of the education con-

tents can be gathered. To validate the viability and

performance of the proposed framework, we consid-

ered a case study where we evaluated the recommen-

dation methods on data gathered from Dell Accessi-

ble Learning (DAL) platform, an accessible distance

learning environment. The obtained results show the

viability of creating and implementing recommenda-

tion systems in VLEs using our framework. Further-

more, we believe that, in addition to reducing the ef-

fort of online tutors, this application will allow stu-

dents to gain more autonomy in the learning process.

The remainder of this paper is organized as fol-

lows: Section 2 provides an overview of related works

in the ﬁeld of recommendation systems in educational

contexts. Next, our proposed methodologies and their

evaluation are described in Section 3. The results and

inferences drawn from them are presented in Section

4. Finally, the conclusions we can make from our in-

vestigation are presented in Section 5.

2 RELATED WORK

Computer systems technologies have been widely

used in the construction of educational software,

from training systems, educational games to virtual

learning environments capable of intelligently recom-

mending educational content to students. A recom-

mendation system of educational content is a func-

tionality that provides students in VLE with pedagog-

ical resources that are likely to be helpful to their

learning process (Shani and Gunawardana, 2011).

There are numerous research initiatives in educational

resource recommendation systems (Marante et al.,

2020; Rivera et al., 2018; Kulkarni et al., 2020;

Kla

snja-Mili

cevi

c et al., 2015). In the following para-

graphs, we present some promising directions.

In recent years, recommendation systems for ed-

ucational content based on ﬁnding patterns of simi-

larity among students have been widely investigated

(Zaliane, 2006; Urdaneta-Ponte et al., 2021; Nur-

janah, 2016). These patterns can occur in sev-

eral ways, including performance measured in corre-

sponding activities, patterns of access to the system,

and socioeconomic issues. Such systems can use a va-

riety of learning techniques, including data grouping,

association rules, natural language processing, and

collaborative ﬁltering (Sicilia et al., 2010; Indrayadi

and Nurjanah, 2015).

A relevant research question in recommendation

systems for educational content is how to identify stu-

dent learning patterns (Kla

snja-Mili

cevi

c et al., 2011;

Truong, 2016; Yan et al., 2021). In fact, the study of

individual student learning styles is a well-established

research ﬁeld, with several learning proﬁles identi-

ﬁed, each proposing different descriptions and classi-

ﬁcations of learning types (Cofﬁeld et al., 2004). In-

creasingly adopted methods consist in inferring stu-

dents’ learning patterns based on their interactions

with the system, such as their access to pages and on-

line content, the number of attempts they make in as-

sessments, among others (Kelly and Tangney, 2004;

Gope and Jain, 2017; Nafea et al., 2019; Jyothi et al.,

2012).

In this work, the only data gathered from the stu-

dents used as input are the texts of incorrectly an-

swered questions in courses’ assessments. Our frame-

work uses NLP techniques to predict which content

in a knowledge base is best related to these ques-

tions and recommend it to students in order to rein-

force their learning process. We did not ﬁnd in liter-

ature works that recommend content based purely on

the text from questions that students answered incor-

rectly. In the next section, the details of our recom-

mendation system are described thoroughly.

3 METHODOLOGY

This section comprises the proposed solution, as well

as the experimental setup to validate our approach

considering the speciﬁc case of DAL, an accessi-

ble distance learning environment focused mainly on

technology courses.

Our approach consists of creating a knowledge

base, or corpus, of educational content and recom-

mending the most similar documents using the texts

of questions answered incorrectly during evaluation

activities as queries. We emphasize that our strategy

applies to other educational platforms as long as it is

possible to collect a corpus of educational content.

3.1 Corpus Construction and Analysis

We created our corpus with documents of educational

content from the courses offered on the DAL plat-

form. This platform provides a variety of online

courses in various disciplines, including courses in

the areas of data science, web development, mobile

development, management, languages, inter alia. The

lessons are displayed as videos or texts, the latter be-

ing split into topics. The original language of most of

A Recommendation System Framework for Educational Content Reinforcement in Virtual Learning Environments

229

Figure 1: Frequency of the quantity of educational contents

in lessons.

the courses is Brazilian Portuguese.

We developed a script that automatically down-

loads the ﬁles from the DAL platform and builds the

corpus. The textual content from web page lessons

is collected by parsing HTML ﬁles, while video con-

tent is collected by parsing its subtitles. The corpus is

stored as a table in which each row represents a doc-

ument. We describe the table’s columns in Table 1.

Table 1: Description of variables in the corpus in table for-

mat.

Variable Description

Course Name of the course from where the content

was collected. E.g.: Data Visualization.

Lesson ID Number that uniquely identiﬁes the lesson

from where the content was collected. It will

be used to ﬁlter out the possible recommen-

dations. E.g.: 102192363916253.

Content name The title of the content. This information is

what will be recommended to the student. It

can be a topic from a web lesson or the num-

ber of a video lesson. E.g.: Topic 1 - Corre-

lation and scatter plots.

Text Text of the document. This will be used to

compare with a text of a question to ﬁnd the

best content to recommend. E.g. Under-

stand the concept of correlation between two

variables; Learn to create two-variable scat-

ter plots using seaborn; Learn how to cre-

ate scatter plots with a regression line using

seaborn...

Each lesson has between 2 and 6 educational con-

tents that can be recommended. Figure 1 illustrates

the frequency of quantity of contents in lessons. Most

lessons have less than four contents, with two con-

tents in a class being the most frequent. The higher

the quantity of contents in one class, the more a rec-

ommendation system is helpful for a student, since

there is a higher space of search to reinforce the con-

tents, and the system can direct its efforts to achieve

better results.

We also analyse the distribution of the quantity of

words in the documents, illustrated in Figure 2. In

general, the documents are long texts. The smaller

document have 180 words, and the longer have near

Figure 2: Frequency of the quantity of words in documents.

6000 words. Half of the documents have more than

1000 words.

3.2 Recommendation Methods

As discussed previously, we aim to recommend edu-

cational content related to incorrectly answered ques-

tions during assessment activities. In this sense, the

challenge relies in identify which content is more rel-

evant for a given question and lesson. We highlight

that, in this scenario, every question is associated

with a unique lesson that is known beforehand. To

achieve this goal we took two approaches well know

in literature: one based on ﬁnding the document with

the highest cosine similarity in a vector representa-

tion, and the other based on supervised learning, us-

ing the Naive-Bayes method for text classiﬁcation.

These methods were chosen due its simplicity and

good performance in scenarios where there is not a

large amount of data available. Both approaches are

discussed in detail below.

3.2.1 Nearest Neighbor

The Nearest Neighbor (NN) approach is a typical IR

method, which is related with content-based recom-

mendation systems (Pazzani and Billsus, 2007). This

method consists of, for a given question and lesson,

ﬁnd the most similar document in the subset of doc-

uments related to that lesson in our corpus, and then

recommend the name of the related content. In the

context of artiﬁcial intelligence applied in education,

this approach was previously used by (Montesuma

et al., 2021) for retrieving documents that answer stu-

dents’ conceptual questions. We formalize ours as

follows.

Let q ∈ R

refers to a question in a vec-

tor representation, related to a lesson `, and D =

,...,d

} refers to a corpus formed by N docu-

ments gathered from lessons of online courses. Each

document d

is deﬁned as a tuple d

= {v

CSEDU 2022 - 14th International Conference on Computer Supported Education

230

corresponding to the document’s vector representa-

tion v

∈ R

, the associated lesson id of the doc-

ument `

, and the name of the content c

, respec-

tively. Each lesson ` is associated to a subset S =

,· ·· , d

} ⊂ D of P documents. For a given

input question q from a lesson ` and a corpus D, the

content name c

∈ S that best relates to q is given by

argmaxsim(q, v

) where the function sim(u, v) repre-

sents the cosine similarity between two vectors.

Any vector representation of text can be used in

this approach. In our experiments, we used two

well-known vector representations for documents:

Term Frequency-Inverse Document Frequency (TF-

IDF) (Jones, 1972) and centroids of Word Embed-

dings (Galke et al., 2017), more speciﬁcally the

Word2Vec embeddings (Mikolov et al., 2013). TF-

IDF representation models the importance of each

word in a document so that documents that have a

higher intersection of keywords will have greater sim-

ilarity. Word2Vec represents each word in a dense

vector space, where semantically similar words are

spatially close. The centroid of a text is calculated

as the mean vector of all words in that text. Thus,

it is intended that texts with similar semantics have

greater similarity in this vector space, which is de-

sired in many scenarios. Figure 3 illustrates the NN

approach. An advantage of the NN approach is that it

does not requires model training, since it is a instance-

based learning method. Updating the model to make

recommendations for new lessons only requires up-

dating the corpus.

question text (query)

lecture ID

preprocessing and

vectorization

filtered knowledge base

with documents related to

input lecture

knowledge base

(texts and matrix of

vectorized documents)

cosine similarity

Document with

higher similarity

Figure 3: Diagram of the NN approach.

3.2.2 Naive Bayes

The Naive Bayes (NB) is a well-known supervised

learning ML method used for classiﬁcation, based on

the Bayes Rule. In this strategy, the problem is mod-

eled as a text classiﬁcation task, where from a given

question and a lesson, we aim to predict the name of

the content related to that question. The NB classi-

ﬁer assumes that the features of a data point are inde-

pendent. More speciﬁcally, we use the Multinomial

Naive Bayes classiﬁer (Kibriya et al., 2004), which

assumes that the likelihood of TF-IDF features fol-

lows a multinomial distribution. Although the multi-

nomial distribution presumes that variables are dis-

crete counts, in practice, TF-IDF fractional counts are

known to perform better.

Since we known beforehand that a given question

is related to a speciﬁc lesson, we also know that the

prediction should be one of the contents related to that

lesson. Therefore, instead of training one model with

all the corpus D, we train one independent model for

each lesson. In this way, a lesson ` is associated to a

subset S

= {d

,· ·· , d

} ⊂ D and with a model

. Each model m

is trained with the documents

∈ S and can predict only the contents c

∈ d

. As

so, before predict a content for a new question q, one

must ﬁrst choose the correct model trained with data

from the lesson related to q.

Since the target classes correspond to the content

name c

of each document, we only have one example

for each class. This implies that the prior probabil-

ity for each class is uniform, so only the likelihood

term from the Bayes rule is used to determine the pre-

diction. We highlight that any other supervised model

for classiﬁcation could be used, following the strategy

of training one model for each lesson, but we choose

NB for its relative low complexity, low computational

cost to train and fair performance in many text classi-

ﬁcation problems, in particular when there are a low

amount of examples available for each class. Figure 4

illustrates the NB approach.

question text (query)

lecture ID

preprocessing and

vectorization

model related to input

lecture

set of Naive-Bayes models,

where each model is trained

for one lecture

model i

model 1

Predicted

document

Figure 4: Diagram of the NB approach.

3.3 Validation Data Set

To validate and compare the performance of our ap-

proaches, we manually built a data set with a sample

of actual questions from DAL platform courses. We

collected multiple-choice questions from assessments

in each class in the 8 most popular courses in DAL.

The labels, i.e., the contents names, were then as-

signed by tutors with expertise in the ﬁelds by check-

ing in the lesson the content each question refers.

Figure 5 illustrates the courses and the number of

samples collected. The data set has 122 questions

from 8 courses. Each question is related to only one

lesson, which is known in advance. Some courses

have a larger number of samples than others, due to

A Recommendation System Framework for Educational Content Reinforcement in Virtual Learning Environments

231

the availability of tutors with the expertise to label the

data.

Figure 5: Quantity of questions by course in the validation

dataset.

Besides the text of the question itself and the asso-

ciated content name, it was also collected the course

name, lesson id, multiple-choice items, and correct

answer. Table 2 describes each variable in this data

set.

We use this data set by taking the text and the les-

son ID from each question as input to our methods

and comparing the predicted content name with the

original content name manually assigned to the ques-

tion. We also aim to investigate how the addition of

extra information as the multiple-choice items or the

correct item impacts the performance of our models.

Therefore, we compare the predictions using the text

of the question alone, text and multiple-choice items,

and text and only the correct answer. We believe that

the addition of the correct answer may provide extra

information that may help our models improve per-

formance, due to the existence of keywords in many

of the answers.

For comparing between methods, we use as a met-

ric the recall for each course, i.e., the proportion of

correct predictions in each course. For further com-

parison, we use the balanced accuracy, which corre-

sponds simply to the average recall of each course.

Table 2: Description of columns in the validation data set.

Variable Description

Course Name of the course.

Lesson Numeric ID that uniquely identiﬁes the les-

son from which the question was collected.

Text Text of the question.

Multiple-choice items Items with possible answers to the question.

Correct answer The item that correctly responds the question.

Content The target that we want to predict. The name

of the content related to the question.

4 RESULTS AND DISCUSSION

In this section, we describe the experimental setup

and results of the proposed experiments. The main

goal of these experiments was to validate the viabil-

ity of our framework, but also to compare the per-

formance of each proposed method in the particular

scenario described in our validation data set. We also

highlight that after the experiments, we implemented

a recommendation system in DAL platform, which

send recommendations to student thorough a chatbot

described in (Damasceno et al., 2020). The integra-

tion of the system with the chatbot occurs by using a

message broker.

4.1 Experimental Setup

The only preprocessing applied to the texts is convert-

ing them to lowercase. This choice is due to the high

variability of expressions and terms present in tech-

nology courses, such as words in other languages and

programming language keywords and symbols. We

stated empirically that more sophisticated preprocess-

ing steps, such as lemmatization, resulted in a worse

general result. As pointed out previously, we use TF-

IDF and Word2Vec centroids to represent text in a

vector space. The TF-IDF representation considers

unigrams and bigrams as tokens, separated by spaces

or punctuation, and the resulting vector space has a

dimension of 227,917. The Word2Vec model was

trained using all documents from our corpus, with a

vector dimension of 100.

4.2 Results

We ﬁrst evaluate the effect of adding multiple-choice

items and the correct answer to the text input for our

methods. Table 3 describes the balanced accuracy for

each method, when different inputs are taken into ac-

count. We can observe that for each variation of in-

puts, the NN approach with Word2Vec features per-

formed the worst. The NN and NB approaches using

TF-IDF features have similar performances for every

input, being somewhat equivalent.

This indicates that using the centroid of word vec-

tors is not a good representation for this task. This

may be due to the loss of information that occurs

when representing a text composed of many word

vectors as an average vector. The information loss

is more signiﬁcant for longer texts, which is the case

for most of the documents in the corpus. In fact,

Word2Vec focuses on modeling the semantic simi-

larity of words, not their importance in a document,

unlike TF-IDF. The experiments revealed the latter to

CSEDU 2022 - 14th International Conference on Computer Supported Education

232

be more suitable for the task of ﬁnding a document re-

lated to a question by measuring the cosine similarity

between them.

In terms of which input works best for the task,

both adding multiple-choice items and adding cor-

rect answers improved accuracy when compared to

using only the text of the question. However, adding

all multiple-choice items had a slightly better result

than adding only the right answer. Despite the correct

answer being a more precise information, adding all

items, including the incorrect ones, was revealed to

be a better choice, since words in wrong answers may

still be present in the target documents because they

can be related to the same topics.

Table 3: Balanced accuracy for different inputs. The best

results were obtained with the methods NN and NB in TF-

IDF representation with questions and all of the multiple-

choice items. Word2Vec features did not perform well for

all the inputs.

NN TF-IDF NN

Word2Vec

NB TF-IDF

Question only 0.82 0.63 0.79

Question and

multiple-choice

items

0.89 0.60 0.86

Question and correct

answer

0.87 0.63 0.84

We further investigate the best input option by an-

alyzing the performance in each course for the differ-

ent methods. These results are presented in Table 4.

Table 4: Recall by course for each method, using as input

the question and all the multiple-choice items.

Course NN - TF-IDF NN Word2Vec NB - TF-IDF

Customer Service 1.00 0.30 1.00

Databases with Java 0.81 0.81 0.81

Java OOP 0.76 0.56 0.76

Introduction to Python 0.85 0.62 0.69

Programming Logic 1.00 0.70 1.00

Java Full Stack 0.90 0.70 0.80

Machine Learning 0.79 0.64 0.79

Data Visualization 1.00 0.40 1.00

Balanced Accuracy 0.89 0.59 0.86

As expected, NN with Word2Vec features per-

formed the worst for all courses, with the notable ex-

ception of the course Database Fundamentals in Java,

for which all methods had the same recall. Both NN

and NB with TF-IDF features performed similarly in

each course. In most courses, they achieved the same

recall, possibly due to the relatively low quantity of

samples. In fact, those methods are somewhat simi-

lar, as they take into consideration the importance of

words in documents, weighted by TF-IDF.

NN method performed better than NB in courses

like Introduction to Python and Java Full Stack, there-

fore having a slightly better balanced accuracy. In

practical terms, the NN method is also less expen-

sive to maintain and scale, since it does not require

the training of new models when new courses are cre-

ated, only that the corpus be updated. On the other

hand, because one NB model is trained for each class

in a course independently, it would require new mod-

els to be trained every time a course is updated or a

new course is added to the corpus.

Our best result, obtained with NN and TF-IDF

features achieved a balanced accuracy of 0.89, which

is considered a satisfactory result for our pedagogical

purposes. We also believe that even when the desired

educational content is not recommended, the recom-

mended document still have a signiﬁcant relation with

the question, since both come from the same lesson.

Thus, even in case of errors, the target and the pre-

dicted documents have closely related topics, which

would still be beneﬁcial to the students. To conﬁrm

this statement, we next analyze some examples of in-

correct recommendations.

4.3 Error Analysis

Out of 122 questions, our best method predicted in-

correctly 18 contents. To recognize the limitations

of this method, we performed a qualitative analysis,

comparing the text of the incorrect predictions with

the text of the desired ones. More speciﬁcally, we

analyzed the errors in the Machine Learning course,

which had 6 incorrect predictions out of a total of 28

questions. Table 5 shows this comparison. The ﬁrst

column in the table is the name of the content pre-

dicted by the model and the second one (Sim Q/P) is

the cosine similarity between the text of the question

and the text of the predicted content in TF-IDF repre-

sentation. The third column is the name of the target

content and the fourth one is the cosine similarity be-

tween the text of the question and the text of the target

content (Sim Q/T).

Table 5: Comparison between predicted and target contents

in the incorrect predictions from Machine Learning course.

Most of predicted contents are introductory topics that pre-

cedes the targets in the lesson.

Predicted content Sim Q/P Correct content Sim Q/T

Introduction to data

analysis with Python

0.033 Statistical analysis of a data set 0.025

Deleting columns 0.074 Allocating missing values 0.056

The artiﬁcial neuron 0.049 The ”exclusive-or” example 0.027

Introduction to Ten-

sorﬂow

0.049 Construction and evaluation of

the Tensorﬂow model

0.043

Introduction to SVM 0.102 Implementing SVM with Kernels 0.076

Confusion, Recall and

Confusion Matrix

0.049 Introduction to Model Evaluation 0.031

From Table 5, we notice that most of the errors

in the Machine Learning course occurred by recom-

A Recommendation System Framework for Educational Content Reinforcement in Virtual Learning Environments

233

mending an introduction content instead of the target,

a most advanced content. That is, the recommended

contents usually precede the targets. In this course,

this occurred 4 out of 6 times.

A possible reason behind that is that documents

of introductory contents in this corpus are usually

longer, having more words. They also usually intro-

duce some of the concepts that will be studied in the

next content, thus having a lot of words in common.

So it is likely that a question has words in common

with introductory content. Although the recommen-

dation of this introductory content is not misleading

to the student, it lacks the precision that is expected

from a recommendation system and may not be help-

ful enough to improve the student learning process.

This gap shows room for improvement by exploring

other methods that don’t solely rely on the frequency

of words in texts.

5 CONCLUSIONS

In this work, we proposed an approach to build a rec-

ommendation system of educational content in VLE.

The approach consists of create a corpus of docu-

ments from online courses and recommend to a stu-

dent the contents best related to the incorrectly an-

swered questions, aiming to reinforce the students

learning process. We explored two methods to ﬁnd

the best document for a given question and lesson,

the Nearest Neighbor method, based in cosine simi-

larity from documents represented in a vector space

(TF-IDF or Word2Vec), and the Naive Bayes method.

We evaluated this approach in a study case using

data collected from DAL platform. Our best result

achieved a balanced accuracy of 0.89 using NN with

TF-IDF, which was considered satisfactory but still

shows room for improvement. In general, the results

showed the feasibility of implementing our frame-

work, which can directly impact the students’ learn-

ing process by improving their autonomy. A practical

consequence was the implementation of a recommen-

dation system in DAL platform with the settings that

achieved better results in these experiments.

As future work, we intend to evaluate the impact

of this application in the learning process of students

in DAL platform, collecting quantitative and qualita-

tive data of the experience and performance of the

students while taking lessons. We also want to ex-

periment with more sophisticated and state of the art

methods or different similarity metrics or text repre-

sentations to achieve better accuracy in recommenda-

tions, e.g. attention-based models as BERT (Vaswani

et al., 2017; Akkalyoncu Yilmaz et al., 2019), pre-

trained deep learning models, and transfer learning

(Do and Ng, 2005; Yan and Zhang, 2009; Deb, 2019).

Finally, we intend to validate our framework on a

larger dataset, with data with greater variation, col-

lected from courses in different areas.

REFERENCES

Akkalyoncu Yilmaz, Z., Wang, S., Yang, W., Zhang, H.,

and Lin, J. (2019). Applying BERT to document re-

trieval with birch. In Proceedings of the 2019 Con-

ference on Empirical Methods in Natural Language

Processing and the 9th International Joint Conference

on Natural Language Processing (EMNLP-IJCNLP):

System Demonstrations, pages 19–24, Hong Kong,

China. Association for Computational Linguistics.

Bernath, U. and Rubin, E. (2001). Professional devel-

opment in distance education – a successful experi-

ment and future directions. Innovations in Open &

Distance Learning, Successful Development of Online

and Web-Based Learning, pages 213–223.

Cofﬁeld, F., Moseley, D., Hall, E., and Ecclestone, K.

(2004). Should we be using learning styles? what

research has to say to practice.

Damasceno, A. R., Martins, A. R., Chagas, M. L., Barros,

E. M., Maia, P. H. M., and Oliveira, F. C. (2020). Stu-

art: an intelligent tutoring system for increasing scal-

ability of distance education courses. In Proceedings

of the 19th Brazilian Symposium on Human Factors in

Computing Systems, pages 1–10.

Deb, S. (2019). Driving content recommendations by build-

ing a knowledge base using weak supervision and

transfer learning. In Proceedings of the 13th ACM

Conference on Recommender Systems, pages 531–

531.

Do, C. B. and Ng, A. Y. (2005). Transfer learning for text

classiﬁcation. Advances in neural information pro-

cessing systems, 18:299–306.

Galke, L., Saleh, A., and Scherp, A. (2017). Word em-

beddings for practical information retrieval. INFOR-

MATIK 2017.

Gope, J. and Jain, S. K. (2017). A learning styles based rec-

ommender system prototype for edx courses. In 2017

International Conference on Smart Technologies for

Smart Nation (SmartTechCon), pages 414–419. IEEE.

Indrayadi, F. and Nurjanah, D. (2015). Combining learner’s

preference and similar peers’ experience in adaptive

learning. In CSEDU (1), pages 486–493.

Jones, K. S. (1972). A statistical interpretation of term

speciﬁcity and its application in retrieval. Journal of

Documentation, 28:11–21.

Jyothi, N., Bhan, K., Mothukuri, U., Jain, S., and Jain, D.

(2012). A recommender system assisting instructor in

building learning path for personalized learning sys-

tem. In 2012 IEEE Fourth International Conference

on Technology for Education, pages 228–230. IEEE.

Kelly, D. and Tangney, B. (2004). Predicting learning char-

acteristics in a multiple intelligence based tutoring

CSEDU 2022 - 14th International Conference on Computer Supported Education

234

system. In International Conference on Intelligent Tu-

toring Systems, pages 678–688. Springer.

Kibriya, A. M., Frank, E., Pfahringer, B., and Holmes, G.

(2004). Multinomial naive bayes for text categoriza-

tion revisited. In Australasian Joint Conference on

Artiﬁcial Intelligence, pages 488–499. Springer.

Kla

snja-Mili

cevi

c, A., Ivanovi

c, M., and Nanopoulos, A.

(2015). Recommender systems in e-learning environ-

ments: a survey of the state-of-the-art and possible

extensions. Artiﬁcial Intelligence Review, 44(4):571–

604.

Kla

snja-Mili

cevi

c, A., Vesin, B., Ivanovi

c, M., and Budi-

mac, Z. (2011). E-learning personalization based on

hybrid recommendation strategy and learning style

identiﬁcation. Computers & education, 56(3):885–

899.

Kulkarni, P. V., Rai, S., and Kale, R. (2020). Recommender

system in elearning: A survey. In Bhalla, S., Kwan,

P., Bedekar, M., Phalnikar, R., and Sirsikar, S., edi-

tors, Proceeding of International Conference on Com-

putational Science and Applications, pages 119–126,

Singapore. Springer Singapore.

Marante, Y., Silva, V., Jr., J. G., Vitor, M., Martins, A.,

and Souza, J. D. (2020). Evaluating educational rec-

ommendation systems: a systematic mapping. In

Anais do XXXI Simp

osio Brasileiro de Inform

atica na

Educac¸

ao, pages 912–921, Porto Alegre, RS, Brasil.

SBC.

McLeod, S. (2010). Zone of proximal development. Simply

psychology.

Mikolov, T., Chen, K., Corrado, G., and Dean, J. (2013).

Efﬁcient estimation of word representations in vector

space.

Montesuma, E. F., Carneiro, L. C., Damasceno, A. R.,

de Sampaio, J. V. F., F

errer Filho, R. F., Maia, P.

H. M., and Oliveira, F. C. (2021). An empirical study

of information retrieval and machine reading compre-

hension algorithms for an online education platform.

In Anais do XIII Simp

osio Brasileiro de Tecnologia da

Informac¸

ao e da Linguagem Humana, pages 217–226.

SBC.

Nafea, S. M., Siewe, F., and He, Y. (2019). On recom-

mendation of learning objects using felder-silverman

learning style model. IEEE Access, 7:163034–

163048.

Nurjanah, D. (2016). Good and similar learners’ recom-

mendation in adaptive learning systems. In CSEDU

(1), pages 434–440.

Pazzani, M. J. and Billsus, D. (2007). Content-based rec-

ommendation systems. In The adaptive web, pages

325–341. Springer.

Rivera, A. C., Tapia-Leon, M., and Lujan-Mora, S. (2018).

Recommendation systems in education: A systematic

mapping study. In Rocha,

A. and Guarda, T., editors,

Proceedings of the International Conference on In-

formation Technology & Systems (ICITS 2018), pages

937–947, Cham. Springer International Publishing.

Shani, G. and Gunawardana, A. (2011). Evaluating recom-

mendation systems. In Recommender systems hand-

book, pages 257–297. Springer.

Sicilia, M.-

A., Garc

ıa-Barriocanal, E., S

anchez-Alonso, S.,

and Cechinel, C. (2010). Exploring user-based rec-

ommender results in large learning object reposito-

ries: the case of merlot. Procedia Computer Science,

1(2):2859–2864.

Truong, H. M. (2016). Integrating learning styles and adap-

tive e-learning system: Current developments, prob-

lems and opportunities. Computers in human behav-

ior, 55:1185–1193.

Urdaneta-Ponte, M. C., Mendez-Zorrilla, A., and

Oleagordia-Ruiz, I. (2021). Recommendation

systems for education: Systematic review. Electron-

ics, 10(14):1611.

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones,

L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I.

(2017). Attention is all you need. In Advances in

neural information processing systems, pages 5998–

6008.

Yan, L., Yin, C., Chen, H., Rong, W., Xiong, Z., and David,

B. (2021). Learning resource recommendation in e-

learning systems based on online learning style. In

International Conference on Knowledge Science, En-

gineering and Management, pages 373–385. Springer.

Yan, R. and Zhang, J. (2009). Transfer learning using task-

level features with application to information retrieval.

In Twenty-First International Joint Conference on Ar-

tiﬁcial Intelligence.

Zaliane, O. (2006). Recommender systems for e-learning:

towards non-intrusive web mining. Data Mining in

E-Learning(Advances in Management Information),

4:79–93.

A Recommendation System Framework for Educational Content Reinforcement in Virtual Learning Environments

235