Meeting_3_31

Posted: 2021-03-31 20:32:32

Hello, Mr. Houben, this is Zhou Yanzhuo. I returned to China after being in quarantine for a month before I was able to contact people. I have now finished dealing with my family problems at home. I also had a health problem while I was in the Netherlands, and I am slowly recovering. I don't know when I can return to the Netherlands; I may have to wait until there are fewer cases there. Recently there have been too many new cases, which is quite frightening. There are no more cases in mainland China.

1. Why did you choose this topic?

First of all, MOOCs have played a very important role in my daily study. When I was at university, there were no computer-related courses because I was at a language university. When I planned to study computer programming and related subjects by myself, I relied on a lot of MOOC materials, such as courses on data structures. A MOOC can visualize an entire data structure and algorithm in just a few minutes, and each step of the process is shown clearly, which is much more convenient than the textbooks I used.
Secondly, I often get sleepy in traditional classrooms, and sometimes I can't keep up with the teacher's pace. In contrast, with MOOCs I can study repeatedly, anytime and anywhere, and control my own learning progress. The in-class exercises after each lesson also help me consolidate what I have learned.

2. What's the purpose of this paper?

I collected more than a dozen related articles and selected the three I found most interesting in order to review how they process web data. The three articles use different methods and approach the data from different perspectives: one uses machine learning to measure the influence of specific parameters and to inform learning strategies, one applies TF-IDF to user profiles, and one presents a system design for log analysis. I hope that these articles are relevant to the current research on MOOCs in Delft.

3. Describe three papers

3.1 The first paper I reviewed is a case study of Tsinghua XuetangX, one of the largest MOOC platforms in China, of which I am also a seasoned user. Using a dataset from XuetangX, the authors analyze the key factors that influence students' engagement in MOOCs and study to what extent a student's learning effectiveness can be inferred. One interesting result is that students who exert more effort and ask more questions are not necessarily more likely to get certificates. The authors also develop a unified model to predict students' learning effectiveness by incorporating user demographics, forum activities, and learning behavior. They demonstrate that the proposed model significantly outperforms several alternative methods in predicting students' performance on assignments and course certificates. The model is flexible and can be applied to various settings. For example, the authors are deploying a new feature in XuetangX to help teachers dynamically optimize the teaching process.

The dataset covers 11 courses, categorized into science and non-science types. The authors first conduct a regression analysis to examine the correlation between student demographics and course selection. For demographics, they consider gender, education (graduate degree, including master's and Ph.D.; bachelor's degree; and below bachelor's degree), and age. The analysis shows, for instance, that female students and students with a bachelor's degree are more likely to choose non-science courses. To study learning activity patterns, the authors focus on forum and video activities and examine whether the likelihood of getting a certificate is correlated with user demographics, forum activities, and effective learning time. They find that both forum activities and effective learning time are significant predictors of the certificate rate, which suggests the importance of encouraging students to participate in forum discussions.

The key design of the model is a factor graph. Each student i is associated with a feature vector X_t(i) and a set of activities Y_t(i) at time t. Latent states Z_t(i) are used to model students' activities and features. Modeling students' activities with latent states is similar to the assumptions made in Markov decision processes (MDPs) and in deep learning. In the LadFG model, the function f(.) captures correlations between different (observed or latent) variables. S(t) denotes a time-dependent correlation between a sequence of p past latent states Z and the attributes X_t(i), so the latent states can be obtained from the previous states and the attributes. The learning goal is to minimize the objective function, which can be regarded as the difference between the previous state and the subsequent state. The authors use an EM-like algorithm to learn the parameters.

E-step: fix all model parameters obj and update Z using a gradient descent method.
M-step: fix all latent states Z and update each model parameter in obj.
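
The alternation between the two steps can be illustrated on a toy problem. The sketch below is not the LadFG model: the quadratic objective, the scalar parameters `w` and `b`, the weight `lam`, and the data are all assumptions chosen only to show the pattern of fixing the parameters while updating the latent states, then fixing the latent states while updating the parameters.

```python
def objective(x, y, z, w, b, lam=1.0):
    """Toy loss (assumed): predict y from latent z, and keep z close to b * x."""
    return sum((yi - w * zi) ** 2 + lam * (zi - b * xi) ** 2
               for xi, yi, zi in zip(x, y, z))


def e_step(x, y, z, w, b, lam=1.0, steps=20):
    """Fix the parameters (w, b) and update the latent states z by gradient descent."""
    lr = 0.25 / (w * w + lam)  # step size that is safe for this quadratic loss
    z = list(z)
    for _ in range(steps):
        for i, (xi, yi) in enumerate(zip(x, y)):
            grad = -2 * w * (yi - w * z[i]) + 2 * lam * (z[i] - b * xi)
            z[i] -= lr * grad
    return z


def m_step(x, y, z, w, b):
    """Fix the latent states z and update each parameter (least squares here)."""
    zz = sum(zi * zi for zi in z)
    xx = sum(xi * xi for xi in x)
    if zz > 0:
        w = sum(yi * zi for yi, zi in zip(y, z)) / zz
    if xx > 0:
        b = sum(zi * xi for zi, xi in zip(z, x)) / xx
    return w, b


def fit(x, y, rounds=15):
    """Alternate E-step and M-step, recording the loss after each round."""
    z, w, b = [0.0] * len(x), 1.0, 1.0
    history = [objective(x, y, z, w, b)]
    for _ in range(rounds):
        z = e_step(x, y, z, w, b)
        w, b = m_step(x, y, z, w, b)
        history.append(objective(x, y, z, w, b))
    return z, w, b, history


# Made-up observations, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]
z, w, b, history = fit(x, y)
```

Because each step only lowers the same objective, the recorded loss in `history` should never increase from one round to the next, which is a quick sanity check for this kind of alternating scheme.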

Moreover, grade prediction and certificate-earner prediction are used to evaluate the effectiveness of the proposed model.

3.2 The second paper offers another view of processing web data, starting from the observation that over half of MOOC learners (62.4%) reported being employed full-time or self-employed. The authors investigate whether the information in different fields of professionals' LinkedIn profiles (e.g., job titles) can produce useful user profiles for personalized MOOC recommendations. They retrieve the data with the keyword "coursera". One advantage of collecting data from LinkedIn is that most of the degree information is correct. Users are represented by a vector of weighted keywords from a specific field in their profiles. The fields in a LinkedIn profile about a user u can be summarized as: (1) job titles: Software Engineer, Java Engineer; (2) education fields: Information Engineering; and (3) skills: Java, C++, Microsoft Excel. The well-known TF (Term Frequency)-IDF (Inverse Document Frequency) is used as the weighting scheme.
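
As a rough illustration of weighting profile keywords with TF-IDF, here is a minimal sketch. It uses one common textbook variant (relative term frequency times the natural log of inverse user frequency), which may differ from the exact formula in the paper; the user IDs and most of the skill lists are made up.

```python
import math
from collections import Counter


def tfidf_profiles(user_terms):
    """Build one weighted keyword vector per user from a single profile field."""
    n_users = len(user_terms)
    # Document frequency: in how many users' fields does each term appear?
    df = Counter()
    for terms in user_terms.values():
        df.update(set(terms))
    profiles = {}
    for user, terms in user_terms.items():
        tf = Counter(terms)
        profiles[user] = {
            term: (count / len(terms)) * math.log(n_users / df[term])
            for term, count in tf.items()
        }
    return profiles


# Hypothetical "skills" field for three users.
skills = {
    "u1": ["Java", "C++", "Microsoft Excel"],
    "u2": ["Java", "Python"],
    "u3": ["Java"],
}
profiles = tfidf_profiles(skills)
```

In this example "Java" appears in every profile, so its weight is zero for all users, while "C++" gets a positive weight for `u1`. That is the intended effect of the IDF factor: keywords shared by everyone say little about an individual learner.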

The authors' main goal is to analyze and compare the applicability of different user modeling strategies in the context of MOOC recommendations. They do not aim to optimize recommendation quality, but are interested in comparing the quality achieved by the same recommendation algorithm when it is given different types of user profiles as input. They compare the quality of the different user modeling strategies with that of the top-popular recommendation strategy as a baseline, which is common practice for cold-start situations. Top-popular recommendation (pop) is a non-personalized model that recommends the top-N items with the highest popularity among learners. The performance of the recommender system is evaluated with the standard metrics Mean Reciprocal Rank (MRR) and Success at rank N (S@N). MRR indicates at which rank the first item relevant to the user occurs on average. S@N is the mean probability that a relevant item occurs within the top-N recommendations.
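
The baseline and the two metrics can be sketched in a few lines. This is an illustrative implementation of the standard definitions, not the authors' evaluation code; the course names, user IDs, and enrollment data are invented.

```python
from collections import Counter


def top_popular(enrollments, n):
    """Non-personalized baseline: the n most-enrolled courses (ties broken by name)."""
    counts = Counter(c for courses in enrollments.values() for c in courses)
    return sorted(counts, key=lambda c: (-counts[c], c))[:n]


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0 if none is recommended."""
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(recs, truth):
    return sum(reciprocal_rank(recs[u], truth[u]) for u in truth) / len(truth)


def success_at_n(recs, truth, n):
    """Fraction of users with at least one relevant item in their top-n list."""
    hits = sum(1 for u in truth if any(i in truth[u] for i in recs[u][:n]))
    return hits / len(truth)


# Hypothetical training enrollments and held-out relevant courses.
train = {"a": ["ml", "python"], "b": ["ml", "stats"], "c": ["ml", "python"]}
truth = {"u1": {"python"}, "u2": {"stats"}, "u3": {"ml"}}

ranking = top_popular(train, 3)
recs = {user: ranking for user in truth}  # every user gets the same list
mrr = mean_reciprocal_rank(recs, truth)
s_at_1 = success_at_n(recs, truth, 1)
s_at_2 = success_at_n(recs, truth, 2)
```

Since the baseline gives every user the same ranking, any personalized strategy has to place relevant courses higher than plain popularity does in order to beat these scores.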

The authors show that a "richer" learner tends to take a greater number of MOOCs. In terms of user modeling strategies, their experiment shows that the skill-based strategy performs better than the job-based and education-based ones in the context of MOOC recommendations.

3.3 The third paper presents the MOOClm platform, which transforms data from MOOCs into independent learner models that can drive personalisation and support reuse of the learner model, for example in an Open Learner Model (OLM). Because the learner model is independent of the MOOC, it can be reused in other learning systems.

In the paper's architecture figure, the left side shows the MOOC platform, Open edX. It is widely used and open source, which is why the authors chose it for their SPOC on C and Unix. The right side of the figure shows the MOOClm server, which augments the MOOC.

The top left box represents the MOOC interface that users see. Below it are the basic analytics tools. At the upper right is the MongoDB "Courseware" database of all learning resources and references, including the text of exercises and YouTube references for videos. At the lower right are the raw logs of date-stamped events, stored in JSON format. The analytics database holds the most recent response for each problem; the JSON logs hold the full history.

The logs are comprehensive but need care to transform into a learner model. For example, when Alice views a video, this is logged as a load_video event when the page is opened, then several play_video events at two-minute intervals until the video finishes. When Alice does an exercise, edX logs a problem_check event from browser to server, a problem_check event internal to the server, and then a problem_graded request from server to browser, which gives the result. Only the problem_graded event indicates whether the submission was correct.
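
The sketch below shows one way such redundant events could be collapsed into evidence. It is a simplified illustration, not MOOClm's code: the records are hand-written, and the `item_id` and `correct` fields are hypothetical stand-ins for the richer payloads in real edX logs.

```python
import json

# Hand-written, simplified log lines (one JSON object per line).
RAW_LOG = """\
{"username": "alice", "event_type": "load_video", "time": "2021-03-01T10:00:00", "item_id": "vid-01"}
{"username": "alice", "event_type": "play_video", "time": "2021-03-01T10:00:05", "item_id": "vid-01"}
{"username": "alice", "event_type": "play_video", "time": "2021-03-01T10:02:05", "item_id": "vid-01"}
{"username": "alice", "event_type": "problem_check", "time": "2021-03-01T10:05:00", "item_id": "ex-01"}
{"username": "alice", "event_type": "problem_check", "time": "2021-03-01T10:05:01", "item_id": "ex-01"}
{"username": "alice", "event_type": "problem_graded", "time": "2021-03-01T10:05:02", "item_id": "ex-01", "correct": true}
"""


def extract_evidence(log_text):
    """Reduce raw events to one piece of evidence per meaningful action."""
    evidence = []
    watched = set()
    for line in log_text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        key = (event["username"], event["item_id"])
        if event["event_type"] == "play_video":
            # Repeated play_video events for the same video count once.
            if key not in watched:
                watched.add(key)
                evidence.append({"user": key[0], "item": key[1], "kind": "watched"})
        elif event["event_type"] == "problem_graded":
            # Only the graded event says whether the answer was correct.
            kind = "correct" if event["correct"] else "incorrect"
            evidence.append({"user": key[0], "item": key[1], "kind": kind})
        # load_video and problem_check carry no outcome, so they are skipped.
    return evidence


evidence = extract_evidence(RAW_LOG)
```

Six raw events become two pieces of evidence here: Alice watched one video and answered one exercise correctly.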

A few event types illustrate the challenge in designing MOOClm: play_video, load_video, pause_video.

In the learner model, the log data is treated as evidence. Transforming the logs into evidence required several design decisions, because the process has to unify the learner model's ontology of learning objectives with the raw MOOC data. The log entries use edX's internal courseware IDs for learning materials such as videos and exercises. When the JSON log files are processed, each ID is looked up in the courseware database and mapped to a human-readable form.
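
The ID lookup can be pictured as a simple join. In the sketch below a plain dictionary stands in for the courseware database, and the IDs, titles, and learning objectives are hypothetical.

```python
# A dictionary standing in for the courseware database (hypothetical content).
COURSEWARE = {
    "vid-01": {"title": "Pointers in C (video)", "objective": "pointers"},
    "ex-01": {"title": "Pointer arithmetic quiz", "objective": "pointers"},
}


def attach_names(evidence, courseware):
    """Add a readable title and learning objective to each piece of evidence."""
    resolved, unknown = [], []
    for ev in evidence:
        entry = courseware.get(ev["item"])
        if entry is None:
            unknown.append(ev)  # keep unmatched IDs for inspection
            continue
        resolved.append({**ev, "title": entry["title"],
                         "objective": entry["objective"]})
    return resolved, unknown


sample = [
    {"user": "alice", "item": "vid-01", "kind": "watched"},
    {"user": "alice", "item": "ex-99", "kind": "correct"},
]
resolved, unknown = attach_names(sample, COURSEWARE)
```

Keeping the unmatched entries in a separate list, rather than silently dropping them, makes it easier to notice courseware that is missing from the lookup table.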

MOOClm provides several Resolvers. Each Resolver examines the evidence for and against each learning outcome and draws a conclusion about the "knowledge level" of a learning objective.
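
To make the idea concrete, here is a toy resolver. It is an assumption for illustration only and is not one of MOOClm's Resolvers: it counts supporting and opposing evidence per learning objective and applies a made-up threshold; the labels "known" and "not yet known" are also invented.

```python
def ratio_resolver(evidence, threshold=0.6):
    """Label an objective 'known' when enough of its evidence is positive."""
    tallies = {}
    for ev in evidence:
        pos, neg = tallies.get(ev["objective"], (0, 0))
        if ev["kind"] in ("correct", "watched"):
            pos += 1  # evidence for the objective
        elif ev["kind"] == "incorrect":
            neg += 1  # evidence against the objective
        tallies[ev["objective"]] = (pos, neg)
    levels = {}
    for objective, (pos, neg) in tallies.items():
        total = pos + neg
        if total == 0:
            levels[objective] = "no evidence"
        elif pos / total >= threshold:
            levels[objective] = "known"
        else:
            levels[objective] = "not yet known"
    return levels


# Hypothetical evidence for two learning objectives.
sample_evidence = [
    {"objective": "pointers", "kind": "correct"},
    {"objective": "pointers", "kind": "incorrect"},
    {"objective": "pointers", "kind": "correct"},
    {"objective": "shell", "kind": "incorrect"},
    {"objective": "shell", "kind": "incorrect"},
]
levels = ratio_resolver(sample_evidence)
```

A different resolver could weigh recent evidence more heavily or treat watching a video as weaker evidence than a correct answer; having several interchangeable resolvers lets the same evidence support different interpretations.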

4. What conclusions can be drawn from the paper review?

5. What did you learn from this course?

5.1 I reviewed some basic knowledge about the Web, which is the basis of my future job.

There are design challenges (model-based):
• Data (content) structures (what data do we have?)
• Navigation (how do we organize the data for access?)
• Presentation (layout) (how do we present the data on access?)
• Database/repository management (how do we do the back-end?)

Web 2.0: Rich interfaces / user participation

Web 3.0: Semantic Web: research aims at further developments in the Web and Web applications, mostly along languages and technology.

Web science: the Web should be understood and engineered for future growth and capabilities.

5.2 User modeling

User modeling is a way of representing user information to support a given application or context.

A personalized application has to obtain and understand information and context about the user.

People leave traces everywhere on the Web: in logs, cookies, sessions, bookmarks, pictures and videos. It is not only about a user's own behavior, but also about the interactions of other users.

User profile: a characterization of a user at a particular moment
User model: definitions and rules for the interpretation of observations about the user

Modeling by observing the user: use logs, clustering

TF-IDF: weight the concepts

The more fine-grained the concepts the better the recommendation performance: entity-based

5.3 Web data

Social web data provides an unprecedented source of knowledge. It is the source of data for user modeling (UM).

issues: privacy concerns / privacy settings

5.4 Human computation

slow / inconsistent

labeling images / recognizing and clustering objects -> large number of people using the web can perform tasks

ImageNet: AI systems need to be trained and evaluated on big data

5.5 Semantic Web

-databases

-web

-logics

Page-centric design

Web data should easily be available for machines for further processing
Data should be able to be combined, or merged easily on a Web-scale
We need to be able to do complex queries or even reasoning on data
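
These three requirements can be illustrated with subject-predicate-object triples, the basic data shape behind RDF. The sketch below uses plain Python sets rather than a real RDF library, and all identifiers in it are hypothetical.

```python
def match(triples, s=None, p=None, o=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return {
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    }


# Two small datasets that could come from different sites (made up).
platform_data = {
    ("alice", "enrolledIn", "course:unix"),
    ("course:unix", "teaches", "skill:shell"),
}
profile_data = {
    ("alice", "hasSkill", "skill:java"),
}

# Merging is just a set union because both sources share the triple shape.
merged = platform_data | profile_data

# A two-step query: which skills are taught by the courses Alice takes?
courses = {o for (_, _, o) in match(merged, s="alice", p="enrolledIn")}
taught = {o for c in courses for (_, _, o) in match(merged, s=c, p="teaches")}
```

Here `taught` ends up containing only `skill:shell`. Real Semantic Web systems use standardized formats and query languages for this, but the principle is the same: uniform, machine-readable statements that can be combined and queried.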

The goal of the semantic web is not to understand natural language, but to provide information in a computer-readable form

Source: https://www.cnblogs.com/zyzindividual/p/14603586.html
