Research on College Student Behavior Analysis Based on Big Data
Published: 2021-03-07 05:12
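With the continuous development of information technology and the steady improvement of infrastructure, big data technology has been widely applied across industries such as healthcare, education, catering, logistics, automotive, finance, and entertainment, bringing many conveniences to people's lives. In universities, the increasing digitization of management has produced large volumes of data. Among these, the data accumulated from students' daily life and study behavior has drawn close attention from university administrators and has become a subject of study for many researchers. Processing and analyzing this data reveals the characteristics and patterns of student behavior and gives student administrators a reference for managing students better. This thesis uses student data from Lanzhou University of Technology as its data source, comprising book borrowing records, campus card consumption, grades over two academic years, and records of students' majors. The RapidMiner framework is used to preprocess the datasets and integrate the different data sources into a single dataset for analysis. The main work is as follows: (1) The FP-growth algorithm is used to mine the associations among students' academic performance, number of books borrowed, major, and campus card consumption, in order to predict student behavior. Statistical analysis was also carried out with the Python Pandas package to check that the data is balanced and to detect and handle any outliers. (2) The K-means algorithm is used to cluster the student data, and the clustering results are used to explore the relationships among different students' academic performance, book borrowing data, and campus card consumption data, as well as different...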
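As a rough illustration of the clustering step in item (2), the sketch below runs a plain K-means (Lloyd's algorithm) over a few invented student records. It is not the thesis's RapidMiner workflow: the `students` values, the three features (average grade, books borrowed, campus card spending), the min-max scaling, and the choice of the first `k` rows as starting centroids are all assumptions made for this example.

```python
import math

def min_max_scale(rows):
    """Rescale every column to [0, 1] so no single feature dominates the distance."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [
        tuple((v - lo) / (hi - lo) if hi > lo else 0.0
              for v, lo, hi in zip(row, lows, highs))
        for row in rows
    ]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm; centroids start at the first k points (deterministic)."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, new_labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # empty cluster keeps its centroid
        # Stop once neither the assignment nor the centroids change.
        if new_labels == labels and new_centroids == centroids:
            break
        labels, centroids = new_labels, new_centroids
    return labels, centroids

# Hypothetical records: (average grade, books borrowed, campus card spending).
students = [
    (88, 30, 600),
    (92, 35, 650),
    (60, 2, 900),
    (58, 4, 950),
    (85, 28, 620),
    (62, 3, 880),
]

labels, centroids = kmeans(min_max_scale(students), k=2)
for record, label in zip(students, labels):
    print(record, "-> cluster", label)
```

On this toy data the three high-grade, high-borrowing records end up in one cluster and the other three in the second. With real data, the number of clusters and the initial centroids would need to be chosen with more care, for example by comparing several runs.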
[Source]: Lanzhou University of Technology, Gansu Province
[Pages]: 76
[Degree level]: Master's
[Table of contents]:
Abstract
Abstract (Chinese)
Chapter 1 Introduction
1.1 Research Background and Significance
1.1.1 Research Background
1.1.2 Research Significance
1.2 Research status
1.2.1 Research Status of Big Data
1.2.2 Research Status of Association Rules
1.2.3 Research Status of Cluster Analysis
1.3 Research content
1.4 Thesis Structure
Chapter 2 Related Theory and Technology Review
2.1 Data Mining
2.2 Data Preprocessing
2.3 Association Rules Mining
2.4 Cluster Analysis
2.5 Classification Techniques
2.5.1 Neural Networks Model
2.5.2 Naïve Bayes Model
2.5.3 Support Vector Machine Model
2.5.4 Random Forest Model
2.5.5 K-Fold Cross Validation Technique
2.6 Data Mining tool
2.7 Chapter Summary
Chapter 3 Preprocessing of Student Behavior Data
3.1 Introduction
3.2 Data introduction
3.3 Data Cleaning
3.3.1 Library Book borrowing data cleaning
3.3.2 Card Consumption Data Cleaning
3.3.3 Grade Data Cleaning
3.3.4 University Departments Data Cleaning
3.4 Data Integration
3.5 Chapter Summary
Chapter 4 Study on the relevance of student behavior
4.1 Introduction
4.2 Association rules of student behavior
4.2.1 FP-Growth Algorithm
4.2.2 Discretization of students’ behavior data
4.2.3 Relevance analysis of behavior data
4.3 Cluster Analysis of student behavior
4.3.1 K-means algorithm
4.3.2 Cluster analysis based on K-means algorithm
4.4 Chapter summary
Chapter 5 Predicting Student Performance
5.1 Introduction
5.2 Classification and prediction models
5.2.1 Neural Networks Model
5.2.2 Naïve Bayes Model
5.2.3 Support Vector Machine Model
5.2.4 Random Forest Model
5.2.5 K-Fold Cross Validation
5.3 Prediction Models Evaluation Matrices of students’ data
5.4 Predictive analysis of student data
5.4.1 Proposed model for predicting student performance
5.4.2 Proposed model Evaluation
5.4.3 Student performance predictive models’ comparison
5.4.4 Student performance predictive models’ comparison for students’ majors
5.5 Chapter summary
Conclusion and Future Work
References
Acknowledgement
Academic papers and awards
Article ID: 3068459
Article link: https://www.wllwen.com/kejilunwen/shengwushengchang/3068459.html