Module 3 Assignment
College of Professional Studies, Northeastern University
ALY6015, 21626
Harpreet Sharma
January 29th, 2024

Table of Contents

Introduction
Analysis
  Figure 1: Scatterplot for Enrollment vs. Applicants
  Figure 2: Boxplot comparing Private and Public Schools
  Figure 3: Descriptive statistics for Enrollment numbers
  Figure 4: Descriptive Statistics for Out-of-state Tuition
  Figure 5: Summary of Logistic Regression Model
  Figure 6: Exponentiated Coefficients of the Logistic Regression Model
  Figure 7: Confusion Matrix for Train Set
  Figure 8: Model Metrics
  Figure 9: Confusion Matrix for Test Set
  Figure 10: ROC Curve
  Figure 11: The Area Under the Curve
Conclusion/Interpretation
References
Appendices
  Appendix A: Scatterplot of Enrollments vs. Applications
  Appendix B: Boxplot of Private and Public Schools
  Appendix C: Descriptive Statistics of the number of Enrollments
  Appendix D: Descriptive Statistics for Out-of-state Tuition
  Appendix E: Train and Test Set
  Appendix F: Logistic Regression Model
  Appendix G: Confusion Matrix of Train Set
  Appendix H: Confusion Matrix for Test Set
  Appendix I: ROC Curve

Introduction

This report analyzes and predicts university classification (private or public) using logistic regression on the College dataset from the ISLR package, which comprises 777 records and 18 variables. The process involves exploratory data analysis, a train-test split, and subsequent model training. Key evaluations include confusion matrices, along with accuracy, precision, recall, specificity, and an analysis of testing-set performance. The ROC curve assesses the model's discrimination ability, and the AUC is calculated to summarize overall model performance.

Analysis

1. Exploratory Data Analysis

A scatterplot was created to explore the relationship between the number of applicants and enrollments (see Appendix A).

Figure 1
Scatterplot for Enrollment vs. Applicants

Figure 1 indicates a positive correlation between the two variables: as the number of applications increases, the number of enrollments tends to increase as well.

A boxplot was created to compare enrollments in private and non-private (public) schools (see Appendix B).

Figure 2
Boxplot comparing Private and Public Schools

Figure 2 indicates a substantial difference in enrollment between the two groups, as the median of public school (yes) lies outside the box of the private school boxplot.

Descriptive statistics were calculated for enrollment numbers (see Appendix C).

Figure 3
Descriptive statistics for Enrollment numbers

Descriptive statistics were calculated for out-of-state tuition (see Appendix D).
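
The report's own calculations are in Appendices C and D and are not reproduced here. As a separate illustration of what such descriptive statistics typically contain, the following is a minimal sketch in Python using only the standard library. The function name `describe` and the input values are hypothetical; the numbers are made up for illustration and are not taken from the College dataset.

```python
import statistics


def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    # "inclusive" treats the data as containing its own min and max and
    # interpolates linearly between neighboring data points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "mean": statistics.mean(values),
        "q3": q3,
        "max": max(values),
        "sd": statistics.stdev(values),  # sample standard deviation
    }


# Hypothetical enrollment counts, made up for illustration only;
# these are NOT values from the College dataset.
example_enrollments = [120, 250, 310, 480, 950, 1200, 2300]

summary = describe(example_enrollments)
for name, value in summary.items():
    print(f"{name:>6}: {value:.1f}")
```

In this made-up sample the mean sits well above the median, which is the kind of right skew a summary like this helps reveal before modeling.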