Our second case study is inspired by Maier, Baker, & Stalzer (2021). This research builds a model of exercise difficulties on pass-fail data. The primary aim of this case study is to gain some hands-on experience with essential R packages and functions for Logistic Knowledge Tracing, and its simplest variant, Performance Factors Analysis (PFA).
This paper discusses some of the challenges in PFA's real-world application, along with potential solutions: an insufficient number of practices, degenerate parameters, rare benchmarks, and compensatory vs. conjunctive skill relationships.
Motivation
Although PFA has been widely used in research and has been very successful at predicting students' future performance, there has been little study of which factors affect PFA's performance in real-world settings.
Research Question
The central goal of this paper is to study factors that have emerged as issues for other algorithms, in order to better understand the use of PFA in real-world learning.
Dataset and Model
The data used in this paper are from Reveal Math Course 1, a digital core math product for 6th grade. The dataset includes 3073 students in 46 schools. The authors randomly selected 20% of the students as the testing set and the remaining 80% as the training set. The model's performance was evaluated on the testing set.
The authors trained a baseline model using the original formula. Overall, AUC is in the 0.78-0.80 range and RMSE values are in the 0.42-0.44 range.
Challenges
There are four challenges discussed in this paper.
Insufficient Number of Practices: The authors found that when students in the data do not encounter a skill enough times, the model performs more poorly and extreme parameters appear. They therefore ran several experiments that filtered the real data by the number of practices per skill. They found a major improvement from 2 to 3 practices, with gains continuing up to 6. They also found that the more practices there are in the training data, the better the model performs on the testing set. The improvement was most substantial from the 2nd to the 5th practice, but performance still increased up to practice 12.
Degenerate Parameters: First, the authors identified 3 types of degeneracy:
Type 1: γ < 0 (Getting an item right leads to worse future performance overall)
Type 2: γ < ρ (Getting an item right leads to worse future performance than getting it wrong)
Type 3: γ = ρ = 0 (No improvement)
They found 6 skills that had type 1 degeneracy, and 7 skills had type 3 degeneracy. There was type 2 degeneracy if the model was trained with the data from 4 or fewer practices. The authors suggested fixing these problems by constraining the parameters during fitting (as done for BKT in Corbett and Anderson, 1995).
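The three conditions above can be checked mechanically once per-skill parameters are available. The following is a minimal sketch, assuming γ (the weight on prior successes) and ρ (the weight on prior failures) have already been extracted into numeric vectors. The function name `flag_degeneracy`, the tolerance `tol`, and the parameter values are hypothetical and made up for illustration; they are not taken from the paper or from LKT.

```r
# Flag the three degeneracy types for each skill.
# gamma: weight on prior successes; rho: weight on prior failures.
# tol is an assumed tolerance for treating a fitted value as zero.
flag_degeneracy <- function(gamma, rho, tol = 1e-8) {
  stopifnot(length(gamma) == length(rho))
  data.frame(
    gamma = gamma,
    rho   = rho,
    type1 = gamma < 0,                          # success predicts worse performance
    type2 = gamma < rho,                        # success helps less than failure
    type3 = abs(gamma) < tol & abs(rho) < tol   # no learning at all
  )
}

# Made-up parameters for four hypothetical skills
skills <- data.frame(
  skill = c("A", "B", "C", "D"),
  gamma = c(0.30, -0.10, 0.05, 0),
  rho   = c(0.10, -0.20, 0.20, 0)
)
flags <- cbind(skill = skills$skill, flag_degeneracy(skills$gamma, skills$rho))
print(flags)
```

The flags are reported as separate columns because the types can overlap: a skill with γ < 0 may also satisfy γ < ρ.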
Rare vs Common Skills: A model needs to be able to handle both common and rare skills. Thus, the authors proposed adjusting the PFA formula to differentiate common and rare skills. They proposed either fitting a common set of parameters across all rare skills, or using the average parameters of the common skills for the rare skills.
Compensatory vs Conjunctive Skills in PFA: If skills are treated as conjunctive, then a student needs to know every skill tagged to an item to solve it correctly. If skills are compensatory, then a student needs to know at least one of the skills tagged to an item to solve it correctly. The original version of PFA is compensatory; the authors compared that to a conjunctive version and to a mathematical model in between compensatory and conjunctive, which assumes performance is an average of the skills involved in the item. They found that both new versions performed worse than the original compensatory model. This contradicts a past result for BKT.
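To make the three combination rules concrete, here is a small sketch for a single item tagged with two skills. The compensatory rule follows the standard PFA form (sum the per-skill logits, then apply the logistic function). The conjunctive and averaging rules shown here (product and mean of per-skill probabilities) are assumptions for illustration, since this summary does not give the exact formulas used in the paper. All parameter values are made up.

```r
# Logistic link
inv_logit <- function(x) 1 / (1 + exp(-x))

# Per-skill logit: beta + gamma * prior successes + rho * prior failures
skill_logit <- function(beta, gamma, rho, successes, failures) {
  beta + gamma * successes + rho * failures
}

# Compensatory (original PFA): skill logits add up, so one strong skill can offset a weak one
p_compensatory <- function(logits) inv_logit(sum(logits))
# Conjunctive (assumed form): every skill must succeed, so probabilities multiply
p_conjunctive <- function(logits) prod(inv_logit(logits))
# In-between (assumed form): average of the per-skill probabilities
p_average <- function(logits) mean(inv_logit(logits))

# Made-up parameters and practice counts for an item tagged with two skills
logits <- skill_logit(
  beta = c(-0.5, 0.2), gamma = c(0.3, 0.2), rho = c(0.1, 0.05),
  successes = c(3, 1), failures = c(1, 2)
)
probs <- c(
  compensatory = p_compensatory(logits),
  conjunctive  = p_conjunctive(logits),
  average      = p_average(logits)
)
print(round(probs, 3))
```

Under these assumed forms the conjunctive prediction can never exceed the averaged one, because a product of probabilities is at most the smallest of them.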
Summary
Overall, the authors believe that the degenerate parameters are an issue for PFA and there are still some other challenges that might appear in real-world settings. However, the authors argue that these challenges can be addressed with only minor adjustments. Thus, PFA is a very reasonable choice in real-world settings.
1b. Load Packages
In this case study, you will use the data from this project by Dr. Philip I. Pavlik Jr. to learn how to fit a PFA model. First, you will learn about the essential packages and resources you will be using in this case study.
CRAN: The Comprehensive R Archive Network
CRAN is a network of FTP and web servers around the world that store identical, up-to-date versions of code and documentation for R and R packages.
You may use the CRAN mirror nearest to you to minimize network load.
LKT: Logistic Knowledge Tracing
This package computes Logistic Knowledge Tracing ("LKT"), which is a general framework for tracking human learning in an educational software system. Please see Pavlik, Eglington, and Harrell-Williams (2021). The LKT framework allows a researcher or practitioner to select and then compute features of student data that are used as predictors of subsequent performance. LKT allows flexibility in the choice of predictive components and features. The system is built on top of "LiblineaR", which computes logistic models quickly.
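The chunk that loads the package is not shown in this text. A minimal sketch, assuming LKT has already been installed from CRAN (for example with `install.packages("LKT")`), might look like the following. The variable name `lkt_available` is hypothetical, and the guard is only there so that the snippet does not fail on a machine without the package.

```r
# Assumption: LKT was installed beforehand, e.g. install.packages("LKT")
lkt_available <- requireNamespace("LKT", quietly = TRUE)

if (lkt_available) {
  library(LKT)                     # attaches LKT and its required packages
  print(packageVersion("LKT"))     # report the installed version
} else {
  message("LKT is not installed; run install.packages(\"LKT\") first.")
}
```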
The following object is masked from 'package:SparseM':
det
Loading required package: data.table
Loading required package: LiblineaR
Package 'LKT' version 1.7.0
Type 'citation("LKT")' for citing this R package in publications.
2. Wrangle
Data wrangling is the process of converting raw data into a format suitable for analysis. Typically, data is not formatted in the necessary fashion for your package when you first obtain it.
Import the dataset
You'll first import the raw dataset originally obtained from this project by Dr. Philip I. Pavlik Jr. This dataset is included in the LKT package as largerawsample.
Use the code chunk below to import the dataset:
set.seed(41)
val <- largerawsample
Convert to a data table
val <- setDT(val)
val <- setDT(val): This line uses the setDT() function from the data.table package to convert the object val into a data.table object.
Data Cleaning and Transformation
Next, you are going to prepare the data for subsequent analysis, making it more manageable, understandable, and suitable for the Performance Factors Analysis.
# Clean it up
val$KC..Default. <- val$Problem.Name
# get the times of each trial in seconds from 1970
val$CF..Time. <- as.numeric(as.POSIXct(as.character(val$Time), format = "%Y-%m-%d %H:%M:%S"))
# make sure it is ordered in the way the code expects
val <- val[order(val$Anon.Student.Id, val$CF..Time.), ]
# create a binary response column to predict and extract only data with a valid value
val$CF..ansbin. <- ifelse(tolower(val$Outcome) == "correct", 1,
                          ifelse(tolower(val$Outcome) == "incorrect", 0, -1))
val <- val[val$CF..ansbin. == 0 | val$CF..ansbin. == 1, ]
# create durations
val$Duration..sec. <- (val$CF..End.Latency. + val$CF..Review.Latency. + 500) / 1000
val$KC..Default.<-val$Problem.Name:
This line assigns the values of val$Problem.Name to val$KC..Default. (In R, the <- operator is used for assignment.) By looking at the data, we can see that the Problem.Name column describes the knowledge component (KC) more accurately.
The val$CF..Time. line converts the Time column of val into seconds since the UNIX epoch (January 1, 1970). It first converts Time to character type, then to POSIXct (a date-time class in R), and finally to numeric, which gives the number of seconds since 1970.
The order() line sorts val by Anon.Student.Id and then by CF..Time., so that each student's trials are in chronological order, as the later code expects.
The val$CF..ansbin. line creates a new column in val. It assigns 1 if the Outcome column (converted to lowercase) is "correct", 0 if it is "incorrect", and -1 otherwise.
The line after that keeps only the rows where CF..ansbin. is 0 or 1, that is, "incorrect" or "correct". Rows with a value of -1 (any other outcome) are excluded.
The last line creates Duration..sec. by summing CF..End.Latency. and CF..Review.Latency., adding 500, and dividing by 1000.
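To see these steps on something small enough to print, the sketch below applies the same time conversion, ordering, recoding, and filtering to a four-row toy data frame. The toy rows are made up, and `tz = "UTC"` is an added assumption so that the result does not depend on the local time zone (the chunk above uses the session's default time zone).

```r
# Made-up rows: two students, one row with a non-graded outcome
toy <- data.frame(
  Anon.Student.Id = c("s2", "s1", "s1", "s1"),
  Time = c("1970-01-01 00:00:30", "1970-01-01 00:00:20",
           "1970-01-01 00:00:10", "1970-01-01 00:00:40"),
  Outcome = c("CORRECT", "INCORRECT", "HINT", "CORRECT"),
  stringsAsFactors = FALSE
)

# Seconds since 1970-01-01 (UTC is fixed here so the numbers are reproducible)
toy$CF..Time. <- as.numeric(as.POSIXct(toy$Time, format = "%Y-%m-%d %H:%M:%S", tz = "UTC"))

# Order by student, then by time within student
toy <- toy[order(toy$Anon.Student.Id, toy$CF..Time.), ]

# 1 = correct, 0 = incorrect, -1 = anything else
toy$CF..ansbin. <- ifelse(tolower(toy$Outcome) == "correct", 1,
                          ifelse(tolower(toy$Outcome) == "incorrect", 0, -1))

# Keep only rows with a valid binary response
toy <- toy[toy$CF..ansbin. %in% c(0, 1), ]
print(toy)
```

The "HINT" row is recoded to -1 and dropped, leaving student s1's two graded trials in time order, followed by student s2's single trial.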
Student-level cross-validation
Creating student-level cross validation folds
# Define the number of folds
k <- 5
# Get unique student IDs
unique_students <- unique(val$Anon.Student.Id)
# Randomly shuffle the student IDs
shuffled_students <- sample(unique_students)
# Split student IDs into k folds
folds <- split(shuffled_students, cut(seq_along(shuffled_students), k, labels = FALSE))
for (i in 1:k) {
  # Add fold numbers to dataset.
  val$fold[val$Anon.Student.Id %in% folds[[i]]] <- i
}
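The point of splitting by student rather than by row is that all of a student's trials land in the same fold, so the model is always tested on students it has not seen. The sketch below repeats the fold assignment on a made-up data frame of 10 students with 3 rows each and then checks that property. The toy data and the name `folds_per_student` are illustrative assumptions.

```r
set.seed(41)

# Made-up data: 10 students with 3 rows each
toy <- data.frame(
  Anon.Student.Id = rep(paste0("s", 1:10), each = 3),
  stringsAsFactors = FALSE
)

k <- 5
unique_students <- unique(toy$Anon.Student.Id)
shuffled_students <- sample(unique_students)
folds <- split(shuffled_students, cut(seq_along(shuffled_students), k, labels = FALSE))

# Assign each row the fold of its student
toy$fold <- NA_integer_
for (i in seq_len(k)) {
  toy$fold[toy$Anon.Student.Id %in% folds[[i]]] <- i
}

# Each student should appear in exactly one fold
folds_per_student <- tapply(toy$fold, toy$Anon.Student.Id, function(f) length(unique(f)))
print(table(toy$fold))
```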
3. Explore
Take a look at the dataset
What does an LKT dataset look like? It is similar to what BKT expects, but a little different. The dataset is large and hard to print in R, so please double-click val under the "Environment" tab at the top right of RStudio to take a look at the data.
There are many variables in this sheet. However, only four of them matter. Let's filter down to them:
In this section, we will build four variants of LKT, including the classic PFA variant.
Additive Factors Model (AFM) fixed effect version
auc_values <- c()
rmse_values <- c()
ll_values <- c()
r2_values <- c()
for (i in 1:k) {
  test_fold <- i
  # Fit the model. Note that usefolds is a vector of folds to include.
  # We use everything except the held-out test fold.
  modelob <- LKT(data = val, interc = FALSE,
                 components = c("Anon.Student.Id", "KC..Default.", "KC..Default."),
                 features = c("intercept", "intercept", "lineafm"),
                 usefolds = (1:k)[-test_fold])
  # Run the model on the held-out test fold. Save the results in 'results'.
  results <- predict_lkt(modelob, data = val, return_stats = TRUE, fold = test_fold)
  # Save the metrics to the vectors.
  auc_values <- c(auc_values, results$AUC)
  rmse_values <- c(rmse_values, results$RMSE)
  ll_values <- c(ll_values, results$ll)
  r2_values <- c(r2_values, results$R2)
}
intercept Anon.Student.Id
intercept KC..Default.
lineafm KC..Default.
lineafmKC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000037 0.291886 0.561738 0.545395 0.808057 0.999700
McFadden's R2 logistic: 0.284001
LogLike logistic: -21712.32056958
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
lineafm KC..Default.
lineafmKC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000468 0.2893137 0.5571527 0.5416072 0.8014506 0.9997027
McFadden's R2 logistic: 0.281532
LogLike logistic: -21904.71290584
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
lineafm KC..Default.
lineafmKC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.001872 0.301024 0.564987 0.548957 0.804419 0.999698
McFadden's R2 logistic: 0.274324
LogLike logistic: -22008.24696243
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
lineafm KC..Default.
lineafmKC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000335 0.2957704 0.5607353 0.5450283 0.8028969 0.9994865
McFadden's R2 logistic: 0.278355
LogLike logistic: -21945.81856756
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
lineafm KC..Default.
lineafmKC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000314 0.2912254 0.5590398 0.5443789 0.8052511 0.9997399
McFadden's R2 logistic: 0.283427
LogLike logistic: -21769.5610774
Setting levels: control = 0, case = 1
Setting direction: controls < cases
print(mean(auc_values))
[1] 0.7795474
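As a rough illustration of what the AUC and RMSE values collected in the loop measure, the sketch below computes both in base R on made-up predictions. The toy vectors and the helper names `rmse` and `auc_rank` are illustrative assumptions; LKT computes these statistics internally and its implementation may differ.

```r
# Made-up outcomes (1 = correct) and predicted probabilities
y <- c(1, 0, 1, 1, 0)
p <- c(0.9, 0.3, 0.6, 0.4, 0.5)

# Root mean squared error between outcomes and predicted probabilities
rmse <- function(y, p) sqrt(mean((y - p)^2))

# AUC via the rank (Mann-Whitney) formulation: the share of
# (correct, incorrect) pairs in which the correct response got the higher prediction
auc_rank <- function(y, p) {
  r  <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

toy_rmse <- rmse(y, p)
toy_auc  <- auc_rank(y, p)
print(c(RMSE = toy_rmse, AUC = toy_auc))
```

In this toy case, 5 of the 6 (correct, incorrect) pairs are ranked in the right order, so the AUC is 5/6.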
Performance Factors Analysis (PFA) fixed effects version
auc_values <- c()
rmse_values <- c()
ll_values <- c()
r2_values <- c()
for (i in 1:k) {
  test_fold <- i
  # Fit the model. Note that usefolds is a vector of folds to include.
  # We use everything except the held-out test fold.
  modelob <- LKT(data = val, interc = FALSE,
                 components = c("Anon.Student.Id", "KC..Default.", "KC..Default.", "KC..Default."),
                 features = c("intercept", "intercept", "linesuc$", "linefail$"),
                 usefolds = (1:k)[-test_fold])
  # Run the model on the held-out test fold. Save the results in 'results'.
  results <- predict_lkt(modelob, data = val, return_stats = TRUE, fold = test_fold)
  # Save the metrics to the vectors.
  auc_values <- c(auc_values, results$AUC)
  rmse_values <- c(rmse_values, results$RMSE)
  ll_values <- c(ll_values, results$ll)
  r2_values <- c(r2_values, results$R2)
}
intercept Anon.Student.Id
intercept KC..Default.
linesuc$ KC..Default.
linefail$ KC..Default.
linefailKC..Default.:e$data$KC..Default.+linesucKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000151 0.2837202 0.5501652 0.5453885 0.8166812 0.9999900
McFadden's R2 logistic: 0.300103
LogLike logistic: -21224.03949015
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
linesuc$ KC..Default.
linefail$ KC..Default.
linefailKC..Default.:e$data$KC..Default.+linesucKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000111 0.2815212 0.5429010 0.5416037 0.8116261 0.9999900
McFadden's R2 logistic: 0.298502
LogLike logistic: -21387.33526806
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
linesuc$ KC..Default.
linefail$ KC..Default.
linefailKC..Default.:e$data$KC..Default.+linesucKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002943 0.291836 0.552747 0.548948 0.815356 0.999990
McFadden's R2 logistic: 0.290945
LogLike logistic: -21504.16080931
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
linesuc$ KC..Default.
linefail$ KC..Default.
linefailKC..Default.:e$data$KC..Default.+linesucKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000145 0.2870338 0.5480655 0.5450271 0.8118929 0.9999900
McFadden's R2 logistic: 0.294142
LogLike logistic: -21465.70268813
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
linesuc$ KC..Default.
linefail$ KC..Default.
linefailKC..Default.:e$data$KC..Default.+linesucKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+0
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000136 0.2839430 0.5484811 0.5443772 0.8140910 0.9999900
McFadden's R2 logistic: 0.298739
LogLike logistic: -21304.38635147
Setting levels: control = 0, case = 1
Setting direction: controls < cases
print(mean(auc_values))
[1] 0.7934317
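PFA predicts from each student's counts of prior successes and prior failures on a skill, which is what the linesuc and linefail features in the call above represent. The sketch below shows one way to compute such counts by hand on made-up data. It is meant only to illustrate the idea, not LKT's internal code; the column names `prior_suc` and `prior_fail` are hypothetical.

```r
# Made-up trials, already ordered by time within each student
toy <- data.frame(
  Anon.Student.Id = c("s1", "s1", "s1", "s1", "s2", "s2"),
  KC..Default.    = c("kcA", "kcA", "kcB", "kcA", "kcA", "kcA"),
  CF..ansbin.     = c(1, 0, 1, 1, 0, 0)
)

# One group per student-by-skill pair
grp <- interaction(toy$Anon.Student.Id, toy$KC..Default., drop = TRUE)

# Running count of earlier successes / failures, excluding the current trial
toy$prior_suc  <- ave(toy$CF..ansbin., grp, FUN = function(x) cumsum(x) - x)
toy$prior_fail <- ave(1 - toy$CF..ansbin., grp, FUN = function(x) cumsum(x) - x)
print(toy)
```

Subtracting the current response from the cumulative sum keeps the count strictly "prior", so the outcome being predicted never leaks into its own predictor.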
Recent Performance Factors Analysis (RPFA)
auc_values <- c()
rmse_values <- c()
ll_values <- c()
r2_values <- c()
for (i in 1:k) {
  test_fold <- i
  # Fit the model. Note that usefolds is a vector of folds to include.
  # We use everything except the held-out test fold.
  modelob <- LKT(data = val, interc = TRUE,
                 components = c("Anon.Student.Id", "KC..Default.", "KC..Default.", "KC..Default."),
                 features = c("intercept", "intercept", "propdec2", "linefail"),
                 usefolds = (1:k)[-test_fold])
  # Run the model on the held-out test fold. Save the results in 'results'.
  results <- predict_lkt(modelob, data = val, return_stats = TRUE, fold = test_fold)
  # Save the metrics to the vectors.
  auc_values <- c(auc_values, results$AUC)
  rmse_values <- c(rmse_values, results$RMSE)
  ll_values <- c(ll_values, results$ll)
  r2_values <- c(r2_values, results$R2)
}
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.5
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000266 0.2702902 0.5493040 0.5453876 0.8394007 0.9980896
McFadden's R2 logistic: 0.311261
LogLike logistic: -20885.6832446
step par values =0.5
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.501
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000266 0.2703095 0.5493287 0.5453876 0.8393341 0.9980907
McFadden's R2 logistic: 0.311252
LogLike logistic: -20885.94721762
step par values =0.501
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.499
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000266 0.2702885 0.5493291 0.5453876 0.8394583 0.9980885
McFadden's R2 logistic: 0.31127
LogLike logistic: -20885.42120889
step par values =0.499
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000227 0.2709873 0.5540615 0.5453878 0.8406724 0.9967424
McFadden's R2 logistic: 0.306281
LogLike logistic: -21036.69562569
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.00101
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000227 0.2709705 0.5539906 0.5453878 0.8407845 0.9967456
McFadden's R2 logistic: 0.306309
LogLike logistic: -21035.8571737
step par values =0.00101
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000227 0.2709873 0.5540615 0.5453878 0.8406724 0.9967424
McFadden's R2 logistic: 0.306281
LogLike logistic: -21036.69562569
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.387411712186497
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000267 0.2693004 0.5499043 0.5453877 0.8420982 0.9979192
McFadden's R2 logistic: 0.311793
LogLike logistic: -20869.54157383
step par values =0.3874117
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.388411712186497
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000267 0.2692899 0.5498827 0.5453877 0.8420776 0.9979211
McFadden's R2 logistic: 0.311793
LogLike logistic: -20869.55849036
step par values =0.3884117
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.386411712186497
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000275 0.2692970 0.5498938 0.5453877 0.8421014 0.9979184
McFadden's R2 logistic: 0.311794
LogLike logistic: -20869.5292103
step par values =0.3864117
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.380775106872534
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000266 0.2693018 0.5499482 0.5453877 0.8422737 0.9979065
McFadden's R2 logistic: 0.311795
LogLike logistic: -20869.48940565
step par values =0.3807751
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.381775106872534
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000266 0.2692962 0.5499507 0.5453877 0.8422898 0.9979084
McFadden's R2 logistic: 0.311795
LogLike logistic: -20869.49056854
step par values =0.3817751
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.379775106872534
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000266 0.2693216 0.5499230 0.5453877 0.8422542 0.9979045
McFadden's R2 logistic: 0.311795
LogLike logistic: -20869.49061406
step par values =0.3797751
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.5
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000274 0.2655166 0.5428230 0.5416020 0.8345223 0.9976015
McFadden's R2 logistic: 0.311047
LogLike logistic: -21004.8662962
step par values =0.5
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.501
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000278 0.2655301 0.5427707 0.5416020 0.8345159 0.9976028
McFadden's R2 logistic: 0.311037
LogLike logistic: -21005.17055426
step par values =0.501
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.499
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000277 0.2654955 0.5429233 0.5416020 0.8345086 0.9976001
McFadden's R2 logistic: 0.311056
LogLike logistic: -21004.56569465
step par values =0.499
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000221 0.2671295 0.5459883 0.5416021 0.8384034 0.9958487
McFadden's R2 logistic: 0.306138
LogLike logistic: -21154.51907335
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.00101
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000221 0.2671286 0.5461787 0.5416021 0.8384518 0.9958535
McFadden's R2 logistic: 0.306166
LogLike logistic: -21153.65174413
step par values =0.00101
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000221 0.2671295 0.5459883 0.5416021 0.8384034 0.9958487
McFadden's R2 logistic: 0.306138
LogLike logistic: -21154.51907335
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.378679273227491
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000262 0.2646884 0.5427870 0.5416020 0.8375153 0.9973581
McFadden's R2 logistic: 0.311696
LogLike logistic: -20985.0656743
step par values =0.3786793
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.379679273227491
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000262 0.2646923 0.5427842 0.5416020 0.8375062 0.9973606
McFadden's R2 logistic: 0.311696
LogLike logistic: -20985.08208539
step par values =0.3796793
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.377679273227491
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000262 0.2646841 0.5427837 0.5416020 0.8375556 0.9973557
McFadden's R2 logistic: 0.311697
LogLike logistic: -20985.05182533
step par values =0.3776793
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.372290176635784
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000262 0.2646459 0.5427419 0.5416020 0.8377184 0.9973417
McFadden's R2 logistic: 0.311698
LogLike logistic: -20985.02083794
step par values =0.3722902
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.373290176635784
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000262 0.2646160 0.5427104 0.5416020 0.8376735 0.9973442
McFadden's R2 logistic: 0.311698
LogLike logistic: -20985.02093255
step par values =0.3732902
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.371290176635784
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000258 0.2646500 0.5427607 0.5416020 0.8377543 0.9973404
McFadden's R2 logistic: 0.311697
LogLike logistic: -20985.02249016
step par values =0.3712902
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.5
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002469 0.277583 0.551794 0.548946 0.840377 0.996984
McFadden's R2 logistic: 0.303337
LogLike logistic: -21128.35278611
step par values =0.5
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.501
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002469 0.277592 0.551775 0.548946 0.840356 0.996989
McFadden's R2 logistic: 0.303327
LogLike logistic: -21128.6434494
step par values =0.501
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.499
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002468 0.277572 0.551812 0.548946 0.840367 0.996980
McFadden's R2 logistic: 0.303346
LogLike logistic: -21128.06418384
step par values =0.499
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002155 0.279455 0.555635 0.548946 0.841635 0.994571
McFadden's R2 logistic: 0.298177
LogLike logistic: -21284.83650619
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.00101
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002155 0.279452 0.555641 0.548946 0.841677 0.994576
McFadden's R2 logistic: 0.298206
LogLike logistic: -21283.95003394
step par values =0.00101
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002155 0.279455 0.555635 0.548946 0.841635 0.994571
McFadden's R2 logistic: 0.298177
LogLike logistic: -21284.83650619
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.383504542323257
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002378 0.276541 0.552336 0.548946 0.842585 0.996414
McFadden's R2 logistic: 0.303941
LogLike logistic: -21110.01784983
step par values =0.3835045
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.384504542323257
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002379 0.276530 0.552348 0.548946 0.842538 0.996420
McFadden's R2 logistic: 0.303941
LogLike logistic: -21110.03531627
step par values =0.3845045
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.382504542323257
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002378 0.276543 0.552386 0.548946 0.842610 0.996409
McFadden's R2 logistic: 0.303942
LogLike logistic: -21110.00291357
step par values =0.3825045
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.376601959425517
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002373 0.276438 0.552432 0.548946 0.842720 0.996375
McFadden's R2 logistic: 0.303943
LogLike logistic: -21109.96637852
step par values =0.376602
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.377601959425517
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002374 0.276462 0.552429 0.548946 0.842783 0.996381
McFadden's R2 logistic: 0.303943
LogLike logistic: -21109.96634901
step par values =0.377602
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.375601959425517
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002373 0.276434 0.552435 0.548946 0.842713 0.996369
McFadden's R2 logistic: 0.303943
LogLike logistic: -21109.96894892
step par values =0.375602
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.5
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000281 0.2723211 0.5464621 0.5450253 0.8380177 0.9980286
McFadden's R2 logistic: 0.307841
LogLike logistic: -21049.11230213
step par values =0.5
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.501
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000281 0.2723360 0.5464859 0.5450253 0.8379652 0.9980299
McFadden's R2 logistic: 0.30783
LogLike logistic: -21049.43842428
step par values =0.501
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.499
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000028 0.272304 0.546439 0.545025 0.838111 0.998027
McFadden's R2 logistic: 0.307852
LogLike logistic: -21048.78823389
step par values =0.499
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000235 0.2722953 0.5509650 0.5450253 0.8404766 0.9965522
McFadden's R2 logistic: 0.303263
LogLike logistic: -21188.32403249
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.00101
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000235 0.2722820 0.5509298 0.5450253 0.8404294 0.9965562
McFadden's R2 logistic: 0.303291
LogLike logistic: -21187.46923055
step par values =0.00101
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000235 0.2722953 0.5509650 0.5450253 0.8404766 0.9965522
McFadden's R2 logistic: 0.303263
LogLike logistic: -21188.32403249
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.368915419152444
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000272 0.2709310 0.5470512 0.5450253 0.8408357 0.9978040
McFadden's R2 logistic: 0.308598
LogLike logistic: -21026.09410563
step par values =0.3689154
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.369915419152444
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000269 0.2709370 0.5470726 0.5450253 0.8407889 0.9978059
McFadden's R2 logistic: 0.308597
LogLike logistic: -21026.10974858
step par values =0.3699154
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.367915419152444
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000269 0.2709120 0.5470322 0.5450253 0.8408012 0.9978014
McFadden's R2 logistic: 0.308598
LogLike logistic: -21026.0794182
step par values =0.3679154
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.362501317926035
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000283 0.2709387 0.5470680 0.5450254 0.8408515 0.9977910
McFadden's R2 logistic: 0.308599
LogLike logistic: -21026.05252884
step par values =0.3625013
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.363501317926035
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000279 0.2709322 0.5471482 0.5450254 0.8408177 0.9977941
McFadden's R2 logistic: 0.308599
LogLike logistic: -21026.05153754
step par values =0.3635013
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.361501317926035
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000283 0.2709245 0.5470902 0.5450254 0.8408834 0.9977888
McFadden's R2 logistic: 0.308599
LogLike logistic: -21026.05505536
step par values =0.3615013
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.5
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000258 0.2688199 0.5480623 0.5443752 0.8392424 0.9979885
McFadden's R2 logistic: 0.311558
LogLike logistic: -20914.93070204
step par values =0.5
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.501
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000258 0.2688292 0.5479557 0.5443752 0.8392236 0.9979897
McFadden's R2 logistic: 0.311549
LogLike logistic: -20915.21939841
step par values =0.501
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.499
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000256 0.2687883 0.5480693 0.5443752 0.8393004 0.9979870
McFadden's R2 logistic: 0.311568
LogLike logistic: -20914.64334332
step par values =0.499
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000217 0.2692554 0.5520275 0.5443753 0.8413734 0.9965126
McFadden's R2 logistic: 0.306741
LogLike logistic: -21061.27500713
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.00101
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000217 0.2692250 0.5520724 0.5443753 0.8414061 0.9965166
McFadden's R2 logistic: 0.306769
LogLike logistic: -21060.42813504
step par values =0.00101
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 1e-05
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000217 0.2692554 0.5520275 0.5443753 0.8413734 0.9965126
McFadden's R2 logistic: 0.306741
LogLike logistic: -21061.27500713
step par values =1e-05
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.379523531974136
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000026 0.267936 0.548357 0.544375 0.841983 0.997787
McFadden's R2 logistic: 0.312178
LogLike logistic: -20896.11641835
step par values =0.3795235
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.380523531974136
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000261 0.2679553 0.5483030 0.5443753 0.8420024 0.9977885
McFadden's R2 logistic: 0.312177
LogLike logistic: -20896.13304431
step par values =0.3805235
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.378523531974136
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000026 0.267915 0.548327 0.544375 0.841939 0.997784
McFadden's R2 logistic: 0.312178
LogLike logistic: -20896.10219205
step par values =0.3785235
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.372705938284547
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000026 0.267791 0.548438 0.544375 0.842029 0.997771
McFadden's R2 logistic: 0.312179
LogLike logistic: -20896.0681252
step par values =0.3727059
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.373705938284547
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000026 0.267799 0.548472 0.544375 0.842076 0.997774
McFadden's R2 logistic: 0.312179
LogLike logistic: -20896.06807576
step par values =0.3737059
intercept Anon.Student.Id
intercept KC..Default.
propdec2 KC..Default. 0.371705938284547
linefail KC..Default.
linefailKC..Default.+propdec2KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.000026 0.267798 0.548397 0.544375 0.841885 0.997769
McFadden's R2 logistic: 0.312179
LogLike logistic: -20896.07061111
step par values =0.3717059
Setting levels: control = 0, case = 1
Setting direction: controls < cases
print(mean(auc_values))
[1] 0.8053306
The RPFA call includes seedpars = c(.9). This is an initial set of parameter values, which serves as the starting point for the parameter search.
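To see what a starting value does, here is a toy sketch using base R's optim(). The loss function toy_loss and its minimum at 0.37 are hypothetical stand-ins chosen for illustration. This is not LKT's internal code, and the bounds are an assumption loosely modelled on the values visible in the trace above.

```r
# Hypothetical loss with a single minimum at d = 0.37 (illustration only).
toy_loss <- function(d) (d - 0.37)^2

# Run the same bounded search from two different seeds.
# The bounds below are assumed for this sketch.
fit_from_09 <- optim(par = 0.9, fn = toy_loss, method = "L-BFGS-B",
                     lower = 1e-5, upper = 0.99999)
fit_from_05 <- optim(par = 0.5, fn = toy_loss, method = "L-BFGS-B",
                     lower = 1e-5, upper = 0.99999)

# Both searches should settle near the same minimum.
print(c(seed_0.9 = fit_from_09$par, seed_0.5 = fit_from_05$par))
```

On a well-behaved loss like this one, the seed mainly affects how many steps the search takes. On a bumpier loss, different seeds can end in different local optima, so it can be worth trying more than one.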
Individualized Additive Factors Model (iAFM) fixed effect version
auc_values <- c()
rmse_values <- c()
ll_values <- c()
r2_values <- c()
for (i in 1:k) {
  test_fold <- i
  # Fit the model. Note that usefolds is a vector of folds to include.
  # We use everything except the held-out test fold.
  modelob <- LKT(
    data = val, interc = TRUE,
    components = c("Anon.Student.Id", "KC..Default.", "KC..Default.", "KC..Default."),
    features = c("intercept", "intercept", "lineafm$", "lineafm"),
    usefolds = (1:k)[-test_fold]
  )
  # Run the model on the held-out test fold. Save the results in 'results'.
  results <- predict_lkt(modelob, data = val, return_stats = TRUE, fold = test_fold)
  # Save the metrics to the vectors.
  auc_values <- c(auc_values, results$AUC)
  rmse_values <- c(rmse_values, results$RMSE)
  ll_values <- c(ll_values, results$ll)
  r2_values <- c(r2_values, results$R2)
}
intercept Anon.Student.Id
intercept KC..Default.
lineafm$ KC..Default.
lineafm KC..Default.
lineafmKC..Default.+lineafmKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000103 0.2866462 0.5560965 0.5453877 0.8133876 0.9999900
McFadden's R2 logistic: 0.294176
LogLike logistic: -21403.79277572
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
lineafm$ KC..Default.
lineafm KC..Default.
lineafmKC..Default.+lineafmKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00001 0.28392 0.55057 0.54160 0.80764 0.99999
McFadden's R2 logistic: 0.292358
LogLike logistic: -21574.64628785
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
lineafm$ KC..Default.
lineafm KC..Default.
lineafmKC..Default.+lineafmKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.002105 0.295609 0.561172 0.548945 0.810843 0.999990
McFadden's R2 logistic: 0.284216
LogLike logistic: -21708.26011417
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
lineafm$ KC..Default.
lineafm KC..Default.
lineafmKC..Default.+lineafmKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.0000265 0.2904520 0.5544308 0.5450267 0.8073318 0.9999900
McFadden's R2 logistic: 0.287906
LogLike logistic: -21655.3416102
Setting levels: control = 0, case = 1
Setting direction: controls < cases
intercept Anon.Student.Id
intercept KC..Default.
lineafm$ KC..Default.
lineafm KC..Default.
lineafmKC..Default.+lineafmKC..Default.:e$data$KC..Default.+interceptKC..Default.+interceptAnon.Student.Id+1
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.00001 0.28687 0.55523 0.54438 0.81068 0.99999
McFadden's R2 logistic: 0.292725
LogLike logistic: -21487.10554724
Setting levels: control = 0, case = 1
Setting direction: controls < cases
print(mean(auc_values))
[1] 0.7855078
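The loop also collects RMSE, log-likelihood, and R2 for each fold, so the same kind of summary can be produced for every metric. The sketch below assumes a small helper, summarise_folds, which is a hypothetical name and not part of LKT. The numbers passed to it are placeholders so the block runs on its own.

```r
# Summarise one vector of per-fold metrics as mean and standard deviation.
summarise_folds <- function(values) {
  c(mean = mean(values), sd = sd(values))
}

# Placeholder numbers for illustration only.
# They are NOT results from this case study.
example_auc <- c(0.78, 0.79, 0.80, 0.79, 0.78)
print(round(summarise_folds(example_auc), 4))
```

In a live session, the same helper could be applied to auc_values, rmse_values, ll_values, and r2_values from the loop above. The standard deviation gives a rough sense of how much the metric varies from fold to fold.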
interacts = c(NA, NA, NA, "Anon.Student.Id"): This argument specifies interactions between components. Here:
The first three components have no interaction (NA).
The fourth component ("KC..Default.") interacts with "Anon.Student.Id".
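The position-by-position pairing can be made explicit by laying the three vectors side by side. This sketch only builds a lookup table and does not fit a model. It assumes the interacts vector is supplied together with the components and features vectors from the iAFM call.

```r
# The three vectors are matched by position: element i of each
# describes the same model term.
components <- c("Anon.Student.Id", "KC..Default.", "KC..Default.", "KC..Default.")
features   <- c("intercept", "intercept", "lineafm$", "lineafm")
interacts  <- c(NA, NA, NA, "Anon.Student.Id")

# Put them in one table so each row reads as a single term.
spec <- data.frame(component = components,
                   feature = features,
                   interacts = interacts,
                   stringsAsFactors = FALSE)
print(spec)
```

Reading the table row by row, only the fourth row (the lineafm feature on KC..Default.) carries an interaction, and it is with Anon.Student.Id.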
Your Turn
Now you have built a few LKT model variants, but only some of the simpler ones. Pavlik's papers describe other, more complex variants.
Please write code below to build an LKT model using the 2012 ASSISTments dataset. More background about this research is here.