vignettes/GetStarted.Rmd
GetStarted.RmdPCLVMthesis implements Bayesian partially confirmatory
factor analysis (PCFA) with flexible shrinkage priors, structured
residual covariance, and categorical data support.
The model is based on the latent variable framework:
where:
The package supports:
This vignette demonstrated:
PCLVMthesis provides a flexible framework for Bayesian
partially confirmatory factor modeling in complex structural
settings.
Below we demonstrate the main functionalities of the package.
library(PCLVMthesis)
truemodel<-sim_matrix(n_specifics=3, items_per_specific=6,loadings_s = c(.7,.7),lac_spe=0.2, cpf_spe=2,specifics_rho = 0.3, seed=123)
cat("Ture Loading is:\n");print( truemodel[["A"]])## Ture Loading is:
## K1 K2 K3
## item1 0.7 0.0 0.0
## item2 0.7 0.0 0.0
## item3 0.7 0.0 0.0
## item4 0.7 0.0 0.0
## item5 0.7 0.2 0.0
## item6 0.7 0.2 0.0
## item7 0.0 0.7 0.0
## item8 0.0 0.7 0.0
## item9 0.0 0.7 0.0
## item10 0.0 0.7 0.0
## item11 0.0 0.7 0.2
## item12 0.0 0.7 0.2
## item13 0.0 0.0 0.7
## item14 0.0 0.0 0.7
## item15 0.0 0.0 0.7
## item16 0.0 0.0 0.7
## item17 0.2 0.0 0.7
## item18 0.2 0.0 0.7
## True Factor correlation is:
## [,1] [,2] [,3]
## [1,] 1.0 0.3 0.3
## [2,] 0.3 1.0 0.3
## [3,] 0.3 0.3 1.0
The design matrix Q encodes prior structure:
-2 = specified loading-1 = unspecified loading (regularized)
Q <- matrix(-1, nrow = J, ncol = K)
Q[truedata$lam != 0] <- -2
m<-PCPMr$new()
m$pcfa(Y=y, Q=Q,iter =1000, burn = 1000, update = 1000);## Cumulative iterations: 1000, Time needed: 1.00533 s
## Cumulative iterations: 2000, Time needed: 2.08625 s
m$pcpmsummary(sig = TRUE)## $model
## [1] "pcfa"
##
## $loadA
## [,1] [,2] [,3]
## [1,] 0.7081456 0.0000000 0.0000000
## [2,] 0.7400338 0.0000000 0.0000000
## [3,] 0.7279924 0.0000000 0.0000000
## [4,] 0.6880363 0.0000000 0.0000000
## [5,] 0.7051716 0.1828038 0.0000000
## [6,] 0.6905446 0.1875283 0.0000000
## [7,] 0.0000000 0.7011196 0.0000000
## [8,] 0.0000000 0.7116980 0.0000000
## [9,] 0.0000000 0.7009015 0.0000000
## [10,] 0.0000000 0.6629890 0.0000000
## [11,] 0.0000000 0.6818785 0.2320647
## [12,] 0.0000000 0.6955046 0.2442505
## [13,] 0.0000000 0.0000000 0.7317293
## [14,] 0.0000000 0.0000000 0.6867971
## [15,] 0.0000000 0.0000000 0.7232461
## [16,] 0.0000000 0.0000000 0.7084529
## [17,] 0.1968783 0.0000000 0.6927487
## [18,] 0.2017558 0.0000000 0.7038701
##
## $PHI
## [,1] [,2] [,3]
## [1,] 1.0000000 0.0000000 0
## [2,] 0.2985372 1.0000000 0
## [3,] 0.3089016 0.2390245 1
##
## $PSX
## var1 var2 var3 var4 var5 var6 var7 var8
## 0.4992767 0.4694459 0.4841658 0.5275705 0.3814533 0.4176408 0.5151571 0.5052978
## var9 var10 var11 var12 var13 var14 var15 var16
## 0.4955527 0.5389408 0.4084869 0.3855452 0.4756966 0.5240894 0.4796519 0.5060166
## var17 var18
## 0.3930078 0.3735477
A<-stat(classname=m, varName='A',Q = Q, LD = F,med =FALSE, start = 00, end = -1, SL = 0.05, sig =FALSE)
C<-stat(classname=m, varName='C',Q = Q, LD = F,med =FALSE, start = 00, end = -1, SL = 0.05, sig =FALSE)
V<-stat(classname=m, varName='V',Q = Q, LD = F,med =FALSE, start = 00, end = -1, SL = 0.05, sig =FALSE)
head(A)## est sd lower upper sign psrf Item Factor
## var1 0.7081456 0.03335362 0.6401872 0.7730075 1 0.9999331 1 1
## var2 0.7400338 0.03241666 0.6731130 0.8001229 1 0.9995097 2 1
## var3 0.7279924 0.03445338 0.6587280 0.7917357 1 1.0000723 3 1
## var4 0.6880363 0.03516484 0.6104319 0.7481933 1 0.9996984 4 1
## var5 0.7051716 0.03189813 0.6448296 0.7702495 1 0.9996116 5 1
## var6 0.6905446 0.03255411 0.6296307 0.7546013 1 1.0029191 6 1
Users may change the loading shrinkage prior reglo
(e.g., ‘lasso’,‘horse’,‘ssp’).
m1 <- PCPMr$new()
m1$pcfa(Y=y, Q=Q,reglo = 'ssp',iter =1000, burn = 1000, update = 1000)## Cumulative iterations: 1000, Time needed: 1.03779 s
## Cumulative iterations: 2000, Time needed: 2.48152 s
truemodel<-sim_matrix(n_specifics=3, items_per_specific=6,loadings_s = c(.7, .7),
lac_spe=0.2, cpf_spe=2,specifics_rho = 0.3,
ecr = .3,necw=3, necb=3, seed=123)
truedata<-sim_data(N = 1000, lam = truemodel[["A"]],phi=truemodel[["Phi"]], ecm=truemodel[["ecm"]])
y<-truedata$dat
J<-nrow(truedata$lam);K<-ncol(truedata$lam)
Q <- matrix(-1, nrow = J, ncol = K)
Q[truedata$lam!=0]=-2
m<-PCPMr$new()
m$pcfa(Y=y, Q=Q,LD=T,iter =1000, burn = 1000, update = 1000);## Cumulative iterations: 1000, Time needed: 4.0269 s
## Cumulative iterations: 2000, Time needed: 7.48289 s
m$pcpmsummary(sig=T)## $model
## [1] "pcfa"
##
## $loadA
## [,1] [,2] [,3]
## [1,] 0.7162258 0.0000000 0.0000000
## [2,] 0.6822352 0.0000000 0.0000000
## [3,] 0.7540613 0.0000000 0.0000000
## [4,] 0.7579145 0.0000000 0.0000000
## [5,] 0.6547815 0.2316941 0.0000000
## [6,] 0.6813621 0.2062770 0.0000000
## [7,] 0.0000000 0.7187069 0.0000000
## [8,] 0.0000000 0.6650042 0.0000000
## [9,] 0.0000000 0.7099416 0.0000000
## [10,] 0.0000000 0.7147230 0.0000000
## [11,] 0.0000000 0.6942164 0.2523915
## [12,] 0.0000000 0.7303871 0.1725705
## [13,] 0.0000000 0.0000000 0.6548645
## [14,] 0.0000000 0.0000000 0.7100136
## [15,] 0.0000000 0.0000000 0.7493893
## [16,] 0.0000000 0.0000000 0.7490007
## [17,] 0.1833685 0.0000000 0.6871824
## [18,] 0.1676497 0.0000000 0.6812644
##
## $PHI
## [,1] [,2] [,3]
## [1,] 1.0000000 0.0000000 0
## [2,] 0.3114337 1.0000000 0
## [3,] 0.2705918 0.2299488 1
##
## $PSX
## [,1] [,2] [,3] [,4] [,5] [,6] [,7]
## [1,] 0.483613 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [2,] 0.000000 0.5132746 0.0000000 0.0000000 0.0000000 0.0000000 0.2402194
## [3,] 0.000000 0.0000000 0.4388913 0.2356877 0.0000000 0.0000000 0.0000000
## [4,] 0.000000 0.0000000 0.0000000 0.4484772 0.0000000 0.0000000 0.0000000
## [5,] 0.000000 0.0000000 0.0000000 0.0000000 0.4123777 0.0000000 0.0000000
## [6,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.4177829 0.0000000
## [7,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.4811251
## [8,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [9,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [10,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [11,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [12,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [13,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [14,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [15,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [16,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [17,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [18,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [,8] [,9] [,10] [,11] [,12] [,13] [,14]
## [1,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.2457158
## [2,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [3,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [4,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [5,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [6,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [7,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [8,] 0.530399 0.0000000 0.0000000 0.0000000 0.0000000 0.2639213 0.0000000
## [9,] 0.000000 0.4885842 0.2840275 0.0000000 0.0000000 0.0000000 0.0000000
## [10,] 0.000000 0.0000000 0.4926997 0.0000000 0.0000000 0.0000000 0.0000000
## [11,] 0.000000 0.0000000 0.0000000 0.4019481 0.0000000 0.0000000 0.0000000
## [12,] 0.000000 0.0000000 0.0000000 0.0000000 0.3859987 0.0000000 0.0000000
## [13,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.5531953 0.0000000
## [14,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.4758915
## [15,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [16,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [17,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [18,] 0.000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000 0.0000000
## [,15] [,16] [,17] [,18]
## [1,] 0.0000000 0.0000000 0.0000000 0.0000000
## [2,] 0.0000000 0.0000000 0.0000000 0.0000000
## [3,] 0.0000000 0.0000000 0.0000000 0.0000000
## [4,] 0.0000000 0.0000000 0.0000000 0.0000000
## [5,] 0.0000000 0.0000000 0.0000000 0.0000000
## [6,] 0.0000000 0.0000000 0.0000000 0.0000000
## [7,] 0.0000000 0.0000000 0.0000000 0.0000000
## [8,] 0.0000000 0.0000000 0.0000000 0.0000000
## [9,] 0.0000000 0.0000000 0.0000000 0.0000000
## [10,] 0.0000000 0.0000000 0.0000000 0.0000000
## [11,] 0.0000000 0.0000000 0.0000000 0.0000000
## [12,] 0.0000000 0.0000000 0.0000000 0.0000000
## [13,] 0.0000000 0.0000000 0.0000000 0.0000000
## [14,] 0.0000000 0.0000000 0.0000000 0.0000000
## [15,] 0.4477172 0.2492510 0.0000000 0.0000000
## [16,] 0.0000000 0.4431266 0.0000000 0.0000000
## [17,] 0.0000000 0.0000000 0.4215094 0.0000000
## [18,] 0.0000000 0.0000000 0.0000000 0.4264528
Users may change the residual covariance matrix prior
regpsx (e.g., ‘lasso’,‘horse’,‘ssp’).
m1<-PCPMr$new()
m1$pcfa(Y=y, Q=Q,LD=T,reglo = 'ssp',regpsx = 'ssp',iter =1000, burn = 1000, update = 1000)## Cumulative iterations: 1000, Time needed: 3.9806 s
## Cumulative iterations: 2000, Time needed: 8.35841 s
truemodel<-sim_matrix(n_specifics=3, items_per_specific=6,loadings_s = c(.7, .7),
lac_spe=0.2, cpf_spe=2,specifics_rho = 0.3, seed=123)
truedata<-sim_data(N = 1000, lam = truemodel[["A"]],phi=truemodel[["Phi"]], ecm=truemodel[["ecm"]],
cati = -1, noc = c(3), misp = 0.05)
y<-truedata$dat
J<-nrow(truedata$lam);K<-ncol(truedata$lam)
Q <- matrix(-1, nrow = J, ncol = K)
Q[truedata$lam!=0]=-2
m<-PCPMr$new()
m$pcfa(Y=y, Q=Q,cati=T,cati_v=-1,LD=F, iter =1000, burn = 1000, update = 1000);## Cumulative iterations: 1000, Time needed: 8.16041 s
## Cumulative iterations: 2000, Time needed: 15.4305 s
m$pcpmsummary(sig=T)## $model
## [1] "pcfa"
##
## $loadA
## [,1] [,2] [,3]
## [1,] 0.7057883 0.0000000 0.0000000
## [2,] 0.7425135 0.0000000 0.0000000
## [3,] 0.7214336 0.0000000 0.0000000
## [4,] 0.6535328 0.0000000 0.0000000
## [5,] 0.6923115 0.1959329 0.0000000
## [6,] 0.6850286 0.2482352 0.0000000
## [7,] 0.0000000 0.7331106 0.0000000
## [8,] 0.0000000 0.7027486 0.0000000
## [9,] 0.0000000 0.6929609 0.0000000
## [10,] 0.0000000 0.6613188 0.0000000
## [11,] 0.0000000 0.7072065 0.1936017
## [12,] 0.0000000 0.6565617 0.2425773
## [13,] 0.0000000 0.0000000 0.7761194
## [14,] 0.0000000 0.0000000 0.7188251
## [15,] 0.0000000 0.0000000 0.7194805
## [16,] 0.0000000 0.0000000 0.7089620
## [17,] 0.1846961 0.0000000 0.6712276
## [18,] 0.2030966 0.0000000 0.6853328
##
## $PHI
## [,1] [,2] [,3]
## [1,] 1.0000000 0.0000000 0
## [2,] 0.3198035 1.0000000 0
## [3,] 0.2931011 0.2641789 1
##
## $PSX
## var1 var2 var3 var4 var5 var6 var7 var8
## 0.4925400 0.4510807 0.5037068 0.5643879 0.3926954 0.3767199 0.5021212 0.5314340
## var9 var10 var11 var12 var13 var14 var15 var16
## 0.5085244 0.5570706 0.3798288 0.4166700 0.4317745 0.4883595 0.5003318 0.4976587
## var17 var18
## 0.4294974 0.4004760
We design a model with 2 general factors, 3 Special factors, 18 total items.
We generate a block-structured factor correlation matrix:
where:
truemodel<-sim_matrix(n_generals=2, items_per_general=9, n_specifics=3, items_per_specific=6,blocks_spe=3,
loadings_g = c(.7, .8), loadings_s = c(.5, .6),
generals_rho = 0.6, specifics_rho = 0.6,sparse_spe=1, seed=123)
cat("Ture Loading is:\n");print( truemodel[["A"]])## Ture Loading is:
## G1 G2 S1 S2 S3
## item1 0.7000 0.0000 0.50 0.00 0.00
## item2 0.7125 0.0000 0.52 0.00 0.00
## item3 0.7250 0.0000 0.00 0.50 0.00
## item4 0.7375 0.0000 0.00 0.52 0.00
## item5 0.7500 0.0000 0.00 0.00 0.50
## item6 0.7625 0.0000 0.00 0.00 0.52
## item7 0.7750 0.0000 0.54 0.00 0.00
## item8 0.7875 0.0000 0.56 0.00 0.00
## item9 0.8000 0.0000 0.00 0.54 0.00
## item10 0.0000 0.7000 0.00 0.56 0.00
## item11 0.0000 0.7125 0.00 0.00 0.54
## item12 0.0000 0.7250 0.00 0.00 0.56
## item13 0.0000 0.7375 0.58 0.00 0.00
## item14 0.0000 0.7500 0.60 0.00 0.00
## item15 0.0000 0.7625 0.00 0.58 0.00
## item16 0.0000 0.7750 0.00 0.60 0.00
## item17 0.0000 0.7875 0.00 0.00 0.58
## item18 0.0000 0.8000 0.00 0.00 0.60
## True Factor correlation is:
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.0 0.6 0 0.0 0.0
## [2,] 0.6 1.0 0 0.0 0.0
## [3,] 0.0 0.0 1 0.0 0.0
## [4,] 0.0 0.0 0 1.0 0.6
## [5,] 0.0 0.0 0 0.6 1.0
We specify:
cor_fac = c(1,1,0,0,0)Interpretation:
1 → correlation estimated using Inverse-Wishart
prior0 → correlation fixed to zeroThus:
m<-PCPMr$new()
m$pcfa(Y=y, Q=Q,cor_fac =c(1,1,0,0,0), sign_check = T,iter =10000, burn = 10000, update = 10000);## Cumulative iterations: 10000, Time needed: 13.8479 s
## Cumulative iterations: 20000, Time needed: 27.9608 s
m$pcpmsummary(sig=T)## $model
## [1] "pcfa"
##
## $loadA
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.67750690 0.00000000 0.5829044 0.0000000 0.0000000
## [2,] 0.71672063 0.00000000 0.5931649 0.0000000 0.0000000
## [3,] 0.79437024 0.00000000 0.0000000 0.3418735 0.0000000
## [4,] 0.79017720 0.00000000 0.0000000 0.3426005 0.0000000
## [5,] 0.82984427 0.00000000 0.0000000 0.0000000 0.3486496
## [6,] 0.82146641 0.00000000 0.0000000 0.0000000 0.3600870
## [7,] 0.78318068 0.00000000 0.6179040 0.0000000 0.0000000
## [8,] 0.80222608 -0.09384935 0.6493833 0.0000000 0.0000000
## [9,] 0.87293707 0.00000000 0.0000000 0.3235971 0.0000000
## [10,] 0.00000000 0.80172983 0.0000000 0.3582580 0.0000000
## [11,] 0.00000000 0.79666916 0.0000000 0.0000000 0.3880213
## [12,] 0.00000000 0.83303864 0.0000000 0.0000000 0.3428284
## [13,] -0.09209742 0.68267039 0.7192728 0.0000000 0.0000000
## [14,] 0.00000000 0.67651340 0.7507249 0.0000000 0.0000000
## [15,] 0.00000000 0.86627114 0.0000000 0.3770264 0.0000000
## [16,] 0.00000000 0.86428511 0.0000000 0.4205342 0.0000000
## [17,] 0.00000000 0.90109346 0.0000000 0.0000000 0.3501542
## [18,] 0.00000000 0.90867035 0.0000000 0.0000000 0.3744025
##
## $PHI
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.0000000 0 0 0 0
## [2,] 0.6173998 1 0 0 0
## [3,] 0.0000000 0 1 0 0
## [4,] 0.0000000 0 0 1 0
## [5,] 0.0000000 0 0 0 1
##
## $PSX
## var1 var2 var3 var4 var5 var6
## 0.257195079 0.233495404 0.213173176 0.187131049 0.173728121 0.137705702
## var7 var8 var9 var10 var11 var12
## 0.116212592 0.059058818 0.073782457 0.202652645 0.196136983 0.169753083
## var13 var14 var15 var16 var17 var18
## 0.136962659 0.079625532 0.087069211 0.041045426 0.041672098 0.007673663
To allow sparse estimation among specific factors:
regphi = "parlasso" / "parhorse" / regphi = "parssp"Meaning:
m1<-PCPMr$new()
m1$pcfa(Y=y, Q=Q,reglo = 'ssp',regphi = 'parssp',Kg = 2,iter =10000, burn = 10000, update = 10000)## Cumulative iterations: 10000, Time needed: 14.4594 s
## Cumulative iterations: 20000, Time needed: 29.2089 s
m1$pcpmsummary(sig=T) ## $model
## [1] "pcfa"
##
## $loadA
## [,1] [,2] [,3] [,4] [,5]
## [1,] 0.6802216 0.0000000 0.5249114 0.0000000 0.0000000
## [2,] 0.7101601 0.0000000 0.5247786 0.0000000 0.0000000
## [3,] 0.7069023 0.0000000 0.0000000 0.5347613 0.0000000
## [4,] 0.6937704 0.0000000 0.0000000 0.5756087 0.0000000
## [5,] 0.7513496 0.0000000 0.0000000 0.0000000 0.5075155
## [6,] 0.7355690 0.0000000 0.0000000 0.0000000 0.5606001
## [7,] 0.7757758 0.0000000 0.5444744 0.0000000 0.0000000
## [8,] 0.7918930 0.0000000 0.5748468 0.0000000 0.0000000
## [9,] 0.7829613 0.0000000 0.0000000 0.5548583 0.0000000
## [10,] 0.0000000 0.6729465 0.0000000 0.5780430 0.0000000
## [11,] 0.0000000 0.6551260 0.0000000 0.0000000 0.6041650
## [12,] 0.0000000 0.6817210 0.0000000 0.0000000 0.5999816
## [13,] 0.0000000 0.6826780 0.6446248 0.0000000 0.0000000
## [14,] 0.0000000 0.6532906 0.7129682 0.0000000 0.0000000
## [15,] 0.0000000 0.7223843 0.0000000 0.6171711 0.0000000
## [16,] 0.0000000 0.7040164 0.0000000 0.6703436 0.0000000
## [17,] 0.0000000 0.7486340 0.0000000 0.0000000 0.6236154
## [18,] 0.0000000 0.7558704 0.0000000 0.0000000 0.6397908
##
## $PHI
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.0000000 0 0 0.0000000 0
## [2,] 0.5123394 1 0 0.0000000 0
## [3,] 0.0000000 0 1 0.0000000 0
## [4,] 0.0000000 0 0 1.0000000 0
## [5,] 0.0000000 0 0 0.6415314 1
##
## $PSX
## var1 var2 var3 var4 var5 var6
## 0.260667506 0.234446940 0.215088033 0.185945346 0.180060115 0.136897107
## var7 var8 var9 var10 var11 var12
## 0.115563176 0.058657498 0.075609866 0.202511858 0.200942746 0.171330957
## var13 var14 var15 var16 var17 var18
## 0.138669807 0.067660124 0.085177572 0.042427650 0.043264620 0.007238237
Alternatively, users may specify:
regphi = "lasso" / "horse" / "ssp"This applies shrinkage to the entire factor covariance matrix (or
blocks defined by cor_fac).
Such full regularization is typically not recommended for MTMM models with a well-defined theoretical design. Nevertheless, it can be appropriate in CFA settings where the factor correlation structure is believed to be sparse, or when the goal is to identify weak or negligible latent associations.