An improved stochastic EM algorithm for large-scale full-information item factor analysis
Siliang Zhang
Shanghai Center for Mathematical Sciences, Fudan University, Shanghai, China
Yunxiao Chen (Corresponding Author)
Department of Statistics, London School of Economics and Political Science, London, UK
Correspondence should be addressed to Yunxiao Chen, London School of Economics, Columbia House, Room 5.16, Houghton Street, London WC2A 2AE, UK (email: [email protected]).
Yang Liu
Department of Human Development and Quantitative Methodology, University of Maryland, College Park MD
Abstract
In this paper, we explore the use of the stochastic EM algorithm (Celeux & Diebolt, 1985, Computational Statistics Quarterly, 2, 73) for large-scale full-information item factor analysis. We introduce several innovations in its implementation: an adaptive-rejection-based Gibbs sampler for the stochastic E step, a proximal gradient descent algorithm for the optimization in the M step, and diagnostic procedures for determining the burn-in size and the stopping point of the algorithm. These developments build on the theoretical results of Nielsen (2000, Bernoulli, 6, 457), as well as advanced sampling and optimization techniques. The proposed algorithm is computationally efficient and virtually tuning-free, which makes it scalable to large data sets with many latent traits (e.g. more than five) and easy for practitioners to use. Standard errors of the parameter estimates are also obtained from the missing-information identity (Louis, 1982, Journal of the Royal Statistical Society, Series B, 44, 226). The performance of the algorithm is evaluated through simulation studies and an application to the IPIP-NEO personality inventory. Extensions of the proposed algorithm to other latent variable models are discussed.
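The general shape of a stochastic EM iteration (sample the latent traits, update the item parameters, average the iterates after burn-in) can be illustrated with a toy sketch. This is not the authors' implementation, and it makes several simplifying assumptions: a unidimensional two-parameter logistic model with a standard normal latent trait, a random-walk Metropolis update in place of the adaptive-rejection-based Gibbs sampler, plain gradient ascent in place of the proximal gradient step, and a fixed burn-in instead of the diagnostic procedures. The names `stochastic_em_2pl` and `simulate_2pl` and all default settings are hypothetical.

```python
import numpy as np


def sigmoid(x):
    # Numerically stable logistic function.
    return 0.5 * (1.0 + np.tanh(0.5 * x))


def simulate_2pl(N, a, d, seed=1):
    """Simulate an (N, J) matrix of 0/1 responses from a 2PL model."""
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=N)
    p = sigmoid(np.outer(theta, a) + d)
    return (rng.uniform(size=p.shape) < p).astype(float)


def stochastic_em_2pl(Y, n_iter=300, burn_in=150, step=0.5, seed=0):
    """Toy stochastic EM for a unidimensional 2PL model, theta ~ N(0, 1).

    Returns the post-burn-in averages of the slope (a) and intercept (d)
    iterates. Illustrative only; see the assumptions in the lead-in.
    """
    rng = np.random.default_rng(seed)
    N, J = Y.shape
    a, d = np.ones(J), np.zeros(J)
    theta = np.zeros(N)
    a_sum, d_sum = np.zeros(J), np.zeros(J)

    def log_post(th):
        # log p(theta_i | y_i; a, d) up to a constant, one value per person.
        eta = np.outer(th, a) + d
        return (Y * eta - np.logaddexp(0.0, eta)).sum(axis=1) - 0.5 * th**2

    for t in range(n_iter):
        # Stochastic E step (assumption: one random-walk Metropolis update
        # per person instead of an adaptive rejection Gibbs draw).
        prop = theta + rng.normal(size=N)
        accept = np.log(rng.uniform(size=N)) < log_post(prop) - log_post(theta)
        theta = np.where(accept, prop, theta)

        # M step (assumption: a few gradient ascent steps on the average
        # complete-data log-likelihood, with theta held at its sampled value).
        for _ in range(5):
            resid = Y - sigmoid(np.outer(theta, a) + d)
            a += step * (resid * theta[:, None]).mean(axis=0)
            d += step * resid.mean(axis=0)

        # Average the parameter iterates after the burn-in period.
        if t >= burn_in:
            a_sum += a
            d_sum += d

    n_avg = n_iter - burn_in
    return a_sum / n_avg, d_sum / n_avg
```

A call such as `a_hat, d_hat = stochastic_em_2pl(simulate_2pl(1000, a_true, d_true))` returns one averaged estimate per item. Averaging over post-burn-in iterates is what distinguishes the final estimator from any single noisy iterate of the chain.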
References
- Agresti, A. (1996). An introduction to categorical data analysis. New York, NY: Wiley. https://doi.org/10.1002/0470114754
- Airoldi, E. M., Blei, D. M., Fienberg, S. E., & Xing, E. P. (2008). Mixed membership stochastic blockmodels. Journal of Machine Learning Research, 9, 1981–2014.
- Albert, J. H. (1992). Bayesian estimation of normal ogive item response curves using Gibbs sampling. Journal of Educational Statistics, 17, 251–269. https://doi.org/10.2307/1165149
- Anderson, T. W., & Rubin, H. (1956). Statistical inference in factor analysis. In J. Neyman (Ed.), Proceedings of the Third Berkeley Symposium on Mathematical Statistics and Probability (Vol. 5, pp. 111–150). Berkeley, CA: University of California Press.
- Béguin, A. A., & Glas, C. A. (2001). MCMC estimation and some model-fit analysis of multidimensional IRT models. Psychometrika, 66, 541–561. https://doi.org/10.1007/bf02296195
- Birnbaum, A. (1968). Some latent trait models and their use in inferring an examinee's ability. In F. M. Lord & M. R. Novick (Eds.), Statistical theories of mental test scores (pp. 397–472). Reading, MA: Addison-Wesley.
- Blei, D. M., Ng, A. Y., & Jordan, M. I. (2003). Latent Dirichlet allocation. Journal of Machine Learning Research, 3, 993–1022.
- Bock, R. D. (1972). Estimating item parameters and latent ability when responses are scored in two or more nominal categories. Psychometrika, 37, 29–51. https://doi.org/10.1007/bf02291411
- Bock, R. D., & Aitkin, M. (1981). Marginal maximum likelihood estimation of item parameters: Application of an EM algorithm. Psychometrika, 46, 443–459. https://doi.org/10.1007/bf02294168
- Butcher, J. N., Dahlstrom, W., Graham, J., Tellegen, A., & Kaemmer, B. (1989). MMPI-2: Manual for administration and scoring. Minneapolis, MN: University of Minnesota Press.
- Cai, L. (2010a). High-dimensional exploratory item factor analysis by a Metropolis-Hastings Robbins–Monro algorithm. Psychometrika, 75, 33–57. https://doi.org/10.1007/s11336-009-9136-x
- Cai, L. (2010b). Metropolis-Hastings Robbins-Monro algorithm for confirmatory item factor analysis. Journal of Educational and Behavioral Statistics, 35, 307–335. https://doi.org/10.3102/1076998609353115
- Cai, L. (2013). flexMIRT: Flexible multilevel multidimensional item analysis and test scoring (version 2.0) [computer software]. Chapel Hill, NC: Vector Psychometric Group.
- Camilli, G., & Fox, J.-P. (2015). An aggregate IRT procedure for exploratory factor analysis. Journal of Educational and Behavioral Statistics, 40, 377–401. https://doi.org/10.3102/1076998615589185
- Celeux, G., & Diebolt, J. (1985). The SEM algorithm: A probabilistic teacher algorithm derived from the EM algorithm for the mixture problem. Computational Statistics Quarterly, 2, 73–82.
- Chalmers, R. P. (2012). mirt: A multidimensional item response theory package for the R environment. Journal of Statistical Software, 48, 1–29. https://doi.org/10.18637/jss.v048.i06
- Costa, P. T., & McCrae, R. R. (1985). The NEO personality inventory. Odessa, FL: Psychological Assessment Resources.
- Dagum, L., & Menon, R. (1998). OpenMP: An industry standard API for shared-memory programming. IEEE Computational Science & Engineering, 5, 46–55. https://doi.org/10.1109/99.660313
- Delyon, B., Lavielle, M., & Moulines, E. (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, 27, 94–128. https://doi.org/10.1214/aos/1018031103
- Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society, Series B, 39, 1–38. https://doi.org/10.1111/j.2517-6161.1977.tb01600.x
- Diebolt, J., & Ip, E. H. (1996). Stochastic EM: Method and application. In W. R. Gilks, S. Richardson & D. Spiegelhalter (Eds.), Markov chain Monte Carlo in practice (pp. 259–273). London, UK: Chapman & Hall. https://doi.org/10.1007/978-1-4899-4485-6_15
- Edwards, M. C. (2010). A Markov chain Monte Carlo approach to confirmatory item factor analysis. Psychometrika, 75, 474–497. https://doi.org/10.1007/s11336-010-9161-9
- Fox, J.-P. (2003). Stochastic EM for estimating the parameters of a multilevel IRT model. British Journal of Mathematical and Statistical Psychology, 56, 65–81. https://doi.org/10.1348/000711003321645340
- Friedman, J., Hastie, T., & Tibshirani, R. (2010). Regularization paths for generalized linear models via coordinate descent. Journal of Statistical Software, 33, 1–22. https://doi.org/10.18637/jss.v033.i01
- Gelman, A., & Rubin, D. B. (1992). Inference from iterative simulation using multiple sequences. Statistical Science, 7, 457–472. https://doi.org/10.1214/ss/1177011136
- Geweke, J. (1992). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments. In J. M. Bernardo, J. O. Berger, A. P. Dawid & A. F. M. Smith (Eds.), Bayesian statistics (pp. 169–193). Oxford, UK: Oxford University Press. https://doi.org/10.1093/oso/9780198522669.003.0010
- Gilks, W. R., & Wild, P. (1992). Adaptive rejection sampling for Gibbs sampling. Applied Statistics, 41, 337–348. https://doi.org/10.2307/2347565
- Gu, M. G., & Kong, F. H. (1998). A stochastic approximation algorithm with Markov chain Monte-Carlo method for incomplete data estimation problems. Proceedings of the National Academy of Sciences of the United States of America, 95, 7270–7274. https://doi.org/10.1073/pnas.95.13.7270
- Herlihy, M., & Shavit, N. (2011). The art of multiprocessor programming. Burlington, MA: Morgan Kaufmann.
- Huber, P., Ronchetti, E., & Victoria-Feser, M.-P. (2004). Estimation of generalized linear latent variable models. Journal of the Royal Statistical Society, Series B, 66, 893–908. https://doi.org/10.1111/j.1467-9868.2004.05627.x
- Ip, E. H. (1994). A stochastic EM estimator in the presence of missing data: Theory and applications. Unpublished doctoral dissertation, Department of Statistics, Stanford University.
- Ip, E. H. (2002). On single versus multiple imputation for a class of stochastic algorithms estimating maximum likelihood. Computational Statistics, 17, 517–524. https://doi.org/10.1007/s001800200124
- Johnson, J. A. (2005). Ascertaining the validity of individual protocols from web-based personality inventories. Journal of Research in Personality, 39, 103–129. https://doi.org/10.1016/j.jrp.2004.09.009
- Johnson, J. A. (2014). Measuring thirty facets of the five factor model with a 120-item public domain inventory: development of the IPIP-NEO-120. Journal of Research in Personality, 51, 78–89. https://doi.org/10.1016/j.jrp.2014.05.003
- Jöreskog, K. G., & Moustaki, I. (2001). Factor analysis of ordinal variables: A comparison of three approaches. Multivariate Behavioral Research, 36, 347–387. https://doi.org/10.1207/s15327906347-387
- Kass, R. E., & Steffey, D. (1989). Approximate Bayesian inference in conditionally independent hierarchical models (parametric empirical Bayes models). Journal of the American Statistical Association, 84, 717–726. https://doi.org/10.1080/01621459.1989.10478825
- Liu, Y., Magnus, B., Quinn, H., & Thissen, D. (2018). Multidimensional item response theory. In D. Hughes, P. Irwing & T. Booth (Eds.), The Wiley handbook of psychometric testing (pp. 445–493). Hoboken, NJ: Wiley. https://doi.org/10.1002/9781118489772.ch16
- Liu, D. C., & Nocedal, J. (1989). On the limited memory BFGS method for large scale optimization. Mathematical Programming, 45, 503–528. https://doi.org/10.1007/BF01589116
- Louis, T. A. (1982). Finding the observed information matrix when using the EM algorithm. Journal of the Royal Statistical Society, Series B, 44, 226–233.
- Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47, 149–174. https://doi.org/10.1007/BF02296272
- Meng, X.-L., & Schilling, S. (1996). Fitting full-information item factor models and an empirical investigation of bridge sampling. Journal of the American Statistical Association, 91, 1254–1267. https://doi.org/10.1080/01621459.1996.10476995
- Monroe, S. L. (2014). Multidimensional item factor analysis with semi-nonparametric latent densities. Unpublished doctoral dissertation. Los Angeles, CA: University of California.
- Muraki, E., & Carlson, J. E. (1995). Full-information factor analysis for polytomous item responses. Applied Psychological Measurement, 19, 73–90. https://doi.org/10.1177/014662169501900109
- Muthén, B. (1978). Contributions to factor analysis of dichotomous variables. Psychometrika, 43, 551–560. https://doi.org/10.1007/bf02293813
- Muthén, B. (1984). A general structural equation model with dichotomous, ordered categorical, and continuous latent variable indicators. Psychometrika, 49, 115–132. https://doi.org/10.1007/bf02294210
- Muthén, B. (1993). Goodness of fit with categorical and other nonnormal variables. In K. A. Bollen & J. S. Long (Eds.), Testing structural equation models (pp. 205–234). Newbury Park, CA: Sage.
- Neal, R. M. (2003). Slice sampling. Annals of Statistics, 31, 705–741. https://doi.org/10.1214/aos/1056562461
- Nemirovski, A., Juditsky, A., Lan, G., & Shapiro, A. (2009). Robust stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19, 1574–1609. https://doi.org/10.1137/070704277
- Nielsen, S. F. (2000). The stochastic EM algorithm: Estimation and asymptotic results. Bernoulli, 6, 457–489. https://doi.org/10.2307/3318671
- Parikh, N., & Boyd, S. (2014). Proximal algorithms. Foundations and Trends in Optimization, 1, 127–239. https://doi.org/10.1561/2400000003
- Patz, R. J., & Junker, B. W. (1999a). Applications and extensions of MCMC in IRT: Multiple item types, missing data, and rated responses. Journal of Educational and Behavioral Statistics, 24, 342–366. https://doi.org/10.3102/10769986024004342
- Patz, R. J., & Junker, B. W. (1999b). A straightforward approach to Markov chain Monte Carlo methods for item response models. Journal of Educational and Behavioral Statistics, 24, 146–178. https://doi.org/10.3102/10769986024002146
- Rabe-Hesketh, S., Skrondal, A., & Pickles, A. (2005). Maximum likelihood estimation of limited and discrete dependent variable models with nested random effects. Journal of Econometrics, 128, 301–323. https://doi.org/10.1016/j.jeconom.2004.08.017
- Rasch, G. (1960). Probabilistic models for some intelligence and achievement tests. Copenhagen, Denmark: Danish Institute for Educational Research.
- Reckase, M. (2009). Multidimensional item response theory. New York, NY: Springer. https://doi.org/10.1007/978-0-387-89976-3
- Revuelta, J. (2014). Multidimensional item response model for nominal variables. Applied Psychological Measurement, 38, 549–562. https://doi.org/10.1177/0146621614536272
- Robbins, H., & Monro, S. (1951). A stochastic approximation method. Annals of Mathematical Statistics, 22, 400–407.
- Roberts, G. O. (1996). Markov chain concepts related to sampling algorithms. In W. R. Gilks, S. Richardson & D. Spiegelhalter (Eds.), Markov chain Monte Carlo in practice (pp. 45–57). London, UK: Chapman & Hall. https://doi.org/10.1007/978-1-4899-4485-6_3
- Schilling, S., & Bock, R. D. (2005). High-dimensional maximum marginal likelihood item factor analysis by adaptive quadrature. Psychometrika, 70, 533–555. https://doi.org/10.1007/s11336-003-1141-x
- Shi, J.-Q., & Lee, S.-Y. (1998). Bayesian sampling-based approach for factor analysis models with continuous and polytomous data. British Journal of Mathematical and Statistical Psychology, 51, 233–252. https://doi.org/10.1111/j.2044-8317.1998.tb00679.x
- Song, X.-Y., & Lee, S.-Y. (2005). A multivariate probit latent variable model for analyzing dichotomous responses. Statistica Sinica, 15, 645–664.
- Spall, J. C. (2005). Introduction to stochastic search and optimization: Estimation, simulation, and control. Hoboken, NJ: John Wiley & Sons.
- Steel, P., Schmidt, J., & Shultz, J. (2008). Refining the relationship between personality and subjective well-being. Psychological Bulletin, 134, 138–161. https://doi.org/10.1037/0033-2909.134.1.138
- Sun, J., Chen, Y., Liu, J., Ying, Z., & Xin, T. (2016). Latent variable selection for multidimensional item response theory models via L1 regularization. Psychometrika, 81, 921–939. https://doi.org/10.1007/s11336-016-9529-6
- Sympson, J. B. (1978). A model for testing with multidimensional items. In D. Weiss (Ed.), Proceedings of the 1977 computerized adaptive testing conference (pp. 82–98). Minneapolis, MN: Psychometric Methods Program, Dept. of Psychology, University of Minnesota.
- Thomas, N. (1993). Asymptotic corrections for multivariate posterior moments with factored likelihood functions. Journal of Computational and Graphical Statistics, 2, 309–322. https://doi.org/10.2307/1390648
- von Davier, M. (2016). New results on an improved parallel EM algorithm for estimating generalized latent variable models. In L. Ark, M. Wiberg, S. Culpepper, J. Douglas & W. Wang (Eds.), Quantitative psychology: The 81st Annual Meeting of the Psychometric Society (pp. 1–8). Cham, Switzerland: Springer.
- von Davier, M., & Sinharay, S. (2010). Stochastic approximation methods for latent regression item response models. Journal of Educational and Behavioral Statistics, 35, 174–193. https://doi.org/10.3102/1076998609346970
- Wada, T., & Fujisaki, Y. (2015). A stopping rule for stochastic approximation. Automatica, 60, 1–6. https://doi.org/10.1016/j.automatica.2015.06.029
- Wirth, R., & Edwards, M. C. (2007). Item factor analysis: Current approaches and future directions. Psychological Methods, 12, 58–79. https://doi.org/10.1037/1082-989X.12.1.58
- Yao, L., & Schwarz, R. D. (2006). A multidimensional partial credit model with associated item and test statistics: An application to mixed-format tests. Applied Psychological Measurement, 30, 469–492. https://doi.org/10.1177/0146621605284537
- Zhao, Y., & Joe, H. (2005). Composite likelihood estimation in multivariate data analysis. Canadian Journal of Statistics, 33, 335–356. https://doi.org/10.1002/cjs.5540330303