Statistical Methods in Medical Research: An International Review Journal

Print ISSN: 0962-2802 Publisher: Sage Publications

Most recent papers:

Quantile inference for multivariate response regression in joint modeling of longitudinal and survival data.
Xiaoyu Niu, Xuejing Zhao. School of Mathematics and Statistics, Lanzhou University, Lanzhou, China.
Statistical Methods in Medical Research: An International Review Journal. 2 days ago

Statistical Methods in Medical Research, Ahead of Print.
Quantile regression (QR) offers a robust framework for analyzing covariate effects across the outcome distribution, particularly when the response variable exhibits skewness or heavy tails. To jointly model multivariate longitudinal biomarkers and a time-...

May 16, 2026 doi: 10.1177/09622802261420698 open full text
Empowering classification for multivariate functional data with simultaneous feature selection.
Shuoyang Wang, Guanqun Cao, Yuan Huang.
Statistical Methods in Medical Research: An International Review Journal. 3 days ago

Statistical Methods in Medical Research, Ahead of Print.
The opportunity to utilize multivariate functional data types for conducting classification tasks is emerging with the growing availability of imaging data. Inspired by the extensive data provided by the Alzheimer’s Disease Neuroimaging Initiative, we ...

May 15, 2026 doi: 10.1177/09622802261449366 open full text
A Bayesian phase I/II platform design with survival efficacy endpoint for dose optimization.
Xian Shi, Jiangyan Zhao, Jin Xu, Rongji Mu.
Statistical Methods in Medical Research: An International Review Journal. 4 days ago

Statistical Methods in Medical Research, Ahead of Print.
Motivated by a real-world drug development program, we propose a Bayesian phase I/II platform design to co-develop therapies with time-to-event efficacy endpoint (BPCT). We jointly model the binary toxicity outcome and the time-to-event efficacy outcome, ...

May 14, 2026 doi: 10.1177/09622802261449367 open full text
Screening for diabetes mellitus in the US population using neural network-based modeling and complex survey designs.
Marcos Matabuena, Juan C Vidal, Rahul Ghosal, Jukka-Pekka Onnela.
Statistical Methods in Medical Research: An International Review Journal. 11 days ago

Statistical Methods in Medical Research, Ahead of Print.
Complex survey designs are widely used in medical cohort studies. Developing risk score models that adequately account for the sampling design is essential to minimize selection bias and obtain representative population estimates. This work addresses ...

May 07, 2026 doi: 10.1177/09622802261442893 open full text
PRO-ADD: Patient-empowered dose-finding trials integrating safety, preliminary efficacy and patient-reported outcomes for optimal dose selection.
Emily Alger, Sumithra J Mandrekar, Jun Yin, Christina Yap.
Statistical Methods in Medical Research: An International Review Journal. April 30, 2026

Statistical Methods in Medical Research, Ahead of Print.
Advances in oncology drug development are driving the emergence of novel therapies, challenging traditional dose-efficacy assumptions in dose-finding oncology trials. Traditional trial designs aim to identify a maximum tolerated dose (MTD) by assessing ...

April 30, 2026 doi: 10.1177/09622802261435969 open full text
A double-semiparametric approach for extending mixture cure models with interval-censored data.
Xiaoyu Liu, Zsolt Szabo, Liming Xiang.
Statistical Methods in Medical Research: An International Review Journal. April 30, 2026

Statistical Methods in Medical Research, Ahead of Print.
Mixture cure models (MCMs) have become a valuable tool for analyzing failure time data in settings where a subset of individuals is considered to be “cured” or no longer experience the failure event of interest. Under the MCM framework, the latency ...

April 30, 2026 doi: 10.1177/09622802261442911 open full text
When randomization is not random: Allocation bias in small sample, group sequential randomized clinical trials.
Daniel Bodden, Ralf-Dieter Hilgers, Franz König.
Statistical Methods in Medical Research: An International Review Journal. April 30, 2026

Statistical Methods in Medical Research, Ahead of Print.
Even in rare diseases, where the sample size is limited and blinding is less frequently implemented, randomized controlled trials are considered the gold standard to prove efficacy. Randomization is used to mitigate bias and regulatory guidance recommends ...

April 30, 2026 doi: 10.1177/09622802261442914 open full text
Smooth transformation models for survival analysis: A tutorial using R.
Sandra Siegfried, Bálint Tamási, Torsten Hothorn. Institut für Epidemiologie, Biostatistik und Prävention, Universität Zürich, Switzerland.
Statistical Methods in Medical Research: An International Review Journal. April 28, 2026

Statistical Methods in Medical Research, Ahead of Print.
Over the last five decades, we have seen strong methodological advances in survival analysis, using parametric methods and, more prominently, methods based on non-/semi-parametric estimation. As the methodological landscape continues to evolve, the task ...

April 28, 2026 doi: 10.1177/09622802251414595 open full text
Quantifying the effects of air pollution on respiratory ill health treated in primary care when the locations of the populations at risk are partially unknown.
Qiangqiang Zhu, Duncan Lee, Oliver Stoner. School of Mathematics and Statistics, University of Glasgow, UK.
Statistical Methods in Medical Research: An International Review Journal. April 24, 2026

Statistical Methods in Medical Research, Ahead of Print.
Most air pollution and health studies focus on severe outcomes such as hospitalisations and deaths, overlooking the impact that air pollution may have on non-hospitalised respiratory ill health treated in primary care. This paper presents a new study ...

April 24, 2026 doi: 10.1177/09622802261439259 open full text
A practical review of response-adaptive randomization: Under-explored challenges and potential directions.
Hao Mei, Xiaolin Xu, Hang Yang, Fan Wang, Yang Li.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2026

Statistical Methods in Medical Research, Ahead of Print.
Response-adaptive randomization (RAR) dynamically adjusts allocation probabilities of sequentially enrolled patients based on accumulating response information. It has gained increasing attention in clinical trials for its ability to enhance statistical ...

April 16, 2026 doi: 10.1177/09622802261427330 open full text
Bayesian sample size determination using robust commensurate priors with interpretable discrepancy weights.
Lou E Whitehead, James MS Wason, Oliver Sailer, Haiyan Zheng.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2026

Statistical Methods in Medical Research, Ahead of Print.
Randomized controlled clinical trials provide the gold standard for evidence generation in relation to the efficacy of a new treatment in clinical research. Relevant information from previous studies may be desirable to incorporate in the design and ...

April 16, 2026 doi: 10.1177/09622802261432816 open full text
The applicability to systematic reviews of common effect, random effects and fixed effects approaches to meta-analysis.
Richard J Stevens. Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, UK.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2026

Statistical Methods in Medical Research, Ahead of Print.
Systematic reviewers planning quantitative meta-analysis usually choose between fixed effect meta-analysis, and random effects meta-analysis. An alternative method is called fixed effects (note the s in the name). This method has the unique property that ...

April 16, 2026 doi: 10.1177/09622802261439260 open full text
A flexible semiparametric approach for robust causal inference with invalid instruments and unmeasured confounder.
Yunlong Cao, Yuquan Wang, Dapeng Shi, Dong Chen, Yue-Qing Hu.
Statistical Methods in Medical Research: An International Review Journal. April 13, 2026

Statistical Methods in Medical Research, Ahead of Print.
Inferring causal effects with unmeasured confounder is a main challenge in causal inference. Many researchers impose parametric assumptions on the distribution of unmeasured confounder. However, due to the unobservable nature of the unmeasured confounder, ...

April 13, 2026 doi: 10.1177/09622802261439252 open full text
Model-based clustering of multiple images incorporating covariates.
Ying Cui, Jeong Hoon Jang, Robert G Mannino, Amita K Manatunga.
Statistical Methods in Medical Research: An International Review Journal. April 08, 2026

Statistical Methods in Medical Research, Ahead of Print.
In this paper, we develop a novel method for clustering multiple images while adjusting for the effect of available covariates on cluster membership. The key strategy is to represent each image as two-dimensional functional data and formulate a functional ...

April 08, 2026 doi: 10.1177/09622802251393631 open full text
Implementing empirical likelihood within the causal inference framework to study causal effects of air pollution on reproductive development.
Sima Sharghi, Kevin E Stoll, Sally W Thurston, Emily Barrett, Brent Johnson.
Statistical Methods in Medical Research: An International Review Journal. April 08, 2026

Statistical Methods in Medical Research, Ahead of Print.
To study whether air pollution is detrimental to reproductive development is imperative. In the absence of randomized trials to study the effects of air pollution on human health, data from observational studies have been utilized in which the researchers ...

April 08, 2026 doi: 10.1177/09622802261435966 open full text
Beyond weighting: Propensity score modeling for causal inference.
Rong J.B. Zhu. Fudan University, China.
Statistical Methods in Medical Research: An International Review Journal. April 08, 2026

Statistical Methods in Medical Research, Ahead of Print.
Propensity score weighting is a common method in causal inference. However, this approach faces two well-known challenges: (i) high variance due to small probability values in the denominator, and (ii) sensitivity to model specification errors ...

April 08, 2026 doi: 10.1177/09622802261436960 open full text
A Bayesian likely responder approach for the analysis of randomized controlled trials.
Annan Deng, Carole Siegel, Hyung G Park.
Statistical Methods in Medical Research: An International Review Journal. April 08, 2026

Statistical Methods in Medical Research, Ahead of Print.
An important goal of precision medicine is to personalize medical treatment by identifying individuals who are most likely to benefit from a specific treatment. The likely responder (LR) framework, which identifies a subpopulation where treatment response ...

April 08, 2026 doi: 10.1177/09622802261427026 open full text
Sensitivity bounds for bias in hazard ratios: A causal hazard perspective.
Yan-Lin Chen, Hsiang-Hsi Hung, Sheng-Hsuan Lin.
Statistical Methods in Medical Research: An International Review Journal. April 03, 2026

Statistical Methods in Medical Research, Ahead of Print.
The Cox proportional hazards model has popularized the conventional hazard ratio as a standard measure for assessing the effect of exposure on time-to-event outcomes. However, as noted in Hernán's influential critique, interpreting the hazard ratio as a ...

April 03, 2026 doi: 10.1177/09622802261436811 open full text
Vaccine efficacy estimands and power considerations.
Andrea Callegaro, Nathan W. Bean.
Statistical Methods in Medical Research: An International Review Journal. April 03, 2026

Statistical Methods in Medical Research, Ahead of Print.
The ICH E9(R1) addendum stresses the importance of clearly pre-specifying clinically interpretable treatment effect measures (estimands) and proposes different strategies to deal with intercurrent events. In this paper, we consider different estimands of ...

April 03, 2026 doi: 10.1177/09622802251412833 open full text
Linearized maximum rank correlation estimation of doubly truncated data.
Peijie Wang, Qihao Wang, Jianguo Sun.
Statistical Methods in Medical Research: An International Review Journal. April 02, 2026

Statistical Methods in Medical Research, Ahead of Print.
Truncated data frequently arise in many areas such as economics, astronomical studies, and survival analysis, and the existence of truncation makes statistical inference more difficult due to the incomplete information. In this paper, we propose a ...

April 02, 2026 doi: 10.1177/09622802261432834 open full text
Penalized estimation of linear transformation models for interval-censored data with time-dependent covariates.
Minggen Lu, Yahui Zhang, Chin-Shang Li, Guogen Shan.
Statistical Methods in Medical Research: An International Review Journal. April 01, 2026

Statistical Methods in Medical Research, Ahead of Print.
We investigate efficient estimation strategies for partially linear transformation models with time-dependent covariates under interval censoring. The unknown monotone function is approximated using a monotone B-spline basis to enable flexible ...

April 01, 2026 doi: 10.1177/09622802261433000 open full text
Improving finite sample performance of causal discovery by exploiting temporal structure.
Christine W. Bang, Janine Witte, Ronja Foraita, Vanessa Didelez.
Statistical Methods in Medical Research: An International Review Journal. April 01, 2026

Statistical Methods in Medical Research, Ahead of Print.
Methods of causal discovery aim to identify causal structures in a data-driven way. Existing algorithms are known to be unstable and sensitive to statistical errors, and are therefore rarely used with biomedical or epidemiological data. We investigate an ...

April 01, 2026 doi: 10.1177/09622802261422162 open full text
Eliminating residual confounding in the stratified estimator via smoothing along with the propensity score.
Naoto Tsujimoto, Satoshi Hattori.
Statistical Methods in Medical Research: An International Review Journal. March 31, 2026

Statistical Methods in Medical Research, Ahead of Print.
The stratified estimator by the propensity score is one of the most popular estimators for the average causal effect in the presence of confounding. Despite its advantages of robustness and simplicity, it has a serious shortcoming of residual ...

March 31, 2026 doi: 10.1177/09622802261432998 open full text
Joint estimation of multiple graphical models for an fMRI study of brain connectivity networks.
Lizhe Sun, Xiaojuan Han, Aiying Zhang.
Statistical Methods in Medical Research: An International Review Journal. March 30, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 925-942, April 2026.
Investigating changes and similarities in brain connectivity networks across task conditions is a central topic in neuroscience. We propose a novel framework for jointly estimating multiple graphical models using a hybrid Bayesian integration technique ...

March 30, 2026 doi: 10.1177/09622802261432804 open full text
Addressing nonignorable missing data and heterogeneity in prognostic biomarker assessment.
Xinran Huang, Ruosha Li, Jing Ning, for the Alzheimer’s Disease Neuroimaging Initiative.
Statistical Methods in Medical Research: An International Review Journal. March 27, 2026

Statistical Methods in Medical Research, Ahead of Print.
Covariate-specific and time-dependent area-under-curve (AUC) is often used to evaluate the discriminative performance of biomarkers with time-to-event outcomes, particularly when certain covariates influence biomarkers’ accuracy. In biomarker research, ...

March 27, 2026 doi: 10.1177/09622802261432996 open full text
Covariate hypothesis tests for the cure rate in mixture cure models based on martingale difference correlation.
Blanca E Monroy-Castillo, María Amalia Jácome, Ricardo Cao, Ingrid Van Keilegom.
Statistical Methods in Medical Research: An International Review Journal. March 27, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 887-911, April 2026.
Cure models are a class of survival models used to analyze time-to-event data that allow the possibility that the event never occurs for a certain percentage, 1−p, of the population. These methods allow for direct modelling of the cure rate and the influence ...

March 27, 2026 doi: 10.1177/09622802261421453 open full text
Randomization and allocation procedures for master protocol trials of single-arm studies.
Peter Jacko, Günter Heimann, Tom Parke.
Statistical Methods in Medical Research: An International Review Journal. March 25, 2026

Statistical Methods in Medical Research, Ahead of Print.
In this paper we propose and examine randomization and allocation procedures in master protocol trials where each subtrial is performed as a single-arm study, which is motivated mainly by rare diseases. The subtrial analysis is done by comparing the ...

March 25, 2026 doi: 10.1177/09622802261425426 open full text
A Bayesian transformation model for informative partly interval-censored data with covariates subject to measurement error.
Jingjing Jiang, Chunjie Wang (School of Mathematics and Statistics, Changchun University of Technology, China).
Statistical Methods in Medical Research: An International Review Journal. March 25, 2026

Statistical Methods in Medical Research, Ahead of Print.
Linear transformation models are one of the commonly used models for regression analysis of failure time data due to their flexibility. Although the existing literature provides many methods for fitting transformation models with fixed covariates and non-...

March 25, 2026 doi: 10.1177/09622802261432830 open full text
Asymptotic validity of Schoenfeld’s sample size formula for the Cox proportional hazards model via the Wald test approach.
Se Yoon Lee (Department of Statistics, Texas A&M University, USA).
Statistical Methods in Medical Research: An International Review Journal. March 23, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 681-694, April 2026.
We revisit the widely used sample size formula for the Cox proportional hazards model, originally proposed by Schoenfeld in 1983. The classical derivation, based on the score test, evaluates the Fisher information under the null hypothesis, overlooking ...

March 23, 2026 doi: 10.1177/09622802261427024 open full text
Mediation analysis in longitudinal intervention studies with an ordinal treatment-dependent confounder.
Mikko Valtanen, Tommi Härkänen, Matti Uusitupa, Jaakko Tuomilehto, Jaana Lindström, Kari Auranen.
Statistical Methods in Medical Research: An International Review Journal. March 18, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 773-794, April 2026.
In interventional health studies, causal mediation analysis can be employed to investigate mechanisms through which the intervention affects the targeted health outcome. Identifying direct and indirect effects from empirical data becomes complicated, ...

March 18, 2026 doi: 10.1177/09622802261418211 open full text
Flexible Bayesian modeling of non-equidispersed counts with penalized complexity priors in disease incidence studies.
Mahsa Nadifar, Hossein Baghishani, Thomas Kneib, Afshin Fallah.
Statistical Methods in Medical Research: An International Review Journal. March 16, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 713-735, April 2026.
Counts in epidemiology often deviate from equidispersion and exhibit spatial, temporal, and nonlinear structure that the Poisson model cannot accommodate. We introduce a gamma-count structured additive regression model that strategically integrates ...

March 16, 2026 doi: 10.1177/09622802261416088 open full text
Efficient design of partially nested randomized trials: A maximin approach.
Math JJM Candel, Gerard JP van Breukelen.
Statistical Methods in Medical Research: An International Review Journal. March 13, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 695-712, April 2026.
For two-treatment randomized trials with clustering in one of the treatment arms and a continuous outcome, designs are presented that minimize the number of subjects or the amount of research budget, when aiming for a desired power level. These designs ...

March 13, 2026 doi: 10.1177/09622802251409388 open full text
Regression analysis of interval-censored competing risks data with missing causes of failure: A direct likelihood approach.
Yichen Lou, Yuqing Ma, Liming Xiang, Jianguo Sun.
Statistical Methods in Medical Research: An International Review Journal. March 04, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 795-811, April 2026.
Regression analysis of interval-censored competing risks data is often required and plays an important role in many areas. For the situation, in addition to competing risk and interval censoring, another feature that makes the analysis difficult is that ...

March 04, 2026 doi: 10.1177/09622802261420820 open full text
An ensemble approach to tensor learning.
Jiaxin He, Jialiang Li.
Statistical Methods in Medical Research: An International Review Journal. March 03, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 847-866, April 2026.
Motivated by recent development of tensor regression modeling, we propose a novel tensor ensemble learning (TEL) approach. While CANDECOMP/PARAFAC (CP) decomposition is an efficient technique to reduce the number of parameters in tensor covariate, ...

March 03, 2026 doi: 10.1177/09622802261424654 open full text
Likelihood ratio test for the disease progression model to measure saved time in Alzheimer’s disease.
Guogen Shan, Yahui Zhang, Zhixin Tang, Aidong Adam Ding.
Statistical Methods in Medical Research: An International Review Journal. March 03, 2026

Statistical Methods in Medical Research, Ahead of Print.
Saved time provides an intuitive interpretation comparing a new treatment to the placebo in a randomized trial with repeated measures. The projection approach is frequently used to estimate saved time by using the placebo group disease progression curve ...

March 03, 2026 doi: 10.1177/09622802261424515 open full text
Improved survival analysis with shrinkage Kibria–Lukman estimators in the Cox model: Application to lung cancer data.
Solmaz Seifollahi, Mohammad Arashi (Department of Statistics, Faculty of Mathematical Sciences, Ferdowsi University of Mashhad, Mashhad, Iran).
Statistical Methods in Medical Research: An International Review Journal. March 03, 2026

Statistical Methods in Medical Research, Ahead of Print.
The Cox proportional hazards regression model is a widely used and valuable tool for modeling survival time with predictors; however, its performance can deteriorate in the presence of multicollinearity. It can lead to unreliable estimates from the maximum ...

March 03, 2026 doi: 10.1177/09622802261423186 open full text
Likelihood ratio test for the disease progression model to measure saved time in Alzheimer’s disease.
Guogen Shan, Yahui Zhang, Zhixin Tang, Aidong Adam Ding.
Statistical Methods in Medical Research: An International Review Journal. March 03, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 912-924, April 2026.
Saved time provides an intuitive interpretation comparing a new treatment to the placebo in a randomized trial with repeated measures. The projection approach is frequently used to estimate saved time by using the placebo group disease progression curve ...

March 03, 2026 doi: 10.1177/09622802261424515 open full text
Confidence intervals and point estimates for treatment effects in adaptive enrichment designs.
Jinyu Zhu, Andrew Titman, Fang Wan (School of Mathematical Sciences, Lancaster University, UK).
Statistical Methods in Medical Research: An International Review Journal. February 24, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 827-846, April 2026.
Adaptive enrichment designs allow subgroup selection of the patient population within a confirmatory trial via an interim analysis. However, this design complicates treatment effect estimation and uncertainty quantification. This paper introduces a p-value ...

February 24, 2026 doi: 10.1177/09622802261423180 open full text
Estimating conditional survival benefit for the allocation of scarce resources.
Ilaria Prosepe, Nan van Geloven, Hans de Ferrante, Andries E Braat, Hein Putter.
Statistical Methods in Medical Research: An International Review Journal. February 17, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 812-826, April 2026.
Whenever treatment is scarce, the question of how to allocate resources arises. One option is to allocate based on conditional survival benefit, defined as the contrast between an individual’s expected survival with and without treatment. Estimating ...

February 17, 2026 doi: 10.1177/09622802261420699 open full text
Designing clinical trials for the comparison of single and multiple quantiles with right-censored data.
Beatriz Farah, Olivier Bouaziz, Aurélien Latouche.
Statistical Methods in Medical Research: An International Review Journal. February 17, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 626-636, March 2026.
Based on the test for equality of quantiles originally introduced by Kosorok (1999), we propose new power formulas for the comparison of one quantile between two treatment groups, as well as for the comparison of a collection of quantiles. Under the null ...

February 17, 2026 doi: 10.1177/09622802251415363 open full text
Rank-based methods for assessing equivalence/non-inferiority with assay sensitivity in a three-arm trial with ordinal endpoints.
Shi-Fang Qiu, Dai-Min Li, Wai-Yin Poon.
Statistical Methods in Medical Research: An International Review Journal. February 12, 2026

Statistical Methods in Medical Research, Ahead of Print.
Various approaches have been developed to assess equivalence/non-inferiority with assay sensitivity in a three-arm trial with continuous or discrete endpoints. However, there is little work done on ordinal endpoints. Ordinal data do not have metric ...

February 12, 2026 doi: 10.1177/09622802261417216 open full text
Rank-based methods for assessing equivalence/non-inferiority with assay sensitivity in a three-arm trial with ordinal endpoints.
Shi-Fang Qiu, Dai-Min Li, Wai-Yin Poon.
Statistical Methods in Medical Research: An International Review Journal. February 12, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 752-772, April 2026.
Various approaches have been developed to assess equivalence/non-inferiority with assay sensitivity in a three-arm trial with continuous or discrete endpoints. However, there is little work done on ordinal endpoints. Ordinal data do not have metric ...

February 12, 2026 doi: 10.1177/09622802261417216 open full text
A hybrid prior Bayesian method for combining domestic real-world data and overseas data in global drug development.
Keer Chen, Zengyue Zheng, Pengfei Zhu, Shuping Jiang, Nan Li, Jumin Deng, Pingyan Chen, Zhenyu Wu, Ying Wu.
Statistical Methods in Medical Research: An International Review Journal. February 06, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 552-570, March 2026.
Background: Hybrid clinical trial design integrates traditional randomized controlled trials (RCTs) with real-world data (RWD), aiming to enhance trial efficiency through dynamic incorporation of external data (external trial data and RWD). However, ...

February 06, 2026 doi: 10.1177/09622802251414586 open full text
Monitoring time to event in registry data using CUSUMs based on relative survival models.
Jimmy Huy Tran, Jan Terje Kvaløy, Hartwig Kørner.
Statistical Methods in Medical Research: An International Review Journal. February 05, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 488-506, March 2026.
An aspect of interest in surveillance of diseases is whether the survival time distribution changes over time. By following data in health registries over time, this can be monitored, either in real time or retrospectively. With relevant risk factors ...

February 05, 2026 doi: 10.1177/09622802251411540 open full text
A non-proportional hazards cure model with an application to gastric cancer data analysis.
N Balakrishnan, M Mar Fenoy, M Carmen Pardo.
Statistical Methods in Medical Research: An International Review Journal. January 30, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 653-666, March 2026.
In many practical situations, some subjects may never experience the event of interest in their lifetime. These subjects are referred to as the cured or non-susceptible subjects. In the context of chronic disease treatment, this is referred to as a cure ...

January 30, 2026 doi: 10.1177/09622802251414429 open full text
Bayesian feature selection in joint models with application to a cardiovascular disease cohort study.
Mirajul Islam, Michael J Daniels, Zeynab Aghabazaz, Juned Siddique.
Statistical Methods in Medical Research: An International Review Journal. January 29, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 637-652, March 2026.
Cardiovascular disease (CVD) cohorts collect data longitudinally to study the association between CVD risk factors and event times. An important area of scientific research is to better understand what features of CVD risk factor trajectories are ...

January 29, 2026 doi: 10.1177/09622802251414939 open full text
Statistical methods for clustered competing risk data when the event types are only available in a training dataset.
Yujie Wu, Ce Yang, Molin Wang.
Statistical Methods in Medical Research: An International Review Journal. January 29, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 667-677, March 2026.
We develop methods to analyze clustered competing risks data when the event types are only available in a training dataset and are missing in the main study. We propose to estimate the exposure effects through the cause-specific proportional hazards ...

January 29, 2026 doi: 10.1177/09622802251415022 open full text
Discrimination performance in illness-death models with interval-censored disease data.
Marta Spreafico, Anja J Rueten-Budde, Hein Putter, Marta Fiocco.
Statistical Methods in Medical Research: An International Review Journal. January 29, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 469-487, March 2026.
In clinical studies, the illness-death model is often used to describe disease progression. A subject starts disease-free, may develop the disease and then die, or die directly. In clinical practice, disease can only be diagnosed at pre-specified follow-...

January 29, 2026 doi: 10.1177/09622802251412855 open full text
Parametric and nonparametric propensity score weighting analysis with subgroup covariate balance.
Yan Li, Yong-Fang Kuo, Liang Li.
Statistical Methods in Medical Research: An International Review Journal. January 29, 2026

Statistical Methods in Medical Research, Volume 35, Issue 4, Page 736-751, April 2026.
Estimating the causal treatment effects by subgroups is important in observational studies when the treatment effect heterogeneity is present. Existing propensity score methods rely on a correctly specified propensity score model. Model misspecification ...

January 29, 2026 doi: 10.1177/09622802251415157 open full text
A historical note: Rediscovering an unpublished response to Korn and Freidlin (2011).
Hongjian Zhu, William F Rosenberger, Feifang Hu, Sofia Villar.
Statistical Methods in Medical Research: An International Review Journal. January 28, 2026

Statistical Methods in Medical Research, Ahead of Print.
Despite extensive research, the use of response-adaptive randomization (RAR) in clinical trials has remained controversial. Korn and Freidlin’s 2011 article reignited this debate back then, prompting numerous responses, including one by Zhu, Rosenberger, ...

January 28, 2026 doi: 10.1177/09622802251403371 open full text
Truncated Gaussian copula principal component analysis with application to pediatric acute lymphoblastic leukemia patients’ gut microbiome.
Lei Wang, Yang Ni, Irina Gaynanova.
Statistical Methods in Medical Research: An International Review Journal. January 23, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 443-455, March 2026.
Increasing epidemiologic evidence suggests that the diversity and composition of the gut microbiome can predict infection risk in cancer patients. Infections remain a major cause of morbidity and mortality during chemotherapy. Analyzing microbiome data to ...

January 23, 2026 doi: 10.1177/09622802251412844 open full text
Cluster analysis for longitudinal data and its application in the detection of adiposity trajectories.
Asael Fabian Martínez, Ivonne Ramírez-Silva, Ruth Fuentes-García.
Statistical Methods in Medical Research: An International Review Journal. January 20, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 588-600, March 2026.
The identification of latent profile trajectories in longitudinal studies represents an important challenge for specialists since they could provide insights to better understand their problem of interest. The majority of the statistical methodologies for ...

January 20, 2026 doi: 10.1177/09622802251414594 open full text
Hazard-based distributional regression via ordinary differential equations.
Jose A Christen, Francisco J Rubio.
Statistical Methods in Medical Research: An International Review Journal. January 19, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 571-587, March 2026.
The hazard function is central to the formulation of commonly used survival regression models such as the proportional hazards and accelerated failure time models. However, these models rely on a shared baseline hazard, which, when specified ...

January 19, 2026 doi: 10.1177/09622802251412840 open full text
A permutation test of differences between externally or internally defined groupings in compositional data sets.
Nikola Štefelová, Javier Palarea-Albaladejo, Josep Antoni Martín-Fernández (Department of Computer Science, Applied Mathematics and Statistics, University of Girona, Spain).
Statistical Methods in Medical Research: An International Review Journal. January 19, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 601-625, March 2026.
Testing group differences in compositional data, that is, multivariate data referring to parts of a whole, requires focussing on the relative information between components. This is commonly achieved by mapping the data into a sensible logratio coordinate ...

January 19, 2026 doi: 10.1177/09622802251413737 open full text
Dynamic prediction of interval-censored failure time data with longitudinal marker.
Yang-Jin Kim (Department of Statistics, Sookmyung Women’s University, South Korea).
Statistical Methods in Medical Research: An International Review Journal. January 19, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 456-468, March 2026.
A main interest in clinical practice is the prediction of patient prognosis conducive to decision making. Therefore, a relevant prediction model should be able to reflect the updated patient’s condition. A joint model of longitudinal markers and time-to-...

January 19, 2026 doi: 10.1177/09622802251412849 open full text
A burn-in(g) question: How long should an initial equal randomization stage be before Bayesian response-adaptive randomization?
Edwin YN Tang, Stef Baas, Daniel Kaddaj, Lukas Pin, David S Robertson, Sofía S Villar.
Statistical Methods in Medical Research: An International Review Journal. January 19, 2026

Statistical Methods in Medical Research, Ahead of Print.
Response-adaptive randomization (RAR) can increase participant benefit in clinical trials, but also complicates statistical analysis. The burn-in period—a non-adaptive initial stage—is commonly used to mitigate this disadvantage, yet guidance on its ...

January 19, 2026 doi: 10.1177/09622802251411538 open full text
CiFGNA: Comprehensive information-based functional gene network analysis.
Heewon Park, Seiya Imoto, Satoru Miyano.
Statistical Methods in Medical Research: An International Review Journal. January 16, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 537-551, March 2026.
Heterogeneous gene networks capture coordinated gene activities and systemic disruptions in complex biological processes and diseases, but extracting biologically meaningful insights from these large-scale networks remains challenging due to limited ...

January 16, 2026 doi: 10.1177/09622802251411550 open full text
Joint modeling of composite quantile regression for multiple ordinal longitudinal data with its applications to a dementia dataset.
Shuqing Liang, Lina Bian, Qi Yang, Yuzhu Tian, Maozai Tian.
Statistical Methods in Medical Research: An International Review Journal. January 12, 2026

Statistical Methods in Medical Research, Volume 35, Issue 3, Page 507-536, March 2026.
In the context of longitudinal data regression modeling, individuals often have two or more response indicators, and these response indicators are typically correlated to some extent. Additionally, in the field of clinical medicine, the response ...

January 12, 2026 doi: 10.1177/09622802251412838 open full text
Implementing response-adaptive randomisation in stratified rare-disease trials: Design challenges and practical solutions.
Rajenki Das, Nina Deliu, Mark R Toshner, Sofía S Villar.
Statistical Methods in Medical Research: An International Review Journal. October 06, 2025

Statistical Methods in Medical Research, Ahead of Print.
Although response-adaptive randomisation (RAR) has gained substantial attention in the literature, it still has limited use in clinical trials. Amongst other reasons, the implementation of RAR in real world trials raises important practical questions, ...

October 06, 2025 doi: 10.1177/09622802251380625 open full text
Implementing response-adaptive designs when responses are missing: Impute or ignore?
Mia S Tackney, Sofía S Villar (MRC Biostatistics Unit, University of Cambridge, UK).
Statistical Methods in Medical Research: An International Review Journal. August 29, 2025

Statistical Methods in Medical Research, Ahead of Print.
Missing data is a widespread issue in clinical trials, but is particularly problematic for digital health interventions where disengagement is common and outcomes are likely to be missing not at random (MNAR). Trials that use response-adaptive designs ...

August 29, 2025 doi: 10.1177/09622802251366843 open full text
On the achievability of efficiency bounds for covariate-adjusted response-adaptive randomization.
Jiahui Xin, Wei Ma (Institute of Statistics and Big Data, Renmin University of China, Beijing, China).
Statistical Methods in Medical Research: An International Review Journal. April 01, 2025

Statistical Methods in Medical Research, Ahead of Print.
In the context of precision medicine, covariate-adjusted response-adaptive (CARA) randomization has garnered much attention from both academia and industry due to its benefits in providing ethical and tailored treatment assignments based on patients’ ...

April 01, 2025 doi: 10.1177/09622802251327689 open full text
Statistical analysis of a low cost method for multiple disease prediction.
Bayati, M., Bhaskar, S., Montanari, A.
Statistical Methods in Medical Research: An International Review Journal. December 08, 2016

Early identification of individuals at risk for chronic diseases is of significant clinical value. Early detection provides the opportunity to slow the pace of a condition, and thus help individuals to improve or maintain their quality of life. Additionally, it can lessen the financial burden on health insurers and self-insured employers. As a solution to mitigate the rise in chronic conditions and related costs, an increasing number of employers have recently begun using wellness programs, which typically involve an annual health risk assessment. Unfortunately, these risk assessments have low detection capability, as they should be low-cost and hence rely on collecting relatively few basic biomarkers. Thus one may ask, how can we select a low-cost set of biomarkers that would be the most predictive of multiple chronic diseases? In this paper, we propose a statistical data-driven method to address this challenge by minimizing the number of biomarkers in the screening procedure while maximizing the predictive power over a broad spectrum of diseases. Our solution uses multi-task learning and group dimensionality reduction from machine learning and statistics. We provide empirical validation of the proposed solution using data from two different electronic medical records systems, with comparisons over a statistical benchmark.

December 08, 2016 doi: 10.1177/0962280216680242 open full text
Adjusting for bias in unblinded randomized controlled trials.
Schmidt, A., Groenwold, R.
Statistical Methods in Medical Research: An International Review Journal. December 08, 2016

It may not always be possible to blind participants of a randomized controlled trial for treatment allocation. As a result, estimators of the actual treatment effect may be biased. In this paper, we will extend a novel method, originally introduced in genetic research, for instrumental variable meta-analysis, adjusting for bias due to unblinding of trial participants. Using simulation studies, this novel method, "Egger Correction for non-Adherence", is introduced and compared to the performance of the "intention-to-treat," "as-treated," and conventional "instrumental variable" estimators. Scenarios considered (time-varying) non-adherence, confounding, and between-study heterogeneity. The effect of treatment on a binary endpoint was quantified by means of a risk difference. In all scenarios with unblinded treatment allocation, the Egger Correction for non-Adherence method was the least biased estimator. However, unless the variation in adherence was relatively large, precision was lacking, and power did not surpass 0.50. As a comparison, in a meta-analysis of blinded randomized controlled trials, power of the conventional IV estimator was 1.00 versus at most 0.14 for the Egger Correction for non-Adherence estimator. Due to this lack of precision and power, we suggest using this method mainly as a sensitivity analysis.

December 08, 2016 doi: 10.1177/0962280216680652 open full text
Interval estimation for a proportion using a double-sampling scheme with two fallible classifiers.
Qiu, S.-F., Lian, H., Zou, G., Zeng, X.-S.
Statistical Methods in Medical Research: An International Review Journal. December 08, 2016

Double-sampling schemes using one classifier assessing the whole sample and another classifier assessing a subset of the sample have been introduced for reducing classification errors when an infallible or gold standard classifier is unavailable or impractical. Inference procedures have previously been proposed for situations where an infallible classifier is available for validating a subset of the sample that has already been classified by a fallible classifier. Here, we consider the case where both classifiers are fallible, proposing and evaluating several confidence interval procedures for a proportion under two models, distinguished by the assumption regarding ascertainment of two classifiers. Simulation results suggest that the modified Wald-based confidence interval, Score-based confidence interval, two Bayesian credible intervals, and the percentile Bootstrap confidence interval performed reasonably well even for small binomial proportions and small validated sample under the model with the conditional independence assumption, and the confidence interval derived from the Wald test with nuisance parameters appropriately evaluated, likelihood ratio-based confidence interval, Score-based confidence interval, and the percentile Bootstrap confidence interval performed satisfactorily in terms of coverage under the model without the conditional independence assumption. Moreover, confidence intervals based on log- and logit-transformations also performed well when the binomial proportion and the ratio of the validated sample are not very small under two models. Two examples were used to illustrate the procedures.

December 08, 2016 doi: 10.1177/0962280216681599 open full text
Detection of imprinting effects for qualitative traits on X chromosome based on nuclear families.
Zhou, J.-Y., You, X.-P., Yang, R., Fung, W. K.
Statistical Methods in Medical Research: An International Review Journal. December 05, 2016

Methods for detecting imprinting effects have been developed primarily for autosomal markers. However, no method is available in the literature to test for imprinting effects on X chromosome. Therefore, it is necessary to suggest methods for detecting such imprinting effects. In this article, the parental-asymmetry test on X chromosome (XPAT) is first developed to test for imprinting for qualitative traits in the presence of association, based on family trios each with both parents and their affected daughter. Then, we propose 1-XPAT to deal with parent–daughter pairs, each with one parent and his/her affected daughter. By simultaneously considering family trios and parent–daughter pairs, C-XPAT (the combined test statistic of XPAT and 1-XPAT) is constructed to test for imprinting. Further, we extend the proposed methods to accommodate complete (with both parents) and incomplete (with one parent) nuclear families having multiple daughters of which at least one is affected. Simulation results demonstrate that the proposed methods control the size well, irrespective of the inbreeding coefficient in females being zero or non-zero. By incorporating incomplete nuclear families, C-XPAT is more powerful than XPAT using only complete nuclear families. For practical use, these proposed methods are applied to analyse the rheumatoid arthritis data and Turner’s syndrome data.

December 05, 2016 doi: 10.1177/0962280216680243 open full text
Maximizing the usefulness of statistical classifiers for two populations with illustrative applications.
Jeske, D. R., Smith, S.
Statistical Methods in Medical Research: An International Review Journal. December 05, 2016

The usefulness of two-class statistical classifiers is limited when one or both of the conditional misclassification rates is unacceptably high. Incorporating a neutral zone region into the classifier provides a mechanism to refer ambiguous cases to follow-up where additional information might be obtained to clarify the classification decision. Through the use of the neutral zone region, the conditional misclassification rates can be controlled and the classifier becomes useful. Three real-life examples, including applications to prostate cancer and kidney dysfunction following heart surgery, are used to illustrate how neutral zone regions can extract utility from disappointing classifiers that might otherwise be abandoned.

December 05, 2016 doi: 10.1177/0962280216680244 open full text
Is the classical Wald test always suitable under response-adaptive randomization?
Baldi Antognini, A., Vagheggini, A., Zagoraiou, M.
Statistical Methods in Medical Research: An International Review Journal. December 05, 2016

The aim of this paper is to analyze the impact of response-adaptive randomization rules for normal response trials intended to test the superiority of one of two available treatments. Taking into account the classical Wald test, we show how response-adaptive methodology could induce a consistent loss of inferential precision. Then, we suggest a modified version of the Wald test which, by using the current allocation proportion to the treatments as a consistent estimator of the target, avoids some degenerate scenarios and so it should be preferable to the classical test. Furthermore, we show both analytically and via simulations how some target allocations may induce a locally decreasing power function. Thus, we derive the conditions on the target guaranteeing its monotonicity and we show how a correct choice of the initial sample size allows one to overcome this drawback regardless of the adopted target.

December 05, 2016 doi: 10.1177/0962280216680241 open full text
Use of the concordance index for predictors of censored survival data.
Brentnall, A. R., Cuzick, J.
Statistical Methods in Medical Research: An International Review Journal. December 05, 2016

The concordance index is often used to measure how well a biomarker predicts the time to an event. Estimators of the concordance index for predictors of right-censored data are reviewed, including those based on censored pairs, inverse probability weighting and a proportional-hazards model. Predictive and prognostic biomarkers often lose strength with time, and in this case the aforementioned statistics depend on the length of follow up. A semi-parametric estimator of the concordance index is developed that accommodates converging hazards through a single parameter in a Pareto model. Concordance index estimators are assessed through simulations, which demonstrate substantial bias of classical censored-pairs and proportional-hazards model estimators. Prognostic biomarkers in a cohort of women diagnosed with breast cancer are evaluated using new and classical estimators of the concordance index.

December 05, 2016 doi: 10.1177/0962280216680245 open full text
An optimal Wilcoxon-Mann-Whitney test of mortality and a continuous outcome.
Matsouaka, R. A., Singhal, A. B., Betensky, R. A.
Statistical Methods in Medical Research: An International Review Journal. December 05, 2016

We consider a two-group randomized clinical trial, where mortality affects the assessment of a follow-up continuous outcome. Using the worst-rank composite endpoint, we develop a weighted Wilcoxon–Mann–Whitney test statistic to analyze the data. We determine the optimal weights for the Wilcoxon–Mann–Whitney test statistic that maximize its power. We derive a formula for its power and demonstrate its accuracy in simulations. Finally, we apply the method to data from an acute ischemic stroke clinical trial of normobaric oxygen therapy.

December 05, 2016 doi: 10.1177/0962280216680524 open full text
A graphical perspective of marginal structural models: An application for the estimation of the effect of physical activity on blood pressure.
Talbot, D., Rossi, A. M., Bacon, S. L., Atherton, J., Lefebvre, G.
Statistical Methods in Medical Research: An International Review Journal. December 05, 2016

Estimating causal effects requires important prior subject-matter knowledge and, sometimes, sophisticated statistical tools. The latter is especially true when targeting the causal effect of a time-varying exposure in a longitudinal study. Marginal structural models are a relatively new class of causal models that effectively deal with the estimation of the effects of time-varying exposures. Marginal structural models have traditionally been embedded in the counterfactual framework to causal inference. In this paper, we use the causal graph framework to enhance the implementation of marginal structural models. We illustrate our approach using data from a prospective cohort study, the Honolulu Heart Program. These data consist of 8006 men at baseline. To illustrate our approach, we focused on the estimation of the causal effect of physical activity on blood pressure, which were measured at three time points. First, a causal graph is built to encompass prior knowledge. This graph is then validated and improved utilizing structural equation models. We estimated the aforementioned causal effect using marginal structural models for repeated measures and guided the implementation of the models with the causal graph. By employing the causal graph framework, we also show the validity of fitting conditional marginal structural models for repeated measures in the context implied by our data.

December 05, 2016 doi: 10.1177/0962280216680834 open full text
A more efficient three-arm non-inferiority test based on pooled estimators of the homogeneous variance.
Lu, H., Jin, H., Zeng, W.
Statistical Methods in Medical Research: An International Review Journal. December 05, 2016

Hida and Tango established a statistical testing framework for the three-arm non-inferiority trial including a placebo with a pre-specified non-inferiority margin to overcome the shortcomings of traditional two-arm non-inferiority trials (such as having to choose the non-inferiority margin). In this paper, we propose a new method that improves their approach with respect to two aspects. We construct our testing statistics based on the best unbiased pooled estimators of the homogeneous variance; and we use the principle of intersection-union tests to determine the rejection rule. We theoretically prove that our test is better than that of Hida and Tango for large sample sizes. Furthermore, when the sample size was small or moderate, our simulation studies showed that our approach performed better than Hida and Tango’s. Although both controlled the type I error rate, their test was more conservative and the statistical power of our test was higher.

December 05, 2016 doi: 10.1177/0962280216681036 open full text
Mixed hidden Markov quantile regression models for longitudinal data with possibly incomplete sequences.
Marino, M. F., Tzavidis, N., Alfo, M.
Statistical Methods in Medical Research: An International Review Journal. November 28, 2016

Quantile regression provides a detailed and robust picture of the distribution of a response variable, conditional on a set of observed covariates. Recently, it has been extended to the analysis of longitudinal continuous outcomes using either time-constant or time-varying random parameters. However, in real-life data, we frequently observe both temporal shocks in the overall trend and individual-specific heterogeneity in model parameters. A benchmark dataset on HIV progression gives a clear example. Here, the evolution of the CD4 log counts exhibits both sudden temporal changes in the overall trend and heterogeneity in the effect of the time since seroconversion on the response dynamics. To accommodate such situations, we propose a quantile regression model, where time-varying and time-constant random coefficients are jointly considered. Since observed data may be incomplete due to early drop-out, we also extend the proposed model in a pattern mixture perspective. We assess the performance of the proposals via a large-scale simulation study and the analysis of the CD4 count data.

November 28, 2016 doi: 10.1177/0962280216678433 open full text
A simple method to estimate the time-dependent receiver operating characteristic curve and the area under the curve with right censored data.
Li, L., Greene, T., Hu, B.
Statistical Methods in Medical Research: An International Review Journal. November 27, 2016

The time-dependent receiver operating characteristic curve is often used to study the diagnostic accuracy of a single continuous biomarker, measured at baseline, on the onset of a disease condition when the disease onset may occur at different times during the follow-up and hence may be right censored. Due to right censoring, the true disease onset status prior to the pre-specified time horizon may be unknown for some patients, which causes difficulty in calculating the time-dependent sensitivity and specificity. We propose to estimate the time-dependent sensitivity and specificity by weighting the censored data by the conditional probability of disease onset prior to the time horizon given the biomarker, the observed time to event, and the censoring indicator, with the weights calculated nonparametrically through a kernel regression on time to event. With this nonparametric weighting adjustment, we derive a novel, closed-form formula to calculate the area under the time-dependent receiver operating characteristic curve. We demonstrate through numerical study and theoretical arguments that the proposed method is insensitive to misspecification of the kernel bandwidth, produces unbiased and efficient estimators of time-dependent sensitivity and specificity, the area under the curve, and other estimands from the receiver operating characteristic curve, and outperforms several other published methods currently implemented in R packages.

November 27, 2016 doi: 10.1177/0962280216680239 open full text
Measurement error, time lag, unmeasured confounding: Considerations for longitudinal estimation of the effect of a mediator in randomised clinical trials.
Goldsmith, K., Chalder, T., White, P., Sharpe, M., Pickles, A.
Statistical Methods in Medical Research: An International Review Journal. November 24, 2016

Clinical trials are expensive and time-consuming and so should also be used to study how treatments work, allowing for the evaluation of theoretical treatment models and refinement and improvement of treatments. These treatment processes can be studied using mediation analysis. Randomised treatment makes some of the assumptions of mediation models plausible, but the mediator–outcome relationship could remain subject to bias. In addition, mediation is assumed to be a temporally ordered longitudinal process, but estimation in most mediation studies to date has been cross-sectional and unable to explore this assumption. This study used longitudinal structural equation modelling of mediator and outcome measurements from the PACE trial of rehabilitative treatments for chronic fatigue syndrome (ISRCTN 54285094) to address these issues. In particular, autoregressive and simplex models were used to study measurement error in the mediator, different time lags in the mediator–outcome relationship, unmeasured confounding of the mediator and outcome, and the assumption of a constant mediator–outcome relationship over time. Results showed that allowing for measurement error and unmeasured confounding was important. Contemporaneous rather than lagged mediator–outcome effects were more consistent with the data, possibly due to the wide spacing of measurements. Assuming a constant mediator–outcome relationship over time increased precision.

November 24, 2016 doi: 10.1177/0962280216666111 open full text
Linear-rank testing of a non-binary, responder-analysis, efficacy score to evaluate pharmacotherapies for substance use disorders.
Holmes, T. H., Li, S.-H., McCann, D. J.
Statistical Methods in Medical Research: An International Review Journal. November 23, 2016

The design of pharmacological trials for management of substance use disorders is shifting toward outcomes of successful individual-level behavior (abstinence or no heavy use). While binary success/failure analyses are common, McCann and Li (CNS Neurosci Ther 2012; 18: 414–418) introduced "number of beyond-threshold weeks of success" (NOBWOS) scores to avoid dichotomized outcomes. NOBWOS scoring employs an efficacy "hurdle" with values reflecting duration of success. Here, we evaluate NOBWOS scores rigorously. Formal analysis of the mathematical structure of NOBWOS scores is followed by simulation studies spanning diverse conditions to assess the operating characteristics of five linear-rank tests on NOBWOS scores. Simulations include assessment of Fisher’s exact test applied to the hurdle component. On average, statistical power was approximately equal for the five linear-rank tests. Under none of the conditions examined did Fisher’s exact test exhibit greater statistical power than any of the linear-rank tests. These linear-rank tests provide good Type I and Type II error control for comparing distributions of NOBWOS scores between groups (e.g. active vs. placebo). All methods were applied to re-analyses of data from four clinical trials of differing lengths and substances of abuse. These linear-rank tests agreed across all trials in rejecting (or not) their null (equality of distributions) at ≤ 0.05.

November 23, 2016 doi: 10.1177/0962280216677317 open full text
Sample size and power for a stratified doubly randomized preference design.
Cameron, B., Esserman, D. A.
Statistical Methods in Medical Research: An International Review Journal. November 21, 2016

The two-stage (or doubly) randomized preference trial design is an important tool for researchers seeking to disentangle the role of patient treatment preference on treatment response through estimation of selection and preference effects. Up until now, these designs have been limited by their assumption of equal preference rates and effect sizes across the entire study population. We propose a stratified two-stage randomized trial design that addresses this limitation. We begin by deriving stratified test statistics for the treatment, preference, and selection effects. Next, we develop a sample size formula for the number of patients required to detect each effect. The properties of the model and the efficiency of the design are established using a series of simulation studies. We demonstrate the applicability of the design using a study of Hepatitis C treatment modality, specialty clinic versus mobile medical clinic. In this example, a stratified preference design (stratified by alcohol/drug use) may more closely capture the true distribution of patient preferences and allow for a more efficient design than a design which ignores these differences (unstratified version).

November 21, 2016 doi: 10.1177/0962280216677573 open full text
Sample size calculation for agreement between two raters with binary endpoints using exact tests.
Shan, G.
Statistical Methods in Medical Research: An International Review Journal. November 16, 2016

In an agreement test between two raters with binary endpoints, existing methods for sample size calculation are always based on asymptotic approaches that use limiting distributions of a test statistic under null and alternative hypotheses. These calculated sample sizes may not be reliable due to the unsatisfactory type I error control of asymptotic approaches. We propose a new sample size calculation based on exact approaches which control for the type I error rate. Two exact approaches are considered: one based on maximization and the other based on estimation and maximization. We found that the latter approach is generally more powerful than the one based on maximization. Therefore, we present the sample size calculation based on estimation and maximization. A real example from a clinical trial to diagnose low back pain of patients is used to illustrate the two exact testing procedures and sample size determination.

November 16, 2016 doi: 10.1177/0962280216676854 open full text
The asymptotic maximal procedure for subject randomization in clinical trials.
Zhao, W., Berger, V. W., Yu, Z.
Statistical Methods in Medical Research: An International Review Journal. November 16, 2016

The maximal procedure is a restricted randomization method that maximizes the number of feasible allocation sequences under the constraints of the maximum tolerated imbalance and the allocation sequence length. It assigns an equal probability to all feasible sequences. However, its implementation is not easy due to the lack of the Markovian property of the conditional allocation probabilities. In this paper, we propose the asymptotic maximal procedure, which replaces the sequence-length-dependent conditional allocation probabilities with their asymptotic values. The new randomization procedure is compared with the original maximal procedure and a few other randomization procedures with the maximum tolerated imbalance via simulations and is found to be a practical choice for future clinical trials.

November 16, 2016 doi: 10.1177/0962280216677107 open full text
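
The original maximal procedure described in the abstract above (equal probability on every feasible allocation sequence) can be sketched for a two-arm trial as follows. This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and whether the sequence must end balanced is treated as an assumption exposed through the `require_final_balance` flag. The conditional probability of assigning arm A is the share of feasible completions that start with A.

```python
from functools import lru_cache


def make_counter(n, mti, require_final_balance=True):
    """Return a function counting feasible completions of an allocation sequence.

    n: total sequence length; mti: maximum tolerated imbalance.
    require_final_balance is an assumption about the design, not taken from the abstract.
    """

    @lru_cache(maxsize=None)
    def completions(k, d):
        # k allocations made so far, d = (number in A) - (number in B).
        if abs(d) > mti:
            return 0
        if k == n:
            return 1 if (d == 0 or not require_final_balance) else 0
        return completions(k + 1, d + 1) + completions(k + 1, d - 1)

    return completions


def prob_assign_a(n, mti, k, d, require_final_balance=True):
    """Conditional probability that allocation k + 1 goes to arm A."""
    completions = make_counter(n, mti, require_final_balance)
    total = completions(k, d)
    if total == 0:
        raise ValueError("state (k, d) is not feasible")
    return completions(k + 1, d + 1) / total
```

In this sketch the probability depends on the position `k` and the length `n` as well as on the current imbalance `d`, which is the sequence-length dependence that the abstract says the asymptotic version removes.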
Variable selection for mixture and promotion time cure rate models.
Masud, A., Tu, W., Yu, Z.
Statistical Methods in Medical Research: An International Review Journal. November 16, 2016

Failure-time data with cured patients are common in clinical studies. Data from these studies are typically analyzed with cure rate models. Variable selection methods have not been well developed for cure rate models. In this research, we propose two methods based on the least absolute shrinkage and selection operator for variable selection in mixture and promotion time cure models with parametric or nonparametric baseline hazards. We conduct an extensive simulation study to assess the operating characteristics of the proposed methods. We illustrate the use of the methods using data from a study of childhood wheezing.

November 16, 2016 doi: 10.1177/0962280216677748 open full text
Analysis of an incomplete longitudinal composite variable using a marginalized random effects model and multiple imputation.
Gosho, M., Maruo, K., Ishii, R., Hirakawa, A.
Statistical Methods in Medical Research: An International Review Journal. November 16, 2016

The total score, which is calculated as the sum of scores in multiple items or questions, is repeatedly measured in longitudinal clinical studies. A mixed effects model for repeated measures method is often used to analyze these data; however, if one or more individual items are not measured, the method cannot be directly applied to the total score. We develop two simple and interpretable procedures that infer fixed effects for a longitudinal continuous composite variable. These procedures consider that the items that compose the total score are multivariate longitudinal continuous data and, simultaneously, handle subject-level and item-level missing data. One procedure is based on a multivariate marginalized random effects model with a multiple of Kronecker product covariance matrices for serial time dependence and correlation among items. The other procedure is based on a multiple imputation approach with a multivariate normal model. In terms of the type-1 error rate and the bias of treatment effect in total score, the marginalized random effects model and multiple imputation procedures performed better than the standard mixed effects model for repeated measures analysis with listwise deletion and single imputations for handling item-level missing data. In particular, the mixed effects model for repeated measures with listwise deletion resulted in substantial inflation of the type-1 error rate. The marginalized random effects model and multiple imputation methods provide for a more efficient analysis by fully utilizing the partially available data, compared to the mixed effects model for repeated measures method with listwise deletion.

November 16, 2016 doi: 10.1177/0962280216677879 open full text
Evolution of association between renal and liver functions while awaiting heart transplant: An application using a bivariate multiphase nonlinear mixed effects model.
Rajeswaran, J., Blackstone, E. H., Barnard, J.
Statistical Methods in Medical Research: An International Review Journal. November 16, 2016

In many longitudinal follow-up studies, we observe more than one longitudinal outcome. Impaired renal and liver functions are indicators of poor clinical outcomes for patients who are on mechanical circulatory support and awaiting heart transplant. Hence, monitoring organ functions while waiting for heart transplant is an integral part of patient management. Longitudinal measurements of bilirubin can be used as a marker for liver function and glomerular filtration rate for renal function. We derive an approximation to the evolution of the association between these two organ functions using a bivariate nonlinear mixed effects model for continuous longitudinal measurements, where the two submodels are linked by a common distribution of time-dependent latent variables and a common distribution of measurement errors.

November 16, 2016 doi: 10.1177/0962280216678022 open full text
A novel nonparametric confidence interval for differences of proportions for correlated binary data.
Duan, C., Cao, Y., Zhou, L., Tan, M. T., Chen, P.
Statistical Methods in Medical Research: An International Review Journal. November 16, 2016

Various confidence interval estimators have been developed for differences in proportions resulting from correlated binary data. However, the most recommended interval, Tango’s score confidence interval, tends to be wide, and the computing burden of exact methods recommended for small-sample data is intensive. The recently proposed rank-based nonparametric method, which treats proportions as special areas under receiver operating characteristic curves, provided a new way to construct the confidence interval for the proportion difference on paired data, but its complex computation limits its application in practice. In this article, we develop a new nonparametric method utilizing the U-statistics approach for comparing two or more correlated areas under receiver operating characteristic curves. The new confidence interval has a simple analytic form with a new estimate of the degrees of freedom of n – 1. It demonstrates good coverage properties and has shorter confidence interval widths than that of Tango. This new confidence interval with the new estimate of degrees of freedom also leads to coverage probabilities that are an improvement on the rank-based nonparametric confidence interval. Compared with the approximate exact unconditional method, the nonparametric confidence interval demonstrates good coverage properties even in small samples, and yet it is very easy to implement computationally. This nonparametric procedure is evaluated using simulation studies and illustrated with three real examples. The simplified nonparametric confidence interval is an appealing choice in practice for its ease of use and good performance.

November 16, 2016 doi: 10.1177/0962280216679040 open full text
A Markov chain representation of the multiple testing problem.
Cabras, S.
Statistical Methods in Medical Research: An International Review Journal. November 16, 2016

The problem of multiple hypothesis testing can be represented as a Markov process where a new alternative hypothesis is accepted in accordance with its relative evidence to the currently accepted one. This virtual and not formally observed process provides the most probable set of non-null hypotheses given the data; it plays the same role as Markov Chain Monte Carlo in approximating a posterior distribution. To apply this representation and obtain the posterior probabilities over all alternative hypotheses, it is enough to have, for each test, barely defined Bayes Factors, e.g. Bayes Factors obtained up to an unknown constant. Such Bayes Factors may arise either from using default and improper priors or from calibrating p-values with respect to their corresponding Bayes Factor lower bound. Both sources of evidence are used to form a Markov transition kernel on the space of hypotheses. The approach leads to easily interpretable results and involves very simple formulas suitable for analyzing large datasets such as those arising from gene expression data (microarray or RNA-seq experiments).

November 16, 2016 doi: 10.1177/0962280216628903 open full text
Bayesian correction for covariate measurement error: A frequentist evaluation and comparison with regression calibration.
Bartlett, J. W., Keogh, R. H.
Statistical Methods in Medical Research: An International Review Journal. November 09, 2016

Bayesian approaches for handling covariate measurement error are well established and yet arguably are still relatively little used by researchers. For some this is likely due to unfamiliarity or disagreement with the Bayesian inferential paradigm. For others a contributory factor is the inability of standard statistical packages to perform such Bayesian analyses. In this paper, we first give an overview of the Bayesian approach to handling covariate measurement error, and contrast it with regression calibration, arguably the most commonly adopted approach. We then argue why the Bayesian approach has a number of statistical advantages compared to regression calibration and demonstrate that implementing the Bayesian approach is usually quite feasible for the analyst. Next, we describe the closely related maximum likelihood and multiple imputation approaches and explain why we believe the Bayesian approach to generally be preferable. We then empirically compare the frequentist properties of regression calibration and the Bayesian approach through simulation studies. The flexibility of the Bayesian approach to handle both measurement error and missing data is then illustrated through an analysis of data from the Third National Health and Nutrition Examination Survey.

November 09, 2016 doi: 10.1177/0962280216667764 open full text
Diagnostic checks in mixture cure models with interval-censoring.
Scolas, S., Legrand, C., Oulhaj, A., El Ghouch, A.
Statistical Methods in Medical Research: An International Review Journal. November 04, 2016

Models for interval-censored survival data presenting a fraction of "cure" or "immune" patients have recently been proposed in the literature, particularly extending the mixture cure model to interval-censored data. However, little is known about the goodness-of-fit of such models. In a mixture cure model, the survival distribution of the entire population is improper and expressed in terms of the survival distribution of uncured individuals, i.e. the latency part of the model, and the probability to experience the event of interest, i.e. the incidence part. To validate a mixture cure model, assumptions made on both parts need to be checked, i.e. the survival distribution of uncured individuals, the link function used in the latency and the linearity of the covariates used in both parts of the model. In this work, we investigate the Cox-Snell and deviance residuals and show how they can be adapted and used to perform diagnostic checks when all subjects are right- or interval-censored and some subjects are cured with unknown cure status. A large simulation study investigates the ability of these residuals to detect a departure from the assumptions of the mixture model. Developed techniques are applied to a real data set about Alzheimer’s disease.

November 04, 2016 doi: 10.1177/0962280216676502 open full text
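
The decomposition described in the abstract above can be written out in the standard mixture cure form. The notation is an assumption chosen for illustration and is not taken from the paper: $p(z)$ denotes the incidence part (probability of experiencing the event given covariates $z$), and $S_u(t \mid x)$ the latency part (survival function of uncured individuals given covariates $x$).

```latex
S_{\mathrm{pop}}(t \mid x, z) = 1 - p(z) + p(z)\, S_u(t \mid x),
\qquad
\lim_{t \to \infty} S_{\mathrm{pop}}(t \mid x, z) = 1 - p(z).
```

Since $S_u(t \mid x) \to 0$ as $t \to \infty$, the population survival function levels off at the cure fraction $1 - p(z)$ rather than at zero, which is the sense in which it is improper.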
Integrating genomic signatures for treatment selection with Bayesian predictive failure time models.
Ma, J., Hobbs, B. P., Stingo, F. C.
Statistical Methods in Medical Research: An International Review Journal. November 01, 2016

Over the past decade, a tremendous amount of resources have been dedicated to the pursuit of developing genomic signatures that effectively match patients with targeted therapies. Although dozens of therapies that target DNA mutations have been developed, the practice of studying single candidate genes has limited our understanding of cancer. Moreover, many studies of multiple-gene signatures have been conducted for the purpose of identifying prognostic risk cohorts, and thus are limited for selecting personalized treatments. Existing statistical methods for treatment selection often model treatment-by-covariate interactions that are difficult to specify, and require prohibitively large patient cohorts. In this article, we describe a Bayesian predictive failure time model for treatment selection that integrates multiple-gene signatures. Our approach relies on a heuristic measure of similarity that determines the extent to which historically treated patients contribute to the outcome prediction of new patients. The similarity measure, which can be obtained from existing clustering methods, imparts robustness to the underlying stochastic data structure, which enhances feasibility in the presence of small samples. Performance of the proposed method is evaluated in simulation studies, and its application is demonstrated through a study of lung squamous cell carcinoma. Our Bayesian predictive failure time approach is shown to effectively leverage genomic signatures to match patients to the therapies that are most beneficial for prolonging their survival.

November 01, 2016 doi: 10.1177/0962280216675373 open full text
Dynamic longitudinal discriminant analysis using multiple longitudinal markers of different types.
Hughes, D. M., Komarek, A., Czanner, G., Garcia-Finana, M.
Statistical Methods in Medical Research: An International Review Journal. October 26, 2016

There is an emerging need in clinical research to accurately predict patients’ disease status and disease progression by optimally integrating multivariate clinical information. Clinical data are often collected over time for multiple biomarkers of different types (e.g. continuous, binary and counts). In this paper, we present a flexible and dynamic (time-dependent) discriminant analysis approach in which multiple biomarkers of various types are jointly modelled for classification purposes by the multivariate generalized linear mixed model. We propose a mixture of normal distributions for the random effects to allow additional flexibility when modelling the complex correlation between longitudinal biomarkers and to robustify the model and the classification procedure against misspecification of the random effects distribution. These longitudinal models are subsequently used in a multivariate time-dependent discriminant scheme to predict, at any time point, the probability of belonging to a particular risk group. The methodology is illustrated using clinical data from patients with epilepsy, where the aim is to identify patients who will not achieve remission of seizures within a five-year follow-up period.

October 26, 2016 doi: 10.1177/0962280216674496 open full text
Optimal threshold estimator of a prognostic marker by maximizing a time-dependent expected utility function for a patient-centered stratified medicine.
Dantan, E., Foucher, Y., Lorent, M., Giral, M., Tessier, P.
Statistical Methods in Medical Research: An International Review Journal. October 20, 2016

Defining thresholds of prognostic markers is essential for stratified medicine. Such thresholds are mostly estimated from purely statistical measures regardless of patient preferences, potentially leading to unacceptable medical decisions. Quality-Adjusted Life-Years are a widely used preference-based measure of health outcomes. We develop a time-dependent Quality-Adjusted Life-Years-based expected utility function for censored data that should be maximized to estimate an optimal threshold. We performed a simulation study to compare estimated thresholds when using the proposed expected utility approach and purely statistical estimators. Two applications illustrate the usefulness of the proposed methodology, which was implemented in the R package ROCt (www.divat.fr). First, by reanalysing data of a randomized clinical trial comparing the efficacy of prednisone vs. placebo in patients with chronic liver cirrhosis, we demonstrate the utility of treating patients with a prothrombin level higher than 89%. Second, we reanalyse the data of an observational cohort of kidney transplant recipients and conclude that the Kidney Transplant Failure Score is not useful for adapting the frequency of clinical visits. Applying such a patient-centered methodology may improve future transfer of novel prognostic scoring systems or markers into clinical practice.

October 20, 2016 doi: 10.1177/0962280216671161 open full text
On the comparison of risk of death according to different stages of breast cancer via the long-term exponentiated Weibull hazard model.
Souza, H. C. C. d., Perdona, G. d. S. C., Louzada, F., Peria, F. M.
Statistical Methods in Medical Research: An International Review Journal. October 20, 2016

Long-term survivor models have been extensively used for modelling time-to-event data with a significant proportion of patients who do not experience poor outcome. In this paper, we propose a new long-term survivor hazard model, which accommodates comprehensive families of cure rate models as particular cases, including modified Weibull, exponentiated Weibull, Weibull, exponential and Rayleigh distribution, among others. The maximum likelihood estimation procedure is presented. A simulation study evaluates bias and mean square error of the considered estimation procedure as well as the coverage probabilities of the asymptotic and bootstrap confidence intervals for the parameters. A real Brazilian dataset on breast cancer illustrates the methodology. From the practical point of view, under our modelling, we provide a parameter that works as a metric to quantify and compare the risk between different stages of the disease. We emphasize that we developed an online platform for oncologists to calculate, in real time, the probability of survival of patients diagnosed with breast cancer according to the stage of the disease.

October 20, 2016 doi: 10.1177/0962280216673245 open full text
A Bayesian hierarchical model for demand curve analysis.
Ho, Y.-Y., Vo, T. N., Chu, H., Luo, X., Le, C. T.
Statistical Methods in Medical Research: An International Review Journal. October 20, 2016

Drug self-administration experiments are a frequently used approach to assessing the abuse liability and reinforcing property of a compound. They have been used to assess the abuse liabilities of various substances such as psychomotor stimulants and hallucinogens, food, nicotine, and alcohol. The demand curve generated from a self-administration study describes how demand for a drug or non-drug reinforcer varies as a function of price. With the approval of the 2009 Family Smoking Prevention and Tobacco Control Act, demand curve analysis provides crucial evidence to inform the US Food and Drug Administration’s policy on tobacco regulation, because it produces several important quantitative measurements to assess the reinforcing strength of nicotine. The conventional approach popularly used to analyze the demand curve data is individual-specific non-linear least squares regression. The non-linear least squares approach sets out to minimize the residual sum of squares for each subject in the dataset; however, this one-subject-at-a-time approach does not allow for the estimation of between- and within-subject variability in a unified model framework. In this paper, we review the existing approaches to analyzing demand curve data, namely non-linear least squares regression and mixed effects regression, and propose a new Bayesian hierarchical model. We conduct simulation analyses to compare the performance of these three approaches and illustrate the proposed approaches in a case study of nicotine self-administration in rats. We present simulation results and discuss the benefits of using the proposed approaches.

October 20, 2016 doi: 10.1177/0962280216673675 open full text
Non-parametric estimation of transition probabilities in non-Markov multi-state models: The landmark Aalen-Johansen estimator.
Putter, H., Spitoni, C.
Statistical Methods in Medical Research: An International Review Journal. October 20, 2016

The topic of non-parametric estimation of transition probabilities in non-Markov multi-state models has seen a remarkable surge of activity recently. Two recent papers have used the idea of subsampling in this context. The first paper, by de Uña Álvarez and Meira-Machado, uses a procedure based on (differences between) Kaplan–Meier estimators derived from a subset of the data consisting of all subjects observed to be in the given state at the given time. The second, by Titman, derived estimators of transition probabilities that are consistent in general non-Markov multi-state models. Here, we show that the same idea of subsampling, used in both these papers, combined with the Aalen–Johansen estimate of the state occupation probabilities derived from that subset, can also be used to obtain a relatively simple and intuitive procedure which we term landmark Aalen–Johansen. We show that the landmark Aalen–Johansen estimator yields a consistent estimator of the transition probabilities in general non-Markov multi-state models under the same conditions as needed for consistency of the Aalen–Johansen estimator of the state occupation probabilities. Simulation studies show that the landmark Aalen–Johansen estimator has good small sample properties and is slightly more efficient than the other estimators.

October 20, 2016 doi: 10.1177/0962280216674497 open full text
Efficient nonparametric confidence bands for receiver operating-characteristic curves.
Martinez-Camblor, P., Perez-Fernandez, S., Corral, N.
Statistical Methods in Medical Research: An International Review Journal. October 17, 2016

The receiver operating-characteristic curve is a popular graphical method frequently used to study the diagnostic capacity of continuous (bio)markers. In spite of the existence of a huge number of papers devoted to both theoretical and practical aspects of this topic, the construction of confidence bands has had little impact in the specialized literature. As far as the authors know, there are only three R packages on CRAN providing receiver operating-characteristic curve confidence regions: plotROC, pROC and fbroc. This work tries to fill this gap by studying and proposing a new nonparametric method to build confidence bands for both the standard receiver operating-characteristic curve and its generalization for nonmonotone relationships. The behavior of the proposed procedure is studied via Monte Carlo simulations and the methodology is applied to two real-world biomedical problems. In addition, an R function to compute the proposed and some of the previously existing methodologies is provided as online supplementary material.

October 17, 2016 doi: 10.1177/0962280216672490 open full text
Group-based multi-trajectory modeling.
Nagin, D. S., Jones, B. L., Lima Passos, V., Tremblay, R. E.
Statistical Methods in Medical Research: An International Review Journal. October 17, 2016

Identifying and monitoring multiple disease biomarkers and other clinically important factors affecting the course of a disease, behavior or health status is of great clinical relevance. Yet conventional statistical practice generally falls far short of taking full advantage of the information available in multivariate longitudinal data for tracking the course of the outcome of interest. We demonstrate a method called multi-trajectory modeling that is designed to overcome this limitation. The method is a generalization of group-based trajectory modeling. Group-based trajectory modeling is designed to identify clusters of individuals who are following similar trajectories of a single indicator of interest such as post-operative fever or body mass index. Multi-trajectory modeling identifies latent clusters of individuals following similar trajectories across multiple indicators of an outcome of interest (e.g., the health status of chronic kidney disease patients as measured by their eGFR, hemoglobin, blood CO₂ levels). Multi-trajectory modeling is an application of finite mixture modeling. We lay out the underlying likelihood function of the multi-trajectory model and demonstrate its use with two examples.

October 17, 2016 doi: 10.1177/0962280216673085 open full text
Time-dependent efficacy of longitudinal biomarker for clinical endpoint.
Kolamunnage-Dona, R., Williamson, P. R.
Statistical Methods in Medical Research: An International Review Journal. October 17, 2016

Joint modelling of longitudinal biomarker and event-time processes has gained popularity in recent years as it yields more accurate and precise estimates. Considering this modelling framework, a new methodology for evaluating the time-dependent efficacy of a longitudinal biomarker for clinical endpoint is proposed in this article. In particular, the proposed model assesses how well longitudinally repeated measurements of a biomarker over various time periods (0,t) distinguish between individuals who developed the disease by time t and individuals who remain disease-free beyond time t. The receiver operating characteristic curve is used to provide the corresponding efficacy summaries at various t based on the association between longitudinal biomarker trajectory and risk of clinical endpoint prior to each time point. The model also allows detecting the time period over which a biomarker should be monitored for its best discriminatory value. The proposed approach is evaluated through simulation and illustrated on the motivating dataset from a prospective observational study of biomarkers to diagnose the onset of sepsis.

October 17, 2016 doi: 10.1177/0962280216673084 open full text
Estimating age-specific reproductive numbers--A comparison of methods.
Moser, C. B., White, L. F.
Statistical Methods in Medical Research: An International Review Journal. October 17, 2016

Large outbreaks, such as those caused by influenza, put a strain on resources necessary for their control. In particular, children have been shown to play a key role in influenza transmission during recent outbreaks, and targeted interventions, such as school closures, could positively impact the course of emerging epidemics. As an outbreak is unfolding, it is important to be able to estimate reproductive numbers that incorporate this heterogeneity and to use surveillance data that is routinely collected to more effectively target interventions and obtain an accurate understanding of transmission dynamics. There are a growing number of methods that estimate age-group specific reproductive numbers with limited data that build on methods assuming a homogeneously mixing population. In this article, we introduce a new approach that is flexible and improves on many aspects of existing methods. We apply this method to influenza data from two outbreaks, the 2009 H1N1 outbreaks in South Africa and Japan, to estimate age-group specific reproductive numbers and compare it to three other methods that also use existing data from social mixing surveys to quantify contact rates among different age groups. In this exercise, all estimates of the reproductive numbers for children exceeded the critical threshold of one and in most cases exceeded those of adults. We introduce a flexible new method to estimate reproductive numbers that describe heterogeneity in the population.

October 17, 2016 doi: 10.1177/0962280216673676 open full text
Compositional data analysis in epidemiology.
Mert, M. C., Filzmoser, P., Endel, G., Wilbacher, I.
Statistical Methods in Medical Research: An International Review Journal. October 06, 2016

Compositional data analysis refers to analyzing relative information, based on ratios between the variables in a data set. Data from epidemiology are usually treated as absolute information in an analysis. We outline the differences between the two approaches for univariate and multivariate statistical analyses, using illustrative data sets from Austrian districts. Not only can the results of the analyses differ, but so, in particular, can their interpretation. It is demonstrated that the compositional data analysis approach leads to new and interesting insights.
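
To make "relative information, based on ratios" concrete, the sketch below applies the centred log-ratio (clr) transform, a standard tool in compositional data analysis. The abstract does not say which transformation the authors use, so the choice of clr and the function name are assumptions of this illustration.

```python
import numpy as np

def clr(parts):
    """Centred log-ratio transform of one composition or of each row of a matrix.

    Each part is expressed as the log of its ratio to the geometric mean
    of all parts, so only relative information is retained.
    """
    parts = np.asarray(parts, dtype=float)
    if np.any(parts <= 0):
        raise ValueError("clr requires strictly positive parts")
    logs = np.log(parts)
    return logs - logs.mean(axis=-1, keepdims=True)

# Counts and their tenfold multiples carry the same relative information.
counts = [1.0, 2.0, 4.0]
scaled = [10.0, 20.0, 40.0]
clr_counts = clr(counts)
clr_scaled = clr(scaled)
```

Because the transform depends only on ratios, `clr_counts` and `clr_scaled` coincide, whereas an analysis of the absolute values would treat the two vectors as different.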

October 06, 2016 doi: 10.1177/0962280216671536 open full text
Effective plots to assess bias and precision in method comparison studies.
Taffe, P.
Statistical Methods in Medical Research: An International Review Journal. October 04, 2016

Bland and Altman’s limits of agreement have traditionally been used in clinical research to assess the agreement between different methods of measurement for quantitative variables. However, when the variances of the measurement errors of the two methods are different, Bland and Altman’s plot may be misleading; there are settings where the regression line shows an upward or a downward trend although there is no bias, or has a zero slope although there is a bias. Therefore, the goal of this paper is to clearly illustrate why and when a bias arises, particularly when heteroscedastic measurement errors are expected, and to propose two new plots, the "bias plot" and the "precision plot," to help the investigator visually and clinically appraise the performance of the new method. These plots do not have the above-mentioned defect and are still easy to interpret, in the spirit of Bland and Altman’s limits of agreement. To achieve this goal, we rely on the modeling framework recently developed by Nawarathna and Choudhary, which allows the measurement errors to be heteroscedastic and depend on the underlying latent trait. Their estimation procedure, however, is complex and rather daunting to implement. We have, therefore, developed a new estimation procedure, which is much simpler to implement and, yet, performs very well, as illustrated by our simulations. The methodology requires several measurements with the reference standard and possibly only one with the new method for each individual.

October 04, 2016 doi: 10.1177/0962280216666667 open full text
Unified approach for extrapolation and bridging of adult information in early-phase dose-finding paediatric studies.
Petit, C., Samson, A., Morita, S., Ursino, M., Guedj, J., Jullien, V., Comets, E., Zohar, S.
Statistical Methods in Medical Research: An International Review Journal. October 04, 2016

The number of trials conducted and the number of patients per trial are typically small in paediatric clinical studies. This is due to ethical constraints and the complexity of the medical process for treating children. While incorporating prior knowledge from adults may be extremely valuable, this must be done carefully. In this paper, we propose a unified method for designing and analysing dose-finding trials in paediatrics, while bridging information from adults. The dose-range is calculated under three extrapolation options, linear, allometry and maturation adjustment, using adult pharmacokinetic data. To do this, it is assumed that target exposures are the same in both populations. The working model and prior distribution parameters of the dose–toxicity and dose–efficacy relationships are obtained using early-phase adult toxicity and efficacy data at several dose levels. Priors are integrated into the dose-finding process through Bayesian model selection or adaptive priors. This calibrates the model to adjust for misspecification if the adult and paediatric data are very different. We performed a simulation study which indicates that incorporating prior adult information in this way may improve dose selection in children.

October 04, 2016 doi: 10.1177/0962280216671348 open full text
Estimation after blinded sample size reassessment.
Posch, M., Klinglmueller, F., König, F., Miller, F.
Statistical Methods in Medical Research: An International Review Journal. October 02, 2016

Blinded sample size reassessment is a popular means to control the power in clinical trials if no reliable information on nuisance parameters is available in the planning phase. We investigate how sample size reassessment based on blinded interim data affects the properties of point estimates and confidence intervals for parallel group superiority trials comparing the means of a normal endpoint. We evaluate the properties of two standard reassessment rules that are based on the sample size formula of the z-test, derive the worst case reassessment rule that maximizes the absolute mean bias and obtain an upper bound for the mean bias of the treatment effect estimate.
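
As a rough illustration of a reassessment rule based on the z-test sample size formula, the sketch below recomputes the per-group sample size from the one-sample variance of the pooled interim data, which can be calculated without unblinding. The one-sided significance level, the default power and the function names are assumptions of this sketch; the abstract does not specify the two rules the authors study.

```python
import math
from statistics import NormalDist

def n_per_group(delta, variance, alpha=0.025, power=0.9):
    """Per-group sample size of a one-sided z-test comparing two normal means.

    Uses n = 2 * variance * (z_{1-alpha} + z_{power})**2 / delta**2,
    rounded up to the next integer.
    """
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha) + z.inv_cdf(power)
    return math.ceil(2 * variance * z_sum ** 2 / delta ** 2)

def blinded_reassessment(interim_values, delta, alpha=0.025, power=0.9):
    """Recompute the sample size from pooled interim data without group labels."""
    n = len(interim_values)
    mean = sum(interim_values) / n
    # One-sample variance of the pooled data; no treatment labels are needed.
    one_sample_variance = sum((x - mean) ** 2 for x in interim_values) / (n - 1)
    return n_per_group(delta, one_sample_variance, alpha, power)

planned = n_per_group(delta=1.0, variance=4.0, power=0.8)
updated = blinded_reassessment([0.0, 2.0, 4.0, 6.0], delta=2.0, power=0.8)
```

The pooled one-sample variance also absorbs any true difference between the group means, so it tends to overstate the within-group variance when a treatment effect is present.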

October 02, 2016 doi: 10.1177/0962280216670424 open full text
Optimally estimating the sample mean from the sample size, median, mid-range, and/or mid-quartile range.
Luo, D., Wan, X., Liu, J., Tong, T.
Statistical Methods in Medical Research: An International Review Journal. September 27, 2016

The era of big data is coming, and evidence-based medicine is attracting increasing attention as a way to improve decision making in medical practice by integrating evidence from well-designed and well-conducted clinical research. Meta-analysis is a statistical technique widely used in evidence-based medicine for analytically combining the findings from independent clinical trials to provide an overall estimate of a treatment's effectiveness. The sample mean and standard deviation are two commonly used statistics in meta-analysis, but some trials use the median, the minimum and maximum values, or sometimes the first and third quartiles to report the results. Thus, to pool results in a consistent format, researchers need to transform that information back to the sample mean and standard deviation. In this article, we investigate the optimal estimation of the sample mean for meta-analysis from both theoretical and empirical perspectives. A major drawback in the literature is that the sample size, despite its obvious importance, is either ignored or used in a stepwise but somewhat arbitrary manner, e.g. the famous method proposed by Hozo et al. We solve this issue by incorporating the sample size in a smoothly changing weight in the estimators to reach the optimal estimation. Our proposed estimators not only improve on the existing ones significantly but also share the same virtue of simplicity. The real data application indicates that our proposed estimators are capable of serving as "rules of thumb" and will be widely applied in evidence-based medicine.
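
The idea of a weight that changes smoothly with the sample size can be sketched as a weighted average of the mid-range and the median. The default weight used below, 4 / (4 + n^0.75), is an illustrative choice that decreases smoothly in n; it is an assumption of this sketch and is not stated in the abstract, so it should be checked against the paper before any real use.

```python
def estimate_mean_from_range(minimum, median, maximum, n, weight=None):
    """Estimate a sample mean from the minimum, median, maximum and sample size.

    The estimate is weight * mid-range + (1 - weight) * median. When `weight`
    is None, an illustrative smoothly decreasing weight is used (an assumption
    of this sketch, not a value taken from the abstract).
    """
    if weight is None:
        weight = 4.0 / (4.0 + n ** 0.75)
    mid_range = (minimum + maximum) / 2.0
    return weight * mid_range + (1.0 - weight) * median

# With n = 16 the illustrative weight is 4 / (4 + 8) = 1/3.
small_study = estimate_mean_from_range(minimum=0.0, median=3.0, maximum=12.0, n=16)
large_study = estimate_mean_from_range(minimum=0.0, median=3.0, maximum=12.0, n=10_000)
```

As n grows the weight on the mid-range shrinks, so the estimate moves toward the median; this reflects the fact that the extremes of a large sample are less informative about its centre.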

September 27, 2016 doi: 10.1177/0962280216669183 open full text
Feasibility of reusing time-matched controls in an overlapping cohort.
Delcoigne, B., Hagenbuch, N., Schelin, M. E., Salim, A., Lindström, L. S., Bergh, J., Czene, K., Reilly, M.
Statistical Methods in Medical Research: An International Review Journal. September 21, 2016

The methods developed for secondary analysis of nested case-control data have been illustrated only in simplified settings in a common cohort and have not found their way into biostatistical practice. This paper demonstrates the feasibility of reusing prior nested case-control data in a realistic setting where a new outcome is available in an overlapping cohort where no new controls were gathered and where all data have been anonymised. Using basic information about the background cohort and sampling criteria, the new cases and prior data are "aligned" to identify the common underlying study base. With this study base, a Kaplan–Meier table of the prior outcome extracts the risk sets required to calculate the weights to assign to the controls to remove the sampling bias. A weighted Cox regression, implemented in standard statistical software, provides unbiased hazard ratios. Using the method to compare cases of contralateral breast cancer to available controls from a prior study of metastases, we identified a multifocal tumor as a risk factor that has not been reported previously. We examine the sensitivity of the method to an imperfect weighting scheme and discuss its merits and pitfalls to provide guidance for its use in medical research studies.

September 21, 2016 doi: 10.1177/0962280216669744 open full text
Comparison of statistical approaches dealing with time-dependent confounding in drug effectiveness studies.
Karim, M. E., Petkau, J., Gustafson, P., Platt, R. W., Tremlett, H., The BeAMS Study Group.
Statistical Methods in Medical Research: An International Review Journal. September 21, 2016

In longitudinal studies, if the time-dependent covariates are affected by the past treatment, time-dependent confounding may be present. For a time-to-event response, marginal structural Cox models are frequently used to deal with such confounding. To avoid some of the problems of fitting a marginal structural Cox model, the sequential Cox approach has been suggested as an alternative. Although the estimation mechanisms are different, both approaches claim to estimate the causal effect of treatment by appropriately adjusting for time-dependent confounding. We carry out simulation studies to assess the suitability of the sequential Cox approach for analyzing time-to-event data in the presence of a time-dependent covariate that may or may not be a time-dependent confounder. Results from these simulations revealed that the sequential Cox approach is not as effective as the marginal structural Cox model in addressing the time-dependent confounding. The sequential Cox approach was also found to be inadequate in the presence of a time-dependent covariate. We propose a modified version of the sequential Cox approach that correctly estimates the treatment effect in both of the above scenarios. All approaches are applied to investigate the impact of beta-interferon treatment in delaying disability progression in the British Columbia Multiple Sclerosis cohort (1995–2008).

September 21, 2016 doi: 10.1177/0962280216668554 open full text
Inferring marginal association with paired and unpaired clustered data.
Lorenz, D. J., Levy, S., Datta, S.
Statistical Methods in Medical Research: An International Review Journal. September 20, 2016

In the marginal analysis of clustered data, where the marginal distribution of interest is that of a typical observation within a typical cluster, analysis by reweighting has been introduced as a useful tool for estimating parameters of these marginal distributions. Such reweighting methods have foundation in within-cluster resampling schemes that marginalize potential informativeness due to cluster size or within-cluster covariate distribution, to which reweighting methods are asymptotically equivalent. In this paper, we introduce a reweighting scheme for the marginal analysis of clustered data that generalizes prior reweighting methods, with a particular application to measuring bivariate correlation in unpaired clustered data, in which observations of two random variables are not naturally paired at the within-cluster level. We develop unpaired clustered data analogs of well-known product moment correlation coefficients (Pearson, Spearman, phi), as well as the polyserial coefficient for measuring correlation between one discrete and one continuous variable. We evaluate the performance of these coefficients via a simulation study and demonstrate their use by finding no statistically significant association between dental caries at an early age and dental fluorosis at age 13 using a large dental dataset.

September 20, 2016 doi: 10.1177/0962280216669184 open full text
ANOVA model for network meta-analysis of diagnostic test accuracy data.
Nyaga, V. N., Aerts, M., Arbyn, M.
Statistical Methods in Medical Research: An International Review Journal. September 20, 2016

Procedures combining and summarising direct and indirect evidence from independent studies assessing the diagnostic accuracy of different tests for the same disease are referred to as network meta-analysis. Network meta-analysis provides a unified inference framework and uses the data more efficiently. Nonetheless, handling the inherent correlation between sensitivity and specificity continues to be a statistical challenge. We developed an arm-based hierarchical model which expresses the logit transformed sensitivity and specificity as the sum of fixed effects for test, correlated study-effects to model the inherent correlation between sensitivity and specificity and a random error associated with various tests evaluated in a given study. We present the accuracy of 11 tests used to triage women with minor cervical lesions to detect cervical precancer. Finally, we compare the results with those from a contrast-based model which expresses the linear predictor as a contrast to a comparator test. The proposed arm-based model is more appealing than the contrast-based model since the former permits more straightforward interpretation of the parameters, makes use of all available data yielding shorter credible intervals, and models more natural variance–covariance matrix structures.

September 20, 2016 doi: 10.1177/0962280216669182 open full text
Multiple imputation with non-additively related variables: Joint-modeling and approximations.
Kim, S., Belin, T. R., Sugar, C. A.
Statistical Methods in Medical Research: An International Review Journal. September 19, 2016

This paper investigates multiple imputation methods for regression models with interacting continuous and binary predictors when the continuous variable may be missing. Usual implementations of parametric multiple imputation assume a multivariate normal structure for the variables, which is satisfied neither by a binary variable nor by its interaction with a continuous variable. To accommodate interactions, missing covariates are multiply imputed from conditional distributions in a manner consistent with the joint model. Alternative imputation methods under multivariate normal assumptions are also considered as candidate approximations and evaluated in a simulation study. The results suggest that the joint modeling procedure generally performs well across a wide range of scenarios, as do the approximation methods that incorporate interactions in the model appropriately by stratification. It is critical to include interactions in the imputation model, as failure to do so may result in low coverage and bias. We apply the joint modeling approach and approximation methods in the study of childhood trauma with a gender × trauma interaction.

September 19, 2016 doi: 10.1177/0962280216667763 open full text
Recommendations on multiple testing adjustment in multi-arm trials with a shared control group.
Howard, D. R., Brown, J. M., Todd, S., Gregory, W. M.
Statistical Methods in Medical Research: An International Review Journal. September 19, 2016

Multi-arm clinical trials assessing multiple experimental treatments against a shared control group can offer efficiency advantages over independent trials through assessing an increased number of hypotheses. Published opinion is divided on the requirement for multiple testing adjustment to control the family-wise type-I error rate (FWER). The probability of a false positive error in multi-arm trials compared to equivalent independent trials is affected by the correlation between comparisons due to sharing control data. We demonstrate that this correlation in fact leads to a reduction in the FWER; therefore, FWER adjustment is not recommended solely due to sharing control data. In contrast, the correlation increases the probability of multiple false positive outcomes across the hypotheses, although standard FWER adjustment methods do not control for this. A stringent critical value adjustment is proposed to maintain equivalent evidence of superiority in two correlated comparisons to that obtained within independent trials. FWER adjustment is only required if there is an increased chance of making a single claim of effectiveness by testing multiple hypotheses, not due to sharing control data. For competing experimental therapies, the correlation between comparisons can be advantageous as it eliminates bias due to the experimental therapies being compared to different control populations.
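
The effect of a shared control group on the two error probabilities mentioned in this abstract can be explored with a small Monte Carlo sketch. It assumes two experimental arms, equal group sizes, known variance, unadjusted two-sided z-tests and no true treatment effect; these settings, the function name and the number of simulations are illustrative assumptions rather than the authors' configuration.

```python
import numpy as np
from statistics import NormalDist

def simulate_error_rates(shared_control, n_sim=400_000, alpha=0.05, seed=0):
    """Simulate two unadjusted two-sided z-tests under the global null.

    Returns the probability of at least one false positive ("any", the FWER)
    and of two false positives ("both").
    """
    rng = np.random.default_rng(seed)
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Standardised group means: each is N(0, 1) under the null.
    arms = rng.standard_normal((n_sim, 2))
    if shared_control:
        controls = rng.standard_normal((n_sim, 1))  # one control, used twice
    else:
        controls = rng.standard_normal((n_sim, 2))  # one control per trial
    # The difference of two unit-variance means has variance 2.
    z = (arms - controls) / np.sqrt(2)
    reject = np.abs(z) > critical
    return {"any": reject.any(axis=1).mean(), "both": reject.all(axis=1).mean()}

shared = simulate_error_rates(shared_control=True)
independent = simulate_error_rates(shared_control=False)
```

With a shared control the two test statistics have correlation 0.5, so in this simulation the probability of at least one false positive comes out lower than for independent trials, while the probability of two false positives comes out higher.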

September 19, 2016 doi: 10.1177/0962280216664759 open full text
Multiple imputation by chained equations for systematically and sporadically missing multilevel data.
Resche-Rigon, M., White, I. R.
Statistical Methods in Medical Research: An International Review Journal. September 19, 2016

In multilevel settings such as individual participant data meta-analysis, a variable is ‘systematically missing’ if it is wholly missing in some clusters and ‘sporadically missing’ if it is partly missing in some clusters. Previously proposed methods to impute incomplete multilevel data handle either systematically or sporadically missing data, but frequently both patterns are observed. We describe a new multiple imputation by chained equations (MICE) algorithm for multilevel data with arbitrary patterns of systematically and sporadically missing variables. The algorithm is described for multilevel normal data but can easily be extended for other variable types. We first propose two methods for imputing a single incomplete variable: an extension of an existing method and a new two-stage method which conveniently allows for heteroscedastic data. We then discuss the difficulties of imputing missing values in several variables in multilevel data using MICE, and show that even the simplest joint multilevel model implies conditional models which involve cluster means and heteroscedasticity. However, a simulation study finds that the proposed methods can be successfully combined in a multilevel MICE procedure, even when cluster means are not included in the imputation models.

September 19, 2016 doi: 10.1177/0962280216666564 open full text
Association analysis of successive events data in the presence of competing risks.
Chen, X., Cheng, Y., Frank, E., Kupfer, D. J.
Statistical Methods in Medical Research: An International Review Journal. September 19, 2016

We aim to close a methodological gap in analyzing durations of successive events that are subject to induced dependent censoring as well as competing-risk censoring. In the Bipolar Disorder Center for Pennsylvanians study, some patients who managed to recover from their symptomatic entry later developed a new depressive or manic episode. It is of great clinical interest to quantify the association between time to recovery and time to recurrence in patients with bipolar disorder. The estimation of the bivariate distribution of the gap times with independent censoring has been well studied. However, the existing methods cannot be applied to failure times that are censored by competing causes such as in the Bipolar Disorder Center for Pennsylvanians study. The bivariate cumulative incidence function has been used to describe the joint distribution of parallel event times that involve multiple causes. To the best of our knowledge, however, there is no method available for successive events with competing-risk censoring. Therefore, we extend the bivariate cumulative incidence function to successive events data, and propose non-parametric estimators of the bivariate cumulative incidence function and the related conditional cumulative incidence function. Moreover, an odds ratio measure is proposed to describe the cause-specific dependence, leading to the development of a formal test for independence of successive events. Simulation studies demonstrate that the estimators and tests perform well for realistic sample sizes, and our methods can be readily applied to the Bipolar Disorder Center for Pennsylvanians study.

September 19, 2016 doi: 10.1177/0962280216667645 open full text
Control limits to identify outlying hospitals based on risk-stratification.
Rousson, V., Le Pogam, M.-A., Eggli, Y.
Statistical Methods in Medical Research: An International Review Journal. September 19, 2016

Outcome indicators are routinely used to compare hospitals with respect to quality of care. Indicators might be based on observed proportions of adverse events (binary outcomes) or observed averages of e.g. lengths or costs of hospital stays (continuous outcomes). These observed values are compared with expected ones in an average hospital, which might be estimated from a reference sample and should be appropriately adjusted for the case mix of patients. One possibility to achieve a reliable adjustment is to stratify the patients according to their risks, where each patient belongs to one and only one stratum. Control limits calculated under the null hypothesis of an average hospital, which make it possible to decide whether a discrepancy between an observed and an expected value might be explained by chance or not, are then plotted around the indicator, such that hospitals falling above those control limits are detected as being statistically worse than an average hospital. Calculation of valid control limits is however not always obvious. In this article, we propose a simple and unified framework to calculate such control limits when adjustment is based on stratification. The framework distinguishes and disentangles the variability explained by stratification from the variability due to chance, takes into account the uncertainty about the estimation of the expected values, and makes it possible to detect not only those hospitals which are statistically worse, but also those which are statistically much worse than an average hospital. The method applies both to binary and continuous outcomes and is illustrated on Swiss hospital discharge data.

September 19, 2016 doi: 10.1177/0962280216668556 open full text
Does ignoring clustering in multicenter data influence the performance of prediction models? A simulation study.
Wynants, L., Vergouwe, Y., Van Huffel, S., Timmerman, D., Van Calster, B.
Statistical Methods in Medical Research: An International Review Journal. September 19, 2016

Clinical risk prediction models are increasingly being developed and validated on multicenter datasets. In this article, we present a comprehensive framework for the evaluation of the predictive performance of prediction models at the center level and the population level, considering population-averaged predictions, center-specific predictions, and predictions assuming an average random center effect. We demonstrated in a simulation study that calibration slopes do not only deviate from one because of over- or underfitting of patterns in the development dataset, but also as a result of the choice of the model (standard versus mixed effects logistic regression), the type of predictions (marginal versus conditional versus assuming an average random effect), and the level of model validation (center versus population). In particular, when data are heavily clustered (ICC 20%), center-specific predictions offer the best predictive performance at the population level and the center level. We recommend that models should reflect the data structure, while the level of model validation should reflect the research question.

September 19, 2016 doi: 10.1177/0962280216668555 open full text
Testing of non-inferiority and superiority for three-arm clinical studies with multiple experimental treatments.
Zhong, J., Wen, M.-J., Kwong, K. S., Cheung, S. H.
Statistical Methods in Medical Research: An International Review Journal. September 19, 2016

The purpose of a non-inferiority trial is to assert the efficacy of an experimental treatment compared with a reference treatment by showing that the experimental treatment retains a substantial proportion of the efficacy of the reference treatment. Statistical methods have been developed to test multiple experimental treatments in three-arm non-inferiority trials. In this paper, we report the development of procedures that simultaneously test the non-inferiority and the superiority of experimental treatments after the assay sensitivity has been established. The advantage of the proposed test procedures is the additional ability to identify superior treatments while retaining a non-inferiority testing power comparable to that of existing testing procedures. Single-step and stepwise procedures are derived and then compared with each other to determine their relative testing power and testing error in a simulation study. Finally, the suggested procedures are illustrated with two clinical examples.

September 19, 2016 doi: 10.1177/0962280216668913 open full text
Relative efficiency of joint-model and full-conditional-specification multiple imputation when conditional models are compatible: The general location model.
Seaman, S. R., Hughes, R. A.
Statistical Methods in Medical Research: An International Review Journal. September 05, 2016

Estimating the parameters of a regression model of interest is complicated by missing data on the variables in that model. Multiple imputation is commonly used to handle these missing data. Joint model multiple imputation and full-conditional specification multiple imputation are known to yield imputed data with the same asymptotic distribution when the conditional models of full-conditional specification are compatible with that joint model. We show that this asymptotic equivalence of imputation distributions does not imply that joint model multiple imputation and full-conditional specification multiple imputation will also yield asymptotically equally efficient inference about the parameters of the model of interest, nor that they will be equally robust to misspecification of the joint model. When the conditional models used by full-conditional specification multiple imputation are linear, logistic and multinomial regressions, these are compatible with a restricted general location joint model. We show that multiple imputation using the restricted general location joint model can be substantially more asymptotically efficient than full-conditional specification multiple imputation, but this typically requires very strong associations between variables. When associations are weaker, the efficiency gain is small. Moreover, full-conditional specification multiple imputation is shown to be potentially much more robust than joint model multiple imputation using the restricted general location model to misspecification of that model when there is substantial missingness in the outcome variable.

September 05, 2016 doi: 10.1177/0962280216665872 open full text
Modelling retinal pulsatile blood flow from video data.
Betz-Stablein, B., Hazelton, M. L., Morgan, W. H.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

Modern day datasets continue to increase in both size and diversity. One example of such ‘big data’ is video data. Within the medical arena, more disciplines are using video as a diagnostic tool. Given the large amount of data stored within a video image, it is one of the most time-consuming types of data to process and analyse. Therefore, it is desirable to have automated techniques to extract, process and analyse data from video images. While many methods have been developed for extracting and processing video data, statistical modelling to analyse the outputted data has rarely been employed. We develop a method to take a video sequence of periodic nature, extract the RGB data and model the changes occurring across the contiguous images. We employ harmonic regression to model periodicity with autoregressive terms accounting for the error process associated with the time series nature of the data. A linear spline is included to account for movement between frames. We apply this model to video sequences of retinal vessel pulsation, which is the pulsatile component of blood flow. Slope and amplitude are calculated for the curves generated from the application of the harmonic model, providing clinical insight into the location of obstruction within the retinal vessels. The method can be applied to individual vessels, or to smaller segments such as 2 × 2 pixels which can then be interpreted easily as a heat map.
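
The harmonic regression step can be illustrated with an ordinary least-squares fit of a single sine and cosine pair at a known frequency. This sketch deliberately omits the autoregressive error terms and the linear spline described in the abstract and replaces the latter with a plain linear trend; the function name, the known frequency and the synthetic signal are assumptions made for illustration.

```python
import numpy as np

def fit_harmonic(times, values, frequency):
    """Least-squares fit of intercept + trend * t + a*cos(2*pi*f*t) + b*sin(2*pi*f*t).

    Returns the fitted intercept, linear trend and the amplitude sqrt(a^2 + b^2)
    of the periodic component.
    """
    times = np.asarray(times, dtype=float)
    values = np.asarray(values, dtype=float)
    angle = 2 * np.pi * frequency * times
    design = np.column_stack(
        [np.ones_like(times), times, np.cos(angle), np.sin(angle)]
    )
    coef, *_ = np.linalg.lstsq(design, values, rcond=None)
    intercept, trend, a, b = coef
    return {
        "intercept": float(intercept),
        "trend": float(trend),
        "amplitude": float(np.hypot(a, b)),
    }

# Synthetic intensity series: one cycle per time unit, amplitude 5.
t = np.linspace(0.0, 5.0, 200)
signal = 2.0 + 0.1 * t + 3.0 * np.cos(2 * np.pi * t) + 4.0 * np.sin(2 * np.pi * t)
fit = fit_harmonic(t, signal, frequency=1.0)
```

Writing the periodic component as a cosine plus a sine keeps the model linear in its coefficients, so the amplitude can be recovered from the two fitted coefficients without nonlinear optimisation.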

September 01, 2016 doi: 10.1177/0962280216665504 open full text
Continuously updated network meta-analysis and statistical monitoring for timely decision-making.
Nikolakopoulou, A., Mavridis, D., Egger, M., Salanti, G.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

Pairwise and network meta-analysis (NMA) are traditionally used retrospectively to assess existing evidence. However, the current evidence often undergoes several updates as new studies become available. In each update, recommendations about the conclusiveness of the evidence and the need for future studies need to be made. In the context of prospective meta-analysis, future studies are planned as part of the accumulation of the evidence. In this setting, multiple testing issues need to be taken into account when the meta-analysis results are interpreted. We extend ideas of sequential monitoring of meta-analysis to provide a methodological framework for updating NMAs. Based on the z-score for each network estimate (the ratio of effect size to its standard error) and the respective information gained after each study enters the NMA, we construct efficacy and futility stopping boundaries. An NMA treatment effect is considered conclusive when it crosses an appended stopping boundary. The methods are illustrated using a recently published NMA where we show that evidence about a particular comparison can become conclusive via indirect evidence even if no further trials address this comparison.

September 01, 2016 doi: 10.1177/0962280216659896 open full text
A joint model for longitudinal and survival data based on an AR(1) latent process.
Bacci, S., Bartolucci, F., Pandolfi, S.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

A critical problem in repeated measurement studies is the occurrence of nonignorable missing observations. A common approach to deal with this problem is joint modeling of the longitudinal and survival processes for each individual on the basis of a random effect that is usually assumed to be time constant. We relax this hypothesis by introducing time-varying subject-specific random effects that follow a first-order autoregressive process, AR(1). We also adopt a generalized linear model formulation to accommodate different types of longitudinal response (i.e. continuous, binary, count) and we consider some extended cases, such as counts with excess of zeros and multivariate outcomes at each time occasion. Estimation of the parameters of the resulting joint model is based on the maximization of the likelihood computed by a recursion developed in the hidden Markov literature. This maximization is performed on the basis of a quasi-Newton algorithm that also provides the information matrix and then standard errors for the parameter estimates. The proposed approach is illustrated through a Monte Carlo simulation study and the analysis of certain medical datasets.

September 01, 2016 doi: 10.1177/0962280216659895 open full text
A two-stage model in a Bayesian framework to estimate a survival endpoint in the presence of confounding by indication.
Bellera, C., Proust-Lima, C., Joseph, L., Richaud, P., Taylor, J., Sandler, H., Hanley, J., Mathoulin-Pelissier, S.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

Background
Biomarker series can indicate disease progression and predict clinical endpoints. When a treatment is prescribed depending on the biomarker, confounding by indication might be introduced if the treatment modifies the marker profile and risk of failure.
Objective
Our aim was to highlight the flexibility of a two-stage model fitted within a Bayesian Markov Chain Monte Carlo framework. For this purpose, we monitored prostate-specific antigen in prostate cancer patients treated with external beam radiation therapy. In the presence of rising prostate-specific antigen after external beam radiation therapy, salvage hormone therapy can be prescribed to reduce both the prostate-specific antigen concentration and the risk of clinical failure, an illustration of confounding by indication. We focused on the assessment of the prognostic value of hormone therapy and the prostate-specific antigen trajectory on the risk of failure.
Methods
We used a two-stage model within a Bayesian framework to assess the role of the prostate-specific antigen profile on clinical failure while accounting for a secondary treatment prescribed by indication. We modeled prostate-specific antigen using a hierarchical piecewise linear trajectory with a random changepoint. Residual prostate-specific antigen variability was expressed as a function of prostate-specific antigen concentration. Covariates in the survival model included hormone therapy, baseline characteristics, and individual predictions of the prostate-specific antigen nadir and its timing and of the prostate-specific antigen slopes before and after the nadir, as provided by the longitudinal process.
Results
We showed positive associations of an increased prostate-specific antigen nadir, an earlier changepoint and a steeper post-nadir slope with an increased risk of failure. Importantly, we highlighted a significant benefit of hormone therapy, an effect that was not observed when the prostate-specific antigen trajectory was not accounted for in the survival model.
Conclusion
Our modeling strategy was particularly flexible and accounted for multiple complex features of longitudinal and survival data, including the presence of a random changepoint and a time-dependent covariate.

September 01, 2016 doi: 10.1177/0962280216660127 open full text
Analysis of phase II methodologies for single-arm clinical trials with multiple endpoints in rare cancers: An example in Ewing's sarcoma.
Dutton, P., Love, S., Billingham, L., Hassan, A.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

Trials run in either rare diseases, such as rare cancers, or rare sub-populations of common diseases are challenging in terms of identifying, recruiting and treating sufficient patients in a sensible period. Treatments for rare diseases are often designed for other disease areas and then later proposed as possible treatments for the rare disease after initial phase I testing is complete. To ensure the trial is in the best interests of the patient participants, frequent interim analyses are needed to force the trial to stop promptly if the treatment is futile or toxic. These non-definitive phase II trials should also be stopped for efficacy to accelerate research progress if the treatment proves to be particularly promising. In this paper, we review frequentist and Bayesian methods that have been adapted to incorporate two binary endpoints and frequent interim analyses. The Eurosarc Trial of Linsitinib in advanced Ewing Sarcoma (LINES) is used as a motivating example and provides a suitable platform to compare these approaches. The Bayesian approach provides greater design flexibility, but does not provide additional value over the frequentist approaches in a single trial setting when the prior is non-informative. However, Bayesian designs are able to borrow from any previous experience, using prior information to improve efficiency.

September 01, 2016 doi: 10.1177/0962280216662070 open full text
Mortality and morbidity peaks modeling: An extreme value theory approach.
Chiu, Y., Chebana, F., Abdous, B., Belanger, D., Gosselin, P.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

Hospitalizations and deaths are among the most studied health variables in public health. Those variables are usually analyzed through mean events and trends, based on the whole dataset. However, this approach is not appropriate for understanding health outcome peaks, which are unusual events that strongly impact the health care network (e.g. overflow in hospital emergency rooms). Peaks can also be of interest in etiological research, for instance when analyzing relationships with extreme exposures (meteorological conditions, air pollution, social stress, etc.). Therefore, this paper aims at modeling health variables exclusively through the peaks, which is rarely done except over short periods. Establishing a rigorous and general methodology to identify peaks is another goal of this study. To this end, extreme value theory appears adequate, with statistical tools for selecting and modeling peaks. Selection and analysis of death and hospitalization peaks using extreme value theory have not yet been applied in public health. Therefore, this study also has an exploratory goal. A declustering procedure is applied to the raw data in order to meet extreme value theory requirements. The application is done on hospitalization and death peaks for cardiovascular diseases, in the Montreal and Quebec metropolitan communities (Canada) for the period 1981–2011. The peak return levels are obtained from the modeling and can be useful in hospital management or planning future capacity needs for health care facilities, for example. This paper focuses on one class of diseases in two cities, but the methodology can be applied to any other health peaks series anywhere, as it is data driven.
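
The abstract does not say which declustering procedure was used. As a hedged illustration only, the sketch below implements runs declustering, a common choice in peaks-over-threshold analyses: exceedances of a threshold separated by fewer than `run_length` consecutive non-exceedances are treated as one cluster, and only the cluster maximum is kept. The function name, threshold, run length and daily counts are hypothetical.

```python
def decluster_peaks(series, threshold, run_length):
    """Return (index, value) of the maximum of each cluster of exceedances.

    A cluster ends once `run_length` consecutive values at or below the
    threshold have been observed (runs declustering).
    """
    peaks = []
    current_max = None  # (index, value) of the running cluster maximum
    gap = 0             # consecutive non-exceedances since the last exceedance
    for i, value in enumerate(series):
        if value > threshold:
            if current_max is None or value > current_max[1]:
                current_max = (i, value)
            gap = 0
        elif current_max is not None:
            gap += 1
            if gap >= run_length:
                peaks.append(current_max)
                current_max = None
                gap = 0
    if current_max is not None:
        peaks.append(current_max)  # close a cluster still open at the end
    return peaks


# Hypothetical daily counts: two clusters of exceedances above 4.
daily_counts = [1, 5, 6, 1, 7, 1, 1, 1, 8, 1]
peaks = decluster_peaks(daily_counts, threshold=4, run_length=2)
```

The retained cluster maxima are the values to which an extreme value distribution would then be fitted; that fitting step is not shown here.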

September 01, 2016 doi: 10.1177/0962280216662494 open full text
Parametric and penalized generalized survival models.
Liu, X.-R., Pawitan, Y., Clements, M.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

We describe generalized survival models, where g(S(t|z)), for link function g, survival S, time t, and covariates z, is modeled by a linear predictor in terms of covariate effects and smooth time effects. These models include proportional hazards and proportional odds models, and extend the parametric Royston–Parmar models. Estimation is described for both fully parametric linear predictors and combinations of penalized smoothers and parametric effects. The penalized smoothing parameters can be selected automatically using several information criteria. The link function may be selected based on prior assumptions or using an information criterion. We have implemented the models in R. All of the penalized smoothers from the mgcv package are available for smooth time effects and smooth covariate effects. The generalized survival models perform well in a simulation study, compared with some existing models. The estimation of smooth covariate effects and smooth time-dependent hazard or odds ratios is simplified, compared with many non-parametric models. Applying these models to three cancer survival datasets, we find that the proportional odds model is better than the proportional hazards model for two of the datasets.
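
To make the role of the link function g concrete, here is a minimal sketch under assumed conventions: g(S) = log(-log(S)) for proportional hazards and g(S) = log((1 - S) / S) for proportional odds. It does not reproduce the authors' R implementation; the function names, the baseline survival of 0.8 and the effect `beta` are hypothetical. Adding the same covariate effect to the linear predictor doubles the cumulative hazard under the first link and doubles the odds of failure under the second.

```python
import math


def survival_ph(eta):
    """Invert the log-log link g(S) = log(-log(S)) (proportional hazards)."""
    return math.exp(-math.exp(eta))


def survival_po(eta):
    """Invert the link g(S) = log((1 - S) / S) (proportional odds)."""
    return 1.0 / (1.0 + math.exp(eta))


baseline_survival = 0.8
beta = math.log(2.0)  # hypothetical covariate effect on the link scale

# Linear predictors that reproduce the baseline survival under each link.
eta_ph = math.log(-math.log(baseline_survival))
eta_po = math.log((1.0 - baseline_survival) / baseline_survival)

shifted_ph = survival_ph(eta_ph + beta)  # cumulative hazard doubles
shifted_po = survival_po(eta_po + beta)  # odds of failure double
```

The two links give different survival probabilities for the same shift, which is why the choice of link matters when comparing proportional hazards and proportional odds fits.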

September 01, 2016 doi: 10.1177/0962280216664760 open full text
Detecting and accounting for violations of the constancy assumption in non-inferiority clinical trials.
Koopmeiners, J. S., Hobbs, B. P.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

Randomized, placebo-controlled clinical trials are the gold standard for evaluating a novel therapeutic agent. In some instances, it may not be considered ethical or desirable to complete a placebo-controlled clinical trial and, instead, the placebo is replaced by an active comparator with the objective of showing either superiority or non-inferiority to the active comparator. In a non-inferiority trial, the experimental treatment is considered non-inferior if it retains a pre-specified proportion of the effect of the active comparator as represented by the non-inferiority margin. A key assumption required for valid inference in the non-inferiority setting is the constancy assumption, which requires that the effect of the active comparator in the non-inferiority trial is consistent with the effect that was observed in previous trials. It has been shown that violations of the constancy assumption can result in a dramatic increase in the rate of incorrectly concluding non-inferiority in the presence of ineffective or even harmful treatment. In this paper, we illustrate how Bayesian hierarchical modeling can be used to facilitate multi-source smoothing of the data from the current trial with the data from historical studies, enabling direct probabilistic evaluation of the constancy assumption. We then show how this result can be used to adapt the non-inferiority margin when the constancy assumption is violated and present simulation results illustrating that our method controls the type-I error rate when the constancy assumption is violated, while retaining the power of the standard approach when the constancy assumption holds. We illustrate our adaptive procedure using a non-inferiority trial of raltegravir, an antiretroviral drug for the treatment of HIV.

September 01, 2016 doi: 10.1177/0962280216665418 open full text
Confidence and coverage for Bland-Altman limits of agreement and their approximate confidence intervals.
Carkeet, A., Goh, Y. T.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

Bland and Altman described approximate methods in 1986 and 1999 for calculating confidence limits for their 95% limits of agreement, approximations which assume large subject numbers. In this paper, these approximations are compared with exact confidence intervals calculated using two-sided tolerance intervals for a normal distribution. The approximations are compared in terms of the tolerance factors themselves but also in terms of the exact confidence limits and the exact limits of agreement coverage corresponding to the approximate confidence interval methods. Using similar methods, the 50th percentile of the tolerance interval is compared with the k values of 1.96 and 2, which Bland and Altman used to define limits of agreement (i.e. d̄ ± 1.96 S_d and d̄ ± 2 S_d). For limits of agreement outer confidence intervals, Bland and Altman’s approximations are too permissive for sample sizes <40 (1999 approximation) and <76 (1986 approximation). For inner confidence limits the approximations are poorer, being permissive for sample sizes of <490 (1986 approximation) and all practical sample sizes (1999 approximation). Exact confidence intervals for 95% limits of agreement, based on two-sided tolerance factors, can be calculated easily based on tables and should be used in preference to the approximate methods, especially for small sample sizes.
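
For orientation, the sketch below computes the limits of agreement d̄ ± k S_d from a list of paired differences, together with the large-sample standard error of a limit, sqrt(3 S_d² / n), which is the form usually quoted for the 1986 approximation. That formula is stated here as an assumption, since the abstract does not give it, and the exact tolerance-factor intervals recommended by the authors are not implemented. The function name and data are hypothetical.

```python
import math


def limits_of_agreement(differences, k=1.96):
    """Return mean difference, SD, lower and upper limits, and approximate SE of a limit."""
    n = len(differences)
    d_bar = sum(differences) / n
    s_d = math.sqrt(sum((d - d_bar) ** 2 for d in differences) / (n - 1))
    lower = d_bar - k * s_d
    upper = d_bar + k * s_d
    # Assumed large-sample approximation: SE(limit) ~ sqrt(3 * s_d^2 / n).
    se_limit = math.sqrt(3.0 * s_d ** 2 / n)
    return d_bar, s_d, lower, upper, se_limit


# Hypothetical paired differences between two measurement methods.
differences = [1.0, 2.0, 3.0, 4.0, 5.0]
d_bar, s_d, lower, upper, se_limit = limits_of_agreement(differences)
```

With only five differences the large-sample standard error is exactly the kind of approximation the abstract warns against, so the example is illustrative rather than a recommended analysis.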

September 01, 2016 doi: 10.1177/0962280216665419 open full text
A basket two-part model to analyze medical expenditure on interdependent multiple sectors.
Sugawara, S., Wu, T., Yamanishi, K.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2016

This study proposes a novel statistical methodology to analyze expenditure on multiple medical sectors using consumer data. Conventionally, medical expenditure has been analyzed by two-part models, which separately consider purchase decision and amount of expenditure. We extend the traditional two-part models by adding the step of basket analysis for dimension reduction. This new step enables us to analyze complicated interdependence between multiple sectors without an identification problem. As an empirical application for the proposed method, we analyze data of 13 medical sectors from the Medical Expenditure Panel Survey. In comparison with the results of previous studies that analyzed multiple sectors independently, our method provides more detailed implications of the impacts of individual socioeconomic status on the composition of joint purchases from multiple medical sectors; our method also has better prediction performance.

September 01, 2016 doi: 10.1177/0962280216665642 open full text
Inferential tools in penalized logistic regression for small and sparse data: A comparative study.
Siino, M., Fasola, S., Muggeo, V. M.
Statistical Methods in Medical Research: An International Review Journal. August 10, 2016

This paper focuses on inferential tools in the logistic regression model fitted by the Firth penalized likelihood. In this context, the Likelihood Ratio statistic is often reported to be the preferred choice as compared to the ‘traditional’ Wald statistic. In this work, we consider and discuss a wider range of test statistics, including the robust Wald, the Score, and the recently proposed Gradient statistic. We compare all these asymptotically equivalent statistics in terms of interval estimation and hypothesis testing via simulation experiments and analyses of two real datasets. We find that the Likelihood Ratio statistic does not appear to be the best inferential device in the Firth penalized logistic regression.

August 10, 2016 doi: 10.1177/0962280216661213 open full text
Sample size and classification error for Bayesian change-point models with unlabelled sub-groups and incomplete follow-up.
White, S. R., Muniz-Terrera, G., Matthews, F. E.
Statistical Methods in Medical Research: An International Review Journal. August 08, 2016

Many medical (and ecological) processes involve the change of shape, whereby one trajectory changes into another trajectory at a specific time point. There has been little investigation into the study design needed to investigate these models. We consider the class of fixed effect change-point models with an underlying shape comprising two joined linear segments, also known as broken-stick models. We extend this model to include two sub-groups with different trajectories at the change-point, a change and no change class, and also include a missingness model to account for individuals with incomplete follow-up. Through a simulation study, we consider the relationship of sample size to the estimates of the underlying shape, the existence of a change-point, and the classification-error of sub-group labels. We use a Bayesian framework to account for the missing labels, and the analysis of each simulation is performed using standard Markov chain Monte Carlo techniques. Our simulation study is inspired by cognitive decline as measured by the Mini-Mental State Examination, where our extended model is appropriate due to the commonly observed mixture of individuals within studies who do or do not exhibit accelerated decline. We find that even for studies of modest size (n = 500, with 50 individuals observed past the change-point) in the fixed effect setting, a change-point can be detected and reliably estimated across a range of observation-errors.
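
The broken-stick shape referred to in this abstract, two linear segments joined at a change-point, can be written as a single mean function. The sketch below is illustrative only: the parameterisation (a slope before the change-point plus a change in slope afterwards), the function name and the numerical values are assumptions, and the sub-group mixture and missingness model of the paper are not included.

```python
def broken_stick_mean(t, intercept, slope_before, slope_change, change_point):
    """Mean trajectory made of two linear segments joined at `change_point`.

    Before the change-point the slope is `slope_before`; afterwards it is
    `slope_before + slope_change`. The two segments meet at the change-point.
    """
    return intercept + slope_before * t + slope_change * max(0.0, t - change_point)


# Hypothetical cognitive-score trajectory with accelerated decline after year 5.
params = dict(intercept=28.0, slope_before=-0.1, slope_change=-0.9, change_point=5.0)
score_year_3 = broken_stick_mean(3.0, **params)
score_year_5 = broken_stick_mean(5.0, **params)
score_year_7 = broken_stick_mean(7.0, **params)
```

Setting `slope_change` to zero gives a single straight line, which corresponds to a no-change trajectory in this parameterisation.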

August 08, 2016 doi: 10.1177/0962280216662298 open full text
A multi-locus genetic association test for a dichotomous trait and its secondary phenotype.
Zhang, H., Wu, C. O., Yang, Y., Berndt, S. I., Chanock, S. J., Yu, K.
Statistical Methods in Medical Research: An International Review Journal. August 08, 2016

Genetic association studies often collect information on secondary phenotypes related to the primary disease status. In many situations, the secondary phenotypes are only measured in subjects with the disease condition. It would be advantageous to model the primary trait and the secondary phenotype together if they share a certain level of genetic heritability. We propose a family of multi-locus testing procedures to detect the composite association between a set of genetic markers and two traits (the primary trait and a secondary phenotype), in order to identify genes influencing both traits. The proposed test is derived from a random effect model with two variance components, each representing the genetic effect on one trait, and incorporates a model selection procedure for seeking the optimal model to represent the two sources of genetic effects. We conduct simulation studies to evaluate the performance of the proposed procedure and apply the method to a genome-wide association study of prostate cancer with the Gleason score as the secondary phenotype.

August 08, 2016 doi: 10.1177/0962280216662071 open full text
Efficient Monte Carlo evaluation of resampling-based hypothesis tests with applications to genetic epidemiology.
Fung, W. K., Yu, K., Yang, Y., Zhou, J.-Y.
Statistical Methods in Medical Research: An International Review Journal. August 08, 2016

Monte Carlo evaluation of resampling-based tests is often conducted in statistical analysis. However, this procedure is generally computationally intensive. The pooling resampling-based method has been developed to reduce the computational burden, but the validity of the method has not been studied before. In this article, we first investigate the asymptotic properties of the pooling resampling-based method and then propose a novel Monte Carlo evaluation procedure, namely the n-times pooling resampling-based method. Theorems as well as simulations show that the proposed method can give smaller or comparable root mean squared errors and bias with much less computing time, and can therefore be strongly recommended, especially for evaluating highly computationally intensive hypothesis testing procedures in genetic epidemiology.

August 08, 2016 doi: 10.1177/0962280216661876 open full text
Reducing the width of confidence intervals for the difference between two population means by inverting adaptive tests.
O'Gorman, T. W.
Statistical Methods in Medical Research: An International Review Journal. August 08, 2016

In the last decade, it has been shown that an adaptive testing method could be used, along with the Robbins–Monro search procedure, to obtain confidence intervals that are often narrower than traditional confidence intervals. However, these confidence interval limits require a great deal of computation and some familiarity with stochastic search methods. We propose a method for estimating the limits of confidence intervals that uses only a few tests of significance. We compare these limits to those obtained by a lengthy Robbins–Monro stochastic search and find that the proposed method is nearly as accurate as the Robbins–Monro search. Adaptive confidence intervals that are produced by the proposed method are often narrower than traditional confidence intervals when the distributions are long-tailed, skewed, or bimodal. Moreover, the proposed method of estimating confidence interval limits is easy to understand, because it is based solely on the p-values from a few tests of significance.

August 08, 2016 doi: 10.1177/0962280216661745 open full text
A new parsimonious model for ordinal longitudinal data with application to subjective evaluations of a gastrointestinal disease.
Ursino, M., Gasparini, M.
Statistical Methods in Medical Research: An International Review Journal. August 08, 2016

In this paper, a new discrete statistical model for ordered categorical data is proposed via fixed-point discretization of a beta latent variable. The resulting discretized beta distribution has a highly flexible shape and it can be either over-dispersed or under-dispersed with respect to the binomial distribution. It has only two parameters, which may therefore parsimoniously depend on covariates and on random effects, providing new tools for the analysis of structured, clustered or longitudinal ordinal data. Practical examples and advice are given and an application of the new model to subjective evaluations of a gastrointestinal disease is shown.

August 08, 2016 doi: 10.1177/0962280216661370 open full text
Calibration of medical diagnostic classifier scores to the probability of disease.
Chen, W., Sahiner, B., Samuelson, F., Pezeshk, A., Petrick, N.
Statistical Methods in Medical Research: An International Review Journal. August 08, 2016

Scores produced by statistical classifiers in many clinical decision support systems and other medical diagnostic devices are generally on an arbitrary scale, so the clinical meaning of these scores is unclear. Calibration of classifier scores to a meaningful scale such as the probability of disease is potentially useful when such scores are used by a physician. In this work, we investigated three methods (parametric, semi-parametric, and non-parametric) for calibrating classifier scores to the probability of disease scale and developed uncertainty estimation techniques for these methods. We showed that classifier scores on arbitrary scales can be calibrated to the probability of disease scale without affecting their discrimination performance. With a finite dataset to train the calibration function, it is important to accompany the probability estimate with its confidence interval. Our simulations indicate that, when a dataset used for finding the transformation for calibration is also used for estimating the performance of calibration, the resubstitution bias exists for a performance metric involving the truth states in evaluating the calibration performance. However, the bias is small for the parametric and semi-parametric methods when the sample size is moderate to large (>100 per class).

August 08, 2016 doi: 10.1177/0962280216661371 open full text
An extension of generalized pairwise comparisons for prioritized outcomes in the presence of censoring.
Peron, J., Buyse, M., Ozenne, B., Roche, L., Roy, P.
Statistical Methods in Medical Research: An International Review Journal. August 02, 2016

Generalized pairwise comparisons have been proposed to permit a comprehensive assessment of several prioritized outcomes between two groups of observations. This procedure estimates Δ, the net chance of a better outcome with treatment than with control, by comparing the patients’ outcomes among all possible pairs taking one patient from the treatment group and one patient from the control group. For time-to-event outcomes, the standard procedure of generalized pairwise comparisons is analogous to Gehan’s modification of the Mann-Whitney test, which is biased in the presence of censored observations and less powerful than Efron’s modification of this test. We adapt Efron’s modification to generalized pairwise comparisons. We show how a pairwise contribution to Δ can be calculated from the estimates of the survival function in the presence of right-censored data. We performed a simulation study to assess the bias, the type I error and the power of the new procedure. The estimate of Δ with the new procedure is only slightly biased even in the presence of heavy censoring. We also show how this bias can be corrected when only one time-to-event outcome is analyzed. The new procedure has higher power in most cases compared to the standard procedure.
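
In the simplest setting, a single fully observed outcome where larger values are better, the net chance of a better outcome (often written Δ) is the proportion of treatment-control pairs won by treatment minus the proportion won by control. The sketch below covers only that case, as an illustration: it handles neither prioritized outcomes nor censoring, so it does not implement the Efron-type extension described in the abstract. The function name, the `threshold` argument and the data are hypothetical.

```python
def net_benefit(treatment, control, threshold=0.0):
    """Net chance of a better outcome for one uncensored outcome.

    A pair is a win for treatment if its value exceeds the control value by
    more than `threshold`, a loss in the opposite case, and a tie otherwise.
    """
    wins = 0
    losses = 0
    for x in treatment:
        for y in control:
            if x - y > threshold:
                wins += 1
            elif y - x > threshold:
                losses += 1
    return (wins - losses) / (len(treatment) * len(control))


# Hypothetical outcomes for three treated and three control patients.
treated_outcomes = [5, 7, 9]
control_outcomes = [4, 6, 8]
delta = net_benefit(treated_outcomes, control_outcomes)
```

Here six of the nine pairs favour treatment and three favour control, so the estimate is 3/9.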

August 02, 2016 doi: 10.1177/0962280216658320 open full text
Meta-analysis for the comparison of two diagnostic tests to a common gold standard: A generalized linear mixed model approach.
Hoyer, A., Kuss, O.
Statistical Methods in Medical Research: An International Review Journal. August 02, 2016

Meta-analysis of diagnostic studies is still a rapidly developing area of biostatistical research. In particular, there is an increasing interest in methods to compare different diagnostic tests to a common gold standard. Restricting to the case of two diagnostic tests, in these meta-analyses the parameters of interest are the differences of sensitivities and specificities (with their corresponding confidence intervals) between the two diagnostic tests while accounting for the various associations across single studies and between the two tests. We propose statistical models with a quadrivariate response (where sensitivity of test 1, specificity of test 1, sensitivity of test 2, and specificity of test 2 are the four responses) as a sensible approach to this task. Using a quadrivariate generalized linear mixed model naturally generalizes the common standard bivariate model of meta-analysis for a single diagnostic test. If information on several thresholds of the tests is available, the quadrivariate model can be further generalized to yield a comparison of full receiver operating characteristic (ROC) curves. We illustrate our model by an example where two screening methods for the diagnosis of type 2 diabetes are compared.

August 02, 2016 doi: 10.1177/0962280216661587 open full text
An overview of methods for network meta-analysis using individual participant data: when do benefits arise?
Debray, T. P., Schuit, E., Efthimiou, O., Reitsma, J. B., Ioannidis, J. P., Salanti, G., Moons, K. G., on behalf of GetReal Workpackage.
Statistical Methods in Medical Research: An International Review Journal. August 02, 2016

Network meta-analysis (NMA) is a common approach to summarizing relative treatment effects from randomized trials with different treatment comparisons. Most NMAs are based on published aggregate data (AD) and have limited possibilities for investigating the extent of network consistency and between-study heterogeneity. Given that individual participant data (IPD) are considered the gold standard in evidence synthesis, we explored statistical methods for IPD-NMA and investigated their potential advantages and limitations, compared with AD-NMA. We discuss several one-stage random-effects NMA models that account for within-trial imbalances, treatment effect modifiers, missing response data and longitudinal responses. We illustrate all models in a case study of 18 antidepressant trials with a continuous endpoint (the Hamilton Depression Score). All trials suffered from drop-out; missingness of longitudinal responses ranged from 21 to 41% after 6 weeks follow-up. Our results indicate that NMA based on IPD may lead to increased precision of estimated treatment effects. Furthermore, it can help to improve network consistency and explain between-study heterogeneity by adjusting for participant-level effect modifiers and adopting more advanced models for dealing with missing response data. We conclude that implementation of IPD-NMA should be considered when trials are affected by substantial drop-out rate, and when treatment effects are potentially influenced by participant-level covariates.

August 02, 2016 doi: 10.1177/0962280216660741 open full text
Principal component of explained variance: An efficient and optimal data dimension reduction framework for association studies.
Turgeon, M., Oualkacha, K., Ciampi, A., Miftah, H., Dehghan, G., Zanke, B. W., Benedet, A. L., Rosa-Neto, P., Greenwood, C. M., Labbe, A., for the Alzheimer's Disease Neuroimaging Initiative.
Statistical Methods in Medical Research: An International Review Journal. July 26, 2016

The genomics era has led to an increase in the dimensionality of data collected in the investigation of biological questions. In this context, dimension-reduction techniques can be used to summarise high-dimensional signals into low-dimensional ones, to further test for association with one or more covariates of interest. This paper revisits one such approach, previously known as principal component of heritability and renamed here as principal component of explained variance (PCEV). As its name suggests, the PCEV seeks a linear combination of outcomes in an optimal manner, by maximising the proportion of variance explained by one or several covariates of interest. By construction, this method optimises power; however, due to its computational complexity, it has unfortunately received little attention in the past. Here, we propose a general analytical PCEV framework that builds on the assets of the original method, i.e. conceptually simple and free of tuning parameters. Moreover, our framework extends the range of applications of the original procedure by providing a computationally simple strategy for high-dimensional outcomes, along with exact and asymptotic testing procedures that drastically reduce its computational cost. We investigate the merits of the PCEV using an extensive set of simulations. Furthermore, the use of the PCEV approach is illustrated using three examples taken from the fields of epigenetics and brain imaging.

July 26, 2016 doi: 10.1177/0962280216660128 open full text
Joint modeling of longitudinal zero-inflated count and time-to-event data: A Bayesian perspective.
Zhu, H., DeSantis, S. M., Luo, S.
Statistical Methods in Medical Research: An International Review Journal. July 26, 2016

Longitudinal zero-inflated count data are encountered frequently in substance-use research when assessing the effects of covariates and risk factors on outcomes. Often, both the time to a terminal event such as death or dropout and repeated measure count responses are collected for each subject. In this setting, the longitudinal counts are censored by the terminal event, and the time to the terminal event may depend on the longitudinal outcomes. In the study described herein, we expand the class of joint models for longitudinal and survival data to accommodate zero-inflated counts and time-to-event data by using a Cox proportional hazards model with piecewise constant baseline hazard. We use a Bayesian framework via Markov chain Monte Carlo simulations implemented in the BUGS programming language. Via an extensive simulation study, we apply the joint model and obtain estimates that are more accurate than those of the corresponding independence model. We apply the proposed method to an alpha-tocopherol, beta-carotene lung cancer prevention study.
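
The survival component mentioned above, a Cox proportional hazards model with piecewise constant baseline hazard, can be illustrated with a minimal Python sketch. This is not the authors' BUGS implementation; the function names, cut points and hazard levels are hypothetical and chosen only to show how the cumulative hazard accumulates interval by interval.

```python
import math

def cumulative_hazard(t, cut_points, hazards):
    """Cumulative baseline hazard H0(t) for a piecewise constant hazard.

    cut_points: increasing interval start times, beginning at 0.
    hazards:    hazard level on each interval; the last interval is open-ended.
    """
    total = 0.0
    for k, (start, h) in enumerate(zip(cut_points, hazards)):
        end = cut_points[k + 1] if k + 1 < len(cut_points) else math.inf
        if t <= start:
            break  # t lies before this interval, nothing more to add
        total += h * (min(t, end) - start)
    return total

def survival(t, cut_points, hazards, linear_predictor=0.0):
    """Cox-type survival S(t | x) = exp(-H0(t) * exp(linear_predictor))."""
    h0 = cumulative_hazard(t, cut_points, hazards)
    return math.exp(-h0 * math.exp(linear_predictor))

# Hypothetical baseline: hazard 0.1 on [0, 1), 0.2 on [1, 2), 0.3 from 2 onwards.
cuts, levels = [0.0, 1.0, 2.0], [0.1, 0.2, 0.3]
h_at_1_5 = cumulative_hazard(1.5, cuts, levels)
```

In a joint model, the linear predictor would also carry terms linking the hazard to the longitudinal count process; that link is omitted here.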

July 26, 2016 doi: 10.1177/0962280216659312 open full text
Propensity score matching and complex surveys.
Austin, P. C., Jembere, N., Chiu, M.
Statistical Methods in Medical Research: An International Review Journal. July 26, 2016

Researchers are increasingly using complex population-based sample surveys to estimate the effects of treatments, exposures and interventions. In such analyses, statistical methods are essential to minimize the effect of confounding due to measured covariates, as treated subjects frequently differ from control subjects. Methods based on the propensity score are increasingly popular. Minimal research has been conducted on how to implement propensity score matching when using data from complex sample surveys. We used Monte Carlo simulations to examine two critical issues when implementing propensity score matching with such data. First, we examined how the propensity score model should be formulated. We considered three different formulations depending on whether or not a weighted regression model was used to estimate the propensity score and whether or not the survey weights were included in the propensity score model as an additional covariate. Second, we examined whether matched control subjects should retain their natural survey weight or whether they should inherit the survey weight of the treated subject to which they were matched. Our results were inconclusive with respect to which method of estimating the propensity score model was preferable. In general, greater balance in measured baseline covariates and decreased bias was observed when natural retained weights were used compared to when inherited weights were used. We also demonstrated that bootstrap-based methods performed well for estimating the variance of treatment effects when outcomes are binary. We illustrated the application of our methods by using the Canadian Community Health Survey to estimate the effect of educational attainment on lifetime prevalence of mood or anxiety disorders.
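
The difference between natural (retained) and inherited survey weights can be made concrete with a small Python sketch. It assumes greedy 1:1 nearest-neighbour matching without replacement, which is only one of several possible matching algorithms and is not necessarily the one used in the paper; the records, field names (`ps`, `w`, `y`) and values are hypothetical.

```python
def greedy_match(treated, controls):
    """Greedy 1:1 nearest-neighbour matching on the propensity score, without replacement."""
    available = list(range(len(controls)))
    pairs = []
    for t_idx, t in enumerate(treated):
        if not available:
            break
        best = min(available, key=lambda c: abs(controls[c]["ps"] - t["ps"]))
        available.remove(best)
        pairs.append((t_idx, best))
    return pairs

def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def matched_difference(treated, controls, pairs, inherit):
    """Weighted difference in mean outcome between matched treated and control subjects.

    inherit=False: each matched control keeps its own (natural) survey weight.
    inherit=True:  each matched control takes the weight of its treated partner.
    """
    t_w = [treated[t]["w"] for t, _ in pairs]
    t_y = [treated[t]["y"] for t, _ in pairs]
    c_y = [controls[c]["y"] for _, c in pairs]
    c_w = t_w if inherit else [controls[c]["w"] for _, c in pairs]
    return weighted_mean(t_y, t_w) - weighted_mean(c_y, c_w)

# Hypothetical records: propensity score, survey weight, binary outcome.
treated = [{"ps": 0.60, "w": 100, "y": 1}, {"ps": 0.30, "w": 300, "y": 0}]
controls = [{"ps": 0.31, "w": 50, "y": 0}, {"ps": 0.58, "w": 150, "y": 1},
            {"ps": 0.90, "w": 500, "y": 1}]
pairs = greedy_match(treated, controls)
natural = matched_difference(treated, controls, pairs, inherit=False)
inherited = matched_difference(treated, controls, pairs, inherit=True)
```

With these toy values the two weighting schemes give different estimates from the same matched pairs, which is the choice the simulations in the paper evaluate.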

July 26, 2016 doi: 10.1177/0962280216658920 open full text
A review of statistical updating methods for clinical prediction models.
Su, T.-L., Jaki, T., Hickey, G. L., Buchan, I., Sperrin, M.
Statistical Methods in Medical Research: An International Review Journal. July 26, 2016

A clinical prediction model is a tool for predicting healthcare outcomes, usually within a specific population and context. A common approach is to develop a new clinical prediction model for each population and context; however, this wastes potentially useful historical information. A better approach is to update or incorporate the existing clinical prediction models already developed for use in similar contexts or populations. In addition, clinical prediction models commonly become miscalibrated over time, and need replacing or updating. In this article, we review a range of approaches for re-using and updating clinical prediction models; these fall into three main categories: simple coefficient updating, combining multiple previous clinical prediction models in a meta-model and dynamic updating of models. We evaluated the performance (discrimination and calibration) of the different strategies using data on mortality following cardiac surgery in the United Kingdom. We found that no single strategy performed sufficiently well to be used to the exclusion of the others. In conclusion, useful tools exist for updating existing clinical prediction models to a new population or context, and these should be implemented rather than developing a new clinical prediction model from scratch, using a breadth of complementary statistical methods.
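
As an illustration of the simplest kind of coefficient updating, the Python sketch below re-estimates only the intercept of an existing logistic model on new data while keeping its linear predictor as a fixed offset. This is a generic recalibration step under stated assumptions, not code from the review; the function names and the toy data are hypothetical, and the new sample is assumed to contain both events and non-events.

```python
import math

def expit(z):
    return 1.0 / (1.0 + math.exp(-z))

def update_intercept(linear_predictors, outcomes, lo=-20.0, hi=20.0, tol=1e-10):
    """Intercept shift a such that the mean predicted risk matches the observed event rate.

    Solves sum(expit(lp_i + a)) = sum(y_i) by bisection; this is the maximum
    likelihood estimate of an intercept when the old linear predictor is an offset.
    """
    observed = sum(outcomes)

    def excess(a):
        # Increasing in a: predicted events minus observed events.
        return sum(expit(lp + a) for lp in linear_predictors) - observed

    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2.0

# Hypothetical new population: the old model predicts 50% risk for everyone,
# but only one in four patients experiences the event.
shift = update_intercept([0.0, 0.0, 0.0, 0.0], [1, 0, 0, 0])
```

Updating the calibration slope as well, or combining several models, would need a full logistic fit and is beyond this sketch.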

July 26, 2016 doi: 10.1177/0962280215626466 open full text
Non-randomized and randomized stepped-wedge designs using an orthogonalized least squares framework.
Hu, Y., Hoover, D. R.
Statistical Methods in Medical Research: An International Review Journal. July 10, 2016

Randomized stepped-wedge (R-SW) designs are increasingly used to evaluate interventions targeting continuous longitudinal outcomes measured at T fixed time points. Typically, all units start out untreated, and randomly chosen units switch to intervention at sequential time points until all receive intervention. As randomization is not always feasible, non-randomized stepped-wedge (NR-SW) designs (units switching to intervention are not randomly chosen) have attracted researchers. We develop an orthogonalized generalized least squares framework for both R-SW and NR-SW designs. The variance of the intervention effect estimate depends on the number of steps (S), length of step sizes (t_s), and number of units (n_s) switched at each step (s=1,..., S). If all other design parameters are equal, this variance is higher for the NR-SW than for the equivalent R-SW design (particularly if the intercepts of non-randomly stepped switching strata are analyzed as fixed effects). We focus on balanced stepped-wedge (BR-SW, BNR-SW) designs (where t_s and n_s remain constant across s) to obtain insights into optimality for variance of the estimated intervention effect. As previously observed for the BR-SW, the optimal choice for number of time points at each step is also t_s = 1 for the BNR-SW. In our examples, when compared to BR-SW designs, equivalent BNR-SW designs even with intercepts of non-randomly stepped switching strata analyzed using fixed effects sacrifice little efficiency given an intra-unit repeated measure correlation ≥0.50. Compared to traditional difference-in-differences designs, optimal BNR-SW designs are more efficient with the ratio of variances of these designs converging to 0.75 when T > 10. We illustrate these findings using longitudinal outcomes in long-term care facilities.

July 10, 2016 doi: 10.1177/0962280216657852 open full text
Statistical approaches to account for missing values in accelerometer data: Applications to modeling physical activity.
Xu, S. Y., Nelson, S., Kerr, J., Godbole, S., Patterson, R., Merchant, G., Abramson, I., Staudenmayer, J., Natarajan, L.
Statistical Methods in Medical Research: An International Review Journal. July 10, 2016

Physical inactivity is a recognized risk factor for many chronic diseases. Accelerometers are increasingly used as an objective means to measure daily physical activity. One challenge in using these devices is missing data due to device nonwear. We used a well-characterized cohort of 333 overweight postmenopausal breast cancer survivors to examine missing data patterns of accelerometer outputs over the day. Based on these observed missingness patterns, we created pseudo-simulated datasets with realistic missing data patterns. We developed statistical methods to design imputation and variance weighting algorithms to account for missing data effects when fitting regression models. Bias and precision of each method were evaluated and compared. Our results indicated that not accounting for missing data in the analysis yielded unstable estimates in the regression analysis. Incorporating variance weights and/or subject-level imputation improved precision by >50%, compared to ignoring missing data. We recommend that these simple, easy-to-implement statistical tools be used to improve analysis of accelerometer data.

July 10, 2016 doi: 10.1177/0962280216657119 open full text
Bayesian prospective detection of small area health anomalies using Kullback-Leibler divergence.
Rotejanaprasert, C., Lawson, A.
Statistical Methods in Medical Research: An International Review Journal. July 10, 2016

Early detection of unusual health events depends on the ability to rapidly detect any substantial changes in disease, thus facilitating timely public health interventions. To assist public health practitioners to make decisions, statistical methods are adopted to assess unusual events in real time. We introduce a surveillance Kullback–Leibler measure for timely detection of disease outbreaks for small area health data. The detection methods are compared with the surveillance conditional predictive ordinate within the framework of Bayesian hierarchical Poisson modeling and applied to a case study of a group of respiratory system diseases observed weekly in South Carolina counties. Properties of the proposed surveillance techniques including timeliness and detection precision are investigated using a simulation study offered in the article’s supplementary materials.
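
For orientation, the Kullback–Leibler divergence between two Poisson distributions has a simple closed form, sketched below in Python. This shows only the basic quantity, not the surveillance measure proposed in the article; the function name and the example rates are hypothetical.

```python
import math

def kl_poisson(rate_p, rate_q):
    """KL(Poisson(rate_p) || Poisson(rate_q)) = rate_p * log(rate_p / rate_q) - rate_p + rate_q."""
    return rate_p * math.log(rate_p / rate_q) - rate_p + rate_q

# Hypothetical weekly rates: an expected 1 case per area versus an observed level of 2.
divergence = kl_poisson(2.0, 1.0)
```

The divergence is zero when the two rates coincide and grows as the current rate departs from the reference rate, which is why a measure of this kind can be used to flag unusual weeks.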

July 10, 2016 doi: 10.1177/0962280216652156 open full text
Error-rate estimation in discriminant analysis of non-linear longitudinal data: A comparison of resampling methods.
de la Cruz, R., Fuentes, C., Meza, C., Nunez-Anton, V.
Statistical Methods in Medical Research: An International Review Journal. July 08, 2016

Consider longitudinal observations across different subjects such that the underlying distribution is determined by a non-linear mixed-effects model. In this context, we look at the misclassification error rate for allocating future subjects using cross-validation, bootstrap algorithms (parametric bootstrap, leave-one-out, .632 and .632+), and bootstrap cross-validation (which combines the first two approaches), and conduct a numerical study to compare the performance of the different methods. The simulation and comparisons in this study are motivated by real observations from a pregnancy study in which one of the main objectives is to predict normal versus abnormal pregnancy outcomes based on information gathered at early stages. Since in studies of this type it is not uncommon to have insufficient data to simultaneously solve the classification problem and estimate the misclassification error rate, we pay special attention to situations when only a small sample size is available. We discuss how the misclassification error rate estimates may be affected by the sample size in terms of variability and bias, and examine conditions under which the misclassification error rate estimates perform reasonably well.
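
The .632 and .632+ estimators named above combine the resubstitution error with the leave-one-out bootstrap error. The Python sketch below follows the standard definitions of these estimators and is not code from the article; the function names and input error rates are hypothetical, and computing the three inputs from a fitted classifier is left out.

```python
def error_632(resub_error, loo_boot_error):
    """.632 estimator: fixed-weight blend of resubstitution and leave-one-out bootstrap error."""
    return 0.368 * resub_error + 0.632 * loo_boot_error

def error_632_plus(resub_error, loo_boot_error, no_info_rate):
    """.632+ estimator: the weight on the bootstrap error grows with relative overfitting."""
    boot = min(loo_boot_error, no_info_rate)  # cap at the no-information error rate
    if boot > resub_error and no_info_rate > resub_error:
        overfit = (boot - resub_error) / (no_info_rate - resub_error)
    else:
        overfit = 0.0  # no measurable overfitting: reduces to the .632 weighting
    weight = 0.632 / (1.0 - 0.368 * overfit)
    return (1.0 - weight) * resub_error + weight * boot

# Hypothetical inputs: 10% resubstitution error, 30% leave-one-out bootstrap error,
# 50% no-information error rate.
plain = error_632(0.10, 0.30)
plus = error_632_plus(0.10, 0.30, 0.50)
```

When the relative overfitting rate is zero the two estimators coincide; as it approaches one, the .632+ estimate moves towards the leave-one-out bootstrap error.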

July 08, 2016 doi: 10.1177/0962280216656246 open full text
A Bayesian multi-stage cost-effectiveness design for animal studies in stroke research.
Cai, C., Ning, J., Huang, X.
Statistical Methods in Medical Research: An International Review Journal. July 08, 2016

Much progress has been made in the area of adaptive designs for clinical trials. However, little has been done regarding adaptive designs to identify optimal treatment strategies in animal studies. Motivated by an animal study of a novel strategy for treating strokes, we propose a Bayesian multi-stage cost-effectiveness design to simultaneously identify the optimal dose and determine the therapeutic treatment window for administrating the experimental agent. We consider a non-monotonic pattern for the dose–schedule–efficacy relationship and develop an adaptive shrinkage algorithm to assign more cohorts to admissible strategies. We conduct simulation studies to evaluate the performance of the proposed design by comparing it with two standard designs. These simulation studies show that the proposed design yields a significantly higher probability of selecting the optimal strategy, while it is generally more efficient and practical in terms of resource usage.

July 08, 2016 doi: 10.1177/0962280216657853 open full text
Semiparametric models for multilevel overdispersed count data with extra zeros.
Mahmoodi, M., Moghimbeigi, A., Mohammad, K., Faradmal, J.
Statistical Methods in Medical Research: An International Review Journal. July 07, 2016

This study proposes semiparametric models for analysis of hierarchical count data containing excess zeros and overdispersion simultaneously. The methods discussed in this paper handle nonlinear covariate effects through flexible semiparametric multilevel regression techniques. This is performed by providing a comprehensive comparison of semiparametric multilevel zero-inflated negative binomial and semiparametric multilevel zero-inflated generalized Poisson models under the real and simulated data. An EM algorithm based on Newton–Raphson equations for maximum penalized likelihood estimation approach is developed. The performance of the proposed models is assessed by using a Monte Carlo simulation study. We also illustrated the methods by the analysis of decayed, missing, and filled teeth of children aged 5–14 years old.
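
The zero-inflated negative binomial distribution at the core of one of the compared models can be written down in a few lines of Python. This is a sketch of the probability mass function only, using the mean and size parameterization of the negative binomial as an assumption; the semiparametric multilevel structure and the EM estimation are not shown, and the parameter values are hypothetical.

```python
import math

def nb_pmf(k, mu, size):
    """Negative binomial probability of count k with mean mu and dispersion (size) parameter."""
    log_p = (math.lgamma(k + size) - math.lgamma(size) - math.lgamma(k + 1)
             + size * math.log(size / (size + mu))
             + k * math.log(mu / (size + mu)))
    return math.exp(log_p)

def zinb_pmf(k, pi, mu, size):
    """Zero-inflated NB: with probability pi a structural zero, otherwise an NB count."""
    base = (1.0 - pi) * nb_pmf(k, mu, size)
    return pi + base if k == 0 else base

# Hypothetical parameters: 30% structural zeros, NB mean 2, size 1.
p_zero = zinb_pmf(0, 0.3, 2.0, 1.0)
total = sum(zinb_pmf(k, 0.3, 2.0, 1.0) for k in range(500))
```

The probability of a zero exceeds what the negative binomial alone would give, which is how the model absorbs excess zeros while the size parameter handles overdispersion.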

July 07, 2016 doi: 10.1177/0962280216657376 open full text
Exact confidence limits for the response rate in two-stage designs with over- or under-enrollment in the second stage.
Shan, G.
Statistical Methods in Medical Research: An International Review Journal. July 07, 2016

Simon’s two-stage design has been widely used in early phase clinical trials to assess the activity of a new investigated treatment. In practice, the actual sample sizes do not always follow the study design precisely, especially in the second stage. When over- or under-enrollment occurs in a study, the original critical values for the study design are no longer valid for making proper statistical inference in a clinical trial. The hypothesis for such studies is always one-sided, and the null hypothesis is rejected when only a few responses are observed. Therefore, a one-sided lower interval is suitable to test the hypothesis. The commonly used approaches for confidence interval construction are based on asymptotic approaches. These approaches generally do not guarantee the coverage probability. For this reason, the Clopper-Pearson approach can be used to compute exact confidence intervals. This approach has to be used in conjunction with a method to order the sample space. The frequently used method is based on point estimates for the response rate, but this ordering has too many ties, which lead to conservativeness of the exact intervals. We propose developing exact one-sided intervals based on the p-value to order the sample space. The proposed approach outperforms the existing asymptotic and exact approaches. Therefore, it is recommended for use in practice.
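
To show what an exact one-sided lower limit looks like in the simplest case, the Python sketch below computes the Clopper-Pearson lower limit for a single-stage binomial sample. It deliberately ignores the two-stage structure and the sample-space ordering that the article addresses, so it is a simplified illustration rather than the proposed method; the function names are hypothetical.

```python
import math

def upper_tail(x, n, p):
    """P(X >= x) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(x, n + 1))

def exact_lower_limit(x, n, alpha=0.05, tol=1e-10):
    """One-sided Clopper-Pearson lower limit: the p at which P(X >= x | p) equals alpha."""
    if x == 0:
        return 0.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if upper_tail(x, n, mid) < alpha:
            lo = mid  # tail still too small: the limit lies above mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical single-stage result: 10 responders among 10 patients.
limit = exact_lower_limit(10, 10)
```

In a two-stage design the same inversion idea applies, but outcomes are pairs of stage-wise response counts and must first be ordered, which is where the choice between point-estimate and p-value ordering matters.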

July 07, 2016 doi: 10.1177/0962280216650918 open full text
Collaborative targeted maximum likelihood estimation for variable importance measure: Illustration for functional outcome prediction in mild traumatic brain injuries.
Pirracchio, R., Yue, J. K., Manley, G. T., van der Laan, M. J., Hubbard, A. E., the TRACK-TBI Investigators including Wayne A Gordon, Hester F Lingsma, Andrew IR Maas, Pratik Mukherjee, David O Okonkwo, David M Schnyer, Alex B Valadka and Esther L Yuh.
Statistical Methods in Medical Research: An International Review Journal. June 29, 2016

Standard statistical practice used for determining the relative importance of competing causes of disease typically relies on ad hoc methods, often byproducts of machine learning procedures (stepwise regression, random forest, etc.). Causal inference framework and data-adaptive methods may help to tailor parameters to match the clinical question and free one from arbitrary modeling assumptions. Our focus is on implementations of such semiparametric methods for a variable importance measure (VIM). We propose a fully automated procedure for VIM based on collaborative targeted maximum likelihood estimation (cTMLE), a method that optimizes the estimate of an association in the presence of potentially numerous competing causes. We applied the approach to data collected from traumatic brain injury patients, specifically a prospective, observational study including three US Level-1 trauma centers. The primary outcome was a disability score (Glasgow Outcome Scale - Extended (GOSE)) collected three months post-injury. We identified clinically important predictors among a set of risk factors using a variable importance analysis based on targeted maximum likelihood estimators (TMLE) and on cTMLE. Via a parametric bootstrap, we demonstrate that the latter procedure has the potential for robust automated estimation of variable importance measures based upon machine-learning algorithms. The cTMLE estimator was associated with substantially less positivity bias as compared to TMLE and larger coverage of the 95% CI. This study confirms the power of an automated cTMLE procedure that can target model selection via machine learning to estimate VIMs in complicated, high-dimensional data.

June 29, 2016 doi: 10.1177/0962280215627335 open full text
Estimating average attributable fractions with confidence intervals for cohort and case-control studies.
Ferguson, J., Alvarez-Iglesias, A., Newell, J., Hinde, J., O’Donnell, M.
Statistical Methods in Medical Research: An International Review Journal. June 24, 2016

Chronic diseases tend to depend on a large number of risk factors, both environmental and genetic. Average attributable fractions were introduced by Eide and Gefeller as a way of partitioning overall disease burden into contributions from individual risk factors; this may be useful in deciding which risk factors to target in disease interventions. Here, we introduce new estimation methods for average attributable fractions that are appropriate for both case–control designs and prospective studies. Confidence intervals, derived using Monte Carlo simulation, are also described. Finally, we introduce a novel approximation for the sample average attributable fraction that will ensure a computationally tractable approach when the number of risk factors is large. An R package, averisk, implementing the methods described in this manuscript can be downloaded from the CRAN repository.
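
The idea of an average attributable fraction, averaging each risk factor's sequential contribution over every order in which the factors could be removed, can be sketched in Python as follows. This is a brute-force illustration of the definition, not the estimation method or the approximation proposed in the article, and not the interface of the averisk package; the factor names and joint attributable fractions are hypothetical.

```python
from itertools import permutations

def average_attributable_fractions(factors, joint_af):
    """Average each factor's sequential attributable fraction over all removal orders.

    joint_af maps a frozenset of removed factors to the fraction of disease
    burden eliminated when exactly those factors are removed.
    """
    totals = {f: 0.0 for f in factors}
    orderings = list(permutations(factors))
    for order in orderings:
        removed = frozenset()
        for f in order:
            with_f = removed | {f}
            totals[f] += joint_af[with_f] - joint_af[removed]
            removed = with_f
    return {f: totals[f] / len(orderings) for f in factors}

# Hypothetical joint attributable fractions for two risk factors.
joint = {
    frozenset(): 0.0,
    frozenset({"smoking"}): 0.3,
    frozenset({"hypertension"}): 0.2,
    frozenset({"smoking", "hypertension"}): 0.4,
}
aaf = average_attributable_fractions(["smoking", "hypertension"], joint)
```

The averaged fractions add up to the joint attributable fraction for all factors combined. Enumerating every ordering grows factorially with the number of factors, which is why an approximation is needed when many risk factors are involved.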

June 24, 2016 doi: 10.1177/0962280216655374 open full text
Sequential designs with small samples: Evaluation and recommendations for normal responses.
Nikolakopoulos, S., Roes, K. C., van der Tweel, I.
Statistical Methods in Medical Research: An International Review Journal. June 24, 2016

Sequential monitoring is a well-known methodology for the design and analysis of clinical trials. Driven by the lower expected sample size, recent guidelines and published research suggest the use of sequential methods for the conduct of clinical trials in rare diseases. However, the vast majority of the developed and most commonly used sequential methods rely on asymptotic assumptions concerning the distribution of the test statistics. It is not uncommon for trials in (very) rare diseases to be conducted with only a few tens of patients, and the use of sequential methods that rely on large-sample approximations could inflate the type I error probability. Additionally, the setting of a rare disease could make the traditional paradigm of designing a clinical trial (deciding on the sample size given type I and II errors and anticipated effect size) irrelevant. One could think of the situation where the number of patients available has a maximum and this should be utilized in the most efficient way. In this work, we evaluate the operational characteristics of sequential designs in the setting of very small to moderate sample sizes with normally distributed outcomes and demonstrate the necessity of simple corrections of the critical boundaries. We also suggest a method for deciding on an optimal sequential design given a maximum sample size and some (data driven or based on expert opinion) prior belief on the treatment effect.

June 24, 2016 doi: 10.1177/0962280216653778 open full text
Passive imputation and parcel summaries are both valid to handle missing items in studies with many multi-item scales.
Eekhout, I., de Vet, H. C., de Boer, M. R., Twisk, J. W., Heymans, M. W.
Statistical Methods in Medical Research: An International Review Journal. June 22, 2016

Previous studies showed that missing data in multi-item scales can best be handled by multiple imputation of item scores. However, when many scales are used, the number of items will become too large for the imputation model to reliably estimate imputations. A solution is to use passive imputation or parcel summary scores, which combine items and consequently reduce the number of variables in the imputation model. The performance of these methods was evaluated in a simulation study and illustrated in an example. Passive imputation, which updates scale scores from imputed items, and parcel summary scores that use the average over available item scores were compared to using all items simultaneously, imputing total scores of scales and complete-case analysis. Scale scores and coefficient estimates from linear regression were compared to "true" parameters on bias and precision. Passive imputation and using parcel summaries showed smaller bias and more precision than imputing total scores and complete-case analyses. Passive imputation and parcel summary scores are valid missing data solutions in studies that include many multi-item scales.
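
A parcel summary score, described above as the average over available item scores, amounts to the following Python sketch. The function name and the convention of marking missing items with `None` are assumptions made for illustration.

```python
def parcel_summary(item_scores):
    """Average of the observed item scores of one scale; None marks a missing item."""
    available = [s for s in item_scores if s is not None]
    if not available:
        return None  # nothing observed: the parcel itself is missing
    return sum(available) / len(available)

# Hypothetical three-item scale with the second item missing.
score = parcel_summary([3, None, 5])
```

One such summary per scale can then enter the imputation model in place of the individual items, which is what keeps the number of variables manageable.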

June 22, 2016 doi: 10.1177/0962280216654511 open full text
Bayesian spatially dependent variable selection for small area health modeling.
Choi, J., Lawson, A. B.
Statistical Methods in Medical Research: An International Review Journal. June 16, 2016

Statistical methods for spatial health data to identify the significant covariates associated with the health outcomes are of critical importance. Most studies have developed variable selection approaches in which the covariates included appear within the spatial domain and their effects are fixed across space. However, the impact of covariates on health outcomes may change across space and ignoring this behavior in spatial epidemiology may cause the wrong interpretation of the relations. Thus, the development of a statistical framework for spatial variable selection is important to allow for the estimation of the space-varying patterns of covariate effects as well as the early detection of disease over space. In this paper, we develop flexible spatial variable selection approaches to find the spatially-varying subsets of covariates with significant effects. A Bayesian hierarchical latent model framework is applied to account for spatially-varying covariate effects. We present a simulation example to examine the performance of the proposed models with the competing models. We apply our models to a county-level low birth weight incidence dataset in Georgia.

June 16, 2016 doi: 10.1177/0962280215627184 open full text
Unbiased estimation for response adaptive clinical trials.
Bowden, J., Trippa, L.
Statistical Methods in Medical Research: An International Review Journal. June 16, 2016

Bayesian adaptive trials have the defining feature that the probability of randomization to a particular treatment arm can change as information becomes available as to its true worth. However, there is still a general reluctance to implement such designs in many clinical settings. One area of concern is that their frequentist operating characteristics are poor or, at least, poorly understood. We investigate the bias induced in the maximum likelihood estimate of a response probability parameter, p, for binary outcome by the process of adaptive randomization. We discover that it is small in magnitude and, under mild assumptions, can only be negative – causing one’s estimate to be closer to zero on average than the truth. A simple unbiased estimator for p is obtained, but it is shown to have a large mean squared error. Two approaches are therefore explored to improve its precision based on inverse probability weighting and Rao–Blackwellization. We illustrate these estimation strategies using two well-known designs from the literature.

June 16, 2016 doi: 10.1177/0962280215597716 open full text
An approach for quantifying small effects in regression models.
Bedrick, E. J., Hund, L.
Statistical Methods in Medical Research: An International Review Journal. June 14, 2016

We develop a novel approach for quantifying small effects in regression models. Our method is based on variation in the mean function, in contrast to methods that focus on regression coefficients. Our idea applies in diverse settings such as testing for a negligible trend and quantifying differences in regression functions across strata. Straightforward Bayesian methods are proposed for inference. Four examples are used to illustrate the ideas.

June 14, 2016 doi: 10.1177/0962280216653152 open full text
Kernel machine score test for pathway analysis in the presence of semi-competing risks.
Neykov, M., Hejblum, B. P., Sinnott, J. A.
Statistical Methods in Medical Research: An International Review Journal. June 02, 2016

In cancer studies, patients often experience two different types of events: a non-terminal event such as recurrence or metastasis, and a terminal event such as cancer-specific death. Identifying pathways and networks of genes associated with one or both of these events is an important step in understanding disease development and targeting new biological processes for potential intervention. These correlated outcomes are commonly dealt with by modeling progression-free survival, where the event time is the minimum between the times of recurrence and death. However, identifying pathways only associated with progression-free survival may miss out on pathways that affect time to recurrence but not death, or vice versa. We propose a combined testing procedure for a pathway’s association with both the cause-specific hazard of recurrence and the marginal hazard of death. The dependency between the two outcomes is accounted for through perturbation resampling to approximate the test’s null distribution, without any further assumption on the nature of the dependency. Even complex non-linear relationships between pathways and disease progression or death can be uncovered thanks to a flexible kernel machine framework. The superior statistical power of our approach is demonstrated in numerical studies and in a gene expression study of breast cancer.

June 02, 2016 doi: 10.1177/0962280216653427 open full text
Fitting the data from embryo implantation prediction: Learning from label proportions.
Hernandez-Gonzalez, J., Inza, I., Crisol-Ortiz, L., Guembe, M. A., Inarra, M. J., Lozano, J. A.
Statistical Methods in Medical Research: An International Review Journal. May 30, 2016

Machine learning techniques have been previously used to assist clinicians to select embryos for human-assisted reproduction. This work aims to show how an appropriate modeling of the problem can contribute to improve machine learning techniques for embryo selection. In this study, a dataset of 330 consecutive cycles (and associated embryos) carried out by the Unit of Assisted Reproduction of the Hospital Donostia (Spain) throughout 18 months has been analyzed. The problem of the embryo selection has been modeled by a novel weakly supervised paradigm, learning from label proportions, which considers all the available data, including embryos whose fate cannot be certainly established. Furthermore, all the collected features, describing cycles and embryos, have been considered in a multi-variate data analysis. Our integral solution has been successfully tested. Experimental results show that the proposed technique consistently outperforms an equivalent approach based on standard supervised classification. Embryos in this study were selected for transference according to the criteria of the Spanish Association for Reproduction Biology Studies. Obtained classification models outperform these criteria, specifically reordering medium-quality embryos.

May 30, 2016 doi: 10.1177/0962280216651098 open full text
A novel complete-case analysis to determine statistical significance between treatments in an intention-to-treat population of randomized clinical trials involving missing data.
Liu, W., Ding, J.
Statistical Methods in Medical Research: An International Review Journal. May 25, 2016

The application of the principle of the intention-to-treat (ITT) to the analysis of clinical trials is challenged in the presence of missing outcome data. The consequences of stopping an assigned treatment in a withdrawn subject are unknown. It is difficult to make a single assumption about missing mechanisms for all clinical trials because there are complicated reactions in the human body to drugs due to the presence of complex biological networks, leading to data missing randomly or non-randomly. Currently there is no statistical method that can tell whether a difference between two treatments in the ITT population of a randomized clinical trial with missing data is significant at a pre-specified level. Making no assumptions about the missing mechanisms, we propose a generalized complete-case (GCC) analysis based on the data of completers. An evaluation of the impact of missing data on the ITT analysis reveals that a statistically significant GCC result implies a significant treatment effect in the ITT population at a pre-specified significance level unless, relative to the comparator, the test drug is poisonous to the non-completers as documented in their medical records. Applications of the GCC analysis are illustrated using literature data, and its properties and limits are discussed.

May 25, 2016 doi: 10.1177/0962280216651307 open full text
A review and comparison of Bayesian and likelihood-based inferences in beta regression and zero-or-one-inflated beta regression.
Liu, F., Eugenio, E. C.
Statistical Methods in Medical Research: An International Review Journal. May 25, 2016

Beta regression is an increasingly popular statistical technique in medical research for modeling of outcomes that assume values in (0, 1), such as proportions and patient reported outcomes. When outcomes take values in the intervals [0,1), (0,1], or [0,1], zero-or-one-inflated beta (zoib) regression can be used. We provide a thorough review on beta regression and zoib regression in the modeling, inferential, and computational aspects via the likelihood-based and Bayesian approaches. We demonstrate the statistical and practical importance of correctly modeling the inflation at zero/one rather than ad hoc replacing them with values close to zero/one via simulation studies; the latter approach can lead to biased estimates and invalid inferences. We show via simulation studies that the likelihood-based approach is computationally faster in general than MCMC algorithms used in the Bayesian inferences, but runs the risk of non-convergence, large biases, and sensitivity to starting values in the optimization algorithm especially with clustered/correlated data, data with sparse inflation at zero and one, and data that warrant regularization of the likelihood. The disadvantages of the regular likelihood-based approach make the Bayesian approach an attractive alternative in these cases. Software packages and tools for fitting beta and zoib regressions in both the likelihood-based and Bayesian frameworks are also reviewed.
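
For the likelihood-based side of the comparison, a beta regression log-likelihood can be sketched in Python under the commonly used mean and precision parameterization with a logit link. These modeling choices are assumptions made for illustration rather than details taken from the review, and the function names and toy data are hypothetical. Outcomes must lie strictly inside (0, 1); exact zeros or ones need the zoib extension.

```python
import math

def beta_logpdf(y, mu, phi):
    """Log density of Beta(mu * phi, (1 - mu) * phi) at y, for 0 < y < 1."""
    a, b = mu * phi, (1.0 - mu) * phi
    return (math.lgamma(phi) - math.lgamma(a) - math.lgamma(b)
            + (a - 1.0) * math.log(y) + (b - 1.0) * math.log(1.0 - y))

def beta_regression_loglik(coefs, phi, design_rows, outcomes):
    """Log-likelihood with logit link: mu_i = expit(x_i' coefs), common precision phi."""
    total = 0.0
    for row, y in zip(design_rows, outcomes):
        eta = sum(c * x for c, x in zip(coefs, row))
        mu = 1.0 / (1.0 + math.exp(-eta))
        total += beta_logpdf(y, mu, phi)
    return total

# Hypothetical intercept-only fit: mu = 0.5 and phi = 2 give the uniform density.
loglik = beta_regression_loglik([0.0], 2.0, [[1.0], [1.0]], [0.2, 0.7])
```

Maximizing this function over the coefficients and the precision requires a numerical optimizer, and that optimization step is where the convergence and starting-value issues noted above arise.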

May 25, 2016 doi: 10.1177/0962280216650699 open full text
An efficient genome-wide association test for mixed binary and continuous phenotypes with applications to substance abuse research.
Buu, A., Williams, L. K., Yang, J. J.
Statistical Methods in Medical Research: An International Review Journal. May 22, 2016

We propose a new genome-wide association test for mixed binary and continuous phenotypes that uses an efficient numerical method to estimate the empirical distribution of Fisher’s combination statistic under the null hypothesis. Our simulation study shows that the proposed method controls the type I error rate and also maintains its power at the level of the permutation method. More importantly, the computational efficiency of the proposed method is much higher than that of the permutation method. The simulation results also indicate that the power of the test increases when the genetic effect increases, the minor allele frequency increases, and the correlation between responses decreases. The statistical analysis on the database of the Study of Addiction: Genetics and Environment demonstrates that the proposed method combining multiple phenotypes can increase the power of identifying markers that might not otherwise be chosen using marginal tests.
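For orientation, the combination statistic referred to here can be sketched in its textbook form. The code below is a minimal illustration that assumes independent p-values, in which case the statistic follows a chi-square distribution with 2k degrees of freedom. With correlated phenotypes that reference distribution no longer holds, which is why the abstract estimates the null distribution numerically. The function name `fisher_combination` is hypothetical.

```python
import math

def fisher_combination(p_values):
    """Return (statistic, p_value) for Fisher's combination of p-values.

    Assumes the p-values are independent; this is NOT the numerical method
    proposed in the paper for correlated phenotypes.
    """
    if not p_values or any(not (0.0 < p <= 1.0) for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    stat = -2.0 * sum(math.log(p) for p in p_values)
    # Chi-square survival function with 2k degrees of freedom (closed form
    # for even degrees of freedom): exp(-x/2) * sum_{i<k} (x/2)^i / i!
    half = stat / 2.0
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return stat, math.exp(-half) * total
```

A permutation approach would instead recompute `stat` on many phenotype-shuffled data sets and compare the observed value with that empirical distribution.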

May 22, 2016 doi: 10.1177/0962280216647422 open full text
Valid statistical inference methods for a case-control study with missing data.
Tian, G.-L., Zhang, C., Jiang, X.
Statistical Methods in Medical Research: An International Review Journal. May 19, 2016

The main objective of this paper is to derive the valid sampling distribution of the observed counts in a case–control study with missing data under the assumption of missing at random by employing the conditional sampling method and the mechanism augmentation method. The proposed sampling distribution, called the case–control sampling distribution, can be used to calculate the standard errors of the maximum likelihood estimates of parameters via the Fisher information matrix and to generate independent samples for constructing small-sample bootstrap confidence intervals. Theoretical comparisons of the new case–control sampling distribution with two existing sampling distributions exhibit a large difference. Simulations are conducted to investigate the influence of the three different sampling distributions on statistical inferences. One finding is that the conclusion of the Wald test for independence under the two existing sampling distributions can be completely different from (and even contradict) that of the Wald test for equality of the success probabilities in the control/case groups under the proposed distribution. A real cervical cancer data set is used to illustrate the proposed statistical methods.

May 19, 2016 doi: 10.1177/0962280216649619 open full text
A measure of association for ordered categorical data in population-based studies.
Nelson, K. P., Edwards, D.
Statistical Methods in Medical Research: An International Review Journal. May 16, 2016

Ordinal classification scales are commonly used to define a patient’s disease status in screening and diagnostic tests such as mammography. Challenges arise in agreement studies when evaluating the association between many raters’ classifications of patients’ disease or health status when an ordered categorical scale is used. In this paper, we describe a population-based approach and chance-corrected measure of association to evaluate the strength of relationship between multiple raters’ ordinal classifications where any number of raters can be accommodated. In contrast to Shrout and Fleiss’ intraclass correlation coefficient, the proposed measure of association is invariant with respect to changes in disease prevalence. We demonstrate how unique characteristics of individual raters can be explored using random effects. Simulation studies are conducted to demonstrate the properties of the proposed method under varying assumptions. The methods are applied to two large-scale agreement studies of breast cancer screening and prostate cancer severity.

May 16, 2016 doi: 10.1177/0962280216643347 open full text
Missing continuous outcomes under covariate dependent missingness in cluster randomised trials.
Hossain, A., Diaz-Ordaz, K., Bartlett, J. W.
Statistical Methods in Medical Research: An International Review Journal. May 13, 2016

Attrition is a common occurrence in cluster randomised trials which leads to missing outcome data. Two approaches for analysing such trials are cluster-level analysis and individual-level analysis. This paper compares the performance of unadjusted cluster-level analysis, baseline covariate adjusted cluster-level analysis and linear mixed model analysis, under baseline covariate dependent missingness in continuous outcomes, in terms of bias, average estimated standard error and coverage probability. The methods of complete records analysis and multiple imputation are used to handle the missing outcome data. We considered four scenarios, with the missingness mechanism and baseline covariate effect on outcome either the same or different between intervention groups. We show that both unadjusted cluster-level analysis and baseline covariate adjusted cluster-level analysis give unbiased estimates of the intervention effect only if both intervention groups have the same missingness mechanisms and there is no interaction between baseline covariate and intervention group. Linear mixed model and multiple imputation give unbiased estimates under all four considered scenarios, provided that an interaction of intervention and baseline covariate is included in the model when appropriate. Cluster mean imputation has been proposed as a valid approach for handling missing outcomes in cluster randomised trials. We show that cluster mean imputation only gives unbiased estimates when the missingness mechanism is the same between the intervention groups and there is no interaction between baseline covariate and intervention group. Multiple imputation shows overcoverage for a small number of clusters in each intervention group.

May 13, 2016 doi: 10.1177/0962280216648357 open full text
Survival analysis with delayed entry in selected families with application to human longevity.
Rodriguez-Girondo, M., Deelen, J., Slagboom, E. P., Houwing-Duistermaat, J. J.
Statistical Methods in Medical Research: An International Review Journal. May 13, 2016

In the field of aging research, family-based sampling study designs are commonly used to study the lifespans of long-lived family members. However, the specific sampling procedure should be carefully taken into account in order to avoid biases. This work is motivated by the Leiden Longevity Study, a family-based cohort of long-lived siblings. Families were invited to participate in the study if at least two siblings were ‘long-lived’, where ‘long-lived’ meant being older than 89 years for men or older than 91 years for women. As a result, more than 400 families were included in the study and followed for around 10 years. For estimation of marker-specific survival probabilities and correlations among life times of family members, delayed entry due to outcome-dependent sampling mechanisms has to be taken into account. We consider shared frailty models to model left-truncated correlated survival data. The treatment of left truncation in shared frailty models is still an open issue and the literature on this topic is scarce. We show that the current approaches provide, in general, biased estimates and we propose a new method to tackle this selection problem by applying a correction on the likelihood estimation by means of inverse probability weighting at the family level.

May 13, 2016 doi: 10.1177/0962280216648356 open full text
A Bayesian approach for estimating under-reported dengue incidence with a focus on non-linear associations between climate and dengue in Dhaka, Bangladesh.
Sharmin, S., Glass, K., Viennet, E., Harley, D.
Statistical Methods in Medical Research: An International Review Journal. May 13, 2016

Determining the relation between climate and dengue incidence is challenging due to under-reporting of disease and consequent biased incidence estimates. Non-linear associations between climate and incidence compound this. Here, we introduce a modelling framework to estimate dengue incidence from passive surveillance data while incorporating non-linear climate effects. We estimated the true number of cases per month using a Bayesian generalised linear model, developed in stages to adjust for under-reporting. A semi-parametric thin-plate spline approach was used to quantify non-linear climate effects. The approach was applied to data collected from the national dengue surveillance system of Bangladesh. The model estimated that only 2.8% (95% credible interval 2.7–2.8) of all cases in the capital Dhaka were reported through passive case reporting. The optimal mean monthly temperature for dengue transmission is 29℃ and average monthly rainfall above 15 mm decreases transmission. Our approach provides an estimate of true incidence and an understanding of the effects of temperature and rainfall on dengue transmission in Dhaka, Bangladesh.

May 13, 2016 doi: 10.1177/0962280216649216 open full text
A censored quantile regression approach for the analysis of time to event data.
Xue, X., Xie, X., Strickler, H. D.
Statistical Methods in Medical Research: An International Review Journal. May 10, 2016

The commonly used statistical model for studying time to event data, the Cox proportional hazards model, is limited by the assumption of a constant hazard ratio over time (i.e., proportionality), and the fact that it models the hazard rate rather than the survival time directly. The censored quantile regression model, defined on the quantiles of time to event, provides an alternative that is more flexible and interpretable. However, the censored quantile regression model has not been widely adopted in clinical research, due to the complexity involved in interpreting its results properly and consequently the difficulty of appreciating its advantages over the Cox proportional hazards model, as well as the absence of an adequate validation procedure. In this paper, we addressed these limitations by (1) using both simulated examples and data from National Wilms’ Tumor clinical trials to illustrate proper interpretation of the censored quantile regression model and the differences and the advantages of the model compared to the Cox proportional hazards model; and (2) developing a validation procedure for the predictive censored quantile regression model. The performance of this procedure was examined using simulation studies. Overall, we recommend the use of the censored quantile regression model, which permits a more sensitive analysis of time to event data, together with the Cox proportional hazards model.

May 10, 2016 doi: 10.1177/0962280216648724 open full text
Investigating covariate-by-centre interaction in survival data.
Biard, L., Labopin, M., Chevret, S., Resche-Rigon, M., on behalf of the Acute Leukaemia Working Party of the EBMT.
Statistical Methods in Medical Research: An International Review Journal. May 10, 2016

In survival analysis, assessing the existence of potential centre effects on the baseline hazard or on the effect of fixed covariates on the baseline hazard, such as treatment-by-centre interaction, is a frequent clinical concern in multicentre studies. Survival models with random effects on the baseline hazard and/or on the effect of the covariates of interest have been largely applied, for instance, to investigate potential centre effects. We aimed to develop a procedure to routinely test for multiple random effects in survival analyses. We propose a statistic and a permutation approach to test whether all or a subset of components of the variance-covariance matrix of random effects are non-zero in a mixed-effects Cox model framework. Performances of the proposed permutation tests are examined under different null hypotheses corresponding to the different components of the variance-covariance matrix, i.e., to the different random effects considered on the baseline hazard and/or on the covariates effects. Several alternative hypotheses are evaluated using simulations. The results indicate that the permutation tests have valid type I error rates under the null and achieve satisfactory power under all alternatives. The procedure is applied to two European cohorts of haematological stem cell transplants in acute leukaemia to investigate the heterogeneity across centres in leukaemia-free survival and the potential heterogeneity in prognostic factors effects across centres.

May 10, 2016 doi: 10.1177/0962280216647981 open full text
Bayesian optimal response-adaptive design for binary responses using stopping rule.
Komaki, F., Biswas, A.
Statistical Methods in Medical Research: An International Review Journal. May 02, 2016

Response-adaptive designs are used in phase III clinical trials to allocate a larger number of patients to the better treatment arm. Optimal designs have been explored in recent years in the context of response-adaptive designs, but only from the frequentist viewpoint. In the present paper, we propose some response-adaptive designs for two treatments based on Bayesian prediction for phase III clinical trials. Some properties are studied and numerically compared with some existing competitors. A real data set is used to illustrate the applicability of the proposed methodology where we redesign the experiment using parameters derived from the data set.

May 02, 2016 doi: 10.1177/0962280216647210 open full text
Extended likelihood ratio test-based methods for signal detection in a drug class with application to FDA’s adverse event reporting system database.
Zhao, Y., Yi, M., Tiwari, R. C.
Statistical Methods in Medical Research: An International Review Journal. May 02, 2016

A likelihood ratio test, recently developed for the detection of signals of adverse events for a drug of interest in the FDA Adverse Events Reporting System database, is extended to detect signals of adverse events simultaneously for all the drugs in a drug class. The extended likelihood ratio test methods, based on the Poisson model (Ext-LRT) and the zero-inflated Poisson model (Ext-ZIP-LRT), are discussed and are analytically shown, like the likelihood ratio test method, to control the type-I error and false discovery rate. Simulation studies are performed to evaluate the performance characteristics of Ext-LRT and Ext-ZIP-LRT. The proposed methods are applied to the Gadolinium drug class in the FAERS database. An in-house likelihood ratio test tool, incorporating the Ext-LRT methodology, is being developed at the Food and Drug Administration.

May 02, 2016 doi: 10.1177/0962280216646678 open full text
Dynamic prediction of recurrent events data by landmarking with application to a follow-up study of patients after kidney transplant.
Musoro, J., Struijk, G., Geskus, R., ten Berge, I., Zwinderman, A.
Statistical Methods in Medical Research: An International Review Journal. May 02, 2016

This paper extends dynamic prediction by landmarking to recurrent event data. The motivating data comprised post-kidney transplantation records of repeated infections and repeated measurements of multiple markers. At each landmark time point t_s, a Cox proportional hazards model with a frailty term was fitted using data of individuals who were at risk at landmark s. This model included the time-updated marker values at t_s as time-fixed covariates. Based on a stacked data set that merged all landmark data sets, we considered supermodels that allow parameters to depend on the landmarks in a smooth fashion. We described and evaluated four ways to parameterize the supermodels for recurrent event data. With both the study data and simulated data sets, we compared supermodels that were fitted on stacked data sets that consisted of either overlapping or non-overlapping landmark periods. We observed that for recurrent event data, the supermodels may yield biased estimates when overlapping landmark periods are used for stacking. Using the best supermodel amongst the ones considered, we dynamically estimated the probability to remain infection free between t_s and a prediction horizon t_hor, conditional on the information available at t_s.

May 02, 2016 doi: 10.1177/0962280216643563 open full text
Sigmoidal mixed models for longitudinal data.
Capuano, A. W., Wilson, R. S., Leurgans, S. E., Dawson, J. D., Bennett, D. A., Hedeker, D.
Statistical Methods in Medical Research: An International Review Journal. April 28, 2016

Linear mixed models are widely used to analyze longitudinal cognitive data. Often, however, the trajectory of cognitive function is nonlinear. For example, some participants may experience cognitive decline that accelerates as death approaches. Polynomial regression and piecewise linear models are common approaches used to characterize nonlinear trajectories, although both have assumptions that may not correspond with the actual trajectories. An alternative is to use a flexible sigmoidal mixed model based on the logistic family of curves. We describe a general class of such a model, which has up to five parameters, representing (1) final level, (2) rate of decline, (3) midpoint of decline, (4) initial level before decline, and (5) asymmetry. Focusing on a four-parameter symmetric sub-class of the model, with random effects on two of the parameters, we demonstrate that a likelihood approach to fitting this model produces accurate estimates of mean levels across time, even in the case of model misspecification. We also illustrate the method on deceased participants who had completed at least 5 years of annual cognitive testing and annual assessment of body mass. We show that departures from a stable body can modify the trajectory curves and anticipate cognitive decline.

April 28, 2016 doi: 10.1177/0962280216645632 open full text
Assessing methods for dealing with treatment switching in clinical trials: A follow-up simulation study.
Latimer, N. R., Abrams, K. R., Lambert, P. C., Morden, J. P., Crowther, M. J.
Statistical Methods in Medical Research: An International Review Journal. April 25, 2016

When patients randomised to the control group of a randomised controlled trial are allowed to switch onto the experimental treatment, intention-to-treat analyses of the treatment effect are confounded because the separation of randomised groups is lost. Previous research has investigated statistical methods that aim to estimate the treatment effect that would have been observed had this treatment switching not occurred and has demonstrated their performance in a limited set of scenarios. Here, we investigate these methods in a new range of realistic scenarios, allowing conclusions to be made based upon a broader evidence base. We simulated randomised controlled trials incorporating prognosis-related treatment switching and investigated the impact of sample size, reduced switching proportions, disease severity, and alternative data-generating models on the performance of adjustment methods, assessed through a comparison of bias, mean squared error, and coverage, related to the estimation of true restricted mean survival in the absence of switching in the control group. Rank preserving structural failure time models, inverse probability of censoring weights, and two-stage methods consistently produced less bias than the intention-to-treat analysis. The switching proportion was confirmed to be a key determinant of bias: sample size and censoring proportion were relatively less important. It is critical to determine the size of the treatment effect in terms of an acceleration factor (rather than a hazard ratio) to provide information on the likely bias associated with rank-preserving structural failure time model adjustments. In general, inverse probability of censoring weight methods are more volatile than other adjustment methods.

April 25, 2016 doi: 10.1177/0962280216642264 open full text
Class-imbalanced subsampling lasso algorithm for discovering adverse drug reactions.
Ahmed, I., Pariente, A., Tubert-Bitter, P.
Statistical Methods in Medical Research: An International Review Journal. April 25, 2016

Background
All methods routinely used to generate safety signals from pharmacovigilance databases rely on disproportionality analyses of counts aggregating patients’ spontaneous reports. Recently, it was proposed to analyze individual spontaneous reports directly using Bayesian lasso logistic regressions. Nevertheless, this raises the issue of choosing an adequate regularization parameter in a variable selection framework while accounting for computational constraints due to the high dimension of the data.
Purpose
Our main objective is to propose a method, which exploits the subsampling idea from Stability Selection, a variable selection procedure combining subsampling with a high-dimensional selection algorithm, and adapts it to the specificities of the spontaneous reporting data, the latter being characterized by their large size, their binary nature and their sparsity.
Materials and method
Given the large imbalance existing between the presence and absence of a given adverse event, we propose an alternative subsampling scheme to that of Stability Selection resulting in an over-representation of the minority class and a drastic reduction in the number of observations in each subsample. Simulations are used to help define the detection threshold as regards the average proportion of false signals. They are also used to compare the performances of the proposed sampling scheme with that originally proposed for Stability Selection. Finally, we compare the proposed method to the gamma Poisson shrinker, a disproportionality method, and to a lasso logistic regression approach through an empirical study conducted on the French national pharmacovigilance database and two sets of reference signals.
Results
Simulations show that the proposed sampling strategy performs better in terms of false discoveries and is faster than the equiprobable sampling of Stability Selection. The empirical evaluation illustrates the better performances of the proposed method compared with gamma Poisson shrinker and the lasso in terms of number of reference signals retrieved.

April 25, 2016 doi: 10.1177/0962280216643116 open full text
A new likelihood approach to inference about predictive values of diagnostic tests in paired designs.
Tsou, T.-S.
Statistical Methods in Medical Research: An International Review Journal. April 25, 2016

Intuitively, one only needs patients with two positive screening test results for positive predictive values comparison, and those with two negative screening test results for contrasting negative predictive values. Nevertheless, current existing methods rely on the multinomial model that includes superfluous parameters unnecessary for specific comparisons. This practice results in complex statistics formulas. We introduce a novel likelihood approach that fits the intuition by including a minimum number of parameters of interest in paired designs. It is demonstrated that our robust score test statistic is identical to a newly proposed weighted generalized score test statistic. Simulations and real data analysis are used for illustration.

April 25, 2016 doi: 10.1177/0962280216634755 open full text
An evaluation of bias in propensity score-adjusted non-linear regression models.
Wan, F., Mitra, N.
Statistical Methods in Medical Research: An International Review Journal. April 19, 2016

Propensity score methods are commonly used to adjust for observed confounding when estimating the conditional treatment effect in observational studies. One popular method, covariate adjustment of the propensity score in a regression model, has been empirically shown to be biased in non-linear models. However, no compelling underlying theoretical reason has been presented. We propose a new framework to investigate bias and consistency of propensity score-adjusted treatment effects in non-linear models that uses a simple geometric approach to forge a link between the consistency of the propensity score estimator and the collapsibility of non-linear models. Under this framework, we demonstrate that adjustment of the propensity score in an outcome model results in the decomposition of observed covariates into the propensity score and a remainder term. Omission of this remainder term from a non-collapsible regression model leads to biased estimates of the conditional odds ratio and conditional hazard ratio, but not for the conditional rate ratio. We further show, via simulation studies, that the bias in these propensity score-adjusted estimators increases with larger treatment effect size, larger covariate effects, and increasing dissimilarity between the coefficients of the covariates in the treatment model versus the outcome model.

April 19, 2016 doi: 10.1177/0962280216643739 open full text
Cumulative sum control charts for monitoring geometrically inflated Poisson processes: An application to infectious disease counts data.
Rakitzis, A. C., Castagliola, P., Maravelakis, P. E.
Statistical Methods in Medical Research: An International Review Journal. April 14, 2016

In this work, we study upper-sided cumulative sum control charts that are suitable for monitoring geometrically inflated Poisson processes. We assume that a process is properly described by a two-parameter extension of the zero-inflated Poisson distribution, which can be used for modeling count data with an excessive number of zero and non-zero values. Two different upper-sided cumulative sum-type schemes are considered, both suitable for the detection of increasing shifts in the average of the process. Aspects of their statistical design are discussed and their performance is compared under various out-of-control situations. Changes in both parameters of the process are considered. Finally, the monitoring of the monthly cases of poliomyelitis in the USA is given as an illustrative example.
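As a point of reference, the generic upper-sided cumulative sum recursion can be sketched as follows. This is the standard textbook scheme, not either of the two schemes studied in the paper; the reference value `k`, the decision limit `h`, and the function name `upper_cusum` are illustrative assumptions, and some definitions signal on a strict inequality instead.

```python
def upper_cusum(counts, k, h):
    """Run an upper-sided CUSUM over a sequence of counts.

    Recursion: C_t = max(0, C_{t-1} + x_t - k), starting from C_0 = 0.
    A signal is raised at the first t with C_t >= h (assumed convention).
    Returns the CUSUM path and the index of the first signal (None if none).
    """
    c, path, first_signal = 0.0, [], None
    for t, x in enumerate(counts):
        c = max(0.0, c + x - k)   # accumulate only upward deviations from k
        path.append(c)
        if first_signal is None and c >= h:
            first_signal = t
    return path, first_signal
```

In practice `k` and `h` would be chosen from the in-control distribution (here, a geometrically inflated Poisson model) to achieve a target in-control average run length.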

April 14, 2016 doi: 10.1177/0962280216641985 open full text
Statistical methods for body mass index: A selective review.
Yu, K., Liu, X., Alhamzawi, R., Becker, F., Lord, J.
Statistical Methods in Medical Research: An International Review Journal. April 11, 2016

Obesity rates have been increasing over recent decades, causing significant concern among policy makers. Excess body fat, commonly measured by body mass index, is a major risk factor for several common disorders including diabetes and cardiovascular disease, placing a substantial burden on health care systems. To guide effective public health action, we need to understand the complex system of intercorrelated influences on body mass index. This paper, based on all eligible articles searched from Global health, Medline and Web of Science databases, reviews both classical and modern statistical methods for body mass index analysis. We give a description of each of these methods, exploring the classification, links and differences between them and the reasons for choosing one over the others in different settings. We aim to provide a key resource and statistical library for researchers in public health and medicine to deal with obesity and body mass index data analysis.

April 11, 2016 doi: 10.1177/0962280216643117 open full text
Use of instrumental variables in electronic health record-driven models.
Salmasi, L., Capobianco, E.
Statistical Methods in Medical Research: An International Review Journal. April 07, 2016

Precision medicine presents various methodological challenges whose assessment requires the consideration of multiple factors. In particular, the data multitude in the Electronic Health Records poses interoperability issues and requires novel inference strategies. A problem, though apparently a paradox, is that highly specific treatments and a variety of outcomes may hardly match with consistent observations (i.e., large samples). Why is it the case? Owing to the heterogeneity of Electronic Health Records, models for the evaluation of treatment effects need to be selected, and in some cases, the use of instrumental variables might be necessary. We studied the recently defined person-centered treatment effects in cancer and C-section contexts from Electronic Health Record sources and identified as an instrument the distance of patients from hospitals. We present first the rationale for using such instrument and then its model implementation. While for cancer patients consideration of distance turns out to be a penalty, implying a negative effect on the probability of receiving surgery, a positive effect is instead found in C-section due to higher propensity of scheduling delivery. Overall, the estimated person-centered treatment effects reveal a high degree of heterogeneity, whose interpretation remains context-dependent. With regard to the use of instruments in light of our two case studies, our suggestion is that this process requires ad hoc variable selection for both covariates and instruments and additional testing to ensure validity.

April 07, 2016 doi: 10.1177/0962280216641154 open full text
Continuous time Markov chain approaches for analyzing transtheoretical models of health behavioral change: A case study and comparison of model estimations.
Ma, J., Chan, W., Tilley, B. C.
Statistical Methods in Medical Research: An International Review Journal. April 04, 2016

Continuous time Markov chain models are frequently employed in medical research to study disease progression but are rarely applied to the transtheoretical model, a psychosocial model widely used in the studies of health-related outcomes. The transtheoretical model often includes more than three states and conceptually allows for all possible instantaneous transitions (referred to as general continuous time Markov chain). This complicates the likelihood function because it involves calculating a matrix exponential that may not be simplified for general continuous time Markov chain models. We undertook a Bayesian approach wherein we numerically evaluated the likelihood using ordinary differential equation solvers available from the GNU Scientific Library. We compared our Bayesian approach with the maximum likelihood method implemented with the R package MSM. Our simulation study showed that the Bayesian approach provided more accurate point and interval estimates than the maximum likelihood method, especially in complex continuous time Markov chain models with five states. When applied to data from a four-state transtheoretical model collected from a nutrition intervention study in the next step trial, we observed results consistent with the results of the simulation study. Specifically, the two approaches provided comparable point estimates and standard errors for most parameters, but the maximum likelihood offered substantially smaller standard errors for some parameters. Comparable estimates of the standard errors are obtainable from package MSM, which works only when the model estimation algorithm converges.
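The matrix exponential mentioned above is the transition probability matrix P(t) = exp(Qt) for a generator matrix Q. The sketch below approximates it with a truncated Taylor series combined with scaling and squaring, a different technique from the ordinary differential equation solvers the abstract describes. The function names and the default `squarings` and `terms` values are illustrative assumptions.

```python
def mat_mul(A, B):
    """Multiply two square matrices given as lists of lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transition_matrix(Q, t, squarings=10, terms=12):
    """Approximate P(t) = exp(Q t) for a CTMC generator Q.

    Scaling and squaring: compute exp(Q t / 2**squarings) with a truncated
    Taylor series, then square the result `squarings` times.
    """
    n = len(Q)
    scale = t / (2 ** squarings)
    A = [[q * scale for q in row] for row in Q]
    P = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in P]          # current Taylor term, A^m / m!
    for m in range(1, terms + 1):
        term = [[v / m for v in row] for row in mat_mul(term, A)]
        P = [[P[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        P = mat_mul(P, P)
    return P
```

For a panel-data likelihood, each observed transition from state i to state j over an interval of length t would contribute the (i, j) entry of `transition_matrix(Q, t)`.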

April 04, 2016 doi: 10.1177/0962280216639859 open full text
Exact tests in binary data under an incomplete block crossover design.
Lui, K.-J., Chang, K.-C.
Statistical Methods in Medical Research: An International Review Journal. March 21, 2016

To improve the power of a parallel groups design and reduce the time length of a crossover trial, we may consider an incomplete block crossover design. Under a distribution-free random effects logistic regression model, we derive an exact test and a Mantel-Haenszel type of summary test procedure for testing non-equality in binary data when comparing three treatments. We employ Monte Carlo simulation to evaluate the performance of these test procedures. We find that both test procedures developed here can perform well in a variety of situations. We use the data taken as a part of the crossover trial comparing the low and high doses of an analgesic with a placebo for the relief of pain in primary dysmenorrhea to illustrate the use of the proposed test procedures.

March 21, 2016 doi: 10.1177/0962280216638382
Correcting for dependent censoring in routine outcome monitoring data by applying the inverse probability censoring weighted estimator.
Willems, S., Schat, A., van Noorden, M., Fiocco, M.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Censored data make survival analysis more complicated because exact event times are not observed. Statistical methodology developed to account for censored observations assumes that patients’ withdrawal from a study is independent of the event of interest. However, in practice, some covariates might be associated with both the lifetime and the censoring mechanism, inducing dependent censoring. In this case, standard survival techniques, such as the Kaplan–Meier estimator, give biased results. The inverse probability censoring weighted estimator was developed to correct for bias due to dependent censoring. In this article, we explore the use of inverse probability censoring weighting methodology and describe why it is effective in removing the bias. Since implementing this method is highly time consuming and requires programming and mathematical skills, we propose a user-friendly algorithm in R. Applications to a toy example and to a medical data set illustrate how the algorithm works. A simulation study was carried out to investigate the performance of the inverse probability censoring weighted estimators in situations where dependent censoring is present in the data. In the simulation process, different sample sizes, strengths of the censoring model, and percentages of censored individuals were chosen. Results show that in each scenario inverse probability censoring weighting reduces the bias induced in the traditional Kaplan–Meier approach where dependent censoring is ignored.
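
The weighting idea in this abstract can be sketched in a few lines. The example below is not the authors' R algorithm: it assumes a single discrete covariate (called `strata` here, a hypothetical name), estimates the censoring distribution by Kaplan–Meier within each stratum, and weights each observed event by the inverse of the estimated probability of still being uncensored just before that event. It also assumes that event times and censoring times are not tied.

```python
def km_before(times, flags, t):
    """Kaplan-Meier survival just before time t for outcomes marked flags == 1."""
    surv = 1.0
    jump_times = sorted({u for u, f in zip(times, flags) if f == 1 and u < t})
    for s in jump_times:
        at_risk = sum(1 for u in times if u >= s)
        d = sum(1 for u, f in zip(times, flags) if u == s and f == 1)
        surv *= 1.0 - d / at_risk
    return surv


def ipcw_survival(times, events, strata, t):
    """IPCW estimate of S(t); events[i] is 1 for an event and 0 for censoring.

    The censoring survival G(u- | stratum) is estimated within each stratum,
    so censoring may depend on the stratifying covariate.
    """
    n = len(times)
    total = 0.0
    for ti, di, zi in zip(times, events, strata):
        if di == 1 and ti <= t:
            idx = [j for j in range(n) if strata[j] == zi]
            stratum_times = [times[j] for j in idx]
            censor_flags = [1 - events[j] for j in idx]
            total += 1.0 / km_before(stratum_times, censor_flags, ti)
    return 1.0 - total / n
```

When all subjects share one stratum and there are no tied event and censoring times, this weighted estimate coincides with the ordinary Kaplan–Meier estimate, which gives a convenient sanity check. The bias correction only appears when censoring differs across strata.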

March 17, 2016 doi: 10.1177/0962280216628900
Identification of predicted individual treatment effects in randomized clinical trials.
Lamont, A., Lyons, M. D., Jaki, T., Stuart, E., Feaster, D. J., Tharmaratnam, K., Oberski, D., Ishwaran, H., Wilson, D. K., Van Horn, M. L.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

In most medical research, treatment effectiveness is assessed using the average treatment effect or some version of subgroup analysis. The practice of individualized or precision medicine, however, requires new approaches that predict how an individual will respond to treatment, rather than relying on aggregate measures of effect. In this study, we present a conceptual framework for estimating individual treatment effects, referred to as predicted individual treatment effects. We first apply the predicted individual treatment effect approach to a randomized controlled trial designed to improve behavioral and physical symptoms. Despite trivial average effects of the intervention, we show substantial heterogeneity in predicted individual treatment response using the predicted individual treatment effect approach. The predicted individual treatment effects can be used to predict individuals for whom the intervention may be most effective (or harmful). Next, we conduct a Monte Carlo simulation study to evaluate the accuracy of predicted individual treatment effects. We compare the performance of two methods used to obtain predictions: multiple imputation and non-parametric random decision trees. Results showed that, on average, both predictive methods produced accurate estimates at the individual level; however, the random decision trees tended to underestimate the predicted individual treatment effect for people at the extreme and showed more variability in predictions across repetitions compared to the imputation approach. Limitations and future directions are discussed.

March 17, 2016 doi: 10.1177/0962280215623981
Bayesian bivariate meta-analysis of correlated effects: Impact of the prior distributions on the between-study correlation, borrowing of strength, and joint inferences.
Burke, D. L., Bujkiewicz, S., Riley, R. D.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Multivariate random-effects meta-analysis allows the joint synthesis of correlated results from multiple studies, for example, for multiple outcomes or multiple treatment groups. In a Bayesian univariate meta-analysis of one endpoint, the importance of specifying a sensible prior distribution for the between-study variance is well understood. However, in multivariate meta-analysis, there is little guidance about the choice of prior distributions for the variances or, crucially, the between-study correlation, ρ_B; for the latter, researchers often use a Uniform(–1,1) distribution assuming it is vague. In this paper, an extensive simulation study and a real illustrative example are used to examine the impact of various (realistically) vague prior distributions for ρ_B and the between-study variances within a Bayesian bivariate random-effects meta-analysis of two correlated treatment effects. A range of diverse scenarios are considered, including complete and missing data, to examine the impact of the prior distributions on posterior results (for treatment effect and between-study correlation), amount of borrowing of strength, and joint predictive distributions of treatment effectiveness in new studies. Two key recommendations are identified to improve the robustness of multivariate meta-analysis results. First, the routine use of a Uniform(–1,1) prior distribution for ρ_B should be avoided, if possible, as it is not necessarily vague. Instead, researchers should identify a sensible prior distribution, for example, by restricting values to be positive or negative as indicated by prior knowledge. Second, it remains critical to use sensible (e.g. empirically based) prior distributions for the between-study variances, as an inappropriate choice can adversely impact the posterior distribution for ρ_B, which may then adversely affect inferences such as joint predictive probabilities. These recommendations are especially important with a small number of studies and missing data.

March 17, 2016 doi: 10.1177/0962280216631361
Impact of intervention targeting risk factors on chronic disease burden.
Wanneveich, M., Jacqmin-Gadda, H., Dartigues, J.-F., Joly, P.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

The aging of the population is accompanied by a sharp rise in the prevalence of chronic diseases such as dementia. These diseases generally cannot be prevented or cured and persist over time, with a progressive deterioration of health, requiring specific care. To reduce the burden of these diseases, it is appropriate to propose interventions targeting disease risk factors, but the association between most of these risk factors and mortality makes it difficult to anticipate the potential impact of such interventions. A method was previously proposed to estimate changes in disease prevalence following an intervention targeting subjects at a given age where the incidence of the disease is supposed to be null. Here, we propose a general framework to make projections for life expectancies with and without the disease, the age at onset, and the lifelong probability of the disease, and to evaluate the consequences of preventive interventions targeting risk factors on these various measures of disease burden. The methodology takes into account the mortality trend over calendar time and age in both healthy and diseased subjects, and the change in mortality due to the intervention. The method is applied to make projections for dementia in 2030 according to several scenarios of public health interventions.

March 17, 2016 doi: 10.1177/0962280216631360
Extrapolation of efficacy and other data to support the development of new medicines for children: A systematic review of methods.
Wadsworth, I., Hampson, L. V., Jaki, T.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Objective
When developing new medicines for children, the potential to extrapolate from adult data to reduce the experimental burden in children is well recognised. However, significant assumptions about the similarity of adults and children are needed for extrapolations to be biologically plausible. We reviewed the literature to identify statistical methods that could be used to optimise extrapolations in paediatric drug development programmes.
Methods
Web of Science was used to identify papers proposing methods relevant for using data from a ‘source population’ to support inferences for a ‘target population’. Four key areas of methods development were targeted: paediatric clinical trials, trials extrapolating efficacy across ethnic groups or geographic regions, the use of historical data in contemporary clinical trials and using short-term endpoints to support inferences about long-term outcomes.
Results
Searches identified 626 papers of which 52 met our inclusion criteria. From these we identified 102 methods comprising 58 Bayesian and 44 frequentist approaches. Most Bayesian methods (n = 54) sought to use existing data in the source population to create an informative prior distribution for a future clinical trial. Of these, 46 allowed the source data to be down-weighted to account for potential differences between populations. Bayesian and frequentist versions of methods were found for assessing whether key parameters of source and target populations are commensurate (n = 34). Fourteen frequentist methods synthesised data from different populations using a joint model or a weighted test statistic.
Conclusions
Several methods were identified as potentially applicable to paediatric drug development. Methods which can accommodate a heterogeneous target population and which allow data from a source population to be down-weighted are preferred. Methods assessing the commensurability of parameters may be used to determine whether it is appropriate to pool data across age groups to estimate treatment effects.

March 17, 2016 doi: 10.1177/0962280216631359
Phase I/II dose-finding design for molecularly targeted agent: Plateau determination using adaptive randomization.
Riviere, M.-K., Yuan, Y., Jourdan, J.-H., Dubois, F., Zohar, S.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Conventionally, phase I dose-finding trials aim to determine the maximum tolerated dose of a new drug under the assumption that both toxicity and efficacy monotonically increase with the dose. This paradigm, however, is not suitable for some molecularly targeted agents, such as monoclonal antibodies, for which efficacy often increases initially with the dose and then plateaus. For molecularly targeted agents, the goal is to find the optimal dose, defined as the lowest safe dose that achieves the highest efficacy. We develop a Bayesian phase I/II dose-finding design to find the optimal dose. We employ a logistic model with a plateau parameter to capture the increasing-then-plateau feature of the dose–efficacy relationship. We take the weighted likelihood approach to accommodate the case where efficacy is possibly late-onset. Based on observed data, we continuously update the posterior estimates of toxicity and efficacy probabilities and adaptively assign patients to the optimal dose. The simulation studies show that the proposed design has good operating characteristics. This method is going to be applied in more than two phase I clinical trials as no other method is available for this specific setting. We also provide an R package, dfmta, that can be downloaded from the CRAN website.

March 17, 2016 doi: 10.1177/0962280216631763
Missing value imputation for physical activity data measured by accelerometer.
Lee, J. A., Gill, J.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

An accelerometer, a wearable motion sensor on the hip or wrist, is becoming a popular tool in clinical and epidemiological studies for measuring physical activity. Such data provide a series of activity counts at every minute or even more often and display a person’s activity pattern throughout a day. Unfortunately, the collected data can include irregular missing intervals because of noncompliance of participants, which makes the statistical analysis more challenging. The purpose of this study is to develop a novel imputation method to handle the multivariate count data, motivated by the accelerometer data structure. We specify the predictive distribution of the missing data with a mixture of zero-inflated Poisson and log-normal distributions, which is shown to be effective in dealing with the minute-by-minute autocorrelation as well as under- and over-dispersion of count data. The imputation is performed at the minute level and follows the principles of multiple imputation using a fully conditional specification with the chained algorithm. To facilitate the practical use of this method, we provide an R package, accelmissing. Our method is demonstrated using 2003–2004 National Health and Nutrition Examination Survey data.

March 17, 2016 doi: 10.1177/0962280216633248
Study duration for three-arm non-inferiority survival trials designed for accrual by cohorts.
Wu, Y., Li, Y., Hou, Y., Li, K., Zhou, X.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Study planning is particularly complex for survival trials because it usually involves an accrual period and a continued observation period after accrual closure. The three-arm clinical trial design, which includes a test treatment, an active reference, and a placebo control, is the gold standard design for the assessment of non-inferiority. The existing statistical methods of calculating minimal sample size for non-inferiority trials with three-arm design and survival-type endpoints cannot take into consideration the accrual rate of patients to the trial, the length of accrual period, the length of continued observation period after accrual closure, and unbalanced allocation of the total sample size. The purpose of this paper is to develop a statistical method, which allows for all these sources of variability for planning non-inferiority trials with the gold standard design for censored, exponentially distributed time-to-event data. The proposed method is based on the assumption of exponentially distributed failure times and a non-inferiority test formulated in terms of the retention of effect hypotheses. It can be used to calculate the duration of accrual required to assure a desired power for non-inferiority trials with active and placebo control. We illustrate the use of the method by considering a randomized, active- and placebo-controlled trial in depression associated with Parkinson’s disease. We then explore the validity of the proposed method by simulation studies. An R-language program for the implementation of the proposed algorithm is provided as supplementary material.

March 17, 2016 doi: 10.1177/0962280216633908
Detecting influential observations in a model-based cluster analysis.
Bruckers, L., Molenberghs, G., Verbeke, G., Geys, H.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Finite mixture models have been used to model population heterogeneity and to relax distributional assumptions. These models are also convenient tools for clustering and classification of complex data such as repeated-measurements data. The performance of model-based clustering algorithms is sensitive to influential and outlying observations. Methods for identifying outliers in a finite mixture model have been described in the literature. Approaches to identify influential observations are less common. In this paper, we apply local-influence diagnostics to a finite mixture model with a known number of components. The methodology is illustrated on real-life data.

March 17, 2016 doi: 10.1177/0962280216634112
Modelling the distribution of health-related quality of life of advanced melanoma patients in a longitudinal multi-centre clinical trial using M-quantile random effects regression.
Borgoni, R., Del Bianco, P., Salvati, N., Schmid, T., Tzavidis, N.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Health-related quality of life assessment is important in the clinical evaluation of patients with metastatic disease and may offer useful information for understanding the clinical effectiveness of a treatment. To assess whether a set of explanatory variables affects health-related quality of life, regression models are routinely adopted. However, the interest of researchers may be focussed on modelling other parts (e.g. quantiles) of this conditional distribution. In this paper, we present an approach based on quantile and M-quantile regression to achieve this goal. We applied the methodologies to a prospective, randomized, multi-centre clinical trial. In order to take into account the hierarchical nature of the data, we extended the M-quantile regression model to a three-level random effects specification and estimated it by maximum likelihood.

March 17, 2016 doi: 10.1177/0962280216636651
Sample size determinations for stepped-wedge clinical trials from a three-level data hierarchy perspective.
Heo, M., Kim, N., Rinke, M. L., Wylie-Rosett, J.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Stepped-wedge (SW) designs have been steadily implemented in a variety of trials. An SW design typically assumes a three-level hierarchical data structure where participants are nested within times or periods which are in turn nested within clusters. Therefore, statistical models for analysis of SW trial data need to consider two correlations, the first- and second-level correlations. Existing power functions and sample size determination formulas have been derived based on statistical models for two-level data structures. Consequently, the second-level correlation has not been incorporated in conventional power analyses. In this paper, we derived a closed-form explicit power function based on a statistical model for three-level continuous outcome data. The power function is based on a pooled overall estimate of stratified cluster-specific estimates of an intervention effect. The sampling distribution of the pooled estimate is derived by applying a fixed-effect meta-analytic approach. Simulation studies verified that the derived power function is unbiased and is applicable to varying numbers of participants per period per cluster. In addition, when data structures are assumed to have two levels, we compare three types of power functions by conducting additional simulation studies under a two-level statistical model. In this case, the power function based on a sampling distribution of a marginal, as opposed to pooled, estimate of the intervention effect performed the best. Extensions of power functions to binary outcomes are also suggested.

March 17, 2016 doi: 10.1177/0962280216632564
Methods for meta-analysis of pharmacodynamic dose-response data with application to multi-arm studies of alogliptin.
Langford, O., Aronson, J. K., van Valkenhoef, G., Stevens, R. J.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2016

Standard methods for meta-analysis of dose–response data in epidemiology assume a model with a single scalar parameter, such as log-linear relationships between exposure and outcome; such models are implicitly unbounded. In contrast, in pharmacology, multi-parameter models, such as the widely used E_max model, are used to describe relationships that are bounded above and below. We propose methods for estimating the parameters of a dose–response model by meta-analysis of summary data from the results of randomized controlled trials of a drug, in which each trial uses multiple doses of the drug of interest (possibly including dose 0 or placebo). We assume that, for each randomized arm of each trial, the mean and standard error of a continuous response measure and the corresponding allocated dose are available. We consider weighted least squares fitting of the model to the mean and dose pairs from all arms of all studies, and a two-stage procedure in which scalar inverse-variance meta-analysis is performed at each dose, and the dose–response model is fitted to the results by weighted least squares. We then compare these with two further methods inspired by network meta-analysis that fit the model to the contrasts between doses. We illustrate the methods by estimating the parameters of the E_max model to a collection of multi-arm, multiple-dose, randomized controlled trials of alogliptin, a drug for the management of diabetes mellitus, and further examine the properties of the four methods with sensitivity analyses and a simulation study. We find that all four methods produce broadly comparable point estimates for the parameters of most interest, but a single-stage method based on contrasts between doses produces the most appropriate confidence intervals. Although simpler methods may have pragmatic advantages, such as the use of standard software for scalar meta-analysis, more sophisticated methods are nevertheless preferable for their advantages in estimation.
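
Two building blocks mentioned in this abstract can be written down compactly: the scalar inverse-variance pooling used at each dose in the two-stage procedure, and the E_max curve itself. The sketch below is illustrative only. It assumes the common three-parameter form E(d) = E_0 + E_max d / (ED_50 + d), which the abstract does not spell out, and the function and parameter names are hypothetical.

```python
from math import sqrt


def inverse_variance_pool(means, standard_errors):
    """Fixed-effect inverse-variance pooled mean and its standard error."""
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled = sum(w * m for w, m in zip(weights, means)) / total_weight
    return pooled, sqrt(1.0 / total_weight)


def emax_response(dose, e0, e_max, ed50):
    """Assumed three-parameter E_max curve: bounded between e0 and e0 + e_max."""
    return e0 + e_max * dose / (ed50 + dose)
```

In a two-stage analysis along these lines, `inverse_variance_pool` would be applied to the arm-level means at each distinct dose, and the curve would then be fitted to the pooled values by weighted least squares using the pooled standard errors as weights.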

March 17, 2016 doi: 10.1177/0962280216637093
Bayesian joint modeling for assessing the progression of chronic kidney disease in children.
Armero, C., Forte, A., Perpinan, H., Sanahuja, M. J., Agusti, S.
Statistical Methods in Medical Research: An International Review Journal. March 16, 2016

Joint models are rich and flexible models for analyzing longitudinal data with nonignorable missing data mechanisms. This article proposes a Bayesian random-effects joint model to assess the evolution of a longitudinal process in terms of a linear mixed-effects model that accounts for heterogeneity between the subjects, serial correlation, and measurement error. Dropout is modeled in terms of a survival model with competing risks and left truncation. The model is applied to data coming from ReVaPIR, a project involving children with chronic kidney disease whose evolution is mainly assessed through longitudinal measurements of glomerular filtration rate.

March 16, 2016 doi: 10.1177/0962280216628560
Evaluating hospital infection control measures for antimicrobial-resistant pathogens using stochastic transmission models: Application to vancomycin-resistant enterococci in intensive care units.
Wei, Y., Kypraios, T., O'Neill, P. D., Huang, S. S., Rifas-Shiman, S. L., Cooper, B. S.
Statistical Methods in Medical Research: An International Review Journal. March 16, 2016

Nosocomial pathogens such as methicillin-resistant Staphylococcus aureus (MRSA) and vancomycin-resistant Enterococci (VRE) are the cause of significant morbidity and mortality among hospital patients. It is important to be able to assess the efficacy of control measures using data on patient outcomes. In this paper, we describe methods for analysing such data using patient-level stochastic models which seek to describe the underlying unobserved process of transmission. The methods are applied to detailed longitudinal patient-level data on vancomycin-resistant Enterococci from a study in a US hospital with eight intensive care units (ICUs). The data comprise admission and discharge dates, dates and results of screening tests, and dates during which precautionary measures were in place for each patient during the study period. Results include estimates of the efficacy of the control measures, the proportion of unobserved patients colonized with vancomycin-resistant Enterococci, and the proportion of patients colonized on admission.

March 16, 2016 doi: 10.1177/0962280215627299
Bayesian clinical classification from high-dimensional data: Signatures versus variability.
Shalabi, A., Inoue, M., Watkins, J., De Rinaldis, E., Coolen, A. C.
Statistical Methods in Medical Research: An International Review Journal. March 16, 2016

When data exhibit imbalance between a large number d of covariates and a small number n of samples, clinical outcome prediction is impaired by overfitting and prohibitive computation demands. Here we study two simple Bayesian prediction protocols that can be applied to data of any dimension and any number of outcome classes. Calculating Bayesian integrals and optimal hyperparameters analytically leaves only a small number of numerical integrations, and CPU demands scale as O(nd). We compare their performance on synthetic and genomic data to the mclustDA method of Fraley and Raftery. For small d they perform as well as mclustDA or better. For d = 10,000 or more mclustDA breaks down computationally, while the Bayesian methods remain efficient. This allows us to explore phenomena typical of classification in high-dimensional spaces, such as overfitting and the reduced discriminative effectiveness of signatures compared to intra-class variability.

March 16, 2016 doi: 10.1177/0962280216628901
Responsiveness-informed multiple imputation and inverse probability-weighting in cohort studies with missing data that are non-monotone or not missing at random.
Doidge, J. C.
Statistical Methods in Medical Research: An International Review Journal. March 16, 2016

Population-based cohort studies are invaluable to health research because of the breadth of data collection over time, and the representativeness of their samples. However, they are especially prone to missing data, which can compromise the validity of analyses when data are not missing at random. Having many waves of data collection presents opportunity for participants’ responsiveness to be observed over time, which may be informative about missing data mechanisms and thus useful as an auxiliary variable. Modern approaches to handling missing data such as multiple imputation and maximum likelihood can be difficult to implement with the large numbers of auxiliary variables and large amounts of non-monotone missing data that occur in cohort studies. Inverse probability-weighting can be easier to implement but conventional wisdom has stated that it cannot be applied to non-monotone missing data. This paper describes two methods of applying inverse probability-weighting to non-monotone missing data, and explores the potential value of including measures of responsiveness in either inverse probability-weighting or multiple imputation. Simulation studies are used to compare methods and demonstrate that responsiveness in longitudinal studies can be used to mitigate bias induced by missing data, even when data are not missing at random.

March 16, 2016 doi: 10.1177/0962280216628902
A robust semi-parametric warping estimator of the survivor function with an application to two-group comparisons.
Hutson, A. D.
Statistical Methods in Medical Research: An International Review Journal. March 16, 2016

In this note, we develop a novel semi-parametric estimator of the survival curve that is comparable to the product-limit estimator under very relaxed assumptions. The estimator is based on a beta parametrization that warps the empirical distribution of the observed censored and uncensored data. The parameters are obtained using a pseudo-maximum likelihood approach that adjusts the survival curve to account for the censored observations. In the univariate setting, the new estimator tends to better extend the range of the survival estimation given a high degree of censoring. However, the key feature of this paper is that we develop a new two-group semi-parametric exact permutation test for comparing survival curves that is generally superior to the classic log-rank and Wilcoxon tests and provides the best global power across a variety of alternatives. The new test is readily extended to the k group setting.

March 16, 2016 doi: 10.1177/0962280216630342
Confidence interval of difference of proportions in logistic regression in presence of covariates.
Reeve, R.
Statistical Methods in Medical Research: An International Review Journal. March 16, 2016

Comparison of treatment differences in incidence rates is an important objective of many clinical trials. However, often the proportion is affected by covariates, and the adjustment of the predicted proportion is made using logistic regression. It is desirable to estimate the treatment differences in proportions adjusting for the covariates, similarly to the comparison of adjusted means in analysis of variance. Because of the correlation between the point estimates in the different treatment groups, the standard methods for constructing confidence intervals are inadequate. The problem is more difficult in the binary case, as the comparison is not uniquely defined, and the sampling distribution more difficult to analyze. Four procedures for analyzing the data are presented, which expand upon existing methods and generalize the link function. It is shown that, among the four methods studied, the resampling method based on the exact distribution function yields a coverage rate closest to the nominal.

March 16, 2016 doi: 10.1177/0962280216631583
Interval estimation in multi-stage drop-the-losers designs.
Lu, X., He, Y., Wu, S. S.
Statistical Methods in Medical Research: An International Review Journal. March 14, 2016

Drop-the-losers designs have been discussed extensively in the past decades, mostly focusing on two-stage models. The designs with more than two stages have recently received increasing attention due to their improved efficiency over the corresponding two-stage designs. In this paper, we consider the problem of estimating and testing the effect of selected treatment under the setting of three-stage drop-the-losers designs. A conservative interval estimator is proposed, which is proved to have at least the specified coverage probability using a stochastic ordering approach. The proposed interval estimator is also demonstrated numerically to have narrower interval width but higher coverage rate than the bootstrap method proposed by Bowden and Glimm (Biometrical Journal, vol. 56, pp. 332–349) in most cases. It is also a straightforward derivation from the stochastic ordering result that the family-wise error rate is strongly controlled with the maximum achieved at the global null hypothesis.

March 14, 2016 doi: 10.1177/0962280215626748
Controlled multi-arm platform design using predictive probability.
Hobbs, B. P., Chen, N., Lee, J. J.
Statistical Methods in Medical Research: An International Review Journal. January 12, 2016

The process of screening agents one-at-a-time under the current clinical trials system suffers from several deficiencies that could be addressed in order to extend financial and patient resources. In this article, we introduce a statistical framework for designing and conducting randomized multi-arm screening platforms with binary endpoints using Bayesian modeling. In essence, the proposed platform design consolidates inter-study control arms, enables investigators to assign more new patients to novel therapies, and accommodates mid-trial modifications to the study arms that allow both dropping poorly performing agents as well as incorporating new candidate agents. When compared to sequentially conducted randomized two-arm trials, screening platform designs have the potential to yield considerable reductions in cost, alleviate the bottleneck between phase I and II, eliminate bias stemming from inter-trial heterogeneity, and control for multiplicity over a sequence of a priori planned studies. When screening five experimental agents, our results suggest that platform designs have the potential to reduce the mean total sample size by as much as 40% and boost the mean overall response rate by as much as 15%. We explain how to design and conduct platform designs to achieve the aforementioned aims and preserve desirable frequentist properties for the treatment comparisons. In addition, we demonstrate how to conduct a platform design using look-up tables that can be generated in advance of the study. The gains in efficiency facilitated by platform design could prove to be consequential in oncologic settings, wherein trials often lack a proper control, and drug development suffers from low enrollment, long inter-trial latency periods, and an unacceptably high rate of failure in phase III.
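
As a rough illustration of what a predictive probability is for a binary endpoint, the sketch below considers a deliberately simplified single-arm setting, not the multi-arm platform design of the paper. It assumes a Beta(a, b) prior on the response rate, `x` responses among `n` patients so far, `m` patients still to be enrolled, and a hypothetical success rule requiring at least `r` responses in total. All names and the success rule are assumptions for illustration.

```python
from math import comb, exp, lgamma


def log_beta(p, q):
    """Logarithm of the beta function B(p, q)."""
    return lgamma(p) + lgamma(q) - lgamma(p + q)


def beta_binomial_pmf(y, m, a, b):
    """Probability of y responses among m future patients under Beta(a, b)."""
    return comb(m, y) * exp(log_beta(a + y, b + m - y) - log_beta(a, b))


def predictive_probability(x, n, m, r, a=1.0, b=1.0):
    """Probability that total responses reach r once m more patients are observed."""
    post_a, post_b = a + x, b + n - x  # posterior Beta parameters after interim data
    return sum(beta_binomial_pmf(y, m, post_a, post_b)
               for y in range(m + 1) if x + y >= r)
```

Because the rule depends only on the interim counts, values such as these can be tabulated before the study starts for every possible (x, n), which is the general idea behind look-up tables of the kind the abstract mentions.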

January 12, 2016 doi: 10.1177/0962280215620696 open full text
Probability of atrial fibrillation after ablation: Using a parametric nonlinear temporal decomposition mixed effects model.
Rajeswaran, J., Blackstone, E. H., Ehrlinger, J., Li, L., Ishwaran, H., Parides, M. K.
Statistical Methods in Medical Research: An International Review Journal. January 05, 2016

Atrial fibrillation is an arrhythmic disorder where the electrical signals of the heart become irregular. The probability of atrial fibrillation (binary response) is often time varying in a structured fashion, as is the influence of associated risk factors. A generalized nonlinear mixed effects model is presented to estimate the time-related probability of atrial fibrillation using a temporal decomposition approach to reveal the pattern of the probability of atrial fibrillation and its determinants. This methodology generalizes to patient-specific analysis of longitudinal binary data with possibly time-varying effects of covariates and with different patient-specific random effects influencing different temporal phases. The motivation for and application of this model are illustrated using longitudinally measured atrial fibrillation data obtained through weekly trans-telephonic monitoring from an NIH-sponsored clinical trial being conducted by the Cardiothoracic Surgery Clinical Trials Network.

January 05, 2016 doi: 10.1177/0962280215623583 open full text
A review of instrumental variable estimators for Mendelian randomization.
Burgess, S., Small, D. S., Thompson, S. G.
Statistical Methods in Medical Research: An International Review Journal. January 05, 2016

Instrumental variable analysis is an approach for obtaining causal inferences on the effect of an exposure (risk factor) on an outcome from observational data. It has gained in popularity over the past decade with the use of genetic variants as instrumental variables, known as Mendelian randomization. An instrumental variable is associated with the exposure, but not associated with any confounder of the exposure–outcome association, nor is there any causal pathway from the instrumental variable to the outcome other than via the exposure. Under the assumption that a single instrumental variable or a set of instrumental variables for the exposure is available, the causal effect of the exposure on the outcome can be estimated. There are several methods available for instrumental variable estimation; we consider the ratio method, two-stage methods, likelihood-based methods, and semi-parametric methods. Techniques for obtaining statistical inferences and confidence intervals are presented. The statistical properties of estimates from these methods are compared, and practical advice is given about choosing a suitable analysis method. In particular, bias and coverage properties of estimators are considered, especially with weak instruments. Settings particularly relevant to Mendelian randomization are prioritized in the paper, notably the scenario of a continuous exposure and a continuous or binary outcome.

January 05, 2016 doi: 10.1177/0962280215597579 open full text
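As an illustration of the ratio method named in the abstract above, the sketch below computes a ratio (Wald-type) estimate from two summary regression coefficients: the instrument-outcome association divided by the instrument-exposure association. The function name, argument names, and the delta-method standard error (which assumes the two coefficient estimates are uncorrelated) are assumptions made for this example and are not taken from the paper.

```python
import math


def ratio_estimate(beta_zy, se_zy, beta_zx, se_zx=0.0):
    """Ratio (Wald-type) instrumental variable estimate from summary data.

    beta_zy, se_zy: instrument-outcome coefficient and its standard error.
    beta_zx, se_zx: instrument-exposure coefficient and its standard error.
    Assumes the two coefficient estimates are uncorrelated (illustrative only).
    """
    if beta_zx == 0:
        raise ValueError("instrument-exposure association must be non-zero")
    estimate = beta_zy / beta_zx
    # Delta-method variance of a ratio with zero covariance between the terms.
    variance = se_zy**2 / beta_zx**2 + beta_zy**2 * se_zx**2 / beta_zx**4
    return estimate, math.sqrt(variance)


if __name__ == "__main__":
    est, se = ratio_estimate(beta_zy=0.10, se_zy=0.02, beta_zx=0.50)
    print(f"causal estimate = {est:.3f}, SE = {se:.3f}")
```

When `beta_zx` is small relative to its standard error (a weak instrument), this ratio becomes unstable, which is consistent with the abstract's emphasis on bias and coverage with weak instruments.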
Clustering of longitudinal data by using an extended baseline: A new method for treatment efficacy clustering in longitudinal data.
Schramm, C., Vial, C., Bachoud-Levi, A.-C., Katsahian, S.
Statistical Methods in Medical Research: An International Review Journal. December 31, 2015

Heterogeneity in treatment efficacy is a major concern in clinical trials. Clustering may help to identify the treatment responders and the non-responders. In the context of longitudinal cluster analyses, sample size and variability of the times of measurements are the main issues with the current methods. Here, we propose a new two-step method for the Clustering of Longitudinal data by using an Extended Baseline. The first step relies on a piecewise linear mixed model for repeated measurements with a treatment-time interaction. The second step clusters the random predictions and considers several parametric (model-based) and non-parametric (partitioning, ascendant hierarchical clustering) algorithms. A simulation study compares all options of the clustering of longitudinal data by using an extended baseline method with the latent-class mixed model. The clustering of longitudinal data by using an extended baseline method with the two model-based algorithms was the most robust approach. The clustering of longitudinal data by using an extended baseline method with all the non-parametric algorithms failed when there were unequal variances of treatment effect between clusters or when the subgroups had unbalanced sample sizes. The latent-class mixed model failed when the between-patient slope variability was high. Two real data sets on neurodegenerative disease and on obesity illustrate the clustering of longitudinal data by using an extended baseline method and show how clustering may help to identify the marker(s) of the treatment response. The application of the clustering of longitudinal data by using an extended baseline method in exploratory analysis as the first stage before setting up stratified designs can provide a better estimation of treatment effect in future clinical trials.

December 31, 2015 doi: 10.1177/0962280215621591 open full text
Reliability assessment of a hospital quality measure based on rates of adverse outcomes on nursing units.
Staggs, V. S.
Statistical Methods in Medical Research: An International Review Journal. December 31, 2015

The purpose of this study was to develop methods for assessing the reliability of scores on a widely disseminated hospital quality measure based on nursing unit fall rates. Poisson regression interactive multilevel modeling was adapted to account for clustering of units within hospitals. Three signal-noise reliability measures were computed. Squared correlations between the hospital score and true hospital fall rate averaged 0.52 ± 0.18 for total falls (0.68 ± 0.18 for injurious falls). Reliabilities on the other two measures averaged at least 0.70 but varied widely across hospitals. Parametric bootstrap data reflecting within-unit noise in falls were generated to evaluate percentile-ranked hospital scores as estimators of true hospital fall rate ranks. Spearman correlations between bootstrap hospital scores and true fall rates averaged 0.81 ± 0.01 (0.79 ± 0.01). Bias was negligible, but ranked hospital scores were imprecise, varying across bootstrap samples with average SD 11.8 (14.9) percentiles. Across bootstrap samples, hospital-measure scores fell in the same decile as the true fall rate in about 30% of cases. Findings underscore the importance of thoroughly assessing reliability of quality measurements before deciding how they will be used. Both the hospital measure and the reliability methods described can be adapted to other contexts involving clustered rates of adverse patient outcomes.

December 31, 2015 doi: 10.1177/0962280215618688 open full text
Estimating sample size in the presence of competing risks - Cause-specific hazard or cumulative incidence approach?
Tai, B., Chen, Z., Machin, D.
Statistical Methods in Medical Research: An International Review Journal. December 27, 2015

In designing randomised clinical trials involving competing risks endpoints, it is important to consider competing events to ensure appropriate determination of sample size. We conduct a simulation study to compare sample sizes obtained from the cause-specific hazard and cumulative incidence (CMI) approaches, by first assuming exponential event times. As the proportional subdistribution hazard assumption does not hold for the CMI exponential (CMI_Exponential) model, we further investigate the impact of violation of such an assumption by comparing the results obtained from the CMI exponential model with those of a CMI model assuming a Gompertz distribution (CMI_Gompertz) where the proportional assumption is tenable. The simulation suggests that the CMI_Exponential approach requires a considerably larger sample size when treatment reduces the hazards of both the main event, A, and the competing risk, B. When treatment has a beneficial effect on A but no effect on B, the sample sizes required by both methods are largely similar, especially for large reduction in the main risk. If treatment has a protective effect on A but adversely affects B, then the sample size required by CMI_Exponential is notably smaller than cause-specific hazard for small to moderate reduction in the main risk. Further, a smaller sample size is required for CMI_Gompertz as compared with CMI_Exponential. The choice between a cause-specific hazard or CMI model in competing risks outcomes has implications on the study design. This should be made on the basis of the clinical question of interest and the validity of the associated model assumption.

December 27, 2015 doi: 10.1177/0962280215623107 open full text
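To make the exponential setting in the abstract above concrete, the sketch below evaluates the cumulative incidence of the main event A when the cause-specific hazards of A and of the competing risk B are both constant. This is the standard closed form for independent exponential latent times, not the authors' simulation code; the function and parameter names are hypothetical.

```python
import math


def cif_exponential(t, haz_main, haz_comp):
    """Cumulative incidence of the main event by time t.

    haz_main: constant cause-specific hazard of the main event (A).
    haz_comp: constant cause-specific hazard of the competing risk (B).
    """
    total = haz_main + haz_comp
    if total <= 0:
        raise ValueError("at least one hazard must be positive")
    # Share of all events that are type A, times the probability
    # that any event has occurred by time t.
    return haz_main / total * (1.0 - math.exp(-total * t))


if __name__ == "__main__":
    # Same main-event hazard, two different competing-risk hazards.
    print(cif_exponential(5.0, haz_main=0.2, haz_comp=0.10))
    print(cif_exponential(5.0, haz_main=0.2, haz_comp=0.05))
```

Because the competing hazard enters the formula, a treatment that changes the hazard of B changes the cumulative incidence of A even when the hazard of A is untouched, which is why the two sample size approaches can disagree.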
Performance of informative priors skeptical of large treatment effects in clinical trials: A simulation study.
Pedroza, C., Han, W., Truong, V. T. T., Green, C., Tyson, J. E.
Statistical Methods in Medical Research: An International Review Journal. December 13, 2015

One of the main advantages of Bayesian analyses of clinical trials is their ability to formally incorporate skepticism about large treatment effects through the use of informative priors. We conducted a simulation study to assess the performance of informative normal, Student-t, and beta distributions in estimating relative risk (RR) or odds ratio (OR) for binary outcomes. Simulation scenarios varied the prior standard deviation (SD; level of skepticism of large treatment effects), outcome rate in the control group, true treatment effect, and sample size. We compared the priors with regard to bias, mean squared error (MSE), and coverage of 95% credible intervals. Simulation results show that the prior SD influenced the posterior to a greater degree than the particular distributional form of the prior. For RR, priors with a 95% interval of 0.50–2.0 performed well in terms of bias, MSE, and coverage under most scenarios. For OR, priors with a wider 95% interval of 0.23–4.35 had good performance. We recommend the use of informative priors that exclude implausibly large treatment effects in analyses of clinical trials, particularly for major outcomes such as mortality.

December 13, 2015 doi: 10.1177/0962280215620828 open full text
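The 95% prior intervals quoted in the abstract above can be translated into a prior SD on the log scale. The sketch below does this under the assumption of a normal prior on log(RR) or log(OR) centered at zero (no effect); the centering and the function name are assumptions for illustration, and the paper also studies Student-t and beta priors that this calculation does not cover.

```python
import math
from statistics import NormalDist


def log_scale_prior_sd(upper_limit, level=0.95):
    """SD of a zero-centered normal prior on the log scale whose central
    `level` interval, back-transformed, runs from 1/upper_limit to upper_limit."""
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for level=0.95
    return math.log(upper_limit) / z


if __name__ == "__main__":
    for label, upper in [("RR, 0.50 to 2.0", 2.0), ("OR, 0.23 to 4.35", 4.35)]:
        sd = log_scale_prior_sd(upper)
        print(f"{label}: prior SD on log scale is about {sd:.2f}")
```

Under these assumptions the two intervals correspond to log-scale SDs of roughly 0.35 and 0.75.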
Extending multivariate-t linear mixed models for multiple longitudinal data with censored responses and heavy tails.
Wang, W.-L., Lin, T.-I., Lachos, V. H.
Statistical Methods in Medical Research: An International Review Journal. December 13, 2015

The analysis of complex longitudinal data is challenging due to several inherent features: (i) more than one series of responses is repeatedly collected on each subject at irregular occasions over a period of time; (ii) censoring due to limits of quantification of the responses gives rise to left- and/or right-censoring effects; (iii) outliers or heavy-tailed noise may be present in multiple response variables. This article formulates the multivariate-t linear mixed model with censored responses (MtLMMC), which allows analysts to model such data in the presence of all of the above features simultaneously. An efficient expectation conditional maximization either (ECME) algorithm is developed to carry out maximum likelihood estimation of model parameters. The implementation of the E-step relies on the mean and covariance matrix of truncated multivariate-t distributions. To enhance the computational efficiency, two auxiliary permutation matrices are incorporated into the procedure to determine the observed and censored parts of each subject. The proposed methodology is demonstrated via a simulation study and a real application on HIV/AIDS data.

December 13, 2015 doi: 10.1177/0962280215620229 open full text
Optimal two-stage enrichment design correcting for biomarker misclassification.
Zang, Y., Guo, B.
Statistical Methods in Medical Research: An International Review Journal. November 26, 2015

The enrichment design is an important clinical trial design to detect the treatment effect of the molecularly targeted agent (MTA) in personalized medicine. Under this design, patients are stratified into marker-positive and marker-negative subgroups based on their biomarker statuses and only the marker-positive patients are enrolled into the trial and randomized to receive either the MTA or a standard treatment. As the biomarker plays a key role in determining the enrollment of the trial, a misclassification of the biomarker can induce substantial bias, undermine the integrity of the trial, and seriously affect the treatment evaluation. In this paper, we propose a two-stage optimal enrichment design that utilizes the surrogate marker to correct for the biomarker misclassification. The proposed design is optimal in the sense that it maximizes the probability of correctly classifying each patient’s biomarker status based on the surrogate marker information. In addition, after analytically deriving the bias caused by the biomarker misclassification, we develop a likelihood ratio test based on the EM algorithm to correct for such bias. We conduct comprehensive simulation studies to investigate the operating characteristics of the optimal design and the results confirm the desirable performance of the proposed design.

November 26, 2015 doi: 10.1177/0962280215618429 open full text
Standardized likelihood ratio test for comparing several log-normal means and confidence interval for the common mean.
Krishnamoorthy, K., Oral, E.
Statistical Methods in Medical Research: An International Review Journal. November 26, 2015

A standardized likelihood ratio test (SLRT) for testing the equality of means of several log-normal distributions is proposed. The properties of the SLRT, an available modified likelihood ratio test (MLRT), and a generalized variable (GV) test are evaluated by Monte Carlo simulation and compared. Evaluation studies indicate that the SLRT is accurate even for small samples, whereas the MLRT could be quite liberal for some parameter values, and the GV test is in general conservative and less powerful than the SLRT. Furthermore, a closed-form approximate confidence interval for the common mean of several log-normal distributions is developed using the method of variance estimate recovery, and compared with the generalized confidence interval with respect to coverage probabilities and precision. Simulation studies indicate that the proposed confidence interval is accurate and better than the generalized confidence interval in terms of coverage probabilities. The methods are illustrated using two examples.

November 26, 2015 doi: 10.1177/0962280215615160 open full text
The impact of covariate misclassification using generalized linear regression under covariate-adaptive randomization.
Fan, L., Yeatts, S. D., Wolf, B. J., McClure, L. A., Selim, M., Palesch, Y. Y.
Statistical Methods in Medical Research: An International Review Journal. November 23, 2015

Under covariate-adaptive randomization, the covariate is tied to both randomization and analysis. Misclassification of such a covariate will impact the intended treatment assignment; further, it is unclear what the appropriate analysis strategy should be. We explore the impact of such misclassification on the trial’s statistical operating characteristics. Simulation scenarios were created based on the misclassification rate and the covariate effect on the outcome. Models including unadjusted, adjusted for the misclassified, or adjusted for the corrected covariate were compared using logistic regression for a binary outcome and Poisson regression for a count outcome. For the binary outcome using logistic regression, type I error can be maintained in the adjusted model, but the test is conservative using an unadjusted model. Power decreased as both the covariate effect on the outcome and the misclassification rate increased. Treatment effect estimates were biased towards the null for both the misclassified and unadjusted models. For the count outcome using a Poisson model, covariate misclassification led to inflated type I error probabilities and reduced power in the misclassified and the unadjusted model. The impact of covariate misclassification under covariate-adaptive randomization differs depending on the underlying distribution of the outcome.

November 23, 2015 doi: 10.1177/0962280215616405 open full text
Causal mediation analysis with multiple causally non-ordered mediators.
Taguri, M., Featherstone, J., Cheng, J.
Statistical Methods in Medical Research: An International Review Journal. November 23, 2015

In many health studies, researchers are interested in estimating the treatment effects on the outcome around and through an intermediate variable. Such causal mediation analyses aim to understand the mechanisms that explain the treatment effect. Although multiple mediators are often involved in real studies, most of the literature has considered mediation analyses with one mediator at a time. In this article, we consider mediation analyses when there are causally non-ordered multiple mediators. Even if the mediators do not affect each other, the sum of two indirect effects through the two mediators considered separately may diverge from the joint natural indirect effect when there are additive interactions between the effects of the two mediators on the outcome. Therefore, we derive an equation for the joint natural indirect effect based on the individual mediation effects and their interactive effect, which helps us understand how the mediation effect works through the two mediators and the relative contributions of the mediators and their interaction. We also discuss an extension for three mediators. The proposed method is illustrated using data from a randomized trial on the prevention of dental caries.

November 23, 2015 doi: 10.1177/0962280215615899 open full text
Influence diagnostics for count data under AB-BA crossover trials.
Hao, C., von Rosen, D., von Rosen, T.
Statistical Methods in Medical Research: An International Review Journal. November 23, 2015

This paper aims to develop diagnostic measures to assess the influence of data perturbations on estimates in AB-BA crossover studies with a Poisson distributed response. Generalised mixed linear models with normally distributed random effects are utilised. We show that in this special case, the model can be decomposed into two independent sub-models, which allows closed-form expressions to be derived for evaluating the changes in the maximum likelihood estimates under several perturbation schemes. The performance of the new influence measures is illustrated by simulation studies and the analysis of a real dataset.

November 23, 2015 doi: 10.1177/0962280215615597 open full text
A generalized semiparametric mixed model for analysis of multivariate health care utilization data.
Li, Z., Liu, H., Tu, W.
Statistical Methods in Medical Research: An International Review Journal. November 23, 2015

Health care utilization is an outcome of interest in health services research. Two frequently studied forms of utilization are counts of emergency department (ED) visits and hospital admissions. These counts collectively convey a sense of disease exacerbation and cost escalation. Different types of event counts from the same patient form a vector of correlated outcomes. Traditional analyses typically model such outcomes one at a time, ignoring the natural correlations between different events, and thus fail to provide a full picture of patient care utilization. In this research, we propose a multivariate semiparametric modeling framework for the analysis of multiple health care events following the exponential family of distributions in a longitudinal setting. Bivariate nonparametric functions are incorporated to assess the concurrent nonlinear influences of independent variables as well as their interaction effects on the outcomes. The smooth functions are estimated using thin plate regression splines. A maximum penalized likelihood method is used for parameter estimation. The performance of the proposed method was evaluated through simulation studies. To illustrate the method, we analyzed data from a clinical trial in which ED visits and hospital admissions were considered as bivariate outcomes.

November 23, 2015 doi: 10.1177/0962280215615159 open full text
Borrowing of strength and study weights in multivariate and network meta-analysis.
Jackson, D., White, I. R., Price, M., Copas, J., Riley, R. D.
Statistical Methods in Medical Research: An International Review Journal. November 06, 2015

Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of ‘borrowing of strength’. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis).

November 06, 2015 doi: 10.1177/0962280215611702 open full text
Bayesian analysis of multi-type recurrent events and dependent termination with nonparametric covariate functions.
Lin, L.-A., Luo, S., Chen, B. E., Davis, B. R.
Statistical Methods in Medical Research: An International Review Journal. November 06, 2015

Multi-type recurrent event data occur frequently in longitudinal studies. Dependent termination may occur when the terminal time is correlated to recurrent event times. In this article, we simultaneously model the multi-type recurrent events and a dependent terminal event, both with nonparametric covariate functions modeled by B-splines. We develop a Bayesian multivariate frailty model to account for the correlation among the dependent termination and various types of recurrent events. Extensive simulation results suggest that misspecifying nonparametric covariate functions may introduce bias in parameter estimation. This method development has been motivated by and applied to the lipid-lowering trial component of the Antihypertensive and Lipid-Lowering Treatment to Prevent Heart Attack Trial.

November 06, 2015 doi: 10.1177/0962280215613378 open full text
A joint Bayesian approach for the analysis of response measured at a primary endpoint and longitudinal measurements.
Kalaylioglu, Z., Demirhan, H.
Statistical Methods in Medical Research: An International Review Journal. November 06, 2015

Joint mixed modeling is an attractive approach for the analysis of a scalar response measured at a primary endpoint and longitudinal measurements on a covariate. In the standard Bayesian analysis of these models, measurement error variance and the variance/covariance of random effects are a priori modeled independently. The key point is that these variances cannot be assumed independent given the total variation in a response. This article presents a joint Bayesian analysis in which these variance terms are a priori modeled jointly. Simulations illustrate that analysis with a multivariate variance prior in general leads to reduced bias (smaller relative bias) and improved efficiency (smaller interquartile range) in the posterior inference compared with the analysis with independent variance priors.

November 06, 2015 doi: 10.1177/0962280215615003 open full text
What are the appropriate methods for analyzing patient-reported outcomes in randomized trials when data are missing?
Hamel, J., Sebille, V., Le Neel, T., Kubis, G., Boyer, F., Hardouin, J.
Statistical Methods in Medical Research: An International Review Journal. November 06, 2015

Subjective health measurements using Patient Reported Outcomes (PRO) are increasingly used in randomized trials, particularly for comparisons between patient groups. Two main types of analytical strategies can be used for such data: Classical Test Theory (CTT) and Item Response Theory (IRT) models. These two strategies display very similar characteristics when data are complete, but in the common case when data are missing, whether IRT or CTT is the more appropriate remains unknown and was investigated using simulations. We simulated PRO data such as quality of life data. Missing responses to items were simulated as being completely random, depending on an observable covariate, or depending on an unobserved latent trait. The considered CTT-based methods allowed comparing scores using complete-case analysis, personal mean imputation, or multiple imputation based on a two-way procedure. The IRT-based method was the Wald test on a Rasch model including a group covariate. The IRT-based method and the multiple-imputation-based method for CTT displayed the highest observed power and were the only unbiased methods whatever the kind of missing data. Online software and Stata® modules compatible with the built-in mi impute suite are provided for performing such analyses. Traditional procedures (listwise deletion and personal mean imputation) should be avoided, due to inevitable problems of bias and lack of power.

November 06, 2015 doi: 10.1177/0962280215615158 open full text
A new diagnostic accuracy measure and cut-point selection criterion.
Dong, T., Attwood, K., Hutson, A., Liu, S., Tian, L.
Statistical Methods in Medical Research: An International Review Journal. October 20, 2015

Most diagnostic accuracy measures and criteria for selecting optimal cut-points are only applicable to diseases with two or three stages. Currently, there exist two diagnostic measures for diseases with general k stages: the hypervolume under the manifold and the generalized Youden index. While the hypervolume under the manifold cannot be used for cut-point selection, the generalized Youden index is defined only in terms of correct classification rates. This paper proposes a new measure named maximum absolute determinant for diseases with k stages (k≥2). This comprehensive new measure utilizes all the available classification information and serves as a cut-point selection criterion as well. Both the geometric and probabilistic interpretations for the new measure are examined. Power and simulation studies are carried out to investigate its performance as a measure of diagnostic accuracy as well as a cut-point selection criterion. A real data set from the Alzheimer’s Disease Neuroimaging Initiative is analyzed using the proposed maximum absolute determinant.

October 20, 2015 doi: 10.1177/0962280215611631 open full text
Unified variable selection in semi-parametric models.
Terry, W., Zhang, H., Maity, A., Arshad, H., Karmaus, W.
Statistical Methods in Medical Research: An International Review Journal. October 20, 2015

We propose a Bayesian variable selection method in semi-parametric models with applications to genetic and epigenetic data (e.g., single nucleotide polymorphisms and DNA methylation, respectively). The data are individually standardized to reduce heterogeneity and facilitate simultaneous selection of categorical (single nucleotide polymorphisms) and continuous (DNA methylation) variables. The Gaussian reproducing kernel is applied to the transformed data to evaluate joint effect of the variables, which may include complex interactions between, e.g., single nucleotide polymorphisms and DNA methylation. Indicator variables are introduced to the model for the purpose of variable selection. The method is demonstrated and evaluated using simulations under different scenarios. We apply the method to identify informative DNA methylation sites and single nucleotide polymorphisms in a set of genes based on their joint effect on allergic sensitization. The selected single nucleotide polymorphisms and methylation sites have the potential to serve as early markers for allergy prediction, and consequently benefit medical and clinical research to prevent allergy before its manifestation.

October 20, 2015 doi: 10.1177/0962280215610928 open full text
A time-varying effect model for studying gender differences in health behavior.
Yang, S., Cranford, J. A., Li, R., Zucker, R. A., Buu, A.
Statistical Methods in Medical Research: An International Review Journal. October 16, 2015

This study proposes a time-varying effect model that can be used to characterize gender-specific trajectories of health behaviors and conduct hypothesis testing for gender differences. The motivating examples demonstrate that the proposed model is applicable to not only multi-wave longitudinal studies but also short-term studies that involve intensive data collection. The simulation study shows that the accuracy of estimation of trajectory functions improves as the sample size and the number of time points increase. In terms of the performance of the hypothesis testing, the type I error rates are close to their corresponding significance levels under all combinations of sample size and number of time points. Furthermore, the power increases as the alternative hypothesis deviates more from the null hypothesis, and the rate of this increasing trend is higher when the sample size and the number of time points are larger.

October 16, 2015 doi: 10.1177/0962280215610608 open full text
Fast clustering using adaptive density peak detection.
Wang, X.-F., Xu, Y.
Statistical Methods in Medical Research: An International Review Journal. October 16, 2015

Common limitations of clustering methods include slow algorithm convergence, instability with respect to the pre-specification of a number of intrinsic parameters, and lack of robustness to outliers. A recent clustering approach proposed a fast search algorithm for cluster centers based on their local densities. However, the selection of the key intrinsic parameters in the algorithm was not systematically investigated. It is relatively difficult to estimate the "optimal" parameters since the original definition of the local density in the algorithm is based on a truncated counting measure. In this paper, we propose a clustering procedure with adaptive density peak detection, where the local density is estimated through nonparametric multivariate kernel estimation. The model parameter can then be calculated from equations with statistical theoretical justification. We also develop an automatic cluster centroid selection method through maximizing an average silhouette index. The advantage and flexibility of the proposed method are demonstrated through simulation studies and the analysis of a few benchmark gene expression data sets. The method runs in a single step without any iteration and thus is fast and has great potential for application to big data analysis. A user-friendly R package ADPclust is developed for public use.

October 16, 2015 doi: 10.1177/0962280215609948 open full text
Power evaluation of asymptotic tests for comparing two binomial proportions to detect direct and indirect association in large-scale studies.
Emily, M., Friguet, C.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2015

Asymptotic tests are commonly used for comparing two binomial proportions when the sample size is sufficiently large. However, there is no consensus on the most powerful test. In this paper, we clarify this issue by comparing the power functions of three popular asymptotic tests: Pearson’s χ² test, the likelihood-ratio test and the odds-ratio based test. Considering Taylor decompositions under local alternatives, the comparisons lead to recommendations on which test to use in view of both the experimental design and the nature of the investigated signal. We show that when the design is balanced between the two binomials, the three tests are equivalent in terms of power. However, when the design is unbalanced, differences in power can be substantial and the choice of the most powerful test also depends on the value of the parameters of the two compared binomials. We further investigated situations where the two binomials are not compared directly but through tag binomials. In these cases of indirect association, we show that the differences in power between the three tests are enhanced with decreasing values of the parameters of the tag binomials. Our results are illustrated in the context of genetic epidemiology where the analysis of genome-wide association studies provides insights regarding the low power for detecting rare variants.

October 14, 2015 doi: 10.1177/0962280215608528 open full text
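
For readers who want to see the three statistics from the two-proportion abstract above side by side, here is a small sketch using their standard textbook forms for a 2x2 table. The function name `two_proportion_tests` is hypothetical, and the odds-ratio test is assumed to be the Wald test on the log odds ratio, which may differ in detail from the version analysed in the paper.

```python
import math


def two_proportion_tests(x1, n1, x2, n2):
    """Return three 1-df chi-square statistics for comparing x1/n1 with x2/n2.

    Assumes every cell of the 2x2 table is positive; the log odds ratio
    and the likelihood-ratio terms are undefined otherwise.
    """
    observed = [x1, n1 - x1, x2, n2 - x2]
    total = n1 + n2
    successes = x1 + x2
    failures = total - successes
    # Expected counts under the null hypothesis of equal proportions.
    expected = [n1 * successes / total, n1 * failures / total,
                n2 * successes / total, n2 * failures / total]

    pearson = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    likelihood_ratio = 2.0 * sum(o * math.log(o / e)
                                 for o, e in zip(observed, expected))

    # Wald test on the log odds ratio with the usual variance estimate.
    log_or = math.log((x1 * (n2 - x2)) / ((n1 - x1) * x2))
    var_log_or = sum(1.0 / o for o in observed)
    odds_ratio = log_or ** 2 / var_log_or

    return {"pearson": pearson,
            "likelihood_ratio": likelihood_ratio,
            "odds_ratio": odds_ratio}
```

Each statistic is referred to a chi-square distribution with one degree of freedom. Calling the function on balanced and unbalanced tables shows how far the three values can drift apart in a given sample.
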
Fast and highly efficient pseudo-likelihood methodology for large and complex ordinal data.
Ivanova, A., Molenberghs, G., Verbeke, G.
Statistical Methods in Medical Research: An International Review Journal. October 07, 2015

In longitudinal studies, continuous, binary, categorical, and survival outcomes are often jointly collected, possibly with some observations missing. However, when it comes to modeling responses, the ordinal ones have received less attention in the literature. In a longitudinal or hierarchical context, the univariate proportional odds mixed model (POMM) can be regarded as an instance of the generalized linear mixed model (GLMM). When the response of the joint multivariate model encompasses ordinal responses, the complexity further increases. An additional problem of model fitting is the size of the collected data. Pseudo-likelihood based methods for pairwise fitting, for partitioned samples and, as introduced in this paper, pairwise fitting within partitioned samples allow joint modeling of even larger numbers of responses. We show that pseudo-likelihood methodology allows for highly efficient and fast inferences in high-dimensional large datasets.

October 07, 2015 doi: 10.1177/0962280215608213 open full text
Parsimonious covariate selection for a multicategory ordered response.
Hsu, W.-H., DiRienzo, A. G.
Statistical Methods in Medical Research: An International Review Journal. October 01, 2015

We propose a flexible continuation ratio (CR) model for an ordinal categorical response with potentially ultrahigh dimensional data that characterizes the unique covariate effects at each response level. The CR model is the logit of the conditional discrete hazard function for each response level given covariates. We propose two modeling strategies, one that keeps the same covariate set for each hazard function but allows regression coefficients to arbitrarily change with response level, and one that allows both the set of covariates and their regression coefficients to arbitrarily change with response. Evaluating a covariate set is accomplished by using the nonparametric bootstrap to estimate prediction error and their robust standard errors that do not rely on proper model specification. To help with interpretation of the selected covariate set, we flexibly estimate the conditional cumulative distribution function given the covariates using the separate hazard function models. The goodness-of-fit of our flexible CR model is assessed with graphical and numerical methods based on the cumulative sum of residuals. Simulation results indicate the methods perform well in finite samples. An application to B-cell acute lymphocytic leukemia data is provided.

October 01, 2015 doi: 10.1177/0962280215608120 open full text
Development of a Bayesian response-adaptive trial design for the Dexamethasone for Excessive Menstruation study.
Holm Hansen, C., Warner, P., Parker, R. A., Walker, B. R., Critchley, H. O., Weir, C. J.
Statistical Methods in Medical Research: An International Review Journal. September 30, 2015

It is often unclear what specific adaptive trial design features lead to an efficient design which is also feasible to implement. This article describes the preparatory simulation study for a Bayesian response-adaptive dose-finding trial design. Dexamethasone for Excessive Menstruation aims to assess the efficacy of Dexamethasone in reducing excessive menstrual bleeding and to determine the best dose for further study. To maximise learning about the dose response, patients receive placebo or an active dose with randomisation probabilities adapting based on evidence from patients already recruited. The dose-response relationship is estimated using a flexible Bayesian Normal Dynamic Linear Model. Several competing design options were considered including: number of doses, proportion assigned to placebo, adaptation criterion, and number and timing of adaptations. We performed a fractional factorial study using SAS software to simulate virtual trial data for candidate adaptive designs under a variety of scenarios and to invoke WinBUGS for Bayesian model estimation. We analysed the simulated trial results using Normal linear models to estimate the effects of each design feature on empirical type I error and statistical power. Our readily-implemented approach using widely available statistical software identified a final design which performed robustly across a range of potential trial scenarios.

September 30, 2015 doi: 10.1177/0962280215606155 open full text
Bayesian multi-scale modeling for aggregated disease mapping data.
Aregay, M., Lawson, A. B., Faes, C., Kirby, R. S.
Statistical Methods in Medical Research: An International Review Journal. September 29, 2015

In disease mapping, a scale effect due to an aggregation of data from a finer resolution level to a coarser level is a common phenomenon. This article addresses this issue using a hierarchical Bayesian modeling framework. We propose four different multiscale models. The first two models use a shared random effect that the finer level inherits from the coarser level. The third model assumes two independent convolution models at the finer and coarser levels. The fourth model applies a convolution model at the finer level, but the relative risk at the coarser level is obtained by aggregating the estimates at the finer level. We compare the models using the deviance information criterion (DIC) and Watanabe-Akaike information criterion (WAIC) that are applied to real and simulated data. The results indicate that the models with shared random effects outperform the other models on a range of criteria.

September 29, 2015 doi: 10.1177/0962280215607546 open full text
An ensemble survival model for estimating relative residual longevity following stroke: Application to mortality data in the chronic dialysis population.
Phadnis, M. A., Wetmore, J. B., Shireman, T. I., Ellerbeck, E. F., Mahnken, J. D.
Statistical Methods in Medical Research: An International Review Journal. September 24, 2015

Time-dependent covariates can be modeled within the Cox regression framework and can allow both proportional and nonproportional hazards for the risk factor of research interest. However, in many areas of health services research, interest centers on being able to estimate residual longevity after the occurrence of a particular event such as stroke. The survival trajectory of patients experiencing a stroke can be potentially influenced by stroke type (hemorrhagic or ischemic), time of the stroke (relative to time zero), time since the stroke occurred, or a combination of these factors. In such situations, researchers are more interested in estimating lifetime lost due to stroke rather than merely estimating the relative hazard due to stroke. To achieve this, we propose an ensemble approach using the generalized gamma distribution by means of a semi-Markov type model with an additive hazards extension. Our modeling framework allows stroke as a time-dependent covariate to affect all three parameters (location, scale, and shape) of the generalized gamma distribution. Using the concept of relative times, we answer the research question by estimating residual life lost due to ischemic and hemorrhagic stroke in the chronic dialysis population.

September 24, 2015 doi: 10.1177/0962280215605107 open full text
Detecting time-specific differences between temporal nonlinear curves: Analyzing data from the visual world paradigm.
Oleson, J. J., Cavanaugh, J. E., McMurray, B., Brown, G.
Statistical Methods in Medical Research: An International Review Journal. September 23, 2015

In multiple fields of study, time series measured at high frequencies are used to estimate population curves that describe the temporal evolution of some characteristic of interest. These curves are typically nonlinear, and the deviations of each series from the corresponding curve are highly autocorrelated. In this scenario, we propose a procedure to compare the response curves for different groups at specific points in time. The method involves fitting the curves, performing potentially hundreds of serially correlated tests, and appropriately adjusting the overall alpha level of the tests. Our motivating application comes from psycholinguistics and the visual world paradigm. We describe how the proposed technique can be adapted to compare fixation curves within subjects as well as between groups. Our results lead to conclusions beyond the scope of previous analyses.

September 23, 2015 doi: 10.1177/0962280215607411 open full text
A regression method for modelling geometric rates.
Bottai, M.
Statistical Methods in Medical Research: An International Review Journal. September 18, 2015

The occurrence of an event of interest over time is often summarized by the incidence rate, defined as the average number of events per person-time. This type of rate applies to events that may occur repeatedly over time on any given subject, such as infections, and Poisson regression represents a natural regression method for modelling the effect of covariates on it. However, for events that can occur only once, such as death, the geometric rate may be a better summary measure. The geometric rate has long been utilized in demography for studying the growth of populations and in finance to compute compound interest on capital. This type of rate, however, is virtually unknown to medical research. This may be partly a consequence of the lack of a regression method for it. This paper describes a regression method for modelling the effect of covariates on the geometric rate. The described method is based on applying quantile regression to a transform of the time-to-event variable. The proposed method is used to analyze mortality in a randomized clinical trial and in an observational epidemiological study.

September 18, 2015 doi: 10.1177/0962280215606474 open full text
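
The compound-interest analogy in the geometric rate abstract above can be made concrete with a short sketch. It assumes the geometric rate over the interval (0, t) is defined as g = 1 - S(t)^(1/t), where S(t) is the probability of being event-free at time t. The abstract does not state this formula, so it is an assumption here, and the function name `geometric_rate` is hypothetical.

```python
def geometric_rate(survival_prob, time):
    """Geometric rate per unit time implied by survival S(time).

    Assumed definition: g = 1 - S(t) ** (1 / t), i.e. the constant
    per-period probability of the event that compounds to 1 - S(t).
    """
    if not 0.0 <= survival_prob <= 1.0:
        raise ValueError("survival_prob must lie in [0, 1]")
    if time <= 0:
        raise ValueError("time must be positive")
    return 1.0 - survival_prob ** (1.0 / time)
```

For example, if about 59% of subjects are alive at 5 years (0.9 raised to the fifth power), the sketch returns a geometric rate of 10% per year, the same calculation that turns a five-year compound return into an annual rate.
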
A joint frailty-copula model between tumour progression and death for meta-analysis.
Emura, T., Nakatochi, M., Murotani, K., Rondeau, V.
Statistical Methods in Medical Research: An International Review Journal. September 18, 2015

Dependent censoring often arises in biomedical studies when time to tumour progression (e.g., relapse of cancer) is censored by an informative terminal event (e.g., death). For meta-analysis combining existing studies, a joint survival model between tumour progression and death has been considered under semicompeting risks, which induces dependence through the study-specific frailty. Our paper here utilizes copulas to generalize the joint frailty model by introducing an additional source of dependence arising from intra-subject association between tumour progression and death. The practical value of the new model is particularly evident for meta-analyses in which only a few covariates are consistently measured across studies and hence there exists residual dependence. The covariate effects are formulated through the Cox proportional hazards model, and the baseline hazards are nonparametrically modeled on a basis of splines. The estimator is then obtained by maximizing a penalized log-likelihood function. We also show that the present methodologies are easily modified for the competing risks or recurrent event data, and are generalized to accommodate left-truncation. Simulations are performed to examine the performance of the proposed estimator. The method is applied to a meta-analysis for assessing a recently suggested biomarker CXCL12 for survival in ovarian cancer patients. We implement our proposed methods in the R package joint.Cox.

September 18, 2015 doi: 10.1177/0962280215604510 open full text
Simultaneous comparisons of treatments at multiple time points: Combined marginal models versus joint modeling.
Pallmann, P., Pretorius, M., Ritz, C.
Statistical Methods in Medical Research: An International Review Journal. September 18, 2015

We discuss several aspects of multiple inference in longitudinal settings, focusing on many-to-one and all-pairwise comparisons of (a) treatment groups simultaneously at several points in time, or (b) time points simultaneously for several treatments. We assume a continuous endpoint that is measured repeatedly over time and contrast two basic modeling strategies: fitting a joint model across all occasions (with random effects and/or some residual covariance structure to account for heteroscedasticity and serial dependence), and a novel approach combining a set of simple marginal, i.e. occasion-specific models. Upon parameter and covariance estimation with either modeling approach, we employ a variant of multiple contrast tests that acknowledges correlation between time points and test statistics. This method provides simultaneous confidence intervals and adjusted p-values for elementary hypotheses as well as a global test decision. We compare via simulation the powers of multiple contrast tests based on a joint model and multiple marginal models, respectively, and quantify the benefit of incorporating longitudinal correlation, i.e. the advantage over Bonferroni. Practical application is illustrated with data from a clinical trial on bradykinin receptor antagonism.

September 18, 2015 doi: 10.1177/0962280215603743 open full text
A new approach to categorising continuous variables in prediction models: Proposal and validation.
Barrio, I., Arostegui, I., Rodriguez-Alvarez, M.-X., Quintana, J.-M.
Statistical Methods in Medical Research: An International Review Journal. September 18, 2015

When developing prediction models for application in clinical practice, health practitioners usually categorise clinical variables that are continuous in nature. Although categorisation is not regarded as advisable from a statistical point of view, due to loss of information and power, it is a common practice in medical research. Consequently, providing researchers with a useful and valid categorisation method could be a relevant issue when developing prediction models. Without recommending categorisation of continuous predictors, our aim is to propose a valid way to do it whenever it is considered necessary by clinical researchers. This paper focuses on categorising a continuous predictor within a logistic regression model, in such a way that the best discriminative ability is obtained in terms of the highest area under the receiver operating characteristic curve (AUC). The proposed methodology is validated when the optimal cut points’ location is known in theory or in practice. In addition, the proposed method is applied to a real data-set of patients with an exacerbation of chronic obstructive pulmonary disease, in the context of the IRYSS-COPD study where a clinical prediction rule for severe evolution was being developed. The clinical variable PCO₂ was categorised in a univariable and a multivariable setting.

September 18, 2015 doi: 10.1177/0962280215601873 open full text
Assessing agreement between two measurement systems: An alternative to the limits of agreement approach.
Stevens, N. T., Steiner, S. H., MacKay, R. J.
Statistical Methods in Medical Research: An International Review Journal. September 03, 2015

The comparison of two measurement systems is important in medical and other contexts. A common goal is to decide if a new measurement system agrees suitably with an existing one, and hence whether the two can be used interchangeably. Various methods for assessing interchangeability are available, the most popular being the limits of agreement approach due to Bland and Altman. In this article, we review the challenges of this technique and propose a model-based framework for comparing measurement systems that overcomes those challenges. The proposal is based on a simple metric, the probability of agreement, and a corresponding plot which can be used to summarize the agreement between two measurement systems. We also make recommendations for a study design that facilitates accurate and precise estimation of the probability of agreement.

September 03, 2015 doi: 10.1177/0962280215601133 open full text
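
To give a feel for the probability of agreement metric in the measurement systems abstract above, here is a deliberately simplified sketch. It assumes the difference between the two systems' readings is normally distributed with a known mean and standard deviation, and that readings agree when they differ by at most a clinically acceptable amount `c`. The paper's model-based framework is richer than this, and the names below are hypothetical.

```python
from statistics import NormalDist


def probability_of_agreement(mean_diff, sd_diff, c):
    """P(|Y1 - Y2| <= c) when Y1 - Y2 ~ Normal(mean_diff, sd_diff**2).

    mean_diff: assumed systematic bias between the two systems.
    sd_diff: assumed standard deviation of the difference (must be > 0).
    c: largest difference regarded as practically unimportant.
    """
    difference = NormalDist(mean_diff, sd_diff)
    return difference.cdf(c) - difference.cdf(-c)
```

With no bias, a standard deviation of 1 and c = 1.96, the sketch returns roughly 0.95. Introducing a bias lowers the value, so one number summarises both systematic and random disagreement.
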
Direct and flexible marginal inference for semicontinuous data.
Smith, V. A., Preisser, J. S.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2015

The marginalized two-part (MTP) model for semicontinuous data proposed by Smith et al. provides direct inference for the effect of covariates on the marginal mean of positively continuous data with zeros. This brief note addresses mischaracterizations of the MTP model by Gebregziabher et al. Additionally, the MTP model is extended to incorporate the three-parameter generalized gamma distribution, which takes many well-known distributions as special cases, including the Weibull, gamma, inverse gamma, and log-normal distributions.

September 01, 2015 doi: 10.1177/0962280215602290 open full text
Estimating the effect of treatment on binary outcomes using full matching on the propensity score.
Austin, P. C., Stuart, E. A.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2015

Many non-experimental studies use propensity-score methods to estimate causal effects by balancing treatment and control groups on a set of observed baseline covariates. Full matching on the propensity score has emerged as a particularly effective and flexible method for utilizing all available data, and creating well-balanced treatment and comparison groups. However, full matching has been used infrequently with binary outcomes, and relatively little work has investigated the performance of full matching when estimating effects on binary outcomes. This paper describes methods that can be used for estimating the effect of treatment on binary outcomes when using full matching. We then used Monte Carlo simulations to evaluate the performance of these methods based on full matching (with and without a caliper), and compared their performance with that of nearest neighbour matching (with and without a caliper) and inverse probability of treatment weighting. The simulations varied the prevalence of the treatment and the strength of association between the covariates and treatment assignment. Results indicated that all of the approaches work well when the strength of confounding is relatively weak. With stronger confounding, the relative performance of the methods varies, with nearest neighbour matching with a caliper showing consistently good performance across a wide range of settings. We illustrate the approaches using a study estimating the effect of inpatient smoking cessation counselling on survival following hospitalization for a heart attack.

September 01, 2015 doi: 10.1177/0962280215601134 open full text
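
Of the methods compared in the propensity score abstract above, inverse probability of treatment weighting is the simplest to write down. The sketch below estimates a marginal risk difference for a binary outcome using normalised weights. It assumes the propensity scores have already been estimated and lie strictly between 0 and 1, and the function name `iptw_risk_difference` is hypothetical. It does not implement full matching.

```python
def iptw_risk_difference(treated, propensity, outcome):
    """Weighted risk difference (treated minus control) for a binary outcome.

    treated: 0/1 treatment indicators.
    propensity: estimated P(treated = 1 | covariates), strictly in (0, 1).
    outcome: 0/1 outcomes.
    Requires at least one treated and one control subject.
    """
    sum_wy_treated = sum_w_treated = 0.0
    sum_wy_control = sum_w_control = 0.0
    for z, e, y in zip(treated, propensity, outcome):
        if z == 1:
            w = 1.0 / e            # weight treated subjects by 1 / e
            sum_wy_treated += w * y
            sum_w_treated += w
        else:
            w = 1.0 / (1.0 - e)    # weight controls by 1 / (1 - e)
            sum_wy_control += w * y
            sum_w_control += w
    # Dividing by the summed weights gives normalised (Hajek-type) means.
    return sum_wy_treated / sum_w_treated - sum_wy_control / sum_w_control
```

Subjects who received the treatment they were unlikely to receive get large weights, which is why extreme propensity scores can make this estimator unstable.
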
Natural interpretations in Tobit regression models using marginal estimation methods.
Wang, W., Griswold, M. E.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2015

The Tobit model, also known as a censored regression model to account for left- and/or right-censoring in the dependent variable, has been used in many areas of applications, including dental health, medical research and economics. The reported Tobit model coefficient allows estimation and inference of an exposure effect on the latent dependent variable. However, this model does not directly provide overall exposure effects estimation on the original outcome scale. We propose a direct-marginalization approach using a reparameterized link function to model exposure and covariate effects directly on the truncated dependent variable mean. We also discuss an alternative average-predicted-value, post-estimation approach which uses model-predicted values for each person in a designated reference group under different exposure statuses to estimate covariate-adjusted overall exposure effects. Simulation studies were conducted to show the unbiasedness and robustness properties for both approaches under various scenarios. Robustness appears to diminish when covariates with substantial effects are imbalanced between exposure groups; we outline an approach for model choice based on information criterion fit statistics. The methods are applied to the Genetic Epidemiology Network of Arteriopathy (GENOA) cohort study to assess associations between obesity and cognitive function in the non-Hispanic white participants.

September 01, 2015 doi: 10.1177/0962280215602716 open full text
A comparison of confidence/credible interval methods for the area under the ROC curve for continuous diagnostic tests with small sample size.
Feng, D., Cortese, G., Baumgartner, R.
Statistical Methods in Medical Research: An International Review Journal. August 30, 2015

The receiver operating characteristic (ROC) curve is frequently used as a measure of accuracy of continuous markers in diagnostic tests. The area under the ROC curve (AUC) is arguably the most widely used summary index for the ROC curve. Although the small sample size scenario is common in medical tests, a comprehensive study of small sample size properties of various methods for the construction of the confidence/credible interval (CI) for the AUC has been by and large missing in the literature. In this paper, we describe and compare 29 non-parametric and parametric methods for the construction of the CI for the AUC when the number of available observations is small. The methods considered include not only those that have been widely adopted, but also those that have been less frequently mentioned or, to our knowledge, never applied to the AUC context. To compare different methods, we carried out a simulation study with data generated from binormal models with equal and unequal variances and from exponential models with various parameters and with equal and unequal small sample sizes. We found that the larger the true AUC value and the smaller the sample size, the larger the discrepancy among the results of different approaches. When the model is correctly specified, the parametric approaches tend to outperform the non-parametric ones. Moreover, in the non-parametric domain, we found that a method based on the Mann–Whitney statistic is in general superior to the others. We further elucidate potential issues and provide possible solutions, along with general guidance on the CI construction for the AUC when the sample size is small. Finally, we illustrate the utility of different methods through real life examples.

August 30, 2015 doi: 10.1177/0962280215602040 open full text
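
The Mann–Whitney connection mentioned in the AUC abstract above can be sketched in a few lines. The code computes the non-parametric AUC estimate and, as one familiar choice of standard error, the Hanley and McNeil approximation. The abstract does not say which Mann–Whitney based interval performed best, so pairing the estimate with this particular standard error is an assumption, and the function names are hypothetical.

```python
import math


def auc_mann_whitney(diseased, healthy):
    """Non-parametric AUC: proportion of (diseased, healthy) pairs in which
    the diseased subject has the higher marker value, counting ties as 1/2."""
    score = 0.0
    for x in diseased:
        for y in healthy:
            if x > y:
                score += 1.0
            elif x == y:
                score += 0.5
    return score / (len(diseased) * len(healthy))


def hanley_mcneil_se(auc, n_diseased, n_healthy):
    """Hanley-McNeil approximate standard error of an AUC estimate."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (auc * (1.0 - auc)
                + (n_diseased - 1) * (q1 - auc ** 2)
                + (n_healthy - 1) * (q2 - auc ** 2)) / (n_diseased * n_healthy)
    return math.sqrt(max(variance, 0.0))
```

A Wald-type interval, the estimate plus or minus a normal quantile times the standard error, can run past 1 when the AUC is high and the samples are small. That is the setting where the abstract reports the largest disagreement between methods.
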
An improved procedure for estimation of malignant breast cancer prevalence using partially rank ordered set samples with multiple concomitants.
Hatefi, A., Jafari Jozani, M.
Statistical Methods in Medical Research: An International Review Journal. August 26, 2015

Rank-based sampling designs are widely used in situations where measuring the variable of interest is costly but a small number of sampling units (set) can be easily ranked prior to taking the final measurements on them and this can be done at little cost. When the variable of interest is binary, a common approach for ranking the sampling units is to estimate the probabilities of success through a logistic regression model. However, this requires training samples for model fitting. Also, in this approach once a sampling unit has been measured, the extra rank information obtained in the ranking process is not used further in the estimation process. To address these issues, in this paper, we propose to use the partially rank-ordered set sampling design with multiple concomitants. In this approach, instead of fitting a logistic regression model, a soft ranking technique is employed to obtain a vector of weights for each measured unit that represents the probability or the degree of belief associated with its rank among a small set of sampling units. We construct an estimator which combines the rank information and the observed partially rank-ordered set measurements themselves. The proposed methodology is applied to a breast cancer study to estimate the proportion of patients with malignant (cancerous) breast tumours in a given population. Through extensive numerical studies, the performance of the estimator is evaluated under various concomitants with different ranking potentials (i.e. good, intermediate and bad) and tie structures among the ranks. We show that the precision of the partially rank-ordered set estimator is better than its counterparts under simple random sampling and ranked set sampling designs and, hence, the sample size required to achieve a desired precision is reduced.

August 26, 2015 doi: 10.1177/0962280215601458 open full text
Sample size considerations for split-mouth design.
Zhu, H., Zhang, S., Ahn, C.
Statistical Methods in Medical Research: An International Review Journal. August 24, 2015

Split-mouth designs are frequently used in dental clinical research, where a mouth is divided into two or more experimental segments that are randomly assigned to different treatments. This design has the distinct advantage of removing much of the inter-subject variability from the estimated treatment effect. Methods of statistical analysis for the split-mouth design have been well developed. However, little work is available on sample size considerations at the design phase of a split-mouth trial, although many researchers have pointed out that the split-mouth design can only be more efficient than a parallel-group design when the within-subject correlation coefficient is substantial. In this paper, we propose to use the generalized estimating equation (GEE) approach to assess treatment effect in split-mouth trials, accounting for correlations among observations. Closed-form sample size formulas are introduced for the split-mouth design with continuous and binary outcomes, assuming exchangeable and "nested exchangeable" correlation structures for outcomes from the same subject. The statistical inference is based on the large sample approximation under the GEE approach. Simulation studies are conducted to investigate the finite-sample performance of the GEE sample size formulas. A dental clinical trial example is presented for illustration.

August 24, 2015 doi: 10.1177/0962280215601137 open full text
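
The role of within-subject correlation in the split-mouth abstract above can be illustrated with the standard large-sample formula for a paired comparison of means. This is not the paper's GEE-based formula. It assumes one site per treatment per subject, a continuous outcome with common standard deviation `sigma`, an exchangeable correlation `rho` between the two sites, and a two-sided test. All names are hypothetical.

```python
import math
from statistics import NormalDist


def _z_total(alpha, power):
    """Sum of the normal quantiles for a two-sided level-alpha test."""
    std_normal = NormalDist()
    return std_normal.inv_cdf(1.0 - alpha / 2.0) + std_normal.inv_cdf(power)


def split_mouth_subjects(delta, sigma, rho, alpha=0.05, power=0.80):
    """Subjects needed when each subject receives both treatments.

    The within-subject difference has variance 2 * sigma**2 * (1 - rho).
    """
    variance_of_difference = 2.0 * sigma ** 2 * (1.0 - rho)
    return math.ceil(_z_total(alpha, power) ** 2 * variance_of_difference / delta ** 2)


def parallel_subjects_per_arm(delta, sigma, alpha=0.05, power=0.80):
    """Subjects per arm for a two-arm parallel design with one site each."""
    return math.ceil(_z_total(alpha, power) ** 2 * 2.0 * sigma ** 2 / delta ** 2)
```

Under these assumptions the required number of subjects shrinks in proportion to 1 - rho, so the stronger the correlation between sites in the same mouth, the smaller the trial.
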
A comparative study of matched pair designs with two binary endpoints.
Jiang, Y., Xu, J.
Statistical Methods in Medical Research: An International Review Journal. August 20, 2015

We study matched pair designs with two binary endpoints under three different approaches. Power approximation and sample size calculation are derived under these situations and facilitated by R programs. An adaptive design with sample size re-estimation is also presented. Through extensive simulations, we provide general guidelines for practitioners to choose the best approach according to the ranges of the interested parameters in the sense of feasibility and robustness. Application to a cancer chemotherapy trial is illustrated.

August 20, 2015 doi: 10.1177/0962280215601136 open full text
Applications of temporal kernel canonical correlation analysis in adherence studies.
John, M., Lencz, T., Ferbinteanu, J., Gallego, J. A., Robinson, D. G.
Statistical Methods in Medical Research: An International Review Journal. August 20, 2015

Adherence to medication is often measured as a continuous outcome but analyzed as a dichotomous outcome due to lack of appropriate tools. In this paper, we illustrate the use of the temporal kernel canonical correlation analysis (tkCCA) as a method to analyze adherence measurements and symptom levels on a continuous scale. The tkCCA is a novel method developed for studying the relationship between neural signals and hemodynamic response detected by functional MRI during spontaneous activity. Although the tkCCA is a powerful tool, it has not been utilized outside the application that it was originally developed for. In this paper, we simulate time series of symptoms and adherence levels for patients with a hypothetical brain disorder and show how the tkCCA can be used to understand the relationship between them. We also examine, via simulations, the behavior of the tkCCA under various missing value mechanisms and imputation methods. Finally, we apply the tkCCA to a real data example of psychotic symptoms and adherence levels obtained from a study based on subjects with a first episode of schizophrenia, schizophreniform or schizoaffective disorder.

August 20, 2015 doi: 10.1177/0962280215598805 open full text
The impact of covariance misspecification in group-based trajectory models for longitudinal data with non-stationary covariance structure.
Davies, C. E., Glonek, G. F., Giles, L. C.
Statistical Methods in Medical Research: An International Review Journal. August 17, 2015

One purpose of a longitudinal study is to gain a better understanding of how an outcome of interest changes among a given population over time. In what follows, a trajectory will be taken to mean the series of measurements of the outcome variable for an individual. Group-based trajectory modelling methods seek to identify subgroups of trajectories within a population, such that trajectories that are grouped together are more similar to each other than to trajectories in distinct groups. Group-based trajectory models generally assume a certain structure in the covariances between measurements, for example conditional independence, homogeneous variance between groups or stationary variance over time. Violations of these assumptions could be expected to result in poor model performance. We used simulation to investigate the effect of covariance misspecification on misclassification of trajectories in commonly used models under a range of scenarios. To do this we defined a measure of performance relative to the ideal Bayesian correct classification rate. We found that the more complex models generally performed better over a range of scenarios. In particular, incorrectly specified covariance matrices could significantly bias the results but using models with a correct but more complicated than necessary covariance matrix incurred little cost.

August 17, 2015 doi: 10.1177/0962280215598806 open full text
Uncertainty in the Bayesian meta-analysis of normally distributed surrogate endpoints.
Bujkiewicz, S., Thompson, J. R., Spata, E., Abrams, K. R.
Statistical Methods in Medical Research: An International Review Journal. August 13, 2015

We investigate the effect of the choice of parameterisation of meta-analytic models and related uncertainty on the validation of surrogate endpoints. Different meta-analytical approaches take into account different levels of uncertainty which may impact on the accuracy of the predictions of treatment effect on the target outcome from the treatment effect on a surrogate endpoint obtained from these models. A range of Bayesian as well as frequentist meta-analytical methods are implemented using illustrative examples in relapsing–remitting multiple sclerosis, where the treatment effect on disability worsening is the primary outcome of interest in healthcare evaluation, while the effect on relapse rate is considered as a potential surrogate to the effect on disability progression, and in gastric cancer, where the disease-free survival has been shown to be a good surrogate endpoint to the overall survival. Sensitivity analysis was carried out to assess the impact of distributional assumptions on the predictions. Also, sensitivity to modelling assumptions and performance of the models were investigated by simulation. Although different methods can predict mean true outcome almost equally well, inclusion of uncertainty around all relevant parameters of the model may lead to less certain and hence more conservative predictions. When investigating endpoints as candidate surrogate outcomes, a careful choice of the meta-analytical approach has to be made. Models underestimating the uncertainty of available evidence may lead to overoptimistic predictions which can then have an effect on decisions made based on such predictions.

August 13, 2015 doi: 10.1177/0962280215597260 open full text
Comparison of bias-corrected covariance estimators for MMRM analysis in longitudinal data with dropouts.
Gosho, M., Hirakawa, A., Noma, H., Maruo, K., Sato, Y.
Statistical Methods in Medical Research: An International Review Journal. August 13, 2015

In longitudinal clinical trials, some subjects will drop out before completing the trial, so their measurements towards the end of the trial are not obtained. Mixed-effects models for repeated measures (MMRM) analysis with "unstructured" (UN) covariance structure are increasingly common as a primary analysis for group comparisons in these trials. Furthermore, model-based covariance estimators have been routinely used for testing the group difference and estimating confidence intervals of the difference in the MMRM analysis using the UN covariance. However, using the MMRM analysis with the UN covariance could lead to convergence problems for numerical optimization, especially in trials with a small-sample size. Although the so-called sandwich covariance estimator is robust to misspecification of the covariance structure, its performance deteriorates in settings with small-sample size. We investigated the performance of the sandwich covariance estimator and covariance estimators adjusted for small-sample bias proposed by Kauermann and Carroll (J Am Stat Assoc 2001; 96: 1387–1396) and Mancl and DeRouen (Biometrics 2001; 57: 126–134) fitting simpler covariance structures through a simulation study. In terms of the type 1 error rate and coverage probability of confidence intervals, Mancl and DeRouen’s covariance estimator with compound symmetry, first-order autoregressive (AR(1)), heterogeneous AR(1), and antedependence structures performed better than the original sandwich estimator and Kauermann and Carroll’s estimator with these structures in the scenarios where the variance increased across visits. The performance based on Mancl and DeRouen’s estimator with these structures was nearly equivalent to that based on the Kenward–Roger method for adjusting the standard errors and degrees of freedom with the UN structure. 
The model-based covariance estimator with the UN structure without adjustment of the degrees of freedom, which is frequently used in applications, resulted in substantial inflation of the type 1 error rate. We recommend the use of Mancl and DeRouen’s estimator in MMRM analysis if the number of subjects completing the trial is (n + 5) or less, where n is the number of planned visits. Otherwise, Kenward and Roger’s method with the UN structure is the preferred choice.

August 13, 2015 doi: 10.1177/0962280215597938 open full text
Sparse estimation of gene-gene interactions in prediction models.
Lee, S., Pawitan, Y., Ingelsson, E., Lee, Y.
Statistical Methods in Medical Research: An International Review Journal. August 13, 2015

Current assessment of gene–gene interactions is typically based on separate parallel analysis, where each interaction term is tested separately, while less attention has been paid to simultaneous estimation of interaction terms in a prediction model. As the number of interaction terms grows fast, sparse estimation is desirable for statistical and interpretability reasons. There is a large literature on sparse estimation, but there is a natural hierarchy between the interaction and its corresponding main effects that requires special considerations. We describe random-effect models that impose sparse estimation of interactions under both strong and weak-hierarchy constraints. We develop an estimation procedure based on the hierarchical-likelihood argument and show that the modelling approach is equivalent to a penalty-based method, with the advantage of the models being more transparent and flexible. We compare the procedure with some standard methods in a simulation study and illustrate its application in an analysis of a gene–gene interaction model to predict body-mass index.

August 13, 2015 doi: 10.1177/0962280215597261 open full text
Introducing the fit-criteria assessment plot - A visualisation tool to assist class enumeration in group-based trajectory modelling.
Klijn, S. L., Weijenberg, M. P., Lemmens, P., van den Brandt, P. A., Lima Passos, V.
Statistical Methods in Medical Research: An International Review Journal. August 11, 2015

Background and objective
Group-based trajectory modelling is a model-based clustering technique applied for the identification of latent patterns of temporal changes. Despite its manifold applications in clinical and health sciences, potential problems of the model selection procedure are often overlooked. The choice of the number of latent trajectories (class-enumeration), for instance, is to a large degree based on statistical criteria that are not fail-safe. Moreover, the process as a whole is not transparent. To facilitate class enumeration, we introduce a graphical summary display of several fit and model adequacy criteria, the fit-criteria assessment plot.
Methods
R code that accepts universal data input is presented. The programme condenses relevant group-based trajectory modelling output information of model fit indices in automated graphical displays. Examples based on real and simulated data are provided to illustrate, assess and validate the utility of the fit-criteria assessment plot.
Results
The fit-criteria assessment plot provides an overview of fit criteria on a single page, placing users in an informed position to make a decision. It does not automatically select the most appropriate model but eases the model assessment procedure.
Conclusions
Fit-criteria assessment plot is an exploratory, visualisation tool that can be employed to assist decisions in the initial and decisive phase of group-based trajectory modelling analysis. Considering group-based trajectory modelling’s widespread resonance in medical and epidemiological sciences, a more comprehensive, easily interpretable and transparent display of the iterative process of class enumeration may foster group-based trajectory modelling’s adequate use.

August 11, 2015 doi: 10.1177/0962280215598665 open full text
Bayesian nonparametric mixed-effects joint model for longitudinal-competing risks data analysis in presence of multiple data features.
Lu, T.
Statistical Methods in Medical Research: An International Review Journal. August 11, 2015

Recently, the joint analysis of longitudinal and survival data has been an active research area. Most joint models focus on survival data with only one type of failure. The research on joint modeling of longitudinal and competing risks survival data is sparse. Even so, many joint models for this type of data assume parametric functional forms for both longitudinal and survival sub-models, which limits their use. Further, the common data features that are usually observed in practice, such as asymmetric distribution and missingness in the response and measurement errors in covariates, need to be taken into account for reliable parameter estimation. The statistical inference is complicated when all these factors are considered simultaneously. In this article, driven by a motivating example, we assume nonparametric functional forms for the varying coefficients in both longitudinal and competing risks survival sub-models. We propose a Bayesian nonparametric mixed-effects joint model for the analysis of longitudinal-competing risks data with asymmetry, missingness, and measurement errors. Simulation studies are conducted to assess the performance of the proposed method. We apply the proposed method to an AIDS dataset and compare a few candidate models under various settings. Some interesting results are reported.

August 11, 2015 doi: 10.1177/0962280215597939 open full text
An extended sequential goodness-of-fit multiple testing method for discrete data.
Castro-Conde, I., Dohler, S., de Una-Alvarez, J.
Statistical Methods in Medical Research: An International Review Journal. August 11, 2015

The sequential goodness-of-fit (SGoF) multiple testing method has recently been proposed as an alternative to the familywise error rate- and the false discovery rate-controlling procedures in high-dimensional problems. For discrete data, the SGoF method may be very conservative. In this paper, we introduce an alternative SGoF-type procedure that takes into account the discreteness of the test statistics. Like the original SGoF, our new method provides weak control of the false discovery rate/familywise error rate but attains false discovery rate levels closer to the desired nominal level, and thus it is more powerful. We study the performance of this method in a simulation study and illustrate its application to a real pharmacovigilance data set.

August 11, 2015 doi: 10.1177/0962280215597580 open full text
A vine copula mixed effect model for trivariate meta-analysis of diagnostic test accuracy studies accounting for disease prevalence.
Nikoloulopoulos, A. K.
Statistical Methods in Medical Research: An International Review Journal. August 11, 2015

A bivariate copula mixed model has been recently proposed to synthesize diagnostic test accuracy studies and it has been shown that it is superior to the standard generalized linear mixed model in this context. Here, we use trivariate vine copulas to extend the bivariate meta-analysis of diagnostic test accuracy studies by accounting for disease prevalence. Our vine copula mixed model includes the trivariate generalized linear mixed model as a special case and can also operate on the original scale of sensitivity, specificity, and disease prevalence. Our general methodology is illustrated by re-analyzing the data of two published meta-analyses. Our study suggests that there can be an improvement on the trivariate generalized linear mixed model in fit to data and makes the argument for moving to vine copula random effects models, especially because of their richness, including reflection asymmetric tail dependence, and their computational feasibility despite their three dimensionality.

August 11, 2015 doi: 10.1177/0962280215596769 open full text
Bias-corrected estimates for logistic regression models for complex surveys with application to the United States' Nationwide Inpatient Sample.
Rader, K. A., Lipsitz, S. R., Fitzmaurice, G. M., Harrington, D. P., Parzen, M., Sinha, D.
Statistical Methods in Medical Research: An International Review Journal. August 11, 2015

For complex surveys with a binary outcome, logistic regression is widely used to model the outcome as a function of covariates. Complex survey sampling designs are typically stratified cluster samples, but consistent and asymptotically unbiased estimates of the logistic regression parameters can be obtained using weighted estimating equations (WEEs) under the naive assumption that subjects within a cluster are independent. Despite the relatively large samples typical of many complex surveys, with rare outcomes, many interaction terms, or analysis of subgroups, the logistic regression parameter estimates from WEEs can be markedly biased, just as with independent samples. In this paper, we propose bias-corrected WEEs for complex survey data. The proposed method is motivated by a study of postoperative complications in laparoscopic cystectomy, using data from the 2009 United States’ Nationwide Inpatient Sample complex survey of hospitals.

August 11, 2015 doi: 10.1177/0962280215596550 open full text
Dependence and independence: Structure and inference.
Vexler, A., Chen, X., Hutson, A. D.
Statistical Methods in Medical Research: An International Review Journal. July 29, 2015

Evaluations of relationships between pairs of variables, including testing for independence, are increasingly important. Erich Leo Lehmann noted that "the study of the power and efficiency of tests of independence is complicated by the difficulty of defining natural classes of alternatives to the hypothesis of independence." This paper presents a general review, discussion and comparison of classical and novel tests of independence. We investigate a broad spectrum of dependence structures with/without random effects, including those that are well addressed in both the applied and the theoretical scientific literatures as well as scenarios when the classical tests of independence may break down completely. Motivated by practical considerations, the impact of random effects in dependence structures is studied in the additive and multiplicative forms. A novel index of dependence is proposed based on the area under the Kendall plot. In conjunction with the scatterplot and the Kendall plot, the proposed method provides a comprehensive presentation of the data in terms of graphing and conceptualizing the dependence. We also present a graphical methodology based on heat maps to effectively compare the powers of various tests. Practical examples illustrate the use of various tests of independence and the graphical representations of dependence structures.

July 29, 2015 doi: 10.1177/0962280215594198 open full text
Bayesian hierarchical models for network meta-analysis incorporating nonignorable missingness.
Zhang, J., Chu, H., Hong, H., Virnig, B. A., Carlin, B. P.
Statistical Methods in Medical Research: An International Review Journal. July 28, 2015

Network meta-analysis expands the scope of a conventional pairwise meta-analysis to simultaneously compare multiple treatments, synthesizing both direct and indirect information and thus strengthening inference. Since most trials only compare two treatments, a typical data set in a network meta-analysis managed as a trial-by-treatment matrix is extremely sparse, like an incomplete block structure with significant missing data. Zhang et al. proposed an arm-based method accounting for correlations among different treatments within the same trial and assuming that absent arms are missing at random. However, in randomized controlled trials, nonignorable missingness or missingness not at random may occur due to deliberate choices of treatments at the design stage. In addition, those undertaking a network meta-analysis may selectively choose treatments to include in the analysis, which may also lead to missingness not at random. In this paper, we extend our previous work to incorporate missingness not at random using selection models. The proposed method is then applied to two network meta-analyses and evaluated through extensive simulation studies. We also provide comprehensive comparisons of a commonly used contrast-based method and the arm-based method via simulations in a technical appendix under missing completely at random and missing at random.

July 28, 2015 doi: 10.1177/0962280215596185 open full text
Self-modeling ordinal regression with time invariant covariates - An application to prostate cancer data.
Shirazi, A. M., Das, K., Pinheiro, A.
Statistical Methods in Medical Research: An International Review Journal. July 28, 2015

In a prostate cancer study, the severity of genito-urinary (bladder) toxicity is assessed for patients who were given different doses of radiation. The ordinal responses (severity of side effects) are recorded longitudinally along with the cancer stage of a patient. Differences among the patients due to time-invariant covariates are captured by the parameters. To build a suitable framework for the analysis of such data, we propose the use of a self-modeling ordinal longitudinal model in which the conditional cumulative probabilities for a category of an outcome are related to a shape-invariant model. Since patients suffering from a common disease usually exhibit a similar pattern, it is natural to build a nonlinear model that is shape invariant. The model is essentially semi-parametric, with the population time curve modeled by a penalized regression spline. A Monte Carlo expectation maximization technique is used to estimate the parameters of the model. A simulation study is also carried out to justify the methodology used.

July 28, 2015 doi: 10.1177/0962280215594493 open full text
Bayesian accelerated failure time model for space-time dependency in a geographically augmented survival model.
Onicescu, G., Lawson, A., Zhang, J., Gebregziabher, M., Wallace, K., Eberth, J. M.
Statistical Methods in Medical Research: An International Review Journal. July 28, 2015

In this paper, we extend the spatially explicit survival model for small area cancer data by allowing dependency between space and time and using accelerated failure time models. Spatial dependency is modeled directly in the definition of the survival, density, and hazard functions. The models are developed in the context of county level aggregated data. Two cases are considered: the first assumes that the spatial and temporal distributions are independent; the second allows for dependency between the spatial and temporal components. We apply the models to prostate cancer data from the Louisiana SEER cancer registry.

July 28, 2015 doi: 10.1177/0962280215596186 open full text
Modelling a response as a function of high-frequency count data: The association between physical activity and fat mass.
Augustin, N. H., Mattocks, C., Faraway, J. J., Greven, S., Ness, A. R.
Statistical Methods in Medical Research: An International Review Journal. July 17, 2015

Accelerometers are widely used in health sciences, ecology and other application areas. They quantify the intensity of physical activity as counts per epoch over a given period of time. Currently, health scientists use very lossy summaries of the accelerometer time series, some of which are based on coarse discretisation of activity levels, and make certain implicit assumptions, including linear or constant effects of physical activity. We propose the histogram as a functional summary for achieving a near lossless dimension reduction, comparability between individual time series and easy interpretability. Using the histogram as a functional summary avoids registration of accelerometer counts in time. In our novel method, a scalar response is regressed on additive multi-dimensional functional predictors, including the histogram of the high-frequency counts, and additive non-linear predictors for other continuous covariates. The method improves on the current state of the art, as it can deal with high-frequency time series of different lengths and missing values and yields a flexible way to model the physical activity effect with fewer assumptions. It also allows the commonly made modelling assumptions to be tested. We investigate the relationship between the response fat mass and physical activity measured by accelerometer, in data from the Avon Longitudinal Study of Parents and Children. Our method allows testing of whether the effect of physical activity varies over its intensity by gender, by time of day or by day of the week. We show that meaningful interpretation requires careful treatment of identifiability constraints in the light of the sum-to-one property of a histogram. We find that the (not necessarily causal) effect of physical activity on kg fat mass is not linear and not constant over the activity intensity.

July 17, 2015 doi: 10.1177/0962280215595832 open full text
A Bayesian sequential design using alpha spending function to control type I error.
Zhu, H., Yu, Q.
Statistical Methods in Medical Research: An International Review Journal. July 17, 2015

We propose in this article a Bayesian sequential design using alpha spending functions to control the overall type I error in phase III clinical trials. We provide algorithms to calculate critical values, power, and sample sizes for the proposed design. Sensitivity analysis is implemented to check the effects of different prior distributions, and conservative priors are recommended. We compare the power and actual sample sizes of the proposed Bayesian sequential design with different alpha spending functions through simulations. We also compare the power of the proposed method with the frequentist sequential design using the same alpha spending function. Simulations show that, at the same sample size, the proposed method provides larger power than the corresponding frequentist sequential design. It also has larger power than the traditional Bayesian sequential design, which sets equal critical values for all interim analyses. When compared with other alpha spending functions, the O’Brien-Fleming alpha spending function has the largest power and is the most conservative in the sense that, at the same sample size, the null hypothesis is the least likely to be rejected at early stages of clinical trials. Finally, we show that adding a futility stopping step to the Bayesian sequential design can reduce the overall type I error and reduce the actual sample sizes.
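
As a rough illustration of what an alpha spending function does, the sketch below evaluates two widely used Lan-DeMets spending functions (O'Brien-Fleming type and Pocock type) at a few information fractions. The function names, the two-sided alpha of 0.05 and the four equally spaced looks are assumptions made for this example; the Bayesian critical-value algorithm of the article is not reproduced here.

```python
from math import e, log, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def obrien_fleming_spending(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t (0 < t <= 1),
    Lan-DeMets O'Brien-Fleming type: 2 - 2 * Phi(z_{alpha/2} / sqrt(t))."""
    if not 0 < t <= 1:
        raise ValueError("information fraction must lie in (0, 1]")
    z = STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2 * (1 - STD_NORMAL.cdf(z / sqrt(t)))


def pocock_spending(t, alpha=0.05):
    """Cumulative alpha spent at information fraction t,
    Lan-DeMets Pocock type: alpha * ln(1 + (e - 1) * t)."""
    if not 0 < t <= 1:
        raise ValueError("information fraction must lie in (0, 1]")
    return alpha * log(1 + (e - 1) * t)


# Hypothetical design with four equally spaced analyses.
fractions = [0.25, 0.5, 0.75, 1.0]
obf_spent = [obrien_fleming_spending(t) for t in fractions]
pocock_spent = [pocock_spending(t) for t in fractions]

for t, a_obf, a_poc in zip(fractions, obf_spent, pocock_spent):
    print(f"t={t:.2f}  OBF-type={a_obf:.5f}  Pocock-type={a_poc:.5f}")
```

Both functions spend the full alpha by the final analysis, but the O'Brien-Fleming type spends very little early on, which matches the abstract's remark that early rejection is least likely under that function.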

July 17, 2015 doi: 10.1177/0962280215595058 open full text
Estimation of the treatment effect under an incomplete block crossover design in binary data - A conditional likelihood approach.
Lui, K.-J.
Statistical Methods in Medical Research: An International Review Journal. July 15, 2015

A random effects logistic regression model is proposed for an incomplete block crossover trial comparing three treatments when the underlying patient response is dichotomous. On the basis of the conditional distributions, the conditional maximum likelihood estimator for the relative effect between treatments and its estimated asymptotic standard error are derived. An asymptotic interval estimator and an exact interval estimator are also developed. Monte Carlo simulation is used to evaluate the performance of these estimators. Both asymptotic and exact interval estimators are found to perform well in a variety of situations. When the number of patients is small, the exact interval estimator, which assures a coverage probability larger than or equal to the desired confidence level, can be especially useful. The data taken from a crossover trial comparing the low and high doses of an analgesic with a placebo for the relief of pain in primary dysmenorrhea are used to illustrate the use of the estimators and the potential usefulness of the incomplete block crossover design.

July 15, 2015 doi: 10.1177/0962280215595434 open full text
Dynamic prediction models for clustered and interval-censored outcomes: Investigating the intra-couple correlation in the risk of dementia.
Rondeau, V., Mauguen, A., Laurent, A., Berr, C., Helmer, C.
Statistical Methods in Medical Research: An International Review Journal. July 15, 2015

Cohorts and clinical trials with interval-censored data and clustered event times are increasingly popular designs. First, the observed outcomes cannot be considered independent, and random effects survival models were introduced to handle this. Second, the failure time is not known exactly; it is only known to have occurred within a certain interval.
We propose here an extension of shared frailty models to handle simultaneously the interval censoring, the clustering and also left truncation due to delayed entry in the cohort. A simulation study to evaluate the proposed method was conducted. The estimated results are used to obtain dynamic predictions for clustered patients, with interval-censored failure times and with a given history. We apply our method to the Three-City study, a prospective cohort with periodic follow-up in order to study prognostic factors of dementia. In this application scheme, couples are natural clusters and an intra-couple correlation might be present with a possible increased risk for dementia for subjects whose partner already developed incident dementia. No significant intra-couple correlation for the risk of dementia was observed before and after adjustments for covariates. We also present individual predictions of dementia underlining the usefulness of dynamic prognostic tools that can take into account the clustering.
The consideration of frailty models for interval-censored and left-truncated data permits useful analysis of very complex clustered data. It could help to improve estimation of the impact of proposed prognostic features in a study with clustering. We proposed here a tractable model and a dynamic prediction tool that can easily be implemented using the R package Frailtypack.

July 15, 2015 doi: 10.1177/0962280215594835 open full text
Bayesian optimal interval design for dose finding in drug-combination trials.
Lin, R., Yin, G.
Statistical Methods in Medical Research: An International Review Journal. July 15, 2015

Interval designs have recently attracted enormous attention due to their simplicity and desirable properties. We develop a Bayesian optimal interval design for dose finding in drug-combination trials. To determine the next dose combination based on the cumulative data, we propose an allocation rule by maximizing the posterior probability that the toxicity rate of the next dose falls inside a prespecified probability interval. The entire dose-finding procedure is nonparametric (model-free), which is thus robust and also does not require the typical "nonparametric" prephase used in model-based designs for drug-combination trials. The proposed two-dimensional interval design enjoys convergence properties for large samples. We conduct simulation studies to demonstrate the finite-sample performance of the proposed method under various scenarios and further make a modification to estimate toxicity contours by parallel dose-finding paths. Simulation results show that on average the performance of the proposed design is comparable with model-based designs, but it is much easier to implement.
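
The allocation idea described above can be sketched in a few lines. This is a simplified illustration and not the authors' design: it assumes an independent uniform Beta(1, 1) prior on the toxicity rate of each dose combination, a hypothetical target interval of (0.2, 0.4), made-up toxicity counts, and it ignores any restriction to admissible neighbouring combinations.

```python
from math import exp, lgamma, log


def beta_pdf(p, a, b):
    """Density of a Beta(a, b) distribution at p, computed on the log scale."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    log_norm = lgamma(a + b) - lgamma(a) - lgamma(b)
    return exp(log_norm + (a - 1) * log(p) + (b - 1) * log(1 - p))


def interval_posterior_prob(n_tox, n_treated, lower, upper, steps=2000):
    """Posterior probability that the toxicity rate lies in (lower, upper),
    assuming a uniform prior, so the posterior is Beta(1 + n_tox, 1 + n_treated - n_tox).
    The integral is approximated with the midpoint rule."""
    a = 1 + n_tox
    b = 1 + n_treated - n_tox
    width = (upper - lower) / steps
    return width * sum(beta_pdf(lower + (i + 0.5) * width, a, b) for i in range(steps))


def choose_next_combination(candidates, lower, upper):
    """Return the candidate combination with the largest interval posterior probability."""
    probs = {
        combo: interval_posterior_prob(n_tox, n_treated, lower, upper)
        for combo, (n_tox, n_treated) in candidates.items()
    }
    best = max(probs, key=probs.get)
    return best, probs


# Hypothetical cumulative data: (dose of drug A, dose of drug B) -> (toxicities, patients treated).
candidates = {(1, 2): (1, 6), (2, 1): (3, 6), (2, 2): (0, 3)}
best_combo, posterior_probs = choose_next_combination(candidates, lower=0.2, upper=0.4)
print(best_combo, {k: round(v, 3) for k, v in posterior_probs.items()})
```

Because each posterior depends only on the observed counts at that combination, no dose-toxicity model has to be fitted, which is the sense in which such a rule is model-free.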

July 15, 2015 doi: 10.1177/0962280215594494 open full text
Estimating the ratio of multivariate recurrent event rates with application to a blood transfusion study.
Ning, J., Rahbar, M. H., Choi, S., Piao, J., Hong, C., del Junco, D. J., Rahbar, E., Fox, E. E., Holcomb, J. B., Wang, M.-C.
Statistical Methods in Medical Research: An International Review Journal. July 09, 2015

In comparative effectiveness studies of multicomponent, sequential interventions like blood product transfusion (plasma, platelets, red blood cells) for trauma and critical care patients, the timing and dynamics of treatment relative to the fragility of a patient’s condition are often overlooked and underappreciated. While many hospitals have established massive transfusion protocols to ensure that physiologically optimal combinations of blood products are rapidly available, the period of time required to achieve a specified massive transfusion standard (e.g. a 1:1 or 1:2 ratio of plasma or platelets:red blood cells) has been ignored. To account for the time-varying characteristics of transfusions, we use semiparametric rate models for multivariate recurrent events to estimate blood product ratios. We use latent variables to account for multiple sources of informative censoring (early surgical or endovascular hemorrhage control procedures or death). The major advantage is that the distributions of latent variables and the dependence structure between the multivariate recurrent events and informative censoring need not be specified. Thus, our approach is robust to complex model assumptions. We establish asymptotic properties and evaluate finite sample performance through simulations, and apply the method to data from the PRospective Observational Multicenter Major Trauma Transfusion study.

July 09, 2015 doi: 10.1177/0962280215593974 open full text
A marginalized two-part model for longitudinal semicontinuous data.
Smith, V. A., Neelon, B., Preisser, J. S., Maciejewski, M. L.
Statistical Methods in Medical Research: An International Review Journal. July 07, 2015

In health services research, it is common to encounter semicontinuous data, characterized by a point mass at zero followed by a right-skewed continuous distribution with positive support. Examples include health expenditures, in which the zeros represent a subpopulation of patients who do not use health services, while the continuous distribution describes the level of expenditures among health services users. Longitudinal semicontinuous data are typically analyzed using two-part random-effect mixtures with one component that models the probability of health services use, and a second component that models the distribution of log-scale positive expenditures among users. However, because the second part conditions on a non-zero response, obtaining interpretable effects of covariates on the combined population of health services users and non-users is not straightforward, even though this is often of greatest interest to investigators. Here, we propose a marginalized two-part model for longitudinal data that allows investigators to obtain the effect of covariates on the overall population mean. The model additionally provides estimates of the overall population mean on the original, untransformed scale, and many covariates take a dual population average and subject-specific interpretation. Using a Bayesian estimation approach, this model maintains the flexibility to include complex random-effect structures and easily estimate functions of the overall mean. We illustrate this approach by evaluating the effect of a copayment increase on health care expenditures in the Veterans Affairs health care system over a four-year period.
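
To make the interpretation problem concrete, the sketch below shows how the overall population mean arises from the two parts of a conventional two-part model. It assumes, purely for illustration, a log-normal distribution for positive values and hypothetical parameter values; under that assumption the overall mean is P(Y > 0) * exp(mu + sigma^2 / 2). The marginalized model of the article parameterizes this overall mean directly, which is not implemented here.

```python
import random
from math import exp


def overall_mean(prob_positive, mu_log, sigma_log):
    """Overall mean of a semicontinuous outcome with a point mass at zero and
    log-normal positive values: P(Y > 0) * exp(mu + sigma^2 / 2)."""
    return prob_positive * exp(mu_log + 0.5 * sigma_log ** 2)


def simulate_mean(prob_positive, mu_log, sigma_log, n=50_000, seed=7):
    """Monte Carlo check: draw zeros and log-normal positives, then average."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        if rng.random() < prob_positive:
            total += rng.lognormvariate(mu_log, sigma_log)
    return total / n


# Hypothetical values: 70% of patients have any expenditure;
# log-expenditure among users has mean 6.0 and standard deviation 0.5.
analytic = overall_mean(0.7, 6.0, 0.5)
simulated = simulate_mean(0.7, 6.0, 0.5)
print(f"analytic overall mean:  {analytic:.1f}")
print(f"simulated overall mean: {simulated:.1f}")
```

A covariate that changes both the probability of use and the log-scale mean affects the overall mean through both factors at once, which is why a coefficient from the second part alone does not describe the effect on the combined population.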

July 07, 2015 doi: 10.1177/0962280215592908 open full text
Estimating negative likelihood ratio confidence when test sensitivity is 100%: A bootstrapping approach.
Marill, K. A., Chang, Y., Wong, K. F., Friedman, A. B.
Statistical Methods in Medical Research: An International Review Journal. July 06, 2015

Objectives
Assessing high-sensitivity tests for mortal illness is crucial in emergency and critical care medicine. Estimating the 95% confidence interval (CI) of the likelihood ratio (LR) can be challenging when sample sensitivity is 100%. We aimed to develop, compare, and automate a bootstrapping method to estimate the negative LR CI when sample sensitivity is 100%.
Methods
The lowest population sensitivity that is most likely to yield sample sensitivity 100% is located using the binomial distribution. Random binomial samples generated using this population sensitivity are then used in the LR bootstrap. A free R program, "bootLR," automates the process. Extensive simulations were performed to determine how often the LR bootstrap and comparator method 95% CIs cover the true population negative LR value. Finally, the 95% CI was compared for theoretical sample sizes and sensitivities approaching and including 100% using: (1) a technique of individual extremes, (2) SAS software based on the technique of Gart and Nam, (3) the Score CI (as implemented in StatXact, SAS, and the R PropCI package), and (4) the bootstrapping technique.
Results
The bootstrapping approach demonstrates appropriate coverage of the nominal 95% CI over a spectrum of populations and sample sizes. Considering a study of sample size 200 with 100 patients with disease, and specificity 60%, the lowest population sensitivity with median sample sensitivity 100% is 99.31%. When all 100 patients with disease test positive, the negative LR 95% CIs are: individual extremes technique (0,0.073), StatXact (0,0.064), SAS Score method (0,0.057), R PropCI (0,0.062), and bootstrap (0,0.048). Similar trends were observed for other sample sizes.
Conclusions
When study samples demonstrate 100% sensitivity, available methods may yield inappropriately wide negative LR CIs. An alternative bootstrapping approach and accompanying free open-source R package were developed to yield realistic estimates easily. This methodology and implementation are applicable to other binomial proportions with homogeneous responses.
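
The figure of 99.31% quoted in the Results can be checked directly: with 100 diseased patients, the population sensitivity p whose median sample sensitivity is 100% satisfies p^100 = 0.5. The sketch below computes that value and then runs a simple parametric bootstrap of the negative LR, (1 - sensitivity) / specificity. It is an illustrative approximation under stated assumptions, not the bootLR code: the function names, the percentile rule for the upper limit, the number of replicates and the seed are all choices made for this example.

```python
import random


def lowest_sensitivity_with_median_100(n_diseased, prob=0.5):
    """Population sensitivity p such that P(all n diseased test positive) = prob,
    i.e. p ** n_diseased = prob."""
    return prob ** (1.0 / n_diseased)


def binomial_draw(rng, n, p):
    """Number of successes in n independent Bernoulli(p) trials."""
    return sum(rng.random() < p for _ in range(n))


def bootstrap_negative_lr_upper(n_diseased, n_healthy, specificity, reps=2000, seed=1):
    """Upper limit (97.5th percentile) of bootstrapped negative LRs, drawing
    sensitivity from the population value found above. Assumed percentile rule."""
    rng = random.Random(seed)
    sens_pop = lowest_sensitivity_with_median_100(n_diseased)
    ratios = []
    for _ in range(reps):
        sens = binomial_draw(rng, n_diseased, sens_pop) / n_diseased
        spec = binomial_draw(rng, n_healthy, specificity) / n_healthy
        if spec > 0:  # guard against division by zero
            ratios.append((1 - sens) / spec)
    ratios.sort()
    return ratios[int(0.975 * (len(ratios) - 1))]


sens_pop = lowest_sensitivity_with_median_100(100)
upper = bootstrap_negative_lr_upper(n_diseased=100, n_healthy=100, specificity=0.6)
print(f"population sensitivity: {sens_pop:.4%}")
print(f"bootstrap upper limit for negative LR: {upper:.3f}")
```

The first printed value reproduces the 99.31% reported in the abstract; the bootstrap limit depends on the seed and on the assumed percentile rule, so it should be read as indicative only.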

July 06, 2015 doi: 10.1177/0962280215592907 open full text
Global tests for novelty.
Ahonen, I., Larocque, D., Nevalainen, J.
Statistical Methods in Medical Research: An International Review Journal. July 06, 2015

Outlier detection covers a wide range of methods aiming at identifying observations that are considered unusual. Novelty detection, on the other hand, seeks observations among newly generated test data that are exceptional compared with previously observed training data. In many applications, the general existence of novelty is of more interest than identifying the individual novel observations. For instance, in high-throughput cancer treatment screening experiments, it is meaningful to test whether any new treatment effects are seen compared with existing compounds. Here, we present hypothesis tests for such global level novelty. The problem is approached through a set of very general assumptions, making it innovative in relation to the current literature. We introduce test statistics capable of detecting novelty. They operate on local neighborhoods and their null distribution is obtained by the permutation principle. We show that they are valid and able to find different types of novelty, e.g. location and scale alternatives. The performance of the methods is assessed with simulations and with applications to real data sets.

July 06, 2015 doi: 10.1177/0962280215591236 open full text
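To make the permutation principle mentioned in the abstract above concrete, here is a minimal sketch of a global novelty test on one-dimensional data. The statistic (mean distance from each test point to its nearest training point) is a simple stand-in chosen for illustration; it is an assumption and not necessarily one of the statistics proposed in the paper.

```python
import random


def mean_nn_distance(test, train):
    """Mean distance from each test point to its nearest training point (1-D)."""
    return sum(min(abs(x - t) for t in train) for x in test) / len(test)


def global_novelty_pvalue(train, test, n_perm=999, seed=0):
    """Permutation p-value: reshuffle train/test labels and recompute the statistic."""
    rng = random.Random(seed)
    observed = mean_nn_distance(test, train)
    pooled = list(train) + list(test)
    n_train = len(train)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        stat = mean_nn_distance(pooled[n_train:], pooled[:n_train])
        if stat >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (1 + hits) / (1 + n_perm)


train = [i / 10 for i in range(10)]   # previously observed data
test_shifted = [5.0, 5.1, 5.2]        # new data, clearly shifted in location
p_value = global_novelty_pvalue(train, test_shifted)
print(p_value)
```

A small p-value suggests that the test set as a whole contains novelty, without identifying which individual observations are novel.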
CUSUM control charts to monitor series of Negative Binomial count data.
Alencar, A. P., Lee Ho, L., Albarracin, O. Y. E.
Statistical Methods in Medical Research: An International Review Journal. June 26, 2015

To detect outbreaks of diseases in public health, several control charts have been proposed in the literature. In this context, the usual generalized linear model may be fitted for counts under a Negative Binomial distribution with a logarithm link function and the population size included as offset to model hospitalization rates. Different statistics are used to build CUSUM control charts to monitor daily hospitalizations and their performances are compared in simulation studies. The main contribution of the current paper is to consider different statistics based on transformations and the deviance residual to build control charts to monitor counts with seasonality effects and evaluate all the assumptions of the monitored statistics. The monitoring of the daily number of hospital admissions due to respiratory diseases for people aged over 65 years in the city of São Paulo, Brazil, is considered as an illustration of the current proposal.

June 26, 2015 doi: 10.1177/0962280215592427 open full text
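The CUSUM idea in the abstract above can be sketched as a one-sided recursion on a monitored statistic. The sketch assumes the inputs are already standardized residuals from a fitted count model; the reference value `k` and decision limit `h` are illustrative choices, not values from the paper.

```python
def upper_cusum(z_scores, k=0.5, h=4.0):
    """One-sided CUSUM: S_t = max(0, S_{t-1} + z_t - k); alarm when S_t > h.

    Returns the CUSUM path and the index of the first alarm (or None).
    """
    s = 0.0
    path = []
    first_alarm = None
    for t, z in enumerate(z_scores):
        s = max(0.0, s + z - k)
        path.append(s)
        if first_alarm is None and s > h:
            first_alarm = t
    return path, first_alarm


# Hypothetical standardized residuals: in control at first, then a sustained upward shift.
z = [0.0, 0.5, -1.0, 2.5, 2.5, 2.5, 0.0]
path, alarm = upper_cusum(z)
print(path, alarm)
```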
Performance of methods for meta-analysis of diagnostic test accuracy with few studies or sparse data.
Takwoingi, Y., Guo, B., Riley, R. D., Deeks, J. J.
Statistical Methods in Medical Research: An International Review Journal. June 26, 2015

Hierarchical models such as the bivariate and hierarchical summary receiver operating characteristic (HSROC) models are recommended for meta-analysis of test accuracy studies. These models are challenging to fit when there are few studies and/or sparse data (for example zero cells in contingency tables due to studies reporting 100% sensitivity or specificity); the models may not converge, or give unreliable parameter estimates. Using simulation, we investigated the performance of seven hierarchical models incorporating increasing simplifications in scenarios designed to replicate realistic situations for meta-analysis of test accuracy studies. Performance of the models was assessed in terms of estimability (percentage of meta-analyses that successfully converged and percentage where the between study correlation was estimable), bias, mean square error and coverage of the 95% confidence intervals. Our results indicate that simpler hierarchical models are valid in situations with few studies or sparse data. For synthesis of sensitivity and specificity, univariate random effects logistic regression models are appropriate when a bivariate model cannot be fitted. Alternatively, an HSROC model that assumes a symmetric SROC curve (by excluding the shape parameter) can be used if the HSROC model is the chosen meta-analytic approach. In the absence of heterogeneity, fixed effect equivalent of the models can be applied.

June 26, 2015 doi: 10.1177/0962280215592269 open full text
The transition model test for serial dependence in mixed-effects models for binary data.
Breinegaard, N., Rabe-Hesketh, S., Skrondal, A.
Statistical Methods in Medical Research: An International Review Journal. June 26, 2015

Generalized linear mixed models for longitudinal data assume that responses at different occasions are conditionally independent, given the random effects and covariates. Although this assumption is pivotal for consistent estimation, violation due to serial dependence is hard to assess by model elaboration. We therefore propose a targeted diagnostic test for serial dependence, called the transition model test (TMT), that is straightforward and computationally efficient to implement in standard software. The TMT is shown to have larger power than general misspecification tests. We also propose the targeted root mean squared error of approximation (TRMSEA) as a measure of the population misfit due to serial dependence.

June 26, 2015 doi: 10.1177/0962280215588123 open full text
Zero-inflated count models for longitudinal measurements with heterogeneous random effects.
Zhu, H., Luo, S., DeSantis, S. M.
Statistical Methods in Medical Research: An International Review Journal. June 24, 2015

Longitudinal zero-inflated count data arise frequently in substance use research when assessing the effects of behavioral and pharmacological interventions. Zero-inflated count models (e.g. zero-inflated Poisson or zero-inflated negative binomial) with random effects have been developed to analyze this type of data. In random effects zero-inflated count models, the random effects covariance matrix is typically assumed to be homogeneous (constant across subjects). However, in many situations this matrix may be heterogeneous (differ by measured covariates). In this paper, we extend zero-inflated count models to account for random effects heterogeneity by modeling their variance as a function of covariates. We show via simulation that ignoring intervention and covariate-specific heterogeneity can produce biased estimates of covariate and random effects. Moreover, those biased estimates can be rectified by correctly modeling the random effects covariance structure. The methodological development is motivated by and applied to the Combined Pharmacotherapies and Behavioral Interventions for Alcohol Dependence (COMBINE) study, the largest clinical trial of alcohol dependence performed in the United States, with 1383 individuals.

June 24, 2015 doi: 10.1177/0962280215588224 open full text
Distribution-free estimation of zero-inflated models with unobserved heterogeneity.
Gilles, R., Kim, S.
Statistical Methods in Medical Research: An International Review Journal. June 24, 2015

This paper presents a quasi-conditional likelihood method for the consistent estimation of both continuous and count data models with excess zeros and unobserved individual heterogeneity when the true data generating process is unknown. Monte Carlo simulation studies show that our zero-inflated quasi-conditional maximum likelihood (ZI-QCML) estimator outperforms other methods and is robust to distributional misspecifications. We apply the ZI-QCML estimator to analyze the frequency of doctor visits.

June 24, 2015 doi: 10.1177/0962280215588940 open full text
Modeling of correlated data with informative cluster sizes: An evaluation of joint modeling and within-cluster resampling approaches.
Zhang, B., Liu, W., Zhang, Z., Qu, Y., Chen, Z., Albert, P. S.
Statistical Methods in Medical Research: An International Review Journal. June 24, 2015

Joint modeling and within-cluster resampling are two approaches that are used for analyzing correlated data with informative cluster sizes. Motivated by a developmental toxicity study, we examined the performances and validity of these two approaches in testing covariate effects in generalized linear mixed-effects models. We show that the joint modeling approach is robust to the misspecification of cluster size models in terms of Type I and Type II errors when the corresponding covariates are not included in the random effects structure; otherwise, statistical tests may be affected. We also evaluate the performance of the within-cluster resampling procedure and thoroughly investigate the validity of it in modeling correlated data with informative cluster sizes. We show that within-cluster resampling is a valid alternative to joint modeling for cluster-specific covariates, but it is invalid for time-dependent covariates. The two methods are applied to a developmental toxicity study that investigated the effect of exposure to diethylene glycol dimethyl ether.

June 24, 2015 doi: 10.1177/0962280215592268 open full text
Cause-specific quantile residual life regression.
Lim, J. Y., Jeong, J.-H.
Statistical Methods in Medical Research: An International Review Journal. June 24, 2015

We propose a cause-specific quantile residual life regression where the cause-specific quantile residual life, defined as the inverse of the cumulative incidence function of the residual life distribution of a specific type of events of interest conditional on a fixed time point, is log-linear in observable covariates. The proposed test statistic for the effects of prognostic factors does not involve estimation of the improper probability density function of the cause-specific residual life distribution under competing risks. The asymptotic distribution of the test statistic is derived. Simulation studies are performed to assess the finite sample properties of the proposed estimating equation and the test statistic. The proposed method is illustrated with a real dataset from a clinical trial on breast cancer.

June 24, 2015 doi: 10.1177/0962280215592426 open full text
Bayesian inference for two-part mixed-effects model using skew distributions, with application to longitudinal semicontinuous alcohol data.
Xing, D., Huang, Y., Chen, H., Zhu, Y., Dagne, G. A., Baldwin, J.
Statistical Methods in Medical Research: An International Review Journal. June 19, 2015

Semicontinuous data featuring an excessive proportion of zeros and right-skewed continuous positive values arise frequently in practice. One example would be the substance abuse/dependence symptoms data for which a substantial proportion of subjects investigated may report zero. Two-part mixed-effects models have been developed to analyze repeated measures of semicontinuous data from longitudinal studies. In this paper, we propose a flexible two-part mixed-effects model with skew distributions for correlated semicontinuous alcohol data under the framework of a Bayesian approach. The proposed model specification consists of two mixed-effects models linked by the correlated random effects: (i) a model on the occurrence of positive values using a generalized logistic mixed-effects model (Part I); and (ii) a model on the intensity of positive values using a linear mixed-effects model where the model errors follow skew distributions including skew-t and skew-normal distributions (Part II). The proposed method is illustrated with alcohol abuse/dependence symptoms data from a longitudinal observational study, and the analytic results are reported by comparing potential models under different random-effects structures. Simulation studies are conducted to assess the performance of the proposed models and method.

June 19, 2015 doi: 10.1177/0962280215590284 open full text
New defective models based on the Kumaraswamy family of distributions with application to cancer data sets.
Rocha, R., Nadarajah, S., Tomazella, V., Louzada, F., Eudes, A.
Statistical Methods in Medical Research: An International Review Journal. June 19, 2015

An alternative to the standard mixture model is proposed for modeling data containing cured elements or a cure fraction. This approach is based on the use of defective distributions to estimate the cure fraction as a function of the estimated parameters. In the literature there are just two of these distributions: the Gompertz and the inverse Gaussian. Here, we propose two new defective distributions: the Kumaraswamy Gompertz and Kumaraswamy inverse Gaussian distributions, extensions of the Gompertz and inverse Gaussian distributions under the Kumaraswamy family of distributions. We show in fact that if a distribution is defective, then its extension under the Kumaraswamy family is defective too. We consider maximum likelihood estimation of the extensions and check its finite sample performance. We use three real cancer data sets to show that the new defective distributions offer better fits than baseline distributions.

June 19, 2015 doi: 10.1177/0962280215587976 open full text
Statistical interactions and Bayes estimation of log odds in case-control studies.
Satagopan, J. M., Olson, S. H., Elston, R. C.
Statistical Methods in Medical Research: An International Review Journal. June 19, 2015

This paper is concerned with the estimation of the logarithm of disease odds (log odds) when evaluating two risk factors, whether or not interactions are present. Statisticians define interaction as a departure from an additive model on a certain scale of measurement of the outcome. Certain interactions, known as removable interactions, may be eliminated by fitting an additive model under an invertible transformation of the outcome. This can potentially provide more precise estimates of log odds than fitting a model with interaction terms. In practice, we may also encounter nonremovable interactions. The model must then include interaction terms, regardless of the choice of the scale of the outcome. However, in practical settings, we do not know at the outset whether an interaction exists, and if so whether it is removable or nonremovable. Rather than trying to decide on significance levels to test for the existence of removable and nonremovable interactions, we develop a Bayes estimator based on a squared error loss function. We demonstrate the favorable bias-variance trade-offs of our approach using simulations, and provide empirical illustrations using data from three published endometrial cancer case-control studies. The methods are implemented in an R program, and available freely at http://www.mskcc.org/biostatistics/~satagopj.

June 19, 2015 doi: 10.1177/0962280214567140 open full text
Accounting for dropout reason in longitudinal studies with nonignorable dropout.
Moore, C. M., MaWhinney, S., Forster, J. E., Carlson, N. E., Allshouse, A., Wang, X., Routy, J.-P., Conway, B., Connick, E.
Statistical Methods in Medical Research: An International Review Journal. June 15, 2015

Dropout is a common problem in longitudinal cohort studies and clinical trials, often raising concerns of nonignorable dropout. Selection, frailty, and mixture models have been proposed to account for potentially nonignorable missingness by relating the longitudinal outcome to time of dropout. In addition, many longitudinal studies encounter multiple types of missing data or reasons for dropout, such as loss to follow-up, disease progression, treatment modifications and death. When clinically distinct dropout reasons are present, it may be preferable to control for both dropout reason and time to gain additional clinical insights. This may be especially interesting when the dropout reason and dropout times differ by the primary exposure variable. We extend a semi-parametric varying-coefficient method for nonignorable dropout to accommodate dropout reason. We apply our method to untreated HIV-infected subjects recruited to the Acute Infection and Early Disease Research Program HIV cohort and compare longitudinal CD4⁺ T cell count in injection drug users to nonusers with two dropout reasons: anti-retroviral treatment initiation and loss to follow-up.

June 15, 2015 doi: 10.1177/0962280215590432 open full text
Combined dynamic predictions using joint models of two longitudinal outcomes and competing risk data.
Andrinopoulou, E.-R., Rizopoulos, D., Takkenberg, J. J., Lesaffre, E.
Statistical Methods in Medical Research: An International Review Journal. June 09, 2015

Nowadays there is an increased medical interest in personalized medicine and tailoring decision making to the needs of individual patients. Within this context our developments are motivated from a Dutch study at the Cardio-Thoracic Surgery Department of the Erasmus Medical Center, consisting of patients who received a human tissue valve in aortic position and who were thereafter monitored echocardiographically. Our aim is to utilize the available follow-up measurements of the current patients to produce dynamically updated predictions of both survival and freedom from re-intervention for future patients. In this paper, we propose to jointly model multiple longitudinal measurements combined with competing risk survival outcomes and derive the dynamically updated cumulative incidence functions. Moreover, we investigate whether different features of the longitudinal processes would change significantly the prediction for the events of interest by considering different types of association structures, such as time-dependent trajectory slopes and time-dependent cumulative effects. Our final contribution focuses on optimizing the quality of the derived predictions. In particular, instead of choosing one final model over a list of candidate models which ignores model uncertainty, we propose to suitably combine predictions from all considered models using Bayesian model averaging.

June 09, 2015 doi: 10.1177/0962280215588340 open full text
A multistate additive relative survival semi-Markov model.
Gillaizeau, F., Dantan, E., Giral, M., Foucher, Y.
Statistical Methods in Medical Research: An International Review Journal. June 07, 2015

Medical researchers are often interested in investigating the relationship between explanatory variables and times-to-events such as disease progression or death. Such multiple times-to-events can be studied using multistate models. For chronic diseases, it may be relevant to consider semi-Markov multistate models because the transition intensities between two clinical states more likely depend on the time already spent in the current state than on the chronological time. When the cause of death for a patient is unavailable or not totally attributable to the disease, it is not possible to specifically study the associations with the excess mortality related to the disease. Relative survival analysis allows an estimate of the net survival in the hypothetical situation where the disease would be the only possible cause of death. In this paper, we propose a semi-Markov additive relative survival (SMRS) model that combines the multistate and the relative survival approaches. The usefulness of the SMRS model is illustrated by two applications with data from a French cohort of kidney transplant recipients. Using simulated data, we also highlight the effectiveness of the SMRS model: the results tend to those obtained if the different causes of death are known.

June 07, 2015 doi: 10.1177/0962280215586456 open full text
A new framework of statistical inferences based on the valid joint sampling distribution of the observed counts in an incomplete contingency table.
Tian, G.-L., Li, H.-Q.
Statistical Methods in Medical Research: An International Review Journal. June 05, 2015

Some existing confidence interval methods and hypothesis testing methods in the analysis of a contingency table with incomplete observations in both margins entirely depend on an underlying assumption that the sampling distribution of the observed counts is a product of independent multinomial/binomial distributions for complete and incomplete counts. However, it can be shown that this independency assumption is incorrect and can result in unreliable conclusions because of the under-estimation of the uncertainty. Therefore, the first objective of this paper is to derive the valid joint sampling distribution of the observed counts in a contingency table with incomplete observations in both margins. The second objective is to provide a new framework for analyzing incomplete contingency tables based on the derived joint sampling distribution of the observed counts by developing a Fisher scoring algorithm to calculate maximum likelihood estimates of parameters of interest, the bootstrap confidence interval methods, and the bootstrap testing hypothesis methods. We compare the differences between the valid sampling distribution and the sampling distribution under the independency assumption. Simulation studies showed that average/expected confidence-interval widths of parameters based on the sampling distribution under the independency assumption are shorter than those based on the new sampling distribution, yielding unrealistic results. A real data set is analyzed to illustrate the application of the new sampling distribution for incomplete contingency tables and the analysis results again confirm the conclusions obtained from the simulation studies.

June 05, 2015 doi: 10.1177/0962280215586591 open full text
Two-stage phase II oncology designs using short-term endpoints for early stopping.
Kunz, C. U., Wason, J. M., Kieser, M.
Statistical Methods in Medical Research: An International Review Journal. June 02, 2015

Phase II oncology trials are conducted to evaluate whether the tumour activity of a new treatment is promising enough to warrant further investigation. The most commonly used approach in this context is a two-stage single-arm design with binary endpoint. As for all designs with interim analysis, its efficiency strongly depends on the relation between recruitment rate and follow-up time required to measure the patients’ outcomes. Usually, recruitment is postponed after the sample size of the first stage is achieved up until the outcomes of all patients are available. This may lead to a considerable increase of the trial length and with it to a delay in the drug development process. We propose a design where an intermediate endpoint is used in the interim analysis to decide whether or not the study is continued with a second stage. Optimal and minimax versions of this design are derived. The characteristics of the proposed design in terms of type I error rate, power, maximum and expected sample size as well as trial duration are investigated. Guidance is given on how to select the most appropriate design. Application is illustrated by a phase II oncology trial in patients with advanced angiosarcoma, which motivated this research.

June 02, 2015 doi: 10.1177/0962280215585819 open full text
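For orientation, the operating characteristics listed in the abstract above can be computed directly for the standard two-stage single-arm design with a binary endpoint (the commonly used approach the abstract refers to, not the proposed intermediate-endpoint design). The sketch assumes the trial stops after stage 1 if at most `r1` of `n1` patients respond, and declares the treatment promising if more than `r` of `n` respond overall. The design values in the usage lines are illustrative.

```python
from math import comb


def binom_pmf(x, n, p):
    """Binomial probability of exactly x responses among n patients."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def two_stage_characteristics(r1, n1, r, n, p):
    """Return (P(early stop), P(declare promising), expected sample size) at response rate p."""
    n2 = n - n1
    pet = sum(binom_pmf(x, n1, p) for x in range(r1 + 1))
    promising = 0.0
    for x1 in range(r1 + 1, n1 + 1):
        # Stage-2 responses needed so that the total exceeds r.
        need = max(0, r - x1 + 1)
        tail = sum(binom_pmf(x2, n2, p) for x2 in range(need, n2 + 1))
        promising += binom_pmf(x1, n1, p) * tail
    expected_n = n1 + (1 - pet) * n2
    return pet, promising, expected_n


# Illustrative design: stop if <= 1/10 respond; promising if > 5/29 respond.
pet, alpha, en = two_stage_characteristics(r1=1, n1=10, r=5, n=29, p=0.10)
print(round(pet, 4), round(en, 2))
```

Evaluating the same function at the response rate of interest under the alternative gives the power; evaluating it at the null rate, as here, gives the type I error rate.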
An imputation-based solution to using mismeasured covariates in propensity score analysis.
Webb-Vargas, Y., Rudolph, K. E., Lenis, D., Murakami, P., Stuart, E. A.
Statistical Methods in Medical Research: An International Review Journal. June 02, 2015

Although covariate measurement error is likely the norm rather than the exception, methods for handling covariate measurement error in propensity score methods have not been widely investigated. We consider a multiple imputation-based approach that uses an external calibration sample with information on the true and mismeasured covariates, multiple imputation for external calibration, to correct for the measurement error, and investigate its performance using simulation studies. As expected, using the covariate measured with error leads to bias in the treatment effect estimate. In contrast, the multiple imputation for external calibration method can eliminate almost all the bias. We confirm that the outcome must be used in the imputation process to obtain good results, a finding related to the idea of congenial imputation and analysis in the broader multiple imputation literature. We illustrate the multiple imputation for external calibration approach using a motivating example estimating the effects of living in a disadvantaged neighborhood on mental health and substance use outcomes among adolescents. These results show that estimating the propensity score using covariates measured with error leads to biased estimates of treatment effects, but when a calibration data set is available, multiple imputation for external calibration can be used to help correct for such bias.

June 02, 2015 doi: 10.1177/0962280215588771 open full text
Bayesian multivariate augmented Beta rectangular regression models for patient-reported outcomes and survival data.
Wang, J., Luo, S.
Statistical Methods in Medical Research: An International Review Journal. June 02, 2015

Many longitudinal studies (e.g. observational studies and randomized clinical trials) have collected multiple rating scales at each visit in the form of patient-reported outcomes (PROs) in the closed unit interval [0,1]. We propose a joint modeling framework to address the issues from the following data features: (1) multiple correlated PROs; (2) the presence of the boundary values of zeros and ones; (3) extreme outliers and heavy tails; (4) the PRO-dependent terminal events such as death and dropout. Our modeling framework consists of a multivariate augmented mixed-effects sub-model based on Beta rectangular distributions for the multiple longitudinal outcomes and a Cox model for the terminal events. The simulation studies suggest that in the presence of outliers, heavy tails, and dependent terminal event, our proposed models provide more accurate parameter estimates than the joint model based on Beta distributions. The proposed models are applied to the motivating Long-term Study-1 (LS-1 study, n = 1741) of Parkinson’s disease patients.

June 02, 2015 doi: 10.1177/0962280215586010 open full text
Approaches for dealing with various sources of overdispersion in modeling count data: Scale adjustment versus modeling.
Payne, E. H., Hardin, J. W., Egede, L. E., Ramakrishnan, V., Selassie, A., Gebregziabher, M.
Statistical Methods in Medical Research: An International Review Journal. May 31, 2015

Overdispersion is a common problem in count data. It can occur due to extra population-heterogeneity, omission of key predictors, and outliers. Unless properly handled, this can lead to invalid inference. Our goal is to assess the differential performance of methods for dealing with overdispersion from several sources. We considered six different approaches: unadjusted Poisson regression (Poisson), deviance-scale-adjusted Poisson regression (DS-Poisson), Pearson-scale-adjusted Poisson regression (PS-Poisson), negative-binomial regression (NB), and two generalized linear mixed models (GLMM) with random intercept, log-link and Poisson (Poisson-GLMM) and negative-binomial (NB-GLMM) distributions. To rank order the preference of the models, we used Akaike's information criteria/Bayesian information criteria values, standard error, and 95% confidence-interval coverage of the parameter values. To compare these methods, we used simulated count data with overdispersion of different magnitude from three different sources. Mean of the count response was associated with three predictors. Data from two real-case studies are also analyzed. The simulation results showed that NB and NB-GLMM were preferred for dealing with overdispersion resulting from any of the sources we considered. Poisson and DS-Poisson often produced smaller standard-error estimates than expected, while PS-Poisson conversely produced larger standard-error estimates. Thus, it is good practice to compare several model options to determine the best method of modeling count data.

May 31, 2015 doi: 10.1177/0962280215588569 open full text
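The Pearson scale adjustment named in the abstract above can be illustrated on the simplest possible case, an intercept-only Poisson model with log link. In that case the fitted mean is the sample mean and the Poisson standard error of the log-mean is 1/sqrt(n * mean). The data and function names below are hypothetical.

```python
def pearson_dispersion(counts):
    """Pearson chi-square divided by residual df for an intercept-only Poisson model."""
    n = len(counts)
    mu = sum(counts) / n
    chi2 = sum((y - mu) ** 2 / mu for y in counts)
    return chi2 / (n - 1)


def log_mean_standard_errors(counts):
    """Return (unadjusted Poisson SE, Pearson-scale-adjusted SE) of the log mean."""
    n = len(counts)
    mu = sum(counts) / n
    poisson_se = (1.0 / (n * mu)) ** 0.5
    adjusted_se = poisson_se * pearson_dispersion(counts) ** 0.5
    return poisson_se, adjusted_se


counts = [0, 0, 1, 1, 2, 8]          # hypothetical overdispersed counts
phi = pearson_dispersion(counts)     # values well above 1 suggest overdispersion
poisson_se, adjusted_se = log_mean_standard_errors(counts)
print(phi, poisson_se, adjusted_se)
```

The adjustment only rescales the standard errors; the point estimate is unchanged, which is one way it differs from fitting a negative-binomial model.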
Testing the trajectory difference in a semi-parametric longitudinal model.
Niu, F., Zhou, J., Le, T. H., Ma, J. Z.
Statistical Methods in Medical Research: An International Review Journal. May 13, 2015

Motivated by a genetic investigation on the progressive decline in renal function in a clinical trial study of kidney disease, we develop a practical test for evaluating the group difference in trajectories under a semi-parametric modeling framework. For the temporal patterns or trajectories of longitudinal data, B-splines are used to approximate the function non-parametrically. Such approximation asymptotically converts the problem of testing trajectory difference into the significance test of regression coefficients that can be simply estimated by generalized estimating equations. To select the optimal number of inner knots for B-splines, a cross-validation procedure is performed using the criterion of the generalized residual sum of squares. The new proposed test successfully detects a significant difference of underlying genetic impact on the progression of renal disease, which is not captured by the parametric approach.

May 13, 2015 doi: 10.1177/0962280215584109 open full text
Random-effects meta-analysis: the number of studies matters.
Guolo, A., Varin, C.
Statistical Methods in Medical Research: An International Review Journal. May 07, 2015

This paper investigates the impact of the number of studies on meta-analysis and meta-regression within the random-effects model framework. It is frequently neglected that inference in random-effects models requires a substantial number of studies included in meta-analysis to guarantee reliable conclusions. Several authors warn about the risk of inaccurate results of the traditional DerSimonian and Laird approach especially in the common case of meta-analysis involving a limited number of studies. This paper presents a selection of likelihood and non-likelihood methods for inference in meta-analysis proposed to overcome the limitations of the DerSimonian and Laird procedure, with a focus on the effect of the number of studies. The applicability and the performance of the methods are investigated in terms of Type I error rates and empirical power to detect effects, according to scenarios of practical interest. Simulation studies and applications to real meta-analyses highlight that it is not possible to identify an approach uniformly superior to alternatives. The overall recommendation is to avoid the DerSimonian and Laird method when the number of meta-analysis studies is modest and prefer a more comprehensive procedure that compares alternative inferential approaches. R code for meta-analysis according to all of the inferential methods examined in the paper is provided.

May 07, 2015 doi: 10.1177/0962280215583568 open full text
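For reference, the DerSimonian and Laird procedure discussed in the abstract above is a method-of-moments estimate of the between-study variance followed by inverse-variance pooling. The sketch below is a plain restatement of that textbook estimator with made-up inputs; it is not the R code supplied with the paper.

```python
def dersimonian_laird(effects, variances):
    """Return (tau^2, pooled random-effects estimate, its standard error)."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q measures heterogeneity around the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi**2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)  # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return tau2, pooled, se


# Hypothetical meta-analysis of three studies with equal within-study variance.
tau2, pooled, se = dersimonian_laird([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(tau2, pooled, se)
```

With only three studies, `tau2` rests on two degrees of freedom, and the standard error treats it as known; this is the kind of small-sample weakness the abstract warns about.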
A non-parametric model to address overdispersed count response in a longitudinal data setting with missingness.
Zhang, H., He, H., Lu, N., Zhu, L., Zhang, B., Zhang, Z., Tang, L.
Statistical Methods in Medical Research: An International Review Journal. May 05, 2015

Count responses are becoming increasingly important in biostatistical analysis because of the development of new biomedical techniques such as next-generation sequencing and digital polymerase chain reaction; a commonly met problem in modeling them with the popular Poisson model is overdispersion. Although it has been studied extensively for cross-sectional observations, addressing overdispersion for longitudinal data without parametric distributional assumptions remains challenging, especially with missing data. In this paper, we propose a method to detect overdispersion in repeated measures in a non-parametric manner by extending the Mann–Whitney–Wilcoxon rank sum test to longitudinal data. In addition, we also incorporate the inverse probability weighted method to address the data missingness. The proposed model is illustrated with both simulated and real study data.

May 05, 2015 doi: 10.1177/0962280215583397 open full text
The performance of inverse probability of treatment weighting and full matching on the propensity score in the presence of model misspecification when estimating the effect of treatment on survival outcomes.
Austin, P. C., Stuart, E. A.
Statistical Methods in Medical Research: An International Review Journal. April 30, 2015

There is increasing interest in estimating the causal effects of treatments using observational data. Propensity-score matching methods are frequently used to adjust for differences in observed characteristics between treated and control individuals in observational studies. Survival or time-to-event outcomes occur frequently in the medical literature, but the use of propensity score methods in survival analysis has not been thoroughly investigated. This paper compares two approaches for estimating the Average Treatment Effect (ATE) on survival outcomes: Inverse Probability of Treatment Weighting (IPTW) and full matching. The performance of these methods was compared in an extensive set of simulations that varied the extent of confounding and the amount of misspecification of the propensity score model. We found that both IPTW and full matching resulted in estimation of marginal hazard ratios with negligible bias when the ATE was the target estimand and the treatment-selection process was weak to moderate. However, when the treatment-selection process was strong, both methods resulted in biased estimation of the true marginal hazard ratio, even when the propensity score model was correctly specified. When the propensity score model was correctly specified, bias tended to be lower for full matching than for IPTW. The reasons for these biases and for the differences between the two methods appeared to be due to some extreme weights generated for each method. Both methods tended to produce more extreme weights as the magnitude of the effects of covariates on treatment selection increased. Furthermore, more extreme weights were observed for IPTW than for full matching. However, the poorer performance of both methods in the presence of a strong treatment-selection process was mitigated by the use of IPTW with restriction and full matching with a caliper restriction when the propensity score model was correctly specified.

April 30, 2015 doi: 10.1177/0962280215584401 open full text
Semiparametric Bayesian analysis of censored linear regression with errors-in-covariates.
Sinha, S., Wang, S.
Statistical Methods in Medical Research: An International Review Journal. April 24, 2015

The accelerated failure time (AFT) model is a well-known alternative to the Cox proportional hazard model for analyzing time-to-event data. In this paper we consider fitting an AFT model to right censored data when a predictor variable is subject to measurement errors. First, without measurement errors, estimation of the model parameters in the AFT model is a challenging task due to the presence of censoring, especially when no specific assumption is made regarding the distribution of the logarithm of the time-to-event. The model complexity increases when a predictor is measured with error. We propose a non-parametric Bayesian method for analyzing such data. The novel component of our approach is to model (1) the distribution of the time-to-event, (2) the distribution of the unobserved true predictor, and (3) the distribution of the measurement errors all non-parametrically using mixtures of the Dirichlet process priors. Along with the parameter estimation we also prescribe how to estimate survival probabilities of the time-to-event. Some operating characteristics of the proposed approach are judged via finite sample simulation studies. We illustrate the proposed method by analyzing a data set from an AIDS clinical trial study.

April 24, 2015 doi: 10.1177/0962280215580668 open full text
Construction of joint confidence regions for the optimal true class fractions of Receiver Operating Characteristic (ROC) surfaces and manifolds.
Bantis, L. E., Nakas, C. T., Reiser, B., Myall, D., Dalrymple-Alford, J. C.
Statistical Methods in Medical Research: An International Review Journal. April 24, 2015

The three-class approach is used for progressive disorders when clinicians and researchers want to diagnose or classify subjects as members of one of three ordered categories based on a continuous diagnostic marker. The decision thresholds or optimal cut-off points required for this classification are often chosen to maximize the generalized Youden index (Nakas et al., Stat Med 2013; 32: 995–1003). The effectiveness of these chosen cut-off points can be evaluated by estimating their corresponding true class fractions and their associated confidence regions. Recently, in the two-class case, parametric and non-parametric methods were investigated for the construction of confidence regions for the pair of the Youden-index-based optimal sensitivity and specificity fractions that can take into account the correlation introduced between sensitivity and specificity when the optimal cut-off point is estimated from the data (Bantis et al., Biomet 2014; 70: 212–223). A parametric approach based on the Box–Cox transformation to normality often works well while for markers having more complex distributions a non-parametric procedure using logspline density estimation can be used instead. The true class fractions that correspond to the optimal cut-off points estimated by the generalized Youden index are correlated similarly to the two-class case. In this article, we generalize these methods to the three- and to the general k-class case which involves the classification of subjects into three or more ordered categories, where ROC surface or ROC manifold methodology, respectively, is typically employed for the evaluation of the discriminatory capacity of a diagnostic marker. We obtain three- and multi-dimensional joint confidence regions for the optimal true class fractions. We illustrate this with an application to the Trail Making Test Part A that has been used to characterize cognitive impairment in patients with Parkinson’s disease.
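
As a rough illustration of the cut-off selection step only (not of the confidence-region methods the article develops), the sketch below picks two ordered cut-offs by grid search so that the sum of the three empirical true class fractions is maximized. Published definitions of the generalized Youden index may rescale this sum, which does not change the maximizing cut-offs. The function names and marker values are hypothetical.

```python
def true_class_fractions(low, mid, high, c1, c2):
    """Empirical fractions correctly classified with cut-offs c1 < c2."""
    tcf1 = sum(x <= c1 for x in low) / len(low)
    tcf2 = sum(c1 < x <= c2 for x in mid) / len(mid)
    tcf3 = sum(x > c2 for x in high) / len(high)
    return tcf1, tcf2, tcf3


def best_cutoffs(low, mid, high):
    """Grid search over observed marker values for the pair (c1, c2)
    that maximizes TCF1 + TCF2 + TCF3; ties keep the first pair found."""
    candidates = sorted(set(low) | set(mid) | set(high))
    best = None
    for i, c1 in enumerate(candidates):
        for c2 in candidates[i + 1:]:
            total = sum(true_class_fractions(low, mid, high, c1, c2))
            if best is None or total > best[0]:
                best = (total, c1, c2)
    return best


# Hypothetical marker values for three ordered classes.
low = [1, 2, 3, 4]
mid = [5, 6, 7, 8]
high = [9, 10, 11, 12]

total, c1, c2 = best_cutoffs(low, mid, high)
print(total, c1, c2)
```

Because the cut-offs are estimated from the same data used to compute the true class fractions, the three fractions are correlated, which is why the article constructs joint rather than separate confidence regions.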

April 24, 2015 doi: 10.1177/0962280215581694 open full text
A permutation test to analyse systematic bias and random measurement errors of medical devices via boosting location and scale models.
Mayr, A., Schmid, M., Pfahlberg, A., Uter, W., Gefeller, O.
Statistical Methods in Medical Research: An International Review Journal. April 24, 2015

Measurement errors of medico-technical devices can be separated into systematic bias and random error. We propose a new method to address both simultaneously via generalized additive models for location, scale and shape (GAMLSS) in combination with permutation tests. More precisely, we extend a recently proposed boosting algorithm for GAMLSS to provide a test procedure to analyse potential device effects on the measurements. We carried out a large-scale simulation study to provide empirical evidence that our method is able to identify possible sources of systematic bias as well as random error under different conditions. Finally, we apply our approach to compare measurements of skin pigmentation from two different devices in an epidemiological study.

April 24, 2015 doi: 10.1177/0962280215581855 open full text
Weibull mixture regression for marginal inference in zero-heavy continuous outcomes.
Gebregziabher, M., Voronca, D., Teklehaimanot, A., Santa Ana, E. J.
Statistical Methods in Medical Research: An International Review Journal. April 22, 2015

Continuous outcomes with a preponderance of zero values are ubiquitous in data that arise from biomedical studies, for example studies of addictive disorders. This is known to lead to violation of standard assumptions in parametric inference and increases the risk of misleading conclusions unless managed properly. Two-part models are commonly used to deal with this problem. However, standard two-part models have limitations with respect to obtaining parameter estimates that have a marginal interpretation of covariate effects, which is important in many biomedical applications. Recently, marginalized two-part models have been proposed, but their development is limited to log-normal and log-skew-normal distributions. Thus, in this paper, we propose a finite mixture approach, with Weibull mixture regression as a special case, to deal with the problem. We use an extensive simulation study to assess the performance of the proposed model in finite samples and to make comparisons with other families of models via statistical information and mean squared error criteria. We demonstrate its application on real data from a randomized controlled trial of addictive disorders. Our results show that a two-component Weibull mixture model is preferred for modeling zero-heavy continuous data when the non-zero part is simulated from Weibull or similar distributions such as Gamma or truncated Gaussian.

April 22, 2015 doi: 10.1177/0962280215583402 open full text
A statistical method for studying correlated rare events and their risk factors.
Xue, X., Kim, M. Y., Wang, T., Kuniholm, M. H., Strickler, H. D.
Statistical Methods in Medical Research: An International Review Journal. April 08, 2015

Longitudinal studies of rare events such as cervical high-grade lesions or colorectal polyps that can recur often involve correlated binary data. Risk factors for these events cannot be reliably examined using conventional statistical methods. For example, logistic regression models that incorporate generalized estimating equations often fail to converge or provide inaccurate results when analyzing data of this type. Although exact methods have been reported, they are complex and computationally difficult. The current paper proposes a mathematically straightforward and easy-to-use two-step approach involving (i) an additive model to measure associations between a rare or uncommon correlated binary event and potential risk factors and (ii) a permutation test to estimate the statistical significance of these associations. Simulation studies showed that the proposed method reliably tests and accurately estimates the associations of exposure with correlated binary rare events. This method was then applied to a longitudinal study of human leukocyte antigen (HLA) genotype and risk of cervical high grade squamous intraepithelial lesions (HSIL) among HIV-infected and HIV-uninfected women. Results showed statistically significant associations of two HLA alleles among HIV-negative but not HIV-positive women, suggesting that immune status may modify the HLA and cervical HSIL association. Overall, the proposed method avoids model nonconvergence problems and provides a computationally simple, accurate, and powerful approach for the analysis of risk factor associations with rare/uncommon correlated binary events.

April 08, 2015 doi: 10.1177/0962280215581112 open full text
Generalized linear mixed models for multi-reader multi-case studies of diagnostic tests.
Liu, W., Pantoja-Galicia, N., Zhang, B., Kotz, R. M., Pennello, G., Zhang, H., Jacob, J., Zhang, Z.
Statistical Methods in Medical Research: An International Review Journal. April 05, 2015

Diagnostic tests are often compared in multi-reader multi-case (MRMC) studies in which a number of cases (subjects with or without the disease in question) are examined by several readers using all tests to be compared. One of the commonly used methods for analyzing MRMC data is the Obuchowski–Rockette (OR) method, which assumes that the true area under the receiver operating characteristic curve (AUC) for each combination of reader and test follows a linear mixed model with fixed effects for test and random effects for reader and the reader–test interaction. This article proposes generalized linear mixed models which generalize the OR model by incorporating a range-appropriate link function that constrains the true AUCs to the unit interval. The proposed models can be estimated by maximizing a pseudo-likelihood based on the approximate normality of AUC estimates. A Monte Carlo expectation-maximization algorithm can be used to maximize the pseudo-likelihood, and a non-parametric bootstrap procedure can be used for inference. The proposed method is evaluated in a simulation study and applied to an MRMC study of breast cancer detection.
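
The idea of a range-appropriate link can be sketched as follows. The logit link used here is an assumption chosen for illustration (the abstract does not name the link), and the effect values are hypothetical. A linear predictor built from a fixed test effect, a random reader effect, and a reader-test interaction can take any real value, and the inverse link maps it back to an AUC between 0 and 1.

```python
import math


def expit(eta):
    """Inverse logit, written to avoid overflow for large negative eta."""
    if eta >= 0:
        return 1.0 / (1.0 + math.exp(-eta))
    z = math.exp(eta)
    return z / (1.0 + z)


def auc_from_effects(test_effect, reader_effect, interaction):
    """Map an additive linear predictor to an AUC on the unit interval."""
    return expit(test_effect + reader_effect + interaction)


# Hypothetical effects on the logit scale for one reader-test combination.
auc = auc_from_effects(test_effect=1.5, reader_effect=0.3, interaction=-0.1)
print(round(auc, 3))

# An additive model on the AUC scale itself could exceed 1 with large effects;
# on the logit scale the implied AUC stays below 1.
large = auc_from_effects(test_effect=3.0, reader_effect=1.0, interaction=0.5)
print(round(large, 3))
```

A linear mixed model placed directly on the AUC scale has no such constraint, which is the limitation of the OR model that the generalized linear mixed models are meant to address.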

April 05, 2015 doi: 10.1177/0962280215579476 open full text
A comparative review of methods for comparing means using partially paired data.
Guo, B., Yuan, Y.
Statistical Methods in Medical Research: An International Review Journal. April 01, 2015

In medical experiments with the objective of testing the equality of two means, data are often partially paired by design or because of missing data. The partially paired data represent a combination of paired and unpaired observations. In this article, we review and compare nine methods for analyzing partially paired data, including the two-sample t-test, paired t-test, corrected z-test, weighted t-test, pooled t-test, optimal pooled t-test, multiple imputation method, mixed model approach, and the test based on a modified maximum likelihood estimate. We compare the performance of these methods through extensive simulation studies that cover a wide range of scenarios with different effect sizes, sample sizes, and correlations between the paired variables, as well as true underlying distributions. The simulation results suggest that when the sample size is moderate, the test based on the modified maximum likelihood estimator is generally superior to the other approaches when the data is normally distributed and the optimal pooled t-test performs the best when the data is not normally distributed, with well-controlled type I error rates and high statistical power; when the sample size is small, the optimal pooled t-test is to be recommended when both variables have missing data and the paired t-test is to be recommended when only one variable has missing data.

April 01, 2015 doi: 10.1177/0962280215577111 open full text
Multi-state modelling of repeated hospitalisation and death in patients with heart failure: The use of large administrative databases in clinical epidemiology.
Ieva, F., Jackson, C. H., Sharples, L. D.
Statistical Methods in Medical Research: An International Review Journal. March 26, 2015

In chronic diseases like heart failure (HF), the disease course and associated clinical event histories for the patient population vary widely. To improve understanding of the prognosis of patients and enable health care providers to assess and manage resources, we wish to jointly model disease progression, mortality and their relation with patient characteristics. We show how episodes of hospitalisation for disease-related events, obtained from administrative data, can be used as a surrogate for disease status. We propose flexible multi-state models for serial hospital admissions and death in HF patients, that are able to accommodate important features of disease progression, such as multiple ordered events and competing risks. Fully parametric and semi-parametric semi-Markov models are implemented using freely available software in R. The models were applied to a dataset from the administrative data bank of the Lombardia region in Northern Italy, which included 15,298 patients who had a first hospitalisation ending in 2006 and 4 years of follow-up thereafter. This provided estimates of the associations of age and gender with rates of hospital admission and length of stay in hospital, and estimates of the expected total time spent in hospital over five years. For example, older patients and men were readmitted more frequently, though the total time in hospital was roughly constant with age. We also discuss the relative merits of parametric and semi-parametric multi-state models, and model assessment and comparison.

March 26, 2015 doi: 10.1177/0962280215578777 open full text
Multiple frailty model for clustered interval-censored data with frailty selection.
Pan, C., Cai, B., Wang, L.
Statistical Methods in Medical Research: An International Review Journal. March 19, 2015

Interval-censored time-to-event data often occur in studies of diseases where the symptoms of interest are not directly observable but require lab examinations for detection. Furthermore, the independence assumption among observations may not be valid if they are from clusters. Some methods have been developed for analysing clustered interval-censored data with a shared frailty to account for overall heterogeneity. In this paper, we propose a multiple frailty proportional hazards model, where we not only account for the baseline heterogeneity and effect variation across clusters for predictors, but also quantify the probabilities of the existence of such frailties. This proposed model will be especially useful for analysing multi-center randomised clinical trials for HIV, infections or progression-free survival in oncology studies.

March 19, 2015 doi: 10.1177/0962280215576987 open full text
Noninferiority studies with multiple reference treatments.
Li-Ching, H., Miin-Jye, W., Hung, C. S., Shing, K. K.
Statistical Methods in Medical Research: An International Review Journal. March 18, 2015

The increasing popularity of noninferiority trials reflects the ongoing efforts to replace existing treatments (reference treatments) with new treatments (experimental treatments) that retain a substantial fraction of the effect of the reference treatments. The adoption of any new treatment has to be vindicated by a demonstration of benefits that outweigh a possible clinically insignificant reduction in the reference treatment efficacy. Statistical methods have been developed to analyze data collected from noninferiority trials. However, these methods focus on cases with only one reference treatment. In this paper, we provide the statistical inferential procedures for situations with multiple reference treatments. The computation of the corresponding critical values for simultaneous testings of noninferiority of several new treatments to multiple reference treatments in the presence of a placebo is provided. Furthermore, for a prespecified level of test power, a technique to determine the optimal sample size before the onset of a noninferiority trial is derived. A clinical example is given to illustrate our proposed procedure.

March 18, 2015 doi: 10.1177/0962280215576017 open full text
Bayesian and frequentist approaches to assessing reliability and precision of health-care provider quality measures.
Staggs, V. S., Gajewski, B. J.
Statistical Methods in Medical Research: An International Review Journal. March 17, 2015

Our purpose was to compare frequentist, empirical Bayes, and Bayesian hierarchical model approaches to estimating reliability of health care quality measures, including construction of credible intervals to quantify uncertainty in reliability estimates, using data on inpatient fall rates on hospital nursing units. Precision of reliability estimates and Bayesian approaches to estimating reliability are not well studied. We analyzed falls data from 2372 medical units; the rate of unassisted falls per 1000 inpatient days was the measure of interest. The Bayesian methods "shrunk" the observed fall rates and frequentist reliability estimates toward their posterior means. We examined the association between reliability and precision in fall rate rankings by plotting the length of a 90% credible interval for each unit’s percentile rank against the unit’s estimated reliability. Precision of rank estimates tended to increase as reliability increased but was limited even at higher reliability levels: Among units with reliability >0.8, only 5.5% had credible interval length <20; among units with reliability >0.9, only 31.9% had credible interval length <20. Thus, a high reliability estimate may not be sufficient to ensure precise differentiation among providers. Bayesian approaches allow for assessment of this precision.

March 17, 2015 doi: 10.1177/0962280215577410 open full text
Sample size determination for logistic regression on a logit-normal distribution.
Kim, S., Heath, E., Heilbrun, L.
Statistical Methods in Medical Research: An International Review Journal. March 04, 2015

Although the sample size for simple logistic regression can be readily determined using currently available methods, the sample size calculation for multiple logistic regression requires some additional information, such as the coefficient of determination ($${R}_{\mathrm{cov}}^{2}$$) of a covariate of interest with other covariates, which is often unavailable in practice. The response variable of logistic regression follows a logit-normal distribution which can be generated from a logistic transformation of a normal distribution. Using this property of logistic regression, we propose new methods of determining the sample size for simple and multiple logistic regressions using a normal transformation of outcome measures. Simulation studies and a motivating example show several advantages of the proposed methods over the existing methods: (i) no need for $${R}_{\mathrm{cov}}^{2}$$ for multiple logistic regression, (ii) available interim or group-sequential designs, and (iii) much smaller required sample size.

March 04, 2015 doi: 10.1177/0962280215572407 open full text
Experimental design and statistical analysis for three-drug combination studies.
Fang, H.-B., Chen, X., Pei, X.-Y., Grant, S., Tan, M.
Statistical Methods in Medical Research: An International Review Journal. March 04, 2015

Drug combination is a critically important therapeutic approach for complex diseases such as cancer and HIV due to its potential for efficacy at lower, less toxic doses and the need to move new therapies rapidly into clinical trials. One of the key issues is to identify which combinations are additive, synergistic, or antagonistic. While the value of multidrug combinations has been well recognized in the cancer research community, to the best of our knowledge, all existing experimental studies rely on fixing the dose of one drug to reduce the dimensionality, e.g. looking at pairwise two-drug combinations, a suboptimal design. Hence, there is an urgent need to develop experimental design and analysis methods for studying multidrug combinations directly. Because the complexity of the problem increases exponentially with the number of constituent drugs, there has been little progress in the development of methods for the design and analysis of high-dimensional drug combinations. In fact, contrary to common mathematical reasoning, the case of three-drug combinations is fundamentally more difficult than two-drug combinations. Apparently, finding doses of the combination, number of combinations, and replicates needed to detect departures from additivity depends on dose–response shapes of individual constituent drugs. Thus, different classes of drugs of different dose–response shapes need to be treated as separate cases. Our application and case studies develop dose-finding and sample size methods for detecting departures from additivity with several common (linear and log-linear) classes of single dose–response curves. Furthermore, utilizing the geometric features of the interaction index, we propose a nonparametric model to estimate the interaction index surface by B-spline approximation and derive its asymptotic properties. Utilizing the method, we designed and analyzed a combination study of three anticancer drugs, PD184, HA14-1, and CEP3891, inhibiting the myeloma H929 cell line. To the best of our knowledge, this is the first ever three-drug combination study performed based on the original 4D dose–response surface formed by dose ranges of three drugs.

March 04, 2015 doi: 10.1177/0962280215574320 open full text
Detecting disease association signals with multiple genetic variants and covariates.
Cheng, K., Lee, J.
Statistical Methods in Medical Research: An International Review Journal. March 02, 2015

Due to the improvements in the efficiency of resequencing technologies, discoveries and analyses of rare variants in sequencing-based association studies at the gene level, or even exome-wide, are becoming increasingly feasible. Powerful association tests have been suggested in the literature for testing whether a group of variants in a gene region is associated with a particular disease of interest. Their performance depends on the correct assumption of the regression model and conditions such as the size of the case and control sample, numbers of causal and noncausal variants (rare or common), variant frequency, effect size and directionality, rate of missing genotype, etc. Most of these model-based tests require genotype data to be complete at each variant. Our previous results showed that in the case of no covariate, the power of these tests might be greatly influenced when there were missing genotypes and only simple imputation was used. In this paper, we demonstrate by simulations that in the presence of covariates, the type I errors of these approaches might be inflated, even when the genotype missing rate was very small. We present an association test based on testing zero proportion of causal variants in the gene region, and show this test to be almost uniformly most powerful among the competing tests under very general simulation conditions. This test does not require genotypes to be complete and hence is robust against missing genotypes. We discuss how to adjust for population stratification based on principal components and show the power loss of this approach was small when the population stratification effect was moderate. We use a Shanghai Breast Cancer Study to demonstrate application of the tests and show the proposed test is more powerful in detecting variants related to breast cancer, and robust against the inclusion of noncausal variants.

March 02, 2015 doi: 10.1177/0962280215574541 open full text
Effect of the absolute statistic on gene-sampling gene-set analysis methods.
Nam, D.
Statistical Methods in Medical Research: An International Review Journal. March 02, 2015

Gene-set enrichment analysis and its modified versions have commonly been used for identifying altered functions or pathways in disease from microarray data. In particular, the simple gene-sampling gene-set analysis methods have been heavily used for datasets with only a few sample replicates. The biggest problem with this approach is the highly inflated false-positive rate. In this paper, the effect of the absolute gene statistic on gene-sampling gene-set analysis methods is systematically investigated. Thus far, the absolute gene statistic has merely been regarded as a supplementary method for capturing the bidirectional changes in each gene set. Here, it is shown that incorporating the absolute gene statistic in gene-sampling gene-set analysis substantially reduces the false-positive rate and improves the overall discriminatory ability. Its effect was investigated by power, false-positive rate, and receiver operating characteristic curve for a number of simulated and real datasets. The performances of gene-set analysis methods in one-tailed (genome-wide association study) and two-tailed (gene expression data) tests were also compared and discussed.

March 02, 2015 doi: 10.1177/0962280215574014 open full text
Estimation of causal effects of binary treatments in unconfounded studies with one continuous covariate.
Gutman, R., Rubin, D.
Statistical Methods in Medical Research: An International Review Journal. February 24, 2015

The estimation of causal effects in nonrandomized studies should comprise two distinct phases: design, with no outcome data available; and analysis of the outcome data according to a specified protocol. Here, we review and compare point and interval estimates of common statistical procedures for estimating causal effects (i.e. matching, subclassification, weighting, and model-based adjustment) with a scalar continuous covariate and a scalar continuous outcome. We show, using an extensive simulation, that some highly advocated methods have poor operating characteristics. In many conditions, matching for the point estimate combined with within-group matching for sampling variance estimation, with or without covariance adjustment, appears to be the most efficient valid method of those evaluated. These results provide new conclusions and advice regarding the merits of currently used procedures.

February 24, 2015 doi: 10.1177/0962280215570722 open full text
Predictive accuracy of novel risk factors and markers: A simulation study of the sensitivity of different performance measures for the Cox proportional hazards regression model.
Austin, P. C., Pencina, M. J., Steyerberg, E. W.
Statistical Methods in Medical Research: An International Review Journal. February 24, 2015

Predicting outcomes that occur over time is important in clinical, population health, and health services research. We compared changes in different measures of performance when a novel risk factor or marker was added to an existing Cox proportional hazards regression model. We performed Monte Carlo simulations for common measures of performance: concordance indices (c, including various extensions to survival outcomes), Royston’s D index, R²-type measures, and Chambless’ adaptation of the integrated discrimination improvement to survival outcomes. We found that the increase in performance due to the inclusion of a risk factor tended to decrease as the performance of the reference model increased. Moreover, the increase in performance increased as the hazard ratio or the prevalence of a binary risk factor increased. Finally, for the concordance indices and R²-type measures, the absolute increase in predictive accuracy due to the inclusion of a risk factor was greater when the observed event rate was higher (low censoring). Amongst the different concordance indices, Chambless and Diao’s c-statistic exhibited the greatest increase in predictive accuracy when a novel risk factor was added to an existing model. Amongst the different R²-type measures, O’Quigley et al.’s modification of Nagelkerke’s R² index and Kent and O’Quigley’s $${\rho }_{w,a}^{2}$$ displayed the greatest sensitivity to the addition of a novel risk factor or marker. These methods were then applied to a cohort of 8635 patients hospitalized with heart failure to examine the added benefit of a point-based scoring system for predicting mortality after initial adjustment with patient age alone.

February 24, 2015 doi: 10.1177/0962280214567141 open full text
Design and analysis of Bayesian adaptive crossover trials for evaluating contact lens safety and efficacy.
Zhang, Q., Toubouti, Y., Carlin, B. P.
Statistical Methods in Medical Research: An International Review Journal. February 19, 2015

A crossover study, also referred to as a crossover trial, is a form of longitudinal study. Subjects are randomly assigned to different arms of the study and receive different treatments sequentially. While there are many frequentist methods to analyze data from a crossover study, random effects models for longitudinal data are perhaps most naturally modeled within a Bayesian framework. In this article, we introduce a Bayesian adaptive approach to crossover studies for both efficacy and safety endpoints using Gibbs sampling. Using simulation, we find our approach can detect a true difference between two treatments with a specific false-positive rate that we can readily control via the standard equal-tail posterior credible interval. We then illustrate our Bayesian approaches using real data from Johnson & Johnson Vision Care, Inc. contact lens studies. We then design a variety of Bayesian adaptive predictive probability crossover studies for single and multiple continuous efficacy endpoints, indicate their extension to binary safety endpoints, and investigate their frequentist operating characteristics via simulation. The Bayesian adaptive approach emerges as a crossover trials tool that is useful yet surprisingly overlooked to date, particularly in contact lens development.

February 19, 2015 doi: 10.1177/0962280215572272 open full text
Joint assessment of dependent discrete disease state processes.
Engler, D., Chitnis, T., Healy, B.
Statistical Methods in Medical Research: An International Review Journal. February 19, 2015

In multiple sclerosis, the primary clinical measure of disability level is an ordinal score, the expanded disability severity scale score. In relapsing-remitting multiple sclerosis, measures of relapse are additionally of interest. Multiple sclerosis patients are typically assessed with regard to both the expanded disability severity scale and relapse state at each follow-up visit. As both are discrete measures, the two can be viewed as jointly dependent Markov processes. One of the main goals of multiple sclerosis research is to accurately model, over time, both transitions between expanded disability severity scale states and change in relapse state. This objective requires a number of significant modeling decisions, including decisions about whether or not the combination of specific disease states is warranted and assessment of the dependence structure between the two disease processes. Historically, such decisions are often made in an ad hoc manner and are not formally justified. We propose novel use of Bayes factors and Bayesian variable selection in the assessment of jointly dependent Markovian processes in multiple sclerosis. Methods are assessed using both simulated data and data collected from the Partners Multiple Sclerosis Center in Boston, MA.

February 19, 2015 doi: 10.1177/0962280215569899 open full text
Bayesian inference on mixed-effects varying-coefficient joint models with skew-t distribution for longitudinal data with multiple features.
Lu, T., Huang, Y.
Statistical Methods in Medical Research: An International Review Journal. February 10, 2015

In AIDS clinical studies, two biomarkers, HIV viral load and CD4 cell count, play important roles. It is well known that there is an inverse relationship between the two. Nevertheless, the relationship is not constant but time varying. The mixed-effects varying-coefficient model is capable of capturing the time-varying nature of this relationship from both population and individual perspectives. In practice, the nucleic acid sequence-based amplification assay is used to measure plasma HIV-1 RNA with a limit of detection (LOD), the CD4 cell counts are usually measured with much noise, and missing data often occur during the treatment. Furthermore, most statistical models assume a symmetric distribution, such as the normal, for the response variables. Oftentimes, the normality assumption does not hold in practice. Therefore, it is important to explore all these factors when modeling the real data. In this article, we establish a joint model that accounts for asymmetric and LOD data for the response variable, and covariate measurement error and missingness simultaneously in the mixed-effects varying-coefficient modeling framework. A Bayesian inference procedure is developed to estimate the parameters in the joint model. The proposed model and method are applied to a real AIDS clinical study and various comparisons of a few models are performed.

February 10, 2015 doi: 10.1177/0962280215569294 open full text
Testing equality and interval estimation of the generalized odds ratio in ordinal data under a three-period crossover design.
Lui, K.-J., Chang, K.-C., Lin, C.-D.
Statistical Methods in Medical Research: An International Review Journal. February 10, 2015

The crossover design can be used to reduce the number of patients required, or to improve power, relative to a parallel groups design when studying treatments for noncurable chronic diseases. We propose using the generalized odds ratio for paired sample data to measure the relative effects in ordinal data between treatments and between periods. We show that one can apply the commonly used asymptotic and exact test procedures for stratified analysis in epidemiology to test non-equality of treatments in ordinal data, as well as obtain asymptotic and exact interval estimators for the generalized odds ratio under a three-period crossover design. We further show that one can apply procedures for testing the homogeneity of the odds ratio under stratified sampling to examine whether there are treatment-by-period interactions. We use data taken from a three-period crossover trial studying the effects of low and high doses of an analgesic versus a placebo for the relief of pain in primary dysmenorrhea to illustrate the use of the test procedures and estimators proposed here.
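
As a rough illustration of the quantity named in the abstract above, the sketch below computes a sample generalized odds ratio for paired ordinal responses, taken here to be P(Y_A > Y_B) / P(Y_A < Y_B). This is a minimal sketch under assumptions: the function name and the data are hypothetical, ties are simply dropped, and no adjustment for period effects or the three-period structure is attempted, so it is not the authors' procedure.

```python
def generalized_odds_ratio(pairs):
    """Sample generalized odds ratio for paired ordinal scores.

    `pairs` holds (score_under_A, score_under_B) for each patient.
    Assumption: GOR = P(Y_A > Y_B) / P(Y_A < Y_B); tied pairs are ignored.
    """
    higher = sum(1 for a, b in pairs if a > b)  # A scored above B
    lower = sum(1 for a, b in pairs if a < b)   # A scored below B
    if lower == 0:
        raise ValueError("no pair with A below B; the ratio is undefined")
    return higher / lower


# Hypothetical pain-relief scores (1 = none ... 4 = complete) for five patients.
scores = [(3, 1), (2, 2), (1, 2), (4, 2), (3, 2)]
print(generalized_odds_ratio(scores))
```

A value above 1 suggests that treatment A tends to produce the higher ordinal response within patients; interval estimation and hypothesis tests require the methods described in the paper.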

February 10, 2015 doi: 10.1177/0962280215569623 open full text
On estimating and testing associations between random coefficients from multivariate generalized linear mixed models of longitudinal outcomes.
Mikulich-Gilbertson, S. K., Wagner, B. D., Riggs, P. D., Zerbe, G. O.
Statistical Methods in Medical Research: An International Review Journal. January 30, 2015

Different types of outcomes (e.g. binary, count, continuous) can be simultaneously modeled with multivariate generalized linear mixed models by assuming: (1) same or different link functions, (2) same or different conditional distributions, and (3) conditional independence given random subject effects. Others have used this approach for determining simple associations between subject-specific parameters (e.g. correlations between slopes). We demonstrate how more complex associations (e.g. partial regression coefficients between slopes adjusting for intercepts, time lags of maximum correlation) can be estimated. Reparameterizing the model to directly estimate coefficients allows us to compare standard errors based on the inverse of the Hessian matrix with more usual standard errors approximated by the delta method; a mathematical proof demonstrates their equivalence when the gradient vector approaches zero. Reparameterization also allows us to evaluate significance of coefficients with likelihood ratio tests and to compare this approach with more usual Wald-type t-tests and Fisher’s z transformations. Simulations indicate that the delta method and inverse Hessian standard errors are nearly equivalent and consistently overestimate the true standard error. Only the likelihood ratio test based on the reparameterized model has an acceptable type I error rate and is therefore recommended for testing associations between stochastic parameters. Online supplementary materials include our medical data example, annotated code, and simulation details.

January 30, 2015 doi: 10.1177/0962280214568522 open full text
Bayesian regression models for the estimation of net cost of disease using aggregate data.
Mitsakakis, N., Tomlinson, G.
Statistical Methods in Medical Research: An International Review Journal. January 23, 2015

Estimation of net costs attributed to a disease or other health condition is very important for health economists and policy makers. Skewness and heteroscedasticity are well-known characteristics of cost data, making linear models generally inappropriate and dictating the use of other types of models, such as gamma regression. Additional hurdles emerge when individual level data are not available. In this paper, we consider the latter case, where data are only available at the aggregate level, containing means and standard deviations for different strata defined by a number of demographic and clinical factors. We summarize a number of methods that can be used for this estimation, and we propose a Bayesian approach that treats the sample stratum-specific standard deviations as stochastic. We investigate the performance of two linear mixed models, comparing them with two proposed gamma regression mixed models, to analyze simulated data generated by gamma and log-normal distributions. Our proposed Bayesian approach seems to have significant advantages for net cost estimation when only aggregate data are available. The implemented gamma models do not seem to offer the expected benefits over the linear models; however, further investigation and refinement are needed.

January 23, 2015 doi: 10.1177/0962280214568110 open full text
Stability metrics for multi-source biomedical data based on simplicial projections from probability distribution distances.
Saez, C., Robles, M., Garcia-Gomez, J. M.
Statistical Methods in Medical Research: An International Review Journal. January 23, 2015

Biomedical data may be composed of individuals generated from distinct, meaningful sources. Due to possible contextual biases in the processes that generate data, there may exist an undesirable and unexpected variability among the probability distribution functions (PDFs) of the source subsamples, which, when uncontrolled, may lead to inaccurate or unreproducible research results. Classical statistical methods may have difficulty uncovering such variability when dealing with multi-modal, multi-type, multi-variate data. This work proposes two metrics for the analysis of stability among multiple data sources, robust to the aforementioned conditions, and defined in the context of data quality assessment. Specifically, a global probabilistic deviation metric and a source probabilistic outlyingness metric are proposed. The first provides a bounded degree of the global multi-source variability, designed as an estimator equivalent to the notion of normalized standard deviation of PDFs. The second provides a bounded degree of the dissimilarity of each source to a latent central distribution. The metrics are based on the projection of a simplex geometrical structure constructed from the Jensen–Shannon distances among the sources' PDFs. The metrics have been evaluated and shown to behave correctly on a simulated benchmark and with real multi-source biomedical data using the UCI Heart Disease data set. The biomedical data quality assessment based on the proposed stability metrics may improve the efficiency and effectiveness of biomedical data exploitation and research.
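
The building block mentioned in the abstract above, the Jensen–Shannon distance between source distributions, can be sketched as follows. This is only an illustration under assumptions: the distributions are discrete histograms over a shared set of bins, the source names and values are made up, and the simplex projection that yields the two proposed metrics is not reproduced here.

```python
import math
from itertools import combinations


def js_distance(p, q):
    """Jensen-Shannon distance: square root of the base-2 JS divergence.

    `p` and `q` are discrete probability vectors over the same bins.
    With base-2 logarithms the result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("distributions must share the same bins")
    m = [(a + b) / 2 for a, b in zip(p, q)]  # mixture distribution

    def kl(x, y):
        # Kullback-Leibler divergence; terms with x_i = 0 contribute 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    jsd = 0.5 * kl(p, m) + 0.5 * kl(q, m)
    return math.sqrt(max(jsd, 0.0))  # guard against tiny negative rounding


def pairwise_js(sources):
    """Distance for every unordered pair of named sources."""
    names = sorted(sources)
    return {(a, b): js_distance(sources[a], sources[b])
            for a, b in combinations(names, 2)}


# Hypothetical histograms of one variable at three data sources.
sources = {
    "site_A": [0.50, 0.30, 0.20],
    "site_B": [0.45, 0.35, 0.20],
    "site_C": [0.10, 0.20, 0.70],
}
for pair, d in pairwise_js(sources).items():
    print(pair, round(d, 3))
```

A source whose distances to all others are large, such as the hypothetical `site_C` here, is the kind of outlying source the paper's second metric is designed to flag.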

January 23, 2015 doi: 10.1177/0962280214545122 open full text
Inverse sampling regression for pooled data.
Montesinos-Lopez, O. A., Montesinos-Lopez, A., Eskridge, K., Crossa, J.
Statistical Methods in Medical Research: An International Review Journal. January 19, 2015

Because pools are tested instead of individuals in group testing, this technique is helpful for estimating prevalence in a population or for classifying a large number of individuals into two groups at a low cost. For this reason, group testing is a well-known means of saving costs and producing precise estimates. In this paper, we develop a mixed-effect group testing regression that is useful when the data-collecting process is performed using inverse sampling. This model allows including covariate information at the individual level to incorporate heterogeneity among individuals and identify which covariates are associated with positive individuals. We present an approach to fit this model using maximum likelihood and perform a simulation study to evaluate the quality of the estimates. Based on the simulation study, we found that the proposed regression method for inverse sampling with group testing produces parameter estimates with low bias when the pre-specified number of positive pools (r) to stop the sampling process is at least 10 and the number of clusters in the sample is also at least 10. We present an application with real data and provide NLMIXED code that researchers can use to implement this method.
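
To make the sampling scheme in the abstract above concrete, here is a minimal sketch of prevalence estimation when pools are tested until r positive pools are observed. It is a deliberately simplified case and rests on assumptions not taken from the paper: no covariates or random effects, equal pool sizes, a perfect test, and a hypothetical function name.

```python
def prevalence_from_inverse_sampling(r, n_pools, pool_size):
    """Estimate individual-level prevalence under inverse (negative binomial) sampling.

    r         -- pre-specified number of positive pools that stops sampling
    n_pools   -- total number of pools tested when the r-th positive appeared
    pool_size -- number of individuals per pool (assumed equal for all pools)
    """
    if not (0 < r <= n_pools) or pool_size < 1:
        raise ValueError("need 0 < r <= n_pools and pool_size >= 1")
    # Maximum likelihood estimate of P(pool tests positive).
    pool_positive = r / n_pools
    # A pool is negative only if all its members are negative:
    # 1 - pool_positive = (1 - prevalence) ** pool_size.
    return 1.0 - (1.0 - pool_positive) ** (1.0 / pool_size)


# Hypothetical example: sampling stopped at the 10th positive pool,
# which was the 40th pool tested, with 5 individuals per pool.
print(round(prevalence_from_inverse_sampling(10, 40, 5), 4))
```

The regression model in the paper generalizes this idea by letting each individual's probability of being positive depend on covariates and cluster-level random effects.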

January 19, 2015 doi: 10.1177/0962280214568047 open full text
A surrogate-primary replacement algorithm for response-adaptive randomization in stroke clinical trials.
Nowacki, A. S., Zhao, W., Palesch, Y. Y.
Statistical Methods in Medical Research: An International Review Journal. January 12, 2015

Response-adaptive randomization (RAR) offers clinical investigators benefit by modifying the treatment allocation probabilities to optimize the ethical, operational, or statistical performance of the trial. Delayed primary outcomes and their effect on RAR have been studied in the literature; however, the incorporation of surrogate outcomes has not been fully addressed. We explore the benefits and limitations of surrogate outcome utilization in RAR in the context of acute stroke clinical trials. We propose a novel surrogate-primary (S-P) replacement algorithm where a patient’s surrogate outcome is used in the RAR algorithm only until their primary outcome becomes available to replace it. Computer simulations investigate the effect of both the delay in obtaining the primary outcome and the underlying surrogate and primary outcome distributional discrepancies on complete randomization, standard RAR and the S-P replacement algorithm methods. Results show that when the primary outcome is delayed, the S-P replacement algorithm reduces the variability of the treatment allocation probabilities and achieves stabilization sooner. Additionally, the S-P replacement algorithm benefit proved to be robust in that it preserved power and reduced the expected number of failures across a variety of scenarios.

January 12, 2015 doi: 10.1177/0962280214567142 open full text
Sequential selection of variables using short permutation procedures and multiple adjustments: An application to genomic data.
Azevedo Costa, M., de Souza Rodrigues, T., da Costa, A. G. F., Natowicz, R., Padua Braga, A.
Statistical Methods in Medical Research: An International Review Journal. January 09, 2015

This work proposes a sequential methodology for selecting variables in classification problems in which the number of predictors is much larger than the sample size. The methodology includes a Monte Carlo permutation procedure that conditionally tests the null hypothesis of no association among the outcomes and the available predictors. In order to improve computing aspects, we propose a new parametric distribution, the Truncated and Zero Inflated Gumbel Distribution. The final application is to find compact classification models with improved performance for genomic data. Results using real data sets show that the proposed methodology selects compact models with optimized classification performances.

January 09, 2015 doi: 10.1177/0962280214566262 open full text
Likelihood-based methods for evaluating principal surrogacy in augmented vaccine trials.
Liu, W., Zhang, B., Zhang, H., Zhang, Z.
Statistical Methods in Medical Research: An International Review Journal. December 29, 2014

There is growing interest in assessing immune biomarkers, which are quick to measure and potentially predictive of long-term efficacy, as surrogate endpoints in randomized, placebo-controlled vaccine trials. This can be done under a principal stratification approach, with principal strata defined using a subject’s potential immune responses to vaccine and placebo (the latter may be assumed to be zero). In this context, principal surrogacy refers to the extent to which vaccine efficacy varies across principal strata. Because a placebo recipient’s potential immune response to vaccine is unobserved in a standard vaccine trial, augmented vaccine trials have been proposed to produce the information needed to evaluate principal surrogacy. This article reviews existing methods based on an estimated likelihood and a pseudo-score (PS) and proposes two new methods based on a semiparametric likelihood (SL) and a pseudo-likelihood (PL), for analyzing augmented vaccine trials. Unlike the PS method, the SL method does not require a model for missingness, which can be advantageous when immune response data are missing by happenstance. The SL method is shown to be asymptotically efficient, and it performs similarly to the PS and PL methods in simulation experiments. The PL method appears to have a computational advantage over the PS and SL methods.

December 29, 2014 doi: 10.1177/0962280214565833 open full text
A goodness-of-fit test for the random-effects distribution in mixed models.
Efendi, A., Drikvandi, R., Verbeke, G., Molenberghs, G.
Statistical Methods in Medical Research: An International Review Journal. December 24, 2014

In this paper, we develop a simple diagnostic test for the random-effects distribution in mixed models. The test is based on the gradient function, a graphical tool proposed by Verbeke and Molenberghs to check the impact of assumptions about the random-effects distribution in mixed models on inferences. Inference is conducted through the bootstrap. The proposed test is easy to implement and applicable in a general class of mixed models. The operating characteristics of the test are evaluated in a simulation study, and the method is further illustrated using two real data analyses.

December 24, 2014 doi: 10.1177/0962280214564721 open full text
Poisson and negative binomial item count techniques for surveys with sensitive question.
Tian, G.-L., Tang, M.-L., Wu, Q., Liu, Y.
Statistical Methods in Medical Research: An International Review Journal. December 16, 2014

Although the item count technique is useful in surveys with sensitive questions, the privacy of respondents who possess the sensitive characteristic of interest may not be well protected due to a defect in its original design. In this article, we propose two new survey designs (namely the Poisson item count technique and the negative binomial item count technique) which replace the several independent Bernoulli random variables required by the original item count technique with a single Poisson or negative binomial random variable, respectively. The proposed models not only provide a closed-form variance estimate and a confidence interval within [0, 1] for the sensitive proportion, but also simplify the survey design of the original item count technique. Most importantly, the new designs do not leak respondents’ privacy. Empirical results show that the proposed techniques perform satisfactorily in the sense that they yield accurate parameter estimates and confidence intervals.

December 16, 2014 doi: 10.1177/0962280214563345 open full text
Doubly robust estimation of attributable fractions in survival analysis.
Sjolander, A., Vansteelandt, S.
Statistical Methods in Medical Research: An International Review Journal. December 16, 2014

The attributable fraction is a commonly used measure that quantifies the public health impact of an exposure on an outcome. It was originally defined for binary outcomes, but an extension has recently been proposed for right-censored survival time outcomes; the so-called attributable fraction function. A maximum likelihood estimator of the attributable fraction function has been developed, which requires a model for the outcome. In this paper, we derive a doubly robust estimator of the attributable fraction function. This estimator requires one model for the outcome, and one joint model for the exposure and censoring. The estimator is consistent if either model is correct, not necessarily both.
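
For orientation, the attributable fraction function referred to in the abstract above is commonly written as AF(t) = 1 - {1 - S_0(t)} / {1 - S(t)}, where S(t) is the factual survival function and S_0(t) is the counterfactual survival function had the exposure been eliminated. The sketch below merely evaluates this definition from two supplied survival probabilities. It assumes those probabilities are already available and does not implement the doubly robust estimator; the function name and numbers are hypothetical.

```python
def attributable_fraction(surv_factual, surv_unexposed):
    """Evaluate AF(t) = 1 - (1 - S_0(t)) / (1 - S(t)) at a single time t.

    surv_factual   -- S(t): survival probability in the population as observed
    surv_unexposed -- S_0(t): survival probability had nobody been exposed
    """
    risk_factual = 1.0 - surv_factual
    risk_unexposed = 1.0 - surv_unexposed
    if risk_factual <= 0.0:
        raise ValueError("no events by time t; AF(t) is undefined")
    return 1.0 - risk_unexposed / risk_factual


# Hypothetical values: 20% have had the event by time t as observed,
# but only 10% would have had it without the exposure.
print(attributable_fraction(0.80, 0.90))
```

In this made-up case roughly half of the events occurring by time t would be attributed to the exposure.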

December 16, 2014 doi: 10.1177/0962280214564003 open full text
A composite likelihood method for bivariate meta-analysis in diagnostic systematic reviews.
Chen, Y., Liu, Y., Ning, J., Nie, L., Zhu, H., Chu, H.
Statistical Methods in Medical Research: An International Review Journal. December 14, 2014

Diagnostic systematic review is a vital step in the evaluation of diagnostic technologies. In many applications, it involves pooling pairs of sensitivity and specificity of a dichotomized diagnostic test from multiple studies. We propose a composite likelihood (CL) method for bivariate meta-analysis in diagnostic systematic reviews. This method provides an alternative way to make inference on diagnostic measures such as sensitivity, specificity, likelihood ratios, and diagnostic odds ratio. Its main advantages over the standard likelihood method are the avoidance of the nonconvergence problem, which is nontrivial when the number of studies is relatively small, the computational simplicity, and some robustness to model misspecifications. Simulation studies show that the CL method maintains high relative efficiency compared to that of the standard likelihood method. We illustrate our method in a diagnostic review of the performance of contemporary diagnostic imaging technologies for detecting metastases in patients with melanoma.

December 14, 2014 doi: 10.1177/0962280214562146 open full text
Finding vulnerable subpopulations in the Seychelles Child Development Study: effect modification with latent groups.
Love, T. M., Thurston, S. W., Davidson, P. W.
Statistical Methods in Medical Research: An International Review Journal. December 14, 2014

The Seychelles Child Development Study is a research project with the objective of examining associations between prenatal exposure to low doses of methylmercury from maternal fish consumption and children’s developmental outcomes. Whether methylmercury has neurotoxic effects at low doses remains unclear and recommendations for pregnant women and children to reduce fish intake may prevent a substantial number of people from receiving sufficient nutrients that are abundant in fish. The primary findings of the Seychelles Child Development Study are inconsistent with adverse associations between methylmercury from fish consumption and neurodevelopmental outcomes. However, whether there are subpopulations of children who are particularly sensitive to this diet is an open question. Secondary analysis from this study found significant interactions between prenatal methylmercury levels and both caregiver IQ and income on 19-month IQ. These results are sensitive to the categories chosen for these covariates and are difficult to interpret collectively. In this paper, we estimate effect modification of the association between prenatal methylmercury exposure and 19-month IQ using a general formulation of mixture regression. Our mixture regression model creates a latent categorical group membership variable which interacts with methylmercury in predicting the outcome. We also fit the same outcome model when in addition the latent variable is assumed to be a parametric function of three distinct socioeconomic measures. Bayesian methods allow group membership and the regression coefficients to be estimated simultaneously and our approach yields a principled choice of the number of distinct subpopulations. The results show three groups with different response patterns between prenatal methylmercury exposure and 19-month IQ in this population.

December 14, 2014 doi: 10.1177/0962280214560044 open full text
Augmented mixed models for clustered proportion data.
Bandyopadhyay, D., Galvis, D. M., Lachos, V. H.
Statistical Methods in Medical Research: An International Review Journal. December 08, 2014

Often in biomedical research, we deal with continuous (clustered) proportion responses ranging between zero and one quantifying the disease status of the cluster units. Interestingly, the study population might also consist of relatively disease-free as well as highly diseased subjects, contributing to proportion values in the interval [0, 1]. Regression on a variety of parametric densities with support lying in (0, 1), such as beta regression, can assess important covariate effects. However, they are deemed inappropriate due to the presence of zeros and/or ones. To evade this, we introduce a class of general proportion density, and further augment the probabilities of zero and one to this general proportion density, controlling for the clustering. Our approach is Bayesian and presents a computationally convenient framework amenable to available freeware. Bayesian case-deletion influence diagnostics based on q-divergence measures are automatic from the Markov chain Monte Carlo output. The methodology is illustrated using both simulation studies and application to a real dataset from a clinical periodontology study.

December 08, 2014 doi: 10.1177/0962280214561093 open full text
Optimal combination of biomarkers for time-dependent receiver operating characteristic estimation and related problems.
Guan, Z., Qin, J.
Statistical Methods in Medical Research: An International Review Journal. November 28, 2014

The receiver operating characteristic curve is commonly used for assessing diagnostic test accuracy and the ability of a medical diagnostic test to discriminate between diseased and non-diseased individuals. With the advance of technology, many genetic variables and biomarker variables are easily collected. The most challenging problem is how to combine clinical, genetic, and biomarker variables together to predict disease status. If one is interested in predicting t-year survival, however, the status of "case" (death) and "control" (survival) at the given t-year is unknown for those individuals who were censored before t-year. To conduct a receiver operating characteristic analysis, one has to impute those ambiguous statuses. In this paper, we study a maximum pseudo likelihood method to estimate the underlying parameters and baseline distribution functions. The proposed approach produces a more efficient and smoother estimate of the optimal time-dependent receiver operating characteristic curve and more stable estimation of the prediction rule for the t-year survivors. More importantly, the proposal is equipped with a goodness-of-fit test for the model assumption based on the bootstrap method. Two real medical data sets are used for illustration.

November 28, 2014 doi: 10.1177/0962280214561506 open full text
Estimating the average treatment effects of nutritional label use using subclassification with regression adjustment.
Lopez, M. J., Gutman, R.
Statistical Methods in Medical Research: An International Review Journal. November 28, 2014

Propensity score methods are common for estimating a binary treatment effect when treatment assignment is not randomized. When exposure is measured on an ordinal scale (e.g. low–medium–high), however, propensity score inference requires extensions which have received limited attention. Estimands of possible interest with an ordinal exposure are the average treatment effects between each pair of exposure levels. Using these estimands, it is possible to determine an optimal exposure level. Traditional methods, including dichotomization of the exposure or a series of binary propensity score comparisons across exposure pairs, are generally inadequate for identification of optimal levels. We combine subclassification with regression adjustment to estimate transitive, unbiased average causal effects across an ordered exposure, and apply our method to the 2005–2006 National Health and Nutrition Examination Survey to estimate the effects of nutritional label use on body mass index.

November 28, 2014 doi: 10.1177/0962280214560046 open full text
Analysis of longitudinal censored semicontinuous data with application to the study of executive dysfunction: The Towers Task.
Lourens, S., Zhang, Y., Long, J. D., Paulsen, J. S.
Statistical Methods in Medical Research: An International Review Journal. November 26, 2014

Executive dysfunction is a deficiency in skills of planning and problem solving that characterizes many neuropsychiatric disorders. The Towers Task is a commonly used measure of planning and problem solving for assessing executive function. Towers Task data are usually zero-inflated and right-censored, and ignoring these features can result in biased inference for the disease characterization of executive dysfunction. In this manuscript, a mixed-effects model for longitudinal censored semicontinuous data is developed for analyzing longitudinal Towers Task data from the PREDICT-HD study. The model is contrasted with current practice, and implications for general use are discussed.

November 26, 2014 doi: 10.1177/0962280214560187 open full text
Adjusting for treatment switching in randomised controlled trials - A simulation study and a simplified two-stage method.
Latimer, N. R., Abrams, K., Lambert, P., Crowther, M., Wailoo, A., Morden, J., Akehurst, R., Campbell, M.
Statistical Methods in Medical Research: An International Review Journal. November 21, 2014

Estimates of the overall survival benefit of new cancer treatments are often confounded by treatment switching in randomised controlled trials (RCTs) – whereby patients randomised to the control group are permitted to switch onto the experimental treatment upon disease progression. In health technology assessment, estimates of the unconfounded overall survival benefit associated with the new treatment are needed. Several switching adjustment methods have been advocated in the literature, some of which have been used in health technology assessment. However, it is unclear which methods are likely to produce least bias in realistic RCT-based scenarios. We simulated RCTs in which switching, associated with patient prognosis, was permitted. Treatment effect size and time dependency, switching proportions and disease severity were varied across scenarios. We assessed the performance of alternative adjustment methods based upon bias, coverage and mean squared error, related to the estimation of true restricted mean survival in the absence of switching in the control group. We found that when the treatment effect was not time-dependent, rank preserving structural failure time models (RPSFTM) and iterative parameter estimation methods produced low levels of bias. However, in the presence of a time-dependent treatment effect, these methods produced higher levels of bias, similar to those produced by an inverse probability of censoring weights method. The inverse probability of censoring weights and structural nested models produced high levels of bias when switching proportions exceeded 85%. A simplified two-stage Weibull method produced low bias across all scenarios and, provided the treatment switching mechanism is suitable, represents an appropriate adjustment method.

November 21, 2014 doi: 10.1177/0962280214557578 open full text
Widen NomoGram for multinomial logistic regression: an application to staging liver fibrosis in chronic hepatitis C patients.
Ardoino, I., Lanzoni, M., Marano, G., Boracchi, P., Sagrini, E., Gianstefani, A., Piscaglia, F., Biganzoli, E. M.
Statistical Methods in Medical Research: An International Review Journal. November 20, 2014

The interpretation of regression model results can often benefit from the generation of nomograms, ‘user friendly’ graphical devices especially useful for assisting decision-making processes. However, in the case of multinomial regression models, whenever categorical responses with more than two classes are involved, nomograms cannot be drawn in the conventional way. Such difficulty in managing and interpreting the outcome can limit the use of multinomial regression in decision-making support. In the present paper, we illustrate the derivation of a non-conventional nomogram for multinomial regression models, intended to overcome this issue. Although it may appear less straightforward at first sight, the proposed methodology allows an easy interpretation of the results of multinomial regression models and makes them more accessible to clinicians and general practitioners too. Development of a prediction model based on multinomial logistic regression and of the pertinent graphical tool is illustrated by means of an example involving the prediction of the extent of liver fibrosis in hepatitis C patients by routinely available markers.

November 20, 2014 doi: 10.1177/0962280214560045 open full text
Events per variable (EPV) and the relative performance of different strategies for estimating the out-of-sample validity of logistic regression models.
Austin, P. C., Steyerberg, E. W.
Statistical Methods in Medical Research: An International Review Journal. November 19, 2014

We conducted an extensive set of empirical analyses to examine the effect of the number of events per variable (EPV) on the relative performance of three different methods for assessing the predictive accuracy of a logistic regression model: apparent performance in the analysis sample, split-sample validation, and optimism correction using bootstrap methods. Using a single dataset of patients hospitalized with heart failure, we compared the estimates of discriminatory performance from these methods to those for a very large independent validation sample arising from the same population. As anticipated, the apparent performance was optimistically biased, with the degree of optimism diminishing as the number of events per variable increased. Differences between the bootstrap-corrected approach and the use of an independent validation sample were minimal once the number of events per variable was at least 20. Split-sample assessment resulted in too pessimistic and highly uncertain estimates of model performance. Apparent performance estimates had lower mean squared error compared to split-sample estimates, but the lowest mean squared error was obtained by bootstrap-corrected optimism estimates. For bias, variance, and mean squared error of the performance estimates, the penalty incurred by using split-sample validation was equivalent to reducing the sample size by a proportion equivalent to the proportion of the sample that was withheld for model validation. In conclusion, split-sample validation is inefficient and apparent performance is too optimistic for internal validation of regression-based prediction models. Modern validation methods, such as bootstrap-based optimism correction, are preferable. While these findings may be unsurprising to many statisticians, the results of the current study reinforce what should be considered good statistical practice in the development and validation of clinical prediction models.
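
The bootstrap optimism correction recommended in the abstract above can be sketched as follows. This is a generic illustration under assumptions, not the authors' code: discrimination is measured by the c-statistic (AUC), and `fit_mean_difference` is a hypothetical stand-in for fitting a logistic regression, used only so the example runs with the standard library.

```python
import random


def auc(scores, labels):
    """c-statistic: P(random event outscores random non-event), ties count 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need both events and non-events")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_mean_difference(X, y):
    """Hypothetical stand-in model: weights are differences in class means."""
    n1 = sum(y)
    n0 = len(y) - n1
    w = []
    for j in range(len(X[0])):
        m1 = sum(row[j] for row, lab in zip(X, y) if lab == 1) / n1
        m0 = sum(row[j] for row, lab in zip(X, y) if lab == 0) / n0
        w.append(m1 - m0)
    return lambda row: sum(wj * xj for wj, xj in zip(w, row))


def optimism_corrected_auc(X, y, fit, n_boot=200, seed=0):
    """Return (apparent AUC, optimism-corrected AUC)."""
    rng = random.Random(seed)
    n = len(y)
    model = fit(X, y)
    apparent = auc([model(row) for row in X], y)
    optimism, done = 0.0, 0
    while done < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        yb = [y[i] for i in idx]
        if len(set(yb)) < 2:
            continue  # skip resamples that lack events or non-events
        Xb = [X[i] for i in idx]
        boot_model = fit(Xb, yb)
        in_sample = auc([boot_model(row) for row in Xb], yb)
        on_original = auc([boot_model(row) for row in X], y)
        optimism += (in_sample - on_original) / n_boot
        done += 1
    return apparent, apparent - optimism


# Hypothetical one-predictor data set with six patients.
X = [[0.1], [0.4], [0.35], [0.8], [0.2], [0.9]]
y = [0, 0, 1, 1, 0, 1]
apparent, corrected = optimism_corrected_auc(X, y, fit_mean_difference)
print(round(apparent, 3), round(corrected, 3))
```

Each bootstrap model is scored both on its own resample and on the original data; the average gap estimates how much the apparent performance flatters the model, and subtracting it gives the corrected estimate. Unlike split-sample validation, every observation contributes to both fitting and evaluation.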

November 19, 2014 doi: 10.1177/0962280214558972 open full text
Bayesian inference for joint modelling of longitudinal continuous, binary and ordinal events.
Li, Q., Pan, J., Belcher, J.
Statistical Methods in Medical Research: An International Review Journal. November 19, 2014

In medical studies, repeated measurements of continuous, binary and ordinal outcomes are routinely collected from the same patient. Instead of modelling each outcome separately, in this study we propose to jointly model the trivariate longitudinal responses, so as to take account of the inherent association between the different outcomes and thus improve statistical inferences. This work is motivated by a large cohort study in the North West of England, involving trivariate responses from each patient: Body Mass Index, Depression (Yes/No) ascertained with a cut-off score of not less than 8 on the Hospital Anxiety and Depression Scale, and Pain Interference generated from the Medical Outcomes Study 36-item short-form health survey with values returned on an ordinal scale 1–5. There are some well-established methods for combined continuous and binary, or even continuous and ordinal responses, but little work has been done on the joint analysis of continuous, binary and ordinal responses. We propose conditional joint random-effects models, which take into account the inherent association between the continuous, binary and ordinal outcomes. Bayesian analysis methods are used to make statistical inferences. Simulation studies show that, by jointly modelling the trivariate outcomes, standard deviations of the estimates of parameters in the models are smaller and much more stable, leading to more efficient parameter estimates and reliable statistical inferences. In the real data analysis, the proposed joint analysis yields a much smaller deviance information criterion value than the separate analysis, and shows other good statistical properties too.

November 19, 2014 doi: 10.1177/0962280214526199 open full text
A stepped wedge design for testing an effect of intranasal insulin on cognitive development of children with Phelan-McDermid syndrome: A comparison of different designs.
Van den Heuvel, E. R., Zwanenburg, R. J., Van Ravenswaaij-Arts, C. M.
Statistical Methods in Medical Research: An International Review Journal. November 19, 2014

This paper compares the power of the parallel group design, the matched-pairs design, and several options for the stepped wedge and delayed start designs for testing a possible effect of intranasal insulin with respect to placebo on developmental growth of children with a rare disorder like Phelan-McDermid syndrome. A subject-specific linear mixed effects model for the primary outcome developmental age in a longitudinal setting with five time points was assumed. Monte Carlo simulation studies with small sample sizes were applied since the rare disorder prohibits large trials. The stepped wedge designs, which were initially preferred for ethical reasons, appear to be competitive in power to other designs and were in some settings even the best. The assumed statistical model also demonstrates that all of the designs can be viewed as a stepped wedge or delayed treatment design. Our results show that the stepped wedge design is an appropriate alternative for randomized controlled trials on developmental growth with small numbers of participants under the formulated statistical conditions.

November 19, 2014 doi: 10.1177/0962280214558864 open full text
Treatment comparison in randomized clinical trials with nonignorable missingness: A reverse regression approach.
Zhang, Z., Cheon, K.
Statistical Methods in Medical Research: An International Review Journal. November 19, 2014

A common problem in randomized clinical trials is nonignorable missingness, namely that the clinical outcome(s) of interest can be missing in a way that is not fully explained by the observed quantities. This happens when the continued participation of patients depends on the current outcome after adjusting for the observed history. Standard methods for handling nonignorable missingness typically require specification of the response mechanism, which can be difficult in practice. This article proposes a reverse regression approach that does not require a model for the response mechanism. Instead, the proposed approach relies on the assumption that missingness is independent of treatment assignment upon conditioning on the relevant outcome(s). This conditional independence assumption is motivated by the observation that, when patients are effectively masked to the assigned treatment, their decision to either stay in the trial or drop out cannot depend on the assigned treatment directly. Under this assumption, one can estimate parameters in the reverse regression model, test for the presence of a treatment effect, and in some cases estimate the outcome distributions. The methodology can be extended to longitudinal outcomes under natural conditions. The proposed approach is illustrated with real data from a cardiovascular study.

November 19, 2014 doi: 10.1177/0962280214558865 open full text
Applying compositional data methodology to nutritional epidemiology.
Leite, M. L. C.
Statistical Methods in Medical Research: An International Review Journal. November 19, 2014

The purpose of epidemiological studies of nutrition and disease is to investigate the effects of specific dietary components regardless of total energy intake, but this is sometimes hampered by the compositional nature of dietary data. Compositional data are those that measure parts of a whole, such as percentages or proportions, and particular methodologies have been developed to allow their statistical analysis and theoretical and practical applications in various sciences. This paper describes the use of a compositional data perspective for statistical analyses in the field of nutritional epidemiology. The approach is based on isometric log-ratio transformation and has been previously proposed for the construction of regression models using compositional explanatory variables. The new isometric log-ratio variables allow full inferences about each element of dietary composition and adjustment by total energy intake. Using data from an Italian population-based study, logistic regression models were fitted to evaluate the effects of the intake of macronutrients (proteins, fats and carbohydrates) on the odds of having metabolic syndrome in middle-aged subjects.
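
To make the isometric log-ratio (ilr) transformation mentioned in this abstract concrete, here is a minimal sketch. It assumes one particular orthonormal basis (a sequential binary partition in which coordinate i contrasts the geometric mean of the first i parts with part i+1); the paper may use a different basis, and the macronutrient shares below are invented for illustration.

```python
import math

def clr(parts):
    """Centred log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(p) for p in parts]
    centre = sum(logs) / len(logs)
    return [v - centre for v in logs]

def ilr(parts):
    """Isometric log-ratio coordinates (D parts -> D-1 real coordinates).

    Coordinate i (1-based) is sqrt(i / (i + 1)) * ln(gm(x_1..x_i) / x_{i+1}),
    one common choice of orthonormal basis; other bases are equally valid.
    """
    if any(p <= 0 for p in parts):
        raise ValueError("all parts must be strictly positive")
    logs = [math.log(p) for p in parts]
    coords = []
    for i in range(1, len(parts)):
        log_gm = sum(logs[:i]) / i          # log geometric mean of first i parts
        coords.append(math.sqrt(i / (i + 1)) * (log_gm - logs[i]))
    return coords

# Hypothetical energy shares: protein, fat, carbohydrate.
shares = [0.2, 0.3, 0.5]
z = ilr(shares)
print(z)
```

The resulting coordinates are unconstrained real numbers, so they can enter a logistic regression as ordinary covariates. Because the basis is orthonormal, the squared length of the ilr vector equals that of the clr vector.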

November 19, 2014 doi: 10.1177/0962280214560047 open full text
Bayesian modeling and prediction of accrual in multi-regional clinical trials.
Deng, Y., Zhang, X., Long, Q.
Statistical Methods in Medical Research: An International Review Journal. November 03, 2014

In multi-regional trials, the underlying overall and region-specific accrual rates often do not hold constant over time, and different regions may have different start-up times. Combined with an initial jump in accrual within each region, this often leads to a discontinuous overall accrual rate. These issues associated with multi-regional trials have not been adequately investigated. In this paper, we clarify the implication of the multi-regional nature on modeling and prediction of accrual in clinical trials and investigate a Bayesian approach for accrual modeling and prediction, which models region-specific accrual using a nonhomogeneous Poisson process and allows the underlying Poisson rate in each region to vary over time. The proposed approach can accommodate staggered start-up times and different initial accrual rates across regions/centers. Our numerical studies show that the proposed method improves accuracy and precision of accrual prediction compared to existing methods including the nonhomogeneous Poisson process model that does not model region-specific accrual.

November 03, 2014 doi: 10.1177/0962280214557581 open full text
Handling incomplete smoking history data in survival analysis.
Furukawa, K., Preston, D. L., Misumi, M., Cullings, H. M.
Statistical Methods in Medical Research: An International Review Journal. October 26, 2014

While data are unavoidably missing or incomplete in most observational studies, consequences of mishandling such incompleteness in analysis are often overlooked. When time-varying information is collected irregularly and infrequently over a long period, even precisely obtained data may implicitly involve substantial incompleteness. Motivated by an analysis to quantitatively evaluate the effects of smoking and radiation on lung cancer risks among Japanese atomic-bomb survivors, we provide a unique application of multiple imputation to incompletely observed smoking histories under the assumption of missing at random. Predicting missing values for the age of smoking initiation and, given initiation, smoking intensity and cessation age, analyses can be based on complete, though partially imputed, smoking histories. A simulation study shows that multiple imputation appropriately conditioned on the outcome and other relevant variables can produce consistent estimates when data are missing at random. Our approach is particularly appealing in large cohort studies where a considerable amount of time-varying information is incomplete under a mechanism depending in a complex manner on other variables. In application to the motivating example, this approach is expected to reduce estimation bias that might be unavoidable in naive analyses, while keeping efficiency by retaining known information.

October 26, 2014 doi: 10.1177/0962280214556794 open full text
Analysis of case-cohort designs with binary outcomes: Improving efficiency using whole-cohort auxiliary information.
Noma, H., Tanaka, S.
Statistical Methods in Medical Research: An International Review Journal. October 26, 2014

The case-cohort design has been widely adopted for reducing the cost of covariate measurements in large prospective cohort studies. Under the case-cohort design, complete covariate data are collected only on randomly sampled cases and a subcohort randomly selected from the whole cohort. For the analysis of case-cohort studies with binary outcomes, logistic regression analysis has been routinely used. However, in many applications, certain covariates are readily measured on all samples from the whole cohort, and the case-cohort design may be regarded as a two-phase sampling design. Using this auxiliary covariate information, estimators for the regression parameters can be substantially improved. In this article, we discuss the theoretical basis of the case-cohort design derived from the formulation of the two-phase design and the improved estimators using whole-cohort auxiliary variable information. In particular, we show that the sampling scheme of the case-cohort design is substantially equivalent to that of conventional two-phase case-control studies (also known as two-stage case-control studies for epidemiologists), i.e., the methodologies of two-phase case-control studies can be directly applied to case-cohort data. Under this framework, we review and apply the following improved estimators to the case-cohort design with binary outcomes: (i) weighted estimators, (ii) a semiparametric maximum likelihood estimator, and (iii) a multiple imputation estimator. In addition, based on the framework of the two-phase design, we can obtain risk ratio and risk difference estimators without the rare-disease assumption. We illustrate these methodologies via simulations and the National Wilms Tumor Study data.

October 26, 2014 doi: 10.1177/0962280214556175 open full text
Number of imputations needed to stabilize estimated treatment difference in longitudinal data analysis.
Lu, K.
Statistical Methods in Medical Research: An International Review Journal. October 10, 2014

Multiple imputation procedures replace each missing value with a set of plausible values based on the posterior predictive distribution of missing data given observed data. In many applications, as few as five imputations are adequate to achieve high efficiency relative to an infinite number of imputations. However, substantially more imputations are often needed to stabilize imputation-based inference at the analysis stage. Imputation-based inference at the analysis stage is considered stable if the conditional variability of the multiple imputation estimator, half-width of 95% confidence interval, test statistic, and estimated fraction of missing information given observed data is within specified thresholds for simulation error. For the estimation of treatment difference at study end for normally distributed responses in longitudinal trials, we calculate the multiple imputation quantities for an infinite number of imputations analytically and use simulations to assess the variability of the number of imputations needed at the analysis stage in repeated sampling.
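
The efficiency claim in this abstract can be illustrated with Rubin's standard combining rules. The sketch below is not taken from the paper: it pools m completed-data estimates and evaluates the usual relative efficiency 1 / (1 + γ/m) of m imputations versus infinitely many, where γ is the fraction of missing information. The numeric inputs are invented for illustration.

```python
from statistics import mean, variance

def pool_rubin(estimates, variances):
    """Combine m completed-data estimates with Rubin's rules.

    Returns (pooled estimate, total variance), where
    total = within + (1 + 1/m) * between.
    """
    m = len(estimates)
    within = mean(variances)       # average completed-data variance
    between = variance(estimates)  # sample variance across imputations (divisor m - 1)
    return mean(estimates), within + (1 + 1 / m) * between

def relative_efficiency(fmi, m):
    """Efficiency of m imputations relative to infinitely many; fmi is the fraction of missing information."""
    return 1 / (1 + fmi / m)

# Hypothetical treatment-difference estimates from three imputed data sets.
q_bar, total_var = pool_rubin([1.0, 1.2, 0.8], [0.04, 0.04, 0.04])
print(q_bar, total_var)

# With 30% missing information, five imputations are already about 94% efficient.
print(round(relative_efficiency(0.3, 5), 4))
```

High relative efficiency does not imply that the pooled estimate, interval half-width, or test statistic would be reproducible if the imputation were rerun, which is the stability question the abstract addresses.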

October 10, 2014 doi: 10.1177/0962280214554439 open full text
Model evaluation based on the negative predictive value for interval-censored survival outcomes.
Han, S., Tsui, K.-W., Andrei, A.-C.
Statistical Methods in Medical Research: An International Review Journal. October 10, 2014

In many cohort studies, time to events such as disease recurrence is recorded in an interval-censored format. An important objective is to predict patient outcomes. Clinicians are interested in predictive covariates. Prediction rules based on the receiver operating characteristic curve alone are not related to the survival endpoint. We propose a model evaluation strategy to leverage the predictive accuracy based on negative predictive functions. Our proposed method makes very few assumptions and only requires a working model to obtain the regression coefficients. A nonparametric estimate of the predictive accuracy provides a simple and flexible approach for model evaluation with interval-censored survival outcomes. The implementation effort is minimal; therefore, this method has an increased potential for immediate use in biomedical data analyses. Simulation studies and a breast cancer trial example further illustrate the practical advantages of this approach.

October 10, 2014 doi: 10.1177/0962280214554253 open full text
On the analysis of composite measures of quality in medical research.
Moineddin, R., Meaney, C., Grunfeld, E.
Statistical Methods in Medical Research: An International Review Journal. October 08, 2014

Composite endpoints are commonplace in biomedical research. The complex nature of many health conditions and medical interventions demands that composite endpoints be employed. Different approaches exist for the analysis of composite endpoints. A Monte Carlo simulation study was employed to assess the statistical properties of various regression methods for analyzing binary composite endpoints. We also applied these methods to data from the BETTER trial, which employed a binary composite endpoint. We demonstrated that type 1 error rates are poor for the Negative Binomial regression model and the logistic generalized linear mixed model (GLMM). Bias was minimal and power was highest in the binomial logistic regression model, the linear regression model, the Poisson (corrected for over-dispersion) regression model and the common effect logistic generalized estimating equation (GEE) model. Convergence was poor in the distinct effect GEE models, the logistic GLMM and some of the zero-one inflated beta regression models. Considering the BETTER trial data, the distinct effect GEE model struggled with convergence and the collapsed composite method estimated an effect that was greatly attenuated compared to other models. All remaining models suggested an intervention effect of similar magnitude. In our simulation study, the binomial logistic regression model (corrected for possible over/under-dispersion), the linear regression model, the Poisson regression model (corrected for over-dispersion) and the common effect logistic GEE model appeared to be unbiased, with good type 1 error rates, power and convergence properties.

October 08, 2014 doi: 10.1177/0962280214553330 open full text
Censored linear regression models for irregularly observed longitudinal data using the multivariate-t distribution.
Garay, A. M., Castro, L. M., Leskow, J., Lachos, V. H.
Statistical Methods in Medical Research: An International Review Journal. October 08, 2014

In acquired immunodeficiency syndrome (AIDS) studies it is quite common to observe viral load measurements collected irregularly over time. Moreover, these measurements can be subject to upper and/or lower detection limits depending on the quantification assays. A complication arises when these continuous repeated measures have a heavy-tailed behavior. For such data structures, we propose a robust structure for a censored linear model based on the multivariate Student’s t-distribution. To compensate for the autocorrelation existing among irregularly observed measures, a damped exponential correlation structure is employed. An efficient expectation maximization type algorithm is developed for computing the maximum likelihood estimates, obtaining as a by-product the standard errors of the fixed effects and the log-likelihood function. The proposed algorithm uses closed-form expressions at the E-step that rely on formulas for the mean and variance of a truncated multivariate Student’s t-distribution. The methodology is illustrated through an application to a Human Immunodeficiency Virus-AIDS (HIV-AIDS) study and several simulation studies.

October 08, 2014 doi: 10.1177/0962280214551191 open full text
Exact one-sided confidence limits for Cohen's kappa as a measurement of agreement.
Shan, G., Wang, W.
Statistical Methods in Medical Research: An International Review Journal. October 06, 2014

Cohen’s kappa coefficient, κ, is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative items. In this paper, we focus on interval estimation of κ in the case of two raters and binary items. So far, only asymptotic and bootstrap intervals are available for κ due to its complexity. However, there is no guarantee that such intervals will capture κ with the desired nominal level 1–α. In other words, the statistical inferences based on these intervals are not reliable. We apply the Buehler method to obtain exact confidence intervals based on four widely used asymptotic intervals, three Wald-type confidence intervals and one interval constructed from a profile variance. These exact intervals are compared with regard to coverage probability and length for small to medium sample sizes. The exact intervals based on the Garner interval and the Lee and Tu interval are generally recommended for use in practice due to good performance in both coverage probability and length.
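
For readers unfamiliar with the statistic, the point estimate of Cohen's kappa for two raters can be computed directly from the agreement table, as sketched below. This covers only the estimate, not the exact Buehler-type confidence limits that the paper develops, and the 2×2 counts are invented for illustration.

```python
def cohens_kappa(table):
    """Cohen's kappa from a square agreement table.

    table[i][j] = number of items rated category i by rater A and j by rater B.
    kappa = (p_obs - p_exp) / (1 - p_exp), where p_exp is the agreement
    expected by chance from the two raters' marginal distributions.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    p_obs = sum(table[i][i] for i in range(k)) / n
    row_marg = [sum(table[i]) / n for i in range(k)]
    col_marg = [sum(table[i][j] for i in range(k)) / n for j in range(k)]
    p_exp = sum(r * c for r, c in zip(row_marg, col_marg))
    if p_exp == 1:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_obs - p_exp) / (1 - p_exp)

# Hypothetical binary ratings of 50 items by two raters.
ratings = [[20, 5],
           [10, 15]]
print(cohens_kappa(ratings))
```

In this example the observed agreement is 0.7 and the chance agreement is 0.5, so kappa is (0.7 - 0.5) / (1 - 0.5) = 0.4.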

October 06, 2014 doi: 10.1177/0962280214552881 open full text
Confidence intervals for a difference between lognormal means in cluster randomization trials.
Poirier, J., Zou, G., Koval, J.
Statistical Methods in Medical Research: An International Review Journal. September 29, 2014

Cluster randomization trials, in which intact social units are randomized to different interventions, have become popular in the last 25 years. Outcomes from these trials in many cases are positively skewed, following approximately lognormal distributions. When inference is focused on the difference between treatment arm arithmetic means, existent confidence interval procedures either make restricting assumptions or are complex to implement. We approach this problem by assuming log-transformed outcomes from each treatment arm follow a one-way random effects model. The treatment arm means are functions of multiple parameters for which separate confidence intervals are readily available, suggesting that the method of variance estimates recovery may be applied to obtain closed-form confidence intervals. A simulation study showed that this simple approach performs well in small sample sizes in terms of empirical coverage, relatively balanced tail errors, and interval widths as compared to existing methods. The methods are illustrated using data arising from a cluster randomization trial investigating a critical pathway for the treatment of community acquired pneumonia.

September 29, 2014 doi: 10.1177/0962280214552291 open full text
Finite-sample corrected generalized estimating equation of population average treatment effects in stepped wedge cluster randomized trials.
Scott, J. M., deCamp, A., Juraska, M., Fay, M. P., Gilbert, P. B.
Statistical Methods in Medical Research: An International Review Journal. September 29, 2014

Stepped wedge designs are increasingly commonplace and advantageous for cluster randomized trials when it is both unethical to assign placebo, and it is logistically difficult to allocate an intervention simultaneously to many clusters. We study marginal mean models fit with generalized estimating equations for assessing treatment effectiveness in stepped wedge cluster randomized trials. This approach has advantages over the more commonly used mixed models that (1) the population-average parameters have an important interpretation for public health applications and (2) they avoid untestable assumptions on latent variable distributions and avoid parametric assumptions about error distributions, therefore, providing more robust evidence on treatment effects. However, cluster randomized trials typically have a small number of clusters, rendering the standard generalized estimating equation sandwich variance estimator biased and highly variable and hence yielding incorrect inferences. We study the usual asymptotic generalized estimating equation inferences (i.e., using sandwich variance estimators and asymptotic normality) and four small-sample corrections to generalized estimating equation for stepped wedge cluster randomized trials and for parallel cluster randomized trials as a comparison. We show by simulation that the small-sample corrections provide improvement, with one correction appearing to provide at least nominal coverage even with only 10 clusters per group. These results demonstrate the viability of the marginal mean approach for both stepped wedge and parallel cluster randomized trials. We also study the comparative performance of the corrected methods for stepped wedge and parallel designs, and describe how the methods can accommodate interval censoring of individual failure times and incorporate semiparametric efficient estimators.

September 29, 2014 doi: 10.1177/0962280214552092 open full text
Detecting associated single-nucleotide polymorphisms on the X chromosome in case control genome-wide association studies.
Chen, Z., Ng, H. K. T., Li, J., Liu, Q., Huang, H.
Statistical Methods in Medical Research: An International Review Journal. September 24, 2014

In the past decade, hundreds of genome-wide association studies have been conducted to detect the significant single-nucleotide polymorphisms that are associated with certain diseases. However, most of the data from the X chromosome were not analyzed and only a few significant associated single-nucleotide polymorphisms from the X chromosome have been identified from genome-wide association studies. This is mainly due to the lack of powerful statistical tests. In this paper, we propose a novel statistical approach that combines the information of single-nucleotide polymorphisms on the X chromosome from both males and females in an efficient way. The proposed approach avoids the need to make strong assumptions about the underlying genetic models. Our proposed statistical test is a robust method that only makes the assumption that the risk allele is the same for both females and males if the single-nucleotide polymorphism is associated with the disease for both genders. Through a simulation study and a real data application, we show that the proposed procedure is robust and has excellent performance compared to existing methods. We expect that many more associated single-nucleotide polymorphisms on the X chromosome will be identified if the proposed approach is applied to currently available genome-wide association studies data.

September 24, 2014 doi: 10.1177/0962280214551815 open full text
Jackknife variance of the partial area under the empirical receiver operating characteristic curve.
Bandos, A. I., Guo, B., Gur, D.
Statistical Methods in Medical Research: An International Review Journal. September 16, 2014

Receiver operating characteristic analysis provides an important methodology for assessing traditional (e.g., imaging technologies and clinical practices) and new (e.g., genomic studies, biomarker development) diagnostic problems. The area under the clinically/practically relevant part of the receiver operating characteristic curve (partial area or partial area under the receiver operating characteristic curve) is an important performance index summarizing diagnostic accuracy at multiple operating points (decision thresholds) that are relevant to actual clinical practice. A robust estimate of the partial area under the receiver operating characteristic curve is provided by the area under the corresponding part of the empirical receiver operating characteristic curve. We derive a closed-form expression for the jackknife variance of the partial area under the empirical receiver operating characteristic curve. Using the derived analytical expression, we investigate the differences between the jackknife variance and a conventional variance estimator. The relative properties in finite samples are demonstrated in a simulation study. The developed formula enables an easy way to estimate the variance of the empirical partial area under the receiver operating characteristic curve, thereby substantially reducing the computation burden, and provides important insight into the structure of the variability. We demonstrate that when compared with the conventional approach, the jackknife variance has substantially smaller bias, and leads to a more appropriate type I error rate of the Wald-type test. The use of the jackknife variance is illustrated in the analysis of a data set from a diagnostic imaging study.

September 16, 2014 doi: 10.1177/0962280214551190 open full text
A multi-stage drop-the-losers design for multi-arm clinical trials.
Wason, J., Stallard, N., Bowden, J., Jennison, C.
Statistical Methods in Medical Research: An International Review Journal. September 16, 2014

Multi-arm multi-stage trials can improve the efficiency of the drug development process when multiple new treatments are available for testing. A group-sequential approach can be used in order to design multi-arm multi-stage trials, using an extension to Dunnett’s multiple-testing procedure. The actual sample size used in such a trial is a random variable that has high variability. This can cause problems when applying for funding as the cost will also be generally highly variable. This motivates a type of design that provides the efficiency advantages of a group-sequential multi-arm multi-stage design, but has a fixed sample size. One such design is the two-stage drop-the-losers design, in which a number of experimental treatments, and a control treatment, are assessed at a prescheduled interim analysis. The best-performing experimental treatment and the control treatment then continue to a second stage. In this paper, we discuss extending this design to have more than two stages, which is shown to considerably reduce the sample size required. We also compare the resulting sample size requirements to the sample size distribution of analogous group-sequential multi-arm multi-stage designs. The sample size required for a multi-stage drop-the-losers design is usually higher than, but close to, the median sample size of a group-sequential multi-arm multi-stage trial. In many practical scenarios, the disadvantage of a slight loss in average efficiency would be overcome by the huge advantage of a fixed sample size. We assess the impact of delay between recruitment and assessment as well as unknown variance on the drop-the-losers designs.

September 16, 2014 doi: 10.1177/0962280214550759 open full text
Misclassification of outcome in case-control studies: Methods for sensitivity analysis.
Gilbert, R., Martin, R. M., Donovan, J., Lane, J. A., Hamdy, F., Neal, D. E., Metcalfe, C.
Statistical Methods in Medical Research: An International Review Journal. September 12, 2014

Case–control studies are potentially open to misclassification of disease outcome which may be unrelated to risk factor exposure (non-differential), thus underestimating associations, or related to risk factor exposure (differential), thus causing more serious bias.
We conducted a systematic literature review for methods of adjusting for outcome misclassification in case–control studies. We also applied methods to simulated data with known outcome misclassification to assess performance of these methods. Finally, real data from the Prostate Testing for Cancer and Treatment (ProtecT) randomised controlled trial gauged the usefulness of these methods.
Adjustment methods range from recalculating cell frequencies to probabilistic sensitivity modelling and Bayesian models, which incorporate uncertainty in sensitivity and specificity estimates. Simulated data indicated that substantial bias in either direction resulted from differential misclassification. More sophisticated methods, incorporating uncertainty into estimates of misclassification, provided appropriately wide confidence intervals for corrected estimates of risk factor–disease association.
Method choice depends on whether the objective is to assess if an observed association can be explained by bias, or to provide a ‘corrected’ estimate for the primary analysis. Accurate estimation of the degree of misclassification is important for the latter; otherwise further bias may be introduced.

September 12, 2014 doi: 10.1177/0962280214523192 open full text
Beyond the treatment effect: Evaluating the effects of patient preferences in randomised trials.
Walter, S., Turner, R., Macaskill, P., McCaffery, K., Irwig, L.
Statistical Methods in Medical Research: An International Review Journal. September 11, 2014

The treatments under comparison in a randomised trial should ideally have equal value and acceptability – a position of equipoise – to study participants. However, it is unlikely that true equipoise exists in practice, because at least some participants may have preferences for one treatment or the other, for a variety of reasons. These preferences may be related to study outcomes, and hence affect the estimation of the treatment effect. Furthermore, the effects of preferences can sometimes be substantial, and may even be larger than the direct effect of treatment. Preference effects are of interest in their own right, but they cannot be assessed in the standard parallel group design for a randomised trial. In this paper, we describe a model to represent the impact of preferences on trial outcomes, in addition to the usual treatment effect. In particular, we describe how outcomes might differ between participants who would choose one treatment or the other, if they were free to do so. Additionally, we investigate the difference in outcomes depending on whether or not a participant receives his or her preferred treatment, which we characterise through a so-called preference effect. We then discuss several study designs that have been proposed to measure and exploit data on preferences, and which constitute alternatives to the conventional parallel group design. Based on the model framework, we determine which of the various preference effects can or cannot be estimated with each design. We also illustrate these ideas with some examples of preference designs from the literature.

September 11, 2014 doi: 10.1177/0962280214550516 open full text
Joint modelling compared with two stage methods for analysing longitudinal data and prospective outcomes: A simulation study of childhood growth and BP.
Sayers, A., Heron, J., Smith, A., Macdonald-Wallis, C., Gilthorpe, M., Steele, F., Tilling, K.
Statistical Methods in Medical Research: An International Review Journal. September 11, 2014

There is a growing debate with regards to the appropriate methods of analysis of growth trajectories and their association with prospective dependent outcomes. Using the example of childhood growth and adult BP, we conducted an extensive simulation study to explore four two-stage and two joint modelling methods, and compared their bias and coverage in estimation of the (unconditional) association between birth length and later BP, and the association between growth rate and later BP (conditional on birth length). We show that the two-stage method of using multilevel models to estimate growth parameters and relating these to outcome gives unbiased estimates of the conditional associations between growth and outcome. Using simulations, we demonstrate that the simple methods resulted in bias in the presence of measurement error, as did the two-stage multilevel method when looking at the total (unconditional) association of birth length with outcome. The two joint modelling methods gave unbiased results, but using the re-inflated residuals led to undercoverage of the confidence intervals. We conclude that either joint modelling or the simpler two-stage multilevel approach can be used to estimate conditional associations between growth and later outcomes, but that only joint modelling is unbiased with nominal coverage for unconditional associations.

September 11, 2014 doi: 10.1177/0962280214548822 open full text
Weibull regression with Bayesian variable selection to identify prognostic tumour markers of breast cancer survival.
Newcombe, P., Raza Ali, H., Blows, F., Provenzano, E., Pharoah, P., Caldas, C., Richardson, S.
Statistical Methods in Medical Research: An International Review Journal. September 04, 2014

As data-rich medical datasets are becoming routinely collected, there is a growing demand for regression methodology that facilitates variable selection over a large number of predictors. Bayesian variable selection algorithms offer an attractive solution, whereby a sparsity inducing prior allows inclusion of sets of predictors simultaneously, leading to adjusted effect estimates and inference of which covariates are most important. We present a new implementation of Bayesian variable selection, based on a Reversible Jump MCMC algorithm, for survival analysis under the Weibull regression model. A realistic simulation study is presented comparing against an alternative LASSO-based variable selection strategy in datasets of up to 20,000 covariates. Across half the scenarios, our new method achieved identical sensitivity and specificity to the LASSO strategy, and a marginal improvement otherwise. Runtimes were comparable for both approaches, taking approximately a day for 20,000 covariates. Subsequently, we present a real data application in which 119 protein-based markers are explored for association with breast cancer survival in a case cohort of 2287 patients with oestrogen receptor-positive disease. Evidence was found for three independent prognostic tumour markers of survival, one of which is novel. Our new approach demonstrated the best specificity.

September 04, 2014 doi: 10.1177/0962280214548748 open full text
Zero-inflated Poisson model based likelihood ratio test for drug safety signal detection.
Huang, L., Zheng, D., Zalkikar, J., Tiwari, R.
Statistical Methods in Medical Research: An International Review Journal. September 03, 2014

In recent decades, numerous methods have been developed for data mining of large drug safety databases, such as the Food and Drug Administration’s (FDA’s) Adverse Event Reporting System, where data matrices are formed with drugs as columns and adverse events as rows. Often, a large number of cells in these data matrices have zero cell counts. Some of them are "true zeros", indicating that the drug-adverse event pairs cannot occur; these are distinguished from the other zero counts, which are modeled zero counts and simply indicate that the drug-adverse event pairs have not occurred yet or have not been reported yet. In this paper, a zero-inflated Poisson model based likelihood ratio test method is proposed to identify drug-adverse event pairs that have disproportionately high reporting rates, which are also called signals. The maximum likelihood estimates of the model parameters of the zero-inflated Poisson model based likelihood ratio test are obtained using the expectation-maximization algorithm. The zero-inflated Poisson model based likelihood ratio test is also modified to handle stratified analyses for binary and categorical covariates (e.g. gender and age) in the data. The proposed zero-inflated Poisson model based likelihood ratio test method is shown to asymptotically control the type I error and false discovery rate, and its finite sample performance for signal detection is evaluated through a simulation study. The simulation results show that the zero-inflated Poisson model based likelihood ratio test method performs similarly to the Poisson model based likelihood ratio test method when the estimated percentage of true zeros in the database is small. Both the zero-inflated Poisson model based likelihood ratio test and likelihood ratio test methods are applied to six selected drugs, from the 2006 to 2011 Adverse Event Reporting System database, with varying percentages of observed zero-count cells.
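
As a rough illustration of the expectation-maximization step mentioned in this abstract, the sketch below fits a plain zero-inflated Poisson distribution to a single sample of counts. It is not the authors' likelihood ratio test (which works on a drug by adverse-event table with expected counts); the function name `fit_zip_em`, the starting values and the toy data are assumptions made for this example only.

```python
import math


def fit_zip_em(counts, tol=1e-12, max_iter=10000):
    """Fit a zero-inflated Poisson by EM and return (pi, lam).

    pi  = probability that an observation is a structural ("true") zero
    lam = Poisson mean of the remaining observations
    """
    n = len(counts)
    n_zero = sum(1 for y in counts if y == 0)
    total = sum(counts)
    if total == 0:
        raise ValueError("all counts are zero; lam is not identifiable")

    pi, lam = 0.5, total / n  # arbitrary starting values (assumption)
    for _ in range(max_iter):
        # E-step: posterior probability that an observed zero is a true zero.
        z = pi / (pi + (1.0 - pi) * math.exp(-lam))
        expected_true_zeros = n_zero * z

        # M-step: closed-form updates given the expected true-zero count.
        new_pi = expected_true_zeros / n
        new_lam = total / (n - expected_true_zeros)

        converged = abs(new_pi - pi) < tol and abs(new_lam - lam) < tol
        pi, lam = new_pi, new_lam
        if converged:
            break
    return pi, lam


# Toy sample with more zeros than a Poisson with mean 1 would produce.
toy_counts = [0] * 50 + [1] * 10 + [2] * 15 + [3] * 10 + [4] * 5
pi_hat, lam_hat = fit_zip_em(toy_counts)
print(f"estimated true-zero fraction: {pi_hat:.3f}, Poisson mean: {lam_hat:.3f}")
```

At convergence the fitted model reproduces both the sample mean and the observed fraction of zeros, which is a convenient sanity check on the two update formulas.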

September 03, 2014 doi: 10.1177/0962280214549590 open full text
An alternative classification to mixture modeling for longitudinal counts or binary measures.
Subtil, F., Boussari, O., Bastard, M., Etard, J.-F., Ecochard, R., Genolini, C.
Statistical Methods in Medical Research: An International Review Journal. September 01, 2014

Classifying patients according to longitudinal measures, or trajectory classification, has become frequent in clinical research. The k-means algorithm is increasingly used for this task in case of continuous variables with standard deviations that do not depend on the mean. One feature of count and binary data modeled by Poisson or logistic regression is that the variance depends on the mean; hence, the within-group variability changes from one group to another depending on the mean trajectory level. Mixture modeling could be used here for classification though its main purpose is to model the data. The results obtained may change according to the main objective. This article presents an extension of the k-means algorithm that takes into account the features of count and binary data by using the deviance as distance metric. This approach is justified by its analogy with the classification likelihood. Two applications are presented with binary and count data to show the differences between the classifications obtained with the usual Euclidean distance versus the deviance distance.
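
To make the idea of using the deviance as a distance concrete, here is a minimal sketch of a k-means loop for count trajectories in which the Euclidean distance is swapped for the Poisson deviance between a trajectory and a cluster mean. This is an illustration under stated assumptions, not the authors' implementation: the function names, the deterministic initialisation (trajectories spread evenly by total count) and the small floor `eps` on cluster means are choices made for this example.

```python
import math


def poisson_deviance(y, mu, eps=1e-6):
    """Poisson deviance between a count trajectory y and a mean trajectory mu."""
    total = 0.0
    for yi, mi in zip(y, mu):
        mi = max(mi, eps)  # guard against log(0) when a cluster mean is zero
        if yi > 0:
            total += yi * math.log(yi / mi)
        total -= yi - mi
    return 2.0 * total


def deviance_kmeans(trajectories, k, max_iter=100):
    """Cluster count trajectories with k-means using the Poisson deviance."""
    n = len(trajectories)
    # Deterministic start (assumption): spread centres over the total counts.
    order = sorted(range(n), key=lambda i: sum(trajectories[i]))
    centers = [
        [float(v) for v in trajectories[order[round(j * (n - 1) / max(k - 1, 1))]]]
        for j in range(k)
    ]
    labels = [None] * n
    for _ in range(max_iter):
        # Assignment step: nearest centre in deviance, not Euclidean distance.
        new_labels = [
            min(range(k), key=lambda c: poisson_deviance(y, centers[c]))
            for y in trajectories
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: pointwise mean trajectory of each non-empty cluster.
        for c in range(k):
            members = [trajectories[i] for i in range(n) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


toy = [[1, 0, 2], [0, 1, 1], [2, 1, 0], [10, 12, 9], [11, 8, 13], [9, 10, 10]]
labels, centers = deviance_kmeans(toy, k=2)
print(labels)
```

For binary trajectories the same loop could be reused with the binomial deviance in place of `poisson_deviance`.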

September 01, 2014 doi: 10.1177/0962280214549040 open full text
Accelerated longitudinal designs: An overview of modelling, power, costs and handling missing data.
Galbraith, S., Bowden, J., Mander, A.
Statistical Methods in Medical Research: An International Review Journal. August 27, 2014

Longitudinal studies are often used to investigate age-related developmental change. Whereas a single cohort design takes a group of individuals at the same initial age and follows them over time, an accelerated longitudinal design takes multiple single cohorts, each one starting at a different age. The main advantage of an accelerated longitudinal design is its ability to span the age range of interest in a shorter period of time than would be possible with a single cohort longitudinal design. This paper considers design issues for accelerated longitudinal studies. A linear mixed effect model is considered to describe the responses over age with random effects for intercept and slope parameters. Random and fixed cohort effects are used to cope with the potential bias accelerated longitudinal designs have due to multiple cohorts. The impact of other factors such as costs and the impact of dropouts on the power of testing or the precision of estimating parameters are examined. As duration-related costs increase relative to recruitment costs the best designs shift towards shorter duration and eventually cross-sectional design being best. For designs with the same duration but differing interval between measurements, we found there was a cutoff point for measurement costs relative to recruitment costs relating to frequency of measurements. Under our model of 30% dropout there was a maximum power loss of 7%.

August 27, 2014 doi: 10.1177/0962280214547150 open full text
Causal effect estimation strategies in a longitudinal study with complex time-varying confounders: A tutorial.
Mertens, B. J., Datta, S., Brand, R., Peul, W.
Statistical Methods in Medical Research: An International Review Journal. August 20, 2014

The Dutch Sciatica Trial represents a longitudinal study with complex time-varying confounders as patients with poorer health conditions (e.g. more severe pain) are more likely to opt for surgery, which, in turn, may affect future outcomes (pain severity). A straightforward classical as-treated comparison at the end point would lead to biased estimation of the surgery effect. We present several strategies of causal treatment effect estimation that might be applicable for analyzing such data. These include an inverse probability of treatment weighted regression analysis, a marginal weighted analysis, an unweighted regression analysis, and several propensity score-based approaches. In addition, we demonstrate how to evaluate these approaches in a thorough simulation study where we generate various realistic complex confounding patterns akin to the sciatica study.

August 20, 2014 doi: 10.1177/0962280214545529 open full text
Sample size determinations for group-based randomized clinical trials with different levels of data hierarchy between experimental and control arms.
Heo, M., Litwin, A. H., Blackstock, O., Kim, N., Arnsten, J. H.
Statistical Methods in Medical Research: An International Review Journal. August 14, 2014

We derived sample size formulae for detecting main effects in group-based randomized clinical trials with different levels of data hierarchy between experimental and control arms. Such designs are necessary when experimental interventions need to be administered to groups of subjects whereas control conditions need to be administered to individual subjects. This type of trial, often referred to as a partially nested or partially clustered design, has been implemented for management of chronic diseases such as diabetes and is beginning to emerge more commonly in wider clinical settings. Depending on the research setting, the level of hierarchy of data structure for the experimental arm can be three or two, whereas that for the control arm is two or one. Such different levels of data hierarchy assume correlation structures of outcomes that are different between arms, regardless of whether research settings require two or three level data structure for the experimental arm. Therefore, the different correlations should be taken into account for statistical modeling and for sample size determinations. To this end, we considered mixed-effects linear models with different correlation structures between experimental and control arms to theoretically derive and empirically validate the sample size formulae with simulation studies.

August 14, 2014 doi: 10.1177/0962280214547381 open full text
Frailty modeling for clustered competing risks data with missing cause of failure.
Lee, M., Ha, I. D., Lee, Y.
Statistical Methods in Medical Research: An International Review Journal. August 14, 2014

Competing risks data often occur within a center in multi-center clinical trials where the event times within a center may be correlated due to unobserved factors across individuals. In this paper, we consider the cause-specific proportional hazards model with a shared frailty to model the association between the event times within a center in the framework of competing risks. We use a hierarchical likelihood approach, which does not require any intractable integration over the frailty terms. In a clinical trial, cause of death information may not be observed for some patients. In such a case, analyses through exclusion of cases with missing cause of death may lead to biased inferences. We propose a hierarchical likelihood approach for fitting the cause-specific proportional hazards model with a shared frailty in the presence of missing cause of failure. We use multiple imputation methods to address missing cause of death information under the assumption of missing at random. Simulation studies show that the proposed procedures perform well, even if the imputation model is misspecified. The proposed methods are illustrated with data from EORTC trial 30791 conducted by European Organization for Research and Treatment of Cancer (EORTC).

August 14, 2014 doi: 10.1177/0962280214545639 open full text
Sample size calculations for prevalent cohort designs.
Liu, H., Shen, Y., Ning, J., Qin, J.
Statistical Methods in Medical Research: An International Review Journal. August 04, 2014

The cross-sectional prevalent cohort design has drawn considerable interest in studies of the association between risk factors and time-to-event outcomes. The sampling scheme in such a design gives rise to length-biased data that require a specialized analysis strategy but can improve study efficiency. Power and sample size calculation methods are, however, lacking for studies with a prevalent cohort design, and using the formula developed for traditional survival data may overestimate the sample size. We derive sample size formulas that are appropriate for the design of cross-sectional prevalent cohort studies, under the assumptions of exponentially distributed event time and uniform follow-up for the cross-sectional prevalent cohort design. We perform numerical and simulation studies to compare the sample size requirements for achieving the same power between prevalent cohort and incident cohort designs. We also use a large prospective prevalent cohort study to demonstrate the procedure. Using rigorous designs and proper analysis tools, the prospective prevalent cohort design can be more efficient than the incident cohort design with the same total sample sizes and study durations.

August 04, 2014 doi: 10.1177/0962280214544730 open full text
A "placement of death" approach for studies of treatment effects on ICU length of stay.
Lin, W., Halpern, S. D., Prasad Kerlin, M., Small, D. S.
Statistical Methods in Medical Research: An International Review Journal. August 01, 2014

Length of stay in the intensive care unit (ICU) is a common outcome measure in randomized trials of ICU interventions. Because many patients die in the ICU, it is difficult to disentangle treatment effects on length of stay from effects on mortality; conventional analyses depend on assumptions that are often unstated and hard to interpret or check. We adapt a proposal from Rosenbaum that addresses concerns about selection bias and makes its assumptions explicit. A composite outcome is constructed that equals ICU length of stay if the patient was discharged alive and indicates death otherwise. Given any preference ordering that compares death with possible lengths of stay, we can estimate the intervention’s effects on the composite outcome distribution. Sensitivity analyses can show results for different preference orderings. We discuss methods for constructing approximate confidence intervals for treatment effects on quantiles of the outcome distribution or on proportions of patients with outcomes preferable to various cutoffs. Strengths and weaknesses of possible primary significance tests (including the Wilcoxon–Mann–Whitney rank sum test and a heteroskedasticity-robust variant due to Brunner and Munzel) are reviewed. An illustrative example reanalyzes a randomized trial of an ICU staffing intervention.

August 01, 2014 doi: 10.1177/0962280214545121 open full text
Statistical methods for incomplete data: Some results on model misspecification.
McIsaac, M., Cook, R.
Statistical Methods in Medical Research: An International Review Journal. July 25, 2014

Inverse probability weighted estimating equations and multiple imputation are two of the most studied frameworks for dealing with incomplete data in clinical and epidemiological research. We examine the limiting behaviour of estimators arising from inverse probability weighted estimating equations, augmented inverse probability weighted estimating equations and multiple imputation when the requisite auxiliary models are misspecified. We compute limiting values for settings involving binary responses and covariates and illustrate the effects of model misspecification using simulations based on data from a breast cancer clinical trial. We demonstrate that, even when both auxiliary models are misspecified, the asymptotic biases of double-robust augmented inverse probability weighted estimators are often smaller than the asymptotic biases of estimators arising from complete-case analyses, inverse probability weighting or multiple imputation. We further demonstrate that use of inverse probability weighting or multiple imputation with slightly misspecified auxiliary models can actually result in greater asymptotic bias than the use of naïve, complete case analyses. These asymptotic results are shown to be consistent with empirical results from simulation studies.

July 25, 2014 doi: 10.1177/0962280214544251 open full text
A comparison of machine learning methods for classification using simulation with multiple real data examples from mental health studies.
Khondoker, M., Dobson, R., Skirrow, C., Simmons, A., Stahl, D.
Statistical Methods in Medical Research: An International Review Journal. July 25, 2014

Background
Recent literature on the comparison of machine learning methods has raised questions about the neutrality, unbiasedness and utility of many comparative studies. Reporting of results on favourable datasets and sampling error in the estimated performance measures based on single samples are thought to be the major sources of bias in such comparisons. Better performance in one or a few instances does not necessarily imply so on an average or on a population level and simulation studies may be a better alternative for objectively comparing the performances of machine learning algorithms.
Methods
We compare the classification performance of a number of important and widely used machine learning algorithms, namely the Random Forests (RF), Support Vector Machines (SVM), Linear Discriminant Analysis (LDA) and k-Nearest Neighbour (kNN). Using massively parallel processing on high-performance supercomputers, we compare the generalisation errors at various combinations of levels of several factors: number of features, training sample size, biological variation, experimental variation, effect size, replication and correlation between features.
Results
For a smaller number of correlated features (number of features not exceeding approximately half the sample size), LDA was found to be the method of choice in terms of average generalisation errors as well as stability (precision) of error estimates. SVM (with RBF kernel) outperforms LDA as well as RF and kNN by a clear margin as the feature set gets larger, provided the sample size is not too small (at least 20). The performance of kNN also improves as the number of features grows and outplays that of LDA and RF unless the data variability is too high and/or effect sizes are too small. RF was found to outperform only kNN in some instances where the data are more variable and have smaller effect sizes, in which cases it also provides more stable error estimates than kNN and LDA. Applications to a number of real datasets supported the findings from the simulation study.

July 25, 2014 doi: 10.1177/0962280213502437 open full text
Statistical analysis with dilatation for development process of human fetuses.
Naito, K., Notsu, A., Udagawa, J., Otani, H.
Statistical Methods in Medical Research: An International Review Journal. July 22, 2014

This paper is concerned with the development process of human fetuses. Though the development process of human fetuses still includes many unknown issues, it is known that a certain harmonious relationship between the organs can be observed. This knowledge is based on our intuition, but we have no theory which clarifies these harmonized developments. The paper aims to give a mathematical understanding of the notion of harmonized development through the use of dilatation, which is a measure of the departure from conformal mapping. The asymptotics for dilatation have been developed using certain efficient models of quasiconformal mapping. The proposed method of dilatation is effectively applied to the human fetus data.

July 22, 2014 doi: 10.1177/0962280214543405 open full text
A better confidence interval for the sensitivity at a fixed level of specificity for diagnostic tests with continuous endpoints.
Shan, G.
Statistical Methods in Medical Research: An International Review Journal. July 21, 2014

For a diagnostic test with continuous measurement, it is often important to construct confidence intervals for the sensitivity at a fixed level of specificity. Bootstrap-based confidence intervals were shown to have good performance as compared to others, and the one by Zhou and Qin (2005) was recommended as the best existing confidence interval, named the BTII interval. We propose two new confidence intervals based on the profile variance method and conduct extensive simulation studies to compare the proposed intervals and the BTII intervals under a wide range of conditions. An example from a medical study on severe head trauma is used to illustrate application of the new intervals. The new proposed intervals generally have better performance than the BTII interval.

July 21, 2014 doi: 10.1177/0962280214544313 open full text
Hierarchical mixture models for longitudinal immunologic data with heterogeneity, non-normality, and missingness.
Huang, Y., Chen, J., Yin, P.
Statistical Methods in Medical Research: An International Review Journal. July 17, 2014

It is common practice to analyze longitudinal data, which frequently arise in medical studies, using various mixed-effects models in the literature. However, the following issues may stand out in longitudinal data analysis: (i) In clinical practice, the profile of each subject’s response from a longitudinal study may follow a "broken stick" like trajectory, indicating multiple phases of increase, decline and/or stability in response. Such multiple phases (with changepoints) may be an important indicator to help quantify treatment effect and improve management of patient care. Estimating changepoints with the various mixed-effects models is challenging due to the complicated structures of the model formulations; (ii) an assumption of a homogeneous population for models may be unrealistic, obscuring important features of between-subject and within-subject variations; (iii) the normality assumption for model errors may not always give robust and reliable results, in particular, if the data exhibit non-normality; and (iv) the response may be missing and the missingness may be non-ignorable. In the literature, there has been considerable interest in accommodating heterogeneity, non-normality or missingness in such models. However, there has been relatively little work concerning all of these features simultaneously. There is a need to fill this gap as longitudinal data do often have these characteristics. In this article, our objectives are to study the simultaneous impact of these data features by developing a Bayesian mixture modeling approach-based Finite Mixture of Changepoint (piecewise) Mixed-Effects (FMCME) models with skew distributions, allowing estimates of both model parameters and class membership probabilities at population and individual levels.
Simulation studies are conducted to assess the performance of the proposed method, and an AIDS clinical data example is analyzed to demonstrate the proposed methodologies and to compare modeling results of potential mixture models under different scenarios.

July 17, 2014 doi: 10.1177/0962280214544207 open full text
Double propensity-score adjustment: A solution to design bias or bias due to incomplete matching.
Austin, P. C.
Statistical Methods in Medical Research: An International Review Journal. July 17, 2014

Propensity-score matching is frequently used to reduce the effects of confounding when using observational data to estimate the effects of treatments. Matching allows one to estimate the average effect of treatment in the treated. Rosenbaum and Rubin coined the term "bias due to incomplete matching" to describe the bias that can occur when some treated subjects are excluded from the matched sample because no appropriate control subject was available. The presence of incomplete matching raises important questions around the generalizability of estimated treatment effects to the entire population of treated subjects. We describe an analytic solution to address the bias due to incomplete matching. Our method is based on using optimal or nearest neighbor matching, rather than caliper matching (which frequently results in the exclusion of some treated subjects). Within the sample matched on the propensity score, covariate adjustment using the propensity score is then employed to impute missing potential outcomes under lack of treatment for each treated subject. Using Monte Carlo simulations, we found that the proposed method resulted in estimates of treatment effect that were essentially unbiased. This method resulted in decreased bias compared to caliper matching alone and compared to either optimal matching or nearest neighbor matching alone. Caliper matching alone resulted in design bias or bias due to incomplete matching, while optimal matching or nearest neighbor matching alone resulted in bias due to residual confounding. The proposed method also tended to result in estimates with decreased mean squared error compared to when caliper matching was used.

July 17, 2014 doi: 10.1177/0962280214543508 open full text
A comparison of marginal odds ratio estimators.
Loux, T. M., Drake, C., Smith-Gagen, J.
Statistical Methods in Medical Research: An International Review Journal. July 08, 2014

Uses of the propensity score to obtain estimates of causal effect have been investigated thoroughly under assumptions of linearity and additivity of exposure effect. When the outcome variable is binary, relationships such as collapsibility, valid for the linear model, do not always hold. This article examines uses of the propensity score when both exposure and outcome are binary variables and the parameter of interest is the marginal odds ratio. We review stratification and matching by the propensity score when calculating the Mantel–Haenszel estimator and show that it is consistent for neither the marginal nor conditional odds ratio. We also investigate a marginal odds ratio estimator based on doubly robust estimators and summarize its performance relative to other recently proposed estimators under various conditions, including low exposure prevalence and model misspecification. Finally, we apply all estimators to a case study estimating the effect of Medicare plan type on the quality of care received by African-American breast cancer patients.

July 08, 2014 doi: 10.1177/0962280214541995 open full text
Does McNemar's test compare the sensitivities and specificities of two diagnostic tests?
Kim, S., Lee, W.
Statistical Methods in Medical Research: An International Review Journal. July 04, 2014

McNemar’s test is often used in practice to compare the sensitivities and specificities for the evaluation of two diagnostic tests. For correct evaluation of accuracy, an intuitive recommendation is to test the diseased and the non-diseased groups separately so that the sensitivities can be compared among the diseased, and specificities can be compared among the healthy group of people. This paper provides a rigorous theoretical framework for this argument and studies the validity of McNemar’s test regardless of the conditional independence assumption. We derive McNemar’s test statistic under the null hypothesis considering both assumptions of conditional independence and conditional dependence. We then perform power analyses to show how the result is affected by the amount of the conditional dependence under the alternative hypothesis.
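
The recommendation to test the diseased and non-diseased groups separately can be sketched with the textbook (uncorrected) McNemar statistic, computed from the discordant pairs within each group. This is a minimal illustration, not the derivation in the paper; the function name `mcnemar_test` and all counts below are hypothetical.

```python
import math


def mcnemar_test(b, c):
    """Uncorrected McNemar test from the two discordant-pair counts.

    b = subjects positive on test 1 and negative on test 2
    c = subjects negative on test 1 and positive on test 2
    Returns (chi-square statistic with 1 df, asymptotic p-value).
    """
    if b + c == 0:
        return 0.0, 1.0  # no discordant pairs: no evidence of a difference
    stat = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square with 1 df: P(Z^2 > stat).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical discordant counts, analysed separately by true disease status.
stat_sens, p_sens = mcnemar_test(b=15, c=5)   # diseased: compares sensitivities
stat_spec, p_spec = mcnemar_test(b=7, c=9)    # non-diseased: compares specificities
print(f"sensitivity comparison: chi2={stat_sens:.2f}, p={p_sens:.4f}")
print(f"specificity comparison: chi2={stat_spec:.2f}, p={p_spec:.4f}")
```

Only the discordant pairs enter the statistic, so pooling diseased and non-diseased subjects into one table would mix differences in sensitivity with differences in specificity.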

July 04, 2014 doi: 10.1177/0962280214541852 open full text
Treatment selection in a randomized clinical trial via covariate-specific treatment effect curves.
Ma, Y., Zhou, X.-H.
Statistical Methods in Medical Research: An International Review Journal. July 04, 2014

For time-to-event data in a randomized clinical trial, we proposed two new methods for selecting an optimal treatment for a patient based on the covariate-specific treatment effect curve, which is used to represent the clinical utility of a predictive biomarker. To select an optimal treatment for a patient with a specific biomarker value, we proposed pointwise confidence intervals for each covariate-specific treatment effect curve and the difference between covariate-specific treatment effect curves of two treatments. Furthermore, to select an optimal treatment for a future biomarker-defined subpopulation of patients, we proposed confidence bands for each covariate-specific treatment effect curve and the difference between each pair of covariate-specific treatment effect curves over a fixed interval of biomarker values. We constructed the confidence bands based on a resampling technique. We also conducted simulation studies to evaluate finite-sample properties of the proposed estimation methods. Finally, we illustrated the application of the proposed method in a real-world data set.

July 04, 2014 doi: 10.1177/0962280214541724 open full text
Receiver operating characteristic curve generalization for non-monotone relationships.
Martinez-Camblor, P., Corral, N., Rey, C., Pascual, J., Cernuda-Morollon, E.
Statistical Methods in Medical Research: An International Review Journal. July 01, 2014

The receiver operating characteristic curve is a popular graphical method frequently used in order to study the diagnostic capacity of continuous markers. It represents in a plot true-positive rates against the false-positive ones. Both the practical and theoretical aspects of the receiver operating characteristic curve have been extensively studied. Conventionally, it is assumed that the considered marker has a monotone relationship with the studied characteristic; i.e., the upper (lower) values of the (bio)marker are associated with a higher probability of a positive result. However, there exist real situations where both the lower and the upper values of the marker are associated with higher probability of a positive result. We propose a receiver operating characteristic curve generalization useful in this context. All pairs of possible cut-off points, one for the lower and another one for the upper marker values, are taken into account and the best of them are selected. The natural empirical estimator for the generalized curve is considered and its uniform consistency and asymptotic distribution are derived. Finally, two real-world applications are studied.

July 01, 2014 doi: 10.1177/0962280214541095 open full text
Linear spline multilevel models for summarising childhood growth trajectories: A guide to their application using examples from five birth cohorts.
Howe, L. D., Tilling, K., Matijasevich, A., Petherick, E. S., Santos, A. C., Fairley, L., Wright, J., Santos, I. S., Barros, A. J. D., Martin, R. M., Kramer, M. S., Bogdanovich, N., Matush, L., Barros, H., Lawlor, D. A.
Statistical Methods in Medical Research: An International Review Journal. June 26, 2014

Childhood growth is of interest in medical research concerned with determinants and consequences of variation from healthy growth and development. Linear spline multilevel modelling is a useful approach for deriving individual summary measures of growth, which overcomes several data issues (co-linearity of repeat measures, the requirement for all individuals to be measured at the same ages and bias due to missing data). Here, we outline the application of this methodology to model individual trajectories of length/height and weight, drawing on examples from five cohorts from different generations and different geographical regions with varying levels of economic development. We describe the unique features of the data within each cohort that have implications for the application of linear spline multilevel models, for example, differences in the density and inter-individual variation in measurement occasions, and multiple sources of measurement with varying measurement error. After providing example Stata syntax and a suggested workflow for the implementation of linear spline multilevel models, we conclude with a discussion of the advantages and disadvantages of the linear spline approach compared with other growth modelling methods such as fractional polynomials, more complex spline functions and other non-linear models.

June 26, 2014 doi: 10.1177/0962280213503925 open full text
Surrogacy assessment using principal stratification and a Gaussian copula model.
Conlon, A., Taylor, J., Elliott, M.
Statistical Methods in Medical Research: An International Review Journal. June 19, 2014

In clinical trials, a surrogate outcome (S) can be measured before the outcome of interest (T) and may provide early information regarding the treatment (Z) effect on T. Many methods of surrogacy validation rely on models for the conditional distribution of T given Z and S. However, S is a post-randomization variable, and unobserved, simultaneous predictors of S and T may exist, resulting in a non-causal interpretation. Frangakis and Rubin developed the concept of principal surrogacy, stratifying on the joint distribution of the surrogate marker under treatment and control to assess the association between the causal effects of treatment on the marker and the causal effects of treatment on the clinical outcome. Working within the principal surrogacy framework, we address the scenario of an ordinal categorical variable as a surrogate for a censored failure time true endpoint. A Gaussian copula model is used to model the joint distribution of the potential outcomes of T, given the potential outcomes of S. Because the proposed model cannot be fully identified from the data, we use a Bayesian estimation approach with prior distributions consistent with reasonable assumptions in the surrogacy assessment setting. The method is applied to data from a colorectal cancer clinical trial, previously analyzed by Burzykowski et al.

June 19, 2014 doi: 10.1177/0962280214539655 open full text
Tests for equivalence of two survival functions: Alternative to the tests under proportional hazards.
Martinez, E. E., Sinha, D., Wang, W., Lipsitz, S. R., Chappell, R. J.
Statistical Methods in Medical Research: An International Review Journal. June 12, 2014

For either the equivalence trial or the non-inferiority trial with survivor outcomes from two treatment groups, the most popular testing procedure is the extension (e.g., Wellek, A log-rank test for equivalence of two survivor functions, Biometrics, 1993; 49: 877–881) of the log-rank based test under the proportional hazards model. We show that the actual type I error rate for the popular procedure of Wellek is higher than the intended nominal rate when survival responses from two treatment arms satisfy the proportional odds survival model. When the true model is the proportional odds survival model, we show that the hypothesis of equivalence of two survival functions can be formulated as a statistical hypothesis involving only the survival odds ratio parameter. We further show that our new equivalence test, formulation, and related procedures are applicable even in the presence of additional covariates beyond treatment arms, and the associated equivalence test procedures have correct type I error rates under the proportional hazards model as well as the proportional odds survival model. These results show that use of our test will be a safer statistical practice for equivalence trials of survival responses than the commonly used log-rank based tests.

June 12, 2014 doi: 10.1177/0962280214539282 open full text
The potential for increased power from combining P-values testing the same hypothesis.
Ganju, J., (Julie) Ma, G.
Statistical Methods in Medical Research: An International Review Journal. June 11, 2014

The conventional approach to hypothesis testing for formal inference is to prespecify a single test statistic thought to be optimal. However, we usually have more than one test statistic in mind for testing the null hypothesis of no treatment effect but we do not know which one is the most powerful. Rather than relying on a single p-value, combining p-values from prespecified multiple test statistics can be used for inference. Combining functions include Fisher’s combination test and the minimum p-value. Using randomization-based tests, the increase in power can be remarkable when compared with a single test and Simes’s method. The versatility of the method is that it also applies when the number of covariates exceeds the number of observations. The increase in power is large enough to prefer combined p-values over a single p-value. The limitation is that the method does not provide an unbiased estimator of the treatment effect and does not apply to situations when the model includes treatment by covariate interaction.
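
The idea of combining p-values from several prespecified statistics under re-randomization can be sketched as follows. This is a minimal illustration and not the authors' implementation: the two statistics (difference in means and difference in medians), the function names, and the number of re-randomizations are all assumptions made for the example.

```python
import bisect
import math
import random
import statistics


def mean_diff(treated, control):
    return statistics.fmean(treated) - statistics.fmean(control)


def median_diff(treated, control):
    return statistics.median(treated) - statistics.median(control)


def combined_randomization_test(treated, control, stat_fns, n_perm=2000, seed=0):
    """Combine several two-sided randomization tests of no treatment effect."""
    rng = random.Random(seed)
    pooled = list(treated) + list(control)
    n_t = len(treated)

    # Row 0 holds the observed statistics; later rows are re-randomizations.
    rows = [[abs(f(treated, control)) for f in stat_fns]]
    for _ in range(n_perm):
        rng.shuffle(pooled)
        rows.append([abs(f(pooled[:n_t], pooled[n_t:])) for f in stat_fns])
    total = len(rows)

    # Turn every statistic in every row into a randomization p-value:
    # the share of rows whose statistic is at least as extreme.
    columns = [sorted(row[j] for row in rows) for j in range(len(stat_fns))]
    p_rows = [
        [(total - bisect.bisect_left(columns[j], row[j])) / total
         for j in range(len(stat_fns))]
        for row in rows
    ]

    # Combining functions, evaluated for the observed row and for every
    # re-randomized row so that dependence between statistics is respected.
    fisher = [-2.0 * sum(math.log(p) for p in ps) for ps in p_rows]
    min_p = [min(ps) for ps in p_rows]

    return {
        "single_p": p_rows[0],
        "fisher_p": sum(f >= fisher[0] for f in fisher) / total,
        "min_p": sum(m <= min_p[0] for m in min_p) / total,
    }
```

Because the combined statistic is referred to its own re-randomization distribution, the sketch does not rely on the chi-square reference distribution that Fisher's method uses for independent p-values.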

June 11, 2014 doi: 10.1177/0962280214538016 open full text
Closed-form fiducial confidence intervals for some functions of independent binomial parameters with comparisons.
Krishnamoorthy, K., Lee, M., Zhang, D.
Statistical Methods in Medical Research: An International Review Journal. June 11, 2014

Approximate closed-form confidence intervals (CIs) for estimating the difference, relative risk, odds ratio, and linear combination of proportions are proposed. These CIs are developed using the fiducial approach and the modified normal-based approximation to the percentiles of a linear combination of independent random variables. These confidence intervals are easy to calculate as the computation requires only the percentiles of beta distributions. The proposed confidence intervals are compared with the popular score confidence intervals with respect to coverage probabilities and expected widths. Comparison studies indicate that the proposed confidence intervals are comparable with the corresponding score confidence intervals, and better in some cases, for all the problems considered. The methods are illustrated using several examples.

June 11, 2014 doi: 10.1177/0962280214537809 open full text
A multiphase non-linear mixed effects model: An application to spirometry after lung transplantation.
Rajeswaran, J., Blackstone, E. H.
Statistical Methods in Medical Research: An International Review Journal. June 11, 2014

In medical sciences, we often encounter longitudinal temporal relationships that are non-linear in nature. The influence of risk factors may also change across longitudinal follow-up. A system of multiphase non-linear mixed effects models is presented to model temporal patterns of longitudinal continuous measurements, with temporal decomposition to identify the phases and risk factors within each phase. Application of this model is illustrated using spirometry data after lung transplantation using readily available statistical software. This application illustrates the usefulness of our flexible model when dealing with complex non-linear patterns and time-varying coefficients.

June 11, 2014 doi: 10.1177/0962280214537255 open full text
Fully non-parametric receiver operating characteristic curve estimation for random-effects meta-analysis.
Martinez-Camblor, P.
Statistical Methods in Medical Research: An International Review Journal. May 28, 2014

Meta-analyses, broadly defined as the quantitative review and synthesis of the results of related but independent comparable studies, allow one to assess the state of the art on a given topic. Since the amount of available literature has grown in almost all fields and, specifically, in biomedical research, their popularity has increased drastically during the last decades. In particular, different methodologies have been developed in order to perform meta-analytic studies of diagnostic tests for both fixed- and random-effects models. From a parametric point of view, these techniques often compute a bivariate estimation for the sensitivity and the specificity by using only one threshold per included study. Frequently, an overall receiver operating characteristic curve based on a bivariate normal distribution is also provided. In this work, the author deals with the problem of estimating an overall receiver operating characteristic curve from a fully non-parametric approach when the data come from a meta-analysis study, i.e. only certain information about the diagnostic capacity is available. Both fixed- and random-effects models are considered. In addition, the proposed methodology allows the use of the information from all cut-off points available (not only one of them) in the selected original studies. The performance of the method is explored through Monte Carlo simulations. The observed results suggest that the proposed estimator is better than the reference one when the reported information is related to a threshold based on the Youden index and when information for two or more points is provided. Real data illustrations are included.

May 28, 2014 doi: 10.1177/0962280214537047 open full text
Meta-analysis of the technical performance of an imaging procedure: Guidelines and statistical methodology.
Huang, E. P., Wang, X.-F., Roy Choudhury, K., McShane, L. M., Gonen, M., Ye, J., Buckler, A. J., Kinahan, P. E., Reeves, A. P., Jackson, E. F., Guimaraes, A. R., Zahlmann, G., Meta-Analysis Working Group.
Statistical Methods in Medical Research: An International Review Journal. May 28, 2014

Medical imaging serves many roles in patient care and the drug approval process, including assessing treatment response and guiding treatment decisions. These roles often involve a quantitative imaging biomarker, an objectively measured characteristic of the underlying anatomic structure or biochemical process derived from medical images. Before a quantitative imaging biomarker is accepted for use in such roles, the imaging procedure to acquire it must undergo evaluation of its technical performance, which entails assessment of performance metrics such as repeatability and reproducibility of the quantitative imaging biomarker. Ideally, this evaluation will involve quantitative summaries of results from multiple studies to overcome limitations due to the typically small sample sizes of technical performance studies and/or to include a broader range of clinical settings and patient populations. This paper is a review of meta-analysis procedures for such an evaluation, including identification of suitable studies, statistical methodology to evaluate and summarize the performance metrics, and complete and transparent reporting of the results. This review addresses challenges typical of meta-analyses of technical performance, particularly small study sizes, which often cause violations of assumptions underlying standard meta-analysis techniques. Alternative approaches to address these difficulties are also presented; simulation studies indicate that they outperform standard techniques when some studies are small. The meta-analysis procedures presented are also applied to actual [18F]-fluorodeoxyglucose positron emission tomography (FDG-PET) test–retest repeatability data for illustrative purposes.

May 28, 2014 doi: 10.1177/0962280214537394 open full text
Logistic regression for dichotomized counts.
Preisser, J. S., Das, K., Benecha, H., Stamm, J. W.
Statistical Methods in Medical Research: An International Review Journal. May 26, 2014

Sometimes there is interest in a dichotomized outcome indicating whether a count variable is positive or zero. Under this scenario, the application of ordinary logistic regression may result in efficiency loss, which is quantifiable under an assumed model for the counts. In such situations, a shared-parameter hurdle model is investigated for more efficient estimation of regression parameters relating to overall effects of covariates on the dichotomous outcome, while handling count data with many zeroes. One model part provides a logistic regression containing marginal log odds ratio effects of primary interest, while an ancillary model part describes the mean count of a Poisson or negative binomial process in terms of nuisance regression parameters. Asymptotic efficiency of the logistic model parameter estimators of the two-part models is evaluated with respect to ordinary logistic regression. Simulations are used to assess the properties of the models with respect to power and Type I error, the latter investigated under both misspecified and correctly specified models. The methods are applied to data from a randomized clinical trial of three toothpaste formulations to prevent incident dental caries in a large population of Scottish schoolchildren.

May 26, 2014 doi: 10.1177/0962280214536893 open full text
A hybrid Bayesian hierarchical model combining cohort and case-control studies for meta-analysis of diagnostic tests: Accounting for partial verification bias.
Ma, X., Chen, Y., Cole, S. R., Chu, H.
Statistical Methods in Medical Research: An International Review Journal. May 26, 2014

To account for between-study heterogeneity in meta-analysis of diagnostic accuracy studies, bivariate random effects models have been recommended to jointly model the sensitivities and specificities. As study design and population vary, the definition of disease status or severity could differ across studies. Consequently, sensitivity and specificity may be correlated with disease prevalence. To account for this dependence, a trivariate random effects model has been proposed. However, the proposed approach can only include cohort studies with information estimating study-specific disease prevalence. In addition, some diagnostic accuracy studies only select a subset of samples to be verified by the reference test. It is known that ignoring unverified subjects may lead to partial verification bias in the estimation of prevalence, sensitivities, and specificities in a single study. However, the impact of this bias on a meta-analysis has not been investigated. In this paper, we propose a novel hybrid Bayesian hierarchical model combining cohort and case–control studies and correcting partial verification bias at the same time. We investigate the performance of the proposed methods through a set of simulation studies. Two case studies on assessing the diagnostic accuracy of gadolinium-enhanced magnetic resonance imaging in detecting lymph node metastases and of adrenal fluorine-18 fluorodeoxyglucose positron emission tomography in characterizing adrenal masses are presented.

May 26, 2014 doi: 10.1177/0962280214536703 open full text
Longitudinal data subject to irregular observation: A review of methods with a focus on visit processes, assumptions, and study design.
Pullenayegum, E. M., Lim, L. S.
Statistical Methods in Medical Research: An International Review Journal. May 21, 2014

When data are collected longitudinally, measurement times often vary among patients. This is of particular concern in clinic-based studies, for example retrospective chart reviews. Here, typically no two patients will share the same set of measurement times and moreover, it is likely that the timing of the measurements is associated with disease course; for example, patients may visit more often when unwell. While there are statistical methods that can help overcome the resulting bias, these make assumptions about the nature of the dependence between visit times and outcome processes, and the assumptions differ across methods. The purpose of this paper is to review the methods available with a particular focus on how the assumptions made line up with visit processes encountered in practice. Through this we show that no one method can handle all plausible visit scenarios and suggest that careful analysis of the visit process should inform the choice of analytic method for the outcomes. Moreover, there are some commonly encountered visit scenarios that are not handled well by any method, and we make recommendations with regard to study design that would minimize the chances of these problematic visit scenarios arising.

May 21, 2014 doi: 10.1177/0962280214536537 open full text
Increasing efficiency for estimating treatment-biomarker interactions with historical data.
Boonstra, P. S., Taylor, J. M., Mukherjee, B.
Statistical Methods in Medical Research: An International Review Journal. May 21, 2014

Detecting a treatment–biomarker interaction, which is a task better suited for large sample sizes, in a phase II trial, which has a small sample size, is challenging. In this paper, we investigate how two plausibly available sources of historical data may contain partial information to help estimate the treatment–biomarker interaction parameter in a randomized phase II study. The parameter is not identified in either historical dataset alone; nonetheless, both can provide some information about the parameter and, consequently, increase the precision of its estimate. To illustrate the potential for gains in efficiency and implications for the design of the study, we consider Gaussian outcomes and biomarker data and calculate the asymptotic variance using the expected Fisher information matrix. We quantify the gain in efficiency both through a numerical study and, in a simplified setting, insights derived from an algebraic development of the problem. We find that a non-negligible gain in precision is possible, even if the historical and prospective data do not arise from identical underlying models.

May 21, 2014 doi: 10.1177/0962280214535370 open full text
Semiparametric M-quantile regression for count data.
Dreassi, E., Ranalli, M. G., Salvati, N.
Statistical Methods in Medical Research: An International Review Journal. May 20, 2014

Lung cancer incidence over 2005–2010 for 326 Local Authority Districts in England is investigated by ecological regression. Motivated by the misspecification of a Negative Binomial additive model, a semiparametric Negative Binomial M-quantile regression model is introduced. The additive part relates to those univariate or bivariate smoothing components that are included in the model to capture nonlinearities in the predictor or to account for spatial dependence. All such components are estimated using penalized splines. The results show the capability of the semiparametric Negative Binomial M-quantile regression model to handle data with a strong spatial structure.

May 20, 2014 doi: 10.1177/0962280214536636 open full text
Individualized dynamic prediction of prostate cancer recurrence with and without the initiation of a second treatment: Development and validation.
Sene, M., Taylor, J. M., Dignam, J. J., Jacqmin-Gadda, H., Proust-Lima, C.
Statistical Methods in Medical Research: An International Review Journal. May 20, 2014

With the emergence of rich information on biomarkers after treatments, new types of prognostic tools are being developed: dynamic prognostic tools that can be updated at each new biomarker measurement. Such predictions are of interest in oncology where, after an initial treatment, patients are monitored with repeated biomarker data. However, in such settings, patients may receive second treatments to slow down the progression of the disease. This paper aims to develop and validate dynamic individual predictions that allow the possibility of a new treatment in order to help understand the benefit of initiating new treatments during the monitoring period. The prediction of the event in the next x years is done under two scenarios: (1) the patient immediately initiates a second treatment, (2) the patient does not initiate any treatment in the next x years. Predictions are derived from shared random-effect models. Applied to prostate cancer data, different specifications for the dependence between the prostate-specific antigen repeated measures, the initiation of a second treatment (hormonal therapy), and the risk of clinical recurrence are investigated and compared. The predictive accuracy of the dynamic predictions is evaluated with two measures (Brier score and prognostic cross-entropy) for which approximated cross-validated estimators are proposed.

May 20, 2014 doi: 10.1177/0962280214535763 open full text
Statistical methods for the analysis of clinical trials data containing many zeros: An application in vaccine development.
Callegaro, A., Kassapian, M., Zahaf, T., Tibaldi, F.
Statistical Methods in Medical Research: An International Review Journal. May 19, 2014

In recent years, many vaccines have been developed for the prevention of a variety of diseases. Many of these vaccines, like the one for herpes zoster, are supposed to act in a multilevel way. Ideally, they completely prevent expression of the virus, but failing that they help to reduce the severity of the disease. A simple approach to analyze these data is the so-called burden-of-illness test. The method assigns a score, say W, equal to 0 for the uninfected and a post-infection outcome X > 0 for the infected individuals. One of the limitations of this test is the potentially low power when the vaccine efficacy is close to 0. To overcome this limitation, we propose a Fisher adjusted test where we combine a statistic for infection with a statistic for post-infection outcome adjusted for selection bias. The advantages and disadvantages of different methods proposed in the literature are discussed. We compared the methods via simulations in herpes zoster, HIV, and malaria vaccine trial settings. In addition, we applied these methods to published data on an HIV vaccine. The paper ends with some recommendations and conclusions.
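
The burden-of-illness scoring rule described above (W = 0 for the uninfected, W = X > 0 for the infected) can be sketched in a few lines. The comparison of arms below uses a simple one-sided permutation test on the mean score; that choice, like the function names, is an assumption made for illustration and is neither the published burden-of-illness test statistic nor the Fisher adjusted test proposed in the paper.

```python
import random
import statistics


def boi_scores(infected, severity):
    """Burden-of-illness score: 0 if uninfected, severity X (> 0) if infected."""
    scores = []
    for is_infected, x in zip(infected, severity):
        if is_infected and x <= 0:
            raise ValueError("infected participants need a severity X > 0")
        scores.append(x if is_infected else 0.0)
    return scores


def boi_permutation_test(w_vaccine, w_placebo, n_perm=5000, seed=0):
    """One-sided permutation test that mean burden is lower under vaccine."""
    rng = random.Random(seed)
    observed = statistics.fmean(w_placebo) - statistics.fmean(w_vaccine)
    pooled = list(w_vaccine) + list(w_placebo)
    n_v = len(w_vaccine)
    hits = 1  # the observed allocation counts as one arrangement
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = statistics.fmean(pooled[n_v:]) - statistics.fmean(pooled[:n_v])
        if diff >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)
```

Because most scores are zero when infection is rare, the mean score mixes the effect on infection with the effect on severity, which is one way to see why power can suffer when the vaccine barely reduces infection.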

May 19, 2014 doi: 10.1177/0962280214532911 open full text
Modelling life course blood pressure trajectories using Bayesian adaptive splines.
Muniz-Terrera, G., Bakra, E., Hardy, R., Matthews, F. E., Lunn, D., FALCon collaboration group.
Statistical Methods in Medical Research: An International Review Journal. May 19, 2014

No single study has collected data over individuals’ entire lifespans. To understand changes over the entire life course, it is necessary to combine data from various studies that cover the whole life course. Such combination may be methodologically challenging due to potential differences in study protocols, information available and instruments used to measure the outcome of interest. Motivated by our interest in modelling blood pressure changes over the life course, we propose the use of Bayesian adaptive splines within a hierarchical setting to combine data from several UK-based longitudinal studies where blood pressure measures were taken in different stages of life. Our method allowed us to obtain a realistic estimate of the mean life course trajectory, quantify the variability both within and between studies, and examine overall and study specific effects of relevant risk factors on life course blood pressure changes.

May 19, 2014 doi: 10.1177/0962280214532576 open full text
Survival estimation in two-phase cohort studies with application to biomarkers evaluation.
Rebora, P., Valsecchi, M. G.
Statistical Methods in Medical Research: An International Review Journal. May 19, 2014

Two-phase studies are attractive for their economy and efficiency in research settings where large cohorts are available for investigating the prognostic and predictive role of novel genetic and biological factors. In this type of study, information on novel factors is collected only in a convenient subcohort (phase II) drawn from the cohort (phase I) according to a given (optimal) sampling strategy. Estimation of survival in the subcohort needs to account for the design. The Kaplan–Meier method, based on counts of events and of subjects at risk in time, must be applied accounting, with suitable weights, for the sampling probabilities of the subjects in phase II, in order to recover the representativeness of the subcohort for the entire cohort. The authors derived a proper variance estimator of survival by linearization. The proposed method is applied in the context of a two-phase study on childhood acute lymphoblastic leukemia, which was planned in order to evaluate the role of genetic polymorphisms on treatment failure due to relapse. The method has shown satisfactory performance through simulations under different scenarios, including the case–control setting, and proved to be useful for describing results in the clinical example.
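
A minimal sketch of a Kaplan–Meier estimate that weights each phase II subject by the inverse of its sampling probability is given below. The function name and input layout are assumptions made for illustration, and the linearization-based variance estimator derived in the paper is not reproduced.

```python
def weighted_kaplan_meier(times, events, sampling_probs):
    """Return [(event_time, survival)] with weights 1 / sampling probability.

    times          -- follow-up time of each phase II subject
    events         -- truthy if the subject failed at that time, falsy if censored
    sampling_probs -- probability that the subject was selected into phase II
    """
    weights = [1.0 / p for p in sampling_probs]
    event_times = sorted({t for t, e in zip(times, events) if e})
    survival = 1.0
    curve = []
    for t in event_times:
        # Weighted number still at risk just before t, and weighted failures at t.
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= t)
        failed = sum(w for ti, e, w in zip(times, events, weights) if e and ti == t)
        survival *= 1.0 - failed / at_risk
        curve.append((t, survival))
    return curve
```

With all sampling probabilities equal to 1 the function reduces to the ordinary Kaplan–Meier estimator, which gives a quick sanity check.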

May 19, 2014 doi: 10.1177/0962280214534411 open full text
Choice of agreement indices for assessing and improving measurement reproducibility in a core laboratory setting.
Barnhart, H. X., Yow, E., Crowley, A. L., Daubert, M. A., Rabineau, D., Bigelow, R., Pencina, M., Douglas, P. S.
Statistical Methods in Medical Research: An International Review Journal. May 14, 2014

Clinical core laboratories, such as Echocardiography core laboratories, are increasingly used in clinical studies with imaging outcomes as primary, secondary, or surrogate endpoints. While many factors contribute to the quality of measurements of imaging variables, an essential step in ensuring the value of imaging data includes formal assessment and control of reproducibility via intra-observer and inter-observer reliability. There are many different agreement/reliability indices in the literature. However, different indices may lead to different conclusions and it is not clear which index is the preferred choice as an overall indication of data quality and a tool for providing guidance on improving quality and reliability in a core lab setting. In this paper, we pre-specify the desirable characteristics of an agreement index for assessing and improving reproducibility in a core lab setting; we compare existing agreement indices in terms of these characteristics to choose a preferred index. We conclude that, among the existing indices reviewed, the coverage probability for assessing agreement is the preferred agreement index on the basis of computational simplicity, its ability for rapid identification of discordant measurements to provide guidance for review and retraining, and its consistent evaluation of data quality across multiple reviewers, populations, and continuous/categorical data.
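
The coverage probability index can be illustrated as the proportion of paired readings that differ by no more than a prespecified tolerance. The sketch below makes that assumption explicit; the function name, the tolerance, and the example readings are hypothetical. It also returns the indices of discordant pairs, reflecting the use of the index to flag measurements for review and retraining.

```python
def coverage_probability(first_reads, second_reads, tolerance):
    """Share of paired readings with |difference| <= tolerance, plus the
    positions of the pairs that fall outside the tolerance."""
    if len(first_reads) != len(second_reads) or not first_reads:
        raise ValueError("need two non-empty sequences of equal length")
    discordant = [
        i for i, (a, b) in enumerate(zip(first_reads, second_reads))
        if abs(a - b) > tolerance
    ]
    cp = 1.0 - len(discordant) / len(first_reads)
    return cp, discordant


# Hypothetical repeated readings of the same four studies, tolerance of 5 units.
print(coverage_probability([55, 60, 62, 48], [57, 59, 70, 49], tolerance=5))
```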

May 14, 2014 doi: 10.1177/0962280214534651 open full text
A general framework for the use of logistic regression models in meta-analysis.
Simmonds, M. C., Higgins, J. P.
Statistical Methods in Medical Research: An International Review Journal. May 12, 2014

Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy.

May 12, 2014 doi: 10.1177/0962280214534409 open full text
Combining growth curves when a longitudinal study switches measurement tools.
Oleson, J. J., Cavanaugh, J. E., Tomblin, J. B., Walker, E., Dunn, C.
Statistical Methods in Medical Research: An International Review Journal. May 11, 2014

When longitudinal studies are performed to investigate the growth of traits in children, the measurement tool being used to quantify the trait may need to change as the subjects age throughout the study. Changing the measurement tool at some point in the longitudinal study makes the analysis of that growth challenging, which, in turn, makes it difficult to determine what other factors influence the growth rate. We developed a Bayesian hierarchical modeling framework that relates the growth curves per individual for each of the different measurement tools and allows for covariates to influence the shapes of the curves by borrowing strength across curves. The method is motivated by and demonstrated by speech perception outcome measurements of children who were implanted with cochlear implants. Researchers are interested in assessing the impact of age at implantation and comparing the growth rates of children who are implanted under the age of two versus those implanted between the ages of two and four.

May 11, 2014 doi: 10.1177/0962280214534588 open full text
Diagnosis using clinical/pathological and molecular information.
Irigoien, I., Arenas, C.
Statistical Methods in Medical Research: An International Review Journal. May 11, 2014

In the diagnosis and classification of diseases, multiple outcomes, both molecular and clinical/pathological, are routinely gathered on patients. In recent years, many approaches have been suggested for integrating gene expression (continuous data) with clinical/pathological data (usually categorical and ordinal data). This new area of research integrates both clinical and genomic data in order to improve our knowledge about diseases, and to capture the information which is lost in independent clinical or genomic studies. The related metric scaling distance is a little-known but very valuable distance for integrating clinical/pathological and molecular information. In this article, we present the use of the related metric scaling distance in biomedical research. We describe how this distance works, and we also explain why it may sometimes be preferred. We discuss the choice of the related metric scaling distance and compare it with other proximity measures to include both clinical and genetic information. Furthermore, we comment on the choice of the related metric scaling distance when classical clustering or discriminant analysis based on distances is performed and compare the results with more complex cluster or discriminant procedures specially constructed for integrating clinical and molecular information. The use of the related metric scaling distance is illustrated on simulated data and four real data sets: one heart disease study and three cancer studies. The results show the flexibility and availability of this distance, which gives competitive results.

May 11, 2014 doi: 10.1177/0962280214534410 open full text
Semi-Markov models for interval censored transient cognitive states with back transitions and a competing risk.
Wei, S., Kryscio, R. J.
Statistical Methods in Medical Research: An International Review Journal. May 11, 2014

Continuous-time multi-state stochastic processes are useful for modeling the flow of subjects from intact cognition to dementia with mild cognitive impairment and global impairment as intervening transient cognitive states and death as a competing risk. Each subject's cognition is assessed periodically resulting in interval censoring for the cognitive states while death without dementia is not interval censored. Since back transitions among the transient states are possible, Markov chains are often applied to this type of panel data. In this manuscript, we apply a semi-Markov process in which we assume that the waiting times are Weibull distributed except for transitions from the baseline state, which are exponentially distributed and in which we assume no additional changes in cognition occur between two assessments. We implement a quasi-Monte Carlo (QMC) method to calculate the higher order integration needed for likelihood estimation. We apply our model to a real dataset, the Nun Study, a cohort of 461 participants.

May 11, 2014 doi: 10.1177/0962280214534412 open full text
Gene selection for survival data under dependent censoring: A copula-based approach.
Emura, T., Chen, Y.-H.
Statistical Methods in Medical Research: An International Review Journal. May 11, 2014

Dependent censoring arises in biomedical studies when the survival outcome of interest is censored by competing risks. In survival data with microarray gene expressions, gene selection based on univariate Cox regression analyses has been used extensively in medical research; however, it is only valid under the independent censoring assumption. In this paper, we first consider a copula-based framework to investigate the bias caused by dependent censoring on gene selection. Then, we utilize the copula-based dependence model to develop an alternative gene selection procedure. Simulations show that the proposed procedure adjusts for the effect of dependent censoring and thus outperforms the existing method when dependent censoring is indeed present. The non-small-cell lung cancer data are analyzed to demonstrate the usefulness of our proposal. We implemented the proposed method in the R package "compound.Cox".

May 11, 2014 doi: 10.1177/0962280214533378 open full text
Predicting birth weight with conditionally linear transformation models.
Most, L., Schmid, M., Faschingbauer, F., Hothorn, T.
Statistical Methods in Medical Research: An International Review Journal. May 08, 2014

Low and high birth weight (BW) are important risk factors for neonatal morbidity and mortality. Gynecologists must therefore accurately predict BW before delivery. Most prediction formulas for BW are based on prenatal ultrasound measurements carried out within one week prior to birth. Although successfully used in clinical practice, these formulas focus on point predictions of BW but do not systematically quantify uncertainty of the predictions, i.e. they result in estimates of the conditional mean of BW but do not deliver prediction intervals. To overcome this problem, we introduce conditionally linear transformation models (CLTMs) to predict BW. Instead of focusing only on the conditional mean, CLTMs model the whole conditional distribution function of BW given prenatal ultrasound parameters. Consequently, the CLTM approach delivers both point predictions of BW and fetus-specific prediction intervals. Prediction intervals constitute an easy-to-interpret measure of prediction accuracy and allow identification of fetuses subject to high prediction uncertainty. Using a data set of 8712 deliveries at the Perinatal Centre at the University Clinic Erlangen (Germany), we analyzed variants of CLTMs and compared them to standard linear regression estimation techniques used in the past and to quantile regression approaches. The best-performing CLTM variant was competitive with quantile regression and linear regression approaches in terms of conditional coverage and average length of the prediction intervals. We propose that CLTMs be used because they are able to account for possible heteroscedasticity, kurtosis, and skewness of the distribution of BWs.

May 08, 2014 doi: 10.1177/0962280214532745
Non-randomized response model for sensitive survey with noncompliance.
Wu, Q., Tang, M.-L.
Statistical Methods in Medical Research: An International Review Journal. May 07, 2014

Collecting representative data on sensitive issues has long been problematic and challenging in public health prevalence investigation (e.g. non-suicidal self-injury), medical research (e.g. drug habits), social issue studies (e.g. history of child abuse), and their interdisciplinary studies (e.g. premarital sexual intercourse). Alternative data collection techniques that allow sensitive questions to be studied validly are therefore becoming more important and necessary. As an alternative to the famous Warner randomized response model, the non-randomized response triangular model has recently been developed to encourage participants to provide truthful responses in surveys involving sensitive questions. Unfortunately, both randomized and non-randomized response models could underestimate the proportion of subjects with the sensitive characteristic, as some respondents do not believe that these techniques can protect their anonymity. As a result, some authors hypothesized that lack of trust and noncompliance should be highest among those who have the most to lose and the least to use for the anonymity provided by using these techniques. Some researchers noticed the existence of noncompliance and proposed new models to measure noncompliance in order to get reliable information. However, all proposed methods were based on randomized response models, which require randomizing devices, restrict the survey to face-to-face interviews, and lack reproducibility. Taking noncompliance into consideration, we introduce new non-randomized response techniques in which no covariate is required. Asymptotic properties of the proposed estimates for the sensitive characteristic as well as the noncompliance probabilities are developed. Our proposed techniques are empirically shown to yield accurate estimates for both sensitive and noncompliance probabilities. A real example about premarital sex among university students is used to demonstrate our methodologies.

May 07, 2014 doi: 10.1177/0962280214533022
Receiver operating characteristic curve estimation for time to event with semicompeting risks and interval censoring.
Jacqmin-Gadda, H., Blanche, P., Chary, E., Touraine, C., Dartigues, J.-F.
Statistical Methods in Medical Research: An International Review Journal. May 06, 2014

Semicompeting risks and interval censoring are frequent in medical studies, for instance when a disease may be diagnosed only at times of visit and disease onset is in competition with death. To evaluate the ability of markers to predict disease onset in this context, estimators of discrimination measures must account for these two issues. In recent years, methods for estimating the time-dependent receiver operating characteristic curve and the associated area under the ROC curve have been extended to account for right censored data and competing risks. In this paper, we show how an approximation allows the inverse probability of censoring weighting estimator to be used for semicompeting events with interval censored data. Then, using an illness-death model, we propose two model-based estimators that handle these issues rigorously. The first estimator is fully model based whereas the second one only uses the model to impute missing observations due to censoring. A simulation study shows that the bias for inverse probability of censoring weighting remains modest and may be less than that of the two parametric estimators when the model is misspecified. We finally recommend the nonparametric inverse probability of censoring weighting estimator as the main analysis and the imputation estimator based on the illness-death model as a sensitivity analysis.

May 06, 2014 doi: 10.1177/0962280214531691
Comparison of models for analyzing two-group, cross-sectional data with a Gaussian outcome subject to a detection limit.
Wiegand, R. E., Rose, C. E., Karon, J. M.
Statistical Methods in Medical Research: An International Review Journal. May 05, 2014

A potential difficulty in the analysis of biomarker data occurs when data are subject to a detection limit. This detection limit is often defined as the point at which the true values cannot be measured reliably. Multiple regression-type models designed to analyze such data exist. Studies have compared the bias among such models, but few have compared their statistical power. This simulation study provides a comparison of approaches for analyzing two-group, cross-sectional data with a Gaussian-distributed outcome by exploring statistical power and effect size confidence interval coverage of four models that can be implemented in standard software. We found that a Tobit model fit by maximum likelihood provides the best power and coverage. An example using human immunodeficiency virus type 1 ribonucleic acid data is used to illustrate the inferential differences in these models.
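
As an illustration of the Tobit approach mentioned in this abstract, the following is a minimal sketch of the log-likelihood for a two-group Gaussian outcome that is left-censored at a detection limit. Measured values contribute the normal density, and values below the limit contribute the normal probability of falling below it. The function name, the parameter names (`b0`, `b1`, `sigma`, `limit`) and the data layout are assumptions made for illustration; the abstract does not specify them, and this is not the authors' code.

```python
from math import log
from statistics import NormalDist


def tobit_loglik(values, censored, groups, b0, b1, sigma, limit):
    """Log-likelihood of a left-censored Gaussian two-group model.

    values   -- measured outcomes (ignored where censored is True)
    censored -- True where the outcome fell below the detection limit
    groups   -- 0/1 group indicator
    b0, b1   -- intercept and group difference (hypothetical names)
    sigma    -- residual standard deviation, must be > 0
    limit    -- detection limit
    """
    total = 0.0
    for y, is_cens, g in zip(values, censored, groups):
        dist = NormalDist(mu=b0 + b1 * g, sigma=sigma)
        if is_cens:
            # Censored value: only know that it lies below the limit.
            total += log(dist.cdf(limit))
        else:
            # Fully observed value: ordinary Gaussian density.
            total += log(dist.pdf(y))
    return total
```

Maximizing this function over `b0`, `b1` and `sigma` with any numerical optimizer would give maximum likelihood estimates under these assumptions.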

May 05, 2014 doi: 10.1177/0962280214531684
Estimation of regression quantiles in complex surveys with data missing at random: An application to birthweight determinants.
Geraci, M.
Statistical Methods in Medical Research: An International Review Journal. April 29, 2014

The estimation of population parameters using complex survey data requires careful statistical modelling to account for the design features. This is further complicated by unit and item nonresponse for which a number of methods have been developed in order to reduce estimation bias. In this paper, we address some issues that arise when the target of the inference (i.e. the analysis model or model of interest) is the conditional quantile of a continuous outcome. Survey design variables are duly included in the analysis and a bootstrap variance estimation approach is proposed. Missing data are multiply imputed by means of chained equations. In particular, imputation of continuous variables is based on their empirical distribution, conditional on all other variables in the analysis. This method preserves the distributional relationships in the data, including conditional skewness and kurtosis, and successfully handles bounded outcomes. Our motivating study concerns the analysis of birthweight determinants in a large UK-based cohort of children. A novel finding on the parental conflict theory is reported. R code implementing these procedures is provided.

April 29, 2014 doi: 10.1177/0962280213484401
Bounded influence function based inference in joint modelling of ordinal partial linear model and accelerated failure time model.
Chakraborty, A.
Statistical Methods in Medical Research: An International Review Journal. April 25, 2014

A common objective in longitudinal studies is to characterize the relationship between a longitudinal response process and time-to-event data. The ordinal nature of the response and possible missing information on covariates add complications to the joint model. In such circumstances, influential observations that are often present in the data may upset the analysis. In this paper, a joint model based on an ordinal partial mixed model and an accelerated failure time model is used to account for the repeated ordered response and the time-to-event data, respectively. Here, we propose an influence function-based robust estimation method. A Monte Carlo expectation maximization algorithm is used for parameter estimation. A detailed simulation study has been done to evaluate the performance of the proposed method. As an application, data on muscular dystrophy among children are used. Robust estimates are then compared with classical maximum likelihood estimates.

April 25, 2014 doi: 10.1177/0962280214531570
Controlling for localised spatio-temporal autocorrelation in long-term air pollution and health studies.
Lee, D., Mitchell, R.
Statistical Methods in Medical Research: An International Review Journal. April 25, 2014

Estimating the long-term health impact of air pollution using an ecological spatio-temporal study design is a challenging task, due to the presence of residual spatio-temporal autocorrelation in the health counts after adjusting for the covariate effects. This autocorrelation is commonly modelled by a set of random effects represented by a Gaussian Markov random field (GMRF) prior distribution, as part of a hierarchical Bayesian model. However, GMRF models typically assume the random effects are globally smooth in space and time, and thus are likely to be collinear to any spatially and temporally smooth covariates such as air pollution. Such collinearity leads to poor estimation performance of the estimated fixed effects, and motivated by this epidemiological problem, this paper proposes new GMRF methodology to allow for localised spatio-temporal smoothing. This means random effects that are either geographically or temporally adjacent are allowed to be autocorrelated or conditionally independent, which allows more flexible autocorrelation structures to be represented. This increased flexibility results in improved fixed effects estimation compared with global smoothing models, which is evidenced by our simulation study. The methodology is then applied to the motivating study investigating the long-term effects of air pollution on respiratory ill health in Greater Glasgow, Scotland between 2007 and 2011.

April 25, 2014 doi: 10.1177/0962280214527384
A new risk-adjusted Bernoulli cumulative sum chart for monitoring binary health data.
Rossi, G., Sarto, S. D., Marchi, M.
Statistical Methods in Medical Research: An International Review Journal. April 22, 2014

To monitor a health event in patients with a specific risk of developing the event, a risk-adjusted cumulative sum chart is needed. The risk-adjusted cumulative sum chart proposed in the literature has some limitations. Setting appropriate control limits is not straightforward, there is no simple formula for constructing them, and they remain sensitive to changes in the underlying risk distribution and the baseline incidence rate. To overcome these limitations, we propose a new risk-adjusted Bernoulli cumulative sum chart as a simple and efficient solution. Analyses of simulated and real data sets illustrate the performance and usefulness of the proposed procedure.
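
For orientation, here is a minimal sketch of a risk-adjusted Bernoulli cumulative sum chart in the log-likelihood-ratio form commonly used in the literature. It is not the new chart proposed in this paper, whose formula the abstract does not give. The in-control odds ratio is assumed to be 1, and the alternative odds ratio of 2 and control limit of 4.5 are arbitrary illustrative defaults, not recommendations.

```python
from math import log


def cusum_weight(y, p, odds_ratio):
    """Log-likelihood-ratio score for one patient.

    y          -- 1 if the event occurred, 0 otherwise
    p          -- patient-specific predicted risk under the in-control model
    odds_ratio -- odds ratio the chart is tuned to detect (in-control = 1)
    """
    denom = 1.0 - p + odds_ratio * p
    return log(odds_ratio / denom) if y else log(1.0 / denom)


def risk_adjusted_cusum(outcomes, risks, odds_ratio=2.0, limit=4.5):
    """Return the chart path and the index of the first signal (or None)."""
    s = 0.0
    path = []
    first_signal = None
    for i, (y, p) in enumerate(zip(outcomes, risks)):
        # The chart is reset at zero so that it only accumulates evidence
        # of an increase in risk.
        s = max(0.0, s + cusum_weight(y, p, odds_ratio))
        path.append(s)
        if first_signal is None and s >= limit:
            first_signal = i
    return path, first_signal
```

In this form an event in a low-risk patient raises the chart more than an event in a high-risk patient, which is what makes the chart risk-adjusted; choosing `limit` is the step the abstract describes as not straightforward.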

April 22, 2014 doi: 10.1177/0962280214530883
Comparison of imputation variance estimators.
Hughes, R., Sterne, J., Tilling, K.
Statistical Methods in Medical Research: An International Review Journal. April 22, 2014

Appropriate imputation inference requires both an unbiased imputation estimator and an unbiased variance estimator. The commonly used variance estimator, proposed by Rubin, can be biased when the imputation and analysis models are misspecified and/or incompatible. Robins and Wang proposed an alternative approach, which allows for such misspecification and incompatibility, but it is considerably more complex. It is unknown whether in practice Robins and Wang’s multiple imputation procedure is an improvement over Rubin’s multiple imputation. We conducted a critical review of these two multiple imputation approaches, a re-sampling method called full mechanism bootstrapping and our modified Rubin’s multiple imputation procedure via simulations and an application to data. We explored four common scenarios of misspecification and incompatibility. In general, for a moderate sample size (n = 1000), Robins and Wang’s multiple imputation produced the narrowest confidence intervals, with acceptable coverage. For a small sample size (n = 100) Rubin’s multiple imputation, overall, outperformed the other methods. Full mechanism bootstrapping was inefficient relative to the other methods and required modelling of the missing data mechanism under the missing at random assumption. Our proposed modification showed an improvement over Rubin’s multiple imputation in the presence of misspecification. Overall, Rubin’s multiple imputation variance estimator can fail in the presence of incompatibility and/or misspecification. For unavoidable incompatibility and/or misspecification, Robins and Wang’s multiple imputation could provide more robust inferences.
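
The variance estimator attributed to Rubin in this abstract combines the average within-imputation variance W with the between-imputation variance B of the m point estimates as T = W + (1 + 1/m) B. The sketch below shows that pooling step only; the function name and argument layout are illustrative assumptions, and it does not implement the Robins and Wang estimator, the bootstrap, or the authors' modification.

```python
from statistics import mean, variance


def rubin_pool(estimates, within_variances):
    """Pool m imputed-data analyses with Rubin's rules.

    estimates        -- point estimate from each imputed data set
    within_variances -- squared standard error from each imputed data set
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(within_variances):
        raise ValueError("need at least two imputations with matching lengths")
    q_bar = mean(estimates)       # pooled point estimate
    w = mean(within_variances)    # within-imputation variance
    b = variance(estimates)       # between-imputation variance (m - 1 divisor)
    total = w + (1.0 + 1.0 / m) * b
    return q_bar, total
```

For example, three estimates 1, 2 and 3, each with within-imputation variance 0.5, pool to an estimate of 2 with total variance 0.5 + (4/3)(1).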

April 22, 2014 doi: 10.1177/0962280214526216
Evaluating treatment effectiveness under model misspecification: A comparison of targeted maximum likelihood estimation with bias-corrected matching.
Kreif, N., Gruber, S., Radice, R., Grieve, R., Sekhon, J. S.
Statistical Methods in Medical Research: An International Review Journal. April 22, 2014

Statistical approaches for estimating treatment effectiveness commonly model the endpoint, or the propensity score, using parametric regressions such as generalised linear models. Misspecification of these models can lead to biased parameter estimates. We compare two approaches that combine the propensity score and the endpoint regression, and can make weaker modelling assumptions, by using machine learning approaches to estimate the regression function and the propensity score. Targeted maximum likelihood estimation is a double-robust method designed to reduce bias in the estimate of the parameter of interest. Bias-corrected matching reduces bias due to covariate imbalance between matched pairs by using regression predictions. We illustrate the methods in an evaluation of different types of hip prosthesis on the health-related quality of life of patients with osteoarthritis. We undertake a simulation study, grounded in the case study, to compare the relative bias, efficiency and confidence interval coverage of the methods. We consider data generating processes with non-linear functional form relationships, normal and non-normal endpoints. We find that across the circumstances considered, bias-corrected matching generally reported less bias, but higher variance than targeted maximum likelihood estimation. When either targeted maximum likelihood estimation or bias-corrected matching incorporated machine learning, bias was much reduced, compared to using misspecified parametric models.

April 22, 2014 doi: 10.1177/0962280214521341
Penalized count data regression with application to hospital stay after pediatric cardiac surgery.
Wang, Z., Ma, S., Zappitelli, M., Parikh, C., Wang, C.-Y., Devarajan, P.
Statistical Methods in Medical Research: An International Review Journal. April 17, 2014

Pediatric cardiac surgery may lead to poor outcomes such as acute kidney injury (AKI) and prolonged hospital length of stay (LOS). Plasma and urine biomarkers may help with early identification and prediction of these adverse clinical outcomes. In a recent multi-center study, 311 children undergoing cardiac surgery were enrolled to evaluate multiple biomarkers for diagnosis and prognosis of AKI and other clinical outcomes. LOS is often analyzed as count data, so Poisson regression and negative binomial (NB) regression are common choices for developing predictive models. With many correlated prognostic factors and biomarkers, variable selection is an important step. The present paper proposes new variable selection methods for Poisson and NB regression. We evaluated regularized regression through a penalized likelihood function. We first extend the elastic net (Enet) Poisson to two penalized Poisson regressions: Mnet, a combination of minimax concave and ridge penalties; and Snet, a combination of smoothly clipped absolute deviation (SCAD) and ridge penalties. Furthermore, we extend the above methods to penalized NB regression. For the Enet, Mnet, and Snet penalties (EMSnet), we develop a unified algorithm to estimate the parameters and conduct variable selection simultaneously. Simulation studies show that the proposed methods have advantages over some of the competing methods when predictors are highly correlated. Applying the proposed methods to the aforementioned data, we find that early postoperative urine biomarkers including NGAL, IL18, and KIM-1 independently predict LOS, after adjusting for risk and biomarker variables.

April 17, 2014 doi: 10.1177/0962280214530608
Funnel plot control limits to identify poorly performing healthcare providers when there is uncertainty in the value of the benchmark.
Manktelow, B. N., Seaton, S. E., Evans, T. A.
Statistical Methods in Medical Research: An International Review Journal. April 17, 2014

There is an increasing use of statistical methods, such as funnel plots, to identify poorly performing healthcare providers. Funnel plots comprise the construction of control limits around a benchmark and providers with outcomes falling outside the limits are investigated as potential outliers. The benchmark is usually estimated from observed data but uncertainty in this estimate is usually ignored when constructing control limits. In this paper, the use of funnel plots in the presence of uncertainty in the value of the benchmark is reviewed for outcomes from a Binomial distribution. Two methods to derive the control limits are shown: (i) prediction intervals; (ii) tolerance intervals. Tolerance intervals formally include the uncertainty in the value of the benchmark while prediction intervals do not. The probability properties of 95% control limits derived using each method were investigated through hypothesised scenarios. Neither prediction intervals nor tolerance intervals produce funnel plot control limits that satisfy the nominal probability characteristics when there is uncertainty in the value of the benchmark. This is not necessarily to say that funnel plots have no role to play in healthcare, but that without the development of intervals satisfying the nominal probability characteristics they must be interpreted with care.
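
To make the construction of funnel plot control limits concrete, the sketch below computes limits around a benchmark proportion for a provider with n cases, treating the benchmark as known. That is exactly the simplification whose consequences this paper examines. The normal approximation to the Binomial distribution is an assumption chosen for brevity, and the function name is hypothetical; this is neither the prediction-interval nor the tolerance-interval method evaluated by the authors.

```python
from math import sqrt
from statistics import NormalDist


def funnel_limits(benchmark, n, level=0.95):
    """Approximate control limits for an observed proportion.

    benchmark -- benchmark event proportion, treated as known (0 < p < 1)
    n         -- number of cases at the provider
    level     -- nominal coverage of the limits
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sqrt(benchmark * (1.0 - benchmark) / n)
    # Proportions cannot leave [0, 1], so clip the limits.
    return max(0.0, benchmark - half_width), min(1.0, benchmark + half_width)
```

Evaluating `funnel_limits` over a range of `n` traces out the funnel shape: the limits narrow toward the benchmark as provider size grows.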

April 17, 2014 doi: 10.1177/0962280214530281
Bayesian latent structure modeling of walking behavior in a physical activity intervention.
Lawson, A. B., Ellerbe, C., Carroll, R., Alia, K., Coulon, S., Wilson, D. K., VanHorn, M. L., George, S. M. S.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2014

The analysis of walking behavior in a physical activity intervention is considered. A Bayesian latent structure modeling approach is proposed whereby the ability and willingness of participants is modeled via latent effects. The dropout process is jointly modeled via a linked survival model. Computational issues are addressed via posterior sampling and a simulated evaluation of the longitudinal model’s ability to recover latent structure and predictor effects is considered. We evaluate the effect of a variety of socio-psychological and spatial neighborhood predictors on the propensity to walk and the estimation of latent ability and willingness in the full study.

April 16, 2014 doi: 10.1177/0962280214529932
Assessing the inter-rater agreement for ordinal data through weighted indexes.
Marasini, D., Quatto, P., Ripamonti, E.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2014

Assessing the inter-rater agreement between observers, in the case of ordinal variables, is an important issue in both statistical theory and biomedical applications. Typically, this problem has been dealt with using Cohen’s weighted kappa, which is a modification of the original kappa statistic, proposed for nominal variables in the case of two observers. Fleiss (1971) put forth a generalization of kappa in the case of multiple observers, but both Cohen’s and Fleiss’ kappa can behave paradoxically, which may make their magnitude difficult to interpret. In this paper, a modification of Fleiss’ kappa, not affected by paradoxes, is proposed, and subsequently generalized to the case of ordinal variables. Monte Carlo simulations are used both to test statistical hypotheses and to calculate percentile and bootstrap-t confidence intervals based on this statistic. The normal asymptotic distribution of the proposed statistic is demonstrated. Our results are applied to the classical Holmquist et al.’s (1967) dataset on the classification, by multiple observers, of carcinoma in situ of the uterine cervix. Finally, we generalize the use of s* to a bivariate case.

April 16, 2014 doi: 10.1177/0962280214529560
Multivariate tests based on interpoint distances with application to magnetic resonance imaging.
Marozzi, M.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2014

The multivariate location problem is addressed. The most familiar method to address the problem is the Hotelling test. When the hypothesis of normal distributions holds, the Hotelling test is optimal. Unfortunately, in practice the distributions underlying the samples are generally unknown and without assuming normality the finite sample unbiasedness of the Hotelling test is not guaranteed. Moreover, high-dimensional data are increasingly encountered when analyzing medical and biological problems, and in these situations the Hotelling test performs poorly or cannot be computed. A test that is unbiased for non-normal data, for small sample sizes as well as for two-sided alternatives and that can be computed for high-dimensional data has been recently proposed and is based on the ranks of the interpoint Euclidean distances between observations. Five modifications of this test are proposed and compared to the original test and the Hotelling test. Unbiasedness and consistency of the tests are proven and the problem of power computation is addressed. It is shown that two of the modified interpoint distance-based tests are always more powerful than the original test. In particular, the modified test based on the Tippett criterion is suggested when the assumption of normality is not tenable and/or in the case of high-dimensional data with complex dependence structure, which are typical in molecular biology and medical imaging. A practical application to a case-control study in which functional magnetic resonance imaging is used is discussed.

April 16, 2014 doi: 10.1177/0962280214529104
Assessing calibration of prognostic risk scores.
Crowson, C. S., Atkinson, E. J., Therneau, T. M.
Statistical Methods in Medical Research: An International Review Journal. April 07, 2014

Current methods used to assess calibration are limited, particularly in the assessment of prognostic models. Methods for testing and visualizing calibration (e.g. the Hosmer–Lemeshow test and calibration slope) have been well thought out in the binary regression setting. However, extension of these methods to Cox models is less well known and could be improved. We describe a model-based framework for the assessment of calibration in the binary setting that provides natural extensions to the survival data setting. We show that Poisson regression models can be used to easily assess calibration in prognostic models. In addition, we show that a calibration test suggested for use in survival data has poor performance. Finally, we apply these methods to the problem of external validation of a risk score developed for the general population when assessed in a special patient population (i.e. patients with particular comorbidities, such as rheumatoid arthritis).

April 07, 2014 doi: 10.1177/0962280213497434
A corrected formulation for marginal inference derived from two-part mixed models for longitudinal semi-continuous data.
Tom, B. D., Su, L., Farewell, V. T.
Statistical Methods in Medical Research: An International Review Journal. April 07, 2014

For semi-continuous data which are a mixture of true zeros and continuously distributed positive values, the use of two-part mixed models provides a convenient modelling framework. However, deriving population-averaged (marginal) effects from such models is not always straightforward. Su et al. presented a model that provided convenient estimation of marginal effects for the logistic component of the two-part model but the specification of marginal effects for the continuous part of the model presented in that paper was based on an incorrect formulation. We present a corrected formulation and additionally explore the use of the two-part model for inferences on the overall marginal mean, which may be of more practical relevance in our application and more generally.

April 07, 2014 doi: 10.1177/0962280213509798
Multiple imputation of covariates by fully conditional specification: Accommodating the substantive model.
Bartlett, J. W., Seaman, S. R., White, I. R., Carpenter, J. R., for the Alzheimer's Disease Neuroimaging Initiative.
Statistical Methods in Medical Research: An International Review Journal. April 07, 2014

Missing covariate data commonly occur in epidemiological and clinical research, and are often dealt with using multiple imputation. Imputation of partially observed covariates is complicated if the substantive model is non-linear (e.g. Cox proportional hazards model), or contains non-linear (e.g. squared) or interaction terms, and standard software implementations of multiple imputation may impute covariates from models that are incompatible with such substantive models. We show how imputation by fully conditional specification, a popular approach for performing multiple imputation, can be modified so that covariates are imputed from models which are compatible with the substantive model. We investigate through simulation the performance of this proposal, and compare it with existing approaches. Simulation results suggest our proposal gives consistent estimates for a range of common substantive models, including models which contain non-linear covariate effects or interactions, provided data are missing at random and the assumed imputation models are correctly specified and mutually compatible. Stata software implementing the approach is freely available.

April 07, 2014 doi: 10.1177/0962280214521348
Joint spatial Bayesian modeling for studies combining longitudinal and cross-sectional data.
Lawson, A. B., Carroll, R., Castro, M.
Statistical Methods in Medical Research: An International Review Journal. April 07, 2014

Designs for intervention studies may combine longitudinal data collected from sampled locations over several survey rounds and cross-sectional data from other locations in the study area. In this case, modeling the impact of the intervention requires an approach that can accommodate both types of data, accounting for the dependence between individuals followed up over time. Inadequate modeling can mask intervention effects, with serious implications for policy making. In this paper we use data from a large-scale larviciding intervention for malaria control implemented in Dar es Salaam, United Republic of Tanzania, collected over a period of almost 5 years. We apply a longitudinal Bayesian spatial model to the Dar es Salaam data, combining follow-up and cross-sectional data, treating the correlation in longitudinal observations separately, and controlling for potential confounders. An innovative feature of this modeling is the use of an Ornstein–Uhlenbeck process to model random time effects. We contrast the results with other Bayesian modeling formulations, including cross-sectional approaches that consider individual-level random effects to account for subjects followed up in two or more surveys. The longitudinal modeling approach indicates that the intervention significantly reduced the prevalence of malaria infection in Dar es Salaam by 20%, whereas the joint model did not suggest a significant effect. Our results suggest that the longitudinal model is to be preferred when longitudinal information is available at the individual level.

April 07, 2014 doi: 10.1177/0962280214527383
On fitting spatio-temporal disease mapping models using approximate Bayesian inference.
Ugarte, M. D., Adin, A., Goicoa, T., Militino, A. F.
Statistical Methods in Medical Research: An International Review Journal. April 07, 2014

Spatio-temporal disease mapping comprises a wide range of models used to describe the distribution of a disease in space and its evolution in time. These models have been commonly formulated within a hierarchical Bayesian framework with two main approaches: an empirical Bayes (EB) and a fully Bayes (FB) approach. The EB approach provides point estimates of the parameters relying on the well-known penalized quasi-likelihood (PQL) technique. The FB approach provides the posterior distribution of the target parameters. These marginal distributions are not usually available in closed form and common estimation procedures are based on Markov chain Monte Carlo (MCMC) methods. However, the spatio-temporal models used in disease mapping are often very complex and MCMC methods may lead to large Monte Carlo errors and a huge computation time if the dimension of the data at hand is large. To circumvent these potential inconveniences, a new technique called integrated nested Laplace approximations (INLA), based on nested Laplace approximations, has been proposed for Bayesian inference in latent Gaussian models. In this paper, we show how to fit different spatio-temporal models for disease mapping with INLA using the Leroux CAR prior for the spatial component, and we compare it with PQL via a simulation study. The spatio-temporal distribution of male brain cancer mortality in Spain during the period 1986–2010 is also analysed.

April 07, 2014 doi: 10.1177/0962280214527528
A comparison of imputation strategies in cluster randomized trials with missing binary outcomes.
Caille, A., Leyrat, C., Giraudeau, B.
Statistical Methods in Medical Research: An International Review Journal. April 07, 2014

In cluster randomized trials, clusters of subjects are randomized rather than subjects themselves, and missing outcomes are a concern as in individual randomized trials. We assessed strategies for handling missing data when analysing cluster randomized trials with a binary outcome; strategies included complete case, adjusted complete case, and simple and multiple imputation approaches. We performed a simulation study to assess bias and coverage rate of the population-averaged intervention-effect estimate. Multiple imputation with either a random-effects logistic regression model or classical logistic regression provided unbiased estimates of the intervention effect. Both strategies also showed good coverage properties, which were slightly better for multiple imputation with a random-effects logistic regression approach. Finally, this latter approach led to a slightly negatively biased intracluster correlation coefficient estimate, but the bias was smaller than that with a classical logistic regression model strategy. We applied these strategies to a real trial randomizing households and comparing ivermectin and malathion to treat head lice.

April 07, 2014 doi: 10.1177/0962280214530030
Detecting adverse drug reactions following long-term exposure in longitudinal observational data: The exposure-adjusted self-controlled case series.
Schuemie, M. J., Trifiro, G., Coloma, P. M., Ryan, P. B., Madigan, D.
Statistical Methods in Medical Research: An International Review Journal. March 31, 2014

Most approaches used in postmarketing drug safety monitoring, including spontaneous reporting and statistical risk identification using electronic health care records, are primarily suited to pick up only acute adverse drug effects. With the availability of increasingly large electronic health record and administrative claims databases comes the opportunity to monitor for potential adverse effects that occur only after prolonged exposure to a drug, but analysis methods are lacking. We propose an adaptation of the self-controlled case series design that uses the notion of accumulated exposure to capture long-term effects of drugs, and we evaluate extensions to correct for age and recurrent events. Several variations of the approach are tested on simulated data and two large insurance claims databases. To evaluate performance, a set of positive and negative control drug–event pairs was created by medical experts based on drug product labels and review of the literature. Performance on the real data was measured using the area under the receiver operating characteristic curve. The best performing method achieved an area under the receiver operating characteristic curve of 0.86 in the largest database using a spline model, adjustment for age, and ignoring recurrent events, but it appears this performance can only be achieved with very large data sets.

March 31, 2014 doi: 10.1177/0962280214527531 open full text
Spatiotemporal hurdle models for zero-inflated count data: Exploring trends in emergency department visits.
Neelon, B., Chang, H. H., Ling, Q., Hastings, N. S.
Statistical Methods in Medical Research: An International Review Journal. March 28, 2014

Motivated by a study exploring spatiotemporal trends in emergency department use, we develop a class of two-part hurdle models for the analysis of zero-inflated areal count data. The models consist of two components—one for the probability of any emergency department use and one for the number of emergency department visits given use. Through a hierarchical structure, the models incorporate both patient- and region-level predictors, as well as spatially and temporally correlated random effects for each model component. The random effects are assigned multivariate conditionally autoregressive priors, which induce dependence between the components and provide spatial and temporal smoothing across adjacent spatial units and time periods, resulting in improved inferences. To accommodate potential overdispersion, we consider a range of parametric specifications for the positive counts, including truncated negative binomial and generalized Poisson distributions. We adopt a Bayesian inferential approach, and posterior computation is handled conveniently within standard Bayesian software. Our results indicate that the negative binomial and generalized Poisson hurdle models vastly outperform the Poisson hurdle model, demonstrating that overdispersed hurdle models provide a useful approach to analyzing zero-inflated spatiotemporal data.
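
To make the two-part structure concrete, here is a minimal sketch of the probability mass function of a Poisson hurdle model, the baseline against which the overdispersed variants are compared. It ignores the covariates and the spatial and temporal random effects of the full model; the parameter names `p_use` and `lam` are illustrative assumptions.

```python
import math


def hurdle_poisson_pmf(k, p_use, lam):
    """P(Y = k): zero with probability 1 - p_use, else a zero-truncated Poisson(lam)."""
    if k < 0:
        return 0.0
    if k == 0:
        return 1.0 - p_use
    # Zero-truncated Poisson: renormalise the Poisson mass over k >= 1
    truncated = math.exp(-lam) * lam**k / math.factorial(k) / (1.0 - math.exp(-lam))
    return p_use * truncated


# Hypothetical values: 30% of patients use the emergency department at all
probabilities = [hurdle_poisson_pmf(k, p_use=0.3, lam=2.5) for k in range(50)]
print(probabilities[:4], sum(probabilities))
```

Replacing the truncated Poisson term with a truncated negative binomial or generalized Poisson mass gives the overdispersed variants the abstract refers to.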

March 28, 2014 doi: 10.1177/0962280214527079 open full text
Can we believe the DAGs? A comment on the relationship between causal DAGs and mechanisms.
Aalen, O., Roysland, K., Gran, J., Kouyos, R., Lange, T.
Statistical Methods in Medical Research: An International Review Journal. March 28, 2014

Directed acyclic graphs (DAGs) play a large role in the modern approach to causal inference. DAGs describe the relationship between measurements taken at various discrete times including the effect of interventions. The causal mechanisms, on the other hand, would naturally be assumed to be a continuous process operating over time in a cause–effect fashion. How does such immediate causation, that is causation occurring over very short time intervals, relate to DAGs constructed from discrete observations? We introduce a time-continuous model and simulate discrete observations in order to judge the relationship between the DAG and the immediate causal model. We find that there is no clear relationship; indeed the Bayesian network described by the DAG may not relate to the causal model. Typically, discrete observations of a process will obscure the conditional dependencies that are represented in the underlying mechanistic model of the process. It is therefore doubtful whether DAGs are always suited to describe causal relationships unless time is explicitly considered in the model. We relate the issues to mechanistic modeling by using the concept of local (in)dependence. An example using data from the Swiss HIV Cohort Study is presented.

March 28, 2014 doi: 10.1177/0962280213520436 open full text
Bayesian hierarchical modelling of noisy spatial rates on a modestly large and discontinuous irregular lattice.
MacNab, Y. C., Read, S., Strong, M., Pearson, T., Maheswaran, R., Goyder, E.
Statistical Methods in Medical Research: An International Review Journal. March 26, 2014

We present Bayesian hierarchical spatial model development motivated by a recent analysis of noisy small area response rate data, named the Booster data. The Booster data are postcode-level aggregates from a recent mail-out recruitment for a physical exercise intervention in deprived urban neighbourhoods in Sheffield, UK. Bayesian hierarchical Bernoulli-binomial spatial mixture zero-inflated Binomial models were developed for modelling overdispersion and for separation of systematic and random variations in the noisy and mostly low crude response rates. We present methods that enabled us to explore the underlying spatial rate variation, clustering of low or high response rate areas and neighbourhood characteristics that were associated with variations and patterns of invitation mail-outs, zero-response and response rates. Three spatial prior formulations, the intrinsic conditional autoregressive (iCAR), the Besag-York-Mollié (BYM) and the modified BYM models, were explored for their performance on modelling sparse data on a modestly large and discontinuous irregular lattice. An in-depth Bayesian analysis of the Booster data is presented, with the resulting posterior estimation and inference implemented via Markov chain Monte Carlo simulation in WinBUGS. With increasing availability of spatial data referenced at fine spatial scales such as the postcode, the sparse-data situation and the Bayesian models and methods discussed herein should have considerable relevance to small area disease and health mapping and to spatial regression.

March 26, 2014 doi: 10.1177/0962280214527386 open full text
Joint modeling of HIV data in multicenter observational studies: A comparison among different approaches.
Brombin, C., Di Serio, C., Rancoita, P. M.
Statistical Methods in Medical Research: An International Review Journal. March 26, 2014

A disease process over time results from the combination of event history information and a longitudinal process. Commonly, separate analyses of longitudinal and survival outcomes are performed. However, disregarding the dependence between these components may cause misleading results. Separate analyses are difficult to interpret whenever one deals with observational retrospective multicenter cohort studies where the biomarkers are poorly monitored over time, while the survival component may be affected by several sources of bias, such as multiple endpoints, multiple time-scales, and informative censoring. We discuss how joint modeling of longitudinal and survival data represents an effective strategy to incorporate all information simultaneously and to provide valid and efficient inferences, thus providing better insight into the biological mechanisms underlying the phenomenon under study. Accounting for the whole dynamics of the disease process is crucial in retrospective longitudinal studies. In this work, we present different approaches for modeling longitudinal and time-to-event data, retrieved from 648 HIV-infected patients enrolled in the Italian cohort of the CASCADE (Concerted Action on SeroConversion to AIDS and Death in Europe) study, one of the largest AIDS collaborative cohort studies. In particular, we evaluate CD4 lymphocyte evolution over time (from the date of seroconversion) and overall survival, CD4 being one of the most important immunologic biomarkers for HIV progression. Besides a standard separate modeling approach, we consider two alternative joint models: the traditional joint model and the joint latent class mixed model. Advantages and disadvantages of the different approaches are discussed. To compare the performances of these models, cross-validation procedures are also performed.

March 26, 2014 doi: 10.1177/0962280214526192 open full text
Evaluation of cluster recovery for small area relative risk models.
Rotejanaprasert, C.
Statistical Methods in Medical Research: An International Review Journal. March 21, 2014

The analysis of disease risk is often considered via relative risk. The comparison of relative risk estimation methods with "true risk" scenarios has been considered on various occasions. However, there has been little examination of how well competing methods perform when the focus is clustering of risk. In this paper, we consider a simulation-based evaluation of a range of potential spatial risk models and a range of measures that can be used for (a) cluster goodness of fit and (b) cluster diagnostics. Results suggest that exceedance probability is a poor measure of hot spot clustering because of model dependence, whereas residual-based methods are less model dependent and perform better. Local deviance information criteria measures perform well, but conditional predictive ordinate measures yield a high false positive rate.
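
For readers unfamiliar with the exceedance probability discussed above, the sketch below estimates it for one area as the share of posterior samples of the relative risk that exceed a threshold. The samples and the threshold of 1.0 are hypothetical; in practice the samples would come from the fitted spatial model, which is why the measure inherits that model's assumptions.

```python
def exceedance_probability(relative_risk_samples, threshold=1.0):
    """Estimate Pr(relative risk > threshold | data) from posterior samples."""
    if not relative_risk_samples:
        raise ValueError("need at least one posterior sample")
    exceed = sum(1 for s in relative_risk_samples if s > threshold)
    return exceed / len(relative_risk_samples)


# Hypothetical posterior draws of the relative risk for a single small area
draws = [0.8, 1.1, 1.3, 0.9, 1.6]
print(exceedance_probability(draws))
```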

March 21, 2014 doi: 10.1177/0962280214527382 open full text
Prospective analysis of infectious disease surveillance data using syndromic information.
Corberan-Vallet, A., Lawson, A. B.
Statistical Methods in Medical Research: An International Review Journal. March 21, 2014

In this paper, we describe a Bayesian hierarchical Poisson model for the prospective analysis of data for infectious diseases. The proposed model consists of two components. The first component describes the behavior of disease during nonepidemic periods and the second component represents the increase in disease counts due to the presence of an epidemic. A novelty of our model formulation is that the parameters describing the spread of epidemics are allowed to vary in both space and time. We also show how syndromic information can be incorporated into the model to provide a better description of the data and more accurate one-step-ahead forecasts. These real-time forecasts can be used to identify high-risk areas for outbreaks and, consequently, to develop efficient targeted surveillance. We apply the methodology to weekly emergency room discharges for acute bronchitis in South Carolina.

March 21, 2014 doi: 10.1177/0962280214527385 open full text
Detecting the violation of variance homogeneity in mixed models.
Fang, X., Li, J., Wong, W. K., Fu, B.
Statistical Methods in Medical Research: An International Review Journal. March 21, 2014

Mixed-effects models are increasingly used in many areas of applied science. Despite their popularity, there is virtually no systematic approach for examining the homogeneity of the random-effects covariance structure commonly assumed for such models. We propose two tests for evaluating the homogeneity of the covariance structure assumption across subjects: one is based on the covariance matrices computed from the fitted model and the other is based on the empirical variation computed from the estimated random effects. We used simulation studies to compare performances of the two tests for detecting violations of the homogeneity assumption in the mixed-effects models and showed that they were able to identify abnormal clusters of subjects with dissimilar random-effects covariance structures; in particular, their removal from the fitted model might change the signs and the magnitudes of important predictors in the analysis. In a case study, we applied our proposed tests to a longitudinal cohort study of rheumatoid arthritis patients and compared their abilities to ascertain whether the assumption of covariance homogeneity for subject-specific random effects holds.

March 21, 2014 doi: 10.1177/0962280214526194 open full text
Bayesian inference for an illness-death model for stroke with cognition as a latent time-dependent risk factor.
van den Hout, A., Fox, J.-P., Klein Entink, R. H.
Statistical Methods in Medical Research: An International Review Journal. March 21, 2014

Longitudinal data can be used to estimate the transition intensities between healthy and unhealthy states prior to death. An illness-death model for history of stroke is presented, where time-dependent transition intensities are regressed on a latent variable representing cognitive function. The change of this function over time is described by a linear growth model with random effects. Occasion-specific cognitive function is measured by an item response model for longitudinal scores on the Mini-Mental State Examination, a questionnaire used to screen for cognitive impairment. The illness-death model will be used to identify and to explore the relationship between occasion-specific cognitive function and stroke. Combining a multi-state model with the latent growth model defines a joint model which extends current statistical inference regarding disease progression and cognitive function. Markov chain Monte Carlo methods are used for Bayesian inference. Data stem from the Medical Research Council Cognitive Function and Ageing Study in the UK (1991–2005).

March 21, 2014 doi: 10.1177/0962280211426359 open full text
Analysis of clustered competing risks data using subdistribution hazard models with multivariate frailties.
Ha, I. D., Christian, N. J., Jeong, J.-H., Park, J., Lee, Y.
Statistical Methods in Medical Research: An International Review Journal. March 11, 2014

Competing risks data often exist within a center in multi-center randomized clinical trials where the treatment effects or baseline risks may vary among centers. In this paper, we propose a subdistribution hazard regression model with multivariate frailty to investigate heterogeneity in treatment effects among centers from multi-center clinical trials. For inference, we develop a hierarchical likelihood (or h-likelihood) method, which obviates the need for an intractable integration over the frailty terms. We show that the profile likelihood function derived from the h-likelihood is identical to the partial likelihood, and hence it can be extended to the weighted partial likelihood for the subdistribution hazard frailty models. The proposed method is illustrated with a dataset from a multi-center clinical trial on breast cancer as well as with a simulation study. We also demonstrate how to present heterogeneity in treatment effects among centers by using a confidence interval for the frailty for each individual center and how to perform a statistical test for such heterogeneity using a restricted h-likelihood.

March 11, 2014 doi: 10.1177/0962280214526193 open full text
On a class of optimal covariate-adjusted response adaptive designs for survival outcomes.
Biswas, A., Bhattacharya, R., Park, E.
Statistical Methods in Medical Research: An International Review Journal. March 11, 2014

A class of optimal covariate-adjusted response adaptive procedures is developed for phase III clinical trials when the treatment response is a survival time and there is random censoring. The basic aim is to develop an allocation design by combining the ethical aspects with statistical precision in a reasonable way in the presence of covariate information. Considering minimisation of total hazards as the ethical requirement, the proposed procedure is assessed in terms of the assignment to the better treatment and the efficiency (i.e. power) to detect a small departure in treatment effectiveness. The applicability of the proposed methodology is also illustrated using a real data set.

March 11, 2014 doi: 10.1177/0962280214524177 open full text
Optimal scheduling of post-therapeutic follow-up of patients treated for cancer for early detection of relapses.
Somda, S. M., Leconte, E., Boher, J.-M., Asselain, B., Kramar, A., Filleron, T.
Statistical Methods in Medical Research: An International Review Journal. February 24, 2014

Post-therapeutic surveillance is an important component of cancer care. However, there are still no evidence-based strategies for scheduling patients’ follow-up examinations. Our approach is based on modeling the probability of the onset of relapse at an early asymptomatic or preclinical stage and its transition to a clinical stage. For this purpose, we consider a multistate homogeneous Markov model, which includes the natural history of relapse. The model also handles the different types of possible relapse separately. The optimal schedule is given by the visit calendar that maximizes a utility function. The methodology has been applied to laryngeal cancer. The resulting follow-up strategies proved to be more efficient than those proposed by different scientific societies.

February 24, 2014 doi: 10.1177/0962280214524178 open full text
Survival analysis with functional covariates for partial follow-up studies.
Fang, H.-B., Wu, T. T., Rapoport, A. P., Tan, M.
Statistical Methods in Medical Research: An International Review Journal. February 24, 2014

Predictive or prognostic analysis plays an increasingly important role in the era of personalized medicine to identify subsets of patients whom the treatment may benefit the most. Although various time-dependent covariate models are available, such models require that covariates be followed over the whole follow-up period. This article studies a new class of functional survival models where the covariates are only monitored in a time interval that is shorter than the whole follow-up period. The work is motivated by the analysis of a longitudinal study on advanced myeloma patients who received stem cell transplants and T cell infusions after the transplants. The absolute lymphocyte cell counts were collected serially during hospitalization. Patients who are alive after hospitalization continue to be followed up, but their absolute lymphocyte cell counts can no longer be measured. Another complication is that absolute lymphocyte cell counts are sparsely and irregularly measured. The conventional method using the Cox model with time-varying covariates is not applicable because of the different lengths of observation periods. Analysis based on each single observation obviously underutilizes available information and, more seriously, may yield misleading results. This so-called partial follow-up study design represents an increasingly common predictive modeling problem where we have serial multiple biomarkers up to a certain time point, which is shorter than the total length of follow-up. We therefore propose a solution to the partial follow-up design. The new method combines functional principal components analysis and survival analysis with selection of those functional covariates. It also has the advantage of handling sparse and irregularly measured longitudinal observations of covariates and measurement errors. Our analysis based on functional principal components reveals that it is the patterns of the trajectories of absolute lymphocyte cell counts, instead of the actual counts, that affect patients’ disease-free survival time.

February 24, 2014 doi: 10.1177/0962280214523586 open full text
A cautionary note on the use of attributable fractions in cohort studies.
Sjolander, A.
Statistical Methods in Medical Research: An International Review Journal. February 24, 2014

The attributable fraction is a widely used measure to quantify the public health impact of an exposure on an outcome. It was originally proposed for binary outcomes, but attributable fraction estimators have also been proposed for time-to-event outcomes. In this note, we consider an estimator which was proposed by Benichou (Stat Methods Med Res, 2001) and is supposed to estimate the cohort attributable fraction, i.e. the number of events that would have been prevented in the cohort during follow-up if the exposure had hypothetically been eliminated. We show that this estimator is only valid under certain assumptions, which are often likely to be violated in practice. We further argue that the cohort attributable fraction may not be of substantial scientific interest in the first place. We propose a potentially more relevant measure of attributable fraction in cohort studies: the baseline attributable fraction. We show how the baseline attributable fraction can be conveniently estimated in Cox proportional hazards models.

February 24, 2014 doi: 10.1177/0962280214523953 open full text
Optimal selection of individuals for repeated covariate measurements in follow-up studies.
Reinikainen, J., Karvanen, J., Tolonen, H.
Statistical Methods in Medical Research: An International Review Journal. February 24, 2014

Repeated covariate measurements provide important information on time-varying risk factors in long epidemiological follow-up studies. However, due to budget limitations, it may be possible to carry out the repeated measurements only for a subset of the cohort. We study cost-efficient alternatives to simple random sampling for selecting the individuals to be remeasured. The proposed selection criteria are based on forms of D-optimality. The selection methods are compared in simulation studies and illustrated with data from the East–West study carried out in Finland from 1959 to 1999. The results indicate that cost savings can be achieved if the selection is focused on the individuals with high expected risk of the event and, on the other hand, on those with extreme covariate values in the previous measurements.

February 24, 2014 doi: 10.1177/0962280214523952 open full text
Bayesian analysis of transformation latent variable models with multivariate censored data.
Song, X.-Y., Pan, D., Liu, P.-F., Cai, J.-H.
Statistical Methods in Medical Research: An International Review Journal. February 17, 2014

Transformation latent variable models are proposed in this study to analyze multivariate censored data. The proposed models generalize conventional linear transformation models to semiparametric transformation models that accommodate latent variables. The characteristics of the latent variables were assessed based on several correlated observed indicators through measurement equations. A Bayesian approach was developed with Bayesian P-splines technique and the Markov chain Monte Carlo algorithm to estimate the unknown parameters and transformation functions. Simulation shows that the performance of the proposed methodology is satisfactory. The proposed method was applied to analyze a cardiovascular disease data set.

February 17, 2014 doi: 10.1177/0962280214522786 open full text
Confidence intervals for intraclass correlation coefficients in variance components models.
Demetrashvili, N., Wit, E. C., van den Heuvel, E. R.
Statistical Methods in Medical Research: An International Review Journal. February 17, 2014

Confidence intervals for intraclass correlation coefficients in agreement studies with continuous outcomes are model-specific and no generic approach exists. This paper provides two generic approaches for intraclass correlation coefficients of the form $$\sum_{q=1}^{Q}\sigma_{q}^{2} \Big/ \Big(\sum_{q=1}^{Q}\sigma_{q}^{2}+\sum_{p=Q+1}^{P}\sigma_{p}^{2}\Big)$$. The first approach uses Satterthwaite’s approximation and an F-distribution. The second approach uses the first and second moments of the intraclass correlation coefficient estimate in combination with a Beta distribution. Both approaches are based on the restricted maximum likelihood estimates for the variance components involved. Simulation studies are conducted to examine the coverage probabilities of the confidence intervals for agreement studies with a mix of small sample sizes. Two different three-way variance components models and balanced and unbalanced one-way random effects models are investigated. The proposed approaches are compared with other approaches developed for these specific models. The approach based on the F-distribution provides acceptable coverage probabilities, but the approach based on the Beta distribution results in accurate coverages for most settings in both balanced and unbalanced designs. A real agreement study is provided to illustrate the approaches.
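
The general form of the intraclass correlation coefficient given above can be computed directly once the variance components are estimated. The sketch below only evaluates the point estimate; it does not implement either of the proposed confidence interval methods. The function name and the example variance components are hypothetical.

```python
def icc(variance_components, q):
    """ICC = (sum of the first q components) / (sum of all P components)."""
    if not 0 < q <= len(variance_components):
        raise ValueError("q must be between 1 and the number of components")
    if any(v < 0 for v in variance_components):
        raise ValueError("variance components must be non-negative")
    total = sum(variance_components)
    if total == 0:
        raise ValueError("at least one component must be positive")
    return sum(variance_components[:q]) / total


# Hypothetical estimates: the first Q = 2 components form the numerator
components = [4.0, 1.0, 3.0, 2.0]
print(icc(components, q=2))
```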

February 17, 2014 doi: 10.1177/0962280214522787 open full text
Sample size considerations in active-control non-inferiority trials with binary data based on the odds ratio.
Siqueira, A. L., Todd, S., Whitehead, A.
Statistical Methods in Medical Research: An International Review Journal. February 12, 2014

This paper presents an approximate closed form sample size formula for determining non-inferiority in active-control trials with binary data. We use the odds-ratio as the measure of the relative treatment effect, derive the sample size formula based on the score test and compare it with a second, well-known formula based on the Wald test. Both closed form formulae are compared with simulations based on the likelihood ratio test. Within the range of parameter values investigated, the score test closed form formula is reasonably accurate when non-inferiority margins are based on odds-ratios of about 0.5 or above and when the magnitude of the odds ratio under the alternative hypothesis lies between about 1 and 2.5. The accuracy generally decreases as the odds ratio under the alternative hypothesis moves upwards from 1. As the non-inferiority margin odds ratio decreases from 0.5, the score test closed form formula increasingly overestimates the sample size irrespective of the magnitude of the odds ratio under the alternative hypothesis. The Wald test closed form formula is also reasonably accurate in the cases where the score test closed form formula works well. Outside these scenarios, the Wald test closed form formula can either underestimate or overestimate the sample size, depending on the magnitude of the non-inferiority margin odds ratio and the odds ratio under the alternative hypothesis. Although neither approximation is accurate for all cases, both approaches lead to satisfactory sample size calculation for non-inferiority trials with binary data where the odds ratio is the parameter of interest.
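
For orientation, the sketch below implements a commonly cited Wald-type closed form for the per-group sample size of a non-inferiority trial on the log odds ratio scale, assuming equal allocation. Whether this matches the exact Wald formula examined in the paper is an assumption, and the score-test formula derived by the authors is not reproduced here. All parameter names and values are illustrative.

```python
import math
from statistics import NormalDist


def noninferiority_n_per_group(p_control, p_treatment, margin_or,
                               alpha=0.025, power=0.9):
    """Wald-type per-group sample size (assumed form, equal allocation)."""
    odds_ratio = (p_treatment * (1 - p_control)) / (p_control * (1 - p_treatment))
    # Distance between the alternative and the margin on the log odds ratio scale
    effect = math.log(odds_ratio) - math.log(margin_or)
    if effect <= 0:
        raise ValueError("alternative odds ratio must exceed the margin")
    z = NormalDist().inv_cdf(1 - alpha) + NormalDist().inv_cdf(power)
    variance_term = (1 / (p_treatment * (1 - p_treatment))
                     + 1 / (p_control * (1 - p_control)))
    return math.ceil(z**2 * variance_term / effect**2)


# Hypothetical design: 70% response in both arms, margin odds ratio 0.5
n = noninferiority_n_per_group(0.7, 0.7, margin_or=0.5)
print(n)
```

Moving the margin closer to 1 shrinks the distance in the denominator, so the required sample size grows quickly.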

February 12, 2014 doi: 10.1177/0962280214520729 open full text
A combined gamma frailty and normal random-effects model for repeated, overdispersed time-to-event data.
Molenberghs, G., Verbeke, G., Efendi, A., Braekers, R., Demetrio, C. G.
Statistical Methods in Medical Research: An International Review Journal. February 12, 2014

This paper presents, extends, and studies a model for repeated, overdispersed time-to-event outcomes, subject to censoring. Building upon work by Molenberghs, Verbeke, and Demétrio (2007) and Molenberghs et al. (2010), gamma and normal random effects are included in a Weibull model, to account for overdispersion and between-subject effects, respectively. Unlike in that earlier work, censoring is allowed for, and two estimation methods are presented. The partial marginalization approach to full maximum likelihood of Molenberghs et al. (2010) is contrasted with pseudo-likelihood estimation. A limited simulation study is conducted to examine the relative merits of these estimation methods. The modeling framework is employed to analyze data on recurrent asthma attacks in children on the one hand and on survival in cancer patients on the other.

February 12, 2014 doi: 10.1177/0962280214520730 open full text
Analysis of cross-over studies with missing data.
Rosenkranz, G. K.
Statistical Methods in Medical Research: An International Review Journal. February 04, 2014

This paper addresses some aspects of the analysis of cross-over trials with missing or incomplete data. A literature review on the topic reveals that many proposals provide correct results under the missing completely at random assumption while only some consider the more general missing at random situation. It is argued that mixed-effects models have a role in this context to recover some of the missing intra-subject information from the inter-subject information, in particular when missingness is ignorable. Finally, sensitivity analyses to deal with more general missingness mechanisms are presented.

February 04, 2014 doi: 10.1177/0962280214521349 open full text
The performance of different propensity score methods for estimating absolute effects of treatments on survival outcomes: A simulation study.
Austin, P. C., Schuster, T.
Statistical Methods in Medical Research: An International Review Journal. February 03, 2014

Observational studies are increasingly being used to estimate the effect of treatments, interventions and exposures on outcomes that can occur over time. Historically, the hazard ratio, which is a relative measure of effect, has been reported. However, medical decision making is best informed when both relative and absolute measures of effect are reported. When outcomes are time-to-event in nature, the effect of treatment can also be quantified as the change in mean or median survival time due to treatment and the absolute reduction in the probability of the occurrence of an event within a specified duration of follow-up. We describe how three different propensity score methods, propensity score matching, stratification on the propensity score and inverse probability of treatment weighting using the propensity score, can be used to estimate absolute measures of treatment effect on survival outcomes. These methods are all based on estimating marginal survival functions under treatment and lack of treatment. We then conducted an extensive series of Monte Carlo simulations to compare the relative performance of these methods for estimating the absolute effects of treatment on survival outcomes. We found that stratification on the propensity score resulted in the greatest bias. Caliper matching on the propensity score and a method based on earlier work by Cole and Hernán tended to have the best performance for estimating absolute effects of treatment on survival outcomes. When the prevalence of treatment was less extreme, then inverse probability of treatment weighting-based methods tended to perform better than matching-based methods.
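
One of the ideas described above, estimating marginal survival functions with inverse probability of treatment weighting, can be sketched as a weighted Kaplan-Meier estimator applied within each treatment arm. This is a simplified illustration under the assumption that propensity scores are already estimated; it is not the authors' simulation code, and all names and data are hypothetical.

```python
def iptw_weights(treated, propensity):
    """Weights 1/e for treated subjects and 1/(1 - e) for untreated subjects."""
    return [1.0 / e if z else 1.0 / (1.0 - e) for z, e in zip(treated, propensity)]


def weighted_km(times, events, weights, t):
    """Weighted Kaplan-Meier estimate of the survival probability S(t)."""
    survival = 1.0
    event_times = sorted({ti for ti, d in zip(times, events) if d and ti <= t})
    for u in event_times:
        at_risk = sum(w for ti, w in zip(times, weights) if ti >= u)
        failed = sum(w for ti, d, w in zip(times, events, weights) if d and ti == u)
        survival *= 1.0 - failed / at_risk
    return survival


def absolute_risk_difference(times, events, treated, propensity, t):
    """Difference in event probability by time t: treated minus untreated."""
    weights = iptw_weights(treated, propensity)
    risk = {}
    for arm in (1, 0):
        idx = [i for i, z in enumerate(treated) if z == arm]
        s = weighted_km([times[i] for i in idx], [events[i] for i in idx],
                        [weights[i] for i in idx], t)
        risk[arm] = 1.0 - s
    return risk[1] - risk[0]


# Hypothetical data: event indicator 1 = event observed, 0 = censored
times = [2, 5, 3, 8, 4, 6, 1, 7]
events = [1, 0, 1, 1, 1, 0, 1, 1]
treated = [1, 1, 1, 1, 0, 0, 0, 0]
propensity = [0.6, 0.5, 0.7, 0.4, 0.5, 0.3, 0.6, 0.4]
print(absolute_risk_difference(times, events, treated, propensity, t=5))
```

With all weights equal to one, the estimator reduces to the ordinary Kaplan-Meier curve, which gives a simple sanity check.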

February 03, 2014 doi: 10.1177/0962280213519716 open full text
Various varying variances: The challenge of nuisance parameters to the practising biostatistician.
Senn, S.
Statistical Methods in Medical Research: An International Review Journal. February 02, 2014

The 1997 Biometrics paper by Mike Kenward and James Roger has become a citation classic (more than 1260 citations by the end of June 2013 according to Google Scholar) and the solution that they proposed to deal with the problem of significance tests of fixed effects in REML models is now incorporated in many software packages and accepted by all biostatisticians as the method of choice. Nevertheless, it does not solve all problems, since there is more to analysis than just significance and since the problems that models with more than one variance pose arise in many contexts. In this paper, I discuss some problems and applications and make some tentative suggestions as to how they may be tackled. My excuse for raising problems I do not solve is that it may inspire James and Mike to complete what they started.

February 02, 2014 doi: 10.1177/0962280214520728 open full text
Robust small area prediction for counts.
Tzavidis, N., Ranalli, M. G., Salvati, N., Dreassi, E., Chambers, R.
Statistical Methods in Medical Research: An International Review Journal. February 02, 2014

A new semiparametric approach to model-based small area prediction for counts is proposed and used for estimating the average number of visits to physicians for Health Districts in Central Italy. The proposed small area predictor can be viewed as an outlier robust alternative to the more commonly used empirical plug-in predictor that is based on a Poisson generalized linear mixed model with Gaussian random effects. Results from the real data application and from a simulation experiment confirm that the proposed small area predictor has good robustness properties and in some cases can be more efficient than alternative small area approaches.

February 02, 2014 doi: 10.1177/0962280214520731 open full text
A Bayesian network for modelling blood glucose concentration and exercise in type 1 diabetes.
Ewings, S. M., Sahu, S. K., Valletta, J. J., Byrne, C. D., Chipperfield, A. J.
Statistical Methods in Medical Research: An International Review Journal. February 02, 2014

This article presents a new statistical approach to analysing the effects of everyday physical activity on blood glucose concentration in people with type 1 diabetes. A physiologically based model of blood glucose dynamics is developed to cope with frequently sampled data on food, insulin and habitual physical activity; the model is then converted to a Bayesian network to account for measurement error and variability in the physiological processes. A simulation study is conducted to determine the feasibility of using Markov chain Monte Carlo methods for simultaneous estimation of all model parameters and prediction of blood glucose concentration. Although there are problems with parameter identification in a minority of cases, most parameters can be estimated without bias. Predictive performance is unaffected by parameter misspecification and is insensitive to misleading prior distributions. This article highlights important practical and theoretical issues not previously addressed in the quest for an artificial pancreas as treatment for type 1 diabetes. The proposed methods represent a new paradigm for analysis of deterministic mathematical models of blood glucose concentration.

February 02, 2014 doi: 10.1177/0962280214520732
Estimation of sensitivity depending on sojourn time and time spent in preclinical state.
Kim, S., Wu, D.
Statistical Methods in Medical Research: An International Review Journal. February 02, 2014

The probability model for periodic screening was extended to provide statistical inference for sensitivity depending on sojourn time, in which the sensitivity was modeled as a function of time spent in the preclinical state and the sojourn time. The likelihood function with the proposed sensitivity model was then evaluated with simulated data to check its reliability in terms of the mean estimation and the standard error. Simulation results showed that the maximum likelihood estimates of the proposed model have little bias and small standard errors. The extended probability model was further applied to the Johns Hopkins Lung Project data using both maximum likelihood estimation and Bayesian Markov chain Monte Carlo.

February 02, 2014 doi: 10.1177/0962280212465499
Comparison of treatments in a cataract surgery with circular response.
Biswas, A., Dutta, S., Laha, A. K., Bakshi, P. K.
Statistical Methods in Medical Research: An International Review Journal. January 23, 2014

Circular data are a natural outcome in many biomedical studies, e.g. some measurements in ophthalmologic studies, degrees of rotation of hand or waist, etc. With reference to a real data set on astigmatism induced in two types of cataract surgeries, we address some two-sample testing problems with the possibility of common or different concentration parameters in the circular setup. A detailed simulation study and an analysis of the data set, including redesigning the cataract surgery data, are carried out.

January 23, 2014 doi: 10.1177/0962280213519717
Graphical model-based O/E control chart for monitoring multiple outcomes from a multi-stage healthcare procedure.
Sibanda, N.
Statistical Methods in Medical Research: An International Review Journal. January 21, 2014

Most statistical process control programmes in healthcare focus on surveillance of outcomes at the final stage of a procedure, such as mortality or failure rates. Such an approach ignores the multi-stage nature of these procedures, in which a patient progresses through several stages prior to the final stage. In this paper, we introduce a novel approach to statistical process control programmes in healthcare. Our proposed approach is based on the regression adjustment and multi-stage control charts that have been in use in industrial applications for decades. Three advantages of the approach are: a better understanding of how outcomes at different stages relate to each other; explicit monitoring of upstream stage outcomes, which may help curtail trends that lead to poorer end-stage outcomes; and an understanding of the impact of each stage, which can help determine the most effective allocation of quality improvement resources. A test statistic for the control charts is proposed. Simulations are performed to test the control charts, and the results are summarised using an empirical probability of true detection. An illustrative example using data from a maternity unit is included. A main result from the simulation study is that taking a multi-stage approach makes it easier to explicitly identify shifts in upstream stage outcomes that might otherwise be signalled in final stage outcomes if dependence between stages is ignored.

January 21, 2014 doi: 10.1177/0962280213519719
Level of evidence for promising subgroup findings in an overall non-significant trial.
Tanniou, J., Tweel, I. v. d., Teerenstra, S., Roes, K. C.
Statistical Methods in Medical Research: An International Review Journal. January 20, 2014

In drug development and drug licensing, it sometimes occurs that a new drug does not demonstrate effectiveness for the full study population, but there appears to be benefit in a relevant, pre-defined subgroup. This raises the question of how strong the evidence from such a subgroup is, and which confirmatory testing strategies are the most appropriate ones. Hence, we considered the type I error and the power of a subgroup result in a trial with non-significant overall results and of suitable replication strategies. In the case of a single trial, the inflation of the overall type I error is substantial and can be up to twice as large, especially in relatively small subgroups. This also increases the risk of starting a replication trial that should not be done, if such a second trial is not already available. The overall type I error is almost controlled by using an appropriate replication strategy. This confirms the required cautious interpretation of promising subgroups, even in the case that overall trial results were perceived to be close to significance.
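
The inflation described above can be illustrated with a small Monte Carlo sketch. This is not the authors' procedure: it assumes a single trial under the null hypothesis, one-sided z-tests at level 0.025, and one pre-defined subgroup containing a fraction `subgroup_fraction` of the patients, so that the overall z-statistic is a weighted sum of independent subgroup and complement z-statistics. The function name and all parameters are hypothetical.

```python
import random
from statistics import NormalDist


def familywise_error(subgroup_fraction, alpha=0.025, n_sims=100_000, seed=1):
    """Estimate P(overall test OR subgroup test rejects) when there is no effect.

    Assumption: the overall z-statistic equals
    sqrt(f) * z_subgroup + sqrt(1 - f) * z_complement,
    where f is the subgroup fraction and both z-statistics are N(0, 1).
    """
    rng = random.Random(seed)
    crit = NormalDist().inv_cdf(1 - alpha)  # one-sided critical value
    w_sub = subgroup_fraction ** 0.5
    w_rest = (1 - subgroup_fraction) ** 0.5
    rejections = 0
    for _ in range(n_sims):
        z_sub = rng.gauss(0, 1)
        z_rest = rng.gauss(0, 1)
        z_all = w_sub * z_sub + w_rest * z_rest
        # A "success" is claimed if either the overall or the subgroup test rejects.
        if z_all > crit or z_sub > crit:
            rejections += 1
    return rejections / n_sims


if __name__ == "__main__":
    for f in (0.1, 0.5, 1.0):
        print(f, familywise_error(f))
```

Under these assumptions a small subgroup is nearly independent of the overall result, so the combined error rate approaches twice the nominal level, while a subgroup equal to the whole population adds no inflation.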

January 20, 2014 doi: 10.1177/0962280213519705
Confidence intervals for proportion difference from two independent partially validated series.
Qiu, S.-F., Poon, W.-Y., Tang, M.-L.
Statistical Methods in Medical Research: An International Review Journal. January 20, 2014

Partially validated series are common when a gold-standard test is too expensive to be applied to all subjects, and hence a fallible device is used accordingly to measure the presence of a characteristic of interest. In this article, confidence interval construction for proportion difference between two independent partially validated series is studied. Ten confidence intervals based on the method of variance estimates recovery (MOVER) are proposed, with each using the confidence limits for the two independent binomial proportions obtained by the asymptotic, Logit-transformation, Agresti–Coull and Bayesian methods. The performances of the proposed confidence intervals and three likelihood-based intervals available in the literature are compared with respect to the empirical coverage probability, confidence width and ratio of mesial non-coverage to non-coverage probability. Our empirical results show that (1) all confidence intervals exhibit good performance in large samples; (2) confidence intervals based on MOVER combining the confidence limits for binomial proportions based on Wilson, Agresti–Coull, Logit-transformation, Bayesian (with three priors) methods perform satisfactorily from small to large samples, and hence can be recommended for practical applications. Two real data sets are analysed to illustrate the proposed methods.
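
As a rough illustration of the MOVER idea mentioned above, the sketch below combines Wilson limits for two fully observed binomial proportions into an interval for their difference. It is a simplification and an assumption on our part: it does not handle partially validated series or the fallible-device misclassification that the article addresses, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def wilson_limits(successes, n, level=0.95):
    """Wilson score confidence limits for a single binomial proportion."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


def mover_difference(x1, n1, x2, n2, level=0.95):
    """MOVER interval for p1 - p2 built from the two single-sample limits."""
    p1, p2 = x1 / n1, x2 / n2
    l1, u1 = wilson_limits(x1, n1, level)
    l2, u2 = wilson_limits(x2, n2, level)
    diff = p1 - p2
    # The lower limit uses the lower limit of p1 and the upper limit of p2;
    # the upper limit uses the opposite pair.
    lower = diff - sqrt((p1 - l1) ** 2 + (u2 - p2) ** 2)
    upper = diff + sqrt((u1 - p1) ** 2 + (p2 - l2) ** 2)
    return lower, upper


if __name__ == "__main__":
    print(mover_difference(56, 70, 48, 80))
```

Swapping in Agresti–Coull, logit or Bayesian limits for `wilson_limits` would give the other MOVER variants in the same way, since only the single-sample limits change.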

January 20, 2014 doi: 10.1177/0962280213519718
A Bayesian model for joint analysis of multivariate repeated measures and time to event data in crossover trials.
Liu, F., Li, Q.
Statistical Methods in Medical Research: An International Review Journal. January 20, 2014

Joint modeling of longitudinal and survival data has become a popular technique in analyzing longitudinal clinical trials. In this discussion, the potentials of joint modeling are explored for analyzing time to event and multivariate repeated measures in crossover studies. The work is motivated by a real-life crossover study with three visual analog scale responses and a time to event response. To recover the information lost due to censoring of the time to event variable, we propose a Bayesian joint model to analyze the visual analog scale and time to event responses jointly, leveraging the moderate associations among the responses. The joint model links the time to event variable to the visual analog scale repeated measures via multi-layered subject-specific random effects. We show that, in general, the Bayesian joint model produces more efficient inferences with satisfactory goodness of fit compared with modeling the visual analog scale and time to event responses separately. A simulation study is performed to demonstrate the inferential advantages of the Bayesian joint model over separate modeling and maximum likelihood approaches via non-linear mixed modeling in the crossover setting. This work also demonstrates the flexibility and usefulness of zero-one inflated beta regression in modeling non-Gaussian fixed-boundaries-inflated outcomes in general.

January 20, 2014 doi: 10.1177/0962280213519594
Notes on testing equality and interval estimation in Poisson frequency data under a three-treatment three-period crossover trial.
Lui, K.-J., Chang, K.-C.
Statistical Methods in Medical Research: An International Review Journal. January 17, 2014

When the frequency of event occurrences follows a Poisson distribution, we develop procedures for testing equality of treatments and interval estimators for the ratio of mean frequencies between treatments under a three-treatment three-period crossover design. Using Monte Carlo simulations, we evaluate the performance of these test procedures and interval estimators in various situations. We note that all test procedures developed here can perform well with respect to Type I error even when the number of patients per group is moderate. We further note that the two weighted-least-squares (WLS) test procedures derived here are generally preferable to the other two commonly used test procedures in the contingency table analysis. We also demonstrate that both interval estimators based on the WLS method and interval estimators based on Mantel-Haenszel (MH) approach can perform well, and are essentially of equal precision with respect to the average length. We use a double-blind randomized three-treatment three-period crossover trial comparing salbutamol and salmeterol with a placebo with respect to the number of exacerbations of asthma to illustrate the use of these test procedures and estimators.

January 17, 2014 doi: 10.1177/0962280213519249
Summary measure of discrimination in survival models based on cumulative/dynamic time-dependent ROC curves.
Lambert, J., Chevret, S.
Statistical Methods in Medical Research: An International Review Journal. January 05, 2014

Assessments of the discriminative performance of prognostic models have led to the development of several measures that extend the concept of discrimination as evaluated by the receiver operating characteristic curve and the area under the receiver operating characteristic curve (AUC) of diagnostic settings. Thus, several time-dependent receiver operating characteristic curves and AUC(t) measures have been proposed. One of the most used, the cumulative/dynamic AUC^C,D(t), is the probability that, given two randomly chosen patients, one having failed before t and the other having failed after t, the prognostic marker will be correctly ranked. In this paper, we propose a weighted AUC^C,D(t) with time- and data-dependent weights as a summary measure of the mean AUC^C,D(t), restricted to a finite time range to ensure its clinical relevance. A simulation study shows that the estimated restricted mean AUC increased with the strength of association of the covariate with the outcome, with low impact of censoring, and adequate coverage of bootstrap confidence intervals. We apply this methodology to two real datasets from two randomized clinical trials to assess the prognostic factors of the overall mortality in patients who have compensated cirrhosis and to assess the prognostic factors of event-free survival in patients who have acute myeloid leukemia.
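
The pairwise definition of the cumulative/dynamic AUC at a single time t can be sketched as follows. This is a simplified illustration, not the estimator from the article: it assumes fully observed failure times (no censoring), that higher marker values indicate earlier failure, and that ties in the marker count as half. The function name is hypothetical, and the article's time- and data-dependent weighting across t is not reproduced.

```python
def cumulative_dynamic_auc(times, markers, t):
    """Proportion of (case, control) pairs ranked correctly at time t.

    Cases are subjects who failed at or before t; controls failed after t.
    Assumes uncensored failure times and "higher marker = higher risk".
    """
    cases = [m for T, m in zip(times, markers) if T <= t]
    controls = [m for T, m in zip(times, markers) if T > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")
    score = 0.0
    for m_case in cases:
        for m_control in controls:
            if m_case > m_control:
                score += 1.0  # correctly ranked pair
            elif m_case == m_control:
                score += 0.5  # tie counted as half
    return score / (len(cases) * len(controls))


if __name__ == "__main__":
    print(cumulative_dynamic_auc([1, 2, 3, 4], [3, 1, 2, 0], t=2))
```

With censored data, practical estimators replace the simple case and control indicators with weighted versions, which this sketch deliberately leaves out.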

January 05, 2014 doi: 10.1177/0962280213515571
Spatial generalised linear mixed models based on distances.
Melo, O. O., Mateu, J., Melo, C. E.
Statistical Methods in Medical Research: An International Review Journal. December 24, 2013

Risk models derived from environmental data have been widely shown to be effective in delineating geographical areas of risk because they are intuitively easy to understand. We present a new method based on distances, which allows the modelling of continuous and non-continuous random variables through distance-based spatial generalised linear mixed models. The parameters are estimated using Markov chain Monte Carlo maximum likelihood, which is a feasible and useful technique. The proposed method depends on a detrending step built from continuous or categorical explanatory variables, or a mixture of them, by using an appropriate Euclidean distance. The method is illustrated through the analysis of the variation in the prevalence of Loa loa among a sample of village residents in Cameroon, where the explanatory variables included elevation, together with maximum normalised-difference vegetation index and the standard deviation of normalised-difference vegetation index calculated from repeated satellite scans over time.

December 24, 2013 doi: 10.1177/0962280213515792
Jackknife empirical likelihood confidence regions for the evaluation of continuous-scale diagnostic tests with verification bias.
Wang, B., Qin, G.
Statistical Methods in Medical Research: An International Review Journal. December 24, 2013

Recently, Wang and Qin proposed various bias-corrected empirical likelihood confidence regions for any two of the three parameters, sensitivity, specificity, and cut-off value, with the remaining parameter fixed at a given value in the evaluation of a continuous-scale diagnostic test with verification bias. In order to apply those methods, quantiles of the limiting weighted chi-squared distributions of the empirical log-likelihood ratio statistics should be estimated. In order to facilitate application and reduce computation burden, in this paper, jackknife empirical likelihood-based methods are proposed for any pairs of sensitivity, specificity and cut-off value, and asymptotic results can be derived accordingly. The proposed methods can be easily implemented to construct confidence regions for the evaluation of continuous-scale diagnostic tests with verification bias. Simulation studies are conducted to evaluate the finite sample performance and robustness of the proposed jackknife empirical likelihood-based confidence regions in terms of coverage probabilities. Finally, a real case analysis is provided to illustrate the application of new methods.

December 24, 2013 doi: 10.1177/0962280213515652
Use of auxiliary covariates in estimating a biomarker-adjusted treatment effect model with clinical trial data.
Zhang, Z., Qu, Y., Zhang, B., Nie, L., Soon, G.
Statistical Methods in Medical Research: An International Review Journal. December 16, 2013

A biomarker-adjusted treatment effect (BATE) model describes the effect of one treatment versus another on a subpopulation of patients defined by a biomarker. Such a model can be estimated from clinical trial data without relying on additional modeling assumptions, and the estimator can be made more efficient by incorporating information on the main effect of the biomarker on the outcome of interest. Motivated by an HIV trial known as THRIVE, we consider the use of auxiliary covariates, which are usually available in clinical trials and have been used in overall treatment comparisons, in estimating a BATE model. Such covariates can be incorporated using an existing augmentation technique. For a specific type of estimating functions for difference-based BATE models, the optimal augmentation depends only on the joint main effects of marker and covariates. For a ratio-based BATE model, this result holds in special cases but not in general; however, simulation results suggest that the augmentation based on the joint main effects of marker and covariates is virtually equivalent to the theoretically optimal augmentation, especially when the augmentation terms are estimated from data. Application of these methods and results to the THRIVE data yields new insights on the utility of baseline CD4 cell count and viral load as predictive or treatment selection markers.

December 16, 2013 doi: 10.1177/0962280213515572
Rasch-family models are more valuable than score-based approaches for analysing longitudinal patient-reported outcomes with missing data.
de Bock, E., Hardouin, J.-B., Blanchin, M., Le Neel, T., Kubis, G., Bonnaud-Antignac, A., Dantan, E., Sebille, V.
Statistical Methods in Medical Research: An International Review Journal. December 16, 2013

The objective was to compare classical test theory and Rasch-family models derived from item response theory for the analysis of longitudinal patient-reported outcomes data with possibly informative intermittent missing items. A simulation study was performed in order to assess and compare the performance of classical test theory and Rasch model in terms of bias, control of the type I error and power of the test of time effect. The type I error was controlled for classical test theory and Rasch model whether data were complete or some items were missing. Both methods were unbiased and displayed similar power with complete data. When items were missing, Rasch model remained unbiased and displayed higher power than classical test theory. Rasch model performed better than the classical test theory approach regarding the analysis of longitudinal patient-reported outcomes with possibly informative intermittent missing items mainly for power. This study highlights the interest of Rasch-based models in clinical research and epidemiology for the analysis of incomplete patient-reported outcomes data.

December 16, 2013 doi: 10.1177/0962280213515570
Multilevel models for cost-effectiveness analyses that use cluster randomised trial data: An approach to model choice.
Ng, E. S.-W., Diaz-Ordaz, K., Grieve, R., Nixon, R. M., Thompson, S. G., Carpenter, J. R.
Statistical Methods in Medical Research: An International Review Journal. December 16, 2013

Multilevel models provide a flexible modelling framework for cost-effectiveness analyses that use cluster randomised trial data. However, there is a lack of guidance on how to choose the most appropriate multilevel models. This paper illustrates an approach for deciding what level of model complexity is warranted; in particular how best to accommodate complex variance–covariance structures, right-skewed costs and missing data. Our proposed models differ according to whether or not they allow individual-level variances and correlations to differ across treatment arms or clusters and by the assumed cost distribution (Normal, Gamma, Inverse Gaussian). The models are fitted by Markov chain Monte Carlo methods. Our approach to model choice is based on four main criteria: the characteristics of the data, model pre-specification informed by the previous literature, diagnostic plots and assessment of model appropriateness. This is illustrated by re-analysing a previous cost-effectiveness analysis that uses data from a cluster randomised trial. We find that the most useful criterion for model choice was the deviance information criterion, which distinguishes amongst models with alternative variance–covariance structures, as well as between those with different cost distributions. This strategy for model choice can help cost-effectiveness analyses provide reliable inferences for policy-making when using cluster trials, including those with missing data.

December 16, 2013 doi: 10.1177/0962280213511719
Estimating and testing interactions when explanatory variables are subject to non-classical measurement error.
Murad, H., Kipnis, V., Freedman, L. S.
Statistical Methods in Medical Research: An International Review Journal. December 11, 2013

Assessing interactions in linear regression models when covariates have measurement error (ME) is complex.
We previously described regression calibration (RC) methods that yield consistent estimators and standard errors for interaction coefficients of normally distributed covariates having classical ME. Here we extend normal based RC (NBRC) and linear RC (LRC) methods to a non-classical ME model, and describe more efficient versions that combine estimates from the main study and internal sub-study. We apply these methods to data from the Observing Protein and Energy Nutrition (OPEN) study. Using simulations we show that (i) for normally distributed covariates efficient NBRC and LRC were nearly unbiased and performed well with sub-study size ≥200; (ii) efficient NBRC had lower MSE than efficient LRC; (iii) the naïve test for a single interaction had type I error probability close to the nominal significance level, whereas efficient NBRC and LRC were slightly anti-conservative but more powerful; (iv) for markedly non-normal covariates, efficient LRC yielded less biased estimators with smaller variance than efficient NBRC. Our simulations suggest that it is preferable to use: (i) efficient NBRC for estimating and testing interaction effects of normally distributed covariates and (ii) efficient LRC for estimating and testing interactions for markedly non-normal covariates.

December 11, 2013 doi: 10.1177/0962280213509720
Causal inference with missing exposure information: Methods and applications to an obstetric study.
Zhang, Z., Liu, W., Zhang, B., Tang, L., Zhang, J.
Statistical Methods in Medical Research: An International Review Journal. December 05, 2013

Causal inference in observational studies is frequently challenged by the occurrence of missing data, in addition to confounding. Motivated by the Consortium on Safe Labor, a large observational study of obstetric labor practice and birth outcomes, this article focuses on the problem of missing exposure information in a causal analysis of observational data. This problem can be approached from different angles (i.e. missing covariates and causal inference), and useful methods can be obtained by drawing upon the available techniques and insights in both areas. In this article, we describe and compare a collection of methods based on different modeling assumptions, under standard assumptions for missing data (i.e. missing-at-random and positivity) and for causal inference with complete data (i.e. no unmeasured confounding and another positivity assumption). These methods involve three models: one for treatment assignment, one for the dependence of outcome on treatment and covariates, and one for the missing data mechanism. In general, consistent estimation of causal quantities requires correct specification of at least two of the three models, although there may be some flexibility as to which two models need to be correct. Such flexibility is afforded by doubly robust estimators adapted from the missing covariates literature and the literature on causal inference with complete data, and by a newly developed triply robust estimator that is consistent if any two of the three models are correct. The methods are applied to the Consortium on Safe Labor data and compared in a simulation study mimicking the Consortium on Safe Labor.

December 05, 2013 doi: 10.1177/0962280213513758
The choice of test in phase II cancer trials assessing continuous tumour shrinkage when complete responses are expected.
Wason, J. M., Mander, A. P.
Statistical Methods in Medical Research: An International Review Journal. November 28, 2013

Traditionally, phase II cancer trials test a binary endpoint formed from a dichotomisation of the continuous change in tumour size. Directly testing the continuous endpoint provides considerable gains in power, although it also raises several statistical issues. One such issue is when complete responses, i.e. complete tumour removal, are observed in multiple patients; this is a problem when normality is assumed. Using simulated data and a recently published phase II trial, we investigate how the choice of test affects the operating characteristics of the trial. We propose using parametric tests based on the censored normal distribution, comparing them to the t-test and Wilcoxon non-parametric test. The censored normal distribution fits the real dataset well, but simulations indicate its type-I error rate is inflated, and its power is only slightly higher than the t-test. The Wilcoxon test has deflated type I error. For two-arm designs, the differences are much smaller. We conclude that the t-test is suitable for use when complete responses are present, although positively skewed data can result in the non-parametric test having higher power.

November 28, 2013 doi: 10.1177/0962280211432192
Some recommendations for multi-arm multi-stage trials.
Wason, J., Magirr, D., Law, M., Jaki, T.
Statistical Methods in Medical Research: An International Review Journal. November 28, 2013

Multi-arm multi-stage designs can improve the efficiency of the drug-development process by evaluating multiple experimental arms against a common control within one trial. This reduces the number of patients required compared to a series of trials testing each experimental arm separately against control. By allowing for multiple stages experimental treatments can be eliminated early from the study if they are unlikely to be significantly better than control. Using the TAILoR trial as a motivating example, we explore a broad range of statistical issues related to multi-arm multi-stage trials including a comparison of different ways to power a multi-arm multi-stage trial; choosing the allocation ratio to the control group compared to other experimental arms; the consequences of adding additional experimental arms during a multi-arm multi-stage trial, and how one might control the type-I error rate when this is necessary; and modifying the stopping boundaries of a multi-arm multi-stage design to account for unknown variance in the treatment outcome. Multi-arm multi-stage trials represent a large financial investment, and so considering their design carefully is important to ensure efficiency and that they have a good chance of succeeding.

November 28, 2013 doi: 10.1177/0962280212465498
Multiple imputation in the presence of high-dimensional data.
Zhao, Y., Long, Q.
Statistical Methods in Medical Research: An International Review Journal. November 25, 2013

Missing data are frequently encountered in biomedical, epidemiologic and social research. It is well known that a naive analysis without adequate handling of missing data may lead to bias and/or loss of efficiency. Partly due to its ease of use, multiple imputation has become increasingly popular in practice for handling missing data. However, it is unclear what the best strategy is for conducting multiple imputation in the presence of high-dimensional data. To answer this question, we investigate several approaches that use regularized regression and Bayesian lasso regression to impute missing values in the presence of high-dimensional data. We compare the performance of these methods through numerical studies, in which we also evaluate the impact of the dimension of the data, the size of the true active set for imputation, and the strength of correlation. Our numerical studies show that in the presence of high-dimensional data the standard multiple imputation approach performs poorly and the imputation approach using Bayesian lasso regression achieves, in most cases, better performance than the other imputation methods including the standard imputation approach using the correctly specified imputation model. Our results suggest that Bayesian lasso regression and its extensions are better suited for multiple imputation in the presence of high-dimensional data than the other regression methods.

November 25, 2013 doi: 10.1177/0962280213511027
Composite growth model applied to human oral and pharyngeal structures and identifying the contribution of growth types.
Wang, Y., Chung, M. K., Vorperian, H. K.
Statistical Methods in Medical Research: An International Review Journal. November 13, 2013

The growth patterns of different anatomic structures in the human body vary in terms of growth amount over time, growth rate and growth periods. The oral and pharyngeal structures, also known as vocal tract structures, are housed in the craniofacial complex where the cranium/brain follows a distinct neural growth pattern, and the face follows a distinct somatic or skeletal growth pattern. Thus, it is reasonable to expect the oral and pharyngeal structures to follow a combined or mixed growth pattern. Existing parametric growth models are limited in that they are mainly focused on modeling one particular type of growth pattern. In this paper, we propose a novel composite growth model using neural and somatic baseline curves to fit the combined growth pattern of select vocal tract structures. The method can also determine the overall percent contribution of each of the growth types.

November 13, 2013 doi: 10.1177/0962280213508849
Normalization of mean squared differences to measure agreement for continuous data.
Almehrizi, R.
Statistical Methods in Medical Research: An International Review Journal. November 06, 2013

Agreement among observations on two variables for reliability or validation purposes is usually assessed by the evaluation of the mean squared differences (MSD). Many transformations of MSD have been proposed to interpret and make statistical inferences about the agreement between the two variables, including the concordance correlation coefficient (CCC) and the random marginal agreement coefficient (RMAC). This paper presents a normalization of MSD based on a reference range and uses it to derive CCC and RMAC (or ACC alternatively). The normalization of MSD enables the comparison between these two coefficients. The paper compares thoroughly the differences between these two coefficients and their properties at different agreement levels. Results show that ACC has promising properties over CCC. Monte Carlo simulations as well as real data applications are performed. ACC for more than two variables is also derived.
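
To make the link between MSD and CCC concrete, the sketch below computes the sample MSD and the standard (Lin) concordance correlation coefficient written as one minus MSD divided by the MSD expected if the two variables were uncorrelated. It uses population (1/n) moments and hypothetical function names, and it does not reproduce the article's reference-range normalization or its ACC/RMAC coefficient.

```python
def msd(x, y):
    """Mean squared difference between paired observations."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)


def ccc(x, y):
    """Concordance correlation coefficient as a normalized MSD.

    With 1/n moments, MSD = var(x) + var(y) + (mean(x) - mean(y))**2 - 2*cov(x, y),
    so dividing by the zero-covariance value and subtracting from 1 gives
    2*cov / (var(x) + var(y) + (mean(x) - mean(y))**2).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    msd_if_uncorrelated = var_x + var_y + (mean_x - mean_y) ** 2
    return 1 - msd(x, y) / msd_if_uncorrelated


if __name__ == "__main__":
    print(ccc([1.0, 2.0, 3.0, 4.0], [1.1, 2.3, 2.9, 4.2]))
```

Perfect agreement gives a value of 1, and a constant shift between the two variables lowers the coefficient even when their correlation is perfect.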

November 06, 2013 doi: 10.1177/0962280213507506
Propensity score estimators for the average treatment effect and the average treatment effect on the treated may yield very different estimates.
Pirracchio, R., Carone, M., Rigon, M. R., Caruana, E., Mebazaa, A., Chevret, S.
Statistical Methods in Medical Research: An International Review Journal. November 06, 2013

Objective
Propensity score matching is typically used to estimate the average treatment effect for the treated while inverse probability of treatment weighting aims at estimating the population average treatment effect. We illustrate how different estimands can result in very different conclusions.
Study design
We applied the two propensity score methods to assess the effect of continuous positive airway pressure on mortality in patients hospitalized for acute heart failure. We used Monte Carlo simulations to investigate the important differences in the two estimates.
Results
Continuous positive airway pressure application increased hospital mortality overall, but no continuous positive airway pressure effect was found on the treated. Potential reasons were (1) violation of the positivity assumption; (2) treatment effect was not uniform across the distribution of the propensity score. From simulations, we concluded that positivity bias was of limited magnitude and did not explain the large differences in the point estimates. However, when treatment effect varies according to the propensity score (E[Y(1)–Y(0)|g(X)] is not constant, Y being the outcome and g(X) the propensity score), propensity score matching ATT estimate could strongly differ from the inverse probability of treatment weighting-average treatment effect estimate. We show that this empirical result is supported by theory.
Conclusion
Although both approaches are recommended as valid methods for causal inference, propensity score-matching for ATT and inverse probability of treatment weighting for average treatment effect yield substantially different estimates of treatment effect. The choice of the estimand should drive the choice of the method.
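
A toy calculation can show why the two estimands diverge when the treatment effect varies with the propensity score. The sketch assumes a population split into strata, each with a known population share, propensity score and true treatment effect; these numbers and the function name are purely illustrative and are not taken from the study.

```python
def ate_and_att(strata):
    """Return (ATE, ATT) for strata given as (share, propensity, effect).

    ATE averages the stratum effects over the whole population.
    ATT averages them over the treated, i.e. weights each stratum by
    share * propensity.
    """
    total = sum(share for share, _, _ in strata)
    ate = sum(share * effect for share, _, effect in strata) / total
    treated_mass = sum(share * prop for share, prop, _ in strata)
    att = sum(share * prop * effect for share, prop, effect in strata) / treated_mass
    return ate, att


if __name__ == "__main__":
    # Hypothetical strata: the effect is larger where treatment is more likely.
    varying = [(0.5, 0.2, 0.0), (0.5, 0.8, 1.0)]
    constant = [(0.5, 0.2, 0.3), (0.5, 0.8, 0.3)]
    print(ate_and_att(varying))
    print(ate_and_att(constant))
```

When the effect is the same in every stratum the two estimands coincide; they separate only when the effect changes with the propensity score, which matches the condition on E[Y(1)–Y(0)|g(X)] stated in the Results.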

November 06, 2013 doi: 10.1177/0962280213507034
Bayesian design for dichotomous repeated measurements with autocorrelation.
Abebe, H. T., Tan, F. E., Breukelen, G. J. V., Berger, M. P.
Statistical Methods in Medical Research: An International Review Journal. October 28, 2013

In medicine and health sciences, binary outcomes are often measured repeatedly to study their change over time. A problem for such studies is that designs with an optimal efficiency for some parameter values may not be efficient for other values. To handle this problem, we propose Bayesian designs which formally account for the uncertainty in the parameter values for a mixed logistic model which allows quadratic changes over time. Bayesian D-optimal allocations of time points are computed for different priors, costs, covariance structures and values of the autocorrelation. Our results show that the optimal number of time points increases with the subject-to-measurement cost ratio, and that neither the optimal number of time points nor the optimal allocations of time points appear to depend strongly on the prior, the covariance structure or on the size of the autocorrelation. It also appears that four equidistant time points are highly efficient for subject-to-measurement cost ratios up to five, and five or six equidistant time points for larger cost ratios. Our results are compared with the actual design of a respiratory infection study in Indonesia, and it is shown that selecting a Bayesian optimal design will increase efficiency, especially for small cost ratios.

October 28, 2013 doi: 10.1177/0962280213508850 open full text
A Bayesian network meta-analysis for binary outcome: how to do it.
Greco, T., Landoni, G., Biondi-Zoccai, G., D'Ascenzo, F., Zangrillo, A.
Statistical Methods in Medical Research: An International Review Journal. October 28, 2013

This study presents an overview of conceptual and practical issues of a network meta-analysis (NMA), particularly focusing on its application to randomised controlled trials with a binary outcome of interest. We start from general considerations on NMA to specifically appraise how to collect study data, structure the analytical network and specify the requirements for different models and parameter interpretations, with the ultimate goal of providing physicians and clinician-investigators a practical tool to understand pros and cons of NMA. Specifically, we outline the key steps, from the literature search to sensitivity analysis, necessary to perform a valid NMA of binomial data, exploiting Markov Chain Monte Carlo approaches. We also apply this analytical approach to a case study on the beneficial effects of volatile agents compared to total intravenous anaesthetics for surgery to further clarify the statistical details of the models, diagnostics and computations. Finally, datasets and models for the freeware WinBUGS package are presented for the anaesthetic agent example.

October 28, 2013 doi: 10.1177/0962280213500185 open full text
Studying noncollapsibility of the odds ratio with marginal structural and logistic regression models.
Pang, M., Kaufman, J. S., Platt, R. W.
Statistical Methods in Medical Research: An International Review Journal. October 09, 2013

One approach to quantifying the magnitude of confounding in observational studies is to compare estimates with and without adjustment for a covariate, but this strategy is known to be defective for noncollapsible measures such as the odds ratio. Comparing estimates from marginal structural and standard logistic regression models, the total difference between crude and conditional effects can be decomposed into the sum of a noncollapsibility effect and confounding bias. We provide an analytic approach to assess the noncollapsibility effect in a point-exposure study and provide a general formula for expressing the noncollapsibility effect. Next, we provide a graphical approach that illustrates the relationship between the noncollapsibility effect and the baseline risk, and reveals the behavior of the noncollapsibility effect for a range of different exposure and covariate effects. Various observations about noncollapsibility can be made from the different scenarios with or without confounding; for example, the magnitude of effect of the covariate plays a more important role in the noncollapsibility effect than does that of the effect of the exposure. In order to explore the noncollapsibility effect of the odds ratio in the presence of time-varying confounding, we simulated an observational cohort study. The magnitude of noncollapsibility was generally comparable to the effect in the point-exposure study in our simulation settings. Finally, in an applied example we demonstrate that collapsibility can have an important impact on estimation in practice.
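A minimal numerical sketch of noncollapsibility without confounding follows. It assumes a hypothetical logistic model with one binary covariate Z that is independent of the exposure A; the coefficients are illustrative and are not taken from the paper.

```python
import math

def expit(x):
    """Inverse logit."""
    return 1.0 / (1.0 + math.exp(-x))

def odds(p):
    return p / (1.0 - p)

# Assumed model: logit P(Y=1 | A, Z) = b0 + bA*A + bZ*Z,
# with P(Z=1) = 0.5 and Z independent of A (so there is no confounding).
b0, bA, bZ = -1.0, 1.0, 2.0
p_z = 0.5

# The conditional odds ratio is the same in both strata of Z.
conditional_or = math.exp(bA)

def marginal_risk(a):
    """Risk under exposure level a, averaged over the distribution of Z."""
    return (1 - p_z) * expit(b0 + bA * a) + p_z * expit(b0 + bA * a + bZ)

marginal_or = odds(marginal_risk(1)) / odds(marginal_risk(0))

print(f"conditional OR = {conditional_or:.3f}")
print(f"marginal OR    = {marginal_or:.3f}")
```

With these assumed values the marginal odds ratio (about 2.23) is closer to 1 than the conditional odds ratio (about 2.72), although Z is not a confounder. The gap is purely a noncollapsibility effect.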

October 09, 2013 doi: 10.1177/0962280213505804 open full text
Item response theory and structural equation modelling for ordinal data: Describing the relationship between KIDSCREEN and Life-H.
Titman, A. C., Lancaster, G. A., Colver, A. F.
Statistical Methods in Medical Research: An International Review Journal. October 09, 2013

Both item response theory and structural equation models are useful in the analysis of ordered categorical responses from health assessment questionnaires. We highlight the advantages and disadvantages of the item response theory and structural equation modelling approaches to modelling ordinal data, from within a community health setting. Using data from the SPARCLE project focussing on children with cerebral palsy, this paper investigates the relationship between two ordinal rating scales, the KIDSCREEN, which measures quality-of-life, and Life-H, which measures participation. Practical issues relating to fitting models, such as non-positive definite observed or fitted correlation matrices, and approaches to assessing model fit are discussed. Item response theory models allow properties such as the conditional independence of particular domains of a measurement instrument to be assessed. When, as with the SPARCLE data, the latent traits are multidimensional, structural equation models generally provide a much more convenient modelling framework.

October 09, 2013 doi: 10.1177/0962280213504177 open full text
Longitudinal prostate-specific antigen reference ranges: Choosing the underlying model of age-related changes.
Simpkin, A. J., Metcalfe, C., Martin, R. M., Lane, J. A., Donovan, J. L., Hamdy, F. C., Neal, D. E., Tilling, K.
Statistical Methods in Medical Research: An International Review Journal. October 09, 2013

Serial measurements of prostate-specific antigen (PSA) are used as a biomarker for men diagnosed with prostate cancer following an active monitoring programme. Distinguishing pathological changes from natural age-related changes is not straightforward. Here, we compare four approaches to modelling age-related change in PSA with the aim of developing reference ranges for repeated measures of PSA. A suitable model for PSA reference ranges must satisfy two criteria. First, it must offer an accurate description of the trend of PSA on average and in individuals. Second, it must be able to make accurate predictions about new PSA observations for an individual and about the entire PSA trajectory for a new individual.

October 09, 2013 doi: 10.1177/0962280213503928 open full text
Robust inference for mixed censored and binary response models with missing covariates.
Sarkar, A., Das, K., Sinha, S. K.
Statistical Methods in Medical Research: An International Review Journal. October 09, 2013

In biomedical and epidemiological studies, the outcomes obtained are often a mixture of discrete and continuous variables. Furthermore, for technical or other reasons, continuous responses may be censored and some covariates may not be completely observed. In this paper, we develop a model to tackle these complex situations. Our methodology is developed in a more general framework and provides a full-scale robust analysis of such complex models. The proposed robust maximum likelihood estimators of the model parameters are resistant to potential outliers in the data. We discuss the asymptotic properties of the robust estimators. To avoid computational difficulties involving irreducibly high-dimensional integrals, we propose a Monte Carlo method based on the Metropolis algorithm for approximating the robust maximum likelihood estimators. We study the empirical properties of these estimators in simulations. We also illustrate the proposed robust method using clustered data on blood sugar content from a clinical trial of individuals who were investigated for diabetes.

October 09, 2013 doi: 10.1177/0962280213503924 open full text
Power considerations for trials of two experimental arms versus a standard active control or placebo.
Hasselblad, V.
Statistical Methods in Medical Research: An International Review Journal. September 18, 2013

The power of the two-experimental arm trial depends on three choices: (1) when one arm is dropped (if at all); (2) the final testing procedure, assuming no dropping; and (3) the sampling ratio for the three arms. Multiple-arm designs require critical values which were calculated using Mathematica. Power calculations were exact based on probabilities from binomial distributions. The "drop the loser" strategy is optimal for the primary endpoint. The equal sized two treated arm trial gives reasonable power for the primary as well as good power to select the best treated arm. The best power was provided by the 3:3:4 sampling, but it was only marginally better.

September 18, 2013 doi: 10.1177/0962280213503916 open full text
Efficient treatment allocation in two-way nested designs.
Lemme, F., van Breukelen, G. J., Berger, M. P.
Statistical Methods in Medical Research: An International Review Journal. September 12, 2013

Cluster randomized and multicenter trials sometimes combine two treatments A and B in a factorial design, with conditions such as A, B, A and B, or none. This results in a two-way nested design. The usual issue of sample size and power now arises for various clinically relevant contrast hypotheses. Assuming a fixed total sample size at each level (number of clusters or centers, number of patients), we derive the optimal proportion of the total sample to be allocated to each treatment arm. We consider treatment assignment first at the highest level (cluster randomized trial) and then at the lowest level (multicenter trial). We derive the optimal allocation ratio for various sets of clinically relevant hypotheses. We then evaluate the efficiency of each allocation and show that the popular balanced design is optimal or highly efficient for a range of research questions except for contrasting one treatment arm with all other treatment arms. We finally present simple equations for the total sample size needed to test each effect of interest in a balanced design, as a function of effect size, power and type I error α. All results are illustrated on a cluster-randomized trial on smoking prevention in primary schools and on a multicenter trial on lifestyle improvement in general practices.

September 12, 2013 doi: 10.1177/0962280213502145 open full text
Measuring and estimating treatment effect on dichotomous outcome of a population.
Wang, X., Jin, Y., Yin, L.
Statistical Methods in Medical Research: An International Review Journal. September 03, 2013

In different studies for treatment effect on dichotomous outcome of a certain population, one uses different regression models, leading to different measures of the treatment effect. In observational studies, the common measures of the treatment effect are: the conditional risk difference based on a linear model, the conditional risk ratio based on a log-linear model, and the conditional odds ratio based on a logistic model; in randomized trials, the common measures are: the marginal risk difference based on a linear model, the marginal risk ratio based on a log-linear model, and the marginal odds ratio based on a logistic model. In this article, we instead express these measures in terms of the risk of a dichotomous outcome conditional on covariates and treatment, where the risk is then described by a regression model. These expressions of the measures do not explicitly depend on the regression model. As a result, we are able to use one regression model in one study to estimate all these measures by maximum likelihood. We show that these measures have causal interpretations and reflect different aspects of the same underlying treatment effect under the assumption of no unmeasured confounding covariate given observed covariates. We get confidence intervals for these measures by finding approximate distributions of the maximum likelihood estimates of these measures. As an illustration, we estimate these measures for the effect of a triple therapy on eradication of Helicobacter pylori among Vietnamese children and are able to compare the treatment effect in this study with those in other studies.
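As a rough illustration of measures expressed through the risk alone, the usual standardization formulas can be written as follows. The notation (pi, mu_a, A, X) is assumed here for illustration, and the paper's exact expressions may differ.

```latex
% Assumed notation: A = treatment (0/1), X = covariates, Y = dichotomous outcome.
\pi(a, x) = P(Y = 1 \mid A = a, X = x), \qquad
\mu_a = E_X\{\pi(a, X)\}, \quad a \in \{0, 1\}.

% Marginal measures, obtained by averaging the risk over the covariate distribution:
\mathrm{RD} = \mu_1 - \mu_0, \qquad
\mathrm{RR} = \frac{\mu_1}{\mu_0}, \qquad
\mathrm{OR} = \frac{\mu_1 / (1 - \mu_1)}{\mu_0 / (1 - \mu_0)}.

% Conditional measures at a covariate value x:
\mathrm{RD}(x) = \pi(1, x) - \pi(0, x), \qquad
\mathrm{RR}(x) = \frac{\pi(1, x)}{\pi(0, x)}, \qquad
\mathrm{OR}(x) = \frac{\pi(1, x) / \{1 - \pi(1, x)\}}{\pi(0, x) / \{1 - \pi(0, x)\}}.
```

None of these expressions depends on the form of the regression model used for pi(a, x), so a single fitted model can in principle supply plug-in estimates of all of them.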

September 03, 2013 doi: 10.1177/0962280213502146 open full text
Estimation of half-life periods in nonlinear data with fractional polynomials.
Mayer, B., Keller, F., Syrovets, T., Wittau, M.
Statistical Methods in Medical Research: An International Review Journal. September 02, 2013

Regression models are frequently used to model the functional relationship between an outcome parameter of interest and one or more potentially relevant explanatory variables. The objective can be to set up a prognostic model, for example, or an estimation model for a certain parameter of interest. Determining half-life periods can be viewed as a particular application of such an estimation model. However, specific to these modelling problems is that time-dependent active agent concentrations can be nonlinear. At the same time, a major limitation of common regression approaches is the assumed linear relation between the investigated variables. Therefore, a more flexible approach is required to handle the problem of finding a model which fits the data adequately. One possibility is the use of fractional polynomials. The application of this modelling approach in a univariate setting is proposed in order to have an appropriate data model which subsequently serves as an estimation model for half-life periods. This estimation model includes Ridders’ method which is based on a regula falsi approach, a standard methodology of numerical analysis. The suggested procedure is applied to real data examples of antibiotic tissue concentrations in visceral surgery, nephropharmacology and clinical pharmacology and is furthermore compared to simple approaches of modelling nonlinear data.
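The root-finding step can be sketched as follows. This is a generic implementation of Ridders' method, not the authors' code. The fitted curve is a hypothetical fractional polynomial for the log concentration with powers 0.5 and 1 and made-up coefficients, and the half-life is taken here to be the time at which the concentration falls to half its value at t = 0, which is an assumption about the target equation.

```python
import math

def ridders(f, a, b, tol=1e-12, max_iter=100):
    """Find a root of f in the bracket [a, b] with Ridders' method."""
    fa, fb = f(a), f(b)
    if fa == 0.0:
        return a
    if fb == 0.0:
        return b
    if fa * fb > 0:
        raise ValueError("root is not bracketed by [a, b]")
    x_old = None
    x = 0.5 * (a + b)
    for _ in range(max_iter):
        m = 0.5 * (a + b)
        fm = f(m)
        s = math.sqrt(fm * fm - fa * fb)  # positive because fa * fb < 0
        sign = 1.0 if fa > fb else -1.0
        x = m + (m - a) * sign * fm / s   # exponentially corrected false-position step
        fx = f(x)
        if fx == 0.0 or (x_old is not None and abs(x - x_old) < tol):
            return x
        x_old = x
        # Keep the tightest sub-interval that still brackets the root.
        if fm * fx < 0:
            a, fa, b, fb = m, fm, x, fx
        elif fa * fx < 0:
            b, fb = x, fx
        else:
            a, fa = x, fx
    return x

# Hypothetical fitted model (illustrative coefficients only):
# log c(t) = 3.0 - 0.4 * sqrt(t) - 0.1 * t
def concentration(t):
    return math.exp(3.0 - 0.4 * math.sqrt(t) - 0.1 * t)

# Assumed definition: concentration drops to half of its value at t = 0.
target = concentration(0.0) / 2.0
t_half = ridders(lambda t: concentration(t) - target, 0.0, 24.0)
print(f"estimated half-life: {t_half:.4f} time units")
```

For this particular curve the half-life also has a closed form (a quadratic in the square root of t), which gives a convenient check on the numerical result.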

September 02, 2013 doi: 10.1177/0962280213502403 open full text
Variable selection in semi-parametric models.
Zhang, H., Maity, A., Arshad, H., Holloway, J., Karmaus, W.
Statistical Methods in Medical Research: An International Review Journal. August 28, 2013

We propose Bayesian variable selection methods in semi-parametric models in the framework of partially linear Gaussian and probit regressions. Reproducing kernels are utilized to evaluate possibly non-linear joint effect of a set of variables. Indicator variables are introduced into the reproducing kernels for the inclusion or exclusion of a variable. Different scenarios based on posterior probabilities of including a variable are proposed to select important variables. Simulations are used to demonstrate and evaluate the methods. It was found that the proposed methods can efficiently select the correct variables regardless of the feature of the effects, linear or non-linear in an unknown form. The proposed methods are applied to two real data sets to identify cytosine phosphate guanine methylation sites associated with maternal smoking and cytosine phosphate guanine sites associated with cotinine levels with creatinine levels adjusted. The selected methylation sites have the potential to advance our understanding of the underlying mechanism for the impact of smoking exposure on health outcomes, and consequently benefit medical research in disease intervention.

August 28, 2013 doi: 10.1177/0962280213499679 open full text
Binomial confidence intervals for testing non-inferiority or superiority: a practitioner's dilemma.
Pradhan, V., Evans, J. C., Banerjee, T.
Statistical Methods in Medical Research: An International Review Journal. August 02, 2013

In testing for non-inferiority or superiority in a single arm study, the confidence interval of a single binomial proportion is frequently used. A number of such intervals are proposed in the literature and implemented in standard software packages. Unfortunately, use of different intervals leads to conflicting conclusions. Practitioners thus face a serious dilemma in deciding which one to depend on. Is there a way to resolve this dilemma? We address this question by investigating the performances of ten commonly used intervals of a single binomial proportion, in the light of two criteria, viz., coverage and expected length of the interval.
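How two intervals can lead to opposite conclusions is easy to reproduce. The sketch below compares the Wald and Wilson score intervals for hypothetical data (27 responders out of 30) against an assumed performance threshold of 0.75. The abstract does not list which ten intervals were studied, so these two are used only as familiar examples.

```python
import math
from statistics import NormalDist

def wald_interval(x, n, level=0.95):
    """Wald interval: p_hat +/- z * sqrt(p_hat * (1 - p_hat) / n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = x / n
    half = z * math.sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson_interval(x, n, level=0.95):
    """Wilson score interval."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    p = x / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return center - half, center + half

# Hypothetical single-arm study and threshold (illustrative values only).
x, n, threshold = 27, 30, 0.75

wald_lo, wald_hi = wald_interval(x, n)
wilson_lo, wilson_hi = wilson_interval(x, n)

print(f"Wald:   ({wald_lo:.3f}, {wald_hi:.3f})  exceeds threshold: {wald_lo > threshold}")
print(f"Wilson: ({wilson_lo:.3f}, {wilson_hi:.3f})  exceeds threshold: {wilson_lo > threshold}")
```

With these assumed numbers the Wald lower limit lies above 0.75 while the Wilson lower limit lies just below it, so the conclusion depends on the interval chosen.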

August 02, 2013 doi: 10.1177/0962280213498324 open full text
A flexible semiparametric modeling approach for doubly censored data with an application to prostate cancer.
Han, S., Andrei, A.-C., Tsui, K.-W.
Statistical Methods in Medical Research: An International Review Journal. July 30, 2013

Doubly censored data often arise in medical studies of disease progression involving two related events for which both an originating and a terminating event are interval-censored. Although regression modeling for such doubly censored data may be complicated, we propose a simple semiparametric regression modeling strategy based on jackknife pseudo-observations obtained using nonparametric estimators of the survival function. Inference is carried out via generalized estimating equations. Simulation studies show that the proposed method produces virtually unbiased covariate effect estimates, even for moderate sample sizes. A prostate cancer study example illustrates the practical advantages of the proposed approach.
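For orientation, the usual definition of a jackknife pseudo-observation for the survival function at a fixed time t is sketched below. The symbols are assumed for illustration, and the abstract does not specify which nonparametric estimator is used for doubly censored data.

```latex
% \hat{S}(t): nonparametric estimate of the survival function from all n subjects.
% \hat{S}^{(-i)}(t): the same estimator computed with subject i left out.
\hat{\theta}_i(t) = n\,\hat{S}(t) - (n - 1)\,\hat{S}^{(-i)}(t), \qquad i = 1, \dots, n.
```

The pseudo-observations then play the role of the outcome in a generalized estimating equation with a chosen link function, which is what allows standard regression software to be used.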

July 30, 2013 doi: 10.1177/0962280213498325 open full text
Advanced colorectal neoplasia risk stratification by penalized logistic regression.
Lin, Y., Yu, M., Wang, S., Chappell, R., Imperiale, T. F.
Statistical Methods in Medical Research: An International Review Journal. July 30, 2013

Colorectal cancer is the second leading cause of death from cancer in the United States. To facilitate the efficiency of colorectal cancer screening, there is a need to stratify risk for colorectal cancer among the 90% of US residents who are considered "average risk." In this article, we investigate such risk stratification rules for advanced colorectal neoplasia (colorectal cancer and advanced, precancerous polyps). We use a recently completed large cohort study of subjects who underwent a first screening colonoscopy. Logistic regression models have been used in the literature to estimate the risk of advanced colorectal neoplasia based on quantifiable risk factors. However, logistic regression may be prone to overfitting and instability in variable selection. Since most of the risk factors in our study have several categories, it was tempting to collapse these categories into fewer risk groups. We propose a penalized logistic regression method that automatically and simultaneously selects variables, groups categories, and estimates their coefficients by penalizing the L1-norm of both the coefficients and their differences. Hence, it encourages sparsity in the categories, i.e. grouping of the categories, and sparsity in the variables, i.e. variable selection. We apply the penalized logistic regression method to our data. The important variables are selected, with close categories simultaneously grouped, by penalized regression models with and without the interaction terms. The models are validated with 10-fold cross-validation. The receiver operating characteristic curves of the penalized regression models dominate the receiver operating characteristic curve of naive logistic regressions, indicating a superior discriminative performance.

July 30, 2013 doi: 10.1177/0962280213497432 open full text
A flexible joint modeling framework for longitudinal and time-to-event data with overdispersion.
Njagi, E. N., Molenberghs, G., Rizopoulos, D., Verbeke, G., Kenward, M. G., Dendale, P., Willekens, K.
Statistical Methods in Medical Research: An International Review Journal. July 18, 2013

We combine conjugate and normal random effects in a joint model for outcomes, at least one of which is non-Gaussian, with particular emphasis on cases in which one of the outcomes is of survival type. Conjugate random effects are used to relax the often-restrictive mean-variance prescription in the non-Gaussian outcome, while normal random effects account for not only the correlation induced by repeated measurements from the same subject but also the association between the different outcomes. Using a case study in chronic heart failure, we show that model fit can be improved, even resulting in impact on significance tests, by switching to our extended framework. By first taking advantage of the ease of analytical integration over conjugate random effects, we easily estimate our framework, by maximum likelihood, in standard software.

July 18, 2013 doi: 10.1177/0962280213495994 open full text
Risk prediction for myocardial infarction via generalized functional regression models.
Ieva, F., Paganoni, A. M.
Statistical Methods in Medical Research: An International Review Journal. July 18, 2013

In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to the 118 Dispatch Center of Milan (the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Thus, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in the literature. Finally, the robustness of the procedure is checked via leave-j-out techniques.

July 18, 2013 doi: 10.1177/0962280213495988 open full text
Point success rate for patient therapeutic response prediction by continuous biomarker scores.
Ma, Z., Kim, Y., Hu, F., Lee, J. K.
Statistical Methods in Medical Research: An International Review Journal. July 09, 2013

Predictive diagnostic tests are in high demand to guide optimal treatments for individual patients, as individual patients with the same disease such as cancer frequently exhibit dramatically different therapeutic responses to multiple available treatment options. A large number of clinical trials have thus been performed to test the predictive ability and utility of various therapeutic biomarker tests. However, in these trial designs the conventional optimization criteria such as positive predictive value or negative predictive value cannot reflect each patient’s true chance of success associated with continuous predictive biomarker scores. We have developed a novel statistical concept, point success rate (PSR), to overcome deficiencies in these conventional methods for optimizing biomarker-based clinical trials. We demonstrate statistical superiority as well as clinical improvement by a PSR-based treatment selection both with simulated and breast cancer patient data.

July 09, 2013 doi: 10.1177/0962280213493161 open full text
A statistical model of breast cancer tumour growth with estimation of screening sensitivity as a function of mammographic density.
Abrahamsson, L., Humphreys, K.
Statistical Methods in Medical Research: An International Review Journal. July 09, 2013

Understanding screening sensitivity and tumour progression is important for designing and evaluating screening programmes for breast cancer. Several approaches for estimating tumour growth rates have been described, some of which simultaneously estimate (mammography) screening sensitivity. None of the continuous tumour growth modelling approaches has incorporated mammographic density, although it is known to have a profound influence on mammographic screening sensitivity. We describe a new approach for estimating breast cancer tumour growth which builds on recently described continuous tumour growth models and estimates mammographic screening sensitivity as a function of tumour size and mammographic density.

July 09, 2013 doi: 10.1177/0962280213492843 open full text
Statistical methods for multivariate meta-analysis of diagnostic tests: An overview and tutorial.
Ma, X., Nie, L., Cole, S. R., Chu, H.
Statistical Methods in Medical Research: An International Review Journal. June 26, 2013

In this article, we present an overview and tutorial of statistical methods for meta-analysis of diagnostic tests under two scenarios: (1) when the reference test can be considered a gold standard and (2) when the reference test cannot be considered a gold standard. In the first scenario, we first review the conventional summary receiver operating characteristics approach and a bivariate approach using linear mixed models. Both approaches require direct calculations of study-specific sensitivities and specificities. We next discuss the hierarchical summary receiver operating characteristics curve approach for jointly modeling positivity criteria and accuracy parameters, and the bivariate generalized linear mixed models for jointly modeling sensitivities and specificities. We further discuss the trivariate generalized linear mixed models for jointly modeling prevalence, sensitivities and specificities, which allows us to assess the correlations among the three parameters. These approaches are based on the exact binomial distribution and thus do not require an ad hoc continuity correction. Lastly, we discuss a latent class random effects model for meta-analysis of diagnostic tests when the reference test itself is imperfect for the second scenario. A number of case studies with detailed annotated SAS code in MIXED and NLMIXED procedures are presented to facilitate the implementation of these approaches.

June 26, 2013 doi: 10.1177/0962280213492588 open full text
Addressing missing covariates for the regression analysis of competing risks: Prognostic modelling for triaging patients diagnosed with prostate cancer.
Escarela, G., Ruiz-de-Chavez, J., Castillo-Morales, A.
Statistical Methods in Medical Research: An International Review Journal. June 26, 2013

Competing risks arise in medical research when subjects are exposed to various types or causes of death. Data from large cohort studies usually exhibit subsets of regressors that are missing for some study subjects. Furthermore, such studies often give rise to censored data. In this article, a carefully formulated likelihood-based technique for the regression analysis of right-censored competing risks data when two of the covariates are discrete and partially missing is developed. The approach envisaged here comprises two models: one describes the covariate effects on both long-term incidence and conditional latencies for each cause of death, whilst the other deals with the observation process by which the covariates are missing. The former is formulated with a well-established mixture model and the latter is characterised by copula-based bivariate probability functions for both the missing covariates and the missing data mechanism. The resulting formulation lends itself to the empirical assessment of non-ignorability by performing sensitivity analyses using models with and without a non-ignorable component. The methods are illustrated on a 20-year follow-up involving a prostate cancer cohort from the National Cancer Institute's Surveillance, Epidemiology, and End Results program.

June 26, 2013 doi: 10.1177/0962280213492406 open full text
On analyzing ordinal data when responses and covariates are both missing at random.
Rana, S., Roy, S., Das, K.
Statistical Methods in Medical Research: An International Review Journal. June 26, 2013

On many occasions, particularly in biomedical studies, data are unavailable for some responses and covariates. This leads to biased inference in the analysis when a substantial proportion of responses or a covariate or both are missing. Except in a few situations, methods for missing data have been considered either for missing responses or for missing covariates, but comparatively little attention has been directed to accounting for both missing responses and missing covariates, which is partly attributable to complexity in modeling and computation. This seems to be important as the precise impact of substantial missing data depends on the association between the two missing data processes as well. The real difficulty arises when the responses are ordinal by nature. We develop a joint model to take into account simultaneously the association between the ordinal response variable and covariates and also that between the missing data indicators. Such a complex model has been analyzed here by using the Markov chain Monte Carlo approach and also by the Monte Carlo relative likelihood approach. Their performance in estimating the model parameters in finite samples has been examined. We illustrate the application of these two methods using data from an orthodontic study. Analysis of such data provides some interesting information on human habit.

June 26, 2013 doi: 10.1177/0962280213492063 open full text
Expectation maximization-based likelihood inference for flexible cure rate models with Weibull lifetimes.
Balakrishnan, N., Pal, S.
Statistical Methods in Medical Research: An International Review Journal. June 05, 2013

Recently, a flexible cure rate survival model has been developed by assuming the number of competing causes of the event of interest to follow the Conway–Maxwell–Poisson distribution. This model includes some of the well-known cure rate models discussed in the literature as special cases. Data obtained from cancer clinical trials are often right censored, and the expectation maximization algorithm can be used in this case to efficiently estimate the model parameters based on right censored data. In this paper, we consider the competing cause scenario and, assuming the time-to-event to follow the Weibull distribution, we derive the necessary steps of the expectation maximization algorithm for estimating the parameters of different cure rate survival models. The standard errors of the maximum likelihood estimates are obtained by inverting the observed information matrix. The method of inference developed here is examined by means of an extensive Monte Carlo simulation study. Finally, we illustrate the proposed methodology with real data on cancer recurrence.

June 05, 2013 doi: 10.1177/0962280213491641 open full text
Multiple-stage sampling procedure for covariate-adjusted response-adaptive designs.
Park, E., Chang, Y.-c. I.
Statistical Methods in Medical Research: An International Review Journal. May 30, 2013

Covariate-adjusted response-adaptive (CARA) design has become an important statistical tool for evaluating and comparing the performance of treatments as targeted medicine and adaptive therapy have become important medical innovations. Due to the nature of the adaptive therapies of interest and how subjects accrue to a sampling procedure, it is of interest to control the sample size sequentially such that the estimates of treatment effects have satisfactory precision in addition to their asymptotic properties. In this paper, we apply a multiple-stage sequential sampling method to CARA design in such a way that the control of the sample size is more feasible. The theoretical properties of the proposed method, including the estimates of regression parameters and the allocation probabilities under this randomly stopped sampling procedure, are discussed. Numerical results based on synthesized data and a real example are presented.

May 30, 2013 doi: 10.1177/0962280213490091 open full text
Sensitivity analysis of incomplete longitudinal data departing from the missing at random assumption: Methodology and application in a clinical trial with drop-outs.
Moreno-Betancur, M., Chavance, M.
Statistical Methods in Medical Research: An International Review Journal. May 23, 2013

Statistical analyses of longitudinal data with drop-outs based on direct likelihood, and using all the available data, provide unbiased and fully efficient estimates under some assumptions about the drop-out mechanism. Unfortunately, these assumptions can never be tested from the data. Thus, sensitivity analyses should be routinely performed to assess the robustness of inferences to departures from these assumptions. However, each specific scientific context requires different considerations when setting up such an analysis, no standard method exists and this is still an active area of research. We propose a flexible procedure to perform sensitivity analyses when dealing with continuous outcomes, which are described by a linear mixed model in an initial likelihood analysis. The methodology relies on the pattern-mixture model factorisation of the full data likelihood and was validated in a simulation study. The approach was prompted by a randomised clinical trial for sleep-maintenance insomnia treatment. This case study illustrated the practical value of our approach and underlined the need for sensitivity analyses when analysing data with drop-outs: some of the conclusions from the initial analysis were shown to be reliable, while others were found to be fragile and strongly dependent on modelling assumptions. R code for implementation is provided.

May 23, 2013 doi: 10.1177/0962280213490014 open full text
Joint modeling of longitudinal data and discrete-time survival outcome.
Qiu, F., Stein, C. M., Elston, R. C., for the Tuberculosis Research Unit (TBRU).
Statistical Methods in Medical Research: An International Review Journal. May 23, 2013

A predictive joint shared parameter model is proposed for discrete time-to-event and longitudinal data. A discrete survival model with frailty and a generalized linear mixed model for the longitudinal data are joined to predict the probability of events. This joint model focuses on predicting discrete time-to-event outcome, taking advantage of repeated measurements. We show that the probability of an event in a time window can be more precisely predicted by incorporating the longitudinal measurements. The model was investigated by comparison with a two-step model and a discrete-time survival model. Results from both a study on the occurrence of tuberculosis and simulated data show that the joint model is superior to the other models in discrimination ability, especially as the latent variables related to both survival times and the longitudinal measurements depart from 0.

May 23, 2013 doi: 10.1177/0962280213490342 open full text
Predictions in an illness-death model.
Touraine, C., Helmer, C., Joly, P.
Statistical Methods in Medical Research: An International Review Journal. May 22, 2013

Multi-state models allow subjects to move among a finite number of states during a follow-up period. Most often, the objects of study are the transition intensities. The impact of covariates on them can also be studied by specifying regression models. Thus, estimation in multi-state models is usually focused on the transition intensities (or the cumulative transition intensities) and on the regression parameters. However, from a clinical or epidemiological point of view, other quantities could provide additional information and may be more relevant for answering practical questions. For example, given a set of covariates for a subject, it may be of interest to estimate the probability of experiencing a future event or the expected time without any event. To address these kinds of issues, we need to estimate quantities such as transition probabilities, cumulative probabilities and life expectancies. The purpose of this paper is to review a large number of these quantities in an illness-death model, which is perhaps the most common multi-state model in the medical literature, and to propose a way to estimate them in addition to the transition intensities and the regression parameters. An illustration is given using interval-censored data from a large cohort study on cognitive ageing.

May 22, 2013 doi: 10.1177/0962280213489234 open full text
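
To make the quantities named in the abstract above concrete, the following is a sketch of how transition probabilities follow from transition intensities in an irreversible illness-death model. It assumes a Markov model with states 0 (healthy), 1 (ill) and 2 (dead) and intensities \alpha_{01}, \alpha_{02}, \alpha_{12}; the notation and the Markov assumption are illustrative and are not taken from the paper.

```latex
\begin{align*}
P_{00}(s,t) &= \exp\!\Big(-\int_s^t \big[\alpha_{01}(u) + \alpha_{02}(u)\big]\,du\Big), \\
P_{11}(s,t) &= \exp\!\Big(-\int_s^t \alpha_{12}(u)\,du\Big), \\
P_{01}(s,t) &= \int_s^t P_{00}(s,u)\,\alpha_{01}(u)\,P_{11}(u,t)\,du, \\
P_{02}(s,t) &= 1 - P_{00}(s,t) - P_{01}(s,t), \qquad
P_{12}(s,t) = 1 - P_{11}(s,t).
\end{align*}
```

Under the same assumptions, the expected remaining time spent healthy for a subject who is healthy at time s is \int_s^\infty P_{00}(s,u)\,du, which is one example of the life expectancies mentioned in the abstract.
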
A systematic selection method for the development of cancer staging systems.
Lin, Y., Chappell, R., Gonen, M.
Statistical Methods in Medical Research: An International Review Journal. May 22, 2013

The tumor–node–metastasis (TNM) staging system has been the anchor of cancer diagnosis, treatment, and prognosis for many years. For meaningful clinical use, an orderly, progressive condensation of the T and N categories into an overall staging system needs to be defined, usually with respect to a time-to-event outcome. This can be considered as a cutpoint selection problem for a censored response partitioned with respect to two ordered categorical covariates and their interaction. The aim is to select the best grouping of the TN categories. A novel bootstrap cutpoint/model selection method is proposed for this task by maximizing bootstrap estimates of the chosen statistical criteria. The criteria are based on prognostic ability including a landmark measure of the explained variation, the area under the receiver operating characteristic (ROC) curve, and a concordance probability generalized from Harrell’s c-index. We illustrate the utility of our method by applying it to the staging of colorectal cancer.

May 22, 2013 doi: 10.1177/0962280213486853 open full text
MCAR is not necessary for the complete cases to constitute a simple random subsample of the target sample.
Galati, J. C., Seaton, K. A.
Statistical Methods in Medical Research: An International Review Journal. May 22, 2013

Missing data is the norm rather than the exception in complex epidemiological studies. Complete-case analyses, which discard all subjects with some data values missing, are known to be valid under the very restrictive assumption that the response mechanism is missing completely at random (MCAR). While conditions weaker than MCAR are known under which estimators of regression coefficients are unbiased, one often comes across the view in the literature that MCAR is necessary for the complete cases to form a simple random subsample of the target sample. In this paper, we explain why this is not the case, and we distill an assumption weaker than MCAR under which the simple random subsample condition holds, which we call available at random (AAR). Moreover, we show that, unlike MCAR, AAR response mechanisms can be missing not at random (MNAR). We also suggest how approximate AAR mechanisms might arise in practice through cancellation of selection and drop-out effects, and we conclude that before pooling partially complete and complete cases into an analysis, the investigator should consider how selection might impact on the representativeness of the cases included in the pooled analysis (compared to those comprising the complete cases only).

May 22, 2013 doi: 10.1177/0962280213490360 open full text
Detection of spatial variations in temporal trends with a quadratic function.
Moraga, P., Kulldorff, M.
Statistical Methods in Medical Research: An International Review Journal. April 23, 2013

Methods for the assessment of spatial variations in temporal trends (SVTT) are important tools for disease surveillance, which can help governments to formulate programs to prevent diseases, and to measure the progress, impact, and efficacy of preventive efforts already in operation. The linear SVTT method is designed to detect areas with unusually different linear disease trends. In some situations, however, its trend estimation procedure can lead to wrong conclusions. In this article, the quadratic SVTT method is proposed as an alternative to the linear SVTT method. The quadratic method provides better estimates of the real trends, and increases the power of detection in situations where the linear SVTT method fails. A performance comparison between the linear and quadratic methods is provided to help illustrate their respective properties. The quadratic method is applied to detect unusually different cervical cancer trends in white women in the United States over the period 1969 to 1995.

April 23, 2013 doi: 10.1177/0962280213485312 open full text
Variable selection for optimal treatment decision.
Lu, W., Zhang, H. H., Zeng, D.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2013

In decision-making on optimal treatment strategies, it is of great importance to identify variables that are involved in the decision rule, i.e. those interacting with the treatment. Effective variable selection helps to improve the prediction accuracy and enhance the interpretability of the decision rule. We propose a new penalized regression framework which can simultaneously estimate the optimal treatment strategy and identify important variables. The advantages of the new approach include: (i) it does not require the estimation of the baseline mean function of the response, which greatly improves the robustness of the estimator; (ii) the convenient loss-based framework makes it easier to adopt shrinkage methods for variable selection, which greatly facilitates implementation and statistical inference for the estimator. The new procedure can be easily implemented with existing state-of-the-art software packages like LARS. Theoretical properties of the new estimator are studied. Its empirical performance is evaluated using simulation studies and further illustrated with an application to an AIDS clinical trial.

April 16, 2013 doi: 10.1177/0962280211428383 open full text
Two-stage sampling designs for external validation of personal risk models.
Whittemore, A. S., Halpern, J.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2013

We propose a cost-effective sampling design and estimating procedure for validating personal risk models using right-censored cohort data. Validation involves using each subject’s covariates, as ascertained at cohort entry, in a risk model (specified independently of the data) to assign him/her a probability of an adverse outcome within a future time period. Subjects are then grouped according to the magnitudes of their assigned risks, and within each group, the mean assigned risk is compared with the probability of outcome occurrence as estimated using the follow-up data. Such validation presents two complications. First, in the presence of right-censoring, estimating the probability of developing the outcomes before death requires competing risk analysis. Second, for rare outcomes, validation using the full cohort requires assembling covariates and assigning risks to thousands of subjects. This can be costly when some covariates involve analyzing biological specimens. A two-stage sampling design addresses this problem by assembling covariates and assigning risks only to those subjects most informative for estimating key parameters. We use this design to estimate the outcome probabilities needed to evaluate model performance and we provide theoretical and bootstrap estimates of their variances. We also describe how to choose two-stage designs with minimal efficiency loss for a parameter of interest when the quantities determining optimality are unknown at the time of design. We illustrate these methods by using subjects in the California Teachers Study to validate ovarian cancer risk models. We find that a design with optimal efficiency for one performance parameter need not be so for others, and trade-offs will be required. A two-stage design that samples all outcome-positive subjects and more outcome-negative than censored subjects will perform well in most circumstances. 
The methods are implemented in Risk Model Assessment Program, an R program freely available at http://med.stanford.edu/epidemiology/two-stage.html.

April 16, 2013 doi: 10.1177/0962280213480420 open full text
Joint modeling of multivariate longitudinal measurements and survival data with applications to Parkinson's disease.
He, B., Luo, S.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2013

In many clinical trials studying neurodegenerative diseases, including Parkinson’s disease (PD), multiple longitudinal outcomes are collected in order to fully explore the multidimensional impairment caused by these diseases. The follow-up of some patients can be stopped by an outcome-dependent terminal event, e.g. death or dropout. In this article, we develop a joint model that consists of a multilevel item response theory (MLIRT) model for the multiple longitudinal outcomes and a Cox proportional hazards model with piecewise constant baseline hazards for the event time data. Shared random effects are used to link the two models. Model inference is conducted in a Bayesian framework via Markov chain Monte Carlo simulation implemented in the BUGS language. Our proposed model is evaluated by simulation studies and is applied to the DATATOP study, a motivating clinical trial assessing the effect of tocopherol on PD among patients with early PD.

April 16, 2013 doi: 10.1177/0962280213480877 open full text
Linear combination methods to improve diagnostic/prognostic accuracy on future observations.
Kang, L., Liu, A., Tian, L.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2013

Multiple diagnostic tests or biomarkers can be combined to improve diagnostic accuracy. The problem of finding the optimal linear combinations of biomarkers to maximise the area under the receiver operating characteristic curve has been extensively addressed in the literature. The purpose of this article is threefold: (1) to provide an extensive review of the existing methods for biomarker combination; (2) to propose a new combination method, namely, the nonparametric stepwise approach; (3) to use the leave-one-pair-out cross-validation method, rather than the re-substitution method, which is overoptimistic and hence might lead to wrong conclusions, to empirically evaluate and compare the performance of different linear combination methods in yielding the largest area under the receiver operating characteristic curve. A data set on Duchenne muscular dystrophy was analysed to illustrate the applications of the discussed combination methods.

April 16, 2013 doi: 10.1177/0962280213481053 open full text
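
The evaluation idea in the abstract above can be sketched as follows: the area under the ROC curve of a combined score is estimated by the Mann-Whitney statistic, and leave-one-pair-out cross-validation refits the combination with one case and one control held out, then checks whether the held-out case outscores the held-out control. The fitting rule `mean_difference_weights` is a hypothetical placeholder chosen only to keep the example self-contained; it is not the nonparametric stepwise approach proposed in the paper, and all function names are assumptions.

```python
def empirical_auc(case_scores, control_scores):
    """Mann-Whitney estimate of P(case > control) + 0.5 * P(tie)."""
    wins = 0.0
    for x in case_scores:
        for y in control_scores:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def combine(rows, weights):
    """Linear combination of the markers in each row."""
    return [sum(w * m for w, m in zip(weights, row)) for row in rows]


def mean_difference_weights(case_rows, control_rows):
    """Hypothetical placeholder rule: per-marker difference in group means."""
    k = len(case_rows[0])
    return [
        sum(r[j] for r in case_rows) / len(case_rows)
        - sum(r[j] for r in control_rows) / len(control_rows)
        for j in range(k)
    ]


def resubstitution_auc(case_rows, control_rows, fit=mean_difference_weights):
    """Fit and evaluate on the same data (tends to be overoptimistic)."""
    w = fit(case_rows, control_rows)
    return empirical_auc(combine(case_rows, w), combine(control_rows, w))


def leave_one_pair_out_auc(case_rows, control_rows, fit=mean_difference_weights):
    """Refit without each (case, control) pair, then score the held-out pair.

    Needs at least two cases and two controls so the training sets are non-empty.
    """
    wins = 0.0
    for i, case in enumerate(case_rows):
        for j, control in enumerate(control_rows):
            train_cases = case_rows[:i] + case_rows[i + 1:]
            train_controls = control_rows[:j] + control_rows[j + 1:]
            w = fit(train_cases, train_controls)
            case_score = combine([case], w)[0]
            control_score = combine([control], w)[0]
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(case_rows) * len(control_rows))
```

Any other fitting rule with the same signature (training cases, training controls, returning one weight per marker) can be passed as `fit`, which makes it straightforward to compare combination methods on the same held-out pairs.
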
An adaptive clinical trials procedure for a sensitive subgroup examined in the multiple sclerosis context.
Riddell, C. A., Zhao, Y., Petkau, J.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2013

The biomarker-adaptive threshold design (BATD) allows researchers to simultaneously study the efficacy of treatment in the overall group and to investigate the relationship between a hypothesized predictive biomarker and the treatment effect on the primary outcome. It was originally developed for survival outcomes in Phase III clinical trials where the biomarker of interest is measured on a continuous scale. In this paper, generalizations of the BATD to accommodate count biomarkers and outcomes are developed and then studied in the multiple sclerosis (MS) context, where the number of relapses is a commonly used outcome. Through simulation studies, we find that the BATD has increased power compared with a traditional fixed procedure under varying scenarios in which there exists a sensitive patient subgroup. As an illustration, we apply the procedure to two hypothesized markers, baseline enhancing lesion count and disease duration at baseline, using data from a previously completed trial. MS duration appears to be a predictive marker for this dataset, and the procedure indicates that the treatment effect is strongest for patients who have had MS for less than 7.8 years. The procedure holds promise of enhanced statistical power when the treatment effect is greatest in a sensitive patient subgroup.

April 16, 2013 doi: 10.1177/0962280213480576 open full text
Projecting adverse event incidence rates using empirical Bayes methodology.
Ma, G., Ganju, J., Huang, J.
Statistical Methods in Medical Research: An International Review Journal. April 16, 2013

Although there is considerable interest in adverse events observed in clinical trials, projecting adverse event incidence rates in an extended period can be of interest when the trial duration is limited compared to clinical practice. A naïve method for making projections might involve modeling the observed rates into the future for each adverse event. However, such an approach overlooks the information that can be borrowed across all the adverse event data. We propose a method that weights each projection using a shrinkage factor; the adverse event-specific shrinkage is a probability, based on empirical Bayes methodology, estimated from all the adverse event data, reflecting evidence in support of the null or non-null hypotheses. Also proposed is a technique to estimate the proportion of true nulls, called the common area under the density curves, which is a critical step in arriving at the shrinkage factor. The performance of the method is evaluated by projecting from interim data and then comparing the projected results with observed results. The method is illustrated on two data sets.

April 16, 2013 doi: 10.1177/0962280213483499 open full text
Step-up procedures for non-inferiority tests with multiple experimental treatments.
Kwong, K. S., Cheung, S. H., Hayter, A. J.
Statistical Methods in Medical Research: An International Review Journal. March 25, 2013

Non-inferiority (NI) trials are becoming more popular. The NI of a new treatment compared with a standard treatment is established when the new treatment maintains a substantial fraction of the treatment effect of the standard treatment. A valid NI trial is also required to show assay sensitivity, the demonstration of the standard treatment having the expected effect with a size comparable to those reported in previous placebo-controlled studies. A three-arm NI trial is a clinical study that includes a new treatment, a standard treatment and a placebo. Most of the statistical methods developed for three-arm NI trials are designed for the existence of only one new treatment. Recently, a single-step procedure was developed to deal with NI trials with multiple new treatments with the overall familywise error rate controlled at a specified level. In this article, we extend the single-step procedure to two new step-up procedures for NI trials with multiple new treatments. A comparative study of test power shows that both proposed step-up procedures provide a significant improvement of power when compared to the single-step procedure. One of the two proposed step-up procedures also allows the flexibility of allocating different error rates between the sensitivity hypothesis and the NI hypotheses so that the assignment of fewer patients to the placebo becomes possible when designing NI trials. We illustrate the new procedures using data from a clinical trial.

March 25, 2013 doi: 10.1177/0962280213477767 open full text
Simulating the contribution of a biospecimen and clinical data repository in a phase II clinical trial: A value of information analysis.
Craig, B. M., Han, G., Munkin, M. K., Fenstermacher, D.
Statistical Methods in Medical Research: An International Review Journal. March 15, 2013

The potential contributions of a centralized data warehouse or repository in clinical research include the expedited accrual of subjects for phase II trials. Understanding the contribution of data warehouses that integrate clinical, biospecimen, and molecular data for the conduct of clinical trials is essential to inform private and public decisions on resource allocation and investment. We conducted a value of information analysis using data from recent trials at the Moffitt Cancer Center and simulated the potential reductions in trial size due to possible alternative scenarios of expedited accrual. In this study, we compared alternative data sets using a single model to assess value of information. Our findings suggest that the reductions in trial size range from 0% to 43%, depending on the amount of censoring in overall survival. The ability to expedite the accrual of patients for clinical trial studies using large data repositories that store data on inclusion/exclusion criteria and response to standard of care therapies demonstrated significant improvement in reducing the number of subjects needed to achieve similar end-results, as evaluated using value of information analysis with a limited number of parameters and a parsimonious model of overall survival.

March 15, 2013 doi: 10.1177/0962280213480282 open full text
Notes on testing noninferiority in multivariate binary data under the matched-pair design.
Lui, K.-J., Chang, K.-C.
Statistical Methods in Medical Research: An International Review Journal. March 12, 2013

Since therapeutic efficacy is often measured by multiple endpoints, it will be of use if one can incorporate the information on various variables of response into procedures for testing noninferiority to improve power of a univariate test procedure for each individual variable. On the basis of the proposed mixed effects logistic regression model for multivariate binary data under the matched-pairs design, we develop procedures for testing noninferiority with respect to the odds ratio in multivariate binary data under the matched-pair design. We discuss use of Bonferroni’s and Scheffe’s methods to control the inflation in Type I error due to multiple tests. We further employ Monte Carlo simulation to evaluate and compare the performance of these test procedures. Finally, we use the data taken from a crossover clinical trial that monitored several adverse events of an antidepressive drug to illustrate the use of test procedures derived here.

March 12, 2013 doi: 10.1177/0962280213477022 open full text
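
The Bonferroni adjustment mentioned in the abstract above can be sketched in a few lines: with m endpoints tested at overall level alpha, each individual test is carried out at level alpha / m. The function name and the use of per-endpoint p-values as input are assumptions for illustration; how the per-endpoint decisions are combined into an overall noninferiority conclusion is not stated in the abstract and is left open here.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Return, for each endpoint, whether its null hypothesis is rejected
    at the Bonferroni-adjusted level alpha / m."""
    m = len(p_values)
    threshold = alpha / m  # per-test level that keeps the familywise error <= alpha
    return [p <= threshold for p in p_values]


# Three endpoints at overall level 0.05: each test is run at 0.05 / 3.
decisions = bonferroni_reject([0.004, 0.02, 0.03], alpha=0.05)
```
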
Adjusted inference procedures for the interobserver agreement in twin studies.
Dixon, S. N., Donner, A., Shoukri, M. M.
Statistical Methods in Medical Research: An International Review Journal. March 12, 2013

We propose adjusted inference procedures for evaluating the agreement/disagreement of two raters in a clustered setting involving twins or paired body parts. These procedures include the construction of a confidence interval for the kappa statistic, a related test of statistical significance and a formula that facilitates sample size estimation. The results of a simulation study suggest that a simple adjustment using an estimated design effect will provide valid inferences. The methods proposed are illustrated using an example from the literature.

March 12, 2013 doi: 10.1177/0962280213476771 open full text
Growth charts of human development.
van Buuren, S.
Statistical Methods in Medical Research: An International Review Journal. March 12, 2013

This article reviews and compares two types of growth charts for tracking human development over age. Both charts assume the existence of a continuous latent variable, but relate to the observed data in different ways. The D-score diagram summarizes developmental indicators into a single aggregate score measuring global development. The relations between the indicators should be consistent with the Rasch model. If true, the D-score is a measure with interval scale properties, and allows for the calculation of meaningful differences both within and across age. The stage line diagram describes the natural development of ordinal indicators. The method models the transition probabilities between successive stages of the indicator as smoothly varying functions of age. The location of each stage is quantified by the mid-P-value. Both types of diagrams assist in identifying early and delayed development, as well as finding differences in tempo. The relevant techniques are illustrated to track global development during infancy and early childhood (0–2 years) and Tanner pubertal stages (8–21 years). New reference values for both applications are provided.

March 12, 2013 doi: 10.1177/0962280212473300 open full text
Statistical challenges in drug approval trials that use patient-reported outcomes.
Izem, R., Kammerman, L. A., Komo, S.
Statistical Methods in Medical Research: An International Review Journal. February 21, 2013

This article describes challenging aspects of the use of patient-reported outcome instruments in clinical trials for drug approval, from our perspective as statistical reviewers at the US Food and Drug Administration. We discuss three aspects of planning and interpreting results in clinical trials: (1) adapting an existing patient-reported outcome instrument for use in clinical trials, (2) using multi-item patient-reported outcomes and (3) handling missing patient-reported outcome values from many subjects over time. These challenges are illustrated with multiple examples from different clinical trials for different indications. Finally, we discuss important considerations in labeling.

February 21, 2013 doi: 10.1177/0962280213476376 open full text
Interpretation of patient-reported outcomes.
Cappelleri, J. C., Bushmakin, A. G.
Statistical Methods in Medical Research: An International Review Journal. February 19, 2013

A patient-reported outcome is any report on the status of a patient’s health condition that comes directly from the patient. Clear and meaningful interpretation of patient-reported outcome scores is fundamental to their use, as they can be valuable in designing studies, evaluating interventions, educating consumers, and informing health policy makers involved with regulatory, reimbursement, and advisory agencies. Interpretation of patient-reported outcome scores, however, is often not well understood because of insufficient data or a lack of experience or clinical understanding to draw from. This article provides an updated review of two broad approaches – anchor-based and distribution-based – aimed at enhancing the understanding and meaning of patient-reported outcome scores. Anchor-based approaches include percentages based on thresholds, criterion-group interpretation, content-based interpretation, and clinically important difference. Distribution-based approaches include effect size, probability of relative benefit, and responder analysis and cumulative proportions. A third strategy called mediation analysis, which can elucidate a health condition measured by a patient-reported outcome in the context of an intervention’s mechanism of action, is also highlighted and illustrated. Mediation analysis in the context of interpretation of patient-reported outcome scores is a relatively new development. The logic and rationale of the three methods are expressed generally. While the three approaches themselves are not new, some applications of them, taken from examples published in the past few years, are original and are brought together in this article to show the real-life implications of the different methodologies in one integrated report.

February 19, 2013 doi: 10.1177/0962280213476377 open full text
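
As one concrete instance of the effect-size approach listed in the abstract above, the following minimal sketch computes a standardized mean difference (Cohen's d with a pooled standard deviation) between two groups of scores. The choice of the pooled standard deviation as the standardizer and the function name are assumptions for illustration; other standardizers, such as the baseline standard deviation, are also used in practice.

```python
import math
import statistics


def cohens_d(group_a, group_b):
    """Standardized mean difference (group_b minus group_a) using the pooled SD."""
    n_a, n_b = len(group_a), len(group_b)
    var_a = statistics.variance(group_a)  # sample variance (n - 1 denominator)
    var_b = statistics.variance(group_b)
    pooled_sd = math.sqrt(((n_a - 1) * var_a + (n_b - 1) * var_b) / (n_a + n_b - 2))
    return (statistics.mean(group_b) - statistics.mean(group_a)) / pooled_sd


# Hypothetical scores for a control group and a treated group.
d = cohens_d([1, 2, 3], [2, 3, 4])
```
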
A graphical method to assess distribution assumption in group-based trajectory models.
Elsensohn, M.-H., Klich, A., Ecochard, R., Bastard, M., Genolini, C., Etard, J.-F., Gustin, M.-P.
Statistical Methods in Medical Research: An International Review Journal. February 19, 2013

Group-based trajectory models have developed rapidly in the analysis of longitudinal data in clinical research. In these models, the assumption of homoscedasticity of the residuals is frequently made, but this assumption is not always met. We developed an easy-to-perform graphical method to assess the assumption of homoscedasticity of the residuals, intended especially for group-based trajectory models. The method is based on drawing an envelope to visualize the local dispersion of the residuals around each typical trajectory. Its efficiency is demonstrated using data on CD4 lymphocyte counts in patients with human immunodeficiency virus put on antiretroviral therapy. Four distinct distributions that take into account increasing parts of the variability of the observed data are presented. Significant differences in group structures and trajectory patterns were found according to the chosen distribution. These differences might have large impacts on the final trajectories and their characteristics, and thus on potential medical decisions. At a single glance, the graphical criteria allow the choice of the distribution that best captures data variability and help in dealing with a potential heteroscedasticity problem.

February 19, 2013 doi: 10.1177/0962280213475643 open full text
Comparing models for quantitative risk assessment: an application to the European Registry of foreign body injuries in children.
Berchialla, P., Scarinzi, C., Snidero, S., Gregori, D.
Statistical Methods in Medical Research: An International Review Journal. February 19, 2013

Risk Assessment is the systematic study of decisions subject to uncertain consequences. Increasing interest has focused on modeling techniques like Bayesian Networks because of their capability of (1) combining different types of evidence in the probabilistic framework, including both expert judgments and objective data; (2) overturning previous beliefs in the light of new information; and (3) making predictions even with incomplete data. In this work, we propose a comparison between Bayesian Networks and other classical Quantitative Risk Assessment techniques such as Neural Networks, Classification Trees, Random Forests and Logistic Regression models. Hybrid approaches, combining both Classification Trees and Bayesian Networks, are also considered. Among Bayesian Networks, a clear distinction is made between a purely data-driven approach and a combination of expert knowledge with objective data. The aim of this paper is to evaluate which of these models can best be applied, in the framework of Quantitative Risk Assessment, to assess the safety of children who are exposed to the risk of inhalation/insertion/aspiration of consumer products. The issue of preventing injuries in children is of paramount importance, in particular where product design is involved: quantifying the risk associated with product characteristics can be very useful in addressing product safety design regulation. Data from the European Registry of Foreign Bodies Injuries formed the starting evidence for risk assessment. Results showed that Bayesian Networks offer both ease of interpretation and accuracy in prediction, even if simpler models like logistic regression still performed well.

February 19, 2013 doi: 10.1177/0962280213476167 open full text
Estimating effect sizes for health related quality of life outcomes.
Julious, S. A., Walters, S. J.
Statistical Methods in Medical Research: An International Review Journal. February 19, 2013

To enable an assessment of the costs and benefits of a new health technology one should use a range of outcome measures, including medical, psychosocial and economic. Therefore, unless a patient-reported outcome as well as clinical outcome is assessed in a study, the effect of a health technology on the patient will remain unknown as two therapies may have similar clinical consequences but different impacts upon the quality of the life of the patients. An important issue when designing a study with a new patient-reported outcome is the quantification of an effect size. Through a case study we highlight how simple calculations can enable the estimation of the effect sizes if there is information on established outcomes. This is done by mapping changes on the new scale to clinically relevant and important changes on established scales. We recommend the approaches described in this paper be considered for the quantification of important treatment effects when designing a clinical trial with a new patient-reported outcome measure.

February 19, 2013 doi: 10.1177/0962280213476379
Practical and statistical issues in missing data for longitudinal patient reported outcomes.
Bell, M. L., Fairclough, D. L.
Statistical Methods in Medical Research: An International Review Journal. February 19, 2013

Patient reported outcomes are increasingly used in health research, including randomized controlled trials and observational studies. However, the validity of results in longitudinal studies can crucially hinge on the handling of missing data. This paper considers the issues of missing data at each stage of research. Practical strategies for minimizing missingness through careful study design and conduct are given. Statistical approaches that are commonly used, but should be avoided, are discussed, including how these methods can yield biased and misleading results. Methods that are valid for data which are missing at random are outlined, including maximum likelihood methods, multiple imputation and extensions to generalized estimating equations: weighted generalized estimating equations, generalized estimating equations with multiple imputation, and doubly robust generalized estimating equations. Finally, we discuss the importance of sensitivity analyses, including the role of missing not at random models, such as pattern mixture, selection, and shared parameter models. We demonstrate many of these concepts with data from a randomized controlled clinical trial on renal cancer patients, and show that the results are dependent on missingness assumptions and the statistical approach.

February 19, 2013 doi: 10.1177/0962280213476378
A general theoretical framework for interpreting patient-reported outcomes estimated from ordinally scaled item responses.
Massof, R. W.
Statistical Methods in Medical Research: An International Review Journal. February 19, 2013

A simple theoretical framework explains patient responses to items in rating scale questionnaires. Fixed latent variables position each patient and each item on the same linear scale. Item responses are governed by a set of fixed category thresholds, one for each ordinal response category. A patient’s item responses are magnitude estimates of the difference between the patient variable and the patient’s estimate of the item variable, relative to his/her personally defined response category thresholds. Differences between patients in their personal estimates of the item variable and in their personal choices of category thresholds are represented by random variables added to the corresponding fixed variables. Effects of intervention correspond to changes in the patient variable, the patient’s response bias, and/or latent item variables for a subset of items. Intervention effects on patients’ item responses were simulated by assuming the random variables are normally distributed with a constant scalar covariance matrix. Rasch analysis was used to estimate latent variables from the simulated responses. The simulations demonstrate that changes in the patient variable and changes in response bias produce indistinguishable effects on item responses and manifest as changes only in the estimated patient variable. Changes in a subset of item variables manifest as intervention-specific differential item functioning and as changes in the estimated person variable that equal the average of changes in the item variables. Simulations demonstrate that intervention-specific differential item functioning produces inefficiencies and inaccuracies in computer adaptive testing.

February 19, 2013 doi: 10.1177/0962280213476380
Methods to obtain referral criteria in growth monitoring.
van Dommelen, P., van Buuren, S.
Statistical Methods in Medical Research: An International Review Journal. February 01, 2013

An important goal of growth monitoring is to identify genetic disorders, diseases or other conditions that manifest themselves through abnormal growth. The two main conditions that can be detected by height monitoring are Turner’s syndrome and growth hormone deficiency. Conditions or risk factors that can be detected by monitoring weight or body mass index include hypernatremic dehydration, celiac disease, cystic fibrosis and obesity. Monitoring infant head growth can be used to detect macrocephaly, developmental disorder and ill health in childhood. This paper describes statistical methods to obtain evidence-based referral criteria in growth monitoring. The referral criteria that we discuss are based either on anthropometric measurement(s) at a fixed age using (1) a Centile or a Standard Deviation Score, (2) a Standard Deviation corrected for parental height, (3) a Likelihood Ratio Statistic and (4) an ellipse, or on multiple measurements over time using (5) a growth rate and (6) a growth curve model. We review the potential uses of these methods, and outline their strengths and limitations.

February 01, 2013 doi: 10.1177/0962280212473301
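As a minimal sketch of criterion (1) in the growth-monitoring abstract above, a Standard Deviation Score compares one measurement with an age- and sex-specific reference. The function names, the reference mean and SD, and the cutoff of -2 are illustrative assumptions, not values taken from the paper.

```python
def sd_score(measurement, ref_mean, ref_sd):
    """Standard Deviation Score: distance from the reference mean in SD units."""
    return (measurement - ref_mean) / ref_sd


def refer_on_sds(measurement, ref_mean, ref_sd, cutoff=-2.0):
    """Flag a child for referral when the SDS falls below a chosen cutoff.

    The cutoff is a hypothetical example; an evidence-based criterion would
    be chosen by weighing sensitivity against specificity.
    """
    return sd_score(measurement, ref_mean, ref_sd) < cutoff
```

A lower (more negative) cutoff refers fewer children, which trades sensitivity for specificity.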
Testing for association in case-control genome-wide association studies with shared controls.
Chen, Z., Huang, H., Ng, H. K. T.
Statistical Methods in Medical Research: An International Review Journal. February 01, 2013

The statistical analysis of genome-wide association studies (GWASs) with multiple diseases and shared controls (SCs) is discussed. The usual method for analyzing data from these studies is to compare each individual disease with either the SCs or the pooled controls which include other diseases. We observed that applying individual association tests can be problematic because these tests may suffer from power loss in detecting significant associations between diseases and single-nucleotide polymorphisms or copy number variants. We propose here a two-stage procedure wherein we first apply an overall chi-square test for multiple diseases with SCs; if the overall null hypothesis is rejected, then individual tests using the chi-square partition method are applied to each disease against SCs. A real GWAS data set with SCs and a Monte Carlo simulation study are used to demonstrate that the proposed method is more effective than, and preferable to, other existing methods for analyzing data from GWASs with multiple diseases and SCs.

February 01, 2013 doi: 10.1177/0962280212474061
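The two-stage idea in the shared-controls abstract above can be sketched as follows. This is a simplified illustration under stated assumptions: stage 2 here uses ordinary Pearson chi-square tests of each disease against the shared controls as a stand-in, not the chi-square partition method of the paper, and no multiplicity adjustment is applied. All function names and the count layout (one row of category counts per group) are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{r < df/2} (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= half / r
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite correction series.
    term, total = 1.0, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * x / math.pi) * math.exp(-half) * total


def pearson_chi2(table):
    """Pearson chi-square statistic and df for a table of counts (rows = groups)."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, (len(table) - 1) * (len(table[0]) - 1)


def two_stage_test(disease_counts, control_counts, alpha=0.05):
    """Stage 1: overall test; stage 2 (only if stage 1 rejects): each disease vs controls."""
    overall_table = [control_counts] + list(disease_counts.values())
    stat, df = pearson_chi2(overall_table)
    result = {"overall_p": chi2_sf(stat, df), "individual_p": {}}
    if result["overall_p"] < alpha:
        for name, counts in disease_counts.items():
            s, d = pearson_chi2([counts, control_counts])
            result["individual_p"][name] = chi2_sf(s, d)
    return result
```

Gating the individual tests on the overall test is what distinguishes this from running each disease-versus-controls comparison separately.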
Estimating the prevalence of transmitted HIV drug resistance using pooled samples.
Finucane, M. M., Rowley, C. F., Paciorek, C. J., Essex, M., Pagano, M.
Statistical Methods in Medical Research: An International Review Journal. February 01, 2013

In many resource-poor countries, HIV-infected patients receive a standardized antiretroviral cocktail. In these settings, population-level surveillance of drug resistance is needed to characterize the prevalence of resistance mutations and to enable antiretroviral therapy programs to select the optimal regimen for their local population. The surveillance strategy currently recommended by the World Health Organization is prohibitively expensive in some settings and may not provide a sufficiently precise rendering of the emergence of drug resistance. By using a novel assay on pooled sera samples, we decrease surveillance costs while simultaneously increasing the accuracy of drug resistance prevalence estimates for an important mutation that impacts first-line antiretroviral therapy. We present a Bayesian model for pooled-testing data that garners more information from each resistance assay conducted, compared with individual testing. We expand on previous pooling methods to account for uncertainty about the population distribution of within-subject resistance levels. In addition, our model accounts for measurement error of the resistance assay, and this added uncertainty naturally propagates through the Bayesian model to our inference on the prevalence parameter. We conduct a simulation study that informs our pool size recommendations and that shows that this model renders the prevalence parameter identifiable in instances when an existing non-model-based estimator fails.

February 01, 2013 doi: 10.1177/0962280212473514
The covariate-adjusted frequency plot.
Holling, H., Bohning, W., Bohning, D., Formann, A. K.
Statistical Methods in Medical Research: An International Review Journal. February 01, 2013

Count data arise in numerous fields of interest. Analysis of these data frequently requires distributional assumptions. Although the graphical display of a fitted model is straightforward in the univariate scenario, this becomes more complex if covariate information needs to be included in the model. Stratification is one way to proceed, but has its limitations if the covariate has many levels or the number of covariates is large. The article suggests a marginal method which works even in the case that all possible covariate combinations are different (i.e. no covariate combination occurs more than once). For each covariate combination the fitted model value is computed and then summed over the entire data set. The technique is quite general and works with all count distributional models as well as with all forms of covariate modelling. The article provides illustrations of the method for various situations and also shows that the proposed estimator as well as the empirical count frequency are consistent with respect to the same parameter.

February 01, 2013 doi: 10.1177/0962280212473386
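The marginal construction described in the covariate-adjusted frequency plot abstract above (compute the fitted model value for each covariate combination, then sum over the data set) can be sketched as below. The sketch assumes a Poisson model whose fitted means are already available; the Poisson choice and all function names are illustrative assumptions, and any other count distribution could be substituted for `poisson_pmf`.

```python
import math
from collections import Counter


def poisson_pmf(y, mu):
    """Poisson probability of count y for mean mu > 0, computed on the log scale."""
    return math.exp(-mu + y * math.log(mu) - math.lgamma(y + 1))


def covariate_adjusted_frequencies(fitted_means, max_count):
    """Fitted frequency of each count y: sum over observations of P(Y = y | x_i)."""
    return [
        sum(poisson_pmf(y, mu) for mu in fitted_means)
        for y in range(max_count + 1)
    ]


def observed_frequencies(counts, max_count):
    """Empirical frequency of each count y, for plotting against the fitted ones."""
    tally = Counter(counts)
    return [tally.get(y, 0) for y in range(max_count + 1)]
```

Plotting the two lists side by side gives a display that needs no stratification, so it still works when every covariate combination occurs only once.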
Modeling height for children born small for gestational age treated with growth hormone.
Willemsen, S. P., de Ridder, M., Eilers, P. H. C., Hokken-Koelega, A., Lesaffre, E.
Statistical Methods in Medical Research: An International Review Journal. February 01, 2013

The analysis of growth curves of children can be done either on the original scale or in standard deviation scores. The first approach is found in many statistical textbooks, while the second approach is common in endocrinology, for instance in the evaluation of the effect of growth hormone in children who are born small for gestational age and remain small later in childhood. We illustrate here that the second approach may involve more complex modeling and hence a worse model fit.

February 01, 2013 doi: 10.1177/0962280212473320
Automatic smoothing parameter selection in GAMLSS with an application to centile estimation.
Rigby, R. A., Stasinopoulos, D. M.
Statistical Methods in Medical Research: An International Review Journal. February 01, 2013

A method for automatic selection of the smoothing parameters in a generalised additive model for location, scale and shape (GAMLSS) is introduced. The method uses a P-spline representation of the smoothing terms to express them as random effect terms with an internal (or local) maximum likelihood estimation on the predictor scale of each distribution parameter to estimate its smoothing parameters. This provides a fast method for estimating multiple smoothing parameters. The method is applied to centile estimation where all four parameters of a distribution for the response variable are modelled as smooth functions of a transformed explanatory variable x. This allows smooth modelling of the location, scale, skewness and kurtosis parameters of the response variable distribution as functions of x.

February 01, 2013 doi: 10.1177/0962280212473302
Estimating efficacy in the presence of non-ignorable non-trial interventions in the Helsinki Psychotherapy Study.
Harkanen, T., Arjas, E., Laaksonen, M. A., Lindfors, O., Haukka, J., Knekt, P.
Statistical Methods in Medical Research: An International Review Journal. February 01, 2013

In a randomised clinical trial with a longitudinal outcome, analyses of the efficacy of the study treatments may be complicated by both non-trial interventions, which have not been administered by the researcher, and sparsely measured outcome values. The delay between the change in outcome and the starting of the non-trial intervention may be much shorter than the time intervals between the actual measurements. We propose a model that accounts for the possible dynamic interdependence between the longitudinal outcome and time-to-event data. The model is based on discretising time into short intervals. This results in a missing data problem, which we tackle using Bayesian inference and data augmentation. The method is based on the assumption that decisions to initiate non-trial interventions are not confounded by unobservable factors. The Helsinki Psychotherapy Study data are used as an illustration. Different psychotherapies were compared, and possible episodes of psychotropic medication were viewed as non-trial interventions. Simulation studies suggest that our method provides reasonable estimates of the effects of both the study treatment and the non-trial intervention, and that it shows some robustness against possible latent background factors. An application of marginal structural modelling, however, appeared to underestimate the differences between the treatments.

February 01, 2013 doi: 10.1177/0962280212473348
Interval estimation of random effects in proportional hazards models with frailties.
Ha, I. D., Vaida, F., Lee, Y.
Statistical Methods in Medical Research: An International Review Journal. January 29, 2013

Semi-parametric frailty models are widely used to analyze clustered survival data. In this article, we propose the use of the hierarchical likelihood interval for individual frailties. We study the relationship between hierarchical likelihood, empirical Bayesian, and fully Bayesian intervals for frailties. We show that our proposed interval can be interpreted as a frequentist confidence interval and Bayesian credible interval under a uniform prior. We also propose an adjustment of the proposed interval to avoid null intervals. Simulation studies show that the proposed interval preserves the nominal confidence level. The procedure is illustrated using data from a multicenter lung cancer clinical trial.

January 29, 2013 doi: 10.1177/0962280212474059
Development of a pediatric body mass index using longitudinal single-index models.
Wu, J., Tu, W.
Statistical Methods in Medical Research: An International Review Journal. January 08, 2013

As a measure of human adiposity, the body mass index, defined as weight/height², has been widely used in clinical investigations. For children undergoing pubertal development, whether this function of height and weight represents an optimal way of quantifying body mass for assessing specific health outcomes has not been carefully studied. In this study, we propose an alternative pediatric body mass measure for prediction of blood pressure based on recorded height and weight data using single-index modeling techniques. Specifically, we present a general form of partially linear single-index mixed effect models for the determination of this new metric. A methodological contribution of this research is the development of an efficient algorithm for the fitting of a general class of partially linear single-index models in longitudinal data situations. The proposed model and related model fitting algorithm are easily implementable in most computational platforms. Simulation demonstrates superior performance of the new method, as compared to the standard body mass index measure. Using the proposed method, we explore an alternative body mass measure for the prediction of blood pressure in children. The method is potentially useful for the construction of other indices for specific investigations.

January 08, 2013 doi: 10.1177/0962280212471470
Meta-analysis of two-arm studies: Modeling the intervention effect from survival probabilities.
Combescure, C., Courvoisier, D., Haller, G., Perneger, T.
Statistical Methods in Medical Research: An International Review Journal. December 24, 2012

Pooling the hazard ratios is not always feasible in meta-analyses of two-arm survival studies, because the measure of the intervention effect is not systematically reported. An alternative approach proposed by Moodie et al. is to use the survival probabilities of the included studies, all collected at a single point in time: the intervention effect is then summarised as the pooled ratio of the logarithm of survival probabilities (which is an estimator of the hazard ratios when hazards are proportional). In this article, we propose a generalization of this method. By using survival probabilities at several points in time, this generalization allows a flexible modeling of the intervention over time. The method is applicable to partially proportional hazards models, with the advantage of not requiring the specification of the baseline survival. As in Moodie et al.’s method, the study-level factors modifying the survival functions can be ignored as long as they do not modify the intervention effect. The procedures of estimation are presented for fixed and random effects models. Two illustrative examples are presented.

December 24, 2012 doi: 10.1177/0962280212469716
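The parenthetical claim in the meta-analysis abstract above, that the ratio of log survival probabilities estimates the hazard ratio under proportional hazards, follows from S1(t) = S0(t)^HR, so log S1(t) / log S0(t) = HR at every time t. A small numerical sketch, with a hypothetical function name and made-up exponential survival curves:

```python
import math


def log_survival_ratio(surv_treated, surv_control):
    """Ratio of log survival probabilities at one time point.

    Under proportional hazards this equals the hazard ratio
    (treated versus control), whatever time point is used.
    """
    return math.log(surv_treated) / math.log(surv_control)


# Illustrative check with exponential survival: control hazard 0.10,
# true hazard ratio 0.7 (both values are made up for this sketch).
true_hr = 0.7
ratios = []
for t in (1.0, 5.0, 10.0):
    s_control = math.exp(-0.10 * t)
    s_treated = s_control ** true_hr
    ratios.append(log_survival_ratio(s_treated, s_control))
```

When hazards are not proportional the ratio varies with t, which is the situation that motivates using survival probabilities at several time points.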
A cure rate survival model under a hybrid latent activation scheme.
Borges, P., Rodrigues, J., Louzada, F., Balakrishnan, N.
Statistical Methods in Medical Research: An International Review Journal. December 21, 2012

In lifetime studies, the occurrence of an event (such as tumor detection or death) might be caused by one of many competing causes. Moreover, neither the number of causes nor the time-to-event associated with each cause is usually observable. The number of causes can be zero, corresponding to a cure fraction. In this article, we propose a method of estimating the numerical characteristics of unobservable stages (such as initiation, promotion and progression) of carcinogenesis from data on tumor size at detection in the presence of latent competing causes. To this end, a general survival model for spontaneous carcinogenesis under a hybrid latent activation scheme has been developed to allow for a simple pattern of the dynamics of tumor growth. It is assumed that a tumor becomes detectable when its size attains some threshold level (proliferation of tumor cells (or descendants) generated by the malignant cell), which is treated as a random variable. We assume the number of initiated cells and the number of malignant cells (competing causes) both to follow weighted Poisson distributions. The advantage of this model is that it incorporates into the analysis characteristics of the stage of tumor progression as well as the proportion of initiated cells that had been ‘promoted’ to the malignant ones and the proportion of malignant cells that die before tumor induction. The lifetimes corresponding to each competing cause are assumed to follow a Weibull distribution. Parameter estimation of the proposed model is discussed through the maximum likelihood estimation method. A simulation study has been carried out in order to examine the coverage probabilities of the confidence intervals. Finally, we illustrate the usefulness of the proposed model by applying it to a real data set involving malignant melanoma.

December 21, 2012 doi: 10.1177/0962280212469682
Bayesian multiple imputation for missing multivariate longitudinal data from a Parkinson's disease clinical trial.
Luo, S., Lawson, A. B., He, B., Elm, J. J., Tilley, B. C.
Statistical Methods in Medical Research: An International Review Journal. December 12, 2012

In Parkinson's disease (PD) clinical trials, Parkinson's disease is studied using multiple outcomes of various types (e.g. binary, ordinal, continuous) collected repeatedly over time. The overall treatment effects across all outcomes can be evaluated based on a global test statistic. However, missing data occur in outcomes for many reasons, e.g. dropout, death, etc., and need to be imputed in order to conduct an intent-to-treat analysis. We propose a Bayesian method based on item response theory to perform multiple imputation while accounting for multiple sources of correlation. Sensitivity analysis is performed under various scenarios. Our simulation results indicate that the proposed method outperforms standard methods such as last observation carried forward and separate random effects model for each outcome. Our method is motivated by and applied to a Parkinson's disease clinical trial. The proposed method can be broadly applied to longitudinal studies with multiple outcomes subject to missingness.

December 12, 2012 doi: 10.1177/0962280212469358
Near efficient target allocations in response-adaptive randomization.
Biswas, A., Bhattacharya, R.
Statistical Methods in Medical Research: An International Review Journal. December 12, 2012

Traditionally, optimal target allocation proportions for response-adaptive designs are derived by completely ignoring the actual adaptive randomization procedure. Considering efficiency of the allocation designs, we derive near efficient target proportions to balance between individual and collective ethics. Performance of the derived allocation targets is assessed numerically for binary, normal and exponential responses. Generalization for multiple treatments is also addressed.

December 12, 2012 doi: 10.1177/0962280212468378
Power and sample size calculations for evaluating mediation effects in longitudinal studies.
Wang, C., Xue, X.
Statistical Methods in Medical Research: An International Review Journal. December 06, 2012

Current methods of power and sample size calculations for the design of longitudinal studies to evaluate mediation effects are mostly based on simulation studies and do not provide closed-form formulae. A further challenge due to the longitudinal study design is the consideration of missing data, which almost always occur in longitudinal studies due to staggered entry or dropout. In this article, we consider the product of coefficients as a measure for the longitudinal mediation effect and evaluate three methods for testing the hypothesis on the longitudinal mediation effect: the joint significance test, the normal approximation and the test of b methods. Formulae for power and sample size calculations are provided under each method while taking into account missing data. Performance of the three methods under limited sample size is examined using simulation studies. An example from the Einstein aging study is provided to illustrate the methods.

December 06, 2012 doi: 10.1177/0962280212465163
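Two of the tests named in the mediation abstract above can be sketched in their textbook cross-sectional forms, where a is the exposure-to-mediator coefficient and b the mediator-to-outcome coefficient. This is an assumption-laden illustration: the normal approximation below uses the first-order (Sobel) standard error, and the longitudinal formulae with missing data developed in the paper may differ. Function names are hypothetical.

```python
import math


def normal_sf(z):
    """Upper-tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def normal_approximation_test(a, se_a, b, se_b):
    """Product-of-coefficients test with the first-order (Sobel) standard error."""
    effect = a * b
    se = math.sqrt(a * a * se_b * se_b + b * b * se_a * se_a)
    z = effect / se
    return effect, se, z, 2.0 * normal_sf(abs(z))


def joint_significance_test(a, se_a, b, se_b, alpha=0.05):
    """Declare mediation only if both a and b are individually significant."""
    p_a = 2.0 * normal_sf(abs(a / se_a))
    p_b = 2.0 * normal_sf(abs(b / se_b))
    return p_a < alpha and p_b < alpha
```

The joint significance test never forms the product a*b, so it avoids relying on the normality of a product of two estimates.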
Maximum likelihood estimation of time to first event in the presence of data gaps and multiple events.
Green, C. L., Brownie, C., Boos, D. D., Lu, J.-C., Krucoff, M. W.
Statistical Methods in Medical Research: An International Review Journal. November 18, 2012

We propose a novel likelihood method for analyzing time-to-event data when multiple events and multiple missing data intervals are possible prior to the first observed event for a given subject. This research is motivated by data obtained from a heart monitor used to track the recovery process of subjects experiencing an acute myocardial infarction. The time to first recovery, T₁, is defined as the time when the ST-segment deviation first falls below 50% of the previous peak level. Estimation of T₁ is complicated by data gaps during monitoring and the possibility that subjects can experience more than one recovery. If gaps occur prior to the first observed event, T, the first observed recovery may not be the subject’s first recovery. We propose a parametric gap likelihood function conditional on the gap locations to estimate T₁. Standard failure time methods that do not fully utilize the data are compared to the gap likelihood method by analyzing data from an actual study and by simulation. The proposed gap likelihood method is shown to be more efficient and less biased than interval censoring and more efficient than right censoring if data gaps occur early in the monitoring process or are short in duration.

November 18, 2012 doi: 10.1177/0962280212466089
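The recovery definition in the abstract above (the ST-segment deviation first falling below 50% of the previous peak level) can be sketched as a scan over a monitored series. The sketch assumes that "previous peak" means the running maximum of the readings observed so far and that a missing reading is encoded as `None`; both are assumptions, as is the function name. It finds only the first observed recovery and reports whether a data gap preceded it, which is the situation where the paper's gap likelihood becomes relevant.

```python
def first_observed_recovery(times, st_deviation, fraction=0.5):
    """Return (time of first observed recovery or None, whether a gap came before it).

    A recovery is the first reading that falls below `fraction` times the
    running peak of earlier observed readings. Missing readings are None.
    """
    peak = None
    gap_before = False
    for t, value in zip(times, st_deviation):
        if value is None:
            gap_before = True  # data gap: a recovery could have been missed here
            continue
        if peak is not None and value < fraction * peak:
            return t, gap_before
        if peak is None or value > peak:
            peak = value
    return None, gap_before
```

When the returned flag is true, the observed time is only an upper bound on the true first recovery, so treating it as exact would bias the estimate.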
A comparison of incomplete-data methods for categorical data.
van der Palm, D. W., van der Ark, L. A., Vermunt, J. K.
Statistical Methods in Medical Research: An International Review Journal. November 18, 2012

We studied four methods for handling incomplete categorical data in statistical modeling: (1) maximum likelihood estimation of the statistical model with incomplete data, (2) multiple imputation using a loglinear model, (3) multiple imputation using a latent class model, and (4) multivariate imputation by chained equations. Each method has advantages and disadvantages, and it is unknown which method should be recommended to practitioners. We reviewed the merits of each method and investigated their effect on the bias and stability of parameter estimates and bias of the standard errors. We found that multiple imputation using a latent class model with many latent classes was the most promising method for handling incomplete categorical data, especially when the number of variables used in the imputation model is large.

November 18, 2012 doi: 10.1177/0962280212465502
Comparison of four methods for deriving hospital standardised mortality ratios from a single hierarchical logistic regression model.
Mohammed, M. A., Manktelow, B. N., Hofer, T. P.
Statistical Methods in Medical Research: An International Review Journal. November 06, 2012

There is interest in deriving case-mix adjusted standardised mortality ratios so that comparisons between healthcare providers, such as hospitals, can be undertaken in the controversial belief that variability in standardised mortality ratios reflects quality of care. Typically standardised mortality ratios are derived using a fixed effects logistic regression model, without a hospital term in the model. This fails to account for the hierarchical structure of the data – patients nested within hospitals – and so a hierarchical logistic regression model is more appropriate. Four methods have been advocated for deriving standardised mortality ratios from a hierarchical logistic regression model, but their agreement is not known, nor is it known which is to be preferred. We found significant differences between the four types of standardised mortality ratios because they reflect a range of underlying conceptual issues. The most subtle issue is the distinction between asking how an average patient fares in different hospitals versus how patients at a given hospital fare at an average hospital. Since the answers to these questions are not the same and since the choice between these two approaches is not obvious, the extent to which profiling hospitals on mortality can be undertaken safely and reliably, without resolving these methodological issues, remains questionable.

November 06, 2012 doi: 10.1177/0962280212465165
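For orientation, the conventional calculation that the abstract above describes as typical (a case-mix model without a hospital term) reduces to observed deaths divided by expected deaths, where expected deaths are the sum of model-predicted risks. The sketch below shows only that baseline; it does not reproduce any of the four hierarchical methods compared in the paper. The record layout and function name are assumptions.

```python
from collections import defaultdict


def smr_by_hospital(records):
    """Standardised mortality ratio per hospital: observed / expected deaths.

    `records` is an iterable of (hospital, died, predicted_risk) tuples, where
    `died` is 0 or 1 and `predicted_risk` comes from a case-mix model fitted
    elsewhere (hypothetical input for this sketch).
    """
    observed = defaultdict(float)
    expected = defaultdict(float)
    for hospital, died, predicted_risk in records:
        observed[hospital] += died
        expected[hospital] += predicted_risk
    return {h: observed[h] / expected[h] for h in observed}
```

A ratio above 1 means more deaths than the case-mix model predicts, which is not by itself evidence about quality of care.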
Analysing cognitive test data: Distributions and non-parametric random effects.
Muniz-Terrera, G., Hout, A. v. d., Rigby, R., Stasinopoulos, D.
Statistical Methods in Medical Research: An International Review Journal. November 06, 2012

An important assumption in many linear mixed models is that the conditional distribution of the response variable is normal. This assumption is violated when the models are fitted to an outcome variable that counts the number of correctly answered questions in a questionnaire. Examples include investigations of cognitive decline where models are fitted to Mini Mental State Examination scores, the most widely used test to measure global cognition. Mini Mental State Examination scores take integer values in the 0–30 range, and their distribution has strong ceiling and floor effects. This article explores alternative distributions for the outcome variable in mixed models fitted to Mini Mental State Examination scores from a longitudinal study of ageing. Model fit improved when a beta-binomial distribution was chosen as the distribution for the response variable.

November 06, 2012 doi: 10.1177/0962280212465500
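To see why a beta-binomial distribution can suit scores bounded between 0 and 30, the sketch below evaluates its probability mass function. The shape parameters `a` and `b` and the function names are illustrative assumptions; the paper fits this distribution inside a mixed model, which is not reproduced here.

```python
import math


def log_beta(a, b):
    """Logarithm of the beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def beta_binomial_pmf(k, n, a, b):
    """P(score = k) for a beta-binomial with n items and shape parameters a, b > 0."""
    log_choose = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
    return math.exp(log_choose + log_beta(k + a, n - k + b) - log_beta(a, b))


def binomial_pmf(k, n, p):
    """Ordinary binomial probability, for comparison at the same mean."""
    return math.comb(n, k) * p ** k * (1.0 - p) ** (n - k)


# With a = 6, b = 1 the mean proportion is 6/7; compare the mass at the
# ceiling score of 30 with a binomial that has the same mean.
ceiling_beta_binomial = beta_binomial_pmf(30, 30, 6.0, 1.0)
ceiling_binomial = binomial_pmf(30, 30, 6.0 / 7.0)
```

The beta-binomial places far more probability on the maximum score than the binomial with the same mean, which is the kind of ceiling effect the abstract describes.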
A Bayesian normal mixture accelerated failure time spatial model and its application to prostate cancer.
Wang, S., Zhang, J., Lawson, A. B.
Statistical Methods in Medical Research: An International Review Journal. November 01, 2012

In the United States, prostate cancer is the third most common cause of death from cancer in males of all ages, and the most common cause of death from cancer in males over age 75. It has been recognized that the incidence of prostate cancer is high in African Americans, and its occurrence and progression may be impacted by geographical factors. In order to investigate the spatial effects and racial disparities for prostate cancer in Louisiana, in this article we propose a normal mixture accelerated failure time spatial model, which does not require the proportional hazards assumption and allows a multimodal distribution to be modeled. The proposed model is estimated with a Bayesian approach and it can be easily implemented in WinBUGS. Extensive simulations show that the proposed model provides decent flexibility for a variety of parametric error distributions. The proposed method is applied to the 2000–2007 Louisiana prostate cancer data set from the Surveillance, Epidemiology and End Results Program. The results reveal the possible spatial pattern and racial disparities for prostate cancer in Louisiana.

November 01, 2012 doi: 10.1177/0962280212466189
A literature-based approach to evaluate the predictive capacity of a marker using time-dependent summary receiver operating characteristics.
Combescure, C., Daures, J., Foucher, Y.
Statistical Methods in Medical Research: An International Review Journal. November 01, 2012

Meta-analyses are popular tools to summarize the results of publications. Prognostic performances of a marker are usually summarized by meta-analyses of survival curves or hazard ratios. These approaches may detect a difference in survival according to the marker but do not allow evaluation of its prognostic capacity. Time-dependent receiver operating characteristic curves evaluate the ability of a marker to predict time-to-event. In this article, we describe an adaptation of time-dependent summary receiver operating characteristic curves from published survival curves. To achieve this goal, we modeled the marker and the time-to-event distributions using non-linear mixed models. First, we applied this methodology to individual data in kidney transplantation presented as aggregated data, in order to validate the method. Second, we re-analyzed a published meta-analysis, which focused on the capacity of KI-67 to predict the overall survival of patients with breast cancer.

November 01, 2012 doi: 10.1177/0962280212464542
A Phase I/II trial design when response is unobserved in subjects with dose-limiting toxicity.
Braun, T. M., Kang, S., Taylor, J. M.
Statistical Methods in Medical Research: An International Review Journal. November 01, 2012

We propose a Phase I/II trial design in which subjects with dose-limiting toxicity are not followed for response, leading to three possible outcomes for each subject: dose-limiting toxicity, absence of therapeutic response without dose-limiting toxicity, and presence of therapeutic response without dose-limiting toxicity. We define the latter outcome as a ‘success,’ and the goal of the trial is to identify the dose with the largest probability of success. This dose is commonly referred to as the most successful dose. We propose a design that accumulates information on subjects with regard to both dose-limiting toxicity and response conditional on no dose-limiting toxicity. Bayesian methods are used to update the estimates of dose-limiting toxicity and response probabilities when each subject is enrolled, and we use these methods to determine the dose level assigned to each subject. Due to the need to explore doses more fully, each subject is not necessarily assigned the current estimate of the most successful dose; our algorithm may instead assign a dose that is in a neighborhood of the current most successful dose. We examine the ability of our design to correctly identify the most successful dose in a variety of settings via simulation and compare the performance of our design to that of competing approaches.

November 01, 2012 doi: 10.1177/0962280212464541 open full text
Statistical analysis of life history calendar data.
Eerola, M., Helske, S.
Statistical Methods in Medical Research: An International Review Journal. November 01, 2012

The life history calendar is a data-collection tool for obtaining reliable retrospective data about life events. To illustrate the analysis of such data, we compare the model-based probabilistic event history analysis and the model-free data mining method, sequence analysis. In event history analysis, we estimate the cumulative prediction probabilities of life events over the entire trajectory rather than the transition hazards. In sequence analysis, we compare several dissimilarity metrics and contrast data-driven and user-defined substitution costs. As an example, we study young adults' transition to adulthood as a sequence of events in three life domains. The events define the multistate event history model and the parallel life domains in multidimensional sequence analysis. The relationship between life trajectories and excess depressive symptoms in middle age is further studied by their joint prediction in the multistate model and by regressing the symptom scores on individual-specific cluster indices. The two approaches complement each other in life course analysis; sequence analysis can effectively find typical and atypical life patterns while event history analysis is needed for causal inquiries.

November 01, 2012 doi: 10.1177/0962280212461205 open full text
Misspecification of the covariance structure in generalized linear mixed models.
Chavance, M., Escolano, S.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

When fitting marginal models to correlated outcomes, the so-called sandwich variance is commonly used. However, this is not the case when fitting mixed models. Using two data sets, we illustrate the problems that can be encountered. We show that the differences or the ratios between the naive and sandwich standard deviations of the fixed effects estimators provide convenient means of assessing the fit of the model, as both are consistent when the covariance structure is correctly specified, but only the latter is when that structure is misspecified. When the number of statistical units is not too small, the sandwich formula correctly estimates the variance of the fixed effects estimator even if the random effects are misspecified, and it can be used as a diagnostic tool for assessing the misspecification of the random effects. A simple comparison with the naive variance is sufficient, and we propose treating a ratio of the naive to the sandwich standard deviation outside the [3/4, 4/3] interval as signaling a risk of erroneous inference due to model misspecification. We strongly advocate broader use of the sandwich variance for statistical inference about the fixed effects in mixed models.
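
The ratio rule described in this abstract can be sketched in a few lines of Python. This is only an illustration of the stated [3/4, 4/3] screening interval; the function name and arguments are hypothetical, and the two standard deviations are assumed to have been obtained already from a fitted mixed model.

```python
def naive_sandwich_ratio_flag(naive_sd, sandwich_sd, lower=3 / 4, upper=4 / 3):
    """Return (ratio, flagged) for one fixed-effect estimate.

    naive_sd and sandwich_sd are assumed inputs: the model-based and the
    sandwich standard deviations of the same fixed-effect estimator.
    flagged is True when the ratio falls outside [lower, upper].
    """
    if naive_sd <= 0 or sandwich_sd <= 0:
        raise ValueError("standard deviations must be positive")
    ratio = naive_sd / sandwich_sd
    flagged = not (lower <= ratio <= upper)
    return ratio, flagged
```

For example, a naive standard deviation of 0.10 against a sandwich value of 0.20 gives a ratio of 0.5, which lies outside the interval and would be flagged.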

October 14, 2012 doi: 10.1177/0962280212462859 open full text
A minimal net reclassification improvement to assess predictions of intensive care mortality.
Redondo, Y. T. L., Lambert, J., Chevret, S.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

Objective:
In assessing the improved discrimination of a new prognostic score, the "Net Reclassification Improvement" from reclassification methods appears of interest. We propose a measure that takes into account improvements in predicted probabilities and allows assessing and testing the additional predictive ability of a new scoring system Y compared to a reference score X.
Study design and settings:
To assess and test the improvement in mortality prediction of (X + Y) compared to X, we defined a minimal net reclassification improvement that restricted improvements in predicted probabilities according to some positive threshold. Both absolute and relative improvements were considered. A simulation study was performed to assess its performance in a range of practical situations. We then applied our measures to real intensive care unit data.
Results:
As expected, the minimal net reclassification improvement increased with the effect size of Y and decreased with the value of the threshold. Using relative improvements removed the influence of the population mortality. For given effect sizes of X and Y, the difference in all measures of reclassification decreased when a correlation between X and Y was introduced.
Conclusion:
Reclassification methods, particularly the minimal net reclassification improvement, seem to be clinically relevant when used with continuous clinical data with no known threshold.

October 14, 2012 doi: 10.1177/0962280212459389 open full text
Non-Gaussian Berkson errors in bioassay.
Althubaiti, A., Donev, A.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

The experimental design plays an important role in every experimental study. However, if errors in the settings of the studied factors cannot be avoided, i.e. Berkson errors occur, the estimates of the model parameters may be biased and the variability in the study increased. Correction methods for the effect of Berkson errors are compared. The emphasis is on the study of correlated Berkson errors that follow a non-Gaussian distribution, as this appears to have been a neglected, yet important, area. It is shown that bias correction methods based on the regression calibration approach are useful when the Berkson errors are independent. However, when these errors are dependent, the newly proposed method B-SIMEX clearly outperforms the other methods.

October 14, 2012 doi: 10.1177/0962280212460134 open full text
Design effects for sample size computation in three-level designs.
Cunningham, T. D., Johnson, R. E.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

Experiments with multiple nested levels where randomization can take place at any level bring challenges to the computation of sample sizes. Formulas derived under simple single-level experiments must be adjusted using multiplicative factors or design effects. In this work, we take a unified approach to finding the design effects in terms of intracluster correlations and present formulas to compute sample sizes of different levels. Equal cluster sample sizes and homogeneous within cluster variances are assumed.

October 14, 2012 doi: 10.1177/0962280212460443 open full text
Comparing paired biomarkers in predicting quantitative health outcome subject to random censoring.
Liu, X., Jin, Z., Graziano, J. H.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

This paper uses a non-parametric test, based on consistently estimated discrimination accuracy defined as the concordance probability between a quantitative predictor and the outcome, to compare paired biomarkers in predicting a health outcome, possibly subject to random censoring. Compared with the Wilcoxon test for paired predictors based on Harrell’s C-index, we found that the proposed test is better in the presence of random censoring, although the two unbiased tests are equivalent when the outcome is either uncensored or censored by a constant. A simulation study also demonstrates that the bias in the estimated difference in concordance probability, due to ignoring random censoring, results in overestimated power, especially when random censoring is heavy. The method was applied in two studies, where the biomarkers measured from the same study subjects are correlated. The first study, on 299 school children in Bangladesh, found that higher blood arsenic and manganese were associated with lower intellectual test scores, while the differences between the biomarkers in predicting the intellectual test scores were not statistically significant. The second study, on 418 patients with primary biliary cirrhosis, found that the baseline serum bilirubin had greater discrimination accuracy than the baseline serum albumin in predicting survival time.
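
As a point of reference for the concordance probability mentioned in this abstract, the following Python sketch computes the simple pairwise concordance for fully observed (uncensored) outcomes. It is not the authors' estimator, which is designed to handle random censoring; the function name and the convention of counting predictor ties as one half are assumptions made for illustration.

```python
from itertools import combinations


def concordance_probability(predictor, outcome):
    """Fraction of usable pairs in which predictor and outcome agree in order.

    Assumes every outcome is fully observed (no censoring). Pairs with tied
    outcomes are skipped; pairs with tied predictors count as one half.
    """
    concordant = 0.0
    usable = 0
    for (x1, y1), (x2, y2) in combinations(zip(predictor, outcome), 2):
        if y1 == y2:
            continue  # tied outcomes carry no ordering information
        usable += 1
        if x1 == x2:
            concordant += 0.5
        elif (x1 < x2) == (y1 < y2):
            concordant += 1.0
    if usable == 0:
        raise ValueError("no pairs with distinct outcomes")
    return concordant / usable
```

With predictor values 1, 2, 3, 4 and outcomes 1, 3, 2, 4, five of the six pairs are ordered consistently, so the function returns 5/6.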

October 14, 2012 doi: 10.1177/0962280212460434 open full text
Classification using longitudinal trajectory of biomarker in the presence of detection limits.
Kim, Y., Kong, L.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

Discriminant analysis is commonly used to evaluate the ability of candidate biomarkers to separate patients into pre-defined groups. A recent extension of discriminant analysis to longitudinal data enables us to improve the classification accuracy based on biomarker profiles rather than on a single biomarker measurement. However, the biomarker measurement is often limited by the sensitivity of the given assay, resulting in data that are censored at either the lower or the upper limit of detection. Inappropriate handling of censored data may affect the classification accuracy of a biomarker and hinder the evaluation of its potential discrimination power. We develop a discriminant analysis method for censored longitudinal biomarker data based on mixed models and evaluate its performance by the area under the receiver operating characteristic curve. Through a simulation study, we show that our method is better than the simple substitution methods in terms of parameter estimation and evaluating biomarker performance. Application to a biomarker study of patients with acute kidney injury demonstrates that our method may shed light on the potential clinical utility of biomarkers by taking into account both longitudinal trajectory and limit of detection issues.

October 14, 2012 doi: 10.1177/0962280212460438 open full text
The limitations of simple gene set enrichment analysis assuming gene independence.
Tamayo, P., Steinhardt, G., Liberzon, A., Mesirov, J. P.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

Since its first publication in 2003, the Gene Set Enrichment Analysis method, based on the Kolmogorov-Smirnov statistic, has been heavily used, modified, and also questioned. Recently, a simplified approach that uses a one-sample t-test score to assess enrichment and ignores gene-gene correlations was proposed by Irizarry et al. (2009) as a serious contender. The argument criticizes Gene Set Enrichment Analysis’s nonparametric nature and its use of an empirical null distribution as unnecessary and hard to compute. We refute these claims by careful consideration of the assumptions of the simplified method and its results, including a comparison with Gene Set Enrichment Analysis on a large benchmark set of 50 datasets. Our results provide strong empirical evidence that gene-gene correlations cannot be ignored, because of the significant variance inflation they produce in the enrichment scores, and should be taken into account when estimating gene set enrichment significance. In addition, we discuss the challenges that the complex correlation structure and multi-modality of gene sets pose more generally for gene set enrichment methods.

October 14, 2012 doi: 10.1177/0962280212460441 open full text
Modeling clinical outcome using multiple correlated functional biomarkers: A Bayesian approach.
Long, Q., Zhang, X., Zhao, Y., Johnson, B. A., Bostick, R. M.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

In some biomedical studies, biomarkers are measured repeatedly along some spatial structure or over time and are subject to measurement error. In these studies, it is often of interest to evaluate associations between a clinical endpoint and these biomarkers (also known as functional biomarkers). There are potentially two levels of correlation in such data, namely, between repeated measurements of a biomarker from the same subject and between multiple biomarkers from the same subject; none of the existing methods accounts for correlation between multiple functional biomarkers. We propose a Bayesian approach to model a clinical outcome of interest (e.g. risk for colorectal cancer) in the presence of multiple functional biomarkers while accounting for potential correlation. Our simulations show that the proposed approach achieves good performance in finite samples under various settings. In the presence of substantial or moderate correlation, the proposed approach outperforms an existing approach that does not account for correlation. The proposed approach is applied to a study of biomarkers of risk for colorectal neoplasms and our results show that the risk for colorectal cancer is associated with two functional biomarkers, APC and TGF-α, in particular, with their values in the region between the proliferating and differentiating zones of colorectal crypts.

October 14, 2012 doi: 10.1177/0962280212460444 open full text
Tree-based identification of subgroups for time-varying covariate survival data.
Bertolet, M., Brooks, M. M., Bittner, V.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

Classification and regression tree analyses identify subsets of a sample that differ on an outcome. Discrimination of subsets is performed using recursive binary splitting on a set of covariates, allowing for interactions of variable subgroups not easily captured in standard model building techniques. Using classification and regression tree with epidemiological data can be problematic as there is often a need to adjust for potential confounders and to account for time-varying covariates in the context of right-censored survival data. While classification and regression tree variations exist individually for survival analysis, time-varying covariates and incorporating possible confounders, examples of classification and regression tree using all three together are lacking. We propose a method to identify subsets of time-varying covariate risk factors that affect survival while adjusting for possible confounders. The technique is demonstrated on data from the Bypass Angioplasty Revascularization Investigation 2 Diabetes clinical trial to find combinations of modifiable time-varying cardiac risk factors (e.g. smoking status, blood pressure, lipid levels and HbA1c level) that are associated with time-to-event clinical outcomes.

October 14, 2012 doi: 10.1177/0962280212460442 open full text
Obtaining evidence by a single well-powered trial or several modestly powered trials.
IntHout, J., Ioannidis, J. P., Borm, G. F.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

There is debate about whether clinical trials with suboptimal power are justified and whether results from large studies are more reliable than the (combined) results of smaller trials. We quantified the error rates for evaluations based on single conventionally powered trials (80% or 90% power) versus evaluations based on the random-effects meta-analysis of a series of smaller trials. When a treatment was assumed to have no effect but heterogeneity was present, the error rates for a single trial were increased more than 10-fold above the nominal rate, even for low heterogeneity. Conversely, for meta-analyses on a series of trials, the error rates were correct. When selective publication was present, the error rates were always increased, but they still tended to be lower for a series of trials than for single trials. We conclude that evidence of efficacy based on a series of (smaller) trials may lower the error rates compared with using a single well-powered trial. A single trial is able to provide conclusive evidence only when both heterogeneity and selective publication can be excluded.

October 14, 2012 doi: 10.1177/0962280212461098 open full text
Estimating controlled direct effects in the presence of intermediate confounding of the mediator-outcome relationship: Comparison of five different methods.
Lepage, B., Dedieu, D., Savy, N., Lang, T.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

In mediation analysis between an exposure X and an outcome Y, estimation of the direct effect of X on Y by usual regression after adjustment for the mediator M may be biased if Z is a confounder between M and Y, and is also affected by X. Alternative methods have been described to avoid such a bias: inverse probability of treatment weighting with and without weight truncation, the sequential g-estimator and g-computation. Our aim was to compare the usual linear regression adjusted for M to these methods when estimating the controlled direct effect between X and Y in the causal structure and to explore the size of the potential bias. Estimations were computed in several simulated data sets as well as real data. We observed an increased bias of the controlled direct effect estimation using linear regression adjusted for M for larger effects of X on M and larger effects of Z on M. The sequential g-estimator and g-computation gave unbiased estimations with adequate coverage values in every situation studied. With continuous exposure X and mediator M, inverse probability of treatment weighting resulted in some bias and less satisfactory coverage for large effects of X on M and Z on M.

October 14, 2012 doi: 10.1177/0962280212461194 open full text
Iterated combination-based paired permutation tests to determine shape effects of chemotherapy in patients with esophageal cancer.
Alfieri, R., Bonnini, S., Brombin, C., Castoro, C., Salmaso, L.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

The nonparametric combination of dependent permutation tests method is a useful general tool when a testing problem can be broken down into a set of k > 1 different partial tests. These partial tests, after adjustment of p-values to control for multiplicity, can be marginally analyzed, but jointly considered they can provide information on an overall hypothesis, which might represent the true goal of the testing problem. On the one hand, independence among the partial tests is usually an unrealistic assumption; on the other, even when the underlying dependence relations are known, they are quite often difficult to cope with properly. Therefore this combination must be achieved nonparametrically, by implicitly taking into account the dependence structure of tests without explicitly describing it. An important property of the tests based on nonparametric combination methodology, when the number of response variables is high compared to the sample sizes, is their finite sample consistency. A practical problem involves choosing the most suitable combining function for each specific testing problem, given that the final result can be affected by this crucial choice. The purpose of this article is to present a nonparametric combination solution based on the iterated combination of partial tests, evaluate its power behavior using a Monte Carlo simulation study and apply it to a real medical problem, namely the evaluation of the effects of chemotherapy on the shape of esophageal tumors. R code has been implemented to carry out the analyses.

October 14, 2012 doi: 10.1177/0962280212461981 open full text
Comparison of two drug safety signals in a pharmacovigilance data mining framework.
Tubert-Bitter, P., Begaud, B., Ahmed, I.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

Since adverse drug reactions are a major public health concern, early detection of drug safety signals has become a top priority for regulatory agencies and the pharmaceutical industry. Quantitative methods for analyzing spontaneous reporting material recorded in pharmacovigilance databases through data mining have been proposed in the last decades and are increasingly used to flag potential safety problems. While automated data mining is motivated by the usually huge size of pharmacovigilance databases, it does not systematically produce relevant alerts. Moreover, each detected signal requires appropriate assessment that may involve investigation of the whole therapeutic class. The goal of this article is to provide a methodology for comparing two detected signals. It is nested within the automated surveillance framework as (1) no extra information is required and (2) no simple inference on the actual risks can be extrapolated from spontaneous reporting data. We designed our methodology on the basis of two classical methods used for automated signal detection: the Bayesian Gamma Poisson Shrinker and the frequentist Proportional Reporting Ratio. A simulation study was conducted to assess the performance of both proposed methods. Both methods were then used to compare cardiovascular signals for two HIV treatments from the French pharmacovigilance database.

October 14, 2012 doi: 10.1177/0962280212462295 open full text
A Bayesian semiparametric approach with change points for spatial ordinal data.
Cai, B., Lawson, A. B., McDermott, S., Aelion, C. M.
Statistical Methods in Medical Research: An International Review Journal. October 14, 2012

The change-point model has drawn much attention over the past few decades. It can accommodate the jump process, which allows for changes of the effects before and after the change point. Intellectual disability is a long-term disability that impacts performance in cognitive aspects of life and usually has its onset prior to birth. Among many potential causes, soil chemical exposures are associated with the risk of intellectual disability in children. Motivated by a study for soil metal effects on intellectual disability, we propose a Bayesian hierarchical spatial model with change points for spatial ordinal data to detect the unknown threshold effects. The spatial continuous latent variable underlying the spatial ordinal outcome is modeled by the multivariate Gaussian process, which captures spatial variation and is centered at the nonlinear mean. The mean function is modeled by using the penalized smoothing splines for some covariates with unknown change points and the linear regression for the others. Some identifiability constraints are used to define the latent variable. A simulation example is presented to evaluate the performance of the proposed approach with the competing models. A retrospective cohort study for intellectual disability in South Carolina is used as an illustration.

October 14, 2012 doi: 10.1177/0962280212463415 open full text
A Bayesian path analysis to estimate causal effects of bazedoxifene acetate on incidence of vertebral fractures, either directly or through non-linear changes in bone mass density.
Detilleux, J., Reginster, J.-Y., Chines, A., Bruyere, O.
Statistical Methods in Medical Research: An International Review Journal. September 11, 2012

Background/Aims: Bone mass density values have been related with risk of vertebral fractures in post-menopausal women. However, bone mass density is not perfectly accurate in predicting risk of fracture, which decreases its usefulness as a surrogate in clinical trials. We propose a modeling framework with three interconnected parts to improve the evaluation of bone mass density accuracy in forecasting fractures after treatment.
Methods: The modeling framework includes: (1) a piecewise regression to describe non-linear temporal BMD changes more accurately than crude percent changes, (2) a structural equation model to analyze interdependencies among vertebral fractures and their potential risk factors in preference to regression techniques that consider only directional associations, and (3) a counterfactual causal interpretation of the direct and indirect relationships between treatment and occurrence of vertebral fractures. We apply the methods to BMD repeated measurements from a study of the effect of bazedoxifene acetate on incident vertebral fractures in three different geographical regions.
Results: We made four observations: (1) bone mass density changes varied largely across participants, (2) baseline age and body mass index influenced baseline bone mass density that, in turn, had an effect on prevalent fractures, (3) direct and/or indirect effects of bazedoxifene acetate on incident fractures were different across regions, and (4) estimates of indirect effects were sensitive to the presence of post-treatment unmeasured confounders. In one region, around 40% of the bazedoxifene acetate effect on the occurrence of fracture is explained by its effect on bone mass density. Under the counterfactual approach, this 40% represents the average difference in the occurrence of fracture observed for untreated individuals when their bone mass density values are set at the value under bazedoxifene acetate versus under placebo.
Conclusions: Computational methods are available to evaluate and interpret the capability of a biomarker to act as a surrogate for a primary outcome.

September 11, 2012 doi: 10.1177/0962280212456655 open full text
Analysis of Poisson frequency data under a simple crossover trial.
Lui, K.-J., Chang, K.-C.
Statistical Methods in Medical Research: An International Review Journal. August 16, 2012

When the frequency of occurrence for an event of interest follows a Poisson distribution, we develop asymptotic and exact procedures for testing non-equality, non-inferiority and equivalence, as well as asymptotic and exact interval estimators for the ratio of mean frequencies between two treatments under a simple crossover design. Using Monte Carlo simulations, we evaluate the performance of these test procedures and interval estimators in a variety of situations. We note that all asymptotic test procedures developed here can generally perform well with respect to Type I error and can be preferable to the exact test procedure with respect to power if the number of patients per group is moderate or large. We further find that in these cases the asymptotic interval estimator with the logarithmic transformation can be more precise than the exact interval estimator without sacrificing the accuracy with respect to the coverage probability. However, the exact test procedure and exact interval estimator can be of use when the number of patients per group is small. We use a double-blind randomized crossover trial comparing salmeterol with a placebo in exacerbations of asthma to illustrate the practical use of these estimators.

August 16, 2012 doi: 10.1177/0962280212455753 open full text
Evaluating treatment effect within a multivariate stochastic ordering framework: Nonparametric combination methodology applied to a study on multiple sclerosis.
Brombin, C., Di Serio, C.
Statistical Methods in Medical Research: An International Review Journal. July 26, 2012

Multiple sclerosis is a complex autoimmune disease that affects the central nervous system. It has a multitude of symptoms that are observed in different people in many different ways. At this time, there is no definite cure for multiple sclerosis. However, therapies that slow the progression of disability, control symptoms and help patients to maintain a normal quality of life are available. We focus on relapsing–remitting multiple sclerosis patients treated with interferons or glatiramer acetate. These treatments have been shown to be effective, but their relative effectiveness has not yet been well established. To assess the superiority of a treatment, instead of classical parametric methods, we propose a statistical approach within the permutation setting and the nonparametric combination of dependent permutation tests. In this framework, we may easily handle hypothesis testing problems for multivariate monotonic stochastic ordering. This approach has been motivated by the analysis of a large observational Italian multicentre study on multiple sclerosis, with several continuous and categorical outcomes measured at multiple time points.

July 26, 2012 doi: 10.1177/0962280212454203 open full text
Meta-analysis using Dirichlet process.
Muthukumarana, S., Tiwari, R. C.
Statistical Methods in Medical Research: An International Review Journal. July 16, 2012

This article develops a Bayesian approach for meta-analysis using the Dirichlet process. The key aspect of the Dirichlet process in meta-analysis is the ability to assess evidence of statistical heterogeneity or variation in the underlying effects across studies while relaxing the distributional assumptions. We assume that the study effects are generated from a Dirichlet process. Under a Dirichlet process model, the study effect parameters have support on a discrete space, which enables borrowing of information across studies while facilitating clustering among studies. We illustrate the proposed method by applying it to a dataset on the Program for International Student Assessment on 30 countries. Results from the data analysis, simulation studies, and the log pseudo-marginal likelihood model selection procedure indicate that the Dirichlet process model performs better than conventional alternative methods.

July 16, 2012 doi: 10.1177/0962280212453891 open full text
Bayesian analysis of a disability model for lung cancer survival.
Armero, C., Cabras, S., Castellanos, M., Perra, S., Quiros, A., Oruezabal, M., Sanchez-Rubio, J.
Statistical Methods in Medical Research: An International Review Journal. July 05, 2012

Bayesian reasoning, survival analysis and multi-state models are used to assess survival times for Stage IV non-small-cell lung cancer patients and the evolution of the disease over time. Bayesian estimation is done using minimally informative priors for the Weibull regression survival model, leading to an automatic inferential procedure. Markov chain Monte Carlo methods have been used for approximating posterior distributions and the Bayesian information criterion has been considered for covariate selection. In particular, the posterior distribution of the transition probabilities, resulting from the multi-state model, constitutes a very interesting tool which could be useful to help oncologists and patients make efficient and effective decisions.

July 05, 2012 doi: 10.1177/0962280212452803 open full text
Development and evaluation of multi-marker risk scores for clinical prognosis.
French, B., Saha-Chaudhuri, P., Ky, B., Cappola, T. P., Heagerty, P. J.
Statistical Methods in Medical Research: An International Review Journal. July 05, 2012

Heart failure research suggests that multiple biomarkers could be combined with relevant clinical information to more accurately quantify individual risk and guide patient-specific treatment strategies. Therefore, statistical methodology is required to determine multi-marker risk scores that yield improved prognostic performance. Development of a prognostic score that combines biomarkers with clinical variables requires specification of an appropriate statistical model and is most frequently achieved using standard regression methods such as Cox regression. We demonstrate that care is needed in model specification and that maximal use of marker information requires consideration of potential non-linear effects and interactions. The derived multi-marker score can be evaluated using time-dependent receiver operating characteristic methods, or risk reclassification methods adapted for survival outcomes. We compare the performance of alternative model accuracy methods using simulations, both to evaluate power and to quantify the potential loss in accuracy associated with use of a sub-optimal regression model to develop the multi-marker score. We illustrate development and evaluation strategies using data from the Penn Heart Failure Study. Based on our results, we recommend that analysts carefully examine the functional form for component markers and consider plausible forms for effect modification to maximize the prognostic potential of a model-derived multi-marker score.

July 05, 2012 doi: 10.1177/0962280212451881 open full text
The channel capacity of a diagnostic test as a function of test sensitivity and test specificity.
Benish, W. A.
Statistical Methods in Medical Research: An International Review Journal. July 02, 2012

We apply the information theory concept of "channel capacity" to diagnostic test performance and derive an expression for channel capacity in terms of test sensitivity and test specificity. The expected value of the amount of information a diagnostic test will provide is equal to the "mutual information" between the test result and the disease state. For the case in which only two test results and two disease states are considered, mutual information, I(D;R), is a function of sensitivity, specificity, and the pretest probability of disease. The channel capacity of the test is the maximal value of I(D;R) for a given sensitivity and specificity. After deriving an expression for I(D;R) in terms of sensitivity, specificity, and pretest probability, we solve for the value of pretest probability that maximizes I(D;R). Channel capacity is obtained by using this value of pretest probability to calculate I(D;R). Channel capacity provides a convenient and meaningful single parameter measure of diagnostic test performance. It quantifies the upper limit of the amount of information a diagnostic test can be expected to provide about a patient’s disease state.
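
As a rough numerical illustration of the calculation described above (not the author's derived expression), the following sketch computes I(D;R) in bits for a two-result, two-state test and searches for the pretest probability that maximizes it. The function names and the use of a golden-section search are assumptions introduced here; the search relies on the standard fact that mutual information is concave in the input distribution for a fixed channel.

```python
import math


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 at the endpoints."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def mutual_information(sens, spec, prior):
    """I(D;R) = H(R) - H(R|D) for a binary test and binary disease state."""
    # Probability of a positive result, averaged over disease states.
    p_pos = prior * sens + (1.0 - prior) * (1.0 - spec)
    h_r = binary_entropy(p_pos)
    # Residual uncertainty in the result once the disease state is known.
    h_r_given_d = prior * binary_entropy(sens) + (1.0 - prior) * binary_entropy(spec)
    return h_r - h_r_given_d


def channel_capacity(sens, spec, tol=1e-9):
    """Maximize I(D;R) over the pretest probability by golden-section search.

    Returns (capacity_in_bits, maximizing_pretest_probability).
    """
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        a = hi - inv_phi * (hi - lo)
        b = lo + inv_phi * (hi - lo)
        if mutual_information(sens, spec, a) < mutual_information(sens, spec, b):
            lo = a  # the maximum lies to the right of a
        else:
            hi = b  # the maximum lies to the left of b
    best_prior = (lo + hi) / 2.0
    return mutual_information(sens, spec, best_prior), best_prior
```

For a symmetric test (sensitivity equal to specificity) the result should agree with the familiar binary symmetric channel value, 1 minus the binary entropy of the error rate, attained at a pretest probability of 0.5.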

July 02, 2012 doi: 10.1177/0962280212439742 open full text
Restricted ROC curves are useful tools to evaluate the performance of tumour markers.
Parodi, S., Muselli, M., Carlini, B., Fontana, V., Haupt, R., Pistoia, V., Corrias, M. V.
Statistical Methods in Medical Research: An International Review Journal. June 26, 2012

In Clinical Epidemiology, receiver operating characteristic (ROC) analysis is a standard approach for the evaluation of the performance of diagnostic tests for binary classification based on a tumour marker distribution. The area under a ROC curve is a popular indicator of test accuracy, but its use has been questioned when the curve is asymmetric. This situation often happens when the marker concentrations overlap in the two groups under study in the range of low specificity, corresponding to a subset of values useless for classification purposes (non-informative values). The partial area under the curve at a high specificity threshold has been proposed as an alternative, but a method to identify an optimal cut-off that separates informative from non-informative values is not yet available. In this study, a new statistical approach is proposed to perform this task. Furthermore, a statistical test associated with the area under a ROC curve corresponding to informative values only (restricted ROC curve) is provided and its properties are explored by extensive simulations. Finally, the proposed method is applied to a real data set containing peripheral blood levels of six tumour markers proposed for the diagnosis of neuroblastoma. A new approach to combine couples of markers for classification purposes is also illustrated.

June 26, 2012 doi: 10.1177/0962280212452199 open full text
Causal inference with a quantitative exposure.
Zhang, Z., Zhou, J., Cao, W., Zhang, J.
Statistical Methods in Medical Research: An International Review Journal. June 22, 2012

The current statistical literature on causal inference is mostly concerned with binary or categorical exposures, even though exposures of a quantitative nature are frequently encountered in epidemiologic research. In this article, we review the available methods for estimating the dose–response curve for a quantitative exposure, which include ordinary regression based on an outcome regression model, inverse propensity weighting and stratification based on a propensity function model, and an augmented inverse propensity weighting method that is doubly robust with respect to the two models. We note that an outcome regression model often imposes an implicit constraint on the dose–response curve, and propose a flexible modeling strategy that avoids constraining the dose–response curve. We also propose two new methods: a weighted regression method that combines ordinary regression with inverse propensity weighting and a stratified regression method that combines ordinary regression with stratification. The proposed methods are similar to the augmented inverse propensity weighting method in the sense of double robustness, but easier to implement and more generally applicable. The methods are illustrated with an obstetric example and compared in simulation studies.

June 22, 2012 doi: 10.1177/0962280212452333 open full text
Methods for meta-analysis of individual participant data from Mendelian randomisation studies with binary outcomes.
Burgess, S., Thompson, S. G., CRP CHD Genetics Collaboration
Statistical Methods in Medical Research: An International Review Journal. June 19, 2012

Mendelian randomisation is an epidemiological method for estimating causal associations from observational data by using genetic variants as instrumental variables. Typically the genetic variants explain only a small proportion of the variation in the risk factor of interest, and so large sample sizes are required, necessitating data from multiple sources. Meta-analysis based on individual patient data requires synthesis of studies which differ in many aspects. A proposed Bayesian framework is able to estimate a causal effect from each study, and combine these using a hierarchical model. The method is illustrated for data on C-reactive protein and coronary heart disease (CHD) from the C-reactive protein CHD Genetics Collaboration (CCGC). Studies from the CCGC differ in terms of the genetic variants measured, the study design (prospective or retrospective, population-based or case-control), whether C-reactive protein was measured, the time of C-reactive protein measurement (pre- or post-disease), and whether full or tabular data were shared. We show how these data can be combined in an efficient way to give a single estimate of causal association based on the totality of the data available. Compared to a two-stage analysis, the Bayesian method is able to incorporate data on 23% additional participants and 51% more events, leading to a 23–26% gain in efficiency.

June 19, 2012 doi: 10.1177/0962280212451882 open full text
Unconditional tests for comparing two ordered multinomials.
Shan, G., Ma, C.
Statistical Methods in Medical Research: An International Review Journal. June 13, 2012

We consider two exact unconditional procedures to test the difference between two multinomials with ordered categorical data. Exact unconditional procedures are compared to other approaches based on the Wilcoxon mid-rank test and the proportional odds model. We use a real example from an arthritis pain study to illustrate the various test procedures and provide an extensive numerical study to compare procedures with regard to type I error rates and power under the unconditional framework. The exact unconditional procedure based on estimation followed by maximization is generally more powerful than other procedures, and is therefore recommended for use in practice.

June 13, 2012 doi: 10.1177/0962280212450957 open full text
A comparison of two methods of estimating propensity scores after multiple imputation.
Mitra, R., Reiter, J. P.
Statistical Methods in Medical Research: An International Review Journal. June 11, 2012

In many observational studies, analysts estimate treatment effects using propensity scores, e.g. by matching or sub-classifying on the scores. When some values of the covariates are missing, analysts can use multiple imputation to fill in the missing data, estimate propensity scores based on the m completed datasets, and use the propensity scores to estimate treatment effects. We compare two approaches to implement this process. In the first, the analyst estimates the treatment effect using propensity score matching within each completed data set, and averages the m treatment effect estimates. In the second approach, the analyst averages the m propensity scores for each record across the completed datasets, and performs propensity score matching with these averaged scores to estimate the treatment effect. We compare properties of both methods via simulation studies using artificial and real data. The simulations suggest that the second method has greater potential to produce substantial bias reductions than the first, particularly when the missing values are predictive of treatment assignment.
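
To make the contrast between the two approaches concrete, here is a minimal sketch. It assumes that propensity scores have already been estimated in each of the m completed datasets, that treatment and outcome are fully observed, and that matching is 1-nearest-neighbour with replacement targeting the effect among the treated. The function names and the matching rule are illustrative choices made here, not necessarily those used by the authors.

```python
def matched_effect(scores, treated, outcome):
    """Average treated-minus-matched-control outcome difference.

    Each treated unit is matched (with replacement) to the control unit
    whose propensity score is closest to its own.
    """
    controls = [i for i, t in enumerate(treated) if not t]
    diffs = []
    for i, t in enumerate(treated):
        if t:
            j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
            diffs.append(outcome[i] - outcome[j])
    return sum(diffs) / len(diffs)


def within_approach(score_sets, treated, outcome):
    """First approach: match within each completed dataset, then average
    the m treatment effect estimates."""
    effects = [matched_effect(s, treated, outcome) for s in score_sets]
    return sum(effects) / len(effects)


def across_approach(score_sets, treated, outcome):
    """Second approach: average each record's m propensity scores, then
    match once on the averaged scores."""
    m = len(score_sets)
    averaged = [sum(s[i] for s in score_sets) / m for i in range(len(treated))]
    return matched_effect(averaged, treated, outcome)
```

Here `score_sets` is a list of m lists, one per completed dataset, each holding one propensity score per record in a common record order. The two functions can give different estimates for the same inputs because averaging scores before matching can change which control is selected for a given treated unit.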

June 11, 2012 doi: 10.1177/0962280212445945 open full text
Prior choice in discrete latent modeling of spatially referenced cancer survival.
Lawson, A. B., Choi, J., Zhang, J.
Statistical Methods in Medical Research: An International Review Journal. May 31, 2012

In this article, we examine the development and use of covariate models where the relation with explanatory covariates is spatially adaptive. In this way space is regarded as an effect modifier. We examine the possibility of discrete groupings of coefficients (clustering of coefficients). Our application is to prostate cancer survival based on the SEER cancer registry for the state of Louisiana, USA. This registry holds individual records linked to vital outcomes and is geo-coded at county level. We examine a range of potential prior distributions for groupings of regression coefficients in application to these data.

May 31, 2012 doi: 10.1177/0962280212447148 open full text
Longitudinal data analysis with non-ignorable missing data.
Tseng, C.-h., Elashoff, R., Li, N., Li, G.
Statistical Methods in Medical Research: An International Review Journal. May 24, 2012

A common problem in longitudinal data analysis is missing data. Two types of missing patterns are generally considered in the statistical literature: monotone and non-monotone missing data. Non-monotone missing data occur when study participants intermittently miss scheduled visits, while monotone missing data can arise from discontinued participation, loss to follow-up, and mortality. Although many novel statistical approaches have been developed to handle missing data in recent years, few methods are available that provide inferences while handling both types of missing data simultaneously. In this article, a latent random effects model is proposed to analyze longitudinal outcomes with both monotone and non-monotone missingness in the context of missing not at random. Another significant contribution of this article is a new computational algorithm for latent random effects models. To reduce the computational burden of the high-dimensional integration problem in latent random effects models, we develop an algorithm that uses a new adaptive quadrature approach in conjunction with the Taylor series approximation for the likelihood function to simplify the E-step computation in the expectation–maximization algorithm. A simulation study is performed and the data from the scleroderma lung study are used to demonstrate the effectiveness of this method.

May 24, 2012 doi: 10.1177/0962280212448721 open full text
Bayesian approach to non-inferiority trials for normal means.
Gamalo, M. A., Wu, R., Tiwari, R. C.
Statistical Methods in Medical Research: An International Review Journal. May 21, 2012

The regulatory framework recommends that novel statistical methodology for analyzing trial results parallel the frequentist strategy, e.g. the new method must protect the type-I error rate and arrive at a similar conclusion. Keeping this in mind, we construct a Bayesian approach for non-inferiority trials with normal response. A non-informative prior is assumed for the mean response of the experimental treatment and Jeffreys' prior for its corresponding variance when it is unknown. The posteriors of the mean response and variance of the treatment in historical trials are then assumed as priors for its corresponding parameters in the current trial, where that treatment serves as the active control. From these priors, a Bayesian decision criterion is derived to determine whether the experimental treatment is non-inferior to the active control. This criterion is evaluated and compared with the frequentist method using simulation studies. Results show that both Bayesian and frequentist approaches perform alike, but the Bayesian approach has higher power when the variances are unknown. Both methods also arrive at the same conclusion of non-inferiority when applied to two real datasets. A major advantage of the proposed Bayesian approach lies in its ability to provide posterior probabilities for varying effect sizes of the experimental treatment over the active control.

May 21, 2012 doi: 10.1177/0962280212448723 open full text
A spatial bivariate probit model for correlated binary data with application to adverse birth outcomes.
Neelon, B., Anthopolos, R., Miranda, M. L.
Statistical Methods in Medical Research: An International Review Journal. May 16, 2012

Motivated by a study examining geographic variation in birth outcomes, we develop a spatial bivariate probit model for the joint analysis of preterm birth and low birth weight. The model uses a hierarchical structure to incorporate individual and areal-level information, as well as spatially dependent random effects for each spatial unit. Because rates of preterm birth and low birth weight are likely to be correlated within geographic regions, we model the spatial random effects via a bivariate conditionally autoregressive prior, which induces regional dependence between the outcomes and provides spatial smoothing and sharing of information across neighboring areas. Under this general framework, one can obtain region-specific joint, conditional, and marginal inferences of interest. We adopt a Bayesian modeling approach and develop a practical Markov chain Monte Carlo computational algorithm that relies primarily on easily sampled Gibbs steps. We illustrate the model using data from the 2007–2008 North Carolina Detailed Birth Record.

May 16, 2012 doi: 10.1177/0962280212447149 open full text
On identification in Bayesian disease mapping and ecological-spatial regression models.
MacNab, Y. C.
Statistical Methods in Medical Research: An International Review Journal. May 08, 2012

We discuss identification of structural characteristics of the underlying relative risks ensemble for posterior relative risks inference within Bayesian generalized linear mixed model framework for small-area disease mapping and ecological–spatial regression. We revisit conditionally specified and locally characterized Gaussian Markov random field risks ensemble priors in univariate disease mapping and communicate insight into Gaussian Markov random field variance–covariance characteristics for representing disease risks variability and spatial risks interactions and for structural identification with respect to risks ensemble prior choices. Illustrative examples of identification in Bayesian disease mapping and ecological–spatial regression models are presented for Bayesian hierarchical generalized linear mixed Poisson models and zero-inflated Poisson models.

May 08, 2012 doi: 10.1177/0962280212447152 open full text
Spatial health effects analysis with uncertain residential locations.
Reich, B. J., Chang, H. H., Strickland, M. J.
Statistical Methods in Medical Research: An International Review Journal. May 02, 2012

Spatial epidemiology has benefited greatly from advances in geographic information system technology, which permits extensive study of associations between various health responses and a wide array of socio-economic and environmental factors. However, many spatial epidemiological datasets have missing values for a substantial proportion of spatial variables, such as the census tract of residence of study participants. The standard approach is to discard these observations and analyze only complete observations. In this article, we propose a new hierarchical Bayesian spatial model to handle missing observation locations. Our model utilizes all available information to learn about the missing locations and propagates uncertainty about the missing locations throughout the model. We show via a simulation study that this method can lead to more efficient epidemiological analysis. The method is applied to a study of the relationship between fine particulate matter and birth outcomes in southeast Georgia, where we find smaller posterior variance for most parameters using our missing data model compared to the standard complete case model.

May 02, 2012 doi: 10.1177/0962280212447151 open full text
Interpolation between spatial frameworks: An application of process convolution to estimating neighbourhood disease prevalence.
Congdon, P.
Statistical Methods in Medical Research: An International Review Journal. May 02, 2012

Health data may be collected across one spatial framework (e.g. health provider agencies), but contrasts in health over another spatial framework (neighbourhoods) may be of policy interest. In the UK, population prevalence totals for chronic diseases are provided for populations served by general practitioner practices, but not for neighbourhoods (small areas of circa 1500 people), raising the question whether data for one framework can be used to provide spatially interpolated estimates of disease prevalence for the other. A discrete process convolution is applied to this end and has advantages when there are a relatively large number of area units in one or other framework. Additionally, the interpolation is modified to take account of the observed neighbourhood indicators (e.g. hospitalisation rates) of neighbourhood disease prevalence. These are reflective indicators of neighbourhood prevalence viewed as a latent construct. An illustrative application is to prevalence of psychosis in northeast London, containing 190 general practitioner practices and 562 neighbourhoods, including an assessment of sensitivity to kernel choice (e.g. normal vs exponential). This application illustrates how a zero-inflated Poisson can be used as the likelihood model for a reflective indicator.

May 02, 2012 doi: 10.1177/0962280212447150 open full text
Which individuals make dropout informative?
Geskus, R. B.
Statistical Methods in Medical Research: An International Review Journal. April 25, 2012

Markers are internal host factors that measure the current disease or recovery status of an individual. Individuals with more advanced disease progression are more likely to drop out, e.g. because they die. Marker data after dropout are missing. Such missingness is certainly not completely at random. A mixed effects model can be used if missingness of the marker data depends on measured marker values only (missing at random). If missingness is not at random, such models yield biased results. We describe various approaches that jointly model the marker development and dropout risk and may eliminate bias. One example of such a model is a random effects selection model. Based on a real data set with frequent follow-up, we compare results from a random effects model and a random effects selection model. Results are remarkably similar. In a simulation study, we investigate how the bias in the parameter estimates from a random effects model depends on the frequency of measurements and the time between the last measurement and the dropout or censoring time. Results from the simulation study confirm that the bias is small if follow-up is frequent.

April 25, 2012 doi: 10.1177/0962280212445840 open full text
Dropout in crossover and longitudinal studies: Is complete case so bad?
Matthews, J. N., Henderson, R., Farewell, D. M., Ho, W.-K., Rodgers, L. R.
Statistical Methods in Medical Research: An International Review Journal. April 20, 2012

We discuss inference for longitudinal clinical trials subject to possibly informative dropout. A selection of available methods is reviewed for the simple case of trials with two timepoints. Using data from two such clinical trials, each with two treatments, we demonstrate that different analysis methods can at times lead to quite different conclusions from the same data. We investigate properties of complete-case estimators for the type of trials considered, with emphasis on interpretation and meaning of parameters. We contrast longitudinal and crossover designs and argue that for crossover studies there are often good reasons to prefer a complete case analysis. More generally, we suggest that there is merit in an approach in which no untestable assumptions are made. Such an approach would combine a dropout analysis, an analysis of complete-case data only, and a careful statement of justified conclusions.

April 20, 2012 doi: 10.1177/0962280212445838 open full text
The analysis of multivariate longitudinal data: A review.
Verbeke, G., Fieuws, S., Molenberghs, G., Davidian, M.
Statistical Methods in Medical Research: An International Review Journal. April 20, 2012

Longitudinal experiments often involve multiple outcomes measured repeatedly within a set of study participants. While many questions can be answered by modeling the various outcomes separately, some questions can only be answered in a joint analysis of all of them. In this article, we will present a review of the many approaches proposed in the statistical literature. Four main model families will be presented, discussed and compared. Focus will be on presenting advantages and disadvantages of the different models rather than on the mathematical or computational details.

April 20, 2012 doi: 10.1177/0962280212445834 open full text
Joint latent class models for longitudinal and time-to-event data: A review.
Proust-Lima, C., Sene, M., Taylor, J. M., Jacqmin-Gadda, H.
Statistical Methods in Medical Research: An International Review Journal. April 19, 2012

Most statistical developments in the joint modelling area have focused on the shared random-effect models that include characteristics of the longitudinal marker as predictors in the model for the time-to-event. A less well-known approach is the joint latent class model which consists in assuming that a latent class structure entirely captures the correlation between the longitudinal marker trajectory and the risk of the event. Owing to its flexibility in modelling the dependency between the longitudinal marker and the event time, as well as its ability to include covariates, the joint latent class model may be particularly suited for prediction problems. This article aims at giving an overview of joint latent class modelling, especially in the prediction context. The authors introduce the model, discuss estimation and goodness-of-fit, and compare it with the shared random-effect model. Then, dynamic predictive tools derived from joint latent class models, as well as measures to evaluate their dynamic predictive accuracy, are presented. A detailed illustration of the methods is given in the context of the prediction of prostate cancer recurrence after radiation therapy based on repeated measures of Prostate Specific Antigen.

April 19, 2012 doi: 10.1177/0962280212445839 open full text
On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: Sequential trials, random sample sizes, and missing data.
Molenberghs, G., Kenward, M. G., Aerts, M., Verbeke, G., Tsiatis, A. A., Davidian, M., Rizopoulos, D.
Statistical Methods in Medical Research: An International Review Journal. April 18, 2012

The vast majority of settings for which frequentist statistical properties are derived assume a fixed, a priori known sample size. Familiar properties then follow, such as, for example, the consistency, asymptotic normality, and efficiency of the sample average for the mean parameter, under a wide range of conditions. We are concerned here with the alternative situation in which the sample size is itself a random variable which may depend on the data being collected. Further, the rule governing this may be deterministic or probabilistic. There are many important practical examples of such settings, including missing data, sequential trials, and informative cluster size. It is well known that special issues can arise when evaluating the properties of statistical procedures under such sampling schemes, and much has been written about specific areas (Grambsch P. Sequential sampling based on the observed Fisher information to guarantee the accuracy of the maximum likelihood estimator. Ann Stat 1983; 11: 68–77; Barndorff-Nielsen O and Cox DR. The effect of sampling rules on likelihood statistics. Int Stat Rev 1984; 52: 309–326). Our aim is to place these various related examples into a single framework derived from the joint modeling of the outcomes and sampling process and so derive generic results that in turn provide insight, and in some cases practical consequences, for different settings. It is shown that, even in the simplest case of estimating a mean, some of the results appear counterintuitive. In many examples, the sample average may exhibit small sample bias and, even when it is unbiased, may not be optimal. Indeed, there may be no minimum variance unbiased estimator for the mean. Such results follow directly from key attributes such as non-ancillarity of the sample size and incompleteness of the minimal sufficient statistic of the sample size and sample sum. 
Although our results have direct and obvious implications for estimation following group sequential trials, there are also ramifications for a range of other settings, such as random cluster sizes, censored time-to-event data, and the joint modeling of longitudinal and time-to-event data. Here, we use the simplest group sequential setting to develop and explicate the main results. Some implications for random sample sizes and missing data are also considered. Consequences for other related settings will be considered elsewhere.

April 18, 2012 doi: 10.1177/0962280212445801 open full text
A transformation class for spatio-temporal survival data with a cure fraction.
Hurtado Rua, S. M., Dey, D. K.
Statistical Methods in Medical Research: An International Review Journal. April 18, 2012

We propose a hierarchical Bayesian methodology to model spatially or spatio-temporally clustered survival data with the possibility of cure. A flexible continuous transformation class of survival curves indexed by a single parameter is used. This transformation model is a larger class of models containing, as special cases, two well-known existing models: the proportional hazard and the proportional odds models. The survival curve is modeled as a function of a baseline cumulative distribution function, cure rates, and spatio-temporal frailties. The cure rates are modeled through a covariate link specification and the spatial frailties are specified using a conditionally autoregressive model with time-varying parameters resulting in a spatio-temporal formulation. The likelihood function is formulated assuming that the single parameter controlling the transformation is unknown and full conditional distributions are derived. A model with a non-parametric baseline cumulative distribution function is implemented and a Markov chain Monte Carlo algorithm is specified to obtain the usual posterior estimates, smoothed by regional level maps of spatio-temporal frailties and cure rates. Finally, we apply our methodology to melanoma cancer survival times for patients diagnosed in the state of New Jersey between 2000 and 2007, and with follow-up time until 2007.

April 18, 2012 doi: 10.1177/0962280212445658 open full text
A comparative investigation of methods for longitudinal data with limits of detection through a case study.
Fu, P., Hughes, J., Zeng, G., Hanook, S., Orem, J., Mwanda, O. W., Remick, S. C.
Statistical Methods in Medical Research: An International Review Journal. April 13, 2012

The statistical analysis of continuous longitudinal data may be complicated since quantitative levels of bioassay cannot always be determined. Values beyond the limits of detection (LOD) in the assays may not be observed and are thus censored, adding complexity to the analysis of such data. This article examines how both left-censoring and right-censoring of HIV-1 plasma RNA measurements, collected for the study on AIDS-related Non-Hodgkin’s lymphoma (AR-NHL) in East Africa, affect the quantification of viral load and explores the natural history of viral load measurements over time in AR-NHL patients receiving anticancer chemotherapy. Data analyses using the Monte Carlo EM algorithm (MCEM) are compared to analyses where the LOD or LOD/2 (left-censoring) value is substituted for the censored observations, and also to other methods such as multiple imputation and maximum likelihood estimation for censored data (generalized Tobit regression). Simulations are used to explore the sensitivity of the results to changes in the model parameters. In conclusion, the antiretroviral treatment was associated with a significant decrease in viral load after controlling for the effects of other covariates. A simulation study with finite sample size shows that MCEM is the least biased method and that its estimates are least sensitive to the censoring mechanism.

April 13, 2012 doi: 10.1177/0962280212444800 open full text
Analyzing repeated measures semi-continuous data, with application to an alcohol dependence study.
Liu, L., Strawderman, R. L., Johnson, B. A., O'Quigley, J. M.
Statistical Methods in Medical Research: An International Review Journal. April 02, 2012

Two-part random effects models (Olsen and Schafer,¹ Tooze et al.²) have been applied to repeated measures of semi-continuous data, characterized by a mixture of a substantial proportion of zero values and a skewed distribution of positive values. In the original formulation of this model, the natural logarithm of the positive values is assumed to follow a normal distribution with a constant variance parameter. In this article, we review and consider three extensions of this model, allowing the positive values to follow (a) a generalized gamma distribution, (b) a log-skew-normal distribution, and (c) a normal distribution after the Box-Cox transformation. We allow for the possibility of heteroscedasticity. Maximum likelihood estimation is shown to be conveniently implemented in SAS Proc NLMIXED. The performance of the methods is compared through applications to daily drinking records in a secondary data analysis from a randomized controlled trial of topiramate for alcohol dependence treatment. We find that all three models provide a significantly better fit than the log-normal model, and there exists strong evidence for heteroscedasticity. We also compare the three models by the likelihood ratio tests for non-nested hypotheses (Vuong³). The results suggest that the generalized gamma distribution provides the best fit, though no statistically significant differences are found in pairwise model comparisons.

April 02, 2012 doi: 10.1177/0962280212443324 open full text
A new statistical decision rule for single-arm phase II oncology trials.
Chen, Y., Chen, Z., Mori, M.
Statistical Methods in Medical Research: An International Review Journal. March 28, 2012

Most single-arm phase II clinical trials compare the efficacy of a new treatment with historical controls through statistical hypothesis testing. One major problem with such a comparison is that the efficacy of the historical control is treated as a known constant, whereas in reality, it is never precisely known. This partially explains why many "Go" decisions made in single-arm phase II trials are shown to be incorrect in phase III trials. In this paper, we propose a new decision rule for an improved transitional decision for single-arm phase II oncology clinical trials with binary endpoints. This new decision rule is jointly based on the p value and a new statistical index named the testing confidence value. The testing confidence value reflects the uncertainty associated with the null value in the hypothesis testing of single-arm trials. Simulations are used to evaluate the operating characteristics of the new decision rule in comparison with the traditional decision rule and a widely used Bayesian decision rule. The application of the new decision rule is illustrated using a clinical trial on marginally resectable pancreatic cancer. A webpage http://www.yiyichenbiostatistics.com/TCV.html is available for readers to interactively compute the testing confidence value and to find the suggested decision based on the new decision rule.

March 28, 2012 doi: 10.1177/0962280212442584 open full text
Using proportion of similar response to evaluate correlates of protection for vaccine efficacy.
Giacoletti, K. E., Heyse, J.
Statistical Methods in Medical Research: An International Review Journal. March 26, 2012

A question of interest in many vaccine clinical development programmes is whether vaccine-induced serum antibody level can be used as a correlate of vaccine efficacy; that is, whether serum antibody levels induced by a candidate vaccine can reliably predict the risk of breakthrough disease. Traditionally, analyses to answer this question have been based on modelling the incidence of breakthrough disease as a function of antibody level, among vaccinated subjects in clinical trials. The Proportion of Similar Response (PSR) method will be described and explored, and compared to the Receiver Operating Characteristic (ROC) curve as a graphical tool and the area under the ROC curve (AUROC) as a summary measure in the context of evaluating correlates of protection. A way to use PSR analysis as complementary to Youden’s index as a simple and elegant method to determine the discriminatory ability of a test and to set an optimal threshold value will be presented. In addition, the relationships among PSR and other measures of overlap and discrimination will be described. An example based on a clinical trial from a development programme for a vaccine against human papillomavirus (HPV) will be presented.

March 26, 2012 doi: 10.1177/0962280211416299 open full text
Response-adaptive designs for continuous treatment responses in phase III clinical trials: A review.
Biswas, A., Bhattacharya, R.
Statistical Methods in Medical Research: An International Review Journal. March 16, 2012

A variety of response-adaptive randomization procedures have been proposed in the literature assuming binary outcomes. However, the list is not so long for continuous outcomes, though many real clinical trials deal with continuous treatment responses. In this paper, we attempt to explore the available procedures together with a comparison of their performances. Some real-life adaptive trials are also reviewed.

March 16, 2012 doi: 10.1177/0962280212441424 open full text
Binomial regression with a misclassified covariate and outcome.
Luo, S., Chan, W., Detry, M. A., Massman, P. J., Doody, R. S.
Statistical Methods in Medical Research: An International Review Journal. March 15, 2012

Misclassification occurring in either outcome variables or categorical covariates or both is a common issue in medical science. It leads to biased results and distorted disease–exposure relationships. Moreover, it is often of clinical interest to obtain the estimates of sensitivity and specificity of some diagnostic methods even when neither gold standard nor prior knowledge about the parameters exists. We present a novel Bayesian approach in binomial regression when both the outcome variable and one binary covariate are subject to misclassification. Extensive simulation results under various scenarios and a real clinical example are given to illustrate the proposed approach. This approach is motivated and applied to a dataset from the Baylor Alzheimer's Disease and Memory Disorders Center.

March 15, 2012 doi: 10.1177/0962280212441965 open full text
Group sequential control of overall toxicity incidents in clinical trials - non-Bayesian and Bayesian approaches.
Yu, J., Hutson, A. D., Siddiqui, A. H., Kedron, M. A.
Statistical Methods in Medical Research: An International Review Journal. March 09, 2012

In some small clinical trials, toxicity is not a primary endpoint; however, it often has dire effects on patients’ quality of life and is even life-threatening. For such clinical trials, rigorous control of the overall incidence of adverse events is desirable, while simultaneously collecting safety information. In this article, we propose group sequential toxicity monitoring strategies to control overall toxicity incidents below a certain level as opposed to performing hypothesis testing, which can be incorporated into an existing study design based on the primary endpoint. We consider two sequential methods: a non-Bayesian approach in which stopping rules are obtained based on the ‘future’ probability of an excessive toxicity rate; and a Bayesian adaptation modifying the proposed non-Bayesian approach, which can use the information obtained at interim analyses. Through an extensive Monte Carlo study, we show that the Bayesian approach often provides better control of the overall toxicity rate than the non-Bayesian approach. We also investigate adequate toxicity estimation after the studies. We demonstrate the applicability of our proposed methods in controlling the symptomatic intracranial hemorrhage rate for treating acute ischemic stroke patients.

March 09, 2012 doi: 10.1177/0962280212440535 open full text
Sample size determination for disease prevalence studies with partially validated data.
Qiu, S.-F., Poon, W.-Y., Tang, M.-L.
Statistical Methods in Medical Research: An International Review Journal. February 28, 2012

Disease prevalence is an important topic in medical research, and its study is based on data that are obtained by classifying subjects according to whether a disease has been contracted. Classification can be conducted with high-cost gold standard tests or low-cost screening tests, but the latter are subject to the misclassification of subjects. As a compromise between the two, many research studies use partially validated datasets in which all data points are classified by fallible tests, and some of the data points are validated in the sense that they are also classified by the completely accurate gold-standard test. In this article, we investigate the determination of sample sizes for disease prevalence studies with partially validated data. We use two approaches. The first is to find sample sizes that can achieve a pre-specified power of a statistical test at a chosen significance level, and the second is to find sample sizes that can control the width of a confidence interval with a pre-specified confidence level. Empirical studies have been conducted to demonstrate the performance of various testing procedures with the proposed sample sizes. The applicability of the proposed methods is illustrated by a real-data example.

February 28, 2012 doi: 10.1177/0962280212439576 open full text
Modeling fecundity in the presence of a sterile fraction using a semi-parametric transformation model for grouped survival data.
McLain, A. C., Sundaram, R., Buck Louis, G. M.
Statistical Methods in Medical Research: An International Review Journal. February 28, 2012

The analysis of fecundity data is challenging and requires consideration of both highly timed and interrelated biologic processes in the context of essential behaviors such as sexual intercourse during the fertile window. Understanding human fecundity is further complicated by presence of a sterile population, i.e. couples unable to achieve pregnancy. Modeling techniques conducted to date have largely relied upon discrete time-to-pregnancy survival or day-specific probability models to estimate the determinants of time-to-pregnancy or acute effects, respectively. We developed a class of semi-parametric grouped transformation cure models that capture day-level variates purported to affect the cycle-level hazards of conception and, possibly, sterility. Our model's performance is assessed using simulation and longitudinal data from one of the few prospective cohort studies with preconception enrollment of women followed for 12 menstrual cycles at risk for pregnancy.

February 28, 2012 doi: 10.1177/0962280212438646 open full text
A semi-parametric approach to the frequency of occurrence under a simple crossover trial.
Lui, K.-J., Chang, K.-C.
Statistical Methods in Medical Research: An International Review Journal. February 23, 2012

To analyze the frequency of occurrence for an event of interest in a crossover design, we propose a semi-parametric approach. We develop two point estimators and four interval estimators in closed forms for the treatment effect under a random effects multiplicative risk model. Using Monte Carlo simulations, we evaluate these estimators and compare the four interval estimators with the classical interval estimator suggested elsewhere in a variety of situations. We note that the point estimator using the ratio of two arithmetic averages of mean frequencies under a multiplicative risk model can be comparable to the point estimator using the ratio of two geometric averages of mean frequencies. We note that as long as the number of patients per group is large, all the four interval estimators developed here can perform well. We also note that the classical interval estimator derived under the commonly assumed Poisson distribution for the frequency data can be conservative and lose precision if the Poisson distribution assumption is violated. We use a double-blind randomized crossover trial comparing salmeterol with a placebo in exacerbations of asthma to illustrate the practical use of these estimators.

February 23, 2012 doi: 10.1177/0962280212438157 open full text
Consistent causal effect estimation under dual misspecification and implications for confounder selection procedures.
Gruber, S., van der Laan, M. J.
Statistical Methods in Medical Research: An International Review Journal. February 23, 2012

In a previously published article in this journal, Vansteelandt et al. [Stat Methods Med Res. Epub ahead of print 12 November 2010. DOI: 10.1177/0962280210387717] address confounder selection in the context of causal effect estimation in observational studies. They discuss several selection strategies and propose a procedure whose performance is guided by the quality of the exposure effect estimator. The authors note that when a particular linearity condition is met, consistent estimation of the target parameter can be achieved even under dual misspecification of models for the association of confounders with exposure and outcome, and they demonstrate the performance of their procedure relative to other estimators when this condition holds. Our earlier published work on collaborative targeted minimum loss based learning provides a general theoretical framework for effective confounder selection that explains the findings of Vansteelandt et al. and underscores the appropriateness of their suggestions that a confounder selection procedure should be concerned with directly targeting the quality of the estimate and that desirable estimators produce valid confidence intervals and are robust to dual misspecification.

February 23, 2012 doi: 10.1177/0962280212437451 open full text
A comparison of power analysis methods for evaluating effects of a predictor on slopes in longitudinal designs with missing data.
Wang, C., Hall, C. B., Kim, M.
Statistical Methods in Medical Research: An International Review Journal. February 21, 2012

In many longitudinal studies, evaluating the effect of a binary or continuous predictor variable on the rate of change of the outcome, i.e. slope, is of primary interest. Sample size determination of these studies, however, is complicated by the expectation that missing data will occur due to missed visits, early drop out, and staggered entry. Despite the availability of methods for assessing power in longitudinal studies with missing data, the impact on power of the magnitude and distribution of missing data in the study population remains poorly understood. As a result, simple but erroneous alterations of the sample size formulae for complete/balanced data are commonly applied. These ‘naive’ approaches include the average sum of squares and average number of subjects methods. The goal of this article is to explore in greater detail the effect of missing data on study power and compare the performance of naive sample size methods to a correct maximum likelihood-based method using both mathematical and simulation-based approaches. Two different longitudinal aging studies are used to illustrate the methods.

February 21, 2012 doi: 10.1177/0962280212437452 open full text
Unscaled Bayes factors for multiple hypothesis testing in microarray experiments.
Bertolino, F., Cabras, S., Castellanos, M. E., Racugno, W.
Statistical Methods in Medical Research: An International Review Journal. February 15, 2012

Multiple hypothesis testing collects a series of techniques usually based on p-values as a summary of the available evidence from many statistical tests. In hypothesis testing, under a Bayesian perspective, the evidence for a specified hypothesis against an alternative, conditionally on data, is given by the Bayes factor. In this study, we approach multiple hypothesis testing based on both Bayes factors and p-values, regarding multiple hypothesis testing as a multiple model selection problem. To obtain the Bayes factors we assume default priors that are typically improper. In this case, the Bayes factor is usually undetermined due to the ratio of prior pseudo-constants. We show that ignoring prior pseudo-constants leads to unscaled Bayes factors, which do not invalidate the inferential procedure in multiple hypothesis testing, because they are used within a comparative scheme. In fact, using partial information from the p-values, we are able to approximate the sampling null distribution of the unscaled Bayes factor and use it within Efron's multiple testing procedure. The simulation study suggests that under a normal sampling model and even with small sample sizes, our approach provides false positive and false negative proportions that are lower than those of other common multiple hypothesis testing approaches based only on p-values. The proposed procedure is illustrated in two simulation studies, and the advantages of its use are shown in the analysis of two microarray experiments.

February 15, 2012 doi: 10.1177/0962280212437827 open full text
Minimal sufficient balance - a new strategy to balance baseline covariates and preserve randomness of treatment allocation.
Zhao, W., Hill, M. D., Palesch, Y.
Statistical Methods in Medical Research: An International Review Journal. January 26, 2012

In many clinical trials, baseline covariates could affect the primary outcome. Commonly used strategies to balance baseline covariates include stratified constrained randomization and minimization. Stratification is limited to few categorical covariates. Minimization lacks the randomness of treatment allocation. Both apply only to categorical covariates. As a result, serious imbalances could occur in important baseline covariates not included in the randomization algorithm. Furthermore, randomness of treatment allocation could be significantly compromised because of the high proportion of deterministic assignments associated with stratified block randomization and minimization, potentially resulting in selection bias. Serious baseline covariate imbalances and selection biases often contribute to controversial interpretation of the trial results. The National Institute of Neurological Disorders and Stroke recombinant tissue plasminogen activator Stroke Trial and the Captopril Prevention Project are two examples. In this article, we propose a new randomization strategy, termed the minimal sufficient balance randomization, which will dually prevent serious imbalances in all important baseline covariates, including both categorical and continuous types, and preserve the randomness of treatment allocation. Computer simulations are conducted using the data from the National Institute of Neurological Disorders and Stroke recombinant tissue plasminogen activator Stroke Trial. Serious imbalances in four continuous and one categorical covariate are prevented with a small cost in treatment allocation randomness. A scenario of simultaneously balancing 11 baseline covariates is explored with similar promising results. The proposed minimal sufficient balance randomization algorithm can be easily implemented in computerized central randomization systems for large multicenter trials.

January 26, 2012 doi: 10.1177/0962280212436447 open full text
Impact of weighted composite compared to traditional composite endpoints for the design of randomized controlled trials.
Bakal, J. A., Westerhout, C. M., Armstrong, P. W.
Statistical Methods in Medical Research: An International Review Journal. January 24, 2012

Composite endpoints are commonly used in cardiovascular clinical trials. When using a composite endpoint a subject is considered to have an event when the first component endpoint has occurred. The use of composite endpoints offers the ability to incorporate several clinically important endpoint events thereby augmenting the event rate and increasing statistical power for a given sample size. One assumption of the composite is that all component events are of equal clinical importance. This assumption is rarely achieved given the diversity of component endpoints included. One means of adjusting for this diversity is to adjust the outcomes using severity weights determined a priori. The use of a weighted endpoint also allows for the incorporation of multiple endpoints per patient. Although weighting the outcomes lowers the effective number of events, it offers additional information that reduces the variance of the estimate. We created a series of simulation studies to examine the effect on power as the individual components of a typical composite were changed. In one study, we noted that the weighted composite was able to offer discriminative power when the component outcomes were altered, while the traditional method was not. In the other study, we noted that the weighted composite offered a similar level of power to the traditional composite when the change was driven by the more severe endpoints.

January 24, 2012 doi: 10.1177/0962280211436004 open full text
Multiple comparisons with a control for a latent variable model with ordered categorical responses.
Lu, T.-Y., Poon, W.-Y., Cheung, S. H.
Statistical Methods in Medical Research: An International Review Journal. January 19, 2012

Ordered categorical data are frequently encountered in clinical studies. A popular method for comparing the efficacy of treatments is to use logistic regression with the proportional odds assumption. The test statistic is based on the Wilcoxon–Mann–Whitney test. However, the proportional odds assumption may not be appropriate. In such cases, the probability of rejecting the null hypothesis is much inflated even though the treatments have the same mean efficacy. An alternative approach that does not rely on the proportional odds assumption is to conceptualize the responses as manifestations of some underlying continuous variables. However, statistical procedures were developed only for the comparison of two treatments. In this article, we derive testing procedures that compare several treatments to a control, utilizing a latent normal distribution with the latent variable model. The proposed procedure is useful because multiple comparisons with a control is very frequently an objective of a clinical study. Data from clinical trials are used to illustrate the proposed procedures.

January 19, 2012 doi: 10.1177/0962280211434425 open full text
Multivariate approach for protein identification based on mass spectrometric data.
Lee, J. B., Lee, J. W.
Statistical Methods in Medical Research: An International Review Journal. January 18, 2012

Protein mass spectrometry provides a powerful tool for detecting and identifying proteins. Several database searching algorithms may be used for this purpose. However, most of them depend on heuristic approaches, and the use of probability-based or statistical approaches has been very restricted in current algorithms. In this study, we present a statistical modelling of scores based on a generalized linear mixed model and provide a feasible computation method using penalized generalized weighted least squares. This model incorporates the dependency among matches into a new statistical scoring function, and uses the beta-binomial distribution to derive the score. Based on simulation experiments and analysis using real examples, we have improved protein searching performance and provided feasible computation procedures to deal with very large datasets. In particular, our methods may significantly increase accuracy in identifying medium and small proteins.

January 18, 2012 doi: 10.1177/0962280211434960 open full text
Some common errors of experimental design, interpretation and inference in agreement studies.
Erdmann, T., De Mast, J., Warrens, M.
Statistical Methods in Medical Research: An International Review Journal. January 13, 2012

We signal and discuss common methodological errors in agreement studies and the use of kappa indices, as found in publications in the medical and behavioural sciences. Our analysis is based on a proposed statistical model that is in line with the typical models employed in metrology and measurement theory. A first cluster of errors is related to nonrandom sampling, which results in a potentially substantial bias in the estimated agreement. Second, when class prevalences are strongly nonuniform, the use of the kappa index becomes precarious, as its large partial derivatives result in typically large standard errors of the estimates. In addition, the index reflects rather one-sidedly in such cases the consistency of the most prevalent class, or the class prevalences themselves. A final cluster of errors concerns interpretation pitfalls, which may lead to incorrect conclusions based on agreement studies. These interpretation issues are clarified on the basis of the proposed statistical modelling. The signalled errors are illustrated from actual studies published in prestigious journals. The analysis results in a number of guidelines and recommendations for agreement studies, including the recommendation to use alternatives to the kappa index in certain situations.

January 13, 2012 doi: 10.1177/0962280211433597 open full text
Finding differentially expressed genes in high dimensional data: Rank based test statistic via a distance measure.
Mathur, S., Sadana, A.
Statistical Methods in Medical Research: An International Review Journal. January 12, 2012

We present a rank-based test statistic for the identification of differentially expressed genes using a distance measure. The proposed test statistic is highly robust against extreme values and does not assume a distribution for the parent population. Simulation studies show that the proposed test is more powerful than some of the commonly used methods, such as the paired t-test, the Wilcoxon signed rank test, and significance analysis of microarray (SAM), under certain non-normal distributions. The asymptotic distribution of the test statistic and the p-value function are discussed. The application of the proposed method is shown using a real-life data set.

January 12, 2012 doi: 10.1177/0962280211434428 open full text
Simpson's paradox - aggregating and partitioning populations in health disparities of lung cancer patients.
Fu, P., Panneerselvam, A., Clifford, B., Dowlati, A., Ma, P., Zeng, G., Halmos, B., Leidner, R.
Statistical Methods in Medical Research: An International Review Journal. January 12, 2012

It is well known that non-small cell lung cancer (NSCLC) is a heterogeneous group of diseases. Previous studies have demonstrated genetic variation among different ethnic groups in the epidermal growth factor receptor (EGFR) in NSCLC. Research by our group and others has recently shown a lower frequency of EGFR mutations in African Americans with NSCLC, as compared to their White counterparts. In this study, we use our original study data of EGFR pathway genetics in African American NSCLC as an example to illustrate that univariate analyses based on aggregation versus partition of data lead to contradictory results, in order to emphasize the importance of controlling statistical confounding. We further investigate analytic approaches in logistic regression for data with separation, as is the case in our example data set, and apply appropriate methods to identify predictors of EGFR mutation. Our simulation shows that with separated or nearly separated data, penalized maximum likelihood (PML) produces estimates with smallest bias and approximately maintains the nominal value with statistical power equal to or better than that from maximum likelihood and exact conditional likelihood methods. Application of the PML method in our example data set shows that race and EGFR-FISH are independently significant predictors of EGFR mutation.

January 12, 2012 doi: 10.1177/0962280211434179 open full text
Vertical modeling: Analysis of competing risks data with missing causes of failure.
Nicolaie, M., Houwelingen, H. v., Putter, H.
Statistical Methods in Medical Research: An International Review Journal. December 16, 2011

We propose vertical modelling as a natural approach to the problem of analysis of competing risks data when failure types are missing for some individuals. Under a natural missing-at-random assumption for these missing failure types, we use the observed data likelihood to estimate its parameters and show that the all-cause hazard and the relative hazards appearing in vertical modelling are indeed key quantities of this likelihood. This fact has practical implications in that it suggests vertical modelling as a simple and attractive method of analysis in competing risks with missing causes of failure; all individuals are used in estimating the all-cause hazard and only those with non-missing cause of failure for relative hazards. The relative hazards also appear in a multiple imputation approach to the same problem proposed by Lu and Tsiatis and in the EM algorithm. We compare the vertical modelling approach with the method of Goetghebeur and Ryan for a breast cancer data set, highlighting the different aspects they contribute to the data analysis.

December 16, 2011 doi: 10.1177/0962280211432067 open full text
Optimal designs for epidemiologic longitudinal studies with binary outcomes.
Mehtala, J., Auranen, K., Kulathinal, S.
Statistical Methods in Medical Research: An International Review Journal. December 13, 2011

Alternating presence and absence of a medical condition in human subjects is often modelled as an outcome of underlying process dynamics. Longitudinal studies provide important insights into research questions involving such dynamics. This article concerns optimal designs for studies in which the dynamics are modelled as a binary continuous-time Markov process. Either one or both of the transition rate parameters in the model are to be estimated with maximum precision from a sequence of observations made at discrete times on a number of subjects. The design questions concern the choice of time interval between observations, the initial state of each subject and the choice between number of subjects versus repeated observations per subject. Sequential designs are considered due to dependence of the designs on the model parameters. The optimal time spacing can be approximated by the reciprocal of the sum of the two rates. The initial distribution of the study subjects should be taken into account when relatively few repeated samples per subject are to be collected. A study with a reasonably large size should be designed in more than one phase because there are then enough observations to be spent in the first phase to revise the time spacing for the subsequent phases.

December 13, 2011 doi: 10.1177/0962280211430663 open full text
Non-parametric estimation of relative risk in survival and associated tests.
Wakounig, S., Heinze, G., Schemper, M.
Statistical Methods in Medical Research: An International Review Journal. December 12, 2011

We extend the Tarone and Ware scheme of weighted log-rank tests to cover the associated weighted Mantel–Haenszel estimators of relative risk. Weighting functions previously employed are critically reviewed. The notion of an average hazard ratio is defined and its connection to the effect size measure P(Y > X) is emphasized. The connection makes estimation of P(Y > X) possible also under censoring. Two members of the extended Tarone–Ware scheme accomplish the estimation of intuitively interpretable average hazard ratios, also under censoring and time-varying relative risk, which is achieved by an inverse probability of censoring weighting. The empirical properties of the members of the extended Tarone–Ware scheme are demonstrated by a Monte Carlo study. The differential role of the weighting functions considered is illustrated by a comparative analysis of four real data sets.

December 12, 2011 doi: 10.1177/0962280211431022 open full text
A two-way enriched clinical trial design: combining advantages of placebo lead-in and randomized withdrawal.
Ivanova, A., Tamura, R. N.
Statistical Methods in Medical Research: An International Review Journal. December 04, 2011

A new clinical trial design, designated the two-way enriched design (TED), is introduced, which augments the standard randomized placebo-controlled trial with second-stage enrichment designs in placebo non-responders and drug responders. The trial is run in two stages. In the first stage, patients are randomized between drug and placebo. In the second stage, placebo non-responders are re-randomized between drug and placebo and drug responders are re-randomized between drug and placebo. All first-stage data, and second-stage data from first-stage placebo non-responders and first-stage drug responders, are utilized in the efficacy analysis. The authors developed one, two and three degrees of freedom score tests for treatment effect in the TED and give formulae for asymptotic power and for sample size computations. The authors compute the optimal allocation ratio between drug and placebo in the first stage for the TED and compare the operating characteristics of the design to the standard parallel clinical trial, placebo lead-in and randomized withdrawal designs. Two motivating examples from different disease areas are presented to illustrate the possible design considerations.

December 04, 2011 doi: 10.1177/0962280211431023 open full text
Bayesian analysis on meta-analysis of case-control studies accounting for within-study correlation.
Chen, Y., Chu, H., Luo, S., Nie, L., Chen, S.
Statistical Methods in Medical Research: An International Review Journal. December 04, 2011

In retrospective studies, the odds ratio is often used as the measure of association. Under an independent beta prior assumption, the exact posterior distribution of the odds ratio given a single 2 x 2 table has been derived in the literature. However, independence between risks within the same study may be an oversimplified assumption because cases and controls in the same study are likely to share some common factors and thus to be correlated. Furthermore, in a meta-analysis of case–control studies, investigators usually have multiple 2 x 2 tables. In this article, we first extend the published results on a single 2 x 2 table to allow within-study prior correlation while retaining the advantage of a closed-form posterior formula, and then extend the results to multiple 2 x 2 tables and the regression setting. The hyperparameters, including within-study correlation, are estimated via an empirical Bayes approach. The overall odds ratio and the exact posterior distribution of the study-specific odds ratio are inferred based on the estimated hyperparameters. We conduct simulation studies to verify our exact posterior distribution formulas and investigate the finite sample properties of the inference for the overall odds ratio. The results are illustrated through a twin study for genetic heritability and a meta-analysis for the association between the N-acetyltransferase 2 (NAT2) acetylation status and colorectal cancer.

December 04, 2011 doi: 10.1177/0962280211430889 open full text
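
The starting point of the abstract above, independent beta priors on the two exposure probabilities of a single 2 x 2 table, can be illustrated with a short simulation. This is only a sketch: it uses Monte Carlo draws rather than the closed-form posterior the abstract refers to, and it does not cover the within-study correlation that the article adds. The function names, the Beta(1, 1) default prior and the counts in the usage note are illustrative assumptions.

```python
import random


def posterior_odds_ratio_draws(x_case, n_case, x_ctrl, n_ctrl,
                               a=1.0, b=1.0, draws=20000, seed=0):
    """Sorted Monte Carlo draws from the posterior of the odds ratio.

    Assumes independent Beta(a, b) priors on the exposure probability
    among cases and among controls (hypothetical defaults a = b = 1).
    """
    rng = random.Random(seed)
    out = []
    for _ in range(draws):
        # Conjugate update: Beta prior + binomial data -> Beta posterior.
        p1 = rng.betavariate(a + x_case, b + n_case - x_case)
        p0 = rng.betavariate(a + x_ctrl, b + n_ctrl - x_ctrl)
        out.append((p1 / (1.0 - p1)) / (p0 / (1.0 - p0)))
    out.sort()
    return out


def quantile(sorted_draws, q):
    """Empirical quantile of an already sorted list of draws."""
    idx = min(len(sorted_draws) - 1, max(0, int(q * len(sorted_draws))))
    return sorted_draws[idx]
```

For example, `quantile(posterior_odds_ratio_draws(30, 100, 15, 100), 0.5)` gives a posterior median for made-up counts of 30/100 exposed cases and 15/100 exposed controls; the 0.025 and 0.975 quantiles give an equal-tailed credible interval.
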
Slope estimation for informatively right censored longitudinal data modelling the number of observations using geometric and Poisson distributions: application to renal transplant cohort.
Jaffa, M. A., Lipsitz, S., Woolson, R. F.
Statistical Methods in Medical Research: An International Review Journal. December 04, 2011

Analysis of longitudinal data is often complicated by the presence of informative right censoring. This type of censoring should be accounted for in the analysis so that valid slope estimates are attained. In this study, we developed a new likelihood-based approach wherein the likelihood function is integrated over random effects to obtain a marginal likelihood function. Maximum likelihood estimates for the population slope were acquired by direct maximisation of the marginal likelihood function and empirical Bayes estimates for the individual slopes were generated using Gaussian quadrature. The performance of the model was assessed using the geometric and Poisson distributions to model the number of observations for every individual subject. Our model generated valid estimates for the slopes under both distributions with minimal bias and mean squared errors. Our sensitivity analysis confirmed the robustness of the model to assumptions pertaining to the underlying distribution and demonstrated its insensitivity to normality assumptions. Moreover, superiority of the model in terms of accuracy of slope estimates was consistently shown across the different levels of censoring in comparison to the naïve and bootstrap approaches. This model was illustrated using the cohort of renal transplant patients and estimates of the slopes that are adjusted for informative right censoring were acquired.

December 04, 2011 doi: 10.1177/0962280211430681 open full text
Methods for observational post-licensure medical product safety surveillance.
Nelson, J. C., Cook, A. J., Yu, O., Zhao, S., Jackson, L. A., Psaty, B. M.
Statistical Methods in Medical Research: An International Review Journal. December 02, 2011

Post-licensure medical product safety surveillance is important for detecting adverse events potentially not identified pre-licensure. Historically, post-licensure safety monitoring has been accomplished using passive reporting systems and by conducting formal Phase IV randomized trials or large epidemiological studies, also known as safety surveillance or pharmacovigilance studies. However, crucial gaps in the safety evidence base provided by these approaches have led to high profile product withdrawals and growing public concern about unknown health risks associated with licensed products. To address the limitations of existing surveillance systems and to facilitate more accurate and rapid detection of safety problems, new systems involving active surveillance of large, population-based cohorts using observational health care utilization databases are being developed. In this article, we review common statistical methods that have been employed previously for post-licensure safety monitoring, including data mining and sequential hypothesis testing, and assess which methods may be promising for potential use within this newly proposed prospective observational cohort monitoring framework. We discuss gaps in existing approaches and identify areas where methodological development is needed to improve the success of safety surveillance efforts in this setting.

December 02, 2011 doi: 10.1177/0962280211413452 open full text
The cross-validated AUC for MCP-logistic regression with high-dimensional data.
Jiang, D., Huang, J., Zhang, Y.
Statistical Methods in Medical Research: An International Review Journal. November 28, 2011

We propose a cross-validated area under the receiver operating characteristic (ROC) curve (CV-AUC) criterion for tuning parameter selection for penalized methods in sparse, high-dimensional logistic regression models. We use this criterion in combination with the minimax concave penalty (MCP) method for variable selection. The CV-AUC criterion is specifically designed for optimizing the classification performance for binary outcome data. To implement the proposed approach, we derive an efficient coordinate descent algorithm to compute the MCP-logistic regression solution surface. Simulation studies are conducted to evaluate the finite sample performance of the proposed method and its comparison with the existing methods including the Akaike information criterion (AIC), Bayesian information criterion (BIC) or Extended BIC (EBIC). The model selected based on the CV-AUC criterion tends to have a larger predictive AUC and smaller classification error than those with tuning parameters selected using the AIC, BIC or EBIC. We illustrate the application of the MCP-logistic regression with the CV-AUC criterion on three microarray datasets from the studies that attempt to identify genes related to cancers. Our simulation studies and data examples demonstrate that the CV-AUC is an attractive method for tuning parameter selection for penalized methods in high-dimensional logistic regression models.

November 28, 2011 doi: 10.1177/0962280211428385 open full text
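
Two ingredients named in the abstract above can be written down compactly: the minimax concave penalty for a single coefficient and an empirical AUC. The sketch below is not the authors' coordinate descent algorithm. Averaging the held-out AUC over folds is an assumption about how a CV-AUC criterion could be computed, and all function names are hypothetical.

```python
def mcp_penalty(beta, lam, gamma):
    """Minimax concave penalty for one coefficient (lam >= 0, gamma > 1)."""
    b = abs(beta)
    if b <= gamma * lam:
        return lam * b - b * b / (2.0 * gamma)
    # Beyond gamma * lam the penalty is flat, so large effects are not shrunk.
    return 0.5 * gamma * lam * lam


def auc(scores, labels):
    """Empirical AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as one half. Labels are 1 (positive) or 0 (negative).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def cv_auc(fold_scores, fold_labels):
    """Mean held-out AUC across folds (one assumed form of the criterion)."""
    values = [auc(s, y) for s, y in zip(fold_scores, fold_labels)]
    return sum(values) / len(values)
```

Under this reading, a tuning parameter would be chosen by fitting the penalized model on each training fold, scoring the held-out fold, and keeping the value with the largest `cv_auc`.
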
Finding consistent patterns: A nonparametric approach for identifying differential expression in RNA-Seq data.
Li, J., Tibshirani, R.
Statistical Methods in Medical Research: An International Review Journal. November 28, 2011

We discuss the identification of features that are associated with an outcome in RNA-Sequencing (RNA-Seq) and other sequencing-based comparative genomic experiments. RNA-Seq data takes the form of counts, so models based on the normal distribution are generally unsuitable. The problem is especially challenging because different sequencing experiments may generate quite different total numbers of reads, or ‘sequencing depths’. Existing methods for this problem are based on Poisson or negative binomial models: they are useful but can be heavily influenced by ‘outliers’ in the data. We introduce a simple, non-parametric method with resampling to account for the different sequencing depths. The new method is more robust than parametric methods. It can be applied to data with quantitative, survival, two-class or multiple-class outcomes. We compare our proposed method to Poisson and negative binomial-based methods in simulated and real data sets, and find that our method discovers more consistent patterns than competing methods.

November 28, 2011 doi: 10.1177/0962280211428386 open full text
Some considerations of classification for high dimension low-sample size data.
Zhang, L., Lin, X.
Statistical Methods in Medical Research: An International Review Journal. November 23, 2011

We review in this article several classification methods, especially for high-dimensional and low-sample size data. We discuss several desirable properties for classifiers in such settings, including predictability, consistency, generality, stability, robustness and sparsity. Specifically, a good classifier should have a small prediction error (predictability); converge to the Bayes-rule classifier asymptotically (consistency); be stable when adding/removing an observation (generality); be stable for different data sets of the same kind (stochastic stability); be stable when there are a small number of contaminated observations (robustness); and have a small number of variables in the classifier (interpretability or sparsity). Several simulation examples and real applications are used to illustrate the usefulness of the existing popular classifiers and compare their performance.

November 23, 2011 doi: 10.1177/0962280211428387 open full text
Frailties in multi-state models: Are they identifiable? Do we need them?
Putter, H., van Houwelingen, H. C.
Statistical Methods in Medical Research: An International Review Journal. November 23, 2011

The inclusion of latent frailties in survival models can serve two purposes: (1) the modelling of dependence in clustered data, (2) explaining lack of fit of univariate survival models, like deviation from the proportional hazards assumption. Multi-state models are somewhere between univariate data and clustered data. Frailty models can help in understanding the dependence in sequential transitions (like in clustered data) and can be useful in explaining some strange phenomena in the effect of covariates in competing risks models (like in univariate data). The (im)possibilities of frailty models will be exemplified on a data set of breast cancer patients with death as absorbing state and local recurrence and distant metastasis as intermediate events.

November 23, 2011 doi: 10.1177/0962280211424665 open full text
Controlling false positive selections in high-dimensional regression and causal inference.
Buhlmann, P., Rutimann, P., Kalisch, M.
Statistical Methods in Medical Research: An International Review Journal. November 23, 2011

Guarding against false positive selections is important in many applications. We discuss methods based on subsampling and sample splitting for controlling the expected number of false positives and assigning p-values. They are generic and especially useful for high-dimensional settings. We review encouraging results for regression, and we discuss new adaptations and remaining challenges for selecting relevant variables, based on observational data, having a causal or interventional effect on a response of interest.

November 23, 2011 doi: 10.1177/0962280211428371 open full text
Survival extrapolation using the poly-Weibull model.
Demiris, N., Lunn, D., Sharples, L. D.
Statistical Methods in Medical Research: An International Review Journal. November 21, 2011

Recent studies of (cost-) effectiveness in cardiothoracic transplantation have required estimation of mean survival over the lifetime of the recipients. In order to calculate mean survival, the complete survivor curve is required but is often not fully observed, so that survival extrapolation is necessary. After transplantation, the hazard function is bathtub-shaped, reflecting latent competing risks which operate additively in overlapping time periods. The poly-Weibull distribution is a flexible parametric model that may be used to extrapolate survival and has a natural competing risks interpretation. In addition, treatment effects and subgroups can be modelled separately for each component of risk. We describe the model and develop inference procedures using freely available software. The methods are applied to two problems from cardiothoracic transplantation.

November 21, 2011 doi: 10.1177/0962280211419645 open full text
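
The additive competing-risks structure described in the abstract above means that a poly-Weibull hazard is a sum of Weibull hazards. The sketch below assumes the parameterization h(t) = sum_j rate_j * shape_j * t^(shape_j - 1), which may differ from the one used in the article, and approximates mean survival by integrating the survivor function numerically. Function names and parameter values are illustrative only.

```python
import math


def poly_weibull_hazard(t, components):
    """Hazard at t > 0 for components given as (rate, shape) pairs."""
    return sum(rate * shape * t ** (shape - 1.0) for rate, shape in components)


def poly_weibull_survival(t, components):
    """Survivor function S(t) = exp(-sum_j rate_j * t**shape_j), t >= 0."""
    return math.exp(-sum(rate * t ** shape for rate, shape in components))


def mean_survival(components, upper, steps=20000):
    """Trapezoid-rule integral of S(t) over [0, upper].

    This is the restricted mean up to `upper`; it approximates the full
    mean only when S(upper) is negligible.
    """
    h = upper / steps
    total = 0.5 * (poly_weibull_survival(0.0, components)
                   + poly_weibull_survival(upper, components))
    for i in range(1, steps):
        total += poly_weibull_survival(i * h, components)
    return total * h
```

A bathtub-shaped hazard arises when one component has shape below 1 (early, decreasing risk) and another has shape above 1 (late, increasing risk), for example `[(0.3, 0.5), (0.01, 2.5)]`.
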
Extension of the modified Poisson regression model to prospective studies with correlated binary data.
Zou, G. Y., Donner, A.
Statistical Methods in Medical Research: An International Review Journal. November 08, 2011

The Poisson regression model using a sandwich variance estimator has become a viable alternative to the logistic regression model for the analysis of prospective studies with independent binary outcomes. The primary advantage of this approach is that it readily provides covariate-adjusted risk ratios and associated standard errors. In this article, the model is extended to studies with correlated binary outcomes as arise in longitudinal or cluster randomization studies. The key step involves a cluster-level grouping strategy for the computation of the middle term in the sandwich estimator. For a single binary exposure variable without covariate adjustment, this approach results in risk ratio estimates and standard errors that are identical to those found in the survey sampling literature. Simulation results suggest that it is reliable for studies with correlated binary data, provided the total number of clusters is at least 50. Data from observational and cluster randomized studies are used to illustrate the methods.

November 08, 2011 doi: 10.1177/0962280211427759 open full text
Piecewise mixed-effects models with skew distributions for evaluating viral load changes: A Bayesian approach.
Huang, Y., Dagne, G. A., Zhou, S., Wang, Z.
Statistical Methods in Medical Research: An International Review Journal. November 01, 2011

Studies of human immunodeficiency virus dynamics in acquired immunodeficiency syndrome (AIDS) research are very important in evaluating the effectiveness of antiretroviral (ARV) therapies. The potency of ARV agents in AIDS clinical trials can be assessed on the basis of a viral response such as viral decay rate or viral load change in plasma. Following ARV treatment, the profile of each subject's viral load tends to follow a ‘broken stick’-like dynamic trajectory, indicating multiple phases of decline and increase in viral loads. Such multiple phases (change-points) can be described by a random change-point model with random subject-specific parameters. One usually assumes a normal distribution for model error. However, this assumption may be unrealistic, obscuring important features of within- and among-subject variations. In this article, we propose piecewise linear mixed-effects models with skew-elliptical distributions to describe the time trend of a response variable under a Bayesian framework. This methodology can be widely applied to real problems for longitudinal studies. A real data analysis, using viral load data from an AIDS study, is carried out to illustrate the proposed method by comparing various candidate models. Biologically important findings are reported, and these findings also suggest that it is very important to assume a model with skew distribution in order to achieve reliable results, in particular, when the data exhibit skewness.

November 01, 2011 doi: 10.1177/0962280211426184 open full text
Change point detection in risk adjusted control charts.
Assareh, H., Smith, I., Mengersen, K.
Statistical Methods in Medical Research: An International Review Journal. October 23, 2011

Precise identification of the time when a change in a clinical process has occurred enables experts to identify a potential special cause more effectively. In this article, we develop change point estimation methods for a clinical dichotomous process in the presence of case mix. We apply Bayesian hierarchical models to formulate the change point where there exists a step change in the odds ratio and logit of risk of a Bernoulli process. Markov Chain Monte Carlo is used to obtain posterior distributions of the change point parameters including location and magnitude of changes and also corresponding probabilistic intervals and inferences. The performance of the Bayesian estimator is investigated through simulations and the result shows that precise estimates can be obtained when they are used in conjunction with the risk-adjusted CUSUM and EWMA control charts. In comparison with alternative EWMA and CUSUM estimators, more accurate and precise estimates are obtained by the Bayesian estimator. These advantages are enhanced when probability quantification, flexibility and generalizability of the Bayesian change point detection model are also considered. The Deviance Information Criterion, as a model selection criterion in the Bayesian context, is applied to find the best change point model for a given dataset where there is no prior knowledge about the change type in the process.

October 23, 2011 doi: 10.1177/0962280211426356 open full text
A joint model for the dependence between clustered times to tumour progression and deaths: A meta-analysis of chemotherapy in head and neck cancer.
Rondeau, V., Pignon, J.-P., Michiels, S.
Statistical Methods in Medical Research: An International Review Journal. October 23, 2011

The observation of time to tumour progression (TTP) or progression-free survival (PFS) may be terminated by a terminal event. In this context, deaths may be due to tumour progression, and the time to the major failure event (death) may be correlated with the TTP. The usual assumption of independence between the TTP process and death, required by many commonly used statistical methods, can be violated. Furthermore, although the relationship between TTP and time to death is most relevant to the anti-cancer drug development or to evaluation of TTP as a surrogate endpoint, statistical models that try to describe the dependence structure between these two characteristics are not frequently used. We propose a joint frailty model for the analysis of two survival endpoints, TTP and time to death, or PFS and time to death, in the context of data clustering (e.g. at the centre or trial level). This approach allows us to simultaneously evaluate the prognostic effects of covariates on the two survival endpoints, while accounting both for the relationship between the outcomes and for data clustering. We show how a maximum penalized likelihood estimation can be applied to a nonparametric estimation of the continuous hazard functions in a general joint frailty model with right censoring and delayed entry. The model was motivated by a large meta-analysis of randomized trials for head and neck cancers (Meta-Analysis of Chemotherapy in Head and Neck Cancers), in which the efficacy of chemotherapy on TTP or PFS and overall survival was investigated, as adjunct to surgery or radiotherapy or both.

October 23, 2011 doi: 10.1177/0962280211425578 open full text
Linear time-dependent reference intervals where there is measurement error in the time variable - a parametric approach.
Gillard, J.
Statistical Methods in Medical Research: An International Review Journal. October 19, 2011

This article re-examines parametric methods for the calculation of time specific reference intervals where there is measurement error present in the time covariate. Previous published work has commonly been based on the standard ordinary least squares approach, weighted where appropriate. In fact, this is an incorrect method when there are measurement errors present, and in this article, we show that the use of this approach may, in certain cases, lead to referral patterns that may vary with different values of the covariate. Thus, it would not be the case that all patients are treated equally; some subjects would be more likely to be referred than others, hence violating the principle of equal treatment required by the International Federation for Clinical Chemistry. We show, by using measurement error models, that reference intervals are produced that satisfy the requirement for equal treatment for all subjects.

October 19, 2011 doi: 10.1177/0962280211426617 open full text
Impact of delayed diagnosis time in estimating progression rates to hepatitis C virus-related cirrhosis and death.
Fu, B., Wang, W., Shi, X.
Statistical Methods in Medical Research: An International Review Journal. October 18, 2011

Diagnosis of hepatitis C virus (HCV), and hence treatment to avert cirrhosis, is often delayed because the early stage of HCV progression is latent. Current methods to determine the incubation time to HCV-related cirrhosis and the duration from cirrhosis to subsequent events (e.g. complications or death) are based on the time of liver biopsy diagnosis and ignore this delay, which leads to interval censoring for the first event time and double censoring for the subsequent event time. To investigate the impact of this delay on estimated HCV progression rates and the associated estimation bias, we present a correlated two-stage progression model for delayed diagnosis time and fit the developed model to the previously studied hepatitis C cohort data from Edinburgh. Our analysis shows that taking the delayed diagnosis into account gives a mildly different estimate of progression rate to cirrhosis and significantly lower estimated progression rate to HCV-related death in comparison with conventional modelling. We also find that when the delay increases, the bias in estimating progression increases significantly.

October 18, 2011 doi: 10.1177/0962280211424667 open full text
Recommended confidence intervals for two independent binomial proportions.
Fagerland, M. W., Lydersen, S., Laake, P.
Statistical Methods in Medical Research: An International Review Journal. October 13, 2011

The relationship between two independent binomial proportions is commonly estimated and presented using the difference between proportions, the number needed to treat, the ratio of proportions or the odds ratio. Several different confidence intervals are available, but they can produce markedly different results. Some of the traditional approaches, such as the Wald interval for the difference between proportions and the Katz log interval for the ratio of proportions, do not perform well unless the sample size is large. Better intervals are available. This article describes and compares approximate and exact confidence intervals that are – with one exception – easy to calculate or available in common software packages. We illustrate the performances of the intervals and make recommendations for both small and moderate-to-large sample sizes.

October 13, 2011 doi: 10.1177/0962280211415469 open full text
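
The two traditional intervals that the abstract above singles out as performing poorly in small samples are easy to state in code, which makes it simple to compare them with better alternatives. The sketch below implements only the Wald interval for the difference and the Katz log interval for the ratio; it does not reproduce the intervals the article recommends. Function names and the counts in the usage note are illustrative assumptions.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a 95% interval


def wald_difference_ci(x1, n1, x2, n2, z=Z95):
    """Wald interval for p1 - p2 from x1/n1 and x2/n2 successes/trials."""
    p1, p2 = x1 / n1, x2 / n2
    se = math.sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se


def katz_log_ratio_ci(x1, n1, x2, n2, z=Z95):
    """Katz log interval for p1 / p2; undefined when either count is zero."""
    if x1 == 0 or x2 == 0:
        raise ValueError("Katz log interval requires x1 > 0 and x2 > 0")
    ratio = (x1 / n1) / (x2 / n2)
    # Standard error of log(ratio) from the delta method.
    se = math.sqrt(1.0 / x1 - 1.0 / n1 + 1.0 / x2 - 1.0 / n2)
    return ratio * math.exp(-z * se), ratio * math.exp(z * se)
```

With made-up counts of 15/50 versus 5/50, `wald_difference_ci(15, 50, 5, 50)` is centred on the observed difference of 0.2, while `katz_log_ratio_ci(15, 50, 5, 50)` is symmetric around the observed ratio of 3 on the log scale. The Wald interval can extend beyond [-1, 1] and collapses to zero width when both proportions are 0 or 1, which is one reason it behaves badly in small samples.
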
Assessing the sensitivity of methods for estimating principal causal effects.
Stuart, E. A., Jo, B.
Statistical Methods in Medical Research: An International Review Journal. October 03, 2011

The framework of principal stratification provides a way to think about treatment effects conditional on post-randomization variables, such as level of compliance. In particular, the complier average causal effect (CACE) – the effect of the treatment for those individuals who would comply with their treatment assignment under either treatment condition – is often of substantive interest. However, estimation of the CACE is not always straightforward, with a variety of estimation procedures and underlying assumptions, but little advice to help researchers select between methods. In this article, we discuss and examine two methods that rely on very different assumptions to estimate the CACE: a maximum likelihood (‘joint’) method that assumes the ‘exclusion restriction,’ (ER) and a propensity score-based method that relies on ‘principal ignorability.’ We detail the assumptions underlying each approach, and assess each methods' sensitivity to both its own assumptions and those of the other method using both simulated data and a motivating example. We find that the ER-based joint approach appears somewhat less sensitive to its assumptions, and that the performance of both methods is significantly improved when there are strong predictors of compliance. Interestingly, we also find that each method performs particularly well when the assumptions of the other approach are violated. These results highlight the importance of carefully selecting an estimation procedure whose assumptions are likely to be satisfied in practice and of having strong predictors of principal stratum membership.

October 03, 2011 doi: 10.1177/0962280211421840 open full text
Model-based approaches to synthesize microarray data: a unifying review using mixture of SEMs.
Martella, F., Vermunt, J.
Statistical Methods in Medical Research: An International Review Journal. September 25, 2011

Several statistical methods are nowadays available for the analysis of gene expression data recorded through microarray technology. In this article, we take a closer look at several Gaussian mixture models which have recently been proposed to model gene expression data. It can be shown that these are special cases of a more general model, called the mixture of structural equation models (mixture of SEMs), which has been developed in psychometrics. This model combines mixture modelling and SEMs by assuming that component-specific means and variances are subject to a SEM. The connection with SEM is useful for at least two reasons: (1) it shows the basic assumptions of existing methods more explicitly and (2) it helps in straightforward development of alternative mixture models for gene expression data with alternative mean/covariance structures. Different specifications of mixture of SEMs for clustering gene expression data are illustrated using two benchmark datasets.

September 25, 2011 doi: 10.1177/0962280211419482 open full text
Efficient design of cluster randomized and multicentre trials with unknown intraclass correlation.
van Breukelen, G. J., Candel, M. J.
Statistical Methods in Medical Research: An International Review Journal. September 20, 2011

For cluster randomized and multicentre trials evaluating the effect of a treatment on persons nested within clusters, equations have been published to compute the optimal sample sizes at the cluster and person level as a function of sampling costs and intraclass correlation (ICC). Here, optimal means maximum power and precision for a given sampling budget, or minimum sampling costs for a given power and precision. However, the ICC is usually unknown, and the optimal sample sizes depend strongly on this ICC. To overcome this local optimality problem, this study presents Maximin designs (MMDs) based on relative efficiency (RE) and efficiency. These designs perform well over a range of possible ICC values either in terms of RE compared with the locally optimal designs, or in terms of minimum efficiency (maximum variance) of the treatment effect estimator. The use of MMDs is illustrated using information from many cluster randomized trials in primary care. It is concluded that MMDs and the optimal design for an ICC halfway through its assumed range are efficient for a range of ICC values and recommendable for practical use. This requires that trial reports mention the study cost per cluster and person.

September 20, 2011 doi: 10.1177/0962280211421344 open full text
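
The local optimality problem in the abstract above can be made concrete for a cluster randomized trial. Assume a fixed budget, a cost c per cluster and s per person, and equal cluster sizes m. The variance of the treatment effect estimator is then proportional to (rho + (1 - rho)/m)(c + s*m), which is minimized at m = sqrt((c/s)(1 - rho)/rho). The sketch below uses that textbook result and a simple grid search for a maximin cluster size based on relative efficiency. It is an illustration under these assumptions, not the authors' derivation; it does not cover the multicentre case, and all names are hypothetical.

```python
import math


def variance_factor(m, rho, c, s):
    """Quantity proportional to Var(effect estimate) at a fixed budget."""
    return (rho + (1.0 - rho) / m) * (c + s * m)


def locally_optimal_m(rho, c, s):
    """Cluster size minimizing the variance when the ICC rho is known."""
    return math.sqrt((c / s) * (1.0 - rho) / rho)


def relative_efficiency(m, rho, c, s):
    """Efficiency of cluster size m relative to the local optimum at rho."""
    best = variance_factor(locally_optimal_m(rho, c, s), rho, c, s)
    return best / variance_factor(m, rho, c, s)


def maximin_m(rho_lo, rho_hi, c, s, m_max=200, grid=50):
    """Integer cluster size maximizing the worst-case relative efficiency."""
    rhos = [rho_lo + (rho_hi - rho_lo) * i / grid for i in range(grid + 1)]
    best_m, best_re = None, -1.0
    for m in range(2, m_max + 1):
        worst = min(relative_efficiency(m, r, c, s) for r in rhos)
        if worst > best_re:
            best_m, best_re = m, worst
    return best_m, best_re
```

For instance, with c = 90 and s = 10 the locally optimal size is 3 at rho = 0.5 but about 30 at rho = 0.01, which shows how strongly the design depends on the unknown ICC. `maximin_m(0.01, 0.2, 90, 10)` returns a compromise size together with its worst-case relative efficiency over that range.
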
A multi-state model for the analysis of changes in cognitive scores over a fixed time interval.
Mitnitski, A. B., Fallah, N., Dean, C. B., Rockwood, K.
Statistical Methods in Medical Research: An International Review Journal. September 20, 2011

In this article, we present the novel approach of using a multi-state model to describe longitudinal changes in cognitive test scores. Scores are modelled according to a truncated Poisson distribution, conditional on survival to a fixed endpoint, with the Poisson mean dependent upon the baseline score and covariates. The model provides a unified treatment of the distribution of cognitive scores, taking into account baseline scores and survival. It offers a simple framework for the simultaneous estimation of the effect of covariates modulating these distributions, over different baseline scores. A distinguishing feature is that this approach permits estimation of the probabilities of transitions in different directions: improvements, declines and death. The basic model is characterised by four parameters, two of which represent cognitive transitions in survivors, both for individuals with no cognitive errors at baseline and for those with non-zero errors, within the range of test scores. The two other parameters represent corresponding likelihoods of death. The model is applied to an analysis of data from the Canadian Study of Health and Aging (1991–2001) to identify the risk of death, and of changes in cognitive function as assessed by errors in the Modified Mini-Mental State Examination. The model performance is compared with more conventional approaches, such as multivariate linear and polytomous regressions. This model can also be readily applied to a wide variety of other cognitive test scores and phenomena which change with age.

September 20, 2011 doi: 10.1177/0962280211406470 open full text
Probabilistic sensitivity analysis in health economics.
Baio, G., Dawid, A. P.
Statistical Methods in Medical Research: An International Review Journal. September 18, 2011

Health economic evaluations have recently become an important part of the clinical and medical research process and have built upon more advanced statistical decision-theoretic foundations. In some contexts, it is officially required that uncertainty about both parameters and observable variables be properly taken into account, increasingly often by means of Bayesian methods. Among these, probabilistic sensitivity analysis has assumed a predominant role. The objective of this article is to review the problem of health economic assessment from the standpoint of Bayesian statistical decision theory with particular attention to the philosophy underlying the procedures for sensitivity analysis.

September 18, 2011 doi: 10.1177/0962280211419832 open full text
A two-stage estimation for screening studies using two diagnostic tests with binary disease status verified in test positives only.
Li, F., Chu, H., Nie, L.
Statistical Methods in Medical Research: An International Review Journal. September 13, 2011

This article considers the statistical estimation and inference for screening studies in which two binary tests are used for screening with a binary disease status verified only for those subjects with at least one positive test result. The challenge encountered in these studies is the non-identifiability because the disease rate is not identifiable for subjects with negative results from both tests without additional assumptions. Different homogeneous association models have been proposed in the literature to circumvent the non-identifiability problem, which were solved using numerical methods. We propose to formulate the problem as a constrained maximum likelihood estimation (MLE) problem. The MLE has a closed-form in general, which can be solved using a unified two-stage estimation approach. We demonstrate the application of the proposed method on a set of homogeneous association models. The homogeneous association assumptions are generally not testable as all models are saturated. Therefore, we propose an association-ratio plot as a visualization tool for model comparisons. The methods are illustrated through three examples.

September 13, 2011 doi: 10.1177/0962280211421838 open full text
Accounting for perception, placebo and unmasking effects in estimating treatment effects in randomised clinical trials.
Jamshidian, F., Hubbard, A. E., Jewell, N. P.
Statistical Methods in Medical Research: An International Review Journal. September 08, 2011

There is a rich literature on the role of placebos in experimental design and evaluation of therapeutic agents or interventions. The importance of masking participants, investigators and evaluators to treatment assignment (treatment or placebo) has long been stressed as a key feature of a successful trial design. Nevertheless, there is considerable variability in the technical definition of the placebo effect and the impact of treatment assignments being unmasked. We suggest a formal concept of a ‘perception effect’ and define unmasking and placebo effects in the context of randomised trials. We employ modern tools from causal inference to derive semi-parametric estimators of such effects. The methods are illustrated on a motivating example from a recent pain trial where the occurrence of treatment-related side effects acts as a proxy for unmasking.

September 08, 2011 doi: 10.1177/0962280211413449 open full text
Estimating overall exposure effects for zero-inflated regression models with application to dental caries.
Albert, J. M., Wang, W., Nelson, S.
Statistical Methods in Medical Research: An International Review Journal. September 08, 2011

Zero-inflated (ZI) models, which may be derived as a mixture involving a degenerate distribution at value zero and a distribution such as negative binomial (ZINB), have proved useful in dental and other areas of research by accommodating ‘extra’ zeroes in the data. Used in conjunction with generalised linear models, they allow covariate-adjusted inference of an exposure effect on the mixing probability and on the mean for the non-degenerate distribution. However, these models do not directly provide covariate-adjusted inference for the overall exposure effect. Focusing on the ZINB and ZI beta binomial models, we propose an approach that uses model-predicted values for each person under each exposure state. This ‘average predicted value’ method allows covariate-adjusted estimation of flexible functions of exposure group means such as the difference or ratio. A second approach considers a log link for both components of the ZINB to allow a direct approach to estimation. We apply these new methods to a study of dental caries in very low birth weight adolescents. Simulation studies show good bias and robustness properties for both approaches under various scenarios. Robustness diminishes when there is exposure group imbalance for a covariate with a large effect.

September 08, 2011 doi: 10.1177/0962280211407800 open full text
On the power of the Cochran-Armitage test for trend in the presence of misclassification.
Buonaccorsi, J. P., Laake, P., Veierod, M. B.
Statistical Methods in Medical Research: An International Review Journal. August 30, 2011

The Cochran–Armitage (CA) test is commonly used in both epidemiology and genetics to test for linear trend in two-way tables with a binary outcome. There has been increasing interest in the power and size of the test and in determination of sample size, especially when there is potential misclassification in the ‘exposure’ category. This article provides a unified approach to determination of the power function over different sampling strategies (fixed overall sample size or fixed marginal sample sizes) and allowing for misclassification in one or both variables. The misclassification may be either differential or non-differential. In addition to the standard CA test, results are also given which provide some insight into the performance of the modified CA test, which utilizes a standard error obtained without invoking the null hypothesis. Even without misclassification, some new expressions are also obtained for determining power with a fixed overall sample size. Numerical illustrations are presented with an emphasis on the more commonly occurring problem of misclassification in the exposure category.
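For orientation, the standard CA statistic (with the variance computed under the null hypothesis) can be sketched as follows. This is a minimal illustration without any misclassification, not the power calculations developed in the article; the function name and the example counts are assumptions made for the sketch.

```python
import math

def cochran_armitage_z(cases, totals, scores):
    """Standard Cochran-Armitage trend statistic and two-sided p-value.

    cases[i]  : number of subjects with the outcome in exposure category i
    totals[i] : number of subjects in exposure category i
    scores[i] : numeric score assigned to category i (e.g. 0, 1, 2)
    """
    n = sum(totals)
    p_bar = sum(cases) / n  # pooled outcome proportion under the null
    # Score-weighted sum of observed minus expected cases.
    t = sum(s * (r - m * p_bar) for s, r, m in zip(scores, cases, totals))
    # Null variance of t.
    sum_s = sum(m * s for s, m in zip(scores, totals))
    sum_s2 = sum(m * s * s for s, m in zip(scores, totals))
    var_t = p_bar * (1.0 - p_bar) * (sum_s2 - sum_s ** 2 / n)
    z = t / math.sqrt(var_t)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return z, p_value

# Hypothetical 2 x 3 table: outcome proportion rises with exposure category.
z, p_value = cochran_armitage_z(cases=[10, 20, 30], totals=[100, 100, 100], scores=[0, 1, 2])
```

The modified CA test mentioned in the abstract replaces the null variance with a standard error obtained without invoking the null hypothesis; that variant is not shown here.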

August 30, 2011 doi: 10.1177/0962280211406424 open full text
Bayesian hierarchical Poisson models with a hidden Markov structure for the detection of influenza epidemic outbreaks.
Conesa, D., Martinez-Beneito, M., Amoros, R., Lopez-Quilez, A.
Statistical Methods in Medical Research: An International Review Journal. August 25, 2011

Considerable effort has been devoted to the development of statistical algorithms for the automated monitoring of influenza surveillance data. In this article, we introduce a framework of models for the early detection of the onset of an influenza epidemic which is applicable to different kinds of surveillance data. In particular, the process of the observed cases is modelled via a Bayesian hierarchical Poisson model in which the intensity parameter is a function of the incidence rate. The key point is to model this incidence rate with a normal distribution in which both parameters (mean and variance) are modelled differently, depending on whether the system is in an epidemic or non-epidemic phase. To do so, we propose a hidden Markov model in which the transition between the two phases is modelled as a function of the epidemic state of the previous week. Different options for modelling the rates are described, including the option of modelling the mean in each phase as an autoregressive process of order 0, 1 or 2. Bayesian inference is carried out to provide the probability of being in an epidemic state at any given moment. The methodology is applied to various influenza data sets. The results indicate that our methods outperform previous approaches in terms of sensitivity, specificity and timeliness.

August 25, 2011 doi: 10.1177/0962280211414853 open full text
A likelihood-based two-part marginal model for longitudinal semi-continuous data.
Su, L., Tom, B. D., Farewell, V. T.
Statistical Methods in Medical Research: An International Review Journal. August 25, 2011

Two-part models are an attractive approach for analysing longitudinal semicontinuous data consisting of a mixture of true zeros and continuously distributed positive values. When the population-averaged (marginal) covariate effects are of interest, two-part models that provide straightforward interpretation of the marginal effects are desirable. Presently, the only available approaches for fitting two-part marginal models to longitudinal semicontinuous data are computationally difficult to implement. Therefore, there exists a need to develop two-part marginal models that can be easily implemented in practice. We propose a fully likelihood-based two-part marginal model that satisfies this need by using the bridge distribution for the random effect in the binary part of an underlying two-part mixed model; and its maximum likelihood estimation can be routinely implemented via standard statistical software such as the SAS NLMIXED procedure. We illustrate the usage of this new model by investigating the marginal effects of pre-specified genetic markers on physical functioning, as measured by the Health Assessment Questionnaire, in a cohort of psoriatic arthritis patients from the University of Toronto Psoriatic Arthritis Clinic. An added benefit of our proposed marginal model when compared to a two-part mixed model is the robustness in regression parameter estimation when departure from the true random effects structure occurs. This is demonstrated through simulation.

August 25, 2011 doi: 10.1177/0962280211414620 open full text
Measuring continuous baseline covariate imbalances in clinical trial data.
Ciolino, J. D., Martin, R. H., Zhao, W., Hill, M. D., Jauch, E. C., Palesch, Y. Y.
Statistical Methods in Medical Research: An International Review Journal. August 24, 2011

This paper presents and compares several methods of measuring continuous baseline covariate imbalance in clinical trial data. Simulations illustrate that although the t-test is an inappropriate method of assessing continuous baseline covariate imbalance, the test statistic itself is a robust measure for capturing imbalance in continuous covariate distributions. Guidelines for assessing the effects of imbalance on bias, type I error rate and power of hypothesis tests for the treatment effect on continuous outcomes are presented, and the benefit of covariate-adjusted analysis (ANCOVA) is also illustrated.

August 24, 2011 doi: 10.1177/0962280211416038 open full text
Testing for seasonality using circular distributions based on non-negative trigonometric sums as alternative hypotheses.
Fernandez-Duran, J. J., Gregorio-Dominguez, M. M.
Statistical Methods in Medical Research: An International Review Journal. August 17, 2011

In medical and epidemiological studies, the importance of detecting seasonal patterns in the occurrence of diseases makes testing for seasonality highly relevant. There are different parametric and non-parametric tests for seasonality. One of the most widely used parametric tests in the medical literature is the Edwards test. The Edwards test considers a parametric alternative that is a sinusoidal curve with one peak and one trough. The Cave and Freedman test is an extension of the Edwards test that is also frequently applied and considers a sinusoidal curve with two peaks and two troughs as the alternative hypothesis. Common non-parametric tests include those of Kuiper, Hewitt, and David and Newell. Fernández-Durán (2004) developed a family of univariate circular distributions based on non-negative trigonometric (Fourier) sums (series) (NNTS) that can account for an arbitrary number of peaks and troughs. In this article, this family of distributions is used to construct a likelihood ratio test for seasonality considering parametric alternative hypotheses that are NNTS distributions.

August 17, 2011 doi: 10.1177/0962280211411531 open full text
Crossover studies with survival outcomes.
Buyze, J., Goetghebeur, E.
Statistical Methods in Medical Research: An International Review Journal. June 29, 2011

Crossover designs are well known to have major advantages when comparing the effect of two treatments which do not interact. With a right-censored survival endpoint, however, this design is quickly abandoned in favour of the more costly parallel design. Motivated by human immunodeficiency virus (HIV) prevention studies which lacked power, we evaluate what may be gained in this setting and compare parallel with crossover designs. In a heterogeneous population, we find and explain a substantial increase in power for the crossover study using a non-parametric logrank test. With frailties in a proportional hazards model, crossover designs equally lead to substantially smaller variance for the subject-specific hazard ratio (HR), while the population-averaged HR sees negligible gain. Its efficiency benefit is recovered when the population-averaged HR is reconstructed from estimated subject-specific hazard rates. We derive the time point for treatment crossover that optimizes efficiency and end with the analysis of two recent HIV prevention trials. We find that a cellulose sulphate trial could hardly have gained efficiency from a crossover design, while a Nonoxynol-9 trial stood to gain substantial power. We conclude that there is a role for effective crossover designs in important classes of survival problems.

June 29, 2011 doi: 10.1177/0962280211402258 open full text
Robust non-parametric tests for complex-repeated measures problems in ophthalmology.
Brombin, C., Midena, E., Salmaso, L.
Statistical Methods in Medical Research: An International Review Journal. June 24, 2011

The NonParametric Combination methodology (NPC) of dependent permutation tests allows the experimenter to face many complex multivariate testing problems and represents a convincing and powerful alternative to standard parametric methods. The main advantage of this approach lies in its flexibility in handling any type of variable (categorical and quantitative, with or without missing values) while at the same time taking dependencies among those variables into account without the need to model them. NPC methodology can deal with repeated measures, paired data, restricted alternative hypotheses, missing data (completely at random or not), and high-dimensional and small sample size data. Hence, NPC methodology can offer a significant contribution to successful research in biomedical studies with several endpoints, since it provides reasonably efficient solutions and clear interpretations of inferential results (Pesarin F. Multivariate permutation tests: with application in biostatistics. Chichester-New York: John Wiley & Sons, 2001; Pesarin F, Salmaso L. Permutation tests for complex data: theory, applications and software. Chichester, UK: John Wiley & Sons, 2010). We focus on non-parametric permutation solutions to two real-case studies in ophthalmology, concerning complex-repeated measures problems. For each data set, different analyses are presented, thus highlighting characteristic aspects of the data structure itself. Our goal is to present different solutions to multivariate complex case studies, guiding researchers/readers to choose, from various possible interpretations of a problem, the one that has the highest flexibility and statistical power under a set of less stringent assumptions. MATLAB code has been implemented to carry out the analyses.

June 24, 2011 doi: 10.1177/0962280211403659 open full text
Confidence interval estimation for the Bland-Altman limits of agreement with multiple observations per individual.
Zou, G. Y.
Statistical Methods in Medical Research: An International Review Journal. June 24, 2011

The limits of agreement (LoA) method proposed by Bland and Altman has become a standard for assessing agreement between different methods measuring the same quantity. Virtually all method comparison studies have reported only point estimates of LoA, due largely to the lack of simple confidence interval procedures. In this article, we address confidence interval estimation for LoA when multiple measurements per individual are available. Separate procedures are proposed for situations when the underlying true value of the measured quantity is assumed to be changing and when it is perceived as stable. A fixed number of replicates per individual is not needed for the procedures to work. As shown by the worked examples, the construction of these confidence intervals requires only quantiles from the standard normal and chi-square distributions. Simulation results show the proposed procedures perform well. A SAS macro implementing the methods is available on the publisher’s website.
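As background, the classical LoA point estimates can be sketched for the simplest case of one paired measurement per individual. This is only the basic Bland and Altman calculation (mean difference plus or minus a normal quantile times the standard deviation of the differences); it does not implement the confidence interval procedures or the repeated-measurement setting of the article. The function name and the data are hypothetical.

```python
import statistics

def limits_of_agreement(method_a, method_b, coverage=0.95):
    """Classical Bland-Altman limits of agreement for paired single measurements."""
    diffs = [a - b for a, b in zip(method_a, method_b)]
    mean_diff = statistics.mean(diffs)   # estimated bias between the methods
    sd_diff = statistics.stdev(diffs)    # sample SD of the differences (n - 1)
    # Standard normal quantile, about 1.96 for 95% coverage.
    z = statistics.NormalDist().inv_cdf(0.5 + coverage / 2.0)
    return mean_diff - z * sd_diff, mean_diff + z * sd_diff

# Hypothetical paired readings from two measurement methods.
lower, upper = limits_of_agreement([10.0, 12.0, 14.0, 16.0], [9.0, 12.0, 13.0, 17.0])
```

With several observations per individual, the standard deviation of the differences has to account for within-subject and between-subject variation, which is the situation the article addresses.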

June 24, 2011 doi: 10.1177/0962280211402548 open full text
Bayesian sample size calculation for estimation of the difference between two binomial proportions.
Pezeshk, H., Nematollahi, N., Maroufy, V., Marriott, P., Gittins, J.
Statistical Methods in Medical Research: An International Review Journal. March 24, 2011

In this study, we discuss a decision theoretic or fully Bayesian approach to the sample size question in clinical trials with binary responses. Data are assumed to come from two binomial distributions. A Dirichlet distribution is assumed to describe prior knowledge of the two success probabilities p₁ and p₂. The parameter of interest is p = p₁ - p₂. The optimal size of the trial is obtained by maximising the expected net benefit function. The methodology presented in this article extends previous work by assuming dependent prior distributions for p₁ and p₂.

March 24, 2011 doi: 10.1177/0962280211399562 open full text
Comparing measurement error correction methods for rate-of-change exposure variables in survival analysis.
Veronesi, G., Ferrario, M. M., Chambless, L. E.
Statistical Methods in Medical Research: An International Review Journal. February 07, 2011

In this article we focus on comparing measurement error correction methods for rate-of-change exposure variables in survival analysis, when longitudinal data are observed prior to the follow-up time. Motivational examples include the analysis of the association between changes in cardiovascular risk factors and subsequent onset of coronary events. We derive a measurement error model for the rate of change, estimated through subject-specific linear regression, assuming an additive measurement error model for the time-specific measurements. The rate of change is then included as a time-invariant variable in a Cox proportional hazards model, adjusting for the first time-specific measurement (baseline) and an error-free covariate. In a simulation study, we compared bias, standard deviation and mean squared error (MSE) for the regression calibration (RC) and the simulation-extrapolation (SIMEX) estimators. Our findings indicate that when the amount of measurement error is substantial, RC should be the preferred method, since it has smaller MSE for estimating the coefficients of the rate of change and of the variable measured without error. However, when the amount of measurement error is small, the choice of the method should take into account the event rate in the population and the effect size to be estimated. An application to an observational study, as well as examples of published studies where our model could have been applied, are also provided.

February 07, 2011 doi: 10.1177/0962280210395742 open full text
Modelling batched Gaussian longitudinal weight data in mice subject to informative dropout.
Albert, P. S., Shih, J. H.
Statistical Methods in Medical Research: An International Review Journal. February 07, 2011

Modelling longitudinal data subject to informative dropout is an active area in statistical research. This article focuses on modelling such longitudinal data when the outcome at each follow-up time is collected in batches rather than individually. The problem arose in a study that compared the weight of mice over time between a control and a treatment group, where animal weight was measured in batches of five animals per cage. We develop both a shared parameter and a pattern mixture modelling approach to account for potentially informative dropout due to an animal’s death. Our methodology suggests that animals receiving the treatment have a lower weight in mid-life, and have a slower decline in weight in the later period of life. Our simulations suggest that both the shared random parameter and pattern mixture modelling approaches work well under a correctly specified model. However, the pattern mixture model is more robust against model misspecification than the shared random parameter model, whereas the parameters of the shared random parameter model have a more direct interpretation than those of the pattern mixture modelling approach.

February 07, 2011 doi: 10.1177/0962280210397886 open full text