Prediction Model of Psychosis Conversion Data

We fit a logistic regression model with conversion to psychosis as the outcome and main effects only for each covariate.
Subscales were included as linear terms, i.e., assuming a linear relationship with the logit.
A lasso-penalized fitting procedure was used for variable selection and coefficient estimation. The penalization parameter was chosen by 5-fold cross-validation, with folds constructed to reflect the ~30% case rate in the sample. This was carried out using functions from the R package glmnet.
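
The fold construction described above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the code used for this report: the data, the helper `make_stratified_folds`, and the choice of `lambda.min` are all assumptions. The `cv.glmnet` call runs only if glmnet is installed.

```r
# Simulated stand-in data (hypothetical): 199 subjects, 5 predictors.
set.seed(1)
n <- 199
x <- matrix(rnorm(n * 5), nrow = n, dimnames = list(NULL, paste0("x", 1:5)))
y <- rbinom(n, 1, plogis(-1 + 0.8 * x[, 1] - 0.6 * x[, 2]))

# Hypothetical helper: assign fold labels within each outcome class so that
# every fold keeps roughly the overall case rate.
make_stratified_folds <- function(y, k = 5) {
  foldid <- integer(length(y))
  for (cls in unique(y)) {
    idx <- which(y == cls)
    # Shuffle the members of this class, then deal them out to folds 1..k.
    idx <- idx[sample.int(length(idx))]
    foldid[idx] <- rep_len(seq_len(k), length(idx))
  }
  foldid
}

foldid <- make_stratified_folds(y, k = 5)

# Case rate per fold; each should be close to mean(y).
print(round(tapply(y, foldid, mean), 2))

# Lasso logistic regression (alpha = 1) with lambda chosen by cross-validation
# on the stratified folds. Using lambda.min is an assumption here.
if (requireNamespace("glmnet", quietly = TRUE)) {
  cv_fit <- glmnet::cv.glmnet(x, y, family = "binomial", alpha = 1,
                              foldid = foldid)
  print(coef(cv_fit, s = "lambda.min"))
}
```

Passing `foldid` explicitly, rather than letting `cv.glmnet` draw random folds, is what keeps the case rate similar across folds.
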
Validation of the model is based on bootstrapping the performance measures (AUC, Brier score), using the procedure outlined in Harrell 1996, Statistics in Medicine, Vol. 15, 361-387 (Tutorial in Biostatistics: Multivariate Prog. Models: Issues…).
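
A rough sketch of this kind of bootstrap optimism correction is given below. It is an illustration under stated assumptions, not the report's code: the data are simulated, and a plain `glm` fit stands in for the lasso fit to keep the example free of package dependencies. In the actual procedure the full fitting step, including selection of the penalization parameter, would presumably be repeated within each bootstrap sample. The helpers `auc`, `brier`, `fit_model`, and `predict_prob` are hypothetical names.

```r
# AUC via the rank-sum (Mann-Whitney) identity; ties receive average ranks.
auc <- function(y, p) {
  r <- rank(p)
  n1 <- sum(y == 1)
  n0 <- sum(y == 0)
  (sum(r[y == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Brier score: mean squared difference between predicted risk and outcome.
brier <- function(y, p) mean((p - y)^2)

# Simulated stand-in data (hypothetical).
set.seed(2)
n <- 199
dat <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
dat$y <- rbinom(n, 1, plogis(-1 + 0.8 * dat$x1 - 0.6 * dat$x2))

# Stand-in fitting procedure (unpenalized logistic regression).
fit_model <- function(d) glm(y ~ ., data = d, family = binomial)
predict_prob <- function(fit, d) predict(fit, newdata = d, type = "response")

# Apparent (in-sample) performance of the model fit to the full sample.
fit0 <- fit_model(dat)
p0 <- predict_prob(fit0, dat)
auc_app <- auc(dat$y, p0)
brier_app <- brier(dat$y, p0)

# Optimism in each bootstrap replicate: performance of the bootstrap model on
# its own bootstrap sample minus its performance on the original sample.
B <- 200
auc_opts <- brier_opts <- numeric(B)
for (b in seq_len(B)) {
  boot <- dat[sample.int(n, replace = TRUE), ]
  fit_b <- fit_model(boot)
  p_boot <- predict_prob(fit_b, boot)
  p_orig <- predict_prob(fit_b, dat)
  auc_opts[b] <- auc(boot$y, p_boot) - auc(dat$y, p_orig)
  brier_opts[b] <- brier(boot$y, p_boot) - brier(dat$y, p_orig)
}

# Optimism-adjusted estimates: apparent performance minus mean optimism.
auc_adj <- auc_app - mean(auc_opts)
brier_adj <- brier_app - mean(brier_opts)
print(c(auc_app = auc_app, auc_adj = auc_adj,
        brier_app = brier_app, brier_adj = brier_adj))
```

Because a lower Brier score is better, its optimism is typically negative, so subtracting it usually makes the adjusted Brier score larger than the in-sample value.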

Model Coefficients

The tables show:
- Beta = estimated coefficient for the corresponding predictor (note that predictors are NOT standardized)
- Std_Beta = estimated coefficient for the corresponding predictor when the predictor is standardized (the table is ordered by decreasing magnitude of Std_Beta)
- Prop_Sel_BS = proportion of bootstrap samples in which the predictor was selected to remain in the model

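The relationship between an unstandardized and a standardized coefficient can be illustrated with a small sketch. It uses simulated data and an unpenalized `glm` fit, both of which are assumptions for illustration only; for such a fit, the standardized coefficient equals the raw coefficient multiplied by the predictor's standard deviation. How Std_Beta was obtained under the lasso in this report is not shown here.

```r
# Simulated stand-in data (hypothetical) with predictors on different scales.
set.seed(3)
n <- 199
dat <- data.frame(x1 = rnorm(n, sd = 1.5), x2 = rnorm(n, sd = 4))
dat$y <- rbinom(n, 1, plogis(-1 + 0.4 * dat$x1 - 0.1 * dat$x2))

# Fit on the original (unstandardized) predictors.
fit_raw <- glm(y ~ x1 + x2, data = dat, family = binomial)

# Fit again after centering and scaling each predictor to SD 1.
dat_std <- dat
dat_std[c("x1", "x2")] <- scale(dat[c("x1", "x2")])
fit_std <- glm(y ~ x1 + x2, data = dat_std, family = binomial)

beta <- coef(fit_raw)[c("x1", "x2")]
std_beta <- coef(fit_std)[c("x1", "x2")]
sds <- sapply(dat[c("x1", "x2")], sd)

# The last two columns agree: Std_Beta = Beta * SD(predictor).
print(cbind(Beta = beta, Std_Beta = std_beta, Beta_times_SD = beta * sds))
```

Standardized coefficients express the change in log-odds per one standard deviation of the predictor, which is why they are the more natural basis for ordering predictors measured on different scales.
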
Selected Variables

Predictor	Beta	Std_Beta	Prop_Sel_BS
P4v	-0.30	-0.43	0.99
G2	-0.18	-0.29	0.95
P1	0.28	0.28	0.90
P5	0.20	0.26	0.90
Idea_Sev_Base	0.57	0.26	0.95
race_bin..c.is.0..non.c.is.1	0.49	0.24	0.92
N1	0.14	0.23	0.90
Behav_Sev_Base	0.82	0.20	0.90
GAF	-0.03	-0.18	0.86
P1PD	0.08	0.09	0.73
G3	0.05	0.08	0.73
SI_Base	0.17	0.04	0.67
Trauma_Sexual	0.11	0.03	0.74
N5	0.02	0.03	0.66
D3	0.02	0.03	0.69
GFS..Social	0.02	0.03	0.66
SB_Base	0.09	0.02	0.57

Un-Selected Variables

Predictor	Beta	Std_Beta	Prop_Sel_BS
X1..no.is.0..yes.is.1	0	0	1
Female	0	0	1
P2	0	0	1
famhx1..0.no..1.yes	0	0	1
P1NP	0	0	1
Age	0	0	1
N3	0	0	1
P1OB	0	0	1
P1FR	0	0	1
G1	0	0	1
Trauma_NonSexual	0	0	1
GFS..Role	0	0	1
D4	0	0	1
D1	0	0	1
P1SNG	0	0	1
G4	0	0	1
P3	0	0	1
N2	0	0	1
schizotypal..scz.is.1..non.is.0	0	0	1
N6	0	0	1
N4	0	0	0
D2	0	0	0
P4a	0	0	0
P4	0	0	0

In Sample and Optimism Adjusted Performance

# In Sample ROC
roc_l0

## 
## Call:
## roc.default(response = Y, predictor = pred_l0)
## 
## Data: pred_l0 in 135 controls (Y 0) < 64 cases (Y 1).
## Area under the curve: 0.8431

# Optimism Adj ROC
roc_l0$auc - mean(auc_opts)

## [1] 0.7286867

#
#
# In Sample Brier
brier_l0

## [1] 0.152943

# Optimism Adj Brier
brier_l0 - mean(brier_opts)

## [1] 0.2123282

The left column shows the cut-off ranges used to determine conversion status (e.g., a participant is classified as a converter if their predicted probability falls within the range x – 1.00). The Base Rate column gives the percentage of the sample classified as converters at that cut-off. The corresponding ROC curves are provided. The first set of PPV, NPV, Sensitivity, and Specificity values is based on the sample used to fit the model and is overly optimistic. The second set uses the same bootstrap procedure as above to compute optimism-corrected values. At the highest cut-offs almost no participants are classified as converters, so the corrected estimates there are unstable and some fall outside the 0 to 100 range.
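
The in-sample quantities in such a table can be computed along the following lines. This is a minimal sketch on a toy example, not the report's code: the function `classify_metrics`, the column name `pct_flagged`, and the inclusive rule `p >= cut` are assumptions.

```r
# Hypothetical helper: for each cut-off, classify subjects with predicted
# probability >= cut as converters and summarize the 2x2 table (in percent).
classify_metrics <- function(y, p, cutoffs) {
  rows <- lapply(cutoffs, function(cut) {
    pred <- as.integer(p >= cut)
    tp <- sum(pred == 1 & y == 1)
    fp <- sum(pred == 1 & y == 0)
    tn <- sum(pred == 0 & y == 0)
    fn <- sum(pred == 0 & y == 1)
    data.frame(
      cutoff = cut,
      pct_flagged = 100 * mean(pred),  # share classified as converters
      PPV = 100 * tp / (tp + fp),      # NaN when nobody is flagged
      NPV = 100 * tn / (tn + fn),
      Sens = 100 * tp / (tp + fn),
      Spec = 100 * tn / (tn + fp)
    )
  })
  do.call(rbind, rows)
}

# Toy data: 3 converters and 7 nonconverters with made-up predicted risks.
y <- c(1, 1, 1, 0, 0, 0, 0, 0, 0, 0)
p <- c(0.9, 0.6, 0.2, 0.7, 0.3, 0.2, 0.1, 0.1, 0.05, 0.05)
res <- classify_metrics(y, p, cutoffs = c(0.25, 0.5, 0.95))
print(res)
```

When no subject exceeds the cut-off, PPV is 0/0 and R returns `NaN`, which matches the `NaN` entries at the highest cut-offs in the in-sample columns.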

Cut-off	Base Rate	PPV	NPV	Sens	Spec	PPV (adj)	NPV (adj)	Sens (adj)	Spec (adj)
0.05-1.00	98.49	32.49	100.00	100.00	2.22	30.74	22.53	93.48	0.89
0.1-1.00	92.96	34.43	100.00	100.00	10.37	31.60	63.29	91.10	7.22
0.15-1.00	80.90	37.71	92.16	95.31	25.93	33.53	74.46	84.70	21.00
0.2-1.00	68.34	44.67	95.27	95.31	44.44	38.78	82.89	83.27	38.14
0.25-1.00	58.29	49.82	92.82	90.62	57.04	42.06	82.40	77.38	49.83
0.3-1.00	47.24	55.14	88.65	81.25	68.89	44.91	79.83	67.10	61.26
0.35-1.00	38.69	62.16	86.97	75.00	78.52	49.14	79.04	60.17	70.69
0.4-1.00	32.66	65.99	84.43	67.19	83.70	50.44	77.08	51.99	75.96
0.45-1.00	26.63	71.55	82.30	59.38	88.89	52.81	75.61	44.18	81.42
0.5-1.00	18.09	72.07	76.82	40.62	92.59	45.53	70.99	25.74	85.51
0.55-1.00	12.06	83.23	75.00	31.25	97.04	45.51	69.83	16.98	90.43
0.6-1.00	9.55	94.70	74.58	28.12	99.26	50.65	69.94	14.90	93.17
0.65-1.00	5.03	100.00	71.58	15.62	100.00	24.30	67.60	3.74	94.51
0.7-1.00	3.02	100.00	70.10	9.38	100.00	-9.55	66.70	-0.91	95.11
0.75-1.00	2.01	100.00	69.39	6.25	100.00	-31.73	66.57	-2.18	95.74
0.8-1.00	1.51	100.00	69.04	4.69	100.00	-29.13	66.79	-1.76	96.33
0.85-1.00	0.00	NaN	68.00	0.00	100.00	-196.63	66.37	-4.34	96.92
0.9-1.00	0.00	NaN	68.00	0.00	100.00	-89.79	66.93	-2.44	97.57

Frequency Distribution of Model-Based Predicted Risks Among Converters and Nonconverters (in sample)

ROC Curves