Setup

A user asked a reasonable question about score contributions summing to zero for a logistic-Normal model.

Here we analyze 500 pairs (clusters of 2 units each) generated from:

\[ \Pr(Y_{ij} = 1 | X, b) = \mbox{logit}^{-1}(\beta_0 + \beta_1 x_{ij} + b_i) = h(\beta, x_{ij}, b_i) \]

where \(x_{ij} \sim N(0, 1)\) and \(b_i \sim f(b_i) = N(0, \sigma^2)\). Let \(\theta = \{\beta_0, \beta_1, \sigma^2 \} = \{2, 2, 1\}\).

The log-likelihood for this model is:

\[ l(\theta; Y, X) = \sum_{i = 1}^m \log \left\{ \int \prod_{j = 1}^{n_i} h(\beta, x_{ij}, b_i)^{y_{ij}} [1 - h(\beta, x_{ij}, b_i)]^{1 - y_{ij}} f(b_i) \, db_i \right\} \]

Take the derivative of \(l(\theta; Y, X)\) with respect to \(\theta\), set it equal to zero, and solve for \(\theta\) to find \(\hat{\theta}\).
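Written out, with \(l_i\) denoting the \(i\)th summand of the log-likelihood above, \(\hat{\theta}\) solves the score equations, and each cluster contributes one term to the sum:

\[ S(\theta; Y, X) = \frac{\partial l(\theta; Y, X)}{\partial \theta} = \sum_{i = 1}^m \frac{\partial l_i(\theta; Y_i, X_i)}{\partial \theta} = 0. \]

These per-cluster derivatives are the score contributions whose sum is examined below.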

\(\theta\) can be estimated using the lme4::glmer function.
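For reference, here is a minimal sketch of the simulation and the fit. The data frame name d and the columns y, x, and id come from the Formula and Data fields in the summary that follows; the seed and helper object names are assumptions made for illustration.

```r
# Sketch of the data-generating step and the glmer fit (seed is arbitrary)
library(lme4)

m  <- 500   # clusters
ni <- 2     # units per cluster
theta_true <- c(beta0 = 2, beta1 = 2, sigma2 = 1)

set.seed(42)
id <- rep(seq_len(m), each = ni)
x  <- rnorm(m * ni)
b  <- rnorm(m, sd = sqrt(theta_true["sigma2"]))[id]   # cluster-level random intercepts
y  <- rbinom(m * ni, size = 1,
             prob = plogis(theta_true["beta0"] + theta_true["beta1"] * x + b))
d  <- data.frame(y = y, x = x, id = id)

fit <- glmer(y ~ x + (1 | id), data = d, family = binomial)
summary(fit)
```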

## Generalized linear mixed model fit by maximum likelihood (Laplace
##   Approximation) [glmerMod]
##  Family: binomial  ( logit )
## Formula: y ~ x + (1 | id)
##    Data: d
## 
##      AIC      BIC   logLik deviance df.resid 
##    800.1    814.8   -397.0    794.1      997 
## 
## Scaled residuals: 
##     Min      1Q  Median      3Q     Max 
## -9.1700  0.0493  0.2134  0.4260  2.2126 
## 
## Random effects:
##  Groups Name        Variance Std.Dev.
##  id     (Intercept) 0.5483   0.7404  
## Number of obs: 1000, groups:  id, 500
## 
## Fixed effects:
##             Estimate Std. Error z value Pr(>|z|)    
## (Intercept)   1.8842     0.1707   11.04   <2e-16 ***
## x             1.7950     0.1704   10.53   <2e-16 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Correlation of Fixed Effects:
##   (Intr)
## x 0.760

Computing score equations

Note that geex does most of this work internally.
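geex's grab_eeFUN builds the analogous per-cluster functions from a fitted model; the hand-rolled sketch below just makes the steps explicit. The numerical integration over \(b_i\), the use of numDeriv for the gradient, and the helper names logLik_i and score_i are all choices made here for illustration, and fit and d are the objects from the sketch above.

```r
library(numDeriv)

# Integrated log-likelihood contribution for one cluster at theta = (b0, b1, sigma2)
logLik_i <- function(theta, yi, xi) {
  integrand <- function(b) {
    sapply(b, function(bi) {
      h <- plogis(theta[1] + theta[2] * xi + bi)
      prod(h^yi * (1 - h)^(1 - yi)) * dnorm(bi, sd = sqrt(theta[3]))
    })
  }
  log(integrate(integrand, lower = -Inf, upper = Inf)$value)
}

# Score contribution = gradient of the cluster log-likelihood
score_i <- function(theta, yi, xi) grad(logLik_i, theta, yi = yi, xi = xi)

# Evaluate every cluster's score contribution at the glmer estimates and sum
theta_hat <- c(fixef(fit), as.data.frame(VarCorr(fit))$vcov)
scores <- t(sapply(split(d, d$id), function(di) score_i(theta_hat, di$y, di$x)))
colSums(scores)
```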

Do these evaluated score equations sum to zero?

## [1] -0.3425674  0.0140744  2.6353120

No. This is not too surprising: glmer maximizes a Laplace approximation to the integrated likelihood, so its estimates need not exactly solve the score equations evaluated here.

As another check, how does the observed log-likelihood compare to the one reported by lme4::glmer?
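Continuing the sketch, this comparison is just the sum of the per-cluster contributions at \(\hat{\theta}_{glmer}\) against logLik of the fitted object:

```r
# Numerically integrated log-likelihood at the glmer estimates vs. lme4's value
sum(sapply(split(d, d$id), function(di) logLik_i(theta_hat, di$y, di$x)))
logLik(fit)
```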

## [1] -396.7983
## 'log Lik.' -397.0352 (df=3)

Not bad.

What happens if, instead of using \(\hat{\theta}_{glmer}\) to compute the scores, we use estimates obtained by finding the roots of the score functions as they are defined within the grab_eeFUN function?

NOTE: the evaluation of m_estimate is slow!
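A rough sketch of that step is below. The exact geex interface here is an assumption (argument names vary across versions, and grab_eeFUN is named grab_psiFUN in recent releases), so treat this as a guide rather than a verbatim reproduction.

```r
library(geex)

# Per-cluster estimating (score) function, as defined by geex for a merMod fit
eefun <- function(data, model) {
  grab_psiFUN(model, data = data)   # grab_eeFUN in older geex versions
}

mest <- m_estimate(
  estFUN       = eefun,
  data         = d,
  units        = "id",
  outer_args   = list(model = fit),
  root_control = setup_root_control(start = unname(theta_hat))
)

theta_root <- roots(mest)   # parameter values that zero the stacked score equations

# Re-evaluate the score contributions at these roots, assuming the same
# (beta0, beta1, sigma^2) parameterization as in the sketch above
colSums(t(sapply(split(d, d$id), function(di) score_i(theta_root, di$y, di$x))))
```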

Ah! Evaluated at these roots, the score contributions sum to (essentially) zero. That looks much better.