[R-lang] Re: False convergence in mixed logit model

T. Florian Jaeger tiflo@csli.stanford.edu
Thu Nov 29 12:00:07 PST 2012


Hi Laura,

If the predictor variable for which you are entertaining a random
by-subject slope is almost entirely between-subjects, you should go without
a random by-subject slope for that predictor. Barr et al. discuss the
obvious case of a between-subject manipulation (--> do not include
by-subject random slopes for the manipulation). It seems that you have a
case that doesn't fit the neat distinction made in their paper: you have a
predictor that *could* vary within subjects, but mostly ended up varying
between subjects. The problem is that in that case it is hard for the model
to estimate the variance of that random slope. Hence, you get the false
convergence.
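
A minimal sketch of the two options, in the pre-1.0 lme4 syntax from the
model call quoted below ('pred' is a placeholder for your mostly
between-subjects predictor; 'ditrans', 'subj', and 'adultdata' are the names
from your message):

library(lme4)

## With a by-subject random slope: its variance is hard to estimate when
## 'pred' barely varies within subjects, and is a likely source of the
## false-convergence warning.
m.slope <- lmer(ditrans ~ pred + (1 + pred | subj),
                data = adultdata, family = "binomial")

## Without the by-subject slope (random intercept only).
m.int <- lmer(ditrans ~ pred + (1 | subj),
              data = adultdata, family = "binomial")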

Think about it this way: only the subjects who have *both* values for your
binary predictor provide data from which you can actually figure out how
the effect of that predictor *differs* across subjects. For subjects who
categorically have one value for that predictor, the by-subject *intercept*
already captures the same information (or, to put it differently, the model
has no way to distinguish between the by-subject intercept and the
by-subject slope: if a subject always has value 1, every one of their
observations gets the same combined adjustment, so only the sum of their
intercept and slope adjustments is identified). Now even when subjects
aren't quite categorical (e.g., they have value 1 about 98% of the time,
and value 0 about 2% of the time), it's hard for the model to reliably
estimate how much of the observed variance is due to the by-subject random
intercept and how much is due to the by-subject random slope.

Can you provide a table of subject by predictor (for each of the binary
predictors)? I.e., a table that shows the proportion of 0 and 1 values of
the predictor for each of your subjects? If you have enough subjects with
enough within-subject variance, you might be able to test on that subset
(a sketch is below). If not, you probably have to exclude the random
by-subject slope for those predictors.
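
A minimal sketch of what I mean, assuming the data frame 'adultdata' and the
subject column 'subj' from the model call quoted below; 'pred' is a
placeholder for one of your binary predictors:

## Proportion of 0 vs. 1 values of 'pred' within each subject.
with(adultdata, round(prop.table(table(subj, pred), margin = 1), 2))

## Subjects who show *both* values of the binary predictor ...
varying <- with(adultdata, names(which(rowSums(table(subj, pred) > 0) == 2)))

## ... and the subset of the data you could refit on to test the random slope.
adultdata.sub <- droplevels(subset(adultdata, subj %in% varying))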

HTH,

Florian


On Thu, Nov 29, 2012 at 2:45 PM, Laura Suttle <lsuttle@princeton.edu> wrote:

>
> Do you mean that the model converges to the same parameter values if you
>> set the starting parameters to wildly different values?
>>
>>
> Ok, I'm not sure what constitutes "wildly different." I just tried values
> like 100 and 1000 and now things aren't converging to the same values.
> Maybe I wasn't being different enough before to see it.
>
>
>>  It's not obvious to me why this model wouldn't actually be converged,
>> except that the variance of the subject random effect is very large.  Do
>> you have dramatic differences in behavior across subjects?
>>
>
> I do, in that participants usually either use this construction (so they
> have all 1's for the DV) or don't use it at all (so they have all 0's). The
> relevant thing that I'm trying to show is that more people use it in one
> condition versus another. I don't want to convert this to a single binary
> variable since there are some subjects who switch up how they use these
> novel verbs, but for the most part they are either all or none.
>
>
>
>>  Another potential issue: are subject random effects relevant when you
>> are only looking at between subjects fixed effects? I'm wondering if that's
>> part of the issue.
>>
>>
>>  They're relevant whenever you have repeated measures from individual
>> subjects.
>>
>>  If your condition (Dative, Transitive, and something else) varies
>> within subjects, and what you care about is inferences about the effect of
>> condition, you'll want to fit a model with a by-subjects random slope
>> (i.e., response ~ condition + (condition | subj)), as per Barr et al.
>>
>
> Those conditions (control is the something else) are between subjects.
>
> Thanks,
> Laura
>
>
>>  Best
>>
>>  Roger
>>
>>  Here's the code with output for the simplest model I have with a subjects
>> random effect (adding them in any form causes this issue). ditrans is a
>> binary DV that categorizes speaker usages of a novel verb into double
>> object datives (1) or other sentence types (0). Condition is a categorical
>> variable that has three levels with a control condition as the baseline
>> (I've dummy coded these and there are no differences between running it
>> this way and with the dummy codes, so I'm providing this one for the sake of
>> simplicity).
>>
>>
>>  > ditmodel3 <-lmer(ditrans~condition +
>> (1|subj),data=adultdata,family="binomial",verbose=T)
>>    0:     814.04384: 0.391997 -0.472359 -0.840468 0.222898
>>   1:     474.30352:  1.39176 -0.489556 -0.853831 0.221628
>>   2:     446.52182:  1.60483 -1.29691 -1.39869 0.144777
>>   3:     386.13626:  2.59735 -1.19061 -1.38968 0.204059
>>   4:     383.11685:  2.69684 -1.19357 -1.39816 0.208801
>>   5:     377.92258:  2.89551 -1.20228 -1.41725 0.218187
>>   6:     376.11423:  2.97465 -1.20814 -1.42671 0.221868
>>   7:     372.88540:  3.13258 -1.22176 -1.44712 0.229201
>>   8:     371.72249:  3.19542 -1.22881 -1.45656 0.232118
>>   9:     369.58988:  3.32079 -1.24417 -1.47647 0.237957
>>  10:     369.58183:  3.32129 -1.24424 -1.47656 0.237981
>>  11:     369.56574:  3.32229 -1.24439 -1.47674 0.238028
>>  12:     369.55931:  3.32269 -1.24444 -1.47681 0.238046
>>  13:     369.54645:  3.32349 -1.24456 -1.47695 0.238084
>>  14:     369.54594:  3.32352 -1.24456 -1.47696 0.238085
>>  15:     369.54183:  3.32377 -1.24460 -1.47700 0.238097
>>  16:     369.54183:  3.32377 -1.24460 -1.47700 0.238097
>>  17:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  18:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  19:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  20:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  21:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  22:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  23:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  24:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  25:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  26:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  27:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  28:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  29:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  30:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  31:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  32:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  33:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  34:     369.54182:  3.32377 -1.24460 -1.47700 0.238097
>>  35:     369.54175:  3.32378 -1.24460 -1.47699 0.238081
>>  36:     369.54062:  3.32390 -1.24460 -1.47681 0.237810
>>  37:     369.52871:  3.32514 -1.24460 -1.47496 0.234970
>>  38:     369.41249:  3.33754 -1.24460 -1.45648 0.206662
>>  39:     368.48882:  3.45721 -1.24460 -1.27810 -0.0665675
>>  40:     367.50882:  3.71632 -1.24460 -0.891867 -0.658143
>>  41:     367.42619:  3.80235 -1.24460 -0.763626 -0.854531
>>  42:     367.42069:  3.82935 -1.24460 -0.723378 -0.916133
>>  43:     367.42061:  3.83197 -1.24460 -0.719473 -0.922078
>>  44:     367.42056:  3.83302 -1.24460 -0.717885 -0.924450
>>  45:     367.42030:  3.83643 -1.24460 -0.712740 -0.931948
>>  46:     367.41976:  3.84096 -1.24460 -0.705833 -0.941589
>>  47:     367.41821:  3.84903 -1.24460 -0.693317 -0.957909
>>  48:     367.41433:  3.86190 -1.24460 -0.672862 -0.981743
>>  49:     367.40420:  3.88359 -1.24460 -0.637123 -1.01631
>>  50:     367.37911:  3.91871 -1.24460 -0.576118 -1.05854
>>  51:     367.31987:  3.97369 -1.24460 -0.473099 -1.09171
>>  52:     367.19878:  4.04676 -1.24460 -0.319173 -1.06129
>>  53:     367.00997:  4.11079 -1.24460 -0.148079 -0.875985
>>  54:     366.83505:  4.11105 -1.24460 -0.0694293 -0.533561
>>  55:     366.76596:  4.05082 -1.24460 -0.131236 -0.273508
>>  56:     366.75441:  4.00835 -1.24460 -0.205517 -0.224632
>>  57:     366.75330:  3.99692 -1.24460 -0.231735 -0.238739
>>  58:     366.75328:  3.99700 -1.24460 -0.232466 -0.242660
>>  59:     366.75327:  3.99706 -1.24460 -0.232631 -0.243916
>>  60:     366.75321:  3.99733 -1.24460 -0.233186 -0.248657
>>  61:     366.75310:  3.99766 -1.24460 -0.233948 -0.254660
>>  62:     366.75277:  3.99818 -1.24460 -0.235467 -0.265375
>>  63:     366.75194:  3.99913 -1.24460 -0.237734 -0.282069
>>  64:     366.74974:  4.00078 -1.24460 -0.241634 -0.309537
>>  65:     366.74401:  4.00376 -1.24460 -0.248277 -0.353787
>>  66:     366.72917:  4.00940 -1.24460 -0.260050 -0.425563
>>  67:     366.69014:  4.02037 -1.24460 -0.282150 -0.541757
>>  68:     366.58903:  4.04387 -1.24460 -0.323645 -0.729228
>>  69:     366.33054:  4.09566 -1.24460 -0.407098 -1.03058
>>  70:     365.69165:  4.21044 -1.24460 -0.582890 -1.50206
>>  71:     364.22961:  4.45811 -1.24460 -0.955294 -2.19620
>>  72:     363.88256:  4.50536 -1.24460 -1.03067 -2.28460
>>  73:     363.78744:  4.51528 -1.24460 -1.04951 -2.29787
>>  74:     363.76346:  4.51729 -1.24460 -1.05384 -2.29942
>>  75:     363.71559:  4.52093 -1.24460 -1.06284 -2.30192
>>  76:     363.50871:  4.53406 -1.24460 -1.10070 -2.30424
>>  77:     362.67929:  4.58017 -1.24460 -1.25368 -2.28868
>>  78:     362.34767:  4.59767 -1.24460 -1.31389 -2.27484
>>  79:     362.28332:  4.60067 -1.24460 -1.32582 -2.27114
>>  80:     362.15615:  4.60614 -1.24460 -1.34933 -2.26235
>>  81:     362.10583:  4.60818 -1.24460 -1.35861 -2.25843
>>  82:     361.98440:  4.61586 -1.24460 -1.37557 -2.24974
>>  83:     361.94584:  4.61723 -1.24460 -1.38296 -2.24642
>>  84:     361.93799:  4.61752 -1.24460 -1.38442 -2.24571
>>  85:     361.87531:  4.61979 -1.24460 -1.39595 -2.23983
>>  86:     361.87506:  4.61980 -1.24460 -1.39600 -2.23980
>>  87:     361.87456:  4.61982 -1.24460 -1.39609 -2.23975
>>  88:     361.87424:  4.61985 -1.24460 -1.39611 -2.23974
>>  89:     361.87421:  4.61986 -1.24460 -1.39612 -2.23973
>>  90:     361.87420:  4.61986 -1.24460 -1.39612 -2.23973
>>  91:     361.87418:  4.61986 -1.24460 -1.39612 -2.23973
>>  92:     361.87417:  4.61986 -1.24460 -1.39612 -2.23973
>>  93:     361.87417:  4.61986 -1.24460 -1.39612 -2.23973
>>  94:     361.87417:  4.61986 -1.24460 -1.39612 -2.23973
>>  95:     361.87417:  4.61986 -1.24460 -1.39612 -2.23973
>>  96:     361.87417:  4.61986 -1.24460 -1.39612 -2.23973
>>  97:     361.61308:  4.63208 -1.24460 -1.31790 -2.07725
>>  98:     360.78393:  4.70563 -1.24460 -1.01240 -1.42611
>>  99:     360.30955:  4.80814 -1.24460 -0.794807 -0.933962
>> 100:     359.83679:  4.94614 -1.24460 -0.694790 -0.658108
>> 101:     358.66805:  5.34357 -1.24460 -0.659430 -0.362284
>> 102:     357.21569:  5.94720 -1.24460 -0.865900 -0.407961
>> 103:     355.87363:  6.65147 -1.24460 -1.34871 -0.894451
>> 104:     354.79576:  7.33163 -1.24460 -2.04508 -1.75302
>> 105:     354.79427:  7.33250 -1.24460 -2.04623 -1.75453
>> 106:     354.78176:  7.33863 -1.24460 -2.05548 -1.76699
>> 107:     354.78124:  7.33885 -1.24460 -2.05585 -1.76750
>> 108:     354.78114:  7.33890 -1.24460 -2.05592 -1.76760
>> 109:     354.78092:  7.33897 -1.24460 -2.05608 -1.76780
>> 110:     354.78083:  7.33900 -1.24460 -2.05614 -1.76789
>> 111:     354.78081:  7.33901 -1.24460 -2.05615 -1.76790
>> 112:     354.78077:  7.33902 -1.24460 -2.05618 -1.76794
>> 113:     354.78075:  7.33903 -1.24460 -2.05619 -1.76795
>> 114:     354.78075:  7.33903 -1.24460 -2.05619 -1.76795
>> 115:     354.78074:  7.33903 -1.24460 -2.05619 -1.76796
>> 116:     354.78074:  7.33903 -1.24460 -2.05619 -1.76796
>> 117:     354.78074:  7.33903 -1.24460 -2.05619 -1.76796
>> 118:     354.78074:  7.33903 -1.24460 -2.05620 -1.76796
>> 119:     354.78074:  7.33903 -1.24460 -2.05620 -1.76796
>> 120:     354.78074:  7.33903 -1.24460 -2.05620 -1.76796
>> 121:     354.78074:  7.33903 -1.24460 -2.05620 -1.76796
>> 122:     354.78074:  7.33903 -1.24460 -2.05620 -1.76796
>>  Warning message:
>> In mer_finalize(ans) : false convergence (8)
>> >
>> > summary(ditmodel3)
>> Generalized linear mixed model fit by the Laplace approximation
>> Formula: ditrans ~ condition + (1 | subj)
>>    Data: adultdata
>>    AIC   BIC logLik deviance
>>  362.8 381.7 -177.4    354.8
>> Random effects:
>>  Groups Name        Variance Std.Dev.
>>  subj   (Intercept) 53.861   7.339
>> Number of obs: 833, groups: subj, 48
>>
>>  Fixed effects:
>>                 Estimate Std. Error z value Pr(>|z|)
>> (Intercept)       -1.245      1.951  -0.638    0.523
>> conditionDative   -2.056      2.832  -0.726    0.468
>> conditionTrans    -1.768      2.791  -0.633    0.526
>>
>>  Correlation of Fixed Effects:
>>             (Intr) cndtnD
>> conditinDtv -0.689
>> conditnTrns -0.699  0.481
>>
>> Thanks and sorry for the length,
>> Laura
>>
>> On Thu, Nov 29, 2012 at 11:08 AM, Levy, Roger <rlevy@ucsd.edu> wrote:
>>
>>> Yes -- the first column of the verbose output is the step number and the
>>> second column is the deviance.  If the deviance was still going down and
>>> the model stopped, you probably need more iterations.
>>>
>>>  It could be useful to change the starting value of the model
>>> parameters with the "start" argument of lmer and see if you wind up
>>> converging to the same parameter estimates regardless of starting value.
>>>
>>>  More information about the dataset, and example code output, is, of
>>> course, always helpful.
>>>
>>>  Best
>>>
>>>  Roger
>>>
>>>
>>>
>>>  On Nov 29, 2012, at 7:03 AM PST, Laura Suttle wrote:
>>>
>>> Hi Roger,
>>>
>>>  Thanks for the other list suggestion, I'll cross post to there.
>>>
>>>  Every variable in my data set is categorical, so I can't do that fix.
>>> I've tried playing around with the maxIter parameter before, but I'm not
>>> sure I was doing it right. Do you have any suggestions for where I can read
>>> more about how to interpret the verbose output? I found some things but
>>> they weren't very helpful.
>>>
>>>  Thanks,
>>> Laura
>>>
>>>
>>> On Thu, Nov 29, 2012 at 1:34 AM, Levy, Roger <rlevy@ucsd.edu> wrote:
>>>
>>>> Hi Laura,
>>>>
>>>>  This is a question that might be better answered on R-sig-ME, but
>>>> briefly: I would be cautious with a model that reports false convergence;
>>>> in my experience with this warning (and I am by no means expert on it), it
>>>> can indicate that the optimization routine that determines the best-fit
>>>> model parameters got stuck at a parameter estimate that is not near a true
>>>> optimum, perhaps due to numerical issues.  You might try standardizing any
>>>> continuous predictor variables you have and rerunning the lmer() call.  It would
>>>> be helpful to set the msVerbose control parameter to TRUE to see what the
>>>> optimizer is doing.  Also, upping the maxIter and/or maxFN control
>>>> parameters *might* be helpful.
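
A minimal sketch of passing those control parameters in the pre-1.0 lme4
interface used in this thread; the parameter names are the ones mentioned
above, the iteration limits are arbitrary placeholders, and the exact option
names can differ by lme4 version (see ?lmer):

## Laura's model from this thread, refit with verbose optimizer output and
## raised iteration limits ('ditmodel3.ctrl' is just a placeholder name).
ditmodel3.ctrl <- lmer(ditrans ~ condition + (1 | subj),
                       data = adultdata, family = "binomial",
                       control = list(msVerbose = TRUE, maxIter = 500,
                                      maxFN = 1500))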
>>>>
>>>>  I do not think that this warning message alone would be justification
>>>> to omit a random effect.
>>>>
>>>>  Best & hope that this helps,
>>>>
>>>>  Roger
>>>>
>>>>  On Nov 28, 2012, at 8:58 PM PST, Laura Suttle wrote:
>>>>
>>>> Hello all,
>>>>
>>>>  I hope this question hasn't been asked before, but the internet isn't
>>>> being of much help to me.
>>>>
>>>>  I am trying to run a mixed logit regression predicting whether
>>>> participants use a novel verb in a particular construction or not depending
>>>> on how they were exposed to that novel verb. I dummy coded the three
>>>> conditions of the experiment into two dummy variables and have added two
>>>> random effects, one for the motion used for the verb, the other for the
>>>> verb itself (since these were all counterbalanced).
>>>>
>>>> I can get this model to run fine; the problem is when I try to add
>>>> any kind of random effect for the subjects themselves. I then get this
>>>> error message:
>>>>
>>>>  Warning message:
>>>> In mer_finalize(ans) : false convergence (8)
>>>>
>>>>  And all of the effects I had of the exposure type go away.
>>>>
>>>>  I've been trying to look up what this means and how to deal with it,
>>>> but there are no clear solutions or explanations that I can find, only
>>>> plenty of warnings about how I should be skeptical of any output from a model
>>>> with this warning. One suggestion I did find was that the subjects variable
>>>> may be overfitting my data and there might be something to this: when
>>>> participants are exposed to the verb in a certain way, they tend to only
>>>> use the construction I'm looking for, with no variance in their responses.
>>>> That said, I'm not sure that's right and I'd love a second opinion on
>>>> either how I can fix this or whether I can use this as justification to not
>>>> include the subjects random effect.
>>>>
>>>>  Thanks in advance for any help you can give,
>>>> Laura Suttle
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>