[R-lang] Re: False convergence in mixed logit model
Laura Suttle
lsuttle@princeton.edu
Thu Nov 29 12:51:38 PST 2012
Here are the proportion data; 33 of 48 subjects overall picked either one or
the other exclusively. Thanks for the reference; I'm always glad to find more
papers on these methods.
Thanks,
Laura
(subject: proportion, grouped by condition)

Control:    3: 0, 18: 0, A10: 0, A18: 0, A20: 0, A22: 0, A9: 0,
            A5: 0.055, 2: 0.312, A8: 0.555, A21: 0.666, 4: 0.937,
            5: 0.937, 1: 1, 21: 1, 22: 1

Dative:     9: 0, 11: 0, 13: 0, 15: 0, 17: 0, 23: 0, A12: 0, A13: 0,
            A19: 0, A23: 0, A4: 0, 8: 0.217, 19: 0.5, A6: 0.555,
            A24: 0.95, A15: 1

Transitive: 6: 0, 14: 0, 16: 0, A14: 0, A25: 0, A3: 0, A7: 0,
            A17: 0.055, A16: 0.277, 10: 0.937, A1: 0.944, 7: 1,
            12: 1, 20: 1, 24: 1, A11: 1
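
In case it helps, here's a sketch of one way such a table can be computed
in base R (using the adultdata, subj, condition, and ditrans names from
the lmer() call further down the thread):

## per-subject mean of the binary DV = proportion of double-object uses,
## computed separately within each exposure condition
by(adultdata, adultdata$condition,
   function(d) sort(tapply(d$ditrans, as.character(d$subj), mean)))
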
On Thu, Nov 29, 2012 at 3:00 PM, T. Florian Jaeger
<tiflo@csli.stanford.edu> wrote:
> Hi Laura,
>
> if the predictor variable for which you are entertaining a random
> by-subject slope is almost entirely between subjects, you should go without
> random by-subject slopes for that predictor. Barr et al. discuss the
> obvious case of a between-subjects manipulation (--> do not include
> by-subject random slopes for the manipulation). It seems that you have a
> case that doesn't fit into the neat distinction made in their paper: you
> have a predictor that *could* vary within subjects, but mostly ended up
> varying between subjects. The problem is that in that case, it is hard for
> the model to estimate the variance for that random slope. Hence, you get
> the false convergence.
>
> Think about it this way: only the subjects who have *both* values of
> your binary predictor provide data from which you can actually figure out
> how the effect of that predictor *differs* across subjects. For
> subjects who categorically have one value for that predictor, the
> by-subject *intercept* already captures the same information (or, to put
> it differently, the model has no way to distinguish between the by-subject
> intercept and the by-subject slope). Even when subjects aren't quite
> categorical (e.g., they have value 1 about 98% of the time and value 0
> about 2% of the time), it's hard for the model to reliably estimate how
> much of the observed variance is due to the by-subject random intercept
> and how much is due to the by-subject random slope.
>
> Can you provide a table of subject by predictor (for each of the binary
> predictors)? I.e., a table that shows the proportion of 0 and 1 values of
> the predictor for each of your subjects? If you have enough subjects with
> enough variance, you might be able to test on that subset. If not, you
> probably have to exclude the random by-subject slopes for those predictors.
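>
> For instance, something along these lines (just a sketch, reusing the
> names from the lmer() call below; xtabs() and tapply() are base R):
>
> ## counts of observations per subject and predictor value:
> xtabs(~ subj + condition, data = adultdata)
>
> ## subjects whose predictor value actually varies within subject:
> nvals <- with(adultdata,
>               tapply(as.character(condition), subj,
>                      function(x) length(unique(x))))
> varying.subj <- names(nvals)[nvals > 1]
>
> ## if enough remain, refit the random-slope model on that subset:
> ## lmer(ditrans ~ condition + (condition | subj),
> ##      data = subset(adultdata, subj %in% varying.subj),
> ##      family = "binomial")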
>
> HTH,
>
> Florian
>
>
> On Thu, Nov 29, 2012 at 2:45 PM, Laura Suttle <lsuttle@princeton.edu> wrote:
>
>>
>> Do you mean that the model converges to the same parameter values if
>>> you set the starting parameters to wildly different values?
>>>
>>>
>> Ok, I'm not sure what constitutes "wildly different." I just tried
>> values like 100 and 1000, and now things aren't converging to the same
>> values. Maybe my earlier starting values weren't different enough for me
>> to see it.
>>
>>
>>> It's not obvious to me why this model wouldn't actually be converged,
>>> except that the variance of the subject random effect is very large. Do
>>> you have dramatic differences in behavior across subjects?
>>>
>>
>> I do, in that participants usually either use this construction (so
>> they have all 1's for the DV) or don't use it at all (so they have all
>> 0's). The relevant thing that I'm trying to show is that more people use
>> it in one condition versus another. I don't want to convert this to a
>> single binary variable, since there are some subjects who switch up how
>> they use these novel verbs, but for the most part they are all or none.
>>
>>
>>
>>> Another potential issue: are subject random effects relevant when you
>>> are only looking at between-subjects fixed effects? I'm wondering if that's
>>> part of the issue.
>>>
>>>
>>> They're relevant whenever you have repeated measures from individual
>>> subjects.
>>>
>>> If your condition (Dative, Transitive, and something else) varies
>>> within subjects, and what you care about is inferences about the effect of
>>> condition, you'll want to fit a model with a by-subjects random slope
>>> (i.e., response ~ condition + (condition | subj)), as per Barr et al.
>>>
>>
>> Those conditions (control is the something else) are between subjects.
>>
>> Thanks,
>> Laura
>>
>>
>>> Best
>>>
>>> Roger
>>>
>>> Here's the code and output for the simplest model I have with a
>>> subjects random effect (adding it in any form causes this issue).
>>> ditrans is a binary DV that categorizes speaker usages of a novel verb as
>>> double-object datives (1) or other sentence types (0). Condition is a
>>> categorical variable with three levels, with a control condition as the
>>> baseline. (I've also dummy coded these, and there are no differences
>>> between running it this way and with the dummy codes, so I'm providing
>>> this one for the sake of simplicity.)
>>>
>>>
>>> > ditmodel3 <-lmer(ditrans~condition +
>>> (1|subj),data=adultdata,family="binomial",verbose=T)
>>> 0: 814.04384: 0.391997 -0.472359 -0.840468 0.222898
>>> 1: 474.30352: 1.39176 -0.489556 -0.853831 0.221628
>>> 2: 446.52182: 1.60483 -1.29691 -1.39869 0.144777
>>> 3: 386.13626: 2.59735 -1.19061 -1.38968 0.204059
>>> 4: 383.11685: 2.69684 -1.19357 -1.39816 0.208801
>>> 5: 377.92258: 2.89551 -1.20228 -1.41725 0.218187
>>> 6: 376.11423: 2.97465 -1.20814 -1.42671 0.221868
>>> 7: 372.88540: 3.13258 -1.22176 -1.44712 0.229201
>>> 8: 371.72249: 3.19542 -1.22881 -1.45656 0.232118
>>> 9: 369.58988: 3.32079 -1.24417 -1.47647 0.237957
>>> 10: 369.58183: 3.32129 -1.24424 -1.47656 0.237981
>>> 11: 369.56574: 3.32229 -1.24439 -1.47674 0.238028
>>> 12: 369.55931: 3.32269 -1.24444 -1.47681 0.238046
>>> 13: 369.54645: 3.32349 -1.24456 -1.47695 0.238084
>>> 14: 369.54594: 3.32352 -1.24456 -1.47696 0.238085
>>> 15: 369.54183: 3.32377 -1.24460 -1.47700 0.238097
>>> 16: 369.54183: 3.32377 -1.24460 -1.47700 0.238097
>>> 17: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 18: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 19: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 20: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 21: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 22: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 23: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 24: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 25: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 26: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 27: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 28: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 29: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 30: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 31: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 32: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 33: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 34: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 35: 369.54175: 3.32378 -1.24460 -1.47699 0.238081
>>> 36: 369.54062: 3.32390 -1.24460 -1.47681 0.237810
>>> 37: 369.52871: 3.32514 -1.24460 -1.47496 0.234970
>>> 38: 369.41249: 3.33754 -1.24460 -1.45648 0.206662
>>> 39: 368.48882: 3.45721 -1.24460 -1.27810 -0.0665675
>>> 40: 367.50882: 3.71632 -1.24460 -0.891867 -0.658143
>>> 41: 367.42619: 3.80235 -1.24460 -0.763626 -0.854531
>>> 42: 367.42069: 3.82935 -1.24460 -0.723378 -0.916133
>>> 43: 367.42061: 3.83197 -1.24460 -0.719473 -0.922078
>>> 44: 367.42056: 3.83302 -1.24460 -0.717885 -0.924450
>>> 45: 367.42030: 3.83643 -1.24460 -0.712740 -0.931948
>>> 46: 367.41976: 3.84096 -1.24460 -0.705833 -0.941589
>>> 47: 367.41821: 3.84903 -1.24460 -0.693317 -0.957909
>>> 48: 367.41433: 3.86190 -1.24460 -0.672862 -0.981743
>>> 49: 367.40420: 3.88359 -1.24460 -0.637123 -1.01631
>>> 50: 367.37911: 3.91871 -1.24460 -0.576118 -1.05854
>>> 51: 367.31987: 3.97369 -1.24460 -0.473099 -1.09171
>>> 52: 367.19878: 4.04676 -1.24460 -0.319173 -1.06129
>>> 53: 367.00997: 4.11079 -1.24460 -0.148079 -0.875985
>>> 54: 366.83505: 4.11105 -1.24460 -0.0694293 -0.533561
>>> 55: 366.76596: 4.05082 -1.24460 -0.131236 -0.273508
>>> 56: 366.75441: 4.00835 -1.24460 -0.205517 -0.224632
>>> 57: 366.75330: 3.99692 -1.24460 -0.231735 -0.238739
>>> 58: 366.75328: 3.99700 -1.24460 -0.232466 -0.242660
>>> 59: 366.75327: 3.99706 -1.24460 -0.232631 -0.243916
>>> 60: 366.75321: 3.99733 -1.24460 -0.233186 -0.248657
>>> 61: 366.75310: 3.99766 -1.24460 -0.233948 -0.254660
>>> 62: 366.75277: 3.99818 -1.24460 -0.235467 -0.265375
>>> 63: 366.75194: 3.99913 -1.24460 -0.237734 -0.282069
>>> 64: 366.74974: 4.00078 -1.24460 -0.241634 -0.309537
>>> 65: 366.74401: 4.00376 -1.24460 -0.248277 -0.353787
>>> 66: 366.72917: 4.00940 -1.24460 -0.260050 -0.425563
>>> 67: 366.69014: 4.02037 -1.24460 -0.282150 -0.541757
>>> 68: 366.58903: 4.04387 -1.24460 -0.323645 -0.729228
>>> 69: 366.33054: 4.09566 -1.24460 -0.407098 -1.03058
>>> 70: 365.69165: 4.21044 -1.24460 -0.582890 -1.50206
>>> 71: 364.22961: 4.45811 -1.24460 -0.955294 -2.19620
>>> 72: 363.88256: 4.50536 -1.24460 -1.03067 -2.28460
>>> 73: 363.78744: 4.51528 -1.24460 -1.04951 -2.29787
>>> 74: 363.76346: 4.51729 -1.24460 -1.05384 -2.29942
>>> 75: 363.71559: 4.52093 -1.24460 -1.06284 -2.30192
>>> 76: 363.50871: 4.53406 -1.24460 -1.10070 -2.30424
>>> 77: 362.67929: 4.58017 -1.24460 -1.25368 -2.28868
>>> 78: 362.34767: 4.59767 -1.24460 -1.31389 -2.27484
>>> 79: 362.28332: 4.60067 -1.24460 -1.32582 -2.27114
>>> 80: 362.15615: 4.60614 -1.24460 -1.34933 -2.26235
>>> 81: 362.10583: 4.60818 -1.24460 -1.35861 -2.25843
>>> 82: 361.98440: 4.61586 -1.24460 -1.37557 -2.24974
>>> 83: 361.94584: 4.61723 -1.24460 -1.38296 -2.24642
>>> 84: 361.93799: 4.61752 -1.24460 -1.38442 -2.24571
>>> 85: 361.87531: 4.61979 -1.24460 -1.39595 -2.23983
>>> 86: 361.87506: 4.61980 -1.24460 -1.39600 -2.23980
>>> 87: 361.87456: 4.61982 -1.24460 -1.39609 -2.23975
>>> 88: 361.87424: 4.61985 -1.24460 -1.39611 -2.23974
>>> 89: 361.87421: 4.61986 -1.24460 -1.39612 -2.23973
>>> 90: 361.87420: 4.61986 -1.24460 -1.39612 -2.23973
>>> 91: 361.87418: 4.61986 -1.24460 -1.39612 -2.23973
>>> 92: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 93: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 94: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 95: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 96: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 97: 361.61308: 4.63208 -1.24460 -1.31790 -2.07725
>>> 98: 360.78393: 4.70563 -1.24460 -1.01240 -1.42611
>>> 99: 360.30955: 4.80814 -1.24460 -0.794807 -0.933962
>>> 100: 359.83679: 4.94614 -1.24460 -0.694790 -0.658108
>>> 101: 358.66805: 5.34357 -1.24460 -0.659430 -0.362284
>>> 102: 357.21569: 5.94720 -1.24460 -0.865900 -0.407961
>>> 103: 355.87363: 6.65147 -1.24460 -1.34871 -0.894451
>>> 104: 354.79576: 7.33163 -1.24460 -2.04508 -1.75302
>>> 105: 354.79427: 7.33250 -1.24460 -2.04623 -1.75453
>>> 106: 354.78176: 7.33863 -1.24460 -2.05548 -1.76699
>>> 107: 354.78124: 7.33885 -1.24460 -2.05585 -1.76750
>>> 108: 354.78114: 7.33890 -1.24460 -2.05592 -1.76760
>>> 109: 354.78092: 7.33897 -1.24460 -2.05608 -1.76780
>>> 110: 354.78083: 7.33900 -1.24460 -2.05614 -1.76789
>>> 111: 354.78081: 7.33901 -1.24460 -2.05615 -1.76790
>>> 112: 354.78077: 7.33902 -1.24460 -2.05618 -1.76794
>>> 113: 354.78075: 7.33903 -1.24460 -2.05619 -1.76795
>>> 114: 354.78075: 7.33903 -1.24460 -2.05619 -1.76795
>>> 115: 354.78074: 7.33903 -1.24460 -2.05619 -1.76796
>>> 116: 354.78074: 7.33903 -1.24460 -2.05619 -1.76796
>>> 117: 354.78074: 7.33903 -1.24460 -2.05619 -1.76796
>>> 118: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> 119: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> 120: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> 121: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> 122: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> Warning message:
>>> In mer_finalize(ans) : false convergence (8)
>>> >
>>> > summary(ditmodel3)
>>> Generalized linear mixed model fit by the Laplace approximation
>>> Formula: ditrans ~ condition + (1 | subj)
>>> Data: adultdata
>>> AIC BIC logLik deviance
>>> 362.8 381.7 -177.4 354.8
>>> Random effects:
>>> Groups Name Variance Std.Dev.
>>> subj (Intercept) 53.861 7.339
>>> Number of obs: 833, groups: subj, 48
>>>
>>> Fixed effects:
>>> Estimate Std. Error z value Pr(>|z|)
>>> (Intercept) -1.245 1.951 -0.638 0.523
>>> conditionDative -2.056 2.832 -0.726 0.468
>>> conditionTrans -1.768 2.791 -0.633 0.526
>>>
>>> Correlation of Fixed Effects:
>>> (Intr) cndtnD
>>> conditinDtv -0.689
>>> conditnTrns -0.699 0.481
>>>
>>> Thanks and sorry for the length,
>>> Laura
>>>
>>> On Thu, Nov 29, 2012 at 11:08 AM, Levy, Roger <rlevy@ucsd.edu> wrote:
>>>
>>>> Yes -- the first column of the verbose output is the step number and
>>>> the second column is the deviance. If the deviance was still going down
>>>> when the model stopped, you probably need more iterations.
>>>>
>>>> It could be useful to change the starting value of the model
>>>> parameters with the "start" argument of lmer and see if you wind up
>>>> converging to the same parameter estimates regardless of starting value.
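>>>>
>>>> Roughly like this (just a sketch with placeholder names y, x, subj,
>>>> and d; the exact values that "start" accepts depend on your lme4
>>>> version, so check ?lmer first):
>>>>
>>>> library(lme4)
>>>> ## fit once from the default start and once from a deliberately
>>>> ## different one, then compare the fixed-effect estimates
>>>> fit.a <- lmer(y ~ x + (1 | subj), data = d, family = "binomial")
>>>> fit.b <- lmer(y ~ x + (1 | subj), data = d, family = "binomial",
>>>>               start = 5)
>>>> cbind(fixef(fit.a), fixef(fit.b))  # do the estimates agree?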
>>>>
>>>> More information about the dataset, along with example code and
>>>> output, is, of course, always helpful.
>>>>
>>>> Best
>>>>
>>>> Roger
>>>>
>>>>
>>>>
>>>> On Nov 29, 2012, at 7:03 AM PST, Laura Suttle wrote:
>>>>
>>>> Hi Roger,
>>>>
>>>> Thanks for the other list suggestion, I'll cross post to there.
>>>>
>>>> Every variable in my data set is categorical, so I can't do that fix.
>>>> I've tried playing around with the maxIter parameter before, but I'm not
>>>> sure I was doing it right. Do you have any suggestions for where I can read
>>>> more about how to interpret the verbose output? I found some things but
>>>> they weren't very helpful.
>>>>
>>>> Thanks,
>>>> Laura
>>>>
>>>>
>>>> On Thu, Nov 29, 2012 at 1:34 AM, Levy, Roger <rlevy@ucsd.edu> wrote:
>>>>
>>>>> Hi Laura,
>>>>>
>>>>> This is a question that might be better answered on R-sig-ME, but
>>>>> briefly: I would be cautious with a model that reports false convergence;
>>>>> in my experience with this warning (and I am by no means an expert on it),
>>>>> it can indicate that the optimization routine that determines the best-fit
>>>>> model parameters got stuck at a parameter estimate that is not near a true
>>>>> optimum, perhaps due to numerical issues. You might try standardizing any
>>>>> continuous predictor variables you have and rerunning the lmer() call. It
>>>>> would also be helpful to set the msVerbose control parameter to TRUE to see
>>>>> what the optimizer is doing. Also, upping the maxIter and/or maxFN control
>>>>> parameters *might* be helpful.
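>>>>>
>>>>> For instance, something like this (a sketch against the pre-1.0
>>>>> lme4 interface, with placeholder names y, x, subj, and d):
>>>>>
>>>>> library(lme4)
>>>>> ## raise the iteration/function-evaluation caps and watch the
>>>>> ## optimizer's progress
>>>>> m <- lmer(y ~ x + (1 | subj), data = d, family = "binomial",
>>>>>           control = list(maxIter = 500, maxFN = 1500,
>>>>>                          msVerbose = TRUE))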
>>>>>
>>>>> I do not think that this warning message alone would be
>>>>> justification to omit a random effect.
>>>>>
>>>>> Best & hope that this helps,
>>>>>
>>>>> Roger
>>>>>
>>>>> On Nov 28, 2012, at 8:58 PM PST, Laura Suttle wrote:
>>>>>
>>>>> Hello all,
>>>>>
>>>>> I hope this question hasn't been asked before, but the internet
>>>>> isn't being of much help to me.
>>>>>
>>>>> I am trying to run a mixed logit regression predicting whether
>>>>> participants use a novel verb in a particular construction or not,
>>>>> depending on how they were exposed to that novel verb. I dummy coded the
>>>>> three conditions of the experiment into two dummy variables, and have
>>>>> added two random effects: one for the motion used for the verb, the other
>>>>> for the verb itself (since these were all counterbalanced).
>>>>>
>>>>> I can get this model to run fine; the problem arises when I try to add
>>>>> any kind of random effect for the subjects themselves. I then get this
>>>>> warning message:
>>>>>
>>>>> Warning message:
>>>>> In mer_finalize(ans) : false convergence (8)
>>>>>
>>>>> And all of the effects of exposure type that I had go away.
>>>>>
>>>>> I've been trying to look up what this means and how to deal with it,
>>>>> but I can find no clear solutions or explanations, only plenty of
>>>>> warnings that I should be skeptical of any output from a model
>>>>> with this warning. One suggestion I did find was that the subjects variable
>>>>> may be overfitting my data, and there might be something to this: when
>>>>> participants are exposed to the verb in a certain way, they tend to only
>>>>> use the construction I'm looking for, with no variance in their responses.
>>>>> That said, I'm not sure that's right, and I'd love a second opinion on
>>>>> either how I can fix this or whether I can use it as justification to not
>>>>> include the subjects random effect.
>>>>>
>>>>> Thanks in advance for any help you can give,
>>>>> Laura Suttle
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>