[R-lang] Re: False convergence in mixed logit model
Laura Suttle
lsuttle@princeton.edu
Thu Nov 29 12:51:38 PST 2012
Here are the proportion data; 33 of 48 subjects overall picked either one or
the other exclusively. Thanks for the reference; I'm always glad to find more
papers on these methods.
Thanks,
Laura
(subject: proportion, grouped by condition)

Control:    3: 0, 18: 0, A10: 0, A18: 0, A20: 0, A22: 0, A9: 0,
            A5: 0.055, 2: 0.312, A8: 0.555, A21: 0.666, 4: 0.937,
            5: 0.937, 1: 1, 21: 1, 22: 1

Dative:     9: 0, 11: 0, 13: 0, 15: 0, 17: 0, 23: 0, A12: 0, A13: 0,
            A19: 0, A23: 0, A4: 0, 8: 0.217, 19: 0.5, A6: 0.555,
            A24: 0.95, A15: 1

Transitive: 6: 0, 14: 0, 16: 0, A14: 0, A25: 0, A3: 0, A7: 0,
            A17: 0.055, A16: 0.277, 10: 0.937, A1: 0.944, 7: 1,
            12: 1, 20: 1, 24: 1, A11: 1
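
In case it helps, here's a sketch of one way such a table can be computed
in base R (using the adultdata, subj, condition, and ditrans names from
the lmer() call further down the thread):

## per-subject mean of the binary DV = proportion of double-object uses,
## computed separately within each exposure condition
by(adultdata, adultdata$condition,
   function(d) sort(tapply(d$ditrans, as.character(d$subj), mean)))
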
On Thu, Nov 29, 2012 at 3:00 PM, T. Florian Jaeger
<tiflo@csli.stanford.edu> wrote:
> Hi Laura,
>
> if the predictor variable for which you are entertaining a random
> by-subject slope is almost entirely between subjects, you should go without
> random by-subject slopes for that predictor. Barr et al. discuss the
> obvious case of a between-subjects manipulation (--> do not include
> by-subject random slopes for the manipulation). It seems that you have a
> case that doesn't fit into the neat distinction made in their paper: you
> have a predictor that *could* vary within subjects, but mostly ended up
> varying between subjects. The problem is that in that case, it is hard for
> the model to estimate the variance for that random slope. Hence, you get
> the false convergence.
>
> Think about it this way: only the subjects who have *both* values of
> your binary predictor provide data from which you can actually figure out
> how the effect of that predictor *differs* across subjects. For
> subjects who categorically have one value for that predictor, the
> by-subject *intercept* already captures the same information (or, to put
> it differently, the model has no way to distinguish between the by-subject
> intercept and the by-subject slope). Even when subjects aren't quite
> categorical (e.g., they have value 1 about 98% of the time and value 0
> about 2% of the time), it's hard for the model to reliably estimate how
> much of the observed variance is due to the by-subject random intercept
> and how much is due to the by-subject random slope.
>
> Can you provide a table of subject by predictor (for each of the binary
> predictors)? I.e., a table that shows the proportion of 0 and 1 values of
> the predictor for each of your subjects? If you have enough subjects with
> enough variance, you might be able to test on that subset. If not, you
> probably have to exclude the random by-subject slopes for those predictors.
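>
> For instance, something along these lines (just a sketch, reusing the
> names from the lmer() call below; xtabs() and tapply() are base R):
>
> ## counts of observations per subject and predictor value:
> xtabs(~ subj + condition, data = adultdata)
>
> ## subjects whose predictor value actually varies within subject:
> nvals <- with(adultdata,
>               tapply(as.character(condition), subj,
>                      function(x) length(unique(x))))
> varying.subj <- names(nvals)[nvals > 1]
>
> ## if enough remain, refit the random-slope model on that subset:
> ## lmer(ditrans ~ condition + (condition | subj),
> ##      data = subset(adultdata, subj %in% varying.subj),
> ##      family = "binomial")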
>
> HTH,
>
> Florian
>
>
> On Thu, Nov 29, 2012 at 2:45 PM, Laura Suttle <lsuttle@princeton.edu> wrote:
>
>>
>> Do you mean that the model converges to the same parameter values if
>>> you set the starting parameters to wildly different values?
>>>
>>>
>> Ok, I'm not sure what constitutes "wildly different." I just tried
>> values like 100 and 1000, and now things aren't converging to the same
>> values. Maybe my earlier starting values weren't different enough for me
>> to see it.
>>
>>
>>> It's not obvious to me why this model wouldn't actually be converged,
>>> except that the variance of the subject random effect is very large. Do
>>> you have dramatic differences in behavior across subjects?
>>>
>>
>> I do, in that participants usually either use this construction (so
>> they have all 1's for the DV) or don't use it at all (so they have all
>> 0's). The relevant thing that I'm trying to show is that more people use
>> it in one condition versus another. I don't want to convert this to a
>> single binary variable, since there are some subjects who switch up how
>> they use these novel verbs, but for the most part they are all or none.
>>
>>
>>
>>> Another potential issue: are subject random effects relevant when you
>>> are only looking at between-subjects fixed effects? I'm wondering if that's
>>> part of the issue.
>>>
>>>
>>> They're relevant whenever you have repeated measures from individual
>>> subjects.
>>>
>>> If your condition (Dative, Transitive, and something else) varies
>>> within subjects, and what you care about is inferences about the effect of
>>> condition, you'll want to fit a model with a by-subjects random slope
>>> (i.e., response ~ condition + (condition | subj)), as per Barr et al.
>>>
>>
>> Those conditions (control is the something else) are between subjects.
>>
>> Thanks,
>> Laura
>>
>>
>>> Best
>>>
>>> Roger
>>>
>>> Here's the code and output for the simplest model I have with a
>>> subjects random effect (adding it in any form causes this issue).
>>> ditrans is a binary DV that categorizes speaker usages of a novel verb as
>>> double-object datives (1) or other sentence types (0). Condition is a
>>> categorical variable with three levels, with a control condition as the
>>> baseline. (I've also dummy coded these, and there are no differences
>>> between running it this way and with the dummy codes, so I'm providing
>>> this one for the sake of simplicity.)
>>>
>>>
>>> > ditmodel3 <-lmer(ditrans~condition +
>>> (1|subj),data=adultdata,family="binomial",verbose=T)
>>> 0: 814.04384: 0.391997 -0.472359 -0.840468 0.222898
>>> 1: 474.30352: 1.39176 -0.489556 -0.853831 0.221628
>>> 2: 446.52182: 1.60483 -1.29691 -1.39869 0.144777
>>> 3: 386.13626: 2.59735 -1.19061 -1.38968 0.204059
>>> 4: 383.11685: 2.69684 -1.19357 -1.39816 0.208801
>>> 5: 377.92258: 2.89551 -1.20228 -1.41725 0.218187
>>> 6: 376.11423: 2.97465 -1.20814 -1.42671 0.221868
>>> 7: 372.88540: 3.13258 -1.22176 -1.44712 0.229201
>>> 8: 371.72249: 3.19542 -1.22881 -1.45656 0.232118
>>> 9: 369.58988: 3.32079 -1.24417 -1.47647 0.237957
>>> 10: 369.58183: 3.32129 -1.24424 -1.47656 0.237981
>>> 11: 369.56574: 3.32229 -1.24439 -1.47674 0.238028
>>> 12: 369.55931: 3.32269 -1.24444 -1.47681 0.238046
>>> 13: 369.54645: 3.32349 -1.24456 -1.47695 0.238084
>>> 14: 369.54594: 3.32352 -1.24456 -1.47696 0.238085
>>> 15: 369.54183: 3.32377 -1.24460 -1.47700 0.238097
>>> 16: 369.54183: 3.32377 -1.24460 -1.47700 0.238097
>>> 17: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 18: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 19: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 20: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 21: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 22: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 23: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 24: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 25: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 26: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 27: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 28: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 29: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 30: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 31: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 32: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 33: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 34: 369.54182: 3.32377 -1.24460 -1.47700 0.238097
>>> 35: 369.54175: 3.32378 -1.24460 -1.47699 0.238081
>>> 36: 369.54062: 3.32390 -1.24460 -1.47681 0.237810
>>> 37: 369.52871: 3.32514 -1.24460 -1.47496 0.234970
>>> 38: 369.41249: 3.33754 -1.24460 -1.45648 0.206662
>>> 39: 368.48882: 3.45721 -1.24460 -1.27810 -0.0665675
>>> 40: 367.50882: 3.71632 -1.24460 -0.891867 -0.658143
>>> 41: 367.42619: 3.80235 -1.24460 -0.763626 -0.854531
>>> 42: 367.42069: 3.82935 -1.24460 -0.723378 -0.916133
>>> 43: 367.42061: 3.83197 -1.24460 -0.719473 -0.922078
>>> 44: 367.42056: 3.83302 -1.24460 -0.717885 -0.924450
>>> 45: 367.42030: 3.83643 -1.24460 -0.712740 -0.931948
>>> 46: 367.41976: 3.84096 -1.24460 -0.705833 -0.941589
>>> 47: 367.41821: 3.84903 -1.24460 -0.693317 -0.957909
>>> 48: 367.41433: 3.86190 -1.24460 -0.672862 -0.981743
>>> 49: 367.40420: 3.88359 -1.24460 -0.637123 -1.01631
>>> 50: 367.37911: 3.91871 -1.24460 -0.576118 -1.05854
>>> 51: 367.31987: 3.97369 -1.24460 -0.473099 -1.09171
>>> 52: 367.19878: 4.04676 -1.24460 -0.319173 -1.06129
>>> 53: 367.00997: 4.11079 -1.24460 -0.148079 -0.875985
>>> 54: 366.83505: 4.11105 -1.24460 -0.0694293 -0.533561
>>> 55: 366.76596: 4.05082 -1.24460 -0.131236 -0.273508
>>> 56: 366.75441: 4.00835 -1.24460 -0.205517 -0.224632
>>> 57: 366.75330: 3.99692 -1.24460 -0.231735 -0.238739
>>> 58: 366.75328: 3.99700 -1.24460 -0.232466 -0.242660
>>> 59: 366.75327: 3.99706 -1.24460 -0.232631 -0.243916
>>> 60: 366.75321: 3.99733 -1.24460 -0.233186 -0.248657
>>> 61: 366.75310: 3.99766 -1.24460 -0.233948 -0.254660
>>> 62: 366.75277: 3.99818 -1.24460 -0.235467 -0.265375
>>> 63: 366.75194: 3.99913 -1.24460 -0.237734 -0.282069
>>> 64: 366.74974: 4.00078 -1.24460 -0.241634 -0.309537
>>> 65: 366.74401: 4.00376 -1.24460 -0.248277 -0.353787
>>> 66: 366.72917: 4.00940 -1.24460 -0.260050 -0.425563
>>> 67: 366.69014: 4.02037 -1.24460 -0.282150 -0.541757
>>> 68: 366.58903: 4.04387 -1.24460 -0.323645 -0.729228
>>> 69: 366.33054: 4.09566 -1.24460 -0.407098 -1.03058
>>> 70: 365.69165: 4.21044 -1.24460 -0.582890 -1.50206
>>> 71: 364.22961: 4.45811 -1.24460 -0.955294 -2.19620
>>> 72: 363.88256: 4.50536 -1.24460 -1.03067 -2.28460
>>> 73: 363.78744: 4.51528 -1.24460 -1.04951 -2.29787
>>> 74: 363.76346: 4.51729 -1.24460 -1.05384 -2.29942
>>> 75: 363.71559: 4.52093 -1.24460 -1.06284 -2.30192
>>> 76: 363.50871: 4.53406 -1.24460 -1.10070 -2.30424
>>> 77: 362.67929: 4.58017 -1.24460 -1.25368 -2.28868
>>> 78: 362.34767: 4.59767 -1.24460 -1.31389 -2.27484
>>> 79: 362.28332: 4.60067 -1.24460 -1.32582 -2.27114
>>> 80: 362.15615: 4.60614 -1.24460 -1.34933 -2.26235
>>> 81: 362.10583: 4.60818 -1.24460 -1.35861 -2.25843
>>> 82: 361.98440: 4.61586 -1.24460 -1.37557 -2.24974
>>> 83: 361.94584: 4.61723 -1.24460 -1.38296 -2.24642
>>> 84: 361.93799: 4.61752 -1.24460 -1.38442 -2.24571
>>> 85: 361.87531: 4.61979 -1.24460 -1.39595 -2.23983
>>> 86: 361.87506: 4.61980 -1.24460 -1.39600 -2.23980
>>> 87: 361.87456: 4.61982 -1.24460 -1.39609 -2.23975
>>> 88: 361.87424: 4.61985 -1.24460 -1.39611 -2.23974
>>> 89: 361.87421: 4.61986 -1.24460 -1.39612 -2.23973
>>> 90: 361.87420: 4.61986 -1.24460 -1.39612 -2.23973
>>> 91: 361.87418: 4.61986 -1.24460 -1.39612 -2.23973
>>> 92: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 93: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 94: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 95: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 96: 361.87417: 4.61986 -1.24460 -1.39612 -2.23973
>>> 97: 361.61308: 4.63208 -1.24460 -1.31790 -2.07725
>>> 98: 360.78393: 4.70563 -1.24460 -1.01240 -1.42611
>>> 99: 360.30955: 4.80814 -1.24460 -0.794807 -0.933962
>>> 100: 359.83679: 4.94614 -1.24460 -0.694790 -0.658108
>>> 101: 358.66805: 5.34357 -1.24460 -0.659430 -0.362284
>>> 102: 357.21569: 5.94720 -1.24460 -0.865900 -0.407961
>>> 103: 355.87363: 6.65147 -1.24460 -1.34871 -0.894451
>>> 104: 354.79576: 7.33163 -1.24460 -2.04508 -1.75302
>>> 105: 354.79427: 7.33250 -1.24460 -2.04623 -1.75453
>>> 106: 354.78176: 7.33863 -1.24460 -2.05548 -1.76699
>>> 107: 354.78124: 7.33885 -1.24460 -2.05585 -1.76750
>>> 108: 354.78114: 7.33890 -1.24460 -2.05592 -1.76760
>>> 109: 354.78092: 7.33897 -1.24460 -2.05608 -1.76780
>>> 110: 354.78083: 7.33900 -1.24460 -2.05614 -1.76789
>>> 111: 354.78081: 7.33901 -1.24460 -2.05615 -1.76790
>>> 112: 354.78077: 7.33902 -1.24460 -2.05618 -1.76794
>>> 113: 354.78075: 7.33903 -1.24460 -2.05619 -1.76795
>>> 114: 354.78075: 7.33903 -1.24460 -2.05619 -1.76795
>>> 115: 354.78074: 7.33903 -1.24460 -2.05619 -1.76796
>>> 116: 354.78074: 7.33903 -1.24460 -2.05619 -1.76796
>>> 117: 354.78074: 7.33903 -1.24460 -2.05619 -1.76796
>>> 118: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> 119: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> 120: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> 121: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> 122: 354.78074: 7.33903 -1.24460 -2.05620 -1.76796
>>> Warning message:
>>> In mer_finalize(ans) : false convergence (8)
>>> >
>>> > summary(ditmodel3)
>>> Generalized linear mixed model fit by the Laplace approximation
>>> Formula: ditrans ~ condition + (1 | subj)
>>> Data: adultdata
>>> AIC BIC logLik deviance
>>> 362.8 381.7 -177.4 354.8
>>> Random effects:
>>> Groups Name Variance Std.Dev.
>>> subj (Intercept) 53.861 7.339
>>> Number of obs: 833, groups: subj, 48
>>>
>>> Fixed effects:
>>> Estimate Std. Error z value Pr(>|z|)
>>> (Intercept) -1.245 1.951 -0.638 0.523
>>> conditionDative -2.056 2.832 -0.726 0.468
>>> conditionTrans -1.768 2.791 -0.633 0.526
>>>
>>> Correlation of Fixed Effects:
>>> (Intr) cndtnD
>>> conditinDtv -0.689
>>> conditnTrns -0.699 0.481
>>>
>>> Thanks and sorry for the length,
>>> Laura
>>>
>>> On Thu, Nov 29, 2012 at 11:08 AM, Levy, Roger <rlevy@ucsd.edu> wrote:
>>>
>>>> Yes -- the first column of the verbose output is the step number and
>>>> the second column is the deviance. If the deviance was still going down
>>>> when the model stopped, you probably need more iterations.
>>>>
>>>> It could be useful to change the starting value of the model
>>>> parameters with the "start" argument of lmer and see if you wind up
>>>> converging to the same parameter estimates regardless of starting value.
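>>>>
>>>> Roughly like this (just a sketch with placeholder names y, x, subj,
>>>> and d; the exact values that "start" accepts depend on your lme4
>>>> version, so check ?lmer first):
>>>>
>>>> library(lme4)
>>>> ## fit once from the default start and once from a deliberately
>>>> ## different one, then compare the fixed-effect estimates
>>>> fit.a <- lmer(y ~ x + (1 | subj), data = d, family = "binomial")
>>>> fit.b <- lmer(y ~ x + (1 | subj), data = d, family = "binomial",
>>>>               start = 5)
>>>> cbind(fixef(fit.a), fixef(fit.b))  # do the estimates agree?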
>>>>
>>>> More information about the dataset, along with example code and
>>>> output, is, of course, always helpful.
>>>>
>>>> Best
>>>>
>>>> Roger
>>>>
>>>>
>>>>
>>>> On Nov 29, 2012, at 7:03 AM PST, Laura Suttle wrote:
>>>>
>>>> Hi Roger,
>>>>
>>>> Thanks for the other list suggestion, I'll cross post to there.
>>>>
>>>> Every variable in my data set is categorical, so I can't do that fix.
>>>> I've tried playing around with the maxIter parameter before, but I'm not
>>>> sure I was doing it right. Do you have any suggestions for where I can read
>>>> more about how to interpret the verbose output? I found some things but
>>>> they weren't very helpful.
>>>>
>>>> Thanks,
>>>> Laura
>>>>
>>>>
>>>> On Thu, Nov 29, 2012 at 1:34 AM, Levy, Roger <rlevy@ucsd.edu> wrote:
>>>>
>>>>> Hi Laura,
>>>>>
>>>>> This is a question that might be better answered on R-sig-ME, but
>>>>> briefly: I would be cautious with a model that reports false convergence;
>>>>> in my experience with this warning (and I am by no means an expert on it),
>>>>> it can indicate that the optimization routine that determines the best-fit
>>>>> model parameters got stuck at a parameter estimate that is not near a true
>>>>> optimum, perhaps due to numerical issues. You might try standardizing any
>>>>> continuous predictor variables you have and rerunning the lmer() call. It
>>>>> would also be helpful to set the msVerbose control parameter to TRUE to see
>>>>> what the optimizer is doing. Also, upping the maxIter and/or maxFN control
>>>>> parameters *might* be helpful.
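>>>>>
>>>>> For instance, something like this (a sketch against the pre-1.0
>>>>> lme4 interface, with placeholder names y, x, subj, and d):
>>>>>
>>>>> library(lme4)
>>>>> ## raise the iteration/function-evaluation caps and watch the
>>>>> ## optimizer's progress
>>>>> m <- lmer(y ~ x + (1 | subj), data = d, family = "binomial",
>>>>>           control = list(maxIter = 500, maxFN = 1500,
>>>>>                          msVerbose = TRUE))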
>>>>>
>>>>> I do not think that this warning message alone would be
>>>>> justification to omit a random effect.
>>>>>
>>>>> Best & hope that this helps,
>>>>>
>>>>> Roger
>>>>>
>>>>> On Nov 28, 2012, at 8:58 PM PST, Laura Suttle wrote:
>>>>>
>>>>> Hello all,
>>>>>
>>>>> I hope this question hasn't been asked before, but the internet
>>>>> isn't being of much help to me.
>>>>>
>>>>> I am trying to run a mixed logit regression predicting whether
>>>>> participants use a novel verb in a particular construction or not,
>>>>> depending on how they were exposed to that novel verb. I dummy coded the
>>>>> three conditions of the experiment into two dummy variables, and have
>>>>> added two random effects: one for the motion used for the verb, the other
>>>>> for the verb itself (since these were all counterbalanced).
>>>>>
>>>>> I can get this model to run fine; the problem arises when I try to add
>>>>> any kind of random effect for the subjects themselves. I then get this
>>>>> warning message:
>>>>>
>>>>> Warning message:
>>>>> In mer_finalize(ans) : false convergence (8)
>>>>>
>>>>> And all of the effects of exposure type that I had go away.
>>>>>
>>>>> I've been trying to look up what this means and how to deal with it,
>>>>> but I can find no clear solutions or explanations, only plenty of
>>>>> warnings that I should be skeptical of any output from a model
>>>>> with this warning. One suggestion I did find was that the subjects variable
>>>>> may be overfitting my data, and there might be something to this: when
>>>>> participants are exposed to the verb in a certain way, they tend to only
>>>>> use the construction I'm looking for, with no variance in their responses.
>>>>> That said, I'm not sure that's right, and I'd love a second opinion on
>>>>> either how I can fix this or whether I can use it as justification to not
>>>>> include the subjects random effect.
>>>>>
>>>>> Thanks in advance for any help you can give,
>>>>> Laura Suttle
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>