From tine at haskins.yale.edu Thu May 8 13:37:55 2008
From: tine at haskins.yale.edu (Tine Mooshammer)
Date: Thu, 08 May 2008 16:37:55 -0400
Subject: [R-lang] lmer for by item and by subject analysis
Message-ID: <482364A3.70909@haskins.yale.edu>

I have RT for 20 subjects from a delayed naming experiment with
different syllable structures (VC, CV, CVC etc.). Therefore, item and
structure (the experimental condition) are confounded.

Example for the data:
   sp  code2 structure logAc
1  F01 cake  CVC       5.544396
2  F01 cape  CVC       5.459586
3  F01 Kay   CV        5.450609
4  F01 lake  CVC       4.830711
5  F01 lay   CV        4.705016
6  F01 pape  CVC       5.446306
7  F01 pate  CVC       5.319590
8  F01 pay   CV        5.535364
9  F01 skate CCVC      5.116795
10 F01 skay  CCV       5.189060

The lmer works well for the simple model:
RTE.lmer=lmer(logAc ~ structure + (1|sp) + (1|code2), latrmE)

but I get the following warning messages for the more complicated model:
RTE.lmerS=lmer(logAc ~ structure + (1+structure|sp) + (1|code2), latrmE)
Warning messages:
1: In .local(x, ..., value) :
  Estimated variance-covariance for factor 'sp' is singular
2: In .local(x, ..., value) :
  nlminb returned message false convergence (8)

Does that mean that I don't have to account for different speaker slopes
or is there an error in the specification of the model or empty cells in
the data (I'm not aware of that)?

Furthermore, a slightly different specification for the model seems to be

> RTE.lmerS=lmer(logAc ~ structure + (1|sp:structure) + (1|code2), latrmE)

but then I get the following error messages:

Error in sp:structure : NA/NaN argument
In addition: Warning messages:
1: In sp:structure :
  numerical expression has 520 elements: only the first used
2: In sp:structure :
  numerical expression has 520 elements: only the first used
3: In inherits(x, "factor") : NAs introduced by coercion

What is the difference between the two models? I'm puzzled.

Bye
Tine

--
++++++++++++++++++++++++++++++++
Dr. Christine Mooshammer
New address/Neue Adresse:
Haskins Laboratories
300 George St., Suite 900
New Haven, CT 06511
USA
Phone: ++1 203 865 6163 315
Email: tine at haskins.yale.edu
+++++++++++++++++++++++++++++++++

From rlevy at ucsd.edu Thu May 8 15:23:18 2008
From: rlevy at ucsd.edu (Roger Levy)
Date: Thu, 08 May 2008 15:23:18 -0700
Subject: [R-lang] lmer for by item and by subject analysis
In-Reply-To: <482364A3.70909@haskins.yale.edu>
References: <482364A3.70909@haskins.yale.edu>
Message-ID: <48237D56.5060402@ucsd.edu>

Tine Mooshammer wrote:
> I have RT for 20 subjects from a delayed naming experiment with
> different syllable structures (VC, CV, CVC etc.). Therefore, item and
> structure (the experimental condition) are confounded.
>
> Example for the data:
>    sp  code2 structure logAc
> 1  F01 cake  CVC       5.544396
> 2  F01 cape  CVC       5.459586
> 3  F01 Kay   CV        5.450609
> 4  F01 lake  CVC       4.830711
> 5  F01 lay   CV        4.705016
> 6  F01 pape  CVC       5.446306
> 7  F01 pate  CVC       5.319590
> 8  F01 pay   CV        5.535364
> 9  F01 skate CCVC      5.116795
> 10 F01 skay  CCV       5.189060
>

hi Tine,

> The lmer works well for the simple model:
> RTE.lmer=lmer(logAc ~ structure + (1|sp) + (1|code2), latrmE)

Could you show us the output of this model?

>
> but I get the following error messages for the more complicated model:
> RTE.lmerS=lmer(logAc ~ structure + (1+structure|sp) + (1|code2), latrmE)
> Warning messages:
> 1: In .local(x, ..., value) :
>   Estimated variance-covariance for factor 'sp' is singular
>
> 2: In .local(x, ..., value) :
>   nlminb returned message false convergence (8)
>
> Does that mean that I don't have to account for different speaker slopes
> or is there an error in the specification of the model or empty cells in
> the data (I'm not aware of that)?

I'm going to guess that you might have a very small estimated random
effect of specific structure by speaker. See below as well.

>
> Furthermore, a slightly different specification for the model seems to be
>
> > RTE.lmerS=lmer(logAc ~ structure + (1|sp:structure) + (1|code2), latrmE)

Note that this really is a different model than the (1 + structure | sp)
model. In the (1 + structure | sp) model, speakers who are slow for one
structure will tend to be slow for other structures as well. This is not
the case for the (1 | sp:structure) model, which gives each
speaker-by-structure combination its own independent intercept.

>
> but then I get the following error messages:
>
> Error in sp:structure : NA/NaN argument
> In addition: Warning messages:
> 1: In sp:structure :
>   numerical expression has 520 elements: only the first used
> 2: In sp:structure :
>   numerical expression has 520 elements: only the first used
> 3: In inherits(x, "factor") : NAs introduced by coercion

What version of lme4 are you using? With the development version (on
R-Forge), I am able to produce toy data that all three of your model
specifications work reasonably well on (appended at end).

This might also turn out to be a good question for the R-sig-ME list.
But you should probably give the development version of lme4 a try, if
that's not what you're using already.
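To make the difference between (1 + structure | sp) and (1 | sp:structure) concrete, here is a minimal base-R sketch (no lme4 needed) of the correlation each specification implies between one speaker's effects on two structures. The variances and the correlation are made-up illustration values, not estimates from any fit in this thread, and the names Sigma, Z and sigma2 are hypothetical.

```r
# Illustrative values only (assumed, not estimated from the real data):
# random-intercept SD, random-slope SD and their correlation
sd_int   <- 0.8
sd_slope <- 0.2
rho      <- -0.3

# Covariance matrix of (intercept b0, slope b1) for a single speaker
Sigma <- matrix(c(sd_int^2,                rho * sd_int * sd_slope,
                  rho * sd_int * sd_slope, sd_slope^2), 2, 2)

# With treatment coding the speaker's effect is b0 on structure 1
# and b0 + b1 on structure 2
Z <- rbind(struct1 = c(1, 0),
           struct2 = c(1, 1))

# Model A, (1 + struct | sp): both structures share the speaker's b0
cov_A <- Z %*% Sigma %*% t(Z)
cor_A <- cov2cor(cov_A)

# Model B, (1 | sp:struct): one independent intercept per speaker-structure cell
sigma2 <- 0.6
cov_B  <- diag(sigma2, 2)
cor_B  <- cov2cor(cov_B)

print(round(cor_A, 3))  # large positive off-diagonal entry
print(round(cor_B, 3))  # off-diagonal entry is exactly 0
```

Under these assumed values, model A ties a speaker's two structure effects closely together, while model B treats them as unrelated draws; adding a separate (1 | sp) term to model B would be one way to reintroduce a shared speaker effect.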
Best

Roger

***

library(lme4)

mu <- 6
nsubj <- 20
ncodes <- 20
struct.effect <- c(0,1)
subj.int <- rnorm(nsubj,0, 1)
subj.slope <- rnorm(nsubj, 0, 0.05)
code2.int <- rnorm(ncodes, 0, 1)
dat <- expand.grid(sp=1:nsubj, code2=1:ncodes)
dat$struct <- ifelse(dat$code2>(ncodes/2), 2, 1)
dat <- within(dat, logAc <- mu + struct.effect[struct] + subj.int[sp]
  + subj.slope[sp]*struct + code2.int[code2] + rnorm(nsubj*ncodes, 0, 1))
dat

dat$struct <- factor(dat$struct)
dat$sp <- factor(dat$sp)
dat$code2 <- factor(dat$code2)
# method="ML" is the interface of the lme4 version used here;
# later lme4 releases take REML=FALSE instead
logAc.lmer1 <- lmer(logAc ~ struct + (1 | sp) + (1 | code2), data=dat,
  method="ML")
logAc.lmer2 <- lmer(logAc ~ struct + (1 + struct | sp) + (1 | code2),
  dat, method="ML")
logAc.lmer3 <- lmer(logAc ~ struct + (1 | sp:struct) + (1 | code2),
  dat, method="ML")

> logAc.lmer1
Linear mixed model fit by maximum likelihood
Formula: logAc ~ struct + (1 | sp) + (1 | code2)
   Data: dat
  AIC  BIC logLik deviance REMLdev
 1292 1312 -640.8     1282    1283
Random effects:
 Groups   Name        Variance Std.Dev.
 sp       (Intercept) 0.60403  0.77720
 code2    (Intercept) 0.62855  0.79281
 Residual             1.13088  1.06343
Number of obs: 400, groups: sp, 20; code2, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)   6.1954     0.3142  19.719
struct2       0.6435     0.3702   1.739

Correlation of Fixed Effects:
        (Intr)
struct2 -0.589

> logAc.lmer2
Linear mixed model fit by maximum likelihood
Formula: logAc ~ struct + (1 + struct | sp) + (1 | code2)
   Data: dat
  AIC  BIC logLik deviance REMLdev
 1295 1323 -640.7     1281    1282
Random effects:
 Groups   Name        Variance Std.Dev. Corr
 sp       (Intercept) 0.636429 0.79776
          struct2     0.035020 0.18714  -0.272
 code2    (Intercept) 0.629427 0.79336
 Residual             1.121574 1.05904
Number of obs: 400, groups: sp, 20; code2, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)   6.1954     0.3168  19.555
struct2       0.6435     0.3726   1.727

Correlation of Fixed Effects:
        (Intr)
struct2 -0.598

> logAc.lmer3
Linear mixed model fit by maximum likelihood
Formula: logAc ~ struct + (1 | sp:struct) + (1 | code2)
   Data: dat
  AIC  BIC logLik deviance REMLdev
 1313 1333 -651.6     1303    1304
Random effects:
 Groups    Name        Variance Std.Dev.
 sp:struct (Intercept) 0.61738  0.78573
 code2     (Intercept) 0.63429  0.79643
 Residual              1.12153  1.05902
Number of obs: 400, groups: sp:struct, 40; code2, 20

Fixed effects:
            Estimate Std. Error t value
(Intercept)   6.1954     0.3161   19.60
struct2       0.6435     0.4470    1.44

Correlation of Fixed Effects:
        (Intr)
struct2 -0.707

From tine at haskins.yale.edu Fri May 9 08:16:38 2008
From: tine at haskins.yale.edu (Tine Mooshammer)
Date: Fri, 09 May 2008 11:16:38 -0400
Subject: [R-lang] lmer for by item and by subject analysis
In-Reply-To: <48237D56.5060402@ucsd.edu>
References: <482364A3.70909@haskins.yale.edu> <48237D56.5060402@ucsd.edu>
Message-ID: <48246AD6.3040606@haskins.yale.edu>

> hi Tine,
>
>> The lmer works well for the simple model:
>> RTE.lmer=lmer(logAc ~ structure + (1|sp) + (1|code2), latrmE)
>
> Could you show us the output of this model?

yes, sure:

Linear mixed-effects model fit by REML
Formula: logAc ~ structure + (1 | sp) + (1 | code2)
   Data: latrmE
  AIC    BIC logLik MLdeviance REMLdeviance
 -489 -459.2  251.5       -524         -503
Random effects:
 Groups   Name        Variance  Std.Dev.
 code2    (Intercept) 0.0069926 0.083622
 sp       (Intercept) 0.0454584 0.213210
 Residual             0.0164155 0.128123
number of obs: 520, groups: code2, 26; sp, 20

Fixed effects:
              Estimate Std. Error t value
(Intercept)    5.64618    0.06193   91.17
structureCV   -0.12430    0.05590   -2.22
structureCVC  -0.09259    0.05039   -1.84
structureCCV  -0.39861    0.05930   -6.72
structureCCVC -0.37137    0.05930   -6.26

Correlation of Fixed Effects:
            (Intr) strcCV strCVC strCCV
structureCV -0.451
structurCVC -0.501  0.555
structurCCV -0.426  0.471  0.523
structrCCVC -0.426  0.471  0.523  0.444

For model 1+structure|sp

Linear mixed-effects model fit by REML
Formula: logAc ~ structure + (1 + structure | sp) + (1 | code2)
   Data: latrmE
    AIC  BIC logLik MLdeviance REMLdeviance
 -486.3 -397  264.2     -549.4       -528.3
Random effects:
 Groups   Name          Variance  Std.Dev. Corr
 code2    (Intercept)   0.0070048 0.083695
 sp       (Intercept)   0.0374311 0.193471
          structureCV   0.0030771 0.055472 -0.348
          structureCVC  0.0100272 0.100136 -0.348  0.812
          structureCCV  0.0014262 0.037765 -0.298  0.632  0.955
          structureCCVC 0.0088162 0.093895 -0.298  0.632  0.955  1.000
 Residual               0.0147228 0.121338
number of obs: 520, groups: code2, 26; sp, 20

Fixed effects:
              Estimate Std. Error t value
(Intercept)    5.64618    0.05848   96.55
structureCV   -0.12430    0.05701   -2.18
structureCVC  -0.09259    0.05493   -1.69
structureCCV  -0.39861    0.05962   -6.69
structureCCVC -0.37137    0.06264   -5.93

Correlation of Fixed Effects:
            (Intr) strcCV strCVC strCCV
structureCV -0.520
structurCVC -0.587  0.566
structurCCV -0.475  0.475  0.528
structrCCVC -0.497  0.480  0.580  0.462

>> but I get the following error messages for the more complicated model:
>> RTE.lmerS=lmer(logAc ~ structure + (1+structure|sp) + (1|code2), latrmE)
>> Warning messages:
>> 1: In .local(x, ..., value) :
>>   Estimated variance-covariance for factor 'sp' is singular
>>
>> 2: In .local(x, ..., value) :
>>   nlminb returned message false convergence (8)
>>
>> Does that mean that I don't have to account for different speaker
>> slopes or is there an error in the specification of the model or
>> empty cells in the data (I'm not aware of that)?
>
> I'm going to guess that you might have a very small estimated random
> effect of specific structure by speaker. See below as well.
>
>> Furthermore, a slightly different specification for the model seems
>> to be
>>
>> > RTE.lmerS=lmer(logAc ~ structure + (1|sp:structure) + (1|code2),
>> latrmE)
>
> Note that this really is a different model than the (1 + structure |
> sp) model. In the (1 + structure | sp) model, speakers who are slow
> for one structure will tend to be slow for other structures as well.
> This is not the case for the (1 | sp:structure) model.
>
>> but then I get the following error messages:
>>
>> Error in sp:structure : NA/NaN argument
>> In addition: Warning messages:
>> 1: In sp:structure :
>>   numerical expression has 520 elements: only the first used
>> 2: In sp:structure :
>>   numerical expression has 520 elements: only the first used
>> 3: In inherits(x, "factor") : NAs introduced by coercion
>
> What version of lme4 are you using? With the development version (on
> R-Forge), I am able to produce toy data that all three of your model
> specifications work reasonably well on (appended at end).

I'm using the current version, 0.99875-9.

With your data I still get a warning for logAc.lmer2:

Warning message:
In .local(x, ..., value) :
  Estimated variance-covariance for factor 'sp' is singular

Does this mean that the variance of struct on sp is so small that it can
be neglected?

Sorry for stupid questions... And thanks for your quick answer.

Tine

> This might also turn out to be a good question for the R-sig-ME list.
> But you should probably give the development version of lme4 a try, if
> that's not what you're using already.
From rlevy at ling.ucsd.edu Fri May 9 13:18:21 2008
From: rlevy at ling.ucsd.edu (Roger Levy)
Date: Fri, 09 May 2008 13:18:21 -0700
Subject: [R-lang] lmer for by item and by subject analysis
In-Reply-To: <48246AD6.3040606@haskins.yale.edu>
References: <482364A3.70909@haskins.yale.edu> <48237D56.5060402@ucsd.edu> <48246AD6.3040606@haskins.yale.edu>
Message-ID: <4824B18D.9000703@ling.ucsd.edu>

Hi Tine,

Tine Mooshammer wrote:
>> hi Tine,
>>
>>> The lmer works well for the simple model:
>>> RTE.lmer=lmer(logAc ~ structure + (1|sp) + (1|code2), latrmE)
>>
>> Could you show us the output of this model?
> yes, sure:
>
> Linear mixed-effects model fit by REML
> Formula: logAc ~ structure + (1 | sp) + (1 | code2)
>    Data: latrmE
>   AIC    BIC logLik MLdeviance REMLdeviance
>  -489 -459.2  251.5       -524         -503
> Random effects:
>  Groups   Name        Variance  Std.Dev.
>  code2    (Intercept) 0.0069926 0.083622
>  sp       (Intercept) 0.0454584 0.213210
>  Residual             0.0164155 0.128123
> number of obs: 520, groups: code2, 26; sp, 20
>
> Fixed effects:
>               Estimate Std. Error t value
> (Intercept)    5.64618    0.06193   91.17
> structureCV   -0.12430    0.05590   -2.22
> structureCVC  -0.09259    0.05039   -1.84
> structureCCV  -0.39861    0.05930   -6.72
> structureCCVC -0.37137    0.05930   -6.26
>
> Correlation of Fixed Effects:
>             (Intr) strcCV strCVC strCCV
> structureCV -0.451
> structurCVC -0.501  0.555
> structurCCV -0.426  0.471  0.523
> structrCCVC -0.426  0.471  0.523  0.444
>
> For model 1+structure|sp
> Linear mixed-effects model fit by REML
> Formula: logAc ~ structure + (1 + structure | sp) + (1 | code2)
>    Data: latrmE
>     AIC  BIC logLik MLdeviance REMLdeviance
>  -486.3 -397  264.2     -549.4       -528.3
> Random effects:
>  Groups   Name          Variance  Std.Dev. Corr
>  code2    (Intercept)   0.0070048 0.083695
>  sp       (Intercept)   0.0374311 0.193471
>           structureCV   0.0030771 0.055472 -0.348
>           structureCVC  0.0100272 0.100136 -0.348  0.812
>           structureCCV  0.0014262 0.037765 -0.298  0.632  0.955
>           structureCCVC 0.0088162 0.093895 -0.298  0.632  0.955  1.000
>  Residual               0.0147228 0.121338
> number of obs: 520, groups: code2, 26; sp, 20

I believe that the correlation coefficient of 1.000 for the random
effects of structureCCVC and structureCCV is an indication that you have
an overspecified random effects structure. But I think you might want to
try using the development version -- see the following R-sig-ME email
for installation information:

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2007q4/000427.html

This newer version handles the estimation of very small random effects
structures much better than the version 0.99875-9 that you're using. If
you try it, let us know how it works!

Best & good luck!
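As a small illustration of why a correlation of 1.000 goes hand in hand with the "singular" warning, the sketch below rebuilds the 2x2 covariance block for the structureCCV and structureCCVC random effects from the standard deviations printed in the quoted output and checks its eigenvalues. This is only base-R arithmetic on the rounded, printed numbers, not a refit of the model, and the variable names are made up for the example.

```r
# Standard deviations as printed in the quoted output (rounded values)
sds <- c(structureCCV = 0.037765, structureCCVC = 0.093895)

# Reported correlation between the two effects is 1.000
R <- matrix(c(1, 1,
              1, 1), 2, 2)

# Covariance block: V = D R D, with D the diagonal matrix of SDs
V <- diag(sds) %*% R %*% diag(sds)

# A singular (non-invertible) covariance matrix has a zero eigenvalue
ev <- eigen(V, symmetric = TRUE)$values
print(ev)
print(det(V))
```

One eigenvalue is numerically zero, so the two effects are perfectly linearly related and one of them carries no information of its own, which is consistent with reading the fit as overspecified.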
Roger

--

Roger Levy                      Email: rlevy at ling.ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy

From brunilda at gmail.com Wed May 14 16:25:05 2008
From: brunilda at gmail.com (Bruno Estigarribia)
Date: Wed, 14 May 2008 19:25:05 -0400
Subject: [R-lang] MANCOVA in R
Message-ID: <482B74D1.4060703@mail.fpg.unc.edu>

Hello,

I posted this to the general R list but got no replies. I am not sure this
is appropriate for r-lang: the data ARE linguistic but the question is
general. Apologies if this is inappropriate.

I have subjects in 4 groups: X1, X2, X3, X4. There are 33 subjects in
group X1, 35 in X2, 31 in X3, and 46 in group X4. I have 7 continuous
response variables (actually integers, they are frequencies of functional
morphemes, approximately normal) measured for each subject: Y1 to Y7, and
two continuous covariates C1, C2 (they are both integers).
I want to perform all pairwise comparisons for each response variable
between groups. I have searched for a way to do a MANCOVA in R to no
avail. I am familiar with summary.manova, and with Venables & Ripley
"Modern Applied Statistics With S" and Everitt's "An R and S-Plus
Companion to Multivariate Analysis". However, I am neither a
statistician nor a programmer so I am finding it hard to figure this
out. Can summary.manova be adapted to use covariates? What is the impact
of the unbalanced design? Can I adjust for multiple comparisons?
Thank you
Bruno Estigarribia
UNC Chapel Hill

From rlevy at ling.ucsd.edu Thu May 15 10:10:41 2008
From: rlevy at ling.ucsd.edu (Roger Levy)
Date: Thu, 15 May 2008 10:10:41 -0700
Subject: [R-lang] MANCOVA in R
In-Reply-To: <482B74D1.4060703@mail.fpg.unc.edu>
References: <482B74D1.4060703@mail.fpg.unc.edu>
Message-ID: <482C6E91.3010607@ling.ucsd.edu>

Bruno Estigarribia wrote:
> Hello,
>
> I posted this to the general R list but got no replies. I am not sure this
> is appropriate for r-lang: the data ARE linguistic but the question is
> general. Apologies if this is inappropriate.
>
> I have subjects in 4 groups: X1, X2, X3, X4. There are 33 subjects in
> group X1, 35 in X2, 31 in X3, and 46 in group X4. I have 7 continuous
> response variables (actually integers, they are frequencies of functional
> morphemes, approximately normal) measured
> for each subject: Y1 to Y7, and two continuous covariates C1, C2 (they
> are both integers).
> I want to perform all pairwise comparisons for each response variable
> between groups. I have searched for a way to do a MANCOVA in R to no
> avail. I am familiar with summary.manova, and with Venables & Ripley
> "Modern Applied Statistics With S" and Everitt's "An R and S-Plus
> Companion to Multivariate Analysis". However, I am neither a
> statistician nor a programmer so I am finding it hard to figure this
> out. Can summary.manova be adapted to use covariates? What is the impact
> of the unbalanced design? Can I adjust for multiple comparisons?
> Thank you

Hi Bruno,

I'm by no means an expert on MAN(C)OVA but here are my two cents:

(1) it seems to me that manova(), like aov(), just works perfectly fine
if you give it continuous covariates instead of factors in a formula.
For example, try the following code:

# generate random data
library(mvtnorm)
subj <- factor(rep(c("a","b","c","d","e","f","g","h","i","j"),10))
x <- rnorm(100)
beta1 <- 1
beta2 <- -0.7
b1 <- rnorm(10,sd=5)
b2 <- rnorm(10,sd=2)
mu <- cbind(x * beta1 + b1[subj],x * beta2 + b2[subj])
Y <- t(apply(mu, 1, function(x) rmvnorm(1, x, matrix(c(6,3,3,4),2,2))))
# fit the model & get results
fit <- manova(Y ~ x)
summary.manova(fit)
summary.aov(fit)

(2) unfortunately, manova() doesn't support multistratum analysis, so
you cannot specify an Error() term in manova(). I don't know of any
workaround for this -- does anyone else?
(3) I'm not sure I properly understood what you meant by "perform all
pairwise comparisons for each response variable between groups", so if I
misunderstood your question, please let me know!

Hope this helps.

Roger

From lucien at ling.ucsd.edu Sun May 25 23:24:01 2008
From: lucien at ling.ucsd.edu (Lucien Carroll)
Date: Sun, 25 May 2008 23:24:01 -0700
Subject: [R-lang] lrm penalty/regularization/norm
Message-ID: <4e6d6c930805252324w30c19dbbvd80c5fe9bac1b950@mail.gmail.com>

Hello,

I would like to specify an L1-norm (Laplace) regularization for a logistic
regression, and I'm having trouble figuring out how to do that. Do I
understand the documentation right in thinking that the penalty argument
in lrm() is setting the width of a Gaussian (L2-norm)? Is L1-norm possible
with lrm() or should I be looking at some other function?

Thanks.

--
Lucien S. Carroll
Graduate Student
SDSU & UCSD Linguistics
http://ling.ucsd.edu/~lucien

From rlevy at ucsd.edu Wed May 28 10:49:31 2008
From: rlevy at ucsd.edu (Roger Levy)
Date: Wed, 28 May 2008 10:49:31 -0700
Subject: [R-lang] lrm penalty/regularization/norm
In-Reply-To: <4e6d6c930805252324w30c19dbbvd80c5fe9bac1b950@mail.gmail.com>
References: <4e6d6c930805252324w30c19dbbvd80c5fe9bac1b950@mail.gmail.com>
Message-ID: <483D9B2B.3040108@ucsd.edu>

Lucien Carroll wrote:
> Hello,
>
> I would like to specify an L1-norm (Laplace) regularization for a
> logistic regression, and I'm having trouble figuring out how to do that.
> Do I understand the documentation right in thinking that the penalty
> argument in lrm() is setting the width of a Gaussian (L2-norm)? Is
> L1-norm possible with lrm() or should I be looking at some other function?

Hi Lucien,

Yes, penalization in lrm() is Gaussian (L2). I haven't used R for L1
penalization, but you might check out the lars package: it implements
the lasso, which is L1-penalized. However, I don't think that lars
covers logistic regression. (You might get additional info by posting to
the general R-help list.)

Roger
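For readers who want to see what an L1 penalty does to a logistic regression, here is a minimal base-R sketch that fits one by proximal gradient descent (soft-thresholding). It is not what lrm() or lars do internally. The function names l1_logistic and soft_threshold, the toy data, the fixed iteration count, and the omission of an intercept are all assumptions made for illustration; dedicated packages for penalized regression (glmnet, for example, which was not mentioned in the thread) would be the practical choice.

```r
# Soft-thresholding operator: the proximal map of the L1 penalty
soft_threshold <- function(z, t) sign(z) * pmax(abs(z) - t, 0)

# Minimise (1/n) * sum(logistic loss) + lambda * sum(abs(beta)); no intercept
l1_logistic <- function(X, y, lambda, iters = 2000) {
  n <- nrow(X)
  # Step size 1/L, using sum(X^2) / (4 * n) as an upper bound on the
  # Lipschitz constant L of the gradient of the average log-loss
  step <- 4 * n / sum(X^2)
  beta <- rep(0, ncol(X))
  for (i in seq_len(iters)) {
    p    <- plogis(drop(X %*% beta))        # fitted probabilities
    grad <- drop(crossprod(X, p - y)) / n   # gradient of the average log-loss
    beta <- soft_threshold(beta - step * grad, step * lambda)
  }
  beta
}

# Toy data: only the first of five predictors affects the outcome
set.seed(1)
n <- 200
X <- matrix(rnorm(n * 5), n, 5)
y <- rbinom(n, 1, plogis(2 * X[, 1]))

# Smallest lambda at which every coefficient is shrunk to exactly zero
lambda_max <- max(abs(crossprod(X, y - 0.5))) / n

beta_small <- l1_logistic(X, y, lambda = 0.01)
beta_big   <- l1_logistic(X, y, lambda = lambda_max)
print(round(beta_small, 3))
print(beta_big)
```

The point of the comparison is that the L1 penalty can set coefficients to exactly zero as lambda grows, whereas a Gaussian (L2) penalty of the kind lrm() uses only shrinks them towards zero.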