[R-lang] [Fwd: [R] [R-pkgs] New glmnet package on CRAN]

Roger Levy rlevy at ucsd.edu
Mon Jun 2 11:14:53 PDT 2008


Hi Lucien,

It looks like your wish *may* have just been granted...

Cheers,

Roger

-------- Original Message --------
Subject: [R] [R-pkgs] New glmnet package on CRAN
Date: Mon, 02 Jun 2008 11:08:16 -0700
From: Trevor Hastie <hastie at stanford.edu>
To: r-packages at stat.math.ethz.ch

glmnet is a package that fits the regularization path for linear
regression and for two- and multi-class logistic regression models
with "elastic net" regularization (a tunable mixture of L1 and L2
penalties).
glmnet uses pathwise coordinate descent and is very fast.
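
To make "coordinate descent" concrete, here is a minimal sketch in plain Python for the linear-regression case at a single value of the regularization parameter. It is not glmnet's implementation: the function names, the 1/(2N) scaling of the squared-error loss, and the absence of an intercept and of standardization are all assumptions made for illustration. The penalty is the one quoted in this message, lam * ((1-a)*||beta||_2^2 + a*||beta||_1). A pathwise version would presumably call this repeatedly over a decreasing sequence of lam values, starting each fit from the previous solution.

```python
def soft_threshold(z, gamma):
    """Soft-thresholding operator: sign(z) * max(|z| - gamma, 0)."""
    if z > gamma:
        return z - gamma
    if z < -gamma:
        return z + gamma
    return 0.0


def elastic_net_cd(X, y, lam, a, n_sweeps=200, tol=1e-10):
    """Cyclic coordinate descent (illustrative sketch, not glmnet's code) for
        (1/(2N)) * ||y - X b||^2 + lam * ((1 - a) * ||b||_2^2 + a * ||b||_1).
    X is a list of N rows with p entries each; y is a list of N responses.
    Assumptions: no intercept, no standardization of the columns.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta; beta starts at zero
    # Mean squared value of each column, reused in every update.
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            old = beta[j]
            # Inner product of column j with the partial residual, i.e. the
            # residual with variable j's current contribution added back.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * old
            denom = col_sq[j] + 2.0 * lam * (1.0 - a)
            new = soft_threshold(rho, lam * a) / denom if denom > 0 else 0.0
            if new != old:
                delta = new - old
                # Keep the residual in sync with the updated coefficient.
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:  # stop once a full sweep changes nothing
            break
    return beta
```

A call such as elastic_net_cd(X, y, lam=0.1, a=1.0) gives a lasso fit under these assumptions; the soft-thresholding step is what sets small coefficients exactly to zero.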

Some of the features of glmnet:

* by default it computes the path at 100 uniformly spaced (on the log
scale) values of the regularization parameter
* glmnet appears to be faster than any of the packages that are freely
available, in some cases by two orders of magnitude.
* recognizes and exploits sparse input matrices (à la the Matrix package).
Coefficient matrices are output in sparse matrix representation.
* penalty is (1-a)*||beta||_2^2 + a*||beta||_1, where a is between 0 and
1;  a=1 is the lasso penalty, a=0 is the ridge penalty.
   For many correlated predictors, a=.95 or thereabouts improves the
performance of the lasso.
* convenient predict, plot, print, and coef methods
* variable-wise penalty modulation allows each variable to be penalized
by a scalable amount; if that amount is zero, the variable is unpenalized
and always enters the model
* glmnet uses a symmetric parametrization for multinomial, with
constraints enforced by the penalization.
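
Three of the features above (the log-spaced grid, the mixing parameter a, and the per-variable penalty scaling) can be illustrated with a short sketch in plain Python. None of this is glmnet's API: the function names, the `ratio` argument that sets the smallest grid value, and the choice to apply each variable's factor to its whole penalty term are assumptions made for illustration. Read literally, the penalty formula leaves only the L1 term at a=1 and only the squared L2 term at a=0.

```python
import math


def elastic_net_penalty(beta, a, penalty_factor=None):
    """(1 - a) * ||beta||_2^2 + a * ||beta||_1, with an optional
    per-variable multiplier (hypothetical name `penalty_factor`).
    A factor of zero leaves that variable unpenalized.
    """
    if penalty_factor is None:
        penalty_factor = [1.0] * len(beta)
    return sum(w * ((1.0 - a) * b * b + a * abs(b))
               for b, w in zip(beta, penalty_factor))


def log_spaced_grid(lam_max, n_values=100, ratio=1e-4):
    """Return n_values regularization values from lam_max down to
    lam_max * ratio, uniformly spaced on the log scale.
    `ratio` is an assumed parameter, not a value taken from glmnet.
    """
    if n_values < 2:
        return [lam_max]
    log_max = math.log(lam_max)
    log_min = math.log(lam_max * ratio)
    step = (log_max - log_min) / (n_values - 1)
    return [math.exp(log_max - k * step) for k in range(n_values)]
```

Uniform spacing on the log scale means consecutive grid values differ by a constant ratio, so the grid is denser near the small end of the range.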

Other families such as poisson might appear in later versions of glmnet.

Examples of glmnet speed trials:

Newsgroup data: N=11,000, p=4 Million, two class logistic. 100 values
along lasso path.   Time = 2mins
14 Class cancer data: N=144, p=16K, 14 class multinomial, 100 values
along lasso path. Time = 30secs

Authors: Jerome Friedman, Trevor Hastie, Rob Tibshirani.

See our paper http://www-stat.stanford.edu/~hastie/Papers/glmnet.pdf for
implementation details and comparisons with other related software.

-- 
--------------------------------------------------------------------
  Trevor Hastie                                  hastie at stanford.edu
  Professor & Chair, Department of Statistics, Stanford University
  Phone: (650) 725-2231 (Statistics)	         Fax: (650) 725-8977
	 (650) 498-5233 (Biostatistics)		 Fax: (650) 725-6951
  URL: http://www-stat.stanford.edu/~hastie
  address: room 104, Department of Statistics, Sequoia Hall
	          390 Serra Mall, Stanford University, CA 94305-4065


-- 

Roger Levy                      Email: rlevy at ling.ucsd.edu
Assistant Professor             Phone: 858-534-7219
Department of Linguistics       Fax:   858-534-4789
UC San Diego                    Web:   http://ling.ucsd.edu/~rlevy


