User Documentation for Apache MADlib
Regression Models

Detailed Description

A collection of methods for modeling the conditional expectation of a response variable.


 Clustered Variance
 Calculates clustered variance for linear, logistic, and multinomial logistic regression models, and Cox proportional hazards models.
 Cox-Proportional Hazards Regression
 Models the relationship between one or more independent predictor variables and the amount of time before an event occurs.
 Elastic Net Regularization
 Generates a regularized regression model for variable selection in linear and logistic regression problems, combining the L1 and L2 penalties of the lasso and ridge methods.
 Generalized Linear Models
 Estimates generalized linear models (GLMs). A GLM is a flexible generalization of ordinary linear regression that allows for response variables whose error distribution is not normal. It generalizes linear regression by relating the linear model to the response variable through a link function and by allowing the variance of each measurement to be a function of its predicted value.
 Linear Regression
 Also called Ordinary Least Squares Regression. Models a linear relationship between a dependent variable and one or more independent variables.
 Logistic Regression
 Models the relationship between one or more predictor variables and a binary categorical dependent variable by predicting the probability of the dependent variable using a logistic function.
 Marginal Effects
 Calculates marginal effects for the coefficients in regression problems.
 Multinomial Regression
 Models the conditional distribution of a multinomial response variable using a linear combination of predictors.
 Ordinal Regression
 Models data with an ordinal response variable.
 Robust Variance
 Calculates Huber-White variance estimates for linear, logistic, and multinomial regression models, and for Cox proportional hazards models.
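
Two ideas from the list above can be made concrete with a short sketch: the link function that ties a linear model to the expected response in a GLM, and the combined L1 and L2 penalty used by elastic net. The Python below is illustrative only and is not MADlib's API. The function names (linear_predictor, predict_mean, elastic_net_penalty) are hypothetical, and the penalty uses one common parameterization, lambda * (alpha * L1 + (1 - alpha) / 2 * L2), which is an assumption here rather than something stated on this page.

```python
import math


def linear_predictor(coefs, x):
    """Dot product of coefficients and features (put a 1.0 in x for the intercept)."""
    return sum(c * xi for c, xi in zip(coefs, x))


# Inverse link functions map the linear predictor to the expected response.
# The names below are illustrative, not MADlib parameter values.
INVERSE_LINKS = {
    "identity": lambda eta: eta,                         # ordinary linear regression
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),   # logistic regression
    "log": lambda eta: math.exp(eta),                    # e.g. a count response
}


def predict_mean(coefs, x, link="identity"):
    """Expected response for features x under the chosen link."""
    return INVERSE_LINKS[link](linear_predictor(coefs, x))


def elastic_net_penalty(coefs, lambda_, alpha):
    """Combined penalty: alpha=1 is the lasso (L1), alpha=0 is ridge (L2).

    Assumed parameterization: lambda * (alpha * |b|_1 + (1 - alpha) / 2 * |b|_2^2).
    """
    l1 = sum(abs(c) for c in coefs)
    l2 = sum(c * c for c in coefs)
    return lambda_ * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)
```

With coefficients [1.0, 2.0] and features [1.0, 3.0], the identity link returns the linear predictor 7.0 unchanged, while the logit link maps a linear predictor of 0 to a probability of 0.5. Setting alpha between 0 and 1 blends the lasso and ridge penalties.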