alaplace                package:VGAM                R Documentation

_A_s_y_m_m_e_t_r_i_c _L_a_p_l_a_c_e _D_i_s_t_r_i_b_u_t_i_o_n _F_a_m_i_l_y _F_u_n_c_t_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n:

     Maximum likelihood estimation of the 1, 2 and 3-parameter
     asymmetric Laplace distributions (ALDs). The 1-parameter ALD may
     be used for quantile regression.

_U_s_a_g_e:

     alaplace1(tau = NULL, llocation = "identity", elocation = list(),
               ilocation = NULL, kappa = sqrt(tau/(1 - tau)), Scale.arg = 1,
               shrinkage.init = 0.95, parallelLocation = FALSE, digt = 4,
               dfmu.init = 3, method.init = 1, zero = NULL)

     alaplace2(tau = NULL,  llocation = "identity", lscale = "loge",
               elocation = list(), escale = list(),
               ilocation = NULL, iscale = NULL, kappa = sqrt(tau/(1 - tau)),
               shrinkage.init = 0.95,
               parallelLocation = FALSE, digt = 4, sameScale = TRUE,
               dfmu.init = 3, method.init = 1, zero = "(1 + M/2):M")

     alaplace3(llocation = "identity", lscale = "loge", lkappa = "loge",
               elocation = list(), escale = list(), ekappa = list(),
               ilocation = NULL, iscale = NULL, ikappa = 1,
               method.init = 1, zero = 2:3)

_A_r_g_u_m_e_n_t_s:

tau, kappa: Numeric vectors with 0 < tau < 1 and kappa >0. Most users
          will only specify 'tau' since the estimated location
          parameter corresponds to the tauth regression quantile, which
          is easier to understand. See below for details.

llocation, lscale, lkappa: Character. Parameter link functions for
          location parameter xi, scale parameter sigma, asymmetry
          parameter kappa. See 'Links' for more choices. For example,
          the argument 'llocation' can help handle count data by
          restricting the quantiles to be positive (use
          'llocation="loge"'). However, 'llocation' is best left alone
          since the theory only works properly with the identity link.

elocation, escale, ekappa: List. Extra argument for each of the links.
          See 'earg' in 'Links' for general information.

ilocation, iscale, ikappa: Optional initial values. If given, it must
          be numeric and values are recycled to the appropriate length.
          The default is to choose the value internally. 

parallelLocation: Logical. Should the quantiles be parallel on the
          transformed scale (argument 'llocation')? Assigning this
          argument to 'TRUE' circumvents the seriously embarrassing
          quantile crossing problem.

sameScale: Logical. Should the scale parameters be equal? It is advised
          to keep 'sameScale=TRUE' unchanged because it does not make
          sense to have different values for each 'tau' value.

method.init: Initialization method. Either the value 1, 2, 3 or 4.

dfmu.init: Degrees of freedom for the cubic smoothing spline fit
          applied to get an initial estimate of the location parameter.
          See 'vsmooth.spline'. Used only when 'method.init=3'.

shrinkage.init: How much shrinkage is used when initializing xi. The
          value must be between 0 and 1 inclusive, and a value of 0
          means the individual response values are used, and a value of
          1 means the median or mean is used. This argument is used
          only when 'method.init=4'.

Scale.arg: The value of the scale parameter sigma. This argument may be
          used to compute quantiles at different tau values from an
          existing fitted 'alaplace2()' model (practical only if it has
          a single value). If the model has 'parallelLocation = TRUE'
          then only the intercept need be estimated; use an offset. See
          below for an example.

   digt : Passed into 'Round' as the 'digits' argument for the 'tau'
          values; used cosmetically for labelling.

    zero: See 'CommonVGAMffArguments' for more information. Where
          possible, the default is to model all the sigma and kappa as
          an intercept-only term.

_D_e_t_a_i_l_s:

     These 'VGAM' family functions implement one variant of asymmetric
     Laplace distributions (ALDs) suitable for quantile regression.
     Kotz et al. (2001) call it _the_ ALD. Its density function is

 f(y;xi,sigma,kappa) =  (sqrt(2)/sigma) * (kappa/(1+ kappa^2)) * exp( -(sqrt(2) / (sigma * kappa)) * |y-xi| )

     for y <= xi, and

 f(y;xi,sigma,kappa) =  (sqrt(2)/sigma) * (kappa/(1+ kappa^2)) * exp( - (sqrt(2) * kappa / sigma) * |y-xi| )

     for y > xi. Here, the ranges are for all real y and xi, positive
     sigma and positive kappa. The special case kappa=1 corresponds to
     the (symmetric) Laplace distribution of Kotz et al. (2001). The
     mean is xi + sigma * (1/kappa - kappa) / sqrt(2) and the variance
     is sigma^2 * (1 + kappa^4) / (2 * kappa^2). The enumeration of the
     linear/additive predictors used here is to first have all the
     location parameters, followed by all the scale parameters.
     Finally, for 'alaplace3()', the last one is the asymmetry
     parameter.

     It is known that the maximum likelihood estimate of the location
     parameter xi corresponds to the regression quantile estimate of
     the classical quantile regression approach of Koenker and Bassett
     (1978). An important property of the ALD is that P(Y <=  xi) = tau
     where  tau = kappa^2 / (1 + kappa^2) so that kappa = sqrt(tau /
     (1-tau)). Thus 'alaplace1()' may be used as an alternative to 'rq'
     in the 'quantreg' package.

     In general the response must be a vector or a 1-column matrix. For
     'alaplace1()' and 'alaplace2()' the number of linear/additive
     predictors is dictated by the length of 'tau' or 'kappa'.

_V_a_l_u_e:

     An object of class '"vglmff"' (see 'vglmff-class'). The object is
     used by modelling functions such as 'vglm' and 'vgam'.

     In the 'extra' slot of the fitted object are some list components
     which are useful, e.g., the sample proportion of values which are
     less than the fitted quantile curves.

_W_a_r_n_i_n_g:

     The MLE regularity conditions do not hold for this distribution so
     that misleading inferences may result, e.g., in the 'summary' and
     'vcov' of the object.

     Care is needed with 'tau' values which are too small, e.g., for
     count data with 'llocation="loge"' and if the sample proportion of
     zeros is greater than 'tau'.

_N_o_t_e:

     These 'VGAM' family functions use Fisher scoring. Convergence may
     be slow and half-stepping is usual (although one can use
     'trace=TRUE' to see which is the best model and then use 'maxit'
     to fit that model).

     For large data sets it is a very good idea to keep the length of
     'tau'/'kappa' low to avoid large memory requirements. Then for
     'parallelLoc=FALSE' one can repeatedly fit a model with
     'alaplace1()' with one tau at a time; and for 'parallelLoc=TRUE'
     one can refit a model with 'alaplace1()' with one tau at a time
     but using offsets and an intercept-only model.

     A second method for solving the noncrossing quantile problem is
     illustrated below in Example 3. This is called the _accumulative
     quantile method_ (AQM) and details are in Yee (2009). It does not
     make the strong parallelism assumption.

     The functions 'alaplace2()' and 'laplace' differ slightly in terms
     of the parameterizations.

_A_u_t_h_o_r(_s):

     Thomas W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Koenker, R. and Bassett, G. (1978) Regression quantiles.
     _Econometrica_, *46*, 33-50.

     Kotz, S., Kozubowski, T. J. and Podgorski, K. (2001) _The Laplace
     distribution and generalizations: a revisit with applications to
     communications, economics, engineering, and finance_, Boston:
     Birkhauser.

     Yee, T. W. (2009) Quantile regression for counts and proportions.
     In preparation.

_S_e_e _A_l_s_o:

     'ralap', 'laplace', 'lms.bcn', 'amlnormal'.

_E_x_a_m_p_l_e_s:

     # Example 1: quantile regression with smoothing splines
     alldat = data.frame(x = sort(runif(n <- 500)))
     mymu = function(x) exp(-2 + 6*sin(2*x-0.2) / (x+0.5)^2)
     alldat = transform(alldat, y = rpois(n, lambda=mymu(x)))
     mytau = c(0.25, 0.75); mydof = 4

     fit = vgam(y ~ s(x, df=mydof), alaplace1(tau=mytau, llocation="loge",
                parallelLoc=FALSE), data=alldat, trace=TRUE)
     fitp = vgam(y ~ s(x, df=mydof), alaplace1(tau=mytau, llocation="loge",
                 parallelLoc=TRUE), data=alldat, trace=TRUE)
      
     ## Not run: 
     par(las=1)
     mylwd = 1.5
     with(alldat, plot(x, jitter(y, factor=0.5), col="red",
                       main="Example 1; green: parallelLoc=TRUE",
                       ylab="y", pch="o", cex=0.75))
     with(alldat, matlines(x, fitted(fit), col="blue", lty="solid", lwd=mylwd))
     with(alldat, matlines(x, fitted(fitp), col="green", lty="solid", lwd=mylwd))
     finexgrid = seq(0, 1, len=1001)
     for(ii in 1:length(mytau))
         lines(finexgrid, qpois(p=mytau[ii], lambda=mymu(finexgrid)),
               col="blue", lwd=mylwd)
     ## End(Not run)
     fit@extra  # Contains useful information

     # Example 2: regression quantile at a new tau value from an existing fit
     # Nb. regression splines are used here since it is easier.
     fitp2 = vglm(y ~ bs(x, df=mydof),
                  family = alaplace1(tau=mytau, llocation="loge",
                                     parallelLoc=TRUE),
                  data=alldat, trace=TRUE)

     newtau = 0.5  # Want to refit the model with this tau value
     fitp3 = vglm(y ~ 1 + offset(predict(fitp2)[,1]),
                 family = alaplace1(tau=newtau, llocation="loge"),
                  data=alldat)
     ## Not run: 
     with(alldat, plot(x, jitter(y, factor=0.5), col="red", ylab="y",
                       pch="o", cex=0.75,
                       main="Example 2; parallelLoc=TRUE"))
     with(alldat, matlines(x, fitted(fitp2), col="blue", lty="solid", lwd=mylwd))
     with(alldat, matlines(x, fitted(fitp3), col="black", lty="solid", lwd=mylwd))
     ## End(Not run)


     # Example 3: noncrossing regression quantiles using a trick: obtain
     # successive solutions which are added to previous solutions; use a log
     # link to ensure an increasing quantiles at any value of x.

     mytau = seq(0.2, 0.9, by=0.1)
     answer = matrix(0, nrow(alldat), length(mytau)) # Stores the quantiles
     alldat = transform(alldat, offsety=y*0)
     usetau = mytau
     for(ii in 1:length(mytau)) {
     #   cat("\n\nii =", ii, "\n")
         alldat = transform(alldat, usey=y-offsety)
         iloc = ifelse(ii==1, with(alldat, median(y)), 1.0) # Well-chosen!
         mydf = ifelse(ii==1, 5, 3)  # Maybe less smoothing will help
         lloc = ifelse(ii==1, "identity", "loge")  # 2nd value must be "loge"
         fit3 = vglm(usey ~ ns(x, df=mydf), data=alldat, trace=TRUE,
                     fam=alaplace1(tau=usetau[ii], lloc=lloc, iloc=iloc))
         answer[,ii] = (if(ii==1) 0 else answer[,ii-1]) + fitted(fit3)
         alldat = transform(alldat, offsety=answer[,ii])
     }

     # Plot the results.
     ## Not run: 
     with(alldat, plot(x, y, col="blue",
          main=paste("Noncrossing and nonparallel; tau =",
                     paste(mytau, collapse=", "))))
     with(alldat, matlines(x, answer, col="red", lty=1))

     # Zoom in near the origin.
     with(alldat, plot(x, y, col="blue", xlim=c(0, 0.2), ylim=0:1,
          main=paste("Noncrossing and nonparallel; tau =",
                     paste(mytau, collapse=", "))))
     with(alldat, matlines(x, answer, col="red", lty=1))
     ## End(Not run)

