loglaplace               package:VGAM               R Documentation

_L_o_g-_L_a_p_l_a_c_e _a_n_d _L_o_g_i_t-_L_a_p_l_a_c_e _D_i_s_t_r_i_b_u_t_i_o_n _F_a_m_i_l_y _F_u_n_c_t_i_o_n_s

_D_e_s_c_r_i_p_t_i_o_n:

     Maximum likelihood estimation of the 1-parameter log-Laplace and
     the 1-parameter logit-Laplace distributions. These may be used for
     quantile regression for counts and proportions respectively.

_U_s_a_g_e:

     loglaplace1(tau = NULL, llocation = "loge", elocation = list(),
         ilocation = NULL, kappa = sqrt(tau/(1 - tau)), Scale.arg = 1,
         shrinkage.init = 0.95, parallelLocation = FALSE, digt = 4,
         dfmu.init = 3, rep0 = 0.5, 
         minquantile = 0, maxquantile = Inf,
         method.init = 1, zero = NULL)
     logitlaplace1(tau = NULL, llocation = "logit", elocation = list(),
         ilocation = NULL, kappa = sqrt(tau/(1 - tau)),
         Scale.arg = 1, shrinkage.init = 0.95, parallelLocation = FALSE,
         digt = 4, dfmu.init = 3, rep01 = 0.5, method.init = 1, zero = NULL)

_A_r_g_u_m_e_n_t_s:

tau, kappa: See 'alaplace1'.

llocation: Character. Parameter link functions for location parameter
          xi. See 'Links' for more choices. However, this argument
          should be left unchanged with count data because it restricts
          the quantiles to be positive. With proportions data 
          'llocation' can be assigned a link such as 'logit', 
          'probit',  'cloglog',  etc.

elocation: List. Extra argument for each of the links. See 'earg' in
          'Links' for general information.

ilocation: Optional initial values. If given, it must be numeric and
          values are recycled to the appropriate length. The default is
          to choose the value internally.

parallelLocation: Logical. Should the quantiles be parallel on the
          transformed scale (argument 'llocation')? Assigning this
          argument to 'TRUE' circumvents the seriously embarrassing
          quantile crossing problem.

method.init: Initialization method. Either the value 1, 2, or ....

dfmu.init, shrinkage.init, Scale.arg, digt, zero: See 'alaplace1'.

rep0, rep01: Numeric, positive. Replacement values for 0s and 1s
          respectively. For count data, values of the response whose
          value is 0 are replaced by 'rep0'; it avoids computing
          'log(0)'. For proportions data values of the response whose
          value is 0 or 1 are replaced by 'min(rangey01[1]/2,
          rep01/w[y<=0])' and 'max((1 + rangey01[2])/2, 1-rep01/w[y >=
          1])' respectively; e.g., it avoids computing 'logit(0)' or
          'logit(1)'. Here, 'rangey01' is the 2-vector 'range(y[(y > 0)
          & (y < 1)])' of the response.

minquantile, maxquantile: Numeric. The minimum and maximum values
          possible in the quantiles. These argument are effectively
          ignored by default since 'loge' keeps all quantiles positive.
          However, if  'llocation = "logoff", elocation =
          list(offset=1)' then it is possible that the fitted quantiles
          have value 0 because 'minquantile=0'.

_D_e_t_a_i_l_s:

     These 'VGAM' family functions implement translations of the
     asymmetric Laplace distribution (ALD). The resulting variants may
     be suitable for quantile regression for count data or sample
     proportions. For example, a log link applied to count data is
     assumed to follow an ALD. Another example is a logit link applied
     to proportions data so as to follow an ALD. A positive random
     variable Y is said to have a log-Laplace distribution if Y =
     exp(W) where W has an ALD. There are many variants of ALDs and the
     one used here is described in 'alaplace1'.

_V_a_l_u_e:

     An object of class '"vglmff"' (see 'vglmff-class'). The object is
     used by modelling functions such as 'vglm' and 'vgam'.

     In the 'extra' slot of the fitted object are some list components
     which are useful. For example, the sample proportion of values
     which are less than the fitted quantile curves, which is
     'sum(wprior[y <= location]) / sum(wprior)' internally. Here,
     'wprior' are the prior weights (called 'ssize' below), 'y' is the
     response and 'location' is a fitted quantile curve. This
     definition comes about naturally from the transformed ALD data.

_W_a_r_n_i_n_g:

     The 'VGAM' family function 'logitlaplace1' will not handle a
     vector of just 0s and 1s as the response; it will only work
     satisfactorily if the number of trials is large.

     See 'alaplace1' for other warnings. Care is needed with 'tau'
     values which are too small, e.g., for count data the sample
     proportion of zeros must be less than all values in 'tau'.
     Similarly, this also holds with 'logitlaplace1', which also
     requires all 'tau' values to be less than the sample proportion of
     ones.

_N_o_t_e:

     The form of input for 'logitlaplace1' as response is a vector of
     proportions (values in [0,1]) and the number of trials is entered
     into the 'weights' argument of 'vglm'/'vgam'. See Example 2 below.
     See 'alaplace1' for other notes in general.

_A_u_t_h_o_r(_s):

     Thomas W. Yee

_R_e_f_e_r_e_n_c_e_s:

     Kotz, S., Kozubowski, T. J. and Podgorski, K. (2001) _The Laplace
     distribution and generalizations: a revisit with applications to
     communications, economics, engineering, and finance_, Boston:
     Birkhauser.

     Kozubowski, T. J. and Podgorski, K. (2003) Log-Laplace
     distributions. _International Mathematical Journal_, *3*, 467-495.

     Yee, T. W. (2009) Quantile regression for counts and proportions.
     In preparation.

_S_e_e _A_l_s_o:

     'alaplace1', 'dloglap'.

_E_x_a_m_p_l_e_s:

     # Example 1: quantile regression of counts with regression splines
     set.seed(123); my.k = exp(0)
     alldat = data.frame(x2 = sort(runif(n <- 500)))
     mymu = function(x) exp( 1 + 3*sin(2*x) / (x+0.5)^2)
     alldat = transform(alldat, y = rnbinom(n, mu=mymu(x2), size=my.k))
     mytau = c(0.1, 0.25, 0.5, 0.75, 0.9); mydof = 3
     fitp = vglm(y ~ bs(x2, df=mydof), data=alldat, trace=TRUE,
                 loglaplace1(tau=mytau, parallelLoc=TRUE)) # halfstepping is usual
      
     ## Not run: 
     par(las=1)  # Plot on a log1p() scale
     mylwd = 1.5
     with(alldat, plot(x2, jitter(log1p(y), factor=1.5), col="red", pch="o",
          main="Example 1; darkgreen=truth, blue=estimated", cex=0.75))
     with(alldat, matlines(x2, log1p(fitted(fitp)), col="blue", lty=1, lwd=mylwd))
     finexgrid = seq(0, 1, len=201)
     for(ii in 1:length(mytau))
         lines(finexgrid, col="darkgreen", lwd=mylwd,
               log1p(qnbinom(p=mytau[ii], mu=mymu(finexgrid), si=my.k)))
     ## End(Not run)
     fitp@extra  # Contains useful information

     # Example 2: sample proportions
     set.seed(123); nnn = 1000; ssize = 100  # ssize = 1 will not work!
     alldat = data.frame(x2 = sort(runif(nnn)))
     mymu = function(x) logit( 1.0 + 4*x, inv=TRUE)
     alldat = transform(alldat, ssize = ssize,
                        y2 = rbinom(nnn, size=ssize, prob=mymu(x2)) / ssize)

     mytau = c(0.25, 0.50, 0.75)
     fit1 = vglm(y2 ~ bs(x2, df=3), data=alldat, weights=ssize, trace=TRUE,
                 logitlaplace1(tau=mytau, lloc="cloglog", paral=TRUE))

     ## Not run: 
     # Check the solution.  Note: this may be like comparing apples with oranges.
     plotvgam(fit1, se=TRUE, scol="red", lcol="blue", main="Truth = 'darkgreen'")
     # Centered approximately !
     linkFunctionChar = as.character(fit1@misc$link)
     alldat = transform(alldat, trueFunction=
                        theta2eta(theta=mymu(x2), link=linkFunctionChar))
     with(alldat, lines(x2, trueFunction - mean(trueFunction), col="darkgreen"))

     # Plot the data + fitted quantiles (on the original scale)
     myylim = with(alldat, range(y2))
     with(alldat, plot(x2, y2, col="blue", ylim=myylim, las=1, pch=".", cex=2.5))
     with(alldat, matplot(x2, fitted(fit1), add=TRUE, lwd=3, type="l"))
     truecol = rep(1:3, len=fit1@misc$M) # Add the 'truth'
     smallxgrid = seq(0, 1, len=501)
     for(ii in 1:length(mytau))
         lines(smallxgrid, col=truecol[ii], lwd=2,
               qbinom(p=mytau[ii], prob=mymu(smallxgrid), size=ssize) / ssize)

     # Plot on the eta (== logit()/probit()/...) scale
     with(alldat, matplot(x2, predict(fit1), add=FALSE, lwd=3, type="l"))
     # Add the 'truth'
     for(ii in 1:length(mytau)) {
         true.quant = qbinom(p=mytau[ii], pr=mymu(smallxgrid), si=ssize)/ssize
         lines(smallxgrid, theta2eta(theta=true.quant, link=linkFunctionChar),
               col=truecol[ii], lwd=2)
     }
     ## End(Not run)

