fill                  package:VGAM                  R Documentation

_C_r_e_a_t_e_s _a _M_a_t_r_i_x _o_f _A_p_p_r_o_p_r_i_a_t_e _D_i_m_e_n_s_i_o_n

_D_e_s_c_r_i_p_t_i_o_n:

     A support function for the argument 'xij', it generates a matrix
     of an appropriate dimension.

_U_s_a_g_e:

     fill(x, values = 0, ncolx = ncol(x))

_A_r_g_u_m_e_n_t_s:

       x: A vector or matrix which is used to determine the dimension
          of the answer, in particular, the number of rows.  After
          converting 'x' to a matrix if necessary, the answer is a
          matrix of 'values' values, of dimension 'nrow(x)' by 'ncolx'.

  values: Numeric. The answer contains these values, which are recycled
          _columnwise_ if necessary, i.e., as 'matrix(values, ...,
          byrow=TRUE)'.

   ncolx: The number of columns of the returned matrix. The default is
          the number of columns of 'x'.

_D_e_t_a_i_l_s:

     The 'xij' argument for 'vglm' allows the user to input variables
     specific to each linear/additive predictor. For example, consider
     the bivariate logit model where the first/second linear/additive
     predictor is the logistic regression of the first/second binary
     response respectively. The third linear/additive predictor is
     'log(OR) = eta3', where 'OR' is the odds ratio.  If one has ocular
     pressure as a covariate in this model then 'xij' is required to
     handle the ocular pressure for each eye, since these will be
     different in general. [This contrasts with a variable such as
     'age', the age of the person, which has a common value for both
     eyes.]  In order to input these data into 'vglm' one often finds
     that functions 'fill', 'fill1', etc. are useful.

     All terms in the 'xij' and 'formula' arguments in 'vglm' must
     appear in the 'form2' argument too.

_V_a_l_u_e:

     'matrix(values, nrow=nrow(x), ncol=ncolx)', i.e., a matrix
     consisting of values 'values', with the number of rows matching
     'x', and the default number of columns is the number of columns of
     'x'.

_N_o_t_e:

     The effect of the 'xij' argument is after other arguments such as
     'exchangeable' and 'zero'. Hence 'xij' does not affect constraint
     matrices.

     Additionally, there are currently 3 other identical 'fill'
     functions, called 'fill1', 'fill2' and 'fill3'; if you need more
     then assign 'fill4 = fill5 = fill1' etc. The reason for this is
     that if more than one 'fill' function is needed then they must be
     unique. For example, if M=4 then  'xij = op ~ lop + rop +
     fill(mop) + fill(mop)' would reduce to  'xij = op ~ lop + rop +
     fill(mop)', whereas 'xij = op ~ lop + rop + fill1(mop) +
     fill2(mop)' would retain all M terms, which is needed.

     In Examples 1 to 3 below, the 'xij' argument illustrates
     covariates that are specific to a linear predictor. Here,
     'lop'/'rop' are the ocular pressures of the left/right eye in an
     artificial dataset, and 'mop' is their mean. Variables 'leye' and
     'reye' might be the presence/absence of a particular disease on
     the LHS/RHS eye respectively.

     In Example 3, the 'xij' argument illustrates fitting the
     (exchangeable) model where there is a common smooth function of
     the ocular pressure. One should use regression splines since 's'
     in 'vgam' does not handle the 'xij' argument.  However, regression
     splines such as 'bs' and 'ns' need to have the same basis
     functions here for both functions, and Example 3 illustrates a
     trick involving a function 'BS' to obtain this, e.g., same knots.
     Although regression splines create more than a single column per
     term in the model matrix, 'fill(BS(lop,rop))' creates the required
     (same) number of columns.

_A_u_t_h_o_r(_s):

     T. W. Yee

_R_e_f_e_r_e_n_c_e_s:

     More information can be found at <URL:
     http://www.stat.auckland.ac.nz/~yee>.

_S_e_e _A_l_s_o:

     'vglm.control', 'vglm', 'multinomial'.

_E_x_a_m_p_l_e_s:

     fill(runif(5))
     fill(runif(5), ncol=3)
     fill(runif(5), val=1, ncol=3)

     # Generate eyes data for the examples below. Eyes are independent (OR=1).
     nn = 1000  # Number of people
     eyesdat = data.frame(lop = round(runif(nn), 2),
                          rop = round(runif(nn), 2),
                          age = round(rnorm(nn, 40, 10)))
     eyesdat = transform(eyesdat,
         mop = (lop + rop) / 2,       # Mean ocular pressure
         op  = (lop + rop) / 2,       # Value unimportant unless plotting
     #   op  =  lop,                  # Choose this if plotting
         eta1 = 0 - 2*lop + 0.04*age, # Linear predictor for left eye
         eta2 = 0 - 2*rop + 0.04*age) # Linear predictor for right eye
     eyesdat = transform(eyesdat,
         leye = rbinom(nn, size=1, prob=logit(eta1, inverse=TRUE)),
         reye = rbinom(nn, size=1, prob=logit(eta2, inverse=TRUE)))

     # Example 1
     # All effects are linear
     fit1 = vglm(cbind(leye,reye) ~ op + age,
                 family = binom2.or(exchangeable=TRUE, zero=3),
                 data=eyesdat, trace=TRUE,
                 xij = list(op ~ lop + rop + fill(lop)),
                 form2 =  ~ op + lop + rop + fill(lop) + age)
     head(model.matrix(fit1, type="lm"))   # LM model matrix
     head(model.matrix(fit1, type="vlm"))  # Big VLM model matrix
     coef(fit1)
     coef(fit1, matrix=TRUE)  # Unchanged with 'xij'
     constraints(fit1)
     max(abs(predict(fit1)-predict(fit1, new=eyesdat))) # Predicts correctly
     summary(fit1)
     ## Not run: 
     plotvgam(fit1, se=TRUE) # Wrong, e.g., because it plots against op, not lop.
     # So set op=lop in the above for a correct plot.
     ## End(Not run)


     # Example 2
     # Model OR as a linear function of mop
     fit2 = vglm(cbind(leye,reye) ~ op + age,
                 binom2.or(exchangeable=TRUE, zero=NULL),
                 data=eyesdat, trace=TRUE,
                 xij = list(op ~ lop + rop + mop),
                 form2 =  ~ op + lop + rop + mop + age)
     head(model.matrix(fit2, type="lm"))   # LM model matrix
     head(model.matrix(fit2, type="vlm"))  # Big VLM model matrix
     coef(fit2)
     coef(fit2, matrix=TRUE)  # Unchanged with 'xij'
     max(abs(predict(fit2)-predict(fit2, new=eyesdat))) # Predicts correctly
     summary(fit2)
     ## Not run: 
     plotvgam(fit2, se=TRUE) # Wrong because it plots against op, not lop.
     ## End(Not run)

     # Example 3. This model uses regression splines on ocular pressure.
     # It uses a trick to ensure common basis functions.
     BS = function(x, ...) bs(c(x,...), df=3)[1:length(x),,drop=FALSE] # trick

     fit3 = vglm(cbind(leye,reye) ~ BS(lop,rop) + age,
                 family = binom2.or(exchangeable=TRUE, zero=3),
                 data=eyesdat, trace=TRUE,
                 xij = list(BS(lop,rop) ~ BS(lop,rop) +
                                          BS(rop,lop) +
                                          fill(BS(lop,rop))),
                 form2 = ~  BS(lop,rop) + BS(rop,lop) + fill(BS(lop,rop)) +
                            lop + rop + age)
     head(model.matrix(fit3, type="lm"))   # LM model matrix
     head(model.matrix(fit3, type="vlm"))  # Big VLM model matrix
     coef(fit3)
     coef(fit3, matrix=TRUE)
     summary(fit3)
     fit3@smart.prediction
     max(abs(predict(fit3)-predict(fit3, new=eyesdat))) # Predicts correctly
     predict(fit3, new=head(eyesdat))  # Note the 'scalar' OR, i.e., zero=3
     max(abs(head(predict(fit3))-predict(fit3, new=head(eyesdat)))) # Should be 0
     ## Not run: 
     plotvgam(fit3, se=TRUE, xlab="lop") # Correct
     ## End(Not run)

