
4.2 Example: Log-Linear Models

Table 4.2 is a contingency table containing (fictitious) data of the patients at a local counselling service over the past year.

                          Type of Counselling
          Stress   Post        Relationship   Loss of      Other
                   Abortion    Breakdown      Loved One    Types
                   Trauma
Male        65      n/a            21            20          39
Female      31       26            49            52          29
Total       96       26            70            72          68

Table 4.2: Counselling Service Data
 

Suppose a log-linear model is to be fitted to the data. This problem concerns count data and can therefore be modelled by the Poisson distribution. We first define the variables in the problem:

>> Count=[65 31 0 26 21 49 20 52 39 29]';
The makefac command can be used to define Gender and Type (see Section 3.4.2). Count has been defined to contain the first column of the table, then the second column, then the third, etc. Gender, then, should be a variable of length 10 (the same size as Count). There are two levels of this variable ("male" and "female") and they occur one at a time. (That is, according to the ordering of the variable Count, Gender is listed as "male", "female", "male", "female"...so that each gender is listed as a block of size one.) Therefore, Gender can be specified as
>> Gender=makefac(10,2,1);
>> Gender'
ans =
     1     2     1     2     1     2     1     2     1     2
Similarly, makefac can be used for the type of counselling, Type.
>> Type=makefac(10,5,2);
>> Type'
ans =
     1     1     2     2     3     3     4     4     5     5
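
The layout of the two factors can be checked without glmlab. The following is a minimal sketch in base MATLAB, not the makefac implementation: it rebuilds the same two vectors with repmat and kron, and reshapes Count back into the 2-by-5 table. The names nGender, nType and Tbl are introduced here for illustration only.

```matlab
Count = [65 31 0 26 21 49 20 52 39 29]';

nGender = 2;   % levels of Gender (illustrative name)
nType   = 5;   % levels of Type (illustrative name)

% Gender cycles 1,2,1,2,... (blocks of size one)
Gender = repmat((1:nGender)', nType, 1);

% Type repeats each level twice: 1,1,2,2,... (blocks of size two)
Type = kron((1:nType)', ones(nGender, 1));

% Count was entered column by column, so reshaping recovers the table:
% row 1 = male, row 2 = female, columns = types of counselling
Tbl = reshape(Count, nGender, nType);
```

Displaying Tbl gives a quick check that each count sits beside the intended Gender and Type levels.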
 

There is one further problem with the data: males cannot be exposed to post-abortion trauma, so that cell of the table is necessarily empty. This is called a structural zero. To circumvent this problem, define a vector of prior weights that gives zero weight to, and so effectively ignores, the cell corresponding to male post-abortion trauma:

>> Priorw=[1 1 0 1 1 1 1 1 1 1]';
The prior weights could also have been defined in this manner:
>> Priorw=ones(size(Count)); Priorw(3)=0;
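
The position of the zero weight can also be derived from the factors rather than typed by hand. A minimal sketch, assuming the coding used above (Gender level 1 is male, Type level 2 is post-abortion trauma); the factor vectors are rebuilt with base MATLAB functions so that the fragment is self-contained:

```matlab
Gender = repmat([1 2]', 5, 1);        % 1 2 1 2 ... (same values as makefac(10,2,1))
Type   = kron((1:5)', ones(2, 1));    % 1 1 2 2 ... (same values as makefac(10,5,2))

% Weight 0 for the structural zero (male, post-abortion trauma), 1 elsewhere
Priorw = double(~(Gender == 1 & Type == 2));
```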
Having defined the variables, glmlab can be started. As discussed before, if glmlab is already running, declare a new model (in the Options menu).

For this problem, the default normal distribution is no longer adequate. To change the distribution, press the Distribution menu item on the main glmlab window and choose poisson, as shown in Figure 4.2. This changes the link function and the scale parameter to the Poisson defaults, although nothing visibly changes in the glmlab window. Clicking on the Link menu item confirms that the logarithm link, the default link function for the Poisson distribution, is now selected. The variables can now be typed into the main glmlab window. First, type the name of the response variable (Count) and the prior weights (Priorw), as shown in Figure 4.2. After pressing the FIT MODEL button, the following parameter estimates are produced:

 
[Figure: Selecting the Poisson Distribution]

[Figure: Response and Prior Weights Only Entered]
 
 -----------------------------------------
   Estimate        S.E.      Variable
 -----------------------------------------
   3.607910     0.054882   Constant
 -----------------------------------------
Scaled deviance:     50.434008           Link: LOG
Residual df:                 8   Distribution: POISSON
Scale parameter (dispersion parameter):         1.000000
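
These values can be reproduced by hand. A minimal sketch, assuming standard Poisson theory with a log link (this is not glmlab's fitting code): with only a constant in the model, the fitted mean is the average count over the cells with non-zero prior weight, the constant is its logarithm, and its standard error is one over the square root of the total count. The names use, y, mu, b0, se, dev and df are illustrative.

```matlab
Count  = [65 31 0 26 21 49 20 52 39 29]';
Priorw = [1 1 0 1 1 1 1 1 1 1]';

use = Priorw > 0;            % cells that contribute to the fit
y   = Count(use);            % nine weighted counts

mu  = mean(y);               % fitted mean, the same for every weighted cell
b0  = log(mu);               % constant on the log scale
se  = 1 / sqrt(sum(y));      % standard error of the constant

% Poisson deviance; the (y - mu) terms sum to zero for this model
dev = 2 * sum(y .* log(y ./ mu) - (y - mu));
df  = numel(y) - 1;          % residual degrees of freedom

fprintf('%10.6f %10.6f   deviance = %.6f   df = %d\n', b0, se, dev, df);
```

The printed values should agree, up to rounding, with the Constant, S.E., scaled deviance and residual df reported above. Only nine cells are counted because the structural zero carries no weight.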
To add Gender to the covariate list, remember to use the fac command (as Gender is qualitative). Add fac(Gender) to the covariate list and press FIT MODEL again to produce new estimates:
 -----------------------------------------
   Estimate        S.E.      Variable
 -----------------------------------------
   3.590439     0.083044   Constant     
   0.031231     0.110653   Gender(2)
 -----------------------------------------
Scaled Deviance:     50.354244    (change:     -0.079764)
Residual df:                 7    (change:            -1)
Scale parameter (dispersion parameter):         1.000000
Now add the last variable, Type, to the covariate list so that the Covariate area contains both fac(Gender) and fac(Type). Remember to use the fac command again because Type is qualitative. Press the FIT MODEL button to produce new estimates:
 -----------------------------------------
   Estimate        S.E.      Variable
 -----------------------------------------
   3.817497     0.118512   Constant     
   0.104671     0.114489   Gender(2)
  -0.664071     0.227643   Type(2)
  -0.315853     0.157170   Type(3)
  -0.287682     0.155902   Type(4)
  -0.344840     0.158501   Type(5)
 -----------------------------------------
Scaled Deviance:     39.197423    (change:    -11.156820)
Residual df:                 3    (change:            -4)
Scale parameter (dispersion parameter):         1.000000

The variable Type appears four times because it is a qualitative variable with five levels: alongside the constant, only four variables are needed for a five-level factor to maintain full rank. These four variables correspond to levels two, three, four and five of the Type variable.
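
The four Type variables can be pictured as indicator (dummy) columns. A minimal sketch, assuming that fac codes a factor as one 0/1 column per level with the first level as the reference; this matches the labels Type(2) to Type(5) in the output but is not taken from the glmlab source. The names TypeDummies, X, fullRank and Xover are illustrative.

```matlab
Type = kron((1:5)', ones(2, 1));          % 1 1 2 2 3 3 4 4 5 5

% One 0/1 column for each of levels 2..5; level 1 is absorbed by the constant
TypeDummies = double(bsxfun(@eq, Type, 2:5));

X = [ones(10, 1) TypeDummies];            % constant plus four indicators
fullRank = (rank(X) == size(X, 2));       % true: five independent columns

% A fifth indicator (for level 1) adds nothing: the five indicators sum
% to the constant column, so the rank stays at five
Xover = [X double(Type == 1)];
rankOver = rank(Xover);
```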

It is common to want to fit interaction terms during modelling. To fit interaction terms, glmlab uses the character @ (usually SHIFT-2 on the keyboard). See Section 3.4.3. In this problem, the interaction between the two qualitative variables Gender and Type could be included. This can be done in glmlab by entering the following string in the covariate section of the main glmlab window:

fac(Gender), fac(Type), fac(Gender)@fac(Type)
Notice that the fac command has been used again. To define an interaction between variables in glmlab, place the @ character between the interacting variables, and remember to still use fac for qualitative variables.
The interaction variable can be included in the glmlab Covariates area as shown in Figure 4.2.
[Figure: Including an Interaction Term]
 

Pressing the FIT MODEL button then gives the following parameter estimates:

 -----------------------------------------
   Estimate        S.E.      Variable
 -----------------------------------------
   4.174387     0.124035   Constant
  -0.740400     0.218272   Gender(2)
  -0.175891     0.265932   Type(2)
  -1.129865     0.251005   Type(3)
  -1.178655     0.255704   Type(4)
  -0.510826     0.202548   Type(5)
   0.000000      aliased   Gender(2)@Type(2)
   1.587698     0.340103   Gender(2)@Type(3)
   1.695912     0.341868   Gender(2)@Type(4)
   0.444134     0.328278   Gender(2)@Type(5)
 -----------------------------------------
Scaled Deviance:      0.000000    (change:    -39.197423)
Residual df:                 0    (change:            -3)
Scale parameter (dispersion parameter):         1.000000
 

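Notice that the interaction between Gender and Type adds four terms to the model. The residual deviance and the residual degrees of freedom are both zero because the model is now saturated: one parameter is estimated for each of the nine cells with non-zero prior weight, so the fitted values reproduce the observed counts exactly. The label aliased indicates that the variable Gender(2)@Type(2) contains no new information: with the male post-abortion trauma cell given zero weight, it cannot be distinguished from Type(2).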
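
The aliasing can be seen directly from the design matrix. A minimal sketch, assuming that fac builds 0/1 indicator columns with the first level as the reference and that @ forms products of those columns; this is consistent with the labels in the output but is not taken from the glmlab source. The names G2, T, GT, X, use, sameColumn, r and b0 are illustrative.

```matlab
Count  = [65 31 0 26 21 49 20 52 39 29]';
Priorw = [1 1 0 1 1 1 1 1 1 1]';
Gender = repmat([1 2]', 5, 1);
Type   = kron((1:5)', ones(2, 1));

G2 = double(Gender == 2);                  % Gender(2)
T  = double(bsxfun(@eq, Type, 2:5));       % Type(2) .. Type(5)
GT = bsxfun(@times, G2, T);                % Gender(2)@Type(2) .. Gender(2)@Type(5)

X   = [ones(10, 1) G2 T GT];               % ten columns, as in the output
use = Priorw > 0;                          % drop the structural zero

% Among the nine weighted cells, Type(2) and Gender(2)@Type(2) coincide,
% because the only remaining post-abortion trauma cell is the female one
sameColumn = isequal(T(use, 1), GT(use, 1));

r = rank(X(use, :));                       % nine independent columns out of ten

% With nine independent columns and nine cells the fit is exact; for
% example, the constant is the log of the male stress count
b0 = log(Count(1));
```

Here r equals the number of weighted cells, which is why no residual degrees of freedom remain, and b0 matches the Constant in the output above.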

 


Peter Dunn
