Stereotype Logit Model

Authors
  • Kevin Navarrete-Parra

I am writing quick and easy R guides for my didactic purposes and to provide useful starting places for my peers in grad school. If you see that I have made a mistake or would like to suggest some way to make the post better or more accurate, please feel free to email me. I am always happy to learn from others' experiences!

Table of contents

  1. Model Formula
  2. Odds Ratios
  3. Running it in R

Note that I explore much of the information for Sections 2 and 3 more thoroughly in other posts. Of particular utility will be my posts on logit models and continuation ratio models.

Model Formula

The stereotype logit (SL) model is an interesting alternative to the proportional odds (PO) model because, as we will see below, it adds a series of nuances to categorical data exploration. One of the primary reasons to use it instead of the PO model is that the proportional odds assumption does not hold, which is common in practice. Of course, you could fit the partial proportional odds (PPO) or generalized ordered logit (GOL) models if your data don't follow the PO assumption, but the SL model might be more interesting.

The model's equation is as follows:

$$\text{logit}[\pi(j, J)] = \ln\left(\frac{P(Y = j \mid x_1, x_2, \ldots, x_p)}{P(Y = J \mid x_1, x_2, \ldots, x_p)}\right) = \alpha_j + \phi_j(\beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_p X_p)$$

where $J$ is the baseline (i.e., last) category for this equation, $j = 1, 2, \ldots, J - 1$, $Y$ is the ordinal response variable with $J$ categories, the $\alpha_j$ are the intercepts, the $\beta$ values are the coefficients for the $X$ variables, and the $\phi_j$ are the constraints, or scale parameters, used to determine whether the outcome variable is ordinal. Importantly, the $\phi_j$ are assumed to satisfy

$$1 = \phi_1 > \phi_2 > \phi_3 > \cdots > \phi_{J-1} > \phi_J = 0$$

Notice that the constraints are ordered from one to zero, with each subsequent constraint being smaller than the last. Therefore, $\phi_1 = 1$ and $\phi_J = 0$, with the constraints in the middle falling somewhere in between. This forces the model to treat the outcome categories as ordered. Suppose, then, that you have an outcome variable where $J = 4$. Because $\phi_1 = 1$ and $\phi_4 = 0$ are fixed, the only constraints left to estimate are $\phi_2$ and $\phi_3$.
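To make the formula concrete, here is a minimal base-R sketch that turns a set of parameters into category probabilities. All numbers (`alpha`, `phi`, `beta`) and the helper `sl_probs` are hypothetical values chosen for illustration, not estimates from any data, and I am assuming the baseline category has a linear predictor of zero.

```r
# Hypothetical parameter values for a J = 4 category outcome (illustration only)
alpha <- c(0.5, 0.2, -0.1, 0)  # intercepts; baseline category J is fixed at 0
phi   <- c(1, 0.6, 0.3, 0)     # scale parameters: 1 = phi_1 > ... > phi_J = 0
beta  <- 0.8                   # a single slope for one predictor

# Category probabilities implied by logit[pi(j, J)] = alpha_j + phi_j * beta * x
sl_probs <- function(x, alpha, phi, beta) {
  eta <- alpha + phi * beta * x  # linear predictor per category (baseline = 0)
  exp(eta) / sum(exp(eta))       # normalise so the probabilities sum to one
}

p <- sl_probs(x = 1, alpha, phi, beta)
round(p, 3)

# The log odds against the baseline recover alpha_j + phi_j * beta * x
log(p[1:3] / p[4])
```

Changing `x` and re-running `sl_probs` is a quick way to see how the ordered `phi` values spread one coefficient across the categories.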

As you can probably guess from the model formula above, each $\phi_j$ multiplies the coefficients, so for a positive coefficient $\beta$ we get

$$\phi_1\beta > \phi_2\beta > \phi_3\beta > \phi_4\beta$$

assuming we continue with our earlier example (the inequalities reverse when $\beta$ is negative). In this way, the scale parameters (a.k.a. constraints) ensure ordinality in the model.

Naturally, this leads to a minor complication when calculating the odds of being in category $j$ versus category $m$. Rather than simply exponentiating the coefficients, we take the exponential of $(\alpha_j - \alpha_m) + (\phi_j - \phi_m)\beta X$, written here for a single predictor $X$.
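To see where this comes from, subtract the baseline logit for category $m$ from the one for category $j$. The sketch below assumes a single predictor $X$ with coefficient $\beta$ to keep the notation short:

$$
\begin{aligned}
\ln\frac{P(Y = j \mid X)}{P(Y = m \mid X)}
  &= \ln\frac{P(Y = j \mid X)}{P(Y = J \mid X)} - \ln\frac{P(Y = m \mid X)}{P(Y = J \mid X)} \\
  &= (\alpha_j + \phi_j \beta X) - (\alpha_m + \phi_m \beta X) \\
  &= (\alpha_j - \alpha_m) + (\phi_j - \phi_m)\beta X
\end{aligned}
$$

Exponentiating the last line gives the odds, and a one-unit increase in $X$ multiplies those odds by $\exp[(\phi_j - \phi_m)\beta]$.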

Odds Ratios

The stereotype logit model estimates the log odds of being in a given category relative to the baseline. Therefore, the odds of being in category $j$ rather than the baseline are

$$\text{Odds}(Y = j \text{ vs. } Y = J) = \frac{P(Y = j)}{P(Y = J)}$$

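where $j$ can be any category from 1 to $J - 1$. Recall, however, that in the log odds of being in a given category as opposed to the baseline, the coefficients are multiplied by the scale parameter $\phi_j$, as we saw above. The odds ratio for a one-unit increase in a predictor with coefficient $\beta$ is therefore $\exp(\phi_j\beta)$ rather than $\exp(\beta)$.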
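As a small numeric illustration, the base-R sketch below builds a table of odds ratios for every pair of categories. The values of `phi` and `beta` are hypothetical and picked only for the example; the last column of the table is the comparison against the baseline.

```r
phi  <- c(1, 0.6, 0.3, 0)  # hypothetical scale parameters for J = 4
beta <- 0.8                # hypothetical coefficient for one predictor

# Odds ratio for category j (row) versus category m (column)
# for a one-unit increase in the predictor: exp((phi_j - phi_m) * beta)
or_table <- exp(outer(phi, phi, "-") * beta)
dimnames(or_table) <- list(j = 1:4, m = 1:4)

round(or_table, 3)
```

Because $\phi_J = 0$, the entries in the last column reduce to $\exp(\phi_j\beta)$, which shrink toward one as $j$ approaches the baseline.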

Running it in R

Running the stereotype logit model in R works with the rrvglm function from the VGAM package. An SL model would look like the following:

library(VGAM)

sl.model <- rrvglm(y ~ x1 + x2 + x3, family = multinomial, Rank = 1, data = data)
summary(sl.model)

where Rank = 1 (note the capital R in the argument name) indicates to R that the model is a one-dimensional SL model.
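If you want to try the call without real data, one option is to simulate an outcome from a stereotype logit model. The sketch below uses base R only. The parameter values are hypothetical, and the variable names (`y`, `x1`, `x2`, `x3`, `data`) are chosen simply to match the call above. I am assuming the last level of `y` acts as the baseline, which I believe is the default reference level for VGAM's multinomial family.

```r
set.seed(123)
n <- 500

# Three hypothetical predictors
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rbinom(n, 1, 0.5)

# Hypothetical "true" parameters for a J = 4 outcome (baseline = category 4)
alpha <- c(0.3, 0.1, -0.2, 0)
phi   <- c(1, 0.7, 0.4, 0)
beta  <- c(0.9, -0.5, 0.6)

# Shared linear combination beta_1 * x1 + beta_2 * x2 + beta_3 * x3
lin <- as.vector(cbind(x1, x2, x3) %*% beta)

# n x 4 matrix of linear predictors: alpha_j + phi_j * lin
eta  <- outer(lin, phi) + matrix(alpha, n, 4, byrow = TRUE)
prob <- exp(eta) / rowSums(exp(eta))

# Draw one category per observation from its own probability vector
y <- apply(prob, 1, function(p) sample(1:4, size = 1, prob = p))

data <- data.frame(y = factor(y, levels = 1:4), x1, x2, x3)
table(data$y)
```

Fitting the model to data like these lets you compare the estimates against values you chose yourself, which is a handy sanity check before moving to real data.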

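After you run your model, be sure to exponentiate the coefficients to get the odds ratio table you will use to interpret the results.

# Exponentiate the coefficient matrix (one column per logit)
sl.or <- exp(coef(sl.model, matrix = TRUE))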