Quantitative Variable: A quantitative variable is naturally measured as a number for which meaningful arithmetic operations make sense. Examples: Height, age, crop yield, GPA, salary, temperature, area, air pollution index (measured in parts per million), etc.
Categorical variable: Any variable that is not quantitative is categorical. Categorical variables take a value that is one of several possible categories. As naturally measured, categorical variables have no numerical meaning. Examples: Hair color, gender, field of study, college attended, political affiliation, status of disease infection.
Ordinal Variables: An ordinal variable is a categorical variable for which the possible values are ordered. Ordinal variables can be considered “in between” categorical and quantitative variables.
Example: Educational level might be categorized as
1: Elementary school education
2: High school graduate
3: Some college
4: College graduate
5: Graduate degree
• In this example (and for many ordinal variables), the quantitative differences between the categories are uneven, even though the differences between the labels are the same. (e.g., the difference between 1 and 2 is four years, whereas the difference between 2 and 3 could be anything from part of a year to several years)
• Thus it does not make sense to take a mean of the values.
• Common mistake: Treating ordinal variables like quantitative variables without thinking about whether this is appropriate in the particular situation at hand.
Ordinal regression: In statistics, ordinal regression (also called "ordinal classification") is a type of regression analysis used for predicting an ordinal variable. The Ordinal Regression procedure allows you to build models, generate predictions, and evaluate the importance of various predictor variables in cases where the dependent (target) variable is ordinal in nature.
Ordinal dependents and linear regression: When you are trying to predict ordinal responses, the usual linear regression models don't work very well. Those methods can work only by assuming that the outcome (dependent) variable is measured on an interval scale. Because this is not true for ordinal outcome variables, the simplifying assumptions on which linear regression relies are not satisfied, and thus the regression model may not accurately reflect the relationships in the data. In particular, linear regression is sensitive to the way you define categories of the target variable. With an ordinal variable, the important thing is the ordering of categories. So, if you collapse two adjacent categories into one larger category, you are making only a small change, and models built using the old and new categorizations should be very similar. Unfortunately, because linear regression is sensitive to the categorization used, a model built before merging categories could be quite different from one built after.
Below are some examples pf ordered logistic regression:
Example 1: A marketing research firm wants to investigate what factors influence the size of soda (small, medium, large or extra large) that people order at a fast-food chain. These factors may include what type of sandwich is ordered (burger or chicken), whether or not fries are also ordered, and age of the consumer. While the outcome variable, size of soda, is obviously ordered, the difference between the various sizes is not consistent. The difference between small and medium is 10 ounces, between medium and large 8, and between large and extra large 12.
Example 2: A researcher is interested in what factors influence modaling in Olympic swimming. Relevant predictors include at training hours, diet, age, and popularity of swimming in the athlete’s home country. The researcher believes that the distance between gold and silver is larger than the distance between silver and bronze.
Example 3: A study looks at factors that influence the decision of whether to apply to graduate school. College juniors are asked if they are unlikely, somewhat likely, or very likely to apply to graduate school. Hence, our outcome variable has three categories. Data on parental educational status, whether the undergraduate institution is public or private, and current GPA is also collected. The researchers have reason to believe that the “distances” between these three points are not equal. For example, the “distance” between “unlikely” and “somewhat likely” may be shorter than the distance between “somewhat likely” and “very likely”.