Thanks to Kit Baum, an update of the stdtable package is now available from SSC. It can be installed by typing ssc install stdtable in Stata. The biggest change is in how the replace option works.
If you specify the replace option, stdtable overwrites the data in memory with the table it creates. This is useful for creating graphs, but the downside is that the original data are no longer in memory. The new version also allows replace(frame_name), which instead replaces the data in the data frame frame_name. The plain replace option still works, and it is the only form allowed in Stata versions before 16, because data frames were introduced in Stata 16.
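A minimal sketch of how the frame variant might be used in a do-file. The frame name stdtab is made up for illustration, and whether that frame has to exist before the call is not stated here, so treat that part as an assumption and check help stdtable.

```stata
* load the example data used further down in this post
use "http://www.maartenbuis.nl/software/mob.dta", clear

* stdtab is a hypothetical frame name; if stdtable complains that the
* frame does not exist, you may first need:  frame create stdtab
* (that is an assumption, see help stdtable)
stdtable row col [fw=pop], replace(stdtab)

* the original data should still be in the current frame
describe

* look at what was stored in the other frame
frame stdtab: describe
```

The frame prefix runs a single command in another frame without switching to it, so graphs can be drawn from the standardized table while the original data stay in memory.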
stdtable standardizes a cross tabulation such that the marginal distributions (row and column totals) correspond to some pre-specified distribution, a technique that goes back to at least Yule (1912). The purpose is to display the association that exists in the table net of the marginal distributions. Consider the example below:
Code:
. use "http://www.maartenbuis.nl/software/mob.dta", clear
(mobility table from the USA collected in 1973)
. tab row col [fw=pop]
       Father's |                    Son's occupation
     occupation | upper non  lower non  upper man  lower man       farm |     Total
----------------+-------------------------------------------------------+----------
upper nonmanual |     1,414        521        302        643         40 |     2,920
lower nonmanual |       724        524        254        703         48 |     2,253
   upper manual |       798        648        856      1,676        108 |     4,086
   lower manual |       756        914        771      3,325        237 |     6,003
           farm |       409        357        441      1,611      1,832 |     4,650
----------------+-------------------------------------------------------+----------
          Total |     4,101      2,964      2,624      7,958      2,265 |    19,912
Many more sons moved from a farm origin to a lower manual occupation (1,611) than the other way around (237). However, the number of people in agriculture declined strongly, so sons had to leave the farm. Moreover, the number of people in lower manual occupations was increasing, offering room for those sons who had to leave the farm. We may want to know whether this asymmetry is completely explained by these changes in the marginal distributions, or whether there is more to it.
Code:
. stdtable row col [fw=pop], format(%5.0f) cellwidth(9)
----------------------------------------------------------------------------------
       Father's |                         Son's occupation
     occupation |  upper non  lower non  upper man  lower man       farm     Total
----------------+-----------------------------------------------------------------
upper nonmanual |         42         24         17         13          4       100
lower nonmanual |         27         30         18         18          6       100
   upper manual |         16         20         33         23          8       100
   lower manual |         11         21         22         34         12       100
           farm |          4          6          9         12         69       100
                |
          Total |        100        100        100        100        100       500
----------------------------------------------------------------------------------
These standardized counts can be interpreted as the row and column percentages that would occur if each occupation were equally likely for both fathers and sons. After standardization, the farm to lower manual cell and the lower manual to farm cell are both 12, so the apparent asymmetry was almost entirely due to changes in the marginal distributions. It is also much clearer now that farming is far more persistent over generations than the other occupations: 69 on the diagonal, against 30 to 42 for the rest.
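To make the idea of standardizing to fixed margins concrete, here is a rough Mata sketch that rescales the observed counts until every row and column total is 100. It uses iterative proportional fitting, a common way to do this kind of standardization. It is not taken from the stdtable source, and the names N, S, and i are only illustrative.

```stata
mata:
// observed counts copied from the tab output above
// (rows: father's occupation, columns: son's occupation)
N = (1414, 521, 302, 643, 40)
N = N \ (724, 524, 254, 703, 48)
N = N \ (798, 648, 856, 1676, 108)
N = N \ (756, 914, 771, 3325, 237)
N = N \ (409, 357, 441, 1611, 1832)

// iterative proportional fitting: alternately rescale the rows and the
// columns so that each total becomes 100; 100 passes is an arbitrary
// but generous choice for a table of this size
S = N
for (i = 1; i <= 100; i++) {
    S = S :* (100 :/ rowsum(S))
    S = S :* (100 :/ colsum(S))
}

// standardized counts, rounded to whole numbers
round(S)

// check: row totals (transposed for display) and column totals
rowsum(S)'
colsum(S)
end
```

If the loop has converged, the rounded matrix should be close to the standardized table above, and the two checks at the end should show totals that are all very near 100.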