use teaching_Understsociety.dta
**12680 observations
**This is NOT a longitudinal dataset. It's one wave only, so we can practice

**Descriptive statistics
summarize
sum age, d
**r(p90) is left behind by the detailed summarize above
gen pctile90=r(p90)
tab sex, sum(age)
ta sex hiqual, col
ta sex hiqual, row
bysort sex: ta sempderived hiqual, col

**Negative values are missing-data codes, so set them to missing
replace scghq2=. if scghq2<0

**Age bands built by hand
gen age_band=1 if age<=20
replace age_band=2 if age>20 & age<=30
replace age_band=3 if age>30 & age<=40
replace age_band=4 if age>40 & age<=50
replace age_band=5 if age>50 & age<=60
replace age_band=6 if age>60 & age<=70
replace age_band=7 if age>70 & age<=80
replace age_band=8 if age>80
**Missing values count as larger than any number, so age>80 also catches them
replace age_band=. if missing(age)
label define age_bands 1 "less20" 2 "20-30" 3 "30-40" 4 "40-50" 5 "50-60" 6 "60-70" 7 "70-80" 8 "more80"
label values age_band age_bands

**The same idea with recode, which creates and labels the new variable in one step
recode age (18 19 = 1 "18 to 19") ///
    (20/29 = 2 "20 to 29") ///
    (30/39 = 3 "30 to 39") (else=.), generate(agegroups) label(agegroups)

**New variables; note the dummies are 0 (not missing) when sex or age is missing
gen agesq=age^2
g female=(sex==2)
gen fem_less50=(sex==2 & age<50)
bysort sex: egen mean_age=mean(age)
ta couple
replace couple=0 if couple==2

**_n is the observation number, _N the number of observations (within each group under bysort)
gen id=_n
gen ID=_N
bysort sex: gen number=_N

**Histograms
histogram age, frequency
histogram age, frequency normal
**Horizontal bars
graph hbar, over(age_band)
**Vertical bars
graph bar, over(scghq2) by(sex)

**One dummy variable per category: educ1, educ2, ...
tab hiqual_dv, gen(educ)

**Merge with the individual health file on the person identifier
sort pidp
merge 1:1 pidp using indiv_health
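**After a merge Stata creates _merge: 1 = master only, 2 = using only, 3 = matched.
**A minimal sketch of checking the result. Keeping only matched cases is an
**assumption for illustration, not part of the original exercise.
tab _merge
keep if _merge==3
drop _merge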