SES Screening

Published

January 1, 2025

Abstract

Exploration of SES screening using household assets (equitytools)

```{stata}
do "$code/setup_p3_legacy.do"
qui use "$data_gen_p3/Pilot_3_Merged.dta" if period == 0, clear

label var mental_health_index "Mental Health Index"
label var total_food_reported "Total food consumption"
label var total_food_purchases "Total food purchases"
```

. /*******
>         Author:         Simon Taye
>         Date:           Dec 8, 2025
>         Purpose:        Add paths for old pilot 3 paths for compatibility
> *******/
. ************************************************
. *****************START: Configs
. ************************************************
. 
. 
. **** Path for Pilot 3 generated data (for quarto)
. global data_gen_p3 "$pilot3/data/generated/p3"

. global data_gen $data_gen_p3

. 
. * Location of calorie data which changes in other places this code is used
. global cal "$data/raw/pilot_3/calories_database.dta"

. * Structure for raw data - Pilot 3
. global ps "$data/raw/pilot_3/04_Phone_surveys"

. global bl "$data/raw/pilot_3/02_Baseline"

. global el "$data/raw/pilot_3/08_Endline_data"

. global fcm "$data/raw/pilot_3/07_Food_Consumption_Measure"

. global census "$data/raw/pilot_3/01_Census"

. global referral "$data/raw/pilot_3/11_Refferals/"

. * Nested structure for intervention data
. global intervention "$data/raw/pilot_3/03_Intervention_data"

. global training "$intervention/03_Treatment dissemination"

. global dropoff "$intervention/02_Drop_off"

. global pickup "$intervention/01_pick_up"

. 
. 
. 
end of do-file

We discussed screening households by SES during the census.

Data in the census

See this spreadsheet for a breakdown. In summary:

  • Flooring material varies little (82% cement)
  • Roofing material: 47% sheet metal, 41% sheet metal + wood
  • Flooring and roofing material are somewhat correlated (e.g. the correlation between cement floor and wood roof is 0.3)
  • No clear connection between community size and the materials used
    • Many communities seem to standardize on a certain roofing material
  • We also have expenditure and income questions in the census (0.3 correlation between them after winsorizing)
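
The winsorized correlation in the last bullet could be computed along these lines. This is only a sketch: `census_expenditure` and `census_income` are hypothetical variable names, and one-sided winsorizing (top-coding) at the 95th percentile is an assumption, not something the census notes specify.

```stata
* Hypothetical variable names; replace with the actual census variables
foreach v of varlist census_expenditure census_income {
    quietly summarize `v', detail
    * Top-code at the 95th percentile (assumed level); missing values stay missing
    generate double `v'_w95 = min(`v', r(p95)) if !missing(`v')
}
correlate census_expenditure_w95 census_income_w95
```

The `if !missing()` guard matters because Stata's `min()` ignores missing arguments, so without it a missing response would be replaced by the 95th percentile.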

Asset-based Wealth Index

In the in-person surveys we ask equity questions (do you own x, where x is a radio, watch, refrigerator, and so on) with the intention of building a wealth index from them. The questions and my index are based on https://www.equitytool.org/ghana/.

Our index is positively and significantly associated with several objective measures of wealth, although it explains little of their variation (R2 between 0.019 and 0.052).

```{stata}
graph combine g1 g2 g3 g4, row(2) title("Wealth Index and objective measures")
```

```{stata}
*| include: true
*| output: asis

estimates clear
qui eststo: reg total_food_purchases wealth_index
qui eststo: reg total_income_hh_95 wealth_index
qui eststo: reg total_assets wealth_index
esttab *, $esttab_opts_md
```

| | Total food purchases | Total Household Income (w 95%) | Total value of assets (durable goods, farmland, animals) |
|:--|--:|--:|--:|
| Wealth Index (Standardized PCA) | 47.22** | 38.95*** | 3073.8*** |
| | (17.43) | (11.35) | (665.3) |
| Constant | 341.0*** | 136.2*** | 10017.2*** |
| | (17.60) | (11.46) | (671.7) |
| Observations | 390 | 390 | 390 |
| R2 | 0.019 | 0.029 | 0.052 |

Concerns

The trend lines look a lot flatter in the raw data than in the binscatter. Binning seems to smooth out the variation, perhaps too much.

If we had screened our existing sample on the wealth index (say, keeping households below a cutoff of 2), we would still have kept the household with the highest assets and most of those reporting fairly high food purchases. Even at a cutoff of 1, we would still keep the highest-asset households.
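
One way to check this claim directly is to count how many high-asset households survive each cutoff. The sketch below uses `total_assets` and `wealth_index` from the analysis data; defining "high-asset" as the top decile is an assumption made here for illustration, and `top_assets` is a hypothetical helper variable.

```stata
* Flag the top decile of total_assets (the 10% threshold is an assumption)
capture drop top_assets
quietly summarize total_assets, detail
generate byte top_assets = total_assets >= r(p90) if !missing(total_assets)

* Count flagged households that would pass each candidate screen
foreach cutoff in 1 1.5 2 {
    quietly count if top_assets == 1 & wealth_index < `cutoff'
    display "Cutoff `cutoff': " r(N) " top-decile asset households retained"
}
```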

```{stata}
*| output: asis
* Analysis sample size (matches Observations in the regression tables)
local N = 390

* Households retained under each candidate wealth-index cutoff
quietly count if wealth_index < 2
local c = r(N)
local cp = string((`c' / `N') * 100, "%5.2f")

quietly count if wealth_index < 1
local c2 = r(N)
local c2p = string((`c2' / `N') * 100, "%5.2f")

quietly count if wealth_index < 1.5
local c3 = r(N)
local c3p = string((`c3' / `N') * 100, "%5.2f")

di "*A cutoff of 2 would give us a sample of `c' / `N' households* (`cp'%); <br>"
di "*A cutoff of 1.5 gives `c3' participants* (`c3p'%) <br>"
di "*A cutoff of 1 gives `c2' participants* (`c2p'%) <br>"
```

A cutoff of 2 would give us a sample of 371 / 390 households (95.13%);
A cutoff of 1.5 gives 359 participants (92.05%)
A cutoff of 1 gives 322 participants (82.56%)

```{stata}
*| include: true
*| fig-cap: Wealth Index Relationship - w/o Binscatter
*| label: figure-2

graph combine scatter1 scatter2, row(1) xsize(22) ysize(8)
```

Wealth Index Relationship - w/o Binscatter


Impact on Outcome Regressions

Neither additionally controlling for wealth nor restricting the sample significantly changes the estimated treatment effects.

```{stata}
*| output: asis
* For each outcome: treatment-only model, then the same model controlling for the wealth index
qui {
  estimates clear
  foreach var of varlist total_food_purchases mental_health_index total_food_reported {
    eststo: reg `var' treated
    sum `var' if e(sample)
    estadd scalar mean = r(mean)
    estadd scalar sd = r(sd)

    eststo: reg `var' treated wealth_index
    sum `var' if e(sample)
    estadd scalar mean = r(mean)
    estadd scalar sd = r(sd)
  }
}

* Third group is total_food_reported, i.e. total food consumption
esttab * , $esttab_opts_md scalars("mean Dep. Var. Mean") nomtitles mgroups("Food Purchases" "Mental Health Index" "Total Food Consumption", pattern(1 0 1 0 1 0)) title("Controlling for Wealth Index")
```

Controlling for Wealth Index

| | Food Purchases | | Mental Health Index | | Total Food Consumption | |
|:--|--:|--:|--:|--:|--:|--:|
| 1 = Treated | -33.57 | -34.85 | 0.118 | 0.120 | 1.306 | -1.073 |
| | (46.25) | (45.88) | (0.0990) | (0.0990) | (52.65) | (51.39) |
| Wealth Index (Standardized PCA) | | 47.36** | | -0.0476 | | 88.17*** |
| | | (17.44) | | (0.0376) | | (19.53) |
| Constant | 367.9*** | 369.6*** | -0.163 | -0.164 | 463.6*** | 466.7*** |
| | (41.90) | (41.56) | (0.0897) | (0.0896) | (47.69) | (46.55) |
| Observations | 390 | 390 | 390 | 390 | 390 | 390 |
| R2 | 0.001 | 0.020 | 0.004 | 0.008 | 0.000 | 0.050 |
| Dep. Var. Mean | 340.3 | 340.3 | -0.0656 | -0.0656 | 464.6 | 464.6 |

```{stata}
*| output: asis
* For each outcome: full sample, then the sample restricted to wealth_index < 1
qui {
  estimates clear
  foreach var of varlist total_food_purchases mental_health_index total_food_reported {
    eststo: reg `var' treated
    sum `var' if e(sample)
    estadd scalar mean = r(mean)
    estadd scalar sd = r(sd)
    estadd local restrict = "No"

    eststo: reg `var' treated if wealth_index < 1
    sum `var' if e(sample)
    estadd scalar mean = r(mean)
    estadd scalar sd = r(sd)
    estadd local restrict = "Yes"
  }
}

* Third group is total_food_reported, i.e. total food consumption
esttab * , $esttab_opts_md scalars("mean Dep. Var. Mean" "restrict Wealth Restriction") nomtitles mgroups("Food Purchases" "Mental Health Index" "Total Food Consumption", pattern(1 0 1 0 1 0)) title("Restricted Sample")
```

Restricted Sample

| | Food Purchases | | Mental Health Index | | Total Food Consumption | |
|:--|--:|--:|--:|--:|--:|--:|
| 1 = Treated | -33.57 | -74.29 | 0.118 | 0.123 | 1.306 | -23.16 |
| | (46.25) | (47.90) | (0.0990) | (0.111) | (52.65) | (47.31) |
| Constant | 367.9*** | 377.7*** | -0.163 | -0.150 | 463.6*** | 446.5*** |
| | (41.90) | (43.29) | (0.0897) | (0.0999) | (47.69) | (42.76) |
| Observations | 390 | 322 | 390 | 322 | 390 | 322 |
| R2 | 0.001 | 0.007 | 0.004 | 0.004 | 0.000 | 0.001 |
| Dep. Var. Mean | 340.3 | 317.0 | -0.0656 | -0.0495 | 464.6 | 427.6 |
| Wealth Restriction | No | Yes | No | Yes | No | Yes |

```{stata}
*| output: asis
* For each outcome: treatment-only model, then a model adding the wealth index
* and a (wealth_index < 1) x treated interaction. Both use the full sample.
qui {
  estimates clear
  gen poor = wealth_index < 1 if !missing(wealth_index)
  gen poorXtreated = poor * treated
  label var poorXtreated "Index < 1 X Treated"
  foreach var of varlist total_food_purchases mental_health_index total_food_reported {
    eststo: reg `var' treated, robust
    sum `var' if e(sample)
    estadd scalar mean = r(mean)
    estadd scalar sd = r(sd)
    estadd local restrict = "No"

    * Note: the main effect of poor is not included; wealth_index enters linearly instead
    eststo: reg `var' treated wealth_index poorXtreated , robust
    sum `var' if e(sample)
    estadd scalar mean = r(mean)
    estadd scalar sd = r(sd)
    estadd local restrict = "Yes"
  }
}

* Third group is total_food_reported; the restrict flag marks models with wealth terms
esttab * , $esttab_opts_md scalars("mean Dep. Var. Mean" "restrict Wealth Controls") nomtitles mgroups("Food Purchases" "Mental Health Index" "Total Food Consumption", pattern(1 0 1 0 1 0)) title("Interaction with wealth index")
```

Interaction with wealth index

| | Food Purchases | | Mental Health Index | | Total Food Consumption | |
|:--|--:|--:|--:|--:|--:|--:|
| 1 = Treated | -33.57 | 94.96 | 0.118 | 0.128 | 1.306 | 78.96 |
| | (49.56) | (83.99) | (0.0984) | (0.152) | (49.15) | (103.2) |
| Wealth Index (Standardized PCA) | | 7.889 | | -0.0501 | | 63.83* |
| | | (25.58) | | (0.0495) | | (25.54) |
| Index < 1 X Treated | | -156.7 | | -0.0101 | | -96.58 |
| | | (83.38) | | (0.146) | | (99.11) |
| Constant | 367.9*** | 368.2*** | -0.163 | -0.165 | 463.6*** | 465.8*** |
| | (45.70) | (45.60) | (0.0890) | (0.0888) | (43.59) | (43.27) |
| Observations | 390 | 390 | 390 | 390 | 390 | 390 |
| R2 | 0.001 | 0.031 | 0.004 | 0.008 | 0.000 | 0.053 |
| Dep. Var. Mean | 340.3 | 340.3 | -0.0656 | -0.0656 | 464.6 | 464.6 |
| Wealth Controls | No | Yes | No | Yes | No | Yes |
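
As a possible robustness check, the interaction could also be specified with Stata's factor-variable notation, which adds the main effect of `poor` automatically. This is a sketch of an alternative specification, not the one that produced the table above: it drops the linear `wealth_index` term, and it assumes `treated` and `poor` are 0/1 indicators as created in the block above.

```stata
* Fully interacted model: treated, poor, and treated x poor
regress total_food_purchases i.treated##i.poor, robust

* Treatment effect among households with wealth_index < 1
lincom 1.treated + 1.treated#1.poor
```

Here the coefficient on `1.treated` is the treatment effect for households at or above the cutoff, and `lincom` reports the effect (with a standard error) for those below it.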

Recovering PCA Weights

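```{stata}
* Reload the full survey data for period 0 and fix a misspelled variable name
use "$data_gen/Pilot_3_All_Surveys.dta" if period == 0 , clear
rename refigerator refrigerator

* First principal component of the asset and housing variables
pca radio television refrigerator cabinet watch account water_source toilet_facilty cooking_fuel floor_mat, component(1)
//matrix list e(r)
```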

Principal components/correlation                 Number of obs    =        390
                                                 Number of comp.  =          1
                                                 Trace            =         10
    Rotation: (unrotated = principal)            Rho              =     0.1745

    --------------------------------------------------------------------------
       Component |   Eigenvalue   Difference         Proportion   Cumulative
    -------------+------------------------------------------------------------
           Comp1 |      1.74481      .512431             0.1745       0.1745
           Comp2 |      1.23238     .0472736             0.1232       0.2977
           Comp3 |       1.1851       .16909             0.1185       0.4162
           Comp4 |      1.01601     .0848556             0.1016       0.5178
           Comp5 |      .931157     .0651169             0.0931       0.6109
           Comp6 |       .86604     .0489633             0.0866       0.6975
           Comp7 |      .817076     .0322117             0.0817       0.7793
           Comp8 |      .784865      .060054             0.0785       0.8577
           Comp9 |      .724811      .027056             0.0725       0.9302
          Comp10 |      .697755            .             0.0698       1.0000
    --------------------------------------------------------------------------

Principal components (eigenvectors) 

    --------------------------------------
        Variable |    Comp1 | Unexplained 
    -------------+----------+-------------
           radio |   0.3797 |       .7485 
      television |   0.3936 |       .7297 
    refrigerator |   0.3160 |       .8257 
         cabinet |   0.4532 |       .6416 
           watch |   0.4647 |       .6233 
         account |   0.3712 |       .7596 
    water_source |  -0.0926 |        .985 
    toilet_fac~y |  -0.0156 |       .9996 
    cooking_fuel |  -0.0404 |       .9971 
       floor_mat |  -0.1774 |       .9451 
    --------------------------------------
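
The first component explains about 17% of the total variance, and the loadings are carried mostly by the six ownership items; the four housing variables load weakly and negatively. A household's score is the sum of each loading times the z-scored variable, which can be checked against `predict`. The sketch below assumes the `pca` command above was the last estimation command run; `pc1_check`, `pc1_manual`, and `wealth_index_check` are hypothetical names, and whether the `wealth_index` in the merged file was built exactly this way is also an assumption.

```stata
* Variables in the same order as the pca call above
local assets radio television refrigerator cabinet watch account ///
    water_source toilet_facilty cooking_fuel floor_mat

* First-component score as computed by Stata
predict double pc1_check, score

* Rebuild the score by hand: loading times z-scored variable, summed over items
matrix L = e(L)
generate double pc1_manual = 0 if e(sample)
foreach v of local assets {
    quietly summarize `v' if e(sample)
    local w = L[rownumb(L, "`v'"), 1]
    quietly replace pc1_manual = pc1_manual + `w' * (`v' - r(mean)) / r(sd)
}

* The two scores should be essentially perfectly correlated
correlate pc1_check pc1_manual

* Standardized version, comparable to a "Standardized PCA" index
egen wealth_index_check = std(pc1_check)
```

With the loadings in hand, the same weighted sum could in principle be applied to census responses, provided the census asks the same items with the same coding.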