Lumpiness of Food Purchases

Published

April 6, 2026

Abstract

Lumpiness in household food purchases

Within-Household Dispersion

Three measures of within-household volatility—range, CV, and IQR—computed across periods for food purchases and consumption. All arms overlaid for direct comparison.
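
As a rough illustration of how these three measures could be computed, the sketch below collapses a long household-period panel to one row per household. The column names (`hh_id`, `food_purchase`) and the helper `within_household_dispersion` are hypothetical and are not taken from the analysis code. Using the sample standard deviation for the CV is also an assumption.

```python
import numpy as np
import pandas as pd


def within_household_dispersion(panel, value_col, id_col="hh_id"):
    """Collapse a household-period panel to per-household range, CV, and IQR."""
    grouped = panel.groupby(id_col)[value_col]
    return pd.DataFrame({
        "range": grouped.max() - grouped.min(),
        # CV = SD / mean; left as NaN when the household mean is zero.
        "cv": grouped.std(ddof=1) / grouped.mean().replace(0, np.nan),
        "iqr": grouped.quantile(0.75) - grouped.quantile(0.25),
    })


# Toy panel: household 1 buys the same amount every period, household 2 alternates.
toy = pd.DataFrame({
    "hh_id": [1, 1, 1, 1, 2, 2, 2, 2],
    "food_purchase": [50.0, 50.0, 50.0, 50.0, 10.0, 90.0, 10.0, 90.0],
})
disp = within_household_dispersion(toy, "food_purchase")
print(disp)
```

In the toy panel both households have the same mean purchase, but only the alternating household shows non-zero range, CV, and IQR, which is the pattern a lumpy purchaser would produce.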

If households in the predictable and risky arms are more likely to smooth consumption through lumpy purchases, we would expect higher dispersion over time in those arms than in the control and stable arms. The graphs below suggest that this is not the case.

Important

Additionally, if the smoothing hypothesis were true, we would also expect purchases to exhibit much more lumpiness, and thus dispersion, than consumption. In practice, however, purchases are much smoother than consumption.

```{python}
#| label: fig-range
#| fig-cap: Within-household range (max − min) by treatment arm.

dispersion_plot(hh_df, "purchase_range", "consumption_range", "Range (Max − Min)")
```
Figure 1: Within-household range (max − min) by treatment arm.
```{python}
#| label: fig-cv
#| fig-cap: Within-household coefficient of variation by treatment arm.

dispersion_plot(hh_df, "purchase_cv", "consumption_cv", "Coefficient of Variation (SD / Mean)")
```
Figure 2: Within-household coefficient of variation by treatment arm.
```{python}
#| label: fig-iqr
#| fig-cap: Within-household IQR by treatment arm.

dispersion_plot(hh_df, "purchase_iqr", "consumption_iqr", "IQR (Q75 − Q25)")
```
Figure 3: Within-household IQR by treatment arm.

Simulated Benchmarks

However, it is hard to tell whether the graphs above simply fail to show differences well or whether the smoothing is truly uniform. To calibrate our expectations, we simulate scenarios with the same number of households and periods as our data but with different degrees of “lumpiness” in food purchases:

  • Smooth (no lumpiness): per-period food purchases are drawn from a normal distribution with mean and SD matching the control arm’s empirical distribution (negative draws are truncated at 0).
  • Lumpy (aggressive): half of each household’s periods are drawn from a normal distribution with mean 1.8x the control arm’s empirical mean and SD, and the other half from a normal with 0.2x the control arm’s mean. The overall mean should therefore be the same, but with forced lumpiness. The simulation is intended to mimic purchasing a lot of food during high draws (hence the 1.8x multiplier) and very little during low draws (the 0.2x multiplier) while keeping the average purchase over the study the same.
  • Lumpy (mild): same as above but with 1.3x and 0.7x the control arm’s empirical mean for the high and low draws, respectively, to simulate milder lumpiness.
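
The three scenarios above can be sketched roughly as follows. This is not the simulation code behind the figures: the function `simulate_purchases`, the placeholder moments (`MEAN`, `SD`), the sample sizes, and the random assignment of high and low periods are all assumptions made for illustration. The description leaves the SD of the lumpy draws open to interpretation, so the sketch scales it by the same multiplier as the mean.

```python
import numpy as np


def simulate_purchases(n_hh, n_periods, mean, sd, high=1.0, low=1.0, seed=0):
    """Simulate an (n_hh, n_periods) array of per-period food purchases.

    Half of each household's periods (rounded down) are "high" periods drawn
    from Normal(high * mean, high * sd); the rest are "low" periods drawn from
    Normal(low * mean, low * sd). high = low = 1 gives the smooth scenario.
    Scaling the SD along with the mean is an assumption of this sketch.
    """
    rng = np.random.default_rng(seed)
    # A random permutation per household decides which periods are "high".
    order = rng.random((n_hh, n_periods)).argsort(axis=1)
    is_high = order < n_periods // 2
    scale = np.where(is_high, high, low)
    draws = rng.normal(loc=scale * mean, scale=scale * sd)
    return np.clip(draws, 0, None)  # truncate negative purchases at 0


# Placeholder moments; in the analysis these would come from the control arm.
MEAN, SD = 100.0, 20.0
scenarios = {
    "Smooth": simulate_purchases(500, 10, MEAN, SD),
    "Lumpy (mild)": simulate_purchases(500, 10, MEAN, SD, high=1.3, low=0.7),
    "Lumpy (aggressive)": simulate_purchases(500, 10, MEAN, SD, high=1.8, low=0.2),
}
for name, x in scenarios.items():
    hh_range = x.max(axis=1) - x.min(axis=1)
    print(f"{name}: mean purchase {x.mean():.1f}, mean range {hh_range.mean():.1f}")
```

With an even number of periods, the high and low multipliers average to one, so the mean purchase stays close to the smooth scenario while the within-household range grows with the degree of lumpiness.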

Judging from the figures, there is almost no lumpiness in the observed food purchase data for most households. However, it does seem like some households might be experiencing lumpiness as

```{python}
#| label: fig-sim-range-agg
#| fig-cap: 'Range: observed vs. aggressive simulated benchmarks.'

def sim_density(df, x_col, x_label):
    """Plot the empirical CDF of `x_col`, with one curve per scenario."""
    return (
        ggplot(df.dropna(subset=[x_col]), aes(x=x_col, color="scenario"))
        + stat_ecdf()
        + labs(x=x_label, y="Cumulative Probability", color=None)
        + theme_minimal()
        + theme(legend_position="top", figure_size=(8, 4))
    )

sim_density(sim_compare_aggressive, "range", "Range (Max − Min)")
```
Figure 4: Range: observed vs. aggressive simulated benchmarks.
```{python}
#| label: fig-sim-cv-agg
#| fig-cap: 'CV: observed vs. aggressive simulated benchmarks.'

sim_density(sim_compare_aggressive, "cv", "Coefficient of Variation")
```
Figure 5: CV: observed vs. aggressive simulated benchmarks.
```{python}
#| label: fig-sim-iqr-agg
#| fig-cap: 'IQR: observed vs. aggressive simulated benchmarks.'

sim_density(sim_compare_aggressive, "iqr", "IQR (Q75 − Q25)")
```
Figure 6: IQR: observed vs. aggressive simulated benchmarks.
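
Each curve in the benchmark figures is an empirical CDF: at any value x it gives the share of households whose dispersion measure is at or below x. A minimal NumPy sketch of that calculation follows. The helper name `ecdf` and the toy values are illustrative only and do not come from the analysis code.

```python
import numpy as np


def ecdf(values):
    """Return sorted values and the share of observations at or below each one."""
    x = np.sort(np.asarray(values, dtype=float))
    x = x[~np.isnan(x)]  # mirror the dropna() applied before plotting
    y = np.arange(1, x.size + 1) / x.size
    return x, y


x, y = ecdf([30.0, 10.0, np.nan, 20.0, 40.0])
print(x)  # sorted values with the missing entry removed
print(y)  # cumulative shares, ending at 1.0
```

A scenario whose curve lies to the right of the observed curve has more within-household dispersion than the data at every quantile.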

Treatment effect on non-staple categories

It is difficult to see treatment effects on the extensive margin of consumption and purchases for non-staple food categories (meat, dairy, vegetables). The rates of consumption are already fairly high.

However, it does seem that the treatments do result in more vegetable purchases and consumption.

```{python}
#| label: fig-extensive-share
#| fig-cap: Share of HH-periods with any purchase by arm and category.

desc_rows = []
for cat in cats:
    for arm_code, arm_label in TREATMENT_LABELS.items():
        subset = reg_df[reg_df["treatment"] == arm_code]
        share = subset[f"any_purchase_{cat}"].mean()
        desc_rows.append({"Category": cat.title(), "Arm": arm_label, "Share": share})

desc_df = pd.DataFrame(desc_rows)
desc_df["Arm"] = pd.Categorical(desc_df["Arm"], categories=ARM_ORDER, ordered=True)

(
    ggplot(desc_df, aes(x="Arm", y="Share", fill="Arm"))
    + geom_col()
    + facet_wrap("Category", nrow=1)
    + labs(x=None, y="Share with Any Purchase")
    + theme_minimal()
    + theme(legend_position="none", figure_size=(10, 4),
            axis_text_x=element_text(rotation=30, ha="right"))
)
```
Figure 7: Share of HH-periods with any purchase by arm and category.

Treatment Effects

  • Intensive Margin
  • Extensive Margin
```{python}
#| label: tbl-treatment-effects-intensive
#| output: asis

fitted = []
for cat in cats:
    m = smf.ols(
        f"food_purchase_{cat}_99 ~ C(treatment, Treatment(reference=0)) + C(period_str)",
        data=reg_df,
    ).fit(cov_type="HC3")
    fitted.append(m)

fitted_c = []
for cat in cats:
    m = smf.ols(
        f"food_consumption_{cat}_99 ~ C(treatment, Treatment(reference=0)) + C(period_str)",
        data=reg_df,
    ).fit(cov_type="HC3")
    fitted_c.append(m)

# Count HH-periods with a strictly positive amount (computed but not shown in the table).
n_pos_purchase = [int((reg_df[f"food_purchase_{cat}_99"] > 0).sum()) for cat in cats]
n_pos_consumption = [
    int((reg_df[f"food_consumption_{cat}_99"] > 0).sum()) for cat in cats
]

sg_intensive = Stargazer(fitted + fitted_c)
sg_intensive.title("Amount purchase / consumption — Treatment Effects")
sg_intensive.custom_columns(
    [
        "Meat - Purchase",
        "Dairy - Purchase",
        "Vegetables - Purchase",
        "Meat - Cons.",
        "Dairy - Cons.",
        "Vegetables - Cons.",
    ],
    [1, 1, 1, 1, 1, 1],
)
sg_intensive.covariate_order(
    [
        "C(treatment, Treatment(reference=0))[T.1]",
        "C(treatment, Treatment(reference=0))[T.2]",
        "C(treatment, Treatment(reference=0))[T.3]",
    ]
)
sg_intensive.rename_covariates(
    {
        "C(treatment, Treatment(reference=0))[T.1]": "Stable",
        "C(treatment, Treatment(reference=0))[T.2]": "Predictable",
        "C(treatment, Treatment(reference=0))[T.3]": "Risky",
    }
)
sg_intensive.add_line("Period FEs", ["Yes"] * 6, LineLocation.FOOTER_BOTTOM)
sg_intensive.show_degrees_of_freedom(False)
sg_intensive
```
Table 1
Amount purchase / consumption — Treatment Effects
|  | Meat - Purchase | Dairy - Purchase | Vegetables - Purchase | Meat - Cons. | Dairy - Cons. | Vegetables - Cons. |
|---|---|---|---|---|---|---|
|  | (1) | (2) | (3) | (4) | (5) | (6) |
| Stable | 1.537 | -0.042 | 3.788*** | 0.460 | 0.146 | 1.956** |
|  | (1.243) | (0.341) | (0.696) | (1.875) | (0.483) | (0.841) |
| Predictable | 4.300*** | 1.636*** | 3.944*** | 2.983 | 2.122*** | 2.535*** |
|  | (1.308) | (0.374) | (0.697) | (1.895) | (0.512) | (0.843) |
| Risky | 2.188** | 0.262 | 2.564*** | -0.715 | 0.787* | 2.134*** |
|  | (1.033) | (0.286) | (0.551) | (1.521) | (0.403) | (0.694) |
| Observations | 14865 | 14865 | 14865 | 14865 | 14865 | 14865 |
| R² | 0.058 | 0.028 | 0.015 | 0.100 | 0.025 | 0.022 |
| Adjusted R² | 0.058 | 0.027 | 0.014 | 0.100 | 0.024 | 0.022 |
| Residual Std. Error | 46.585 | 12.865 | 25.303 | 66.156 | 17.972 | 30.611 |
| F Statistic | 77.352*** | 41.238*** | 21.713*** | 135.921*** | 31.200*** | 40.615*** |
| Period FEs | Yes | Yes | Yes | Yes | Yes | Yes |

Note: *p<0.1; **p<0.05; ***p<0.01
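
Written out, the regressions behind Table 1 appear to correspond to the specification below. This is a reading of the formula string in the code, not notation taken from the source, and it assumes that treatment code 0 is the control arm.

$$
y_{it} = \alpha + \beta_1\,\mathrm{Stable}_i + \beta_2\,\mathrm{Predictable}_i + \beta_3\,\mathrm{Risky}_i + \gamma_t + \varepsilon_{it}
$$

Here $y_{it}$ is the purchase or consumption amount for household $i$ in period $t$, the arm indicators equal one for households assigned to that arm, $\gamma_t$ are period fixed effects, and standard errors are heteroskedasticity-robust (HC3). Each reported coefficient is therefore a difference in mean amounts relative to the omitted arm, net of period effects.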
```{python}
#| label: tbl-treatment-effects
#| output: asis

fitted = []
for cat in cats:
    m = smf.ols(
        f"any_purchase_{cat} ~ C(treatment, Treatment(reference=0)) + C(period_str)",
        data=reg_df,
    ).fit(cov_type="HC3")
    fitted.append(m)

fitted_c = []
for cat in cats:
    m = smf.ols(
        f"any_consumption_{cat} ~ C(treatment, Treatment(reference=0)) + C(period_str)",
        data=reg_df,
    ).fit(cov_type="HC3")
    fitted_c.append(m)

n_pos_purchase = [int((reg_df[f"any_purchase_{cat}"] == 1).sum()) for cat in cats]
n_pos_consumption = [int((reg_df[f"any_consumption_{cat}"] == 1).sum()) for cat in cats]

sg = Stargazer(fitted + fitted_c)
sg.title("Any Purchase / Consumption (LPM) — Treatment Effects")
sg.custom_columns(
    [
        "Meat - Purchase",
        "Dairy - Purchase",
        "Vegetables - Purchase",
        "Meat - Cons.",
        "Dairy - Cons.",
        "Vegetables - Cons.",
    ],
    [1, 1, 1, 1, 1, 1],
)
sg.covariate_order(
    [
        "C(treatment, Treatment(reference=0))[T.1]",
        "C(treatment, Treatment(reference=0))[T.2]",
        "C(treatment, Treatment(reference=0))[T.3]",
    ]
)
sg.rename_covariates(
    {
        "C(treatment, Treatment(reference=0))[T.1]": "Stable",
        "C(treatment, Treatment(reference=0))[T.2]": "Predictable",
        "C(treatment, Treatment(reference=0))[T.3]": "Risky",
    }
)
sg.add_line("N (Positive)", [str(n) for n in n_pos_purchase + n_pos_consumption], LineLocation.FOOTER_BOTTOM)
sg.add_line("Period FEs", ["Yes"] * 6, LineLocation.FOOTER_BOTTOM)
sg.show_degrees_of_freedom(False)
sg
```
Table 2
Any Purchase / Consumption (LPM) — Treatment Effects
|  | Meat - Purchase | Dairy - Purchase | Vegetables - Purchase | Meat - Cons. | Dairy - Cons. | Vegetables - Cons. |
|---|---|---|---|---|---|---|
|  | (1) | (2) | (3) | (4) | (5) | (6) |
| Stable | -0.001 | -0.006 | 0.007 | -0.006 | 0.002 | 0.008* |
|  | (0.008) | (0.012) | (0.006) | (0.006) | (0.013) | (0.004) |
| Predictable | 0.005 | 0.046*** | 0.013** | -0.003 | 0.052*** | 0.010*** |
|  | (0.007) | (0.012) | (0.005) | (0.006) | (0.013) | (0.004) |
| Risky | 0.004 | 0.019* | 0.013*** | -0.002 | 0.027** | 0.010*** |
|  | (0.006) | (0.010) | (0.005) | (0.005) | (0.011) | (0.003) |
| Observations | 14865 | 14865 | 14865 | 14865 | 14865 | 14865 |
| R² | 0.009 | 0.025 | 0.003 | 0.013 | 0.017 | 0.002 |
| Adjusted R² | 0.008 | 0.024 | 0.003 | 0.013 | 0.017 | 0.002 |
| Residual Std. Error | 0.267 | 0.448 | 0.188 | 0.223 | 0.483 | 0.126 |
| F Statistic | 11.089*** | 43.452*** | 5.462*** | 12.684*** | 28.895*** | 3.585*** |
| N (Positive) | 13710 | 4294 | 14315 | 14077 | 5782 | 14623 |
| Period FEs | Yes | Yes | Yes | Yes | Yes | Yes |

Note: *p<0.1; **p<0.05; ***p<0.01
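
To read the LPM coefficients as percentage-point effects, the short sketch below takes the Predictable coefficient on dairy purchases from Table 2 (0.046, SE 0.012) and forms a normal-approximation 95% confidence interval. For scale it also reports the pooled share of HH-periods with any dairy purchase (4294 of 14865). That share is computed across all arms and is not the control-arm mean, which the table does not report.

```python
coef, se = 0.046, 0.012          # Predictable, Dairy - Purchase (Table 2)
n_positive, n_obs = 4294, 14865  # N (Positive) and Observations (Table 2)

z = 1.96  # normal critical value for a two-sided 95% interval
ci_low, ci_high = coef - z * se, coef + z * se
pooled_share = n_positive / n_obs

print(f"Effect: {100 * coef:.1f} pp (95% CI {100 * ci_low:.1f} to {100 * ci_high:.1f} pp)")
print(f"Pooled share with any dairy purchase: {100 * pooled_share:.1f}%")
```

The same arithmetic applies to any other cell of the table by swapping in its coefficient, standard error, and positive count.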

Draw Effects

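Within the risky and predictable arms, draws don’t seem to affect purchase or consumption choices much: relative to a low draw, the High Draw coefficients in the tables below are small and mostly statistically insignificant.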

  • Intensive Margin
  • Extensive Margin
  • Extensive Margin Purchases - By Arm
```{python}
#| label: tbl-draw-effects-intensive

draw_df = reg_df[
    reg_df["treatment"].isin([2, 3]) & reg_df["draw"].isin(["H", "M", "L"])
].copy()

fitted_d = []
for cat in cats:
    m = smf.ols(
        f"food_purchase_{cat}_99 ~ C(draw, Treatment(reference='L')) + C(period_str)",
        data=draw_df,
    ).fit(cov_type="HC3")
    fitted_d.append(m)

fitted_dc = []
for cat in cats:
    m = smf.ols(
        f"food_consumption_{cat}_99 ~ C(draw, Treatment(reference='L')) + C(period_str)",
        data=draw_df,
    ).fit(cov_type="HC3")
    fitted_dc.append(m)

# Count HH-periods with a strictly positive amount (computed but not shown in the table).
n_pos_draw_purchase = [
    int((draw_df[f"food_purchase_{cat}_99"] > 0).sum()) for cat in cats
]
n_pos_draw_consumption = [
    int((draw_df[f"food_consumption_{cat}_99"] > 0).sum()) for cat in cats
]

sg_draw_intensive = Stargazer(fitted_d + fitted_dc)
sg_draw_intensive.title("Purchase / Consumption — Draw Effects (ref: Low Draw)")
sg_draw_intensive.custom_columns(
    [
        "Meat - Purchase",
        "Dairy - Purchase",
        "Vegetables - Purchase",
        "Meat - Consumption",
        "Dairy - Consumption",
        "Vegetables - Consumption",
    ],
    [1, 1, 1, 1, 1, 1],
)
sg_draw_intensive.covariate_order(
    [
        "C(draw, Treatment(reference='L'))[T.H]",
    ]
)
sg_draw_intensive.rename_covariates(
    {
        "C(draw, Treatment(reference='L'))[T.H]": "High Draw",
    }
)
sg_draw_intensive.add_line("Period FEs", ["Yes"] * 6, LineLocation.FOOTER_BOTTOM)
sg_draw_intensive.show_degrees_of_freedom(False)
sg_draw_intensive
```
Table 3
Purchase / Consumption — Draw Effects (ref: Low Draw)
|  | Meat - Purchase | Dairy - Purchase | Vegetables - Purchase | Meat - Consumption | Dairy - Consumption | Vegetables - Consumption |
|---|---|---|---|---|---|---|
|  | (1) | (2) | (3) | (4) | (5) | (6) |
| High Draw | -0.683 | 0.158 | -0.286 | -1.991 | -0.010 | 0.122 |
|  | (0.943) | (0.276) | (0.534) | (1.395) | (0.418) | (0.703) |
| Observations | 7733 | 7733 | 7733 | 7733 | 7733 | 7733 |
| R² | 0.041 | 0.015 | 0.005 | 0.131 | 0.028 | 0.004 |
| Adjusted R² | 0.040 | 0.015 | 0.004 | 0.131 | 0.027 | 0.003 |
| Residual Std. Error | 41.464 | 12.129 | 23.459 | 61.292 | 18.338 | 30.903 |
| F Statistic | 40.142*** | 20.485*** | 7.187*** | 116.872*** | 29.321*** | 4.739*** |
| Period FEs | Yes | Yes | Yes | Yes | Yes | Yes |

Note: *p<0.1; **p<0.05; ***p<0.01
```{python}
#| label: tbl-draw-effects

draw_df = reg_df[
    reg_df["treatment"].isin([2, 3]) & reg_df["draw"].isin(["H", "M", "L"])
].copy()

fitted_d = []
for cat in cats:
    m = smf.ols(
        f"any_purchase_{cat} ~ C(draw, Treatment(reference='L')) + C(period_str)",
        data=draw_df,
    ).fit(cov_type="HC3")
    fitted_d.append(m)

fitted_dc = []
for cat in cats:
    m = smf.ols(
        f"any_consumption_{cat} ~ C(draw, Treatment(reference='L')) + C(period_str)",
        data=draw_df,
    ).fit(cov_type="HC3")
    fitted_dc.append(m)

n_pos_draw_purchase = [int((draw_df[f"any_purchase_{cat}"] == 1).sum()) for cat in cats]
n_pos_draw_consumption = [
    int((draw_df[f"any_consumption_{cat}"] == 1).sum()) for cat in cats
]

sg3 = Stargazer(fitted_d + fitted_dc)
sg3.title("Any Purchase / Consumption (LPM) — Draw Effects (ref: Low Draw)")
sg3.custom_columns(
    [
        "Meat - Purchase",
        "Dairy - Purchase",
        "Vegetables - Purchase",
        "Meat - Consumption",
        "Dairy - Consumption",
        "Vegetables - Consumption",
    ],
    [1, 1, 1, 1, 1, 1],
)
sg3.covariate_order(
    [
        "C(draw, Treatment(reference='L'))[T.H]",
    ]
)
sg3.rename_covariates(
    {
        "C(draw, Treatment(reference='L'))[T.H]": "High Draw",
    }
)

sg3.add_line(
    "N (Positive)",
    [str(n) for n in n_pos_draw_purchase + n_pos_draw_consumption],
    LineLocation.FOOTER_BOTTOM,
)
sg3.add_line("Period FEs", ["Yes"] * 6, LineLocation.FOOTER_BOTTOM)
sg3.show_degrees_of_freedom(False)
sg3
```
Table 4
Any Purchase / Consumption (LPM) — Draw Effects (ref: Low Draw)
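|  | Meat - Purchase | Dairy - Purchase | Vegetables - Purchase | Meat - Consumption | Dairy - Consumption | Vegetables - Consumption |
|---|---|---|---|---|---|---|
|  | (1) | (2) | (3) | (4) | (5) | (6) |
| High Draw | 0.002 | -0.015 | 0.001 | 0.002 | -0.019* | 0.000 |
|  | (0.006) | (0.010) | (0.004) | (0.005) | (0.011) | (0.002) |
| Observations | 7733 | 7733 | 7733 | 7733 | 7733 | 7733 |
| R² | 0.001 | 0.016 | 0.003 | 0.001 | 0.020 | 0.002 |
| Adjusted R² | 0.001 | 0.016 | 0.002 | -0.000 | 0.019 | 0.001 |
| Residual Std. Error | 0.245 | 0.448 | 0.182 | 0.198 | 0.485 | 0.108 |
| F Statistic | 1.996* | 21.120*** | 4.005*** | 0.694 | 25.553*** | 2.971*** |
| N (Positive) | 7236 | 2211 | 7467 | 7418 | 3099 | 7641 |
| Period FEs | Yes | Yes | Yes | Yes | Yes | Yes |

Note: *p<0.1; **p<0.05; ***p<0.01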
```{python}
#| label: tbl-draw-effects-intensive-by-arm

pred_df = reg_df[
    (reg_df["treatment"] == 2) & reg_df["draw"].isin(["H", "M", "L"])
].copy()
risky_df = reg_df[
    (reg_df["treatment"] == 3) & reg_df["draw"].isin(["H", "M", "L"])
].copy()

fitted_pred = []
for cat in cats:
    m = smf.ols(
        f"any_purchase_{cat} ~ C(draw, Treatment(reference='L')) + C(period_str)",
        data=pred_df,
    ).fit(cov_type="HC3")
    fitted_pred.append(m)

fitted_risky = []
for cat in cats:
    m = smf.ols(
        f"any_purchase_{cat} ~ C(draw, Treatment(reference='L')) + C(period_str)",
        data=risky_df,
    ).fit(cov_type="HC3")
    fitted_risky.append(m)

sg_by_arm = Stargazer(fitted_pred + fitted_risky)
sg_by_arm.title("Purchase — Effects of high draw by Arm")
# One label per model: three Predictable-arm models, then three Risky-arm models.
# (An earlier two-group custom_columns call was overwritten by this one, so it is dropped.)
sg_by_arm.custom_columns(
    [
        "Meat - Predictable",
        "Dairy - Predictable",
        "Vegetables - Predictable",
        "Meat - Risky",
        "Dairy - Risky",
        "Vegetables - Risky",
    ],
    [1, 1, 1, 1, 1, 1],
)
sg_by_arm.covariate_order(
    [
        "C(draw, Treatment(reference='L'))[T.H]",
    ]
)
sg_by_arm.rename_covariates(
    {
        "C(draw, Treatment(reference='L'))[T.H]": "High Draw",
    }
)
sg_by_arm.show_degrees_of_freedom(False)
sg_by_arm
```
Table 5
Purchase — Effects of high draw by Arm
|  | Meat - Predictable | Dairy - Predictable | Vegetables - Predictable | Meat - Risky | Dairy - Risky | Vegetables - Risky |
|---|---|---|---|---|---|---|
|  | (1) | (2) | (3) | (4) | (5) | (6) |
| High Draw | -0.012 | 0.030 | -0.003 | 0.008 | -0.033*** | 0.002 |
|  | (0.011) | (0.019) | (0.008) | (0.007) | (0.012) | (0.005) |
| Observations | 2246 | 2246 | 2246 | 5487 | 5487 | 5487 |
| R² | 0.005 | 0.013 | 0.004 | 0.001 | 0.019 | 0.002 |
| Adjusted R² | 0.002 | 0.011 | 0.001 | -0.000 | 0.018 | 0.001 |
| Residual Std. Error | 0.249 | 0.459 | 0.184 | 0.244 | 0.443 | 0.181 |
| F Statistic | 1.747 | 4.955*** | 2.002* | 0.995 | 17.751*** | 2.430** |

Note: *p<0.1; **p<0.05; ***p<0.01
 