A set of regressions run on outputs requested by Nick.
1 U-Shaped Graphs
Code
```{python}
import pandas as pd
from plotnine import (
    ggplot, aes, geom_line, labs, scale_color_manual, theme_minimal
)

merged = pd.read_stata(
    "/Users/st2246/Work/Pilot3/data/generated/main/transform/30_merged_panel-hh_id-period.dta",
    convert_categoricals=False,
)
merged_grouped = (
    merged.groupby(["period", "treatment"])["money_work_p_corrected_99"]
    .mean()
    .reset_index()
)
merged_grouped["treatment"] = merged_grouped["treatment"].replace(
    {0: "Control", 1: "Stable", 2: "Predictable", 3: "Risky"}
)

# Plot mean participant work income by period and treatment
(
    ggplot(
        merged_grouped,
        aes(
            x="period",
            y="money_work_p_corrected_99",
            color="treatment",
            group="treatment",
        ),
    )
    + geom_line()
    + labs(title="Work Income (Participant)", x="Period", y="Cedis")
    + scale_color_manual(
        name="Treatment",
        values={
            "Control": "#F8766D",
            "Stable": "#00BA38",
            "Predictable": "#619CFF",
            "Risky": "#FF61C3",
        },
    )
    + theme_minimal()
)
```
The goal of this analysis is to investigate whether the u-shaped curve in employment-related outcomes is driven by survey modality (phone vs. in-person) or by seasonality. The latter is a viable explanation: the baseline was conducted during the planting season while the endline falls close to the harvest season, which would explain the spikes in labor at both ends.
2 Planting and Harvest Dates
Below, we use participants’ self-reported planting and harvest dates to confirm that the planting and harvest seasons overlap with the baseline and endline, respectively.
2.1 Planting
Important
The dates are winsorized. The earliest date in the raw data is in 2006 (likely because “months” was accidentally selected instead of “days”).
Figure 2: Earliest and latest planting date per household
2.2 Harvest
Important
Harvest dates combine two sources: endline (actual, back-calculated) and phone surveys (expected, forward-calculated). Dates are winsorized to Jun 2025 – Feb 2026.
Figure 4: Earliest and latest harvest date per household
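The winsorization applied to these dates can be sketched with pandas `Series.clip`; the dates below are made up for illustration, and only the Jun 2025 – Feb 2026 window comes from the text above.

```python
import pandas as pd

# Hypothetical harvest dates, including one implausibly early
# and one implausibly late entry
harvest = pd.Series(pd.to_datetime(["2025-04-20", "2025-11-03", "2026-03-15"]))

# Winsorize to the Jun 2025 - Feb 2026 window used above
lower = pd.Timestamp("2025-06-01")
upper = pd.Timestamp("2026-02-28")
winsorized = harvest.clip(lower=lower, upper=upper)
```

Out-of-range dates are pulled to the nearest bound; in-range dates pass through unchanged.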
3 Date and Impact on Labor
Given that the overlap is confirmed, we want to disentangle the impact of survey modality from that of seasonality in explaining the u-shaped curve. To do so, we exploit variation in survey dates within each period.
Code
```{python}
period_dta = pd.read_stata(f"{TIDY}/16_period_hh_level-hh_id-period.dta")

# Survey date is in the startdate variable; generate a days-since-start variable
period_dta["days_since_start"] = (
    period_dta["startdate"] - period_dta["startdate"].min()
).dt.days

participant_employment = pd.read_stata(
    "/Users/st2246/Work/Pilot3/data/generated/main/transform/02_flag_double_reports-hh_id-period.dta"
)
employment = pd.read_stata(
    f"{TIDY}/05_Employment-hh_id-period-member_id.dta", convert_categoricals=False
)
adults = pd.read_stata(
    f"{TIDY}/08_inperson_demographics-hh_id-period-member_id.dta",
    convert_categoricals=False,
)
adults = adults[adults["age"] >= 18]
adults = adults[["hh_id", "member_id", "participant_id"]].drop_duplicates()

# Expand each adult to one row per period (7 periods)
adults = adults.loc[adults.index.repeat(7)].copy()
adults["period"] = adults.groupby(["hh_id", "member_id"]).cumcount()

emp = adults.merge(
    employment[
        [
            "hh_id",
            "period",
            "member_id",
            "money_work_99",
            "days_worked_",
            "work_engaged_",
        ]
    ],
    on=["hh_id", "period", "member_id"],
    how="left",
)

# Fill in 0s for money_work_99, days_worked_, and work_engaged_ when missing
emp["money_work_99"] = emp["money_work_99"].fillna(0)
emp["days_worked_"] = emp["days_worked_"].fillna(0)
emp["work_engaged_"] = emp["work_engaged_"].fillna(0)

# For rows where member_id == participant_id, replace employment vars with corrected values
emp = emp.merge(
    participant_employment[
        [
            "hh_id",
            "period",
            "money_work_p_corrected_99",
            "days_worked_p_corrected",
            "work_engaged_p_corrected",
        ]
    ],
    on=["hh_id", "period"],
    how="left",
)
is_participant = emp["member_id"] == emp["participant_id"]
emp.loc[is_participant, "money_work_99"] = emp.loc[
    is_participant, "money_work_p_corrected_99"
].values
emp.loc[is_participant, "days_worked_"] = emp.loc[
    is_participant, "days_worked_p_corrected"
].values
# No int cast here: casting missing corrected values to int raises a
# RuntimeWarning and silently produces garbage values
emp.loc[is_participant, "work_engaged_"] = emp.loc[
    is_participant, "work_engaged_p_corrected"
].values

# df_allmembers: all members (with participant corrections) merged with period data
df_allmembers = emp.merge(
    period_dta[["hh_id", "period", "startdate", "days_since_start", "enum_id"]],
    on=["hh_id", "period"],
)
df_allmembers["inperson_survey"] = (
    df_allmembers["period"].apply(lambda p: 0 if 1 <= p <= 5 else 1).astype(int)
)

# df_noparticipant: exclude the participant row
df_noparticipant = df_allmembers[
    df_allmembers["member_id"] != df_allmembers["participant_id"]
].copy()

# df: participant-only rows for participant-level regressions
df = df_allmembers[df_allmembers["member_id"] == df_allmembers["participant_id"]].copy()
```
3.1 Piecewise Linear Regression to Find Optimal Breakpoint
Under the seasonality hypothesis, we would expect the relationship between days since start and employment outcomes to be non-linear. At the beginning, it should be negative as we move further away from the planting season. However, as we approach the harvest season, the relationship should become positive.
Below, we run a series of piecewise linear regressions to find the optimal breakpoint. That is, we look for the date at which the relationship between days since start and employment outcomes changes by finding the date that minimizes the residual sum of squares.
Code
```{python}
import numpy as np
import statsmodels.api as sm


def find_best_spline(column: str) -> tuple:
    y = df_allmembers[column].values
    x = df_allmembers["days_since_start"].values

    # Search over candidate breakpoints (exclude extremes)
    candidates = np.arange(x.min(), x.max())
    rss_values = []
    for bp in candidates:
        # Piecewise linear basis: x and the hinge term (x - bp)_+
        x_spline = np.maximum(x - bp, 0)
        X = sm.add_constant(np.column_stack([x, x_spline]))
        model = sm.OLS(y, X).fit()
        rss_values.append(model.ssr)
    candidate_dates = [
        df["startdate"].min() + pd.Timedelta(days=int(bp)) for bp in candidates
    ]
    return candidates, rss_values, candidate_dates
```
It seems early August is the ideal splitting point. Given the smoothness of the curves, I am not too concerned with picking the exact date. For the purposes of the analysis below, I will use August 1st.
Code
```{python}
# Best candidate date is Aug 9; subtract 8 days to land on Aug 1
best_day = candidates[np.argmin(rss_values)] - 8
df_allmembers["days_after_start"] = df_allmembers["days_since_start"].apply(
    lambda x: min(x, best_day)
)
df_allmembers["days_since_aug1"] = df_allmembers["days_since_start"].apply(
    lambda x: max(0, x - best_day)
)
df_participant = df_allmembers[
    df_allmembers["member_id"] == df_allmembers["participant_id"]
].copy()
df_noparticipant = df_allmembers[
    df_allmembers["member_id"] != df_allmembers["participant_id"]
].copy()
```
3.2 Regressions
The regressions below suggest that the u-shaped curve is mostly driven by seasonality.
Code
```{python}
from stargazer.stargazer import Stargazer


def run_regs(data: pd.DataFrame, dv: str) -> list:
    sub = data[
        [
            dv,
            "days_since_start",
            "days_after_start",
            "days_since_aug1",
            "inperson_survey",
            "enum_id",
        ]
    ].dropna()
    y = sub[dv]
    # TODO: add a spec that interacts days_since_aug1 and days_after_start
    # with inperson_survey
    specs = [
        ["days_since_start"],
        ["days_after_start", "days_since_aug1"],
        ["days_since_start", "inperson_survey"],
        ["days_after_start", "days_since_aug1", "inperson_survey"],
        ["days_since_start", "inperson_survey", "enum_id"],
        ["days_after_start", "days_since_aug1", "inperson_survey", "enum_id"],
    ]
    enum_dummies = pd.get_dummies(
        sub["enum_id"], prefix="enum", drop_first=True, dtype=float
    )
    models = []
    for spec in specs:
        if "enum_id" in spec:
            cols = [c for c in spec if c != "enum_id"]
            X = sm.add_constant(pd.concat([sub[cols], enum_dummies], axis=1))
        else:
            X = sm.add_constant(sub[spec])
        models.append(sm.OLS(y, X).fit())
    return models


models_all = run_regs(df_allmembers, "days_worked_")
models_p = run_regs(df_participant, "days_worked_p_corrected")
```
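The interacted specification flagged in the TODO could be sketched as follows, on synthetic data (all names and values hypothetical). With the `statsmodels` formula API, `a * b` expands to both main effects plus the `a:b` interaction:

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
demo = pd.DataFrame({
    "days_after_start": rng.uniform(0, 90, n),
    "days_since_aug1": rng.uniform(0, 60, n),
    "inperson_survey": rng.integers(0, 2, n),
})
# Outcome built from made-up coefficients roughly in the range seen above
demo["days_worked"] = (
    -0.006 * demo["days_after_start"]
    + 0.019 * demo["days_since_aug1"]
    + 0.4 * demo["inperson_survey"]
    + rng.normal(0, 1, n)
)

# Interact both slope terms with survey modality
model = smf.ols(
    "days_worked ~ days_after_start * inperson_survey"
    " + days_since_aug1 * inperson_survey",
    data=demo,
).fit()
```

The interaction coefficients then test whether the pre- and post-August slopes differ between phone and in-person rounds.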
```{python}
#| output: asis
def render_table(models: list, title: str) -> str:
    sg = Stargazer(models)
    sg.title(title)
    sg.custom_columns(
        ["Days Since Start", "w/ split", "Control for Modality", "",
         "Control for Enum", "(6)"],
        [1, 1, 1, 1, 1, 1],
    )
    # Collect all enum dummy names across all models and hide them
    enum_cols = []
    for m in models:
        enum_cols += [p for p in m.params.index if p.startswith("enum_")]
    if enum_cols:
        all_params = []
        for m in models:
            for p in m.params.index:
                if not p.startswith("enum_") and p not in all_params:
                    all_params.append(p)
        sg.covariate_order(all_params)
    sg.add_line("Enum FE", ["No", "No", "No", "No", "Yes", "Yes"])
    return sg.render_html()


print(render_table(models_all, "Days Worked — All Household Adults"))
```
**Days Worked — All Household Adults**

*Dependent variable:* `days_worked_`

|                     | Days Since Start | w/ split | Control for Modality |          | Control for Enum | (6)      |
|---------------------|------------------|----------|----------------------|----------|------------------|----------|
|                     | (1)              | (2)      | (3)                  | (4)      | (5)              | (6)      |
| const               | 0.528***         | 0.760*** | 0.326***             | 0.603*** | 0.743***         | 0.969*** |
|                     | (0.016)          | (0.018)  | (0.018)              | (0.036)  | (0.131)          | (0.132)  |
| days_since_start    | -0.002***        |          | -0.001***            |          | -0.002***        |          |
|                     | (0.000)          |          | (0.000)              |          | (0.000)          |          |
| days_after_start    |                  | -0.006*** |                     | -0.004*** |                 | -0.006*** |
|                     |                  | (0.000)  |                      | (0.000)  |                  | (0.000)  |
| days_since_aug1     |                  | 0.019*** |                      | 0.013*** |                  | 0.013*** |
|                     |                  | (0.001)  |                      | (0.002)  |                  | (0.001)  |
| inperson_survey     |                  |          | 0.438***             | 0.172*** | 0.447***         | 0.160*** |
|                     |                  |          | (0.016)              | (0.034)  | (0.023)          | (0.036)  |
| Enum FE             | No               | No       | No                   | No       | Yes              | Yes      |
| Observations        | 54873            | 54873    | 54873                | 54873    | 54873            | 54873    |
| R²                  | 0.001            | 0.016    | 0.015                | 0.016    | 0.145            | 0.146    |
| Adjusted R²         | 0.001            | 0.016    | 0.015                | 0.016    | 0.143            | 0.145    |
| Residual Std. Error | 1.706 (df=54871) | 1.694 (df=54870) | 1.695 (df=54870) | 1.694 (df=54869) | 1.581 (df=54767) | 1.579 (df=54766) |
| F Statistic         | 69.691*** (df=1; 54871) | 438.338*** (df=2; 54870) | 411.538*** (df=2; 54870) | 300.860*** (df=3; 54869) | 88.209*** (df=105; 54767) | 88.525*** (df=106; 54766) |

Note: \*p<0.1; \*\*p<0.05; \*\*\*p<0.01
Code
```{python}
#| output: asis
print(render_table(models_p, "Days Worked — Participant Only"))
```