Based on the information given in the received empty_diff_df.csv, computes the appropriate differences in mean outcomes at the local silo and saves them as filled_diff_df_$silo_name.csv. Also stores trends data as trends_data_$silo_name.csv.
Usage
undid_stage_two(
empty_diff_filepath,
silo_name,
silo_df,
time_column,
outcome_column,
silo_date_format,
consider_covariates = TRUE,
filepath = tempdir()
)
Arguments
- empty_diff_filepath
A character filepath to the empty_diff_df.csv.
- silo_name
A character indicating the name of the local silo. Ensure the spelling matches the one used in the empty_diff_df.csv.
- silo_df
A data frame of the local silo's data. Ensure any covariates are spelled the same in this data frame as they are in the empty_diff_df.csv.
- time_column
A character indicating the name of the column in silo_df that contains the date data. Ensure the time_column references a column of character values.
- outcome_column
A character indicating the name of the column in silo_df that contains the outcome of interest. Ensure the outcome_column references a column of numeric values.
- silo_date_format
A character indicating the date format in which the date strings in the time_column are written.
- consider_covariates
An optional logical parameter which, if set to FALSE, skips all computations involving the covariates. Defaults to TRUE.
- filepath
A character value indicating the filepath where the CSV files are saved. Defaults to tempdir().
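The type requirements on time_column and outcome_column can be checked before calling the function. The following is a minimal sketch in base R, assuming a hypothetical silo_df whose year column was read in as integers and whose outcome column was read in as strings; the data values are made up for illustration.

```r
# Hypothetical local data; the values are illustrative only.
silo_df <- data.frame(
  year = c(1989L, 1990L, 1991L),
  coll = c("0", "1", "1"),
  stringsAsFactors = FALSE
)

# time_column must reference character values, so convert the integer years.
# The resulting strings ("1989", ...) correspond to silo_date_format = "yyyy".
silo_df$year <- as.character(silo_df$year)

# outcome_column must reference numeric values.
silo_df$coll <- as.numeric(silo_df$coll)

# Fail early if either column still has the wrong type.
stopifnot(is.character(silo_df$year), is.numeric(silo_df$coll))
```

Running the checks up front makes a type problem surface at the local silo rather than inside the call.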
Value
A list of two data frames: the filled differences data frame and the trends data frame. Use $diff_df to access the filled differences data frame and $trends_data to access the trends data frame.
Details
Covariates at the local silo should be renamed to match the spelling used in the empty_diff_df.csv.
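One way to do this renaming in base R is sketched below. The local column names (Male, is_black) and the lookup vector rename_map are hypothetical; the target spellings (male, black) are assumed to be the ones listed in the received empty_diff_df.csv.

```r
# Hypothetical local data whose covariate names differ from the received file.
silo_df <- data.frame(
  year = c("1990", "1991"),
  coll = c(0, 1),
  Male = c(1, 0),
  is_black = c(0, 1)
)

# Lookup from local spelling to the spelling assumed in empty_diff_df.csv.
rename_map <- c(Male = "male", is_black = "black")

# Rename only the columns found in the lookup; all others keep their names.
hits <- names(silo_df) %in% names(rename_map)
names(silo_df)[hits] <- unname(rename_map[names(silo_df)[hits]])

names(silo_df)
```

Keeping the mapping in a single named vector makes it easy to review against the covariates column of the received file.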
Examples
# Load data
silo_data <- silo71
empty_diff_path <- system.file("extdata/staggered", "empty_diff_df.csv",
                               package = "undidR")
# Run `undid_stage_two()`
results <- undid_stage_two(
  empty_diff_filepath = empty_diff_path,
  silo_name = "71",
  silo_df = silo_data,
  time_column = "year",
  outcome_column = "coll",
  silo_date_format = "yyyy"
)
#> filled_diff_df_71.csv saved to: /tmp/Rtmp8pxDCm/filled_diff_df_71.csv
#> trends_data_71.csv saved to: /tmp/Rtmp8pxDCm/trends_data_71.csv
# View results
head(results$diff_df)
#> silo_name gvar treat diff_times gt RI start_time end_time
#> 1 71 1991 1 1991;1990 1991;1991 0 1989-01-01 2000-01-01
#> 2 71 1991 1 1992;1990 1991;1992 0 1989-01-01 2000-01-01
#> 3 71 1991 1 1993;1990 1991;1993 0 1989-01-01 2000-01-01
#> 4 71 1991 1 1994;1990 1991;1994 0 1989-01-01 2000-01-01
#> 5 71 1991 1 1995;1990 1991;1995 0 1989-01-01 2000-01-01
#> 6 71 1991 1 1996;1990 1991;1996 0 1989-01-01 2000-01-01
#> diff_estimate diff_var diff_estimate_covariates diff_var_covariates
#> 1 0.12916667 0.009447555 0.116348472 0.009397021
#> 2 0.06916667 0.008602222 0.069515594 0.008272557
#> 3 0.02546296 0.007975422 0.005133291 0.007767637
#> 4 0.02703901 0.008564103 0.029958108 0.008338060
#> 5 0.17361111 0.008686695 0.168621303 0.007994236
#> 6 0.13594633 0.008204221 0.146360101 0.007834932
#> covariates date_format freq
#> 1 asian;black;male yyyy 1 year
#> 2 asian;black;male yyyy 1 year
#> 3 asian;black;male yyyy 1 year
#> 4 asian;black;male yyyy 1 year
#> 5 asian;black;male yyyy 1 year
#> 6 asian;black;male yyyy 1 year
head(results$trends_data)
#> silo_name treatment_time time mean_outcome mean_outcome_residualized
#> 1 71 1991 1989 0.3061224 0.1998800
#> 2 71 1991 1990 0.2708333 0.1502040
#> 3 71 1991 1991 0.4000000 0.1949109
#> 4 71 1991 1992 0.3400000 0.1876636
#> 5 71 1991 1993 0.2962963 0.1750943
#> 6 71 1991 1994 0.2978723 0.1195425
#> covariates date_format freq
#> 1 asian;black;male yyyy 1 year
#> 2 asian;black;male yyyy 1 year
#> 3 asian;black;male yyyy 1 year
#> 4 asian;black;male yyyy 1 year
#> 5 asian;black;male yyyy 1 year
#> 6 asian;black;male yyyy 1 year
# Clean up temporary files
unlink(file.path(tempdir(), c("filled_diff_df_71.csv",
                              "trends_data_71.csv")))