Reads the received empty_diff_df.csv, computes the corresponding differences in mean outcomes at the local silo, and saves them as filled_diff_df_$silo_name.csv. Also saves the trends data as trends_data_$silo_name.csv.

Usage

undid_stage_two(
  empty_diff_filepath,
  silo_name,
  silo_df,
  time_column,
  outcome_column,
  silo_date_format,
  consider_covariates = TRUE,
  filepath = tempdir()
)

Arguments

empty_diff_filepath

A character filepath to the empty_diff_df.csv.

silo_name

A character indicating the name of the local silo. Ensure spelling is the same as it is written in the empty_diff_df.csv.

silo_df

A data frame of the local silo's data. Ensure any covariates are spelled the same in this data frame as they are in the empty_diff_df.csv.

time_column

A character which indicates the name of the column in the silo_df which contains the date data. Ensure the time_column references a column of character values.

outcome_column

A character which indicates the name of the column in the silo_df which contains the outcome of interest. Ensure the outcome_column references a column of numeric values.

silo_date_format

A character indicating the format in which the date strings in the time_column are written, for example "yyyy".

consider_covariates

An optional logical parameter. If set to FALSE, all computations involving the covariates are skipped. Defaults to TRUE.

filepath

Character value indicating the filepath to save the CSV files. Defaults to tempdir().
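
The type requirements on time_column and outcome_column can be checked before calling the function. The following is a minimal sketch in base R, assuming a hypothetical data frame whose year column was read in as numeric. The column names are borrowed from the Examples section and the values are made up for illustration.

```r
# Hypothetical silo data (values invented for illustration only)
silo_df <- data.frame(
  year = c(1989, 1990, 1991),
  coll = c(0, 1, 1)
)

# time_column must reference character values, so convert if necessary
if (!is.character(silo_df$year)) {
  silo_df$year <- as.character(silo_df$year)
}

# outcome_column must reference numeric values; stop early if it does not
stopifnot(is.numeric(silo_df$coll))
```

Converting up front keeps the date strings consistent with the value passed as silo_date_format ("yyyy" in this sketch).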

Value

A list of two data frames: the filled differences data frame and the trends data frame. Use $diff_df to access the filled differences data frame and $trends_data to access the trends data frame.

Details

Covariates at the local silo should be renamed to match the spelling used in the empty_diff_df.csv.
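
One way to do the renaming in base R is sketched below. It assumes a hypothetical local data frame whose covariate columns are capitalized, while the empty_diff_df.csv uses the lowercase names asian, black and male. The rename_map object is illustrative and not part of the package.

```r
# Hypothetical local data whose covariate names differ in spelling
# from those used in empty_diff_df.csv (assumed: asian, black, male)
silo_df <- data.frame(
  Asian = c(0, 1),
  Black = c(1, 0),
  Male  = c(1, 1)
)

# Map local names (left) to the names used in empty_diff_df.csv (right)
rename_map <- c(Asian = "asian", Black = "black", Male = "male")

# Rename only the columns that appear in the map; others are left as-is
matched <- names(silo_df) %in% names(rename_map)
names(silo_df)[matched] <- unname(rename_map[names(silo_df)[matched]])
```

Keeping the mapping in a single named vector makes it easy to review the spelling against the covariates column of the empty_diff_df.csv.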

Examples

# Load data
silo_data <- silo71
empty_diff_path <- system.file("extdata/staggered", "empty_diff_df.csv",
                               package = "undidR")

# Run `undid_stage_two()`
results <- undid_stage_two(
  empty_diff_filepath = empty_diff_path,
  silo_name = "71",
  silo_df = silo_data,
  time_column = "year",
  outcome_column = "coll",
  silo_date_format = "yyyy"
)
#> filled_diff_df_71.csv saved to: /tmp/Rtmp8pxDCm/filled_diff_df_71.csv
#> trends_data_71.csv saved to: /tmp/Rtmp8pxDCm/trends_data_71.csv

# View results
head(results$diff_df)
#>   silo_name gvar treat diff_times        gt RI start_time   end_time
#> 1        71 1991     1  1991;1990 1991;1991  0 1989-01-01 2000-01-01
#> 2        71 1991     1  1992;1990 1991;1992  0 1989-01-01 2000-01-01
#> 3        71 1991     1  1993;1990 1991;1993  0 1989-01-01 2000-01-01
#> 4        71 1991     1  1994;1990 1991;1994  0 1989-01-01 2000-01-01
#> 5        71 1991     1  1995;1990 1991;1995  0 1989-01-01 2000-01-01
#> 6        71 1991     1  1996;1990 1991;1996  0 1989-01-01 2000-01-01
#>   diff_estimate    diff_var diff_estimate_covariates diff_var_covariates
#> 1    0.12916667 0.009447555              0.116348472         0.009397021
#> 2    0.06916667 0.008602222              0.069515594         0.008272557
#> 3    0.02546296 0.007975422              0.005133291         0.007767637
#> 4    0.02703901 0.008564103              0.029958108         0.008338060
#> 5    0.17361111 0.008686695              0.168621303         0.007994236
#> 6    0.13594633 0.008204221              0.146360101         0.007834932
#>         covariates date_format   freq
#> 1 asian;black;male        yyyy 1 year
#> 2 asian;black;male        yyyy 1 year
#> 3 asian;black;male        yyyy 1 year
#> 4 asian;black;male        yyyy 1 year
#> 5 asian;black;male        yyyy 1 year
#> 6 asian;black;male        yyyy 1 year
head(results$trends_data)
#>   silo_name treatment_time time mean_outcome mean_outcome_residualized
#> 1        71           1991 1989    0.3061224                 0.1998800
#> 2        71           1991 1990    0.2708333                 0.1502040
#> 3        71           1991 1991    0.4000000                 0.1949109
#> 4        71           1991 1992    0.3400000                 0.1876636
#> 5        71           1991 1993    0.2962963                 0.1750943
#> 6        71           1991 1994    0.2978723                 0.1195425
#>         covariates date_format   freq
#> 1 asian;black;male        yyyy 1 year
#> 2 asian;black;male        yyyy 1 year
#> 3 asian;black;male        yyyy 1 year
#> 4 asian;black;male        yyyy 1 year
#> 5 asian;black;male        yyyy 1 year
#> 6 asian;black;male        yyyy 1 year

# Clean up temporary files
unlink(file.path(tempdir(), c("filled_diff_df_71.csv",
                             "trends_data_71.csv")))