Reshape dataset to wide format - tidyr version
reshape_wide_tidyr(
  df,
  case_id_var,
  time_id_var,
  timevar_max = 6,
  datsize = Inf
)
df: Data frame in long format.
case_id_var: String with the name of the ID variable that identifies the same patient, e.g. case_id_var = "PUBCSNUM" for SEER data.
time_id_var: String with the name of the variable that numbers the diagnoses per patient, e.g. time_id_var = "SEQ_NUM" for SEER data.
timevar_max: Numeric; default 6. Maximum number of cases per ID. All tumors > timevar_max are deleted before reshaping.
datsize: Number of rows to take from df. This parameter is mainly for testing. Default is Inf, so that df is fully processed.
Returns df reshaped to wide format, with one row per ID.
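The reshaping idea can be illustrated on a toy dataset. This is only a sketch built on tidyr::pivot_wider, not the package's own implementation; the data frame `long` and its columns `id`, `seq_num`, `site` and `year` are hypothetical and chosen for illustration. It assumes that tumors beyond timevar_max are identified by comparing the sequence number with timevar_max.

```r
library(tidyr)

# Hypothetical toy data in long format: one row per tumor.
long <- data.frame(
  id      = c("A", "A", "A", "B"),
  seq_num = c(1, 2, 3, 1),
  site    = c("C50", "C18", "C44", "C54"),
  year    = c(1992, 1999, 2005, 2001),
  stringsAsFactors = FALSE
)

timevar_max <- 2

# Assumption: tumors with a sequence number above timevar_max are dropped
# before reshaping.
long_trimmed <- long[long$seq_num <= timevar_max, ]

# One row per id; each value column gets a ".<seq_num>" suffix.
wide <- pivot_wider(
  long_trimmed,
  id_cols     = id,
  names_from  = seq_num,
  values_from = c(site, year),
  names_sep   = "."
)

print(wide)
```

Patient "A" keeps only the first two tumors, and patient "B", who has a single tumor, gets NA in the ".2" columns.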
data(us_second_cancer)
msSPChelpR::reshape_wide_tidyr(us_second_cancer,
                               case_id_var = "fake_id",
                               time_id_var = "SEQ_NUM",
                               timevar_max = 2,
                               datsize = 10000)
#> Long dataset had too many cases per patient. Wide dataset is limited to 2 cases per id as defined in timevar_max option.
#> # A tibble: 6,003 × 29
#> fake_id registry.1 sex.1 race.1 datebirth.1 t_datediag.1 t_site_icd.1 t_dco.1
#> <chr> <chr> <chr> <chr> <date> <date> <chr> <chr>
#> 1 100004 SEER Reg … Male White 1926-01-01 1992-07-15 C50 histol…
#> 2 100034 SEER Reg … Male White 1979-01-01 2000-06-15 C50 histol…
#> 3 100037 SEER Reg … Fema… White 1938-01-01 1996-01-15 C54 histol…
#> 4 100038 SEER Reg … Male White 1989-01-01 1991-04-15 C50 histol…
#> 5 100039 SEER Reg … Fema… White 1946-01-01 2003-08-15 C50 histol…
#> 6 100047 SEER Reg … Fema… White 1927-01-01 1998-04-15 C50 histol…
#> 7 100057 SEER Reg … Male Black 1961-01-01 2010-04-15 C18 histol…
#> 8 100060 SEER Reg … Fema… White 1947-01-01 2003-08-15 C50 histol…
#> 9 100063 SEER Reg … Fema… Black 1938-01-01 1995-12-15 C50 histol…
#> 10 100073 SEER Reg … Male White 1960-01-01 1993-11-15 C44 histol…
#> # ℹ 5,993 more rows
#> # ℹ 21 more variables: t_hist.1 <int>, fc_age.1 <int>, datedeath.1 <date>,
#> # p_alive.1 <chr>, p_dodmin.1 <date>, fc_agegroup.1 <chr>,
#> # t_yeardiag.1 <chr>, registry.2 <chr>, sex.2 <chr>, race.2 <chr>,
#> # datebirth.2 <date>, t_datediag.2 <date>, t_site_icd.2 <chr>, t_dco.2 <chr>,
#> # t_hist.2 <int>, fc_age.2 <int>, datedeath.2 <date>, p_alive.2 <chr>,
#> # p_dodmin.2 <date>, fc_agegroup.2 <chr>, t_yeardiag.2 <chr>
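The column count in the printout above can be checked by hand: the wide tibble holds the ID column plus one copy of each per-tumor variable for every retained sequence number. A small sketch of that arithmetic follows; the variable names are illustrative, and the count of 14 is taken from the ".1" columns listed in the output.

```r
n_id_cols   <- 1L   # fake_id
n_vars      <- 14L  # per-tumor variables, counted from the ".1" columns above
timevar_max <- 2L   # value passed in the example call

# ID column plus one block of variables per retained tumor.
expected_cols <- n_id_cols + n_vars * timevar_max
print(expected_cols)
#> [1] 29
```

The same check, `ncol(result) == 1 + 14 * 2`, could be applied to a stored result, assuming the time_id_var column itself is absorbed into the column suffixes as the output suggests.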