NEWS.md
histgroup_iarc() to create variable for groups of malignant neoplasms considered to be histologically ‘different’ for the purpose of defining multiple tumors, ICD-O-3 (see #100)
quiet argument to suppress rlang::warn() and rlang::inform() messages. You can use this when you have checked your results for correctness and want to reduce message output, but keep the progress bars.
asir(): add World Standard Population 2000-2025, available with option std_pop == "WHO2000", as described here: https://seer.cancer.gov/stdpopulations/world.who.html
sir_byfutime() gains new argument expect_missing_refstrata_df. You can define another dataframe that contains strata expected to be missing from refrates_df (because they are not explicitly coded with incidence = 0). This can be helpful if refrates_df has a lot of strata and strata with 0 incidence have been removed to save storage space. Internally, the rows of expect_missing_refstrata_df will be appended to refrates_df. This reduces the number of lines reported in attribute problems_missing_ref_strata. Default setting is expect_missing_refstrata_df = NULL.
data("us_second_cancer") gains new variable t_hist on histology, i.e. ICD-O-3 code on tumor morphology (4 digits)
calc_refrates() more robust for missing race_var (Closes #89)
calc_refrates() using calc_totals == TRUE (Closes #90)
calc_refrates() using numeric versions of fill_sites (Closes #92)
asir() that throws error for variable not needed (Closes #95)
cli
verb.() syntax from tidytable (Closes #94)
calc_refrates() to calculate age-, sex-, region-, year-specific reference rates from a long format dataframe with cancer cases that are counted for incident cases and then matched with a reference population. The resulting reference rates dataframe can directly be used with the sir_byfutime() function.
dattype = NULL and thus are more flexible to take other source data types (Closes #73)
asir, calc_futime*, calc_refrates, ir_crosstab_byfutime, pat_status*, renumber_time_id*, and sir_byfutime now default to dattype = NULL. If you relied on the automatic variable naming feature, you need to add dattype = "seer" or dattype = "zfkd" to your function call.
problems_missing_count_strata and problems_missing_fu_strata (Closes #80)
sir_byfutime():
results_df
sir_ratio() and related sir_ratio_lci() and sir_ratio_uci() to calculate ratio of two SIRs/SMRs to get relative risk and confidence limits for this ratio.
reshape_long_tt() ⇒ the _tt variants usually have smaller memory use than tidyverse and data.table variants. Execution time is usually much faster than tidyverse and comparable to or a little slower than the data.table variant.
summarize_sir_results():
sir_byfutime()
summarize_sir_results():
summarize_site == TRUE. Previously the results incorrectly counted each site multiple times. (Closes #62)
pat_status():
dattype = "zfkd"
data("standard_population")
data("population_us") (Closes #58)
sir_byfutime(): change output of integer columns to numeric to fix bug in summarize_sir_results() (Closes #59)
vignette("introduction")
reshape_wide_tt(), renumber_time_id_tt(), pat_status_tt(), vital_status_tt(), calc_futime_tt() ⇒ the _tt variants usually have smaller memory use than tidyverse and data.table variants. Execution time is usually much faster than tidyverse and comparable to or a little slower than the data.table variant.
sir_byfutime():
race_var to optionally stratify SIR calculations by race.
summarize_sir_results():
sir_byfutime() function
site_var_name
sir_byfutime():
add_total_row and add_total_fu are replaced by calc_total_row and calc_total_fu. These are logical parameters now. The positioning of total rows and columns is now handled completely by the summarize_sir_results() function. There, total rows can be set to top or bottom and total columns to left or right.
expcount_src including related parameters stdpop_df, refpop_df, std_pop, truncate_std_pop and pyar_var have been removed. sir_byfutime() now calculates expected counts only from reference rates, not from within the cohort of the dataset. To calculate expected counts based on the cohort, a new function create_refrates will be added in the future. (#41)
collapse_ci has been removed and added to summarize_sir_results() instead.
icdcat_var to site_var
agegroup_var to age_var
expcount_src, futime_src, stdpop_df, refpop_df, std_pop, truncate_std_pop, pyar_var, icdcat_var, collapse_ci have been removed to simplify the function ⇒ make sure you remove these arguments from your sir_byfutime() function calls.
sir():
sir_byfutime(). To migrate your former sir() function calls, you can simply use sir_byfutime(, futime_breaks = "none"), which will yield the same results.
summarize_sir_results():
summarize_icdcat to summarize_site
reshape_long_tidyr():
var_selection is deprecated. Please select variables before running the reshape_long_* functions.
asir():
agegroup_var to age_var
icdcat_var to site_var
pat_status(), pat_status_tt(), vital_status(), and vital_status_tt():
ir_crosstab_byfutime():
futime_breaks now uses breaks in years instead of months as previously.
futime_var is now follow-up time in years
icdcat_var to site_var. This needs a manual update of function calls of sir_byfutime() and asir(), if the option is specified.
t_icdcat to t_site. So the reference data frames used will need to have a t_site column.
renumber_time_id_dt(), pat_status_dt(), reshape_long_dt(), reshape_wide_dt(), vital_status_dt() have been removed for simplicity; please use the tidytable variants instead, i.e. reshape_wide_tt(), renumber_time_id_tt(), pat_status_tt(), vital_status_tt(), calc_futime_tt(). They will give the same data.table output and the same performance.
reshape_wide() with option chunks is used. Closes #1.
reshape_wide_tidyr() and reshape_wide_tt() is now preserved. Closes #31.
renumber_time_id() and make sure that new_time_id_var is returned as integer.
pat_status_*(., check = TRUE) option
sir_byfutime() so that PYARs do not get lost before running summary function
sir_byfutime() now also gives correct results if range of futime_breaks is not 0-Inf but smaller
renumber_time_id() function; use sorting by date of diagnosis instead of old time_id_var
reshape_wide_tidyr() function
reshape_wide_dt() function, which is much faster now and uses data.table::dcast instead of stats::reshape
pat_status() and pat_status_dt() functions
summarize_sir_results() is now functional
vignette("introduction")
pat_status() and pat_status_dt() functions
renumber_time_id() that broke functions pat_status() and calc_futime()
tidyselect::all_of in summarize_sir_results()
vignette("patstatus_futime")
tidyselect::all_of for vector-based variable selection
vital_status_dt and pat_status_dt
data.table
reshape_long function work
add_total_row work, even if option ybreak_vars = "none"
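
The expect_missing_refstrata_df entry for sir_byfutime() above says that its rows are appended to refrates_df so that fewer strata are reported as missing. The following base R sketch only illustrates that idea and does not reproduce the package's internal code. The column names, the example values, the zero rate in the appended row, and the helper strata_key() are all hypothetical; the columns actually required by sir_byfutime() may differ.

```r
# Hypothetical reference rates in which zero-incidence strata were dropped
refrates_df <- data.frame(
  t_site = c("C50", "C18"),
  sex = c("Female", "Male"),
  age = c("50 - 54", "50 - 54"),
  incidence_crude_rate = c(120.5, 45.2),
  stringsAsFactors = FALSE
)

# Hypothetical strata that are expected to be missing because incidence is 0
expect_missing_refstrata_df <- data.frame(
  t_site = "C50",
  sex = "Male",
  age = "50 - 54",
  incidence_crude_rate = 0,
  stringsAsFactors = FALSE
)

# Append the expected-missing strata to the reference rates
combined_refrates <- rbind(refrates_df, expect_missing_refstrata_df)

# Strata that the cohort needs (hypothetical)
needed_strata <- data.frame(
  t_site = c("C50", "C50", "C18"),
  sex = c("Female", "Male", "Male"),
  age = rep("50 - 54", 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one key string per stratum
strata_key <- function(d) paste(d$t_site, d$sex, d$age, sep = "|")

# Strata without a reference rate before and after appending
missing_before <- setdiff(strata_key(needed_strata), strata_key(refrates_df))
missing_after <- setdiff(strata_key(needed_strata), strata_key(combined_refrates))
```

In this sketch, one stratum lacks a reference rate before appending and none does afterwards, which mirrors the reduced number of lines in the problems_missing_ref_strata attribute.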