Insert summary rows between groups of rows in a data frame — Group

This function sorts a data frame by the groups listed in group_ID_col and inserts a summary (mean) row after each group.

For this to work, the data frame must have a group ID column that lists the group each item belongs to. All data columns that are to be averaged must also be numeric.
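As an illustration of the layout described above, a minimal input might look like the following. The column names (item_name, group_id, energy_kcal, protein_g) and the values are hypothetical, chosen only for this sketch; the check at the end is a simple way to confirm that the columns to be averaged are numeric before calling the function.

```r
# Hypothetical example data: 'group_id' plays the role of the group ID column,
# 'energy_kcal' and 'protein_g' are the numeric data columns to be averaged.
example_df <- data.frame(
  item_name   = c("Maize, white", "Maize, yellow", "Rice, brown", "Rice, white"),
  group_id    = c("maize", "maize", "rice", "rice"),
  energy_kcal = c(360, 365, 350, 355),
  protein_g   = c(9.0, 9.4, 7.5, 6.8),
  stringsAsFactors = FALSE
)

# Pre-flight check: are all columns that should be averaged numeric?
data_cols  <- c("energy_kcal", "protein_g")
numeric_ok <- vapply(example_df[data_cols], is.numeric, logical(1))
numeric_ok
```

Any column reported as FALSE here would need converting (for example with as.numeric) before it can be averaged.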

Optionally, items can also be sorted within their groups using the secondary_sort_col parameter.
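The two-level ordering described here (group first, then a secondary column within each group) can be sketched in base R with order(). This is only an illustration of the idea, not the function's own code, and the column names are hypothetical.

```r
# Hypothetical data: 'group' is the group ID column, 'item' the secondary sort column.
df <- data.frame(
  item  = c("b2", "a2", "b1", "a1"),
  group = c("B", "A", "B", "A"),
  value = c(4, 2, 3, 1),
  stringsAsFactors = FALSE
)

# order() sorts by its first argument, then breaks ties using the second.
sorted_df <- df[order(df$group, df$item), ]
rownames(sorted_df) <- NULL

sorted_df$item   # "a1" "a2" "b1" "b2"
```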

Usage

Group_Summariser(
  df,
  group_ID_col,
  secondary_sort_col,
  input_weighting_column,
  weighting_leniency = 0,
  blank_cols = c(),
  sep_row = FALSE,
  seq_col = FALSE,
  weighting_col = FALSE,
  round_weighting = TRUE,
  round_averages = FALSE,
  na_rm = TRUE
)

Arguments

df: Required - The data.frame that summary rows need to be inserted into.
group_ID_col: Required - The column name specifying the groups that summary rows are created for.
secondary_sort_col: Optional - Specify the column that the results should be sorted by after they're sorted into groups.
input_weighting_column: Optional - Specify a column which contains set weightings. If selected, these weightings will be used in the summariser instead of a set average. Where partial weightings are given for an item, the remaining matches will have their weightings split evenly between them.
weighting_leniency: Optional - default: 0 - Introduces some forgiveness in the 'group weightings must equal 1' rule. In some cases, using existing weightings can lead to a total weighting value that is not equal to 1 (particularly if the weightings were rounded before using the Group Summariser). The input value sets the range around 1 that the tool will accept - e.g. an input value of 0.03 means that the weighting total can be anywhere from 0.97 to 1.03.
blank_cols: Optional - Specify a list of column names that you wish to leave blank on the average rows (e.g. metadata). The column names listed here must exactly match the columns you want excluded, given as a character vector; e.g. c("FCT Food Item Code", "FCT Food Name") for the columns FCT Food Item Code and FCT Food Name. It is recommended to run the function once, inspect the results, and then decide which columns to list here.
sep_row: Optional - default: 'FALSE' - If set to TRUE, the Summariser will insert an empty row after each summary row, to aid reading and separation.
seq_col: Optional - default: 'FALSE' - If set to TRUE, the Summariser will insert a sequence column, numbering each item that goes into a summary row.
weighting_col: Optional - default: 'FALSE' - If set to TRUE, the Summariser will insert a weighting factor for each item that goes into a summary row.
round_weighting: Optional - default: 'TRUE' - If set to TRUE, the Summariser will round each weighted value to 2 decimal places.
round_averages: Optional - default: 'FALSE' - If set to TRUE, the Summariser will round each summarised average to 2 decimal places.
na_rm: Optional - default: 'TRUE' - If set to TRUE, the Summariser will ignore NA values when calculating each summary value. If set to FALSE, an NA value in a column will result in the summary row for that column being NA as well.
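The weighting rules described for input_weighting_column and weighting_leniency can be sketched as follows. This is a rough illustration under stated assumptions, not the function's actual implementation: complete_weights is a hypothetical helper, and it assumes that missing weightings within a group are recorded as NA.

```r
# Hypothetical helper (not part of the documented function).
# Assumes missing weightings within one group are stored as NA.
complete_weights <- function(w, leniency = 0) {
  missing_w <- is.na(w)
  if (any(missing_w)) {
    # Share whatever is left of 1 evenly among the items with no weighting.
    w[missing_w] <- (1 - sum(w[!missing_w])) / sum(missing_w)
  }
  total <- sum(w)
  # Accept totals within 1 +/- leniency (1e-9 absorbs floating-point error).
  if (abs(total - 1) > leniency + 1e-9) {
    stop("Group weightings sum to ", total, ", which is outside the accepted range")
  }
  w
}

# Partial weightings: the remaining 0.5 is split evenly between two items.
w <- complete_weights(c(0.5, NA, NA))   # 0.50 0.25 0.25

# Rounded weightings totalling 0.99 pass with a leniency of 0.03.
w_rounded <- complete_weights(c(0.33, 0.33, 0.33), leniency = 0.03)

# A weighted summary value for one data column.
weighted.mean(c(10, 20, 30), w)         # 17.5
```

With the default leniency of 0, the second call would stop with an error, because 0.99 is not equal to 1.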

Value

A data.frame that mirrors df, but after each group a summary row is inserted, containing the mean of the data columns.
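To give a feel for the shape of the returned data frame, the sketch below builds a comparable result in base R for a tiny, made-up input. insert_group_means is a hypothetical stand-in written for this illustration; it covers only unweighted means and the na_rm behaviour, and is not the function's own code.

```r
# Hypothetical stand-in: appends one mean row after each group.
insert_group_means <- function(df, group_col, na_rm = TRUE) {
  df <- df[order(df[[group_col]]), , drop = FALSE]
  numeric_cols <- names(df)[vapply(df, is.numeric, logical(1))]
  pieces <- lapply(split(df, df[[group_col]]), function(g) {
    summary_row <- g[1, , drop = FALSE]
    summary_row[1, ] <- NA                         # blank every column first
    summary_row[[group_col]] <- g[[group_col]][1]  # keep the group label
    for (col in numeric_cols) {
      summary_row[[col]] <- mean(g[[col]], na.rm = na_rm)
    }
    rbind(g, summary_row)
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# Made-up input with one missing value in group "B".
df <- data.frame(
  item  = c("a1", "b1", "a2", "b2"),
  group = c("A", "B", "A", "B"),
  value = c(1, 3, 2, NA),
  stringsAsFactors = FALSE
)

result <- insert_group_means(df, "group")
result$value   # 1.0 2.0 1.5 3.0 NA 3.0

# With na_rm = FALSE the summary for group "B" becomes NA.
strict <- insert_group_means(df, "group", na_rm = FALSE)
strict$value   # 1.0 2.0 1.5 3.0 NA NA
```

Rows 3 and 6 of the result are the inserted summary rows; the non-numeric item column is left as NA on those rows.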