This function examines all character columns in a data.frame (typically one
that was read from a comma-separated text file using a reader function such as
readr::read_csv()) and generates a specification suitable for reading
those columns from the underlying file as factors.
The key benefit is that it finds all variables that use the same factor
levels and groups them together, so editing the col_spec to reorder
factor levels or make other changes is straightforward.
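To illustrate the grouping idea, here is a minimal base R sketch. It is not the function's actual implementation; the toy data frame, its column names, and the approach of keying each column by its sorted unique values are all assumptions made for illustration.

```r
# Toy data frame; the column names and values are made up for illustration.
d <- data.frame(
  q1   = c("yes", "no", "yes"),
  q2   = c("no", "no", "yes"),
  size = c("small", "large", "medium"),
  stringsAsFactors = FALSE
)

# Keep only the character columns.
chr_cols <- names(d)[vapply(d, is.character, logical(1))]

# Build one key per column from its sorted unique values
# (sort() drops NA, so missing values do not affect the key).
level_key <- vapply(
  d[chr_cols],
  function(x) paste(sort(unique(x)), collapse = "|"),
  character(1)
)

# Columns that share a key share the same candidate factor levels.
groups <- split(chr_cols, level_key)
groups
```

In this sketch q1 and q2 end up in one group because both contain only "no" and "yes", while size forms a group of its own.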
The result is returned as a catty vector to provide a more readable output
by default.
The output can then be edited before the col_spec is used to read the data
in fresh from the CSV file. Assuming the cols() result is assigned to a
variable cspec, one might have the following:
d <- read_csv(csv_file)
cb_as_col_spec_factors(d)
# edit the output to fit your needs
cspec <- cols(...)
d <- read_csv(csv_file, col_types = cspec)
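As a rough idea of what an edited specification might look like, the sketch below builds a small col_spec by hand with readr and uses it to re-read a file. The exact text produced by cb_as_col_spec_factors() is not shown in this page, so the file contents, the column names, and the shared yes_no vector here are hypothetical.

```r
library(readr)

# Hypothetical CSV written to a temporary file so the example is self-contained.
csv_file <- tempfile(fileext = ".csv")
writeLines(c("q1,q2,id", "yes,no,1", "no,yes,2"), csv_file)

# Levels shared by several columns are defined once, in the desired order.
yes_no <- c("no", "yes")

# A hand-edited specification: both question columns use the same levels.
cspec <- cols(
  q1 = col_factor(levels = yes_no),
  q2 = col_factor(levels = yes_no),
  id = col_integer()
)

# Re-read the file so the columns arrive as factors with the chosen level order.
d <- read_csv(csv_file, col_types = cspec)
levels(d$q1)
```

Keeping the shared levels in one vector means a reordering only has to be made in one place.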