FR: Add function to tidy non-mutually exclusive factors surveyed over multiple inputs #384

I'd like to see a function that turns a ragged array into a sparse one. This typically comes up when a "factor" with non-mutually exclusive choices has been recorded using a group of drop-downs.

For example, suppose you have such a "factor" with legal values `A`/`B`/`C`/`D` recorded over three variables `col1`, `col2` and `col3`:

|id|col1|col2|col3|
|--|----|----|----|
|1|A|B|C|
|2|B|C|NA|
|3|D|NA|NA|
|4|B|D|NA|

Calling such a function, and indicating that `col1`, `col2` and `col3` encode the same information, would yield:

|id|A|B|C|D|
|--|----|----|----|----|
|1|T|T|T|F|
|2|F|T|T|F|
|3|F|F|F|T|
|4|F|T|F|T|

Options would include setting a prefix for the new variable names to avoid collisions, and optionally creating an `NA` column.
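To make the request concrete, here is a rough base R sketch of the behavior described above. The name `binarize_sketch()` and its arguments `cols`, `prefix` and `keep_na` are hypothetical, chosen only for illustration; this is not an existing tidyr function and not the code from the PR.

```r
# Hypothetical helper illustrating the requested behavior (not part of tidyr).
binarize_sketch <- function(data, cols, prefix = "", keep_na = FALSE) {
  # Character matrix with one row per observation and one column per input
  values <- as.matrix(data[cols])
  storage.mode(values) <- "character"
  # Every distinct non-missing value becomes one output column
  levels_found <- sort(unique(as.vector(values[!is.na(values)])))
  # TRUE when at least one of the inputs holds that value for the row
  flags <- lapply(levels_found, function(lv) rowSums(values == lv, na.rm = TRUE) > 0)
  names(flags) <- paste0(prefix, levels_found)
  if (keep_na) {
    # Optional column: TRUE when at least one input is missing for the row
    flags[[paste0(prefix, "NA")]] <- rowSums(is.na(values)) > 0
  }
  # Keep the untouched columns, then append the logical columns
  cbind(data[setdiff(names(data), cols)], as.data.frame(flags, check.names = FALSE))
}

survey <- data.frame(
  id = 1:4,
  col1 = c("A", "B", "D", "B"),
  col2 = c("B", "C", NA, "D"),
  col3 = c("C", NA, NA, NA),
  stringsAsFactors = FALSE
)

result <- binarize_sketch(survey, c("col1", "col2", "col3"))
with_na <- binarize_sketch(survey, c("col1", "col2", "col3"), prefix = "hx_", keep_na = TRUE)
```

Here `result` has the columns `id`, `A`, `B`, `C` and `D`, and `with_na` shows how a prefix would keep the new names from colliding with existing ones.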

I have run into this use case many times in medical surveys, where disease history is poorly recorded using multiple drop-down lists or sets of checkboxes. IIRC, Google surveys also treat sets of checkboxes this way, with one column containing semicolon-separated values. That case can be dealt with using a call to `separate` followed by a call to `binarize`.
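For the semicolon-separated case, the first step might look like the sketch below. The data frame `responses` and its `history` column are made up for illustration, and the number of target columns has to be known in advance.

```r
library(tidyr)

# Made-up checkbox export: one column holding semicolon-separated choices
responses <- data.frame(
  id = 1:4,
  history = c("A;B;C", "B;C", "D", "B;D"),
  stringsAsFactors = FALSE
)

# Split into three columns; rows with fewer choices are padded with NA on the right
split_cols <- separate(
  responses, history,
  into = c("col1", "col2", "col3"),
  sep = ";", fill = "right"
)
```

`split_cols` then has the same `id`/`col1`/`col2`/`col3` layout as the example table, which is the shape the requested function would take as input.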

This behavior can be achieved with a combination of `spread` and `gather`, but that can be CPU- and memory-heavy on large data frames.
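For reference, one possible `gather`/`spread` workaround is sketched below on the example data. It is only one way to combine the two verbs, and the intermediate column names `source`, `level` and `present` are arbitrary.

```r
library(tidyr)
library(dplyr)

survey <- data.frame(
  id = 1:4,
  col1 = c("A", "B", "D", "B"),
  col2 = c("B", "C", NA, "D"),
  col3 = c("C", NA, NA, NA),
  stringsAsFactors = FALSE
)

wide <- survey %>%
  # Long format: one row per (id, recorded value); missing entries are dropped,
  # so an id whose inputs are all NA would disappear from the result
  gather(key = "source", value = "level", col1, col2, col3, na.rm = TRUE) %>%
  select(-source) %>%
  # Guard against the same value being recorded twice for one id
  distinct() %>%
  mutate(present = TRUE) %>%
  # Back to wide format: one logical column per value, FALSE where absent
  spread(key = level, value = present, fill = FALSE)
```

The intermediate long table holds one row per non-missing cell, which is where the extra memory goes on large inputs.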

There is a (pre-tidyeval) implementation in PR #288.
