A recent study shows why health data should be collected from a representative sample of people rather than from larger convenience samples of individuals who already own wearable devices. The research, conducted by Ritika Chaturvedi and her team, focused on the American Life in Realtime (ALiR) study, which provided Fitbits and tablets to participants drawn from the Understanding America Study, a probability sample of adults.
Convenience samples often skew toward wealthier, urban, white, and physically fit individuals. ALiR instead aimed for representation across race, education, and income levels, so that the resulting data would not leave out minorities, older adults, or lower-income groups.
One of the key findings came from comparing COVID-19 detection models trained on ALiR data with models trained on data from the NIH’s All of Us program, whose participants already owned wearable devices. ALiR’s model performed consistently well across demographic subgroups, while the All of Us model performed 22-40% worse in older women and non-white populations.
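To make the subgroup comparison concrete, the sketch below shows one way such a gap can be computed: the percentage by which a subgroup's score falls below a reference score. It is an illustration only. The article does not say which performance metric the authors used or whether the 22-40% figure is a relative or an absolute difference, so the relative-drop formula, the function names, and the scores here are all assumptions made up for the example, not values from the study.

```python
from typing import Dict


def relative_drop(reference: float, subgroup: float) -> float:
    """Percent by which a subgroup's score falls below the reference score."""
    if reference <= 0:
        raise ValueError("reference score must be positive")
    return 100.0 * (reference - subgroup) / reference


def subgroup_gaps(scores: Dict[str, float], reference_group: str) -> Dict[str, float]:
    """Relative drop (in percent) of every group against the reference group."""
    ref = scores[reference_group]
    return {
        group: round(relative_drop(ref, score), 1)
        for group, score in scores.items()
        if group != reference_group
    }


# Hypothetical scores (for example, an accuracy-like metric between 0 and 1).
# These numbers are invented for illustration and are NOT from the study.
toy_scores = {"overall": 0.80, "subgroup_a": 0.60, "subgroup_b": 0.78}

gaps = subgroup_gaps(toy_scores, "overall")
print(gaps)
```

A model that generalizes well would show small gaps for every subgroup; a large gap for one group signals that the training sample may have under-represented it.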
The authors emphasized that probability sampling, combined with providing devices to participants, removes barriers to participation and yields a more reliable data source. This approach could lead to AI health tools that work across all populations, promoting health equity.
The research has been published in the journal PNAS Nexus, and more information can be found in the article “American Life in Realtime: Benchmark, publicly available person-generated health data for equity in precision health.”
The findings underline that accurate and equitable healthcare research depends on data from a diverse, representative sample. With such data, researchers can build more inclusive health tools that work for all members of society.
