Missing Data in Research on Youth and Family Programs

Background: Multilevel data can be missing at the individual level or at a nested level, such as family, classroom, or program site. Increased knowledge of higher-level missing data is necessary to develop evaluation designs and statistical methods that address it.

Methods: Participants included 9,514 individuals in 47 youth and family programs nationwide who completed multiple self-report measures before and after program participation. Data were marked as missing or not missing at the item, scale, and wave levels for both individuals and program sites.

Results: Site-level missing data represented a substantial portion of all missing data, accounting for 0–46% of missing data at pre-test and 35–71% at post-test. Youth were the most likely to be missing data, although site-level missing data did not differ by the age of participants served. In this dataset youth had the most surveys to complete, so their missing data could be due to survey fatigue.

Conclusions: Much of the missing data for individuals can be explained by the site not administering those questions or scales. These results suggest a need for statistical methods that account for site-level missing data, and for research design methods that reduce the prevalence of site-level missing data or reduce its impact. Researchers can generate buy-in with sites during the community collaboration stage, assessing problematic items for revision or removal and the need for ongoing site support, particularly at post-test. We recommend that researchers conducting multilevel research report the amount and mechanism of missing data at each level.
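The distinction between individual-level and site-level missingness described in the abstract can be illustrated with a minimal sketch. This is not the authors' procedure: the rule used here (an item counts as site-level missing when every participant at a site is missing it) is an assumption made for illustration, and the function name `site_level_share`, the record layout, and the toy data are all hypothetical.

```python
from collections import defaultdict


def site_level_share(records, items):
    """Return the fraction of missing item responses attributable to sites.

    records: list of dicts, each with a "site" key and one key per item;
             a value of None (or an absent key) means the response is missing.
    items:   list of item names to inspect.

    Assumed rule (illustrative only): if an item is missing for every
    participant at a site, those missing responses are counted as site-level.
    """
    by_site = defaultdict(list)
    for rec in records:
        by_site[rec["site"]].append(rec)

    total_missing = 0
    site_missing = 0
    for recs in by_site.values():
        for item in items:
            n_missing = sum(1 for r in recs if r.get(item) is None)
            total_missing += n_missing
            if n_missing == len(recs):
                # Nobody at this site answered the item.
                site_missing += n_missing

    if total_missing == 0:
        return 0.0
    return site_missing / total_missing


# Toy data: site A never administered q2; one person at each site skipped an item.
records = [
    {"site": "A", "q1": 3, "q2": None},
    {"site": "A", "q1": None, "q2": None},
    {"site": "B", "q1": 4, "q2": 5},
    {"site": "B", "q1": 2, "q2": None},
]
share = site_level_share(records, ["q1", "q2"])
print(f"Share of missing responses that are site-level: {share:.0%}")
```

The same tally could be repeated per scale or per wave (pre-test, post-test) to report missingness at each level, as the abstract recommends.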

Metadata

Work Title Missing Data in Research on Youth and Family Programs
Access
Open Access
Creators
  1. Jaime Ballard
  2. Adeya Richmond
  3. Suzanne van den Hoogenhof
  4. Lynne Borden
  5. Daniel Francis Perkins
Keyword
  1. Youth Program
  2. Family Program
  3. Missing Data
  4. Youth Programs
  5. Family Programs
  6. Prevalence
  7. Research Personnel
  8. Site Level
  9. Fatigue
  10. Research Design
  11. Self Report
  12. Datasets
  13. Statistical Methods
  14. Multilevel Data
  15. Statistical Method
  16. Research Design Methods
  17. Self Report Measures
  18. Data Reporting
  19. Multiple Selves
  20. Program Participation
  21. Community Partnerships
  22. Evaluation Design
  23. Research Planning
  24. Participation
  25. Classroom
  26. Evaluation
License CC BY 4.0 (Attribution)
Work Type Article
Publisher
  1. Psychological Reports
Publication Date January 1, 2021
Publisher Identifier (DOI)
  1. 10.1177/00332941211026851
Deposited January 03, 2025
