🔲 What are the final clean datasets that should be produced?

🔲 What file format should each clean dataset be in?

  • What did we say in our Data Management Plan?

🔲 What are the team standards for what constitutes a clean dataset?

  • Are these standards laid out in documentation?

🔲 Have our data cleaning plans been reviewed by the team?

🔲 Who is in charge of cleaning data?

  • When will data cleaning occur?
  • Is there a timeline for when clean data is needed?

🔲 What tools will be used for data cleaning?

  • Are there any costs associated with our data cleaning tools?

🔲 What structure should clean datasets be in?

  • Long vs. wide format
  • Merged vs separate (across time, across forms)
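
To make the long vs. wide choice concrete, here is a minimal sketch of reshaping one into the other. It assumes a hypothetical naming convention in which wide columns end in `_w<wave number>` (e.g. `score_w1`). The data, column names, and the `wide_to_long` helper are illustrative only, not a team standard.

```python
# Hypothetical wide-format data: one row per participant, one column per wave.
wide_rows = [
    {"id": 1, "score_w1": 10, "score_w2": 12},
    {"id": 2, "score_w1": 8, "score_w2": 9},
]


def wide_to_long(rows, id_key="id", sep="_w"):
    """Reshape wide rows into long rows: one row per participant per wave.

    Assumes every non-id column is named <variable><sep><wave number>.
    """
    long_by_key = {}
    for row in rows:
        for column, value in row.items():
            if column == id_key:
                continue
            # Split "score_w1" into the variable name and the wave number.
            variable, wave = column.rsplit(sep, 1)
            key = (row[id_key], int(wave))
            record = long_by_key.setdefault(
                key, {id_key: row[id_key], "wave": int(wave)}
            )
            record[variable] = value
    # Sort by (id, wave) so the output order is predictable.
    return [long_by_key[key] for key in sorted(long_by_key)]


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

In the wide layout each participant appears once. In the long layout the same participant appears once per wave, with the wave stored in its own column.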

🔲 How will data and code be versioned?

  • Manually?
    • Is this written out in a Style Guide?
  • Programmatically?
    • What program will be used? (e.g., Git/GitHub, SharePoint, Box)
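
If versioning is done manually, a style guide usually pins down a file naming pattern. The sketch below shows one possible pattern (file stem, zero-padded version number, ISO date). The pattern, the `versioned_name` helper, and the example file name are assumptions for illustration, not a prescribed convention.

```python
from datetime import date


def versioned_name(stem, version, on, ext="csv"):
    """Build a file name like <stem>_v<NN>_<YYYY-MM-DD>.<ext>.

    The pattern is hypothetical; substitute whatever the style guide specifies.
    """
    return f"{stem}_v{version:02d}_{on.isoformat()}.{ext}"


# Hypothetical example: second version of a clean survey file.
name = versioned_name("student_survey_clean", 2, date(2024, 3, 15))
print(name)
```

Zero-padding the version number and using ISO dates keeps files in chronological order when sorted by name.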