Data Lifecycle
Learn how to plan, collect, process, analyze, preserve and share data effectively and responsibly with the data lifecycle.
The data lifecycle is a model that describes the different phases that data goes through, from creation to disposal. The data lifecycle can help you manage your data better, ensure its quality and integrity and comply with ethical and legal standards.
Stages of the Data Lifecycle
Plan
This is the stage where you define your research questions, objectives, methods and data needs. You should also plan how you will collect, manage, use, share and preserve your data, as well as identify any relevant policies or legal considerations. You should document your data management plan and update it as your project evolves.
Create
This is the stage where you produce data from various sources, such as experiments, surveys or observations. You should ensure that your data is of high quality, accurate and consistent. You should also apply appropriate data formats, metadata standards and identifiers to make your data understandable.
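As a rough illustration of attaching metadata and an identifier to a new dataset, the sketch below builds a small descriptive record. The field names, the `build_metadata` helper and the sample values are assumptions made for this example and do not follow any particular metadata standard; the identifier is a locally generated UUID, not a registered persistent identifier.

```python
import json
import uuid
from datetime import date


def build_metadata(title, creator, file_format, description):
    """Return a small descriptive record for one dataset (illustrative fields only)."""
    return {
        "identifier": str(uuid.uuid4()),  # locally generated ID, not a registered DOI
        "title": title,
        "creator": creator,
        "created": date.today().isoformat(),
        "format": file_format,
        "description": description,
    }


# Hypothetical dataset used only to show the shape of the record.
record = build_metadata(
    title="Campus air quality survey",
    creator="Example Lab",
    file_format="text/csv",
    description="Hourly readings from three hypothetical sensors.",
)
print(json.dumps(record, indent=2))
```

Storing a record like this next to the data file is one simple way to keep the description and the data together; a discipline-specific metadata standard would normally define the actual fields.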
Manage
This is the stage where you organize, store and secure your data during your project. You should follow best practices for data security, including reviewing the university's Institutional Data Policy to learn how to classify and protect your data properly. Access to data should be closely monitored and controlled, and any changes should be tracked and documented.
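One possible way to track and document changes is to keep a checksum-based change log. The sketch below is a minimal example under assumed names (`checksum`, `log_change` and the log fields are hypothetical, not part of any university system); it records an entry only when the content of a file has actually changed.

```python
import hashlib
from datetime import datetime, timezone


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest of the given content as a hex string."""
    return hashlib.sha256(data).hexdigest()


def log_change(log, name, old: bytes, new: bytes, who, note):
    """Append a change entry to `log` if the content differs; return True if logged."""
    before, after = checksum(old), checksum(new)
    if before == after:
        return False  # nothing changed, so nothing to document
    log.append({
        "file": name,
        "changed_by": who,
        "changed_at": datetime.now(timezone.utc).isoformat(),
        "checksum_before": before,
        "checksum_after": after,
        "note": note,
    })
    return True


# Hypothetical file contents used only for illustration.
change_log = []
original = b"sample_id,value\n1,0.42\n"
corrected = b"sample_id,value\n1,0.24\n"
log_change(change_log, "results.csv", original, corrected,
           who="researcher_a", note="Fixed transposed digits in row 1.")
print(len(change_log), "change(s) recorded")
```

A version control system provides the same kind of history for text-based files; a log like this is a lightweight alternative where that is not available.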
Use
This is the stage where you analyze, interpret and visualize your data to answer your research question(s). You should use appropriate tools and methods to manipulate, transform and explore your data. Data analysis workflows and outputs, such as code, scripts, models or figures, should also be documented appropriately.
Share
This is the stage where you disseminate your data and findings to your peers and/or the public. You should consider the benefits and risks of sharing your data, as well as any ethical or legal implications. You should choose suitable platforms and formats for sharing data, such as repositories, journals or websites, and provide sufficient metadata, documentation and citation information to enable reuse and verification of the data.
Collect/Reuse
This is the stage where you reuse your own or others' data for new purposes or to answer new questions. Before collecting or reusing data, you should evaluate it for quality, relevance and reliability. You should also respect the data providers' rights and/or licenses and acknowledge their contributions. You should document how you reused the data and what new insights it generated.
Destroy
This is the stage where you delete or dispose of data that is no longer needed or useful. You should follow best practices for data destruction, and you must comply with the university’s retention schedules. You should also document when and why you destroyed the data and what impact (if any) this may have on your project or others.
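The sketch below shows one way to check a retention period before destroying data and to document the destruction afterwards. The five-year period, the dataset name and the record fields are hypothetical; the actual retention period must come from the university's retention schedules.

```python
from datetime import date


def retention_end(collected_on: date, retention_years: int) -> date:
    """Return the first date on which the data may be destroyed."""
    target_year = collected_on.year + retention_years
    try:
        return collected_on.replace(year=target_year)
    except ValueError:
        # 29 February in a non-leap target year: fall back to 28 February.
        return collected_on.replace(year=target_year, day=28)


def destruction_record(dataset, collected_on, retention_years, reason, impact, today):
    """Return a record documenting the destruction, or raise if it is too early."""
    end = retention_end(collected_on, retention_years)
    if today < end:
        raise ValueError(f"{dataset} must be retained until {end.isoformat()}")
    return {
        "dataset": dataset,
        "destroyed_on": today.isoformat(),
        "retention_ended": end.isoformat(),
        "reason": reason,
        "impact": impact,
    }


# Hypothetical values; the retention period here is an assumption.
record = destruction_record(
    dataset="pilot_survey_2018",
    collected_on=date(2018, 3, 1),
    retention_years=5,
    reason="Superseded by the full survey and no longer needed.",
    impact="None expected; no publication cites the pilot data.",
    today=date(2024, 6, 30),
)
print(record)
```

Raising an error when the retention period has not ended makes it harder to record a destruction that the schedule does not yet allow.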
Close Out
This is the stage where you store and back up data for long-term access and potential reuse. You should follow best practices for data preservation, such as using reliable and secure storage platforms, applying data formats and metadata standards, and assigning consistent identifiers and licenses. You should also document the data’s lifecycle and lessons learned, as well as your data management performance and outcomes, for future reference.
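A backup is only useful if the copy matches the original. As a minimal sketch of that check, the example below copies a file and compares SHA-256 checksums of the source and the copy. The `backup_and_verify` helper, the file name and the temporary directory are assumptions for illustration; a real backup would go to a separate, institutionally approved storage location.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_and_verify(source: Path, backup_dir: Path) -> bool:
    """Copy `source` into `backup_dir` and confirm the copy is identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    copy_path = Path(shutil.copy2(source, backup_dir / source.name))
    return sha256_of(source) == sha256_of(copy_path)


# Demonstration in a temporary directory with a hypothetical data file.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "results.csv"
    source.write_text("sample_id,value\n1,0.42\n")
    verified = backup_and_verify(source, Path(tmp) / "backup")
    print("backup verified:", verified)
```

Recording the checksum alongside the backup also allows the copy to be checked again later for silent corruption.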