
Data life cycle: Collecting

What is data collection?

Data collection is the process of gathering information about specific variables of interest, using either instrumentation or other methods (e.g. questionnaires, patient records). While data collection methods depend on the field and research subject, ensuring data quality is important in every case.

You can also reuse existing data in your project. This can be individual, previously collected datasets, reference data from curated resources like ELIXIR Core Data Resources, or consensus data like reference genomes. For more information, see Reuse in the data life cycle.

Why is data collection important?

Apart from being the source of information on which you build your findings, the collection phase lays the foundation for the quality of both the data and its documentation. It is important that the decisions made regarding quality measures are implemented, and that the collection procedures are appropriately recorded.

What should be considered for data collection?

Appropriate tools, or an integration of multiple tools (also called a tool assembly or ecosystem), can help you with data management and documentation during data collection. Suitable tools include Electronic Lab Notebooks (ELNs), Electronic Data Capture (EDC) systems, and Laboratory Information Management Systems (LIMS). Online platforms for collaborative research and file sharing services can also serve as ELNs or data management systems.

Regardless of the tools you use, consider the following while collecting data:

  • How to capture provenance - e.g. of samples, researchers and instruments
  • Ensure data quality - data can be generated either by yourself or by an infrastructure or facility that specializes in it
  • Reusing data instead of generating new data
  • Experimental design - prepare a collection plan (e.g. repetitions, controls, randomization) in advance
  • Instrument calibration
  • If you work with sensitive or confidential data, take care of data protection and security issues
  • If you work with human-related data, think about permissions and consent
  • How to store the data
  • Where to store the data
  • Identify suitable metadata standards
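
As an illustration of the provenance point above, the following is a minimal sketch of how a provenance record could be written next to each collected file. It is not a prescribed method: the function names, the field names (sample_id, researcher, instrument) and the ".provenance.json" sidecar convention are all assumptions made for this example, and in practice an ELN, EDC system or LIMS would usually record this information for you.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_provenance(data_file: Path, sample_id: str,
                     researcher: str, instrument: str) -> Path:
    """Write a JSON sidecar describing who collected what, with which instrument.

    All field names are hypothetical; adapt them to the metadata
    standard chosen for your project.
    """
    record = {
        "file": data_file.name,
        "sha256": sha256_of(data_file),  # lets you detect later changes to the file
        "sample_id": sample_id,
        "researcher": researcher,
        "instrument": instrument,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }
    # Hypothetical naming convention: <data file name>.provenance.json
    sidecar = data_file.with_name(data_file.name + ".provenance.json")
    sidecar.write_text(json.dumps(record, indent=2), encoding="utf-8")
    return sidecar
```

Calling write_provenance(Path("run1.csv"), "S-001", "A. Researcher", "spectrometer-1") would create run1.csv.provenance.json beside the data file. Keeping the record in a separate file means the collected data itself is never modified, and the checksum can be used later to confirm that the file is unchanged.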
