Data @ Reed

Data Management

Data management is an important part of the research process. Your thesis advisor or lab may already have data management protocols in place for you to follow. Below are general best practices to consider.

Basic Data Storage

  • Data may be stored initially on lab computers.
  • Depending on the data, it may be appropriate to copy it to external storage devices, to Reed's network file system (AFS), or to a cloud-based service such as Google Drive.
  • Security recommendations from Reed

Version Control

Data Backup

  • Back up your data regularly (regularly = AT LEAST DAILY).
  • 3-2-1 rule: Three copies of your data, on two different types of storage media, in more than one location.
  • Check your backups! Check for presence/absence, matching file size & accessibility.
  • Backup recommendations from Reed
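
The backup checks listed above (presence/absence, matching file size, accessibility) can be scripted. The following is a minimal sketch, not an official Reed tool: the function name compare_backup and the directory arguments are hypothetical, and it assumes the backup mirrors the folder layout of the original data.

```python
from pathlib import Path


def compare_backup(source_dir, backup_dir):
    """Return a list of problems found when comparing a backup to its source.

    Assumes (hypothetically) that backup_dir mirrors the layout of source_dir.
    """
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    problems = []
    for src in sorted(source_dir.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source_dir)
        dst = backup_dir / rel
        if not dst.is_file():
            # Presence/absence check
            problems.append(f"missing: {rel.as_posix()}")
        elif dst.stat().st_size != src.stat().st_size:
            # File size check
            problems.append(f"size mismatch: {rel.as_posix()}")
        else:
            # Accessibility check: try to read the first byte of the copy
            try:
                with dst.open("rb") as handle:
                    handle.read(1)
            except OSError:
                problems.append(f"unreadable: {rel.as_posix()}")
    return problems
```

An empty list means every file in the source folder has a readable copy of the same size in the backup. Matching sizes do not prove the contents are identical, so a checksum comparison would be a stricter test.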

Folder and File Organization

  • Be consistent! Follow a consistent folder system for storing data, ideally one that makes sense to someone without deep knowledge of the research project. Examples: Instrument → Date → Sample; Grant Number → Location → AnalysisType
  • Do not rely on the folder structure to provide critical context for files. Files copied elsewhere will lose this information. Put critical information in the file name. Example: use Project01/SiteB/SiteB_2016_rawdata.txt instead of Project01/SiteB/2016/rawdata.txt.
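
As an illustration of moving folder context into the file name, here is a small sketch. The helper contextual_name and its keep_levels parameter are hypothetical; the example path is the one given above.

```python
from pathlib import PurePosixPath


def contextual_name(path, keep_levels=2):
    """Prefix a file name with the names of its nearest parent folders."""
    p = PurePosixPath(path)
    parents = list(p.parent.parts)
    # Keep only the last keep_levels folder names (none if keep_levels is 0)
    kept = parents[max(len(parents) - keep_levels, 0):]
    return "_".join(kept + [p.name])


print(contextual_name("Project01/SiteB/2016/rawdata.txt"))
# prints SiteB_2016_rawdata.txt
```

The resulting name still identifies the site and year if the file is later copied out of its original folder.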

File-naming Conventions

  • As with your folder system, be consistent with your file-naming convention. Provide enough information to uniquely identify a file. Document these decisions!
  • Avoid spaces in filenames, because not all software applications can handle such names. Instead, use capital letters or underscores to connect multiple words. Likewise, avoid the following special characters in filenames: & , * % # ; : ( ) ! @ $ ^ ~ ' { } [ ] ? < > - + /
  • For dates, use the format YYYYMMDD, YYYY-MM-DD, or YYYY_MM_DD. With this date format, files sorted alphabetically will also sort chronologically.
  • Keep track of different file versions by using a suffix to represent the version number. Example: ProjectName_Instrument_Condition_YYYYMMDD_v01.txt
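
One way to apply these conventions consistently is to generate file names with a small helper rather than typing them by hand. The sketch below is only an illustration: the functions clean and make_filename, their parameters, and the sample values are all hypothetical. It strips spaces and special characters, joins words with capital letters, writes the date as YYYYMMDD, and adds a two-digit version suffix.

```python
import re
from datetime import date


def clean(part):
    """Drop spaces and special characters, capitalizing each word."""
    words = re.split(r"[^A-Za-z0-9]+", part)
    return "".join(w[0].upper() + w[1:] for w in words if w)


def make_filename(project, instrument, condition, when, version, ext="txt"):
    """Build Project_Instrument_Condition_YYYYMMDD_vNN.ext."""
    stem = "_".join([
        clean(project),
        clean(instrument),
        clean(condition),
        when.strftime("%Y%m%d"),  # sorts chronologically when sorted alphabetically
        f"v{version:02d}",
    ])
    return f"{stem}.{ext}"


print(make_filename("soil survey", "GC-MS", "high temp", date(2016, 3, 7), 1))
# prints SoilSurvey_GCMS_HighTemp_20160307_v01.txt
```

Zero-padding the version number (v01 rather than v1) keeps versions in order when files are sorted by name, at least up to v99.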

Data Documentation

  • Document the decisions you make about your data during the research process. These decisions are easy to forget, and you may need to refer to them later while writing your thesis.
  • Describe the content of your data files in a data dictionary. List variables with definitions, units of measure, scope notes, coded values. Document how missing values are represented. List the file formats you are using.
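
A data dictionary can be as simple as a CSV file with one row per variable. The sketch below shows one possible layout. The column names in FIELDS, the sample variables, and both helper functions are made up for illustration and are not a required format.

```python
import csv
import io

# Hypothetical columns for a data dictionary; adapt to your project.
FIELDS = ["variable", "definition", "units", "coded_values", "missing_value"]

# Hypothetical example entries.
entries = [
    {"variable": "site_id", "definition": "Sampling site code", "units": "",
     "coded_values": "A; B; C", "missing_value": "NA"},
    {"variable": "temp_c", "definition": "Water temperature at collection",
     "units": "degrees Celsius", "coded_values": "", "missing_value": "NA"},
]


def write_data_dictionary(rows, handle):
    """Write the data dictionary rows as CSV to an open text handle."""
    writer = csv.DictWriter(handle, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def undocumented_columns(data_header, rows):
    """Return the data file columns that have no data dictionary entry."""
    documented = {row["variable"] for row in rows}
    return [column for column in data_header if column not in documented]


buffer = io.StringIO()
write_data_dictionary(entries, buffer)
print(undocumented_columns(["site_id", "temp_c", "notes"], entries))
# prints ['notes']
```

Checking a data file's header against the dictionary, as undocumented_columns does, is a quick way to notice variables that were added without being documented.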

Long-Term Data Storage

  • Store copies of data in open, stable formats (e.g., .txt, .csv, .tiff) for long-term accessibility, but keep a copy in the original format.
  • Upload the dataset to a data repository and assign it a persistent identifier (DOI). Contact David Isaak, Data Services Librarian, if you have questions.