Cyberhelp

for Researchers & Teams

How much data can I store in my research data directory?

September 13, 2019 by @qread

TL;DR: Try to have a general idea of your data storage needs, and discuss them with the data science team if you are concerned, but do not be too worried unless you are going well over 1 terabyte.

How much space will I need for my data?

There’s big data, and then there’s BIG data! Data storage needs depend on the types of data you are working with, and SESYNC users vary greatly in their data needs. SESYNC’s /nfs filesystem hosts the research data directories for SESYNC projects and for individual users. This is what you are accessing when you log into https://files.sesync.org.

The filesystem has a large data storage capacity. Even so, users still need to be aware of how much data they are storing on it, to avoid maxing out the storage and compromising other people’s work. We recommend that groups have a general idea of their data storage needs and talk to a member of the data science team if they have any concerns about being able to store all the data they need on SESYNC’s filesystem.


Project participants should be aware that SESYNC is not set up to be a data repository. We do not have the resources for long-term storage of data that are not actively being used for a project. Please avoid using the filesystem to park data!

What kinds of data take up the most space?

Here is a quick rundown of the data formats that are likely to take up the most space.

To give an example of the file sizes you can get with high spatial resolution images, a single raster image containing elevation data for the continental United States at 30 m pixel resolution is around 100 GB in size. For an example of how time series can balloon in size, the MODIS land surface temperature data is provided globally every 8 days at 1 km resolution. A year’s worth of that data is around 1.5 GB. If you want to use multiple MODIS data products for multiple years, your data storage requirements are going to multiply quickly.
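To see how quickly this multiplies, here is a minimal back-of-the-envelope sketch in Python. It assumes every data product takes up about as much space per year as the MODIS land surface temperature example above (1.5 GB), which will not hold exactly in practice. The function name and the numbers of products and years are hypothetical, chosen only for illustration.

```python
def estimate_storage_gb(gb_per_product_year: float, n_products: int, n_years: int) -> float:
    """Rough total storage in GB.

    Assumes every product needs about the same space per year,
    which is a simplification; real products vary in size.
    """
    return gb_per_product_year * n_products * n_years


# 1.5 GB per year is the MODIS land surface temperature figure quoted above.
# 5 products and 10 years are hypothetical values for illustration.
total_gb = estimate_storage_gb(1.5, n_products=5, n_years=10)
print(f"Estimated storage: {total_gb:.1f} GB")  # prints "Estimated storage: 75.0 GB"
```

Swapping in your own per-year size, number of products, and number of years gives a rough figure to bring to a conversation with the data science team.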

I think my data are too big! What do I do?

Generally, data storage will not be an issue unless you need to store significantly more than 1 TB of data in your research data directory for a long period of time. If you think you might need more than that, or if you are planning to make extensive use of a very memory-intensive type of data, feel free to contact Cyberhelp to discuss your data use and storage needs. We will be able to work out a solution that meets your needs!
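If you are not sure how close you are to that 1 TB mark, one way to check is to add up the sizes of the files in your directory. The sketch below does this with Python's standard library. The directory path is a hypothetical placeholder, the helper names are made up for this example, and 1 TB is taken to mean 1000^4 bytes, which is an assumption about how the threshold is defined.

```python
import os

TB = 1000 ** 4  # bytes in one terabyte (decimal definition; an assumption)


def directory_size_bytes(path: str) -> int:
    """Sum the sizes of all regular files under `path`, skipping symbolic links."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            if os.path.islink(file_path):
                continue  # do not count the targets of symbolic links
            try:
                total += os.path.getsize(file_path)
            except OSError:
                pass  # file vanished or is unreadable; leave it out
    return total


def fraction_of_terabyte(n_bytes: int) -> float:
    """Express a size in bytes as a fraction of 1 TB."""
    return n_bytes / TB


# Hypothetical path; replace it with your own research data directory.
data_dir = "/nfs/my-project-data"
size = directory_size_bytes(data_dir)
print(f"{data_dir}: {size / 1000 ** 3:.2f} GB ({fraction_of_terabyte(size):.1%} of 1 TB)")
```

Walking a very large directory this way can take a while, so it is best run occasionally rather than as part of a routine workflow.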
