MERRA2 Analysis Process#
This Jupyter notebook provides a brief overview of how to use the geodata package to download MERRA2 climate data, create geographic-temporal subsets called cutouts, and use those cutouts to generate standalone datasets for separate analysis.
The following guide assumes you have installed and configured geodata and all required dependencies.
Step 1 - Setup#
Import the package first.
import geodata
Notifications in geodata are implemented using loggers from the logging library.
It is recommended to always launch a logger to get information on what is going on. For debugging, you can use the more verbose level=logging.DEBUG:
import logging
logging.basicConfig(level=logging.INFO)
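If DEBUG output from every library is too noisy, verbosity can be raised for geodata alone. The sketch below uses only the standard logging module. The logger name "geodata" and the format string are assumptions inferred from the log lines shown in this notebook, not a documented geodata setting.

```python
import io
import logging

# Format chosen to resemble the log lines in this notebook (an assumption).
LOG_FORMAT = "%(asctime)s - %(name)s - %(levelname)s - %(message)s"

stream = io.StringIO()  # in a notebook you would normally omit this and log to stderr
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter(LOG_FORMAT))

root = logging.getLogger()
root.addHandler(handler)
root.setLevel(logging.WARNING)                        # quiet default for all libraries
logging.getLogger("geodata").setLevel(logging.DEBUG)  # verbose for geodata loggers only

# Child loggers such as "geodata.dataset" inherit the level set on "geodata".
logging.getLogger("geodata.dataset").debug("shown: child of 'geodata'")
logging.getLogger("other.library").info("hidden: below WARNING")

output = stream.getvalue()
print(output, end="")
```

Only the first message is written, because loggers without an explicit level take the level of their nearest configured ancestor.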
Step 2 - Download#
Assuming you have previously created an Earthdata Login profile and approved the GES DISC app, you can download MERRA2 data from the source as follows.
First, define a dataset object for the data you wish to download:
DS = geodata.Dataset(
    module="merra2",
    weather_data_config="surface_flux_monthly",
    years=slice(2010, 2010),
    months=slice(1, 7),
)
2024-11-06 15:34:49,730 - geodata.dataset - INFO - Bounds was not specified, default to global bounds.
2024-11-06 15:34:49,732 - geodata.dataset - INFO - Directory /Users/geodata/.local/geodata/merra2 found, checking for completeness.
2024-11-06 15:34:49,733 - geodata.dataset - INFO - File `/Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201001.nc4` not found!
2024-11-06 15:34:49,733 - geodata.dataset - INFO - File `/Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201002.nc4` not found!
2024-11-06 15:34:49,734 - geodata.dataset - INFO - File `/Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201003.nc4` not found!
2024-11-06 15:34:49,735 - geodata.dataset - INFO - File `/Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201004.nc4` not found!
2024-11-06 15:34:49,735 - geodata.dataset - INFO - File `/Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201005.nc4` not found!
2024-11-06 15:34:49,736 - geodata.dataset - INFO - File `/Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201006.nc4` not found!
2024-11-06 15:34:49,736 - geodata.dataset - INFO - File `/Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201007.nc4` not found!
2024-11-06 15:34:49,737 - geodata.dataset - WARNING - Arguments `xs` and `ys` not used in preparing dataset. Defaulting to global.
2024-11-06 15:34:49,738 - geodata.dataset - INFO - 7 files not completed.
- Use module to specify the data source. In this example, it is “merra2”.
- Use weather_data_config to specify the dataset. In this example, it is the MERRA2 monthly mean, single-level surface flux diagnostics. To download the MERRA2 hourly, single-level surface flux diagnostics, specify weather_data_config = "surface_flux_hourly".
- Use years=slice() and months=slice() to specify the years and months for download. In each parameter, the first value indicates the start period, and the second value the end period.
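Unlike ordinary Python slicing, both endpoints appear to be inclusive here: the log output above lists seven files (201001 through 201007) for months=slice(1, 7). The sketch below illustrates that reading. The helper expand_year_months is hypothetical, and pairing every year with every month is an assumption, since how geodata handles multi-year ranges is not described here.

```python
from itertools import product


def expand_year_months(years: slice, months: slice):
    """Return (year, month) pairs, treating both slice endpoints as inclusive."""
    year_range = range(years.start, years.stop + 1)
    month_range = range(months.start, months.stop + 1)
    return list(product(year_range, month_range))


pairs = expand_year_months(slice(2010, 2010), slice(1, 7))
print(len(pairs), pairs[0], pairs[-1])  # 7 (2010, 1) (2010, 7)
```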
Use the code block below to begin the download.
When a dataset object is created, geodata checks whether the specified data has already been downloaded by looking for MERRA2 data files in the merra2 directory configured in src/geodata/config.py. Downloaded data is placed into subdirectories by year and then, for daily files, by month (i.e. 2011/01, 2011/02, 2012/01, etc.). Monthly files are placed directly in the year’s folder. If downloaded data is found, the prepared attribute is set to True when the dataset object is created.
Accordingly, the snippet below saves you the trouble of accidentally redownloading data if it is already present in the correct subdirectories.
if not DS.prepared:
    DS.get_data()
2024-11-06 15:34:53,326 - geodata - INFO - Preparing API calls for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201001.nc4
2024-11-06 15:34:53,328 - geodata - INFO - Making request to https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXFLX.5.12.4/2010/MERRA2_300.tavgM_2d_flx_Nx.201001.nc4
2024-11-06 15:35:18,885 - geodata - INFO - Successfully downloaded data for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201001.nc4
2024-11-06 15:35:18,888 - geodata - INFO - Preparing API calls for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201002.nc4
2024-11-06 15:35:18,889 - geodata - INFO - Making request to https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXFLX.5.12.4/2010/MERRA2_300.tavgM_2d_flx_Nx.201002.nc4
2024-11-06 15:35:58,417 - geodata - INFO - Successfully downloaded data for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201002.nc4
2024-11-06 15:35:58,419 - geodata - INFO - Preparing API calls for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201003.nc4
2024-11-06 15:35:58,421 - geodata - INFO - Making request to https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXFLX.5.12.4/2010/MERRA2_300.tavgM_2d_flx_Nx.201003.nc4
2024-11-06 15:36:35,820 - geodata - INFO - Successfully downloaded data for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201003.nc4
2024-11-06 15:36:35,822 - geodata - INFO - Preparing API calls for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201004.nc4
2024-11-06 15:36:35,823 - geodata - INFO - Making request to https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXFLX.5.12.4/2010/MERRA2_300.tavgM_2d_flx_Nx.201004.nc4
2024-11-06 15:37:13,549 - geodata - INFO - Successfully downloaded data for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201004.nc4
2024-11-06 15:37:13,552 - geodata - INFO - Preparing API calls for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201005.nc4
2024-11-06 15:37:13,553 - geodata - INFO - Making request to https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXFLX.5.12.4/2010/MERRA2_300.tavgM_2d_flx_Nx.201005.nc4
2024-11-06 15:37:58,141 - geodata - INFO - Successfully downloaded data for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201005.nc4
2024-11-06 15:37:58,142 - geodata - INFO - Preparing API calls for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201006.nc4
2024-11-06 15:37:58,144 - geodata - INFO - Making request to https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXFLX.5.12.4/2010/MERRA2_300.tavgM_2d_flx_Nx.201006.nc4
2024-11-06 15:38:38,069 - geodata - INFO - Successfully downloaded data for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201006.nc4
2024-11-06 15:38:38,071 - geodata - INFO - Preparing API calls for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201007.nc4
2024-11-06 15:38:38,072 - geodata - INFO - Making request to https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXFLX.5.12.4/2010/MERRA2_300.tavgM_2d_flx_Nx.201007.nc4
2024-11-06 15:39:17,346 - geodata - INFO - Successfully downloaded data for /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201007.nc4
100%|██████████| 7/7 [04:24<00:00, 37.72s/it]
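The completeness check described earlier can be pictured as building the expected file names and testing whether each exists. This is a minimal sketch under stated assumptions, not geodata's implementation. The file name pattern follows the paths in the log output above, the helper missing_monthly_files is hypothetical, and the value "300" is simply the stream number seen in the 2010 file names.

```python
import tempfile
from pathlib import Path

# Pattern mirroring the monthly file paths in the logs, relative to the merra2 directory.
MONTHLY_PATTERN = "{year}/MERRA2_{spinup}.tavgM_2d_flx_Nx.{year}{month:0>2}.nc4"


def missing_monthly_files(root, year, months, spinup="300"):
    """List expected monthly files under root that do not exist yet."""
    expected = [
        Path(root) / MONTHLY_PATTERN.format(year=year, month=m, spinup=spinup)
        for m in months
    ]
    return [p for p in expected if not p.exists()]


with tempfile.TemporaryDirectory() as root:
    # Simulate one month that was already downloaded.
    done = Path(root) / MONTHLY_PATTERN.format(year=2010, month=1, spinup="300")
    done.parent.mkdir(parents=True)
    done.touch()

    missing = missing_monthly_files(root, 2010, range(1, 8))
    prepared = not missing  # True only when nothing is missing
    print(f"{len(missing)} files not completed; prepared={prepared}")
```

With one of the seven months present, six files are reported missing and prepared stays False.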
Finally, in order to use the downloaded MERRA2 data with geodata, run:
DS.trim_variables()
trim_variables() subsets and resaves the downloaded files so that only those variables needed to generate geodata outputs are kept.
Step 3 - Create Cutout#
A cutout is a subset of downloaded data based on specified time periods and geographic coordinates. Cutouts are saved to the cutouts directory under the GEODATA_ROOT directory specified in src/geodata/config.py and can be used to generate multiple outputs.
Note (04/02/2020): There is a known issue with MERRA2-based cutouts where running cutout.prepare(overwrite=True) on an existing cutout prevents the cutout from being used to generate outputs. A workaround is to manually delete the problem cutout and recreate it from scratch. A fix is planned pending investigation.
To create a cutout, run the following:
cutout = geodata.Cutout(
    name="tokyo-2010-test",
    module="merra2",
    weather_data_config="surface_flux_monthly",
    xs=slice(138.5, 139.5),
    ys=slice(35, 36),
    years=slice(2010, 2010),
    months=slice(7, 7),
)
cutout.prepare()
2024-11-06 15:41:50,342 - geodata.cutout - INFO - Cutout (tokyo-2010-test, /Users/geodata/.local/geodata/cutouts) not found or incomplete.
2024-11-06 15:41:50,457 - geodata.preparation - INFO - Starting preparation of cutout 'tokyo-2010-test'
2024-11-06 15:41:50,457 - geodata - INFO - MultiIndex([(2010, 7)],
names=['year', 'month'])
2024-11-06 15:41:50,458 - geodata - INFO - [(2010, 7)]
2024-11-06 15:41:52,063 - geodata - INFO - Opening /Users/geodata/.local/geodata/merra2/2010/MERRA2_300.tavgM_2d_flx_Nx.201007.nc4
2024-11-06 15:41:52,086 - geodata.preparation - INFO - Merging variables into monthly compound files
2024-11-06 15:41:52,087 - geodata.preparation - INFO - Cutout 'tokyo-2010-test' has been successfully prepared
The above code creates a cutout for July 2010 for a geographic area roughly corresponding to the Tokyo metropolitan area. Walking through the parameters:
- name will be the name of the directory created in the cutouts folder where geodata will place the data files corresponding to the cutout.
- module indicates the source for the data from which the cutout is created.
- weather_data_config indicates the specific dataset from the source. For MERRA2, the available options are surface_flux_hourly and surface_flux_monthly.
- Use xs=slice() and ys=slice() to define a geographical range for the cutout.
- Use years=slice() and months=slice() to define a temporal range for the cutout. Naturally, the indicated time range must be present within the source data.
geodata.Cutout() only defines the cutout object in memory. To actually create the cutout files, run prepare().
As with get_data(), prepare() first checks whether a cutout with the same name has already been created, and exits the creation process if one already exists. To override this behavior and force a recalculation of the cutout, run prepare(overwrite=True).
To verify the results of the cutout, you can print some attributes to the console as follows.
Basic information:
cutout
<Cutout tokyo-2010-test x=138.75-139.38 y=35.00-36.00 time=2010/7-2010/7 prepared>
Name:
cutout.name
'tokyo-2010-test'
Coordinates:
cutout.coords
Coordinates:
* x (x) float64 16B 138.8 139.4
* y (y) float64 24B 35.0 35.5 36.0
lon (x) float64 16B 138.8 139.4
lat (y) float64 24B 35.0 35.5 36.0
* time (time) datetime64[ns] 8B 2010-07-01
* year-month (year-month) object 8B MultiIndex
* year (year-month) int64 8B 2010
* month (year-month) int64 8B 7
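The coordinates above are the MERRA2 grid points that fall inside the requested xs and ys ranges, displayed rounded (138.8 is 138.75, 139.4 is 139.375). The sketch below reproduces them from the grid origin and resolution given in the MERRA2 metadata shown in this notebook (longitudes from -180.0 in steps of 0.625, latitudes from -90.0 in steps of 0.5). The helper grid_points_in_range is hypothetical, and treating both range endpoints as inclusive is an assumption consistent with the output.

```python
def grid_points_in_range(start, stop, origin, step):
    """Return grid coordinates origin + k*step that fall within [start, stop]."""
    points = []
    k = 0
    while origin + k * step <= stop:
        value = origin + k * step
        if value >= start:
            points.append(value)
        k += 1
    return points


# Ranges taken from the cutout definition: xs=slice(138.5, 139.5), ys=slice(35, 36).
xs = grid_points_in_range(138.5, 139.5, origin=-180.0, step=0.625)
ys = grid_points_in_range(35.0, 36.0, origin=-90.0, step=0.5)
print(xs)  # [138.75, 139.375]
print(ys)  # [35.0, 35.5, 36.0]
```

This explains why a one-degree-wide longitude range yields only two columns: the requested bounds are not snapped outward to the nearest grid points.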
All metadata:
cutout.meta
<xarray.Dataset> Size: 112B
Dimensions: (x: 2, y: 3, time: 1, year-month: 1)
Coordinates:
* x (x) float64 16B 138.8 139.4
* y (y) float64 24B 35.0 35.5 36.0
lon (x) float64 16B 138.8 139.4
lat (y) float64 24B 35.0 35.5 36.0
* time (time) datetime64[ns] 8B 2010-07-01
* year-month (year-month) object 8B MultiIndex
* year (year-month) int64 8B 2010
* month (year-month) int64 8B 7
Data variables:
*empty*
Attributes: (12/31)
History: Original file generated: Fri Jul 3 01...
Filename: MERRA2_300.tavgM_2d_flx_Nx.201001.nc4
Comment: GMAO filename: d5124_m2_jan00.tavg1_2d...
Conventions: CF-1
Institution: NASA Global Modeling and Assimilation ...
References: http://gmao.gsfc.nasa.gov
... ...
Source: CVS tag: GEOSadas-5_12_4
Contact: http://gmao.gsfc.nasa.gov
identifier_product_doi: 10.5067/0JRLVL8YV2Y4
RangeBeginningTime: 00:00:00.000000
RangeEndingTime: 23:59:59.000000
module: merra2
Information about the variable config used to download the data:
cutout.dataset_module.weather_data_config
{'surface_flux_hourly': {'api_func': <function geodata.datasets.merra2.api_merra2(toDownload, fileGranularity, downloadedFiles)>,
'file_granularity': 'daily',
'tasks_func': <function geodata.datasets.merra2.tasks_daily_merra2(xs, ys, yearmonths, prepare_func, **meta_attrs)>,
'meta_prepare_func': <function geodata.datasets.merra2.prepare_meta_merra2(xs, ys, year, month, template, module, **params)>,
'prepare_func': <function geodata.datasets.merra2.prepare_month_surface_flux(fn, year, month, xs, ys)>,
'template': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_*.tavg1_2d_flx_Nx.*.nc4',
'url': 'https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2/M2T1NXFLX.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_flx_Nx.{year}{month:0>2}{day:0>2}.nc4',
'url_opendap': 'https://goldsmr4.gesdisc.eosdis.nasa.gov/opendap/MERRA2/M2T1NXFLX.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_flx_Nx.{year}{month:0>2}{day:0>2}.nc4.nc4',
'fn': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_flx_Nx.{year}{month:0>2}{day:0>2}.nc4',
'variables': ['ustar',
'z0m',
'disph',
'rhoa',
'ulml',
'vlml',
'tstar',
'hlml',
'tlml',
'pblh',
'hflux',
'eflux']},
'slv_flux_hourly': {'api_func': <function geodata.datasets.merra2.api_merra2(toDownload, fileGranularity, downloadedFiles)>,
'file_granularity': 'daily_multiple',
'tasks_func': <function geodata.datasets.merra2.tasks_daily_merra2(xs, ys, yearmonths, prepare_func, **meta_attrs)>,
'meta_prepare_func': <function geodata.datasets.merra2.prepare_meta_merra2(xs, ys, year, month, template, module, **params)>,
'prepare_func': <function geodata.datasets.merra2.prepare_month_surface_flux(fn, year, month, xs, ys)>,
'template': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_*.tavg1_2d_slv_flx_Nx.*.nc4',
'url': ['https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2/M2T1NXFLX.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_flx_Nx.{year}{month:0>2}{day:0>2}.nc4',
'https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2/M2T1NXSLV.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_slv_Nx.{year}{month:0>2}{day:0>2}.nc4'],
'url_opendap': ['https://goldsmr4.gesdisc.eosdis.nasa.gov/opendap/MERRA2/M2T1NXFLX.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_flx_Nx.{year}{month:0>2}{day:0>2}.nc4',
'https://goldsmr4.gesdisc.eosdis.nasa.gov/opendap/MERRA2/M2T1NXSLV.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_slv_Nx.{year}{month:0>2}{day:0>2}.nc4'],
'fn': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_slv_flx_Nx.{year}{month:0>2}{day:0>2}.nc4',
'variables': ['ustar',
'z0m',
'disph',
'rhoa',
'ulml',
'vlml',
'tstar',
'hlml',
'tlml',
'pblh',
'hflux',
'eflux',
'u2m',
'v2m',
'u10m',
'v10m',
'u50m',
'v50m'],
'variables_list': [['ustar',
'z0m',
'disph',
'rhoa',
'ulml',
'vlml',
'tstar',
'hlml',
'tlml',
'pblh',
'hflux',
'eflux'],
['u2m', 'v2m', 'u10m', 'v10m', 'u50m', 'v50m']]},
'surface_flux_monthly': {'api_func': <function geodata.datasets.merra2.api_merra2(toDownload, fileGranularity, downloadedFiles)>,
'file_granularity': 'monthly',
'tasks_func': <function geodata.datasets.merra2.tasks_monthly_merra2(xs, ys, yearmonths, prepare_func, **meta_attrs)>,
'meta_prepare_func': <function geodata.datasets.merra2.prepare_meta_merra2(xs, ys, year, month, template, module, **params)>,
'prepare_func': <function geodata.datasets.merra2.prepare_month_surface_flux(fn, year, month, xs, ys)>,
'template': '/Users/geodata/.local/geodata/merra2/{year}/MERRA2_*.tavgM_2d_flx_Nx.*.nc4',
'url': 'https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXFLX.5.12.4/{year}/MERRA2_{spinup}.tavgM_2d_flx_Nx.{year}{month:0>2}.nc4',
'fn': '/Users/geodata/.local/geodata/merra2/{year}/MERRA2_{spinup}.tavgM_2d_flx_Nx.{year}{month:0>2}.nc4',
'variables': ['ustar',
'z0m',
'disph',
'rhoa',
'ulml',
'vlml',
'tstar',
'hlml',
'tlml',
'pblh',
'hflux',
'eflux'],
'meta_attrs': {'History': 'Original file generated: Fri Jul 3 01:41:08 2015 GMT',
'Filename': 'MERRA2_300.tavgM_2d_flx_Nx.201001.nc4',
'Comment': 'GMAO filename: d5124_m2_jan00.tavg1_2d_flx_Nx.monthly.201001.nc4',
'Conventions': 'CF-1',
'Institution': 'NASA Global Modeling and Assimilation Office',
'References': 'http://gmao.gsfc.nasa.gov',
'Format': 'NetCDF-4/HDF-5',
'SpatialCoverage': 'global',
'VersionID': '5.12.4',
'TemporalRange': '1980-01-01 -> 2016-12-31',
'identifier_product_doi_authority': 'http://dx.doi.org/',
'ShortName': 'M2TMNXFLX',
'RangeBeginningDate': '2010-01-01',
'RangeEndingDate': '2010-01-31',
'GranuleID': 'MERRA2_300.tavgM_2d_flx_Nx.201001.nc4',
'ProductionDateTime': 'Original file generated: Fri Jul 3 01:41:08 2015 GMT',
'LongName': 'MERRA2 tavg1_2d_flx_Nx: 2d,1-Hourly,Time-Averaged,Single-Level,Assimilation,Surface Flux Diagnostics Monthly Mean',
'Title': 'MERRA2 tavg1_2d_flx_Nx: 2d,1-Hourly,Time-Averaged,Single-Level,Assimilation,Surface Flux Diagnostics Monthly Mean',
'SouthernmostLatitude': '-90.0',
'NorthernmostLatitude': '90.0',
'WesternmostLongitude': '-180.0',
'EasternmostLongitude': '179.375',
'LatitudeResolution': '0.5',
'LongitudeResolution': '0.625',
'DataResolution': '0.5 x 0.625',
'Source': 'CVS tag: GEOSadas-5_12_4',
'Contact': 'http://gmao.gsfc.nasa.gov',
'identifier_product_doi': '10.5067/0JRLVL8YV2Y4',
'RangeBeginningTime': '00:00:00.000000',
'RangeEndingTime': '23:59:59.000000',
'module': 'merra2'}},
'surface_flux_dailymeans': {'api_func': <function geodata.datasets.merra2.api_merra2(toDownload, fileGranularity, downloadedFiles)>,
'file_granularity': 'dailymeans',
'tasks_func': <function geodata.datasets.merra2.tasks_daily_merra2(xs, ys, yearmonths, prepare_func, **meta_attrs)>,
'meta_prepare_func': <function geodata.datasets.merra2.prepare_meta_merra2(xs, ys, year, month, template, module, **params)>,
'prepare_func': <function geodata.datasets.merra2.prepare_dailymeans_surface_flux(fn, year, month, xs, ys)>,
'template': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_*.statD_2d_slv_Nx.*.nc4',
'url': 'https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2/M2SDNXSLV.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.statD_2d_slv_Nx.{year}{month:0>2}{day:0>2}.nc4',
'fn': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_{spinup}.statD_2d_slv_Nx.{year}{month:0>2}{day:0>2}.nc4',
'variables': ['hournorain', 'tprecmax', 't2mmax', 't2mmean', 't2mmin']},
'slv_radiation_hourly': {'api_func': <function geodata.datasets.merra2.api_merra2(toDownload, fileGranularity, downloadedFiles)>,
'file_granularity': 'daily_multiple',
'tasks_func': <function geodata.datasets.merra2.tasks_daily_merra2(xs, ys, yearmonths, prepare_func, **meta_attrs)>,
'meta_prepare_func': <function geodata.datasets.merra2.prepare_meta_merra2(xs, ys, year, month, template, module, **params)>,
'prepare_func': <function geodata.datasets.merra2.prepare_slv_radiation(fn, year, month, xs, ys)>,
'template': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_*.tavg1_2d_slv_rad_Nx.*.nc4',
'url': ['https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2/M2T1NXSLV.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_slv_Nx.{year}{month:0>2}{day:0>2}.nc4',
'https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2/M2T1NXRAD.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_rad_Nx.{year}{month:0>2}{day:0>2}.nc4'],
'url_opendap': ['https://goldsmr4.gesdisc.eosdis.nasa.gov/opendap/MERRA2/M2T1NXSLV.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_slv_Nx.{year}{month:0>2}{day:0>2}.nc4.nc4',
'https://goldsmr4.gesdisc.eosdis.nasa.gov/opendap/MERRA2/M2T1NXRAD.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_rad_Nx.{year}{month:0>2}{day:0>2}.nc4.nc4'],
'fn': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_slv_rad_Nx.{year}{month:0>2}{day:0>2}.nc4',
'variables': ['albedo', 'swgdn', 'swtdn', 't2m'],
'variables_list': [['t2m'], ['albedo', 'swgdn', 'swtdn']]},
'slv_radiation_monthly': {'api_func': <function geodata.datasets.merra2.api_merra2(toDownload, fileGranularity, downloadedFiles)>,
'file_granularity': 'monthly_multiple',
'tasks_func': <function geodata.datasets.merra2.tasks_monthly_merra2(xs, ys, yearmonths, prepare_func, **meta_attrs)>,
'meta_prepare_func': <function geodata.datasets.merra2.prepare_meta_merra2(xs, ys, year, month, template, module, **params)>,
'prepare_func': <function geodata.datasets.merra2.prepare_slv_radiation(fn, year, month, xs, ys)>,
'template': '/Users/geodata/.local/geodata/merra2/{year}/MERRA2_*.tavgM_2d_slv_rad_Nx.*.nc4',
'url': ['https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXSLV.5.12.4/{year}/MERRA2_{spinup}.tavgM_2d_slv_Nx.{year}{month:0>2}.nc4',
'https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2_MONTHLY/M2TMNXRAD.5.12.4/{year}/MERRA2_{spinup}.tavgM_2d_rad_Nx.{year}{month:0>2}.nc4'],
'fn': '/Users/geodata/.local/geodata/merra2/{year}/MERRA2_{spinup}.tavgM_2d_slv_rad_Nx.{year}{month:0>2}.nc4',
'variables': ['albedo', 'swgdn', 'swtdn', 't2m']},
'surface_aerosol_hourly': {'api_func': <function geodata.datasets.merra2.api_merra2(toDownload, fileGranularity, downloadedFiles)>,
'file_granularity': 'daily',
'tasks_func': <function geodata.datasets.merra2.tasks_daily_merra2(xs, ys, yearmonths, prepare_func, **meta_attrs)>,
'meta_prepare_func': <function geodata.datasets.merra2.prepare_meta_merra2(xs, ys, year, month, template, module, **params)>,
'prepare_func': <function geodata.datasets.merra2.prepare_month_aerosol(fn, year, month, xs, ys)>,
'template': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_*.tavg1_2d_aer_Nx.*.nc4',
'url': 'https://goldsmr4.gesdisc.eosdis.nasa.gov/data/MERRA2/M2T1NXAER.5.12.4/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_aer_Nx.{year}{month:0>2}{day:0>2}.nc4',
'fn': '/Users/geodata/.local/geodata/merra2/{year}/{month:0>2}/MERRA2_{spinup}.tavg1_2d_aer_Nx.{year}{month:0>2}{day:0>2}.nc4',
'variables': ['bcsmass', 'dusmass25', 'ocsmass', 'so4smass', 'sssmass25']}}
For MERRA2, you can confirm which variables were downloaded this way:
cutout.dataset_module.weather_data_config["surface_flux_monthly"]["variables"]
['ustar',
'z0m',
'disph',
'rhoa',
'ulml',
'vlml',
'tstar',
'hlml',
'tlml',
'pblh',
'hflux',
'eflux']
Step 4 - Generate Outputs#
geodata currently supports the following wind outputs using MERRA2 surface flux diagnostic data:
- Wind generation time-series (wind)
- Wind speed time-series (windspd)
- Wind power density time-series (windwpd)
Wind Generation Time-series#
Convert wind speeds to wind energy generation for a given turbine using the following code:
ds_wind = cutout.wind(turbine="Suzlon_S82_1.5_MW", smooth=True, var_height="lml")
Going over the parameters:
- cutout - Cutout - A cutout created by geodata.Cutout()
- turbine - string or dict - Name of a turbine known by the reatlas client, or a turbine config dictionary with the keys ‘hub_height’ for the hub height and ‘V’, ‘POW’ defining the power curve. For a full list of currently supported turbines, see the list of Turbines here.
- smooth - bool or dict - If True, smooth the power curve with a Gaussian kernel as determined for the Danish wind fleet (Delta_v = 1.27 and sigma = 2.29). A dict allows these values to be tuned.
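To illustrate what a power curve defined by ‘V’ and ‘POW’ does, the sketch below linearly interpolates a turbine config dictionary at a given wind speed. It is not geodata's conversion code. The curve values are invented for illustration, the helper capacity_factor is hypothetical, and normalising by the maximum power is an assumption.

```python
from bisect import bisect_right

# Hypothetical turbine config using the keys named above.
# The numbers are invented and do not describe a real turbine.
turbine = {
    "hub_height": 80,                       # metres
    "V": [0, 3, 6, 9, 12, 25],              # wind speed in m/s
    "POW": [0.0, 0.0, 0.4, 1.1, 1.5, 1.5],  # power output in MW
}


def capacity_factor(speed, config):
    """Linearly interpolate the power curve and normalise by its maximum power."""
    v, p = config["V"], config["POW"]
    if speed <= v[0] or speed > v[-1]:
        return 0.0  # outside the tabulated curve: no generation
    i = min(bisect_right(v, speed) - 1, len(v) - 2)  # index of the segment's left point
    frac = (speed - v[i]) / (v[i + 1] - v[i])
    power = p[i] + frac * (p[i + 1] - p[i])
    return power / max(p)


for speed in (2.0, 7.5, 12.0, 30.0):
    print(speed, round(capacity_factor(speed, turbine), 3))
```

Speeds below the first non-zero point or above the last tabulated speed give zero output, and speeds at or above the rated point give 1.0.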
Note - You can also specify all of the general conversion arguments documented in the convert_and_aggregate function (e.g. var_height='lml').
The convert function returns an xarray DataArray, a labeled in-memory array:
ds_wind
<xarray.DataArray (time: 1, y: 3, x: 2)> Size: 48B
array([[[0.08232222, 0.20032592],
[0.01580369, 0.07699983],
[0.00229215, 0.01120574]]])
Coordinates:
* x (x) float64 16B 138.8 139.4
* y (y) float64 24B 35.0 35.5 36.0
* time (time) datetime64[ns] 8B 2010-07-01T00:30:00
lon (x) float64 16B 138.8 139.4
lat (y) float64 24B 35.0 35.5 36.0
To convert this array to a more conventional dataframe, run:
df_wind = ds_wind.to_dataframe(name="wind")
which converts the xarray DataArray into a pandas dataframe:
df_wind
| time | y | x | lon | lat | wind |
|---|---|---|---|---|---|
| 2010-07-01 00:30:00 | 35.0 | 138.750 | 138.750 | 35.0 | 0.082322 |
| 2010-07-01 00:30:00 | 35.0 | 139.375 | 139.375 | 35.0 | 0.200326 |
| 2010-07-01 00:30:00 | 35.5 | 138.750 | 138.750 | 35.5 | 0.015804 |
| 2010-07-01 00:30:00 | 35.5 | 139.375 | 139.375 | 35.5 | 0.077000 |
| 2010-07-01 00:30:00 | 36.0 | 138.750 | 138.750 | 36.0 | 0.002292 |
| 2010-07-01 00:30:00 | 36.0 | 139.375 | 139.375 | 36.0 | 0.011206 |
To output the data to a csv for separate analysis:
df_wind.to_csv("merra2_wind_data.csv")
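Once exported, the file can be read without geodata or xarray. The sketch below parses a stand-in for merra2_wind_data.csv with the standard csv module and finds the grid cell with the highest value. The column layout (index levels time, y, x followed by lon, lat, wind) is an assumption about how pandas writes this dataframe, and the values are the rounded ones from the table above.

```python
import csv
import io

# Stand-in for the file contents; in practice use open("merra2_wind_data.csv").
csv_text = """time,y,x,lon,lat,wind
2010-07-01 00:30:00,35.0,138.75,138.75,35.0,0.082322
2010-07-01 00:30:00,35.0,139.375,139.375,35.0,0.200326
2010-07-01 00:30:00,35.5,138.75,138.75,35.5,0.015804
2010-07-01 00:30:00,35.5,139.375,139.375,35.5,0.077000
2010-07-01 00:30:00,36.0,138.75,138.75,36.0,0.002292
2010-07-01 00:30:00,36.0,139.375,139.375,36.0,0.011206
"""

rows = list(csv.DictReader(io.StringIO(csv_text)))

# DictReader yields strings, so convert before comparing numerically.
best = max(rows, key=lambda row: float(row["wind"]))
print(best["lat"], best["lon"], best["wind"])  # 35.0 139.375 0.200326
```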
Wind Speed Time-series#
Extract wind speeds at a given height (m/s):
ds_windspd = cutout.windspd(turbine="Vestas_V66_1750kW", var_height="lml")
Going over the parameters:
- cutout - Cutout - A cutout created by geodata.Cutout()
- **params - Must have 1 of the following:
  - turbine - string or dict - Name of a turbine known by the reatlas client, or a turbine config dictionary with the keys ‘hub_height’ for the hub height and ‘V’, ‘POW’ defining the power curve. For a full list of currently supported turbines, see the list of Turbines here.
  - hub_height - num - Extrapolation height (m)
Note - You can also specify all of the general conversion arguments documented in the convert_and_aggregate function (e.g. var_height='lml').
The convert function returns an xarray DataArray, a labeled in-memory array:
ds_windspd
<xarray.DataArray 'wnd67m' (time: 1, y: 3, x: 2)> Size: 24B
array([[[4.89272 , 6.517427 ],
[2.7624094, 4.864064 ],
[0.9258263, 2.434551 ]]], dtype=float32)
Coordinates:
* x (x) float64 16B 138.8 139.4
* y (y) float64 24B 35.0 35.5 36.0
* time (time) datetime64[ns] 8B 2010-07-01T00:30:00
lon (x) float64 16B 138.8 139.4
lat (y) float64 24B 35.0 35.5 36.0
Attributes:
long_name: extrapolated 67 m wind speed using log ratio, from variable h...
units: m s**-1
To convert this array to a more conventional dataframe, run:
df_windspd = ds_windspd.to_dataframe(name="windspd")
which converts the xarray DataArray into a pandas dataframe:
df_windspd
| time | y | x | lon | lat | windspd |
|---|---|---|---|---|---|
| 2010-07-01 00:30:00 | 35.0 | 138.750 | 138.750 | 35.0 | 4.892720 |
| 2010-07-01 00:30:00 | 35.0 | 139.375 | 139.375 | 35.0 | 6.517427 |
| 2010-07-01 00:30:00 | 35.5 | 138.750 | 138.750 | 35.5 | 2.762409 |
| 2010-07-01 00:30:00 | 35.5 | 139.375 | 139.375 | 35.5 | 4.864064 |
| 2010-07-01 00:30:00 | 36.0 | 138.750 | 138.750 | 36.0 | 0.925826 |
| 2010-07-01 00:30:00 | 36.0 | 139.375 | 139.375 | 36.0 | 2.434551 |
To output the data to a csv for separate analysis:
df_windspd.to_csv("merra2_windspd_data.csv")
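The long_name attribute above mentions extrapolation "using log ratio". A common form of this is the logarithmic wind profile, sketched below. This is a generic textbook formula, not a copy of geodata's code; whether geodata applies a displacement height in exactly this way is an assumption, and the input values are illustrative only.

```python
import math


def extrapolate_wind_speed(speed_ref, height_ref, height_target,
                           roughness, displacement=0.0):
    """Scale a wind speed from height_ref to height_target with a log profile.

    All heights, the roughness length and the displacement height are in metres.
    """
    numerator = math.log((height_target - displacement) / roughness)
    denominator = math.log((height_ref - displacement) / roughness)
    return speed_ref * numerator / denominator


# Illustrative inputs: 5 m/s at a 60 m reference level, roughness length 0.1 m,
# extrapolated to the 67 m height named in the output above.
speed_67m = extrapolate_wind_speed(5.0, height_ref=60.0, height_target=67.0,
                                   roughness=0.1)
print(round(speed_67m, 3))
```

In the MERRA2 surface flux variables listed earlier, hlml, z0m and disph are the likely counterparts of the reference height, roughness length and displacement height, though that mapping is an inference from the variable names.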
Wind Power Density Time-series#
Extract wind power density at a given height, according to: WPD = 0.5 * Density * Windspd^3
ds_windwpd = cutout.windwpd(turbine="Vestas_V66_1750kW", var_height="lml")
Going over the parameters:
- cutout - Cutout - A cutout created by geodata.Cutout()
- **params - Must have 1 of the following:
  - turbine - string or dict - Name of a turbine known by the reatlas client, or a turbine config dictionary with the keys ‘hub_height’ for the hub height and ‘V’, ‘POW’ defining the power curve. For a full list of currently supported turbines, see the list of Turbines here.
  - hub_height - num - Extrapolation height (m)
Note - You can also specify all of the general conversion arguments documented in the convert_and_aggregate function (e.g. var_height='lml').
The convert function returns an xarray DataArray, a labeled in-memory array:
ds_windwpd
<xarray.DataArray (time: 1, y: 3, x: 2)> Size: 24B
array([[[ 66.65265 , 160.12375 ],
[ 11.40101 , 65.33937 ],
[ 0.42476463, 8.158394 ]]], dtype=float32)
Coordinates:
* x (x) float64 16B 138.8 139.4
* y (y) float64 24B 35.0 35.5 36.0
* time (time) datetime64[ns] 8B 2010-07-01T00:30:00
lon (x) float64 16B 138.8 139.4
lat (y) float64 24B 35.0 35.5 36.0
To convert this array to a more conventional dataframe, run:
df_windwpd = ds_windwpd.to_dataframe(name="windwpd")
which converts the xarray DataArray into a pandas dataframe:
df_windwpd
| time | y | x | lon | lat | windwpd |
|---|---|---|---|---|---|
| 2010-07-01 00:30:00 | 35.0 | 138.750 | 138.750 | 35.0 | 66.652649 |
| 2010-07-01 00:30:00 | 35.0 | 139.375 | 139.375 | 35.0 | 160.123749 |
| 2010-07-01 00:30:00 | 35.5 | 138.750 | 138.750 | 35.5 | 11.401010 |
| 2010-07-01 00:30:00 | 35.5 | 139.375 | 139.375 | 35.5 | 65.339371 |
| 2010-07-01 00:30:00 | 36.0 | 138.750 | 138.750 | 36.0 | 0.424765 |
| 2010-07-01 00:30:00 | 36.0 | 139.375 | 139.375 | 36.0 | 8.158394 |
To output the data to a csv for separate analysis:
df_windwpd.to_csv("merra2_windwpd_data.csv")
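As a worked example of the formula WPD = 0.5 * Density * Windspd^3, the sketch below evaluates it for illustrative inputs. The air density of 1.225 kg/m^3 is the standard sea-level value and the 6 m/s wind speed is arbitrary; neither is taken from the cutout.

```python
def wind_power_density(air_density, wind_speed):
    """Wind power density in W/m^2 from air density (kg/m^3) and wind speed (m/s)."""
    return 0.5 * air_density * wind_speed ** 3


wpd = wind_power_density(1.225, 6.0)
print(round(wpd, 1))  # 132.3

# Because of the cubic term, doubling the wind speed multiplies WPD by eight.
ratio = wind_power_density(1.225, 12.0) / wpd
print(round(ratio, 1))  # 8.0
```

The cubic dependence is why small differences in wind speed between neighbouring grid cells produce large differences in the power density values shown above.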