Opening multiple files from URLs with xarray and open_mfdataset


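I am trying to download multiple CMIP6 data files covering 2015-2050 to get high-resolution wind gust data. The dataset obtained from here contains 432 files (see the screenshot for the search terms used to narrow it down).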

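I can open the files individually by right-clicking the OpenDAP download button (highlighted in red in the screenshot) and pasting each URL into the open_mfdataset function, as shown below: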

import xarray

ds = xarray.open_mfdataset([
    'http://esgf-data3.ceda.ac.uk/thredds/dodsC/esg_cmip6/CMIP6/HighResMIP/CMCC/CMCC-CM2-VHR4/highres-future/r1i1p1f1/6hrPlevPt/sfcWind/gn/v20190509/sfcWind_6hrPlevPt_CMCC-CM2-VHR4_highres-future_r1i1p1f1_gn_201501010000-201501311800.nc',
    'http://esgf-data3.ceda.ac.uk/thredds/dodsC/esg_cmip6/CMIP6/HighResMIP/CMCC/CMCC-CM2-VHR4/highres-future/r1i1p1f1/6hrPlevPt/sfcWind/gn/v20190509/sfcWind_6hrPlevPt_CMCC-CM2-VHR4_highres-future_r1i1p1f1_gn_201502010000-201502281800.nc',
])

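This works, but with 432 files it takes a very long time to do by hand. I have tried other approaches, and I feel there must be an efficient way to do this with xarray that I am missing. I would really appreciate some help. Thanks.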

Edit: I got it working using the "THREDDS Catalog" link in the screenshot below and the following code:

import pandas as pd
import xarray

# read the file listing from the THREDDS catalog page
df = pd.read_html('http://esg.lasg.ac.cn/thredds/catalog/esgcet/180/CMIP6.HighResMIP.CAS.FGOALS-f3-H.highresSST-future.r1i1p1f1.6hrPlevPt.psl.gr.v20200521.html#CMIP6.HighResMIP.CAS.FGOALS-f3-H.highresSST-future.r1i1p1f1.6hrPlevPt.psl.gr.v20200521', skiprows=1)
df = df[0]

# keep the 432 relevant rows (one per file)
df = df[:432]

# prepend the OPeNDAP base URL to each file name to get one URL per file
df['url'] = 'http://esg.lasg.ac.cn/thredds/dodsC/esg_dataroot/CMIP6/HighResMIP/CAS/FGOALS-f3-H/highresSST-future/r1i1p1f1/6hrPlevPt/psl/gr/v20200521/' + df['CMIP6.HighResMIP.CAS.FGOALS-f3-H.highresSST-future.r1i1p1f1.6hrPlevPt.psl.gr'].astype(str)
filelist = df['url'].tolist()

# try the first 10 files to check that it works (change the number as needed)
test = filelist[:10]

# open those files as a single dataset
ds = xarray.open_mfdataset(test)
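Slicing the first 432 rows depends on the exact layout of the catalog page. A slightly more defensive variant is to keep only the entries that look like NetCDF files. The sketch below is only an illustration: the helper name `names_to_urls` and the demo file names are made up, and it assumes that every data file in the catalog listing ends in `.nc`.

```python
# OPeNDAP prefix copied from the code above.
DODS_ROOT = ('http://esg.lasg.ac.cn/thredds/dodsC/esg_dataroot/CMIP6/HighResMIP/'
             'CAS/FGOALS-f3-H/highresSST-future/r1i1p1f1/6hrPlevPt/psl/gr/v20200521/')


def names_to_urls(names, root=DODS_ROOT):
    """Build one OPeNDAP URL per catalog entry whose name ends in '.nc'.

    `names` can be any iterable, e.g. the file-name column of the table
    returned by pd.read_html. Other rows (headers, folders, NaN) are skipped.
    """
    urls = []
    for name in names:
        text = str(name).strip()
        if text.endswith('.nc'):
            urls.append(root + text)
    return urls


# Stand-in for the catalog column; these entries are invented for the demo.
demo_names = ['psl_demo_201501.nc', 'psl_demo_201502.nc', float('nan'), 'Folder/']
demo_urls = names_to_urls(demo_names)
print(len(demo_urls))
```

With the real table, the call would be `names_to_urls(df[column_name])`, where `column_name` is the long column label used above, and the result can be passed to `xarray.open_mfdataset` in place of `filelist`.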

侃侃无极


1 Answer

慕盖茨4494581

As we know, wildcards cannot be used to access data on an OPeNDAP server. But we can exploit the regular structure of the CMIP model output file names and build the names by hand:

import calendar
import pandas as pd
import xarray as xr

# base url
base_url = 'http://esgf-data3.ceda.ac.uk/thredds/dodsC/esg_cmip6/CMIP6/HighResMIP/CMCC/CMCC-CM2-VHR4/highres-future/r1i1p1f1/6hrPlevPt/sfcWind/gn/v20190509/sfcWind_6hrPlevPt_CMCC-CM2-VHR4_highres-future_r1i1p1f1_gn_'

# period of interest
pr = pd.period_range(start='2015-01', end='2050-12', freq='M')

file_list = []
for dt in pr:
    # get current year and month
    year = dt.strftime('%Y')
    month = dt.strftime('%m')

    # get last day of month (no leap years in these CMIP files)
    last_day_of_month = str(calendar.monthrange(dt.year, dt.month)[1])
    if last_day_of_month == '29':
        last_day_of_month = '28'

    # build complete file name
    single_file = (base_url + year + month + '010000-' + year + month + last_day_of_month + '1800.nc')
    file_list.append(single_file)

# open all files as a single dataset, once the list is complete
ds = xr.open_mfdataset(file_list)
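The same idea can be wrapped in a small function, which makes the generated names easy to check before any network access happens. This is only a sketch under stated assumptions: the helper names `monthly_urls` and `open_remote` are made up for illustration, and it assumes that every file covers one calendar month, ends at 18:00 on the last day, and follows a 365-day calendar, as the two URLs in the question suggest. The `combine='by_coords'` and `parallel=True` arguments are optional suggestions that are not part of the answer above; `parallel=True` relies on dask being installed.

```python
import pandas as pd

BASE_URL = ('http://esgf-data3.ceda.ac.uk/thredds/dodsC/esg_cmip6/CMIP6/HighResMIP/CMCC/'
            'CMCC-CM2-VHR4/highres-future/r1i1p1f1/6hrPlevPt/sfcWind/gn/v20190509/'
            'sfcWind_6hrPlevPt_CMCC-CM2-VHR4_highres-future_r1i1p1f1_gn_')

# Days per month in a 365-day (no-leap) calendar: an assumption about these files.
DAYS_NOLEAP = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def monthly_urls(start, end, base_url=BASE_URL):
    """Return one URL per month between start and end (inclusive, 'YYYY-MM')."""
    urls = []
    for period in pd.period_range(start=start, end=end, freq='M'):
        stamp = f'{period.year:04d}{period.month:02d}'
        last_day = DAYS_NOLEAP[period.month - 1]
        urls.append(f'{base_url}{stamp}010000-{stamp}{last_day:02d}1800.nc')
    return urls


def open_remote(urls):
    """Open all URLs lazily as one dataset (needs xarray, dask and network access)."""
    import xarray as xr  # imported here so monthly_urls works without xarray
    return xr.open_mfdataset(urls, combine='by_coords', parallel=True)


file_list = monthly_urls('2015-01', '2050-12')
print(len(file_list))
```

The list holds one entry per month from January 2015 through December 2050, which matches the 432 files mentioned in the question. Following the question's approach of testing on a few files first, `open_remote(file_list[:10])` is a cheap check before opening everything.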
