Xarray 数据处理(2)
Contents
Xarray
数据处理(2)¶
主讲人:李显祥
大气科学学院
import xarray as xr
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
#xr.set_options(display_style="html")
%matplotlib inline
#air_temp = xr.tutorial.load_dataset('air_temperature')
air_temp = xr.open_dataset('air_temperature.nc')
7. 广播和对齐¶
可以直接在
Dataset
和DataArray
对象上进行算术运算。标签会被保留,在运算中 dataArray 的维度会自动对齐。
广播¶
import numpy as np
import xarray as xr
a = xr.DataArray(np.arange(3), dims='time',
coords={'time':np.arange(3)})
b = xr.DataArray(np.arange(4), dims='space',
coords={'space':np.arange(4)})
a + b
<xarray.DataArray (time: 3, space: 4)> array([[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]]) Coordinates: * time (time) int64 0 1 2 * space (space) int64 0 1 2 3
xarray.DataArray
- time: 3
- space: 4
- 0 1 2 3 1 2 3 4 2 3 4 5
array([[0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5]])
- time(time)int640 1 2
array([0, 1, 2])
- space(space)int640 1 2 3
array([0, 1, 2, 3])
思考题 1:
b + a = ?
思考题 2:
anomaly = air_temp.air - air_temp.air.mean(dim='time')
这里使用了广播吗?
对齐¶
atime = np.arange(3)
btime = np.arange(5) + 1
atime, btime
(array([0, 1, 2]), array([1, 2, 3, 4, 5]))
a = xr.DataArray(np.arange(3), dims='time',
coords={'time':atime})
b = xr.DataArray(np.arange(5), dims='time',
coords={'time':btime})
a + b
<xarray.DataArray (time: 2)> array([1, 3]) Coordinates: * time (time) int64 1 2
xarray.DataArray
- time: 2
- 1 3
array([1, 3])
- time(time)int641 2
array([1, 2])
如果我们要保留所有坐标,怎么办呢?
# 也可以保留所有坐标
with xr.set_options(arithmetic_join="outer"):
print(a + b)
<xarray.DataArray (time: 6)>
array([nan, 1., 3., nan, nan, nan])
Coordinates:
* time (time) int64 0 1 2 3 4 5
注意:没有对应坐标的点被设置为 nan
使用 .where()
进行屏蔽¶
means = air_temp.air.mean(dim=['time'])
means.where(means > 273.15).plot()
<matplotlib.collections.QuadMesh at 0x7fdb097b3790>
8. 分组 Groupby¶
xarray 支持和 pandas 类似的分组(“group by”)操作来实现拆散-应用-合并:
将数据根据一定的规则拆散为独立的组
对每个组应用某种函数
将这些组合并成一个单一的数据对象
分组操作对于 Dataset 和 DataArray 对象都适用。
air_temp.air.mean(dim=['lat','lon']).groupby('time.season').mean()
# air_temp.air.groupby('time.season').mean().mean(dim=['lat','lon'])
<xarray.DataArray 'air' (season: 4)> array([273.6497 , 289.2049 , 278.9914 , 283.02814], dtype=float32) Coordinates: * season (season) object 'DJF' 'JJA' 'MAM' 'SON'
xarray.DataArray
'air'
- season: 4
- 273.6 289.2 279.0 283.0
array([273.6497 , 289.2049 , 278.9914 , 283.02814], dtype=float32)
- season(season)object'DJF' 'JJA' 'MAM' 'SON'
array(['DJF', 'JJA', 'MAM', 'SON'], dtype=object)
clim = air_temp.air.groupby('time.month').mean('time')
clim
<xarray.DataArray 'air' (month: 12, lat: 25, lon: 53)> array([[[246.34987, 246.38608, 246.21518, ..., 243.06113, 244.08795, 245.6467 ], [248.8576 , 248.90733, 248.7104 , ..., 241.52866, 243.50865, 246.75471], [251.57712, 251.19661, 250.71463, ..., 243.39891, 246.78462, 251.56572], ..., [295.85028, 295.24405, 295.22684, ..., 295.18625, 294.65707, 294.0485 ], [296.5446 , 296.46982, 296.15994, ..., 295.35593, 295.0812 , 294.53006], [297.15417, 297.2383 , 297.04892, ..., 296.01797, 295.77554, 295.63647]], [[246.67715, 246.40576, 245.9484 , ..., 241.85838, 243.0021 , 244.44383], [247.8001 , 247.75992, 247.47757, ..., 240.64706, 242.26633, 245.06662], [249.07079, 248.57234, 247.94254, ..., 242.42874, 245.33348, 249.72273], ... [297.8426 , 297.1406 , 296.98773, ..., 297.96884, 297.56888, 297.1611 ], [298.58783, 298.42026, 297.96896, ..., 298.16412, 298.19397, 297.9083 ], [298.81143, 298.8566 , 298.62103, ..., 298.72955, 298.7519 , 298.8189 ]], [[247.971 , 248.02118, 247.91302, ..., 239.7719 , 241.02383, 242.62823], [249.73361, 250.16037, 250.48581, ..., 238.78964, 240.96469, 244.11626], [252.0296 , 251.53136, 251.36629, ..., 238.07542, 241.91293, 247.06987], ..., [296.76508, 295.97668, 295.88922, ..., 296.45605, 296.09137, 295.65756], [297.46814, 297.38025, 297.04428, ..., 296.8556 , 296.84668, 296.52133], [297.8809 , 297.9868 , 297.77554, ..., 297.60034, 297.5655 , 297.53763]]], dtype=float32) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0 * month (month) int64 1 2 3 4 5 6 7 8 9 10 11 12
xarray.DataArray
'air'
- month: 12
- lat: 25
- lon: 53
- 246.3 246.4 246.2 245.8 245.2 244.6 ... 298.0 298.0 297.6 297.6 297.5
array([[[246.34987, 246.38608, 246.21518, ..., 243.06113, 244.08795, 245.6467 ], [248.8576 , 248.90733, 248.7104 , ..., 241.52866, 243.50865, 246.75471], [251.57712, 251.19661, 250.71463, ..., 243.39891, 246.78462, 251.56572], ..., [295.85028, 295.24405, 295.22684, ..., 295.18625, 294.65707, 294.0485 ], [296.5446 , 296.46982, 296.15994, ..., 295.35593, 295.0812 , 294.53006], [297.15417, 297.2383 , 297.04892, ..., 296.01797, 295.77554, 295.63647]], [[246.67715, 246.40576, 245.9484 , ..., 241.85838, 243.0021 , 244.44383], [247.8001 , 247.75992, 247.47757, ..., 240.64706, 242.26633, 245.06662], [249.07079, 248.57234, 247.94254, ..., 242.42874, 245.33348, 249.72273], ... [297.8426 , 297.1406 , 296.98773, ..., 297.96884, 297.56888, 297.1611 ], [298.58783, 298.42026, 297.96896, ..., 298.16412, 298.19397, 297.9083 ], [298.81143, 298.8566 , 298.62103, ..., 298.72955, 298.7519 , 298.8189 ]], [[247.971 , 248.02118, 247.91302, ..., 239.7719 , 241.02383, 242.62823], [249.73361, 250.16037, 250.48581, ..., 238.78964, 240.96469, 244.11626], [252.0296 , 251.53136, 251.36629, ..., 238.07542, 241.91293, 247.06987], ..., [296.76508, 295.97668, 295.88922, ..., 296.45605, 296.09137, 295.65756], [297.46814, 297.38025, 297.04428, ..., 296.8556 , 296.84668, 296.52133], [297.8809 , 297.9868 , 297.77554, ..., 297.60034, 297.5655 , 297.53763]]], dtype=float32)
- lat(lat)float3275.0 72.5 70.0 ... 20.0 17.5 15.0
- standard_name :
- latitude
- long_name :
- Latitude
- units :
- degrees_north
- axis :
- Y
array([75. , 72.5, 70. , 67.5, 65. , 62.5, 60. , 57.5, 55. , 52.5, 50. , 47.5, 45. , 42.5, 40. , 37.5, 35. , 32.5, 30. , 27.5, 25. , 22.5, 20. , 17.5, 15. ], dtype=float32)
- lon(lon)float32200.0 202.5 205.0 ... 327.5 330.0
- standard_name :
- longitude
- long_name :
- Longitude
- units :
- degrees_east
- axis :
- X
array([200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. , 222.5, 225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5, 245. , 247.5, 250. , 252.5, 255. , 257.5, 260. , 262.5, 265. , 267.5, 270. , 272.5, 275. , 277.5, 280. , 282.5, 285. , 287.5, 290. , 292.5, 295. , 297.5, 300. , 302.5, 305. , 307.5, 310. , 312.5, 315. , 317.5, 320. , 322.5, 325. , 327.5, 330. ], dtype=float32)
- month(month)int641 2 3 4 5 6 7 8 9 10 11 12
array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12])
我们也可以对拆分的组应用算术运算
anomalies = air_temp.air.groupby('time.month') - clim
anomalies
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)> array([[[ -5.149872 , -3.886078 , -2.7151794 , ..., -10.261139 , -8.587952 , -7.046707 ], [ -5.057617 , -4.4073334 , -4.0104065 , ..., -8.728668 , -8.208664 , -7.454727 ], [ -1.5771179 , -1.3966217 , -1.8246307 , ..., -10.198914 , -10.394623 , -9.865723 ], ..., [ 0.74972534, 0.9559326 , 1.1731567 , ..., 0.21374512, 0.44293213, 0.65148926], [ -0.6445923 , -0.26983643, 0.6300659 , ..., 0.5440674 , 0.8187866 , 0.6699219 ], [ -0.86416626, -0.44830322, 0.05108643, ..., 0.88201904, 1.0144653 , 0.9635315 ]], [[ -4.249878 , -3.686081 , -3.1151886 , ..., -11.061127 , -10.487961 , -9.84671 ], [ -5.257614 , -4.8073425 , -4.5104065 , ..., -10.528656 , -11.008652 , -11.054718 ], [ 1.622879 , 1.6933899 , 1.3853607 , ..., -12.598923 , -13.394623 , -13.06572 ], ... [ -3.0750732 , -2.0867004 , -0.49923706, ..., -1.3660583 , -1.4013672 , -1.3675537 ], [ -1.1781311 , -0.19024658, 0.54571533, ..., -1.5655823 , -1.7566833 , -2.1313477 ], [ -0.09088135, 0.40319824, 0.714447 , ..., -1.9103394 , -2.0755005 , -2.3476257 ]], [[ -2.8809967 , -3.731186 , -4.6230316 , ..., 1.9180908 , 0.466156 , -0.8382416 ], [ 0.15638733, -0.8703766 , -2.09581 , ..., 0.800354 , -0.6746979 , -2.4262695 ], [ 10.960388 , 10.658646 , 10.023697 , ..., 1.8145752 , 0.677063 , -0.7798767 ], ..., [ -2.9750671 , -2.2866821 , -0.79922485, ..., -1.1660461 , -1.0013733 , -0.9675598 ], [ -1.3781433 , -0.4902649 , 0.14572144, ..., -1.1655884 , -1.1566772 , -1.3313293 ], [ -0.19088745, 0.10321045, 0.31445312, ..., -1.1103516 , -1.3754883 , -1.8476257 ]]], dtype=float32) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0 * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 month (time) int64 1 1 1 1 1 1 1 1 1 1 ... 12 12 12 12 12 12 12 12 12 12
xarray.DataArray
'air'
- time: 2920
- lat: 25
- lon: 53
- -5.15 -3.886 -2.715 -1.812 -1.125 ... -0.8265 -1.11 -1.375 -1.848
array([[[ -5.149872 , -3.886078 , -2.7151794 , ..., -10.261139 , -8.587952 , -7.046707 ], [ -5.057617 , -4.4073334 , -4.0104065 , ..., -8.728668 , -8.208664 , -7.454727 ], [ -1.5771179 , -1.3966217 , -1.8246307 , ..., -10.198914 , -10.394623 , -9.865723 ], ..., [ 0.74972534, 0.9559326 , 1.1731567 , ..., 0.21374512, 0.44293213, 0.65148926], [ -0.6445923 , -0.26983643, 0.6300659 , ..., 0.5440674 , 0.8187866 , 0.6699219 ], [ -0.86416626, -0.44830322, 0.05108643, ..., 0.88201904, 1.0144653 , 0.9635315 ]], [[ -4.249878 , -3.686081 , -3.1151886 , ..., -11.061127 , -10.487961 , -9.84671 ], [ -5.257614 , -4.8073425 , -4.5104065 , ..., -10.528656 , -11.008652 , -11.054718 ], [ 1.622879 , 1.6933899 , 1.3853607 , ..., -12.598923 , -13.394623 , -13.06572 ], ... [ -3.0750732 , -2.0867004 , -0.49923706, ..., -1.3660583 , -1.4013672 , -1.3675537 ], [ -1.1781311 , -0.19024658, 0.54571533, ..., -1.5655823 , -1.7566833 , -2.1313477 ], [ -0.09088135, 0.40319824, 0.714447 , ..., -1.9103394 , -2.0755005 , -2.3476257 ]], [[ -2.8809967 , -3.731186 , -4.6230316 , ..., 1.9180908 , 0.466156 , -0.8382416 ], [ 0.15638733, -0.8703766 , -2.09581 , ..., 0.800354 , -0.6746979 , -2.4262695 ], [ 10.960388 , 10.658646 , 10.023697 , ..., 1.8145752 , 0.677063 , -0.7798767 ], ..., [ -2.9750671 , -2.2866821 , -0.79922485, ..., -1.1660461 , -1.0013733 , -0.9675598 ], [ -1.3781433 , -0.4902649 , 0.14572144, ..., -1.1655884 , -1.1566772 , -1.3313293 ], [ -0.19088745, 0.10321045, 0.31445312, ..., -1.1103516 , -1.3754883 , -1.8476257 ]]], dtype=float32)
- lat(lat)float3275.0 72.5 70.0 ... 20.0 17.5 15.0
- standard_name :
- latitude
- long_name :
- Latitude
- units :
- degrees_north
- axis :
- Y
array([75. , 72.5, 70. , 67.5, 65. , 62.5, 60. , 57.5, 55. , 52.5, 50. , 47.5, 45. , 42.5, 40. , 37.5, 35. , 32.5, 30. , 27.5, 25. , 22.5, 20. , 17.5, 15. ], dtype=float32)
- lon(lon)float32200.0 202.5 205.0 ... 327.5 330.0
- standard_name :
- longitude
- long_name :
- Longitude
- units :
- degrees_east
- axis :
- X
array([200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. , 222.5, 225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5, 245. , 247.5, 250. , 252.5, 255. , 257.5, 260. , 262.5, 265. , 267.5, 270. , 272.5, 275. , 277.5, 280. , 282.5, 285. , 287.5, 290. , 292.5, 295. , 297.5, 300. , 302.5, 305. , 307.5, 310. , 312.5, 315. , 317.5, 320. , 322.5, 325. , 327.5, 330. ], dtype=float32)
- time(time)datetime64[ns]2013-01-01 ... 2014-12-31T18:00:00
- standard_name :
- time
- long_name :
- Time
array(['2013-01-01T00:00:00.000000000', '2013-01-01T06:00:00.000000000', '2013-01-01T12:00:00.000000000', ..., '2014-12-31T06:00:00.000000000', '2014-12-31T12:00:00.000000000', '2014-12-31T18:00:00.000000000'], dtype='datetime64[ns]')
- month(time)int641 1 1 1 1 1 1 ... 12 12 12 12 12 12
array([ 1, 1, 1, ..., 12, 12, 12])
注意 xarray 自动改变了 clim 的 month 维度以适应 air 的 time 维度。这也是 xarray 的“广播”功能的优势之一:它会自动根据维度的名字来匹配,我们也不需要像 numpy 那样使用 reshape 来改变数组的形状或者插入一个长度为1的维度 。
anomalies.plot()
(array([8.400000e+01, 9.000000e+02, 1.145500e+04, 9.761600e+04,
5.205310e+05, 2.676686e+06, 4.679230e+05, 8.459900e+04,
8.757000e+03, 4.490000e+02]),
array([-33.740555 , -27.573364 , -21.406174 , -15.238983 , -9.071793 ,
-2.904602 , 3.2625885, 9.429779 , 15.59697 , 21.76416 ,
27.93135 ], dtype=float32),
<BarContainer object of 10 artists>)
anomalies.sel(time='2013-12').plot()
(array([5.0000e+01, 7.0800e+02, 4.2130e+03, 1.4026e+04, 3.8860e+04,
8.7334e+04, 1.4104e+04, 3.9360e+03, 9.3300e+02, 1.3600e+02]),
array([-27.580475 , -22.353567 , -17.12666 , -11.899753 , -6.6728454,
-1.4459381, 3.7809691, 9.007876 , 14.234784 , 19.46169 ,
24.688599 ], dtype=float32),
<BarContainer object of 10 artists>)
anomalies.sel(time='2014-12-01T00:00:00').plot(center=0)
<matplotlib.collections.QuadMesh at 0x7fdaeac3f5d0>
注意这里 anomalies
和上节课的 anomalies
的不同。
重采样 Resample¶
重采样将时间序列改变为新的时间间隔
tmin = air_temp.air.resample(time='1D').min() # Resample to one day '1D
tmax = air_temp.air.resample(time='1D').max()
(tmin.sel(time='2013-08-01')-273.15).plot()
<matplotlib.collections.QuadMesh at 0x7fdaebd904d0>
ds_extremes = xr.Dataset({'tmin': tmin, 'tmax': tmax})
ds_extremes
<xarray.Dataset> Dimensions: (lat: 25, lon: 53, time: 730) Coordinates: * time (time) datetime64[ns] 2013-01-01 2013-01-02 ... 2014-12-31 * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0 Data variables: tmin (time, lat, lon) float32 241.2 241.8 241.8 ... 295.7 295.5 295.2 tmax (time, lat, lon) float32 242.3 242.7 243.5 ... 296.6 296.2 295.8
xarray.Dataset
- lat: 25
- lon: 53
- time: 730
- time(time)datetime64[ns]2013-01-01 ... 2014-12-31
array(['2013-01-01T00:00:00.000000000', '2013-01-02T00:00:00.000000000', '2013-01-03T00:00:00.000000000', ..., '2014-12-29T00:00:00.000000000', '2014-12-30T00:00:00.000000000', '2014-12-31T00:00:00.000000000'], dtype='datetime64[ns]')
- lat(lat)float3275.0 72.5 70.0 ... 20.0 17.5 15.0
- standard_name :
- latitude
- long_name :
- Latitude
- units :
- degrees_north
- axis :
- Y
array([75. , 72.5, 70. , 67.5, 65. , 62.5, 60. , 57.5, 55. , 52.5, 50. , 47.5, 45. , 42.5, 40. , 37.5, 35. , 32.5, 30. , 27.5, 25. , 22.5, 20. , 17.5, 15. ], dtype=float32)
- lon(lon)float32200.0 202.5 205.0 ... 327.5 330.0
- standard_name :
- longitude
- long_name :
- Longitude
- units :
- degrees_east
- axis :
- X
array([200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. , 222.5, 225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5, 245. , 247.5, 250. , 252.5, 255. , 257.5, 260. , 262.5, 265. , 267.5, 270. , 272.5, 275. , 277.5, 280. , 282.5, 285. , 287.5, 290. , 292.5, 295. , 297.5, 300. , 302.5, 305. , 307.5, 310. , 312.5, 315. , 317.5, 320. , 322.5, 325. , 327.5, 330. ], dtype=float32)
- tmin(time, lat, lon)float32241.2 241.8 241.8 ... 295.5 295.2
array([[[241.2 , 241.79999, 241.79999, ..., 232. , 233.59999, 235.79999], [243.59999, 244.09999, 244. , ..., 230.29999, 231.5 , 234.5 ], [250. , 249.79999, 248.89 , ..., 230.7 , 231.79999, 236.09999], ..., [295.6 , 295.4 , 295.4 , ..., 295.4 , 295.1 , 294.69998], [295.9 , 296.19998, 296.29 , ..., 295.6 , 295.5 , 295.1 ], [296.29 , 296.29 , 296.4 , ..., 296.4 , 296.4 , 296.6 ]], [[243.2 , 243.09999, 243.09999, ..., 237.2 , 238.79999, 240.89 ], [246.39 , 245.29999, 244.2 , ..., 234.5 , 234.89 , 237.2 ], [251.89 , 251.29999, 249.79999, ..., 234.7 , 235.89 , 239.7 ], ... [296.19 , 294.88998, 294.38998, ..., 296.59 , 296.1 , 295.19998], [296.69 , 297.38998, 296.6 , ..., 296.29 , 295.99 , 295.19998], [297.79 , 298.49 , 298.1 , ..., 296.29 , 295.88998, 295.5 ]], [[242.48999, 242.39 , 242.09 , ..., 241.68999, 241.48999, 241.79 ], [248.39 , 248.79 , 248.39 , ..., 239.59 , 240.29 , 241.68999], [262.38998, 261.79 , 261.29 , ..., 238.09 , 241.09 , 245.18999], ..., [293.69 , 293.69 , 295.09 , ..., 295.09 , 294.69 , 294.29 ], [296.09 , 296.88998, 297.19 , ..., 295.29 , 295.09 , 294.38998], [297.69 , 298.09 , 298.09 , ..., 295.69 , 295.49 , 295.19 ]]], dtype=float32)
- tmax(time, lat, lon)float32242.3 242.7 243.5 ... 296.2 295.8
array([[[242.29999, 242.7 , 243.5 , ..., 234.39 , 236.09999, 238.7 ], [246.29999, 245.29999, 244.7 , ..., 232.79999, 235.29999, 239.29999], [256.6 , 255.5 , 254.2 , ..., 233.2 , 236.39 , 241.7 ], ..., [296.6 , 296.4 , 296.4 , ..., 296.5 , 295.79 , 295.29 ], [297. , 297.5 , 297.1 , ..., 296.79 , 296.6 , 296.29 ], [297.5 , 297.69998, 297.5 , ..., 297.79 , 298. , 297.9 ]], [[244.09999, 243.79999, 243.5 , ..., 240.89 , 242.7 , 244.79999], [249.5 , 249. , 247.89 , ..., 237.59999, 238.29999, 240.09999], [254.79999, 253.09999, 250.89 , ..., 239.39 , 240.5 , 243.09999], ... [296.59 , 295.88998, 295.79 , ..., 297.29 , 296.69998, 295.5 ], [297.88998, 297.88998, 297.29 , ..., 296.88998, 296.29 , 295.59 ], [298.88998, 298.99 , 298.49 , ..., 297.38998, 296.88998, 296.09 ]], [[245.79 , 244.79 , 243.48999, ..., 245.79 , 246.79 , 247.29 ], [249.89 , 249.29 , 248.89 , ..., 241.29 , 242.48999, 244.29 ], [263.69 , 263.29 , 262.49 , ..., 240.48999, 243.09 , 246.89 ], ..., [297.09 , 297.69 , 298.59 , ..., 295.79 , 295.49 , 294.69 ], [296.99 , 298.38998, 299.29 , ..., 295.79 , 295.69 , 295.19 ], [298.29 , 299.29 , 299.38998, ..., 296.59 , 296.19 , 295.79 ]]], dtype=float32)
ds_extremes.to_dataframe().head()
tmin | tmax | |||
---|---|---|---|---|
lat | lon | time | ||
75.0 | 200.0 | 2013-01-01 | 241.199997 | 242.299988 |
2013-01-02 | 243.199997 | 244.099991 | ||
2013-01-03 | 242.599991 | 244.799988 | ||
2013-01-04 | 247.299988 | 252.389999 | ||
2013-01-05 | 242.799988 | 248.599991 |
如果我们要计算季节平均温度,但是季节不是从12月开始算,而是从1月份开始算,该怎么办?
tmean = air_temp.air.resample(time='QS-JAN').mean()
tmean
<xarray.DataArray 'air' (time: 8, lat: 25, lon: 53)> array([[[244.61775, 244.4874 , 244.16472, ..., 242.67862, 243.6617 , 244.84286], [246.70831, 246.60774, 246.2854 , ..., 241.76208, 243.09488, 245.42445], [249.80577, 249.27916, 248.56662, ..., 242.6106 , 245.18909, 249.2116 ], ..., [295.54068, 294.8626 , 294.5957 , ..., 295.23795, 294.7679 , 294.27173], [296.29684, 296.1032 , 295.6412 , ..., 295.4861 , 295.27814, 294.85345], [296.90457, 296.85693, 296.57358, ..., 296.13675, 295.94586, 295.85483]], [[266.05133, 265.95355, 265.73987, ..., 255.27213, 256.4855 , 258.0242 ], [266.68463, 266.89017, 267.00583, ..., 254.36305, 256.30783, 258.96777], [267.43503, 267.17825, 267.05594, ..., 255.34682, 258.0517 , 261.6338 ], ... [299.39444, 298.6934 , 298.25183, ..., 298.41876, 297.89505, 297.44012], [299.58566, 299.3528 , 298.86545, ..., 298.13773, 298.18146, 297.92966], [299.52676, 299.55106, 299.26666, ..., 298.4734 , 298.55417, 298.6812 ]], [[254.5719 , 254.2065 , 253.78835, ..., 244.84901, 245.83794, 247.16304], [258.69034, 258.4616 , 258.29688, ..., 243.59813, 245.36269, 248.19038], [259.91925, 258.61987, 257.98532, ..., 243.66206, 247.11562, 251.96043], ..., [298.0594 , 297.28436, 297.13278, ..., 297.8992 , 297.5463 , 297.1005 ], [298.82498, 298.7005 , 298.23013, ..., 298.11282, 298.20093, 297.89267], [299.03397, 299.12115, 298.8528 , ..., 298.71136, 298.70218, 298.74704]]], dtype=float32) Coordinates: * time (time) datetime64[ns] 2013-01-01 2013-04-01 ... 2014-10-01 * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0
xarray.DataArray
'air'
- time: 8
- lat: 25
- lon: 53
- 244.6 244.5 244.2 243.6 242.9 242.2 ... 299.0 299.1 298.7 298.7 298.7
array([[[244.61775, 244.4874 , 244.16472, ..., 242.67862, 243.6617 , 244.84286], [246.70831, 246.60774, 246.2854 , ..., 241.76208, 243.09488, 245.42445], [249.80577, 249.27916, 248.56662, ..., 242.6106 , 245.18909, 249.2116 ], ..., [295.54068, 294.8626 , 294.5957 , ..., 295.23795, 294.7679 , 294.27173], [296.29684, 296.1032 , 295.6412 , ..., 295.4861 , 295.27814, 294.85345], [296.90457, 296.85693, 296.57358, ..., 296.13675, 295.94586, 295.85483]], [[266.05133, 265.95355, 265.73987, ..., 255.27213, 256.4855 , 258.0242 ], [266.68463, 266.89017, 267.00583, ..., 254.36305, 256.30783, 258.96777], [267.43503, 267.17825, 267.05594, ..., 255.34682, 258.0517 , 261.6338 ], ... [299.39444, 298.6934 , 298.25183, ..., 298.41876, 297.89505, 297.44012], [299.58566, 299.3528 , 298.86545, ..., 298.13773, 298.18146, 297.92966], [299.52676, 299.55106, 299.26666, ..., 298.4734 , 298.55417, 298.6812 ]], [[254.5719 , 254.2065 , 253.78835, ..., 244.84901, 245.83794, 247.16304], [258.69034, 258.4616 , 258.29688, ..., 243.59813, 245.36269, 248.19038], [259.91925, 258.61987, 257.98532, ..., 243.66206, 247.11562, 251.96043], ..., [298.0594 , 297.28436, 297.13278, ..., 297.8992 , 297.5463 , 297.1005 ], [298.82498, 298.7005 , 298.23013, ..., 298.11282, 298.20093, 297.89267], [299.03397, 299.12115, 298.8528 , ..., 298.71136, 298.70218, 298.74704]]], dtype=float32)
- time(time)datetime64[ns]2013-01-01 ... 2014-10-01
array(['2013-01-01T00:00:00.000000000', '2013-04-01T00:00:00.000000000', '2013-07-01T00:00:00.000000000', '2013-10-01T00:00:00.000000000', '2014-01-01T00:00:00.000000000', '2014-04-01T00:00:00.000000000', '2014-07-01T00:00:00.000000000', '2014-10-01T00:00:00.000000000'], dtype='datetime64[ns]')
- lat(lat)float3275.0 72.5 70.0 ... 20.0 17.5 15.0
- standard_name :
- latitude
- long_name :
- Latitude
- units :
- degrees_north
- axis :
- Y
array([75. , 72.5, 70. , 67.5, 65. , 62.5, 60. , 57.5, 55. , 52.5, 50. , 47.5, 45. , 42.5, 40. , 37.5, 35. , 32.5, 30. , 27.5, 25. , 22.5, 20. , 17.5, 15. ], dtype=float32)
- lon(lon)float32200.0 202.5 205.0 ... 327.5 330.0
- standard_name :
- longitude
- long_name :
- Longitude
- units :
- degrees_east
- axis :
- X
array([200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. , 222.5, 225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5, 245. , 247.5, 250. , 252.5, 255. , 257.5, 260. , 262.5, 265. , 267.5, 270. , 272.5, 275. , 277.5, 280. , 282.5, 285. , 287.5, 290. , 292.5, 295. , 297.5, 300. , 302.5, 305. , 307.5, 310. , 312.5, 315. , 317.5, 320. , 322.5, 325. , 327.5, 330. ], dtype=float32)
我们可以进一步求长期性的季节性平均(即季节性气候):
tmean.groupby('time.month').mean()
<xarray.DataArray 'air' (month: 4, lat: 25, lon: 53)> array([[[248.06451, 247.87129, 247.50122, ..., 242.38834, 243.41504, 244.81207], [249.94589, 249.91418, 249.65762, ..., 241.28279, 242.95572, 245.75687], [251.79315, 251.37738, 250.85794, ..., 242.50226, 245.54515, 250.02003], ..., [295.80817, 295.299 , 295.20734, ..., 294.8109 , 294.21518, 293.65 ], [296.67496, 296.5963 , 296.23596, ..., 294.91943, 294.63684, 294.14258], [297.27472, 297.30914, 297.05096, ..., 295.5161 , 295.29193, 295.1877 ]], [[266.2707 , 266.1467 , 265.92212, ..., 255.81694, 256.9698 , 258.4292 ], [267.29608, 267.4804 , 267.57135, ..., 254.85863, 256.7419 , 259.33945], [268.3296 , 268.1264 , 268.0636 , ..., 256.2502 , 258.92676, 262.47302], ... [299.19385, 298.38416, 297.86572, ..., 298.37845, 297.90448, 297.50476], [299.31348, 299.026 , 298.5249 , ..., 298.23846, 298.32867, 298.1369 ], [299.2212 , 299.25613, 299.00455, ..., 298.67078, 298.75018, 298.88126]], [[254.51819, 254.2999 , 253.97646, ..., 243.64935, 244.68417, 246.07224], [259.2971 , 259.24634, 259.14093, ..., 242.3569 , 244.08167, 246.92615], [260.90253, 259.7304 , 259.12848, ..., 242.1542 , 245.5603 , 250.43971], ..., [297.9024 , 297.14087, 296.95703, ..., 297.8386 , 297.45078, 297.03668], [298.49506, 298.34576, 297.91656, ..., 298.09235, 298.10907, 297.82123], [298.68652, 298.7552 , 298.51666, ..., 298.68442, 298.65936, 298.70523]]], dtype=float32) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0 * month (month) int64 1 4 7 10
xarray.DataArray
'air'
- month: 4
- lat: 25
- lon: 53
- 248.1 247.9 247.5 246.9 246.2 245.5 ... 299.0 299.1 298.7 298.7 298.7
array([[[248.06451, 247.87129, 247.50122, ..., 242.38834, 243.41504, 244.81207], [249.94589, 249.91418, 249.65762, ..., 241.28279, 242.95572, 245.75687], [251.79315, 251.37738, 250.85794, ..., 242.50226, 245.54515, 250.02003], ..., [295.80817, 295.299 , 295.20734, ..., 294.8109 , 294.21518, 293.65 ], [296.67496, 296.5963 , 296.23596, ..., 294.91943, 294.63684, 294.14258], [297.27472, 297.30914, 297.05096, ..., 295.5161 , 295.29193, 295.1877 ]], [[266.2707 , 266.1467 , 265.92212, ..., 255.81694, 256.9698 , 258.4292 ], [267.29608, 267.4804 , 267.57135, ..., 254.85863, 256.7419 , 259.33945], [268.3296 , 268.1264 , 268.0636 , ..., 256.2502 , 258.92676, 262.47302], ... [299.19385, 298.38416, 297.86572, ..., 298.37845, 297.90448, 297.50476], [299.31348, 299.026 , 298.5249 , ..., 298.23846, 298.32867, 298.1369 ], [299.2212 , 299.25613, 299.00455, ..., 298.67078, 298.75018, 298.88126]], [[254.51819, 254.2999 , 253.97646, ..., 243.64935, 244.68417, 246.07224], [259.2971 , 259.24634, 259.14093, ..., 242.3569 , 244.08167, 246.92615], [260.90253, 259.7304 , 259.12848, ..., 242.1542 , 245.5603 , 250.43971], ..., [297.9024 , 297.14087, 296.95703, ..., 297.8386 , 297.45078, 297.03668], [298.49506, 298.34576, 297.91656, ..., 298.09235, 298.10907, 297.82123], [298.68652, 298.7552 , 298.51666, ..., 298.68442, 298.65936, 298.70523]]], dtype=float32)
- lat(lat)float3275.0 72.5 70.0 ... 20.0 17.5 15.0
- standard_name :
- latitude
- long_name :
- Latitude
- units :
- degrees_north
- axis :
- Y
array([75. , 72.5, 70. , 67.5, 65. , 62.5, 60. , 57.5, 55. , 52.5, 50. , 47.5, 45. , 42.5, 40. , 37.5, 35. , 32.5, 30. , 27.5, 25. , 22.5, 20. , 17.5, 15. ], dtype=float32)
- lon(lon)float32200.0 202.5 205.0 ... 327.5 330.0
- standard_name :
- longitude
- long_name :
- Longitude
- units :
- degrees_east
- axis :
- X
array([200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. , 222.5, 225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5, 245. , 247.5, 250. , 252.5, 255. , 257.5, 260. , 262.5, 265. , 267.5, 270. , 272.5, 275. , 277.5, 280. , 282.5, 285. , 287.5, 290. , 292.5, 295. , 297.5, 300. , 302.5, 305. , 307.5, 310. , 312.5, 315. , 317.5, 320. , 322.5, 325. , 327.5, 330. ], dtype=float32)
- month(month)int641 4 7 10
array([ 1, 4, 7, 10])
9. 滚动 Rolling¶
xarray 支持滚动窗口操作
ctime = np.arange(10) + 1
c = xr.DataArray(np.arange(10), dims='time',
coords={'time':ctime})
rol = c.rolling(time=3)
#rol
for e in rol:
print(e)
(<xarray.DataArray 'time' ()>
array(1)
Coordinates:
time int64 1, <xarray.DataArray (time: 1)>
array([nan])
Coordinates:
* time (time) int64 1)
(<xarray.DataArray 'time' ()>
array(2)
Coordinates:
time int64 2, <xarray.DataArray (time: 2)>
array([nan, nan])
Coordinates:
* time (time) int64 1 2)
(<xarray.DataArray 'time' ()>
array(3)
Coordinates:
time int64 3, <xarray.DataArray (time: 3)>
array([0., 1., 2.])
Coordinates:
* time (time) int64 1 2 3)
(<xarray.DataArray 'time' ()>
array(4)
Coordinates:
time int64 4, <xarray.DataArray (time: 3)>
array([1., 2., 3.])
Coordinates:
* time (time) int64 2 3 4)
(<xarray.DataArray 'time' ()>
array(5)
Coordinates:
time int64 5, <xarray.DataArray (time: 3)>
array([2., 3., 4.])
Coordinates:
* time (time) int64 3 4 5)
(<xarray.DataArray 'time' ()>
array(6)
Coordinates:
time int64 6, <xarray.DataArray (time: 3)>
array([3., 4., 5.])
Coordinates:
* time (time) int64 4 5 6)
(<xarray.DataArray 'time' ()>
array(7)
Coordinates:
time int64 7, <xarray.DataArray (time: 3)>
array([4., 5., 6.])
Coordinates:
* time (time) int64 5 6 7)
(<xarray.DataArray 'time' ()>
array(8)
Coordinates:
time int64 8, <xarray.DataArray (time: 3)>
array([5., 6., 7.])
Coordinates:
* time (time) int64 6 7 8)
(<xarray.DataArray 'time' ()>
array(9)
Coordinates:
time int64 9, <xarray.DataArray (time: 3)>
array([6., 7., 8.])
Coordinates:
* time (time) int64 7 8 9)
(<xarray.DataArray 'time' ()>
array(10)
Coordinates:
time int64 10, <xarray.DataArray (time: 3)>
array([7., 8., 9.])
Coordinates:
* time (time) int64 8 9 10)
air_roll = air_temp.air.rolling(time=3) #,center=True)
air_roll
DataArrayRolling [time->3]
air_roll.mean()
<xarray.DataArray 'air' (time: 2920, lat: 25, lon: 53)> array([[[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ... [295.19 , 295.62332, 297.15665, ..., 295.45667, 295.19 , 294.52335], [296.69003, 297.82333, 298.39 , ..., 295.52335, 295.38998, 294.65665], [298.09 , 298.95667, 298.88998, ..., 296.12332, 295.75668, 295.49002]], [[244.79 , 244.02332, 242.95667, ..., 243.05666, 243.32332, 243.82332], [249.62334, 249.18999, 248.48999, ..., 240.48999, 241.35664, 242.89 ], [262.68997, 262.05667, 261.45667, ..., 239.92334, 242.45667, 246.12334], ..., [294.09 , 294.29 , 295.99 , ..., 295.29 , 295.05664, 294.55667], [296.39 , 297.32333, 297.69 , ..., 295.49002, 295.4233 , 294.79 ], [297.88998, 298.55664, 298.45667, ..., 296.09 , 295.82333, 295.55667]]], dtype=float32) Coordinates: * lat (lat) float32 75.0 72.5 70.0 67.5 65.0 ... 25.0 22.5 20.0 17.5 15.0 * lon (lon) float32 200.0 202.5 205.0 207.5 ... 322.5 325.0 327.5 330.0 * time (time) datetime64[ns] 2013-01-01 ... 2014-12-31T18:00:00 Attributes: long_name: 4xDaily Air temperature at sigma level 995 units: degK precision: 2 GRIB_id: 11 GRIB_name: TMP var_desc: Air temperature dataset: NMC Reanalysis level_desc: Surface statistic: Individual Obs parent_stat: Other actual_range: [185.16 322.1 ]
xarray.DataArray
'air'
- time: 2920
- lat: 25
- lon: 53
- nan nan nan nan nan nan nan ... 297.3 297.2 297.0 296.1 295.8 295.6
array([[[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ..., [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan]], [[ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], [ nan, nan, nan, ..., nan, nan, nan], ... [295.19 , 295.62332, 297.15665, ..., 295.45667, 295.19 , 294.52335], [296.69003, 297.82333, 298.39 , ..., 295.52335, 295.38998, 294.65665], [298.09 , 298.95667, 298.88998, ..., 296.12332, 295.75668, 295.49002]], [[244.79 , 244.02332, 242.95667, ..., 243.05666, 243.32332, 243.82332], [249.62334, 249.18999, 248.48999, ..., 240.48999, 241.35664, 242.89 ], [262.68997, 262.05667, 261.45667, ..., 239.92334, 242.45667, 246.12334], ..., [294.09 , 294.29 , 295.99 , ..., 295.29 , 295.05664, 294.55667], [296.39 , 297.32333, 297.69 , ..., 295.49002, 295.4233 , 294.79 ], [297.88998, 298.55664, 298.45667, ..., 296.09 , 295.82333, 295.55667]]], dtype=float32)
- lat(lat)float3275.0 72.5 70.0 ... 20.0 17.5 15.0
- standard_name :
- latitude
- long_name :
- Latitude
- units :
- degrees_north
- axis :
- Y
array([75. , 72.5, 70. , 67.5, 65. , 62.5, 60. , 57.5, 55. , 52.5, 50. , 47.5, 45. , 42.5, 40. , 37.5, 35. , 32.5, 30. , 27.5, 25. , 22.5, 20. , 17.5, 15. ], dtype=float32)
- lon(lon)float32200.0 202.5 205.0 ... 327.5 330.0
- standard_name :
- longitude
- long_name :
- Longitude
- units :
- degrees_east
- axis :
- X
array([200. , 202.5, 205. , 207.5, 210. , 212.5, 215. , 217.5, 220. , 222.5, 225. , 227.5, 230. , 232.5, 235. , 237.5, 240. , 242.5, 245. , 247.5, 250. , 252.5, 255. , 257.5, 260. , 262.5, 265. , 267.5, 270. , 272.5, 275. , 277.5, 280. , 282.5, 285. , 287.5, 290. , 292.5, 295. , 297.5, 300. , 302.5, 305. , 307.5, 310. , 312.5, 315. , 317.5, 320. , 322.5, 325. , 327.5, 330. ], dtype=float32)
- time(time)datetime64[ns]2013-01-01 ... 2014-12-31T18:00:00
- standard_name :
- time
- long_name :
- Time
array(['2013-01-01T00:00:00.000000000', '2013-01-01T06:00:00.000000000', '2013-01-01T12:00:00.000000000', ..., '2014-12-31T06:00:00.000000000', '2014-12-31T12:00:00.000000000', '2014-12-31T18:00:00.000000000'], dtype='datetime64[ns]')
- long_name :
- 4xDaily Air temperature at sigma level 995
- units :
- degK
- precision :
- 2
- GRIB_id :
- 11
- GRIB_name :
- TMP
- var_desc :
- Air temperature
- dataset :
- NMC Reanalysis
- level_desc :
- Surface
- statistic :
- Individual Obs
- parent_stat :
- Other
- actual_range :
- [185.16 322.1 ]
10. I/O¶
xarray
支持读取和保存许多格式的文件
netcdf
、hdf
可以直接读写grib
(version 1 and 2) 需要借助cfgrib
来读取 (conda install -c conda-forge cfgrib
),支持Linux
,Mac OS X
, 和Windows
import xarray as xr
xr.open_dataset(grib_file,engine='cfgrib')
或者
xr.open_dataarray(grib_file,engine='cfgrib')
保存到 netcdf
文件
ds_extremes.to_netcdf('ds_extremes.nc')