重新分箱¶
日期 | 2007-05-27 (最后修改), 2006-03-27 (创建) |
---|
示例展示了如何重新分箱数据以生成更小或更大的数组,而无需(以及使用)插值。
示例 1¶
在这里,我们处理最简单的情况,其中任何所需的新的形状都是有效的,并且没有对数据进行插值来确定新的值。* 首先,为每个维度创建浮点切片对象。* 其次,使用 mgrid 从切片计算新箱的坐标。* 然后,将坐标转换为整数索引。* 最后,使用“花式索引”在所需索引处评估原始数组。
def rebin( a, newshape ):
'''Rebin an array to a new shape.
'''
assert len(a.shape) == len(newshape)
slices = [ slice(0,old, float(old)/new) for old,new in zip(a.shape,newshape) ]
coordinates = mgrid[slices]
indices = coordinates.astype('i') #choose the biggest smaller integer index
return a[tuple(indices)]
如果我们只对将大小减少某个整数因子感兴趣,那么我们可以使用
def rebin_factor( a, newshape ):
'''Rebin an array to a new shape.
newshape must be a factor of a.shape.
'''
assert len(a.shape) == len(newshape)
assert not sometrue(mod( a.shape, newshape ))
slices = [ slice(None,None, old/new) for old,new in zip(a.shape,newshape) ]
return a[slices]
示例 2¶
这是处理 ndarray 减少情况的另一种方法。这与 IDL 的 rebin 命令完全相同,其中原始数组中的所有值都被求和并分配到新数组中的条目中。与 IDL 一样,新形状必须是旧形状的因子。丑陋的“evList 技巧”构建并执行一个形式为
a.reshape(args[0],factor[0],).sum(1)/factor[0] a.reshape(args[0],factor[0],args[1],factor[1],).sum(1).sum(2)/factor[0]/factor[1]
等等。这种通用形式被扩展以涵盖所需维度的数量。
def rebin(a, *args):
'''rebin ndarray data into a smaller ndarray of the same rank whose dimensions
are factors of the original dimensions. eg. An array with 6 columns and 4 rows
can be reduced to have 6,3,2 or 1 columns and 4,2 or 1 rows.
example usages:
>>> a=rand(6,4); b=rebin(a,3,2)
>>> a=rand(6); b=rebin(a,2)
'''
shape = a.shape
lenShape = len(shape)
factor = asarray(shape)/asarray(args)
evList = ['a.reshape('] + \
['args[%d],factor[%d],'%(i,i) for i in range(lenShape)] + \
[')'] + ['.sum(%d)'%(i+1) for i in range(lenShape)] + \
['/factor[%d]'%i for i in range(lenShape)]
print ''.join(evList)
return eval(''.join(evList))
上面的代码返回一个与输入数组类型相同的数组。如果输入是整数数组,则输出值将向下取整。如果您想要一个浮点数组,该数组可以正确地平均输入值而不会取整,您可以执行以下操作。
a.reshape(args[0],factor[0],).mean(1)
a.reshape(args[0],factor[0],args[1],factor[1],).mean(1).mean(2)
def rebin(a, *args):
shape = a.shape
lenShape = len(shape)
factor = asarray(shape)/asarray(args)
evList = ['a.reshape('] + \
['args[%d],factor[%d],'%(i,i) for i in range(lenShape)] + \
[')'] + ['.mean(%d)'%(i+1) for i in range(lenShape)]
print ''.join(evList)
return eval(''.join(evList))
一些测试用例
# 2-D case
a=rand(6,4)
print a
b=rebin(a,6,4)
print b
b=rebin(a,6,2)
print b
b=rebin(a,3,2)
print b
b=rebin(a,1,1)
# 1-D case
print b
a=rand(4)
print a
b=rebin(a,4)
print b
b=rebin(a,2)
print b
b=rebin(a,1)
print b
示例 3¶
一个 Python 版本的 congrid,用于在 IDL 中使用,用于使用各种最近邻和插值例程将数据重采样到任意大小。
import numpy as n
import scipy.interpolate
import scipy.ndimage
def congrid(a, newdims, method='linear', centre=False, minusone=False):
'''Arbitrary resampling of source array to new dimension sizes.
Currently only supports maintaining the same number of dimensions.
To use 1-D arrays, first promote them to shape (x,1).
Uses the same parameters and creates the same co-ordinate lookup points
as IDL''s congrid routine, which apparently originally came from a VAX/VMS
routine of the same name.
method:
neighbour - closest value from original data
nearest and linear - uses n x 1-D interpolations using
scipy.interpolate.interp1d
(see Numerical Recipes for validity of use of n 1-D interpolations)
spline - uses ndimage.map_coordinates
centre:
True - interpolation points are at the centres of the bins
False - points are at the front edge of the bin
minusone:
For example- inarray.shape = (i,j) & new dimensions = (x,y)
False - inarray is resampled by factors of (i/x) * (j/y)
True - inarray is resampled by(i-1)/(x-1) * (j-1)/(y-1)
This prevents extrapolation one element beyond bounds of input array.
'''
if not a.dtype in [n.float64, n.float32]:
a = n.cast[float](a)
m1 = n.cast[int](minusone)
ofs = n.cast[int](centre) * 0.5
old = n.array( a.shape )
ndims = len( a.shape )
if len( newdims ) != ndims:
print "[congrid] dimensions error. " \
"This routine currently only support " \
"rebinning to the same number of dimensions."
return None
newdims = n.asarray( newdims, dtype=float )
dimlist = []
if method == 'neighbour':
for i in range( ndims ):
base = n.indices(newdims)[i]
dimlist.append( (old[i] - m1) / (newdims[i] - m1) \
* (base + ofs) - ofs )
cd = n.array( dimlist ).round().astype(int)
newa = a[list( cd )]
return newa
elif method in ['nearest','linear']:
# calculate new dims
for i in range( ndims ):
base = n.arange( newdims[i] )
dimlist.append( (old[i] - m1) / (newdims[i] - m1) \
* (base + ofs) - ofs )
# specify old dims
olddims = [n.arange(i, dtype = n.float) for i in list( a.shape )]
# first interpolation - for ndims = any
mint = scipy.interpolate.interp1d( olddims[-1], a, kind=method )
newa = mint( dimlist[-1] )
trorder = [ndims - 1] + range( ndims - 1 )
for i in range( ndims - 2, -1, -1 ):
newa = newa.transpose( trorder )
mint = scipy.interpolate.interp1d( olddims[i], newa, kind=method )
newa = mint( dimlist[i] )
if ndims > 1:
# need one more transpose to return to original dimensions
newa = newa.transpose( trorder )
return newa
elif method in ['spline']:
oslices = [ slice(0,j) for j in old ]
oldcoords = n.ogrid[oslices]
nslices = [ slice(0,j) for j in list(newdims) ]
newcoords = n.mgrid[nslices]
newcoords_dims = range(n.rank(newcoords))
#make first index last
newcoords_dims.append(newcoords_dims.pop(0))
newcoords_tr = newcoords.transpose(newcoords_dims)
# makes a view that affects newcoords
newcoords_tr += ofs
deltas = (n.asarray(old) - m1) / (newdims - m1)
newcoords_tr *= deltas
newcoords_tr -= ofs
newa = scipy.ndimage.map_coordinates(a, newcoords)
return newa
else:
print "Congrid error: Unrecognized interpolation type.\n", \
"Currently only \'neighbour\', \'nearest\',\'linear\',", \
"and \'spline\' are supported."
return None
章节作者:未知[3]、DavidLinke、AngusMcMorland、PauGargallo