@axiaoxin
Last active May 12, 2017 08:54
Find the gaps between non-contiguous ids
In [25]: from itertools import groupby
In [26]: f = open('Fattrid.csv')
In [27]: lines = f.readlines()
In [28]: f.close()
In [29]: ids = sorted([int(line.split()[0]) for line in lines])
In [30]: fun = lambda pair: pair[1] - pair[0]  # pair is (index, value); key is value minus index
In [31]: c_ids_list = []
In [32]: for k, g in groupby(enumerate(ids), fun):
...:     c_ids = [v for i, v in g]
...:     c_ids_list.append(c_ids)
...:
In [35]: for i, l in enumerate(c_ids_list):
...:     if i >= len(c_ids_list) - 1:
...:         break
...:     if c_ids_list[i+1][0] - c_ids_list[i][-1] > 1000:
...:         print('[%s, %s]' % (c_ids_list[i][-1] + 1, c_ids_list[i+1][0] - 1))
...:
[922, 9043]
[16182, 36999]
[38000, 49999]
[680119, 1999999]
[2236283, 2339999]
axiaoxin commented May 12, 2017

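The file contains many lines, each holding an id and a name. Some new ids now need to be added, and they must not collide with any existing id, so the goal is to find an unused range with room for at least 1000 ids in which to add them.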
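As a rough sketch of that requirement in Python 3, the free ranges can also be found by comparing neighbouring ids in sorted order, without grouping first. This is a simpler alternative to the grouping used in the gist, not the gist's own method. The names `free_ranges` and `min_size` are made up for this illustration.

```python
def free_ranges(ids, min_size=1000):
    """Return [start, end] ranges of unused ids lying between existing ids.

    Only gaps that hold at least `min_size` free ids are returned.
    `free_ranges` and `min_size` are hypothetical names for this sketch.
    """
    ordered = sorted(set(ids))
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        # The free ids strictly between prev and nxt are prev+1 .. nxt-1.
        if nxt - prev - 1 >= min_size:
            gaps.append([prev + 1, nxt - 1])
    return gaps


if __name__ == "__main__":
    sample = [1, 2, 3, 2000, 2001, 2500, 9000]
    print(free_ranges(sample))  # [[4, 1999], [2501, 8999]]
```

The condition `nxt - prev - 1 >= min_size` is equivalent to the `> 1000` comparison in the session when `min_size` is 1000.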

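Group the sorted ids by value minus index: within a run of consecutive ids the value and the index both increase by 1, so the difference stays constant and only changes where there is a gap.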

In [40]: for k, g in groupby(enumerate(lst), fun):
    ...:     print k, g, [v for i, v in g]
    ...:
    ...:
1 <itertools._grouper object at 0x0BDED0F0> [1, 2, 3]
2 <itertools._grouper object at 0x03C05AF0> [5, 6, 7, 8]
4 <itertools._grouper object at 0x03BFC910> [11, 12, 13]
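
The session above does not show how `lst` was defined. The printed groups are consistent with `lst = [1, 2, 3, 5, 6, 7, 8, 11, 12, 13]`, which is assumed here. Below is a minimal Python 3 sketch of the same grouping (the `print k, g, ...` statement is Python 2 syntax). The helper name `consecutive_runs` is illustrative only.

```python
from itertools import groupby


def consecutive_runs(sorted_ids):
    """Split sorted, distinct integers into runs of consecutive values.

    Inside a run, value and index both rise by 1, so value - index is constant;
    groupby therefore starts a new group exactly where a gap occurs.
    """
    runs = []
    keyed = enumerate(sorted_ids)
    for _, group in groupby(keyed, key=lambda pair: pair[1] - pair[0]):
        runs.append([value for _, value in group])
    return runs


if __name__ == "__main__":
    # Assumed input, inferred from the output printed in the session.
    lst = [1, 2, 3, 5, 6, 7, 8, 11, 12, 13]
    print(consecutive_runs(lst))  # [[1, 2, 3], [5, 6, 7, 8], [11, 12, 13]]
```

The input has to be sorted and free of duplicates, otherwise the value-minus-index key no longer identifies consecutive runs.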
