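I'm trying to build a tool that helps streamline research work, and it seems I need to detect when the data in one column forms an increasing sequence while the data in another column forms an ascending or descending sequence.
Is there a clean way to check whether rows contain such sequences without writing a state machine that iterates over the rows, like https://stackoverflow.com/a/52679427/5045375? Such a snippet would have to check whether the values in one column are increasing (no gaps) and whether the values in the other column are ascending or descending (no gaps). I'm perfectly able to do that; I'm just wondering whether there's something in my pandas toolbox that I'm missing.
Here are some examples to clarify my intent: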
import pandas as pd
from collections import namedtuple
QUERY_SEGMENT_ID_COLUMN = 'Query Segment Id'
REFERENCE_SEGMENT_ID_COLUMN = 'Reference Segment Id'
def dataframe(data):
    # Build a two-column frame from a list of [query_id, reference_id] pairs
    columns = [QUERY_SEGMENT_ID_COLUMN, REFERENCE_SEGMENT_ID_COLUMN]
    return pd.DataFrame(data, columns=columns)
# No sequence in either column. No results
data_without_pattern = [[1, 2], [7, 0], [3, 6]]
# Sequence in first column, but no sequence in second column. No results
data_with_pseodo_pattern_query = [[1, 2], [2, 0], [3, 6]]
# Sequence in second column, but no sequence in first column. No results
data_with_pseudo_pattern_reference = [[1, 2], [7, 3], [3, 4]]
# Broken sequence in first column, sequence in second column. No results
data_with_pseudo_pattern_query_broken = [[1, 2], [3, 3], [7, 4]]
# Sequence occurs in both columns, asc. Expect results
data_with_pattern_asc = [[1, 2], [2, 3], [3, 4]]
# Sequence occurs in both columns, desc. Expect results
data_with_pattern_desc = [[1, 4], [2, 3], [3, 2]]
# There is a sequence, and some noise. Expect results
data_with_pattern_and_noise = [[1, 0], [1, 4], [1, 2], [1, 3], [2, 3], [3, 4]]
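For reference, the adjacent-row part of this check could perhaps be sketched with `Series.diff` and a group-by on run labels. This is only a minimal sketch under stated assumptions, not a full solution: the helper name `find_adjacent_runs` and its `min_length` parameter are hypothetical, it only considers rows that sit directly next to each other, and so it does not recover a sequence whose rows are separated by noise (as in `data_with_pattern_and_noise`).

```python
import pandas as pd

QUERY_SEGMENT_ID_COLUMN = 'Query Segment Id'
REFERENCE_SEGMENT_ID_COLUMN = 'Reference Segment Id'


def find_adjacent_runs(df, min_length=3):
    """Hypothetical helper: return inclusive (start, stop) row positions of
    runs of adjacent rows in which the query id rises by exactly 1 per row
    and the reference id moves by exactly +1, or exactly -1, on every row."""
    df = df.reset_index(drop=True)  # work with positional row numbers
    q_step = df[QUERY_SEGMENT_ID_COLUMN].diff()
    r_step = df[REFERENCE_SEGMENT_ID_COLUMN].diff()

    # A row continues a run if query steps by +1 and reference steps by +/-1.
    valid = (q_step == 1) & (r_step.abs() == 1)
    # Direction of each valid step (+1 asc, -1 desc); 0 marks "no step".
    step = r_step.where(valid, 0)

    # Consecutive rows with the same direction share one run label.
    run_id = (step != step.shift()).cumsum()

    runs = []
    for _, group in step.groupby(run_id):
        if group.iloc[0] == 0:
            continue
        start = int(group.index[0]) - 1  # the row before the first step
        stop = int(group.index[-1])
        if stop - start + 1 >= min_length:
            runs.append((start, stop))
    return runs


columns = [QUERY_SEGMENT_ID_COLUMN, REFERENCE_SEGMENT_ID_COLUMN]
asc = pd.DataFrame([[1, 2], [2, 3], [3, 4]], columns=columns)
desc = pd.DataFrame([[1, 4], [2, 3], [3, 2]], columns=columns)
broken = pd.DataFrame([[1, 2], [3, 3], [7, 4]], columns=columns)

print(find_adjacent_runs(asc))
print(find_adjacent_runs(desc))
print(find_adjacent_runs(broken))
```

The loop here runs over runs rather than over individual rows, so it avoids an explicit state machine, but handling noise rows between sequence members would need a different approach.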
In the first example, there is no pattern at all:
print(dataframe(data_without_pattern))
   Query Segment Id  Reference Segment Id
0                 1                     2
1                 7                     0
2                 3                     6
The second example has an ascending sequence of ids in the query column, but not in the reference column:
print(dataframe(data_with_pseodo_pattern_query))
   Query Segment Id  Reference Segment Id
0                 1                     2
1                 2                     0
2                 3                     6