
Commit

Exclude neighbours of the ref_index in DISE algorithm
issue #133
FarnazH committed Jun 29, 2024
1 parent d931154 commit 025c0d3
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion selector/methods/distance.py
@@ -447,16 +447,20 @@ def algorithm(self, X, max_size):
         """

         # calculate distance of all samples from reference sample; distance is a (n_samples,) array
+        # this includes the distance of reference sample from itself, which is 0
         distances = scipy.spatial.minkowski_distance(X[self.ref_index], X, p=self.p)
         # get sorted index of samples based on their distance from reference (closest to farthest)
+        # the first index will be the ref_index which has distance of zero
         index_sorted = np.argsort(distances)
+        assert index_sorted[0] == self.ref_index
         # construct KDTree for quick nearest-neighbor lookup
         kdtree = scipy.spatial.KDTree(X)

         # construct bitarray to track selected samples (1 means exclude)
         bv = bitarray.bitarray(list(np.zeros(len(X), dtype=int)))
-        bv[self.ref_index] = 1

+        # the neighbours of the ref_index are going to be excluded in the first iteration
+        # and ref_index is going to be added to the selected list
         selected = []
         for idx in index_sorted:
             # select sample if it is not already excluded from consideration
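The hunk is cut off inside the loop, so the behaviour the new comments describe (the reference sample is selected first, and its neighbours are excluded in that same first iteration) may be easier to follow in a standalone sketch. The code below is a simplified illustration, not the code in selector/methods/distance.py. The function name `dise_sketch` and its parameters `points`, `r`, `ref_index` and `max_size` are hypothetical. `math.dist` (Euclidean, i.e. p=2) stands in for `minkowski_distance`, a linear scan stands in for the KDTree lookup, and a list of booleans stands in for the bitarray. The stopping rule on `max_size` is also an assumption.

```python
import math


def dise_sketch(points, r, ref_index=0, max_size=None):
    """Brute-force directed sphere exclusion (illustrative only)."""
    n = len(points)
    # distance of every sample from the reference sample;
    # the reference sample itself has distance 0
    distances = [math.dist(points[ref_index], p) for p in points]
    # visit samples from closest to farthest; on ties the reference
    # sample sorts first, so it is always the first sample visited
    order = sorted(range(n), key=lambda i: (distances[i], i != ref_index))

    excluded = [False] * n  # True means "exclude from consideration"
    selected = []
    for idx in order:
        if excluded[idx]:
            continue
        selected.append(idx)
        # assumed stopping rule: stop once max_size samples are selected
        if max_size is not None and len(selected) >= max_size:
            break
        # exclude every sample within radius r of the selected sample
        # (this includes the selected sample itself)
        for j in range(n):
            if math.dist(points[idx], points[j]) <= r:
                excluded[j] = True
    return selected
```

With `points = [(0, 0), (0.5, 0), (1.2, 0), (1.5, 0), (3, 0)]` and `r=1.0`, `dise_sketch(points, 1.0)` returns `[0, 2, 4]`: the reference sample 0 is selected and excludes its neighbour 1, then sample 2 is selected and excludes sample 3, and sample 4 is selected last.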
