Commit c73e211

make cluster a plugin
1 parent 8d383a1 commit c73e211

16 files changed, 571 additions & 313 deletions

README.md

Lines changed: 40 additions & 103 deletions
@@ -40,6 +40,7 @@ Table of Contents
 * [Event Detection](#event-detection)
 * [Double-Click Annotations](#double-click-annotations)
 * [Network Architecture](#network-architecture)
+* [Clustering Algorithm](#clustering-algorithm)
 * [Troubleshooting](#troubleshooting)
 * [Frequently Asked Questions](#frequently-asked-questions)
 * [Reporting Problems](#reporting-problems)
@@ -904,21 +905,21 @@ in the `Ground Truth` directory: "activations.log", "activations-samples.log",
 and "activations.npz". The two ending in ".log" report any errors, and the
 ".npz" file contains the actual data in binary format.
 
-Now reduce the dimensionality of the hidden state activations
-to either two or three dimensions with the `Cluster` button.
-Choose to do so using either UMAP ([McInnes, Healy, and Melville
-(2018)](https://arxiv.org/abs/1802.03426)), t-SNE ([van der Maaten and
-Hinton (2008)](http://www.jmlr.org/papers/v9/vandermaaten08a.html)), or PCA.
-UMAP and t-SNE are each controlled by separate parameters (`neighbors` and
-`distance`, and `perplexity` and `exaggeration`, respectively), a description
-of which can be found in the aforementioned articles. UMAP and t-SNE can
-also be optionally preceded by PCA, in which case you'll need to specify
-the fraction of coefficients to retain using `PCA fraction`. You'll also
-need to choose to cluster just the last hidden layer using the "layers"
-multi-select box. The output is two or three files in the `Ground Truth`
-directory: "cluster.log" contains any errors, "cluster.npz" contains binary
-data, and "cluster-pca.pdf" shows the results of the principal components
-analysis (PCA) if one was performed.
+Now reduce the dimensionality of the hidden state activations to either two or
+three dimensions with the `Cluster` button. By default, SongExplorer uses the
+UMAP algorithm ([McInnes, Healy, and Melville
+(2018)](https://arxiv.org/abs/1802.03426)), but t-SNE and PCA can be used
+instead via a plugin (see [Clustering Algorithm](#clustering-algorithm)). For
+now, leave the `neighbors` and `distance` parameters set to their default
+values. A description of how they change the resulting clusters can be found
+in the aforementioned article. Also leave the `PCA fraction` parameter at its
+default. In the future, if you find clustering slow for larger data sets, UMAP
+can be preceded by PCA, and the fraction of coefficients that are retained is
+specified using `PCA fraction`. Lastly, choose to cluster just the last hidden
+layer using the "layers" multi-select box. The output is two or three files in
+the `Ground Truth` directory: "cluster.log" contains any errors, "cluster.npz"
+contains binary data, and "cluster-pca.pdf" shows the results of the principal
+components analysis (PCA) if one was performed.
 
 Finally, click on the `Visualize` button to render the clusters in the
 left-most panel. Adjust the size and transparency of the markers using
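To build intuition for what the optional `PCA fraction` preprocessing does, here is a numpy-only sketch (not SongExplorer's actual code; `pca_reduce` and its variables are hypothetical names) of standardizing flattened activations and keeping a fraction of the principal components before a subsequent UMAP step:

```python
import numpy as np

def pca_reduce(activations, fraction=0.5):
    """Standardize activations, then project onto the top principal
    components, retaining `fraction` of the coefficients."""
    mu = activations.mean(axis=0)
    sigma = activations.std(axis=0)
    scaled = (activations - mu) / sigma
    # SVD of the standardized data yields the principal axes in vt
    _, _, vt = np.linalg.svd(scaled, full_matrices=False)
    ncomponents = max(1, int(fraction * vt.shape[0]))
    return scaled @ vt[:ncomponents].T

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))       # e.g. 100 sounds, 8 flattened features
reduced = pca_reduce(X, fraction=0.5)
print(reduced.shape)                # (100, 4)
```

The reduced array would then be handed to UMAP (or t-SNE) in place of the full activations, which is why this speeds up clustering for large data sets.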
@@ -1756,46 +1757,7 @@ supply your own code instead. Simply put in a python file a list called
 script which uses those parameters to generate a "detected.csv" given a WAV
 file. Then change the "detect_plugin" variable in your "configuration.py"
 file to point to the full path of this python file, without the ".py"
-extension. See the minimal example in "src/detect-plugin.py" for a template,
-a pared down version of which is as follows:
-
-    #!/usr/bin/python3
-
-    # a list of lists specifying the detect-specific hyperparameters in the GUI
-    detect_parameters = [
-        ["my-simple-textbox", "h-parameter 1", "", "32", [], None, True],
-        ]
-
-    # a function which returns a vector of strings used to annotate the detected events
-    def detect_labels(audio_nchannels):
-        # kinds = [...]
-        return kinds
-
-    # a script which inputs a WAV file and outputs a CSV file
-    if __name__ == '__main__':
-
-        import os
-        import scipy.io.wavfile as spiowav
-        import sys
-        import csv
-        import json
-
-        _, filename, detect_parameters, audio_tic_rate, audio_nchannels = sys.argv
-
-        detect_parameters = json.loads(detect_parameters)
-        hyperparameter1 = int(detect_parameters["my-simple-textbox"])
-
-        _, song = spiowav.read(filename)
-
-        # add logic here to find events of interest
-        # events = [...]  # e.g. a list of 3-tuples with start, stop, kind
-
-        basename = os.path.basename(filename)
-        with open(os.path.splitext(filename)[0]+'-detected.csv', 'w') as fid:
-            csvwriter = csv.writer(fid)
-            for e in events:
-                csvwriter.writerow([basename, e[0], e[1], 'detected', e[2]])
+extension. See the minimal example in "src/detect-plugin.py" for a template.
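For reference, here is a pared-down, self-contained sketch of the interface such a plugin exposes. The event-finding logic is a placeholder amplitude threshold and the names (`find_events`, the "loud" label, the synthetic `song`) are illustrative, not part of SongExplorer; a real plugin would read the samples from the WAV file and write the CSV to disk.

```python
import csv
import io

# a list of lists specifying the detect-specific hyperparameters in the GUI
detect_parameters = [
    ["my-simple-textbox", "h-parameter 1", "", "32", [], None, True],
    ]

# a function which returns the strings used to annotate the detected events
def detect_labels(audio_nchannels):
    return ["loud"]

def find_events(song, threshold):
    # placeholder logic: contiguous runs of samples at or above `threshold`
    events, start = [], None
    for tic, sample in enumerate(song):
        if abs(sample) >= threshold and start is None:
            start = tic
        elif abs(sample) < threshold and start is not None:
            events.append((start, tic, "loud"))
            start = None
    if start is not None:
        events.append((start, len(song), "loud"))
    return events

song = [0, 0, 40, 50, 0, 0, 60, 0]          # stand-in for WAV samples
events = find_events(song, threshold=32)
buffer = io.StringIO()                       # stands in for "-detected.csv"
csvwriter = csv.writer(buffer)
for e in events:
    # columns: file, start tic, stop tic, kind, label
    csvwriter.writerow(["recording.wav", e[0], e[1], "detected", e[2]])
print(events)  # [(2, 4, 'loud'), (6, 7, 'loud')]
```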
 
 ## Double-Click Annotations ##
 
@@ -1829,54 +1791,29 @@ The default network architecture is a set of layered convolutions, the depth
 and width of which can be configured as described above. Should this not prove
 flexible enough, SongExplorer is designed with a means to supply your own
 TensorFlow code that implements a whiz bang architecture of any arbitrary
-design. See the minimal example in "src/architecture-plugin.py" for a
-template of how this works, a pared down version of which is as follows:
-
-    import tensorflow as tf
-
-    # a list of lists specifying the architecture-specific hyperparameters in the GUI
-    model_parameters = [
-        # each hyperparameter is described by a list with these entries:
-        # [ key in `model_settings`,
-        #   title in GUI,
-        #   "" for textbox or [] for pull-down,
-        #   default value,
-        #   enable logic,
-        #   callback,
-        #   required ]
-        ]
-
-    # a function which returns a keras model
-    def create_model(model_settings):
-        # `model_settings` is a superset of the hyperparameters above. see src/models.py
-
-        # hidden_layers is used to visualize intermediate clusters in the GUI
-        hidden_layers = []
-
-        # 'parallelize' specifies the number of output tics to classify simultaneously
-        ninput_tics = model_settings["context_tics"] + model_settings["parallelize"] - 1
-        input_layer = Input(shape=(ninput_tics, model_settings["nchannels"]))
-
-        # add custom layers here, e.g. x = Conv1D()(x)
-        # append interesting ones to hidden_layers
-
-        # last layer must be convolutional with nlabels as the output size
-        output_layer = Conv1D(model_settings['nlabels'], 1)(x)
-
-        return tf.keras.Model(inputs=input_layer, outputs=[hidden_layers, output_layer])
-
-In brief, two objects must be supplied in a python file: (1) a list named
-`model_parameters` which defines the variable names, titles, and default
-values, etc. to appear in the GUI, and (2) a function `create_model` which
-builds and returns the network graph. Specify as the `architecture_plugin` in
-"configuration.py" the full path to this file, without the ".py" extension.
-The buttons immediately above the configuration textbox in the GUI will
-change to reflect the different hyperparameters used by this architecture.
-All the workflows described above (detecting sounds, making predictions, fixing
-mistakes, etc) can be used with this custom network in an identical manner.
-The default convolutional architecture is itself written as a plug-in, and
-can be found in "src/convolutional.py".
-
+design. See the minimal example in "src/architecture-plugin.py" for a template
+of how this works. In brief, two objects must be supplied in a python file:
+(1) a list named `model_parameters` which defines the variable names, titles,
+and default values, etc. to appear in the GUI, and (2) a function
+`create_model` which builds and returns the network graph. Specify as the
+`architecture_plugin` in "configuration.py" the full path to this file, without
+the ".py" extension. The buttons immediately above the configuration textbox
+in the GUI will change to reflect the different hyperparameters used by this
+architecture. All the workflows described above (detecting sounds, making
+predictions, fixing mistakes, etc) can be used with this custom network in an
+identical manner. The default convolutional architecture is itself written as
+a plug-in, and can be found in "src/convolutional.py".
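A pared-down sketch of those two objects follows. The hyperparameter name "nfilters" is invented for illustration, and the TensorFlow import is deferred inside `create_model` so the fragment loads even where TensorFlow is not installed; consult "src/architecture-plugin.py" and "src/models.py" for the authoritative interface.

```python
# a list of lists specifying the architecture-specific hyperparameters in
# the GUI; each entry is [key in `model_settings`, title in GUI,
# "" for textbox or [] for pull-down, default value, enable logic,
# callback, required].  "nfilters" is an invented example.
model_parameters = [
    ["nfilters", "# filters", "", "32", [], None, True],
    ]

# a function which returns a keras model
def create_model(model_settings):
    # deferred import so this sketch loads without TensorFlow installed
    import tensorflow as tf
    from tensorflow.keras.layers import Input, Conv1D

    # hidden_layers is used to visualize intermediate clusters in the GUI
    hidden_layers = []

    # 'parallelize' specifies the number of output tics to classify simultaneously
    ninput_tics = model_settings["context_tics"] + model_settings["parallelize"] - 1
    input_layer = Input(shape=(ninput_tics, model_settings["nchannels"]))

    x = Conv1D(int(model_settings["nfilters"]), 3)(input_layer)
    hidden_layers.append(x)

    # last layer must be convolutional with nlabels as the output size
    output_layer = Conv1D(model_settings["nlabels"], 1)(x)

    return tf.keras.Model(inputs=input_layer, outputs=[hidden_layers, output_layer])

print(model_parameters[0][0])  # nfilters
```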
+
+## Clustering Algorithm ##
+
+The method used to reduce the dimensionality of the activations for
+visualization is also a plugin. By default, the UMAP algorithm is used, but
+plugins for t-SNE ([van der Maaten and Hinton
+(2008)](http://www.jmlr.org/papers/v9/vandermaaten08a.html)) and PCA are also
+included. To use these alternatives, change "cluster_plugin" in
+"configuration.py" to "tSNE" or "PCA", respectively. To create your own
+plugin, write a script which defines a list called `cluster_parameters`,
+inputs "activations.npz", and outputs "cluster.npz". See
+"src/cluster-plugin.py" for a template.
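A pared-down sketch of that contract, modeled on the PCA plugin added in this commit (src/PCA.py, where `cluster_parameters` is a function returning the list): the trivial column slice below is only a stand-in for a real reduction algorithm such as UMAP.

```python
import numpy as np

# the cluster-specific hyperparameters shown in the GUI
def cluster_parameters():
    return [
        ["ndims", "# dims", ["2","3"], "2", 1, [], None, True],
        ]

# reduce one layer's flattened activations to `ndims` dimensions;
# the column slice here is a placeholder for a real algorithm
def do_cluster(activations_flattened, ilayer, layers, parameters):
    if ilayer not in layers:
        return None, None
    ndims = int(parameters["ndims"])
    reduced = np.asarray(activations_flattened[ilayer])[:, :ndims]
    return None, reduced   # (fitted model, reduced coordinates)

acts = [np.arange(12.0).reshape(3, 4)]   # one layer, 3 sounds, 4 features
_, reduced = do_cluster(acts, 0, [0], {"ndims": "2"})
print(reduced.shape)  # (3, 2)
```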
 
 # Troubleshooting #
 
configuration.py

Lines changed: 2 additions & 1 deletion
@@ -60,7 +60,7 @@
 gui_context_spectrogram_height_pix=150
 gui_context_probability_height_pix=75
 gui_context_undo_proximity_pix=3
-gui_context_doubleclick_plugin="point"
+gui_context_doubleclick_plugin="point" # or snap-to
 gui_spectrogram_colormap="Viridis256"
 gui_spectrogram_window="hann"
 gui_spectrogram_length_sec=0.010

@@ -126,6 +126,7 @@
 cluster_ngpu_cards=-1
 cluster_ngigabytes_memory=-1
 cluster_cluster_flags=""
+cluster_plugin="UMAP" # or tSNE, PCA
 
 accuracy_where=default_where
 accuracy_ncpu_cores=-1

src/PCA.py

Lines changed: 175 additions & 0 deletions
@@ -0,0 +1,175 @@
+#!/usr/bin/env python3
+
+# reduce dimensionality of internal activation states with PCA
+
+# e.g. PCA.py \
+#      --data_dir=`pwd`/groundtruth-data \
+#      --layers=0,1,2,3,4 \
+#      --pca_batch_size=5 \
+#      --parallelize=0 \
+#      --parameters='{"ndims":2}'
+
+import argparse
+import os
+import numpy as np
+import sys
+from sklearn.decomposition import PCA, IncrementalPCA
+from natsort import natsorted
+from datetime import datetime
+import socket
+from itertools import repeat
+
+import json
+
+def cluster_parameters():
+    return [
+        ["ndims", "# dims", ["2","3"], "2", 1, [], None, True],
+        ]
+
+def do_cluster(activations_flattened, ilayer, layers, parameters):
+    if ilayer in layers:
+        print("reducing dimensionality of layer "+str(ilayer)+" with PCA...")
+        mu = np.mean(activations_flattened[ilayer], axis=0)
+        sigma = np.std(activations_flattened[ilayer], axis=0)
+        activations_scaled = (activations_flattened[ilayer] - mu) / sigma
+        if FLAGS.pca_batch_size==0:
+            pca = PCA()
+        else:
+            nfeatures = np.shape(activations_scaled)[1]
+            pca = IncrementalPCA(batch_size = FLAGS.pca_batch_size * nfeatures)
+        fit = pca.fit(activations_scaled)
+        return fit, fit.transform(activations_scaled)[:,0:int(parameters["ndims"])]
+    else:
+        return None, None
+
+FLAGS = None
+
+def main():
+    flags = vars(FLAGS)
+    for key in sorted(flags.keys()):
+        print('%s = %s' % (key, flags[key]))
+
+    layers = [int(x) for x in FLAGS.layers.split(',')]
+
+    print("loading data...")
+    activations=[]
+    npzfile = np.load(os.path.join(FLAGS.data_dir, 'activations.npz'),
+                      allow_pickle=True)
+    sounds = npzfile['sounds']
+    for arr_ in natsorted(filter(lambda x: x.startswith('arr_'), npzfile.files)):
+        activations.append(npzfile[arr_])
+
+    nlayers = len(activations)
+
+    kinds = set([x['kind'] for x in sounds])
+    labels = set([x['label'] for x in sounds])
+    print('label counts')
+    for kind in kinds:
+        print(kind)
+        for label in labels:
+            count = sum([label==x['label'] and kind==x['kind'] for x in sounds])
+            print(count,label)
+
+    activations_flattened = [None]*nlayers
+    for ilayer in layers:
+        nsounds = np.shape(activations[ilayer])[0]
+        activations_flattened[ilayer] = np.reshape(activations[ilayer],(nsounds,-1))
+        print("shape of layer "+str(ilayer)+" is "+str(np.shape(activations_flattened[ilayer])))
+
+    fits_pca = [None]*nlayers
+    activations_scaled = [None]*nlayers
+
+    if FLAGS.parallelize!=0:
+        from multiprocessing import Pool
+        nprocs = os.cpu_count() if FLAGS.parallelize==-1 else FLAGS.parallelize
+        with Pool(min(nprocs,nlayers)) as p:
+            fits, activations_clustered = zip(*p.starmap(do_cluster,
+                                                         zip(repeat(activations_flattened),
+                                                             range(len(activations_flattened)),
+                                                             repeat(layers),
+                                                             repeat(FLAGS.parameters))))
+    else:
+        fits = [None]*nlayers
+        activations_clustered = [None]*nlayers
+        for ilayer in layers:
+            print("reducing dimensionality of layer "+str(ilayer)+" with PCA...")
+            fits[ilayer], activations_clustered[ilayer] = do_cluster(activations_flattened,
+                                                                     ilayer,
+                                                                     layers,
+                                                                     FLAGS.parameters)
+
+    import matplotlib as mpl
+    mpl.use('Agg')
+    import matplotlib.pyplot as plt
+    #plt.ion()
+
+    fig = plt.figure()
+    ax = fig.add_subplot(111)
+    for ilayer in layers:
+        cumsum = np.cumsum(fits[ilayer].explained_variance_ratio_)
+        line, = ax.plot(cumsum)
+        line.set_label('layer '+str(ilayer))
+
+    ax.set_ylabel('cumsum explained variance')
+    ax.set_xlabel('# of components')
+    ax.legend(loc='lower right')
+    plt.savefig(os.path.join(FLAGS.data_dir, 'cluster.pdf'))
+
+    np.savez(os.path.join(FLAGS.data_dir, 'cluster'), \
+             sounds = sounds,
+             activations_clustered = np.array(activations_clustered, dtype=object),
+             fits = np.array(fits, dtype=object) if FLAGS.save_fits else None,
+             labels_touse = npzfile['labels_touse'],
+             kinds_touse = npzfile['kinds_touse'])
+
+def str2bool(v):
+    if v.lower() in ('yes', 'true', 't', 'y', '1'):
+        return True
+    elif v.lower() in ('no', 'false', 'f', 'n', '0'):
+        return False
+    else:
+        raise argparse.ArgumentTypeError('Boolean value expected.')
+
+if __name__ == "__main__":
+    parser = argparse.ArgumentParser()
+    parser.add_argument(
+        '--data_dir',
+        type=str)
+    parser.add_argument(
+        '--layers',
+        type=str)
+    parser.add_argument(
+        '--pca_batch_size',
+        type=int)
+    parser.add_argument(
+        '--parallelize',
+        type=int,
+        default=0)
+    parser.add_argument(
+        '--parameters',
+        type=json.loads,
+        default='{"neighbors": 10, "distance": 0.1}')
+    parser.add_argument(
+        '--save_fits',
+        type=str2bool,
+        default=False,
+        help='Whether to save the cluster models')
+
+    FLAGS, unparsed = parser.parse_known_args()
+
+    print(str(datetime.now())+": start time")
+    repodir = os.path.dirname(os.path.dirname(os.path.realpath(__file__)))
+    with open(os.path.join(repodir, "VERSION.txt"), 'r') as fid:
+        print('SongExplorer version = '+fid.read().strip().replace('\n',', '))
+    print("hostname = "+socket.gethostname())
+
+    try:
+        main()
+
+    except Exception as e:
+        print(e)
+
+    finally:
+        if hasattr(os, 'sync'):
+            os.sync()
+        print(str(datetime.now())+": finish time")
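The scree curve that PCA.py saves to "cluster.pdf" plots the cumulative explained-variance ratio per layer. A numpy-only sketch of that quantity (synthetic data, no sklearn; `explained_variance_ratio` is a hypothetical stand-in for the attribute `PCA().explained_variance_ratio_`):

```python
import numpy as np

def explained_variance_ratio(X):
    # eigenvalues of the covariance matrix, largest first, normalized to
    # sum to one -- the quantity sklearn exposes as
    # PCA().explained_variance_ratio_
    Xc = X - X.mean(axis=0)
    eigvals = np.linalg.eigvalsh(np.cov(Xc, rowvar=False))[::-1]
    return eigvals / eigvals.sum()

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))          # synthetic flattened activations
ratios = explained_variance_ratio(X)
cumsum = np.cumsum(ratios)
# the curve is nondecreasing and reaches 1.0 at the last component
print(round(float(cumsum[-1]), 6))     # 1.0
```

A steep early rise in this curve is what justifies setting a small `PCA fraction` when pre-reducing before UMAP.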
