@@ -40,6 +40,7 @@ Table of Contents
40 40 * [ Event Detection] ( #event-detection )
41 41 * [ Double-Click Annotations] ( #double-click-annotations )
42 42 * [ Network Architecture] ( #network-architecture )
43+ * [ Clustering Algorithm] ( #clustering-algorithm )
43 44 * [ Troubleshooting] ( #troubleshooting )
44 45 * [ Frequently Asked Questions] ( #frequently-asked-questions )
45 46 * [ Reporting Problems] ( #reporting-problems )
@@ -904,21 +905,21 @@ in the `Ground Truth` directory: "activations.log", "activations-samples.log",
904 905 and "activations.npz". The two ending in ".log" report any errors, and the
905 906 ".npz" file contains the actual data in binary format.
906 907
907- Now reduce the dimensionality of the hidden state activations
908- to either two or three dimensions with the ` Cluster ` button.
909- Choose to do so using either UMAP ([ McInnes, Healy, and Melville
910- (2018)] ( https://arxiv.org/abs/1802.03426 ) ), t-SNE ( [ van der Maaten and
911- Hinton (2008) ] ( http://www.jmlr.org/papers/v9/vandermaaten08a.html ) ), or PCA.
912- UMAP and t-SNE are each controlled by separate parameters ( ` neighbors ` and
913- ` distance ` , and ` perplexity ` and ` exaggeration ` respectively), a description
914- of which can be found in the aforementioned articles . UMAP and t-SNE can
915- also be optionally preceded by PCA, in which case you'll need to specify
916- the fraction of coefficients to retain using ` PCA fraction ` . You'll also
917- need to choose to cluster just the last hidden layer using the "layers"
918- multi-select box. Output are two or three files in the ` Ground Truth `
919- directory: "cluster.log" contains any errors, "cluster.npz" contains binary
920- data, and "cluster-pca.pdf" shows the results of the principal components
921- analysis (PCA) if one was performed.
908+ Now reduce the dimensionality of the hidden state activations to either two or
909+ three dimensions with the ` Cluster ` button. By default, SongExplorer uses the
910+ UMAP algorithm ([ McInnes, Healy, and Melville
911+ (2018)] ( https://arxiv.org/abs/1802.03426 ) ), but t-SNE and PCA can be used
912+ instead via a plugin (see [ Clustering Algorithm ] ( #clustering-algorithm ) ). For
913+ now, leave the ` neighbors ` and ` distance ` parameters set to their default
914+ values. A description of how they change the resulting clusters can be found
915+ in the aforementioned article . Also leave the ` PCA fraction ` parameter at its
916+ default. In future, if you find clustering slow for larger data sets, UMAP can
917+ be preceded by PCA, and the fraction of coefficients that are retained is
918+ specified using ` PCA fraction ` . Lastly, choose to cluster just the last hidden
919+ layer using the "layers" multi-select box. The output is two or three files in
920+ the ` Ground Truth ` directory: "cluster.log" contains any errors, "cluster.npz"
921+ contains binary data, and "cluster-pca.pdf" shows the results of the principal
922+ components analysis (PCA) if one was performed.
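
To make the role of ` PCA fraction ` more concrete, here is a minimal sketch of a PCA pre-reduction step written with plain NumPy. It is not SongExplorer's code: the function name `pca_prereduce` is hypothetical, the random array is a stand-in for real activations, and the sketch assumes the fraction is read as the share of principal components to keep.

```python
import math
import numpy as np

def pca_prereduce(activations, pca_fraction):
    """Keep the leading principal components of an (nsamples, nfeatures) array.

    ASSUMPTION: `pca_fraction` is read as the fraction of components to keep;
    this name and interpretation are illustrative, not SongExplorer's API.
    """
    centered = activations - activations.mean(axis=0)
    # rows of vt are the principal axes, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    nkeep = max(1, math.ceil(pca_fraction * vt.shape[0]))
    return centered @ vt[:nkeep].T

rng = np.random.default_rng(0)
activations = rng.normal(size=(100, 16))  # stand-in for hidden-layer activations
reduced = pca_prereduce(activations, 0.25)
print(reduced.shape)
```

The smaller array would then be handed to UMAP in place of the full activations, which is where any speed-up for larger data sets would come from.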
922 923
923 924 Finally, click on the ` Visualize ` button to render the clusters in the
924 925 left-most panel. Adjust the size and transparency of the markers using
@@ -1756,46 +1757,7 @@ supply your own code instead. Simply put in a python file a list called
1756 1757 script which uses those parameters to generate a "detected.csv" given a WAV
1757 1758 file. Then change the "detect_plugin" variable in your "configuration.py"
1758 1759 file to point to the full path of this python file, without the ".py"
1759- extension. See the minimal example in "src/detect-plugin.py" for a template,
1760- a pared down version of which is as follows:
1761-
1762-     #!/usr/bin/python3
1763-
1764-     # a list of lists specifying the detect-specific hyperparameters in the GUI
1765-     detect_parameters = [
1766-         ["my-simple-textbox", "h-parameter 1", "", "32", [], None, True],
1767-     ]
1768-
1769-     # a function which returns a vector of strings used to annotate the detected events
1770-     def detect_labels(audio_nchannels):
1771-         # kinds = [... ]
1772-         return kinds
1773-
1774-     # a script which inputs a WAV file and outputs a CSV file
1775-     if __name__ == '__main__':
1776-
1777-         import os
1778-         import scipy.io.wavfile as spiowav
1779-         import sys
1780-         import csv
1781-         import json
1782-
1783-
1784-         _, filename, detect_parameters, audio_tic_rate, audio_nchannels = sys.argv
1785-
1786-         detect_parameters = json.loads(detect_parameters)
1787-         hyperparameter1 = int(detect_parameters["my-simple-textbox"])
1788-
1789-         _, song = spiowav.read(filename)
1790-
1791-         # add logic here to find events of interest
1792-         # events = [...] # e.g. a list of 3-tuples with start, stop, kind
1793-
1794-         basename = os.path.basename(filename)
1795-         with open(os.path.splitext(filename)[0]+'-detected.csv', 'w') as fid:
1796-             csvwriter = csv.writer(fid)
1797-             for e in events:
1798-                 csvwriter.writerow([basename, e[0], e[1], 'detected', e[2]])
1760+ extension. See the minimal example in "src/detect-plugin.py" for a template.
1799 1761
1800 1762 ## Double-Click Annotations ##
1801 1763
@@ -1829,54 +1791,29 @@ The default network architecture is a set of layered convolutions, the depth
1829 1791 and width of which can be configured as described above. Should this not prove
1830 1792 flexible enough, SongExplorer is designed with a means to supply your own
1831 1793 TensorFlow code that implements a whiz bang architecture of any arbitrary
1832- design. See the minimal example in "src/architecture-plugin.py" for a
1833- template of how this works, a pared down version of which is as follows:
1834-
1835-     import tensorflow as tf
1836-
1837-     # a list of lists specifying the architecture-specific hyperparameters in the GUI
1838-     model_parameters = [
1839-         # each hyperparameter is described by a list with these entries:
1840-         # [ key in `model_settings`,
1841-         #   title in GUI,
1842-         #   "" for textbox or [] for pull-down,
1843-         #   default value,
1844-         #   enable logic,
1845-         #   callback,
1846-         #   required ]
1847-     ]
1848-
1849-     # a function which returns a keras model
1850-     def create_model(model_settings):
1851-         # `model_settings` is a superset of the hyperparameters above. see src/models.py
1852-
1853-         # hidden_layers is used to visualize intermediate clusters in the GUI
1854-         hidden_layers = []
1855-
1856-         # 'parallelize' specifies the number of output tics to classify simultaneously
1857-         ninput_tics = model_settings["context_tics"] + model_settings["parallelize"] - 1
1858-         input_layer = tf.keras.layers.Input(shape=(ninput_tics, model_settings["nchannels"]))
1859-
1860-         # add custom layers here, e.g. x = tf.keras.layers.Conv1D()(x)
1861-         # append interesting ones to hidden_layers
1862-
1863-         # last layer must be convolutional with nlabels as the output size
1864-         output_layer = tf.keras.layers.Conv1D(model_settings['nlabels'], 1)(x)
1865-
1866-         return tf.keras.Model(inputs=input_layer, outputs=[hidden_layers, output_layer])
1867-
1868- In brief, two objects must be supplied in a python file: (1) a list named
1869- ` model_parameters ` which defines the variable names, titles, and default
1870- values, etc. to appear in the GUI, and (2) a function ` create_model ` which
1871- builds and returns the network graph. Specify as the ` architecture_plugin ` in
1872- "configuration.py" the full path to this file, without the ".py" extension.
1873- The buttons immediately above the configuration textbox in the GUI will
1874- change to reflect the different hyperparameters used by this architecture.
1875- All the workflows described above (detecting sounds, making predictions, fixing
1876- mistakes, etc) can be used with this custom network in an identical manner.
1877- The default convolutional architecture is itself written as a plug-in, and
1878- can be found in "src/convolutional.py".
1879-
1794+ design. See the minimal example in "src/architecture-plugin.py" for a template
1795+ of how this works. In brief, two objects must be supplied in a python file:
1796+ (1) a list named ` model_parameters ` which defines the variable names, titles,
1797+ and default values, etc. to appear in the GUI, and (2) a function
1798+ ` create_model ` which builds and returns the network graph. Specify as the
1799+ ` architecture_plugin ` in "configuration.py" the full path to this file, without
1800+ the ".py" extension. The buttons immediately above the configuration textbox
1801+ in the GUI will change to reflect the different hyperparameters used by this
1802+ architecture. All the workflows described above (detecting sounds, making
1803+ predictions, fixing mistakes, etc.) can be used with this custom network in an
1804+ identical manner. The default convolutional architecture is itself written as
1805+ a plug-in, and can be found in "src/convolutional.py".
1806+
1807+ ## Clustering Algorithm ##
1808+
1809+ The method used to reduce the dimensionality of the activations for
1810+ visualization is also a plugin. By default, the UMAP algorithm is used, but
1811+ also included are plugins for t-SNE ([ van der Maaten and Hinton
1812+ (2008)] ( http://www.jmlr.org/papers/v9/vandermaaten08a.html ) ) and PCA. To use
1813+ these alternatives change "cluster_plugin" in "configuration.py" to "tSNE" or
1814+ "PCA", respectively. To create your own plugin, write a script which defines a
1815+ list called ` cluster_parameters ` , inputs "activations.npz" and outputs
1816+ "cluster.npz". See "src/cluster-plugin.py" for a template.
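
As a rough illustration of the contract described above (a ` cluster_parameters ` list, "activations.npz" in, "cluster.npz" out), the following sketch reduces activations with PCA using only NumPy. Several things here are assumptions rather than SongExplorer's actual interface: the entry format of ` cluster_parameters ` is guessed to mirror that of ` detect_parameters `, and the key "my-ndims", the array names "activations" and "embedding", and the function names are all hypothetical. See "src/cluster-plugin.py" for the real template.

```python
import numpy as np

# ASSUMPTION: entries mirror the `detect_parameters` format; the key and title are hypothetical
cluster_parameters = [
    ["my-ndims", "# dims", "", "2", [], None, True],
]

def reduce_with_pca(activations, ndims):
    """Project an (nsamples, nfeatures) array onto its first `ndims` principal components."""
    centered = activations - activations.mean(axis=0)
    # rows of vt are the principal axes, ordered by decreasing variance
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:ndims].T

def cluster_file(infile, outfile, ndims):
    """Read activations from `infile`, reduce them, and write the result to `outfile`."""
    # ASSUMPTION: the array names "activations" and "embedding" are hypothetical
    with np.load(infile) as data:
        activations = data["activations"]
    embedding = reduce_with_pca(activations, ndims)
    np.savez(outfile, embedding=embedding)
    return embedding
```

A real plugin would also need to parse whatever arguments SongExplorer passes to the script, which this sketch leaves out because they are not described here.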
1880 1817
1881 1818 # Troubleshooting #
1882 1819