Description
Currently, NeuroPAL data extraction works by passing extractor.py a list of neuron positions and a list of IDs, and the two lists must be matched by index.
In addition, the default output data structure is just the n x t matrix of traces with no metadata, so labels are never tied to their corresponding traces. Much data that is valuable during analysis is also left out of the gce output.
Therefore I propose:
(a) modifying the blob thread data structure so that it preserves a label, which curator.py can read
(b) having neuron ID labeling in curator.py modify that blob thread label
(c) including more information in the gce output data file, for use in analysis:
- neuron traces (only thing currently output)
- neuron positions throughout the recording
- neuron IDs
- alternative neuron traces (e.g. not bleach-corrected, not background-subtracted)
- the parameters needed to reproduce the gce extraction
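To make the list in (c) concrete, here is a rough sketch of what a bundled output record could look like. Everything in it is hypothetical: `GceOutput`, its field names, and `trace_for` are illustrative names that do not exist in gce, and the shapes assume the n x t layout described above. It uses plain sequences and `len()` so it works with nested lists or numpy arrays.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Sequence


@dataclass
class GceOutput:
    """Hypothetical bundle of extraction results (not an existing gce class)."""

    traces: Sequence[Any]     # n neurons, each a length-t trace
    positions: Sequence[Any]  # n neurons, each a sequence of positions over time
    ids: List[str]            # n neuron IDs, "" where a neuron is unlabeled
    alt_traces: Dict[str, Any] = field(default_factory=dict)  # e.g. uncorrected traces
    params: Dict[str, Any] = field(default_factory=dict)      # extraction parameters

    def __post_init__(self) -> None:
        # Fail early if traces, positions, and IDs do not describe the same neurons.
        n = len(self.traces)
        if len(self.positions) != n or len(self.ids) != n:
            raise ValueError("traces, positions, and ids must all have length n")

    def trace_for(self, neuron_id: str) -> Any:
        """Return the trace of the first neuron carrying this ID."""
        return self.traces[self.ids.index(neuron_id)]


example = GceOutput(
    traces=[[0.1, 0.2], [0.3, 0.4]],
    positions=[[(0, 0, 0), (0, 1, 0)], [(5, 5, 1), (5, 6, 1)]],
    ids=["AVAL", ""],
    params={"example_param": 1},  # placeholder, not a real gce parameter
)
print(example.trace_for("AVAL"))
```

The point of the length check is that labels can no longer drift out of step with traces, which is the failure mode of matching two separate lists by index.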
I currently get this information by pulling it together into a single pandas DataFrame, inspired by Greg's convenience function results_as_dataframe:
######## extract gce results as dataframe and append labels to frame ########
# load extractor object
e = load_extractor(data_folder)
# load elements of extractor object
peaks = e.blobthreadtracker_params["algorithm_params"]['peaks']
labels = e.blobthreadtracker_params["algorithm_params"]['labels']
ts = e.timeseries
df = e.results_as_dataframe()
# append labels to df
for l in range(ts.shape[1]):
    try:
        if labels[l].isnumeric():
            df.loc[df["blob_ix"] == l, "ID"] = ""
        else:
            df.loc[df["blob_ix"] == l, "ID"] = labels[l]
    except IndexError as err:
        print('Extra blob found which was not provided in GCE input. Setting label to empty string')
        df.loc[df["blob_ix"] == l, "ID"] = ""
This should be the default output, which data analysis infrastructure can then be built on.
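If this becomes the default, the label-attachment loop above could be written as a single lookup. The sketch below is one possible way to do it, assuming only that the DataFrame has a `blob_ix` column as in the snippet above. `attach_labels` is a hypothetical helper, not an existing gce function. Unlike the loop, it does not print a message when a blob has no label; it just fills in an empty string.

```python
import pandas as pd


def attach_labels(df, labels, blob_col="blob_ix", id_col="ID"):
    """Return a copy of df with an ID column looked up from labels by blob index.

    Hypothetical helper; column names follow the snippet above.
    """
    # Purely numeric labels are treated as "unlabeled", as in the loop above.
    lookup = {
        i: ("" if str(label).isnumeric() else str(label))
        for i, label in enumerate(labels)
    }
    out = df.copy()
    # Blobs with no entry in labels map to NaN, which is then replaced by "".
    out[id_col] = out[blob_col].map(lookup).fillna("")
    return out


frame = pd.DataFrame({"blob_ix": [0, 0, 1, 2, 3], "T": [0, 1, 0, 0, 0]})
labeled = attach_labels(frame, ["AVAL", "1", "RIML"])
print(labeled)
```

Returning a copy rather than modifying `df` in place is a design choice for the sketch; an in-place version would work equally well inside the extractor.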