API¶

class
deltascope.
brain
¶ Object to manage biological data and associated functions.

add_aligned_df
(df)¶ Adds dataframe of aligned data
Warning
Calculates model, but assumes that the dimensions of the fit are x and z
Parameters: df (pd.DataFrame) – Dataframe of aligned data Returns: brain.df_align

add_thresh_df
(df)¶ Adds dataframe of thresholded and transformed data to
brain.df_thresh
Parameters: df (pd.DataFrame) – dataframe of thesholded and transformed data Returns: brain.df_thresh

align_data
(df_fit, fit_dim, deg=2, mm=None, vertex=None, flip=None)¶ Apply PCA transformation matrix and align data so that the vertex is at the origin
Creates
brain.df_align
andbrain.mm
Parameters:  df (pd.DataFrame) – dataframe containing thresholded xyz data
 comp_order (array) – Array specifies the assignment of components to x,y,z. Form [x component index, y component index, z component index], e.g. [0,2,1]
 fit_dim (array) – Array of length two containing two strings describing the first and second axis for fitting the model, e.g. [‘x’,’z’]
 deg (int) – (or None) Degree of the function that should be fit to the model. deg=2 by default
 mm – (
math_model
or None) Math model for primary channel  vertex (array) – (or None) Array of type [vx,vy,vz] (
brain.vertex
) indicating the translation values  flip (Bool) – (or None) Boolean value to determine if the data should be rotated by 180 degrees

df_align
¶ Dataframe containing point data aligned using PCA

mm
¶ Math model object fit to data in brain object

calc_coord
(row)¶ Calculate alpah, r, theta for a particular row
Parameters: row (pd.Series) – row from dataframe in the form of a pandas Series Returns: pd.Series populated with coordinate of closest point on the math model, r, theta, and ac (arclength)

calculate_pca_median
(data, threshold, radius, microns)¶ Calculate PCA transformation matrix,
brain.pcamed
, based on data (brain.pcamed
) after applying median filter and thresholdParameters:  data (array) – 3D array containing raw probability data
 threshold (float) – Value between 0 and 1 indicating the lower cutoff for positive signal
 radius (int) – Radius of neighborhood that should be considered for the median filter
 microns (array) – Array with three values representing the x,y,z micron dimensions of the voxel

median
¶ Pandas dataframe containing data that has been processed with a median filter twice and thresholded

pcamed
¶ PCA object managing the transformation matrix and any resulting transformations

calculate_pca_median_2d
(data, threshold, radius, microns)¶ Calculate PCA transformation matrix for 2 dimensions of data,
brain.pcamed
, based on data after applying median filter and thresholdWarning
fit_dim is not used to determine which dimensions to fit. Defaults to x and z
Parameters:  data (array) – 3D array containing raw probability data
 threshold (float) – Value between 0 and 1 indicating the lower cutoff for positive signal
 radius (int) – Radius of neighborhood that should be considered for the median filter
 microns (array) – Array with three values representing the x,y,z micron dimensions of the voxel

create_dataframe
(data, scale)¶ Creates a pandas dataframe containing the x,y,z and signal/probability value for each point in the
brain.raw_data
arrayParameters:  data (array) – Raw probability data in 3D array
 scale (array) – Array of length three containing the micron values for [x,y,z]
Returns: Pandas DataFrame with xyz and probability value for each point

find_arclength
(xc)¶ Calculate arclength by integrating the derivative of the math model in xy plane
Parameters: row (float) – Postion in the x axis along the curve Returns: Length of the arc along the curve between the row and the vertex Return type: float

find_distance
(t, point)¶ Find euclidean distance between math model(t) and data point in the xy plane
Parameters:  t (float) – float value defining point on the line
 point (array) – array [x,y] defining data point
Returns: distance between the two points
Return type: float

find_min_distance
(row)¶ Find the point on the curve that produces the minimum distance between the point and the data point using scipy.optimize.minimize(
brain.find_distance()
)Parameters: row (pd.Series) – row from dataframe in the form of a pandas Series Returns: point in the curve (xc, yc, zc) and r Return type: floats

find_r
(row, zc, yc, xc)¶ Calculate r using the Pythagorean theorem
Parameters:  row (pd.Series) – row from dataframe in the form of a pandas Series
 yc (float) – Y position of the closest point in the curve to the data point
 zc (float) – Z position of the closest point in the curve to the data point
 xc (float) – X position of hte closest point in the curve to the data point
Returns: r, distance between the point and the model
Return type: float

find_theta
(row, zc, yc)¶ Calculate theta for a row containing data point in relationship to the xz plane
Parameters:  row (pd.Series) – row from dataframe in the form of a pandas Series
 yc (float) – Y position of the closest point in the curve to the data point
 zc (float) – Z position of the closest point in the curve to the data point
Returns: theta, angle between point and the model plane
Return type: float

fit_model
(df, deg, fit_dim)¶ Fit model to dataframe
Parameters:  df (pd.DataFrame) – Dataframe containing at least x,y,z
 deg (int) – Degree of the function that should be fit to the model
 fit_dim (array) – Array of length two containing two strings describing the first and second axis for fitting the model, e.g. [‘x’,’z’]
Returns: math model
Return type:

flip_data
(df)¶ Rotate data by 180 degrees
Parameters: df (dataframe) – Pandas dataframe containing x,y,z data Returns: Rotated dataframe

integrand
(x)¶ Function to integrate to calculate arclength
Parameters: x (float) – integer value for x Returns: arclength value for integrating Return type: float

pca_transform_2d
(df, pca, comp_order, fit_dim, deg=2, mm=None, vertex=None, flip=None)¶ Transforms df in 2D based on the PCA object, pca, whose transformation matrix has already been calculated
Calling
brain.align_data()
createsbrain.df_align
Warning
fit_dim is not used to determine which dimensions to fit. Defaults to x and z
Parameters:  df (pd.DataFrame) – Dataframe containing thresholded xyz data
 pca (pca_object) – A pca object containing a transformation object, e.g.
brain.pcamed
 comp_order (array) – Array specifies the assignment of components to x,y,z. Form [x component index, y component index, z component index], e.g. [0,2,1]
 fit_dim (array) – Array of length two containing two strings describing the first and second axis for fitting the model, e.g. [‘x’,’z’]
 deg (int) – (or None) Degree of the function that should be fit to the model. deg=2 by default
 mm – (
math_model
or None) Math model for primary channel  vertex (array) – (or None) Array of type [vx,vy,vz] (
brain.vertex
) indicating the translation values  flip (Bool) – (or None) Boolean value to determine if the data should be rotated by 180 degrees

pca_transform_3d
(df, pca, comp_order, fit_dim, deg=2, mm=None, vertex=None, flip=None)¶ Transforms df in 3D based on the PCA object, pca, whose transformation matrix has already been calculated
Parameters:  df (pd.DataFrame) – Dataframe containing thresholded xyz data
 pca (pca_object) – A pca object containing a transformation object, e.g.
brain.pcamed
 comp_order (array) – Array specifies the assignment of components to x,y,z. Form [x component index, y component index, z component index], e.g. [0,2,1]
 fit_dim (array) – Array of length two containing two strings describing the first and second axis for fitting the model, e.g. [‘x’,’z’]
 deg (int) – (or None) Degree of the function that should be fit to the model. deg=2 by default
 mm – (
math_model
or None) Math model for primary channel  vertex (array) – (or None) Array of type [vx,vy,vz] (
brain.vertex
) indicating the translation values  flip (Bool) – (or None) Boolean value to determine if the data should be rotated by 180 degrees

plot_projections
(df, subset)¶ Plots the x, y, and z projections of the input dataframe in a matplotlib plot
Parameters:  df (pd.DataFrame) – Dataframe with columns: ‘x’,’y’,’z’
 subset (float) – Value between 0 and 1 indicating what percentage of the df to subsample
Returns: Matplotlib figure with three labeled scatterplots

preprocess_data
(threshold, scale, microns)¶ Thresholds and scales data prior to PCA
Creates
brain.threshold
,brain.df_thresh
, andbrain.df_scl
Parameters:  threshold (float) – Value between 0 and 1 to use as a cutoff for minimum pixel value
 scale (array) – Array with three values representing the constant by which to multiply x,y,z respectively
 microns (array) – Array with three values representing the x,y,z micron dimensions of the voxel

threshold
¶ Value used to threshold the data prior to calculating the model

df_thresh
¶ Dataframe containing only points with values above the specified threshold

df_scl
¶ Dataframe containing data from
brain.df_thresh
after a scaling value has been applied

process_alignment_data
(data, threshold, radius, microns)¶ Applies a median filter twice to the data which is used for alignment
Ensures than any noise in the structural data does not interfere with alignment
Parameters:  data (array) – Raw data imported by the function
brain.read_data()
 threshold (float) – Value between 0 and 1 to use as a cutoff for minimum pixel value
 radius (int) – Integer that determines the radius of the circle used for the median filter
 microns (array) – Array with three values representing the x,y,z micron dimensions of the voxel
Returns: Dataframe containing data processed with the median filter and threshold
 data (array) – Raw data imported by the function

read_data
(filepath)¶ Reads 3D data from file and selects appropriate channel based on the assumption that the channel with the most zeros has zero as the value for no signal
Parameters: filepath (str) – Filepath to hdf5 probability file Returns: Creates the variable brain.raw_data

raw_data
¶ Array of shape [z,y,x] containing raw probability data


setup_test_data
(size=None, gthresh=0.5, scale=[1, 1, 1], microns=[0.16, 0.16, 0.21], mthresh=0.2, radius=20, comp_order=[0, 2, 1], fit_dim=['x', 'z'], deg=2)¶ Setup a test dataset to use for testing transform coordinates :param int size: Number of points to sample for the test dataset

subset_data
(df, sample_frac=0.5)¶ Takes a random sample of the data based on the value between 0 and 1 defined for sample_frac
Creates the variable
brain.subset
Parameters:  pd.DataFrame – Dataframe which will be sampled
 sample_frac (float) – (or None) Value between 0 and 1 specifying proportion of the dataset that should be randomly sampled for plotting

subset
¶ Random sample of the input dataframe

transform_coordinates
()¶ Transform coordinate system so that each point is defined relative to math model by (alpha,theta,r) (only applied to
brain.df_align
)Returns: appends columns r, xc, yc, zc, ac, theta to brain.df_align


deltascope.
calc_variance
(anum, dfs)¶ Calculate the variance between samples according to bin position and variance between adjacent bins
Parameters:  anum (int) – Number of bins which the arclength axis should be divided into
 dfs (dict) – Dictionary of dfs which are going to be processed
Returns: Two arrays: svar (anum,tnum) and bvar (anum*tnum,snum)
Return type: np.array

deltascope.
calculate_area_error
(pdf, Lkde, x)¶ Calculate area between PDF and each kde in Lkde
Parameters:  pdf (array) – Array of probability distribution function that is the same shape as kdes in Lkde
 Lkde (list) – List of arrays of Kdes
 x (array) – Array of datapoints used to generate pdf and kdes
Returns: List of error values for each kde in Lkde

deltascope.
calculate_models
(Ldf)¶ Calculate model for each dataframe in list and add to new dataframe
Parameters: Ldf (list) – List of dataframes containing aligned data Returns: pd.Dataframe with a,b,c values for parabolic model

deltascope.
convert_to_arr
(xarr, tarr, DT, mdf, Ldf=[])¶ Convert a pandas dataframe containing landmarks as columns and samples as rows into a 3D numpy array
The columns of mdf determine which landmarks will be saved into the array. Any additional dataframes that need to be converted can be included in Ldf
Parameters:  xarr (np.array) – Array containing all unique x values of landmarks in the dataset
 tarr (np.array) – Array containing all unique t values of landmarks in the dataset
 DT (str) – Either
r
orpts
indicating which data type should be saved to the array  mdf (pd.DataFrame) – Main landmark dataframe containing landmarks as columns and samples as rows
 Ldf (list) – List of additional pd.DataFrames that should also be converted to arrays
Returns: Array of the main dataframe and list of arrays converted from Ldf

class
deltascope.
embryo
(name, number, outdir)¶ Class to managed multiple brain objects in a multichannel sample
Parameters:  name (str) – Name of this sample set
 number (str) – Sample number corresponding to this embryo
 outdir (str) – Path to directory for output files

outdir
¶ Path to directory for output files

name
¶ Name of this sample set

number
¶ Sample number corresponding to this embryo

add_channel
(filepath, key)¶ Add channel to
embryo.chnls
dictionaryParameters:  filepath (str) – Complete filepath to image
 key (str) – Name of the channel

add_psi_data
(filepath, key)¶ Read psi data into a channel dataframe
Parameters:  filepath (str) – Complete filepath to data
 key (str) – Descriptive key for channel dataframe in dictionary

process_channels
(mthresh, gthresh, radius, scale, microns, deg, primary_key, comp_order, fit_dim)¶ Process all channels through the production of the
brain.df_align
dataframeParameters:  mthresh (float) – Value between 0 and 1 to use as a cutoff for minimum pixel value for median data
 gthresh (float) – Value between 0 and 1 to use as a cutoff for minimum pixel value for general data
 radius (int) – Size of the neighborhood area to examine with median filter
 scale (array) – Array with three values representing the constant by which to multiply x,y,z respectively
 microns (array) – Array with three values representing the x,y,z micron dimensions of the voxel
 deg (int) – Degree of the function that should be fit to the model
 primary_key (str) – Key for the primary structural channel which PCA and the model should be fit too
 comp_order (array) – Array specifies the assignment of components to x,y,z. Form [x component index, y component index, z component index], e.g. [0,2,1]
 fit_dim (array) – Array of length two containing two strings describing the first and second axis for fitting the model, e.g. [‘x’,’z’]

save_projections
(subset)¶ Save projections of both channels into png files in
embryo.outdir
following the naming scheme [embryo.name
]_[embryo.number
]_[channel name]_MIP.pngParameters: subset (float) – Value between 0 and 1 to specify the fraction of the data to randomly sample for plotting

save_psi
()¶ Save all channels into psi files following the naming scheme [
embryo.name
]_[embryo.number
]_[channel name].psi

deltascope.
find_anchors
(df, dim)¶ Parameters: dim (str) – either y or z

deltascope.
generate_kde
(data, var, x, absv=False)¶ Generate list of KDEs from either dictionary or list of data
Parameters:  data – pd.DataFrames to convert
 var (str) – Name of column to select from df
 x (array) – Array of datapoints to evaluate KDE on
 absv (bool) – (or None) Set to True to use absolute value of selected data for KDE calculation
Type: dict or list
Returns: List of KDE arrays

class
deltascope.
landmarks
(percbins=[10, 50, 90], rnull=15)¶ Class to handle calculation of landmarks to describe structural data
Parameters:  percbins (list) – (or None) Must be a list of integers between 0 and 100
 rnull (int) – (or None) When the r value cannot be calculated it will be set to this value

brain.
lm_wt_rf
¶ pd.DataFrame, which wildtype landmarks will be added to

brain.
lm_mt_rf
¶ pd.DataFrame, which mutant landmarks will be added to

brain.
rnull
¶ Integer specifying the value which null landmark calculations will be set to

brain.
percbins
¶ Integer specifying the percentiles which will be used to calculate landmarks

calc_bins
(Ldf, ac_num, tstep)¶ Calculates alpha and theta bins based on ac_num and tstep
Creates
landmarks.acbins
andlandmarks.tbins
Warning
tstep does not handle scenarios where 2pi is not evenly divisible by tstep
Parameters:  Ldf (dict) – Dict dataframes that are being used for the analysis
 ac_num (int) – Integer indicating the number of divisions that should be made along alpha
 tstep (float) – The size of each bin used for alpha

acbins
¶ List containing the boundaries of each bin along alpha based on ac_num

tbins
¶ List containing the boundaries of each bin along theta based on tstep

calc_mt_landmarks
(df, snum, wt)¶ Warning
Deprecated function, but attempted to calculate mutant landmarks based on the number of points found in the wildtype standard

calc_perc
(df, snum, dtype, out)¶ Calculate landmarks for a dataframe based on the bins and percentiles that have been previously defined
Parameters:  df (pd.DataFrame) – Dataframe containing columns x,y,z,alpha,r,theta
 snum (str) – String containing a sample identifier that can be converted to an integer
 dtype (str) – String describing the sample group to which the sample belongs, e.g. control or experimental
Returns: pd.DataFrame with new landmarks appended

calc_wt_reformat
(df, snum)¶ Warning
Deprecated function, but includes code pertaining to calculating point based data

class
deltascope.
math_model
(model)¶ Object to contain attributes associated with the math model of a sample
Parameters: model (array) – Array of coefficients calculated by np.polyfit 
cf
¶ Array of coefficients for the math model

p
¶ Poly1d function for the math model to allow calculation and plotting of the model


class
deltascope.
paramsClass
(path=None, dparams=None)¶ A class to read and validate parameters for multiprocessing transformation. Validated parameters can be read as attributes of the object

add_outdir
(path)¶ Add out directory as an attribute of the class
Parameters: path (str) – Complete path to the output directory

check_config
(D, path)¶ Check that each parameter in the config file is correct and raise an error if it isn’t
Parameters:  D (dict) – Dictionary containing parameters from the config file
 path (str) – Complete filepath to the config file


deltascope.
process_sample
(num, root, outdir, name, chs, prefixes, threshold, scale, deg, primary_key, comp_order, fit_dim, flip_dim)¶ Process single sample through
brain
class and saves df to csvWarning
Out of date and will probably fail
Parameters:  num (str) – Sample number
 root (str) – Complete path to the root directory for this sample set
 name (str) – Name describing this sample set
 outdir (str) – Complete path to output directory
 chs (array) – Array containing strings specifying the directories for each channel
 prefixes (array) – Array containing strings specifying the file prefix for each channel
 threshold (float) – Value between 0 and 1 to use as a cutoff for minimum pixel value
 scale (array) – Array with three values representing the constant by which to multiply x,y,z respectively
 deg (int) – Degree of the function that should be fit to the model
 primary_key (str) – Key for the primary structural channel which PCA and the model should be fit too

deltascope.
read_psi
(filepath)¶ Reads psi file at the given filepath and returns data in a pandas DataFrame
Parameters: filepath (str) – Complete filepath to file Returns: pd.Dataframe containing data

deltascope.
read_psi_to_dict
(directory, dtype)¶ Read psis from directory into dictionary of dfs with filtering based on dtype
Parameters:  directory (str) – Directory to get psis from
 dtype (str) – Usually ‘AT’ or ‘ZRF1’
Returns: Dictionary of pd.DataFrame

deltascope.
reformat_to_cart
(df)¶ Take a dataframe in which columns contain the bin parameters and convert to a cartesian coordinate system
Parameters: df (pd.DataFrame) – Dataframe containing columns with string names that contain the bin parameter Returns: pd.DataFrame with each landmark as a row and columns: x,y,z,r,r_std,t,pts

deltascope.
rescale_variable
(Ddfs, var, newvar)¶ Rescale variable from 1 to 1 and save in newvar column on original dataframe
Parameters:  Ddfs (dict) – Dictionary of pd.DataFrames
 var (str) – Name of column to select from dfs
 newvar (str) – Name to use for new data in appended column
Returns: Dictionary of dataframes containing column of rescaled data

deltascope.
subplot_lmk
(ax, p, avg, sem, parr, xarr, tarr, dtype, Pn={'alpha': 0.3, 'cmap': 'Greys_r', 'mtc': 'r', 'tarr': None, 'wtc': 'b', 'xarr': None, 'zfb': 1, 'zln': 2, 'zpt': 3})¶ Plot a ribbon of average and standard error of the mean onto the subplot, ax
Parameters:  ax (plt.Subplot) – Matplotlib subplot onto which the data should be plotted
 p (list) – List of two theta values that should be plotted
 avg (np.array) – Array of shape (xvalues,tvalues) containing the average values of the data
 sem (np.array) – Array of shape (xvalues,tvalues) containing the standard error of the mean values of the data
 parr (np.array) – Array of shape (xvalues,tvalues) containing the p values for the data
 dtype (str) – String describing sample type
 Pn – Dictionary containing the following values: ‘zln’:2,’zpt’:3,’zfb’:1,’wtc’:’b’,’mtc’:’r’,’alpha’:0.3,’cmap’:’Greys_r’
Type: dict or None

deltascope.
write_data
(filepath, df)¶ Writes data in PSI format to file after writing header using
write_header()
. Closes file at the conclusion of writing data.Parameters:  filepath (str) – Complete filepath to output file
 df (pd.DataFrame) – dataframe containing columns x,y,z,ac,r,theta

deltascope.
write_header
(f)¶ Writes header for PSI file with columns Id,x,y,z,ac,r,theta
Parameters: f (file) – file object created by ‘open(filename,’w’)`