wot provides a command-line interface that offers fine-grained control over calculations.
Help is available for each tool using the syntax wot tool -h, for example wot optimal_transport -h.
For preprocessing, visualization, and clustering, we recommend pegasus.
convert_matrix
Convert matrix data format

Parameter | Description |
---|---|
matrix | File to convert |
format | Output file format Choices: gct, h5ad, loom, txt, parquet |
out | Output file name |
obs | Row metadata to join with ids in matrix |
var | Column metadata to join with ids in matrix |
transpose | Transpose the matrix before saving |
wot convert_matrix my_matrix.h5ad --format txt
This command converts my_matrix.h5ad to txt format.
cells_by_gene_set
Generate cell sets from gene set scores

Parameter | Description |
---|---|
score | Gene set scores generated by the gene_set_scores command |
filter | Comma separated list of column names to include |
quantile | Quantile for cells to be considered a member of a cell set Default: 99 |
out | Output file name prefix Default: wot |
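
As a rough illustration of the quantile option, the sketch below places a cell in a cell set when its score is at or above the 99th percentile of all scores for that gene set. The function names, the linear-interpolation percentile, the inclusive comparison, and the input layout are assumptions made for illustration; the wot implementation may differ in these details.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1 - frac) + ordered[upper] * frac


def cells_above_quantile(scores, quantile=99):
    """Return ids of cells whose score is at or above the given percentile.

    scores: dict mapping cell id -> gene set score (hypothetical input layout).
    """
    threshold = percentile(list(scores.values()), quantile)
    return sorted(cell for cell, s in scores.items() if s >= threshold)


# Toy scores for 101 cells: cell_0 scores 0.0, cell_100 scores 100.0.
toy_scores = {f"cell_{i}": float(i) for i in range(101)}
members = cells_above_quantile(toy_scores, quantile=99)
```

Lowering the quantile admits more cells into each set, so it trades specificity for set size.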
census
Generate ancestor census for each time point given an initial cell set

Parameter | Description |
---|---|
tmap | Directory of transport maps as produced by optimal transport |
cell_set | gmt, gmx, or grp file of cell sets. |
day | The starting timepoint at which to consider the cell sets |
out | Output files prefix Default: census |
diff_exp
Compute differentially expressed genes from the output of the fate tool

Parameter | Description |
---|---|
matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
fate | Fate dataset produced by the fate tool |
cell_days | File with headers "id" and "day" corresponding to cell id and days |
out | Output file name Default: wot_diff_exp.csv |
cell_days_field | Field name in cell_days file that contains cell days Default: day |
cell_day_filter | Comma separated list of days to include (e.g. 12,14,16) |
gene_filter | File with one gene id per line |
verbose | Print progress |
wot diff_exp --matrix data/ExprMatrix.h5ad --cell_days data/cell_days.txt --fate IPS_d17_fates.txt --fold_change 0 --gene_filter data/TFs.txt --cell_day_filter 14 --verbose
This command computes differentially expressed genes at day 14 that are predictive of IPS fate at day 17.
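
The cell_days input used above is a plain text table with "id" and "day" headers. The sketch below writes such a file with Python's csv module and reads it back. The tab delimiter, the cell ids, and the temporary output location are assumptions for illustration.

```python
import csv
import os
import tempfile

# Hypothetical cell ids and their sampling days.
cell_days = [("cell_a", 12), ("cell_b", 14), ("cell_c", 14), ("cell_d", 16)]

# Write to a temporary directory so the example does not touch real data.
out_path = os.path.join(tempfile.mkdtemp(), "cell_days.txt")
with open(out_path, "w", newline="") as handle:
    writer = csv.writer(handle, delimiter="\t")
    writer.writerow(["id", "day"])  # headers named in the parameter table
    writer.writerows(cell_days)

# Read the file back to confirm its layout.
with open(out_path, newline="") as handle:
    rows = list(csv.reader(handle, delimiter="\t"))
```

If the day column has a different header, the cell_days_field option names it.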
fates
Generate fates for cell sets generated at the given time.

Parameter | Description |
---|---|
tmap | Directory of transport maps as produced by optimal transport |
cell_set | gmt, gmx, or grp file of cell sets. |
day | Day to consider for cell sets |
cell_set_filter | Comma separated list of cell sets to include (e.g. IPS,Stromal) |
format | Output matrix file format Default: txt |
embedding | Optional file with id, x, y used for plotting |
out | Prefix for output file names Default: wot |
verbose | Print cell set information |
wot fates --tmap tmaps/serum --cell_set data/major_cell_sets.gmt --day 17 --cell_set_filter IPS --out IPS_d17 --verbose
This command computes IPS cell fates.
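
The cell_set option accepts gmt files, in which each line holds a set name, a description, and then the member ids, all separated by tabs. The sketch below writes and re-reads a small gmt file. The helper names, the cell ids, and the "na" description placeholder are hypothetical.

```python
import os
import tempfile


def write_gmt(cell_sets, path, description="na"):
    """Write cell sets as GMT: set name, description, then member ids, tab-separated."""
    with open(path, "w") as handle:
        for name, members in cell_sets.items():
            handle.write("\t".join([name, description, *members]) + "\n")


def read_gmt(path):
    """Read a GMT file back into a dict of set name -> list of member ids."""
    cell_sets = {}
    with open(path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                cell_sets[fields[0]] = fields[2:]
    return cell_sets


# Hypothetical cell sets; real ids must match the cell ids in the transport maps.
toy_sets = {"IPS": ["cell_a", "cell_b"], "Stromal": ["cell_c"]}
gmt_path = os.path.join(tempfile.mkdtemp(), "major_cell_sets.gmt")
write_gmt(toy_sets, gmt_path)
round_trip = read_gmt(gmt_path)
```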
gene_set_scores
Score each cell according to its expression of input gene signatures

Parameter | Description |
---|---|
matrix | A matrix with cells on rows and genes on columns |
gene_sets | Gene sets in gmx, gmt, or grp format |
method | Method to compute gene set scores Choices: mean_z_score, mean, mean_rank Default: mean_z_score |
cell_filter | File with one cell id per line to include |
gene_set_filter | Gene sets to include |
max_z_score | Threshold z-scores at specified value Default: 5 |
nperm | Number of permutations to perform |
out | Output file name prefix Default: |
transpose | Transpose the matrix |
format | Output file format Choices: gct, h5ad, loom, txt, parquet Default: txt |
verbose | Print verbose information |
wot gene_set_scores --matrix data/ExprMatrix.h5ad --method mean_z_score --gene_sets data/gene_sets.gmx
This command computes gene set scores. See notebooks for an example of converting gene set scores to growth rates.
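
To make the mean_z_score method and the max_z_score threshold more concrete, here is a minimal sketch: each gene is z-scored across cells, z-scores are clipped to the range [-max_z_score, max_z_score], and the clipped values are averaged over the genes in the set. The use of the population standard deviation, the handling of constant genes, and the function signature are assumptions; this is not the wot source code.

```python
from statistics import mean, pstdev


def mean_z_score(matrix, genes, gene_set, max_z_score=5.0):
    """Score each cell by the mean clipped z-score of the genes in gene_set.

    matrix: list of rows, one per cell, with genes on columns.
    genes: column names of the matrix.
    """
    columns = [genes.index(g) for g in gene_set if g in genes]
    if not columns:
        raise ValueError("no gene in gene_set is present in the matrix")
    z_columns = []
    for col in columns:
        values = [row[col] for row in matrix]
        mu, sd = mean(values), pstdev(values)
        # Constant genes carry no information; assign them a z-score of 0.
        z = [(v - mu) / sd if sd > 0 else 0.0 for v in values]
        z_columns.append([max(-max_z_score, min(max_z_score, x)) for x in z])
    return [mean(col[i] for col in z_columns) for i in range(len(matrix))]


# Three cells by three genes; G2 is constant across cells.
toy_matrix = [[0, 10, 5], [2, 10, 5], [4, 10, 1]]
toy_genes = ["G1", "G2", "G3"]
scores = mean_z_score(toy_matrix, toy_genes, ["G1", "G2"])
clipped = mean_z_score(toy_matrix, toy_genes, ["G1", "G2"], max_z_score=1.0)
```

Clipping limits the influence of a few extreme genes on the score of a cell.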
optimal_transport
Compute transport maps between pairs of time points

Parameter | Description |
---|---|
matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
cell_days | File with headers "id" and "day" corresponding to cell id and days |
cell_growth_rates | File with "id" and "cell_growth_rate" headers corresponding to cell id and growth rate per day. |
parameters | Optional two column parameter file containing parameter name and value |
config | Configuration per timepoint or pair of timepoints |
transpose | Transpose the matrix |
local_pca | Convert day pairs matrix to local PCA coordinates. Set to 0 to disable Default: 30 |
growth_iters | Number of growth iterations for learning the growth rate. Default: 1 |
gene_filter | File with one gene id per line to use for computing cost matrices (e.g. variable genes) |
cell_filter | File with one cell id per line to include |
cell_day_filter | Comma separated list of days to include (e.g. 12,14,16) |
scaling_iter | Number of scaling iterations for OT solver Default: 3000 |
inner_iter_max | For OT solver Default: 50 |
epsilon | Controls the entropy of the transport map. An extremely large entropy parameter will give a maximally entropic transport map, and an extremely small entropy parameter will give a nearly deterministic transport map (but could also lead to numerical instability in the algorithm). Default: 0.05 |
lambda1 | Regularization parameter that controls the fidelity of the constraints on p Default: 1 |
lambda2 | Regularization parameter that controls the fidelity of the constraints on q Default: 50 |
max_iter | Maximum number of scaling iterations. Abort if convergence was not reached Default: 10000000.0 |
batch_size | Number of scaling iterations to perform between duality gap check Default: 5 |
tolerance | Maximal acceptable ratio between the duality gap and the primal objective value Default: 1e-08 |
epsilon0 | Warm starting value for epsilon Default: 1 |
tau | For OT solver Default: 10000 |
ncells | Number of cells to downsample from each timepoint and covariate |
ncounts | Sample ncounts from each cell |
solver | The solver to use to compute transport matrices Choices: duality_gap, fixed_iters Default: duality_gap |
cell_days_field | Field name in cell_days file that contains cell days Default: day |
cell_growth_rates_field | Field name in cell_growth_rates file that contains growth rates Default: cell_growth_rate |
verbose | Print progress information |
format | Output file format Choices: h5ad, loom Default: h5ad |
no_overwrite | Do not overwrite existing transport maps if they exist |
out | Prefix for output file names Default: ./tmaps |
wot optimal_transport --matrix data/ExprMatrix.var.genes.h5ad --cell_days data/cell_days.txt --cell_filter data/serum_cell_ids.txt --growth_iters 3 --cell_growth_rates data/growth_gs_init.txt --out tmaps/serum --verbose
This command computes transport maps for all consecutive time points.
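
The effect of epsilon described in the parameter table can be seen on a toy problem. The sketch below runs a plain balanced Sinkhorn iteration on a 2 x 2 cost matrix. It is only an illustration of entropic regularization: wot's own solver also uses lambda1, lambda2, and growth rates, none of which are modeled here, and the function name and iteration count are assumptions.

```python
import math


def sinkhorn(cost, p, q, epsilon, n_iter=500):
    """Balanced entropic optimal transport via Sinkhorn scaling.

    Returns the transport map as nested lists with row sums close to p
    and column sums equal to q.
    """
    n, m = len(p), len(q)
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        # Alternately rescale rows and columns to match the marginals.
        u = [p[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [q[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


# Two cells per time point; staying in place costs 0, switching costs 1.
cost = [[0.0, 1.0], [1.0, 0.0]]
p = [0.5, 0.5]
q = [0.5, 0.5]
diffuse = sinkhorn(cost, p, q, epsilon=10.0)  # mass spread almost evenly
sharp = sinkhorn(cost, p, q, epsilon=0.05)    # mass almost entirely on the diagonal
```

With a large epsilon the map is close to uniform, while with a small epsilon nearly all mass follows the cheapest assignment. Very small values make the kernel entries underflow toward zero, which is the numerical instability the table warns about.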
optimal_transport_validation
Compute a validation summary

Parameter | Description |
---|---|
matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
cell_days | File with headers "id" and "day" corresponding to cell id and days |
cell_growth_rates | File with "id" and "cell_growth_rate" headers corresponding to cell id and growth rate per day. |
parameters | Optional two column parameter file containing parameter name and value |
config | Configuration per timepoint or pair of timepoints |
transpose | Transpose the matrix |
local_pca | Convert day pairs matrix to local PCA coordinates. Set to 0 to disable Default: 30 |
growth_iters | Number of growth iterations for learning the growth rate. Default: 1 |
gene_filter | File with one gene id per line to use for computing cost matrices (e.g. variable genes) |
cell_filter | File with one cell id per line to include |
cell_day_filter | Comma separated list of days to include (e.g. 12,14,16) |
scaling_iter | Number of scaling iterations for OT solver Default: 3000 |
inner_iter_max | For OT solver Default: 50 |
epsilon | Controls the entropy of the transport map. An extremely large entropy parameter will give a maximally entropic transport map, and an extremely small entropy parameter will give a nearly deterministic transport map (but could also lead to numerical instability in the algorithm). Default: 0.05 |
lambda1 | Regularization parameter that controls the fidelity of the constraints on p Default: 1 |
lambda2 | Regularization parameter that controls the fidelity of the constraints on q Default: 50 |
max_iter | Maximum number of scaling iterations. Abort if convergence was not reached Default: 10000000.0 |
batch_size | Number of scaling iterations to perform between duality gap check Default: 5 |
tolerance | Maximal acceptable ratio between the duality gap and the primal objective value Default: 1e-08 |
epsilon0 | Warm starting value for epsilon Default: 1 |
tau | For OT solver Default: 10000 |
ncells | Number of cells to downsample from each timepoint and covariate |
ncounts | Sample ncounts from each cell |
solver | The solver to use to compute transport matrices Choices: duality_gap, fixed_iters Default: duality_gap |
cell_days_field | Field name in cell_days file that contains cell days Default: day |
cell_growth_rates_field | Field name in cell_growth_rates file that contains growth rates Default: cell_growth_rate |
verbose | Print progress information |
covariate | Covariate values for each cell |
full_distances | Compute full distances |
day_triplets | Three column file without a header containing start time, interpolation time, and end time |
out | Prefix for output file names Default: tmaps_val |
interp_size | The number of cells in the interpolated population Default: 10000 |
covariate_field | Field name in covariate file that contains covariate Default: covariate |
wot optimal_transport_validation --matrix data/ExprMatrix.var.genes.h5ad --cell_days data/cell_days.txt --cell_filter data/serum_cell_ids.txt --covariate data/batches.txt --cell_growth_rates tmaps/serum_g.txt --cell_growth_rates_field g2 --verbose
This command computes and plots optimal transport validation results.
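
The day_triplets option takes a three-column file without a header containing start time, interpolation time, and end time. One way to build such a file is sketched below, taking every run of three consecutive days. The choice of consecutive triplets, the tab delimiter, and the helper name are assumptions for illustration.

```python
def day_triplets(days):
    """Return (start, interpolation, end) triplets from consecutive sorted unique days."""
    ordered = sorted(set(days))
    return list(zip(ordered, ordered[1:], ordered[2:]))


# Hypothetical sampling days; duplicates are ignored.
triplets = day_triplets([12, 14, 16, 18, 14])

# One triplet per line, no header row.
lines = ["\t".join(str(t) for t in triplet) for triplet in triplets]
file_content = "\n".join(lines) + "\n"
```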
trajectory
Generate trajectories for cell sets generated at the given time.

Parameter | Description |
---|---|
tmap | Directory of transport maps as produced by optimal transport |
cell_set | gmt, gmx, or grp file of cell sets. |
day | Day to consider for cell sets |
cell_set_filter | Comma separated list of cell sets to include (e.g. IPS,Stromal) |
format | Output matrix file format Default: txt |
embedding | Optional file with id, x, y used for plotting |
out | Prefix for output file names Default: wot |
verbose | Print cell set information |
wot trajectory --tmap tmaps/serum --cell_set data/major_cell_sets.gmt --day 18 --embedding data/fle_coords.txt --verbose
This command computes and plots trajectories using the serum transport maps and major cell sets.
trajectory_divergence
Computes the distance between trajectories across time

Parameter | Description |
---|---|
matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
cell_days | File with headers "id" and "day" corresponding to cell id and days |
distance_metric | Distance metric (earth mover's distance or total variation) Choices: emd, total_variation Default: emd |
trajectory | One or more trajectory datasets as produced by the trajectory tool |
compare | If "match", compare trajectories with the same name. If "all", compare all pairs. If "within" compare within a trajectory. If a trajectory name, compare to the specified trajectory Default: within |
local_pca | Convert day matrix to local PCA coordinates. Set to 0 to disable Default: 30 |
plot | Plot results |
cell_filter | File with one cell id per line to include |
gene_filter | File with one gene id per line to use for computing cost matrices (e.g. variable genes) |
cell_day_filter | Comma separated list of days to include (e.g. 12,14,16) |
cell_days_field | Field name in cell_days file that contains cell days Default: day |
out | Prefix for output file names Default: wot-trajectory |
verbose | Print progress |
wot trajectory_divergence --trajectory wot_trajectory.txt --cell_days data/cell_days.txt --matrix data/ExprMatrix.var.genes.h5ad --compare within --verbose --plot
This command computes the trajectory divergence.
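
For reference, the total_variation choice of distance_metric refers to a standard quantity: half the sum of absolute differences between two normalized distributions. The sketch below computes it for two trajectories represented as weights over the same cells. How wot prepares those weights is not shown, and the function name and input layout are assumptions.

```python
def total_variation(p, q):
    """Total variation distance between two non-negative weight vectors.

    Both vectors are normalized to sum to 1 before comparison.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same cells")
    p_total, q_total = sum(p), sum(q)
    if p_total <= 0 or q_total <= 0:
        raise ValueError("weights must have positive total mass")
    return 0.5 * sum(abs(a / p_total - b / q_total) for a, b in zip(p, q))


# Two trajectories over the same four cells at one time point (toy weights).
trajectory_a = [0.4, 0.4, 0.1, 0.1]
trajectory_b = [0.1, 0.1, 0.4, 0.4]
distance = total_variation(trajectory_a, trajectory_b)
```

The result lies between 0 (identical distributions) and 1 (no shared mass).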
trajectory_trends
Generate mean expression profiles for ancestors and descendants of each trajectory

Parameter | Description |
---|---|
matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
trajectory | Trajectory dataset as produced by the trajectory tool |
out | Prefix for output file names Default: trends |
plot | Generate plots for each trajectory |
cell_days | File with headers "id" and "day" corresponding to cell id and days |
format | Output file format Choices: gct, h5ad, loom, txt, parquet Default: txt |
gene_filter | File with one gene id per line, or a comma-separated list of genes to include from the matrix |
cell_days_field | Field name in cell_days file that contains cell days Default: day |
wot trajectory_trends --trajectory wot_trajectory.txt --cell_days data/cell_days.txt --matrix data/ExprMatrix.h5ad --gene_filter Nanog,Obox6,Zfp42 --plot
This command computes and plots trajectory trends for the genes Nanog, Obox6, and Zfp42.
transition_table
Generate a transition table from one cell set to another cell set

Parameter | Description |
---|---|
tmap | Directory of transport maps as produced by optimal transport |
cell_set | gmt, gmx, or grp file of cell sets. |
start_time | The start time for the cell sets to compute the transitions to cell sets at end_time |
end_time | The end time |
out | Prefix for output file Default: wot |
format | Output file format Choices: gct, h5ad, loom, txt, parquet Default: h5ad |
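
Conceptually, a transition table summarizes how much mass a transport map moves from each cell set at start_time to each cell set at end_time. The sketch below sums the entries of a small transport map over pairs of cell sets. The nested-list representation, the function name, and the absence of any normalization are assumptions for illustration and may not match what wot writes to its output.

```python
def transition_table(tmap, row_ids, col_ids, start_sets, end_sets):
    """Sum transported mass between each start cell set and each end cell set.

    tmap: nested lists, rows are cells at start_time, columns are cells at end_time.
    start_sets, end_sets: dicts of set name -> list of cell ids.
    """
    row_index = {cell: i for i, cell in enumerate(row_ids)}
    col_index = {cell: j for j, cell in enumerate(col_ids)}
    table = {}
    for start_name, start_cells in start_sets.items():
        rows = [row_index[c] for c in start_cells if c in row_index]
        for end_name, end_cells in end_sets.items():
            cols = [col_index[c] for c in end_cells if c in col_index]
            table[(start_name, end_name)] = sum(tmap[i][j] for i in rows for j in cols)
    return table


# Toy transport map between two cells at each time point.
toy_tmap = [[0.4, 0.1], [0.0, 0.5]]
table = transition_table(
    toy_tmap,
    row_ids=["a1", "a2"],
    col_ids=["b1", "b2"],
    start_sets={"Stromal": ["a1"], "IPS": ["a2"]},
    end_sets={"Stromal": ["b1"], "IPS": ["b2"]},
)
```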