wot provides a command line interface that offers fine-grained control over calculations.
Help is available for each tool using the syntax wot tool -h (for example, wot optimal_transport -h).
We recommend pegasus for preprocessing, visualization, and clustering.
convert_matrix
Convert matrix data format

| Parameter | Description |
|---|---|
| matrix | File to convert |
| format | Output file format Choices: gct, h5ad, loom, txt, parquet |
| out | Output file name |
| obs | Row metadata to join with ids in matrix |
| var | Column metadata to join with ids in matrix |
| transpose | Transpose the matrix before saving |
wot convert_matrix my_matrix.h5ad --format txt
This command converts my_matrix.h5ad to txt format.
cells_by_gene_set
Generate cell sets from gene set scores

| Parameter | Description |
|---|---|
| score | Gene sets scores generated from the gene_set_scores command |
| filter | Comma separated list of column names to include |
| quantile | Quantile for cells to be considered a member of a cell set Default: 99 |
| out | Output file name prefix Default: wot |
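
The quantile parameter above can be read as a percentile cutoff on each gene set's scores. The following is a minimal sketch of that idea, not wot's implementation: it assumes that a cell joins the cell set when its score is at or above the chosen percentile, and the helper names and the scores are hypothetical.

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation between ranks (q in [0, 100])."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    pos = (len(ordered) - 1) * q / 100.0
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] * (1 - frac) + ordered[hi] * frac


def cells_above_quantile(scores, quantile=99):
    """Return the ids whose score is at or above the given percentile.

    Assumption: membership means score >= percentile threshold.
    """
    threshold = percentile(list(scores.values()), quantile)
    return sorted(cell for cell, s in scores.items() if s >= threshold)


# Hypothetical scores for a single gene set: cell_1 .. cell_100 score 1 .. 100.
scores = {f"cell_{i}": float(i) for i in range(1, 101)}
members = cells_above_quantile(scores, quantile=99)
print(members)
```

With the default of 99, only the top-scoring cell out of these 100 clears the threshold; lowering the quantile yields larger cell sets.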
census
Generate ancestor census for each time point given an initial cell set

| Parameter | Description |
|---|---|
| tmap | Directory of transport maps as produced by optimal transport |
| cell_set | gmt, gmx, or grp file of cell sets. |
| day | The starting timepoint at which to consider the cell sets |
| out | Output files prefix Default: census |
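
Several tools take a cell_set file. As a rough illustration of the gmt variant, the sketch below writes and reads the usual GMT layout: one set per line, tab-delimited, with the set name, a description, and then the members (cell ids here). The set names, cell ids, and helper functions are hypothetical.

```python
import os
import tempfile

# Hypothetical cell sets: set name -> list of cell ids.
cell_sets = {"IPS": ["cell_1", "cell_2"], "Stromal": ["cell_3"]}


def write_gmt(path, sets, description="-"):
    """Write one tab-delimited line per set: name, description, members."""
    with open(path, "w") as f:
        for name, members in sets.items():
            f.write("\t".join([name, description] + list(members)) + "\n")


def read_gmt(path):
    """Read a GMT file back into a dict of set name -> members."""
    sets = {}
    with open(path) as f:
        for line in f:
            fields = line.rstrip("\n").split("\t")
            if len(fields) < 2:
                continue  # skip blank or malformed lines
            sets[fields[0]] = fields[2:]  # fields[1] is the description
    return sets


gmt_path = os.path.join(tempfile.mkdtemp(), "cell_sets.gmt")
write_gmt(gmt_path, cell_sets)
round_trip = read_gmt(gmt_path)
print(round_trip)
```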
diff_exp
Compute differentially expressed genes from the output of the fate tool

| Parameter | Description |
|---|---|
| matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
| fate | Fate dataset produced by the fate tool |
| cell_days | File with headers "id" and "day" corresponding to cell id and days |
| out | Output file name Default: wot_diff_exp.csv |
| cell_days_field | Field name in cell_days file that contains cell days Default: day |
| cell_day_filter | Comma separated list of days to include (e.g. 12,14,16) |
| gene_filter | File with one gene id per line |
| verbose | Print progress |
wot diff_exp --matrix data/ExprMatrix.h5ad --cell_days data/cell_days.txt --fate IPS_d17_fates.txt --fold_change 0 --gene_filter data/TFs.txt --cell_day_filter 14 --verbose
This command computes differentially expressed genes at day 14 that are predictive of IPS fate at day 17.
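
To make the cell_days and cell_day_filter inputs concrete, here is a small sketch of a cell_days file with "id" and "day" headers and of how a comma-separated day filter could select cells. It assumes a tab-delimited file; the cell ids and helper names are hypothetical, and this is not wot's own parsing code.

```python
import csv
import io

# Hypothetical cell_days content (tab-delimited is an assumption).
cell_days_text = "id\tday\ncell_1\t12\ncell_2\t14\ncell_3\t14.5\ncell_4\t16\n"


def read_cell_days(handle, field="day"):
    """Map each cell id to its day, read from the given column."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["id"]: float(row[field]) for row in reader}


def filter_days(cell_days, day_filter):
    """Keep only cells whose day appears in a string such as '12,14,16'."""
    keep = {float(d) for d in day_filter.split(",")}
    return {cell: day for cell, day in cell_days.items() if day in keep}


cell_days = read_cell_days(io.StringIO(cell_days_text))
selected = filter_days(cell_days, "12,14,16")
print(selected)
```

The field argument mirrors the cell_days_field option: it names the column that holds the day values.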
fates
Generate fates for cell sets generated at the given time.

| Parameter | Description |
|---|---|
| tmap | Directory of transport maps as produced by optimal transport |
| cell_set | gmt, gmx, or grp file of cell sets. |
| day | Day to consider for cell sets |
| cell_set_filter | Comma separated list of cell sets to include (e.g. IPS,Stromal) |
| format | Output matrix file format Default: txt |
| embedding | Optional file with id, x, y used for plotting |
| out | Prefix for output file names Default: wot |
| verbose | Print cell set information |
wot fates --tmap tmaps/serum --cell_set data/major_cell_sets.gmt --day 17 --cell_set_filter IPS --out IPS_d17 --verbose
This command computes IPS cell fates.
gene_set_scores
Score each cell according to its expression of input gene signatures

| Parameter | Description |
|---|---|
| matrix | A matrix with cells on rows and genes on columns |
| gene_sets | Gene sets in gmx, gmt, or grp format |
| method | Method to compute gene set scores Choices: mean_z_score, mean, mean_rank Default: mean_z_score |
| cell_filter | File with one cell id per line to include |
| gene_set_filter | Gene sets to include |
| max_z_score | Threshold z-scores at specified value Default: 5 |
| nperm | Number of permutations to perform |
| out | Output file name prefix Default: |
| transpose | Transpose the matrix |
| format | Output file format Choices: gct, h5ad, loom, txt, parquet Default: txt |
| verbose | Print verbose information |
wot gene_set_scores --matrix data/ExprMatrix.h5ad --method mean_z_score --gene_sets data/gene_sets.gmx
This command computes gene set scores. See notebooks for an example of converting gene set scores to growth rates.
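
The mean_z_score method and the max_z_score threshold can be illustrated with a short sketch. It assumes that each gene is z-scored across cells, that z-scores are clipped to the range [-max_z_score, max_z_score], and that the clipped values are averaged over the genes in the set. Skipping genes with zero variance is a choice made for this sketch. The expression values are made up, and wot's own implementation may differ in detail.

```python
import statistics


def mean_z_score(expression, gene_set, max_z_score=5.0):
    """Score each cell by its mean clipped z-score over a gene set.

    expression: dict of gene -> list of values, one per cell.
    """
    n_cells = len(next(iter(expression.values())))
    totals = [0.0] * n_cells
    used = 0
    for gene in gene_set:
        if gene not in expression:
            continue  # gene is absent from the matrix
        values = expression[gene]
        mu = statistics.mean(values)
        sd = statistics.pstdev(values)
        if sd == 0:
            continue  # constant gene carries no signal (sketch choice)
        used += 1
        for i, v in enumerate(values):
            z = (v - mu) / sd
            totals[i] += max(-max_z_score, min(max_z_score, z))
    if used == 0:
        return [0.0] * n_cells
    return [t / used for t in totals]


# Hypothetical expression values for four cells.
expression = {
    "Nanog": [0.0, 0.0, 0.0, 10.0],
    "Zfp42": [1.0, 1.0, 1.0, 5.0],
    "Actb": [3.0, 3.0, 3.0, 3.0],
}
scores = mean_z_score(expression, ["Nanog", "Zfp42", "Actb"])
print(scores)
```

The fourth cell expresses both informative genes strongly and therefore receives the highest score.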
optimal_transport
Compute transport maps between pairs of time points

| Parameter | Description |
|---|---|
| matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
| cell_days | File with headers "id" and "day" corresponding to cell id and days |
| cell_growth_rates | File with "id" and "cell_growth_rate" headers corresponding to cell id and growth rate per day. |
| parameters | Optional two column parameter file containing parameter name and value |
| config | Configuration per timepoint or pair of timepoints |
| transpose | Transpose the matrix |
| local_pca | Convert day pairs matrix to local PCA coordinates. Set to 0 to disable. Default: 30 |
| growth_iters | Number of growth iterations for learning the growth rate. Default: 1 |
| gene_filter | File with one gene id per line to use for computing cost matrices (e.g. variable genes) |
| cell_filter | File with one cell id per line to include |
| cell_day_filter | Comma separated list of days to include (e.g. 12,14,16) |
| scaling_iter | Number of scaling iterations for OT solver Default: 3000 |
| inner_iter_max | For OT solver Default: 50 |
| epsilon | Controls the entropy of the transport map. An extremely large entropy parameter will give a maximally entropic transport map, and an extremely small entropy parameter will give a nearly deterministic transport map (but could also lead to numerical instability in the algorithm). Default: 0.05 |
| lambda1 | Regularization parameter that controls the fidelity of the constraints on p Default: 1 |
| lambda2 | Regularization parameter that controls the fidelity of the constraints on q Default: 50 |
| max_iter | Maximum number of scaling iterations. Abort if convergence was not reached Default: 10000000.0 |
| batch_size | Number of scaling iterations to perform between duality gap check Default: 5 |
| tolerance | Maximal acceptable ratio between the duality gap and the primal objective value Default: 1e-08 |
| epsilon0 | Warm starting value for epsilon Default: 1 |
| tau | For OT solver Default: 10000 |
| ncells | Number of cells to downsample from each timepoint and covariate |
| ncounts | Sample ncounts from each cell |
| solver | The solver to use to compute transport matrices Choices: duality_gap, fixed_iters Default: duality_gap |
| cell_days_field | Field name in cell_days file that contains cell days Default: day |
| cell_growth_rates_field | Field name in cell_growth_rates file that contains growth rates Default: cell_growth_rate |
| verbose | Print progress information |
| format | Output file format Choices: h5ad, loom Default: h5ad |
| no_overwrite | Do not overwrite existing transport maps if they exist |
| out | Prefix for output file names Default: ./tmaps |
wot optimal_transport --matrix data/ExprMatrix.var.genes.h5ad --cell_days data/cell_days.txt --cell_filter data/serum_cell_ids.txt --growth_iters 3 --cell_growth_rates data/growth_gs_init.txt --out tmaps/serum --verbose
This command computes transport maps for all consecutive time points.
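
The effect of epsilon described in the parameter table can be seen on a toy problem. The sketch below is a simplified, balanced Sinkhorn iteration in plain Python; it ignores lambda1, lambda2, and growth rates, so it is not the solver that wot runs. The cost matrix and marginals are made up for illustration.

```python
import math


def sinkhorn(cost, p, q, epsilon, n_iter=500):
    """Balanced entropic optimal transport via Sinkhorn scaling."""
    n, m = len(p), len(q)
    # Gibbs kernel: a small epsilon sharply penalizes costly pairs.
    K = [[math.exp(-c / epsilon) for c in row] for row in cost]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iter):
        u = [p[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [q[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]


def entropy(plan):
    """Shannon entropy of a transport plan (natural log)."""
    return -sum(x * math.log(x) for row in plan for x in row if x > 0)


# Toy problem: two source cells, two target cells, cheap diagonal moves.
cost = [[0.0, 1.0], [1.0, 0.0]]
p = [0.5, 0.5]
q = [0.5, 0.5]

sharp = sinkhorn(cost, p, q, epsilon=0.05)    # nearly deterministic map
diffuse = sinkhorn(cost, p, q, epsilon=50.0)  # close to the uniform map

print(entropy(sharp), entropy(diffuse))
```

With epsilon=0.05 almost all mass stays on the diagonal, while with a very large epsilon the plan approaches the product of the marginals, which has the highest entropy.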
optimal_transport_validation
Compute a validation summary

| Parameter | Description |
|---|---|
| matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
| cell_days | File with headers "id" and "day" corresponding to cell id and days |
| cell_growth_rates | File with "id" and "cell_growth_rate" headers corresponding to cell id and growth rate per day. |
| parameters | Optional two column parameter file containing parameter name and value |
| config | Configuration per timepoint or pair of timepoints |
| transpose | Transpose the matrix |
| local_pca | Convert day pairs matrix to local PCA coordinates. Set to 0 to disable. Default: 30 |
| growth_iters | Number of growth iterations for learning the growth rate. Default: 1 |
| gene_filter | File with one gene id per line to use for computing cost matrices (e.g. variable genes) |
| cell_filter | File with one cell id per line to include |
| cell_day_filter | Comma separated list of days to include (e.g. 12,14,16) |
| scaling_iter | Number of scaling iterations for OT solver Default: 3000 |
| inner_iter_max | For OT solver Default: 50 |
| epsilon | Controls the entropy of the transport map. An extremely large entropy parameter will give a maximally entropic transport map, and an extremely small entropy parameter will give a nearly deterministic transport map (but could also lead to numerical instability in the algorithm). Default: 0.05 |
| lambda1 | Regularization parameter that controls the fidelity of the constraints on p Default: 1 |
| lambda2 | Regularization parameter that controls the fidelity of the constraints on q Default: 50 |
| max_iter | Maximum number of scaling iterations. Abort if convergence was not reached Default: 10000000.0 |
| batch_size | Number of scaling iterations to perform between duality gap check Default: 5 |
| tolerance | Maximal acceptable ratio between the duality gap and the primal objective value Default: 1e-08 |
| epsilon0 | Warm starting value for epsilon Default: 1 |
| tau | For OT solver Default: 10000 |
| ncells | Number of cells to downsample from each timepoint and covariate |
| ncounts | Sample ncounts from each cell |
| solver | The solver to use to compute transport matrices Choices: duality_gap, fixed_iters Default: duality_gap |
| cell_days_field | Field name in cell_days file that contains cell days Default: day |
| cell_growth_rates_field | Field name in cell_growth_rates file that contains growth rates Default: cell_growth_rate |
| verbose | Print progress information |
| covariate | Covariate values for each cell |
| full_distances | Compute full distances |
| day_triplets | Three column file without a header containing start time, interpolation time, and end time |
| out | Prefix for output file names Default: tmaps_val |
| interp_size | The number of cells in the interpolated population Default: 10000 |
| covariate_field | Field name in covariate file that contains covariate Default: covariate |
wot optimal_transport_validation --matrix data/ExprMatrix.var.genes.h5ad --cell_days data/cell_days.txt --cell_filter data/serum_cell_ids.txt --covariate data/batches.txt --cell_growth_rates tmaps/serum_g.txt --cell_growth_rates_field g2 --verbose
This command computes and plots optimal transport validation results.
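
The day_triplets option expects a three-column file without a header, holding start time, interpolation time, and end time. The sketch below builds such rows from a sorted list of days by taking every run of three consecutive time points. That selection rule, the tab delimiter, and the helper names are assumptions made for illustration.

```python
import csv
import io


def consecutive_triplets(days):
    """Return (start, interpolation, end) for each run of three days."""
    ordered = sorted(set(days))
    return [tuple(ordered[i:i + 3]) for i in range(len(ordered) - 2)]


def write_triplets(handle, triplets):
    """Write one triplet per line, tab-delimited, with no header row."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerows(triplets)


days = [0, 2, 4, 6]  # hypothetical sampling days
triplets = consecutive_triplets(days)

buffer = io.StringIO()
write_triplets(buffer, triplets)
triplets_text = buffer.getvalue()
print(triplets_text)
```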
trajectory
Generate trajectories for cell sets generated at the given time.

| Parameter | Description |
|---|---|
| tmap | Directory of transport maps as produced by optimal transport |
| cell_set | gmt, gmx, or grp file of cell sets. |
| day | Day to consider for cell sets |
| cell_set_filter | Comma separated list of cell sets to include (e.g. IPS,Stromal) |
| format | Output matrix file format Default: txt |
| embedding | Optional file with id, x, y used for plotting |
| out | Prefix for output file names Default: wot |
| verbose | Print cell set information |
wot trajectory --tmap tmaps/serum --cell_set data/major_cell_sets.gmt --day 18 --embedding data/fle_coords.txt --verbose
This command computes and plots trajectories using the serum transport maps and major cell sets.
trajectory_divergence
Computes the distance between trajectories across time

| Parameter | Description |
|---|---|
| matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
| cell_days | File with headers "id" and "day" corresponding to cell id and days |
| distance_metric | Distance metric (earth mover's distance or total variation) Choices: emd, total_variation Default: emd |
| trajectory | One or more trajectory datasets as produced by the trajectory tool |
| compare | If "match", compare trajectories with the same name. If "all", compare all pairs. If "within", compare within a trajectory. If a trajectory name, compare to the specified trajectory. Default: within |
| local_pca | Convert day matrix to local PCA coordinates. Set to 0 to disable. Default: 30 |
| plot | Plot results |
| cell_filter | File with one cell id per line to include |
| gene_filter | File with one gene id per line to use for computing cost matrices (e.g. variable genes) |
| cell_day_filter | Comma separated list of days to include (e.g. 12,14,16) |
| cell_days_field | Field name in cell_days file that contains cell days Default: day |
| out | Prefix for output file names Default: wot-trajectory |
| verbose | Print progress |
wot trajectory_divergence --trajectory wot_trajectory.txt --cell_days data/cell_days.txt --matrix data/ExprMatrix.var.genes.h5ad --compare within --verbose --plot
This command computes the trajectory divergence.
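
For the total_variation choice of distance_metric, the underlying quantity is the standard total variation distance between two probability distributions: half the sum of absolute differences. The sketch below applies it to two hypothetical trajectories, each expressed as a distribution over the cells of one day. How wot normalizes the trajectories before comparing them is not covered here.

```python
def total_variation(p, q):
    """Total variation distance between two distributions over cell ids.

    p and q map cell id -> probability mass; missing cells count as 0.
    """
    cells = set(p) | set(q)
    return 0.5 * sum(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in cells)


# Hypothetical trajectory distributions at a single day.
traj_a = {"cell_1": 0.5, "cell_2": 0.5}
traj_b = {"cell_2": 0.5, "cell_3": 0.5}
traj_c = {"cell_4": 1.0}

print(total_variation(traj_a, traj_b))  # partial overlap
print(total_variation(traj_a, traj_c))  # no shared cells
```

The distance is 0 for identical distributions and 1 when the two trajectories place mass on disjoint sets of cells.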
trajectory_trends
Generate mean expression profiles for ancestors and descendants of each trajectory

| Parameter | Description |
|---|---|
| matrix | A matrix with cells on rows and features, such as genes or pathways on columns |
| trajectory | Trajectory dataset as produced by the trajectory tool |
| out | Prefix for output file names Default: trends |
| plot | Generate plots for each trajectory |
| cell_days | File with headers "id" and "day" corresponding to cell id and days |
| format | Output file format Choices: gct, h5ad, loom, txt, parquet Default: txt |
| gene_filter | File with one gene id per line, or a comma-separated list of genes to include from the matrix |
| cell_days_field | Field name in cell_days file that contains cell days Default: day |
wot trajectory_trends --trajectory wot_trajectory.txt --cell_days data/cell_days.txt --matrix data/ExprMatrix.h5ad --gene_filter Nanog,Obox6,Zfp42 --plot
This command computes and plots trajectory trends for the genes Nanog, Obox6, and Zfp42.
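
One way to picture a trajectory trend is as a weighted mean of expression per day, where the weights come from the trajectory. The sketch below assumes that the trajectory assigns each cell a non-negative weight and that the trend for a gene is the weight-averaged expression among the cells of each day. The values and names are hypothetical, and this is not wot's own code.

```python
def weighted_mean_by_day(expression, weights, cell_days):
    """Weighted mean expression of one gene for each day.

    expression: cell id -> expression value of the gene
    weights:    cell id -> trajectory weight (assumed non-negative)
    cell_days:  cell id -> day
    """
    totals = {}
    weight_sums = {}
    for cell, w in weights.items():
        day = cell_days[cell]
        totals[day] = totals.get(day, 0.0) + w * expression[cell]
        weight_sums[day] = weight_sums.get(day, 0.0) + w
    return {
        day: totals[day] / weight_sums[day]
        for day in sorted(totals)
        if weight_sums[day] > 0
    }


# Hypothetical Nanog expression, trajectory weights, and days for four cells.
nanog = {"c1": 0.0, "c2": 2.0, "c3": 4.0, "c4": 8.0}
weights = {"c1": 0.25, "c2": 0.75, "c3": 0.5, "c4": 0.5}
cell_days = {"c1": 12, "c2": 12, "c3": 14, "c4": 14}

trend = weighted_mean_by_day(nanog, weights, cell_days)
print(trend)
```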
transition_table
Generate a transition table from one cell set to another cell set

| Parameter | Description |
|---|---|
| tmap | Directory of transport maps as produced by optimal transport |
| cell_set | gmt, gmx, or grp file of cell sets. |
| start_time | The start time for the cell sets to compute the transitions to cell sets at end_time |
| end_time | The end time |
| out | Prefix for output file. Default: wot |
| format | Output file format Choices: gct, h5ad, loom, txt, parquet Default: h5ad |
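
As a rough mental model of a transition table, the sketch below sums the mass of a transport map from the cells of each set at start_time to the cells of each set at end_time. It assumes the map between the two time points is available as a plain matrix with known row and column cell ids; the matrix, ids, and set names are hypothetical, and wot's actual computation may differ.

```python
def transition_table(tmap, row_ids, col_ids, start_sets, end_sets):
    """Sum transported mass between each start set and each end set.

    tmap: nested list, rows are cells at start_time, columns at end_time.
    Returns a dict keyed by (start set name, end set name).
    """
    row_index = {cell: i for i, cell in enumerate(row_ids)}
    col_index = {cell: j for j, cell in enumerate(col_ids)}
    table = {}
    for s_name, s_cells in start_sets.items():
        rows = [row_index[c] for c in s_cells if c in row_index]
        for e_name, e_cells in end_sets.items():
            cols = [col_index[c] for c in e_cells if c in col_index]
            table[(s_name, e_name)] = sum(tmap[i][j] for i in rows for j in cols)
    return table


# Hypothetical 2 x 2 transport map between two time points.
tmap = [[0.4, 0.1],
        [0.0, 0.5]]
row_ids = ["a1", "a2"]
col_ids = ["b1", "b2"]
start_sets = {"Early": ["a1", "a2"]}
end_sets = {"IPS": ["b1"], "Stromal": ["b2"]}

table = transition_table(tmap, row_ids, col_ids, start_sets, end_sets)
print(table)
```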