{ "cells": [ { "cell_type": "markdown", "id": "5cf5560d-d6c4-459c-b23f-512bde31b3e3", "metadata": {}, "source": [ "# Breast cancer example" ] }, { "cell_type": "markdown", "id": "f8770706-0b33-4e16-a087-ab339b402f30", "metadata": {}, "source": [ "# POSE pipeline overview" ] }, { "cell_type": "markdown", "id": "d44ee8a9-38a7-4aad-89c2-47287954d0e2", "metadata": {}, "source": [ "(Single-modal) In this tutorial, we demonstrate how to run the POSE pipeline on breast cancer (BC) RNA data from TCGA. The analysis consists of the following steps:\n", "\n", "1. Load data\n", "2. Compute pairwise sample Wasserstein and Euclidean (WE) distances with respect to gene neighborhoods\n", "3. Compute global pairwise sample (WE) distance matrices\n", "4. Convert global distances to a single sample pairwise similarity matrix\n", "5. (Future work - DPT distance and/or multi-scale diffusion-based distances)\n", "6. Extract pseudo-organization (i.e., ordering) of samples.\n", "7. Determine schema (i.e., branching).\n", "8. Visualize schema\n", "9. 
(Future work - further investigation into samples within different branches and differential analysis between branches)\n", "\n", "- $d_W(i,j,v)$ - The Wasserstein distance of the 1-hop neighborhood around gene $v$ between sample $i$ and sample $j$\n", "- $d_E(i,j,v)$ - The Euclidean distance of the 1-hop neighborhood around gene $v$ between sample $i$ and sample $j$\n", "- $D_W(i,j) = \\|d_W(i,j,\\cdot)\\|$ - Net Wasserstein distance between sample $i$ and sample $j$: a norm of $d_W(i,j,v)$ taken over all genes $v$\n", "- $D_E(i,j) = \\|d_E(i,j,\\cdot)\\|$ - Net Euclidean distance between sample $i$ and sample $j$: a norm of $d_E(i,j,v)$ taken over all genes $v$\n", "- $K_W = e^{-\\frac{\\|D_W\\|^2}{\\sigma^2}}$ - Pairwise sample Wasserstein similarity matrix \n", "- $K_E = e^{-\\frac{\\|D_E\\|^2}{\\sigma^2}}$ - Pairwise sample Euclidean similarity matrix \n", "- (Multi-feature, multi-modal) $K = \\frac{K_W + K_E}{2}$ - Fused pairwise sample similarity matrix\n", "- $D = 1 - K$ - Fused pairwise sample distance matrix\n", "- (Clustering) Determine branching according to lineage tracing algorithm using $D$ \n", "- (Visualizing) Pseudo-ordering of samples in branches according to distance from root node $r$, i.e., $D(r,:)$" ] }, { "cell_type": "markdown", "id": "c752e21d-6559-40d0-b75d-52322eb4805a", "metadata": {}, "source": [ "__Input__\n", "- `X` : RNA-Seq data matrix\n", "- `genelist` : list of genes of interest in `X`\n", "- `G` : gene-gene (or protein-protein) interaction network\n", "\n", "__Algorithm__\n", "1. Compute sample-pairwise distances:\n", " 1. 
Compute sample-pairwise gene-1-hop-neighborhood distances:\n", " - Wasserstein distances (shape): For each gene `g` in the `genelist`, compute $$d_W^{(g)}(sample_i, sample_j)$$ which is the Wasserstein distance of gene `g`'s 1-hop neighborhood between every two samples `i` and `j`.\n", " - Euclidean distances (scale): For each gene `g` in the `genelist`, compute $$d_E^{(g)}(sample_i, sample_j)$$ which is the Euclidean distance of gene `g`'s 1-hop neighborhood (including self) between every two samples `i` and `j`.\n", " 2. Compute net sample-pairwise distances:\n", " - Wasserstein distance (shape): $$d_W(sample_i, sample_j) = |d_W^{(g)}(sample_i, sample_j)|$$ as the $L_1$ or $L_2$ norm of the vector of gene-1-hop-neighborhood Wasserstein distances for all genes `g` in the `genelist`.\n", " - Euclidean distance (scale): $$d_E(sample_i, sample_j) = |d_E^{(g)}(sample_i, sample_j)|$$ as the $L_1$ or $L_2$ norm of the vector of gene-1-hop-neighborhood Euclidean distances for all genes `g` in the `genelist`.\n", "2. Convert net sample-pairwise distances to standardized similarities:\n", " - $\\sigma$ : Kernel width representing each data point's accessible neighbors. \n", " - The kernel is normalized per observation: $$K_{ij}(d) = \\sqrt{\\frac{2\\sigma_i\\sigma_j}{\\sigma_i^2 + \\sigma_j^2}}e^{-\\frac{d_{ij}^2}{2(\\sigma_i^2 + \\sigma_j^2)}}$$\n", "where, for each observation $i$, $\\sigma_i$ is the distance to its $k$-th nearest neighbor.\n", " - Standardized sample-pairwise Wasserstein similarity: $$K_W = K(d_W)$$\n", " - Standardized sample-pairwise Euclidean similarity: $$K_E = K(d_E)$$\n", "3. Compute fused similarity $\\tilde{K}$:\n", " - $\\tilde{K} = \\frac{K_W + K_E}{2}$ (Note: other weights could be chosen to fuse with an unbalanced combination)\n", "4. Compute diffusion distance from similarity: $d_K$\n", "5. Compute POSE from diffusion distance: $G_{POSE}$\n", "6. 
PROBE the POSE:\n", " - Clustering\n", " - Visualization\n", "\n" ] }, { "cell_type": "markdown", "id": "516eb552-2fa8-4809-b6f2-0952ba3da52a", "metadata": {}, "source": [ "__Acknowledgement__\n", "- The diffusion distance and pseudo-ordering are computed according to the lineage tracing algorithm presented in Haghverdi et al. (2019).\n", "- The code for computing the ordering is largely adapted from the scanpy implementation in Python.\n" ] }, { "cell_type": "markdown", "id": "d20a4dbb-db02-4c87-aab8-7e6dced91ba4", "metadata": {}, "source": [ "First, import the necessary packages:" ] }, { "cell_type": "markdown", "id": "0df1c8d8-63d5-4035-bd66-9d9c1fbcb2e6", "metadata": {}, "source": [ "# Load libraries" ] }, { "cell_type": "code", "execution_count": 1, "id": "8540d6ad-1032-468a-a103-10ab35a97de0", "metadata": {}, "outputs": [], "source": [ "import pathlib\n", "import sys\n", "\n", "from collections import defaultdict as ddict\n", "import itertools\n", "import matplotlib.colors as cm\n", "import matplotlib.pyplot as plt\n", "import networkx as nx\n", "import numpy as np\n", "import pandas as pd\n", "import scipy.sparse as sc_sparse\n", "from tqdm import tqdm" ] }, { "cell_type": "markdown", "id": "d06bbd95-7b76-42ee-917f-51950a0c331f", "metadata": {}, "source": [ "If ``netflow`` has not been installed, add the path to the library:" ] }, { "cell_type": "code", "execution_count": 2, "id": "a1044eec-5973-4df1-99ac-f0308f9ca1ce", "metadata": {}, "outputs": [], "source": [ "sys.path.insert(0, pathlib.Path(pathlib.Path('.').absolute()).parents[3].resolve().as_posix())\n", "# sys.path.insert(0, pathlib.Path(pathlib.Path('.').absolute()).parents[0].resolve().as_posix())" ] }, { "cell_type": "code", "execution_count": 3, "id": "6d88da91-52a2-42da-9f39-3356b9f01691", "metadata": {}, "outputs": [], "source": [ "import netflow as nf\n", "from netflow.probe import visualization as nfv" ] }, { "cell_type": "markdown", "id": "c0c88311-56e0-42e3-99e2-f69075dbe9d6", "metadata": 
{}, "source": [ "# Set up directories" ] }, { "cell_type": "code", "execution_count": 4, "id": "03af6966-61d4-4bb5-ae64-a6fc9e028282", "metadata": {}, "outputs": [], "source": [ "MAIN_DIR = pathlib.Path('.').absolute()" ] }, { "cell_type": "markdown", "id": "1ebeae8f-082b-483f-a481-50bedd139943", "metadata": {}, "source": [ "Paths to where data is stored:" ] }, { "cell_type": "code", "execution_count": 5, "id": "2eb1cf17-7d49-44b2-9914-ae4ac997c51c", "metadata": {}, "outputs": [], "source": [ "DATA_DIR = MAIN_DIR / 'example_data' / 'breast_tcga'\n", "\n", "RNA_FNAME = DATA_DIR / 'rna_606.txt'\n", "E_RNA_FNAME = DATA_DIR / 'edgelist_hprd_rna_606.txt'\n", "\n", "CNA_FNAME = DATA_DIR / 'cna_606.txt'\n", "E_CNA_FNAME = DATA_DIR / 'edgelist_hprd_cna_606.txt'\n", "\n", "METH_FNAME = DATA_DIR / 'methylation_606.txt'\n", "E_METH_FNAME = DATA_DIR / 'edgelist_hprd_methylation_606.txt'\n", "\n", "CLIN_FNAME = DATA_DIR / 'clin_606.txt'" ] }, { "cell_type": "markdown", "id": "164e41c1-3407-4983-b31f-74d9ec5701e4", "metadata": {}, "source": [ "Directory where output should be saved:" ] }, { "cell_type": "code", "execution_count": 6, "id": "2959b849-da29-435e-8de5-4505afe3374a", "metadata": {}, "outputs": [], "source": [ "OUT_DIR = MAIN_DIR / 'example_data' / 'results_netflow_breast_tcga'" ] }, { "cell_type": "markdown", "id": "c243e314-e1f6-4e29-9b9e-c32b1c424bcf", "metadata": {}, "source": [ "# Load clinical data" ] }, { "cell_type": "code", "execution_count": 7, "id": "136bf4e2-bec5-4b9e-94d3-e634eff43ccc", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(606, 34)\n" ] } ], "source": [ "clin = pd.read_csv(CLIN_FNAME, header=0, index_col=0)\n", "print(clin.shape)" ] }, { "cell_type": "code", "execution_count": 8, "id": "58507362-cadd-4082-a23e-1b7e75a40f37", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | PATIENT_ID | \n", "SAMPLE_TYPE | \n", "CANCER_TYPE | \n", "CANCER_TYPE_DETAILED | \n", "ONCOTREE_CODE | \n", "TMB_NONSYNONYMOUS | \n", "OS_STATUS | \n", "OS_MONTHS | \n", "DFS_STATUS | \n", "DFS_MONTHS | \n", "... | \n", "PATH_MARGIN | \n", "AJCC_TUMOR_PATHOLOGIC_PT | \n", "AJCC_NODES_PATHOLOGIC_PN | \n", "AJCC_METASTASIS_PATHOLOGIC_PM | \n", "AJCC_PATHOLOGIC_TUMOR_STAGE | \n", "AJCC_STAGING_EDITION | \n", "Buffa Hypoxia Score | \n", "Ragnum Hypoxia Score | \n", "Winter Hypoxia Score | \n", "PAM50 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
TCGA-E9-A295-01 | \n", "TCGA-E9-A295 | \n", "Primary | \n", "Breast Cancer | \n", "Breast Invasive Lobular Carcinoma | \n", "ILC | \n", "1.866667 | \n", "0:LIVING | \n", "12.32 | \n", "0:DiseaseFree | \n", "12.32 | \n", "... | \n", "Negative | \n", "T2 | \n", "N0 (i-) | \n", "M0 | \n", "Stage IIA | \n", "7th | \n", "-19.0 | \n", "-4.0 | \n", "0.0 | \n", "LumA | \n", "
TCGA-AR-A1AS-01 | \n", "TCGA-AR-A1AS | \n", "Primary | \n", "Breast Cancer | \n", "Breast Invasive Ductal Carcinoma | \n", "IDC | \n", "0.933333 | \n", "0:LIVING | \n", "37.78 | \n", "0:DiseaseFree | \n", "37.78 | \n", "... | \n", "Negative | \n", "T2 | \n", "N1 | \n", "M0 | \n", "Stage IIB | \n", "6th | \n", "11.0 | \n", "10.0 | \n", "14.0 | \n", "LumA | \n", "
2 rows × 34 columns
\n", "netflow.methods.classes: 05/01/2025 05:05:01 PM | MSG | \n", "classes:multiple_pairwise_observation_neighborhood_wass_distance:864 | >>> Observation pairwise 1-hop neighborhood \n", "Wasserstein distances on rna_immune_signature_nbhd_wass_without_self saved to \n", "rna_immune_signature_nbhd_wass_without_self.csv. \n", "\n" ], "text/plain": [ "netflow.methods.classes: \u001b[1;36m05\u001b[0m/\u001b[1;36m01\u001b[0m/\u001b[1;36m2025\u001b[0m \u001b[1;92m05:05:01\u001b[0m PM | \u001b[1;33mMSG\u001b[0m | \n", "classes:multiple_pairwise_observation_neighborhood_wass_distan\u001b[1;92mce:864\u001b[0m | >>> Observation pairwise \u001b[1;36m1\u001b[0m-hop neighborhood \n", "Wasserstein distances on rna_immune_signature_nbhd_wass_without_self saved to \n", "rna_immune_signature_nbhd_wass_without_self.csv. \n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "Computing pairwise 1-hop nbhd Wasserstein distances: 100%|\u001b[33m█████████████████████████████████████████\u001b[0m| 36/36 [04:52<00:00, 8.13s/it]\u001b[0m\n" ] }, { "data": { "text/html": [ "
netflow.methods.classes: 05/01/2025 05:10:23 PM | MSG | \n", "classes:multiple_pairwise_observation_neighborhood_wass_distance:864 | >>> Observation pairwise 1-hop neighborhood \n", "Wasserstein distances on meth_immune_signature_nbhd_wass_without_self saved to \n", "meth_immune_signature_nbhd_wass_without_self.csv. \n", "\n" ], "text/plain": [ "netflow.methods.classes: \u001b[1;36m05\u001b[0m/\u001b[1;36m01\u001b[0m/\u001b[1;36m2025\u001b[0m \u001b[1;92m05:10:23\u001b[0m PM | \u001b[1;33mMSG\u001b[0m | \n", "classes:multiple_pairwise_observation_neighborhood_wass_distan\u001b[1;92mce:864\u001b[0m | >>> Observation pairwise \u001b[1;36m1\u001b[0m-hop neighborhood \n", "Wasserstein distances on meth_immune_signature_nbhd_wass_without_self saved to \n", "meth_immune_signature_nbhd_wass_without_self.csv. \n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "for dd in keeper.data:\n", " keeper.wass_distance_pairwise_observation_feature_nbhd(dd.label, dd.label, features=gene_sig,\n", " include_self=False, label=gene_sig_label)" ] }, { "cell_type": "markdown", "id": "56e083e7-633d-4d51-b444-0861998c5cc1", "metadata": {}, "source": [ "As a result, for each feature data set (RNA, CNA and methylation), the Wasserstein distances between pairs of observations\n", "are saved to `keeper.misc` as a table whose rows are observation pairs and whose columns are feature (node) names, keyed by\n", "``f\"{data_key}_{label}_nbhd_wass_with{'' if include_self else 'out'}_self\"``.\n", "\n", "For example, we can see the RNA results as follows: " ] }, { "cell_type": "code", "execution_count": 16, "id": "a2ed35ca-aed8-4d87-8c88-28837e5da4dc", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | \n", " | ACTB | \n", "CCR5 | \n", "CD163 | \n", "CD247 | \n", "CD28 | \n", "CD3D | \n", "CD3E | \n", "CD3G | \n", "CD4 | \n", "CD80 | \n", "... | \n", "SNAI1 | \n", "TGFB2 | \n", "TGFBR1 | \n", "TGFBR2 | \n", "TNF | \n", "TUBA1A | \n", "VAV1 | \n", "VEGFA | \n", "ZAP70 | \n", "ZEB1 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
observation_a | \n", "observation_b | \n", "\n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " | \n", " |
TCGA-E9-A295-01 | \n", "TCGA-AR-A1AS-01 | \n", "0.375430 | \n", "0.482697 | \n", "0.025785 | \n", "0.423335 | \n", "0.226186 | \n", "0.623157 | \n", "0.489348 | \n", "0.161659 | \n", "0.930970 | \n", "0.061216 | \n", "... | \n", "0.319277 | \n", "0.760959 | \n", "0.56571 | \n", "0.955370 | \n", "0.177350 | \n", "0.167693 | \n", "0.304148 | \n", "0.199462 | \n", "0.638770 | \n", "0.645654 | \n", "
TCGA-AQ-A1H2-01 | \n", "0.467019 | \n", "0.447891 | \n", "0.050915 | \n", "0.221294 | \n", "0.223247 | \n", "0.167546 | \n", "0.245124 | \n", "0.014759 | \n", "0.153712 | \n", "0.205665 | \n", "... | \n", "0.178091 | \n", "0.559237 | \n", "0.62304 | \n", "0.764808 | \n", "0.481764 | \n", "0.582859 | \n", "0.383708 | \n", "0.218584 | \n", "0.474194 | \n", "0.261223 | \n", "
2 rows × 36 columns
\n", "netflow.methods.classes: 05/01/2025 05:23:41 PM | MSG | \n", "classes:multiple_pairwise_observation_neighborhood_euc_distance:961 | >>> Observation pairwise 1-hop neighborhood \n", "Euclidean distances on rna_immune_signature_nbhd_euc_without_self saved to \n", "rna_immune_signature_nbhd_euc_without_self.csv. \n", "\n" ], "text/plain": [ "netflow.methods.classes: \u001b[1;36m05\u001b[0m/\u001b[1;36m01\u001b[0m/\u001b[1;36m2025\u001b[0m \u001b[1;92m05:23:41\u001b[0m PM | \u001b[1;33mMSG\u001b[0m | \n", "classes:multiple_pairwise_observation_neighborhood_euc_distan\u001b[1;92mce:961\u001b[0m | >>> Observation pairwise \u001b[1;36m1\u001b[0m-hop neighborhood \n", "Euclidean distances on rna_immune_signature_nbhd_euc_without_self saved to \n", "rna_immune_signature_nbhd_euc_without_self.csv. \n" ] }, "metadata": {}, "output_type": "display_data" }, { "name": "stderr", "output_type": "stream", "text": [ "Computing pairwise 1-hop nbhd Euclidean distances: 100%|\u001b[33m███████████████████████████████████████████\u001b[0m| 36/36 [00:01<00:00, 27.24it/s]\u001b[0m\n" ] }, { "data": { "text/html": [ "
netflow.methods.classes: 05/01/2025 05:23:52 PM | MSG | \n", "classes:multiple_pairwise_observation_neighborhood_euc_distance:961 | >>> Observation pairwise 1-hop neighborhood \n", "Euclidean distances on meth_immune_signature_nbhd_euc_without_self saved to \n", "meth_immune_signature_nbhd_euc_without_self.csv. \n", "\n" ], "text/plain": [ "netflow.methods.classes: \u001b[1;36m05\u001b[0m/\u001b[1;36m01\u001b[0m/\u001b[1;36m2025\u001b[0m \u001b[1;92m05:23:52\u001b[0m PM | \u001b[1;33mMSG\u001b[0m | \n", "classes:multiple_pairwise_observation_neighborhood_euc_distan\u001b[1;92mce:961\u001b[0m | >>> Observation pairwise \u001b[1;36m1\u001b[0m-hop neighborhood \n", "Euclidean distances on meth_immune_signature_nbhd_euc_without_self saved to \n", "meth_immune_signature_nbhd_euc_without_self.csv. \n" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "for dd in keeper.data:\n", " keeper.euc_distance_pairwise_observation_feature_nbhd(dd.label, dd.label, features=gene_sig,\n", " include_self=False, label=gene_sig_label,\n", " metric='euclidean')" ] }, { "cell_type": "markdown", "id": "de62a861-1120-447f-9df9-3116c6615213", "metadata": {}, "source": [ "### Distance from a single feature neighborhood" ] }, { "cell_type": "markdown", "id": "aba93a63-fd69-42da-871e-2e44f16d9aaf", "metadata": {}, "source": [ "If we are interested in the distances with respect to a single feature neighborhood, for example CD28, we can extract the stacked-distance format from `keeper.misc` and add it to the distances in `keeper.distances`:" ] }, { "cell_type": "markdown", "id": "40645a62-0e3d-48e5-a7e8-09f2a7f562c8", "metadata": {}, "source": [ "First we extract the stacked RNA CD28 neighborhood Wasserstein distances:" ] }, { "cell_type": "code", "execution_count": 18, "id": "27ae544a-14f1-4b93-a2f0-fefd34ec9400", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict_keys(['rna_immune_signature_nbhd_wass_without_self', 'meth_immune_signature_nbhd_wass_without_self', 
'rna_immune_signature_nbhd_euc_without_self', 'meth_immune_signature_nbhd_euc_without_self'])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "keeper.misc.keys()" ] }, { "cell_type": "code", "execution_count": 19, "id": "caf75516-dca6-4d8b-9be1-f9f29278d63e", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "observation_a observation_b \n", "TCGA-E9-A295-01 TCGA-AR-A1AS-01 0.226186\n", " TCGA-AQ-A1H2-01 0.223247\n", " TCGA-A8-A08O-01 0.246211\n", " TCGA-BH-A1FJ-01 0.322797\n", " TCGA-JL-A3YX-01 0.382493\n", "Name: CD28, dtype: float64" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "rna_CD28_nbhd_wass_dist = keeper.misc[f\"rna_{gene_sig_label}_nbhd_wass_without_self\"]['CD28']\n", "rna_CD28_nbhd_wass_dist.head()" ] }, { "cell_type": "markdown", "id": "aefbfd45-6cb1-467a-bcba-d2dd9510d33d", "metadata": {}, "source": [ "Then we add the stacked distances to `keeper.distances`, keyed by `'wass_dist_rna_CD28_nbhd'`:" ] }, { "cell_type": "code", "execution_count": 20, "id": "9457a3a8-1209-4922-9dcf-08c23f9090db", "metadata": {}, "outputs": [], "source": [ "keeper.add_stacked_distance(rna_CD28_nbhd_wass_dist, 'wass_dist_rna_CD28_nbhd')" ] }, { "cell_type": "markdown", "id": "9b393cd7-33e7-4285-8c69-e61e69d29cc8", "metadata": {}, "source": [ "### Distance from norm of feature neighborhood distances" ] }, { "cell_type": "markdown", "id": "4ae4d9bd-b431-417e-8614-5c400cf91c6d", "metadata": {}, "source": [ "Alternatively, we can use the norm of the vector of neighborhood distances over all features as the final distance. 
First, we extract the norm of feature neighborhood distances, again demonstrated on the RNA-Wasserstein distances.\n", "\n", "Summary of the arguments passed to compute the norm and save it as a distance:\n", "- `misc_key` : Key of the stacked RNA feature neighborhood Wasserstein distances stored in `keeper.misc`\n", "- `distance_label` : Key under which the resulting distance is stored in `keeper.distances`\n", "- `features` : Optional subset of features to include. If provided, the norm is restricted to the columns corresponding to features in the specified list. If `None`, all columns are used. (In this demonstration, we use the default behavior and include all features for which neighborhood distances were computed.)\n", "- `method` : Which norm to compute; one of ['L1', 'L2', 'inf', 'mean', 'median']" ] }, { "cell_type": "code", "execution_count": 21, "id": "c7077537-3737-453d-8ce2-57b1d55c7eff", "metadata": {}, "outputs": [], "source": [ "misc_key = f\"rna_{gene_sig_label}_nbhd_wass_without_self\"\n", "distance_label = f\"norm_{misc_key}\"\n", "nf.methods.metrics.norm_features_as_sym_dist(keeper, misc_key, label=distance_label,\n", " features=None, method='L1')" ] }, { "cell_type": "markdown", "id": "e14ef55c-a4d3-4900-9a7c-ce09f5bb732b", "metadata": {}, "source": [ "We can now see this distance has been added to `keeper.distances`:" ] }, { "cell_type": "code", "execution_count": 22, "id": "42700708-06c4-4ea2-bffe-5e6463271075", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | TCGA-E9-A295-01 | \n", "TCGA-AR-A1AS-01 | \n", "TCGA-AQ-A1H2-01 | \n", "TCGA-A8-A08O-01 | \n", "TCGA-BH-A1FJ-01 | \n", "TCGA-JL-A3YX-01 | \n", "TCGA-A7-A425-01 | \n", "TCGA-AC-A2BM-01 | \n", "TCGA-LL-A6FP-01 | \n", "TCGA-A7-A26E-01 | \n", "... | \n", "TCGA-A2-A1G0-01 | \n", "TCGA-WT-AB41-01 | \n", "TCGA-EW-A1P6-01 | \n", "TCGA-XX-A89A-01 | \n", "TCGA-A7-A4SD-01 | \n", "TCGA-AC-A6IX-01 | \n", "TCGA-AR-A24L-01 | \n", "TCGA-BH-A42U-01 | \n", "TCGA-AR-A24S-01 | \n", "TCGA-BH-A0BC-01 | \n", "
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
TCGA-E9-A295-01 | \n", "0.000000 | \n", "13.415799 | \n", "11.976610 | \n", "10.434216 | \n", "11.961795 | \n", "9.241838 | \n", "9.865547 | \n", "12.639423 | \n", "17.064882 | \n", "13.054158 | \n", "... | \n", "11.651279 | \n", "16.169011 | \n", "10.006264 | \n", "11.019874 | \n", "13.043276 | \n", "9.243581 | \n", "12.245928 | \n", "13.691054 | \n", "11.829914 | \n", "10.739983 | \n", "
TCGA-AR-A1AS-01 | \n", "13.415799 | \n", "0.000000 | \n", "13.968681 | \n", "9.741795 | \n", "9.344761 | \n", "9.340045 | \n", "12.072010 | \n", "12.957682 | \n", "16.617421 | \n", "10.944821 | \n", "... | \n", "13.210496 | \n", "19.160851 | \n", "9.812397 | \n", "13.671998 | \n", "15.426071 | \n", "11.771620 | \n", "11.390057 | \n", "17.613263 | \n", "13.028777 | \n", "11.599280 | \n", "
2 rows × 606 columns
\n", "