iddn.tools
==========

.. py:module:: iddn.tools

.. autoapi-nested-parse::

   Utility functions of iDDN



Functions
---------

.. autoapisummary::

   iddn.tools.evaluate_metrics
   iddn.tools.get_comm_diff_network
   iddn.tools.iddn_basic_pipeline
   iddn.tools.iddn_output_to_csv
   iddn.tools.collect_edges


Module Contents
---------------

.. py:function:: evaluate_metrics(net_est: numpy.ndarray, net_gt: numpy.ndarray)

   Calculate the recall, precision, and F1 scores of an iDDN estimate against the ground truth

   :param net_est: Estimated network dependency matrix. The weights will be binarized.
   :type net_est: array_like
   :param net_gt: Ground truth network dependency matrix. The weights will be binarized.
   :type net_gt: array_like

   :rtype: None


.. py:function:: get_comm_diff_network(out_iddn)

   Find the common and differential networks from iDDN estimates

   Let `P` be the number of features.

   :param out_iddn: The raw output of iDDN.
   :type out_iddn: (2,P,P) array_like

   :rtype: Common network and differential network matrices


.. py:function:: iddn_basic_pipeline(dat1, dat2, dep_mat=None, lambda1=0.2, lambda2=0.05)

   A convenient pipeline for iDDN

   Let `P` be the number of features (such as genes or other molecules).
   `N1` is the number of samples in condition 1 and `N2` is the number of samples in condition 2.
   iDDN standardizes the data itself, so users do not need to standardize it beforehand.
   The two conditions may have different sample sizes, but the number of features must be the same.

   :param dat1: The data in condition 1.
   :type dat1: (N1,P) array_like
   :param dat2: The data in condition 2.
   :type dat2: (N2,P) array_like
   :param dep_mat: Constraints or dependency matrix.
   :type dep_mat: (P,P) array_like
   :param lambda1: The penalty for overall sparsity, from 0 to 1.
   :type lambda1: float
   :param lambda2: The penalty for the discrepancies between the networks under the two conditions.
   :type lambda2: float

   :returns: A dictionary containing the results.

             * `comm` (P by P) is the estimated common network.
             * `diff` (P by P) is the estimated differential network.
             * `g1` (P by P) is the network under the first condition.
             * `g2` (P by P) is the network under the second condition.
             * `out_iddn` (2 by P by P) is the raw output of iDDN.


.. py:function:: iddn_output_to_csv(out_iddn, node_names)

   Convert iDDN results to Pandas data frames

   This is useful for sharing the results, as well as for visualization.
   Each row of a data frame is one edge. There are five columns: the first node in the edge,
   the second node in the edge, the condition in which the edge exists, the weight, and the color of the edge.

   For common networks, the condition is always set to 0.
   For differential networks, the condition is set to 0 if an edge exists only in the first condition,
   and to 1 if an edge exists only in the second condition.

   Let `P` be the number of features.

   :param out_iddn: The raw output of iDDN.
   :type out_iddn: (2,P,P) array_like
   :param node_names: The list of node names to output.
   :type node_names: (P) array_like

   :returns: * **df_edge_comm** (*pd.DataFrame*) -- A Pandas data frame for the common network.
             * **df_edge_diff** (*pd.DataFrame*) -- A Pandas data frame for the differential network.
             * **nodes_show_comm** (*array_like*) -- The list of node names that are present in the estimated common network.
               In other words, only nodes that have at least one edge to another node are kept.
             * **nodes_show_diff** (*array_like*) -- The list of node names that are present in the estimated differential network.


.. py:function:: collect_edges(conn_mat, wt_mat, node_names, group=0, color_in='blue')

   Convert an adjacency matrix to a Pandas data frame

   For a differential network, call this function twice, once for each condition, and then combine the results.

   Let `P` be the number of features.

   :param conn_mat: The adjacency or connectivity matrix.
   :type conn_mat: (P,P) array_like
   :param wt_mat: Similar to `conn_mat`, but with weights.
   :type wt_mat: (P,P) array_like
   :param node_names: The names of all nodes.
   :type node_names: (P) array_like
   :param group: The index of the condition, either 0 or 1.
   :type group: int
   :param color_in: The color for this data frame.
   :type color_in: str

   :returns: * **df_edge** (*pd.DataFrame*) -- A Pandas data frame for the network.
             * **nodes_show** (*array_like*) -- The list of node names that are present in the estimated network.
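
To illustrate the kind of conversion `collect_edges` describes, the following is a minimal sketch of turning an adjacency matrix into one row per edge. It is not the iDDN implementation: the helper name `edges_from_adjacency` is hypothetical, the network is assumed to be undirected (only the upper triangle is read), and the rows are returned as plain tuples rather than a Pandas data frame.

```python
import numpy as np


def edges_from_adjacency(conn_mat, wt_mat, node_names, group=0, color="blue"):
    """Hypothetical helper: list the edges of an undirected network as rows."""
    conn = np.asarray(conn_mat) != 0
    wt = np.asarray(wt_mat, dtype=float)

    # Assumption: the matrix is symmetric, so the upper triangle (without the
    # diagonal) contains every edge exactly once.
    idx_a, idx_b = np.nonzero(np.triu(conn, k=1))

    # One row per edge: first node, second node, condition, weight, color.
    rows = [
        (node_names[i], node_names[j], group, float(wt[i, j]), color)
        for i, j in zip(idx_a, idx_b)
    ]

    # Keep only the nodes that take part in at least one edge.
    used = sorted(set(idx_a.tolist()) | set(idx_b.tolist()))
    nodes_show = [node_names[k] for k in used]
    return rows, nodes_show


conn = np.array([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
wt = np.array([[0.0, 0.4, 0.0], [0.4, 0.0, 0.0], [0.0, 0.0, 0.0]])
rows, nodes_show = edges_from_adjacency(conn, wt, ["A", "B", "C"])
```

Rows in this shape can be passed to `pandas.DataFrame` with five column names of your choosing. For a differential network, the same helper could be called once per condition with `group=0` and `group=1` and the two row lists concatenated.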
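
The metrics named in `evaluate_metrics` (recall, precision, and F1 on binarized networks) can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name `binary_edge_metrics` is hypothetical, any nonzero weight is treated as an edge, the diagonal is ignored, and each undirected edge is counted once.

```python
import numpy as np


def binary_edge_metrics(net_est, net_gt):
    """Hypothetical helper: recall, precision, and F1 for binarized networks."""
    est = np.asarray(net_est) != 0
    gt = np.asarray(net_gt) != 0

    # Assumption: both networks are undirected, so compare only the entries
    # strictly above the diagonal.
    upper = np.triu_indices(est.shape[0], k=1)
    est, gt = est[upper], gt[upper]

    tp = int(np.sum(est & gt))    # edges found and truly present
    fp = int(np.sum(est & ~gt))   # edges found but not present
    fn = int(np.sum(~est & gt))   # edges present but missed

    # Return 0.0 when a ratio is undefined (no predicted or no true edges).
    precision = tp / (tp + fp) if tp + fp > 0 else 0.0
    recall = tp / (tp + fn) if tp + fn > 0 else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom > 0 else 0.0
    return recall, precision, f1


# Estimated edges: (0,1) and (0,2). True edges: (0,1) and (1,2).
net_est = np.array([[0, 0.3, 0.2], [0.3, 0, 0], [0.2, 0, 0]])
net_gt = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
recall, precision, f1 = binary_edge_metrics(net_est, net_gt)
```

In this example one edge is recovered, one is spurious, and one is missed, so all three scores come out to 0.5.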
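
The split into common and differential networks can be sketched in the same spirit. The definitions used here are assumptions inferred from the descriptions above, not confirmed behavior of `get_comm_diff_network`: an edge is treated as common when it is nonzero in both conditions and as differential when it is nonzero in exactly one. The function name `split_common_differential` is hypothetical.

```python
import numpy as np


def split_common_differential(out):
    """Hypothetical helper: split a (2, P, P) estimate into two 0/1 matrices."""
    out = np.asarray(out)
    g1 = out[0] != 0  # edges under the first condition
    g2 = out[1] != 0  # edges under the second condition

    comm = g1 & g2    # assumed: present in both conditions
    diff = g1 ^ g2    # assumed: present in exactly one condition
    return comm.astype(int), diff.astype(int)


out = np.zeros((2, 3, 3))
out[0, 0, 1] = out[0, 1, 0] = 0.5   # edge 0-1 in condition 1
out[1, 0, 1] = out[1, 1, 0] = 0.4   # edge 0-1 in condition 2
out[1, 1, 2] = out[1, 2, 1] = 0.3   # edge 1-2 only in condition 2
comm, diff = split_common_differential(out)
```

Here the edge between nodes 0 and 1 lands in the common network, and the edge between nodes 1 and 2 lands in the differential network.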