iddn.tools
==========

.. py:module:: iddn.tools

.. autoapi-nested-parse::

   Utility functions of iDDN


Functions
---------

.. autoapisummary::

   iddn.tools.evaluate_metrics
   iddn.tools.get_comm_diff_network
   iddn.tools.iddn_basic_pipeline
   iddn.tools.iddn_output_to_csv
   iddn.tools.collect_edges


Module Contents
---------------

.. py:function:: evaluate_metrics(net_est: numpy.ndarray, net_gt: numpy.ndarray)

   Calculate the recall, precision, and F1 scores for iDDN estimates given ground truth

   :param net_est: Estimated network dependency matrix. The weights will be binarized.
   :type net_est: array_like
   :param net_gt: Ground truth network dependency matrix. The weights will be binarized.
   :type net_gt: array_like

   :rtype: None
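
   The metrics named above can be illustrated with a minimal NumPy sketch.
   The helper `binary_edge_metrics` is hypothetical and is not part of iDDN.
   It assumes that "binarized" means "non-zero" and that diagonal entries are ignored.
   Unlike `evaluate_metrics`, which is documented as returning `None`, this sketch returns the three scores.

   .. code-block:: python

      import numpy as np


      def binary_edge_metrics(net_est, net_gt):
          """Hypothetical helper: recall, precision, and F1 of binarized networks."""
          # Binarize: any non-zero weight counts as an edge (assumption).
          est = np.asarray(net_est) != 0
          gt = np.asarray(net_gt) != 0

          # Ignore the diagonal, i.e. self-loops (assumption).
          off_diag = ~np.eye(est.shape[0], dtype=bool)
          est, gt = est[off_diag], gt[off_diag]

          tp = np.sum(est & gt)    # edges found in both matrices
          fp = np.sum(est & ~gt)   # edges only in the estimate
          fn = np.sum(~est & gt)   # edges only in the ground truth

          recall = tp / (tp + fn) if tp + fn > 0 else 0.0
          precision = tp / (tp + fp) if tp + fp > 0 else 0.0
          denom = precision + recall
          f1 = 2 * precision * recall / denom if denom > 0 else 0.0
          return float(recall), float(precision), float(f1)


      net_gt = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
      net_est = np.array([[0, 0.5, 0.2], [0.5, 0, 0], [0.2, 0, 0]])
      recall, precision, f1 = binary_edge_metrics(net_est, net_gt)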


.. py:function:: get_comm_diff_network(out_iddn)

   Find the common and differential network from iDDN estimates

   :param out_iddn: The raw output of iDDN. P is the number of features.
   :type out_iddn: (2,P,P) array_like

   :returns: The common network and differential network matrices.
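
   One plausible reading of "common" and "differential" is sketched below.
   The helper `split_common_differential` and its `tol` threshold are hypothetical.
   The sketch assumes that an edge is common when it is non-zero under both conditions
   and differential when it is non-zero under exactly one. The actual rule used by iDDN may differ.

   .. code-block:: python

      import numpy as np


      def split_common_differential(out_iddn, tol=1e-8):
          """Hypothetical helper: split a (2, P, P) array into two binary networks."""
          out = np.asarray(out_iddn)
          g1 = np.abs(out[0]) > tol   # edges under condition 1
          g2 = np.abs(out[1]) > tol   # edges under condition 2

          comm = (g1 & g2).astype(int)   # present under both conditions
          diff = (g1 ^ g2).astype(int)   # present under exactly one condition
          return comm, diff


      out_iddn = np.array([
          [[0, 0.4, 0], [0.4, 0, 0.3], [0, 0.3, 0]],
          [[0, 0.5, 0], [0.5, 0, 0.0], [0, 0.0, 0]],
      ])
      comm, diff = split_common_differential(out_iddn)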


.. py:function:: iddn_basic_pipeline(dat1, dat2, dep_mat=None, lambda1=0.2, lambda2=0.05)

   A convenient pipeline for iDDN

   Let `P` be the number of features (like genes or any molecules).
   `N1` is the number of samples in condition 1, and `N2` is the number of samples in condition 2.

   The data will be standardized by iDDN, so users do not need to standardize it.
   The data from the two conditions can have different sample sizes, but the number of features must be the same.

   :param dat1: The data in condition 1.
   :type dat1: (N1,P) array_like
   :param dat2: The data in condition 2.
   :type dat2: (N2,P) array_like
   :param dep_mat: Constraints or dependency matrix
   :type dep_mat: (P,P) array_like
   :param lambda1: The penalty for overall sparsity, from 0 to 1.
   :type lambda1: float
   :param lambda2: The penalty for the discrepancies between the networks under two conditions
   :type lambda2: float

   :returns: A dictionary containing the results, with the following keys:

             * `comm` (P by P) is the estimated common network.
             * `diff` (P by P) is the estimated differential network.
             * `g1` (P by P) is the network under the first condition.
             * `g2` (P by P) is the network under the second condition.
             * `out_iddn` (2 by P by P) is the raw output of iDDN.
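
   A usage sketch follows, based only on the signature and return keys documented above.
   The random data, the sample sizes, and the guarded import are illustrative assumptions.
   The call is skipped when iDDN is not installed.

   .. code-block:: python

      import numpy as np

      rng = np.random.default_rng(0)
      n1, n2, p = 40, 60, 5   # different sample sizes, same number of features
      dat1 = rng.normal(size=(n1, p))   # condition 1, shape (N1, P)
      dat2 = rng.normal(size=(n2, p))   # condition 2, shape (N2, P)

      try:
          from iddn import tools
      except ImportError:
          tools = None   # iDDN is not available in this environment

      if tools is not None:
          # No manual standardization: the documentation states iDDN does it.
          res = tools.iddn_basic_pipeline(dat1, dat2, lambda1=0.2, lambda2=0.05)
          comm, diff = res["comm"], res["diff"]   # both documented as P by P
          print(comm.shape, diff.shape)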


.. py:function:: iddn_output_to_csv(out_iddn, node_names)

   Convert iDDN results to Pandas data frames

   This is useful for sharing the results, as well as visualization.
   Each row of the data frame is one edge.
   There are five columns: the first node in an edge, the second node in an edge,
   the condition under which the edge exists, the weight, and the color of that edge.
   For common networks, the conditions are all set as 0.
   For differential networks, if an edge only exists in the first condition, the condition is set as 0.
   If an edge only exists in the second condition, the condition is set as 1.

   Let `P` be the number of features.

   :param out_iddn: The raw output of iDDN.
   :type out_iddn: (2,P,P) array_like
   :param node_names: The list of node names to output.
   :type node_names: (P) array_like

   :returns: * **df_edge_comm** (*pd.DataFrame*) -- A Pandas data frame for common network.
             * **df_edge_diff** (*pd.DataFrame*) -- A Pandas data frame for differential network.
             * **nodes_show_comm** (*array_like*) -- The list of node names that are present in the estimated common network.
               In other words, we only keep nodes that have at least one edge with other nodes.
             * **nodes_show_diff** (*array_like*) -- The list of node names that are present in the estimated differential network.


.. py:function:: collect_edges(conn_mat, wt_mat, node_names, group=0, color_in='blue')

   Convert the adjacency matrix to a Pandas data frame

   For a differential network, call this function twice, once for each condition, and then combine the results.

   Let `P` be the number of features.

   :param conn_mat: The adjacency or connectivity matrix
   :type conn_mat: (P,P) array_like
   :param wt_mat: Similar to conn_mat, but with weights
   :type wt_mat: (P,P) array_like
   :param node_names: The names of all nodes
   :type node_names: (P) array_like
   :param group: The index of the condition; can be 0 or 1.
   :type group: int
   :param color_in: The color for this data frame.
   :type color_in: str

   :returns: * **df_edge** (*pd.DataFrame*) -- A Pandas data frame for the network.
             * **nodes_show** (*array_like*) -- The list of node names that are present in the estimated network.
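
   The conversion from an adjacency matrix to an edge list can be sketched as follows.
   The helper `edges_from_adjacency` and the column names are hypothetical.
   The sketch assumes a symmetric matrix, so each edge is listed once.
   It returns plain row dictionaries instead of a data frame to stay dependency-light;
   passing the rows to `pd.DataFrame(rows)` would give a frame with one edge per row.

   .. code-block:: python

      import numpy as np


      def edges_from_adjacency(conn_mat, wt_mat, node_names, group=0, color="blue"):
          """Hypothetical helper: list the edges of a symmetric adjacency matrix."""
          conn = np.asarray(conn_mat)
          wt = np.asarray(wt_mat)
          rows = []
          # Visit each unordered node pair once (upper triangle, no diagonal).
          for i, j in zip(*np.triu_indices(conn.shape[0], k=1)):
              if conn[i, j] != 0:
                  rows.append({
                      "source": node_names[i],
                      "target": node_names[j],
                      "condition": group,
                      "weight": float(wt[i, j]),
                      "color": color,
                  })
          # Keep only nodes that take part in at least one edge.
          nodes_show = sorted({r["source"] for r in rows} | {r["target"] for r in rows})
          return rows, nodes_show


      names = ["A", "B", "C", "D"]
      conn = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]])
      wt = 0.5 * conn
      rows, nodes_show = edges_from_adjacency(conn, wt, names, group=0, color="blue")

   In this example node `D` has no edges, so it is left out of `nodes_show`.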