Intra-chromosomal structural communities determined by multiscale community mining using graph wavelets

to appear in BMC Bioinformatics


Multi-scale community mining using graph wavelets

Hi-C data can be represented as graphs where nodes represent DNA loci and the edges connect interacting loci, allowing us to reformulate the question of finding structural domains as a question of finding communities in the DNA interaction network.
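
As a minimal sketch of this representation, assume the Hi-C data arrive as a symmetric matrix of contact counts between genomic bins. The toy counts, the `min_count` threshold, and the function name `contact_matrix_to_graph` are illustrative assumptions and are not taken from the published pipeline or the MSCD toolbox.

```python
def contact_matrix_to_graph(contacts, min_count=1):
    """Build a weighted, undirected graph from a symmetric contact matrix.

    Nodes are bin indices (DNA loci); an edge links two distinct bins whose
    contact count is at least `min_count`. The threshold is an illustrative
    assumption, not a value taken from the publication.
    """
    n = len(contacts)
    graph = {i: {} for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only; self-contacts ignored
            weight = contacts[i][j]
            if weight >= min_count:
                graph[i][j] = weight
                graph[j][i] = weight
    return graph


# Toy 5-bin contact matrix (made-up counts): bins 0-2 interact strongly,
# as do bins 3-4, with only weak contacts between the two groups.
toy_contacts = [
    [0, 9, 7, 1, 0],
    [9, 0, 8, 0, 1],
    [7, 8, 0, 2, 0],
    [1, 0, 2, 0, 6],
    [0, 1, 0, 6, 0],
]

toy_graph = contact_matrix_to_graph(toy_contacts, min_count=2)
```

In this toy graph, the two densely connected groups of bins are the kind of structure a community detection method would be expected to recover as structural domains.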

We used the multi-scale community detection (MSCD) algorithm based on spectral graph wavelets, which was previously described and benchmarked against other multi-scale community mining methods from the literature (Tremblay and Borgnat, 2014).

The purpose of detecting communities at different scales using graph wavelets rather than, say, cutting a hierarchical clustering at different levels, is to fit the data as closely as possible. Cutting a hierarchical clustering imposes a hierarchical structure on the set of communities obtained at the different scales (cutting levels). When using wavelets, we do not assume beforehand that the data have a hierarchical structure: a community at a coarse scale does not necessarily have to contain communities found at a finer scale.
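
The difference can be made concrete with a small check of whether a fine-scale partition is nested inside a coarse-scale one. This is only an illustrative sketch: the partitions below are made up, and the helper `is_nested` is a hypothetical name that is not part of the MSCD toolbox.

```python
def is_nested(fine, coarse):
    """Return True if every fine community lies entirely inside one coarse community."""
    return all(any(f <= c for c in coarse) for f in fine)


# Made-up partitions of six loci (0..5), given as lists of sets.
fine_scale = [{0, 1}, {2, 3}, {4, 5}]

# A coarse partition obtained by merging fine communities, as cutting a
# hierarchical clustering at a higher level would produce.
coarse_hierarchical = [{0, 1, 2, 3}, {4, 5}]

# A coarse partition that splits the fine community {2, 3}; a method that
# treats each scale independently is free to return such a result.
coarse_non_nested = [{0, 1, 2}, {3, 4, 5}]

nested_case = is_nested(fine_scale, coarse_hierarchical)
non_nested_case = is_nested(fine_scale, coarse_non_nested)
```

Here `nested_case` is `True` and `non_nested_case` is `False`: the second coarse partition could never be obtained by cutting a single hierarchy that also yields `fine_scale`.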

Source code

The latest version of the MSCD MATLAB toolbox is available on Nicolas Tremblay's website.

The version of the MSCD code used for the publication, along with a wrapper MATLAB script that reproduces the full analysis pipeline described in the publication, is available for download. See the file 'README.txt' in the ZIP archive.

