PhD Student, GenScale team, IRISA (UMR CNRS 6074), Rennes, FRANCE
CARNAC-LR howto and experiments from manuscript
The scripts require C++11 and Python 3. CARNAC-LR is available at https://github.com/kamimrcht/CARNAC-LR
Experiment 1 - ring of cliques
The ring-of-cliques input file can be found here.
./CARNAC-LR/CARNAC-LR -f input_K7.txt
CARNAC-LR outputs this file: carnac_clique_ring.txt.
Each line of the output file is a cluster of 7 nodes that form a clique in the original file. CARNAC-LR successfully retrieves the 30 cliques, which are the 30 communities of this toy example.
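To make the toy example concrete, here is a minimal Python sketch that builds a ring of 30 cliques of 7 nodes in memory and checks that a candidate cluster is a clique. The node numbering, the way consecutive cliques are linked (a single edge between neighbours), and the helper names ring_of_cliques and is_clique are assumptions made for illustration. They do not describe the format of input_K7.txt or the internals of CARNAC-LR.

```python
from itertools import combinations


def ring_of_cliques(n_cliques=30, clique_size=7):
    """Build adjacency sets for a ring of cliques (assumed layout)."""
    adj = {}
    # Fully connect the nodes inside each clique.
    for c in range(n_cliques):
        members = range(c * clique_size, (c + 1) * clique_size)
        for u, v in combinations(members, 2):
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    # Assumption: one edge joins the last node of each clique
    # to the first node of the next clique, closing the ring.
    for c in range(n_cliques):
        u = c * clique_size + clique_size - 1
        v = ((c + 1) % n_cliques) * clique_size
        adj[u].add(v)
        adj[v].add(u)
    return adj


def is_clique(adj, nodes):
    """Return True if every pair of nodes is connected by an edge."""
    return all(v in adj[u] for u, v in combinations(nodes, 2))


adj = ring_of_cliques()
expected = [list(range(c * 7, (c + 1) * 7)) for c in range(30)]
# Every expected community should be a clique in the graph.
print(all(is_clique(adj, cluster) for cluster in expected))
```

The same is_clique check could be applied to each cluster read from a clustering output, once its lines are parsed into node identifiers.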
Experiment 2 - comparison to state of the art
10K reads from chromosome 1 were extracted from the full mouse transcriptome sequencing data.
The "ground truth" clusters, which we deduced from mapping results on the mouse reference genome GRCm38, are accessible here.
Minimap and clustering
Minimap version 0.2 is used to compute overlaps between reads:
minimap -Sw2 -L100 -t10 reads_10k.fa reads_10k.fa > minimap_10k_chr1.paf
The PAF file is converted to an input graph:
python3 ~/CARNAC/scripts/paf_to_CARNAC.py minimap_10k_chr1.paf input_graph_10k_chr1.txt
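As an illustration of what turning overlaps into a graph involves, the sketch below reads PAF records (tab-separated, with the query name in column 1 and the target name in column 6) and builds an undirected adjacency structure between reads. It is not the paf_to_CARNAC.py script, and it does not reproduce the input format CARNAC-LR expects. The function name paf_to_adjacency and the two example records are made up for the demonstration.

```python
from collections import defaultdict


def paf_to_adjacency(lines):
    """Build an undirected read-overlap graph from PAF lines."""
    adj = defaultdict(set)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip malformed lines (PAF has 12 mandatory columns)
        query, target = fields[0], fields[5]
        if query == target:
            continue  # ignore self-overlaps
        adj[query].add(target)
        adj[target].add(query)
    return adj


# Hypothetical PAF records, for illustration only.
example = [
    "read1\t1000\t0\t900\t+\tread2\t1200\t100\t1000\t800\t900\t255",
    "read2\t1200\t0\t500\t-\tread3\t900\t400\t900\t450\t500\t255",
]
graph = paf_to_adjacency(example)
for read in sorted(graph):
    print(read, " ".join(sorted(graph[read])))
```

On a real file, the lines would come from open("minimap_10k_chr1.paf") instead of the example list.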
Sources for the CPM approach are here.
Sources for the modularity and Louvain approaches are here.
CARNAC-LR command line:
./CARNAC-LR/CARNAC-LR -f input_graph_10k_chr1.txt
Results command lines
python3 validate_jaccard.py ground_truth_file clustering_output_file
./validate_clustering ground_truth_file clustering_output_file
python3 validate_jaccard.py clusters_10k_chr1.truth carnac_10k_chr1.txt
./validate_clustering clusters_10k_chr1.truth carnac_10k_chr1.txt
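For readers unfamiliar with comparing a clustering to a ground truth, here is a minimal sketch of a pair-counting Jaccard index: the number of read pairs grouped together in both clusterings, divided by the number of pairs grouped together in at least one of them. This is one common definition and is only assumed to be close in spirit to what validate_jaccard.py reports. The function names and the toy clusters are illustrative.

```python
from itertools import combinations


def co_clustered_pairs(clusters):
    """Return the set of unordered pairs that share a cluster."""
    pairs = set()
    for cluster in clusters:
        for u, v in combinations(sorted(cluster), 2):
            pairs.add((u, v))
    return pairs


def jaccard_index(truth, predicted):
    """Pair-counting Jaccard index between two clusterings."""
    t = co_clustered_pairs(truth)
    p = co_clustered_pairs(predicted)
    union = t | p
    # Two clusterings made only of singletons agree trivially.
    return len(t & p) / len(union) if union else 1.0


# Toy clusterings, for illustration only.
truth = [["r1", "r2", "r3"], ["r4", "r5"]]
predicted = [["r1", "r2"], ["r3"], ["r4", "r5"]]
print(jaccard_index(truth, predicted))  # 0.5
```

Enumerating all pairs is quadratic in the cluster size, which is acceptable for a sketch but may be slow for very large clusters.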