In mahdieh1/PeakCNV: A Multi-Feature Ranking Algorithm-based tool for Genome-wide Copy Number Variation-association study

PeakCNV

PeakCNV consists of three steps: (1) building CNVRs; (2) clustering CNVRs; (3) selection. PeakCNV first computes the genome-wide probability of each base pair being deleted or duplicated in case and control samples using Fisher's exact test, which identifies statistically significant CNVRs. It then groups the statistically significant CNVRs into clusters based on two combined features. The first feature (importance) is the number of samples in each region after removing the samples shared between each pair of CNVRs. The second is the distance between CNVRs. Finally, the best CNVR(s) from each cluster are reported as the representative(s) of that cluster.

Installation
PeakCNV analysis workflow
Details of intermediate functions and file formats
Arguments
How to use
Reference
Author Info
Acknowledgements
License

PeakCNV runs on R (version 3.2.0 or later). Install the package from GitHub using the following R commands.

install.packages("devtools")
library(devtools)
install_github("mahdieh1/PeakCNV")

Alternatively, the package may be downloaded from GitHub and installed in R:

# Clone/download PeakCNV into the current working directory with the following command:
git clone https://github.com/mahdieh1/PeakCNV.git

The full PeakCNV workflow includes the following 3 steps:

1. Building CNVRs
2. Clustering process
3. Selection process

By default, the PeakCNV() function runs the entire workflow. However, it is possible to run specific steps of the workflow by specifying the run.to parameter (see the full documentation at https://mahdieh1.github.io/PeakCNV/).

| Step | run.to |
| :---: | :---: |
| 1 | 1 |
| 2 | 2 |
| 3 | 3 |

PeakCNV builds deletion and duplication CNVR maps for cases and controls by merging the genomic coordinates of deletion-type and duplication-type CNVs, respectively. Then, for each base pair of these maps, PeakCNV estimates the probability of being deleted or duplicated in cases versus controls using a one-tailed Fisher's exact test. The output of this step is the set of CNVRs that are significantly more frequently duplicated or deleted in case or control individuals.

PeakCNV(1)
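The two ideas in this step, merging overlapping CNVs into regions and testing case versus control counts, can be sketched in base R. This is only an illustration under stated assumptions, not PeakCNV's internal code: the toy coordinates, the carrier counts, and the helper `merge_intervals()` are all hypothetical, and the test is applied to one region rather than to every base pair.

```r
# Hypothetical toy data: deletion CNVs on a single chromosome.
cnvs <- data.frame(
  start = c(100, 150, 400, 420),
  end   = c(200, 250, 500, 450)
)

# Assumed helper (not a PeakCNV function): merge overlapping intervals.
merge_intervals <- function(df) {
  df <- df[order(df$start, df$end), ]
  # A new region begins when a start lies beyond the running maximum end.
  running_end <- cummax(df$end)
  new_region <- c(TRUE, df$start[-1] > running_end[-nrow(df)])
  region_id <- cumsum(new_region)
  data.frame(
    start = as.numeric(tapply(df$start, region_id, min)),
    end   = as.numeric(tapply(df$end, region_id, max))
  )
}

regions <- merge_intervals(cnvs)

# Hypothetical carrier counts for one merged region.
case_with <- 15; case_total <- 100
ctrl_with <- 3;  ctrl_total <- 100
counts <- matrix(
  c(case_with, case_total - case_with,
    ctrl_with, ctrl_total - ctrl_with),
  nrow = 2, byrow = TRUE,
  dimnames = list(group = c("case", "control"), cnv = c("yes", "no"))
)

# One-tailed Fisher's exact test: is the CNV more frequent in cases?
res <- fisher.test(counts, alternative = "greater")
res$p.value
```

Using `alternative = "greater"` tests enrichment in cases only; swapping the rows of `counts` would test enrichment in controls instead.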

To identify CNVRs that lie close to each other and have a similar percentage of case-sample coverage, PeakCNV performs a clustering step. In this step, for each chromosome, PeakCNV prompts you for the eps value based on the k-nearest neighbors (knn) plot. The optimal value is at the elbow, where a sharp change in the distance occurs. For example, in the README's example knn plot, the optimal eps value is around a distance of 0.15.

PeakCNV(2)
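To make the elbow idea concrete, the following base R sketch computes the sorted k-nearest-neighbor distances that such a plot displays. It assumes a density-based clustering in the style of DBSCAN, which the eps terminology suggests but this README does not confirm. The feature matrix, the choice `k = 4`, and all variable names are hypothetical and are not taken from PeakCNV.

```r
set.seed(1)

# Hypothetical 2-D features for CNVRs on one chromosome
# (for example, scaled position and importance); two tight groups.
features <- rbind(
  matrix(rnorm(40, mean = 0, sd = 0.05), ncol = 2),
  matrix(rnorm(40, mean = 1, sd = 0.05), ncol = 2)
)

k <- 4
d <- as.matrix(dist(features))

# Distance from each point to its k-th nearest neighbour.
# Index k + 1 is used because the first sorted entry is the point itself.
knn_dist <- apply(d, 1, function(row) sort(row)[k + 1])
sorted_knn <- sort(knn_dist)

# Draw the curve only in an interactive session; the elbow is where
# the sorted distances start to rise sharply.
if (interactive()) {
  plot(sorted_knn, type = "l",
       xlab = "Points sorted by distance",
       ylab = paste0(k, "-NN distance"))
}
```

Reading eps off the elbow keeps points in dense neighborhoods together while leaving isolated points outside any cluster.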

PeakCNV selects the most independent CNVRs from each cluster.

PeakCNV(run.to = 3)

By default, PeakCNV runs in the current working directory and saves its results there, unless the user specifies otherwise. In the clustering step, for each chromosome, PeakCNV prompts you for the eps value based on the k-nearest neighbors (knn) plot. The optimal value is at the elbow, where a sharp change in the distance occurs. For more information about the results, see https://mahdieh1.github.io/PeakCNV/.

library("PeakCNV")
PeakCNV()

Download the test data sets from https://github.com/mahdieh1/PeakCNV/tree/main/test-data into your working directory. For these data sets, P-value i

Input files

Case CNVs (case.bed):

| Chr | Start | End | Sample-Id |
| :---: | :---: | :---: | :---: |
| 1 | 6742281 | 6742903 | SP7890 |

Control CNVs (control.bed):

| Chr | Start | End | Sample-Id |
| :---: | :---: | :---: | :---: |
| 1 | 6742281 | 6742903 | sa321 |

Please put the input files (case.bed and control.bed) in the working directory. If your CNV list contains chromosomes X or Y, please replace them with 23 and 24, respectively.
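The chromosome recoding can be done in R before the input files are saved. The sketch below uses a hypothetical data frame whose column names mirror the tables above. The sample IDs and coordinates are invented for illustration, and the delimiter and header layout PeakCNV expects on disk are not specified here, so the write step is left out.

```r
# Hypothetical CNV calls with sex chromosomes labelled X and Y.
case_cnvs <- data.frame(
  Chr       = c("1", "X", "Y"),
  Start     = c(6742281, 1500000, 2800000),
  End       = c(6742903, 1510000, 2815000),
  Sample.Id = c("SP7890", "SP7891", "SP7892"),
  stringsAsFactors = FALSE
)

# Recode X and Y to the numeric labels 23 and 24.
chr <- case_cnvs$Chr
chr[chr == "X"] <- "23"
chr[chr == "Y"] <- "24"
case_cnvs$Chr <- as.integer(chr)

case_cnvs
```

Apply the same recoding to the control CNVs so that both files use identical chromosome labels.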

Output files:

CNVRs (CNVRs.bed):

| Chr | Start | End |
| :---: | :---: | :---: |
| 1 | 6742281 | 6742903 |

Clustered CNVRs (clustering.txt):

| Chr | Start | End | #case | Cluster-NO |
| :---: | :---: | :---: | :---: | :---: |
| 1 | 6742281 | 6742903 | 15 | 1 |

Selected CNVRs (selection.txt):

| Score | #chr | Start | End | #case | Cluster-NO |
| :---: | :---: | :---: | :---: | :---: | :---: |
| 56 | 21 | 6742281 | 6742903 | 15 | 225.86 | 0 |

Please consider citing the following paper when you use this code.
Labani, M., Afrasiabi, A., Beheshti, A., Lovell, N. H., & Alinejad-Rokny, H. (2022). PeakCNV: A multi-feature ranking algorithm-based tool for genome-wide copy number variation-association study. Computational and Structural Biotechnology Journal, 20, 4975-4983.
https://www.sciencedirect.com/science/article/pii/S2001037022004068

I will be pleased to address any questions or concerns about the PeakCNV package. For queries, please email: mahdieh.labani@students.mq.edu.au

Mahdieh Labani

This work was funded by the UNSW Scientia Program Fellowship, the Australian Research Council Discovery Early Career Researcher Award (DECRA), a Macquarie PhD Scholarship, and an Australian Government Research Training Program (RTP) scholarship. Analyses were made possible with High Performance Computing resources provided by the BioMedical Machine Learning Lab with funding from the Australian Government and UNSW Sydney.

This package is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License, version 3, as published by the Free Software Foundation.

A copy of the GNU General Public License, version 3, is available at https://www.r-project.org/Licenses/GPL-3

mahdieh1/PeakCNV documentation built on June 23, 2024, 7:34 a.m.
