3 Data Input
➡️ This section contains two subpages: VCF and data.frame/genind/genlight, allowing you to upload various data types for analysis.
3.1 VCF
Required Dataset (one of the following):
VCF file from PLINK
VCF or gzipped VCF (vcf.gz) file from VCFtools
VCF file in RDS format from ShiNyP
The VCF file should contain chromosome and position information in the first two columns (#CHROM and POS), along with sample names and their genotypic information. For some whole genome sequencing (WGS) data, where SNP marker ID information is missing, ShiNyP will auto-generate the SNP ID names as #CHROM:POS, such as 2:12500, indicating chromosome 2, position 12500.
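The ID rule described above can be illustrated with a minimal Python sketch. It is not ShiNyP's own code: the function name `fill_snp_ids` and the example records are hypothetical, and the sketch assumes a tab-separated VCF in which a missing ID is written as `.` in the third column.

```python
def fill_snp_ids(vcf_lines):
    """Return VCF data rows with missing IDs replaced by 'CHROM:POS'.

    Hypothetical helper for illustration; assumes tab-separated records
    where column 1 is CHROM, column 2 is POS and column 3 is ID.
    """
    rows = []
    for line in vcf_lines:
        if line.startswith("#"):  # skip meta lines (##...) and the #CHROM header
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, snp_id = fields[0], fields[1], fields[2]
        if snp_id in (".", ""):  # "." marks a missing ID in VCF
            fields[2] = f"{chrom}:{pos}"
        rows.append(fields)
    return rows


# Made-up example records: one SNP without an ID, one with an ID.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "2\t12500\t.\tA\tG\t.\tPASS\t.\tGT\t0/0\t0/1",
    "2\t13000\trs42\tC\tT\t.\tPASS\t.\tGT\t1/1\t0/1",
]
rows = fill_snp_ids(example)
print([r[2] for r in rows])  # ['2:12500', 'rs42']
```

Existing IDs are left untouched; only records without an ID receive a generated one.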
Step 1: Input your VCF File
Browse and upload one VCF file.
If your VCF file is from VCFtools, please tick the ‘VCF file from VCFtools’ checkbox.
After the progress bar shows ‘Upload complete’, click the Input VCF File button.
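Before uploading, it can be useful to confirm locally that a file opens and that its header starts with #CHROM and POS. The following Python sketch is an optional pre-check and not part of ShiNyP; the helper names `open_vcf` and `vcf_sample_names` are hypothetical, and the sketch assumes the standard VCF layout in which sample names begin at the tenth column.

```python
import gzip
import os
import tempfile


def open_vcf(path):
    """Open a plain or gzipped VCF as text, detected by the gzip magic bytes."""
    with open(path, "rb") as fh:
        is_gz = fh.read(2) == b"\x1f\x8b"
    return gzip.open(path, "rt") if is_gz else open(path, "rt")


def vcf_sample_names(path):
    """Check the header line and return the sample names it lists."""
    with open_vcf(path) as fh:
        for line in fh:
            if line.startswith("##"):  # meta-information lines
                continue
            cols = line.rstrip("\n").split("\t")
            if cols[:2] != ["#CHROM", "POS"]:
                raise ValueError("expected #CHROM and POS as the first two columns")
            return cols[9:]  # samples follow the nine fixed columns
    raise ValueError("no header line found")


# Demo with a made-up header written to temporary plain and gzipped files.
header = (
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\n"
)
with tempfile.TemporaryDirectory() as tmp:
    plain = os.path.join(tmp, "demo.vcf")
    zipped = os.path.join(tmp, "demo.vcf.gz")
    with open(plain, "w") as fh:
        fh.write(header)
    with gzip.open(zipped, "wt") as fh:
        fh.write(header)
    names_plain = vcf_sample_names(plain)
    names_gz = vcf_sample_names(zipped)
print(names_plain, names_gz)
```

Detecting compression from the file's first two bytes rather than its extension means a renamed vcf.gz file is still read correctly.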
Or use our Demo Data
- Click the Use Demo Data button and select one species. Detailed descriptions of the demo datasets are available at https://reurl.cc/QEx5lZ.
Note: By default, the genotypic information for 5 samples and 10 SNPs is displayed in the interactive table.
Step 2: Transform to data.frame
If you have already input a VCF file in ShiNyP, click the Transform to data.frame button.
Download the data.frame file (in RDS format) and Site Info (in RDS format) so that you will not have to input the VCF file again; instead, you can upload the data.frame file.
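To give a feel for what a sample-by-SNP table derived from a VCF can look like, here is a minimal Python sketch. It is an illustration only: this guide does not state how ShiNyP encodes genotypes, so the 0/1/2 non-reference allele count used here is an assumption, and the names `gt_to_count` and `to_sample_table` are hypothetical.

```python
def gt_to_count(gt):
    """Count non-reference alleles in a diploid GT string; None if missing.

    ASSUMPTION: 0/1/2 allele-count coding, chosen for illustration only.
    """
    alleles = gt.split(":")[0].replace("|", "/").split("/")
    if "." in alleles or len(alleles) != 2:
        return None
    return sum(1 for a in alleles if a != "0")


def to_sample_table(rows, samples):
    """Turn VCF data rows (lists of fields) into {sample: {snp_id: count}}."""
    table = {s: {} for s in samples}
    for fields in rows:
        snp_id = fields[2]
        # Genotype columns start after the nine fixed VCF columns.
        for sample, gt in zip(samples, fields[9:]):
            table[sample][snp_id] = gt_to_count(gt)
    return table


# Made-up records for two samples and two SNPs.
rows = [
    ["2", "12500", "2:12500", "A", "G", ".", "PASS", ".", "GT", "0/0", "0/1"],
    ["2", "13000", "2:13000", "C", "T", ".", "PASS", ".", "GT", "1/1", "./."],
]
table = to_sample_table(rows, ["S1", "S2"])
print(table["S1"])  # {'2:12500': 0, '2:13000': 2}
print(table["S2"])  # {'2:12500': 1, '2:13000': None}
```

In this sketch samples are the rows and SNP IDs are the columns; missing calls such as ./. are kept as missing values rather than being counted as zero.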
Outputs:
VCF Data (RDS): VCF data stored in RDS format, which can be opened and read in the R environment.
data.frame (RDS): data.frame file. It’s necessary for downstream analyses; please download and save it!
Site Info. (RDS): SNP site information file. It’s necessary for downstream analyses; please download and save it!
Note: If your data is large (more than 1 GB), processing may take some time. Please be patient. The ShiNyP platform processes one task at a time (e.g., you must wait for the input process to finish before you can reset the data).
VCF Data Input!