3 Data Transform

➡️ This section allows you to convert SNP data from data.frame format into genlight, genind, and other formats.

Required Dataset (one of the following):

  • Input VCF Data (data.frame file) from the Data Input page.

  • Post-QC Data (data.frame file) from the Data QC page.


Step 1: Transform data.frame to genlight

Click the Transform to genlight button. This will generate the genlight file.

Note: After obtaining the clustering results from the Population Structure/DAPC subpage, you can add Group Info. to the genlight file by uploading ‘DAPC_Group_Info.csv’. This step is required for analyses such as ‘AMOVA’ and ‘OutFLANK’.


Output:

  • genlight (RDS): genlight file. It is required for downstream analyses, so please download and save it!

Note: Please download and save your data.frame and genlight files after transformation, so that you do not have to load the large VCF file again next time.


Step 2: Transform genlight to others

Select the desired data format to export from genlight and click the Transform button. This will generate the specified file.


Outputs:

  • genlight (RDS): genlight file with Group Info. It is required for downstream analyses, so please download and save it! Downloadable on ShiNyP.

  • genind (RDS): Input format for ShiNyP DAPC subpage, optimized for DAPC analysis to reduce computation time. Downloadable on ShiNyP.

The following transformed files are written to the path you specify.

  • PLINK (PED & MAP): Input format for the PLINK program, designed to perform a range of basic and large-scale SNP analyses.

  • GenAlEx (CSV): Input format for the GenAlEx program, which offers a wide range of population genetic analyses in Excel.

  • LEA (GENO & LFMM): Input format for the LEA R package, designed for population genomics, landscape genomics, and genotype-environment association tests.

  • GDS (GDS): Input format for the SNPRelate R package, designed for efficient SNP data analysis.

  • STRUCTURE (STR): Input format for the STRUCTURE program, used for inferring population structure.

  • fastStructure (STR): Input format for the fastStructure program, used for inferring population structure from large SNP data.

  • PHYLIP (TXT): Input format for the PHYLIP program, used for phylogenetic tree reconstruction and evolutionary analysis.

  • TreeMix (GZ): Input format for the TreeMix program, designed for modeling population splits and migration events.

  • BayeScan (TXT): Input format for BayeScan, used for detecting loci under selection.
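
To make one of the exported formats above more concrete, the sketch below shows how genotypes could be laid out as PLINK PED and MAP lines. It is an illustration only and not ShiNyP's own conversion code. The function name `to_ped_map`, the 0/1/2 coding as a count of the alternate allele, and reusing the sample ID as the family ID are all assumptions made for this example.

```python
# Illustrative sketch only (not ShiNyP's implementation).
# Assumption: genotypes are coded 0/1/2 = number of alternate alleles,
# and None marks a missing call.

def to_ped_map(samples, snps, genotypes):
    """Build PLINK PED and MAP lines.

    samples:   list of sample IDs
    snps:      list of (chrom, snp_id, position, ref, alt) tuples
    genotypes: one row per sample, one value per SNP (0, 1, 2 or None)
    """
    ped_lines = []
    for sample_id, row in zip(samples, genotypes):
        # PED columns 1-6: family ID, individual ID, father, mother,
        # sex (0 = unknown), phenotype (-9 = missing).
        # Assumption: the sample ID is reused as the family ID.
        fields = [sample_id, sample_id, "0", "0", "0", "-9"]
        for (_chrom, _snp_id, _pos, ref, alt), g in zip(snps, row):
            if g is None:
                fields += ["0", "0"]      # PLINK's missing-allele code
            elif g == 0:
                fields += [ref, ref]
            elif g == 1:
                fields += [ref, alt]
            elif g == 2:
                fields += [alt, alt]
            else:
                raise ValueError(f"unexpected genotype code: {g!r}")
        ped_lines.append(" ".join(fields))

    # MAP columns: chromosome, SNP ID, genetic distance (0 = unknown),
    # base-pair position.
    map_lines = [f"{chrom}\t{snp_id}\t0\t{pos}"
                 for chrom, snp_id, pos, _ref, _alt in snps]
    return ped_lines, map_lines


if __name__ == "__main__":
    ped, mp = to_ped_map(
        ["S1", "S2"],
        [("1", "rs1", 100, "A", "G"), ("1", "rs2", 200, "C", "T")],
        [[0, 1], [2, None]],
    )
    print("\n".join(ped))
    print("\n".join(mp))
```

Each PED line holds one sample, with two allele columns per SNP, and the MAP file lists the same SNPs in the same order. This is why the two files are always exported together.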

Data Transformation Complete!