8 Selection Sweep

➡️ This section contains four subpages: pcadapt, OutFLANK, IBS, and Manhattan Plot Plus, which let you detect selection signatures in different scenarios and customize your plots.

8.1 pcadapt

A PCA-based approach identifies selective outliers relative to population structure (Luu, Bazin, and Blum 2016).

Required Datasets:
  • data.frame
  • Site Info. (RDS) of the current data.frame, downloadable from Data Input or Data QC pages.
Steps:
  1. Upload Site Info. (required).

  2. Click SNP Thinning button (optional) and choose window size (number of SNPs) and r² threshold. For more information, visit https://bcm-uga.github.io/pcadapt/articles/pcadapt.html.

  3. Click the Run pcadapt button to perform a genome scan for selection.
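
The optional SNP thinning in step 2 is controlled by a window size (counted in SNPs) and an r² threshold. As a rough illustration of what these two parameters do, the following Python sketch performs a simple greedy thinning on toy data. It is not pcadapt's own implementation; the function names `r_squared` and `thin_snps`, the 0/1/2 genotype coding, and the toy genotypes are assumptions made for this example.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors (assumed 0/1/2 coding)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_x * var_y)


def thin_snps(genotypes, window_size, r2_threshold):
    """Greedily keep SNPs that are not in strong LD with nearby kept SNPs.

    genotypes: list of per-SNP genotype vectors, ordered along the genome.
    window_size: maximum index distance (in SNPs) at which two SNPs are compared.
    """
    kept = []
    for i, snp in enumerate(genotypes):
        in_ld = any(
            i - j <= window_size and r_squared(snp, genotypes[j]) > r2_threshold
            for j in kept
        )
        if not in_ld:
            kept.append(i)
    return kept


# Toy data: SNP 1 duplicates SNP 0; SNP 2 is only weakly correlated with SNP 0.
toy = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 2, 0, 0, 1, 1],
]
kept_indices = thin_snps(toy, window_size=2, r2_threshold=0.5)
print(kept_indices)
```

A larger window or a lower r² threshold removes more SNPs, which reduces the chance that principal components capture local linkage blocks rather than population structure.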

Outputs:
  • pcadapt p-value per site (RDS): A dataset containing p-values and adjusted p-values for each site.

  • pcadapt Manhattan Plot (PDF): A Manhattan plot visualizing the p-values per site across the genome. Significant SNPs are highlighted in red.

  • pcadapt QQ Plot (PDF): A QQ plot comparing the distribution of observed p-values to the expected distribution under the null hypothesis.

  • pcadapt Histogram of p-values (PDF): A histogram showing the distribution of p-values across all sites.

  • pcadapt Histogram of Test Statistics (PDF): A histogram showing the distribution of test statistics across all sites.

  • pcadapt Significant SNPs (CSV): A table listing SNPs identified as significant by pcadapt, including their site info., p-values, and adjusted p-values.
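
The outputs above report both p-values and adjusted p-values. This guide does not state which adjustment method is applied, so the following Python sketch assumes a Benjamini-Hochberg correction purely for illustration; the function name `bh_adjust`, the toy p-values, and the 0.05 cutoff are hypothetical.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


p = [0.001, 0.04, 0.03, 0.5]          # toy per-site p-values
adj = bh_adjust(p)
significant = [i for i, q in enumerate(adj) if q < 0.05]  # assumed cutoff
print(adj, significant)
```

In this toy example only the first site stays below the cutoff after adjustment, even though three raw p-values are below 0.05.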

pcadapt Complete!

8.2 OutFLANK

An Fst-based approach that detects selection signals by comparing genetic differentiation between defined group assignments (Whitlock and Lotterhos 2015). For more information, visit https://rpubs.com/lotterhos/outflank.

Required Datasets:
  • genind with ‘Group Info.’, downloadable from Data Conversion page after you have both the data.frame and Group Info.
  • Site Info. (RDS) of the current data.frame, downloadable from Data Input or Data QC pages.
Steps:
  1. Upload Site Info. (required).

  2. Click the Run OutFLANK button to perform a genome scan for selection.

Outputs:
  • OutFLANK p-value per site (RDS): A dataset containing p-values and adjusted p-values for each site.

  • OutFLANK Manhattan Plot (PDF): A Manhattan plot visualizing the p-values per site across the genome. Significant SNPs are highlighted in red.

  • OutFLANK QQ Plot (PDF): A QQ plot comparing the distribution of observed p-values to the expected distribution under the null hypothesis.

  • OutFLANK Histogram of p-values (PDF): A histogram showing the distribution of p-values across all sites.

  • OutFLANK Histogram of Fst (PDF): A histogram showing the distribution of Fst values across all sites.

  • OutFLANK Significant SNPs (CSV): A table listing SNPs identified as significant by OutFLANK, including their site info., Fst values, and p-values.
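
To give a feel for the Fst values reported above, here is a minimal Python sketch of a per-locus Fst between groups, computed as (H_T - H_S) / H_T from group allele frequencies. This is a textbook simplification and not the estimator OutFLANK uses internally; the function name `fst_biallelic` and the example frequencies are hypothetical.

```python
def fst_biallelic(freqs, sizes):
    """Per-locus Fst = (H_T - H_S) / H_T for a biallelic SNP.

    freqs: frequency of one allele in each group.
    sizes: number of sampled individuals in each group (used as weights).
    """
    total = sum(sizes)
    p_bar = sum(p * n for p, n in zip(freqs, sizes)) / total
    h_total = 2 * p_bar * (1 - p_bar)          # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # locus is fixed for the same allele in every group
    # Size-weighted mean of within-group expected heterozygosity.
    h_within = sum(2 * p * (1 - p) * n for p, n in zip(freqs, sizes)) / total
    return (h_total - h_within) / h_total


fst_same = fst_biallelic([0.5, 0.5], [10, 10])   # identical frequencies
fst_fixed = fst_biallelic([1.0, 0.0], [10, 10])  # alternative alleles fixed
fst_mid = fst_biallelic([0.8, 0.2], [10, 10])    # partial differentiation
print(fst_same, fst_fixed, fst_mid)
```

Loci whose Fst is unusually high relative to the bulk of the distribution are the candidates that an Fst-based outlier scan flags.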

OutFLANK Complete!

8.3 IBS (Identity By State)

An approach to detect differences in genomic regions between pairs of individuals, useful for identifying pedigree relationships.

Required Datasets:
  • data.frame
  • Site Info. (RDS) of the current data.frame, downloadable from Data Input or Data QC pages.
  • Chromosome Info. (CSV): Reference genome information of the current study. For more details about this file, refer to Section 4.3 (SNP Density).
Steps:
  1. Upload Site Info. (required).

  2. Upload Chromosome Info. (CSV) (required).

  3. Choose the reference and comparison samples.

  4. Select window size (kb) and step size (kb).

  5. To remove heterozygous SNPs from the reference sample, click the Remove heterozygous SNPs checkbox (optional).

  6. Click the Run IBS button to perform IBS analysis.

Outputs:
  • Chromosome Ideogram (PDF): An ideogram visualizing the IBS results, using a gradient palette to represent the differences across chromosomes.

  • Sliding Window Data (CSV): A sliding window dataset with IBS results, including SNP count, different SNPs, and the ratio of different SNPs per window.
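
The sliding window output described above (SNP count, different SNPs, and their ratio per window) can be sketched in Python as follows. This is an illustration under stated assumptions, not the app's code: the function name `ibs_windows`, the 0/1/2 genotype coding with 1 as heterozygous, the use of bp rather than kb, and the toy data are all hypothetical.

```python
def ibs_windows(positions, ref, other, chrom_length, window, step, drop_ref_het=False):
    """Compare two samples in sliding windows along one chromosome.

    positions: SNP positions (bp); ref/other: genotype codes of the reference
    and comparison samples at those positions (assumed 0/1/2, 1 = heterozygous).
    """
    rows = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)
        # SNPs inside (start, end], optionally skipping reference heterozygotes.
        idx = [
            i for i, pos in enumerate(positions)
            if start < pos <= end and not (drop_ref_het and ref[i] == 1)
        ]
        n_diff = sum(1 for i in idx if ref[i] != other[i])
        rows.append({
            "start": start,
            "end": end,
            "snp_count": len(idx),
            "diff_snps": n_diff,
            "ratio": n_diff / len(idx) if idx else None,  # None for empty windows
        })
        start += step
    return rows


positions = [100, 600, 900, 1200, 1900]
ref_sample = [0, 2, 1, 0, 2]
cmp_sample = [0, 0, 1, 2, 2]
windows = ibs_windows(positions, ref_sample, cmp_sample, 2000, window=1000, step=500)
windows_no_het = ibs_windows(positions, ref_sample, cmp_sample, 2000,
                             window=1000, step=500, drop_ref_het=True)
for row in windows:
    print(row)
```

A step size smaller than the window size produces overlapping windows, which smooths the ratio along the chromosome at the cost of more rows in the output.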

IBS Complete!

8.4 Manhattan Plot Plus

Customize your Manhattan plot based on the results from Genetic Diversity/Diversity Parameter, Selection Sweep/pcadapt, or Selection Sweep/OutFLANK.

Required Files:
  • Genetic Diversity per Site (Genetic_Diversity_per_Site.rds), pcadapt p-value per Site (pcadapt_p-value_per_site.rds), or OutFLANK p-value per Site (OutFLANK_p-value_per_site.rds).
  • Chromosome Info. (CSV): Reference genome information of the current study. For more details about this file, refer to Section 4.3 (SNP Density).
Steps:
  1. Upload genetic_diversity/pcadapt_pvalue/OutFLANK_pvalue per site (RDS).

  2. Upload Chromosome Info. (CSV).

  3. Click the Run Manhattan Plot button to generate the Manhattan plot.

  4. Customize the Manhattan plot and click the Run Manhattan Plot button again.

Outputs:
  • Manhattan Plot (PDF): A Manhattan plot with user-defined layout style and attributes.

Note: If generating a plot for p-values, make sure to use ‘-log10’ transformation for the Y axis.
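
The reason for the ‘-log10’ transformation is that small p-values become large positive numbers, so significant sites appear as peaks. A minimal Python sketch follows; the function name `neg_log10`, the floor value for zero p-values, and the Bonferroni-style threshold are assumptions for illustration and are not settings taken from the app.

```python
import math


def neg_log10(p_values, floor=1e-300):
    """Return -log10(p) for each p-value, flooring zeros to avoid a math error."""
    return [-math.log10(max(p, floor)) for p in p_values]


p_values = [0.05, 1e-8, 0.5]                      # toy per-site p-values
y = neg_log10(p_values)
# One common (assumed) choice for a significance line: Bonferroni at 0.05.
threshold = -math.log10(0.05 / len(p_values))
above_line = [i for i, value in enumerate(y) if value > threshold]
print(y, threshold, above_line)
```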

Manhattan Plot Plus Complete!

References

Luu, Keurcien, Eric Bazin, and Michael G. B. Blum. 2016. “Pcadapt: An R Package to Perform Genome Scans for Selection Based on Principal Component Analysis.” Molecular Ecology Resources 17 (1): 67–77. https://doi.org/10.1111/1755-0998.12592.
Whitlock, Michael C., and Katie E. Lotterhos. 2015. “Reliable Detection of Loci Responsible for Local Adaptation: Inference of a Null Model Through Trimming the Distribution of FST.” The American Naturalist 186 (S1): S24–36. https://doi.org/10.1086/682949.