Title: | Genomic Prediction of Cross Performance |
---|---|
Description: | This function performs genomic prediction of cross performance using genotype and phenotype data. It processes data in several steps including loading necessary software, converting genotype data, processing phenotype data, fitting mixed models, and predicting cross performance based on weighted marker effects. |
Authors: | Marlee Labroo, Christine Nyaga, Lukas Mueller |
Maintainer: | Christine Nyaga <[email protected]> |
License: | MIT + file LICENSE |
Version: | 0.1.0 |
Built: | 2024-11-07 05:14:53 UTC |
Source: | https://github.com/cmn92/gpcp |
This function performs genomic prediction of cross performance using genotype and phenotype data. It proceeds in several steps: loading the required packages, converting the genotype data, processing the phenotype data, fitting mixed models, and predicting cross performance from weighted marker effects.
runGPCP(phenotypeFile, genotypeFile, genotypes, traits, weights = NA, userSexes = "", userFixed = NA, userRandom = NA, Ploidy = NA, NCrosses = NA)
phenotypeFile |
A data frame containing phenotypic data, typically read from a CSV file. |
genotypeFile |
A file path to the genotypic data, either in VCF format or as a HapMap. |
genotypes |
A character string representing the column name in the phenotype file that corresponds to the genotype IDs. |
traits |
A string of comma-separated trait names from the phenotype file, which will be used for genomic prediction. |
weights |
A numeric vector specifying the weights for the traits. The order of weights should correspond to the order of traits. |
userSexes |
Optional. A string representing the column name in the phenotype file corresponding to the individuals' sexes. |
userFixed |
A string of comma-separated fixed effect variables from the phenotype file. If no fixed effects are required, set to NA. |
userRandom |
A string of comma-separated random effect variables from the phenotype file. If no random effects are required, set to NA. |
Ploidy |
An integer representing the ploidy level of the organism (e.g., 2, 4, 6). |
NCrosses |
An integer specifying the number of top-ranked crosses to output. The maximum is the number of crosses in a full diallel of the genotyped parents. |
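Because traits, userFixed, and userRandom are passed as comma-separated strings, it can help to check them against the phenotype data frame before calling runGPCP. The sketch below is only an illustration in base R: `split_arg` is a hypothetical helper, not a function exported by the package, and the toy data frame and its column values are invented for the example.

```r
# Hypothetical helper (not part of the package): split a comma-separated
# argument string into a trimmed character vector; NA means "none".
split_arg <- function(x) {
  if (length(x) == 1 && is.na(x)) return(character(0))
  trimws(strsplit(x, ",", fixed = TRUE)[[1]])
}

# Toy phenotype data frame; all values are invented.
pheno <- data.frame(
  Accession = c("A1", "A2", "A3"),
  LOC       = c("Ibadan", "Ibadan", "Abuja"),
  YIELD     = c(10.2, 11.5, 9.8),
  DMC       = c(30.1, 28.4, 33.0)
)

genotypes <- "Accession"
traits    <- "YIELD, DMC"
weights   <- c(3, 1)
userFixed <- "LOC"

trait_names <- split_arg(traits)
fixed_names <- split_arg(userFixed)

# Every referenced column should exist in the phenotype data frame,
# and there should be exactly one weight per trait.
missing_cols <- setdiff(c(genotypes, trait_names, fixed_names), names(pheno))
stopifnot(length(missing_cols) == 0)
stopifnot(length(weights) == length(trait_names))
```

Running a check like this first tends to give a clearer error than a failure deep inside model fitting.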
This function is designed for genomic prediction of cross performance and can handle both diploid and polyploid species. It processes genotype data, calculates genetic relationships, and fits mixed models using the 'sommer' package. It outputs the best predicted crosses based on user-defined traits and weights.
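The idea of ranking crosses by weighted marker effects can be illustrated with a deliberately simplified, additive-only sketch in base R. It is not the package's implementation: the dosage matrix, the marker effects, and the mid-parent formula are all assumptions made for the example, and the real function estimates marker effects with mixed models fitted through 'sommer'.

```r
# Toy allele dosages (0/1/2, diploid): 4 parents x 3 markers. Invented values.
dosage <- matrix(
  c(0, 1, 2,
    2, 1, 0,
    1, 1, 1,
    2, 2, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("P1", "P2", "P3", "P4"), c("m1", "m2", "m3"))
)

# Hypothetical additive marker effects: markers x traits.
effects <- matrix(
  c( 0.5, -0.2,
     0.1,  0.4,
    -0.3,  0.2),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("m1", "m2", "m3"), c("YIELD", "DMC"))
)

# Trait weights, in the same order as the trait columns.
weights <- c(YIELD = 3, DMC = 1)

# Collapse the per-trait effects into one weighted effect per marker.
weighted_effects <- drop(effects %*% weights)

# Index value of each parent under the weighted effects.
parent_value <- drop(dosage %*% weighted_effects)

# Enumerate all unordered parent pairs and score each cross by the
# mid-parent value (an additive-only simplification).
pairs <- t(combn(rownames(dosage), 2))
crosses <- data.frame(
  Parent1 = pairs[, 1],
  Parent2 = pairs[, 2],
  CrossPredictedMerit =
    unname(parent_value[pairs[, 1]] + parent_value[pairs[, 2]]) / 2
)

# Rank crosses from best to worst and keep the top NCrosses.
NCrosses <- 3
crosses <- crosses[order(crosses$CrossPredictedMerit, decreasing = TRUE), ]
top_crosses <- head(crosses, NCrosses)
```

In this toy example the weight of 3 on YIELD dominates the index, so parents carrying favourable YIELD alleles rise to the top of the ranking.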
A data frame containing predicted crosses with the following columns:
Parent1 |
First parent genotype ID. |
Parent2 |
Second parent genotype ID. |
CrossPredictedMerit |
Predicted merit of the cross. |
P1Sex |
Optional. Sex of the first parent if userSexes is provided. |
P2Sex |
Optional. Sex of the second parent if userSexes is provided. |
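The returned data frame can be post-processed with ordinary data frame operations. As a hedged sketch, the code below builds a toy stand-in with the columns listed above (all values and the sex codes "F" and "M" are invented) and keeps only crosses between parents of different recorded sex. Whether runGPCP already applies such a filter when userSexes is supplied is not stated here, so treat this purely as an illustration of working with the output.

```r
# Toy stand-in for the data frame returned by runGPCP(); values are invented.
finalcrosses <- data.frame(
  Parent1 = c("A1", "A1", "A2", "A3"),
  Parent2 = c("A2", "A3", "A4", "A4"),
  CrossPredictedMerit = c(4.2, 3.9, 3.1, 2.7),
  P1Sex = c("F", "F", "M", "M"),
  P2Sex = c("M", "M", "M", "F")
)

# Keep only crosses whose parents have different recorded sexes.
feasible <- finalcrosses[finalcrosses$P1Sex != finalcrosses$P2Sex, ]

# Rank by predicted merit, best first, and keep the top two.
feasible <- feasible[order(feasible$CrossPredictedMerit, decreasing = TRUE), ]
best <- head(feasible, 2)
```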
This function relies on the 'sommer', 'dplyr', and 'AGHmatrix' packages for processing mixed models and genomic data.
Marlee Labroo, Christine Nyaga, Lukas Mueller
Xiang, J., et al. (2016). "Mixed Model Methods for Genomic Prediction." Nature Genetics.
Batista, L., et al. (2021). "Genetic Prediction and Relationship Matrices." Theoretical and Applied Genetics.
# Load phenotype data from CSV
phenotypeFile <- read.csv("~/Documents/GCPC_input_files/2020_TDr_PHENO (1).csv")

# Genotype file path
genotypeFile <- "~/Documents/GCPC_input_files/genotypeFile.vcf"

# Define inputs
genotypes <- "Accession"
traits <- c("rAUDPC_YMV", "YIELD", "DMC")
weights <- c(0.2, 3, 1)
userFixed <- c("LOC", "REP")
Ploidy <- 2
NCrosses <- 150

# Run genomic prediction of cross performance
finalcrosses <- runGPCP(
  phenotypeFile = phenotypeFile,
  genotypeFile = genotypeFile,
  genotypes = genotypes,
  traits = paste(traits, collapse = ","),
  weights = weights,
  userFixed = paste(userFixed, collapse = ","),
  Ploidy = Ploidy,
  NCrosses = NCrosses
)

# View the predicted crosses
print(finalcrosses)