Journal article

A machine learning approach to predict the most and the least feed–efficient groups in beef cattle

Abstract
The present study evaluated three strategies for finding the optimum subset of DNA markers from the 50 K Illumina Bovine panel to classify beef cattle into the most and the least feed-efficient groups without using individual feed intake and performance measures. Residual feed intake (RFI) and 50 K single nucleotide polymorphism (SNP) genotype data of 4,057 beef animals from research and commercial herds were included. Initially, all cattle were ranked by their phenotypic RFI values. Different datasets were then created by selecting animals from the top and bottom 1%, 5%, 10%, 15%, 20%, 25%, 30%, 35%, 40%, and 45% ranges of the ranked RFI values. SNP subsets were formed by selecting the top-ranked SNPs contributing to the variance of RFI (strategy 1), selecting SNPs from the SNP subsets created in strategy 1 (strategy 2), and extracting SNPs from the full 50 K SNP set (strategy 3). Eleven machine learning (ML) algorithms were then employed to classify the most and the least feed-efficient groups using 260 datasets, generated by combining the ten RFI phenotype percentage groups with the 6, 18, and 2 SNP subsets of the first, second, and third strategies, respectively. Accuracy was high (>69%) for classifying animals in the 1% range for all ML algorithms under the three strategies and different SNP subsets. The linear Support Vector Machine algorithm applied to the 15 K SNPs obtained in the first strategy predicted the 1% most and least feed-efficient animals with an accuracy of 84%. In the second strategy, selecting 524 SNPs from the 15 K SNP subset outperformed the other strategies with an accuracy of 81% for 1% of the population using the Naive Bayes algorithm. It was concluded that a smaller number of SNPs (524) could be used to predict the most and the least feed-efficient animals with acceptable accuracy, reducing the cost of selection for RFI using genomic information.
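The dataset construction described in the abstract can be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the RFI values are synthetic, the helper name `extreme_groups` is hypothetical, and it assumes each percentage refers to each tail of the ranking (the abstract does not say whether 1% means per tail or in total). It ranks animals by RFI, labels the two tails, and checks that ten percentage groups combined with 6 + 18 + 2 SNP subsets give the 260 datasets reported.

```python
import random


def extreme_groups(rfi_by_animal, tail_fraction):
    """Return (low_rfi_ids, high_rfi_ids) taken from the two tails of the ranking.

    Assumption: tail_fraction applies to EACH tail separately.
    """
    if not 0 < tail_fraction <= 0.5:
        raise ValueError("tail_fraction must be in (0, 0.5]")
    ranked = sorted(rfi_by_animal, key=rfi_by_animal.get)  # ascending RFI
    n_tail = max(1, int(len(ranked) * tail_fraction))
    # Low RFI = eats less than expected = more feed-efficient.
    return ranked[:n_tail], ranked[-n_tail:]


# Synthetic stand-in for the 4,057 phenotypic RFI records (not real data).
rng = random.Random(0)
rfi = {f"animal_{i}": rng.gauss(0.0, 1.0) for i in range(4057)}

tail_fractions = [0.01, 0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.35, 0.40, 0.45]
# Number of SNP subsets per strategy, as stated in the abstract.
snp_subsets_per_strategy = {"strategy_1": 6, "strategy_2": 18, "strategy_3": 2}

# One labelled animal set per percentage group: 1 = most efficient, 0 = least.
labelled = {}
for frac in tail_fractions:
    most, least = extreme_groups(rfi, frac)
    labelled[frac] = {**{a: 1 for a in most}, **{a: 0 for a in least}}

# Each percentage group is paired with every SNP subset.
n_datasets = len(tail_fractions) * sum(snp_subsets_per_strategy.values())
print(n_datasets)
```

Each labelled set would then be joined with the genotypes of one SNP subset and passed to a classifier such as a linear SVM or Naive Bayes; that step is omitted here because it depends on the genotype data.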
Ghader Manafiazar. A machine learning approach to predict the most and the least feed–efficient groups in beef cattle[J]. Smart Agricultural Technology, 2023, 5: 100317