
Link: https://machinelearningmastery.com/feature-selection-with-the-caret-r-package/

Rank Features By Importance

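Feature importance can be estimated by building a model. Some methods, such as decision trees, have a built-in mechanism for reporting feature importance. For other methods, importance can be estimated with an ROC curve analysis of each feature.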
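As a rough illustration of the ROC-based approach, the sketch below scores each predictor on its own with caret's filterVarImp, which for a two-class outcome reports an area under the ROC curve per feature. Using the Pima Indians Diabetes data here, and the variable names x, y and roc_imp, are assumptions made for the example.

```r
# load the libraries and the dataset
library(mlbench)
library(caret)
data(PimaIndiansDiabetes)
# predictors (columns 1-8) and the two-class outcome (column 9)
x <- PimaIndiansDiabetes[, 1:8]
y <- PimaIndiansDiabetes[, 9]
# model-free importance: one ROC analysis per predictor
roc_imp <- filterVarImp(x = x, y = y)
# sort the predictors from highest to lowest score
roc_imp <- roc_imp[order(roc_imp[[1]], decreasing = TRUE), , drop = FALSE]
print(roc_imp)
```

Because no model is fitted, this only measures how well each feature separates the classes by itself; interactions between features are not captured.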

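The example below loads the Pima Indians Diabetes dataset and trains a Learning Vector Quantization (LVQ) model. varImp then estimates the importance of each feature, so the features can be ranked by importance. The code is as follows: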

# ensure results are repeatable
set.seed(7)
# load the library
library(mlbench)
library(caret)
# load the dataset
data(PimaIndiansDiabetes)
# prepare training scheme
control <- trainControl(method="repeatedcv", number=10, repeats=3)
# train the model
model <- train(diabetes~., data=PimaIndiansDiabetes, method="lvq", preProcess="scale", trControl=control)
# estimate variable importance
importance <- varImp(model, scale=FALSE)
# summarize importance
print(importance)
# plot importance
plot(importance)

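Going a step further, the Recursive Feature Elimination (RFE) method performs the feature selection itself. The selected subset still has to be judged for reasonableness in the end, using both statistics and the practical needs of the problem.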

# ensure the results are repeatable
set.seed(7)
# load the library
library(mlbench)
library(caret)
# load the data
data(PimaIndiansDiabetes)
# define the control using a random forest selection function
control <- rfeControl(functions=rfFuncs, method="cv", number=10)
# run the RFE algorithm
results <- rfe(PimaIndiansDiabetes[,1:8], PimaIndiansDiabetes[,9], sizes=c(1:8), rfeControl=control)
# summarize the results
print(results)
# list the chosen features
predictors(results)
# plot the results
plot(results, type=c("g", "o"))

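One caveat: for the correlation analysis section (Remove Redundant Features), I am not sure how the features should be handled once the analysis is done. It neither ranks the features nor returns a feature subset, so it is still unclear to me how this information should be used.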
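One possible way to act on the correlation analysis, offered only as a sketch: caret's findCorrelation returns the indices of the columns it suggests removing, and dropping those columns yields a feature subset. The cutoff of 0.5 and the names keep and reduced are assumptions for illustration, not something the original tutorial prescribes.

```r
# load the libraries and the dataset
library(mlbench)
library(caret)
data(PimaIndiansDiabetes)
# correlation matrix of the 8 predictors
correlationMatrix <- cor(PimaIndiansDiabetes[, 1:8])
# indices of predictors flagged as highly correlated (assumed cutoff: 0.5)
highlyCorrelated <- findCorrelation(correlationMatrix, cutoff = 0.5)
# keep the remaining predictors; setdiff stays safe if nothing is flagged
keep <- setdiff(seq_len(ncol(correlationMatrix)), highlyCorrelated)
# reduced dataset: kept predictors plus the outcome column
reduced <- PimaIndiansDiabetes[, c(keep, 9)]
print(names(reduced))
```

The reduced data frame could then be passed to train or rfe in place of the full data. Whether dropping a correlated feature actually helps still has to be checked against model performance.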


Feature Selection with the caret R Package
