Java教程

随机森林乳腺癌案例

本文主要是介绍随机森林乳腺癌案例,对大家解决编程问题具有一定的参考价值,需要的程序猿们随着小编来一起学习吧!

随机森林在乳腺癌数据上的调参

导入需要的库

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.ensemble import RandomForestClassifier
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

导入数据集,探索数据

cancer = load_breast_cancer()
cancer.data.shape
(569, 30)

进行一次简单的建模,看看模型本身在数据集上的效果

rfc = RandomForestClassifier(n_estimators=100, random_state=90)
score_pre = cross_val_score(rfc, cancer.data, cancer.target, cv=10).mean()
score_pre
0.9648809523809524

随机森林调整的第一步:无论如何先来调 n_estimators

score_ = []
for i in range(0,200,10):
    rfc = RandomForestClassifier(n_estimators=i+1, random_state=90)
    score = cross_val_score(rfc, cancer.data, cancer.target, cv=10).mean()
    score_.append(score)

查看 n_estimators(1-200)最高得分

print(max(score_), (score_.index(max(score_)) * 10) + 1)
0.9631265664160402 71

nestimators 学习曲线

plt.figure(figsize=(20, 8),dpi=80)
plt.plot(range(1,201,10), score_)
plt.show()

细化 n_estimators 取值

est_range = [*range(65,75)]
est_refinement = []
for i in est_range:
    rfc = RandomForestClassifier(n_estimators=i, random_state=90)
    score = cross_val_score(rfc, cancer.data, cancer.target, cv=10).mean()
    est_refinement.append(score)
print(max(est_refinement), est_range[est_refinement.index(max(est_refinement))])
0.9666353383458647 73

为网格搜索做准备,书写网格搜索的参数

"""
param_grid = {'n_estimators':np.arange(0, 200, 10)}
 
param_grid = {'max_depth':np.arange(1, 20, 1)}
    
param_grid = {'max_leaf_nodes':np.arange(25,50,1)}
    对于大型数据集,可以尝试从1000来构建,先输入1000,每100个叶子一个区间,再逐渐缩小范围
 
有一些参数是可以找到一个范围的,或者说我们知道他们的取值和随着他们的取值,模型的整体准确率会如何变化,这
样的参数我们就可以直接跑网格搜索
param_grid = {'criterion':['gini', 'entropy']}
 
param_grid = {'min_samples_split':np.arange(2, 2+20, 1)}
 
param_grid = {'min_samples_leaf':np.arange(1, 1+10, 1)}
    
param_grid = {'max_features':np.arange(5,30,1)} 
 
"""
"\nparam_grid = {'n_estimators':np.arange(0, 200, 10)}\n \nparam_grid = {'max_depth':np.arange(1, 20, 1)}\n    \nparam_grid = {'max_leaf_nodes':np.arange(25,50,1)}\n    对于大型数据集,可以尝试从1000来构建,先输入1000,每100个叶子一个区间,再逐渐缩小范围\n \n有一些参数是可以找到一个范围的,或者说我们知道他们的取值和随着他们的取值,模型的整体准确率会如何变化,这\n样的参数我们就可以直接跑网格搜索\nparam_grid = {'criterion':['gini', 'entropy']}\n \nparam_grid = {'min_samples_split':np.arange(2, 2+20, 1)}\n \nparam_grid = {'min_samples_leaf':np.arange(1, 1+10, 1)}\n    \nparam_grid = {'max_features':np.arange(5,30,1)} \n \n"

开始按照参数对模型整体准确率的影响程度进行调参,首先调整 max_depth

param_grid = {'max_depth':np.arange(1, 20, 1)}
rfc = RandomForestClassifier(n_estimators=73, random_state=90)
grid = GridSearchCV(rfc, param_grid,cv=10)
grid.fit(cancer.data, cancer.target)
GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={'max_depth': array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
   18, 19])})</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-1" type="checkbox" ><label for="sk-estimator-id-1" class="sk-toggleable__label sk-toggleable__label-arrow">GridSearchCV</label><div class="sk-toggleable__content"><pre>GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;max_depth&#x27;: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
   18, 19])})</pre></div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-2" type="checkbox" ><label for="sk-estimator-id-2" class="sk-toggleable__label sk-toggleable__label-arrow">estimator: RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-3" type="checkbox" ><label for="sk-estimator-id-3" class="sk-toggleable__label sk-toggleable__label-arrow">RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div></div></div></div></div></div></div></div>

# 查看最好模型参数
grid.best_params_

{'max_depth': 8}

# 查看最好的分
grid.best_score_

0.9666353383458647

调整 max_features
param_grid = {'max_features':np.arange(5,30,1)} 
rfc = RandomForestClassifier(n_estimators=73, random_state=90)
grid = GridSearchCV(rfc, param_grid,cv=10)
grid.fit(cancer.data, cancer.target)

#sk-container-id-2 { color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1) }
#sk-container-id-2 pre { padding: 0 }
#sk-container-id-2 div.sk-toggleable { background-color: rgba(255, 255, 255, 1) }
#sk-container-id-2 label.sk-toggleable__label { cursor: pointer; display: block; width: 100%; margin-bottom: 0; padding: 0.3em; box-sizing: border-box; text-align: center }
#sk-container-id-2 label.sk-toggleable__label-arrow:before { content: "▸"; float: left; margin-right: 0.25em; color: rgba(105, 105, 105, 1) }
#sk-container-id-2 label.sk-toggleable__label-arrow:hover:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-2 div.sk-estimator:hover label.sk-toggleable__label-arrow:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-2 div.sk-toggleable__content { max-height: 0; max-width: 0; overflow: hidden; text-align: left; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-2 div.sk-toggleable__content pre { margin: 0.2em; color: rgba(0, 0, 0, 1); border-radius: 0.25em; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-2 input.sk-toggleable__control:checked~div.sk-toggleable__content { max-height: 200px; max-width: 100%; overflow: auto }
#sk-container-id-2 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before { content: "▾" }
#sk-container-id-2 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-2 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-2 input.sk-hidden--visually { border: 0; clip: rect(1px, 1px, 1px, 1px); height: 1px; margin: -1px; overflow: hidden; padding: 0; position: absolute; width: 1px }
#sk-container-id-2 div.sk-estimator { font-family: monospace; background-color: rgba(240, 248, 255, 1); border: 1px dotted rgba(0, 0, 0, 1); border-radius: 0.25em; box-sizing: border-box; margin-bottom: 0.5em }
#sk-container-id-2 div.sk-estimator:hover { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-2 div.sk-parallel-item::after { content: ""; width: 100%; border-bottom: 1px solid rgba(128, 128, 128, 1); flex-grow: 1 }
#sk-container-id-2 div.sk-label:hover label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-2 div.sk-serial::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: 0 }
#sk-container-id-2 div.sk-serial { display: flex; flex-direction: column; align-items: center; background-color: rgba(255, 255, 255, 1); padding-right: 0.2em; padding-left: 0.2em; position: relative }
#sk-container-id-2 div.sk-item { position: relative; z-index: 1 }
#sk-container-id-2 div.sk-parallel { display: flex; align-items: stretch; justify-content: center; background-color: rgba(255, 255, 255, 1); position: relative }
#sk-container-id-2 div.sk-item::before, #sk-container-id-2 div.sk-parallel-item::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: -1 }
#sk-container-id-2 div.sk-parallel-item { display: flex; flex-direction: column; z-index: 1; position: relative; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-2 div.sk-parallel-item:first-child::after { align-self: flex-end; width: 50% }
#sk-container-id-2 div.sk-parallel-item:last-child::after { align-self: flex-start; width: 50% }
#sk-container-id-2 div.sk-parallel-item:only-child::after { width: 0 }
#sk-container-id-2 div.sk-dashed-wrapped { border: 1px dashed rgba(128, 128, 128, 1); margin: 0 0.4em 0.5em; box-sizing: border-box; padding-bottom: 0.4em; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-2 div.sk-label label { font-family: monospace; font-weight: bold; display: inline-block; line-height: 1.2em }
#sk-container-id-2 div.sk-label-container { text-align: center }
#sk-container-id-2 div.sk-container { display: inline-block !important; position: relative }
#sk-container-id-2 div.sk-text-repr-fallback { display: none }GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;max_features&#x27;: array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
   22, 23, 24, 25, 26, 27, 28, 29])})</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-4" type="checkbox" ><label for="sk-estimator-id-4" class="sk-toggleable__label sk-toggleable__label-arrow">GridSearchCV</label><div class="sk-toggleable__content"><pre>GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;max_features&#x27;: array([ 5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21,
   22, 23, 24, 25, 26, 27, 28, 29])})</pre></div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-5" type="checkbox" ><label for="sk-estimator-id-5" class="sk-toggleable__label sk-toggleable__label-arrow">estimator: RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-6" type="checkbox" ><label for="sk-estimator-id-6" class="sk-toggleable__label sk-toggleable__label-arrow">RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div></div></div></div></div></div></div></div>

# 查看最好模型参数
grid.best_params_

{'max_features': 24}

# 查看最好的分
grid.best_score_

0.9666666666666668

调整 min_samples_leaf
param_grid = {'min_samples_leaf':np.arange(1, 1+10, 1)} 
rfc = RandomForestClassifier(n_estimators=73, random_state=90)
grid = GridSearchCV(rfc, param_grid,cv=10)
grid.fit(cancer.data, cancer.target)

#sk-container-id-3 { color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1) }
#sk-container-id-3 pre { padding: 0 }
#sk-container-id-3 div.sk-toggleable { background-color: rgba(255, 255, 255, 1) }
#sk-container-id-3 label.sk-toggleable__label { cursor: pointer; display: block; width: 100%; margin-bottom: 0; padding: 0.3em; box-sizing: border-box; text-align: center }
#sk-container-id-3 label.sk-toggleable__label-arrow:before { content: "▸"; float: left; margin-right: 0.25em; color: rgba(105, 105, 105, 1) }
#sk-container-id-3 label.sk-toggleable__label-arrow:hover:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-3 div.sk-estimator:hover label.sk-toggleable__label-arrow:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-3 div.sk-toggleable__content { max-height: 0; max-width: 0; overflow: hidden; text-align: left; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-3 div.sk-toggleable__content pre { margin: 0.2em; color: rgba(0, 0, 0, 1); border-radius: 0.25em; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-3 input.sk-toggleable__control:checked~div.sk-toggleable__content { max-height: 200px; max-width: 100%; overflow: auto }
#sk-container-id-3 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before { content: "▾" }
#sk-container-id-3 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-3 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-3 input.sk-hidden--visually { border: 0; clip: rect(1px, 1px, 1px, 1px); height: 1px; margin: -1px; overflow: hidden; padding: 0; position: absolute; width: 1px }
#sk-container-id-3 div.sk-estimator { font-family: monospace; background-color: rgba(240, 248, 255, 1); border: 1px dotted rgba(0, 0, 0, 1); border-radius: 0.25em; box-sizing: border-box; margin-bottom: 0.5em }
#sk-container-id-3 div.sk-estimator:hover { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-3 div.sk-parallel-item::after { content: ""; width: 100%; border-bottom: 1px solid rgba(128, 128, 128, 1); flex-grow: 1 }
#sk-container-id-3 div.sk-label:hover label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-3 div.sk-serial::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: 0 }
#sk-container-id-3 div.sk-serial { display: flex; flex-direction: column; align-items: center; background-color: rgba(255, 255, 255, 1); padding-right: 0.2em; padding-left: 0.2em; position: relative }
#sk-container-id-3 div.sk-item { position: relative; z-index: 1 }
#sk-container-id-3 div.sk-parallel { display: flex; align-items: stretch; justify-content: center; background-color: rgba(255, 255, 255, 1); position: relative }
#sk-container-id-3 div.sk-item::before, #sk-container-id-3 div.sk-parallel-item::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: -1 }
#sk-container-id-3 div.sk-parallel-item { display: flex; flex-direction: column; z-index: 1; position: relative; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-3 div.sk-parallel-item:first-child::after { align-self: flex-end; width: 50% }
#sk-container-id-3 div.sk-parallel-item:last-child::after { align-self: flex-start; width: 50% }
#sk-container-id-3 div.sk-parallel-item:only-child::after { width: 0 }
#sk-container-id-3 div.sk-dashed-wrapped { border: 1px dashed rgba(128, 128, 128, 1); margin: 0 0.4em 0.5em; box-sizing: border-box; padding-bottom: 0.4em; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-3 div.sk-label label { font-family: monospace; font-weight: bold; display: inline-block; line-height: 1.2em }
#sk-container-id-3 div.sk-label-container { text-align: center }
#sk-container-id-3 div.sk-container { display: inline-block !important; position: relative }
#sk-container-id-3 div.sk-text-repr-fallback { display: none }GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;min_samples_leaf&#x27;: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])})</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-7" type="checkbox" ><label for="sk-estimator-id-7" class="sk-toggleable__label sk-toggleable__label-arrow">GridSearchCV</label><div class="sk-toggleable__content"><pre>GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;min_samples_leaf&#x27;: array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])})</pre></div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-8" type="checkbox" ><label for="sk-estimator-id-8" class="sk-toggleable__label sk-toggleable__label-arrow">estimator: RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-9" type="checkbox" ><label for="sk-estimator-id-9" class="sk-toggleable__label sk-toggleable__label-arrow">RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div></div></div></div></div></div></div></div>

# 查看最好模型参数
grid.best_params_

{'min_samples_leaf': 1}

# 查看最好的分
grid.best_score_

0.9666353383458647

不懈努力,继续尝试 min_samples_split
param_grid = {'min_samples_split':np.arange(2, 2+20, 1)}
rfc = RandomForestClassifier(n_estimators=73, random_state=90)
grid = GridSearchCV(rfc, param_grid,cv=10)
grid.fit(cancer.data, cancer.target)

#sk-container-id-4 { color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1) }
#sk-container-id-4 pre { padding: 0 }
#sk-container-id-4 div.sk-toggleable { background-color: rgba(255, 255, 255, 1) }
#sk-container-id-4 label.sk-toggleable__label { cursor: pointer; display: block; width: 100%; margin-bottom: 0; padding: 0.3em; box-sizing: border-box; text-align: center }
#sk-container-id-4 label.sk-toggleable__label-arrow:before { content: "▸"; float: left; margin-right: 0.25em; color: rgba(105, 105, 105, 1) }
#sk-container-id-4 label.sk-toggleable__label-arrow:hover:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-4 div.sk-estimator:hover label.sk-toggleable__label-arrow:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-4 div.sk-toggleable__content { max-height: 0; max-width: 0; overflow: hidden; text-align: left; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-4 div.sk-toggleable__content pre { margin: 0.2em; color: rgba(0, 0, 0, 1); border-radius: 0.25em; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-4 input.sk-toggleable__control:checked~div.sk-toggleable__content { max-height: 200px; max-width: 100%; overflow: auto }
#sk-container-id-4 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before { content: "▾" }
#sk-container-id-4 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-4 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-4 input.sk-hidden--visually { border: 0; clip: rect(1px, 1px, 1px, 1px); height: 1px; margin: -1px; overflow: hidden; padding: 0; position: absolute; width: 1px }
#sk-container-id-4 div.sk-estimator { font-family: monospace; background-color: rgba(240, 248, 255, 1); border: 1px dotted rgba(0, 0, 0, 1); border-radius: 0.25em; box-sizing: border-box; margin-bottom: 0.5em }
#sk-container-id-4 div.sk-estimator:hover { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-4 div.sk-parallel-item::after { content: ""; width: 100%; border-bottom: 1px solid rgba(128, 128, 128, 1); flex-grow: 1 }
#sk-container-id-4 div.sk-label:hover label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-4 div.sk-serial::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: 0 }
#sk-container-id-4 div.sk-serial { display: flex; flex-direction: column; align-items: center; background-color: rgba(255, 255, 255, 1); padding-right: 0.2em; padding-left: 0.2em; position: relative }
#sk-container-id-4 div.sk-item { position: relative; z-index: 1 }
#sk-container-id-4 div.sk-parallel { display: flex; align-items: stretch; justify-content: center; background-color: rgba(255, 255, 255, 1); position: relative }
#sk-container-id-4 div.sk-item::before, #sk-container-id-4 div.sk-parallel-item::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: -1 }
#sk-container-id-4 div.sk-parallel-item { display: flex; flex-direction: column; z-index: 1; position: relative; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-4 div.sk-parallel-item:first-child::after { align-self: flex-end; width: 50% }
#sk-container-id-4 div.sk-parallel-item:last-child::after { align-self: flex-start; width: 50% }
#sk-container-id-4 div.sk-parallel-item:only-child::after { width: 0 }
#sk-container-id-4 div.sk-dashed-wrapped { border: 1px dashed rgba(128, 128, 128, 1); margin: 0 0.4em 0.5em; box-sizing: border-box; padding-bottom: 0.4em; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-4 div.sk-label label { font-family: monospace; font-weight: bold; display: inline-block; line-height: 1.2em }
#sk-container-id-4 div.sk-label-container { text-align: center }
#sk-container-id-4 div.sk-container { display: inline-block !important; position: relative }
#sk-container-id-4 div.sk-text-repr-fallback { display: none }GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;min_samples_split&#x27;: array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
   19, 20, 21])})</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-10" type="checkbox" ><label for="sk-estimator-id-10" class="sk-toggleable__label sk-toggleable__label-arrow">GridSearchCV</label><div class="sk-toggleable__content"><pre>GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;min_samples_split&#x27;: array([ 2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17, 18,
   19, 20, 21])})</pre></div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-11" type="checkbox" ><label for="sk-estimator-id-11" class="sk-toggleable__label sk-toggleable__label-arrow">estimator: RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-12" type="checkbox" ><label for="sk-estimator-id-12" class="sk-toggleable__label sk-toggleable__label-arrow">RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div></div></div></div></div></div></div></div>

# 查看最好模型参数
grid.best_params_

{'min_samples_split': 2}

# 查看最好的分
grid.best_score_

0.9666353383458647

最后尝试一下 criterion
param_grid = {'criterion':['gini', 'entropy']}
rfc = RandomForestClassifier(n_estimators=73, random_state=90)
grid = GridSearchCV(rfc, param_grid,cv=10)
grid.fit(cancer.data, cancer.target)

#sk-container-id-5 { color: rgba(0, 0, 0, 1); background-color: rgba(255, 255, 255, 1) }
#sk-container-id-5 pre { padding: 0 }
#sk-container-id-5 div.sk-toggleable { background-color: rgba(255, 255, 255, 1) }
#sk-container-id-5 label.sk-toggleable__label { cursor: pointer; display: block; width: 100%; margin-bottom: 0; padding: 0.3em; box-sizing: border-box; text-align: center }
#sk-container-id-5 label.sk-toggleable__label-arrow:before { content: "▸"; float: left; margin-right: 0.25em; color: rgba(105, 105, 105, 1) }
#sk-container-id-5 label.sk-toggleable__label-arrow:hover:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-5 div.sk-estimator:hover label.sk-toggleable__label-arrow:before { color: rgba(0, 0, 0, 1) }
#sk-container-id-5 div.sk-toggleable__content { max-height: 0; max-width: 0; overflow: hidden; text-align: left; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-5 div.sk-toggleable__content pre { margin: 0.2em; color: rgba(0, 0, 0, 1); border-radius: 0.25em; background-color: rgba(240, 248, 255, 1) }
#sk-container-id-5 input.sk-toggleable__control:checked~div.sk-toggleable__content { max-height: 200px; max-width: 100%; overflow: auto }
#sk-container-id-5 input.sk-toggleable__control:checked~label.sk-toggleable__label-arrow:before { content: "▾" }
#sk-container-id-5 div.sk-estimator input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-5 div.sk-label input.sk-toggleable__control:checked~label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-5 input.sk-hidden--visually { border: 0; clip: rect(1px, 1px, 1px, 1px); height: 1px; margin: -1px; overflow: hidden; padding: 0; position: absolute; width: 1px }
#sk-container-id-5 div.sk-estimator { font-family: monospace; background-color: rgba(240, 248, 255, 1); border: 1px dotted rgba(0, 0, 0, 1); border-radius: 0.25em; box-sizing: border-box; margin-bottom: 0.5em }
#sk-container-id-5 div.sk-estimator:hover { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-5 div.sk-parallel-item::after { content: ""; width: 100%; border-bottom: 1px solid rgba(128, 128, 128, 1); flex-grow: 1 }
#sk-container-id-5 div.sk-label:hover label.sk-toggleable__label { background-color: rgba(212, 235, 255, 1) }
#sk-container-id-5 div.sk-serial::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: 0 }
#sk-container-id-5 div.sk-serial { display: flex; flex-direction: column; align-items: center; background-color: rgba(255, 255, 255, 1); padding-right: 0.2em; padding-left: 0.2em; position: relative }
#sk-container-id-5 div.sk-item { position: relative; z-index: 1 }
#sk-container-id-5 div.sk-parallel { display: flex; align-items: stretch; justify-content: center; background-color: rgba(255, 255, 255, 1); position: relative }
#sk-container-id-5 div.sk-item::before, #sk-container-id-5 div.sk-parallel-item::before { content: ""; position: absolute; border-left: 1px solid rgba(128, 128, 128, 1); box-sizing: border-box; top: 0; bottom: 0; left: 50%; z-index: -1 }
#sk-container-id-5 div.sk-parallel-item { display: flex; flex-direction: column; z-index: 1; position: relative; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-5 div.sk-parallel-item:first-child::after { align-self: flex-end; width: 50% }
#sk-container-id-5 div.sk-parallel-item:last-child::after { align-self: flex-start; width: 50% }
#sk-container-id-5 div.sk-parallel-item:only-child::after { width: 0 }
#sk-container-id-5 div.sk-dashed-wrapped { border: 1px dashed rgba(128, 128, 128, 1); margin: 0 0.4em 0.5em; box-sizing: border-box; padding-bottom: 0.4em; background-color: rgba(255, 255, 255, 1) }
#sk-container-id-5 div.sk-label label { font-family: monospace; font-weight: bold; display: inline-block; line-height: 1.2em }
#sk-container-id-5 div.sk-label-container { text-align: center }
#sk-container-id-5 div.sk-container { display: inline-block !important; position: relative }
#sk-container-id-5 div.sk-text-repr-fallback { display: none }GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;criterion&#x27;: [&#x27;gini&#x27;, &#x27;entropy&#x27;]})</pre><b>In a Jupyter environment, please rerun this cell to show the HTML representation or trust the notebook. <br />On GitHub, the HTML representation is unable to render, please try loading this page with nbviewer.org.</b></div><div class="sk-container" hidden><div class="sk-item sk-dashed-wrapped"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-13" type="checkbox" ><label for="sk-estimator-id-13" class="sk-toggleable__label sk-toggleable__label-arrow">GridSearchCV</label><div class="sk-toggleable__content"><pre>GridSearchCV(cv=10,
         estimator=RandomForestClassifier(n_estimators=73, random_state=90),
         param_grid={&#x27;criterion&#x27;: [&#x27;gini&#x27;, &#x27;entropy&#x27;]})</pre></div></div></div><div class="sk-parallel"><div class="sk-parallel-item"><div class="sk-item"><div class="sk-label-container"><div class="sk-label sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-14" type="checkbox" ><label for="sk-estimator-id-14" class="sk-toggleable__label sk-toggleable__label-arrow">estimator: RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div><div class="sk-serial"><div class="sk-item"><div class="sk-estimator sk-toggleable"><input class="sk-toggleable__control sk-hidden--visually" id="sk-estimator-id-15" type="checkbox" ><label for="sk-estimator-id-15" class="sk-toggleable__label sk-toggleable__label-arrow">RandomForestClassifier</label><div class="sk-toggleable__content"><pre>RandomForestClassifier(n_estimators=73, random_state=90)</pre></div></div></div></div></div></div></div></div></div></div>

# 查看最好模型参数
grid.best_params_

{'criterion': 'gini'}

# 查看最好的分
grid.best_score_

0.9666353383458647

最好模型参数
rfc = RandomForestClassifier(n_estimators=73, 
                             random_state=90, 
                             max_features=24)
score = cross_val_score(rfc, cancer.data, cancer.target, cv=10).mean()
score

0.9666666666666668

最终提高准确率
score - score_pre

0.0017857142857143904

这篇关于随机森林乳腺癌案例的文章就介绍到这儿,希望我们推荐的文章对大家有所帮助,也希望大家多多支持为之网!