Random forest classification algorithm in Python: implementation code

from random import sample, random, choice
from math import log
from utils import run_time
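
The listing imports `run_time` from a local `utils` module that is not included in this post. As a rough stand-in, and assuming the decorator only reports how long the wrapped function takes, a `utils.py` might look something like the sketch below. The body of `run_time` and the `demo` function are assumptions for illustration, not the original helper.

```python
import functools
import time


def run_time(fn):
    """Hypothetical stand-in for utils.run_time: print how long fn takes."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        elapsed = time.perf_counter() - start
        print("%s took %.3f s" % (fn.__name__, elapsed))
        return result
    return wrapper


@run_time
def demo(n):
    # Sum of the first n non-negative integers, only to exercise the decorator
    return sum(range(n))
```

Because the wrapper returns the wrapped function's result unchanged, decorating `main` with it should not alter what the script computes.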

class Node(object):
    def __init__(self, size):
        """Node class used to build the tree.

        Arguments:
            size {int}: Node size (number of samples in the node)
        """
        # Node size
        self.size = size
        # Feature to split on
        self.split_feature = None
        # Split point
        self.split_point = None
        # Left child node
        self.left = None
        # Right child node
        self.right = None

class IsolationTree(object):
    def __init__(self, X, n_samples, max_depth):
        """Isolation tree class.

        Arguments:
            X {list}: 2d list with int or float
            n_samples {int}: Subsample size
            max_depth {int}: Maximum depth of the isolation tree
        """
        self.height = 0
        # In case n_samples is greater than n
        n = len(X)
        if n_samples > n:
            n_samples = n
        # Root node
        self.root = Node(n_samples)
        # Build the isolation tree
        self._build_tree(X, n_samples, max_depth)

    def _get_split(self, X, idx, split_feature):
        """Randomly choose a split point.

        Arguments:
            X {list}: 2d list object with int or float
            idx {list}: 1d list object with int (row indices of X)
            split_feature {int}: Column index of X

        Returns:
            float: Split point, or None if the column cannot be split
        """
        # The split point should be greater than min(X[feature])
        unique = {X[i][split_feature] for i in idx}
        # Cannot split
        if len(unique) == 1:
            return None
        unique.remove(min(unique))
        x_min, x_max = min(unique), max(unique)
        # Caution: random() returns x in the interval [0, 1)
        return random() * (x_max - x_min) + x_min

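    def _build_tree(self, X, n_samples, max_depth):
        """Split the data space of the current node into two subspaces: rows
        less than the split point in the chosen dimension go to the left
        child, and rows greater than or equal to the split point go to the
        right child. New child nodes are built repeatedly until the data
        cannot be split or the nodes have reached the maximum depth.

        Arguments:
            X {list}: 2d list object with int or float
            n_samples {int}: Subsample size
            max_depth {int}: Maximum depth of the isolation tree
        """
        # Dataset shape
        m = len(X[0])
        n = len(X)
        # Randomly select sample points for the root node of the tree
        idx = sample(range(n), n_samples)
        # Queue of [depth, node, idx]
        que = [[0, self.root, idx]]
        # BFS
        while que and que[0][0] <= max_depth:
            depth, nd, idx = que.pop(0)
            # Stop splitting if X cannot be split
            nd.split_feature = choice(range(m))
            nd.split_point = self._get_split(X, idx, nd.split_feature)
            if nd.split_point is None:
                continue
            # Split
            idx_left = []
            idx_right = []
            while idx:
                i = idx.pop()
                xi = X[i][nd.split_feature]
                if xi < nd.split_point:
                    idx_left.append(i)
                else:
                    idx_right.append(i)
            # Generate left and right children
            nd.left = Node(len(idx_left))
            nd.right = Node(len(idx_right))
            # Put the left and right children into the queue, depth plus one
            que.append([depth + 1, nd.left, idx_left])
            que.append([depth + 1, nd.right, idx_right])
        # Update the height of the isolation tree
        self.height = depth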

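    def _predict(self, xi):
        """Auxiliary function of predict.

        Arguments:
            xi {list}: 1d list with int or float

        Returns:
            tuple: Depth and size of the leaf node that xi falls into
        """
        # Search the isolation tree until xi reaches a leaf node
        nd = self.root
        depth = 0
        while nd.left and nd.right:
            if xi[nd.split_feature] < nd.split_point:
                nd = nd.left
            else:
                nd = nd.right
            depth += 1
        return depth, nd.size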

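class IsolationForest(object):
    def __init__(self):
        """Isolation forest: randomly builds a number of isolation trees and
        averages the score from each tree.

        Attributes:
            trees {list}: 1d list with IsolationTree objects
            adjustment {float}: Adjustment computed for the subsample size
        """
        self.trees = None
        self.adjustment = None  # To be set in fit()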

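    def fit(self, X, n_samples=256, max_depth=10, n_trees=100):
        """Build the isolation forest from dataset X.

        Arguments:
            X {list}: 2d list with int or float

        Keyword Arguments:
            n_samples {int}: Subsample size; the paper sets it to 256
                (default: {256})
            max_depth {int}: Tree height limit (default: {10})
            n_trees {int}: Number of trees; the paper sets it to 100
                (default: {100})
        """
        # Cap the subsample size at the dataset size, as IsolationTree does,
        # so the adjustment matches the number of samples actually drawn
        n_samples = min(n_samples, len(X))
        self.adjustment = self._get_adjustment(n_samples)
        self.trees = [IsolationTree(X, n_samples, max_depth)
                      for _ in range(n_trees)]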

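    def _get_adjustment(self, node_size):
        """Calculate the adjustment according to the formula in the paper.

        Arguments:
            node_size {int}: Number of samples in the node

        Returns:
            float: Adjustment
        """
        if node_size > 2:
            i = node_size - 1
            ret = 2 * (log(i) + 0.5772156649) - 2 * i / node_size
        elif node_size == 2:
            ret = 1
        else:
            ret = 0
        return ret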

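    def _predict(self, xi):
        """Auxiliary function of predict.

        Arguments:
            xi {list}: 1d list object with int or float

        Returns:
            float: Score of xi (higher means more anomalous)
        """
        # Calculate the average score of xi over all trees
        score = 0
        n_trees = len(self.trees)
        for tree in self.trees:
            depth, node_size = tree._predict(xi)
            score += depth + self._get_adjustment(node_size)
        score = score / n_trees
        # Scale
        return 2 ** -(score / self.adjustment)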

    def predict(self, X):
        """Get the prediction of y.

        Arguments:
            X {list}: 2d list object with int or float

        Returns:
            list: 1d list object with float
        """
        return [self._predict(xi) for xi in X]

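@run_time
def main():
    print("Comparing the average score of X and the outlier's score...")
    # Generate a dataset randomly
    n = 100
    X = [[random() for _ in range(5)] for _ in range(n)]
    # Add an outlier
    X.append([10] * 5)
    # Train the model
    clf = IsolationForest()
    clf.fit(X, n_samples=500)
    # Show the result
    print("Average score is %.2f" % (sum(clf.predict(X)) / len(X)))
    print("Outlier's score is %.2f" % clf._predict(X[-1]))

if __name__ == "__main__":
    main()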
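
The scaling step at the end of `_predict` turns an average path length into a score of the form 2 ** -(mean depth / adjustment), where the adjustment comes from `_get_adjustment`. To get a feel for the numbers, here is a minimal standalone sketch of that arithmetic. The function names `expected_path_length` and `anomaly_score` are hypothetical and chosen for illustration; they mirror the listing's logic rather than reuse its classes.

```python
from math import log

EULER_GAMMA = 0.5772156649  # same constant as in _get_adjustment


def expected_path_length(n):
    """Adjustment term for a node holding n samples, as in _get_adjustment."""
    if n > 2:
        return 2 * (log(n - 1) + EULER_GAMMA) - 2 * (n - 1) / n
    if n == 2:
        return 1.0
    return 0.0


def anomaly_score(mean_depth, n_samples):
    """Scale a mean path length into a score; assumes n_samples >= 2."""
    return 2 ** -(mean_depth / expected_path_length(n_samples))


if __name__ == "__main__":
    c = expected_path_length(256)
    # Compare a very short path, an average path and a long path
    for depth in (2.0, c, 2 * c):
        print("mean depth %.2f -> score %.3f" % (depth, anomaly_score(depth, 256)))
```

A point whose mean depth equals the adjustment scores 0.5, and shorter paths push the score toward 1. That is why the outlier appended in `main` is expected to score noticeably higher than the dataset average.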
