Machine Learning: Logistic Regression in Practice

2021-07-05 14:03:43  Source: Internet

Tags: logic, machine, practice, dataset, train, test, import, model, sklearn


Purchase-intent prediction and other predictions

Today I used logistic regression to predict purchase intent.

The dataset has 4 features. Here we leave out ID and gender and use only two of them, age and income.
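To make the column selection concrete, here is a minimal sketch. It assumes the CSV columns are ordered ID, gender, age, income, label; the column names and the three rows below are invented for illustration and are not taken from the real file.

```python
import pandas as pd

# Hypothetical rows that mimic the assumed column layout:
# ID, gender, age, income, purchase label. All values are made up.
toy = pd.DataFrame({
    "User ID": [1, 2, 3],
    "Gender": ["Male", "Female", "Female"],
    "Age": [19, 35, 26],
    "EstimatedSalary": [19000, 20000, 43000],
    "Purchased": [0, 0, 1],
})

# Positions 2 and 3 (age, income) become the features;
# the slice 2:4 stops before position 4.
X_toy = toy.iloc[:, 2:4].values
# Position 4 is the label.
y_toy = toy.iloc[:, 4].values

print(X_toy.shape)  # (3, 2)
print(y_toy)        # [0 0 1]
```

Selecting by position keeps the code short, but it silently depends on the column order; selecting by column name would be more robust if the file layout changes.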

The code is as follows:

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the data: columns 2 and 3 (age, income) are the features, column 4 is the label
dataset = pd.read_csv('Social_Network_Ads.csv')
X = dataset.iloc[:, 2:4].values
Y = dataset.iloc[:, 4].values

# Hold out 20% of the rows as the test set
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size=0.2)

# Standardize: fit the scaler on the training set only, then apply it to the test set
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

# Train the logistic regression classifier
classifier = LogisticRegression()
classifier.fit(X_train, Y_train)

# Predict on the test set and report the accuracy
y_pred = classifier.predict(X_test)
acc = accuracy_score(Y_test, y_pred)
print("Accuracy:", acc)
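The split, scale, fit sequence can also be wrapped in a scikit-learn pipeline, which keeps the scaler from ever seeing the test rows and makes the whole thing one object. The sketch below is only an illustration: since Social_Network_Ads.csv is not reproduced here, it uses synthetic age and income values with an invented labelling rule, and it adds `random_state` and `stratify` (not in the original code) so the run is repeatable.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the age/income features (NOT the real dataset).
rng = np.random.default_rng(0)
age = rng.uniform(18, 60, size=400)
income = rng.uniform(15000, 150000, size=400)
X_demo = np.column_stack([age, income])

# Invented rule: older, higher-income users are more likely to buy, plus noise.
score = (age - 38) / 10 + (income - 70000) / 35000 + rng.normal(0, 0.5, size=400)
y_demo = (score > 0).astype(int)

# Fixed random_state and stratified split are assumptions added for repeatability.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo)

# The pipeline fits the scaler on the training rows only,
# then reuses those statistics when predicting on the test rows.
pipe = make_pipeline(StandardScaler(), LogisticRegression())
pipe.fit(X_tr, y_tr)

acc_demo = accuracy_score(y_te, pipe.predict(X_te))
print("accuracy:", acc_demo)
```

Without a fixed `random_state`, as in the code above it, the reported accuracy changes from run to run because the split is different each time.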

Here is another prediction example, on a different dataset:

# encoding: utf-8
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
 
dataset = pd.read_csv('dataset.csv', delimiter=',')
X = np.asarray(dataset.get(['x1', 'x2']))
y = np.asarray(dataset.get('y'))
 
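# Split into a training set and a test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5)

# Use sklearn's LogisticRegression as the model. penalty, solver and dual are among its more important parameters, and different settings can give different accuracy; for simplicity only solver is set here and the rest stay at their defaults. See the sklearn documentation for details.
model = LogisticRegression(solver='liblinear')

# Fit on the training set only, so the test set stays unseen
model.fit(X_train, y_train)

# Predict on the test set
predictions = model.predict(X_test)

# Print the accuracy
print('Test set accuracy:', accuracy_score(y_test, predictions))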
 
# Stack the intercept and coefficients into a column vector [w0, w1, w2]
weights = np.column_stack((model.intercept_, model.coef_)).transpose()

# Separate the training points by class for plotting
n = np.shape(X_train)[0]
xcord1 = []
ycord1 = []
xcord2 = []
ycord2 = []
for i in range(n):
    if int(y_train[i]) == 1:
        xcord1.append(X_train[i, 0])
        ycord1.append(X_train[i, 1])
    else:
        xcord2.append(X_train[i, 0])
        ycord2.append(X_train[i, 1])
fig = plt.figure()
ax = fig.add_subplot(111)
ax.scatter(xcord1, ycord1, s=30, c='red', marker='s')
ax.scatter(xcord2, ycord2, s=30, c='green')
# Decision boundary: w0 + w1 * x1 + w2 * x2 = 0, solved for x2
x_ = np.arange(-3.0, 3.0, 0.1)
y_ = (-weights[0] - weights[1] * x_) / weights[2]
ax.plot(x_, y_)
plt.xlabel('x1')
plt.ylabel('x2')
plt.show()
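The line drawn by the plotting code is the set of points where w0 + w1 * x1 + w2 * x2 = 0, that is, where the model's decision score is zero and the predicted probability is 0.5. A small check of that relationship is sketched below. Because dataset.csv is not available here, it uses synthetic two-feature data with an invented labelling rule; the variable names are only for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic two-feature data (an assumption standing in for dataset.csv).
rng = np.random.default_rng(0)
X_syn = rng.normal(size=(200, 2))
y_syn = (X_syn[:, 0] + X_syn[:, 1] > 0).astype(int)

clf = LogisticRegression(solver="liblinear").fit(X_syn, y_syn)
w0 = clf.intercept_[0]
w1, w2 = clf.coef_[0]

# Points on the line x2 = (-w0 - w1 * x1) / w2 ...
x1_line = np.linspace(-3.0, 3.0, 5)
x2_line = (-w0 - w1 * x1_line) / w2
boundary = np.column_stack([x1_line, x2_line])

# ... should get a decision score of about 0 and a probability of about 0.5.
scores = clf.decision_function(boundary)
probs = clf.predict_proba(boundary)[:, 1]
print(np.round(scores, 6))
print(np.round(probs, 6))
```

The formula divides by w2, so it only works when the second coefficient is not zero; a boundary that is vertical in the (x1, x2) plane would need to be drawn differently.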

3. Cancer prediction dataset in practice

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report

# Load the data
breast = load_breast_cancer()
print(breast)
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    breast.data, breast.target)

# Standardize the data (scaler fitted on the training set only)
std = StandardScaler()
X_train = std.fit_transform(X_train)
X_test = std.transform(X_test)

# Train and predict
lg = LogisticRegression()

lg.fit(X_train, y_train)

y_predict = lg.predict(X_test)

# Show the test-set accuracy and the classification report
print(lg.score(X_test, y_test))
print("="*20)
print(classification_report(
    y_test, y_predict, labels=[0, 1]))
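The precision and recall columns in the classification report come straight from the confusion matrix. As a rough sketch of that link, the example below recomputes them by hand for class 1 (in this dataset, target 0 is malignant and 1 is benign) and compares with scikit-learn's own functions. The `random_state` and `stratify` arguments and the pipeline are additions for repeatability, not part of the code above.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

data = load_breast_cancer()
# Fixed split so the numbers are the same on every run (an added assumption).
X_a, X_b, y_a, y_b = train_test_split(
    data.data, data.target, random_state=0, stratify=data.target)

model_bc = make_pipeline(StandardScaler(), LogisticRegression())
model_bc.fit(X_a, y_a)
pred_bc = model_bc.predict(X_b)

# Rows are true classes, columns are predicted classes, in the order [0, 1].
cm = confusion_matrix(y_b, pred_bc, labels=[0, 1])
tn, fp, fn, tp = cm.ravel()

# Precision: of everything predicted as class 1, how much really is class 1.
precision_1 = tp / (tp + fp)
# Recall: of everything that really is class 1, how much was found.
recall_1 = tp / (tp + fn)

print(cm)
print("precision (class 1):", precision_1)
print("recall (class 1):", recall_1)
```

For a cancer screen the recall on the malignant class (label 0 here) is usually the number to watch, since a missed malignant case costs more than a false alarm.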

Source: https://blog.csdn.net/SAINT0911/article/details/118488265
