Yishuo School District (31) | SPSS Statistical Analysis (41) Curve Regression Analysis


"Share interests, spread happiness, broaden your knowledge, and leave good memories!" Hello everyone, this is the editor. Welcome back to Xueyuan. We will do our best to bring you more and better content.

In the last issue we learned how to carry out multiple regression analysis. In practice, when the relationship between variables is nonlinear, the problem becomes considerably more complex. Nonlinear relationships between variables fall into two kinds: intrinsically linear and intrinsically nonlinear.

An intrinsically linear relationship is nonlinear in form (a quadratic curve, for example) but can be converted into a linear relationship by transforming the variables, so a linear model can still be built with linear regression analysis. An intrinsically nonlinear relationship is nonlinear in form and also cannot be converted into a linear relationship by any variable transformation, so no linear model can be built for it with linear regression analysis.
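To make the idea of an intrinsically linear relationship concrete, here is a minimal sketch in Python with NumPy (not part of the original SPSS workflow). It assumes a hypothetical exponential relationship y = a * exp(b * x) with made-up values a = 2.0 and b = 0.5. Taking logarithms gives ln(y) = ln(a) + b * x, which is linear in x, so ordinary linear regression recovers the parameters. A quadratic curve works the same way: treating x² as an extra variable makes the model linear in its coefficients.

```python
import numpy as np

# Hypothetical noiseless data from y = a * exp(b * x), with a = 2.0 and b = 0.5.
x = np.linspace(0.0, 4.0, 9)
y = 2.0 * np.exp(0.5 * x)

# Variable transformation: ln(y) = ln(a) + b * x is a straight line in x,
# so a degree-1 least-squares fit on (x, ln y) estimates b and ln(a).
slope, intercept = np.polyfit(x, np.log(y), 1)
a_hat = np.exp(intercept)
b_hat = slope

print(f"a = {a_hat:.3f}, b = {b_hat:.3f}")
```

A model such as y = b0 + b1 * exp(b2 * x) has no such transformation, which is what makes it intrinsically nonlinear.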

In practice, users often cannot tell in advance which functional model best matches the sample data. The general procedure for curve estimation in SPSS is as follows:

First, choose several candidate models from those available, based on the characteristics of the problem. Second, SPSS estimates the model parameters automatically and outputs statistics such as the F value and p-value of the significance test for the regression equation and the coefficient of determination R². Finally, choose the best model mainly on the basis of the coefficient of determination (the model with the largest R²) and use it for prediction.
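The three steps above can be sketched outside SPSS as follows. This is only an illustration in Python with NumPy, not what SPSS runs internally. The data are synthetic (generated from a made-up cubic trend plus noise), and the candidate list and the helper `r_squared` are assumptions chosen for the example.

```python
import numpy as np

def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Synthetic data (NOT the article's data): cubic trend plus random noise.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 13)
y = 5.0 + 0.2 * x**3 + rng.normal(0.0, 2.0, size=x.size)

# Step 1: choose candidate models, each written as columns of transformed x.
candidates = {
    "linear": np.column_stack([x]),
    "logarithmic": np.column_stack([np.log(x)]),
    "quadratic": np.column_stack([x, x**2]),
    "cubic": np.column_stack([x, x**2, x**3]),
}

# Step 2: estimate each model by least squares and record its R².
results = {}
for name, features in candidates.items():
    design = np.column_stack([np.ones_like(x), features])  # add the constant
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    results[name] = r_squared(y, design @ coef)

# Step 3: pick the model with the largest R².
best = max(results, key=results.get)
for name, r2 in results.items():
    print(f"{name:12s} R² = {r2:.4f}")
print("best model:", best)
```

One caution on the selection rule: among nested polynomial models, R² can never decrease when a higher-order term is added, so the largest R² is best read together with the significance test and a look at the fitted curve.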

Worked Example

Now let's work through a real case. The figure below shows national premium income and gross domestic product (GDP) for 1989 to 2001. The task is to study the relationship between premium income and GDP.

[Figure: data table of premium income and GDP, 1989 to 2001]

1

Analysis

Start with a scatter plot to check whether the two variables have a simple linear relationship. If they do, use simple linear regression. Otherwise, use curve estimation.

2

Organize the data

Define three variables: "year" (the year), "y" (premium income), and "x" (gross domestic product). Enter the data and save the file.

3

Draw a scatter plot to get a first look at the trend

The figure below shows that premium income y rises gradually as GDP x rises, and that once GDP passes a certain level, premium income grows noticeably faster. A linear regression model is therefore not appropriate for the relationship between x and y, and we should look for a model that fits better.

[Figure: scatter plot of premium income (y) against GDP (x)]
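The visual judgment from the scatter plot can be backed up with a rough numerical check. The sketch below, in Python with NumPy, is an assumption-laden illustration rather than an SPSS feature: it fits a straight line to synthetic convex data (made up for this example) and looks at the residuals. If a straight line is fitted to a trend that accelerates upward, it tends to under-predict at both ends and over-predict in the middle.

```python
import numpy as np

# Synthetic convex data (NOT the article's data): growth accelerates with x.
x = np.linspace(1.0, 10.0, 13)
y = 5.0 + 0.2 * x**3

# Fit a straight line and compute the residuals (observed minus fitted).
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (slope * x + intercept)

# Average the residuals over the low, middle, and high thirds of x.
low, middle, high = (float(np.mean(part)) for part in np.array_split(residuals, 3))

# Positive, negative, positive is the signature of an upward-bending curve.
curved = low > 0 and middle < 0 and high > 0
print("mean residuals (low, middle, high):", low, middle, high)
print("straight line looks inadequate:", curved)
```

This is only a crude screen. The scatter plot remains the main tool for judging the shape of the relationship.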

4

Run the curve regression

Choose the menu "Analyze -> Regression -> Curve Estimation", select all of the models, and configure the dialog as shown in the figure below. Then check the output to see which models fit better.

[Figure: Curve Estimation dialog settings]

Judged by the coefficient of determination (R²), the cubic curve fits best, and the significance value in its analysis of variance is reported as 0.000 (that is, p < 0.001). So repeat the procedure above, this time selecting only the "Cubic" model.

5

Main results and analysis

The figure below shows the model summary and parameter estimates table for the cubic model. The coefficient of determination is R² = 0.990 and the significance value is 0.000, so we can conclude that there is a fairly significant cubic relationship between premium income and GDP.

[Figure: model summary and parameter estimates for the cubic model]
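For readers who want to see how the summary statistics relate to each other, here is a minimal sketch in Python with NumPy. It uses synthetic data, not the premium and GDP figures, so the numbers will not match the SPSS output described above. It fits a cubic, computes R² and the F statistic of the regression, and predicts y at a new x. The point x_new = 5.5 is an arbitrary value inside the observed range.

```python
import numpy as np

# Synthetic data (NOT the article's data): cubic trend plus random noise.
rng = np.random.default_rng(1)
x = np.linspace(1.0, 10.0, 13)
y = 5.0 + 0.2 * x**3 + rng.normal(0.0, 2.0, size=x.size)

# Fit y = b0 + b1*x + b2*x^2 + b3*x^3 (np.polyfit returns the highest power first).
coeffs = np.polyfit(x, y, 3)
y_hat = np.polyval(coeffs, x)

n, p = x.size, 3                      # sample size, number of predictors
ss_res = np.sum((y - y_hat) ** 2)     # residual sum of squares
ss_tot = np.sum((y - y.mean()) ** 2)  # total sum of squares
r2 = 1.0 - ss_res / ss_tot

# F statistic of the regression: explained mean square over residual mean square.
f_stat = ((ss_tot - ss_res) / p) / (ss_res / (n - p - 1))

# Prediction at a new x value inside the observed range.
x_new = 5.5
y_new = float(np.polyval(coeffs, x_new))

print(f"R² = {r2:.4f}, F = {f_stat:.1f}, predicted y at x = {x_new}: {y_new:.2f}")
```

The same F value can be written as (R² / p) / ((1 - R²) / (n - p - 1)), which is why a very high R² on a small sample goes hand in hand with a very small significance value.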

Preview of the next issue: in this issue we learned how to carry out curve regression analysis in practice. In the next issue we will look at nonlinear regression analysis.

That's all for today. If you have your own thoughts on today's article, please leave us a message. See you tomorrow, and have a happy day!

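References: Baidu Baike; 《SPSS 23 统计分析实用教程》 (A Practical Tutorial on Statistical Analysis with SPSS 23)

Translation: Baidu Translate

This article is an original work by learningyard 新学苑. Some of the text and images come from other sources. If there is any infringement, please contact us for removal.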
