To review briefly: the basic goal of simple linear regression modeling is to find the straight line that best fits a set of paired X and Y values (the X and Y measurements) in a two-dimensional plane.
Once you have located that line using the least squares method, you can perform various statistical tests to determine how well the line fits the observed deviations in the y values. The linear equation (y = mx + b) has two parameters that must be estimated from the x and y data: the slope (m) and the y-intercept (b). Once these two parameters are estimated, you can plug observed values into the linear equation and look at the predicted y values it generates.

To estimate m and b by least squares, you need to find the values of m and b that minimize, over all values of X, the difference between the observed and predicted y values. The difference between an observed value and a predicted value is called the error (yi - (mxi + b)). If you square each error and then sum these residuals, the result is a number called the squared error of prediction. Using the least squares method to determine the line of best fit involves finding the estimates of m and b that minimize the squared error of prediction.

You can use two basic approaches to find the estimates of m and b that satisfy the least squares criterion. In the first approach, you use a numerical search procedure that proposes and evaluates different values of m and b, and finally settles on the estimates that produce the smallest squared error. The second approach uses calculus to derive equations for estimating m and b. I do not intend to go into the calculus involved in deriving these equations, but I did use these analytical equations in the SimpleLinearRegression class to find the least squares estimates of m and b (see the getSlope() and getYIntercept() methods of the SimpleLinearRegression class).

Even if you have equations for finding the least squares estimates of m and b, it does not follow that plugging these parameters into the linear equation gives a line that fits the data well. The next step in the simple linear regression process is to determine whether the remaining prediction error is acceptable.
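As a rough illustration of the analytical approach described above, the sketch below computes the least squares estimates of m and b from the standard closed-form equations and then the squared error of prediction. The function name leastSquaresFit() and the sample data are hypothetical; this is not the code of the SimpleLinearRegression class, whose internals are not shown here.

```php
<?php
// Hypothetical helper, not part of the SimpleLinearRegression class.
// Closed-form least squares estimates for the line y = m*x + b.
function leastSquaresFit(array $x, array $y): array
{
    $n = count($x);
    if ($n < 2 || $n !== count($y)) {
        throw new InvalidArgumentException('Need two equal-length arrays with at least 2 points.');
    }

    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    $sxy = 0.0; // sum of (xi - meanX) * (yi - meanY)
    $sxx = 0.0; // sum of (xi - meanX)^2
    for ($i = 0; $i < $n; $i++) {
        $dx   = $x[$i] - $meanX;
        $sxy += $dx * ($y[$i] - $meanY);
        $sxx += $dx * $dx;
    }
    if ($sxx == 0.0) {
        throw new InvalidArgumentException('All x values are identical; the slope is undefined.');
    }

    $m = $sxy / $sxx;          // slope estimate
    $b = $meanY - $m * $meanX; // y-intercept estimate

    // Squared error of prediction: sum of (yi - (m*xi + b))^2
    $sse = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $error = $y[$i] - ($m * $x[$i] + $b);
        $sse  += $error * $error;
    }

    return ['slope' => $m, 'intercept' => $b, 'sse' => $sse];
}

// Made-up sample data lying exactly on y = 2x + 1.
$fit = leastSquaresFit([1, 2, 3, 4], [3, 5, 7, 9]);
printf("y = %.2fx + %.2f (squared error = %.4f)\n", $fit['slope'], $fit['intercept'], $fit['sse']);
```

A numerical search would arrive at roughly the same m and b by evaluating the squared error for many candidate pairs; the closed-form equations get there in a single pass over the data.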
You can use a statistical decision procedure to reject the hypothesis that a straight line fits the data. This procedure is based on the value of the T statistic, and uses a probability function to evaluate the probability of observing a value that large by chance. As mentioned in Part 1, the SimpleLinearRegression class generates numerous summary values, and one of the most important is the T statistic, which measures how well the linear equation fits the data. If the fit is good, the T statistic tends to be large; if the T value is small, you should replace your linear equation with a default model that assumes the mean of the y values is the best predictor (because the mean of a set of values is often a useful predictor of the next observation).

To test whether the T statistic is large enough that the mean of the y values should not be used as the best predictor, you need to compute the probability of obtaining that T value by chance. If the probability is very low, you can reject the null hypothesis that the mean is the best predictor and, accordingly, be confident that a simple linear model is a good fit to the data. (See Part 1 for more on computing the probability of a T statistic value.)

Back to the statistical decision procedure. It tells you when to reject the null hypothesis, but it does not tell you whether to accept the alternative hypothesis. In a research setting, the linear-model alternative hypothesis needs to be established through theoretical and statistical arguments.

The data research tool you will build implements the statistical decision procedure for linear models (the T test) and provides summary data that can be used to construct the theoretical and statistical arguments needed to establish a linear model. The data research tool can be classified as a decision support tool for knowledge workers studying patterns in small data sets.
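To make the T statistic more concrete, here is a minimal sketch that assumes it is the usual regression quantity: the slope estimate divided by its standard error, with n - 2 degrees of freedom. The function name slopeTStatistic() and the sample data are hypothetical, and the article does not show how SimpleLinearRegression computes its own T value, so treat this as an illustration and not as the class's implementation.

```php
<?php
// Hypothetical helper, not part of the SimpleLinearRegression class.
// Assumes T = slope / standard error of the slope, with n - 2 degrees of freedom.
function slopeTStatistic(array $x, array $y): array
{
    $n = count($x);
    if ($n < 3 || $n !== count($y)) {
        throw new InvalidArgumentException('Need two equal-length arrays with at least 3 points.');
    }

    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    $sxx = 0.0; // sum of (xi - meanX)^2
    $sxy = 0.0; // sum of (xi - meanX) * (yi - meanY)
    for ($i = 0; $i < $n; $i++) {
        $dx   = $x[$i] - $meanX;
        $sxx += $dx * $dx;
        $sxy += $dx * ($y[$i] - $meanY);
    }
    if ($sxx == 0.0) {
        throw new InvalidArgumentException('All x values are identical; the slope is undefined.');
    }

    $m = $sxy / $sxx;          // least squares slope
    $b = $meanY - $m * $meanX; // least squares y-intercept

    // Squared error of prediction left over after fitting the line.
    $sse = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $error = $y[$i] - ($m * $x[$i] + $b);
        $sse  += $error * $error;
    }

    $df = $n - 2;
    $se = sqrt(($sse / $df) / $sxx); // standard error of the slope

    // Guard against division by zero when the line fits perfectly.
    $t = $se > 0.0 ? $m / $se : ($m == 0.0 ? 0.0 : INF);

    return ['t' => $t, 'df' => $df];
}

// Made-up sample data.
$result = slopeTStatistic([1, 2, 3, 4, 5], [2, 4, 5, 4, 5]);
printf("T = %.3f with %d degrees of freedom\n", $result['t'], $result['df']);
```

The T value and degrees of freedom are the two inputs a Student t probability function needs in order to return the probability used in the decision procedure.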
From a learning perspective, simple linear regression modeling is worth studying because it is the gateway to understanding more advanced forms of statistical modeling. For example, many core concepts in simple linear regression lay a good foundation for understanding multiple regression, factor analysis, and time series.

Simple linear regression is also a versatile modeling technique. By transforming the raw data (usually with a logarithmic or power transformation), you can use it to model curvilinear data. These transformations linearize the data so that you can model it with simple linear regression. The resulting linear model is expressed as a linear equation relating the transformed values.

Probability functions

In the previous article, I sidestepped the problem of implementing probability functions in PHP by obtaining the probability values by other means. I was not completely satisfied with this solution, so I began to research the question: what is needed to develop PHP-based probability functions?

I started by searching the Internet for information and code. One source of both is the probability functions in the book Numerical Recipes in C. I reimplemented some of the probability function code (the gammln.c and betai.c functions) in PHP, but I was still not satisfied with the results. The code seemed considerably longer than in some other implementations. In addition, I needed the inverse probability functions.

Fortunately, I stumbled upon John Pezzullo's Interactive Statistical Calculation pages. John's page on probability distribution functions has all the functions I needed, implemented in JavaScript for ease of learning. I ported the StudentT and FisherF functions to PHP. I changed the API a little to conform to the Java naming style, and embedded all the functions in a class named Distribution.
A great feature of this implementation is the doCommonMath method, which all the functions in this library reuse. The other tests that I did not bother to implement (the normality test and the chi-square test) also use the doCommonMath method.

Another aspect of this port is also worth noting. In JavaScript, you can assign dynamically computed values to instance variables, for example:

    var PiD2 = pi() / 2

PHP cannot do this. You can assign only simple constant values to an instance variable. Hopefully this shortcoming will be addressed in PHP 5. Note that the code in Listing 1 does not define any instance variables; this is because in the JavaScript version they are assigned their values dynamically.

Listing 1. Implementing the probability functions

    <?php
    // PHP Port and OO'fying by Paul Meager

    class Distribution {

      // Series summation shared by all the probability functions.
      function doCommonMath($q, $i, $j, $b) {
        $zz = 1;
        $z  = $zz;
        $k  = $i;
        while ($k <= $j) {
          $zz = $zz * $q * $k / ($k - $b);
          $z  = $z + $zz;
          $k  = $k + 2;
        }
        return $z;
      }

      // Probability of the T value $t with $df degrees of freedom.
      function getStudentT($t, $df) {
        $t  = abs($t);
        $w  = $t / sqrt($df);
        $th = atan($w);
        if ($df == 1) {
          return 1 - $th / (pi() / 2);
        }
        $sth = sin($th);
        $cth = cos($th);
        if (($df % 2) == 1) {
          return 1 - ($th + $sth * $cth * $this->doCommonMath($cth * $cth, 2, $df - 3, -1))
                 / (pi() / 2);
        } else {
          return 1 - $sth * $this->doCommonMath($cth * $cth, 1, $df - 3, -1);
        }
      }

      // Finds the T value for probability $p by repeatedly halving the step size.
      function getInverseStudentT($p, $df) {
        $v  = 0.5;
        $dv = 0.5;
        $t  = 0;
        while ($dv > 1e-6) {
          $t  = (1 / $v) - 1;
          $dv = $dv / 2;
          if ($this->getStudentT($t, $df) > $p) {
            $v = $v - $dv;
          } else {
            $v = $v + $dv;
          }
        }
        return $t;
      }

      function getFisherF($f, $n1, $n2) {
        // implemented but not shown
      }

      function getInverseFisherF($p, $n1, $n2) {
        // implemented but not shown
      }

    }
    ?>

Output methods

Now that you have implemented the probability functions in PHP, the only remaining challenge in developing a PHP-based data research tool is designing methods to display the results of the analysis. The simple solution is to display the values of all the instance variables on the screen. In the first article, I did exactly that when displaying the linear equation, T value, and T probability for the Burnout Study.
Being able to access particular values for particular purposes is very helpful, and SimpleLinearRegression supports this kind of use. Another kind of output, however, groups the parts of the output systematically. If you study the output of the statistical packages used for regression analysis, you will find that they tend to group their output in the same way. They usually have a Summary Table, an Analysis of Variance table, a Parameter Estimates table, and R values. Similarly, I created output methods with the following names: showSummaryTable(), showAnalysisOfVariance(), showParameterEstimates(), and showRValues().

I also have a method for displaying the linear prediction formula (getFormula()). Many statistical packages do not output the formula, and instead expect the user to construct it from the output of the methods above. This is partly because the final form of the formula you end up using to model your data may differ from the default formula for the following reasons: the Y-intercept has no meaningful interpretation, or the input values may have been transformed, and you may need to untransform them to obtain the final interpretation.

All of these methods assume that the output medium is a Web page. Considering that you might want to output these summary values to media other than a Web page, I decided to wrap the output methods in a class that inherits from the SimpleLinearRegression class. The code in Listing 2 is intended to demonstrate the general logic of the output class. To make the general logic stand out, I removed the code that implements the various show methods.

Listing 2. Output class demonstrating the common logic

    // Distributed under GPL
    include_once "slr/SimpleLinearRegression.php";

    class SimpleLinearRegressionHTML extends SimpleLinearRegression {

      function SimpleLinearRegressionHTML($X, $Y, $conf_int) {
        SimpleLinearRegression::SimpleLinearRegression($X, $Y, $conf_int);
      }

      function showTableSummary($x_name, $y_name) {