Errors and residuals

The expected value, being the mean of the entire population, is typically unobservable, and hence the statistical error cannot be observed either. A residual (or fitting deviation), on the other hand, is an observable estimate of the unobservable statistical error.

If we assume a normally distributed population with mean μ and standard deviation σ, and choose individuals independently, then we have

$$X_1, \dots, X_n \sim N(\mu, \sigma^2),$$

and the sample mean

$$\overline{X} = \frac{X_1 + \cdots + X_n}{n}$$

is a random variable distributed such that

$$\overline{X} \sim N\!\left(\mu, \frac{\sigma^2}{n}\right).$$

The statistical errors are then

$$e_i = X_i - \mu,$$

with expected values of zero,[4] whereas the residuals are

$$r_i = X_i - \overline{X}.$$

The sum of squares of the statistical errors, divided by σ², has a chi-squared distribution with n degrees of freedom:

$$\frac{1}{\sigma^2} \sum_{i=1}^{n} e_i^2 \sim \chi^2_n.$$

However, this quantity is not observable, as the population mean is unknown. The sum of squares of the residuals, by contrast, is observable, and divided by σ² it has a chi-squared distribution with only n − 1 degrees of freedom:

$$\frac{1}{\sigma^2} \sum_{i=1}^{n} r_i^2 \sim \chi^2_{n-1}.$$

It is remarkable that the sum of squares of the residuals and the sample mean can be shown to be independent of each other, using, for example, Basu's theorem. That fact, and the normal and chi-squared distributions given above, form the basis of calculations involving the t-statistic

$$T = \frac{\overline{X} - \mu}{S_n / \sqrt{n}},$$

where $S_n$ denotes the sample standard deviation. That is fortunate because it means that even though we do not know σ, we know the probability distribution of this quotient: it has a Student's t-distribution with n − 1 degrees of freedom.

If one runs a regression on some data, then the deviations of the dependent variable observations from the fitted function are the residuals.[5] If the data exhibit a trend, the regression model is likely incorrect; for example, the true function may be a quadratic or higher order polynomial.

Dividing the sum of squared residuals by n gives the mean of the squared residuals, which is a biased estimate of the variance of the unobserved errors; the bias is removed by dividing the sum of the squared residuals by df = n − p − 1 instead of n, where df is the number of degrees of freedom: n minus the number of estimated parameters p (not counting the intercept), minus 1 for the intercept.[7] Another way to obtain the mean square of error is to analyze the variance of the linear regression with the technique used in ANOVA (they coincide because ANOVA is a type of regression): the sum of squares of the residuals (also called the sum of squares of the error) is divided by the degrees of freedom n − p − 1, where p is the number of parameters estimated in the model, one for each variable in the regression equation and not counting the intercept.

Because of the behavior of the regression fit, the distributions of residuals at different data points can differ even when the errors themselves are identically distributed. Concretely, in a linear regression where the errors are identically distributed, the variability of residuals of inputs in the middle of the domain will be higher than the variability of residuals at the ends of the domain:[9] linear regressions fit endpoints better than the middle. Minimizing the sum of squared residuals (SSR) is the basis of the least squares estimate: the regression coefficients are chosen so that the SSR is minimal (i.e. its derivative with respect to the coefficients is zero).
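The distinction between errors and residuals in the sampling example above can be made concrete with a short simulation. The following is a minimal sketch, assuming NumPy is available; the population parameters, random seed, and variable names are illustrative choices, not part of the cited material.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 10.0, 2.0, 20           # hypothetical population parameters and sample size

x = rng.normal(mu, sigma, size=n)      # independent draws from N(mu, sigma^2)
x_bar = x.mean()                       # observable sample mean

errors = x - mu                        # statistical errors: deviations from the unobservable population mean
residuals = x - x_bar                  # residuals: deviations from the observed sample mean

# The residuals sum to zero (up to rounding); the errors generally do not.
print(residuals.sum(), errors.sum())

# Dividing the sum of squared residuals by n underestimates sigma^2 on average;
# dividing by n - 1 instead (Bessel's correction) gives an unbiased estimate.
print((residuals ** 2).sum() / n, (residuals ** 2).sum() / (n - 1))

# The quotient (x_bar - mu) / (s / sqrt(n)) follows a Student's t-distribution
# with n - 1 degrees of freedom, even though sigma itself is unknown in practice.
s = x.std(ddof=1)                      # sample standard deviation
t = (x_bar - mu) / (s / np.sqrt(n))
print(t)
```

Rerunning the sketch with different seeds, or averaging over many repetitions, shows the n − 1 divisor tracking σ² while the divisor n systematically falls short.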
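The degrees-of-freedom correction for the regression mean squared error, and the claim that residuals vary less at the ends of the domain than in the middle, can likewise be checked numerically. This is a sketch under assumed data (NumPy only); the simulated line, noise level, and the hat-matrix leverage computation at the end are illustrative additions rather than material taken from the cited sources.

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 50, 1                                      # n observations, p = 1 slope (intercept not counted in p)
x = np.linspace(0.0, 10.0, n)
y = 3.0 + 0.5 * x + rng.normal(0.0, 1.0, size=n)  # identically distributed errors around a true line

# Ordinary least squares: coefficients chosen so the sum of squared residuals (SSR) is minimal.
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)           # observable residuals from the fitted line

ssr = (residuals ** 2).sum()
print(ssr / n)              # biased estimate of the error variance
print(ssr / (n - p - 1))    # mean squared error with df = n - p - 1

# Residual variance depends on position: Var(r_i) = sigma^2 * (1 - h_ii), where the
# leverage h_ii is largest at the ends of the domain, so endpoint residuals vary least.
X = np.column_stack([np.ones(n), x])
H = X @ np.linalg.inv(X.T @ X) @ X.T              # hat matrix
h = np.diag(H)
print(h[0], h[n // 2], h[-1])                     # endpoints have higher leverage than the middle
```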