Statistical interpretation of data--Detection and handling of outlying observations in exponential sample

Basic Information

Standard ID: GB 8056-1987

Standard Name: Statistical interpretation of data--Detection and handling of outlying observations in exponential sample

Chinese Name: 数据的统计处理和解释指数样本异常值的判断和处理

Standard category: National Standard (GB)

Status: Abolished

Date of Release: 1987-07-08

Date of Implementation: 1988-04-01

Date of Expiration: 2009-01-01

Standard classification number

Standard ICS number: Sociology, Services, Organization and management of companies (enterprises), Administration, Transport>>Quality>>03.120.30 Application of statistical methods

Standard Classification Number: Comprehensive>>Basic Subjects>>A41 Mathematics

Associated standards

Replacement status: Replaced by GB/T 8056-2008

Publication information

Publishing house: China Standards Press

Publication date: 1988-04-01

Other information

Release date: 1987-07-08

Review date: 2004-10-14

Drafter: Fei Heliang, Xu Jinlong, Chen Zhenmin

Drafting unit: Shanghai Normal University

Focal point unit: National Technical Committee for Application of Statistical Methods and Standardization

Proposing unit: National Technical Committee for Application of Statistical Methods and Standardization

Publishing department: National Bureau of Standards

Competent authority: National Standardization Administration

Introduction to the standard:

This standard specifies the general principles and implementation methods for judging and handling outlying observations in random samples from a (single-parameter) exponential distribution. It applies to samples from exponential or approximately exponential populations, that is, samples in which, apart from one or a few outlying values, most of the data (the main data) come from the same exponential or approximately exponential population.

Some standard content:

National Standard of the People's Republic of China
UDC 519.28
GB 8056-87
Statistical interpretation of data: Detection and handling of outlying observations in exponential sample
1 Purpose and scope of application
1.1 This standard specifies the general principles and implementation methods for judging and handling outlying observations in random samples from a (single-parameter) exponential distribution. It applies to samples from exponential or approximately exponential populations, that is, samples in which, apart from one or a few outliers, most of the remaining data (the main data) come from the same exponential or approximately exponential population. The distribution function of the exponential distribution is:
$$F(x) = 1 - e^{-x/\theta}, \quad x \ge 0 \tag{1}$$
The probability density function is:
$$f(x) = \frac{1}{\theta}\, e^{-x/\theta}, \quad x \ge 0 \tag{2}$$
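As a minimal sketch of the two functions in 1.1, the Python below evaluates the distribution function and the density of a single-parameter exponential distribution. It assumes the mean parameterization; the parameter name theta and the function names are illustrative choices, not taken from the standard.

```python
import math


def exp_cdf(x, theta):
    """Distribution function: 1 - exp(-x / theta) for x >= 0, else 0."""
    return 1.0 - math.exp(-x / theta) if x >= 0 else 0.0


def exp_pdf(x, theta):
    """Density: exp(-x / theta) / theta for x >= 0, else 0."""
    return math.exp(-x / theta) / theta if x >= 0 else 0.0


# Assumed example value: theta = 2.0, so the median is theta * ln 2.
theta = 2.0
median = theta * math.log(2.0)
print(exp_cdf(median, theta), exp_pdf(0.0, theta))
```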
1.2 Outlying observations (or outliers) are individual values in the sample that differ markedly from the remaining observations in the sample.
1.2.1 An outlier may simply be an extreme manifestation of the random variability inherent in the data. If so, it should be treated in the same way as the other observations in the sample.
1.2.2 An outlier may also result from an accidental deviation from the specified test conditions and test methods, or from an error in calculating or recording the value. Such an outlier does not belong to the same population as the other observations.
2 Reference standards
GB 3358-82 "Statistical Terms and Symbols"
GB 4086.1 to 4086.6-83 "Statistical Distribution Table"
3 Symbols and their meanings
X(i): the i-th observation when the observations are arranged from smallest to largest.
Tn(n): when the sample size n ≤ 100, the statistic used to test whether the largest observation X(n) is an outlier.
Tn(1): when the sample size n ≤ 100, the statistic used to test whether the smallest observation X(1) is an outlier.
En(n): when the sample size n > 100, the statistic used to test whether the largest observation X(n) is an outlier.
En(1): when the sample size n > 100, the statistic used to test whether the smallest observation X(1) is an outlier.
α: the significance level of the test.
Tn(n)(1-α): the critical value for the test using the statistic Tn(n) at significance level α.
Tn(1)(α): the critical value for the test using the statistic Tn(1) at significance level α.
Fp(ν1, ν2): the p-quantile of the F variable with degrees of freedom ν1 and ν2.
En,r(1): in a sample with truncation number r, the statistic used to test whether X(1) is an outlier.
Approved by the National Bureau of Standards on July 8, 1987
Implemented on April 1, 1988
4 Statistical principles for judging outliers
When judging whether an observation is an outlier, a direct decision can usually be made on technical or physical grounds, for example when the experimenter already knows that the experiment deviated from the prescribed method or that the test instrument was faulty. When no such reason is apparent, statistical methods can be used.
4.1 This standard judges outliers in a sample under the following circumstances:
One-sided case a): based on past experience, the outliers are all high-end values;
One-sided case b): based on past experience, the outliers are all low-end values;
Two-sided case: the outliers are extreme values that may appear at either end.
4.2 When implementing this standard, an upper limit on the number of outliers detected in the sample should be specified (a small proportion of the number of sample observations). When this limit is exceeded, the representativeness of the sample should be carefully studied and handled.
4.3 Test rules for judging single outliers
4.3.1 Take as the null hypothesis that all observations are sample values from the same population, select the case in 4.1 that matches the actual situation as the alternative hypothesis, and then construct a statistic for judging outliers based on statistical principles.
4.3.2 Specify an appropriate significance level α. The recommended α value is 1%, and an α value exceeding 5% is not advisable. Determine the critical value of the statistic from α and the number of observations n.
4.3.3 Substitute the data into the statistic. When the resulting value of the statistic exceeds the critical value, the extreme observation under test is judged to be an outlier; otherwise, it is judged that there is no outlier.
4.4 Test rules for judging multiple outliers
When the number of outliers allowed to be detected is greater than 1, the method specified in this standard is to apply the test rule for a single outlier repeatedly, using the specified significance level and the test rules of 4.3. If no outlier is detected, the entire test stops; if an outlier is detected, the same significance level and the same rules are applied to the remaining observations after removing the detected outlier, until no outlier is detected or the number of detected outliers reaches the upper limit.
5 General rules for handling outliers
5.1 For outliers detected by statistical methods, their technical and physical reasons should be sought as much as possible as a basis for handling outliers.
5.2 The ways to handle outliers are:
a. The outliers are retained in the sample for subsequent data analysis;
b. The outliers may be eliminated, that is, excluded from the sample;
c. The outliers may be eliminated and appropriate observations added to the sample;
d. The outliers are corrected when the actual cause is found.
5.3 The user of the standard should, according to the nature of the actual problem, weigh the cost of finding the cause of outliers, the benefit of correctly judging outliers, and the risk of erroneously eliminating normal observations, and decide to implement one of the following rules:
a. No outlier shall be eliminated or corrected without sufficient technical or physical reasons.
b. Besides outliers with sufficient technical or physical reasons, only those that are statistically highly abnormal (that is, observations that are significant at the significance level α specified in this standard) may be eliminated or corrected.
5.4 The eliminated or corrected observations and the reasons should be recorded for reference.
6 Rules for judging single outliers
6.1 This standard stipulates that when the sample size n ≤ 100, the statistic Tn(n) (or Tn(1)) is used for testing, and when the sample size n > 100, the statistic En(n) (or En(1)) is used for testing.
6.2 Test rules for one-sided case a)
6.2.1 When the sample size n ≤ 100, proceed as follows:
a. Calculate the value of the statistic Tn(n):
… (3)
b. Determine the significance level α, and find the critical value Tn(n)(1-α) corresponding to n and α in Table A1. When the value of Tn(n) is greater than the critical value Tn(n)(1-α), X(n) is judged to be an outlier; otherwise it is judged that there is no outlier.
6.2.2 When the sample size n > 100, proceed as follows:
a. Calculate the value of the statistic En(n):
$$E_n(n) = \frac{(n-1)\,[X_{(n)} - X_{(n-1)}]}{\sum_{i=1}^{n} X_i - (X_{(n)} - X_{(n-1)})} \tag{4}$$
b. Determine the significance level α, and find the critical value F1-α(2, 2n-2) corresponding to n and α in the quantile table of the F variable. When the value of En(n) is greater than F1-α(2, 2n-2), X(n) is judged to be an outlier; otherwise it is judged that there is no outlier.
6.3 Test rules for one-sided case b)
6.3.1 When the sample size n ≤ 100, proceed as follows:
a. Calculate the value of the statistic Tn(1):
… (5)
b. Determine the significance level α, and find the critical value Tn(1)(α) corresponding to n and α in Table A2. When the value of Tn(1) is less than the critical value Tn(1)(α), X(1) is judged to be an outlier; otherwise it is judged that there is no outlier.
6.3.2 When the sample size n > 100, proceed as follows:
a. Calculate the value of the statistic En(1):
$$E_n(1) = \frac{n(n-1)\,X_{(1)}}{\sum_{i=1}^{n} X_i - n X_{(1)}} \tag{6}$$
b. Determine the significance level α, and find the critical value Fα(2, 2n-2) corresponding to n and α in the quantile table of the F variable. When the value of En(1) is less than the critical value Fα(2, 2n-2), X(1) is judged to be an outlier; otherwise it is judged that there is no outlier.
6.4 Test rules for two-sided cases
6.4.1 When the sample size n ≤ 100, proceed as follows:
a. Calculate the values of the statistics Tn(n) and Tn(1).
b. Determine the significance level α, and find in Table A1 the critical value Tn(n)(1-… corresponding to n, …
The critical value Tn(n)(…
Calculate the sample mean: …
… ≥ 1, Tn(…) … there are outliers.
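The large-sample rules of 6.2.2 and 6.3.2 can be sketched in Python as follows. This is an illustration under stated assumptions rather than part of the standard: the function names are hypothetical, the data are synthetic, and the F quantile is computed from the closed form that holds when the first degree of freedom is 2, instead of being read from a quantile table.

```python
import math


def f_quantile_2(p, n):
    """p-quantile of the F distribution with (2, 2n-2) degrees of freedom.

    Closed form, valid because the first degree of freedom is 2:
    P(F <= x) = 1 - (1 + x / (n - 1)) ** -(n - 1).
    """
    return (n - 1) * ((1.0 - p) ** (-1.0 / (n - 1)) - 1.0)


def e_largest(xs):
    """En(n) = (n-1)[X(n) - X(n-1)] / [sum(X) - (X(n) - X(n-1))]."""
    s = sorted(xs)
    n = len(s)
    gap = s[-1] - s[-2]
    return (n - 1) * gap / (sum(s) - gap)


def e_smallest(xs):
    """En(1) = n(n-1) X(1) / [sum(X) - n X(1)]."""
    s = sorted(xs)
    n = len(s)
    return n * (n - 1) * s[0] / (sum(s) - n * s[0])


def largest_is_outlier(xs, alpha=0.01):
    """One-sided case a): X(n) is an outlier if En(n) > F_{1-alpha}(2, 2n-2)."""
    return e_largest(xs) > f_quantile_2(1.0 - alpha, len(xs))


def smallest_is_outlier(xs, alpha=0.01):
    """One-sided case b): X(1) is an outlier if En(1) < F_alpha(2, 2n-2)."""
    return e_smallest(xs) < f_quantile_2(alpha, len(xs))


# Synthetic data (an assumption for illustration): 120 evenly spaced
# quantiles of an exponential distribution with mean 1.
base = [-math.log(1.0 - (i + 0.5) / 120) for i in range(120)]
print(largest_is_outlier(base), largest_is_outlier(base + [30.0]))
print(smallest_is_outlier(base), smallest_is_outlier(base + [1e-6]))
```

The closed-form quantile avoids a table lookup; where a statistics library is available, its F quantile function could be used in its place.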
Note: Only part of the complete standard is reproduced here.
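The repeated-testing rule of 4.4 can be sketched as follows. The sketch assumes one-sided case a) with n > 100 and uses the statistic En(n) of 6.2.2 as the single-outlier test; `detect_outliers`, `upper_test`, and the sample data are hypothetical names and values chosen for illustration, not part of the standard.

```python
import math


def upper_test(sample, alpha=0.01):
    """Single-outlier test for the largest value (one-sided case a).

    Returns the index of the largest observation if En(n) exceeds the
    (1 - alpha) quantile of F(2, 2n-2), otherwise None.
    """
    n = len(sample)
    order = sorted(range(n), key=lambda i: sample[i])
    gap = sample[order[-1]] - sample[order[-2]]
    e_stat = (n - 1) * gap / (sum(sample) - gap)
    # Closed-form F(2, 2n-2) quantile at probability 1 - alpha.
    critical = (n - 1) * (alpha ** (-1.0 / (n - 1)) - 1.0)
    return order[-1] if e_stat > critical else None


def detect_outliers(xs, single_test, max_outliers):
    """Apply a single-outlier test repeatedly, as described in 4.4.

    Stops when the test finds no outlier or when the number of detected
    outliers reaches max_outliers (the upper limit required by 4.2).
    Returns (detected outliers in order of removal, remaining observations).
    """
    remaining = list(xs)
    detected = []
    while len(detected) < max_outliers:
        idx = single_test(remaining)
        if idx is None:
            break
        detected.append(remaining.pop(idx))
    return detected, remaining


# Synthetic data (an assumption for illustration): 120 exponential
# quantiles with mean 1, plus two deliberately large values.
base = [-math.log(1.0 - (i + 0.5) / 120) for i in range(120)]
detected, remaining = detect_outliers(base + [40.0, 400.0], upper_test, 5)
print(detected, len(remaining))
```

Passing the single-outlier test as a function keeps the loop independent of which case in 4.1 applies, so a lower-end test could be substituted without changing `detect_outliers`.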