RATIO-TYPE ESTIMATORS IN STRATIFIED RANDOM SAMPLING USING AUXILIARY ATTRIBUTE

ABSTRACT

A limitation of existing ratio-type estimators in stratified sampling is that they rely on non-attribute auxiliary information. In this study, some ratio-type estimators in stratified random sampling that use an attribute as auxiliary information are proposed. The sample mean of the study variable was transformed linearly, and the proportion of the auxiliary attribute was transformed using auxiliary parameters. Biases and mean square errors (MSE) of these estimators were derived. The MSEs of these estimators were compared with the MSE of the traditional combined ratio estimator. The results show that the proposed estimators are more efficient and less biased than the combined ratio estimator under all conditions. An empirical study was also conducted using students' height data from each faculty of the Usmanu Danfodiyo University, Sokoto. Its results also show that the proposed estimators are more efficient and less biased than the combined ratio estimator. In addition, formulae were obtained for determining sample sizes under various allocations (Optimum, Neyman and Proportional), for fixed cost and for desired precision, when the proposed estimators are adopted.


CHAPTER ONE

INTRODUCTION

1.1 INTRODUCTION

Prior knowledge of the population mean, coefficient of variation, kurtosis and correlation coefficient of an auxiliary variable is known to be very useful, particularly when ratio, product and regression estimators are used to estimate the population mean of a variable of interest. The use of auxiliary information can increase the precision of an estimator when the study variable is highly correlated with the auxiliary variable. Srivastava and Jhajj (1981) suggested a class of estimators of the population mean, provided that the mean and variance of the auxiliary variable are known. Singh and Tailor (2003) considered a modified ratio estimator that exploits the known value of the correlation coefficient of the auxiliary variable. Singh and Upadhyaya (1999) suggested two ratio-type estimators for use when the coefficient of variation and kurtosis of the auxiliary variable are known.

However, the fact that the known population proportion of an attribute provides a similar type of information has not drawn as much attention. In several situations there are no auxiliary variables, but there are auxiliary attributes that are highly correlated with the study variable (Singh et al., 2008). Examples include the sex and height of persons, the amount of milk produced and the breed of cow, and the yield of a wheat crop and the variety of wheat (Jhajj et al., 2006). In such situations, estimators of the parameters of interest can be constructed from prior knowledge of the parameters of the auxiliary attribute, taking advantage of the point-biserial correlation between the study variable and the auxiliary attribute.
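
As a minimal sketch of the point-biserial correlation mentioned above, the following Python function computes it for a numeric study variable and a 0/1 auxiliary attribute. The function name `point_biserial` and the four height values are hypothetical illustrations, not data from this study.

```python
import math

def point_biserial(y, phi):
    """Point-biserial correlation between numeric y and a 0/1 attribute phi.

    Uses the population standard deviation of y, which makes the result
    equal to the ordinary Pearson correlation between y and phi.
    """
    n = len(y)
    y1 = [v for v, a in zip(y, phi) if a == 1]  # units possessing the attribute
    y0 = [v for v, a in zip(y, phi) if a == 0]  # units without the attribute
    p = len(y1) / n                             # proportion possessing the attribute
    mean1 = sum(y1) / len(y1)
    mean0 = sum(y0) / len(y0)
    mean_y = sum(y) / n
    sd_y = math.sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    return (mean1 - mean0) / sd_y * math.sqrt(p * (1 - p))

# Hypothetical heights (cm) and a 0/1 attribute such as sex.
heights = [150, 160, 170, 180]
attribute = [0, 0, 1, 1]
r_pb = point_biserial(heights, attribute)
```

A value of `r_pb` close to 1 or -1 suggests that the attribute carries useful information about the study variable.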

It is often useful to incorporate auxiliary information about the population in a sampling procedure. In practice, auxiliary information can be obtained in different ways. For example, the sampling frames used in the production of official statistics may include auxiliary information on the population elements, or such data may be extracted from administrative registers and merged with the sampling frame elements. Alternatively, aggregate-level auxiliary information can be obtained from other sources, such as published official statistics. Auxiliary information used in sampling and estimation can be very useful in constructing an efficient sampling design.

In the estimation of population parameters, auxiliary information is used to improve efficiency for the variable of interest. Whenever there is auxiliary information, the researcher wants to utilize it in the method of estimation to obtain the most efficient estimator.

In simple random sampling, the variance of the estimate (say, of the population mean $\bar{Y}$) depends, apart from the sample size, on the variability of the character y in the population. If the population is very heterogeneous and considerations of cost limit the size of the sample, it may be impossible to get a sufficiently precise estimate by taking a simple random sample from the entire population. Populations encountered in practice are generally very heterogeneous (Raj and Chandhok, 1998). In surveys of manufacturing establishments, for example, some establishments are very large, employing 1000 or more persons, while many others have only two or three persons on their rolls. Any estimate made from a direct random sample taken from the totality of such establishments would be subject to exceedingly large sampling fluctuations. But suppose it is possible to divide this population into parts or strata on the basis of, say, employment, thereby separating the very large establishments, the medium-sized ones and the smaller ones. If a random sample of establishments is now taken from each stratum, it should be possible to make a better estimate of the strata averages, which in turn should help in producing a better estimate of the population average. Similarly, if a sample is selected with probability proportionate to x from the entire population, the variance of the population-total estimate may be very high because the ratio of y to x varies considerably over the population. If a way can be found of subdividing the population so that the variation of the ratio of y to x is considerably reduced within the subdivisions or strata, a better estimate of the population total can be made. This is the basic consideration involved in the use of stratification for improving the precision of estimation (Raj and Chandhok, 1998).
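
The gain from stratification described above can be illustrated with a small numerical sketch. It uses the standard variance formulas for the sample mean under simple random sampling without replacement and under stratified random sampling. The employment figures, the stratum boundaries and the allocation `[6, 3, 1]` are hypothetical and chosen only for illustration.

```python
from statistics import variance

def srs_variance(pop, n):
    """Variance of the sample mean under SRS without replacement:
    (1 - n/N) * S^2 / n, with S^2 the population variance (divisor N - 1)."""
    N = len(pop)
    return (1 - n / N) * variance(pop) / n

def stratified_variance(strata, sizes):
    """Variance of the stratified mean:
    sum over strata of W_h^2 * (1 - n_h/N_h) * S_h^2 / n_h."""
    N = sum(len(s) for s in strata)
    total = 0.0
    for s, n_h in zip(strata, sizes):
        N_h = len(s)
        W_h = N_h / N  # stratum weight
        total += W_h ** 2 * (1 - n_h / N_h) * variance(s) / n_h
    return total

# Hypothetical numbers of employees in 20 establishments, grouped by size.
small = [2, 3, 2, 4, 3, 5, 2, 3, 4, 2, 3, 5]
medium = [40, 55, 60, 48, 52, 45]
large = [1000, 1200]
strata = [small, medium, large]
population = small + medium + large

v_srs = srs_variance(population, 10)
v_str = stratified_variance(strata, [6, 3, 1])  # proportional allocation, n = 10
```

With these made-up figures `v_str` is far smaller than `v_srs`, because most of the variability lies between the strata rather than within them.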

1.2 CENSUS VERSUS SAMPLE SURVEY

Broadly speaking, information on a population may be collected in two ways. Either every unit in the population is enumerated (called complete enumeration, or census), or enumeration is limited to only a part, or sample, selected from the population (called sample enumeration or sample survey). A sample survey will usually be less costly than a complete census because the expense of covering all units is greater than that of covering only a fraction of them. It also takes less time to collect and process data from a sample than from a census. But economy is not the only consideration; the most important point is whether the accuracy of the results would be adequate for the end in view. It is a curious fact that the results from a carefully planned and well-executed sample survey are expected to be more accurate (closer to the aim of the study) than those from a complete enumeration that can be taken. A complete census ordinarily requires a huge and unwieldy organization, and therefore many types of errors creep in which cannot be controlled adequately. In a sample survey the volume of work is reduced considerably, and it becomes possible to employ persons of higher caliber, train them suitably, and supervise their work adequately. In a properly designed sample survey it is also possible to make a valid estimate of the margin of error and hence decide whether the results are sufficiently accurate. A complete census does not by itself reveal the margin of uncertainty to which it is subject. But there is not always a choice of one versus the other. For example, if the data are required for every small administrative area in a country, no sample survey of a reasonable size will be able to deliver the desired information; only a complete census can do this (Raj and Chandhok, 1998).

1.3 RANDOM SAMPLING

Simple random sampling is a method of selecting n units out of the N such that every one of the $\binom{N}{n}$ distinct samples has an equal chance of being drawn. In practice a simple random sample is drawn unit by unit. The units in the population are numbered from 1 to N. A series of random numbers between 1 and N is then drawn, either by means of a table of random numbers or by means of a computer program that produces such a table. At any draw the process used must give an equal chance of selection to any number in the population not already drawn. The units that bear these n numbers constitute the sample.

It is easily verified that all $\binom{N}{n}$ distinct samples have an equal chance of being selected by this method. Consider one distinct sample, that is, one set of n specified units. At the first draw the probability that some one of the n specified units is selected is $n/N$. At the second draw the probability that some one of the remaining $(n-1)$ specified units is drawn is $(n-1)/(N-1)$, and so on. Hence the probability that all n specified units are selected in n draws is

$$\frac{n}{N}\cdot\frac{n-1}{N-1}\cdot\frac{n-2}{N-2}\cdots\frac{1}{N-n+1}=\frac{n!\,(N-n)!}{N!}=\frac{1}{\binom{N}{n}}$$
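
The product of draw probabilities in the argument above can be checked with exact rational arithmetic. The following is a minimal sketch; the function name `prob_specific_sample` and the values N = 10 and n = 3 are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def prob_specific_sample(N, n):
    """Probability that n specified units are all selected in n draws
    without replacement: (n/N) * ((n-1)/(N-1)) * ... * (1/(N-n+1))."""
    p = Fraction(1)
    for k in range(n):
        p *= Fraction(n - k, N - k)
    return p

# For N = 10 and n = 3 the product equals 1 / C(10, 3).
p = prob_specific_sample(10, 3)
```

Using `Fraction` avoids floating-point rounding, so the result can be compared with `Fraction(1, comb(N, n))` for exact equality.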

Since a number that has been drawn is removed from the population for all subsequent draws, this method is also called random sampling without replacement. Random sampling with replacement is entirely feasible; at any draw, all N members of the population are given equal chance of being drawn, no matter how often they have been drawn. The formulas for the variances and estimated variances of estimates made from the sample are often simpler when sampling is with replacement than when it is without replacement. For this reason sampling with replacement is sometimes used in the more complex sampling plans (Cochran, 1977).
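
The two selection schemes can be sketched with Python's standard `random` module. The function names, the seed and the choice of N = 20 and n = 5 are assumptions made for illustration; units are labelled 1 to N as in the text.

```python
import random

def srs_without_replacement(N, n, rng):
    """Draw n distinct unit labels from 1..N; a drawn unit cannot recur."""
    return rng.sample(range(1, N + 1), n)

def srs_with_replacement(N, n, rng):
    """Draw n unit labels from 1..N; every unit is eligible at every draw,
    so the same label may appear more than once."""
    return rng.choices(range(1, N + 1), k=n)

rng = random.Random(2024)  # fixed seed only so the sketch is repeatable
sample_wor = srs_without_replacement(20, 5, rng)
sample_wr = srs_with_replacement(20, 5, rng)
```

Here `sample_wor` always contains five different labels, whereas `sample_wr` may contain repeats.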

1.4 DEFINITION OF BASIC TERMS

Sample:- A sample is a group of units selected from a larger group (the population). By studying the sample, it is hoped to draw valid conclusions about the larger group. A sample is generally selected for study because the population is too large to study in its entirety. The sample should be representative of the general population. This is often best achieved by random sampling. Also, before collecting the sample, it is important that the researcher carefully and completely defines the population, including a description of the members to be included (Cochran, 1977).

Parameter:- A parameter is a value, usually unknown (and which therefore has to be estimated), used to represent a certain population characteristic. Within a population, a parameter is a fixed value which does not vary. Parameters are often denoted by Greek letters (Cochran, 1977).

Statistic:- A statistic is a quantity that is calculated from sample data. It is used to give information about unknown values in the corresponding population. It is possible to draw more than one sample from the same population, and the value of a statistic will in general vary from sample to sample. Therefore, a statistic is a random variable (Cochran, 1977).

Estimator:- An estimator is a rule for calculating an estimate of a given quantity based on observed data. There are point and interval estimators. A point estimator yields single-valued results, although this includes the possibility of single vector-valued results and results that can be expressed as a single function. This is in contrast to an interval estimator, where the result is a range of plausible values (or vectors or functions). An estimator is a statistic (that is, a function of the data) that is used to infer the value of an unknown parameter in a statistical model. The parameter being estimated is sometimes called the estimand. It can be either finite-dimensional (in parametric and semi-parametric models) or infinite-dimensional (in nonparametric and semi-nonparametric models). If the parameter is denoted by $\theta$, then the estimator is typically written as $\hat{\theta}$. Being a function of the data, the estimator is a random variable (Cochran, 1977).
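
The statement that an estimator is a random variable can be illustrated by enumerating every possible simple random sample from a small population and computing the sample mean of each. The five height values and the sample size of 3 are hypothetical, and the sample mean is used only as a familiar example of an estimator.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population of five heights (cm); theta is the parameter.
population = [150, 158, 163, 171, 180]
theta = mean(population)

# The estimator (here the sample mean) evaluated on all C(5, 3) = 10 samples.
estimates = [mean(sample) for sample in combinations(population, 3)]

# The estimates differ from sample to sample, yet they average to theta.
average_of_estimates = mean(estimates)
```

The list `estimates` holds several different values, which is the sampling variability of the estimator, while `average_of_estimates` agrees with `theta` up to rounding error.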

Bias:- The bias of an estimator is the difference between the estimator's expected value and the true value of the parameter being estimated. An estimator with zero bias is called unbiased; otherwise the estimator is said to be biased. Suppose we have a statistical model parameterized by $\theta$ giving rise to a probability distribution for observed data $p(x \mid \theta)$ and a statistic

