CAN ONE TRUST SOCIOLOGISTS? (THE SAMPLING METHOD IN SOCIOLOGICAL SURVEYS)

The results of sociological surveys have been much debated lately, especially those dealing with socially topical issues such as the ratings of the president or opposition parties, or attitudes towards integration with Russia or the EU. While experts dispute the sample computation, the wording of questions and the reasoning behind conclusions, public discussions often challenge the scientific methodology itself: is it right to judge public opinion from the answers of several hundred or a thousand respondents? Such objections, ranging from polite perplexity to categorical criticism, are heard from the general public and intellectuals, from the authorities and the opposition (depending on who dislikes the published results). Are the results of sociological surveys, and above all of public opinion polls, reliable? Are they not ordinary manipulation dressed up as science? Let us examine the basic rules of sample computation.

Methods of sociological survey can be divided into complete and partial ones. A complete survey requires that every element of the object under study be examined. For many reasons such surveys may prove very laborious, costly or simply impossible. In that case partial survey methods are applied: the monographic method, the method of general body, the sampling method, etc. Today sampling is the most widespread of the partial methods.

In the sampling method only some of the object's elements are surveyed, not all of them. In both science and everyday life, people's knowledge, judgments and actions rest mainly on selective data. Thus we judge the taste of berries by trying only a few from the whole bunch, the work of a city's public transport by several of its routes, and a new film by the comments of our friends. Sampling is a general mechanism of human perception, used throughout science, and not an invention of sociologists, as some claim. Doctors, for instance, draw conclusions about a patient's health from a blood test that uses only a few drops of blood, and geologists judge the presence and extent of a mineral deposit from several test samples. What we can learn about an object, in scientific work or in everyday life, is always only a part of the information. Experience shows that if the requirements of sampling are carefully observed, the method is quite reliable. But those who use the results of sample surveys should understand well what sampling is and what requirements it imposes.

The basic concepts of the sampling method are as follows. The totality of elements of the object under study that relate to the problem in question is called the entire assembly (in standard statistical terminology, the general population). It consists of units called the elements of the entire assembly. The elements that are actually studied make up the sample. The sample is a specially selected part of the entire assembly. To a certain extent it is a model of the entire assembly, which is why the latter can be assessed on the basis of the sample. However, there is no need to reproduce all aspects of the entire assembly in the sample; reproducing the characteristics essential for a particular survey is sufficient. The ability of the sample to represent these essential characteristics of the entire assembly is called representativeness. The basic principle of building a representative sample is to give all elements of the entire assembly an equal chance of being included in the sample.

It should be noted that sampling results cannot guarantee absolute accuracy, since only a part of the entire assembly is examined. Some error is therefore inevitably introduced. If the sample is constructed correctly, this error does not exceed the stated margin of error (with the chosen confidence probability).

Sampling errors can be systematic or random. Systematic errors arise if the sample is constructed incorrectly or the procedure of information gathering is violated. They also arise from respondents' differing degrees of willingness to participate in a survey. Increasing the sample size does not reduce systematic errors. It is the professionalism and responsibility of all participants in a sociological survey that prevent or minimize (detect and correct) errors of this kind.

Random errors arise because only a part of the entire assembly is examined rather than all of it. Measurement errors fall into the same category. Random errors cannot be eliminated, but they decrease as the sample size increases. They can be held within a given margin, so that the required accuracy of survey results can be reached, provided there are no systematic errors. The size of the random error depends on the sample size and on the method of sample construction. Researchers' attempts to increase the accuracy of their results entail larger samples and, consequently, higher survey costs.
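
The way random sampling error shrinks as the sample grows can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of 100,000 people with a 40% share holding some opinion is invented for illustration, and the function name `average_sampling_error` is hypothetical.

```python
import random
import statistics

def average_sampling_error(population, sample_size, trials=200, seed=0):
    """Mean absolute gap between the sample share and the true share."""
    rng = random.Random(seed)
    true_share = sum(population) / len(population)
    errors = []
    for _ in range(trials):
        # Simple random sample drawn without replacement.
        sample = rng.sample(population, sample_size)
        errors.append(abs(sum(sample) / sample_size - true_share))
    return statistics.mean(errors)

# Hypothetical population: 1 = holds the opinion, 0 = does not.
population = [1] * 40_000 + [0] * 60_000

for n in (100, 400, 1600):
    print(n, round(average_sampling_error(population, n), 4))
```

The printed average error falls as the sample size rises, but no sample size makes it vanish. A systematic error, such as drawing only from one part of the population, would not shrink in this way.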

Two kinds of selection methods are generally used for constructing a sample: probabilistic methods, which rely on probability theory to justify representativeness, and non-statistical methods of selection and of justifying representativeness (quota selection, the method of general body, the accessible sample, etc.). When applying non-statistical methods, one should remember that there are no common rules for justifying the representativeness of the sample. Such justification should rest on the sample's compliance with the aims of the survey and on the necessary completeness of the information gathered. Moreover, when non-statistical methods are used, there are no methods for calculating the sample size needed to ensure representativeness.

We now describe the probability methods of sample formation in more detail. With these methods, statistical generalization of sample results rests on probability theory, which presupposes random (probabilistic) selection of elements from the entire assembly. If random selection is carried out correctly, all elements of the entire assembly have an equal chance of being included in the sample. Simple random sampling is one of the probability selection methods. It is used when the entire assembly is homogeneous and a full list of its elements exists. Selection from the list is then performed by special procedures (a table of random numbers, a simple draw, etc.). A variety of simple random sampling is patterned sampling (more commonly called systematic sampling), in which elements of the entire assembly are taken at a fixed step from actual lists (e.g., lists of voters in a constituency, a company's list of employees, etc.). The step (interval) between selected elements depends on the size of the sample and of the entire assembly.
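
Both procedures can be sketched in a few lines of Python. This is a minimal illustration, not a survey tool: the function names are hypothetical, the list of voters is invented, and the step is assumed to be the size of the entire assembly divided by the sample size (rounded down), with a random starting point inside the first step.

```python
import random

def simple_random_sample(elements, n, seed=None):
    """Draw n elements from the full list, each with an equal chance."""
    rng = random.Random(seed)
    return rng.sample(elements, n)

def systematic_sample(elements, n, seed=None):
    """Take every k-th element of the list, starting at a random position."""
    step = len(elements) // n          # assumed step: N // n
    if step < 1:
        raise ValueError("sample size exceeds the size of the list")
    rng = random.Random(seed)
    start = rng.randrange(step)        # random start within the first step
    return [elements[start + i * step] for i in range(n)]

# Hypothetical list of 1,000 voters identified by number.
voters = list(range(1, 1001))
print(simple_random_sample(voters, 5, seed=1))
print(systematic_sample(voters, 5, seed=1))
```

Systematic selection needs only one random number, which is why it is convenient for long paper lists, but it presumes that the order of the list is unrelated to the characteristic being studied.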

If the entire assembly is very large and heterogeneous, stratified random sampling is used (the approach is also called subdivision, zoning or stratification of the entire assembly): the entire assembly is divided into disjoint parts, and a simple random sample is then drawn from each part. If the entire assembly is not clearly defined and lists of its elements are not available, cluster sampling is used. In this case respondents (observation units) are selected in groups (clusters) rather than individually (e.g., a student group, a production team, etc.). There are also other schemes of random sample formation. The choice of a sampling scheme depends on the financial and time resources of a survey. For more schemes see the additional literature: W. Cochran. Sampling Methods. Moscow. 1976; O. Tereschenko. Sample Grounding and Computation // On-line Sociological Surveys: Methodology and Build-Up Experience. Minsk. 2001.
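
The difference between the two schemes can be sketched as follows. The function names, the strata, the clusters and the per-stratum sample sizes are all hypothetical; in particular, how many respondents to take from each stratum is a separate decision that this sketch simply receives as input.

```python
import random

def stratified_sample(strata, sizes, seed=None):
    """Draw a simple random sample of the given size from every stratum.

    strata: dict mapping stratum name -> list of its elements
    sizes:  dict mapping stratum name -> number of elements to draw
    """
    rng = random.Random(seed)
    return {name: rng.sample(members, sizes[name])
            for name, members in strata.items()}

def cluster_sample(clusters, n_clusters, seed=None):
    """Select whole clusters at random and survey everyone in them."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [member for name in chosen for member in clusters[name]]

# Hypothetical strata: 700 urban and 300 rural residents.
strata = {"urban": list(range(700)), "rural": list(range(700, 1000))}
print(stratified_sample(strata, {"urban": 7, "rural": 3}, seed=0))

# Hypothetical clusters: 10 student groups of 25 students each.
clusters = {f"group{i}": [f"g{i}_s{j}" for j in range(25)] for i in range(10)}
print(len(cluster_sample(clusters, 3, seed=0)))
```

Stratified selection needs a list of elements for every stratum, whereas cluster selection needs only a list of the groups themselves.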

Below is a computation scheme for a random representative sample. By computing such a sample we mean, more specifically, determining the value of the sample's random error and finding the sample size for which the random error does not exceed the maximum permissible value at a given confidence probability.

There are many formulas for calculating the sample size, applicable under various conditions, for different aims and different types of observation units. The formula below can be used to calculate the sample size when the size of the entire assembly is known.

$$ n = \frac{N t^2 p q}{N D_p^2 + t^2 p q} $$

Where:
n – the required sample size;
N – the size of the entire assembly;
p and q – the sampling fractions (e.g., men and women); to be on the safe side each is taken as 0.5, since their product is then maximal. If there are more than two fractions and they sum to one, their product becomes smaller, in line with the new division;
D_p – the admissible sampling error, usually 5%, i.e. 0.05 (the values 0.01, 0.03 and 0.05 are most commonly used);
t – the confidence coefficient, determined from the table of the normal distribution; for a confidence probability P = 0.95, t = 1.96 (approximately 2).
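
The calculation can be sketched in Python as follows. This is an illustration only: the function name `sample_size` and its parameter names are hypothetical, the formula is assumed to be the standard finite-population form with t² in both the numerator and the denominator, and rounding up to a whole respondent is a choice made here.

```python
import math

def sample_size(N, margin=0.05, t=1.96, p=0.5):
    """Sample size n for an entire assembly of N elements.

    margin: admissible sampling error D_p (0.05 means 5%)
    t:      confidence coefficient (1.96 for P = 0.95)
    p:      sampling fraction; q is taken as 1 - p
    """
    q = 1 - p
    n = (N * t**2 * p * q) / (N * margin**2 + t**2 * p * q)
    return math.ceil(n)  # round up to a whole respondent

print(sample_size(7_083_000, margin=0.05, t=2))
print(sample_size(7_083_000, margin=0.03, t=2))
```

For N = 7,083,000, t = 2 and a 5% error the function returns 400. Tightening the admissible error to 3% raises the required sample size considerably, which illustrates why greater accuracy means higher survey costs.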

Example. Let us calculate the sample size for a survey of the country's citizens of working age and older. The values are N = 7,083,000 people, t = 2, D_p = 0.05, p = 0.5, q = 0.5:

$$ n = \frac{7083000 \cdot 0.5 \cdot 0.5 \cdot 2^2}{7083000 \cdot 0.05 \cdot 0.05 + 2^2 \cdot 0.5 \cdot 0.5} \approx 400 $$

The resulting sample size is 400 people, but this is sufficient only if the entire assembly is fully homogeneous. This sample is homogeneous only in the sense that it consists of citizens aged 16 and over. If problems of employment are studied, it needs to be subdivided according to additional characteristics: gender, age, education, social status and type of settlement. This gives 40 subgroups, each of which should be represented by at least 30 respondents. The sample size finally comes to 1,500 observation units.
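
The same relationship can be read in the opposite direction, giving the random error for a sample that has already been drawn. The sketch below rearranges the sample-size formula for D_p, assuming its standard form with t² in both numerator and denominator, which gives D_p = t·sqrt(pq/n · (1 − n/N)). The function name `margin_of_error` is hypothetical.

```python
import math

def margin_of_error(n, N, t=1.96, p=0.5):
    """Random sampling error D_p for a sample of n out of N elements."""
    q = 1 - p
    return t * math.sqrt(p * q / n * (1 - n / N))

N = 7_083_000
for n in (400, 1500):
    # Printed as a percentage for readability.
    print(n, round(100 * margin_of_error(n, N, t=2), 2))
```

Applied to a single subgroup rather than to the whole sample, the same function shows how much wider the error becomes when conclusions are drawn from a few dozen respondents.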

Often the required sample cannot be formed using only one sampling method. Different methods, both statistical and non-statistical, can sometimes be used at different stages of sample construction. Computing the size of such a sample is very laborious and can be performed only by experts.

It follows that if a sociological survey is conducted in accordance with the above rules of sample computation and all the necessary procedures, its results can with confidence be extended to the entire assembly as a whole (the employees of a given enterprise, the voters of a given constituency or the population of the country), on the understanding that the possible deviation lies within the margin of error (±3%, ±5%, etc.). Before trusting published results, one should first check whether the authors (or publishers) give the main characteristics of the survey: who conducted it and when, by what method, and what the margin of error is. Journalists, politicians and businessmen often dismiss these facts as mere “technical data” and so run the risk of being deceived. For this reason, in most countries such information is a precondition for publishing results, even in the smallest-circulation newspaper. Where these simple rules are observed, survey results are not questioned and sociology really serves society.