# Source and Collection

This lesson distinguishes between the two types of data, primary and secondary, and explains the methods used to collect each. It also describes the different sources of data. Examples are used throughout to make the methods of data collection easier to understand.

## Statistical Data: An Introduction

The collection of data forms the bedrock of data analysis and interpretation. 'Data' are unorganized statistical facts and figures collected for a specific purpose, such as analysis. Data can come from different sources, broadly grouped into statistical and non-statistical sources.

There are also different methods of data collection, depending on the type of data. There are two main types of data: primary and secondary. Understanding the difference between the two is important in deciding which method of data collection to use. Countries continuously carry out large volumes of statistical analysis, both for publication and for framing policy.

## Sources of Data

There are two sources of data in statistics: statistical sources and non-statistical sources. Statistical sources are data collected for official purposes and include censuses and officially conducted surveys. Non-statistical sources are data collected for other administrative purposes or for the private sector.

### Statistical Survey

A statistical survey is normally conducted using a sample, so it is also called a sample survey. It is the method of collecting sample data and analyzing them with statistical methods in order to estimate population characteristics. The advantage is that it gives you full control over the data.

You can ask questions suited to the study you are carrying out. The disadvantage is that sampling error can creep in, because only a sample is studied rather than the entire population. Leaving some units of the population out of the sample is what causes this error.
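The idea of sampling error can be illustrated with a minimal Python sketch. The population below (household sizes for 1,000 households) is hypothetical and randomly generated, and `sampling_error` is a helper name introduced only for this illustration. It draws one simple random sample and compares the sample mean with the population mean.

```python
import random
import statistics

def sampling_error(population, sample_size, seed=0):
    """Return (sample_mean, population_mean, error) for one simple random sample."""
    rng = random.Random(seed)                     # fixed seed so the sketch is repeatable
    sample = rng.sample(population, sample_size)  # draw units without replacement
    sample_mean = statistics.mean(sample)
    population_mean = statistics.mean(population)
    return sample_mean, population_mean, sample_mean - population_mean

# Hypothetical population: household sizes for 1,000 households.
generator = random.Random(42)
population = [generator.randint(1, 8) for _ in range(1000)]

# A sample survey: only 50 of the 1,000 units are studied.
sample_mean, population_mean, error = sampling_error(population, sample_size=50)
print(f"sample mean = {sample_mean:.2f}, "
      f"population mean = {population_mean:.2f}, error = {error:+.2f}")

# If every unit is included, no unit is left out and the error disappears.
_, _, full_error = sampling_error(population, sample_size=len(population))
print(f"error when all units are studied = {full_error:+.2f}")
```

The gap between the two means in the first case is the sampling error for that particular sample. A different sample would generally give a different gap.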

### Census

In contrast to a sample survey, a census collects data on every unit of the population and then analyzes them. Data collection covers a specific reference period. For example, the Census of India is conducted every 10 years. Other censuses are conducted roughly every 5-10 years. Data are collected using questionnaires that may be mailed to the respondents.

Responses can also be collected over other modes of communication, such as the telephone. An advantage is that even the most remote units of the population are included in the census method. The major disadvantages are the high cost of data collection and the time the process takes.

### Register

Registers are storehouses of statistical information from which data can be collected and analyzed. Registers tend to be detailed and extensive. Data drawn from registers are useful because they are reliable. Two or more registers can be linked together on the basis of common information to yield even more relevant data.

From agriculture to business, all industries maintain registers for record-keeping. Some administrative registers also serve the purpose of acting as a repository of data for other statistical bodies in a country.
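Linking registers on common information can be sketched in Python as follows. Both registers, the shared farm identifier, and the `link_registers` helper are hypothetical and exist only for illustration. Real register linkage usually needs more care, for example when identifiers are missing or inconsistent.

```python
# Hypothetical land register, keyed by a shared farm identifier.
land_register = {
    "F001": {"owner": "Owner A", "area_hectares": 2.5},
    "F002": {"owner": "Owner B", "area_hectares": 4.0},
    "F003": {"owner": "Owner C", "area_hectares": 1.2},
}

# Hypothetical crop register that uses the same identifier.
crop_register = {
    "F001": {"crop": "wheat", "yield_tonnes": 7.5},
    "F003": {"crop": "rice", "yield_tonnes": 4.8},
    "F004": {"crop": "maize", "yield_tonnes": 3.1},
}

def link_registers(left, right):
    """Combine records whose identifier appears in both registers."""
    common_keys = sorted(left.keys() & right.keys())
    return {key: {**left[key], **right[key]} for key in common_keys}

linked = link_registers(land_register, crop_register)

# The linked records support a figure neither register holds alone.
for farm_id, record in linked.items():
    yield_per_hectare = record["yield_tonnes"] / record["area_hectares"]
    print(farm_id, record["owner"], record["crop"], round(yield_per_hectare, 2))
```

Only farms present in both registers appear in the result. Yield per hectare becomes available only after linking, because area and yield are recorded in different registers.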

## Types of Data and Data Collection

As stated earlier, there are two types of data: primary and secondary.

### Primary data

As the name suggests, primary data are first-hand information collected by the surveyor. The data so collected are pure and original, and are collected for a specific purpose. They have never undergone any statistical treatment before. The collected data may be published as well. The Census is an example of primary data.

Methods of primary data collection:

1. Personal investigation: The surveyor collects the data personally. Data collected this way are reliable, but the method is suited to small projects.
2. Collection via investigators: Trained investigators are employed to contact the respondents and collect data.
3. Questionnaires: Questionnaires may be used to ask specific questions that suit the study and to get responses from the respondents. These questionnaires may be mailed as well.
4. Telephonic investigation: Data are collected by asking questions over the telephone, which gives quick and accurate information.

### Secondary data

Secondary data are the opposite of primary data. They have already been collected and published (by some organization, for instance). Surveyors can use them as a source from which to draw data and conduct their analysis. Secondary data are impure in the sense that they have undergone statistical treatment at least once.

Methods of secondary data collection:

1. Official publications from bodies such as the Ministry of Finance, statistical departments of the government, federal bureaus, agricultural statistical boards, etc.
2. Semi-official sources such as the State Bank, Boards of Economic Enquiry, etc.
3. Articles in newspapers, journals, and technical publications.

## Solved Example for You

Question: Differentiate between primary and secondary data.

Answer: Primary data are first-hand information collected directly from the units being surveyed. They are pure in the sense that they have not yet undergone any statistical treatment, and they are collected for a particular purpose. Secondary data, on the other hand, are second-hand data, obtained from a source that originally collected them as primary data. They have therefore undergone statistical treatment and are classified as impure or not original. Thus, the main difference between primary and secondary data lies in whether the data have changed hands.

