Finding Data

Strategies and resources for finding data across the social sciences, including opinion surveys.

Your Research Question

Define Your Research Question

Try to state your research question without describing the sources or data you will use to answer the question.

Think about Your Method of Analysis

What sort of analysis do you plan to do? Do you need data or statistics to illustrate points? Will you be using a statistical package, such as R, SPSS, SAS, or Stata, for your analysis?

Defining Your Topic and Unit of Analysis

When you define your topic and unit of analysis, you should look at your research question and ask:

  • What are the specifics of the data I need to use to answer my research question? What is my topic? What unit of analysis, geographic unit, and time unit (frequency) do I need? Do I need time series data?

Define Your Topic

Use specific language when defining your topic. This will help you identify a variable or variables.

Examples:

  • I'm looking for the percentage of people living below the poverty line in areas where hurricanes frequently hit.

Identify Unit of Analysis

Who or what is being described by your variable(s)?

Examples:

  • Individuals, families, households
  • Institutions (companies, schools, non-profits, health facilities)
  • Products (commodities, stocks, currencies)

Identify Time Frame and Frequency

For what point or period in time do you need this information about the people, institutions, or products you identified? How often do you need it measured?

Examples:

  • As recent as possible, plus data from 10 and 20 years before that
  • Every month in 1995 and 1996

Identify Geographic Unit

What part of the world is your research question concerned with?

Examples:

  • Counties in Michigan
  • Countries currently in the EU
  • Businesses headquartered in China

Identify Whether This Is Time Series Data

Are you looking for data collected at regular intervals over time? Identifying the structure of the data you need may help as you search.

  • Cross-sectional: data collected at a single point in time for several individuals
  • Longitudinal/Panel: data collected at a sequence of time points for each of a sample of individuals
  • Time Series: data collected at a sequence of time points, usually at a uniform frequency
  • Pooled cross-sectional time series: a mixture of time series data and cross-sectional data

* Adapted from Barbara Mento's guide to Finding Data at Boston College
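
If you already have a candidate dataset, a quick check of how many units and time points it covers can suggest which of the four structures listed above it resembles. The following is a minimal sketch in Python, not part of the adapted guide. The function name classify_structure, the county names, and the years are all hypothetical, and the rule used to separate panel from pooled data (every unit observed at every time point) is a simplifying assumption. It also does not check whether the time points are evenly spaced.

```python
from collections import defaultdict


def classify_structure(observations):
    """Guess the structure of a dataset from (unit, period) pairs.

    `observations` is an iterable of (unit, period) tuples, for example
    ("Wayne", 2020). This is a rough heuristic, not a formal definition.
    """
    # Collect the set of time points observed for each unit.
    periods_by_unit = defaultdict(set)
    for unit, period in observations:
        periods_by_unit[unit].add(period)

    if not periods_by_unit:
        return "empty"

    all_periods = set().union(*periods_by_unit.values())
    n_units = len(periods_by_unit)
    n_periods = len(all_periods)

    if n_periods == 1:
        # One time point: many units make a cross-section.
        return "cross-sectional" if n_units > 1 else "single observation"
    if n_units == 1:
        # One unit followed over several time points.
        return "time series"
    # Assumption: treat the data as a panel only when every unit
    # appears at every time point (a balanced panel).
    if all(periods == all_periods for periods in periods_by_unit.values()):
        return "longitudinal/panel"
    return "pooled cross-sectional time series"


# Hypothetical example: two Michigan counties observed in two years.
sample = [
    ("Washtenaw", 2019), ("Washtenaw", 2020),
    ("Wayne", 2019), ("Wayne", 2020),
]
print(classify_structure(sample))
```

Real panels are often unbalanced (some units miss some time points), so treat the result as a starting point and confirm it against the dataset's documentation.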

 

Data vs. Statistics

  • Data is the raw information from which statistics are created; statistics provide an interpretation and summary of data.
  • To make sense of data, you will likely need a statistical software program (SPSS, SAS, Stata, etc.) to analyze it.
  • Statistics, on the other hand, are often easy to use and understand because they have already been processed.
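
The distinction can be illustrated with a short Python sketch that turns raw records (data) into summary figures (statistics). Everything here is hypothetical: the household records and incomes are invented for illustration, the threshold is an assumed value rather than an official poverty line, and summarize is a made-up helper name.

```python
from statistics import median

# Data: one raw record per household (values invented for illustration).
households = [
    {"id": 1, "income": 18000},
    {"id": 2, "income": 52000},
    {"id": 3, "income": 23500},
    {"id": 4, "income": 75000},
    {"id": 5, "income": 31000},
]

# Assumed threshold for this example only, not an official figure.
POVERTY_LINE = 25000


def summarize(records, threshold):
    """Reduce raw household records to a few summary statistics."""
    incomes = [record["income"] for record in records]
    below = sum(1 for income in incomes if income < threshold)
    return {
        "n": len(incomes),
        "median_income": median(incomes),
        "pct_below_threshold": 100 * below / len(incomes),
    }


# Statistics: a processed summary of the raw data above.
stats = summarize(households, POVERTY_LINE)
print(stats)
```

A published table would typically give you only figures like those in stats. With the underlying records, you can choose a different threshold or grouping and recompute the summary yourself.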
Last Updated: Aug 24, 2022 5:29 PM