Individuals

The objects described by a set of data.

Variable

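A symbol, usually a letter such as x or y, that represents a number (called its value) that is arbitrary, not fully specified, or unknown. In statistics, a variable is any characteristic of an individual that can take different values for different individuals.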

Population

The entire group of items or individuals from which a sample is taken (the entire American _______________________, New Jersey is a sample).

Sample

A group selected from the population, ideally at random, for the purpose of collecting data, e.g., 3 middle schools in Monmouth County chosen to learn about the typical middle school student in Monmouth County.

Categorical Variable

A variable that places each individual into a category. The category names can be words or numbers used only as labels.

Quantitative Variable

A variable whose values are numbers that act as numerical quantities, so arithmetic such as averaging makes sense. These values usually have units.

Continuous Variable

A quantitative variable that can take any real numerical value over an interval.

Discrete Variable

A quantitative variable that can take only separate, countable values (often whole numbers), e.g., the number of petals on a flower.

Nominal Variable

A qualitative (categorical) variable whose categories have no natural order, e.g., flower color.

Ordinal Data

Data that can be ranked, like star ratings. Although they can be ranked, they are not true quantitative variables because the intervals between consecutive ranks are often not identical.

Spreadsheet

A way of recording data, usually as a table in which each row is an individual and each column is a variable.
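
The row-and-column layout can be sketched in Python as a list of records. This is only an illustration: the names, variables, and values below are made up, not taken from any real data set.

```python
# Hypothetical data table: each dict is one row (an individual),
# and each key is one column (a variable). All values are made up.
table = [
    {"name": "Ana", "grade": 7, "height_cm": 152.0, "favorite_color": "blue"},
    {"name": "Ben", "grade": 8, "height_cm": 160.5, "favorite_color": "red"},
    {"name": "Cam", "grade": 7, "height_cm": 149.0, "favorite_color": "blue"},
]

# The variables are the column names.
variables = list(table[0].keys())

# Reading down one column gives every individual's value for one variable.
heights = [row["height_cm"] for row in table]

print(variables)
print(heights)
```

Here `favorite_color` is a categorical variable and `height_cm` is a quantitative variable.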

Exploratory Data Analysis

An examination of data in order to describe their main features. It is usually a good idea to start by examining each variable on its own, then the relationships among variables, and to begin with a graph before computing numerical summaries.

Distribution

The _____________________________ of a variable gives: the possible values of the variable; the relative frequency of each value.

Distribution of a Categorical Variable

This lists the categories and gives either the count or the percent of individuals that fall into each category.

Frequency

The number of times a particular value or category occurs in the data.

Relative Frequency

the ratio of the frequency of a category to the total frequency
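
A minimal sketch of frequency and relative frequency, assuming a small made-up list of flower colors. The variable names are illustrative only.

```python
from collections import Counter

# Hypothetical sample: flower colors recorded for 10 plants (made-up data).
colors = ["red", "blue", "red", "white", "red",
          "blue", "white", "red", "blue", "red"]

# Frequency: how many times each category occurs.
frequency = Counter(colors)

# Relative frequency: each frequency divided by the total frequency.
total = sum(frequency.values())
relative_frequency = {c: n / total for c, n in frequency.items()}

for category, count in frequency.most_common():
    print(f"{category}: frequency={count}, "
          f"relative frequency={relative_frequency[category]:.2f}")
```

The relative frequencies of all categories add up to 1 (or 100% when written as percents).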

Roundoff Error

Small errors introduced by rounding values up or down, which can make reported figures look inconsistent, e.g., percents that do not total exactly 100%.
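
As a small illustration, assuming three made-up categories with equal counts, rounding each percent to a whole number makes the total fall short of 100:

```python
# Hypothetical counts for three equally common categories (made-up data).
counts = {"A": 1, "B": 1, "C": 1}
total = sum(counts.values())

# Exact percents: each is 33.333...%
exact = {k: 100 * v / total for k, v in counts.items()}

# Rounding each percent to a whole number drops about 0.33 every time.
rounded = {k: round(p) for k, p in exact.items()}

print(rounded)                # {'A': 33, 'B': 33, 'C': 33}
print(sum(rounded.values()))  # 99, not 100
```

The data are not wrong here; the missing 1% is roundoff error.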

Bar graphs

What graphs are appropriate for categorical data?

Bar graph

Bars do not touch; the categories are typically on the horizontal axis. To describe a bar graph, comment on which categories occurred most often and least often.

Pie Chart

Graphical representation of data in the form of a circle containing wedges.

Pareto Chart

Bar graph for qualitative data, with the bars arranged in descending order of frequency (tallest bar first).
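
A rough text sketch of the Pareto ordering, using made-up complaint categories. A real chart would be drawn with a plotting tool; this only shows how the bars are ordered.

```python
from collections import Counter

# Hypothetical complaint categories (made-up data).
complaints = ["late", "damaged", "late", "wrong item", "late", "damaged"]

# most_common() returns (category, frequency) pairs, most frequent first,
# which is the bar order a Pareto chart uses.
ordered = Counter(complaints).most_common()

for category, count in ordered:
    print(f"{category:<12}{'#' * count} ({count})")
```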

Histogram

A graph depicting the frequency distribution of a quantitative variable; unlike a bar graph, adjacent bars touch. The height of each bar indicates the frequency of a group (bin) of values.

Creating a Histogram

Bin sizes are important; rule of thumb says start with 5-10 bins and refine accordingly.
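
One way to experiment with the number of bins is sketched below, assuming equal-width bins that run from the smallest to the largest value. The scores are made up and `bin_counts` is a hypothetical helper, not a library function.

```python
# Hypothetical test scores (made-up data).
scores = [52, 55, 61, 64, 67, 68, 70, 71, 73, 75, 78, 80, 84, 88, 95]

def bin_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins."""
    low, high = min(values), max(values)
    if high == low:
        # All values are identical, so everything goes in the first bin.
        return [len(values)] + [0] * (num_bins - 1)
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The largest value sits on the upper edge; keep it in the last bin.
        index = min(int((v - low) / width), num_bins - 1)
        counts[index] += 1
    return counts

# Compare a coarse and a fine choice of bins.
for num_bins in (5, 10):
    print(num_bins, bin_counts(scores, num_bins))
```

Too few bins hide the shape of the distribution, and too many leave most bins nearly empty, so it is worth trying more than one choice.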

Interpreting Histograms

The overall pattern of a histogram can be described in terms of its shape, center, and spread.

Histogram Shape

Can be unimodal (single-peaked) or bimodal (double-peaked).

Histogram Center

The distribution around the center can be symmetric, left-skewed (the left tail extends out farther), or right-skewed (the right tail extends out farther). Skewness is usually described only for unimodal graphs.

Histogram Spread

How far the values extend, e.g., the range from the smallest to the largest value; the range is strongly affected by outliers.

Outlier

A data value that falls outside the overall pattern: either much greater or much less than the median and most of the other values.

Back-To-Back Stemplot

A stemplot with two sets of leaves sharing a common stem, one set on the right and one on the left; used to compare two distributions.

dotplot

Consists of a graph in which each data value is plotted as a point (or dot) along a scale of values. Dots representing equal values are stacked.
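
A rough text version of a dotplot, assuming made-up single-digit data so that the columns line up with the scale labels:

```python
from collections import Counter

# Hypothetical data: number of siblings reported by 9 students (made-up).
values = [1, 2, 2, 3, 3, 3, 4, 4, 6]
counts = Counter(values)

# The scale covers every whole number from the minimum to the maximum.
scale = range(min(values), max(values) + 1)

rows = []
# Build from the top row down: a dot is drawn in a column when that
# value occurs at least `level` times, so equal values stack up.
for level in range(max(counts.values()), 0, -1):
    rows.append("".join("*" if counts[v] >= level else " " for v in scale))
rows.append("".join(str(v) for v in scale))  # scale labels (single digits)

print("\n".join(rows))
```

The gap above 5 and the lone dot at 6 are easy to see in this layout.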

Time Plots

Plots each observation against the time at which it was measured. Time is always on the horizontal scale, and the variable on the vertical scale. Time plots can reveal trends, cycles, and other patterns.

Time Series Data

Data collected over time.

Mean

the sum of all the values divided by the number of values
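
A short check of the definition, using made-up values:

```python
import statistics

# Hypothetical data values (made-up).
values = [4, 8, 15, 16, 23, 42]

# Mean = sum of all the values divided by the number of values.
mean = sum(values) / len(values)

print(mean)                     # 18.0
print(statistics.mean(values))  # the standard library gives the same result
```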

Frequency Density

frequency/class width
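
Frequency density matters when the classes have unequal widths. A minimal sketch, assuming three made-up classes:

```python
# Hypothetical classes with unequal widths (made-up data):
# (lower bound, upper bound, frequency)
classes = [(0, 10, 5), (10, 20, 12), (20, 50, 9)]

densities = []
for lower, upper, frequency in classes:
    class_width = upper - lower
    # Frequency density = frequency / class width.
    density = frequency / class_width
    densities.append(density)
    print(f"{lower}-{upper}: frequency={frequency}, "
          f"width={class_width}, density={density:.1f}")
```

The widest class (20-50) has a fairly large frequency but the lowest density, because its 9 observations are spread over a width of 30.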

Negative Correlation

y tends to decrease as x increases

Positive Correlation

y tends to increase as x increases
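
The direction of a correlation can be checked numerically. The sketch below uses Pearson's correlation coefficient r, which these notes do not define and which is assumed here as one common measure. The data are made up and `correlation` is a hypothetical helper.

```python
from math import sqrt

def correlation(xs, ys):
    """Pearson's r for two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical paired data (made-up).
x = [1, 2, 3, 4, 5]
y_rising = [2, 4, 5, 4, 6]   # y tends to increase as x increases
y_falling = [9, 7, 6, 4, 1]  # y tends to decrease as x increases

r_up = correlation(x, y_rising)
r_down = correlation(x, y_falling)

print(f"rising:  r = {r_up:.2f} (positive)")
print(f"falling: r = {r_down:.2f} (negative)")
```

A positive r matches a positive correlation and a negative r matches a negative correlation. Values near 0 indicate little linear relationship.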