Mean

the sum of all the values divided by the number of values

Median

The middle value of a data set when the values are listed in order; with an even number of values, the mean of the two middle values

Mode

the most common value

Range

The difference between the greatest number and the least number in a set of data.

Measures of Central Tendency

used to describe the center of a set of data: mean, median, mode
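
As a rough illustration, the summaries above (mean, median, mode, and range) can be computed with Python's standard `statistics` module. The data set `scores` is made up for this sketch; the median here is the statistical one (the middle value of the ordered data).

```python
import statistics

# Hypothetical data set, chosen only for illustration.
scores = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(scores)          # sum of the values / number of values
median = statistics.median(scores)      # middle of the ordered data (average of 3 and 5 here)
mode = statistics.mode(scores)          # most common value
data_range = max(scores) - min(scores)  # greatest value minus least value

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", data_range)
```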

outlier

a data value that is either much greater or much less than the median

Relative Frequency

the ratio of the frequency of a category to the total frequency

dot plot

a graph that shows each individual data value as a dot above a number line

Histogram

A bar graph depicting a frequency distribution. The height of the bars indicates the frequency of a group of scores.

box plot

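a display of the 5-number summary: a box runs from Q1 to Q3 with a line at the median, and whiskers extend toward the min. and max.; also called a box and whisker plot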

Symmetric

A distribution is symmetric if the halves on either side of the center look approximately like mirror images of each other.

Skewed right

the right tail stretches out farther than the left; the mean is usually greater than the median

Skewed left

the left tail stretches out farther than the right; the mean is usually less than the median

normal distribution

Bell-shaped probability distribution where the frequencies start low, then increase to one or two high frequencies, then decrease to a low frequency. The distribution is approximately symmetric.

Standard Deviation

measures how spread out the data is in relationship to the mean

conditional relative frequency

a relative frequency based only on a specific row or column in a 2-way table: the cell frequency divided by that row or column total
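
A minimal sketch of a conditional relative frequency, assuming a small 2-way table stored as nested dictionaries. The table values and the helper name `conditional_relative_frequencies` are hypothetical, used only for illustration.

```python
# Hypothetical 2-way table: rows are grade levels, columns are lunch choices.
table = {
    "Grade 7": {"Pizza": 30, "Salad": 20},
    "Grade 8": {"Pizza": 15, "Salad": 35},
}

def conditional_relative_frequencies(row):
    """Divide each cell in one row by that row's total."""
    row_total = sum(row.values())
    return {category: count / row_total for category, count in row.items()}

# Condition on one row only: the Grade 7 students.
grade7 = conditional_relative_frequencies(table["Grade 7"])
print(grade7)
```

The same helper works for a column if the table is stored with the columns as the outer keys.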

standardizing

We ________________ to eliminate units

standardized value

Value found by subtracting the mean and dividing by the standard deviation.

shifting

Adding a constant to each data value adds the same constant to the mean, the median, and the quartiles, but does not change the standard deviation or IQR.

Rescaling

Multiplying each data value by a constant multiplies both the measures of position and the measures of spread by that constant (spread is multiplied by the constant's absolute value).
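
The effect of shifting and rescaling can be checked numerically. This is a sketch under stated assumptions: the data are made up, `summarize` is a hypothetical helper, and the quartiles use the `"inclusive"` method of `statistics.quantiles`, which is only one of several quartile conventions.

```python
import statistics

def summarize(values):
    """Return one measure of position and two measures of spread."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"median": median, "iqr": q3 - q1, "sd": statistics.stdev(values)}

data = [1, 2, 3, 4, 5, 6, 7]  # hypothetical values

original = summarize(data)
shifted = summarize([x + 10 for x in data])  # shifting: add a constant
rescaled = summarize([x * 2 for x in data])  # rescaling: multiply by a constant

print("original:", original)
print("shifted: ", shifted)   # median moves by 10; IQR and SD stay the same
print("rescaled:", rescaled)  # median, IQR, and SD are all doubled
```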

Normal Model

A useful family of models for unimodal, symmetric distributions.

parameter

Number that describes a population

Statistic

A number that describes a sample

Z-score

z is the distance of a value from the mean, expressed in units of standard deviation: z = (value - mean) / standard deviation.
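
A short sketch of standardizing: subtract the mean from each value, then divide by the standard deviation. The heights are hypothetical, `z_scores` is an illustrative helper name, and the sample standard deviation (`statistics.stdev`) is an assumption; a course that divides by n would use `statistics.pstdev` instead.

```python
import statistics

def z_scores(values):
    """Standardize each value: (value - mean) / standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # assumption: sample standard deviation
    return [(value - mean) / sd for value in values]

heights_cm = [150, 160, 170, 180, 190]  # hypothetical data, in centimeters
z = z_scores(heights_cm)

# The z-scores have no units, a mean of 0, and a standard deviation of 1.
print([round(score, 2) for score in z])
```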

Far Outlier

A point that is more than 3.0 IQRs from either end of the box in a boxplot.

Comparing Distributions

Consider: shape, center, spread

Comparing Boxplots

Compare Shapes; Compare Medians; Compare IQRs; Check for outliers

timeplot

Displays data that change over time.

Variance

A measure of spread: the average of the squared deviations from the mean. The standard deviation is its square root.
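
To make the link between variance and standard deviation concrete, here is a step-by-step sketch. The data are hypothetical, and dividing by n - 1 assumes the values are a sample rather than a whole population.

```python
import math
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical sample

mean = statistics.mean(data)
squared_deviations = [(x - mean) ** 2 for x in data]

# Assumption: sample variance, so divide by n - 1 rather than n.
variance = sum(squared_deviations) / (len(data) - 1)
sd = math.sqrt(variance)  # standard deviation is the square root of the variance

print("variance:", variance)
print("standard deviation:", sd)
```

The library functions `statistics.variance` and `statistics.stdev` should give the same results for this sample.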

Resistant

A calculated summary is said to be ________________ if outliers have only a small effect on it.

5 Number Summary

Reports the min., Q1, the median, Q3 and the max.

Percentile

The ith percentile is the # that falls above i% of the data.

Interquartile Range (IQR)

The difference between the 3rd and 1st Quartiles: IQR = Q3 - Q1.
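
The 5-number summary, the IQR, and the 3.0 IQR rule for far outliers from this list can be sketched together. The data and the helper names are hypothetical, and the `"inclusive"` quartile method is an assumption; textbooks and calculators use slightly different quartile conventions, so Q1 and Q3 may differ a little from hand calculations.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max)."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)

def far_outliers(values):
    """Values more than 3.0 IQRs beyond either end of the box (Q1 or Q3)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low_fence = q1 - 3.0 * iqr
    high_fence = q3 + 3.0 * iqr
    return [v for v in values if v < low_fence or v > high_fence]

data = [1, 2, 3, 4, 5, 6, 7, 40]  # hypothetical data with one extreme value

print(five_number_summary(data))
print(far_outliers(data))
```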

outliers

Values that are very unusual in the sense that they are very far away from most of the data.

skewed

Distribution is _____________________ if it's not symmetric and 1 tail stretches out farther than the other.

tails

The parts that typically trail off on either side.

uniform

A distribution that's roughly flat.

Unimodal

1 mode

Bimodal

2 modes

multimodal

More than 2 modes

spread

A numerical summary of how tightly the values are clustered around the center. Measures: IQR, Standard Dev.

Center

The value that describes the middle of a distribution. Measures: mean, median.

Shape

To describe the _________ of a distribution, look for: single vs. mult. modes; symmetry vs skewness; outliers and gaps.

dotplot

Consists of a graph in which each data value is plotted as a point (or dot) along a scale of values. Dots representing equal values are stacked.

Stem and Leaf Display

Shows quantitative data values in a way that sketches the distribution of the data.

Gap

A region of the distribution where there are no values.

Frequency Table (Relative Frequency Table)

Lists the categories in a categorical var. and gives the count (or percentage) of observations for each category.
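
A frequency table and its relative frequency version can be built in a few lines. This is only a sketch: the survey responses are made up, and `collections.Counter` is one convenient way to tally categories.

```python
from collections import Counter

# Hypothetical responses to "What pet do you have?"
responses = ["dog", "cat", "dog", "fish", "dog", "cat"]

counts = Counter(responses)   # frequency of each category
total = sum(counts.values())  # total frequency
relative = {category: count / total for category, count in counts.items()}

# Print the table: category, count, percentage.
for category, count in counts.most_common():
    print(f"{category:<6}{count:>3}{relative[category]:>8.1%}")
```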

distribution

The _____________________________ of a var. gives: possible values of the variable; the relative frequency of each value.

area principle

In a statistical display, each data value should be represented by the same amount of area.

bar chart

Shows a bar whose area represents the count (or percentage) of observations for each category of a categorical variable.

Pie Chart

Graphical representation of data in the form of a circle containing wedges.

Contingency Table

Displays counts and, sometimes, percentages of individuals falling into named categories on 2 or more var.

Marginal Distribution

In a contingency table, the distribution of either var. alone.

Conditional Distribution

The distribution of a var. restricting the who to consider only a smaller group of individuals.

Independence

Variables are ________________ if the conditional distribution of one variable is the same for each category of the other.
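
The contingency-table ideas above (marginal distribution, conditional distribution, and the check for independence) can be sketched as follows. The counts are hypothetical and were chosen so the conditional distributions match exactly; the function names and the tolerance `tol` are assumptions of this sketch. Real data rarely match exactly, so in practice "the same" means "about the same".

```python
# Hypothetical contingency table: handedness (rows) by pet preference (columns).
table = {
    "Left-handed":  {"Cat": 10, "Dog": 30},
    "Right-handed": {"Cat": 40, "Dog": 120},
}

def marginal_distribution(table):
    """Distribution of the column variable alone, ignoring the rows."""
    grand_total = sum(sum(row.values()) for row in table.values())
    columns = next(iter(table.values())).keys()
    return {c: sum(row[c] for row in table.values()) / grand_total for c in columns}

def conditional_distributions(table):
    """Distribution of the column variable within each row."""
    result = {}
    for label, row in table.items():
        row_total = sum(row.values())
        result[label] = {c: count / row_total for c, count in row.items()}
    return result

def looks_independent(table, tol=1e-9):
    """True if every row has the same conditional distribution (within tol)."""
    distributions = list(conditional_distributions(table).values())
    first = distributions[0]
    return all(abs(d[c] - first[c]) <= tol for d in distributions for c in first)

print(marginal_distribution(table))
print(conditional_distributions(table))
print(looks_independent(table))
```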

Segmented Bar Chart

Displays the conditional distribution of a categorical var. within each category of another var.

Simpson's paradox

When averages are taken across different groups, they can appear to contradict the overall averages.
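
A small numeric sketch may make the paradox easier to see. All counts below are invented for illustration: treatment A has the higher success rate within each group, yet treatment B has the higher rate overall, because the two treatments were applied to the groups in very different proportions.

```python
# Hypothetical (successes, total) counts for two treatments in two groups.
results = {
    "A": {"mild": (8, 10), "severe": (20, 100)},
    "B": {"mild": (70, 100), "severe": (1, 10)},
}

def rate(successes, total):
    return successes / total

def overall_rate(groups):
    """Pool the groups, then compute a single success rate."""
    successes = sum(s for s, _ in groups.values())
    total = sum(t for _, t in groups.values())
    return successes / total

for group in ("mild", "severe"):
    a = rate(*results["A"][group])
    b = rate(*results["B"][group])
    print(f"{group}: A = {a:.2f}, B = {b:.2f}")  # A is higher in each group

overall_a = overall_rate(results["A"])
overall_b = overall_rate(results["B"])
print(f"overall: A = {overall_a:.2f}, B = {overall_b:.2f}")  # B is higher overall
```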

context

Tells who was measured, what was measured, how the data were collected, where the data were collected, and when and why the study was performed.

data

A collection of information gathered for a purpose. Data may be in the form of either words or numbers.

Data Table

An arrangement of data in which each row represents a case and each column represents a variable.

case

Individual about whom or which we have data.

Population

the entire group of items or individuals from which a sample is taken (the entire American _______________________; New Jersey is a sample)

sample

a group selected from the population, ideally at random, for the purpose of collecting data, e.g., 3 middle schools in Monmouth County used to determine information about the typical middle school student in Monmouth County

Variable

a characteristic recorded about each case, such as age, height, or favorite color. In a data table, each column holds one variable.

units

A quantity or amount adopted as a standard of measurement, such as dollars, hours, or grams.

categorical variable

A variable that names categories (words/numbers)

Quantitative Variable

A variable in which the numbers act as numerical values; it always has units.

Random Phenomenon

A phenomenon is random if we know what outcomes could happen, but not which particular values will happen.

Trial

each result/observation of an experiment, such as one roll of a number cube.

outcome

any one of the possible results of an action

Event

a single outcome or a group of outcomes

sample space

the set of all possible outcomes

Law of Large Numbers

As the number of trials in a probability experiment increases, the experimental probability approaches the theoretical probability.
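
The Law of Large Numbers can be illustrated with a simulation. This is a sketch, not a proof: it simulates a fair coin (theoretical probability of heads 0.5), and the function name, checkpoints, and fixed `seed` are assumptions made so the run is repeatable.

```python
import random

def running_relative_frequency(num_trials, checkpoints, seed=0):
    """Simulate fair coin flips; record the proportion of heads at each checkpoint."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = 0
    recorded = {}
    for trial in range(1, num_trials + 1):
        if rng.random() < 0.5:  # heads with probability 0.5
            heads += 1
        if trial in checkpoints:
            recorded[trial] = heads / trial  # experimental probability so far
    return recorded

frequencies = running_relative_frequency(100_000, {10, 100, 1_000, 10_000, 100_000})
for trials, frequency in sorted(frequencies.items()):
    print(f"{trials:>7} trials: proportion of heads = {frequency:.4f}")
```

With few trials the proportion can be far from 0.5; with many trials it tends to settle close to 0.5.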

Empirical Probability

The probability comes from the long-run relative frequency of the event's occurrence.

Theoretical Probability

The probability comes from a model, such as equally likely outcomes: the number of favorable outcomes divided by the total number of possible outcomes.

Personal Probability

When the probability is subjective and represents your personal degree of belief.

observational study

The researcher observes the experimental units in their natural setting and records the variable(s) of interest. The researcher makes no attempt to control any aspect of the experimental units.

retrospective study

An observational study in which subjects are selected and then their previous conditions or behaviors are determined.

prospective study

An observational study in which subjects are followed to observe future outcomes.

Experiment

an organized procedure for testing a hypothesis.

factor

A variable whose levels are controlled by the experimenter.

response

The variable whose values are compared across different treatments; also called the dependent variable.

experimental units

Individuals on whom an experiment is performed.

level

The specific values that the experimenter chooses for a factor.

treatment

The process, intervention, or other controlled circumstance applied to randomly assigned experimental units.
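
Random assignment of experimental units to treatments can be sketched as a shuffle followed by dealing the units out in turn. The unit names, treatment labels, and the helper `assign_treatments` are hypothetical; the optional `seed` only makes the example repeatable.

```python
import random

def assign_treatments(units, treatments, seed=None):
    """Randomly assign each unit to a treatment, keeping group sizes as equal as possible."""
    rng = random.Random(seed)
    shuffled = list(units)  # copy so the original list is not changed
    rng.shuffle(shuffled)   # random order removes any pattern in who gets what
    return {
        unit: treatments[i % len(treatments)]  # deal units out in turn
        for i, unit in enumerate(shuffled)
    }

units = ["Ana", "Ben", "Cal", "Dee", "Eli", "Fay"]  # hypothetical subjects
assignment = assign_treatments(units, ["new drug", "placebo"], seed=1)

for unit, treatment in sorted(assignment.items()):
    print(unit, "->", treatment)
```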

Principles of Experimental Design

Control; Randomize; Replicate; Block

Control group

Consists of the units who are not to receive the treatment that is the focus of the experiment

placebo effect

The tendency of many human subjects to show a response even when administered a placebo.

blinding

Keeping any individual associated with an experiment unaware of how subjects have been allocated to treatment groups.

Placebo

A treatment known to have no effect.

Confounding

Levels of one factor are associated with the levels of another factor in such a way that their effects cannot be separated.