Dr. Susan A. Adams, LPC, NCC

(662) 846-4360

Email: sadams@dsu.deltast.edu

"Cheat Sheet" of Assessment Terminology

 

This page was created by Dr. Adams as a classroom handout. It is NOT designed to be a "textbook definition" list of terminology.  Rather it is to provide a "working" definition of terms for beginning students.

Assessment - any procedure used to gather information about people

Testing - type of assessment that uses specific procedures to obtain information and convert that information to numbers or scores  

Measures of Central Tendency - mean, median, mode; generic term includes sample statistics and population parameters that indicate middle of a score distribution

Mean - average of scores (add up all the scores and divide the total by the number of scores)

Median - middle score when the scores are placed in order (if there is an even number of scores, add the middle two scores and divide by 2 to get the middle)

Mode - most frequent score  
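
A quick way to check the three measures of central tendency is to run them on a small set of scores. The sketch below uses Python's built-in statistics module; the scores are made up for illustration.

```python
from statistics import mean, median, multimode

# Hypothetical test scores for six students (already in order).
scores = [62, 70, 78, 82, 82, 94]

average = mean(scores)       # 468 total / 6 scores = 78
middle = median(scores)      # even count: (78 + 82) / 2 = 80
most_frequent = multimode(scores)  # list of the most frequent score(s): [82]

print(average, middle, most_frequent)
```

multimode returns a list because a set of scores can have more than one mode.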

Measures of Variability - range, mid-range, standard deviation; generic term includes sample statistics and population parameters that indicate the spread of a score distribution

Range - stated as (a) lowest to highest number or (b) single number which is result of subtracting lowest from highest

Mid-range - add highest and lowest numbers and divide by 2
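
The range and mid-range can be worked out the same way. This is a small sketch with made-up scores, not a required procedure.

```python
# Hypothetical test scores.
scores = [62, 70, 78, 82, 82, 94]

lowest, highest = min(scores), max(scores)

range_as_span = (lowest, highest)      # form (a): lowest to highest, 62 to 94
range_as_number = highest - lowest     # form (b): 94 - 62 = 32
mid_range = (highest + lowest) / 2     # (94 + 62) / 2 = 78.0

print(range_as_span, range_as_number, mid_range)
```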

Normal distribution - bell shaped curve; mean, median, and mode will all be same number; 68% of scores will fall within ± 1 standard deviation of the mean in a normal distribution.

Bell shaped curve - normal distribution; mean, median, and mode will all be same number; 68% of scores will fall within ± 1 standard deviation of the mean in a normal distribution.

Distribution - moving outward from the mean one standard deviation at a time, each 1/2 of the curve holds about 34%, 14%, and 2% of the scores (rounded off); 68% of scores will fall within ± 1 standard deviation of the mean in a normal distribution.
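
The 34%, 14%, 2% figures can be checked against the standard normal curve. The sketch below uses NormalDist from Python's statistics module; the variable names are only for illustration.

```python
from statistics import NormalDist

curve = NormalDist()  # standard normal curve: mean 0, standard deviation 1

# Share of scores in each band on ONE side of the mean.
band_0_to_1 = curve.cdf(1) - curve.cdf(0)   # about 0.34
band_1_to_2 = curve.cdf(2) - curve.cdf(1)   # about 0.14
beyond_2 = 1 - curve.cdf(2)                 # about 0.02

# Share of scores within one standard deviation on BOTH sides of the mean.
within_1 = curve.cdf(1) - curve.cdf(-1)     # about 0.68

for share in (band_0_to_1, band_1_to_2, beyond_2, within_1):
    print(f"{share:.0%}")
```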

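Skewness - lopsidedness of a distribution; "tail of the whale": the curve is named for the direction its tail points; the mean is pulled toward the tail, so the median is a better measure of central tendency when there is a large skew

Positively skewed - tail points toward the high scores; more than 1/2 of the scores fall below the mean

Negatively skewed - tail points toward the low scores; more than 1/2 of the scores fall above the mean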
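
To see the mean being pulled toward the tail, compare the mean and median of a small positively skewed set. The scores below are made up; one very high score forms the tail.

```python
from statistics import mean, median

# Hypothetical scores: most are in the 50s, one high score (95) forms the tail.
scores = [50, 52, 53, 55, 56, 58, 95]

average = mean(scores)    # about 59.9, pulled up toward the tail
middle = median(scores)   # 55, not affected by the single high score

# In a positively skewed set, more than half the scores fall below the mean.
below_mean = sum(1 for s in scores if s < average)  # 6 of the 7 scores

print(round(average, 1), middle, below_mean)
```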

Histogram - bar chart that is a picture of frequency distribution; horizontal axis equals score intervals; vertical axis equals frequencies  
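
A rough text version of a histogram can be built by counting how many scores fall in each interval. This sketch uses made-up scores and an assumed interval width of 10; note that the picture is turned sideways, so the score intervals run down the page and the frequencies run across.

```python
from collections import Counter

# Hypothetical test scores and an assumed interval width of 10 points.
scores = [55, 62, 67, 71, 73, 78, 78, 84, 86, 93]
width = 10

# Map each score to the start of its interval (e.g., 78 -> 70) and count.
counts = Counter((s // width) * width for s in scores)

# One bar per interval; the number of # marks is the frequency.
bars = [
    f"{start}-{start + width - 1}: {'#' * counts[start]}"
    for start in sorted(counts)
]
print("\n".join(bars))
```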

Reliability - consistency of scores; how consistently a test measures

Validity - does it measure what it is supposed to measure

A test can have HIGH reliability and still not be valid; but if a test is valid, it will be reliable!

Standard deviation - measure of spread of scores around the mean; most popular measure of spread of scores in distribution; the more spread the larger the standard deviation; requires interval level measurement

Standard score - basic standard score is the z-score; z-scores produce both decimals and negative numbers, which make them difficult to interpret, so other types of standard scores have been developed
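
The sketch below works out a standard deviation and then converts raw scores to z-scores. The scores are made up, and the T-score (mean of 50, standard deviation of 10) is included only as one common example of the "other types" of standard scores; it is not named in this handout. The function names z_score and t_score are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical scores, treated as a whole population.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

m = mean(scores)      # 5
sd = pstdev(scores)   # 2.0 (population standard deviation)

def z_score(raw):
    """Number of standard deviations a raw score is above or below the mean."""
    return (raw - m) / sd

def t_score(raw):
    """Assumed example of another standard score: mean 50, SD 10."""
    return 50 + 10 * z_score(raw)

print(z_score(9))   # 2.0  (two standard deviations above the mean)
print(z_score(4))   # -0.5 (a negative decimal, harder to read)
print(t_score(4))   # 45.0 (same position, no negative or decimal needed)
```

If the scores were a sample rather than the whole population, statistics.stdev would be the function to use instead of pstdev.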

Population - the total of all subjects (scores, data) possessing certain common characteristics that are being studied

Sample - a subgroup of the population; a selection of independent objects, scores, or individuals from a given population, used to compute a statistic

Raw Score - an "uncooked" piece of data; a score that has not yet been converted to a standard score

Standard error of measurement - most common use is to construct bands of confidence around an individual's obtained score; represents the theoretical distribution that would be obtained if an individual were repeatedly tested with a large number of exactly equivalent forms of the same test
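
A confidence band can be sketched as follows. The formula used here (standard deviation times the square root of 1 minus the reliability) is the usual textbook formula and is an addition to this handout; the test values (SD of 15, reliability of .91, obtained score of 100) and the function names are made up for illustration.

```python
from math import sqrt

def standard_error_of_measurement(sd, reliability):
    """Assumed textbook formula: SEM = SD * sqrt(1 - reliability)."""
    return sd * sqrt(1 - reliability)

def confidence_band(obtained, sem, z=1.0):
    """Band of z SEMs on either side of the obtained score."""
    return (obtained - z * sem, obtained + z * sem)

# Hypothetical test: SD of 15 and reliability of .91.
sem = standard_error_of_measurement(sd=15, reliability=0.91)  # about 4.5

band_68 = confidence_band(100, sem)          # about 95.5 to 104.5
band_95 = confidence_band(100, sem, z=1.96)  # about 91.2 to 108.8

print(round(sem, 2), band_68, band_95)
```

The more reliable the test, the smaller the standard error of measurement and the narrower the band.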

Correlation - the degree that two sets of measures are related; how two scores are co-related; sign tells direction

Positive correlation - both go in the same direction together (e.g., height and weight)

Negative correlation - as one goes up the other goes down (e.g., academic ranking and hours left to take)  
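
The sign of a correlation can be seen by computing Pearson's r on two small data sets. Pearson's r is assumed here as the correlation measure, and pearson_r is a hypothetical helper; the numbers are made up, with hours completed versus hours left standing in for the negative example.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

# Hypothetical data: taller people tend to weigh more (positive).
heights = [60, 63, 66, 69, 72]
weights = [115, 130, 150, 165, 190]

# Hypothetical data: the more hours completed, the fewer left (negative).
hours_completed = [30, 60, 90, 120]
hours_left = [90, 60, 30, 0]

r_positive = pearson_r(heights, weights)            # close to +1
r_negative = pearson_r(hours_completed, hours_left)  # -1 (perfect negative)

print(round(r_positive, 2), round(r_negative, 2))
```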

Aptitude Tests - measure capacity to learn

Achievement Tests - measure what a person already knows

Standardized assessments - standardized procedures include specific criteria for test construction, administration, and interpretation; tests must be administered and scored according to specified procedures; testing conditions must be uniform for all participants; scores are objective and usually interpreted compared to normative data from representative sample

Nonstandardized assessments - produce results that are less dependable; may include interviews, essays, or other forms of less dependable procedures  

Qualitative assessment - produces a verbal description of a person's behavior or situation that can be placed into one of several categories

Quantitative assessment - yields a specific score on a continuous scale; includes most psychological tests

Percentile Rank - a score expressed in terms of the percentage of persons in the comparison group who fall below that score when the scores are placed in rank order (e.g., a percentile rank of 75 says that the score is as high or higher than 75% of those in the comparison group)
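
A percentile rank can be worked out by counting. This sketch uses the simple "percentage who fall below" rule; other counting rules exist (for example, giving half credit for tied scores), so treat it as one possible version. The comparison group and the function name percentile_rank are made up.

```python
def percentile_rank(score, comparison_group):
    """Percentage of the comparison group scoring below the given score."""
    below = sum(1 for s in comparison_group if s < score)
    return 100 * below / len(comparison_group)

# Hypothetical comparison group of eight scores.
group = [55, 60, 65, 70, 75, 80, 85, 90]

rank = percentile_rank(85, group)  # 6 of 8 scores fall below 85 -> 75.0
print(rank)
```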

Grade Equivalents - number representing a grade followed by a decimal representing the month within the 10 months of the school year; easier to interpret without understanding of measurement concepts (e.g., a score of 9.3 for a sixth grader would represent the score the average ninth grader in the third month of school would make if he or she took the same test)

Test-Retest Reliability - measures consistency over time

Alternate Form Reliability - equivalent forms of the same test; difficult to create two good forms; need national testing program that includes field tested sample items in each administration of the test which are not scored

Split-half Reliability - obtained from a single administration of the test; divides the test into halves and compares results; the split is not top 1/2 and bottom 1/2 because of fatigue and effects of practice; most common split is even / odd items; a larger number of items promotes more stability in scores
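
An odd / even split can be sketched as follows. The item responses are made up (1 = correct, 0 = incorrect), Pearson's r is assumed as the way to compare the halves, and pearson_r is a hypothetical helper. The last step applies the Spearman-Brown formula, which is not named in this handout; it is a common way to estimate the reliability of the full-length test from the two shorter halves.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_variation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_variation / (spread_x * spread_y)

# Hypothetical responses: one row per test taker, one column per item.
responses = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0],
]

# Items 1, 3, 5 form the odd half; items 2, 4, 6 form the even half.
odd_totals = [sum(row[0::2]) for row in responses]
even_totals = [sum(row[1::2]) for row in responses]

half_r = pearson_r(odd_totals, even_totals)

# Spearman-Brown step (an addition, not from the handout).
full_test_r = 2 * half_r / (1 + half_r)

print(round(half_r, 2), round(full_test_r, 2))
```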

Interitem Consistency - measure of internal consistency that assesses the extent to which the items on the test are related to the total score on the test and also are related to other items on the test  

Content Validity - do these items measure what the test is supposed to be measuring

Face Validity - not really validity, but do the items appear to be measuring what they are supposed to measure; judgment about appropriateness is done by test taker

Criterion-related Validity - ability of the test to predict performance on another measure; important for selection

Concurrent Validity - type of criterion related validity; measured against a criterion that is available at about the same time the test is given; usually used to estimate some type of current behavior rather than behavior in the future (e.g., ability to do the work of a computer technician)

Predictive Validity - type of criterion related validity; predict performance in the future (e.g., SAT predictive validity for college academic performance)

Construct Validity - test is related to things it is supposed to and NOT related to others

Congruent Validity - type of construct validity; correlation (co-relate) indicates the extent to which scores on the test being analyzed predict scores on established tests (e.g., new measure of anxiety should correlate HIGHLY with other measures of anxiety)

Convergent validity - type of construct validity; extent to which scores on the test correlate with scores on tests of related constructs (e.g., if research indicates a relationship between depression and anxiety, then scores on an anxiety measure should correlate positively with measures of depression)

Discriminant Validity - type of construct validity; scores should not show high correlation with other tests that are supposed to be different (e.g., scores of math ability should not be highly correlated with clerical speed or accuracy)

Internal consistency - type of construct validity; usually indicates reliability of items in the test to each other and to the total score  

Nominal data - names; labels that serve to identify different categories; does not relate to amount or quantity (e.g., gender, race)

Ordinal data - ranks of scores; in a ranking, a lower number represents more of the construct than a higher number OR said another way a higher number represents less of the construct than a lower number (e.g., a student who ranks first in class has higher academic achievement than a student who ranks second; freshman, sophomore, junior, senior)

Interval data - most commonly used; continuous data, but does not have "absolute zero"; a given interval is the same distance no matter where it is found in the data (e.g., thermometer) (note:  zero degrees does NOT represent absence of temperature)

Ratio data - same as interval except has absolute zero; absolute zero is total absence of the construct being measured (e.g., measuring the length of a piece of paper) (note: zero length means no paper)