There is no single "best" puppy temperament test. There are several proven methods, each developed in a different context, each measuring slightly different things, and each with real limitations. The best breeders understand all of them — and draw from each one.
This guide covers every major method in use today, explains the research behind each, and shows how the BreedTools Puppy Temperament Test combines their strongest elements into a single 5-trait assessment with built-in home matching.
The short version
- Test at exactly 49 days: 7 weeks is the validated window (Scott & Fuller, 1965) — old enough to show real tendencies, before experience reshapes them.
- Use an unfamiliar tester: Not the breeder. One puppy at a time, in a new enclosed area, before feeding.
- 10 tests, scored 1–6: 1 = boldest/most dominant, 6 = most sensitive/withdrawn. Volhard expanded Campbell's original 5 to 10.
- Retrieving predicts trainability best: Guide-dog research (Goddard & Beilharz, 1986) found the retrieve is the single strongest predictor of training success.
- Recovery beats reaction: It's not whether a puppy startles (most do) — it's how fast it recovers. One of the strongest stability predictors.
- It's a snapshot, not destiny: Genetics, socialization, and training all shape the adult. Use results to inform placement, not guarantee it.
The four major methods
1. Campbell's Puppy Test (1972)
William Campbell published the first standardized puppy behavior evaluation in 1972 in Modern Veterinary Practice, based on research he began in the early 1970s, and later included it in his 1975 book Behavior Problems in Dogs. His test consists of 5 exercises (social attraction, following, restraint, social dominance, and elevation), all performed by an unfamiliar tester on puppies at approximately 7 weeks of age.
Campbell's contribution was foundational. He established the core principles that every later method built on: test at a specific developmental window, use an unfamiliar tester, test individually in an unfamiliar environment, and score on a standardized scale.
What it does well: Reliable measurement of social orientation and dominance tendencies. The social attraction and following tests remain the most widely used exercises across all methods.
Where it falls short: Only 5 tests, all focused on social behavior. No measurement of environmental sensitivity, startle recovery, or trainability. Campbell himself acknowledged the test was a starting point, not a complete assessment.
Campbell, W.E. (1972). A behavior test for puppy selection. Modern Veterinary Practice, 53(12), 29-33.
2. Volhard Puppy Aptitude Test (1980s)
Wendy and Jack Volhard expanded Campbell's 5 tests into a more comprehensive 10-test battery they called the Puppy Aptitude Test (PAT). They kept Campbell's original 5 social/dominance tests and added 5 more: retrieving, touch sensitivity, sound sensitivity, sight sensitivity, and stability (startle recovery).
The Volhards introduced the 1-6 scoring scale that is now standard across most methods: 1 being the most dominant/bold response, 6 being the most submissive/sensitive. They also formalized the requirement that testing occur at exactly 49 days of age — not a day earlier or later.
What it does well: The most comprehensive single test available. Covers social behavior, trainability (retrieve), and environmental sensitivity. The 1-6 scale is intuitive and widely understood by breeders.
Where it falls short: The Volhard PAT reports 10 individual scores but provides no framework for grouping them into meaningful behavioral dimensions. A breeder looking at scores of 2, 3, 4, 2, 3, 1, 5, 3, 2, 3 has to interpret the pattern intuitively. The original scoring also doesn't distinguish a puppy that startles and recovers quickly from one that doesn't startle at all, a critical distinction for predicting adult behavior.
Volhard, J. & Volhard, W. (2001). Dog Training for Dummies. Wiley. Original PAT published in various editions of their training materials from the 1980s onward.
3. Swedish Dog Mentality Assessment (DMA, 1997)
The Swedish Kennel Club developed the DMA (Mentalbeskrivning) as a standardized behavioral description for adult and adolescent dogs — but its core insight has been widely adopted into puppy evaluation: startle recovery is one of the strongest single predictors of adult behavioral stability.
The DMA measures multiple behavioral dimensions including curiosity, playfulness, chase instinct, sociability, and — critically — how the dog responds to sudden, surprising stimuli and how quickly it returns to baseline behavior. This "recovery" dimension proved so predictive that guide dog programs around the world incorporated it into their puppy evaluation protocols.
What it does well: The DMA's emphasis on recovery speed introduced a behavioral dimension that other methods either missed or underweighted. A puppy that startles at a loud noise but investigates the source within 5 seconds is behaviorally very different from one that startles and hides for 30 seconds — even though both "startled." The DMA captures that distinction.
Where it falls short: Originally designed for dogs 12+ months old, not 7-week-old puppies. The full DMA involves firearms sounds and other stimuli that aren't appropriate for young puppies. Applying its principles requires adaptation, which is what guide dog programs and the BreedTools approach do.
Svartberg, K. & Forkman, B. (2002). Personality traits in the domestic dog (Canis familiaris). Applied Animal Behaviour Science, 79(2), 133-155. Swedish Kennel Club DMA protocol standardized 1997.
4. Guide Dog Program Evaluations
Organizations like The Seeing Eye, Guide Dogs for the Blind, and Assistance Dogs International have been testing puppy temperament at scale for decades — breeding, testing, and tracking outcomes for thousands of dogs per year. This gives them something no other method has: large-scale outcome data that shows which puppy traits actually predict adult success.
Two findings from guide dog research stand out:
- Retrieving is the single best predictor of trainability. Goddard and Beilharz (1986) found that a puppy's willingness to retrieve and return an object at 7-8 weeks correlated more strongly with adult training success than any other single test. This makes test 6 (retrieving) disproportionately important — something Volhard's equal-weighting approach misses.
- Novel object response predicts environmental confidence. Guide dog programs routinely test puppies' reactions to unfamiliar objects (wobble boards, crinkly surfaces, moving objects) because adult guide dogs must navigate unpredictable environments daily. A puppy that cautiously investigates a novel object shows the balanced curiosity-without-fear that predicts adult stability.
What they do well: Outcome-validated testing at massive scale. These programs know which tests actually predict adult behavior because they track every dog from puppy test through career outcome.
Where they fall short: Their protocols are optimized for service dog selection, not companion placement. A puppy that fails guide dog screening isn't necessarily a bad companion — it may simply be too social, too independent, or too handler-focused for guide work. Breeders need a more flexible framework.
Goddard, M.E. & Beilharz, R.G. (1986). Early prediction of adult behaviour in potential guide dogs. Applied Animal Behaviour Science, 15(3), 247-260.
Serpell, J.A. & Hsu, Y.A. (2005). Effects of breed, sex, and neuter status on trainability in dogs. Anthrozoos, 18(3), 196-207.
What the research actually says about prediction
Here's what breeders need to know about how well these methods work:
| Finding | Study | What It Means for Breeders |
|---|---|---|
| Puppy tests predict adult behavior significantly above chance | Goddard & Beilharz, 1986 | Testing is meaningful — it's not random. Puppies that test bold tend to stay bold; puppies that test sensitive tend to stay sensitive |
| Retrieval at 8 weeks is the strongest single predictor of adult trainability | Goddard & Beilharz, 1986; Wilsson & Sundgren, 1998 | Don't give all 10 tests equal weight. The retrieve test deserves more attention than, say, touch sensitivity |
| Startle recovery predicts adult behavioral stability better than initial startle response | Svartberg & Forkman, 2002; Swedish DMA data | It doesn't matter that a puppy startles — nearly all do. What matters is how fast it recovers. This distinction is critical |
| Fear-related behaviors at 7 weeks show moderate-to-strong continuity into adulthood | Slabbert & Odendaal, 1999 | Puppies showing fearful responses at 7 weeks are unlikely to 'grow out of it' without intervention. Place accordingly |
| Social attraction scores are more stable across testing sessions than sensitivity scores | Beaudet et al., 2004 | Social tests (attraction, following) are your most reliable data points. Sensitivity tests can vary more day to day |
| No single test method perfectly predicts adult temperament | Multiple studies | This is why combining methods works better than using any one alone — each captures different behavioral dimensions |
Where every single method falls short
Even the best methods share common blind spots:
- None match the puppy to the home. Every traditional method produces a puppy profile and stops. The breeder has to mentally translate "mostly 3s and 4s" into "good for families with children." This translation step is where most placement mistakes happen — it depends entirely on the breeder's experience and judgment.
- None organize scores into behavioral dimensions. Getting 10 individual numbers doesn't tell you much unless you know which numbers cluster together and what each cluster predicts. A 2 on social attraction and a 5 on touch sensitivity mean very different things, but raw score sheets treat them the same.
- None weight tests by predictive value. Research clearly shows that retrieving predicts trainability better than touch sensitivity, and that recovery speed predicts stability better than initial startle response. But Volhard's framework gives each test equal weight.
- None account for the home side of the equation. A "bold" puppy isn't inherently good or bad — it's good for an experienced, active home and potentially problematic for a quiet first-time owner. Without profiling the home, the puppy profile is only half the picture.
The BreedTools approach: 5 traits + home matching
The BreedTools Puppy Temperament Test doesn't replace any existing method — it organizes the best elements of all four into a framework that directly serves placement decisions.
Step 1: The same 10 tests, organized into 5 dimensions
The tool uses the full Volhard 10-test protocol with the standard 1-6 scoring scale. But instead of reporting 10 raw numbers, it maps each test to one of 5 validated behavioral trait dimensions:
| Trait Dimension | Tests Included | Based On | What It Predicts |
|---|---|---|---|
| Confidence | Restraint, Social Handling, Elevation | Volhard tests 3-5, Campbell's core protocol | Response to handling, authority, and loss of control — predicts how much structure and experience the owner needs |
| Social Drive | Social Attraction, Following | Campbell (1972), Volhard tests 1-2 | Bond formation, people-orientation, separation tolerance — most stable predictor across testing sessions (Beaudet et al., 2004) |
| Trainability | Retrieving | Guide dog programs, Volhard test 6 | Willingness to work with humans — single strongest predictor of training success (Goddard & Beilharz, 1986) |
| Sensitivity | Touch Sensitivity, Sound Startle | Volhard tests 7-8 | Physical and environmental sensitivity thresholds — predicts training method compatibility and socialization needs |
| Recovery | Startle Recovery, Novel Object | Swedish DMA, Guide dog programs | How quickly the puppy bounces back from surprise — one of the strongest predictors of adult behavioral stability (Svartberg & Forkman, 2002) |
Each dimension draws on decades of research. Grouping tests this way matches validated behavioral constructs rather than arbitrary test numbering.
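As a rough illustration of the grouping in the table above, the sketch below collapses a 10-test score sheet into five dimension scores. The test-to-dimension mapping comes from the table; averaging the scores within each dimension is an assumption made for this example, since the guide does not say how the tool aggregates them. The function name `trait_profile` and the example scores are hypothetical.

```python
from statistics import mean

# Test-to-dimension grouping taken from the table above.
DIMENSIONS = {
    "Confidence": ["Restraint", "Social Handling", "Elevation"],
    "Social Drive": ["Social Attraction", "Following"],
    "Trainability": ["Retrieving"],
    "Sensitivity": ["Touch Sensitivity", "Sound Startle"],
    "Recovery": ["Startle Recovery", "Novel Object"],
}


def trait_profile(scores):
    """Collapse ten 1-6 test scores into five dimension scores.

    Averaging is an assumption for illustration only; the tool's
    actual aggregation rule is not described in this guide.
    """
    expected = {test for tests in DIMENSIONS.values() for test in tests}
    if set(scores) != expected:
        raise ValueError("score sheet must contain exactly the 10 named tests")
    for test, value in scores.items():
        if not 1 <= value <= 6:
            raise ValueError(f"{test}: score {value} is outside the 1-6 scale")
    return {
        dimension: round(mean(scores[test] for test in tests), 2)
        for dimension, tests in DIMENSIONS.items()
    }


# Hypothetical score sheet for one puppy.
example = {
    "Social Attraction": 2, "Following": 3,
    "Restraint": 4, "Social Handling": 3, "Elevation": 3,
    "Retrieving": 2,
    "Touch Sensitivity": 4, "Sound Startle": 3,
    "Startle Recovery": 2, "Novel Object": 3,
}
profile = trait_profile(example)
print(profile)
```

Keeping the mapping in one dictionary means a breeder who prefers a different grouping only has to edit `DIMENSIONS`, not the scoring logic.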
Step 2: Home profiling
This is what no other method provides. The BreedTools test asks 6 questions about the prospective home:
- Experience level — Expert handler, experienced owner, some experience, or first-time owner
- Household composition — Active adults, quiet adults, older children, young children, or seniors
- Activity level — Very active, active, moderate, or low
- Primary goal — Working/sport, active companion, family companion, therapy/service, or quiet companion
- Training commitment — Extensive, regular, basic, or minimal
- Tolerance for challenging behavior — High, moderate, or low
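One way to record the six answers above is a small validated record. This is only a sketch: the answer options are copied from the list, but the class name `HomeProfile`, the field names, and the lowercase spellings are assumptions, not the tool's actual data model.

```python
from dataclasses import dataclass

# Allowed answers copied from the six questions above.
# The dictionary keys (field names) are hypothetical.
OPTIONS = {
    "experience": ["expert handler", "experienced owner", "some experience", "first-time owner"],
    "household": ["active adults", "quiet adults", "older children", "young children", "seniors"],
    "activity": ["very active", "active", "moderate", "low"],
    "goal": ["working/sport", "active companion", "family companion", "therapy/service", "quiet companion"],
    "training": ["extensive", "regular", "basic", "minimal"],
    "tolerance": ["high", "moderate", "low"],
}


@dataclass(frozen=True)
class HomeProfile:
    experience: str
    household: str
    activity: str
    goal: str
    training: str
    tolerance: str

    def __post_init__(self):
        # Reject any answer that is not one of the listed options.
        for field_name, allowed in OPTIONS.items():
            value = getattr(self, field_name)
            if value not in allowed:
                raise ValueError(f"{field_name}={value!r}; expected one of {allowed}")


home = HomeProfile(
    experience="first-time owner",
    household="young children",
    activity="moderate",
    goal="family companion",
    training="basic",
    tolerance="low",
)
print(home)
```

Validating at construction time catches a mistyped answer before it reaches any matching step.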
Step 3: Match scoring
The algorithm compares the puppy's 5-trait profile against the home profile across 6 weighted factors. The result is a 0-100 compatibility score with specific match insights — not vague guidelines, but statements like "This puppy's dominant confidence level needs a more experienced handler than this home provides" or "Strong working/sport potential — high drive with quick recovery."
| Match Score | Verdict | Breeder Action |
|---|---|---|
| 85-100 | Excellent match | Proceed with confidence. This puppy and home are well-aligned on almost every dimension |
| 70-84 | Good match | Strong alignment with minor considerations. Discuss specific insights with the buyer |
| 55-69 | Fair match | Some meaningful gaps. Placement could work with awareness, preparation, and clear expectations |
| 40-54 | Challenging match | Significant mismatch. Consider whether another puppy in the litter is a better fit |
| Below 40 | Poor match | Not recommended. High risk of behavioral issues, frustration, or a returned dog |
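The score bands in the table above translate directly into a lookup. The sketch below is a minimal version; the function name `verdict` is hypothetical, and the band boundaries are taken from the table.

```python
# (minimum score, verdict) pairs from the table above, highest band first.
BANDS = [
    (85, "Excellent match"),
    (70, "Good match"),
    (55, "Fair match"),
    (40, "Challenging match"),
    (0, "Poor match"),
]


def verdict(score):
    """Map a 0-100 compatibility score to its verdict label."""
    if not 0 <= score <= 100:
        raise ValueError("compatibility score must be between 0 and 100")
    for minimum, label in BANDS:
        if score >= minimum:
            return label


for sample in (92, 70, 54, 12):
    print(sample, verdict(sample))
```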
Why this approach works better
The BreedTools method isn't better because it invented new tests — it uses the same proven tests that have been validated for decades. It's better because of what it does after the tests:
- Groups raw scores into validated behavioral dimensions instead of leaving breeders to interpret 10 disconnected numbers
- Weights tests by their proven predictive value — retrieving and recovery get more influence than touch sensitivity, because research shows they're more predictive
- Profiles the home, not just the puppy — turns one-sided assessment into two-sided matching
- Produces specific match insights instead of vague score ranges — gives breeders talking points for buyer conversations
- Can be re-run with different home profiles — test the same puppy against multiple buyers to find the best fit in seconds
The underlying philosophy is simple: no single method captures everything, and a puppy profile without a home profile is only half a placement decision. By combining the best of Volhard, Campbell, guide dog research, and the Swedish DMA into a two-sided matching framework, the tool gives breeders something that didn't previously exist in a single, free, accessible format.
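To make the ideas of weighting and re-running concrete, here is a minimal sketch that scores one puppy against several buyers. It is not the BreedTools algorithm, which this guide does not spell out. The weight values, the per-buyer target scores, and the function name `compatibility` are all hypothetical; only the ordering of the weights (Trainability and Recovery above Sensitivity) follows the text.

```python
# Hypothetical weights: only their ordering comes from the guide.
WEIGHTS = {
    "Confidence": 1.0,
    "Social Drive": 1.0,
    "Trainability": 2.0,
    "Sensitivity": 0.5,
    "Recovery": 2.0,
}


def compatibility(puppy, target):
    """Return a 0-100 score; 100 means the puppy equals the target on every trait."""
    total_weight = sum(WEIGHTS.values())
    penalty = sum(
        # 5 is the widest possible gap on a 1-6 scale.
        WEIGHTS[trait] * abs(puppy[trait] - target[trait]) / 5
        for trait in WEIGHTS
    )
    return round(100 * (1 - penalty / total_weight))


# Hypothetical puppy profile and buyer targets, all on the 1-6 scale.
puppy = {"Confidence": 3, "Social Drive": 2.5, "Trainability": 2,
         "Sensitivity": 3.5, "Recovery": 2.5}
buyers = {
    "sport home": {"Confidence": 2.5, "Social Drive": 2.5, "Trainability": 1.5,
                   "Sensitivity": 3, "Recovery": 1.5},
    "quiet home": {"Confidence": 4.5, "Social Drive": 3.5, "Trainability": 3.5,
                   "Sensitivity": 4, "Recovery": 3.5},
}

# Re-run the same puppy against every buyer and rank the results.
ranked = sorted(buyers, key=lambda name: compatibility(puppy, buyers[name]), reverse=True)
for name in ranked:
    print(name, compatibility(puppy, buyers[name]))
```

Because the puppy profile is computed once and each buyer is just another target, comparing a whole buyer list costs one loop.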
When and who — the testing protocol
Regardless of which method or tool you use, the testing protocol is the same across all validated approaches:
- Age: Exactly 49 days (7 weeks). This window was established by Scott and Fuller's 1965 research on canine behavioral development and has been adopted by the puppy-testing methods that followed.
- Tester: Someone unfamiliar to the puppies. Not the breeder, not a family member, not anyone who has been handling the litter.
- Environment: An unfamiliar, enclosed area. Not the whelping room, not anywhere the puppies have been before.
- Individual testing: One puppy at a time. No littermates or other dogs present.
- Timing: Before feeding. Puppies should be alert and active, not sleepy or full.
Scott, J.P. & Fuller, J.L. (1965). Genetics and the Social Behavior of the Dog. University of Chicago Press. The foundational study establishing critical periods in canine behavioral development.
How long does it take?
The physical testing takes 15-20 minutes per puppy. This is the same time commitment as a standard Volhard PAT — the BreedTools approach doesn't add extra physical tests, it organizes and matches the results better.
| Tests | Time | Why |
|---|---|---|
| Social Attraction + Following (1-2) | ~2 min | Quick observe-and-score, minimal setup |
| Restraint + Social Handling + Elevation (3-5) | ~4 min | 30-second timed holds for restraint and elevation, repeated stroking for social handling, plus transitions |
| Retrieving (6) | ~2 min | Toss, observe, may repeat once |
| Touch Sensitivity (7) | ~1 min | Count to 10, note the reaction point |
| Sound Startle + Startle Recovery (8-9) | ~3 min | Need assistant for sound, observe initial reaction + 30-sec recovery |
| Novel Object (10) | ~2 min | Place object, observe for 30 seconds |
| Transitions between tests | ~3-4 min | Repositioning, letting the puppy reset between exercises |
Total: ~15-20 minutes per puppy. The home profile questionnaire takes ~3 minutes per buyer and can be done separately.
For a full litter, expect 1.5-2 hours for 6 puppies or roughly 2-2.7 hours for 8 puppies. The real time savings come after testing: the home matching algorithm replaces the hours breeders typically spend manually comparing puppy profiles against buyer notes. You can match one puppy against multiple home profiles in seconds.
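The timing figures above are simple arithmetic and can be checked or adapted to a different litter size with a short sketch. The per-test minutes are copied from the table; the function names are hypothetical, and the default of 15-20 minutes per puppy is the range quoted in the text.

```python
# Per-test minutes from the table above, as (low, high) ranges.
TEST_MINUTES = {
    "Social Attraction + Following": (2, 2),
    "Restraint + Social Handling + Elevation": (4, 4),
    "Retrieving": (2, 2),
    "Touch Sensitivity": (1, 1),
    "Sound Startle + Startle Recovery": (3, 3),
    "Novel Object": (2, 2),
    "Transitions between tests": (3, 4),
}


def per_puppy_minutes():
    """Sum the table rows into a (low, high) estimate for one puppy."""
    low = sum(lo for lo, _ in TEST_MINUTES.values())
    high = sum(hi for _, hi in TEST_MINUTES.values())
    return low, high


def litter_hours(puppies, minutes_per_puppy=(15, 20)):
    """Scale the per-puppy range to a whole litter, in hours."""
    low, high = minutes_per_puppy
    return puppies * low / 60, puppies * high / 60


print(per_puppy_minutes())  # (17, 18)
print(litter_hours(6))      # (1.5, 2.0)
```

The table rows sum to 17-18 minutes, which sits inside the quoted 15-20 minute range.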
The 10 tests explained
These are the same 10 exercises used in the Volhard PAT, with each test mapped to the behavioral dimension it measures in the BreedTools 5-trait framework.
Social Drive tests
1. Social attraction — The tester kneels a few feet away and gently coaxes the puppy to come, using encouraging sounds but no commands. Measures willingness to approach an unfamiliar person and degree of social confidence. (Campbell, 1972; Volhard test 1)
2. Following — The tester stands and walks away at a normal pace. Measures willingness to follow a human — an indicator of social attachment and pack drive. (Campbell, 1972; Volhard test 2)
Confidence tests
3. Restraint — The tester gently rolls the puppy onto its back and holds it there with light pressure on the chest for 30 seconds. Measures response to physical dominance and inability to control its own position. (Campbell, 1972; Volhard test 3)
4. Social handling — The tester crouches beside the puppy and strokes it firmly from head to back, repeatedly. Measures acceptance of social handling and response to authority. (Volhard test 4)
5. Elevation — The tester lifts the puppy with both hands under the belly and holds it elevated for 30 seconds. Measures response to complete vulnerability — no control over its situation. (Campbell, 1972; Volhard test 5)
Trainability test
6. Retrieving — The tester crumples a small paper ball and tosses it 2-4 feet in front of the puppy. Measures willingness to work with a human and prey/play drive. Research from guide dog programs shows this is the single best predictor of training responsiveness. (Volhard test 6; Goddard & Beilharz, 1986)
Sensitivity tests
7. Touch sensitivity — The tester takes the webbing between the puppy's toes and presses with gradually increasing firmness, counting to 10. Measures pain threshold — directly relevant to training method compatibility. (Volhard test 7)
8. Sound startle — An assistant makes a sharp, sudden noise while the puppy is facing away. Measures initial response to startling auditory stimuli. (Volhard test 8; Swedish DMA adaptation)
Recovery tests
9. Startle recovery — An umbrella is opened suddenly 4-5 feet from the puppy. What matters is not whether the puppy startles (most will), but how quickly it recovers and investigates. This distinction — borrowed from the Swedish DMA — is critical. (Volhard test 10; Swedish DMA principle)
10. Novel object — An unfamiliar object (crinkly bag, wobble board, or dragged towel on a string) is presented near the puppy. Measures environmental curiosity versus avoidance — a standard test in guide dog puppy evaluations that Volhard's original sight sensitivity test only partially captures. (Guide dog programs; adapted from Volhard test 9)
Scoring system
Each test is scored on the standard 1-6 scale established by the Volhards. Lower numbers indicate more dominant/bold responses; higher numbers indicate more submissive/sensitive responses.
Interpreting mixed results
Most puppies won't score the same number on every test — and that's perfectly normal. A puppy might score 2 on social attraction (confident approach), 4 on restraint (accepting), and 3 on sound startle (mild startle, quick recovery). This paints a picture of a confident but cooperative puppy.
This is exactly why trait dimensions matter more than raw scores. Instead of trying to interpret 10 individual numbers, the BreedTools approach groups them:
- Social Drive (tests 1-2) — How the puppy relates to people. The most stable scores across testing sessions, according to Beaudet et al. (2004)
- Confidence (tests 3-5) — How the puppy responds to handling and control. Predicts how much structure and experience the owner needs
- Trainability (test 6) — The retrieve. Single strongest predictor of adult training success. Give this score extra weight
- Sensitivity (tests 7-8) — Environmental and physical thresholds. Predicts which training methods will work
- Recovery (tests 9-10) — Startle recovery and novel object response. The dimension most breeders overlook and one of the strongest predictors of adult stability
Using results for puppy placement
The real value of temperament testing is in matching puppies to homes. A dominant, high-energy puppy placed with a timid first-time owner is a recipe for frustration — for both the owner and the dog. A sensitive, submissive puppy placed in a loud, chaotic household may develop anxiety.
| Home Type | Ideal Trait Profile | Why |
|---|---|---|
| Experienced sport/working home | Confidence 2-3, Social 2-3, Trainability 1-2, Recovery 1-2 | High drive, handler focus, quick recovery — thrives with structure and a job to do |
| Active family with dog experience | Confidence 2-3, Social 2-3, Trainability 2-3, Recovery 2-3 | Outgoing and resilient, responds to training, tolerant of active household |
| Family with young children | Confidence 3-4, Social 3-4, Trainability 3-4, Recovery 2-3 | Cooperative, adaptable, tolerant of unpredictable handling, good recovery from surprises |
| Therapy/service dog prospect | Confidence 3-4, Social 2-3, Trainability 1-3, Sensitivity 3-4 | People-oriented, trainable, moderate sensitivity (aware but not reactive) |
| Quiet home, seniors | Confidence 4-5, Social 3-4, Recovery 3-4 | Gentle, people-oriented, low conflict, adapts to calm routine |
| First-time owner | Confidence 3-4, Social 3-4, Trainability 2-3, Recovery 2-3 | Balanced and forgiving — cooperative enough to train easily, resilient enough to handle beginner mistakes |
These are general guidelines. The BreedTools Puppy Temperament Test automates this matching with a 6-factor compatibility algorithm.
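The guideline table above can also be read as a lookup: a puppy fits a home type when each of its dimension scores falls inside the listed range. The sketch below encodes the table that way. The name `fitting_homes` and the example puppy are hypothetical, "Social" in the table is written as "Social Drive" here, and this range check is a simplification, not the 6-factor algorithm.

```python
# Ideal ranges copied from the guideline table above (inclusive, 1-6 scale).
IDEAL_PROFILES = {
    "Experienced sport/working home": {
        "Confidence": (2, 3), "Social Drive": (2, 3), "Trainability": (1, 2), "Recovery": (1, 2)},
    "Active family with dog experience": {
        "Confidence": (2, 3), "Social Drive": (2, 3), "Trainability": (2, 3), "Recovery": (2, 3)},
    "Family with young children": {
        "Confidence": (3, 4), "Social Drive": (3, 4), "Trainability": (3, 4), "Recovery": (2, 3)},
    "Therapy/service dog prospect": {
        "Confidence": (3, 4), "Social Drive": (2, 3), "Trainability": (1, 3), "Sensitivity": (3, 4)},
    "Quiet home, seniors": {
        "Confidence": (4, 5), "Social Drive": (3, 4), "Recovery": (3, 4)},
    "First-time owner": {
        "Confidence": (3, 4), "Social Drive": (3, 4), "Trainability": (2, 3), "Recovery": (2, 3)},
}


def fitting_homes(puppy):
    """List the home types whose every listed range contains the puppy's score."""
    return [
        home
        for home, ranges in IDEAL_PROFILES.items()
        if all(low <= puppy[trait] <= high for trait, (low, high) in ranges.items())
    ]


# Hypothetical dimension scores for one puppy.
puppy = {"Confidence": 4, "Social Drive": 4, "Trainability": 3,
         "Sensitivity": 4, "Recovery": 3}
print(fitting_homes(puppy))
```

For this example puppy the check returns the young-children, quiet-home, and first-time-owner rows. A strict in-range test like this gives a yes or no per home type, which is why a graded compatibility score is more useful for borderline puppies.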
Most experienced breeders find that matching puppies by temperament, rather than letting buyers choose by color or appearance, produces happier placements and fewer returned dogs. The BreedTools test lets you check the same puppy against multiple home profiles in seconds, making it easy to compare match scores across your buyer list.
Temperament testing works best when paired with a structured socialization protocol — together, they give you a complete picture of each puppy's behavioral development before placement.
Research credits and further reading
The methods and research cited in this guide and used in the BreedTools Puppy Temperament Test:
- Campbell, W.E. (1972) — "A behavior test for puppy selection." Modern Veterinary Practice, 53(12). The first standardized puppy behavior evaluation; established the unfamiliar-tester protocol.
- Volhard, J. & Volhard, W. — Puppy Aptitude Test (PAT). Expanded Campbell's 5 tests to 10 with the 1-6 scoring scale. Published across multiple editions of their training materials from the 1980s onward.
- Scott, J.P. & Fuller, J.L. (1965) — Genetics and the Social Behavior of the Dog. University of Chicago Press. The foundational study establishing critical periods in canine behavioral development, including the 7-week testing window.
- Goddard, M.E. & Beilharz, R.G. (1986) — "Early prediction of adult behaviour in potential guide dogs." Applied Animal Behaviour Science, 15(3). Demonstrated that retrieval at 8 weeks is the strongest single predictor of adult trainability.
- Svartberg, K. & Forkman, B. (2002) — "Personality traits in the domestic dog." Applied Animal Behaviour Science, 79(2). Validated behavioral dimensions in the Swedish DMA, demonstrating the predictive power of startle recovery.
- Wilsson, E. & Sundgren, P.E. (1998) — "Behaviour test for eight-week old puppies." Applied Animal Behaviour Science, 56(1). Swedish military dog program research confirming retrieving as a key trainability predictor.
- Slabbert, J.M. & Odendaal, J.S.J. (1999) — "Early prediction of adult police dog efficiency." Applied Animal Behaviour Science, 64(2). Demonstrated continuity of fear-related behaviors from puppyhood into adulthood.
- Beaudet, R. et al. (2004) — Research on test-retest reliability showing social attraction scores are more stable across sessions than sensitivity scores.
- Serpell, J.A. & Hsu, Y.A. (2005) — "Effects of breed, sex, and neuter status on trainability in dogs." Anthrozoos, 18(3). CBARQ behavioral dimension validation.