Dog Breed Explorer

Understanding Modern Dog Breeds Through Data Visualizations

About This Dashboard

This tool was developed to explore data on modern dog breeds from DogTime, a website that maintains information on dog breeds. Because its breed pages follow a standardized layout, the site is easy to scrape. DogTime also rates each breed on characteristics such as trainability, energy, and exercise needs, on a scale from 1 (low) to 5 (high). These data were scraped and made available on Kaggle. This dashboard gives the user a snapshot of how dog breeds are distributed across the American Kennel Club (AKC) dog groups, and uses visualizations of behavioral traits to help characterize modern breeds.


Visualization

Purpose

  • This visual highlights key information that the reader can take in at a glance.

Technique

  • Below we present a color-coded, modified key performance indicator (KPI) graphic that summarizes the dataset, so that any reader can glance at it and quickly get an idea of what the data contain.

Data

  • This dataset contains AKC-registered dog breeds, hybrid wolf breeds, and newly emerging breeds such as the “goldendoodle”.

Key Finding

  • From this visual, we can quickly see how many breeds are in the dataset, the average life span, the most common breed group, and the healthiest size category. We wanted this visual to be appealing and quick to process cognitively, so we limited it to four pieces of information.
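The four summary values could be derived along the following lines. This is a minimal sketch rather than the dashboard's actual code: the field names (`group`, `size`, `lifespan`, `health`) and the sample rows are assumptions made up for illustration.

```python
from collections import Counter
from statistics import mean

# Hypothetical rows; the field names are assumptions, not the real schema.
breeds = [
    {"breed": "Breed A", "group": "Mixed Breed Dogs", "size": "Large", "lifespan": 12.0, "health": 4},
    {"breed": "Breed B", "group": "Mixed Breed Dogs", "size": "Small", "lifespan": 14.0, "health": 3},
    {"breed": "Breed C", "group": "Working Dogs", "size": "Large", "lifespan": 10.0, "health": 4},
    {"breed": "Breed D", "group": "Herding Dogs", "size": "Medium", "lifespan": 12.0, "health": 3},
]

def kpi_summary(rows):
    """Compute the four headline values shown in a KPI-style panel."""
    group, count = Counter(r["group"] for r in rows).most_common(1)[0]

    # Collect health scores per size class, then pick the class with the best mean.
    health_by_size = {}
    for r in rows:
        health_by_size.setdefault(r["size"], []).append(r["health"])
    healthiest = max(health_by_size, key=lambda s: mean(health_by_size[s]))

    return {
        "total_breeds": len(rows),
        "average_lifespan": mean(r["lifespan"] for r in rows),
        "most_common_group": group,
        "most_common_share": count / len(rows),
        "healthiest_size": healthiest,
    }

summary = kpi_summary(breeds)
```

Keeping the summary as a plain dictionary means each KPI tile can read one key, which fits a panel limited to four values.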

391

Total Breeds Analyzed

+98% Coverage

12.6 years

Average Lifespan

Range: 8-18 years

Mixed Breed Dogs

Most Common Group

31% of breeds

Large

Healthiest Size

Health Score: 3.75

Distribution of Breeds

Visualization

Purpose

  • With this visual we want to answer the question of how breeds are distributed across two categories, breed group and breed size, as denoted by the AKC.

Technique

  • We use a mosaic plot to show the proportional distribution of breeds by size and group.

Data

  • We use each breed's recorded group and size.

Key Finding

  • Very Large breeds dominate the dataset (53%), followed by Medium (20%) and Large (17%). This distribution reflects both historical breeding purposes (working, hunting) and modern popularity of larger companion breeds.

  • Group patterns: Working and Mixed breeds are heavily represented in larger sizes, while Companion breeds cluster in smaller sizes.
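The proportions behind a mosaic plot can be sketched as below. In this sketch each column's width is a size class's share of all breeds, and each tile's height is a group's share within that size class, so tile area equals the cell's share of the whole dataset. The `(size, group)` tuples are hypothetical sample data, and the function name is an assumption.

```python
from collections import Counter

# Hypothetical (size, group) pairs, one per breed.
pairs = [
    ("Large", "Working"),
    ("Large", "Working"),
    ("Large", "Mixed"),
    ("Small", "Companion"),
]

def mosaic_proportions(rows):
    """Return column widths (size shares) and tile heights (group share within size)."""
    n = len(rows)
    size_counts = Counter(size for size, _ in rows)
    cell_counts = Counter(rows)
    widths = {size: c / n for size, c in size_counts.items()}
    heights = {(size, group): c / size_counts[size]
               for (size, group), c in cell_counts.items()}
    return widths, heights

widths, heights = mosaic_proportions(pairs)

# Tile area = width * height = the cell's share of all breeds.
areas = {cell: widths[cell[0]] * h for cell, h in heights.items()}
```

Because the areas sum to 1, percentages read off the plot can be compared directly across both dimensions.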


Distinct Breed Characteristics

Visualization

Purpose

  • This visual examines how traits vary by breed group while keeping the data-ink ratio high: we remove all non-essential elements, keep gridlines minimal, drop chart borders, and use clean backgrounds.

Technique

  • We use a line graph with a standardized y axis to compare across breed groups. Line graphs allow us to compare across discrete categories, in this case traits. We also use small multiples with common scales, which enable direct comparison across groups within the eyespan (a Tufte principle). We remove color to keep ink to a minimum. We iterated over this graph several times and found that the importance of the traits to the overall presentation led us to use traits as the group lines, rather than the breed groups used in our initial design.

Data

  • We plot the average of each trait (5 traits in total) by breed group, excluding the mixed breed group because it has much more variation.

Key Finding

Distinct behavioral signatures emerge:

  • Sporting breeds: Highest trainability (4.1) and affection (4.8) - bred for active partnership
  • Companion breeds: Highest adaptability (3.4) and friendliness (4.1) - optimized for home life
  • Herding breeds: High scores across all dimensions (3.8-4.3) - versatile working dogs
  • Terrier breeds: Moderate adaptability (2.9) but high energy (3.9) - independent hunters
  • Hound breeds: High friendliness (4.3) but lower trainability (3.0) - independent trackers
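The per-group averages described in the Data note above can be sketched as a simple group-and-mean step. The field names, the two illustrative traits, the sample scores, and the label used for the excluded mixed group are all assumptions for this example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows with two of the trait columns; names are assumptions.
rows = [
    {"group": "Sporting", "trainability": 4, "energy": 4},
    {"group": "Sporting", "trainability": 5, "energy": 4},
    {"group": "Hound", "trainability": 3, "energy": 3},
    {"group": "Mixed Breed Dogs", "trainability": 2, "energy": 5},
]

def trait_means_by_group(data, traits, exclude=("Mixed Breed Dogs",)):
    """Average each trait within each breed group, skipping excluded groups."""
    grouped = defaultdict(list)
    for r in data:
        if r["group"] not in exclude:
            grouped[r["group"]].append(r)
    return {
        group: {t: mean(r[t] for r in members) for t in traits}
        for group, members in grouped.items()
    }

profiles = trait_means_by_group(rows, traits=("trainability", "energy"))
```

Each entry in `profiles` then corresponds to one panel of the small multiples, plotted on a shared 1 to 5 axis.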


Important Comparison Dimensions

Visualization

Purpose

  • Here we explore an important question: how a breed's life span relates to its average size.

Technique

  • We use a scatterplot and overlay a LOESS smoothing line to show that the relationship is not linear. We also plot each breed's average weight on a log scale, which spreads the points out visually.

Data

  • We use the average breed weight on the log scale on the x axis, and life span in years on the y axis. Additionally, we color code the points by the size of the breeds, and this allows the reader to quickly glance at the plot and visually see clustering in the data.
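To illustrate the idea behind the smoothing line, here is a small, unoptimized sketch of LOESS-style local linear regression with tricube weights, applied to log-transformed weights. It is not the dashboard's implementation, it omits the robustness iterations that full LOESS routines offer, and the sample weights and lifespans are invented for the example.

```python
import math

def loess_fit(xs, ys, frac=0.75):
    """Fit a weighted straight line around each x and return the fitted values."""
    n = len(xs)
    k = min(n, max(2, math.ceil(frac * n)))  # number of neighbours per local fit
    fitted = []
    for x0 in xs:
        dists = [abs(x - x0) for x in xs]
        h = sorted(dists)[k - 1] or 1e-12  # bandwidth: distance to the k-th nearest point
        # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
        ws = [(1 - min(d / h, 1.0) ** 3) ** 3 for d in dists]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

# Hypothetical average weights (kg) and lifespans (years).
weights_kg = [3, 6, 12, 25, 45, 70]
lifespans = [15, 14, 13, 12, 11, 9]

log_weights = [math.log10(w) for w in weights_kg]  # x axis on the log scale
smooth = loess_fit(log_weights, lifespans)
```

The `frac` parameter controls how much of the data each local fit sees: smaller values follow the points more closely, larger values give a smoother curve.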

Key Finding

Physical Dimension (Size-Lifespan Tradeoff):

  • Moderate negative correlation (r = -0.47)
  • Very Small breeds: ~15 years average lifespan
  • Very Large breeds: ~11 years average lifespan
  • Insight: Prospective owners face a roughly 4-year lifespan difference based on size preference
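A correlation coefficient like the one reported above can be computed with the standard Pearson formula. The sketch below assumes the correlation is taken between log weight and lifespan, which the source does not state explicitly, and the sample values are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation: covariance divided by the product of the spreads."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical average weights (kg) and lifespans (years).
weights_kg = [3, 6, 12, 25, 45, 70]
lifespans = [15, 13, 14, 12, 12, 9]

r = pearson_r([math.log10(w) for w in weights_kg], lifespans)
```

Pearson's r only measures linear association, so with a curved relationship it is worth reading it alongside the smoothing line rather than on its own.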


Breed Comparison Tool

Compare 2-4 Breeds on Behavioral Traits

Visualization

Purpose

  • This interactive visual allows users to select 2 to 4 breeds and compare them on behavioral traits using a radar chart. The chart updates automatically as the user makes selections. The primary goal is for users to discover breeds that match their behavioral preferences.

Technique: How to Read the Radar Chart

Visual Encoding:

  • Distance from center = Score strength (1 = center, 5 = outer edge)

  • Larger area = More well-rounded breed across all traits

  • Similar shapes = Similar behavioral profiles

  • Overlapping areas = Breeds are comparable on those dimensions
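The encoding above can be made concrete with a little geometry. The sketch below maps each score to a radius (1 at the center, 5 at the outer edge), places the traits at equal angles, and measures the enclosed area with the shoelace formula. The function names and the example scores are assumptions for illustration.

```python
import math

def radar_vertices(scores, lo=1, hi=5):
    """Place each score on its own axis; radius 0 is the center, 1 the outer edge."""
    n = len(scores)
    points = []
    for i, s in enumerate(scores):
        r = (s - lo) / (hi - lo)
        theta = 2 * math.pi * i / n
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points

def polygon_area(points):
    """Shoelace formula for the area enclosed by the radar polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(points, points[1:] + points[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

top_scores = polygon_area(radar_vertices([5] * 8))       # every trait at 5
alternating = polygon_area(radar_vertices([5, 1, 5, 1]))  # high and low traits interleaved
adjacent = polygon_area(radar_vertices([5, 5, 1, 1]))     # same scores, different axis order
```

Note that `alternating` and `adjacent` use the same set of scores but enclose different areas, so the area reading depends on how the traits are ordered around the chart and is best treated as a rough guide.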

Interactive Features:

  • Select breeds from the drop-down menu

  • Chart updates automatically

  • Hover over points for exact values

  • Legend toggle: Click breed names to show/hide

  • Zoom and pan available

Data

  • We use 8 behavioral indicators that were scored for each breed and present them in this graph. We limit a comparison to 4 breeds at a time so that the user is not overwhelmed by the graphic.
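The 2 to 4 breed limit could be enforced with a small check before the chart is drawn. This is a hypothetical helper, not the dashboard's own code; the function name, its parameters, and the breed names in the example are assumptions.

```python
def validate_selection(selected, available, min_n=2, max_n=4):
    """Return the de-duplicated selection, or raise if it breaks the limits."""
    unique = list(dict.fromkeys(selected))  # drop duplicates, keep the user's order
    unknown = [b for b in unique if b not in available]
    if unknown:
        raise ValueError(f"Unknown breeds: {unknown}")
    if not min_n <= len(unique) <= max_n:
        raise ValueError(f"Select between {min_n} and {max_n} breeds, got {len(unique)}")
    return unique

# Hypothetical list of breeds offered in the drop-down menu.
available = {"Beagle", "Pug", "Boxer", "Collie", "Poodle"}
chosen = validate_selection(["Beagle", "Pug", "Beagle"], available)
```

Raising an error for out-of-range selections lets the interface show a short message instead of drawing a chart that is empty or overcrowded.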

Key Finding

  • This tool is effective at showing multiple dimensions at once for comparison across breeds, giving the user a positive experience by reducing the cognitive burden of looking back and forth across several graphics.



Geographic Origins

Visualization

Purpose

  • This visual helps the user explore the distribution of breed origins around the world.

Technique

  • We use an interactive map with bubble plots of breed origins. The bubble plot sizes correspond to the number of breeds from a particular country. A static map may have accomplished the same result; however, we went with an interactive plot as it allows users to select a bubble point and expand the list of breeds from the particular country. Given that this is an interactive plot, we also add other country level information such as the average dog weight, average lifespan, and primary dog breeds.

Data

  • The origin data contain only the breed name and country of origin. Using the country of origin, we used the tidygeocoder R package to convert country names to coordinates for this visual. We also averaged breed weight and lifespan for each country to present in the visual.
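The country-level values behind each bubble can be sketched as a group-and-summarize step. The geocoding step is deliberately left out here, and the field names and sample rows are assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows; field names and values are assumptions.
rows = [
    {"breed": "Breed A", "country": "Germany", "weight_kg": 30.0, "lifespan": 11.0},
    {"breed": "Breed B", "country": "Germany", "weight_kg": 40.0, "lifespan": 10.0},
    {"breed": "Breed C", "country": "France", "weight_kg": 8.0, "lifespan": 14.0},
]

def summarize_by_country(data):
    """Count breeds per country and average their weight and lifespan."""
    grouped = defaultdict(list)
    for r in data:
        grouped[r["country"]].append(r)
    return {
        country: {
            "n_breeds": len(members),  # drives the bubble size
            "avg_weight_kg": mean(r["weight_kg"] for r in members),
            "avg_lifespan": mean(r["lifespan"] for r in members),
            "breeds": sorted(r["breed"] for r in members),  # list shown on click
        }
        for country, members in grouped.items()
    }

by_country = summarize_by_country(rows)
```

Each entry would then be joined to the country's coordinates before plotting, with `n_breeds` mapped to bubble size.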

Key Finding

Europe as the historical center:

  • United Kingdom, Germany, and France show the highest breed diversity
  • Reflects centuries of selective breeding traditions
  • Continental Europe developed specialized working breeds (herding, hunting, guarding) (source: AKC)

North America’s contribution:

  • United States: 165 breeds (primarily modern mixed breeds and new developments)
  • Represents recent breed development and crossbreeding innovation (source: AKC)

Modern insight: While breeds originated globally, European standardization and American innovation dominate modern breed development (source: AKC).


Summary & Recommendations

My motivation for this dashboard is to simulate a data visualization presentation to Nestle Purina administrators, in which I play the role of a data scientist. The rest of this section therefore assumes an audience of Purina administrators and presents the visualization results along with recommendations.

Finding: Very Large breeds make up 53% of the breeds in the dataset, with Working and Mixed breeds heavily represented in larger sizes.

Implication for Purina: Product development should prioritize formulations for large breeds, while recognizing the growing market for small companion breeds (20% of total).

Finding: Breed groups show distinct behavioral signatures that align with historical breeding purposes (Sporting = high trainability/energy, Companion = high adaptability/friendliness).

Implication for Purina: Marketing and product positioning should emphasize group-specific traits. Nutritional needs likely differ by behavioral profiles (active vs. sedentary).

Finding: The size-lifespan tradeoff creates a useful decision framework.

Implication for Purina: Consumer education tools should focus on this dimension. Life-stage nutrition becomes especially important given lifespan variations.

Finding: Europe remains the historical breeding center, but North America drives modern innovation. Regional specializations reflect climate and cultural needs.

Implication for Purina: Global product strategies should account for regional breed preferences and historical breeding contexts.