Data Sources & Attribution

Documentation of data sources, algorithms, and metadata for all fields in the Plant Species Directory.

Column Sources (43 fields)
Column | Source Module | Data Source | Algorithm Description
Genus | Pipeline Core | Input Parameter | The genus portion of the botanical name provided as input to the pipeline.
Species | Pipeline Core | Input Parameter | The species epithet portion of the botanical name provided as input to the pipeline.
Family | Botanical Name Validator | Claude AI | The taxonomic family (e.g., Sapindaceae, Fagaceae) identified by Claude AI based on current botanical nomenclature.
Botanical Name Notes | Botanical Name Validator | Claude AI | Claude AI verifies the scientific name against current botanical nomenclature. Empty if the name is accepted and current. Contains notes if the name is outdated, a synonym, or has spelling issues.
External Reference URLs | iNaturalist Enrichment | Multiple sources (Google Search + Michigan Flora + iNaturalist) | Reference links compiled from three sources: (1) botanical reference websites found via Google search and verified to contain species-specific information; (2) the Michigan Flora species page from the University of Michigan Herbarium; (3) the Wikipedia article link from iNaturalist.
Common Names | Common Names | Claude AI | Claude AI identifies vernacular names used in Southeast Michigan and the Great Lakes region (Ohio, Indiana, Illinois, Wisconsin, southern Ontario). Listed from most to least common. Excludes scientific synonyms and cultivar names.
Previously Known As | Previous Botanical Names | GBIF (Global Biodiversity Information Facility) | Historical scientific names retrieved from GBIF, the international biodiversity database. Shows species-level synonyms only (excludes varieties and subspecies). Useful for cross-referencing older botanical literature.
Coefficient of Conservatism (C) | Michigan Flora Ecological Metrics | Michigan Flora (University of Michigan Herbarium) | Coefficient of Conservatism from Michigan Flora. Ranges from 0 to 10, indicating how sensitive a species is to habitat disturbance. Higher values mean the species is found only in high-quality, undisturbed natural areas.
Coefficient of Wetness (CW) | Michigan Flora Ecological Metrics | Michigan Flora (University of Michigan Herbarium) | Wetness indicator from Michigan Flora. Ranges from -5 (obligate wetland species) to +5 (obligate upland species). Values near 0 indicate species tolerant of both wet and dry conditions.
Physiognomy | Michigan Flora Ecological Metrics | Michigan Flora (University of Michigan Herbarium) | Growth form classification from Michigan Flora: Forb (herbaceous flowering plant), Graminoid (grass-like), Shrub, Tree, Vine, etc.
Duration | Michigan Flora Ecological Metrics | Michigan Flora (University of Michigan Herbarium) | Life cycle from Michigan Flora: Annual (one growing season), Biennial (two years), or Perennial (multiple years).
Native-MIFlora | Michigan Flora Ecological Metrics | Michigan Flora API (University of Michigan Herbarium) | Native status according to Michigan Flora. "Native" means the species is indigenous to Michigan; "Non-native" means it was introduced from elsewhere.
Michigan Flora Description | Michigan Flora Ecological Metrics | Michigan Flora API (University of Michigan Herbarium) | Species description text from the Michigan Flora API, maintained by the University of Michigan Herbarium.
SE Michigan Monthly Observations | iNaturalist Enrichment | iNaturalist (citizen science observations) | Monthly observation counts from iNaturalist for Southeast Michigan (Wayne, Oakland, Macomb, Washtenaw, and Livingston counties). Based on verified citizen science observations. Indicates when the species is most visible or identifiable throughout the year.
Wikipedia Summary | iNaturalist Enrichment | Wikipedia (via iNaturalist) | Summary text from the Wikipedia article for this species, retrieved through iNaturalist, which maintains links to corresponding Wikipedia entries.
BONAP Range Map | BONAP Range Map | BONAP (bonap.net) + SerpApi fallback | First tries the predictable BONAP image URL pattern (http://bonap.net/MapGallery/County/{Genus}%20{species}.png) and verifies it with a HEAD request. If the direct URL fails (404), falls back to a SerpApi search for "bonap {genus} {species} range map". Returns a direct PNG image URL.
Lake County Images | Lake County Images | Google Drive Parsed PDFs | Parses Lake County PDF files, extracts species images, and associates them with species. Attribution: Lake County Seed Collection Guide; Lake County, Illinois, USA; Lake County Forest Preserve District; Authors: Kelly Schultz & Dale Shields; Photos by DJ Shields; License: CC BY-NC 4.0
Seed Color at Maturity | 3-Tier: Seed Color at Maturity | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting: Tier 1 uses trusted sources (Google Drive Tier 1 folder, Michigan Flora, Lake County Guide); Tier 2 adds secondary sources (Missouri Seedling Guide) plus Tier 1 context; Tier 3 independently reports model knowledge. Returns merged JSON with all three tier responses.
Collection Miss Risk | 3-Tier: Collection Miss Risk | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to assess how easily seeds are lost before collection. Returns a Low/Moderate/High rating with a brief explanation.
Processing Hazards | 3-Tier: Processing Hazards | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to identify safety hazards during seed processing and handling. Notes irritants, toxins, spines, or allergenic properties.
Germination Ecology / Real-World Behavior | 3-Tier: Germination Ecology / Real-World Behavior | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to explain dormancy mechanisms, dispersal timing, and natural germination cues.
Stratification Requirements | 3-Tier: Stratification Requirements | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to document artificial (fridge) stratification protocols with codes and a plain-language explanation.
Artificial Stratification Risks | 3-Tier: Artificial Stratification Risks | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to document pitfalls of fridge stratification, including premature germination and mold risks.
Collection Quantity Notes | 3-Tier: Collection Quantity Notes | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to set expectations for typical seed yield per plant or stand.
Readiness Collection Cues | 3-Tier: Readiness Collection Cues | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to describe observable cues indicating harvest timing.
Collection Safety and Nuisance Notes | 3-Tier: Collection Safety and Nuisance Notes | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to document hazards and nuisances during seed collection.
Some Useful Collection Tools | 3-Tier: Some Useful Collection Tools | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to list simple tools and containers for seed collection.
What You Collect | 3-Tier: What You Collect | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to identify the physical unit collected (seeds, pods, etc.).
Ease of Collection | 3-Tier: Ease of Collection | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to rate collection effort as Easy/Moderate/Difficult with a brief explanation.
Seed Visibility at Maturity | 3-Tier: Seed Visibility at Maturity | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to describe whether seeds are Visible/Hidden/Partly visible at maturity.
Habitat Notes | 3-Tier: Habitat Notes | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to provide habitat context helpful for plant identification.
Collection Challenges | 3-Tier: Collection Challenges | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to note practical obstacles to obtaining usable seed.
Seed Cleaning Complexity | 3-Tier: Seed Cleaning Complexity | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to describe the core cleaning approach, including chaff, mesh, or steps.
Seed Drying Needs | 3-Tier: Seed Drying Needs | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to describe container-based drying after collection.
Collection ID Cautions | 3-Tier: Collection ID Cautions | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to describe observable traits on the target species at collection time.
Processing Difficulty | 3-Tier: Processing Difficulty | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to rate the overall difficulty of post-collection processing as Easy/Moderate/Difficult.
Processing Nuggets | 3-Tier: Processing Nuggets | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to provide 1-2 high-leverage practical tips for processing.
Processing Time / Labor | 3-Tier: Processing Time / Labor | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to describe the magnitude of effort required for processing as Minimal/Moderate/High.
Other Storage Hazards / Warnings | 3-Tier: Other Storage Hazards / Warnings | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to document storage hazards such as pest attraction or volatiles.
Storage Mold Risk | 3-Tier: Storage Mold Risk | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to assess mold risk during storage.
Safe Storage Method – Dry Fridge | 3-Tier: Safe Storage Method – Dry Fridge | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to describe the dry refrigeration storage approach.
Safe Storage Method – Room Temp | 3-Tier: Safe Storage Method – Room Temp | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to describe short-term dry storage at room temperature.
Similar Species / Distinguishing Features | 3-Tier: Similar Species / Distinguishing Features | Tiered LLM synthesis (Tier 1 trusted sources, Tier 2 secondary sources, Tier 3 independent model knowledge) | Uses 3-tier LLM prompting to list species commonly confused with the target and how they differ.
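
Example: BONAP Range Map lookup (illustrative sketch)

The BONAP Range Map entry describes a direct-URL attempt followed by a search fallback. The Python sketch below shows one way that flow could look; it is not the pipeline's actual code. The function names, the `exists` and `search_fallback` parameters, and the capitalization of the genus are assumptions made for illustration. The SerpApi call is deliberately left as a caller-supplied function, because the source does not show how the pipeline invokes it.

```python
import urllib.request
from typing import Callable, Optional
from urllib.parse import quote

# URL pattern quoted in the BONAP Range Map entry.
BONAP_PATTERN = "http://bonap.net/MapGallery/County/{name}.png"


def bonap_direct_url(genus: str, species: str) -> str:
    """Build the predictable image URL; the space becomes %20."""
    # Assumption: genus is capitalized and the epithet lowercased.
    name = f"{genus.strip().capitalize()} {species.strip().lower()}"
    return BONAP_PATTERN.format(name=quote(name))


def url_exists(url: str, timeout: float = 10.0) -> bool:
    """Send a HEAD request and report whether the server answers 200."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # Covers HTTP errors such as 404, network failures, and timeouts.
        return False


def find_range_map(
    genus: str,
    species: str,
    exists: Callable[[str], bool] = url_exists,
    search_fallback: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    """Try the direct URL first, then the (hypothetical) search fallback."""
    direct = bonap_direct_url(genus, species)
    if exists(direct):
        return direct
    if search_fallback is not None:
        # Query string taken from the entry's description.
        return search_fallback(f"bonap {genus} {species} range map")
    return None
```

Passing `exists` and `search_fallback` as parameters keeps the lookup testable without network access; a real caller would leave `exists` at its default and supply its own search function.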
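
Example: 3-tier prompting and merged JSON (illustrative sketch)

The 3-Tier entries describe three prompts whose responses are merged into one JSON result. The sketch below is a minimal Python illustration of that shape, not the pipeline's implementation. The `ask_llm` callable, the prompt wording, and the JSON key names are all hypothetical. It also assumes that "Tier 1 context" means the Tier 2 prompt receives the Tier 1 source excerpts alongside the secondary ones; the source does not say whether the Tier 1 response is passed along as well.

```python
import json
from typing import Callable, List


def build_prompt(field: str, species: str, sources: List[str]) -> str:
    """Compose a prompt; with no sources, ask for model knowledge only."""
    if sources:
        context = "\n\n".join(sources)
        return (
            f"Using only the sources below, report '{field}' "
            f"for {species}.\n\n{context}"
        )
    return (
        f"From your own knowledge, without external sources, "
        f"report '{field}' for {species}."
    )


def run_three_tier(
    field: str,
    species: str,
    tier1_sources: List[str],
    tier2_sources: List[str],
    ask_llm: Callable[[str], str],  # hypothetical: prompt in, text out
) -> str:
    """Run the three prompts and return one merged JSON document."""
    # Tier 1: trusted sources only.
    tier1 = ask_llm(build_prompt(field, species, tier1_sources))
    # Tier 2: secondary sources plus the Tier 1 sources (assumption).
    tier2 = ask_llm(
        build_prompt(field, species, tier1_sources + tier2_sources)
    )
    # Tier 3: independent model knowledge, no source text supplied.
    tier3 = ask_llm(build_prompt(field, species, []))
    merged = {
        "field": field,
        "species": species,
        "tier1": tier1,
        "tier2": tier2,
        "tier3": tier3,
    }
    return json.dumps(merged, indent=2)
```

Keeping the three responses side by side, rather than collapsing them into one answer, lets a reviewer see where the trusted sources and the model's own knowledge disagree.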
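
Example: range check for the Michigan Flora coefficients (illustrative sketch)

The Coefficient of Conservatism and Coefficient of Wetness entries state fixed ranges (0 to 10 and -5 to +5). A consumer of the directory might guard against out-of-range values when loading records. The class and field names below are hypothetical, and treating the values as integers is an assumption.

```python
from dataclasses import dataclass

# Ranges as stated in the table entries above.
C_RANGE = (0, 10)    # Coefficient of Conservatism (C)
CW_RANGE = (-5, 5)   # Coefficient of Wetness (CW)


@dataclass(frozen=True)
class EcologicalMetrics:
    """Hypothetical container for the two Michigan Flora coefficients."""

    conservatism: int  # assumption: integer-valued
    wetness: int       # assumption: integer-valued

    def __post_init__(self) -> None:
        checks = (
            ("conservatism", self.conservatism, C_RANGE),
            ("wetness", self.wetness, CW_RANGE),
        )
        for label, value, (low, high) in checks:
            if not low <= value <= high:
                raise ValueError(
                    f"{label}={value} is outside the range {low} to {high}"
                )
```

Constructing `EcologicalMetrics(conservatism=7, wetness=-3)` succeeds, while a value such as `conservatism=11` raises `ValueError`.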