U.S. Local Food Sourcing Analysis
Comprehensive Methodology Documentation v2.0
The goal of this project is to create a national farmshed map that identifies areas with the highest potential for locally sourcing food. While the current application focuses on hospitals as anchor institutions, the underlying methodology produces a generalizable assessment of local food system strength that could serve commercial, institutional, or individual use cases.
This analysis attempts to answer: "Which areas have the highest potential for locally sourcing their food?" It does not attempt to determine whether local sourcing is actually possible in any given area—that would require detailed supply-and-demand modeling, economic feasibility analysis, and infrastructure assessment. Instead, this methodology identifies areas where the conditions are most favorable relative to other areas, based on four measurable dimensions of agricultural capacity.
All metrics are computed at the Census block group level, the smallest geographic unit for which the Census Bureau publishes sample data. Block groups typically contain between 600 and 3,000 people, though sizes vary significantly.
Because block groups vary so much in size, and because food accessibility realistically extends beyond any single block group's boundaries, all metrics incorporate a neighborhood-aware calculation. For each block group, we calculate its own metric value and add a weighted contribution (50%) from neighboring block groups within an 8 km (5 mile) buffer. This captures the reality that farms, markets, and distribution infrastructure in adjacent areas contribute to a location's food access.
Agricultural land use varies dramatically across states. According to USDA data, 39% of U.S. land area is used by farms, totaling 876 million acres, but this national average masks enormous regional variation.
The Midwest and Great Plains states have the highest agricultural land use.
In contrast, the Northeast states this analysis currently covers have much lower agricultural intensity.
This variation is critical for interpreting farmshed scores. A "high" state-normalized score in Massachusetts reflects strong local food potential relative to that state's agricultural context, but the absolute agricultural capacity behind it may be much lower than the capacity behind a "medium" score in Iowa. This is why the methodology provides both state-level normalization (relative ranking within a state) and national-level normalization (absolute comparison across states).
The farmshed score combines four dimensions that capture different aspects of local food system strength:
Using the USDA Cropland Data Layer (CDL), a national land cover dataset updated annually at 30-meter resolution, we capture the scale of agriculture in each area. The CDL classifies approximately 250 crop and land cover types across the continental United States.
The CDL is first filtered to include only crops primarily grown for direct human consumption—excluding field corn (95% goes to animal feed and ethanol), soybeans (primarily oil and feed), cotton, hay, and pasture. The remaining raster is then vectorized to calculate total acreage of human food cropland.
This metric is supplemented by OpenStreetMap (OSM) farm polygons tagged as farmland, orchards, or vineyards, capturing community-mapped agricultural areas that may be missing from the satellite-derived CDL.
Why it matters: Scale serves as a proxy for current agricultural production in an area. Without sufficient productive capacity, local food sourcing cannot occur at institutional scale. The density of farmland reflects the agricultural heritage and current land use patterns of a region.
The following CDL crop categories are classified as human food crops:
Vegetables: Dry Beans, Potatoes, Sweet Potatoes, Mixed Vegetables & Fruits, Watermelons, Onions, Cucumbers, Peas, Tomatoes, Caneberries, Herbs, Carrots, Asparagus, Garlic, Cantaloupes, Honeydew Melons, Broccoli, Peppers, Greens, Strawberries, Squash, Lettuce, Pumpkins, Cabbage, Cauliflower, Celery, Radishes, Turnips, Eggplants, Gourds
Fruits & Nuts: Cherries, Peaches, Apples, Grapes, Other Tree Crops, Citrus, Pecans, Almonds, Walnuts, Pears, Pistachios, Prunes, Olives, Oranges, Avocados, Nectarines, Plums, Apricots, Blueberries, Cranberries
Grains for Human Consumption: Rice, Barley, Durum Wheat, Spring Wheat, Winter Wheat, Rye, Oats, Millet, Buckwheat, Quinoa, Amaranth
Legumes: Peanuts, Chickpeas, Lentils
Specialty: Sweet Corn, Mint, Hops, Aquaculture
Excluded (primarily feed/industrial): Field Corn, Soybeans, Sorghum, Alfalfa, Hay, Cotton, Sugarbeets, Tobacco, Christmas Trees, and all pasture/fallow/developed land cover classes.
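As a rough illustration of the filtering and acreage step, the sketch below sums 30 m pixels for human food classes and converts them to acres. It assumes pixel counts per class name are already available for a block group and works on pixel counts rather than vectorized polygons, which is a simplification; the class subset, function name, and sample counts are hypothetical.

```python
# Illustrative subset of the human food classes listed above (not the full list).
HUMAN_FOOD_CLASSES = {"Potatoes", "Tomatoes", "Apples", "Winter Wheat", "Sweet Corn"}

SQ_M_PER_ACRE = 4046.8564224
PIXEL_AREA_SQ_M = 30 * 30  # one 30 m x 30 m CDL pixel


def human_food_acres(pixel_counts: dict[str, int]) -> float:
    """Convert pixel counts to acres, keeping only human food classes."""
    food_pixels = sum(
        count for name, count in pixel_counts.items() if name in HUMAN_FOOD_CLASSES
    )
    return food_pixels * PIXEL_AREA_SQ_M / SQ_M_PER_ACRE


# Hypothetical pixel counts for one block group; feed crops are ignored.
sample_counts = {"Field Corn": 5000, "Soybeans": 3000, "Potatoes": 1200, "Apples": 800}
sample_acres = human_food_acres(sample_counts)
print(round(sample_acres, 1))
```

Each 30 m pixel covers about 0.22 acres, so the 2,000 food-crop pixels in the sample contribute roughly 445 acres while the 8,000 feed-crop pixels contribute nothing.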
Diversity counts the number of unique food products available from farms in each area. This combines product lists from farm point data (LocalHarvest, Rodale Institute, regenerative farm databases) with crop types identified from CDL polygons.
Products are deduplicated so that "tomatoes" from three different farms counts as one unique product. This measures the variety of what's available, not the quantity.
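A minimal sketch of the deduplication idea follows. The normalization (trimming whitespace and lowercasing) is an assumption made for illustration; the rules the pipeline applies to product names are not described here.

```python
def unique_products(farm_product_lists):
    """Count distinct product names across all farms in an area."""
    seen = set()
    for products in farm_product_lists:
        for product in products:
            # Assumed normalization so "Tomatoes" and "tomatoes " match.
            seen.add(product.strip().lower())
    return len(seen)


# Hypothetical product lists from three farms.
farms = [["Tomatoes", "kale"], ["tomatoes ", "Apples"], ["Tomatoes"]]
unique_count = unique_products(farms)
print(unique_count)
```

Here three farms list tomatoes, but the area still counts only three unique products: tomatoes, kale, and apples.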
Why it matters: A diverse local food system can support varied institutional menus and nutritional needs. An area with only apple orchards scores lower on diversity than an area with vegetables, fruits, grains, and dairy—even if total acreage is similar.
Quality measures the density of farms using certified organic or regenerative agricultural practices. Data sources include the USDA Organic Integrity Database, Rodale Institute's organic farm database, and self-reported regenerative farms.
This uses point-counting methodology: each organic or regenerative operation is counted equally regardless of size. A 1-acre organic farm counts the same as a 100-acre operation. This reflects the value of sustainable practices at any scale for local food access and environmental goals.
Why it matters: Hospitals and institutions increasingly have sustainability mandates. The presence of certified organic and regenerative farms indicates a local food system aligned with health and environmental missions.
Accessibility counts CSA (Community Supported Agriculture) operations and farmers markets—direct-to-consumer distribution channels that make local food accessible to institutions and individuals.
Operations offering both CSA and market sales are deduplicated (counted once). This measures the density of distribution points, not production capacity.
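The counting rule can be sketched as below, assuming each operation carries a stable identifier plus CSA and market flags. The `Operation` record, its field names, and the sample data are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operation:
    op_id: str            # hypothetical stable identifier
    offers_csa: bool
    offers_market: bool


def accessibility_points(operations):
    """Count each operation once if it offers CSA, market sales, or both."""
    ids = {op.op_id for op in operations if op.offers_csa or op.offers_market}
    return len(ids)


def density_per_km2(count, area_km2):
    """Distribution points per square kilometer."""
    if area_km2 <= 0:
        raise ValueError("area_km2 must be positive")
    return count / area_km2


ops = [
    Operation("farm-a", True, True),    # CSA and market: still one point
    Operation("farm-a", True, True),    # duplicate record of the same operation
    Operation("farm-b", False, True),
    Operation("farm-c", False, False),  # no direct-to-consumer channel
]
point_count = accessibility_points(ops)
print(point_count, density_per_km2(point_count, 4.0))
```

Counting by identifier keeps an operation with both channels from inflating the density of distribution points.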
Why it matters: A farm can be organic but not offer CSA or market sales. Accessibility captures the distribution infrastructure that connects producers to consumers. Without existing distribution channels, local food sourcing requires building new supply chains.
Because food accessibility extends beyond block group boundaries, each metric incorporates neighboring areas. For each block group i:
adjusted_i = value_i + 0.5 × mean(value_j for all j in neighbors(i))
where neighbors(i) is the set of block groups whose centroids fall within an 8 km (5 mile) buffer around block group i's centroid.
Interpretation: Each block group's score is 100% of its own value plus 50% of the average value from surrounding areas. This acknowledges that nearby agricultural resources contribute to local food access.
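The neighborhood calculation can be sketched as follows. This version measures centroid distance with the haversine formula on latitude/longitude pairs and treats a block group with no neighbors as having a neighbor average of zero; both choices are assumptions, and the pipeline may instead work in a projected CRS with a spatial index. All identifiers and sample values are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0088
BUFFER_KM = 8.0
NEIGHBOR_WEIGHT = 0.5


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points in kilometers."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def neighborhood_adjusted(values, centroids, radius_km=BUFFER_KM):
    """Own value plus 50% of the mean value of neighbors within radius_km."""
    adjusted = {}
    for i, (lat_i, lon_i) in centroids.items():
        neighbor_vals = [
            values[j]
            for j, (lat_j, lon_j) in centroids.items()
            if j != i and haversine_km(lat_i, lon_i, lat_j, lon_j) <= radius_km
        ]
        # Assumption: no neighbors means no neighbor contribution.
        neighbor_mean = sum(neighbor_vals) / len(neighbor_vals) if neighbor_vals else 0.0
        adjusted[i] = values[i] + NEIGHBOR_WEIGHT * neighbor_mean
    return adjusted


# Hypothetical block groups: A and B are about 3.5 km apart, C is far away.
centroids = {"A": (42.36, -71.06), "B": (42.37, -71.10), "C": (42.90, -72.50)}
values = {"A": 10.0, "B": 20.0, "C": 40.0}
adjusted = neighborhood_adjusted(values, centroids)
print(adjusted)
```

In this example A becomes 10 + 0.5 × 20 = 20, B becomes 20 + 0.5 × 10 = 25, and C keeps its own value of 40. The pairwise loop is quadratic in the number of block groups, so a real run over a full state would likely need a spatial index.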
State-normalized scores rank block groups within their own state. A score of 75 means "better than 75% of block groups in this state."
Log scaling is used for diversity because product counts follow a power-law distribution (few block groups have >50 products).
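One way to read the state-normalized score is as a percentile rank, sketched below. Ties receive the share of block groups with a strictly lower value, which is an assumption. How log scaling enters the diversity calculation is not specified in this section, so it is left out of the sketch.

```python
from bisect import bisect_left


def percentile_scores(values):
    """Score each value by the percent of values strictly below it."""
    ordered = sorted(values)
    n = len(ordered)
    return [100.0 * bisect_left(ordered, v) / n for v in values]


# Hypothetical raw metric values for five block groups in one state.
state_values = [0.0, 2.0, 2.0, 5.0, 9.0]
state_scores = percentile_scores(state_values)
print(state_scores)
```

With five block groups, the highest value scores 80 because it is better than four of the five, matching the "better than N% of block groups in this state" reading.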
National-normalized scores use fixed caps derived from analysis across all processed states. These caps represent the 95th percentile of maximum values observed:
| Dimension | National Cap | Unit | Interpretation |
|---|---|---|---|
| Scale | 2,347.63 | acres | Score of 100 = 2,348+ acres human food crops in buffer |
| Diversity | 109.53 | products | Score of 100 = 110+ unique products in buffer |
| Quality | 5.83 | farms/km² | Score of 100 = 5.83+ organic/regen farms per km² |
| Accessibility | 8.89 | points/km² | Score of 100 = 8.89+ CSA/market points per km² |
Values exceeding the cap are clipped to 100.
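A small sketch of cap-based scoring, using the caps from the table above. It assumes the score rises linearly from 0 at a raw value of zero to 100 at the cap; the section states only the caps and the clipping rule, so any transformation applied before scaling is not represented.

```python
# Caps copied from the national cap table above.
NATIONAL_CAPS = {
    "scale": 2347.63,         # acres
    "diversity": 109.53,      # products
    "quality": 5.83,          # farms/km²
    "accessibility": 8.89,    # points/km²
}


def national_score(dimension, raw_value):
    """Scale a raw value against its national cap and clip to 0-100."""
    cap = NATIONAL_CAPS[dimension]
    return min(100.0, max(0.0, 100.0 * raw_value / cap))


print(national_score("scale", 3000.0))     # above the cap, clipped
print(national_score("scale", 1173.815))   # half the cap
```

Because the caps are fixed, the same raw value produces the same national score regardless of which state the block group is in.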
Raw agricultural metrics don't account for demand. An area with high farm density but a very large population may have less per-capita access than a rural area with moderate farm density and few people. The population adjustment creates a ratio of "farm capacity to population demand."
Population density is scaled as log₁₀(density) / 4.5, so densities of approximately 31,600 people/km² (10^4.5 ≈ 31,623) max out the scale at 1.0. This accommodates dense urban cores (Manhattan: ~27,000/km²) while still differentiating suburban and rural areas.
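The arithmetic behind the divisor can be checked with a short sketch. Clamping densities at or below 1 person/km² to zero is an assumption made to avoid negative or undefined logarithms; the function name is hypothetical.

```python
import math

LOG_DIVISOR = 4.5  # 10 ** 4.5 is about 31,623 people/km²


def population_factor(density_per_km2):
    """Map population density to a 0-1 scale using log10(density) / 4.5."""
    if density_per_km2 <= 1.0:
        return 0.0  # assumption: treat near-empty areas as zero demand
    return min(1.0, math.log10(density_per_km2) / LOG_DIVISOR)


print(round(10 ** LOG_DIVISOR))               # density at which the scale tops out
print(round(population_factor(27_000), 3))    # Manhattan-like density
print(round(population_factor(1_000), 3))     # suburban-like density
```

A Manhattan-like density lands just under 1.0, while a density of 1,000 people/km² lands at two thirds of the scale, so suburban and urban areas remain distinguishable.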
| Ratio Range | Adjustment Factor | Interpretation |
|---|---|---|
| ≥ 2.0 | 1.00 (no penalty) | Plenty of farm capacity for population |
| [1.0, 2.0) | 0.90 – 1.00 | Adequate capacity |
| [0.5, 1.0) | 0.70 – 0.90 | Moderate capacity relative to demand |
| [0.2, 0.5) | 0.50 – 0.70 | Limited capacity for population |
| < 0.2 | 0.30 – 0.50 | Insufficient capacity for population |
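The table can be read as a piecewise mapping from ratio to adjustment factor. The sketch below interpolates linearly inside each band, which is an assumption: the table gives only the factor range for each ratio range, not the curve between the endpoints.

```python
# (ratio, factor) breakpoints taken from the band edges in the table above.
BREAKPOINTS = [(0.0, 0.30), (0.2, 0.50), (0.5, 0.70), (1.0, 0.90), (2.0, 1.00)]


def adjustment_factor(ratio):
    """Piecewise-linear adjustment factor for a capacity-to-demand ratio."""
    if ratio <= BREAKPOINTS[0][0]:
        return BREAKPOINTS[0][1]
    for (x0, y0), (x1, y1) in zip(BREAKPOINTS, BREAKPOINTS[1:]):
        if ratio < x1:
            # Linear interpolation within the band [x0, x1).
            return y0 + (y1 - y0) * (ratio - x0) / (x1 - x0)
    return BREAKPOINTS[-1][1]  # ratio >= 2.0: no penalty


for r in (0.1, 0.75, 1.5, 2.0, 3.0):
    print(r, round(adjustment_factor(r), 3))
```

Under this reading the factor is continuous across band edges, so a small change in ratio never produces a jump in the adjusted score.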
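The combined score integrates all four dimensions using weighted averaging, with the weights listed in the table below:
Combined = 0.30 × Scale + 0.30 × Diversity + 0.20 × Quality + 0.20 × Accessibility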
After population adjustment, the final score range is 0-100.
| Dimension | Weight | Justification |
|---|---|---|
| Scale | 30% | Production capacity is fundamental—without sufficient acreage, institutional-scale sourcing isn't possible |
| Diversity | 30% | Product variety enables menu planning and nutritional completeness for hospital food service |
| Quality | 20% | Organic/regenerative farms align with hospital sustainability and health missions |
| Accessibility | 20% | CSA/market infrastructure demonstrates existing local distribution networks |
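A minimal sketch of the weighted combination using the weights from the table. Applying the population adjustment as a multiplier on the weighted average is an assumption; the section says only that the final range after adjustment is 0-100. The sample dimension scores are hypothetical.

```python
# Weights copied from the table above.
WEIGHTS = {"scale": 0.30, "diversity": 0.30, "quality": 0.20, "accessibility": 0.20}


def combined_score(dimension_scores, adjustment=1.0):
    """Weighted average of 0-100 dimension scores, then population adjustment."""
    base = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    # Assumption: the adjustment factor multiplies the weighted average.
    return max(0.0, min(100.0, base * adjustment))


example = {"scale": 80.0, "diversity": 60.0, "quality": 40.0, "accessibility": 50.0}
print(combined_score(example))                  # weighted average only
print(combined_score(example, adjustment=0.9))  # with a 10% population penalty
```

For the example, the weighted average is 24 + 18 + 8 + 10 = 60, and a 0.9 adjustment factor lowers it to 54.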
"Within this state, which areas are best positioned for local food sourcing?"
This is a relative comparison. A score of 80 means "top 20% within this state" but says nothing about absolute capacity. Use state scores when comparing hospitals within a single state or identifying regional patterns.
"Compared to the most agricultural areas in the country, how does this area rank?"
This is an absolute comparison. A score of 80 means the area reaches roughly 80% of the national reference caps, which are set by the highest-capacity areas observed across the processed states. Use national scores when comparing across state boundaries or assessing absolute potential.
A fundamental challenge with this methodology is that the national caps—while derived from cross-state analysis—are still internal to this dataset. The 95th percentile values represent the upper range of what we observed, not an externally validated threshold for "local sourcing is feasible here." Without ground truth validation, the scores remain sophisticated rankings rather than predictive assessments.
This section outlines potential approaches to validate the farmshed index against real-world outcomes. These represent future research directions rather than completed work.
Approach: Identify hospitals that currently source food locally and test whether they have higher farmshed scores than hospitals that don't.
Potential data sources:
Test: Do hospitals reporting high local sourcing percentages cluster in high-farmshed-score areas?
Limitation: Selection bias is a concern—hospitals that actively source locally may have helped create the local food infrastructure (CSAs, farmers markets) that the index measures. Cause and effect may be entangled.
Approach: Interview food service directors at hospitals across the score spectrum to assess face validity.
Sample questions:
Value: Qualitative insights could reveal where the index succeeds, where it fails, and what dimensions might be missing (e.g., cold chain infrastructure, distributor relationships, seasonal availability).
Approach: Work backward from actual hospital food demand to test whether local supply could theoretically meet it.
Method:
Potential finding: "A farmshed score of 60 corresponds to areas where local farms could theoretically supply approximately 40% of a typical hospital's produce needs."
Limitation: This is a significant research undertaking requiring production yield estimates, hospital food service data, and assumptions about what fraction of farm output is available for institutional purchase.
Approach: Test whether farmshed scores correlate with independent measures that should relate to local food capacity.
| Independent Indicator | Expected Relationship | Data Source |
|---|---|---|
| USDA Food Desert designation | Negative correlation | USDA Food Access Research Atlas |
| Farmers market count (USDA) | Positive correlation | USDA National Farmers Market Directory |
| Farm-to-school program participation | Positive correlation | USDA Farm to School Census |
| State agricultural output per capita | Positive correlation | USDA NASS state statistics |
| Local food policy council presence | Positive correlation | Food Policy Networks database |
Value: Strong correlations with multiple independent indicators would provide convergent validity—evidence that the index captures something real about local food systems.
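A rank correlation is one plausible statistic for these comparisons, since it does not assume a linear relationship. The sketch below computes Spearman's coefficient in plain Python; the choice of statistic is an assumption, and the sample values are hypothetical placeholders rather than results.

```python
def average_ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; raises ZeroDivisionError if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical placeholder values, for illustration only.
farmshed_scores = [22.0, 35.0, 48.0, 61.0, 77.0]
market_counts = [1, 2, 2, 5, 9]
rho = spearman(farmshed_scores, market_counts)
print(round(rho, 3))
```

A coefficient near +1 or -1 in the expected direction for several indicators would support convergent validity; values near zero would suggest the index and the indicator measure different things.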
Approach: Verify that the index produces sensible results for areas with known characteristics.
Expected high scores:
Expected low scores:
Test: Do these known extremes align with index predictions? Failure to correctly rank obvious cases would indicate fundamental problems with the methodology.
Until validation studies are completed, we recommend interpreting farmshed scores as a screening and prioritization tool rather than a feasibility assessment:
| Score Range | Recommended Interpretation |
|---|---|
| ≥70 | High priority for investigation. Strong indicators suggest local sourcing may be viable. Direct outreach to farms and food service assessment recommended. |
| 50–69 | Moderate potential. Some local food infrastructure exists. Feasibility depends on specific hospital needs and willingness to develop supplier relationships. |
| 30–49 | Challenging but possible. Limited local capacity may require regional sourcing (100+ miles) or focus on specific product categories. |
| <30 | Structural barriers likely. Local sourcing at scale would require significant infrastructure investment or policy intervention. May still be viable for small-scale pilots or specific products. |
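For screening many hospitals at once, the table can be expressed as a simple lookup. The tier labels are shortened from the table; treating fractional scores such as 69.5 as belonging to the lower tier is an assumption, since the table lists whole-number ranges.

```python
def interpretation_tier(score):
    """Map a 0-100 farmshed score to the recommended interpretation tier."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 70:
        return "High priority for investigation"
    if score >= 50:
        return "Moderate potential"
    if score >= 30:
        return "Challenging but possible"
    return "Structural barriers likely"


for s in (85, 69.5, 30, 12):
    print(s, interpretation_tier(s))
```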
Source: USDA National Agricultural Statistics Service (NASS)
Resolution: 30 meters nationwide; 10 meters in select states
Classification: ~250 crop and land cover classes
CRS: NAD83 Conus Albers (EPSG:5070)
Use: Scale dimension (filtered to human food crops) and Diversity dimension (crop type variety)
Sources:
Use: Diversity (product lists), Quality (organic/regenerative flags), Accessibility (CSA/market flags)
Tags: landuse=farmland, farmyard, orchard, vineyard
Use: Supplement CDL with community-mapped agricultural areas
Geography: TIGER/Line Shapefiles 2023 (U.S. Census Bureau)
Population: American Community Survey (ACS) 5-Year Estimates, Table B01003 (Total Population), variable B01003_001E
Use: Spatial aggregation unit; population used for demand adjustment
Primary Source: OpenStreetMap via Overpass API
Query Tags: amenity=hospital
Classification: Hospitals are classified by type (general acute, psychiatric, rehabilitation, children's, veterans, etc.) based on OSM tags and name pattern matching
Coverage: Northeast US states (Connecticut, Maine, Massachusetts, New Hampshire, New Jersey, New York, Pennsylvania, Rhode Island, Vermont)
Geocoding: Hospital coordinates come directly from OSM point locations or way/relation centroids
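A query along the following lines would retrieve hospitals for one state from the Overpass API. This is a sketch, not the query the pipeline uses: selecting the state by its ISO 3166-2 code, the timeout value, and the function name are all assumptions. The `out center;` statement asks Overpass to return a center point for ways and relations, which matches the centroid-based geocoding described above.

```python
def hospital_query(iso_subdivision: str, timeout_s: int = 120) -> str:
    """Build an Overpass QL query for amenity=hospital within one state."""
    return (
        f'[out:json][timeout:{timeout_s}];\n'
        # Assumption: the state area is selected by its ISO 3166-2 code.
        f'area["ISO3166-2"="{iso_subdivision}"]->.state;\n'
        '(\n'
        '  node["amenity"="hospital"](area.state);\n'
        '  way["amenity"="hospital"](area.state);\n'
        '  relation["amenity"="hospital"](area.state);\n'
        ');\n'
        'out center;\n'
    )


query = hospital_query("US-MA")
print(query)
```

The function only builds the query text; sending it to an Overpass endpoint and parsing the JSON response are left out so the example runs without network access.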
The analysis follows a 12-step pipeline for each state:
Output files named in the pipeline steps:
- blockgroup_scale.geojson
- blockgroup_diversity.geojson
- blockgroup_quality.geojson
- blockgroup_accessibility.geojson
- farmshed_combined.geojson
- hospitals_with_farmshed.geojson

Example run for Massachusetts:

python scripts/build_state_farmshed_v2.py --state-fips 25 --state-name massachusetts
All outputs saved to data_out/{state_name}/:
- boundary.geojson – State outline
- hospitals.geojson – Hospital locations
- population_density.geojson – Block groups with population
- farm_points_unified.geojson – All farm point data
- farm_polygons_human_food.geojson – CDL+OSM filtered for human food
- blockgroup_scale.geojson – Scale dimension (0-100)
- blockgroup_diversity.geojson – Diversity dimension (0-100)
- blockgroup_quality.geojson – Quality dimension (0-100)
- blockgroup_accessibility.geojson – Accessibility dimension (0-100)
- farmshed_combined.geojson – Combined farmshed (0-100)
- hospitals_with_farmshed.geojson – Final output with all scores

The pipeline supports any U.S. state. Currently processed states:
| State | FIPS | Hospitals | Notes |
|---|---|---|---|
| Connecticut | 09 | ~30 | Northeast region |
| Maine | 23 | ~35 | Northeast region |
| Massachusetts | 25 | ~66 | Primary case study |
| New Hampshire | 33 | ~25 | Northeast region |
| New Jersey | 34 | ~70 | Northeast region |
| New York | 36 | ~180 | Largest sample |
| Pennsylvania | 42 | ~170 | Agricultural heartland |
| Rhode Island | 44 | ~12 | Smallest state |
| Vermont | 50 | ~15 | Northeast region |
This hospital farmshed analysis tool was developed to support research into local food systems, farm-to-institution programs, and regional agricultural capacity. The four-dimensional methodology provides a framework for assessing local food access that accounts for production scale, product diversity, farming practices, and distribution infrastructure.
While primarily designed for hospital food sourcing analysis, the underlying farmshed scores can serve broader applications: commercial food businesses evaluating location decisions, policy makers assessing food system investments, or individuals understanding their local agricultural landscape.
Created by: TyreeSpatial
For questions, collaborations, or custom analyses: contact@tyreespatial.com