How the data were processed

This data package combines original files that were broken into separate Excel files, then splits the combined data by geographic grain, so that each output file covers only one type of geography for the whole state. For instance, the original release has four Public_Transit_Access files, one for each large region of California. These files were combined, then split again into five files, one each for the county, tract, place, region and CMSA geographic aggregations.
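The combine-then-split step can be sketched roughly as follows. This is a minimal illustration, not the package's actual build code: the function name combine_and_split and the sample rows are hypothetical, and rows are assumed to be dictionaries that carry a geotype key.

```python
from collections import defaultdict


def combine_and_split(regional_tables, geotype_field="geotype"):
    """Concatenate rows from several regional tables, then group them by geotype."""
    by_geotype = defaultdict(list)
    for table in regional_tables:
        for row in table:
            by_geotype[row[geotype_field]].append(row)
    return dict(by_geotype)


# Two hypothetical regional extracts; the rows are illustrative only.
north = [
    {"geotype": "CO", "geotypevalue": "06001"},
    {"geotype": "PL", "geotypevalue": "0653000"},
]
south = [{"geotype": "CO", "geotypevalue": "06037"}]

# One list of rows per geography type, each covering every input region.
split = combine_and_split([north, south])
```

Each key of split would then be written out as its own statewide file.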

The geotype field holds a code that describes the type of geography used in aggregating each row. These geographies are generally Census geographies and are identified by Census geoids in the geotypevalue field. The geotype codes are:

  • CA or ST The whole state of California
  • CO A county
  • CD A county subdivision
  • PL A Census Designated Place
  • RE A Sub-state region
  • ZC ZCTA, the Census version of a ZIP code area.
  • R4 Consolidated Metropolitan Statistical Areas
  • MS Metropolitan Statistical Area

These values, excluding RE, R4 and MS, are converted to a GVid for linking to other files.

On most files the state code is CA, but in the Open Space file it is ST.

These geotype codes are all mapped to names and used as part of the file names.
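As a rough sketch of how the codes might be mapped to names and folded into file names: the names for county, place, region and CMSA follow the wording used above, while the remaining names, the function output_file_name and the file-name pattern are assumptions made for illustration.

```python
# Code-to-name mapping. Names other than county, place, region and cmsa
# are assumed for illustration and may differ from the package's own.
GEOTYPE_NAMES = {
    "CA": "state",
    "ST": "state",  # the Open Space file uses ST instead of CA
    "CO": "county",
    "CD": "county_subdivision",
    "PL": "place",
    "RE": "region",
    "ZC": "zcta",
    "R4": "cmsa",
    "MS": "msa",
}


def output_file_name(table_name, geotype):
    """Build a hypothetical per-geography file name from a table name and geotype code."""
    name = GEOTYPE_NAMES[geotype.strip().upper()]
    return f"{table_name}_{name}.csv"


example = output_file_name("public_transit", "CO")
```

Mapping both CA and ST to the same name keeps the Open Space file consistent with the others.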

Other important processing steps included:

  • The ind_definition and ind_id fields, which hold a name and number for each indicator, were removed from the data and used as table descriptions. These values appear to be constant across all rows in a file.
  • Some very large RSE values were changed to NULL.
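The removal of ind_definition and ind_id could look something like the sketch below. The function name extract_table_description and the sample rows are hypothetical. Because the values only appear to be constant, the sketch raises an error rather than silently picking one.

```python
def extract_table_description(rows, fields=("ind_id", "ind_definition")):
    """Strip constant indicator fields from rows and return them as a description.

    Raises ValueError if a field takes more than one value (or none at all).
    """
    description = {}
    for field in fields:
        values = {row[field] for row in rows}
        if len(values) != 1:
            raise ValueError(f"{field} is not constant across rows")
        description[field] = values.pop()
    stripped = [
        {key: value for key, value in row.items() if key not in fields}
        for row in rows
    ]
    return description, stripped


# Illustrative rows; the real column values may be formatted differently.
rows = [
    {"ind_id": "51", "ind_definition": "Percent of population within 1/2 mile of major transit stop", "geotype": "CO"},
    {"ind_id": "51", "ind_definition": "Percent of population within 1/2 mile of major transit stop", "geotype": "PL"},
]
description, stripped = extract_table_description(rows)
```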

As of Dec 1, 2015, in the Neighborhood Change files, the Relative Standard Error (rse) column is often computed for values that are very close to zero, so the RSE is very large. In other files in this dataset, the rse value is capped at 100. Per Dulce Bustamante-Zamora at CDPH, these values should be blank (NULL), so this correction is made for rows where the difference is 0.
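A minimal sketch of this correction follows. The text does not name the column that holds the difference, so the field name difference is an assumption, as are the function name and the sample values; the values are assumed to be numeric already.

```python
def null_rse_for_zero_difference(rows, difference_field="difference", rse_field="rse"):
    """Return copies of rows with rse set to None wherever the difference is 0."""
    cleaned = []
    for row in rows:
        row = dict(row)  # copy so the caller's rows are left unchanged
        if row.get(difference_field) == 0:
            row[rse_field] = None
        cleaned.append(row)
    return cleaned


# Illustrative rows: the first has a zero difference and an implausibly large RSE.
sample = [
    {"difference": 0, "rse": 1.0e9},
    {"difference": 120, "rse": 12.5},
]
cleaned = null_rse_for_zero_difference(sample)
```

Only rows with a difference of exactly 0 are touched; other large RSE values are left as they are.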

The reportyear field can be either a single integer year, or it may be a range of years, which is represented as a string.
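Code that reads reportyear has to handle both forms. The sketch below assumes a range is written as two years joined by a hyphen, such as "2006-2010"; that format and the function name parse_reportyear are assumptions, not something stated by the source.

```python
def parse_reportyear(value):
    """Return (first_year, last_year) for a single year or a 'YYYY-YYYY' range."""
    text = str(value).strip()
    if "-" in text:
        start, end = (int(part) for part in text.split("-", 1))
    else:
        start = end = int(text)
    return start, end


single = parse_reportyear(2010)
span = parse_reportyear("2006-2010")
```

Returning a pair in both cases lets downstream code treat a single year as a one-year range.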


Source: California Department of Public Health, Office of Health Equity


Name | Partitions | Description | id
housing_cost | | HCI Indicator 106.0: Percent of households spending more than 30% (50%) of monthly household income on monthly gross rent or selected housing costs | t04p0A003
neighborhood_change | | HCI Indicator 772.0: Neighborhood change: 10-year change in number of households by income and race/ethnicity | t04p0B003
household_type_tracts | | HCI Indicator 746.0: Household by type of family and head of household | t04p0C003
household_type | | HCI Indicator 746.0: Household by type of family and head of household | t04p0D003
living_wage | | HCI Indicator 770.0: Living wage and percent of families with incomes below the living wage | t04p0E003
walk_bicycyle | | HCI Indicator 778.0: Percent of population aged 16 years or older whose commute to work is 10 minutes/day or more by walking or biking | t04p0F003
food_affordability | | HCI Indicator 757.0: Food affordability for female-headed household with children under 18 years | t04p0G003
healthy_food | | HCI Indicator 75.0: Modified retail food environment index | t04p0H003
poverty_rate | | HCI Indicator 754.0: Overall, concentrated, and child (0 to 18 years of age) poverty rate | t04p0I003
traffic_fatalities | | HCI Indicator 753.0: Annual number of fatal and severe road traffic injuries per population and per miles traveled by transport mode | t04p0J003
violent_crime | | HCI Indicator 752.0: Number of Violent Crimes per 1000 Population | t04p0K003
transport_work | | HCI Indicator 42.0: Percent of residents mode of transportation to work | t04p0L003
household_crowding | | HCI Indicator 137.0: Percent of household overcrowding (> 1.0 PPR) and severe overcrowding (> 1.5 PPR) | t04p0M003
high_school_ed | | HCI Indicator 369.0: High School or Greater Educational Attainment in the Population Aged 25 Years and Older | t04p0N003
alcohol_outlets | | HCI Indicator 774.0: Percent of Population within 1/4 Mile of Alcohol Outlets by Type of Establishment Sales | t04p0O003
unsafe_water | | HCI Indicator 426.0: Drinking water quality (Percent of the population served by community water systems not meeting regulations of the Safe Drinking Water Act) | t04p0P003
air_quality | | HCI Indicator 776.0: Average Ambient PM2.5 concentration (microgram/m3) | t04p0Q003
jobs_housing_ratio | | HCI Indicator 768.0: Jobs to housing ratio | t04p0R003
unemployment | | HCI Indicator 290.0: Unemployment rate | t04p0s003
income_inequality | | HCI Indicator 556.0: Income inequality: Gini coefficient describing the amount of total annual community income generated by the number of households | t04p0S003
registered_voters | | HCI Indicator 653.0: Percent of adults age 18 years and older who are registered voters | t04p0t003
abuse_neglect | | HCI Indicator 741.0: Percent of children (under 18) reported with neglect or physical or sexual abuse | t04p0u003
jobs_employed_ratio | | HCI Indicator 769.0: Jobs to employed residents ratio | t04p0v003
miles_traveled | | HCI Indicator 39.0: Miles traveled per capita by mode (car, public transit, walk/bike) | t04p0w003
ozone | | HCI Indicator 761.0: Average annual number of unhealthy days of ozone | t04p0x003
public_transit | | HCI Indicator 51.0: Percent of population within 1/2 mile of major transit stop | t04p0y003
open_space | | HCI Indicator 469.0: Percent of Population within ½ Mile of Park, Beach, Open Space, or Coastline | t04p0z003


