Racial Bias in the PPB

In 2012, Trayvon Martin, an unarmed 17-year-old black teenager, was fatally shot by a neighborhood watch volunteer. The shooting sparked national outrage and brought wider attention to what black Americans already knew: police presence in black neighborhoods has historically not been friendly or beneficial to the communities officers are supposed to serve. We sought out Portland Police Bureau (PPB) data to look for evidence of institutionalized bias on the part of PPB officers. We chose Portland because we live here and because Oregon’s history of redlining and overt racism makes the question of whether police treat subjects differently by race particularly interesting. The PPB publishes data on incidents its officers were involved in, including dispatched calls in 2019 and officer-involved shootings from 2010-2019, on its open data website (https://www.portlandoregon.gov/police/71673). In the dispatched calls data set, each observation is a call to the PPB, with variables such as how long it took an officer to get to the scene and the priority of the call (e.g., urgent situations involving violence take priority over less urgent incidents such as theft). The officer-involved shootings data set contains each instance of an officer shooting at a subject. It is important to note that this data set is much smaller than the dispatched calls data set, with only 54 cases from 2010 to 2019 versus nearly 250,000 calls in a single year. In addition to the call data and the officer-involved shooting data, we used the PPB data set called “Reported Bias Crime”, which has information on bias crimes reported to the PPB since 2015. On the website, there is a prominent histogram counting these crimes over time, broken down by category. Of particular interest is the apparent uptick since the third quarter of 2018.
We used these data, alongside Portland census data from 2010 (https://www.census.gov/quickfacts/portlandcityoregon), to see whether neighborhoods of color were able to get police assistance in a timely manner when needed and who was shot at by police, whether armed or unarmed. The census data included information on the distribution of ethnic groups across Portland neighborhoods. This allowed us to see how the reported bias crimes are distributed across neighborhoods, and whether those neighborhoods are more or less diverse. The census data are raw tallies of people, so we converted them to proportions, which better capture each neighborhood’s ethnic makeup. We then counted the bias crimes in each neighborhood and viewed them alongside the proportion of non-white residents with the following code:

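library(dplyr)    # data manipulation verbs and %>%
library(janitor)  # clean_names()
library(ggplot2)  # plots later in the analysis

# Convert census tallies into proportions of white and non-white residents
prop <- Census2010_1 %>%
  select(1, 2, 17) %>%  # by position: keeps the neighborhood, total, and white columns
  mutate(white = as.numeric(white),
         total = as.numeric(total),
         propwhite = white / total,
         propnonwhite = 1 - propwhite)

# Count bias crimes per neighborhood and attach the census proportions
biasarea <- BiasCrime_All %>%
  clean_names() %>%
  group_by(neighborhood) %>%
  count() %>%
  left_join(prop, by = "neighborhood") %>%
  select(neighborhood, count = n, propnonwhite) %>%
  filter(count >= 5)  # keep neighborhoods with at least 5 reported bias crimes
biasarea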
## # A tibble: 8 x 3
## # Groups:   neighborhood [8]
##   neighborhood        count propnonwhite
##   <chr>               <int>        <dbl>
## 1 Buckman                12       0.0807
## 2 Cully                   5       0.374 
## 3 Downtown               31       0.184 
## 4 East Columbia           5       0.383 
## 5 Hazelwood               6       0.272 
## 6 Lloyd                   6      NA     
## 7 Old Town/Chinatown     14      NA     
## 8 Powellhurst-Gilbert   102       0.327
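
As a quick check on the table above, the following sketch counts how many of the listed neighborhoods exceed a 25% non-white share. It is only an illustration: the values are re-entered by hand from the printed output, `biasarea_printed` is a hypothetical stand-in for the real `biasarea` object, and base R is used so that it runs without extra packages.

```r
# Values copied by hand from the printed `biasarea` output above;
# `biasarea_printed` is a hypothetical stand-in for the real object.
biasarea_printed <- data.frame(
  neighborhood = c("Buckman", "Cully", "Downtown", "East Columbia",
                   "Hazelwood", "Lloyd", "Old Town/Chinatown",
                   "Powellhurst-Gilbert"),
  count = c(12, 5, 31, 5, 6, 6, 14, 102),
  propnonwhite = c(0.0807, 0.374, 0.184, 0.383, 0.272, NA, NA, 0.327),
  stringsAsFactors = FALSE
)

# Neighborhoods for which the join returned no census proportion
n_missing <- sum(is.na(biasarea_printed$propnonwhite))

# Among neighborhoods with census data, count those above 25% non-white
has_census <- !is.na(biasarea_printed$propnonwhite)
n_with_census <- sum(has_census)
n_above_25 <- sum(biasarea_printed$propnonwhite[has_census] > 0.25)

cat(n_above_25, "of", n_with_census,
    "neighborhoods with census data exceed 25%;",
    n_missing, "have no census match\n")
```

Rows with `NA` are excluded from the comparison rather than counted as below the threshold, since their census share is simply unknown.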

Of the 8 neighborhoods with at least 5 reported bias crimes, 6 have a census proportion available (Lloyd and Old Town/Chinatown show NA because the join returned no census value for them). Of those 6, 4 had a proportion of non-white residents greater than 25%, suggesting that bias crimes may be occurring more often in the less white neighborhoods. To examine this further, we turned to the types of crimes being committed. We subset the data to crimes committed in Powellhurst-Gilbert and Downtown, since together these two neighborhoods reported more than a third of the bias crimes, and to biases involving race and ethnicity.

biastype <- clean_names(BiasCrime_All) %>%
  filter(neighborhood %in% c("Downtown", "Powellhurst-Gilbert"), bias_category == "Race/ Ethnicity/ Ancestry")
ggplot(biastype, aes(x = bias_type, fill = offense_type)) +
  geom_bar(position = "dodge") +
  coord_flip() +
    labs(title = "Distribution of Offense Type By Bias Type", 
       subtitle = "Portland, Oregon",
       fill = "Offense Type",
       x = "Bias Type")+
  scale_fill_manual(values=c("cornflowerblue", "dodgerblue4", "blue3", "navy")) +
  theme(plot.subtitle = element_text(face = "italic"))

Those robberies are quite the outlier. Upon investigation, it appears that they were committed by the same 12 women within a short period of time, which skewed the data for the Powellhurst-Gilbert neighborhood. To get a better idea of the crimes being committed, we removed the robberies and plotted the data again, this time without restricting the plot to the two neighborhoods:

biastype <- clean_names(BiasCrime_All) %>%
  filter(bias_category == "Race/ Ethnicity/ Ancestry", offense_type != "Robbery")
ggplot(biastype, aes(x = bias_type, fill = offense_type)) +
  geom_bar(position = "dodge") +
  coord_flip() +
    labs(title = "Distribution of Offense Type By Bias Type", 
       subtitle = "Portland, Oregon",
       fill = "Offense Type",
       x = "Bias Type")+
#  scale_fill_manual("darkgoldenrod", "burlywood4", "darkolivegreen", "darkslateblue", "gray48", "navy")+
  theme(plot.subtitle = element_text(face = "italic"))

Note the high incidence of assault, primarily affecting minority communities. Nonwhite communities make up a small share of the population across the Pacific Northwest, and Portland is no exception, yet these communities are frequently over-represented in crime statistics; this is another concrete example. With this evidence of bias against minorities in Portland in mind, we next present officer-involved shootings broken down by subject race:

oisData %>%
  ggplot(aes(x = `Subject Race`)) +
  geom_bar(aes(fill = `Was Subject Injury Fatal?`)) +
  labs(title = "Distribution of Officer-Involved Shootings by Subject Race", 
       subtitle = "Portland, Oregon",
       y = "Frequency") +
  scale_fill_manual(breaks = c('No', 'Yes'), 
                      values=c("cornflowerblue", "dodgerblue4")) +
  theme(plot.subtitle = element_text(face = "italic"))

oisData %>%
  group_by(`Subject Race`) %>%
  summarise(frequency = n(),
            fatal_shooting = sum(`Was Subject Injury Fatal?` == "Yes"),
            fatal_shooting_pct = sum(`Was Subject Injury Fatal?` == "Yes")/n())
## # A tibble: 3 x 4
##   `Subject Race` frequency fatal_shooting fatal_shooting_pct
##   <chr>              <int>          <int>              <dbl>
## 1 Black                 10              6              0.6  
## 2 Hispanic               2              2              1    
## 3 White                 42             22              0.524

The frequency of officer-involved shootings does not appear to be representative of the population: black subjects account for 10 of the 54 shootings (about 19%). The share of shootings that were fatal is also somewhat higher for black subjects (60%) than for white subjects (52%). Unfortunately, we run into a data collection issue: with so few observations, it is hard to draw firm conclusions about rates of fatality.
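
To illustrate how little the fatality rates can be distinguished with this few cases, here is a minimal sketch using Fisher's exact test and an exact binomial confidence interval. This is not part of the original analysis: the counts are re-entered by hand from the summary table, the object names are hypothetical, and the two Hispanic cases are left out because two observations are too few to compare.

```r
# Counts copied by hand from the summary table above (Black and White rows).
# Rows: subject race; columns: whether the injury was fatal.
ois_counts <- matrix(
  c(6, 4,     # Black: 6 fatal, 4 non-fatal
    22, 20),  # White: 22 fatal, 20 non-fatal
  nrow = 2, byrow = TRUE,
  dimnames = list(race = c("Black", "White"),
                  fatal = c("Yes", "No"))
)

# Fisher's exact test is suited to small cell counts like these
ft <- fisher.test(ois_counts)
ft$p.value

# Exact 95% confidence interval for the fatality rate among Black subjects
ci_black <- binom.test(6, 10)$conf.int
ci_black
```

A wide interval around the 60% figure, together with a large p-value, would be consistent with the caution above: the table alone cannot establish a difference in fatality rates by race.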

The final set of PPB data we analyzed was the aforementioned dispatched calls data set. This data set includes information on each of the 249,823 calls for police service in 2019 in which one or more officers were dispatched. First, we performed spatial analysis on the geographic distribution of dispatched calls, with each point in the following visualization representing a single call:

dispatchData %>%
  ggplot(aes(x = OpenDataLat, y = OpenDataLon)) +
  geom_point(color = "dodgerblue4", alpha = 0.01) +  # low alpha reveals dense areas
  ggtitle("Spatial Distribution of Portland Dispatches") +
  xlab("Latitude") +
  ylab("Longitude") +
  xlim(45.45, 45.65) +
  ylim(-122.85, -122.4) +
  theme_minimal() +
  # theme() goes after theme_minimal(), which would otherwise reset the centered title
  theme(plot.title = element_text(hjust = 0.5))

Looking at the above graphic, it appears the majority of dispatched calls were placed at or around downtown (with latitudes around 45.525 and longitudes around -122.7). To further illustrate this clustering, we created a heat map that similarly illustrates the geographic distribution of dispatched calls, where there is again a noticeable cluster of dispatched calls at or around the downtown area:

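dispatchData %>%
  ggplot(aes(x = OpenDataLat, y = OpenDataLon)) +
  geom_bin2d(bins = 20) +  # count calls in a 20 x 20 grid of rectangles
  # the bin counts map to fill (not color), and they are continuous
  scale_fill_distiller(palette = "Blues", direction = 1) +
  ggtitle("Spatial Distribution of Portland Dispatches") +
  xlab("Latitude") +
  ylab("Longitude") +
  xlim(45.45, 45.65) +
  ylim(-122.85, -122.4) +
  theme_minimal() +
  # theme() goes after theme_minimal(), which would otherwise reset the centered title
  theme(plot.title = element_text(hjust = 0.5))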
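
The idea behind the heat map (divide the plane into rectangles and count the calls falling in each) can be sketched in base R. The coordinates below are made-up toy values, not PPB data, and `toy_calls`, `lat_breaks`, and `lon_breaks` are hypothetical names; only the plotted axis ranges are taken from the code above.

```r
# Hypothetical toy coordinates (not PPB data): five calls, three of them
# close together.
toy_calls <- data.frame(
  lat = c(45.521, 45.523, 45.524, 45.580, 45.470),
  lon = c(-122.68, -122.67, -122.68, -122.60, -122.80)
)

# Cut each axis into equal-width intervals over the plotted range
lat_breaks <- seq(45.45, 45.65, length.out = 5)      # 4 latitude bins
lon_breaks <- seq(-122.85, -122.40, length.out = 5)  # 4 longitude bins

lat_bin <- cut(toy_calls$lat, breaks = lat_breaks, include.lowest = TRUE)
lon_bin <- cut(toy_calls$lon, breaks = lon_breaks, include.lowest = TRUE)

# Count calls per (latitude bin, longitude bin) cell
bin_counts <- table(lat_bin, lon_bin)

# Locate the cell with the most calls (row and column index of the table)
busiest <- which(bin_counts == max(bin_counts), arr.ind = TRUE)
busiest
max(bin_counts)
```

With the real data, the busiest cell would correspond to the darkest rectangle in the heat map.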

One of the most important variables we considered when analyzing the dispatched calls was the response time: the total number of seconds between a call being placed in the police dispatch queue and the first officer arriving on scene. Response time was most strongly associated with the priority level of the call (determined by the PPB based on the perceived severity of the incident and the resulting urgency of the response). Predictably, high-priority calls generally had the shortest response times:

dispatchData %>%
  group_by(Priority) %>%
  summarise(avgResponseTime = mean(ResponseTime_sec, na.rm = TRUE),
            avgTimeInQueue = mean(TimeInQueue_sec, na.rm = TRUE),
            avgTravelTime = mean(TravelTime_sec, na.rm = TRUE)) %>%
  arrange(avgResponseTime)
## # A tibble: 3 x 4
##   Priority avgResponseTime avgTimeInQueue avgTravelTime
##   <chr>              <dbl>          <dbl>         <dbl>
## 1 High                488.           108.          381.
## 2 Medium              943.           518.          435.
## 3 Low                2624.          2081.          565.
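dispatchData %>%
  # order the priority levels from most to least urgent on the x-axis
  mutate(Priority = factor(Priority, levels = c("High", "Medium", "Low"))) %>%
  ggplot(aes(x = Priority, y = ResponseTime_sec)) +
  geom_violin(aes(fill = Priority)) +
  scale_y_log10() +
  ggtitle("Priority vs. Response Time") +
  xlab("Priority") +
  ylab("Response Time (Seconds)") +
  guides(fill = "none") +  # the legend is redundant with the x-axis
  theme_minimal() +
  # theme() goes after theme_minimal(), which would otherwise reset the centered title
  theme(plot.title = element_text(hjust = 0.5))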
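
To make the averages in the summary table easier to read, the sketch below converts them to minutes and computes how much of the average response time was spent in the dispatch queue. The numbers are re-entered by hand from the printed table, and `priority_avgs` is a hypothetical name. Because each average is computed separately with `na.rm = TRUE`, the queue share here is a ratio of averages and only an approximation.

```r
# Averages (in seconds) copied by hand from the printed summary above
priority_avgs <- data.frame(
  Priority = c("High", "Medium", "Low"),
  avgResponseTime = c(488, 943, 2624),
  avgTimeInQueue = c(108, 518, 2081),
  avgTravelTime = c(381, 435, 565)
)

# Convert the average response time to minutes
priority_avgs$avgResponseMin <- round(priority_avgs$avgResponseTime / 60, 1)

# Approximate share of the average response spent waiting in the queue
priority_avgs$queueShare <- round(
  priority_avgs$avgTimeInQueue / priority_avgs$avgResponseTime, 2
)

priority_avgs[, c("Priority", "avgResponseMin", "queueShare")]
```

Read this way, average travel time varies far less across priority levels than average time in queue does.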

While it appears that the PPB appropriately responded faster to dispatched calls of higher priority, we also looked for relationships between response times and the demographic characteristics of neighborhoods. Specifically, we looked at whether neighborhoods with lower proportions of white residents and lower median household incomes were receiving equitable police service, as measured by response time to dispatched calls:

dispatchDataCensus %>%
  group_by(Neighborhood) %>%
  summarise(avgResponseTime = mean(ResponseTime_sec, na.rm = TRUE),
            white_pct = mean(white)) %>%
  ggplot(aes(x = white_pct, y = avgResponseTime)) +
  geom_jitter(color = "dodgerblue4", alpha = 0.5) +
  geom_smooth(method = 'lm', color = "dodgerblue4") +  # linear trend line
  ggtitle("Proportion of White Residents (Neighborhood) vs. Average Response Time") +
  xlab("Proportion of White Residents (Neighborhood)") +
  ylab("Average Response Time (Seconds)") +
  theme_minimal() +
  # theme() goes after theme_minimal(), which would otherwise reset the centered title
  theme(plot.title = element_text(hjust = 0.5))

dispatchDataNeighborhood %>%
  group_by(Neighborhood) %>%
  summarise(avgResponseTime = mean(ResponseTime_sec, na.rm = TRUE),
            medianIncome = mean(`Median household income ($)`, na.rm = TRUE)) %>%
  ggplot(aes(x = medianIncome, y = avgResponseTime)) +
  geom_jitter(color = "dodgerblue4", alpha = 0.5) +
  geom_smooth(method = 'lm', color = "dodgerblue4") +  # linear trend line
  ggtitle("Median Household Income (Neighborhood) vs. Average Response Time") +
  xlab("Median Household Income (Neighborhood)") +
  ylab("Average Response Time (Seconds)") +
  theme_minimal() +
  # theme() goes after theme_minimal(), which would otherwise reset the centered title
  theme(plot.title = element_text(hjust = 0.5))
## `geom_smooth()` using formula 'y ~ x'

While the proportion of white residents in a neighborhood appeared to have little to no relationship with the response time of dispatched calls, there was a noticeable positive relationship between a neighborhood’s median household income and its average response time. This can likely be attributed to lower-income neighborhoods being clustered in areas that receive a greater proportion of dispatched calls and that also happen to be near the PPB’s central office. Specifically, the neighborhood with the lowest median household income (Downtown) is also home to the PPB’s central office and (as illustrated in the spatial analysis) had the highest volume and concentration of dispatched calls. This likely explains why neighborhoods with low median household incomes (like Downtown) generally see shorter response times.
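
One way to put a number on relationships like these is to fit the same kind of straight line that `geom_smooth(method = 'lm')` draws and to test the correlation. The sketch below is only an illustration under stated assumptions: the five neighborhoods are made-up values, not PPB or census data, and `toy_neighborhoods` is a hypothetical name.

```r
# Hypothetical neighborhood-level summaries (made-up numbers, not PPB data)
toy_neighborhoods <- data.frame(
  medianIncome = c(30000, 45000, 60000, 75000, 90000),
  avgResponseTime = c(700, 820, 950, 1010, 1180)
)

# Simple linear regression of average response time on median income
fit <- lm(avgResponseTime ~ medianIncome, data = toy_neighborhoods)
slope <- unname(coef(fit)["medianIncome"])

# Pearson correlation with a test of the null hypothesis of no association
ct <- cor.test(toy_neighborhoods$medianIncome,
               toy_neighborhoods$avgResponseTime)

slope * 10000  # change in seconds per $10,000 of median income
ct$estimate    # correlation coefficient
ct$p.value
```

Applied to the real neighborhood summaries, a slope near zero for the proportion of white residents and a clearly positive slope for median income would match what the two plots suggest. A neighborhood-level association of this kind does not by itself show why response times differ.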

Overall, the PPB appears to be fairly proficient at meeting the needs of the community equitably. We did find some troubling figures, such as the higher rates of assault against minorities in bias crimes and a higher rate of officer-involved shootings of minorities than would be expected from Portland’s census demographics. However, the dispatched calls data suggest that the PPB responds equitably to emergency situations. Additionally, the transparency of the Open Data itself suggests that the PPB is accepting a degree of accountability and moving forward in addressing inequity in Portland law enforcement. Some steps have certainly been taken, but the PPB is clearly still imperfect. We have a long way to go yet.