CSE442 | Chicago Crime Analysis

Team Members: Saidutt Nimmagadda, Ross Monster, Vardhman Mehta, Divye Jain

How do the incidence and location of different crimes vary over the years in Chicago? (2010-16)

Design Rationale

We knew that we wanted to visualize not only where particular crimes occur, but also how the location of incidents changes over the course of a few years. We therefore used position to encode where crimes occur and color to encode how often they occur. Position is encoded via markings that delineate the individual communities in Chicago. We used a normalized color scale to make the differences in crime rates between communities more salient within a single visualization. This was the result of a tradeoff. Because the color scale is normalized, comparing unnormalized frequencies between visualizations is harder (e.g. a bright red for ARSON in 2015 might be a frequency that’s much higher than a bright red for DOMESTIC VIOLENCE in 2016). However, the alternatives were worse: mapping individual crimes onto the map caused performance issues, and coloring each community by its unnormalized frequency led to unclear comparisons between communities within any particular visualization. To compensate, a mouse-over tooltip conveys the unnormalized frequency of a crime in a community. We used a green-to-red scale, which in retrospect is something we will modify. We understand that this scale is problematic for viewers with red-green color blindness (roughly 8% of men and 0.5% of women), but we needed a little more time to select the right scale.
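The normalization described above can be sketched as follows. This is a minimal illustration in Python, not the project's D3 code. It assumes min-max scaling across communities within one crime and year, and a linear blend between two RGB colors; the writeup does not state which normalization or interpolation was actually used. The incident counts and the helper names (`count_by_community`, `normalize`, `to_color`) are hypothetical.

```python
from collections import Counter

# Assumed endpoints of the green-to-red scale (exact shades are hypothetical).
GREEN = (0, 128, 0)
RED = (255, 0, 0)


def count_by_community(incident_communities):
    """Count incidents per community (the raw value a tooltip would show)."""
    return Counter(incident_communities)


def normalize(counts):
    """Min-max scale raw counts to [0, 1] within one crime/year view."""
    if not counts:
        return {}
    lo, hi = min(counts.values()), max(counts.values())
    span = hi - lo
    # If every community has the same count there is no contrast to show.
    return {name: (n - lo) / span if span else 0.0 for name, n in counts.items()}


def to_color(t, start=GREEN, end=RED):
    """Linearly interpolate between two RGB colors; return a hex string."""
    r, g, b = (round(s + (e - s) * t) for s, e in zip(start, end))
    return f"#{r:02x}{g:02x}{b:02x}"


# Hypothetical incidents for one crime in one year, labeled by community.
incidents = ["Austin"] * 40 + ["Loop"] * 10 + ["Hyde Park"] * 25
raw = count_by_community(incidents)      # unnormalized, for the tooltip
scaled = normalize(raw)                  # normalized, for the color
colors = {name: to_color(t) for name, t in scaled.items()}
```

Because the scaling is recomputed for each view, the community with the most incidents is always fully red, which is exactly why colors cannot be compared across crimes or years and why the raw count is kept for the tooltip.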

We implemented three interaction techniques. First, the user chooses a particular crime from a drop-down menu; a drop-down restricts queries to the crimes we know are in the dataset. Second, a slider lets the user intuitively cycle between years in the visualization. Third, when the user hovers over a point on the map, a tooltip shows the neighborhood that point is in, as well as the unnormalized frequency of the crime in that neighborhood. Together, the drop-down and the slider let a user generate a visualization for one crime and one year at a time. Our visualizations would crash if the CSV files with our data were too large, so we load CSV files dynamically, as needed, whenever a particular crime is selected. We also felt that visualizing multiple crimes at once, which we could implement in the future, would not necessarily convey the salient trends about individual crimes that we wanted our prototype to convey.
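The load-on-selection idea can be sketched as below. This is an illustration in Python rather than the browser code the project used, under several assumptions: one CSV file per crime, a file name derived from the crime name, and a `Year` column. The function names, file layout, and sample rows are all hypothetical.

```python
import csv
import re
import tempfile
from functools import lru_cache
from pathlib import Path


def crime_file(crime, data_dir):
    """Map a crime name such as 'MOTOR VEHICLE THEFT' to a CSV path."""
    slug = re.sub(r"[^a-z0-9]+", "_", crime.lower()).strip("_")
    return Path(data_dir) / f"{slug}.csv"


@lru_cache(maxsize=None)
def load_crime(crime, data_dir):
    """Read one crime's file the first time it is requested, then cache it."""
    with open(crime_file(crime, data_dir), newline="") as f:
        return tuple(csv.DictReader(f))


def rows_for(crime, year, data_dir):
    """Rows for one crime and one year, i.e. one drop-down + slider state."""
    return [r for r in load_crime(crime, data_dir) if r["Year"] == str(year)]


# Demo with a tiny, made-up file standing in for one per-crime dataset.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "arson.csv").write_text(
        "Year,Community Area\n2012,25\n2016,8\n2016,25\n"
    )
    rows_2016 = rows_for("ARSON", 2016, data_dir)  # reads the file
    rows_2012 = rows_for("ARSON", 2012, data_dir)  # served from the cache
    loads = load_crime.cache_info()
```

Moving the slider only filters rows that are already in memory, so a file is read once per selected crime rather than once per year change.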

Development Process

We initially met up and spent some time exploring the data and discussing what we wanted to visualize and which interaction techniques we wanted to implement. We quickly decided that, since our dataset had latitude and longitude values for each crime incident, a map visualization would be feasible. We then settled on a question: how do the incidence and location of crimes vary over the years in Chicago? Afterwards, we decided on a drop-down menu to select the crime to visualize and a slider to move between years in the visualization. Grappling with the size of our dataset, we understood that we had to restrict both the number of years and the number of crimes we could visualize at one time. Our full dataset was over a gigabyte in size; there was no way we could load the entire dataset at once in the browser.

Then, we each began brushing up on our D3 in order to make a functioning map of the city of Chicago. Vardhman was integral to this portion of the visualization, gathering the TopoJSON and map data necessary to render the map on a page. Divye then began setting up our GitHub Pages site, as well as adding a visualization and interactive tooltip of streets in Chicago. We later scrapped this, as we felt that individual street names were not as salient as community areas. Ross then took a small subset of the crime data and implemented a simple crime mapping that used latitude-longitude values to project red circle marks onto the corresponding locations on the map. Sai wrote a script that split our large dataset into many individual datasets, reducing the year interval to 2010-16 and placing each crime's data in its own file. Then he and Divye worked on the drop-down menu so that the user could select the crime to visualize. Sai then implemented the visualization of specific crimes based on user selection, and updated the data files to include community names so that Vardhman could implement the display of community names in tooltips. Divye then implemented the slider for year selection. Divye and Vardhman then implemented some UI fixes, such as random color encoding on the map visualizations, before cleaning up some of the code.

Although this visualization was complete, we changed its nature because of performance issues. Instead of mapping every occurrence of a crime onto the map, we made a community heatmap. Ross used the association between individual crimes and their communities to assign each community a color representing the normalized frequency of that crime. Sai then wrote the writeup. By far, what took the longest was figuring out exactly how to load the map and project points onto it. Once that was figured out, filtering the dataset and choosing which points to project was not especially difficult. Overall, we believe we spent around 18 to 20 person-hours, including the time spent on D3 map visualization research.
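The dataset-splitting step described above can be sketched as follows. The writeup does not say what language or column names the original script used, so this Python version is only an illustration: the `Year` and `Primary Type` column names, the output file naming, and the sample rows are assumptions.

```python
import csv
import re
import tempfile
from collections import defaultdict
from pathlib import Path


def split_by_crime(source, out_dir, first_year=2010, last_year=2016):
    """Stream a large CSV and write one file per crime for the chosen years.

    Rows are processed one at a time, so the full file is never held in memory.
    Returns the number of rows written for each crime.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    writers, handles = {}, []
    counts = defaultdict(int)
    with open(source, newline="") as src:
        reader = csv.DictReader(src)
        try:
            for row in reader:
                year = row["Year"]
                # Skip rows with a missing year or one outside the interval.
                if not year.isdigit() or not first_year <= int(year) <= last_year:
                    continue
                crime = row["Primary Type"]
                if crime not in writers:
                    # Open a new output file the first time a crime appears.
                    slug = re.sub(r"[^a-z0-9]+", "_", crime.lower()).strip("_")
                    handle = open(out_dir / f"{slug}.csv", "w", newline="")
                    handles.append(handle)
                    writers[crime] = csv.DictWriter(handle, reader.fieldnames)
                    writers[crime].writeheader()
                writers[crime].writerow(row)
                counts[crime] += 1
        finally:
            for handle in handles:
                handle.close()
    return dict(counts)


# Demo on a tiny, made-up source file.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "crimes.csv"
    source.write_text(
        "Year,Primary Type,Community Area\n"
        "2009,ARSON,25\n"
        "2012,ARSON,25\n"
        "2015,THEFT,32\n"
        "2016,ARSON,8\n"
    )
    written = split_by_crime(source, Path(tmp) / "by_crime")
    files = sorted(p.name for p in (Path(tmp) / "by_crime").iterdir())
```

In this demo the 2009 row falls outside the interval and is dropped, leaving one file for each of the two crimes.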