MBTA: Visualizing systemwide bus crowding

I developed this project during my summer 2016 co-op in the Office of Performance Management and Innovation at the Massachusetts Bay Transportation Authority (MBTA).

When I began the co-op, my department had recently received the results of a system-wide study of bus crowding carried out by its partners at the MIT transit lab. The data was available in a database, but the team wanted a browsable web interface that would let internal parties explore the data and identify crowding problems along different routes throughout the day.

As the sole designer and developer on the project, I met with groups in several related departments, and worked in close partnership with Arthur Prokosch (database analysis) and Laurel Paget-Seekins (project oversight) to develop a visualization that would suit their needs.

The website was designed as a browsing tool for internal users who are experts in transit data and want to understand where crowding occurs within the bus system and how it varies throughout the day. It was also important for them to be able to see several different data aggregation summaries for the selected route. I coded the front-end interface in a mixture of HTML5 and JavaScript to display data from a custom API, and used the d3, Leaflet, and Angular libraries to support additional chart features.

The live version of the website was released for internal use only, but you can get a sense of the project by viewing the videos below. The first video shows the main website dashboard, which displays crowding for different routes. The second video shows a system-wide map highlighting the worst crowding throughout the day.

Unfortunately, I did not blog about my process during project development because we were working with an unpublished data set. The official MBTA blog post about the project is here. Additional notes and sketches from various stages of the project are shown below.

Stage 1: Defining the user

The project goals and audience were completely undefined when I began my internship, so the first stage of the project focused on defining the user for the final visualization and identifying the tasks that they needed to complete. We were very aware that we were making sweeping generalizations about very complex user groups, but identifying general interests helped us to narrow down which features of the visualization (and of the data itself) would appeal to the different audiences.

Before doing the user persona exercise, we had hoped to create a single visualization for all user groups (execs, analysts, and the general public) in one go. After seeing how divergent the interests of the different groups were, we decided to focus first on an analytical tool that could possibly be expanded later to suit a wider audience.

Stage 2: Sketches

Next, I began working with various transit data experts in the department to learn what kind of information we could expect to find in the database, and how they needed to view that data to accomplish specific analytical tasks.

One thing that came out of this exploration very clearly was that transit experts prefer to see their data on a map, even if it makes features of the data itself harder to see. They also needed to see multiple views of the data at one time, with a mix of detail and summary statistics on one page. With that in mind, we began sketching out a rough layout and hierarchy for a data dashboard that would meet those needs.

The dashboard has an interactive route map on the left, which serves as a browser to control the linked charts to its right. All click interactions are supplemented with standard UI elements such as sliders and dropdown menus, to provide an alternate way of loading the data, and to support searching for a specific route. (Dropdown menus were a departmental favorite, and were specifically requested as the UX method of choice.)

The visualizations show progressively more data detail as the user moves from left to right. On the left, you can see only the broad shape of the route overlaid on a map of the city. The center panel displays time-aggregated crowding data for the selected route, and has additional controls that let the user switch between inbound and outbound data and between times of day. The rightmost panel shows a simplified route map that reports crowding for specific stop-to-stop intervals along the route. The additional squiggle beneath the straight-line intervals is the geographic route information; since the crowding data only contained lat/long points for the stops, the geographic route was sometimes quite different from the colored data segments that we drew to connect the endpoints. The summary statistics above each chart section reflect the different data aggregation levels needed for analysis at that level of detail.
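To make the straight-line interval segments concrete, here is a minimal sketch of how colored stop-to-stop segments could be built from stop coordinates. It is not the project's actual code: the field names (`id`, `lat`, `lon`), the load values, the thresholds, and the colors are all assumptions chosen for illustration.

```javascript
// Illustrative color scale. The thresholds and hex colors are assumptions,
// not the values used in the MBTA dashboard.
function crowdingColor(load) {
  if (load < 0.8) return "#2ca25f"; // lighter load
  if (load < 1.2) return "#fec44f"; // moderate load
  return "#de2d26";                 // heaviest load
}

// Build one straight segment per stop-to-stop interval.
// `stops` is an array in route order (hypothetical shape: { id, lat, lon });
// `intervalLoads` holds one value per interval, so it must be one shorter.
function buildSegments(stops, intervalLoads) {
  if (intervalLoads.length !== stops.length - 1) {
    throw new Error("expected one load value per stop-to-stop interval");
  }
  return intervalLoads.map((load, i) => ({
    from: stops[i].id,
    to: stops[i + 1].id,
    // Endpoints only: the segment ignores the real street geometry,
    // which is why it can diverge from the geographic route line.
    line: [
      [stops[i].lat, stops[i].lon],
      [stops[i + 1].lat, stops[i + 1].lon],
    ],
    color: crowdingColor(load),
  }));
}
```

Because each segment only knows its two endpoints, it will cut across any bend the bus actually follows between those stops, which matches the mismatch described above.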

Stage 3: Connecting to Data

I worked closely with our collaborators in the MIT transit lab, who conducted the bus crowding study, to develop a standardized object format for the data API. There were several difficulties in connecting different aspects of the data, since different pieces were handled in different databases and could not always be manually connected. Route and stop latitude and longitude points were one example, but there were several others. Some of these discrepancies were possible to fix, and others we simply had to live with. Once the basic data structure was determined, my collaborators exposed the data at various aggregation levels through a web API, and I developed an Angular framework to query that API and handle the data loading and chart refresh behavior.
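As a rough illustration of what a standardized object format buys you, the sketch below uses one query shape and one response envelope for every aggregation level. Everything here is hypothetical: the `/crowding` endpoint path, the level names, and the field names are placeholders, not the real API.

```javascript
// Hypothetical aggregation levels; the real API's names may differ.
const AGGREGATION_LEVELS = ["route", "timePeriod", "stopInterval"];

// Build the request URL for one route, direction, and aggregation level.
function crowdingQueryUrl(baseUrl, query) {
  if (!AGGREGATION_LEVELS.includes(query.level)) {
    throw new Error(`unknown aggregation level: ${query.level}`);
  }
  const params = new URLSearchParams({
    route: query.route,
    direction: query.direction,
    level: query.level,
  });
  return `${baseUrl}/crowding?${params.toString()}`;
}

// Wrap raw records in the same envelope regardless of level, so each
// chart can check which query a response belongs to before redrawing.
function toCrowdingResponse(query, records) {
  return {
    route: query.route,
    direction: query.direction,
    level: query.level,
    records,
  };
}
```

With a shared envelope like this, the data-loading layer can treat every chart the same way and only the drawing code needs to know what the records contain.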

Stage 4: Programming the interface

The interface programming effort was fairly straightforward, though there were a few things to figure out before I could get Angular and d3 to play nicely together, and to match the map route drawing (as an SVG layer controlled by d3) to the map layers controlled by Leaflet. The final project interface(s) can be seen at the top of the page; in the end, we developed both the multi-visualization dashboard shown above and a time-based system overview for browsing a map of crowding in the city throughout the day.
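One common way to keep a d3-drawn SVG route aligned with a Leaflet map is to run every coordinate through the map's own projection before building the path. The sketch below shows that idea under stated assumptions; it is not the project's actual code. The `project` argument is a stand-in for the map: with Leaflet it could be `(lat, lon) => map.latLngToLayerPoint([lat, lon])`, and the stop shape (`lat`, `lon`) is hypothetical.

```javascript
// Convert route stops into an SVG path string ("M x,y L x,y ...") using a
// caller-supplied projection from (lat, lon) to pixel coordinates { x, y }.
function routePath(stops, project) {
  return stops
    .map((stop, i) => {
      const p = project(stop.lat, stop.lon);
      // The first point starts the path; later points draw lines to it.
      return `${i === 0 ? "M" : "L"}${p.x},${p.y}`;
    })
    .join(" ");
}
```

The resulting string can be assigned to the `d` attribute of an SVG path. Since pixel positions change whenever the map zooms, the path would need to be recomputed at that point, for example in a handler for Leaflet's `zoomend` event.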

Stage 5: Documentation

Because others would be continuing my work after I left, it was particularly important to document how the code actually worked, to simplify future upgrades and improvements. In addition to adding comments within the code, I produced a series of visual representations showing how the Angular framework related to the different code files and to the application UI, where different CSS classes were used, and how specific global variables were passed through the controller framework.