Six months ago, the City of Chicago launched OpenGrid, an open-source platform to view data about your city, your neighborhood, or even your block. OpenGrid evolved from the city’s internal, proprietary WindyGrid project. Today, OpenGrid forms the bones of WindyGrid. But unlike WindyGrid, OpenGrid’s code is publicly available online and is part of Chicago’s mission to work with civic technology innovators to develop creative solutions. Since its initial release, Chicago has published three new releases of the platform to fix bugs and make improvements based on feedback from developers.

During Amazon’s DC Summit, the City of Chicago announced it will be expanding its partnership with civic technologists. Starting July 8th, developers who want to work with the City of Chicago can join the weekly developer calls to review recent activity and coordinate new work for the OpenGrid platform. Perhaps you want to work on a significant feature suggested in the roadmap, fix a bug, improve usability, or simply, but importantly, write a unit test.

If you’re interested in collaborating with the city, you can take a look at the past meeting notes on the project’s Wiki. While the source code is online, the project’s documentation gives a concise overview for developers. Likewise, look at the instructions for contributors to help us run a useful, collaborative project.

Calls will be held each Friday and details will be posted on the OpenGrid wiki page ahead of the meeting.

The aim of these calls is to better engage potential contributors to, or adopters of, the platform. That interaction will allow better planning, roadmapping, and divvying up of tasks. Ultimately, we believe OpenGrid will be a better platform when we build with others.

We will be experimenting with this approach to deepen collaboration between the city and those who can benefit from, contribute to, or use OpenGrid. It draws upon the practices of Mozilla Science Lab, the Apache Foundation, and other pioneers who have built and maintained significant open source projects across a community. It also builds upon years of collaboration the City of Chicago has established with technologists, non-profits, universities, and companies to help Chicago become a data-driven city.

We’ll post to this blog and provide updates on OpenGrid’s wiki page with the call-in/webinar details.

We are very excited that, this summer, the City of Chicago teamed up with a group of volunteer data scientists to develop new statistical models to improve the accuracy of beach advisories issued due to the presence of E. coli. Sean Thornton, a Program Advisor at Harvard University’s Ash Center, has a terrific article on the project:

For Chicagoans, few things are as enjoyable as a day at the beach. That joy, however, is contingent on clean waters that are free of contaminants such as E. coli bacteria. With the arrival of this year’s beach season, Chicago has built an analytical pilot model that will enhance its Park District’s regular beach water quality inspection process. The model specifically aims to guide which beaches may need to close based on predicted E. coli readings, which helps protect the public with advisories or closures as soon as possible.

This is not Chicago’s first municipal predictive analytics project; the city has had success with predictive models for rodent baiting operations and food establishment inspections, among others.  This model differs, however, because it wasn’t built by the City of Chicago at all, but by a team of volunteers from the city’s civic tech community.

Delivering a better model was both a fascinating and a difficult problem. In fact, the team put together an ensemble of models, ranging from logit regressions to gradient-boosting models, to deliver improved performance. Each week, between 4 and 12 data scientists, statisticians, and researchers worked to develop a model that can reduce the chance of children and other Chicagoans getting sick from an enjoyable day at the beach.

Because the project is open source, you can find the entire source code on GitHub to see if you can improve the work done by a group of civic technologists. Notes about the project can be found on the project’s wiki page, including detailed notes on the lab testing process and weekly updates on the team’s findings.

The summer of 2016 is needed to compare performance. As with any statistical model, the team was concerned with “over-fitting,” where so much effort is made to predict past events that the model suffers when predicting events that actually happen in the future. This summer provides an excellent testing ground for the new model, providing feedback on the predictive power of these new techniques.
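One common guard against over-fitting is to fit a model only on past seasons and score it on a later season it has never seen. The sketch below illustrates that idea under our own assumptions: the `split_by_season` helper, the `year` and `ecoli` field names, and the rows are all hypothetical and are not taken from the project’s code or data.

```python
def split_by_season(records, holdout_year):
    """Separate past seasons (for fitting) from one later season (for testing)."""
    train = [r for r in records if r["year"] < holdout_year]
    test = [r for r in records if r["year"] == holdout_year]
    return train, test


# Made-up rows for illustration only; field names are hypothetical.
records = [
    {"year": 2014, "ecoli": 120},
    {"year": 2015, "ecoli": 310},
    {"year": 2016, "ecoli": 95},
]

# Fit on 2014-2015, then judge the model only on the unseen 2016 season.
train, test = split_by_season(records, holdout_year=2016)
```

Keeping the held-out season entirely out of model fitting is what makes its score a fair estimate of how the model behaves on future summers.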

Open Data

Open data played a key role in enabling this project. For the past two years, the Chicago Park District has published real-time information on the forecasts from the statistical model developed by the United States Geological Survey (USGS). Beginning this week, these results are being stored on the city’s open data portal.

To support this project, the Chicago Park District and City of Chicago also released the actual results of the lab tests on the portal. Over the course of this summer, the team of volunteer civic technologists, the City of Chicago, and the Chicago Park District will be comparing the results of the existing USGS model to those of the new models. Both the forecast data and the actual lab results will be used to show the performance of each model.
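A comparison like this can be sketched by checking each model’s forecasts against the lab results for the same beach-days. The example below is a minimal illustration, not the team’s evaluation code: the `advisory_scores` helper, the readings, and the cutoff value of 235 are all assumptions chosen for the example.

```python
def advisory_scores(predicted, actual, threshold):
    """Return (precision, recall) for advisories issued at or above `threshold`."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(p >= threshold and a >= threshold for p, a in pairs)
    false_pos = sum(p >= threshold and a < threshold for p, a in pairs)
    false_neg = sum(p < threshold and a >= threshold for p, a in pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    return precision, recall


THRESHOLD = 235  # assumed advisory cutoff, for illustration only

# Made-up readings for four beach-days (not real lab or model output).
lab_results = [300, 100, 250, 50]
model_a = [280, 240, 90, 40]
model_b = [310, 120, 260, 60]

scores_a = advisory_scores(model_a, lab_results, THRESHOLD)
scores_b = advisory_scores(model_b, lab_results, THRESHOLD)
```

Precision captures how often an issued advisory was justified, while recall captures how many truly unsafe days were caught; a model can look good on one and poor on the other, so both are worth reporting.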

The above statistical models depend on hourly weather forecasts. For this, the team used the excellent weather models from Forecast.io (which powers the popular Dark Sky apps). Updated weather information for each beach is available here.

Teamwork

Of course, this project would not have been possible without the team that volunteered their time: Matt, Rebecca, Kevin, Melissa, Scott, David, Daniel, Nick, Scott, Chris, and Forest. Well over 150 hours were dedicated to this project by a team who worked hard on helping improve our summers a little more.

Featured image is “Simple Times” by Christopher.F, copyright 2014 and licensed under Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0).

Happy Chicago Bike Week! Chicago has over 200 miles of on-street protected, buffered, and shared bike lanes and over 13,000 bike racks. Chicago also has DIVVY, a bike-sharing program with 480 docking stations across the city; it is the largest bike-share program in North America in terms of geography covered. With 3.2 million trips in 2015 alone, DIVVY riders make up a sizable portion of the bicyclists who get to work, run errands, and get around Chicago by bicycle.

DIVVY Trips dashboard

Today, we’ve rolled out some massive data sets for the data savvy to explore DIVVY. First, we have released the history of bike availability at DIVVY docking stations for the past three years. DIVVY users will know that sometimes bikes cannot be found at a DIVVY station and other times a station can be full of bikes. This data set contains the history of each station’s availability at 10-minute intervals over the past 3 years. By leveraging the portal, you can see each time a station was full and each time a station was empty.

This is a massive data set, with over 52 million rows at the time of writing, and we expect it to be useful to others. For instance, it lets anyone figure out when a bike station might be full or empty, helping the city optimize the allocation of bikes. The data was collected from DIVVY’s API. However, because of occasional glitches, there may be some gaps in the data. We will be filling in any gaps we notice, and if you notice any issues, feel free to contact us.
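As a rough illustration of that kind of analysis, the sketch below labels each availability snapshot as full or empty and counts how often a station runs out of bikes at each hour of the day. The field names (`hour`, `available_bikes`, `available_docks`) and the rows are hypothetical; check the data set’s actual columns on the portal before adapting it.

```python
from collections import Counter


def classify_station(snapshot):
    """Label one availability snapshot as 'empty', 'full', or 'ok'."""
    if snapshot["available_bikes"] == 0:
        return "empty"
    if snapshot["available_docks"] == 0:
        return "full"
    return "ok"


def empty_counts_by_hour(snapshots):
    """Count, per hour of day, how often a station had no bikes."""
    return Counter(
        s["hour"] for s in snapshots if classify_station(s) == "empty"
    )


# Made-up snapshots for a single station (hypothetical field names).
snapshots = [
    {"hour": 8, "available_bikes": 0, "available_docks": 15},
    {"hour": 8, "available_bikes": 0, "available_docks": 15},
    {"hour": 12, "available_bikes": 7, "available_docks": 8},
    {"hour": 18, "available_bikes": 15, "available_docks": 0},
]

counts = empty_counts_by_hour(snapshots)
```

With the real data, the same tally grouped by station would point to the docks that most often run dry during the morning rush.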

Second, we’ve released DIVVY bicycle trips on the data portal from 2013 through 2015. This data has always been available as a bulk download, but this centralizes it in the data portal, allowing you to leverage the data portal APIs and filtering, such as this command to only show trips originating from the Millennium Park bike station. There are 6.4 million trips at the time of writing, so we’re excited to see what everyone will be able to do with DIVVY data powered by an API.
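A filter like that can also be expressed as a query URL against the portal’s API. The sketch below only builds the URL and does not send a request. It assumes the standard Socrata `/resource/<id>.json` endpoint with `$where` and `$limit` parameters; the dataset ID `xxxx-xxxx` is a placeholder and the `from_station_name` column name is a guess, so look up both on the data set’s API page.

```python
from urllib.parse import urlencode


def build_trips_query(domain, dataset_id, station_name, limit=100):
    """Build a query URL for trips that start at one station.

    `dataset_id` and the `from_station_name` column are placeholders;
    station names containing a single quote would need escaping first.
    """
    where = f"from_station_name = '{station_name}'"
    params = urlencode({"$where": where, "$limit": limit})
    return f"https://{domain}/resource/{dataset_id}.json?{params}"


url = build_trips_query("data.cityofchicago.org", "xxxx-xxxx", "Millennium Park")
```

Filtering on the server this way means you download only the matching trips instead of all 6.4 million rows.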

These data sets give you some interesting insight into the short but successful history behind the DIVVY program. Don’t forget, we also have the list of current DIVVY stations if you just want to be in the know.

Feature image is “First Divvy bike share station opens in Daley Plaza”, copyright WBEZ 2013 and licensed under Creative Commons Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0).

Three months ago, Chicago released OpenGrid, a project which lets you explore data in your neighborhood. This week, we released v1.0.2, which has fixed a number of bugs discovered since the launch, many of them by helpful members of the public.

We’ve improved documentation at a number of points, but the biggest improvement has been to the quick search bar. Now, searching for a point of interest, such as “hospital” or “gas station,” will return 20 results.


Searching for a specific address, such as “50 W Washington,” will return just one top result.


This release is only available in the source code at the above link and running on chicago.opengrid.io.

If you feel something isn’t working the way it should, please feel free to submit it on our Issues page.

Things are heating up around Chicago. Love is in the air and so too are the little bugs we all love to swat – mosquitoes. The “little fly” is a fact of life in the Windy City, and the Chicago Department of Public Health (CDPH) is on our side in our attempts to enjoy the beautiful Chicago summers with family and friends. In working towards their goal of no human transmissions of West Nile virus, CDPH gets to work before the heat really rises. They work to prevent mosquitoes from breeding by placing larvicide in more than 100,000 catch basins around town and by providing larvicide to their sister agencies for use on their properties.

CDPH operates more than 80 traps and tests weekly for any evidence of West Nile virus or Saint Louis Encephalitis.  When scientifically determined to be necessary based upon the presence of virus and the associated health risk, CDPH will also spray a most environmentally-friendly and safe product to combat those pesky mosquitoes that escape their previous efforts. CDPH also maintains this page full of tips and tricks to ward off those most-friendly of pests.

You can now download the results of these tests from the open data portal. Below, you can look at the map of the most recent results. As summer sets in, more recent results will be added to the portal, both in the map below and in downloadable formats. The Robert Wood Johnson Foundation also funded this Kaggle competition, based on this released data, to predict West Nile virus in mosquitoes across the City of Chicago.

Three years ago, the Chicago Energy Benchmarking Ordinance was adopted to raise awareness of energy performance through information and transparency, with the goal of unlocking energy and cost savings opportunities for businesses and residents.

In 2015, over 1,840 properties totaling 614 million square feet reported under the ordinance. Analysis of the reported data revealed that median ENERGY STAR scores are higher than the nationwide averages. Yet the analysis also revealed that potential annual energy savings of 13-24%, and energy cost savings of $100-184 million, are possible by reducing energy use per square foot to median or above-median levels.

The data collected through the ordinance is also online and available to the public on the open data portal. In 2015, Chicago shared energy consumption data for large commercial and institutional buildings over 250,000 square feet, which included approximately 250 properties.

 


We’ve released a new version of RSocrata with two big new features. First, you can now upload data to Socrata using the new “write.socrata()” function. If you have a dataframe ready to go in R, and you’ve already set up a corresponding schema in Socrata, you can use your Socrata credentials to upload the data directly from R. This is a superb addition if you want to use R to upload datasets to your data portal. Our team has used our own ETL framework to work with our portal, but others may feel more comfortable using R to do the same thing.


For instance, Stuart Gano used the beta version to download calls for ambulances, create a statistical forecast of call volumes for upcoming months, and then use RSocrata to upload those estimates to a data portal.


Second, you can now download private, non-public data from Socrata portals. While Socrata is intended to publish public data, there is an option to keep data private. If you want to download that private data using the “read.socrata()” function, you may use your Socrata credentials to complete that task.
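For readers working outside R, the same idea can be sketched in Python by attaching credentials to a request for a private data set. This is not RSocrata’s implementation: it assumes the portal accepts HTTP Basic authentication on the `/resource/<id>.json` endpoint, and the domain, dataset ID, and credentials shown are placeholders. The sketch only builds the request and does not send it.

```python
import base64
from urllib.request import Request


def build_private_request(domain, dataset_id, username, password):
    """Build (but do not send) an authenticated request for a private data set."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    url = f"https://{domain}/resource/{dataset_id}.json"
    return Request(url, headers={"Authorization": f"Basic {token}"})


# Placeholder values; substitute your own portal, data set, and credentials.
request = build_private_request(
    "data.example.gov", "xxxx-xxxx", "user@example.com", "secret"
)
```

Passing `request` to `urllib.request.urlopen` would perform the download; in practice, read the credentials from the environment rather than hard-coding them.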

These are a couple of great features that were added by the open source community. There have been dozens of contributions to City of Chicago open source projects, which have helped our team and also helped others who use this software. Thanks to all of those who have contributed to this project, especially John Malc and Mark Silverberg.

If you’re interested in other open source projects by the city, take a look at http://chicago.github.io for a list of active projects.

Feature image modified from “R Graffiti” by David Goehring and licensed under Creative Commons Attribution 2.0 Generic (CC BY 2.0).

Four years ago, Chicago led in the innovation of open data portals. The city was first to appoint a Chief Data Officer and, subsequently, the number of datasets grew to include detailed information on crimes in Chicago, building violations, food inspections, and up-to-date information on 311 calls for service, plus a lot more. It has been a worthwhile investment that has sparked a vibrant civic tech community and start-ups using open data to drive their business, and the data has even been used by the city to improve efficiency and save money.

To build on this, we developed OpenGrid, which uses that same data but in a more user-friendly interface. You can navigate through more than a dozen datasets at once. Below the fold, we have some quick tips on how you can use OpenGrid.


We have revamped the presentation of data about lobbyists on our Open Data Portal. People lobbying the City of Chicago are required to register with the Board of Ethics and file periodic reports through its Electronic Lobbyist Filing System (ELF), which began collecting data in 2012. The data structures are complex and previous attempts to recombine data about lobbyists, their employers, their clients, and their lobbying activity on behalf of these entities into tabular datasets ended up being difficult to understand.

In our new approach, we are publishing the data in a way that more closely matches the structures in the source system. Most of the datasets contain only a single type of information, with sufficient common IDs between the datasets for users to link them, as needed.  The subjects of these datasets are:

There is one dataset that does combine multiple types of information. It links a lobbyist and his or her employer and clients. This dataset contains elements from the Lobbyist, Employers, and Clients datasets but has been combined to give a single view of information from all three in order to show the most central set of relationships in the data:

Connections Between the Datasets

The new datasets have the indicated ID columns in common to allow for linking between datasets.


The relationships between the new Lobbyist Datasets
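As a rough sketch of how the shared ID columns can be used, the example below joins rows from two datasets on a common key. The `link_by_id` helper, the `lobbyist_id` column name, and the rows are all made up for illustration; check the actual column names on the portal.

```python
def link_by_id(left_rows, right_rows, key):
    """Inner-join two lists of dicts on a shared ID column."""
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)
    return [
        {**left, **right}
        for left in left_rows
        for right in index.get(left[key], [])
    ]


# Made-up rows; real datasets have more columns and different names.
lobbyists = [
    {"lobbyist_id": 1, "name": "A. Example"},
    {"lobbyist_id": 2, "name": "B. Sample"},
]
activity = [
    {"lobbyist_id": 1, "action": "Meeting"},
    {"lobbyist_id": 1, "action": "Letter"},
]

linked = link_by_id(lobbyists, activity, "lobbyist_id")
```

Because each dataset holds a single type of information, a join like this is how you rebuild a combined view only when you need it.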

Older Data

The older datasets are still present on the Data Portal but marked as either “Historical” or “Deprecated” datasets.  The Historical datasets are based on the predecessor system to ELF and contain data prior to 2012.  The Deprecated datasets are the previous presentation of ELF data, beginning in 2012.  (There is discussion of our general approach to dataset deprecation in this post on our Data Portal Status Blog.)

As always, we welcome comments and questions on these datasets at dataportal@cityofchicago.org or @ChicagoCDO.


Just one percent of Chicago’s buildings account for 20 percent of the total energy consumed. For the first time, the City of Chicago is releasing detailed information on the energy consumption and efficiency of the largest municipal, commercial, and institutional buildings – those over 250,000 square feet – on the city’s open data portal. The release provides the data and transparency needed to find opportunities for energy efficiency.
