Three months ago, Chicago released OpenGrid, a project that lets you explore data in your neighborhood. This week, we released v1.0.2, which fixes a number of bugs discovered since the launch, many of them reported by helpful members of the public.

We’ve improved documentation at a number of points, but the biggest improvement has been to the quick search bar. Now, searching for a point of interest, such as “hospital” or “gas station,” will return 20 results.


Searching for a specific address, such as “50 W Washington,” will return just one top result.

city hall

This release is only available in the source code at the above link and running on

If you feel something isn’t working the way it should, please feel free to report it on our Issues page.

Things are heating up around Chicago.  Love is in the air and so too are the little bugs we all love to swat – mosquitoes.  The “little fly” is a fact of life in the Windy City, and the Chicago Department of Public Health (CDPH) is on our side in our attempts to enjoy the beautiful Chicago summers with family and friends. In working towards their goal of no human transmissions of West Nile virus, CDPH gets to work before the heat really rises. They work to prevent mosquitoes from breeding by placing larvicide in more than 100,000 catch basins around town and provide larvicide to their sister agencies for usage on their properties.

CDPH operates more than 80 traps and tests weekly for any evidence of West Nile virus or Saint Louis Encephalitis.  When scientifically determined to be necessary based upon the presence of virus and the associated health risk, CDPH will also spray a most environmentally-friendly and safe product to combat those pesky mosquitoes that escape their previous efforts. CDPH also maintains this page full of tips and tricks to ward off those most-friendly of pests.

You can now download the results of these tests from the open data portal. Below, you can look at the map of the most recent results. As summer sets in, more recent results will be added to the portal, both in the map below and in downloadable formats. The Robert Wood Johnson Foundation also funded this Kaggle competition based on this released data to predict West Nile virus in mosquitoes across the City of Chicago.

Three years ago, the Chicago Energy Benchmarking Ordinance was adopted to raise awareness of energy performance through information and transparency, with the goal of unlocking energy and cost savings opportunities for businesses and residents.

Last year, in 2015, over 1,840 properties totaling 614 million square feet reported under the ordinance. Analysis of the reported data revealed that median ENERGY STAR scores are higher than the nationwide averages. Yet the analysis also revealed that annual energy savings of 13-24% and energy cost savings of $100-184 million are possible by reducing energy use per square foot to median or above-median levels.

The data collected through the ordinance is also online and available to the public on the open data portal. In 2015, Chicago shared energy consumption data for large commercial and institutional buildings over 250,000 square feet, which included approximately 250 properties.




This year in 2016, residential properties from 50,000 to 250,000 square feet will be phased into the ordinance, and will be required to comply for the first time. A preliminary list of all buildings covered by the ordinance in 2016 has been published to the open data portal.

The 2016 compliance deadline for all covered properties is June 1, 2016.

Notification letters have been sent to all property owners and managers on file, but if you believe your property is covered by the ordinance and you didn’t receive a letter, please complete the online Chicago Energy Benchmarking ID Request Form.


Chicago Energy Benchmarking – Covered Buildings – Map


If you need help, there are plenty of opportunities to get assistance with reporting. There are free trainings, a Help Center, guidance materials, and pro-bono assistance for eligible organizations.

Find out more by visiting the city’s energy benchmarking website or contacting the Chicago Energy Benchmarking Help Center at (855) 858-6878 or by email to

Feature image titled “Navy Pier Sunset” by Chris Pelliccione is licensed under Creative Commons Attribution-NoDerivs 2.0 Generic (CC BY-ND 2.0).

We’ve released a new version of RSocrata with two big new features. First, you can now upload data to Socrata using the new “write.socrata()” function. If you have a dataframe ready to go in R, and you’ve already set up a corresponding schema in Socrata, you can use your Socrata credentials to upload the data directly from R. This is a superb addition if you want to use R to upload datasets to your data portal. Our team has used our own ETL framework to work with our portal, but others may feel more comfortable using R to do the same thing.


For instance, Stuart Gano used the beta version to download calls for ambulances, create a statistical forecast of call volumes for upcoming months, and then use RSocrata to upload those estimates to a data portal.

Second, you can now download private, non-public data from Socrata portals. While Socrata is intended to publish public data, there is an option to keep data private. If you want to download that private data using the “read.socrata()” function, you may use your Socrata credentials to complete that task.

These are a couple of great features that were added by the open source community. There have been dozens of contributions to City of Chicago open source projects, which have helped our team and also helped others who use this software. Thanks to all of those who have contributed to this project, especially John Malc and Mark Silverberg.

If you’re interested in other open source projects by the city, take a look at for a list of active projects.

Feature image modified from “R Graffiti” by David Goehring and licensed under Creative Commons Attribution 2.0 Generic (CC BY 2.0).

Four years ago, Chicago led in the innovation of open data portals. The city was first to appoint a Chief Data Officer and, subsequently, the number of datasets grew to include detailed information on crimes in Chicago, building violations, food inspections, and up-to-date information on 311 calls for service, plus a lot more. It has been a worthwhile investment: it has sparked a vibrant civic tech community and start-ups using open data to drive their business, and the city itself has used it to improve efficiency and save money.

To build on this, we developed OpenGrid, which uses that same data but in a more user-friendly interface. You can navigate through more than a dozen datasets at once. Below the fold, we have some quick tips on how you can use OpenGrid.



We have revamped the presentation of data about lobbyists on our Open Data Portal. People lobbying the City of Chicago are required to register with the Board of Ethics and file periodic reports through its Electronic Lobbyist Filing System (ELF), which began collecting data in 2012. The data structures are complex and previous attempts to recombine data about lobbyists, their employers, their clients, and their lobbying activity on behalf of these entities into tabular datasets ended up being difficult to understand.

In our new approach, we are publishing the data in a way that more closely matches the structures in the source system. Most of the datasets contain only a single type of information, with sufficient common IDs between the datasets for users to link them, as needed.  The subjects of these datasets are:

There is one dataset that does combine multiple types of information. It links a lobbyist and his or her employer and clients. This dataset contains elements from the Lobbyist, Employers, and Clients datasets but has been combined to give a single view of information from all three in order to show the most central set of relationships in the data:

Connections Between the Datasets

The new datasets have the indicated ID columns in common to allow for linking between datasets.

The relationships between the new Lobbyist Datasets
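As a rough illustration of linking on a common ID, here is a minimal Python sketch. The column names (`lobbyist_id`, `client_id`) and the rows are hypothetical placeholders chosen for the example; the actual column names are the ones indicated for each dataset on the portal.

```python
from collections import defaultdict

# Hypothetical rows: column names and values are placeholders for
# illustration, not records from the Data Portal.
lobbyists = [
    {"lobbyist_id": 1, "name": "Example Lobbyist A"},
    {"lobbyist_id": 2, "name": "Example Lobbyist B"},
]
clients = [
    {"client_id": 10, "lobbyist_id": 1, "client_name": "Example Client X"},
    {"client_id": 11, "lobbyist_id": 1, "client_name": "Example Client Y"},
    {"client_id": 12, "lobbyist_id": 2, "client_name": "Example Client Z"},
]


def link(left, right, key):
    """Inner-join two lists of dict rows on a shared ID column."""
    # Index the right-hand rows by ID so each left row is matched quickly.
    index = defaultdict(list)
    for row in right:
        index[row[key]].append(row)

    joined = []
    for left_row in left:
        for right_row in index.get(left_row[key], []):
            # Merge the two rows into one combined record.
            joined.append({**left_row, **right_row})
    return joined


linked = link(lobbyists, clients, "lobbyist_id")
for row in linked:
    print(row["name"], "->", row["client_name"])
```

The same pattern applies to any pair of datasets that share an ID column; a dataframe library or a SQL join would do the same job for larger files.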

Older Data

The older datasets are still present on the Data Portal but marked as either “Historical” or “Deprecated” datasets.  The Historical datasets are based on the predecessor system to ELF and contain data prior to 2012.  The Deprecated datasets are the previous presentation of ELF data, beginning in 2012.  (There is discussion of our general approach to dataset deprecation in this post on our Data Portal Status Blog.)

As always, we welcome comments and questions on these datasets at or @ChicagoCDO.


Just one percent of Chicago’s buildings account for 20 percent of the total energy consumed. For the first time, the City of Chicago is releasing detailed information on the energy consumption and efficiency of the largest municipal, commercial, and institutional buildings – those over 250,000 square feet – on the city’s open data portal. The release provides the data and transparency needed to find opportunities for energy efficiency.


One of the important lessons we’ve learned with open data is to leverage technology to automatically update data. There are a number of benefits to this: leveraging technology means we can update data every day, every hour, or even every 10 minutes. Providing data that is reliably updated also means companies, like Chicago Cityscape and EveryBlock, and civic developers can have confidence that fresh data will be available in their applications. Every day, a little over 100 scripts run to keep the data portal up-to-date using an internal “framework” we’ve developed for this process.

Last year, we released the software we used to drive our Extract-Transform-Load (ETL) process that automatically updates our data portal. Today, we’ve released version 1.2.0 with some new enhancements.

Windows compatibility

While the software has been compatible with Windows, a few utilities were only compatible with Linux and MacOS. With this release, we’ve brought full compatibility to Windows, so you can use the useful ETL Log utilities. As always, this utility is platform agnostic, so you can develop ETLs on a Windows machine and deploy them on Red Hat, or vice versa, with no modifications.

For instance, suppose I am experiencing issues with uploading data on Beach Water Quality. The upload appears to take a long time, but I need to find out whether the performance is unusual. I can use the A_ETLRuntimes.bat utility to summarize the time it takes. We refer to each data set by its unique ID; in the case of the beach data, that is “qmqz-2xku” (see the URL):

C:\path\to\open-data-etl-utility-kit\Log>A_ETLRuntimes qmqz-2xku

The output from the command summarizes the run-time:

INFO 01-08 00:45:31,840 - Kitchen - Processing ended after 9 seconds.
INFO 01-08 01:45:30,199 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 02:45:31,140 - Kitchen - Processing ended after 9 seconds.
INFO 01-08 03:45:28,912 - Kitchen - Processing ended after 9 seconds.
INFO 01-08 04:45:36,126 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 05:45:34,713 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 06:45:30,634 - Kitchen - Processing ended after 9 seconds.
INFO 01-08 07:45:30,623 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 08:45:29,526 - Kitchen - Processing ended after 9 seconds.
INFO 01-08 09:45:31,162 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 10:45:29,547 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 11:45:30,572 - Kitchen - Processing ended after 11 seconds.
INFO 01-08 12:45:30,549 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 13:45:30,839 - Kitchen - Processing ended after 11 seconds.
INFO 01-08 14:45:31,076 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 15:45:30,413 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 16:45:36,389 - Kitchen - Processing ended after 15 seconds.
INFO 01-08 17:45:30,481 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 18:45:26,824 - Kitchen - Processing ended after 8 seconds.
INFO 01-08 19:45:35,574 - Kitchen - Processing ended after 11 seconds.
INFO 01-08 20:45:29,922 - Kitchen - Processing ended after 9 seconds.
INFO 01-08 21:45:27,716 - Kitchen - Processing ended after 9 seconds.
INFO 01-08 22:45:28,732 - Kitchen - Processing ended after 10 seconds.
INFO 01-08 23:45:30,488 - Kitchen - Processing ended after 10 seconds.
INFO 02-08 00:45:29,156 - Kitchen - Processing ended after 9 seconds.
INFO 02-08 01:47:32,107 - Kitchen - Processing ended after 2 minutes and 14 seconds (134 seconds total).

It appears that the most recent ETL did take longer than normal: over 2 minutes compared to the typical 10 seconds.
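If you would rather not eyeball the list, the comparison can be automated. The following is a minimal Python sketch, not part of the utility kit: the function names and the "three times the median" threshold are our own assumptions, and it expects each log entry on a single line (a console window may wrap long entries).

```python
import re
from statistics import median

# Long runs report "(N seconds total)"; short runs only say "N seconds".
TOTAL_RE = re.compile(r"\((\d+) seconds total\)")
SHORT_RE = re.compile(r"Processing ended after (\d+) seconds")


def parse_seconds(line):
    """Return the run time in seconds for one log line, or None if absent."""
    match = TOTAL_RE.search(line) or SHORT_RE.search(line)
    return int(match.group(1)) if match else None


def flag_slow_runs(lines, factor=3.0):
    """Return (median_seconds, slow_runs) where slow runs exceed factor * median.

    The factor is an arbitrary illustrative threshold, not a recommendation.
    """
    timed = [(line, parse_seconds(line)) for line in lines]
    timed = [(line, secs) for line, secs in timed if secs is not None]
    if not timed:
        return None, []
    typical = median(secs for _, secs in timed)
    slow = [(line, secs) for line, secs in timed if secs > factor * typical]
    return typical, slow


# Three entries taken from the output above.
sample = [
    "INFO 01-08 23:45:30,488 - Kitchen - Processing ended after 10 seconds.",
    "INFO 02-08 00:45:29,156 - Kitchen - Processing ended after 9 seconds.",
    "INFO 02-08 01:47:32,107 - Kitchen - Processing ended after 2 minutes "
    "and 14 seconds (134 seconds total).",
]

typical, slow = flag_slow_runs(sample)
print("Typical run time (s):", typical)
for line, secs in slow:
    print("Slow run:", secs, "seconds")
```

Using the median rather than the mean keeps a single slow run from inflating the baseline it is compared against.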

There are a handful of these utilities now compatible with Windows machines (and Linux/Mac):

  • Summarize ETL run times
  • Show the files associated with a particular data set
  • Show ETL logs from today
  • Run a specific ETL based on its name (see below)

Quick update from the command line

While it’s best to schedule uploads ahead of time, sometimes it’s convenient to run a one-time, unscheduled upload. Now, it’s a little easier to do this from a Linux/MacOS shell or Windows command prompt. Using that same unique ID as before, we run the ETL with:

C:\path\to\open-data-etl-utility-kit\Log>A_RunETL.bat qmqz-2xku

For Linux or Mac:

$ cd /path/to/open-data-etl-utility-kit/Log
$ ./ qmqz-2xku

Soon enough, your data set will be updated on the portal.
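One way to confirm that the update landed is to request a few rows from the portal. This is a minimal Python sketch under two assumptions that this post does not state: that the portal is served at `data.cityofchicago.org`, and that it exposes Socrata's SODA `resource` endpoint with the `$limit` parameter. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: portal domain and SODA "resource" endpoint, neither of
# which is stated in this post.
PORTAL = "https://data.cityofchicago.org"


def build_url(dataset_id, limit=5):
    """Build a JSON request URL for the first `limit` rows of a dataset."""
    # safe="$" keeps the "$" in "$limit" readable instead of "%24".
    query = urlencode({"$limit": limit}, safe="$")
    return f"{PORTAL}/resource/{dataset_id}.json?{query}"


def fetch_rows(dataset_id, limit=5):
    """Download a few rows as a list of dicts (requires network access)."""
    with urlopen(build_url(dataset_id, limit), timeout=30) as response:
        return json.load(response)


# Same unique ID used in the commands above.
print(build_url("qmqz-2xku"))
```

Calling `fetch_rows("qmqz-2xku")` and looking at the timestamps in the returned rows is a quick check that fresh data is being served.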

With the opening of swimming season, the Chicago Park District water sensors at Lake Michigan beaches are live again and streaming to the data portal hourly.

Beginning this year, that dataset has a partner dataset, Beach Weather Stations. The Park District places land-based sensors at some beaches to measure air temperature, humidity, rainfall, wind, barometric pressure, and sunlight. The data collected are also streamed to the data portal hourly.

Finally, in order to facilitate geographic analysis of the water and weather data, we have published the locations of the beaches where the sensors are in operation.  Note that not all sensors are active at all times, so the list will sometimes include locations that are not currently providing data.

Image by Basheer Tome.