The City of Chicago’s data science team is a big fan of open source software, including the venerable R statistical programming language. Attendees of the annual UseR! (use R) conference were able to see how we used R to predict where to send food inspectors in Chicago. Watch Gene Leynes, a data scientist for the City of Chicago, discuss the project in a lightning talk.


 

Banner image credit: The Hunt by Edsel Little, licensed under Creative Commons Attribution-ShareAlike 2.0 Generic (CC BY-SA 2.0).

This week, the City released the newest version of OpenGrid, v1.1.0, with a number of enhancements. It is now available in the project repository and to users of opengrid.io.

Queries with Relative Dates

Now you can search and save queries where certain dates fall within relative date ranges.  A relative date range is a period of time that is relative to the current date.  OpenGrid supports the following relative date time periods:

  • Today
  • Yesterday
  • 1 minute ago, # minutes ago
  • 1 hour ago, # hours ago
  • 1 day ago, # days ago
  • 1 week ago, # weeks ago
  • 1 month ago, # months ago
  • 1 year ago, # years ago

For example, suppose you are a foodie and want to stay up-to-date on all of the hotspots that opened within the last week. You would select the Business Licenses Dataset, and set the License_description = “Retail Food Establishments.” Then apply another filter with the Issue Date between “1 week ago” and “today.”


For those using OpenGrid with user management, users can save queries with relative dates. Running a saved query at a later date re-evaluates the relative dates against the current date, which makes it useful for anyone who wants to re-run a query over a moving time window.
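As a rough illustration of the moving time window, the sketch below resolves the relative expressions listed above into concrete timestamps. It is not OpenGrid's implementation: the function name `resolve_relative_date` is hypothetical, "today" is treated as the current moment, and months and years are approximated as 30 and 365 days purely for simplicity.

```python
import re
from datetime import datetime, timedelta

# Unit lengths used for illustration. Months and years are approximations
# (30 and 365 days), which is an assumption and not OpenGrid's actual rule.
UNIT_LENGTHS = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
    "week": timedelta(weeks=1),
    "month": timedelta(days=30),
    "year": timedelta(days=365),
}

RELATIVE_PATTERN = re.compile(r"^(\d+) (minute|hour|day|week|month|year)s? ago$")


def resolve_relative_date(expression, now):
    """Turn a relative expression into a concrete datetime measured from `now`."""
    text = expression.strip().lower()
    if text == "today":
        return now
    if text == "yesterday":
        return now - timedelta(days=1)
    match = RELATIVE_PATTERN.match(text)
    if match is None:
        raise ValueError(f"Unsupported relative date: {expression!r}")
    amount = int(match.group(1))
    return now - amount * UNIT_LENGTHS[match.group(2)]


# A saved query stores the expressions, not the resolved dates, so the
# window moves forward each time the query is run.
run_time = datetime(2016, 8, 15, 12, 0)
window = (resolve_relative_date("1 week ago", run_time),
          resolve_relative_date("today", run_time))
```

Running the same saved query a day later with a new `run_time` would shift both ends of `window` by one day, which is the behavior described for saved queries.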

Datasets alphabetized for easier discovery

Previously, the datasets available in OpenGrid were not alphabetized, which made finding and selecting a dataset an arduous task. Now you can view and access all of the datasets available to you in alphabetical order.


Improved Geospatial Filtering Performance on OpenGrid.io

At times, users would notice that queries using geospatial filters did not return all of the matching results. This issue has been fixed, though opengrid.io is still limited to a maximum of 1,000 results for the time being.

API support of geospatial filtering

OpenGrid’s API now supports geospatial filtering calls. Previous versions of OpenGrid would search all of the data before “filtering” for specific geographies. As a result, queries retrieved more data than needed and took longer than necessary. Now, a query can specify a geospatial boundary so that only data within that boundary is searched.

Opengrid.io displays a maximum of 1,000 search results. Previously, when a user tried to focus their search on a particular area by applying a geospatial filter, the filter was applied on top of the 1,000 random records that were pulled from Plenar.io. Now that we’ve modified the API to support geospatial filtering, the filtering happens on the service itself. When a user applies a geospatial filter, the query runs on the defined area to find the top 1,000 results, as opposed to querying the entire database for 1,000 results and then applying the geospatial filter.
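The difference between the two orderings can be sketched in a few lines. This is only an illustration under stated assumptions: the records are plain dictionaries with hypothetical `lat` and `lon` keys, and a simple bounding box stands in for whatever geometry the real service accepts.

```python
from itertools import islice

MAX_RESULTS = 1000  # the opengrid.io result cap mentioned above


def in_bounds(record, south, west, north, east):
    """True when the record's point lies inside the bounding box."""
    return south <= record["lat"] <= north and west <= record["lon"] <= east


def limit_then_filter(records, box, limit=MAX_RESULTS):
    """Old ordering: cap the result set first, then filter what is left."""
    capped = islice(records, limit)
    return [r for r in capped if in_bounds(r, *box)]


def filter_then_limit(records, box, limit=MAX_RESULTS):
    """New ordering: filter by area first, then cap the matching records."""
    matching = (r for r in records if in_bounds(r, *box))
    return list(islice(matching, limit))
```

With the old ordering, a filter can only return the subset of the first 1,000 records that happen to fall inside the area. With the new ordering, up to 1,000 records from inside the area come back.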

Future of OpenGrid

Over the past few months, City staff has worked with Chicagoans from various neighborhoods to gather feedback on usability.  With the information we have gathered, the next phase of OpenGrid will be focused on making the data that is available in the app even more accessible and user friendly.  The City will also continue to collaborate with technologists who can help improve the platform to let residents explore open data, to help the city keep our streets safer, and to even make it useful for non-profits who may want to adopt the platform.

Interested in contributing to OpenGrid?

If you’re interested in collaborating with the city, you can take a look at the past meeting notes on the project’s Wiki. While the source code is online, the project’s documentation gives a concise overview for developers. Likewise, look at the instructions for contributors to help us run a useful, collaborative project.

 


Did you know Chicago’s tech department is on YouTube? Now, information about Chicago’s smart city initiatives, tutorials on the open data portal, and predictive analytics are available in a central spot.

The channel features a number of playlists that aggregate videos of the Mayor, the city’s CIO, CDO, and others discussing tech initiatives across the city. For instance, this playlist shows speeches given by the city’s Chief Information Officer:

Take a look at the other social media accounts for City of Chicago departments, such as the Mayor’s Office YouTube account and Twitter feed.

The City of Chicago’s core tech team at the Department of Innovation and Technology, Chicago Public Libraries, and the Office of the Clerk are hiring for several positions. The focus of all of these jobs is to make an impact on the daily lives of residents and visitors to Chicago. Whether it’s on the information security team or the planning team, you will be protecting or improving city services for others. Those services have become incredibly technology-focused as cities continue transitioning into the 21st century. Take a look at the job listings below and help your city.


Department of Innovation and Technology

Public Libraries

Office of the City Clerk

Featured image credit: “Server Room” copyright by SparkFun Electronics (c) 2014 and licensed under Creative Commons Attribution 2.0 Generic (CC BY 2.0)

 


This past June was a busy month for tech initiatives that focused on increasing collaboration with communities, civic technologists, and developers:

Chicago’s open data portal grew substantially this week. During Bike Week, we launched a pair of new data sets about the city’s DIVVY bike share program, including the largest data set on the portal to date. Users can now see the history of DIVVY bike availability since 2013. Likewise, details of every single DIVVY bike trip since 2013 are also online. Today, the data portal has nearly 90 million rows of data.

This past month, the city’s tech leaders spoke at several conferences:

Featured image “Buckingham Fountain | Chicago | 2016”, copyright Xanic Lopez, 2016 and licensed under Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0).

Six months ago, the City of Chicago launched OpenGrid, an open-source platform to view data about your city, your neighborhood, or even your block. OpenGrid evolved from the city’s internal, proprietary WindyGrid project. Today, OpenGrid forms the bones of WindyGrid. But unlike WindyGrid, OpenGrid’s code is publicly available online and is part of Chicago’s mission to work with civic technology innovators to develop creative solutions. Since its initial release, Chicago has published three new releases of the platform to fix bugs and improve the platform based on feedback from developers.

During Amazon’s DC Summit, the City of Chicago announced it will be expanding its partnership with civic technologists. Starting July 8th, developers who want to work with the City of Chicago can join the weekly developer calls to review recent activity and coordinate new work for the OpenGrid platform. Perhaps you want to work on a significant feature suggested in the roadmap, fix a bug, improve usability, or simply, but importantly, write a unit test.

If you’re interested in collaborating with the city, you can take a look at the past meeting notes on the project’s Wiki. While the source code is online, the project’s documentation gives a concise overview for developers. Likewise, look at the instructions for contributors to help us run a useful, collaborative project.

Calls will be held each Friday and details will be posted on the OpenGrid wiki page ahead of the meeting.

The aim of these calls is to better engage potential contributors to or adopters of the platform. That interaction will allow better planning, roadmapping, and divvying up of tasks. Ultimately, we believe OpenGrid will be a better platform when we build with others.

We will be experimenting with this to deepen collaboration between the city and those who can benefit from, contribute to, or use OpenGrid. This approach draws upon the practices of Mozilla Science Lab, Apache Foundation, and other pioneers who have built and maintained significant open source projects across a community. This builds upon years of collaboration the City of Chicago has established with technologists, non-profits, universities, and companies to help Chicago become a data-driven city.

We’ll post to this blog and update OpenGrid’s wiki page with the call-in/webinar details.

We are very excited that, this summer, the City of Chicago teamed up with a group of volunteer data scientists to develop new statistical models to improve the accuracy of beach advisories due to the presence of E. coli. Sean Thornton, a Program Advisor at Harvard University’s Ash Center, has a terrific article on the project:

For Chicagoans, few things are as enjoyable as a day at the beach. That joy, however, is contingent on clean waters that are free of contaminants such as E. coli bacteria. With the arrival of this year’s beach season, Chicago has built an analytical pilot model that will enhance its Park District’s regular beach water quality inspection process. The model specifically aims to guide which beaches may need to close based on predicted E. coli readings, which helps protect the public with advisories or closures as soon as possible.

This is not Chicago’s first municipal predictive analytics project; the city has had success with predictive models for rodent baiting operations and food establishment inspections, among others.  This model differs, however, because it wasn’t built by the City of Chicago at all, but by a team of volunteers from the city’s civic tech community.

Delivering a better model was both a fascinating and difficult problem. In fact, the team put together an ensemble model ranging from logit regressions to gradient-boosting models to deliver improved performance. Each week, between 4 and 12 data scientists, statisticians, and researchers worked to develop a model that can reduce the chance of children and other Chicagoans getting sick after an enjoyable day at the beach.

Because the project is open source, you can find the entire source code on GitHub to see if you can improve the work done by a group of civic technologists. Notes about the project can be found on the project’s wiki page, including detailed notes on the lab testing process and weekly updates on the team’s findings.

Data from the summer of 2016 is needed to compare performance. As with any statistical model, the team was concerned with “over-fitting,” where so much effort goes into predicting past events that the model suffers when predicting events that happen in the future. This summer provides an excellent testing ground for the new model, providing feedback on the predictive power of these new techniques.

Open Data

Open data played a key role in enabling this project. For the past two years, Chicago Park District has published real-time information on the forecasts from the statistical model developed by the United States Geological Survey (USGS). Beginning this week, these results are being stored on the city’s open data portal.

To support this project, the Chicago Park District and City of Chicago also released the actual results of the lab tests on the portal. Over the course of this summer, the team of volunteer civic technologists, the City of Chicago, and Chicago Park District will be comparing the results of the existing USGS model to those of the new models. Both the forecast data and the actual lab results will be used to show the performance of each model.
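One simple way to line up forecasts against lab results is to count how often a model's advisory call agrees with what the lab later found. The sketch below is a hypothetical illustration, not the team's evaluation code: the function name is invented, and the 235 CFU per 100 mL threshold is an assumed advisory level that should be checked against current guidance.

```python
# Assumed advisory level in CFU per 100 mL; confirm before relying on it.
ADVISORY_THRESHOLD = 235.0


def score_forecasts(predicted, actual, threshold=ADVISORY_THRESHOLD):
    """Count hits, misses, false alarms, and correct clears for one model."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    counts = {"hit": 0, "miss": 0, "false_alarm": 0, "correct_clear": 0}
    for forecast, lab_result in zip(predicted, actual):
        flagged = forecast >= threshold      # model would have issued an advisory
        exceeded = lab_result >= threshold   # lab test confirmed a high reading
        if flagged and exceeded:
            counts["hit"] += 1
        elif exceeded:
            counts["miss"] += 1
        elif flagged:
            counts["false_alarm"] += 1
        else:
            counts["correct_clear"] += 1
    return counts
```

Scoring the USGS forecasts and the new model's forecasts against the same lab results with a function like this gives two sets of counts that can be compared directly.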

The above statistical models depend on hourly weather forecasts. For this, the team used the excellent weather models from Forecast.io (which powers the popular Dark Sky apps). Updated weather information for each beach is available here.

Teamwork

Of course, this project would not have been possible without the team that volunteered their time: Matt, Rebecca, Kevin, Melissa, Scott, David, Daniel, Nick, Scott, Chris, and Forest. Well over 150 hours were dedicated to this project by a team who worked hard on helping improve our summers a little more.

Featured image is “Simple Times” by Christopher.F, copyright 2014 and licensed under Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic (CC BY-NC-ND 2.0)

Happy Chicago Bike Week! Chicago has over 200 miles of on-street protected, buffered, and shared bike lanes and over 13,000 bike racks. Chicago also has DIVVY, a bike-sharing program with 480 docking stations across Chicago, which is the largest bike-share program in North America in terms of geography covered. With 3.2 million trips in 2015 alone, DIVVY riders make up a sizable portion of the bicyclists who get to work, run errands, and get around Chicago by bicycle.

DIVVY Trips dashboard


Today, we’ve rolled out some massive data sets for the data savvy to explore DIVVY. First, we have released the history of bike availability at DIVVY docking stations for the past three years. DIVVY users will know that sometimes bikes cannot be found at a DIVVY station and other times a station can be full of bikes. This data set contains the availability at each station, recorded every 10 minutes over the past 3 years. By leveraging the portal, you can see each time a station was full and each time a station was empty.
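For anyone working with a downloaded copy of the availability history, spotting full and empty stations comes down to a per-snapshot check. The following is a minimal sketch, assuming each row is a dictionary with hypothetical `available_bikes` and `available_docks` keys; the real column names on the portal may differ.

```python
from collections import Counter


def classify_snapshot(snapshot):
    """Label one availability snapshot as 'empty', 'full', or 'ok'."""
    # Column names are assumptions for illustration. A station reporting
    # zero bikes and zero docks is labeled 'empty' by this ordering.
    if snapshot["available_bikes"] == 0:
        return "empty"
    if snapshot["available_docks"] == 0:
        return "full"
    return "ok"


def count_states(snapshots):
    """Tally how many snapshots fall into each state."""
    return dict(Counter(classify_snapshot(s) for s in snapshots))
```

Grouping the same tally by station or by hour of day would show where and when riders are most likely to find a station full or empty.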

This is a massive data set, with over 52 million rows at the time of writing, but it is useful to others. For instance, this data set allows others to figure out when a bike station might be full or empty, helping the city optimize the allocation of bikes. This data was collected from DIVVY’s API. However, because of occasional glitches, there may be some gaps in the data. We will be filling in any gaps we notice, and if you notice any issues, feel free to contact us.
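Because snapshots are expected every 10 minutes, gaps can be found by looking for consecutive timestamps that are too far apart. Here is a small sketch of that check for a single station; the function name and the one-minute tolerance are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta

EXPECTED_INTERVAL = timedelta(minutes=10)  # snapshot spacing described above


def find_gaps(timestamps, expected=EXPECTED_INTERVAL, tolerance=timedelta(minutes=1)):
    """Return (start, end) pairs where consecutive snapshots are too far apart."""
    ordered = sorted(timestamps)
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > expected + tolerance:
            gaps.append((earlier, later))
    return gaps


# Example: a regular series with one 30-minute hole between 8:10 and 8:40.
start = datetime(2016, 6, 1, 8, 0)
sample = [start + timedelta(minutes=m) for m in (0, 10, 40, 50)]
gaps = find_gaps(sample)
```

Each pair in `gaps` marks the last snapshot before a hole and the first snapshot after it, which is the kind of detail that is helpful to include when reporting an issue.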

Second, we’ve released DIVVY bicycle trips on the data portal from 2013 through 2015. This data has always been available as a bulk download, but this centralizes it in the data portal, allowing you to leverage the data portal APIs and filtering, such as this command to only show trips originating from the Millennium Park bike station. There are 6.4 million trips at the time of writing, so we’re excited to see what everyone will be able to do with DIVVY data powered by an API.
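To give a sense of what such a filter looks like, the sketch below builds a query URL in the style of the portal's Socrata (SODA) API without sending it. The dataset identifier is a placeholder and the column name `from_station_name` is a guess, so both need to be replaced with the real values from the data set's API documentation.

```python
from urllib.parse import urlencode

PORTAL = "https://data.cityofchicago.org"
DATASET_ID = "xxxx-xxxx"  # placeholder: substitute the trips data set's real identifier


def build_trips_query(station_name, limit=1000):
    """Build a SODA URL that returns only trips starting at one station."""
    escaped = station_name.replace("'", "''")  # double single quotes inside the string literal
    params = {
        "$where": f"from_station_name = '{escaped}'",  # hypothetical column name
        "$limit": limit,
    }
    return f"{PORTAL}/resource/{DATASET_ID}.json?{urlencode(params)}"


url = build_trips_query("Millennium Park")
```

The resulting `url` can be opened in a browser or fetched with any HTTP client once the placeholder values are filled in.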

These data sets give you some interesting insight into the short, but successful history behind the DIVVY program. Don’t forget, we also have the list of current DIVVY stations if you just want to be in the know.

Featured image is “First Divvy bike share station opens in Daley Plaza”, copyright WBEZ 2013 and licensed under Creative Commons Attribution-NonCommercial 2.0 Generic (CC BY-NC 2.0).

Three months ago, Chicago released OpenGrid, a project which lets you explore data in your neighborhood. This week, we released v1.0.2, which has fixed a number of bugs discovered since the launch, many of them by helpful members of the public.

We’ve improved documentation at a number of points, but the biggest improvement has been to the quick search bar. Now, searching for a point of interest, such as “hospital” or “gas station,” will return 20 results.


Searching for a specific address, such as “50 W Washington,” will return just one top result.


This release is only available in the source code at the above link and running on chicago.opengrid.io.

If you feel something isn’t working the way it should, please feel free to submit it on our Issues page.

Things are heating up around Chicago.  Love is in the air and so too are the little bugs we all love to swat – mosquitoes.  The “little fly” is a fact of life in the Windy City, and the Chicago Department of Public Health (CDPH) is on our side in our attempts to enjoy the beautiful Chicago summers with family and friends. In working towards their goal of no human transmissions of West Nile virus, CDPH gets to work before the heat really rises. They work to prevent mosquitoes from breeding by placing larvicide in more than 100,000 catch basins around town and provide larvicide to their sister agencies for usage on their properties.

CDPH operates more than 80 traps and tests weekly for any evidence of West Nile virus or Saint Louis Encephalitis. When scientifically determined to be necessary based upon the presence of virus and the associated health risk, CDPH will also spray an environmentally friendly and safe product to combat those pesky mosquitoes that escape their previous efforts. CDPH also maintains this page full of tips and tricks to ward off those most-friendly of pests.

You can now download the results of these tests from the open data portal. Below, you can look at the map of the most recent results. As summer sets in, more recent results will also be added to the portal, both in the map below and in downloadable formats. The Robert Wood Johnson Foundation also funded this Kaggle competition based on this released data to predict West Nile virus in mosquitoes across the City of Chicago.