Chicago City of Learning, Chicago Public Schools, and Chicago Public Library want youth, teachers, and mentors to share what you, your classroom, or your after-school program are doing to celebrate Digital Learning Day by posting to our Chicago #DLDayGallery.
We are looking for photos, links to student-made websites or digital artifacts, videos – anything that you feel showcases your efforts! We will display it here so others can get inspired by what you’re doing and we can learn from each other.
Send your photos, websites, and digital media to http://destinationchicago.explorechi.org/submit! Here are some tips to get you started and more ways you can participate. You can follow along on Twitter with the hashtag #DLDAY.
R is a powerful statistics program that is a favorite among data scientists. Using R with the City of Chicago data portal has been possible, but R users always needed to handle some residual issues after loading files from the data portal. These issues were also common for Chicago’s data science team, so we’re excited to release the RSocrata package to make the interaction with the Chicago data portal–and any other Socrata data portal–easier for R users.
RSocrata is available on CRAN. It can be installed with install.packages("RSocrata") and loaded with library(RSocrata).
Just use the URL of the datasets from any Socrata site to load data with read.socrata(). Below, the Towed Vehicles dataset is loaded as a dataframe with:
towed.vehicles <- read.socrata("https://data.cityofchicago.org/Transportation/Towed-Vehicles/ygr5-vcbg")
You can also use the API Access Endpoint address to load data. Locate the API Access Endpoint address under the Export button and the API menu. You will need to change the “.json” extension to “.csv”. For example, the API Access Endpoint for Towed Vehicles is http://data.cityofchicago.org/resource/ygr5-vcbg.csv.
To use with RSocrata, type:
towed.socrata <- read.socrata("http://data.cityofchicago.org/resource/ygr5-vcbg.csv")
Either the human-readable URL or the API Access Endpoint makes the same call to Socrata, and the package is designed to minimize throttling.
There are a couple of benefits to using RSocrata. First, date values are loaded in R as POSIX-formatted dates, which is not the case with read.csv. With read.csv, date columns will usually be classified as factors:
towed.csv <- read.csv("http://data.cityofchicago.org/api/views/ygr5-vcbg/rows.csv") # Reading CSV input
class(towed.csv$Tow.Date) # Check the date classification for 'Tow Date' column
# [1] "factor"
class(towed.socrata$Tow.Date) # Loaded with read.socrata
# [1] "POSIXlt" "POSIXt"
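For readers who want to see the difference without calling the portal, here is a minimal sketch using made-up rows. The sample values, the "%m/%d/%Y" date format, and the variable names are assumptions for illustration; this is not RSocrata's internal code, only the kind of manual conversion that read.csv would otherwise leave to the user.

```r
# Made-up sample rows standing in for a portal CSV export.
csv_text <- "Tow.Date,Make\n01/06/2014,FORD\n01/07/2014,CHEV"

# stringsAsFactors = TRUE reproduces the factor behavior described above.
towed <- read.csv(text = csv_text, stringsAsFactors = TRUE)
class(towed$Tow.Date)

# Convert the factor to character first, then parse with an explicit format.
tow_date <- as.POSIXlt(as.character(towed$Tow.Date), format = "%m/%d/%Y", tz = "UTC")
class(tow_date)
```

The as.character() step matters: converting a factor directly would operate on its integer level codes rather than on the date text.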
Second, the RSocrata package uses a loop and Socrata’s $offset parameter to request data in pages, which minimizes throttling from the data portal.
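The paging idea can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's actual source: page_url and read_paged are hypothetical helper names, the page size of 1000 is arbitrary, and the $limit parameter is assumed to accompany $offset in each request.

```r
# Hypothetical helper: build the URL for one page of a Socrata CSV endpoint.
page_url <- function(endpoint, limit, offset) {
  sprintf("%s?$limit=%d&$offset=%d", endpoint, as.integer(limit), as.integer(offset))
}

# Hypothetical helper: request pages until one comes back with fewer rows
# than the page size, then stack all pages into a single data frame.
# `fetch` defaults to read.csv but can be swapped out for testing.
read_paged <- function(endpoint, limit = 1000, fetch = read.csv) {
  pages <- list()
  offset <- 0
  repeat {
    page <- fetch(page_url(endpoint, limit, offset))
    pages[[length(pages) + 1]] <- page
    if (nrow(page) < limit) break   # a short page means no more rows remain
    offset <- offset + limit
  }
  do.call(rbind, pages)
}
```

Requesting many small pages instead of one large download keeps each call short, which is the behavior the package relies on to avoid being throttled.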
Funded with a $1 million grant from Bloomberg Philanthropies, Chicago’s SmartData project will build the first open-source, predictive analytics platform – aggregating and analyzing information to help leaders make smarter, faster decisions and prevent problems before they develop. “The Mayors Challenge is all about finding great ideas that can spread to other cities,” said Jim Anderson of Bloomberg Philanthropies. “While several municipalities are working to harness the power of big data, Chicago will be the first city to do so open source, making it possible for this great idea to spread and empower other cities.” More information on the Bloomberg Philanthropies Mayors Challenge can be found at bloomberg.org/mayorschallenge.
THE SMARTDATA PLATFORM
Chicago collects 7 million rows of data a day — data on everything from weather to traffic patterns to the location of libraries, schools, sidewalks and public parks. But this abundance of data in itself can’t solve urban problems. Most of the data exists in separate systems, often in conflicting and confusing formats. So how can government managers use the data to make better decisions? They need a tool to help them collate, sift and analyze. They need the SmartData platform.
SmartData will give leaders a tool to search for relevant data and detect relationships, analyzing millions of lines of data in real-time. This will help make smarter, earlier decisions to address a wide range of urban challenges. Chicago residents will experience services delivered earlier, sometimes even before the problem is apparent. Officials will be able to target responses that will address a wide range of urban issues — from managing weather emergencies to reducing traffic accidents. The SmartData platform turns “thinking” into “doing.” It turns “reactive” into “proactive.” At its core, it makes data-driven and effective government the norm and can fundamentally alter the way the city operates.
“Continuing to expand the use of analytics across our city is a top priority,” said Chief Information Officer Brenna Berman, who is overseeing the platform’s development. “With the SmartData Platform, we’ll not only be able to expand analytics, but develop a new method of data-driven decision making that can change how cities across the country operate.”
THE PLATFORM IN ACTION
The project has two goals — first, to help city managers analyze trend data and engage in predictive problem-solving, and second, to share the platform with cities that cannot build the capacity themselves. All software developed on this project will be open source and will be made available to other cities.
SmartData will allow policymakers to make sense of the city’s billions of lines of data stored in disparate systems. Managers will be able to find answers to specific questions without having to manually search for data, or even know where or how the data is stored. End users will not see the millions of lines of code, or the billions of records stored and searched for their benefit. All a user will see is a simple search screen to perform the query. Results will be presented in easy-to-read formats, including geographical plots, customizable to the user’s preferences (police beats, school districts, sanitation districts, etc.).
The predictive power of the tool is its ability to analyze relationships in the data at a speed and on a scale not previously possible. For example, the SmartData Platform could query data on traffic patterns and pedestrian activity for a certain section of the city, and then compare it against other city data, such as weather patterns, traffic signal times and streetlight access. By doing so, SmartData might develop a prediction of where intervention is needed to reduce pedestrian-traffic collisions. The city could optimize services of all kinds in this way, benefitting citizens and reducing costs.
ENHANCING CITY OPERATIONS
The first stage of the SmartData project is now complete. The first tool, called WindyGrid, presents a unified view of operational data for all public safety agencies on a single dashboard. Varied types of data are displayed in a user-friendly single graphical interface, allowing the user to make queries and also receive automatic updates and alerts. WindyGrid includes more than a dozen different types of data, including geospatially tagged 311 reports, 911 calls and public Tweets, emergency operations data, video feeds from surveillance cameras, and city bus location data. Users can query data by type, time and distance from a given location and determine how they want results displayed, including a heat map to show concentrations of results.
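To make the idea of querying by type, time, and distance concrete, here is a small sketch in R. It does not describe how WindyGrid is implemented; the event table, its column names, the coordinates, and the helper functions haversine_km and query_events are all made up for illustration.

```r
# Great-circle distance in kilometres between points given in decimal degrees.
haversine_km <- function(lat1, lon1, lat2, lon2) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 + cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * 6371 * asin(pmin(1, sqrt(a)))
}

# Hypothetical event table; every value here is invented.
events <- data.frame(
  type = c("311", "911", "311", "tweet"),
  time = as.POSIXct(c("2014-01-06 08:00", "2014-01-06 09:30",
                      "2014-01-07 10:00", "2014-01-06 08:15"), tz = "UTC"),
  lat  = c(41.8827, 41.8790, 41.8830, 41.9500),
  lon  = c(-87.6233, -87.6359, -87.6240, -87.6500),
  stringsAsFactors = FALSE
)

# Keep events of the requested types, inside the time window,
# and within radius_km of the given location.
query_events <- function(events, types, from, to, lat, lon, radius_km) {
  d <- haversine_km(lat, lon, events$lat, events$lon)
  keep <- events$type %in% types & events$time >= from &
    events$time <= to & d <= radius_km
  events[keep, , drop = FALSE]
}

hits <- query_events(events, types = c("311", "911"),
                     from = as.POSIXct("2014-01-06 00:00", tz = "UTC"),
                     to   = as.POSIXct("2014-01-06 23:59", tz = "UTC"),
                     lat = 41.8819, lon = -87.6278, radius_km = 1)
```

With this sample table, the query keeps the two January 6 events that lie within one kilometre of the chosen point and drops the event from the following day and the distant tweet.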
A new pilot project focused on rodent baiting is demonstrating how Chicago can apply predictive analytics to core city services that affect quality of life in the city’s neighborhoods. Rodent complaints are one of the city’s top ten 311 inquiries. Traditionally, the city has reactively deployed rodent baiting teams in response to complaints or events, such as water main breaks, that are anecdotally linked to rodent activity. While Chicago’s 311 call center processes more than 600 types of calls, the pilot identifies and analyzes 31 key call types that can predict rodent activity spikes 7 days in advance. By doing this, the pilot will allow the City’s rodent baiting teams to deploy in an earlier, more targeted fashion to better prevent rodent outbreaks. Initial pilot results are promising, and have included the detection of rat infestations that would never have been found without the new algorithm. The pilot’s final evaluation will be based on both efficiency improvements and reduction in rodent complaints.
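The general idea of predicting a spike seven days ahead from leading call types can be sketched with a lagged regression. This is only an illustration on synthetic data: the pilot's actual algorithm and its 31 call types are not described here, and the single predictor, the simulated relationship, and the linear model below are all assumptions.

```r
set.seed(1)
n <- 120
lag_days <- 7

# Synthetic daily counts for one hypothetical leading 311 call type.
calls <- rpois(n, lambda = 20)

# Synthetic rodent complaints that track calls from 7 days earlier, plus noise.
rodent <- c(rep(NA, lag_days),
            0.5 * calls[1:(n - lag_days)] + rnorm(n - lag_days, sd = 1))

# Shift the call series forward by 7 days so each row pairs
# today's complaints with the calls seen a week before.
lagged <- c(rep(NA, lag_days), calls[1:(n - lag_days)])

# Fit a simple linear model; rows with NA are dropped by default.
fit <- lm(rodent ~ lagged)

# Use the most recent 7 days of calls to forecast the next 7 days of complaints.
forecast <- predict(fit, newdata = data.frame(lagged = tail(calls, lag_days)))
```

A real model would combine many call types and be evaluated against held-out complaint data, but the lagging step is the part that turns a reactive signal into an early warning.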
BUILDING A REPLICABLE MODEL
Any city seeking to create its own SmartData platform will benefit from the groundwork done in Chicago. Chicago is building an open-source data infrastructure and a set of algorithms that other cities can re-use with no startup software development costs. Other cities will be able to import the architecture and the predictive algorithms (both logic and source code) developed by Chicago and adjust them to accommodate their own 311 or related data sets. The technology team in Chicago is also creating an archive of instructive documents and templates that will give other cities a roadmap to develop their own predictive analytics projects.
What does the Mayors Challenge victory mean for Chicago? “Bloomberg’s support is crucial to helping us build the SmartData Platform and make it available to other cities,” Berman said. “It reflects what is at the heart of this initiative—our commitment to make our city smarter and a better place to live for our residents.”
The City of Chicago has a small team that works to liberate city data on the data portal. We use a variety of tools to clean, transform, and publish data on the portal. Many datasets are automatically updated every day, but a number of datasets are published once or updated only infrequently.
In a series of posts, we will review the tools and techniques that we use for these datasets. Whenever possible, we use open-source tools to extract, transform, and load datasets onto the portal. In this first post, we will give a detailed description of how we use OpenRefine (formerly Google Refine) to clean and transform one-time or infrequently updated datasets.
This morning the Mayor announced a new partnership with Code.org and a five-year plan to make computer science part of the Chicago Public Schools core curriculum instead of just an elective. The new K-12 plan includes creating a pipeline for students to learn how to build computer apps and offering at least one computer science class at every high school.
This week, students across the city are taking part in “Hour of Code” events, exploring the basics of coding and programming. With over 20,000 students involved, Chicago has the most “Hour of Code” participants of any city worldwide.
Why are we doing this? While STEM jobs are among the highest-paying jobs for new graduates, fewer than 3% of college students across the nation will graduate with a degree in computer science – and of all students taking AP Computer Science, fewer than 20% are women and fewer than 10% are African American or Latino. By 2020, the U.S. can expect almost 760,000 new jobs to be created in computer and information technology.
Our aim is to lead the way in bridging this gap.
This summer, the Mayor’s Office is offering tech fellowships in three categories:
Data Science fellows will primarily be responsible for developing and implementing statistical models to solve a variety of urban, social, and city management issues.
Java Developer fellows will participate in the development and improvement of Chicago’s web and mobile software projects including research, design, testing, and exploring ways of making information on the City’s data portal more useful and accessible to all.
Web Designer fellows will explore ways of enhancing information located on the City’s data portal and design innovative ways to leverage digital tools to connect with Chicagoans.
Don’t miss your opportunity to apply for great hands-on experience meeting with City leaders and directly contributing to the technology that helps run Chicago. The deadline to apply for all three fellowships is January 12th.
As data continues to create innovative pathways to improve cities, Chicago is making this resource more accessible and powerful for everyone.
Mayor Rahm Emanuel just launched a comprehensive meta-data directory, a Data Dictionary, which gives unparalleled access to the data collected by all City departments and sister agencies.
Why does data need a dictionary, you ask?
Understanding the characteristics and context of data is important in facilitating reliable and effective use of the data throughout our portal. The Data Dictionary enables complex data to be better understood, while better enabling users to find specific data by searching specific terms. Journalists and civic enthusiasts can use this to more effectively discern public information before filing FOIA requests.
Our friends at the Smart Chicago Collaborative published a simple guide to how it works.
You can enter any term that you’re interested in (schools, TIF, budget, police) and it will return every item that contains that term.
Once you pull up a particular dataset, you will see all of its metadata, including every variable. This gives insight into how the data is used, who uses it, and how they use it.
The database details feature gives valuable info like the department that runs the database, the point of contact for the database, which platform it runs on, how it loads the data, and the limitations of that data.
In only a few years’ time, municipal initiatives to open up public data have gone from drawing-board ideas to tech policy fixtures. According to data.gov, 39 U.S. cities now have their own data portal. Moreover, these cities are starting to enact policies around open data, and the number doing so is growing. Open data’s fast rise is already transforming how residents interact with their cities, and we’re still only in the early phases of the movement.
This is especially true in Chicago, a city that has rapidly expanded its open data activities over the past few years. Since 2010, Chicago has established a robust data portal and issued an executive order mandating routine releases of government data. It’s also experiencing a surge in size and activity from its civic hacker community, composed of individuals who use their tech skills to help improve cities. By looking comprehensively at what Chicago has done, and where it is going, we can see what the continuing evolution of open data looks like for a major American city.
BEGINNINGS: THE DATA PORTAL
Chicago’s Data Portal was created to increase government transparency, make government data accessible to residents, and encourage the development of apps and tools that can enhance the lives of Chicagoans. The site currently hosts over 900 dataset variations with information on City services, facilities, agencies, and agency performance. These sets are presented in three main formats: tabular (which offers spreadsheet views), GIS (which offers map views), and API (which is used for software development). Since the Portal’s inception three years ago, the Department of Innovation and Technology (DoIT) has grown it into one of the largest and most dynamic models of open government in the country.
The Portal’s earliest beginnings were in May 2010, when the Daley Administration added FOIA request logs, statements of financial interest, and other records to the City’s regular website to make them more accessible. A year later, this small set of data became a major focus for the Emanuel Administration. The site earned its name, a separate website, and a new level of commitment, as called for in the new mayor’s 2011 transition plan.
Within a month of Emanuel’s inauguration, the City’s lobbyist data, building permit data, budget and finance records, and other key data were released, along with a plethora of GIS location-based information. By September 2011, the City of Chicago had released all its crime data dating back to 2001, the largest municipal data release of its kind in history.
Prior to this release, comprehensive crime data was only publicly available in requested aggregate forms that were prepared on a monthly basis. Incident crime data going as far back as 90 days was also available on the Police Department’s CLEARMAP geographic data system. Now, all crime data from the past decade is not only listed, but also includes the address, police beat, and city ward where each incident occurred, as well as its case number.
The crime data’s granularity increases the potential for long-term criminological studies by experts. It can also potentially lead towards more informed crime prevention initiatives.
BEYOND THE PORTAL: EXPANDING OPEN GOVERNMENT INITIATIVES
While the Data Portal is Chicago’s primary vessel for releasing data, DoIT has sought to offer multiple data resources for Chicago. In February of 2013, the City of Chicago joined GitHub, the open source code-sharing website which allows users to upload, view, and edit each other’s files.
GitHub provides a welcome complement for more experienced techies who wish to analyze and experiment with municipal data. Chicago’s “repository,” or collection of files, includes APIs for 311 service requests, bike rack and sidewalk locations, and municipal Twitter streams.
One of GitHub’s key features is its users’ ability to “fork” files, that is, to copy a repository from one user’s account to another. With this capability, users can fork City data to improve upon it, allowing Chicago to essentially crowdsource better, more accurate data.
To encourage the applied use of municipal data, Chicago has also hosted hackathons and other open-data events. In 2011, one of the largest of these events was Apps for Metro Chicago (A4MC). A4MC, a first-of-its-kind contest hosted by the City, County and State, encouraged the development of meaningful and sustainable apps using municipal data.
Several products from the event have moved on to become successful startup companies. Contest winner SpotHero, for example, found success by helping users find and reserve parking in given locations. While originally built for Chicagoans, SpotHero has expanded its model to other cities and metro areas nationwide, including New York and Newark, Washington and Baltimore, Boston, and Milwaukee.
FROM PORTAL TO POLICY
In December 2012, Mayor Emanuel issued a rare Executive Order that required city agencies to publish public data sets under their control and to update them on a regular basis. The Order also called for the creation of a Chief Data Officer (CDO), who would develop datasets and further the mission of the Data Portal and open government.
Emanuel also made it clear to the City’s departments that data transparency is a top priority for his administration: in addition to overseeing the Portal, the CDO is required to regularly convene an Open Data Advisory Group consisting of coordinators from each City agency.
Chicago’s first CDO, Brett Goldstein, was appointed in June of 2011. As CDO, he oversaw a rapid expansion of the portal. Goldstein also developed innovative new products for the City, such as the WindyGrid situational awareness tool, which is now operational citywide. Goldstein has since stepped down from his CDO post, but the City will appoint a new CDO in the near future.
OPEN DATA IN CHICAGO: TODAY AND BEYOND
More than two years into Emanuel’s first term, and three years after the first sets of data were opened, Chicago’s data initiatives continue to grow. The Data Portal, with hundreds of thousands of views, has become an essential tool for many residents, professionals, and tech developers. In 2013, new major data releases included energy consumption data and food desert data for Chicago.
In addition to growing, however, both the Data Portal and the City’s GitHub presence are steadily transforming to meet the needs of the users who consume their information.
In September of 2013, DoIT released a crucial new tool on GitHub to make it easier for programmers to analyze Portal data. Called RSocrata, it is a package for R, an open-source tool that programmers use for statistical analysis. The package takes Socrata-formatted data from the Data Portal and converts it into a form that is easy to use in R. In other words, DoIT has made it significantly more convenient for programmers to statistically analyze Portal data.
DoIT also wants to make it easier to view multiple datasets in the same space. For example, if a person wishes to compare location information on the city’s 570 parks and 11,000+ bus stops, two separate queries would need to be made—a side-by-side comparison is not currently possible. DoIT plans to change this soon to continually enhance the user experience.
Chicago also wants to ensure its residents are fully aware of the City’s open government initiatives. In January 2013, the City launched Chicago Digital, an online resource that helps connect Chicagoans with innovative digital tools and technology initiatives built from municipal data. To date, more than 30 successful apps have been featured on Chicago Digital.
Despite these accomplishments, there’s plenty of room for growth and improvement. While Chicago’s open data initiatives are still evolving, they mark the start of a smarter, more open, and more data-driven era for the city.