Visualizing City of Chicago owned properties with Power BI Designer


Update 8.9.2015: Title updated with new product name

 

The week after next, I’ll be in Chicago to present at Microsoft’s Ignite conference.  I thought it would be fun to look at some data for the City of Chicago to see what I could do with it.  

Revenue is important to any city.  Developing enduring sources of revenue, whether through direct means like leasing city owned property or other ingenious ideas, is something every city planner needs to consider when thinking about where to spend precious resources.  Let’s see how the City of Chicago Data Portal (https://data.cityofchicago.org/) helps us.  I’ll use Microsoft’s Power BI Designer to do the analysis (free download).

Before I get too far into it, a word of caution: While the data and tools used are real, without more analysis no conclusions should be drawn as to what actions to take.  I’m not sure what the facts on the ground are in Chicago or any of the neighborhoods mentioned below.  Maybe someone’s already thinking about all of this.  In any case, let’s see what the numbers say…

I opened the City of Chicago Data Portal and found the “City-Owned Land Inventory” data set. 

cblog1

As with any Socrata based data portals, you can get an OData url as shown below that makes it really easy to get the data into Power BI Designer.  Power BI Designer natively supports OData input so the data loads quickly and easily.

cblog2

 

Now to do the mashup, I needed to get a sense of the rental rates in the City of Chicago.  So I looked around the web and found http://www.loopnet.com.  They have a set of tables that show various real estate market trends in Chicago, including lease rates.  There are a lot of different trends to choose from. I somewhat arbitrarily chose the Office Property Asking Rent – Lease Trends and chose the “Metro” rate for the analysis.  I wanted to use a nice representative number.

cblog3

To get this data into Power BI Designer, I used the option Home->Get Data-> Web and entered the URL “http://www.loopnet.com/Chicago_Illinois_Market-Trends”.  I picker that opens has a bunch of tables in it and allows me to preview tables.  Some guess and check iteration is required, but eventually I found Table 28 is the one I wanted.

cblog4

I called this table “RentalRates”. Now that I had reasonable rental rates, I could really get going with my mashup.  To figure out the “Potential Revenue” for each property I needed to multiply the square feet of the property by the rental rate.  This required some Power Query know how. 

I added a custom column called “Potential Revenue” to the “City Owned Properties” query.   The column does a look up in the RentalRates table for the rental rate for “Metro” which is what I wanted to use. 

Some technical details…

If you don’t care about technical details, skip down to the pictures below! But this little bit of exposition shows a couple of Power Query tips and tricks you might want to have in your tool belt.

Here’s what the “Potential Revenue” column’s formula looks like:

[sq_ft] *
Table.SelectColumns(
    Table.SelectRows(RentalRates, each ([Region] = “-Metro”))
    , {“$/sqft (Mar 15)”}
){0}[#”$/sqft (Mar 15)”]

Let’s step through what this does.  You need to read it from the inside out.

     Table.SelectRows(RentalRates, each ([Region] = “-Metro”))

This selects a row in the “RentalRates” table where the column “Region” has the text “-Metro” in it.  You might ask why I chose that string, it’s by inspection of the data value returned from the web page.  I could have cleaned it a little, but that’s for next time.

Next we have the Table.SelectColumns that selects the column I want.  That column as the name “$/sqft (Mar 15)”

Table.SelectColumns(
    Table.SelectRows(RentalRates, each ([Region] = “-Metro”))
    , {“$/sqft (Mar 15)”}
)

Now comes the tricky part.  What the above returns is a table with one column name “$/sqft (Mar 15)” with one row, and in that row is the value I want. So I can append the power query row indexer {0} to get the first row of the table and specify the power query column indexer [#”$/sqft (Mar 15)”] to get the value of the column named “$/sqft (Mar 15)”.  Notice the # sign.  What that does is it makes string an object that the system can compare against the column.

{0}[#”$/sqft (Mar 15)”]

Whew – so am I really crazy smart at Power Query to get this done? No not really.  The way I build that was using a trick.  In the queries list, I right clicked the RentalRates query and selected “reference”.  This creates a “RentalRates (2)” query that I can use the Power Query UI to filter to the right row and select the right column. Then I could select the cell I wanted, right click and select “Drill Down” this gives the following query.  Btw, on the “View” tab you can use the “Advanced Editor” to see and edit the queries.

let
    Source = RentalRates,
    #”Filtered Rows” = Table.SelectRows(Source, each ([Region] = “-Metro”)),
    #”Removed Other Columns” = Table.SelectColumns(#”Filtered Rows”,{“$/sqft (Mar 15)”}),
    #”$/sqft (Mar 15)” = #”Removed Other Columns”{0}[#”$/sqft (Mar 15)”]
in
    #”$/sqft (Mar 15)”

With this query, I can easily get to the expression I used.  Now I did it that way because I was curious about indexing in in Power Query (I need to know all that for my day job). 

There’s an easier way to do it, that probably few people know about!  For analysts, instead of using “Drill Down”, I probably would recommend selecting “Add as New Query” on the drop down.  This makes a new query that returns the value of the cell you selected.  By default this gives a query with the name of the column.  After renaming the query to something like “metroRentalRate”.  What you should know is that in Power Query you can referenced other queries in your formulas.  In this case it’s easy since this query returns a single value. So your final formula for the “Potential Revenue” column would look like this:

[sq_ft] * metroRentalRate

Now that’s pretty nice.  You can remove the “RentalRates(2)” query since it’s not needed without breaking anything and you’re done with that part! As a side note when doing the above, I would still use the ‘reference query’ step since it results in a more efficient set of queries – the data will be downloaded by Power Query only once, rather than twice.

Beyond the above, I did some standard tweaks to the data returned from the OData feed.  To map addresses better, I created a couple of columns to help Bing geocode the addresses better. Here are the queries:

Street Address

= [address] & “, Chicago, IL, USA”

CMTY

= [cmty_area_name] & “, Chicago, IL, USA”

I think the second one is interesting.  The cmt_area_name contains names like “Englewood” that are regions within the city.  Bing has some knowledge of geographic boundaries alike Countries, States, Cities, Zipcodes.  You mileage may vary, but it’s pretty cool when it works, which it did in this case – see the filled map below.

I also used the “replace values” option on the “vacant” column to change the values from ‘true’, ‘false’, and ‘null’ to “vacant”, “occupied”, and “not specified”.  That’s much easier on the eyes.

Show me the data

Now that all the data modeling and queries are done, I created a few pages in Power BI Designer to show some of the data.  Using a TreeMap, I’m able to see part to whole analysis.  I created the view below to show potential revenue by ward number.  As you can see 4 wards account for the bulk of the potential revenue.  You’ll also see the distribution of properties across whether they’re vacant, occupied, or not specified. One very interesting thing in the TreeMap is that you can see how the number of properties in not specified is actually quite high but the potential revenue for those properties is extremely low.  The reason is that there’s very little square footage associated with these properties.  So they might be vacant lots.  What’s interesting about that is that vacant lots are sometimes easier to redevelop than lots with building on them already.

cblog5

Below, I used the Filled Map feature to see which of the communities have the most vacant properties.  Filled Maps are sometimes called choropleth maps. You can see that there are a few neighborhoods with higher densities of vacant city owned property.  That clustering makes it interesting as a potential way to focus resources on those areas to avoid a peanut butter effect.

cblog6

 

Lastly we can look at the wards with the most vacant property and see where all those properties are.  Again, you can see that clustering when looking at each property.  This suggest there might be some really big opportunity in those areas of Chicago.

cblog7

Hope you liked this analysis.  You can find the final Power BI Designer file here:  Chicago City Property.pbix

You can get Power BI Designer for free by click here.

Also, if you haven’t yet, you can signup for Power BI for free by heading over to www.PowerBI.com.

-Lukasz

Comments (0)

Skip to main content