Business GIS for Everyone

This blog takes an informal look into the debates and methods related to business GIS and mapping

Translating Big Analysis into Big Understanding (and Big Dollars): Part One

Author: Dr. Murray Rice

Advancements in GIS, data mining, and geoanalytics have brought a wealth of new, powerful, data-based methodologies to a wide range of small and medium-sized businesses that were shut out of such capabilities even a decade ago. Advanced multivariate statistical methods take complex datasets and extract core insights that could not be seen in any other way. The downside of this explosion of multivariate (and often spatial multivariate) tool development is that the complex datasets that serve as input often lead to output that is just as complex and difficult to understand.

Business GIS plays a role in cutting through some of this complexity and accessing core market insight without involving complex tables or unintuitive graphics. Coupling the communication and conceptualization power of geography with the analytical power of multivariate analytics gives an ever-widening circle of professionals who are not statistical analysts the ability to generate and interpret complex results.

The growing field of spatial segmentation is a prime example. Spatial segmentation involves bringing dozens of variables into analysis simultaneously, producing a complex but understandable picture of the structure of local, regional, and national markets. If we can add data from a business’ own operations to this mix, we can set up a framework where deep and useful insight can be gained.

Components of a Spatial Segmentation Analysis

There are two distinct phases of work needed to produce a spatial segmentation analysis:

Phase 1: Generation of the overall segmentation framework. This is the biggest and most challenging component of the analysis, in which every market segment is identified across the country. This sort of wide-ranging analysis needs to be done only once, yielding a segmentation framework that can be used and reused many times. Commercial data analytics firms such as Caliper do this basic work and provide it to their customers, who then have what they need to complete the second phase.

Spatial segmentation works to make sense of complex datasets derived from any of many sources. Two major dataset foundations for spatial segmentation are demographic and psychographic data. Demographic data record measurable, census-type characteristics of people and populations, such as age, years of education, income, and marital status. Psychographic data also deal with measurable values of people and populations, but they focus specifically on dimensions related to preferences, interests, personality, and behaviors (such as above-average levels of tennis playing, the presence of Spanish as a second language in a household, or frequent purchase of books). Combining demographic and psychographic data in a joint analysis provides a uniquely powerful perspective on a population of interest.

Datasets that are used in spatial segmentation typically have a few dozen demographic and psychographic variables, with values for each variable being known for neighborhoods across the country (with neighborhoods being based on standard, widely used geographies such as postal codes, census tracts, or block groups). The size of these geographical databases alone is quite large. For example, a typical analysis to create a segmentation system would bring in all 242,000 block groups in the US. Adding in a typical number of variables (we’ll use 50 variables in this example) yields an analysis that is based on 12.1 million data values. Quite a hefty analytical load!
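The arithmetic behind that figure is easy to check. The short sketch below reproduces the 12.1 million count from the two numbers used above, and adds one illustrative extra: how many comparisons a naive "compare every block group with every other block group" approach would need. That all-pairs figure is our own illustration of why the load is hefty, not a description of how any particular vendor runs its analysis.

```python
import math

BLOCK_GROUPS = 242_000  # block-group count used in the example above
VARIABLES = 50          # example variable count used above

# One value per block group per variable.
data_values = BLOCK_GROUPS * VARIABLES
print(f"{data_values:,} data values")  # 12,100,000 data values

# Illustration only: number of distinct block-group pairs if every
# block group were compared directly with every other one.
naive_pairs = math.comb(BLOCK_GROUPS, 2)
print(f"{naive_pairs:,} block-group pairs in a naive all-pairs comparison")
```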

The goal of this large-scope segmentation analysis is to use each variable as a point of comparison between and among the neighborhoods represented in the database. In other words, the segmentation analysis creates groupings of the most similar block groups based on a comparative analysis of every variable in our database. By using a broad range and variety of both demographic and psychographic variables, spatial segmentation produces a truly robust classification of geographic neighborhoods across the country.

Once this overall segmentation framework is complete, we are ready to begin phase 2: using the overall framework to provide insight into a customer dataset.
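To make the idea of "grouping the most similar block groups" concrete, here is a minimal sketch using k-means clustering on a tiny invented dataset. Two loud caveats: this post does not say which algorithm any commercial segmentation system uses, so k-means is our assumption for illustration only, and the six block groups, their IDs, and their three variables are entirely hypothetical. The variables are first converted to z-scores so that a variable measured in large units (such as income) does not dominate the comparison.

```python
import math
from statistics import mean, pstdev

# Hypothetical toy data: one row per block group, columns are
# (median age, median household income in $1,000s, % of households buying books often).
block_groups = {
    "BG-001": (29.0, 48.0, 12.0),
    "BG-002": (31.0, 52.0, 15.0),
    "BG-003": (45.0, 130.0, 40.0),
    "BG-004": (47.0, 125.0, 38.0),
    "BG-005": (68.0, 60.0, 30.0),
    "BG-006": (70.0, 58.0, 33.0),
}

def standardize(rows):
    """Convert each column to z-scores (assumes no column is constant)."""
    stats = [(mean(col), pstdev(col)) for col in zip(*rows)]
    return [tuple((v - m) / s for v, (m, s) in zip(row, stats)) for row in rows]

def farthest_point_centers(points, k):
    """Deterministic start: first point, then repeatedly the point farthest from all centers."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    return centers

def k_means(points, k, iterations=20):
    """Plain k-means: assign each point to its nearest center, then re-center."""
    centers = farthest_point_centers(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster ends up empty
                centers[j] = tuple(mean(col) for col in zip(*members))
    return labels

ids = list(block_groups)
labels = k_means(standardize([block_groups[i] for i in ids]), k=3)
segments = dict(zip(ids, labels))
print(segments)
```

In this toy case the younger, lower-income block groups end up together, as do the high-income pair and the older pair. A real segmentation system does the same kind of comparison across dozens of variables and hundreds of thousands of neighborhoods.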

Phase 2: Generation of an individual set of segmentation results, tailored to the interests of a specific business. Here we use the analytical foundation provided by our phase 1 work to understand the segmentation characteristics of a particular business’ markets. We do this by bringing the business’ customer dataset into consideration. Implementing the segmentation analysis in this way acknowledges the reality that creating a segmentation system (phase 1) is a difficult task that depends heavily on data and computing infrastructure. A typical segmentation analysis uses a commercially available segmentation framework, which allows the analyst to focus on phase 2 tasks and avoid getting bogged down in the segmentation details of phase 1. Beginning with a commercially available framework also provides a level of quality checking that would not be possible in a "one and done" situation where the segmentation is used a single time.
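As a rough sketch of what "bringing the customer dataset into consideration" can look like, the example below tags each customer with the subsegment of their home block group and counts customers per subsegment. Everything here is hypothetical: the block group IDs, the customer records, the "Unmatched" label, and the `profile_customers` helper are our own illustration, not part of any vendor's software.

```python
from collections import Counter

# Hypothetical phase 1 output: block group id -> subsegment name.
block_group_subsegment = {
    "BG-001": "High-Earning Families",
    "BG-002": "High-Earning Families",
    "BG-003": "Opulent Homesteads",
    "BG-004": "Some Other Subsegment",
}

# Hypothetical customer file: (customer id, home block group id).
customers = [
    ("C-01", "BG-001"),
    ("C-02", "BG-001"),
    ("C-03", "BG-002"),
    ("C-04", "BG-003"),
    ("C-05", "BG-999"),  # not in the lookup, e.g. an address that failed to match
]

def profile_customers(customer_rows, lookup):
    """Count customers per subsegment; unknown block groups go to 'Unmatched'."""
    counts = Counter()
    for _customer_id, block_group in customer_rows:
        counts[lookup.get(block_group, "Unmatched")] += 1
    return counts

profile = profile_customers(customers, block_group_subsegment)
for name, n in profile.most_common():
    print(f"{name}: {n} customers ({n / len(customers):.0%})")
```

Keeping an explicit "Unmatched" bucket is a deliberate choice in this sketch: it makes data-quality problems visible instead of silently dropping customers from the profile.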

The possibilities for analysis of customer data are many. The remainder of this post further discusses the details of phase 1 and Caliper Corporation's implementation of the segmentation concept in its own proprietary Maptitude Segmentation System.

Caliper's General Segmentation Framework: What it Provides
Caliper has created a flexible framework that gives us a ready-to-use set of segmentation tools. These tools address a few basic questions that together cover what a complete and rigorous segmentation analysis needs to answer. These questions include:

What kinds of neighborhoods does the United States have?
Caliper has done this basic environmental scan for us, creating a roster of 32 unique types of neighborhoods (geodemographic "subsegments") that can be found across the United States. Each subsegment is assigned a color theme that also groups the 32 subsegments into 8 larger segments (see Figure 1). Working with the 8 segments somewhat reduces the power of the analysis but provides the important benefit of cutting the number of neighborhood groupings by 75% (from 32 down to 8).

Figure 1: The Maptitude Segmentation System by Subsegment
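Rolling subsegments up into segments is just a lookup followed by a sum. The sketch below shows the idea with a deliberately tiny, hypothetical lookup table. The codes "A1", "B2", and so on, and the neighborhood counts, are placeholders of our own and do not reproduce the actual Maptitude subsegment-to-segment assignments.

```python
from collections import Counter

# Hypothetical lookup: subsegment code -> parent segment code.
# The real assignments are not reproduced here.
SUBSEGMENT_TO_SEGMENT = {
    "A1": "A", "A2": "A", "A3": "A",
    "B1": "B", "B2": "B",
    "C1": "C",
}

# Hypothetical count of neighborhoods per subsegment in one market.
neighborhood_counts = {"A1": 120, "A2": 45, "A3": 10, "B1": 80, "B2": 30, "C1": 15}

def roll_up(counts, lookup):
    """Sum subsegment counts into their parent segments."""
    totals = Counter()
    for subsegment, n in counts.items():
        totals[lookup[subsegment]] += n
    return dict(totals)

segment_counts = roll_up(neighborhood_counts, SUBSEGMENT_TO_SEGMENT)
print(segment_counts)  # {'A': 175, 'B': 110, 'C': 15}

# Reduction in the number of groupings for the full system described above.
reduction = 1 - 8 / 32
print(f"{reduction:.0%}")  # 75%
```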

For each of the 32 neighborhood types identified, where can that neighborhood type be found across any given city, state, or country?
Here, Caliper has defined the full geography of its segmentation system across the entire nation. Figure 2 below represents what this looks like at a metropolitan scale using the example of Harris County, Texas (Houston). The map represents where each of Caliper’s 32 neighborhood types can be found in and around Houston. Each color in this map represents a different neighborhood type, each of which has a unique demographic and psychographic profile. To help make sense of this complex and potentially confusing map, the Harris County map highlights two subsegments in particular.
- The "High-Earning Families" subsegment appears across the map in census tracts shaded dark blue. The graphic also includes a profile of the subsegment based on some of the data Caliper used to identify it.
- The "Opulent Homesteads" subsegment appears on the map in dark purple. Again, the graphic provides a brief profile of this distinctive subsegment.

Figure 2: All 32 Subsegments in a Map of Harris County, Texas

If I have a particular interest in a specific subsegment, how can I track that neighborhood type in particular?
For example, suppose I am interested in the High-Earning Families subsegment because I know that this group is a great source of customers for my business. Implementation of this analysis allows us to isolate that particular subsegment in a given local market. The map below breaks out the High-Earning Families subsegment on its own in Harris County.

Figure 3: High-Earning Families Subsegment in Harris County, Texas

This map shows us exactly where this subsegment lives, and provides the foundation for more analysis that can indicate what set of business locations would best serve the neighborhoods our analysis identifies. Clearly, a business serving this neighborhood type would need to focus on establishing locations in the northwestern and northeastern suburbs of Houston and avoid siting facilities in the core areas of Houston.

To give another example, one more breakout map isolates the Opulent Homesteads subsegment across Harris County (see Figure 4). Using this analysis allows us to consider multiple subsegments as we make further plans to develop and grow our business. It demonstrates that, although the High-Earning Families and Opulent Homesteads neighborhood types have some superficial similarity, each has a distinctive geographic pattern.
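Behind a breakout map like this one, the isolation step amounts to a simple filter on the subsegment assigned to each tract. A minimal sketch follows, with hypothetical tract IDs and a hypothetical `tracts_in_subsegment` helper. In practice the mapping software handles this selection for us.

```python
# Hypothetical rows: (tract id, subsegment name) for one county.
tract_assignments = [
    ("T-0001", "High-Earning Families"),
    ("T-0002", "Opulent Homesteads"),
    ("T-0003", "High-Earning Families"),
    ("T-0004", "Some Other Subsegment"),
]

def tracts_in_subsegment(rows, name):
    """Return the ids of tracts assigned to the named subsegment."""
    return [tract for tract, subsegment in rows if subsegment == name]

target_tracts = tracts_in_subsegment(tract_assignments, "High-Earning Families")
print(target_tracts)  # ['T-0001', 'T-0003']

# Share of the county's tracts that belong to the target subsegment.
share = len(target_tracts) / len(tract_assignments)
print(f"{share:.0%} of tracts")  # 50% of tracts
```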

Figure 4: Opulent Homesteads Subsegment in Harris County, Texas

The key thing to recognize about this analysis so far is that it is a general framework that helps us to broadly understand the spatial market structure of regions and metropolitan areas. But note that to this point the entire analysis is built around generic, public data and the insights that can be gained from what we earlier defined as phase 1. What value can adding a business’ own customer or order data contribute to this analysis (phase 2)? The next blog post will cover that next step: from generic, broad analytical framework to specific application based on a business’ own proprietary data.

For more insight into the geodemographic segmentation analyses described here, please visit Caliper’s resources on its geodemographic segmentation system. To learn how to perform spatial segmentation analysis, see the Maptitude Learning Portal.
