Surname Search - Surname Population Density & Income Distribution
Fri, Apr 18, 2014
This application was developed for my undergraduate honors thesis, and the resulting paper was accepted and published at EuroVA 2014. The work was supported in part by an award from the U.S. Department of Homeland Security’s VACCINE Center and a grant from the UK Engineering and Physical Sciences Research Council (EPSRC). The application took a surname as input from the user and output a population density map and an income distribution for that surname.
The application combined Java web services and MySQL on the backend with HTML, JavaScript, D3.js, jQuery, and the Google Maps API on the frontend. Java was also used to precompute the population density and income distribution for each surname. The precomputation used a phonebook database to link surnames to latitudes and longitudes, and then estimated each surname’s density from those locations with a kernel density estimation function. Because the results were precomputed, when a user entered a surname the web service simply looked up the stored data and returned it to the frontend. The frontend displayed the population density heatmap on top of a Google map of the United States and used D3.js to display the income distribution and a word cloud of similar surnames.
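As a rough illustration of the precomputation step, the sketch below evaluates a Gaussian kernel density estimate on a grid from a list of latitude/longitude pairs. The kernel choice, the bandwidth, the grid bounds, and the names `SurnameKde`, `densityAt`, and `densityGrid` are assumptions made for this example and are not taken from the original application. It also treats degrees as planar coordinates, which is a simplification.

```java
/** Hypothetical sketch: Gaussian kernel density estimate over lat/lon points. */
final class SurnameKde {

    /**
     * Density at (lat, lon): the average of 2D Gaussian kernels centred on
     * each point. Each point is a {lat, lon} pair; bandwidth is in degrees.
     */
    static double densityAt(double[][] points, double lat, double lon, double bandwidth) {
        if (points.length == 0) {
            return 0.0;
        }
        double twoVar = 2.0 * bandwidth * bandwidth;
        double norm = 1.0 / (Math.PI * twoVar); // 1 / (2 * pi * h^2)
        double sum = 0.0;
        for (double[] p : points) {
            double dLat = lat - p[0];
            double dLon = lon - p[1];
            sum += norm * Math.exp(-(dLat * dLat + dLon * dLon) / twoVar);
        }
        return sum / points.length;
    }

    /** Evaluates the density at the centre of every cell of a rows x cols grid. */
    static double[][] densityGrid(double[][] points,
                                  double minLat, double maxLat,
                                  double minLon, double maxLon,
                                  int rows, int cols, double bandwidth) {
        double[][] grid = new double[rows][cols];
        double latStep = (maxLat - minLat) / rows;
        double lonStep = (maxLon - minLon) / cols;
        for (int r = 0; r < rows; r++) {
            double lat = minLat + (r + 0.5) * latStep;
            for (int c = 0; c < cols; c++) {
                double lon = minLon + (c + 0.5) * lonStep;
                grid[r][c] = densityAt(points, lat, lon, bandwidth);
            }
        }
        return grid;
    }
}
```

A grid like this could be computed once per surname and stored, so that a later request only needs a lookup by surname instead of a fresh density calculation.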
I extended previous work that computed a population density heatmap based on a surname. My contributions and results from the project include:
- Rewrote the population density heatmap calculation to compute the kernel density estimate for a surname under either a regular distribution (a plain distribution with no restrictions) or a probabilistic distribution (which lowers the probability based on the total population density of the area).
- Added income distribution calculations for a given surname using United States Census data. Later added options for the income distribution to use the Zillow API.
- Converted from on-demand surname calculations to precomputed population densities and income distributions.
- Expanded the application to find similar surnames based on the given surname’s distribution heatmap and income distribution.
- Integrated with the Google Maps API to compute the population distribution heatmap overlay image and display it on top of a Google Map on the frontend.
- Integrated D3.js to graph the income distribution in predefined bins.
- Integrated D3-Cloud (a D3.js extension) to display the most similar surnames for a given surname.
- Integrated with the Twitter API to allow users to tweet about using the application.
- Presented the application at the Spring 2014 Barrett Honors Symposium.
- Successfully defended the research for my undergraduate honors thesis in Spring 2014.
- Published at EuroVA 2014. You can read the paper here.
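
To make the "predefined bins" item in the list above more concrete, here is a minimal sketch of how income values might be counted into fixed bins before being handed to a chart. The bin edges, the class name `IncomeHistogram`, and its methods are hypothetical and chosen only for illustration; the original application's bins and data layout are not described here.

```java
/** Hypothetical sketch: counting income values into predefined bins. */
final class IncomeHistogram {

    /**
     * Counts incomes into bins defined by ascending upper edges.
     * Bin i holds values below upperEdges[i] and at or above the previous edge;
     * one extra final bin holds everything at or above the last edge.
     */
    static int[] binCounts(double[] incomes, double[] upperEdges) {
        int[] counts = new int[upperEdges.length + 1];
        for (double income : incomes) {
            int bin = 0;
            while (bin < upperEdges.length && income >= upperEdges[bin]) {
                bin++;
            }
            counts[bin]++;
        }
        return counts;
    }

    /** Converts raw counts into shares of the total (all zeros if there is no data). */
    static double[] toShares(int[] counts) {
        int total = 0;
        for (int count : counts) {
            total += count;
        }
        double[] shares = new double[counts.length];
        if (total == 0) {
            return shares;
        }
        for (int i = 0; i < counts.length; i++) {
            shares[i] = (double) counts[i] / total;
        }
        return shares;
    }
}
```

With made-up edges of 25000, 50000, and 100000, `binCounts` returns four counts (one per range plus an open-ended top bin), and `toShares` turns them into fractions that a frontend chart could plot directly.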