Citygram: Mapping, Sonifying, and Visualizing Dynamic, “Avisual” Energies of Urban Environments

“Think of it not as ‘tracking your own sound’ but rather ‘look at sound in a collective sense.’” – Park

Tae Hong Park is a composer, music technologist, and bassist. His work focuses on composition of electro-acoustic and acoustic music, machine learning and computer-aided music analysis, research in multi-dimensional aspects of timbre, and audio digital signal processing. Dr. Park has presented his music at national and international conferences and festivals including Bourges, ICMC, MATA, SCIMF, and SEAMUS. Among the ensembles and performers who have played his work are the Brentano String Quartet, California E.A.R. Unit, Edward Carroll, Ensemble Surplus, Zoe Martlew, Nash Ensemble of London, and the Tarab Cello Ensemble. Professor Park is author of Introduction to Digital Signal Processing: Computer Musically Speaking (World Scientific, 2010). He is the Chief Editor of Journal SEAMUS, serves as Editorial Consultant for Computer Music Journal, and is President of the International Computer Music Association (ICMA). He received his Ph.D. from Princeton University.


Park uses technology that is both inexpensive and readily available to consumers, but he also relies on NYU’s CUSP server to store data. The CUSP server has two memory servers that each have one TB of RAM and a total of 720 TB of hard drive space. On top of this memory are 20 compute nodes with 1280 cores. The Android Mini PCs used in the project are small screenless computers with built-in microphones. The Citygram team has installed custom Android software on them that performs the sound blurring and analysis before sending the results to the servers.


For the Citygram project, Park draws on his concerns regarding noise pollution and its effects on city life, as well as his interest in making public sound data accessible for visualizing environments. He sought a practical and technically robust means to document, report, and analyze noise pollution and to increase public transparency of the sound data. Park explored various ways of recording this information so that it could be shared without infringing on privacy. He enjoyed the process of testing devices and weighing cost against quality in order to design an optimal network.


The Citygram Project seeks to encompass all non-ocular information overlooked by traditional geospatial mapping (humidity, traffic patterns, and other energies). Park chose to begin with noise pollution because, although a certain amount of noise is unavoidable, too much can cause problems with concentration and other health issues. It may also have an undesirable impact on real estate values and uses. For example, someone shopping for an apartment may find a map of noise by time and intensity very beneficial. The underlying noise levels and the character of that noise at different points within a city are also constantly in flux, reflecting both the level and the form of human activity. The sound data would thus capture a wide range of insights into the ecology of the environment when overlaid with other data. In addition, variations in the data will embody natural rhythms and shifts that can be tapped by computer music programs for artistic presentations as “sonifications.”

As sensors and analysis technology advance, it may also be possible to register the “sentiment” or mood embodied in the sounds, information that will be useful both for mapping and for extraction as an aesthetic element in sonification. From a long-term geospatial mapping perspective, the data will also assist in understanding differences between cities and cultures across time and, through sonification, show the information in a clear, effective, and even aesthetically interesting way.

Park was also intrigued by the technical challenges of developing both a GPS-enabled smartphone application and a specialized, flexible, reliable, and reprogrammable sensor that could be acquired inexpensively and mounted at fixed locations. The Android-based Mini PC was an excellent match and will offer the opportunity to track other environmental data. Rather than focus merely on academic research, Park wants to advance a project that produces data accessible to whoever wants it and open to creative exploration. To that end, the team worked on plugins for popular computer music applications and also preprocessed the data so that it could not violate anyone’s privacy. The project is particularly interesting because it offers opportunities for broad social participation by leveraging current and future devices, as well as by enhancing and repurposing obsolete technology such as the thousands of old pay phones that are scattered around many cities.


Future Goals

Park’s perspective on this project is anything but narrow. While he has started on a small scale, he plans to expand the project’s scope iteratively. He is now looking to a second phase that will record environmental facets such as humidity, wind speed, color spectra, brightness, and reflectivity. A third phase will open the project up to more semantically based factors such as mood and sentiment.

Dealing with Privacy Issues

After the lecture, many questions arose about privacy and how people respond to the information that is being tracked and analyzed. In Citygram, the Android Mini PC listens for sound and analyzes the flux, centroid, and other characteristics that can be derived from the spectrum. The sound is not actually recorded, and on top of that a sound blurring algorithm further ensures that anyone who gained access to the server or computer still would not be able to hear what is happening in real time. Sound blurring works as follows: the sound is passed through a Fourier analysis, which describes it in terms of amplitude and phase. The result is a set of ‘bins’, each holding the phase and amplitude values for one frequency band. Each frame of sound will contain a power-of-two number of data points, 1024 or more. From here, the parts of the spectrum that carry the timbral characteristics of a person’s voice can be identified and those bins randomized or scrambled. The information is still in the bins, but it is scrambled and analyzed before it reaches the server, and the server stores only the analyzed data such as flux and centroid. The only way this project could invade privacy would be for someone not only to hack the server to get the files, but also to alter the software on each Android Mini PC and remove the blurring algorithm.
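The steps described above can be illustrated with a minimal sketch. It is not the Citygram team's code: the function names, the 64-sample frame (kept small so the naive transform runs quickly, where the text mentions 1024 or more), the 8000 Hz sample rate, and the choice of which bins count as the "voice" band are all assumptions made for illustration. The sketch computes the two features named in the text, spectral centroid and spectral flux, and separately shows what shuffling a band of bins looks like; the order in which the real system applies these steps is not specified here.

```python
import cmath
import math
import random


def dft(frame):
    """Naive discrete Fourier transform of a real frame.

    Returns one complex value (amplitude and phase) per bin,
    for bins 0 .. len(frame) // 2.
    """
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n // 2 + 1)
    ]


def spectral_centroid(mags, sample_rate, frame_size):
    """Magnitude-weighted mean frequency of the spectrum, in Hz."""
    total = sum(mags)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / frame_size
    return sum(k * bin_hz * m for k, m in enumerate(mags)) / total


def spectral_flux(prev_mags, mags):
    """Euclidean distance between two successive magnitude spectra."""
    return math.sqrt(sum((m - p) ** 2 for p, m in zip(prev_mags, mags)))


def scramble_bins(bins, lo, hi, rng):
    """Return a copy of bins with the band bins[lo:hi] shuffled.

    The band limits are hypothetical; bins outside the band are untouched.
    """
    out = list(bins)
    band = out[lo:hi]
    rng.shuffle(band)
    out[lo:hi] = band
    return out


# Assumed demo values, not taken from the project.
FRAME_SIZE = 64
SAMPLE_RATE = 8000

# A pure tone that falls exactly on bin 8 (1000 Hz), and a silent frame.
tone = [math.sin(2 * math.pi * 8 * t / FRAME_SIZE) for t in range(FRAME_SIZE)]
silence = [0.0] * FRAME_SIZE

tone_bins = dft(tone)
tone_mags = [abs(b) for b in tone_bins]
silence_mags = [abs(b) for b in dft(silence)]

centroid = spectral_centroid(tone_mags, SAMPLE_RATE, FRAME_SIZE)
flux = spectral_flux(silence_mags, tone_mags)
blurred = scramble_bins(tone_bins, 2, 28, random.Random(0))
```

In this sketch only numbers such as `centroid` and `flux` would need to leave the device, which matches the report's point that the server stores analyzed data rather than audio.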


Park’s project is unique among those explored by Transduction this semester in that its long-term goals include not just expanding our understanding of and access to data, or even its possible use by broad audiences, but also the eventual direct and ongoing participation of millions of citizens in the collection of localized data. And just as the pathways for cell signal transduction presented by Brautigan and DeSimone are fundamental components of our sense perception, Park seeks to foster a framework for a pervasive and flexible sensor network, one that will extend biological processes and provide our minds and bodies with actionable information. Often we have more than one means to perceive an entity’s or a source’s characteristics. For example, our eyes detect light but our skin also detects radiant warmth; our ears detect audio energy but our bodies also sense vibration. The process of perception can also extend beyond mere sensation: both eyes and ears are known to subtly “preprocess” inputs. Our inner ears themselves facilitate discriminating and tracking sounds, just as linkages between nerves in our retinas facilitate spotting movement and depth. These senses have built-in bandwidth limits and means of aggregation lest our brains become overwhelmed with the flow of inputs. Similarly, the expanding array of sensors proposed by Park will assemble energies and qualities that offer an overlapping perspective on dynamic environments (brightness, reflectivity, spectral distributions, etc.). Each new “sense” will likely embody not only a range of physical sensors but also new algorithms to preprocess the data, both to meet the practical constraints of network communications and to support concise and purposeful analysis. However, the framework will also be able to ‘evolve’ and add new capabilities.

While the current Citygram data collection inherently logs time and location, the processing treats the sensors as point sources and does not yet integrate their data as an active array. In contrast, human sight and hearing function stereoscopically. Eventually, the divergent Citygram sensors will likely become a more holistic network that “understands” and tracks higher-level environmental events as part of its inherent framework of processing.

The precision of the analysis and sensors is very high and lends itself well to areas outside of Park’s immediate focus. Musicians and artists have access to the data and can use it to create different visual or sonic mappings, while science researchers can track a wide range of measures, such as the spectral centroid or flux of an area. Citygram is designed so that many fields of study can use the information easily and without modification.
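As a small illustration of what a "sonic mapping" might look like, the sketch below turns a series of spectral centroid readings into MIDI note numbers. Everything in it is hypothetical: the function names, the note range, the logarithmic scaling, and the made-up centroid values are assumptions for illustration, and it does not use any Citygram plugin or API.

```python
import math


def level_to_midi(value, lo, hi, low_note=48, high_note=84):
    """Linearly map a value in [lo, hi] to a MIDI note number, clamped to the range."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    x = min(max((value - lo) / (hi - lo), 0.0), 1.0)
    return round(low_note + x * (high_note - low_note))


def midi_to_hz(note):
    """Equal-tempered frequency of a MIDI note (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)


# Made-up centroid readings in Hz, one per time step.
centroids_hz = [500.0, 1000.0, 2000.0, 4000.0]

# Map on a log scale so that each doubling of the centroid raises the pitch equally.
log_lo, log_hi = math.log2(500.0), math.log2(4000.0)
notes = [level_to_midi(math.log2(c), log_lo, log_hi) for c in centroids_hz]
```

With these assumed ranges, each doubling of the centroid moves the pitch up one octave; other mappings (to loudness, rhythm, or color) would follow the same pattern.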


Report by Janet Rafner and Maxwell Tfirn.