Data artist and software developer Eric Fischer, who used to work as an engineer on Google’s Android team, has put together an amazing mapping project. Using Twitter’s public API, he collected every geotagged tweet sent in the past three and a half years. The Twitter API gives access to only a few days’ worth of data at most, so Eric has been compressing and saving all the data in JSON format, creating a repository of tweets that weighs 3 terabytes and grows by 4 gigabytes a day. In total he gathered 6,341,973,478 geotagged tweets. Then he used the Mapbox API to visualize all these data points on a map.
Ultimately, only nine percent of the six billion tweets were represented as dots on the map. This is due to filtering that removed duplicate coordinates, so that each unique latitude and longitude pair is drawn only once on the map:
For instance, every Foursquare check-in to a particular venue is tagged with the same location, and it doesn’t help the map to draw that same dot over and over. Showing the same person tweeting many times within a few hundred feet also makes the map very splotchy, so I filter out those near-duplicates too.
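The filtering described in the quote can be sketched roughly as follows. This is not Eric's actual code, only a minimal illustration under stated assumptions: the `Tweet` record, the `filter_tweets` function, and the `CELL_DEG` threshold (0.001 degrees, roughly 110 meters of latitude) are all hypothetical, and snapping coordinates to a grid is just one simple way to approximate "within a few hundred feet".

```python
from typing import Iterable, Iterator, NamedTuple


class Tweet(NamedTuple):
    # Hypothetical minimal record; real tweet JSON carries many more fields.
    user: str
    lat: float
    lon: float


# Assumed grid size: 0.001 degrees is roughly 110 m of latitude.
CELL_DEG = 0.001


def filter_tweets(tweets: Iterable[Tweet]) -> Iterator[Tweet]:
    """Yield only tweets that would add a new dot to the map."""
    seen_points = set()      # exact (lat, lon) pairs already drawn
    seen_user_cells = set()  # (user, grid cell) pairs already drawn
    for t in tweets:
        point = (t.lat, t.lon)
        # Snap the position to a coarse grid so that repeated tweets by the
        # same user from nearly the same spot fall into the same cell.
        cell = (t.user, round(t.lat / CELL_DEG), round(t.lon / CELL_DEG))
        if point in seen_points or cell in seen_user_cells:
            continue  # exact duplicate location, or near-duplicate by this user
        seen_points.add(point)
        seen_user_cells.add(cell)
        yield t
```

Given two users checking in at exactly the same venue, only the first tweet is kept; a second tweet by the same user a few meters away is dropped as well. A grid is only an approximation, since two nearby points on opposite sides of a cell boundary are not treated as near-duplicates.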
The whole process behind the project is described in Eric’s blog post.
Now let’s compare the tweet map with a population density map. The conclusion is simple: where there are large concentrations of people and money, there is a higher density of tweets. The exceptions are India and China (the latter being where Twitter is banned), where the map is practically blank. On the other hand, Brazil and Argentina show huge Twitter usage, even compared with Germany.