Understanding how the largest technology companies collect, use, and share user
information across the internet. We’ve transformed the “Big Four” terms of service and data policies -- the thousands of
lines of code that govern their use of your data -- into a database powering an interactive visualization, an initial
version of which we invite you to explore and critique. Select a company in the top menu and click on a line to see
the original snippet of text from the company's terms of service or data policy.
Apple takes pains to position itself as going above and beyond with regard to privacy and protecting users' data.
However, its Privacy Policy is extremely vague and open to interpretation, affording Apple the ability to collect and
use vast amounts of customer data.
Currently showing only data collected by Apple.
This is important for two reasons. First, Apple’s privacy policy explicitly states that it "may collect, use, transfer,
and disclose non-personal information for any purpose." That’s pretty much all information beyond your name and contact
information. It collects data about everything you do on or with your phone - everywhere you go, which apps you interact
with, who you talk to the most, even your health data - and can use it for whatever it wants, including advertising. Even
personal data can be used to "create, develop, operate, deliver, and improve our products, services, content and
advertising," which includes almost every purpose we identified in our visualization. Apple may take precautions to
protect users' data, but it can collect and use it with very few limitations.
Currently showing only non-personal data collected by Apple.
Secondly, Apple is only legally bound by its Privacy Policy, not representations made in its own marketing or keynote
speeches. Apple keeps its privacy policy purposely vague - allowing it to continually "collect, use, transfer, and
disclose" vast amounts of customer data while protecting itself from liability and lawsuits. In 2013 Apple was sued by a
group of people who claimed they relied on Apple's privacy claims when they purchased iPhones, but that Apple was still
collecting and sharing data. The judge ruled that even though statements were vague and misleading, no one could prove
that they actually read the policy, therefore they could not have been misled (yes, you read that right). Because the
Privacy Policy is the only legally binding statement, the case was dismissed, and Apple didn't have to change any
practices. Apple - and the tech industry - is counting on customers to believe its carefully constructed narratives
around privacy, while leaning on a legal defense that essentially boils down to this: No one reads, much less
understands, privacy policies or terms of service.
Currently showing only personal data collected by
Apple for the purposes of account security, advertising, analytics, company operations, improvement of
products, legal compliance, provision of services, or research and development.
On Facebook, as with all online services, personally identifiable information (PII) like Social Security numbers, dates
of birth, and gender is regulated and protected. In addition, users can control the app’s access to GPS location,
cameras, and contact lists.
Currently showing only data collected by Facebook.
However, even when you turn the privacy settings to the most extreme, Facebook can track all the content you view or
interact with, down to scroll times and cursor movements. The company can still collect the unique identifiers that
tie users to their electronic devices and, even with GPS disabled, pinpoint user locations through cell phone towers and
nearby WiFi networks.
Currently showing only data collected by Facebook that users cannot
control through their privacy settings.
Thanks to big data, Target knows if shoppers are pregnant before they do. In the same way, Facebook can learn more about
a user through their behavior than any information a user profile could provide--regardless of privacy settings. Unlike
PII, this device and behavior information can be shared by Facebook with third parties for any purpose.
Currently showing only location, user behavior, network information, and device identifier
data collected by Facebook that users cannot control through their privacy settings.
I consider myself a "light" user of Facebook and Instagram, by which I mean that I interact with the platform by
browsing and clicking exclusively. I don’t upload photos, create posts, or write articles, and I keep the information in my
biography minimal.
I had been dating someone for about six months and suddenly noticed that we were getting very similar ads.
We weren’t in any pictures together, there was no indication of our relationship status, and we never interacted with each other’s
Facebook activity or walls.
One day, he mentioned (verbally) how he had been considering moving to Boerum Hill in Brooklyn.
I had never heard of Boerum Hill. It sounded far away. But I never pressed it further, nor did I look it up. We never texted about
it either. One day later, I was served a few ads for gyms in Boerum Hill. While we had no intention of moving in together, Facebook
insinuated that I might be moving there with eerie precision - as if it was in the room with us during our conversation.
It certainly feels as if Facebook were listening in on our conversation, and by all rights (literally), the company
very well could be (we dare you to find where in their terms of service it says that they can’t). However, it would take a
lot of resources to listen to billions of conversations at scale and turn them into advertisable objects, and it would be a
potential PR nightmare. What’s actually going on is far more pedestrian. For Facebook, it works like this:
Currently showing only data collected by Facebook.
Facebook tracks what you click on, share, buy, search, and converse about online.
Currently showing only harvested data collected by Facebook,
for advertising purposes.
It then identifies people you're close to (in any number of ways) and tracks the same information about them. (Facebook knows
where you are, and it also knows where the people you are connected to are located.)
Currently showing only harvested, active product use data collected by
Facebook, for advertising purposes.
Facebook can then bucket you into advertising categories alongside people like you, again, based on what you click on,
share, buy, search, and converse about online.
Currently showing only harvested, current location, location history,
IP address, and network information data collected by Facebook, for advertising purposes.
Facebook’s nearly inscrutable algorithms mix all this information together, and use it to serve ads it believes will be
most relevant to you - and to friends with shared interests. It’s not a new concept - birds of a feather, after all. But
what is jarring is the precision with which assumptions are made. People tend to talk “out loud” about the same things
they interact with online. Facebook’s extraordinary data gathering machine simply takes it from there.
Currently showing all harvested data collected by Facebook for
advertising purposes.
Privacy policies and terms of service are written for lawyers and courts, not regular people and certainly not for
computers (we learned this as we tried to turn them into a data-driven visualization!). But one thing about these
documents is simple to grasp: Their language is maddeningly vague and imprecise.
Sometimes, a privacy policy does make a specific point, like when Amazon states with absolute precision:
Currently showing all data for all companies.
“We receive and store any information you enter on our Web site or give us in any other way.”
In this case it’s pretty clear what data you are surrendering to use their product. But when it’s time to offer up an
explanation of what that data will be used for, Amazon suddenly becomes less concrete and instead only offers a few
open-ended use-cases.
“We use the information that you provide for such purposes as responding to your requests, customizing future
shopping for you, improving our stores, and communicating with you.”
So how does this work in practice? In recent news, two former Amazon employees laid bare the company’s practice of using
data collected from vendors to create its own product lines, including brands like Amazon Basics.
Currently showing only data collected by Amazon.
According to Krystal Hu at Yahoo Finance, Amazon employees were “…able to analyze various datasets from the
marketplaces, including the list of top-selling and trending products in certain categories, pricing points, return data
and reviews.”
They could also “view aggregated data such as search interest and also pull reports from Amazon’s data warehouse on
specific products to help them decide what Amazon should make.”
Insights created from that data were then used to find the ideal product to reverse-engineer and produce under one of
Amazon’s in-house brand names.
Amazon may be vague about what it will do with your data, but it is specific about what you can do with its data. The
Conditions of Use, which dictate what you can do on its site, say, “This license does not include any resale or
commercial use of any Amazon Service, or its contents; any collection and use of any product listings, descriptions, or
prices; any derivative use of any Amazon Service or its contents; any downloading, copying, or other use of account
information for the benefit of any third party; or any use of data mining, robots, or similar data gathering and
extraction tools.”
As the old adage goes: “Do as I say, not as I do.”
Currently showing only data collected by Amazon for
research and development purposes.
Welcome to Mapping Data Flows.
The data behind this visualization was created based on the following terms of service and privacy policies:
However, note that there are many more terms of service and privacy policy documents for each of these companies. Due to time constraints,
we chose to focus only on the main documents for each company.
This visualization is generated using three CSV files:
Nodes.csv, which generates the data source, data type, and purpose nodes.
Generates.csv, which links the data source and the data type nodes.
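To illustrate how files like the two described above could be turned into a node-and-link structure, here is a minimal Python sketch. It is not the project's actual code: the column names (`id`, `kind`, `label`, `source`, `target`) and the sample rows are assumptions made up for this example, since the real file layouts are not documented here.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample contents; the real files' columns and rows may differ.
NODES_CSV = """id,kind,label
s1,data_source,Device sensors
t1,data_type,Current location
t2,data_type,Device identifier
p1,purpose,Advertising
"""

GENERATES_CSV = """source,target
s1,t1
s1,t2
"""


def load_nodes(handle):
    """Return {node id: row dict} from a Nodes.csv-style file."""
    return {row["id"]: row for row in csv.DictReader(handle)}


def load_links(handle, nodes):
    """Return {source id: [data type ids]}, skipping links to unknown nodes."""
    links = defaultdict(list)
    for row in csv.DictReader(handle):
        if row["source"] in nodes and row["target"] in nodes:
            links[row["source"]].append(row["target"])
    return dict(links)


nodes = load_nodes(io.StringIO(NODES_CSV))
links = load_links(io.StringIO(GENERATES_CSV), nodes)

# Print each data source followed by the data types it generates.
for source_id, type_ids in links.items():
    labels = [nodes[t]["label"] for t in type_ids]
    print(nodes[source_id]["label"], "->", ", ".join(labels))
```

Keeping nodes and links in separate tables, as the files above do, means a node's label or category can change without touching any of the links that refer to it.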
For feedback, comments or suggestions, please contact us at [email protected].
Team members:
JOHN BATTELLE - SIPA Senior Research Scholar, Adjunct Professor, Co-Founder & CEO of Recount Media
JUAN FRANCISCO SALDARRIAGA - Senior Data & Design Researcher, Brown Institute for Media Innovation
ZOE MARTIN - SIPA Masters of Public Administration
MATTHEW ALBASI - Masters of Science in Data Journalism
NATASHA BHUTA - SIPA Masters of Public Administration
VERONICA PENNEY - Masters of Science in Data Journalism
Mapping Data Flows is also supported by:
We use Google
Analytics. We promise not to use the data for anything other than seeing how many of you come and what you do on the site. We
won’t sell the data, although we may at some point visualize it. We can’t promise what Google’s doing with it, tho.