CVI – Excursus on Data Collection

I want to follow up on a point I made in my previous article on Data Privacy. You can read it at the link, but in short, the premise is this:

Proponents of Facebook argue that if you want to protect your data, you should simply stop using Facebook. This is the equivalent of arguing that if you don’t want your data to be collected, you should stop generating data. I argue that this is a backwards way of looking at the world because it asserts that people who make good and moral choices must amend their choices when people who do not make good and moral choices obstruct their good and moral conduct.

I think for this we’re going to have to go back to discussions of Validity and Licity. Is Data Collection moral? Let’s define some parameters, using Facebook as the example.

The means of Data Collection is the internet. The internet works by a user entering and transmitting data; that data is routed to a server so the server can determine what you’re looking for and transmit the results of your request back to you. Data production is inherent to the internet. Data Collection is required for the internet to function.

As I understand it, there are two options: functional data collection and predictive data collection. Functional data collection is where nothing is stored. I request a webpage, the server delivers that page, and then clears anything related to it. No data is stored because the request has been satisfied. Predictive data collection is where some information about me is stored on the server. A user in the United States has requested Facebook. There are x users in the United States who request Facebook, and they make up y% of all users who request Facebook. The user spent this much time and clicked on this many pages.

The difference between predictive and functional data is the intention of the owner of the webpage. The same data is required for both kinds, and most webpage owners want to learn the behaviors of their users in order to offer a differentiated product. I say again: The data produced through simply requesting a webpage is required in either case.
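To make the distinction concrete, here is a minimal sketch in Python. It is only an illustration: the `Request` fields, the handler names, and the per-country counter are all assumptions of mine, not a description of how Facebook or any real server works. The point is that both handlers receive exactly the same request; the only difference is whether anything is kept afterwards.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Request:
    # Hypothetical fields: what a server sees when a page is requested.
    country: str
    page: str


def functional_handler(request: Request) -> str:
    """Serve the page and keep nothing once the request is satisfied."""
    return f"contents of {request.page}"


@dataclass
class PredictiveServer:
    # Aggregate counts retained after each request has been answered.
    requests_by_country: Counter = field(default_factory=Counter)

    def handle(self, request: Request) -> str:
        # Same request data as above, but a record of it is stored.
        self.requests_by_country[request.country] += 1
        return f"contents of {request.page}"

    def share_of_requests(self, country: str) -> float:
        """Percentage of all stored requests that came from `country`."""
        total = sum(self.requests_by_country.values())
        if total == 0:
            return 0.0
        return 100.0 * self.requests_by_country[country] / total
```

The user gets the same page back either way. Only the predictive server can later answer a question like "what percentage of requests came from the United States?", because only it kept the answer.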

We have no moral dilemmas yet because this so far is a function of hosting a webpage and making it accessible to users. We do have an important question: Who owns the data that was produced and stored?

Ownership might be the wrong way of thinking about it. The Data describes the path taken to reach a webpage, and the owner of the webpage owns that report. It’s an output of paths taken by visitors. It’s like a restaurant with a map that lets you pin where you’re from. The patrons don’t own the pins.

So what if the restaurant started selling the information on the pins? One visitor was on vacation from Alaska, so Burlington Coat Factory might pay to know that there is a customer in Alaska who needs a coat. If I were the Alaskan, I would be understandably a little upset that that little piece of information was sold to another company. It would be good practice for the Restaurant to notify patrons that pins that can be monetized will be.

We could argue that we don’t know the full extent of what data is collected by companies like Facebook. Certainly the obvious things: If I upload photos or write messages, it would be like the restaurant taking a picture of its Alaskan customers and posting it on the wall, along with a Post-it note bearing a short message. It would make sense that, once posted, you have no claims to it, even if you don’t know the full extent of what data is collected.

So imagine the Restaurant has security footage, and from the security footage they can tell that you are wearing Nike brand shoes. Nike is very interested to know this and pays for any information the restaurant has on the incidence of Nike shoes in the restaurant. What if the restaurant sold the security footage to a company that uses it in an advertisement for joint cream?

We can differentiate these kinds of data collection as active and passive. Active collection is a positive emission of data which is captured and stored, like sticking a pin on the wall. Passive collection is gathering incidental data by observation. Active data collection is if I were to go to the street and ask people “where are you from?”; passive data collection would be noting the State on the license plates of vehicles in a parking lot. This is public information. But what if I started selling it?

I think that’s the core of the issue: selling information when it is not clear that it is going to be sold. When I talk to people in the street, if I don’t tell them I am going to sell their answers, I am deceiving them intentionally. If I tell them I am going to sell their answers, they may decline to answer. If I go to the parking lot to collect data, how do I tell everyone I am recording and monetizing their license plates? What if they don’t see a sign? If I go and ask everyone, it’s suddenly active collection. If I put up a barrier and notify them, there’s consent, so it becomes active. With passive data collection, people do not have all the information available and cannot decide whether or not to opt out. This is why I say in the original article that people going about their business should not be admonished for parking in a lot and having their license plates noted and monetized.

I wouldn’t say passive data collection rises to the level of deception, because there’s no exchange. Active data collection can be approved or declined, but passive data collection denies users the choice. Yes, all the data is “public” or “out there,” but monetizing that information changes the essential nature of the interaction.

So the question becomes: Do Data Collectors have a moral obligation, if not a legal obligation, to get consent for all monetized data collection?

If no, then Data Collectors have no obligation to their users. If yes, then users ought to approve or deny the monetization of their data. Even if Data Collectors require consent to view their site, users have a choice. The consent ought to be explicit; otherwise it could be construed as deceptive. And consent should be requested every time you access a site, as a reminder that you are being monetized; again, to avoid the perception of deception.
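As a rough sketch of what "explicit, every time" could look like in practice, here is a small Python example. The function names and the shape of the result are hypothetical, chosen only to illustrate the rule: nothing is monetized unless the user said yes during this particular visit, and silence is never treated as agreement.

```python
from typing import Optional


def may_monetize(consent_this_visit: Optional[bool]) -> bool:
    """Allow monetization only on an explicit 'yes' given during this visit.

    None means the user was never asked or never answered; it is treated
    as a refusal so that silence cannot be construed as consent.
    """
    return consent_this_visit is True


def handle_visit(page: str, consent_this_visit: Optional[bool]) -> dict:
    # Hypothetical flow: consent is not remembered between visits, so the
    # caller must pass a fresh answer each time the site is accessed.
    return {
        "page": page,
        "monetized": may_monetize(consent_this_visit),
    }
```

Because no answer is carried over from a previous visit, the reminder that you are being monetized comes with every access, which is the behavior argued for above.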

AMDG