50 Shades of AI Data


As we at drivebywire.law grapple with legally allocating rights, obligations and risk in AI, we have realized that we need to invent new legal classes of data to set the expectations of consumers, startups and regulators. Here are some rough categories to get this conversation going:

  • Type of sensor present: A device, like a door knob, that seemingly has no interest in temperature, might still contain a temperature sensor, placed as a sleeper sensor to be activated much later in the life of the product for purposes not yet imagined at the time of manufacture.  Manufacturers wish to keep a degree of secrecy concerning the sensors that are in their products.
  • Sensor degree of precision: This data set logs the degree of precision of a sensor. For example, an accelerometer in a car can be more or less sensitive, and that fact is itself valuable information. A vehicle manufacturer might not want to reveal that its accelerometer is less sensitive than one used by a competitor. Data concerning the precision of sensors should therefore be treated as a distinct category of data, subject to its own set of confidentiality parameters.
  • Raw sensor data: This is the data collected by sensors in vehicles, fridges, homes, wearables, etc. We have to define who owns this data from its point of origin, i.e. from the sensor onwards. Does the sensor manufacturer have rights in that data even though it is not involved in making the larger product?
  • Sensor-device correlation: This data set pairs the sensor with the type of device in which it resides. Here the data takes its first dive into depth, because mere device/sensor correlation triggers a number of safe assumptions and accelerates triangulation on the ‘big picture’ of the subject. For example, the presence of a windspeed indicator in an airplane is a fact that is, in and of itself, rich.
  • Sensor-subject correlation (objective): This data reveals that the sensor is tracking data for a person, but not the identity of that person.  Consider a temperature sensor on a city bus, used to track the temperature of commuters in order to assess the spread of a fever in a community.
  • Sensor-subject correlation (subjective): At the moment, this is the most sensitive data set, as it connects data to an identified individual. It is also considered highly valuable, as it can be exploited to ‘hack’ the consumer into making one or another decision.
  • Sensor-subject correlation – community-wide: This is really just a large sample of subjective sensor-subject data, giving the controller the ability to peer into the data of a cohort of people and predict community-wide events.
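
For readers who think in code, one way to picture the categories above is as labels attached to each record a system holds. The sketch below is purely illustrative and not a legal standard: the names `DataShade`, `DataRecord` and `IDENTIFIES_INDIVIDUAL` are hypothetical, and treating only the subjective and community-wide categories as tied to identified individuals is an assumption read from the descriptions above.

```python
from dataclasses import dataclass
from enum import Enum


class DataShade(Enum):
    """Hypothetical labels for the rough categories listed above."""
    SENSOR_TYPE = "type of sensor present"
    SENSOR_PRECISION = "sensor degree of precision"
    RAW_SENSOR_DATA = "raw sensor data"
    SENSOR_DEVICE = "sensor-device correlation"
    SENSOR_SUBJECT_OBJECTIVE = "sensor-subject correlation (objective)"
    SENSOR_SUBJECT_SUBJECTIVE = "sensor-subject correlation (subjective)"
    SENSOR_SUBJECT_COMMUNITY = "sensor-subject correlation (community-wide)"


# Assumption: only these categories connect data to identified individuals.
IDENTIFIES_INDIVIDUAL = frozenset({
    DataShade.SENSOR_SUBJECT_SUBJECTIVE,
    DataShade.SENSOR_SUBJECT_COMMUNITY,
})


@dataclass(frozen=True)
class DataRecord:
    """A piece of data tagged with the category it falls into."""
    description: str
    shade: DataShade

    def identifies_individual(self) -> bool:
        # True when the record's category links data to a known person.
        return self.shade in IDENTIFIES_INDIVIDUAL


# Illustrative records based on the examples in the list above.
bus_reading = DataRecord(
    "temperature of an unnamed commuter on a city bus",
    DataShade.SENSOR_SUBJECT_OBJECTIVE,
)
named_reading = DataRecord(
    "temperature of a named commuter",
    DataShade.SENSOR_SUBJECT_SUBJECTIVE,
)
```

Tagging each record this way would let a contract or policy refer to a category by name and attach different ownership or confidentiality terms to each one.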

Regardless of where your data set resides, it’s important to frame its legal status; otherwise you risk breaching the privacy rights of individuals or giving up commercial opportunities in the data you encounter. At drivebywire.law, Adam Atlas and his colleagues are busy parsing the data sets to help document who owns what, no matter the shade.
