Privacy design patterns for smart homes

Avoiding abuse by collecting less data

Candle keeps all data inside the home. While this greatly reduces the risk of corporate surveillance, it doesn't solve all privacy issues. Users of smart home systems are often tempted to use the available data to spy on each other. This is called coveillance.

To make it less attractive to spy on each other, Candle implements a large number of novel privacy protections. Here we try to organise all those innovations into more generic categories, or "design patterns".




‌1. LIMITING DATA COLLECTION

You cannot be seduced by data that isn't there

Disable data collection

It should be easy to stop accepting data from a device for a while. Just like the mute button on your phone or stereo, the implication is that this is temporary. For example, you could allow data collection from motion sensors only while security mode is active.

‌Candle’s network presence detection system implements this, as does the Zigbee addon.

There is room for more nuance here. For example, in the book Design My Privacy I mentioned that all smart devices should have a physical slider switch which can be set to three levels: 
‌- Data may not leave the device 
‌- Data may be communicated on the local network 
‌- Data may be communicated with the local network and the cloud 

As devices with such “disable data transmission” switches do not exist, we decided to implement “disable data collection” in Candle instead. The devices still send data, but Candle limits how much of it it accepts.
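As a rough illustration of the idea, the sketch below drops incoming readings unless collection is allowed. The `CollectionGate` class and its field names are hypothetical and not taken from Candle's code; it only shows the principle that the device keeps sending while the controller decides what to accept.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionGate:
    """Decides whether an incoming reading is accepted or silently dropped."""

    enabled: bool = True                 # hypothetical per-device "collect data" toggle
    only_in_security_mode: bool = False  # e.g. for motion sensors
    accepted: list = field(default_factory=list)

    def offer(self, value, security_mode_active=False):
        """Store the value if collection is allowed; return True if stored."""
        if not self.enabled:
            return False
        if self.only_in_security_mode and not security_mode_active:
            return False
        self.accepted.append(value)
        return True


motion = CollectionGate(only_in_security_mode=True)
motion.offer("motion detected")                             # dropped
motion.offer("motion detected", security_mode_active=True)  # stored
```

Dropping the reading at the point of entry means there is nothing to delete later: the data simply never exists in the logs.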

Data mute

This is a variation on the data collection toggle, where users can press a button to disable data collection for a certain amount of time. Candle’s implementation has a button where each time it’s clicked the device in question will be ignored for an additional hour. So, pressing it once will get the device to be ignored for one hour. Pressing it again increases that duration to two hours.

In our implementation the user is only shown how much time they added to the mute duration. If the button is pressed again more than a minute later, the user won’t be shown that the counter is now up to three hours. Instead they will only see that they added an hour of mute time. This protects the social fabric: otherwise, pressing the button could confirm that someone else had recently pressed the mute button, and reveal how much longer the mute would remain in effect.
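The behaviour described above could be sketched roughly as follows. This is not Candle's actual code: the `DataMute` class, the injectable `clock` parameter and the message text are assumptions made for illustration, and for simplicity this sketch never reports the running total.

```python
import time

MUTE_STEP_SECONDS = 3600  # one hour per press, as described above


class DataMute:
    def __init__(self, clock=time.time):
        self._clock = clock        # injectable so the sketch can be tested
        self._muted_until = 0.0    # kept internal, never shown to the user

    def press(self):
        """Extend the mute by one step; report only what this press added."""
        now = self._clock()
        start = max(self._muted_until, now)
        self._muted_until = start + MUTE_STEP_SECONDS
        return "Added 1 hour of mute time"  # deliberately not the total

    def is_muted(self):
        return self._clock() < self._muted_until
```

Because every press returns the same message, the response carries no information about earlier presses by other household members.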

Showing only the added time also ensures some level of plausible deniability and creates a level playing field: anyone can press that button, so anyone could have pressed it. Maybe a child doesn’t want to leave definitive proof that they left the home early or arrived late, preferring to leave room for debate later rather than having the debate closed off by proof in the logs. It’s always a weighing of options, and about giving people choices in “how they want to play it”. This makes “gaming the system” almost akin to a game. It also lets people give a subtle signal that they don’t want to talk about something, without having to literally say it. For example, there could be an agreement (even technically enforced) that everyone can use about 10 mute hours per month without it leading to discussion.

Data bracketing

Here the idea is to only record data if the values are “abnormal”. For example, the system could skip recording a humidity sensor’s data while the humidity level is between 40 and 60 percent, and start recording only when the level is too high, too low, or (as in this case) either.
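A minimal sketch of such a bracket, using the 40 to 60 percent range from the humidity example. The function name `bracket` and the sample readings are made up for illustration.

```python
def bracket(value, low=40.0, high=60.0):
    """Return the value only when it falls outside [low, high]; else None."""
    if low <= value <= high:
        return None  # normal: nothing is recorded
    return value


readings = [45.2, 58.9, 63.1, 38.4, 50.0]
recorded = [v for v in readings if bracket(v) is not None]
# recorded == [63.1, 38.4]
```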

An example use case is a home that is rented out via Airbnb. The renters could have data privacy as long as they behave. If they start smoking or playing loud music, then this could be registered.

With audio it becomes clear this could also have a temporal component:
- Short bursts of loud sound are ok (a wake-up alarm), but sustained loud sound (playing loud music deep into the night) is not.
- The upper and lower boundaries could shift over time. For example, loud music could be ok during the day, but cross the limit after midnight.
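Both temporal aspects can be combined in one check. The sketch below is only an illustration: the decibel limits, the night hours and the ten-minute minimum duration are invented values, not settings from Candle.

```python
def limit_for_hour(hour):
    """Hypothetical limits in dB: stricter between midnight and 07:00."""
    return 60.0 if hour < 7 else 80.0


def sustained_violations(samples, min_duration=600):
    """samples: (seconds_since_midnight, decibels) pairs in time order.

    Returns the timestamps at which the sound has been above the limit
    continuously for at least min_duration seconds.
    """
    flagged = []
    loud_since = None
    for t, db in samples:
        hour = (t // 3600) % 24
        if db > limit_for_hour(hour):
            if loud_since is None:
                loud_since = t  # start of a loud stretch
            if t - loud_since >= min_duration:
                flagged.append(t)
        else:
            loud_since = None  # quiet again: reset the stretch
    return flagged
```

A one-minute alarm at 08:00 never reaches the minimum duration, while music at 70 dB that continues past 01:00 is flagged once it has lasted ten minutes. The same 70 dB at 14:00 stays under the daytime limit.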

In the case of Candle this would necessitate yet another way of displaying a sensor’s data. A sensor could then show one of three things:
- The actual precise value.
- “unknown”, when there is no data (for example because it’s muted).
- “OK”, to indicate that the value is within acceptable boundaries, without revealing what it is exactly. Visually this could also be done by showing the bounds range, e.g. “40-60”, which might be a clearer way to communicate.
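The three display states could be chosen with a small function like the one below. It is a sketch under assumptions: the `display` function and its `reveal_range` flag are hypothetical names, and the bounds are again the 40 to 60 humidity example.

```python
def display(value, low=40.0, high=60.0, muted=False, reveal_range=False):
    """Return the text to show for a bracketed sensor."""
    if muted or value is None:
        return "unknown"
    if low <= value <= high:
        # Within bounds: hide the precise value
        return f"{low:g}-{high:g}" if reveal_range else "OK"
    return f"{value:g}"
```

For example, `display(47.5)` gives "OK", `display(47.5, reveal_range=True)` gives "40-60", and `display(63.1)` gives "63.1".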




‌2. LOWERING GRANULARITY

Collecting data less frequently or with less precision

Data blur over time

Candle makes it possible to change how frequently data is collected. For example, many sensors send data every few seconds, but this can be overkill, or even undesirable:
‌- More data means more storage space is needed. 
‌- More data might make it harder to make sense of a graph, creating lots of squiggly lines. 
‌- Often the sensors are highly inaccurate, so you’re recording noise more than anything.

That’s where data blur comes into play. It allows users to select at which interval data should be accepted. If a user only wants data to be recorded once every ten minutes, and the sensor sends 100 values during that time, then an average value is calculated at the end of that period.
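The averaging step can be sketched as below. This is a batch version written for clarity, not the addon's implementation, which would more likely accumulate values as they arrive. The function name and the sample data are illustrative.

```python
from statistics import mean


def blur_over_time(samples, interval=600):
    """samples: (timestamp_seconds, value) pairs.

    Returns one (window_start, average) pair per interval that has data.
    """
    windows = {}
    for t, value in samples:
        window_start = int(t // interval) * interval
        windows.setdefault(window_start, []).append(value)
    return [(start, mean(values)) for start, values in sorted(windows.items())]


samples = [(0, 20.0), (300, 22.0), (599, 24.0), (600, 30.0)]
blurred = blur_over_time(samples)
# blurred == [(0, 22.0), (600, 30.0)]
```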

This is currently implemented in the Zigbee2MQTT addon.

Data blur of values

Another way of “blurring” data could be to limit the precision with which values are recorded. For example, a humidity sensor could give an (unrealistically precise) value of 47.52, but the system could only record what “bucket” the value fell in. Any value between 40 and 44 could be recorded as 42. Values between 44 and 48 could be recorded as 46, and so forth.
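With buckets that are 4 units wide, as in the example, the recorded value is the midpoint of the bucket. In this sketch a value exactly on a boundary (such as 44) is assigned to the upper bucket, which is a choice made here and not something the text prescribes.

```python
import math


def blur_value(value, width=4.0):
    """Replace a value with the midpoint of the bucket it falls in."""
    bucket_start = math.floor(value / width) * width
    return bucket_start + width / 2


blur_value(47.52)  # 46.0
blur_value(41.0)   # 42.0
```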

This can be quite useful outside of a privacy context too, as many sensors claim a level of precision that is not realistic. Cheap air quality sensors are notorious for being poorly calibrated to begin with, and giving a false sense of precision in general. We found that a CO2 sensor that outputs its levels as “bad, poor, ok, good, great” is more realistic and more useful too. When the display of such a device shows both the number and the “opinion”, I personally find myself looking at the opinion much more than the actual numerical value.

Fake data

Where data blur means the data is collected less frequently, but is still representative, creating fake data goes a step further. It generates plausible-looking new data based on old data. For example, when fake data is enabled on the CO2 sensor, it will continue to generate new data around the same level it was when the faking was enabled. That way it’s possible to invite extra people into a room without being “snitched on” by the sensors.
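One simple way to produce such data is to add a small random offset to the level at which the sensor was "frozen". The sketch below assumes a uniform jitter of 15 ppm, an arbitrary figure chosen for illustration. How the Candle prototypes actually generate their values is not described here.

```python
import random


def fake_readings(frozen_level, count, jitter=15.0, seed=None):
    """Generate plausible values scattered around the level at freeze time."""
    rng = random.Random(seed)  # seed only matters for reproducible tests
    return [round(frozen_level + rng.uniform(-jitter, jitter)) for _ in range(count)]


co2_ppm = fake_readings(650, count=20)
```

All generated values stay within the jitter of the frozen level, so a graph keeps looking like an ordinary, slightly noisy reading.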

This idea has been implemented in some Candle 1.0 prototypes, such as the Carbon Sensor. The Privacy Manager addon also allows users to generate fake data points. But ideally it would be a feature for all devices, which could be implemented in Candle 3.0.




3. COMMUNICATION

How can the smart home communicate how much data it's collecting?

Data collection status communication

Candle’s ability to limit when data is collected creates a new design challenge: how can the current data transmission/collection state of devices be communicated to users?

If the device is itself a physical device, then the data collection status could be shown on the device itself. A classic example is the red LED on video cameras that indicates that they are recording.

High marks if this is “glanceable”, meaning the status can be picked up in the periphery of someone’s attention and alert them if it changes, but otherwise remains low-key. The Candle Camera, for example, allows users to set the color of the LED, but it will always turn red whenever someone is at risk of being recorded. An even cooler example is the set of Candle prototypes by Jesse Howard, which had large red glanceable switches. These could be easily spotted from across the room, and made a sound when their position was changed remotely.

It could also be useful to communicate the status in the space around the device, for example when users are about to enter the camera or sensor range, but could still turn back. We see this in professional recording studios that indicate on the outside of the room whether recording is in progress (although in this example it’s not done to protect privacy). In smart homes it should be relatively easy to create this type of "ambient notice".

Access transparency

It could be valuable to know how often data is being looked at. For example, Candle could record at what time the data logs page was last accessed. This might be useful in case of family disputes, as this makes it possible to ask others in your home why they were looking at data. It might create a small barrier to looking at data logs, again lowering the temptation of all that juicy data.
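A minimal sketch of such a record is shown below. The `AccessLog` class is hypothetical and does not exist in Candle. Note that it stores only when the page was opened, not by whom, which matches the suggestion above.

```python
from datetime import datetime, timezone


class AccessLog:
    """Remembers when the data logs page was opened."""

    def __init__(self):
        self._accesses = []

    def record(self, when=None):
        """Call this whenever the data logs page is served."""
        self._accesses.append(when or datetime.now(timezone.utc))

    def last_accessed(self):
        return self._accesses[-1] if self._accesses else None

    def count(self):
        return len(self._accesses)
```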

We foresee room for a “privacy dashboard” type of device, and in the future we might implement an addon that shows an overview of the privacy state of devices, as well as when the data logs were last accessed.

Data veils

Sometimes smart devices can make a “faux pas” when they reveal more about our messy daily lives than we’d hope. For example, a dust sensor with a display could reveal that the home is very dusty to a visiting mother-in-law.

When we asked jewellery designer Dinie Besems to design smart devices, she created a very simple solution to this. Inspired by Marilyn Monroe she created “skirts” that could be draped over the display, like a veil. It’s a very simple way of attaining a separation between a “true” data layer and a “presentation layer”. These veils offer a very simple form of control over this presentation of data.

These veils could be used more precisely too. For example, we designed some sensors to show data in increasing levels of privacy sensitivity from left to right, so that the veil could be draped over the sensitive part only, while still revealing the less sensitive data.