PEPR 2020

My incident-response talk from PEPR 2020 is here: When Things Go Wrong. Slides (with a bonus section about how metrics are important but can really, really mess up your incident management) are here.

Template incident-handling doc is available here, as promised in my talk. Add whatever’s useful for your company — this is a bare template which we used at Humu. The Humu incident-handling guidelines (edited to remove references to specific employees and internal systems) are available here.

CCPA: Bugs and engineering commentary on the California Consumer Privacy Act Regs

CCPA, the California Consumer Privacy Act, is coming into effect soon — but the regulations are still in draft. CCPA is quite hard to read (even for those of us who spend a lot of time in the space!) and the regulations have some rather interesting implications and a number of bugs. This isn’t a compliance guide. This is commentary and suggested fixes for the CCPA coming from a privacy and security engineer who has seen far too many things go wrong.

Disclaimer: I’m a privacy engineer, not a lawyer, but I’ve tackled this stuff at large and small scale for years.

The proposed regulations are here; this article follows along with them.

999.301: Defines a bunch of things

  • Commentary: 999.301 — When opting in to “sale”, consumers 13 and older have to complete a two-step consent process: they opt in and then separately confirm that opt-in. This is a UX technique we want to avoid unless the decision has huge implications and comes up rarely, because while it does reduce erroneous opt-ins, it also causes problems with opting in: a) people surprisingly often complete the first step and then do something like close the browser window without realizing they’ve failed to do the second, and b) it demands a lot of attention, and there’s limited attention to go around; always make sure you’re saving some for real emergencies. I would expect this flow to come up rarely, so I’m not really concerned.
  • Fix: 999.301(e) — “Categories of third parties” means types of entities that do not collect personal information directly from consumers, including but not limited to … internet service providers, … social networks
    • ISPs do collect your information directly, when you sign up. Most of them also look at your browsing information, which they are collecting from your traffic, not some other entity — what sites you go to, etc. They have historically done everything from selling that information to inserting a “supercookie” which forces you to be identified and over which you have zero control to even inserting code (including ads) into (non-HTTPS) web pages you’re browsing! ISPs need to be way, way more regulated on the privacy front.
    • Social networks also collect information directly. That’s kinda why people use them. But they also collect data through use of 3rd party cookies (FB button on that page you’re looking at? FB knows you’re there.) and in some cases directly (pushing data in through APIs).
    • Fix: The regulations use the term “third party” as someone to whom data is disclosed by some other entity. Can fix by defining that way, since social networks do get data directly.
    • Remaining problem: … but ISPs aren’t having some other party disclose data to them, so all of the restrictions on “third party” aren’t going to regulate the privacy intrusion in that way. I suspect they would be best primarily classed as a business with which the consumer is communicating directly. (CCPA won’t fix everything skeevy that ISPs do/have done, mind you, but it’s better than nothing.)
  • Commentary: 999.301(i) “Notice at collection” means the notice given by a business to a consumer at or before the time a business collects personal information from the consumer
    • Note that “at or before” is important. You literally cannot get a web page (or anything else over the internet) without sending a request to the server and causing that server to collect information, including data which may be personal, like your IP address or identifiers embedded in a cookie. The fact that notice can come at collection seems directly aimed at allowing an opt-out link on a webpage which you can click after that page renders, as well as at provision for something like a browser setting which proactively passes an opt-out. That “at or before” is what is standing between you and huge opt-in cookie notices on pretty much every web page like in Europe.
  • Commentary: 999.301(s) “typical consumer” means a natural person residing in the United States.
    • Sounds funny for both a California law and if you read it without a lawyer hat on. I take my laughs where I can.
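To make the “at or before collection” point concrete, here is a minimal sketch (Python standard library; the handler and log names are illustrative, not anything from the regulations) showing that a web server already has the client’s IP address and cookies in hand before it can render any page or any notice:

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

collected = []  # stand-in for whatever server-side log the business keeps

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # By the time this runs, collection has already happened:
        # the client's IP address and headers (cookies included) are in hand.
        collected.append({
            "ip": self.client_address[0],
            "cookie": self.headers.get("Cookie", ""),
        })
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.end_headers()
        # Only now can the page, and any notice it carries, be rendered.
        self.wfile.write(b"<p>Notice: this site logs your IP address.</p>")

    def log_message(self, *args):
        pass  # silence the default stderr request log
```

The ordering is forced by how HTTP works: the request (and everything in it) arrives first, the notice second. Hence the importance of allowing notice “at” collection rather than demanding it strictly before.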

999.305: Businesses need to provide notice of collection of personal information

  • Commentary: yes, this law means businesses. CCPA doesn’t apply to non-profits (and that doesn’t just mean charities; it includes things like political action committees, which can continue to do incredibly aggressive things with personal data) or to the government (the amount of information that city governments have about, say, your movements is growing, and I’d love for that to have restrictions placed on it). I am not happy about this particular choice, which is definitely a pragmatic choice, not a principled one. CCPA also doesn’t apply to businesses which don’t make enough money and don’t “sell” enough user data.
  • Fix: 999.305(a1b) Use a format that draws the consumer’s attention to the notice and makes the notice readable, including on smaller screens, if applicable.
    • Note: I can find nothing in the actual CCPA which requires “drawing the consumer’s attention.” My worry here is that websites are going to go over the top with this and do something like implement a European cookie-banner sort of situation. That wouldn’t be appropriate from a UX point of view, as this is a notice, not a consent, and it’s going to be on literally every website. People will go blind to these approximately immediately and start clicking “whatever, go away”, which has two bad effects:
      • 1. Someone skeevy can add an extra consent into the banner that people effectively won’t see.
      • 2. This makes much less effective one of the techniques we have to notify users of new, important information (like a potential security breach, for example, or a change in service, or the like).
    • Fix: Change to “make clearly visible”. A stronger statement would be to require that this is visible on the initial load of the page, even on smaller screens.
  • Commentary: 999.305. I’m not going to mention this every time it crops up, but I’m really happy to see that this notice is required: a) in all of the normal languages in which a business operates and b) in accessible formats. Everyone is entitled to privacy protections.
  • Minor fix: 999.305(a2d) At a minimum, provide information on how a consumer with a disability may access the notice in an alternative format. 
    • … that notice should itself be accessible unless there’s a very good reason why that’s not possible; otherwise, how is someone supposed to know where to look for that alternative format?
  • Commentary: 999.305(a3) A business shall not use a consumer’s personal information for any purpose other than those disclosed in the notice at collection. If the business intends to use a consumer’s personal information for a purpose that was not previously disclosed to the consumer in the notice at collection, the business shall directly notify the consumer of this new use and obtain explicit consent from the consumer to use it for this new purpose.
    • Note that the business purposes are basically “a few normal things that everyone had better be doing,” “a few ads things,” and “MAKE YOUR SERVICE GO”. I’ll talk about them in more depth when we get there, but don’t expect to see this kind of notice cropping up much at all.

999.306: Businesses need to provide a method (including a link) to opt out of current or future “sale” of personal information.

  • Commentary: Sale doesn’t just mean sale in the normal meaning of that word where one trades something for money. Note also that “sale” is not defined in the section with definitions. In the CCPA itself it is defined as:
    •  “Sell,” “selling,” “sale,” or “sold,” means selling, renting, releasing, disclosing, disseminating, making available, transferring, or otherwise communicating orally, in writing, or by electronic or other means, a consumer’s personal information by the business to another business or a third party for monetary or other valuable consideration.
    • There are several exemptions, such as sharing personal information with a service provider for the purposes of providing the service, but this is extremely broad.
  • Commentary:
    • Note that a link with the specific words “Do Not Sell My Personal Information” or “Do Not Sell My Info” has to go on the webpage/mobile app landing page. No word on whether it’s allowed not to capitalize Literally Every Word.
    • Expect a lot of mostly-hidden signs with these links on them, as companies which substantially interact with computers offline have to post notice (signs or flyers). There are already a bunch of these signs up saying that they’re taking video footage of you (and using it for everything from fighting shoplifting to facial recognition) and I bet you’ve literally never noticed those signs; this probably won’t be more effective. Like ISPs, brick and mortar stores could really use some more effective privacy rules.
  • 999.306(c4) — this notice of the right to opt out must include a list of any proof required. This will really push companies towards very, very structured ways of verifying identity. Sometimes that’s good (because flexible processes can be used to move the goalposts on people). Sometimes that’s bad (because people with less traditional paperwork will not be able to give the company assurance that they are, in fact, who they claim to be, and so will lose the ability to exercise their rights).
    • Might want to consider: List the data used for the standard proof path, but allow other paths to proof optionally. (Note that there is a real tradeoff in terms of security and flexibility here; you need someone who understands the threat models in details to come up with a safe new path. There are not enough of them and they’re too expensive to staff a call center. Secure proof can be flexible or it can scale, not both.)
  • Commentary: In addition to those links you can also use a special, not-yet-designed button/logo. I’m not sure why one would, as it would clutter up the webpage and take precious mobile pixels.

999.307: If the business gives a financial or service incentive in exchange for you not deleting your data or allowing “sale”, there’s a whole system for the notice.

999.308: Privacy policy. You have to have one, it has to have a bunch of information about personal information collected, disclosed, or “sold” and different kinds of requests you can make to the business about your data.

  • Commentary: The actual law doesn’t require a privacy policy. It goes out of its way not to require one. People should have one anyway, though, and every CCPA-subject website has a ton of privacy information to list, so I’m not fussed that this regulation requires one in 999.308(a3) even though 1798.130(a5) specifically doesn’t require it.
  • Commentary: 999.308(a2a) requires avoiding legal or technical jargon. As someone whose team has had to grapple with trying to figure out how to write a privacy policy which was: a) comprehensible and b) didn’t freak people out about things which weren’t even true (I’m serious, this is a very real problem), let me just flag that this is really hard. There’s a serious tension between comprehension by giving enough depth and comprehension by not causing people to glaze right over.
  • Fix: 999.308(a2e) requires providing an additional format to allow easy printing. If it prints out well already, why require an additional format? Having more than one copy of anything makes it far easier to end up with accidental inconsistencies.
  • Fix: 999.308(b1e2) states that the privacy policy must: List the categories of personal information, if any, that it disclosed or sold to third parties for a business or commercial purpose in the preceding 12 months.
    • Because “third party” was defined to include service providers in these regulations (as in “my website is hosted on AWS”), unlike in the CCPA itself, these notices are going to be unnecessarily long and harder for consumers to read. I would suggest one of the following:
      • Change the definition in these regulations to include the exemptions to “third party” included in the CCPA.
      • If the third party is a service provider (as defined in 999.314), exempt it from this rule.

999.312: Methods for submitting requests to know and requests to delete

  • Fix: 999.312(a) requires that all businesses operate a toll-free telephone number for such requests. The law in 1798.130(1a) specifically provides an exemption from this requirement for online-only businesses which is not reflected in this regulation. Please add this back in.
    • This is particularly important for online businesses which operate in a large number of languages. Translation for websites happens asynchronously: the text of the website is sent to translators who send back translations. Having a web site available in a bunch of languages is important to increase access to that site’s information and services — needing to hire a staff of people who speak all of those languages available to handle high-stakes phone calls is a totally different issue, especially for small businesses.
    • Commentary: … speaking as the person who is going to need to personally handle these for my company, dear goodness please don’t make me try to figure out how to find translation services to handle technical, high-stakes phone calls in languages I don’t speak.
  • Commentary: 999.312(a) … Other acceptable methods for submitting these requests include, but are not limited to, a designated email address … 
    • As a security person, let me point out that email is not the method I would choose for submitting proof-of-identity documents like a scan of one’s passport. Unless both you and the business use email providers that encrypt mail in transit (as providers like Gmail do), your email can cross the internet unencrypted.
  • Fix: 999.312(f) If a consumer submits a request in a manner that is not one of the designated methods of submission, or is deficient in some manner unrelated to the verification process, the business shall … [do one of two things which requires recognizing that this is a request]
    • Would suggest changing to “recognizable, received request”. It’s good to ask businesses to help people submit valid requests. However, people will sometimes do things like email the wrong email account or send completely incomprehensible writing and then get extremely grumpy that you didn’t comply with their request, even if you didn’t get it.

999.313: Responding to requests to know and requests to delete

  • Commentary: 999.313(d2) talks about the right to delete. But it’s not really a right to delete.
    • 999.313(d2b) allows the business to de-identify the information. This requires that they strip off the identifiers so it can’t “reasonably” be identified, plus policies where the business promises not to try to re-identify it or release the information. There is, sadly, a lot of evidence that this isn’t the same thing as anonymization and, almost always, that data can be re-identified. I would be much more comfortable if the technical bar were higher here.
    • 999.313(d6a) allows the business to refuse the deletion request and name a statutory exemption.
      • There are a lot of exemptions in the law, including “internal uses” and “public or peer-reviewed research”.
      • I’ll also note that two of the reasonable bases for refusal are because the business is using the data for debugging or security purposes. Expect to see a lot of “sure, we’ll delete your data… eventually… unless it’s malicious” responses for very reasonable reasons. Companies are really careful not to let users know exactly what they do to detect hackers and malicious people trying to abuse their services because that makes those attackers more effective. For example, if they know you only take the last 6 months into account for detecting abuse, they will do things like hold onto accounts for 7 months and then use them for abuse.
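A toy sketch of why de-identification under 999.313(d2b) is weaker than it sounds: stripping direct identifiers still leaves quasi-identifiers (ZIP code, birth year, gender) that can link a record back to a named public dataset. All data and field names here are invented for illustration:

```python
# Toy illustration; every record here is made up.
deidentified_record = {   # direct identifiers (name, email) already stripped
    "zip": "94043", "birth_year": 1985, "gender": "F", "diagnosis": "flu",
}
public_roster = [         # e.g. a voter roll, which does include names
    {"name": "Alice", "zip": "94043", "birth_year": 1985, "gender": "F"},
    {"name": "Bob",   "zip": "10001", "birth_year": 1990, "gender": "M"},
]

def reidentify(record, roster):
    """Return the names of roster entries matching the record's quasi-identifiers."""
    quasi_identifiers = ("zip", "birth_year", "gender")
    return [person["name"] for person in roster
            if all(person[k] == record[k] for k in quasi_identifiers)]
```

Here reidentify(deidentified_record, public_roster) returns ["Alice"]: the “de-identified” flu diagnosis has a name attached again. Real-world linkage attacks work the same way, just at scale.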

999.315: Requests to opt-out

  • Fix: 999.315(a) requires that every business have methods to submit requests to opt out with clear and conspicuous links for “Do Not Sell My Info”, including an interactive web form.
    • Suggest that a business which does not sell (or intend to sell) information should not need this link. Otherwise businesses are going to be required to collect personal information about consumers that they have literally no way to use and will need to protect, and it’s going to be harder for consumers to figure out who is actually selling information.
    • Requiring a web form means that every business covered by this law is required to have a web page. Is this intended?
  • Big fix: 999.315(c) Other acceptable methods for submitting these requests include, but are not limited to, … and user-enabled privacy controls, such as a browser plugin or privacy setting or other mechanism, that communicate or signal the consumer’s choice to opt-out of the sale of their personal information.
    • Servers can only honor requests made using an agreed-upon standard. “Other mechanism” covers a lot of ground. If someone changes their browser’s user agent string to “Do not sell my personal information” or “Delete my data, dudes”, that would be intended to communicate an opt-out to the server, but if it’s not in a form that the server knows how to look for, it can’t be handled.
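As a sketch of what an agreed-upon signal could look like: the draft Global Privacy Control proposal uses a “Sec-GPC: 1” request header. Treat the header name and semantics here as an assumption about one possible standard, not something these regulations specify:

```python
def wants_opt_out(headers: dict) -> bool:
    """Return True iff the request carries a recognized opt-out signal.

    Assumes a GPC-style "Sec-GPC: 1" header as the agreed-upon standard;
    the header name and value are illustrative.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    # Only a signal in a form the server knows to look for can be honored.
    return normalized.get("sec-gpc", "").strip() == "1"
```

A free-text signal like a user agent string reading “Delete my data, dudes” returns False here: well-meaning or not, the server has no standard under which to interpret it.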

999.316: Requests to opt-in after opting out of the sale of personal information

999.317: Training and record-keeping

  • Suggested fix: 999.317(e) Information maintained for record-keeping purposes shall not be used for any other purpose.
    • Running one of these programs at any scale requires what are effectively the same exact records: number and type of requests, turnaround time, outcomes, etc. It would be good to allow use of these records for running and improvement of the program as well; otherwise there will end up being two copies of the exact same records for no good reason.

999.318: Requests to access or delete household information

  • Commentary: it is going to be extremely difficult to ever verify that these requests are coming from the appropriate people. Expect in particular a lot of malicious requests from abusive partners (who are often looking to control their partner’s behaviour) and untrustworthy roommates (often aimed at identity theft).
  • Clarify: 999.318(a) Where a consumer does not have a password-protected account with a business, a business may respond to a request to know or request to delete as it pertains to household personal information by providing aggregate household information, subject to verification requirements set forth in Article 4. 
    • Does “aggregate” here mean “aggregated” like some stats about the data or “aggregate” meaning all the data put in one pile?
  • Clarify: 999.318(b) If all consumers of the household jointly request access to specific pieces of information for the household or the deletion of household personal information, and the business can individually verify all the members of the household subject to verification requirements …
    • How do you define the members of the household? Does that include infants? Children? Adults? People who live there part time? People who lived there previously and have moved?
    • Given that definition, how would a business know the list of such persons? I am aware of no list to check against.

999.323: General rules regarding verification

  • Commentary: 999.323(b3c) Requires that you consider The likelihood that fraudulent or malicious actors would seek the personal information. The higher the likelihood, the more stringent the verification process shall be;
    • The regulations flat-out tell you that you should be considering the threats that cause malicious parties to act. What are those? Well, they vary by person, which means you should consider actual human needs. Don’t just think of financial risk here. I personally think through it like this.

999.324: Verification for password-protected accounts

  • Commentary: 999.324(a) Requires re-authentication before disclosing or deleting data when the consumer has a password-protected account. Good! These features are important for privacy, but they’re also huge targets for abuse. But don’t lose your password, because you may lose the ability to delete your account.

999.325: Verification for non-accountholders

  • Commentary: This is hard. Really hard. Especially if you try not to hold on to more information than necessary, so you don’t have a whole bunch of personal data hanging around to verify against. We’ve seen this go widely insecure when companies implemented the analogous Subject Access Request under the GDPR; researchers have, for example, obtained other people’s personal data by filing requests backed by minimal forged evidence.
  • Fix: 999.325(c) A business’s compliance with a request to know specific pieces of personal information requires that the business verify the identity of the consumer making the request to a reasonably high degree of certainty, which is a higher bar for verification. A reasonably high degree of certainty may include matching at least three pieces of personal information provided by the consumer with personal information maintained by the business that it has determined to be reliable for the purpose of verifying the consumer together with a signed declaration under penalty of perjury that the requestor is the consumer whose personal information is the subject of the request. Businesses shall maintain all signed declarations as part of their record-keeping obligations. 
    • “Reasonably high” is too low of a bar — this verification procedure is going to be easy to exploit for many people: much of their personal information is shared with family, friends, or online for reasonable reasons (e.g. someone wishing them happy birthday). It will be especially easy to exploit for abusive partners.
    • The balancing control of requiring a signed declaration under penalty of perjury is not going to stop malicious people on the internet from exploiting this process. If I’m going after your data, then I’m not going to tell the company who I am, clearly. If something bad happens later and it’s traced back to this, it’s one more charge to throw on the pile. Security stops attacks much better than a threat of possible future legal consequences.

999.326: Authorized agent

  • Commentary: I’m so pleased that the business can require proof from the agent. Reading the CCPA text, there was no provision for this, and someone was going to use this as a loophole to mess with people.

999.330: Minors under 13 years of age

999:331: Minors 13 to 16 years of age

999.332: Notices to minors under 16 years of age

999.336: Discriminatory practices

  • Fix: 999.336(a) A financial incentive or a price or service difference is discriminatory, and therefore prohibited by Civil Code section 1798.125, if the business treats a consumer differently because the consumer exercised a right conferred by the CCPA or these regulations.
  • This definition literally ends up meaning that things like “giving you cruddier recommendations because you deleted the history of videos you watched” or “not serving you the photos you deleted” are a “service difference”, but one that is impossible for the business to avoid.
  • Now, if I were faced with this catch-22, I would personally define the service as “given whatever’s in the system, we <do something>” and cross my fingers. But really, I would prefer that we fix the definition to take into account that there are services which are literally impossible to provide on a technical level when rights like data deletion are exercised.

999.337: Calculating the value of consumer data

  • Commentary: Businesses are allowed to charge you money to use their service if you tell them not to “sell” your data or ask them to delete it but what they charge you has to be “reasonably related” to the value of your data.
    • It can either be the value (revenue from sale and/or value to the business) of your particular data (if, say, you make more money or are in a treasured demographic and so advertisers will pay more money to advertise to you) or the value of the “typical consumer’s” data.
    • Here’s the thing: those numbers are generally extremely small on a per-person basis. Tiny. But do you know what is less small? The cost of doing things like collecting credit card numbers to cover that cost (especially if this means needing to comply with PCI for the first time), handling chargebacks and disputes, administering that program, etc. These costs are also allowed to be covered in the cost charged to consumers who opt out.
    • So what I would expect is that, for a lot of web sites, the price that they’ll charge will be in large part overhead necessary in order to even obtain the money for the value of the data.
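To put rough numbers on this (every figure below is an assumption for illustration, not sourced data): suppose a user’s data is worth about $2 a year and the business charges that value plus its costs. A quick sketch:

```python
# Back-of-the-envelope sketch; every number below is an assumption.
annual_data_value = 2.00                      # assumed yearly value of one user's data, USD
card_fee = 0.30 + 0.029 * annual_data_value   # a typical card-processing fee structure
support_and_admin = 1.50                      # assumed per-user program overhead, USD

total_charge = annual_data_value + card_fee + support_and_admin
overhead_share = (card_fee + support_and_admin) / total_charge
# With these assumptions, roughly half of what the consumer pays
# is overhead rather than the value of the data itself.
```

Under these (assumed) numbers the consumer pays about $3.86, of which nearly half never reflects the data’s value at all.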

999.341: Severability

Pants problems

I don’t know if you’ve noticed this, but there are a large number of men on the internet who have problems with their pants. The way I can tell is that first they take a picture of the “affected area” and then share that picture with someone who doesn’t want it. The way I figure it, the men are just trying to say “help, I don’t understand pants!” and asking for technical assistance to correctly apply pants to their crotches.

First off, men: don’t do this.

Secondly: this is almost entirely a problem generated by cisgender men. I will not speculate as to why in this article.

Thirdly: no, I’m not really that clueless, but if you’re going to build products where one user can send another user pictures and you want that product to be built with respect, you are going to need to consider this problem. 

Pictures come through a ton of mechanisms like photo albums, pictures in social media posts or messages, and pictures sent through file transfer. Even more than that, pictures are sent in ways you might not originally consider, like profile pictures. Every time I send a text message through a mechanism with a profile picture, that picture goes right along with the message. Even text alone can be used to send rather surprisingly explicit imagery.

Every single one of these mechanisms (and more) has been used to send unsolicited pictures of pants problems. Pants problems are rampant. I asked another woman how many she got and she replied “My unsolicited pants problem is fairly minimal nowadays (like 2-3 a week).” First off, that’s still a lot of pants problems. Secondly, it didn’t go down because of improvements in the platform. It went down because she took a less high-profile job and pretty much entirely stopped engaging on social media.

So if you are designing a product that will work well, you need to take this mode of communication into account.

Note that this is a clear example to be wary of metrics that think all engagement is good. This will bite you in the longer term as people will retreat from the platform. One element of bad engagement can be far worse than one element of good engagement. It’s difficult to measure, but at the very least keep this in mind as a limitation of engagement and connection metrics.

One problem is that people disagree deeply on what constitutes good engagement. Some communities find certain types of religious statements deeply offensive, while others find the lack of them offensive. Mixing those people willy-nilly will almost certainly have bad effects. Even strangers who can engage on some topics productively (say, flowers) may not be able to engage productively on other topics (say, vaccination). Increasing people’s ability to engage with others is an open and urgent research problem. What we do know is that clashing over a polarizing issue can cause increased attitude polarization, deepening the problem. 

At root, consider engagement “good” only when everyone involved would agree that it was good.

The root of the pants problem is that the offensive communication is unsolicited. Unsolicited communication is risky. This is particularly tricky because you will want to welcome actual members to the system, even though some percentage of “sock puppets” (people using new accounts for nefarious purposes) will be camouflaged among them. Separate the goal of welcoming people from preventing harassment in your design. Consider a welcome flow rather than immediately directing communication from a new person at a stranger. Remember that new members will also experience pants problems.

If possible, strip images from risky communication. A picture is worth a thousand words and all of them may be offensive. If not completely possible, hide higher-risk images. For example, if Bob sends Alice an invitation to chat out of the blue, consider hiding his profile picture. It may be a sneaky request for assistance with his pants. If Bob uses a service to send pictures to those around him, consider hiding the pictures. People often wish to send pants problem pictures to strangers, but many men will not balk at sending them in person.

If you have a number of examples of the genre (and if you don’t, I’m sure that women online would happily donate them), consider training a machine learning model to detect possible offenders. Then plan for the annoying fact that your model will have both false positives and false negatives by not making a model match equal an automatic takedown. The appropriate action will depend on your application, but might include putting the possibly-offensive picture behind a warning note (with the picture only visible on click-through) or sending the picture to a human who can determine whether it is inappropriate for the context.
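A minimal sketch of that triage idea (thresholds, action names, and the score scale are all illustrative assumptions; a real system would tune these against measured false-positive and false-negative rates):

```python
def triage(score: float) -> str:
    """Map a model's 'likely explicit' score to a moderation action.

    A model match never equals an automatic takedown: high scores are
    hidden pending human review, middling scores go behind a warning.
    """
    if score >= 0.95:
        return "hide_pending_review"      # high confidence: hide and queue for a human
    if score >= 0.60:
        return "blur_with_click_through"  # uncertain: warn the viewer, let them decide
    return "allow"                        # low risk: deliver normally
```

The exact actions will depend on your product; the point is that the model’s output routes pictures into review paths rather than deciding outcomes on its own.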

Just like with other content, let your users tell you what is offensive. Treating these reports appropriately is by itself a deep topic, but start by considering that users have more context than the person reviewing the report. Also consider that people can and will try to abuse your abuse-reporting mechanism, so don’t allow it to work blindly; audit and evaluate success with a human eye.

People looking for help with their pants are exceedingly inventive, so this won’t stop them entirely, but it should help you clear up your product, the space over which you have control and for which you are directly responsible.