Calling all type nerds my new job deals with a lot of encryp Future of Coding #present-company

Garth Goldwater

08/15/2022, 3:24 PM

Calling all type nerds: my new job deals with a lot of encrypted data that shouldn't be transmitted in plaintext. We’ve just added sorbet to our rails project (gradual typechecker). I feel like I've read somewhere that it's possible to use types to enforce that kind of restriction. Does anyone have any experience using types for that (or links to talks/papers/blog posts)?

Mariano Guerra

08/15/2022, 3:40 PM

Language-based information-flow security https://ieeexplore.ieee.org/abstract/document/1159651

Mariano Guerra

08/15/2022, 3:42 PM

you may find more searching for "tainted data", "taint propagation" and similar

🙏 1

Garth Goldwater

08/15/2022, 3:48 PM

for anyone looking, here's the preprint: https://www.cs.cornell.edu/andru/papers/jsac/sm-jsac03.pdf

Garth Goldwater

08/15/2022, 3:54 PM

@Cole

Garth Goldwater

08/15/2022, 3:56 PM

i'm almost imagining something like a NoTransmit<a> getting returned from the decryption function and delegating all the methods to the member… but then using some kind of exclusion logic on methods that transmit data to the client. or like have the transmission methods only accept Encrypted<a> for certain values of a. is that too wild?
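A minimal plain-Ruby sketch of that shape, under stated assumptions: `NoTransmit`, `Encrypted`, `TransmitError`, and `send_to_client` are hypothetical names, not part of Rails or of the project's encryption code. The wrapper delegates ordinary methods to its member and re-wraps the results, refuses the usual serialization hooks, and the transmission method accepts only `Encrypted`. The check here happens at runtime; with Sorbet, a signature on `send_to_client` could move it to typecheck time.

```ruby
class TransmitError < StandardError; end

# Hypothetical ciphertext container: the only thing allowed over the wire.
class Encrypted
  attr_reader :ciphertext

  def initialize(ciphertext)
    @ciphertext = ciphertext
  end
end

# Hypothetical wrapper returned by the decryption function.
class NoTransmit
  def initialize(value)
    @value = value
  end

  # Delegate everything else to the member, re-wrapping the result so that
  # data derived from plaintext stays marked as non-transmittable.
  def method_missing(name, *args, &block)
    return super unless @value.respond_to?(name)

    NoTransmit.new(@value.public_send(name, *args, &block))
  end

  def respond_to_missing?(name, include_private = false)
    @value.respond_to?(name, include_private) || super
  end

  # "Exclusion logic": the hooks that usually precede transmission all raise.
  %i[to_s to_json as_json].each do |hook|
    define_method(hook) do |*_args|
      raise TransmitError, "#{hook} would expose plaintext"
    end
  end
end

# Hypothetical transmission method. With Sorbet this could carry
# sig { params(payload: Encrypted).returns(String) } instead of the manual check.
def send_to_client(payload)
  unless payload.is_a?(Encrypted)
    raise TypeError, "expected Encrypted, got #{payload.class}"
  end

  payload.ciphertext
end
```

Calling `send_to_client(NoTransmit.new("alice").upcase)` raises `TypeError`, while an `Encrypted` value passes through. Methods already defined on `Object` (such as `==` or `inspect`) are not routed through `method_missing`, so a real version would need to decide how to handle each of them.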

Cole

08/15/2022, 4:05 PM

I might be harebrained to say this. But, my initial thought is to have a kind of encrypted dictionary of information + a centralized security master with an access log.

Garth Goldwater

08/15/2022, 4:05 PM

the company is SOC 2 compliant—encryption at rest and in transit is already happening

Cole

08/15/2022, 4:06 PM

Ah, so this is just a type design question?

Garth Goldwater

08/15/2022, 4:07 PM

yeah the scenario is basically: the server decrypts some data for a calculation, then asks an api a question or sends some data to a client, while accidentally including some personally identifiable info. we catch it in code reviews but i was wondering if there was a way to catch it at compile time

Cole

08/15/2022, 4:11 PM

Something simple could be to always return decrypted data back to the user code under a

{ $dec: ___ }

key, so then, you would always have to access the decrypted data by using

respJson.userName.$dec

then, you can write a linter to statically check all use-site accesses of the decrypted data. Just spitballing. This could obv be adapted to some other tools in Ruby
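A rough Ruby take on the linter idea, as a sketch only. It assumes a hypothetical convention where decrypted values are reached through a `:dec` key or a `.dec` method (Ruby method names cannot start with `$`), and it uses a crude regular expression rather than a real parser. `find_dec_accesses` and `DEC_ACCESS` are made-up names; a production version would more likely be a custom cop in an existing linter such as RuboCop.

```ruby
# Matches resp[:field][:dec] and resp.field.dec, but not .decrypt or .dec_foo.
DEC_ACCESS = /\[:dec\]|\.dec\b/

# Returns one "file:line: source" string per line that touches decrypted data,
# so each use site can be reviewed or compared against an allowlist.
def find_dec_accesses(source, filename)
  hits = []
  source.each_line.with_index(1) do |line, lineno|
    hits << "#{filename}:#{lineno}: #{line.strip}" if line.match?(DEC_ACCESS)
  end
  hits
end
```

Running it over each file in the app gives a list of every place plaintext is read, which is the set of lines a reviewer currently has to spot by eye.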

Cole

08/15/2022, 4:12 PM

Or, it's easier to lint with your eyes...

Garth Goldwater

08/15/2022, 4:13 PM

yeah i'm pretty much trying to figure out how to encode that with a type—the decryption process is really clever, so it basically feels like you're using a regular rails object even though it's encrypted in memory. that's really nice for ergonomics and not having to update the whole codebase but it also means there's no guardrail to prevent you from sending it somewhere. maybe the answer is just to look through the encryption/decryption code and see if there's an obvious spot for a shim

Mariano Guerra

08/15/2022, 4:14 PM

a data type that throws when encoded/decoded? (if that's a thing in ruby)

Garth Goldwater

08/15/2022, 4:14 PM

so we want it to be decoded for use. we just don't want the decoded data sent off the server

Garth Goldwater

08/15/2022, 4:16 PM

or saved without encryption (very unlikely because the save instructions etc get interrupted by an encrypt-first proxy afaict)

Garth Goldwater

08/15/2022, 4:18 PM

this is kind of like the expression problem tbh: it's really easy in a dynamic OO PL to make sure that data gets encrypted/decrypted for save and use but it's really hard when we start calling functions like `respond_to` or `to_json` with that data as arguments or values deep in a hash somewhere

Garth Goldwater

08/15/2022, 4:19 PM

i'm hoping there's some way to make that data a wrapped type that complains only when it's going over the wire
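One way this could look in plain Ruby, as a sketch: a hypothetical `ServerOnly` wrapper (the name and `PlaintextLeakError` are assumptions) that hands out the plaintext only inside an explicit block and raises from the serialization hooks. Because the standard `json` library calls `to_json` on objects it does not know, the wrapper complains even when it sits deep inside a hash.

```ruby
require "json"

class PlaintextLeakError < StandardError; end

# Hypothetical wrapper for decrypted values that must stay on the server.
class ServerOnly
  def initialize(value)
    @value = value
  end

  # Server-side calculations ask for the plaintext explicitly.
  def use
    yield @value
  end

  # Anything that would turn the value into wire format raises instead.
  %i[to_s to_json as_json].each do |hook|
    define_method(hook) do |*_args|
      raise PlaintextLeakError, "#{hook} called on server-only data"
    end
  end

  # Keep logs and consoles from printing the plaintext.
  def inspect
    "#<ServerOnly [redacted]>"
  end
end

# The wrapped value is nested two levels down, as in a typical response hash.
record = { id: 7, profile: { email: ServerOnly.new("alice@example.com") } }
```

`JSON.generate(record)` raises `PlaintextLeakError`, while `record[:profile][:email].use { |v| v.split("@").last }` still works for local calculations. This only catches the leak at runtime, so it complements a static check; it does not replace one.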

Cole

08/15/2022, 4:21 PM

Sounds almost like sonarqube control flow analysis type stuff. Tracking when a piece of data was unwrapped and where it goes in order to detect this type of issue statically. Not really an answer...

Cole

08/15/2022, 4:24 PM

Another strategy is to try to completely isolate communications with this business logic somehow. So, you can do all the dangerous PII Management inside the business logic process, but it has to go through a small port to communicate externally (port like Elm)
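A small Ruby sketch of the port idea, under the assumption that every outbound message is a flat hash and that the set of safe fields can be listed up front. `OutboundPort` and its `emit` method are hypothetical names. The point is that there is exactly one place where data leaves the business logic, and it rejects any field it was not told about.

```ruby
require "json"

# Hypothetical single exit point for data leaving the business logic.
class OutboundPort
  def initialize(allowed_keys)
    @allowed_keys = allowed_keys
  end

  # Serializes the message only if every key is on the allowlist.
  def emit(message)
    extra = message.keys - @allowed_keys
    unless extra.empty?
      raise ArgumentError, "fields not allowed through the port: #{extra.inspect}"
    end

    JSON.generate(message)
  end
end
```

With `port = OutboundPort.new([:id, :status])`, a call like `port.emit({ id: 1, status: "ok" })` goes through, and adding an `ssn:` key raises before anything is serialized.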

Cole

08/15/2022, 4:26 PM

One other crazy idea would be to include mocking tools to generate PII during your testing, and then use a proxy on the requests to make sure PII matching data from the mocker is not being sent between services.

Cole

08/15/2022, 4:27 PM

Almost like a security fuzzer *
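A sketch of how the mocker-plus-proxy check might fit together in Ruby. `PiiCanary`, `fake_email`, and `assert_no_pii!` are hypothetical names; the assumption is that tests seed records with values from the canary and that something in the test harness can see each outgoing request body as a string.

```ruby
require "securerandom"

# Hypothetical test helper: hands out unique fake PII and remembers it.
class PiiCanary
  def initialize
    @issued = []
  end

  # A random, clearly fake address that cannot collide with real data.
  def fake_email
    value = "canary-#{SecureRandom.hex(8)}@example.test"
    @issued << value
    value
  end

  # Every issued value that appears verbatim in the outgoing payload.
  def leaks_in(payload)
    @issued.select { |value| payload.include?(value) }
  end
end

# What the proxy would call on each outbound request body during tests.
def assert_no_pii!(canary, payload)
  leaks = canary.leaks_in(payload)
  raise "PII canary leaked: #{leaks.join(', ')}" unless leaks.empty?

  payload
end
```

This only finds plaintext that leaves unchanged; a value that is reformatted or truncated before sending would slip past a verbatim match.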

Garth Goldwater

08/15/2022, 4:40 PM

woah these are all great ideas thank you

Joakim Ahnfelt-Rønne

08/15/2022, 5:41 PM

Perhaps create a Classified<T> type that can only be read by a Declassifier object, giving you something like object capabilities.
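A sketch of that pairing in plain Ruby, with the caveat that Ruby cannot truly enforce capabilities (reflection such as `instance_variable_get` bypasses it), so this is a guardrail against accidents, not a security boundary. The class and method names are assumptions. The declassifier also keeps a log of what it read, which makes each read easy to audit.

```ruby
# Hypothetical capability object: holding one is what grants read access.
class Declassifier
  attr_reader :log

  def initialize(purpose)
    @purpose = purpose
    @log = []
  end

  # Reads the plaintext and records that it did so.
  def read(classified)
    @log << "#{@purpose}: read #{classified.label}"
    classified.reveal_to(self)
  end
end

# Hypothetical container for a decrypted value.
class Classified
  attr_reader :label

  def initialize(label, value)
    @label = label
    @value = value
  end

  # Only a Declassifier can get the value out.
  def reveal_to(reader)
    raise ArgumentError, "a Declassifier is required" unless reader.is_a?(Declassifier)

    @value
  end

  # Printing or interpolating the container never shows the plaintext.
  def inspect
    "#<Classified #{@label}>"
  end
  alias to_s inspect
end
```

Code that needs plaintext has to be handed a `Declassifier`, so the places that can leak data are the places that receive one, which is a much smaller set to review.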

Tom Larkworthy

08/16/2022, 7:16 AM

At Google, protobuf message field descriptions can be annotated with options, so there are linters that check that the service definition is not allowing public access to confidential fields. I think there was deeper support in some of the processing libraries to pass those flags along with variables, but I can't see anything equivalent in the public domain and they are not really foolproof anyway. I do think this is more of a metadata topic than a types topic though. gRPC can represent annotated datatypes, which might be a useful building block (?)

Cole

10/13/2022, 8:35 PM

@Garth Goldwater did you figure out an interesting approach?

Garth Goldwater

10/16/2022, 12:20 AM

i didn’t unfortunately. we had to move on to some launch stuff for the moment. still very interested

👍 1
