# present-company
g
Calling all type nerds: my new job deals with a lot of encrypted data that shouldn't be transmitted in plaintext. We’ve just added sorbet to our rails project (gradual typechecker). I feel like I've read somewhere that it's possible to use types to enforce that kind of restriction. Does anyone have any experience using types for that (or links to talks/papers/blog posts)?
m
Language-based information-flow security https://ieeexplore.ieee.org/abstract/document/1159651
you may find more searching for "tainted data", "taint propagation" and similar
🙏 1
g
for anyone looking, here's the preprint: https://www.cs.cornell.edu/andru/papers/jsac/sm-jsac03.pdf
@Cole
i'm almost imagining something like a NoTransmit<a> getting returned from the decryption function and delegating all the methods to the member… but then using some kind of exclusion logic on methods that transmit data to the client. or like have the transmission methods only accept Encrypted<a> for certain values of a. is that too wild?
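A minimal sketch of the second variant (transmission methods that only accept encrypted values), in plain Ruby. `Encrypted`, `transmit`, and the array standing in for the wire are all hypothetical names, not part of the project discussed here. The check below runs at runtime; with Sorbet, a signature on `transmit` could presumably express the same constraint statically.

```ruby
# Hypothetical marker type for ciphertext; the name and field are assumptions.
Encrypted = Struct.new(:ciphertext)

# Hypothetical outbound boundary: everything leaving the server goes through here.
# With Sorbet, a signature such as `sig { params(payload: Encrypted, sink: Array).void }`
# could move this check to typecheck time (an assumption, not shown running here).
def transmit(payload, sink)
  unless payload.is_a?(Encrypted)
    raise TypeError, "transmit only accepts Encrypted, got #{payload.class}"
  end
  sink << payload.ciphertext
end

wire = []
transmit(Encrypted.new("9f3ac2"), wire) # allowed: ciphertext only

begin
  transmit("alice@example.com", wire)   # plaintext is rejected at the boundary
rescue TypeError => e
  rejected = e.message
end
```

The design choice is to make the boundary the only place that needs to know about encryption, so the rest of the codebase can keep treating decrypted objects as regular values.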
c
I might be harebrained to say this. But, my initial thought is to have a kind of encrypted dictionary of information + a centralized security master with an access log.
g
the company is SOC 2 compliant—encryption at rest and in transit is already happening
c
Ah, so this is just a type design question?
g
yeah the scenario is basically: the server decrypts some data for a calculation, then asks an api a question or sends some data to a client, while accidentally including some personally identifiable info. we catch it in code reviews but i was wondering if there was a way to catch it at compile time
c
Something simple could be to always return decrypted data back to the user code under a
{ $dec: ___ }
key, so then, you would always have to access the decrypted data by using
respJson.userName.$dec
then, you can write a linter to statically check every use site that accesses the decrypted data. Just spitballing. This could obv be adapted to similar tooling in Ruby
Or, it's easier to lint with your eyes...
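A rough sketch of what such a use-site check might look like, written in Ruby and scanning source text for `.$dec` accesses. It is a regex scan rather than a real linter (no parsing, so it would also flag matches inside comments or strings), and `dec_use_sites` and the sample source are made up for illustration.

```ruby
# Matches property accesses of the form `something.$dec`.
DEC_ACCESS = /\.\$dec\b/

# Returns [line_number, stripped_line] pairs for every line that reads decrypted data.
def dec_use_sites(source)
  source.each_line.with_index(1)
        .select { |line, _number| line.match?(DEC_ACCESS) }
        .map { |line, number| [number, line.strip] }
end

# Hypothetical client code to scan.
sample = <<~'SRC'
  const name = respJson.userName.$dec;
  const id = respJson.id;
  send(respJson.email.$dec);
SRC

sites = dec_use_sites(sample)
```

Each reported site would still need a human or a stricter rule to decide whether the access is a legitimate calculation or a leak.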
g
yeah i'm pretty much trying to figure out how to encode that with a type—the decryption process is really clever, so it basically feels like you're using a regular rails object even though it's encrypted in memory. that's really nice for ergonomics and not having to update the whole codebase but it also means there's no guardrail to prevent you from sending it somewhere. maybe the answer is just to look through the encryption/decryption code and see if there's an obvious spot for a shim
m
a data type that throws when encoded/decoded? (if that's a thing in ruby)
g
so we want it to be decoded for use. we just don't want the decoded data sent off the server
or saved without encryption (very unlikely because the save instructions etc get interrupted by an encrypt-first proxy afaict)
this is kind of like the expression problem tbh: it's really easy in a dynamic OO PL to make sure that data gets encrypted/decrypted for save and use but it's really hard when we start calling functions like `respond_to` or `to_json` with that data as arguments or values deep in a hash somewhere
i'm hoping there's some way to make that data a wrapped type that complains only when it's going over the wire
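One way this could look in plain Ruby, as a sketch only: a wrapper built on the standard library's `SimpleDelegator` that forwards ordinary method calls to the decrypted value but raises when something tries to serialize it. `NoTransmit` and `LeakError` are hypothetical names. This covers `to_json` and `to_s` only; Rails serialization paths (for example `as_json`) would need their own handling, which is not shown.

```ruby
require "delegate"
require "json"

class LeakError < StandardError; end

# Hypothetical wrapper: usable like the underlying value, but refuses serialization.
class NoTransmit < SimpleDelegator
  def to_json(*_args)
    raise LeakError, "decrypted value reached a JSON serializer"
  end

  def to_s
    raise LeakError, "decrypted value reached string conversion"
  end
end

email = NoTransmit.new("alice@example.com")

# Ordinary use still works through delegation.
domain = email.split("@").last

# Serialization fails even when the value is nested inside a hash.
begin
  JSON.generate({ "email" => email })
rescue LeakError => e
  caught = e.message
end
```

A known gap in this sketch: values derived from the wrapper (such as `domain` above) come back as plain strings, so the protection does not propagate the way real taint tracking would.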
c
Sounds almost like sonarqube control flow analysis type stuff. Tracking when a piece of data was unwrapped and where it goes in order to detect this type of issue statically. Not really an answer...
Another strategy is to try to completely isolate communications with this business logic somehow. So, you can do all the dangerous PII Management inside the business logic process, but it has to go through a small port to communicate externally (port like Elm)
One other crazy idea would be to use mocking tools to generate PII during your testing, and then put a proxy on the requests to make sure data matching the mocker's PII is not being sent between services.
Almost like a security fuzzer *
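A small sketch of that canary idea in Ruby, assuming a test helper that remembers every fake PII value it hands out and a check that a test proxy could run on each outbound body. `PiiMocker` and `leaked_canaries` are hypothetical names; a real setup would hook the check into whatever HTTP stubbing or proxy layer the test suite already uses.

```ruby
require "securerandom"

# Hypothetical test helper: generates fake PII and records each value it issued.
class PiiMocker
  attr_reader :issued

  def initialize
    @issued = []
  end

  def email
    value = "canary-#{SecureRandom.hex(6)}@example.test"
    @issued << value
    value
  end
end

# Returns the issued canary values that appear verbatim in an outbound body.
def leaked_canaries(body, mocker)
  mocker.issued.select { |value| body.include?(value) }
end

mocker = PiiMocker.new
user_email = mocker.email

safe_body = '{"status":"ok"}'
leaky_body = %({"status":"ok","debug":"#{user_email}"})

safe_hits = leaked_canaries(safe_body, mocker)
leaky_hits = leaked_canaries(leaky_body, mocker)
```

Substring matching only catches values sent verbatim; anything transformed (hashed, truncated, re-encoded) before leaving would slip through.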
g
woah these are all great ideas thank you
j
Perhaps create a Classified<T> type that can only be read by a Declassifier object, giving you something like object capabilities.
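A sketch of how that might look in plain Ruby, with the caveat that Ruby cannot truly enforce capabilities (`send` and `instance_variable_get` can always bypass it), so this is a convention that makes accidental reads awkward, not a security boundary. Both class bodies are assumptions for illustration.

```ruby
# Hypothetical capability: only code holding a Declassifier can read the value.
class Declassifier
  def reveal(classified)
    # `send` is used deliberately to reach the private reader.
    classified.send(:unwrap, self)
  end
end

class Classified
  def initialize(value)
    @value = value
  end

  # Logging or interpolating the wrapper never shows the underlying value.
  def inspect
    "#<Classified [redacted]>"
  end
  alias to_s inspect

  private

  def unwrap(capability)
    raise ArgumentError, "a Declassifier is required" unless capability.is_a?(Declassifier)
    @value
  end
end

ssn = Classified.new("123-45-6789")
logged = "user ssn: #{ssn}"

declassifier = Declassifier.new
revealed = declassifier.reveal(ssn)
```

If `Declassifier` instances are only constructed in a few audited places, a reviewer can search for those sites instead of every place the data flows.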
t
At Google, protobuf message field descriptions can be annotated with options, so there are linters that check the service definition is not allowing public access to confidential fields. I think there was deeper support in some of the processing libraries to pass those flags along with variables, but I can't see anything equivalent in the public domain and they are not really foolproof anyway. I do think this is more of a metadata topic rather than a types topic though. gRPC can represent annotated datatypes, which might be a useful building block (?)
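To make the metadata framing concrete, here is a toy Ruby sketch of a lint over annotated fields. It does not use protobuf or gRPC and is not a reconstruction of any internal tooling; the field names, endpoint names, and the `confidential_exposures` helper are all invented for illustration.

```ruby
# Hypothetical sensitivity annotations per message field.
FIELD_SENSITIVITY = {
  "User.id" => :public,
  "User.display_name" => :public,
  "User.ssn" => :confidential
}.freeze

# Hypothetical service definition: the fields each public endpoint returns.
PUBLIC_ENDPOINTS = {
  "GetProfile" => ["User.id", "User.display_name"],
  "GetDebugInfo" => ["User.id", "User.ssn"]
}.freeze

# Returns { endpoint => [non-public fields it exposes] } for offending endpoints.
def confidential_exposures(endpoints, sensitivity)
  endpoints.each_with_object({}) do |(name, fields), found|
    # Unannotated fields are treated as confidential, a deliberately strict assumption.
    bad = fields.reject { |field| sensitivity.fetch(field, :confidential) == :public }
    found[name] = bad unless bad.empty?
  end
end

violations = confidential_exposures(PUBLIC_ENDPOINTS, FIELD_SENSITIVITY)
```

The check only looks at declared schemas, so it says nothing about data that is copied into an unannotated field at runtime, which matches the "not really foolproof" caveat above.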
c
@Garth Goldwater did you figure out an interesting approach?
g
i didn’t unfortunately. we had to move on to some launch stuff for the moment. still very interested
👍 1