Does anyone here know anything (or know someone wh...
# thinking-together
j
Does anyone here know anything (or know someone who knows something) about data governance practices in enterprise environments and is willing to answer some newbie questions? I think there may be a strong use case for my tool in that space, but I'm uninformed.
v
I’m happy to try to answer, or could intro you to people.. been in the enterprise SaaS space for a while..
e
I’ve worked on a few platforms that include guidance for data and other sorts of governance. Happy to answer whatever I can.
j
Thanks, both. Here's my initial Q. It feels to me like this involves setting out data governance policies that say things akin to "don't give access to personally identifying information unless the user has admin privileges", and then what happens is that the people running the software need to configure their tools to comply with that rule. That feels like it would a) make it more complicated to add new tools, driving organizations to centralize, and b) be difficult for the people writing the policy, who presumably don't know the software tools, to confirm has been implemented correctly. I'm presuming that they do some form of user testing to see if they can find violations, but they don't validate the configurations directly, and there is nothing in the way of real-time or near-real-time auditing. Is that more or less how it goes? Or is the tooling more sophisticated than that?
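To make the example rule concrete, here is a minimal sketch of what one tool's encoding of it might look like. Everything in it (the `PII_FIELDS` set, the `User` type, the `visible_fields` function) is made up for illustration; a real tool would have its own configuration for this, which is exactly why the policy writer can't easily confirm it.

```python
from dataclasses import dataclass

# Hypothetical classification of which fields count as PII.
PII_FIELDS = {"name", "email"}


@dataclass(frozen=True)
class User:
    username: str
    is_admin: bool = False


def visible_fields(user: User, record: dict) -> dict:
    """Return the record with PII fields removed unless the user is an admin."""
    if user.is_admin:
        return dict(record)
    return {k: v for k, v in record.items() if k not in PII_FIELDS}


record = {"name": "Ada", "email": "ada@example.com", "department": "Finance"}
print(visible_fields(User("clerk"), record))
print(visible_fields(User("root", is_admin=True), record))
```

The written rule is one sentence, but its encoded form depends on choices (what counts as PII, what "admin" means) that live in each tool separately.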
e
You’ve defo hit a nail upon the head with mighty and thunderous hammer. The distinction between governance and what I’ll call “programmatic gate keeping” is oftentimes 2 overlapping circles. Defining what needs to be enforced “where” can help. In your example of access rules: don’t bother to include “governance” level guidance about that. As you say it can be nuanced and complicated. Enforce access rules in code if at all possible, and make the governance about how to use the right and good code pathway. E.g. “we’ve got OAuth set up. You must use it.” Here governance and code world are combined, and become stronger and (ideally) self supporting.
I think of this governance stuff as guidance for devs and other humans that directs them toward the correct and expected practices that you can then “enforce” or “verify” in code. Where it gets tricky is when you’ve got stuff that needs “governance” that can’t be enforced in code, or codified into a CI/CD pipeline for some reason. For me, the canonical example of this is the focus of my job: accessibility. There are accessibility auditing tools that can run automatic scans as part of a build and deploy pipeline, but they’re not really all that good. There are a bunch of accessibility rules that are codified into literal laws. So, governance says “do the things the law tells you to” so now you’ve got a 3rd category, somewhere between
• governance
• programmatic gate keeping
There is now also “guidance.” Guidance, in my meaning here, is the worst kind of (read perhaps as “most difficult to verify?”) governance because it relates to code practices that are critical but potentially reliant on individual implementations and are difficult to verify…
Also, this is, from my understanding, what led to the invention of SAFe Agile…so, um…beware, for here be seriously asinine management practices ⚠️ ⚠️ ⚠️
Your insight about the drive to centralize I think is correct. The oldest platform governance structures I’ve seen or had to interface with, the ones that I’d say were “successful” in their goals, were all centralized, and had a clear group or person acting as the authority…a governance group that had a governance process to gate keep what was in and out. This leads to consistency at the cost of speed. It also centralizes failure to a single authority group in some cases.
j
Correct me if I'm wrong, but whether it is enforced automatically, or manually audited after the fact, it still represents an objective that was decided upon somewhere, like "use OAuth" or "comply with accessibility laws". That's the "governance" part, I would think ... the choosing what to achieve and what to avoid (and potentially but not necessarily how)? Also, I'm wondering specifically about "data" governance, which includes access security but excludes accessibility, I would think. Is "data" governance different, somehow? Is the difference meaningful? Or have we created a sort of arbitrary division because people think of data as having value now?
e
when you say data governance, do you mean like access controls or like what data we keep?
j
From what I have been reading, it deals with security, cleanliness, non-duplication, standards and schema compliance, privacy, usability, accessibility, and regulatory requirements about non-collection or disposal. It also covers the organizational decision making surrounding the collection, use, management, and disposal of data.
But if I told you that the people I'm talking with in the "data governance" space seemed like they had a clear idea of what their job was, I would be lying. So it may be a "literally no one knows" sort of situation.
e
Gotchya! Yes. I was speaking a bit more generally of “platform governance” than specifically data governance. I’ve been involved with data governance within platform governance, too, but would say, as you point out, it’s a lot less well defined since what exactly “data” encompasses is a wee bit hand wavy.
j
Honestly, from what I have read, in the phrase "data governance" you have two terms that are deeply hand-wavy. But thanks for the clarification.
I think I'm slowly starting to hone in on what's important for me, here... The thing that the tool I am working on offers is an easier and more verifiable symmetry between written rules and their encoded equivalent. That tends to be most useful when following the rules is very important, but the requirements and text of the rule are not under your control, and the rules are complicated, and demonstrating adherence to them in automated systems is important. In data governance, with the exception of things like GDPR, you usually have control over the rules. If they are hard to implement, you can rewrite them. So I'm curious whether there is a pain point in data governance around automating and demonstrating compliance with written rules, in the terms of those written rules. For instance, if it was possible to run something in CI/CD that would take test inputs and outputs, detect data policy violations, and flag them with a link to the written policy they violate... is that anything? Does that make anyone's life better? Or is that like a hat on a hat?
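For concreteness, a minimal sketch of that kind of CI check might look like the following. It is plain Python and is not how Blawx itself encodes rules; the `Policy` type, the `DG-1` identifier, the `example.com` URL, and the `PII_FIELDS` set are all placeholders invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical classification of which response fields count as PII.
PII_FIELDS = {"name", "email"}


@dataclass(frozen=True)
class Policy:
    policy_id: str
    text: str  # the written rule, verbatim
    url: str  # placeholder link to where the written rule lives
    violated_by: Callable[[dict, dict], bool]  # (request, response) -> bool


def pii_leaked_to_non_admin(request: dict, response: dict) -> bool:
    """True when a non-admin request received any PII field in its response."""
    is_admin = request.get("is_admin", False)
    return (not is_admin) and bool(response.keys() & PII_FIELDS)


POLICIES = [
    Policy(
        policy_id="DG-1",
        text="Don't give access to personally identifying information "
             "unless the user has admin privileges.",
        url="https://example.com/policies/DG-1",
        violated_by=pii_leaked_to_non_admin,
    ),
]


def audit(cases: list) -> list:
    """Check each (request, response) test case against every policy."""
    findings = []
    for i, (request, response) in enumerate(cases):
        for policy in POLICIES:
            if policy.violated_by(request, response):
                findings.append(
                    f"case {i}: violates {policy.policy_id} ({policy.url})"
                )
    return findings


cases = [
    ({"is_admin": False}, {"department": "Finance"}),
    ({"is_admin": False}, {"email": "ada@example.com"}),
    ({"is_admin": True}, {"email": "ada@example.com"}),
]

for finding in audit(cases):
    print(finding)
```

A CI job could fail the build whenever `audit` returns a non-empty list, so each failure points back at the text of the rule rather than at a tool-specific setting.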
e
I think that is something! Cynically, I think it is something because it is an element that can be automated, and plugged into an existing system. In my experience, folks want people out, and CI/CD automation in: automate as much stuff as possible, no matter the implications of what it means for the system.
v
So I think the presence of data governance and compliance rules is very valuable for enterprises but hampers agility and iteration-based testing. So you have IT/Infosec being the bad guys while other parts of the org want to innovate. If you can solve for this cleanly, lots of value there.
j
Yeah, I'm just thinking the abstraction layer becomes where the bad guys live. Tell us about your software's data model, and we will build a connector to our compliance API, so you can automatically test or audit. But if your model doesn't fit our API, we are still the bad guys... I'm not sure how much better that is.
v
Are you thinking of doing this as a product?
not sure if that's relevant
j
LexiFi is in the same "computational law" space that I live in, but it solves for a different problem. I'm looking for structural isomorphism to the legal text, and ease of use (à la low-code/no-code). Functional programming gets you neither of those things, but gets you other cool stuff. Catala is another functional approach in the space, for tax and benefit systems. OpenFisca is object-oriented, aimed at comparative analysis and microsimulation of tax & benefits rules. I think the Accord Project has a language aimed at smart contracts, there's DataLex from AustLII, and a few others.
My entry in the space is Blawx, which is currently still more prototype than product. I am currently hosted inside a department of the Canadian federal government that is trying to find an in-house demonstration use-case, and data governance is the most promising current proposal (of about 4 options). Just trying to understand whether there is a problem/tool fit, or not.
I have a meeting with the CDO later this week, trying to understand more before I get there.
Appreciate all the help!
v
What would be examples of roles that could help answer questions for you? My take is that the buyer here would be on the innovation side; the governance side is likely to have veto powers but is not necessarily a purchaser. My network is pretty strong on the SaaS side, lemme know if I can introduce you to specific personas…
j
Yeah, that's one of the issues. It's fundamentally a bit of a two-sided tool. You need someone responsible for writing the rule but not for implementing it, and someone responsible for implementing it, but not for writing it. The rule expert validates the encoding, the implementer just uses the API. So in data governance, I'm interested in data governance policy writers, who wish the implementers were better at following the rules, and software devs and data stewards, who wish following the rules was easier, plus whoever is paying them, and wishes less time was spent on compliance.
If you are right, and the purchaser is the innovator, then I need to talk to people who are building software or dealing specifically with software compliance testing in heavily regulated enterprise environments. Or something? :)
v
yup - healthcare, banking, insurance..