# share-your-work
I made a little demo of a text-based knowledge management/second brain thingy


Basically it is meant to be a more humane and flexible version of graph databases like Neo4j. The "database" is currently text files with a git-like design that looks for changes and updates hands-off indexes that are used for the queries (stored separately under a single .bha folder).
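For anyone wondering what "looks for changes and updates indexes" could mean in practice, here is a minimal Python sketch. It is not the demo's actual implementation: the `hashes.json` file name, the `.txt` extension filter and the function names are all assumptions made for illustration; only the `.bha` folder name comes from the post.

```python
import hashlib
import json
from pathlib import Path

INDEX_DIR = ".bha"          # folder name taken from the post
STATE_FILE = "hashes.json"  # hypothetical file name, assumed for this sketch


def file_digest(path: Path) -> str:
    """Return a SHA-256 hex digest of the file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_files(root: Path) -> list[Path]:
    """Compare every .txt file under root with the digests stored in the
    index folder, save the new digests, and return the files whose content
    is new or different (the ones that would need re-indexing)."""
    index_dir = root / INDEX_DIR
    index_dir.mkdir(exist_ok=True)
    state_path = index_dir / STATE_FILE
    old = json.loads(state_path.read_text()) if state_path.exists() else {}

    new, changed = {}, []
    for path in sorted(root.rglob("*.txt")):
        if INDEX_DIR in path.relative_to(root).parts:
            continue  # never index the index folder itself
        key = path.relative_to(root).as_posix()
        new[key] = file_digest(path)
        if old.get(key) != new[key]:
            changed.append(path)

    state_path.write_text(json.dumps(new, indent=2))
    return changed
```

Calling `changed_files(Path("notes"))` twice in a row returns every file the first time and an empty list the second time, so only edited files are handed to whatever rebuilds the query indexes.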
Ultimately I would like it to ship with a large "standard library", that doesn't just search for your own entities, but extracts them from known file types (probably using Apache Tika). So it would be like plugging a super-powered graph search engine onto your normal OS. For example, if you had an Excel file, it would add Entities in your graph, not just at the file level, but also things within it. Like it would automatically make;
@[C:\Users\Chris\my excel file.xls]
   ![MS Excel Workbook]
   #created:01/03/2019  #filesize:120kb
   #worksheet: @[Sheet 1]
   #worksheet: @[Sheet 2]
   #worksheet: @[Sheet 3]
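As a rough sketch of that "standard library" step, the Python below turns already-extracted workbook metadata into the notation above. It does not call Apache Tika; the function name and its parameters are hypothetical, and the metadata is assumed to have been extracted by some other tool.

```python
def render_workbook_entity(path: str, created: str, filesize: str,
                           sheets: list[str]) -> str:
    """Render one workbook as an entity block in the post's notation.

    All arguments are assumed to come from an extractor that ran earlier;
    this function only formats them.
    """
    lines = [
        f"@[{path}]",
        "   ![MS Excel Workbook]",
        f"   #created:{created}  #filesize:{filesize}",
    ]
    # One #worksheet attribute per sheet, each pointing at its own entity.
    lines += [f"   #worksheet: @[{name}]" for name in sheets]
    return "\n".join(lines)


print(render_workbook_entity(
    r"C:\Users\Chris\my excel file.xls",
    created="01/03/2019",
    filesize="120kb",
    sheets=["Sheet 1", "Sheet 2", "Sheet 3"],
))
```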
I think a system like this could be used to create the Itemized OS that Alexander Obenauer writes about, on top of the OSes that we already have.
I love how this is riffing on a question in this forum. #collaboration The order of the tags in queries is significant, right? Like the difference between asking who is the brother of Bart vs whose brother is Bart. I wonder if including explicit query variables like Prolog does will make the queries easier to read/learn, even if more verbose. It might also give you more room for maneuver in adding more features.
There's actually a colon that distinguishes the "with the value of" case, but yeah I think you would eventually want named variables in queries, for some more complicated patterns. SPARQL uses ?var notation for this, which actually fits with the rest of the syntax neatly. I was planning to use something like this for the functional calculation side, like;
  #sent-emails:?[Chris's sent emails]

?[Chris's sent emails]
  !Email     <- filter to !Email's
  ?[Chris email address] = @Chris #email
  #from:?[Chris email address] <-- filter using var

?[Chris email recipients]
  ?addr = ?[Chris's sent emails] #to
  #email: ?addr
I haven't really thought it through properly, I just wanted to actually try out some of the things I had been discussing in the mentioned threads
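To make the "explicit query variables" idea concrete, here is a small Python sketch of Prolog/SPARQL-style pattern matching over triples. The data and the `match` function are invented for illustration and say nothing about how the demo works internally; the point is only that the position of the variable decides which question is being asked.

```python
Triple = tuple[str, str, str]

# Hypothetical facts, read as (subject, attribute, value).
FACTS: list[Triple] = [
    ("Lisa", "brother", "Bart"),    # Lisa's brother is Bart
    ("Maggie", "brother", "Bart"),
    ("Bart", "sister", "Lisa"),
    ("Bart", "sister", "Maggie"),
]


def match(pattern: Triple, facts: list[Triple]) -> list[dict[str, str]]:
    """Return one binding dict per fact that matches the pattern.

    Terms starting with '?' are variables; every other term must be equal
    to the corresponding term of the fact.
    """
    results = []
    for fact in facts:
        binding: dict[str, str] = {}
        for term, value in zip(pattern, fact):
            if term.startswith("?"):
                # A variable used twice must bind to the same value.
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            results.append(binding)
    return results


# "Whose brother is Bart?"
print(match(("?who", "brother", "Bart"), FACTS))
# "Who is the brother of Bart?" (no such fact in this data)
print(match(("Bart", "brother", "?who"), FACTS))
```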
If you decide to create a new surface syntax for what are essentially RDF triples, you might want to make sure your model is compatible with RDF so you can interoperate with other knowledge bases. For some prior art in terms of the syntax itself, check out the rather similar Turtle notation:
@prefix : <http://example.org/elements> .
<http://en.wikipedia.org/wiki/Helium>
    :atomicNumber 2 ;               # xsd:integer
    :atomicMass 4.002602 ;          # xsd:decimal
    :specificGravity 1.663E-4 .     # xsd:double
It does essentially get compiled to triples under the hood, because it currently is implemented on top of Neo4j, but the queries are bewilderingly complex when expressed in those terms. For example;
#sibling: @Bart
Is translated to the following Cypher query (it would be similar in SPARQL);
match (v_:Entity {name:"Bart"})
    match p_v = ((v:Entity)-[:IMPLIES*0..]->(v_))
    with distinct v, [m in nodes(p_v) where 'Context' in labels(m)] as contexts
    match (k_:Attribute {name:"sibling"})
    match p_k = ((k:Attribute)-[:IMPLIES*0..]->(k_))
    with distinct v, k, [m in nodes(p_k) where 'Context' in labels(m)]+contexts as contexts
    match (k)-[:VALUE]->(c:Context)-[:VALUE]->(v),
    with distinct n, contexts
    where size(contexts) = 0 or 
    all(c in contexts where exists((c)-[:CONTEXT]->()<-[:IMPLIES*0..]-()<-[:CATEGORY|ATTRIBUTE*0..1]-()<-[:IMPLIES*0..]-(n)))
    return distinct n
Which is not reasonable for a user to write.
I'm on a bit of an "ethnographic" thing at the moment so trying to come at stuff from a starting point of how people are already interacting with computers right now in the real world, rather than my own preferences which tend to be far too mathematical, "precise" and "elegant" for most of the species 🙃
Interesting! A small-scale knowledge graph with a lightweight interface looks very useful indeed. From my first impressions, I'd say yours is already getting too complex. Example: the way you disambiguate "Bart" by context. For a personal knowledge database, I'd probably prefer to keep entities unique, implying longer names, and add a completion interface for discovery and for faster entry. You'd type "Bart" and the system proposes all the entities containing "Bart" for selection.
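A completion interface like that can start out very simple. The Python sketch below is one possible approach, not anything from the demo: the `complete` function, its ranking rule (prefix matches first) and the sample entity names are all assumptions.

```python
def complete(fragment: str, entities: list[str], limit: int = 10) -> list[str]:
    """Propose entity names containing the typed fragment, ignoring case.

    Names that start with the fragment are listed first; ties are broken
    alphabetically. At most `limit` suggestions are returned.
    """
    needle = fragment.casefold()
    hits = [name for name in entities if needle in name.casefold()]
    hits.sort(key=lambda name: (not name.casefold().startswith(needle),
                                name.casefold()))
    return hits[:limit]


# Hypothetical entity names, kept globally unique as suggested above.
ENTITIES = ["Bart Simpson", "Lisa Simpson", "Bartholomew Fair", "Black Bart"]
print(complete("bart", ENTITIES))
```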
Yeah for sure you would need autocomplete in an actual system. To clarify the underlying model, @Bart and @[Bart Simpson] are separate entities; marking them as synonymous just allows them to be switched in/out in a query. This was meant to be like Wikipedia titles. The canonical title will be something globally unique like "Queen Victoria (British monarch)" or "Victoria (Australian state)", but will typically be referred to by a name ("Victoria") which is only contextually unique. This was my attempt to reflect how humans use names, but without introducing too much "magic" on the computer side. A dumb example that might make the point of this feature clearer;
@[My front door]

@[My garden door]

@[Half open interval [0, 1)]
   ![Subset of the Reals]
   #not-open #not-closed

   #not-open = #closed
Then the query
will return the two doors, but not the set, because the rule
#not-open = #closed
does not apply in the context of
, only
Neo4j is worse at this sort of thing than a triple store would be. In terms of making this query in SPARQL (using my Clojure wrapper):
(query `{:select [?siblingLabel]
         :where  [[~(entity "Bart Simpson") ~(wdt :sibling) ?sibling]]})
[{:siblingLabel "Lisa Simpson"}
 {:siblingLabel "Maggie Simpson"}
 {:siblingLabel "Hugo Simpson"}]
My own note-taking system includes a bunch of semantic web/GOFAI stuff, but also uses neural nets, &c. I strongly encourage this line of inquiry! 🙂 But I also wanted to give you some advice from farther down this road.
But how does it know that "Bart Simpson", sibling, "Lisa Simpson"? The actual triple is "Bart Simpson", sister, "Lisa Simpson"
I actually had Hugo in the demo originally but I thought it would confuse people with deficient Simpsons knowledge 🤣
@Jack Rusher I'm curious to hear more about how you use symbolic AI in your note taking system!
@Chris Knott In this case the database happens to contain those assertions, as I'm running this query against Wikidata. If I had a DB with only brother and sister, I'd be able to specify what sibling means using a meta-language that would then allow this query to remain as simple.
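As a hedged illustration of that last point, the Python sketch below keeps the query simple by declaring once that `sibling` covers both `brother` and `sister`, similar in spirit to `rdfs:subPropertyOf` in RDF Schema. The `SUBPROPERTIES` table, the `query` function and the facts are made up for this example and are not the meta-language referred to above.

```python
# Hypothetical rule table: a general attribute and the specific ones it covers.
SUBPROPERTIES = {"sibling": {"brother", "sister"}}

# Hypothetical database that only stores brother/sister assertions.
FACTS = [
    ("Bart Simpson", "sister", "Lisa Simpson"),
    ("Bart Simpson", "sister", "Maggie Simpson"),
    ("Bart Simpson", "brother", "Hugo Simpson"),
]


def query(subject: str, attribute: str, facts=FACTS) -> list[str]:
    """Return the values of `attribute` for `subject`, also accepting any
    more specific attribute listed for it in SUBPROPERTIES."""
    accepted = {attribute} | SUBPROPERTIES.get(attribute, set())
    return sorted(value for s, a, value in facts
                  if s == subject and a in accepted)


print(query("Bart Simpson", "sibling"))
```

The stored triples stay as `brother` and `sister`; only the lookup widens, so the user-facing query does not grow with the number of rules.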
@Kartik Agaram I need to make a video about the whole thing