The future "visual" languages I like to imagine are all about making execution something that the programmer participates in. One way to do that is to reify concepts that are implied or invisible in current tools: things like the "current location" of the instruction pointer, the path(s) it followed to arrive there, and the path(s) it'll take in the future. Showing those things graphically is one way of rendering them. You can also render them verbally, by giving the things in your program place names or attributes that convey the right meaning when spoken.
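To make that a little more concrete, here's a minimal sketch, in Python only because it happens to have a convenient hook for this: the standard-library sys.settrace function, which lets you watch the interpreter visit each line. The tracer, path, and collatz names are just illustrative stand-ins, not a proposal for a real tool; the point is that the path the instruction pointer followed becomes an ordinary value you can then render graphically, verbally, or however else you like.

```python
import sys

# Reify the "invisible" execution path as a concrete value:
# a list of (function name, line number) pairs, one per line executed.
path = []

def tracer(frame, event, arg):
    if event == "line":
        path.append((frame.f_code.co_name, frame.f_lineno))
    return tracer  # keep tracing inside this frame and any calls it makes

def collatz(n):
    steps = 0
    while n != 1:
        n = 3 * n + 1 if n % 2 else n // 2
        steps += 1
    return steps

sys.settrace(tracer)
collatz(6)
sys.settrace(None)

# Once the path is data, the rendering is a free choice:
# print it, draw it, or speak it aloud.
for func, line in path:
    print(f"{func}: line {line}")
```

A real environment would record far richer information (values, timing, the paths not taken), but even this toy version turns the instruction pointer's journey into something you can hold and inspect.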
Think of the difference between a book (say, Hitchhiker's Guide) and a text adventure game (say, the Hitchhiker's Guide game). A book is static; a text adventure game is participatory. They both tell the same story, and both depend on the reader/player to advance it, but the game allows the player to consciously shape how the story unfolds for them. The fact that the text adventure game is "rendered" in text is just an implementation detail: you could make a version of the same game, with the same narrative and gameplay, rendered graphically. One could argue that Myst is the 3D-rendered equivalent of a text adventure game. You could also make an equivalent with a purely spoken interface. I once did the score for a theatrical production where the whole room was the stage and the audience sat on pieces of the set. The audience members' presence was part of the story, and if the audience did something, the actors responded to it: to a very limited extent, the theatre equivalent of a text adventure game. All of these media are different ways of rendering a story, and for each kind of rendering there can be a participatory version (game/interactive) and a non-participatory version (novel/movie/audiobook).
Same goes for programming. Right now, we're only able to participate in the execution of our software via the UI we create for it, plus whatever debugging tools our environment permits. Live Programming tools let us participate a little bit more. I'm interested in taking participation a lot further. Using (e.g.) "nodes and lines" to render this programming experience is just an implementation detail. You could offer the same hyper-participatory experience using text, or speech, or holograms, or Braille. I'm generally enthusiastic about "visual" programming because I'm a visual artist with a nostalgic love for complex GUI tools like Maya, so that's the aesthetic I'm interested in first. If I were blind, I'd still be chasing the same participatory vision, but I'd be building a spoken interface first.