This is in response to a question asked on Fora: Is there a “ceiling” in software engineering? Why?
Software developers hit a glass ceiling because software development itself has hit a glass ceiling. Software development has failed to figure out how to work with information directly inside a computer program. Instead, we work with data rather than information. We have left the information work to the BI, data analytics, data mining, knowledge representation world. This world works with data after the fact to extract information from it.
What if programmers could work with information inside a program, before the fact? The next level of development could be to work with information as well as data inside a program. The glass ceiling is that we generally cannot work with information. An entirely new industry development effort needs to get started, developing tools and techniques for doing this. I am doing my little bit to break through this glass ceiling at my blog. I also provide libraries for the techniques developed so far. But we need much more than the few techniques that one software developer can create, God willing. We need an industry push to break through the glass ceiling and build the entire information level of software development. Let’s get started!
I will start with an example.
Here are examples at each level to handle a user-customer relationship:
- Level 1 – RDB: A join table
- Level 2 – OO: A few classes depending on use and need:
  - a user table with a list of customer objects or a list of customer IDs, depending on use and need
  - a customer class
  - a user class
  - a ‘client’ data structure that contains both objects (references to them)
  - a user-customer relationship class
- Level 3 – IOP: Three endeme sets defining user-customer fields:
  - a UserIdentity set
  - a CustomerIdentity set
  - a relationship set
- Level 4 – KR: Two nodes and a relationship edge
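To make the comparison above a little more concrete, here is a minimal Python sketch of levels 1, 2 and 4 for the user-customer relationship. Every name in it (the `user_customer` table, the `User` and `Customer` classes, the `serves` relation) is made up for illustration. Level 3 is left out on purpose because the endeme set definitions are not shown in this post.

```python
import sqlite3
from dataclasses import dataclass, field

# Level 1 - RDB: a join table connects users to customers (n-to-n).
def build_join_table():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE user (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE user_customer (
            user_id INTEGER REFERENCES user(id),
            customer_id INTEGER REFERENCES customer(id),
            PRIMARY KEY (user_id, customer_id)
        );
        INSERT INTO user VALUES (1, 'alice');
        INSERT INTO customer VALUES (10, 'Acme'), (11, 'Globex');
        INSERT INTO user_customer VALUES (1, 10), (1, 11);
    """)
    return db

def customers_of(db, user_id):
    """Follow the join table from one user to that user's customer names."""
    rows = db.execute(
        "SELECT c.name FROM customer c "
        "JOIN user_customer uc ON uc.customer_id = c.id "
        "WHERE uc.user_id = ? ORDER BY c.name", (user_id,))
    return [name for (name,) in rows]

# Level 2 - OO: a user object holds references to customer objects.
@dataclass
class Customer:
    name: str

@dataclass
class User:
    name: str
    customers: list = field(default_factory=list)

# Level 4 - KR: two nodes and a relationship edge, stored here as
# (subject, relation, object) tuples. This is an assumed representation.
def related(edges, subject, relation):
    return sorted(o for (s, r, o) in edges if s == subject and r == relation)
```

The same relationship gets asked in three different ways: a SQL join, an attribute lookup, and an edge filter.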
Native Organization Schemes
- Level 1 – RDB – relations
- Level 2 – OO – hierarchies
- Level 3 – IOP – matching (or memberships?)
- Level 4 – KR – graphs
For matching, each piece of data is part of whatever it matches well. For membership, each piece of data is part of multiple things.
The First Three Organization Schemes Are Implemented Through Four Types of Tables:
- data tables – usually have a large number of rows focused on data
- lookup tables – usually have a smaller number of rows focused on data [OO]
- endeme tables – usually have a small number of rows focused on ‘type’ [IOP]
- join tables – plumbing used to connect data tables to other tables in an n-to-n relationship [RDB]
Given the needs of the various parts of a project, work with one or more of these for different purposes.
Resulting in These Structures:
- hierarchy – OO is especially good with these [like lookup tables]
- join – uniquely relational [join tables]
- search – generally information oriented [endeme tables and endeme structures]
- data – everyone works with data to some extent
- graphs – above the level of database tables?
Then adding two more types of tables:
- Endeme implementation tables – endemes
- Graph implementation tables – knowledge representation
These two new groups of tables implement endeme and knowledge representation structures (information structures) in a database. They generally sit somewhat apart within the database in their own little engines.
Programming won’t really be complete until we push our software development technology up to level 4.
Level 4 issues:
- ontological meaning/context
The reason we have so much trouble with workflow and layout is that we haven’t built level 4 yet.
build level 3 toward level 4?
[Diagram: Level 1 (data) at the base, Level 2 (objects) above it, Level 3 (endemes) above that, and Level 4 at the top, connecting UI, layout, workflow, networks, reports, and database integration.]
Wisdom is a Computational Creativity Function. The computer has to create what it is going to do by applying its understanding gained through artificial intelligence.
- Data: (Raw) Red, 18.104.22.168, v2.0
- Information: (Meaning) South facing traffic light on corner of Pitt and George Streets has turned red.
- Knowledge: (Context) The traffic light I am driving towards has turned red.
- Wisdom: (Applied) I better stop the car!
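The four steps above can be sketched as a small Python pipeline. This is only an illustration: the class and function names are invented, and the assumption that a south-facing light is the one seen by a driver heading north is mine. The last step is also a hard-coded rule, so it stands in for the wisdom step without being creative in any way.

```python
from dataclasses import dataclass

@dataclass
class Reading:
    """Data: a raw value with no meaning attached."""
    value: str

@dataclass
class LightState:
    """Information: the raw value given meaning."""
    location: str
    facing: str
    colour: str

def to_information(reading, location, facing):
    return LightState(location, facing, reading.value.lower())

def to_knowledge(state, my_heading, my_next_intersection):
    """Knowledge (context): is this the light I am driving towards?"""
    # Assumption: a light facing south is seen by a driver heading north.
    opposite = {"south": "north", "north": "south",
                "east": "west", "west": "east"}
    return (state.location == my_next_intersection
            and opposite[state.facing] == my_heading)

def to_action(state, applies_to_me):
    """Wisdom (applied): choose what to do."""
    if applies_to_me and state.colour == "red":
        return "stop"
    return "continue"
```

For example, `to_information(Reading("Red"), "Pitt and George", "south")` gives the information level, and feeding that through `to_knowledge` and `to_action` for a driver heading north towards that corner yields `"stop"`.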
The Example Converted to Levels:
0. Raw Data
1. Stored Data
2. -> Meaning: data types, objects, and hard-defined structural interpretation
3. -> Meaning: information with structural (and conceptual?) context as meaning
4. -> Meaning: knowledge representation with relational context to user
5. -> Context: knowledge and what various outcomes and actions mean
6. -> Applied: wisdom – stop the car!
Level 6 is computational creativity. The computer has to create an action.
Context and meaning are both built up through multiple levels rather than just being at one level each.
Endemes Allow a Computer to Understand Information Better. They will allow you to write programs that understand information. I was recently reading the Wikipedia page on web 3.0. It has a cool diagram showing the information layers of Web 3.0. The Semantic Web Stack:
To be clear, web 3.0 will use the following technologies to help a computer understand information:
- OWL – ontologies
- RIF – rules
- RDF – key-value pairs
- RDFS – taxonomies
But this is not enough.
Ontologies tell what something means as related to other things. But the things need an initial meaning to transfer meaning to what they are related to. Taxonomies tell what something means in terms of hierarchies, i.e. what something is part of. This is information based on what group something is in, which is useful, but hierarchies don’t tell the computer enough (my intuition says). Key-value pairs allow the building of ‘classes’ with ‘members’ and they say what something is made of, but this is still not enough somehow, and members tend to be very rigid and hard coded. Rules are useful but tend to focus on what to do with the information (its processing) rather than the information itself.
The 3.0 Information Component set has:
a piece of information:
- what it's related to
- what it's part of
- what it's made of
- what to do with it
This still does not really tell you anything about what it means.
A Fifth Information Component Can Address Information Directly
Endemes allow you to add another thing to your information collection above.
Endemes allow you to say something about its importance. The concepts ‘meaningful’ and ‘important’ are closely related. So adding endemes we have:
a piece of information
- what it's related to - ontologies
- what it's part of - categorization/taxonomies
- what it's made of - constitution
- what to do with it - rules
- what is its importance - characterization
The ‘what is its importance’ here includes not only a measure of its importance but also qualitative measures of the importance of the concepts that it consists of or is defined with. Endemes provide a set of concepts and the relative importance of each concept in defining a piece of information, so they define its importance in qualitative rather than quantitative terms. When you do an endeme query on a list of items defined by endemes, you get the importance of each of the items based on the query.
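Here is a rough Python sketch of the idea of querying items by the relative importance of their concepts. It is not the algorithm in my libraries. It assumes, purely for illustration, that an item is described by a list of concepts ordered from most to least important, and it uses a made-up scoring rule (the sum of the products of positional weights for shared concepts).

```python
def weights(ordered_concepts):
    """Give each concept a weight; the first (most important) gets the largest."""
    n = len(ordered_concepts)
    return {concept: n - i for i, concept in enumerate(ordered_concepts)}

def match(query, item):
    """Score an item against a query over the concepts they share."""
    qw, iw = weights(query), weights(item)
    return sum(qw[c] * iw[c] for c in qw if c in iw)

def rank(query, items):
    """Return item names ordered from best match to worst."""
    return sorted(items, key=lambda name: match(query, items[name]),
                  reverse=True)

# Hypothetical items, each described by concepts in order of importance.
items = {
    "report":    ["accuracy", "speed", "cost"],
    "dashboard": ["speed", "looks", "accuracy"],
    "invoice":   ["cost", "accuracy"],
}
print(rank(["speed", "looks"], items))
```

The point of the sketch is that the query returns an ordering of the items, not a yes/no answer, which is how a qualitative notion of importance shows up in the result.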
A Possible Sixth Component of Information is Outcome
You can perhaps add another information component to the mix by adding an item for the future. A concept like ‘courage’ includes a future part. Courage has to do with taking on things that could hurt me or that might not turn out. This brings in predictive ramifications, creative possibility, planning, strategy, goals, desire, emotion, and a time aspect.
It may only be relevant within a simulation or context.
This possible new information component relies on:
1. outcomes and goals (desired outcomes)
2. context – related to rules above
3. actions – related to strategy above.
For information, we only really need the outcomes list. The rest of the items get into a whole ‘nother ball of wax.
You can think of level 3 as adding a third dimension to programming. The first dimension is database, the second dimension is object orientation. The third dimension is information orientation. As with the analogy, the third dimension allows you to cover much more ‘space’ with less code. 10×10×10 is 1000, whereas 20×20 is only 400.
Adding the third dimension in programming allows generic programming where we would have done ad hoc programming before. This is why having the right tool helps. Ad hoc programming is programming on a case-by-case basis, using different structures for the same thing. Generic programming uses a consistent approach, providing the same structure (or perhaps a few generic structures) for the same thing.
This is the key to why adding a third dimension reduces programming cost. It makes some ad hoc things into standardized things.
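A small Python sketch of the difference, using an invented example: three kinds of records that each store ‘how urgent is this?’ in their own way, versus one shared structure. The field names and the 0.8 threshold are arbitrary assumptions.

```python
# Ad hoc: the same idea is stored three different ways,
# so every record type needs its own code.
def ticket_is_urgent(ticket):
    return ticket["priority"] == 1          # urgency as a number, 1 = highest

def order_is_urgent(order):
    return order["rush"] is True            # urgency as a flag

def email_is_urgent(email):
    return email["importance"] == "high"    # urgency as a label

# Generic: one structure for the same thing, so one function
# covers every record type.
def is_urgent(record):
    return record["urgency"] >= 0.8         # threshold is an assumption

records = [
    {"kind": "ticket", "urgency": 0.9},
    {"kind": "order", "urgency": 0.2},
    {"kind": "email", "urgency": 0.8},
]
urgent_kinds = [r["kind"] for r in records if is_urgent(r)]
```

Adding a fourth record type costs another function in the ad hoc version and nothing in the generic one.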
Trace-backs are highly useful for computational creativity programming. In order to debug and build on something created, you need to know how the system did it.
Are trace-backs useful for information programming too? Sometimes the business rules result in the user not being allowed to do something, or in something simply not happening. Telling them why could be useful.
One of the problems with using .NET DataRows the way they are generally used is that you lose some of the context, i.e. you lose some of the trace-back.
EndemeFields (semantically defined data) can include trace-backs as long as the trace-back can be properly applied to the field. These would usually be result fields. This is a .NET implementation detail.
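To show the general idea of a result field that carries its own trace-back, here is a short Python sketch. It is not the .NET EndemeField implementation. The `TracedField` type, the `can_refund` function, and both business rules are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class TracedField:
    """A result value plus the steps that produced it (hypothetical type)."""
    value: object
    trace: list = field(default_factory=list)

def can_refund(order_total, days_since_purchase):
    """Apply two made-up business rules and record why each passed or failed."""
    result = TracedField(True)
    if days_since_purchase > 30:
        result.value = False
        result.trace.append(
            f"denied: {days_since_purchase} days exceeds 30-day window")
    else:
        result.trace.append("ok: within 30-day window")
    if order_total <= 0:
        result.value = False
        result.trace.append("denied: order total is not positive")
    else:
        result.trace.append("ok: order total is positive")
    return result
```

When a rule blocks the user, the trace lines on the returned field can be shown to them directly, so the ‘why’ travels with the result instead of being lost along the way.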