We Need to Use Tracebacks in Some Kinds of Programs

Tracebacks are highly useful for computational creativity programming. In order to debug and build on something the system created, you need to know how the system did it.
Are tracebacks useful for information programming too? Sometimes the business rules result in the user not being allowed to do something, or in something simply not happening. Telling the user why could be useful.

One of the problems with using .NET DataRows the way they are generally used is that you lose some of the context, i.e. you lose some of the traceback.

EndemeFields (semantically defined data) can include tracebacks as long as the traceback can be properly applied to the field. This would usually be result fields. This is a .NET implementation detail.
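
A minimal sketch of the idea, in Python rather than .NET to keep it short. It is not the EndemeField implementation. The names TracedField and apply_discount and the discount rule itself are made up for illustration. The point is only that a result field can carry the reasons for its value along with the value.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class TracedField:
    """A result field that remembers how its value was produced (hypothetical)."""
    name: str
    value: Any = None
    trace: List[str] = field(default_factory=list)

    def set(self, value: Any, reason: str) -> None:
        # Record the reason alongside every change so the history survives.
        self.value = value
        self.trace.append(f"{self.name} = {value!r}: {reason}")

    def explain(self) -> str:
        # The traceback a user or developer could be shown.
        return "\n".join(self.trace)


def apply_discount(order_total: float, is_member: bool) -> TracedField:
    """Toy business rule (an assumption): members get 10% off orders of 100 or more."""
    discount = TracedField("discount")
    discount.set(0.0, "default, no rule applied yet")
    if not is_member:
        discount.set(0.0, "denied: customer is not a member")
    elif order_total < 100:
        discount.set(0.0, "denied: order total is under 100")
    else:
        discount.set(round(order_total * 0.10, 2),
                     "granted: member with order of 100 or more")
    return discount


result = apply_discount(80.0, is_member=True)
print(result.explain())
```

Here the user gets no discount, and the last line of the trace says why. A plain value of 0.0 in a row would have lost that context.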

An Inner Platform is a Good Thing?

In figuring out how to work with information inside a computer program, I am working toward building an inner platform. The thing I have to watch out for is an anti-pattern called the “Inner Platform Effect”. The inner platform effect often starts out with a generic database that gets added to and enhanced until it becomes a replica (usually a poor and incomplete one) of one of the systems that it is built on top of. Most often the kinds of inner platforms built are replications of relational databases and operating systems.

It is possible to build an inner platform that replicates the capabilities and organization of a relational database. It is possible to build an inner platform that replicates the capabilities and organization of an operating system. It is possible to build an inner platform that replicates the capabilities and organization of an object oriented software development system.

In data oriented software development the inner platform effect is an anti-pattern. The Inner Platform Effect – “The Inner-Platform Effect is a result of designing a system to be so customizable that it ends becoming a poor replica of the platform it was designed with.” This “customization” of this dynamic inner-platform becomes so complicated that only a programmer (and not the end user) is able to modify it.

Four things tend to lead to the creation of an inner platform. Any or all of these may be in play:

  • Users want something that they can modify themselves without having to have a programmer involved.
  • Programmers want to build something very generic because they do not know what precise needs their users have.
  • Programmers may realize deep down that the way forward in software development is to build a level above the level they are currently working at. This may be the real reason programmers keep building them and suffering the consequences. Without information structures and theory, you simply cannot build an inner platform that adds more capability than a database or object oriented approach.
  • Programmers look at repetitive field handling code and think to themselves that there must be a way of abstracting it. To oversimplify the situation, each column in a table ends up being a member in a class. A whole bunch of members tend to look the same and get set, changed, erased, and displayed the same way, and the thinking is that there really should be a generic, abstracted way to handle them.

Avoid the Inner Platform Effect

When you are building an inner platform there are some things you can do to avoid the inner platform effect:

  1. Do something the underlying platform cannot do or can do very poorly. This is my goal – to build an inner platform that works with information as well as data.
    Working with information is something that a database or an object oriented program can barely do, if at all. Usually when someone attempts to use one of these technologies for information, they end up with a snarl of excessively complex, nearly unmaintainable code.
    Endemes provide work-with-able information about the items in the ‘inner platform’. Information about the items is what inner platforms are normally missing.
  2. For simple information: avoid objects and attributes in the endeme tables. Have only endeme strings, labels, and ids.
    • have the endeme ids in the tables (objects) directly
    • the endeme is the value
  3. For endemized attributes/data:
    avoid objects in the endeme tables; each endeme for the data is its attribute
    – the endeme is the attribute, the endeme’s value is the data, the label is the name of the attribute
  4. For endemized objects: this would be a generic database and a full inner platform.
    I don’t know how to avoid building an inner platform here. I need much more experience working with 1, 2, and 3 above before I can hope to work with endemized objects without suffering the inner platform effect. The trick is that, in order to get this experience, I need to build a bunch of inner platforms, have them die their expected gory death due to the inner platform effect, and eventually figure out how to build a true information programming platform that does not suffer from it.
    This needs level 4, which cannot be done properly until level 3 is well understood.
    This would build an actual inner platform.
    – endeme lists would avoid the problem of not using the database for some of its excellent features.
    This is because the database would need to support custom indexes (since information has no order) and functions and triggers to power those indexes (since the database does these very well).
  5. Level 4 programming will be based on an inner platform. We need to build an inner platform for level 4 information programming. This will allow us to use knowledge representation in programming. Note to self: look to see if there are any successful knowledge representation inner platforms out there. OWL comes to mind.
  6. AI programming will be based on the knowledge representation programming and is possible once an inner platform is built for level 4 programming. The inner model has a column for id, key/field, and value.

More thoughts on avoiding the inner platform effect

  • how do I avoid the inner platform effect?
    • keep the inner platform really sparse in features
    • do a hybrid system
    • have columns for id (must have), field (in only its information format), and value (must have)
      • access by information, not ‘table’ and ‘key’
      • but an endeme list has endemes with labels, access by information, the endeme makes it different
      • have no search attributes, put these in regular table columns
    • don’t let a user modify it, only for programmers
    • build a solid infrastructure to hide the implementation details
    • implement in really slow steps, taking years and learning how to do each step well before moving on to the next one
  • how do I use the RDB for what it’s good for?
  • the question is what can I do with information?
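
A rough sketch of the hybrid idea from the list above, using Python's built-in sqlite3 module. The table names customer and customer_info, and the plain-text strings standing in for the information-format field, are placeholders I made up. Real endeme matching is not shown. The search attributes stay in regular, indexed columns, and the sparse side table has only id, field, and value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Regular relational table: searchable attributes stay in real columns.
    CREATE TABLE customer (
        id    INTEGER PRIMARY KEY,
        name  TEXT NOT NULL,
        city  TEXT NOT NULL
    );
    CREATE INDEX customer_city ON customer (city);

    -- Sparse inner table: only id, field and value, nothing else.
    CREATE TABLE customer_info (
        id    INTEGER NOT NULL REFERENCES customer (id),
        field TEXT NOT NULL,   -- placeholder for an information-format key
        value TEXT NOT NULL,
        PRIMARY KEY (id, field)
    );
""")
conn.execute("INSERT INTO customer VALUES (1, 'Ada', 'London')")
conn.execute("INSERT INTO customer VALUES (2, 'Grace', 'New York')")
conn.executemany(
    "INSERT INTO customer_info VALUES (?, ?, ?)",
    [
        (1, "preferred contact", "email"),
        (2, "preferred contact", "phone"),
        (2, "billing note", "invoice quarterly"),
    ],
)


def info_for_city(city):
    """Search on a regular indexed column, then pull the sparse items."""
    return conn.execute(
        """
        SELECT c.name, i.field, i.value
        FROM customer AS c
        JOIN customer_info AS i ON i.id = c.id
        WHERE c.city = ?
        ORDER BY i.field
        """,
        (city,),
    ).fetchall()


print(info_for_city("New York"))
```

The relational database still does the searching and joining, which is what it is good at. The inner table stays sparse in features.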

Building a Successful Inner Platform

The inner platform effect reflects an intuition by programmers that we need to build a level above the data levels. However, without information oriented programming theory to base it on, inner platforms generally implement only things that already exist. Thus they usually fail to build a level above the data levels.

I want to take the inner platform above the database, the OS, and the object oriented program level. The challenge is to have the inner platform reach out to the real platform it replaces in order to use the performance facilities of that platform. The inability of an inner platform to do this is its most common downfall.

An inner platform for a database:

  • table
  • column
  • row
  • relation
  • pk
  • value

An inner platform for object orientation/xml/json:

  • class
  • member
  • object
  • type
  • collection

An inner platform for an operating system:

  • path
  • folder
  • filename
  • parent
  • child

Generic Data Models

– object     attribute     value

From C2: Definitely not an anti-pattern, and it can often greatly simplify table design for some corner scenarios. Apart from giving rapid prototyping benefits it also helps with handling ‘jagged’ data. Instead of introducing null columns using GenericDataModel enables us to save only relevant data thereby actually decreasing complexity. Reporting and manual labor against the table can be handled via views as mentioned above. not

Entity–attribute–value model

  •     The entity: the item being described.
  •     The attribute or parameter: typically implemented as a foreign key into a table of attribute definitions. The attribute definitions table might contain the following columns: an attribute ID, attribute name, description, data type, and columns assisting input validation, e.g., maximum string length and regular expression, set of permissible values, etc.
  •     The value of the attribute.
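
To make the three parts concrete, here is a small sketch using Python's built-in sqlite3 module. The table names attribute_def and eav, the helper functions, and the sample clinical attributes are assumptions for illustration. The attribute definitions table carries a name, description, data type, and a maximum length used for input validation, as described above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE attribute_def (
        attribute_id INTEGER PRIMARY KEY,
        name         TEXT NOT NULL UNIQUE,
        description  TEXT,
        data_type    TEXT NOT NULL,   -- 'text' or 'number' in this sketch
        max_length   INTEGER          -- validation aid, may be NULL
    );
    CREATE TABLE eav (
        entity_id    INTEGER NOT NULL,
        attribute_id INTEGER NOT NULL REFERENCES attribute_def (attribute_id),
        value        TEXT NOT NULL,
        PRIMARY KEY (entity_id, attribute_id)
    );
""")
conn.executemany(
    "INSERT INTO attribute_def VALUES (?, ?, ?, ?, ?)",
    [
        (1, "heart_rate", "beats per minute", "number", None),
        (2, "allergy", "reported allergy", "text", 40),
    ],
)


def set_value(entity_id, attribute_name, value):
    """Validate a value against its attribute definition, then store it."""
    row = conn.execute(
        "SELECT attribute_id, data_type, max_length "
        "FROM attribute_def WHERE name = ?",
        (attribute_name,),
    ).fetchone()
    if row is None:
        raise KeyError(attribute_name)
    attribute_id, data_type, max_length = row
    text = str(value)
    if data_type == "number":
        float(text)  # raises ValueError when the value is not numeric
    if max_length is not None and len(text) > max_length:
        raise ValueError(f"{attribute_name} is longer than {max_length}")
    conn.execute(
        "INSERT OR REPLACE INTO eav VALUES (?, ?, ?)",
        (entity_id, attribute_id, text),
    )


def get_entity(entity_id):
    """Gather the sparse rows for one entity back into a dictionary."""
    rows = conn.execute(
        "SELECT d.name, e.value FROM eav AS e "
        "JOIN attribute_def AS d ON d.attribute_id = e.attribute_id "
        "WHERE e.entity_id = ?",
        (entity_id,),
    ).fetchall()
    return dict(rows)


set_value(101, "heart_rate", 72)
set_value(101, "allergy", "penicillin")
set_value(102, "heart_rate", 64)
print(get_entity(101))
print(get_entity(102))
```

Entity 102 simply has no allergy row, so no null column is needed. The cost is that every value is stored as text and the validation lives in code rather than in the schema.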

From Wikipedia: As noted above, EAV modeling makes sense for categories of data, such as clinical findings, where attributes are numerous and sparse. Where these conditions do not hold, standard relational modeling (i.e., one column per attribute) is preferable; using EAV does not mean abandoning common sense or principles of good relational design. In clinical record systems, the subschemas dealing with patient demographics and billing are typically modeled conventionally. (While most vendor database schemas are proprietary, VistA, the system used throughout the United States Department of Veterans Affairs (VA) medical system, known as the Veterans Health Administration (VHA), is open-source and its schema is readily inspectable, though it uses a MUMPS database engine rather than a relational database.)

We Need an Inner Platform for Information

We need an inner platform because the database and the language do not provide what we need, and we need a platform that will.
The problem is based on members.

When to Avoid EAV Models

We should probably avoid the EAV model. The definition of an EAV model is having tables in a database that contain three columns: object, attribute, value.

  • Either do information full up in a database or don’t do it at all in a database. Databases are designed for data.
  • Of course you have to store your information somewhere.
  • So something like an inner platform is needed for that.
  • But you have to avoid the inner platform effect, which is to create a platform that does pretty much the same thing as the platform used to build it, for example creating a database in code and using a database to store it.
  • To avoid this you need to provide significant additional value in your inner platform. Information orientation allows you to do this.

The DRY Principle and Information

DRY

In software development, DRY stands for Don’t Repeat Yourself. Conventionally this principle gets implemented by frameworks like ASP.NET MVC and Ruby on Rails by having a single source for column/field names, either the database or a class.

Framework Challenges

The challenge with these systems is to manage the sources. This requires high integration, and makes it very difficult for these frameworks to play with others. They are also very solidly based in a data oriented approach, meaning that they institutionalize hard coded context. With hard coded fields even when there is only one source for each, framework APIs become inflexible.

An Information Based DRY Framework

An information approach would treat each field as something to be processed and characterized rather than just as a field name and a type and other hard coded attributes. Information could provide flexible relationships between fields, flexibly created new fields, and control for the source of the field names, labels, types, and parsers.

Interacting with Another Framework

An endemized English vocabulary could be used to come up with an automatic signature for each of the columns in a database, each of the fields in entity framework, and each of the members of various data classes. These could be glommed. I have an endemized signature vocabulary for about 5000 words already available. Adding synonyms and two word definitions, and including another 1000 very common database column words could push this to 10000. Relationships could be identified from class memberships, operations and stored procedures. In this way an endemized framework could interact with MVC and Ruby on Rails without a lot of code written (once this framework was written).
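
A very rough sketch of the matching step, without any endemes in it. Here a "signature" is just the set of normalized words found in a column or member name. The tiny SYNONYMS table and the sample names are made up, and a real endemized vocabulary would replace the word sets with endemes.

```python
import re

# Hypothetical stand-in for a real vocabulary: common abbreviations only.
SYNONYMS = {"amt": "amount", "cust": "customer", "num": "number", "no": "number"}


def words(name):
    """Split snake_case, camelCase, and PascalCase names into lowercase words."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return [part.lower() for part in parts]


def signature(name):
    """The set of normalized words in a name, ignoring order and casing."""
    return frozenset(SYNONYMS.get(word, word) for word in words(name))


def match(db_columns, class_members):
    """Pair each database column with the class members that share its signature."""
    pairs = []
    for column in db_columns:
        for member in class_members:
            if signature(column) == signature(member):
                pairs.append((column, member))
    return pairs


db_columns = ["cust_id", "order_total_amt", "created_on"]
class_members = ["CustomerId", "OrderTotalAmount", "ShippedOn"]
print(match(db_columns, class_members))
```

Exact set equality is the crudest possible comparison. The hope with endemes is that similar but not identical meanings could be related as well.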

An API to Anything

A further DRYing of code could happen by making the endemized framework a source. Once the framework above works, then this step becomes simple. And the endemized information framework can be a conduit to the other sources. We can build an API for interacting with the endemized framework giving other systems access to fields and rich metadata about fields in apps and databases. Endemizing provides flexible metadata, so there could also be an API to adapt the generalized framework to specific sources.

Glomming a Database Part II

Once you have the endeme sets, you can automatically build a UI to access the information and the data:
– the data with endeme based context UIs; these ‘views’ would have to be generated in a big info searchable list, along with context-info ‘advanced’ search ‘screens’
– the information with endeme UIs, returning summaries, totals, endematic profiles, data mined correlations, etc.
– once you have a row you can edit it
– you can also edit the columns of a list given specially selected data fields and endeme metadata

data: a big honkin list of database data column paths including column, path, endeme(s)
info: a big honkin list of database information paths including endeme, column(s), path(s)
context: a big honkin list of database context including join(s), table(s), path(s), column(s), endeme(s)
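
A sketch of how the data and context lists might be started, assuming a SQLite database and Python's built-in sqlite3 module. The function names and the sample schema are made up. The endemes slot is left empty, because filling it is the glomming work itself.

```python
import sqlite3


def user_tables(conn):
    """Names of the ordinary tables, skipping SQLite's internal ones."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    )
    return [name for (name,) in rows]


def data_list(conn):
    """One entry per column: column, path, declared type, and empty endemes."""
    result = []
    for table in user_tables(conn):
        # table_info rows are (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, col_type, _notnull, _default, _pk in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            result.append({
                "column": column,
                "path": f"{table}.{column}",
                "type": col_type,
                "endemes": [],  # to be filled in by a later step
            })
    return result


def context_list(conn):
    """One entry per declared foreign key, written as a join path."""
    joins = []
    for table in user_tables(conn):
        # foreign_key_list rows start with (id, seq, table, from, to, ...).
        for row in conn.execute(f'PRAGMA foreign_key_list("{table}")'):
            ref_table, from_col, to_col = row[2], row[3], row[4]
            joins.append(f"{table}.{from_col} -> {ref_table}.{to_col}")
    return joins


demo = sqlite3.connect(":memory:")
demo.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer (id),
        total       REAL
    );
""")
for entry in data_list(demo):
    print(entry["path"], entry["type"])
print(context_list(demo))
```

This only finds the joins that are declared as foreign keys. Implicit relationships would have to come from the stored procedure and inline SQL analysis.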

combine databases?

Unfortunately we may have to get level 4 working before we can do this properly.
Fortunately this project could be a good test bed for getting level 4 working.

Glomming a Database – Part I

The first thing to do when wrapping and glomming a database is to extract the endeme sets.

Part I – Extracting the endeme sets

It would be nice to build a tool that extracts candidate endeme sets from a database in preparation for glomming it. Then you could build an endematic wrapper around an entire database.

Process the items of each column to identify the endeme sets:

  • use a Levenshtein distance matrix to coalesce similar strings.
  • use my endemes for 5000 words document to coalesce similar meanings.
  • use more approaches based in data science.
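
The first approach can be sketched as follows. This is a simple greedy grouping on edit distance, not a tuned clustering method. The threshold of 2 and the sample column values are assumptions.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings, computed one matrix row at a time."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def coalesce(values, max_distance=2):
    """Greedily group values whose normalized text is close to a group's first item."""
    groups = []
    for value in values:
        key = value.strip().lower()
        for group in groups:
            if levenshtein(key, group[0].strip().lower()) <= max_distance:
                group.append(value)
                break
        else:
            # No existing group was close enough, so start a new one.
            groups.append([value])
    return groups


column = ["Pending", "pending ", "Pendng", "Shipped", "shiped", "Cancelled", "Canceled"]
for group in coalesce(column):
    print(group)
```

Each resulting group is a candidate item for an endeme set. Short values need a smaller threshold, since a distance of 2 can merge unrelated three-letter codes.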

Lookup tables may be implicit or explicit.

Table Column analyses

Column analyses:

  • general content/data columns
    – mostly unique items columns [U] – no endeme sets extracted
  • id/plumbing column
  • context column
  • lookup id column
  • implicit lookup table columns
    • Bits – multiple bit columns ‘column’, process multiple items of multiple columns with same type (bit) [B] – 1 set extracted
    • Conflated – two or more endeme sets [C] – 2+ sets extracted
    • Denormalized – zero normal form column [D] – 1 or 2 sets extracted
    • Endematic – endematic range – 16-32 rows [E] – 1 set extracted
    • Few – few rows – under 8 [F] – a fraction of a set extracted
    • Many different items column [M] – 1 set extracted
  • Freetext
    – R 1+  freetext column with repeating words
    – R 1+  freetext column with repeating concepts
  • Look for concept sets, concepts have additional structure that endemes do not have
    – concepts are generally characterized by two, 3 or 4 endeme sets in a row or chain

Lookup table analyses

Explicit lookup table analyses: process lookup ids in a database to build stuff

  • explicit lookup table content column
    • Bits – multiple bit columns ‘column’, process multiple items of multiple columns with same type (bit) [B] – 1 set extracted
    • Conflated – two or more endeme sets [C] – 2+ sets extracted
    • Denormalized – zero normal form column [D] – 1 or 2 sets extracted
    • Endematic – endematic range – 16-32 rows [E] – 1 set extracted
    • Few – few rows – under 8 [F] – a fraction of a set extracted
    • Many – many rows – over 64 – [M] 1 set extracted
    • Unique – most rows are unique – no endeme sets extracted
  • where it’s used as context
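
The size-based codes in the two lists above can be turned into a first-pass classifier. The sketch below applies them to the distinct values of a column that holds repeated items. The 0.9 ratio for "mostly unique" is my own guess. Counts of 8 to 15 and 33 to 64 are not covered by the lists, so they come back as '?'. The Bits, Conflated, and Denormalized cases need more than counting and are not attempted.

```python
def classify_column(values, unique_ratio=0.9):
    """Return F, U, E, M, or '?' based on how many distinct values a column has."""
    values = [v for v in values if v is not None]
    if not values:
        return "?"
    distinct = len(set(values))
    if distinct < 8:
        return "F"  # Few: a fraction of a set
    if distinct / len(values) >= unique_ratio:
        return "U"  # mostly Unique: no endeme sets extracted
    if 16 <= distinct <= 32:
        return "E"  # Endematic range: 1 set
    if distinct > 64:
        return "M"  # Many: 1 set
    return "?"      # sizes the lists above do not name


print(classify_column(["open", "closed", "hold"] * 10))
print(classify_column([i % 20 for i in range(200)]))
print(classify_column(list(range(500))))
print(classify_column([i % 100 for i in range(1000)]))
```

The four calls exercise the F, E, U, and M branches in that order.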

Junction table analyses

  • there’s got to be something I can do with junction tables.

Views and Reports

Endemizing reports, views, and stored procedures that return data sets (‘report’ and ‘get’ sp’s)

  • Context profiles?
  • Endematic metadata – reports mostly show numbers, endemes can store relative values
  • The row is the endeme item, the endeme indicates how it ‘compares’/’relates’ to other items

Other stored procedures (and inline SQL code)

These may be used to identify contexts and relationships between tables and columns.

 

I wonder if I can do the same thing with code?

What Kind of Software Developer Are You?

Technology    +-------------+          User
Focus         | User        |         Focus
              | Interface   |
+-------------+-------------+-------------+
| Integration | Object      | Information |
| Orientation | Orientation | Orientation |
+-------------+-------------+-------------+
              | Data        |
Network       | Storage     |            BI
Focus         +-------------+         Focus

There are five kinds of software developers. Many developers cover more than one kind. Object Orientation is the core of programming. The value of specialization is that you can get more work done, with higher quality and better service to users, if you assign tasks to the software developers that specialize in each kind of task. The value of information is that it can tell you what specializations are needed for each task, and which software developers have which specializations. The value of endemes is that they provide a framework for seamless specialization information gathering and use.

  • Integration oriented developers specialize in IT, integrating systems, using pre-built systems, using frameworks, maintenance programming and troubleshooting, new technology integration, and system architecture.
  • Object oriented developers specialize in data structures, architectures, framework building, middle tier development, computer languages, testing, and UML.
  • Information oriented developers specialize in domain based design, user needs, endemes, knowledge representation, business intelligence, business rules, business needs and middle tier development, user concepts, and information modeling.
  • User interface developers specialize in user interface coding, layout, UI design, UX, usability, mobile, web, desktop and user concepts.
  • Data storage developers specialize in databases, SQL, NoSQL, performance, data modeling, load balancing, and database administration.