>Why are we asking a programmer to understand engines?
>
The programmer has to build a system that interacts with the engine, among other things; that is what the specification I gave as an example is about.
>Your assumptions that humans will always write bad documents because they are human
is a poor judgement on the humans.
>
It is a judgement of their capacity, and what happens when they exceed their capacity. If they are directed to do something outside their capacity, it is the fault of the director.
>I see a number of problems with the processes you are talking about that would have
to be fixed before applying technology.
>- The human writing the stuff should be a knowledgeable person. If they are not
(and you are implying they are not), then the cause is lost at the start regardless of
what technology is thrown at it (garbage in, garbage out).
>
I talked about some highly competent and conscientious people and "expert in the large organisation" being unable to see particular sorts of errors; I can't see any such implication there. I also talked about specialists not knowing other specialties in depth; who does?
>- The human writing the stuff should be trained to write documents.
This will prevent 99% of the garbage in the first place.
>
Most of the people are quite proud of their language expertise and feel bad about a hanging index. Stanley Fish considers himself an expert in training people in writing, but when he gave examples of different skills, it was clear he saw each particular skill as separate and independent. We rely on the person to do the integration, and can't train them in that, because we only do it intuitively ourselves. It is like some mathematics: we know the result they should get, but can only prompt them until they develop a process, because we can't describe it in detail. When they must integrate complex semantics together with a readable style, we should expect mistakes. Giving them feedback when they make mistakes might be one way of training them to do better.
> - The human should be writing the document for the target audience, not their own
gratification.
>
The person is writing it for their organisation, but it will not remain internal; it will be passed to another organisation, with different goals. There is a classic under way at the moment: the aerial tanker, which is being fought tooth and bloody claw, and may run out to $300 billion. Every word will be ruthlessly scrutinised by all parties for some leeway or leverage, with the top brass, politicians, Uncle Tom Cobley and all involved. It won't be written for gratification; having your fingernails torn out would probably be bliss in comparison.
>- Headings are supposed to
make sense - this is a basic writing skill. (see the previous points).
>
Mostly they do, but we are talking about high reliability, not mostly. People expect high reliability in their machines, but aren't keen on it when it comes to themselves; it is onerous and demanding.
>I pick out two trends in your discussion:
- Humans are useless at anything that is over a couple of paragraphs long.
>
If you have ever worked on QC and reliability analysis, you will be familiar with how unsettled and defensive people become when you start analysing their performance. So why don't we work out their capacity?
First, a mechanistic example.
Lathes range in size from something that would fit in your hand to something that needs a large building to house it. They are rated on the diameter, length and mass they can handle. If you exceed their capacity, the quality deteriorates very rapidly; you have to get a bigger one. The really big ones come in pieces, which are assembled onsite for a particular job. There have been standards for a hundred years, so quality of output is known in advance; the field is mature.
So, what is the capacity of a person? They have to be able to bring elements together and fire them over a period of about 200 msec, otherwise they can't form connections. The number they can handle is also limited by the fetch time (the 5 to 9 range is for simple numbers, but is emblematic). As an example:
"Available Hours" means the period defined in clause 4.3.1.1.
When the person read that, clause 4.3.1.1 didn't exist in their head, so when they attempt a fetch on Available Hours used somewhere else, they have a messy and time-consuming fetch into clause 4.3.1.1, meaning what they were attempting to synthesise will fail while they establish a connection here. Once they have established the connection, they can make a fast fetch, but they are no longer using the specification; they have introduced a shortcut. Other terms are redefinitions of common words: Equipment, Run, etc. These connections must be maintained over the top of deeply ingrained meanings. Let's say they can do this while they work on the document uninterrupted; how many pages can they read, and in what time? Let's say they can do lunch while keeping it together. Perhaps 10 pages at maximum reliability, assuming no long break or frequent interruptions. When they come back tomorrow, they would need to re-establish the links. Some wretch may have changed the reference to 4.3.1.7, so their shortcut is invalid, and they have to ensure they don't start thinking of run where it says Run, or get mixed up with another document they glanced at which had different definitions.
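The stale-reference case above is one of the few parts of this that can be checked mechanically. A minimal sketch, assuming a plain-text specification in which clause headings start a line with a dotted number and definitions take the form `"Term" means ... clause N.N.N`; both conventions, the sample text and the name `dangling_definitions` are hypothetical, for illustration only:

```python
import re

# Assumed convention: a clause heading starts a line with a dotted number.
CLAUSE_HEADING = re.compile(r"^(\d+(?:\.\d+)*)\s+\S", re.MULTILINE)

# Assumed convention: "Term" means ... clause N.N.N (all on one line).
DEFINITION = re.compile(
    r'"([^"\n]+)"\s+means\b[^\n]*?\bclause\s+(\d+(?:\.\d+)*)',
    re.IGNORECASE,
)


def dangling_definitions(spec_text):
    """Return (term, clause) pairs whose target clause has no heading."""
    known = set(CLAUSE_HEADING.findall(spec_text))
    return [
        (term, clause)
        for term, clause in DEFINITION.findall(spec_text)
        if clause not in known
    ]


# Hypothetical fragment: the clause was renumbered to 4.3.1.7,
# but the definition still points at 4.3.1.1.
spec = '''\
1.2 Definitions
"Available Hours" means the period defined in clause 4.3.1.1.
4.3.1.7 Hours of availability
The Equipment shall be available between 0600 and 2200.
'''

stale = dangling_definitions(spec)
print(stale)
```

This only catches a reference to a clause that no longer exists. It says nothing about a reference that points at a real clause with the wrong content, which is the harder semantic problem.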
Unfortunately, some examples at 7 pages show low reliability, but let's say their incidence can be reduced with greater use of resources. The window for new connection is not amenable to large change; drugs can move it about 20%. People can run it "hot" by pumping it, but that is very tiring.
We could say no specifications bigger than 10 pages, but our masters want a bigger gadget, and it can't be described in 10 pages; it takes a hundred pages. Humans don't come in a range of guaranteed capacities, like lathes do, so what to do? We could accept that the quality will be low, and that it will cost extra because of it. Try telling that to the program manager (or even the government, who may prefer to spend the money on pensions. In fact they do; the days of unlimited money for stuffups are over, so there has to be a ratcheting up of quality). Take the pieces analogy for big jobs: can we use a team? That means we have to fit them together, or give them separate independent tasks, when it is the integrity of the overall specification we are interested in, not how well each section is written. We could introduce draconian controls on what goes into the document, but that slows down its creation, and the things being specified are often obsolete when they emerge from manufacturing now.
If you don't like my estimate, work out one of your own, but don't assume infinite capacity, or the magic of extra training; look at the world as it is, or can be made to be by practical steps. People work on FPS up to 300 pages, and contracts up to 1000 pages, with thousands of pages in annexes. Outsourcing contracts, at about 600 pages, are a good example of a document that no-one understands (the lawyer who did it says so, but it doesn't matter, they will get paid to sort out the misunderstandings).
>- Applying a little bit of technology to assist qualifies it as
"automatic".
>
"A little bit" may not be accurate; it employs self-extension, which is unusual, except for 6 billion other machines.
>Both of these trends are in error (look at all the good work that is being done in
the world at the moment by humans) are throwing the discussion to somewhere less useful.
>
You see the glass half full, including global warming and the GFC. Fighting self-caused conflagrations is not good work. The world would be different if we understood our limits and worked around them. To quote Dirty Harry, "a man's gotta know his limitations". Remember the Masters of the Universe on Wall Street: was that good work, or ego-driven foolishness? Can I mention again that 99% of new science is junk? The good work is far outweighed by dross, but it lives on while the dross does not, just like any mining operation. Could we reduce the dross-to-gold ratio with some QC? Probably.
The QC or Reliability Analysis person has to see the glass as half empty: what can go wrong? And then Moore's Law takes care of the rest.
I understand why people are discomfited about their errors being exposed; I don't like it either. What happens next is what is important: do they see it as an opportunity to improve what they do, or something to fight? If they can make it so the machine never finds an error, without becoming bogged down themselves, fine; that is the best sort of QC.
>What would be needed to move towards the automation quality control of knowledge
creation? And why would we do it?
>
The latter is easy to answer: because we make a mess of it when we go outside our competence envelope, and yet we are driven outside it by the complexity of modern life.