List of Archived Posts

2004 Newsgroup Postings (5/15 - 5/27)

c.d.theory glossary (repost)
c.d.theory glossary (repost)
Quote of the Week
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
The Network Data Model, foundation for Relational Model
racf
racf
racf
command line switches [Re: [REALLY OT!] Overuse of symbolic
1.7rc2 & developer's build/install
racf
command line switches [Re: [REALLY OT!] Overuse of symbolic
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
IBM 7094 Emulator - An historic moment?
layered approach
Infiniband - practicalities for small clusters
Why does Windows allow Worms?
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
command line switches [Re: [REALLY OT!] Overuse of symbolic
command line switches [Re: [REALLY OT!] Overuse of symbolic
Infiniband - practicalities for small clusters
command line switches [Re: [REALLY OT!] Overuse of symbolic
[Meta] Marketplace argument
[Meta] Marketplace argument
[Meta] Marketplace argument
vm
MITM attacks
Usenet invented 30 years ago by a Swede?
MITM attacks
Usenet invented 30 years ago by a Swede?
Questions of IP
MITM attacks
Why doesn't Infiniband supports RDMA multicast
Infiniband - practicalities for small clusters
Who said "The Mainframe is dead"?
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
can a program be run withour main memory ?
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
Finites State Machine (OT?)
Infiniband - practicalities for small clusters
Random signatures
can a program be run withour main memory?
Blinkenlights?
before execution does it require whole program 2 b loaded in
Leaf routines
Infiniband - practicalities for small clusters
[HTTP/1.0] Content-Type Header
Infiniband - practicalities for small clusters
Phone # layout vs Computer keypad # layout???!???!!!
Text Adventures (which computer was first?)
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
Infiniband - practicalities for small clusters
before execution does it require whole program 2 b loaded in
before execution does it require whole program 2 b loaded in
before execution does it require whole program 2 b loaded in

c.d.theory glossary (repost)

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: c.d.theory glossary (repost)
Newsgroups: comp.databases.theory
Date: Sat, 15 May 2004 09:14:38 -0600

mAsterdam writes:

Narrowing this down:

The glossary is a list of items that led to mutual misunderstandings
in the c.d. theory newsgroup. It is built from contributions.  The
newsgroup uses terms from database design, implementation, operation
and change management, cost sharing, productivity research, indexing
and cataloging database literature, and/or basic database research.

The glossary's purpose is to limit lengthy misunderstandings.  It
consists of signposts: watch out! You may think the OP means A but
she might mean B. Alternative names and views of the same concept
are only introduced when the danger of mutual misunderstandings is
apparent. When context matters, it is provided. The glossary is a
highly biased list of problematic concepts.

slight drift ... the nlm has books, articles, papers ... there is
essentially an online (card?) catalog for the library. umls is sort of
the structured set of words used for the catalog. it is sort of
structured into somewhat hierarchy of concepts, terms, and word
sequences.  however there is also mesh of complex many-to-many
relationships between concepts. there are tens of thousands of
concepts, hundreds of thousands of terms, and millions of word
sequences.

this is sort of independent of having any definitions for the
concepts, terms, and/or words.  if you have a set of words that you
might want to look for an article with ... umls gives other related
words, terms, and/or concepts that might also be used to search for
articles.

it is also used by the people cataloging the library ... lots of
listed terms and word sequences have preferred relationships, i.e.  if
an article abstract contains certain set of terms and/or word sequences, there
are guidelines about preferred terms to be used for
indexing/cataloging. this structure of preferred/nonpreferred relationships
can also be used for people looking up entries in the catalog

at this level, umls is effectively the structure used for
understanding the cataloging of the articles (as opposed to
understanding the articles themselves).

there was some statement that nlm reached the state of many current
search engines possibly by the late '70s. a boolean term search would
be quite bimodal, at six to seven terms there could still be hundred
thousand hits ... but adding one more term dropped the number of hits
to zero.  the holy grail was finding magic combination of five to
eight terms that resulted in 50-100 hits. in the early 80s, an online
interface (grateful med) was developed that by default didn't ask for
the hits but just the number of hits. then a 2-3 day task might be to
discover the magic query combination that resulted in a reasonable hit
result (say greater than zero but less than several hundred).
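The count-first workflow described above can be sketched roughly as follows. This is only an illustration: the tiny inverted index, the sample terms, and the helper names count_hits and refine are all invented here and say nothing about how the nlm catalog or grateful med were actually implemented.

```python
# Hypothetical inverted index: term -> set of article ids (invented data).
INDEX = {
    "heart": {1, 2, 3, 4, 5, 6},
    "surgery": {2, 3, 4, 5},
    "pediatric": {3, 4},
    "laser": {9},
}


def count_hits(terms):
    """Return only the NUMBER of articles matching all terms (boolean AND)."""
    sets = [INDEX.get(term, set()) for term in terms]
    if not sets:
        return 0
    return len(set.intersection(*sets))


def refine(terms, candidates, low=1, high=3):
    """Add candidate terms one at a time, asking only for hit counts.

    A candidate is kept only if the count stays at or above `low`;
    refinement stops once the count is at or below `high`.
    """
    query = list(terms)
    for extra in candidates:
        if count_hits(query) <= high:
            break
        if count_hits(query + [extra]) >= low:
            query.append(extra)
    return query, count_hits(query)


print(count_hits(["heart", "surgery"]))           # many hits
print(count_hits(["heart", "surgery", "laser"]))  # one more term: zero hits
print(refine(["heart"], ["laser", "surgery", "pediatric"]))
```

The point of asking for counts only is that each probe is cheap, so the search for a workable term combination can proceed without pulling back thousands of records (or none at all).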

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

c.d.theory glossary (repost)

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: c.d.theory glossary (repost)
Newsgroups: comp.databases.theory
Date: Sat, 15 May 2004 09:41:04 -0600

at one point, i heard that there were something like 40,000 medical
librarians around the world that specialized in assisting doctors and
medical researchers in doing nlm lookups.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Quote of the Week

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Quote of the Week
Newsgroups: comp.databases.theory
Date: Sat, 15 May 2004 13:55:18 -0600

jcelko212@earthlink.net (--CELKO--) writes:

1) Someone sent Chris Date one of my very old SQL puzzles and wanted
help with it.  Chris answered it with his personal programming
language rather than Standard SQL.  The problem involved displaying
the hire date and last promotion date for each employee in a
personnel.  The specs were to use a NULL, if the employee was a new
hire.  Since Chris Date's version of the Relational Model does not
have NULLs, he used an arbitrary dummy date instead and thus created
false information.

posting a year ago in this n.g.
http://www.garlic.com/~lynn/2003g.html#40 How to cope with missing values - NULLS?

mentioning a long ago and far away article by date (1992) titled "An
Explanation of why three-valued logic is a mistake" ... related to
the handling of NULLS in SQL.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Sun, 16 May 2004 09:58:50 -0600

Anne & Lynn Wheeler writes:

Several hundred instructions in the kernel were SMP'ed for concurrent
operation ... basically a relatively thin layer in all the interrupt
interfaces and the dispatcher. On entry to the kernel, an attempt was
made to obtain the kernel lock, and if it couldn't ... rather than
spinning, it queued an extremely lightweight thread request and went
to the dispatcher looking for something else to do (aka rather than
spinning on the kernel lock, it "bounced" off the kernel lock and went
looking for non-kernel work to do).
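A rough sketch of the "bounce" idea, assuming Python threads as a stand-in for processors. The names kernel_lock, pending, and enter_kernel are made up for illustration, and this is not the VAMPS or vm/370 code; in particular it ignores the window where a request is queued just after the lock holder has finished draining the queue.

```python
import threading
from collections import deque

kernel_lock = threading.Lock()   # stands in for the single kernel lock
pending = deque()                # queued lightweight kernel requests


def enter_kernel(request, other_work):
    """Try the kernel lock once; on failure queue the request and move on."""
    if kernel_lock.acquire(blocking=False):
        try:
            results = [request()]
            # The lock holder also runs requests left by callers that bounced.
            while pending:
                results.append(pending.popleft()())
            return results
        finally:
            kernel_lock.release()
    pending.append(request)      # "bounce": queue the request, no spinning
    other_work()                 # go look for non-kernel work instead
    return None
```

The design choice being illustrated is that a processor that loses the race for the lock never waits; it leaves a small request behind and does something useful, and whoever holds the lock picks the request up.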

a side issue of SMP support was pricing for software. somewhat with
the unbundling of 6/23/69, services and application software started
being priced/charged ... as opposed to free. however, kernel software
was still free, somewhat under the theory that it was necessary for
the operation of the hardware.

i was somewhat working on the resource manager, ecps (extensive
microcode performance enhancements), and VAMPS all at the same time.
the resource manager was a big package of software that nominally
improved the resource allocation algorithms. however it also had
a bunch of structural changes to the kernel.

one of the things was that an infrastructure for automated benchmarks
was created (as part of calibrating the resource manager algorithms). in
preparation for the release, over 2000 benchmarks were run, taking
something like 3 months elapsed time. some of the benchmarks were
extreme stress tests that were possibly a factor of ten times outside
the normally observed operating situations. at the start these stress
tests were guaranteed to crash the kernel. eventually as part of the
resource manager, the whole kernel serialization structure was
completely rewritten ... which eliminated all observed crashes under
heavy stress load and also all known situations that involved
hung/zombie processes.

previously, in the cp/67 to vm/370 (port from 360/67 to 370) there had
been some kernel restructuring; some of the cp/67 that had been done
in support of multiprocessing was eliminated. for the resource
manager, i re-introduced the dispatching and paging related structures
that had been there from cp/67.

the other characteristic was that it was decided to make the resource
manager the guinea pig for charged/priced kernel code ... under the
revised guidelines that kernel code that wasn't directly required for
using some hardware could be charged for.

So the resource manager and ecps ships ... and VAMPS is canceled.  The
decision is then made to turn out a real vm/370 multiprocessor kernel
and to adopt the thin-layer multiprocessor design from VAMPS with
bounce lock and extremely lightweight kernel requests (rather than the
traditional kernel spin lock).

The only problem was that SMP support needed something like 80 percent
of the code that was in the resource manager (involved in various
restructuring). The issue was that they couldn't just make the
resource manager a prerequisite for SMP support ... since the resource
manager was charged for ... and the business guidelines required that
kernel software directly needed to support hardware (in this case
multiprocessor operation) was still free (and free software with a
prerequisite for priced software wasn't an option).

The solution was to remove the 80 percent or so of the code from the
resource manager needed for SMP support and place it in the base
(free) kernel. They then continued to charge for the five times
smaller resource manager (in terms of lines of code) at the same
price as the original resource manager.

all of this seems trivial now that it has evolved that all (mainframe)
software is charged for (including all kernel software) .... but at
the time it represented an enormous issue.

misc. past resource manager related posts
http://www.garlic.com/~lynn/subtopic.html#fairshare
and paging system posts
http://www.garlic.com/~lynn/subtopic.html#wsclock
and smp posts
http://www.garlic.com/~lynn/subtopic.html#smp
and microcode posts
http://www.garlic.com/~lynn/subtopic.html#mcode
and benchmarking (in support of resource manager)
posts
http://www.garlic.com/~lynn/subtopic.html#bench

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Sun, 16 May 2004 10:11:32 -0600

Joe Seigh writes:

For simple synchronous kernel calls that return without exiting to
the dispatcher/scheduler, there are some additional complications.
You might need checkpointing for error handling.  If you can avoid
blindly doing a pipeline flush and maybe use some mainframe like commit
logic, you might be able to implement leaf kernel calls that are fairly
efficient.

i created something that I called fastpath for various cp/67 kernel
calls when i was undergraduate ... doing detailed feature/function
kernel path analysis of a bunch of paths thru the kernel and
identifying those that needed the minimum of kernel services ... and

a) handling them completely in the interrupt handler (and directly
resuming application from the interrupt handler) or

b) recognizing them in the dispatcher and providing special path thru
the dispatcher for application resume or

c) dispatcher recognizing that while it was longer kernel processing
it was still resuming the same application and being able to still
optimize some pathlength.

one of the remaining timing issues was just that the svc
interrupt/call from application state to kernel state also changed the
machine state from enabled for i/o & timer interrupts to disabled for
i/o & timer interrupts. on resume, the state of the machine then
changed back from disabled for interrupts to enabled for
interrupts. The change in machine state between enabled/disabled for
i/o & timer interrupts was an extremely expensive process by itself.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Sun, 16 May 2004 10:15:52 -0600

Anne & Lynn Wheeler writes:

a) handling them completely in the interrupt handler (and directly
resuming application from the interrupt handler) or

trivial example in this scenario was to avoid doing save/restore of the
floating point registers ... since the brief kernel processing
handling in the interrupt handler never touched them.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Sun, 16 May 2004 10:50:31 -0600

this is a report i gave at ibm user group meeting while
undergraduate on operating system/kernel speed up.
http://www.garlic.com/~lynn/94.html#18 CP/67 & OS MFT14
http://www.garlic.com/~lynn/94.html#20 CP/67 & OS MFT14

MFT14 was the main batch operating system that was used at the
university for the bulk of the work. a characteristic was that it was
heavily disk bound ... design point that assumed everything was
heavily real memory constrained and so everything had to be done using
disks. I speeded up nominal university workload thruput by a factor of
three times by carefully positioning data & files on disk to optimize
disk arm.

CP/67 was virtual machine operating system from cambridge science
center:
http://www.garlic.com/~lynn/subtopic.html#545tech

some people from cambridge had installed it at the university january
of 1968 ... and the above referenced presentation was made august of
'68.

when running MFT14 in virtual machine the main issues for CP/67
were in the

1) SVC interrupt handler.

MFT applications were using SVC interrupts to call the MFT kernel;
under CP/67 they interrupted into the CP/67 SVC interrupt handler and
a SVC interrupt into the virtual machine had to be simulated, resuming
the virtual machine at the virtual svc interrupt address.

2) program interrupt handler

MFT kernel would execute a large number of "privileged" 360 instructions.
CP/67 ran all virtual machines in non-privileged mode which resulted in
privileged instructions interrupting into the CP/67 kernel. The CP/67
kernel then had to simulate the privileged instruction and resume virtual
machine execution

3) dispatcher

standard process for resuming virtual machine execution was the
dispatcher which all kernel processes would eventually converge on.
the dispatcher, in addition to actually activating the execution of
virtual machine, also managed the selection of virtual machine to run
as well as managing all the kernel thread infrastructure and
misc. other tasks.

...

The fastpath work and misc. other kernel work that I did between
january and august of '68 ... reduced the kernel cpu utilization by
nearly 80 percent:

before: 534 cpu secs in the cp/67 kernel
after:  113 cpu secs in the cp/67 kernel.
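As a quick arithmetic check on the two figures above (nothing here beyond the numbers quoted in the post):

```python
before_secs = 534   # cp/67 kernel cpu seconds before the changes
after_secs = 113    # cp/67 kernel cpu seconds after the changes

reduction = (before_secs - after_secs) / before_secs
ratio = before_secs / after_secs

print(f"reduction: {reduction:.1%}")   # fraction of kernel cpu time removed
print(f"ratio: {ratio:.2f}x")          # before/after ratio
```

That works out to a reduction of a little under 79 percent, which matches the "nearly 80 percent" in the text.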

the measurements were by running the MFT14 on the real machine and
clocking real time and then repeating under CP/67 and clocking real
time. The MFT14 workload was run in a virtual machine environment
where all of the virtual machines pages were resident in real memory
and CP/67 performed no paging operations during the period.

The increase in real time between real machine operation and virtual
machine operation is totally attributable to CPU utilization by the
CP/67 kernel operation ... not necessarily solely instruction
pathlength, since it would also include the overhead for things like
the privileged instructions from the virtual machine interrupting into
the CP/67 kernel.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

The Network Data Model, foundation for Relational Model

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: The Network Data Model, foundation for Relational Model
Newsgroups: comp.databases.theory
Date: Sun, 16 May 2004 14:58:09 -0600

"Ken North" writes:

I've never heard the term "network model" DBMS in any context other than the
CODASYL DBTG standard.

i was writing code for a "semantic network" DBMS (a network model with
some additional characteristics) out at the los gatos VLSI group about
the same time I was writing code for System/R ... the original RDBMS
at research (research lab, bldg 28 was about 10 miles from the Los
Gatos VLSI lab, bldg 29 ... the bldgs were consecutively numbered i
would guess based on when they were built, rather than strict physical
proximity).

The "semantic network" DBMS done by the VLSI tools group drew a lot
from Sowa's work on semantic networks. There has also been various
efforts over the years to map semantic networks into RDBMS technology
...  frequently the intermediate layer represents a 10:1 performance
overhead handling the semantic network representation to relational
representation. however, quick use of a search engine turns up some
number of explicit implementations.

a random reference turned up by search engine that happens to also
mention medical informatics and UMLS semantic networks ... even tho
there has been significant effort mapping UMLS semantic networks to
rdbms over the years:
http://courses.mbl.edu/Medical_Informatics/2001/outlinesFall2001.html

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

racf

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: racf
Newsgroups: bit.listserv.ibm-main
Date: Mon, 17 May 2004 13:36:21 -0600

tom.schmidt@ibm-main.lst (Tom Schmidt) writes:

DES uses a public key and a private key to verify authority.  In
RACF's case the public key is the 8-character userid and the private
key is the 8- character password.  (RACF pads both fields to
8-characters with blanks.)

DES uses symmetric, shared, secret key for encryption and decryption.

warning: long authentication topic drift.

typically a userid/password scenario is that you assert something ...
aka: assert you are authorized for the "userid" and you prove the
assertion by knowing the corresponding password.  the "userid" is used
for the authorization process, and the "password" is used for the
authentication process.

in 3-factor authentication paradigm:

• something you know
• something you have
• something you are

a password is a single factor, shared-secret, something you know
authentication. some security infrastructures may also depend on the
"userid" (used for authorization) also being kept totally secret and
therefore also becoming part of the something you know authentication
process.

asymmetric encryption has a pair of keys and uses different keys for
encryption and decryption. business processes establish convention
that a specific key is to be treated as public and the other half of
the key pair is to be treated as private and never divulged.

In DES, the shared-secret key isn't private (in the sense of the
asymmetric encryption business process of public/private keys) since
both the encryptor and the decrypter have to share knowledge of the
same key (although they may keep it secret from everybody else).

the use of "private" in the business application of asymmetric
encryption is, in effect, intended to be more restrictive control of
the key than just secret (aka public/private key systems don't
represent technology, asymmetric encryption represents technology;
public/private keys represent business process application of
asymmetric encryption).

it is possible to have two factor authentication:

• something you know
• something you have

where the something you know authentication can be private as
opposed to shared-secret.

the something you have can be some form of chip token that possibly
contains a private key that can never (practically) be removed from
the token. The private key in the token is used to "digitally sign"
(aka encrypt a hash) which is transmitted. If the receiver can verify
the digital signature with a recorded public key, then something you
have authentication can be demonstrated (since only the person in
possession of the hardware token could have generated the correct
digital signature). This is scenario where somebody walks in and says
instead of recording a password for my userid ... record a public key
... and instead of the system doing password compare ... uses the
public key to perform digital signature verification.
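A minimal sketch of that scenario, recording a public key against the userid and verifying a signature instead of comparing a password. Everything named here is an assumption for illustration: the toy RSA parameters (two known Mersenne primes, far too weak and structured for real use), the REGISTRY table, and the helper names. A real token would use vetted cryptographic hardware and a proper padding scheme. It needs Python 3.8 or later for the modular inverse via pow.

```python
import hashlib

# Toy RSA key pair, illustration only (NOT secure).
P = 2**61 - 1
Q = 2**89 - 1
N = P * Q
E = 65537
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent; stays inside the "token"


def digest(message: bytes, n: int) -> int:
    """Hash the message and reduce it into the RSA modulus range."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % n


def token_sign(message: bytes) -> int:
    """What the token does: 'encrypt' the hash with the private key."""
    return pow(digest(message, N), D, N)


def verify(message: bytes, signature: int, public) -> bool:
    """What the receiver does: check the signature with the recorded public key."""
    n, e = public
    return pow(signature, e, n) == digest(message, n)


# Server-side record: userid -> public key (instead of userid -> password).
REGISTRY = {"alice": (N, E)}


def login(userid: str, message: bytes, signature: int) -> bool:
    public = REGISTRY.get(userid)
    return public is not None and verify(message, signature, public)


msg = b"login request 2004-05-17"
sig = token_sign(msg)
print(login("alice", msg, sig))
print(login("alice", b"some other request", sig))
```

Note that REGISTRY holds nothing secret: copying it does not let anyone produce a valid signature, which is the contrast with a password file drawn in the post.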

Now, if the hardware token has been certified to only operate in a
specific way when the correct PIN has been entered ... the
authentication of the digital signature then not only implies
something you have authentication, but also something you know
authentication ... i.e. the recipient doesn't know the PIN ... but
knows the correct PIN was entered since the token appears to be
operating correctly.

In this way, something you know authentication can be demonstrated
w/o having to use shared-secrets between the sender and the
recipient.

There are horrendous problems with the shared-secret paradigm.

First off, since it is shared-secret, there is security requirement
that every unique security domain use a unique shared-secret. Specific
security domains, then in addition ask that the shared-secrets used be
very hard to guess ... and many of these security domains seem to
operate under the assumption that they are the only security operation
in the whole world. However, the reality is that people may
participate in scores of different security domains ... with the
result they have large tens of very hard to guess and remember unique
shared-secret passwords. Individuals then are forced to create
repositories of all their shared-secrets which become attractive
targets for criminals.

Second, institutions also tend to have repositories of significant
amounts of shared-secrets, which become attractive targets. In the
shared-secret scenario, criminals skimming/harvesting shared-secrets
repositories, are able to use the information to impersonate
people. small side drift on security proportional to risk:
http://www.garlic.com/~lynn/2001h.html#61
if passwords/shared-secrets were replaced in these repositories with
public keys .... criminals could harvest the repositories all they
wanted and still not be able to impersonate somebody by knowing
their public key (in the way they can impersonate individuals by
knowing their shared-secrets).

Finally, people have to know their shared-secrets and therefore are
prone to phishing (con-artists convincing people to divulge their
secrets in one way or another). hardware tokens using private keys
(that can't be extracted from the token) help address the problem. If
nobody can know the private key, then the owner of the hardware token
also won't know the private key .... and a con artist can't get
somebody to tell them something that they don't know.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

racf

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: racf
Newsgroups: bit.listserv.ibm-main
Date: Mon, 17 May 2004 13:57:28 -0600

tom.schmidt@ibm-main.lst (Tom Schmidt) writes:

The EFF took most of the fun out of DES several years ago though.

i've got one of the souvenir chips from the machine ... they didn't
take the fun out of DES ... they reduced the cost of doing a brute
force attack on a specific DES key ... showing that you can do a brute
force attack and find a specific key in small tens of hours.

this becomes a problem for using specific DES key for extended periods
of time, especially over a large domain, which might involve lots of
value.

so there are two countermeasures used in the financial world

1) dukpt ... derived unique key per transaction. basically things like
ATM machines that continue to use existing DES hardware ... but
wrapper it so every transaction over the ATM network uses a unique DES
key. The individual transactions are valued in the tens to low
hundreds of dollars and lifetime is typically measured in seconds to
minutes. the lifetime of the key is much shorter than existing
techniques to find a key ... and besides the value of the transactions
are much less than the cost to find the unique key for each
transaction.

2) triple DES ... basically an encrypt/decrypt/encrypt sequence using
different keys ... achieving 112bit strength with two different DES
keys. Each additional bit doubles the elapsed time for the brute force
attack on specific key. so if specific 56bit DES key can be found in
say ten hours ... then it takes 2**56 times as long to find a
112bit DES value (and/or costs 2**56 times as much)
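The two countermeasures above can be sketched as follows. Both functions are illustrations under stated assumptions. derive_transaction_key is NOT the actual dukpt algorithm; it uses HMAC-SHA256 as a stand-in derivation purely to show the idea of a fresh key per transaction, and its name and parameters are invented. brute_force_hours just applies the "each additional bit doubles the time" rule, starting from the post's hypothetical ten hours for a 56bit key.

```python
import hashlib
import hmac


def derive_transaction_key(base_key: bytes, device_id: bytes, counter: int) -> bytes:
    """Stand-in derivation: a distinct key for every (device, counter) pair."""
    info = device_id + counter.to_bytes(8, "big")
    return hmac.new(base_key, info, hashlib.sha256).digest()


def brute_force_hours(key_bits: int, base_bits: int = 56,
                      base_hours: float = 10.0) -> float:
    """Scale a brute-force estimate: each extra key bit doubles the time."""
    return base_hours * 2.0 ** (key_bits - base_bits)


# Per-transaction keys: same device, consecutive counters, unrelated keys.
base = b"k" * 16                      # made-up base key
k1 = derive_transaction_key(base, b"atm-0001", 1)
k2 = derive_transaction_key(base, b"atm-0001", 2)
print(k1.hex() != k2.hex())

# Brute-force scaling from the assumed ten-hour baseline.
HOURS_PER_YEAR = 24 * 365
for bits in (56, 57, 64, 112):
    hours = brute_force_hours(bits)
    print(f"{bits:3d} bits: {hours:.3g} hours ({hours / HOURS_PER_YEAR:.3g} years)")
```

In the first case the attacker's ten hours buys one key that protected one small transaction that has already completed; in the second the ten hours is multiplied by 2**56.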

the following recent RFC has some discussion of various key strength
issues
http://www.garlic.com/~lynn/rfcidx12.htm#3766

3766 I
 Determining Strengths For Public Keys Used For Exchanging
 Symmetric Keys, Hoffman P., Orman H., 2004/04/26 (23pp)
 (.txt=55939) (BCP-86) (was draft-orman-public-key-lengths-08.txt)

if you click on the above URL ... then clicking on the ".txt=55939"
field retrieves the actual RFC.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

racf

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: racf
Newsgroups: bit.listserv.ibm-main
Date: Mon, 17 May 2004 14:39:43 -0600

Rick.Fochtman@ibm-main.lst (Rick Fochtman) writes:

Chuck DES and go to RSA, which uses both public and private keys.  Copy of
original working paper on request. <G>

if you look at determining strengths for public keys in
previous referenced RFC ... also reference
http://www.garlic.com/~lynn/2004e.html#18 RFC 3766
and recent previous post in this thread:
http://www.garlic.com/~lynn/2004f.html#9 racf

based on the tables, you are starting to talk about RSA key sizes well
over 2048 bits ... to give comparable strength to triple-DES
(discussed also in previous post in this thread) or even minimum AES
key size.

a thread from sci.crypt
http://www.garlic.com/~lynn/2004d.html#58 How secure is 2048 bit RSA?

for pure authentication, you might be much better off with one of the
FIPS186-2 implementations, dsa or ecdsa (see strengths from referenced
RFC or table extracted in above referenced URL on how secure is 2048
bit RSA):
http://csrc.nist.gov/cryptval/dss.htm

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

command line switches [Re: [REALLY OT!] Overuse of symbolic

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: command line switches [Re: [REALLY OT!] Overuse of symbolic
 constants]
Newsgroups: alt.folklore.computers
Date: Mon, 17 May 2004 19:49:51 -0600

jmfbahciv writes:

[puzzled emoticon here]  The last sentence doesn't make sense.
Why would there be output?

os/360 compilers/assemblers produce object decks (80 byte images)
that had ESDs (external symbols), "relocatable adcons", external
program references, etc.

various past posts about os/360 object deck formats
http://www.garlic.com/~lynn/2001.html#8 finding object decks with multiple entry points
http://www.garlic.com/~lynn/2001.html#14 IBM Model Numbers (was: First video terminal?)
http://www.garlic.com/~lynn/2001.html#60 Text (was: Review of Steve McConnell's AFTER THE GOLD RUSH)
http://www.garlic.com/~lynn/2002n.html#62 PLX
http://www.garlic.com/~lynn/2002n.html#71 bps loader, was PLX
http://www.garlic.com/~lynn/2002o.html#25 Early computer games
http://www.garlic.com/~lynn/2002o.html#26 Relocation, was Re: Early computer games
http://www.garlic.com/~lynn/2003d.html#47 IBM says AMD dead in 5yrs ... -- Microsoft Monopoly vs. IBM
http://www.garlic.com/~lynn/2003f.html#26 Alpha performance, why?

the link-editor could take several object decks, combine them
together, resolve what was possible to resolve, and emit the result as a
series of disk records ... that had some amount of the object deck
overhead eliminated (it still had entry points, any remaining unresolved
external references and relocatable address constants).  there were
some number of link-edit control commands that had control over things
like which libraries to search for resolving external program
references.

the information about internal program address constants had to be
kept around until it was actually decided to run the program
... because address constants were absolute ...  once the program
image was loaded into specific address location at runtime ... one of
the last things that had to be done (before starting program
execution) was go thru the "loaded" memory image of the program and
adjust all the absolute address constants.
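The load-time adjustment described above can be sketched like this. It is a simplification with assumed details: address constants are treated as 4-byte big-endian words, the relocation information is just a list of offsets, and the function name relocate is invented. It does not reproduce the actual os/360 record formats.

```python
import struct


def relocate(image: bytes, adcon_offsets, load_address: int) -> bytes:
    """Return a copy of the image with load_address added to each adcon."""
    loaded = bytearray(image)
    for offset in adcon_offsets:
        (value,) = struct.unpack_from(">I", loaded, offset)
        struct.pack_into(">I", loaded, offset, (value + load_address) & 0xFFFFFFFF)
    return bytes(loaded)


# Made-up program image: one adcon (pointing at offset 8) and one data word.
image = struct.pack(">II", 0x00000008, 0xDEADBEEF) + b"\x00" * 4

at_20000 = relocate(image, [0], 0x20000)
at_30000 = relocate(image, [0], 0x30000)
print(hex(struct.unpack_from(">I", at_20000, 0)[0]))
print(at_20000 == at_30000)
```

The second print shows why this convention is awkward for shared executables: once the adcons are adjusted, the loaded bytes depend on the load address, so the same image cannot appear unchanged at two different addresses.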

this os/360 convention for absolute address constants gave me lots of
problems attempting to create shared executables ... where the same
object appeared simultaneously in multiple different address spaces at
potentially different virtual addresses. some past discussions of
location/address independent code
http://www.garlic.com/~lynn/subtopic.html#adcon

the os/360 loader and the os/360 link-edit had a lot of common
features in terms of processing input ... the loader targeted the
results to a specific memory address/image ... and the link-edit
just wrote an image to disk that could be more efficiently loaded at a
later time (than if it had to repeat a lot of the steps everytime).
the link-edit addressed the scenario where there was large number of
repeated executions of an unchanged program.

this didn't do a whole lot for a university student environment where
the same, unchanged program was very rarely executed more than once.
to some extent the os/360 loader was introduced to address
environments like student programs ... which would compile and then
(maybe) run ...  and never have a repeat execution w/o a compile.

and now there is the "binder" ... ibm FAQ ... what is the difference
between the binder and linkage editor ...
http://www.developer.ibm.com/tech/faq/individual?oid=2:22397

the following url
http://www-306.ibm.com/software/awdtools/hlasm/share.html
is "high level assembler and toolkit feature"  ... it has a number
of descriptions ... including (pdf file) tutorial/overviews:

• how the linkage editor works: a tutorial on object/load modules; link
editors, loaders, and what they do for (and to) you.

• the DFSMS/MVS binder and its "program object" format: what the new
program model will mean to you.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

1.7rc2 & developer's build/install

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: 1.7rc2 & developer's build/install
Newsgroups: netscape.public.mozilla.general
Date: Mon, 17 May 2004 18:54:12 -0600

yesterday (sunday, 5/16) developer's build would install and run
... wasn't too bad, but could crash under really heavy load.

today both 1.7rc2 and today's developer's build repeat the following
three times:

nsNativeComponentLoader: SelfRegisterDll(libnegotiateauth.so) Load FAILED with error: libcom_err.so.3: cannot open shared object file: No such file or directory

and won't load/run

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

racf

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: racf
Newsgroups: bit.listserv.ibm-main
Date: Tue, 18 May 2004 09:38:02 -0600

wfarrell@ibm-main.lst (Walt Farrell) writes:

It could work with smart-cards and readers to supply your digital
certificate, and the cost of the cards and readers is getting into
the reasonable range, for small deployments.  I'm not sure they're
inexpensive enough for large-scale deployment yet (but I haven't
checked the market lately, either).

note that hardware tokens (smartcards) just supplying your digital
certificate .... and nothing else is just another form of static
data and shared-secret. some set of hardware tokens have tried that
and found out that they were prone to eavesdropping and replay ...
somebody just listens and captures the digital certificate ... installs
it into a counterfeit token ... and voila they have their own

something you have

token to impersonate you.

The issue of using public/private key, asymmetric encryption,
authentication is almost totally independent of whether or not a
digital certificate exists.

the early x.509 identity digital certificates were targeted at a totally
unconnected environment ... and the x.509 identity digital certificate
supposedly contained all the necessary information for the
receiving/relying party to both authenticate as well as authorize you;
aka these digital certificates were designed that the system receiving
the digital certificates would not have to look up anything on any
system to determine whether you were a valid user and/or what
permissions you might have .... everything was carried in the
certificate. there was also quite a bit of fud generated portraying
public/private key authentication as equivalent to digital
certificates.

so some of the problems with x.509 digital certificate take-up ... is
that it has very little relationship to most real-life business
operations.

one of the first things that x.509 identity digital certificates
encountered was that the overloading of the digital certificates with
enormous amounts of identity information tended to create a severe
privacy problem. so in the mid-90s you started to see appear something
called a relying-party-only certificate.

In this case, you register your public key (in place of a password) in
the receiver's/relying party's account database (say the userid database). They
then issue you a relying-party-only digital certificate that contains
(two things) your userid and your public key (a drastic subset of the
information that is contained in your userid database).

when you go to logon ...

1) you contact the server/relying party with your userid.

2) the server sends back some unique, random data.

3) you digitally sign the random data with your private key

4) you transmit the digital signature and your certificate
to the server

5) the server looks up the userid information using "1" and retrieves
the userid record (giving your authentication and authorization
information). It then uses the public key from the userid record to
validate the digital signature.
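
A minimal sketch of the five logon steps above, in Python. Everything
here is an assumption made for illustration: the userid "alice", the
USERID_DB layout, the function names, and especially the toy
textbook-RSA numbers (tiny primes, no padding), which are nowhere near
secure. It only shows the shape of the exchange: the server validates
the signature with the public key from its own userid record, and the
certificate is never consulted.

```python
import hashlib
import secrets

# Toy textbook-RSA key pair: n = 61 * 53, e * d = 1 (mod 3120).
# Illustration only; real systems use large keys and proper padding.
N, E, D = 3233, 17, 2753

def digest(data: bytes) -> int:
    """Reduce a SHA-256 digest into the range of the toy modulus."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

def sign(data: bytes, private_exp: int, modulus: int) -> int:
    """Step 3: the client signs the server's random data."""
    return pow(digest(data), private_exp, modulus)

def verify(data: bytes, signature: int, public_exp: int, modulus: int) -> bool:
    return pow(signature, public_exp, modulus) == digest(data)

# Hypothetical userid database: the public key is registered in place
# of a password, next to the authorization information.
USERID_DB = {"alice": {"public_key": (E, N), "permissions": ["logon"]}}

def server_challenge() -> bytes:
    """Step 2: unique, random data for this logon attempt."""
    return secrets.token_bytes(16)

def server_logon(userid: str, challenge: bytes, signature: int):
    """Step 5: look up the userid record and validate the signature."""
    record = USERID_DB.get(userid)
    if record is None:
        return None
    public_exp, modulus = record["public_key"]
    if verify(challenge, signature, public_exp, modulus):
        return record["permissions"]
    return None
```

Because the random data changes on every attempt, a captured signature
is of little use for a later logon, unlike a captured static
certificate.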

The digital certificate is basically a stale, static subset, copy of
information from your userid/account record. The only time that a
relying/receiving party would ever need to resort to a stale, static
subset of information contained in a digital certificate is when it
otherwise doesn't have access to the real information (allowing a
random user to logon to a system based purely on the contents of a
digital certificate and there is no local definition for that user and
there is no means of making online contact to a system that might have
a definition for that user).

The usefulness of x.509 digital certificates was severely
reduced in the mid-90s when it was realized that
overloading it with enormous amounts of identity and/or authorization
detail created huge privacy problems for the individual and/or the
corporation.

Furthermore, the original design point for digital certificates was
something out of the early 80s ... that of offline email. In those
days, email was handled by electronic post-offices. PCs would dial-up
and create a temporary connection to their post-office, exchange
email, and then hangup. This was long before the days of ubiquitous
internet online connectivity. The person then was sitting there with
some amount of email, possibly from individuals with whom there had
been no previous contact. The issue was how to perform any validation
on the
original sender of the email when there had not been any previous
communication, and there was no recourse to an online environment.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

command line switches [Re: [REALLY OT!] Overuse of symbolic

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: command line switches [Re: [REALLY OT!] Overuse of symbolic
 constants]
Newsgroups: alt.folklore.computers
Date: Tue, 18 May 2004 08:54:21 -0600

jmfbahciv writes:

If the segment is sharable, each address space had to use the same
absolute address of the sharable segment.  The whole point of
sharable is to need only one copy in core.  When did the virtual to
physical address calculations get done by you (meaning the monitor)?

OS/360 had a convention that "address constants" were absolute ... they
had something called relocatable adcons that were additional data
structures for the image on disk that when loaded in memory ... the
loader could run thru and swizzle/fix all the storage areas with
absolute address constants. programs could load values from these
absolute address constants and directly address areas of memory.

in the shared segment world .... if a shared thing (data or
instructions) were to appear at different addresses in different
virtual address spaces ... it is obvious that absolute address
constants aren't going to work. so the common solution is to use
relative address constants. each virtual address space has some
private dictionary area and/or carries some local absolute address in
a general purpose register. for relative address constants a
convention is established that the address constant is added to some
base register (where the base register is local to the specific
address space and has the local address of the shared object). The
convention is that the relative address constant is some offset from a
base ... and the absolute value for the base in any specific address
space is in a known register.

a trivial analogy is the whole 360 base+displacement addressing
convention. nearly all 360 instructions address storage with a base
register plus 12bit displacement convention (or base register plus
index register plus 12bit displacement). In the assembler world, all
the symbolic locations are tracked by the assembler ... and the
assembler generates instructions with the correct base register and
appropriate 12bit displacement.

So a not uncommon 360 convention is:

            BALR   R15,0
            USING  *,R15

which is branch and link register ... i.e. used for branching to
subroutines and saving the return address of the next instruction;
except in the case of register zero for a branch-to address ... it
doesn't actually go anywhere ... just loads the address of the next
instruction into R15. The "using" statement doesn't generate any
instructions, it just tells the assembler to use the relative location
of what is in R15 as the base ... and generate subsequent storage
references with 12bit displacements off R15.

So a real standard calling sequence is something like:

           L     R15,=a(subroutine)
           BALR  R14,R15

which uses absolute adcon of the subroutine. as a replacement, I had
to use something like:

           L     R15,=a(subroutine-base)
           AR    R15,R12
           BALR  R14,R15

where "subroutine-base" becomes a relative or offset address ... and the
location of "base" for the specific virtual address space is known to
be in r12.
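
A rough model of the relative-adcon calling sequence, in Python rather
than assembler. The segment table, the offset 0x240, the AddressSpace
class and the two base addresses are all made up for illustration; the
only thing taken from the text is that the shared image holds an
offset, and each address space adds its own base (kept in r12).

```python
# One copy of the shared segment. It holds only offsets from the
# segment base (relative adcons), never absolute addresses.
SHARED_SEGMENT = {"subroutine": 0x240}   # hypothetical offset

class AddressSpace:
    """A virtual address space that maps the shared segment somewhere."""

    def __init__(self, name: str, segment_base: int):
        self.name = name
        self.r12 = segment_base   # local address of the shared segment

    def call_address(self, entry: str) -> int:
        r15 = SHARED_SEGMENT[entry]   # L    R15,=a(subroutine-base)
        r15 += self.r12               # AR   R15,R12
        return r15                    # BALR R14,R15 would branch here
```

Two address spaces can map the same, unmodified segment at different
virtual addresses and still reach the right place:

```python
a = AddressSpace("a", 0x20000)
b = AddressSpace("b", 0x7C000)
print(hex(a.call_address("subroutine")), hex(b.call_address("subroutine")))
```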

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Tue, 18 May 2004 09:07:18 -0600

Eric writes:

The VAX had the PROBER and PROBEW instructions to check for access.
However this may not be much help unless the page table cannot change.
Otherwise the table might change right after you check and kaboom.
Whether this can happen or not depends on the OS and the exact spot
in the kernel code.

the instruction on 360/67 was LRA ... load real address. the condition
code from the instruction gave whether it was valid or another kernel
routine had to be called to make it valid.

note however that for the cp/67 kernel most entries weren't exactly
the virtual machine calling the kernel but a privilege instruction
interrupt ... which the kernel then had to decode and simulate.

the program interrupt gave the starting address of the offending
instruction; so the first thing that the cp/67 kernel had to do was a
LRA of the instruction and access the first two bytes. the first two
bits of the instruction gave the length of the instruction.
instructions were half-word aligned so instructions longer than two
bytes could cross page boundaries ... and therefore require the LRA to
be repeated for every two bytes in the instruction.
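
A small sketch of the length decode and the page-crossing check, in
Python. The 4096-byte page size and the function names are assumptions
for the example. The length rule is the standard 360 one: the first two
bits of the opcode give 2, 4, 4 or 6 bytes. The sketch collects the
distinct virtual pages an instruction touches; each of those would
need its own translation.

```python
PAGE_SIZE = 4096   # assumed page size for the example

def instruction_length(first_byte: int) -> int:
    """Length in bytes, from the top two bits of the opcode byte."""
    top_bits = (first_byte >> 6) & 0b11
    return {0b00: 2, 0b01: 4, 0b10: 4, 0b11: 6}[top_bits]

def pages_to_translate(virtual_addr: int, first_byte: int) -> list:
    """Virtual page numbers touched, walking the instruction by halfword."""
    pages = []
    for offset in range(0, instruction_length(first_byte), 2):
        page = (virtual_addr + offset) // PAGE_SIZE
        if page not in pages:
            pages.append(page)
    return pages
```

For example, a six-byte instruction (opcode 0xD2) starting at 0x0FFE
has its first halfword in page 0 and the rest in page 1, so
pages_to_translate(0x0FFE, 0xD2) gives [0, 1].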

there were interlocks in the cp/67 kernel which allowed it to reliably
access virtual memory for the purpose of instruction retrieval and
emulation. start i/o instruction got especially complex since all of
the channel command words and many of the arguments were effectively
part of the instruction execution that needed to be accessed for
simulation.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Tue, 18 May 2004 11:44:38 -0600

Terje Mathisen writes:

OTOH, Netware was designed from the bottom up on the idea that you
should never have to touch anything you didn't actually intend to do
something specific with, i.e. all IO was async using scatter/gather
extension lists.

there was a precursor project called DataHub (sort of departmental PC
LAN-based network server) done by GPD san jose. some of the coding
work was subcontracted out to an organization in provo (one of the
DataHub people used to commute between san jose and provo almost every
week). at some point GPD san jose decided to walk away from the effort
and allowed the group in provo to retain all code from the project.

all of the GPD san jose people had mainframe background where async
i/o with scatter/gather had been the norm for some time.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

IBM 7094 Emulator - An historic moment?

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: IBM 7094 Emulator - An historic moment?
Newsgroups: alt.folklore.computers
Date: Tue, 18 May 2004 11:37:15 -0600

Tom Van Vleck writes:

The CTSS machines also had a Chronolog clock attached to unit A9.
This device appeared to be a tape drive to the channel: reading it
returned the month, day, hour, minute, and tenth of minute in BCD.
Notice that the year was not returned: we had to reassemble the CTSS
supervisor every year to change the constant.

it was carried over when the guys did cp/40 and then cp/67 ...  and
there was a virtual chronolog device at unit 0FF sort of defined by
default (at least in every cms virtual machine) ... except it was
extended to return

byte   value
0-7    mm/dd/yy
8-15   hh.mm.ss
16-19  total kernel/supervisor CPU time since logon
20-23  total virtual CPU time since logon
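
A sketch of unpacking that 24-byte layout, in Python. The byte offsets
come from the table above; the rest is assumed for the example: that
the text fields are EBCDIC (decoded here with Python's cp037 codec),
that the two CPU-time fields are big-endian unsigned 32-bit integers,
and the function name. The units of the CPU-time fields are not given,
so they are left as raw integers.

```python
import struct

def parse_chronolog(buf: bytes) -> dict:
    """Split a 24-byte virtual chronolog record into its four fields."""
    date_raw, time_raw, supervisor_cpu, virtual_cpu = struct.unpack(
        ">8s8sII", buf)
    return {
        "date": date_raw.decode("cp037"),   # bytes 0-7, mm/dd/yy
        "time": time_raw.decode("cp037"),   # bytes 8-15, hh.mm.ss
        "supervisor_cpu": supervisor_cpu,   # bytes 16-19, units unknown
        "virtual_cpu": virtual_cpu,         # bytes 20-23, units unknown
    }
```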

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

layered approach

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: layered approach
Newsgroups: comp.protocols.tcp-ip
Date: Tue, 18 May 2004 12:19:54 -0600

Barry Margolin writes:

>why do we need layered approach (OSI model)?

I'm sure this is explained in the first chapter of your textbook.

if it is the OSI model specifically ... one could consider that one
might need OSI if they wished to outlaw LANs, WANs, and the Internet.

ISO international standards body had a rule that ISO and its national
chartered bodies couldn't consider standards that violated OSI.

I was involved in HSP (high-speed protocol) attempt to work on it as
standard in X3S3.3 (the us body chartered by ISO for working on level
3 & 4 standards in the OSI model ... aka network layer and transport
layer).

HSP had a specification that would go directly from the level 4/5
interface directly to the LAN MAC interface. This violated the OSI
model and therefore X3S3.3 couldn't work on it for standards.

The problem was (at least) two fold:

1) going directly from level 4/5 interface to the LAN MAC interface
skipped the level 3/4 interface and therefore violated the OSI model

2) the LAN MAC interface sits logically somewhere in the middle of the
OSI level 3 network layer (at least in part because it includes some
network addressing function). by definition anything that interfaces
to a LAN MAC interface is in violation of OSI model ... because LANs
are in violation of the OSI model.

The Internet issue is that the Internet defines an "internetworking
layer" (internetworking protocol layer or IP for short) that doesn't
exist in the OSI model ... the osi model only has level 4, transport
and level 3, network ... there is no such thing as an "internetworking
layer". Therefore anything that involves IP and/or the internet is also
in violation of the OSI model .. and by the ISO rules could not be
considered.

basically the OSI model reflects somewhat a pre-70s communication
state-of-the-art with point-to-point copper wires and disallows the
invention of LANs, WANs, and the Internet (or at least the standards
body for OSI disallowed inventions that violated the OSI model).

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Tue, 18 May 2004 12:04:09 -0600

glen herrmannsfeldt writes:

I thought it was the next instruction, and you use the instruction
length code in the OPSW to correct it.   I think not all exceptions
do it the same way, though.

yes, i was obfuscating too much detail ...

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Why does Windows allow Worms?

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Why does Windows allow Worms?
Newsgroups: comp.os.ms-windows.misc,comp.os.ms-windows.networking.misc,comp.security.misc,comp.windows.misc
Date: Tue, 18 May 2004 14:43:29 -0600

Bruce Barnett <spamhater95+U040518143634@grymoire.com> writes:

Perl has a mechanism to tag data that came from non-trusted
sources. Certain operations cannot be executed because the data is
tainted.  It's not foolproof, and user errors in Perl coding can
occur, but I suspect it's easier for a beginner to write safe Perl
code than it is to write secure C code.

the converse is true ... anybody (beginner or not) can write unsafe C
code ... detailed analysis at least in the 80s identified the
traditional string handling library functions with implicit lengths
as creating an extraordinarily unsafe environment. somewhat like when
the hew came into the farming environment and mandated that all farm
equipment needed protection because even experienced farmers were
getting caught in one thing or another. the standard C string library
functions and string handling paradigm are hazardous equipment.

detailed vulnerability analysis in the late 80s predicted that C
language environment would have a factor of ten times to hundred times
more buffer length related problems than other program environments
with better length handling paradigms (because of the standard length
handling paradigm that was part of the standard C environment).
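
A toy model of the two length paradigms, written in Python over a flat
bytearray that stands in for memory. The 16-byte layout, the field
contents and the function names are invented for the example.
copy_implicit_length behaves like a strcpy-style copy: the length is
implied by a NUL terminator in the source and the destination size is
never consulted. copy_explicit_length is told the destination size and
truncates to fit.

```python
def make_memory() -> bytearray:
    """Bytes 0-7: an 8-byte name buffer. Bytes 8-15: an adjacent field."""
    mem = bytearray(16)
    mem[8:16] = b"SAFEDATA"
    return mem

def copy_implicit_length(mem: bytearray, dest: int, src: bytes) -> None:
    """Copy until the NUL terminator in src; never checks the dest size."""
    i = 0
    while src[i] != 0:
        mem[dest + i] = src[i]
        i += 1
    mem[dest + i] = 0

def copy_explicit_length(mem: bytearray, dest: int, dest_size: int,
                         src: bytes) -> None:
    """Copy at most dest_size - 1 bytes and always write a terminator."""
    n = min(len(src), dest_size - 1)
    mem[dest:dest + n] = src[:n]
    mem[dest + n] = 0
```

Copying twelve bytes of input into the 8-byte buffer with the first
routine runs on into the adjacent field; the second routine leaves
that field alone.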

something like 30 years ago, mid-70s, a situation was analysed and
documented regarding the vulnerability of allowing executable code of
any kind to arrive over a network interface.

computing paradigm from the 60s was systems where all software and
programs were relatively carefully vetted and installed by experienced
and trained staff. ordinary people might be able to use such systems,
but didn't actually write code &/or install executables. the problem
that started appearing with various time-sharing systems in the 70s
that supported personal computing ... was that some of them actually
allowed end-users to introduce executable code.

The other characteristic is that most of these time-sharing systems
from the 70s (or earlier) at least started with the basic premise in
their design that they had to protect different users from each other.
That permeated the basic design through-out the system.

The stand-alone, dedicated personal computers from the 80s had none of
these problems ... they didn't require partitioning that protected a
very large number of different users from each other ... and they
didn't have to worry about foreign and possibly hostile executables
arriving over any network.

There are two somewhat different vulnerabilities:

1) huge number of compromises because of fundamental flaw in the
   length paradigm used in C language

2) partitioning and security features that needed to a) isolate
different local users from each other and eventually b) isolate a user
from a foreign and hostile network environment

so there is some analogy to automobiles. long ago and far away
... automobiles required drivers that were trained in all the quirks
and mechanics of an automobile. the problem was eventually that
somebody wanted to sell everybody a car ... but there weren't enuf
personal chauffeur/mechanics for everybody to have a car. they had to
come up with cars that people could operate themselves w/o requiring a
personal chauffeur/mechanic.

so if everybody was going to operate their own car ... they eventually
had to

1) require a minimum of expertise ... so there were mandated training
and licensing programs

2) require a huge amount of mandated safety features

3) have institutionalized vehicle safety checks

4) have a bunch of laws that could charge people with a) reckless
operation of a vehicle and/or b) operation of an unsafe vehicle. This
could confiscate their vehicle and take away their driving rights.  It
didn't matter whether people knew anything about the mechanics of a
car or not ... they were still liable for operating an unsafe vehicle.

now these are applicable for operation of a vehicle in a public
environment. if you have a vehicle that you will only operate in your
backyard and never bring into a public environment you aren't subject
to most of the regulations.

recent post about looking at entries in CVE database:
http://www.garlic.com/~lynn/2004e.html#43 security taxonomy and CVE

collection of past posts about all kinds of vulnerabilities, exploits, and
fraud:
http://www.garlic.com/~lynn/subintegrity.html#fraud
collection of past posts related somewhat to the reverse ... assurance
http://www.garlic.com/~lynn/subintegrity.html#assurance

all sort of random past threads mentioning the length issue and buffer overflow
exploits:
http://www.garlic.com/~lynn/99.html#219 Study says buffer overflow is most common security bug
http://www.garlic.com/~lynn/2000.html#30 Computer of the century
http://www.garlic.com/~lynn/2000g.html#50 Egghead cracked, MS IIS again
http://www.garlic.com/~lynn/2001n.html#30 FreeBSD more secure than Linux
http://www.garlic.com/~lynn/2001n.html#71 Q: Buffer overflow
http://www.garlic.com/~lynn/2001n.html#72 Buffer overflow
http://www.garlic.com/~lynn/2001n.html#76 Buffer overflow
http://www.garlic.com/~lynn/2001n.html#84 Buffer overflow
http://www.garlic.com/~lynn/2001n.html#90 Buffer overflow
http://www.garlic.com/~lynn/2001n.html#91 Buffer overflow
http://www.garlic.com/~lynn/2001n.html#93 Buffer overflow
http://www.garlic.com/~lynn/2002.html#4 Buffer overflow
http://www.garlic.com/~lynn/2002.html#19 Buffer overflow
http://www.garlic.com/~lynn/2002.html#20 Younger recruits versus experienced veterans  ( was Re: The demise of compa
http://www.garlic.com/~lynn/2002.html#23 Buffer overflow
http://www.garlic.com/~lynn/2002.html#24 Buffer overflow
http://www.garlic.com/~lynn/2002.html#26 Buffer overflow
http://www.garlic.com/~lynn/2002.html#27 Buffer overflow
http://www.garlic.com/~lynn/2002.html#28 Buffer overflow
http://www.garlic.com/~lynn/2002.html#29 Buffer overflow
http://www.garlic.com/~lynn/2002.html#32 Buffer overflow
http://www.garlic.com/~lynn/2002.html#33 Buffer overflow
http://www.garlic.com/~lynn/2002.html#34 Buffer overflow
http://www.garlic.com/~lynn/2002.html#35 Buffer overflow
http://www.garlic.com/~lynn/2002.html#37 Buffer overflow
http://www.garlic.com/~lynn/2002.html#38 Buffer overflow
http://www.garlic.com/~lynn/2002.html#39 Buffer overflow
http://www.garlic.com/~lynn/2002i.html#62 subjective Q. - what's the most secure OS?
http://www.garlic.com/~lynn/2002l.html#42 Thirty Years Later: Lessons from the Multics Security Evaluation
http://www.garlic.com/~lynn/2002l.html#45 Thirty Years Later: Lessons from the Multics Security Evaluation
http://www.garlic.com/~lynn/2002m.html#8 Backdoor in AES ?
http://www.garlic.com/~lynn/2002m.html#10 Backdoor in AES ?
http://www.garlic.com/~lynn/2002m.html#58 The next big things that weren't
http://www.garlic.com/~lynn/2002p.html#6 unix permissions
http://www.garlic.com/~lynn/2003i.html#59 grey-haired assembler programmers (Ritchie's C)
http://www.garlic.com/~lynn/2003j.html#4 A Dark Day
http://www.garlic.com/~lynn/2003j.html#8 A Dark Day
http://www.garlic.com/~lynn/2003j.html#20 A Dark Day
http://www.garlic.com/~lynn/2003k.html#64 C & reliability: Was "The Incredible Shrinking Legacy"
http://www.garlic.com/~lynn/2003l.html#2 S/360 Engineering Changes
http://www.garlic.com/~lynn/2003n.html#14 Poor people's OS?
http://www.garlic.com/~lynn/2003o.html#6 perfomance vs. key size
http://www.garlic.com/~lynn/2003o.html#20 IS CP/M an OS?
http://www.garlic.com/~lynn/2003o.html#25 Any experience with "The Last One"?
http://www.garlic.com/~lynn/2003o.html#50 Pub/priv key security
http://www.garlic.com/~lynn/2004.html#30 Threat of running a web server?

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Tue, 18 May 2004 15:53:09 -0600

Brian Inglis writes:

That was done before SMP came along: the CP (central processor)
handled I/O, and APs (attached processors) ran app code, and requested
services on the CP.
ISTR Lynn Wheeler's VM resource manager/HPO dynamic adaptive code
switched into non-SMP mode to avoid cache thrashing when the interrupt
rate or system CPU went over some threshold.
ISTM it would be easier to design a single architecture and drop some
hardware to customize a core, similar to the split of 370/158? CPU and
I/O microcode onto separate cores to produce the next faster model
(168?/3033?)

not quite ... the resource manager dynamically switched into disabled
interrupt mode when the i/o interrupt rate exceeded some limit.

nominally most of the kernel ran disabled for external & i/o
interrupts ..  but all virtual machine execution was fully enabled for
interrupts. the virtual machine was run with a time-slice ... and if
nothing else interrupted the virtual machine ... there would at least
be an external timer interrupt into the kernel. the kernel would then
at least update some dispatching priority and possibly re-arrange
things so a different virtual machine would be chosen to run.

there were two parts. the most frequently executed instructions were in
the dispatcher loading up all the stuff for dispatching a virtual
machine. you can see that in the CPU hotspot measurements we took as
part of the ECPS effort (looking to see where the kernel spent most of
its time):
http://www.garlic.com/~lynn/94.html#21 370 ECPS VM microcode assist

so frequently, if it had been in the kernel for some time ... there were
some I/O interrupts that had arrived and were queued up ... and
would interrupt as soon as the virtual machine was dispatched. that
wasted all of those instructions "dsp+8d2" to "dsp+c84" referenced in
the above ECPS URL.

So, there is an instruction "SSM" ... set system mask that can change
whether you are enabled or disabled for i/o interrupts. I placed a
pair of such instructions effectively just before all the work to load
up a virtual machine that opened an I/O interrupt window and then
immediately shut it. If there was a queued interrupt(s), the processor
would go off to the interrupt handling routine and not get to the 2nd
SSM instruction in the dispatcher. There was still small possibility
that an I/O interrupt would arrive during the load-up ... but that
period was very short.

So the interrupt window effectively eliminated wasting loading up a
virtual machine if there were already any queued i/o interrupts.

So the next piece was the dynamic adaptive stuff ... if you ran the
virtual machine disabled for I/O interrupts ... and only took
interrupts with the dispatch interrupt window ... it would increase
the latency processing for handling the interrupt. The problem was
that if the environment was a very high interrupt rate ... the
constant switching back & forth between application/virtualmachine
execution and interrupt handling could totally destroy any cache
locality and system thruput. So the dynamic adaptive stuff had to make
trade-off decision between interrupt latency processing and cache hit
rate associated with interrupt rate. Furthermore, if you slightly
delayed interrupts ... and processed multiple in a batch ... the
improved cache hit rate of the i/o interrupt processing might actually
not only improve thruput ... but actually reduce avg. latency.

So the default was to run the virtual machine enabled for I/O
interrupts, which allowed for latency optimization. The i/o interrupt
rate was monitored, and as long as it stayed low it was considered to
have no significant effect on cache hit ratio. It was at high i/o
interrupt rates ... that the dynamic adaptive stuff would switch the
system back
& forth between free-for-all i/o interrupts and "structured"
interrupts only thru the interrupt window in the dispatcher.
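
A very rough sketch of that mode decision, in Python. The threshold
value, the mode names and both function names are hypothetical; the
text only says the rate "exceeded some limit". The point is the shape
of it: measure the i/o interrupt rate over an interval, run fully
enabled while the rate is low, and fall back to taking interrupts only
at the dispatcher window (in a batch) when the rate is high.

```python
HIGH_RATE_THRESHOLD = 1000.0   # hypothetical interrupts per second

def choose_interrupt_mode(interrupts_in_interval: int,
                          interval_seconds: float) -> str:
    """Pick how virtual machines run for the next interval."""
    rate = interrupts_in_interval / interval_seconds
    if rate > HIGH_RATE_THRESHOLD:
        return "window-only"   # disabled in the VM, drain at the window
    return "enabled"           # free-for-all, lowest latency

def interrupt_window(pending: list) -> list:
    """Open the window: handle everything queued, as one batch."""
    handled = list(pending)
    pending.clear()
    return handled
```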

Part of the latency issue in a heavily loaded system was effective
device utilization ... since the I/O interrupt routine was also
responsible for redriving the device with any pending i/o requests.
On 2305 fixed head disks ... this was compensated for by having
"multiple exposure" feature. The basic channel infrastructure only
supported a single request at a time for device. Multiple exposures
created multiple logical devices for the purpose of the channel
program structure ... which were all mapped to the same physical
device. With a half-dozen requests constantly queued in the hardware
for a device, any redrive latency in the i/o interrupt handler was
less of an issue. I tried w/o much success to get multiple exposure
feature added to other high utilization devices.

Later working in the non-mainframe environment ... I did get some
stuff done for multiple command queueing ... that helps address the
device redrive latency.

The problem in the (370) SMP environment is somewhat more complicated
with shared controllers being used to simulate shared channels ... and
the original (370) SMP only had a very thin layer that had been SMP'ed
...  and the rest was behind a global kernel lock (although a bounce
lock with queueing rather than a spinlock). If you've turned off i/o
interrupts while running in virtual machine mode ... and one processor
monopolizes kernel execution (and the other is busily churning away
with virtual machine execution) .... it is fantastic for cache hit
ratio ...  but terrible if there are queued I/O interrupts on channels
dedicated to the processor not getting into the kernel. So you have to
play some games about making sure that a processor always gets some
window to drain pending i/o interrupts on dedicated channels (again
you are trading off cache efficiency against i/o service latency).

For 370 158 & 168s ... IBM announced both SMP configurations and
what they called "attached processor" configuration. An "attached
processor" configuration was fully shared-memory SMP ... but one of
the processors had no (dedicated or otherwise) channels at all.  In
this case, the dynamic adaptive code didn't have to worry about a
processor getting i/o interrupt starved ... because one of the
processors didn't have any channels. In this scenario the dynamic
cache efficiency optimization wasn't to manage just the bad effects of
i/o interrupts but also to optimize cache from the effects of cross
processor migration (execution of same code constantly changing
between processors and losing cache locality).

Now, 370s, 303x, 3081s, etc ... in two processor configurations ran
caches at .90 times that of a uniprocessor (to allow for all the
cross-cache chatter); therefore a two-processor system was only 1.8
times a single processor system. A SMP kernel might throw in a lot of
other overhead ... so a two-processor SMP system might have only
1.3-1.5 times the effective thruput of the same workload on a
uniprocessor.
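
The arithmetic behind those numbers, as a few lines of Python. The
function names are made up; the inputs (.90, 1.3, 1.5) are the figures
quoted above, and the "fraction delivered" is simply what those
figures imply, not a separately measured number.

```python
def two_processor_hardware_ratio(cache_slowdown: float = 0.90) -> float:
    """Two processors, each running at cache_slowdown of a uniprocessor."""
    return 2 * cache_slowdown

def fraction_delivered(effective_ratio: float,
                       hardware_ratio: float) -> float:
    """Share of the two-processor hardware capacity left after kernel
    SMP overhead, implied by an observed thruput ratio."""
    return effective_ratio / hardware_ratio

hardware = two_processor_hardware_ratio()            # about 1.8
low = round(fraction_delivered(1.3, hardware), 2)    # about 0.72
high = round(fraction_delivered(1.5, hardware), 2)   # about 0.83
```

So 1.3-1.5 times a uniprocessor means the SMP kernel overhead was
eating something like 17 to 28 percent of what the hardware could
deliver.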

With all sorts of tricks and magic in the SMP code ... and the dynamic
adaptive stuff doing some more magic ... there were 370 "attached
processors" configurations that were running at more than twice that
of a uniprocessor ... the magic with cache locality more than offset
the hardware only being 1.8 times (and whatever minimum magic
additional kernel SMP pathlength there was).

For SP1 there was a rewrite to make it a less magic and more elegant,
traditional SMP implementation. It had the adverse downside that most
of the cache locality optimization was lost and typical customers were
finding that in a two processor system ... both processors were
spending ten percent of elapsed time executing new SMP kernel overhead
that it hadn't been executing before.

... oops, longer than i thot it was going to be, guess i got carried
away again.

The 370/158 supported integrated channels ... the processor engine
inside the 158 ran microcode that implemented both the channel
function as well as the 370 processor instructions. The 370/158 was also
on the knee of the manufacturing cost/performance curve ... something
like some automobile assembly lines.

For the 303x line of computers ... they decided to create a channel
director ... which was actually a repackaged 370/158 engine running
only the integrated channel microcode (supporting six channels).  The
3031 was a 370/158 repackaged to only have the 370 microcode and to
use external channels (in the channel director) rather than integrated
channels (time-shared in the processor). The 3032 was a 370/168
repackaged to use the (ersatz 370/158) channel director. The 3033
started out being the 370/168 wiring diagram remapped to newer
technology chips that were 20% faster. Along the way, they decided
that the 3033 needed to be more than 20% faster than the 370/168/3032.
The newer chip technology had something like ten times the circuit
density of the 168 chips ... but wasn't being utilized because of the
simple remapping of the 168 wiring diagram. Some amount of the 3033
performance boost project was redoing the wiring so that it did a lot
more on-chip operations. The resulting 3033 was more like 50% faster
than the 168 instead of only 20% faster.

I had previously mentioned VAMPS, a 5-way shared-memory processor
project where I sort of originally created the whole structure for
bounce lock and really lightweight queued kernel requests. It didn't
get announced. Somewhat in that timeframe there was also work on
something called logical machines .... a 16-way 370/158 shared memory
multiprocessor w/o cache consistency (which also didn't get
announced). This was sort of the tail-end of the VAMPS effort and
before the official product VM/370 SMP support got underway.

For logical machines, the VM kernel would reliably use
compare&swap ... which had cache specific semantics on
everything it needed to in the kernel ... and there was a feature for
cache flush. So the CP kernel would understand about shared memory
across the 16 processors ... and would dispatch different virtual
machines on specific processors ... which were essentially all
private, non-writeable shared virtual address spaces. R/O shared
virtual memory was fine and non-shared virtual memory was fine.
Anything that might want to operate in this machine with read/write
shared memory had to faithfully use compare&swap for all
such memory operations ... since not only did it do the
compare&swap semantics ... but it was also the only
instruction that had any cross-cache semantics.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Tue, 18 May 2004 17:35:04 -0600

slight digression ... there were some other systems and kernels that
had SMP support for these machines.

a specific one had really, really bad thruput and performance on both
uniprocessor and two-processor. however, they went around highlighting
the fact that they had really, really great SMP support (support was
carefully used in place of thruput) because their two-processor
thruput was 3.8 times their single processor thruput. it was sort of
meant to confuse customers between the issue about really great SMP
thruput and really great SMP support.

actually their claim to great SMP support was that their two-processor
thruput was 3.8 times their single processor thruput.

what they didn't say was that the single processor benchmark was on a
1mbyte machine and their two-processor benchmark was on a two mbyte
machine (if you have twice the machine ... then you double both the
number of processors as well as the amount of memory).

so the real issue was that they also had a really bloated kernel with
large fixed storage requirements and there was barely any storage left
for applications with only one mbyte of real memory. on a two
processor machine with only a single copy of the kernel and twice the
real storage, they had nearly ten times the amount of real storage
available for applications ... and that was the real reason the two
processor benchmarks had 3.8 times the thruput of the single processor
benchmarks.
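
as a rough check of that arithmetic (a sketch added for illustration,
not from the original benchmarks): if one kernel copy with fixed
storage K leaves (1 - K) mbytes for applications on the 1mbyte machine
and (2 - K) mbytes on the 2mbyte machine, a tenfold ratio implies K of
about 8/9 mbyte. the function name and the exact ratio of ten are
assumptions.

```python
from fractions import Fraction

def kernel_size_for_ratio(small_mb, large_mb, ratio):
    """Solve (large - K) = ratio * (small - K) for the fixed kernel size K."""
    small, large, ratio = Fraction(small_mb), Fraction(large_mb), Fraction(ratio)
    return (ratio * small - large) / (ratio - 1)

# Hypothetical input: application storage grows by exactly a factor of ten.
k = kernel_size_for_ratio(1, 2, 10)
print(float(k))                     # fixed kernel storage, in mbytes
print(float(1 - k), float(2 - k))   # storage left for applications
```

with those assumed numbers only about a ninth of a mbyte is left for
applications on the small machine, which is consistent with "barely
any storage left".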

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

command line switches [Re: [REALLY OT!] Overuse of symbolic

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: command line switches [Re: [REALLY OT!] Overuse of symbolic
 constants]
Newsgroups: alt.folklore.computers
Date: Tue, 18 May 2004 21:34:49 -0600

Peter Flass writes:

This is the way VM's DCSS's (Discontinuous Saved Segments) work, and
it's a mess.  The creator has to assign the address when the DCSS gets
built.  When you get a lot of them you get overlaps or segments that
want to occupy the same virtual address space as other segments, and
you've got trouble.

I think what Lynn was talking about are programs that contain no
absolute addresses, so they can be loaded anywhere and, better yet,
mapped to different virtual addresses in different processes while
still occupying only one set of physical addresses.  With the 360,
etc. architecture it's potentially possible, since the instructions
only contain base/displacement addresses, but it's not easy.  I think
it's also possible on a PC.  Is this the meaning of PIC (position
independent code) that Linux uses for shared objects?

i had done the precursor to DCSS .. and called it virtual memory
management ... and it was part of some other stuff i did for a cms
page mapped filesystem
http://www.garlic.com/~lynn/subtopic.html#mmap

the issue in 360 & 370 is you have a program

               balr  r12,r0
               using *,r12
               .....
               l    r3,=a(abcd)
               tm   0(r3),x'ff'
               ...
               ...
               bunch of instructions
               ...
               ...
abcd           dc  F'0'

               ltorg
               =a(abcd)

... now lets say that abcd is at location x'800' within the program
and the literal pool is at x'a00' in the program ... then the assembler
will generate something like

               5830C9fe

i.e.
58   load opcode
3    into register 3 from storage
0    no index register
C    base register 12
9fe  displacement added to contents
     of reg 12 to form address where
     storage is to be loaded.
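
the field packing above can be checked with a small sketch (python,
added for illustration). the function name is made up, and it assumes
the balr sits at offset 0 of the program, so that r12 holds offset 2
and the literal at x'a00' is x'9fe' away from the base.

```python
def encode_rx(opcode, r1, x2, b2, d2):
    """Pack a 360/370 RX-format instruction into a 32-bit value."""
    if not 0 <= d2 <= 0xFFF:
        raise ValueError("displacement must fit in 12 bits")
    for field in (r1, x2, b2):
        if not 0 <= field <= 0xF:
            raise ValueError("register fields are 4 bits")
    return (opcode << 24) | (r1 << 20) | (x2 << 16) | (b2 << 12) | d2

# Assumption: balr r12,r0 is the first (2-byte) instruction, so the
# base established by "using *,r12" is program offset 2.
base_offset = 0x002
literal_offset = 0xA00
insn = encode_rx(0x58, 3, 0, 12, literal_offset - base_offset)
print(f"{insn:08X}")   # 5830C9FE
```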

now the standard for os/370 and cms was that the storage location
=a(abcd)/a00 got stuck with a value something like
                800
and some dictionary stuff that said that storage location
a00 contained a value relative to the beginning of the
program (aka relocatable adcon). when the program was loaded
into memory ... the loader would run thru all the relocatable
adcons listed by the dictionary and physically adjust them so
that they were absolute addresses ... i.e. if the program
got loaded at x'1000000' ... then the loader would add
x'1000000' to the contents of storage location x'a00' (resulting
in x'1000800') before starting the program. at the time the program
is running all "relocatable adcons" were now absolute values ...
not relative to anything.
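
a minimal sketch of that load-time fixup (python, for illustration
only; the function name, the byte-array image and the list of adcon
offsets are stand-ins for the real loader and its dictionary, not the
actual os/360 or cms formats):

```python
def load_program(image, adcon_offsets, load_address, width=4):
    """Return a copy of image with each listed adcon made absolute."""
    loaded = bytearray(image)
    for off in adcon_offsets:
        value = int.from_bytes(loaded[off:off + width], "big")
        loaded[off:off + width] = (value + load_address).to_bytes(width, "big")
    return bytes(loaded)

# Program image as assembled: the adcon at x'a00' holds x'800',
# the offset of abcd from the start of the program.
image = bytearray(0xA04)
image[0xA00:0xA04] = (0x800).to_bytes(4, "big")

loaded = load_program(image, [0xA00], 0x1000000)
print(hex(int.from_bytes(loaded[0xA00:0xA04], "big")))   # 0x1000800
```

after the fixup the copy in memory is tied to that one load address,
which is the problem the rest of this post is about.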

so if I had a program that occupied the same r/o shared segment at
different virtual addresses in different virtual address spaces
... all the (absolute) relocatable adcons ... could at best take on a
single value ... which means it would be limited to working at one
specific (virtual) address location. so it doesn't work ... at least
not until we get quantum computing and the adcon will know what address
it is supposed to be based on what address it is being used from.

So the issue is how to make address constants work in an address free
environment ... more so than just the 12bit displacement paradigm
supported by instruction storage address calculation.

so to make it work .... i take a page out of the instruction decoding
and force address constants (which can be 24bits ... or later 31bits)
to be 24bit displacements ... rather than 24bit absolute
addresses. first off, you have to turn off the whole relocatable adcon
dictionary infrastructure. if i specified something like:
               =a(abcd)
it would store the displacement and leave around the information
for the loader to add the BASE address to it at load time before
the program started running. however, if I specified
               =a(abcd-base)
the assembler would treat it as an absolute (non-relocatable) value
and not leave any work orders for the loader. The unloaded value would
be the same .... however, the first form is fixed into an absolute
address by the loader when the program is brought into memory. the
second form the loader leaves alone and it remains a displacement.

now to get the real useable value for the second (displacement) form,
there has to be some inline code at run time that

                l     r3,=a(abcd-base)
                ar    r3,r12
                tm    0(r3),x'ff'

the issue is that the original 360 displacement addressing only
allowed for 12bit displacements (that are physically part of the
instruction) to be automatically handled by hardware address
calculation.
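
a sketch of why the displacement form works at any address (python,
illustration only). it assumes the label "base" is at the start of the
image and that r12 holds its runtime address; the dictionary standing
in for the shared image is hypothetical.

```python
# One read-only image, never modified by a loader: the adcon holds
# abcd-base, i.e. a displacement, not an absolute address.
IMAGE = {"adcon_abcd": 0x800}

def address_of_abcd(base_register):
    """Mimic the inline sequence: l r3,=a(abcd-base) / ar r3,r12."""
    r3 = IMAGE["adcon_abcd"]                 # l  r3,=a(abcd-base)
    r3 = (r3 + base_register) & 0xFFFFFF     # ar r3,r12 (24bit address)
    return r3

# The same image mapped at two different virtual addresses.
print(hex(address_of_abcd(0x100000)))   # 0x100800
print(hex(address_of_abcd(0x200000)))   # 0x200800
```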

So I went thru some amount of CMS kernel code, the original CMS
editor, some number of other programs and converted them from
traditional os/360 "relocatable adcons" to fixed displacements with
inline, runtime code that calculates the real address. I also later did
something similar to the ios3270, browse and fulist packages.  Some amount
of this code wasn't "read only" ... so I had to do various
restructures for it to reside in read-only shared segments.

The DCSS shared segment code was based on having captured a virtual
memory snapshot of the code and putting it away in a special VM/370
kernel structure. The original code included support from the CMS
page mapped filesystem ... instead of having to record the shared
segments in special VM/370 control structures ... I just put virtual
memory images out into the CMS filesystem ... and applications could
load virtual memory images directly out of the page mapped filesystem
... and optionally specify whether they were shared or not shared.

So originally there was a bunch of VM/370 kernel code to support the
"shared segment" extensions, the page mapped file system for cms. In
addition, there was a bunch of cms kernel and application code
converted to be "read only" and be address free ... as well as the
support for the page mapped filesystem ... and the CMS support to load
application images from the filesystem and specify whether they were
"shared" or "not shared".

The CMS people wanted to pick up all the code for DCSS ... but the
VM/370 kernel people only wanted to add a drastic subset of the
changes to the VM kernel. So the VM/370 kernel people didn't pick up
the page mapped filesystem support and only a very restricted subset
of the shared segment changes ... and then they cribbed the DCSS
diagnose instruction to interface to that drastic subset.

The CMS people picked up most of the shared segment code ... but left
out all the page mapped filesystem support (since the necessary vm/370
kernel support had been dropped). Now, this did create something of an
anomaly ... all of those initial CMS changes for DCSS shared segments
had code changes for both read-only code as well as eliminating
absolute adcons (allowing the same exact memory image to execute
simultaneously in different virtual address spaces at different
virtual addresses).

So one of the anomalies in the CMS DCSS code was the SVC$202 in page
zero. Normal calls in CMS involved loading the address of a parameter
list in register zero (which also specified the function) and
executing SVC 202. This could take the form of:

                   SVC 202
                   instructions

or

                   SVC 202
                   DC  AL4(*+4)
                   instructions

the cms kernel svc handler would look at the first byte after the svc
instruction and if it was zero (this is 24bit addressing ... so a
32bit adcon would always have the high byte zero), it would skip the
4byte adcon on a normal return. The address following the svc call was
for error returns. If there was an error in the processing, and there
was an adcon, the kernel would load the address in the adcon to return
to. If there was an error in the processing, and there was no adcon,
the kernel would go off to some standard kernel error handler and
never return to the program. I could go thru all the applications and
remove the DC AL4(*+4) which eliminates the relocatable address
problem ... but all application specific error handling and recovery
is lost. As an aside, while the interface allowed any address to go
into DC AL4(), common usage would have both error returns and
non-error returns come to the following instruction which would then
check the return code in the register for normal/error.
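
the return rule described above can be sketched like this (python,
illustration only; the function name and the use of None for "went to
the standard kernel error handler" are assumptions, not the actual cms
handler):

```python
def svc202_return_address(storage, svc_next, failed):
    """Pick the address to resume at after an SVC 202.

    storage  : bytes-like program storage
    svc_next : address of the byte right after the 2-byte SVC
    failed   : True when the requested function hit an error
    Returns the resume address, or None when control goes to the
    standard kernel error handler and never comes back.
    """
    has_adcon = storage[svc_next] == 0       # high byte of a 24bit adcon
    if has_adcon:
        if failed:
            return int.from_bytes(storage[svc_next + 1:svc_next + 4], "big")
        return svc_next + 4                  # skip the adcon
    return None if failed else svc_next

# SVC 202 at x'100', followed by DC AL4(*+4), i.e. x'00000106'.
with_adcon = bytearray(0x200)
with_adcon[0x100:0x102] = b"\x0a\xca"
with_adcon[0x102:0x106] = (0x106).to_bytes(4, "big")

# SVC 202 at x'100' followed directly by an instruction (LTR 15,15).
without_adcon = bytearray(0x200)
without_adcon[0x100:0x104] = b"\x0a\xca\x12\xff"

print(svc202_return_address(with_adcon, 0x102, failed=True))
print(svc202_return_address(without_adcon, 0x102, failed=True))
```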

In any case, I did a hack ... I put a dummy svc in CMS kernel page
zero ... in CMS NUCON and called it:

SVC$202            SVC   202
ERR$202            DC    AL4(*+4)
BR14$202           BR    R14

and then inline application code I changed from

                   SVC  202
                   DC   AL4(*+4)
                   instructions

to

                   BAL  R14,SVC$202
                   instructions

aka ... go off to the NUCON svc instruction ... which would return to
the branch on register 14 ... which had the value of the next
instruction from the branch and link operation.
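
a sketch of that control flow (python, illustration only). the NUCON
offset used for SVC$202 is hypothetical; the point is that the only
adcon lives at a fixed page-zero location, so the shared application
image carries none.

```python
SVC_202 = 0x4E0            # hypothetical NUCON offset of SVC$202
ERR_202 = SVC_202 + 2      # DC AL4(*+4) follows the 2-byte SVC
BR14_202 = ERR_202 + 4     # BR R14 follows the 4-byte adcon

def call_svc202_via_nucon(bal_address, failed):
    """Trace BAL R14,SVC$202 and return where the application resumes."""
    r14 = bal_address + 4                     # BAL is a 4-byte instruction
    error_target = ERR_202 + 4                # value assembled for AL4(*+4)
    resume_in_nucon = error_target if failed else ERR_202 + 4
    assert resume_in_nucon == BR14_202        # both paths reach BR R14
    return r14                                # BR R14 returns to the caller

# The same application code at two different virtual addresses.
print(hex(call_svc202_via_nucon(0x100010, failed=False)))
print(hex(call_svc202_via_nucon(0x200010, failed=True)))
```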

that was part of the distributed CMS DCSS ... which wasn't required
for fixed address shared-segments ... but which I had only invented to
handle the case for "floating" shared segments.

Later, the SVC202 processing was redefined ... and allowed for

                   SVC  202
                   DC   AL4(1)
                   instructions

the "1" is easily recognizable as not a valid address ... and the
kernel should return to *+4 after the svc whether there is an error or
not.

a couple past posts on shared segments and some nucon SVC$202 hack
that I had done for address free code.
http://www.garlic.com/~lynn/2001f.html#9 Theo Alkema
http://www.garlic.com/~lynn/2002o.html#25 Early computer games
http://www.garlic.com/~lynn/2003f.html#32 Alpha performance, why?
http://www.garlic.com/~lynn/2003g.html#27 SYSPROF and the 190 disk

somebody want to see some general cms application assembler code
with the use of adcons, svc 202s, and other stuff ... this is
a version of Kermit for CMS dated 1982:
http://www.ibiblio.org/pub/academic/computer-science/history/pdp-11/rsx/decus/rsx83b/356040/cmskermit.asm

over the years people have found other uses for svc$202; various old
postings in the vmshare archive referencing svc$202
http://vm.marist.edu/~vmshare/read?fn=INTRANS&ft=MEMO&line=116
http://vm.marist.edu/~vmshare/read?fn=STRANGE&ft=MEMO&line=125
http://vm.marist.edu/~vmshare/read?fn=STRANGE&ft=MEMO&line=171
http://vm.marist.edu/~vmshare/read?fn=STRANGE&ft=MEMO&line=236
http://vm.marist.edu/~vmshare/read?fn=TT&ft=NOTE&line=17
http://vm.marist.edu/~vmshare/read?fn=CMSSPR2&ft=MEMO&line=147
http://vm.marist.edu/~vmshare/read?fn=DVF&ft=MEMO&line=225

... how 'bout source for program originally written 1976 ..  but
posted version in the archive has some changes dated in mid-80s.
http://vm.marist.edu/~vmshare/read?fn=PRTRCVR&ft=NOTE&line=1

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

command line switches [Re: [REALLY OT!] Overuse of symbolic

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: command line switches [Re: [REALLY OT!] Overuse of symbolic
 constants]
Newsgroups: alt.folklore.computers
Date: Wed, 19 May 2004 09:54:39 -0600

jmfbahciv writes:

How one views the mapping depends on which hat you have on.

From the kernal's POV the mapping is NOT physical-->virtual;
it is exactly the other way around.  The user sees a virtual;
the monitor has to see a physical (CPUs do not execute virtual
code).

For a user program to be "sharable" it can't have hardwired
absolute addresses.  If it does, then the monitor has to
put that code at exactly that physical location.

the issue is not physical addresses it is fixed addresses ... if they
are virtual fixed addresses ... the virtual->real translation hardware
converts them to some physical address.

a standard 360/370 address constant was typically a 32bit word
containing a 24bit address ... absolute/fixed ... relative to "zero".
if it was running real ... it was relative to real location zero.
if it was running virtual, it was relative to virtual location
zero.

these were called "relocatable adcons" ... in the sense they were
stored as relative/displacement addresses ... and an administrative
dictionary was kept for the loader. when the loader/binder/etc brought
the image into memory ... it would access the administrative
dictionary and run thru the program image converting all the
displacements to absolute (relative to zero) before starting the
program.

the 360/370 had a convention in instructions for base+displacement
that the hardware automatically resolved to an absolute address
at instruction decode. the standard form was
                BDDD

B  ... 4bit/16 register
DDD .. 12bit displacement

the contents of the register was combined with the displacement to
form the effective address (relative to zero) at instruction decode
time. each dynamic image of a program had its own private copy of
registers ... so, from an instruction decode standpoint, (most of) the
program image was not tied to specific address (relative to zero,
whether virtual or real).
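
a minimal sketch of that BDDD resolution (python, illustration only;
the function name and the register values are made up, index
registers are left out, and 24bit addressing is assumed):

```python
def effective_address(bddd, registers):
    """Resolve a 16-bit base+displacement field against a register set."""
    base = (bddd >> 12) & 0xF
    disp = bddd & 0xFFF
    # Base register 0 means "no base": it contributes zero.
    base_value = registers[base] if base != 0 else 0
    return (base_value + disp) & 0xFFFFFF     # 24bit addressing

# The same instruction field (base 12, displacement x'9fe') executed
# with two different private register sets.
regs_a = [0] * 16
regs_b = [0] * 16
regs_a[12] = 0x100002
regs_b[12] = 0x200002
print(hex(effective_address(0xC9FE, regs_a)))   # 0x100a00
print(hex(effective_address(0xC9FE, regs_b)))   # 0x200a00
```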

the thing that tied a program image to a specific absolute address
(relative to zero, whether virtual or real) was the os/360 convention
of relocatable adcons ... which in the executable program image was
always converted from a displacement to an absolute address (relative
to zero).

the issue that I was grappling with was that I wanted the program
image to be completely location free ... not only all the address
processing in instructions (which had convention of register plus
12bit displacement that was resolved dynamically to an absolute
address at instruction execution) ... but also all the "relocatable
adcons" (which were being converted to absolute by the standard os/360
loader process).

each program image (address space) had its own private copy of
registers. the instructions in the program image were mostly location
free because of the instruction convention of addresses being register
contents plus 12bit displacement. the issue that prevented a program
image from being truly location independent was that they tended to be
sprinkled with these 32bit words containing absolute addresses (rather
than displacements, they were relative to zero, whether virtual or
real).

so i had to go thru code that i wanted to make totally location
independent/free and fixup all uses of address constants ... by
forcing them to be displacements and modifying the code to do some
inline code sequence that added the contents of a register to the
displacement value to form the absolute address value. From a
programming point of view, absolute address is the same whether
running virtual or real. When executing virtually, the hardware
translation takes a virtual absolute address (relative to virtual
zero) and converts it to an absolute real address (relative to real
zero).

so if i have a program image that is part of a read-only, shared,
virtual memory segment ... then if the program image starts at
x'100000' in one virtual address space and the same program image
starts at x'200000' in a different virtual address space ... then all
the instruction storage addresses will work correctly because there
will be some register with x'100000' (or x'200000' depending on the
address space) ... and all the 12bit displacements will be added to
the register value and converted to the correct absolute address for
the virtual address space that it is executing in.

however, the program image contains a number of different modules, say
each 4kbytes in size or larger (larger than 12bit displacement). to
branch between such modules the conventional instruction sequence
picks up an absolute address contained in the program image and
branches to that location. Since it is an absolute address ... it will
need to be relative to zero ... and can either be address of the form
x'1nnnnn' or x'2nnnnn' ... but it can't be both (at least w/o having
something like quantum effects).

In conventional os/360, these are referred to as relocatable adcons
... supposedly giving the program-image location independence ...  but
they are only stored as a displacement with respect to some base
before they have been loaded into memory. as part of the loader
bringing the program-image into memory, the loader runs thru a
dictionary of all adcons and converts them from relative to something
within the program ... to absolute (relative to location zero).
during normal execution, all of the relocatable adcons have become
absolute.

For my purposes to have a read-only program image to occupy different
(virtual) addresses in different virtual address spaces simultaneously
... the program image had to be totally location independent (even at
run time). All the relocatable adcons had to still be in displacement
form at runtime ...  not absolute form.

One possible solution that some other technologies have used is
to totally separate instructions and data (not allowing the intermixing
of data and instructions that can occur in 360 programs). Address
constants are placed in a special part of the address space that is
not-shared and private to the virtual address space. The programming
convention has a process where the program can find the location of
its address constants. These address constants can be absolute, but
since they aren't part of the program image, they can have the value
that is specific to that address space.
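
a sketch of that separate-table approach (python, illustration only;
the class, the symbol name and the offsets are hypothetical and not
taken from any particular system):

```python
# Offsets of symbols within the shared, read-only program image.
SHARED_OFFSETS = {"module_b": 0x2000}

class AddressSpace:
    """One virtual address space with its own private adcon table."""

    def __init__(self, image_base, symbol_offsets):
        # Private, non-shared table: absolute addresses are fine here
        # because each address space builds its own copy.
        self.adcon_table = {
            name: image_base + offset
            for name, offset in symbol_offsets.items()
        }

    def resolve(self, name):
        return self.adcon_table[name]

space_1 = AddressSpace(0x100000, SHARED_OFFSETS)
space_2 = AddressSpace(0x200000, SHARED_OFFSETS)
print(hex(space_1.resolve("module_b")))   # 0x102000
print(hex(space_2.resolve("module_b")))   # 0x202000
```

the shared image itself stays byte-for-byte identical; only the small
private table differs between address spaces.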

the problem that i was having in 360/370 with location independent
read-only, shared program image was a combination of the convention of
having absolute address constants (at least at runtime) and the
address constants were part of the program image.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Infiniband - practicalities for small clusters

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Infiniband - practicalities for small clusters
Newsgroups: comp.arch
Date: Wed, 19 May 2004 14:51:40 -0600

hack@watson.ibm.com (hack) writes:

No, because it is in processor-private real memory (z/Series page 0).

360/370 architecture (and later) used real page zero for a lot of
processor specific operations ... for example on an interrupt to a
specific processor ... the current processor status is stored in real
page zero.

so for 360 SMP ... there are multiple real processors ... sharing the
same flat real address space ... each processor needs its own page
zero, but there is only a single real page zero.

the solution was a processor specific page zero "prefix register"
... where the processor loaded a specific (real) page address which
was used to modify all absolute real addresses for page zero. of
course processors in the same smp complex all agreed to choose
different real page addresses for their page zero prefix registers.

On 360 SMP ... any real addresses that mapped to real page zero were
rerouted to the page specified by the processor's page zero prefix
register. Any real address that mapped to the value in the prefix
register ... also went to that real address.

The operation of the SMP page zero prefix register was changed in
370. Real addresses that mapped to real page zero were rerouted
to the page specified by the processor's page zero prefix register
(just like 360) ... however 370 introduced reverse prefix register
mapping. In 370, for a real address that mapped to the page specified
in the page zero prefix register ... instead of going to that address
... it went to the "real, real" page zero ... aka the page zero that
is the same for all processors in the complex.

In the 360 SMP scenario ... the real, real page zero was somewhat
lost. The enhancement for 370 SMP ... was the real, real page zero
could become a convenient place to store common, cross-complex
information.
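
the difference between the two prefixing rules can be sketched as
follows (python, illustration only; the function names are made up and
a 4096-byte prefix area is assumed):

```python
PAGE = 4096   # assumed size of the prefixed area

def prefix_360(real_address, prefix_page):
    """360 SMP rule as described: only page zero is rerouted."""
    page, offset = divmod(real_address, PAGE)
    if page == 0:
        page = prefix_page
    return page * PAGE + offset

def prefix_370(real_address, prefix_page):
    """370 rule: page zero and the prefix page swap places."""
    page, offset = divmod(real_address, PAGE)
    if page == 0:
        page = prefix_page
    elif page == prefix_page:
        page = 0          # reverse prefixing reaches the common page zero
    return page * PAGE + offset

cpu_prefix = 5            # hypothetical prefix page for one processor
print(hex(prefix_360(5 * PAGE + 0x10, cpu_prefix)))   # stays in page 5
print(hex(prefix_370(5 * PAGE + 0x10, cpu_prefix)))   # common page zero
```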

here is specific section in esa/390 principles of operation discussing
prefixing
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/BOOKS/DZ9AR004/3.7?SHELF=EZ2HW125&DT=19970613131822

here is the search page from esa/390 principles of operation for words
prefix register
http://publibz.boulder.ibm.com/cgi-bin/bookmgr_OS390/SEARCH?Book=dz9ar004&searchRequest=prefix+register&SEARCH=Search&Type=FUZZY&SHELF=EZ2HW125&DT=19970613131822&searchTopic=TOPIC&searchText=TEXT&searchIndex=INDEX&rank=RANK

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

command line switches [Re: [REALLY OT!] Overuse of symbolic

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: command line switches [Re: [REALLY OT!] Overuse of symbolic
 constants]
Newsgroups: alt.folklore.computers
Date: Thu, 20 May 2004 09:39:00 -0600

jmfbahciv writes:

Then that code isn't a sharable segment.  A sharable segment means
that the only thing JMF had to do was update the users's PMP (page
map page) with the PMP of the sharable segment and that the PMP of
the sharable segment could be put into more than one user's PMP.  I
can't recall how it was done on the KA (before paging hardware).
But a similar thing happened.  The sharable segment was in core at
ADDRfoo; all users who had that high seg "in core" pointed to the
same physical address.  Note that these segments were read-only.

360/370 sort of had location independent code .... in 360, the kernel
could bring code into an arbitrary location in real memory (for the most
part because instructions were location independent having addresses of
the form register+displacement) ... and some of it was even
"shareable" by all applications running in real memory. the issue that
prevented these program images from being totally location independent
was these things called "relocatable adcons" which were given an
absolute address at load time.

the issue for cp/67 & vm/370 ... was with using the virtual memory
hardware and segment sharing for sharing between different virtual
address spaces. on 370, vm/370 used 64kbyte segments & in 24bit
addressing that provided up to 255 segment "objects".

The problem with relocatable adcons ... and the subset of virtual
memory management released in the product as DCSS ... was that at
installation time, a fixed/absolute virtual address had to be chosen
for a shareable program. Now any program image might be one or more
64kbyte shared segments ... and there were lots more program images to
be defined as shared than there were unique virtual addresses. For a
specific user, CMS tended to have all applications that person needed
running in a single virtual address space.

The issue became when a person needed combinations of shared program
images simultaneously in the same address space. For the first dozen
or so ... the system installation process could fabricate unique
virtual addresses for the desired combinations. However, past a
certain point there weren't sufficient system-wide unique virtual
addresses (combination of one or more 64kbyte contiguous segments in
24bit address space) to satisfy all possible user requirements at the
installation. No specific user might want more simultaneous shared
objects than could fit in a single 24bit address space ... but for the
aggregate of all users wanting some combination subset of all possible
shared objects ... being able to assign unique system-wide virtual
address for every possible shared object became impossible.
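
a small sketch of the conflict (python, illustration only; the object
names, segment numbers and function name are all hypothetical). each
shared object gets fixed 64kbyte segments at install time, and two
objects can only be used together if their fixed ranges don't overlap.

```python
def can_coexist(objects, wanted):
    """objects: name -> (first_segment, segment_count), fixed at install.

    Returns True when every object in wanted fits in one address space
    without two of them claiming the same 64kbyte segment.
    """
    used = set()
    for name in wanted:
        first, count = objects[name]
        segments = set(range(first, first + count))
        if used & segments:
            return False
        used |= segments
    return True

# Hypothetical install-time assignments (segment numbers 0..255).
installed = {
    "editor": (200, 2),   # segments 200-201
    "browse": (201, 1),   # overlaps the editor
    "fulist": (210, 1),
}
print(can_coexist(installed, ["editor", "fulist"]))   # True
print(can_coexist(installed, ["editor", "browse"]))   # False
```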

So the ways the original virtual memory management addressed the
limitation of having at most 255 possible shared objects were twofold:

1) shared-objects could be normal program images in the cms filesystem
... rather than in a global, system-wide kernel defined facility.
subsets of the user community could have their own set of program
images that were defined as shared ... and didn't need to follow
a single, system-wide convention defined for all users at the original
installation of the application

2) it was possible to go thru various applications and modify the
relocatable adcon convention ... to use address constants that were
displacements/offsets/relative to some value that would be in a
register ... and have inline code to add the displacement value
occupying the program image to some value in a register. the net
result was that the program image became totally location independent
... and the same program image could occupy a read-only shared segment
and that shared-segment could be defined at arbitrary different
virtual addresses in different virtual address spaces.

The whole relocatable adcon philosophy was invented in the days of
os/360 and real memory ... where a single real memory was shared by
all simultaneously running applications. The program image on disk
could be loaded at an arbitrary location ... so any arbitrary
combination of programs could occupy whatever (real) addresses were
available. The installation didn't have to go thru every
available installed program at installation time and assign an
arbitrary unique address ... which could prevent some combinations of
applications from running simultaneously if they were forced to
pre-assign a fixed, absolute address to every program.

However, the os/360 people hadn't yet considered virtual address
spaces and shared segments ... where different virtual address spaces
might want to have arbitrary different combinations of program images
... but still want to have the physical space taken up by such program
images be shared across all simultaneous users. While the instruction
architecture provided for program images that were location independent
... the os/360 designers took some short cuts in defining the convention
for location related data (i.e. the relocatable adcons).

So, I spent a fair amount of time trying to overcome the short cut
taken with the relocatable adcons convention actually being absolute
addresses at execution time ... which was the chink preventing
executing program images from being totally location independent.

Note also that the relocatable adcon convention prevented the
program image on disk from being exactly the same as the executing
program image (since there was the little matter of doing the
relocatable adcon swizzle from offset to absolute at load time).
Removing that swizzle then created opportunities with the page mapped
filesystem: simply pointing the paging system at a program image in
the filesystem and depending on the paging system to do all the work
(not requiring the loader to swizzle the adcons when the page was
brought into storage for execution):
other posts about page mapped filesystem
http://www.garlic.com/~lynn/subtopic.html#mmap
past post about location independent program images
http://www.garlic.com/~lynn/subtopic.html#adcon

A sort of side story ... somewhere along the way I was doing some work on
logical machines (part of a recent thread in comp.arch on symmetric
multiprocessing) ... and there was a corporate advanced technology
conference ... where it was asked that logical machines be presented.
Also on the agenda was presentation of 801 and cp.r.

so during the logical machine presentation, there was some heckling by
somebody in the 801 group ... who stated that they had examined the
product vm/370 code at the time and it contained no SMP support ... so
they didn't believe that we could run a vm/370 system on a logical
machine (a non-cache coherent, 16-way smp ... with 16 370/158 engines
all tied together). the response was that the code in the vm/370 could
be modified to support multiprocessor (the heckler from the 801 group
was basically expressing the opinion that they didn't believe that
somebody could write a couple thousand lines of code modifications to
the vm/370 kernel).

so it came time for the 801 group to give their presentation ... and
they said that the machine was 32bit virtual address and it had 16
256mbyte segments ... which were implemented as 16 segment registers
(rather than a table of 16 segment table pointers). I returned the
courtesy of heckling them about 1) the huge number of lines of code
being proposed for cp.r ... sure seemed like a lot bigger effort than
the rather modest amount of changes needed to enhance vm/370 to
support 16-way multiprocessor (and which they sort of had implied
disbelief that it could be done) and 2) having only the limited 16
possible segments .... seemed to severely restrict the total number of
useful different shared objects that could be defined in such an
environment (which came from trying to grapple with the problem of
trying to manage an environment with a maximum of 255 unique shared
objects in the 370 world).

the reply was that there were no protection domains in 801 at runtime;
that all application, runtime code could change the contents of
segment registers as easily as it could change the contents of general
registers. that the convention allowing free access to any location in
a 32bit (virtual) address space by simply changing the contents of
a general register to address the location ... was extended to the
segment registers to provide the application inline code access to any
of the possible system segments (i.e. applications could as easily do
cross address space addressing as they could do intra-address space
addressing).
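
a minimal sketch of the addressing scheme described above, in python
purely for illustration (it is not 801 or cp.r code). it assumes the
top 4 bits of a 32bit virtual address select one of the 16 segment
registers and the low 28 bits are the offset inside a 256mbyte
segment. the class name, the method names and the idea that a register
holds a plain "segment id" are all hypothetical.

```python
OFFSET_BITS = 28                      # 2**28 bytes = 256 mbytes per segment
OFFSET_MASK = (1 << OFFSET_BITS) - 1  # low 28 bits: offset within the segment
NUM_SEGMENT_REGISTERS = 16            # 16 * 256 mbytes = 4 gbytes = 32bit space


class SegmentRegisters:
    """Toy model of 16 segment registers; not the actual 801/ROMP hardware."""

    def __init__(self):
        # assumption: each register simply holds a system-wide segment id
        self.regs = [0] * NUM_SEGMENT_REGISTERS

    def load(self, index, segment_id):
        # deliberately no privilege check, mirroring the description above
        self.regs[index] = segment_id

    def translate(self, vaddr):
        """Split a 32bit virtual address into (segment id, offset)."""
        index = (vaddr >> OFFSET_BITS) & 0xF  # top 4 bits pick the register
        return self.regs[index], vaddr & OFFSET_MASK


sr = SegmentRegisters()
sr.load(3, 0x123)                 # application points register 3 at segment 0x123
print(sr.translate(0x30000010))   # (291, 16)
```

the point of the toy is that "load" is as unprivileged as any other
register assignment, so only 16 segments are addressable at one
instant but the set can be changed inline at any time.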

this was somewhat the target for the office product division
displaywriter follow-on product; 16bit 801 processor called ROMP
running with cp.r ... written in pl.8. A problem did start to show up
when the displaywriter project was canceled and the decision was to
retarget the machine to the unix workstation environment. They would
get the company that did the AT&T port for PC/IX to do one to what was
to become called the PC/RT. An issue tho with unix was that there was
something of an assumption about having a hardware protection domain
between what applications could do and what the kernel could do ... and
you didn't provide every arbitrary application free wheeling access to
all privileged protected security features.

recent post in comp.arch mentioning logical machines
http://www.garlic.com/~lynn/2004f.html#21 Infiniband - practicalities for small clusters

other posts about 801, romp, etc.
http://www.garlic.com/~lynn/subtopic.html#801

other posts about smp, compare&swap, etc
http://www.garlic.com/~lynn/subtopic.html#smp

misc. other posts in the recent comp.arch thread
http://www.garlic.com/~lynn/2004e.html#40 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004e.html#41 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004e.html#42 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004e.html#44 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004e.html#51 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004e.html#52 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#3 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#4 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#5 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#6 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#15 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#16 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#19 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#22 Infiniband - practicalities for small clusters
http://www.garlic.com/~lynn/2004f.html#25 Infiniband - practicalities for small clusters

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

[Meta] Marketplace argument

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: [Meta] Marketplace argument
Newsgroups: comp.arch
Date: Fri, 21 May 2004 13:34:14 -0600

Rick Jones writes:

The computing industry isn't as much a game of "Follow The Leader" as
it is one of "Ring Around the Rosy" or perhaps "Duck Duck Goose."
- Rick Jones

there was a joke in the valley during the early 80s ... and possibly
still that there were actually only (at most) 200 people in the
industry ... it isn't either "follow the leader" or "ring around the
rosy" ... it is just the same people in different disguises.

there used to be a page on some hp.com url that described the work of
somebody who had done the dual address space 370 architecture (i.e. having
software that needed to have addressing to multiple, different virtual
address spaces), worked on 801 fort knox (replace all the
micro-engines in various controllers, misc. general purpose computers
as well as the low&mid-range 370s), and snake (and talked within the past
couple years about having done a lot of the ia-64
architecture).

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

[Meta] Marketplace argument

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: [Meta] Marketplace argument
Newsgroups: comp.arch
Date: Fri, 21 May 2004 21:18:07 -0600

"del cecchi" writes:

Wasn't bernbaum or something like that the name of the guy that left IBM
and went to HP and did HPPA?  Long about Iliad time?

i have to go back and check some dates ... the person i'm talking about was
6-18? months after joel ... there is some folklore he spent the last
two weeks (after giving notice) finishing up some work on blue iliad.

... hint
http://www.hpl.hp.com/news/2001/apr-jun/2worley.html

some of the other hp.com references that went into more detail are now
404

lets see if the way-back machine works
http://web.archive.org/web/20000816002838/http://www.hpl.hp.com/features/bill_worley_interview.html

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

[Meta] Marketplace argument

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: [Meta] Marketplace argument
Newsgroups: comp.arch
Date: Fri, 21 May 2004 22:17:17 -0600

Anne & Lynn Wheeler writes:

lets see if the way-back machine works
http://web.archive.org/web/20000816002838/http://www.hpl.hp.com/features/bill_worley_interview.html

in the endicott/801 reference above from the way-back machine ... this
was in the 4381 time-frame. endicott had been a whole lot of converted
one story warehouse and manufacturing bldgs ... however, a
brand-spanking new, multi-story bldg. with brick facing was built for
the 801 effort (or at least built in that time-frame and most of the
801 people got offices there). i contributed to the justification that
killed the endicott 801 effort based on implementing 370 directly in
hardware (aka 4381) ... rather than use 801 for a micro-engine and
implement 370 as microcode running on the 801 engine.

that whole 801 thing in the very late 70s and early 80s was about the large
number of different micro-processors all over the corporation ... all
requiring their own unique programming. the low and mid-range 370s had all
been microcode engines ... going back to 360 days. the issue going
into the 4381 was that chip technology was starting to get to the
point where you could consider actually doing a mid-range 370 directly
in hardware. the advances in chip technology were happening at the same
time as the push for using 801 to replace the wide variety of
micro-engines with a common 801 architecture. The 4381 issue wasn't so
much that the idea for replacing all the micro-engines was bad ... but
that 4381 could get much better price/performance by implementing
directly in hardware (no longer using the micro-engine approach for
mid-range 370).

by comparison, the 4381 predecessors; 4341 & before that 148 ... were
heavily microcoded micro-processor engines ... avg. something like ten
microcode instructions per 370 instruction, i.e. the mip rate of the
microprocessor engine had to be ten times faster than the 370 mip
rate. this is akin to several of the current day 370/390 emulators
running on intel platforms.
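
the arithmetic above as a tiny python sketch. the function name and
parameter names are made up for illustration; the only number taken
from the text is the average of roughly ten microcode instructions per
370 instruction.

```python
def required_engine_mips(target_370_mips, micro_per_370_instruction=10.0):
    """Native micro-engine mip rate needed to deliver a given 370 mip rate.

    micro_per_370_instruction defaults to the rough average quoted above;
    real ratios would vary with the instruction mix.
    """
    return target_370_mips * micro_per_370_instruction


# a 1 mip 370 implemented this way needs an engine running about 10 native mips
print(required_engine_mips(1.0))   # 10.0
```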

in fact, ecps for 148 ...
http://www.garlic.com/~lynn/94.html#21 370 ECPS VM microcode assist
http://www.garlic.com/~lynn/94.html#27 370 ECPS VM microcode assist
http://www.garlic.com/~lynn/94.html#28 370 ECPS VM microcode assist
there was effectively a ten times speed up for straight kernel code
dropped into microcode ... on nearly a byte-for-byte basis i.e. took
6k bytes of kernel 370 code and dropped it into 6k bytes of microcode
for a 10:1 speedup. the issue here was that while overall the
microcode engine was simpler than 370 ... the straightline
kernel code made little use of any 370 complexities.

there were two categories of ECPS speed-up ... straight kernel code
dropped into microcode on nearly 1:1 basis (giving 10 times speedup)
... and modifying the virtual machine execution of privileged
instructions. Enhancing privileged instruction operation in virtual
machine mode could bring a 40-100 times speed up ...  since the change
was done directly in the native microcode instruction processing
routine w/o having to interrupt at all into the kernel for instruction
simulation.
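
a rough way to see what a 10:1 speedup on part of the kernel pathlength
does overall is the usual amdahl's law calculation, sketched here in
python. this is my illustration, not a measurement from the ECPS work:
the function name is hypothetical and the 50 percent fraction in the
usage line is an invented example value.

```python
def overall_speedup(fraction_moved, local_speedup=10.0):
    """Amdahl's law: overall speedup when only part of the time is sped up.

    fraction_moved: share of original execution time spent in the code
                    that gets dropped into microcode (0.0 to 1.0)
    local_speedup:  speedup of that code alone (10.0 per the text above)
    """
    if not 0.0 <= fraction_moved <= 1.0:
        raise ValueError("fraction_moved must be between 0 and 1")
    return 1.0 / ((1.0 - fraction_moved) + fraction_moved / local_speedup)


# hypothetical example: half the kernel time moved into microcode
print(round(overall_speedup(0.5), 2))   # 1.82
```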

I would have met bill when he was share chair of the hasp group (again
see wayback reference) but can't remember ... since as an
undergraduate I had done a lot of changes to hasp system and was given
the opportunity to present at share a number of times.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

vm

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: vm
Newsgroups: bit.listserv.ibm-main
Date: Sun, 23 May 2004 13:48:12 -0600

richgr writes:

To answer your original question the answer is "yes, but . . .".  The
but is that while there is some VM here, "ibm-main" is mostly MVS & OS/390
oriented.

If you want VM, I would suggest joining the VM-ESA list which is
primarily VM.  There is no corresponding newsgroup (that I know of).
To join the VM-ESA list:

send email to the listserv:

listserv@listserv.uark.edu

the whole bit newsgroup hierarchy originated from mailing lists on
(vm-based) bitnet/earn somewhat starting back in the early to
mid-80s. the mailing list processor was somewhat borrowed from an
earlier computer conferencing mailing list processor from the internal
corporate network.

at some point there were gateways established between the bitnet/earn
mailing lists and the usenet news distribution facility. usenet which
had been for the most part uucp ... over time, mostly migrated to
tcp/ip and the internet.

many current ISPs don't carry all of the "bitnet" gatewayed mailing
list newsgroups ... and for the places i've run across
bit.listserv.vmesa-l ... there has been almost no activity (i.e.
little of the vmesa traffic is actually being gatewayed). I count over
400 newsgroups defined in the bit newsgroup hierarchy ... but the
majority don't seem to be active &/or actually have a functioning
gateway.

misc. past postings related to bitnet/earn:
http://www.garlic.com/~lynn/subnetwork.html#bitnet

the internal corporate network was almost totally vm-based (as well as
all the networking related tools) and larger than the arpanet/internet
from just about the beginning until sometime about mid-85.

a big issue for the arpanet came with the big switch-over from
homogeneous networking to internetworking with technology like
gateways on 1/1/83. one of the big issues for the vm-based internal
corporate network was that it effectively had gateway function from
its origins (and significantly contributed to the internal network ease
of growth).

In various respects the JES & other mainframe family of networking
shared a number of the arpanet homogeneous limitations. The stereotype
of problems with homogeneous networking was the requirement for
synchronized conversion of all nodes. The internal corporate example
was an mvs system being upgraded to the latest release of JES ... which
was generating file-formats that were causing MVS systems in hursley to
crash ... which was in turn blamed on the VM-network infrastructure.
One of the tasks that had been given the corporate vm-based networking
infrastructure was to maintain canonical JES header formats and the
vm-gateways were given the responsibility for knowing which version of
JES might be on the other end of a wire and to appropriately convert
JES headers to keep the MVS systems from crashing and burning. The problem with
polluted JES headers between different releases causing respective MVS
systems to crash ... was just one of the features that restricted MVS
systems to isolated end-nodes in the internal network.

Another common similarity between the JES networking support and
arpanet was the use of a one byte field for node addressing. JES was
slightly more restricted since the one byte field was also used to
address all the local hasp pseudo-devices ... and JES had a nasty
habit of trashing anything on the network that had an origin and/or a
destination that wasn't in the local one-byte device table (which
would have been a real disaster for the internal network to even
consider using JES as any sort of intermediate node in the internal
network; aka it would arbitrarily trash traffic that had origin or
destination that wasn't in its local node table .... and it could
arbitrarily crash the whole machine if some traffic from a JES system
at a different level happened to come by).
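
a toy python model of the one-byte table behavior described above. it
is not JES code; the class, the method names and the node names are
invented, and the only facts it leans on are that a one-byte index
gives at most 256 entries, that local pseudo-devices and network nodes
share those entries, and that traffic with an unknown origin or
destination gets discarded.

```python
MAX_ENTRIES = 256  # a one-byte index can distinguish at most 256 entries


class OneByteNodeTable:
    """Toy model: local pseudo-devices and network nodes share one table."""

    def __init__(self, local_devices, network_nodes):
        entries = list(local_devices) + list(network_nodes)
        if len(entries) > MAX_ENTRIES:
            raise ValueError("more entries than a one-byte index can address")
        self.known = set(entries)
        # every local pseudo-device uses up a slot a network node could have had
        self.free_slots = MAX_ENTRIES - len(entries)

    def accepts(self, origin, destination):
        # traffic is kept only if BOTH ends are in the local table
        return origin in self.known and destination in self.known


# hypothetical configuration: 60 pseudo-devices leave room for 196 nodes
table = OneByteNodeTable(
    local_devices=[f"DEV{i}" for i in range(60)],
    network_nodes=["NODEA", "NODEB"],
)
print(table.accepts("NODEA", "NODEB"))   # True
print(table.accepts("NODEA", "NODEZ"))   # False: unknown destination is trashed
```

an intermediate node built this way drops anything passing through
between two nodes it doesn't know about, which is the reason given
above for keeping such systems at the edge of a network approaching a
thousand nodes.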

At the time of the 1/1/83 switchover from arpanet to internet, arpanet
had approx. 250 nodes (at the limit of its addressing infrastructure)
and the internal network had almost a thousand nodes. misc. related
posts on the subject:
http://www.garlic.com/~lynn/internet.htm

minor reference on the web: A Social History of Bitnet and Listserv
http://www.computer.org/annals/articles/bitnet.htm

in the above, it mentions that bitnet grew slowly in the early '80s,
having only 157 nodes at the beginning of 1984. The summer of 1983,
the internal network had passed 1000 nodes and was well on its way to
doubling that; specific reference:
http://www.garlic.com/~lynn/internet.htm#22

and here is a note from somewhere early 1984 sort of outlining earn
http://www.garlic.com/~lynn/2001h.html#65
the above somewhat refers to the fact that I took a lot of heat and
blame for computer conferencing in the 80/81 timeframe.

the corporation contributed heavily to all the bitnet links inside the
US ... and possibly paid for nearly all the "earn" (i.e. non-US
bitnet) links outside the US

in my desk i have a clear plastic ball (about the same size as a baseball)
commemorating the 1000th node on the internal corporate network.

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

MITM attacks

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: MITM attacks
Newsgroups: sci.crypt
Date: Sun, 23 May 2004 16:44:52 -0600

Guy Macon <http://www.guymacon.com> writes:

O.K.  I will byte. :)  (Note that I am not an expert, so feel free
to correct the misunderstandings I almost certainly have.)

If I communicate with someone using PGP, and both of us have our
keys verified with a high degree of confidence through the web of
trust, and you are the man in the middle, how would you break our
defense?

Enquiring minds want to know!  ;)

you've sort of been sucker punched. basically all MITM countermeasures
involve some (trusted) out-of-band communication ... that isn't
subject to the MITM attacks of the communication channel in question.

you walk into your bank and something is exchanged that can
provide unique mutual authentication. from then on, you and the bank
can exchange messages based on the mutual authentication technology.

somebody else walks into the bank and something else is exchanged that
also enables unique mutual authentication. you then want to
communicate to this other entity ... you can securely send the message
to your bank and have them securely forward to the destination.
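
a minimal python sketch of the relay-through-the-bank flow just
described. it assumes, purely for brevity, that the "something
exchanged" in person is a random shared key used with HMAC-SHA256 from
the python standard library; the post doesn't say what is exchanged,
and the Bank class, its method names and the party names are all
hypothetical.

```python
import hashlib
import hmac
import secrets


def make_tag(key, message):
    """Authentication tag over message under a key shared with the bank."""
    return hmac.new(key, message, hashlib.sha256).digest()


def check_tag(key, message, tag):
    return hmac.compare_digest(make_tag(key, message), tag)


class Bank:
    """Toy trusted third party; keys are registered out-of-band (in person)."""

    def __init__(self):
        self.keys = {}

    def enroll(self, name):
        # stands in for the walk-into-the-bank step; returns the customer's key
        self.keys[name] = secrets.token_bytes(32)
        return self.keys[name]

    def forward(self, sender, recipient, message, tag):
        # authenticate the sender, then re-tag the message for the recipient
        if not check_tag(self.keys[sender], message, tag):
            raise ValueError("sender authentication failed")
        return message, make_tag(self.keys[recipient], message)


bank = Bank()
alice_key = bank.enroll("alice")
bob_key = bank.enroll("bob")

msg = b"meet at noon"
relayed, relayed_tag = bank.forward("alice", "bob", msg, make_tag(alice_key, msg))
print(check_tag(bob_key, relayed, relayed_tag))   # True
```

the sketch covers authentication only (nothing is encrypted), and
everything rests on the enroll step happening over a channel the MITM
can't touch.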

the security business process of trust effectively works the same if
the two of you exchange messages directly and the bank just acts as a
public key server (somewhat akin to the yahoo ietf draft submitted
last week). a vulnerability is if you are using the same exact
infrastructure to establish trust with the web-of-trust keyserver ...
then MITM could be attacking that also. the countermeasure is again
some out-of-band information that isn't vulnerable to the MITM
attacker. sometimes web-of-trust assumes that it might be able to use
a suspect communication channel (prone to MITM) in multiple different
ways ... in the hopes that the MITM isn't your ISP and therefore
constantly operating.

However, if it is possible to demonstrate trusted mutual
authentication between two different parties and a trusted 3rd party
... then it is possible to leverage that to extend mutual
authentication directly between the two parties. The level of business
trust isn't directly affected by having the bank be the intermediate
transmission or by just having them provide the authentication
infrastructure.

now attacks on this infrastructure wouldn't be MITM ... but they might
be insider. a lot of existing authentication infrastructure is based
on various kinds of shared-secrets, aka "something you know" static
data. a lot of current fraud is harvesting such static data and using
it to impersonate other entities. a lot of phishing email is making
the email sound official enuf that consumers are tricked into
believing it w/o having actual proof. So is this a MITM attack? ... or
social engineering? Are all insider attacks, MITM?

minor drift, reference to recent news note about a study that is
about to be published:
http://www.garlic.com/~lynn/aadsm17.htm#38 Study: ID theft usually an inside job

--
Anne & Lynn Wheeler | http://www.garlic.com/~lynn/

Usenet invented 30 years ago by a Swede?

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Usenet invented 30 years ago by a Swede?
Newsgroups: sci.archaeology,soc.culture.nordic,soc.history.medieval,soc.history.science,comp.protocols.tcp-ip,alt.folklore.computers
Date: Sun, 23 May 2004 15:57:07 -0600

Doug Weller writes:

Somehow, on newsgroups devoted to history and archaeology, we were
sidetracked to a discussion of who invented Usenet and whether Usenet rules
insisted that top posting was correct (I thought I knew the answers to
both.

sort of hard to have discussion of usenet w/o also having discussion
of uucp ... usenet having grown up in somewhat different genre from
arpanet ... lots of store&forward dial-up network (w/o end-to-end
connectivity). 1980 or so, somebody tried to make a scathing
comparison of the arpanet and store&forward networks .... by claiming
that store&forward networks were like the postal system ... and if the
postal system was run like the arpanet ... if you were to send a
letter from tokyo to london ... you could only post it when it was
simultaneously first shift for the origin post office in tokyo, the
destination post office in london, as well as every post office between
tokyo and london.

old posting in alt.folklore.computers on the uucp subject:
http://www.garlic.com/~lynn/2001b.html#57

the above has reference to the UUCP web site:
http://www.uucp.org/

misc refs from search engine using: usenet, uucp, history
http://www.vrx.net/usenet/history/hardy/
http://www.uucp.org/history/index.shtml
http://www.cs.uu.nl/wais/html/na-dir/usenet/software/part1.html
http://www.tldp.org/LDP/nag/node256.html
http://livinginternet.com/u/ui_old.htm

usenet & uucp both predate the big switch-over of the arpanet to
tcp/ip on 1/1/83. somewhat an aside, 5-6 years after the switch-over
to tcp/ip, the federal gov. had gosip which was mandating the
elimination of tcp/ip with total conversion to osi.

recent posting today t