List of Archived Posts

1996 Newsgroup Postings

360/370
Cache
Hypothetical performance question
360/370
360/370
360 "channels" and "multiplexers"?
John Hartmann's Birthday Party
360 "channels" and "multiplexers"?
Why Do Mainframes Exist ???
Why Do Mainframes Exist ???
cics
Caches, (Random and LRU strategies)
Caches, (Random and LRU strategies)
IBM song
PC reliability
mainframe tcp/ip
tcp/ip
middle layer
middle layer
IBM 4381 (finger-check)
IBM 4381 (finger-check)
1401 series emulation still running?
IBM 1403 printer
Old IBM's
Old IBM's
old manuals
SGI O2 and Origin system announcements
System/360 Model 30
Mainframes & Unix
Mainframes & Unix
Mainframes & Unix
interdata and perkin/elmer
Mainframes & Unix
Mainframes & Unix
Mainframes & Unix
Mainframes & Unix
Mainframes & Unix (and TPF)
Mainframes & Unix (and TPF)
interdata & perkin/elmer machines
Mainframes & Unix
Mainframes & Unix
what happened to the 286?
IBM 4361 CPU technology

360/370

From: Lynn Wheeler <lynn@garlic.com>
Date: 1996/01/06
Subject: 360/370
Newsgroups: comp.arch

typically circuits/chip have significantly increased
and the mis-match between processor cycle and memory
latency has also increased. So there are two approaches
to processor stall ... go to weak memory consistency
or support concurrent threads.

assuming a cache machine & all else being equal ... concurrent
threads would tend to increase the cache size requirements over
weak memory consistency i.e. executing double the instructions
per unit time ... isn't likely to double the cache
line requirements ... while concurrent threads would
tend to double the cache requirements (except possibly
in degenerate case of the two threads executing the
same code) ...

To some extent the workstation & PCs tend to single threaded
uses ... chips for that market have optimized for single thread.
Given that the volumes drive to commodity price ... then they
become the chip of choice for other uses also. That could change
if useful multi-threaded systems/applications appear that
provide perceived/tangible added value to the end-user ... in
which case the volume chip market might start supporting
concurrent multi-threaded instruction streams.

Cache

From: lynn@netcom23.netcom.com
Date: Sun, 28 Jan 1996 18:58:49 GMT
Subject: re: Cache
Newsgroups: comp.arch

David Mayhew, ibm wrote:

Suppose you held cache size constant. As caches grow,
they have a decreasing marginal return in hit rate. A 4K cache
might yield a 60% hit ratio while a 16K cache might not be much
better than 90%. Statistically 4 threads each of which only have
60% hit ratios, but that can "instantly" context switch amongst
themselves will have a 96.5% hit ratio, compared to the original
single thread of only 90%. As the disparity grows between
processor speed and memory speed the multithreading seems
to be advantageous.

multiple hardware threads on a chip should handle processor-stall
masking associated with cache misses and better utilization of shared
resource (cache size).

current point on the curve seems to be that there isn't a big
overabundance of cache.

i know of one software vendor that (with detailed simulator) has
convinced at least some hardware vendors that 1mbyte cache is bare
minimum and 4mbyte L2 is much better.

also 2/4/etc way set associative also impacts the ability to utilize
efficiently what is there. There have been some pathological
mis-matches between software and bits indexing sets where some sets
don't get utilized at all.
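
A rough sketch of how such a mismatch can arise, assuming a cache whose
set index is taken from the address bits just above the line offset. The
line size (32 bytes), set count (256) and the strides are made-up values
for illustration only, not figures from any particular machine:

```python
def set_index(addr, line_size=32, num_sets=256):
    """Set an address maps to: the index bits sit just above the line offset."""
    return (addr // line_size) % num_sets


def sets_touched(base, stride, count, line_size=32, num_sets=256):
    """Distinct sets hit by `count` accesses starting at `base`, `stride` bytes apart."""
    return {set_index(base + i * stride, line_size, num_sets) for i in range(count)}


# stride of one line: the accesses spread across every set
good = sets_touched(0, 32, 1024)
# stride of two lines: only the even-numbered sets are ever used
half = sets_touched(0, 64, 1024)
# stride equal to num_sets * line_size (8 KB): every access lands in one set
bad = sets_touched(0, 8192, 1024)

print(len(good), len(half), len(bad))  # 256 128 1
```

With the 8 KB stride, a 2-way or 4-way set only has 2 or 4 lines to hold
all 1024 references, regardless of how large the cache is overall.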

assuming a moderate amount of cache ... then there are some trade-offs
with increase in cache miss (assuming short of saturating memory bus)
for hardware multi-thread ... and processor-stall masking. However, it
would seem at this point in the curve, weak-memory and out-of-order
execution provides better trade-off in processor-stall masking and any
cache size increase requirements.

one of the DEC people mentioned that optimizing for alpha cache starts
to look like some of the techniques from the late 60s & early 70s for
virtual memory optimization.

in late '60s & early '70s i could beat global LRU with full
competition for all pages for all slots, super fast path replacement,
fetch & context switch as well as dynamic adaptive based on behavior,
latency, and saturation for selecting number and which mix to run
together (and later on an adaptive switch between local/global LRU on
per thread basis).

given that you were on the right point in the curve ... 4 programs
could run faster in shared "4*X" than individually in fixed "X" each
(i.e. the argument from the late 60s with denning and global/local lru
... if i remember right at sigops circa '79 somebody had rediscovered
global clock for his phd thesis & it was being held up because
somebody apparently thot that the local/global LRU argument still
raged).

In any case, on the wrong side of the curve (& w/o dynamic adaptive
controls) things could get very pathological in terms of mean-time
between miss and bus saturation.

Hypothetical performance question

From: lynn@netcom19.netcom.com
Date: Sat, 10 Feb 1996 02:17:11 GMT
Subject: Re: Hypothetical performance question
Newsgroups: comp.arch

cache protection can help since LRU algorithms aren't necessarily good
for everything. i did some 2-way SMP kernel support (quite a while
ago) where some workloads ran better than twice as fast as uni
... because the effective MIP rate of both processors increased
(because of decreased cache miss rate).  The scenario involved some
intelligent application processor affinity ... which because of
non-shared cache resulted in cache affinity.

this is the counter argument to global LRU outperforming local LRU
(i.e. partitioning the environment and forcing replacements from
cache lines specific to an application).

circumstances were even more remarkable because the SMP hardware at
the time slowed the uniprocessor cycle time by 15% to allow for
cross-cache invalidation signals ... i.e. the improved cache hit
rate, in order to be better than twice as fast, had to also compensate
for the 15% reduced cycle time.

360/370

From: Lynn Wheeler <lynn@garlic.com>
Date: 1996/02/11
Subject: 360/370
Newsgroups: alt.folklore.computers

i had access to the 3rd or 4th 4341 built ... and
had a friend in the 4341 product group in endicott who
couldn't get access to one for running performance and
evaluation tests ... so I would run tests out on the
west coast for them on their own product.

360/370

From: Lynn Wheeler <lynn@garlic.com>
Date: 1996/02/15
Subject: 360/370
Newsgroups: alt.folklore.computers

... except for i/o 360m67 with 768kbyte memory would
be somewhere between 8086pc and 286pc.

360 "channels" and "multiplexers"?

From: lynn@garlic.com
Subject: Re: 360 "channels" and "multiplexers"?
Newsgroups: alt.folklore.computers
Date: 09 Mar 1996 10:40:23 -0800

360 typically had 1-6 selector channels and 1 multiplexor channel.

in some sense channel was independent processing unit which executed
"channel programs" which was this funny language. basically each
channel had the equivalent of multiple "instruction streams" that it
would hardware multiplex thru a single instruction stream decoder.

the multiplexor channel could support up to 256 of these simultaneous
instruction streams ... but was limited to attachment of relatively
slow speed devices (i.e. data transfer typically <1mbit/sec).

selector channels supported higher speed devices (multi mbyte/sec data
transfer) but were limited in the number of concurrent programs that
they could support ... with a further restriction that only one I/O
program at a time could have an active data transfer section.

there were other funny rules about how the channels shared the memory
bus with each other and with the processor ... and various processor
functions.

when i was an undergraduate there were four of us that put together
a hardware 360 control unit (i think there is a write-up someplace
blaming us for having originated the ibm oem control unit business).
the 360 that we worked with was a m65 which had a 13.? microsecond
timer in real memory ... basically every 13 microseconds the real
storage timer location "tic'ed" requiring a storage update. One of
our first hardware bugs was that we raised bus-in on the channel
to be able to do data transfer to memory ... and then processor
died. turns out that if the memory bus was held for two timer tics
(i.e. timer hardware was held off updating memory for the first
timer tic ... and then a 2nd timer tic occurred still w/o updating
memory) the processor would figure it was in a hardware error condition
and stop.

--

John Hartmann's Birthday Party

From: lynn@garlic.com
Subject: John Hartmann's Birthday Party
Date: 6 Mar 1996

attached was posted to
http://vm.marist.edu/~piper/party/jph-12.html#wheeler

Lynn Wheeler

lynn@garlic.com

I've reproduced John's abstract from an advanced technology conference
I held on March 4th and 5th, 1982 at San Jose Research (I think it was
the first adtech conference since the one held in the mid-70s in POK
where we presented a 16-way processor design ... and the 801 group
presented 801 & CPr ... i.e. precursor to RIOS & PC/RT).

Also on the agenda was Cary Campbell talking about DataHub (a pc
fileserver; Cary was traveling to Provo nearly once a week to work
with a group of students on parts of the implementation ... which
eventually spawned a new network PC company in Provo).

Bob Selinger gave a talk about 925 work station. When the original Sun
people went to IBM Palo Alto Science Center to ask IBM about building
the machine ... PASC brought in the 925 group, the Boca PC people and
a YKT workstation group to review the proposal. All three groups
claimed to be doing something better and so IBM declined (not) to
produce the Sun workstation.

Barry Goldstein gave a talk on running CMS applications under MVS.

Rip Parmelee and Peter Capak gave talks on VM/370 support for Unix as
well as the TSS/370 UNIX PRPQ for Bell Labs.

The CFP and the agenda were not IBM Confidential ... but some of the
other presentations were.

I have fond memories of John taking us on a sight-seeing tour of the
(Hamlet's) castle at Helsinger(?)  and ferry ride across to Malmo (and
the castle on the other side).

======================================================================

From the CFP ..

                                 TOPICS

     High level system programming language
     Software development tools
     Distributed software development
     Migration of CP functions to virtual address spaces
     Migration to non-370 architectures
     370 simulators
     Dedicated, end-user system

A possible project which would utilize extensions in all the
aforementioned areas is a relatively inexpensive, relatively fast non-370
CPU. A VM kernel (many CP functions having been migrated to virtual
address spaces) is coded in a high level system programming
language. The kernel will initially be compiled into 370 code and
executed using the 370 simulator. Eventually the kernel (and possibly
some of the virtual address space code) will be recompiled into the
native machine language and execute alongside the 370 simulator
(providing both native mode and 370 virtual machines).

Although a definite pilot project is envisioned, nearly all work will
be beneficial to all current VM/370 environments.

=======================================================================

                            THE TOY PROGRAM

by: John Hartmann

ABSTRACT

The TOY Program for receiving and processing messages was originally
run in a reserved storage area like the Yorktown nucleus extensions,
but the instability of the CMS system in use led to a desire to have
the TOY Program run "outside" CMS. Also, some kind of full screen
support was felt to be desirable.

The objectives set for the current TOY Program were:

     Investigate the use of CMS as a slave operating system for
     applications running mostly in a stand alone mode like RSCS and PVM.

     Investigate the implementation of shared segments in CP.

     Investigate the possibility of restarting CMS after an error without
     disturbing information outside VMSIZE.

     Investigate the feasibility of using storage protection for such an
     operating system where the use of key zero is restricted to handling
     CMS data areas.

     Create an environment for programs where the components of a large
     package can be plugged in and out dynamically, so that software
     changes can be applied to "outlying" components without requiring
     a re-initialization of the whole system.

     To present RSCS conversational messages on a full screen and keep
     conversations separate.

     To be able to log such conversations on a disk file.

     Make the 1052 line mode console support more usable by allowing the
     user to delete lines from the virtual console listing selectively
     instead of the all-or-nothing-at-all approach used by CP.

The objectives have been met, and the TOY Program has proven to
increase the productivity of my terminal sessions considerably.

360 "channels" and "multiplexers"?

From: lynn@garlic.com
Subject: Re: 360 "channels" and "multiplexers"?
Newsgroups: alt.folklore.computers
Date: 10 Mar 1996 08:54:18 -0800

byte mux & selector basically transferred one byte per bus in/out.

selector could start a channel program ...  and when it received
"channel end" status ... it could start another channel program, the
first one may have completed the data transfer portion ... but
still could be active (i.e. channel program running in the UCW,
basically the equivalent of an i-stream in a multi-thread chip
design).

byte mux could have several concurrent channel programs w/o the selector
data transfer restriction ... i.e. multiples doing active data transfer

block mux introduced the concept of channel program
disconnect/reconnect for active channel programs ... supporting
multiple concurrent channel programs with pseudo active data transfer.

It also introduced the concept of RPS-miss for disk technology ...
i.e. when it was introduced in the early '70s ... the disk drives
didn't have buffers ... the disconnect/reconnect was used to disconnect
the disk from the channel until specific record was in position for
transfer and at that point attempt to reconnect. if the channel
was busy at the reconnect ... there was a RPS-miss and the disk
had to rotate a full revolution and try again.
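
To get a feel for what an RPS-miss costs, here is a back-of-envelope
sketch. It assumes (my simplification, not something stated above) that
every reconnect attempt independently finds the channel busy with the
same probability, so the number of misses is geometric and each miss
costs one full revolution. The 3600 rpm figure is likewise just an
illustrative assumption:

```python
def expected_rps_delay_ms(p_busy, rpm=3600):
    """Expected extra latency from RPS-misses, in milliseconds.

    Simplified model: each reconnect attempt finds the channel busy with
    independent probability p_busy, and every miss costs one revolution.
    """
    if not 0 <= p_busy < 1:
        raise ValueError("p_busy must be in [0, 1)")
    revolution_ms = 60_000 / rpm
    expected_misses = p_busy / (1 - p_busy)  # mean of the geometric distribution
    return revolution_ms * expected_misses


for p in (0.1, 0.3, 0.5):
    print(p, round(expected_rps_delay_ms(p), 2))
```

Under this model the added delay grows faster than linearly with channel
busy, reaching a full extra revolution on average at 50% busy.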

around the '82 time-frame when I wrote the paper that system
performance of disk technology had slowed down by a factor of 5-10
over a 15 year period ... there were a lot of people that got hot under
the collar. I had just done a rough approx. based on processor and
memory increasing by a factor of 50 in thruput during the period while
disk only increased by a factor of 10 or less (i.e. the relative
system thruput therefore had declined by a factor of at least five
since the disk performance curve didn't track the processor/memory
curve). When they did a more detailed analysis ... including RPS-miss
in the equation ... it indicated that relative system thruput of disks had
declined by an even larger factor. In the '60s there were system
configurations that had E/B ratios with multiple bytes transferred for
every instruction executed. Now E/B ratios tend to be discussed in
instructions per bit ... not instruction per bytes.
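
The rough approximation above is just a ratio of growth factors. A
minimal sketch of the arithmetic; the factor-of-5 disk case is a
hypothetical value picked to stand for "10 or less", not a measured
number:

```python
def relative_decline(cpu_gain, disk_gain):
    """How far disk thruput fell behind, relative to processor/memory thruput."""
    return cpu_gain / disk_gain


# processor/memory up by 50, disk up by 10: relative decline of 5
print(relative_decline(50, 10))  # 5.0
# if disk only improved by 5 (hypothetical), the decline is 10
print(relative_decline(50, 5))   # 10.0
```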

data streaming was introduced initially because the mip rate of the
i/o processors declined ... data streaming allowed for eight bytes
to be transferred per bus in/out ... rather than a single byte
... effectively the data path got wider ... but the control path
slowed down (needed only 1/8th the i/o processor path length for every
byte transferred). for latency sensitive operations it was actually
possible to demonstrate the reduction in thruput (even tho the
data rate went up).

later on data streaming was able to show additional benefits for
things like extended-length (escon) channels where signal propagation
was becoming a factor (as opposed to processing latency ... i.e. not
required to do a synchronous end-to-end bus in/out operation for every
byte transferred)

--
Lynn & Anne Wheeler

Why Do Mainframes Exist ???

From: lynn@garlic.com
Subject: Re: Why Do Mainframes Exist ???
Newsgroups: comp.arch
Date: 25 Mar 1996 09:11:11 -0800

in the commercial mainframe environment there is a lot of 7x24
procedural and automation infrastructure that has somewhat grown up
thru trial and error over the past 30 years ... only some of it
represented by hardware technology.

there is also a fundamental difference in a basically "interactive"
environment design point (where default tends to be having a person
handle situations) and "batch" environment design point (where default
tends to be automated applications handling situations and the
corresponding instrumented infrastructure to support it).

a trivial example of infrastructure/design-point is what happens when
a sort program runs out of temp space ... current os/mvs (as well as
os/mft circa 1968) generates a specific error code for just about
every condition ... and frequently there is an automated
infrastructure that can handle each of the return codes and can take
corrective/recovery action.

for how many of the newer operating systems is it possible to create
a batch-procedure that based on sort utility return code ... recognize
out of space condition and take automated corrective action?

there are some number of current commercial situations where the cost
of a daily application failing once a year to not complete on schedule
exceeds the cost of the hardware. more common is where the delta
people costs that are involved in attempting to utilize an
"interactive" operating system for a "batch" environment exceeds the
cost of the hardware.

--
Lynn & Anne Wheeler

Why Do Mainframes Exist ???

From: lynn@garlic.com
Subject: Re: Why Do Mainframes Exist ???
Date: 1996/03/26
Newsgroups: comp.arch

not shared memory ... these are sorts that not only exceed virtual
memory but also potentially disk ... and/or in some situations
multi-volume tape. little things like corporate payrolls,
stock-exchange settlement, etc.

how 'bout when /var/tmp fills up ... sort fails ... and then you try
and backtrack to find out why.

currently mvs, sms and some other stuff intercepts the space fill
condition ... temporarily suspends things while space can be
re-arranged so that enuf is available .. and then resume operation for
completion (pro-active prevention). MVS JCL has the option of
pre-allocating space ... but extents are possible. sophisticated sort
applications can do their own JFCB operations ... but when all
available space on all available disks fill up ... what else is there
to do?

none of the primitive stuff of mapping files to specific filesystems
(and frequently to specific disk drives).

i don't see anything wrong with translating batch-paradigm processing
to distributed commodity priced hardware ... at some abstraction
hardware is orthogonal to the system-paradigm issue. however,
frequently the case is commodity priced hardware operated with systems
that started out with interactive design-point. there is significant
infrastructure difference between interactive and batch design-point.

infrastructure extends past system structure issues. for instance, is
by default your system configured and operated for disaster recovery
...  i.e. if the bldg. containing your system collapsed tonight
... would all applications be running on schedule tomorrow (including
current/up-to-date data ... i.e. not is it possible to provide for
such a contingency, but as a matter of default operation,
disaster/recovery is provided for).

if you are interested in shared memory ... SCI & various dictionary
cache consistency. also there is work that looks more like the
precursor proposal before SCI went thru the standards process
(i.e. joke about starting with race horse requirements and producing a
camel).

--
Lynn & Anne Wheeler

cics

From: lynn@garlic.com
Subject: cics
Date: 1996/03/31
Newsgroups: alt.folklore.computers

university i was at ... was beta-test for prerelease cics product in
fall of '69. first bug i remember shooting in cics was bdam open ...
code as shipped only supported a specific set of bdam ... and we were
using something different (developing library project on grant from
navy research).

in some sense, cics was similar to watfor ... it wasn't that
facilities weren't there to do the job ... but the overhead was too
expensive, large part of cics was lightweight thread multitasker
... with lightweight file i/o (open files at start-up ... and pretty
much leave them open). result was significant reduction in processing
overhead.

about the same time i also put 2741 & tty support into HASP ... in order
to get interactive job support.

there were also four of us about the same time that were credited with
building 2702-replacement control unit (blamed for originating the ibm
oem control unit business). part of the reason was that 2702 didn't
work as originally designed. among other things we implemented dynamic
type recognition (at least 2741, 1052, & tty) as well as dynamic speed
recognition (old hat these days with modern modems).

we had also been running cp/67 since jan. of 1968 for interactive.

--
Lynn & Anne Wheeler

Caches, (Random and LRU strategies)

From: lynn@garlic.com
Subject: Re: Caches, (Random and LRU strategies)
Date: 1996/04/26
Newsgroups: comp.arch

i did clock in '68/'69 ... but in '71 came up with version of >1bit
(i.e.  instead of clear does a shift ... assuming shift right, then
hardware always sets the left most bit, clock does a shift right one
bit ... with left most bit being zero'ed; only take entry if all bits
are zero) and two-handed clock that would beat straight LRU.
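
A minimal sketch of the >1bit idea as i read the description above, not
the original implementation. The class and method names are made up for
illustration, the "hardware" setting of the left most bit is simulated in
software on each reference, and checking for all-zero before the shift
(rather than after) is my assumption. The two-handed variant is not
shown:

```python
class MultiBitClock:
    """Clock replacement keeping a small reference-history field per frame."""

    def __init__(self, nframes, bits=4):
        self.top = 1 << (bits - 1)     # the left most bit of the history field
        self.pages = [None] * nframes  # page currently held in each frame
        self.hist = [0] * nframes      # reference history per frame
        self.where = {}                # page -> frame index
        self.hand = 0                  # clock hand position

    def reference(self, page):
        """Touch a page; return True on a hit, False on a miss."""
        frame = self.where.get(page)
        if frame is not None:
            self.hist[frame] |= self.top  # stands in for the hardware reference bit
            return True
        frame = self._select_victim()
        old = self.pages[frame]
        if old is not None:
            del self.where[old]
        self.pages[frame] = page
        self.where[page] = frame
        self.hist[frame] = self.top
        return False

    def _select_victim(self):
        """Sweep the hand; take the first frame whose history is all zero."""
        while True:
            frame = self.hand
            self.hand = (self.hand + 1) % len(self.pages)
            if self.hist[frame] == 0:
                return frame
            self.hist[frame] >>= 1  # shift right, zero enters on the left


clock = MultiBitClock(nframes=2, bits=2)
print([clock.reference(p) for p in "abacb"])  # [False, False, True, False, True]
```

With bits=1 the shift clears the single bit, which reduces to the
standard one-bit clock.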

standard clock tended to be 10-15% worse than LRU ... variation tended
to be 10-15% better than LRU. detailed analysis was that it tended
to approximate LRU ... when locality was good ... but tended to
approximate random ... when locality was poor ... or when patterns
started to look like MRU (i.e. program requirements slightly
larger than cache size and straight LRU would constantly replace the
block that was going to be needed next, whereas random would seldom
be selecting block needed next).
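
The MRU-like case can be seen in a small simulation: a loop over 9
blocks against a cache of 8, comparing strict LRU with random
replacement. The sizes, the repeat count and the seed are arbitrary
choices for illustration:

```python
import random
from collections import OrderedDict


def lru_hits(refs, size):
    """Count hits for strict LRU replacement."""
    cache = OrderedDict()
    hits = 0
    for r in refs:
        if r in cache:
            hits += 1
            cache.move_to_end(r)           # mark as most recently used
        else:
            if len(cache) >= size:
                cache.popitem(last=False)  # evict the least recently used
            cache[r] = True
    return hits


def random_hits(refs, size, seed=0):
    """Count hits when the victim slot is chosen (pseudo) randomly."""
    rng = random.Random(seed)
    slots = []
    resident = set()
    hits = 0
    for r in refs:
        if r in resident:
            hits += 1
            continue
        if len(slots) >= size:
            i = rng.randrange(size)
            resident.discard(slots[i])
            slots[i] = r
        else:
            slots.append(r)
        resident.add(r)
    return hits


refs = list(range(9)) * 100  # loop slightly larger than the cache
print("LRU hits:   ", lru_hits(refs, 8))
print("random hits:", random_hits(refs, 8))
```

LRU gets zero hits on this pattern since it always evicts the block that
is needed next; random replacement keeps most of the loop resident.
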
--
Lynn & Anne Wheeler

Caches, (Random and LRU strategies)

From: lynn@garlic.com
Subject: Re: Caches, (Random and LRU strategies)
Date: 1996/04/26
Newsgroups: comp.arch

(pseudo) random solves nasty problems with pathological behavior when
programs don't operate according to assumed behavior (i.e.  LRU
algorithms are done based on implied assumption that the things
referenced most recently will have the highest probability of being
referenced in the near future).

presumably predictability refers to some series of events running
at the microscopic level ... generalized program behavior across
wide-range of applications tends to have more uniform macroscopic
behavior with random (although not necessarily optimal).

--
Lynn & Anne Wheeler

IBM song

From: lynn@garlic.com (Lynn Wheeler)
Subject: Re: IBM song
Date: 1996/06/15
Newsgroups: alt.folklore.computers

'31 book had 45 pages and 106 songs ... but i like the hasp songbook
better.

denver '7? share .. i believe one of the people that did the song bony
fingers ... was relative of one of the jes committee members and he
got her to do a parody version about jes ... although it isn't listed in
my version of the book. Thursday nights at share used to be especially
hard, trying to help the head of share not have to lug any cases home
on the plane the next day.

(I guess I shouldn't say too much bad about jes ... I worked on HASP
in school; implementing crje support in HASP-III/mvt18 with CMS edit
syntax and drivers for 2741, 1052, and tty ... borrowed from my work
on cp/67 ... but my wife worked in the jes2 group for awhile in
g'burg).

If I had a HASP, I'd be SPOOL.in' in the morning,
I'd be SPOOL.in' in the evening, all over this land.
I'd SPOOL all the SYSIN, I'd SPOOL all the SYSOUT,
I'd SPOOL the jobs between
   the remotes and the local
Aaa-aahh, all over this land.

--
Lynn & Anne Wheeler

PC reliability

From: lynn@garlic.com (Lynn Wheeler)
Subject: PC reliability
Date: 1996/07/08
Newsgroups: alt.folklore.computers

in fact it looks as if pc reliability may be heading in the opposite
direction; instead of having PC parity memory (9bits, 1parity bit for
every 8data bits, can detect one bit errors) ... direction for pcs is
things like EDO (can't detect any errors).
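
For reference, a small sketch of what the 9th bit buys: one parity bit
over 8 data bits catches any single flipped bit but is blind to two
flips in the same byte. Even parity is assumed here, and the byte value
and flipped bit positions are arbitrary:

```python
def parity_bit(byte):
    """Even-parity bit for 8 data bits (1 if the count of one-bits is odd)."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte):
    """Return the 9-bit 'memory word' as (data, parity)."""
    return (byte & 0xFF, parity_bit(byte))


def check(word):
    """True if the stored parity still matches the data."""
    data, parity = word
    return parity_bit(data) == parity


word = store(0b10110010)
one_flip = (word[0] ^ 0b00000100, word[1])   # single bit error
two_flips = (word[0] ^ 0b00010100, word[1])  # double bit error

print(check(word), check(one_flip), check(two_flips))  # True False True
```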

workstations now frequently have ECC (10bit, 2 error correcting bits
for every eight data, detect all two bit errors, correct any single bit
errors)

mainframes have things like 80bit ecc (16bit correcting for 64bit data,
same ratio as 10bit ecc, but can detect combinations of 16bit errors
in 64bit data, can correct 15bit errors).

--
Lynn & Anne Wheeler

mainframe tcp/ip

From: lynn@garlic.com (Lynn Wheeler)
Subject: mainframe tcp/ip
Date: 1996/07/11
Newsgroups: alt.folklore.computers

i wrote hyperchannel ibm mainframe drivers in '81 allowing the IMS
group to relocate several hundred people and all their local channel
devices several miles from the center.  interesting thing was that
remoting 327x local channel attach controllers ... improved overall
system thruput by 15% (mostly by reducing channel busy; w/o degrading
response).

i also did rfc1044 for tcp/ip product ... while rest of the product
limped along at 44kbyte/sec ... was doing transfer between 4381
and cray at channel speeds.

--
Lynn & Anne Wheeler

tcp/ip

From: lynn@garlic.com (Lynn Wheeler)
Subject: tcp/ip
Date: 1996/07/12
Newsgroups: alt.folklore.computers

more recently ... late '80s ... it took me several months to get the
engineer that did 6000 version of escon (sla) to stop working on
800mbit version and come to the FCS meetings. however we got sidetracked
as the attached indicates

also for index to rfcs (i.e. rfc1044 reference) see

http://www.garlic.com/~lynn/rfcietf.htm

.... a posting i did last year to comp.arch.storage

grump ....

a large number of the 9333 systems were for ha/cmp and we heavily backed
the project. however we were also doing cluster scaleup using fcs.

during san fran usenix, jan. 1992, Hester, my wife, and I had a
meeting with Ellison, Baker, Shaw, and Puri in Ellison's conference
room. We proposed having 16-way fcs pilot clusters in customer shops
with parallel oracle by summer of 1992 ... upgraded to 128-way by
ye92.

unfortunately the kingston group were out trolling for technology and
found cluster scaleup the very next week. in something like 10
weeks, the project was transferred to kingston, announced as a
supercomputer, and we were instructed to not work on anything
involving more than 4 processors.

in the elephant's dance to do the supercomputer subset of cluster
scaleup ... the device interconnect strategy got obliterated.  so
instead of 9334->interoperable family(1/8 fcs on serial copper, 1/4
speed fcs on fiber, & full speed fcs on fiber) ... in the resulting
confusion, 9334->ssa.

while ssa is quite good technology (especially compared to scsi),
the interoperable family strategy is better.

--
Lynn & Anne Wheeler

middle layer

From: lynn@quake.garlic.com (Anne & Lynn Wheeler)
Subject: middle layer
Date: 1996/07/14
Newsgroups: comp.infosystems.www.authoring.cgi,comp.infosystems.www.servers.unix,comp.databases,comp.client-Server

i have a foil presentation laying around that i put together in dec
'88 describing what i called 'middle layer' ... which i presented to a
number of companies. large focus was operation, management, and
coordination of distributed environment i.e. legacy systems didn't
know how to ...  and the desktops didn't either. departmental servers
started to fit into middle layer ... but didn't have the ops c&c
characteristics.

presentation included aggregation ...  lots of LAN description with
peer-to-peer operation seemed to also imply symmetrical traffic ...  by
definition lots of concentration points exhibit asymmetrical traffic
flows (i.e.  concentration point sees the aggregate bandwidth of the
desktops).

operation of concentration points imply requirements that start to
look a lot like 7x24 (even if coordination achieves availability via
replication ... which also needs to be managed and operated). there
were also thruput efficiencies where dedicated boxes provided
concentration/batching for backends (especially legacy backends).

one of the most interesting (violent) reactions was that single t/r
segment should support unlimited large number of clients ... since
presumably no single client required more than t/r worth of bandwidth.
one marketing group wanted my head after presenting such heresy to a
particularly large electronics company. because of niche in the
technology & market cycle ... concentrators and short e-net segments
with aggregate bandwidth delivery hundreds of times single-segment was
less expensive than single-segment t/r (with no concentrators).

concentrators then provided focal point for added-value
function/feature delivery opportunities.

strong reactions regarding middle layer even being responsible for
functions and/or various types of added-value features and protocol
translations (disintermediation?). other aspect was that the desktops
were on generation & deployment cycles significantly shorter than
backend/legacy. desktop software delivery/maint. still wasn't solved
... so middle layer provided a compromise between new feature delivery
and support overhead.

--
Anne & Lynn Wheeler          | lynn@garlic.com, lynn@netcom.com
                             | finger lynn@garlic.com for public key

middle layer

From: lynn@garlic.com (Lynn Wheeler)
Subject: middle layer
Date: 1996/07/14
Newsgroups: comp.infosystems.www.authoring.cgi,comp.infosystems.www.servers.unix,comp.databases,comp.client-Server

oh yes, some of the characteristics of the economic niche in '88
creating middle layer:

... 10baset enet cards were going for $300 or less, station-to-station
burst was running at 95+% media ... free-for-all was running at 85+%
media for clusters of 20-40 stations.

... 16mbit t/r cards with effective aggregate media thruput of 50%-60%
media were going for $1000

i had done the tcp/ip mainframe product support for rfc1044 ... and
was getting channel thruput at the mainframe ... while the product
using other means was around <50mbytes/sec thruput (>order magnitude
difference).

For reasonable corporate configuration of >300 stations ... could put
in multiple small 10baseT configurations ... necessary backbone
routers with direct channel attach to the mainframe and dedicated enet
&/or fddi to middle layer servers ... in addition to some of the middle
layer servers for the same cost as single segment 16mbit t/r (adapter
card cost spread at 300 was >$210,000)

small enet configs operating at 9mbit/sec thruput (aggregated by
backbone routers with direct channel attach and dedicated server lan
attach, avg bandwidth per station >300kbit/sec). or 300 stations
sharing single t/r operating at 9mbit/sec thruput (avg. bandwidth per
station about 30kbit/sec).

cost per avg. bit/sec to the desktop then was also order of magnitudes
difference.
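
a rough check of the arithmetic above, as a sketch only. the card prices, station count and 9mbit/sec figures are the ones quoted in the post; the 30 stations per enet segment is an assumption (midpoint of the 20-40 station clusters mentioned), and only adapter card cost is counted, not routers or servers.

```python
# Figures quoted in the post.
ENET_CARD_USD = 300        # 10baseT card ("$300 or less")
TR_CARD_USD = 1000         # 16mbit t/r card
STATIONS = 300
ENET_SEGMENT_KBIT = 9000   # thruput of one small enet config
TR_SEGMENT_KBIT = 9000     # thruput of the single shared t/r segment

# Assumption (not in the post): 30 stations per enet segment,
# the midpoint of the 20-40 station clusters.
STATIONS_PER_ENET_SEGMENT = 30

# Adapter card cost spread across all stations.
card_cost_spread = (TR_CARD_USD - ENET_CARD_USD) * STATIONS

# Average bandwidth per station in kbit/sec.
enet_kbit_per_station = ENET_SEGMENT_KBIT / STATIONS_PER_ENET_SEGMENT
tr_kbit_per_station = TR_SEGMENT_KBIT / STATIONS

# Card cost per delivered kbit/sec to the desktop (cards only).
enet_usd_per_kbit = ENET_CARD_USD / enet_kbit_per_station
tr_usd_per_kbit = TR_CARD_USD / tr_kbit_per_station

print("card cost spread (USD):", card_cost_spread)
print("kbit/sec per station, enet vs t/r:",
      enet_kbit_per_station, tr_kbit_per_station)
print("USD per kbit/sec, enet vs t/r:",
      enet_usd_per_kbit, round(tr_usd_per_kbit, 2))
```

with smaller clusters (say 20 stations per segment) the enet per-station figure goes up, which would be consistent with the ">300kbit/sec" in the post.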

feature/function/ops management, maintenance, etc ... at the middle
layer servers was also significantly less than doing at each of the
individual desktops.

in any case, lots of corporate dollars easily justified for middle
layer infrastructure.

--
Lynn & Anne Wheeler

IBM 4381 (finger-check)

From: Lynn Wheeler 
Subject: Re: IBM 4381 (finger-check)
Newsgroups: alt.folklore.computers
Date: 24 Jul 1996 09:07:38 -0700

finger check with directory reference ... correct version is:

	ftp.netcom.com/pub/ly/lynn/
                             ^^^

i usually had early access to several models of 43xx and 370s. in the
late '70s the disk development lab and product test/assurance lab were
organized in big machine rooms with the disk & controller products in
"test cells" (limited access campus, controlled access building,
controlled access machine room, within each machine room, numerous
heavy mesh steel cages with combination locks on the doors ... each
cage was "test cell" containing product under development &/or test).

At the time attempting to operate a single "test cell" connecting to a
mainframe running a standard product operating system ... would
typically crash the operating system within 10-15 minutes. the result
was that mainframe tests were run with special "stand-alone" dedicated
testing programs ...  on a scheduled basis (one test cell at a
time). To help the situation, I did a bullet-proof rewrite of the
operating system's I/O support and failure routines ... so that 6-10
test cells could be operated concurrently running under mainframe
operating system. I also wrote up internal document describing some of
the failure modes (for which i got the 2nd line manager of RAS at
meyers corner after me ... not for fixing the problems, but
documenting that they existed).

In any case, in return for creating and supporting such a beast ... I
got quite a bit of latitude in being able to use the machines as long
as I didn't impact development and test (i.e. the "heavy" i/o testing
typically resulted in less than 5% cpu busy). The first one or two
mainframe machines produced typically went to CPU product test
... then next couple machines went to disk product test ... as a
result I frequently had much better access to the machines than some
of my cpu engineering friends back at the cpu manufacturing sites.

the downside was that whenever something wasn't working correctly
... i got called. the first weekend they replaced 3830 running 16
drives with 3880 ... i got called. They suspected that I had done some
change in the operating system that caused severe performance
degradation. Turned out to be idiosyncrasy in the 3880 which hadn't
been tested in multiple drive configuration up until then. Had to go
in and identify the 3880 internal hardware bottleneck and get it fixed
before first customer ship. spent a lot of time browbeating engineers
when they would do something that violated performance, cpu, and/or
channel architecture (self-defense ... because I would get the initial
blame supposedly as having done something to the software).

--
Anne & Lynn Wheeler   lynn@{netcom|garlic}.com finger for public key

IBM 4381 (finger-check)

From: Lynn Wheeler 
Subject: Re: IBM 4381 (finger-check)
Newsgroups: alt.folklore.computers
Date: 24 Jul 1996 09:31:38 -0700

... i also somewhat aggravated the performance situation ... when i
rewrote the mainframe i/o supervisor, i also redid the channel
balancing, channel redrive and disk redrive algorithms ... which
tended to come back and hit a controller much faster and harder
(increasing thruput, but it exacerbated this early 3880 multi-drive
bug ... which had to get fixed before first customer ship).

they also got my page-mapped filesystem ... which I had done in the
early '70s (at 545 tech sq) but never shipped externally as a
product. the up-side to the page-mapped filesystem was that it was
totally transparent to the users and applications and nobody knew
anything about it (except the performance thruput was up to 3* that of
standard filesystem). the down-side to the page-mapped filesystem was
that almost nobody knew about it (it ran on a large number of machines
... but since it was so transparent ... it apparently never occurred to
anybody to make it a product).

--
Anne & Lynn Wheeler   lynn@{netcom|garlic}.com finger for public key

1401 series emulation still running?

From: Lynn Wheeler <lynn@garlic.com>
Subject: Re: 1401 series emulation still running?
Newsgroups: comp.society.folklore
Date: 06 Sep 1996 09:12:23 -0700

there is actually this story regarding testimony at some anti-trust
trial. one of the competitors that got out of the business testified
at the trial that EVERYBODY in the computing business in the late '50s
realized that the SINGLE MOST IMPORTANT CRITERIA TO BE SUCCESSFUL in
the computer business was to have a single compatible machine
architecture across the product line. He testified that everybody
tried to pull it off ... but only IBM succeeded. The other vendors had
problems controlling local plant managers ... the manager in charge of
the low-end wanted to tweak things for their technology, the manager
in charge of the mid-range wanted to tweak things for the mid-range
technology and the manager for the high-end wanted to tweak things for
the high-end. Only ibm hdqtrs was able to control all their plant
managers and force them to implement the same, single architecture.
The hdqtrs of all the other companies failed in one way or another.

Presumably the justification was this was a period of rapid business
expansion. A business would write an application and it would become
critical to the operation of the business. As their business expanded
they needed a bigger machine to run the application. Could they simply
bring in a larger machine ... and run the same application or would
they have to wait while the application (and/or some other aspect of
the environment) was ported. The cost of machine/hardware became far
less than lost business associated with delays associated with
porting/converting software.

Note the perception of this is as important ... or more important than
the reality.

The 360 1401 and 7094 hardware emulators were a concession to the
transition from the old environment to the new.

--
Lynn & Anne Wheeler

IBM 1403 printer

From: Lynn Wheeler <lynn@garlic.com>
Subject: Re: IBM 1403 printer
Newsgroups: comp.society.folklore
Date: 18 Sep 1996 08:29:34 -0700

1403N1 ... faster (1100 lpm), super noise insulation in the cover
(louder noise because of faster operation), and mechanical cover lift
(because of the weight of the insulation)

had interesting feature that cover would automatically lift (paper
jam, etc) ...  spilling everything placed on top.

--
Lynn & Anne Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Old IBM's

From: Lynn Wheeler <lynn@garlic.com>
Subject: Re: Old IBM's
Newsgroups: alt.folklore.computers
Date: 19 Sep 1996 09:22:23 -0700

given that many s/360 were implemented in some m'code or another it is
a 360 ... except for the fact that I/O and some other supervisor
instructions weren't implemented.

there was also a bit-slice implementation that one of the labs did
that was sufficient 360 to execute FORT-H binaries.  they were placed
at data gathering stations along the accelerator for doing initial
data reduction.

--
Lynn & Anne Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Old IBM's

From: Lynn Wheeler <lynn@garlic.com>
Subject: Re: Old IBM's
Newsgroups: alt.folklore.computers
Date: 20 Sep 1996 20:35:29 -0700

the 360/370 princ. of ops ... was a (s?)gml subset/extract of the "red
book" which specified what was required for something to be 360/370
from standpoint of operating system, application programs,
software. most things (including i/o instructions) were required to be
present including specification on how they worked.  some things were
required ... but some details of how they worked were model dependent
(like diag/83 instruction).

xt/370 & at/370 executed highly modified version of vm/370 because the
implementation was non-conforming in many (supervisor instruction)
areas with regard to the red book (i.e. various processor lines had
all sorts of microprocessors as core ... so issue of m68000 core
wasn't the issue ... issue was that it was only partial
implementation).

typically the lower end of the 370 range tended to be various
m'processor cores with 10:1 instruction execution ratio between
m'processor (vertical) instruction and 370 instruction.

higher end of the 370 line tended to be horizontal (wide) instruction
machines ... where measure was cycles per instruction rather than
(vertical) m'code instructions per 370 instruction. 165 avg.  about
2.1 machine cycles per 370 instruction. one of the enhancements going
from 165->168 was avg.cycle/instruction was reduced from 2.1 to about
1.6.
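
a quick sketch of what the 2.1 -> 1.6 change is worth. the two cycles-per-instruction figures are from the post; treating the machine cycle time as the same on both models is an assumption made here for the arithmetic, and the function name is made up for illustration.

```python
CPI_165 = 2.1   # avg machine cycles per 370 instruction (from the post)
CPI_168 = 1.6   # avg machine cycles per 370 instruction (from the post)

def relative_speed(cpi_old, cpi_new, cycle_old=1.0, cycle_new=1.0):
    """Instruction-rate ratio new/old.

    Time per instruction is cycles-per-instruction times cycle time;
    the default cycle times assume both machines use the same cycle.
    """
    return (cpi_old * cycle_old) / (cpi_new * cycle_new)

speedup = relative_speed(CPI_165, CPI_168)
print(f"168 vs 165 instruction rate: {speedup:.2f}x")
```

under that assumption the cycles/instruction change alone accounts for roughly a 1.3* instruction rate improvement.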

--
Lynn & Anne Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

old manuals

From: Lynn Wheeler <lynn@garlic.com>
Subject: Re: old manuals
Newsgroups: comp.society.folklore
Date: 07 Oct 1996 08:24:18 -0700

i got into various trouble by referring to FS as the inmates in charge
of the institution ... reference to cult movie that had been playing
down the street (from 545 tech sq) in central sq for over a decade. FS
seemed to include every blue-sky concept that had ever been dreamed
up.  one thing that killed it was projection that FS machine built of
hardware faster than currently existed (370/195) would have
application thruput less than 370/145. simple hardware add instruction
could go thru five levels of memory access for each parameter to
determine what kind of object it was and how the add should be
executed (length, type, integer, float, character, etc).

course i also thot that what i had running & distributing was better
than what they were promising in the resource management chapter (8?,
does somebody remember); little things like running avg utilization
for concepts like dynamic adaptive and fair share rather than just
previous delta, total to-date, pretty fixed dispatch ordering.

one might consider John's effort on RISC in the mid-70s to be at least
partial FS-backlash

with respect to FS manuals ... they were softcopy and kept online
under special access procedures. one weekend when i had test time in
machine room where a copy was kept ... they claimed proudly that only
authorized people had access to the documents and even i couldn't get
at them. took me about five minutes at the console and demonstrated
unlimited access to everything on the machine. i suggested possibly
development of an encrypted filesystem so that even if i had physical
access to all the components ... i wouldn't be able to recover the
stuff.

as to how little things change ... this was nearly 25 years ago and
the manuals ... as well as things like the 370 "red book" (i.e. complete
370 architecture manual from which the 370 principles of ops manual was
produced as subset) ... were gml (precursor to sgml ... from which
html and lots of web stuff are derived).

how many people have been doing gml for over 25 years?

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

SGI O2 and Origin system announcements

From: Lynn Wheeler <lynn@garlic.com>
Subject: Re: SGI O2 and Origin system announcements
Newsgroups: comp.arch,comp.sys.sgi.misc
Date: 09 Oct 1996 07:45:21 -0700

does anybody know whether or not interconnect is SCI (ala convex,
sequent, dg, etc) for distributed shared memory???

2way seems to be two processors on the same board ... 64-way SCI would
yield 128 processor configuration.

convex(hp) with two processors on the same board ... & 64-way SCI
yield 128 processor configuration

sequent using intel's quad-board and 64-way SCI achieves 256 processor
config.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

System/360 Model 30

From: Lynn Wheeler <lynn@garlic.com>
Subject: Re: System/360 Model 30
Newsgroups: alt.folklore.computers
Date: 03 Nov 1996 07:45:05 +0000

some of the early 360 documents predate shipment of the actual
hardware &/or software products. there tended to be minimum memory
requirement inflation as software approached FCS (first customer
ship).

i've got a 360 document that describes the models 60 and 62 ... with
the 62 available in 1cpu, 2cpu, and 4cpu configurations. What actually
shipped were the models 65 and 67.

standard 67 only came in one and two processor configuration. the two
processor configuration had channel controller and dual-ported memory
bus. the "channel controller" was able to route all i/o bus/channels
and memory banks ... including being able to partition the machine
into two uniprocessor units. the channel controller was configured via
switches on the front panel (later two processor 370 models were
actually a step back from the 67).

i know of one custom triplex 67 built which had a much more
sophisticated channel controller ... including the ability to reset
all the configuration switches under program control.

original 67 documentation described entry level memory configuration
of 256k. As TSS got closer and closer to ship ... entry level memory
requirements grew to 512k ... and ibm had to retrofit 67s in customer
locations.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 05 Nov 1996 21:43:40 +0000

some of the "bulletproof" issues aren't the obvious RAS (reliability,
availability, serviceability) ... i.e. lots of work to eliminate bugs,
lots of work done to handle situations programmatically when there were
failures/problems, and lots of stuff put into the system for
visibility, recording, and management of problems.

one simple comparison from a cultural viewpoint is lots of
"production" NT machines running on pentium pros with EDO memory (i.e.
fake parity, no protection at all) compared to modern mainframes with
80/64 ECC (i.e. 16 bits of ecc for every eight bytes ... capable of
detecting 16 bit errors and correcting 15 bit errors).

another is some simple philosophical background. MVS heritage is 30
years or so that a program was run under batch control ... and all
aspects of its operation needed to be handled programmatically
(including things like being able to intercept b37 error condition,
effectively filespace full, and recover). In prior discussion, it was
noted that common business application is sort ... and that SV4 sort
out of the box didn't even check for write error on temp file. Some
vendors had upgraded to recognize write error ... and at least fail.

compare unix heritage, for nearly as long, has been that a program is
a user command ... i.e. if sort command failed ... the user could
figure out how to reconfigure things to get it to work.

I've looked at taking some 7x24 business critical mainframe
environments and migrating them to open systems ... and there are
still quite a few business rules (embodied in places like operational
command & control centers) that I haven't figured out how to translate
to unix or nt platforms ... just because the necessary system
infrastructure doesn't exist.

On the other hand ... when my wife and I were starting our unix high
availability cluster stuff in the late '80s (based on some experience;
having done some of the mainframe stuff 10-15 years earlier), there
were large number of people around the (open system) industry in total
opposition to the effort ... who, somewhat surprisingly, now are some
of the biggest proponents.

MVS isn't perfect. I was involved in a somewhat atypical installation
in the late 70s and early 80s where the standard MVS product (if run)
would crash & burn regularly ... typically within 15 minutes of being
brought up ... because of various types of I/O errors and failures
(things that normal installations might see less than once a year). I
did a bullet proof IOS rewrite for this environment which eliminated
all the failures ... hard, as well as various soft failures involving
loops and resource starvations.

Another scenario is in '81 I did redesign and re-implementation of the
HYPERChannel remote device support. As part of that redo ... I chose
to reflect various sorts of transmit failures, involving channel
programs loaded into the A51x remote device adapters, as channel
checks. Nearly nine years later, I got a call from some IBM quality
control expert. There is this industry service that collects mainframe
erep (error recording) logs from customer locations and produces
industry reports regarding various production machine operational RAS
characteristics. The (then) current generation of IBM mainframes had
reduced channel error rates to well under 10**-20. As a result nearly
all of the reported channel check errors ... across a large percentage
of all operational production mainframes in existence ... were not
real ... but were being generated by various installations running
HYPERChannel remote device support (some for "channel extenders" at
the end of telco lines ... which had BER no better than 10**-9). After
some review ... I determined that the transmission error condition
could be reflected as interface control check (i.e. IFCC instead of
CC) ... because the same exact software recovery operations were
executed for IFCC and CC errors. The advantage is that IFCC errors are
reported differently than CC errors in the industry RAS reports.

To put this in some perspective ... imagine a vendor of large
production unix systems has a customer base well in excess of several
thousand machines ... and that every error for all customers was being
captured and reported by an industry standard RAS organization
(i.e. things like every SCSI bus error). Imagine the vendor becoming
quite concerned when it finds that there were a total of ten such
(SCSI bus) errors reported across all the machines in the whole
customer base in a period of 12 months ... when they believed there
should be no more than one.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 06 Nov 1996 08:40:11 +0000

... oh yes, somewhat implicit in my previous post was the concept of
instrumentation and measurement. to get out of the art stage and at
least into the engineering stage (not even necessarily science)
... requires instrumentation and measurement. RAS (reliability,
availability, serviceability) "engineering" requires not just the
pieces that implement it ... but also the instrumentation and
reporting structure that support it.

the hypothetical RAS case of all large production unix systems (fast
cpu, 100+ mbytes of memory, at least a dozen disk drives, etc)
includes extensive measurement and reporting structure that is not
only used at the respective data centers ... but reports are fed into
national service that can categorize RAS profiles for all such systems
(somewhat analogous would be frequency of repair for all trucks in the
US broken down by types of failure) ... with lots of people really
concerned about things like whether or not even a single SCSI bus
error occurred on any of the systems anytime in the past 12 months.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 09 Nov 1996 16:00:56 +0000

there seems to be a variety of apps on TPF res systems. I looked at
some of them a couple years ago. One of them was "routes" which
accounted for approximately 25 percent of total processing in a
complex.  I redesigned with a paradigm shift and got about a factor of
1000* thruput improvement ... and then implemented the top 10
(impossible) things on the wish list (which required about factor of
100* ... cutting aggregate thruput improvement back to about 10*
... one item was eliminating any carrier bias). It was a query mostly
app ... so relatively straightforward to move to unix clusters (turns
out to be cultural problem, more difficult than technical issues).

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

interdata and perkin/elmer

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: interdata and perkin/elmer
Newsgroups: comp.society.folklore
Date: 10 Nov 1996 15:47:46 +0000

does anybody remember the interdata & perkin/elmer lineage???

when I was an undergraduate ... I had rewritten much of the mainframe
driver software for 2702 ... in theory using the SAD command and some
hacks ... to reconfigure which line-scanner was associated with which
port. Objective was to allow various terminals (tty/ascii, 2741,
1052s, etc) to dial into common modem subpool ... and dynamically
recognize the incoming terminal type. During tests ... everything
seemed to work. Finally, one day the IBM hardware engineer explained
to me that it really wouldn't work reliably because ... while it was
possible to specify under program control which line scanner was
associated with which port/line ... that 2702 implementation took some
short cuts and hard wired specific oscillators to each line (fixing
the baud rate on each line).

In reaction to the problem we started a project to build a 2702
replacement (that could do dynamic baud detection as well as dynamic
terminal type identification). We started with Interdata 3 and built
our own hard wired channel attachment card. Somewhere I believe there
is an article blaming four of us for originating the IBM OEM
plug-compatible controller business. The project eventually grew into
an Interdata4 with multiple embedded Interdata3 processors. About that
time, I graduated and went on to other things.

I still remember two problems debugging the interface. One was
watching the mainframe "red-light" ... because the 65/67 was still
locked out on the 2nd timer tic from updating location 80 timer (can't
hold bus-in continuously on the channel for two timer tics ... because
it also locks up the memory bus). The other problem was when it was
identified that ascii/tty bits were going into the mainframe
"backwards" (actually the 2702 tty line-scanner was loading leading
bits into lower-order bit ... before transmitting to mainframe memory
... effectively reversing bit order). To be "plug-compatible" we had
to also reverse ascii bit sequence in each byte before sending to
mainframe memory.
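
the per-byte bit reversal described above can be sketched as follows. this is only an illustration of the operation, not the original controller code; the function names are made up here.

```python
def reverse_bits(byte: int) -> int:
    """Return the 8-bit value with its bit order reversed."""
    if not 0 <= byte <= 0xFF:
        raise ValueError("expected a value in 0..255")
    result = 0
    for _ in range(8):
        # Shift the lowest remaining input bit onto the low end of the
        # result, so the first bit taken ends up as the highest bit.
        result = (result << 1) | (byte & 1)
        byte >>= 1
    return result


def reverse_each_byte(data: bytes) -> bytes:
    """Apply the bit reversal to every byte of a buffer."""
    return bytes(reverse_bits(b) for b in data)


# Reversing twice gives back the original bytes.
original = b"tty"
assert reverse_each_byte(reverse_each_byte(original)) == original
```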

I believe I've seen some writeup that the first non-DEC Unix port was
to an Interdata 7/32(?).

Also at some point Perkin-Elmer bought up Interdata. In any case, I
ran into somebody a couple weeks ago that said that he sold large
number of Perkin-Elmer boxes in the early 80s with wire-wrapped
mainframe channel attached boards ... he implied that the wire-wrapped
board possibly had changed little from our original.

In any case, there still seem to be shops still running perkin-elmer
boxes as terminal/line controllers.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 11 Nov 1996 08:58:19 +0000

for routes and fares ... it would be possible to distribute workload
across all machines in cluster ... then outage only represents
capacity issue not availability.

harder issue becomes the other 98 (out of 100) issues involved in
operating production, commercial system ... say command&control
operational center. what automated processes are in place to make
sure the daily updates for routes and fares are guaranteed to run
every time and on schedule?

one example from similar thread here last march is with sort. typical
business critical mainframe shop will have procedures that intercept
temp-space full condition for specific applications ... reconfigure
space and then successfully run application to completion. lots of
business apps rely on sort. it was noted that 5.4 sort out of the box
doesn't even check for write-error so if /tmp fills, output is
truncated w/o error indication (also kernel error reflected isn't even
specific to condition ... but is lumped into generic catch all
... assuming sort is even modified to check for an error).
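
as an illustration of the kind of check being discussed (not the code of any actual sort command), a minimal sketch of writing a temp run so that a space-full condition is noticed and retried elsewhere instead of silently truncating output. the function name, the list of fallback directories and the injectable opener are all hypothetical.

```python
import errno
import os


def spill_run(lines, temp_dirs, name="sort.run", opener=open):
    """Write a run of lines to the first temp directory that has room.

    Returns the path written. A space-full error moves on to the next
    directory; any other write error is raised to the caller instead
    of being ignored. `lines` should be a list so it can be rewritten.
    """
    for directory in temp_dirs:
        path = os.path.join(directory, name)
        try:
            with opener(path, "w") as out:
                for line in lines:
                    out.write(line + "\n")
                out.flush()  # push buffered data so a full disk shows up here
            return path
        except OSError as err:
            if err.errno != errno.ENOSPC:
                raise
            # Space full: try the next directory. A production procedure
            # would also clean up the partial file left behind.
    raise OSError(errno.ENOSPC, "no temp directory had enough space")
```

the point is only that the space-full case is distinguished from other errors and acted on; what "reconfigure space" means in a given shop is an operational decision outside the program.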

Catching space full can be one out of 1000 business rules defined for
business critical application and most of them could require custom
hack to translate to unix (code that isn't part of the application
... but part of the operation of the application).

it is not that i don't believe in unix clusters ... my wife and I
originated ha/cmp as something of skunk-works and despite various
opposition saw it thru to first customer ship. ha/cmp typically refers
to no-single-point-of-failure.

lots of the mainframe isn't science but lore. it is the thousands of
outages that have occurred over the past 30 years ...  some of which
were weird combinations of multiple failures ... which a solution was
then devised for. there are some large business-critical complex apps
that have hundreds of thousands of business rules that have evolved
over the past 30 years.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 11 Nov 1996 10:19:59 +0000

... another aspect of business critical apps is failure mode analysis.

sometime last year ... i did a review of a relatively modest unix
business critical application for operation on tcp/ip and the
internet. it had been thru development, test, stress-test and q&a.

we put together a failure mode matrix involving system components,
networking components, internet components, isp, firewalls, routers,
client components, etc.  the application was analysed from the
standpoint did it handle &/or recover from each specific failure
mode. in the case where it didn't handle and/or recover ... was there
at least sufficient information being logged to diagnose what the failure
was. first pass thru the majority of items weren't even diagnosable.
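
one way such a failure mode matrix might be recorded and summarized, as a sketch. the entries below are hypothetical and only for illustration; the actual matrix from the review is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    component: str
    failure: str
    handled: bool       # application handles and/or recovers from it
    diagnosable: bool   # enough is logged to tell what the failure was


def review(matrix):
    """Split failure modes into handled, diagnosable-only, undiagnosable."""
    handled = [m for m in matrix if m.handled]
    diagnosable = [m for m in matrix if not m.handled and m.diagnosable]
    undiagnosable = [m for m in matrix
                     if not m.handled and not m.diagnosable]
    return handled, diagnosable, undiagnosable


# Hypothetical entries for illustration only.
matrix = [
    FailureMode("router", "link down", handled=True, diagnosable=True),
    FailureMode("firewall", "connection dropped",
                handled=False, diagnosable=True),
    FailureMode("isp", "name lookup time-out",
                handled=False, diagnosable=False),
]

handled, diagnosable, undiagnosable = review(matrix)
for mode in undiagnosable:
    print(f"not diagnosable: {mode.component} / {mode.failure}")
```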

the original effort was somewhat the expected straight-forward quality
application development. to support business critical involved about
four times as much (more) effort ... as went into the initial
straight-forward quality effort.

a number of mainframe applications have acquired the additional 4*
effort (or more) ... not necessarily thru fore-sight but thru
evolution. many times the people currently involved might not even
be able to catalogue everything.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 11 Nov 1996 10:35:18 +0000

That is a rather small "production unix system" nowadays. The large
ones (from Sun at least) have tens of CPUs (up to 30), many Gigabytes
of memory (up to 30GB) and many hundreds to thousands of disk drives.
The "interesting" installations in the last year or so have been
multi-terabyte. Sun's in-house maximum config tests are for tens of
terabytes. Over the last few years Sun has learned (the hard way)
what you have to do to make this size of system run well, and what
kind of RAS features and service are needed.

i recognize it as being small ... but there are mainframe systems that
small also. point was to establish lower bound (not average or upper)
on all system configurations that were consistently monitored and
reviewed.

one fundamental issue is culture & people that are focused on even
things as minor as all the soft failures that have no observable
external impact. even in cases where the software monitoring exists
... is it being extensively reviewed. are there morning departmental
meetings to cover all operational characteristics that occurred in the
last 24 hrs. are there weekly meetings. do all the installations
provide information for ranking across the industry.

make it slightly simpler, for the top ten solstice installations, what
is the total number of scsi bus errors that occurred at each
installation in the past 12 months (including distribution). how does
that compare to the top ten sgi installations and the top ten hp
installations.

for large oracle clusters ... are they raw devices or filesystems?  is
there a difference in the way errors are logged and reported between raw
devices, filesystems, and hardware outboard controllers that provide
error masking with raid? are all errors (kernel, dbms, outboard
controllers, etc) ... recorded consistently in the same place?

after having done ha/cmp for unix ... a lot of the gotchas are the
nits and the details ... not the high-level stuff.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 11 Nov 1996 10:49:07 +0000

a "gotcha" detail ... for ha/cmp i wanted to do ip-address take-over
to mask various types of system, network, adapter, etc. failures.

the server software worked correctly, the tcp/ip architecture works
correctly, most environments worked correctly ... until ...

there is this function in tcp/ip called ARP (address resolution
protocol) ... for detailed reference look at
	http://www.garlic.com/~lynn/rfcietf.htm
in the term index.

Most systems implement an ARP cache. The ARP cache maintains the
most recently resolved IP->mac addresses. The ARP cache architecture
has well defined rules that entries in the ARP cache time out after
a specified period. It is also possible to issue an ARP clear command
to wipe everything from the ARP cache.

The bottom line is that in theory, client-side tcp/ip support should
allow servers to do ip-address take-over with a different LAN card
... and the clients will discover the new mapping between the
ip-address and the mac address ... loading it into the ARP cache.

The problem was that the 4.3 tahoe and reno tcp/ip source used by a
large number of vendors had a glitch. The tcp/ip routine that called
the ARP routine to provide the IP->mac mapping ... kept a one-deep
most recently used mapping in its hip-pocket. This was a performance
enhancement ... but also wasn't subject to the ARP cache time-out
rules and/or the ARP flush command.

There are some environments where effectively all of a client's tcp/ip
activity is with a single server ... using the same tcp/ip address.
The problem is that in such an environment ... the ip-address
take-over event will never be noticed by the client ... since it never
will get into a situation that invalidates the hip-pocket mapping.
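
The effect of the hip-pocket mapping can be sketched with a toy model. This is only an illustration in Python, not the tahoe/reno source: the class names (ArpCache, IpOutput), the 60-second timeout, and the addresses are all made up for the example, and the "LAN" is just a dictionary standing in for an ARP request.

```python
class ArpCache:
    """Toy ARP cache: entries expire after `ttl` seconds and can be flushed."""

    def __init__(self, resolver, ttl=60.0):
        self.resolver = resolver      # callable ip -> mac (stands in for an ARP request)
        self.ttl = ttl
        self.entries = {}             # ip -> (mac, time_resolved)

    def lookup(self, ip, now):
        hit = self.entries.get(ip)
        if hit is not None and now - hit[1] < self.ttl:
            return hit[0]             # still fresh, use the cached mapping
        mac = self.resolver(ip)       # expired or missing: resolve again
        self.entries[ip] = (mac, now)
        return mac

    def flush(self):
        self.entries.clear()


class IpOutput:
    """Caller of the ARP cache that keeps a one-deep 'hip-pocket' mapping."""

    def __init__(self, arp_cache, use_hip_pocket=True):
        self.arp = arp_cache
        self.use_hip_pocket = use_hip_pocket
        self.last = None              # (ip, mac) of the most recent send

    def mac_for(self, ip, now):
        if self.use_hip_pocket and self.last is not None and self.last[0] == ip:
            return self.last[1]       # never consults the cache, so never expires
        mac = self.arp.lookup(ip, now)
        self.last = (ip, mac)
        return mac


def demo(use_hip_pocket):
    lan = {"10.0.0.1": "mac-A"}       # hypothetical server address and LAN card
    cache = ArpCache(lan.__getitem__, ttl=60.0)
    out = IpOutput(cache, use_hip_pocket)
    before = out.mac_for("10.0.0.1", now=0.0)
    lan["10.0.0.1"] = "mac-B"         # ip-address take-over onto a different LAN card
    cache.flush()                     # an explicit ARP flush ...
    after = out.mac_for("10.0.0.1", now=1000.0)  # ... and a long-expired timeout
    return before, after


print(demo(use_hip_pocket=True))      # client keeps sending to the old card
print(demo(use_hip_pocket=False))     # client picks up the new mapping
```

With the hip-pocket enabled and only one destination address in use, neither the time-out nor the flush is ever reached, which is the single-server case described above. Traffic to any other address would replace the one-deep entry and hide the problem.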

...

Another example is trying to put up ha/cmp like servers on
the internet with multiple diverse routings into the internet backbone
(including diverse routing to different central exchanges, different
cable points of entry into the building etc) ... and then banging my
head on a brick wall trying to get all the browser vendors to support
multiple A-records (i.e. the same DNS name mapping to multiple ip
addresses).
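
What multiple A-record support amounts to on the client side can be sketched roughly as below: resolve the name to all of its addresses and try each in turn instead of giving up after the first. This is a minimal Python sketch under stated assumptions. The function names (resolve_all, connect_any) are hypothetical, the demo uses a fake connect routine, and the addresses come from the reserved documentation ranges rather than any real server.

```python
import socket


def resolve_all(host, port):
    """Return every IPv4 (address, port) the resolver gives for host, in order."""
    infos = socket.getaddrinfo(host, port, family=socket.AF_INET,
                               type=socket.SOCK_STREAM)
    addresses = []
    for _family, _socktype, _proto, _canonname, sockaddr in infos:
        if sockaddr not in addresses:
            addresses.append(sockaddr)
    return addresses


def connect_any(addresses, connect):
    """Try each address in turn; return (address, connection) for the first success."""
    errors = []
    for addr in addresses:
        try:
            return addr, connect(addr)
        except OSError as exc:        # this route/server is down, try the next one
            errors.append((addr, exc))
    raise OSError(f"all {len(addresses)} addresses failed: {errors}")


def fake_connect(addr):
    """Stand-in for a real connect: pretend the first routing is down."""
    if addr[0] == "192.0.2.1":
        raise ConnectionRefusedError("route down")
    return f"connected to {addr[0]}"


addrs = [("192.0.2.1", 80), ("198.51.100.1", 80)]   # two A-records, one name
print(connect_any(addrs, fake_connect))
```

Against a real name the same loop could be driven with connect_any(resolve_all(name, 80), socket.create_connection). A client that only ever uses the first address returned gets no benefit from the diverse routing.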

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix (and TPF)

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix (and TPF)
Newsgroups: comp.arch
Date: 12 Nov 1996 21:51:03 +0000

i don't know all the details ... but it seems that different parts of
at&t ran different things for different purposes.  around '74 a
somewhat standard mainframe system was provided to at&t longlines that
included about 30k lines of kernel code modifications and enhancements
that I had generated.

that was pretty much the last i heard ... until 10 years later when
the local salesman tracked me down ... to say that longlines was still
running the same kernel (had migrated to newer models of mainframes as
they came out over the years). concern was that finally after 10 years
... the evolutionary changes of the mainframe models had progressed to
where it was not possible to run the '74 kernel on the next generation
of mainframes coming out (i.e. mainframe binary application program
compatibility has been preserved over the years ... but the hardware
interfaces that require kernel support didn't stay
completely compatible over multiple hardware generations).

fortunately the 30k modifications did include some advanced dynamic
adaptive resource and workload management ... so it was able to adapt
to the 20-fold increase in resource capacity and workload that occurred
during the period.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix (and TPF)

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix (and TPF)
Newsgroups: comp.arch
Date: 13 Nov 1996 06:58:21 +0000

the last time we looked at ss7 ... for at least 800-number support, it
was a non-unix redundant hardware box. problem was the availability
criteria of 5 minutes of downtime per year ... and doing any kernel
maintenance at all tended to blow 20 to 30 years' worth of availability
budget.
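
The arithmetic behind that budget is simple enough to write down. A small Python sketch, with hypothetical function names and a 365-day year assumed: a 5 minute per year criterion is roughly "five nines", and a 20 to 30 year overrun corresponds to a single outage of about 100 to 150 minutes.

```python
MINUTES_PER_YEAR = 365 * 24 * 60      # 525600, ignoring leap years


def availability(downtime_minutes_per_year):
    """Fraction of the year the service is up for a given yearly downtime."""
    return 1.0 - downtime_minutes_per_year / MINUTES_PER_YEAR


def years_of_budget(outage_minutes, budget_minutes_per_year=5.0):
    """How many years of downtime budget one outage uses up."""
    return outage_minutes / budget_minutes_per_year


print(f"{availability(5.0):.7f}")     # availability implied by 5 minutes/year
print(years_of_budget(100.0))         # outage length that costs 20 years of budget
print(years_of_budget(150.0))         # outage length that costs 30 years of budget
```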

my wife and I proposed ha/cmp cluster. some response was that with a
little work it would be possible to create clusters of the current
system (i.e. always smop ... small matter of programming).

then it became an issue of price for availability.

given that the ss7 defined availability as being able to get the
800-number mapping back from either of the redundant T1 lines
... there was effectively no difference in availability between a
unix cluster solution and a cluster of redundant hardware systems (the
probability that all boxes in either cluster would be down at the same
time was statistically the same ... given the measurement criteria).
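
One way to read that claim, assuming independent failures, is sketched below in Python. The per-box unavailability figures are purely hypothetical (not measurements of any real system) and the function names are made up. The point is only that when the criterion is "an answer comes back from either side", a pair of ordinary boxes and a pair of redundant-hardware boxes can both land inside the same 5 minute per year target.

```python
def all_down(per_box_unavailability, boxes):
    """Probability that every box is down at once, assuming independent failures."""
    return per_box_unavailability ** boxes


def boxes_needed(per_box_unavailability, target):
    """Smallest cluster size whose all-down probability is within the target."""
    if not 0.0 <= per_box_unavailability < 1.0 or target <= 0.0:
        raise ValueError("need 0 <= unavailability < 1 and target > 0")
    n = 1
    while all_down(per_box_unavailability, n) > target:
        n += 1
    return n


target = 5.0 / (365 * 24 * 60)        # 5 minutes per year as a fraction of the year

# hypothetical per-box unavailability figures, for illustration only
for label, u in [("redundant hardware box", 1e-4), ("unix box", 1e-3)]:
    print(label, boxes_needed(u, target), all_down(u, 2))
```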

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

interdata & perkin/elmer machines

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: interdata & perkin/elmer machines
Newsgroups: alt.folklore.computers
Date: 13 Nov 1996 07:19:34 +0000

does anybody remember the interdata & perkin/elmer lineage???

when I was an undergraduate ... I had rewritten much of the mainframe
driver software for 2702 ... in theory using the SAD command and some
hacks ... to reconfigure which line-scanner was associated with which
port. Objective was to allow various terminals (tty/ascii, 2741,
1052s, etc) to dial into common modem subpool ... and dynamically
recognize the incoming terminal type. During tests ... everything
seemed to work. Finally, one day the IBM hardware engineer explained
to me that it really wouldn't work reliably because ... while it was
possible to specify under program control which line scanner was
associated with which port/line ... the 2702 implementation took some
short cuts and hard wired specific oscillators to each line (fixing
the baud rate on each line).

In reaction to the problem we started a project to build a 2702
replacement (that could do dynamic baud detection as well as dynamic
terminal type identification). We started with an Interdata 3 and built
our own wire-wrap channel attachment card. Somewhere I believe there
is an article blaming four of us for originating the IBM OEM
plug-compatible controller business. The project eventually grew into
an Interdata 4 with multiple embedded Interdata 3 processors. About that
time, I graduated and went on to other things.

I still remember two problems debugging the interface. One was
watching the mainframe "red-light" ... because the 65/67 was still
locked out on the 2nd timer tic from updating location 80 timer (can't
hold bus-in continuously on the channel for two timer tics ... because
it also locks up the memory bus). The other problem was when it was
identified that ascii/tty bits were going into the mainframe
"backwards" (actually the 2702 tty line-scanner was loading leading
bits into lower-order bit ... before transmitting to mainframe memory
... effectively reversing bit order). To be "plug-compatible" we had
to also reverse ascii bit sequence in each byte before sending to
mainframe memory.
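
Reversing the bit sequence in each byte can be sketched as below. This is a Python illustration of the operation only (the original was done in the controller, not in anything like this), and the function and table names are made up for the example.

```python
def reverse_bits(byte):
    """Return the 8-bit value with its bit order reversed (bit 0 <-> bit 7, etc)."""
    result = 0
    for _ in range(8):
        result = (result << 1) | (byte & 1)   # move the current low bit to the top end
        byte >>= 1
    return result


# precomputed 256-entry table so a whole buffer can be converted in one pass
REVERSE_TABLE = bytes(reverse_bits(b) for b in range(256))


def reverse_each_byte(data):
    """Reverse the bit order inside every byte of a bytes object."""
    return data.translate(REVERSE_TABLE)


print(hex(reverse_bits(0x41)))        # ascii 'A' = 01000001 -> 10000010
print(reverse_each_byte(reverse_each_byte(b"tty")))   # reversing twice restores the data
```

A lookup table is the usual choice here since the mapping is fixed and only 256 entries long.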

I believe I've seen some writeup that the first non-DEC Unix port was
to an Interdata 7/32(? ... something like 10 years later).

Also at some point Perkin-Elmer bought up Interdata. In any case, I
ran into somebody a couple weeks ago that said that he sold a large
number of Perkin-Elmer boxes in the early 80s with wire-wrapped
mainframe channel attached boards ... he implied that the wire-wrapped
board possibly had changed little from our original.

In any case, there still seem to be shops running perkin-elmer
boxes as terminal/line controllers.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 16 Nov 1996 08:24:32 +0000

there is minor mention at
http://vm.marist.edu/~piper/party/jph-12.html#wheeler

also check unix system implementation for system/370, bell labs tech
journal v63n8p2 p1751, oct '84.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

Mainframes & Unix

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: Mainframes & Unix
Newsgroups: comp.arch
Date: 17 Nov 1996 16:07:32 +0000

mainframe handler of terminal devices ... from a post I made recently
to alt.folklore.computers. ... I was corrected as first "unix" port
was to 8/32 (not 7/32 as per attached).

the "problem" was/is that mainframe "channel" interface is half-duplex
... although in the past have used pairs of sub-channel addresses for
dual-simplex simulating full-duplex.

does anybody remember the interdata & perkin/elmer lineage???

when I was an undergraduate ... I had rewritten much of the mainframe
driver software for 2702 ... in theory using the SAD command and some
hacks ... to reconfigure which line-scanner was associated with which
port. Objective was to allow various terminals (tty/ascii, 2741,
1052s, etc) to dial into common modem subpool ... and dynamically
recognize the incoming terminal type. During tests ... everything
seemed to work. Finally, one day the IBM hardware engineer explained
to me that it really wouldn't work reliably because ... while it was
possible to specify under program control which line scanner was
associated with which port/line ... the 2702 implementation took some
short cuts and hard wired specific oscillators to each line (fixing
the baud rate on each line).

In reaction to the problem we started a project to build a 2702
replacement (that could do dynamic baud detection as well as dynamic
terminal type identification). We started with an Interdata 3 and built
our own wire-wrap channel attachment card. Somewhere I believe there
is an article blaming four of us for originating the IBM OEM
plug-compatible controller business. The project eventually grew into
an Interdata 4 with multiple embedded Interdata 3 processors. About that
time, I graduated and went on to other things.

I still remember two problems debugging the interface. One was
watching the mainframe "red-light" ... because the 65/67 was still
locked out on the 2nd timer tic from updating location 80 timer (can't
hold bus-in continuously on the channel for two timer tics ... because
it also locks up the memory bus). The other problem was when it was
identified that ascii/tty bits were going into the mainframe
"backwards" (actually the 2702 tty line-scanner was loading leading
bits into lower-order bit ... before transmitting to mainframe memory
... effectively reversing bit order). To be "plug-compatible" we had
to also reverse ascii bit sequence in each byte before sending to
mainframe memory.

I believe I've seen some writeup that the first non-DEC Unix port was
to an Interdata 7/32(? ... something like 10 years later).

Also at some point Perkin-Elmer bought up Interdata. In any case, I
ran into somebody a couple weeks ago that said that he sold a large
number of Perkin-Elmer boxes in the early 80s with wire-wrapped
mainframe channel attached boards ... he implied that the wire-wrapped
board possibly had changed little from our original.

In any case, there still seem to be shops running perkin-elmer
boxes as terminal/line controllers.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

what happened to the 286?

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: what happened to the 286?
Newsgroups: alt.folklore.computers
Date: 28 Nov 1996 13:17:14 +0000

the bottom dropped out from under memory chip prices after the 386
machines hit the market (and drove 286 machines off the scene; there
was a huge stockpile of 286 clones from pacific rim countries getting
ready for fall buying season ... and the 386 just blew things to
pieces ... there was something like a couple month period where 286
clone prices dropped from around 800 to under 300).

i had a 286 system configured with 6mbyte ... but I don't believe that
was very common ... given memory prices (typically <10-20% system
thruput increase for double system cost).

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key

IBM 4361 CPU technology

From: Anne & Lynn Wheeler <lynn@garlic.com>
Subject: Re: IBM 4361 CPU technology
Newsgroups: comp.arch
Date: 08 Dec 1996 09:12:39 +0000

... somewhat related; ibm service procedure requires the ability to
field bootstrap diagnose failing systems. 15 years or so ago, larger
machines got so complex that it was no longer possible to meet field
bootstrap diagnose requirement ... so a "service" processor was
created. Various types of probes were built into the machine connected
to the service processor; it was possible to field bootstrap diagnose
the service processor ... and then use the service processor to
diagnose the mainframe complex.  Early service processors eventually
graduated to 4331s ... and then to a pair of replicated 4361s (embedded
inside the larger mainframe). The 4331 and 4361 service processors ran a
modified version of VM/370 release 6 using ios3270 to drive the
service panels.

--
Anne & Lynn Wheeler    |  lynn@garlic.com, lynn@netcom.com
                       |  finger for pgp key