BigEd · June 23, 2015 18:22
diff --git a/TomasuloSpeaks b/TomasuloSpeaks
 Transcript of the talk at https://www.youtube.com/watch?v=S6weTM1tNzQ

 Thank you very much for that kind welcome!

 What I intend to do with my time today
 is divide it into two pieces
 not necessarily equal.

 The first piece will be a very cursory
 examination of 20, 30 years of computer design
 from the model 91 on forward to when it finally got replaced
 as an out-of-order execution machine

 and then I'll spend the rest of my
 time answering your questions.

 By the way interrupt me any time you have a
 question or a comment that you think is appropriate.

 So, to come way ahead, to roughly 1990 or thereabouts
 IBM finally brought out an out of order machine
 and this was such a shocking thing that
 they felt it necessary to offer
 an explanation or two as to why it
 took them 30 years to do this

 and they did

 and the main explanation which was
 quite a good one
 was that once you have a cache
 with a very rapid access to memory
 (not counting misses of course which we'll discuss a little bit later)
 and you don't have to worry about floating point
 and you don't care about long execution
 instructions like floating point then there's really no need
 for what the model 91 offered
 and so naturally they got rid of it.

 But as time marched on it became apparent that
 they were going to need something.

 Now, IBM moves in mysterious ways - 
 they probably move a lot slower than
 I would have liked or other people would
 have liked but ultimately they get it done.

 So they brought out this machine
 and they provided a reasonable rationale
 which the only flaw in it was that as time went on
 (because we're talking now about a 30 year span)
 it became less and less applicable
 to think that way because memories kept getting slower
 as they always have
 I mean you can think of it as kind of a Golden Age
 of the cache
 that we had this period where you could get away with
 two cycle access to main memory
 and Gee, if you missed the main memory you had like
 five cycle access to whatever backed it up
 instead of, in today's machines,
 you've got 70 cycle access to whatever backed it up.
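 [Illustrative sketch, not part of the talk: the cache argument above can be put
 as simple arithmetic. It reads the "two cycle" figure as the hit time and the
 5 and 70 cycle figures as miss penalties. The 5% miss rate is an assumption,
 since the talk gives none, and the function name is hypothetical.]

```python
def average_access_cycles(hit_cycles: float, miss_rate: float,
                          miss_penalty_cycles: float) -> float:
    """Average cycles per access: hit time plus the expected cost of a miss."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Assumption: the talk quotes cycle counts but no miss rate; 5% is a placeholder.
ASSUMED_MISS_RATE = 0.05

golden_age = average_access_cycles(2, ASSUMED_MISS_RATE, 5)   # 5-cycle backing store
today = average_access_cycles(2, ASSUMED_MISS_RATE, 70)       # 70-cycle backing store

print(f"golden age: {golden_age:.2f} cycles, today: {today:.2f} cycles")
```

 [Under that assumed miss rate the average goes from 2.25 to 5.5 cycles, so the
 miss penalty comes to dominate, which is the point being made about the
 rationale wearing out over time.]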

 5:14
 so
 after... let me get straight what I want to say...
 I'm going to jump around somewhat...
 So with all these caveats we didn't do an out of order machine
 and maybe we could have a little bit sooner
 than otherwise.

 Now we jump back to the Model 91 timeframe
 and why didn't they do an OOO execution machine then?
 Well, they did, they did. It was a machine - I forget
 what the nomenclature was now - it was a full OOO
 machine. It was completely logically designed
 it wasn't physically designed. So it's hard facts.

 And, it featured some very nice innovations.
 Including a branch prediction table which was
 a new, a relatively new thing.

 What else did they have? They had something in there specially
 for RAS purposes, it's slipped my mind, but no-one cares
 about RAS anyway. Although we should.

 So, this machine was carried through to everything
 except final physical design. It was good as gold, it
 was completely debugged and conformed to the
 architecture and all that other good stuff.

 It was deemed that it was too expensive
 which it probably was.
 By the way, it had a 7ns cycle, which for
 this period of time was pretty good. Pretty good.
 Contemporary IBM machines were like 40, 50ns
 cycle at that time.

 7:36
 So, now we have nothing. We don't have the model 91 any more
 and we didn't bring out a successor machine.
 So in a sense nothing happened, from that point of view, for machine
 design, for some 30-some odd years.  Which was sad, I think
 but that's OK.

 > Any questions?

 ... just leapfrog to the rest of my talk. Or rather, the rest of your talk.

 I don't have to do too much talking - as you see I don't have too much to
 do because IBM made it too easy, right?  They made one - I wouldn't
 say it was a halfhearted attempt - they made a good attempt, to make
 a really good machine, but it wasn't on the cards, and that was it.

 Then 30 years go by, and then it's in the stars, and they can start over
 again.

 So that's really the main thing I wanted to cover.  Obviously I can cover
 more things if you want, obviously machine design didn't stand still.
 During this time there were improvements in RAS, and all kinds of things.
 But for our purposes I didn't think I'd want to devote too much
 time to those things.

 9:40
 > Now onto questions.
 [The questions are difficult to hear clearly enough to transcribe]

 [Roughly when was this OOO design done?]
 My memory isn't that good... early 70s, 72.

 [Something about instructions]
 Yes, in a limited way. What it did was classify the instructions
 into fixed point, floating point, and decimal.  And one instruction
 in each class could be executed along with an instruction
 from another class.
 So it didn't rely really heavily on OOO, because you have to
 remember that it's still true that, with the cache, a lot of the
 gain of OOO evaporates. So there's not much point pursuing
 it just for the sake of pursuing it.
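 [Illustrative sketch, not part of the talk: one possible reading of the answer
 above is that at most one instruction per class (fixed point, floating point,
 decimal) can be in an issue group, with the program order kept. The grouping
 rule, the function name and the class labels are assumptions made for
 illustration, not the design of the actual machine.]

```python
from typing import List

# Hypothetical labels for the three instruction classes named in the answer.
CLASSES = {"fixed", "float", "decimal"}


def issue_groups(stream: List[str]) -> List[List[str]]:
    """Greedily pack an in-order instruction stream into issue groups.

    A group is closed as soon as the next instruction belongs to a class
    that is already present in the group.
    """
    groups: List[List[str]] = []
    current: List[str] = []
    for cls in stream:
        if cls not in CLASSES:
            raise ValueError(f"unknown instruction class: {cls}")
        if cls in current:
            # Class already used in this group: start a new group.
            groups.append(current)
            current = []
        current.append(cls)
    if current:
        groups.append(current)
    return groups


for group in issue_groups(["fixed", "float", "float", "decimal", "fixed", "fixed"]):
    print(group)
```

 [A stream that alternates between classes packs into few groups, while a run
 of instructions from one class issues one at a time.]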

 [How large a team of people]
 Well, the model 91 was special in the sense that they did
 a whole new technology, they did a whole new design
 automation system, and they did a whole new machine.
 So they had a lot of change on their plates.

 [How many people?]
 The model 91, because it was developing the design
 automation system and the software and everything, it had a lot of
 people, altogether.  It didn't have that many, if you focus in on
 the designers, people like I was, there might have been
 twenty, maybe, order of magnitude.
 [laughter]
 Why is order of magnitude funny?  You think I'm trying to
 hide a hundred?  I wouldn't do that!

 [Most trouble in 91?]
 We had trouble until we discovered the OOO algorithms,
 that cleared up one source of trouble.
 The whole machine was stretching.
 It was a strange machine because it had
 really pitiful memory access. This was before
 caches, so it had like 10 cycles, minimum
 memory access.  Now of course it had 16-way interleaved
 memory, so you're not necessarily waiting 10 cycles
 for every single access, but it's
 pretty pitiful.
 So memory was a big bottleneck for that machine.
 When they finally brought out... part of the 90 line...
 they may have brought out a high speed version.
 I seem to remember they made two versions
 of thin-film memory machine, which was very fast
 memory. Unfortunately it couldn't have the huge
 number of megabytes you can get with conventional
 memories.
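 [Illustrative sketch, not part of the talk: a toy model of the 16-way
 interleaved memory with a 10 cycle access mentioned above. It assumes
 low-order interleaving (bank = address mod 16), one new request per cycle and
 requests issued in order. Those details and the function name are assumptions,
 not a description of the real Model 91 storage.]

```python
from typing import Iterable


def total_cycles(addresses: Iterable[int], banks: int = 16,
                 access_cycles: int = 10) -> int:
    """Cycle at which the last access completes in a simple banked memory."""
    bank_free_at = [0] * banks  # cycle at which each bank can take a new request
    cycle = 0                   # earliest cycle the next request may be issued
    finish = 0
    for addr in addresses:
        bank = addr % banks                       # assumed low-order interleaving
        start = max(cycle, bank_free_at[bank])    # wait if the bank is still busy
        bank_free_at[bank] = start + access_cycles
        finish = max(finish, start + access_cycles)
        cycle = start + 1                         # one request per cycle, in order
    return finish


sequential = total_cycles(range(32))            # consecutive words, spread over banks
same_bank = total_cycles(range(0, 512, 16))     # 32 accesses that all hit bank 0

print(f"sequential: {sequential} cycles, same bank: {same_bank} cycles")
```

 [In this model 32 sequential accesses finish in 41 cycles, while 32 accesses to
 one bank take 320, which is the sense in which interleaving hides some of the
 10 cycle latency but not all of it.]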

 [You went on to work with STC on one of the first microprocessor based...
 how did those servers differ from PCs]
 As is often the case, a big company like IBM, they're not
 necessarily first out of the gate with new things. It may take them
 a while, especially if they don't have competition.
 So part of what happened is due to that kind
 of phenomenon.
 But I don't know if the rest is due to that.
 16:22
 Understand, this machine, the first one we did
 as STC - the only one we did as STC - was
 not supposed to be a high performance machine.
 It was supposed to be below the IBM machines.
 Because it was supposed to serve as a server, or a
 lead-in to the IBM machine.
 Reality intruded. The first thing that happened, the
 technology was 2x slower than we thought it was.
 But fortunately it was also 2x faster than we thought
 it was, so we recouped back.
 But we were still in the hole, and we end up
 pulling some pretty sophisticated tricks
 and you don't want to do that
 you're dealing with a group - not all - of neophytes
 and you don't want to be tackling complicated
 things if you can avoid it.
 And that was, partially, what did them in.
 It was too complicated, for them.

 17:51
 [Something about marketing as the PC?]
 Oh, Yeah! God love marketing people, I wanted to strangle them.
 [laughter]
 we were going for three years on this project,
 sweating bullets, to try and wring out
 every last bit of performance
 that these guys, the marketing people, were telling us
 "we absolutely need that performance, we're going to
 put you in a certain environment, you're not going to
 sell that many machines"
 and what happens, is that
 "oh no, we don't really have to be in that environment
 we want a cheaper machine that doesn't go as fast"
 Sheesh, I wanted to strangle them, all of them.
 It happens.

 18:43
 BTW this is not the end of the programming, it's the
 second part where you get to ask questions.

 [question about moving on from the system/360 to consulting
 how did you find your job had changed]
 Oddly enough, I don't think, that much.
 Except that it was a mistake, striving for too much
 performance, was bad. We got ourselves a lot
 of grief from doing that.
 Because, you know, the architecture of the machine
 is semi-stable. It changes, you have to upgrade with
 the times, but it is semi-stable.

 20:14
 [what was the original motivation for the OOO architecture
 and how come it took so long to be used in production.
 Why was the idea ahead of its time]
 The short answer, we've already discussed it. One, there was
 a machine, a successor, which had got scrapped. Without that
 machine, and with the advances in cache, there was no point,
 really, in out of order execution. At least not until the 80s.
 You can argue about at what point it might have made
 sense.
 [Given that, why try to do OOO]
 At the time, we were young
 [laughter]
 young and bold and we wanted to go for everything we can
 get.  And if we hadn't the idea, we would have built a
 perfectly good machine which in the best case
 might be 20 or 30% faster thanks to the floating point
 guy. Because he speeded up the floating point.
 So it wasn't that big a deal. But it was a coup.
 It was something Seymour Cray didn't have.
 No-one had it. So IBM could get some bragging rights
 out of the whole thing.

 [Back to 60s-70s what was it like to convince designers
 and architects that OOO was good]
 Pretty smart team!  They didn't take any convincing.
 I had the idea on a weekend, went in Monday morning
 sketched out the bulk of the important parts of the idea.
 They were thorough, they made sure they covered things,
 they made sure there were no serious quibbles.
 And then we were off and running.
 Now, you know that you can do more with OOO
 than we did. And in particular one thing which is very
 important for the operating system is to be able to do
 loads out of order. So that you can stack up loads which
 might be delayed for whatever reason, and then maybe
 get some other instruction through.  Because that's the
 main ... once you get rid of floating point - let's say long
 instructions - then that's basically it.  Even the OOO that
 I was just talking about is not really a barn-burner kind of thing.
 It's good to have. Like all of these things, you get to build
 on them.  We didn't get to build on them for like 30 years
 but you get to build on them.
 And out of the closet they come and you find you can do
 something you couldn't do before.

 25:00
 [What kind of design automation did you have]
 Ha ha, none!
 [laughter]
 We were the first machine in IBM to simulate the logic of
 the machine. And we could simulate 1000 gates at a time.
 [laughter]
 It's pitiful! I mean, the model 91 isn't by any means a huge
 machine but it's like 40 or 50 thousand gates.
 So that's nothing.
 And this was something that developed towards the end of
 the project.
 Like I say, I give loads of credit, we had smart people.
 Including this one guy and gal who were really DA people
 and they were pushing what they could do, to improve
 the performance of the machine.

 26:28
 [Debugging efforts and kinds of problems/bugs, and the process
 to fix]
 Yes, And an interesting sideline to that, we had this programmer
 who eventually wrote a simulation of the machine.
 And he discovered two bugs, in the machine - I think it had to do
 with fetching - because in those days we had a pretty sophisticated
 fetching algorithm, where you start out with nothing and you try and
 catch a loop. So we really didn't have much of anything in the way
 of debugging. It was coming, but not for us. Which is too bad.
 [Followup, how to debug the machine]
 Well, that's somewhat of an art. Especially in those days.
 28:38
 We had an interesting little experience. We were plagued by something
 called the 'cracked stripe' 
 which none of you probably ever heard of. But it was a fault in the
 technology
 due to the extremely high current density that they had pushed the
 circuits to, such that the wind, the electron wind, going through these very
 fine circuits were blowing the atoms away. And you would get faults.
 You would get open circuits.
 So that was an interesting problem that we had to deal with.
 [How to find that this was going on? How to deal with it?]
 The 'cracked stripe' was special. We were experiencing one
 failure every day. Now, most of you don't have experience debugging
 a machine, but you can't debug a complex machine if you have one failure
 a day.  There are just too many things to find.
 We were in real trouble.
 And the answer was technology, and they had to fix the technology.
 And in the case of the 91, they remade all the technology, in the case of
 some of the slower machines in the 360 line they only partially remade them
 and some they didn't remake at all.
 Because it was a time-dependent thing. The faster your circuit was, the more
 it was prone to this problem.

 30:52
 [how long were these systems under development, from
 Thomas Watson saying I want a fast computer]
 Well, what I have to do, which doesn't make a nice clean
 picture, is the following.
 We commenced on the model 91 in about 1963
 possibly late 62.
 Because of the cracked stripe problem, it took us a very long
 time to debug the machine.
 If there'd been no cracked stripe problem we probably would have
 brought up the machine two years earlier than we ultimately
 brought it up.
 So that was a real devastating blow to us.

 And you know, it's really hard, when your hardware is failing under you,
 it's hard to make progress
 and it's failing in random ways
 and sometimes you take it out of the machine
 and it doesn't fail!
 Now what do you do?
 Thank your lucky stars if it fails next time.

 [During development of 91 did you think about compiler optimisations]
 We didn't have much to say about compilers
 I was in the hardware group
 we were conversant with some of their problems
 later on there was more back and forth
 because I always had - after the initial model 91 thing -  a dual role of architect
 in the early phases which is really
 software architecture and then implementing the machine.
 Does that answer your question?

 [You mentioned Seymour Cray, he was still at Control Data
 was there a lot of competition
 did you take his machines apart?]
 We did a little more of that later on
 [laughter]
 but no we didn't do that that much.
 He had, it's very interesting how these things work out
 because he had a jump start on us, okay
 because he was already working on
 his machine
 and we were just starting on ours - we didn't even
 have the technology to build our machine
 So we were in real trouble.
 So, what are we to do in this circumstance? How can we
 make up lost ground. We tried all kinds of things, perhaps not
 well-founded, to try and make up this lost ground
 34:38
 But it's difficult. And what happened was, we were saved. We had an
 assessment of how fast the Cray machine was. So we said
 okay, we think it's this fast, they are going to be two years after us
 so we've got to be twice as fast as them so we can ... come out.
 That sounds feasible.
 Well, it turns out the Cray is four times faster
 [laughter]
 than we thought. Meanwhile our machine is two times faster
 than we thought it was.
 The net result of all this fiddling around was rough parity.
 There were some things we did faster
 some kinds of problems they did faster.
 But they still had - perhaps undeserved - the reputation
 for raw compute speed.
 And I think all of the customers, like the Atomic Energy Commission
 laboratories, who were supposedly interested in that, they all went
 to Cray, and IBM sold to - how to characterise it - database kind
 of applications, that need a lot of memory and concurrency, a lot
 of I/O running and all that kind of stuff. Doesn't particularly need a lot of floating point performance,
 computational performance in general
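 [Illustrative sketch, not part of the talk: the "rough parity" in the answer
 above follows from the quoted factors. Speeds are expressed relative to the
 original estimate of the Cray machine, set to 1.0. The function and variable
 names are hypothetical.]

```python
def speed_ratio(estimated_cray: float = 1.0) -> float:
    """Ratio of the IBM machine's actual speed to the Cray machine's actual speed."""
    planned_ibm = 2 * estimated_cray   # target: twice the estimated Cray speed
    actual_cray = 4 * estimated_cray   # Cray turned out four times faster than thought
    actual_ibm = 2 * planned_ibm       # IBM machine turned out two times faster than thought
    return actual_ibm / actual_cray


print(f"IBM / Cray speed ratio: {speed_ratio():.1f}")
```

 [The two surprises cancel: a ratio of 1.0 means the planned factor of two
 advantage was used up entirely.]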
 36:21
 [Tell us about how IBM culture changed over your career]
 That's a tough one for me inside.
 First, I have to divide my career into two parts. First, the five or six
 years when I did my machine design - not my machine design - and then after that.
 How did IBM change? It became less interesting and cutting edge.
 In the beginning it was wide open, crazy ideas and if you could implement them
 and it would buy some performance, you got it.
 As time went on, you get more and more constrained, by the architecture
 and by the necessity of other machines. You're not just allowed to design
 for the model 91 class, you have to design a machine, you know, the next class
 down, we had three or four [classes] by that time.  So there were all kinds of
 things standing in the way of pure performance. And you just had to 
 live with those. There's no way around it.

 38:38
 [How about backward compatibility?]
 Oh yeah, that was a must, that was a no-brainer, we weren't allowed to touch
 that with a ten foot pole. In fact we had to get a special dispensation, because
 the model 91, because of its out of order floating point, the effect on interrupts
 was actually in violation of the architecture, and they had to get a special
 dispensation for the model 91.
 I don't think anyone ever suffered from this dispensation in those days, but
 nevertheless you had to get it.

 39:20
 [Any specific ideas from your team that wouldn't have worked]
 You're asking me if any ideas sort of died?
 That's really hard to answer. You like to think that you wring out the most
 performance you can get out of your technology, from what you've got,
 but that didn't really... really and truly there are all manner of compromises
 that have to be made - not have to be made, some have to be made, others
 don't have to be made, but you're not omniscient, you don't know everything,
 so it's really hard to say how much that affects machine design.
	Transcript of the talk at https://www.youtube.com/watch?v=S6weTM1tNzQ

	Thank you very much for that kind welcome!

	What I intend to do with my time today
	is divide it into two pieces
	not necessarily equal.

	The first piece will be a very cursory
	examination of 20, 30 years of computer design
	from the model 91 on forward to when it finally got replaced
	as an out-of-order execution machine

	and then I'll spend the rest of my
	time answering your questions.

	By the way interrupt me any time you have a
	question or a comment that you think is appropriate.

	So, to come way ahead, to roughly 1990 or thereabouts
	IBM finally brought out an out of order machine
	and this was such a shocking thing that
	they felt it necessary to offer
	an explanation or two as to why it
	took them 30 years to do this

	and they did

	and the main explanation which was
	quite a good one
	was that once you have a cache
	with a very rapid access to memory
	(not counting misses of course which we'll discuss a little bit later)
	and you don't have to worry about floating point
	and you don't care about long execution
	instructions like floating point then there's really no need
	for what the model 91 offered
	and so naturally they got rid of it.

	But as time marched on it became apparent that
	they were going to need something.

	Now, IBM moves in mysterious ways -
	they probably move a lot slower than
	I would have liked or other people would
	have liked but ultimately they get it done.

	So they brought out this machine
	and they provided a reasonable rationale
	which the only flaw in it was that as time went on
	(because we're talking now about a 30 year span)
	it became less and less applicable
	to think that way because memories kept getting slower
	as they always have
	I mean you can think of it as kind of a Golden Age
	of the cache
	that we had this period where you could get away with
	two cycle access to main memory
	and Gee, if you missed the main memory you had like
	five cycle access to whatever backed it up
	instead of, in today's machines,
	you've got 70 cycle access to whatever backed it up.

	5:14
	so
	after... let me get straight what I want to say...
	I'm gong to jump around somewhat...
	So with all these caveats we didn't do an out of order machine
	and maybe we could have a little bit sooner
	than otherwise.

	Now we jump back to the Model 91 timeframe
	and why didn't they do an OOO execution machine then?
	Well, they did, they did. It was a machine - I forget
	what the nomenclature was now - it was a full OOO
	machine. It was completely logically designed
	it wasn't physically designed. So it's hard facts.

	And, it featured some very nice innovations.
	Including a branch prediction table which was
	a new, a relatively new thing.

	What else did they have? They had something in there specially
	for RAS purposes, it's slipped my mind, but no-one cares
	about RAS anyway. Although we should.

	So, this machine was carried through to everything
	except final physical design. It was good as gold, it
	was completely debugged and conformed to the
	architecture and all that other good stuff.

	It was deemed that it was too expensive
	which it probably was.
	By the way, it had a 7ns cycle, which for
	this period of time was pretty good. Pretty good.
	Contemporary IBM machines were like 40, 50ns
	cycle at that time.

	7:36
	So, now we have nothing. We don't have the model 91 any more
	and we didn't bring out a successor machine.
	So in a sense nothing happened, from that point of view, for machine
	design, for some 30-some odd years. Which was sad, I think
	but that's OK.

	> Any questions?

	... just leapfrog to the rest of my talk. Or rather, the rest of your talk.

	I don't have to do too much talking - as you see I don't have too much to
	do because IBM made it too easy, right? They made one - I wouldn't
	say it was a halfhearted attempt - they made a good attempt, to make
	a really good machine, but it wasn't on the cards, and that was it.

	Then 30 years go by, and then it's in the stars, and they can start over
	again.

	So that's really the main thing I wanted to cover. Obviously I can cover
	more things if you want, obviously machine design didn't stand still.
	During this time there improvements in RAS, and all kinds of things.
	But for our purposes I didn't think I'd want to devote too much
	time to those things.

	9m40
	> Now onto questions.
	[The questions are difficult to hear clearly enough to transcribe]

	[Roughly when was this OOO design done?]
	My memory isn't that good... early 70s, 72.

	[Something about instructions]
	Yes, in a limited way. What it did was classify the instructions
	into fixed point, floating point, and decimal. And one instruction
	in each class could be executed along with an instruction
	from another class.
	So it didn't rely really heavily on OOO, because you have to
	remember that it's still true that, with the cache, a lot of the
	gain of OOO evaporates. So there's not much point pursuing
	it just for the sake of pursuing it.

	[How large a team of people]
	Well, the model 91 was special in the sense that they did
	a whole new technology, they did a whole new design
	automation system, and they did a whole new machine.
	So they had a lot of change on their plates.

	[How many people?]
	The model 91, because it was developing the design
	automation system and the software and everything, it had a lot of
	people, altogether. It didn't have that many, if you focus in on
	the designers, people like I was, there might have been
	twenty, maybe, order of magnitude.
	[laughter]
	Why is order of magnitude funny? You think I'm trying to
	hide a hundred? I wouldn't do that!

	[Most trouble in 91?]
	We had trouble until we discovered the OOO algorithms,
	that cleared up one source of trouble.
	The whole machine was stretching.
	It was a strange machine because it had
	really pitiful memory access. This was before
	caches, so it had like 10 cycles, minimum
	memory access. Now of course it had 16-way interleaved
	memory, so you're not necessarily waiting 10 cycles
	for every single access, but it's
	pretty pitiful.
	So memory was a big bottleneck for that machine.
	When they finally brought out... part of the 90 line...
	they may have brought out a high speed version.
	I seem to remember they made two versions
	of thin-film memory machine, which was very fast
	memory. Unfortunately it couldn't have the huge
	number of megabytes you can get with conventional
	memories.

	[You went on to work with STC on one of the first microprocessor based...
	how did those servers differ from PCs]
	As is often the case, a big company like IBM, they're not
	necessarily first out of the gate with new things. It may take them
	a while, especially if they don't have competition.
	So part of what happened is due to that kind
	of phenomenon.
	But I don't know if the rest is due to that.
	16:22
	Understand, this machine, the first one we did
	as STC - the only one we did as STC - was
	not supposed to be a high performance machine.
	It was supposed to be below the IBM machines.
	Because it supposed to serve as a server, or a
	lead-in to the IBM machine.
	Reality intruded. The first thing that happened, the
	technology was 2x slower than we thought it was.
	But fortunately it was also 2x faster than we thought
	it was, so we recouped back.
	But we were still in the hole, and we end up
	pulling some pretty sophisticated tricks
	and you don't want to do that
	you're dealing with a group - not all - of neophytes
	and you don't want to be tackling complicated
	things if you can avoid it.
	And that was, partially, what did them in.
	It was too complicated, for them.

	17:51
	[Something about marketing as the PC?]
	Oh, Yeah! God love marketing people, I wanted to strangle them.
	[laughter]
	we were going for three years on this project,
	sweating bullets, to try and to wring out
	every last bit of performance
	that these guys, the marketing people, were telling us
	"we absolutely need that performance, we're going to
	put you in a certain environment, you're not going to
	sell that many machines"
	and what happens, is that
	"oh no, we don't really have to be in that environment
	we want a cheaper machine that doesn't go as fast"
	Sheesh, I wanted to strangle them, all of them.
	It happens.

	18:43
	BTW this is not the end of the programming, it's the
	second part where you get to ask questions.

	[question about moving on from the system/360 to consulting
	how did you find your job had changed]
	Oddly enough, I don't think, that much.
	Except that it was a mistake, striving for too much
	performance, was bad. We got ourselves a lot
	of grief from doing that.
	Because, you know, the architecture of the machine
	is semi-stable. It changes, you have to upgrade with
	the times, but it is semi-stable.

	20:14
	[what was the original motivation for the OOO architecture
	and how come it took so long to be used in production.
	Why was the idea ahead of its time]
	The short answer, we've already discussed it. One, there was
	a machine, a successor, which had got scrapped. Without that
	machine, and with the advances in cache, there was no point,
	really, in out of order execution. At least not until the 80s.
	You can argue about at what point it might have made
	sense.
	[Given that, why try to do OOO]
	At the time, we were young
	[laughter]
	young and bold and we wanted to go for everything we can
	get. And if we hadn't the idea, we would have built a
	perfectly good machine which in the best case
	might be 20 or 30% faster thanks to the floating point
	guy. Because he speeded up the floating point.
	So it wasn't that big a deal. But it was a coup.
	It was something Seymour Cray didn't have.
	No-one had it. So IBM could get some bragging rights
	out of the whole thing.

	[Back to 60s-70s what was it like to convince designers
	and architects that OOO was good]
	Pretty smart team! They didn't take any convincing.
	I had the idea on a weekend, went in Monday morning
	sketched out the bulk of the important parts of the idea.
	They were thorough, they made sure they covered things,
	they made sure there were no serious quibbles.
	And then we were off and running.
	Now, you know that you can do more with OOO
	than we did. And in particular one thing which is very
	important for the operating system is to be able to do
	loads out of order. So that you can stack up loads which
	might be delayed for whatever reason, and then maybe
	get some other instruction through. Because that's the
	main ... once you get rid of floating point - let's say long
	instructions - then that's basically it. Even the OOO that
	I was just talking about is not really a barn-burner kind of thing.
	It's good to have. Like all of these things, you get to build
	on them. We didn't get to build on them for like 30 years
	but you get to build on them.
	And out of the closet they come and you find you can do
	something you couldn't do before.

	25:00
	[What kind of design automation did you have]
	Ha ha, none!
	[laughter]
	We were the first machine in IBM to simulate the logic of
	the machine. And we could simulate 1000 gates at a time.
	[laughter]
	It's pitiful! I mean, the model 91 isn't by any means a huge
	machine but it's like 40 or 50 thousand gates.
	So that's nothing.
	And this was something that developed towards the end of
	the project.
	Like I say, I give loads of credit, we had smart people.
	Including this one guy and gal who were really DA people
	and they were pushing what they could do, to improve
	the performance of the machine.

	26:28
	[Debugging efforts and kinds of problems/bugs, and the process
	to fix]
	Yes, And an interesting sideline to that, we had this programmer
	who eventually wrote a simulation of the machine.
	And he discovered two bugs, in the machine - I think it had to do
	with fetching - because in those days we had a pretty sophisticated
	fetching algorithm, where you start out with nothing and you try and
	catch a loop. So we really didn't have much of anything in the way
	of debugging. It was coming, but not for us. Which is too bad.
	[Followup, how to debug the machine]
	Well, that's somewhat of an art. Especially in those days.
	28:38
	We had an interesting little experience. We were plagued by something
	called the 'cracked stripe'
	which none of you probably ever heard of. But it was a fault in the
	technology
	due to the extremely high current density that they had pushed the
	circuits to, such that the wind, the electron wind, going through these very
	fine circuits were blowing the atoms away. And you would get faults.
	You would get open circuits.
	So that was an interesting problem that we had to deal with.
	[How to find that this was going on? How to deal with it?]
	The 'cracked stripe' was special. We were experiencing one
	failure every day. Now, most of you don't have experience debugging
	a machine, but you can't debug a complex machine if you have one failure
	a day. There are just too many things to find.
	We were in real trouble.
	And the answer was technology, and they had to fix the technology.
	And in the case of the 91, they remade all the technology, in the case of
	some of the slower machines in the 360 line they only partially remade them
	and some they didn't remake at all.
	Because it was a time-dependent thing. The faster your circuit was, the more
	it was prone to this problem.

	30:52
	[how long were these systems under development, from
	Thomas Watson saying I want a fast computer]
	Well, what I have to do, which doesn't make a nice clean
	picture, is the following.
	We commenced on the model 91 in about 1963
	possibly late 62.
	Because of the cracked stripe problem, it took us a very long
	time to debug the machine.
	If there'd been no cracked stripe problem we probably would have
	brought up the machine two years earlier than we ultimately
	brought it up.
	So that was a real devastating blow to us.

	And you know, it's really hard, when your hardware is failing under you,
	it's hard to make progress
	and it's failing in random ways
	and sometimes you take it out of the machine
	and it doesn't fail!
	Now what do you do?
	Thank your lucky stars if it fails next time.

	[During development of 91 did you think about compiler optimisations]
	We didn't have much to say about compilers
	I was in the hardware group
	we were conversant with some of their problems
	later on there was more back and forth
	because I always had - after the initial model 91 thing - a dual role of architect
	in the early phases which is really
	software architecture and then implementing the machine.
	Does that answer your question?

	[You mentioned Seymour Cray, he was still at Control Data
	was there a lot of competition
	did you take his machines apart?]
	We did a little more of that later on
	[laughter]
	but no we didn't do that that much.
	He had, it's very interesting how these things work out
	because he had a jump start on us, okay
	because he was already working on
	his machine
	and we were just starting on ours - we didn't even
	have the technology to build our machine
	So we were in real trouble.
	So, what are we to do in this circumstance? How can we
	make up lost ground? We tried all kinds of things, perhaps not
	well-founded, to try and make up this lost ground.
	34:38
	But it's difficult. And what happened was, we were saved. We had an
	assessment of how fast the Cray machine was. So we said
	okay, we think it's this fast, they are going to be two years after us
	so we've got to be twice as fast as them so we can ... come out.
	That sounds feasible.
	Well, it turns out the Cray is four times faster
	[laughter]
	than we thought. Meanwhile our machine is two times faster
	than we thought it was.
	The net result of all this fiddling around was rough parity.
	There were some things we did faster
	some kinds of problems they did faster.
	But they still had - perhaps undeserved - the reputation
	for raw compute speed.
	And I think all of the customers, like the Atomic Energy Commission
	laboratories, who were supposedly interested in that, they all went
	to Cray, and IBM sold to - how to characterise it - database kind
	of applications, that need a lot of memory and concurrency, a lot
	of I/O running and all that kind of stuff. Doesn't particularly need
	a lot of floating point performance, computational performance in general.
	36:21
	[Tell us about how IBM culture changed over your career]
	That's a tough one for me inside.
	First, I have to divide my career into two parts. First, the five or six
	years when I did my machine design - not my machine design - and then after that.
	How did IBM change? It became less interesting and cutting edge.
	In the beginning it was wide open, crazy ideas and if you could implement them
	and it would buy some performance, you got it.
	As time went on, you get more and more constrained, by the architecture
	and by the necessity of other machines. You're not just allowed to design
	for the model 91 class, you have to design a machine, you know, the next class
	down, we had three or four [classes] by that time. So there were all kinds of
	things standing in the way of pure performance. And you just had to
	live with those. There's no way around it.

	38:38
	[How about backward compatibility?]
	Oh yeah, that was a must, that was a no-brainer, we weren't allowed to touch
	that with a ten foot pole. In fact we had to get a special dispensation, because
	the model 91, because of its out of order floating point, the effect on interrupts
	was actually in violation of the architecture, and they had to get a special
	dispensation for the model 91.
	I don't think anyone ever suffered from this dispensation in those days, but
	nevertheless you had to get it.

	39:20
	[Any specific ideas from your team that wouldn't have worked]
	You're asking me if any ideas sort of died?
	That's really hard to answer. You like to think that you wring out the most
	performance you can get out of your technology, from what you've got,
	but that didn't really... really and truly there are all manner of compromises
	that have to be made - not have to be made, some have to be made, others
	don't have to be made, but you're not omniscient, you don't know everything,
	so it's really hard to say how much that affects machine design.