From bear at typewritten.org  Sat May  2 19:41:20 2009
From: bear at typewritten.org (r.stricklin)
Date: Sat, 2 May 2009 19:41:20 -0700
Subject: [LispM] Fwd: troubleshooting nubus memory?
References: <mailman.81.1241306232.10984.lispm@tunes.org>
Message-ID: <8FE65227-12DC-4279-8471-1B1405D423F2@typewritten.org>

Folks;

I have a MacIvory which has recently begun failing to cold boot LISP.  
I am pretty sure that it is due to a hardware error on the NuBus  
memory board. Does anybody here have any information or experience  
that will help me track down which is the failing component? This is  
my only NuBus memory board so my methods are somewhat limited.

It's a NatSemi 8/16 board, with the full 16 MB (2.3 MW). If I remove  
the daughterboard to make it an 8 MB (1.6 MW) board, the MacIvory  
extension crashes with a bus error while MacOS is booting, which tells  
me either that the problem is on the base board or that 1.6 MW is  
insufficient.

The FEP Test Main Memory utility spits out periods for a while, then  
starts blasting out errors starting after address 5150000 (somewhere  
around the 1.3 MW mark - it disappeared off the screen too quickly for  
me to take more careful notes). The test continues spitting out  
errors, only limited by the speed of the FEP display. I lost patience  
after about thirty minutes; the test had managed to proceed as far as  
address 5522135 (about 0.2 MW of continuous errors).
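
A minimal sketch for sanity-checking these figures. It assumes the FEP
prints addresses in octal and that 1 MW means 2**20 words; the message
states neither, and the helper names are made up for illustration.

```python
# Assumption: FEP addresses are octal word addresses, and 1 MW = 2**20 words.
WORDS_PER_MW = 2 ** 20


def octal_to_words(addr):
    """Parse an octal address string into a word offset."""
    return int(addr, 8)


def words_to_mw(words):
    """Express a word count in megawords."""
    return words / WORDS_PER_MW


first_error = octal_to_words("5150000")  # approximate start of the errors
last_seen = octal_to_words("5522135")    # where the test was abandoned
span = last_seen - first_error

print(f"first error near word {first_error} ({words_to_mw(first_error):.2f} MW)")
print(f"last seen at word {last_seen} ({words_to_mw(last_seen):.2f} MW)")
print(f"span covered: {span} words ({words_to_mw(span):.2f} MW)")
```

Under those assumptions the errors start at about 1.30 MW and the run
observed covers 119901 words, roughly 0.11 MW.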

The XOR displayed is always the same: either 0100000000000000 or  
0000000000000200.
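
To see which data bits those XOR patterns implicate, here is a small
sketch. It assumes the values are octal (an assumption, though neither
contains an 8 or 9) and numbers bits from 0 at the least significant
end; set_bits is a hypothetical helper, not an FEP function.

```python
# Assumption: the XOR values are printed in octal; bit 0 is least significant.
def set_bits(octal_mask):
    """Return the positions of the 1 bits in an octal mask string."""
    value = int(octal_mask, 8)
    return [i for i in range(value.bit_length()) if (value >> i) & 1]


for mask in ("0100000000000000", "0000000000000200"):
    print(mask, "->", set_bits(mask))
```

Each pattern has exactly one bit set: bit 42 in the first and bit 7 in
the second. Whether a single data bit corresponds to a single memory
device on this board is not something the test output itself shows.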

This should probably be enough to tell me where to look, but I don't  
know how the LISP addresses relate to the physical layout of memory  
devices on the board.

Update: if I disable acceleration on the 8*24GC card, I get a much  
more reasonable result from the Test Main Memory utility. Just one  
message, and it stops:

Uncorrectable memory error referencing physical 00005152530 at #<DTP- 
EVEN-PC (150 in (:INTERNAL FEP::TEST-48-BIT-MEMORY 0 #:SPLIT59854))  
37000013002>
    ECC Syndrome 204; Log Address 00005152530; Error Logged; Errors Lost
    Uncorrected data: 11320002205537

After several tries, the location is relatively static: several  
identical results, and one or two with nearby addresses.

The error mode is the same with either of two MacIvory processors, so  
I'm reasonably confident it's not a problem with the processor itself.

Thoughts? Thanks!

Also - is the 8*24GC control panel known to be incompatible with the  
MacIvory software?

ok
bear




From gyro at zeta-soft.com  Sat May  2 21:33:11 2009
From: gyro at zeta-soft.com (Scott Burson)
Date: Sat, 2 May 2009 21:33:11 -0700
Subject: [LispM] Fwd: troubleshooting nubus memory?
In-Reply-To: <8FE65227-12DC-4279-8471-1B1405D423F2@typewritten.org>
References: <mailman.81.1241306232.10984.lispm@tunes.org>
	<8FE65227-12DC-4279-8471-1B1405D423F2@typewritten.org>
Message-ID: <310c21410905022133j57cf8d45g9e5cb2aadd77b8a7@mail.gmail.com>

Forgive me if I state the obvious, but have you reseated all the socketed
chips?  That's repair technique #1 for hardware of that era.  (In case you
don't know what I mean -- chips in sockets tend to work loose eventually
because of thermal cycling.  Reseating means to press each chip back into
its socket.)

I would think 1.6MW would be plenty to boot Lisp, though I don't recall for
sure.

-- Scott


From bear at typewritten.org  Sat May  2 21:38:33 2009
From: bear at typewritten.org (r.stricklin)
Date: Sat, 2 May 2009 21:38:33 -0700
Subject: [LispM] Fwd: troubleshooting nubus memory?
In-Reply-To: <310c21410905022133j57cf8d45g9e5cb2aadd77b8a7@mail.gmail.com>
References: <mailman.81.1241306232.10984.lispm@tunes.org>
	<8FE65227-12DC-4279-8471-1B1405D423F2@typewritten.org>
	<310c21410905022133j57cf8d45g9e5cb2aadd77b8a7@mail.gmail.com>
Message-ID: <483E46D5-D6EF-4E27-9EB3-197F191B936A@typewritten.org>


On May 2, 2009, at 9:33 PM, Scott Burson wrote:

> Forgive me if I state the obvious, but have you reseated all the  
> socketed chips?  That's repair technique #1 for hardware of that  
> era.  (In case you don't know what I mean -- chips in sockets tend  
> to work loose eventually because of thermal cycling.  Reseating  
> means to press each chip back into its socket.)
>

Only 4 MB of the 16 total is in sockets, but yes.

> I would think 1.6MW would be plenty to boot Lisp, though I don't  
> recall for sure.

I didn't even get that far. With 1.6 MW, the control panel bombs  
during booting of the IIfx and forces a reset. I don't know enough  
about what's supposed to happen to know whether that's an indication  
of my problem.

ok
bear