STM32F767ZI External Interrupt Handling - c++

I'm attempting to create a proper SPI slave interface for an AD7768-4 ADC. The ADC has a SPI interface, but it doesn't output the conversions via SPI. Instead, there are data outputs that are clocked out on individual GPIO pins. So I basically need to bit-bang data, and output to SPI to get a proper slave SPI interface. Please don't ask why I'm doing it this way, it was assigned to me.
The issue I'm having is with the interrupts. I'm using the STM32F767ZI processor - it runs at 216 MHz, and my ADC data MUST BE clocked out at 20MHz. I've set up my NMIs but what I'm not seeing is where the system calls or points to the interrupt handler.
I used the STMCubeMX software to assign pins and generate the setup code, and in the stm32F7xx.c file, it shows the NMI_Handler() function, but I don't see a pointer to it anywhere in the system files. I also found void HAL_GPIO_EXTI_IRQHandler() function in STM32F7xx_hal_gpio.c, which appears to check if the pin is asserted, and clears any pending bits, but it doesn't reset the interrupt flag, or check it, and again, I see no pointer to this function.
To more thoroughly complicate things, I have 10 clock cycles to determine which flag is set (1 of two at a time), reset it, incerment a variable, and move data from the GPIO registers. I believe this is possible, but again, I'm uncertain of what the system is doing as soon as the interrupt is tripped.
Does anyone have any experience in working with external interrupts on this processor that could shed some light on how this particular system handles things? Again - 10 clock cycles to do what I need to... moving data should only take me 1-2 clock cycles, leaving me 8 to handle interrupts...
EDIT:
We changed the DCLK speed to 5.12 MHz (20.48 MHz MCLK/4) because at 2.56 MHz we had exactly 12.5 microseconds to pipe data out and set up for the next DRDY pulse, and 80 kHz speed gives us exactly zero margin. At 5.12 MHz, I have 41 clock cycles to run the interrupt routine, which I can reduce slightly if I skip checking the second flag and just handle incoming data. But I feel I must use the DRDY flag check at least, and use the routine to enable the second interrupt otherwise I'll be constantly interrupting because DCLK on the ADC is always running. This allows me 6.12 microseconds to read in the data, and 6.25 microseconds to shuffle it out before the next DRDY pulse. I should be able to do that at 32 MHz SPI clock (slave) but will most likely do it at 50MHz. This is my current interrupt code:
void NMI_Handler(void)
{
if(__HAL_GPIO_EXTI_GET_IT(GPIO_PIN_0) != RESET)
{
count = 0;
__HAL_GPIO_EXTI_CLEAR_IT(GPIO_PIN_0);
HAL_GPIO_EXTI_Callback(GPIO_PIN_0);
// __HAL_GPIO_EXTI_CLEAR_FLAG(GPIO_PIN_0);
HAL_NVIC_EnableIRQ(GPIO_PIN_1);
}
else
{
if(__HAL_GPIO_EXTI_GET_IT(GPIO_PIN_1) != RESET)
{
data_pad[count] = GPIOF->IDR;
count++;
if (count == 31)
{
data_send = !data_send;
HAL_NVIC_DisableIRQ(GPIO_PIN_1);
}
__ HAL_GPIO_EXTI_CLEAR_IT(GPIO_PIN_1);
HAL_GPIO_EXTI_Callback(GPIO_PIN_1);
// __HAL_GPIO_EXTI_CLEAR_FLAG(GPIO_PIN_0);
}
}
}
I am still concerned about clock cycles, and I believe I can get away with only checking the DRDY flag if I operate on the presumption that the only other EXTI flag that will trip is for the clock pin. Although I question how this will work if SYS_TICK is running in the background... I'll have to find out.
We're investigating a faster processor to handle the bit-banging, but right now, it looks like the PI3 won't be able to handle it if it's running Linux, and I'm unaware of too many faster processors that run either a very small reliable RTOS, or can be bare metal programmed in a pinch...

10 clock cycles to do what I need to... moving data should only take me 1-2 clock cycles, leaving me 8 to handle interrupts...
No way. Interrupt entry (pushing registers, fetching the vector and filling the pipeline) takes 10-12 cycles even on a Cortex-M7. Then consider a very simple interrupt handler, just moving the input data bits to a buffer and clearing the interrupt flag:
uint32_t *p;
void handler(void) {
*p++ = GPIOA->IDR;
EXTI->PR = 0x10;
}
it gets translated to something like this
handler:
ldr r0, .addr_of_idr // load &GPIOA->IDR
ldr r1, [r0] // load GPIOA->IDR
ldr r2, .addr_ofr_p // load &p
ldr r3, [r2] // load p
str r1, [r3] // store the value from IDR to *p
adds r3, r3, #4 // increment p
str r3, [r2] // store p
ldr r0, .addr_of_pr // load &EXTI->PR
movs r1, #0x10
str r1, [r0] // store 0x10 to EXTI->PR
bx lr
.addr_of_p:
.word p
.addr_of_idr
.word 0x40020010
.addr_of_pr
.word 0x40013C14
So it's 11 instructions, each taking at least one cycle, after interrupt entry. That's assuming the code, vector table, and the stack are all in the fastest RAM region. I'm not sure whether literal pools work in ITCM at all, using immediate literals would add 3 more cycles. Forget it.
This has to be solved with hardware.
The controller has 6 SPI interfaces, pick 4 of them. Connect DRDY to all four NSS pins, DCLK to all SCK pins, and each DOUT pin to one MISO pin. Now each SPI interface handles a single channel, and can collect up to 32 bits in its internal FIFO.
Then I'd set an interrupt on a rising edge on one of the NSS pins (EXTI still works even if the pin is in alternate function mode), and read all data at once.
EDIT
It turns out that the STM32 SPI requres an inordinate amount of delay between NSS falling and SCK rising, which the AD7768 does not provide, so it will not work.
Sigma-Delta interface
The STM32F767 has a DFSDM peripheral, designed to receive data from external ADCs. It can receive up to 8 channels of serial data with 20 MHz, and it can even do some preprocessing that your application might need.
The problem is that the DFSDM has no DRDY input, I don't exactly know how could the data transfer be synchronized. It might work by asserting the START# singal to reset the communication.
If that doesn't work, then you can try starting the DFSDM channels using a timer and DMA. Connect DRDY to the external trigger of TIM1 or TIM8 (other timers won't work, because they are connected to the slower APB1 bus and the other DMA controller), start it on the rising edge of ETR, and let it generate a DMA request after ~20 ns. Then let the DMA write the value needed to start the channel to the DFSDM channel configuration register. Repeat for the oher three channels.

There's a startup file generated before compile: startup_stm32f767xx.s - which contains all the pointers to functions.
Under the marker g_pfnVectors: is .word NMI_Handler pointing to a function for handling the non-masked interrupts, and two other pointers, .word EXTI0_IRQHandler and .word EXTI1_IRQHandler as vectors to the external interrupt handlers. Further down in the same file, is the following compiler directives:
.weak NMI_Handler
.thumb_set NMI_Handler,Default_Handler
.weak EXTI0_IRQHandler
.thumb_set EXTI0_IRQHandler,Default_Handler
.weak EXTI1_IRQHandler
.thumb_set EXTI1_IRQHandler,Default_Handler
This was the info I was looking for to be able to control my interrupts with more precision and fewer clock cycles.

I readed AD7768 DS more carefully and found that it can srnd four channels data to one DOUT pin. So, I talking again about serial audio interface (SAI).
If you can lower DCLK frequency up to 2.5MHz than you can lower sample with ratio 1:8 (as ratio 2.5 MHz to 20 MHz) irt sample rate at full ADC clock.
If you route all 4 channels to one output DOUT0 you slow down sample rate just in ratio 1:4.
AD7768-4 DS
page 53
On the AD7768, the interface can be configured to output conversion
data on one, two, or eight of the DOUTx pins. The DOUTx configuration
for the AD7768 is selected using the FORMATx pins (see Table 33).
page 66 table 34: (for AD7768-4)
page 67 figure 98:
FORMAT0 = 1 All channels output on the DOUT0 pin, in TDM output. Only DOUT0 is in use.
You can use SAI with FS = DRDY and four slots, 32 bits/slot

Related

SPI Extra Clock Cycle over communication between STM32 Nucleo L432KC and MAX31865 ADC

The setup that I'm working with is a Nucleo L432KC connected to 8 different MAX31865 ADCs to get temperature readings from RTDs (resistive thermal devices). Each of the 8 chip selects is connected to its own pin, but the SDI/SDO of all chips are connected to the same bus, since I only read from one at a time (only 1 chip select is enabled at a time). For now, I am using a 100 ohm base resistor in the Kelvin connection, not an RTD, just to ensure an accurate resistance reading. The read from an RTD comes from calling the function rtd.read_all(). When I read from one and only one RTD, I get an accurate reading and an accurate SPI waveform (pasted below):
correct SPI reading for 1 ADC
(yellow is chip enable, green is clock, blue is miso, purple is mosi)
However, when I read from 2 or more sequentially, the SPI clock for some reason gains an additional unwanted cycle at the start of the read that throws off the transmitted values. It's been having the effect of shifting the clock to the right and bit-shifting my resistance values to the left by 1.
Logic analyzer reading of SPI; clock has additional cycle at start
What could be causing this extra clock cycle? I'm programming in C++ using mbed. I can access the SPI.h file but I can't see the implementation so I'm not sure what might be causing this extra clock cycle at the start. If I need to add the code too, let me know and I'll edit/comment.
rtd.read_all() function:
uint8_t MAX31865_RTD::read_all( )
{
uint16_t combined_bytes;
//SPI clock polarity/phase (CPOL & CPHA) is set to 11 in spi.format (bit 1 = polarity, bit 0 = phase, see SPI.h)
//polarity of 1 indicates that the SPI reading idles high (default setting is 1; polarity of 0 means idle is 0)
//phase of 1 indicates that data is read on the first edge/low-to-high leg (as opposed to phase 0,
//which reads data on the second edge/high-to-low transition)
//see https://deardevices.com/2020/05/31/spi-cpol-cpha/ to understand CPOL/CPHA
//chip select is negative logic, idles at 1
//When chip select is set to 0, the chip is then waiting for a value to be written over SPI
//That value represents the first register that it reads from
//registers to read from are from addresses 00h to 07h (h = hex, so 0x01, 0x02, etc)
//00 = configuration register, 01 = MSBs of resistance value, 02 = LSBs of
//Registers available on datasheet at https://datasheets.maximintegrated.com/en/ds/MAX31865.pdf
//The chip then automatically increments to read from the next register
/* Start the read operation. */
nss = 0; //tell the MAX31865 we want to start reading, waiting for starting address to be written
/* Tell the MAX31865 that we want to read, starting at register 0. */
spi.write( 0x00 ); //start reading values starting at register 00h
/* Read the MAX31865 registers in the following order:
Configuration (00)
RTD (01 = MSBs, 02 = LSBs)
High Fault Threshold (03 = MSBs, 04 = LSBs)
Low Fault Threshold (05 = MSBs, 06 = LSBs)
Fault Status (07) */
this->measured_resistance = 0;
this->measured_configuration = spi.write( 0x00 ); //read from register 00
//automatic increment to register 01
combined_bytes = spi.write( 0x00 ) << 8; //8 bit value from register 01, bit shifted 8 left
//automatic increment to register 02, OR with previous bit shifted value to get complete 16 bit value
combined_bytes |= spi.write( 0x00 );
//bit 0 of LSB is a fault bit, DOES NOT REPRESENT RESISTANCE VALUE
//bit shift 16-bit value 1 right to remove fault bit and get complete 15 bit raw resistance reading
this->measured_resistance = combined_bytes >> 1;
//high fault threshold
combined_bytes = spi.write( 0x00 ) << 8;
combined_bytes |= spi.write( 0x00 );
this->measured_high_threshold = combined_bytes >> 1;
//low fault threshold
combined_bytes = spi.write( 0x00 ) << 8;
combined_bytes |= spi.write( 0x00 );
this->measured_low_threshold = combined_bytes >> 1;
//fault status
this->measured_status = spi.write( 0x00 );
//set chip select to 1; chip stops incrementing registers when chip select is high; ends read cycle
nss = 1;
/* Reset the configuration if the measured resistance is
zero or a fault occurred. */
if( ( this->measured_resistance == 0 )
|| ( this->measured_status != 0 ) )
{
//reconfigure( );
// extra clock cycle causes measured_status to be non-zero, so chip will reconfigure even though it doesn't need to. reconfigure commented out for now.
}
return( status( ) );
}
Background:
I took a look at the entire implementation of the MAX31865_RTD class and the thing I find "troubling" is that a MAX31865_RTD instance creates its own SPI instance on construction. If you create multiple instances of this MAX31865_RTD class then there will be a separate SPI instance created and initialized for each of these.
If you have 8 of these chips and you create 8 separate MAX31865_RTD instances to provide one for each of your chips then this also creates 8 SPI instances that all point to the same physical SPI device of the microcontroller.
The problem:
When you call the read_all function on your MAX31865_RTD instance it in turn calls the SPI write functions (as seen in the code you provided). But digging deeper in the call chain you will eventually find that the code of the SPI write method (and others as well) is written in a way that it assumes that there can be multiple SPI instances that are using the same SPI hardware with different parameters (frequency, word length, etc...). In order to actually use the SPI hardware, the SPI class instance must first take ownership of the hardware if it does not have it yet. To do this it "acquires" the hardware for itself which basically means that it reconfigures the SPI hardware to the frequency and word length and mode that this particular SPI instance was set to (This happens regardless of the fact that every instance is set to the same parameters. They don't know about each other. They just see the fact that they have lost ownership and thus have to reacquire it and they also automatically assume that the settings are to be restored.). And this frequency (= clock) reinitialization is the reason that your clock is having a weird artefact/glitch on it. Each time you call the read_all on a different MAX31865_RTD instance the SPI instance of that instance will have to do an acquire (because they steal the ownership from each other on each read_all call) and it will make the clock behave weird.
Why it works if you only have one device:
Because when you have one and only one MAX31865_RTD instance then it has only one SPI instance which is the sole "owner" of the SPI hardware. So no-one is stealing the ownership on each turn. Which means that it does not have to re-acquire it on every read_all call. So in that case the SPI hardware is not reinitialized every time. So you don't get the weird clock pulse and everything works as intended.
My proposed solution #1:
I propose that you change the implementation of the read_all method.
If the version of the SPI class that you use has the select method, then add the
spi.select();
line just before pulling the chip select (nss) low. Basically add the line above this block:
/* Start the read operation. */
nss = 0;
If there is no select function, then just add a
spi.write(0x00);
line in the same place instead of the line with the select.
In essence both of the proposed lines just force the acquire (and the accompanying clock glitch) before the chip select line is asserted. So by the time the chip select is pulled low and the actual data is being written the SPI instance already has ownership and the write methods will not trigger an acquire (nor the clock glitch).
My proposed solution #2:
Another solution is to modify the MAX31865_RTD class to use an SPI instance reference and provide that reference through its constructor. That way you can create one SPI instance explicitly and provide this same SPI instance to all your MAX31865_RTD instances at construction. Now since all of your MAX31865_RTD instances are using a reference to the same and only SPI instance, the SPI hardware ownership never changes since there is only one SPI class instance that is using it. Thus the hardware is never reconfigured and the glitch never happens. I would prefer this solution since it is less of a workaround.
My proposed solution #3:
You could also modify the MAX31865_RTD class to have a "setter" for the nss (= chip select) pin. That way, you could have only one MAX31865_RTD instance for all your 8 devices and only change the nss pin before addressing the next device. Since there is only one MAX31865_RTD instance then there is only one SPI instance which also solves the ownership issue and since no re-acquisition has to be made then no glitch will be triggered.
And of-course there can be any number of other ways to fix this knowing the reason of the problem.
Hope this helps in solving your issue.

Interrupt to trigger SPI read of external ADC (MCP3464)

I have been trying to speed up the sampling rate on a SPI controlled ADC for an ECG device and have been stumping my head against the wall.
I am using an external ADC (MCP3464 datasheet) to read ADC data over 8 channels.
Here is how I am reading the adc:
// SPI full duplex transfer
digitalWrite(adcChipSelectPin,LOW);
SPI.transfer(readConversionData);
adcReading = (SPI.transfer(0) << 8);
adcReading += SPI.transfer(0);
digitalWrite(adcChipSelectPin, HIGH);
I tried single conversion mode, but this was slow speeds of 3200 samples per second. I believe the ESP32 can flash SPI at 80MHz, so surprised it's that slow.
Instead, I have set the MCP3464 to SCAN mode (see page 89 of datasheet) where it completes continuous conversions and you can set delays between each channel and each cycle (8 channels). That way you don't have to write an additional 2 SPI commands to change the channel and start the next conversion.
From the datasheet:
Each conversion within the SCAN cycle leads to a data ready interrupt and to an update of the ADCDATA register as soon as the current conversion is finished. In SCAN mode, each result has to be read when it is available and before it is overwritten by the next conversion result
Hence the data ready interrupt needs to be detected by the ESP32 to immediately read the latest value from the channel, before a new conversion to the next channel occurs.
I have tried waiting for IRQ to be LOW (I am multi-threading so I can afford to blocking wait) :
// wait for IRQ to trigger active LOW indicating data ready state
while(digitalRead(interruptPin));
*(packetLocation + ((ch+2) + (ts*(interface.numOfCh + 2)))) = adc.read();
*(unsigned long*)(packetLocation + (ts*(interface.numOfCh + 2))) = micros();
However this results in slow speeds of around 2640 samples per second and I am worried the synchronicity of reading the correct channel will collapse if just one active LOW is undetected.
I have also tried attaching an interrupt:
attachInterrupt(interruptPin, packageADC, FALLING);
but I am unable to link a non-static method to an IRQ.
Any ideas how to either get the interrupt to work quickly and reliably in SCAN mode or speed up the ESP32 SPI communication in Single shot mode?
Sorry, quite a lot to take in but any suggestions much appreciated!
Thanks in advance,
Will

Quadrature signal generation with bit manipulation

I try to generate quadrature signal but with the lowest operation possible. I use a STM32 and GPIO pin B8 and B9 for sending the signal.
couple of pin 8 and 9 have four possible options which are in clock wise:
0/0 1/0 1/1 and 0/1
and counter clockwise
0/0 0/1 1/1 1/0
I can't find the way with bitwise to be able to quickly set or reset the bit for the selected pin.
Moreover, I must be able to go clock or counterclokwise and change sense whenever I want like if it was a rotary or linear encoder.
Thank you for your help
Bit-banging
Bitwise thinking, B9 gets the previous value of B8, and B8 gets the inverse of B9, or the other way round when counting down. You swap the two bits, and exclusive-or with 0x100 or 0x200 depending on the direction.
inline void incB89(int down) {
uint32_t temp;
/* read the current output state */
temp = GPIOB->ODR;
/* modifying the significant bit-pair
don't care about overflow */
temp = (((temp & 0x100) << 1) | ((temp & 0x200) >> 1)) ^ (0x100 << down);
/* Setting the reset bits BR8 and BR9. This has the effect that
bits 8 and 9 will be copied into the ODR, and the rest will
be left alone */
temp |= ((1 << 24) | (1 << 25));
GPIOB->BSRR = temp;
}
Using a timer (or two)
On most STM32 series controllers, TIM4 channels 3 and 4 outputs can be mapped to PB8 and PB9. If you have one of these, this timer can control the outputs autonomously, unaffected by code, memory, or interrupt latency.
Set the GPIO mode and alternate function registers according to the reference manual of your controller.
Configure both channel 3 and 4 to toggle mode, set the OC1M and OC2M bits in TIM4->CCMR1 to 0b011.
Set the input clock, prescaler PSC and reload ARR to achieve twice the desired frequency, because each output will be toggled once in every timer cycle.
Set TIM4->CCR3=0 and TIM4->CCR4=(TIM4->ARR+1)/2 for counting in one direction. Swap them (while the counter is stopped) to reverse direction.
Enable the outputs in TIM4->CCER.
You can start and stop counting by setting or resetting the CEN bit of TIM4->CR1.
To count the cycles, you can to configure an interrupt for toggle or update events in TIM4->DIER, or use another timer as a slave to TIM4.
To use e.g. TIM3 to count:
Set the MMS bits in TIM4->CR2 to 0b010 to output a trigger pulse on each overflow.
Configure TIM3->SMCR to External Clock Mode 1, and select the internal trigger of TIM4.
Set TIM3->ARR to the required number of half-cycles - 1.
Configure an interrupt on update.
Start the counter.
There are some more tricks possible with timers, like using DMA bursts triggered by the slave to update the ARR and CCR registers of the master timer from a table of "wawelength" values.

IRQ 8 isn't working... HW or SW?

First, I program for Vintage computer groups. What I write is specifically for MS-DOS and not windows, because that's what people are running. My current program is for later systems and not the 8086 line, so the plan was to use IRQ 8. This allows me to set the interrupt rate in binary values from 2 / second to 8192 / second (2, 4, 8, 16, etc...)
Only, for some reason, on the newer old systems (ok, that sounds weird,) it doesn't seem to be working. In emulation, and the 386 system I have access to, it works just fine, but on the P3 system I have (GA-6BXC MB w/P3 800 CPU,) it just doesn't work.
The code
setting up the interrupt
disable();
oldrtc = getvect(0x70); //Reads the vector for IRQ 8
settvect(0x70,countdown); //Sets the vector for
outportb(0x70,0x8a);
y = inportb(0x71) & 0xf0;
outportb(0x70,0x8a);
outportb(0x71,y | _MRATE_); //Adjustable value, set for 64 interrupts per second
outportb(0x70,0x8b);
y = inportb(0x71);
outportb(0x70,0x8b);
outportb(0x71,y | 0x40);
enable();
at the end of the interrupt
outportb(0x70,0x0c);
inportb(0x71); //Reading the C register resets the interrupt
outportb(0xa0,0x20); //Resets the PIC (turns interrupts back on)
outportb(0x20,0x20); //There are 2 PICs on AT machines and later
When closing program down
disable();
outportb(0x70,0x8b);
y = inportb(0x71);
outportb(0x70,0x8b);
outportb(0x71,y & 0xbf);
setvect(0x70,oldrtc);
enable();
I don't see anything in the code that can be causing the problem. But it just doesn't seem to make sense. While I don't completely trust the information, MSD "does" report IRQ 8 as the RTC Counter and says it is present and working just fine. Is it possible that later systems have moved the vector? Everything I find says that IRQ 8 is vector 0x70, but the interrupt never triggers on my Pentium III system. Is there some way to find if the Vectors have been changed?
It's been a LONG time since I've done any MS-DOS code and I don't think I ever worked with this particular interrupt (I'm pretty sure you can just read the memory location to fetch the time too, and IRQ0 can be used to trigger you at an interval too, so maybe that's better. Anyway, given my rustiness, forgive me for kinda link dumping.
http://wiki.osdev.org/Real_Time_Clock the bottom of that page has someone saying they've had problem on some machines too. RBIL suggests it might be a BIOS thing: http://www.ctyme.com/intr/rb-7797.htm
Without DOS, I'd just capture IRQ0 itself and remap all of them to my own interrupt numbers and change the timing as needed. I've done that somewhat recently! I think that's a bad idea on DOS though, this looks more recommended for that: http://www.ctyme.com/intr/rb-2443.htm
Anyway though, I betcha it has to do with the BIOS thing:
"Notes: Many BIOSes turn off the periodic interrupt in the INT 70h handler unless in an event wait (see INT 15/AH=83h,INT 15/AH=86h).. May be masked by setting bit 0 on I/O port A1h "

Arduino locks up

The intention of the program below is to periodically output a dataframe on serial. The period is defined by a timed interrupt, every second.
The code worked on Arduino IDE version 0022, but on 1.0 I can't get it working. When using the timer routine and maxFrameLength is set to 0x40 or higher, the controller locks up. When using 0x39 or lower, the program keeps running (indicated by the flashing LED).
What's going wrong here and why? Is it a bug? Am I doing something wrong?
I'm using http://code.google.com/p/arduino-timerone/downloads/detail?name=TimerOne-v9.zip for the timer routine on a Mega1280.
#include "TimerOne.h"
#define LED 13
#define maxFrameLength 0x40
boolean stateLED = true;
byte frame[ maxFrameLength ];
void sendFrame() {
digitalWrite( LED , stateLED );
stateLED = !stateLED;
Serial.write( frame, maxFrameLength ); // ptr + bytes to send
}
void setup() {
pinMode( LED , OUTPUT );
Timer1.initialize( 1000000 ); // initialize timer1 with 1 second period
Timer1.attachInterrupt( sendFrame );
Serial.begin( 9600 );
};
void loop() {
};
There are a number of issues that may or may not be causing a problem, but it should be fixed in any case. These comments are general in nature; I am not familiar with Arduino or its library specifically.
It is almost certainly inappropriate to issue a Serial.write() call in an interrupt handler (ISR). If the Serial object is interrupt driven, it will have an associated buffer. If that buffer is not large enough to take all the data, the function may block, which is a no no in an interrupt handler. Moreover, if the timer interrupt is a higher priority that the serial interrupt, you will cause a deadlock when Serial.write() blocks. 0x40 (64 bytes) seems like a likely buffer size for serial output, so that is likely the primary cause. If you can increase the buffer size that may make it work, but it remains a bad idea to perform potentially blocking operations in an ISR.
Even if serial output is polled rather than interrupt driven, your interrupt handler will take rather long, which is also a bad idea, but probably not the issue in this case, but at 9600,n,8,1, 64 characters will take 67 milliseconds to clear the transmit register.
stateLED and frame are shared variables (between interrupt and main contexts) and should therefore be declared volatile.
It is not shown in your fragment how and where frame is updated, but since the interrupt will occur asynchronously, any update to frame should be in a critical section - with at least the timer1 interrupt disabled.
Update
In the light of A.H.'s response I downloaded the source code and took a look. Serial is a static object of class HardwareSerial defined in \arduino-1.0\hardware\arduino\cores\arduino\hardwareSerial.cpp/.h. The transmit buffer length is indeed 64 bytes, and the HardwareSerial::write() function does "busy-wait" if the buffer is full. You will need to modify and re-build the source to extend the buffer or add a non-blocking version of write().
This is however certainly the cause of the lock-up - the buffer will never empty because the transmit interrupt cannot be serviced while the timer1 interrupt is running.
The release notes for 1.0 tell you:
Serial transmission is now asynchronous - that is, calls to
Serial.print(), etc. add data to an outgoing buffer which is transmitted
in the background. Also, the Serial.flush() command has been repurposed
to wait for outgoing data to be transmitted, rather than dropping
received incoming data.
Therefore your code worked before 1.0, because HardwareSerial::write(uint8_t) (which is the foundation for all output) had no buffer and returned only after the byte has been transmitted.
I find it astonishing, that the reference page for Serial does not mention this behaviour.