Posts Tagged stm

Code to produce video signals on the STM32L Discovery

I’ve finally coded the program that produces a video signal.  There weren’t too many surprises.

One issue I had was that occasionally a line would be drawn slightly to the right.  I changed the interrupt priority from 7 to 0, and disabled ChibiOS’ thread preemption.  Changing the interrupt priority alone would probably do, but I had no trouble after that.

A bigger problem is that any vertical line wriggles around on the screen a lot.  I don’t think there’s anything in my code that might cause this.  I came across a question on the STM site, which suggests that an instruction must complete before an interrupt will be triggered.  If the CPU happens to be running a long instruction, the line produced by the following interrupt will be shifted over slightly.  The Arduino TV-Out library doesn’t have this problem, even though the AVR’s instructions can take different amounts of time.  I’m not sure what I can do about this – the timer should still be correct, so maybe I’d need a busy-wait loop while waiting for the timer to hit a particular value.  It looks like the TV-Out library does this.  It might need some assembler, which I don’t plan to learn right now (but maybe check this ST forum post).  The author of the RBox, though, suggests the jitter is because of the wait states.  I don’t know why there would be anything non-deterministic with wait states, unless there was caching involved, which I don’t think the STM32 has.  I suspect it’s ChibiOS running stuff during the timing interrupt, which would cause jitter.
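The busy-wait idea could look something like the sketch below.  It is only an illustration: wait_until_count is a name I made up, and on real hardware the pointer would be a timer's count register, which I'm assuming can be read as a 32-bit word.  Each pass of the loop still takes a few cycles, so this would shrink the jitter rather than remove it.

```c
#include <stdint.h>

/* Spin until a free-running counter reaches `target`.
 * `counter` is assumed to point at a timer count register; any
 * volatile 32-bit word behaves the same way for this sketch. */
static inline void wait_until_count(volatile const uint32_t *counter,
                                    uint32_t target)
{
    while (*counter < target) {
        /* busy-wait: every interrupt leaves this loop at (nearly)
         * the same counter value, whatever its entry latency was */
    }
}
```

The interrupt would fire a little early, call this with a fixed target, and only then start the line.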

I don’t plan to do any more work on this program, since it’s served its task of teaching me about ARM processors.  It has the potential to show data it’s capturing or to interact with an operator.  There’s plenty of memory for a framebuffer for its 400×288 display, and it should be fairly easy to port the TVOut library to it, to add graphics and text rendering capabilities.  The advantage of the ARM chip is that because it uses DMA to write to the screen, the CPU is doing almost nothing while it’s displaying an image.  An AVR needs to work hard while a line is being drawn.
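As a rough check on the framebuffer size, here is a small sketch that assumes one bit per pixel with each row padded to a whole byte.  The function name is made up for the example.

```c
#include <stddef.h>

/* Bytes needed for a 1-bit-per-pixel framebuffer.
 * Each row is rounded up to a whole number of bytes. */
static size_t framebuffer_bytes(size_t width, size_t height)
{
    size_t bytes_per_row = (width + 7) / 8;
    return bytes_per_row * height;
}
```

For the 400×288 display, framebuffer_bytes(400, 288) comes to 14400 bytes.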

I’ve seen one project where an ARM chip produced colour signals.  The CPU didn’t have DMA though, but was faster than the STM32L.  The Freescale Freedom board looks like a good target (although ChibiOS doesn’t support it yet).  I was thinking about the way that 2D polygons are drawn, and I think it might be possible to render a number of 3D polygons with occlusion as each line is being drawn.  The unusual part of this rendering is that instead of the frame rate slowing when the CPU was busy, the vertical resolution would decrease instead as the previous line keeps getting rendered as a new line is being drawn.

Generating video signals like this is nifty, but maybe a bit pointless since there are chips around with composite output anyway.  The OLinuXino iMX233 would be ideal for this, as its CPU has a complete reference manual available.  It’s designed for running Linux, but some low level programming like I did here would provide an “instant-on” function.  The same could be done with the Raspberry Pi, but since there’s no user manual available, you’d need to rely on its limited documentation and Linux drivers.  I like the idea of porting the RTEMS operating system to the OLinuXino, since that OS provides a POSIX API and BSD networking, so porting other applications would be easier.

Here’s a video of my results.  My code is here:


DMA on the STM32L Discovery

There’s one more part to my video generator – the picture data, which I want to transfer to the SPI port using DMA. This actually looks fairly straightforward; these are the available settings in the channel’s configuration register:

MEM2MEM: I’m transferring from memory to a peripheral, so this should be off.
PL: I’ll make this “very high” priority, because I want to keep the picture stable at all costs. If a program writes to the framebuffer during this DMA transfer, it will be blocked.
MSIZE: I’ve set the SPI port to 8 bits so I’ll stick with that. I don’t think it will make any difference whether it’s 8 or 16.
MINC: I want the memory pointer to increment during the transfer.
PINC: I guess this should be off, because to write to SPI you keep sending data to the same memory location.
CIRC: I don’t want the memory pointer to circle around.
DIR: Read from memory.
Interrupts: I won’t need any yet, but eventually I’ll have to turn off the SPI port at the end of the transfer, otherwise I’ll get white bars down the sides of the screen.
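To see how these settings combine into one register value, here is a sketch using bit positions as I read them from the DMA_CCRx register description.  The SKETCH_ names are local to this example and are not the ST header's constants.

```c
#include <stdint.h>

/* Bit positions assumed from the DMA_CCRx register layout. */
#define SKETCH_CCR_DIR          (1u << 4)   /* 1 = read from memory     */
#define SKETCH_CCR_CIRC         (1u << 5)   /* circular mode            */
#define SKETCH_CCR_PINC         (1u << 6)   /* peripheral increment     */
#define SKETCH_CCR_MINC         (1u << 7)   /* memory increment         */
#define SKETCH_CCR_MSIZE_8BIT   (0u << 10)  /* 8-bit memory accesses    */
#define SKETCH_CCR_PL_VERY_HIGH (3u << 12)  /* priority level 3         */
#define SKETCH_CCR_MEM2MEM      (1u << 14)  /* memory-to-memory mode    */

/* The settings chosen above: very high priority, 8-bit, memory
 * increment on, read from memory; PINC, CIRC and MEM2MEM left off. */
static uint32_t video_dma_ccr(void)
{
    return SKETCH_CCR_PL_VERY_HIGH
         | SKETCH_CCR_MSIZE_8BIT
         | SKETCH_CCR_MINC
         | SKETCH_CCR_DIR;
}
```

With those positions the value works out to 0x3090.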

DMA_CNDTRx holds the amount of data to transfer. There are 7 channels, and table 40 of the reference manual says SPI2_TX is on channel 5. The count needs to be set to the number of pixels / 8, since I’ll have 8 pixels in one byte.  There’s an “auto-reload” setting somewhere which resets this counter value after a transfer; I think this happens in circular mode.

Table 40 also suggests I must use DMA1 for these transfers.

The peripheral address register should point to the SPI data register (&(SPI2->DR)), and the memory register is the start of the current line of pixels.
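Working out the count and the memory address for a given line might look like this.  It's a sketch: the 400-pixel width and the names are assumptions for the example, and the framebuffer is taken to be packed one bit per pixel with no padding between lines.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_WIDTH_PIXELS   400u                        /* assumed width */
#define SKETCH_BYTES_PER_LINE (SKETCH_WIDTH_PIXELS / 8u)  /* 8 pixels/byte */

/* Start of the pixel data for `line`; this is the value that would go
 * into the memory address register before that line's transfer. */
static const uint8_t *line_start(const uint8_t *framebuffer, unsigned line)
{
    return framebuffer + (size_t)line * SKETCH_BYTES_PER_LINE;
}
```

SKETCH_BYTES_PER_LINE is then the value for the transfer count, and line_start(fb, n) the memory address for line n.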

That’s all of the available settings!  There’s one more thing to do though: section 10.3.7 says this:

The peripheral DMA requests can be independently activated/de-activated by programming the DMA control bit in the registers of the corresponding peripheral.

I guess this is the TXDMAEN bit in the SPI_CR2 register.

Now for some code… first I’ll make some data to send:

const uint8_t image[] = { 0xAA, 0x55, 0xAA, 0x55 };

Of course later on I’ll have a lot more data…

Now to set the above settings:

  DMA1_Channel5->CCR = DMA_CCR5_PL // very high priority
          | DMA_CCR5_MINC  // memory increment mode
          | DMA_CCR5_DIR;  // read from memory, not peripheral

Section 10.3.3 has this useful bit of information:

The first transfer address is the one programmed in the DMA_CPARx/DMA_CMARx registers. During transfer operations, these registers keep the initially programmed value. The current transfer addresses (in the current internal peripheral/memory address register) are not accessible by software.

This suggests that I only need to set these at the start and shouldn’t need to touch them again.

To set these:

  DMA1_Channel5->CMAR = (uint32_t) image;       // where to read from
  DMA1_Channel5->CPAR = (uint32_t) &(SPI2->DR); // where to write to

Time to try it out… and… nothing!  Maybe there’s another clock setting for DMA, and sure enough there is:

  rccEnableAHB(RCC_AHBENR_DMA1EN, 0); // Enable DMA1 clock (0: not kept on in low-power mode)

I still haven’t got anything, so I tried setting the source and destination registers each time before I start a DMA transfer. It looks like now I get a single transfer, but I’m trying to get a transfer on every hsync.

I poked around with the debugger, especially at 0x40026058 which is DMA1_Channel5->CCR (I calculated the address from values in stm32l1xx.h), and noticed that the Enable flag is still set.  Maybe it has to be toggled each time?  Now I get a square wave instead of my data… I then tried decreasing my hsync timer, and decreasing the SPI speed, and I got a reasonable output.  I’m getting some nasty aliasing on my DSO Nano though, maybe I should have borrowed a faster scope!  I think I was triggering the DMA transfers too quickly, which produced that square wave.  Conveniently, I notice the SPI line is now low when it’s idle, which is the output I want.  I’m not sure why it’s gone low, but I’m not complaining.
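The address calculation for that debugger poke can be sketched as follows.  I'm assuming DMA1 sits at 0x40026000, that channel 1's configuration register is at offset 0x08, and that each channel's register block is 0x14 bytes long; the names are made up for the example.

```c
#include <stdint.h>

#define SKETCH_DMA1_BASE      0x40026000u  /* assumed DMA1 base address     */
#define SKETCH_CCR1_OFFSET    0x08u        /* channel 1 CCR offset          */
#define SKETCH_CHANNEL_STRIDE 0x14u        /* CCR, CNDTR, CPAR, CMAR + gap  */

/* Address of the configuration register for DMA1 channel 1..7. */
static uint32_t dma1_ccr_address(unsigned channel)
{
    return SKETCH_DMA1_BASE + SKETCH_CCR1_OFFSET
         + SKETCH_CHANNEL_STRIDE * (channel - 1u);
}
```

For channel 5 this gives 0x40026058, matching the address I looked at.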

So to sum up:

  rccEnableAHB(RCC_AHBENR_DMA1EN, 0); // Enable DMA1 clock (0: not kept on in low-power mode)
  // Configure DMA
  DMA1_Channel5->CCR = DMA_CCR5_PL // very high priority
          | DMA_CCR5_MINC  // memory increment mode
          | DMA_CCR5_DIR;  // read from memory, not peripheral
  DMA1_Channel5->CMAR = (uint32_t) image;       // where to read from
  DMA1_Channel5->CPAR = (uint32_t) &(SPI2->DR); // where to write to

then in my hsync handler:

    // Activate the DMA transfer
    DMA1_Channel5->CCR &= ~DMA_CCR5_EN;
    DMA1_Channel5->CNDTR = sizeof(image);
    DMA1_Channel5->CCR |= DMA_CCR5_EN;

I didn’t need to reset CMAR and CPAR after all.

I think that’s now demonstrated everything I need for the video signal generator! My code needs a big cleanup, and I’d like to use ChibiOS functions where I can (palSetPadMode instead of messing around with memory locations and data structures, etc).


Picture data using SPI

I plan to use SPI to send the picture data for my video generator.

First I need to work out what speed to run the port at.  The visible part of each line lasts 52 μs, or 1664 cycles at 32MHz.  I could divide this by 4 for 416 pixels per line or by 8 for 208 per line.  The divider sets the bit rate, so I shouldn’t need to divide by 8 again to get a bytes per second speed.  It looks (from the clock registers) like SPI1 is connected to the APB2 clock, and SPI2 is connected to APB1.  I’m already running APB1 at the system clock (32MHz), so I’d like to use that if I can.  The speed is set in the CR1 register, by the BR bits, which support dividing by 4 or 8.  I might as well use SPI2.  The datasheet says that SPI2_MOSI can only be on pin B15.  I won’t need the clock output, so I won’t configure a pin for that.
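The arithmetic above, as a small sketch (the function name is made up; it assumes the clock is a whole number of MHz and one pixel per SPI bit):

```c
#include <stdint.h>

/* Pixels that fit in the visible part of a line when the SPI bit
 * clock is the peripheral clock divided by `divider`. */
static uint32_t pixels_per_line(uint32_t clock_hz, uint32_t visible_us,
                                uint32_t divider)
{
    uint32_t cycles = (clock_hz / 1000000u) * visible_us;
    return cycles / divider;
}
```

With a 32MHz clock and 52 μs visible, a divider of 4 gives 416 pixels and a divider of 8 gives 208.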

The CR1 register contains a setting for 8 or 16-bit operation.  This affects the size of the data being written.  Since I plan to use DMA I’ll leave it at 8 bits.

It turns out there are very few settings to get SPI working.  I had to stuff around a lot before I got it working though – eventually I copied the ChibiOS code, and set the SPI_CR1_CPOL, SPI_CR1_SSM, SPI_CR1_SSI and SPI_CR2_SSOE flags even though I wouldn’t have thought I need them, and it suddenly worked!

This was enough to get SPI working:

  rccEnableAPB1(RCC_APB1ENR_SPI2EN, 0); // Enable SPI2 clock (0: not kept on in low-power mode)
  palSetPadMode(GPIOB, 15, PAL_MODE_ALTERNATE(5) |
                           PAL_STM32_OSPEED_HIGHEST);           /* MOSI.    */
  SPI2->CR1 = //SPI_CR1_BR_0 // divide clock by 4
          SPI_CR1_CPOL | SPI_CR1_SSM | SPI_CR1_SSI |
          SPI_CR1_BR // divide clock by 256
          | SPI_CR1_MSTR;  // master mode
  SPI2->CR1 |= SPI_CR1_SPE; // Enable SPI

To send data, write bytes to SPI2->DR.  The output appears on PB15.  I think in the future I’ll try using palSetPadMode for configuring the pins, since it’s better than the 8 lines of code I’ve been using previously to do this.  The above code divides the clock by 256 so I could see the output on my DSO Nano, but I’ll change this to 4 later.
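The BR bits encode the divider as a power of two.  Here is a sketch of that mapping, assuming the three-bit field sits at bits 5:3 of CR1 and that field value n means "divide by 2^(n+1)"; spi_br_field is a made-up helper, not part of ChibiOS or the ST headers.

```c
#include <stdint.h>

/* Returns the BR field, already shifted into bits 5:3, for a divider
 * of 2, 4, 8, ... 256.  Returns UINT32_MAX for anything else. */
static uint32_t spi_br_field(uint32_t divider)
{
    uint32_t candidate = 2u;
    for (uint32_t n = 0; n < 8u; n++) {
        if (candidate == divider) {
            return n << 3;
        }
        candidate <<= 1;
    }
    return UINT32_MAX;
}
```

Under those assumptions, dividing by 4 gives 0x08 and dividing by 256 gives 0x38.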

The next step will be using DMA to write the data to SPI instead.


More on the vsync interrupts

I had a thought after my previous disheartening attempt to get the vsync working: since they might have stuffed up timer 11 in ChibiOS, I might change to timer 4.  One massive advantage is that timers 2, 3 and 4 have 4 compare registers, instead of one!  This means I can use one to turn off the vsync pulse, and one to trigger the DMA interrupt and adjust the sync timings on the next line.  Changing everything to use TIM4 was fairly straightforward, the only tricky part being changing the single compare register to use compare register 4, and enabling APB1 instead of APB2 since that’s where timer 4 is.  (I originally used register 1, but that is connected to PORTB6 which is attached to the blue LED.)

There was one catch though when compiling:

$ make
Compiling main.c
Linking build/ch.elf
build/obj/main.o: In function `VectorB8':
/home/damien/projects/stm32/video/main.c:8: multiple definition of `VectorB8'
build/obj/pwm_lld.o:/opt/ChibiOS_2.4.2/os/hal/platforms/STM32/pwm_lld.c:221: first defined here

Looking in pwm_lld.c suggests that I need to unset STM32_PWM_USE_TIM4.  I notice that one example has another configuration file, /demos/ARMCM3-STM32L152-DISCOVERY/mcuconf.h, which declares this symbol.  I copied this file to my project directory, hoping the makefile would use it in preference.  With no timers left available to ChibiOS, its PWM module made the compiler complain, so I had to set HAL_USE_PWM to FALSE in halconf.h.

The good news: my interrupt is being called! I connected the debugger and used the “p” command to show the contents of “line”.  The bad news: my PB7 light isn’t blinking, which means the ChibiOS main thread isn’t working.  Maybe the interrupt is using all of the CPU?  It shouldn’t, since I thought each line took about 1000 cycles to run, and the interrupt shouldn’t be using more than about 50.

I remember seeing somewhere that ARMs don’t reset their interrupt flags automatically, like AVRs do.  If this is the case, my interrupt will return, and the NVIC (the “nested vectored interrupt controller”, apparently) will see the flag is still set, and call the interrupt again.  The interrupt handler in ChibiOS’ pal_lld.c contains this line, which would clear this flag:


In my case, this would be:

    TIM4->SR &= ~TIM_SR_UIF;

and my light blinks again, so that seems to have worked!  I checked with gdb that the “line” variable is still incrementing.

I’ll try setting the PWM duration registers during the interrupt, so my interrupt handler looks like this:

    if (line & 1) {
        TIM4->ARR = STM32_SYSCLK * 0.0001;   // horizontal line duration
        TIM4->CCR4 = STM32_SYSCLK * 0.00009; // hsync pulse duration
    } else {
        TIM4->ARR = STM32_SYSCLK * 0.000064;   // horizontal line duration
        TIM4->CCR4 = STM32_SYSCLK * 0.0000047; // hsync pulse duration
    }

That seems to have worked fine.  That’s about everything I need to generate the vsync signals.  I’ve done something slightly wrong though – I’d be better off using one of the compare registers to trigger the interrupt instead.  In the interrupt handler, I’d initiate a DMA transfer for the current line, then set the timing registers for the next line.
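The register values above come from multiplying the clock by a duration in seconds.  The same conversion in integer arithmetic might look like this sketch; ns_to_ticks is a made-up name, and it rounds to the nearest tick.

```c
#include <stdint.h>

/* Convert a duration in nanoseconds to timer ticks at `clock_hz`,
 * rounding to the nearest tick.  The 64-bit intermediate avoids
 * overflow for clocks in the tens of MHz. */
static uint32_t ns_to_ticks(uint32_t clock_hz, uint32_t duration_ns)
{
    uint64_t product = (uint64_t)clock_hz * duration_ns;
    return (uint32_t)((product + 500000000u) / 1000000000u);
}
```

At 32MHz, a 64 μs line is 2048 ticks and a 4.7 μs sync pulse is 150.  Since the counter runs from 0 up to the reload value, the exact period is one tick longer than the number written to ARR, so writing ticks minus one may be more accurate; I haven't measured the difference.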

Now my code looks like this (maybe I should start putting it on Github):

#include "ch.h"
#include "hal.h"
#include "stm32l1xx.h"

volatile int line;

// TIM4 update interrupt; VectorB8 is the name the linker error above reports for it
CH_FAST_IRQ_HANDLER(VectorB8) {
    TIM4->SR &= ~TIM_SR_UIF; // clear the update flag so the handler isn't re-entered

    line++;
    if (line & 1) {
        TIM4->ARR = STM32_SYSCLK * 0.0001;   // horizontal line duration
        TIM4->CCR4 = STM32_SYSCLK * 0.00009; // hsync pulse duration
    } else {
        TIM4->ARR = STM32_SYSCLK * 0.000064;   // horizontal line duration
        TIM4->CCR4 = STM32_SYSCLK * 0.0000047; // hsync pulse duration
    }
}

int main(void) {

  rccEnableAPB1(RCC_APB1ENR_TIM4EN, 0); // Enable TIM4 clock (0: not kept on in low-power mode)

  nvicEnableVector(TIM4_IRQn, CORTEX_PRIORITY_MASK(7));

  // TIM4 channel 4 outputs on PB9
  GPIOB->OTYPER &= ~GPIO_OTYPER_OT_9;        // Push-pull output
  GPIOB->PUPDR |= GPIO_PUPDR_PUPDR9_0;      // Pull-up
  GPIOB->MODER |= GPIO_MODER_MODER9_1; // alternate function on pin B9

  // Reassign port B9
  GPIOB->AFRH |= 0x2 << 4; // ChibiOS doesn't seem to have constants for these
  TIM4->CR1 |= TIM_CR1_ARPE; // buffer ARR, needed for PWM (?)
  TIM4->CCMR2 &= ~(TIM_CCMR2_CC4S); // configure output pin
  TIM4->CCMR2 =
          TIM_CCMR2_OC4M_2 | TIM_CCMR2_OC4M_1 /*| TIM_CCMR1_OC1M_0*/  // output high on compare match
          | TIM_CCMR2_OC4PE; // preload enable
  TIM4->CCER = TIM_CCER_CC4P // active low output
           | TIM_CCER_CC4E; // enable output
  TIM4->DIER = TIM_DIER_UIE; // enable interrupt on "update" (ie. overflow)
  TIM4->ARR = STM32_SYSCLK * 0.000064;   // horizontal line duration
  TIM4->CCR4 = STM32_SYSCLK * 0.0000047; // hsync pulse duration

  TIM4->CR1 |= TIM_CR1_CEN; // enable the counter

  while (1) {
    palSetPad(GPIOB, 7);
    palClearPad(GPIOB, 7);
  }
}

Next I’ll try using DMA, which I’ve never used before.  With any luck I’ll be able to use ChibiOS to do this.


Starting on the vsync interrupts

Now it’s time to adjust the sync signals to get vsync working, so I need an interrupt when the signal goes low so I can adjust the sync pulse on the next cycle.  I have no idea how to start the interrupts.

I did a search in the ChibiOS code for the term “vect”, and found a bunch of them in hal/platforms/STM32L1xx/hal_lld.h, of which TIM11_IRQHandler looks the most appropriate.  But there seems to be only one vector for all possible events, like an overflow or a compare match.

It looks like “fast” interrupt handlers look like this:

volatile int line;

It compiles at least… but did it register as an interrupt handler?  I tried the trick with AVRs that shows disassembled code:

arm-none-eabi-objdump -S -h build/ch.elf

It shows that the first section starts at address 0x08000100… just after the vector table, which should appear at the start! Right at the top though, there’s a “startup” section at the correct address. After stuffing around with various objdump options, this showed me the vector table:

arm-none-eabi-objdump -h -s --special-syms build/ch.elf

Table 34 of the reference manual lists all of the interrupts, and TIM11 is at offset 0xAC. Objdump doesn’t show anything promising in this location!  But there’s something strange in hal_lld.h:

#define TIM9_IRQHandler         VectorA0    /**< TIM9.                      */
#define TIM10_IRQHandler        VectorA4    /**< TIM10.                     */
#define TIM11_IRQHandler        VectorA8    /**< TIM11.                     */
#define LCD_IRQHandler          VectorAC    /**< LCD.                       */

The datasheet though says that the LCD is vector A0, TIM9 is A4, 10 is A8 and 11 is AC.  I guess the way to find out is to try both and see what happens.
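The offsets can be cross-checked from the interrupt numbers.  On a Cortex-M3 the first 16 vector table entries belong to the core (stack pointer, reset, faults and so on), so external interrupt n should sit at 0x40 + 4n.  The sketch below assumes the interrupt numbers LCD = 24, TIM9 = 25, TIM10 = 26, TIM11 = 27, which is how I read stm32l1xx.h; the function name is made up.

```c
#include <stdint.h>

/* Byte offset into the vector table of external interrupt `irq`.
 * The 16 core entries occupy the first 0x40 bytes. */
static uint32_t vector_offset(unsigned irq)
{
    return 0x40u + 4u * irq;
}
```

With those numbers TIM11 lands at 0xAC, which agrees with the reference manual's table rather than with hal_lld.h.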

First the timer needs to be configured to use those interrupts.

  TIM11->DIER = TIM_DIER_UIE; // enable interrupt on "update" (ie. overflow)

So does this do anything? I used the debugger to find out:

$ arm-none-eabi-gdb build/ch.elf 
Reading symbols from /home/damien/projects/stm32/video/build/ch.elf...done.
(gdb) p lineA8
$1 = 0
(gdb) p lineAC
$2 = 0
(gdb) cont
Program received signal SIGINT, Interrupt.
0x08000598 in _idle_thread (p=)
    at /opt/ChibiOS_2.4.2/os/kernel/src/chsys.c:62
62	  chRegSetThreadName("idle");
(gdb) p lineAC
$3 = 0
(gdb) p lineA8
$4 = 0

So it’s doing nothing!  Why not? I looked through the ChibiOS sources where the timers are configured, and found a call to nvicEnableVector, which looks promising.  It needs a “priority” parameter though, so what should that be?  gpt_lld.h lists some priorities.  ChibiOS always sends the priorities through a macro called CORTEX_PRIORITY_MASK, which seems to move the number to the correct place in a register.  The CPU manual (section 5.3) says that lower numbers have higher priority, and this needs to be very high, so I’ll choose 2.

  nvicEnableVector(TIM11_IRQn, CORTEX_PRIORITY_MASK(2));
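My understanding of what the priority macro does is sketched below.  This is a guess at the behaviour, not the ChibiOS source: it assumes the chip implements 4 priority bits, stored in the top of an 8-bit priority field, and the names are my own.

```c
#include <stdint.h>

#define SKETCH_PRIORITY_BITS 4u  /* assumed number of implemented bits */

/* Shift a priority level into the top bits of the 8-bit field. */
static uint8_t priority_mask(unsigned priority)
{
    return (uint8_t)(priority << (8u - SKETCH_PRIORITY_BITS));
}
```

Under that assumption priority 2 becomes 0x20, and only levels 0 to 15 are meaningful.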

It doesn’t like that much – the light doesn’t blink, so it looks like the chip crashed!  I’m starting to get a bit annoyed with my slow progress; maybe I’ll try to rewrite my code using ChibiOS as much as I can.  While this chip is powerful, it’s also very complicated, which makes me think it’s not that practical to program it directly.  I soldiered on anyway.


More timing video signals on the STM32L Discovery

I’ve looked at how ChibiOS does its timing, and worked out that it’s unsuitable for timing video signals.  Now I’ll look at using the timers directly.

The chip has a number of timers; I can’t work out how many.  The ChibiOS HAL manual says it can use timers 2, 3 and 4, so let’s leave those alone for other uses.  That leaves timers 9, 10 and 11.

If the timers work like the AVR’s timers, they work by starting at 0 and counting up to some maximum value, where the counter is reset to 0.  There’s also a compare register, and when the timer matches the compare register, something can happen – we can trigger an interrupt, change the state of a pin and so on.  Being able to change a pin is how PWM works.

It would be nice to use PWM to produce the sync signals.  I’ve found an excellent description of these signals, and at the start of each line, there’s always a falling signal.  So if we set the timer’s maximum value to the end of the line, and have the signal go low when it overflows, that’s the signal start taken care of.

The signal end is a bit trickier.  The image shows that it varies depending on the line; further, on some of the vertical sync lines it happens twice!  This means that we might need to change the value at which the signal goes high as the timer is running.  The AVR can do this, but in some modes the timer register is double-buffered.  If you write a new value to the timer compare register, it’s only applied the next time the timer resets.  They do this so you can’t set a compare value lower than the current counter value, which would make the timer keep counting up until it overflows!

So can you adjust the compare registers on the fly on the STM32L, and is it double buffered?  It looks like the compare register is called TIMx_CCR1.  There doesn’t seem to be a CCR2, so maybe these timers only have one output.  In the reference manual, section 17.6.11 says:

It is loaded permanently if the preload feature is not selected in the TIMx_CCMR1 register (bit OC1PE). Else the preload value is copied in the active capture/compare 1 register when an update event occurs.

So if the preload feature is off, the compare register can be updated straight away!  But back in the PWM mode description (section 17.4.9), it says:

You must enable the corresponding preload register by setting the OCxPE bit in the TIMx_CCMRx register

So we’re not so lucky.  We need to use the preload register, and we know it updates on an “update event”.  What’s an update event? Back in section 17.4.1:

The update event is sent when the counter reaches the overflow and if the UDIS bit equals 0 in the TIMx_CR1 register.

So all we need to do is update the compare register one line early!

What about the vertical sync lines, where there are two pulses?  That shouldn’t be a problem; we simply consider them two separate lines in software, so the lines are numbered like this:

This also shows that the maximum value will need to be changed in the same way.  In the reference manual, they call this value the “auto-reload” value, and it’s kept in the TIMx_ARR register.  Section 17.4.1 suggests you can choose whether this is double-buffered or not.  We might as well use this feature since the compare register needs it.

There’s one more thing to look at.  At the start of each line, I’ll need to start transferring data from memory to an external port using DMA, and configure the compare and maybe the maximum value register for the next line.  I could either do both in a single interrupt, or set the registers on the reset interrupt, and start the DMA on the compare interrupt.

Do we have enough time from the start of the interrupt to do anything useful? The horizontal sync pulse on a PAL signal is 4.7μs.  If we assume the CPU runs at 16MHz, this is about 75 cycles.  Section 5.5.1 of the ARM manual suggests that it takes 12 cycles to enter an interrupt.  In the AVR, it’s up to the programmer to save the register state at the start of an interrupt.  This means it’s a good idea to do as little as possible in an interrupt, because the compiler inserts lots of “push” and “pop” instructions around the interrupt.  Since the ARM looks after this for you, and takes a fixed amount of time to enter an interrupt, this isn’t a problem.  If we assume it takes 12 cycles to leave an interrupt too, that leaves about 50 cycles to do stuff.  This stuff is working out how long it should take to raise the signal again, and setting the compare register and maybe the reset register.
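The budget works out like this.  It's a sketch with made-up function names, using the 12-cycle entry figure from the manual and my assumed 12 cycles for exit.

```c
#include <stdint.h>

/* Whole CPU cycles that fit in `duration_ns` at `clock_mhz`. */
static uint32_t ns_to_cycles(uint32_t clock_mhz, uint32_t duration_ns)
{
    return clock_mhz * duration_ns / 1000u;
}

/* Cycles left for the handler body once entry and exit are paid for. */
static int32_t isr_budget(uint32_t window_cycles, uint32_t entry_cycles,
                          uint32_t exit_cycles)
{
    return (int32_t)window_cycles - (int32_t)(entry_cycles + exit_cycles);
}
```

ns_to_cycles(16, 4700) is 75, and isr_budget(75, 12, 12) leaves 51 cycles.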

That should be all we need to know about the timers – now I’ll try to use the timer to produce these horizontal sync pulses.


Blinky on the STM32L Discovery

My program seems to have locked-in syndrome, so now I’ll see if I can get it to flash an LED.

A good start would be to check the schematic for where the LEDs are connected.  There’s one connected to each of PB6 and PB7, and they’re actually marked with this on the PCB, next to the two push buttons.

Now how to interface them?  The programming manual has a whole section on GPIOs.  It mentions that there are registers for selecting the alternate function (which is how you activate SPI, the USARTs etc), selecting whether the pin is an output or input, whether there are pull-up or pull-down resistors activated, among other things.  One thing worth noting is section 6.3.1, which says “During and just after reset, the alternate functions are not active and the I/O ports are configured in input floating mode.”  What it doesn’t say though is how the registers map to the I/O pins.

The first page of the reference manual mentions one document I haven’t looked at yet: the datasheet.  And sure enough, the memory map in section 5 says that port B is at memory location 0x40020400.  There’s still no mention of how these map to the I/O registers, or how to access the registers from C code.

Figure 1 of the reference manual suggests the GPIO access is via the “AHB system bus”.  A search of the CPU reference manual says that AHB is the “Advanced High-performance Bus”, which doesn’t really mean anything for this.

Another look at the memory map shows that port B goes from 0x40020400 to 0x400207FF.  That’s 1kB of address space, so maybe all of the port registers live here?  If I assume that, I need to set a few bits in GPIOB_MODER at 0x40020400, and turn on the output pin in GPIOB_ODR at 0x40020414 (the reference manual shows the offset of this register as 0x14).  Like this:

    *((int*) 0x40020400) = 0x00005000;
    *((int*) 0x40020414) = 0x00000080;

No that doesn’t work… time to cheat.  I’ll look at “blinky.c”, which is included with the Keil IDE.  It mentions a GPIO clock, maybe I need to enable that.  This idea is a bit unusual to me, since AVRs don’t have a clock for the output pins, but maybe in an ARM you need one so DMA works or something.  Figure 12 contains a rather elaborate map of how the clocks work, but the important bit is on the right: HCLK goes to the AHB bus (which I saw earlier and dismissed!)  This is fed through a prescaler from SYSCLK.  Section 5.3.8 discusses the AHB peripheral clock enable register (RCC_AHBENR) which has a “GPIOB EN” bit at bit 1.  RCC is at 0x40023800, AHBENR is at offset 1C, so this register is at 0x4002381C.

So this gets the LED to light:

  *((volatile uint32_t*) 0x4002381C) = 0x00000002; /* RCC_AHBENR: enable GPIOB clock */
  *((volatile uint32_t*) 0x40020400) = 0x00005000; /* GPIOB_MODER: PB6 and PB7 output mode */
  *((volatile uint32_t*) 0x40020408) = 0x00005000; /* GPIOB_OSPEEDR: 2MHz clock speed */
  *((volatile uint32_t*) 0x40020418) = 0x00000080; /* GPIOB_BSRR: set PB7, LED on */

That’s pretty ugly and you wouldn’t want to write too much code like that, so I’ll look at libraries that contain these numbers instead.
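Until then, the magic numbers can at least be derived rather than typed in.  The sketch below assumes the mode register uses two bits per pin with 01 meaning "output", and one bit per pin in the data and set registers; the helper names are made up.

```c
#include <stdint.h>

/* Mode-register bits that put `pin` (0..15) into output mode:
 * two bits per pin, value 01. */
static uint32_t moder_output(unsigned pin)
{
    return 1u << (2u * pin);
}

/* Single-bit mask for `pin` in the data or bit-set registers. */
static uint32_t pin_mask(unsigned pin)
{
    return 1u << pin;
}
```

moder_output(6) | moder_output(7) gives the 0x00005000 used above, and pin_mask(7) gives 0x00000080.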


How the STM32L Discovery demo works

In my previous post, I got a basic program running on a STM32L Discovery board.  Now I hope to work out how the program works.

The program contained this data structure:

// Define the vector table
unsigned int *myvectors[4]
__attribute__ ((section("vectors"))) = {
    (unsigned int *) STACK_TOP,         // stack pointer
    (unsigned int *) main,              // code entry point
    (unsigned int *) nmi_handler,       // NMI handler (not really)
    (unsigned int *) hardfault_handler  // hard fault handler
};

This is an array of four pointers.  It also has this in front: __attribute__ ((section("vectors"))).  The linker script contains a section with a similar name, and while I don’t know anything about linker scripts, it looks like it goes right at the start of the flash memory. In other words, these four 32-bit pointers look like the first 16 bytes of any program.

Is there any documentation that describes this? After suffering through ST’s “product selector”, I found the page for the CPU, where I found the reference manual. This is a bit like an AVR datasheet; it tells you all about the interfaces the chip has. Since my program doesn’t talk to the outside world yet, this document isn’t terribly helpful; but it does point to a document from ARM about the CPU core.

After searching for various terms in this document, I eventually found out that this table is called the “vector table” and is described in section 5.9.1. Although the table in the code is self-explanatory, it’s nice to find the reference to what exactly it does. The document also says there are other vectors that may appear after these, so that may be useful to know one day.
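One way to picture the first four entries is as a struct of 32-bit words.  This is only a sketch to check the layout on any machine; the type and field names are mine, not from the ARM document.

```c
#include <stdint.h>

/* The first four vector table entries, one 32-bit word each. */
typedef struct {
    uint32_t initial_sp;   /* value loaded into the stack pointer  */
    uint32_t reset;        /* address of the code entry point      */
    uint32_t nmi;          /* NMI handler                          */
    uint32_t hard_fault;   /* hard fault handler                   */
} minimal_vectors_t;

/* Four words of four bytes each. */
_Static_assert(sizeof(minimal_vectors_t) == 16,
               "minimal vector table should be 16 bytes");
```

Further handlers would simply follow as more 32-bit words after these four.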

Now that I’ve started to find my way around the documentation, maybe I can go on to making the chip actually do something!


Getting started with an STM32L Discovery with Linux and GCC

I’ve got the “hello world” working on my STM32L Discovery board that I got about 8 months ago.  It’s not even the canonical blinking light, but it counts up and you only know that it works by using a debugger!  Another site gave me the basic idea, but I needed a few changes to get it working.

    1. Download the Linaro bare metal ARM toolchain (it’s near the bottom of the page).  Extract it somewhere (I put it in /opt).
    2. Download and build OpenOCD.  I’m using version 0.6.0.  I used Checkinstall so I had a managed package:
      tar -zxvf openocd-0.6.0.tar.gz
      cd openocd-0.6.0
      ./configure --prefix=/usr --enable-jlink --enable-amtjtagaccel --enable-ft2232_libftdi
      sudo checkinstall make install
    3. Now something to compile.  I used this:
      // By Wolfgang Wieser, heavily based on:
      #define STACK_TOP 0x20000800   // just a tiny stack for demo
      static void nmi_handler(void);
      static void hardfault_handler(void);
      int main(void);
      // Define the vector table
      unsigned int *myvectors[4]
      __attribute__ ((section("vectors"))) = {
          (unsigned int *) STACK_TOP,         // stack pointer
          (unsigned int *) main,              // code entry point
          (unsigned int *) nmi_handler,       // NMI handler (not really)
          (unsigned int *) hardfault_handler  // hard fault handler
      };
      int main(void)
      {
          int i=0;
          for (;;)
          {
              i++;   // count up forever; watch this in the debugger
          }
      }
      void nmi_handler(void)
      {
          for (;;);  // hang if an NMI ever arrives
      }
      void hardfault_handler(void)
      {
          for (;;);  // hang on a hard fault
      }
    4. Build it:
      arm-none-eabi-gcc -I. -fno-common -O0 -g -mcpu=cortex-m3 -mthumb -c -o main.o main.c

      I believe the -O0 is to stop the compiler optimizing out the counting loop.

    5. Now for linking. The script on the other site didn’t seem to work for me – when I started the debugger, it looked like it was trying to run code from memory address 0. From what I’ve seen, the flash actually lives at 0x08000000, which might explain the problem. I found another script at ChibiOS which seemed to work better. Download the script, then run the linker:
      arm-none-eabi-ld -v -TSTM32L152xB.ld -nostartfiles -o demo.elf main.o
    6. Now extract the binary image from the .elf:
      arm-none-eabi-objcopy -Obinary demo.elf demo.bin

      My binary is a whopping 52 bytes!

    7. Before uploading the binary, the permissions on the Discovery board need changing, because only root can access it at the moment. Put this in /etc/udev/rules.d/90-stm32ldiscovery.rules:
      ATTRS{idVendor}=="0483", ATTRS{idProduct}=="3748", MODE="0666"

      This will give everyone write access to the Discovery. To apply the rules, run:

      sudo service udev restart
    8. Now to start OpenOCD, and upload the binary:
      $ openocd -f /usr/share/openocd/scripts/board/stm32ldiscovery.cfg
      Open On-Chip Debugger 0.6.0 (2012-09-15-16:06)
      Licensed under GNU GPL v2
      For bug reports, read
      adapter speed: 1000 kHz
      srst_only separate srst_nogate srst_open_drain
      Info : clock speed 1000 kHz
      Info : stm32lx.cpu: hardware has 6 breakpoints, 4 watchpoints

      In another terminal:

      $ telnet localhost 4444
      Connected to localhost.
      Escape character is '^]'.
      Open On-Chip Debugger
      > poll
      background polling: on
      TAP: stm32lx.cpu (enabled)
      target state: halted
      target halted due to breakpoint, current mode: Thread 
      xPSR: 0x01000000 pc: 0x0800001a msp: 0x200007f0
      target state: halted
      target halted due to breakpoint, current mode: Thread 
      xPSR: 0x01000000 pc: 0x0800001a msp: 0x200007f0
      > reset halt
      target state: halted
      target halted due to debug-request, current mode: Thread 
      xPSR: 0x01000000 pc: 0x08000010 msp: 0x20000800
      > flash probe 0
      flash size = 128kbytes
      flash size = 128kbytes
      flash 'stm32lx' found at 0x08000000
      > flash write_image erase demo.bin 0x08000000
      auto erase enabled
      target state: halted
      target halted due to breakpoint, current mode: Thread 
      xPSR: 0x61000000 pc: 0x20000012 msp: 0x20000800
      wrote 4096 bytes from file demo.bin in 0.325034s (12.306 KiB/s)
      > reset
      target state: halted
      target halted due to breakpoint, current mode: Thread 
      xPSR: 0x01000000 pc: 0x08000010 msp: 0x20000800
      > exit
      Connection closed by foreign host.

      I don’t know what all of those commands do though!

    9. Now to see whether the code is actually running:
      $ arm-none-eabi-gdb demo.elf
      GNU gdb (GNU Tools for ARM Embedded Processors)
      Copyright (C) 2011 Free Software Foundation, Inc.
      License GPLv3+: GNU GPL version 3 or later <>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
      and "show warranty" for details.
      This GDB was configured as "--host=i686-linux-gnu --target=arm-none-eabi".
      For bug reporting instructions, please see:
      Reading symbols from /home/damien/projects/stm32l-demo/demo.elf...done.
      (gdb) target remote :3333
      Remote debugging using :3333
      main () at main.c:21
      21	{
      (gdb) cont
      Program received signal SIGINT, Interrupt.
      main () at main.c:26
      26	        i++;
      (gdb) print i
      $3 = 496378
      (gdb) cont
      Program received signal SIGINT, Interrupt.
      main () at main.c:26
      26	        i++;
      (gdb) print i
      $4 = 903650
      (gdb) quit
      A debugging session is active.
      	Inferior 1 [Remote target] will be detached.
      Quit anyway? (y or n) y
      Ending remote debugging.

      Yay, it looks like it’s running!

A program isn’t of much use if it can’t communicate outside of the chip, so driving I/O will be next. There look to be three options:

  1. Write to the hardware directly.  This involves looking through the CPU’s user manual, and working out how to access the I/Os.
  2. Use another library to access the hardware.  This is much like how you write AVR code – you access all of the I/Os through C library calls.  ST supplies a library; while it doesn’t have a particularly nice license, it’s probably a good starting point.
  3. Use an operating system like ChibiOS, which has support for this board.  Having developed stuff for the AVR, I think it would be nice to have the resources of a real operating system – I wouldn’t have to worry about implementing scheduling and interrupts myself.

Hopefully one day I’ll try these out and get around to writing about the results!

My next post describes what the code in this example does.
