Monday, March 23, 2015

Turning a Machine Vision Camera into a Digital Cinema Camera

(with a title that pretentious it has to be good, right?)


Recently Sony released the IMX174 sensor, a 16mm-format global shutter CMOS sensor good for 150FPS at 1920x1200. The sensor is currently available in several machine vision cameras for around $1000, including the Point Grey Grasshopper3 and the Basler Ace.

I had been looking for a good high-speed video camera lately that could double as a day-to-day raw camera for general videography; while the Redlake was a perfectly good camera, its absurd light requirements, short record times, and general bulk limited it to special occasions. Inspired by Shane's recent success with using a Grasshopper3 on a Freefly rig for general video, I decided to give it a go using a Basler camera and my own spin on what I thought the UI should look like.

The Camera

I managed to hunt down the camera in question (acA1920-155uc) in stock at Graftek; it was overnighted to me and the next day I had it in hand. The first thing that struck me was how small the head was - at 29mm on a side the camera was barely larger than a dollar coin. The next thing that struck me was how bad the stock viewer was - AOI didn't work, and recording dumped debayered BMP files frame-by-frame to the disk. Basler promised me that AOI would be implemented soon, but didn't have a good answer for the recording issue ("use Labview" was not a valid answer!).

But as I often say, "it's just code"...

First Light

Getting RAW images frame-by-frame out of the camera was easy - there was a Pylon SDK example for grabbing raw framebuffer data. Throughout this project, Irfanview proved to be invaluable for viewing and debayering raw frame data.

It was fairly straightforward to write the raw frames frame-by-frame to disk, but that was only good for under 60FPS due to the overhead of file creation. The logical next step was to buffer the images to an in-memory buffer and then flush the buffer several hundred MB at a time to disk - using this strategy I was able to reliably grab at maximum frame rate...
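The buffering strategy can be sketched roughly as follows. This is not the code from the download: it is a minimal Python illustration, and the class name, the `sink` object and the tiny chunk size are all made up for the example (the real recorder flushed several hundred MB at a time).

```python
import io


class ChunkedFrameWriter:
    """Accumulate frames in RAM and hand them to the disk in large chunks."""

    def __init__(self, sink, chunk_bytes):
        self.sink = sink                # any object with a write(bytes) method
        self.chunk_bytes = chunk_bytes  # flush threshold (hypothetical value)
        self.buffer = bytearray()
        self.flushes = 0

    def add_frame(self, frame):
        # Appending to RAM is cheap; no file is created per frame.
        self.buffer += frame
        if len(self.buffer) >= self.chunk_bytes:
            self.flush()

    def flush(self):
        # One large sequential write instead of many small ones.
        if self.buffer:
            self.sink.write(self.buffer)
            self.buffer = bytearray()
            self.flushes += 1


# Demo with an in-memory sink standing in for the output file.
sink = io.BytesIO()
writer = ChunkedFrameWriter(sink, chunk_bytes=4096)
frame = bytes(1000)
for _ in range(10):
    writer.add_frame(frame)
writer.flush()  # write out whatever is left at the end of the recording
```

The point of the design is that the per-frame cost becomes a memory copy, and the file system only sees a handful of large writes.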

...until about 5000 frames, at which point disk write speeds would plummet to <150MB/s and frames would start dropping like crazy. I spent a day trying to fix this, thinking it was a resource handling issue on my end, until I realized that the SSD on my development machine was nearly full. Most consumer SSD's have terrible write speeds towards the end of their capacities, as they run out of free blocks and have to start erasing and rewriting entire blocks for even small amounts of data. Deleting some files and trimming the drive seemed to fix the issue, and I could record continuously at maximum frame rate (with frames cropped from 1200p down to 1080p) so long as background processes were closed.
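To put rough numbers on the disk requirement, the raw data rate is just pixels per frame times bytes per pixel times frame rate. The sketch below assumes 8-bit raw at 150FPS and packed 12-bit raw (1.5 bytes per pixel) at the 78 FPS quoted later in this post; those pairings are assumptions for illustration, not measurements.

```python
def data_rate_mb_s(width, height, bits_per_pixel, fps):
    """Raw sensor data rate in MB/s (1 MB = 10**6 bytes)."""
    return width * height * bits_per_pixel / 8 * fps / 1e6


# Assumed operating points (see the text above for where the numbers come from).
rate_8bit_1200p = data_rate_mb_s(1920, 1200, 8, 150)   # full sensor height
rate_8bit_1080p = data_rate_mb_s(1920, 1080, 8, 150)   # cropped to 1080p
rate_12bit_1200p = data_rate_mb_s(1920, 1200, 12, 78)  # packed 12-bit mode

for name, rate in [("8-bit 1200p", rate_8bit_1200p),
                   ("8-bit 1080p", rate_8bit_1080p),
                   ("12-bit 1200p", rate_12bit_1200p)]:
    print(f"{name}: {rate:.1f} MB/s")
```

Under these assumptions cropping to 1080p saves about 10% of the bandwidth, and a drive that has dropped below 150MB/s is nowhere near keeping up with any of the three modes.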

Getting a Preview

After much derping with GDI bitmaps and DirectX, I remembered that SDL existed and, as of version 2.0, was hardware-accelerated. A rough preview using terrible nearest-neighbor debayering was easy. Unfortunately, while the chunk-based disk writing scheme significantly boosted storage performance, it also had the side effect of freezing the preview during disk writes. This was OK for low data rates, but not for high frame rates where the disk would be busy most of the time.

The natural solution was to run the framebuffer update in a separate thread, which was easier said than done - Pylon ostensibly has callback functions that can be attached to its internal grab thread, but enabling callbacks seemed to disable Pylon's internal buffer update. The solution was to write a wrapper around the Pylon InstantCamera object that ran in an SDL_thread and updated an external buffer.
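The shape of that wrapper is roughly a grab loop in its own thread that publishes the newest frame into a shared buffer, which the preview reads whenever it wants. The sketch below shows the idea with Python's `threading` module rather than SDL and Pylon; `FrameGrabber` and `fake_grab` are hypothetical stand-ins, not anything from the SDK.

```python
import threading


class FrameGrabber:
    """Runs a grab loop in its own thread and publishes the newest frame."""

    def __init__(self, grab_frame, frame_count):
        self._grab_frame = grab_frame    # stand-in for the camera SDK call
        self._frame_count = frame_count  # finite here so the demo terminates
        self._lock = threading.Lock()
        self._latest = None
        self._thread = threading.Thread(target=self._run)

    def start(self):
        self._thread.start()

    def join(self):
        self._thread.join()

    def _run(self):
        for index in range(self._frame_count):
            frame = self._grab_frame(index)
            with self._lock:             # readers never see a half-updated frame
                self._latest = frame

    def latest(self):
        with self._lock:
            return self._latest


def fake_grab(index):
    """Pretend camera: a 16-byte frame filled with the frame index."""
    return bytes([index % 256]) * 16


grabber = FrameGrabber(fake_grab, frame_count=100)
grabber.start()
grabber.join()
last_frame = grabber.latest()
```

Because the preview only ever asks for the latest frame, a slow disk write can delay the writer without stalling the display.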

Finding a Host Computer

I wasn't particularly keen on paying $1000 for a Surface Pro 3 and the firesale on obsolete SP1's had largely ended, but fortunately the Dell Venue 11 Pro uses an M.2 SSD in the Core i3 and i5 versions and costs about $300 refurb. The stock SSD is somewhat lacking - it tops out at 300MB/s and needs a little bit of warming up in the form of a few hundred frames written before it can sustain this speed, but was OK to start with.

Also worth noting: at this point I ran into random dropped frames, which turned out to be caused by a crappy USB cable; unlike USB 2.0, USB 3.0 actually needs a good cable to sustain maximum transfer rate.

The horrible nearest-neighbor preview also looked pretty awful on the Venue's low-DPI screen (1080p on 11" versus 1600p on 13" on the laptop I had written the code on), so it was time to write a quick-'n-dirty software bilinear debayering function. It runs a bit hot (~40% CPU on the Venue) but seems good enough for 30FPS preview. There were also considerable tearing problems, which turned out to be a combination of thread contention in the buffer update code and a lack of Vsync during the screen update.
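For reference, bilinear debayering amounts to keeping the measured sample at each site and filling in the two missing colors by averaging the nearest neighbors of that color. The following is a minimal sketch that assumes an RGGB mosaic (the camera's actual pattern is not stated here) and an image of at least 2x2 pixels; pure Python like this is far too slow for a live preview and is only meant to show the arithmetic.

```python
def bayer_channel(row, col):
    """Channel index (0=R, 1=G, 2=B) of a pixel, assuming an RGGB mosaic."""
    if row % 2 == 0:
        return 0 if col % 2 == 0 else 1
    return 1 if col % 2 == 0 else 2


def debayer_bilinear(raw, width, height):
    """Return rows of (R, G, B) tuples from a row-major list of raw samples."""
    image = []
    for row in range(height):
        out_row = []
        for col in range(width):
            sums = [0, 0, 0]
            counts = [0, 0, 0]
            # Gather same-color samples in the 3x3 window, clipped at the edges.
            for r in range(max(row - 1, 0), min(row + 2, height)):
                for c in range(max(col - 1, 0), min(col + 2, width)):
                    channel = bayer_channel(r, c)
                    sums[channel] += raw[r * width + c]
                    counts[channel] += 1
            pixel = [sums[ch] / counts[ch] for ch in range(3)]
            # The color this site actually measured is used unmodified.
            pixel[bayer_channel(row, col)] = raw[row * width + col]
            out_row.append(tuple(pixel))
        image.append(out_row)
    return image


# Demo: a synthetic mosaic where every R site is 10, G is 20 and B is 30.
width, height = 4, 4
values = {0: 10, 1: 20, 2: 30}
raw = [values[bayer_channel(r, c)] for r in range(height) for c in range(width)]
rgb = debayer_bilinear(raw, width, height)
```

On this flat test pattern every output pixel comes out as (10, 20, 30), which is a quick sanity check that no channel is being mixed into another.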

Better Post-processing

At this point I could grab raw ADC data to disk, but that wasn't too useful for producing viewable videos. I wrote a quick player that used bilinear debayering to preview the videos, and used Windows file associations to avoid having to write an actual GUI for it (this also made it more touch-friendly; also, cool trick: the Windows "Open With" menu option passes the file name as argv[1] to the program it calls). The 's' key in the player cycles between LUTs (currently there are linear, S-Log, Rec709, and a curve of my own creation that I thought "looked good"), 'd' dumps .RAW files for Irfanview or dcraw, and 'b' dumps bilinear debayered BMP's.
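A LUT of this kind is just a precomputed table mapping each raw code to a display code. As a sketch, here is how a linear and a Rec. 709 table could be built for 12-bit input and 8-bit output; the bit depths and function names are assumptions for the example, and the S-Log and custom curves from the player are not reproduced.

```python
def rec709_oetf(linear):
    """Rec. 709 opto-electronic transfer function for linear light in [0, 1]."""
    if linear < 0.018:
        return 4.5 * linear
    return 1.099 * linear ** 0.45 - 0.099


def build_lut(curve, in_bits=12, out_bits=8):
    """Tabulate curve() for every input code, scaled to the output range."""
    in_max = (1 << in_bits) - 1
    out_max = (1 << out_bits) - 1
    return [round(curve(code / in_max) * out_max) for code in range(in_max + 1)]


linear_lut = build_lut(lambda x: x)
rec709_lut = build_lut(rec709_oetf)

# Applying a curve to a frame is then one table lookup per sample.
samples = [0, 512, 2048, 4095]
displayed = [rec709_lut[s] for s in samples]
```

Cycling between curves at playback time then only means swapping which table is indexed, with no per-pixel math.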

I also wanted better debayering without actually having to write a better debayer-er. After scrolling through more lines of pure global-variable based C than I ever wanted to (it's somewhat horrific that most of the world's raw photo handling stems from that mess), I figured out how to add support for new formats to dcraw. For metadata-less files, dcraw picks the file type based on file size.

Sensor performance

The chart says it all:

Stops of DR versus stops of gain
The sensor has a base ISO of around 80. The plot above shows stops of DR versus gain in 12-bit mode (which is good for 78 FPS). Performance at base ISO is excellent; performance at higher ISOs is terrible, to the point where digitally applying gain in post is probably better for most applications. At high gain there is severe banding noise originating from an as-yet unlocated source.


Download the package here. You will need the Visual C++ runtime DLL's, Basler's Pylon runtime, an acA1920-155uc camera, a USB 3.0 port, and a reasonably fast SSD. Use Pylon Viewer to set the bit depth and use either grab_12p_fs or grab_8_fs to record files. To use the player, associate .aca files with play_12p.exe and .ac8 files with play_8.exe. Inside the player, 'd' dumps RAW frames, 'b' dumps BMP's, and 's' cycles between curves.

Turning Used Car Parts into Silly Vehicles Part 3: Totally FOC'ed

Just code for now, to prove it works.

Most notably, proper field-oriented control is now implemented, and the code is now interrupt-free.
Nick's blog has some hardware details about the analog-hall-sensor based position sensor. The schematic for the board:

And the layout:

The firmware doesn't run on the above board out of the box; you'll have to scramble some pin definitions around in main() to make it work.

Expect a more detailed update either here or on Nick's site sometime in the near future.

Wednesday, February 11, 2015

Turning Used Car Parts into Silly Vehicles 2: Prius Brick

Previously we (really just Nick) had managed to mount up the rotor and stator from a Ford hybrid and turn it into a working motor. The next step was to find a suitable controller for the motor, and what better place to look than the guts of a Toyota Prius?

There has been some previous work done on the Prius inverter; this fellow here has a fairly extensive teardown of the assembly, complete with glamour shots of the gooey innards of the brick. I'll supplement his information with the following:
  • The brick has significant internal propagation delays. At 20kHz, anything under 20% duty cycle is clamped to zero, and anything over 80% duty cycle is clamped to 100%.
  • The internal power supplies expect a fairly low-impedance source; if your brick doesn't want to turn on check your power connections.
  • The logic going into the brick is inverting; enable needs to be pulled low to turn on the phases, and the phases are HIGH when the inputs are low.
  • The current sensors are 400A and 200A, respectively.
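The duty cycle clamping and inverted logic from the notes above can be modeled in a few lines, which is handy when deciding how firmware should scale its commands. This is a sketch of the observed behavior only; the function names are made up, and the pulse-width figure assumes the clamping comes from a fixed minimum pulse, which has not been verified.

```python
PWM_FREQUENCY_HZ = 20_000


def effective_duty(commanded, low=0.20, high=0.80):
    """Duty cycle the brick actually produces for a commanded duty in [0, 1]."""
    if commanded < low:
        return 0.0
    if commanded > high:
        return 1.0
    return commanded


def input_level(phase_high):
    """Logic level to drive on an inverting input for a desired phase state."""
    return not phase_high


# If the dead band is a fixed minimum pulse, this is how long it would be.
period_us = 1e6 / PWM_FREQUENCY_HZ
min_pulse_us = 0.20 * period_us
```

At 20kHz the period is 50 microseconds, so a 20% dead band would correspond to pulses shorter than about 10 microseconds being swallowed.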
The first step was to build a board to interface with the module. A bit of Eagle layout work gave the following board and schematic:

The board is Arduino-shaped, but is actually meant to mount atop one of these:

An STM32F4 Nucleo, part of ST's latest attempt to push their high-performance ARM microcontrollers and actually a pretty good one at that. The hardware pins are Arduino-compatible, and the board can be programmed via the Mbed online IDE. Sure, their libraries are as slow as balls, but even with the overhead of the Mbed libraries the Nucleo still has about 10x the performance of an Arduino, not to mention all-important hardware floating-point support.

The board should be self-evident - X5 is a throttle input, X2 is hall sensors, X1 is 12V power, X3 and X4 control the two halves of the Prius brick (remember, there are two controllers in the brick!) in parallel, and X6 is a quick-and-dirty resistor-divider based input for the current sensors.

We have a board - firmware was next. You can download the firmware package here. A quick code walkthrough follows.

Motor commutation is interrupt-based, with throttle polling and (in the future) the current loop running in the main loop. Math is done in floating point thanks to the STM32F4's hardware FPU.

The commutate routine is called once every sensor change. It converts the hall sensor reading to a position estimate (in intervals of 60 degrees) and stores it. It also zeroes the ticker that is used for position interpolation, and updates the motor speed estimate (motor->last_time). In addition, the code also tries to detect and prevent jitter at the hall sensor transitions by maintaining forwards rotation of the motor when possible, which is crucial in preventing commutation misses.
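The hall decoding and jitter rejection can be sketched as below. This is a Python illustration of the idea, not the firmware itself: the hall-to-sector mapping is hypothetical (the real order depends on sensor placement and wiring), and rejecting a single backwards step is just one simple way to keep the estimate moving forwards.

```python
# Hypothetical mapping from 3-bit hall state to sector 0..5.
HALL_TO_SECTOR = {0b001: 0, 0b011: 1, 0b010: 2, 0b110: 3, 0b100: 4, 0b101: 5}


def commutate(hall_state, last_sector):
    """Return the new sector, ignoring steps that would move backwards."""
    sector = HALL_TO_SECTOR.get(hall_state)
    if sector is None:
        return last_sector           # 0b000 and 0b111 are not valid states
    if last_sector is not None and (sector - last_sector) % 6 == 5:
        return last_sector           # one step backwards: treat it as jitter
    return sector


# Demo: the third reading bounces back across the transition and is ignored.
sector = None
history = []
for hall in [0b001, 0b011, 0b001, 0b011, 0b010]:
    sector = commutate(hall, sector)
    history.append(sector * 60)      # position estimate in degrees
```

In the real routine this is also where the interpolation ticker would be zeroed and the speed estimate updated.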

The dtc_update routine is called at regular intervals (in the default firmware, 5kHz). It does two things: measure the time elapsed since the last commutation (motor->ticks += 1.0f) and update the duty cycle via a table lookup.
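One way to picture the interpolation: the last hall transition gives the position to within 60 degrees, and the tick count since then, divided by the duration of the previous sector, estimates how far into the current sector the rotor has moved. The sketch below assumes that interpretation of `ticks` and `last_time`, and assumes a sine-shaped duty table; the actual table contents in the firmware may differ.

```python
import math

TABLE_SIZE = 360
# Assumed table shape: one sine period, offset so duty stays within 0..1.
SINE_TABLE = [0.5 + 0.5 * math.sin(math.radians(d)) for d in range(TABLE_SIZE)]


def interpolated_angle(sector, ticks, last_time):
    """Estimate the electrical angle in degrees between hall transitions."""
    fraction = min(ticks / last_time, 1.0) if last_time > 0 else 0.0
    return (sector * 60.0 + fraction * 60.0) % 360.0


def duty_lookup(angle_deg, phase_offset_deg=0.0):
    """Table lookup for one phase; the other phases use 120 degree offsets."""
    index = int(angle_deg + phase_offset_deg) % TABLE_SIZE
    return SINE_TABLE[index]


# Halfway through sector 1: 60 degrees plus half of the next 60.
angle = interpolated_angle(sector=1, ticks=5.0, last_time=10.0)
duty_a = duty_lookup(angle)
duty_b = duty_lookup(angle, phase_offset_deg=240.0)
```

Capping the fraction at 1.0 keeps the estimate from running past the next sector boundary if the motor slows down before the next hall edge arrives.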

In its current state the firmware isn't really production-ready; the lack of current control makes it too unwieldy for use on a large vehicle, but it should at least get us off the ground for further testing.

Tuesday, December 2, 2014

Turning Used Car Parts into Silly Vehicles 1: Ford Fusion Motor

Part of an ongoing series of shenanigans to turn used car parts into EV's. See Nick Kirkby's blog post about it here.

Monday, October 27, 2014

RY40511: a few notes

Stuff to note for other tinkerers out there:

  • The entire controller gets extremely hot at 50A with the stock transistors, probably too hot for continuous duty without a fan. However, the limit is probably purely thermal - I did most of my initial testing with a 50A max current setpoint with the motor coupled to a second chainsaw motor driving a dead short, and both the current loop and short-term behavior of the controller seemed reasonable.
  • Likewise, the motor is not good for 50A under stall or near-stall conditions; everything gets rather hot. However, at reasonable RPM it is probably actually a 2kW motor thanks to the integrated fan.
  • Without some further connection on the blue "COMM" wire coming out of the battery, the integrated BMS shuts off at 25A. Charles has gotten 65A out of the battery, so presumably reconnecting this wire will enable high current operation; given the lack of thermal protection on both controller and motor some adaptive current limiting will probably be necessary to keep temperatures down.
  • If you want to upgrade the MOSFETs, go for the lowest Rds,on FETs you can afford - gate charge doesn't really matter given the 10nF parallel capacitor on the gate. Furthermore, only 1/6 of the FETs are switching at any one time, and only at 8kHz, so switching losses are negligible. In fact, switching too fast will result in nasty transients and FET failures (don't use logic level FETs!).
  • The 40V voltage limit is pretty hard with the board in its stock state; the high-side gate drive BJTs see 55V and are rated to 65V. Likewise, the LVPS assembly is not going to appreciate any sort of significant overdrive. Replacing the BJTs with 100V parts might allow it to operate at a 48V nominal pack voltage.
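To see why Rds,on dominates, a back-of-the-envelope conduction loss estimate is enough: a FET that is fully on dissipates roughly I squared times Rds,on. The sketch below assumes one FET per switch position and uses the 10mOhm figure quoted for the stock FETs elsewhere on this page; the 3mOhm replacement is a hypothetical part, and real losses will be higher because Rds,on rises with temperature.

```python
def conduction_loss_w(current_a, rds_on_ohm, duty=1.0):
    """Approximate conduction loss of one FET carrying current_a while on."""
    return current_a ** 2 * rds_on_ohm * duty


STOCK_RDS_ON = 0.010     # ohms, stock FET figure quoted on this page
UPGRADE_RDS_ON = 0.003   # ohms, hypothetical lower-resistance replacement

loss_stock_25a = conduction_loss_w(25.0, STOCK_RDS_ON)
loss_stock_50a = conduction_loss_w(50.0, STOCK_RDS_ON)
loss_upgrade_50a = conduction_loss_w(50.0, UPGRADE_RDS_ON)

print(f"stock at 25A: {loss_stock_25a:.2f} W per conducting FET")
print(f"stock at 50A: {loss_stock_50a:.2f} W per conducting FET")
print(f"3mOhm part at 50A: {loss_upgrade_50a:.2f} W per conducting FET")
```

Doubling the current quadruples the heat per FET, which fits with the controller being tolerable at 25A and extremely hot at 50A.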

Frozen Chainsaw Massacre Part 2: New firmware

So previously I had briefly mentioned a scooter built out of a brushless chainsaw; remarkably enough, it worked well enough that we (Peter and I) decided to write new firmware for the chainsaw to make it behave more like a vehicle and less like a power tool.

The first step was to trace out the schematic for the board. After a couple hours of rooting around the board we came up with this:

The PNP transistors are BC856B's; the NPN transistors are BC846's. The MOSFETs are Alpha Omega AOT470's - pretty generic 100V 10mOhm FETs with 130nC of gate charge; not high performance by any means but they work.

Next we needed to set up the PWMs and interrupts to get the motor turning. The following snippet of code initializes the timers to do block commutation with high-side PWM:

This configures the high side PWM's to be enabled, and leaves the low side pins free for GPIO use.

The actual commutation is done by the following function, which is called every time the hall sensors change:

htable[] and ltable[] are lookup tables for the states of the high and low side FETs in each phase.
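For readers unfamiliar with table-driven block commutation, the idea looks roughly like this. The tables below are a generic six-step sequence indexed by step number, written in Python for illustration; they are not the contents of the firmware's htable[] and ltable[], and the function name is made up.

```python
# Generic six-step sequence: each entry is (phase A, phase B, phase C).
# Hypothetical ordering; the real tables depend on hall wiring and direction.
HTABLE = [(1, 0, 0), (1, 0, 0), (0, 1, 0), (0, 1, 0), (0, 0, 1), (0, 0, 1)]
LTABLE = [(0, 1, 0), (0, 0, 1), (0, 0, 1), (1, 0, 0), (1, 0, 0), (0, 1, 0)]


def bridge_outputs(step, duty):
    """Return (high-side PWM duties, low-side on/off states) for one step."""
    high = [duty * h for h in HTABLE[step]]   # high side is PWM'ed
    low = [bool(l) for l in LTABLE[step]]     # low side is simply on or off
    return high, low


high, low = bridge_outputs(step=2, duty=0.4)
```

In every step exactly one high-side and one low-side FET are active, and never on the same phase, so a correct table cannot command shoot-through by itself.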

This was enough to get the motor turning; the next step was to read the current shunt and add some overcurrent detection. Unfortunately, the current surge during motor startup in open-loop mode was enough to trip even a 300A current limit.

We added a current loop, which ran in main() to avoid conflicting with commutation:

After dealing with a bunch of minor-but-deadly bugs with the current loop (and in the process killing a couple dozen transistors), the motor seemed to more or less turn smoothly.

Next we added stall protection - the firmware would compute the speed of the motor and limit the duty cycle to 10% if the speed was too low. This inevitably resulted in a few more dead transistors due to integral windup, but worked pretty effectively after the bugs were fixed, giving just enough startup torque to move the scooter without pushing dangerous amounts of phase current.
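The windup problem and its usual fix can be sketched as follows: a PI current loop whose integrator is clamped to the same limit as the output, so that it cannot accumulate while the stall limit is holding the duty cycle down. This is a Python illustration with made-up gains and thresholds, not the firmware's loop.

```python
class CurrentLoop:
    """PI current controller with a stall duty limit and a clamped integrator."""

    def __init__(self, kp, ki, stall_speed, stall_duty=0.10):
        self.kp = kp                    # hypothetical gains, not tuned values
        self.ki = ki
        self.stall_speed = stall_speed  # below this speed the duty is limited
        self.stall_duty = stall_duty    # 10% limit, as described above
        self.integral = 0.0

    def update(self, target_amps, measured_amps, speed):
        limit = self.stall_duty if speed < self.stall_speed else 1.0
        error = target_amps - measured_amps
        self.integral += self.ki * error
        # Anti-windup: never store more than the output is allowed to deliver.
        self.integral = min(max(self.integral, 0.0), limit)
        duty = self.kp * error + self.integral
        return min(max(duty, 0.0), limit)


loop = CurrentLoop(kp=0.01, ki=0.005, stall_speed=100.0)

# Stalled motor, full throttle: output and integrator both sit at the limit.
for _ in range(1000):
    stalled_duty = loop.update(target_amps=25.0, measured_amps=0.0, speed=0.0)
integral_after_stall = loop.integral

# Motor now spinning with slightly too much current: the loop backs off at once.
running_duty = loop.update(target_amps=25.0, measured_amps=30.0, speed=500.0)
```

Without the clamp, the integrator would keep growing for the whole stall and then dump full duty into the bridge the moment the speed crossed the threshold.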

At this point we were a couple weekends and ~50 dead MOSFETs (and about a hundred little BJT's) into the project and it was getting to be pretty late. We figured that before we plugged it into the battery we should investigate the startup transient of the microcontroller. This resulted in the removal of a bug that caused a brief shoot-through event on startup (the timer duty cycles and complementary modes were not being initialized before the PWM channels were enabled) and the loss of another bridge.

We couldn't figure out what was up with the damaged bridge - no amount of replacing shorted MOSFETs could restore it to functionality. Eventually we replaced all of the transistors, and it worked again. Lesson learned: zombie-FETs are a thing.

Get the code here. To flash it, solder a 4-pin header to the "P0" pads on the board and connect an STM8 programmer, notch side in, to the header. You'll also need to configure the option bytes to set the correct alternate functions for the timer pins, which is pretty trivial in ST Visual Programmer. The code is tested to work in IAR studio, and definitely will not compile in other environments without some tweaking.
The default settings (constants.h) set the max current limit to ~25A (controller gets real hot above that) and UVLO to 20V (gate drives brown out below that).

Monday, October 6, 2014

Frozen Chainsaw Massacre: Ryobi RY40511 Cordless 40V Chainsaw on a scooter

Following up on Charles' blog post here I decided to snag one of these chainsaws for myself and see how it fared in vehicle duty.

The end result:
Chainsaw on a Disney Princess scooter - hence the name
The scooter turned out remarkably rideable, except for the extremely annoying motor controller cut-in behavior, which gave the motor the start-up behavior of a gas engine - if throttle was applied from a standstill too quickly, the motor would cut out, requiring a controller reset. The motor had to be ramped up to speed over the course of several seconds to work properly.

The only way to fix this was to reflash the controller with new firmware. Conveniently enough, there was an unpopulated programming/debug header on the board. We rooted around the board for a while and came up with this schematic:

Next up: port my sinewave controller firmware to the STM8S micro that the board uses.