Free IoT Telemetry using DNS Tunneling and Other People's WiFi (Part 1)

Well, it happened. Despite all my talk about “Internet of Things” hype being teh suck and not ready for primetime yet, I’m now an official IoT Hero. I suppose I should actually do something about that. Today, we explore cheap-as-free data exfiltration for mobile IoT gadgets using a trick known as DNS tunneling.

IoT Hero. Superpowers include starting the toaster via TCP/IP and tripping over tall bandwidth bills in a single bound.

Internet of Toilets, data exfiltration and wardriving!

Suppose you are a company that rents, leases, or otherwise loans out large numbers of some mobile object that you hope to get back at some point… say, Port-a-Potties… Port-a-Johns… Portaloos… Honey Buckets… portable toilets (yes, this is based on a true story). As it happens, they often get lost, stolen, blown away, forgotten, or simply lost track of somewhere in your own logistics chain. This happens more than you might think, with various computer glitches or simple human screwups leaving inventory trapped on trucks or lost right under your nose in your own warehouse.

As a world leader in mobile outhouses with over 1 meeeelion units in circulation worldwide, you can’t afford to have your product going walkabout all the time, so you’d like to tag them with a bit of battery-powered IoT smarts so they can report back their locations periodically. A FindMyCrapper app tied into your logistics would let you opportunistically round up your lost sheep on the way by and bring them home.

So, we have 2 needs:
1) Get the location of the object periodically
2) Phone it home

This applies to whatever object you’d like back, including toilets, dumpsters, lighted traffic devices, reusable shipping containers… Housepets too if you can convince them to wear it.

The tried-and-mostly-true approach for #1 is GPS, with the caveat that it won’t work well, if at all, indoors (including units lost in your distribution center) or under dense cover. For #2 (hah!) it’s cellular. You can get a uBlox module for $35USD that covers this as well as free access to their cell triangulation database, which will provide rudimentary location, even indoors. Off-brand cell modems from the usual sources can probably be had for much less. Just beware of the cost of connectivity, especially if you’re not a big enough fish to sweet-talk your way out of per-device maintenance charges (I omit satellite options for just this reason). Also, as anyone living outside metropolitan areas can attest, coverage outside metropolitan areas can be iffy. So ideally, we want a cheap backup option for both of these needs.

Seeing as you can now get ludicrously cheap WiFi modules (like the infamous $2 ESP8266) and run them indefinitely (if infrequently) with a small solar cell and rechargeable battery, they scream possibilities. If you could use random WiFi access points nearby to (a) triangulate… er, trilaterate your location and (b) phone it home, you’d be in business. We know WiFi geolocation is a thing (Apple and Google are doing it all the time), but sneaking data through a public hotspot without logging in?

To find out, I ran a little experiment with a Raspberry Pi in my car running a set of wardriving scripts. As I went about my daily business, it continuously scanned for open access points, and for any it was able to connect to, tried to pass some basic data about the access point via DNS tunneling*, a long-known but recently-popular technique for sneaking data through captive WiFi portals. Read the footnote if you’re interested in how this works!

The Experiments
Since I’m pressed for free time and this is just a quick proof-of-concept, I used a Raspberry Pi and WiFi/GPS USB dongles, and bashed together some Python scripts rather than *actual* tiny/cheap/battery-able gear and clever energy harvesting schemes (even if that is occasionally my day job). The scripts answer some key questions about WiFi geolocation and data exfiltration. All of the scripts and some supporting materials are uploaded on GitHub (WiSneak).

1) Can mere mortals (not Google) viably geolocate things using public WiFi data? How good is it compared to GPS? Is it good enough to find your stuff?

The ‘wifind’ script in scanning mode grabs the list of visible access points and signal strengths, the current GPS coordinate, and the current time once per second, and adds it all to a SQLite database on the device. Later (remember, lazy and proof of concept), another script, ‘query-mls’, reads the list of entries gathered for each trip, queries the Mozilla Location Service with the WiFi AP data to get lat/lon, then exports .GPX files of the WiFi vs. GPS location tracks for comparison.
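
For the curious, the lookup at the heart of ‘query-mls’ boils down to a single HTTPS POST in the Google-style geolocation format. A minimal sketch (endpoint and the “test” API key per the MLS docs; the MAC addresses here are made up):

import json
import urllib.request

# Two (fabricated) APs from the same scan; MLS wants at least two in
# physical proximity before it will return a position.
observation = {
    "wifiAccessPoints": [
        {"macAddress": "01:23:45:67:89:ab", "signalStrength": -51},
        {"macAddress": "01:23:45:67:89:cd", "signalStrength": -68},
    ]
}

req = urllib.request.Request(
    "https://location.services.mozilla.com/v1/geolocate?key=test",
    data=json.dumps(observation).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    fix = json.load(resp)

print(fix["location"]["lat"], fix["location"]["lng"], fix["accuracy"])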

There are other WiFi location databases out there (use yer Googler), but most are either nonfree (in any sense of the word) or have very limited or regional data. MLS seemed to have an answer for every point I threw at it. The only real catch is you have to provide at least 2 APs in close physical proximity as a security measure – you can’t simply poke in the address of a single AP and get a dot on the map.

2) Just how much WiFi is out there? How many open / captive portal hotspots? How many of them are vulnerable to DNS tunneling?

A special subdomain and DNS record were set up on my web host (cheap Dreamhost shared hosting account) to delegate DNS requests for that subdomain to my home server, running a DNS server script (‘dnscatch’) and logging all queries. This is the business end of the DNS tunnel.
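
The catcher side needn’t be fancy. Here’s a stripped-down sketch of the idea (not the real ‘dnscatch’ – that’s in the repo – and this one never even answers; it just logs the name queried in every UDP packet that hits port 53):

import socket

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", 53))  # binding port 53 requires root

while True:
    data, addr = sock.recvfrom(512)
    # The queried name starts at byte 12 of a DNS packet: a series of
    # length-prefixed labels terminated by a zero byte.
    labels, i = [], 12
    while i < len(data) and data[i] != 0:
        n = data[i]
        labels.append(data[i + 1:i + 1 + n].decode("ascii", "replace"))
        i += n + 1
    with open("catch.log", "a") as log:
        log.write("%s %s\n" % (addr[0], ".".join(labels)))

A real catcher also has to cook up a well-formed reply, since the probe validates the response against a known-good value.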

In tunnel discovery mode, ‘wifind’ repeatedly scans for unencrypted APs and checks if their tunneling status is known yet. If unknown APs are in range, it connects to the one with the highest signal strength, and progresses through the stages of requesting an address (DHCP), sending a DNS request (tunnel probe with short payload), validating the response (if any) against a known-good value, and finally, fetching a known Web page and validating the received contents against the expected contents. The AP is considered ‘known’ if all these evaluations have completed, or if they could not be completed in a specified number of connection attempts. The companion script, ‘dnscatch’ running on a home PC (OK, another RasPi… yes I have a problem) catches the tunneled probe data and logs it to a file. The probe data includes the MAC address and SSID of the access point it was sent through. Finally, ‘query-mls’ correlates the list of successfully received tunneled data with the locations where the vulnerable AP was in range, outputting another set of .GPX files with these points as POI markers.
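
The probe itself is conceptually simple: pack the payload into a hostname under the delegated subdomain and let the portal’s own resolver do the dirty work. Roughly (a sketch; the subdomain is made up, and base32 is used because hostnames are case-insensitive):

import base64
import socket

def send_probe(mac, ssid, domain="t.example.com"):
    # Disguise the payload as a leading DNS label (max 63 characters per
    # label; a longer payload would need splitting across several labels).
    payload = base64.b32encode(("%s|%s" % (mac, ssid)).encode()).decode()
    label = payload.rstrip("=").lower()  # some APs eat '=' characters (see below)
    try:
        socket.gethostbyname("%s.%s" % (label, domain))
        return True   # an answer came back through the portal
    except socket.error:
        return False  # blocked, mangled or timed out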

Preliminary Results

This is expected to be an ongoing project, with additional regional datasets captured and lots of statistics. I haven’t got ’round to any of that yet, but here is a look at some early results.

All of the map images below were made by uploading the resulting .GPX files to uMap, an open-source map-generation tool hosted on OpenStreetMap’s servers. Until finding this, I thought generating the pretty pictures would end up being the hardest part of this exercise.

WiFi geolocation and data exfiltration points

This trip, part of my morning commute (abridged in the map image), consists of 406 total GPS trackpoints and 467 WiFi trackpoints (the lower figure for GPS is due to the time it takes to fix after a cold start). Of these, 238 (51%) were in view of a tunnelable AP. The blue line is the “ground truth” GPS track and the purple line with dots is the track estimated from WiFi geolocation showing the actual trackpoints (and indirectly, the distribution of WiFi users in the area). The red dots indicate trackpoints where a tunneling-susceptible AP was in-view, allowing (in a non-proof-of-concept) live and historical telemetry to be exfiltrated by some dirty WiFi moocher.

Overall, the WiFi estimate is pretty good in an urban residential/commercial setting, although it struggles a bit here due to the prevalence of big parking lots, cinderblock industrial parks and conservation areas. The apparent ‘bias’ in the data toward WiFi-less areas in this dataset is consistent, based on comparison to the return drive, and does not appear to be an artifact of e.g. WiFi antenna placement on the left vs. right side of the Pi.

GPS and WiFi geolocation points with tunneling-friendly points shown in red.

Here is basically the same thing, with the tunnelable points shown according to the ground-truth GPS position track rather than the WiFi location track.

How does it do under slightly more challenging WiFi conditions?

Here is another dataset, taken while boogeying down MA-2 somewhere north of the posted speed limit. Surprisingly, the location accuracy is pretty good in general, even approaching proper highway speeds. This is the closest thing I have to a “highway” dataset, due to just now finishing up the scripts and not having had a chance to actually drive any real highways yet.

It will be interesting to see how many locatable APs can be found on a highway surrounded by cornfields instead of encroaching urban sprawl. I suspect there are a few lurking in various traffic/etc. monitoring systems and AMBER alert type signage scattered about, but they may not be useful for locating (with MLS, at least) due to the restriction of requiring at least two APs in-view to return a position. This is ostensibly to prevent random Internet loonies from tracking people to their houses via their WiFi MACs, although I have no idea how one would actually accomplish that (at least in an easier way than just walking around watching the signal strength numbers, which doesn’t require MLS at all). Unfortunately, I don’t have tunneling data for this track since I haven’t driven it with the script in tunnel discovery mode yet.

Location track on MA-2 with fairly sporadic WiFi coverage

Closeup of location track on MA-2 with fairly sporadic WiFi coverage

In this dataset, MLS would occasionally (typically at the fringes of reception with only a couple APs visible) return a completely bogus point, centered a few towns over, with an accuracy value of 5000.0 (I presume this is the maximum value, and represents not having any clue where you are). Every unknown point produced the same value, so the track periodically jumps to an oddly-specific (and completely uninteresting) coordinate in Everett, MA, producing the radiants shown below. These bogus points are easily excluded by setting a cap on allowable accuracy values, which are intended to represent the radius of the “circle where you might actually be with 95% confidence, in meters”.
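
In script terms the fix is essentially a one-liner (a sketch; ‘points’ is assumed to hold the parsed MLS responses):

BOGUS_RADIUS_M = 5000.0  # MLS's "no clue whatsoever" value
points = [p for p in points if p["accuracy"] < BOGUS_RADIUS_M]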

Location track on MA-2 with fairly sporadic WiFi coverage, bogus points included


1) Can mere mortals (not Google) viably geolocate things using public WiFi data?

Totally! Admittedly, my sample set is very small and only covers a local urban area, but it’s clear that geolocating with MLS is very approachable for mortals. If you plan to make large numbers of requests at a time, they expect you to request an API key, and reserve the right to limit the number of daily requests, but API keys are currently provided free of charge. IIRC, numbers of requests that need to worry about API keys or limits are on the order of 20k/day; this requirement is mainly aimed at folks deploying popular apps and not individual tinkerers. For testing and small volume stuff, you can use “test” as an API key.

How good is it compared to GPS? Is it good enough to find your stuff?

So far, ranging from GPS-level accuracy to street-level accuracy (or “within a couple streets” if they are packed densely enough but not much WiFi is around). Generally, not as good as GPS. The estimated accuracy values typically ranged from 50-some to 250 or so meters, vs. a handful of meters for GPS. Remember though, the accuracy circle represents a 95% confidence value, so there’s a decent chance the thing you’re looking for is closer to the middle than the edges. This might also depend on how big your thing is and how many places there are to look. In some cases, narrowing it down to your own distribution center might be enough.

2) Just how much WiFi is out there? How many open / captive portal hotspots? How many of them are vulnerable to DNS tunneling?

As mentioned above, I found about 50% of points taken in a dense residential area were in-view of an access point susceptible to tunneling. In this area, the cable operator Comcast is largely – if inadvertently – responsible for this, so your mileage may vary in other areas (although I expect others to follow suit). In the last few years, Comcast has been replacing its “dumb” rental cable modems with ones that include a WiFi hotspot, which shows up unencrypted under the name ‘xfinitywifi’ and serves up a captive portal. The idea is that Comcast customers in the area can log into these and borrow bandwidth from other customers while on the go. Fortunately for us, so far it also means plenty of tunneling opportunities: ‘xfinitywifi’ represented nearly 50% of all APs in my local area, and 15% of a much larger dataset including upstate New York (this dataset has limited tunnel-probe data only and does not include location data).

It also means Comcast – and other cablecos – could make an absolute killing selling low-cost IoT connectivity if they can provide a standardized machine-usable login method and minimal/no per-device-per-month access charges. An enterprise wishing to instrument stuff buys a single service package covering their entire fleet of devices and pays by the byte. Best of all, Comcast’s cable Internet customers already pay for nearly all of the infrastructure (bandwidth bills, electricity, climate-controlled housing of the access points…), so they can double-dip like a Chicago politician.

Under The Hood
There are a few practical challenges with this test approach that had to be dealt with:

The method used to manage the WiFi dongle (a Python library called ‘wifi‘) is a bit of a kludge that relies on scraping the output of commandline tools such as iwlist, iwconfig, etc. and continually rewriting your /etc/network/interfaces file (probably via the aforementioned tools). This combination tends to fall over in a stiff wind – for example, if it encounters an access point with an empty SSID (”). Running headless, crashes are neither detectable nor recoverable (except by plugcycling), so to keep things running, I had to add a check that avoids trying to connect to those APs. It turns out there are quite a few of them out there, so I’m probably missing a lot of tunneling opportunities and location resolution, but the data collected from the remaining ones is more than adequate for a proof-of-concept. I also added the step of replacing /etc/network/interfaces with a known-good backup copy at every startup, as a crash will leave it in an indeterminate state of disrepair, ensuring a fair chance the script will immediately crash again on the next run.
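
The band-aid amounts to a couple of lines at startup (a sketch; the backup path is an assumption):

import shutil

# A crash can leave /etc/network/interfaces half-rewritten; start every
# run from a known-good copy so the 'wifi' library has a sane baseline.
shutil.copy("/etc/network/interfaces.good", "/etc/network/interfaces")

def usable_aps(scanned):
    # Never hand the flaky library an AP with an empty SSID.
    return [ap for ap in scanned if ap.ssid]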

Keeping track of trips. A headless Pi plugged into the cig lighter and being power-cycled each time the car starts/stops will quickly mix data from multiple trips together in the database, and adding clock data would only compound the problem (as the clock is reset each time). The quick solution to this was don’t use a database… er, adding a variable called the ‘runkey’ to the data for each trip. The runkey is a random number generated at script startup and associated with a single trip. To my mild surprise, I have gotten all unique random values despite the startup conditions being basically identical each time. Maybe the timing jitter in bringing up the WiFi dongle is a better-than-nothing entropy source?
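
The whole scheme, sketched out (the table layout is hypothetical):

import random

RUNKEY = random.getrandbits(32)  # generated once per script startup = one trip

def log_point(db, blob):
    # Tag every row with this trip's key, so rows can be grouped into
    # trips later even though the Pi's clock resets on every power cycle.
    db.execute("INSERT INTO points (runkey, data) VALUES (?, ?)", (RUNKEY, blob))
    db.commit()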

Connection attempts are time-consuming and blocking. At the start of this project, I wasn’t sure that connecting to random APs would even work at all at drive-by speeds. It does, but you kind of have to get lucky: DHCP takes some seconds (timeout set to 15), then the remaining evaluations take some further seconds each. Even with the hack of always preferring to connect to the strongest AP (assuming it is the closest and most likely to succeed), the odds of connecting are iffy. Luckily, my daily commute is the same every day, so repeated drives collect more data. Ordinarily, whatever AP I encounter ‘first’ would effectively shadow a bunch of others encountered shortly after, but the finite-retry mechanism ensures the already-probed APs are eventually excluded from consideration so that the others can be evaluated.
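
The selection logic boils down to something like this (a sketch; ‘known_aps’ and the AP objects’ attributes are assumptions):

MAX_ATTEMPTS = 3
attempts = {}  # BSSID -> connection attempts so far

def pick_target(visible, known_aps):
    # Strongest not-yet-known AP wins; anything that has burned through its
    # retries stops shadowing the weaker APs behind it.
    candidates = [ap for ap in visible
                  if ap.bssid not in known_aps
                  and attempts.get(ap.bssid, 0) < MAX_ATTEMPTS]
    return max(candidates, key=lambda ap: ap.signal) if candidates else None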

The blocking nature of the connection attempts also prevents WiFi location data (desired at fixed, ideally short intervals) from being collected reliably while tunnel probing. The easy solutions are to plop another Pi on the dashboard (sorry, fresh out!) or just toggle the script operating mode on a subsequent drive (I did the latter).

The iffyness of the connections may also explain a significant discrepancy between tunnel probes successfully received at my home server and APs reported as successful by the script side (meaning they sent the probe AND received a valid reply). Of course, I also found the outgoing responses would be eaten by Comcast if they contained certain characters (like ‘=’) in the fake domain name, even though the incoming probes containing them were unaffected.

One thing that hasn’t bitten me yet, oddly enough, is corruption of the Pi’s memory card, even though I am letting the power cut every time the car stops rather than have some way to cleanly shut down the headless Pi first. You really ought to shut it down first, but I’ve been lucky so far, and my current plan is to just reimage the card if/when it happens.

*DNS Tunneling
Typically, public WiFi hotspots, aka captive portals, are “open” (unsecured) access points, but not really open to the public – you can connect to one, but the first time you try to visit a website, you’re greeted with a login page instead.

DNS Tunneling is one technique for slipping modest amounts of data through many firewalls and other network-access blockades such as captive portals. How it works is the data, and any responses, are encoded as phony DNS lookup packets to and from a ‘special’ DNS server you control. The query consists mostly of your encoded data payload disguised as a really long domain name; likewise, any response is disguised as a DNS record response. Why it (usually) works is a bit of a longer story, but the short answer is that messing with DNS queries or responses yields a high chance of the client never being able to fetch a Web page, thus the access point never having the opportunity to feed a login page in its place, so the queries are let through unimpeded. If the access point simply blocked these queries for non-authenticated users, the HTTP request that follows would never occur; likewise, since hosts (by intention and design of the DNS) aggressively cache results, the access point feeding back bogus results for these initial queries would prevent web access even after the user logged in.

DNS tunneling is far from a new idea; the trick has been known since the mid to late ’90s and is used by some of the big players (PC and laptop vendors you’ve most certainly heard of) to push small amounts of tracking data past the firewall of whatever network their users might be connected to. However, it’s gained notoriety in the last few years as tools like iodine have evolved to push large enough amounts of data seamlessly enough to approximate a ‘normal’ internet connection.

Isn’t it illegal?

I am not a lawyer, and this is not legal advice, but my gut says “probably not”. Whether this is an officially sanctioned way to use the access point or not, the fact is it is intentionally left open for you to connect to (despite it being trivial to implement access controls such as turning on WPA(2) encryption, and the advice screamed from every mountaintop that you should do so), accepts your connection, then willfully passes your data and any response. The main thing that leaves me guessing it’s on this side of the jail bars is that a bunch of well-known companies have been doing it for a long time on other peoples’ networks and not gotten into any sort of trouble for it. (Take that with a big grain of salt though; big companies have a knack for not getting in trouble for doing stuff that would get everyday schmucks up a creek.)

Doesn’t it open up the access point owner to all sorts of liability from anonymous ne’er-do-wells torrenting warez and other illegal stuff through it?

Not really. The ‘tunneling’ part is key; the other end of the tunnel – a server the ne’er-do-well controls – is where their traffic will appear to originate from. Unless they are complete idiots, the traffic within the tunnel is encrypted, and it’s essentially the same thing as a VPN connection. Anyone with a good reason to track down that traffic will follow the money trail to the server’s operator, the same as they would for a VPN. While a clever warez d00d will attempt to hide their trail using a torrent-VPN type service offered from an obscure country and pay for it in bitcoins, the trail will at least not point to the access point owner.

So it’s just a free-for-all then?

No; there are at least a few technical measures an access point designer/owner can take against this technique.

An easy and safe one is to severely throttle or rate-limit DNS traffic for unauthenticated users. While this won’t stop it outright, it will limit any bandwidth consumption to a negligible level, and the ‘abuse’ to a largely philosophical one (somebody deriving a handful of bytes/sec worth of benefit without paying for it). The ratelimit could get increasingly aggressive the longer the user is connected without authenticating.

Another is to intercept A record responses and feed back fake ones anyway (virtual IPs in a private address range, e.g. 10.x.x.x), with the caveat that the AP then must store the association between the fake address it handed out and the real one, and once the user is authenticated, forward traffic for it for the life of the session. I wouldn’t recommend this approach as it may still have consequences for the client once they connect to a different access point and the (now cached) 10.x.x.x address is no longer valid, but I’ve seen it done.

You can also target the worst offenders pretty easily, as they (via software such as iodine) are pushing large volumes of requests for e.g. TXT records (human-readable text records which can hold larger data payloads) instead of A records (IP address lookups). However, some modern internet functionality such as email anti-spam measures (e.g. SPF) does legitimately rely on TXT record lookups, so proceed with caution.

Finally, statistical methods could be used – again, hardcore tools like iodine will attempt to max out the length AND compress hell out of the payloads, so heuristics based on average hostname lookup length and entropy analysis of the contents could work. This is more involved though – deep packet inspection territory – and as with any statistical method runs the risk of false positives and negatives.
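
To put some meat on the rate-limiting option above: on a Linux-based AP, a sketch using iptables’ hashlimit module might look like the following (numbers pulled out of thin air, and a real portal would apply this only to unauthenticated clients):

# Per-client cap on forwarded DNS: above ~10 queries/minute, start dropping.
iptables -A FORWARD -p udp --dport 53 \
    -m hashlimit --hashlimit-name dns-throttle \
    --hashlimit-mode srcip --hashlimit-above 10/minute \
    -j DROP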

Notes To Myself: Starting out with OpenWSN (Part 1)

Successful toolchain setup, flashing and functional radio network! Still todo: Fix network connectivity between the radio network and host system, and find/fix why the CPUs run constantly (drawing excess current) instead of sleeping.

Over the last few weeks (er, months?), I built up and tried out some circuit boards implementing OpenWSN, an open-source low-power wireless mesh networking project. OpenWSN implements a stack of kid-tested, IEEE-approved open protocols ranging from 802.15.4e (Time-Synchronized Channel Hopping) at the MAC layer, 6TiSCH (an interim, hardcoded channel/timeslot schedule until the smarts for deciding them on the fly are finalized), 6LoWPAN (a compressed form of IPv6 whose headers fit in a 127-byte 802.15.4etc. frame), RPL/ROLL (routing), and finally CoAP/HTTP at the application level. The end result is (will be) similar to Dust SmartMesh IP, but using all open-standard and open-source parts. This should not be a huge surprise; it turns out the project is headed up by the original Berkeley Smart Dust guy. Don’t ask me about the relationship between this and Dust-the-click-n-buy-solution (now owned by Linear Technology), TSCH, any patents, etc. That’s above my pay grade. My day job delves heavily into low-power wireless stuff, and in my experience SmartMesh delivers everything it promises. But it’s rather out of the price range of hobbyists as well as some commercial projects. That, and if you use it in a published hobby project Richard Stallman might come to your house wielding swords. So how about a hands-dirty crash course in OpenWSN?


At the time of this writing, the newest and shiniest hardware implementation seems to be based on the TI CC2538, which packs a low-power ARM Cortex core and radio in a single package. OpenMote is the (official?) buyable version of this, but this being the hands-dirty crash course, I instead spun my own boards. You don’t really own it unless you get solderpaste in your beard, right? The OpenMote board seems to be a faithful replication of a TI reference design (the chip, high and low-speed crystals, and some decoupling caps), so we can start from there. To save time I grabbed an old draft OpenMote schematic from the interwebs, swapped the regulator for a lower-current one and added some pushbuttons.

OpenWSN PCB based on an early OpenMote version

Here is the finished product. Boards were ordered thru OSH Park, and SMT stencil through OSHStencils (no relation). Parts were placed using tweezers and crossed fingers, and cooked on a Hamilton Beach pancake griddle. 2 out of 3 worked on the first try! The third was coaxed back to life by lifting the chip and rebaking followed by some manual touch-up.


I first smoke-tested the boards using the OpenMote firmware, following this official guide. No matter where you start, you’ll need to install the GCC ARM toolchain. Details are on that page.

This package REQUIRES Ubuntu (or something like it), and a reasonably modern version of it at that (the internets say you can theoretically get it working on Debian with some undue hacking-about, if you don’t mind it exploding in your face sometime in the future).

If your Ubuntu/Mint/etc. version is too old (specifically, package manager version), you’ll get an error relating to the compression type (or associated file extension) used in the PPA file not being recognized. You can maybe hack about to pull in a ‘future’ version for your distro version, but who knows which step you’ll get stuck at next for the same reasons. (Maybe none, but I just swapped the hard drive and installed a fresh Mint installation on another one.)

First, build the CC2538 library. In the libcc2538 folder:

python

This will build libcc2538.a – probably after you got a “libcc2538.a does not exist. Stop.” error message.

Next, try compiling a test project to make sure the toolchain works:

chmod 777

Assuming all goes well, now you can flash the resulting binary onto the board!

sudo make TARGET=cc2538 BOARD=openmote-cc2538 bsl

Needless to say, you need some kind of serial connection to the bootloader UART (PA0, PA1) on the board for this to work (I used a USB-serial dongle with 3.3V output).

Successful output from this step looks something like:

Loading test-radio into target...
Opening port /dev/ttyUSB0, baud 115200
Reading data from test-radio.hex
Connecting to target...
Target id 0xb964, CC2538
Erasing 524288 bytes starting at address 0x200000
Erase done
Writing 524288 bytes starting at address 0x200000
Write done

Now you can actually try compiling OpenWSN.

OPTIONAL STEP: If you foresee doing active development on OpenWSN, you might want to install Eclipse. NOTES:

Install it direct from the website; even the Mint package manager version as of 12/15 is still on v3. Follow the instructions on this page, for the most part.
If you installed arm-none-eabi etc. from the previous step, it “should” be ready to rock.
The options in Eclipse have changed a bit since this was written. When creating a new (blank) test project, select “Cross ARM GCC”. Create a main.c file with a hello-world main() (or copy & paste from the page above), then save and build (Ctrl-B).

You may get a linker error similar to: “undefined reference to `_exit'” . I solved this by selecting ‘Do not use standard start files (-nostartfiles)’

I’m currently getting a 0-byte file as reported by ‘size’ (actual output file has nonzero size). Not sure whether to be concerned about this or not:

Invoking: Cross ARM GNU Print Size
arm-none-eabi-size --format=berkeley "empty-test.elf"
text data bss dec hex filename
0 0 0 0 0 empty-test.elf

The actual .elf and .hex are 13k and 34 bytes on disk, respectively.

This is not actually crucial to compiling OpenWSN, so I gave up here and went back to the important stuff:

NON-OPTIONAL: Download and set up SCons. This is the build tool (comparable to an advanced version of ‘make’) used by OpenWSN.

Again, anything in your package manager is horribly out of date, so grab it from the web site, unpack and ‘sudo python install’.

Clone the ‘openwsn-fw‘ repository somewhere convenient (preferably using git, but you could just download the .zip files from github), change into its directory and run scons without any arguments. This gives you help, or is supposed to. It lists various ‘name=value’ options, along with text suggesting that the options listed are the only valid ones. However, popular options like the gcc ARM toolchain and OpenMote-CC2538 are not among those listed. Luckily, they still work if you’ve googled around for the magic text strings:

scons board=OpenMote-CC2538 toolchain=armgcc goldenImage=root oos_openwsn

This results in an output file of decidedly nonzero reported size:

arm-none-eabi-size --format=berkeley -x --totals build/OpenMote-CC2538_armgcc/projects/common/03oos_openwsn_prog
text data bss dec hex filename
0x16303 0x28c 0x1bd8 98663 18167 build/OpenMote-CC2538_armgcc/projects/common/03oos_openwsn_prog
0x16303 0x28c 0x1bd8 98663 18167 (TOTALS)

Add ‘bootload /dev/ttyUSB0’ (or whatever your serial device shows up as) and run it with the mote in boot mode (hold Boot button / pin PA6 low and reset), and it should Just Work. Upload takes a while. Ideally, you need to flash at least 2 boards for a meaningful test (one master, or ‘DAG root’ in OpenWSN parlance, and one edge node).

Now, need to run openvisualizer to see if anything’s actually happening.

First… currently ‘’ for this package is broken, and barfs with errors e.g.:

Traceback (most recent call last):
File "", line 34, in
with open(os.path.join('openvisualizer', 'data', 'requirements.txt')) as f:
IOError: [Errno 2] No such file or directory: 'openvisualizer/data/requirements.txt'

‘pip install’ might be another way to go, but this appears to install from an outdated repository, and barfs with some version dependency issue.

Since the actual Python code is already here, we can just try running it, which seems to be expected to go through SCons: run ‘sudo scons rungui’ in the openvisualizer directory.

Traceback (most recent call last):
File "bin/openVisualizerApp/", line 29, in
import openVisualizerApp
File "/home/cnc/workspace/openwsn-sw/software/openvisualizer/bin/openVisualizerApp/", line 17, in

from openvisualizer.eventBus import eventBusMonitor
File "/home/cnc/workspace/openwsn-sw/software/openvisualizer/openvisualizer/eventBus/", line 18, in

from pydispatch import dispatcher
ImportError: No module named pydispatch

Well, that doesn’t actually work, but it’s at least a starting point to flushing out all the unmet dependencies by hand. Install the following packages:

pip (to install later stuff)
pydispatch (pip install…)*
pydispatcher (pip install…)
python-tk (apt-get install…)

*No, wait, that one’s already installed. According to the externalized Google commandline, sez you actually need to install a separate package named ‘pydispatcher‘.

Finally, let’s give it a go.

cnc@razor ~/workspace/openwsn-sw/software/openvisualizer $ sudo scons rungui

The OpenWSN OpenVisualizer

It works! Sort of. I get a mote ID, and can toggle it to be DAG root, and after a while, the 2nd board appears with a long-and-similar address in the neighbor list. It’s receiving packets and displays a RSSI. So at least the hardware is working. However, I can’t interact with it, and it doesn’t show up as a selectable mote ID (presumably just an openvisualizer issue). Nor can I ping either one as described in the tutorial, even though the part of the console dump relating to the TUN interface looks exactly as it does in the example (warning messages and all):

scons: done building targets.
cnc@razor ~/workspace/openwsn-sw/software/openvisualizer $ ioctl(TUNSETIFF): Device or resource busy

created following virtual interface:
3: tun0: mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
inet6 bbbb::1/64 scope global
valid_lft forever preferred_lft forever
inet6 fe80::1/64 scope link
valid_lft forever preferred_lft forever
22:43:24 INFO create instance
22:43:24 INFO create instance
22:43:24 INFO create instance

Killing openvisualizer, plugcycling both radios and restarting it returns a screenful of:

[Errno 11] Resource temporarily unavailable
device reports readiness to read but returned no data (device disconnected?)

Let’s try rebooting; it fixes stuff in Windows… Actually, it looks like the GUI window sometimes persists after openvisualizer is supposedly killed at the console and can’t be closed via the UI; there is probably a task you can kill as a less drastic measure.

That cleared up the ‘no data’ errors, but still can’t ping any motes:

cnc@razor ~ $ ping bbbb::0012:4b00:042e:4f19
ping: unknown host bbbb::0012:4b00:042e:4f19

Well, at any rate we know the radio hardware is working, so let’s see the next moment of truth: power consumption. That’s really the point of this whole timeslotted radio exercise; otherwise you’d just drop an XBee on your board and hog as much bandwidth and electrons as you like. The 802.15.4e approach is for wireless networks that run for months or years between battery changes.

Firing up the ol’ multimeter is not the best way to measure current draw of a bursty load, but as a quick first peek it’ll do. On startup, the radio node draws a steady ~28mA, which is not all that unexpected (it needs a 100% initial on-time to listen for advertisements from the network and sync up.) After a few moments, the current drops to 11mA and the node appears in OpenVisualizer. Wait a minute… 11mA you say, with an ‘m’? That’s not that low. Scoping the 32MHz crystal confirms that the CPU is running all the time, rather than sleeping between scheduled activity. Scoping the 32KHz crystal mostly confirms that you can’t easily scope a low-power 32KHz crystal (the added probe capacitance quenches it), but doing so causes the node to drop off the network, then reappear a short time after the probe is removed, so that crystal appears to be functional (not to mention important).

Now, is it software or hardware?

Back to the OpenMote (not OpenWSN) example projects, let’s try an example that ‘should’ put the CPU to sleep:

cnc@razor ~/OpenMote/firmware/projects/freertos-tickless-cc2538 $ make TARGET=cc2538 BOARD=openmote-cc2538 all
Building 'tickless-cc2538' project...
Compiling cc2538_lowpower.c...
Compiling ../../kernel/freertos/croutine.c...
Compiling ../../kernel/freertos/event_groups.c...

cnc@razor ~/OpenMote/firmware/projects/freertos-tickless-cc2538 $ sudo make TARGET=cc2538 BOARD=openmote-cc2538 bsl
Loading tickless-cc2538 into target...
Opening port /dev/ttyUSB0, baud 115200
Reading data from tickless-cc2538.hex
Connecting to target...
Target id 0xb964, CC2538
Erasing 524288 bytes starting at address 0x200000
Erase done
Writing 524288 bytes starting at address 0x200000
Write done

Sure enough, with this example project loaded, the board’s current consumption drops to the uA range (actually, I was lazy and concluded the “0.00” reading on the mA scale told me what I wanted to know), with the 32MHz crystal flatlined except for very brief (<1msec) activity periods.

That’s it for Part 1. Stay tuned, Future Tim, for the part where we track down the source of the missing milliamps!

Clutter, Give Me Clutter (or, a GUI that doesn’t use Google as an externalized commandline)

UX nightmare: Get the menu at a restaurant, and it has only 2 items: toast and black coffee. But if you spindle the corner just right, a hidden flap pops out with a dessert menu. And if you shake it side to side, a card with a partial list of entrees falls in your lap (not all of them, just the ones you’ve ordered recently).

When you eat at Chateau DeClutter, bring a friend. If you can pinch all 4 corners of the menu at the same time, you can request the Advanced menu, wherein you just yell the name of a food across the room, and if that’s something they make, it’ll appear in about 20 minutes, and if not, nothing will happen.

Tim Tears It Apart: Measurement Specialties Inc. 832M1 Accelerometer

So, yesterday the outdoors turned into this.

This much snow in a few hours gets you a travel lockdown…

Not quite the snowpocalypse, but it was enough that a travel ban was in effect, and work was closed. What happens when we’re stuck in the house with gadgets?

All right, I’d like to tell you that’s the reason, but this actually got broken open accidentally at my work a little while back. Sorry for the crummy cellphone-cam pic and not-too-exhaustive picking at the wreckage.

Today’s patient anatomical specimen is a fancy 3-axis piezo accelerometer from Measurement Specialties Inc. This puppy retails for about $150, so this is a ‘sometimes‘ teardown.

The insides of the 832M1 showing two sensor orientations and the in-package charge amplifiers

One thing that comes to your attention right away is holy shit, there’s an entire circuit board in there. In retrospect, I probably shouldn’t be too surprised by this. It appears that these are full-on piezoelectric sensors (not e.g. piezoresistive), which are a bit dastardly to read from without a charge amplifier inline. On the circuit you can see three identical-ish copies of a small circuit that is almost certainly that, with a small SOT23-5 opamp in each. The part’s total quiescent current consumption is billed at 12uA, so that’s a paltry 4uA per circuit.

Here you also get a gander at the acceleration sensors themselves. Each is ‘glued’ by what appears to be low-temperature solder paste to its own metal pad on the ceramic substrate, with more of the same used to bond together the parts of the sensor itself. These consist mainly of a layer of gray piezoceramic material sandwiched between two chunks of metal. The larger of these acts as a proof mass, compressing or tensioning the piezoceramic layer when the part is moved on the axis normal to it. The metal ‘buns’ double as electrodes. There are (were) three such sensors in different orientations, one per axis, but the middle one broke off and flew across the room when the package was cracked open.

Like most piezoceramics, the sensors inside are affected by thermal changes, and become more sensitive with increasing temperature. The designers appear to account for this and provide some measurement headroom over the nominal value (this bad boy is a 500G accelerometer) so that the full quoted range can be measured even at the maximum specified operating temperature. This means at room temperature, where it’s less sensitive, you can actually measure accelerations of maybe 30-40% higher than the nominal value before the output limits (with appropriate calibration and reduced resolution of course). At very cold temperatures, even quite a bit higher measurements are possible, with the same caveats.

Tim Tears It Apart: Koolpad Qi Wireless Charger (Also: how to silence it without soldering)

Koolpad outer case

My wife goes to bed long before me, so when I go to bed, it behooves me to do so without significant light or racket. After countless nights of fiddling with a 3-sided micro-USB cable in the dark, I bought this neat little Qi phone charger. It’s not the cheapest, nor the priciest, but was approximately the size and shape of my phone (less risk of our cat bumping the phone off a tiny pad during the night and waking us up), and lives up to its promise of cool operation, especially while not charging (meaning it is not constantly guzzling power trying to charge a device that isn’t there).


This charger also came with one untenable drawback: this gadget for charging my phone without any noisy fiddling about comes with a built-in noisemaker that beeps loudly every time you put a device on it to charge. That’s bad enough for a one-time event, but if the phone is placed or later bumped off-center (i.e. the cat’s been anywhere near the nightstand), it will go on beeping all through the night as the charger detects the phone intermittently. So if the ladies don’t find you handsome…

First off, the part you’re probably here for: how to silence that infernal beeping sound once and for all, without soldering. Open the charger (there are 4 small screws hidden beneath the rubber feet), find the 8-pin chip in the lower-left and use your favorite small tool (nippers, X-Acto, etc.) to cut the indicated pin. This pin directly drives the piezo buzzer (tall square part in the bottom-left corner). Of course, you can also cut, desolder or otherwise neutralize either of these parts in its entirety too if you have the time and a soldering iron, but I think cutting the one pin is easier.

Cut this pin to disable that beeping once and for all!

While we’re in here, let’s have a look at the circuitry. There’s actually rather a lot of it – I was expecting a massively integrated single-chip solution, but there’s a surprising pile of discretes in here, including a total of 12 opamps.

Koolpad innards showing charging coil and PCB

Koolpad PCB

The main IC, U5 in the center, is an “ASIC for QI Wireless Charger” manufactured by GeneralPlus. At the time of this writing, their site is rather less than forthcoming with specs or datasheets. This is driving a set of 4 MOSFETs (Q1 ~ Q4) via a pair of TI TPS28225 FET drivers (U1, U4) to drive the ludicrously thick charging coil. The coil itself is mounted to a large ferrite disk resembling a fender washer, which in turn is adhered to a piece of adhesive-backed foam – probably to dampen any audible vibrations as the coil moves in response to nearby magnetic or ferrous objects as it is driven (such as components in the device being charged). The parallel stack of large ceramic caps (C12, C14, C15, C18), along with the coil thickness itself, gives a hint as to the kinds of peak currents being blasted through it.

For fans of Nikola Tesla, this planar coil arrangement should look oddly familiar. Qi inductive charging, like most contemporary twists / rehashes / repatentings of this idea, extends it by using modern cheap-as-chips microcontrollers to allow the charger and chargee to talk amongst themselves. This allows them to collude to tune their antennas for maximum transfer efficiency, negotiate charging rate, perform authentication (maybe you don’t want freeloaders powering up near your charger), etc. In the case of Qi, the communication is unidirectional – chargee to charger – and used to signal the charger to increase or decrease its transmit power as needed. This provides effective voltage regulation at the receiving end and can instruct the charger to more-or-less shut down when charging is complete. Communication is achieved via backscatter modulation, similar to an RFID tag.

The chips to the left and right of the Qi ASIC, U2 and U3, are LM324 quad opamps. Without formally reverse-engineering the circuit, my gut says these opamp circuits, surrounded by RC discretes like groupies at a One Direction concert, are likely active filters, probably involved in sensing the backscatter-modulated signal and overall power impact of the chargee, if any. Again, this is just educated guessing without actually tracing out the circuitry (which would involve more time than I care to spend).

The chip at the bottom right, U6, is an LM358 dual opamp, with what is probably a LM431-compatible voltage reference (U8) a bit to its left, acting directly as a 2.5V reference (cathode and reference pin tied together). At least one pin of the LM358 is visibly supplying the power to the charge-indicator LED, so it’s a reasonable guess this circuit is there to control both LEDs in response to the voltage loading produced by a device during charge. Finally, U7 near the bottom-left, noted earlier as driving that irritating-ass beepy speaker, is a 555 timer that provides the actual oscillation to drive the charge-indication beep in response to a momentary signal from elsewhere (it disappears underneath the ASIC). Q5 is most likely acting as a power switch for U7, keeping it disabled (unpowered) between beeps.

One final note completely unrelated to any teardown of the device itself: it comes with a troll USB cable. That is to say, while it looks like a regular USB cable, and may easily get mixed in with the rest of your stash, it’s actually missing the data wires entirely and only provides power. While this is not unreasonable considering it’s just a charger, beware not to let this cable get mixed in with ‘real’ ones unless you’re pulling a prank on someone. Otherwise it’ll come back to bite you some months later when you grab the nearest cable, plug it into a gadget and it’s mysteriously stopped working.

Tim Tears It Apart: Cheap Solar Pump

GY Solar water pump package

So, I picked up a pair of these cheapo solar pumps on fleabay for about 6 or 8 bucks a pop, to filter water for the fish in my old-lady-swallowed-a-fly lotus pot. They actually work pretty well, apart from one very occasionally getting stuck and needing a spin by hand to get going. But it’s winter, the fish have met Davy Jones (natural causes) and my “plastic twinwall and a big water tank will keep my greenhouse above freezing in a New England winter” hypothesis turned out to be way not true, so they’re just sitting around my basement for the interminable non-growing season. Winter boredom plus unused gadgets sitting around equals…

Package contents. Note the cable was not severed out of the box; that was my doing!

The mechanical end of the pump

Inside the pump end. There’s nothing more to see without destroying it.

There’s not much to see on the pump end itself. A slotted cover blocks large particulates from getting into the works, followed by a plastic baffle and a centrifugal impeller, which flings the water at the outlet port. The impeller “shaft” is a magnet and doubles as the rotor for the electric motor, allowing the coils to be in the non-rotating housing (stator). Under bright sun, this arrangement can generate a head of a few inches or a decent amount of flow, not bad for a cheap pump running from a little solar panel.

Potting compound over pump power entry side

Lifting the cover on the other end reveals where the electronics must be, a cavity completely filled in with potting compound. I declined trying to get through this mess and look at the circuitry on the pump itself. Guessing wildly though, this should probably look very similar to the circuit that drives a DC computer fan, with a small Hall effect sensor detecting the passing of the magnetic poles on the rotor and flipflopping power to the stator coils as needed to push/pull it in the desired direction.

Back of solar panel. There is a strange lump on the back.

‘Kickstarter’ PCB back side

‘Kickstarter’ PCB component side

The interesting part is a random lump on the back of the solar panel. Pop it open and, sure enough, it contains some active circuitry. This consists of a large (4700uF) electrolytic capacitor and undervoltage lockout circuit. This circuit cuts power to the pump until the capacitor charges to several volts, giving it an initial high-current kickstart to overcome static friction. This works about like you’d expect, as a voltage comparator with a fairly large hysteresis band (on at 6V, off at 2V, for example).

Interestingly though, there’s no discrete comparator in sight. Instead, there’s an ATTINY13 microcontroller. The ATTINY does have a builtin comparator though, and the chip’s only purpose in this circuit seems to be as a wrapper around this peripheral. It’s entirely possible that from Chinese sources, this chip was actually cheaper than a standalone comparator and voltage reference. Another likely possibility is it was competitive or cheaper than low-power comparators, and the use of a microcontroller allows better efficiency by sampling the voltage at a very low duty cycle. For reference, the ATTINY13 runs about 53 cents @ 4000pcs from a US distributor. That’s pretty cheap, but not quite as cheap as the cheapest discrete with internal voltage reference at <=100uA quiescent current, which currently comes in at ~36 cents @ 3000pcs. Noting the single-sided PCB, another possibility is that the ATTINY and other silicon were chosen for their pinouts, allowing for single-sided routing and thus cost savings on the PCB itself.

Anyway, onto the circuit. R3 and D1 are an intriguing side-effect of using a general-purpose microcontroller as a comparator, as the absolute maximum Vcc permitted on this part is ~5.5V. D1 is a Zener diode, which, along with the 47-ohm resistor, clamps the voltage seen by the uC to a safe level. This seems like it would leak a lot of current above 5.5V – and it does – but under normal operation, the pump motor should drag the voltage down below that when operating. R1 and R2 form the voltage divider for the comparator, which is no doubt using an on-chip voltage reference for its negative "pin". Pin 5 of the micro is the comparator output, which feeds the gate of an n-channel enhancement-mode MOSFET, U2, through R5, with a weak 100k pulldown (R4). With this circuit, the pump makes periodic startup attempts in weak sunlight until there’s enough sun to sustain continuous operation, with no stalling if the power comes up very slowly (e.g. sunrise).

Notes To Myself: Cheap Feedlines for Cheap Boards

Goal: Produce reasonable impedance-matched (usually 50-ohms) RF feedlines for hobby-grade radio PCBs. Rather than get a PhD in RF engineering for a one-off project, use an online calculator and some rules of thumb to get a “good enough” first prototype.

Problem: Most RF boards and stripline calculators assume or drive toward 4-layer boards. In hobby quantities, 4-layer boards are much more expensive and have longer leadtimes. If using EAGLE, can no longer use the free/noncommercial or Lite editions (they only allow 2 layers).

The main driver of feedline impedance is its geometry and dielectric “distance” from the groundplane. The aforementioned stripline/microstrip/etc. calculators often assume there is nothing on the top (feedline) layer in its vicinity, there is just a groundplane on another PCB layer beneath it, and all proximity to the groundplane is through the FR-4 between the layers. For bog-standard 2-layer boards, that’s ~.062″ of material, which yields unacceptably wide traces (>100 mils) that cannot be cleanly terminated to most antennas or connectors, let alone a surface-mount IC pad.

Solution: Forget plain microstrip stuff, look up a “coplanar waveguide with ground” calculator instead. This takes into account a groundplane on the same layer, surrounding the feedline, in addition to a groundplane on a lower layer. Now the clearance between the feedline and coplanar goundplane can be tweaked to get a sane trace width for various copper weights, board thicknesses or other factors less easily in your direct control.

More notes:
FR-4 relative dielectric constant: 4.2 (in reality, it can vary quite a bit, and there are about a million material variants called “FR-4” and used interchangeably by board houses, but if you can’t be a chooser, this is probably as good an approximation as you get.)
“1oz” copper: 1.37 mils thickness (multiply-divide for other copper weights).
An example calculator is here:
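
The math itself is compact enough to inline into a script. Below is a rough sketch of the quasi-static, conductor-backed CPW approximation as commonly given in references and online calculators (zero-thickness copper assumed; sanity-check against a trusted calculator before ordering boards):

from math import pi, sqrt, tanh
from scipy.special import ellipk  # complete elliptic integral K(m), m = k**2

def cpwg_z0(w, s, h, er=4.2):
    # w = trace width, s = gap to coplanar ground, h = dielectric height
    # (any consistent units). Returns estimated Z0 in ohms.
    k = w / (w + 2 * s)
    k1 = tanh(pi * w / (4 * h)) / tanh(pi * (w + 2 * s) / (4 * h))
    r = ellipk(k**2) / ellipk(1 - k**2)      # K(k)/K(k')
    r1 = ellipk(k1**2) / ellipk(1 - k1**2)   # K(k1)/K(k1')
    eeff = (1 + er * r1 / r) / (1 + r1 / r)  # effective dielectric constant
    return (60 * pi / sqrt(eeff)) / (r + r1)

# e.g. a 1.4mm trace with 0.3mm gaps on 1.6mm FR-4 lands in the low 50s (ohms)
print(cpwg_z0(1.4, 0.3, 1.6))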

Debugging a shorted PCB the lazy way

I recently assembled a set of prototype boards for a particular project at my day job, and ran a math- and memory-intensive test loop to test them out. Two of the boards ran the code fine, but one crashed consistently in a bizarre way that suggested corruption or otherwise unreliability of the RAM contents.

Since the other two identical boards worked, a hardware/assembly defect was the likely explanation. These boards feature a 100-pin TQFP CPU and 48-pin SRAM with 0.5mm lead pitch, all hand-soldered of course, but a good visual inspection turned up no obvious defects.

The first thing I tried was a quick-n-dirty RAM test loop that wrote the entire RAM with an address-specific known value and immediately read it back (in this case, it was just the 32-bit RAM address itself), but this (overly simplistic, it turns out) check didn’t report any failures. However, I did notice the current reading on my bench supply occasionally spiking from 0.01A to 0.04A. This board uses a low-power ARM micro specifically chosen for efficiency, and should rarely draw more than about 15mA at full tilt, so this was a red flag.

With this in mind, the next thing I did was get a higher-resolution look at what was going on with the current. The CPU vendor provides a sweet energy profiling tool for use with its pre-baked development kits, which also double as programmer and debugger for the kit and external boards. The feature works by sampling input current to the development kit via a sense resistor at high speed, and optionally coupling it to the running program’s instruction counter via the debugger to estimate energy usage for each function in your program. By uploading a dummy program to the kit that just puts it into a deep sleep, and tying the external board into its VMCU/GND pins, it can be used with any external target board that draws up to 50mA or so.

Running the memory test again with the profiler active, I got the following:

Current trace as reported by EnergyAware Profiler

Again, the kit can supply a max of 50mA or so, and this graph shows a repeating cycle which spends half the time somewhere a bit northward of this. Sure enough, probing the supply voltage with a scope, the voltage drops a bit whenever the overcurrent occurs. A memory test loop should draw a fairly constant current; it shouldn’t vary with time or data or address as this appears to be doing. So it’s a safe bet that one of the address or data pins to the external RAM is shorted. But where?

I could begin probing address and data lines to find the ones that toggled at the same rate as the dip on the voltage rail, but on a wild hare (or hair) picked up our new secret weapon (not the Handyman’s Secret Weapon), an IR thermal camera.

Just after power-on with the RAM test running, I saw this:

IR thermal view of the CPU when powered and running the external RAM test

The part immediately to the left of the crosshair is the CPU. It wasn’t detectably warm to the touch, but here it’s easy enough to see where the die itself sits inside the chip package. There is also an apparent ‘hotspot’ roughly centered along the lefthand edge of the die. The I/O pins just next to this hotspot are address lines tied to the RAM. While it isn’t *always* the case due to the vagaries of chip layout, the GPIO pin drivers are almost always situated at the edges of the die, right next to the pins they drive. This is about as close to a smoking gun as you can get without the actual smoke. While it’s hard to see exactly which pin or pins this hotspot corresponds to, it does narrow the search quite a bit! For reference, here is how it looks unpowered.

IR thermal view of the CPU when unpowered.

Scoping a handful of adjacent pins, the issue becomes clear.

Probing near the hotspot seen on IR.

Suspicious ‘digital’ address line voltages near the hotspot.

Two adjacent address lines to the memory, immediately next to the hotspot, show this decidedly not-quite-digital-looking waveform (bottom trace), lining up pretty well with the voltage droop (top trace). This points to not one pin shorted (to GND, etc.), but two adjacent pins shorted together, their on-chip drivers fighting one another to produce these intermediate voltages and consuming excess current in the process. A quick beep-test confirms the short.

It turned out to be a hair-thin solder bridge between two adjacent pins on the SRAM, pictured below. Do you see it?

Take my word for it, there’s a solder bridge in this photo.

Yeah, neither did I at first. It was visible only at a very specific angle…

The short is just visible at this angle

Note the other apparent “stuff” crossing pins in this angle wasn’t solder, but remnants of a cotton swab used to clean flux from around the pins.

Mapping the I/O drivers and other stuff

Just for fun, I purposely shorted all the GPIOs available on headers to ground and ran a loop that briefly flipped each one high. Here is the result! Note that not all GPIOs on this board were available on a header (many go to the RAM chip), and they are not necessarily pinned out in a logical numeric order. I haven’t specifically tested it yet, but the same method should be usable to unobtrusively map out the location of on-chip modules (core/ALU, voltage regulators, AES engines, etc.) that can be exercised individually.
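The loop itself is trivial. Here is a sketch assuming the Silabs emlib GPIO API; the pin list is illustrative:

#include <stdint.h>
#include <stddef.h>
#include "em_gpio.h"

/* With the header-accessible GPIOs shorted to ground, drive each pin high
   in turn; the active pin driver dumps power into the short and its spot
   on the die lights up on the thermal camera. */
static const struct { GPIO_Port_TypeDef port; unsigned int pin; } pins[] = {
  { gpioPortA, 0 }, { gpioPortA, 1 }, { gpioPortC, 4 }, /* ...whatever is on headers... */
};

void map_pin_drivers(void)
{
  for (size_t i = 0; i < sizeof(pins) / sizeof(pins[0]); i++)
  {
    GPIO_PinModeSet(pins[i].port, pins[i].pin, gpioModePushPull, 1); /* drive high, into the short */
    for (volatile uint32_t d = 0; d < 1000000; d++) { }              /* dwell for the camera */
    GPIO_PinModeSet(pins[i].port, pins[i].pin, gpioModeDisabled, 0); /* release the pin */
  }
}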

<pipedream>The day when thermal imaging gets good enough we can use IR attacks instead of power analysis to figure out what a chip is doing (encryption keys, etc.) without decapping it…</pipedream>

Notes To Myself: EFM32 and heaps of external SRAM

The goal: use the EFM32 microcontroller’s External Bus Interface (EBI) to hang a large external SRAM off the chip and work with data larger than the internal memory will allow, supporting dynamic memory allocation via the standard malloc()/calloc() calls probably present in whatever 3rd-party code-snarfed-from-the-internet you are trying to integrate.

First off, ignore any notes about needing to ground the 0th address bit on the memory and shift the remaining address lines, as stated in the EFM32 appnotes/manuals. Unless very explicitly stated otherwise, 1 address increment == 1 address change at the memory’s word size. For example, changing A[0] on a 16-bit SRAM generally addresses the next 16-bit memory location.
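In C terms, with a 16-bit SRAM on the default EBI mapping (addresses and data made up):

#include <stdint.h>

volatile uint16_t *exram = (volatile uint16_t *)0x80000000; /* default EBI base */

void addressing_demo(void)
{
  exram[0] = 0xAAAA; /* first 16-bit word: external A[0] = 0 */
  exram[1] = 0x5555; /* next 16-bit word: external A[0] = 1, no manual shifting needed */
}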

Sidenote about external memory address lines: If they are actually numbered in the RAM’s datasheet, this is an extremely polite suggestion only. In practice, it doesn’t matter if A[0..n] from the MCU map to A[0..n] of the memory in order; if the address lines are swapped around, they are swapped around for both read and write, so it doesn’t matter one bit (har!) to the MCU. Incidentally, same goes for the data lines. So feel free to run them however makes the PCB routing easier.

Setting up the heap in external memory:
You probably want bulk data to go to the external RAM, but your stack and most of your code’s internal housekeeping in the internal memory, which is faster and likely eats less juice. Especially if that code is using malloc() and friends to access that memory, this means creating the heap in external RAM.

The EFM32’s internal RAM starts at 0x20000000. Unless you do something funky, memory on the EBI maps in starting at 0x80000000.

Step 1: Linker has to know about the external memory.
This means tweaks to the vendor-supplied linker file (*.ld) to…

a) Tell it about the memory:
FLASH (rx) : ORIGIN = 0x00000000, LENGTH = 262144
RAM (rwx) : ORIGIN = 0x20000000, LENGTH = 32768
EXRAM (rwx) : ORIGIN = 0x80000000, LENGTH = 0x00200000
/* Add the EXRAM line above. Don't touch the CPU-specific FLASH/RAM base address or length from the original linker file.*/

b) Tell it to place the heap there:

.heap :
{
  __end__ = .;
  end = __end__;
  _end = __end__;
  *(.heap*)
  __HeapLimit = .;
} > EXRAM /* Change 'RAM' to the 'EXRAM' region you just defined */

BUT… As mentioned above, the external RAM has a much higher physical address than the internal RAM. This will confuse a check later in the vendor linker file, which assumes all memory is allocated in the same segment and that the stack (allocated starting from the end of RAM, growing downward) is thus the highest RAM address anywhere. Since this is no longer true, the check needs to be modified so as not to generate a false stack-collision warning:

Change this line

/* ASSERT(__StackLimit >= __HeapLimit, "region RAM overflowed with stack") */

to this:
/* The above assumes heap will always be at the top of (same) RAM section. Since it's now in its own section, simply check that the STACK did not overflow RAM. This modified check assumes the '.bss' section is the last one allocated (i.e. highest non-stack allocation) in main RAM. */
ASSERT(__StackLimit >= __bss_end__, "region RAM overflowed with stack")

Step 2: Tell the Compiler.
Now that we’ve told the linker, we need to tell the compiler/assembler. If you just build the code now, you will get a heap starting at 0x80000000 as expected, but with some tiny default size chosen by the vendor. This magic value is defined in the ‘startup_efm32wg.S’ (or part-specific equivalent) file buried in the SDK. This will be at e.g. “X:\path\to\SDK\version\developer\sdks\efm32\v2\Device\SiliconLabs\EFM32WG\Source\GCC\startup_efm32wg.S” . What’s the difference between the ‘.S’ file here (uppercase S) and the ‘.s’ (lowercase s) file located in ‘g++’? Don’t ask me. What’s the difference between either of these in /Device/SiliconLabs vs. the same files in /Device/EnergyMicro ? Don’t ask me. There are also compiler-specific variants (Atollic, etc.) and an ‘ARM’ version. Don’t ask me…

Anyway, once you figure out which one your project is actually using, open it and you should find a line like:
.section .heap
.align 3
#ifdef __HEAP_SIZE
.equ Heap_Size, __HEAP_SIZE
#else
.equ Heap_Size, 0xC00
#endif

The specifics might vary depending on your exact CPU and its memory size, of course (I assume the vendor selects a larger default value for parts with more internal memory, but I could be wrong). So we just have to define __HEAP_SIZE somewhere and Bob’s your uncle, right?

Er, sort of. There are two nuances to notice, in case your situation slightly differs from mine. One is that the double underscore before HEAP_SIZE looks like a standard compiler-added decoration (i.e. name mangling). Does the compiler expect you to supply the mangled, unmangled or some semi-mangled version of this name? The other is that the ‘.s’ (or ‘.S’) file is an assembler file, not a C file. So in this case you actually need to pass the magic value to the assembler, not the compiler (and beware that the two may in fact have different name mangling conventions). What a mess!

I figured the easiest way to find out exactly what was expected was experimentally. If using the Simplicity Studio GUI/IDE, you can mousedance your way into Project -> Properties -> Settings -> Tool Settings -> toolname -> Symbols -> Defined symbols and add the symbol definitions there. So I created six versions in total: all three mangling permutations (HEAP_SIZE, _HEAP_SIZE and __HEAP_SIZE) for both the assembler and the compiler, with a different size value for each, then fished in the .map file after compilation to see which one ‘took’. In my particular case, it was the version passed to the assembler, with the fully mangled (double-underscore) name. YMMV. Are there any cases where it must be passed to both the compiler and the assembler? Don’t ask me. When you find out which one your particular setup expects, set its value to match the external memory size and delete the extra definitions.

Step 3: Fix any remaining braindead checks.
When using dynamic memory allocation (malloc() and friends), they (usually, probably) call a deep internal library function called _sbrk. Among other things, this function performs a check similar to the one we just fixed in the EFM32 linker file, failing nastily if it ever allocates heap memory with a higher address than the lowest stack allocation (at least in GCC). So to get around this, you have to override the builtin _sbrk with a fixed copy. If you are using the vendor’s ‘retargetio.c’ for anything (e.g. delivering printf output to the SWO debug pin), this file already redefines a bunch of internal functions including _sbrk. Failing that, is ‘just’ creating a function any-old-place with the same name sufficient to reliably override the internal function in all cases? Don’t ask me.

The vendor-supplied copy in retargetio.c looks like the below. Here I’ve modified it crudely to just remove the check entirely. In my case, the external RAM contains only the heap and nothing else, so this should be OK.

caddr_t _sbrk(int incr)
{
  static char *heap_end;
  char *prev_heap_end;
  static const char heaperr[] = "Heap and stack collision\n";

  if (heap_end == 0)
  {
    heap_end = &_end;
  }
  prev_heap_end = heap_end;
  // HACK HACK HACK: This check assumes stack and heap in same memory segment; remove it...
  //if ((heap_end + incr) > (char*) __get_MSP())
  //{
  //  _write(fileno(stdout), heaperr, strlen(heaperr));
  //  exit(1);
  //}
  (void) heaperr; // silence unused-variable warning now that the check is gone
  heap_end += incr;
  return (caddr_t) prev_heap_end;
}

Now your malloc() calls should stop failing! After performing the above steps, I was able to get a ‘complex’ piece of code with dynamic memory allocation (the SHINE mp3 encoder) running on an EFM32 microcontroller, with a few changes to be reported soon…

BONUS: SHINE particulars:
The encodeSideInfo() function in l3bitstream.c appears to build the mp3 header incorrectly. Try…

//shine_putbits( &config->bs, 0x7ff, 11 ); // wrong: 11-bit syncword
shine_putbits( &config->bs, 0xfff, 12 ); // right: 12-bit syncword
//shine_putbits( &config->bs, config->mpeg.version, 2 ); // wrong: 2-bit version field
shine_putbits( &config->bs, 1, 1 ); // right: 1-bit version ID ('1' = MPEG-1)

It also seems to fail outright (generate incorrect, unplayable bitstreams) for certain input files, depending (probably) on mono vs. stereo and/or bitrate. A stereo .wav file (PCM 16-bit signed LE) at 44100Hz worked.

Fun with 3D Printing: Print a Parametric Peristaltic Pump

So, I’ve been playing around with the Lulzbot we got at work. Inspired by emmett’s sweet planetary gear bearing design, I adapted the design to be not a bearing but a peristaltic pump. Like the original bearing design, the pump prints as a single piece – no assembly required! – with captive rollers and no rotary bearing/wear surfaces. The only extra part needed is a length of surgical tubing to thread through the mechanism. This initial print is a proof-of-concept and is driven by a standard 1/4″(?) hex nut driver or Allen key: for a real application, you’d want to add a mount for an electric motor or similar. This one (or one like it) will probably end up attached to a small gearmotor and solar panel in my greenhouse to slowly trickle water through an aquaponics tower.

Printable Peristaltic Pump with captive rollers and minimal wear surfaces

Pump with latex surgical tubing installed

The pump design is written in OpenSCAD and pretty much fully parametric: the desired diameter, height, tubing geometry and a few other parameters can be tweaked as needed. There are a couple warts I’ll discuss later on.

You can download the OpenSCAD file here.

Here is a video of the pump in operation.

Peristaltic pumps operate on the same principle as your esophagus and intestines (yes, really – yuck…) – a squishy length of hose is squeezed starting from one end and ending at the other, forcing any contents along for the ride. This type of pump has several properties that make it useful in certain applications:

  • Self-priming – can pump air or fluid reasonably well
  • Able to pump viscous, chunky or otherwise particulate-filled liquids that would gum up or damage an impeller pump
  • Gives great head – Ehhem… “head” refers to the maximum height the pump can push fluid. For a comparable energy input, a peristaltic pump can generally push fluid up much larger elevation gains than typical impeller types. Flowrate is another story of course.
  • Precise volume delivery – the amount of fluid (or air, etc.) dispensed per rotation of the motor is much more predictable than with an impeller pump. Using a servo or stepper motor, the volume pumped can be very accurately controlled. For this reason, peristaltic pumps are commonly used in medical equipment to meter out IV fluids, handle body fluids or dispense drugs.
  • Corrosion-free, isolated fluid path – Also of great relevance to medical applications, the fluid makes contact only with the tubing, making it very self-contained and minimizing the risk of contamination – e.g. all the nooks and crannies where foreign matter and bacteria could hide in other pumps. Very important when pumping bodily fluids out of someone and then back in (e.g. dialysis). Likewise, if your pump guts were metal, pumping corrosive fluids would be OK, as the two never touch.

I really can’t stress the medical angle enough: in a hospital setting, peristaltic pumps are everywhere. Being able to print them off for practically free is huge.

Of course they are not without drawbacks; among them are fairly low flowrates, often “spurty” output, added friction losses and finite tubing life.

The pump prints out pretty much preassembled, but you still have to supply the tubing. Latex or Tygon surgical tubing is ideal, but most any pliable tubing (PVC fishpump tubing, etc.) can be used. To install, poke the tubing into one of the holes on the side of the mechanism (move the rollers if necessary so it is not blocked), pull through the desired amount of slack, then slowly advance the rollers to push the tubing up against the inner wall. When it reaches the other hole, push through and pull out any remaining slack. Note, the design is symmetric, so the concept of “inlet” and “outlet” port just depends on which direction you turn the rollers.

Design Considerations:

The diameter and wall thickness of the tubing dictate the pump geometry to some degree: the rollers and corresponding track must be wide enough to accommodate the tubing’s width when squished flat, and the clearance between the two must be enough to squeeze it flat without applying excessive force. This can be adjusted via the tubing_squish_ratio variable. The pump shown used a value of 0.5 with good results, but if you don’t need excessive pressure/head, lower values should work fine and reduce friction.

In general, a larger overall pump diameter will minimize wear on the tubing.

When using an FDM (plastic-extruding) printer, crazy overhangs in the geometry can’t be printed without support material (which defeats the purpose of a print-n-play design). The parameter allowed_overhang controls the level of overhang in the output based on what your printer can print, between 0 (no overhang whatsoever) and 1 (“infinite”, i.e. 90-degree overhang). Of course a ‘0’ setting is not very practical. 45-60 degree overhang should be OK for most FDM printers (I used a raw value of 0.75 for this one).

Warts / Future Improvements:

In the current version, the final OD will actually be slightly larger than the value you enter (specifically, by the calculated squished tubing thickness). This is a result of laziness on my part; keep it in mind, or fix it if you need a very exact OD on the outer ring.

When operating at high speed, I’ve noticed the tubing sometimes has a tendency to slowly “walk” opposite the direction of travel, being gradually pulled through the pump. A compression/baffle feature at the inlet and outlet would help prevent this by friction-locking the tubing in place. Alternatively, it could probably just be fixed in place with a bit of glue.

Tim Tears It Apart: Kidde KN-COB-B Carbon Monoxide Alarm

Of course it happens this way: stuff works for you, but breaks as soon as you have guests and drives them crazy. In this case, the missus and I were out of the house having a baby and her folks were in to hold down the fort. A carbon monoxide detector had failed in the most irritating possible way, emitting a very short low-battery chirp just often enough to drive everyone batty, but intermittently enough to be very time-consuming to track down. My poor father-in-law eventually managed to find the source of the racket, and changed the batteries.

The chirping continued.

He then trashed those batteries and put in another set of fresh batteries, from a new package.

The chirping continued.

And then he took the damn thing off the ceiling and removed the batteries for good.

The chirping continued.

Oh yeah, it turns out that not one, but TWO detectors had failed simultaneously. And not for want of batteries, either.

It turns out the detector elements in most modern CO detectors have an indeterminate-but-finite lifespan, and the units are programmed to self-destruct when their time’s up. The actual sensor lifespan depends on the usual factors like operating temperature, humidity, CO exposure, etc., but most manufacturers take the easy way out and simply define a conservative time value after which it may need replacing. In this case, it is 7 years. (I bought the house about 7 years ago, hmmm…)

Self-destruct timer disclaimer on back of detector

Although design-to-fail schemes are occasionally on legally shaky ground, this product-death-timer is actually required by UL for CO detector products whose detector has a limited lifespan (which is most of them).

While they still power-on and blink (it’s not clear if the timer expiration also explicitly disables CO detection, but the labeling on the back suggests so), these units are basically landfill fodder now. I think you know what that means…

Front of detector with battery door removed. The marking indicating the direction to pull to release it from the nails in the ceiling is NOT factory stock :-)

Top side of PCB

Top side of PCB with piezo horn removed

Bottom side of PCB

Main parts:
CPU: PIC16LCE625 – One-time programmable 8-bit microcontroller with 2k ROM / 128 byte RAM, 128 byte EEPROM.

MCP6042I/P – Dual low-power (0.6uA) opamp – guard ring attached to pin 7

LM385-1.2 (package marking 385B12) – 1.2V voltage reference with minimum operating current of 15uA.

Noisemaker: Ningbo East Electronics EFM-290ED piezoelectric horn claiming 90dB(A) sound output @ 9V/10mA @ 30cm.
Has GND, main and feedback connection.

Ningbo East Electronics ELB-38 or ELB-74 (?) – 3-terminal inductor (autotransformer) generating a stepped-up AC voltage to drive the horn.

A scattering of bog-standard transistors (2n3904/3906) rounds out the silicon ensemble.

The detector is a large metal cylinder marked with a Kidde part number and has a silica gel (desiccant) package shrink-wrapped to the front of the detection end. The detector is soldered to the board and not replaceable.

Business end of CO sensor showing silica gel desiccant covering aperture

Some points of interest:

Idiot resistance: One thing to notice even before taking the unit apart is the little red spring-loaded tabs underneath each battery socket. I couldn’t find anything on the purpose of these in a quick web search, but my guess is they are there to block you from putting the battery door back on with no batteries in, e.g. after pulling them to silence a chirping alarm at 3am and then forgetting to put new ones in.

Horn drive: Piezo horns are resonant systems with a very high Q; they must be driven at resonance to produce anywhere near their maximum sound output. However, due to manufacturing tolerances the exact resonant frequency may differ significantly between individual units. Another issue for this device is that piezo horns need comparatively high voltages to operate: this one has a rated voltage of 9V, but can probably go a fair amount higher (>100V drive signals for larger piezo sounders are not uncommon). But, the 3x AA batteries in this device can deliver a maximum of only ~4.5V. The self-resonant oscillator formed by Q2 and L1 efficiently solves both problems. The ‘feedback’ pin connects to a small patch of piezo material on the horn that acts as a sensor, translating deflection to voltage (more or less). Using this as the control signal for a simple oscillator allows it to automatically pull in to the piezo’s resonant frequency. The autotransformer coil, L1, is basically a step-up transformer with one end of its primary and secondary windings tied together and connecting to the 2nd pin. (You can think of it as a single winding with an asymmetric center-tap if you prefer.)

Detector analog frontend: The FR4 material the PCB is made of is a pretty good insulator, but its resistance is not infinite. With sensitive high-impedance signals in the tens of megohms or more, even the tiny leakage currents across the PCB can induce a measurement error – especially when dust, finger oils from manufacture, other residue and humidity from the air combine on the surface. Notice the exposed silver trace that completely circumscribes the PCB area occupied by the sensor, with its green soldermask covering purposely omitted. This is almost certainly a guard ring, intended to intercept such PCB leakage currents before they reach the connection points of the chemical CO sensor. The trace will be attached to a low-impedance circuit node whose voltage is as close to the sensor terminal voltage as possible, minimizing the voltage difference between them, and thus the current that can leak across. Here, the trace is tied to pin 7 of the opamp.

Closeup of guard ring trace surrounding analog frontend

End-of-life-lockout: As mentioned previously, this device is programmed to commit suicide after 7 years. There is no battery backup inside the device, nor any discrete realtime clock or other means of telling the time. How does it know when 7 years have elapsed? The CPU is clocked by a 32.768KHz crystal oscillator, otherwise known just as a “watch crystal” due to their ubiquitous use in watches, clocks and other timekeeping applications. While running the CPU at such a low speed also has certain power advantages relevant to a battery-powered system, this crystal is providing an accurate timebase. Needless to say, it is counting 7 years of power-on time, not wall time (even if it sat on the shelf quite a while, your alarm will not be dead and chirping the moment you remove it from the package). The CPU sports 128 bytes of EEPROM, which are used to store the peak CO reading (over the product’s lifetime or since the last alarm; not sure which) and most likely periodically count down its remaining lifetime. Basic operation of a CO detector is to stick batteries in and forget about it (unexpected powercycles will be infrequent), so the timekeeping can be very coarse, e.g. decrementing a couple-byte EEPROM countdown every time a very long counter rolls over some preprogrammed value.
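Pure speculation on my part, but the countdown plausibly looks something like this sketch (helper names hypothetical; this is not disassembled from the actual firmware):

#include <stdint.h>

/* Hypothetical helpers; the real chip memory-maps an internal I2C EEPROM. */
extern uint16_t eeprom_read_u16(uint8_t addr);
extern void     eeprom_write_u16(uint8_t addr, uint16_t val);
extern void     enter_end_of_life_mode(void);
#define LIFE_ADDR            0x10      /* made-up EEPROM location */
#define SECONDS_PER_TICKDOWN 604800UL  /* one EEPROM write per week of powered time */

void one_second_tick(void) /* e.g. from a 32.768kHz-derived timer interrupt */
{
  static uint32_t secs;
  if (++secs < SECONDS_PER_TICKDOWN)
    return;                /* EEPROM untouched the vast majority of the time */
  secs = 0;
  uint16_t weeks_left = eeprom_read_u16(LIFE_ADDR); /* ~365 weeks = 7 years */
  if (weeks_left > 0)
    eeprom_write_u16(LIFE_ADDR, weeks_left - 1);
  else
    enter_end_of_life_mode(); /* the dreaded chirp */
}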

I pulled the CPU, hooked it up to an ancient PIC programmer and tried dumping the firmware to see exactly how this worked, just in case they had left it unprotected, but no such luck. The code protect fuses are all set and readout attempts return all 0s. The EEPROM in this particular chip is actually implemented as a separate I2C “part”, either on the same die or a separate die copackaged with the CPU die, with the two I2C control pins and a power control line memory-mapped into a register. So there is no access to the EEPROM contents through a PIC programmer either.

Enclosure: At first glance, it’s about what you expect from a low cost consumer product that is designed to be thrown away periodically. There is not a screw to be found anywhere – everything, from the PCB to the enclosure halves themselves, clicks together via little plastic tabs. But wait a minute… hold this up to the light just right, and you can see hand-finishing marks where extra plastic (e.g. overmold) from the injection molding process has been filed or sanded off. On the *inside* of the enclosure, where nobody will see it! And yes, these marks appear to be from work applied to the finished enclosure itself, not the master mold it came from – the sanded portions go slightly in, not out.

Manual finishing marks on inside of plastic enclosure

Hidden Features: There are a few hidden features suggesting this same PCB, CPU and firmware are used for several models of alarm, including a fancier one. The most obvious is a non-stuffed footprint for another pushbutton switch, marked ‘PEAK’. When pressed, it causes the green test LED to flash a number of times in a row (presumably corresponding to the peak CO level ever measured by this detector – my 2 dead units show 9 and 10 blinks, respectively). Near the center of the board is a non-stuffed 6-pin header, with the outer two being power & ground, and the middle four signals going straight to CPU pins. Scoping these reveals unidirectional SPI signalling on 3 of the pins (CS\, CLK, DATA) that would probably drive an LCD readout on a more expensive version of this detector. Capturing the data in various modes doesn’t produce any obvious pattern (e.g. ASCII, numeric, BCD or raw 7-segment data). Finally, there are two mystery pads on the back of the PCB. Shorting them causes both the alarm and test LEDs to light, and the green LED to produce 5 extremely rapid blinks every few seconds. Doing this does not reset the timer-of-death, clear the PEAK reading or have any other long-term effects that I can ascertain. Both the PEAK switch and mystery jumper noticeably change the data pattern sent to the nonexistent LCD.

BUT… I did find a sequence of inputs that put the detector into some kind of trick mode permanently (persisting across powercycles). I believe the exact sequence of events that triggered it was to have S2 shorted at powerup, then short PEAK once the blinking sequence starts. It’s not clear if S2 must remain shorted during this time or only at powerup. The unit this sequence occurred on is now permanently in a mode where it emits long, repeating rapid blink sequences on the green LED (red lit continuously) and draws some 40mA continuously. The repeating sequence is 1 (pause) 63 (pause) 68 (pause) 24 (pause) 10 (last blink is longer) (pause) 21 (pause) 82 (pause) 82 (pause) 14 (long pause).

It’s official – I have spawn!

That day I never thought would come, and a younger me once *hoped* would never come…has come!

Our first spawn, Max Charles G, was born 7/26/2014 at around 6:40am.

Glad that’s over!

I kid, but seriously, the hospital part is the only real sucky part (for us, at least – Max is pretty chill). The part where you’ve both already been awake for a day and a half, the Mrs. is just all sorts of tore up, and the Mr. is camped out on this “thing” pretending to be a chair pretending to be a fold-out bed and failing outright at both. In other words, a medieval sleep-deprivation torture experiment of some kind. And then comes this tiny human that starts a one-sided screaming match every couple hours while you newbs haven’t more than a few books and a Google’s worth of a clue what to do about it. And about the second or third day of this you lay there with the tiny human in your arm, rubbing your eyes, thinking: Oh shit. This is what our lives will be now.

(Sometime on day 2-or-something of hospital)
T: I don’t even know what day this is.
K: It’s Monday.
T: It FEELS like a monday.
K: Get ready, every day’s going to feel like a Monday.

But… then you go home, sleep in your own bed, start getting the hang of all this feeding and changing business, and find out: those first several days are a fluke, and hey, this ain’t so bad after all! In fact, much like marriage for a dude, either I won the wife/baby lottery or the hype is BS: this is turning out to be much better than I expected.

Ah yes, and the tiny human is growing on me. I wasn’t expecting that either.

Anyway, here he is. If you don’t give a flying frog about pictures of Other Peoples’ Kids, you should probably look away now.

Max’s first day…with that tasty, tasty hand

That’s my boy.

Either a yawn or an audition

Max and the proud parents

Notes To Myself: Fix for Windows 7 can’t delete file/folder on network drive (“in use”)

Problem: When trying to delete or rename a folder, typically on a network drive, Windows 7 reports the action can’t be performed because a file is in use, even when you definitely don’t have any files open in that folder (or even have a subfolder displayed in another Explorer window), and haven’t for quite some time. Typical error message popup:

“The action can’t be completed because the file is open in Windows Explorer. Close the file and try again.”

Apparently it is a longstanding bug in Windows Explorer (which M$ has known about but will not fix!) where Windows creates hidden files (thumbs.db) to cache image thumbnails, but sometimes forgets to close them.

Workaround: Disable thumbnail caching:

  • Run ‘gpedit.msc’ (Click Start -> Run, type gpedit.msc in the search box and hit enter)
  • Drill down to User Configuration -> Administrative Templates -> Windows Components -> Windows Explorer
  • Highlight “Turn off the caching of thumbnails in hidden thumbs.db files”. Right-click this entry and choose ‘Edit’, and then enable this setting.

You probably have to reboot for this to take effect (mainly to clear any existing thumbs.db files that are already locked open). Don’t fiddle with any other gpedit settings.

It may also help to disable thumbnails on network drives entirely – folders with images will display much faster! To do this, enable the setting named “Turn off the display of thumbnails and only display icons on network drives” in the same location. Note there are two similarly named options (one omits the “…on network drives” part), so be sure to select the one you want.

This fix comes from a rather lengthy exchange about the bug on Microsoft’s forums.

How to Fragment Your File System

Here is a tiny little python script to generate file system fragmentation.

“But Tiiiiiiim! Tools are supposed to defragment your filesystem! Why would you ever want a script to fragment one?”

In one of the gadgets I’m working on, I had a need to evaluate disk (well, memory card) performance in real-world and worst-case scenarios. If you are sampling high-speed data with a puny microcontroller, you cannot afford your disk going into lalaland while your puny buffer RAM runneth over. While file fragmentation is – in theory – not a big deal for Flash media as it is for spinning rust drives (no mechanical heads to reposition), your filesystem driver still needs to grovel through the filesystem to find the next free block to write. In a typical implementation, writing to a FAT filesystem with a giant file in the middle of it could incur a significant write hiccup as the FS driver encounters the file and has to seek through its entire FAT chain, potentially fetching and parsing numerous sectors of allocation data before finally finding a free cluster for the next write. This script allows testing of such scenarios.
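To make the failure mode concrete, here is a sketch of the naive next-free-cluster scan that causes the hiccup (names hypothetical, not from any particular FAT driver):

#include <stdint.h>

#define NO_FREE_CLUSTER 0xFFFFFFFFu

extern uint32_t total_clusters;
extern uint32_t fat_entry(uint32_t cluster); /* hypothetical: may fetch and parse a FAT sector */

uint32_t find_free_cluster(uint32_t start)
{
  /* Every occupied entry skipped can cost a sector read; a giant file's
     clusters sit here as one long run of nonzero entries to grind past. */
  for (uint32_t c = start; c < total_clusters; c++)
    if (fat_entry(c) == 0) /* 0 = free */
      return c;
  return NO_FREE_CLUSTER;
}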

What it does:

Give it the name of a disk to fragment, and it will begin creating files full of junk data on the disk until it receives a write error (normally indicating the disk is full). It will then delete a random subset of these files, leaving free-space holes scattered throughout the disk. For most filesystems and OSes, the freespace will not be automatically consolidated, and will remain fragmented until the remaining files are deleted (or the disk formatted, etc.) or a defragmentation utility is run. You can then evaluate the performance of your (software, device, etc.) on this disk on a realistic simulation of a well-used drive.

Configuration options:

path – set this to the directory to generate the fragments in. On FAT filesystems, it is necessary to use a subdirectory and not the root directory, due to a limitation on the number of files that can be stored in the root directory.

filldensity – this value, ranging from 0.0 to 1.0, sets the percentage of junk files that remain at the end of the operation. A higher value means more files left behind, i.e. fewer freespace gaps.

minfragsize, maxfragsize – this sets the size range of junk files to create. The size of each file created will be selected at random from this range.


I only tested this on a Windows PC, for reasonable file sizes (MB, not GB) and card sizes (a few GB). If your device’s size is measured in rooms, gigaquads or Libraries of Congress, it may not work, or your device may be obsolete by the time it finishes. The “junk” to make the junk files is stored in RAM out of laziness; you probably want to fix this if making multi-GB junkfiles.

This script was written to test a FAT-based device. Not all filesystems respond the same way to fragmentation, so YMMV on other filesystem types.

This emulates fragmentation only. Many other factors could affect your embedded Flash media performance, such as Flash cell wear (aka hot count, or total number of write/erase cycles performed), write amplification, operating temperature and/or voltage (depending on the memory technology and controller), phase of the moon, etc. This script does not emulate any of these other factors. On the bright side, it should be a more faithful test for other memory technologies, e.g. FRAM/MRAM, that are fast and relatively insensitive to cell wear, and will better reflect software delays due to filesystem parsing.

Notes To Myself: ‘Paste Plain Text’ keyboard shortcut/macro for Excel

Very common need: Copy some data into an Excel cell from an arbitrary other source (including another Excel sheet, or webpage, etc.). In the process, strip any external formatting, HTML tables, etc. with extreme vengeance and only paste the plain text.

Traditional way: Mouse fandango (Excel 2013: Home -> Paste -> Paste Special…->Text->OK) for every time you want to do this.

Better: Create “PastePlainEffingText” macro activated by a nice fast keyboard shortcut equivalent to Ctrl-V. Store this macro persistently in the Excel “Personal Workbook”, not the currently open document, so it is available in any open document.

1) View -> Unhide -> Personal etc. (The ‘Personal’ workbook is hidden by default. Attempting to save a macro to it generates an extremely helpful message saying to use the ‘unhide’ option, without giving the option to just do this, nor telling you where this setting is.)
2) View -> Record Macro
3) Mousedance as above (Paste Special etc.)
4) View -> Stop Recording
5) Assign keyboard shortcut. I just assigned it to “Ctrl-B” since it’s right next to Ctrl-V. This means I can no longer Ctrl-B to make text bold, but for the once-a-year I’d actually want to do this, it’s a plenty acceptable tradeoff.
6) Optional: Re-hide the “Personal” workbook.


When assigning the keyboard shortcut, the “Ctrl” portion is mandatory and cannot be changed. Excel will automatically insert a ‘Shift’ in addition to this if you happened to type an uppercase letter in the sole letter box provided (the way keyboard shortcuts are usually represented in text). This is somewhat unintuitive and does not mean Excel is blocking you from overwriting an existing shortcut – just change the letter to lowercase and it’ll go away. There is no warning if you do overwrite an existing shortcut, so you’ll have to check on this yourself.

At the time of this writing, Excel does not allow writing an actual macro (code) in the Personal Workbook directly. You have to ‘Record Macro’ and physically do whatever action/mousedance to initially generate the equivalent code. But once this is done, you can edit the actual code. To write/paste an arbitrary code macro, you can probably just “Record Macro” some trivial dummy operation (paste some text, etc.) then just replace the autogenerated code with your own.

The equivalent code for this macro is:

Sub PastePlainEffingText()
' PastePlainEffingText Macro
' Strip formatting when pasting buffer contents
    ActiveSheet.PasteSpecial Format:="Text", Link:=False, DisplayAsIcon:= _
        False
End Sub

Notes To Myself: J-Link SWOViewer with Silabs/EnergyMicro EFM32 CPUs

The EFM32 SWO port operates from a 14MHz timebase regardless of the current CPU frequency. Autodetection of actual frequency is feasible, but irrelevant. Manually specify 14MHz for “CPUFreq” in SWO Viewer. The corresponding calculated SWOFreq should be 900KHz. Tested and working as of SWOViewer version 4.84f.
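For reference, the target-side SWO enable looks something like the following (register names per the usual Silabs examples for EFM32WG-class parts; verify against your part’s reference manual):

#include "em_device.h"

void swo_setup(void)
{
  CMU->HFPERCLKEN0 |= CMU_HFPERCLKEN0_GPIO; /* clock the GPIO block */
  GPIO->ROUTE |= GPIO_ROUTE_SWOPEN;         /* route SWO onto its pin */
  CMU->OSCENCMD = CMU_OSCENCMD_AUXHFRCOEN;  /* the AUXHFRCO (14MHz) clocks the SWO
                                               logic, hence the fixed 14MHz above */
}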

Notes To Myself: Fixing TortoiseCVS breakage (permissions, crashes, missing overlays) on Windows 7 64-bit

Problem 1) TortoisePlink.exe crashes when attempting CVS operations.

Win7 throws the error message “A problem caused this program to stop working correctly” (gee, thanks, that’s a most helpful crash dump) and checks The Interclouds for solutions (finding none). Groveling down to the actual crash report (Control Panel -> Administrative Tools -> Event Viewer -> Windows Logs -> Application, scrolllll down to the most recent relevant “Error” entry, and bathe your mouse-clicking finger in icewater) reveals:

Faulting application name: TortoisePlink.exe, version:, time stamp: 0x4d3d6cef
Faulting module name: MSVCR90.dll, version: 9.0.30729.4940, time stamp: 0x4ca2ef57
Exception code: 0xc0000417
Fault offset: 0x00051380
Faulting process id: 0xfc4
Faulting application start time: 0x01cf7b616bc5e4c9
Faulting application path: C:\Program Files\TortoiseCVS\TortoisePlink.exe
Faulting module path: C:\Windows\WinSxS\x86_microsoft.vc90.crt_1fc8b3b9a1e18e3b_9.0.30729.4940_none_50916076bcb9a742\MSVCR90.dll
Report Id: ab10ecd7-e754-11e3-aa78-b8ca3abe82c0

Solution: At the time of this writing, the version of TortoisePlink that comes with TortoiseCVS is several years old; even for the experimental “new” (2012) RC1 build, the datestamp claims 2011 and the filesize is 200-some KB. A related project, TortoiseSVN, ships a much newer version (400-some KB; datestamp claims 4/2014). Unfortunately I found no trustworthy places to download a standalone copy. So: download and install TortoiseSVN, copy-pasta its TortoisePlink.exe over the copy in TortoiseCVS, and then uninstall TortoiseSVN if you like.

Alternate solution: TortoiseCVS now has internal SSH support. If you don’t need to pass any external arguments to the SSH stuff (e.g. the “avoid re-entering password” trick (-pw mypassword)) or use the SSH-keypair-in-place-of-password thing, you can go into all your ‘Root’ files (inside the hidden .CVS directories added all over the place) and change every occurrence of :ext: to :ssh: , which will use the internal support instead of fobbing it off to the crashing TortoisePlink. Note that you will have to do this for EVERY. SINGLE. FILE.

Problem 2) Permission Denied error when trying to “CVS Commit” and possibly other operations.

Some other operations (“CVS Diff”) might still work. Example error message:

In P:\WVR_RIF\04_Design\Electronic\Software\wvr_workspace\wvr_navy_v1: “C:\Program Files
(x86)\CVSNT\cvs.exe” commit -M .cproject

cvs [commit aborted]: cannot open file .cproject for comparing: Permission denied
cvs commit: Committed on the Free edition of March Hare Software CVSNT Client
Upgrade to CVS Suite for more features and support:

Error, CVS operation failed

My own repositories happen to be on a network drive (my employer’s setup), so I don’t know if this error is unique to this situation.

Solution: This error seems to have been introduced in a more recent version. The solution is similar to that above, except you need to downgrade to a version without the bug. TortoiseCVS actually ships with two separate collections of programs, TortoiseCVS proper (32- and 64-bit) and a separate “CVSNT” (32-bit only, at least the version that comes with TortoiseCVS), which does some of the underlying dirty work. The bug is in the “CVSNT” portion of this matryoshka. I don’t know the exact version where the bug was introduced, but copying the version from my old PC (cvs.exe dated 7/5/2006; identifying as “cvsnt 2.5.03 (Scorpio) Build 2382”, and the rest of the folder) did the trick.

Sidenote: Notice also that recent versions accompany this specific error message with a smarmy note about updating to a paid version for “support”. Indeed, TortoiseCVS appears to be somewhat abandoned in favor of the paid/professional “CVSNT” by the same author. Makes one wonder…

Problem 3) File/folder icon overlays do not appear, or only appear sometimes but not always (e.g. every other reboot).

Solution: Windows Explorer provides a limited number of ‘slots’ (16 to be exact?) for programs to define icon overlays. In Win7 x64 (at least), about a half-dozen of these are eaten up for “SkyDrive”, Microsoft’s foray into cloud file hosting. (What, you did not voluntarily install SkyDrive, and possibly never even heard of it? Welcome to the club.) Anyway, to fix:

Open registry editor and navigate to HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\Explorer\ShellIconOverlayIdentifiers . Now start nuking entries that seem least likely to be useful to you (SkyDrive, Offline Files, …) until the total is down to 16 or less.

Note, if you’ve done any version mix-n-match and/or reinstalled TortoiseCVS (I’m not sure exactly what triggers it), you may have a bunch of obsolete entries in there from Tortoise itself. For example, my machine has a TortoiseNormal and a 1TortoiseNormal, etc. It appears that the current version of TortoiseCVS (1.12.5 stable, 1.12.6 beta) uses the unnumbered ones – start by trying to nuke those. If this doesn’t work, just nuke ALL the Tortoise entries from orbit and then uninstall-reinstall the program “TortoiseOverlays” (may be available standalone from some other source, e.g. TortoiseSVN, or by fully uninstalling TortoiseOverlays and reinstalling TortoiseCVS, which includes it).

Problem 4) ” end of file from server (consult above messages if any)”

Solution (maybe): The hits just keep coming, don’t they? This error could mean just about anything (server side, client side, bad-behaving firewall or network appliance, sunspot activity, voodoo curse…), but one likely culprit is a crash in an external program (namely, TortoisePlink.exe) used to perform the connection. One easy thing to check is to run TortoisePlink.exe on its own (e.g. doubleclick) and see if it crashes. In my case, this threw the error:

“The program can’t start because MSVCR110.dll is missing from your computer. Try reinstalling the program to fix this problem.”

In theory, installing TortoiseCVS also installs the necessary runtimes, but somehow during the circle-jerk of uninstall-reinstall cycles to diagnose the other problems above (or some other app I installed the next day, or who knows really), this file got wiped out. Installing it from Microsoft cleared that up.

Alternate solutions: I’ve had this problem with previous TortoiseCVS installs, but the “end of file from server” message came not immediately, but only after replying to the password prompt. In this case, it was “fixed” by supplying the “-pw mypassword” argument to the external SSH tool, bypassing the password dialog (and presumably crash). Your IT folks may frown on you doing this however, since it leaves your password in cleartext on the machine.

Another thing you can try (assuming it’s a client side problem) is as above, change all the “:ext:” to “:ssh:” in all your CVSROOTs. Well, try it on ONE first and see if it fixes the problem before spending the rest of your day updating the rest of them.

Palram Mythos Greenhouse Hacks / Improvements

Palram Mythos 6×8 Greenhouse. Pretty nice overall, but could use a bit of shoring-up for longevity.

My brother-in-law and I put this together over a long afternoon. Much of that time was spent building and leveling a 4×4 frame – the actual construction went pretty smoothly.

On the other hand…

It stayed intact for about 24 hours. The very next day, a typical springtime storm rolled through with a bit of wind (the weather report claimed 30mph gusts). When I got home from work, the door side of the greenhouse was crumpled in, some of the horizontal supports bent backwards on themselves and a few twinwall panels were blowing around the yard.

The window panels are standard-ish, 4mm polycarbonate twinwall (mostly 2ft x 4ft? sections) and can be sourced easily online, but the metal structural parts are custom and replacement parts can’t be bought separately – so wrecking any is a big deal!

Anti-Flex / Anti-Fall-Apart-In-A-Stiff-Wind Fixes
This revealed the main apparent design flaw: Many of the structural components are joined together by nothing more than the friction of a bolt head – not even passed through complete holes in both parts (which would somewhat fix the parts together even if the bolt were to loosen), but often via a U-shaped notch in one or both parts, or with the bolt sliding freely in a t-slot. Major places this appears to be a problem are:

  • where the vertical rails for the walls slot into the base
  • where the upper and lower halves join together at the ends (mainly the upper bolts in the horizontal metal supports about halfway up either end)
  • where the verticals around the door bolt into the horizontal near the ceiling

Add to this the fact that many of these end bolts can be tightened only after the twinwall panels are installed (which renders the heads nearly inaccessible), plus the flimsy cross-bracing (more on that later), and you end up with a major structural problem. Each time a gust of wind hits, the top of the greenhouse sways back and forth a bit with respect to the base (the diagonal support straps simply flex). Each sway is an opportunity for the friction-held bolts to work themselves very slightly apart. Enough cycles (a day’s worth, depending on the day) are enough to separate the vertical wall rails from the base, or the bolted notches at the above-indicated spots from one another.

If you live in a breezy location, one of the best favors you can do for yourself is scrap these flimsy diagonal straps on either end in favor of some sturdy aluminum angle or U-channel stock from your nearest hardware store. One catch: I’ve only seen such stock for sale in the US in 4-ft and 8-ft lengths, while the pieces for the greenhouse are 51″. So to do it properly you’d have to get 8-ft pieces and have nearly half of each piece as scrap. Not a huge problem if you have other uses for this material, but otherwise it’s annoying. Since the lower bolt each one mates to sits in a slot in the greenhouse’s vertical rails and can slide freely, you can maybe cheat and use 4-ft lengths by not having them go all the way to the bottom. Probably still better than the straps it came with.

Original diagonal brace (left) and one cut from aluminum U extrusion. Stiffening these prevents wind gusts from rocking the greenhouse back and forth and working the bolts loose.

In addition, I found the following small tweaks very helpful in keeping the thing together:

  • Ditch that silly tube-thing that comes with the greenhouse and is supposed to act as a nut driver. Use a proper nut driver. You just can’t torque them down tight enough with that tube-thing.
  • Wherever those U-shaped notches occur on the endwall pieces, replace the standard nut with a locknut and (on the head side) lockwasher. The square-headed bolts that come with the greenhouse appear to be 1/4″, but with a non-standard thread pitch (non-standard = not what the Home Depot sells). So you may as well replace the bolt too (these end ones don’t require the square heads for anything) – preferably with the widest head you can find. Locknuts tend to have a wide flange around them…and, well, be locking. This should help them get a better grip on those U-shaped notchy bits.
  • Find, buy or fashion some thin tool you can slip between the horizontal supports and the twinwall panels to hold the bolt heads in place while you torque them down. I got extremely lucky and found a thin stamped-metal “crescent wrench” (from some Ikea furniture, I think) lying around that was a perfect fit, that I could slip in and juuust grab the edge of those square-headed bolts. You can probably fashion something using a hacksaw and any thin piece of metal (like one of those useless diagonal straps).

One final comment on this. After it blew apart the first time and things shifted a bit, I discovered the vertical members on either side of the door were now “too short” (or the ceiling assembly “too tall”) for the two to bolt together reliably anymore. On further inspection, the stamped metal base on this side seems to have “sagged”, so when the vertical wall supports were bolted to it, they no longer adequately reached the part it’s supposed to bolt to. Of course, anyone stepping or even brushing their feet against the base on the way in/out will just make this worse. To remedy, I cut some braces out of some aluminum stock I had handy and wedged them under the lip to prop it up at the edges of the doorframe.

Where important bolts pass through U-shaped notches instead of proper holes, replace the standard bolt and washer to add a lockwasher and flanged lock nut for added grip. Somehow hold the bolt head so you can tighten the everloving shit out of these.

More questionable U-notch attachments, above the door. In addition, you may find (now or in the future) that these verticals near the door have become too short to fully mate with this horizontal support near the ceiling.

To avoid the eventual “too short” problem, wedge something underneath them to prop up the lip of the base and prevent it from sagging over time.

Spare Parts
After completing assembly, I found I had at least a half-dozen square-headed bolts left over. The instructions make oblique reference to there being spares of some parts, but if I had known I’d have this many, I’d have dropped the extras down the vertical wall supports to provide extra attachment points. This could be handy to double-up the cross-brace straps along the sidewalls (if you followed the very strong recommendation above, you should have 4 spare ones now), or provide a way to hang small tools, etc.

More Windproofing
The doorhandle is pretty loose and can be easily lifted by the wind, letting the door fly open and thrash itself and everything it touches into oblivion. If you bought the accessory plant-hanging hooks (little plastic doohickies that twist-lock into the t-slots along the walls and ceiling), you can insert one on the inside of the door behind the handle, providing a convenient place to hook a spring or rubber band to maintain some downward tension on the handle.

Online reviews for a cheaper greenhouse from another vendor (sounds like ‘Hazard Fraught‘) recommend caulking in the twinwall panels to prevent them being popped out by the wind. I haven’t done this yet, but plan to.

A hanging plant hook (optional accessory) is a convenient place to hook a spring or rubber band to prevent winds from lifting the door latch.

Tim Tears It Apart: Honeywell R8184 Oil-fired boiler controller

Honeywell R8184G oil burner control

Its official designation is “R8184 Intermittent Ignition Oil Primary”.

“But Tiiiim! That sounds booooorrrring. Why this thing, and not one of those fancy cloud-enabled thermostats containing more RAM than the desktop computer you had in college and not less than five processors capable of running Angry Birds at a playable framerate?”

Yes, excitement-wise this one sounds right up there with having your toenails waxed, but there are a few interesting bits regardless. Also, I have a broken one sitting in my basement right now, and what do we do with broken gadgets?…

Underside of oil burner controller

Here is the underside showing the PCB. This should give some sense as to the age of this design: these curvaceous traces are something you just don’t see in the era of computer-aided PCB design. This board may very well have been laid out literally by hand, the master trace pattern drawn in magic marker. Speaking of which, I drew an arrow in marker pointing to the likely culprit for this unit’s failure: a cold solder joint on one of the relay terminals – specifically, the one that energizes the orange wire leading to the burner and motor. You can also see some strategic cuts in the board itself, providing a physical air gap to isolate the low-voltage stuff from the line-powered sections nearby.

Oil burner topside

Here is the topside. There’s really nothing much to it! You can probably take a stab at how this all works just by inspection, but in case not, Honeywell provides the actual schematic on their website.

The fat transformer at top-left steps the 120VAC from the line down to around 24VAC to drive its own circuitry and the thermostat (red and white wire normally connected to the “T” terminals). I peeled back the tape on the primary winding a bit so you can see the difference in wire diameter, allowing for many more turns on the primary side. Without documentation or proper test equipment, you could use this to visually determine its function as a step-down transformer and maybe even make a loose guesstimate of the turns ratio.

Oil burner 24VAC relay

Kitty-corner from this transformer is a big honkin’ relay, armed with a similarly fat bundle of wire. This coil is powered right from the AC off the transformer; notice the large metal weight clamped to the top end of the part that actually moves. I suspect this is to provide added inertia to keep the contactor in-place and prevent buzzing during the low periods in the AC cycle where the magnetic force ordinarily holding it drops out. Energizing this relay closes two separate pairs of contacts; one (with the cold solder joint) powers up the boiler via the orange wire, and the other completes the circuit (transformer center tap, or ~12VAC) for the safety lockout logic, which I’ll get to in a moment.

In an oil burning boiler, turning on the boiler engages a large motor that both blows air into the combustion chamber and forces oil through an atomizing nozzle. The oil is ignited by a spark plug of sorts, formed by a high voltage transformer and a conductor near the nozzle. Home heating oil is otherwise known as diesel fuel. Needless to say, you want this atomized fuel to burn away in a quick and controlled way, not let large quantities of it accumulate and then go up suddenly.

To prevent your basement turning into a Super Mario Bros. boss level if the fuel doesn’t ignite in a timely fashion, there is a “flame sensor” (photocell) and lockout timer built in. The label on the front of the unit specifies a lockout time of 45 seconds. As you probably noticed, there are no microcontrollers, quartz crystals, counters or any other obvious timing devices on this board, so how does this work?

The answer may wow you, either with its ghetto-ness or its ingenious simplicity. Much like the electric stove guts described in an earlier post, the timer is thermal. The top-right component contains a heating element attached to a bimetallic strip, which in turn connects to some contacts and a mechanical latch. This is attached to the bit of circuitry at the bottom-left, which connects to the photocell (flame sensor) normally wired to the ‘F’ terminals. For the grisly details, look at the schematic linked above.

Ordinarily, when the thermostat is on, 24VAC flows through R1 and R2 to the “bilateral switch” (there’s a symbol and part you don’t see every day), which trips the TRIAC and ultimately begins warming the heating element, eventually curling the bimetallic strip inside the lockout mechanism enough to trip the latch and cut power to the boiler. Note that the schematic shows the gate of the “bilateral switch” not connected to anything, but in reality it is shorted back to the first terminal (at R2), turning this device into basically a voltage threshold detector. Light falling on the sensor lowers its resistance from near-infinite down to the kOhm range or less, forming a resistor divider with R1. This drags the voltage at the bilateral switch below its turn-on threshold, cutting power to the heating element before it trips the lockout.
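If you want to convince yourself this threshold-detector story hangs together, here’s a quick back-of-the-envelope model of it in C. Fair warning: R1, the switch’s turn-on voltage and the photocell resistances below are numbers I pulled out of the air for illustration (the real values are whatever the Honeywell schematic says, and I’ve lumped R1 and R2 together), but the logic is the point: flame drags the divider voltage down and starves the heating element.

#include <stdio.h>

/* Back-of-the-envelope model of the flame-sensor divider feeding the
 * "bilateral switch". All component values are made-up placeholders,
 * NOT taken from the Honeywell schematic. */
#define V_SUPPLY     24.0   /* VAC from the transformer */
#define R1           100e3  /* series resistor, assumed 100k */
#define V_THRESHOLD  8.0    /* assumed turn-on voltage of the switch */

/* Voltage across the photocell, the bottom leg of the divider. */
static double switch_voltage(double r_photocell)
{
    return V_SUPPLY * r_photocell / (R1 + r_photocell);
}

static void report(const char *state, double r_photocell)
{
    double v = switch_voltage(r_photocell);
    printf("%s: %5.1f V -> lockout heater %s\n", state, v,
           v > V_THRESHOLD ? "ON (timer running)" : "off");
}

int main(void)
{
    report("no flame (photocell ~10M)", 10e6); /* ~23.8 V: timer runs  */
    report("flame    (photocell ~1k) ", 1e3);  /* ~0.2 V:  timer stops */
    return 0;
}

Swap in the real component values and the crossover point moves around, but the dark/light behavior comes out the same.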

Protectorelay(R) thermal safety / lockout switch with latching feature

A look through the clear plastic case of this device shows the heating element is an ordinary 1W flameproof resistor. A metal slug, no doubt carefully sized to provide the right thermal inertia for the desired lockout time, is clamped around it. On the side of the device is an access hole for a setscrew, which applies pressure to a spring-loaded plate behind the bimetallic element. This most likely sets the initial position/tension of the strip against the pushbutton latch, and so allows fine-tuning the trip time.

Here is a video of the mechanism in action.

If the previous TTIA installment was any indication, the burning question is how much the thing cost to manufacture. As before, the off-the-shelf parts are pretty cheap, but the presence of complex custom parts makes it hard to pin down a number. A comparable step-down transformer can be had for about $5-8 on Digikey. The discretes would run maybe another buck total, and give another $3-5 for wiring, the solder-on screw terminals and the blank PCB itself. The relay is a bit harder – it’s a custom Honeywell part and can’t be sourced off the shelf – but comparably sized relays might run in the $20 range in onesies. Now for that lockout switch assembly: that’s a real piece of work. Not heavy on any expensive metals, but plenty of NRE sunk into this part, and plenty of mechanical parts to assemble (possibly some or all by hand). I’ll pull a $15 out of my ass for that component. Tally it all up and you land somewhere in the $45-50 range in one-off parts, before anyone gets paid to put it together.

Notes To Myself: Migrating legacy Microchip C18 projects to MPLAB X + XC8 toolchain, Windows 7

First note to myself: NEVER USE MICROCHIP AGAIN. If I didn’t just need to make “a couple tiny updates” to an already-selling, on-the-shelf product, I’d scrap the PIC18 for an EFM32TGxxx part, gcc (shaft of light from the sky, harps playing melodically) and be done with this entire shit-show. Insert whining here about the month+ long circlejerk with Microchip Support over the bug in the PICkit 3 programmer that is now corrupting the config bits on said product. Of course, if the code from 5 years ago still compiled unchanged and fit onto the chip it was written for (and used to fit on, five compiler versions ago), and if current MCC18 didn’t insist on dragging in the gargantuan (>4KByte) ‘.code_vfprintf.o’ even when it is never used or referenced anywhere in the code, I wouldn’t have to bother trying it with the new compiler in the first place….

Soooo…. Install MPLAB X (make tea, a sandwich, possibly a baby or two while waiting for the crunching sounds from your harddrive to finish) and XC8. NB: Licensing is done via a Windows batchfile, completely outside any of the devtools OR their installers. If you have the license file, ignore absolutely anything to do with licensing and install as if you want the “free” version.

License: Run said batchfile. Voodoo happens and it should “Just Work”. (It did. Quite surprised.)

Make XC8 “C18 Compatibility Mode” findable:

The fake “C18” that currently serves as the compatibility layer must first be set up manually in MPLAB X (apparently no autodetect). But first-first, you need to work around a stupid MPLAB X bug that has gone unfixed for going on two years now: you are arbitrarily forbidden from having two toolchains set up whose executables live in the same directory. Unfortunately, this is EXACTLY WHAT MICROCHIP’S OWN XC8 COMPILER DOES (of course, that directory is already claimed by XC8 itself, which IS autodetected somehow). So you have to create a fake instance of this directory (symlink or hardlink) with a different name to fool MPLAB X.

NB: The below workaround only works if your filesystem is NTFS. If not, you could also try just copypasta-ing the entire contents somewhere else, and hope this doesn’t break a path dependency somewhere or whatever. I haven’t tried this, but worth a shot.
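(Untested, but that would presumably be something along the lines of

xcopy bin _c18bin_ /E /I

run from the same xc8 version directory used for the mklink below; /E copies the whole directory tree, /I tells xcopy the destination is a directory.)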

To do this, you have to first-first-first somehow get a Windows console with Administrator privileges. The way I found that works is to create a batchfile with the contents “cmd <carriage return> pause”, then right-click it and “Run As Administrator”. (Using the ‘runas’ command, Windows 7’s answer to sudo, apparently does not work for this, as it forces you to know the actual administrator password and will not accept your user password even if you have administrator privileges.)
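In other words, the entire batchfile is literally just:

cmd
pause

(The name doesn’t matter; anything.bat will do.) Running it elevated drops you into an interactive console with admin rights; the pause only fires after you exit it.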

At the console, cd into the XC8 directory directly above the binaries directory (e.g. “C:\Progra~1(X86)\Microchip\xc8\v1.31\”) and type:

mklink /D _c18bin_ bin

This should result in a message indicating a symbolic link named “_c18bin_” was created.

Now you can actually set up the devtool. Ignore anything on the splash page and go to Tools -> Options -> Embedded -> Build Tools tab. Press “Add…” and enter the fake directory you just created. Specify the location of each build tool (if it exists). NB: For some reason the individual devtool settings ‘disappear’ after specifying them (close and re-open this dialog and “C Compiler” is blank again!). Does this mean it doesn’t need to be specified, or is this another MPLAB X bug that means your dev tool will never, ever work? Will soon find out…

Now, try to build the project (it will fail).

In “Output -> Configuration Loading Error” tab: “Could not generate makefiles for configuration default.” “XMLBaseMakefileWriter::createRuntimeObjectForMakeRule: null”

In “projectname (Build, Load)” tab: make[1]: *** No rule to make target ‘.build-conf’. Stop.

FIXME: Fix this error…