Disassembling Arduino generated code
Johnny Five... No disassemble!
I recently had need to work out how many clock cycles certain functions in my code took to execute, so it was time to find the HEX file and disassemble back to the ASM instructions. In the Arduino environment you are shielded from getting your hands dirty with the generated compiler files, but they can be located in a temporary directory in Windows, found under:
C:\Users\USERNAME\AppData\Local\Temp\buildXYZ.tmp
where "USERNAME" is your ... user name and XYZ is some long integer that Arduino comes up with. Make sure you have recently compiled your code in Arduino - the folders are deleted or wiped when you exit Arduino. In this folder you'll find some interesting files; the .HEX and .ELF files are of particular interest. Either make a .BAT file or use the CMD prompt to execute the following commands.
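If hunting for the folder by hand gets tedious, something like the following could find it for you. This is only a sketch: the helper names `newest_build_dir` and `list_outputs` are my own invention, and it assumes the build folders match `build*.tmp` directly inside the system temp directory, as described above.

```python
import glob
import os
import tempfile


def newest_build_dir(temp_root=None):
    """Return the most recently modified build*.tmp folder, or None."""
    # Assumption: Arduino puts its build folders directly in the temp directory.
    if temp_root is None:
        temp_root = tempfile.gettempdir()
    candidates = [p for p in glob.glob(os.path.join(temp_root, "build*.tmp"))
                  if os.path.isdir(p)]
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def list_outputs(build_dir):
    """List the .hex and .elf files in a build folder, sorted by name."""
    return sorted(name for name in os.listdir(build_dir)
                  if name.lower().endswith((".hex", ".elf")))


if __name__ == "__main__":
    folder = newest_build_dir()
    if folder is None:
        print("No build folder found; compile the sketch in Arduino first.")
    else:
        print(folder, list_outputs(folder))
```

Picking the newest folder by modification time is a guess at what you usually want; if several sketches are open at once you may need to check the file names instead.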
Disassemble from HEX
We'll use avr-objdump, but we need to specify the architecture. Find the necessary chip here:
or
For the ATmega328 MCU we're told to use "avr5" as the target architecture.
avr-objdump -j .sec1 -d -m avr5 foo.cpp.hex > output.txt
The ASM output is saved to "output.txt"
Disassemble from ELF
avr-objdump -d -S foo.cpp.elf > output.txt
The ASM output is saved to "output.txt"
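If you would rather drive both commands from a script than from a .BAT file, a thin wrapper along these lines might do. It is a sketch, not a tested tool: `objdump_command` and `disassemble` are hypothetical names, the flags are exactly the ones used above, and it assumes `avr-objdump` is on your PATH.

```python
import subprocess


def objdump_command(input_file, arch=None):
    """Build the avr-objdump argument list for a .hex or .elf file."""
    if input_file.lower().endswith(".hex"):
        # A HEX file carries no architecture information, so -m is required.
        if arch is None:
            raise ValueError("a HEX file needs an architecture such as 'avr5'")
        return ["avr-objdump", "-j", ".sec1", "-d", "-m", arch, input_file]
    # For an ELF file, use the same flags as the ELF command above.
    return ["avr-objdump", "-d", "-S", input_file]


def disassemble(input_file, output_file="output.txt", arch=None):
    """Run avr-objdump and write its listing to output_file."""
    with open(output_file, "w") as out:
        subprocess.run(objdump_command(input_file, arch), stdout=out, check=True)


if __name__ == "__main__":
    # Only prints the commands; call disassemble() to actually run them.
    print(objdump_command("foo.cpp.hex", arch="avr5"))
    print(objdump_command("foo.cpp.elf"))
```

Keeping the command building separate from the `subprocess.run` call makes it easy to check what will be executed before anything runs.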
Counting clock cycles
Now we have the ASM code, we want to count up the clock cycles used per instruction, within a given line range. I wrote a quick program in Python to scrape the datasheet for the ATmega328 chip to generate a dictionary of opcodes and the minimum and maximum clock cycles they can take, to give:
commands = {'BRLT': [1, 2], 'CPI': 1, 'CPSE': [1, 2, 3], 'CPC': 1, 'EOR': 1, 'WDR': 1,
'MOVW': 1, 'BRLO': [1, 2], 'RETI': 4, 'SBIC': [1, 2, 3], 'SBIW': 2, 'SBIS': [1, 2, 3],
'BRSH': [1, 2], 'MULSU': 2, 'BRPL': [1, 2], 'LPM': 3, 'BRCS': [1, 2], 'BRTS': [1, 2],
'BRCC': [1, 2], 'OR': 1, 'BSET': 1, 'BRTC': [1, 2], 'BRNE': [1, 2], 'TST': 1, 'DEC': 1,
'ROL': 1, 'IN': 1, 'LDD': 2, 'JMP': 3, 'SBRC': [1, 2, 3], 'LDS': 2, 'LSR': 1, 'ROR': 1,
'SBRS': [1, 2, 3], 'SEV': 1, 'ANDI': 1, 'BRMI': [1, 2], 'SUBI': 1, 'AND': 1, 'BRVS': [1, 2],
'CBI': 2, 'MOV': 1, 'CBR': 1, 'BRVC': [1, 2], 'RJMP': 2, 'NOP': 1, 'BREQ': [1, 2], 'INC': 1,
'SUB': 1, 'NEG': 1, 'BRHS': [1, 2], 'ICALL': 3, 'RET': 4, 'BLD': 1, 'MUL': 2, 'BRHC': [1, 2],
'COM': 1, 'ASR': 1, 'SET': 1, 'SES': 1, 'SER': 1, 'BST': 1, 'SEZ': 1, 'IJMP': 2, 'SEC': 1,
'SWAP': 1, 'PUSH': 2, 'SEN': 1, 'SEI': 1, 'SEH': 1, 'CLN': 1, 'CLH': 1, 'CLI': 1, 'ADIW': 2,
'CLC': 1, 'ADD': 1, 'ADC': 1, 'CLZ': 1, 'LDI': 1, 'CLT': 1, 'CLV': 1, 'CP': 1, 'CLR': 1,
'CLS': 1, 'SBR': 1, 'ST': 2, 'BRGE': [1, 2], 'SBC': 1, 'ORI': 1, 'SBI': 2, 'RCALL': 3,
'FMUL': 2, 'MULS': 2, 'BCLR': 1, 'LD': 2, 'SBCI': 1, 'BRBS': [1, 2], 'POP': 2, 'SLEEP': 1,
'BRBC': [1, 2], 'STD': 2, 'STS': 2, 'FMULSU': 2, 'BRID': [1, 2], 'FMULS': 2, 'OUT': 1,
'CALL': 4, 'BRIE': [1, 2], 'LSL': 1}
i.e. BRLT takes either 1 or 2 clock cycles: 1 if the branch is not taken, 2 if it is. Note this set may not be (probably won't be) useful for other chips!
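The scraper itself isn't reproduced here, but its core step might look something like the sketch below. It assumes the instruction set summary has already been copied out of the datasheet as plain text, one instruction per row, with the mnemonic first and the clock count last (written like `1/2` when it varies). The function name `parse_summary` and the sample rows are hand-written for illustration, not quoted from the datasheet.

```python
def parse_summary(rows):
    """Map each mnemonic to a cycle count, or a list of counts if it varies."""
    table = {}
    for row in rows:
        fields = row.split()
        if len(fields) < 2:
            continue  # blank line or stray fragment
        mnemonic, clocks = fields[0].upper(), fields[-1]
        try:
            counts = [int(c) for c in clocks.split("/")]
        except ValueError:
            continue  # last field is not a clock count, e.g. a header row
        table[mnemonic] = counts[0] if len(counts) == 1 else counts
    return table


# Illustrative rows only; the real table has more columns and more rows.
sample = [
    "Mnemonics Operands Description Flags #Clocks",
    "ADD Rd, Rr Add two Registers Z,C,N,V,H 1",
    "BRLT k Branch if Less Than Zero, Signed None 1/2",
    "CPSE Rd,Rr Compare, Skip if Equal None 1/2/3",
]
print(parse_summary(sample))
```

If a mnemonic appears on several rows, this version simply keeps the last one it sees, so the result is worth a quick check against the datasheet.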
It is then a fairly trivial task to go through output.txt, look at the commands in a certain range of addresses (the hex number at the start of each disassembled line), and spit out a minimum and maximum clock cycle count.
def sumclockcycles(startline, endline):
    ## startline and endline are instruction addresses, inclusive
    error = 0
    mincycles = 0
    maxcycles = 0
    with open("output.txt", 'r') as f:
        for line in f:
            if ":" not in line:
                continue
            splitline = line.split(":")
            try:
                ln = int(splitline[0].strip(), 16)
            except ValueError:
                continue  ## not an address, e.g. a label or source line
            if ln < startline or ln > endline:
                continue
            splitrestline = splitline[1].split("\t")
            if len(splitrestline) < 3:
                continue  ## no decoded instruction on this line
            cmd = splitrestline[2].strip().upper()
            if cmd in commands:
                cycles = commands[cmd]
                ##print(hex(ln), cmd, cycles)
                if isinstance(cycles, list):
                    mincycles += min(cycles)
                    maxcycles += max(cycles)
                else:
                    mincycles += cycles
                    maxcycles += cycles
            else:
                print("Cmd not found", cmd, splitrestline)
                error += 1
    return (mincycles, maxcycles, error)

print(sumclockcycles(0x55e, 0x5e6))
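To see why the mnemonic is picked out of the third tab-separated field, it helps to take one line apart on its own. The sketch below does that for a hand-written sample in the layout avr-objdump typically prints (address, colon, raw bytes, mnemonic, operands). The helper name `split_disassembly_line` is made up for this example, and the exact spacing of real output may differ.

```python
def split_disassembly_line(line):
    """Return (address, mnemonic) for an instruction line, else None."""
    address_text, sep, rest = line.partition(":")
    if not sep:
        return None  # no colon, so not an instruction line
    try:
        address = int(address_text.strip(), 16)
    except ValueError:
        return None  # label, source line, or header rather than an address
    fields = rest.split("\t")
    if len(fields) < 3:
        return None  # raw bytes with no decoded instruction
    return address, fields[2].strip().upper()


# Hand-written sample line, not copied from a real listing.
sample = " 55e:\t0f 93       \tpush\tr16"
print(split_disassembly_line(sample))
```

Lines such as function labels have a colon too, but the text before it is not a plain hex number, so they fall out at the `int(..., 16)` step.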
Hence interrupts, functions, etc. can be critically assessed for performance.