Every now and then (with increasing frequency, it seems) I find myself needing to analyze millions of lines of CAN logs that look like
(1630536594.856841) can1 15F8051C#002FE7B7490FF3FD
(1630536594.857974) can1 15F8051C#01300049A616359F
(1630536594.859108) can1 15F8051C#02100680E22138F9
(1630536594.859666) can1 15F8051C#035194F22D88011A
(1630536594.860264) can1 15F8051C#0400000000400013
(1630536594.860834) can1 15F8051C#053C006E00B0F5FF
(1630536594.861406) can1 15F8051C#06FF01E000C800FF
(1630536594.876000) can1 09F80281#2CFC03010000FFFF
(1630536594.906718) can1 09F8021C#E7FC14010000FFFF
(1630536594.907256) can1 09F8031C#E70A000000000000
(1630536594.907821) can1 09F8041C#E70AC41401000000
(1630536594.915459) can1 09F80381#2CB2FEFFFF000000
(1630536594.916033) can1 09F80481#2CB2C40201000000
For CAN logs, it makes sense to use Unix timestamps (fractional seconds since epoch) because you're often only concerned with relative times between messages, not the absolute time a message occurred. However, I often find myself needing to correlate these CAN logs with messages from other log files with a different format, such as
[2021-09-01 22:49:54.906718] [CATEGORY] [INFO]: Some log message.
My normal workflow for this, is to filter the CAN log to the messages I care about (because I was serious about the logs containing millions of lines) to a new Vim buffer and then painstakingly interleave messages from different sources. After (or during, if I don't have a clear idea of what messages are important) I have the log message interleaved, I start annotating the messages. This is how I troubleshoot (because no one looks at CAN logs for fun) the events portrayed in the logs. This works, but it's time-consuming, and involves a lot of manual work.
date
Given a Unix timestamp like 1630536594.906718
, you can convert it to an ISO-8601 timestamp
with date
like so:
$ date --utc --iso=8601=ns -d @1630536594.906718
2021-09-01T22:49:54,906718000+00:00
Due to the limitations of
strftime(3)
, this is how I will convert between different timestamp formats.
sed
When sed
's your hammer, everything becomes a nail. I'm more comfortable in the shell than I
am in Vim, so sed
was the first tool I reached for.
$ head -n 1 candump.log |
sed 's/(\(.*\))\(.*\)/printf "("; date --utc --iso-8601=ns -d @\1 | tr -d "\n"; printf ")%s" "\2"/e'
(2021-09-01T22:49:54,856841000+00:00) can1 15F8051C#002FE7B7490FF3FD
This is overly verbose, and has several things I don't like:
printf
to print the surrounding parentheses and the rest of the log message
| tr -d "\n"
to remove newlines from the date
output.awk
It feels like this is something awk
could do better, but I don't really know
awk
, so it's not a tool I reach for. I'll eventually rectify this.
I've identified three different Vim solutions, which meets my needs a little better, because these
transformed logs aren't really a standalone log file – they're in some notes.txt
file
containing log snippets, timestamps of interest, hypotheses, shell snippets, CAN message definitions,
etc., so the more interactive this process is, the better.
You might think "oh, I'll just select the timestamp, and use
:'<,'> !xargs -I{} date --utc --iso-8601=ns -d @{}
!". That's nice, but remember that no matter what you do, selections in Ex mode are line-wise, so this
little snippet passes the whole line to date -d
.
To resolve this, that is, allow passing just the selections to the shell, you can use the vis plugin with
:'<,'>B !xargs -I{} date --utc --iso-8601=ns -d @{}
This is almost identical to the first attempt, except that we use
:'<,'>B
instead of :'<,'>
.
This is nice, but
date
converted to a
space?) before the closing parenthesis
system()
in a :substitut
ion
Using \=
in a :substitute
or substitute()
turns the rest of the
substitution string into an expression. This, combined with submatch()
to access a capture
group from within the expression. So we could do something like
:%s/(\(.*\))/\=system('date --utc --iso-8601=ns -d @' . submatch(1))/
but this has the same problem as the very first sed
incantation – the line ending
from the date
output is included in the replacement. This can be remedied with
:%s/(\(.*\))/\=system('date --utc --iso-8601=ns -d @' . submatch(1) . ' | tr -d "\n"')/
but it still leaves us with another problem – we lose the parentheses wrapping the timestamp.
Depending on the use case, this may or may not be important. I want to be able to use
:sort
on the interleaved log messages from different sources, so the timestamp prefixes
must all be in the same format.
To accomplish this, we can wrap the system()
call in a substitute()
like so:
:%s/(\(.*\))/\=substitute(system('date --utc --iso-8601=ns -d @' . submatch(1)), '\(.*\)\n', '(\1)', '')/
Depending on the timestamp being reformatted, you may need to further wrap the
submatch(1)
in a shellescape()
to properly escape anything that needs it.
Regardless, this method is far too unwieldy to keep handy in my back pocket.
This is my favorite solution. It's nicely composable, and once you record it, you can play it back over
and over as needed. Further, with .viminfo
files, the macro may be restored when closing
and reopening the file, as long as you don't overwrite the macro in an attempt to make it even better ;)
qd
0
"tcib
<C-r>=strftime("%FT%T %Z", @t)<CR>
<Esc>
q
That's nice and simple (broken onto multiple lines for clarity) but strftime(3)
has a fatal
flaw that makes it unusable for CAN log analysis – it can only provide second-level precision (no
fractional seconds). So we should replace strftime()
with a system()
call to
date
.
Additionally, using strftime()
will convert timestamps to your current locale – which
may or may not be helpful depending on the other log messages you're trying to correlate with the CAN
logs. In my case, I want UTC timestamps for everything.
qd
0
"tcib
<C-r>=system('date --utc --iso-8601=ns -d @' . @t . ' | tr -d "\n"')<CR>
<Esc>
q
This macro can then be invoked ad hoc by @d
, or over a range with
:{range}norm @d
. This solution is most flexible and repeatable, and is best for my needs,
where I may be adding and removing log statements as I learn about whatever problem I'm troubleshooting.