SLML Part 3 - JPEG to SVG to Stroke-3

Converting JPEG scans to vector paths

Andrew Look


November 7, 2023


This post is part 3 of “SLML” - Single-Line Machine Learning.

To read the previous post, check out part 2.

If you want to keep reading, here is part 4.

Most computer vision algorithms represent images as a rectangular grid of pixels on a screen. The model that the Magenta team trained, SketchRNN, instead interprets the drawings as a sequence of movements of a pen. They call this “stroke-3 format”, since each step in the sequence is represented by 3 values:

  • delta_x: how much did it move left-to-right?
  • delta_y: how much did it move up-and-down?
  • lift_pen: was the pen down (continuing the current stroke), or was it lifted (moving to the start of a new stroke)?
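
As a rough illustration of the format, here is a minimal sketch that turns strokes stored as absolute (x, y) points into stroke-3 rows. The function name `strokes_to_stroke3` is hypothetical, and two conventions are assumptions on my part rather than anything stated above: the pen starts at the origin, and `lift_pen` is set to 1 on the last point of each stroke.

```python
def strokes_to_stroke3(strokes):
    """Convert strokes (lists of absolute (x, y) points) into
    stroke-3 rows of [delta_x, delta_y, lift_pen]."""
    rows = []
    prev_x, prev_y = 0, 0  # assumption: the pen starts at the origin
    for stroke in strokes:
        for i, (x, y) in enumerate(stroke):
            # assumption: lift_pen is 1 on the final point of a stroke,
            # meaning the pen leaves the paper after reaching this point
            lift_pen = 1 if i == len(stroke) - 1 else 0
            rows.append([x - prev_x, y - prev_y, lift_pen])
            prev_x, prev_y = x, y
    return rows


# Two short strokes: a diagonal line, then a vertical line drawn elsewhere.
example = strokes_to_stroke3([[(0, 0), (3, 4)], [(10, 4), (10, 8)]])
```

Because every row stores an offset from the previous point, the jump between the two strokes shows up as an ordinary `delta_x` of 7 on the row that follows a lifted pen.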

First, I had to convert my JPEG scans into the “stroke-3” format. This involved:

  1. converting the files from JPEG to SVG
  2. converting SVG to stroke-3
  3. simplifying the drawings to reduce the number of points


When I first started converting to SVG, I had trouble finding a tool that would give me a single, clean stroke for each line. Eventually I found a tool called autotrace that was able to correctly do a “centerline trace”.
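
For reference, a centerline trace can be requested from the command line. The sketch below only builds the argument list and leaves the actual call commented out. The file names are placeholders, the helper `autotrace_command` is hypothetical, and the flags are the ones I understand autotrace to accept, so treat them as assumptions and check `autotrace --help` on your install. Depending on how autotrace was built, a JPEG may also need converting to a bitmap format such as PBM or PNG first.

```python
import subprocess


def autotrace_command(input_path, output_path):
    """Build an autotrace invocation for a centerline trace to SVG.

    Assumption: the flags below match the installed autotrace version.
    """
    return [
        "autotrace",
        "-centerline",              # trace the middle of each line, not its outline
        "-output-format", "svg",
        "-output-file", output_path,
        input_path,
    ]


cmd = autotrace_command("scan.pbm", "scan.svg")  # placeholder file names
# subprocess.run(cmd, check=True)  # uncomment to run; needs autotrace on PATH
```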

(a) potrace
(b) autotrace
Figure 1: Comparison of Vectorization Tools.

SVG to Points

Then I used a Python library called svgpathtools to read the resulting SVG files and convert each path to a sequence of points. This step is necessary because SVG paths are often represented as Bézier curves, while stroke-3 needs a list of discrete pen positions.
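
To show what “converting a path to points” means, here is a minimal sketch that samples a single cubic Bézier segment at evenly spaced parameter values. It is written in plain Python rather than with svgpathtools, so the function names and the `num_points` parameter are illustrative assumptions, not the library's API. As far as I know, svgpathtools offers a comparable per-parameter evaluation on its path objects.

```python
def cubic_bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier curve with control points p0..p3 at t in [0, 1]."""
    u = 1 - t
    x = u**3 * p0[0] + 3 * u**2 * t * p1[0] + 3 * u * t**2 * p2[0] + t**3 * p3[0]
    y = u**3 * p0[1] + 3 * u**2 * t * p1[1] + 3 * u * t**2 * p2[1] + t**3 * p3[1]
    return (x, y)


def sample_cubic_bezier(p0, p1, p2, p3, num_points=10):
    """Return num_points (at least 2) samples along the curve, endpoints included."""
    return [
        cubic_bezier_point(p0, p1, p2, p3, i / (num_points - 1))
        for i in range(num_points)
    ]


# An arch from (0, 0) to (3, 0), sampled at 5 points.
points = sample_cubic_bezier((0, 0), (0, 2), (3, 2), (3, 0), num_points=5)
```

Sampling at even steps of `t` does not give evenly spaced points along the curve, but that matters little here since the simplification step thins the points out afterwards.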

One problem I noticed was that the drawings were represented as many separate strokes rather than one continuous line. For example, in the image below, each color represents a separate pen stroke.

separate strokes

Line Simplification

Finally, I applied the Ramer-Douglas-Peucker (“RDP”) algorithm to the resulting points. It uses an adjustable “epsilon” parameter to simplify the drawings by reducing the number of points in each line’s path: the larger the epsilon, the more points are dropped.

RDP example

This is important because the SketchRNN model has difficulty with sequences longer than a few hundred points, so it’s helpful to simplify the drawings down by removing some of the very fine details while preserving the overall shape.
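
The simplification step can be sketched in a few lines. This is a textbook recursive version of RDP in plain Python, not necessarily the implementation I used, and the sample points and epsilon values are made up for illustration.

```python
import math


def perpendicular_distance(point, start, end):
    """Distance from point to the infinite line through start and end."""
    (x, y), (x1, y1), (x2, y2) = point, start, end
    dx, dy = x2 - x1, y2 - y1
    if dx == 0 and dy == 0:
        # start and end coincide, so fall back to point-to-point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * (x - x1) - dx * (y - y1)) / math.hypot(dx, dy)


def rdp(points, epsilon):
    """Simplify a polyline, keeping points farther than epsilon from the chord."""
    if len(points) < 3:
        return list(points)
    # Find the interior point farthest from the line joining the endpoints.
    max_dist, index = 0.0, 0
    for i in range(1, len(points) - 1):
        d = perpendicular_distance(points[i], points[0], points[-1])
        if d > max_dist:
            max_dist, index = d, i
    if max_dist > epsilon:
        # Keep that point and simplify each half recursively.
        left = rdp(points[: index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right  # drop the duplicated split point
    # Every interior point is within epsilon: keep only the endpoints.
    return [points[0], points[-1]]


line = [(0, 0), (1, 0.05), (2, 0), (3, 3), (4, 0)]
simplified = rdp(line, epsilon=0.5)
```

With `epsilon=0.5`, the small wobble at (1, 0.05) is removed while the sharp peak at (3, 3) survives. A much larger epsilon collapses the whole line to its two endpoints.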


Next in my SLML series is part 4, where I experiment with hyperparams and datasets in training SketchRNN.