Bboxing

09/04/2019


So you want to build your own collection of boxes bounding pixels-picturing-objects.
First of all, you need to gather a series of such groups of pixels (aka images).
You can easily accomplish this by means of any of the notorious tubes.
Go find videos (the more the merrier) showing the sought objects; for example,
in case you are interested in detecting low-res musical instruments, try


$ youtube-dl -f 'best[ext=mp4]' -o ./video.mp4 'https://www.youtube.com/watch?v=s2YiJ13MRUE'


Then it's time to butcher the video into frames.
You do not need to chop it finely tho: 1 image per second will do (-r 1).


$ mkdir -p ./imgs && ffmpeg -i video.mp4 -r 1 -f image2 ./imgs/image-%07d.png
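
Since the more videos the merrier, the two steps above can be wrapped in a loop.
What follows is only a rough sketch, not a blessed recipe: the list file name
(urls.txt), the function names (run, grab_frames), the DRY_RUN switch and the
video-N naming scheme are all made up for illustration. It assumes one URL per
line in the list file and gives every video its own frame directory.

```shell
#!/bin/sh
# Sketch: fetch every video listed in a text file, dump one frame per second.
# All names here (run, grab_frames, DRY_RUN, video-N) are illustrative.

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# $1: path to a text file holding one video URL per line.
grab_frames() {
    list="$1"
    n=0
    while IFS= read -r url; do
        # Skip blank lines.
        [ -n "$url" ] || continue
        n=$((n + 1))
        dir="./imgs/video-$n"
        # ffmpeg does not create the output directory on its own.
        run mkdir -p "$dir"
        run youtube-dl -f 'best[ext=mp4]' -o "./video-$n.mp4" "$url"
        # -nostdin keeps ffmpeg from swallowing the rest of the URL list.
        run ffmpeg -nostdin -i "./video-$n.mp4" -r 1 -f image2 "$dir/image-%07d.png"
    done < "$list"
}
```

Once the functions are loaded, calling grab_frames urls.txt does the real thing;
with DRY_RUN=1 set beforehand it only prints the commands, which is a cheap way
to check the list before hammering the tubes.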


Now the fun begins: fire up your trustworthy annotation tool and go for it.


[image: falling protocol droid]


Tools of the trade:
