Jan Stria's Notes
Obtaining source code
- Clone the repository from Bitbucket:
git clone https://firstname.lastname@example.org/janstria/clopema.git
- Update external submodules:
git submodule update --init --recursive
- Build the C++ code for polygonal approximation of closed curves:
- Optionally build the C++ library for finding the maximum flow in a graph:
Precompiled binaries for 64-bit Matlab on Windows and Linux are already included in the repository.
- Download the archive data.zip containing the learned probabilistic models and sample images. Unpack it into the root directory of the repository. You should see two new directories
Running segmentation and model fitting for one image
- Run the Matlab script
- For the best performance, set the constant PLOT_FIGS to false to avoid plotting intermediate figures. Consider running Matlab from the command line without the GUI and JVM:
matlab -nodisplay -nodesktop -nojvm -r "process_image;exit"
Then the whole script, including Matlab initialization, should finish in under 5 seconds even on an older notebook.
- At the beginning of the script there are five constants which are loaded from environment variables:
INPUT_IMG determines the path to the input image.
MODEL_NAME defines the garment model to which the image should be matched.
FOLD_STEP determines the step in the folding sequence and corresponds to the number of folds already performed. The sequence begins with step 0 for a completely spread garment.
FOLD_COORDS determines the image coordinates of the fold which was performed. It should have the following format: 'from_x;from_y;to_x;to_y'. The fold has to be performed in a clockwise direction.
ROI_COORDS determines the image coordinates of a polygonal region of interest which bounds the garment. It should have the following format: 'x1;y1;x2;y2;...;xn;yn'. If it is empty, the whole image is used.
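The two coordinate formats above are plain semicolon-separated numbers. As a minimal sketch (in Python, not the repository's Matlab code), parsing and validating them could look like this; the function names are my own, not part of the repository:

```python
def parse_fold_coords(s):
    """Parse 'from_x;from_y;to_x;to_y' into ((from_x, from_y), (to_x, to_y))."""
    vals = [float(v) for v in s.split(";")]
    if len(vals) != 4:
        raise ValueError("FOLD_COORDS must contain exactly 4 numbers")
    return (vals[0], vals[1]), (vals[2], vals[3])

def parse_roi_coords(s):
    """Parse 'x1;y1;...;xn;yn' into [(x1, y1), ..., (xn, yn)].

    An empty string means no region of interest (the whole image is used).
    """
    if not s:
        return []
    vals = [float(v) for v in s.split(";")]
    if len(vals) % 2 != 0:
        raise ValueError("ROI_COORDS must contain an even number of values")
    return list(zip(vals[0::2], vals[1::2]))
```

For example, parse_fold_coords("10;20;110;20") yields ((10.0, 20.0), (110.0, 20.0)).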
- The script expects the input image to contain only the known table and a widely spread piece of garment. Optionally, the image region containing the table and garment can be delimited by a polygonal region of interest. The table should be seen from a bird's eye perspective, with the camera placed approximately above the center of the table.
- All output files are written to
nodes.txt contains the (x,y) coordinates of the identified landmarks in the input image coordinate system.
mask.png contains the binary segmentation mask.
contour.txt contains the (x,y) coordinates of the garment's contour extracted from the segmentation mask.
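The text outputs are lists of (x,y) coordinates. As a hedged sketch of how a downstream tool might read nodes.txt or contour.txt, here is a small Python reader; it assumes one whitespace-separated "x y" pair per line, which is an assumption about the file layout rather than a documented format:

```python
def read_xy_file(path):
    """Read a list of (x, y) points from a text file.

    Assumes one whitespace-separated 'x y' pair per line; the actual
    delimiter written by the Matlab scripts may differ.
    """
    points = []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) >= 2:
                points.append((float(parts[0]), float(parts[1])))
    return points
```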
Training a new color model
- The archive data.zip already contains a learned probabilistic color model of the table. However, you can train your own model by running
- The input to the script consists of several images of the table, together with polygonal regions marking the bounds of the table in each image. The constant DEFINE_ROI tells whether these regions of interest should be defined manually by the user or loaded from a file.
- The output of the script is written to out/rgb_model.mat, to be used by the recognition script.