Jan Stria's Notes

Obtaining source code

  • Clone the repository from Bitbucket:
    git clone https://janstria@bitbucket.org/janstria/clopema.git
  • Update external submodules:
    git submodule update --init --recursive
  • Build the C++ code for polygonal approximation of closed curves:
    cd dp_poly
    make all
  • Optionally, build the C++ library for finding a maximum flow in a graph:
    cd libs/maxflow
    make all

    Precompiled binaries for 64-bit Matlab on Windows and Linux are already part of the repository.

Adding data

  • Download the archive data.zip, which contains the learned probabilistic models and sample images. Unpack it into the root directory of the repository. You should see two new directories, data and out.

Running segmentation and model fitting for one image

  • Run the Matlab script process_image.m.
  • For the best performance, set the constant PLOT_FIGS to false to avoid plotting intermediate figures. Consider running Matlab from the command line without the GUI and JVM:
    matlab -nodisplay -nodesktop -nojvm -r "process_image;exit"
    The whole script, including Matlab initialization, should then run in under 5 seconds even on an older notebook.
  • At the beginning of the script there are five constants which are loaded from environment variables:
    • Constant INPUT_IMG determines the path to the input image.
    • Constant MODEL_NAME defines the garment model to which the image should be matched.
    • Constant FOLD_STEP determines the step in the folding sequence and corresponds to the number of folds already performed. The sequence begins with step 0 for a completely spread garment.
    • Constant FOLD_COORDS determines the image coordinates of the fold which was performed. It should have the following format: 'from_x;from_y;to_x;to_y'. The fold has to be performed in a clockwise direction.
    • Constant ROI_COORDS determines the image coordinates of a polygonal region of interest which bounds the garment. It should have the following format: 'x1;y1;x2;y2;...;xn;yn'. If it is empty, the whole image is used.
  • The script expects that the input image contains only the known table and a single widely spread garment. Optionally, the region of the image containing the table and garment can be restricted by a polygonal region of interest. The table should be seen from a bird's eye perspective. The camera should be placed approximately above the center of the table.
  • All output files are written to the out directory.
    • File nodes.txt contains the (x,y) coordinates of the identified landmarks in the input image coordinate system.
    • File mask.png contains the binary segmentation mask.
    • File contour.txt contains the (x,y) coordinates of the garment's contour extracted from the segmentation mask.
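The environment variables and output files described above can be driven from a small wrapper script. The following Python sketch shows one way to compose the coordinate strings, launch the Matlab script headless, and read back the landmarks; the helper names, sample paths, and model name are illustrative assumptions, not part of the repository.

```python
# Sketch of composing the environment variables consumed by process_image.m
# and reading back the out/nodes.txt result. All concrete values below
# (image path, model name, coordinates) are hypothetical examples.
import os
import subprocess

def fold_coords(from_xy, to_xy):
    """Format one performed fold as 'from_x;from_y;to_x;to_y'.
    The fold is expected to be performed in a clockwise direction."""
    return "%d;%d;%d;%d" % (from_xy[0], from_xy[1], to_xy[0], to_xy[1])

def roi_coords(points):
    """Format a polygonal region of interest as 'x1;y1;x2;y2;...;xn;yn'."""
    return ";".join("%d;%d" % (x, y) for x, y in points)

env = dict(os.environ,
           INPUT_IMG="data/sample.png",   # hypothetical sample image
           MODEL_NAME="towel",            # hypothetical garment model
           FOLD_STEP="1",                 # one fold already performed
           FOLD_COORDS=fold_coords((100, 50), (100, 400)),
           ROI_COORDS=roi_coords([(10, 10), (600, 10), (600, 450), (10, 450)]))

# Run Matlab without GUI and JVM, as suggested above (uncomment to execute):
# subprocess.run(["matlab", "-nodisplay", "-nodesktop", "-nojvm",
#                 "-r", "process_image;exit"], env=env, check=True)

def read_nodes(path="out/nodes.txt"):
    """Parse landmark coordinates, assuming one whitespace-separated
    (x, y) pair per line."""
    with open(path) as f:
        return [tuple(float(v) for v in line.split()) for line in f if line.strip()]
```

The two formatting helpers simply mirror the string formats documented for FOLD_COORDS and ROI_COORDS, so a typo in the coordinate strings can be caught in the wrapper rather than inside Matlab.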

Training a new color model

  • The archive data.zip already contains a learned probabilistic color model of the table. However, it is possible to train your own model by running the train_rgb_model_roi.m script.
  • The input of the script consists of several images of the table, together with polygonal regions denoting the bounds of the table in the images. The constant DEFINE_ROI tells whether the user wants to define these regions of interest manually or load them from a file.
  • The output of the script is written to out/rgb_model.mat, to be used by the recognition script.
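To illustrate what a probabilistic color model of the table can look like, here is a minimal Python sketch that fits independent per-channel Gaussians to pixels sampled from the table regions and scores new pixels under the model. This is an assumption for illustration only; the actual model stored in rgb_model.mat is not specified here and may differ.

```python
# Minimal sketch of a per-channel Gaussian color model fit on (r, g, b)
# pixels sampled from user-defined table regions. Illustration only; the
# real model in out/rgb_model.mat may use a different representation.
import math

def fit_rgb_model(pixels):
    """Fit an independent Gaussian per color channel to a list of pixels."""
    n = len(pixels)
    means = [sum(p[c] for p in pixels) / n for c in range(3)]
    # Floor the variance to avoid division by zero for uniform samples.
    variances = [max(sum((p[c] - means[c]) ** 2 for p in pixels) / n, 1e-6)
                 for c in range(3)]
    return means, variances

def log_likelihood(pixel, model):
    """Log-probability of one pixel under the fitted per-channel model."""
    means, variances = model
    return sum(-0.5 * (math.log(2 * math.pi * variances[c])
                       + (pixel[c] - means[c]) ** 2 / variances[c])
               for c in range(3))
```

Pixels well explained by the table model receive a high log-likelihood, while garment pixels should score lower; thresholding or comparing such scores is one common way a learned color model can drive a segmentation mask.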

data.zip (5.1 KB) Jan Stria, 03/03/2014 16:26