Stage 4. Centers of mass

This is an archived article, no longer relevant.

To create your own modules, use the CapMonster Cloud service. Detailed instructions can be found in the article "Creating a custom module".

Setting up the center-of-mass search is one of the most important steps in creating a recognition module. On the captcha, the centers of mass are marked with dots of a distinct color (green by default). These points mark the places where the search for a symbol will take place. This step also sets the size of the character recognition window.
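
As a rough illustration of what a center of mass is, the sketch below computes the ink-weighted centroid of a small patch. It is the textbook calculation, not necessarily what the program does internally. The function name `center_of_mass` and the list-of-rows image format are assumptions made for this example.

```python
def center_of_mass(patch):
    """Return the (row, col) centroid of the ink in a patch, or None if it is empty.

    patch is a list of rows; each value is the ink mass of a pixel
    (0 means background). This format is an assumption for the example.
    """
    total = 0.0
    row_sum = 0.0
    col_sum = 0.0
    for r, row in enumerate(patch):
        for c, mass in enumerate(row):
            total += mass
            row_sum += r * mass
            col_sum += c * mass
    if total == 0:
        return None  # no ink, so no center of mass
    return (row_sum / total, col_sum / total)


# A 5x5 patch containing a plus sign centered on row 2, column 2.
plus = [
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
]
print(center_of_mass(plus))  # (2.0, 2.0)
```

For a symmetric symbol the centroid coincides with the visual center, which is why such points make good places to start a symbol search.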

The main task

Configure the search for centers of mass so that:

  1. The resulting points pass through the centers of the letters, or as close to them as possible.

  2. There are as few points as possible, but no fewer than necessary.

  3. The displayed calculation time is not long.

Correct setting

  1. Character recognition window settings: the window must be large enough to fit the largest character. Click on the captcha with the left mouse button and the character recognition window will be drawn as a green (default color) frame. Use this frame to choose the right size.

  2. Settings for the size of the area for calculating the mass: increasing the width of this area makes the line of centers of mass smoother, and decreasing it makes the line rougher. Make sure that the line passes through the centers of the letters but does not jitter much within a letter. Adjust the height of the area so that there is one line on each letter.

  3. Recognition threshold: set the recognition threshold so that the center-of-mass line shrinks but does not disappear from the centers of the characters.

  4. Symbol Threshold: Do not touch this setting.

  5. No more than one check point for this rectangle: there is no need to search for a character at every point along the width. Searching once every two points is enough if the captcha is small, or once every 3-4 points if the captcha is enlarged. The height of this rectangle should be slightly larger than the tallest letter, and it can be larger still if the captcha has a single line. Too many points slow down recognition, while too wide a spread between them causes too many recognition errors.

  6. Additional points: you can add check points with a small variation in height to improve the symbol search. As in step 5, too many points slow down recognition, and too wide a spread causes too many recognition errors.
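
The settings above can be sketched roughly as follows. This is an illustration under stated assumptions, not the program's actual algorithm: the captcha is a single line stored as a list of rows with 1 for ink, the mass area spans the full image height, the recognition threshold is treated as a minimum ink mass, and all names (`mass_line`, `thin_points`, `add_height_variants`) are hypothetical.

```python
def mass_line(image, area_width, threshold):
    """For each column, find the vertical center of mass inside a window
    of area_width columns centered on it (assumed: full image height)."""
    height, width = len(image), len(image[0])
    half = area_width // 2
    points = []
    for x in range(width):
        lo, hi = max(0, x - half), min(width, x + half + 1)
        total = 0
        weighted_rows = 0
        for y in range(height):
            row_mass = sum(image[y][lo:hi])
            total += row_mass
            weighted_rows += y * row_mass
        # Assumed meaning of the threshold: drop windows with too little ink.
        if total > 0 and total >= threshold:
            points.append((x, weighted_rows / total))
    return points


def thin_points(points, cell_w, cell_h):
    """Keep at most one check point per cell_w x cell_h rectangle."""
    kept, seen = [], set()
    for x, y in points:
        cell = (x // cell_w, int(y) // cell_h)
        if cell not in seen:
            seen.add(cell)
            kept.append((x, y))
    return kept


def add_height_variants(points, offsets):
    """Add extra check points shifted vertically by each offset."""
    return [(x, y + dy) for x, y in points for dy in offsets]


# A 3x3 blob (a "letter") in columns 1-3 and a thin noise stroke in column 6.
image = [
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
]
points = mass_line(image, area_width=3, threshold=4)
print(points)  # [(1, 2.0), (2, 2.0), (3, 2.0)]
kept = thin_points(points, cell_w=2, cell_h=5)
print(kept)    # [(1, 2.0), (2, 2.0)]
print(add_height_variants(kept, offsets=(-1, 0, 1)))
```

In this toy example, lowering `threshold` to 1 keeps a point in all eight columns, including those over the noise stroke. That mirrors the trade-off described above: more points mean more places to search and slower recognition.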

Different types of captchas

Captchas may have very closely spaced characters or clearly separated characters. In the first case, after correct adjustment the centers of mass form a green line passing through the center of each symbol. In the second case, there are points at the center of each symbol.
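
One hypothetical way to tell the two cases apart in code is to group the x-coordinates of the center-of-mass points into runs of neighbours. A single long run suggests touching characters, while several short runs suggest separate ones. The function `group_runs` and the sample coordinates are made up for this illustration and are not part of the program.

```python
def group_runs(xs, max_gap=1):
    """Split x-coordinates into runs whose neighbours differ by at most max_gap."""
    runs = []
    for x in sorted(xs):
        if runs and x - runs[-1][-1] <= max_gap:
            runs[-1].append(x)  # continues the current run
        else:
            runs.append([x])    # gap too large: start a new run
    return runs


touching = [3, 4, 5, 6, 7, 8, 9, 10, 11, 12]  # one continuous line
separate = [4, 5, 14, 15, 24]                 # isolated groups of points

print(len(group_runs(touching)))  # 1
print(len(group_runs(separate)))  # 3
```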

When the recognition core is trained

On this tab, you can still click on the captcha with the left mouse button (or drag while holding it down) to see the core's responses at each location. This helps you understand where recognition errors come from.

Note!

When configuring each parameter of the center-of-mass search, scroll through the captchas and check the settings on several samples at once rather than tuning everything on a single one.

Video instructions are available on YouTube (in Russian).