
Description

The action is used to recognize captchas automatically through services, or manually.

Captcha (from CAPTCHA, Completely Automated Public Turing test to tell Computers and Humans Apart) is a computer test used to determine whether the user of a system is a person or a computer.

Some types of captcha

How to add an action to a project?

There are several ways to add an action to a project.


1. Via the context menu of the PROJECT

Add Action → Tabs → Recognize Captcha

The disadvantage of this method is that you first need to download the image to your computer and then specify the path to the file in the action.

Or use smart search.

2. Via the context menu of the BROWSER

To add a recognition action using the context menu of the ProjectMaker browser, right-click the captcha on the site and select the This is a captcha! item.

Right after the action is added, the manual captcha recognition window opens; you can close it for now and go to the action settings.

Captcha recognition action settings.

The main

Recognition module

  • Selecting a module (captcha service) through which the captcha will be recognized.

  • Select the desired captcha recognition service from the drop-down list (you must first specify its API key in the settings). The default is MonkeyEnter.dll - manual input.

Project variables can be used in this field.

Settings

Clicking the Settings button takes you to the program settings, to the captcha services tab.

Element search


Before you can interact with an element on the page, you need to find it. In the Get Value, Set Value, Rise Event, Touch Event, Swipe Event actions, there are two ways to find items - classic and using XPath.


Classic - Search by HTML element parameters: tag, attribute and its value.

 

XPath - Search using XPath expressions. Compared with classic search or regular expressions, XPath lets you build a more versatile search that is more resistant to layout changes.

Which tab

Select the tab on which the item will be searched.
Possible values:

  • Active tab

  • First

  • By name - when you select this item, an input field for the name of the tab will appear.

  • By number - enter the index of the tab in the input field (numbering starts from zero!).

Document

It is recommended to set the value -1 (search in all documents on the page). 

Form

It is also better to set -1 (search in all forms on the page). Choosing this value will make the template more versatile.

 Why is it better to put "-1"?

Example: a page has three forms - search, registration, and ordering goods. We need to click the button in the order form, so we set the “Form” field to 2 (numbering from zero). After some time, a new login form appears on the site and is inserted in front of the order form. Form number 2 is now the login form, and our template will either give an error that the button was not found or (much worse) click another button in another form.

In the program settings, you can select two checkboxes - Search all forms on the page and Search all documents on page. Then, whenever you add an element to the Action Designer, the document and form numbers will be set to -1.
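The failure described in this example can be reproduced with a small Python model. It is only an illustration: the form names and the `find_button` helper are hypothetical and say nothing about how ProjectMaker searches internally.

```python
# Hypothetical model of a page: each form is a dict with a name and its buttons.
forms_before = [
    {"name": "search", "buttons": ["Find"]},
    {"name": "registration", "buttons": ["Sign up"]},
    {"name": "order", "buttons": ["Buy"]},
]

# Later the site inserts a login form in front of the order form.
forms_after = (
    forms_before[:2]
    + [{"name": "login", "buttons": ["Log in"]}]
    + forms_before[2:]
)


def find_button(forms, label, form_index=-1):
    """Return the name of the form that holds the button, or None.

    form_index=-1 mimics "search in all forms"; any other value
    restricts the search to that single form (numbering from zero).
    """
    candidates = forms if form_index == -1 else forms[form_index:form_index + 1]
    for form in candidates:
        if label in form["buttons"]:
            return form["name"]
    return None


fixed_before = find_button(forms_before, "Buy", form_index=2)  # found in "order"
fixed_after = find_button(forms_after, "Buy", form_index=2)    # form 2 is now "login"
any_after = find_button(forms_after, "Buy", form_index=-1)     # still found
```

With a fixed index the lookup stops working once the page layout shifts, while the -1 variant keeps finding the button.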

Tag (classic search only)

The actual HTML tag from which you want to get the value.

You can specify several tags at once; the separator is ; (semicolon).

Conditions (classic search only)

  1. Group - the priority of this condition. The higher the number, the lower the priority. If no element is found by the condition with the highest priority, the search moves on to the condition with the next priority, and so on until the element is found or the conditions run out. You can add several conditions with the same priority; they are then all applied together in the same search.

  2. Attribute - HTML attribute of the tag by which the search is performed.

  3. Search type :

    1. text - search by full or partial text occurrence;

    2. notext - search for elements that do not contain the specified text;

    3. regexp - search using regular expressions
      By default, regexp search is case insensitive. To change this, prepend (?-i) to your expression (this means “disable case-insensitive mode”).

  4. Value - the value of the HTML tag attribute

  5. Match # - the ordinal number of the found element (numbering from zero!). Ranges and variable macros can be used in this field.
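The effect of switching off case-insensitive matching for the regexp search type can be tried out in Python. One assumption to keep in mind: the program itself uses the .NET regex engine, which accepts a bare (?-i) prefix, while Python's `re` module only supports the scoped form `(?-i:...)`, so the sketch uses that form.

```python
import re

# re.IGNORECASE stands in for the default case-insensitive mode.
default_search = re.compile(r"captcha", re.IGNORECASE)

# Python needs the scoped form (?-i:...) to turn the flag off again.
strict_search = re.compile(r"(?-i:captcha)", re.IGNORECASE)

text_upper = "Enter the CAPTCHA"
text_lower = "Enter the captcha"

default_hit = default_search.search(text_upper) is not None
strict_hit_upper = strict_search.search(text_upper) is not None
strict_hit_lower = strict_search.search(text_lower) is not None
```

The default pattern matches regardless of case, while the strict one only matches the lowercase spelling.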

To delete a search term, left-click on the field to the left of it (highlighted in blue in the screenshot) and press the delete button on the keyboard.

Several conditions can be used to find the desired item.

Always try to choose search conditions so that exactly one element matches, i.e. the match number is 0 (numbering from zero).
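The way priority groups fall back from one to the next can be sketched as follows. This is a simplified model, not the program's actual algorithm: elements are plain dictionaries, conditions in the same group are assumed to be combined with AND, and `find_element` with its arguments is a hypothetical helper.

```python
import re


def matches(element, cond):
    """Check one condition against an element (a dict of attribute -> value)."""
    actual = element.get(cond["attribute"], "")
    if cond["type"] == "text":
        return cond["value"] in actual
    if cond["type"] == "notext":
        return cond["value"] not in actual
    if cond["type"] == "regexp":
        return re.search(cond["value"], actual, re.IGNORECASE) is not None
    raise ValueError(f"unknown search type: {cond['type']}")


def find_element(elements, conditions, match_no=0):
    """Try groups in ascending order; return the first element found or None."""
    for group in sorted({c["group"] for c in conditions}):
        group_conds = [c for c in conditions if c["group"] == group]
        found = [e for e in elements if all(matches(e, c) for c in group_conds)]
        if match_no < len(found):  # zero-based match number
            return found[match_no]
    return None


elements = [
    {"id": "captcha_img", "src": "/c.png"},
    {"id": "logo", "src": "/logo.png"},
]
conditions = [
    # Group 0 has the highest priority but matches nothing on this page.
    {"group": 0, "attribute": "id", "type": "text", "value": "kaptcha"},
    # Group 1 is tried next and finds the captcha image by its src.
    {"group": 1, "attribute": "src", "type": "regexp", "value": r"/c\.png$"},
]
found = find_element(elements, conditions)
```

Keeping a stricter condition in group 0 and a looser fallback in group 1 is one way to make a template survive small layout changes.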

Put to variable

The recognition result will be saved to the project variable specified here.


More

Expectation

Wait before executing - If positive numbers are specified in the FROM and TO fields, then the action will pause before starting work (the time will be randomly selected based on the specified range).

Wait no more for an item - if the item is not found after the time specified here, the action will exit on the red thread (with an error).
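Both waiting settings can be modelled in a few lines. This is a sketch under stated assumptions: the time units, the polling interval, and the helper names `pause_before` and `wait_for_element` are hypothetical and are not taken from the program.

```python
import random
import time


def pause_before(from_value, to_value):
    """Pick a random delay inside the FROM..TO range (0 if both are non-positive)."""
    if from_value <= 0 and to_value <= 0:
        return 0
    low, high = sorted((max(from_value, 0), max(to_value, 0)))
    return random.randint(low, high)


def wait_for_element(find, timeout_s, poll_s=0.05):
    """Poll find() until it returns an element or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        element = find()
        if element is not None:
            return element
        if time.monotonic() >= deadline:
            # Corresponds to leaving the action on the red thread.
            raise TimeoutError("element not found in time")
        time.sleep(poll_s)


delay = pause_before(100, 200)
```

A random delay inside a range makes the timing of actions less uniform, and the timeout guarantees the action does not hang forever when the captcha never appears.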

Module settings

In this field, you can enter additional parameters (conditions) for captcha recognition - case sensitivity, only Russian characters, mathematical captcha, several words, etc.

Format: parameter_name=parameter_value. Several parameters are separated by & (ampersand).

Example (based on RuCaptcha API): phrase=1&numeric=2&regsense=1 - the captcha consists of two or more words, contains only letters, and is case sensitive.
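The parameter string format is simple enough to parse and build by hand. The sketch below is illustrative only; `parse_module_settings` and `build_module_settings` are hypothetical helpers, and the parser tolerates spaces around = and & purely for convenience.

```python
def parse_module_settings(raw):
    """Split 'name=value&name=value' into a dict, ignoring surrounding spaces."""
    params = {}
    for pair in raw.split("&"):
        if not pair.strip():
            continue
        name, _, value = pair.partition("=")
        params[name.strip()] = value.strip()
    return params


def build_module_settings(params):
    """Join a dict back into the compact 'name=value&name=value' form."""
    return "&".join(f"{name}={value}" for name, value in params.items())


parsed = parse_module_settings("phrase = 1 & numeric = 2 & regsense = 1")
compact = build_module_settings(parsed)
```

Building the string from a dictionary makes it easier to assemble the field value from project variables without typos in the separators.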

 Examples of parameters from different services

Additional parameters and the value that these parameters can take are individual for each service.

Let's consider a few examples based on two popular captcha recognition services.

2Captcha - on the API description page https://2captcha.com/2captcha-api#solving_normal_captcha, scroll down to the table listing the parameters that can be specified.

Only some of the possible parameters are shown here.

Anti-Captcha - the documentation page for solving simple text captchas also has a table of valid parameters.

Even from just these two services and a small part of their parameters, you can see that

  • some parameters that are responsible for the same thing are named differently (case sensitivity - case and regsense)

  • others have the same name and the same purpose, but accept different types of values (phrase)

  • some parameters coincide in name, purpose, and accepted values, but one service accepts a few more values than the other (numeric)

Be extremely careful when writing a project for several captcha services using additional parameters.
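One way to stay careful is to keep a single translation table per service and generate the parameter string from neutral option names. The sketch below is an assumption-heavy illustration: the option names, the `module_settings` helper, and the exact value spellings (1/0 versus true/false) are hypothetical and must be checked against each service's parameter table.

```python
# Hypothetical translation table: neutral option name -> (service parameter, values).
TRANSLATION = {
    "2captcha": {
        "case_sensitive": ("regsense", {True: "1", False: "0"}),
        "phrase": ("phrase", {True: "1", False: "0"}),
    },
    "anti-captcha": {
        "case_sensitive": ("case", {True: "true", False: "false"}),
        "phrase": ("phrase", {True: "true", False: "false"}),
    },
}


def module_settings(service, **options):
    """Build the parameter string for one service from neutral options."""
    table = TRANSLATION[service]
    pairs = []
    for option, enabled in options.items():
        name, values = table[option]  # KeyError means the service lacks this option
        pairs.append(f"{name}={values[bool(enabled)]}")
    return "&".join(pairs)


for_2captcha = module_settings("2captcha", case_sensitive=True, phrase=True)
for_anticaptcha = module_settings("anti-captcha", case_sensitive=True, phrase=True)
```

An unknown option raises an error instead of silently sending a parameter the service does not understand.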

Captcha parameters

Scale - with this setting you can reduce or increase the size of the captcha image that is sent.

Glue captchas - sometimes a captcha consists of several images; they can be combined so that you do not spend money on recognizing separate parts. To merge captchas, if you did not merge them when recording the template, set the "Merge captchas" checkbox in the properties window of the first captcha element. Then right-click the next element, and a new item, Glue to captcha, will appear in the context menu.

Each click creates a new action; the last one will have the Last Captcha checkbox set (it is cleared for the previous ones).

Asynchronous recognition

This setting lets the template continue running instead of waiting for a response from the service.

When this option is enabled, a new action, Waiting for captcha recognition, is created. It has no settings, only the Go back to Recognition button, which takes you to the main action when clicked (very convenient when these actions are located at opposite ends of the action canvas). The main action has the reverse button - Go to the end of recognition.

When the template reaches the main Recognize captcha action, it sends the captcha to the service and continues working until it encounters the Waiting ... action, where it stops and waits for a response from the service. After the response is received, you can use the variables that were specified in the main action.
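The same send-now, wait-later flow can be illustrated with a thread pool. This is only an analogy in Python, not how the program is implemented; `recognize` is a hypothetical stand-in for the slow call to a captcha service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def recognize(image_bytes):
    """Hypothetical stand-in for a slow captcha service call."""
    time.sleep(0.2)
    return "x7k2p"


steps_done = []
with ThreadPoolExecutor(max_workers=1) as pool:
    # "Recognize captcha": send the image and move on immediately.
    pending = pool.submit(recognize, b"fake image bytes")

    # The template keeps working while the service is busy.
    for step in ("fill name", "fill email"):
        steps_done.append(step)

    # "Waiting for captcha recognition": block here until the answer arrives.
    answer = pending.result(timeout=5)
```

The time spent filling in other fields overlaps with the recognition time, so the total run is shorter than doing the two one after the other.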

Complaint URL

Captchas on the service are recognized by people, and people tend to make mistakes. Sometimes a worker misreads the captcha or does not read the task carefully and, instead of writing the answer to the expression 3 + 88 = ?, writes the expression itself, even though the settings indicated that this is a captcha with a mathematical task to solve.

This setting is meant for such cases: if the captcha was recognized incorrectly, sending a request to this URL files a complaint about that specific recognition, and the service will refund your money.

Do not abuse this option; use it only when the worker really made a mistake. If you complain and get refunds for correctly solved captchas, you will be banned very quickly.
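A conservative way to decide when to complain is to do it only when the answer is clearly malformed, as in the mathematical example above. The sketch below assumes a math captcha whose answer must be an integer; `maybe_complain` and its `send` argument are hypothetical, and the complaint URL is whatever the action stored for this recognition.

```python
import re
import urllib.request


def looks_like_math_answer(answer):
    """A solved math captcha is expected to be a plain integer."""
    return re.fullmatch(r"-?\d+", answer.strip()) is not None


def maybe_complain(answer, complaint_url, send=None):
    """Request the complaint URL only for clearly malformed answers.

    Returns True if a complaint was sent.
    """
    if looks_like_math_answer(answer):
        return False
    if send is None:
        # Default sender: a plain GET request to the complaint URL.
        def send(url):
            urllib.request.urlopen(url, timeout=10).close()
    send(complaint_url)
    return True


sent = []  # collects URLs instead of making real requests in this demo
complained = maybe_complain("3 + 88", "http://example.invalid/complain", send=sent.append)
not_complained = maybe_complain("91", "http://example.invalid/complain", send=sent.append)
```

Checking the answer format first keeps complaints limited to obvious worker errors, which is in line with the warning about abuse.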

Saving

With the help of these settings, you can save the image with the captcha and the answer to the specified directory.


  • Directory - the directory where the pictures will be saved (you can use variables)

  • Answers - where to save answers to captchas:

    • File name - convenient, but not always suitable, since captchas may contain characters that cannot be used in file names in Windows - \ / : * ? " < > |

    • To file - when this setting is selected, a captcha picture named captcha (X).png is saved in the specified directory, where X is the serial number of the captcha. A captcha (X).txt file containing the answer to this captcha is also created. In this case, the system's file naming restrictions no longer matter.

  • Ignore answer “sorry” - for some errors, the Recognize captcha action returns sorry instead of answering the captcha. When this option is enabled, the program will not save captchas with this answer.
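The saving options above can be sketched as follows. This is an illustration, not the program's code: `save_captcha` and `usable_as_file_name` are hypothetical helpers, and the exact spelling of the generated file names is an assumption.

```python
import os
import tempfile

# Characters that Windows does not allow in file names.
FORBIDDEN = set('\\/:*?"<>|')


def usable_as_file_name(answer):
    """True if the answer could be used as a file name on Windows."""
    return bool(answer) and not (set(answer) & FORBIDDEN)


def save_captcha(directory, index, image_bytes, answer, ignore_sorry=True):
    """'To file' mode: store the image and its answer under a numbered name."""
    if ignore_sorry and answer == "sorry":
        return None  # failed recognitions are useless as training data
    base = os.path.join(directory, f"captcha ({index})")
    with open(base + ".png", "wb") as image_file:
        image_file.write(image_bytes)
    with open(base + ".txt", "w", encoding="utf-8") as answer_file:
        answer_file.write(answer)
    return base


with tempfile.TemporaryDirectory() as folder:
    saved = save_captcha(folder, 0, b"fake-png-bytes", "a?b")
    skipped = save_captcha(folder, 1, b"fake-png-bytes", "sorry")
    files = sorted(os.listdir(folder))
```

The answer "a?b" could not be used as a file name, but it is stored without problems in the separate text file.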

Where it can be useful:

  • if you want to create your own module for the CapMonster Cloud service.

  • when using CapMonster 2 (a program for automatic captcha recognition) - it supports many captchas out of the box, but for some you need to create modules yourself. To create a module, you need a database of correctly recognized captchas, and this is where these action settings help: you recognize captchas manually or through services, save the captchas and answers, and then use them to train CapMonster 2.


Additional Information

Text captchas

Quite often, especially on weakly protected resources, you encounter a text captcha. It differs from a simple (graphic) captcha in that it is not drawn in a picture but simply written as text. In principle, such a captcha does not need to be sent anywhere; it can be taken (parsed) directly from the page text. To do this, use the Data action, select the page text, tick "parse the result", and enter a regular expression for parsing the page in the parameters.
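Parsing a text captcha with a regular expression looks roughly like this. The page text and the pattern are made up for illustration; a real page needs its own expression.

```python
import re

# Hypothetical page text containing a text captcha.
page_text = "Please type the code: K7PQ2 into the field below."

# Capture the word that follows the fixed label.
match = re.search(r"type the code:\s*(\w+)", page_text)
captcha = match.group(1) if match else None
```

If the label is missing from the page, `captcha` stays `None`, which is the case a template should route to its error branch.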

Mathematical captcha

There is also a mathematical text captcha. It is the same text captcha, except that it usually contains a mathematical expression like 58 + 63. You can turn this text into a picture and send it for recognition, or you can use JavaScript: add a JavaScript action from the Custom Code category and, in the code field, insert a reference to the variable that contains the parsed expression, for example 58 + 63. After execution the action will return the result, 121.

Flash captcha and captcha from any other element

If you come across a flash captcha, you can render it into a regular picture and send it for recognition as well. Find this element in the tree of elements, right-click it to open the menu of actions for this element, and select "This is a captcha". That's it!
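The same calculation can be sketched outside the program. The text above uses a JavaScript action; the Python version below is only an equivalent illustration, and the `solve` helper is hypothetical. It evaluates nothing but integers combined with +, - and *, so arbitrary page text is never executed as code.

```python
import ast
import operator

# Only these operators are accepted in a captcha expression.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def solve(expression):
    """Evaluate a simple arithmetic expression such as '58 + 63'."""

    def evaluate(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression.strip(), mode="eval").body)


result = solve("58 + 63")
```

Anything other than a plain arithmetic expression raises `ValueError` instead of being evaluated.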

How to handle CAPTCHA recognition errors

https://www.youtube.com/watch?v=z1uLzCEUcZ8

The video was recorded for an outdated version of ZennoPoster, but the error-handling algorithm itself is the same and does not depend on the program version.

How to take a screenshot of the browser using the Recognize captcha action?

Sometimes it becomes necessary to take a screenshot of either a specific HTML element or the entire site (even those parts of it that are out of sight).

If you only need a screenshot of the browser window (the visible area of the site), it is better to use the Images Processing action.

To do this:

  • add the Captcha Recognition action to the project (be sure to add it through the browser context menu; to do so, you can right-click any picture on the site)

  • select CaptchaSaver.dll as the recognition module

  • enter the search criteria for the element for which you want to take a screenshot

  • on the Additional tab, in Module parameters, specify the full path for saving the image (you can use variable macros)

 Example of action settings for a screenshot of the entire site

Usage example

Typical case

  • right-click the captcha image and select This is a captcha! from the context menu

immediately after this action is added, a manual recognition window will open; you can close it

  • select the required recognition module (by default MonkeyEnter.dll - manual input)

make sure you have specified the API key in the settings and that your balance on the service is not empty

  • then right-click the field where the captcha answer must be entered and select Field for the result of captcha recognition; one more action will be added that enters the answer (Recording must be enabled in the project for this)

or you can find the field manually using the Action Designer and enter the answer using the Set Value action

Sticking

For this example, a page with the following content will be used:

 Test page source code.

<!DOCTYPE html> <html dir="ltr">
<head>
<title>CAPTCHA Test</title>
</head>
<body>
<img src="" id="1">
<img src="" id="2">
<img src="" id="3">
<img src="" id="4">
</body>
</html>

Each character is a separate HTML element. Right-click the first picture and select This is a captcha!, select Stick captchas in the settings, then right-click each of the remaining pictures and select Stick to captcha from the context menu. As a result, you should get four actions:

After launch, the first three actions will only collect pictures and stick to each other, and only the last action will glue the final part and send it to the service to recognize the full captcha.
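The collect-then-send behaviour can be modelled in Python. This is a toy sketch, not the program's implementation: images are represented as lists of pixel rows, parts are assumed to be joined left to right, and `GluedCaptcha` with its `recognize` callback is hypothetical.

```python
def glue_horizontally(images):
    """Join images (lists of pixel rows of equal height) side by side."""
    height = len(images[0])
    if any(len(image) != height for image in images):
        raise ValueError("all parts must have the same height")
    return [sum((image[y] for image in images), []) for y in range(height)]


class GluedCaptcha:
    """Collects parts and calls the service only when the last one arrives."""

    def __init__(self, recognize):
        self.parts = []
        self.recognize = recognize

    def add(self, part, last=False):
        self.parts.append(part)
        if not last:
            return None  # intermediate actions only collect pictures
        return self.recognize(glue_horizontally(self.parts))


sent_images = []  # records what would be sent to the service


def fake_recognize(image):
    sent_images.append(image)
    return "ab"


captcha = GluedCaptcha(fake_recognize)
first = captcha.add([[1], [1]])                 # part 1: one column, two rows
second = captcha.add([[2, 2], [2, 2]], last=True)  # part 2 triggers the request
```

Only one request is made for the whole captcha, which is the point of gluing: the service is paid once instead of once per part.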

Additional parameters when sending

Let's imagine that there is a similar captcha:

It consists of separate parts and you need to write the result of the expression (in this specific case - addition).

 Source code of the page with captchas

<!DOCTYPE html> <html dir="ltr">
<head>
<title>CAPTCHA Test</title>
</head>
<body>
<img src="" id="2">
<img src="" id="1">
<img src="" id="3">
<img src="" id="4">
</body>
</html>

First you need to stick all the individual pictures into one. Then, for the last action, select the required service (in this example, RuCaptcha) and, in the Parameters field on the Additional tab, indicate that a mathematical operation must be performed (for RuCaptcha - calc=1).

