Which letter is hidden? Alphabet letter recognition with neural networks

    24.10.2020

    Imagine a screen in front of us divided into twelve cells, 4 x 3. The cells reflect the discreteness of the image elements: when an image is projected onto the screen, each cell is either illuminated or not. "Illumination" corresponds to an excitation value of one, "non-illumination" to zero. Thus, the letter O illuminates the set of cells shown in Fig. 2.1, and the letter A illuminates the screen as shown in Fig. 2.2.

    What must be done so that the device we are designing can tell which letter it is?

    Obviously, all the excitation signals from the screen cells illuminated by the letter O must be fed to a conjunctor, which implements the AND circuit. As shown in Fig. 2.1, a unit signal will appear at the output of the conjunctor if and only if all the cells on which the letter O is placed are illuminated. The presence of a unit signal at the output of the conjunctor will determine the answer: "This is the letter O."


    Fig. 2.1. Teaching the letter "O"


    Fig. 2.2. Teaching the letter "A"

    The same must be done for the letter A.

    Let's mark each cell of the screen with its coordinates. Then, in the language of mathematical logic, what we have done can be written in the form of logical statements - predicates:
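
    The concrete predicates depend on the cell sets defined by Fig. 2.1 and Fig. 2.2; schematically, with the index sets left abstract rather than copied from the figures, they have the form

    $$O = \bigwedge_{(i,j) \in S_O} x_{i,j}, \qquad A = \bigwedge_{(i,j) \in S_A} x_{i,j},$$

    where x_{i,j} = 1 if the cell with coordinates (i,j) is illuminated, and S_O, S_A are the sets of cells illuminated by the letters O and A, respectively.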

    These predicates determine the “electronic” embodiment using circuit design methods.

    In this case, the letters will not "interfere" with each other: their sets of illuminated screen cells coincide only partially, so the conjunction takes the value one for only one of the letters.

    What if we put the letter K on the screen? Then neither of the two conjunctors will produce a unit value, since there will be no complete coincidence with the illuminated cells of either letter. To "teach" the system the letter K, we need to introduce another conjunctor and repeat the same construction as above.

    Thus, we can say that we have built a system for recognizing two “correctly” given letters.

    But what if the letters on the screen are written with a trembling hand? Then we must allow alternative illumination of some neighboring screen cells and take this into account using the disjunction operation, OR. As is known, this operation produces a unit signal if there is at least one unit signal at its input.

    Let's consider recognizing the letter O while allowing cells (1,1), (1,3), (4,1), (4,3) to be illuminated or not. Then the previously constructed predicate takes the form:

    Similarly, for the letter A, let's allow cells (4,1) and (4,3) to be illuminated or not:


    Fig. 2.3. Joint teaching of the letters "O" and "A"

    Combining both predicates, we get the diagram in Fig. 2.3.

    Thus, we have implemented a "circuit-technical" approach to learning and recognition, based on Boolean functions operating on the Boolean variables 0 and 1.
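
    As a minimal sketch of this circuit-style approach, the conjunctors and disjunctors can be modelled directly with Boolean operations. The cell coordinates below are placeholders, not the actual sets from Fig. 2.1-2.3:

    // Illustrative sketch only: the cell sets are placeholders, not those of Fig. 2.1-2.3.
    using System.Linq;

    static class CircuitRecognizer
    {
        // screen[row, col] == true means the cell is illuminated (0-based indices).
        public static bool IsLetterO(bool[,] screen)
        {
            // Conjunctor (AND): all required cells must be lit...
            var required = new[] { (0, 0), (0, 2), (3, 0), (3, 2) };   // placeholder cells
            bool conjunction = required.All(c => screen[c.Item1, c.Item2]);

            // ...and a disjunctor (OR) accepts either of two alternative cells,
            // tolerating a letter drawn with a "trembling hand".
            bool disjunction = screen[1, 0] || screen[1, 1];           // placeholder pair
            return conjunction && disjunction;
        }
    }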

    Construction of a logical neural network trained to recognize letters

    Now let’s take that step, that transition, which determines the ingenious simplicity of the natural embodiment, designed for incomplete data, unreliability, “noise”, the requirement of high speed, high reliability and uniformity. For we cannot imagine an electronic circuit hidden in the skull.

    Nature, and we as part of it, never have accurate, definite and reliable information. The illumination of the screen cells, like that of the receptors of our eye, is never complete, the image is never perfectly correct, there is noise, there are omissions, etc. Then the concepts of similarity and association acquire vital importance. "What is most similar to the presented image or the situation that has arisen, and what response actions are most justified?" - this is the question that determines the principle of our life among many dangers and achievements. The associativity of our thinking is absolute.

    This means that we need to move away from well-defined Boolean variables (0, 1, “yes - no”, “white - black”, etc.) towards uncertainty, reliability or other assessments of information - towards real variables.

    But then it is necessary to move away from Boolean algebra, since the concepts of conjunction and disjunction for real variables are not defined. This is where the analysis and application of the principles of natural implementation comes to the rescue - the principles of the neural network embodied in our brain.

    Let's transform the trained circuit we received into a neural network (Fig. 2.4).

    Each cell of the screen is a receptor neuron which, as a result of illumination, acquires a certain amount of excitation, taking a value between zero and one. The receptors that replace the screen form the input, or receptor, layer of the neural network. We will replace each conjunctor and disjunctor with a single neuron model, the same for the entire network. Let us introduce the output layer of the network, which in our example consists of two neurons whose excitation determines the recognition result. Let's name the neurons of the output layer after the letters - O and A.

    Receptors, like the screen, are excited from outside. All other neurons, imitating the spread of excitation in the brain, implement a transfer function (in terms of automatic control theory) or activation function (in terms of neural network theory). This function converts the signals at the input of a neuron, taking into account the weights of these inputs (we will postpone their consideration for now), into the value of the excitation of this neuron, which is transmitted further through the network in accordance with the connections of the neurons and reaches one or more neurons of the output layer.


    Fig. 2.4. Neural network for recognizing the letters "O" and "A"

    Since the operation of the brain is imitated at the logical level, the choice of activation function is quite simple. In our example it is enough to select the following activation function to find the excitation value of the i-th neuron:

    Initially we find

    Then we put
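
    A typical two-step rule of this kind, given here only as an assumption since the concrete formulas accompany the original figures, is a weighted sum followed by a threshold:

    $$V_i := \sum_{j \in J(i)} \omega_j V_j,$$

    where J(i) is the set of neurons feeding neuron i and omega_j are the weights of those connections, and then

    $$V_i := \begin{cases} V_i, & V_i \ge h \\ 0, & V_i < h \end{cases}$$

    where h is an excitation threshold.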

    This project does not claim to be the best in the world and is not intended as a competitor to FineReader, but I hope that the idea of character pattern recognition using the Euler characteristic will be new.

    Introduction to the Euler characteristic of an image.

    The basic idea is that you take a black-and-white image and, assuming that 0 is a white pixel and 1 is a black pixel, the entire image becomes a matrix of zeros and ones. Such a black-and-white image can be represented as a set of fragments measuring 2 by 2 pixels; all 16 possible combinations are shown in the figure:

    Each of the images pic1, pic2, ... shows in red the square of the current step of the algorithm, inside which lies one of the fragments F from the picture above. At each step the occurrences of each fragment are counted; as a result, for the image Original we obtain a set of counts, which from now on will be called the Euler characteristic of the image, or the characteristic set.


    COMMENT: in practice, the F0 value (for the Original image this value is 8) is not used, since it represents the background of the image. Therefore, 15 values will be used, from F1 to F15.
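
    As a minimal sketch (not the project's actual code), the characteristic set can be computed by sliding a 2 x 2 window over the binary image and counting which of the 16 fragment types occurs at each position. The bit order used here to number the fragments is an assumption:

    static class EulerCharacteristic
    {
        // image[row, col] == true means a black pixel; returns counts of fragments F0..F15.
        public static int[] Compute(bool[,] image)
        {
            int height = image.GetLength(0);
            int width = image.GetLength(1);
            var counts = new int[16];

            for (int y = 0; y < height - 1; y++)
            {
                for (int x = 0; x < width - 1; x++)
                {
                    // Encode the 2x2 fragment as a 4-bit number 0..15.
                    int code = (image[y, x] ? 1 : 0)
                             | (image[y, x + 1] ? 2 : 0)
                             | (image[y + 1, x] ? 4 : 0)
                             | (image[y + 1, x + 1] ? 8 : 0);
                    counts[code]++;
                }
            }
            return counts;   // counts[0] (F0, the background) is normally ignored
        }
    }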

    Properties of the Euler characteristic of an image.

    1. The value of the characteristic set is unique, in other words, there are no two images with the same Euler characteristic.
    2. There is no algorithm for converting from a characteristic set to the original image; the only way is brute force.

    What is the text recognition algorithm?

    The idea of ​​letter recognition is that we pre-calculate the Euler characteristic for all characters in the alphabet of the language and store this in the knowledge base. Then we will calculate the Euler characteristic for parts of the recognized image and search for it in the knowledge base.

    Recognition stages:

    1. The image can be either black-and-white or color, so the first stage is approximation of the image, that is, obtaining a black-and-white image from it.
    2. We make a pixel-by-pixel pass through the entire image in order to find black pixels. When a shaded pixel is found, a recursive search is launched for all shaded pixels adjacent to it, and then to those, and so on (a sketch of this search is given after this list). As a result, we obtain a fragment of the image, which can be a whole character, a part of one, or "garbage" that should be discarded.
    3. After finding all the unconnected parts of the image, the Euler characteristic is calculated for each of them.
    4. Next, the analyzer comes into operation and, going through each fragment, determines whether the value of its Euler characteristic is present in the knowledge base. If the value is found, we consider the fragment recognized; otherwise we leave it for further study.
    5. Unrecognized parts of the image are subjected to heuristic analysis, that is, I try to find the most suitable value in the knowledge base. If that fails, an attempt is made to "glue together" nearby fragments and search the knowledge base for the result. What is the "gluing" for? The fact is that not all letters consist of one continuous image: for example, the exclamation mark "!" consists of 2 segments (a stick and a dot), so before looking it up in the knowledge base we need to compute the combined Euler characteristic of both parts. If even after gluing with adjacent segments an acceptable result cannot be found, the fragment is considered garbage and skipped.
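
    A minimal sketch of the search from stage 2, assuming the same bool[,] representation of the binary image (true = black pixel); the article describes a recursive search, written here iteratively with an explicit stack:

    using System.Collections.Generic;

    static class FragmentSearch
    {
        // Collects one connected fragment of black pixels starting from (startX, startY).
        public static List<(int X, int Y)> CollectFragment(bool[,] image, bool[,] visited, int startX, int startY)
        {
            var fragment = new List<(int X, int Y)>();
            var stack = new Stack<(int X, int Y)>();
            stack.Push((startX, startY));

            while (stack.Count > 0)
            {
                var (x, y) = stack.Pop();
                if (y < 0 || x < 0 || y >= image.GetLength(0) || x >= image.GetLength(1))
                    continue;
                if (visited[y, x] || !image[y, x])
                    continue;

                visited[y, x] = true;
                fragment.Add((x, y));

                // Push the 8 neighbours of the current black pixel.
                for (int dy = -1; dy <= 1; dy++)
                    for (int dx = -1; dx <= 1; dx++)
                        if (dx != 0 || dy != 0)
                            stack.Push((x + dx, y + dy));
            }
            return fragment;
        }
    }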

    System composition:

    1. Knowledge base - a file or files, originally created by me or someone else, containing the characteristic sets of characters and required for recognition.
    2. Core - contains the basic functions that perform recognition.
    3. Generator - a module for creating a knowledge base.

    ClearType and anti-aliasing.

    So, as input we have an image to recognize, and the goal is to make it black-and-white, suitable for starting the recognition process. It would seem that nothing could be simpler: count all white pixels as 0 and all the rest as 1. But not everything is so simple. The text in an image can be anti-aliased or not. Anti-aliased characters look smooth and without sharp corners, while non-anti-aliased text on modern monitors shows visible pixels along the outline. With the advent of LCD (liquid crystal) screens, ClearType (for Windows) and other types of anti-aliasing were created that take advantage of the structure of the monitor matrix: the pixels of the text image change colors, after which the text looks much "softer". To see the result of the smoothing, you can type some letter (or text), for example in mspaint, and zoom in - the text turns into a kind of multi-colored mosaic.

    What is going on? Why do we see an ordinary symbol at normal scale? Are our eyes deceiving us? The fact is that a pixel of an LCD monitor does not consist of a single element that can take on the desired color, but of 3 subpixels of 3 colors, which together are enough to produce the desired color. The goal of ClearType is therefore to obtain the most visually pleasing text by exploiting this feature of the LCD matrix, and this is achieved using subpixel rendering. Anyone with a "Magnifying Glass" can, as an experiment, enlarge any part of a switched-on screen and see the matrix as in the picture below.

    The figure shows a square of 3x3 pixels of the LCD matrix.

    Attention! This feature complicates obtaining a black and white image and greatly affects the result, since it does not always make it possible to obtain the same image, the Euler characteristic of which is saved in the knowledge base. Thus, the difference in images forces a heuristic analysis, which may not always be successful.


    Obtaining a black and white image.

    I was not satisfied with the quality of the color-to-black-and-white conversion algorithms found on the Internet. After applying them, the images of characters subjected to subpixel rendering came out with different widths, with breaks in the letter strokes, and with incomprehensible garbage. As a result, I decided to obtain black-and-white images by analyzing the brightness of each pixel: all pixels darker than a brightness threshold of 130 units were considered black, the rest white. This method is not ideal and still gives unsatisfactory results if the brightness of the text changes, but at least it produces images similar to those whose values are stored in the knowledge base. The implementation can be seen in the LuminosityApproximator class.
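
    A minimal sketch of such brightness thresholding - an illustrative re-implementation, not the project's LuminosityApproximator class; the threshold of 130 is taken from the text, and pixels darker than the threshold are assumed to become black:

    using System.Drawing;   // System.Drawing / System.Drawing.Common is assumed to be available

    static class Binarizer
    {
        public static bool[,] ToBlackAndWhite(Bitmap source, int threshold = 130)
        {
            var result = new bool[source.Height, source.Width];   // true = black pixel
            for (int y = 0; y < source.Height; y++)
            {
                for (int x = 0; x < source.Width; x++)
                {
                    Color c = source.GetPixel(x, y);
                    // Perceived luminance of the pixel (0 = black, 255 = white).
                    double luminance = 0.299 * c.R + 0.587 * c.G + 0.114 * c.B;
                    result[y, x] = luminance < threshold;
                }
            }
            return result;
        }
    }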

    Knowledge base.

    The initial idea for filling the knowledge base was that, for each letter of the language, I would calculate the Euler characteristic of the symbol image rendered in each of the 140 fonts installed on my computer (C:\Windows\Fonts), for all font styles (Regular, Bold, Italic) and sizes from 8 to 32, thereby covering all, or almost all, variations of letters and making the base universal. Unfortunately, this turned out to be not as good as it seems. With these conditions, this is what I got:

    1. The knowledge base file turned out to be quite large (about 3 megabytes) for Russian and for English, even though each Euler characteristic is stored as a simple string of 15 numbers and the file itself is a compressed archive (DeflateStream), which is then unpacked in memory.
    2. It takes about 10 seconds to deserialize the knowledge base, and the time for comparing characteristic sets also suffered. I could not come up with a suitable GetHashCode() function, so I had to compare the sets element by element. Compared to a knowledge base of 3-5 fonts, the time for analyzing text with a database of 140 fonts increased by 30-50 times. At the same time, identical characteristic sets are not stored twice in the knowledge base, even though some characters in different fonts look the same; for some characters the same set occurs in, say, 20 or 21 fonts.

    Therefore, I had to create a small knowledge base that ships inside the Core module and makes it possible to check the functionality. There is a very serious problem when filling the database: not all fonts display small characters correctly. For example, the character "e", when rendered at size 8 in the font "Franklin Gothic Medium", turns out to be:

    And it bears little resemblance to the original. Moreover, if you add it to the knowledge base, it will greatly worsen the results of the heuristics, since the analysis of symbols similar to this one becomes misleading. This symbol was obtained in different fonts for different letters. The process of filling the knowledge base needs to be supervised so that each symbol image, before being saved to the knowledge base, is checked by a person for correspondence to the letter. But, unfortunately, I don't have that much energy and time.

    Character search algorithm.

    I will say right away that I initially underestimated the search problem and forgot that symbols can consist of several parts. It seemed to me that during a pixel-by-pixel pass I would encounter a symbol, find its parts (if any), combine them and analyze them. A typical pass would look like this: I find the letter "H" (in the knowledge base) and consider that all characters below its top point and above its bottom point belong to the current line and should be glued together and analyzed:

    But this is an ideal situation; during recognition, I had to deal with torn images, which, in addition to everything, could have a huge amount of garbage located next to the text:


    This image of the word "yes" illustrates the complexity of the analysis. We will assume that this is a complete line, but that b13 and i6 are fragments of garbage produced by the approximation. The character "y" is missing its dot, and none of the characters is present in the knowledge base, so we cannot say with certainty that we are dealing with a line of text running from row "c" to row "i". And the line height is very important to us, since for gluing we need to know how close fragments must be in order to be "glued together" and analyzed. After all, a situation may arise where we accidentally start gluing together characters from two lines, and the results of such recognition will be far from ideal.

    Heuristics in image analysis.


    What are heuristics in image recognition?
    This is the process by which a characteristic set not present in the knowledge base is recognized as a correct letter of the alphabet. I thought for a long time about how to perform the analysis, and in the end the most successful algorithm turned out to be this:

    1. I find all the characteristic sets in the knowledge base that have the greatest number of F values exactly matching those of the recognized image.
    2. Next, among them I keep only those characteristic sets whose remaining (non-matching) F values differ from those of the recognized image by no more than 1 in absolute value (-1 <= dF <= +1). All of this is counted for each letter of the alphabet.
    3. Then I find the symbol that has the greatest number of occurrences and consider it the result of the heuristic analysis.
    This algorithm does not give the best results on small character images (font sizes 7-12), but that may be because the knowledge base contains characteristic sets of similar images belonging to different symbols. A sketch of the matching is given below.
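
    A minimal sketch of this matching, under the assumption that the knowledge base is simply a list of (letter, characteristic set) pairs and each set is an int[16] in which F1..F15 are used:

    using System;
    using System.Collections.Generic;
    using System.Linq;

    static class HeuristicMatcher
    {
        public static char? FindBestMatch(int[] unknown, List<(char Letter, int[] Set)> knowledgeBase)
        {
            if (knowledgeBase.Count == 0) return null;

            // Step 1: count exactly matching F values for every stored set.
            var scored = knowledgeBase
                .Select(e => (e.Letter, e.Set,
                              Matches: Enumerable.Range(1, 15).Count(f => e.Set[f] == unknown[f])))
                .ToList();
            int best = scored.Max(s => s.Matches);

            // Step 2: keep the best-scoring sets whose remaining F values differ by at most 1.
            var candidates = scored
                .Where(s => s.Matches == best &&
                            Enumerable.Range(1, 15).All(f => Math.Abs(s.Set[f] - unknown[f]) <= 1))
                .ToList();
            if (candidates.Count == 0) return null;   // nothing close enough: treat as garbage

            // Step 3: the letter that occurs most often among the candidates wins.
            return candidates.GroupBy(c => c.Letter)
                             .OrderByDescending(g => g.Count())
                             .First().Key;
        }
    }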

    An example of use in C#.

    An example of starting recognition for an image image. The result variable will contain the text:

    var recognizer = new TextRecognizer(container);
    var report = recognizer.Recognize(image);

    // Raw text.
    var result = report.RawText();

    // List of all fragments and the recognition state for each one.
    var fragments = report.Symbols;

    Demo project.

    For a visual demonstration of the work, I wrote a WPF application. It is launched from the project named "Qocr.Application.Wpf". An example of a window with the recognition result is shown below:

    To recognize an image you will need:

    • Presses "New Image" selects an image for recognition
    • Using the " Black and White"You can see which image will be analyzed. If you see an extremely low-quality image, then do not expect good results. To improve the results, you can try to write a color image to black and white converter yourself.
    • Choosing a language "Language".
    • Clicks recognize "Recognize".
    All image fragments should be marked with an orange or green frame.
    An example of English-language text recognition:

    and with probability 0.1 to class C2. The stated problem can be solved using a multilayer perceptron (MLP) with N inputs and M outputs, trained to produce the vector c at the output when the vector p is given at the input.

    During the learning process, the network builds a mapping P → C. It is not possible to obtain this mapping in its entirety, but it is possible to obtain an arbitrary number of pairs (p → c) connected by the mapping. For an arbitrary vector p at the input we can then obtain approximate probabilities of class membership at the output.

    It often turns out that the components of the output vector can be less than 0 or greater than 1, and the second condition (1) is only approximately satisfied. Inaccuracy is a consequence of the analogue nature of neural networks. Most of the results obtained using neural networks are inaccurate. In addition, when training a network, the specified conditions imposed on the probabilities are not directly introduced into the network, but are implicitly contained in the set of data on which the network is trained. This is the second reason for the incorrectness of the result.

    There are other ways of formalization.

    We will represent letters in the form of dot images (Fig.).

    Fig. Dot image of a letter.

    A dark cell (pixel) in the image corresponds to I_ij = 1, a light one to I_ij = 0. The task is to determine from the image which letter was presented.

    Let's build an MLP with N_i x N_j inputs, where each input corresponds to one pixel: x_k = I_ij. The pixel brightnesses will be the components of the input vector.

    As output signals, we choose the probabilities that the presented image corresponds to a given letter:

    The network calculates the output:

    where an output c_1 = 0.9 means, for example, that an image of the letter "A" is presented and the network is 90% sure of it, an output c_2 = 0.1 means that the image corresponds to the letter "B" with probability 10%, and so on.
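
    The formulas referred to above are not reproduced in this copy. Written out under the natural reading of the text, and given here as an assumption, the outputs are class-membership probabilities:

    $$c_m \approx P(\text{letter } m \mid I), \qquad 0 \le c_m \le 1, \qquad \sum_{m=1}^{M} c_m \approx 1,$$

    and the approximate nature of these conditions is exactly the inaccuracy discussed above.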

    There is another way: the network inputs are chosen in the same way, but there is only one output - the number m of the presented letter. The network learns to give the value m for the presented image I:



    (I ij) → m

    In this case, the disadvantage is that letters with similar numbers m, but dissimilar images, may be confused by the network during recognition.

    Letter recognition exercise. Various difficulty levels. A noise mask is applied to the letter. Sometimes you need to be quick-witted in order to understand by elimination what kind of letter was in the task.

    Teaching children to read and letters of the Russian alphabet. What letter is shown? Choose the correct answer on the right.

    Which letter is hidden? An online game for early childhood development. Recognition of the letters of the Russian alphabet.

    How to learn letters of the Russian alphabet

    Often the letters of the Russian alphabet are taught in order, as they appear in the primer. In fact, letters should be taught in order of frequency of use. A little hint: letters in the center of the keyboard are used more often than those on the periphery. Therefore, first you need to memorize A, P, R, O... and leave letters such as Y, X, F, Shch for later.

    What is better - teaching a child to read letters or syllables?

    Many teachers teach immediately in syllables. I suggest you get around this small problem and play online games instead of learning syllables. This is how the child learns and plays at the same time. Or rather, it seems to him that he is playing and at the same time involuntarily repeating the necessary sounds.

    The advantage of online games is that if you pronounce a letter incorrectly, the simulator will patiently repeat the correct answer until you remember.

    Do ABC books help you learn letters? Why paper primers are still used in teaching practice

    Traditionally, paper ABC books are used to teach letters. Their advantages are undeniable. If you drop the paper version on the floor, you don’t have to worry about the device breaking. Primers can be opened on a specific page and placed in a visible place. All this is not found in electronic devices.

    However, programmable reading training simulators also have certain advantages, for example, they can speak, unlike their paper counterparts. Therefore, we can recommend both paper and electronic sources.

    Do online exercises help you remember letters?

    The main point of electronic and online games is that a person involuntarily repeats the same information many times. The more often the repetition occurs, the more firmly the information settles in the mind. That is why online exercises are a very useful addition to traditional blocks and paper books.

    At what age should a child be sent to educational centers?

    Children mature at different speeds. Usually girls up to a certain age are ahead of boys in development: they begin to speak earlier, are more socially oriented and more amenable to learning. Boys, on the contrary, often keep to themselves. From this we can conclude that girls learn to read a little earlier than boys. But this is only an outward pattern. Each child is individual, and his readiness for learning can be tested in practice. Does your child enjoy attending classes? Does anything remain in his head after the lesson?

    Maybe try to study on your own, especially since riding the bus takes time, and no one understands your baby better than mom and dad.

    What to do if your child does not remember letters

    Studying is difficult. And it does not depend on whether it is an adult or a child. It is very, very difficult to learn. In addition, children learn only through play. Another fact is that in order to learn something, it must be practiced or repeated many times. Therefore, it is not surprising that children remember letters very poorly.

    There is a separate group of children who begin to speak late and at the same time confuse not only letters but also sounds. With such children you need to draw letters together, using all possible materials for this: cereals, matches, pebbles, pencils - whatever is at hand. Draw a letter and ask your child to repeat it.

    You can do graphic dictations, or you can play "draw and repeat".

    What to do if your baby confuses letters, for example, D and T

    If a child confuses letters, it means it is too early to move on to reading words. Go back and repeat the letters. Children often confuse voiced and voiceless letters, or letters with similar shapes, for example P and R. Repetition practice can help. For example, you can sculpt letters together, or make letters with the body, for example by stretching your arms out to the sides to depict the letter T.

    How to teach a child to memorize letters if he doesn’t want to

    Repetition is the mother of learning. Repeat letters in words, repeat letters in syllables, try to guess the letters. Let the child write a letter while you try to guess it. Or do the opposite: form a letter from grains of rice and let your son or daughter guess what letter it is. You can also write with a stick in the sand.

    Why can't he pronounce letters correctly? How to teach a child to pronounce letters clearly and distinctly?

    Gaps may exist at the physiological level: the person does not hear himself correctly, or it seems to him that he is speaking correctly. It is very easy to check: just record the child reading on a voice recorder and listen to it.

    It could also be a simple lack of training. Different people need a different number of times to repeat information before it is remembered, and children are no exception. It needs to be repeated many times and in different situations before he begins to pronounce letters and sounds correctly.

    It should also be noted that you need to love children and work with them regularly. Don't let things slide.

    How to teach your child the alphabet to prepare for school

    You need to work with children in game form. Exactly as stated on this site. Another secret to learning is that you need to study in small portions. Children cannot maintain attention for more than 5 minutes. Therefore, it is simply useless to study longer.

    What letters should you start memorizing the alphabet with?

    You need to start memorizing letters with commonly used letters. The second secret is to remember the letters that make up the child’s name, the name of mom and dad, you can add to these words the names of brother and sister, grandparents. These are the most favorite names.

    By the way, if you are learning to touch type, then the first word with which you need to start typing training is again your first and last name.

    Does your baby need to memorize the letters of the English alphabet?

    Knowledge of the English alphabet won't hurt. At school they do not study the alphabet but start reading right away, leaving the alphabet to the parents. It is also worth noting that uppercase and lowercase English letters look different, and both forms must be remembered. If your child started speaking late, then remembering Latin letters will most likely be a problem for him.

    Is it possible to teach a child to read immediately in words?

    Written Russian looks the same as spoken Russian, unlike English or French, so remember the words

    How to remember numbers for a preschooler

    Draw numbers, count sticks, when you walk, count red and white cars, count whether there are more men or women walking down the street. Turn everything into games.

    Try to read the text letter by letter yourself - not only will it take a long time, but it will also be unlike the way we actually speak. Adults do not spell - unless the word is unfamiliar or in a foreign language. Then, in order to hear it, they read it slowly and carefully pronounce the words.

    Why does a preschooler forget letters? Teaching reading through games

    Why does a baby forget letters even though he learned them yesterday?

    Usually, a child easily remembers some letters, but not so much others. The role of an adult is to note what his ward does not succeed and give additional tasks.

    Another important thing is regularity. Since for a child all learning is, frankly speaking, cramming and repetition, the learning process should be such that information is repeated at certain intervals.

    Ebbinghaus (read more about this on Wikipedia) studied how quickly information that is meaningless to a person is forgotten and came to the conclusion that 40% of the information is forgotten in the first twenty minutes. And, if it is impossible to say exactly what a particular letter means, then this is tantamount to the fact that the letter is completely unfamiliar. There must be an unambiguous 100% recognition.

    Repeat, repeat, repeat

    For example, you practice the "warehouse" (a syllable, a combination of letters) ON, and the child has more or less learned to recognize and read this combination. Add the syllable NO to the tasks and ask him to read words, helping him with letters that are still unfamiliar. The child can also click on the syllables himself and listen to the computer read them.

    It is required to create a neural network to recognize 26 letters of the Latin alphabet. We will assume that there is a system for reading characters, which represents each character in the form of a matrix. For example, the character A can be represented as shown in Fig. 2.22.

    Fig. 2.22. Symbol representation

    The actual system for reading characters does not work perfectly, and the characters themselves differ in style. Therefore, for example, for symbol A, the units may not be located in the same cells as shown in Fig. 2.22. Additionally, non-zero values ​​may occur outside the character outline. The cells corresponding to the outline of the symbol may contain values ​​different from 1. We will call all distortions noise.

    MATLAB has a function prprob, which returns a 35 x 26 matrix alphabet, each column of which is a character matrix written out as a 35-element vector describing the corresponding letter (the first column describes the letter A, the second the letter B, etc.). The function prprob also returns a 26 x 26 target matrix targets, each column of which contains a single 1 in the row corresponding to the letter number, with the remaining elements of the column equal to zero. For example, the first column, corresponding to the letter A, contains a 1 in the first row.

    Example. Let's define a template for the letter A (program Template_A.m).

    % Example of forming a template for the letter A
    [alphabet, targets] = prprob;   % letter patterns and target vectors
    i = 1;                          % number of the letter A
    v = alphabet(:,i);              % 35-element vector corresponding to the letter A
    template = reshape(v, 5, 7)';   % reshape into a 5x7 matrix, then transpose
    plotchar(v);                    % draw the 35 vector elements in a grid

    In addition to the already described function prprob, the program uses the function reshape, which forms a 5 x 7 matrix that becomes 7 x 5 after transposition (check that the 7 x 5 matrix cannot be formed directly), and the function plotchar, which draws the 35 vector elements in a grid pattern. After executing the program Template_A.m we obtain the matrix template and the letter A template shown in Fig. 2.23.

    Fig. 2.23. Formed letter A template

    To recognize letters of the Latin alphabet, it is necessary to build a neural network with 35 inputs and 26 neurons in the output layer. Let's take the number of neurons in the hidden layer to be 10 (this number was chosen experimentally). If difficulties arise during training, the number of neurons in this layer can be increased.



    The pattern recognition network is built by the function patternnet. Please note that when creating a network, the number of neurons in the input and output layers is not specified. These parameters are implicitly set when training the network.

    Consider a program for recognizing letters of the Latin alphabet, Char_recognition.m:

    % Latin alphabet letter recognition program
    [alphabet, targets] = prprob;   % formation of input and target vectors
    size(alphabet)
    size(targets)

    % Network creation and training in the absence of noise
    net = patternnet(10);
    P = alphabet;  T = targets;
    net = train(net,P,T);

    % Training in the presence of noise
    netn = net;
    T = [targets targets targets targets];
    P = [alphabet, alphabet, (alphabet + randn(35,26)*0.1), (alphabet + randn(35,26)*0.2)];
    netn = train(netn,P,T);
    netn = train(netn, alphabet, targets);   % retraining in the absence of noise

    % Network testing
    noise_rage = 0:0.05:0.5;   % array of noise levels (noise standard deviations)
    max_test = 10;             % number of noisy vectors per symbol at each noise level
    network1 = [];  network2 = [];  T = targets;
    for noiselevel = noise_rage
        errors1 = 0;  errors2 = 0;
        for i = 1:max_test
            P = alphabet + randn(35,26)*noiselevel;
            % Test for network 1 (trained without noise)
            AA = compet(net(P));
            errors1 = errors1 + sum(sum(abs(AA-T)))/2;
            % Test for network 2 (trained with noise)
            AAn = compet(netn(P));
            errors2 = errors2 + sum(sum(abs(AAn-T)))/2;
        end
        network1 = [network1 errors1/26/max_test];
        network2 = [network2 errors2/26/max_test];
    end
    plot(noise_rage, network1*100, noise_rage, network2*100);
    title("Network error");
    xlabel("Noise level");
    ylabel("Error percentage");

    The operator [alphabet, targets] = prprob; forms the array of input vectors alphabet, of size 35 x 26, with the alphabet character patterns, and the array of target vectors targets.

    The network is created by the operator net = patternnet(10); we accept the default values for the remaining network parameters. The network is first trained in the absence of noise. It is then trained on sets of ideal and noisy vectors. Two sets of ideal vectors are used so that the network retains the ability to classify ideal (noise-free) vectors. After training on noisy data, the network may have "forgotten" how to classify some noise-free vectors. Therefore, the network is trained once more on ideal vectors.

    The following program fragment performs training in the absence of noise:

    % Network training in the absence of noise

    net = train(net,P,T);

    disp("Network training in the absence of noise is completed. Press Enter");

    Training in the presence of noise is carried out using two ideal and two noisy copies of the input vectors. The noise was simulated by pseudo-random, normally distributed numbers with zero mean and standard deviations of 0.1 and 0.2. Training in the presence of noise is performed by the following program fragment:

    % Learning in the presence of noise

    netn = net;   % save a copy of the trained network
    T = [targets targets targets targets];
    P = [alphabet, alphabet, (alphabet + randn(35,26)*0.1), (alphabet + randn(35,26)*0.2)];
    netn = train(netn,P,T);

    disp("Network training in the presence of noise is completed. Press Enter");

    Since the network was trained in the presence of noise, it makes sense to repeat the training without noise to ensure correct classification of ideal vectors:

    % Retraining in the absence of noise

    netn = train(netn, alphabet, targets);

    disp("Retraining the network in the absence of noise is completed. Press Enter");

    Network testing was carried out for two networks: network 1, trained on ideal vectors, and network 2, trained on noisy sequences. Noise with a mean of 0 and a standard deviation from 0 to 0.5 in steps of 0.05 was added to the input vectors. For each noise level, 10 noisy vectors were generated for each symbol and the network output was calculated (it would be desirable to increase the number of noisy vectors, but this would significantly increase the running time of the program). The network is trained to produce a one in the single element of the output vector whose position corresponds to the number of the recognized letter, and zeros in the remaining elements. In practice the output vector never consists of exact ones and zeros, so under noise conditions the output vector is processed by the function compet, which transforms it so that the largest output receives the value 1 and all other outputs receive the value 0.

    The corresponding program fragment looks like:

    % Perform a test for each noise level
    network1 = [];  network2 = [];
    T = targets;
    for noiselevel = noise_rage
        errors1 = 0;  errors2 = 0;
        for i = 1:max_test
            P = alphabet + randn(35, 26)*noiselevel;
            % Test for network 1
            AA = compet(net(P));
            errors1 = errors1 + sum(sum(abs(AA-T)))/2;
            % Test for network 2
            AAn = compet(netn(P));
            errors2 = errors2 + sum(sum(abs(AAn-T)))/2;
        end
        % Average error values (max_test sequences of 26 target vectors)
        network1 = [network1 errors1/26/max_test];
        network2 = [network2 errors2/26/max_test];
    end
    plot(noise_rage, network1*100, noise_rage, network2*100);
    title("Network error");
    xlabel("Noise level");
    ylabel("Error percentage");
    legend("Ideal input vectors","Noisy input vectors");
    disp("Testing complete");

    When calculating the recognition error, for example errors1 = errors1 + sum(sum(abs(AA-T)))/2, it is taken into account that in the case of incorrect recognition two elements of the output vector and the target vector do not coincide, which is why the error is divided by 2. The expression sum(abs(AA-T)) calculates the number of mismatched elements for one example, and sum(sum(abs(AA-T))) calculates the number of mismatched elements over all examples.

    Recognition error graphs for the network trained on ideal input vectors and the network trained on noisy vectors are shown in Fig. 2.24. Figure 2.24 shows that the network trained on noisy images gives a small error, a result that could not be achieved with the network trained only on ideal input vectors.

    Fig. 2.24. Network errors depending on noise level

    Let's check the operation of the trained network (the trained network must be present in the MATLAB workspace). The program Recognition_J.m generates a noisy vector for the letter J and recognizes the letter. The function randn generates a pseudo-random number distributed according to the normal law with zero mathematical expectation and unit standard deviation. A random number with mathematical expectation m and standard deviation d is obtained by the formula m + randn*d (in the program m = 0, d = 0.2).

    noisyJ = alphabet(:,10)+randn(35,1) * 0.2;

    plotchar(noisyJ);

    disp("Noisy character. Press Enter");

    A2 = netn(noisyJ);

    A2 = compet(A2);

    ns = find(A2 == 1);

    disp("Symbol recognized");

    plotchar(alphabet(:,ns));

    The program displays the number of the recognized letter, the noisy pattern of the letter (Fig. 2.25) and the pattern of the recognized letter (Fig. 2.26).

    Fig. 2.25. Noisy letter template

    Fig. 2.26. Recognized letter pattern

    Thus, the considered programs demonstrate the principles of image recognition using neural networks. Training the network on various noisy data sets made it capable of working with images distorted by noise.

    Tasks

    1. Work through all the examples given.
    2. Experiment with recognizing different letters.
    3. Investigate the effect of noise on the accuracy of character recognition in the programs.

    Function approximation
