Overview of Handwriting recognition application
The purpose of this application is to design a Neural Network that can recognize a number from 1 to 9 in a handwritten grayscale image (28 by 28 pixels), as shown in Figure 4.7.
Figure 4.7: Overview of handwriting recognition application.
Although handwriting recognition based on the MNIST database is not well-suited for machine learning experiments, this application was selected to demonstrate that ANNHub can cope with a large dataset and that the overall accuracy can reach around 90%.
The MNIST database, which can be obtained from http://yann.lecun.com/exdb/mnist/, consists of 60,000 samples in a training set and 10,000 samples in a test set. It is a subset of a larger dataset available from the National Institute of Standards and Technology (NIST).
Figure 4.8: Handwriting recognition dataset.
The first step is to convert the MNIST data into a format that ANNHub can load. Each image in the MNIST database is a 28x28 grayscale image, so it can be represented as a 2D array (28x28) whose element values lie in the range [0, 255]. Because the input layer of the Neural Network accepts only a 1D array, the 2D array must be flattened into a 1D array of length 28x28 = 784. The output of the Neural Network is a number, from 1 to 9, that corresponds to the input image. Figure 4.8 shows the format of the MNIST dataset in a csv file.
The first 784 columns hold the image input, and the last column holds the output (target) value. Each row represents one sample in the MNIST database. The training and test datasets, both in csv format, can be found in the ANNHub installation folder (Examples>Classification Examples>MNIST).
Figure 4.9: MNIST dataset files, training and test sets, in csv format.
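The flattening step and the row layout described above can be sketched in Python as follows. This is only an illustration using the standard library, not ANNHub code: the helper names flatten_image and to_csv_row are hypothetical, the image is made up, and row-by-row flattening order is an assumption, since the text does not state the order used.

```python
import csv
import io

ROWS, COLS = 28, 28  # MNIST image size

def flatten_image(image):
    """Flatten a 28x28 list of pixel rows into one list of 784 values (row by row)."""
    if len(image) != ROWS or any(len(row) != COLS for row in image):
        raise ValueError("expected a 28x28 image")
    return [pixel for row in image for pixel in row]

def to_csv_row(image, label):
    """Build one dataset row: 784 pixel columns followed by the target column."""
    return flatten_image(image) + [label]

# A made-up image: all black except one white pixel at row 2, column 3.
image = [[0] * COLS for _ in range(ROWS)]
image[2][3] = 255

row = to_csv_row(image, label=7)

# Write the row to an in-memory CSV to show the file layout.
buffer = io.StringIO()
csv.writer(buffer).writerow(row)
csv_line = buffer.getvalue().strip()
```

With this layout, the pixel at row r and column c of the image ends up in column r*28 + c of the csv row, and the target value is always the 785th column.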
Load training dataset into ANNHub
Figure 4.10: Load MNIST training dataset
After the datasets are prepared, the training dataset is loaded into ANNHub in Step 1, as shown in Figure 4.10. In this step, only a fraction of the dataset is loaded, which gives ANNHub enough information about the dataset format to configure a recommended Neural Network structure.
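To illustrate why a fraction of the dataset is enough to learn its format, the sketch below reads only the first few rows of a csv file and reports the number of input columns and the target values seen so far. It is a hypothetical illustration, not how ANNHub works internally: the function name inspect_format and the sample_rows parameter are assumptions, and the file is assumed to have no header row.

```python
import csv
import io
from itertools import islice

def inspect_format(csv_file, sample_rows=100):
    """Read at most `sample_rows` rows and summarise the dataset layout."""
    reader = csv.reader(csv_file)
    n_columns = None
    targets = set()
    n_read = 0
    for row in islice(reader, sample_rows):
        if not row:
            continue  # skip blank lines
        if n_columns is None:
            n_columns = len(row)
        elif len(row) != n_columns:
            raise ValueError("rows have inconsistent column counts")
        targets.add(float(row[-1]))  # last column is the target value
        n_read += 1
    if n_columns is None:
        raise ValueError("no data rows found")
    return {
        "rows_read": n_read,
        "n_inputs": n_columns - 1,
        "targets_seen": sorted(targets),
    }

# Tiny stand-in dataset: 3 input columns plus 1 target column.
sample = io.StringIO("0,128,255,1\n12,0,90,2\n255,255,0,1\n7,7,7,3\n")
summary = inspect_format(sample, sample_rows=3)
```

For the MNIST csv files described above, the same check would report 784 inputs, which matches the number of input nodes in the recommended structure.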
Configure Neural Network
Figure 4.11: Configure the Neural Network structure.
Based on the training dataset, ANNHub configures the recommended Neural Network structure shown in Figure 4.11. Users can still tweak it to achieve a better result. In this example, the Scaled Conjugate Gradient training algorithm is used, with cross entropy as the cost function. The Neural Network has 784 input nodes, 20 hidden nodes, and 1 output node. The activation function for the hidden layer is Tansig, and Softmax is used as the activation function for the output layer. The min-max method is used for both pre-processing and post-processing. The training data ratio is 75%.
For more information, please refer to Configuring Neural Network structure.
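The building blocks named in this configuration can be written down in a few lines of Python. The sketch below shows the standard textbook definitions of min-max scaling, Tansig, Softmax, and cross entropy; it is not ANNHub's implementation. The target range of [-1, 1] for min-max scaling is an assumption, and the three-element score vector is made up purely to show how Softmax and cross entropy behave.

```python
import math

def min_max_scale(values, lo, hi, new_lo=-1.0, new_hi=1.0):
    """Linearly map values from [lo, hi] to [new_lo, new_hi]."""
    if hi == lo:
        raise ValueError("lo and hi must differ")
    return [new_lo + (v - lo) / (hi - lo) * (new_hi - new_lo) for v in values]

def tansig(x):
    """Hyperbolic tangent sigmoid: 2 / (1 + exp(-2x)) - 1, which equals tanh(x)."""
    return math.tanh(x)

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probabilities, target_index):
    """Cross-entropy cost for one sample whose true class is target_index."""
    return -math.log(probabilities[target_index])

# Pixel values in [0, 255] scaled to the assumed range [-1, 1].
pixels = min_max_scale([0, 127.5, 255], lo=0, hi=255)
hidden = [tansig(p) for p in pixels]

# Made-up scores for three classes, only to illustrate the two functions.
probs = softmax([2.0, 1.0, 0.1])
cost = cross_entropy(probs, target_index=0)
```

The cost shrinks toward 0 as the probability assigned to the true class approaches 1, which is what the training algorithm tries to achieve.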
Train Neural Network
Figure 4.12: Train the Neural Network to learn MNIST features.
As shown in Figure 4.12, the Scaled Conjugate Gradient algorithm is used, and the early stopping technique, which uses the validation set to determine the stopping point, is automatically configured and applied during training. The stopping criteria are 1 max fail, a training goal of 0.0001, a gradient goal of 0.001, and 300 epochs. The training process takes around 21 minutes to complete.
A better result could be achieved by tweaking the Neural Network structure, the training algorithm, and its parameters.
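The way these stopping criteria interact can be sketched as a simple check applied after every epoch. The code below is a hypothetical illustration, not ANNHub's training loop: the function name, the per-epoch statistics, and the order in which the criteria are checked are assumptions, and a "fail" is assumed to mean an epoch in which the validation error does not improve on the best value seen so far.

```python
def train_with_early_stopping(epoch_stats, max_fails=1, training_goal=1e-4,
                              gradient_goal=1e-3, max_epochs=300):
    """Return (stop_epoch, reason, best_epoch) for a stream of epoch statistics.

    epoch_stats yields (training_error, validation_error, gradient_norm) tuples.
    """
    best_val = float("inf")
    best_epoch = 0
    fails = 0
    epoch = 0
    for epoch, (train_err, val_err, grad) in enumerate(epoch_stats, start=1):
        if val_err < best_val:
            # Validation error improved: remember this epoch, reset the counter.
            best_val, best_epoch, fails = val_err, epoch, 0
        else:
            fails += 1
        if fails >= max_fails:
            return epoch, "max fails", best_epoch
        if train_err <= training_goal:
            return epoch, "training goal", best_epoch
        if grad <= gradient_goal:
            return epoch, "gradient goal", best_epoch
        if epoch >= max_epochs:
            return epoch, "max epochs", best_epoch
    return epoch, "data exhausted", best_epoch

# Made-up statistics: the validation error rises at epoch 3.
stats = [(0.9, 0.95, 0.5), (0.6, 0.70, 0.3), (0.4, 0.72, 0.2), (0.3, 0.60, 0.1)]
result = train_with_early_stopping(stats)
```

With a max fails value of 1, training stops the first time the validation error fails to improve, and the weights from the best epoch would be the ones to keep. A larger value tolerates temporary increases in validation error at the cost of longer training.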
Evaluate the trained Neural Network
Figure 4.13: Evaluate the trained Neural Network.
After the Neural Network has been trained, the confusion matrix and ROC curve shown in Figure 4.13 are used to evaluate its performance. The training, validation, and test sets are all used in the evaluation. As shown in Figure 4.13, some classes (class 1 corresponds to an output value of 1) have better accuracy than others, but the overall accuracy still reaches around 95%.
For more information, please refer to Training Neural Network.
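The relationship between the confusion matrix, the per-class accuracy, and the overall accuracy can be made concrete with a short Python sketch. The function names and the six sample labels below are made up for illustration and are not ANNHub output.

```python
def confusion_matrix(targets, predictions, classes):
    """Count samples by (true class, predicted class); rows are true classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(targets, predictions):
        matrix[index[t]][index[p]] += 1
    return matrix

def per_class_accuracy(matrix):
    """Fraction of each true class that was predicted correctly."""
    return [row[i] / sum(row) if sum(row) else 0.0
            for i, row in enumerate(matrix)]

def overall_accuracy(matrix):
    """Fraction of all samples on the diagonal of the matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0

# Made-up labels for three classes; one sample of class 2 is mistaken for 3.
classes = [1, 2, 3]
targets = [1, 1, 2, 2, 3, 3]
predictions = [1, 1, 2, 3, 3, 3]

matrix = confusion_matrix(targets, predictions, classes)
class_acc = per_class_accuracy(matrix)
total_acc = overall_accuracy(matrix)
```

In this toy example, class 2 has a lower accuracy than classes 1 and 3, while the overall accuracy remains 5 out of 6. The same effect explains why individual classes can differ even when the overall figure is high.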
Test trained Neural Network with new dataset
Figure 4.14: Test the trained Neural Network with new test dataset.
Before it is deployed in a real application, the trained Neural Network can be tested with a new dataset to confirm that it generalizes. The test dataset contains 10,000 samples that were not used during the design process described above. As can be seen in Figure 4.14, the trained Neural Network still recognizes the correct numbers in the test dataset with an accuracy of around 90%. This figure uses a very strict threshold of 0.3: if the predicted result falls in the range [1.31, 1.69], the prediction is counted as false whether the image is a 1 or a 2.
If the threshold is set to 0.49, then the overall accuracy will be 93.73%.
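The effect of the threshold on the reported accuracy can be illustrated with a small Python sketch. It reflects one reading of the rule described above, namely that an output counts as a class only if it lies within the threshold of the nearest whole number; the function names and sample values are made up, and ANNHub's exact rule may differ.

```python
def classify_with_threshold(output, threshold):
    """Return the nearest whole number if the output is close enough, else None."""
    nearest = round(output)
    if abs(output - nearest) <= threshold:
        return nearest
    return None  # too far from any class: counted as a false prediction

def thresholded_accuracy(outputs, targets, threshold):
    """Fraction of samples whose thresholded prediction equals the target."""
    if not targets:
        raise ValueError("targets must not be empty")
    correct = sum(1 for o, t in zip(outputs, targets)
                  if classify_with_threshold(o, threshold) == t)
    return correct / len(targets)

# Made-up network outputs and their true labels.
outputs = [1.1, 1.45, 2.8, 4.35, 9.0]
targets = [1, 1, 3, 4, 9]

strict = thresholded_accuracy(outputs, targets, threshold=0.3)
relaxed = thresholded_accuracy(outputs, targets, threshold=0.49)
```

Outputs such as 1.45 and 4.35 are rejected under the strict threshold but accepted under the relaxed one, so relaxing the threshold can only keep the accuracy the same or raise it.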
Deploy the trained Neural Network in Handwriting recognition application
Deployment in different programming environments is straightforward thanks to the APIs provided by ANS Center. In this example, the trained Neural Network is deployed in the LabVIEW environment using ANNAPI for LabVIEW. The trained Neural Network model is exported to a file with the ".ann" extension. The LabVIEW code that loads the trained Neural Network model and performs prediction is shown below.
Figure 4.16: LabVIEW block diagram to deploy the trained Neural Network in the handwriting application.
Figure 4.17: Standalone handwriting application that uses the trained Neural Network to classify handwriting images.