I am working on a project named "Handwritten Math Evaluation". The dataset has 12 classes, the digits 0 - 9 and the operators + and -, each containing 50 clean handwritten samples. I trained a CNN on it, using 80 % of the data for training and 20 % for testing, which gives an accuracy of 98.83 %. Here is the code for the architecture of the CNN model:

import pickle
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from keras.models import Sequential, model_from_json
from keras.layers import Conv2D, MaxPooling2D, Dense, Dropout, Flatten
from keras.utils import to_categorical

np.random.seed(1212)

# Two conv/pool stages followed by a small fully connected head.
model = Sequential()
model.add(Conv2D(30, (5, 5), input_shape=(28, 28, 1), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(15, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(50, activation='relu'))
# 12 outputs: digits 0-9 plus '+' and '-'
model.add(Dense(12, activation='softmax'))
# Compile model
model.compile(loss='categorical_crossentropy',
              optimizer='adam', metrics=['accuracy'])
# X_train / y_train come from the 80 % training split (y_train one-hot encoded)
model.fit(X_train, y_train, epochs=1000)

Each image in the dataset is preprocessed as follows:

import cv2
import numpy as np

im = cv2.imread(path)
im_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
# Invert while thresholding so the ink is white (255) on a black background
ret, im_th = cv2.threshold(im_gray, 90, 255, cv2.THRESH_BINARY_INV)
ctrs, hier = cv2.findContours(im_th.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
rects = [cv2.boundingRect(ctr) for ctr in ctrs]
# Each training image holds a single symbol, so take the first bounding box
x, y, w, h = rects[0]
im_crop = im_th[y:y+h, x:x+w]
# Note: resizing the tight crop to 28x28 discards its aspect ratio
im_resize = cv2.resize(im_crop, (28, 28))
im_resize = im_resize.reshape(28, 28)

I have written an evaluation function that handles simple expressions such as 7+8:

def evaluate(im):
    look_up = ['0','1','2','3','4','5','6','7','8','9','+','-']
    im_gray = cv2.cvtColor(im, cv2.COLOR_BGR2GRAY)
    ret, im_th = cv2.threshold(im_gray, 90, 255, cv2.THRESH_BINARY_INV)
    ctrs, hier = cv2.findContours(im_th.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # Sort the bounding boxes left to right so symbols are read in writing order
    # (the original sorted the contours but then iterated over the unsorted list).
    boundingBoxes = sorted((cv2.boundingRect(c) for c in ctrs), key=lambda r: r[0])
    data = []
    for x, y, w, h in boundingBoxes:
        im_crop = im_th[y:y+h, x:x+w]
        im_resize = cv2.resize(im_crop, (28, 28))
        data.append(im_resize.reshape(28, 28, 1))
    predictions = model.predict(np.array(data))
    s = ''
    for rect, pred in zip(boundingBoxes, predictions):
        print(rect[2], rect[3])  # width and height of the symbol
        print(pred)              # class probabilities
        s += look_up[pred.argmax()]
    return s

I need help extending this to compound fractions. The problem is that the vinculum / looks identical to the subtraction sign - once it is resized to (28, 28), so I need a way to distinguish between them.

This is my first question, so please let me know if any details are missing.

Snehal Patel

1 Answer

To better distinguish the vinculum from the minus sign, you have a few options.

  1. First, you may add more examples of each to your training set.

  2. Alternatively, you may develop a second classifier (a vinculum vs. minus subclassifier) that operates in tandem with your first "general purpose" classifier ([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, +, -, /]), which you already have. The subclassifier is triggered only when the first classifier predicts either a vinculum or a minus. Since the subclassifier is expected to be more accurate than the first classifier on the subtask of distinguishing vinculum from minus, you use its prediction as the final verdict. In summary, the first classifier selects the symbols that are either a vinculum or a minus, and the second (sub)classifier decides which of the two each one is.
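As a rough sketch of option 2, the two-stage decision could be wired up as below. The names `cascade_predict`, `LOOK_UP`, `AMBIGUOUS` and `sub_classifier` are made up for illustration, and the sketch assumes the general classifier has been retrained with a 13th class for "/"; it is not taken from your code.

```python
import numpy as np

# Hypothetical label list for a 13-class general classifier; "/" is the vinculum.
LOOK_UP = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '+', '-', '/']
AMBIGUOUS = {'-', '/'}

def cascade_predict(general_probs, crops, sub_classifier):
    """Return one label per symbol, deferring '-' and '/' to sub_classifier.

    general_probs : (n, 13) array of softmax outputs from the first model
    crops         : sequence of n symbol crops, in the same order
    sub_classifier: callable that takes one crop and returns '-' or '/'
    """
    labels = []
    for probs, crop in zip(general_probs, crops):
        label = LOOK_UP[int(np.argmax(probs))]
        # Only the ambiguous pair is handed to the specialised model.
        if label in AMBIGUOUS:
            label = sub_classifier(crop)
        labels.append(label)
    return labels
```

Here `general_probs` would be the output of `model.predict(data)`, and `sub_classifier` can be anything from a small CNN to a hand-written rule.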

Note: For the vinculum vs. minus subtask, you may perform some old-fashioned image analysis. First, derive a mathematical representation of the bar. Next, extract the length of the bar and the angle the bar makes with respect to the horizontal. Finally, these two tabular features can be fed to any of the many machine learning algorithms at your disposal. For this subtask, it is possible that this approach will yield better results than deep learning.
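One possible way to get the two features is to fit the principal axis of the ink pixels with NumPy. This is only a sketch: `bar_features` is a hypothetical helper, and it assumes a binarised crop with non-zero ink pixels, taken before the resize to (28, 28), since the resize discards the bar's true length and aspect ratio.

```python
import numpy as np

def bar_features(binary_crop):
    """Return (length_px, angle_deg) of the stroke in a binary image.

    length_px is the extent of the ink along its principal axis;
    angle_deg is the unsigned angle of that axis to the horizontal, in [0, 90].
    """
    ys, xs = np.nonzero(binary_crop)
    if xs.size < 2:
        return 0.0, 0.0
    pts = np.column_stack([xs, ys]).astype(float)
    centered = pts - pts.mean(axis=0)
    # Principal axis = eigenvector of the covariance with the largest eigenvalue
    # (np.linalg.eigh returns eigenvalues in ascending order).
    eigvals, eigvecs = np.linalg.eigh(np.cov(centered, rowvar=False))
    axis = eigvecs[:, -1]
    proj = centered @ axis
    length = float(proj.max() - proj.min())
    # The eigenvector's sign is arbitrary, so fold the angle into [0, 90].
    angle = abs(np.degrees(np.arctan2(axis[1], axis[0]))) % 180.0
    if angle > 90.0:
        angle = 180.0 - angle
    return length, float(angle)
```

The resulting pairs can be stacked into an (n, 2) array and passed to a simple classifier. It may also help to express the length relative to the widths of the neighbouring symbols, though that is a guess rather than something tested here.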
