7

I am currently working on my last project before graduating. For this project, I have to develop a Natural Language Question Answering System. Now, I have read quite some research papers regarding this topic and have figured out everything except for the parsing algorithm.

The NL Q-A will be programmed in Python, and I will use the spaCy library to finish this project. However, I am stuck when it comes to parsing algorithms. I managed to reduce the parsing algorithms to 3:

  • Cocke-Kasami-Younger (CKY) algorithm
  • Earley algorithm
  • Chart Parsing algorithm

Note: I know that all three algorithms are chart parsing algorithms. I also know that the Earley algorithm is context-free, but has a low efficiency for a compiler.

What I don't know is: Which one should I pick? (Please, provide a non-subjective answer to this question!)

The system is for a specific domain. The answer to the natural question will be displayed in the form of the result of a calculation of some kind. Preferably in the tabular or graphical form.

I have done my research. However, I probably do not understand the algorithms properly, which makes it difficult to make a selection.The algorithm should be efficient and perhaps outperform others.

nbro
  • 42,615
  • 12
  • 119
  • 217
lilienfa
  • 319
  • 1
  • 9

1 Answers1

1

I have been reading and reading, and found answers to almost all my questions.

I am sticking to Earley algorithm, given that it offers a dynamic programming approach (CKY does the same). Both algorithms are chart parsing algorithms.

Earley is a context-free, top-down parsing algorithm, which makes it a goal-driven algorithm. From start symbol down. Furthermore, it is more efficient than the CKY algorithm.

Here are the slides that provide more info.

nbro
  • 42,615
  • 12
  • 119
  • 217
lilienfa
  • 319
  • 1
  • 9