- A closed-domain QA system offers a natural way to request information we do not know and to check information we are unsure about.
- Developing a closed-domain QA system involves both Information Retrieval and Natural Language Processing.
- It finds the required information directly instead of making the user search for it. In the proposed closed-domain QA system, the user can upload the document from which information should be extracted and get an answer without spending time searching through a large number of documents or data.
- The uploaded document must be verified by an expert, because an irrelevant document may produce invalid answers and reduce the accuracy of the system.
- The user can also query an existing data source that has already been verified by an expert.
- This is useful for FAQ-style questions, where many questions and answers are already available but still have to be searched.
- The user enters the query or question in natural language, which most search engines do not handle.
- A question analyzer then analyzes the question and identifies its type among the available categories, so that the system can judge what kind of answer is possible for that kind of question.
- The question type and its expected answer type are generally identified by looking at the question keywords. Consider the following table.
| Question Type | Expected Answer Type |
|---|---|
| Who | Person |
| When | Date/Time |
| Where | Location |
| What | Object |
| How | Measure |
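
The mapping in the table can be sketched as a simple keyword lookup. This is only a minimal illustration: the function name `expected_answer_type` and the `"Unknown"` fallback are assumptions, not part of the described system, and a real question analyzer would likely need more than the first matching keyword.

```python
# Question keyword -> expected answer type, copied from the table above.
ANSWER_TYPES = {
    "who": "Person",
    "when": "Date/Time",
    "where": "Location",
    "what": "Object",
    "how": "Measure",
}


def expected_answer_type(question: str) -> str:
    """Return the answer type of the first question keyword found in the question."""
    for token in question.lower().split():
        word = token.strip("?.,!:;\"'")  # drop surrounding punctuation
        if word in ANSWER_TYPES:
            return ANSWER_TYPES[word]
    return "Unknown"  # assumption: fallback label, not defined in the source


if __name__ == "__main__":
    print(expected_answer_type("Who wrote the report?"))
```

Scanning for the first matching keyword keeps the sketch short; questions without any of the five keywords fall through to the hypothetical `"Unknown"` label.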
- The system then identifies the keywords in the question and, based on those keywords, first finds the paragraphs that contain them.
- Once the system has a set of candidate paragraphs from the available documents, it needs to identify the paragraphs that are relevant to the question.
- There are various answer extraction methods, such as heuristic, pattern-based, relation-based and logic-based. One of these methods is used to retrieve the answer from the paragraph.
- Answer extraction takes as input the expected answer type and the set of paragraphs retrieved from the available data source.
- In this process, the similarity between the keywords of the question and the keywords found in each passage is computed in order to obtain the best passages as a ranked list.
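
The keyword-based retrieval and ranking steps above can be sketched as follows. The source does not name a similarity measure, so this sketch assumes a simple one: the fraction of question keywords that also occur in the paragraph. The stop-word list and the names `keywords` and `rank_paragraphs` are likewise illustrative assumptions.

```python
import re

# Assumption: a tiny illustrative stop-word list; a real system would use a fuller one.
STOP_WORDS = {
    "a", "an", "the", "is", "are", "was", "were", "of", "in", "on", "to",
    "and", "who", "when", "where", "what", "how", "did", "does", "do",
}


def keywords(text: str) -> set[str]:
    """Lowercase the text, split it into word tokens, and drop stop words."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOP_WORDS}


def rank_paragraphs(question: str, paragraphs: list[str]) -> list[tuple[float, str]]:
    """Keep paragraphs sharing at least one keyword with the question, best first."""
    q = keywords(question)
    scored = []
    for paragraph in paragraphs:
        overlap = q & keywords(paragraph)
        if overlap:  # candidate paragraph: contains at least one question keyword
            # Assumed similarity: share of question keywords found in the paragraph.
            scored.append((len(overlap) / len(q), paragraph))
    return sorted(scored, key=lambda item: item[0], reverse=True)


if __name__ == "__main__":
    docs = [
        "The Eiffel Tower was completed in 1889 in Paris.",
        "Paris is the capital of France.",
        "Bananas are rich in potassium.",
    ]
    for score, paragraph in rank_paragraphs("Where is the Eiffel Tower in Paris?", docs):
        print(f"{score:.2f}  {paragraph}")
```

The top-ranked paragraphs, together with the expected answer type, would then be passed to whichever answer extraction method is chosen; that step is not shown here.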