StuA: An Intelligent Student Assistant



I. Introduction
Artificial Intelligence is the science of making computers perceive their environment in a way similar to a human and take actions accordingly [1]. Artificial Intelligence is not a single cognitive process; rather, it comprises an array of separate components of intelligence such as learning, problem-solving, perception, and language understanding. Using Artificial Intelligence, researchers are creating various systems that can mimic the decision-making abilities of humans based on past experiences and understand human thoughts. Such systems consist of a knowledge-base, an inference engine, and a user interface. The knowledge-base holds facts, i.e., past experiences; the inference engine holds a set of rules that can draw inferences about a given situation; and the user interface is used to interact with the user. The knowledge-base represents the knowledge stored by a model and is used to derive new rules and check for inconsistencies. Many virtual and intelligent software systems are implemented in the CLIPS (C Language Integrated Production System, first released by NASA in 1988) [2][3] programming environment. CLIPS is a programming tool designed to ease the development of software that models human knowledge. CLIPS is used because it is flexible, expandable, and low cost. It is the only expert system shell for which it is claimed that the shell has been certified correct [27].
Virtual assistants are software that can provide required information or perform work for humans. There are mainly two types of virtual assistants: rule-based and stochastic. When data is lacking, expert knowledge is used to design rule-based assistants, also known as expert systems. When a large body of tested, valid information is available, stochastic virtual assistants can be designed using various machine learning techniques. A number of rule-based assistants are working in different domains such as counseling [4][5][6][7], prediction [8][9][10], diagnosis [11][12][13][14][15][16][17][18][19], design [20][21][30], e-learning [28][29] and recommendations [31]. Stochastic assistants include chatbots, question answering systems and assistants like Apple's Siri, Google's Google Assistant, Amazon's Alexa, Microsoft's Cortana and Facebook's M.
All these virtual assistants are designed to take over tasks otherwise done by a human being and to assist the people around them. These systems perform well in their domains. In a world of technology, many people are more comfortable communicating with a computer or other electronic device than with a human being. Sometimes people hesitate to approach a human being with their queries because they either want to keep them private or feel shy. Sometimes they want help but cannot reach the right person to ask. This motivated us to design an interactive and intelligent student assistant situated in a closed environment. It helps students who are new to the college environment. When a newcomer or fresher enters a college, he/she may not be familiar with the environment and seeks guidance from an experienced person such as a senior or a teacher. At the same time, some newcomers fear being ragged or hesitate to ask, and hence remain unaware of the college environment. In such a scenario, the proposed student assistant, StuA, helps newcomers by providing the required information in a safe and sound manner. It familiarizes them with the rules and regulations of the college in a comfortable manner. StuA allows the user to ask questions in the WHAT IS, WHERE, WHEN, and WHAT HAPPEN formats. Moreover, it provides real-time, low-cost, expert-level assistance 24X7, unlike a human. Based on its previously gained knowledge and beliefs, the system is able to answer most queries. The set of questions is not limited to prefixed questions. To the best of our knowledge, no such virtual intelligent assistant exists to date. Moreover, it can be further customized to handle the queries of a new person in any environment, such as a new office, an organization, or the reception of a hotel, hospital, school or mall.
and general information in this domain. The implementation has been done in CLIPS. The main limitation of CLIPS is that it supports only forward chaining and does not support backward chaining. As a result, many queries cannot be processed. To date, no generic extension of CLIPS with backward chaining exists. A few researchers [22][23] have tried to add this functionality, but only for specific domains. Here, we therefore also propose a generic model of backward chaining that extends the functionality of CLIPS and provides a better inference mechanism. The whole model is designed using Java for the user interface. Integrating Java with CLIPS provides the flexibility to redefine the output; with the help of Java, only relevant facts are asserted, which reduces the overhead of searching through irrelevant facts. It also aids the user by providing a friendly user interface. As a result, only the relevant rules are fired, and the filtered and processed output is displayed.
The paper is organized as follows. Section II discusses the related work, followed by a detailed explanation of the proposed model in Section III. Section IV presents the implementation details of the proposed model, and Section V discusses the test results and validation of the proposed model. Section VI concludes the paper with limitations and provides directions for future work.

II. Related Works
To date, several systems have been designed in various domains [4][5][6][7][8][9][10][11][12][13][14][15][16][17][18][19][20][21]. One expert system in the education domain, the Student Counseling System (SCS), was developed to help students who are planning to take admission in engineering select the best branch. It uses certainty factors as the basis for its judgment. Its domain is restricted to engineering branches. The expert system answers queries only in the form of 'yes' or 'no', using forward chaining [4]. It is a rule-based system that uses CLIPS to answer the queries.
Another domain is the diagnosis of eye diseases. One such system provides expert guidance on eye diseases. Its disadvantage is that it does not take symptoms as input from the user. Rather, the expert system presents a menu from which the user has to select and, based on the disease, the user has to answer queries in yes/no form [11]. Moreover, it is able to answer only the queries that can be derived either from the facts or through forward chaining. Various expert systems are being designed in the field of medical sciences. One of them is a plant disease diagnosis system. It uses two methods to diagnose plant disease: a Step by Step description and a Graphical Representation System. The limitation of the step by step description is that the user is not allowed to enter his own query; rather, he has to select from a list of options. Hence the search query is limited. The Graphical Representation System provides more meaningful results when searched with the disease as a keyword. Here the user enters only keywords; the limitation is that he cannot enter a full description of the disease [13]. The proposed system was a rule-based system that uses forward reasoning and pattern matching to answer the queries. Nevertheless, the authors concluded that expert systems are a successful method that helps and supports users in making the right decisions in scenarios where they lack knowledge.
Another expert system deals with the diagnosis of neuromuscular disorders [15]. It is implemented using JESS. The user is presented with a questionnaire about symptoms, and possible treatments are suggested according to the disorder. The authors designed different rules for different disorders on the basis of knowledge acquired from experts. The limitation of this work is that it can detect only Cerebral Palsy, Multiple Sclerosis, Muscular Dystrophy and Parkinson's disease; other diseases, even minor ones, cannot be detected using this system. Another research work deals with the diagnosis of rice plant disease. This expert system is also developed using JESS [16]. It uses SQL to store the data and to extract the stored information.
Similarly, various research works are available on systems that assist people, known as virtual assistants. Another example in the education domain is an expert system that aims to assist students in their major selection process. It is a prototype of an advisory expert system that can help new and incoming students select suitable majors to apply to and the most convenient institutions to attend. In this work, the user has to select from different courses and subjects, and the final output is shown after confirmation. The authors designed a rule-based system with three broad categories of rules to assist the students. The drawback of their work is that it was not tested on real-world cases; fabricated test cases were used to simulate the results [5]. Another system, PAS (Postgraduate Advisory System), enables students to select and obtain a plan for each semester without the need to consult advisors [6]. It differs from other systems in that its responses take the students' thesis fields into account. The system is limited to one department, as its rules are defined for that department only. Another expert system works in the car failure domain. Here, the author has categorized problems broadly into three categories. The drawback of the system is that the user has to choose from a limited number of options and cannot input his own query [14]. Moreover, there are virtual assistants such as Siri, Cortana, and Google Assistant, which use machine learning techniques to learn how to perform tasks. Cortana mostly answers general questions, for which it pulls information from Bing, whereas Siri assists with all kinds of tasks on a device, such as calling people, sending text messages, and setting reminders.
In conclusion, virtual assistants can be designed in two ways: rule-based and machine learning based. For domains where data is not available in the form of a corpus, it is difficult to apply machine learning techniques for decision making. In such cases, expert knowledge is required to design the systems (also known as expert systems). Most of the expert systems use CLIPS and hence answer queries using forward chaining only. They provide a fixed set of queries from which the user must select one to get an answer, which restricts the functionality of the system. Table I presents the summary of existing rule-based systems.

III. Proposed Model
Recognizing the need for a virtual assistant for newcomers in a college, we propose an interactive and intelligent student assistant, StuA, which helps new students familiarize themselves with the new environment and its rules and regulations. It is a rule-based assistant because documented information is lacking. Furthermore, it tries to answer the students' queries using an inference mechanism without fixing the set of queries, and hence provides a broader and unrestricted platform for resolving doubts. The user is allowed to ask anything related to the domain and need not pick a question from an already framed list. The domain knowledge is stored in a knowledge-base built by knowledge engineers. The proposed assistant does not respond to questions using forward chaining alone; rather, the CLIPS tool is extended with a backward chaining algorithm to strengthen its inference mechanism. The extended version of CLIPS with backward chaining is capable of handling most queries, such as the following:

Inference rule: Student break rule => student has to pay fine
Query: when student has to pay fine?

The overall architecture of the proposed model is presented in Fig. 1. In the model, the knowledge engineer designs the knowledge-base in collaboration with the domain expert. The knowledge is stored in the form of facts and inference rules. The inference engine works with forward chaining as well as backward chaining. A user-friendly interface is provided through which the user poses queries. To simplify the process, some drop-down lists are provided. The user submits a query through the user interface, and the query is then handled by the inference engine. At this stage, the type of the query is analyzed first. Depending on the type of the query, inference is carried out either directly through facts, through forward chaining, or through backward chaining using the inference rules. The result is then displayed to the user.
Unlike in plain CLIPS, only the relevant rules are fired, and the filtered and processed output is displayed. The filtering and processing of the output are carried out by integrating CLIPS with Java.

A. Domain Knowledge-Base
Acquiring domain-specific knowledge is considered a bottleneck in building knowledge-based systems. The model is provided with the minimum possible knowledge. It often encounters missing data when running new test examples, and it uses the inference mechanism to answer them. Here, information is handcrafted, which is the simplest way to put knowledge into a program. The focus is mostly on gathering knowledge from human experts, the college website and feedback from students.
One of the key points while constructing a virtual assistant is transparency, i.e., making the system understandable despite the complexity of the task. This is because:
• The system improves through consecutive development, which requires a thorough understanding of earlier versions.
• The system improves through criticism from people who are not familiar with its implementation details.
• The system uses its own learning methods for solving the problems.
After knowledge is acquired from human experts, it is mapped into a form that can be used in the program. The knowledge is basically in the form of sentences, each of which is broken into subject, predicate and object so that it can be encoded in the knowledge-base.
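The subject, predicate and object decomposition described above can be illustrated with a minimal Python sketch. This is not the authors' CLIPS/Java encoding: the `Triple` type, the `KNOWLEDGE_BASE` list and the `lookup` helper are hypothetical names introduced only for illustration, and the two entries mirror sentences quoted elsewhere in the paper.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    """One hand-crafted knowledge item (hypothetical representation)."""
    subject: str
    predicate: str
    obj: str


# Assumed example entries, taken from sentences quoted in the paper.
KNOWLEDGE_BASE = [
    Triple("LRC", "is", "learning resource center"),
    Triple("minimum CGPA", "is", "4"),
]


def lookup(subject: str, predicate: str) -> List[str]:
    """Return every object stored for the given subject and predicate."""
    return [
        t.obj
        for t in KNOWLEDGE_BASE
        if t.subject.lower() == subject.lower() and t.predicate == predicate
    ]
```

Under these assumptions, `lookup("minimum CGPA", "is")` retrieves the stored object directly, which is the kind of atomic retrieval a fact-based query needs.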

B. Inference Mechanism
A survey was conducted to observe the types of queries a newcomer could ask, and four major classes of questions were identified. The first class is "TRUE/FALSE", which is used to tell whether an asked fact is true or not. For example, a statement such as "Student can return book on Monday" is either true or false. The second class is "WHAT-IS or WHERE". Questions in this class relate to atomic knowledge about the environment. For example, "LRC is learning resource center" and "The minimum CGPA is 4" are atomic pieces of information that cannot be broken down further. So the query can be "what-is minimum CGPA?", and the answer can be extracted directly from the available information. The third class is "WHAT-HAPPEN". This class identifies the consequences of a situation or the outcome of an event. Suppose we have a rule like "if student fails supplementary examination then they get back in that year"; the query could then be "what happens if student fails supplementary examination?" This also led to the identification of the fourth class, "WHEN", which is meant to find the cause of an event. For example, given the rule "if attendance is less than 60%, students get debarred", the user can ask "when student get debarred?" All queries can be broadly classified into these four classes. Some queries can be resolved with one level of inference, while others require deeper reasoning. Both are handled in the model.
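As a rough illustration of the four query classes, the sketch below assigns a class from the leading words of a query. The paper does not describe how StuA detects the class (the interface also offers drop-downs), so the prefix matching, the `QueryClass` enum and the `classify` function are assumptions made for this example only.

```python
from enum import Enum


class QueryClass(Enum):
    TRUE_FALSE = "TRUE/FALSE"
    WHAT_IS_OR_WHERE = "WHAT-IS or WHERE"
    WHAT_HAPPEN = "WHAT-HAPPEN"
    WHEN = "WHEN"


def classify(query: str) -> QueryClass:
    """Pick a query class from the leading words (assumed heuristic)."""
    # Normalize case and treat "what-is" the same as "what is".
    text = query.strip().lower().replace("-", " ")
    if text.startswith("what happen"):
        return QueryClass.WHAT_HAPPEN
    if text.startswith(("what is", "where")):
        return QueryClass.WHAT_IS_OR_WHERE
    if text.startswith("when"):
        return QueryClass.WHEN
    # Anything else is treated as a statement to be checked for truth.
    return QueryClass.TRUE_FALSE
```

A plain statement such as "Student can return book on Monday" falls through to the TRUE/FALSE class, matching the paper's first example.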
The proposed model has a rule- and fact-based inference system. In this model, the TRUE/FALSE class is handled directly by CLIPS. The information is stored as facts in the Knowledge Base (KB), and the KB is searched to determine whether the queried fact is true. The information handled by the "WHAT-IS or WHERE" class is atomic knowledge that can be retrieved directly, so it is stored as facts in the knowledge-base of our model. The "WHAT-HAPPEN" and "WHEN" classes have their information stored in the form of rules. In our model, rules are structured as antecedents with their consequents. The inference engine examines the class to which the query belongs and then responds accordingly, either by executing the corresponding consequents or by searching for the correct antecedent. The forward chaining approach is used for queries of the WHAT-HAPPEN class, since here we want to know the consequent of a situation. Forward chaining is an inference method that uses the available data and inference rules to derive more data until it reaches the goal. This method is also called data-driven, as the data determine which rules are selected and used. For example, suppose the following information is present in the KB:

Initial fact:
Grade is F

Rules:
If (grade is F) then (student fails subject).

If (student fails subject) then (has to take supplementary exam)
Suppose we want to know "what-happen-if grade is F". To answer this, we search for the consequent with the help of forward chaining. Similarly, the backward chaining approach (explained in Section III.C) is used to answer queries of the WHEN class. Here, the goal is to decide whether the fact in the query "when student has to take supplementary exam" can be inferred from the initial facts. With backward reasoning, the system tries to prove the goal (i.e., student has to take the supplementary exam). To do so, it has to show that the antecedent (i.e., student fails subject) can be proved. This becomes the new sub-goal, and so on. The process continues until an antecedent that is an initial fact is found.
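The forward chaining example above can be traced with a short Python sketch. It is a generic data-driven loop over single-antecedent rules, not the CLIPS engine used by the authors; the `RULES` list and the `forward_chain` function are illustrative names.

```python
# (antecedent, consequent) pairs taken from the example rules above.
RULES = [
    ("grade is F", "student fails subject"),
    ("student fails subject", "has to take supplementary exam"),
]


def forward_chain(initial_facts, rules):
    """Derive new facts, in order, until no rule adds anything new."""
    known = set(initial_facts)
    derived = []
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            # Fire a rule only if its antecedent holds and it adds a new fact.
            if antecedent in known and consequent not in known:
                known.add(consequent)
                derived.append(consequent)
                changed = True
    return derived
```

Starting from the initial fact "grade is F", the loop first derives "student fails subject" and then "has to take supplementary exam", which together answer the WHAT-HAPPEN query.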
To accommodate all these types of queries, an inference engine is proposed, made up of a query processor and different types of processing models, as shown in Fig. 2. This inference engine does all the inferencing and selects the correct processing model for a given query. There are four models for query processing: the True False (TF) processor, the Fact processor, the Conclusion processor and the Backward Chaining (BC) processor. The TF processor answers queries of the TRUE/FALSE class. The Fact processor extracts a fact from the belief base to provide an answer; it handles the WHAT-IS and WHERE class of queries. The Conclusion processor infers a fact from the knowledge-base on the basis of the query provided to the inference engine; it uses forward chaining and thus handles the WHAT-HAPPEN class of queries. The BC processor uses the backward chaining approach to infer the facts for answering the WHEN class of queries.
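The mapping from query class to processor described above amounts to a dispatch table. The sketch below shows one possible shape for it in Python; the processor functions are placeholders that only describe what each processor would do, and all names (`PROCESSORS`, `dispatch`, the string keys) are assumptions rather than the authors' Java code.

```python
from typing import Callable, Dict


def tf_processor(query: str) -> str:
    return f"TF processor: check whether '{query}' is a stored fact"


def fact_processor(query: str) -> str:
    return f"Fact processor: extract the stored fact matching '{query}'"


def conclusion_processor(query: str) -> str:
    return f"Conclusion processor: forward chain from '{query}'"


def bc_processor(query: str) -> str:
    return f"BC processor: backward chain from goal '{query}'"


# One entry per query class named in the paper (keys are assumed labels).
PROCESSORS: Dict[str, Callable[[str], str]] = {
    "TRUE/FALSE": tf_processor,
    "WHAT-IS": fact_processor,
    "WHERE": fact_processor,
    "WHAT-HAPPEN": conclusion_processor,
    "WHEN": bc_processor,
}


def dispatch(query_class: str, query: str) -> str:
    """Route a query to the processor registered for its class."""
    try:
        processor = PROCESSORS[query_class]
    except KeyError:
        raise ValueError(f"unknown query class: {query_class}") from None
    return processor(query)
```

Keeping the mapping in a table means a new query class could be supported by registering one more processor, without changing the routing logic.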
The model is implemented with the help of CLIPS and Java. CLIPS is integrated with Java through CLIPSJNI [18]. This integration helps overcome some limitations of CLIPS. First, it suppresses the triggering of all rules. Second, it suppresses the assertion of irrelevant facts. As a result, the response time is improved and only the relevant information is shown to the user. Moreover, CLIPS does not have a backward chaining feature. So, using Java, CLIPS is extended with a generic form of backward chaining that can provide a first-level explanation of the queries.

C. Extended CLIPS Tool
Backward chaining is an inference method that starts from a given conclusion and works backward to find the base facts that support it. CLIPS is a public domain software tool that is widely used for building intelligent systems [2][3]. It combines the programming paradigms of procedural, object-oriented and logical languages, but it does not support backward chaining [24][25]. Backward chaining is important because it is better suited to goal-driven queries and can be faster for them, since only the rules relevant to the goal are explored.
Various queries require backward chaining. For example, if someone asks "when they can get debarred" and the knowledge-base contains the rule "if attendance less than 60% then get debarred", then, given the consequent, the antecedents must be determined. Queries of this type are handled with the help of backward chaining. It is observed that mostly "WHEN" queries can be answered using backward chaining, and they are processed by the BC processor module of the proposed model. In this algorithm, a set of the basic facts that exist in the knowledge-base is first created in the bc_factlist variable. A list named connection_list maintains the relationships between base and derived facts. The settingValue() function uses recursive calls to infer the set of facts for a given query: a given fact is first searched for in the base facts list, and for a derived fact a recursive call is made. Through these recursive calls, the facts from which other facts are derived are searched for in the base facts list.
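The recursive search described above can be sketched in Python as follows. The names bc_factlist and connection_list come from the paper, but their contents, the pair-based layout of connection_list, and the functions `supporting_base_facts` and `first_level_explanation` are assumptions for illustration; the authors' settingValue() function in Java is not reproduced here.

```python
# Base facts assumed to exist directly in the knowledge-base.
bc_factlist = {"grade is F", "attendance less than 60%"}

# Assumed layout: (antecedent, consequent) links between base and derived facts.
connection_list = [
    ("grade is F", "student fails subject"),
    ("student fails subject", "has to take supplementary exam"),
    ("attendance less than 60%", "get debarred"),
]


def supporting_base_facts(goal, visited=None):
    """Return the base facts from which `goal` can be derived."""
    visited = set() if visited is None else visited
    if goal in visited:  # guard against cyclic rules
        return []
    visited.add(goal)
    if goal in bc_factlist:  # the goal is itself a base fact
        return [goal]
    found = []
    for antecedent, consequent in connection_list:
        if consequent == goal:
            # The antecedent becomes the new sub-goal.
            for fact in supporting_base_facts(antecedent, visited):
                if fact not in found:
                    found.append(fact)
    return found


def first_level_explanation(goal):
    """Return the immediate antecedents of `goal` (one level back)."""
    return [a for a, c in connection_list if c == goal]
```

For the goal "has to take supplementary exam", the recursion passes through the sub-goal "student fails subject" before reaching the base fact "grade is F", while the first-level explanation reports only the immediate antecedent.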

IV. Implementation
The proposed model, StuA, is implemented using Java and CLIPSJNI. A knowledge-base is created with various facts and rules related to a specific domain, i.e., the college environment. The query is provided as input to the inference engine, which determines the query class and sends the query to the appropriate processor. The processor answers the query by forming a simple sentence using a first level of natural language generation. The user interface is shown in Fig. 3(a)-3(d). Fig. 3(a) shows the simulation of the TRUE/FALSE class, which uses the TF processor. The TF processor searches for the queried fact in the knowledge-base and answers according to whether it exists. In Fig. 3(b), a query of the WHAT-IS or WHERE class is processed by the Fact processor. It searches for facts matching the given information, extracts the rest of the information from them, and uses the extracted information to answer. The WHAT-HAPPEN class of query is handled by the Conclusion processor, as shown in Fig. 3(c). It processes the query by finding all the relevant facts that can be inferred, using forward chaining. It finally sorts the inferred facts and answers the query. Fig. 3(d) shows the simulations related to the WHEN class. This class of queries is processed by the BC processor, in which the proposed backward chaining method is implemented. This method searches the fact-base for the initial facts that can be reached from the goal statement and stores the first-level information related to the particular goal. If an initial fact is successfully found, this first-level information is processed to form a sentence. Table II lists some questions posed to the virtual assistant, StuA.

V. Verification and Validation
The testing process of any automated tool includes verification and validation. Verification checks the completeness of the knowledge-base. Validation checks the correctness of the knowledge-base in terms of consistency. As suggested by Wentworth et al. [27] and Ghasem & Alizadeh [26], the logical completeness and logical consistency of the knowledge-base are checked first. Then, the knowledge model is validated, followed by validation of the semantic consistency of the knowledge items. Finally, the backward chaining algorithm integrated with CLIPS is validated.

A. Logical Completeness
Logical completeness means that the expert system produces some conclusion for all inputs. It can be checked by following the steps below [27]:
1. Construct a logical formula that represents the conditions under which the system is complete; this formula is called the completeness formula. Write this formula in conjunctive normal form.
2. Eliminate ORs containing logical opposites or all possible values of a variable.
3. If the resulting logical expression is TRUE, the system is complete.
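Steps 2 and 3 above can be sketched in Python for the simple case of Boolean literals. This is an illustrative sketch, not the check performed by the authors: clauses are sets of string literals with "~" marking negation, the variable name `has_id_card` is invented for the example, and the "all possible values of a variable" case for multi-valued variables is not handled.

```python
def negate(literal: str) -> str:
    """Return the logical opposite of a literal written as 'x' or '~x'."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def is_tautological_clause(clause) -> bool:
    """An OR clause is always TRUE if it contains a literal and its opposite."""
    return any(negate(lit) in clause for lit in clause)


def is_complete(cnf) -> bool:
    """Apply step 2 (drop always-true clauses) and step 3 (is the result TRUE?)."""
    remaining = [clause for clause in cnf if not is_tautological_clause(clause)]
    # An empty conjunction is TRUE, so the system is complete when nothing remains.
    return not remaining


# Hypothetical completeness formula: the rules cover has_id_card and ~has_id_card.
covered = [{"has_id_card", "~has_id_card"}]
# Hypothetical incomplete system: only the has_id_card case is covered.
uncovered = [{"has_id_card"}]
```

With these inputs, `is_complete(covered)` holds because the only clause is eliminated, while `is_complete(uncovered)` does not, since a clause without opposites can be falsified.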
We checked the completeness of all the subsystems as specified above and found them COMPLETE. For illustration, Study_Material_ Accees subsystem completeness check is shown below:

B. Logical Consistency
Logical consistency means that, for all inputs, the knowledge base produces a consistent set of conclusions, i.e., for each set of possible inputs, all the conclusions can be true at the same time. To establish consistency, the user must do the following [27]:
1. Construct a logical formula that represents conditions under which consistency fails; this logical formula will be called the consistency formula. Write this formula in disjunctive normal form.
2. Eliminate ANDs containing logical opposites or other contradictory sets of conjuncts.
3. If the left-hand side of the resulting logical expression is FALSE, the system is consistent.
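The elimination in steps 2 and 3 can be sketched in Python by representing each AND term of the consistency formula as a list of (variable, value) pairs, so that a term is contradictory when it assigns two different values to the same variable. The representation, the function names and the `membership` variable are assumptions for illustration and are not taken from the authors' check.

```python
def is_contradictory(conjunct) -> bool:
    """An AND term is always FALSE if one variable is given two different values."""
    seen = {}
    for variable, value in conjunct:
        if variable in seen and seen[variable] != value:
            return True
        seen[variable] = value
    return False


def is_consistent(dnf) -> bool:
    """The consistency formula is FALSE when every AND term is eliminated."""
    return all(is_contradictory(conjunct) for conjunct in dnf)


# Hypothetical failure condition: two rules with different conclusions both fire,
# which would need membership to be valid and expired at the same time.
impossible = [[("membership", "valid"), ("membership", "expired")]]
# Hypothetical failure condition that can actually occur.
possible = [[("membership", "valid"), ("fine_paid", True)]]
```

Here `is_consistent(impossible)` holds because the only failure condition is contradictory, whereas `is_consistent(possible)` does not, since that failure condition is satisfiable.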
We checked the consistency of all the subsystems as specified above and found them CONSISTENT. For illustration, Study_Material_ Accees subsystem consistency check is shown below:

C. Knowledge Models Completeness Check
Logical completeness and consistency are necessary but not sufficient for a knowledge model to be complete. It should be semantically complete as well, i.e., it must base its decisions on all information considered to be relevant by the expert [27].
One way to check the completeness of a knowledge model is to create the knowledge model with a single expert and review it with other experts who are not connected with its development. Following this approach, the knowledge model is created by knowledge engineers, and for the completeness check we selected 70 final year students (46 males and 24 females) of the college, as they seem to be the best suited experts in this domain. They were asked to use the tool for one whole day and check the correctness of the answers given by the virtual assistant. In total, the 70 students posed 836 questions to the assistant and found 829 answers correct. All of them were satisfied with the working of the tool, and an accuracy of 99.16% is reported.
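The reported accuracy follows directly from the two counts given above (829 correct answers out of 836 questions):

$$\text{accuracy} = \frac{829}{836} \approx 0.9916 = 99.16\%$$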

D. Validating the Semantic Consistency of Underlying Knowledge Items
Even if the expert knowledge has been properly encoded into an expert system knowledge-base, the KB will probably produce errors if the underlying expert knowledge is wrong. Therefore, it is important to validate the expert knowledge behind the knowledge-base. This can be done by checking the confidence level of experts.
The basic method for validating a knowledge item is [27]:
• Ask a panel of experts whether it is true or false.
• Tally the TRUE/FALSE answers.
• Analyze the results statistically.
This test was conducted on the same population of experts. All 70 students were asked to answer 35 questions (excerpts shown in Table III) as yes or no. The experts agreed unanimously on all 35 questions. The confidence level is computed using the formula
Confidence Level = 1 - (1 / 2^N)
where N is the number of experts. In this experiment, 100% confidence is reported; with N = 70, the term 1 / 2^N is negligibly small.
Fig. 3(a). Simulation of TF processor. Fig. 3(b). Simulation of Fact processor.
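To make the confidence computation concrete, the sketch below evaluates the formula 1 - 1/2^N exactly with Python's `fractions` module. The function name `confidence_level` is introduced here for illustration; only the formula itself and N = 70 come from the text.

```python
from fractions import Fraction


def confidence_level(n_experts: int) -> Fraction:
    """Exact value of 1 - 1/2**N for N unanimously agreeing experts."""
    if n_experts < 1:
        raise ValueError("at least one expert is required")
    return 1 - Fraction(1, 2 ** n_experts)


# For the 70 experts in the study the exact value is just below 1.
exact = confidence_level(70)
shortfall = 1 - exact  # equals 1 / 2**70, roughly 8.5e-22
```

The exact value is strictly less than 1, but the shortfall is far below the precision of a percentage, which is why the result rounds to 100%.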

E. Backward Chaining Algorithm Validation
As the next step of validation, the model is simulated on the pre-simulated domains of [22] and [23]. This step mainly validates the backward chaining module. A few existing models implement domain-specific backward chaining; to test our generic model, we used their simulations. Table IV shows the testing results of the proposed model for the backward chaining design in CLIPS with the help of Java, comparing the output of some pre-simulated examples of backward chaining with that of our proposed model. This testing indicates that the proposed model produces correct results in all the tested scenarios: 100% correctness is achieved for all the examples, with the added advantage of single-level explanations.

VI. Conclusion
People sometimes hesitate to interact with a stranger and ask their queries in a new environment. A virtual assistant provides a solution for this. In this paper, we proposed an interactive and intelligent student assistant, StuA, situated in a specific domain (i.e., the college environment), where it is capable of answering all four classes of queries of a newcomer to make him/her familiar with the new environment. It provides real-time, low-cost, expert-level assistance with 24X7 availability. The model is designed using CLIPS as it allows inferencing. Further, it provides a one-level explanation of queries, which is its advantage over other existing models. We have addressed the limitation of CLIPS by proposing and implementing a generic model of backward chaining. The model is validated on various pre-simulated examples, which gave 100% correctness. Successful checks are also performed for the logical completeness and consistency of the knowledge-base. Semantic completeness is checked with the help of 70 domain experts, and an accuracy of 99.16% is found. The proposed system is restricted to the domain of college and academics, but it can be customized to various other domains, such as the reception of hotels, hospitals and offices, or schools, malls and other organizations. The proposed model of backward chaining can further be extended to incorporate loops and other higher-level programming components. This paper opens many future possibilities. A simpler query-writing mechanism can be designed, which might be achieved through Natural Language Processing (NLP). The system can also be made more adaptive by adding learning. With learning and continuous user interaction, it can become more efficient in answering queries.

[Table IV. Column headings: Pre-simulated scenarios Results; Results generated by the proposed model.]