Career Recommendation System for Scientific Students Based on Ontologies

Students are often unaware of their own skills and choose to follow the trend rather than the proper pathway, which negatively affects the professional sector and the development of the country. Orienting and guiding students would offer considerable benefits. Building appropriate student profiles is the key to accurate orientation. Relying solely on the grade point average (GPA) is not sufficient and can mislead the guidance. Instead, students' personality and skills have to be studied in order to provide them with their real orientation. The presented solution aims to orient students towards the most suitable career, based on a mathematical model that is valid for all education systems and takes into account trade trends and students' capabilities.


Introduction
This paper is an extension of work originally presented at the Computing Conference, London, UK, 2017 [1].
Education plays an important role in economic development. Its anticipation and success are among the factors that explain the essential differences in standards of living between countries. Romer, Lucas, and Mankiw have shown that education is the engine of economic growth [2]. In [3], the author shows that education is a basic need of every individual. Nowadays, everyone is concerned with selecting the institution that best suits his choices and interests. In making these decisions, individuals must consult a large number of physical records and institutional perspectives, so making the right decision becomes a hard task [3].
Although career choice is fundamental to the future of the individual, the assignment is done in an unqualified, unreliable way. On the other hand, the use of new technologies has become a necessity in all areas, and the benefits they can bring are unmistakable. This has led scientists and researchers to look for the best way to use, implement, and benefit from them. According to [4], several economic and socio-educational problems can be solved by introducing genuine career guidance in the education system.
Providing guidance to students greatly enhances their academic and professional success by helping them discover their interests, values, and skills [5].
Without up-to-date user information, user profiling would be more difficult. In fact, data is available across multiple platforms where users interact with different web content: social activities, online learning platforms, and more. Proper use of this data significantly helps to build a profile of these users in any system [6], [7]. In this context, this paper proposes building ontology-based profiles for career recommendation. The paper consists of five sections. First, it describes the Moroccan education system (MES). It then introduces the classification of students based on a combination of grades and skills, and focuses on the prerequisites needed to enter the desired field or pathway. It continues by describing our solution for career recommendation. Finally, it conducts an experiment on different students and evaluates the effectiveness of the proposed solution, as shown in figures 1 and 2.

General Context
In [8], the author defined 'career guidance and orientation' as a procedure that helps individuals to know and understand themselves and the world of work, with the specific purpose of making professional and educational life choices.
Another definition, given in [9], is: 'Services that intend to assist any person, at any phase of his life, to make educational, training or occupational choices or manage his career'.
However, in the MES, students select a pathway without considering their personal skills and capabilities. The main cause is the strong dependence between the assignment and the cumulative grade point average (GPA): the higher your GPA, the more likely you are to enter your desired field, even if you do not have the required skills. This penalizes the education system as well as the students, causing them to miss their careers. In [10], the author emphasizes that reflection and personalization can be directly related to student profiles to resolve this issue.
The objective of this work is to offer a guidance system to scientific students at any stage. In [11], a model for guiding 9th grade students in an e-learning context was presented. However, this solution does not work for higher-level studies. 9th grade students study roughly the same subjects as in the first year of high school, but this is not the general situation: at other stages, students will study subjects totally different from those of their current year, which prevents us from comparing these subjects.
In [12], Amine proposed a solution for assigning 9th grade students of the MES (described in detail in Chap 2.2) to their adapted pathway. However, this model cannot be applied at a higher level. It does not work for students of the preparatory classes for engineering schools (PCGE), because it is not possible to compare their current subjects with the ones that they will encounter in engineering schools.
Then, in [1], the author developed an automated system that assigns PCGE students to their adapted engineering specializations, by converting the extracted data to the same type.
After that, in [13], a general model for career recommendation was established, enabling the guidance of scientific students at any stage of their life.
In this paper, student profiles are built with the aid of ontologies, allowing better performance and accuracy for the proposed career recommendation solution.
3. High school: 3 years. At this point, the student has to choose a field from the 15 offered by the MES, 5 of which are scientific. In the 2nd year, the student sits for a primary exam. In the 3rd and final year, he sits for the final exam, as shown in figure 3.
In the MES, the grades are scaled from 0 as a minimum, to 20 as a maximum.

Process of Career Guidance
From figure 4, it can be seen that the MES consists of 4 main stages.
In the 1st and 2nd stages, guidance is not required: there are only common studies.
In the 3rd stage (high school), the student has to select one of the 15 fields offered by the MES: career recommendation is mandatory at this stage. To do so, the solution proposed in [1], based on RIASEC, is used. This model compares the students' grades of their current year with those of the next year, calculated according to the subjects' coefficients in each field.
In the 4th stage, using the same solution for higher education is not possible, due to the great difference between the subjects of the university and those of high school.
As a solution, the skills model can be used: the subjects will be converted into a standard type, of the same range. Thus, comparing these subjects one with another can be easily done, without any confusion.
In this paper, the science process skill (SPS) model is chosen because it best suits the goal of this work: SPS was developed on the basis of scientific research and is associated with cognitive and investigative skills [14], and this work targets scientific students.

Student Profile
According to [3], [15], profiling can be defined as the 'process of extracting, integrating and identifying the keyword-based information, in order to produce a structured profile, and visualize the knowledge outside of these results'. It is a major concept for retrieving the user's pertinent information and solving difficult problems of a recommender system, such as classifying items according to an individual's interests [3].
The process of profiling a web user consists of obtaining the values of the different properties that form the user's model [16]. It can be either behavior-based or knowledge-based (already known/factual) [16].
Social profiling [17] techniques will also be used for user profiling: this process can be simplified by the implicit use of users' data during their registration via Facebook [

Student Ontology
Collecting student modeling data is a time-consuming process that demands the development of complex data structures to represent the student's personal information, knowledge and behavior in the learning domain [18]. Recently, student modeling researchers have begun to adopt technologies, applications, and standards from the Semantic Web to solve the problems mentioned above [18]. Chen & Mizoguchi were the first to introduce the use of ontologies for modeling learners [19]. Kay also argues for using this technology for reusable and scrutable student models [20].

Ontology Use
After the student's data is collected, it has to be converted into a format compatible with knowledge representation and reasoning systems, so that it can serve as input for the adaptive systems. Despite these requirements, student modeling data is generally stored in proprietary and hard-to-access formats, which does not encourage its reuse or distribution [18].
Choosing RDF and RDFS for developing the student ontology would be a good approach. However, it is better to use the Web Ontology Language (OWL), in order to benefit from its wide functionality, up-to-dateness, tool support, and status as an official W3C recommendation [18].
This work presents a different method of student model representation and demonstrates a way of achieving it using Semantic Web technologies.
Building an efficient student model based on web ontology languages (OIL, DAML+OIL, RDF/RDFS, OWL) is not straightforward.
The ontology language chosen is OWL DL [21], because it is an official W3C recommendation, in addition to its functions and tool support, especially the Protege 3.0 development tool. In addition, OWL has the benefits of formal semantics, easy reuse, easy portability, and automatic serialization into a format compatible with popular logical inference engines [18].
www.astesj.com

Guidance Skill Model
The concept assumes that a subject consists of a set of skills with different weights, which enables the student's grades of the current year to be compared with next year's subjects without any confusion, even if they belong to dissimilar ranges. This work is based on the SPS because it corresponds well to the needs of this research and suits its goal: SPS was developed on the basis of scientific research, and this research targets scientific students. With this approach, the grades can be described as a set of SPS weighted vectors, which makes it possible to detect the student's real skills.
An SPS test was developed for measuring the students' skills, following the same approach as the author of [22]: the Test of Science Process Skill (TSPS) consists of multiple-choice items. According to [23], SPS can be classified into two categories:
• BSPS: Basic science process skills.
• ISPS: Integrated science process skills.
This work is limited to 10 skills, considered pertinent by the members of the development team. These skills are enough to describe any subject: each subject is described as a set of skills with different weights, varying from a minimum of 0 to a maximum of 10.
The skills selected are:

System Architecture and Design

General Architecture
We use the general architecture for career guidance developed by Amine [12] (fig. Process Architecture). It consists of four main modules:
• Assignment module: relies on an algorithm that assigns each student to an appropriate category, according to the RIASEC/SPS test results as well as his or her academic background.
• Ontology module: stores all the student's data. This module demonstrates the way in which the information content is structured.
• Recommendation module: exploits the assignment results and the usage history to give directions to the Adaptation module.
• Adaptation module: adjusts the contents to the student's individual needs.
Initially, a student profile is created by filling in a form. When the student's 'national code' (NC) is entered, the system connects to the student's personal information and academic background. If an educational system is not yet structured, the student has to supply this static information directly. After registering, elementary school students answer a questionnaire based on Holland's model, and university students answer an SPS-based test. This determines the student's personal and professional tendencies and his real capabilities. These answers are then exploited, together with his educational background, to determine his profile. The use of RCS is necessary for this. The profile is modeled by ontologies. His interaction preferences, content preferences and motivations (dynamic information) can also be determined by means of his browsing history and his interactions or behavior. The results are stored in the system, so that they are available at any time for the adaptation module, as shown in figure 5. Ontologies based on the Semantic Web can be used to model and structure the educational domain, to be shared by a group of students. An ontology is an explicit specification of a conceptualization, or a model [24].
In this paper, ontologies are used to identify pedagogical concepts and semantic links, in order to represent what already exists. Students can thus obtain educational resources dynamically and adapt them to their interests. Figure 6 represents a part of the ontology diagram, composed of 4 main higher levels:
• Student's Level: the student's background, his current level, and his results based on the proposed solution.
• Student's Profile: the student's classification based on the RIASEC/SPS model, and his current career pathway assignment.
• Content: information related to the pedagogical content of the students' interface.
The purpose of this career recommendation ontology is to take advantage of student orientation in order to personalize their web content and define similarities of profiles.
Figure 7 shows part of the ontology's object properties.
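As an illustration of how the levels above and their object properties could be related, the following minimal sketch stores a few (class, property, class) triples in plain Python. It is only a stand-in for the OWL DL ontology: every class and property name used here (hasLevel, hasProfile, consults, and so on) is hypothetical and is not taken from figures 6 and 7.

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]  # (domain class, object property, range class)

# Hypothetical class and property names; the OWL DL ontology of the paper may differ.
ONTOLOGY: Set[Triple] = {
    ("Student", "hasLevel", "StudentLevel"),
    ("Student", "hasProfile", "StudentProfile"),
    ("Student", "consults", "Content"),
    ("StudentLevel", "hasBackground", "AcademicBackground"),
    ("StudentProfile", "hasClassification", "SkillCategory"),
    ("StudentProfile", "hasAssignment", "CareerPathway"),
}


def properties_of(subject: str) -> List[str]:
    """Return the object properties whose domain is the given class."""
    return sorted({p for s, p, _ in ONTOLOGY if s == subject})


def range_of(subject: str, prop: str) -> List[str]:
    """Return the classes that a property of the given class points to."""
    return sorted(o for s, p, o in ONTOLOGY if s == subject and p == prop)


print(properties_of("Student"))
print(range_of("StudentProfile", "hasAssignment"))
```

A real implementation would express the same relations as OWL classes and object properties, so that an inference engine can reason over them.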

Notations
This work uses the same nomenclature as [1]:
• $n$: total number of students.
• $Std_i$: student number $i$.
• $m$: total number of skills.
• $S_k$: skill number $k$.
• $p$: total number of subjects of the current year.
• $Sub_j$: subject number $j$.
• $q$: total number of subjects of the school's fields.
• $t$: total number of the school's fields.
• $g_{ij}$: grade of student $i$ in subject $j$.

Representation of a Subject on the Skill Mode
Using SPS is essential for learning with understanding [25].
For that reason, this paper proposes a model for representing a subject in the skill mode. Comparing two subjects of different types requires that they be represented in the same format. The author in [1] therefore proposed the theory of the skill mode: a subject $j$ can be described as a vector $Sub_j$ composed of a set of skills with different weights $\omega_{jk}$:
$$Sub_j = (\omega_{j1}, \omega_{j2}, \ldots, \omega_{jm})$$
where $m$ is the total number of skills. For instance, Mathematics can be described as follows:
$$Math = (2, 8, 9, 8, 2, 6, 2, 9, 3, 4)$$
meaning that the Observing skill $S_1$ has a weight of 2, the Questioning skill $S_2$ a weight of 8, ..., and the Memorizing skill $S_{10}$ a weight of 4. These weights can be assigned according to the institution's requirements and the tutors' vision, which means that they can vary from one school to another.
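After building all $q$ subject vectors, the skill mode matrix composed of these subjects is defined as follows:
$$M_{Sub} = \begin{pmatrix} \omega_{11} & \omega_{12} & \cdots & \omega_{1m} \\ \omega_{21} & \omega_{22} & \cdots & \omega_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ \omega_{q1} & \omega_{q2} & \cdots & \omega_{qm} \end{pmatrix}$$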
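The subject vectors and the skill mode matrix described above can be sketched in a few lines of Python. This is a minimal illustration, assuming 10 skills and weights between 0 and 10 as stated earlier; the function names are hypothetical, the Math vector is the one given in the text, and the Physics vector is invented purely for the example.

```python
from typing import Dict, List

SKILL_COUNT = 10                 # m in the paper's notation
MIN_WEIGHT, MAX_WEIGHT = 0, 10   # weight range stated in the paper


def make_subject_vector(weights: List[int]) -> List[int]:
    """Validate and return the skill mode vector of one subject."""
    if len(weights) != SKILL_COUNT:
        raise ValueError(f"expected {SKILL_COUNT} weights, got {len(weights)}")
    for w in weights:
        if not MIN_WEIGHT <= w <= MAX_WEIGHT:
            raise ValueError(f"weight {w} is outside [{MIN_WEIGHT}, {MAX_WEIGHT}]")
    return list(weights)


def skill_mode_matrix(subjects: Dict[str, List[int]]) -> List[List[int]]:
    """Stack subject vectors: one row per subject, one column per skill."""
    return [make_subject_vector(w) for w in subjects.values()]


subjects = {
    "Math": [2, 8, 9, 8, 2, 6, 2, 9, 3, 4],     # vector given in the text
    "Physics": [7, 6, 5, 6, 8, 5, 4, 7, 3, 2],  # hypothetical, for illustration only
}
matrix = skill_mode_matrix(subjects)
print(len(matrix), "subjects x", len(matrix[0]), "skills")
```

Validating the weights when the vector is built keeps later comparisons between subjects on the same 0 to 10 scale.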

Representing a student's grades on the Skill Mode
The first step is to transform the students' grades into a standard form, as was done previously for the subjects.
The skill mode representation allows the grades to be manipulated, used, and compared with high flexibility. Let $G_{Student_i} = (g_{i1}, g_{i2}, \ldots, g_{ip})$ be the student's grades, and let $M_{Sub}$, with weights $\omega_{jk}$, be the skill mode matrix of the corresponding subjects. The weight $\omega_{jk}$ refers to the weight of skill $k$ for subject $j$.
Following the same process as for the subjects, the student can be represented by a vector composed of his subjects' grades converted to the skill mode:
$$G^*_i = (G^*_i(1), G^*_i(2), \ldots, G^*_i(m))$$
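The conversion of a student's grades to the skill mode can be sketched as follows. The exact conversion formula is not reproduced in this text, so the sketch assumes a weighted average: the grade for skill $k$ is the average of the subject grades, weighted by the weight of skill $k$ in each subject. The function name and the sample values are hypothetical.

```python
from typing import List


def grades_to_skill_mode(grades: List[float],
                         matrix: List[List[float]]) -> List[float]:
    """Convert p subject grades into m skill grades.

    ASSUMPTION: each skill grade is the average of the subject grades,
    weighted by the weight of that skill in each subject. The paper's own
    formula may differ.
    """
    if len(grades) != len(matrix):
        raise ValueError("one grade is required per subject row")
    skill_count = len(matrix[0])
    result = []
    for k in range(skill_count):
        total_weight = sum(row[k] for row in matrix)
        if total_weight == 0:
            # No subject exercises this skill, so nothing can be measured.
            result.append(0.0)
            continue
        weighted = sum(g * row[k] for g, row in zip(grades, matrix))
        result.append(weighted / total_weight)
    return result


# Two subjects, three skills; grades are on the 0 to 20 scale of the MES.
sample = grades_to_skill_mode([16, 12], [[2, 8, 0], [6, 2, 0]])
print(sample)
```

With a weighted average, the skill grades stay on the same 0 to 20 scale as the original grades, which keeps them easy to read.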

Calculation of the Student's Real Grades
Using the GPA alone, without taking into consideration the real abilities of the learner, leads to an erroneous orientation [23]. Using an SPS test is crucial for understanding and validating one's real abilities, allowing a more realistic representation of one's profile.
In this paper, in addition to the SPS test, the profiling vector is added to the results vector in order to obtain more accurate results.
Let $C_i = (c_{i1}, c_{i2}, \ldots, c_{im})$ be the vector of the coefficients of the student's skills resulting from his SPS test.
The vector $P_i = (p_{i1}, p_{i2}, \ldots, p_{im})$ represents the coefficients of the student's skills calculated from his profiling data.
The student's real grades are defined as the vector $R_i = (r_{i1}, r_{i2}, \ldots, r_{im})$, where each $r_{ik}$ is obtained from the student's skill-mode grade and the coefficients $c_{ik}$ and $p_{ik}$ (equation (4)). The vector $R_i$ represents the student's real profile and capacities, and will be used later on for calculating his assignment.
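To make the role of the two coefficient vectors concrete, here is a minimal sketch of a real-grade computation. Equation (4) is not reproduced in this text, so the combination rule below is an assumption: each skill grade is scaled by the mean of the SPS coefficient and the profiling coefficient, both taken in [0, 1]. The function name and sample values are hypothetical.

```python
from typing import List


def real_grades(skill_grades: List[float],
                sps_coeffs: List[float],
                profile_coeffs: List[float]) -> List[float]:
    """Combine skill-mode grades with SPS and profiling coefficients.

    ASSUMPTION: r_ik = G*_i(k) * (c_ik + p_ik) / 2, with c_ik and p_ik
    in [0, 1]. This is an illustrative rule, not the paper's equation (4).
    """
    if not (len(skill_grades) == len(sps_coeffs) == len(profile_coeffs)):
        raise ValueError("the three vectors must have one entry per skill")
    for value in list(sps_coeffs) + list(profile_coeffs):
        if not 0.0 <= value <= 1.0:
            raise ValueError("coefficients are assumed to lie in [0, 1]")
    return [g * (c + p) / 2
            for g, c, p in zip(skill_grades, sps_coeffs, profile_coeffs)]


# A student strong on the first skill according to both sources,
# and average on the second one.
print(real_grades([13.0, 15.2], [1.0, 0.5], [0.8, 0.5]))
```

Under this assumption, a high grade in a skill that neither the SPS test nor the profiling data confirms is scaled down, which reflects the idea that grades alone can mislead the orientation.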
Let $F_f$ be a field of a certain school; this field consists of a set of subjects $S_j$. The vector $SF_f = (S_1, S_2, \ldots, S_q)$ is the representation of the field $F_f$ in the skill mode. The School Fields Matrix can then be defined as:
$$SF = (SF_1, SF_2, \ldots, SF_t)^T$$
where $t$ is the total number of fields. From $R_i$, the real grade vector of student $i$, and $SF$, the representation of the fields in the skill mode, the assignment (affectation) vector $A_i = (a_{i1}, a_{i2}, \ldots, a_{it})$ can be inferred, where $a_{if}$ is calculated on the basis of the student's grades for the field $F_f$ (equation (6)). From (6), it can be inferred that the assignment is directly related to $a_{if}$: the greater $a_{if}$, the better the student's grade in the field $F_f$, leading to a more preferable assignment. Finally, the best choice for student $i$ is the field $\chi$ such that:
$$a_{i\chi} = \max_{1 \le f \le t} a_{if}$$
$F_\chi$ is the student's best-fitted choice.
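The selection of the best-fitted field can be sketched as follows. The formula for $a_{if}$ is not reproduced in this text, so two assumptions are made and labeled in the code: a field is summarized by the skill-wise mean of its subject vectors, and $a_{if}$ is the average of the real grades weighted by the field's skill weights. Only the final step, choosing the field with the greatest $a_{if}$, follows directly from the text. All names and values are hypothetical.

```python
from typing import Dict, List


def field_vector(subject_vectors: List[List[float]]) -> List[float]:
    """Summarize a field by the skill-wise mean of its subject vectors.

    ASSUMPTION: the paper does not state how subjects are aggregated.
    """
    if not subject_vectors:
        raise ValueError("a field needs at least one subject")
    count = len(subject_vectors)
    return [sum(column) / count for column in zip(*subject_vectors)]


def assignment_vector(real: List[float],
                      fields: Dict[str, List[float]]) -> Dict[str, float]:
    """Score each field for one student.

    ASSUMPTION: a_if is the average of the real grades, weighted by the
    field's skill weights. The paper's equation (6) may differ.
    """
    scores = {}
    for name, weights in fields.items():
        total = sum(weights)
        dot = sum(r * w for r, w in zip(real, weights))
        scores[name] = dot / total if total else 0.0
    return scores


def best_field(scores: Dict[str, float]) -> str:
    """Return the field with the greatest score, as described in the text."""
    return max(scores, key=scores.get)


fields = {
    "F_1": field_vector([[8, 2], [8, 2]]),  # field dominated by the first skill
    "F_2": [2, 8],                          # field dominated by the second skill
}
scores = assignment_vector([10.0, 16.0], fields)
print(scores, best_field(scores))
```

In this example the student is stronger in the second skill, so the field that relies most on that skill obtains the greater score.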

Experiments and Results
In this section, the presented model is tested experimentally. To do this, a sample of 50 baccalaureate students who passed the final high school exam was chosen. Their assignments are calculated, and the results are then analyzed. The scientific fields chosen for this experiment are:
• Mathematics Sciences (SMA).
10 students from each field were chosen to form the test sample.
Two of the main institutions of the city of Tangier were chosen, namely:
• The Faculty of Science of Tangier (FST).
• The Office of Vocational Training and Promotion of Labor (OFPPT).
The fields available at the FST are:
• Mathematics, Computer science, Physics and chemistry (MIPC).
The OFPPT has 4 fields or specializations. The duration of studies is 2 years, with a total of 4 semesters.
The fields taught at the OFPPT are:
• Computer Development Techniques (TDI).
First, the students' grades are extracted from their final exams.
These grades are then studied for career recommendation purposes.

Subject Matrix of the Final Year of High School
Let $M_{BAC}$ be the matrix of the subjects of the final high school exam. After the members of the research team assign the skill weights of each subject, $M_{BAC}$ can be written as below:
In the same way, $G^*_{10}(2)$, $G^*_{10}(3)$, ..., and $G^*_{10}(10) = 49.3$ are calculated in order to obtain $G^*_{10}$. The grade vectors of the other students, represented in the skill mode, are calculated in the same way. For calculation details and results, please refer to the appendix.

Calculation of the Students' Real Grades
The real grades of the students are calculated from:
• The grade vectors represented in the skill mode, $G^*_{10}, G^*_{11}, \ldots, G^*_{59}$.
The grade vectors represented in the skill mode were calculated in the previous section. The students' skill coefficients are described in detail in the appendix 'Calculus of the student Reel Grades'. Equation (4) can then be used to calculate the real grades, for example $R_{G10}$, the real grade of student 10. The same calculation is applied to all the other students. For the full results and calculations, please refer to the appendix, subsection 'Calculus of the student Reel Grades'.

Calculation of the School Fields Matrix
The following abbreviations will be used for the Faculty of Science of Tangier:
• $F_{MP}$: Mathematics, Physics, Chemistry & Computer science.
• $F_{GEM}$: Electrical and mechanical.
• $F_{BCG}$: Biology, Geology & Chemistry.
For the OFPPT, the following abbreviations will be used:
• $F_{TDI}$: Computer software techniques.
• $F_{TRI}$: Network techniques.
• $F_{TDM}$: Multimedia development techniques.
• $F_{Inf}$: Infographics.
The matrix $FieldR_G$ is deduced from the calculation of all the skill vectors, for all the modules and fields:

Calculation of the Assignment Vector
Using (7), the assignment of Student 10 can be calculated, which gives $F_{fst} = 1$, corresponding to $F_1$: Student 10 should be assigned to the MP field. The assignments of the other students are calculated in the same way.
The following tables show the assignment results for each student.
The results are grouped by the students' high school fields. The grades can easily be compared with the assignment of each field, for each institution.
Tables 6 to 10 show the results for the different fields. From Tables 6 to 10, Table 11 can be established, which summarizes the students' assignments for both institutions, the FST and the OFPPT. In each institution, the assignment of each student is calculated. Their real choice is then compared with their most suitable field, and the result is expressed in the relevance column: 'True' for an accurate choice, and 'False' otherwise.
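The relevance column and the resulting accuracy percentage can be sketched as follows. The record layout, the function name, and the four sample rows are hypothetical and do not reproduce the values of Tables 6 to 12; they only show how a real choice is compared with the best-fitted field.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    student_id: int
    real_choice: str  # field actually chosen by the student
    best_field: str   # field computed by the model

    @property
    def relevant(self) -> bool:
        """True when the student's choice matches the computed best field."""
        return self.real_choice == self.best_field


def relevance_rate(records: List[Record]) -> float:
    """Percentage of students whose real choice is relevant."""
    if not records:
        raise ValueError("at least one record is required")
    return 100.0 * sum(r.relevant for r in records) / len(records)


# Hypothetical rows, for illustration only.
records = [
    Record(1, "MIP", "MIP"),
    Record(2, "MIP", "GEM"),
    Record(3, "BCG", "BCG"),
    Record(4, "GEM", "MIP"),
]
for r in records:
    print(r.student_id, r.real_choice, r.best_field, r.relevant)
print(relevance_rate(records))
```

The same function can be applied per institution or to the whole sample to obtain the summary statistics.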

Analysis
In this section, the results are analyzed.
From these calculations, the percentage of relevant choices made by the students can be inferred. Overall, more than 50% of the students made a wrong choice, which means that 1 out of 2 students will follow a wrong pathway, leading him to failure. We can also notice that all the students with a high GPA chose the FST rather than the OFPPT. In this institution, many students tend to select the MIP field just because it is the trend, while they would perform better in other fields. This is the case for students 12 and 31, among others. We can conclude that students choose a given field only if their GPA allows it. Figure 8 compares the assignment with the real choices for the FST. We expected that more than 25 students would choose the MIP, 5 the BCG, and 17 the GEM.
In reality, however, 15 preferred to attend the MIP, 15 the BCG, and 20 the GEM. This is due to the quota imposed by the institution: students cannot make a choice above the threshold. They are aware of their acceptance probabilities, and do not apply for a field where they have better chances, in order not to miss their opportunities. Table 12 illustrates the final choice made by the students and compares it to the assignment for each institution.
c refers to choice and R to relevance.
Figure 9: OFPPT Affectation: Orientation vs. Reality
We expected that 12 students would choose the TDI, 10 the TRI, 13 the TDM, and 12 the Inf. In reality, 12 preferred the TDI, 11 the TRI, 17 the TDM, and 10 the Inf. The difference is not as great as for the FST, except for the TDM field. But in general, the students tend to choose. From Table 12, Figure 10 can be generated, which describes the relevance for each institution as well as the general assignment accuracy statistics.

Conclusion
In this work, an orientation model guiding science students towards their most suitable career was presented, tested and analyzed. The results of this research show that the current orientation system penalizes students, who are forced to choose courses that are not suited to them and make erroneous choices. The proposed solution would provide better guidance, promoting a better future for students and for the country.
The developed model has been submitted to the School, which works in collaboration with the research team, for the purpose of using the first version of the solution. This model can easily be adapted to any education system and customized according to each organization's needs.
In future work, we intend to integrate Artificial Intelligence (AI) methodologies based on mathematical theories, in order to improve the current model and make it more meaningful.