Publications
Lead author:
Co-author(s):






Title:
Detecting hidden structures from Arabic electronic documents: Application to the legal field
Conference:
The 14th International Conference on Software Engineering Research, Management and Applications (SERA)
Month:
June
Year:
2016
Journal, review, proceedings ...:
Country:
United States
City:
Towson
Publication type:
International conference
Abstract:
Dealing with unstructured information is an active research topic, since most documents exist in unstructured form. Effectively exploiting unstructured documents, although difficult, is of paramount importance to Information Retrieval (IR). The key to using an unstructured data set is to identify the hidden structures within it. In this paper, we present an approach to recognizing the semantic structure of documents in Arabic legal data. This structure expresses several main concepts of a document, including the title and the headings of chapters, sections, subsections, etc.
This structural information is used to obtain a richer and more fine-grained annotation of documents, forming a useful and coherent infrastructure ready for IR. Experiments were conducted to evaluate our approach, and the initial results seem promising.
 
Keywords: document ontology, document structure extraction, document annotation, legal information retrieval