DOCTORAL THESIS

PROGRAMACIÓN LÓGICA DIFUSA PARA LA GESTIÓN FLEXIBLE DE DOCUMENTOS XML

FUZZY LOGIC PROGRAMMING FOR THE FLEXIBLE MANAGEMENT OF XML DOCUMENTS

Official Doctoral Programme in Advanced Computing Technologies (UCLM)

Author: Alejandro Luna Tedesqui (Departamento de Sistemas Informáticos, UCLM)
Supervisors: Ginés Moreno Valverde (Departamento de Sistemas Informáticos, UCLM)
Jesús M. Almendros Jiménez (Departamento de Informática, Universidad de Almería)
 

ABSTRACT


This thesis presents an extension of the popular XPath language that provides ranked answers to flexible queries. It takes advantage of fuzzy variants of the and, or and avg operators for XPath conditions, as well as two structural constraints, called down and deep, each with an associated degree of relevance. In practice, this degree is very low for answers that only weakly satisfy the original query, so they should not be computed, which reduces the computational cost of the information retrieval process. To improve the scalability of our interpreter when dealing with massive XML files, we use the ability of fuzzy logic programming to discard early those computations leading to non-significant solutions (i.e., those with a poor degree of relevance according to the preferences expressed by users through the new command FILTER). Since our proposal has been implemented with a fuzzy logic language, we have exploited the highly expressive resources of this declarative paradigm to perform "dynamic thresholding" in a very natural and efficient way. Apart from using our FLOPER environment to develop the interpreter, we also propose an implementation coded in the standard XQuery language. Basically, we have defined an XQuery library able to handle XPath expressions in a fuzzy way, so that our proposed fuzzy XPath can be encoded as XQuery expressions. The advantage of this approach is that any XQuery processor can handle a fuzzy version of XPath by using the library we have implemented.
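
The idea of ranking answers with structural penalties and discarding weak ones early can be illustrated with a minimal Python sketch. It is not the FLOPER or XQuery implementation described in the thesis: the function name `ranked_descendants`, the parameters `deep`, `down` and `threshold`, their default values, and the rule "multiply by `deep` per level and by `down` per preceding sibling" are all assumptions chosen for illustration.

```python
import xml.etree.ElementTree as ET


def ranked_descendants(root, tag, deep=0.9, down=0.8, threshold=0.0):
    """Return (degree, text) pairs for descendants named `tag`, best first.

    `deep`, `down` and `threshold` are hypothetical parameters in [0, 1].
    """
    results = []
    stack = [(root, 1.0)]  # the root starts with full relevance
    while stack:
        node, degree = stack.pop()
        for position, child in enumerate(node):
            # One level deeper costs `deep`; each preceding sibling costs `down`.
            child_degree = degree * deep * (down ** position)
            if child_degree < threshold:
                # Degrees never increase with depth or position, so neither this
                # subtree nor the remaining siblings can reach the threshold.
                break
            if child.tag == tag:
                results.append((child_degree, child.text))
            stack.append((child, child_degree))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


DOC = ET.fromstring(
    "<bib>"
    "<book><title>A</title></book>"
    "<book><title>B</title><book><title>C</title></book></book>"
    "</bib>"
)

everything = ranked_descendants(DOC, "title")
filtered = ranked_descendants(DOC, "title", threshold=0.5)
```

In this sketch the pruning is sound only because every factor is at most 1, so a partial degree below the threshold can never recover. That monotonicity is what makes an early cut-off safe.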

On the other hand, we present a method for debugging XPath queries, describing how XPath expressions can be manipulated to obtain a set of alternative queries matching a given XML document. For each newly proposed query, we give a "chance degree" that estimates its deviation w.r.t. the initial expression. Our work focuses on providing programmers with a repertoire of paths (containing new commands for "JUMP/DELETE/SWAP" tags) which can be used to retrieve answers. Our debugger is able to manage big XML documents by making use of the new command FILTER, which is intended to discard early those computations leading to non-significant solutions (i.e., with a poor "chance degree" according to the user's preferences). The key point, again, is the natural capability for performing "dynamic thresholding" enjoyed by the fuzzy logic language used for implementing the tool, which connects with the so-called "top-k answering problem", well known in the fuzzy logic and soft computing arenas.
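
A rough Python sketch of this kind of query repair is given below. It is only an illustration under stated assumptions, not the debugger itself: the function `debug_path`, the penalty values for `jump`, `delete` and `swap`, the product-based "chance degree" and the plain-path output format are all hypothetical. JUMP is modelled as replacing `/tag` with `//tag`, DELETE as dropping a step, and SWAP as replacing a tag with another tag found in the document.

```python
import xml.etree.ElementTree as ET


def debug_path(root, steps, jump=0.8, delete=0.7, swap=0.6, threshold=0.5):
    """Return (chance, path) pairs for repaired paths that match the document."""
    known_tags = sorted({element.tag for element in root.iter()})
    wrapper = ET.Element("wrapper")  # lets "/bib/..." be evaluated as "./bib/..."
    wrapper.append(root)
    best = {}  # path -> best chance degree found for it

    def explore(index, prefix, chance):
        if chance < threshold:
            return  # penalties only multiply by values <= 1, so prune here
        if index == len(steps):
            if prefix and wrapper.findall("." + prefix):
                best[prefix] = max(chance, best.get(prefix, 0.0))
            return
        tag = steps[index]
        explore(index + 1, prefix + "/" + tag, chance)           # keep the step
        explore(index + 1, prefix + "//" + tag, chance * jump)   # JUMP
        explore(index + 1, prefix, chance * delete)              # DELETE
        for other in known_tags:
            if other != tag:
                explore(index + 1, prefix + "/" + other, chance * swap)  # SWAP

    explore(0, "", 1.0)
    return sorted(((c, p) for p, c in best.items()), key=lambda cp: (-cp[0], cp[1]))


DOC = ET.fromstring("<bib><book><title>A</title></book></bib>")
# The misspelled step "boo" makes the original query return nothing.
suggestions = debug_path(DOC, ["bib", "boo", "title"])
```

For the misspelled query `/bib/boo/title`, the sketch proposes `/bib/book/title` (one SWAP) ahead of `/bib//title` (one DELETE plus one JUMP). Combinations whose chance degree falls below the threshold are never expanded.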

Regarding non-standard applications, in the last block of this thesis we reinforce the bilateral synergies between fuzzy XPath and FLOPER. In particular, we deal with propositional fuzzy formulae containing several propositional symbols linked with connectives defined in a lattice of truth degrees more complex than Bool. We first recall a fuzzy SMT (Satisfiability Modulo Theories) based method for automatically proving theorems in relevant infinitely-valued logics (including Łukasiewicz and Gödel). Next, instead of focusing on satisfiability (i.e., proving the existence of at least one model), as is usually done in a SAT/SMT setting, our interest moves to the problem of finding the whole set of models (with a finite domain) for a given fuzzy formula. We re-use a previous method based on fuzzy logic programming where the formula is conceived as a goal whose derivation tree, provided by our FLOPER tool, contains in its leaves all the models of the original formula, together with other interpretations (obtained by exhaustively interpreting each propositional symbol in all possible ways according to the whole set of values collected in the underlying lattice of truth degrees). Next, we use the ability of the fuzzy XPath tool to explore these derivation trees once exported in XML format, in order to automatically discover whether the formula is a tautology, satisfiable, or a contradiction.
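
The exhaustive search for models over a finite lattice can be sketched in a few lines of Python. This sketch replaces the FLOPER derivation tree and its fuzzy XPath exploration with a plain brute-force enumeration, so it only illustrates the underlying idea. The three-valued chain `LATTICE`, the function names, the use of `1 - x` as negation, and the convention that a model is an interpretation evaluating the formula to the top element are assumptions made here.

```python
from fractions import Fraction
from itertools import product

# Assumed finite chain of truth degrees; exact fractions avoid rounding issues.
LATTICE = [Fraction(0), Fraction(1, 2), Fraction(1)]


def neg(x):
    return 1 - x  # standard involutive negation (an assumption of this sketch)


def luka_or(x, y):
    return min(Fraction(1), x + y)  # Lukasiewicz disjunction


def luka_and(x, y):
    return max(Fraction(0), x + y - 1)  # Lukasiewicz conjunction


def godel_or(x, y):
    return max(x, y)  # Godel disjunction


def find_models(formula, symbols, lattice=LATTICE):
    """List every interpretation that evaluates `formula` to the top element."""
    top = max(lattice)
    return [
        dict(zip(symbols, values))
        for values in product(lattice, repeat=len(symbols))
        if formula(*values) == top
    ]


def classify(formula, symbols, lattice=LATTICE):
    """Label a formula by how many of its interpretations are models."""
    count = len(find_models(formula, symbols, lattice))
    if count == len(lattice) ** len(symbols):
        return "tautology"
    return "satisfiable" if count > 0 else "contradiction"


excluded_middle_luka = classify(lambda p: luka_or(p, neg(p)), ["p"])
excluded_middle_godel = classify(lambda p: godel_or(p, neg(p)), ["p"])
clash = classify(lambda p: luka_and(p, neg(p)), ["p"])
```

The same formula shape can change status with the connectives: with these definitions, `p or not p` is a tautology under the Łukasiewicz disjunction but merely satisfiable under the Gödel one, because the intermediate degree 1/2 is not a model.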

Objectives and structure of the Thesis

After introducing in the first two chapters some preliminary concepts regarding fuzzy logic, logic programming and the "Fuzzy LOgic Programming Environment for Research" FLOPER, in Chapter 3 we detail the design of our fuzzy XPath interpreter, a fuzzy variant of the popular XPath query language for flexible information retrieval on XML documents. It provides a repertoire of operators for managing satisfaction degrees by adding structural constraints and fuzzy operators inside conditions, in order to produce a ranked list of answers according to the user's preferences when composing queries [ALM11a; ALM11b; ALM12c].

By using the FLOPER system, our proposal has been implemented with a fuzzy logic language to take advantage of the clear synergies between the target and source fuzzy languages [ALM15a]. In Chapter 4 we discuss the advantages of exploiting the highly expressive resources of this declarative paradigm for performing "dynamic thresholding" when evaluating queries [ALM14a]. Moreover, in Section 4.3 we provide an alternative implementation based on XQuery, which increases the portability of the fuzzy XPath interpreter [ALM14b].

In Chapter 5 we recast from [ALM12a; ALM12b; ALM13] our recently designed method for debugging XPath queries, which produces a set of alternative XPath expressions (where some tags have been "jumped", "deleted" or "swapped") with higher chances of retrieving answers from XML files. The use of filtering techniques in the FLOPER-based implementation of the tool is once again the key to gaining efficiency and increasing scalability when managing very large XML documents [ALM15b].

Regarding applications, in Chapter 6 we describe a new feedback loop between FLOPER and fuzzy XPath. In [ALMV13; ALMV15] we focus on the ability of our interpreter to explore derivation trees generated by FLOPER once they are exported in XML format. This serves as a debugging tool for analyzing computational details such as discovering the set of fuzzy computed answers for a given goal, performing depth-first or breadth-first traversals of its associated derivation tree, finding non fully evaluated branches, etc. This relationship grows through the connections we establish with recent (fuzzy) SAT/SMT techniques, as explained in [ABL+15].

Finally, Chapter 7 concludes the thesis with a brief summary of the achieved results and some lines of future work.


REFERENCES

Papers co-authored by Alejandro Luna T.

JCR JOURNAL

INTERNATIONAL CONFERENCES

NATIONAL CONFERENCES

DOWNLOADS & LINKS

DOWNLOADS

Full document and Slides
Summary (English and Spanish)
Poster

LINKS

FuzzyXPath Interpreter and Debugger (Web)
FuzzyXPath Interpreter Test (Online)
FuzzyXPath Debugger Test (Online)
FuzzyXPath Interpreter Statistics (Online)
FuzzyXPath Debugger Statistics (Online)
DEC-Tau Research Group (Web)
FLOPER - Fuzzy LOgic Programming Environment for Research (Web)
FLOPER Test (Online)

CURRICULUM VITAE

ACADEMIC BACKGROUND

Primary education: Escuela José Manuel Indaburo
Secondary education: Colegio Antonio Díaz Villamil
High school diploma in humanities: Colegio Delfín Eyzaguirre
University studies:
  • Universidad de Castilla-La Mancha, Official Master's in Advanced Computing Technologies
  • Universidad Salesiana de Bolivia, Diploma in Higher Education
  • Universidad Mayor de San Andrés, Computer Science degree programme
  • Universidad Mayor de San Andrés, Philosophy degree programme
  • Universidad Central, training course in computer hardware, maintenance and repair
Profession: Licentiate in Computer Science, specialization in Systems Engineering
Postgraduate: Master's in Advanced Computing Technologies (UCLM)
In progress: Doctorate in Advanced Computing Technologies (UCLM)

RESEARCH CONTRACTS AND GRANTS

  • Project "DAMAS: Una Aproximación Declarativa al Modelado, Análisis y Resolución de Problemas", reference TIN2013-45732-C04-2-P.
  • Project "Lenguajes Declarativos y Herramientas Para Datos WEB", reference TIN2008-06622-C03-03.
  • Project "ALDDEIA: Aplicaciones de la Lógica Difusa al Desarrollo de Entornos Informáticos Avanzados", reference TIN2007-65749.
  • Completed the Master's in Advanced Computing Technologies thanks to a scholarship awarded by Fundación Carolina.

RESEARCH GROUPS

  • DEC-TAU, Declarative Programming and Automatic Program Transformation, Universidad de Castilla-La Mancha, Instituto de Investigación en Informática, Albacete Campus.
  • UMSANET, Universidad Mayor de San Andrés, Computer Science degree programme, La Paz, Bolivia, from 1997 until the project became UMSATIC in 2004.

PROFESSIONAL EXPERIENCE

  • 2015 (September to date): Universidad de Castilla-La Mancha, contract for work and services on the project "Una Aproximación Declarativa al Modelado, Análisis y Resolución de Problemas", reference TIN2013-45732-C04-2-P.
  • 2015 (July-September): Ministerio de Salud, Bono Juana Azurduy. Head of Planning and Information Systems.
  • 2014 (January) to 2015 (June): Ministerio de Planificación del Desarrollo, individual line consultancy "Desarrollo de Sistemas", for completing the design and software development of SI-SPIE (Sistema de Planificación Integral del Estado).
  • 2013: Ministerio de Planificación del Desarrollo, individual line consultancy "Programador de Sistemas", development of geographic information systems.
  • 2012: Universidad de Almería (Spain), contract for work and services on the project "Lenguajes Declarativos y Herramientas para Datos Web", reference TIN2008-06622-C03-03.
  • 2011: CAUDAL – Consultores S.R.L., networks and systems development consultant.
  • 2011: Universidad de Castilla-La Mancha (Spain), contract for work and services on the project "ALDDEIA: Aplicaciones de la Lógica Difusa al Desarrollo de Entornos Informáticos Avanzados", funded by MEC, ref. TIN2007-65749.
  • 2009-2010: Network and server administration consultant, Institución de Desarrollo - COSAPI.
  • 2008-2009: Head of the Information Technology Unit, Ministerio de Economía y Finanzas Públicas – República de Bolivia.
  • 2005-2007: Analyst and server network administrator, Ministerio de Hacienda – República de Bolivia.
  • 2004: Systems developer, Ministerio de Hacienda (correspondence system, POAI systems and others).
  • 2004: Technical support, Ministerio de Hacienda.
  • ...