KDnuggets : News : 2005 : n09 : item18

Academic

From: Bruno Cremilleux
Date: 26 Apr 2005
Subject: Post-doc position at GREYC (U. Caen, France), due May 16

Postdoctoral Position in Computer Science, Linguistics and Natural Language Processing: Using text resources for data mining

Research Unit: Groupe de REcherche en Informatique, Image, Automatique et Instrumentation de Caen (GREYC)
http://www.greyc.unicaen.fr/
Deadline for application : May 16th, 2005

Location: Caen, Normandy, France

This post-doc position is linked to the Bingo project, which brings together three computer science teams (EURISE, EA 3721, Université de St-Etienne; GREYC - CNRS UMR 6072, Université de Caen; and LIRIS - CNRS UMR 5205, INSA de Lyon) and a team of biologists (CGMC - CNRS UMR 5534, Université de Lyon 1).

The Bingo project (Bases de données INductives et GénOmique in French - Genomics and Inductive Databases in English, see http://www.info.unicaen.fr/~bruno/bingo/) focuses on several open problems, one of which is the use of text resources during the pattern post-processing stage, in order to make better use of domain knowledge during the knowledge discovery stage. This problem requires close cooperation between linguistic knowledge and methods from knowledge discovery in databases.

The aim of this post-doc position is to use texts and ontologies to support the knowledge discovery phase (i.e., when post-processing patterns), so that the knowledge presented is relevant to the needs of the experts. Indeed, KDD processes tend to produce a large number of patterns which are, a priori, interesting. Validating the extracted information is a hard task and requires background knowledge of the domain at hand. This background knowledge is partially embedded in the literature. The key idea is to help the validation step by using ontologies (cf. http://www.geneontology.org/) and textual resources (e.g., Medline). For instance, in the context of genomic data, starting from a pattern which may be a synexpression group, the biologist would like to retrieve the texts which deal with this particular topic, find out which biological situations are concerned, and so on. Several work directions are proposed (e.g., text-reader profiling, text analysis, defining constraints derived from text resources), see

http://www.info.unicaen.fr/~bruno/bingo/pages/menu_evenements.php

This post-doctoral position is supported by the CNRS, see also http://www.k-projects.com/cnrs_postdocs_2005/public/departement.php?Dep=INT&IdDpt=12

Sought profile of the candidate

Ph.D. in Computer Science with an interest in linguistics or natural language processing. Significant experience in knowledge discovery in databases or linguistics would be highly appreciated. Speaking French is not required.

Duration of the fellowship (months): 12 (starting from September 1st, 2005)

Gross salary : 25,800 Euro per annum

Deadline for application : May 16th, 2005

Contact:
Bruno Crémilleux +33 2 31 56 74 35 Bruno.Cremilleux@info.unicaen.fr
Nadine Lucas +33 2 31 56 73 36 Nadine.Lucas@info.unicaen.fr

GREYC - CNRS UMR 6072, Université de Caen, Campus Côte de Nacre, F-14032 Caen Cedex - France


Copyright © 2005 KDnuggets.