Friday, 3 June 2005

J. Lafferty, A. McCallum, and F. Pereira. Conditional random fields: probabilistic models for segmenting and labeling sequence data. In International Conference on Machine Learning, 2001.

Abstract: We present a framework for building probabilistic models to segment and label sequence data. Conditional random fields offer several advantages over hidden Markov models and stochastic grammars for such tasks, including the ability to relax strong independence assumptions made in those models. Conditional random fields also avoid a fundamental limitation of maximum entropy Markov models (MEMMs) and other discriminative Markov models based on directed graphical models, which can be biased towards states with few successor states. We present iterative parameter estimation algorithms for conditional random fields and compare the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.

Application: Text, Part-of-Speech Tagging

Bibtex: @inproceedings{Lafferty01Conditional,
author = {John Lafferty and Andrew McCallum and Fernando Pereira},
title = {Conditional Random Fields: {P}robabilistic Models for Segmenting and Labeling Sequence Data},
booktitle = {Proc. 18th International Conf. on Machine Learning},
year = {2001},
pages = {282--289},
publisher = {Morgan Kaufmann, San Francisco, CA} }

Comments: The "founding" paper on CRFs. It presents the differences between HMMs, MEMMs, and CRFs. The training algorithm used is IIS (Improved Iterative Scaling). Applied to morpho-syntactic tagging (POS tagging).
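
The linear-chain CRF of this paper defines p(y|x) as a globally normalised product of exponentiated feature scores; the normaliser Z(x) is computed with an HMM-style forward recursion. A minimal sketch in Python (toy per-position emission and label-pair transition scores stand in for the paper's weighted feature sums; all names are illustrative):

```python
import math

def forward_log_partition(emissions, transitions):
    """log Z(x) for a linear-chain CRF via the forward recursion.

    emissions: list of {label: score} dicts, one per position;
    transitions: {(prev_label, label): score}.
    """
    labels = list(emissions[0])
    # alpha[y] = log of the total weight of all prefixes ending in label y
    alpha = {y: emissions[0][y] for y in labels}
    for t in range(1, len(emissions)):
        alpha = {
            y: emissions[t][y] + math.log(sum(
                math.exp(alpha[yp] + transitions[yp, y]) for yp in labels))
            for y in labels
        }
    return math.log(sum(math.exp(v) for v in alpha.values()))
```

With all scores zero, two labels, and two positions there are four equally weighted paths, so log Z(x) = log 4.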

H. Wallach. Efficient training of conditional random fields. Master's thesis, University of Edinburgh, 2002.
Abstract: This thesis explores a number of parameter estimation techniques for conditional random fields, a recently introduced probabilistic model for labelling and segmenting sequential data. Theoretical and practical disadvantages of the training techniques reported in current literature on CRFs are discussed. We hypothesise that general numerical optimisation techniques result in improved performance over iterative scaling algorithms for training CRFs. Experiments run on a subset of a well-known text chunking data set confirm that this is indeed the case. This is a highly promising result, indicating that such parameter estimation techniques make CRFs a practical and efficient choice for labelling sequential data, as well as a theoretically sound and principled probabilistic framework.

Application: Text, Part-of-Speech Tagging

Comments: A very detailed presentation of directed graphical models (HMMs and MEMMs) and undirected ones (CRFs), including a presentation of the label bias problem. Presents two optimisation algorithms for training (first-order: non-linear conjugate gradient; second-order: limited-memory variable metric). Applied to POS tagging.
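
The thesis's point is that the conditional log-likelihood and its gradient (observed minus model-expected feature counts) can be handed to any general-purpose optimiser instead of iterative scaling. A toy sketch of that objective, assuming a single-feature maxent/logistic model trained by plain gradient ascent rather than the conjugate-gradient or limited-memory methods the thesis actually evaluates:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(data, steps=200, lr=0.5):
    """Maximise the conditional log-likelihood of (x, y) pairs, y in {0, 1}."""
    w = 0.0
    for _ in range(steps):
        # gradient = observed feature value - model-expected feature value
        grad = sum(x * (y - sigmoid(w * x)) for x, y in data)
        w += lr * grad
    return w
```

On three identical examples with labels 1, 1, 0, the maximum-likelihood weight satisfies sigmoid(w) = 2/3, i.e. w = log 2.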

F. Sha and F. Pereira. Shallow parsing with conditional random fields. In Proceedings of Human Language Technology, NAACL 2003, 2003.
Abstract: Conditional random fields for sequence labeling offer advantages over both generative models like HMMs and classifiers applied at each sequence position. Among sequence labeling tasks in language processing, shallow parsing has received much attention, with the development of standard evaluation datasets and extensive comparison among methods. We show here how to train a conditional random field to achieve performance as good as any reported base noun-phrase chunking method on the CoNLL task, and better than any reported single model. Improved training methods based on modern optimization algorithms were critical in achieving these results. We present extensive comparisons between models and training methods that confirm and strengthen previous results on shallow parsing and training methods for maximum-entropy models.

Application: Text, Chunking

Bibtex: @inproceedings{Sha03Shallow,
author = "Fei Sha and Fernando Pereira",
title = "Shallow Parsing with Conditional Random Fields",
booktitle = "Proceedings of the 2003 Human Language Technology Conference and North American Chapter of the Association for Computational Linguistics (HLT/NAACL-03)",
year = "2003"}

Comments: Three new optimisation algorithms for training CRFs: the preconditioned conjugate gradient, the voted perceptron, and a limited-memory quasi-Newton method. Presents a large set of features used for training. Applied to a chunking task.

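Of the three trainers, the voted perceptron is the simplest to sketch: on each decoding mistake, the weights move toward the gold sequence's features and away from the predicted sequence's. A hypothetical fragment (the feature names and the Counter representation are illustrative, not the paper's code):

```python
from collections import Counter

def perceptron_update(weights, gold_feats, pred_feats):
    """One additive structured-perceptron update.

    weights, gold_feats, pred_feats: Counter({feature_name: count}).
    """
    for f, c in gold_feats.items():
        weights[f] += c
    for f, c in pred_feats.items():
        weights[f] -= c
    return weights
```

The voted/averaged variant additionally averages the weight vectors produced over all updates, which is what gets compared against the batch optimisers.
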
Yasemin Altun, Mark Johnson and Thomas Hofmann. Investigating Loss Functions and Optimization Methods for Discriminative Learning of Label Sequences. In Proceedings of the 2003 Conference on Empirical Methods in NLP, 2003.
Abstract: Discriminative models have been of interest in the NLP community in recent years. Previous research has shown that they are advantageous over generative models. In this paper, we investigate how different objective functions and optimization methods affect the performance of the classifiers in the discriminative learning framework. We focus on the sequence labelling problem, particularly POS tagging and NER tasks. Our experiments show that changing the objective function is not as effective as changing the features included in the model.

Application: Text, Part-of-Speech Tagging

Bibtex: @inproceedings{Altun03Investigating,
author = "Yasemin Altun and Mark Johnson and Thomas Hofmann",
title = "Investigating Loss Functions and Optimization Methods for Discriminative Learning of Label Sequences",
booktitle = "Proceedings of the 2003 Conference on Empirical Methods in NLP",
year = "2003"}

David Pinto, Andrew McCallum, Xing Wei and W. Bruce Croft. Table extraction using conditional random fields. In SIGIR '03: Proceedings of the 26th Annual International ACM SIGIR Conference, 2003.
Abstract: The ability to find tables and extract information from them is a necessary component of data mining, question answering, and other information retrieval tasks. Documents often contain tables in order to communicate densely packed, multi-dimensional information. Tables do this by employing layout patterns to efficiently indicate fields and records in two-dimensional form. Their rich combination of formatting and content presents difficulties for traditional language modeling techniques, however. This paper presents the use of conditional random fields (CRFs) for table extraction, and compares them with hidden Markov models (HMMs). Unlike HMMs, CRFs support the use of many rich and overlapping layout and language features, and as a result, they perform significantly better. We show experimental results on plain-text government statistical reports in which tables are located with 92% F1, and their constituent lines are classified into 12 table-related categories with 94% accuracy. We also discuss future work on undirected graphical models for segmenting columns, finding cells, and classifying them as data cells or label cells.

Application: HTML, Structure extraction

Bibtex: @inproceedings{Pinto03Table,
author = {David Pinto and Andrew McCallum and Xing Wei and W. Bruce Croft},
title = {Table extraction using conditional random fields},
booktitle = {SIGIR '03: Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval},
year = {2003},
isbn = {1-58113-646-3},
pages = {235--242},
location = {Toronto, Canada},
publisher = {ACM Press},
address = {New York, NY, USA} }

Comments:
An application of CRFs to table extraction in HTML documents. Compares CRFs with HMMs. Uses features dedicated to this task.

Yasemin Altun, Alex J. Smola, Thomas Hofmann. Exponential Families for Conditional Random Fields. In Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI-2004), 2004.
Abstract: In this paper we define conditional random fields in reproducing kernel Hilbert spaces and show connections to Gaussian Process classification. More specifically, we prove decomposition results for undirected graphical models and we give constructions for kernels. Finally we present efficient means of solving the optimization problem using reduced rank decompositions and we show how stationarity can be exploited efficiently in the optimization process.

Bibtex:
@inproceedings{ Altun04Exponential,
author = "Yasemin Altun and Alex J. Smola and Thomas Hofmann",
title = "Exponential Families for Conditional Random Fields",
booktitle = "Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence (UAI-2004)",
year = "2004" }

Comments: Defines CRFs in reproducing kernel Hilbert spaces and connects them to Gaussian process classification.

John Lafferty, Xiaojin Zhu and Yan Liu. Kernel conditional random fields: representation and clique selection. In Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004), 2004.
Abstract: Kernel conditional random fields (KCRFs) are introduced as a framework for discriminative modeling of graph-structured data. A representer theorem for conditional graphical models is given which shows how kernel conditional random fields arise from risk minimization procedures defined using Mercer kernels on labeled graphs. A procedure for greedily selecting cliques in the dual representation is then proposed, which allows sparse representations. By incorporating kernels and implicit feature spaces into conditional graphical models, the framework enables semi-supervised learning algorithms for structured data through the use of graph kernels. The framework and clique selection methods are demonstrated in synthetic data experiments, and are also applied to the problem of protein secondary structure prediction.

Bibtex:
@inproceedings{ Lafferty04Kernel,
author = "John Lafferty and Xiaojin Zhu and Yan Liu",
title = "Kernel conditional random fields: representation and clique selection",
booktitle = "Proceedings of the Twenty-First International Conference on Machine Learning (ICML 2004)",
year = "2004" }

Comments:
Introduces kernel conditional random fields (KCRFs), with a representer theorem and greedy clique selection for sparse representations.

Trausti Kristjannson, Aron Culotta, Paul Viola and Andrew McCallum. Interactive information extraction with constrained conditional random fields. In Nineteenth National Conference on Artificial Intelligence (AAAI 2004), 2004.
Abstract: Information Extraction methods can be used to automatically fill in database forms from unstructured data such as Web documents or email. State-of-the-art methods have achieved low error rates but invariably make a number of errors. The goal of an interactive information extraction system is to assist the user in filling in database fields while giving the user confidence in the integrity of the data. The user is presented with an interactive interface that allows both the rapid verification of automatic field assignments and the correction of errors. In cases where there are multiple errors, our system takes into account user corrections, and immediately propagates these constraints such that other fields are often corrected automatically. Linear-chain conditional random fields (CRFs) have been shown to perform well for information extraction and other language modelling tasks due to their ability to capture arbitrary, overlapping features of the input in a Markov model. We apply this framework with two extensions: a constrained Viterbi decoding which finds the optimal field assignments consistent with the fields explicitly specified or corrected by the user; and a mechanism for estimating the confidence of each extracted field, so that low-confidence extractions can be highlighted. Both of these mechanisms are incorporated in a novel user interface for form filling that is intuitive and speeds the entry of data, providing a 23% reduction in error rate due to automated corrections.

Application: Text, Information extraction

Bibtex:
@inproceedings{ Kristjannson04Interactive,
author = "Trausti Kristjannson and Aron Culotta and Paul Viola and Andrew McCallum",
title = "Interactive information extraction with constrained conditional random fields",
booktitle = "Nineteenth National Conference on Artificial Intelligence (AAAI 2004)",
address = "San Jose, CA",
year = "2004" }

Comments: Uses a model based on a constrained Viterbi algorithm to let the user correct the output of an information-extraction task over form fields.

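Constrained Viterbi decoding is ordinary Viterbi with the search restricted, at user-corrected positions, to the label the user supplied; the correction then propagates to neighbouring fields through the transition scores. A small sketch under toy emission/transition scores (names are illustrative, not the paper's notation):

```python
def constrained_viterbi(emissions, transitions, constraints):
    """Best-scoring label sequence subject to user-fixed labels.

    emissions: list of {label: score} dicts, one per position;
    transitions: {(prev_label, label): score};
    constraints: {position: fixed_label}.
    """
    labels = list(emissions[0])

    def allowed(t):
        return [constraints[t]] if t in constraints else labels

    # best[y] = (score, path) of the best prefix ending in label y
    best = {y: (emissions[0][y], [y]) for y in allowed(0)}
    for t in range(1, len(emissions)):
        new = {}
        for y in allowed(t):
            score, path = max(
                (s + transitions[yp, y], p) for yp, (s, p) in best.items())
            new[y] = (score + emissions[t][y], path + [y])
        best = new
    return max(best.values())[1]
```

With an empty constraint dict this reduces to standard Viterbi decoding.
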
Michelle L. Gregory and Yasemin Altun. Using Conditional Random Fields to Predict Pitch Accents in Conversational Speech. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004), 2004.
Abstract: The detection of prosodic characteristics is an important aspect of both speech synthesis and speech recognition. Correct placement of pitch accents aids in more natural sounding speech, while automatic detection of accents can contribute to better word-level recognition and better textual understanding. In this paper we investigate probabilistic, contextual, and phonological factors that influence pitch accent placement in natural, conversational speech in a sequence labeling setting. We introduce Conditional Random Fields (CRFs) to the pitch accent prediction task in order to incorporate these factors efficiently in a sequence model. We demonstrate the usefulness and the incremental effect of these factors in a sequence model by performing experiments on hand labeled data from the Switchboard Corpus. Our model outperforms the baseline and previous models of pitch accent prediction on the Switchboard Corpus.

Application: Speech

Bibtex:
@inproceedings{ Gregory04Using,
author = "Michelle L. Gregory and Yasemin Altun",
title = "Using Conditional Random Fields to Predict Pitch Accents in Conversational Speech",
booktitle = "Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics (ACL 2004)",
year = "2004" }

Francis R. Bach and Michael I. Jordan. Discriminative training of hidden Markov models for multiple pitch tracking. In Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2005.
Abstract: An algorithm for estimating the fundamental frequency of audio signals is introduced: it models the signal's spectrogram with a factorial hidden Markov model whose parameters are estimated discriminatively from the Keele database (Plante et al., 1995). The algorithms presented make it possible to track several fundamental frequencies and to determine the number of frequencies present at each instant. Simulation results, obtained on mixtures of speech signals and noise, illustrate the robustness of the presented approach.

Application: Speech

Bibtex: @inproceedings{Bach05Discriminative,
author = {Francis R. Bach and Michael I. Jordan},
title = {Discriminative training of hidden Markov models for multiple pitch tracking},
booktitle = {Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP)},
year = {2005}}

Comments:
Presents the difference between generative and discriminative models.

Trevor Cohn and Phil Blunsom. Semantic Role Labelling with Tree Conditional Random Fields. In Proceedings of CoNLL-2005: Ninth Conference on Computational Natural Language Learning, Ann Arbor, Michigan, 2005.

Abstract: In this paper we apply conditional random fields (CRFs) to the semantic role labelling task. We define a random field over the structure of each sentence's syntactic parse tree. For each node of the tree, the model must predict a semantic role label, which is interpreted as the labelling for the corresponding syntactic constituent. We show how modelling the task as a tree labelling problem allows for the use of efficient CRF inference algorithms, while also increasing generalisation performance when compared to the equivalent maximum entropy classifier. We have participated in the CoNLL-2005 shared task closed challenge with full syntactic information.

Application: Text, Semantic annotation

Bibtex: @inproceedings{Cohn05Semantic,
author = {Trevor Cohn and Phil Blunsom},
title = {Semantic Role Labelling with Tree Conditional Random Fields},
booktitle = {Proceedings of CoNLL-2005: Ninth Conference on Computational Natural Language Learning},
year = {2005}}

Sunita Sarawagi and William W. Cohen. Semi-Markov Conditional Random Fields for Information Extraction. In Advances in Neural Information Processing Systems 17 (NIPS 2004), 2005
Abstract: We describe semi-Markov conditional random fields (semi-CRFs), a conditionally trained version of semi-Markov chains. Intuitively, a semi-CRF on an input sequence x outputs a "segmentation" of x, in which labels are assigned to segments (i.e., subsequences) of x rather than to individual elements xi of x. Importantly, features for semi-CRFs can measure properties of segments, and transitions within a segment can be non-Markovian. In spite of this additional power, exact learning and inference algorithms for semi-CRFs are polynomial-time, often only a small constant factor slower than conventional CRFs. In experiments on five named entity recognition problems, semi-CRFs generally outperform conventional CRFs.

Application: Text, Information extraction

Bibtex: @incollection{Sarawagi05Semi,
author = {Sunita {Sarawagi} and William W. {Cohen}},
title = {Semi-Markov Conditional Random Fields for Information Extraction},
booktitle = {Advances in Neural Information Processing Systems 17},
editor = {Lawrence K. Saul and Yair Weiss and {L\'{e}on} Bottou},
publisher = {MIT Press},
address = {Cambridge, MA},
pages = {1185--1192},
year = {2005}}
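The semi-CRF's decoder maximises over segmentations rather than per-position labels: a dynamic program over segment end points whose inner loop ranges over segment lengths up to some bound. A minimal sketch (segment scores only; label-transition scores omitted for brevity; all names are illustrative, not the paper's notation):

```python
def semi_crf_decode(n, max_len, labels, seg_score):
    """Best segmentation of positions 0..n-1 into labelled segments.

    seg_score(i, j, y) scores the span [i, j) with label y as one unit;
    segment-level scoring is what distinguishes semi-CRFs from
    position-level CRFs.
    """
    # best[j] = (score, list of (start, end, label)) for the prefix [0, j)
    best = [(0.0, [])] + [None] * n
    for j in range(1, n + 1):
        candidates = [
            (best[i][0] + seg_score(i, j, y), best[i][1] + [(i, j, y)])
            for i in range(max(0, j - max_len), j)
            for y in labels
        ]
        best[j] = max(candidates)
    return best[n][1]
```

Exact decoding stays polynomial: the loop does O(n * max_len * |labels|) segment evaluations, the "small constant factor" over a conventional CRF mentioned in the abstract.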

By Bruno GRILHERES - Published in: crf-fr