%0 Journal Article
%T PARADOCS: A Language Independant Go-Between for Mating Parallel Documents PARADOCS : l'entremetteur de documents parallèles indépendant de la langue
%A Alexandre Patry
%A Philippe Langlais
%J Traitement Automatique des Langues
%D 2011
%I Association pour le Traitement Automatique des Langues (ATALA)
%X Parallel corpora are the bread and butter of a number of machine translation technologies. Therefore, important efforts are regularly spent in acquiring new ones. This task often involves a rather cumbersome manual inspection and it is rather difficult to set up a strategy that fits all the needs. We thus developed PARADOCS, a system aiming at doing this automatically. Our solution exploits numerical entities in documents in order to pair them. A classifier trained to recognize parallel text coupled to an information retrieval engine controlling the search space of candidate pairs are the main components of our approach. We tested PARADOCS on a number of tasks involving numerous pairs of languages and report good results.
%K Parallel corpora
%K Information Retrieval
%K Machine Translation
%U http://www.atala.org/IMG/pdf/3-Patry-Langlais-TAL51-2.pdf