United Nations

gText

The gText global project provides internal and contractual translators at the four duty stations of the Department for General Assembly and Conference Management, regional commissions and other United Nations entities with a complete and uniform suite of Internet-based language tools, as well as seamless access to background information necessary for high-quality translation.

The eLUNa suite of language tools, which was designed to meet the unique needs of United Nations language professionals, is entirely web-based and is continuously improved in response to users' feedback and requests for new functions. The suite comprises a translation and revision interface, an editorial interface and a search application, as well as a series of supporting applications for document and terminology management and a machine-translation tool developed by the World Intellectual Property Organization in collaboration with the Department for General Assembly and Conference Management.

 

eLUNa: the United Nations computer-assisted translation interface

At the core of the tools developed by gText is eLUNa (electronic languages of the United Nations), a user-friendly web-based translation tool developed in-house for United Nations translators. It combines automatic identification of all previously translated sentences and terminology with access to machine translation for all new sentences.

eLUNa reuses previously translated text, automatically recognizes specialized terminology stored in the UNTERM portal, provides references to all United Nations documents in bilingual format through hyperlinks and preserves formatting. As a web-based tool, it can be used by translators working remotely, so contractual staff can also benefit from its time-saving and consistency-enhancing features.
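The reuse of previously translated text described above resembles a translation-memory lookup. The following is a minimal sketch of that general idea only, not eLUNa's actual implementation: the `TRANSLATION_MEMORY` dictionary, the `lookup` function and the 0.8 similarity threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical in-memory translation memory: source segment -> stored translation.
TRANSLATION_MEMORY = {
    "The General Assembly adopted the resolution.": "L'Assemblée générale a adopté la résolution.",
    "The meeting was called to order.": "La séance est ouverte.",
}


def lookup(segment, memory=TRANSLATION_MEMORY, threshold=0.8):
    """Return (kind, translation, score) for the best stored match.

    kind is "exact", "fuzzy" (similarity at or above the threshold),
    or "new" (no usable match; a candidate for machine translation).
    """
    if segment in memory:
        return ("exact", memory[segment], 1.0)
    best_source, best_score = None, 0.0
    for source in memory:
        # Character-level similarity ratio between 0.0 and 1.0.
        score = SequenceMatcher(None, segment, source).ratio()
        if score > best_score:
            best_source, best_score = source, score
    if best_source is not None and best_score >= threshold:
        return ("fuzzy", memory[best_source], best_score)
    return ("new", None, best_score)


if __name__ == "__main__":
    for text in (
        "The General Assembly adopted the resolution.",
        "The General Assembly adopted this resolution.",
        "Budget estimates for 2024.",
    ):
        kind, translation, score = lookup(text)
        print(kind, translation, round(score, 2))
```

In this sketch, a segment with no sufficiently similar stored match is labelled "new", which corresponds to the case where the text above says machine translation is offered.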

 

eLUNa Editorial: interface for editors 

eLUNa Editorial is an editing programme that has been custom-built for editors at the United Nations. Unedited documents are uploaded to the programme and editors are then able to access a range of tools for editing tasks.

Edits made in the editorial interface are automatically transferred as tracked changes to the translation interface, facilitating simultaneous work and parallel processing of United Nations documents.

 

eLUNa VRS: interface for verbatim reporting 

eLUNa VRS is a prototype of a drafting interface developed for United Nations verbatim reporters that incorporates the main features of the eLUNa translation and editorial interfaces and adapts them to the specific needs of drafting verbatim records. Like the rest of the eLUNa suite of language tools, the eLUNa VRS interface provides detection of terminology, symbols and reprise, and full-text search.

 

eLUNa Search: multilingual search engine 

eLUNa Search is a web-based search engine designed to retrieve text from the eLUNa collection of documents translated at the main duty stations and regional commissions of the United Nations system. Search results are presented in monolingual, bilingual or trilingual format as a list of segments in the language of the search and their corresponding translations in the target language or languages selected.
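The result format described above (a segment in the search language alongside its translations in the selected target languages) can be illustrated with a small sketch. This is not the eLUNa Search implementation: the `ALIGNED_SEGMENTS` data, the `search` function and the simple case-insensitive substring match are assumptions chosen to keep the example short.

```python
# Hypothetical collection of aligned segments, keyed by language code.
ALIGNED_SEGMENTS = [
    {
        "en": "Sustainable development requires international cooperation.",
        "fr": "Le développement durable exige une coopération internationale.",
        "es": "El desarrollo sostenible requiere cooperación internacional.",
    },
    {
        "en": "The meeting was adjourned.",
        "fr": "La séance est levée.",
        "es": "Se levanta la sesión.",
    },
]


def search(query, search_lang, target_langs=()):
    """Return matching segments with their translations.

    No target language gives monolingual results, one gives bilingual
    results and two give trilingual results.
    """
    wanted = [search_lang, *target_langs]
    results = []
    for segment in ALIGNED_SEGMENTS:
        # Case-insensitive substring match in the language of the search.
        if query.lower() in segment[search_lang].lower():
            results.append({lang: segment[lang] for lang in wanted})
    return results


if __name__ == "__main__":
    for hit in search("sustainable development", "en", ["fr", "es"]):
        for lang, text in hit.items():
            print(f"{lang}: {text}")
```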

 

eLUNa Converter: Word-to-AKN4UN tool to create machine-readable documents 

eLUNa Converter automatically converts General Assembly resolutions from Microsoft Word format into the AKN4UN format in one click. The converter identifies the main elements of the resolution (such as operative and preambular paragraphs, session number, agenda items, adoption date, etc.) and labels these elements to produce a structurally marked-up, machine-readable document in AKN4UN format. It also retrieves additional information that is not present in the document itself (such as sponsorship information, voting records and related documents), and the converted resolutions and all of the metadata can then be used to create the official books of resolutions.  
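The labelling step described above can be sketched as follows. This is only an illustration under stated assumptions, not the converter's actual logic: it assumes that operative paragraphs are the numbered ones and that everything else is preambular, and the element names `resolution`, `preambular` and `operative` are hypothetical placeholders rather than the real AKN4UN schema.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: an operative paragraph starts with a number followed by a period.
OPERATIVE_PATTERN = re.compile(r"^\s*(\d+)\.\s+(.*)$", re.DOTALL)


def mark_up(paragraphs):
    """Label each paragraph and return an XML tree (hypothetical element names)."""
    root = ET.Element("resolution")
    for text in paragraphs:
        match = OPERATIVE_PATTERN.match(text)
        if match:
            # Numbered paragraph: keep the number as an attribute.
            element = ET.SubElement(root, "operative", {"num": match.group(1)})
            element.text = match.group(2).strip()
        else:
            # Unnumbered paragraph: treated as preambular in this sketch.
            element = ET.SubElement(root, "preambular")
            element.text = text.strip()
    return root


if __name__ == "__main__":
    sample = [
        "Recalling its previous resolutions on multilingualism,",
        "1. Emphasizes the paramount importance of the equality of the six official languages;",
    ]
    print(ET.tostring(mark_up(sample), encoding="unicode"))
```

Once paragraphs carry explicit labels like these, a program can select, for example, only the operative paragraphs without parsing the wording of the document.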

Making United Nations documents machine-readable will open up a wide range of possibilities. Readers will be able not only to search for and access documents based on specific data (such as agenda item, mandate or related documents) but also to see the relationships between documents and have those connections presented visually, in graphs and charts that make them clear and understandable.

To learn more about this project, visit the , which provides access to the resolutions adopted by the General Assembly at the main part of its seventy-fourth session in machine-readable format in the six official languages and to a proof-of-concept  that displays the data contained in the resolutions through a series of graphics and visualizations.

 

UNTERM: United Nations terminology system

UNTERM provides terminology and nomenclature in subjects relevant to the work of the United Nations in the six official languages of the United Nations, as well as in German and Portuguese.

The UNTERM portal features hundreds of thousands of terms from the four main duty stations, regional commissions and the United Nations Educational, Scientific and Cultural Organization, including official country names, phraseology data sets, and a collection of geographical and proper names. The portal also functions as a terminology management system that enables collaborative terminology creation by users through a feedback and queue mechanism.

The UNTERM portal for United Nations terminology can be accessed round the clock by translators, other United Nations staff members, delegates and individual users from around the globe.

 

TAPTA4UN: United Nations statistical machine translation system

Tapta4UN is a statistical machine translation tool developed in collaboration with the World Intellectual Property Organization, specifically trained with United Nations documents to provide an output that is consistent with United Nations style and terminology. First available as a stand-alone tool, it has now been embedded into the eLUNa environment, both as a default option and on a segment-by-segment basis, and is available for use in the six official languages.

 

United Nations Parallel Corpus

The United Nations Parallel Corpus is a collection of official records and other parliamentary documents of the United Nations that are in the public domain and available, for the most part, in the six official languages.

The Corpus was created as part of the United Nations commitment to multilingualism and in response to the growing importance of statistical machine translation within the translation services of the Department for General Assembly and Conference Management, including the United Nations statistical machine translation system, Tapta4UN.

The purpose of the Corpus is to allow access to multilingual language resources and facilitate research and progress in various natural language processing tasks, including machine translation. The Corpus is also available pre-packaged as language-specific bi-texts and as a six-language parallel subset. 

The Corpus has been tested with neural machine translation, a new translation system based on neural networks, and will be extremely valuable for further research in this newly developed and promising field.