Terminology

9 11 2009

Here are some of the terms related to Documentation Centres.

  • Authenticity. Assurance that certain digital materials are genuine and trustworthy, that is, that they are what they are claimed to be, whether as an original object or as a faithful and reliable copy of an original made through fully documented processes.
  • Certification. The process of assessing the degree to which a preservation programme complies with a set of previously agreed minimum standards or practices.
  • Digital preservation. Actions intended to keep digital objects accessible over the long term.
  • Integrity of digital objects. The state of objects that are complete and have not suffered any unauthorized or undocumented corruption or alteration.
  • Identity of digital objects. The characteristic that distinguishes a digital object from all others, including other versions or copies of the same content.
  • Rights. Legal entitlements or powers held or exercised with respect to digital materials, such as copyright, privacy, confidentiality and national or corporate restrictions imposed for security reasons.
  • Verification. The act of checking whether a digital object, in a given file format, is complete and complies with the format specification.
  • Ingest. The operation of storing digital objects, and their related documentation, in a secure and orderly manner.
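
One common way to check integrity in practice is to record a checksum when an object is ingested and compare it later. The sketch below is only an illustration under that assumption; the function names and the sample data are hypothetical, and SHA-256 is just one possible choice of hash.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 digest of the given bytes as a hex string."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """True if the data still matches the digest recorded at ingest time."""
    return sha256_of(data) == recorded_digest


# Hypothetical digital object and the digest stored when it was ingested.
original = b"Annual report 2009"
recorded = sha256_of(original)

print(is_intact(original, recorded))               # unchanged copy
print(is_intact(b"Annual report 2010", recorded))  # altered copy
```

A matching digest suggests the object has not been altered since ingest; it says nothing about whether the file complies with its format specification, which is a separate verification step.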




Traditional documentation vs. digital documentation

2 11 2009

Like many other things, documentation has changed considerably thanks to technological progress. It is therefore vital to have a clear idea of the differences between the two kinds of documentation, and of what each is useful for.

Traditional documentation is any meaningful unit of information that has been recorded in a way that allows it to be stored and later retrieved, as well as any medium that makes it possible both to multiply consultation of the recorded information without limit and to defer it indefinitely in time. Books, newspapers, letters, invoices and the like are all traditional documentation.

Digital documentation, on the other hand, is any independent meaningful unit of information recorded on a floppy disk, CD-ROM or hard disk. Information is said to be online when it can be accessed from remote terminals or computers over local area networks, wide area networks or a combination of both. It is recorded on an electronic medium using encodings based on combinations of positive electrical signals (represented by the digit “1”) and negative ones (represented by the digit “0”).
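
As a small illustration of this encoding in ones and zeros, the following sketch turns a short text into its bit pattern and back. It assumes UTF-8 as the character encoding, and the helper names are hypothetical.

```python
def to_bits(text: str) -> str:
    """Encode text as UTF-8 and show each byte as eight binary digits."""
    return " ".join(format(byte, "08b") for byte in text.encode("utf-8"))


def from_bits(bits: str) -> str:
    """Rebuild the text from a space-separated string of 8-bit groups."""
    return bytes(int(chunk, 2) for chunk in bits.split()).decode("utf-8")


bits = to_bits("Hi")
print(bits)             # 01001000 01101001
print(from_bits(bits))  # Hi
```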

In addition, we should keep in mind the three properties of the digital medium:

  • Computability: the information can be processed or “computed” by a computer.
  • Virtuality: digital information is not subject to the limitations inherent in analogue information.
  • Capacity: there are no practical limits on the volume of information that can be accessed online through unified interfaces.

Each of these two types of documentation has certain advantages:

  • Digital documentation:

–  It lets the user search the contents, add comments, and modify or add content.

–  It offers multimedia information (text, sound and image).

–  The information can be retrieved.

–  The amount of information per unit of volume is vastly greater.

–  It gives access to titles.

–  It makes it possible to follow links from a word or to “jump” from one place in the document to another.

–  It is easy to publish and just as easy to withdraw from circulation.

  • Traditional documentation:

–  Comfort

–  Practicality

Given all this, we can say that traditional documentation is at a slight disadvantage compared with digital documentation. As technology advances, traditional documentation will most likely end up disappearing.






Social Bookmarking

18 10 2009

Nowadays, many Internet users experiment with sites such as Delicious (formerly del.icio.us), CiteULike, Backflip, etc. You may have heard of the concept of social bookmarking, but what does it mean?

As Wikipedia says, “Social Bookmarking is a method for Internet users to share, organize, search, and manage bookmarks of web resources. Unlike file sharing, the resources themselves aren’t shared, merely bookmarks that reference them. Descriptions may be added to these bookmarks in the form of metadata, so that other users may understand the content of the resource without first needing to download it for themselves. Such descriptions may be free text comments, votes in favor of or against its quality, or tags that collectively or collaboratively become a folksonomy. Folksonomy is also called social tagging, “the process by which many users add metadata in the form of keywords to shared content””.
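
To make the idea of tags that collectively become a folksonomy more concrete, here is a minimal sketch. The users, URLs and tags are invented for illustration, and the function names are hypothetical; real services store far more metadata.

```python
from collections import Counter, defaultdict

# Hypothetical bookmarks: (user, url, tags chosen by that user).
bookmarks = [
    ("ana", "http://example.org/mt", {"translation", "nlp"}),
    ("ben", "http://example.org/mt", {"translation", "mt"}),
    ("ana", "http://example.org/corpus", {"nlp", "corpus"}),
]


def build_folksonomy(entries):
    """Count, for every URL, how many users applied each tag."""
    tags_by_url = defaultdict(Counter)
    for _user, url, tags in entries:
        tags_by_url[url].update(tags)
    return tags_by_url


def urls_tagged(entries, tag):
    """Return the sorted list of URLs that at least one user tagged with `tag`."""
    return sorted({url for _user, url, tags in entries if tag in tags})


folksonomy = build_folksonomy(bookmarks)
print(folksonomy["http://example.org/mt"]["translation"])  # 2
print(urls_tagged(bookmarks, "nlp"))
```

Only the bookmarks and their tags are stored and shared, not the resources themselves, which is the point the definition makes.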

The concept of shared online bookmarks dates back to April 1996 with the launch of itList, the features of which included public and private bookmarks. Within the next three years, online bookmark services became competitive, with venture-backed companies such as those I have already mentioned.

Each user can see what others have chosen and add their own description to a bookmark, or vote for or against it. Like many other things, social bookmarking has its advantages and disadvantages.

Advantages:

– We can keep our favourite references online instead of saving them on a single computer.

– We can keep an eye on themes we are interested in.

– Using simple “tags” is more convenient than keeping bookmarks in folders.

– We can follow the links that other users add.

Disadvantages:

– There is no pre-established system of keywords or categories.

– Users can create ‘tags’ so personal that they mean little to others.


Sources:

– Bookmark (web). (2009, October 8). In Wikipedia, The Free Encyclopedia. Retrieved 18:41, October 18, 2009, from http://en.wikipedia.org/w/index.php?title=Bookmark_(web)&oldid=318755541

– Social bookmarking. (2009, October 16). In Wikipedia, The Free Encyclopedia. Retrieved 18:40, October 18, 2009, from http://en.wikipedia.org/w/index.php?title=Social_bookmarking&oldid=320232562

– ITList Information Technology Blog. Retrieved 18:41, October 18, 2009, from http://www.itlist.com/

– Social Bookmarking – Compartiendo enlaces de Internet. Retrieved 18:50, October 18, 2009, from http://eibar.org/blogak/prospektiba/es/archive/2005/02/20/178





Definition of 4 specialized terms (Q.3) 3rd theme

26 05 2008

In the following article, we are going to define four specialized terms:

  • Machine translation.
  • Machine-aided translation.
  • Multilingual content management.
  • Translation technology.

The first one is machine translation. Sometimes referred to by the abbreviation MT, it is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. At its basic level, MT performs simple substitution of words in one natural language for words in another. Using corpus techniques, more complex translations may be attempted, allowing for better handling of differences in linguistic typology, phrase recognition, and translation of idioms, as well as the isolation of anomalies.
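
The “simple substitution of words” mentioned above can be sketched in a few lines. This is a toy illustration only: the Spanish to English glossary is invented, the function name is hypothetical, and no real MT system works from four dictionary entries.

```python
# Hypothetical toy glossary (Spanish -> English), for illustration only.
GLOSSARY = {"el": "the", "gato": "cat", "negro": "black", "duerme": "sleeps"}


def translate_word_for_word(sentence: str) -> str:
    """Replace each word with its glossary entry; unknown words pass through."""
    words = sentence.lower().split()
    return " ".join(GLOSSARY.get(word, word) for word in words)


print(translate_word_for_word("El gato negro duerme"))  # the cat black sleeps
```

The output keeps the Spanish word order (“cat black”), which shows why plain substitution is not enough and why more complex techniques are needed to handle differences between languages.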

The second term is machine-aided translation, also known as computer-assisted translation, computer-aided translation or CAT. It is a form of translation in which a human translator translates texts using computer software designed to support and facilitate the translation process.

The third term, multilingual content management, concerns the administration of multilingual websites. According to Danny Stofer, it involves the following issues:

  • Translation
  • Localization
  • Culture
  • Feedback
  • Design
  • Workflow
  • Non-Latin character sets

Finally, the last term is translation technology: technology whose main activity is the interpretation of the meaning of a text and the subsequent production of an equivalent text, also called a translation, that communicates the same message in another language. The text to be translated is called the source text, and the language it is to be translated into is called the target language; the final product is sometimes called the “target text”.






Characteristics of a translation task (FEMTI report) (Q.3) 1st theme

4 05 2008

According to FEMTI, the Framework for the Evaluation of Machine Translation, the three main characteristics of a translation task are assimilation, dissemination and communication. In the following lines I will briefly explain each of them.

  • Assimilation: The ultimate purpose of the assimilation task (of which translation forms a part) is to monitor a (relatively) large volume of texts produced by people outside the organization, in (usually) several languages.
  • Dissemination: The ultimate purpose of dissemination is to deliver to others a translation of documents produced inside the organization.
  • Communication: The ultimate purpose of the communication task is to support multi-turn dialogues between people who speak different languages. The translation quality must be high enough for painless conversation, despite possible syntactically ill-formed input and idiosyncratic word and format usage.

SOURCES:

* FEMTI – a Framework for the Evaluation of Machine Translation in ISLE. Retrieved 16:23, May 2, from http://www.issco.unige.ch:8080/cocoon/femti/st-home.html





Explanation of three Research Topics (Q.2) 2nd theme

22 04 2008

In this article, I am going to explain three research topics that I have chosen from among several others.

The first topic I am going to talk about is “NECA”, or “Net Environment for Embodied Emotional Conversational Agents”, one of the earlier projects of the Austrian Research Institute for Artificial Intelligence (ÖFAI).

In the NECA project, the focus is on the design of credible agent-agent interaction patterns to be observed by human users. To achieve a high level of credibility, the agents must be able to express themselves using a combination of verbal and non-verbal output driven by personality and emotion models.

Moreover, the NECA project will develop a new generation of mixed multi-user / multi-agent virtual spaces populated by affective conversational agents. The agents will be able to express themselves through synchronised emotional speech and non-verbal behaviour, generated from an abstract representation which can be the output of an affective reasoner. This is the first time that such expressive capabilities are featured in Internet applications. The agents’ usefulness will be evaluated in two concrete application scenarios. From a technical point of view, the emerging NECA platform will provide a confederation of dedicated components including an affective reasoner, co-ordinated generation, and emotional speech synthesis, thus providing a basis for the development of new Internet applications with emotional agents.

The next research topic I am going to explain is “HUMAINE” or “Human-machine Interaction Network on Emotions”.

HUMAINE aims to lay the foundations for European development of systems that can register, model and/or influence human emotional and emotion-related states and processes – ‘emotion-oriented systems’. Such systems may be central to future interfaces, but their conceptual underpinnings are not sufficiently advanced to be sure of their real potential or the best way to develop them.

One of the reasons is that the relevant knowledge is dispersed across many disciplines. HUMAINE brings together leading experts from the key disciplines in a programme designed to achieve intellectual integration. It identifies six thematic areas that cut across traditional groupings and offer a framework for an appropriate division of labour: theory of emotion; signal/sign interfaces; the structure of emotionally coloured interactions; emotion in cognition and action; emotion in communication and persuasion; and usability of emotion-oriented systems. Teams linked to each area will run a workshop in it and carry out joint research to define an exemplar embodying guiding principles for future work in their area.

Finally, the last research topic which I will focus on is Corpus Linguistics:

Corpus Linguistics is the study of linguistic phenomena through large collections of machine-readable texts: corpora. These are used within a number of research areas, ranging from the descriptive study of the syntax of a language to prosody or language learning, to mention but a few. An overview of some of the areas where corpora have been used can be found on the Research areas page.

Furthermore, the use of real examples of texts in the study of language is not a new issue in the history of linguistics. However, Corpus Linguistics has developed considerably in the last decades due to the great possibilities offered by the processing of natural language with computers. The availability of computers and machine-readable text has made it possible to get data quickly and easily and also to have this data presented in a format suitable for analysis.
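
As a minimal sketch of how a computer can get data from machine-readable text quickly, the following example counts word frequencies in a tiny invented corpus. The two sentences and the function name are hypothetical, and the tokenisation (lower-cased runs of letters and apostrophes) is a deliberate simplification.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Count how often each lower-cased word occurs across all texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"[a-z']+", text.lower()))
    return counts


# Hypothetical two-sentence corpus, for illustration only.
corpus = ["The cat sat on the mat.", "The dog sat too."]
freq = word_frequencies(corpus)
print(freq.most_common(2))  # [('the', 3), ('sat', 2)]
```

A real corpus contains millions of words, but the principle is the same: the counts are produced automatically and can then be presented in a format suitable for analysis.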

REFERENCES:

* “NECA” or “Net Environment for Embodied Emotional Conversational Agents”. Retrieved 18:34, 21st April 2008, from http://www.dfki.de/pas/f2w.cgi?ltc/neca-e

* “NECA” or “Net Environment for Embodied Emotional Conversational Agents”. Retrieved 19:02, 21st April 2008, from http://www.ofai.at/~brigitte.krenn/papers/web3d_krenn_paper.pdf

* “HUMAINE” or “Human-machine Interaction Network on Emotions”. Retrieved 17:14, 18th April 2008, from http://www.dfki.de/pas/f2w.cgi?ltp/humaine-e

* Corpus Linguistics. Retrieved 17:22, 18th April 2008, from http://www.essex.ac.uk/linguistics/clmt/w3c/corpus_ling/content/introduction3.html





Research Topics (Q.2) 1st theme

16 04 2008

In this article I will point out some research topics that are mentioned on various Human Language Technologies sites.

Firstly, members of The Stanford NLP Group pursue research in a broad variety of topics:

  • Computational Semantics.
  • Parsing & Tagging.
  • Multilingual NLP.
  • Unsupervised Induction of Linguistic Structure.

Secondly, the Edinburgh Language Technology Group in Scotland, UK, conducts research and development in a number of areas. Some of its projects are:

  • Combining Shallow Semantics and Domain Knowledge.
  • Text Mining for Biomedical Content Curation.
  • Cross-retail Multi-agent Retail Comparison.
  • Smart Qualitative Data: Methods and Community Tools for Data Mark-Up.
  • Machine Learning for Named Entity Recognition.
  • Named entity tagging of historical parliamentary proceedings.
  • Integrated Models and Tools for Fine-Grained Prosody in Discourse.
  • Joint Action Science and Technology.
  • AMI consortium projects that are developing technologies for meeting browsing and to assist people participating in meetings from a remote location.
  • Study of how pairs collaborate when planning a route on a map.

Finally, we can mention the German Language Technology Lab, whose themes are elaborated in research, development and commercial projects:

  • Exploiting – and automatically extending – ontologies for content processing.
  • Tighter integration of shallow and deep techniques in processing.
  • Enriching deep processing with statistical methods.
  • Combining language checking with structuring tools in document authoring.
  • Document indexing for German and English.
  • Automatically associating recognized information with related information and thus building up collective knowledge.
  • Automatically structuring and visualizing extracted information.
  • Processing information encoded in multiple languages, among them Chinese and Japanese.
