Transfer Learning Across Language Similarity Networks

Katrin Kirchhoff, Mark Hasegawa-Johnson, Leanne Rolston, and Preethi Jyothi; DARPA LORELEI


Knowledge about language relationships provides a wealth of information that can often substitute for, or complement, linguistic data resources. However, it should be encoded in a formal representation, ideally one that also offers principled methods and algorithms for utilizing this knowledge. Weighted graphs are well suited to this role, for three reasons.

First, graphs and their associated algorithms have been studied extensively in fields such as machine learning, social network theory, and graphical models. Many graph-based algorithms have been analyzed both theoretically (e.g., they have approximation or performance guarantees) and within practical applications, and thus provide a mature starting point.

Second, weighted graphs yield a flexible representation in that weights can be specified manually (when data is lacking), in a data-driven way, or both. Thus, a network of language relationships can accommodate both data-driven information derived from resource-rich languages and information from limited data resources, e.g., linguistic experts' judgments.

Third, graph-based learning approaches have the advantage of being able to utilize information from the entire graph. This means that nodes can influence one another even if they are not directly connected: information may flow indirectly between nodes, via connections through other nodes. A resource-poor language that is not directly connected to a resource-rich language can thus still benefit from the latter if relevant information can be propagated through intermediary languages.

The Language Network: We informally define the LanguageNet as a set of graphs whose nodes represent languages and whose weighted edges represent pairwise relationships between languages. Each language is in fact represented by a set of nodes, one for each level of linguistic description (syntax, morphology, lexicon, etc.). The information associated with a node can be thought of as either parameters (a vector of parameters for the given level) or data resources (a list of pointers to the data resources available for the given level). Edges are defined between pairs of nodes, and edge weights represent parameter or data similarity. Since a LanguageNet is a set of graphs over the same node set, the relationship between two languages may differ from one linguistic level to another.
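To make the definition concrete, the following is a minimal sketch of how a LanguageNet could be stored and how information might flow between languages that share no direct edge. It is an illustration under stated assumptions, not the project's implementation: the class name, the method names, the placeholder languages (lang_a through lang_d), and all edge weights are hypothetical, and the propagation rule (repeated weighted averaging over neighbors with the seed nodes held fixed) is just one simple graph-based scheme chosen for the example.

```python
from collections import defaultdict


class LanguageNet:
    """Hypothetical container: one weighted, undirected graph per linguistic level."""

    def __init__(self):
        # level -> language -> {neighbor language: similarity weight}
        self.graphs = defaultdict(lambda: defaultdict(dict))

    def add_edge(self, level, lang_a, lang_b, weight):
        # Edges are symmetric: similarity holds in both directions.
        self.graphs[level][lang_a][lang_b] = weight
        self.graphs[level][lang_b][lang_a] = weight

    def propagate(self, level, seeds, iterations=50):
        """Spread seed values through one level's graph.

        Each unseeded language is repeatedly set to the weighted average
        of its neighbors' values; seeded languages stay clamped.
        """
        graph = self.graphs[level]
        scores = {lang: seeds.get(lang, 0.0) for lang in graph}
        for _ in range(iterations):
            updated = {}
            for lang, neighbors in graph.items():
                if lang in seeds:
                    updated[lang] = seeds[lang]
                    continue
                total = sum(neighbors.values())
                weighted = sum(w * scores[n] for n, w in neighbors.items())
                updated[lang] = weighted / total if total else 0.0
            scores = updated
        return scores


# Illustrative data only: a chain lang_a - lang_b - lang_c - lang_d.
net = LanguageNet()
net.add_edge("lexicon", "lang_a", "lang_b", 1.0)
net.add_edge("lexicon", "lang_b", "lang_c", 1.0)
net.add_edge("lexicon", "lang_c", "lang_d", 1.0)
# The same language pair can have a different weight at another level.
net.add_edge("syntax", "lang_a", "lang_b", 0.2)

# lang_a and lang_d carry known values; lang_b and lang_c are inferred.
scores = net.propagate("lexicon", seeds={"lang_a": 1.0, "lang_d": 0.0})
print(scores)
```

In this toy chain, lang_c has no edge to lang_a, yet its value settles near 1/3 because information reaches it through lang_b, which settles near 2/3. Keeping a separate graph per level means the lexicon and syntax weights for the same pair of languages can differ, as the definition above allows.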


The tasks being pursued at the University of Illinois are:

