Terrence D. Szymanski

About Me

I am currently a data scientist at ANZ in Melbourne, Australia. I develop analytics tools and software to facilitate institutional banking at ANZ, focusing on text analytics of financial and business news.

I was previously a postdoctoral researcher at the Insight Centre for Data Analytics at University College Dublin, where I worked on industry-driven research on text data mining, with a particular focus on social media and online news.

I received my PhD in Linguistics from the University of Michigan, where I studied computational linguistics and researched grammar induction from parallel text with a focus on resource-poor languages.

Interested parties may download my resume.

Publications

A. Kutuzov, L. Øvrelid, T. Szymanski, and E. Velldal. 2018. Diachronic word embeddings and semantic shifts: a survey. In Proceedings of the 27th International Conference on Computational Linguistics (COLING 2018). [ pdf | bibtex ]
@inproceedings{kutuzov-etal-2018-diachronic,
  title = "Diachronic word embeddings and semantic shifts: a survey",
  author = "Kutuzov, Andrey and {\O}vrelid, Lilja and Szymanski, Terrence and Velldal, Erik",
  booktitle = "Proceedings of the 27th International Conference on Computational Linguistics",
  month = aug,
  year = "2018",
  address = "Santa Fe, New Mexico, USA",
  publisher = "Association for Computational Linguistics",
  url = "https://www.aclweb.org/anthology/C18-1117",
  pages = "1384--1397",
  abstract = "Recent years have witnessed a surge of publications aimed at tracing temporal changes in lexical semantics using distributional methods, particularly prediction-based word embedding models. However, this vein of research lacks the cohesion, common terminology and shared practices of more established areas of natural language processing. In this paper, we survey the current state of academic research related to diachronic word embeddings and semantic shifts detection. We start with discussing the notion of semantic shifts, and then continue with an overview of the existing methods for tracing such time-related shifts with word embedding models. We propose several axes along which these methods can be compared, and outline the main challenges before this emerging subfield of NLP, as well as prospects and possible applications.",
}
T. Szymanski. 2017. Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (ACL 2017). [ pdf | poster | github | bibtex ]
@inproceedings{Szymanski:2017,
   Author = {Terrence Szymanski},
   Title = {Temporal Word Analogies: Identifying Lexical Replacement with Diachronic Word Embeddings},
   Booktitle = {Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)},
   Year = {2017} }
Y. Gurin, T. Szymanski & M. T. Keane. 2017. Discovering News Events That Move Markets. In Proceedings of IEEE 2017 Intelligent Systems Conference (IntelliSys). [ link | bibtex ]
@inproceedings{Gurin:2017,
   Author = {Yuri Gurin and Terrence Szymanski and Mark T. Keane},
   Title = {Discovering News Events That Move Markets},
   Booktitle = {Proceedings of the 2017 SAI Intelligent Systems Conference (IntelliSys)},
   Year = {2017} }
T. Szymanski, C. Orellana-Rodriguez, and M. T. Keane. 2016. Helping news editors write better headlines: A recommender to improve the keyword contents & shareability of news headlines. In Natural Language Processing meets Journalism Proceedings of the Workshop (NLPMJ-2016), pages 30–34. [ pdf | slides | bibtex ]
@inproceedings{Szymanski:2016,
  Author = {Terrence Szymanski and Claudia Orellana-Rodriguez and Mark T. Keane},
  Booktitle = {Natural Language Processing meets Journalism Proceedings of the Workshop ({NLPMJ}-2016)},
  Title = {Helping News Editors Write Better Headlines: A Recommender to Improve the Keyword Contents \& Shareability of News Headlines},
  Pages = {30--34},
  Year = {2016}
}
T. Szymanski and G. Lynch. 2015. UCD: Diachronic text classification with character, word, and syntactic n-grams. In Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015), pages 879–883. [ link | slides | pdf | bibtex ]
@inproceedings{Szymanski:2015,
  Author = {Terrence Szymanski and Gerard Lynch},
  Title = {{UCD}: Diachronic Text Classification with Character, Word, and Syntactic N-grams},
  Booktitle = {Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)},
  Pages = {879--883},
  Year = {2015}
}
T. Szymanski. 2013. Automatic extraction of linguistic data from digitized documents. In Proceedings of the Berkeley Linguistics Society, 39. [ slides | pdf | bibtex ]
@article{Szymanski:2013,
  Author = {Terrence Szymanski},
  Title = {Automatic Extraction of Linguistic Data from Digitized Documents},
  Journal = {Proceedings of the Berkeley Linguistics Society},
  Volume = {39},
  Year = {2013}
}
T. Szymanski. 2012. Morphological Inference from Bitext for Resource-Poor Languages. PhD thesis, University of Michigan. [ pdf | bibtex ]
@phdthesis{Szymanski:2012,
  Author = {Terrence Szymanski},
  School = {University of Michigan},
  Title = {Morphological Inference from Bitext for Resource-Poor Languages},
  Year = {2012}
}
E. Keshet, T. Szymanski, and S. Tyndall. 2011. Ballgame: A corpus for computational semantics. Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011). [ link | pdf | bibtex ]
@inproceedings{Keshet:2011,
  Author = {Ezra Keshet and Terrence Szymanski and Stephen Tyndall},
  Booktitle = {Proceedings of the Ninth International Conference on Computational Semantics (IWCS 2011)},
  Title = {BALLGAME: A Corpus for Computational Semantics},
  Year = {2011}
}

Selected Talks and Presentations

9 December, 2016. Automatic Detection of Lexical Replacement in a 20-year News Corpus. Australian Linguistics Society Annual Conference. Monash University, Melbourne, Australia. [ slides ]
3 November, 2015. Summarization. Guest Lecture for Text Analytics for Big Data, UCD CSI. [ slides ]
23 March, 2013. Language Identification in Bilingual Documents for Linguistic Data Extraction. Penn Linguistics Colloquium 37. [ slides ]
15 October, 2010. Computational Methods and Ancient "Scripts". University of Michigan HistLing Discussion Group. [ handout ]
5 December, 2008. Probabilistic Comparative Linguistic Reconstruction. University of Michigan Linguistics Student Colloquium. [ slides ]

Elsewhere

You can find me all over the internet:

What is affrication?

Affrication is a process of sound change that turns stops into affricates. (I say "Tuesday" /tuzdeɪ/ with a /t/; maybe you say "Tuesday" /tʃuzdeɪ/ with a /tʃ/.) Affrication is also my domain name and Twitter handle.