Publications

(2020). Iterative Feature Mining for Constraint-Based Data Collection to Increase Data Diversity and Model Robustness. EMNLP (short).

PDF Abstract Blog Post

(2020). Improving Low Compute Language Modeling with In-Domain Embedding Initialisation. EMNLP (short).

PDF Abstract Code Blog Post Supplementary Material ArXiv

(2020). Compositional Demographic Word Embeddings. EMNLP.

PDF Abstract Blog Post ArXiv

(2020). A Novel Workflow for Accurately and Efficiently Crowdsourcing Predicate Senses and Argument Labels. Findings of EMNLP.

PDF Abstract Blog Post

(2020). Qualification Labour: A Fair Wage Isn't Enough if Workers Need to Do 5,000 Low Paid Tasks to Qualify for Your Task. HComp (Work in Progress).

PDF Abstract

(2020). Inconsistencies in Crowdsourced Slot-Filling Annotations: A Typology and Identification Methods. CoLing.

Abstract

(2020). Exploring the Value of Personalized Word Embeddings. CoLing (short).

Abstract

(2020). Overview of the seventh Dialog System Technology Challenge: DSTC7. CSL.

PDF Abstract Dataset DOI Citations (7)

(2020). NOESIS II: Predicting Responses, Identifying Success, and Managing Complexity in Task-Oriented Dialogue. AAAI Workshop: Dialogue System Technology Challenges.

PDF Abstract Dataset

(2020). Crowdsourced Detection of Emotionally Manipulative Language. CHI.

PDF Abstract

(2020). Analyzing the Surprising Variability in Word Embedding Stability Across Languages. ArXiv.

PDF Abstract

(2019). The Eighth Dialog System Technology Challenge. NeurIPS Workshop: Conversational AI: Today’s Practice and Tomorrow’s Potential.

PDF Abstract Dataset ArXiv Citations (5)

(2019). No-Press Diplomacy: Modeling Multi-Agent Gameplay. NeurIPS.

PDF Abstract Blog Post Supplementary Material ArXiv Citations (3)

(2019). Training Data Voids: Novel Attacks Against NLP Content Moderation. CSCW Workshop: Volunteer Work: Mapping the Future of Moderation Research.

PDF

(2019). An Evaluation for Intent Classification and Out-of-Scope Prediction. EMNLP (short).

PDF Abstract Dataset DOI ArXiv Citations (8)

(2019). DSTC7 Task 1: Noetic End-to-End Response Selection. ACL Workshop: NLP for Conversational AI.

PDF Dataset DOI Citations (3)

(2019). SLATE: A Super-Lightweight Annotation Tool for Experts. ACL (demo).

PDF Abstract Code Poster DOI Citations (2)

(2019). A Large-Scale Corpus for Conversation Disentanglement. ACL.

PDF Abstract Code Dataset Poster DOI Blog Post Supplementary Material ArXiv Citations (27)

(2019). Outlier Detection for Improved Data Quality and Diversity in Dialog Systems. NAACL.

PDF Abstract Dataset DOI ArXiv Citations (3)

(2019). Look Who's Talking: Inferring Speaker Attributes from Personal Longitudinal Dialog. Best Student Paper - CICLing.

PDF Abstract ArXiv Citations (1)

(2019). Learning from Personal Longitudinal Dialog Data. IEEE Intelligent Systems.

PDF Abstract Citations (3)

(2019). DSTC7 Task 1: Noetic End-to-End Response Selection. AAAI Workshop: Dialogue System Technology Challenges.

PDF Abstract Dataset Citations (10)

(2018). Dialog System Technology Challenge 7. NeurIPS Workshop: Conversational AI: Today’s Practice and Tomorrow’s Potential.

PDF Abstract Dataset ArXiv Citations (23)

(2018). Improving Text-to-SQL Evaluation Methodology. ACL.

PDF Abstract Code Dataset Poster DOI ArXiv Citations (42)

(2018). Factors Influencing the Surprising Instability of Word Embeddings. NAACL.

PDF Abstract DOI ArXiv Citations (35)

(2018). Effective Crowdsourcing for a New Type of Summarization Task. NAACL (short).

PDF Abstract DOI Citations (6)

(2018). Data Collection for a Production Dialogue System: A Startup Perspective. NAACL (industry).

PDF Abstract Video DOI Citations (11)

(2018). World Knowledge for Abstract Meaning Representation Parsing. LREC.

PDF Abstract Citations (1)

(2017). Identifying Products in Online Cybercrime Marketplaces: A Dataset for Fine-grained Domain Adaptation. EMNLP.

PDF Abstract Code DOI Supplementary Material ArXiv Citations (10)

(2017). Understanding Task Design Trade-offs in Crowdsourced Paraphrase Collection. ACL (short).

PDF Abstract Dataset Video DOI PDF Slides ArXiv Citations (18)

(2017). Tools for Automated Analysis of Cybercriminal Markets. WWW.

PDF Abstract Code Citations (27)

(2017). Parsing with Traces: An O($n^4$) Algorithm and a Structural Representation. TACL.

PDF Abstract Code Video DOI Interview ArXiv Citations (9)

(2016). Algorithms for Identifying Syntactic Errors and Parsing with Graph Structured Output. EECS Department, University of California, Berkeley.

PDF Abstract

(2015). An Empirical Analysis of Optimization for Max-Margin NLP. EMNLP (short).

PDF Abstract Code Poster DOI Citations (10)

(2013). Error-Driven Analysis of Challenges in Coreference Resolution. EMNLP.

PDF Abstract Code Slides PDF Slides Citations (31)

(2013). An Empirical Examination of Challenges in Chinese Parsing. ACL (short).

PDF Abstract Code Slides PDF Slides Citations (14)

(2013). High-velocity Clouds in the Galactic All Sky Survey. I. Catalog. The Astrophysical Journal Supplement Series.

PDF Abstract ArXiv Citations (3)

(2012). Robust Conversion of CCG Derivations to Phrase Structure Trees. ACL (short).

PDF Abstract Code Slides PDF Slides Citations (2)

(2012). Parser Showdown at the Wall Street Corral: An Empirical Investigation of Error Types in Parser Output. EMNLP.

PDF Abstract Code Slides PDF Slides Citations (52)

(2011). Mention Detection: Heuristics for the OntoNotes annotations. CoNLL Shared Task.

PDF Abstract Poster Citations (17)

(2010). Spatiotemporal Hierarchy of Relaxation Events, Dynamical Heterogeneities, and Structural Reorganization in a Supercooled Liquid. Physical Review Letters.

PDF Abstract DOI ArXiv Citations (44)

(2010). Morphological Analysis Can Improve a CCG Parser for English. CoLing.

PDF Abstract Citations (3)

(2010). Faster Parsing by Supertagger Adaptation. ACL.

PDF Abstract Code PDF Slides Citations (12)

(2009). Faster parsing and supertagging model estimation. ALTA.

PDF Abstract PDF Slides

(2009). Large-Scale Syntactic Processing: Parsing the Web. Johns Hopkins University.

PDF Abstract Citations (9)

(2009). Adaptive Supertagging for Faster Parsing. The University of Sydney.

PDF Abstract Poster PDF Slides

(2008). Classification of Verb Particle Constructions with the Google Web1T Corpus. ALTA.

PDF Abstract Poster Citations (8)

(2008). The densest packing of AB binary hard-sphere homogeneous compounds across all size ratios. The Journal of Physical Chemistry B.

PDF Abstract Citations (26)