BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//Date iCal//NONSGML kigkonsult.se iCalcreator 2.20.4//
METHOD:PUBLISH
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:STANDARD
DTSTART:20191103T020000
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20200308T020000
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:calendar.377771.field_event_date.0@www.wright.edu
DTSTAMP:20260220T000912Z
CREATED:20191113T154109Z
DESCRIPTION:Ph.D. Committee: Drs. Krishnaprasad Thirunarayan\, Advisor\, Valerie L. Shalin (Psychology)\, Keke Chen\, Guozhu Dong\, Srinivasan Pathasarathy (The Ohio State University)\, and Steven Gustafson (noonum Inc.)\n\nABSTRACT\n\nInformation Extraction (IE) techniques are developed to extract entities\, relationships\, and other detailed information from unstructured text. The majority of methods in the literature focus on designing supervised machine learning techniques\, which are not very practical due to the high cost of obtaining annotations and the difficulty of creating a high-quality (in terms of reliability and coverage) gold standard. Therefore\, semi-supervised and distantly supervised techniques have been gaining traction lately as a way to overcome some of these challenges\, for example by bootstrapping the learning more quickly.\n\nThis dissertation focuses on information extraction\, and in particular on entities\, i.e.\, Named Entity Recognition (NER)\, from multiple domains\, including social media and other grammatical texts such as news and medical documents. This work explores ways of lowering the cost of building NER pipelines with the help of available knowledge\, without compromising the quality of extraction\, while also taking into consideration feasibility and other concerns such as the user experience.
  I present distantly supervised (dictionary-based)\, supervised (with reduced cost using entity set expansion and active learning)\, and minimally supervised NER approaches. In addition\, I discuss the various aspects of my knowledge-enabled NER approaches and how and why they are a better fit for today's real-world NER pipelines in dealing with\, and partially overcoming\, the difficulties mentioned above.\n\nI present two dictionary-based NER approaches. The first technique is used for location extraction from text streams and proved very effective for stream processing\, with competitive performance in comparison with ten other techniques. The second is a generic NER approach that scales to multiple domains and is minimally supervised\, with a human-in-the-loop for online feedback. Both techniques augment and filter the dictionaries to compensate for their incompleteness (due to lexical variation between dictionary records and mentions in the text) and to eliminate the noise and spurious content in them. The third technique I present is a supervised approach but with a reduced cost. The cost reduction was achieved with the help of a human-in-the-loop and smart instance samplers implemented using entity set expansion and active learning. The use of knowledge\, keeping tabs on the NER models' accuracy\, and the full exploitation of inputs from the human-in-the-loop were the keys to overcoming the practical\, technical\, and monetary challenges. I make the data and code for the approaches presented in this dissertation publicly available.
DTSTART;TZID=America/New_York:20191119T150000
DTEND;TZID=America/New_York:20191119T170000
LAST-MODIFIED:20191113T160232Z
LOCATION:366 Joshi
SUMMARY:Ph.D. Dissertation Defense: Knowledge-Enabled Entity Extraction By Hussein S. Al-Olimat
URL;TYPE=URI:/events/phd-dissertation-defense-knowledge-enabled-entity-extraction-hussein-s-al-olimat
END:VEVENT
END:VCALENDAR