Explore various approaches to organize and extract useful text from unstructured data using Java
About This Book
* Understand data hidden in text using the power of Java and natural language processing
* Find data and patterns, and gain interesting insights from language using this easy-to-follow book
* Get all the information you need to get up and running with natural language processing using this example-rich guide
Who This Book Is For
This book appeals to data analysts and data science professionals who are looking to extract information from language using Java. Previous experience with Java and statistics is expected.
What You Will Learn
* Develop a deep understanding of the basic NLP tasks and how they relate to each other
* Discover and use the available tokenization engines
* Implement techniques for end-of-sentence detection
* Apply search techniques to find people and things within a document
* Construct solutions to identify parts of speech within sentences
* Use parsers to extract relationships between elements of a document
* Identify topics in a set of documents
* Integrate basic tasks to tackle more complex NLP problems
In Detail
Natural language processing makes it possible to take any sentence and identify patterns, special names, company names, and similar elements. This book teaches you how to perform language analysis with the help of Java libraries and how to gain insights from that analysis. You will start by understanding how natural language processing works and the concepts behind it. You will then learn about important Java tools and libraries for NLP. After this, you will dive directly into performing natural language processing on different inputs. You will learn about tokenization, finding entities, model training, parts of speech, parse trees, and more. You will also learn about machine learning and corpus-based methods and algorithms, as well as statistical machine translation, summarization, dialog systems, complex searches, supervised and unsupervised NLP, and more.
- ISBN13 9781787288072
- Publish Date 30 March 2018
- Publish Status Active
- Publish Country GB
- Imprint Packt Publishing Limited
- Edition 2nd Revised edition
- Format Paperback
- Pages 407
- Language English