PyCon Notes: Natural Language Processing
by john on Feb.20, 2010, under PyCon
Nitin Madnani
Python is well suited to NLP due to Unicode support, C/C++ extensibility, etc.
NLTK comes with its own corpora, lots of tools, and WordNet integration. It also has its own O’Reilly book.
Dumbo provides Python bindings for Hadoop Streaming. Hadoop Streaming lets you use any executable or script as a mapper or reducer.
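To make the mapper/reducer idea concrete, here is a rough word-count sketch in plain Python. It does not use Dumbo's API; the function names mapper and reducer are my own, and the sort in between stands in for the shuffle step that Hadoop performs between the two stages. Records are tab-separated key/value lines, which is what Hadoop Streaming uses by default.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per whitespace-separated token."""
    for line in lines:
        for word in line.lower().split():
            yield f"{word}\t1"


def reducer(sorted_records):
    """Sum the counts per word. Input must already be sorted by key."""
    parsed = (record.rstrip("\n").split("\t", 1) for record in sorted_records)
    for word, group in groupby(parsed, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


def run_local(lines):
    """Simulate map -> sort -> reduce in one process (no Hadoop involved)."""
    return list(reducer(sorted(mapper(lines))))
```

On a real cluster each function would presumably live in its own script that reads sys.stdin and prints its records, with Hadoop handling the sort in between.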
The word association example is trivially parallelized using Hadoop on EC2.
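I didn't note which association measure the talk used, so the sketch below is my own assumption: it scores word pairs by pointwise mutual information (PMI) over sentence-level co-occurrence. The part that parallelizes is the counting, since each sentence can be counted independently (map) and the partial counts summed afterwards (reduce). All function names here are made up for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def count_sentence(sentence):
    """Map step: count distinct words and unordered word pairs in one sentence."""
    words = sorted(set(sentence.lower().split()))
    return Counter(words), Counter(combinations(words, 2))


def merge(partials):
    """Reduce step: sum the per-sentence counters."""
    word_counts, pair_counts = Counter(), Counter()
    for words, pairs in partials:
        word_counts.update(words)
        pair_counts.update(pairs)
    return word_counts, pair_counts


def pmi(pair, word_counts, pair_counts, n_sentences):
    """PMI in bits. Assumes the pair was observed at least once."""
    a, b = pair
    p_ab = pair_counts[pair] / n_sentences
    p_a = word_counts[a] / n_sentences
    p_b = word_counts[b] / n_sentences
    return math.log2(p_ab / (p_a * p_b))
```

Pairs are stored as alphabetically sorted tuples, so look up ("milk", "tea") rather than ("tea", "milk").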