Changelog
0.16.0 (2020-04-26)
Deprecations:
- TextBlob.translate() and TextBlob.detect_language are deprecated. Use the official Google Translate API instead (#215).
Other changes:
- Backwards-incompatible: Drop support for Python 3.4.
- Test against Python 3.7 and Python 3.8.
- Pin NLTK to nltk<3.5 on Python 2 (#315).
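A deprecation such as the one listed for 0.16.0 is commonly surfaced through Python's standard warnings module. The following is a minimal sketch of that pattern; legacy_translate is a hypothetical stand-in and is not TextBlob's actual code.

```python
import warnings


def legacy_translate(text, to="en"):
    """Hypothetical deprecated function; illustrative only."""
    warnings.warn(
        "legacy_translate() is deprecated; use the official Google Translate API instead.",
        DeprecationWarning,
        stacklevel=2,  # attribute the warning to the caller, not to this function
    )
    return text  # placeholder: a real method would return translated text


# Record warnings so they can be inspected instead of printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    legacy_translate("Hola")

deprecation_seen = any(issubclass(w.category, DeprecationWarning) for w in caught)
```

Callers who want to find remaining uses of a deprecated call can run their test suite with python -W error::DeprecationWarning, which turns these warnings into exceptions.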
0.15.3 (2019-02-24)
Bug fixes:
- Fix bug when Word string type after pos_tags is not a str (#255). Thanks @roman-y-korolev for the patch.
0.15.2 (2018-11-21)
Bug fixes:
0.15.1 (2018-01-20)
Bug fixes:
0.15.0 (2017-12-02)
Features:
- Add TextBlob.sentiment_assessments property which exposes pattern’s sentiment assessments (#170). Thanks @jeffakolb.
0.14.0 (2017-11-20)
Features:
0.13.1 (2017-11-11)
Bug fixes:
- Avoid AttributeError when using pattern’s sentiment analyzer (#178). Thanks @tylerjharden for the catch and patch.
- Correctly pass format argument to NLTKClassifier.accuracy (#177). Thanks @pavelmalai for the catch and patch.
0.13.0 (2017-08-15)
Features:
0.12.0 (2017-02-27)
Features:
Bug fixes:
Changes:
- Backwards-incompatible: Remove Python 2.6 and 3.3 support.
0.11.1 (2016-02-17)
Bug fixes:
- Fix translation and language detection (#115, #117, #119). Thanks @AdrianLC and @jschnurr for the fix. Thanks @AdrianLC, @edgaralts, and @pouya-cognitiv for reporting.
0.11.0 (2015-11-01)
Changes:
- Compatible with nltk>=3.1. NLTK versions < 3.1 are no longer supported.
- Change default tagger to NLTKTagger (uses NLTK’s averaged perceptron tagger).
- Tested on Python 3.5.
Bug fixes:
- Fix singularization of a number of words. Thanks @jonmcoe.
- Fix spelling correction when nltk>=3.1 is installed (#99). Thanks @shubham12101 for reporting.
0.10.0 (2015-10-04)
Changes:
Bug fixes:
- Translator.translate will detect language of input text by default (#85). Thanks again @jschnurr.
- Fix matching of tagged phrases with CFG in ConllExtractor. Thanks @lragnarsson.
- Fix inflection of a few irregular English nouns. Thanks @jonmcoe.
0.9.1 (2015-06-10)
Bug fixes:
0.9.0 (2014-09-15)
- TextBlob now depends on NLTK 3. The vendorized version of NLTK has been removed.
- Fix bug that raised a SyntaxError when translating text with non-ascii characters on Python 3.
- Fix bug that showed “double-escaped” unicode characters in translator output (issue #56). Thanks Evan Dempsey.
- Backwards-incompatible: Completely remove import text.blob. You should import textblob instead.
- Backwards-incompatible: Completely remove PerceptronTagger. Install textblob-aptagger instead.
- Backwards-incompatible: Rename TextBlobException to TextBlobError and MissingCorpusException to MissingCorpusError.
- Backwards-incompatible: Format classes are passed a file object rather than a file path.
- Backwards-incompatible: If training a classifier with data from a file, you must pass a file object (rather than a file path).
- Updated English sentiment corpus.
- Add feature_extractor parameter to NaiveBayesAnalyzer.
- Add textblob.formats.get_registry() and textblob.formats.register() which allows users to register custom data source formats.
- Change BaseClassifier.detect from a staticmethod to a classmethod.
- Improved docs.
- Tested on Python 3.4.
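Two of the 0.9.0 changes above (format classes receiving a file object, and detect becoming a classmethod) can be illustrated with a small stdlib-only sketch. DelimitedData and TabData are hypothetical names invented for this example; they are not TextBlob's format classes.

```python
import csv
import io


class DelimitedData:
    """Hypothetical format class; illustrative only, not TextBlob's code."""

    delimiter = ","

    def __init__(self, fp):
        # fp is an open file object (any iterable of lines), not a file path.
        reader = csv.reader(fp, delimiter=self.delimiter)
        self.rows = [tuple(row) for row in reader]

    @classmethod
    def detect(cls, text):
        # A classmethod can read cls.delimiter, so subclasses reuse this
        # check unchanged; a staticmethod has no access to the class.
        lines = [line for line in text.splitlines() if line.strip()]
        return bool(lines) and all(cls.delimiter in line for line in lines)


class TabData(DelimitedData):
    delimiter = "\t"


sample = "I love this library,pos\nI dislike this bug,neg\n"
data = DelimitedData(io.StringIO(sample))  # a file object, not a file path
```

With real files the same constructor would be called as DelimitedData(fp) inside a with open(path, newline="") as fp: block, which keeps opening and closing the file in the caller's hands.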
0.8.4 (2014-02-02)
- Fix display (__repr__) of WordList slices on Python 3.
- Add download_corpora module. Corpora must now be downloaded using python -m textblob.download_corpora.
0.8.3 (2013-12-29)
- Sentiment analyzers return namedtuples, e.g. Sentiment(polarity=0.12, subjectivity=0.34).
- Memory usage improvements to NaiveBayesAnalyzer and basic_extractor (default feature extractor for classifiers module).
- Add textblob.tokenizers.sent_tokenize and textblob.tokenizers.word_tokenize convenience functions.
- Add textblob.classifiers.MaxEntClassifier.
- Improved NLTKTagger.
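The namedtuple return type mentioned for 0.8.3 can be reproduced with the standard library alone. This sketch only mirrors the field names shown in the entry; it does not call TextBlob.

```python
from collections import namedtuple

# Same field names as the example in the 0.8.3 entry.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])

result = Sentiment(polarity=0.12, subjectivity=0.34)

# Fields are readable by name, by index, or by tuple unpacking.
polarity, subjectivity = result
```

Because a namedtuple is still a tuple, code written against a plain (polarity, subjectivity) pair should keep working while new code can use result.polarity.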
0.8.2 (2013-12-21)
- Fix bug in spelling correction that stripped some punctuation (Issue #48).
- Various improvements to spelling correction: preserves whitespace characters (Issue #12); handle contractions and punctuation between words. Thanks @davidnk.
- Make TextBlob.words more memory-efficient.
- Translator now sends POST instead of GET requests. This allows for larger bodies of text to be translated (Issue #49).
- Update pattern tagger for better accuracy.
0.8.1 (2013-11-16)
- Fix bug that caused ValueError upon sentence tokenization. This removes modifications made to the NLTK sentence tokenizer.
- Add Word.lemmatize() method that allows passing in a part-of-speech argument.
- Word.lemma returns correct part of speech for Word objects that have their pos attribute set. Thanks @RomanYankovsky.
0.8.0 (2013-10-23)
- Backwards-incompatible: Renamed package to textblob. This avoids clashes with other namespaces called text. TextBlob should now be imported with from textblob import TextBlob.
- Update pattern resources for improved parser accuracy.
- Update NLTK.
- Allow Translator to connect to proxy server.
- PerceptronTagger completely deprecated. Install the textblob-aptagger extension instead.
0.7.1 (2013-09-30)
- Bugfix updates.
- Fix bug in feature extraction for NaiveBayesClassifier.
- basic_extractor is now case-sensitive, e.g. contains(I) != contains(i)
- Fix repr output when a TextBlob contains non-ascii characters.
- Fix part-of-speech tagging with PatternTagger on Windows.
- Suppress warning about not having scikit-learn installed.
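The case-sensitivity note in 0.7.1 (contains(I) != contains(i)) is easier to see with a toy feature extractor. The function below is a hypothetical sketch written for this example; it is not basic_extractor itself.

```python
def contains_features(document, vocabulary):
    """Hypothetical case-sensitive 'contains(word)' feature extractor."""
    tokens = set(document.split())
    # No lowercasing: "I" and "i" are treated as different features.
    return {"contains({0})".format(word): (word in tokens) for word in vocabulary}


features = contains_features("I like Python", ["I", "i", "Python"])
```

Here features["contains(I)"] is True while features["contains(i)"] is False, so a classifier trained on such features can treat the two spellings differently.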
0.7.0 (2013-09-25)
- Wordnet integration. Word objects have synsets and definitions properties. The text.wordnet module allows you to create Synset and Lemma objects directly.
- Move all English-specific code to its own module, text.en.
- Basic extensions framework in place. TextBlob has been refactored to make it easier to develop extensions.
- Add text.classifiers.PositiveNaiveBayesClassifier.
- Update NLTK.
- NLTKTagger now working on Python 3.
- Fix __str__ behavior. print(blob) should now print non-ascii text correctly in both Python 2 and 3.
- Backwards-incompatible: All abstract base classes have been moved to the text.base module.
- Backwards-incompatible: PerceptronTagger will now be maintained as an extension, textblob-aptagger. Instantiating a text.taggers.PerceptronTagger() will raise a DeprecationWarning.
0.6.3 (2013-09-15)
- Word tokenization fix: Words that stem from a contraction will still have an apostrophe, e.g. "Let's" => ["Let", "'s"].
- Fix bug with comparing blobs to strings.
- Add text.taggers.PerceptronTagger, a fast and accurate POS tagger. Thanks @syllog1sm.
- Note for Python 3 users: You may need to update your corpora, since NLTK master has reorganized its corpus system. Just run curl https://raw.github.com/sloria/TextBlob/master/download_corpora.py | python again.
- Add download_corpora_lite.py script for getting the minimum corpora requirements for TextBlob’s basic features.
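The contraction behavior described for 0.6.3 ("Let's" => ["Let", "'s"]) can be approximated with a single regular expression. This is a deliberately simplified sketch, not the tokenizer TextBlob uses; for example, it splits "don't" as "don" and "'t", which Treebank-style tokenizers handle differently.

```python
import re

# Alternatives, tried in order: a run of word characters, an apostrophe
# followed by word characters (the contraction tail), or a single
# punctuation character.
TOKEN_PATTERN = re.compile(r"\w+|'\w+|[^\w\s]")


def tokenize(text):
    """Split text into tokens, keeping the apostrophe on contraction tails."""
    return TOKEN_PATTERN.findall(text)


tokens = tokenize("Let's go.")  # ['Let', "'s", 'go', '.']
```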
0.6.2 (2013-09-05)
- Fix bug that resulted in a UnicodeEncodeError when tagging text with non-ascii characters.
- Add DecisionTreeClassifier.
- Add labels() and train() methods to classifiers.
0.6.1 (2013-09-01)
- Classifiers can be trained and tested on CSV, JSON, or TSV data.
- Add basic WordNet lemmatization via the Word.lemma property.
- WordList.pluralize() and WordList.singularize() methods return WordList objects.
0.6.0 (2013-08-25)
- Add Naive Bayes classification. New text.classifiers module, TextBlob.classify(), and Sentence.classify() methods.
- Add parsing functionality via the TextBlob.parse() method. The text.parsers module currently has one implementation (PatternParser).
- Add spelling correction. This includes the TextBlob.correct() and Word.spellcheck() methods.
- Update NLTK.
- Backwards incompatible: clean_html has been deprecated, just as it has in NLTK. Use Beautiful Soup’s soup.get_text() method for HTML-cleaning instead.
- Slight API change to language translation: if from_lang isn’t specified, attempts to detect the language.
- Add itokenize() method to tokenizers that returns a generator instead of a list of tokens.
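The list-versus-generator distinction behind itokenize() in 0.6.0 can be sketched as follows. SimpleWordTokenizer is a hypothetical class written for this example and makes no claim about how TextBlob's tokenizers are implemented.

```python
import re
import types


class SimpleWordTokenizer:
    """Hypothetical tokenizer showing an eager and a lazy interface."""

    pattern = re.compile(r"\w+|[^\w\s]")

    def itokenize(self, text):
        # Lazily yield one token at a time; nothing is stored in a list.
        for match in self.pattern.finditer(text):
            yield match.group()

    def tokenize(self, text):
        # Eager variant, built on top of the lazy one.
        return list(self.itokenize(text))


tokenizer = SimpleWordTokenizer()
all_tokens = tokenizer.tokenize("Hello, world!")
token_stream = tokenizer.itokenize("Hello, world!")
is_generator = isinstance(token_stream, types.GeneratorType)
```

The generator form is useful when only the first few tokens are needed or when the text is large, since tokens are produced on demand.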
0.5.3 (2013-08-21)
- Unicode fixes: This fixes a bug that sometimes raised a UnicodeEncodeError upon accessing sentences for TextBlobs with non-ascii characters.
- Update NLTK.
0.5.2 (2013-08-14)
- Important patch update for NLTK users: Fix bug with importing TextBlob if local NLTK is installed.
- Fix bug with computing start and end indices of sentences.
0.5.1 (2013-08-13)
- Fix bug that disallowed display of non-ascii characters in the Python REPL.
- Backwards incompatible: Restore blob.json property for backwards compatibility with textblob<=0.3.10. Add a to_json() method that takes the same arguments as json.dumps.
- Add WordList.append and WordList.extend methods that append Word objects.
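The pairing of a json property with a to_json() method that accepts json.dumps arguments, as described for 0.5.1, can be sketched like this. SerializableBlob and its "raw" key are hypothetical choices for the example, not TextBlob's actual serialization format.

```python
import json


class SerializableBlob:
    """Hypothetical container; illustrative only, not TextBlob's code."""

    def __init__(self, sentences):
        self.sentences = list(sentences)

    def to_json(self, *args, **kwargs):
        # Forward all arguments to json.dumps (indent, sort_keys, ...).
        payload = [{"raw": sentence} for sentence in self.sentences]
        return json.dumps(payload, *args, **kwargs)

    @property
    def json(self):
        # Property kept for callers that expect attribute access.
        return self.to_json()


blob = SerializableBlob(["Hi."])
compact = blob.json
pretty = blob.to_json(indent=2)
```

Keeping the property as a thin wrapper over the method lets old attribute-style callers and new callers that need formatting options share one code path.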
0.5.0 (2013-08-10)
- Language translation and detection API!
- Add text.sentiments module. Contains the PatternAnalyzer (default implementation) as well as a NaiveBayesAnalyzer.
- Part-of-speech tags can be accessed via TextBlob.tags or TextBlob.pos_tags.
- Add polarity and subjectivity helper properties.
0.4.0 (2013-08-05)
- New text.tokenizers module with WordTokenizer and SentenceTokenizer. Tokenizer instances (from either textblob itself or NLTK) can be passed to TextBlob’s constructor. Tokens are accessed through the new tokens property.
- New Blobber class for creating TextBlobs that share the same tagger, tokenizer, and np_extractor.
- Add ngrams method.
- Backwards-incompatible: TextBlob.json() is now a method, not a property. This allows you to pass arguments (the same that you would pass to json.dumps()).
- New home for documentation: https://textblob.readthedocs.io/
- Add parameter for cleaning HTML markup from text.
- Minor improvement to word tokenization.
- Updated NLTK.
- Fix bug with adding blobs to bytestrings.
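For readers unfamiliar with n-grams, the idea behind the ngrams method added in 0.4.0 is a sliding window over the word list. The standalone function below is a minimal sketch of that technique, assuming a default window of three words; it is not TextBlob's implementation and returns plain lists.

```python
def ngrams(words, n=3):
    """Return every run of n consecutive words as a list."""
    if n <= 0:
        return []
    # range() is empty when there are fewer than n words.
    return [words[i:i + n] for i in range(len(words) - n + 1)]


grams = ngrams("Now is better than never".split(), n=3)
```

For the five-word input above, grams holds three windows: Now/is/better, is/better/than, and better/than/never.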
0.3.10 (2013-08-02)
- Bundled NLTK no longer overrides local installation.
- Fix sentiment analysis of text with non-ascii characters.
0.3.9 (2013-07-31)
- Updated nltk.
- ConllExtractor is now Python 3-compatible.
- Improved sentiment analysis.
- Blobs are equal (with ==) to their string counterparts.
- Added instructions to install textblob without nltk bundled.
- Dropping official 3.1 and 3.2 support.
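Equality between a text wrapper and a plain string, as introduced in 0.3.9, is usually implemented through __eq__. The class below is a hypothetical minimal wrapper that shows one way to do it; it is not TextBlob's code.

```python
class Blob:
    """Hypothetical minimal text wrapper; illustrative only."""

    def __init__(self, text):
        self.raw = text

    def __eq__(self, other):
        if isinstance(other, str):
            return self.raw == other
        if isinstance(other, Blob):
            return self.raw == other.raw
        return NotImplemented  # let Python try the other operand

    def __hash__(self):
        # Defining __eq__ disables the default hash; restore one that
        # agrees with equality to plain strings.
        return hash(self.raw)


same_as_string = Blob("hello") == "hello"
string_on_left = "hello" == Blob("hello")
```

The comparison also works with the string on the left because str.__eq__ returns NotImplemented for unknown types, after which Python falls back to Blob.__eq__.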
0.3.8 (2013-07-30)
- Importing TextBlob is now much faster. This is because the noun phrase parsers are trained only on the first call to noun_phrases (instead of training them every time you import TextBlob).
- Add text.taggers module which allows user to change which POS tagger implementation to use. Currently supports PatternTagger and NLTKTagger (NLTKTagger only works with Python 2).
- NPExtractor and Tagger objects can be passed to TextBlob’s constructor.
- Fix bug with POS-tagger not tagging one-letter words.
- Rename text/np_extractor.py -> text/np_extractors.py
- Add run_tests.py script.
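The import-time speedup in 0.3.8 comes from deferring training until the first call. That lazy-initialization pattern can be sketched as below; LazyExtractor and its capitalized-word rule are hypothetical and stand in for a real, expensive training step.

```python
class LazyExtractor:
    """Hypothetical extractor that defers its expensive setup."""

    def __init__(self):
        self._trained = False
        self.train_calls = 0  # counter kept only to make the behavior visible

    def train(self):
        # Stand-in for an expensive training step.
        self.train_calls += 1
        self._trained = True

    def extract(self, text):
        if not self._trained:
            self.train()  # runs once, on first use rather than at import
        # Toy rule, for illustration only: keep capitalized words.
        return [word for word in text.split() if word[:1].isupper()]


extractor = LazyExtractor()
calls_before_use = extractor.train_calls
first = extractor.extract("Python is fun")
second = extractor.extract("Guido wrote Python")
```

Constructing the object costs nothing; the training cost is paid by the first extract() call and skipped on every later one.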
0.3.7 (2013-07-28)
- Every word in a Blob or Sentence is a Word instance which has methods for inflection, e.g. word.pluralize() and word.singularize().
- Updated the np_extractor module. Now has a new implementation, ConllExtractor, that uses the Conll2000 chunking corpus. Only works on Py2.
