Changelog
0.19.0 (2025-01-13)
Bug fixes:
Fix textblob.download_corpora script (#474). Thanks @cagan-elden for reporting.
Changes:
Remove vendorized unicodecsv module, as it’s no longer used.
Support Python 3.9-3.13 and nltk>=3.9 (#486). Thanks @johnfraney for the PR.
0.18.0 (2024-02-15)
Bug fixes:
Remove usage of deprecated cElementTree (#339). Thanks @tirkarthi for reporting and for the PR.
Address SyntaxWarning on Python 3.12 (#418). Thanks @smontanaro for the PR.
Removals:
TextBlob.translate(), TextBlob.detect_language, and textblob.translate are removed. Use the official Google Translate API instead (#215).
Remove textblob.compat.
Support:
Support Python 3.8-3.12. Older versions are no longer supported.
Support nltk>=3.8.
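For code that still imports the module named in the cElementTree fix above, the following is a minimal sketch of the replacement import. xml.etree.cElementTree was deprecated in Python 3.3 and removed in Python 3.9. The sample XML string and the variable names are illustrative assumptions and are not taken from TextBlob's data files or source.

```python
# xml.etree.cElementTree no longer exists on Python 3.9+; the plain
# ElementTree module picks up the C accelerator automatically when available.
import xml.etree.ElementTree as ET

# Illustrative XML only (an assumption, not TextBlob's actual lexicon format).
SAMPLE = (
    "<words>"
    "<word form='good' polarity='0.7'/>"
    "<word form='bad' polarity='-0.7'/>"
    "</words>"
)

root = ET.fromstring(SAMPLE)

# Map each word form to its polarity value.
polarities = {w.get("form"): float(w.get("polarity")) for w in root.iter("word")}
print(polarities)
```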
0.17.1 (2021-10-21)
Bug fixes:
0.17.0 (2021-02-17)
Features:
Performance improvement: Use chain.from_iterable in _text.py to improve runtime and memory usage (#333). Thanks @cool-RR for the PR.
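As a rough illustration of why chain.from_iterable helps, the sketch below flattens a nested list two ways. This is not the code from _text.py; the nested word lists are made up for the example.

```python
from itertools import chain

# Hypothetical nested data: one list of tokens per sentence.
nested = [["the", "quick"], ["brown"], ["fox", "jumps"]]

# Eager flattening: repeated list concatenation builds a new
# intermediate list at every step.
flat_eager = sum(nested, [])

# Lazy flattening: chain.from_iterable yields items one at a time
# without creating intermediate lists.
flat_lazy = list(chain.from_iterable(nested))

print(flat_lazy)
```

Both approaches produce the same sequence; the lazy version avoids the intermediate copies, which matters more as the input grows.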
Other changes:
0.16.0 (2020-04-26)
Deprecations:
TextBlob.translate() and TextBlob.detect_language are deprecated. Use the official Google Translate API instead (#215).
Other changes:
Backwards-incompatible: Drop support for Python 3.4.
Test against Python 3.7 and Python 3.8.
Pin NLTK to nltk<3.5 on Python 2 (#315).
0.15.3 (2019-02-24)
Bug fixes:
Fix bug when Word string type after pos_tags is not a str (#255). Thanks @roman-y-korolev for the patch.
0.15.2 (2018-11-21)
Bug fixes:
0.15.1 (2018-01-20)
Bug fixes:
0.15.0 (2017-12-02)
Features:
Add TextBlob.sentiment_assessments property which exposes pattern’s sentiment assessments (#170). Thanks @jeffakolb.
0.14.0 (2017-11-20)
Features:
0.13.1 (2017-11-11)
Bug fixes:
Avoid AttributeError when using pattern’s sentiment analyzer (#178). Thanks @tylerjharden for the catch and patch.
Correctly pass format argument to NLTKClassifier.accuracy (#177). Thanks @pavelmalai for the catch and patch.
0.13.0 (2017-08-15)
Features:
0.12.0 (2017-02-27)
Features:
Bug fixes:
Changes:
Backwards-incompatible: Remove Python 2.6 and 3.3 support.
0.11.1 (2016-02-17)
Bug fixes:
Fix translation and language detection (#115, #117, #119). Thanks @AdrianLC and @jschnurr for the fix. Thanks @AdrianLC, @edgaralts, and @pouya-cognitiv for reporting.
0.11.0 (2015-11-01)
Changes:
Compatible with nltk>=3.1. NLTK versions < 3.1 are no longer supported.
Change default tagger to NLTKTagger (uses NLTK’s averaged perceptron tagger).
Tested on Python 3.5.
Bug fixes:
Fix singularization of a number of words. Thanks @jonmcoe.
Fix spelling correction when nltk>=3.1 is installed (#99). Thanks @shubham12101 for reporting.
0.10.0 (2015-10-04)
Changes:
Bug fixes:
Translator.translate will detect language of input text by default (#85). Thanks again @jschnurr.
Fix matching of tagged phrases with CFG in ConllExtractor. Thanks @lragnarsson.
Fix inflection of a few irregular English nouns. Thanks @jonmcoe.
0.9.1 (2015-06-10)
Bug fixes:
0.9.0 (2014-09-15)
TextBlob now depends on NLTK 3. The vendorized version of NLTK has been removed.
Fix bug that raised a SyntaxError when translating text with non-ascii characters on Python 3.
Fix bug that showed “double-escaped” unicode characters in translator output (issue #56). Thanks Evan Dempsey.
Backwards-incompatible: Completely remove import text.blob. You should import textblob instead.
Backwards-incompatible: Completely remove PerceptronTagger. Install textblob-aptagger instead.
Backwards-incompatible: Rename TextBlobException to TextBlobError and MissingCorpusException to MissingCorpusError.
Backwards-incompatible: Format classes are passed a file object rather than a file path.
Backwards-incompatible: If training a classifier with data from a file, you must pass a file object (rather than a file path).
Updated English sentiment corpus.
Add feature_extractor parameter to NaiveBayesAnalyzer.
Add textblob.formats.get_registry() and textblob.formats.register(), which allow users to register custom data source formats.
Change BaseClassifier.detect from a staticmethod to a classmethod.
Improved docs.
Tested on Python 3.4.
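Two of the 0.9.0 changes above, passing file objects instead of paths and making detect a classmethod, can be sketched with the standard library alone. DelimitedFormat and TabFormat below are hypothetical classes written for this illustration; they are not TextBlob's format classes, and their method names are assumptions.

```python
import csv
import io


class DelimitedFormat:
    """Hypothetical format class that reads rows from an open file object."""

    delimiter = ","

    def __init__(self, fp):
        # fp is an open file object (any iterable of lines), not a path.
        reader = csv.reader(fp, delimiter=self.delimiter)
        self.rows = [tuple(row) for row in reader]

    @classmethod
    def detect(cls, sample):
        # As a classmethod, detect sees the class it was called on,
        # so subclasses only need to override the delimiter attribute.
        return cls.delimiter in sample


class TabFormat(DelimitedFormat):
    delimiter = "\t"


# io.StringIO stands in for a file opened with open(path, newline="").
fp = io.StringIO("I love this,pos\nI hate this,neg\n")
fmt = DelimitedFormat(fp)
print(fmt.rows)
```

Accepting a file object leaves opening, encoding, and closing to the caller, which also makes in-memory streams usable in tests.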
0.8.4 (2014-02-02)
Fix display (__repr__) of WordList slices on Python 3.
Add download_corpora module. Corpora must now be downloaded using python -m textblob.download_corpora.
0.8.3 (2013-12-29)
Sentiment analyzers return namedtuples, e.g. Sentiment(polarity=0.12, subjectivity=0.34).
Memory usage improvements to NaiveBayesAnalyzer and basic_extractor (default feature extractor for classifiers module).
Add textblob.tokenizers.sent_tokenize and textblob.tokenizers.word_tokenize convenience functions.
Add textblob.classifiers.MaxEntClassifier.
Improved NLTKTagger.
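The namedtuple return value mentioned in 0.8.3 can be illustrated with a stand-in built from the standard library. The Sentiment type below is recreated locally for the example, using the field names shown in the changelog entry; it is not imported from TextBlob.

```python
from collections import namedtuple

# Local stand-in with the field names shown in the changelog entry.
Sentiment = namedtuple("Sentiment", ["polarity", "subjectivity"])

result = Sentiment(polarity=0.12, subjectivity=0.34)

# Fields are available by name, by index, and through tuple unpacking.
polarity, subjectivity = result
print(result.polarity, result[1])
```

Because a namedtuple is still a tuple, code that unpacked a plain two-element tuple keeps working while new code can use the named fields.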
0.8.2 (2013-12-21)
Fix bug in spelling correction that stripped some punctuation (Issue #48).
Various improvements to spelling correction: preserve whitespace characters (Issue #12); handle contractions and punctuation between words. Thanks @davidnk.
Make TextBlob.words more memory-efficient.
Translator now sends POST instead of GET requests. This allows for larger bodies of text to be translated (Issue #49).
Update pattern tagger for better accuracy.
0.8.1 (2013-11-16)
Fix bug that caused ValueError upon sentence tokenization. This removes modifications made to the NLTK sentence tokenizer.
Add Word.lemmatize() method that allows passing in a part-of-speech argument.
Word.lemma returns correct part of speech for Word objects that have their pos attribute set. Thanks @RomanYankovsky.
0.8.0 (2013-10-23)
Backwards-incompatible: Renamed package to textblob. This avoids clashes with other namespaces called text. TextBlob should now be imported with from textblob import TextBlob.
Update pattern resources for improved parser accuracy.
Update NLTK.
Allow Translator to connect to proxy server.
PerceptronTagger completely deprecated. Install the textblob-aptagger extension instead.
0.7.1 (2013-09-30)
Bugfix updates.
Fix bug in feature extraction for NaiveBayesClassifier.
basic_extractor is now case-sensitive, e.g. contains(I) != contains(i).
Fix repr output when a TextBlob contains non-ascii characters.
Fix part-of-speech tagging with PatternTagger on Windows.
Suppress warning about not having scikit-learn installed.
0.7.0 (2013-09-25)
Wordnet integration. Word objects have synsets and definitions properties. The text.wordnet module allows you to create Synset and Lemma objects directly.
Move all English-specific code to its own module, text.en.
Basic extensions framework in place. TextBlob has been refactored to make it easier to develop extensions.
Add text.classifiers.PositiveNaiveBayesClassifier.
Update NLTK.
NLTKTagger now working on Python 3.
Fix __str__ behavior. print(blob) should now print non-ascii text correctly in both Python 2 and 3.
Backwards-incompatible: All abstract base classes have been moved to the text.base module.
Backwards-incompatible: PerceptronTagger will now be maintained as an extension, textblob-aptagger. Instantiating a text.taggers.PerceptronTagger() will raise a DeprecationWarning.
0.6.3 (2013-09-15)
Word tokenization fix: Words that stem from a contraction will still have an apostrophe, e.g. "Let's" => ["Let", "'s"].
Fix bug with comparing blobs to strings.
Add text.taggers.PerceptronTagger, a fast and accurate POS tagger. Thanks @syllog1sm.
Note for Python 3 users: You may need to update your corpora, since NLTK master has reorganized its corpus system. Just run curl https://raw.github.com/sloria/TextBlob/master/download_corpora.py | python again.
Add download_corpora_lite.py script for getting the minimum corpora requirements for TextBlob’s basic features.
0.6.2 (2013-09-05)
Fix bug that resulted in a UnicodeEncodeError when tagging text with non-ascii characters.
Add DecisionTreeClassifier.
Add labels() and train() methods to classifiers.
0.6.1 (2013-09-01)
Classifiers can be trained and tested on CSV, JSON, or TSV data.
Add basic WordNet lemmatization via the Word.lemma property.
WordList.pluralize() and WordList.singularize() methods return WordList objects.
0.6.0 (2013-08-25)
Add Naive Bayes classification. New text.classifiers module, TextBlob.classify(), and Sentence.classify() methods.
Add parsing functionality via the TextBlob.parse() method. The text.parsers module currently has one implementation (PatternParser).
Add spelling correction. This includes the TextBlob.correct() and Word.spellcheck() methods.
Update NLTK.
Backwards incompatible: clean_html has been deprecated, just as it has in NLTK. Use Beautiful Soup’s soup.get_text() method for HTML-cleaning instead.
Slight API change to language translation: if from_lang isn’t specified, attempts to detect the language.
Add itokenize() method to tokenizers that returns a generator instead of a list of tokens.
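The difference between a list-returning tokenize() and a generator-returning itokenize(), as added in 0.6.0, can be sketched as follows. RegexTokenizer and its regular expression are hypothetical and exist only for this example; TextBlob's own tokenizers are implemented differently.

```python
import re


class RegexTokenizer:
    """Hypothetical tokenizer illustrating the list/generator split."""

    # Words, or single punctuation characters (an illustrative pattern).
    pattern = re.compile(r"\w+|[^\w\s]")

    def itokenize(self, text):
        # Generator: tokens are produced one at a time, on demand.
        for match in self.pattern.finditer(text):
            yield match.group()

    def tokenize(self, text):
        # List version: simply materializes the generator.
        return list(self.itokenize(text))


tokenizer = RegexTokenizer()
tokens = tokenizer.tokenize("Hello, world!")
first = next(tokenizer.itokenize("Hello, world!"))
print(tokens, first)
```

A generator is useful when only the first few tokens are needed or when the text is too large to hold every token in memory at once.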
0.5.3 (2013-08-21)
Unicode fixes: This fixes a bug that sometimes raised a UnicodeEncodeError upon accessing sentences for TextBlobs with non-ascii characters.
Update NLTK.
0.5.2 (2013-08-14)
Important patch update for NLTK users: Fix bug with importing TextBlob if local NLTK is installed.
Fix bug with computing start and end indices of sentences.
0.5.1 (2013-08-13)
Fix bug that disallowed display of non-ascii characters in the Python REPL.
Backwards incompatible: Restore blob.json property for backwards compatibility with textblob<=0.3.10. Add a to_json() method that takes the same arguments as json.dumps.
Add WordList.append and WordList.extend methods that append Word objects.
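A to_json() method that accepts the same arguments as json.dumps, as described in 0.5.1, typically just forwards them. The MiniBlob class and its "raw" key below are hypothetical and only show the forwarding pattern; they do not reproduce TextBlob's serialized structure.

```python
import json


class MiniBlob:
    """Hypothetical class showing argument forwarding to json.dumps."""

    def __init__(self, text):
        self.text = text

    def to_json(self, *args, **kwargs):
        # Any json.dumps option (indent, sort_keys, ...) passes straight through.
        return json.dumps({"raw": self.text}, *args, **kwargs)


blob = MiniBlob("hi")
compact = blob.to_json()
pretty = blob.to_json(indent=2)
print(pretty)
```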
0.5.0 (2013-08-10)
Language translation and detection API!
Add text.sentiments module. Contains the PatternAnalyzer (default implementation) as well as a NaiveBayesAnalyzer.
Part-of-speech tags can be accessed via TextBlob.tags or TextBlob.pos_tags.
Add polarity and subjectivity helper properties.
0.4.0 (2013-08-05)
New text.tokenizers module with WordTokenizer and SentenceTokenizer. Tokenizer instances (from either textblob itself or NLTK) can be passed to TextBlob’s constructor. Tokens are accessed through the new tokens property.
New Blobber class for creating TextBlobs that share the same tagger, tokenizer, and np_extractor.
Add ngrams method.
Backwards-incompatible: TextBlob.json() is now a method, not a property. This allows you to pass arguments (the same that you would pass to json.dumps()).
New home for documentation: https://textblob.readthedocs.io/
Add parameter for cleaning HTML markup from text.
Minor improvement to word tokenization.
Updated NLTK.
Fix bug with adding blobs to bytestrings.
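For readers unfamiliar with the n-grams added in 0.4.0, the sketch below shows the general sliding-window technique on a list of words. The standalone ngrams function, its default of n=3, and the sample sentence are assumptions for illustration; this is not TextBlob's implementation.

```python
def ngrams(words, n=3):
    """Return every run of n consecutive words as a list of lists."""
    if n <= 0:
        return []
    # A window starting at i covers words[i:i + n]; the last valid start
    # is len(words) - n, so shorter inputs yield no n-grams.
    return [words[i:i + n] for i in range(len(words) - n + 1)]


words = "now is better than never".split()
trigrams = ngrams(words, n=3)
print(trigrams)
```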
0.3.10 (2013-08-02)
Bundled NLTK no longer overrides local installation.
Fix sentiment analysis of text with non-ascii characters.
0.3.9 (2013-07-31)
Updated nltk.
ConllExtractor is now Python 3-compatible.
Improved sentiment analysis.
Blobs are equal (with ==) to their string counterparts.
Added instructions to install textblob without nltk bundled.
Dropping official 3.1 and 3.2 support.
0.3.8 (2013-07-30)
Importing TextBlob is now much faster. This is because the noun phrase parsers are trained only on the first call to noun_phrases (instead of training them every time you import TextBlob).
Add text.taggers module which allows users to change which POS tagger implementation to use. Currently supports PatternTagger and NLTKTagger (NLTKTagger only works with Python 2).
NPExtractor and Tagger objects can be passed to TextBlob’s constructor.
Fix bug with POS-tagger not tagging one-letter words.
Rename text/np_extractor.py -> text/np_extractors.py
Add run_tests.py script.
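The import speed-up in 0.3.8 comes from deferring training until the first call that needs it. A minimal sketch of that lazy-initialization pattern follows; LazyExtractor, its train_calls counter, and the title-case "extraction" rule are all hypothetical stand-ins, not TextBlob's noun phrase extractor.

```python
class LazyExtractor:
    """Hypothetical extractor that trains on first use, not at import time."""

    def __init__(self):
        self._trained = False
        self.train_calls = 0  # counter kept only to make the behavior visible

    def train(self):
        # Placeholder for an expensive training step.
        self.train_calls += 1
        self._trained = True

    def extract(self, text):
        # Train once, on the first call that actually needs the model.
        if not self._trained:
            self.train()
        # Toy rule standing in for real extraction: keep title-cased words.
        return [word for word in text.split() if word.istitle()]


extractor = LazyExtractor()
calls_before = extractor.train_calls
found = extractor.extract("Python and Ruby")
extractor.extract("Go and Rust")
print(found, extractor.train_calls)
```

Constructing the object costs nothing; the training cost is paid once, on the first extract() call, and skipped entirely by programs that never call it.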
0.3.7 (2013-07-28)
Every word in a Blob or Sentence is a Word instance which has methods for inflection, e.g. word.pluralize() and word.singularize().
Updated the np_extractor module. Now has a new implementation, ConllExtractor, that uses the Conll2000 chunking corpus. Only works on Py2.