textract usage flags / command options

Basic usage is as simple as:

textract filename.png -o output.txt

The above command writes the text extracted from the image to the file output.txt.
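
When the same command has to be issued from a script, it can be assembled as an argument list and handed to a subprocess. The following is a minimal sketch, not part of textract itself: the helper names basic_textract_command and run_if_available are hypothetical, and it assumes the textract executable is on PATH when the command is actually run.

```python
import shutil
import subprocess


def basic_textract_command(source, output):
    """Return the argv list equivalent to `textract SOURCE -o OUTPUT`."""
    return ["textract", source, "-o", output]


def run_if_available(argv):
    """Run argv only if its executable is found on PATH.

    Returns True when the command was run, False when it was skipped.
    """
    if shutil.which(argv[0]) is None:
        return False
    # check=True raises CalledProcessError on a non-zero exit status.
    subprocess.run(argv, check=True)
    return True


# Hypothetical file names, matching the example command above.
command = basic_textract_command("filename.png", "output.txt")
print(" ".join(command))
```

Passing a list rather than a single string avoids shell quoting problems when file names contain spaces.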

All usage flags / command options are as follows:

usage: textract [-h]
                [-e {aliases,ascii,base64_codec,big5,big5hkscs,bz2_codec,charmap,cp037,cp1006,cp1026,cp1140,cp1250,cp1251,cp1252,cp1253,cp1254,cp1255,cp1256,cp1257,cp1258,cp424,cp437,cp500,cp720,cp737,cp775,cp850,cp852,cp855,cp856,cp857,cp858,cp860,cp861,cp862,cp863,cp864,cp865,cp866,cp869,cp874,cp875,cp932,cp949,cp950,euc_jis_2004,euc_jisx0213,euc_jp,euc_kr,gb18030,gb2312,gbk,hex_codec,hp_roman8,hz,idna,iso2022_jp,iso2022_jp_1,iso2022_jp_2,iso2022_jp_2004,iso2022_jp_3,iso2022_jp_ext,iso2022_kr,iso8859_1,iso8859_10,iso8859_11,iso8859_13,iso8859_14,iso8859_15,iso8859_16,iso8859_2,iso8859_3,iso8859_4,iso8859_5,iso8859_6,iso8859_7,iso8859_8,iso8859_9,johab,koi8_r,koi8_u,latin_1,mac_arabic,mac_centeuro,mac_croatian,mac_cyrillic,mac_farsi,mac_greek,mac_iceland,mac_latin2,mac_roman,mac_romanian,mac_turkish,mbcs,palmos,ptcp154,punycode,quopri_codec,raw_unicode_escape,rot_13,shift_jis,shift_jis_2004,shift_jisx0213,string_escape,tactis,tis_620,undefined,unicode_escape,unicode_internal,utf_16,utf_16_be,utf_16_le,utf_32,utf_32_be,utf_32_le,utf_7,utf_8,utf_8_sig,uu_codec,zlib_codec}]
                [--extension {.csv,.doc,.docx,.eml,.epub,.gif,.htm,.html,.jpeg,.jpg,.json,.log,.mp3,.msg,.odt,.ogg,.pdf,.png,.pptx,.ps,.psv,.rtf,.tff,.tif,.tiff,.tsv,.txt,.wav,.xls,.xlsx,csv,doc,docx,eml,epub,gif,htm,html,jpeg,jpg,json,log,mp3,msg,odt,ogg,pdf,png,pptx,ps,psv,rtf,tff,tif,tiff,tsv,txt,wav,xls,xlsx}]
                [-m METHOD] [-o OUTPUT] [-O OPTION] [-v]
                filename

Command line tool for extracting text from any document.

positional arguments:
  filename              Filename to extract text.

optional arguments:
  -h, --help            show this help message and exit
  -e {aliases,ascii,base64_codec,big5,big5hkscs,bz2_codec,charmap,cp037,cp1006,cp1026,cp1140,cp1250,cp1251,cp1252,cp1253,cp1254,cp1255,cp1256,cp1257,cp1258,cp424,cp437,cp500,cp720,cp737,cp775,cp850,cp852,cp855,cp856,cp857,cp858,cp860,cp861,cp862,cp863,cp864,cp865,cp866,cp869,cp874,cp875,cp932,cp949,cp950,euc_jis_2004,euc_jisx0213,euc_jp,euc_kr,gb18030,gb2312,gbk,hex_codec,hp_roman8,hz,idna,iso2022_jp,iso2022_jp_1,iso2022_jp_2,iso2022_jp_2004,iso2022_jp_3,iso2022_jp_ext,iso2022_kr,iso8859_1,iso8859_10,iso8859_11,iso8859_13,iso8859_14,iso8859_15,iso8859_16,iso8859_2,iso8859_3,iso8859_4,iso8859_5,iso8859_6,iso8859_7,iso8859_8,iso8859_9,johab,koi8_r,koi8_u,latin_1,mac_arabic,mac_centeuro,mac_croatian,mac_cyrillic,mac_farsi,mac_greek,mac_iceland,mac_latin2,mac_roman,mac_romanian,mac_turkish,mbcs,palmos,ptcp154,punycode,quopri_codec,raw_unicode_escape,rot_13,shift_jis,shift_jis_2004,shift_jisx0213,string_escape,tactis,tis_620,undefined,unicode_escape,unicode_internal,utf_16,utf_16_be,utf_16_le,utf_32,utf_32_be,utf_32_le,utf_7,utf_8,utf_8_sig,uu_codec,zlib_codec}, --encoding {aliases,ascii,base64_codec,big5,big5hkscs,bz2_codec,charmap,cp037,cp1006,cp1026,cp1140,cp1250,cp1251,cp1252,cp1253,cp1254,cp1255,cp1256,cp1257,cp1258,cp424,cp437,cp500,cp720,cp737,cp775,cp850,cp852,cp855,cp856,cp857,cp858,cp860,cp861,cp862,cp863,cp864,cp865,cp866,cp869,cp874,cp875,cp932,cp949,cp950,euc_jis_2004,euc_jisx0213,euc_jp,euc_kr,gb18030,gb2312,gbk,hex_codec,hp_roman8,hz,idna,iso2022_jp,iso2022_jp_1,iso2022_jp_2,iso2022_jp_2004,iso2022_jp_3,iso2022_jp_ext,iso2022_kr,iso8859_1,iso8859_10,iso8859_11,iso8859_13,iso8859_14,iso8859_15,iso8859_16,iso8859_2,iso8859_3,iso8859_4,iso8859_5,iso8859_6,iso8859_7,iso8859_8,iso8859_9,johab,koi8_r,koi8_u,latin_1,mac_arabic,mac_centeuro,mac_croatian,mac_cyrillic,mac_farsi,mac_greek,mac_iceland,mac_latin2,mac_roman,mac_romanian,mac_turkish,mbcs,palmos,ptcp154,punycode,quopri_codec,raw_unicode_escape,rot_13,shift_jis,shift_jis_2004,shift_jisx0213,string_escape,tactis,tis_620,undefined,unicode_escape,unicode_internal,utf_16,utf_16_be,utf_16_le,utf_32,utf_32_be,utf_32_le,utf_7,utf_8,utf_8_sig,uu_codec,zlib_codec}
                        Specify the encoding of the output.
  --extension {.csv,.doc,.docx,.eml,.epub,.gif,.htm,.html,.jpeg,.jpg,.json,.log,.mp3,.msg,.odt,.ogg,.pdf,.png,.pptx,.ps,.psv,.rtf,.tff,.tif,.tiff,.tsv,.txt,.wav,.xls,.xlsx,csv,doc,docx,eml,epub,gif,htm,html,jpeg,jpg,json,log,mp3,msg,odt,ogg,pdf,png,pptx,ps,psv,rtf,tff,tif,tiff,tsv,txt,wav,xls,xlsx}
                        Specify the extension of the file.
  -m METHOD, --method METHOD
                        Specify a method of extraction for formats that
                        support it
  -o OUTPUT, --output OUTPUT
                        Output raw text in this file
  -O OPTION, --option OPTION
                        Add arbitrary options to various parsers of the form
                        KEYWORD=VALUE. A full list of available KEYWORD
                        options is available at http://bit.ly/textract-options
  -v, --version         show program's version number and exit
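
The flags listed above can be combined in one call. The sketch below builds such an argument list in Python. It is illustrative only: build_textract_args is a hypothetical helper, the file names are made up, and the values "tesseract" for -m and "language=eng" for -O are assumed examples that do not appear in the help text (the page linked from the -O description is the authoritative list of KEYWORD options). It also assumes -O may be given more than once. The codecs.lookup call is only a rough pre-check that the encoding name is known to the local Python; the choices printed by textract itself remain authoritative.

```python
import codecs


def build_textract_args(filename, output=None, encoding=None,
                        extension=None, method=None, options=None):
    """Assemble a textract argv list from the flags shown in the help text."""
    args = ["textract"]
    if encoding is not None:
        # Raises LookupError if the local Python does not know this codec name.
        codecs.lookup(encoding)
        args += ["-e", encoding]
    if extension is not None:
        args += ["--extension", extension]
    if method is not None:
        args += ["-m", method]
    if output is not None:
        args += ["-o", output]
    # Each parser option is passed as a separate -O KEYWORD=VALUE pair.
    for keyword, value in (options or {}).items():
        args += ["-O", f"{keyword}={value}"]
    # The positional filename goes last, as in the usage line.
    args.append(filename)
    return args


# Assumed example values; adjust to the formats and parsers actually in use.
args = build_textract_args(
    "scan.pdf",
    output="scan.txt",
    encoding="utf_8",
    extension=".pdf",
    method="tesseract",
    options={"language": "eng"},
)
print(" ".join(args))
```

The resulting list can be passed to subprocess.run as is. The --extension flag is mainly useful when the file name does not carry the extension of its real format.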
