Python String encode()

Text has to be converted to bytes before it can be stored in a file or database or sent over a network. Encoding, at its most basic, is a method of converting characters (such as letters, punctuation, symbols, whitespace, and control characters) to numbers and, eventually, bits. Each character is encoded to a distinct bit sequence. Encoding is not encryption: anyone who knows the encoding scheme can decode the bytes, so encoding alone does not secure data. In Python, the string encode() method performs this conversion using the given encoding scheme. This article covers the encode() method, the available encoding standards, and its applications, with examples.
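The path from characters to numbers to bits can be sketched with the built-in ord() and format() functions. The sample string and variable names below are chosen only for illustration.

```python
# Characters -> code points -> UTF-8 bytes -> bits
text = 'Aé'

code_points = [ord(ch) for ch in text]   # each character maps to a number
encoded = text.encode('utf-8')           # the numbers are stored as bytes
bits = ' '.join(format(byte, '08b') for byte in encoded)  # each byte as 8 bits

print(code_points)   # [65, 233]
print(encoded)       # b'A\xc3\xa9'
print(bits)          # 01000001 11000011 10101001
```

Note that 'é' is a single character but takes two bytes in UTF-8, so the number of bytes can differ from the number of characters.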

Python encode() function

Definition

  • The Python encode() is a built-in string method that returns an encoded version of the string, as a bytes object, according to the specified encoding standard.
  • The encoded bytes can be converted back to the original string with the bytes decode() method, so encoding changes how the string is represented but does not secure it.

Python encode() Syntax

The Python string encode() method follows the syntax below:

                    

string.encode(encoding, errors)

String encode() Parameters

Neither parameter is required. The Python encode() method accepts at most two parameters:

  • encoding (Optional) – specifies the encoding standard to be used. If no encoding is specified, Python uses UTF-8 as the default encoding standard.
  • errors (Optional) – decides how encoding errors are handled. The default errors value is ‘strict’. There are 6 types of error responses:
    • strict – default response, which raises a UnicodeEncodeError exception if encoding fails
    • ignore – drops all the unencodable characters from the result
    • replace – replaces each unencodable character in the result with a question mark ‘?’
    • xmlcharrefreplace – substitutes an XML character reference (such as &#322;) for each unencodable character
    • backslashreplace – instead of the unencodable character, inserts a backslash escape sequence such as \xNN or \uNNNN
    • namereplace – instead of the unencodable character, inserts a \N{…} escape sequence
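As a minimal sketch of the default ‘strict’ behaviour, the snippet below tries to encode a string containing non-ASCII characters to ASCII and catches the resulting exception. The sample string and the bad_index variable are only illustrative.

```python
# With the default errors='strict', an unencodable character raises
# UnicodeEncodeError instead of producing a result.
text = 'Wełcøme'
bad_index = None

try:
    text.encode('ascii')          # same as text.encode('ascii', 'strict')
except UnicodeEncodeError as exc:
    bad_index = exc.start         # index of the first unencodable character
    print('Cannot encode', repr(text[bad_index]), 'at index', bad_index)
```

The other error handlers avoid the exception by dropping or substituting the characters that cannot be encoded.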

Return value from encode()

The Python string encode() method returns the encoded version of the string as a bytes object.
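The return type can be checked directly. The short sketch below, using an arbitrary sample string, also shows that decode() restores the original text.

```python
text = 'PŸTHØN'
data = text.encode('utf-8')

print(type(data))                     # <class 'bytes'>
print(len(text), len(data))           # 6 characters become 8 bytes
print(data.decode('utf-8') == text)   # True: decoding restores the string
```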

Program to check all Encoding Standards in Python

Python provides a wide range of encoding standards. The program below displays the alias names that Python recognizes for them. Several aliases can refer to the same encoding, so the list is longer than the number of distinct encodings.

                    

from encodings.aliases import aliases

print("The available encodings are : ")
print(aliases.keys())

Output

The available encodings are :

                    

dict_keys(['646', 'ansi_x3.4_1968', 'ansi_x3_4_1968', 'ansi_x3.4_1986', 'cp367', 'csascii', 'ibm367', 'iso646_us', 'iso_646.irv_1991', 'iso_ir_6', 'us', 'us_ascii', 'base64', 'base_64', 'big5_tw', 'csbig5', 'big5_hkscs', 'hkscs', 'bz2', '037', 'csibm037', 'ebcdic_cp_ca', 
'ebcdic_cp_nl', 'ebcdic_cp_us', 'ebcdic_cp_wt', 'ibm037', 'ibm039', '1026', 'csibm1026', 'ibm1026', '1125', 'ibm1125', 'cp866u', 'ruscii', '1140', 'ibm1140', '1250', 'windows_1250', '1251', 'windows_1251', '1252', 'windows_1252', '1253', 'windows_1253', '1254', 
'windows_1254', '1255', 'windows_1255', '1256', 'windows_1256', '1257', 'windows_1257', '1258', 'windows_1258', '273', 'ibm273', 'csibm273', '424', 'csibm424', 'ebcdic_cp_he', 'ibm424', '437', 'cspc8codepage437', 'ibm437', '500', 'csibm500', 'ebcdic_cp_be', 
'ebcdic_cp_ch', 'ibm500', '775', 'cspc775baltic', 'ibm775', '850', 'cspc850multilingual', 'ibm850', '852', 'cspcp852', 'ibm852', '855', 'csibm855', 'ibm855', '857', 'csibm857', 'ibm857', '858', 'csibm858', 'ibm858', '860', 'csibm860', 'ibm860', '861', 'cp_is', 'csibm861', 
'ibm861', '862', 'cspc862latinhebrew', 'ibm862', '863', 'csibm863', 'ibm863', '864', 'csibm864', 'ibm864', '865', 'csibm865', 'ibm865', '866', 'csibm866', 'ibm866', '869', 'cp_gr', 'csibm869', 'ibm869', '932', 'ms932', 'mskanji', 'ms_kanji', '949', 'ms949', 'uhc', '950', 'ms950', 
'jisx0213', 'eucjis2004', 'euc_jis2004', 'eucjisx0213', 'eucjp', 'ujis', 'u_jis', 'euckr', 'korean', 'ksc5601', 'ks_c_5601', 'ks_c_5601_1987', 'ksx1001', 'ks_x_1001', 'gb18030_2000', 'chinese', 'csiso58gb231280', 'euc_cn', 'euccn', 'eucgb2312_cn', 'gb2312_1980', 'gb2312_80', 'iso_ir_58', 
'936', 'cp936', 'ms936', 'hex', 'roman8', 'r8', 'csHPRoman8', 'cp1051', 'ibm1051', 'hzgb', 'hz_gb', 'hz_gb_2312', 'csiso2022jp', 'iso2022jp', 'iso_2022_jp', 'iso2022jp_1', 'iso_2022_jp_1', 'iso2022jp_2', 'iso_2022_jp_2', 'iso_2022_jp_2004', 'iso2022jp_2004', 'iso2022jp_3', 'iso_2022_jp_3', 
'iso2022jp_ext', 'iso_2022_jp_ext', 'csiso2022kr', 'iso2022kr', 'iso_2022_kr', 'csisolatin6', 'iso_8859_10', 'iso_8859_10_1992', 'iso_ir_157', 'l6', 'latin6', 'thai', 'iso_8859_11', 'iso_8859_11_2001', 'iso_8859_13', 'l7', 'latin7', 'iso_8859_14', 'iso_8859_14_1998', 'iso_celtic', 'iso_ir_199', 'l8', 'latin8', 
'iso_8859_15', 'l9', 'latin9', 'iso_8859_16', 'iso_8859_16_2001', 'iso_ir_226', 'l10', 'latin10', 'csisolatin2', 'iso_8859_2', 'iso_8859_2_1987', 'iso_ir_101', 'l2', 'latin2', 'csisolatin3', 'iso_8859_3', 'iso_8859_3_1988', 'iso_ir_109', 'l3', 'latin3', 'csisolatin4', 'iso_8859_4', 'iso_8859_4_1988', 'iso_ir_110', 
'l4', 'latin4', 'csisolatincyrillic', 'cyrillic', 'iso_8859_5', 'iso_8859_5_1988', 'iso_ir_144', 'arabic', 'asmo_708', 'csisolatinarabic', 'ecma_114', 'iso_8859_6', 'iso_8859_6_1987', 'iso_ir_127', 'csisolatingreek', 'ecma_118', 'elot_928', 'greek', 'greek8', 'iso_8859_7', 'iso_8859_7_1987', 'iso_ir_126', 
'csisolatinhebrew', 'hebrew', 'iso_8859_8', 'iso_8859_8_1988', 'iso_ir_138', 'csisolatin5', 'iso_8859_9', 'iso_8859_9_1989', 'iso_ir_148', 'l5', 'latin5', 'cp1361', 'ms1361', 'cskoi8r', 'kz_1048', 'rk1048', 'strk1048_2002', '8859', 'cp819', 'csisolatin1', 'ibm819', 'iso8859', 'iso8859_1', 'iso_8859_1', 
'iso_8859_1_1987', 'iso_ir_100', 'l1', 'latin', 'latin1', 'maccyrillic', 'macgreek', 'maciceland', 'maccentraleurope', 'maclatin2', 'macintosh', 'macroman', 'macturkish', 'ansi', 'dbcs', 'csptcp154', 'pt154', 'cp154', 'cyrillic_asian', 'quopri', 'quoted_printable', 'quotedprintable', 'rot13', 'csshiftjis', 'shiftjis', 
'sjis', 's_jis', 'shiftjis2004', 'sjis_2004', 's_jis_2004', 'shiftjisx0213', 'sjisx0213', 's_jisx0213', 'tis260', 'tis620', 'tis_620_0', 'tis_620_2529_0', 'tis_620_2529_1', 'iso_ir_166', 'u16', 'utf16', 'unicodebigunmarked', 'utf_16be', 'unicodelittleunmarked', 'utf_16le', 'u32', 'utf32', 'utf_32be', 'utf_32le', 'u7', 
'utf7', 'unicode_1_1_utf_7', 'u8', 'utf', 'utf8', 'utf8_ucs2', 'utf8_ucs4', 'cp65001', 'uu', 'zip', 'zlib', 'x_mac_japanese', 'x_mac_korean', 'x_mac_simp_chinese', 'x_mac_trad_chinese'])
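Since many of these names are aliases of one another, it can help to see what an alias resolves to. The sketch below picks the alias 'u8' from the list above purely as an example and looks it up with the standard codecs module.

```python
import codecs
from encodings.aliases import aliases

alias = 'u8'                            # one of the alias names listed above
module_name = aliases[alias]            # codec module the alias points to
canonical = codecs.lookup(alias).name   # name reported by the codec itself

print(alias, '->', module_name, '->', canonical)
# Encoding with the alias gives the same bytes as encoding with 'utf-8'
print('abc'.encode(alias) == 'abc'.encode('utf-8'))
```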

Example 1: Encode to Default UTF-8 Encoding

Example

                    

# Python program to illustrate encode()
my_str = 'Good Morning'
print('String is:', my_str)
# encodes to default utf-8
print('Encoded string is:', my_str.encode())
print('')

word = 'PŸTHØN'
print('String is:', word)
# encodes to default utf-8
print('Encoded string is:', word.encode())

Output

                    

String is: Good Morning
Encoded string is: b'Good Morning'

String is: PŸTHØN
Encoded string is: b'P\xc5\xb8TH\xc3\x98N'

Example 2: Encoding with other Standards

Example

                    

# Python program to illustrate encode()
my_str = 'Good Môrning'
print('String is:', my_str)
# encodes to latin10
print('Encoded string is:', my_str.encode('latin10'))
print('')

word = 'Δ π θ'
print('String is:', word)
# encodes to greek
print('Encoded string is:', word.encode('greek'))

Output

                    

String is: Good Môrning
Encoded string is: b'Good M\xf4rning'

String is: Δ π θ
Encoded string is: b'\xc4 \xf0 \xe8'
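The same text produces different bytes under different standards, and the bytes only make sense when decoded with the standard that produced them. A small sketch, reusing the Greek sample string from above:

```python
word = 'Δ π θ'

greek_bytes = word.encode('greek')   # ISO 8859-7: one byte per character
utf8_bytes = word.encode('utf-8')    # UTF-8: two bytes per Greek letter

print(len(greek_bytes), len(utf8_bytes))    # 5 8
# Decoding with a different standard than the one used to encode gives wrong text
print(greek_bytes.decode('latin1'))         # Ä ð è
print(greek_bytes.decode('greek') == word)  # True
```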

Example 3: Encoding with error parameter

                    

# Python program to illustrate encode()
greet = 'Wełcøme'

print('The encoded version (with ignore) is:', greet.encode("ascii","ignore"))

print('The encoded version (with replace) is:', greet.encode("ascii","replace"))

print('The encoded version (with namereplace) is:', greet.encode("ascii","namereplace"))

print('The encoded version (with backslashreplace) is:', greet.encode("ascii","backslashreplace"))

print('The encoded version (with xmlcharrefreplace) is:', greet.encode("ascii","xmlcharrefreplace"))

Output

                    

The encoded version (with ignore) is: b'Wecme'

The encoded version (with replace) is: b'We?c?me'

The encoded version (with namereplace) is: b'We\\N{LATIN SMALL LETTER L WITH STROKE}c\\N{LATIN SMALL LETTER O WITH STROKE}me'

The encoded version (with backslashreplace) is: b'We\\u0142c\\xf8me'

The encoded version (with xmlcharrefreplace) is: b'We&#322;c&#248;me'

Frequently Asked Questions

Q1. What does encode() do in Python?

The Python string encode() method converts a string to a bytes object using the given encoding scheme. Encoded bytes are what is actually stored in files and databases or sent over a network. Encoding is not a security measure, because the bytes can be decoded back to the original string by anyone who knows the encoding.

Q2. How do you declare an encoding in Python?

The encoding is passed as the first argument to the string encode() method, which follows the syntax below:

                    

string.encode(encoding, errors)

Q3. What is Python default encoding?

If no parameters are specified, then the Python string encode() function uses its default values. For the encoding parameter, the default value is UTF-8, and for the errors parameter, the default value is strict.

Example

                    

msg = 'Edücatioñ'

print('Encoding with default parameters:', msg.encode())
print('Encoding with parameters:', msg.encode('utf8'))

Output

                    

Encoding with default parameters: b'Ed\xc3\xbccatio\xc3\xb1'
Encoding with parameters: b'Ed\xc3\xbccatio\xc3\xb1'

Q4. How do I encode a URL in Python?

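To encode a URL in Python, we first need to import the ‘urllib.parse’ module. Importing only ‘urllib’ does not guarantee that the parse submodule is loaded. A URL can be encoded with 3 functions:

  1. parse.quote(): Percent-encodes special characters, with spaces encoded as ‘%20’

                    

import urllib.parse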

msg = 'I ãm a študeńt'
print(urllib.parse.quote(msg))

Output

                    

I%20%C3%A3m%20a%20%C5%A1tude%C5%84t

  2. parse.quote_plus(): Encodes spaces to ‘+’

                    

import urllib.parse

msg = 'I ãm a študeńt'
print(urllib.parse.quote_plus(msg))

Output

                    

I+%C3%A3m+a+%C5%A1tude%C5%84t

  3. parse.urlencode(): Encodes multiple parameters

                    

import urllib.parse

msg = {'A': 'Hi', 'website': 'www.apple.com'}
print(urllib.parse.urlencode(msg))

Output

                    

A=Hi&website=www.apple.com
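Each of these functions has a counterpart in urllib.parse that reverses it. The sketch below, using the same sample values as above, checks the round trip with unquote() and parse_qs().

```python
from urllib.parse import quote, unquote, urlencode, parse_qs

msg = 'I ãm a študeńt'
encoded = quote(msg)
print(unquote(encoded) == msg)    # True: unquote() reverses quote()

query = urlencode({'A': 'Hi', 'website': 'www.apple.com'})
params = parse_qs(query)          # each value comes back as a list
print(params)                     # {'A': ['Hi'], 'website': ['www.apple.com']}
```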
