The decode() and encode() Functions in Python 2 and Python 3 - 白红宇



Published: 2019-05-25


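In Python 3, str holds Unicode text and source files default to UTF-8; in Python 2, the default encoding is ASCII. UTF-8 is not an extension of Unicode but one of its encodings: a variable-length byte representation of Unicode code points that is backward compatible with ASCII.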
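To make the difference between a Unicode code point and its encoded bytes concrete, here is a minimal Python 3 sketch. The variable names are only illustrative.

```python
# Unicode assigns each character a code point; UTF-8 and GBK are
# two different ways of turning that character into bytes.
text = "中"
code_point = ord(text)               # the Unicode code point, U+4E2D
utf8_bytes = text.encode("utf-8")    # three bytes in UTF-8
gbk_bytes = text.encode("gbk")       # two bytes in GBK
ascii_bytes = "A".encode("utf-8")    # ASCII characters keep their single byte

print(hex(code_point), utf8_bytes, gbk_bytes, ascii_bytes)
```

The same character has one code point but a different byte sequence under each encoding, which is why bytes must always be decoded with the codec that produced them.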

Python 3: UTF-8 -> GBK

# In Python 3, str holds Unicode text
str_unicode = "中国"
# Encode it to GBK
str_to_gbk = str_unicode.encode('gbk')
print(str_to_gbk)

Output:

b'\xd6\xd0\xb9\xfa'

Python 3: known GBK -> UTF-8

# decode("gbk") turns the GBK-encoded bytes into a str; encode("utf8") then encodes that str as UTF-8
gbk_to_utf = str_to_gbk.decode("gbk").encode("utf8")
print(gbk_to_utf)

Output:

b'\xe4\xb8\xad\xe5\x9b\xbd'

In Python 3, the encoding argument names the codec used for encoding or decoding:

bytes ------decode() -------> str ------encode()------->bytes

Note: a Python 3 str object is similar to the unicode type in Python 2. decode() converts bytes into a str (Unicode text), so str has only an encode() method, and calling it returns an encoded bytes object.
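The bytes -> str -> bytes pipeline can be wrapped in a small helper. This is only a sketch: transcode is a hypothetical name introduced here, not a standard library function.

```python
def transcode(data: bytes, source: str, target: str) -> bytes:
    """Decode bytes with the source codec, then encode the str with the target codec."""
    return data.decode(source).encode(target)


gbk_bytes = "中国".encode("gbk")
utf8_bytes = transcode(gbk_bytes, "gbk", "utf-8")
print(utf8_bytes)

# In Python 3, str has no decode() and bytes has no encode().
print(hasattr("中国", "decode"), hasattr(gbk_bytes, "encode"))
```

Going through str in the middle is what makes the conversion safe: the two byte encodings never have to know about each other.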

Python 2: UTF-8 -> GBK

# -*- coding: utf-8 -*-
str_code = "中国"  # Python 2's default encoding is ASCII; this str holds UTF-8 bytes
print(str_code)
str_to_gbk = str_code.encode("gbk")

Running this raises an error:

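中国
Traceback (most recent call last):
  File "encode.py", line 4, in <module>
    str_to_gbk=str_code.encode("gbk")
UnicodeDecodeError: 'ascii' codec can't decode byte 0xe4 in position 0: ordinal not in range(128)

Python 2's default encoding is ASCII, while the "# -*- coding: utf-8 -*-" line declares that the source file is UTF-8 encoded, so str_code holds UTF-8 bytes. Calling encode() on a Python 2 str first decodes it implicitly with the default ASCII codec, which fails on the byte 0xe4. The str should be decoded to unicode explicitly before it is encoded to GBK: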

# -*- coding: utf-8 -*-
str_code = "中国"
print(str_code)
str_to_gbk = str_code.decode("utf-8").encode("gbk")
print(str_to_gbk, type(str_to_gbk))

Output:

中国
('\xd6\xd0\xb9\xfa', <type 'str'>)
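The UnicodeDecodeError seen in Python 2 can be reproduced in Python 3 by spelling out the ASCII decode that Python 2 performed implicitly. This is a simulation under that assumption, not Python 2 code; utf8_bytes stands in for the Python 2 str.

```python
# Python 3 sketch of what Python 2 did implicitly for str.encode("gbk"):
# decode the byte string with the default ASCII codec, then encode.
utf8_bytes = "中国".encode("utf-8")  # stands in for the Python 2 str

try:
    utf8_bytes.decode("ascii").encode("gbk")
    error_message = None
except UnicodeDecodeError as exc:
    error_message = str(exc)

print(error_message)

# Decoding with the codec that actually matches the bytes succeeds.
gbk_bytes = utf8_bytes.decode("utf-8").encode("gbk")
print(gbk_bytes)
```

The failure comes from the decode step, not the encode step, which is why the exception is a UnicodeDecodeError even though the code called encode().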

Python 2: GBK -> UTF-8

str_to_utf8 = str_to_gbk.decode("gbk").encode("utf8")
print(str_to_utf8)

Output:

中国

In Python 2, the unicode type is the intermediate type for every encoding conversion, that is:

str ----decode() -------> unicode ----encode()------>str

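Note: in Python 2, printing a unicode object encodes it implicitly with the output stream's encoding. When that encoding is ASCII or cannot be determined (for example, when output is piped), printing non-ASCII text raises a UnicodeEncodeError, so it is safer to encode the unicode object to str explicitly before printing.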
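Whether printing Unicode text fails depends on the encoding of the output stream. The following Python 3 sketch imitates an ASCII-only output; ascii_stream is a hypothetical stand-in for a terminal or pipe, not how Python 2 sets up stdout.

```python
import io

# Hypothetical stand-in for a terminal or pipe whose encoding is ASCII.
ascii_stream = io.TextIOWrapper(io.BytesIO(), encoding="ascii")

try:
    ascii_stream.write("中国")
    failed = False
except UnicodeEncodeError:
    failed = True

# Encoding explicitly and writing bytes does not depend on the stream's codec.
raw = io.BytesIO()
raw.write("中国".encode("utf-8"))
print(failed, raw.getvalue())
```

Encoding explicitly moves the choice of codec into the program instead of leaving it to the environment.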

Reposted from: http://ypzsi.baihongyu.com/
