Python Basics: Deep Copy and Shallow Copy

off999 2024-10-05 19:43

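How the basic data types store addresses, and how those addresses change

Python's data types include bool, int, long (Python 2 only; Python 3 merged it into int), float, str, set, list, tuple, dict, and so on. We can roughly group them into simple data types and complex data structures.

Data structures: set structure: set; sequence structures: tuple, list, (str); mapping structure: dict

Basic data types: int, long, float, bool, str, ...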

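Note: Python variables use reference semantics, and data structures can contain basic data types, so data in Python is stored as shown in the figure below: each variable stores the address of its value, not the value itself, and a complex data structure likewise stores only the address of each of its elements.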
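The point above can be checked with a minimal sketch. The names a, b, inner, and items are illustrative only and do not come from the original post.

```python
# Two names bound to the same object share one identity.
a = [1, 2, 3]
b = a
same_object = a is b                   # True: both names refer to one list

# A container stores references to its elements, not copies of them.
inner = ['x']
items = [inner, 'text']
element_is_inner = items[0] is inner   # True: the list holds a reference to inner

print(same_object, element_is_inner)
```

The `is` operator compares object identity, which is the same thing that comparing `id()` values checks while both objects are alive.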

1. How re-initializing a variable affects Python's reference semantics

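Each time a variable is re-initialized with a new value, the name is bound to the object holding that value: the address of the new content is assigned to the variable, and the old content is not overwritten in place.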

    str1 = "hello world"
    print(id(str1))    # 43863640
    str1 = "new hello world"
    print(id(str1))    # 43863680

The code above shows that when str1 is re-initialized, its id changes, because the address stored in str1 changes from that of 'hello world' to that of 'new hello world'.

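2. How changes to a data structure's internal elements affect Python's reference semantics

For complex data types, here is how changing the values inside affects the variable: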

    list1 = [1, 2, 3, 4, 5, 6]
    print(id(list1))          # 7705224
    list1.append('new item')
    print(id(list1))          # 7705224
    list1.pop()
    print(list1)              # [1, 2, 3, 4, 5, 6]
    list1[0] = 'change_test'
    print(list1)              # ['change_test', 2, 3, 4, 5, 6]
    print(id(list1))          # 7705224
    list1 = [1, 2, 3, 4, 5]   # rebinding: list1 now refers to a new list
    print(id(list1))          # a different id, no longer 7705224

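Adding, removing, or modifying elements of a list does not affect the address of list1 itself; it only changes the element references stored inside the list. But when we re-initialize (reassign) the list, the variable list1 is given a new address that replaces the address of the original list, and at that point the id of list1 changes. The same reasoning applies to all mutable complex data types.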
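The difference between modifying a list in place and rebinding its name can be sketched as follows. The extra name old is an assumption added for illustration, so that the original list stays reachable for comparison.

```python
list1 = [1, 2, 3]
old = list1                           # keep a second reference to the original list

list1.append(4)                       # in-place change: same object
same_after_mutation = list1 is old    # True

list1 = [1, 2, 3]                     # rebinding: list1 now names a new list
same_after_rebinding = list1 is old   # False

print(same_after_mutation, same_after_rebinding)
print(old)                            # [1, 2, 3, 4]
```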

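3. Variable assignment

As we have just seen, re-initializing (reassigning) str1 changes its memory address. From the result in the figure above, we can see that after str1 is modified, str2, which was assigned from it, is unaffected in both memory address and value.

Looking at what changes in memory: the initial assignment makes both str1 and str2 store the address of 'hello world'. Re-initializing str1 changes the address stored in str1 so that it points to the newly created value, while the address stored in str2 does not change, so str2 is unaffected.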
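A minimal sketch of what this step might look like. The variable names str1 and str2 come from the text; the exact statements are an assumption, since the code for this section is not shown.

```python
str1 = "hello world"
str2 = str1                      # both names refer to the same string object
shared_before = str1 is str2     # True

str1 = "new hello world"         # rebinding str1 does not touch str2
shared_after = str1 is str2      # False

print(shared_before, shared_after)
print(str2)                      # hello world
```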

4. Assignment in complex data structures

    print("Assignment in complex data structures")
    list1 = [1, 2, 3, 4, 5, 6]
    list2 = list1
    print(id(list1))  # 42367240
    print(id(list2))  # 42367240
    list1.append('new item')
    print(list1)      # [1, 2, 3, 4, 5, 6, 'new item']
    print(list2)      # [1, 2, 3, 4, 5, 6, 'new item']
    print(id(list1))  # 42367240
    print(id(list2))  # 42367240

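This shows that adding to or modifying a list does not change the list's memory address, and that the change is visible through both list1 and list2. When a new value is appended, the list stores the address of one more element, while the address of the list itself stays the same. The ids of list1 and list2 are therefore unchanged, and both show the new element.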

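A first look at copying

We have now looked at variable assignment in detail. For complex data structures, assignment amounts to fully sharing the resource: a change made through one variable is fully visible through the other. Sometimes, however, we need to keep a copy of the original data before processing it, and plain assignment is then the wrong tool. Python provides the copy module for this need. It offers two main copy functions: the ordinary copy, and deepcopy. We call the former a shallow copy and the latter a deep copy.

Deep and shallow copying is an important topic in many programming languages. Below, we analyze the difference between the two from the perspective of memory.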
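As a quick orientation before the detailed walkthrough, the three options (assignment, copy.copy, copy.deepcopy) can be compared side by side. The names original, alias, shallow, and deep are illustrative only.

```python
import copy

original = [1, 2, [3, 4]]
alias = original                  # assignment: the same object under another name
shallow = copy.copy(original)     # new outer list, nested list still shared
deep = copy.deepcopy(original)    # new outer list and new nested list

print(alias is original)          # True
print(shallow is original)        # False
print(shallow[2] is original[2])  # True
print(deep[2] is original[2])     # False
```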

Shallow copy:

First, let's look at shallow copy:

    import copy

    print("Shallow copy")
    lst = ['str1', 'str2', 'str3']
    sourcelst = ['str1', 'str2', 'str3', lst]
    copylst = copy.copy(sourcelst)
    print("Original addresses")
    print([id(ele) for ele in sourcelst])
    print([id(ele) for ele in copylst])

    print("When sourcelst changes, the address of lst stored in copylst does not change")
    sourcelst.append('source')
    copylst.append('copy')
    print("->sourcelst: ", sourcelst)
    print("->copylst: ", copylst)
    print(id(sourcelst))
    print(sourcelst)
    print([id(ele) for ele in sourcelst])
    print(id(copylst))
    print(copylst)
    print([id(ele) for ele in copylst])

    print("The first element of sourcelst changes, but copylst still stores the address of 'str1', so copylst does not change.")
    print([id(ele) for ele in sourcelst])
    print([id(ele) for ele in copylst])
    sourcelst[0] = 'change'
    print("->sourcelst: ", sourcelst)
    print("->copylst: ", copylst)
    print(id(sourcelst))
    print(sourcelst)
    print([id(ele) for ele in sourcelst])
    print(id(copylst))
    print(copylst)
    print([id(ele) for ele in copylst])

    print("When lst changes, both sourcelst and copylst change.")
    print([id(ele) for ele in sourcelst])
    print([id(ele) for ele in copylst])
    lst.append('Append')
    print("->sourcelst: ", sourcelst)
    print("->copylst: ", copylst)
    print(id(sourcelst))
    print(sourcelst)
    print([id(ele) for ele in sourcelst])
    print(id(copylst))
    print(copylst)
    print([id(ele) for ele in copylst])

Shallow copy: no matter how complex the data structure is, a shallow copy only copies one level.
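The "one level only" rule can be sketched with a short example. It also uses list.copy(), slicing, and list() alongside copy.copy, since for lists these all produce shallow copies; the names inner, source, and copies are illustrative only.

```python
import copy

inner = ['a', 'b']
source = [1, inner]

# Four common ways to make a shallow copy of a list.
copies = [copy.copy(source), source.copy(), source[:], list(source)]

for c in copies:
    print(c is source, c[1] is inner)   # False True

source[0] = 99        # top-level change: the copies are unaffected
inner.append('c')     # nested change: visible through every copy
print(copies[0])      # [1, ['a', 'b', 'c']]
```

Each copy has its own outer list, so replacing source[0] stays local to source, while the nested list is shared by all of them.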

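Deep copy

We have just seen what shallow copy means. But when writing programs, we sometimes want a complete copy of a complex data structure that has no connection at all to the original. What should we do? This is where deep copy comes in, namely deepcopy, the other function provided by Python's copy module. A deep copy duplicates all the data reachable from the original variable and builds an equivalent set of content in memory, so modifying either of the two variables in any way does not affect the other. Let's try it out.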

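Judging from the execution results, this time neither operating directly on the list nor operating on the data structure nested inside it affects the copied list. Next, let's look at how these variables sit in memory.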

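With the above in mind, we can understand how deep copy works. A deep copy allocates new space in memory: no matter how complex the data structure is, whenever it meets a data type that may change (a mutable type), it allocates new memory and copies the contents, level by level, until it reaches the last level, where no complex data types remain and the original references are kept. This way, however complex the data structure is, modifications to one copy do not affect the other. That is deep copy.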
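The rule described above (mutable containers are copied, immutable leaves keep their original reference) can be checked with a small sketch. The names lst, source, and deep are illustrative, and the identity result for the string reflects CPython's behavior.

```python
import copy

lst = ['str1', 'str2']
source = ['str1', lst]
deep = copy.deepcopy(source)

print(deep is source)         # False: new outer list
print(deep[1] is source[1])   # False: the nested list was copied too
print(deep[0] is source[0])   # True in CPython: the immutable str is reused

lst.append('Append')          # change the nested list of the original
print(source)                 # ['str1', ['str1', 'str2', 'Append']]
print(deep)                   # ['str1', ['str1', 'str2']]
```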

    import copy

    print("Deep copy")
    lst = ['str1', 'str2', 'str3']
    sourcelst = ['str1', 'str2', 'str3', lst]
    deepcopylst = copy.deepcopy(sourcelst)
    print("Original addresses")
    print([id(ele) for ele in sourcelst])
    print([id(ele) for ele in deepcopylst])

    print("When sourcelst changes, the address of the nested list stored in deepcopylst does not change")
    sourcelst.append('source')
    deepcopylst.append('deepcopy')
    print("->sourcelst: ", sourcelst)
    print("->deepcopylst: ", deepcopylst)
    print(id(sourcelst))
    print(sourcelst)
    print([id(ele) for ele in sourcelst])
    print(id(deepcopylst))
    print(deepcopylst)
    print([id(ele) for ele in deepcopylst])

    print("The first element of sourcelst changes, but deepcopylst still stores the address of 'str1', so deepcopylst does not change.")
    print([id(ele) for ele in sourcelst])
    print([id(ele) for ele in deepcopylst])
    sourcelst[0] = 'change'
    print("->sourcelst: ", sourcelst)
    print("->deepcopylst: ", deepcopylst)
    print(id(sourcelst))
    print(sourcelst)
    print([id(ele) for ele in sourcelst])
    print(id(deepcopylst))
    print(deepcopylst)
    print([id(ele) for ele in deepcopylst])

    print("When lst changes, sourcelst changes but deepcopylst does not.")
    print([id(ele) for ele in sourcelst])
    print([id(ele) for ele in deepcopylst])
    lst.append('Append')
    print("->sourcelst: ", sourcelst)
    print("->deepcopylst: ", deepcopylst)
    print(id(sourcelst))
    print(sourcelst)
    print([id(ele) for ele in sourcelst])
    print(id(deepcopylst))
    print(deepcopylst)
    print([id(ele) for ele in deepcopylst])

Let's go straight to a piece of code:

    import copy

    will = ["Will", 28, ["Python", "C#", "JavaScript"]]
    # wilber = copy.deepcopy(will)
    wilber = will   # plain assignment: wilber refers to the same list as will
    print(id(will))
    print(will)
    print([id(ele) for ele in will])
    print(id(wilber))
    print(wilber)
    print([id(ele) for ele in wilber])
    print("\n")

    will[0] = "Wilber"
    will[2].append("CSS")
    print(id(will))
    print(will)
    print([id(ele) for ele in will])
    print(id(wilber))
    print(wilber)
    print([id(ele) for ele in wilber])

Output of the code:

    42511816
    ['Will', 28, ['Python', 'C#', 'JavaScript']]
    [31949688, 506294592, 42511880]
    42511816
    ['Will', 28, ['Python', 'C#', 'JavaScript']]
    [31949688, 506294592, 42511880]

    42511816
    ['Wilber', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
    [42534368, 506294592, 42511880]
    42511816
    ['Wilber', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
    [42534368, 506294592, 42511880]


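Let's analyze this code:

First, a variable named will is created, which points to a list object. The first figure shows the addresses of all the objects (the results may differ on each run).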

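Then wilber is assigned from will, so wilber points to the object (memory address) that will refers to. In other words, "wilber is will" and "wilber[i] is will[i]". Put differently, in Python, assigning an object always passes an object reference (memory address).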
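The two identity claims can be verified directly with the `is` operator. This is a minimal sketch that reuses the will and wilber names from the code above; the helper variable elements_shared is illustrative only.

```python
will = ["Will", 28, ["Python", "C#", "JavaScript"]]
wilber = will                    # assignment passes the reference

print(wilber is will)            # True

# Every element is shared as well, because there is only one list.
elements_shared = all(w is x for w, x in zip(wilber, will))
print(elements_shared)           # True

will[0] = "Wilber"
will[2].append("CSS")
print(wilber)                    # ['Wilber', 28, ['Python', 'C#', 'JavaScript', 'CSS']]
```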