Deepcopy

From Rosetta Code
Revision as of 02:47, 18 July 2011 by rosettacode>Kernigh (Add Ruby, using the Marshal.load(Marshal.dump object) hack.)
Deepcopy is a draft programming task. It is not yet considered ready to be promoted as a complete task, for reasons that should be found in its talk page.

Demonstrate how your language handles complete copying, often referred to as deep copying, of complex data structures.

This can be via a built-in procedure or operator, a function from a common library (provide reference and source if it is written in the language), or a procedure (show source).

  • Describe the relevant semantics of structures, such as if they are homogeneous or heterogeneous, or if they can contain cycles. Describe any limitations of the method.
  • Demonstrate the procedure by copying a structure with all supported semantics/complexity (e.g. heterogeneous with cycles).
  • Demonstrate that the structure and its copy are different.

E

In E, serialization is generalized to transforming object graphs from one representation to another. Deep copying, therefore, consists of transforming a live object graph into a live object graph, by connecting deSubgraphKit's output to its input. No intermediate serialized form is needed.

<lang e>def deSubgraphKit := <elib:serial.deSubgraphKit> def deepcopy(x) {

 return deSubgraphKit.recognize(x, deSubgraphKit.makeBuilder())

}</lang>

As befits a serialization system, this deep copy may operate on any serializable structure, whether standard or user-defined, and the structure may contain cycles.

<lang e>? def x := ["a" => 1, "b" => [x].diverge()]

  1. value: ["a" => 1, "b" => [<***CYCLE***>].diverge()]

? def y := deepcopy(x)

  1. value: ["a" => 1, "b" => [<***CYCLE***>].diverge()]

? y["b"].push(2)

? y

  1. value: ["a" => 1, "b" => [<***CYCLE***>, 2].diverge()]

? x

  1. value: ["a" => 1, "b" => [<***CYCLE***>].diverge()]

? y["b"][0] == y

  1. value: true

? y["b"][0] == x

  1. value: false

? x["b"][0] == x

  1. value: true</lang>

(.diverge() produces mutable data structures, and <***CYCLE***> is what is printed when printing some object meets itself again.)

Icon and Unicon

Unicon and Icon support heterogeneous structures with loops. The Unicon book has an example of a simple algorithm for producing a deep copy of a structured value (set, list, table, or record); however, that code did not handle graph structures that are not trees. The code for deepcopy below from Unilib is a modification that addresses loops.

The code requires modification to run under Icon as Unicon extended key(X) to operate on lists and records not just tables.

<lang Unicon>procedure deepcopy(A, cache) #: return a deepcopy of A

   local k
   /cache := table()        # used to handle multireferenced objects
   if \cache[A] then return cache[A]
   case type(A) of {
       "table"|"list": {
           cache[A] := copy(A)
           every cache[A][k := key(A)] := deepcopy(A[k], cache)
           }
       "set": {
           cache[A] := set()
           every insert(cache[A], deepcopy(!A, cache))
           }
       default: {           # records and objects (encoded as records)
           cache[A] := copy(A)
           if match("record ",image(A)) then {
               every cache[A][k := key(A)] := deepcopy(A[k], cache)
               }
           }
       }
   return .cache[A]

end</lang>

The following code demonstrates deepcopy usage and that the resulting structure is different from the original by comparing assignment, copy, and deepcopy.

<lang Icon>link printf,ximage

procedure main()

  knot    := makeknot()     # create a structure with loops
  knota   := knot           # copy by assignment (reference)
  knotc   := copy(knot)     # built-in copy (shallow)
  knotdc  := deepcopy(knot) # deep copy


  showdeep("knota  (assignment) vs. knot",knota,knot) 
  showdeep("knotc  (copy)  vs. knot",knotc,knot) 
  showdeep("knotdc (deepcopy)  vs. knot",knotdc,knot) 
  xdump("knot   (original)",knot)
  xdump("knota  (assignment)",knota)
  xdump("knotc  (copy)",knotc)
  xdump("knotdc (deepcopy)",knotdc)

end

record rec1(a,b,c) # record for example

class Class1(a1,a2) # class - looks like a record under the covers

  method one()
     self.a1 := 1
     return
  end

initially

  self.a1 := 0

end


procedure makeknot() #: return a homogeneous structure with loops

  L := [9,8,7]
  T := table()
  T["a"] := 1
  R := rec1(T)
  S := set(R)
  C := Class1()
  C.one()
  T["knot"] := [L,R,S,C]
  put(L,R,S,T,C)
  return L

end

procedure showdeep(tag,XC,X) #: demo to show (non-)equivalence of list elements

  printf("Analysis of copy depth for %s:\n",tag)
  showequiv(XC,X)
  every showequiv(XC[i := 1 to *X],X[i])

end

procedure showequiv(x,y) #: show (non-)equivalence of two values

  return printf("   %i %s %i\n",x,if x === y then "===" else "~===",y)

end</lang>

DeepCopy.icn

printf.icn provides printf ximage.icn provides xdump

The sample of output below compares all elements of a copy of a structure against the original. Immutable types like numbers, strings, and csets will show as the same (i.e. ===) and different mutable types will show as not the same (i.e. ~===). This clearly shows the difference between assignment, copy, and deepcopy.

Analysis of copy depth for knota  (assignment) vs. knot:
   list_11(7) === list_11(7)
   9 === 9
   8 === 8
   7 === 7
   record rec1_2(3) === record rec1_2(3)
   set_2(1) === set_2(1)
   table_2(2) === table_2(2)
   record Class1__state_2(4) === record Class1__state_2(4)
Analysis of copy depth for knotc  (copy)  vs. knot:
   list_13(7) ~=== list_11(7)
   9 === 9
   8 === 8
   7 === 7
   record rec1_2(3) === record rec1_2(3)
   set_2(1) === set_2(1)
   table_2(2) === table_2(2)
   record Class1__state_2(4) === record Class1__state_2(4)
Analysis of copy depth for knotdc (deepcopy)  vs. knot:
   list_14(7) ~=== list_11(7)
   9 === 9
   8 === 8
   7 === 7
   record rec1_3(3) ~=== record rec1_2(3)
   set_3(1) ~=== set_2(1)
   table_4(2) ~=== table_2(2)
   record Class1__state_3(4) ~=== record Class1__state_2(4)
...

Another way to show the difference in the structures is to use the xdump procedure will produce the following in stderr (&errout):

knot   (original)"
L11 := list(7)
   L11[1] := 9
   L11[2] := 8
   L11[3] := 7
   L11[4] := R_rec1_2 := rec1()
      R_rec1_2.a := T2 := table(&null)
         T2["a"] := 1
         T2["knot"] := L12 := list(4)
            L12[1] := L11
            L12[2] := R_rec1_2
            L12[3] := S2 := set()
               insert(S2,R_rec1_2)
            L12[4] := R_Class1__state_2 := Class1__state()
               R_Class1__state_2.a1 := 1
   L11[5] := S2
   L11[6] := T2
   L11[7] := R_Class1__state_2
...

Ruby

Rubyists can hack a deep copy by using the core class Marshal. The intermediate form of Marshal.load(Marshal.dump object) saves the object and any descendant objects.

<lang ruby># _orig_ is a Hash that contains an Array. orig = { :num => 1, :ary => [2, 3] } orig[:cycle] = orig # _orig_ also contains itself.

  1. _copy_ becomes a deep copy of _orig_.

copy = Marshal.load(Marshal.dump orig)

  1. These changes to _orig_ never affect _copy_,
  2. because _orig_ and _copy_ are disjoint structures.

orig[:ary] << 4 orig[:rng] = (5..6)

  1. Because of deep copy, orig[:ary] and copy[:ary]
  2. refer to different Arrays.

p orig # => {:num=>1, :ary=>[2, 3, 4], :cycle=>{...}, :rng=>5..6} p copy # => {:num=>1, :ary=>[2, 3], :cycle=>{...}}

  1. The original contains itself, and the copy contains itself,
  2. but the original and the copy are not the same object.

p [(orig.equal? orig[:cycle]),

  (copy.equal? copy[:cycle]),
  (not orig.equal? copy)]	# => [true, true, true]</lang>

Marshal cannot dump an object that relates to the system (like Dir or IO), relates to the program (like MatchData or Thread), uses an anonymous class or module, or has a singleton method. (ri Marshal.dump documents this restriction.) If Marshal encounters any such object, then the deep copy fails.

Marshal can dump internal objects, but never copies them. The internal objects are nil, false, true and instances of Fixnum or Symbol. For example, Marshal.dump(Marshal.load :sym) returns the original :sym, not a copy.

The internal objects are almost immutable, so there is almost no reason to copy them. Yet, there are esoteric ways to modify them. For example, nil.instance_eval { @i = 1 } would modify nil. A program cannot have another copy of nil to escape such modification. If there was a deep copy of some object that contains nil, then such modification would also affect nil inside such copy.