Modified Automatic Reference Counting

Note: Still a work in progress, and a fun little thing I decided to think about. If there are any situtations in which this doesn't work or I have not explain a part of this (properly) then please let me know

If you're not familier with ARC, read this article by Swift.

MARC aims to solve the issue of cyclic references without the use of weak or strong pointer

The problem


Let's say you have a struct Person with the following (in a madeup C like language):
struct Person {
   String name;
   Person *friend;
};
And also you've used them like this:
Person alice = {"Alice", NULL};
Person bob = {"bob", NULL};

If you've read the article mentioned above, you will see the problem if we use ARC in this situtation.
If we do
alice.friend = &bob and
bob.friend = &alice we create a circular reference and normal ARC would not be able to garbage collect it since both objects keep referencing each other and prevent the reference count of each other from ever reaching zero.

Solution


If you have again read the article by Swift, you know that using strong and weak pointers prevent this issue.

But MARC doesn't need to use weak pointers

Here is how MARC approaches this issue.

  1. First, it sets the reference count of all objects to 0
  2. Then it starts counting the references for all objects
  3. If object A references B, then the reference count of B is increased and B is added as a "child" of A
  4. If object B also references A, then we have a problem:
    • If neither A or B are referenced by any other object, then both of them are considered dead
    • Both A and B are alive if and only if atleast one of the twos objects are being referenced by another object.
  5. In a programming language, you would implement this by making variables a constant object, that is the variables won't be deleted by MARC. Then the values of the variable will be the objects which can be deleted.

    Pictorically:

    image desribing the previous points

    Suppose you have the following: image for Indirect cyclic reference A -> C -> B -> A

  6. Assuming to start our search from A:
    • If the object which references A (in this case B), has another object (C) which references to itself, which is referenced by A, then the while system is considered dead. To put it more broadly, If an object has atleast one child or descendant (direct or Indirect) which references back to itself, then that descendant tree is considered dead. Unless, there is an object which references a child which references the initial or the initial object itself, then the tree is saved.
      (pretty complicated rule, hopefully the picture will clear it up a bit. I'm still working on figuring this part out clearly)
      Now cases like this: Are prevented
So for the following Bob and Alice problem: image for the bob and alice problem You can note the following:
  1. The objects in square represent the variables bob and alice. They are the so called "constant" objects mentioned earlier. Without these constant objects, the whole structure would be dead.
  2. The constant objects have a reference to one structure Person representing the Person struct for bob and alice which we'll call Bob and Alice respectively.
  3. Each person struct references to a string literal which is their names, and also to the each other which is for their friend field.
MARC would do this:
  1. Set the reference count of all objects to 0
  2. Skip the constant objects (because they represent identifiers (or variable names) alice and bob)
  3. Now for Bob:
    • It would increase the reference count of Bob by 1 (because of the constant that is referencing it)
    • It finds that Alice has a reference to Bob, so it checks if Bob also has a reference to Alice. It does so now it checks if either Bob or Alice have an object which is referencing them which it is not referencing to. In this case finds that constant alice is referencing to Alice, so it increments reference count of Bob by 1
    • It goes over to the "Bob" object and increments its reference count by one.
  4. Same procedure for Alice too:
    • It would increase the reference count of Alice by 1 (because of the constant that is referencing it)
    • It finds that Bob has a reference to Alice, so it checks if Alice also has a reference to Bob. It does so now it checks if either Bob or Alice have an object which is referencing them which it is not referencing to. In this case finds that constant bob is referencing to Bob, so it increments reference count of Alice by 1
    • It goes over to the "Alice" object and increments its reference count by one.

And thats it, it deems that neither of the two objects should be deallocated. So as long as either bob or alice is alive, neither Bob nor Alice will be dead.

Notes


Other way of thinking about it

You could instead of thinking in terms of reference counts, think in terms of reachability like a mark-sweep gc. If A can be reached by a constant object (defined before), and also follows rules 3-6, then it is considered reachable. Otherwise it is considered unreachable and memory is freed. Now I don't know if it can be done at compile-time - it sounds like it should be - but if it is possible then that's pretty cool.


How is figure 6 dead?

We use point 6 for this.

  • Starting from A, lets go to C. C has one child D, D references A, so C indirectly references A, hence the path A -> C is dead and as a consequence D is dead aswell.
  • Now B, has no references direct or indirect references to A it is alive.
  • E has one child D which references A, so the path from E is also considered dead.
And now the tree becomes (read color indicated dead objects/links):