The performance of reference counting takes a big dive in multi-threaded systems, which are nearly universal today. Pointer assignments require up to two atomic operations. For modern multithreaded systems, this should clobber performance. Of course, failure to reclaim any data structure that has cycles is pretty serious, too.