g-cache can use Simics's data profiling support to profile cache misses. Use the add-profiler and remove-profiler commands to attach a profiler to the cache, or detach one, for a specific type of event.
g-cache can profile the following types of events:
For example, if you would like to see which parts of the code are responsible for read and write misses, you could create profilers that count read and write misses per instruction. (This example assumes that your cache is called dc.)
simics> dc.add-profiler type = data-read-miss-per-instruction
[dc] New profiler added for data-read-miss-per-instruction: dc_prof_data-read-miss-per-instruction
simics> dc.add-profiler type = data-write-miss-per-instruction
[dc] New profiler added for data-write-miss-per-instruction: dc_prof_data-write-miss-per-instruction
This creates two profiler objects and attaches them to the proper slots in the cache object. The profilers are initially empty, so we have to run for a while to give them time to collect some interesting data:
simics> c 10_000_000
[cpu0] v:0x0000000000002590 p:0x0000000003c7e590 sll %i0, 1, %o1
Now, we can ask either profiler what data it has gathered:
simics> dc_prof_data_read_miss_per_instruction.address-profile-data
View 0 of dc_prof_data_read_miss_per_instruction: dc prof: data-read-miss-per-instruction
64-bit virtual addresses, profiler granularity 4 bytes
Each cell covers 2 address bits (4 bytes).

column offsets:
               0x1*  0x00  0x04  0x08  0x0c  0x10  0x14  0x18  0x1c
---------------------------------------------------------------------------
0x0000000000002480:     .     .     . 23331     .     .     .     .
0x00000000000024a0:     .     . 15492   545     .     .    90     .
0x00000000000024c0:     .     .     .   112     .     .     .     .
0x00000000000024e0:     .     .     .     .     .     .     .     .
0x0000000000002500:     .     .     .     .     .     .     .     .
0x0000000000002520:     .     .     .     .     .     .     .     .
0x0000000000002540:     .     .     .     .     .     .     .     .
0x0000000000002560:     .     .     .     .     .     .     .     .
0x0000000000002580:     .     .     .     .     .     .     .     .
0x00000000000025a0:     .     .     .     .     .     .     .     .
0x00000000000025c0:     .     .     .     .     .     .     .     .
0x00000000000025e0:     .     .     .     .     .     .   136     .

39706 counts shown. 0 not shown.
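Each cell in this table corresponds to one 4-byte instruction slot: its address is the row's base address plus the cell's position times the profiler granularity. The following Python sketch illustrates that arithmetic on a few rows copied from the output above. It is not part of Simics; the helper name `nonzero_cells` and the idea of post-processing the printed text are assumptions made for illustration only.

```python
# Rows copied from the address-profile-data output above (empty rows omitted).
SAMPLE = """\
0x0000000000002480: . . . 23331 . . . .
0x00000000000024a0: . . 15492 545 . . 90 .
0x00000000000024c0: . . . 112 . . . .
0x00000000000025e0: . . . . . . 136 .
"""

GRANULARITY = 4  # bytes per cell, from "profiler granularity 4 bytes"


def nonzero_cells(text, granularity=GRANULARITY):
    """Hypothetical helper: return {address: count} for every non-empty cell."""
    counts = {}
    for line in text.splitlines():
        base_text, sep, cells = line.partition(":")
        if not sep:
            continue  # skip lines that are not table rows
        base = int(base_text, 16)
        for index, cell in enumerate(cells.split()):
            if cell != ".":
                # cell address = row base + column index * granularity
                counts[base + index * granularity] = int(cell)
    return counts


counts = nonzero_cells(SAMPLE)
# List the addresses with the most misses first.
for address, count in sorted(counts.items(), key=lambda item: -item[1]):
    print(f"{address:#018x}  {count}")
print("total:", sum(counts.values()))
```

With the rows above, the cell holding 136 maps to address 0x25f8, and the six non-empty cells add up to the 39706 counts reported at the bottom of the table.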
Since these two profilers are indexed by instruction address, it also makes sense to display their counts in a disassembly listing:
simics> cpu0.aprof-views add = dc_prof_data_read_miss_per_instruction
simics> cpu0.aprof-views add = dc_prof_data_write_miss_per_instruction
simics> cpu0.disassemble %pc 32
v:0x0000000000002590 p:0x0000000003c7e590   0 0  sll %i0, 1, %o1
v:0x0000000000002594 p:0x0000000003c7e594   0 0  lduh [%o1 + %o2], %o1
v:0x0000000000002598 p:0x0000000003c7e598   0 0  and %l0, %o1, %o1
v:0x000000000000259c p:0x0000000003c7e59c   0 0  sll %o1, 16, %i0
v:0x00000000000025a0 p:0x0000000003c7e5a0   0 0  sra %i0, 16, %i0
v:0x00000000000025a4 p:0x0000000003c7e5a4   0 0  jmpl [%i7 + 8], %g0
v:0x00000000000025a8 p:0x0000000003c7e5a8   0 0  restore %g0, %g0, %g0
v:0x00000000000025ac p:0x0000000003c7e5ac   0 0  jmpl [%i7 + 8], %g0
v:0x00000000000025b0 p:0x0000000003c7e5b0   0 0  restore %g0, -1, %o0
v:0x00000000000025b4 p:0x0000000003c7e5b4   0 0  illtrap 0
v:0x00000000000025b8 p:0x0000000003c7e5b8   0 0  illtrap 0
v:0x00000000000025bc p:0x0000000003c7e5bc   0 0  illtrap 0
v:0x00000000000025c0 p:0x0000000003c7e5c0   0 0  illtrap 0
v:0x00000000000025c4 p:0x0000000003c7e5c4   0 0  illtrap 0
v:0x00000000000025c8 p:0x0000000003c7e5c8   0 0  save %sp, -96, %sp
v:0x00000000000025cc p:0x0000000003c7e5cc   0 0  sethi %hi(0x41c00), %o0
v:0x00000000000025d0 p:0x0000000003c7e5d0   0 0  add %o0, 840, %o1
v:0x00000000000025d4 p:0x0000000003c7e5d4   0 0  lduw [%o1 + 0], %o0
v:0x00000000000025d8 p:0x0000000003c7e5d8   0 0  subcc %o0, 1, %o0
v:0x00000000000025dc p:0x0000000003c7e5dc   0 0  stw %o0, [%o1 + 0]
v:0x00000000000025e0 p:0x0000000003c7e5e0   0 0  bpos 0x25f0
v:0x00000000000025e4 p:0x0000000003c7e5e4   0 0  mov -1, %i0
v:0x00000000000025e8 p:0x0000000003c7e5e8   0 0  jmpl [%i7 + 8], %g0
v:0x00000000000025ec p:0x0000000003c7e5ec   0 0  restore %g0, %g0, %g0
v:0x00000000000025f0 p:0x0000000003c7e5f0   0 0  sethi %hi(0x4f000), %o0
v:0x00000000000025f4 p:0x0000000003c7e5f4   0 0  add %o0, 616, %o0
v:0x00000000000025f8 p:0x0000000003c7e5f8 136 0  lduw [%o0 + 0], %o2
v:0x00000000000025fc p:0x0000000003c7e5fc   0 0  add %o2, 1, %o1
v:0x0000000000002600 p:0x0000000003c7e600   0 0  stw %o1, [%o0 + 0]
v:0x0000000000002604 p:0x0000000003c7e604   0 0  ldub [%o2 + 0], %o2
v:0x0000000000002608 p:0x0000000003c7e608   0 0  and %o2, 255, %i0
v:0x000000000000260c p:0x0000000003c7e60c   0 0  jmpl [%i7 + 8], %g0
In the listing above, we see that in the 32 instructions following the current program counter, one load instruction is responsible for 136 read misses, and no instructions have caused any write misses.
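For a longer listing, picking out the instructions with non-zero counts by eye gets tedious. As a rough sketch, a captured listing could be filtered with a few lines of Python. This is not a Simics feature: the function name `hot_instructions` is hypothetical, and the code assumes that the two count columns appear in the order the views were added (read misses first, then write misses). The sample lines are copied from the listing above.

```python
import re

# Three lines copied from the disassembly listing above.
LISTING = """\
v:0x00000000000025f4 p:0x0000000003c7e5f4 0 0 add %o0, 616, %o0
v:0x00000000000025f8 p:0x0000000003c7e5f8 136 0 lduw [%o0 + 0], %o2
v:0x00000000000025fc p:0x0000000003c7e5fc 0 0 add %o2, 1, %o1
"""

# Assumed line layout: virtual address, physical address,
# read-miss count, write-miss count, then the instruction text.
LINE = re.compile(r"v:(0x[0-9a-f]+)\s+p:(0x[0-9a-f]+)\s+(\d+)\s+(\d+)\s+(.*)")


def hot_instructions(listing):
    """Hypothetical helper: return (vaddr, reads, writes, text) for
    every instruction with at least one recorded miss."""
    hot = []
    for line in listing.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # not a disassembly line
        vaddr, _paddr, reads, writes, text = match.groups()
        reads, writes = int(reads), int(writes)
        if reads or writes:
            hot.append((int(vaddr, 16), reads, writes, text))
    return hot


for vaddr, reads, writes, text in hot_instructions(LISTING):
    print(f"{vaddr:#x}: {reads} read misses, {writes} write misses: {text}")
```

On the sample lines, only the lduw instruction at 0x25f8 is reported, which matches the single non-zero entry in the listing.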
For more on getting information out of profilers, see section 13.3.