Chatting While Working on RI Algorithm Implementation

Hyaline @hyaline@hackers.pub

  • Subgraph isomorphism problem is a terrible rabbit hole.
  • Looking at papers like this where they say "our experiment took 556 days!" and many papers mention "we set a time limit of 10 days per case" and so on.
  • I was getting frustrated with the slow search algorithm right before completing the plugin I was developing
  • So I carefully revisited papers I had previously given up on reading(...)
  • I realized that half of what I built was correct and half was just inefficient... but the performance difference was so huge that it made me angry.
  • I decided to adopt and implement RI, which seemed decent among the papers with publicly available code
  • After spending a few days creating the integration part, I just ran it successfully.
  • It's incredibly fast. I'm excited.
  • But of course, the worst cases that I thought "this would probably take months to calculate" still don't finish running, haha ' -';
  • By the way, what appears to be the most recent(?) version of that algorithm is this: ArcMatch and the paper is here
    • I had also considered VF3 as a candidate, but it seemed to require some modifications to check monomorphism (non-induced subgraph isomorphism), so I avoided it ' -' I was thinking about switching to ArcMatch once I got more familiar, and above all, ArcMatch seemed better in terms of scalability.
  • Up to here was what I wrote on BlueSky, but here's a small update.
  • I naively thought that I could just grab the source code and use it without fully understanding the algorithm and code.
  • But if I wanted to tell the editor "that node doesn't look right, don't touch it anymore" based on the editor's circumstances or the plugin's design intentions... ahaha.
  • I had to understand everything completely. Actually, this is a core functionality, so I feel very ashamed for trying to use it without understanding, though unconsciously I was probably expecting that I'd end up understanding everything anyway... encapsulation means you can temporarily forget something in your context buffer while working, not that you don't need to understand what you're using - that's a blatant lie! There's no such thing in this field!
  • I spent 2 days carefully reading the paper and finally understood it.
  • But shockingly, the paper didn't cover the parts of the source code that I couldn't understand.
  • And the code looks like this:
		int psi = -1;
		int si = 0;
		int ci = -1;
		int sip1;
		while(si != -1){
			//steps++;
			if(psi >= si){
				matched[solution[si]] = false;
			}
			ci = -1;
			candidatesIT[si]++;
			while(candidatesIT[si] < candidatesSize[si]){
				//triedcouples++;
				ci = candidates[si][candidatesIT[si]];
				solution[si] = ci;

psi, si, ci, sip1... and in other places there are things like ii too

  • I ended up using the LLM feature in Rider to get code explanations and spent hours struggling to understand. Now it's time to set up debugging to check if my understanding is actually correct. The LLM probably lied to me in some parts.
  • It's the first time in my life seeing tree traversal without recursion, based on arrays and pointers to arrays. It's mind-bogglingly confusing, but I'm also enjoying witnessing this kind of black magic.
  • But the most enjoyable part is this aspect. The feeling of consuming in a short time not quite a day, but still the results that many people worked hard on for a long time is so sweet. I wish I could have told this to my younger self who struggled with studying.

It's really nice that Markdown is available here.

I'm procrastinating on starting to create test cases and setting up debugging. Now I should get back to work ' ㅅ'

2

2 comments

If you have a fediverse account, you can comment on this article from your own instance. Search https://hackers.pub/ap/articles/01977223-28cf-7f5f-bd07-954d0177736f on your instance and reply to it.

0

LLM 이 요약해준 글을 읽고 폭소했어요. 여러 논문들과 코드들을 보며 알고리즘들을 검토한 건 맞는데, 그 과정은 본문에 없단다 아가야 ㄲㄲㄲ

0