V5.2.0 - Segmentation fault

Attempting to sync an old node with quite a few (maybe 10) z-addrs on Ubuntu 20.04.4.

During ‘Rescanning’ it died with a segfault, after approximately five days.

There is nothing in debug.log to indicate a problem; it died somewhere around block 1756361.

Found this in /var/log/syslog :-

zcashd[151478]: segfault at 18 ip 000055f413a755b6 sp 00007ffcad084150 error 4 in zcashd[55f413a69000+d09000]
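
For anyone trying to read that syslog line, here is a minimal Python sketch that splits it into its fields. It assumes the usual x86 Linux format (faulting address, instruction pointer, stack pointer, page-fault error code, then the mapping as `name[start+size]`) and the usual meaning of the low three error-code bits. The function name `decode_segfault` is made up for illustration.

```python
import re

# Kernel segfault line copied from the syslog excerpt above.
LINE = ("zcashd[151478]: segfault at 18 ip 000055f413a755b6 "
        "sp 00007ffcad084150 error 4 in zcashd[55f413a69000+d09000]")

# All numeric fields in this line are hexadecimal.
PATTERN = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) "
    r"sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+) "
    r"in (?P<obj>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into fields (hypothetical helper)."""
    m = PATTERN.search(line)
    if m is None:
        raise ValueError("not a kernel segfault line")
    err = int(m["err"], 16)
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        # Distance of the instruction pointer from the start of the mapping.
        "offset": int(m["ip"], 16) - int(m["base"], 16),
        # x86 page-fault error code bits (assumed standard meaning).
        "protection_violation": bool(err & 1),  # 0 means page not present
        "write": bool(err & 2),                 # 0 means read access
        "user_mode": bool(err & 4),
    }


info = decode_segfault(LINE)
print(hex(info["fault_addr"]), hex(info["offset"]), info)
```

Read this way, "error 4" at address 0x18 looks like a user-mode read of an unmapped page very close to zero, which is what a null-pointer dereference typically looks like. The offset is relative to the mapping the kernel printed; whether it lines up with an address that a tool such as addr2line can resolve depends on how the binary was built and mapped, so treat it as a starting point only.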

zcash.conf has nothing special :-

rpcuser=deleted
rpcpassword=deleted
rpcport=8232
mainnet=1
addnode=mainnet.z.cash

EDIT: 12 core CPU, 16 GB RAM; it's a reasonably specced machine.

After restarting zcashd, it resumed the rescan from where it died (instead of starting from scratch).

I suspect it segfaulted when it completed scanning all previously downloaded blocks: it's at the approximate height of its last sync, and after the restart it immediately started fetching new blocks.

The mystery continues…

Checked for progress this morning: it had synced to block 1767335 during the night and then quietly died.

Nothing in debug.log to indicate why; the only strange thing was that the zcashd terminal display had a ‘Killed’ message.

Added a LOT more swap space just in case it's a resource issue and restarted; rescanning resumed from block 1758108.

I have been experiencing something similar on my laptop with v5.2.0. What’s odd is that we had no issues on the ZGo server: there, moving to v5.2.0 was just an upgrade and a restart of zcashd, and everything worked.

After having a bunch of segfaults, I decided to replicate the exact config from the server on the laptop. Now the node has been downloading the new blocks, and since it got past block 1730000 (the chunky blocks), from time to time the system kills zcashd with an out-of-memory error.

It does make progress in the download and indexing with every restart, so I’m hoping to catch up soon.

This sounds like it could be your issue or related:

If you have any other details such as machine config (cpu, memory, disk) that would be helpful.

Since the increase in shielded transactions, the blockchain is significantly larger, requiring more disk space to store as well as more memory to sync / scan due to the size of the blocks.

I just checked: 42 GB free space on this machine, with full disk encryption.

Spec: 16 GB RAM, 12 GB swap, 12 core CPU, Ubuntu 20.04.

Currently rescanning block 1759788; it's doing approximately 4 blocks a minute.

This is what I get as it’s processing blocks:

[522788.846566] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-3.scope,task=zcashd,pid=273096,uid=1000
[522788.846635] Out of memory: Killed process 273096 (zcashd) total-vm:11172624kB, anon-rss:9272416kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:18972kB oom_score_adj:0
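
To put those kB figures into more familiar units, here is a small Python sketch that pulls the process name, pid and memory counters out of an "Out of memory: Killed process" line. It only assumes the line layout shown above; `parse_oom_kill` and `kib_to_gib` are hypothetical helper names.

```python
import re

# OOM-killer line copied from the kernel log above.
OOM_LINE = ("[522788.846635] Out of memory: Killed process 273096 (zcashd) "
            "total-vm:11172624kB, anon-rss:9272416kB, file-rss:0kB, "
            "shmem-rss:0kB, UID:1000 pgtables:18972kB oom_score_adj:0")


def parse_oom_kill(line):
    """Extract pid, process name and the kB counters from an OOM kill line."""
    head = re.search(r"Killed process (\d+) \(([^)]+)\)", line)
    if head is None:
        raise ValueError("not an OOM kill line")
    # Every counter of the form name:<digits>kB, e.g. anon-rss:9272416kB.
    counters = {k: int(v) for k, v in re.findall(r"([a-z-]+):(\d+)kB", line)}
    return {"pid": int(head.group(1)), "name": head.group(2), "kib": counters}


def kib_to_gib(kib):
    """Convert the kernel's kB (1024-byte units) to GiB."""
    return kib / (1024 * 1024)


info = parse_oom_kill(OOM_LINE)
print(info["name"], info["pid"],
      f"anon-rss = {kib_to_gib(info['kib']['anon-rss']):.2f} GiB",
      f"total-vm = {kib_to_gib(info['kib']['total-vm']):.2f} GiB")
```

The anon-rss counter is the one to watch here, since it is the resident, non-file-backed memory that zcashd was actually holding when it was killed.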

For a while it will process blocks with more or less flat RAM usage, but at some point the usage starts growing, then pegs, and then zcashd gets killed.

I do not have that much swap though.

Checking the logs again, I found this from the last attempt: same problem.

I added 8 GB of swap this time around.

syslog:Aug 11 07:23:46 radium kernel: [45873.028663] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-2.scope,task=zcashd,pid=2192,uid=1000
syslog:Aug 11 07:23:46 radium kernel: [45873.028785] Out of memory: Killed process 2192 (zcashd) total-vm:22411108kB, anon-rss:15580508kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:40664kB oom_score_adj:0
syslog:Aug 11 07:23:46 radium kernel: [45874.342518] oom_reaper: reaped process 2192 (zcashd), now anon-rss:1024kB, file-rss:0kB, shmem-rss:0kB

EDIT: I wonder if this is related to the number of keys in the node wallet? This node has at least five, plus a couple of sprout keys !!

I also managed to sync my Pi4 node with v5.2.0, however that has no keys.

EDIT: 18 hrs later…rescanning block 1764884…still going…‘top’ says it's using 23.7% of memory…
EDIT: 1765740…still going
EDIT: 1766037…still going

I wonder what causes the change from more or less flat RAM usage, to the accumulating RAM usage that eventually pegs.

Hi there.
This is my issue:

So what should I do? It doesn’t seem like an out-of-memory problem to me.
Help)

Every launch after the first crash requires a rescan of just one block, and then it dies.

For reference, I encountered this issue but was able to sidestep it as follows:

My old wallet.dat was unable to sync on my laptop (latest Ubuntu, 16 GB RAM). Something happens with zcashd that causes it to suck up all memory until it crashes. I don't have the exact message, but according to dmesg it was for sure zcashd. So I started a new node and it was able to sync (empty wallet). Afterwards I imported all my private keys and the -rescan completed successfully. Note: I didn’t have any funds in any UAs at the time, so I could do this. If you have funds in any UA, you’ll need to wait for the upcoming seed phrase import tool.

It completed the ‘rescanning’ and is now catching up with the network, currently four days behind.

EDIT: Once the ‘Rescanning’ was complete I was able to get a list of all addresses & the keys with z_exportkey, so at least if it all goes to hell I can recover the funds. There were nine sapling keys, which is eight-too-many in current conditions.

Machine has 4.1 GB of memory in use; approx 23.8% of total is zcashd.

Fingers crossed.

EDIT: Memory in use is now 6.4 GB, at block 1767337
EDIT: Memory in use 8.24 GB, at block 1768319
EDIT: Memory 10.2 GB at block 1769410
EDIT: Memory 12.2 GB, block 1770902
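
As a rough way to quantify that growth, here is a Python sketch that fits a straight line through the four (block height, memory in use) readings from the EDIT lines above. It assumes the readings are machine-wide memory in use in gigabytes and that growth is roughly linear over this range; neither assumption is confirmed, and `growth_per_1000_blocks` is a made-up helper name.

```python
# (block height, memory in use in gigabytes) from the EDIT lines above.
SAMPLES = [
    (1767337, 6.4),
    (1768319, 8.24),
    (1769410, 10.2),
    (1770902, 12.2),
]


def growth_per_1000_blocks(samples):
    """Least-squares slope of memory vs. height, scaled to 1000 blocks."""
    n = len(samples)
    mean_h = sum(h for h, _ in samples) / n
    mean_m = sum(m for _, m in samples) / n
    cov = sum((h - mean_h) * (m - mean_m) for h, m in samples)
    var = sum((h - mean_h) ** 2 for h, _ in samples)
    return 1000 * cov / var


rate = growth_per_1000_blocks(SAMPLES)
print(f"memory in use grows by about {rate:.2f} GB per 1000 blocks")
```

A steady slope like this, rather than a plateau, would fit the "flat, then accumulating, then pegged" pattern described earlier in the thread, though four data points are not much to go on.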

For the record, the trick I did to skip the rescan was to copy wallet.dat to a new directory, make a blank zcash.conf and restart zcashd there.
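
For anyone wanting to try the same thing, the file-handling part could be sketched in Python like this. It only covers the copy and the empty config; the paths are placeholders, `prepare_fresh_datadir` is a hypothetical helper, and it assumes zcashd is fully stopped before wallet.dat is copied.

```python
import shutil
from pathlib import Path


def prepare_fresh_datadir(old_datadir, new_datadir):
    """Copy wallet.dat into a new, otherwise empty data directory.

    Assumes zcashd is NOT running, so wallet.dat is not being written to.
    """
    old_wallet = Path(old_datadir) / "wallet.dat"
    new_dir = Path(new_datadir)
    # Refuse to reuse an existing directory so nothing gets overwritten.
    new_dir.mkdir(parents=True, exist_ok=False)
    shutil.copy2(old_wallet, new_dir / "wallet.dat")
    # Blank config file, as described in the post above.
    (new_dir / "zcash.conf").touch()
    return new_dir


if __name__ == "__main__":
    # Placeholder paths; adjust to the actual data directories.
    target = Path.home() / "zcash-fresh"
    if not target.exists():
        print(prepare_fresh_datadir(Path.home() / ".zcash", target))
```

zcashd would then need to be pointed at the new directory (for example with its -datadir option). The original wallet.dat is left untouched, which also gives you a backup if anything goes wrong.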

Now at 1769155… don’t jinx it !!! :wink:

EDIT: Synced !!! Finally

I cannot export private keys while rescanning is in progress.

Yep, that’s the way it is, you have to wait for the rescan to complete.

I think a tool to extract the keys from wallet.dat would be useful. There must be nodes out there with too many keys, or on slow hardware, that are now impossible to sync or rescan within a reasonable time.

Sorry for my English. As I said, the wallet crashed the first time. Every launch since then requires a rescan of just one (1) block, and then it crashes. It seems to crash right when the rescan finishes, so it is impossible to complete the rescan.

2022-08-10T08:17:45.562280Z  INFO Init: main: init message: Rescanning...
2022-08-10T08:17:45.562287Z  INFO Init: main: CWallet::InitLoadWallet(): Rescanning last 1 blocks (from block 1733601)...
2022-08-10T08:17:45.562292Z  INFO Init: main: CWallet::ScanForWalletTransactions(): Rewinding Orchard wallet to height 1733600; current is 1733601
2022-08-10T08:20:53.671067Z  INFO Init: main: Still rescanning. At block 1733602. Progress=0.979533

{here it stops}

And then I see the same thing on every launch.
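
To keep an eye on whether a rescan like this is still moving, the "Still rescanning" lines can be picked out of debug.log. A minimal Python sketch, assuming the line format shown in the excerpt above (`rescan_progress` is a hypothetical helper name):

```python
import re

# Two of the lines from the debug.log excerpt above.
LOG = """\
2022-08-10T08:17:45.562287Z  INFO Init: main: CWallet::InitLoadWallet(): Rescanning last 1 blocks (from block 1733601)...
2022-08-10T08:20:53.671067Z  INFO Init: main: Still rescanning. At block 1733602. Progress=0.979533
"""

PROGRESS = re.compile(
    r"^(\S+)\s+INFO .*Still rescanning\. At block (\d+)\. Progress=([\d.]+)"
)


def rescan_progress(log_text):
    """Return (timestamp, block height, progress) for each progress line."""
    found = []
    for line in log_text.splitlines():
        m = PROGRESS.match(line)
        if m:
            found.append((m.group(1), int(m.group(2)), float(m.group(3))))
    return found


for timestamp, height, progress in rescan_progress(LOG):
    print(timestamp, height, f"{progress:.1%}")
```

If the last reported height stops changing for a long time while the process is still alive, the node may just be very busy rather than dead.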

Try leaving it for a very long time, maybe days.

It took my node almost a week and there were times when it looked like it was dead but was just very busy.

What version of Rust are you using? My clean node (all v5.2) is on rustc 1.25 (synced, no problems) and I’m catching one up now with a few addresses on 1.62, so I'll let you know.

Oh wait zcashd installs rust so that version may be different… Nevermind!

and…Segmentation fault…again !

All I did was restart the node after a clean shutdown with ‘zcash-cli stop’