V5.2.0 - Segmentation fault

ChileBob · August 10, 2022, 10:34pm

Attempting to sync an old node with quite a few (maybe 10) zaddrs on Ubuntu 20.04.4

During ‘Rescanning’ it died with a segfault; this was after approx five days.

There is nothing in debug.log to indicate a problem, it died somewhere around blk 1756361.

Found this in /var/log/syslog :-

zcashd[151478]: segfault at 18 ip 000055f413a755b6 sp 00007ffcad084150 error 4 in zcashd[55f413a69000+d09000]
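For anyone trying to read that syslog line, here is a minimal sketch that pulls the fields apart. It assumes the usual x86 Linux layout of the message (faulting address, instruction pointer, stack pointer, page-fault error bits, then the mapping base and size in brackets). The function name `decode_segfault` is made up for illustration. The resulting offset is only relative to the mapping the kernel reported; turning it into a function name would still need the exact zcashd binary and its symbols.

```python
import re

# Pattern for the x86 Linux "segfault at ..." kernel message quoted above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>\d+) in (?P<obj>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)


def decode_segfault(line):
    """Split a kernel segfault line into its fields (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    err = int(m["err"])
    return {
        "object": m["obj"],
        "fault_addr": int(m["addr"], 16),
        # Instruction pointer relative to the start of the reported mapping.
        "offset_in_mapping": ip - base,
        # x86 page-fault error bits: 1 = page was present,
        # 2 = write access, 4 = fault happened in user mode.
        "page_present": bool(err & 1),
        "write_access": bool(err & 2),
        "user_mode": bool(err & 4),
    }


line = ("zcashd[151478]: segfault at 18 ip 000055f413a755b6 "
        "sp 00007ffcad084150 error 4 in zcashd[55f413a69000+d09000]")
info = decode_segfault(line)
print(hex(info["fault_addr"]), hex(info["offset_in_mapping"]), info)
```

Read this way, the line describes a user-mode read of an unmapped address very close to zero (0x18), which is the pattern a null pointer dereference typically leaves. That is an interpretation, not a diagnosis.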

zcash.conf has nothing special :-

rpcuser=deleted
rpcpassword=deleted
rpcport=8232
mainnet=1
addnode=mainnet.z.cash

EDIT: 12-core CPU, 16 GB RAM; it’s a reasonable spec machine.

After restarting zcashd, it resumed the rescan from where it died (instead of starting from scratch).

I suspect it segfaulted when it completed scanning all previously downloaded blocks: it’s at the approx height of its last sync, and after the restart it immediately started fetching new blocks.

ChileBob · August 11, 2022, 1:12pm

The mystery continues…

Checked for progress this morning, it had synced to block 1767335 during the night and then quietly died.

Nothing in debug.log to indicate why, only strange thing was the zcashd terminal display had a ‘Killed’ message.

Added a LOT more swap space just in case it’s a resources issue & restarted; rescanning resumed from block 1758108.

pitmutt · August 11, 2022, 1:43pm

I have been experiencing something similar on my laptop with v5.2.0. What’s odd is that we had no issues on the ZGo server: for v5.2.0 we just upgraded, restarted zcashd, and everything worked.

After a bunch of segfaults, I decided to replicate the exact config from the server on the laptop. Now the node has been downloading the new blocks, and once it got past block 1730000 (the chunky blocks), the system from time to time kills zcashd with an out-of-memory error.

It does make progress in the download and indexing with every restart, I’m hoping to catch up soon.

steven-ecc · August 11, 2022, 5:17pm

This sounds like it could be your issue or related:

If you have any other details such as machine config (cpu, memory, disk) that would be helpful.

Since the increase in shielded transactions the blockchain is significantly larger requiring more disk space to store as well as memory to sync / scan due to the size of blocks.

ChileBob · August 11, 2022, 6:42pm

I just checked: 42 GB free space on this machine, with full disk encryption.

Spec: 16 GB RAM, 12 GB swap, 12-core CPU, Ubuntu 20.04.

Currently rescanning blk 1759788; it’s doing approx 4 blocks a minute.
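As a rough sanity check on how long that might take, here is a small linear estimate. The current height and the rate come from the post above. The target height of 1767335 is an assumption: it is the height the node was reported to have synced to earlier in the thread, and the real rescan end point may differ. The rate also will not stay constant, so treat the result as an order of magnitude only.

```python
def rescan_eta_hours(current_height, target_height, blocks_per_minute):
    """Linear estimate of the remaining rescan time, in hours."""
    if blocks_per_minute <= 0:
        raise ValueError("rate must be positive")
    remaining_blocks = max(target_height - current_height, 0)
    return remaining_blocks / blocks_per_minute / 60


# 1759788 and 4 blocks/minute are from the post;
# 1767335 is an assumed target (last synced height mentioned earlier).
eta = rescan_eta_hours(1759788, 1767335, 4)
print(f"about {eta:.1f} hours left at this rate")
```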

pitmutt · August 11, 2022, 7:41pm

This is what I get as it’s processing blocks:

[522788.846566] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-3.scope,task=zcashd,pid=273096,uid=1000
[522788.846635] Out of memory: Killed process 273096 (zcashd) total-vm:11172624kB, anon-rss:9272416kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:18972kB oom_score_adj:0

For a while it will process blocks with more or less flat RAM usage, but at some point the usage starts growing, then pegs, then keels over.

I do not have that much swap though.

ChileBob · August 11, 2022, 7:58pm

Checking logs again & I found this from the last attempt - same problem.

I added 8Gb swap this time around.

syslog:Aug 11 07:23:46 radium kernel: [45873.028663] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=/,mems_allowed=0,global_oom,task_memcg=/user.slice/user-1000.slice/session-2.scope,task=zcashd,pid=2192,uid=1000
syslog:Aug 11 07:23:46 radium kernel: [45873.028785] Out of memory: Killed process 2192 (zcashd) total-vm:22411108kB, anon-rss:15580508kB, file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:40664kB oom_score_adj:0
syslog:Aug 11 07:23:46 radium kernel: [45874.342518] oom_reaper: reaped process 2192 (zcashd), now anon-rss:1024kB, file-rss:0kB, shmem-rss:0kB
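To compare the two OOM reports in this thread, here is a minimal sketch that extracts the memory figures from the "Out of memory: Killed process" lines and converts them to GiB. It assumes the kernel's "kB" means units of 1024 bytes, which is my understanding of how these counters are printed. The helper name `parse_oom_kill` is made up for illustration.

```python
import re

# Matches the kernel's "Out of memory: Killed process ..." summary line.
OOM_RE = re.compile(
    r"Out of memory: Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, anon-rss:(?P<anon_rss>\d+)kB"
)

KIB_PER_GIB = 1024 * 1024  # assumes the kernel's "kB" are 1024-byte units


def parse_oom_kill(line):
    """Return pid, process name and memory sizes in GiB, or None if no match."""
    m = OOM_RE.search(line)
    if m is None:
        return None
    return {
        "pid": int(m["pid"]),
        "name": m["name"],
        "total_vm_gib": int(m["total_vm"]) / KIB_PER_GIB,
        "anon_rss_gib": int(m["anon_rss"]) / KIB_PER_GIB,
    }


# The two kill lines quoted in this thread.
lines = [
    "[522788.846635] Out of memory: Killed process 273096 (zcashd) "
    "total-vm:11172624kB, anon-rss:9272416kB, file-rss:0kB, shmem-rss:0kB, "
    "UID:1000 pgtables:18972kB oom_score_adj:0",
    "syslog:Aug 11 07:23:46 radium kernel: [45873.028785] Out of memory: "
    "Killed process 2192 (zcashd) total-vm:22411108kB, anon-rss:15580508kB, "
    "file-rss:0kB, shmem-rss:0kB, UID:1000 pgtables:40664kB oom_score_adj:0",
]
kills = [k for k in map(parse_oom_kill, lines) if k is not None]
for k in kills:
    print(f"pid {k['pid']} ({k['name']}): "
          f"resident {k['anon_rss_gib']:.2f} GiB, virtual {k['total_vm_gib']:.2f} GiB")
```

In both cases zcashd's resident memory was close to the machine's physical RAM when it was killed, which fits the "grows, then pegs" description.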

EDIT: I wonder if this is related to the number of keys in the node wallet? This node has at least five, and a couple of sprout keys !!

I also managed to sync my Pi4 node with v5.2.0, however that has no keys.

EDIT: 18 hrs later…rescanning block 1764884…still going…‘top’ says it’s using 23.7% of memory…
EDIT: 1765740…still going
EDIT: 1766037…still going

pitmutt · August 12, 2022, 2:21pm

I wonder what causes the change from more or less flat RAM usage, to the accumulating RAM usage that eventually pegs.

freebeego · August 12, 2022, 5:46pm

Hi there.
This is my issue:

So what should I do? It doesn’t seem like an out-of-memory issue to me.
Help)

freebeego · August 12, 2022, 5:50pm

Every launch after the first crash rescans just one block and then dies.

dismad · August 12, 2022, 6:26pm

For reference I encountered this issue but was able to side step it by the following:

github.com/zcash/zcash

Zcashd v5.1.0 stop unexpectedly

opened 01:31AM - 17 Jul 22 UTC

melvjoshua

Hi Team, I have updated my Zcashd 5.0.0 to 5.1.0 and it always stop unexpectedly… without any error. The Zcashd always stop around block number 1736000, when I restart. I am using 2 CPU, 8 GB RAM. Please help

Here is the logs before it stopped.

Try 1 :

```
2022-07-15T10:27:05.253688Z INFO ProcessNewBlock: main: UpdateTip: new best hash=0000000001090616d4a2bcb276efaef094ed5503c00cde0156a86df17f5c6fcb height=1736700 bits=469880354 log2_work=59.291206 tx=11090280 date=2022-07-14 00:47:48 progress=0.999029 cache=62.3MiB(10251tx)
2022-07-15T10:27:05.398398Z INFO ProcessNewBlock: main: Leaving block file 335: CBlockFileInfo(blocks=0, size=0, heights=0...0, time=1970-01-01...1970-01-01)
2022-07-15T10:27:05.403544Z INFO ProcessNewBlock: main: Pre-allocating up to position 0x1000000 in blk00335.dat
2022-07-15T10:27:11.748017Z INFO ProcessNewBlock: main: Pre-allocating up to position 0x100000 in rev00335.dat
2022-07-15T10:27:11.748531Z INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000005e66bb0fd75667d0c397dd90e2623a2226fdcb9a421aad40427ccf height=1736701 bits=469880238 log2_work=59.291207 tx=11090301 date=2022-07-14 00:49:47 progress=0.999030 cache=62.3MiB(10254tx)
2022-07-15T10:27:17.815329Z INFO ProcessNewBlock: main: UpdateTip: new best hash=0000000000a4d45190167f77ca8d68b404dcd41e58d56d0c370b557d76a582f2 height=1736702 bits=469878619 log2_work=59.291208 tx=11090321 date=2022-07-14 00:51:53 progress=0.999031 cache=62.3MiB(10256tx)
2022-07-15T10:27:23.937632Z INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000006dc2ab38de648d5807857565d206be9b8016ade4d299a0731a31e8 height=1736703 bits=469876451 log2_work=59.291209 tx=11090339 date=2022-07-14 00:52:21 progress=0.999031 cache=62.3MiB(10257tx)
2022-07-15T10:27:28.228763Z INFO ProcessNewBlock: main: UpdateTip: new best hash=0000000000133ad55cf86ca481eb1e69b9e66ddeb231e720a0121c638d996222 height=1736704 bits=469877384 log2_work=59.291211 tx=11090354 date=2022-07-14 00:54:00 progress=0.999032 cache=62.3MiB(10258tx)
2022-07-15T10:27:32.087270Z INFO ProcessNewBlock: main: UpdateTip: new best hash=000000000007cffaf03d5c4d18d83c16c8378fa3c09ed5e87d17ef6df8733034 height=1736705 bits=469885598 log2_work=59.291212 tx=11090358 date=2022-07-14 00:54:13 progress=0.999032 cache=62.3MiB(10259tx)
2022-07-15T10:27:35.641771Z INFO ProcessNewBlock: main: UpdateTip: new best hash=000000000134ac074d8882eb11de0b23a54bb794986eaa7f4cfba63c7f0e5f25 height=1736706 bits=469888448 log2_work=59.291213 tx=11090361 date=2022-07-14 00:54:34 progress=0.999032 cache=62.3MiB(10260tx)
```

Try 2 :

```
2022-07-16T02:00:30.693986Z INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000009d7d5ddc0d9881a3f09d13c86f9a6aaecf3eca7adb7e3f147deea6 height=1736695 bits=469871063 log2_work=59.291199 tx=11090192 date=2022-07-14 00:35:37 progress=0.998575 cache=25.8MiB(5895tx)
2022-07-16T02:00:36.305604Z INFO ProcessNewBlock: main: UpdateTip: new best hash=0000000000cb22bfbb0f329782b031739255cbbf2be0f57d5d697cf0efb3fefa height=1736696 bits=469870903 log2_work=59.2912 tx=11090211 date=2022-07-14 00:36:35 progress=0.998575 cache=25.8MiB(5896tx)
2022-07-16T02:00:36.468017Z INFO ProcessNewBlock: main: Pre-allocating up to position 0x2000000 in blk00335.dat
2022-07-16T02:00:40.925004Z INFO ProcessNewBlock: main: UpdateTip: new best hash=0000000000b0306ba8bcdffa84f7c283849d9934bf92a50ee1c1909249cc6c2e height=1736697 bits=469871522 log2_work=59.291202 tx=11090227 date=2022-07-14 00:37:05 progress=0.998576 cache=25.8MiB(5898tx)
2022-07-16T02:00:46.419291Z INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000010175da26adbf463243094471f91f6b0d77a39092d155d5ee8dbda2 height=1736698 bits=469871721 log2_work=59.291203 tx=11090243 date=2022-07-14 00:38:31 progress=0.998576 cache=25.9MiB(5903tx)
2022-07-16T02:00:46.577674Z INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000019f7506a47115f044cb8b7c595c31a50f6e5781c427bb7b8e32109b height=1736699 bits=469876889 log2_work=59.291204 tx=11090251 date=2022-07-14 00:45:26 progress=0.998580 cache=25.9MiB(5908tx)
2022-07-16T02:00:52.073932Z INFO ProcessNewBlock: main: UpdateTip: new best hash=0000000001090616d4a2bcb276efaef094ed5503c00cde0156a86df17f5c6fcb height=1736700 bits=469880354 log2_work=59.291206 tx=11090280 date=2022-07-14 00:47:48 progress=0.998581 cache=25.9MiB(5909tx)
2022-07-16T02:00:57.504939Z INFO ProcessNewBlock: main: UpdateTip: new best hash=00000000005e66bb0fd75667d0c397dd90e2623a2226fdcb9a421aad40427ccf height=1736701 bits=469880238 log2_work=59.291207 tx=11090301 date=2022-07-14 00:49:47 progress=0.998582 cache=25.9MiB(5912tx)
2022-07-16T02:01:03.052673Z INFO ProcessNewBlock: main: UpdateTip: new best hash=0000000000a4d45190167f77ca8d68b404dcd41e58d56d0c370b557d76a582f2 height=1736702 bits=469878619 log2_work=59.291208 tx=11090321 date=2022-07-14 00:51:53 progress=0.998583 cache=25.9MiB(5914tx)
2022-07-16T02:01:05.298191Z INFO ProcessNewBlock: main: Pre-allocating up to position 0x3000000 in blk00335.dat
```

My old wallet.dat was unable to sync on my laptop (latest Ubuntu, 16 GB RAM). Something happens with zcashd that causes it to consume all memory until it crashes. I don’t have the exact message, but dmesg showed it was definitely zcashd. So I started a new node with an empty wallet, and it was able to sync. Afterwards I imported all my private keys and it rescanned (-rescan) successfully. Note: I didn’t have any funds in any UAs at the time, so I could do this. If you have funds in any UA, you’ll need to wait for the upcoming seed phrase import tool.

ChileBob · August 12, 2022, 8:43pm

It completed the ‘rescanning’ and is now catching up with the network, currently four days behind.

EDIT: Once the ‘Rescanning’ was complete I was able to get a list of all addresses & the keys with z_exportkey, so at least if it all goes to hell I can recover the funds. There were nine sapling keys, which is eight-too-many in current conditions.

Machine has 4.1Gb of memory in use, approx 23.8% of total is zcashd.

Fingers crossed.

EDIT: Memory in use is now 6.4Gb, at block 1767337
EDIT: Memory in use 8.24Gb, at block 1768319
EDIT: Memory 10.2Gb at block 1769410
EDIT: Memory 12.2Gb, block 1770902
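The four readings above can be turned into a rough growth rate. This is only a sketch: the numbers are machine-wide "memory in use" figures taken from the EDITs, not zcashd's own footprint, and a straight line through four points says nothing about the cause.

```python
# (block height, memory in use in GB) as reported in the EDITs above.
samples = [
    (1767337, 6.4),
    (1768319, 8.24),
    (1769410, 10.2),
    (1770902, 12.2),
]


def growth_per_1000_blocks(points):
    """Average memory growth (GB per 1000 blocks) between first and last point."""
    (h0, m0), (h1, m1) = points[0], points[-1]
    return (m1 - m0) / (h1 - h0) * 1000


overall = growth_per_1000_blocks(samples)
# Same calculation for each consecutive pair of readings.
per_interval = [
    growth_per_1000_blocks([a, b]) for a, b in zip(samples, samples[1:])
]
print(f"overall: {overall:.2f} GB per 1000 blocks")
print("per interval:", [round(r, 2) for r in per_interval])
```

That works out to roughly 1.6 GB of extra memory per 1000 blocks over this stretch, which would explain why a 16 GB machine runs out long before a large catch-up completes.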

hanh · August 13, 2022, 12:44am

For the record, the trick I did to skip the rescan was to copy wallet.dat to a new directory, make a blank zcash.conf and restart zcashd there.
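A minimal sketch of that trick, for anyone who wants to script it. The helper name and all paths are hypothetical, and the demo below runs against a throwaway temp directory with a dummy file rather than a real wallet. It assumes zcashd is stopped before wallet.dat is copied and that the node is then started against the new directory (for example with the `-datadir` option); neither step is done here.

```python
import shutil
import tempfile
from pathlib import Path


def prepare_fresh_datadir(old_wallet, new_datadir):
    """Copy wallet.dat into a brand-new data directory with a blank zcash.conf.

    Hypothetical helper: it only prepares files, it does not start zcashd.
    """
    old_wallet = Path(old_wallet)
    new_datadir = Path(new_datadir)
    if not old_wallet.is_file():
        raise FileNotFoundError(old_wallet)
    # Refuse to reuse an existing directory so nothing gets overwritten.
    new_datadir.mkdir(parents=True, exist_ok=False)
    shutil.copy2(old_wallet, new_datadir / "wallet.dat")
    (new_datadir / "zcash.conf").touch()  # blank config file
    return new_datadir


# Demo with a dummy wallet file in a temp directory (not a real wallet).
scratch = Path(tempfile.mkdtemp())
dummy_wallet = scratch / "wallet.dat"
dummy_wallet.write_bytes(b"dummy")
new_dir = prepare_fresh_datadir(dummy_wallet, scratch / "fresh-node")
print(sorted(p.name for p in new_dir.iterdir()))
```

Keeping the original wallet.dat untouched and working on a copy means the old data directory is still there if the new node misbehaves.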

ChileBob · August 13, 2022, 12:45am

Now at 1769155… don’t jinx it !!!

EDIT: Synced !!! Finally

freebeego · August 13, 2022, 2:15pm

I cannot export private keys while the rescan is in progress.

ChileBob · August 13, 2022, 2:35pm

Yep, that’s the way it is, you have to wait for the rescan to complete.

I think a tool to extract the keys from wallet.dat would be useful, there must be nodes out there with too many keys or on slow hardware that are now impossible to sync or rescan within reasonable time.

freebeego · August 13, 2022, 3:04pm

Sorry for my English. What I meant is that the wallet crashed the first time, and every launch since then requires a rescan of just one block, and then it crashes. It seems to crash right when the rescan finishes, so it is impossible to complete the rescan.

2022-08-10T08:17:45.562280Z  INFO Init: main: init message: Rescanning...
2022-08-10T08:17:45.562287Z  INFO Init: main: CWallet::InitLoadWallet(): Rescanning last 1 blocks (from block 1733601)...
2022-08-10T08:17:45.562292Z  INFO Init: main: CWallet::ScanForWalletTransactions(): Rewinding Orchard wallet to height 1733600; current is 1733601
2022-08-10T08:20:53.671067Z  INFO Init: main: Still rescanning. At block 1733602. Progress=0.979533

{here it stops}

And then I see this on every launch.
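One way to tell "stuck" from "slow" is to look at how far apart the rescan messages are in debug.log. The sketch below assumes the log line formats shown in this post and uses hypothetical helper names. It finds the most recent "init message: Rescanning..." line and the last "Still rescanning" line after it.

```python
import re
from datetime import datetime

START_RE = re.compile(r"^(?P<ts>\S+Z)\s+INFO .*init message: Rescanning\.\.\.")
PROGRESS_RE = re.compile(
    r"^(?P<ts>\S+Z)\s+INFO .*Still rescanning\. At block (?P<height>\d+)\. "
    r"Progress=(?P<progress>[0-9.]+)"
)


def parse_ts(ts):
    """Parse timestamps such as 2022-08-10T08:17:45.562280Z."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ")


def last_rescan_progress(lines):
    """Return (rescan start time, last progress tuple) for the latest rescan."""
    start, last = None, None
    for line in lines:
        m = START_RE.match(line)
        if m:
            # A new rescan began; forget progress from earlier runs.
            start, last = parse_ts(m["ts"]), None
            continue
        m = PROGRESS_RE.match(line)
        if m:
            last = (parse_ts(m["ts"]), int(m["height"]), float(m["progress"]))
    return start, last


log_lines = [
    "2022-08-10T08:17:45.562280Z  INFO Init: main: init message: Rescanning...",
    "2022-08-10T08:17:45.562287Z  INFO Init: main: CWallet::InitLoadWallet(): Rescanning last 1 blocks (from block 1733601)...",
    "2022-08-10T08:17:45.562292Z  INFO Init: main: CWallet::ScanForWalletTransactions(): Rewinding Orchard wallet to height 1733600; current is 1733601",
    "2022-08-10T08:20:53.671067Z  INFO Init: main: Still rescanning. At block 1733602. Progress=0.979533",
]
start, last = last_rescan_progress(log_lines)
elapsed = (last[0] - start).total_seconds()
print(f"block {last[1]} reached {elapsed:.0f} s after the rescan started")
```

In the excerpt above, about three minutes pass between the start of the rescan and the first progress message, so a single block in this range can take minutes on its own.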

ChileBob · August 13, 2022, 6:01pm

Try leaving it for a very long time, maybe days.

It took my node almost a week and there were times when it looked like it was dead but was just very busy.

Autotunafish · August 13, 2022, 9:37pm

What version of rust are you using? My clean node (all v5.2) is on rustc 1.25 (synced, no problems) and I’m catching one up now with a few addys on 1.62, so I’ll let ya know.

Oh wait zcashd installs rust so that version may be different… Nevermind!

ChileBob · August 13, 2022, 11:11pm

and…Segmentation fault…again !

All I did was restart the node after a clean shutdown with ‘zcash-cli stop’
